Sample records for identify metagenomic clones

  1. Intracellular screen to identify metagenomic clones that induce or inhibit a quorum-sensing biosensor.

    PubMed

    Williamson, Lynn L; Borlee, Bradley R; Schloss, Patrick D; Guan, Changhui; Allen, Heather K; Handelsman, Jo

    2005-10-01

    The goal of this study was to design and evaluate a rapid screen to identify metagenomic clones that produce biologically active small molecules. We built metagenomic libraries with DNA from soil on the floodplain of the Tanana River in Alaska. We extracted DNA directly from the soil and cloned it into fosmid and bacterial artificial chromosome vectors, constructing eight metagenomic libraries that contain 53,000 clones with inserts ranging from 1 to 190 kb. To identify clones of interest, we designed a high throughput "intracellular" screen, designated METREX, in which metagenomic DNA is in a host cell containing a biosensor for compounds that induce bacterial quorum sensing. If the metagenomic clone produces a quorum-sensing inducer, the cell produces green fluorescent protein (GFP) and can be identified by fluorescence microscopy or captured by fluorescence-activated cell sorting. Our initial screen identified 11 clones that induce and two that inhibit expression of GFP. The intracellular screen detected quorum-sensing inducers among metagenomic clones that a traditional overlay screen would not. One inducing clone carries a LuxI homologue that directs the synthesis of an N-acyl homoserine lactone quorum-sensing signal molecule. The LuxI homologue has 62% amino acid sequence identity to its closest match in GenBank, AmfI from Pseudomonas fluorescens, and is on a 78-kb insert that contains 67 open reading frames. Another inducing clone carries a gene with homology to homocitrate synthase. Our results demonstrate the power of an intracellular screen to identify functionally active clones and biologically active small molecules in metagenomic libraries.

  2. Development of high-throughput phenotyping of metagenomic clones from the human gut microbiome for modulation of eukaryotic cell growth.

    PubMed

    Gloux, Karine; Leclerc, Marion; Iliozer, Harout; L'Haridon, René; Manichanh, Chaysavanh; Corthier, Gérard; Nalin, Renaud; Blottière, Hervé M; Doré, Joël

    2007-06-01

    Metagenomic libraries derived from human intestinal microbiota (20,725 clones) were screened for epithelial cell growth modulation. Modulatory clones belonging to the four phyla represented among the metagenomic libraries were identified (hit rate, 0.04 to 8.7% depending on the screening cutoff). Several candidate loci were identified by transposon mutagenesis and subcloning.

  3. Taxonomic and functional assignment of cloned sequences from high Andean forest soil metagenome.

    PubMed

    Montaña, José Salvador; Jiménez, Diego Javier; Hernández, Mónica; Angel, Tatiana; Baena, Sandra

    2012-02-01

    Total metagenomic DNA was isolated from high Andean forest soil and subjected to taxonomic and functional composition analyses by means of clone library generation and sequencing. The obtained yield of 1.7 μg of DNA/g of soil was used to construct a metagenomic library of approximately 20,000 clones (in the plasmid p-Bluescript II SK+) with an average insert size of 4 kb, covering 80 Mb of the total metagenomic DNA. Metagenomic sequences near the plasmid cloning site were sequenced and then trimmed and assembled, obtaining 299 reads and 31 contigs (0.3 Mb). Taxonomic assignment of total sequences was performed by BLASTX, resulting in 68.8, 44.8 and 24.5% classification into taxonomic groups using the metagenomic RAST server v2.0, WebCARMA v1.0 online system and MetaGenome Analyzer v3.8 software, respectively. Most clone sequences were classified as Bacteria belonging to the phyla Actinobacteria, Proteobacteria and Acidobacteria. The most represented orders were Actinomycetales (34% average), Rhizobiales, Burkholderiales and Myxococcales, and the most represented genera were Mycobacterium (7% average), Frankia, Streptomyces and Bradyrhizobium. The vast majority of sequences were associated with the metabolism of carbohydrates, proteins, lipids and catalytic functions, such as phosphatases, glycosyltransferases, dehydrogenases, methyltransferases, dehydratases and epoxide hydrolases. In this study we compared different methods of taxonomic and functional assignment of metagenomic clone sequences to evaluate microbial diversity in an unexplored soil ecosystem, searching for putative enzymes of biotechnological interest and generating important information for further functional screening of clone libraries.
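
    The 80 Mb figure in the abstract above follows from the clone count and the average insert size. A minimal Python sketch of that arithmetic is given here; the function names and the 5 Mb "typical genome" used for the genome-equivalents estimate are illustrative assumptions, not values from the study.

```python
def library_coverage_mb(n_clones: int, mean_insert_kb: float) -> float:
    """Total cloned DNA in megabases (1 Mb = 1000 kb)."""
    return n_clones * mean_insert_kb / 1000.0


def genome_equivalents(coverage_mb: float, genome_size_mb: float) -> float:
    """How many genomes of the given size the cloned DNA would span."""
    return coverage_mb / genome_size_mb


# Values reported in the abstract: about 20,000 clones, 4 kb average insert.
coverage = library_coverage_mb(20_000, 4.0)

# Hypothetical 5 Mb bacterial genome, used only to put the number in context.
equivalents = genome_equivalents(coverage, 5.0)

print(f"library coverage: {coverage} Mb")
print(f"genome equivalents (assumed 5 Mb genome): {equivalents}")
```

    The same two functions can be reused for any library for which a clone count and a mean insert size are reported.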

  4. A function-based screen for seeking RubisCO active clones from metagenomes: novel enzymes influencing RubisCO activity.

    PubMed

    Böhnke, Stefanie; Perner, Mirjam

    2015-03-01

    Ribulose-1,5-bisphosphate carboxylase/oxygenase (RubisCO) is a key enzyme of the Calvin cycle, which is responsible for most of Earth's primary production. Although research on RubisCO genes and enzymes in plants, cyanobacteria and bacteria has been ongoing for years, still little is understood about its regulation and activation in bacteria. Even more so, hardly any information exists about the function of metagenomic RubisCOs and the role of the enzymes encoded on the flanking DNA owing to the lack of available function-based screens for seeking active RubisCOs from the environment. Here we present the first solely activity-based approach for identifying RubisCO active fosmid clones from a metagenomic library. We constructed a metagenomic library from hydrothermal vent fluids and screened 1056 fosmid clones. Twelve clones exhibited RubisCO activity and the metagenomic fragments resembled genes from Thiomicrospira crunogena. One of these clones was further analyzed. It contained a 35.2 kb metagenomic insert carrying the RubisCO gene cluster and flanking DNA regions. Knockouts of twelve genes and two intergenic regions on this metagenomic fragment demonstrated that the RubisCO activity was significantly impaired and was attributed to deletions in genes encoding putative transcriptional regulators and those believed to be vital for RubisCO activation. Our new technique revealed a novel link between a poorly characterized gene and RubisCO activity. This screen opens the door to directly investigating RubisCO genes and respective enzymes from environmental samples.

  5. Metagenomic approaches to identify and isolate bioactive natural products from microbiota of marine sponges.

    PubMed

    Gurgui, Cristian; Piel, Jörn

    2010-01-01

    Many marine sponges harbor massive consortia of symbiotic bacteria belonging to diverse phyla. Sponges are also an unusually rich source of biologically active natural products, and evidence is accumulating that these compounds might often be synthesized by the symbionts. Since the study of sponge-associated bacteria is generally hampered by very low cultivation rates, cultivation-independent, metagenomic methods have recently been applied to sponges. These methods allow for the isolation of biosynthetic gene clusters that can ultimately be exploited to develop sustainable natural product sources by heterologous expression. However, general challenges encountered in sponge metagenomic research are the poor quality of the isolated DNA with respect to size and yield, the difficulty to identify genes of interest among numerous homologs, insufficient clone numbers in metagenomic libraries, and time-consuming screening procedures to identify and isolate rare positive clones. Here, we give an overview of methods that address these problems and can be used to streamline isolation of biosynthetic and other genes of interest.

  6. [Cloning, expression and characterization of a novel esterase from marine sediment microbial metagenomic library].

    PubMed

    Xu, Shiqing; Hu, Yongfei; Yuan, Aihua; Zhu, Baoli

    2010-07-01

    To clone, express and characterize a novel esterase from a marine sediment microbial metagenomic library. Using esterase segregation agar containing tributyrin, we obtained the esterase-positive fosmid clone FL10 from a marine sediment microbial metagenomic library. This fosmid was partially digested with Sau3A I to construct a sublibrary, from which the esterase-positive subclone pFLS10 was obtained. The full-length esterase gene was amplified and cloned into the expression vector pET28a, and the recombinant plasmid was transformed into E. coli BL21 cells. After expression and purification, we analyzed the activity of the esterase and characterized the enzyme. An ORF (open reading frame) of 924 bp was identified from the subclone pFLS10. Sequence analysis indicated that it showed 71% amino acid identity to an esterase (ADA70030) from a marine sediment metagenomic library. The esterase is a novel low-temperature-active esterase and had its highest lipolytic activity toward the substrate 4-nitrophenyl butyrate (C4). The optimum temperature of the esterase was 20 °C and the optimum pH was 7.5. The esterase had good thermostability at 20 °C and good pH stability at pH 8-10. A significant increase in lipolytic activity was observed with the addition of K+ and Mg2+, and a decrease with Mn2+, among others. We obtained the novel esterase gene fls10 from the marine sediment microbial metagenomic library. The esterase had good thermostability and high lipolytic activity at low temperature and under basic conditions, which lays a basis for industrial application.

  7. Cloning and characterization of a novel α-amylase from a fecal microbial metagenome.

    PubMed

    Xu, Bo; Yang, Fuya; Xiong, Caiyun; Li, Junjun; Tang, Xianghua; Zhou, Junpei; Xie, Zhenrong; Ding, Junmei; Yang, Yunjuan; Huang, Zunxi

    2014-04-01

    To isolate novel and useful microbial enzymes from uncultured gastrointestinal microorganisms, a fecal microbial metagenomic library of the pygmy loris was constructed. The library was screened for amylolytic activity, and 8 of 50,000 recombinant clones showed amylolytic activity. Subcloning and sequence analysis of a positive clone led to the identification of a novel gene (amyPL) coding for α-amylase. AmyPL was expressed in Escherichia coli BL21 (DE3) and the purified AmyPL was enzymatically characterized. This study is the first to report the molecular and biochemical characterization of a novel α-amylase from a gastrointestinal metagenomic library.
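
    Screens like the one above are often compared by their hit rate. The short Python sketch below works through the numbers stated in the abstract (8 positives among 50,000 clones); the function names are hypothetical and chosen only for illustration.

```python
def hit_rate_percent(positive_clones: int, screened_clones: int) -> float:
    """Percentage of screened clones that scored positive."""
    if screened_clones <= 0:
        raise ValueError("screened_clones must be positive")
    return 100.0 * positive_clones / screened_clones


def clones_per_hit(positive_clones: int, screened_clones: int) -> float:
    """Average number of clones screened for each positive clone."""
    if positive_clones <= 0:
        raise ValueError("positive_clones must be positive")
    return screened_clones / positive_clones


# Numbers reported in the abstract: 8 amylolytic clones out of 50,000.
rate = hit_rate_percent(8, 50_000)
per_hit = clones_per_hit(8, 50_000)

print(f"hit rate: {rate:.3f}%")
print(f"clones screened per hit: {per_hit:.0f}")
```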

  8. Strong spurious transcription likely contributes to DNA insert bias in typical metagenomic clone libraries.

    PubMed

    Lam, Kathy N; Charles, Trevor C

    2015-01-01

    Clone libraries provide researchers with a powerful resource to study nucleic acid from diverse sources. Metagenomic clone libraries in particular have aided in studies of microbial biodiversity and function, and allowed the mining of novel enzymes. Libraries are often constructed by cloning large inserts into cosmid or fosmid vectors. Recently, there have been reports of GC bias in fosmid metagenomic libraries, and it was speculated to be a result of fragmentation and loss of AT-rich sequences during cloning. However, evidence in the literature suggests that transcriptional activity or gene product toxicity may play a role. To explore possible mechanisms responsible for sequence bias in clone libraries, we constructed a cosmid library from a human microbiome sample and sequenced DNA from different steps during library construction: crude extract DNA, size-selected DNA, and cosmid library DNA. We confirmed a GC bias in the final cosmid library, and we provide evidence that the bias is not due to fragmentation and loss of AT-rich sequences but is likely occurring after DNA is introduced into Escherichia coli. To investigate the influence of strong constitutive transcription, we searched the sequence data for promoters and found that rpoD/σ(70) promoter sequences were underrepresented in the cosmid library. Furthermore, when we examined the genomes of taxa that were differentially abundant in the cosmid library relative to the original sample, we found the bias to be more correlated with the number of rpoD/σ(70) consensus sequences in the genome than with simple GC content. The GC bias of metagenomic libraries does not appear to be due to DNA fragmentation. Rather, analysis of promoter sequences provides support for the hypothesis that strong constitutive transcription from sequences recognized as rpoD/σ(70) consensus-like in E. coli may lead to instability, causing loss of the plasmid or loss of the insert DNA that gives rise to the transcription. Despite
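
    The two quantities compared in the abstract above, GC content and the number of rpoD/σ(70) consensus-like sequences, can both be computed directly from sequence. The Python sketch below is a simplified illustration, not the authors' pipeline: it assumes the textbook E. coli consensus (TTGACA at -35, TATAAT at -10, separated by a 15 to 19 bp spacer), requires exact matches, and scans the forward strand only.

```python
import re


def gc_fraction(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


# Assumed consensus: -35 box, 15-19 bp spacer, -10 box. The lookahead lets
# overlapping candidates be counted, one per start position of the -35 box.
SIGMA70_CONSENSUS = re.compile(r"(?=TTGACA[ACGT]{15,19}TATAAT)")


def count_sigma70_consensus(seq: str) -> int:
    """Count forward-strand start positions of exact consensus promoters."""
    return sum(1 for _ in SIGMA70_CONSENSUS.finditer(seq.upper()))


# Toy sequence (hypothetical) containing one consensus promoter.
example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC"
print(f"GC fraction: {gc_fraction(example):.3f}")
print(f"consensus promoters: {count_sigma70_consensus(example)}")
```

    A more realistic search would allow mismatches and scan the reverse complement as well; exact matching is used here only to keep the sketch short.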

  9. Hybrid sequencing approach applied to human fecal metagenomic clone libraries revealed clones with potential biotechnological applications.

    PubMed

    Džunková, Mária; D'Auria, Giuseppe; Pérez-Villarroya, David; Moya, Andrés

    2012-01-01

    Natural environments represent an incredible source of microbial genetic diversity. Discovery of novel biomolecules involves biotechnological methods that often require the design and implementation of biochemical assays to screen clone libraries. However, when an assay is applied to thousands of clones, one may eventually end up with very few positive clones which, in most of the cases, have to be "domesticated" for downstream characterization and application, and this makes screening both laborious and expensive. The negative clones, which are not considered by the selected assay, may also have biotechnological potential; however, unfortunately they would remain unexplored. Knowledge of the clone sequences provides important clues about potential biotechnological application of the clones in the library; however, the sequencing of clones one-by-one would be very time-consuming and expensive. In this study, we characterized the first metagenomic clone library from the feces of a healthy human volunteer, using a method based on 454 pyrosequencing coupled with a clone-by-clone Sanger end-sequencing. Instead of whole individual clone sequencing, we sequenced 358 clones in a pool. The medium-large insert (7-15 kb) cloning strategy allowed us to assemble these clones correctly, and to assign the clone ends to maintain the link between the position of a living clone in the library and the annotated contig from the 454 assembly. Finally, we found several open reading frames (ORFs) with previously described potential medical application. The proposed approach allows planning ad-hoc biochemical assays for the clones of interest, and the appropriate sub-cloning strategy for gene expression in suitable vectors/hosts.

  10. Identifying Differentially Abundant Metabolic Pathways in Metagenomic Datasets

    NASA Astrophysics Data System (ADS)

    Liu, Bo; Pop, Mihai

    Enabled by rapid advances in sequencing technology, metagenomic studies aim to characterize entire communities of microbes bypassing the need for culturing individual bacterial members. One major goal of such studies is to identify specific functional adaptations of microbial communities to their habitats. Here we describe a powerful analytical method (MetaPath) that can identify differentially abundant pathways in metagenomic data-sets, relying on a combination of metagenomic sequence data and prior metabolic pathway knowledge. We show that MetaPath outperforms other common approaches when evaluated on simulated datasets. We also demonstrate the power of our methods in analyzing two, publicly available, metagenomic datasets: a comparison of the gut microbiome of obese and lean twins; and a comparison of the gut microbiome of infant and adult subjects. We demonstrate that the subpathways identified by our method provide valuable insights into the biological activities of the microbiome.

  11. Identifying biologically relevant differences between metagenomic communities.

    PubMed

    Parks, Donovan H; Beiko, Robert G

    2010-03-15

    Metagenomics is the study of genetic material recovered directly from environmental samples. Taxonomic and functional differences between metagenomic samples can highlight the influence of ecological factors on patterns of microbial life in a wide range of habitats. Statistical hypothesis tests can help us distinguish ecological influences from sampling artifacts, but knowledge of only the P-value from a statistical hypothesis test is insufficient to make inferences about biological relevance. Current reporting practices for pairwise comparative metagenomics are inadequate, and better tools are needed for comparative metagenomic analysis. We have developed a new software package, STAMP, for comparative metagenomics that supports best practices in analysis and reporting. Examination of a pair of iron mine metagenomes demonstrates that deeper biological insights can be gained using statistical techniques available in our software. An analysis of the functional potential of 'Candidatus Accumulibacter phosphatis' in two enhanced biological phosphorus removal metagenomes identified several subsystems that differ between the A. phosphatis strains in these related communities, including phosphate metabolism, secretion and metal transport. Python source code and binaries are freely available from our website at http://kiwi.cs.dal.ca/Software/STAMP. Contact: beiko@cs.dal.ca. Supplementary data are available at Bioinformatics online.
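
    The point that a P-value alone says little about biological relevance can be illustrated by reporting an effect size with a confidence interval. The Python sketch below computes the difference between two proportions with a standard asymptotic (Wald) interval. It is not STAMP's API; the function name and the example counts are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportions_ci(x1: int, n1: int, x2: int, n2: int, level: float = 0.95):
    """Difference p1 - p2 with an asymptotic (Wald) confidence interval.

    x1, x2: sequences assigned to the feature in samples 1 and 2.
    n1, n2: total sequences in samples 1 and 2.
    Returns (difference, lower bound, upper bound).
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return diff, diff - z * se, diff + z * se


# Hypothetical counts: 120 of 10,000 reads vs 80 of 10,000 reads.
diff, lower, upper = diff_proportions_ci(120, 10_000, 80, 10_000)
print(f"difference: {100 * diff:.2f} percentage points")
print(f"95% CI: [{100 * lower:.2f}, {100 * upper:.2f}] percentage points")
```

    Reporting the interval shows how large the difference could plausibly be, which a significance test on its own does not convey.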

  12. Activity-Based Screening of Metagenomic Libraries for Hydrogenase Enzymes.

    PubMed

    Adam, Nicole; Perner, Mirjam

    2017-01-01

    Here we outline how to identify hydrogenase enzymes from metagenomic libraries through an activity-based screening approach. A metagenomic fosmid library is constructed in E. coli and the fosmids are transferred into a hydrogenase deletion mutant of Shewanella oneidensis (ΔhyaB) via triparental mating. If a fosmid exhibits hydrogen uptake activity, S. oneidensis' phenotype is restored and hydrogenase activity is indicated by a color change of the medium from yellow to colorless. This new method enables screening of 48 metagenomic fosmid clones in parallel.

  13. Discovery of new cellulases from the metagenome by a metagenomics-guided strategy.

    PubMed

    Yang, Chao; Xia, Yu; Qu, Hong; Li, An-Dong; Liu, Ruihua; Wang, Yubo; Zhang, Tong

    2016-01-01

    Energy shortage has become a global problem. Production of biofuels from renewable biomass resources is an inevitable trend of sustainable development. Cellulose is the most abundant and renewable resource in nature. Lack of new cellulases with unique properties has become the bottleneck of the efficient utilization of cellulose. Environmental metagenomes are regarded as huge reservoirs for a variety of cellulases. However, new cellulases cannot be obtained easily by functional screening of metagenomic libraries. In this work, a metagenomics-guided strategy for obtaining new cellulases from the metagenome was proposed. Metagenomic sequences of DNA extracted from the anaerobic beer lees converting consortium enriched at thermophilic conditions were assembled, and 23 glycoside hydrolase (GH) sequences affiliated with the GH family 5 were identified. Among the 23 GH sequences, three target sequences (designated as cel7482, cel3623 and cel36) showing low identity with those known GHs were chosen as the putative cellulase genes to be functionally expressed in Escherichia coli after PCR cloning. The three cellulases were classified into endo-β-1,4-glucanases by product pattern analysis. The recombinant cellulases were more active at pH 5.5 and within a temperature range of 60-70 °C. Computer-assisted 3D structure modeling indicated that the active residues in the active site of the recombinant cellulases were more similar to each other compared with non-active site residues. The recombinant cel7482 was extremely tolerant to 2 M NaCl, suggesting that cel7482 may be a halotolerant cellulase. Moreover, the recombinant cel7482 was shown to have an ability to resist three ionic liquids (ILs), which are widely used for cellulose pretreatment. Furthermore, active cel7482 was secreted by the twin-arginine translocation (Tat) pathway of Bacillus subtilis 168 into the culture medium, which facilitates the subsequent purification and reduces the formation of inclusion body in

  14. A Metagenomic Advance for the Cloning and Characterization of a Cellulase from Red Rice Crop Residues.

    PubMed

    Meneses, Carlos; Silva, Bruna; Medeiros, Betsy; Serrato, Rodrigo; Johnston-Monje, David

    2016-06-25

    Many naturally occurring cellulolytic microorganisms are not readily cultivable, demanding a culture-independent approach in order to study their cellulolytic genes. Metagenomics involves the isolation of DNA from environmental sources and can be used to identify enzymes with biotechnological potential from uncultured microbes. In this study, a gene encoding an endoglucanase was cloned from red rice crop residues using a metagenomic strategy. The amino acid identity between this gene and its closest published counterparts is lower than 70%. The endoglucanase was named EglaRR01 and was biochemically characterized. This recombinant protein showed activity on carboxymethylcellulose, indicating that EglaRR01 is an endoactive lytic enzyme. The enzymatic activity was optimal at a pH of 6.8 and at a temperature of 30 °C. Ethanol production with this recombinant enzyme was also analyzed on crop residues treated with EglaRR01, and resulted in conversion of cellulose from red rice into simple sugars, which were further fermented by Saccharomyces cerevisiae to produce ethanol after seven days. Ethanol yield in this study was approximately 8 g/L. The gene found herein shows strong potential for use in ethanol production from cellulosic biomass (second generation ethanol).

  15. Current and future resources for functional metagenomics

    PubMed Central

    Lam, Kathy N.; Cheng, Jiujun; Engel, Katja; Neufeld, Josh D.; Charles, Trevor C.

    2015-01-01

    Functional metagenomics is a powerful experimental approach for studying gene function, starting from the extracted DNA of mixed microbial populations. A functional approach relies on the construction and screening of metagenomic libraries—physical libraries that contain DNA cloned from environmental metagenomes. The information obtained from functional metagenomics can help in future annotations of gene function and serve as a complement to sequence-based metagenomics. In this Perspective, we begin by summarizing the technical challenges of constructing metagenomic libraries and emphasize their value as resources. We then discuss libraries constructed using the popular cloning vector, pCC1FOS, and highlight the strengths and shortcomings of this system, alongside possible strategies to maximize existing pCC1FOS-based libraries by screening in diverse hosts. Finally, we discuss the known bias of libraries constructed from human gut and marine water samples, present results that suggest bias may also occur for soil libraries, and consider factors that bias metagenomic libraries in general. We anticipate that discussion of current resources and limitations will advance tools and technologies for functional metagenomics research. PMID:26579102

  16. Current and future resources for functional metagenomics.

    PubMed

    Lam, Kathy N; Cheng, Jiujun; Engel, Katja; Neufeld, Josh D; Charles, Trevor C

    2015-01-01

    Functional metagenomics is a powerful experimental approach for studying gene function, starting from the extracted DNA of mixed microbial populations. A functional approach relies on the construction and screening of metagenomic libraries: physical libraries that contain DNA cloned from environmental metagenomes. The information obtained from functional metagenomics can help in future annotations of gene function and serve as a complement to sequence-based metagenomics. In this Perspective, we begin by summarizing the technical challenges of constructing metagenomic libraries and emphasize their value as resources. We then discuss libraries constructed using the popular cloning vector, pCC1FOS, and highlight the strengths and shortcomings of this system, alongside possible strategies to maximize existing pCC1FOS-based libraries by screening in diverse hosts. Finally, we discuss the known bias of libraries constructed from human gut and marine water samples, present results that suggest bias may also occur for soil libraries, and consider factors that bias metagenomic libraries in general. We anticipate that discussion of current resources and limitations will advance tools and technologies for functional metagenomics research.

  17. Longitudinal Metagenomic Analysis of Hospital Air Identifies Clinically Relevant Microbes.

    PubMed

    King, Paula; Pham, Long K; Waltz, Shannon; Sphar, Dan; Yamamoto, Robert T; Conrad, Douglas; Taplitz, Randy; Torriani, Francesca; Forsyth, R Allyn

    2016-01-01

    We describe the sampling of sixty-three uncultured hospital air samples collected over a six-month period and analysis using shotgun metagenomic sequencing. Our primary goals were to determine the longitudinal metagenomic variability of this environment, identify and characterize genomes of potential pathogens and determine whether they are atypical to the hospital airborne metagenome. Air samples were collected from eight locations which included patient wards, the main lobby and outside. The resulting DNA libraries produced 972 million sequences representing 51 gigabases. Hierarchical clustering of samples by the most abundant 50 microbial orders generated three major nodes which primarily clustered by type of location. Because the indoor locations were longitudinally consistent, episodic relative increases in microbial genomic signatures related to the opportunistic pathogens Aspergillus, Penicillium and Stenotrophomonas were identified as outliers at specific locations. Further analysis of microbial reads specific for Stenotrophomonas maltophilia indicated homology to a sequenced multi-drug resistant clinical strain and we observed broad sequence coverage of resistance genes. We demonstrate that a shotgun metagenomic sequencing approach can be used to characterize the resistance determinants of pathogen genomes that are uncharacteristic for an otherwise consistent hospital air microbial metagenomic profile.

  18. Functional Screening of Metagenome and Genome Libraries for Detection of Novel Flavonoid-Modifying Enzymes

    PubMed Central

    Rabausch, U.; Juergensen, J.; Ilmberger, N.; Böhnke, S.; Fischer, S.; Schubach, B.; Schulte, M.

    2013-01-01

    The functional detection of novel enzymes other than hydrolases from metagenomes is limited since only a very few reliable screening procedures are available that allow the rapid screening of large clone libraries. For the discovery of flavonoid-modifying enzymes in genome and metagenome clone libraries, we have developed a new screening system based on high-performance thin-layer chromatography (HPTLC). This metagenome extract thin-layer chromatography analysis (META) allows the rapid detection of glycosyltransferase (GT) and also other flavonoid-modifying activities. The developed screening method is highly sensitive, and an amount of 4 ng of modified flavonoid molecules can be detected. This novel technology was validated against a control library of 1,920 fosmid clones generated from a single Bacillus cereus isolate and then used to analyze more than 38,000 clones derived from two different metagenomic preparations. Thereby we identified two novel UDP glycosyltransferase (UGT) genes. The metagenome-derived gtfC gene encoded a 52-kDa protein, and the deduced amino acid sequence was weakly similar to sequences of putative UGTs from Fibrisoma and Dyadobacter. GtfC mediated the transfer of different hexose moieties and exhibited high activities on flavones, flavonols, flavanones, and stilbenes and also accepted isoflavones and chalcones. From the control library we identified a novel macroside glycosyltransferase (MGT) with a calculated molecular mass of 46 kDa. The deduced amino acid sequence was highly similar to sequences of MGTs from Bacillus thuringiensis. Recombinant MgtB transferred the sugar residue from UDP-glucose effectively to flavones, flavonols, isoflavones, and flavanones. Moreover, MgtB exhibited high activity on larger flavonoid molecules such as tiliroside. PMID:23686272

  19. Novel Lipolytic Enzymes Identified from Metagenomic Library of Deep-Sea Sediment

    PubMed Central

    Jeon, Jeong Ho; Kim, Jun Tae; Lee, Hyun Sook; Kim, Sang-Jin; Kang, Sung Gyun; Choi, Sang Ho; Lee, Jung-Hyun

    2011-01-01

    A metagenomic library was constructed from a deep-sea sediment sample and screened for lipolytic activity. Open reading frames of six positive clones showed only 33–58% amino acid identities to known proteins. One of them was assigned to a new group while the others were grouped into Families I and V or the EstD Family. By employing a combination of approaches such as removing the signal sequence, coexpression of chaperone genes, and low temperature induction, we obtained five soluble recombinant proteins in Escherichia coli. The purified enzymes had optimum temperatures of 30–35°C and were cold-active. Among them, one enzyme showed lipase activity by preferentially hydrolyzing p-nitrophenyl palmitate and p-nitrophenyl stearate and high salt resistance with up to 4 M NaCl. Our research demonstrates the feasibility of developing novel lipolytic enzymes from marine environments by the combination of a functional metagenomic approach and protein expression technology. PMID:21845199

  20. Phylogenetic characterization of a biogas plant microbial community integrating clone library 16S-rDNA sequences and metagenome sequence data obtained by 454-pyrosequencing.

    PubMed

    Kröber, Magdalena; Bekel, Thomas; Diaz, Naryttza N; Goesmann, Alexander; Jaenicke, Sebastian; Krause, Lutz; Miller, Dimitri; Runte, Kai J; Viehöver, Prisca; Pühler, Alfred; Schlüter, Andreas

    2009-06-01

    The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result was derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene encodes the alpha-subunit of methyl-coenzyme-M reductase involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences, thus enabling frequency of abundance estimations for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.
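
    The step in which metagenome reads are assigned to amplicon sequences to estimate their abundance can be sketched as a filter-and-count over alignment hits. The Python example below is only an illustration under stated assumptions: the hit tuples, the thresholds (98% identity, 50 bp alignment length) and all identifiers are hypothetical, since the abstract does not report the actual "stringent settings".

```python
from collections import Counter


def amplicon_abundance(hits, min_identity=98.0, min_length=50):
    """Count reads per amplicon sequence, keeping one best hit per read.

    hits: iterable of (read_id, amplicon_id, percent_identity, aln_length).
    Hits below either threshold are discarded; among the remaining hits
    for a read, the one with the highest (identity, length) is kept.
    """
    best = {}
    for read_id, amplicon_id, identity, length in hits:
        if identity < min_identity or length < min_length:
            continue
        previous = best.get(read_id)
        if previous is None or (identity, length) > (previous[1], previous[2]):
            best[read_id] = (amplicon_id, identity, length)
    return Counter(amplicon for amplicon, _, _ in best.values())


# Hypothetical hits: read r1 matches two amplicons, r2 and r4 fail a filter.
example_hits = [
    ("r1", "clone_A", 99.0, 100),
    ("r1", "clone_B", 98.5, 100),
    ("r2", "clone_A", 97.0, 100),
    ("r3", "clone_B", 100.0, 60),
    ("r4", "clone_B", 100.0, 40),
]
counts = amplicon_abundance(example_hits)
for amplicon, n_reads in sorted(counts.items()):
    print(amplicon, n_reads)
```

    Keeping a single best hit per read avoids counting one read toward several closely related amplicon sequences.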

  1. Cloning, Overexpression, and Characterization of a Metagenome-Derived Phytase with Optimal Activity at Low pH.

    PubMed

    Tan, Hao; Wu, Xiang; Xie, Liyuan; Huang, Zhongqian; Gan, Bingcheng; Peng, Weihong

    2015-06-01

    A phytase gene was identified in a publicly available metagenome derived from subsurface groundwater, which was deduced to encode for a protein of the histidine acid phosphatase (HAP) family. The nucleotide sequence of the phytase gene was chemically synthesized and cloned, in order to further overexpress the phytase in Escherichia coli. Purified protein of the recombinant phytase demonstrated an activity for phytic acid of 298 ± 17 μmol P/min/mg, at the pH optimum of 2.0 with the temperature of 37 °C. Interestingly, the pH optimum of this phytase is much lower in comparison with most HAP phytases known to date. It suggests that the phytase could possess improved adaptability to the low pH condition caused by the gastric acid in livestock and poultry stomachs.

  2. Open resource metagenomics: a model for sharing metagenomic libraries.

    PubMed

    Neufeld, J D; Engel, K; Cheng, J; Moreno-Hagelsieb, G; Rose, D R; Charles, T C

    2011-11-30

    Both sequence-based and activity-based exploitation of environmental DNA have provided unprecedented access to the genomic content of cultivated and uncultivated microorganisms. Although researchers deposit microbial strains in culture collections and DNA sequences in databases, activity-based metagenomic studies typically only publish sequences from the hits retrieved from specific screens. Physical metagenomic libraries, conceptually similar to entire sequence datasets, are usually not straightforward to obtain by interested parties subsequent to publication. In order to facilitate unrestricted distribution of metagenomic libraries, we propose the adoption of open resource metagenomics, in line with the trend towards open access publishing, and similar to culture- and mutant-strain collections that have been the backbone of traditional microbiology and microbial genetics. The concept of open resource metagenomics includes preparation of physical DNA libraries, preferably in versatile vectors that facilitate screening in a diversity of host organisms, and pooling of clones so that single aliquots containing complete libraries can be easily distributed upon request. Database deposition of associated metadata and sequence data for each library provides researchers with information to select the most appropriate libraries for further research projects. As a starting point, we have established the Canadian MetaMicroBiome Library (CM(2)BL [1]). The CM(2)BL is a publicly accessible collection of cosmid libraries containing environmental DNA from soils collected from across Canada, spanning multiple biomes. The libraries were constructed such that the cloned DNA can be easily transferred to Gateway® compliant vectors, facilitating functional screening in virtually any surrogate microbial host for which there are available plasmid vectors. The libraries, which we are placing in the public domain, will be distributed upon request without restriction to members of both the

  3. Open resource metagenomics: a model for sharing metagenomic libraries

    PubMed Central

    Neufeld, J.D.; Engel, K.; Cheng, J.; Moreno-Hagelsieb, G.; Rose, D.R.; Charles, T.C.

    2011-01-01

    Both sequence-based and activity-based exploitation of environmental DNA have provided unprecedented access to the genomic content of cultivated and uncultivated microorganisms. Although researchers deposit microbial strains in culture collections and DNA sequences in databases, activity-based metagenomic studies typically only publish sequences from the hits retrieved from specific screens. Physical metagenomic libraries, conceptually similar to entire sequence datasets, are usually not straightforward to obtain by interested parties subsequent to publication. In order to facilitate unrestricted distribution of metagenomic libraries, we propose the adoption of open resource metagenomics, in line with the trend towards open access publishing, and similar to culture- and mutant-strain collections that have been the backbone of traditional microbiology and microbial genetics. The concept of open resource metagenomics includes preparation of physical DNA libraries, preferably in versatile vectors that facilitate screening in a diversity of host organisms, and pooling of clones so that single aliquots containing complete libraries can be easily distributed upon request. Database deposition of associated metadata and sequence data for each library provides researchers with information to select the most appropriate libraries for further research projects. As a starting point, we have established the Canadian MetaMicroBiome Library (CM2BL [1]). The CM2BL is a publicly accessible collection of cosmid libraries containing environmental DNA from soils collected from across Canada, spanning multiple biomes. The libraries were constructed such that the cloned DNA can be easily transferred to Gateway® compliant vectors, facilitating functional screening in virtually any surrogate microbial host for which there are available plasmid vectors. The libraries, which we are placing in the public domain, will be distributed upon request without restriction to members of both the

  4. Structural and mechanistic analysis of a β-glycoside phosphorylase identified by screening a metagenomic library.

    PubMed

    Macdonald, Spencer S; Patel, Ankoor; Larmour, Veronica L C; Morgan-Lang, Connor; Hallam, Steven J; Mark, Brian L; Withers, Stephen G

    2018-03-02

    Glycoside phosphorylases have considerable potential as catalysts for the assembly of useful glycans for products ranging from functional foods and prebiotics to novel materials. However, the substrate diversity of currently identified phosphorylases is relatively small, limiting their practical applications. To address this limitation, we developed a high-throughput screening approach using the activated substrate 2,4-dinitrophenyl β-d-glucoside (DNPGlc) and inorganic phosphate for identifying glycoside phosphorylase activity and used it to screen a large insert metagenomic library. The initial screen, based on release of 2,4-dinitrophenyl from DNPGlc in the presence of phosphate, identified the gene bglP, encoding a retaining β-glycoside phosphorylase from the CAZy GH3 family. Kinetic and mechanistic analysis of the gene product, BglP, confirmed a double displacement ping-pong mechanism involving a covalent glycosyl-enzyme intermediate. X-ray crystallographic analysis provided insights into the phosphate-binding mode and identified a key glutamine residue in the active site important for substrate recognition. Substituting this glutamine for a serine swapped the substrate specificity from glucoside to N-acetylglucosaminide. In summary, we present a high-throughput screening approach for identifying β-glycoside phosphorylases, which was robust, simple to implement, and useful in identifying active clones within a metagenomics library. Implementation of this screen enabled discovery of a new glycoside phosphorylase class and has paved the way to devising simple ways in which enzyme specificity can be encoded and swapped, which has implications for biotechnological applications. © 2018 by The American Society for Biochemistry and Molecular Biology, Inc.

  5. Functional metagenomics reveals novel β-galactosidases not predictable from gene sequences.

    PubMed

    Cheng, Jiujun; Romantsov, Tatyana; Engel, Katja; Doxey, Andrew C; Rose, David R; Neufeld, Josh D; Charles, Trevor C

    2017-01-01

    The techniques of metagenomics have allowed researchers to access the genomic potential of uncultivated microbes, but there remain significant barriers to determination of gene function based on DNA sequence alone. Functional metagenomics, in which DNA is cloned and expressed in surrogate hosts, can overcome these barriers, and make important contributions to the discovery of novel enzymes. In this study, a soil metagenomic library carried in an IncP cosmid was used for functional complementation for β-galactosidase activity in both Sinorhizobium meliloti (α-Proteobacteria) and Escherichia coli (γ-Proteobacteria) backgrounds. One β-galactosidase, encoded by six overlapping clones that were selected in both hosts, was identified as a member of glycoside hydrolase family 2. We could not identify ORFs obviously encoding possible β-galactosidases in 19 other sequenced clones that were only able to complement S. meliloti. Based on low sequence identity to other known glycoside hydrolases, yet not β-galactosidases, three of these ORFs were examined further. Biochemical analysis confirmed that all three encoded β-galactosidase activity. Lac36W_ORF11 and Lac161_ORF7 had conserved domains, but lacked similarities to known glycoside hydrolases. Lac161_ORF10 had neither conserved domains nor similarity to known glycoside hydrolases. Bioinformatic and structural modeling implied that Lac161_ORF10 protein represented a novel enzyme family with a five-bladed propeller glycoside hydrolase domain. By discovering founding members of three novel β-galactosidase families, we have reinforced the value of functional metagenomics for isolating novel genes that could not have been predicted from DNA sequence analysis alone.

  6. Activity screening of environmental metagenomic libraries reveals novel carboxylesterase families

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Popovic, Ana; Hai, Tran; Tchigvintsev, Anatoly

    Metagenomics has made accessible an enormous reserve of global biochemical diversity. In order to tap into this vast resource of novel enzymes, we have screened over one million clones from metagenome DNA libraries derived from sixteen different environments for carboxylesterase activity and identified 714 positive hits. Here, we validated the esterase activity of 80 selected genes, which belong to 17 different protein families including unknown and cyclase-like proteins. Three metagenomic enzymes exhibited lipase activity, and seven proteins showed polyester depolymerization activity against polylactic acid and polycaprolactone. Detailed biochemical characterization of four new enzymes revealed their substrate preference, whereas their catalytic residues were identified using site-directed mutagenesis. The crystal structure of the metal-ion dependent esterase MGS0169 from the amidohydrolase superfamily revealed a novel active site with a bound unknown ligand. Thus, activity-centered metagenomics has revealed diverse enzymes and novel families of microbial carboxylesterases, whose activity could not have been predicted using bioinformatics tools.

  7. Activity screening of environmental metagenomic libraries reveals novel carboxylesterase families

    DOE PAGES

    Popovic, Ana; Hai, Tran; Tchigvintsev, Anatoly; ...

    2017-03-08

    Metagenomics has made accessible an enormous reserve of global biochemical diversity. In order to tap into this vast resource of novel enzymes, we have screened over one million clones from metagenome DNA libraries derived from sixteen different environments for carboxylesterase activity and identified 714 positive hits. Here, we validated the esterase activity of 80 selected genes, which belong to 17 different protein families including unknown and cyclase-like proteins. Three metagenomic enzymes exhibited lipase activity, and seven proteins showed polyester depolymerization activity against polylactic acid and polycaprolactone. Detailed biochemical characterization of four new enzymes revealed their substrate preference, whereas their catalytic residues were identified using site-directed mutagenesis. The crystal structure of the metal-ion dependent esterase MGS0169 from the amidohydrolase superfamily revealed a novel active site with a bound unknown ligand. Thus, activity-centered metagenomics has revealed diverse enzymes and novel families of microbial carboxylesterases, whose activity could not have been predicted using bioinformatics tools.

  8. Identifying personal microbiomes using metagenomic codes

    PubMed Central

    Franzosa, Eric A.; Huang, Katherine; Meadow, James F.; Gevers, Dirk; Lemon, Katherine P.; Bohannan, Brendan J. M.; Huttenhower, Curtis

    2015-01-01

    Community composition within the human microbiome varies across individuals, but it remains unknown if this variation is sufficient to uniquely identify individuals within large populations or stable enough to identify them over time. We investigated this by developing a hitting set-based coding algorithm and applying it to the Human Microbiome Project population. Our approach defined body site-specific metagenomic codes: sets of microbial taxa or genes prioritized to uniquely and stably identify individuals. Codes capturing strain variation in clade-specific marker genes were able to distinguish among 100s of individuals at an initial sampling time point. In comparisons with follow-up samples collected 30–300 d later, ∼30% of individuals could still be uniquely pinpointed using metagenomic codes from a typical body site; coincidental (false positive) matches were rare. Codes based on the gut microbiome were exceptionally stable and pinpointed >80% of individuals. The failure of a code to match its owner at a later time point was largely explained by the loss of specific microbial strains (at current limits of detection) and was only weakly associated with the length of the sampling interval. In addition to highlighting patterns of temporal variation in the ecology of the human microbiome, this work demonstrates the feasibility of microbiome-based identifiability—a result with important ethical implications for microbiome study design. The datasets and code used in this work are available for download from huttenhower.sph.harvard.edu/idability. PMID:25964341
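    The hitting-set idea behind such codes can be illustrated with a minimal sketch. It is not the authors' algorithm (their implementation is the one distributed at the address given in the abstract); it is a generic greedy hitting-set heuristic over hypothetical presence/absence profiles. The function names `greedy_code` and `carriers`, the individuals A to D, and the features s1 to s5 are all assumptions made for illustration. A code for one individual must contain, for every other individual, at least one feature that the other individual lacks.

```python
def greedy_code(target, profiles):
    """Greedily pick features of `target` until no other individual carries all of them.

    `profiles` maps an individual's name to the set of features (taxa, genes,
    strain markers) detected in that individual. Returns a list of features,
    or None when some other individual carries every feature the target has.
    """
    own = profiles[target]
    # For each other individual: the target's features that they lack.
    # A valid code must "hit" (intersect) every one of these difference sets.
    unhit = {name: own - feats for name, feats in profiles.items() if name != target}
    if any(not diff for diff in unhit.values()):
        return None  # no subset of the target's features can be unique
    code = []
    while unhit:
        # Take the feature that rules out the most remaining individuals;
        # ties go to the alphabetically first feature.
        best = max(sorted(own), key=lambda f: sum(f in d for d in unhit.values()))
        code.append(best)
        unhit = {name: d for name, d in unhit.items() if best not in d}
    return code


def carriers(code, profiles):
    """Names of all individuals whose profile contains every feature in `code`."""
    return sorted(name for name, feats in profiles.items() if set(code) <= feats)


# Hypothetical profiles, invented for this example.
profiles = {
    "A": {"s1", "s2", "s3", "s4"},
    "B": {"s1", "s2"},
    "C": {"s2", "s3"},
    "D": {"s1", "s3", "s5"},
}

code_a = greedy_code("A", profiles)
print(code_a, carriers(code_a, profiles))
print(greedy_code("B", profiles))
```

    In this toy population, A is identified by a single feature that nobody else carries, while B has no unique code because A carries everything B does. The greedy choice gives a small code but not necessarily the smallest one, and the sketch ignores the stability over time that the abstract says the published codes were prioritized for.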

  9. Metagenomics: Probing pollutant fate in natural and engineered ecosystems.

    PubMed

    Bouhajja, Emna; Agathos, Spiros N; George, Isabelle F

    2016-12-01

    Polluted environments are a reservoir of microbial species able to degrade or to convert pollutants to harmless compounds. The proper management of microbial resources requires a comprehensive characterization of their genetic pool to assess the fate of contaminants and increase the efficiency of bioremediation processes. Metagenomics offers appropriate tools to describe microbial communities in their whole complexity without lab-based cultivation of individual strains. After a decade of use of metagenomics to study microbiomes, the scientific community has made significant progress in this field. In this review, we survey the main steps of metagenomics applied to environments contaminated with organic compounds or heavy metals. We emphasize technical solutions proposed to overcome encountered obstacles. We then compare two metagenomic approaches, i.e. library-based targeted metagenomics and direct sequencing of metagenomes. In the former, environmental DNA is cloned inside a host, and then clones of interest are selected based on (i) their expression of biodegradative functions or (ii) sequence homology with probes and primers designed from relevant, already known sequences. The highest score for the discovery of novel genes and degradation pathways has been achieved so far by functional screening of large clone libraries. On the other hand, direct sequencing of metagenomes without a cloning step has been more often applied to polluted environments for characterization of the taxonomic and functional composition of microbial communities and their dynamics. In this case, the analysis has focused on 16S rRNA genes and marker genes of biodegradation. Advances in next generation sequencing and in bioinformatic analysis of sequencing data have opened up new opportunities for assessing the potential of biodegradation by microbes, but annotation of collected genes is still hampered by a limited number of available reference sequences in databases. Although metagenomics

  10. A novel feruloyl esterase from rumen microbial metagenome: Gene cloning and enzyme characterization in the release of mono- and diferulic acids

    USDA-ARS's Scientific Manuscript database

    A feruloyl esterase (FAE) gene was isolated from a rumen microbial metagenome, cloned into E. coli, and expressed in active form. The enzyme (RuFae4) was classified as a Type D feruloyl esterase based on its action on synthetic substrates and ability to release diferulates. The RuFae4 alone releas...

  11. Methods for comparative metagenomics

    PubMed Central

    Huson, Daniel H; Richter, Daniel C; Mitra, Suparna; Auch, Alexander F; Schuster, Stephan C

    2009-01-01

    Background: Metagenomics is a rapidly growing field of research that aims at studying uncultured organisms to understand the true diversity of microbes, their functions, cooperation and evolution, in environments such as soil, water, ancient remains of animals, or the digestive system of animals and humans. The recent development of ultra-high throughput sequencing technologies, which do not require cloning or PCR amplification, and can produce huge numbers of DNA reads at an affordable cost, has boosted the number and scope of metagenomic sequencing projects. Increasingly, there is a need for new ways of comparing multiple metagenomics datasets, and for fast and user-friendly implementations of such approaches. Results: This paper introduces a number of new methods for interactively exploring, analyzing and comparing multiple metagenomic datasets, which will be made freely available in a new, comparative version 2.0 of the stand-alone metagenome analysis tool MEGAN. Conclusion: There is a great need for powerful and user-friendly tools for comparative analysis of metagenomic data and MEGAN 2.0 will help to fill this gap. PMID:19208111

  12. Isolation and characterization of a novel metagenomic enzyme capable of degrading bacterial phytotoxin toxoflavin

    PubMed Central

    Lee, Boyoung; Park, Ji Hyun; Oh, Joon Young; Choi, Jung Sup; Kim, Jin-Cheol

    2018-01-01

    Toxoflavin, a 7-azapteridine phytotoxin produced by the bacterial pathogens such as Burkholderia glumae and Burkholderia gladioli, has been known as one of the key virulence factors in crop diseases. Because the toxoflavin had an antibacterial activity, a metagenomic E. coli clone capable of growing well in the presence of toxoflavin (30 μg/ml) was isolated and the first metagenome-derived toxoflavin-degrading enzyme, TxeA of 140 amino acid residues, was identified from the positive E. coli clone. The conserved amino acids for metal-binding and extradiol dioxygenase activity, Glu-12, His-8 and Glu-130, were revealed by the sequence analysis of TxeA. The optimum conditions for toxoflavin degradation were evaluated with the TxeA purified in E. coli. Toxoflavin was totally degraded at an initial toxoflavin concentration of 100 μg/ml and at pH 5.0 in the presence of Mn2+, dithiothreitol and oxygen. The final degradation products of toxoflavin and methyltoxoflavin were fully identified by MS and NMR as triazines. Therefore, we suggested that the new metagenomic enzyme, TxeA, provided the clue to applying the new metagenomic enzyme to resistance development of crop plants to toxoflavin-mediated disease as well as to biocatalysis for Baeyer-Villiger type oxidation. PMID:29293506

  13. Gene cloning and characterization of a novel esterase from activated sludge metagenome

    PubMed Central

    2009-01-01

    A metagenomic library was prepared using pCC2FOS vector containing about 3.0 Gbp of community DNA from the microbial assemblage of activated sludge. Screening of a part of the un-amplified library resulted in the finding of 1 unique lipolytic clone capable of hydrolyzing tributyrin, in which an esterase gene was identified. This esterase/lipase gene consists of 834 bp and encodes a polypeptide (designated EstAS) of 277 amino acid residues with a molecular mass of 31 kDa. Sequence analysis indicated that it showed 33% and 31% amino acid identity to esterase/lipase from Gemmata obscuriglobus UQM 2246 (ZP_02733109) and Yarrowia lipolytica CLIB122 (XP_504639), respectively; and several conserved regions were identified, including the putative active site, HSMGG, a catalytic triad (Ser92, His125 and Asp216) and a LHYFRG conserved motif. The EstAS was overexpressed, purified and shown to hydrolyse p-nitrophenyl (NP) esters of fatty acids with short chain lengths (≤ C8). This EstAS had optimal temperature and pH at 35°C and 9.0, respectively, by hydrolysis of p-NP hexanoate. It also exhibited the same level of stability over wide temperature and pH ranges and in the presence of metal ions or detergents. The high level of stability of esterase EstAS, together with its unique substrate specificities, makes it highly useful for biotechnological applications. PMID:20028524

  14. Mining the metagenome of activated biomass of an industrial wastewater treatment plant by a novel method.

    PubMed

    Sharma, Nandita; Tanksale, Himgouri; Kapley, Atya; Purohit, Hemant J

    2012-12-01

    Metagenomic libraries herald the era of magnifying the microbial world, tapping into the vast metabolic potential of uncultivated microbes, and enhancing the rate of discovery of novel genes and pathways. In this paper, we describe a method that facilitates the extraction of metagenomic DNA from activated sludge of an industrial wastewater treatment plant and its use in mining the metagenome via library construction. The efficiency of this method was demonstrated by the large representation of the bacterial genome in the constructed metagenomic libraries and by the functional clones obtained. The BAC library represented 95.6 times the bacterial genome, while, the pUC library represented 41.7 times the bacterial genome. Twelve clones in the BAC library demonstrated lipolytic activity, while four clones demonstrated dioxygenase activity. Four clones in pUC library tested positive for cellulase activity. This method, using FTA cards, not only can be used for library construction, but can also store the metagenome at room temperature.

  15. Prospecting Metagenomic Enzyme Subfamily Genes for DNA Family Shuffling by a Novel PCR-based Approach

    PubMed Central

    Wang, Qiuyan; Wu, Huili; Wang, Anming; Du, Pengfei; Pei, Xiaolin; Li, Haifeng; Yin, Xiaopu; Huang, Lifeng; Xiong, Xiaolong

    2010-01-01

    DNA family shuffling is a powerful method for enzyme engineering, which utilizes recombination of naturally occurring functional diversity to accelerate laboratory-directed evolution. However, the use of this technique has been hindered by the scarcity of family genes with the required level of sequence identity in the genome database. We describe here a strategy for collecting metagenomic homologous genes for DNA shuffling from environmental samples by truncated metagenomic gene-specific PCR (TMGS-PCR). Using identified metagenomic gene-specific primers, twenty-three 921-bp truncated lipase gene fragments, which shared 64–99% identity with each other and formed a distinct subfamily of lipases, were retrieved from 60 metagenomic samples. These lipase genes were shuffled, and selected active clones were characterized. The chimeric clones show extensive functional and genetic diversity, as demonstrated by functional characterization and sequence analysis. Our results indicate that homologous sequences of genes captured by TMGS-PCR can be used as suitable genetic material for DNA family shuffling with broad applications in enzyme engineering. PMID:20962349

  16. Ensemble-based classification approach for micro-RNA mining applied on diverse metagenomic sequences.

    PubMed

    ElGokhy, Sherin M; ElHefnawi, Mahmoud; Shoukry, Amin

    2014-05-06

    MicroRNAs (miRNAs) are endogenous ∼22 nt RNAs that are identified in many species as powerful regulators of gene expression. Experimental identification of miRNAs is still slow since miRNAs are difficult to isolate by cloning due to their low expression, low stability, tissue specificity and the high cost of the cloning procedure. Thus, computational identification of miRNAs from genomic sequences provides a valuable complement to cloning. Different approaches for identification of miRNAs have been proposed based on homology, thermodynamic parameters, and cross-species comparisons. The present paper focuses on the integration of miRNA classifiers in a meta-classifier and the identification of miRNAs from metagenomic sequences collected from different environments. An ensemble of classifiers is proposed for miRNA hairpin prediction based on four well-known classifiers (Triplet-SVM, MiPred, Virgo and EumiR), with non-identical features, and which have been trained on different data. Their decisions are combined using a single hidden layer neural network to increase the accuracy of the predictions. Our ensemble classifier achieved 89.3% accuracy, 82.2% f-measure, 74% sensitivity, 97% specificity, 92.5% precision and 88.2% negative predictive value when tested on real miRNA and pseudo sequence data. The area under the receiver operating characteristic curve of our classifier is 0.9, which represents a high performance index. The proposed classifier yields a significant performance improvement relative to Triplet-SVM, Virgo and EumiR and a minor refinement over MiPred. The developed ensemble classifier is used for miRNA prediction in mine drainage, groundwater and marine metagenomic sequences downloaded from the NCBI Sequence Read Archive. By consulting the miRBase repository, 179 miRNAs have been identified as highly probable miRNAs. Our new approach could thus be used for mining metagenomic sequences and finding new and homologous miRNAs. The paper investigates a
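    The six performance figures quoted in this abstract are all functions of one confusion matrix, which the following sketch makes explicit using the standard definitions. The abstract does not report the underlying counts; the counts used here (74 true positives, 6 false positives, 194 true negatives, 26 false negatives) are hypothetical values chosen only because they reproduce the quoted percentages, and the function name `binary_metrics` is likewise an assumption.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall: share of real positives that are found
    specificity = tn / (tn + fp)   # share of real negatives that are rejected
    precision = tp / (tp + fp)     # share of positive calls that are correct
    npv = tn / (tn + fn)           # share of negative calls that are correct
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F-measure: harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "f_measure": f_measure,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
    }


# Hypothetical counts (not from the paper): 100 positives and 200 negatives.
metrics = binary_metrics(tp=74, fp=6, tn=194, fn=26)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

    With these illustrative counts the six values round to the percentages in the abstract, which shows that the reported figures are mutually consistent for a test set with roughly one positive per two negatives. The actual test-set composition is not stated in the abstract.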

  17. Marine Metagenome as A Resource for Novel Enzymes.

    PubMed

    Alma'abadi, Amani D; Gojobori, Takashi; Mineta, Katsuhiko

    2015-10-01

    More than 99% of identified prokaryotes, including many from the marine environment, cannot be cultured in the laboratory. This lack of capability restricts our knowledge of microbial genetics and community ecology. Metagenomics, the culture-independent cloning of environmental DNAs that are isolated directly from an environmental sample, has already provided a wealth of information about the uncultured microbial world. It has also facilitated the discovery of novel biocatalysts by allowing researchers to probe directly into a huge diversity of enzymes within natural microbial communities. Recent advances in these studies have led to a great interest in recruiting microbial enzymes for the development of environmentally-friendly industry. Although the metagenomics approach has many limitations, it is expected to provide not only scientific insights but also economic benefits, especially in industry. This review highlights the importance of metagenomics in mining microbial lipases, as an example, by using high-throughput techniques. In addition, we discuss challenges in metagenomics as an important part of bioinformatics analysis in big data. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.

  18. Heterologous viral expression systems in fosmid vectors increase the functional analysis potential of metagenomic libraries.

    PubMed

    Terrón-González, L; Medina, C; Limón-Mortés, M C; Santero, E

    2013-01-01

    The extraordinary potential of metagenomic functional analyses to identify activities of interest present in uncultured microorganisms has been limited by reduced gene expression in surrogate hosts. We have developed vectors and specialized E. coli strains as improved metagenomic DNA heterologous expression systems, taking advantage of viral components that prevent transcription termination at metagenomic terminators. One of the systems uses the phage T7 RNA-polymerase to drive metagenomic gene expression, while the other approach uses the lambda phage transcription anti-termination protein N to limit transcription termination. A metagenomic library was constructed and functionally screened to identify genes conferring carbenicillin resistance to E. coli. The use of these enhanced expression systems resulted in a 6-fold increase in the frequency of carbenicillin resistant clones. Subcloning and sequence analysis showed that, besides β-lactamases, efflux pumps are not only able to contribute to carbenicillin resistance but may in fact be sufficient by themselves to convey carbenicillin resistance.

  19. Molecular cloning, expression, and characterization of four novel thermo-alkaliphilic enzymes retrieved from a metagenomic library.

    PubMed

    Maruthamuthu, Mukil; van Elsas, Jan Dirk

    2017-01-01

    Enzyme discovery is a promising approach to aid in the deconstruction of recalcitrant plant biomass in an industrial process. Novel enzymes can be readily discovered by applying metagenomics on whole microbiomes. Our goal was to select, examine, and characterize eight novel glycoside hydrolases that were previously detected in metagenomic libraries, to serve biotechnological applications with high performance. Here, eight glycosyl hydrolase family candidate genes were selected from metagenomes of wheat straw-degrading microbial consortia using molecular cloning and subsequent gene expression studies in Escherichia coli. Four of the eight enzymes had significant activities on either pNP-β-d-galactopyranoside, pNP-β-d-xylopyranoside, pNP-α-l-arabinopyranoside or pNP-α-d-glucopyranoside. These proteins, denoted as proteins 1, 2, 5 and 6, were his-tag purified and their nature and activities further characterized using molecular and activity screens with the pNP-labeled substrates. Proteins 1 and 2 showed high homologies with (1) a β-galactosidase (74%) and (2) a β-xylosidase (84%), whereas the remaining two (5 and 6) were homologous with proteins reported as a diguanylate cyclase and an aquaporin, respectively. The β-galactosidase- and β-xylosidase-like proteins 1 and 2 were confirmed as being responsible for previously found thermo-alkaliphilic glycosidase activities of extracts of E. coli carrying the respective source fosmids. Remarkably, the β-xylosidase-like protein 2 showed activities with both pNP-Xyl and pNP-Ara in the temperature range 40-50 °C and pH range 8.0-10.0. Moreover, proteins 5 and 6 showed thermotolerant α-glucosidase activity at pH 10.0. In silico structure prediction of protein 5 revealed the presence of a potential "GGDEF" catalytic site, encoding α-glucosidase activity, whereas that of protein 6 showed a "GDSL" site, encoding a 'new family' α-glucosidase activity. Using a rational screening approach, we identified and

  20. Production of indole antibiotics induced by exogenous gene derived from sponge metagenomes.

    PubMed

    Takeshige, Yuya; Egami, Yoko; Wakimoto, Toshiyuki; Abe, Ikuro

    2015-05-01

    Sponge metagenomes are accessible genetic sources containing genes and gene clusters responsible for the biosynthesis of sponge-derived bioactive natural products. In this study, we obtained the clone pDC112, producing turbomycin A and 2,2-di(3-indolyl)-3-indolone, based on the functional screening of the metagenome library derived from the marine sponge Discodermia calyx. The subcloning experiment identified ORF 25, which is homologous to inosine 5'-monophosphate dehydrogenase and required for the production of 2,2-di(3-indolyl)-3-indolone in Escherichia coli.

  1. Functional metagenomics to decipher food-microbe-host crosstalk.

    PubMed

    Larraufie, Pierre; de Wouters, Tomas; Potocki-Veronese, Gabrielle; Blottière, Hervé M; Doré, Joël

    2015-02-01

    The recent developments of metagenomics permit an extremely high-resolution molecular scan of the intestinal microbiota giving new insights and opening perspectives for clinical applications. Beyond the unprecedented vision of the intestinal microbiota given by large-scale quantitative metagenomics studies, such as the EU MetaHIT project, functional metagenomics tools allow the exploration of fine interactions between food constituents, microbiota and host, leading to the identification of signals and intimate mechanisms of crosstalk, especially between bacteria and human cells. Cloning of large genome fragments, either from complex intestinal communities or from selected bacteria, allows the screening of these biological resources for bioactivity towards complex plant polymers or functional food such as prebiotics. This permitted identification of novel carbohydrate-active enzyme families involved in dietary fibre and host glycan breakdown, and highlighted unsuspected bacterial players at the top of the intestinal microbial food chain. Similarly, exposure of fractions from genomic and metagenomic clones onto human cells engineered with reporter systems to track modulation of immune response, cell proliferation or cell metabolism has allowed the identification of bioactive clones modulating key cell signalling pathways or the induction of specific genes. This opens the possibility to decipher mechanisms by which commensal bacteria or candidate probiotics can modulate the activity of cells in the intestinal epithelium or even in distal organs such as the liver, adipose tissue or the brain. Hence, in spite of our inability to culture many of the dominant microbes of the human intestine, functional metagenomics open a new window for the exploration of food-microbe-host crosstalk.

  2. A Novel Prosthetic Joint Infection Pathogen, Mycoplasma salivarium, Identified by Metagenomic Shotgun Sequencing.

    PubMed

    Thoendel, Matthew; Jeraldo, Patricio; Greenwood-Quaintance, Kerryl E; Chia, Nicholas; Abdel, Matthew P; Steckelberg, James M; Osmon, Douglas R; Patel, Robin

    2017-07-15

    Defining the microbial etiology of culture-negative prosthetic joint infection (PJI) can be challenging. Metagenomic shotgun sequencing is a new tool to identify organisms undetected by conventional methods. We present a case where metagenomics was used to identify Mycoplasma salivarium as a novel PJI pathogen in a patient with hypogammaglobulinemia. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America.

  3. Screening a wide host-range, waste-water metagenomic library in tryptophan auxotrophs of Rhizobium leguminosarum and of Escherichia coli reveals different classes of cloned trp genes.

    PubMed

    Li, Youguo; Wexler, Margaret; Richardson, David J; Bond, Philip L; Johnston, Andrew W B

    2005-12-01

    A metagenomic cosmid library was constructed, in which the insert DNA was derived from bacteria in a waste-water treatment plant and the vector was the wide host-range cosmid pLAFR3. The library was screened for clones that could correct defined tryptophan auxotrophs of the alpha-proteobacterium Rhizobium leguminosarum and of Escherichia coli. A total of 26 different cosmids that corrected at least one trp mutant in one or both of these species were obtained. Several cosmids corrected the auxotrophy of one or more R. leguminosarum trp mutants, but not the corresponding mutants in E. coli. Conversely, one cosmid corrected trpA, B, C, D and E mutants of E. coli but none of the trp mutants of R. leguminosarum. Two of the Trp+ cosmids were examined in more detail. One contained a trp operon that resembled that of the pathogen Chlamydophila caviae, containing the unusual kynU gene, which specifies kynureninase. The other, whose trp genes functioned in R. leguminosarum but not in E. coli, contained trpDCFBA in an operon that is likely co-transcribed with five other genes, most of which had no known link with tryptophan synthesis. The sequences of these TRP proteins, and the products of nine other genes encoded by this cosmid, failed to affiliate them with any known bacterial lineage. For one metagenomic cosmid, lac reporter fusions confirmed that its cloned trp genes were transcribed in R. leguminosarum, but not in E. coli. Thus, rhizobia, with their many sigma-factors, may be well-suited hosts for metagenomic libraries, cloned in wide host-range vectors.

  4. Finding and identifying the viral needle in the metagenomic haystack: trends and challenges

    PubMed Central

    Soueidan, Hayssam; Schmitt, Louise-Amélie; Candresse, Thierry; Nikolski, Macha

    2015-01-01

    Collectively, viruses have the greatest genetic diversity on Earth, occupy extremely varied niches and are likely able to infect all living organisms. Viral infections are an important issue for human health and cause considerable economic losses when agriculturally important crops or husbandry animals are infected. The advent of metagenomics has provided a precious tool to study viruses by sampling them in natural environments and identifying the genomic composition of a sample. However, reaching a clear recognition and taxonomic assignment of the identified viruses has been hampered by the computational difficulty of these problems. In this perspective paper we examine the trends in current research for the identification of viral sequences in a metagenomic sample, pinpoint the intrinsic computational difficulties for the identification of novel viral sequences within metagenomic samples, and suggest possible avenues to overcome them. PMID:25610431

  5. A novel esterase gene cloned from a metagenomic library from neritic sediments of the South China Sea

    PubMed Central

    2011-01-01

    Background: Marine microbes are a large and diverse group exposed to a wide variety of pressure, temperature, salinity, nutrient availability and other environmental conditions. They provide a huge potential source of novel enzymes with unique properties that may be useful in industry and biotechnology. To explore the lipolytic genetic resources of the South China Sea, 23 sediment samples were collected from marine areas at depths < 100 m. Results: A metagenomic library of the South China Sea sediment assemblage, containing about 194 Mb of community DNA, was prepared in a plasmid vector. Screening of part of the unamplified library resulted in the isolation of 15 unique lipolytic clones able to hydrolyze tributyrin. The novel esterase (Est_p1) carried by a positive recombinant clone (pNLE1) was successfully expressed in E. coli and purified. In a series of assays, Est_p1 displayed maximal activity at pH 8.57 and 40°C with p-nitrophenyl butyrate (C4) as substrate. Compared to other metagenomic esterases, Est_p1 showed notable specificity for the C4 substrate (kcat/Km value 11,500 s⁻¹ mM⁻¹) and was not inhibited by phenylmethylsulfonyl fluoride, suggesting that the substrate-binding pocket is suited to the C4 substrate and that the active-site serine residue is buried at the bottom of the substrate-binding pocket, sheltered by a lid structure. Conclusions: Esterases with specificity towards short-chain fatty acids, especially butanoic acid, are commercially available as potent flavoring tools. Given its outstanding activity and specificity for the C4 substrate, Est_p1 has potential application in flavor industries requiring hydrolysis of short-chain esters. PMID:22067554

  6. A metagenomic β-glucuronidase uncovers a core adaptive function of the human intestinal microbiome

    PubMed Central

    Gloux, Karine; Berteau, Olivier; El oumami, Hanane; Béguet, Fabienne; Leclerc, Marion; Doré, Joël

    2011-01-01

    In the human gastrointestinal tract, bacterial β-D-glucuronidases (BG; E.C. 3.2.1.31) are involved both in xenobiotic metabolism and in some of the beneficial effects of dietary compounds. Despite their biological significance, investigations are hampered by the fact that only a few BGs have so far been studied. A functional metagenomic approach was therefore performed on intestinal metagenomic libraries using chromogenic glucuronides as probes. Using this strategy, 19 positive metagenomic clones were identified but only one exhibited strong β-D-glucuronidase activity when subcloned into an expression vector. The cloned gene encoded a β-D-glucuronidase (called H11G11-BG) that had distant amino acid sequence homologies and an additional C terminus domain compared with known β-D-glucuronidases. Fifteen homologs were identified in public bacterial genome databases (38–57% identity with H11G11-BG) in the Firmicutes phylum. The genomes identified derived from strains from Ruminococcaceae, Lachnospiraceae, and Clostridiaceae. The genetic context diversity, with closely related symporters and gene duplication, argued for functional diversity and contribution to adaptive mechanisms. In contrast to the previously known β-D-glucuronidases, this previously undescribed type was present in the published microbiome of each healthy adult/child investigated (n = 11) and was specific to the human gut ecosystem. In conclusion, our functional metagenomic approach revealed a class of BGs that may be part of a functional core specifically evolved to adapt to the human gut environment with major health implications. We propose consensus motifs for this unique Firmicutes β-D-glucuronidase subfamily and for the glycosyl hydrolase family 2. PMID:20615998

  7. Functional metagenomics to mine the human gut microbiome for dietary fiber catabolic enzymes.

    PubMed

    Tasse, Lena; Bercovici, Juliette; Pizzut-Serin, Sandra; Robe, Patrick; Tap, Julien; Klopp, Christophe; Cantarel, Brandi L; Coutinho, Pedro M; Henrissat, Bernard; Leclerc, Marion; Doré, Joël; Monsan, Pierre; Remaud-Simeon, Magali; Potocki-Veronese, Gabrielle

    2010-11-01

    The human gut microbiome is a complex ecosystem composed mainly of uncultured bacteria. It plays an essential role in the catabolism of dietary fibers, the part of plant material in our diet that is not metabolized in the upper digestive tract, because the human genome does not encode adequate carbohydrate active enzymes (CAZymes). We describe a multi-step functionally based approach to guide the in-depth pyrosequencing of specific regions of the human gut metagenome encoding the CAZymes involved in dietary fiber breakdown. High-throughput functional screens were first applied to a library covering 5.4 × 10^9 bp of metagenomic DNA, allowing the isolation of 310 clones showing beta-glucanase, hemicellulase, galactanase, amylase, or pectinase activities. Based on the results of refined secondary screens, sequencing efforts were reduced to 0.84 Mb of nonredundant metagenomic DNA, corresponding to 26 clones that were particularly efficient for the degradation of raw plant polysaccharides. Seventy-three CAZymes from 35 different families were discovered. This corresponds to a fivefold target-gene enrichment compared to random sequencing of the human gut metagenome. Thirty-three of these CAZyme-encoding genes are highly homologous to prevalent genes found in the gut microbiome of at least 20 individuals for whom metagenomic data are available. Moreover, 18 multigenic clusters encoding complementary enzyme activities for plant cell wall degradation were also identified. Gene taxonomic assignment is consistent with horizontal gene transfer events in dominant gut species and provides new insights into the human gut functional trophic chain.

  8. High yield of functional metagenomic library from mangroves constructed in fosmid vector.

    PubMed

    Gonçalves, A C S; dos Santos, A C F; dos Santos, T F; Pessoa, T B A; Dias, J C T; Rezende, R P

    2015-10-02

    In the present study, a metagenomic approach and fosmid vectors were used to construct a library of clones for exploring the biotechnological potential of mangrove soils by isolation of functional genes encoding hydrolytic enzymes. The library was built with genomic DNA from soil samples of mangrove sediments, and functional screening of 1824 clones (~64 Mbp) was performed to detect hydrolytic activity specific for cellulases, amylases (at acidic, neutral and basic pH), lipases/esterases, proteases, and nitrilases. Significant numbers of clones positive for the tested enzyme activities were obtained. Our results indicate the importance and biotechnological potential of mangrove soils, especially when compared to those obtained using other soil metagenomic libraries.

  9. Rapid and efficient method to extract metagenomic DNA from estuarine sediments.

    PubMed

    Shamim, Kashif; Sharma, Jaya; Dubey, Santosh Kumar

    2017-07-01

    Metagenomic DNA from sediments of selected estuaries of Goa, India was extracted using a simple, fast, efficient and environment-friendly method. The recovery of pure metagenomic DNA with our method was significantly higher than with other well-known methods, with the concentration of recovered metagenomic DNA ranging from 1185.1 to 4579.7 µg/g of sediment. The purity of the metagenomic DNA was also considerably high, as the ratio of absorbance at 260 and 280 nm ranged from 1.88 to 1.94. Therefore, the recovered metagenomic DNA was directly used to perform various molecular biology experiments, viz. restriction digestion, PCR amplification, cloning and metagenomic library construction. This clearly proved that our protocol for metagenomic DNA extraction using silica gel efficiently removed the contaminants and prevented shearing of the metagenomic DNA. Thus, this modified method can be used to recover pure metagenomic DNA from various estuarine sediments in a rapid, efficient and eco-friendly manner.

  10. A Primer on Metagenomics

    PubMed Central

    Wooley, John C.; Godzik, Adam; Friedberg, Iddo

    2010-01-01

    Metagenomics is a discipline that enables the genomic study of uncultured microorganisms. Faster, cheaper sequencing technologies and the ability to sequence uncultured microbes sampled directly from their habitats are expanding and transforming our view of the microbial world. Distilling meaningful information from the millions of new genomic sequences presents a serious challenge to bioinformaticians. In cultured microbes, the genomic data come from a single clone, making sequence assembly and annotation tractable. In metagenomics, the data come from heterogeneous microbial communities, sometimes containing more than 10,000 species, with the sequence data being noisy and partial. From sampling, to assembly, to gene calling and function prediction, bioinformatics faces new demands in interpreting voluminous, noisy, and often partial sequence data. Although metagenomics is a relative newcomer to science, the past few years have seen an explosion in computational methods applied to metagenomic-based research. It is therefore not within the scope of this article to provide an exhaustive review. Rather, we provide here a concise yet comprehensive introduction to the current computational requirements presented by metagenomics, and review the recent progress made. We also note whether there is software that implements any of the methods presented here, and briefly review its utility. Nevertheless, it would be useful if readers of this article would avail themselves of the comment section provided by this journal, and relate their own experiences. Finally, the last section of this article provides a few representative studies illustrating different facets of recent scientific discoveries made using metagenomics. PMID:20195499

  11. Mining for Nonribosomal Peptide Synthetase and Polyketide Synthase Genes Revealed a High Level of Diversity in the Sphagnum Bog Metagenome

    PubMed Central

    Müller, Christina A.; Oberauner-Wappis, Lisa; Peyman, Armin; Amos, Gregory C. A.; Wellington, Elizabeth M. H.

    2015-01-01

    Sphagnum bog ecosystems are among the oldest vegetation forms harboring a specific microbial community and are known to produce an exceptionally wide variety of bioactive substances. Although the Sphagnum metagenome shows a rich secondary metabolism, the genes have not yet been explored. To analyze nonribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs), the diversity of NRPS and PKS genes in Sphagnum-associated metagenomes was investigated by in silico data mining and sequence-based screening (PCR amplification of 9,500 fosmid clones). The in silico Illumina-based metagenomic approach resulted in the identification of 279 NRPSs and 346 PKSs, as well as 40 PKS-NRPS hybrid gene sequences. The occurrence of NRPS sequences was strongly dominated by the members of the Proteobacteria phylum, especially by species of the Burkholderia genus, while PKS sequences were mainly affiliated with Actinobacteria. Thirteen novel NRPS-related sequences were identified by PCR amplification screening, displaying amino acid identities of 48% to 91% to annotated sequences of members of the phyla Proteobacteria, Actinobacteria, and Cyanobacteria. Some of the identified metagenomic clones showed the closest similarity to peptide synthases from Burkholderia or Lysobacter, which are emerging bacterial sources of as-yet-undescribed bioactive metabolites. This report highlights the role of the extreme natural ecosystems as a promising source for detection of secondary compounds and enzymes, serving as a source for biotechnological applications. PMID:26002894

  12. Inhibition of the growth of Bacillus subtilis DSM10 by a newly discovered antibacterial protein from the soil metagenome.

    PubMed

    O'Mahony, Mark M; Henneberger, Ruth; Selvin, Joseph; Kennedy, Jonathan; Doohan, Fiona; Marchesi, Julian R; Dobson, Alan D W

    2015-01-01

    A functional metagenomics-based approach exploiting the microbiota of suppressive soils from an organic field site has succeeded in the identification of a clone with the ability to inhibit the growth of Bacillus subtilis DSM10. Sequencing of the fosmid identified a putative β-lactamase-like gene, abgT. Transposon mutagenesis of the abgT gene resulted in a loss of the ability to inhibit the growth of B. subtilis DSM10. Further analysis of the deduced amino acid sequence of AbgT revealed moderate homology to esterases, suggesting that the protein may possess hydrolytic activity. Weak lipolytic activity was detected; however, the clone did not appear to produce any β-lactamase activity. Phylogenetic analysis revealed that the protein is a member of the family VIII group of lipases/esterases and clusters with a number of proteins of metagenomic origin. The abgT gene was sub-cloned into a protein expression vector and, when introduced into the abgT transposon mutant clones, restored the ability of the clones to inhibit the growth of B. subtilis DSM10, clearly indicating that the abgT gene is involved in the antibacterial activity. While the precise role of this protein has yet to be fully elucidated, it may be involved in the generation of free fatty acids with antibacterial properties. Thus, functional metagenomic approaches continue to provide a significant resource for the discovery of novel functional proteins, and it is clear that hydrolytic enzymes, such as AbgT, may be a potential source for the development of future antimicrobial therapies.

  13. Novel polyhydroxyalkanoate copolymers produced in Pseudomonas putida by metagenomic polyhydroxyalkanoate synthases.

    PubMed

    Cheng, Jiujun; Charles, Trevor C

    2016-09-01

    Bacterially produced biodegradable polyhydroxyalkanoates (PHAs) with versatile properties can be achieved using different PHA synthases (PhaCs). This work aims to expand the diversity of known PhaCs via functional metagenomics and demonstrates the use of these novel enzymes in PHA production. Complementation of a PHA synthesis-deficient Pseudomonas putida strain with a soil metagenomic cosmid library retrieved 27 clones expressing either class I, class II, or unclassified PHA synthases, and many did not have close sequence matches to known PhaCs. The composition of PHA produced by these clones was dependent on both the supplied growth substrates and the nature of the PHA synthase, with various combinations of short-chain-length (SCL) and medium-chain-length (MCL) PHA. These data demonstrate the ability to isolate diverse genes for PHA synthesis by functional metagenomics and their use for the production of a variety of PHA polymer and copolymer mixtures.

  14. Mining for hemicellulases in the fungus-growing termite Pseudacanthotermes militaris using functional metagenomics.

    PubMed

    Bastien, Géraldine; Arnal, Grégory; Bozonnet, Sophie; Laguerre, Sandrine; Ferreira, Fernando; Fauré, Régis; Henrissat, Bernard; Lefèvre, Fabrice; Robe, Patrick; Bouchez, Olivier; Noirot, Céline; Dumon, Claire; O'Donohue, Michael

    2013-05-14

    The metagenomic analysis of gut microbiomes has emerged as a powerful strategy for the identification of biomass-degrading enzymes, which will be no doubt useful for the development of advanced biorefining processes. In the present study, we have performed a functional metagenomic analysis on comb and gut microbiomes associated with the fungus-growing termite, Pseudacanthotermes militaris. Using whole termite abdomens and fungal-comb material respectively, two fosmid-based metagenomic libraries were created and screened for the presence of xylan-degrading enzymes. This revealed 101 positive clones, corresponding to an extremely high global hit rate of 0.49%. Many clones displayed either β-d-xylosidase (EC 3.2.1.37) or α-l-arabinofuranosidase (EC 3.2.1.55) activity, while others displayed the ability to degrade AZCL-xylan or AZCL-β-(1,3)-β-(1,4)-glucan. Using secondary screening it was possible to pinpoint clones of interest that were used to prepare fosmid DNA. Sequencing of fosmid DNA generated 1.46 Mbp of sequence data, and bioinformatics analysis revealed 63 sequences encoding putative carbohydrate-active enzymes, with many of these forming parts of sequence clusters, probably having carbohydrate degradation and metabolic functions. Taxonomic assignment of the different sequences revealed that Firmicutes and Bacteroidetes were predominant phyla in the gut sample, while microbial diversity in the comb sample resembled that of typical soil samples. Cloning and expression in E. coli of six enzyme candidates identified in the libraries provided access to individual enzyme activities, which all proved to be coherent with the primary and secondary functional screens. This study shows that the gut microbiome of P. militaris possesses the potential to degrade biomass components, such as arabinoxylans and arabinans. Moreover, the data presented suggests that prokaryotic microorganisms present in the comb could also play a part in the degradation of biomass within the

  15. Cloning, Expression and Characteristics of a Novel Alkalistable and Thermostable Xylanase Encoding Gene (Mxyl) Retrieved from Compost-Soil Metagenome

    PubMed Central

    Verma, Digvijay; Kawarabayasi, Yutaka; Miyazaki, Kentaro; Satyanarayana, Tulasi

    2013-01-01

    Background: Alkalistable and thermostable xylanases are in high demand for pulp bleaching in the paper industry and for generating xylooligosaccharides by hydrolyzing the xylan component of agro-residues. Compost-soil samples, one of the hot environments, are expected to be a rich source of microbes with thermostable enzymes. Methodology/Principal Findings: Metagenomic DNA from hot environmental samples could be a rich source of novel biocatalysts. While screening a metagenomic library constructed from DNA extracted from the compost-soil in the p18GFP vector, a clone (TSDV-MX1) was detected that exhibited a clear zone of xylan hydrolysis on an RBB xylan plate. Sequencing of the 6.321 kb DNA insert and its BLAST analysis detected the presence of a xylanase gene comprising 1077 bp. The deduced protein sequence (358 amino acids) displayed homology with glycosyl hydrolase (GH) family 11 xylanases. The gene was subcloned into the pET28a vector and expressed in E. coli BL21 (DE3). The recombinant xylanase (rMxyl) exhibited activity over a broad range of pH and temperature with optima at pH 9.0 and 80°C. The recombinant xylanase is highly thermostable, having a T1/2 of 2 h at 80°C and 15 min at 90°C. Conclusion/Significance: This is the first report on the retrieval of a xylanase gene through a metagenomic approach that encodes an enzyme with alkalistability and thermostability. The recombinant xylanase has a potential application in the paper and pulp industry in pulp bleaching and generating xylooligosaccharides from the abundantly available agro-residues. PMID:23382818

  16. Identification and characterization of a cellulase-encoding gene from the buffalo rumen metagenomic library.

    PubMed

    Nguyen, Nhung Hong; Maruset, Lalita; Uengwetwanit, Tanaporn; Mhuantong, Wuttichai; Harnpicharnchai, Piyanun; Champreda, Verawat; Tanapongpipat, Sutipa; Jirajaroenrat, Kanya; Rakshit, Sudip K; Eurwilaichitr, Lily; Pongpattanakitshote, Somchai

    2012-01-01

    Microorganisms residing in the rumens of cattle represent a rich source of lignocellulose-degrading enzymes, since their diet consists of plant-based materials that are high in cellulose and hemicellulose. In this study, a metagenomic library was constructed from buffalo rumen contents using pCC1FOS fosmid vector. Ninety-three clones from the pooled library of approximately 10,000 clones showed degrading activity against AZCL-HE-Cellulose, whereas four other clones showed activity against AZCL-Xylan. Contig analysis of pyrosequencing data derived from the selected strongly positive clones revealed 15 ORFs that were closely related to lignocellulose-degrading enzymes belonging to several glycosyl hydrolase families. Glycosyl hydrolase family 5 (GHF5) was the most abundant glycosyl hydrolase found, and a majority of the GHF5s in our metagenomes were closely related to several ruminal bacteria, especially ones from other buffalo rumen metagenomes. Characterization of BT-01, a selected clone with highest cellulase activity from the primary plate screening assay, revealed a cellulase encoding gene with optimal working conditions at pH 5.5 at 50 °C. Along with its stability over acidic pH, the capability efficiently to hydrolyze cellulose in feed for broiler chickens, as exhibited in an in vitro digestibility test, suggests that BT-01 has potential application as a feed supplement.

  17. Production of Avaroferrin and Putrebactin by Heterologous Expression of a Deep-Sea Metagenomic DNA

    PubMed Central

    Fujita, Masaki J.; Sakai, Ryuichi

    2014-01-01

    The siderophore avaroferrin (1), an inhibitor of Vibrio swarming that was recently identified in Shewanella algae B516, was produced by heterologous expression of the biosynthetic gene cluster cloned from a deep-sea sediment metagenomic DNA, together with two analogues, bisucaberin (2) and putrebactin (3). Avaroferrin (1) is a macrocyclic heterodimer of N-hydroxy-N-succinyl cadaverine (4) and N-hydroxy-N-succinyl-putrescine (5), whereas analogues 2 and 3 are homodimers of 4 and 5, respectively. Heterologous expression of two other related genes from culturable marine bacteria resulted in production of compounds 1–3, but in quite different proportions compared with production through expression of the metagenomic DNA. PMID:25222668

  18. Mining for Nonribosomal Peptide Synthetase and Polyketide Synthase Genes Revealed a High Level of Diversity in the Sphagnum Bog Metagenome.

    PubMed

    Müller, Christina A; Oberauner-Wappis, Lisa; Peyman, Armin; Amos, Gregory C A; Wellington, Elizabeth M H; Berg, Gabriele

    2015-08-01

    Sphagnum bog ecosystems are among the oldest vegetation forms harboring a specific microbial community and are known to produce an exceptionally wide variety of bioactive substances. Although the Sphagnum metagenome shows a rich secondary metabolism, the genes have not yet been explored. To analyze nonribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs), the diversity of NRPS and PKS genes in Sphagnum-associated metagenomes was investigated by in silico data mining and sequence-based screening (PCR amplification of 9,500 fosmid clones). The in silico Illumina-based metagenomic approach resulted in the identification of 279 NRPSs and 346 PKSs, as well as 40 PKS-NRPS hybrid gene sequences. The occurrence of NRPS sequences was strongly dominated by the members of the Proteobacteria phylum, especially by species of the Burkholderia genus, while PKS sequences were mainly affiliated with Actinobacteria. Thirteen novel NRPS-related sequences were identified by PCR amplification screening, displaying amino acid identities of 48% to 91% to annotated sequences of members of the phyla Proteobacteria, Actinobacteria, and Cyanobacteria. Some of the identified metagenomic clones showed the closest similarity to peptide synthases from Burkholderia or Lysobacter, which are emerging bacterial sources of as-yet-undescribed bioactive metabolites. This report highlights the role of the extreme natural ecosystems as a promising source for detection of secondary compounds and enzymes, serving as a source for biotechnological applications. Copyright © 2015, American Society for Microbiology.

  19. The environment shapes microbial enzymes: five cold-active and salt-resistant carboxylesterases from marine metagenomes.

    PubMed

    Tchigvintsev, Anatoli; Tran, Hai; Popovic, Ana; Kovacic, Filip; Brown, Greg; Flick, Robert; Hajighasemi, Mahbod; Egorova, Olga; Somody, Joseph C; Tchigvintsev, Dmitri; Khusnutdinova, Anna; Chernikova, Tatyana N; Golyshina, Olga V; Yakimov, Michail M; Savchenko, Alexei; Golyshin, Peter N; Jaeger, Karl-Erich; Yakunin, Alexander F

    2015-03-01

    Most of the Earth's biosphere is cold and is populated by cold-adapted microorganisms. To explore the natural enzyme diversity of these environments and identify new carboxylesterases, we have screened three marine metagenome gene libraries for esterase activity. The screens identified 23 unique active clones, from which five highly active esterases were selected for biochemical characterization. The purified metagenomic esterases exhibited high activity against α-naphthyl and p-nitrophenyl esters with different chain lengths. All five esterases retained high activity at 5 °C indicating that they are cold-adapted enzymes. The activity of MGS0010 increased more than two times in the presence of up to 3.5 M NaCl or KCl, whereas the other four metagenomic esterases were inhibited to various degrees by these salts. The purified enzymes showed different sensitivities to inhibition by solvents and detergents, and the activities of MGS0010, MGS0105 and MGS0109 were stimulated three to five times by the addition of glycerol. Screening of purified esterases against 89 monoester substrates revealed broad substrate profiles with a preference for different esters. The metagenomic esterases also hydrolyzed several polyester substrates including polylactic acid suggesting that they can be used for polyester depolymerization. Thus, esterases from marine metagenomes are cold-adapted enzymes exhibiting broad biochemical diversity reflecting the environmental conditions where they evolved.

  20. Functional metagenomics reveals novel salt tolerance loci from the human gut microbiome

    PubMed Central

    Culligan, Eamonn P; Sleator, Roy D; Marchesi, Julian R; Hill, Colin

    2012-01-01

    Metagenomics is a powerful tool that allows for the culture-independent analysis of complex microbial communities. One of the most complex and dense microbial ecosystems known is that of the human distal colon, with cell densities reaching up to 10^12 per gram of faeces. With the majority of species as yet uncultured, there are an enormous number of novel genes awaiting discovery. In the current study, we conducted a functional screen of a metagenomic library of the human gut microbiota for potential salt-tolerant clones. Using transposon mutagenesis, three genes were identified from a single clone exhibiting high levels of identity to a species from the genus Collinsella (closest relative being Collinsella aerofaciens) (COLAER_01955, COLAER_01957 and COLAER_01981), a high G+C, Gram-positive member of the Actinobacteria commonly found in the human gut. The encoded proteins exhibit a strong similarity to GalE, MurB and MazG. Furthermore, pyrosequencing and bioinformatic analysis of two additional fosmid clones revealed the presence of an additional galE and mazG gene, with the highest level of genetic identity to Akkermansia muciniphila and Eggerthella sp. YY7918, respectively. Cloning and heterologous expression of the genes in the osmosensitive strain, Escherichia coli MKH13, resulted in increased salt tolerance of the transformed cells. It is hoped that the identification of atypical salt tolerance genes will help to further elucidate novel salt tolerance mechanisms, and will improve our understanding of how resident bacteria cope with the osmolarity of the gastrointestinal tract. PMID:22534607

  1. i-rDNA: alignment-free algorithm for rapid in silico detection of ribosomal gene fragments from metagenomic sequence data sets.

    PubMed

    Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Chadaram, Sudha; Mande, Sharmila S

    2011-11-30

    Obtaining accurate estimates of microbial diversity using rDNA profiling is the first step in most metagenomics projects. Consequently, most metagenomic projects spend considerable amounts of time, money and manpower for experimentally cloning, amplifying and sequencing the rDNA content in a metagenomic sample. In the second step, the entire genomic content of the metagenome is extracted, sequenced and analyzed. Since DNA sequences obtained in this second step also contain rDNA fragments, rapid in silico identification of these rDNA fragments would drastically reduce the cost, time and effort of current metagenomic projects by entirely bypassing the experimental steps of primer based rDNA amplification, cloning and sequencing. In this study, we present an algorithm called i-rDNA that can facilitate the rapid detection of 16S rDNA fragments from amongst millions of sequences in metagenomic data sets with high detection sensitivity. Performance evaluation with data sets/database variants simulating typical metagenomic scenarios indicates the significantly high detection sensitivity of i-rDNA. Moreover, i-rDNA can process a million sequences in less than an hour on a simple desktop with modest hardware specifications. In addition to the speed of execution, high sensitivity and low false positive rate, the utility of the algorithmic approach discussed in this paper is immense given that it would help in bypassing the entire experimental step of primer-based rDNA amplification, cloning and sequencing. Application of this algorithmic approach would thus drastically reduce the cost, time and human efforts invested in all metagenomic projects. A web-server for the i-rDNA algorithm is available at http://metagenomics.atc.tcs.com/i-rDNA/

  2. An application of statistics to comparative metagenomics

    PubMed Central

    Rodriguez-Brito, Beltran; Rohwer, Forest; Edwards, Robert A

    2006-01-01

    Background Metagenomics, sequence analyses of genomic DNA isolated directly from the environments, can be used to identify organisms and model community dynamics of a particular ecosystem. Metagenomics also has the potential to identify significantly different metabolic potential in different environments. Results Here we use a statistical method to compare curated subsystems, to predict the physiology, metabolism, and ecology from metagenomes. This approach can be used to identify those subsystems that are significantly different between metagenome sequences. Subsystems that were overrepresented in the Sargasso Sea and Acid Mine Drainage metagenome when compared to non-redundant databases were identified. Conclusion The methodology described herein applies statistics to the comparisons of metabolic potential in metagenomes. This analysis reveals those subsystems that are more, or less, represented in the different environments that are compared. These differences in metabolic potential lead to several testable hypotheses about physiology and metabolism of microbes from these ecosystems. PMID:16549025

  3. An application of statistics to comparative metagenomics.

    PubMed

    Rodriguez-Brito, Beltran; Rohwer, Forest; Edwards, Robert A

    2006-03-20

    Metagenomics, sequence analyses of genomic DNA isolated directly from the environments, can be used to identify organisms and model community dynamics of a particular ecosystem. Metagenomics also has the potential to identify significantly different metabolic potential in different environments. Here we use a statistical method to compare curated subsystems, to predict the physiology, metabolism, and ecology from metagenomes. This approach can be used to identify those subsystems that are significantly different between metagenome sequences. Subsystems that were overrepresented in the Sargasso Sea and Acid Mine Drainage metagenome when compared to non-redundant databases were identified. The methodology described herein applies statistics to the comparisons of metabolic potential in metagenomes. This analysis reveals those subsystems that are more, or less, represented in the different environments that are compared. These differences in metabolic potential lead to several testable hypotheses about physiology and metabolism of microbes from these ecosystems.
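
    The kind of comparison described in this record can be illustrated with a resampling test. The sketch below is a generic bootstrap on subsystem proportions and is not necessarily the authors' exact procedure; the function name, the toy subsystem labels and counts, and the parameters `draws`, `reps` and `alpha` are all assumptions made for illustration.

```python
import random

def bootstrap_diff_interval(counts_a, counts_b, subsystem,
                            draws=200, reps=500, alpha=0.05, seed=1):
    """Bootstrap interval for the difference in the proportion of one
    subsystem between two metagenomes (A minus B).

    counts_a, counts_b: dicts mapping subsystem name -> number of hits.
    """
    rng = random.Random(seed)
    # Expand the counts into pools of individual subsystem assignments.
    pool_a = [s for s, c in counts_a.items() for _ in range(c)]
    pool_b = [s for s, c in counts_b.items() for _ in range(c)]
    diffs = []
    for _ in range(reps):
        # Resample each metagenome with replacement and compare proportions.
        frac_a = sum(rng.choice(pool_a) == subsystem for _ in range(draws)) / draws
        frac_b = sum(rng.choice(pool_b) == subsystem for _ in range(draws)) / draws
        diffs.append(frac_a - frac_b)
    diffs.sort()
    lower = diffs[int(reps * alpha / 2)]
    upper = diffs[int(reps * (1 - alpha / 2)) - 1]
    return lower, upper

# Hypothetical hit counts for two metagenomes (not data from the paper).
sample_a = {"photosynthesis": 900, "sulfur_metabolism": 100}
sample_b = {"photosynthesis": 100, "sulfur_metabolism": 900}
lower, upper = bootstrap_diff_interval(sample_a, sample_b, "photosynthesis")
# The subsystem is called differentially represented if the interval excludes 0.
differs = lower > 0 or upper < 0
```

    An interval that excludes zero marks a subsystem as over- or underrepresented in one sample relative to the other; with many subsystems, a multiple-testing correction would also be needed.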

  4. Phylogenetic screening of a bacterial, metagenomic library using homing endonuclease restriction and marker insertion

    PubMed Central

    Yung, Pui Yi; Burke, Catherine; Lewis, Matt; Egan, Suhelen; Kjelleberg, Staffan; Thomas, Torsten

    2009-01-01

    Metagenomics provides access to the uncultured majority of the microbial world. The approaches employed in this field have, however, had limited success in linking functional genes to the taxonomic or phylogenetic origin of the organism they belong to. Here we present an efficient strategy to recover environmental DNA fragments that contain phylogenetic marker genes from metagenomic libraries. Our method involves the cleavage of 23S ribosomal RNA (rRNA) genes within pooled library clones by the homing endonuclease I-CeuI followed by the insertion and selection of an antibiotic resistance cassette. This approach was applied to screen a library of 6500 fosmid clones derived from the microbial community associated with the sponge Cymbastela concentrica. Several fosmid clones were recovered after the screen and detailed phylogenetic and taxonomic assignment based on the rRNA gene showed that they belong to previously unknown organisms. In addition, compositional features of these fosmid clones were used to classify and taxonomically assign a dataset of environmental shotgun sequences. Our approach represents a valuable tool for the analysis of rapidly increasing, environmental DNA sequencing information. PMID:19767618

  5. Construction and screening of marine metagenomic libraries.

    PubMed

    Weiland, Nancy; Löscher, Carolin; Metzger, Rebekka; Schmitz, Ruth

    2010-01-01

    Marine microbial communities are highly diverse and have evolved during extended evolutionary processes of physiological adaptations under the influence of a variety of ecological conditions and selection pressures. They harbor an enormous diversity of microbes with still unknown and probably new physiological characteristics. Besides, the surfaces of marine multicellular organisms are typically covered by a consortium of epibiotic bacteria and act as barriers, where diverse interactions between microorganisms and hosts take place. Thus, microbial diversity in the water column of the oceans and the microbial consortia on marine tissues of multicellular organisms are rich sources for isolating novel bioactive compounds and genes. Here we describe sampling, the construction of large-insert metagenomic libraries from marine habitats and, as an example, one function-based screen of metagenomic clones.

  6. Preparation of fosmid libraries and functional metagenomic analysis of microbial community DNA.

    PubMed

    Martínez, Asunción; Osburne, Marcia S

    2013-01-01

    One of the most important challenges in contemporary microbial ecology is to assign a functional role to the large number of novel genes discovered through large-scale sequencing of natural microbial communities that lack similarity to genes of known function. Functional screening of metagenomic libraries, that is, screening environmental DNA clones for the ability to confer an activity of interest to a heterologous bacterial host, is a promising approach for bridging the gap between metagenomic DNA sequencing and functional characterization. Here, we describe methods for isolating environmental DNA and constructing metagenomic fosmid libraries, as well as methods for designing and implementing successful functional screens of such libraries.

  7. Ruminal metagenomic libraries as a source of relevant hemicellulolytic enzymes for biofuel production.

    PubMed

    Duque, Estrella; Daddaoua, Abdelali; Cordero, Baldo F; Udaondo, Zulema; Molina-Santiago, Carlos; Roca, Amalia; Solano, Jennifer; Molina-Alcaide, Eduarda; Segura, Ana; Ramos, Juan-Luis

    2018-04-17

    The success of second-generation (2G) ethanol technology relies on the efficient transformation of hemicellulose into monosaccharides and, particularly, on the full conversion of xylans into xylose for over 18% of fermentable sugars. We sought new hemicellulases using ruminal liquid, after enrichment of microbes with industrial lignocellulosic substrates and preparation of metagenomic libraries. Among 150 000 fosmid clones tested, we identified 22 clones with endoxylanase activity and 125 with β-xylosidase activity. These positive clones were sequenced en masse, and the analysis revealed open reading frames with a low degree of similarity with known glycosyl hydrolases families. Among them, we searched for enzymes that were thermostable (activity at > 50°C) and that operate at high rate at pH around 5. Upon a wide series of assays, the clones exhibiting the highest endoxylanase and β-xylosidase activities were identified. The fosmids were sequenced, and the corresponding genes cloned, expressed and proteins purified. We found that the activity of the most active β-xylosidase was at least 10-fold higher than that in commercial enzymatic fungal cocktails. Endoxylanase activity was in the range of fungal enzymes. Fungal enzymatic cocktails supplemented with the bacterial hemicellulases exhibited enhanced release of sugars from pretreated sugar cane straw, a relevant agricultural residue. © 2018 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.

  8. Mining virulence genes using metagenomics.

    PubMed

    Belda-Ferre, Pedro; Cabrera-Rubio, Raúl; Moya, Andrés; Mira, Alex

    2011-01-01

    When a bacterial genome is compared to the metagenome of an environment it inhabits, most genes recruit at high sequence identity. In free-living bacteria (for instance marine bacteria compared against the ocean metagenome) certain genomic regions are totally absent in recruitment plots, representing therefore genes unique to individual bacterial isolates. We show that these Metagenomic Islands (MIs) are also visible in bacteria living in human hosts when their genomes are compared to sequences from the human microbiome, despite the compartmentalized structure of human-related environments such as the gut. From an applied point of view, MIs of human pathogens (e.g. those identified in enterohaemorrhagic Escherichia coli against the gut metagenome or in pathogenic Neisseria meningitidis against the oral metagenome) include virulence genes that appear to be absent in related strains or species present in the microbiome of healthy individuals. We propose that this strategy (i.e. recruitment analysis of pathogenic bacteria against the metagenome of healthy subjects) can be used to detect pathogenicity regions in species where the genes involved in virulence are poorly characterized. Using this approach, we detect well-known pathogenicity islands and identify new potential virulence genes in several human pathogens.

  9. Cloning, expression and characterization of a novel esterase from a South China Sea sediment metagenome

    NASA Astrophysics Data System (ADS)

    Zhang, Hao; Li, Fuchao; Chen, Huaxin; Zhao, Jin; Yan, Jinfei; Jiang, Peng; Li, Ronggui; Zhu, Baoli

    2015-07-01

    Lipolytic enzymes, including esterases and lipases, represent a group of hydrolases that catalyze the cleavage and formation of ester bonds. A novel esterase gene, scsEst01, was cloned from a South China Sea sediment metagenome. The scsEst01 gene consisted of 921 bp encoding 307 amino acid residues. The predicted amino acid sequence shared less than 90% identity with other lipolytic enzymes in the NCBI nonredundant protein database. ScsEst01 was successfully co-expressed in Escherichia coli BL21 (DE3) with chaperones (dnaK-dnaJ-grpE) to prevent the formation of inclusion bodies. The recombinant protein was purified on an immobilized metal ion affinity column containing chelating Sepharose charged with Ni2+. The enzyme was characterized using p-nitrophenol butyrate as a substrate. ScsEst01 had the highest lipolytic activity at 35°C and pH 8.0, indicative of a meso-thermophilic alkaline esterase. ScsEst01 was thermostable at 20°C. The lipolytic activity of scsEst01 was strongly increased by Fe2+, Mn2+ and 1% Tween 80 or Tween 20.

  10. Identification of fungi in shotgun metagenomics datasets

    PubMed Central

    Donovan, Paul D.; Gonzalez, Gabriel; Higgins, Desmond G.

    2018-01-01

    Metagenomics uses nucleic acid sequencing to characterize species diversity in different niches such as environmental biomes or the human microbiome. Most studies have used 16S rRNA amplicon sequencing to identify bacteria. However, the decreasing cost of sequencing has resulted in a gradual shift away from amplicon analyses and towards shotgun metagenomic sequencing. Shotgun metagenomic data can be used to identify a wide range of species, but have rarely been applied to fungal identification. Here, we develop a sequence classification pipeline, FindFungi, and use it to identify fungal sequences in public metagenome datasets. We focus primarily on animal metagenomes, especially those from pig and mouse microbiomes. We identified fungi in 39 of 70 datasets comprising 71 fungal species. At least 11 pathogenic species with zoonotic potential were identified, including Candida tropicalis. We identified Pseudogymnoascus species from 13 Antarctic soil samples initially analyzed for the presence of bacteria capable of degrading diesel oil. We also show that Candida tropicalis and Candida loboi are likely the same species. In addition, we identify several examples where contaminating DNA was erroneously included in fungal genome assemblies. PMID:29444186

  11. Identification and characterization of a novel fumarase gene by metagenome expression cloning from marine microorganisms

    PubMed Central

    2010-01-01

    Background Fumarase catalyzes the reversible hydration of fumarate to L-malate and is a key enzyme in the tricarboxylic acid (TCA) cycle and in amino acid metabolism. Fumarase is also used for the industrial production of L-malate from the substrate fumarate. Thermostable and high-activity fumarases from organisms that inhabit extreme environments may have great potential in industry, biotechnology, and basic research. The marine environment is highly complex and considered one of the main reservoirs of microbial diversity on the planet. However, most of the microorganisms are inaccessible in nature and are not easily cultivated in the laboratory. Metagenomic approaches provide a powerful tool to isolate and identify enzymes with novel biocatalytic activities for various biotechnological applications. Results A plasmid metagenomic library was constructed from uncultivated marine microorganisms within marine water samples. Through sequence-based screening of the DNA library, a gene encoding a novel fumarase (named FumF) was isolated. Amino acid sequence analysis revealed that the FumF protein shared the greatest homology with Class II fumarate hydratases from Bacteroides sp. 2_1_33B and Parabacteroides distasonis ATCC 8503 (26% identical and 43% similar). The putative fumarase gene was subcloned into pETBlue-2 vector and expressed in E. coli BL21(DE3)pLysS. The recombinant protein was purified to homogeneity. Functional characterization by high performance liquid chromatography confirmed that the recombinant FumF protein catalyzed the hydration of fumarate to form L-malate. The maximum activity for FumF protein occurred at pH 8.5 and 55°C in 5 mM Mg2+. The enzyme showed higher affinity and catalytic efficiency under optimal reaction conditions: Km= 0.48 mM, Vmax = 827 μM/min/mg, and kcat/Km = 1900 mM/s. Conclusions We isolated a novel fumarase gene, fumF, from a sequence-based screen of a plasmid metagenomic library from uncultivated marine microorganisms. The

  12. Functional Screening of Antibiotic Resistance Genes from a Representative Metagenomic Library of Food Fermenting Microbiota

    PubMed Central

    Devirgiliis, Chiara; Barile, Simona; Perozzi, Giuditta

    2014-01-01

    Lactic acid bacteria (LAB) represent the predominant microbiota in fermented foods. Foodborne LAB have received increasing attention as potential reservoir of antibiotic resistance (AR) determinants, which may be horizontally transferred to opportunistic pathogens. We have previously reported isolation of AR LAB from the raw ingredients of a fermented cheese, while AR genes could be detected in the final, marketed product only by PCR amplification, thus pointing at the need for more sensitive microbial isolation techniques. We turned therefore to construction of a metagenomic library containing microbial DNA extracted directly from the food matrix. To maximize yield and purity and to ensure that genomic complexity of the library was representative of the original bacterial population, we defined a suitable protocol for total DNA extraction from cheese which can also be applied to other lipid-rich foods. Functional library screening on different antibiotics allowed recovery of ampicillin and kanamycin resistant clones originating from Streptococcus salivarius subsp. thermophilus and Lactobacillus helveticus genomes. We report molecular characterization of the cloned inserts, which were fully sequenced and shown to confer AR phenotype to recipient bacteria. We also show that metagenomics can be applied to food microbiota to identify underrepresented species carrying specific genes of interest. PMID:25243126

  13. Isolation and characterization of novel lipases/esterases from a bovine rumen metagenome.

    PubMed

    Privé, Florence; Newbold, C Jamie; Kaderbhai, Naheed N; Girdwood, Susan G; Golyshina, Olga V; Golyshin, Peter N; Scollan, Nigel D; Huws, Sharon A

    2015-07-01

    Improving the health beneficial fatty acid content of meat and milk is a major challenge requiring an increased understanding of rumen lipid metabolism. In this study, we isolated and characterized rumen bacterial lipases/esterases using functional metagenomics. Metagenomic libraries were constructed from DNA extracted from strained rumen fluid (SRF), solid-attached bacteria (SAB) and liquid-associated rumen bacteria (LAB), ligated into a fosmid vector and subsequently transformed into an Escherichia coli host. Fosmid libraries consisted of 7,744; 8,448; and 7,680 clones with an average insert size of 30 to 35 kbp for SRF, SAB and LAB, respectively. Transformants were screened on spirit blue agar plates containing tributyrin for lipase/esterase activity. Five SAB and four LAB clones exhibited lipolytic activity, and no positive clones were found in the SRF library. Fosmids from positive clones were pyrosequenced and twelve putative lipase/esterase genes and two phospholipase genes retrieved. Although the derived proteins clustered into diverse esterase and lipase families, a degree of novelty was seen, with homology ranging from 40 to 78% following BlastP searches. Isolated lipases/esterases exhibited activity against mostly short- to medium-chain substrates across a range of temperatures and pH. The function of these novel enzymes recovered in ruminal metabolism needs further investigation, alongside their potential industrial uses.

  14. Summer Workshop in Metagenomics: One Week Plus Eight Students Equals Gigabases of Cloned DNA

    PubMed Central

    Rios-Velazquez, Carlos; Williamson, Lynn L.; Cloud-Hansen, Karen A.; Allen, Heather K.; McMahon, Mathew D.; Sabree, Zakee L.; Donato, Justin J.; Handelsman, Jo

    2011-01-01

    We designed a week-long laboratory workshop in metagenomics for a cohort of undergraduate student researchers. During this course, students learned and utilized molecular biology and microbiology techniques to construct a metagenomic library from Puerto Rican soil. Pre- and postworkshop assessments indicated student learning gains in technical knowledge, skills, and confidence in a research environment. Postworkshop construction of additional libraries demonstrated retention of research techniques by the students. PMID:23653755

  15. Metagenome Analyses of Corroded Concrete Wastewater Pipe Biofilms Reveals a Complex Microbial System

    EPA Science Inventory

    Analysis of whole-metagenome pyrosequencing data and 16S rRNA gene clone libraries was used to determine microbial composition and functional genes associated with biomass harvested from crown (top) and invert (bottom) sections of a corroded wastewater pipe. Taxonomic and functio...

  16. Strain/species identification in metagenomes using genome-specific markers

    PubMed Central

    Tu, Qichao; He, Zhili; Zhou, Jizhong

    2014-01-01

    Shotgun metagenome sequencing has become a fast, cheap and high-throughput technology for characterizing microbial communities in complex environments and human body sites. However, accurate identification of microorganisms at the strain/species level remains extremely challenging. We present a novel k-mer-based approach, termed GSMer, that identifies genome-specific markers (GSMs) from currently sequenced microbial genomes, which were then used for strain/species-level identification in metagenomes. Using 5390 sequenced microbial genomes, 8 770 321 50-mer strain-specific and 11 736 360 species-specific GSMs were identified for 4088 strains and 2005 species (4933 strains), respectively. The GSMs were first evaluated against mock community metagenomes, recently sequenced genomes and real metagenomes from different body sites, suggesting that the identified GSMs were specific to their targeting genomes. Sensitivity evaluation against synthetic metagenomes with different coverage suggested that 50 GSMs per strain were sufficient to identify most microbial strains with ≥0.25× coverage, and 10% of selected GSMs in a database should be detected for confident positive callings. Application of GSMs identified 45 and 74 microbial strains/species significantly associated with type 2 diabetes patients and obese/lean individuals from corresponding gastrointestinal tract metagenomes, respectively. Our result agreed with previous studies but provided strain-level information. The approach can be directly applied to identify microbial strains/species from raw metagenomes, without the effort of complex data pre-processing. PMID:24523352
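
    The marker idea in this record can be sketched in a few lines. The code below is a simplified illustration and not the GSMer implementation: it keeps every k-mer that occurs in exactly one genome and ignores reverse complements and any further filtering. The function names and toy sequences are hypothetical; the 10% detection threshold is the one quoted in the abstract.

```python
from collections import defaultdict

def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def genome_specific_markers(genomes, k):
    """Map each genome name to the k-mers found in that genome only."""
    owners = defaultdict(set)
    for name, seq in genomes.items():
        for km in kmers(seq, k):
            owners[km].add(name)
    markers = {name: set() for name in genomes}
    for km, names in owners.items():
        if len(names) == 1:  # k-mer is unique to a single genome
            markers[next(iter(names))].add(km)
    return markers

def call_present(reads, markers, k, min_fraction=0.10):
    """Call a genome present when at least min_fraction of its markers
    are observed among the k-mers of the reads."""
    seen = set()
    for read in reads:
        seen |= kmers(read, k)
    return {name: len(ms & seen) / len(ms) >= min_fraction
            for name, ms in markers.items() if ms}

# Toy genomes (hypothetical); real markers in the abstract are 50-mers.
toy_genomes = {
    "strainA": "ACGTACGTTTGACC",
    "strainB": "ACGTACGTGGCATT",
}
toy_markers = genome_specific_markers(toy_genomes, k=5)
toy_calls = call_present(["CGTTTGACC"], toy_markers, k=5)
```

    In this toy case the shared prefix contributes no markers, and a single read from the strain-specific region of strainA is enough to call strainA but not strainB.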

  17. Identifying airborne fungi in Seoul, Korea using metagenomics.

    PubMed

    Oh, Seung-Yoon; Fong, Jonathan J; Park, Myung Soo; Chang, Limseok; Lim, Young Woon

    2014-06-01

    Fungal spores are widespread and common in the atmosphere. In this study, we use a metagenomic approach to study the fungal diversity in six total air samples collected from April to May 2012 in Seoul, Korea. This springtime period is important in Korea because of the peak in fungal spore concentration and Asian dust storms, although the year of this study (2012) was unique in that there were no major Asian dust events. Clustering sequences for operational taxonomic unit (OTU) identification recovered 1,266 unique OTUs in the combined dataset, with between 223–96 OTUs present in individual samples. OTUs from three fungal phyla were identified. For Ascomycota, Davidiella (anamorph: Cladosporium) was the most common genus in all samples, often accounting for more than 50% of all sequences in a sample. Other common Ascomycota genera identified were Alternaria, Didymella, Khuskia, Geosmithia, Penicillium, and Aspergillus. While several Basidiomycota genera were observed, Chytridiomycota OTUs were only present in one sample. Consistency was observed within sampling days, but there was a large shift in species composition from Ascomycota dominant to Basidiomycota dominant in the middle of the sampling period. This marked change may have been caused by meteorological events. A potential set of 40 allergy-inducing genera were identified, accounting for a large proportion of the diversity present (22.5–7.2%). Our study identifies high fungal diversity and potentially high levels of fungal allergens in springtime air of Korea, and provides a good baseline for future comparisons with Asian dust storms.

  18. deFUME: Dynamic exploration of functional metagenomic sequencing data.

    PubMed

    van der Helm, Eric; Geertz-Hansen, Henrik Marcus; Genee, Hans Jasper; Malla, Sailesh; Sommer, Morten Otto Alexander

    2015-07-31

    Functional metagenomic selections represent a powerful technique that is widely applied for identification of novel genes from complex metagenomic sources. However, whereas hundreds to thousands of clones can be easily generated and sequenced over a few days of experiments, analyzing the data is time consuming and constitutes a major bottleneck for experimental researchers in the field. Here we present the deFUME web server, an easy-to-use web-based interface for processing, annotation and visualization of functional metagenomics sequencing data, tailored to meet the requirements of non-bioinformaticians. The web-server integrates multiple analysis steps into one single workflow: read assembly, open reading frame prediction, and annotation with BLAST, InterPro and GO classifiers. Analysis results are visualized in an online dynamic web-interface. The deFUME webserver provides a fast track from raw sequence to a comprehensive visual data overview that facilitates effortless inspection of gene function, clustering and distribution. The webserver is available at cbs.dtu.dk/services/deFUME/ and the source code is distributed at github.com/EvdH0/deFUME.

  19. Amplicon-based metagenomics identified candidate organisms in soils that caused yield decline in strawberry

    PubMed Central

    Xu, Xiangming; Passey, Thomas; Wei, Feng; Saville, Robert; Harrison, Richard J.

    2015-01-01

    A phenomenon of yield decline due to weak plant growth in strawberry was recently observed in non-chemo-fumigated soils, which was not associated with the soil fungal pathogen Verticillium dahliae, the main target of fumigation. Amplicon-based metagenomics was used to profile soil microbiota in order to identify microbial organisms that may have caused the yield decline. A total of 36 soil samples were obtained in 2013 and 2014 from four sites for metagenomic studies; two of the four sites had a yield-decline problem, the other two did not. More than 2000 fungal or bacterial operational taxonomy units (OTUs) were found in these samples. Relative abundance of individual OTUs was statistically compared for differences between samples from sites with or without yield decline. A total of 721 individual comparisons were statistically significant – involving 366 unique bacterial and 44 unique fungal OTUs. Based on further selection criteria, we focused on 34 bacterial and 17 fungal OTUs and found that yield decline resulted probably from one or more of the following four factors: (1) low abundance of Bacillus and Pseudomonas populations, which are well known for their ability to suppress pathogen development and/or promote plant growth; (2) lack of the nematophagous fungus (Paecilomyces species); (3) a high level of two non-specific fungal root rot pathogens; and (4) wet soil conditions. This study demonstrated the usefulness of an amplicon-based metagenomics approach to profile soil microbiota and to detect differential abundance in microbes. PMID:26504572

  20. Machine Learning Leveraging Genomes from Metagenomes Identifies Influential Antibiotic Resistance Genes in the Infant Gut Microbiome

    PubMed Central

    Olm, Matthew R.; Morowitz, Michael J.

    2018-01-01

    ABSTRACT Antibiotic resistance in pathogens is extensively studied, and yet little is known about how antibiotic resistance genes of typical gut bacteria influence microbiome dynamics. Here, we leveraged genomes from metagenomes to investigate how genes of the premature infant gut resistome correspond to the ability of bacteria to survive under certain environmental and clinical conditions. We found that formula feeding impacts the resistome. Random forest models corroborated by statistical tests revealed that the gut resistome of formula-fed infants is enriched in class D beta-lactamase genes. Interestingly, Clostridium difficile strains harboring this gene are at higher abundance in formula-fed infants than C. difficile strains lacking this gene. Organisms with genes for major facilitator superfamily drug efflux pumps have higher replication rates under all conditions, even in the absence of antibiotic therapy. Using a machine learning approach, we identified genes that are predictive of an organism’s direction of change in relative abundance after administration of vancomycin and cephalosporin antibiotics. The most accurate results were obtained by reducing annotated genomic data to five principal components classified by boosted decision trees. Among the genes involved in predicting whether an organism increased in relative abundance after treatment are those that encode subclass B2 beta-lactamases and transcriptional regulators of vancomycin resistance. This demonstrates that machine learning applied to genome-resolved metagenomics data can identify key genes for survival after antibiotics treatment and predict how organisms in the gut microbiome will respond to antibiotic administration. IMPORTANCE The process of reconstructing genomes from environmental sequence data (genome-resolved metagenomics) allows unique insight into microbial systems. We apply this technique to investigate how the antibiotic resistance genes of bacteria affect their ability to

  1. Multisubstrate Isotope Labeling and Metagenomic Analysis of Active Soil Bacterial Communities

    PubMed Central

    Verastegui, Y.; Cheng, J.; Engel, K.; Kolczynski, D.; Mortimer, S.; Lavigne, J.; Montalibet, J.; Romantsov, T.; Hall, M.; McConkey, B. J.; Rose, D. R.; Tomashek, J. J.; Scott, B. R.

    2014-01-01

    ABSTRACT Soil microbial diversity represents the largest global reservoir of novel microorganisms and enzymes. In this study, we coupled functional metagenomics and DNA stable-isotope probing (DNA-SIP) using multiple plant-derived carbon substrates and diverse soils to characterize active soil bacterial communities and their glycoside hydrolase genes, which have value for industrial applications. We incubated samples from three disparate Canadian soils (tundra, temperate rainforest, and agricultural) with five native carbon (12C) or stable-isotope-labeled (13C) carbohydrates (glucose, cellobiose, xylose, arabinose, and cellulose). Indicator species analysis revealed high specificity and fidelity for many uncultured and unclassified bacterial taxa in the heavy DNA for all soils and substrates. Among characterized taxa, Actinomycetales (Salinibacterium), Rhizobiales (Devosia), Rhodospirillales (Telmatospirillum), and Caulobacterales (Phenylobacterium and Asticcacaulis) were bacterial indicator species for the heavy substrates and soils tested. Both Actinomycetales and Caulobacterales (Phenylobacterium) were associated with metabolism of cellulose, and Alphaproteobacteria were associated with the metabolism of arabinose; members of the order Rhizobiales were strongly associated with the metabolism of xylose. Annotated metagenomic data suggested diverse glycoside hydrolase gene representation within the pooled heavy DNA. By screening 2,876 cloned fragments derived from the 13C-labeled DNA isolated from soils incubated with cellulose, we demonstrate the power of combining DNA-SIP, multiple-displacement amplification (MDA), and functional metagenomics by efficiently isolating multiple clones with activity on carboxymethyl cellulose and fluorogenic proxy substrates for carbohydrate-active enzymes. PMID:25028422

  2. Isolation and characterization of two serine proteases from metagenomic libraries of the Gobi and Death Valley deserts.

    PubMed

    Neveu, Julie; Regeard, Christophe; DuBow, Michael S

    2011-08-01

    The screening of environmental DNA metagenome libraries for functional activities can provide an important source of new molecules and enzymes. In this study, we identified 17 potential protease-producing clones from two metagenomic libraries derived from samples of surface sand from the Gobi and Death Valley deserts. Two of the proteases, DV1 and M30, were purified and biochemically examined. These two proteases displayed a molecular mass of 41.5 kDa and 45.7 kDa, respectively, on SDS polyacrylamide gels. Alignments with known protease sequences showed less than 55% amino acid sequence identity. These two serine proteases appear to belong to the subtilisin (S8A) family and displayed several unique biochemical properties. Protease DV1 had an optimum pH of 8 and an optimal activity at 55°C, while protease M30 had an optimum pH >11 and optimal activity at 40°C. The properties of these enzymes make them potentially useful for biotechnological applications and again demonstrate that metagenomic approaches can be useful, especially when coupled with the study of novel environments such as deserts.

  3. Identification and characterization of a chitin deacetylase from a metagenomic library of deep-sea sediments of the Arctic Ocean.

    PubMed

    Liu, Jinlin; Jia, Zhijuan; Li, Sha; Li, Yan; You, Qiang; Zhang, Chunyan; Zheng, Xiaotong; Xiong, Guomei; Zhao, Jin; Qi, Chao; Yang, Jihong

    2016-09-15

    The chemical and biological compositions of deep-sea sediments are interesting because of the underexplored diversity when it comes to bioprospecting. The special geographical location and climate make the Arctic Ocean a unique ocean area containing an abundance of microbial resources. A metagenomic library was constructed based on the deep-sea sediments of the Arctic Ocean. Part of the insertion fragments of this library was sequenced. A chitin deacetylase gene, cdaYJ, was identified and characterized. A metagenomic library with 2750 clones was obtained and ten clones were sequenced. Results revealed several interesting genes, including a chitin deacetylase coding sequence, cdaYJ. CdaYJ is homologous to some known chitin deacetylases and contains conserved chitin deacetylase active sites. The CdaYJ protein exhibits a long N-terminal and a relatively short C-terminal. Phylogenetic analysis revealed that CdaYJ showed highest homology to CDAs from Alphaproteobacteria. The cdaYJ gene was subcloned into the pET-28a vector and the recombinant CdaYJ (rCdaYJ) was expressed in Escherichia coli BL21 (DE3). rCdaYJ showed a molecular weight of 43 kDa, and exhibited deacetylation activity using p-nitroacetanilide as substrate. The optimal pH and temperature of rCdaYJ were determined to be pH 7.4 and 28°C, respectively. The construction of a metagenomic library of the Arctic deep-sea sediments provides an opportunity to look into the microbial communities and to exploit valuable gene resources. A chitin deacetylase, CdaYJ, was identified from the library. It showed highest deacetylation activity under slightly alkaline and low temperature conditions. CdaYJ might be a candidate chitin deacetylase with industrial and pharmaceutical potential.

  4. MetaPhinder-Identifying Bacteriophage Sequences in Metagenomic Data Sets.

    PubMed

    Jurtz, Vanessa Isabell; Villarroel, Julia; Lund, Ole; Voldby Larsen, Mette; Nielsen, Morten

    Bacteriophages are the most abundant biological entity on the planet, but at the same time do not account for much of the genetic material isolated from most environments due to their small genome sizes. They also show great genetic diversity and mosaic genomes, making it challenging to analyze and understand them. Here we present MetaPhinder, a method to identify assembled genomic fragments (i.e. contigs) of phage origin in metagenomic data sets. The method is based on a comparison to a database of whole genome bacteriophage sequences, integrating hits to multiple genomes to accommodate the mosaic genome structure of many bacteriophages. The method is demonstrated to outperform both BLAST methods based on single hits and methods based on k-mer comparisons. MetaPhinder is available as a web service at the Center for Genomic Epidemiology https://cge.cbs.dtu.dk/services/MetaPhinder/, while the source code can be downloaded from https://bitbucket.org/genomicepidemiology/metaphinder or https://github.com/vanessajurtz/MetaPhinder.
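
The idea of integrating hits to multiple genomes can be illustrated with a minimal Python sketch. It assumes that hits are given as half-open `(start, end)` intervals on the query contig and that overlapping hits, possibly to different phage genomes, are merged before computing the covered fraction. The function name `merged_coverage` and the toy coordinates are hypothetical, and this is not taken from the MetaPhinder source code.

```python
def merged_coverage(hits, contig_length):
    """Fraction of a contig covered by the union of hit intervals.

    hits: iterable of (start, end) half-open intervals on the contig;
          hits to different database genomes are pooled (assumption).
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(hits):
        if current_end is None or start > current_end:
            # The previous merged block is finished; count it.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent hit: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / contig_length


# Toy example: three hits on a 1,000-bp contig, two of them overlapping.
toy_hits = [(0, 400), (300, 700), (900, 1000)]
print(merged_coverage(toy_hits, 1000))
```

A single-hit approach would score this toy contig by its best hit alone (400 bp), whereas pooling the hits credits all 800 covered bases, which is the intuition behind accommodating mosaic genomes.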

  5. The Amordad database engine for metagenomics.

    PubMed

    Behnam, Ehsan; Smith, Andrew D

    2014-10-15

    Several technical challenges in metagenomic data analysis, including assembling metagenomic sequence data or identifying operational taxonomic units, are both significant and well known. These forms of analysis are increasingly cited as conceptually flawed, given the extreme variation within traditionally defined species and rampant horizontal gene transfer. Furthermore, computational requirements of such analysis have hindered content-based organization of metagenomic data at large scale. In this article, we introduce the Amordad database engine for alignment-free, content-based indexing of metagenomic datasets. Amordad places the metagenome comparison problem in a geometric context, and uses an indexing strategy that combines random hashing with a regular nearest neighbor graph. This framework allows refinement of the database over time by continual application of random hash functions, with the effect of each hash function encoded in the nearest neighbor graph. This eliminates the need to explicitly maintain the hash functions in order for query efficiency to benefit from the accumulated randomness. Results on real and simulated data show that Amordad can support logarithmic query time for identifying similar metagenomes even as the database size reaches into the millions. Availability: Source code, licensed under the GNU General Public License (version 3), is freely available for download from http://smithlabresearch.org/amordad. Contact: andrewds@usc.edu. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
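
As a rough illustration of what "random hashing" of metagenome feature vectors can look like, the following Python sketch uses random-hyperplane hashing, a generic locality-sensitive hashing scheme. This is an assumption chosen for illustration and is not claimed to be the hash family or data structure used by Amordad. The names `make_hyperplanes`, `signature` and `hamming`, and the toy vectors, are hypothetical.

```python
import random


def make_hyperplanes(n_planes, dim, seed=0):
    """Draw n_planes random Gaussian hyperplanes in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_planes)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(sig_a, sig_b):
    """Number of differing bits; small values suggest a small angle."""
    return sum(a != b for a, b in zip(sig_a, sig_b))


# Toy feature vectors (e.g. normalized k-mer frequencies; purely illustrative).
planes = make_hyperplanes(n_planes=16, dim=4, seed=1)
sample_a = [0.40, 0.30, 0.20, 0.10]
sample_b = [0.38, 0.32, 0.19, 0.11]
print(hamming(signature(sample_a, planes), signature(sample_b, planes)))
```

Vectors separated by a small angle agree on most bits, so comparing short signatures can stand in for comparing full profiles when shortlisting candidate neighbors.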

  6. Metagenomic Insights into the Fibrolytic Microbiome in Yak Rumen

    PubMed Central

    Song, Lei; Liu, Di; Liu, Li; Chen, Furong; Wang, Min; Li, Jiabao; Zeng, Xiaowei; Dong, Zhiyang; Hu, Songnian; Li, Lingyan; Xu, Jian; Huang, Li; Dong, Xiuzhu

    2012-01-01

    The rumen hosts one of the most efficient microbial systems for degrading plant cell walls, yet the predominant cellulolytic proteins and fibrolytic mechanism(s) remain elusive. Here we investigated the cellulolytic microbiome of the yak rumen by using a combination of metagenome-based and bacterial artificial chromosome (BAC)-based functional screening approaches. In total, 223 fibrolytic BAC clones were pyrosequenced and 10,070 ORFs were identified. Among them 150 were annotated as the glycoside hydrolase (GH) genes for fibrolytic proteins, and the majority (69%) of them were clustered or linked with genes encoding related functions. Among the 35 fibrolytic contigs of >10 Kb in length, 25 were derived from Bacteroidetes and four from Firmicutes. Coverage analysis indicated that the fibrolytic genes on most Bacteroidetes-contigs were abundantly represented in the metagenomic sequences, and they were frequently linked with genes encoding SusC/SusD-type outer-membrane proteins. GH5, GH9, and GH10 cellulase/hemicellulase genes were predominant, but no GH48 exocellulase gene was found. Most (85%) of the cellulase and hemicellulase proteins possessed a signal peptide; only a few carried carbohydrate-binding modules, and no cellulosomal domains were detected. These findings suggest that the SusC/SusD-involving mechanism, instead of one based on cellulosomes or the free-enzyme system, serves a major role in lignocellulose degradation in yak rumen. Genes encoding an endoglucanase of a novel GH5 subfamily occurred frequently in the metagenome, and the recombinant proteins encoded by the genes displayed moderate Avicelase in addition to endoglucanase activities, suggesting their important contribution to lignocellulose degradation in the exocellulase-scarce rumen. PMID:22808161

  7. Metagenomic Insights into the RDX-Degrading Potential of the Ovine Rumen Microbiome

    PubMed Central

    Li, Robert W.; Giarrizzo, Juan Gabriel; Wu, Sitao; Li, Weizhong; Duringer, Jennifer M.; Craig, A. Morrie

    2014-01-01

    The manufacturing processes of royal demolition explosive (RDX), or hexahydro-1,3,5-trinitro-1,3,5-triazine, have resulted in serious water contamination. As a potential carcinogen, RDX can cause a broad range of harmful effects to humans and animals. The ovine rumen is capable of rapid degradation of nitroaromatic compounds, including RDX. While ruminal RDX-degrading bacteria have been identified, the genes and pathways responsible for RDX degradation in the rumen have yet to be characterized. In this study, we characterized the metabolic potential of the ovine rumen using metagenomic approaches. Sequences homologous to at least five RDX-degrading genes cloned from environmental samples (diaA, xenA, xenB, xplA, and xplB) were present in the ovine rumen microbiome. Among them, diaA was the most abundant, likely reflective of the predominance of the genus Clostridium in the ovine rumen. At least ten genera known to harbor RDX-degrading microorganisms were detectable. Metagenomic sequences were also annotated using public databases, such as Pfam, COG, and KEGG. Five of the six Pfam protein families known to be responsible for RDX degradation in environmental samples were identified in the ovine rumen. However, increased substrate availability did not appear to enhance the proliferation of RDX-degrading bacteria and alter the microbial composition of the ovine rumen. This implies that the RDX-degrading capacity of the ovine rumen microbiome is likely regulated at the transcription level. Our results provide metagenomic insights into the RDX-degrading potential of the ovine rumen, and they will facilitate the development of novel and economic bioremediation strategies. PMID:25383623

  8. Exploring neighborhoods in the metagenome universe.

    PubMed

    Aßhauer, Kathrin P; Klingenberg, Heiner; Lingner, Thomas; Meinicke, Peter

    2014-07-14

    The variety of metagenomes in current databases provides a rapidly growing source of information for comparative studies. However, the quantity and quality of supplementary metadata is still lagging behind. It is therefore important to be able to identify related metagenomes by means of the available sequence data alone. We have studied efficient sequence-based methods for large-scale identification of similar metagenomes within a database retrieval context. In a broad comparison of different profiling methods we found that vector-based distance measures are well suited for the detection of metagenomic neighbors. Our evaluation on more than 1700 publicly available metagenomes indicates that for a query metagenome from a particular habitat on average nine out of ten nearest neighbors represent the same habitat category independent of the utilized profiling method or distance measure. While for well-defined labels a neighborhood accuracy of 100% can be achieved, in general the neighbor detection is severely affected by a natural overlap of manually annotated categories. In addition, we present results of a novel visualization method that is able to reflect the similarity of metagenomes in a 2D scatter plot. The visualization method shows a similarly high accuracy in the reduced space as compared with the high-dimensional profile space. Our study suggests that for inspection of metagenome neighborhoods the profiling methods and distance measures can be chosen to provide a convenient interpretation of results in terms of the underlying features. Furthermore, supplementary metadata of metagenome samples in the future needs to comply with readily available ontologies for fine-grained and standardized annotation. To make profile-based k-nearest-neighbor search and the 2D-visualization of the metagenome universe available to the research community, we included the proposed methods in our CoMet-Universe server for comparative metagenome analysis.
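
The profile-based k-nearest-neighbor search and the "neighborhood accuracy" described above can be sketched in Python as follows. The sketch assumes that each metagenome is summarized as a numeric profile vector and uses cosine distance as one example of a vector-based distance measure. The profile values, sample names and habitat labels are invented for illustration and do not come from the study.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two profile vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def nearest_neighbors(query, profiles, k):
    """Names of the k database profiles closest to the query profile."""
    ranked = sorted(profiles, key=lambda name: cosine_distance(query, profiles[name]))
    return ranked[:k]


def neighborhood_accuracy(query_label, neighbor_names, labels):
    """Fraction of neighbors that share the query's habitat label."""
    same = sum(1 for name in neighbor_names if labels[name] == query_label)
    return same / len(neighbor_names)


# Hypothetical functional profiles (relative abundances of three features).
profiles = {
    "gut_a": [0.7, 0.2, 0.1],
    "gut_b": [0.6, 0.3, 0.1],
    "soil_a": [0.1, 0.3, 0.6],
    "soil_b": [0.2, 0.2, 0.6],
}
labels = {"gut_a": "gut", "gut_b": "gut", "soil_a": "soil", "soil_b": "soil"}

query = [0.65, 0.25, 0.10]
neighbors = nearest_neighbors(query, profiles, k=2)
print(neighbors, neighborhood_accuracy("gut", neighbors, labels))
```

Averaging `neighborhood_accuracy` over many query metagenomes gives a figure comparable in spirit to the "nine out of ten nearest neighbors" statistic reported in the abstract.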

  9. Exploring Neighborhoods in the Metagenome Universe

    PubMed Central

    Aßhauer, Kathrin P.; Klingenberg, Heiner; Lingner, Thomas; Meinicke, Peter

    2014-01-01

    The variety of metagenomes in current databases provides a rapidly growing source of information for comparative studies. However, the quantity and quality of supplementary metadata is still lagging behind. It is therefore important to be able to identify related metagenomes by means of the available sequence data alone. We have studied efficient sequence-based methods for large-scale identification of similar metagenomes within a database retrieval context. In a broad comparison of different profiling methods we found that vector-based distance measures are well suited for the detection of metagenomic neighbors. Our evaluation on more than 1700 publicly available metagenomes indicates that for a query metagenome from a particular habitat on average nine out of ten nearest neighbors represent the same habitat category independent of the utilized profiling method or distance measure. While for well-defined labels a neighborhood accuracy of 100% can be achieved, in general the neighbor detection is severely affected by a natural overlap of manually annotated categories. In addition, we present results of a novel visualization method that is able to reflect the similarity of metagenomes in a 2D scatter plot. The visualization method shows a similarly high accuracy in the reduced space as compared with the high-dimensional profile space. Our study suggests that for inspection of metagenome neighborhoods the profiling methods and distance measures can be chosen to provide a convenient interpretation of results in terms of the underlying features. Furthermore, supplementary metadata of metagenome samples in the future needs to comply with readily available ontologies for fine-grained and standardized annotation. To make profile-based k-nearest-neighbor search and the 2D-visualization of the metagenome universe available to the research community, we included the proposed methods in our CoMet-Universe server for comparative metagenome analysis. PMID:25026170

  10. Bambus 2: scaffolding metagenomes.

    PubMed

    Koren, Sergey; Treangen, Todd J; Pop, Mihai

    2011-11-01

    Sequencing projects increasingly target samples from non-clonal sources. In particular, metagenomics has enabled scientists to begin to characterize the structure of microbial communities. The software tools developed for assembling and analyzing sequencing data for clonal organisms are, however, unable to adequately process data derived from non-clonal sources. We present a new scaffolder, Bambus 2, to address some of the challenges encountered when analyzing metagenomes. Our approach relies on a combination of a novel method for detecting genomic repeats and algorithms that analyze assembly graphs to identify biologically meaningful genomic variants. We compare our software to current assemblers using simulated and real data. We demonstrate that the repeat detection algorithms have higher sensitivity than current approaches without sacrificing specificity. In metagenomic datasets, the scaffolder avoids false joins between distantly related organisms while obtaining long-range contiguity. Bambus 2 represents a first step toward automated metagenomic assembly. Bambus 2 is open source and available from http://amos.sf.net. Contact: mpop@umiacs.umd.edu. Supplementary data are available at Bioinformatics online.

  11. Bambus 2: scaffolding metagenomes

    PubMed Central

    Koren, Sergey; Treangen, Todd J.; Pop, Mihai

    2011-01-01

    Motivation: Sequencing projects increasingly target samples from non-clonal sources. In particular, metagenomics has enabled scientists to begin to characterize the structure of microbial communities. The software tools developed for assembling and analyzing sequencing data for clonal organisms are, however, unable to adequately process data derived from non-clonal sources. Results: We present a new scaffolder, Bambus 2, to address some of the challenges encountered when analyzing metagenomes. Our approach relies on a combination of a novel method for detecting genomic repeats and algorithms that analyze assembly graphs to identify biologically meaningful genomic variants. We compare our software to current assemblers using simulated and real data. We demonstrate that the repeat detection algorithms have higher sensitivity than current approaches without sacrificing specificity. In metagenomic datasets, the scaffolder avoids false joins between distantly related organisms while obtaining long-range contiguity. Bambus 2 represents a first step toward automated metagenomic assembly. Availability: Bambus 2 is open source and available from http://amos.sf.net. Contact: mpop@umiacs.umd.edu Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:21926123

  12. SIP metagenomics identifies uncultivated Methylophilaceae as dimethylsulphide degrading bacteria in soil and lake sediment

    PubMed Central

    Eyice, Özge; Namura, Motonobu; Chen, Yin; Mead, Andrew; Samavedam, Siva; Schäfer, Hendrik

    2015-01-01

    Dimethylsulphide (DMS) has an important role in the global sulphur cycle and atmospheric chemistry. Microorganisms using DMS as sole carbon, sulphur or energy source, contribute to the cycling of DMS in a wide variety of ecosystems. The diversity of microbial populations degrading DMS in terrestrial environments is poorly understood. Based on cultivation studies, a wide range of bacteria isolated from terrestrial ecosystems were shown to be able to degrade DMS, yet it remains unknown whether any of these have important roles in situ. In this study, we identified bacteria using DMS as a carbon and energy source in terrestrial environments, an agricultural soil and a lake sediment, by DNA stable isotope probing (SIP). Microbial communities involved in DMS degradation were analysed by denaturing gradient gel electrophoresis, high-throughput sequencing of SIP gradient fractions and metagenomic sequencing of phi29-amplified community DNA. Labelling patterns of time course SIP experiments identified members of the Methylophilaceae family, not previously implicated in DMS degradation, as dominant DMS-degrading populations in soil and lake sediment. Thiobacillus spp. were also detected in 13C-DNA from SIP incubations. Metagenomic sequencing also suggested involvement of Methylophilaceae in DMS degradation and further indicated shifts in the functional profile of the DMS-assimilating communities in line with methylotrophy and oxidation of inorganic sulphur compounds. Overall, these data suggest that unlike in the marine environment where gammaproteobacterial populations were identified by SIP as DMS degraders, betaproteobacterial Methylophilaceae may have a key role in DMS cycling in terrestrial environments. PMID:25822481

  13. A multi-substrate approach for functional metagenomics-based screening for (hemi)cellulases in two wheat straw-degrading microbial consortia unveils novel thermoalkaliphilic enzymes.

    PubMed

    Maruthamuthu, Mukil; Jiménez, Diego Javier; Stevens, Patricia; van Elsas, Jan Dirk

    2016-01-28

    Functional metagenomics is a promising strategy for the exploration of the biocatalytic potential of microbiomes in order to uncover novel enzymes for industrial processes (e.g. biorefining or bleaching pulp). Most current methodologies used to screen for enzymes involved in plant biomass degradation are based on the use of single substrates. Moreover, highly diverse environments are used as metagenomic sources. However, such methods suffer from low hit rates of positive clones and hence the discovery of novel enzymatic activities from metagenomes has been hampered. Here, we constructed fosmid libraries from two wheat straw-degrading microbial consortia, denoted RWS (bred on untreated wheat straw) and TWS (bred on heat-treated wheat straw). Approximately 22,000 clones from each library were screened for (hemi)cellulose-degrading enzymes using a multi-chromogenic substrate approach. The screens yielded 71 positive clones for both libraries, giving hit rates of 1:440 and 1:1,047 for RWS and TWS, respectively. Seven clones (NT2-2, T5-5, NT18-17, T4-1, 10BT, NT18-21 and T17-2) were selected for sequence analyses. Their inserts revealed the presence of 18 genes encoding enzymes belonging to twelve different glycosyl hydrolase families (GH2, GH3, GH13, GH17, GH20, GH27, GH32, GH39, GH53, GH58, GH65 and GH109). These encompassed several carbohydrate-active gene clusters traceable mainly to Klebsiella related species. Detailed functional analyses showed that clone NT2-2 (containing a beta-galactosidase of ~116 kDa) had highest enzymatic activity at 55 °C and pH 9.0. Additionally, clone T5-5 (containing a beta-xylosidase of ~86 kDa) showed > 90% of enzymatic activity at 55 °C and pH 10.0. This study employed a high-throughput method for rapid screening of fosmid metagenomic libraries for (hemi)cellulose-degrading enzymes. The approach, consisting of screens on multi-substrates coupled to further analyses, revealed high hit rates, as compared with recent other studies. 
Two
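
The hit rates quoted in the record above can be checked with simple arithmetic. The Python sketch below assumes exactly 22,000 clones screened per library (the abstract says "approximately 22,000") and converts each 1:N hit rate back into an approximate number of positive clones. The function name `positives_from_hit_rate` is hypothetical.

```python
def positives_from_hit_rate(clones_screened, clones_per_hit):
    """Approximate number of positive clones implied by a 1:N hit rate."""
    return round(clones_screened / clones_per_hit)


CLONES_PER_LIBRARY = 22_000  # assumption: the abstract gives an approximate figure

rws_positives = positives_from_hit_rate(CLONES_PER_LIBRARY, 440)    # RWS, 1:440
tws_positives = positives_from_hit_rate(CLONES_PER_LIBRARY, 1047)   # TWS, 1:1,047

print(rws_positives, tws_positives, rws_positives + tws_positives)
```

Under this assumption the two libraries contribute about 50 and 21 positives, which is consistent with the 71 positive clones reported for both libraries combined.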

  14. CO Dehydrogenase Genes Found in Metagenomic Fosmid Clones from the Deep Mediterranean Sea

    PubMed Central

    Martin-Cuadrado, Ana-Belen; Ghai, Rohit; Gonzaga, Aitor; Rodriguez-Valera, Francisco

    2009-01-01

    The use of carbon monoxide (CO) as a biological energy source is widespread in microbes. In recent years, the role of CO oxidation in superficial ocean waters has been shown to be an important energy supplement for heterotrophs (carboxydovores). The key enzyme CO dehydrogenase was found in both isolates and metagenomes from the ocean's photic zone, where CO is continuously generated by organic matter photolysis. We have also found genes that code for both forms I (low affinity) and II (high affinity) in fosmids from a metagenomic library generated from a 3,000-m depth in the Mediterranean Sea. Analysis of other metagenomic databases indicates that similar genes are also found in the mesopelagic and bathypelagic North Pacific and on the surfaces of this and other oceanic locations (in lower proportions and similarities). The frequency with which this gene was found indicates that this energy-generating metabolism would be at least as important in the bathypelagic habitat as it is in the photic zone. Although there are no data about CO concentrations or origins deep in the ocean, it could have a geothermal origin or be associated with anaerobic metabolism of organic matter. The identities of the microbes that carry out these processes were not established, but they seem to be representatives of either Bacteroidetes or Chloroflexi. PMID:19801465

  15. Soil Bacterial Community Shifts after Chitin Enrichment: An Integrative Metagenomic Approach

    PubMed Central

    Jacquiod, Samuel; Franqueville, Laure; Cécillon, Sébastien; Vogel, Timothy M.; Simonet, Pascal

    2013-01-01

    Chitin is the second most produced biopolymer on Earth after cellulose. Chitin degrading enzymes are promising but untapped sources for developing novel industrial biocatalysts. Hidden amongst uncultivated micro-organisms, new bacterial enzymes can be discovered and exploited by metagenomic approaches through extensive cloning and screening. Enrichment is also a well-known strategy, as it allows selection of organisms adapted to feed on a specific compound. In this study, we investigated how the soil bacterial community responded to chitin enrichment in a microcosm experiment. An integrative metagenomic approach coupling phylochips and high throughput shotgun pyrosequencing was established in order to assess the taxonomical and functional changes in the soil bacterial community. Results indicate that chitin enrichment leads to an increase of Actinobacteria, γ-proteobacteria and β-proteobacteria suggesting specific selection of chitin degrading bacteria belonging to these classes. Part of enriched bacterial genera were not yet reported to be involved in chitin degradation, like the members from the Micrococcineae sub-order (Actinobacteria). An increase of the observed bacterial diversity was noticed, with detection of specific genera only in chitin treated conditions. The relative proportion of metagenomic sequences related to chitin degradation was significantly increased, even if it represents only a tiny fraction of the sequence diversity found in a soil metagenome. PMID:24278158

  16. MetaPhinder—Identifying Bacteriophage Sequences in Metagenomic Data Sets

    PubMed Central

    Jurtz, Vanessa Isabell; Villarroel, Julia; Lund, Ole; Voldby Larsen, Mette; Nielsen, Morten

    2016-01-01

    Bacteriophages are the most abundant biological entity on the planet, but at the same time do not account for much of the genetic material isolated from most environments due to their small genome sizes. They also show great genetic diversity and mosaic genomes, making it challenging to analyze and understand them. Here we present MetaPhinder, a method to identify assembled genomic fragments (i.e. contigs) of phage origin in metagenomic data sets. The method is based on a comparison to a database of whole genome bacteriophage sequences, integrating hits to multiple genomes to accommodate the mosaic genome structure of many bacteriophages. The method is demonstrated to outperform both BLAST methods based on single hits and methods based on k-mer comparisons. MetaPhinder is available as a web service at the Center for Genomic Epidemiology https://cge.cbs.dtu.dk/services/MetaPhinder/, while the source code can be downloaded from https://bitbucket.org/genomicepidemiology/metaphinder or https://github.com/vanessajurtz/MetaPhinder. PMID:27684958

  17. Construction and Screening of Marine Metagenomic Large Insert Libraries.

    PubMed

    Weiland-Bräuer, Nancy; Langfeldt, Daniela; Schmitz, Ruth A

    2017-01-01

    The marine environment covers more than 70% of the world's surface. Marine microbial communities are highly diverse and have evolved during extended evolutionary processes of physiological adaptations under the influence of a variety of ecological conditions and selection pressures. They harbor an enormous diversity of microbes with still unknown and probably new physiological characteristics. In the past, marine microbes, mostly bacteria of microbial consortia attached to marine tissues of multicellular organisms, have proven to be a rich source of highly potent bioactive compounds, which represent a considerable number of drug candidates. However, to date, the biodiversity of marine microbes and the versatility of their bioactive compounds and metabolites have not been fully explored. This chapter describes sampling in the marine environment, construction of metagenomic large insert libraries from marine habitats, and, as an example, one function-based screen of metagenomic clones for identification of quorum quenching activities.

  18. Synthesis and evaluation of a series of 6-chloro-4-methylumbelliferyl glycosides as fluorogenic reagents for screening metagenomic libraries for glycosidase activity.

    PubMed

    Chen, Hong-Ming; Armstrong, Zachary; Hallam, Steven J; Withers, Stephen G

    2016-02-08

    Screening of large enzyme libraries such as those derived from metagenomic sources requires sensitive substrates. Fluorogenic glycosides typically offer the best sensitivity but typically must be used in a stopped format to generate good signal. Use of fluorescent phenols of pKa < 7, such as halogenated coumarins, allows direct screening at neutral pH. The synthesis and characterisation of a set of nine different glycosides of 6-chloro-4-methylumbelliferone are described. The use of these substrates in a pooled format for screening of expressed metagenomic libraries yielded a "hit rate" of 1 in 60. Hits were then readily deconvoluted with the individual substrates in a single plate to identify specific activities within each clone. The use of such a collection of substrates greatly accelerates the screening process. Copyright © 2015 Elsevier Ltd. All rights reserved.

  19. Cloning and biochemical characterization of a novel lipolytic gene from activated sludge metagenome, and its gene product

    PubMed Central

    2010-01-01

    In this study, a putative esterase, designated EstMY, was isolated from an activated sludge metagenomic library. The lipolytic gene was subcloned and expressed in Escherichia coli BL21 using the pET expression system. The gene estMY contained a 1,083 bp open reading frame (ORF) encoding a polypeptide of 360 amino acids with a molecular mass of 38 kDa. Sequence analysis indicated that it showed 71% and 52% amino acid identity to esterase/lipase from marine metagenome (ACL67845) and Burkholderia ubonensis Bu (ZP_02382719), respectively; and several conserved regions were identified, including the putative active site, GDSAG, a catalytic triad (Ser203, Asp301, and His327) and a HGGG conserved motif (starting from His133). The EstMY was determined to hydrolyse p-nitrophenyl (NP) esters of fatty acids with short chain lengths (≤C8). This EstMY exhibited the highest activity at 35°C and pH 8.5 respectively, by hydrolysis of p-NP caprylate. It also exhibited the same level of activity over wide temperature and pH spectra and in the presence of metal ions or detergents. The high level of stability of esterase EstMY with unique substrate specificities makes it highly valuable for downstream biotechnological applications. PMID:21054894

  20. Biotechnological applications of functional metagenomics in the food and pharmaceutical industries.

    PubMed

    Coughlan, Laura M; Cotter, Paul D; Hill, Colin; Alvarez-Ordóñez, Avelino

    2015-01-01

    Microorganisms are found throughout nature, thriving in a vast range of environmental conditions. The majority of them are unculturable or difficult to culture by traditional methods. Metagenomics enables the study of all microorganisms, regardless of whether they can be cultured or not, through the analysis of genomic data obtained directly from an environmental sample, providing knowledge of the species present, and allowing the extraction of information regarding the functionality of microbial communities in their natural habitat. Function-based screenings, following the cloning and expression of metagenomic DNA in a heterologous host, can be applied to the discovery of novel proteins of industrial interest encoded by the genes of previously inaccessible microorganisms. Functional metagenomics has considerable potential in the food and pharmaceutical industries, where it can, for instance, aid (i) the identification of enzymes with desirable technological properties, capable of catalyzing novel reactions or replacing existing chemically synthesized catalysts which may be difficult or expensive to produce, and able to work under a wide range of environmental conditions encountered in food and pharmaceutical processing cycles including extreme conditions of temperature, pH, osmolarity, etc; (ii) the discovery of novel bioactives including antimicrobials active against microorganisms of concern both in food and medical settings; (iii) the investigation of industrial and societal issues such as antibiotic resistance development. This review article summarizes the state-of-the-art functional metagenomic methods available and discusses the potential of functional metagenomic approaches to mine as yet unexplored environments to discover novel genes with biotechnological application in the food and pharmaceutical industries.

  1. Biotechnological applications of functional metagenomics in the food and pharmaceutical industries

    PubMed Central

    Coughlan, Laura M.; Cotter, Paul D.; Hill, Colin; Alvarez-Ordóñez, Avelino

    2015-01-01

    Microorganisms are found throughout nature, thriving in a vast range of environmental conditions. The majority of them are unculturable or difficult to culture by traditional methods. Metagenomics enables the study of all microorganisms, regardless of whether they can be cultured or not, through the analysis of genomic data obtained directly from an environmental sample, providing knowledge of the species present, and allowing the extraction of information regarding the functionality of microbial communities in their natural habitat. Function-based screenings, following the cloning and expression of metagenomic DNA in a heterologous host, can be applied to the discovery of novel proteins of industrial interest encoded by the genes of previously inaccessible microorganisms. Functional metagenomics has considerable potential in the food and pharmaceutical industries, where it can, for instance, aid (i) the identification of enzymes with desirable technological properties, capable of catalyzing novel reactions or replacing existing chemically synthesized catalysts which may be difficult or expensive to produce, and able to work under a wide range of environmental conditions encountered in food and pharmaceutical processing cycles including extreme conditions of temperature, pH, osmolarity, etc; (ii) the discovery of novel bioactives including antimicrobials active against microorganisms of concern both in food and medical settings; (iii) the investigation of industrial and societal issues such as antibiotic resistance development. This review article summarizes the state-of-the-art functional metagenomic methods available and discusses the potential of functional metagenomic approaches to mine as yet unexplored environments to discover novel genes with biotechnological application in the food and pharmaceutical industries. PMID:26175729

  2. New FeFe-hydrogenase genes identified in a metagenomic fosmid library from a municipal wastewater treatment plant as revealed by high-throughput sequencing.

    PubMed

    Tomazetto, Geizecler; Wibberg, Daniel; Schlüter, Andreas; Oliveira, Valéria M

    2015-01-01

    A fosmid metagenomic library was constructed with total community DNA obtained from a municipal wastewater treatment plant (MWWTP), with the aim of identifying new FeFe-hydrogenase genes encoding the enzymes most important for hydrogen metabolism. The dataset generated by pyrosequencing of a fosmid library was mined to identify environmental gene tags (EGTs) assigned to FeFe-hydrogenase. The majority of EGTs representing FeFe-hydrogenase genes were affiliated with the class Clostridia, suggesting that this group is the main hydrogen producer in the MWWTP analyzed. Based on assembled sequences, three FeFe-hydrogenase genes were predicted based on detection of the L2 motif (MPCxxKxxE) in the encoded gene product, confirming true FeFe-hydrogenase sequences. These sequences were used to design specific primers to detect fosmids encoding FeFe-hydrogenase genes predicted from the dataset. Three identified fosmids were completely sequenced. The cloned genomic fragments within these fosmids are closely related to members of the Spirochaetaceae, Bacteroidales and Firmicutes, and their FeFe-hydrogenase sequences are characterized by the structure type M3, which is common to clostridial enzymes. FeFe-hydrogenase sequences found in this study represent hitherto undetected sequences, indicating the high genetic diversity regarding these enzymes in MWWTP. Results suggest that MWWTP have to be considered as reservoirs for new FeFe-hydrogenase genes. Copyright © 2014 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
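
The motif-based prediction step can be illustrated with a short Python sketch that scans a protein sequence for the L2 motif pattern given in the abstract, MPCxxKxxE, where x is taken to mean any single residue. The example sequence is invented for illustration and is not a real hydrogenase; the function name `find_l2_motif` is hypothetical, and the study's actual search tooling is not described in the abstract.

```python
import re

# L2 motif from the abstract: M-P-C-x-x-K-x-x-E, with x as any amino acid letter.
L2_MOTIF = re.compile(r"MPC[A-Z]{2}K[A-Z]{2}E")


def find_l2_motif(protein_sequence):
    """Return 0-based start positions of L2 motif matches in a protein sequence."""
    return [match.start() for match in L2_MOTIF.finditer(protein_sequence.upper())]


# Hypothetical peptide containing one motif instance starting at index 2.
toy_protein = "AAMPCTAKKAEGG"
print(find_l2_motif(toy_protein))
```

A pattern match of this kind only flags candidates; in the study, the predicted genes were subsequently used to design primers and the corresponding fosmids were fully sequenced.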

  3. Metagenomics: A new horizon in cancer research

    PubMed Central

    Banerjee, Joyita; Mishra, Neetu; Dhas, Yogita

    2015-01-01

    Metagenomics has broadened the scope of targeting microbes responsible for inducing various types of cancers. About 16.1% of cancers are associated with microbial infection. Metagenomics is an equitable way of identifying and studying micro-organisms within their habitat. In cancer research, this approach has revolutionized the way of identifying, analyzing and targeting the microbial diversity present in the tissue specimens of cancer patients. The genomic analyses of these micro-organisms through next generation sequencing techniques help in recognizing the microbial population in biopsies and their evolutionary relationships with each other. In this review an attempt has been made to present a current metagenomic view of the cancer microbiota. Different types of micro-organisms have been found to be linked to various types of cancers, thus contributing significantly to understanding the disease at the molecular level. PMID:26110115

  4. Genome signature analysis of thermal virus metagenomes reveals Archaea and thermophilic signatures.

    PubMed

    Pride, David T; Schoenfeld, Thomas

    2008-09-17

    Metagenomic analysis provides a rich source of biological information for otherwise intractable viral communities. However, study of viral metagenomes has been hampered by its nearly complete reliance on BLAST algorithms for identification of DNA sequences. We sought to develop algorithms for examination of viral metagenomes to identify the origin of sequences independent of BLAST algorithms. We chose viral metagenomes obtained from two hot springs, Bear Paw and Octopus, in Yellowstone National Park, as they represent simple microbial populations where comparatively large contigs were obtained. Thermal spring metagenomes have high proportions of sequences without significant Genbank homology, which has hampered identification of viruses and their linkage with hosts. To analyze each metagenome, we developed a method to classify DNA fragments using genome signature-based phylogenetic classification (GSPC), where metagenomic fragments are compared to a database of oligonucleotide signatures for all previously sequenced Bacteria, Archaea, and viruses. From both Bear Paw and Octopus hot springs, each assembled contig had more similarity to other metagenome contigs than to any sequenced microbial genome based on GSPC analysis, suggesting a genome signature common to each of these extreme environments. While viral metagenomes from Bear Paw and Octopus share some similarity, the genome signatures from each locale are largely unique. GSPC using a microbial database predicts most of the Octopus metagenome has archaeal signatures, while bacterial signatures predominate in Bear Paw; a finding consistent with those of Genbank BLAST. When using a viral database, the majority of the Octopus metagenome is predicted to belong to archaeal virus Families Globuloviridae and Fuselloviridae, while none of the Bear Paw metagenome is predicted to belong to archaeal viruses. 
    As expected, when microbial and viral databases are combined, each of the Octopus and Bear Paw metagenomic contigs
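    The idea of comparing a fragment to a database of oligonucleotide signatures can be sketched as follows. This is a generic k-mer frequency comparison under stated assumptions, not the GSPC algorithm itself: the Euclidean distance, the function names, and the toy reference sequences are all choices made for this illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt

def kmer_profile(seq: str, k: int = 4) -> dict[str, float]:
    """Relative frequency of every DNA k-mer; windows with non-ACGT symbols are skipped."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}

def signature_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Euclidean distance between two k-mer profiles built with the same k."""
    return sqrt(sum((a[kmer] - b[kmer]) ** 2 for kmer in a))

def classify(fragment: str, references: dict[str, str], k: int = 4) -> str:
    """Return the name of the reference whose k-mer profile is closest to the fragment."""
    frag = kmer_profile(fragment, k)
    return min(
        references,
        key=lambda name: signature_distance(frag, kmer_profile(references[name], k)),
    )

# Toy references invented for illustration only.
references = {
    "AT_rich": "ATATATTTAAATATAT" * 5,
    "GC_rich": "GCGCGGCCGCGCCGGC" * 5,
}
print(classify("ATATTTAAATAT", references, k=2))  # AT_rich
```

    Because the comparison uses composition rather than alignment, it can assign a fragment even when no homologous sequence exists in the reference set, which is the limitation of BLAST-based identification that the abstract highlights.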

  5. Improving microbial fitness in the mammalian gut by in vivo temporal functional metagenomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yaung, Stephanie J.; Deng, Luxue; Li, Ning

    Elucidating functions of commensal microbial genes in the mammalian gut is challenging because many commensals are recalcitrant to laboratory cultivation and genetic manipulation. We present Temporal FUnctional Metagenomics sequencing (TFUMseq), a platform to functionally mine bacterial genomes for genes that contribute to fitness of commensal bacteria in vivo. Our approach uses metagenomic DNA to construct large-scale heterologous expression libraries that are tracked over time in vivo by deep sequencing and computational methods. To demonstrate our approach, we built a TFUMseq plasmid library using the gut commensal Bacteroides thetaiotaomicron (Bt) and introduced Escherichia coli carrying this library into germfree mice. Population dynamics of library clones revealed Bt genes conferring significant fitness advantages in E. coli over time, including carbohydrate utilization genes, with a Bt galactokinase central to early colonization, and subsequent dominance by a Bt glycoside hydrolase enabling sucrose metabolism coupled with co-evolution of the plasmid library and E. coli genome driving increased galactose utilization. Here, our findings highlight the utility of functional metagenomics for engineering commensal bacteria with improved properties, including expanded colonization capabilities in vivo.

  6. Improving microbial fitness in the mammalian gut by in vivo temporal functional metagenomics

    DOE PAGES

    Yaung, Stephanie J.; Deng, Luxue; Li, Ning; ...

    2015-03-11

    Elucidating functions of commensal microbial genes in the mammalian gut is challenging because many commensals are recalcitrant to laboratory cultivation and genetic manipulation. We present Temporal FUnctional Metagenomics sequencing (TFUMseq), a platform to functionally mine bacterial genomes for genes that contribute to fitness of commensal bacteria in vivo. Our approach uses metagenomic DNA to construct large-scale heterologous expression libraries that are tracked over time in vivo by deep sequencing and computational methods. To demonstrate our approach, we built a TFUMseq plasmid library using the gut commensal Bacteroides thetaiotaomicron (Bt) and introduced Escherichia coli carrying this library into germfree mice. Population dynamics of library clones revealed Bt genes conferring significant fitness advantages in E. coli over time, including carbohydrate utilization genes, with a Bt galactokinase central to early colonization, and subsequent dominance by a Bt glycoside hydrolase enabling sucrose metabolism coupled with co-evolution of the plasmid library and E. coli genome driving increased galactose utilization. Here, our findings highlight the utility of functional metagenomics for engineering commensal bacteria with improved properties, including expanded colonization capabilities in vivo.
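    Tracking library clones over time by sequencing amounts, at its simplest, to comparing each clone's share of reads between time points. The sketch below is one assumed way to do that (log2 ratio of relative abundances with a pseudocount); it is not the TFUMseq computational method, and the clone names and counts are hypothetical.

```python
from math import log2

def relative_abundance(counts: dict[str, float]) -> dict[str, float]:
    """Convert per-clone read counts into fractions of the total."""
    total = sum(counts.values())
    return {clone: n / total for clone, n in counts.items()}

def log2_enrichment(early: dict[str, int], late: dict[str, int],
                    pseudocount: int = 1) -> dict[str, float]:
    """Log2 change in relative abundance for every clone seen at the early time point.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for clones that drop out; clones absent from `early` are ignored.
    """
    early_rel = relative_abundance({c: n + pseudocount for c, n in early.items()})
    late_rel = relative_abundance({c: late.get(c, 0) + pseudocount for c in early})
    return {c: log2(late_rel[c] / early_rel[c]) for c in early}

# Hypothetical read counts for two clones at two time points.
early = {"clone_a": 99, "clone_b": 99}
late = {"clone_a": 299, "clone_b": 99}
for clone, value in log2_enrichment(early, late).items():
    print(clone, round(value, 3))
```

    A positive value marks a clone that gained share between the two samples, which is the kind of signal the abstract interprets as a fitness advantage.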

  7. Functional metagenomic selection of RubisCOs from uncultivated bacteria

    USGS Publications Warehouse

    Varaljay, Vanessa A; Satagopan, Sriram; North, Justin A.; Witteveen, Briana; Dourado, Manuella N.; Anantharaman, Karthik; Arbing, Mark A.; McCann, Shelley; Oremland, Ronald S.; Banfield, Jillian F.; Wrighton, Kelly C.; Tabita, F. Robert

    2016-01-01

    Ribulose 1,5-bisphosphate carboxylase/oxygenase (RubisCO) is a critical yet severely inefficient enzyme that catalyses the fixation of virtually all of the carbon found on Earth. Here, we report a functional metagenomic selection that recovers physiologically active RubisCO molecules directly from uncultivated and largely unknown members of natural microbial communities. Selection is based on CO2-dependent growth in a host strain capable of expressing environmental deoxyribonucleic acid (DNA), precluding the need for pure cultures or screening of recombinant clones for enzymatic activity. Seventeen functional RubisCO-encoded sequences were selected using DNA extracted from soil and river autotrophic enrichments, a photosynthetic biofilm and a subsurface groundwater aquifer. Notably, three related form II RubisCOs were recovered which share high sequence similarity with metagenomic scaffolds from uncultivated members of the Gallionellaceae family. One of the Gallionellaceae RubisCOs was purified and shown to possess CO2/O2 specificity typical of form II enzymes. X-ray crystallography determined that this enzyme is a hexamer, only the second form II multimer ever solved and the first RubisCO structure obtained from an uncultivated bacterium. Functional metagenomic selection leverages natural biological diversity and billions of years of evolution inherent in environmental communities, providing a new window into the discovery of CO2-fixing enzymes not previously characterized.

  8. Construction of a metagenomic DNA library of sponge symbionts and screening of antibacterial metabolites

    NASA Astrophysics Data System (ADS)

    Chen, Juan; Zhu, Tianjiao; Li, Dehai; Cui, Chengbin; Fang, Yuchun; Liu, Hongbing; Liu, Peipei; Gu, Qianqun; Zhu, Weiming

    2006-04-01

    To study the bioactive metabolites produced by sponge-derived uncultured symbionts, a metagenomic DNA library of the symbionts of the sponge Gelliodes gracilis was constructed. The average size of DNA inserts in the library was 20 kb. This library was screened for antibiotic activity using a paper disc assay. Two clones displayed antibacterial activity against Micrococcus tetragenus. The metabolites of these two clones were analyzed by HPLC. The results showed that their metabolites were quite different from those of the host E. coli DH5α and the host containing vector pHZ132. This study may present a new approach to exploring bioactive metabolites of sponge symbionts.

  9. Antibiotic Resistome: Improving Detection and Quantification Accuracy for Comparative Metagenomics.

    PubMed

    Elbehery, Ali H A; Aziz, Ramy K; Siam, Rania

    2016-04-01

    The unprecedented rise of life-threatening antibiotic resistance (AR), combined with the unparalleled advances in DNA sequencing of genomes and metagenomes, has pushed the need for in silico detection of the resistance potential of clinical and environmental metagenomic samples through the quantification of AR genes (i.e., genes conferring antibiotic resistance). Therefore, determining an optimal methodology to quantitatively and accurately assess AR genes in a given environment is pivotal. Here, we optimized and improved existing AR detection methodologies from metagenomic datasets to properly consider AR-generating mutations in antibiotic target genes. Through comparative metagenomic analysis of previously published AR gene abundance in three publicly available metagenomes, we illustrate how mutation-generated resistance genes are either falsely assigned or neglected, which alters the detection and quantitation of the antibiotic resistome. In addition, we inspected factors influencing the outcome of AR gene quantification using metagenome simulation experiments, and identified that genome size, AR gene length, total number of metagenomics reads and selected sequencing platforms had pronounced effects on the level of detected AR. In conclusion, our proposed improvements in the current methodologies for accurate AR detection and resistome assessment show reliable results when tested on real and simulated metagenomic datasets.
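    The abstract notes that gene length and total read count affect the level of detected AR. One common way to make counts comparable across genes and samples is a reads-per-kilobase-per-million normalization. The sketch below shows that generic calculation as an assumption; the abstract does not say which normalization the authors used, and the numbers are hypothetical.

```python
def reads_per_kb_per_million(read_count: int, gene_length_bp: int,
                             total_reads: int) -> float:
    """Normalize a gene's read count by gene length (kb) and sequencing depth (millions)."""
    if gene_length_bp <= 0 or total_reads <= 0:
        raise ValueError("gene length and total reads must be positive")
    return read_count / (gene_length_bp / 1_000) / (total_reads / 1_000_000)

# Hypothetical example: the same raw count in two different contexts.
print(reads_per_kb_per_million(50, 1_000, 1_000_000))  # 50.0
print(reads_per_kb_per_million(50, 2_000, 2_000_000))  # 12.5
```

    The two calls start from the same 50 mapped reads but give a fourfold difference after normalization, which is why raw counts alone can mislead comparisons between metagenomes.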

  10. Genome signature analysis of thermal virus metagenomes reveals Archaea and thermophilic signatures

    PubMed Central

    Pride, David T; Schoenfeld, Thomas

    2008-01-01

    Background Metagenomic analysis provides a rich source of biological information for otherwise intractable viral communities. However, study of viral metagenomes has been hampered by its nearly complete reliance on BLAST algorithms for identification of DNA sequences. We sought to develop algorithms for examination of viral metagenomes to identify the origin of sequences independent of BLAST algorithms. We chose viral metagenomes obtained from two hot springs, Bear Paw and Octopus, in Yellowstone National Park, as they represent simple microbial populations where comparatively large contigs were obtained. Thermal spring metagenomes have high proportions of sequences without significant Genbank homology, which has hampered identification of viruses and their linkage with hosts. To analyze each metagenome, we developed a method to classify DNA fragments using genome signature-based phylogenetic classification (GSPC), where metagenomic fragments are compared to a database of oligonucleotide signatures for all previously sequenced Bacteria, Archaea, and viruses. Results From both Bear Paw and Octopus hot springs, each assembled contig had more similarity to other metagenome contigs than to any sequenced microbial genome based on GSPC analysis, suggesting a genome signature common to each of these extreme environments. While viral metagenomes from Bear Paw and Octopus share some similarity, the genome signatures from each locale are largely unique. GSPC using a microbial database predicts most of the Octopus metagenome has archaeal signatures, while bacterial signatures predominate in Bear Paw; a finding consistent with those of Genbank BLAST. When using a viral database, the majority of the Octopus metagenome is predicted to belong to archaeal virus Families Globuloviridae and Fuselloviridae, while none of the Bear Paw metagenome is predicted to belong to archaeal viruses. As expected, when microbial and viral databases are combined, each of the Octopus and Bear Paw

  11. The Metagenome-Derived Enzymes LipS and LipT Increase the Diversity of Known Lipases

    PubMed Central

    Chow, Jennifer; Kovacic, Filip; Dall Antonia, Yuliya; Krauss, Ulrich; Fersini, Francesco; Schmeisser, Christel; Lauinger, Benjamin; Bongen, Patrick; Pietruszka, Joerg; Schmidt, Marlen; Menyes, Ina; Bornscheuer, Uwe T.; Eckstein, Marrit; Thum, Oliver; Liese, Andreas; Mueller-Dieckmann, Jochen; Jaeger, Karl-Erich; Streit, Wolfgang R.

    2012-01-01

    Triacylglycerol lipases (EC 3.1.1.3) catalyze both hydrolysis and synthesis reactions with a broad spectrum of substrates rendering them especially suitable for many biotechnological applications. Most lipases used today originate from mesophilic organisms and are susceptible to thermal denaturation whereas only few possess high thermotolerance. Here, we report on the identification and characterization of two novel thermostable bacterial lipases identified by functional metagenomic screenings. Metagenomic libraries were constructed from enrichment cultures maintained at 65 to 75°C and screened resulting in the identification of initially 10 clones with lipolytic activities. Subsequently, two ORFs were identified encoding lipases, LipS and LipT. Comparative sequence analyses suggested that both enzymes are members of novel lipase families. LipS is a 30.2 kDa protein and revealed a half-life of 48 h at 70°C. The lipT gene encoded for a multimeric enzyme with a half-life of 3 h at 70°C. LipS had an optimum temperature at 70°C and LipT at 75°C. Both enzymes catalyzed hydrolysis of long-chain (C12 and C14) fatty acid esters and additionally hydrolyzed a number of industry-relevant substrates. LipS was highly specific for (R)-ibuprofen-phenyl ester with an enantiomeric excess (ee) of 99%. Furthermore, LipS was able to synthesize 1-propyl laurate and 1-tetradecyl myristate at 70°C with rates similar to those of the lipase CalB from Candida antarctica. LipS represents the first example of a thermostable metagenome-derived lipase with significant synthesis activities. Its X-ray structure was solved with a resolution of 1.99 Å revealing an unusually compact lid structure. PMID:23112831
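    The half-lives and the enantiomeric excess quoted above can be turned into quick worked numbers. This sketch assumes simple first-order (exponential) inactivation, which the abstract does not state, and uses the standard definition of ee; the function names are illustrative.

```python
def residual_activity(hours: float, half_life_hours: float) -> float:
    """Fraction of activity remaining after `hours`, assuming first-order decay."""
    return 0.5 ** (hours / half_life_hours)

def enantiomeric_excess(major: float, minor: float) -> float:
    """Enantiomeric excess in percent from the amounts of the two enantiomers."""
    return 100 * (major - minor) / (major + minor)

# Half-lives at 70 °C from the abstract: LipS 48 h, LipT 3 h.
for name, half_life in (("LipS", 48), ("LipT", 3)):
    remaining = residual_activity(24, half_life)
    print(f"{name}: {remaining:.1%} of activity left after 24 h")

# An ee of 99% corresponds to a 99.5 : 0.5 mixture of the two enantiomers.
print(round(enantiomeric_excess(99.5, 0.5), 6))
```

    Under that assumption, LipS would retain roughly 71% of its activity after a day at 70 °C while LipT would retain well under 1%, which makes the sixteenfold difference in half-life concrete.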

  12. Integrating metagenomic and amplicon databases to resolve the phylogenetic and ecological diversity of the Chlamydiae

    PubMed Central

    Lagkouvardos, Ilias; Weinmaier, Thomas; Lauro, Federico M; Cavicchioli, Ricardo; Rattei, Thomas; Horn, Matthias

    2014-01-01

    In the era of metagenomics and amplicon sequencing, comprehensive analyses of available sequence data remain a challenge. Here we describe an approach exploiting metagenomic and amplicon data sets from public databases to elucidate phylogenetic diversity of defined microbial taxa. We investigated the phylum Chlamydiae whose known members are obligate intracellular bacteria that represent important pathogens of humans and animals, as well as symbionts of protists. Despite their medical relevance, our knowledge about chlamydial diversity is still scarce. Most of the nine known families are represented by only a few isolates, while previous clone library-based surveys suggested the existence of yet uncharacterized members of this phylum. Here we identified more than 22 000 high quality, non-redundant chlamydial 16S rRNA gene sequences in diverse databases, as well as 1900 putative chlamydial protein-encoding genes. Even when applying the most conservative approach, clustering of chlamydial 16S rRNA gene sequences into operational taxonomic units revealed an unexpectedly high species, genus and family-level diversity within the Chlamydiae, including 181 putative families. These in silico findings were verified experimentally in one Antarctic sample, which contained a high diversity of novel Chlamydiae. In our analysis, the Rhabdochlamydiaceae, whose known members infect arthropods, represents the most diverse and species-rich chlamydial family, followed by the protist-associated Parachlamydiaceae, and a putative new family (PCF8) with unknown host specificity. Available information on the origin of metagenomic samples indicated that marine environments contain the majority of the newly discovered chlamydial lineages, highlighting this environment as an important chlamydial reservoir. PMID:23949660

  13. Diversity of hydrolases from hydrothermal vent sediments of the Levante Bay, Vulcano Island (Aeolian archipelago) identified by activity-based metagenomics and biochemical characterization of new esterases and an arabinopyranosidase.

    PubMed

    Placido, Antonio; Hai, Tran; Ferrer, Manuel; Chernikova, Tatyana N; Distaso, Marco; Armstrong, Dale; Yakunin, Alexander F; Toshchakov, Stepan V; Yakimov, Michail M; Kublanov, Ilya V; Golyshina, Olga V; Pesole, Graziano; Ceci, Luigi R; Golyshin, Peter N

    2015-12-01

    A metagenomic fosmid expression library from environmental DNA (eDNA) from a shallow hot vent sediment sample collected from the Levante Bay, Vulcano Island (Aeolian archipelago) was established in Escherichia coli. Using activity-based screening assays, we have assessed 9600 fosmid clones corresponding to approximately 350 Mbp of the cloned eDNA, for the lipases/esterases/lactamases, haloalkane and haloacid dehalogenases, and glycoside hydrolases. Thirty-four positive fosmid clones were selected from the total of 120 positive hits and sequenced to yield ca. 1360 kbp of high-quality assemblies. Fosmid inserts were attributed to the members of ten bacterial phyla, including Proteobacteria, Bacteroidetes, Acidobacteria, Firmicutes, Verrucomicrobia, Chloroflexi, Spirochaetes, Thermotogae, Armatimonadetes, and Planctomycetes. Of ca. 200 proteins with high biotechnological potential identified therein, we have characterized in detail three distinct α/β-hydrolases (LIPESV12_9, LIPESV12_24, LIPESV12_26) and one new α-arabinopyranosidase (GLV12_5). All LIPESV12 enzymes revealed distinct substrate specificities tested against 43 structurally diverse esters and 4 p-nitrophenol carboxyl esters. Of 16 different glycosides tested, the GLV12_5 hydrolysed only p-nitrophenol-α-(L)-arabinopyranose with a high specific activity of about 2.7 kU/mg protein. Most of the α/β-hydrolases were thermophilic and revealed a high tolerance to, and high activities in the presence of, numerous heavy metal ions. Among them, the LIPESV12_24 was the best temperature-adapted, retaining its activity after 40 min of incubation at 90 °C. Furthermore, enzymes were active in organic solvents (e.g., >30% methanol). Both LIPESV12_24 and LIPESV12_26 had the GXSXG pentapeptides and the catalytic triads Ser-Asp-His typical to the representatives of carboxylesterases of EC 3.1.1.1.
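    The library figures in this abstract can be cross-checked with simple arithmetic. The sketch below only divides the quoted totals by the quoted clone counts; the helper name is illustrative, and the results are averages implied by the approximate numbers in the abstract, not values reported by the authors.

```python
def mean_insert_kb(total_kb: float, n_clones: int) -> float:
    """Average insert size in kb implied by a total amount of cloned DNA."""
    return total_kb / n_clones

# Screened library: 9600 clones, approximately 350 Mbp (350,000 kb) of cloned eDNA.
screened = mean_insert_kb(350_000, 9600)
# Sequenced subset: 34 positive clones, ca. 1360 kbp of assemblies.
sequenced = mean_insert_kb(1360, 34)
# Fraction of screened clones that gave a positive hit (120 of 9600).
hit_rate = 120 / 9600

print(f"screened library: {screened:.1f} kb per clone")
print(f"sequenced clones: {sequenced:.1f} kb per clone")
print(f"hit rate: {hit_rate:.2%}")
```

    Both averages fall in the range of a few tens of kilobases per clone, so the two sets of figures in the abstract are mutually consistent.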

  14. Metagenomic analysis reveals a green sulfur bacterium as a potential coral symbiont.

    PubMed

    Cai, Lin; Zhou, Guowei; Tian, Ren-Mao; Tong, Haoya; Zhang, Weipeng; Sun, Jin; Ding, Wei; Wong, Yue Him; Xie, James Y; Qiu, Jian-Wen; Liu, Sheng; Huang, Hui; Qian, Pei-Yuan

    2017-08-24

    Coral reefs are ecologically significant habitats. Coral-algal symbiosis confers ecological success on coral reefs and coral-microbial symbiosis is also vital to coral reefs. However, coral-microbial symbiosis remains poorly understood on a genomic scale. Here we report a potential microbial symbiont in corals revealed by a metagenomics-based genomic study. Microbial cells in coral were enriched for metagenomic analysis and a high-quality draft genome of "Candidatus Prosthecochloris korallensis" was recovered by metagenome assembly and genome binning. Phylogenetic analysis shows "Ca. P. korallensis" belongs to the Prosthecochloris clade and is clustered with two Prosthecochloris clones derived from Caribbean corals. Genomic analysis reveals "Ca. P. korallensis" has potentially important ecological functions including anoxygenic photosynthesis, carbon fixation via the reductive tricarboxylic acid (rTCA) cycle, nitrogen fixation, and sulfur oxidation. Core metabolic pathway analysis suggests "Ca. P. korallensis" is a green sulfur bacterium capable of photoautotrophy or mixotrophy. Analysis of potential host-microbial interactions suggests a symbiotic relationship: "Ca. P. korallensis" might provide organic and nitrogenous nutrients to its host and detoxify sulfide for the host; the host might provide "Ca. P. korallensis" with an anaerobic environment for survival, carbon dioxide and acetate for growth, and hydrogen sulfide as an electron donor for photosynthesis.

  15. Functional metagenomic analysis reveals rivers are a reservoir for diverse antibiotic resistance genes.

    PubMed

    Amos, G C A; Zhang, L; Hawkey, P M; Gaze, W H; Wellington, E M

    2014-07-16

    The environment harbours a significant diversity of uncultured bacteria and a potential source of novel and extant resistance genes which may recombine with clinically important bacteria disseminated into environmental reservoirs. There is evidence that pollution can select for resistance due to the aggregation of adaptive genes on mobile elements. The aim of this study was to establish the impact of waste water treatment plant (WWTP) effluent disposal to a river by using culture independent methods to study diversity of resistance genes downstream of the WWTP in comparison to upstream. Metagenomic libraries were constructed in Escherichia coli and screened for phenotypic resistance to amikacin, gentamicin, neomycin, ampicillin and ciprofloxacin. Resistance genes were identified by using transposon mutagenesis. A significant increase downstream of the WWTP was observed in the number of phenotypic resistant clones recovered in metagenomic libraries. Common β-lactamases such as blaTEM were recovered as well as a diverse range of acetyltransferases and unusual transporter genes, with evidence for newly emerging resistance mechanisms. The similarities of the predicted proteins to known sequences suggested origins of genes from a very diverse range of bacteria. The study suggests that waste water disposal increases the reservoir of resistance mechanisms in the environment either by addition of resistance genes or by input of agents selective for resistant phenotypes. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.

  16. Oral Metagenomic Biomarkers in Rheumatoid Arthritis

    DTIC Science & Technology

    2017-09-01

    Award number: W81XWH-15-1-0320. Title: Oral Metagenomic Biomarkers in Rheumatoid Arthritis. Principal investigator: Edward K Chan. ... individuals with rheumatoid arthritis (RA). The goal is to test the hypothesis that oral microbiome and metagenomic analyses will allow us to identify new

  17. Cloning and heterologous expression of plnE, -F, -J and -K genes derived from soil metagenome and purification of active plantaricin peptides.

    PubMed

    Pal, Gargi; Srivastava, Sheela

    2014-02-01

    Plantaricin gene-specific primers were used to obtain plnE, -F, -J and -K structural gene amplicons from soil metagenome. These amplicons were cloned and expressed in pET32a(+) vector in Escherichia coli BL21 (DE3). PlnE, -F, -J and -K peptides were expressed as His-tagged-fusion proteins and were separated by Ni(2+)-chelating affinity chromatography. The peptides were released from the fusion by enterokinase cleavage and separated from the carrier thioredoxin. The cleaved peptides were further analysed for antimicrobial activity and found to be active against Listeria innocua NRRL B33314, Micrococcus luteus MTCC 106 and lactic acid bacteria, such as Enterococcus casseliflavus NRRL B3502, Lactococcus lactis lactis NRRL 1821, Lactobacillus curvatus NRRL B4562 and Lactobacillus plantarum NRRL B4496. E. coli has been successfully exploited as a host for heterologous expression with a significant yield of fused and cleaved peptides in the range of 8-12 and 1-1.5 mg/l of the culture, respectively. Heterologous expression, therefore, can be used to overcome the constraints of low yield often reported from a native strain.

  18. A metagenomic viral discovery approach identifies potential zoonotic and novel mammalian viruses in Neoromicia bats within South Africa.

    PubMed

    Geldenhuys, Marike; Mortlock, Marinda; Weyer, Jacqueline; Bezuidt, Oliver; Seamark, Ernest C J; Kearney, Teresa; Gleasner, Cheryl; Erkkila, Tracy H; Cui, Helen; Markotter, Wanda

    2018-01-01

    Species within the Neoromicia bat genus are abundant and widely distributed in Africa. It is common for these insectivorous bats to roost in anthropogenic structures in urban regions. Additionally, Neoromicia capensis have previously been identified as potential hosts for Middle East respiratory syndrome (MERS)-related coronaviruses. This study aimed to ascertain the gastrointestinal virome of these bats, as viruses excreted in fecal material or which may be replicating in rectal or intestinal tissues have the greatest opportunities of coming into contact with other hosts. Samples were collected in five regions of South Africa over eight years. Initial virome composition was determined by viral metagenomic sequencing by pooling samples and enriching for viral particles. Libraries were sequenced on the Illumina MiSeq and NextSeq500 platforms, producing a combined 37 million reads. Bioinformatics analysis of the high throughput sequencing data detected the full genome of a novel species of the Circoviridae family, and also identified sequence data from the Adenoviridae, Coronaviridae, Herpesviridae, Parvoviridae, Papillomaviridae, Phenuiviridae, and Picornaviridae families. Metagenomic sequencing data was insufficient to determine the viral diversity of certain families due to the fragmented coverage of genomes and lack of suitable sequencing depth, as some viruses were detected from the analysis of reads-data only. Follow up conventional PCR assays targeting conserved gene regions for the Adenoviridae, Coronaviridae, and Herpesviridae families were used to confirm metagenomic data and generate additional sequences to determine genetic diversity. The complete coding genome of a MERS-related coronavirus was recovered with additional amplicon sequencing on the MiSeq platform. The new genome shared 97.2% overall nucleotide identity to a previous Neoromicia-associated MERS-related virus, also from South Africa. Conventional PCR analysis detected diverse adenovirus and

  19. Normalization of environmental metagenomic DNA enhances the discovery of under-represented microbial community members.

    PubMed

    Ramond, J-B; Makhalanyane, T P; Tuffin, M I; Cowan, D A

    2015-04-01

    Normalization is a procedure classically employed to detect rare sequences in cellular expression profiles (i.e. cDNA libraries). Here, we present a normalization protocol involving the direct treatment of extracted environmental metagenomic DNA with S1 nuclease, referred to as normalization of metagenomic DNA: NmDNA. We demonstrate that NmDNA, prior to post hoc PCR-based experiments (16S rRNA gene T-RFLP fingerprinting and clone library), increased the diversity of sequences retrieved from environmental microbial communities by detection of rarer sequences. This approach could be used to enhance the resolution of detection of ecologically relevant rare members in environmental microbial assemblages and therefore is promising in enabling a better understanding of ecosystem functioning. This study is the first testing 'normalization' on environmental metagenomic DNA (mDNA). The aim of this procedure was to improve the identification of rare phylotypes in environmental communities. Using hypoliths as model systems, we present evidence that this post-mDNA extraction molecular procedure substantially enhances the detection of less common phylotypes and could even lead to the discovery of novel microbial genotypes within a given environment. © 2014 The Society for Applied Microbiology.

  20. Simultaneous virus identification and characterization of severe unexplained pneumonia cases using a metagenomics sequencing technique.

    PubMed

    Zou, Xiaohui; Tang, Guangpeng; Zhao, Xiang; Huang, Yan; Chen, Tao; Lei, Mingyu; Chen, Wenbing; Yang, Lei; Zhu, Wenfei; Zhuang, Li; Yang, Jing; Feng, Zhaomin; Wang, Dayan; Wang, Dingming; Shu, Yuelong

    2017-03-01

    Many viruses can cause respiratory diseases in humans. Although great advances have been achieved in methods of diagnosis, it remains challenging to identify pathogens in unexplained pneumonia (UP) cases. In this study, we applied next-generation sequencing (NGS) technology and a metagenomic approach to detect and characterize respiratory viruses in UP cases from Guizhou Province, China. A total of 33 oropharyngeal swabs were obtained from hospitalized UP patients and subjected to NGS. An unbiased metagenomic analysis pipeline identified 13 virus species in 16 samples. Human rhinovirus C was the virus most frequently detected and was identified in seven samples. Human measles virus, adenovirus B 55 and coxsackievirus A10 were also identified. Metagenomic sequencing also provided virus genomic sequences, which enabled genotype characterization and phylogenetic analysis. For cases of multiple infection, metagenomic sequencing afforded information regarding the quantity of each virus in the sample, which could be used to evaluate each virus's role in the disease. Our study highlights the potential of metagenomic sequencing for pathogen identification in UP cases.

  1. Swine Fecal Metagenomics

    EPA Science Inventory

    Metagenomic approaches are providing rapid and more robust means to investigate the composition and functional genetic potential of complex microbial communities. In this study, we utilized a metagenomic approach to further understand the functional diversity of the swine gut. To...

  2. Gene Expression and Molecular Characterization of a Xylanase from Chicken Cecum Metagenome

    PubMed Central

    AL-Darkazali, Hind; Meevootisom, Vithaya

    2017-01-01

    A xylanase gene xynAMG1 with a 1,116-bp open reading frame, encoding an endo-β-1,4-xylanase, was cloned from a chicken cecum metagenome. The translated XynAMG1 protein consisted of 372 amino acids including a putative signal peptide of 23 amino acids. The calculated molecular mass of the mature XynAMG1 was 40,013 Da, with a theoretical pI value of 5.76. The amino acid sequence of XynAMG1 showed 59% identity to endo-β-1,4-xylanase from Prevotella bryantii and Prevotella ruminicola and 58% identity to that from Prevotella copri. XynAMG1 has two conserved motifs, DVVNE and TEXD, containing two active site glutamates and an invariant asparagine, characteristic of GH10 family xylanase. The xynAMG1 gene without signal peptide sequence was cloned and fused with thioredoxin protein (Trx.Tag) in pET-32a plasmid and overexpressed in Escherichia coli Tuner™(DE3)pLysS. The purified mature XynAMG1 was highly salt-tolerant and stable and displayed higher than 96% of its catalytic activity in the reaction containing 1 to 4 M NaCl. It was only slightly affected by common organic solvents added in aqueous solution to up to 5 M. This chicken cecum metagenome-derived xylanase has potential applications in animal feed additives and industrial enzymatic processes requiring exposure to high concentrations of salt and organic solvents. PMID:28751915
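    The sequence figures in this abstract fit together arithmetically, as the short sketch below shows. It uses only numbers quoted above; the variable names are illustrative, and the reading that the 1,116-bp figure corresponds to 372 codons is an inference from the abstract's own numbers rather than a statement by the authors.

```python
orf_length_bp = 1116          # open reading frame length from the abstract
signal_peptide_aa = 23        # putative signal peptide length
mature_mass_da = 40_013       # calculated mass of the mature protein

codons = orf_length_bp // 3                   # 372, matching the 372 amino acids reported
mature_aa = codons - signal_peptide_aa        # residues left after signal peptide removal
mean_residue_mass = mature_mass_da / mature_aa

print(codons, mature_aa, round(mean_residue_mass, 2))
```

    The implied average of roughly 115 Da per residue is in the usual range for proteins, so the reported mass is consistent with a mature protein of 349 residues.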

  3. Consensus statement: Virus taxonomy in the age of metagenomics.

    PubMed

    Simmonds, Peter; Adams, Mike J; Benkő, Mária; Breitbart, Mya; Brister, J Rodney; Carstens, Eric B; Davison, Andrew J; Delwart, Eric; Gorbalenya, Alexander E; Harrach, Balázs; Hull, Roger; King, Andrew M Q; Koonin, Eugene V; Krupovic, Mart; Kuhn, Jens H; Lefkowitz, Elliot J; Nibert, Max L; Orton, Richard; Roossinck, Marilyn J; Sabanadzovic, Sead; Sullivan, Matthew B; Suttle, Curtis A; Tesh, Robert B; van der Vlugt, René A; Varsani, Arvind; Zerbini, F Murilo

    2017-03-01

    The number and diversity of viral sequences that are identified in metagenomic data far exceeds that of experimentally characterized virus isolates. In a recent workshop, a panel of experts discussed the proposal that, with appropriate quality control, viruses that are known only from metagenomic data can, and should be, incorporated into the official classification scheme of the International Committee on Taxonomy of Viruses (ICTV). Although a taxonomy that is based on metagenomic sequence data alone represents a substantial departure from the traditional reliance on phenotypic properties, the development of a robust framework for sequence-based virus taxonomy is indispensable for the comprehensive characterization of the global virome. In this Consensus Statement article, we consider the rationale for why metagenomic sequence data should, and how it can, be incorporated into the ICTV taxonomy, and present proposals that have been endorsed by the Executive Committee of the ICTV.

  4. Phylogenetically Novel LuxI/LuxR-Type Quorum Sensing Systems Isolated Using a Metagenomic Approach

    PubMed Central

    Nasuno, Eri; Fujita, Masaki J.; Nakatsu, Cindy H.; Kamagata, Yoichi; Hanada, Satoshi

    2012-01-01

    A great deal of research has been done to understand bacterial cell-to-cell signaling systems, but there is still a large gap in our current knowledge because the majority of microorganisms in natural environments do not have cultivated representatives. Metagenomics is one approach to identify novel quorum sensing (QS) systems from uncultured bacteria in environmental samples. In this study, fosmid metagenomic libraries were constructed from a forest soil and an activated sludge from a coke plant, and the target genes were detected using a green fluorescent protein (GFP)-based Escherichia coli biosensor strain whose fluorescence was screened by spectrophotometry. DNA sequence analysis revealed two pairs of new LuxI family N-acyl-l-homoserine lactone (AHL) synthases and LuxR family transcriptional regulators (clones N16 and N52, designated AubI/AubR and AusI/AusR, respectively). AubI and AusI each produced an identical AHL, N-dodecanoyl-l-homoserine lactone (C12-HSL), as determined by nuclear magnetic resonance (NMR) and mass spectrometry. Phylogenetic analysis based on amino acid sequences suggested that AusI/AusR was from an uncultured member of the Betaproteobacteria and AubI/AubR was very deeply branched from previously described LuxI/LuxR homologues in isolates of the Proteobacteria. The phylogenetic position of AubI/AubR indicates that they represent a QS system not acquired recently from the Proteobacteria by horizontal gene transfer but share a more ancient ancestry. We demonstrated that metagenomic screening is useful to provide further insight into the phylogenetic diversity of bacterial QS systems by describing two new LuxI/LuxR-type QS systems from uncultured bacteria. PMID:22983963

  5. Est16, a New Esterase Isolated from a Metagenomic Library of a Microbial Consortium Specializing in Diesel Oil Degradation.

    PubMed

    Pereira, Mariana Rangel; Mercaldi, Gustavo Fernando; Maester, Thaís Carvalho; Balan, Andrea; Lemos, Eliana Gertrudes de Macedo

    2015-01-01

    Lipolytic enzymes have attracted attention from a global market because they show enormous biotechnological potential for applications such as detergent production, leather processing, cosmetics production, and use in perfumes and biodiesel. Due to the intense demand for biocatalysts, a metagenomic approach provides methods of identifying new enzymes. In this study, an esterase designated as Est16 was selected from 4224 clones of a fosmid metagenomic library, revealing an 87% amino acid identity with an esterase/lipase (accession number ADM63076.1) from an uncultured bacterium. Phylogenetic studies showed that the enzyme belongs to family V of bacterial lipolytic enzymes and has sequence and structural similarities with an aryl-esterase from Pseudomonas fluorescens and a patented Anti-Kazlauskas lipase (patent number US20050153404). The protein was expressed and purified as a highly soluble, thermally stable enzyme that showed a preference for basic pH. Est16 exhibited activity toward a wide range of substrates and the highest catalytic efficiency against p-nitrophenyl butyrate and p-nitrophenyl valerate. Est16 also showed tolerance to the presence of organic solvents, detergents and metals. Based on molecular modeling, we showed that the large alpha-beta domain is conserved in the patented enzymes but not the substrate pocket. Here, it was demonstrated that a metagenomic approach is suitable for discovering the lipolytic enzyme diversity and that Est16 has the biotechnological potential for use in industrial processes.

  6. EBI metagenomics--a new resource for the analysis and archiving of metagenomic data.

    PubMed

    Hunter, Sarah; Corbett, Matthew; Denise, Hubert; Fraser, Matthew; Gonzalez-Beltran, Alejandra; Hunter, Christopher; Jones, Philip; Leinonen, Rasko; McAnulla, Craig; Maguire, Eamonn; Maslen, John; Mitchell, Alex; Nuka, Gift; Oisel, Arnaud; Pesseat, Sebastien; Radhakrishnan, Rajesh; Rocca-Serra, Philippe; Scheremetjew, Maxim; Sterk, Peter; Vaughan, Daniel; Cochrane, Guy; Field, Dawn; Sansone, Susanna-Assunta

    2014-01-01

    Metagenomics is a relatively recently established but rapidly expanding field that uses high-throughput next-generation sequencing technologies to characterize the microbial communities inhabiting different ecosystems (including oceans, lakes, soil, tundra, plants and body sites). Metagenomics brings with it a number of challenges, including the management, analysis, storage and sharing of data. In response to these challenges, we have developed a new metagenomics resource (http://www.ebi.ac.uk/metagenomics/) that allows users to easily submit raw nucleotide reads for functional and taxonomic analysis by a state-of-the-art pipeline, and have them automatically stored (together with descriptive, standards-compliant metadata) in the European Nucleotide Archive.

  7. BioCreative Workshops for DOE Genome Sciences: Text Mining for Metagenomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, Cathy H.; Hirschman, Lynette

    The objective of this project was to host BioCreative workshops to define and develop text mining tasks to meet the needs of the Genome Sciences community, focusing on metadata information extraction in metagenomics. Following the successful introduction of metagenomics at the BioCreative IV workshop, members of the metagenomics community and BioCreative communities continued discussion to identify candidate topics for a BioCreative metagenomics track for BioCreative V. Of particular interest was the capture of environmental and isolation source information from text. The outcome was to form a “community of interest” around work on the interactive EXTRACT system, which supported interactive tagging of environmental and species data. This experiment is included in the BioCreative V virtual issue of Database. In addition, there was broad participation by members of the metagenomics community in the panels held at BioCreative V, leading to valuable exchanges between the text mining developers and members of the metagenomics research community. These exchanges are reflected in a number of the overview and perspective pieces also being captured in the BioCreative V virtual issue. Overall, this conversation has exposed the metagenomics researchers to the possibilities of text mining, and educated the text mining developers to the specific needs of the metagenomics community.

  8. Metagenome Assembly at the DOE JGI (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Chain, Patrick

    2018-01-25

    Patrick Chain of DOE JGI at LANL, Co-Chair of the Metagenome-specific Assembly session, speaks on Metagenome Assembly at the DOE JGI at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  9. A Graph-Centric Approach for Metagenome-Guided Peptide and Protein Identification in Metaproteomics

    PubMed Central

    Tang, Haixu; Li, Sujun; Ye, Yuzhen

    2016-01-01

    Metaproteomic studies adopt the common bottom-up proteomics approach to investigate the protein composition and the dynamics of protein expression in microbial communities. When matched metagenomic and/or metatranscriptomic data of the microbial communities are available, metaproteomic data analyses often employ a metagenome-guided approach, in which complete or fragmental protein-coding genes are first directly predicted from metagenomic (and/or metatranscriptomic) sequences or from their assemblies, and the resulting protein sequences are then used as the reference database for peptide/protein identification from MS/MS spectra. This approach is often limited because protein coding genes predicted from metagenomes are incomplete and fragmental. In this paper, we present a graph-centric approach to improving metagenome-guided peptide and protein identification in metaproteomics. Our method exploits the de Bruijn graph structure reported by metagenome assembly algorithms to generate a comprehensive database of protein sequences encoded in the community. We tested our method using several public metaproteomic datasets with matched metagenomic and metatranscriptomic sequencing data acquired from complex microbial communities in a biological wastewater treatment plant. The results showed that many more peptides and proteins can be identified when assembly graphs were utilized, improving the characterization of the proteins expressed in the microbial communities. The additional proteins we identified contribute to the characterization of important pathways such as those involved in degradation of chemical hazards. Our tools are released as open-source software on github at https://github.com/COL-IU/Graph2Pro. PMID:27918579
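
    The de Bruijn graph structure mentioned in the abstract above can be illustrated with a minimal sketch. This is not the Graph2Pro implementation; the function name `de_bruijn`, the toy reads and the value of k are assumptions chosen for illustration. Nodes are (k-1)-mers and each k-mer contributes one edge, so a node with two successors marks a point where the assembly graph branches and more than one downstream sequence is possible.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Build a de Bruijn graph: (k-1)-mer prefix -> set of (k-1)-mer suffixes."""
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of length k along the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Hypothetical toy reads that share a prefix and then diverge.
reads = ["ATGGCA", "TGGCT"]
graph = de_bruijn(reads, k=4)

# Nodes with more than one successor are branch points in the graph.
branches = {node: nxt for node, nxt in graph.items() if len(nxt) > 1}
print(branches)
```

    In this toy case the only branch point is the node GGC, which is followed by either GCA or GCT. A graph-aware protein database would, in principle, retain both continuations rather than only the one kept in a linear contig.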

  10. Metagenomics as a Tool for Enzyme Discovery: Hydrolytic Enzymes from Marine-Related Metagenomes.

    PubMed

    Popovic, Ana; Tchigvintsev, Anatoly; Tran, Hai; Chernikova, Tatyana N; Golyshina, Olga V; Yakimov, Michail M; Golyshin, Peter N; Yakunin, Alexander F

    2015-01-01

    This chapter discusses metagenomics and its application for enzyme discovery, with a focus on hydrolytic enzymes from marine metagenomic libraries. Because less than one percent of microorganisms in the environment are culturable, metagenomics, or the collective study of community genetics, has opened up a rich pool of uncharacterized metabolic pathways, enzymes, and adaptations. This great untapped pool of genes provides the particularly exciting potential to mine for new biochemical activities or novel enzymes with activities tailored to peculiar sets of environmental conditions. Metagenomes also represent a huge reservoir of novel enzymes for applications in biocatalysis, biofuels, and bioremediation. Here we present the results of enzyme discovery for four enzyme activities of particular industrial or environmental interest: esterase/lipase, glycosyl hydrolase, protease and dehalogenase.

  11. Comprehensive benchmarking and ensemble approaches for metagenomic classifiers.

    PubMed

    McIntyre, Alexa B R; Ounit, Rachid; Afshinnekoo, Ebrahim; Prill, Robert J; Hénaff, Elizabeth; Alexander, Noah; Minot, Samuel S; Danko, David; Foox, Jonathan; Ahsanuddin, Sofia; Tighe, Scott; Hasan, Nur A; Subramanian, Poorani; Moffat, Kelly; Levy, Shawn; Lonardi, Stefano; Greenfield, Nick; Colwell, Rita R; Rosen, Gail L; Mason, Christopher E

    2017-09-21

    One of the main challenges in metagenomics is the identification of microorganisms in clinical and environmental samples. While an extensive and heterogeneous set of computational tools is available to classify microorganisms using whole-genome shotgun sequencing data, comprehensive comparisons of these methods are limited. In this study, we use the largest-to-date set of laboratory-generated and simulated controls across 846 species to evaluate the performance of 11 metagenomic classifiers. Tools were characterized on the basis of their ability to identify taxa at the genus, species, and strain levels, quantify relative abundances of taxa, and classify individual reads to the species level. Strikingly, the number of species identified by the 11 tools can differ by over three orders of magnitude on the same datasets. Various strategies can ameliorate taxonomic misclassification, including abundance filtering, ensemble approaches, and tool intersection. Nevertheless, these strategies were often insufficient to completely eliminate false positives from environmental samples, which are especially important where they concern medically relevant species. Overall, pairing tools with different classification strategies (k-mer, alignment, marker) can combine their respective advantages. This study provides positive and negative controls, titrated standards, and a guide for selecting tools for metagenomic analyses by comparing ranges of precision, accuracy, and recall. We show that proper experimental design and analysis parameters can reduce false positives, provide greater resolution of species in complex metagenomic samples, and improve the interpretation of results.
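
    The mitigation strategies named in the abstract above (abundance filtering and tool intersection) and the precision/recall comparison can be sketched as follows. This is an illustrative sketch only, not the authors' benchmarking code; the function names, the species lists, the abundance values and the 5% threshold are all hypothetical.

```python
from collections import Counter


def precision_recall(predicted, truth):
    """Precision and recall of a predicted taxon set against a known truth set."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


def abundance_filter(abundances, min_fraction):
    """Keep only taxa whose reported relative abundance meets the threshold."""
    return {taxon for taxon, frac in abundances.items() if frac >= min_fraction}


def vote_ensemble(calls_per_tool, min_tools):
    """Keep taxa reported by at least min_tools tools (all tools = intersection)."""
    votes = Counter(taxon for calls in calls_per_tool for taxon in set(calls))
    return {taxon for taxon, n in votes.items() if n >= min_tools}


# Hypothetical mock community and hypothetical output of two classifiers.
truth = {"E. coli", "S. aureus", "B. subtilis"}
tool_a = {"E. coli": 0.50, "S. aureus": 0.30, "B. subtilis": 0.15,
          "K. pneumoniae": 0.04, "P. putida": 0.01}
tool_b = {"E. coli": 0.55, "S. aureus": 0.35, "S. enterica": 0.10}

raw_a = precision_recall(tool_a, truth)
filtered_a = precision_recall(abundance_filter(tool_a, 0.05), truth)
intersection = vote_ensemble([tool_a, tool_b], min_tools=2)
ensemble = precision_recall(intersection, truth)
print(raw_a, filtered_a, ensemble)
```

    In this toy data, filtering removes the two low-abundance false positives of the first tool, while the intersection removes all false positives at the cost of missing one true species. This mirrors the trade-off between precision and recall that the abstract describes.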

  12. A metagenomic viral discovery approach identifies potential zoonotic and novel mammalian viruses in Neoromicia bats within South Africa

    PubMed Central

    Geldenhuys, Marike; Mortlock, Marinda; Weyer, Jacqueline; Bezuidt, Oliver; Seamark, Ernest C. J.; Kearney, Teresa; Gleasner, Cheryl; Erkkila, Tracy H.; Cui, Helen; Markotter, Wanda

    2018-01-01

    Species within the Neoromicia bat genus are abundant and widely distributed in Africa. It is common for these insectivorous bats to roost in anthropogenic structures in urban regions. Additionally, Neoromicia capensis have previously been identified as potential hosts for Middle East respiratory syndrome (MERS)-related coronaviruses. This study aimed to ascertain the gastrointestinal virome of these bats, as viruses excreted in fecal material or which may be replicating in rectal or intestinal tissues have the greatest opportunities of coming into contact with other hosts. Samples were collected in five regions of South Africa over eight years. Initial virome composition was determined by viral metagenomic sequencing by pooling samples and enriching for viral particles. Libraries were sequenced on the Illumina MiSeq and NextSeq500 platforms, producing a combined 37 million reads. Bioinformatics analysis of the high throughput sequencing data detected the full genome of a novel species of the Circoviridae family, and also identified sequence data from the Adenoviridae, Coronaviridae, Herpesviridae, Parvoviridae, Papillomaviridae, Phenuiviridae, and Picornaviridae families. Metagenomic sequencing data was insufficient to determine the viral diversity of certain families due to the fragmented coverage of genomes and lack of suitable sequencing depth, as some viruses were detected from the analysis of reads-data only. Follow up conventional PCR assays targeting conserved gene regions for the Adenoviridae, Coronaviridae, and Herpesviridae families were used to confirm metagenomic data and generate additional sequences to determine genetic diversity. The complete coding genome of a MERS-related coronavirus was recovered with additional amplicon sequencing on the MiSeq platform. The new genome shared 97.2% overall nucleotide identity to a previous Neoromicia-associated MERS-related virus, also from South Africa. Conventional PCR analysis detected diverse adenovirus and

  13. A hybrid approach identifies metabolic signatures of high-producers for chinese hamster ovary clone selection and process optimization.

    PubMed

    Popp, Oliver; Müller, Dirk; Didzus, Katharina; Paul, Wolfgang; Lipsmeier, Florian; Kirchner, Florian; Niklas, Jens; Mauch, Klaus; Beaucamp, Nicola

    2016-09-01

    In-depth characterization of high-producer cell lines and bioprocesses is vital to ensure robust and consistent production of recombinant therapeutic proteins in high quantity and quality for clinical applications. This requires applying appropriate methods during bioprocess development to enable meaningful characterization of CHO clones and processes. Here, we present a novel hybrid approach for supporting comprehensive characterization of metabolic clone performance. The approach combines metabolite profiling with multivariate data analysis and fluxomics to enable a data-driven mechanistic analysis of key metabolic traits associated with desired cell phenotypes. We applied the methodology to quantify and compare metabolic performance in a set of 10 recombinant CHO-K1 producer clones and a host cell line. The comprehensive characterization enabled us to derive an extended set of clone performance criteria that not only captured growth and product formation, but also incorporated information on intracellular clone physiology and on metabolic changes during the process. These criteria served to establish a quantitative clone ranking and allowed us to identify metabolic differences between high-producing CHO-K1 clones yielding comparably high product titers. Through multivariate data analysis of the combined metabolite and flux data we uncovered common metabolic traits characteristic of high-producer clones in the screening setup. This included high intracellular rates of glutamine synthesis, low cysteine uptake, reduced excretion of aspartate and glutamate, and low intracellular degradation rates of branched-chain amino acids and of histidine. Finally, the above approach was integrated into a workflow that enables standardized high-content selection of CHO producer clones in a high-throughput fashion. In conclusion, the combination of quantitative metabolite profiling, multivariate data analysis, and mechanistic network model simulations can identify metabolic

  14. Discovery of a new polyhydroxyalkanoate synthase from limestone soil through metagenomic approach.

    PubMed

    Tai, Yen Teng; Foong, Choon Pin; Najimudin, Nazalan; Sudesh, Kumar

    2016-04-01

    PHA synthase (PhaC) is the key enzyme in the production of biodegradable plastics known as polyhydroxyalkanoate (PHA). Nevertheless, most of these enzymes are isolated from cultivable bacteria using traditional isolation methods. Most of the microorganisms found in nature cannot be successfully cultivated due to the lack of knowledge of their growth conditions. In this study, a culture-independent approach was applied. The presence of phaC genes in limestone soil was screened using primers targeting the class I and II PHA synthases. Based on the partial gene sequences, a total of 19 gene clusters were identified and 7 clones were selected for full-length amplification through genome walking. The complete phaC gene sequence of one of the clones (SC8) was obtained, and it revealed 81% nucleotide identity to the PHA synthase gene of Chromobacterium violaceum ATCC 12472. This gene, obtained from an uncultured bacterium, was successfully cloned and expressed in a Cupriavidus necator PHB(-)4 PHA-negative mutant, resulting in the accumulation of a significant amount of PHA. The PHA synthase activity of this transformant was 64 ± 12 U/g proteins. This paper presents a pioneering study on the discovery of phaC in a limestone area using a metagenomic approach. Through this study, a new functional phaC was discovered from an uncultured bacterium. Phylogenetic classification of all the phaCs isolated in this study revealed that the limestone hill harbors a great diversity of PhaCs with activities that have not yet been investigated. Copyright © 2015 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  15. Automated and Accurate Estimation of Gene Family Abundance from Shotgun Metagenomes

    PubMed Central

    Nayfach, Stephen; Bradley, Patrick H.; Wyman, Stacia K.; Laurent, Timothy J.; Williams, Alex; Eisen, Jonathan A.; Pollard, Katherine S.; Sharpton, Thomas J.

    2015-01-01

    Shotgun metagenomic DNA sequencing is a widely applicable tool for characterizing the functions that are encoded by microbial communities. Several bioinformatic tools can be used to functionally annotate metagenomes, allowing researchers to draw inferences about the functional potential of the community and to identify putative functional biomarkers. However, little is known about how decisions made during annotation affect the reliability of the results. Here, we use statistical simulations to rigorously assess how to optimize annotation accuracy and speed, given parameters of the input data like read length and library size. We identify best practices in metagenome annotation and use them to guide the development of the Shotgun Metagenome Annotation Pipeline (ShotMAP). ShotMAP is an analytically flexible, end-to-end annotation pipeline that can be implemented either on a local computer or a cloud compute cluster. We use ShotMAP to assess how different annotation databases impact the interpretation of how marine metagenome and metatranscriptome functional capacity changes across seasons. We also apply ShotMAP to data obtained from a clinical microbiome investigation of inflammatory bowel disease. This analysis finds that gut microbiota collected from Crohn’s disease patients are functionally distinct from gut microbiota collected from either ulcerative colitis patients or healthy controls, with differential abundance of metabolic pathways related to host-microbiome interactions that may serve as putative biomarkers of disease. PMID:26565399

  16. Microbial Metagenomics: Beyond the Genome

    NASA Astrophysics Data System (ADS)

    Gilbert, Jack A.; Dupont, Christopher L.

    2011-01-01

    Metagenomics literally means “beyond the genome.” Marine microbial metagenomic databases presently comprise ~400 billion base pairs of DNA, only ~3% of that found in 1 ml of seawater. Very soon a trillion-base-pair sequence run will be feasible, so it is time to reflect on what we have learned from metagenomics. We review the impact of metagenomics on our understanding of marine microbial communities. We consider the studies facilitated by data generated through the Global Ocean Sampling expedition, as well as the revolution wrought at the individual laboratory level through next generation sequencing technologies. We review recent studies and discoveries since 2008, provide a discussion of bioinformatic analyses, including conceptual pipelines and sequence annotation, and predict the future of metagenomics, with suggestions of collaborative community studies tailored toward answering some of the fundamental questions in marine microbial ecology.

  17. An integrated metagenome and -proteome analysis of the microbial community residing in a biogas production plant.

    PubMed

    Ortseifen, Vera; Stolze, Yvonne; Maus, Irena; Sczyrba, Alexander; Bremges, Andreas; Albaum, Stefan P; Jaenicke, Sebastian; Fracowiak, Jochen; Pühler, Alfred; Schlüter, Andreas

    2016-08-10

    To study the metaproteome of a biogas-producing microbial community, fermentation samples were taken from an agricultural biogas plant for microbial cell and protein extraction and corresponding metagenome analyses. Based on metagenome sequence data, taxonomic community profiling was performed to elucidate the composition of bacterial and archaeal sub-communities. The community's cytosolic metaproteome was represented in a 2D-PAGE approach. Metaproteome databases for protein identification were compiled based on the assembled metagenome sequence dataset for the biogas plant analyzed and non-corresponding biogas metagenomes. Protein identification results revealed that the corresponding biogas protein database facilitated the highest identification rate followed by other biogas-specific databases, whereas common public databases yielded insufficient identification rates. Proteins of the biogas microbiome identified as highly abundant were assigned to the pathways involved in methanogenesis, transport and carbon metabolism. Moreover, the integrated metagenome/-proteome approach enabled the examination of genetic-context information for genes encoding identified proteins by studying neighboring genes on the corresponding contig. Exemplarily, this approach led to the identification of a Methanoculleus sp. contig encoding 16 methanogenesis-related gene products, three of which were also detected as abundant proteins within the community's metaproteome. Thus, metagenome contigs provide additional information on the genetic environment of identified abundant proteins. Copyright © 2016 Elsevier B.V. All rights reserved.

  18. Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes.

    PubMed

    Nielsen, H Bjørn; Almeida, Mathieu; Juncker, Agnieszka Sierakowska; Rasmussen, Simon; Li, Junhua; Sunagawa, Shinichi; Plichta, Damian R; Gautier, Laurent; Pedersen, Anders G; Le Chatelier, Emmanuelle; Pelletier, Eric; Bonde, Ida; Nielsen, Trine; Manichanh, Chaysavanh; Arumugam, Manimozhiyan; Batto, Jean-Michel; Quintanilha Dos Santos, Marcelo B; Blom, Nikolaj; Borruel, Natalia; Burgdorf, Kristoffer S; Boumezbeur, Fouad; Casellas, Francesc; Doré, Joël; Dworzynski, Piotr; Guarner, Francisco; Hansen, Torben; Hildebrand, Falk; Kaas, Rolf S; Kennedy, Sean; Kristiansen, Karsten; Kultima, Jens Roat; Léonard, Pierre; Levenez, Florence; Lund, Ole; Moumen, Bouziane; Le Paslier, Denis; Pons, Nicolas; Pedersen, Oluf; Prifti, Edi; Qin, Junjie; Raes, Jeroen; Sørensen, Søren; Tap, Julien; Tims, Sebastian; Ussery, David W; Yamada, Takuji; Renault, Pierre; Sicheritz-Ponten, Thomas; Bork, Peer; Wang, Jun; Brunak, Søren; Ehrlich, S Dusko

    2014-08-01

    Most current approaches for analyzing metagenomic data rely on comparisons to reference genomes, but the microbial diversity of many environments extends far beyond what is covered by reference databases. De novo segregation of complex metagenomic data into specific biological entities, such as particular bacterial strains or viruses, remains a largely unsolved problem. Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences. We demonstrate the method on data from 396 human gut microbiome samples and identify 7,381 co-abundance gene groups (CAGs), including 741 metagenomic species (MGS). We use these to assemble 238 high-quality microbial genomes and identify affiliations between MGS and hundreds of viruses or genetic entities. Our method provides the means for comprehensive profiling of the diversity within complex metagenomic samples.
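
    The idea of binning co-abundant genes across samples, as described in the abstract above, can be sketched in a few lines. This is a simplified greedy illustration and not the authors' clustering procedure; the function names, the Pearson correlation cutoff of 0.9 and the abundance profiles are assumptions. Genes whose abundance rises and falls together across samples are grouped, on the premise that they sit on the same genome or genetic element.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-abundance signal
    return cov / sqrt(var_x * var_y)


def bin_coabundant(profiles, min_r=0.9):
    """Greedily assign each gene to the first bin whose seed gene it correlates with."""
    bins = []  # list of (seed gene, member genes)
    for gene, profile in profiles.items():
        for seed, members in bins:
            if pearson(profiles[seed], profile) >= min_r:
                members.append(gene)
                break
        else:
            bins.append((gene, [gene]))  # no match: this gene seeds a new bin
    return [members for _, members in bins]


# Hypothetical gene abundances across four samples.
profiles = {
    "geneA": [10, 20, 30, 40],
    "geneB": [11, 19, 31, 41],
    "geneC": [40, 30, 20, 10],
    "geneD": [39, 31, 19, 11],
}
bins = bin_coabundant(profiles)
print(bins)
```

    With these toy profiles, geneA and geneB fall into one bin and geneC and geneD into another, because the two pairs vary in opposite directions across the samples. No reference genome is consulted at any point, which is the property the abstract emphasizes.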

  19. Expanding the Repertoire of Carbapenem-Hydrolyzing Metallo-ß-Lactamases by Functional Metagenomic Analysis of Soil Microbiota

    PubMed Central

    Gudeta, Dereje D.; Bortolaia, Valeria; Pollini, Simona; Docquier, Jean-Denis; Rossolini, Gian M.; Amos, Gregory C. A.; Wellington, Elizabeth M. H.; Guardabassi, Luca

    2016-01-01

    Carbapenemases are bacterial enzymes that hydrolyze carbapenems, a group of last-resort β-lactam antibiotics used for treatment of severe bacterial infections. They belong to three β-lactamase classes based on amino acid sequence (A, B, and D). The aim of this study was to elucidate occurrence, diversity and functionality of carbapenemase-encoding genes in soil microbiota by functional metagenomics. Ten plasmid libraries were generated by cloning metagenomic DNA from agricultural (n = 6) and grassland (n = 4) soil into Escherichia coli. The libraries were cultured on amoxicillin-containing agar and up to 100 colonies per library were screened for carbapenemase production by CarbaNP test. Presumptive carbapenemases were characterized with regard to DNA sequence, minimum inhibitory concentration (MIC) of β-lactams, and imipenem hydrolysis. Nine distinct class B carbapenemases, also known as metallo-beta-lactamases (MBLs), were identified in six soil samples, including two subclass B1 (GRD23-1 and SPN79-1) and seven subclass B3 (CRD3-1, PEDO-1, GRD33-1, ESP-2, ALG6-1, ALG11-1, and DHT2-1). Except PEDO-1 and ESP-2, these enzymes were distantly related to any previously described MBLs (33 to 59% identity). RAIphy analysis indicated that six enzymes (CRD3-1, GRD23-1, DHT2-1, SPN79-1, ALG6-1, and ALG11-1) originated from Proteobacteria, two (PEDO-1 and ESP-2) from Bacteroidetes and one (GRD33-1) from Gemmatimonadetes. All MBLs detected in soil microbiota were functional when expressed in E. coli, resulting in detectable imipenem-hydrolyzing activity and significantly increased MICs of clinically relevant β-lactams. Interestingly, the MBLs yielded by functional metagenomics generally differed from those detected in the same soil samples by antibiotic selective culture, showing that the two approaches targeted different subpopulations in soil microbiota. PMID:28082950

  20. Expanding the Repertoire of Carbapenem-Hydrolyzing Metallo-ß-Lactamases by Functional Metagenomic Analysis of Soil Microbiota.

    PubMed

    Gudeta, Dereje D; Bortolaia, Valeria; Pollini, Simona; Docquier, Jean-Denis; Rossolini, Gian M; Amos, Gregory C A; Wellington, Elizabeth M H; Guardabassi, Luca

    2016-01-01

    Carbapenemases are bacterial enzymes that hydrolyze carbapenems, a group of last-resort β-lactam antibiotics used for treatment of severe bacterial infections. They belong to three β-lactamase classes based on amino acid sequence (A, B, and D). The aim of this study was to elucidate occurrence, diversity and functionality of carbapenemase-encoding genes in soil microbiota by functional metagenomics. Ten plasmid libraries were generated by cloning metagenomic DNA from agricultural (n = 6) and grassland (n = 4) soil into Escherichia coli. The libraries were cultured on amoxicillin-containing agar and up to 100 colonies per library were screened for carbapenemase production by CarbaNP test. Presumptive carbapenemases were characterized with regard to DNA sequence, minimum inhibitory concentration (MIC) of β-lactams, and imipenem hydrolysis. Nine distinct class B carbapenemases, also known as metallo-beta-lactamases (MBLs), were identified in six soil samples, including two subclass B1 (GRD23-1 and SPN79-1) and seven subclass B3 (CRD3-1, PEDO-1, GRD33-1, ESP-2, ALG6-1, ALG11-1, and DHT2-1). Except PEDO-1 and ESP-2, these enzymes were distantly related to any previously described MBLs (33 to 59% identity). RAIphy analysis indicated that six enzymes (CRD3-1, GRD23-1, DHT2-1, SPN79-1, ALG6-1, and ALG11-1) originated from Proteobacteria, two (PEDO-1 and ESP-2) from Bacteroidetes and one (GRD33-1) from Gemmatimonadetes. All MBLs detected in soil microbiota were functional when expressed in E. coli, resulting in detectable imipenem-hydrolyzing activity and significantly increased MICs of clinically relevant β-lactams. Interestingly, the MBLs yielded by functional metagenomics generally differed from those detected in the same soil samples by antibiotic selective culture, showing that the two approaches targeted different subpopulations in soil microbiota.

  1. Bioproduction and characterization of extracellular melanin-like pigment from industrially polluted metagenomic library equipped Escherichia coli.

    PubMed

    Amin, Shivani; Rastogi, Rajesh P; Sonani, Ravi R; Ray, Arabinda; Sharma, Rakesh; Madamwar, Datta

    2018-04-15

    To explore the potential genes from the industrially polluted Amlakhadi canal, located in Ankleshwar, Gujarat, India, its community genome was extracted and cloned into E. coli EPI300™-T1 R using a fosmid vector (pCC2 FOS™), generating a library of 3,92,000 clones with an average DNA insert size of 40 kb. From this library, the clone DM1, which produces a brown melanin-like pigment, was isolated and characterized. To overexpress the pigment, the clone DM1 was further sub-cloned. A sub-clone containing 10 kb of the insert was sequenced for gene identification. The amino acid sequence of the protein 4-hydroxyphenylpyruvate dioxygenase (HPPD), which is known to be involved in melanin biosynthesis, was obtained from the gene sequence. A sequence-homology-based 3D structure model of HPPD was constructed and analyzed. The physico-chemical nature of the pigment was further analysed using 1H and 13C NMR, LC-MS, FTIR and UV-visible spectroscopy. The pigment was readily soluble in DMSO with an absorption maximum around 290 nm. Based on the genetic and chemical characterization, the compound was confirmed as a melanin-like pigment. The present results indicate that the metagenomic library from an industrially polluted environment generated a microbial tool for the production of melanin-like pigment. Copyright © 2018 Elsevier B.V. All rights reserved.

  2. Metagenomic assembly through the lens of validation: recent advances in assessing and improving the quality of genomes assembled from metagenomes.

    PubMed

    Olson, Nathan D; Treangen, Todd J; Hill, Christopher M; Cepeda-Espinoza, Victoria; Ghurye, Jay; Koren, Sergey; Pop, Mihai

    2017-08-07

    Metagenomic samples are snapshots of complex ecosystems at work. They comprise hundreds of known and unknown species, contain multiple strain variants and vary greatly within and across environments. Many microbes found in microbial communities are not easily grown in culture making their DNA sequence our only clue into their evolutionary history and biological function. Metagenomic assembly is a computational process aimed at reconstructing genes and genomes from metagenomic mixtures. Current methods have made significant strides in reconstructing DNA segments comprising operons, tandem gene arrays and syntenic blocks. Shorter, higher-throughput sequencing technologies have become the de facto standard in the field. Sequencers are now able to generate billions of short reads in only a few days. Multiple metagenomic assembly strategies, pipelines and assemblers have appeared in recent years. Owing to the inherent complexity of metagenome assembly, regardless of the assembly algorithm and sequencing method, metagenome assemblies contain errors. Recent developments in assembly validation tools have played a pivotal role in improving metagenomics assemblers. Here, we survey recent progress in the field of metagenomic assembly, provide an overview of key approaches for genomic and metagenomic assembly validation and demonstrate the insights that can be derived from assemblies through the use of assembly validation strategies. We also discuss the potential for impact of long-read technologies in metagenomics. We conclude with a discussion of future challenges and opportunities in the field of metagenomic assembly and validation. © The Author 2017. Published by Oxford University Press.

  3. Critical Assessment of Metagenome Interpretation – a benchmark of computational metagenomics software

    PubMed Central

    Sczyrba, Alexander; Hofmann, Peter; Belmann, Peter; Koslicki, David; Janssen, Stefan; Dröge, Johannes; Gregor, Ivan; Majda, Stephan; Fiedler, Jessika; Dahms, Eik; Bremges, Andreas; Fritz, Adrian; Garrido-Oter, Ruben; Jørgensen, Tue Sparholt; Shapiro, Nicole; Blood, Philip D.; Gurevich, Alexey; Bai, Yang; Turaev, Dmitrij; DeMaere, Matthew Z.; Chikhi, Rayan; Nagarajan, Niranjan; Quince, Christopher; Meyer, Fernando; Balvočiūtė, Monika; Hansen, Lars Hestbjerg; Sørensen, Søren J.; Chia, Burton K. H.; Denis, Bertrand; Froula, Jeff L.; Wang, Zhong; Egan, Robert; Kang, Dongwan Don; Cook, Jeffrey J.; Deltel, Charles; Beckstette, Michael; Lemaitre, Claire; Peterlongo, Pierre; Rizk, Guillaume; Lavenier, Dominique; Wu, Yu-Wei; Singer, Steven W.; Jain, Chirag; Strous, Marc; Klingenberg, Heiner; Meinicke, Peter; Barton, Michael; Lingner, Thomas; Lin, Hsin-Hung; Liao, Yu-Chieh; Silva, Genivaldo Gueiros Z.; Cuevas, Daniel A.; Edwards, Robert A.; Saha, Surya; Piro, Vitor C.; Renard, Bernhard Y.; Pop, Mihai; Klenk, Hans-Peter; Göker, Markus; Kyrpides, Nikos C.; Woyke, Tanja; Vorholt, Julia A.; Schulze-Lefert, Paul; Rubin, Edward M.; Darling, Aaron E.; Rattei, Thomas; McHardy, Alice C.

    2018-01-01

    In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions. PMID:28967888

  4. Development and biotechnological application of a novel endoxylanase family GH10 identified from sugarcane soil metagenome.

    PubMed

    Alvarez, Thabata M; Goldbeck, Rosana; dos Santos, Camila Ramos; Paixão, Douglas A A; Gonçalves, Thiago A; Franco Cairo, João Paulo L; Almeida, Rodrigo Ferreira; de Oliveira Pereira, Isabela; Jackson, George; Cota, Junio; Büchli, Fernanda; Citadini, Ana Paula; Ruller, Roberto; Polo, Carla Cristina; de Oliveira Neto, Mario; Murakami, Mário T; Squina, Fabio M

    2013-01-01

    Metagenomics has been widely employed for discovery of new enzymes and pathways for the conversion of lignocellulosic biomass to fuels and chemicals. In this context, the present study reports the isolation, recombinant expression, biochemical and structural characterization of a novel endoxylanase family GH10 (SCXyl) identified from sugarcane soil metagenome. The recombinant SCXyl was highly active against xylan from beechwood and showed optimal enzyme activity at pH 6.0 and 45°C. The crystal structure was solved at 2.75 Å resolution, revealing the classical (β/α)8-barrel fold with a conserved active-site pocket and an inherent flexibility of the Trp281-Arg291 loop that can adopt distinct conformational states depending on substrate binding. The capillary electrophoresis analysis of degradation products evidenced that the enzyme displays an unusual capacity to degrade small xylooligosaccharides, such as xylotriose, which is consistent with the hydrophobic contacts at the +1 subsite and low-binding energies of subsites that are distant from the site of hydrolysis. The main reaction products from xylan polymers and phosphoric acid-pretreated sugarcane bagasse (PASB) were xylooligosaccharides, but, after a longer incubation time, xylobiose and xylose were also formed. Moreover, the use of SCXyl as a pre-treatment step of PASB, prior to the addition of a commercial cellulolytic cocktail, significantly enhanced the saccharification process. All these characteristics demonstrate the advantageous application of this enzyme in several biotechnological processes in food and feed industry and also in the enzymatic pretreatment of biomass for feedstock and ethanol production.

  5. Loeffler 4.0: Diagnostic Metagenomics.

    PubMed

    Höper, Dirk; Wylezich, Claudia; Beer, Martin

    2017-01-01

    A new world of possibilities for "virus discovery" was opened up with high-throughput sequencing becoming available in the last decade. While scientifically metagenomic analysis was established before the start of the era of high-throughput sequencing, the availability of the first second-generation sequencers was the kick-off for diagnosticians to use sequencing for the detection of novel pathogens. Today, diagnostic metagenomics is becoming the standard procedure for the detection and genetic characterization of new viruses or novel virus variants. Here, we provide an overview about technical considerations of high-throughput sequencing-based diagnostic metagenomics together with selected examples of "virus discovery" for animal diseases or zoonoses and metagenomics for food safety or basic veterinary research. © 2017 Elsevier Inc. All rights reserved.

  6. Genetic variability of psychrotolerant Acidithiobacillus ferrivorans revealed by (meta)genomic analysis.

    PubMed

    González, Carolina; Yanquepe, María; Cardenas, Juan Pablo; Valdes, Jorge; Quatrini, Raquel; Holmes, David S; Dopson, Mark

    2014-11-01

    Acidophilic microorganisms inhabit low pH environments such as acid mine drainage that is generated when sulfide minerals are exposed to air. The genome sequence of the psychrotolerant Acidithiobacillus ferrivorans SS3 was compared to a metagenome from a low temperature acidic stream dominated by an A. ferrivorans-like strain. Stretches of genomic DNA characterized by few matches to the metagenome, termed 'metagenomic islands', encoded genes associated with metal efflux and pH homeostasis. The metagenomic islands were enriched in mobile elements such as phage proteins, transposases, integrases and in one case, predicted to be flanked by truncated tRNAs. Cus gene clusters predicted to be involved in copper efflux and further Cus-like RND systems were predicted to be located in metagenomic islands and therefore, constitute part of the flexible gene complement of the species. Phylogenetic analysis of Cus clusters showed both lineage specificity within the Acidithiobacillus genus as well as niche specificity associated with an acidic environment. The metagenomic islands also contained a predicted copper efflux P-type ATPase system and a polyphosphate kinase potentially involved in polyphosphate mediated copper resistance. This study identifies genetic variability of low temperature acidophiles that likely reflects metal resistance selective pressures in the copper rich environment. Copyright © 2014 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  7. Metagenomic Assembly: Overview, Challenges and Applications

    PubMed Central

    Ghurye, Jay S.; Cepeda-Espinoza, Victoria; Pop, Mihai

    2016-01-01

    Advances in sequencing technologies have led to the increased use of high throughput sequencing in characterizing the microbial communities associated with our bodies and our environment. Critical to the analysis of the resulting data are sequence assembly algorithms able to reconstruct genes and organisms from complex mixtures. Metagenomic assembly involves new computational challenges due to the specific characteristics of the metagenomic data. In this survey, we focus on major algorithmic approaches for genome and metagenome assembly, and discuss the new challenges and opportunities afforded by this new field. We also review several applications of metagenome assembly in addressing interesting biological problems. PMID:27698619

  8. An Agile Functional Analysis of Metagenomic Data Using SUPER-FOCUS.

    PubMed

    Silva, Genivaldo Gueiros Z; Lopes, Fabyano A C; Edwards, Robert A

    2017-01-01

    One of the main goals in metagenomics is to identify the functional profile of a microbial community from unannotated shotgun sequencing reads. Functional annotation is important in biological research because it enables researchers to identify the abundance of functional genes of the organisms present in the sample, answering the question, "What can the organisms in the sample do?" Most currently available approaches do not scale with increasing data volumes, which is important because both the number and lengths of the reads provided by sequencing platforms keep increasing. Here, we present SUPER-FOCUS, SUbsystems Profile by databasE Reduction using FOCUS, an agile homology-based approach using a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. SUPER-FOCUS was tested with real metagenomes, and the results show that it accurately predicts the subsystems present in the profiled microbial communities, is computationally efficient, and up to 1000 times faster than other tools. SUPER-FOCUS is freely available at http://edwards.sdsu.edu/SUPERFOCUS.

  9. Production and characterization of a novel antifungal chitinase identified by functional screening of a suppressive-soil metagenome.

    PubMed

    Berini, Francesca; Presti, Ilaria; Beltrametti, Fabrizio; Pedroli, Marco; Vårum, Kjell M; Pollegioni, Loredano; Sjöling, Sara; Marinelli, Flavia

    2017-01-31

    Through functional screening of a fosmid library, generated from a phytopathogen-suppressive soil metagenome, the novel antifungal chitinase Chi18H8, which belongs to family 18 glycosyl hydrolases, was previously discovered. The initial extremely low yield of Chi18H8 recombinant production and purification from Escherichia coli cells (21 μg/g cell) limited its characterization, thus preventing further investigation on its biotechnological potential. We report on how we succeeded in producing hundreds of milligrams of pure and biologically active Chi18H8 by developing and scaling up to a high-yielding, 30 L bioreactor process, based on a novel method of mild solubilization of E. coli inclusion bodies in lactic acid aqueous solution, coupled with a single step purification by hydrophobic interaction chromatography. Chi18H8 was characterized as a Ca2+-dependent mesophilic chitobiosidase, active on chitin substrates at acidic pHs and possessing interesting features, such as solvent tolerance, long-term stability in acidic environment and antifungal activity against the phytopathogens Fusarium graminearum and Rhizoctonia solani. Additionally, Chi18H8 was found to operate according to a non-processive endomode of action on a water-soluble chitin-like substrate. Expression screening of a metagenomic library may allow access to the functional diversity of uncultivable microbiota and to the discovery of novel enzymes useful for biotechnological applications. A persisting bottleneck, however, is the lack of methods for large scale production of metagenome-sourced enzymes from genes of unknown origin in the commonly used microbial hosts. To our knowledge, this is the first report on a novel metagenome-sourced enzyme produced in hundreds-of-milligram amount by recovering the protein in the biologically active form from recombinant E. coli inclusion bodies.

  10. Optimizing and evaluating the reconstruction of Metagenome-assembled microbial genomes.

    PubMed

    Papudeshi, Bhavya; Haggerty, J Matthew; Doane, Michael; Morris, Megan M; Walsh, Kevin; Beattie, Douglas T; Pande, Dnyanada; Zaeri, Parisa; Silva, Genivaldo G Z; Thompson, Fabiano; Edwards, Robert A; Dinsdale, Elizabeth A

    2017-11-28

    .49), low species richness (4.91 ± 0.66), and higher genome completeness (40.92 ± 1.75) across all projects. MetaBat extracted 115 bins from the 4 projects of which 66 bins were identified as reconstructed metagenome-assembled genomes with sequences belonging to a specific genus. We identified 13 novel genomes, some of which were 100% complete, but show low similarity to genomes within databases. In conclusion, we present a set of biologically relevant parameters for evaluation to select for optimal assembly and binning tools. For the tools we tested, SPAdes assembler and MetaBat binning tools reconstructed quality metagenome-assembled genomes for the four projects. We also conclude that metagenomes from microbial communities that have high coverage of phylogenetically distinct, and low taxonomic diversity results in highest quality metagenome-assembled genomes.

  11. Sequence-based screening for self-sufficient P450 monooxygenase from a metagenome library.

    PubMed

    Kim, B S; Kim, S Y; Park, J; Park, W; Hwang, K Y; Yoon, Y J; Oh, W K; Kim, B Y; Ahn, J S

    2007-05-01

    Cytochrome P450 monooxygenases (CYPs) are useful catalysts for oxidation reactions. Self-sufficient CYPs harbour a reductive domain covalently connected to a P450 domain and are known for their robust catalytic activity with great potential as biocatalysts. In an effort to expand genetic sources of self-sufficient CYPs, we devised a sequence-based screening system to identify them in a soil metagenome. We constructed a soil metagenome library and performed sequence-based screening for self-sufficient CYP genes. A new CYP gene, syk181, was identified from the metagenome library. Phylogenetic analysis revealed that SYK181 formed a distinct phylogenic line with 46% amino-acid-sequence identity to CYP102A1 which has been extensively studied as a fatty acid hydroxylase. The heterologously expressed SYK181 showed significant hydroxylase activity towards naphthalene and phenanthrene as well as towards fatty acids. Sequence-based screening of metagenome libraries is expected to be a useful approach for searching self-sufficient CYP genes. The translated product of syk181 shows self-sufficient hydroxylase activity towards fatty acids and aromatic compounds. SYK181 is the first self-sufficient CYP obtained directly from a metagenome library. The genetic and biochemical information on SYK181 are expected to be helpful for engineering self-sufficient CYPs with broader catalytic activities towards various substrates, which would be useful for bioconversion of natural products and biodegradation of organic chemicals.

  12. Metagenomic Taxonomy-Guided Database-Searching Strategy for Improving Metaproteomic Analysis.

    PubMed

    Xiao, Jinqiu; Tanca, Alessandro; Jia, Ben; Yang, Runqing; Wang, Bo; Zhang, Yu; Li, Jing

    2018-04-06

    Metaproteomics provides a direct measure of the functional information by investigating all proteins expressed by a microbiota. However, due to the complexity and heterogeneity of microbial communities, it is very hard to construct a sequence database suitable for a metaproteomic study. Using a public database, researchers might not be able to identify proteins from poorly characterized microbial species, while a sequencing-based metagenomic database may not provide adequate coverage for all potentially expressed protein sequences. To address this challenge, we propose a metagenomic taxonomy-guided database-search strategy (MT), in which a merged database is employed, consisting of both taxonomy-guided reference protein sequences from public databases and proteins from metagenome assembly. By applying our MT strategy to a mock microbial mixture, about two times as many peptides were detected as with the metagenomic database only. According to the evaluation of the reliability of taxonomic attribution, the rate of misassignments was comparable to that obtained using an a priori matched database. We also evaluated the MT strategy with a human gut microbial sample, and we found 1.7 times as many peptides as using a standard metagenomic database. In conclusion, our MT strategy allows the construction of databases able to provide high sensitivity and precision in peptide identification in metaproteomic studies, enabling the detection of proteins from poorly characterized species within the microbiota.
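
    The merged-database idea in the abstract above can be illustrated with a minimal sketch. It assumes both inputs are already available as FASTA text and that exact duplicate sequences should be kept only once; the deduplication rule and the function names parse_fasta and merge_databases are illustrative assumptions, not the authors' published pipeline, and the taxonomy-guided selection of reference proteins is not shown.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def merge_databases(reference_records, metagenome_records):
    """Concatenate two protein sets, keeping the first copy of each sequence.

    Reference entries come first, so a protein present in both sources keeps
    its reference header (an assumption made for this sketch).
    """
    merged, seen = [], set()
    for header, seq in list(reference_records) + list(metagenome_records):
        if seq in seen:
            continue
        seen.add(seq)
        merged.append((header, seq))
    return merged


if __name__ == "__main__":
    reference = parse_fasta(">ref_protein_1\nMKV\nLA\n>ref_protein_2\nMTT\n")
    assembled = parse_fasta(">contig_orf_1\nMKVLA\n>contig_orf_2\nMQQ\n")
    for header, seq in merge_databases(reference, assembled):
        print(f">{header}\n{seq}")
```

    Putting reference entries first is a design choice for the sketch: it keeps curated annotations when a sequence appears in both sources.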

  13. Cloning, clones and clonal disease.

    PubMed

    Luzzatto, L

    2000-01-01

    In the past, cloning has been familiar to plant breeders because many plants can be easily reproduced in this way, bypassing the lengthy process of cross-fertilisation. Recently, the concept of cloning has become popular in human biology and medicine on two accounts. First, individual genes can be cloned from the enormous complexity of the DNA that makes up the human genetic material. It is expected that, within a few years, all the estimated 100,000 human genes will be isolated by this approach. This should make it possible to identify all the genes that determine the individual characteristics of human beings, including those responsible for causing human diseases or for making people more or less susceptible to pick up diseases from the environment. Cloned genes made into pharmaceutical products are already in use for treating a variety of diseases, from hormonal deficiencies to certain types of anaemia.

  14. A user's guide to quantitative and comparative analysis of metagenomic datasets.

    PubMed

    Luo, Chengwei; Rodriguez-R, Luis M; Konstantinidis, Konstantinos T

    2013-01-01

    Metagenomics has revolutionized microbiological studies during the past decade and provided new insights into the diversity, dynamics, and metabolic potential of natural microbial communities. However, metagenomics still represents a field in development, and standardized tools and approaches to handle and compare metagenomes have not been established yet. An important reason accounting for the latter is the continuous changes in the type of sequencing data available, for example, long versus short sequencing reads. Here, we provide a guide to bioinformatic pipelines developed to accomplish the following tasks, focusing primarily on those developed by our team: (i) assemble a metagenomic dataset; (ii) determine the level of sequence coverage obtained and the amount of sequencing required to obtain complete coverage; (iii) identify the taxonomic affiliation of a metagenomic read or assembled contig; and (iv) determine differentially abundant genes, pathways, and species between different datasets. Most of these pipelines do not depend on the type of sequences available or can be easily adjusted to fit different types of sequences, and are freely available (for instance, through our lab Web site: http://www.enve-omics.gatech.edu/). The limitations of current approaches, as well as the computational aspects that can be further improved, will also be briefly discussed. The work presented here provides practical guidelines on how to perform metagenomic analysis of microbial communities characterized by varied levels of diversity and establishes approaches to handle the resulting data, independent of the sequencing platform employed. © 2013 Elsevier Inc. All rights reserved.

  15. Metagenomic analysis reveals a functional signature for biomass degradation by cecal microbiota in the leaf-eating flying squirrel (Petaurista alborufus lena)

    PubMed Central

    2012-01-01

    Background: Animals co-evolve with their gut microbiota; the latter can perform complex metabolic reactions that cannot be done independently by the host. Although the importance of gut microbiota has been well demonstrated, there is a paucity of research regarding its role in foliage-foraging mammals with a specialized digestive system. Results: In this study, a 16S rRNA gene survey and metagenomic sequencing were used to characterize genetic diversity and functional capability of cecal microbiota of the folivorous flying squirrel (Petaurista alborufus lena). Phylogenetic compositions of the cecal microbiota derived from 3 flying squirrels were dominated by Firmicutes. Based on end-sequences of fosmid clones from 1 flying squirrel, we inferred that microbial metabolism greatly contributed to intestinal functions, including degradation of carbohydrates, metabolism of proteins, and synthesis of vitamins. Moreover, 33 polysaccharide-degrading enzymes and 2 large genomic fragments containing a series of carbohydrate-associated genes were identified. Conclusions: Cecal microbiota of the leaf-eating flying squirrel have great metabolic potential for converting diverse plant materials into absorbable nutrients. The present study should serve as the basis for future investigations, using metagenomic approaches to elucidate the intricate mechanisms and interactions between host and gut microbiota of the flying squirrel digestive system, as well as other mammals with similar adaptations. PMID:22963241

  16. Metagenomic analysis reveals a functional signature for biomass degradation by cecal microbiota in the leaf-eating flying squirrel (Petaurista alborufus lena).

    PubMed

    Lu, Hsiao-Pei; Wang, Yu-bin; Huang, Shiao-Wei; Lin, Chung-Yen; Wu, Martin; Hsieh, Chih-hao; Yu, Hon-Tsen

    2012-09-10

    Animals co-evolve with their gut microbiota; the latter can perform complex metabolic reactions that cannot be done independently by the host. Although the importance of gut microbiota has been well demonstrated, there is a paucity of research regarding its role in foliage-foraging mammals with a specialized digestive system. In this study, a 16S rRNA gene survey and metagenomic sequencing were used to characterize genetic diversity and functional capability of cecal microbiota of the folivorous flying squirrel (Petaurista alborufus lena). Phylogenetic compositions of the cecal microbiota derived from 3 flying squirrels were dominated by Firmicutes. Based on end-sequences of fosmid clones from 1 flying squirrel, we inferred that microbial metabolism greatly contributed to intestinal functions, including degradation of carbohydrates, metabolism of proteins, and synthesis of vitamins. Moreover, 33 polysaccharide-degrading enzymes and 2 large genomic fragments containing a series of carbohydrate-associated genes were identified. Cecal microbiota of the leaf-eating flying squirrel have great metabolic potential for converting diverse plant materials into absorbable nutrients. The present study should serve as the basis for future investigations, using metagenomic approaches to elucidate the intricate mechanisms and interactions between host and gut microbiota of the flying squirrel digestive system, as well as other mammals with similar adaptations.

  17. Metagenomics of an Alkaline Hot Spring in Galicia (Spain): Microbial Diversity Analysis and Screening for Novel Lipolytic Enzymes.

    PubMed

    López-López, Olalla; Knapik, Kamila; Cerdán, Maria-Esperanza; González-Siso, María-Isabel

    2015-01-01

    A fosmid library was constructed with the metagenomic DNA from the water of the Lobios hot spring (76°C, pH = 8.2) located in Ourense (Spain). Metagenomic sequencing of the fosmid library allowed the assembly of 9722 contigs ranging in size from 500 to 56,677 bp and spanning ~18 Mbp. 23,207 ORFs (Open Reading Frames) were predicted from the assembly. Biodiversity was explored by taxonomic classification and it revealed that bacteria were predominant, while the archaea were less abundant. The six most abundant bacterial phyla were Deinococcus-Thermus, Proteobacteria, Firmicutes, Acidobacteria, Aquificae, and Chloroflexi. Within the archaeal superkingdom, the phylum Thaumarchaeota was predominant with the dominant species "Candidatus Caldiarchaeum subterraneum." Functional classification revealed the genes associated with one-carbon metabolism as the most abundant. Both taxonomic and functional classifications showed a mixture of different microbial metabolic patterns: aerobic and anaerobic, chemoorganotrophic and chemolithotrophic, autotrophic and heterotrophic. Remarkably, the presence of genes encoding enzymes with potential biotechnological interest, such as xylanases, galactosidases, proteases, and lipases, was also revealed in the metagenomic library. Functional screening of this library was subsequently done looking for genes encoding lipolytic enzymes. Six genes conferring lipolytic activity were identified and one was cloned and characterized. This gene was named LOB4Est and it was expressed in a yeast mesophilic host. LOB4Est codes for a novel esterase of family VIII, with sequence similarity to β-lactamases, but with unusually wide substrate specificity. When the enzyme was purified from the mesophilic host it showed a half-life of 1 h and 43 min at 50°C, and maximal activity at 40°C and pH 7.5 with p-nitrophenyl-laurate as substrate. Interestingly, the enzyme retained more than 80% of maximal activity in a broad range of pH from 6.5 to 8.

  18. Clone DB: an integrated NCBI resource for clone-associated data

    PubMed Central

    Schneider, Valerie A.; Chen, Hsiu-Chuan; Clausen, Cliff; Meric, Peter A.; Zhou, Zhigang; Bouk, Nathan; Husain, Nora; Maglott, Donna R.; Church, Deanna M.

    2013-01-01

    The National Center for Biotechnology Information (NCBI) Clone DB (http://www.ncbi.nlm.nih.gov/clone/) is an integrated resource providing information about and facilitating access to clones, which serve as valuable research reagents in many fields, including genome sequencing and variation analysis. Clone DB represents an expansion and replacement of the former NCBI Clone Registry and has records for genomic and cell-based libraries and clones representing more than 100 different eukaryotic taxa. Records provide details of library construction, associated sequences, map positions and information about resource distribution. Clone DB is indexed in the NCBI Entrez system and can be queried by fields that include organism, clone name, gene name and sequence identifier. Whenever possible, genomic clones are mapped to reference assemblies and their map positions provided in clone records. Clones mapping to specific genomic regions can also be searched for using the NCBI Clone Finder tool, which accepts queries based on sequence coordinates or features such as gene or transcript names. Clone DB makes reports of library, clone and placement data on its FTP site available for download. With Clone DB, users now have available to them a centralized resource that provides them with the tools they will need to make use of these important research reagents. PMID:23193260

  19. The single-species metagenome: subtyping Staphylococcus aureus core genome sequences from shotgun metagenomic data

    PubMed Central

    Li, Ben; Petit III, Robert A.; Qin, Zhaohui S.; Darrow, Lyndsey

    2016-01-01

    In this study we developed a genome-based method for detecting Staphylococcus aureus subtypes from metagenome shotgun sequence data. We used a binomial mixture model and the coverage counts at >100,000 known S. aureus SNP (single nucleotide polymorphism) sites derived from prior comparative genomic analysis to estimate the proportion of 40 subtypes in metagenome samples. We were able to obtain >87% sensitivity and >94% specificity at 0.025X coverage for S. aureus. We found that 321 and 149 metagenome samples from the Human Microbiome Project and metaSUB analysis of the New York City subway, respectively, contained S. aureus at genome coverage >0.025. In both projects, CC8 and CC30 were the most common S. aureus clonal complexes encountered. We found evidence that the subtype composition at different body sites of the same individual were more similar than random sampling and more limited evidence that certain body sites were enriched for particular subtypes. One surprising finding was the apparent high frequency of CC398, a lineage often associated with livestock, in samples from the tongue dorsum. Epidemiologic analysis of the HMP subject population suggested that high BMI (body mass index) and health insurance are possibly associated with S. aureus carriage but there was limited power to identify factors linked to carriage of even the most common subtype. In the NYC subway data, we found a small signal of geographic distance affecting subtype clustering but other unknown factors influence taxonomic distribution of the species around the city. PMID:27781166
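
    The abstract above describes estimating subtype proportions from coverage counts at known SNP sites with a binomial mixture model. The sketch below shows one generic way to fit such a mixture by expectation-maximisation; it is not the authors' implementation. The genotype matrix layout, the fixed sequencing error rate of 0.01, the iteration count and the function name estimate_proportions are all assumptions made for illustration.

```python
def estimate_proportions(genotypes, alt_counts, depths, error=0.01, iters=200):
    """EM estimate of subtype mixing proportions from SNP read counts.

    genotypes[j][i] is 1 if subtype j carries the alternate allele at site i.
    alt_counts[i] of the depths[i] reads covering site i show that allele.
    """
    n_sub = len(genotypes)
    n_sites = len(depths)
    weights = [1.0 / n_sub] * n_sub  # start from a uniform mixture
    for _ in range(iters):
        totals = [0.0] * n_sub
        for i in range(n_sites):
            # Probability that a read from subtype j shows the alt allele.
            p_alt = [(1 - error) if genotypes[j][i] else error
                     for j in range(n_sub)]
            alt_norm = sum(w * p for w, p in zip(weights, p_alt))
            ref_norm = sum(w * (1 - p) for w, p in zip(weights, p_alt))
            ref_count = depths[i] - alt_counts[i]
            for j in range(n_sub):
                # E-step: expected number of reads at this site from subtype j.
                totals[j] += alt_counts[i] * weights[j] * p_alt[j] / alt_norm
                totals[j] += ref_count * weights[j] * (1 - p_alt[j]) / ref_norm
        grand = sum(totals)
        # M-step: proportions are the normalised expected read counts.
        weights = [t / grand for t in totals]
    return weights


if __name__ == "__main__":
    # Toy data: two subtypes, two informative sites, 100 reads per site.
    genotypes = [[1, 0], [0, 1]]
    print(estimate_proportions(genotypes, alt_counts=[75, 25], depths=[100, 100]))
```

    Sites where all subtypes share the same allele carry no information about the proportions, so a real analysis would depend on having many discriminating sites, as the abstract's >100,000 SNP positions suggest.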

  20. Metazen - metadata capture for metagenomes.

    PubMed

    Bischof, Jared; Harrison, Travis; Paczian, Tobias; Glass, Elizabeth; Wilke, Andreas; Meyer, Folker

    2014-01-01

    As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. Unfortunately, these tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.
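    The combination of mandatory fields and free optional fields described above can be sketched as a simple validation step. The field names below are hypothetical placeholders, not Metazen's actual template, and the function is an illustration of the general pattern only.

```python
# Hypothetical mandatory fields; the real template is not given in the abstract.
MANDATORY_FIELDS = {"project_name", "sample_name", "collection_date",
                    "latitude", "longitude"}


def missing_mandatory_fields(record, mandatory=MANDATORY_FIELDS):
    """Return the sorted mandatory fields that are absent or blank in record."""
    missing = []
    for field in mandatory:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            missing.append(field)
    return sorted(missing)


# Optional, user-defined fields (here "depth_m") are accepted without checks.
record = {"project_name": "Hot spring survey", "sample_name": "HS-01",
          "collection_date": "2014-01-01", "latitude": "", "depth_m": 2.5}
problems = missing_mandatory_fields(record)
```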

  1. Global metagenomic survey reveals a new bacterial candidate phylum in geothermal springs.

    PubMed

    Eloe-Fadrosh, Emiley A; Paez-Espino, David; Jarett, Jessica; Dunfield, Peter F; Hedlund, Brian P; Dekas, Anne E; Grasby, Stephen E; Brady, Allyson L; Dong, Hailiang; Briggs, Brandon R; Li, Wen-Jun; Goudeau, Danielle; Malmstrom, Rex; Pati, Amrita; Pett-Ridge, Jennifer; Rubin, Edward M; Woyke, Tanja; Kyrpides, Nikos C; Ivanova, Natalia N

    2016-01-27

    Analysis of the increasing wealth of metagenomic data collected from diverse environments can lead to the discovery of novel branches on the tree of life. Here we analyse 5.2 Tb of metagenomic data collected globally to discover a novel bacterial phylum ('Candidatus Kryptonia') found exclusively in high-temperature pH-neutral geothermal springs. This lineage had remained hidden as a taxonomic 'blind spot' because of mismatches in the primers commonly used for ribosomal gene surveys. Genome reconstruction from metagenomic data combined with single-cell genomics results in several high-quality genomes representing four genera from the new phylum. Metabolic reconstruction indicates a heterotrophic lifestyle with conspicuous nutritional deficiencies, suggesting the need for metabolic complementarity with other microbes. Co-occurrence patterns identify a number of putative partners, including an uncultured Armatimonadetes lineage. The discovery of Kryptonia within previously studied geothermal springs underscores the importance of globally sampled metagenomic data in detection of microbial novelty, and highlights the extraordinary diversity of microbial life still awaiting discovery.

  2. Beyond Biodiversity: Fish Metagenomes

    PubMed Central

    Ardura, Alba; Planes, Serge; Garcia-Vazquez, Eva

    2011-01-01

    Biodiversity and intra-specific genetic diversity are interrelated and determine the potential of a community to survive and evolve. Both are considered together in Prokaryote communities treated as metagenomes or ensembles of functional variants beyond species limits. Many factors alter biodiversity in higher Eukaryote communities, and human exploitation can be one of the most important for some groups of plants and animals. For example, fisheries can modify both biodiversity and genetic diversity (intra specific). Intra-specific diversity can be drastically altered by overfishing. Intense fishing pressure on one stock may imply extinction of some genetic variants and subsequent loss of intra-specific diversity. The objective of this study was to apply a metagenome approach to fish communities and explore its value for rapid evaluation of biodiversity and genetic diversity at community level. Here we have applied the metagenome approach employing the Barcoding target gene COI as a model sequence in catch from four very different fish assemblages exploited by fisheries: freshwater communities from the Amazon River and northern Spanish rivers, and marine communities from the Cantabric and Mediterranean seas. Treating all sequences obtained from each regional catch as a biological unit (exploited community) we found that metagenomic diversity indices of the Amazonian catch sample here examined were lower than expected. Reduced diversity could be explained, at least partially, by overexploitation of the fish community that had been independently estimated by other methods. We propose using a metagenome approach for estimating diversity in Eukaryote communities and early evaluating genetic variation losses at multi-species level. PMID:21829636
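    The abstract does not say which diversity indices were computed, so the following is only an illustrative sketch: it assumes the Shannon index, applied to sequence labels (for example COI haplotypes or species assignments) pooled across a whole catch. The labels are invented.

```python
import math
from collections import Counter


def shannon_index(labels):
    """Shannon diversity H' = -sum(p_i * ln p_i) over the label frequencies."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Hypothetical catches: one evenly spread over four variants, one dominated by one.
even_catch = ["hapA", "hapB", "hapC", "hapD"] * 5
skewed_catch = ["hapA"] * 17 + ["hapB", "hapC", "hapD"]
h_even = shannon_index(even_catch)
h_skewed = shannon_index(skewed_catch)
```

    The evenly spread catch reaches the maximum for four variants, ln 4, while the dominated catch scores lower, which is the kind of contrast a community-level index is meant to expose.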

  3. Beyond biodiversity: fish metagenomes.

    PubMed

    Ardura, Alba; Planes, Serge; Garcia-Vazquez, Eva

    2011-01-01

    Biodiversity and intra-specific genetic diversity are interrelated and determine the potential of a community to survive and evolve. Both are considered together in Prokaryote communities treated as metagenomes or ensembles of functional variants beyond species limits. Many factors alter biodiversity in higher Eukaryote communities, and human exploitation can be one of the most important for some groups of plants and animals. For example, fisheries can modify both biodiversity and genetic diversity (intra specific). Intra-specific diversity can be drastically altered by overfishing. Intense fishing pressure on one stock may imply extinction of some genetic variants and subsequent loss of intra-specific diversity. The objective of this study was to apply a metagenome approach to fish communities and explore its value for rapid evaluation of biodiversity and genetic diversity at community level. Here we have applied the metagenome approach employing the barcoding target gene coi as a model sequence in catch from four very different fish assemblages exploited by fisheries: freshwater communities from the Amazon River and northern Spanish rivers, and marine communities from the Cantabric and Mediterranean seas. Treating all sequences obtained from each regional catch as a biological unit (exploited community) we found that metagenomic diversity indices of the Amazonian catch sample here examined were lower than expected. Reduced diversity could be explained, at least partially, by overexploitation of the fish community that had been independently estimated by other methods. We propose using a metagenome approach for estimating diversity in Eukaryote communities and early evaluating genetic variation losses at multi-species level.

  4. Identification of eukaryotic open reading frames in metagenomic cDNA libraries made from environmental samples.

    PubMed

    Grant, Susan; Grant, William D; Cowan, Don A; Jones, Brian E; Ma, Yanhe; Ventosa, Antonio; Heaphy, Shaun

    2006-01-01

    Here we describe the application of metagenomic technologies to construct cDNA libraries from RNA isolated from environmental samples. RNAlater (Ambion) was shown to stabilize RNA in environmental samples for periods of at least 3 months at −20°C. Protocols for library construction were established on total RNA extracted from Acanthamoeba polyphaga trophozoites. The methodology was then used on algal mats from geothermal hot springs in Tengchong county, Yunnan Province, People's Republic of China, and activated sludge from a sewage treatment plant in Leicestershire, United Kingdom. The Tengchong libraries were dominated by RNA from prokaryotes, reflecting the mainly prokaryote microbial composition. The majority of these clones resulted from rRNA; only a few appeared to be derived from mRNA. In contrast, many clones from the activated sludge library had significant similarity to eukaryote mRNA-encoded protein sequences. A library was also made using polyadenylated RNA isolated from total RNA from activated sludge; many more clones in this library were related to eukaryotic mRNA sequences and proteins. Open reading frames (ORFs) up to 378 amino acids in size could be identified. Some resembled known proteins over their full length, e.g., 36% match to cystatin, 49% match to ribosomal protein L32, 63% match to ribosomal protein S16, 70% to CPC2 protein. The methodology described here permits the polyadenylated transcriptome to be isolated from environmental samples with no knowledge of the identity of the microorganisms in the sample or the necessity to culture them. It has many uses, including the identification of novel eukaryotic ORFs encoding proteins and enzymes.

  5. Identification of Eukaryotic Open Reading Frames in Metagenomic cDNA Libraries Made from Environmental Samples†

    PubMed Central

    Grant, Susan; Grant, William D.; Cowan, Don A.; Jones, Brian E.; Ma, Yanhe; Ventosa, Antonio; Heaphy, Shaun

    2006-01-01

    Here we describe the application of metagenomic technologies to construct cDNA libraries from RNA isolated from environmental samples. RNAlater (Ambion) was shown to stabilize RNA in environmental samples for periods of at least 3 months at −20°C. Protocols for library construction were established on total RNA extracted from Acanthamoeba polyphaga trophozoites. The methodology was then used on algal mats from geothermal hot springs in Tengchong county, Yunnan Province, People's Republic of China, and activated sludge from a sewage treatment plant in Leicestershire, United Kingdom. The Tengchong libraries were dominated by RNA from prokaryotes, reflecting the mainly prokaryote microbial composition. The majority of these clones resulted from rRNA; only a few appeared to be derived from mRNA. In contrast, many clones from the activated sludge library had significant similarity to eukaryote mRNA-encoded protein sequences. A library was also made using polyadenylated RNA isolated from total RNA from activated sludge; many more clones in this library were related to eukaryotic mRNA sequences and proteins. Open reading frames (ORFs) up to 378 amino acids in size could be identified. Some resembled known proteins over their full length, e.g., 36% match to cystatin, 49% match to ribosomal protein L32, 63% match to ribosomal protein S16, 70% to CPC2 protein. The methodology described here permits the polyadenylated transcriptome to be isolated from environmental samples with no knowledge of the identity of the microorganisms in the sample or the necessity to culture them. It has many uses, including the identification of novel eukaryotic ORFs encoding proteins and enzymes. PMID:16391035
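    How ORFs might be located in a cDNA clone sequence can be sketched with a six-frame scan. This is an assumption-laden illustration, not the procedure used in the study: it only reports ORFs that begin with ATG and end at a standard stop codon, so partial ORFs from truncated clones are missed, and the example sequence is invented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """List (strand, start, end, n_codons) for ATG..stop ORFs in six frames.

    Coordinates are 0-based on the scanned strand; end is exclusive and
    includes the stop codon, while n_codons counts ATG but not the stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, start, i + 3, n_codons))
                    start = None
    return orfs


# Invented toy sequence: a 6-codon ORF flanked by two bases on each side.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
toy_orfs = find_orfs(toy, min_codons=3)
```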

  6. Nematicidal protease genes screened from a soil metagenomic library to control Radopholus similis mediated by Pseudomonas fluorescens pf36.

    PubMed

    Chen, Deqiang; Wang, Dongwei; Xu, Chunling; Chen, Chun; Li, Junyi; Wu, Wenjia; Huang, Xin; Xie, Hui

    2018-04-01

    Controlling Radopholus similis, an important phytopathogenic nematode, is a challenge worldwide. Herein, we constructed a metagenomic fosmid library from the rhizosphere soil of banana plants, and six clones with protease activity were obtained by functionally screening the library. Furthermore, subclones were constructed using the six clones, and three protease genes with nematicidal activity were identified: pase1, pase4, and pase6. The pase4 gene was successfully cloned and expressed, demonstrating that the protease PASE4 could effectively degrade R. similis tissues and result in nematode death. Additionally, we isolated a predominant R. similis-associated bacterium, Pseudomonas fluorescens (pf36), from 10 R. similis populations with different hosts. The pase4 gene was successfully introduced into the pf36 strain by vector transformation and conjugative transposition, and two genetically modified strains were obtained: p4MCS-pf36 and p4Tn5-pf36. p4MCS-pf36 had significantly higher protease expression and nematicidal activity (p < 0.05) than p4Tn5-pf36 in a microtiter plate assay, whereas p4Tn5-pf36 was superior to p4MCS-pf36 in terms of genetic stability and controlling R. similis in growth pot tests. This study confirmed that R. similis is inhibited by the associated bacterium pf36-mediated expression of nematicidal proteases. Herein, a novel approach is provided for the study and development of efficient, environmentally friendly, and sustainable biocontrol techniques against phytonematodes.

  7. Metagenome analyses of corroded concrete wastewater pipe biofilms reveal a complex microbial system.

    PubMed

    Gomez-Alvarez, Vicente; Revetta, Randy P; Santo Domingo, Jorge W

    2012-06-22

    Concrete corrosion of wastewater collection systems is a significant cause of deterioration and premature collapse. Failure to adequately address the deteriorating infrastructure networks threatens our environment, public health, and safety. Analysis of whole-metagenome pyrosequencing data and 16S rRNA gene clone libraries was used to determine microbial composition and functional genes associated with biomass harvested from crown (top) and invert (bottom) sections of a corroded wastewater pipe. Taxonomic and functional analysis demonstrated that approximately 90% of the total diversity was associated with the phyla Actinobacteria, Bacteroidetes, Firmicutes and Proteobacteria. The top (TP) and bottom pipe (BP) communities were different in composition, with some of the differences attributed to the abundance of sulfide-oxidizing and sulfate-reducing bacteria. Additionally, human fecal bacteria were more abundant in the BP communities. Among the functional categories, proteins involved in sulfur and nitrogen metabolism showed the most significant differences between biofilms. There was also an enrichment of genes associated with heavy metal resistance, virulence (protein secretion systems) and stress response in the TP biofilm, while a higher number of genes related to motility and chemotaxis were identified in the BP biofilm. Both biofilms contain a high number of genes associated with resistance to antibiotics and toxic compounds subsystems. The function potential of wastewater biofilms was highly diverse with level of COG diversity similar to that described for soil. On the basis of the metagenomic data, some factors that may contribute to niche differentiation were pH, aerobic conditions and availability of substrate, such as nitrogen and sulfur. The results from this study will help us better understand the genetic network and functional capability of microbial members of wastewater concrete biofilms.

  8. The metagenomic data life-cycle: standards and best practices

    PubMed Central

    ten Hoopen, Petra; Finn, Robert D.; Bongo, Lars Ailo; Corre, Erwan; Meyer, Folker; Mitchell, Alex; Pelletier, Eric; Pesole, Graziano; Santamaria, Monica; Willassen, Nils Peder

    2017-01-01

    Metagenomics data analyses from independent studies can only be compared if the analysis workflows are described in a harmonized way. In this overview, we have mapped the landscape of data standards available for the description of essential steps in metagenomics: (i) material sampling, (ii) material sequencing, (iii) data analysis, and (iv) data archiving and publishing. Taking examples from marine research, we summarize essential variables used to describe material sampling processes and sequencing procedures in a metagenomics experiment. These aspects of metagenomics dataset generation have been to some extent addressed by the scientific community, but greater awareness and adoption is still needed. We emphasize the lack of standards relating to reporting how metagenomics datasets are analysed and how the metagenomics data analysis outputs should be archived and published. We propose best practice as a foundation for a community standard to enable reproducibility and better sharing of metagenomics datasets, leading ultimately to greater metagenomics data reuse and repurposing. PMID:28637310

  9. The metagenomic data life-cycle: standards and best practices

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    ten Hoopen, Petra; Finn, Robert D.; Bongo, Lars Ailo

    Metagenomics data analyses from independent studies can only be compared if the analysis workflows are described in a harmonised way. In this overview, we have mapped the landscape of data standards available for the description of essential steps in metagenomics: (1) material sampling, (2) material sequencing, (3) data analysis and (4) data archiving & publishing. Taking examples from marine research, we summarise essential variables used to describe material sampling processes and sequencing procedures in a metagenomics experiment. These aspects of metagenomics dataset generation have been to some extent addressed by the scientific community but greater awareness and adoption is still needed. We emphasise the lack of standards relating to reporting how metagenomics datasets are analysed and how the metagenomics data analysis outputs should be archived and published. We propose best practice as a foundation for a community standard to enable reproducibility and better sharing of metagenomics datasets, leading ultimately to greater metagenomics data reuse and repurposing.

  10. Cloning, expression and characterization of a metagenome derived thermoactive/thermostable pectinase.

    PubMed

    Singh, Rajvinder; Dhawan, Samriti; Singh, Kashmir; Kaur, Jagdeep

    2012-08-01

    The gene encoding a thermostable pectinase was isolated from a soil metagenome sample. The gene sequence corresponded to an open reading frame of 1,311 bp encoding a translation product of 47.9 kDa. It showed maximum (93 %) identity to a Bacillus licheniformis glycoside hydrolase. Deduced amino acid analysis showed an absence of highly conserved cysteine residues in the N-terminal region at positions 24 and 42, and in the C-terminal region at positions 389, 394, 413 and 424. pQpecJKR01 (pQE30 expression vector containing the pectinase gene) was expressed in Escherichia coli strain M15 as a recombinant fusion protein containing an N-terminal 6× His tag. Biochemical properties of this pectinase were novel. The enzyme had temperature and pH optima of 70 °C and 7.0, respectively, but was active over a broad temperature and pH range. The enzyme was stable at 60 °C with a half-life of 5 h and the enzyme activity was inhibited by 0.1 % diethyl pyrocarbonate and 5 mM dicyclohexyl carbodiimide. The enzyme could be of great use in industrial processes due to its activity over a broad pH range and at high temperature.
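    The reported ORF length and protein mass can be cross-checked with a rule of thumb. The sketch below assumes that the 1,311 bp figure includes the stop codon and uses an average residue mass of about 110 Da; both are assumptions made here for illustration, not statements from the abstract.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rough average mass per amino acid (assumption)


def orf_to_protein_estimate(orf_bp, includes_stop=True):
    """Return (number of residues, approximate mass in kDa) for an ORF length."""
    codons = orf_bp // 3
    residues = codons - 1 if includes_stop else codons
    return residues, residues * AVG_RESIDUE_MASS_DA / 1000.0


residues, approx_kda = orf_to_protein_estimate(1311)
```

    Under these assumptions the ORF encodes 436 residues with an estimated mass near 48 kDa, in line with the 47.9 kDa reported, and long enough to contain the residue positions up to 424 mentioned in the abstract.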

  11. Gene and translation initiation site prediction in metagenomic sequences

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hyatt, Philip Douglas; LoCascio, Philip F; Hauser, Loren John

    2012-01-01

    Gene prediction in metagenomic sequences remains a difficult problem. Current sequencing technologies do not achieve sufficient coverage to assemble the individual genomes in a typical sample; consequently, sequencing runs produce a large number of short sequences whose exact origin is unknown. Since these sequences are usually smaller than the average length of a gene, algorithms must make predictions based on very little data. We present MetaProdigal, a metagenomic version of the gene prediction program Prodigal, that can identify genes in short, anonymous coding sequences with a high degree of accuracy. The novel value of the method consists of enhanced translation initiation site identification, ability to identify sequences that use alternate genetic codes and confidence values for each gene call. We compare the results of MetaProdigal with other methods and conclude with a discussion of future improvements.

  12. In-depth resistome analysis by targeted metagenomics.

    PubMed

    Lanza, Val F; Baquero, Fernando; Martínez, José Luís; Ramos-Ruíz, Ricardo; González-Zorn, Bruno; Andremont, Antoine; Sánchez-Valenzuela, Antonio; Ehrlich, Stanislav Dusko; Kennedy, Sean; Ruppé, Etienne; van Schaik, Willem; Willems, Rob J; de la Cruz, Fernando; Coque, Teresa M

    2018-01-15

    Antimicrobial resistance is a major global health challenge. Metagenomics allows analyzing the presence and dynamics of "resistomes" (the ensemble of genes encoding antimicrobial resistance in a given microbiome) in disparate microbial ecosystems. However, the low sensitivity and specificity of available metagenomic methods preclude the detection of minority populations (often present below their detection threshold) and/or the identification of allelic variants that differ in the resulting phenotype. Here, we describe a novel strategy that combines targeted metagenomics using last generation in-solution capture platforms, with novel bioinformatics tools to establish a standardized framework that allows both quantitative and qualitative analyses of resistomes. We developed ResCap, a targeted sequence capture platform based on SeqCapEZ (NimbleGene) technology, which includes probes for 8667 canonical resistance genes (7963 antibiotic resistance genes and 704 genes conferring resistance to metals or biocides), and 2517 relaxase genes (plasmid markers) and 78,600 genes homologous to the previous identified targets (47,806 for antibiotics and 30,794 for biocides or metals). Its performance was compared with metagenomic shotgun sequencing (MSS) for 17 fecal samples (9 humans, 8 swine). ResCap significantly improves MSS to detect "gene abundance" (from 2.0 to 83.2%) and "gene diversity" (26 versus 14.9 genes unequivocally detected per sample per million of reads; the number of reads unequivocally mapped increasing up to 300-fold by using ResCap), which were calculated using novel bioinformatic tools. ResCap also facilitated the analysis of novel genes potentially involved in the resistance to antibiotics, metals, biocides, or any combination thereof. ResCap, the first targeted sequence capture, specifically developed to analyze resistomes, greatly enhances the sensitivity and specificity of available metagenomic methods and offers the possibility to analyze genes

  13. Metagenomic characterization of viral communities in Goseong Bay, Korea

    NASA Astrophysics Data System (ADS)

    Hwang, Jinik; Park, So Yun; Park, Mirye; Lee, Sukchan; Jo, Yeonhwa; Cho, Won Kyong; Lee, Taek-Kyun

    2016-12-01

    In this study, seawater samples were collected from Goseong Bay, Korea in March 2014 and viral populations were examined by metagenomics assembly. Enrichment of marine viral particles using FeCl3 followed by next-generation sequencing produced numerous sequences. De novo assembly and BLAST search showed that most of the obtained contigs were unknown sequences and only 0.74% of sequences were associated with known viruses. As a result, 138 viruses, including bacteriophages (87%), viruses infecting algae and others (13%) were identified. The identified 138 viruses were divided into 11 orders, 14 families, 34 genera, and 133 species. The dominant viruses were Pelagibacter phage HTVC010P and Roseobacter phage SIO1. The viruses infecting algae, including the Ostreococcus species, accounted for 9.4% of total identified viruses. In addition, we identified pathogenic herpes viruses infecting fishes and giant viruses infecting parasitic acanthamoeba species. This is a comprehensive study to reveal the viral populations in the Goseong Bay using metagenomics. The information associated with the marine viral community in Goseong Bay, Korea will be useful for comparative analysis in other marine viral communities.

  14. Metagenomic gene annotation by a homology-independent approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Froula, Jeff; Zhang, Tao; Salmeen, Annette

    2011-06-02

    Fully understanding the genetic potential of a microbial community requires functional annotation of all the genes it encodes. The recently developed deep metagenome sequencing approach has enabled rapid identification of millions of genes from a complex microbial community without cultivation. Current homology-based gene annotation fails to detect distantly-related or structural homologs. Furthermore, homology searches with millions of genes are very computationally intensive. To overcome these limitations, we developed rhModeller, a homology-independent software pipeline to efficiently annotate genes from metagenomic sequencing projects. Using cellulases and carbonic anhydrases as two independent test cases, we demonstrated that rhModeller is much faster than HMMER but with comparable accuracy, at 94.5% and 99.9% accuracy, respectively. More importantly, rhModeller has the ability to detect novel proteins that do not share significant homology to any known protein families. As ~50% of the 2 million genes derived from the cow rumen metagenome failed to be annotated based on sequence homology, we tested whether rhModeller could be used to annotate these genes. Preliminary results suggest that rhModeller is robust in the presence of missense and frameshift mutations, two common errors in metagenomic genes. Applying the pipeline to the cow rumen genes identified 4,990 novel cellulase candidates and 8,196 novel carbonic anhydrase candidates. In summary, we expect rhModeller to dramatically increase the speed and quality of metagenomic gene annotation.

  15. Glucose-tolerant β-glucosidase retrieved from a Kusaya gravy metagenome.

    PubMed

    Uchiyama, Taku; Yaoi, Katusro; Miyazaki, Kentaro

    2015-01-01

    β-glucosidases (BGLs) hydrolyze cello-oligosaccharides to glucose and play a crucial role in the enzymatic saccharification of cellulosic biomass. Despite their significance for the production of glucose, most identified BGLs are commonly inhibited by low (∼mM) concentrations of glucose. Therefore, BGLs that are insensitive to glucose inhibition have great biotechnological merit. We applied a metagenomic approach to screen for such rare glucose-tolerant BGLs. A metagenomic library was created in Escherichia coli (∼10,000 colonies) and grown on LB agar plates containing 5-bromo-4-chloro-3-indolyl-β-D-glucoside, yielding 828 positive (blue) colonies. These were then arrayed in 96-well plates, grown in LB, and secondarily screened for activity in the presence of 10% (w/v) glucose. Seven glucose-tolerant clones were identified, each of which contained a single bgl gene. The genes were classified into two groups, differing by two nucleotides. The deduced amino acid sequences of these genes were identical (452 aa) and found to belong to the glycosyl hydrolase family 1. The recombinant protein (Ks5A7) was overproduced in E. coli as a C-terminal 6 × His-tagged protein and purified to apparent homogeneity. The molecular mass of the purified Ks5A7 was determined to be 54 kDa by SDS-PAGE, and 160 kDa by gel filtration analysis. The enzyme was optimally active at 45°C and pH 5.0-6.5 and retained full or 1.5-2-fold enhanced activity in the presence of 0.1-0.5 M glucose. It had a low KM (78 μM with p-nitrophenyl β-D-glucoside; 0.36 mM with cellobiose) and high Vmax (91 μmol min⁻¹ mg⁻¹ with p-nitrophenyl β-D-glucoside; 155 μmol min⁻¹ mg⁻¹ with cellobiose) among known glucose-tolerant BGLs and was free from substrate (0.1 M cellobiose) inhibition. The efficient use of Ks5A7 in conjunction with Trichoderma reesei cellulases in enzymatic saccharification of alkaline-treated rice straw was demonstrated by increased production of glucose.

  16. Glucose-tolerant β-glucosidase retrieved from a Kusaya gravy metagenome

    PubMed Central

    Uchiyama, Taku; Yaoi, Katusro; Miyazaki, Kentaro

    2015-01-01

    β-glucosidases (BGLs) hydrolyze cello-oligosaccharides to glucose and play a crucial role in the enzymatic saccharification of cellulosic biomass. Despite their significance for the production of glucose, most identified BGLs are commonly inhibited by low (∼mM) concentrations of glucose. Therefore, BGLs that are insensitive to glucose inhibition have great biotechnological merit. We applied a metagenomic approach to screen for such rare glucose-tolerant BGLs. A metagenomic library was created in Escherichia coli (∼10,000 colonies) and grown on LB agar plates containing 5-bromo-4-chloro-3-indolyl-β-D-glucoside, yielding 828 positive (blue) colonies. These were then arrayed in 96-well plates, grown in LB, and secondarily screened for activity in the presence of 10% (w/v) glucose. Seven glucose-tolerant clones were identified, each of which contained a single bgl gene. The genes were classified into two groups, differing by two nucleotides. The deduced amino acid sequences of these genes were identical (452 aa) and found to belong to the glycosyl hydrolase family 1. The recombinant protein (Ks5A7) was overproduced in E. coli as a C-terminal 6 × His-tagged protein and purified to apparent homogeneity. The molecular mass of the purified Ks5A7 was determined to be 54 kDa by SDS-PAGE, and 160 kDa by gel filtration analysis. The enzyme was optimally active at 45°C and pH 5.0–6.5 and retained full or 1.5–2-fold enhanced activity in the presence of 0.1–0.5 M glucose. It had a low KM (78 μM with p-nitrophenyl β-D-glucoside; 0.36 mM with cellobiose) and high Vmax (91 μmol min⁻¹ mg⁻¹ with p-nitrophenyl β-D-glucoside; 155 μmol min⁻¹ mg⁻¹ with cellobiose) among known glucose-tolerant BGLs and was free from substrate (0.1 M cellobiose) inhibition. The efficient use of Ks5A7 in conjunction with Trichoderma reesei cellulases in enzymatic saccharification of alkaline-treated rice straw was demonstrated by increased production of glucose. PMID:26136726
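    The kinetic constants quoted above can be put to work with the Michaelis-Menten rate law, v = Vmax·[S] / (KM + [S]). The sketch below assumes simple Michaelis-Menten behaviour with no glucose activation or inhibition term, which is a simplification; only the KM and Vmax values come from the abstract.

```python
def michaelis_menten_rate(s_mM, km_mM, vmax):
    """Initial rate (same units as vmax) at substrate concentration s_mM."""
    return vmax * s_mM / (km_mM + s_mM)


# Constants reported for Ks5A7; Vmax is in umol per min per mg of enzyme.
PNPG = {"km_mM": 0.078, "vmax": 91.0}        # p-nitrophenyl beta-D-glucoside
CELLOBIOSE = {"km_mM": 0.36, "vmax": 155.0}

# At a substrate concentration equal to KM the rate is half of Vmax.
half_rate_cellobiose = michaelis_menten_rate(0.36, **CELLOBIOSE)
# At 10 mM cellobiose the enzyme is close to saturation.
rate_10mM_cellobiose = michaelis_menten_rate(10.0, **CELLOBIOSE)
```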

  17. Metagenomic frameworks for monitoring antibiotic resistance in aquatic environments.

    PubMed

    Port, Jesse A; Cullen, Alison C; Wallace, James C; Smith, Marissa N; Faustman, Elaine M

    2014-03-01

    High-throughput genomic technologies offer new approaches for environmental health monitoring, including metagenomic surveillance of antibiotic resistance determinants (ARDs). Although natural environments serve as reservoirs for antibiotic resistance genes that can be transferred to pathogenic and human commensal bacteria, monitoring of these determinants has been infrequent and incomplete. Furthermore, surveillance efforts have not been integrated into public health decision making. We used a metagenomic epidemiology-based approach to develop an ARD index that quantifies antibiotic resistance potential, and we analyzed this index for common modal patterns across environmental samples. We also explored how metagenomic data such as this index could be conceptually framed within an early risk management context. We analyzed 25 published data sets from shotgun pyrosequencing projects. The samples consisted of microbial community DNA collected from marine and freshwater environments across a gradient of human impact. We used principal component analysis to identify index patterns across samples. We observed significant differences in the overall index and index subcategory levels when comparing ecosystems more proximal versus distal to human impact. The selection of different sequence similarity thresholds strongly influenced the index measurements. Unique index subcategory modes distinguished the different metagenomes. Broad-scale screening of ARD potential using this index revealed utility for framing environmental health monitoring and surveillance. This approach holds promise as a screening tool for establishing baseline ARD levels that can be used to inform and prioritize decision making regarding management of ARD sources and human exposure routes. Port JA, Cullen AC, Wallace JC, Smith MN, Faustman EM. 2014. Metagenomic frameworks for monitoring antibiotic resistance in aquatic environments. Environ Health Perspect 122:222–228; http://dx.doi.org/10.1289/ehp

  18. Crystallization and preliminary X-ray analysis of a novel halotolerant feruloyl esterase identified from a soil metagenomic library

    PubMed Central

    Chen, Shang-ke; Wang, Kui; Liu, Yuhuan; Hu, Xiaopeng

    2012-01-01

    Feruloyl esterase cleaves the ester linkage formed between ferulic acid and polysaccharides in plant cell walls and thus has wide potential industrial applications. A novel feruloyl esterase (EstF27) identified from a soil metagenomic library was crystallized and a complete data set was collected from a single cooled crystal using an in-house X-ray source. The crystal diffracted to 2.9 Å resolution and belonged to space group P2₁2₁2₁, with unit-cell parameters a = 94.35, b = 106.19, c = 188.51 Å, α = β = γ = 90.00°. A Matthews coefficient of 2.55 Å³ Da⁻¹, with a corresponding solvent content of 51.84%, suggested the presence of ten protein subunits in the asymmetric unit. PMID:22750860
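The crystallographic figures in this record can be sanity-checked with a little arithmetic. The sketch below computes the orthorhombic unit-cell volume from the reported cell edges and estimates the solvent fraction from the reported Matthews coefficient using the conventional relation 1 − 1.23/V_M. The constant 1.23 Å³/Da and the function names are assumptions made for illustration; the abstract does not say which constant or software the authors used.

```python
# Minimal sketch: unit-cell volume and Matthews solvent-content estimate.
# Assumption: the conventional protein constant of 1.23 Å^3/Da; the abstract
# does not state which value the authors used.


def orthorhombic_volume(a: float, b: float, c: float) -> float:
    """Unit-cell volume in Å^3 for a cell with α = β = γ = 90°."""
    return a * b * c


def solvent_fraction(matthews_coefficient: float,
                     protein_constant: float = 1.23) -> float:
    """Estimated solvent fraction from a Matthews coefficient in Å^3/Da."""
    return 1.0 - protein_constant / matthews_coefficient


# Cell edges (Å) and Matthews coefficient (Å^3/Da) as reported in the abstract.
volume = orthorhombic_volume(94.35, 106.19, 188.51)
solvent = solvent_fraction(2.55)

print(f"unit-cell volume: {volume:.0f} Å^3")
print(f"estimated solvent content: {solvent:.1%}")
```

With these inputs the estimate comes out near 51.8%, close to the reported 51.84%; the small gap may reflect a slightly different constant or rounding in the original calculation.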

  19. Human milk metagenome: a functional capacity analysis

    PubMed Central

    2013-01-01

    Background: Human milk contains a diverse population of bacteria that likely influences colonization of the infant gastrointestinal tract. Recent studies, however, have been limited to characterization of this microbial community by 16S rRNA analysis. In the present study, a metagenomic approach using Illumina sequencing of a pooled milk sample (ten donors) was employed to determine the genera of bacteria and the types of bacterial open reading frames in human milk that may influence bacterial establishment and stability in this primal food matrix. The human milk metagenome was also compared to that of breast-fed and formula-fed infants’ feces (n = 5, each) and mothers’ feces (n = 3) at the phylum level and at a functional level using open reading frame abundance. Additionally, immune-modulatory bacterial-DNA motifs were also searched for within human milk. Results: The bacterial community in human milk contained over 360 prokaryotic genera, with sequences aligning predominantly to the phyla of Proteobacteria (65%) and Firmicutes (34%), and the genera of Pseudomonas (61.1%), Staphylococcus (33.4%) and Streptococcus (0.5%). From assembled human milk-derived contigs, 30,128 open reading frames were annotated and assigned to functional categories. When compared to the metagenome of infants’ and mothers’ feces, the human milk metagenome was less diverse at the phylum level, and contained more open reading frames associated with nitrogen metabolism, membrane transport and stress response (P < 0.05). The human milk metagenome also contained a similar occurrence of immune-modulatory DNA motifs to that of infants’ and mothers’ fecal metagenomes. Conclusions: Our results further expand the complexity of the human milk metagenome and enforce the benefits of human milk ingestion on the microbial colonization of the infant gut and immunity. Discovery of immune-modulatory motifs in the metagenome of human milk indicates more exhaustive analyses of the

  20. A functional metagenomic approach for expanding the synthetic biology toolbox for biomass conversion

    PubMed Central

    Sommer, Morten OA; Church, George M; Dantas, Gautam

    2010-01-01

    Sustainable biofuel alternatives to fossil fuel energy are hampered by recalcitrance and toxicity of biomass substrates to microbial biocatalysts. To address this issue, we present a culture-independent functional metagenomic platform for mining Nature's vast enzymatic reservoir and show its relevance to biomass conversion. We performed functional selections on 4.7 Gb of metagenomic fosmid libraries and show that genetic elements conferring tolerance toward seven important biomass inhibitors can be identified. We select two metagenomic fosmids that improve the growth of Escherichia coli by 5.7- and 6.9-fold in the presence of inhibitory concentrations of syringaldehyde and 2-furoic acid, respectively, and identify the individual genes responsible for these tolerance phenotypes. Finally, we combine the individual genes to create a three-gene construct that confers tolerance to mixtures of these important biomass inhibitors. This platform presents a route for expanding the repertoire of genetic elements available to synthetic biology and provides a starting point for efforts to engineer robust strains for biofuel generation. PMID:20393580

  1. Toward Accurate and Quantitative Comparative Metagenomics

    PubMed Central

    Nayfach, Stephen; Pollard, Katherine S.

    2016-01-01

    Shotgun metagenomics and computational analysis are used to compare the taxonomic and functional profiles of microbial communities. Leveraging this approach to understand roles of microbes in human biology and other environments requires quantitative data summaries whose values are comparable across samples and studies. Comparability is currently hampered by the use of abundance statistics that do not estimate a meaningful parameter of the microbial community and biases introduced by experimental protocols and data-cleaning approaches. Addressing these challenges, along with improving study design, data access, metadata standardization, and analysis tools, will enable accurate comparative metagenomics. We envision a future in which microbiome studies are replicable and new metagenomes are easily and rapidly integrated with existing data. Only then can the potential of metagenomics for predictive ecological modeling, well-powered association studies, and effective microbiome medicine be fully realized. PMID:27565341

  2. Global metagenomic survey reveals a new bacterial candidate phylum in geothermal springs

    PubMed Central

    Eloe-Fadrosh, Emiley A.; Paez-Espino, David; Jarett, Jessica; Dunfield, Peter F.; Hedlund, Brian P.; Dekas, Anne E.; Grasby, Stephen E.; Brady, Allyson L.; Dong, Hailiang; Briggs, Brandon R.; Li, Wen-Jun; Goudeau, Danielle; Malmstrom, Rex; Pati, Amrita; Pett-Ridge, Jennifer; Rubin, Edward M.; Woyke, Tanja; Kyrpides, Nikos C.; Ivanova, Natalia N.

    2016-01-01

    Analysis of the increasing wealth of metagenomic data collected from diverse environments can lead to the discovery of novel branches on the tree of life. Here we analyse 5.2 Tb of metagenomic data collected globally to discover a novel bacterial phylum (‘Candidatus Kryptonia’) found exclusively in high-temperature pH-neutral geothermal springs. This lineage had remained hidden as a taxonomic ‘blind spot’ because of mismatches in the primers commonly used for ribosomal gene surveys. Genome reconstruction from metagenomic data combined with single-cell genomics results in several high-quality genomes representing four genera from the new phylum. Metabolic reconstruction indicates a heterotrophic lifestyle with conspicuous nutritional deficiencies, suggesting the need for metabolic complementarity with other microbes. Co-occurrence patterns identify a number of putative partners, including an uncultured Armatimonadetes lineage. The discovery of Kryptonia within previously studied geothermal springs underscores the importance of globally sampled metagenomic data in detection of microbial novelty, and highlights the extraordinary diversity of microbial life still awaiting discovery. PMID:26814032

  3. Global metagenomic survey reveals a new bacterial candidate phylum in geothermal springs

    DOE PAGES

    Eloe-Fadrosh, Emiley A.; Paez-Espino, David; Jarett, Jessica; ...

    2016-01-27

    Analysis of the increasing wealth of metagenomic data collected from diverse environments can lead to the discovery of novel branches on the tree of life. Here we analyse 5.2 Tb of metagenomic data collected globally to discover a novel bacterial phylum ('Candidatus Kryptonia') found exclusively in high-temperature pH-neutral geothermal springs. This lineage had remained hidden as a taxonomic 'blind spot' because of mismatches in the primers commonly used for ribosomal gene surveys. Genome reconstruction from metagenomic data combined with single-cell genomics results in several high-quality genomes representing four genera from the new phylum. Metabolic reconstruction indicates a heterotrophic lifestyle with conspicuous nutritional deficiencies, suggesting the need for metabolic complementarity with other microbes. Co-occurrence patterns identify a number of putative partners, including an uncultured Armatimonadetes lineage. The discovery of Kryptonia within previously studied geothermal springs underscores the importance of globally sampled metagenomic data in detection of microbial novelty, and highlights the extraordinary diversity of microbial life still awaiting discovery.

  4. Metagenomic applications in environmental monitoring and bioremediation

    DOE PAGES

    Techtmann, Stephen M.; Hazen, Terry C.

    2016-01-01

    With the rapid advances in sequencing technology, the cost of sequencing has dramatically dropped and the scale of sequencing projects has increased accordingly. This has provided the opportunity for the routine use of sequencing techniques in the monitoring of environmental microbes. While metagenomic applications have been routinely applied to better understand the ecology and diversity of microbes, their use in environmental monitoring and bioremediation is increasingly common. In this review we seek to provide an overview of some of the metagenomic techniques used in environmental systems biology, addressing their application and limitation. We will also provide several recent examples of the application of metagenomics to bioremediation. We discuss examples where microbial communities have been used to predict the presence and extent of contamination, examples of how metagenomics can be used to characterize the process of natural attenuation by unculturable microbes, as well as examples detailing the use of metagenomics to understand the impact of biostimulation on microbial communities.

  5. Metagenomic applications in environmental monitoring and bioremediation.

    PubMed

    Techtmann, Stephen M; Hazen, Terry C

    2016-10-01

    With the rapid advances in sequencing technology, the cost of sequencing has dramatically dropped and the scale of sequencing projects has increased accordingly. This has provided the opportunity for the routine use of sequencing techniques in the monitoring of environmental microbes. While metagenomic applications have been routinely applied to better understand the ecology and diversity of microbes, their use in environmental monitoring and bioremediation is increasingly common. In this review we seek to provide an overview of some of the metagenomic techniques used in environmental systems biology, addressing their application and limitation. We will also provide several recent examples of the application of metagenomics to bioremediation. We discuss examples where microbial communities have been used to predict the presence and extent of contamination, examples of how metagenomics can be used to characterize the process of natural attenuation by unculturable microbes, as well as examples detailing the use of metagenomics to understand the impact of biostimulation on microbial communities.

  6. Single Cell and Metagenomic Assemblies: Biology Drives Technical Choices and Goals (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Stepanauskas, Ramunas

    2018-02-06

    DOE JGI's Tanja Woyke, chair of the Single Cells and Metagenomes session, delivers an introduction, followed by Bigelow Laboratory's Ramunas Stepanauskas on "Single Cell and Metagenomic Assemblies: Biology Drives Technical Choices and Goals" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  7. Toward Accurate and Quantitative Comparative Metagenomics.

    PubMed

    Nayfach, Stephen; Pollard, Katherine S

    2016-08-25

    Shotgun metagenomics and computational analysis are used to compare the taxonomic and functional profiles of microbial communities. Leveraging this approach to understand roles of microbes in human biology and other environments requires quantitative data summaries whose values are comparable across samples and studies. Comparability is currently hampered by the use of abundance statistics that do not estimate a meaningful parameter of the microbial community and biases introduced by experimental protocols and data-cleaning approaches. Addressing these challenges, along with improving study design, data access, metadata standardization, and analysis tools, will enable accurate comparative metagenomics. We envision a future in which microbiome studies are replicable and new metagenomes are easily and rapidly integrated with existing data. Only then can the potential of metagenomics for predictive ecological modeling, well-powered association studies, and effective microbiome medicine be fully realized.

  8. Challenges and Opportunities of Airborne Metagenomics

    PubMed Central

    Behzad, Hayedeh; Gojobori, Takashi; Mineta, Katsuhiko

    2015-01-01

    Recent metagenomic studies of environments, such as marine and soil, have significantly enhanced our understanding of the diverse microbial communities living in these habitats and their essential roles in sustaining vast ecosystems. The increase in the number of publications related to soil and marine metagenomics is in sharp contrast to those of air, yet airborne microbes are thought to have significant impacts on many aspects of our lives from their potential roles in atmospheric events such as cloud formation, precipitation, and atmospheric chemistry to their major impact on human health. In this review, we will discuss the current progress in airborne metagenomics, with a special focus on exploring the challenges and opportunities of undertaking such studies. The main challenges of conducting metagenomic studies of airborne microbes are as follows: 1) low density of microorganisms in the air, 2) efficient retrieval of microorganisms from the air, 3) variability in airborne microbial community composition, 4) the lack of standardized protocols and methodologies, and 5) DNA sequencing and bioinformatics-related challenges. Overcoming these challenges could provide the groundwork for comprehensive analysis of airborne microbes and their potential impact on the atmosphere, global climate, and our health. Metagenomic studies offer a unique opportunity to examine viral and bacterial diversity in the air and monitor their spread locally or across the globe, including threats from pathogenic microorganisms. Airborne metagenomic studies could also lead to discoveries of novel genes and metabolic pathways relevant to meteorological and industrial applications, environmental bioremediation, and biogeochemical cycles. PMID:25953766

  9. Metagenomics and novel gene discovery

    PubMed Central

    Culligan, Eamonn P; Sleator, Roy D; Marchesi, Julian R; Hill, Colin

    2014-01-01

    Metagenomics provides a means of assessing the total genetic pool of all the microbes in a particular environment, in a culture-independent manner. It has revealed unprecedented diversity in microbial community composition, which is further reflected in the encoded functional diversity of the genomes, a large proportion of which consists of novel genes. Herein, we review both sequence-based and functional metagenomic methods to uncover novel genes and outline some of the associated problems of each type of approach, as well as potential solutions. Furthermore, we discuss the potential for metagenomic biotherapeutic discovery, with a particular focus on the human gut microbiome and finally, we outline how the discovery of novel genes may be used to create bioengineered probiotics. PMID:24317337

  10. Survey of (Meta)genomic Approaches for Understanding Microbial Community Dynamics.

    PubMed

    Sharma, Anukriti; Lal, Rup

    2017-03-01

    Advances in next-generation sequencing technologies have driven the rapid evolution of genomics and metagenomics in a short time and at nominal cost. While metagenomics and genomics can be used separately to reveal culture-independent and culture-based microbial evolution, respectively, (meta)genomics together can be used to demonstrate results at the population level, revealing in-depth complex community interactions for specific ecotypes. The field of metagenomics, which started with answering "who is out there?" based on the 16S rRNA gene, has evolved immensely, with precise organismal reconstruction at the species/strain level from deeply covered metagenome data reducing the need to isolate bacteria, of which 99% are de facto non-cultivable. In this review we have underlined the appeal of metagenome-derived genomes in providing insights into evolutionary patterns, growth dynamics, genome/gene-specific sweeps, and durability of environmental pressures. We have demonstrated the use of culture-based genomics and environmental shotgun metagenome data together to elucidate environment-specific genome modulations via metagenomic recruitments in terms of gene loss/gain and accessory and core-genome extent. We further illustrated the benefit of (meta)genomics in the understanding of infectious diseases by deducing the relationship between human microbiota and clinical microbiology. This review summarizes the technological advances in (meta)genomic strategies that use genome and metagenome datasets together to increase the resolution of microbial population studies.

  11. Interactive metagenomic visualization in a Web browser.

    PubMed

    Ondov, Brian D; Bergman, Nicholas H; Phillippy, Adam M

    2011-09-30

    A critical output of metagenomic studies is the estimation of abundances of taxonomical or functional groups. The inherent uncertainty in assignments to these groups makes it important to consider both their hierarchical contexts and their prediction confidence. The current tools for visualizing metagenomic data, however, omit or distort quantitative hierarchical relationships and lack the facility for displaying secondary variables. Here we present Krona, a new visualization tool that allows intuitive exploration of relative abundances and confidences within the complex hierarchies of metagenomic classifications. Krona combines a variant of radial, space-filling displays with parametric coloring and interactive polar-coordinate zooming. The HTML5 and JavaScript implementation enables fully interactive charts that can be explored with any modern Web browser, without the need for installed software or plug-ins. This Web-based architecture also allows each chart to be an independent document, making them easy to share via e-mail or post to a standard Web server. To illustrate Krona's utility, we describe its application to various metagenomic data sets and its compatibility with popular metagenomic analysis tools. Krona is both a powerful metagenomic visualization tool and a demonstration of the potential of HTML5 for highly accessible bioinformatic visualizations. Its rich and interactive displays facilitate more informed interpretations of metagenomic analyses, while its implementation as a browser-based application makes it extremely portable and easily adopted into existing analysis packages. Both the Krona rendering code and conversion tools are freely available under a BSD open-source license, and available from: http://krona.sourceforge.net.
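The abstract's point about reading abundances in their hierarchical context can be made concrete with a small sketch. The code below sums read counts assigned to leaf taxa up through every ancestor in the lineage and converts the totals to relative abundances, which is the kind of nested quantity a radial, space-filling chart displays. It is an illustration only: the function names and the counts are hypothetical, and it does not reproduce Krona's code or input formats.

```python
from collections import defaultdict

# Minimal sketch of hierarchical abundance roll-up.
# All names and counts are hypothetical; this is not Krona's implementation.


def roll_up(leaf_counts):
    """Sum the count of each leaf lineage into every one of its prefixes."""
    totals = defaultdict(int)
    for lineage, count in leaf_counts.items():
        for depth in range(1, len(lineage) + 1):
            totals[lineage[:depth]] += count
    return dict(totals)


def relative_abundance(totals, grand_total):
    """Convert per-node totals to fractions of all assigned reads."""
    return {node: count / grand_total for node, count in totals.items()}


# Invented read counts keyed by lineage tuple (domain, phylum, genus).
leaf_counts = {
    ("Bacteria", "Proteobacteria", "Pseudomonas"): 60,
    ("Bacteria", "Firmicutes", "Staphylococcus"): 30,
    ("Bacteria", "Firmicutes", "Streptococcus"): 10,
}

totals = roll_up(leaf_counts)
fractions = relative_abundance(totals, sum(leaf_counts.values()))

for node in sorted(fractions):
    print(" > ".join(node), f"{fractions[node]:.0%}")
```

Keying each node by its full lineage prefix keeps taxa with the same name under different parents separate, so a parent's share always equals the sum of its children's shares.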

  12. Interactive metagenomic visualization in a Web browser

    PubMed Central

    2011-01-01

    Background: A critical output of metagenomic studies is the estimation of abundances of taxonomical or functional groups. The inherent uncertainty in assignments to these groups makes it important to consider both their hierarchical contexts and their prediction confidence. The current tools for visualizing metagenomic data, however, omit or distort quantitative hierarchical relationships and lack the facility for displaying secondary variables. Results: Here we present Krona, a new visualization tool that allows intuitive exploration of relative abundances and confidences within the complex hierarchies of metagenomic classifications. Krona combines a variant of radial, space-filling displays with parametric coloring and interactive polar-coordinate zooming. The HTML5 and JavaScript implementation enables fully interactive charts that can be explored with any modern Web browser, without the need for installed software or plug-ins. This Web-based architecture also allows each chart to be an independent document, making them easy to share via e-mail or post to a standard Web server. To illustrate Krona's utility, we describe its application to various metagenomic data sets and its compatibility with popular metagenomic analysis tools. Conclusions: Krona is both a powerful metagenomic visualization tool and a demonstration of the potential of HTML5 for highly accessible bioinformatic visualizations. Its rich and interactive displays facilitate more informed interpretations of metagenomic analyses, while its implementation as a browser-based application makes it extremely portable and easily adopted into existing analysis packages. Both the Krona rendering code and conversion tools are freely available under a BSD open-source license, and available from: http://krona.sourceforge.net. PMID:21961884

  13. Metagenomic Frameworks for Monitoring Antibiotic Resistance in Aquatic Environments

    PubMed Central

    Port, Jesse A.; Cullen, Alison C.; Wallace, James C.; Smith, Marissa N.

    2013-01-01

    Background: High-throughput genomic technologies offer new approaches for environmental health monitoring, including metagenomic surveillance of antibiotic resistance determinants (ARDs). Although natural environments serve as reservoirs for antibiotic resistance genes that can be transferred to pathogenic and human commensal bacteria, monitoring of these determinants has been infrequent and incomplete. Furthermore, surveillance efforts have not been integrated into public health decision making. Objectives: We used a metagenomic epidemiology–based approach to develop an ARD index that quantifies antibiotic resistance potential, and we analyzed this index for common modal patterns across environmental samples. We also explored how metagenomic data such as this index could be conceptually framed within an early risk management context. Methods: We analyzed 25 published data sets from shotgun pyrosequencing projects. The samples consisted of microbial community DNA collected from marine and freshwater environments across a gradient of human impact. We used principal component analysis to identify index patterns across samples. Results: We observed significant differences in the overall index and index subcategory levels when comparing ecosystems more proximal versus distal to human impact. The selection of different sequence similarity thresholds strongly influenced the index measurements. Unique index subcategory modes distinguished the different metagenomes. Conclusions: Broad-scale screening of ARD potential using this index revealed utility for framing environmental health monitoring and surveillance. This approach holds promise as a screening tool for establishing baseline ARD levels that can be used to inform and prioritize decision making regarding management of ARD sources and human exposure routes. Citation: Port JA, Cullen AC, Wallace JC, Smith MN, Faustman EM. 2014. Metagenomic frameworks for monitoring antibiotic resistance in aquatic environments

  14. Comparative analysis of metagenomes of Italian top soil improvers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gigliucci, Federica

    Biosolids originating from Municipal Waste Water Treatment Plants are proposed as top soil improvers (TSI) for their beneficial input of organic carbon on agricultural lands. Their use to amend soil is controversial, as it may lead to the presence of emerging hazards of anthropogenic or animal origin in the environment devoted to food production. In this study, we used shotgun metagenomic sequencing as a tool to characterize the hazards related to the TSIs. The samples showed the presence of many virulence genes associated with different diarrheagenic E. coli pathotypes as well as of different antimicrobial resistance-associated genes. The genes conferring resistance to Fluoroquinolones were the most relevant class of antimicrobial resistance genes observed in all the samples tested. To a lesser extent, traits associated with resistance to Methicillin in Staphylococci and genes conferring resistance to Streptothricin, Fosfomycin and Vancomycin were also identified. The most represented metal resistance genes were cobalt-zinc-cadmium related, accounting for 15–50% of the sequence reads in the different metagenomes out of the total number of those mapping on the class of resistance to compounds determinants. Moreover, the taxonomic analysis performed by comparing compost-based samples and biosolids derived from municipal sewage-sludge treatment divided the samples into separate populations, based on the microbiota composition. The results confirm that metagenomics is efficient at detecting genomic traits associated with pathogens and antimicrobial resistance in complex matrices, and that this approach can be efficiently used for the traceability of TSI samples using the microorganisms’ profiles as indicators of their origin. - Highlights: • Sludge- and green-based biosolids analysed by metagenomics. • Biosolids may introduce microbial hazards in the food chain. • Metagenomics enables tracking biosolids’ sources.

  15. EBI metagenomics in 2016 - an expanding and evolving resource for the analysis and archiving of metagenomic data

    PubMed Central

    Mitchell, Alex; Bucchini, Francois; Cochrane, Guy; Denise, Hubert; Hoopen, Petra ten; Fraser, Matthew; Pesseat, Sebastien; Potter, Simon; Scheremetjew, Maxim; Sterk, Peter; Finn, Robert D.

    2016-01-01

    EBI metagenomics (https://www.ebi.ac.uk/metagenomics/) is a freely available hub for the analysis and archiving of metagenomic and metatranscriptomic data. Over the last 2 years, the resource has undergone rapid growth, with an increase of over five-fold in the number of processed samples and consequently represents one of the largest resources of analysed shotgun metagenomes. Here, we report the status of the resource in 2016 and give an overview of new developments. In particular, we describe updates to data content, a complete overhaul of the analysis pipeline, streamlining of data presentation via the website and the development of a new web based tool to compare functional analyses of sequence runs within a study. We also highlight two of the higher profile projects that have been analysed using the resource in the last year: the oceanographic projects Ocean Sampling Day and Tara Oceans. PMID:26582919

  16. Metazen – metadata capture for metagenomes

    DOE PAGES

    Bischof, Jared; Harrison, Travis; Paczian, Tobias; ...

    2014-12-08

    Background: As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. These tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results: Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusion: Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.

  17. Metazen – metadata capture for metagenomes

    PubMed Central

    2014-01-01

    Background: As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. Unfortunately, these tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results: Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusions: Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility. PMID:25780508

  18. Metazen – metadata capture for metagenomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bischof, Jared; Harrison, Travis; Paczian, Tobias

    Background: As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. These tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results: Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusion: Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.

  19. Developing a Bacteroides System for Function-Based Screening of DNA from the Human Gut Microbiome.

    PubMed

    Lam, Kathy N; Martens, Eric C; Charles, Trevor C

    2018-01-01

    Functional metagenomics is a powerful method that allows the isolation of genes whose role may not have been predicted from DNA sequence. In this approach, first, environmental DNA is cloned to generate metagenomic libraries that are maintained in Escherichia coli, and second, the cloned DNA is screened for activities of interest. Typically, functional screens are carried out using E. coli as a surrogate host, although there likely exist barriers to gene expression, such as lack of recognition of native promoters. Here, we describe efforts to develop Bacteroides thetaiotaomicron as a surrogate host for screening metagenomic DNA from the human gut. We construct a B. thetaiotaomicron-compatible fosmid cloning vector, generate a fosmid clone library using DNA from the human gut, and show successful functional complementation of a B. thetaiotaomicron glycan utilization mutant. Though we were unable to retrieve the physical fosmid after complementation, we used genome sequencing to identify the complementing genes derived from the human gut microbiome. Our results demonstrate that the use of B. thetaiotaomicron to express metagenomic DNA is promising, but they also exemplify the challenges that can be encountered in the development of new surrogate hosts for functional screening. IMPORTANCE Human gut microbiome research has been supported by advances in DNA sequencing that make it possible to obtain gigabases of sequence data from metagenomes but is limited by a lack of knowledge of gene function that leads to incomplete annotation of these data sets. There is a need for the development of methods that can provide experimental data regarding microbial gene function. Functional metagenomics is one such method, but functional screens are often carried out using hosts that may not be able to express the bulk of the environmental DNA being screened. We expand the range of current screening hosts and demonstrate that human gut-derived metagenomic libraries can be

  20. Metagenome analyses of corroded concrete wastewater pipe biofilms reveal a complex microbial system

    PubMed Central

    2012-01-01

    Background Concrete corrosion of wastewater collection systems is a significant cause of deterioration and premature collapse. Failure to adequately address the deteriorating infrastructure networks threatens our environment, public health, and safety. Analysis of whole-metagenome pyrosequencing data and 16S rRNA gene clone libraries was used to determine microbial composition and functional genes associated with biomass harvested from crown (top) and invert (bottom) sections of a corroded wastewater pipe. Results Taxonomic and functional analysis demonstrated that approximately 90% of the total diversity was associated with the phyla Actinobacteria, Bacteroidetes, Firmicutes and Proteobacteria. The top (TP) and bottom pipe (BP) communities were different in composition, with some of the differences attributed to the abundance of sulfide-oxidizing and sulfate-reducing bacteria. Additionally, human fecal bacteria were more abundant in the BP communities. Among the functional categories, proteins involved in sulfur and nitrogen metabolism showed the most significant differences between biofilms. There was also an enrichment of genes associated with heavy metal resistance, virulence (protein secretion systems) and stress response in the TP biofilm, while a higher number of genes related to motility and chemotaxis were identified in the BP biofilm. Both biofilms contain a high number of genes associated with resistance to antibiotics and toxic compounds subsystems. Conclusions The function potential of wastewater biofilms was highly diverse with level of COG diversity similar to that described for soil. On the basis of the metagenomic data, some factors that may contribute to niche differentiation were pH, aerobic conditions and availability of substrate, such as nitrogen and sulfur. The results from this study will help us better understand the genetic network and functional capability of microbial members of wastewater concrete biofilms. PMID:22727216

  1. A New Zamilon-like Virophage Partial Genome Assembled from a Bioreactor Metagenome

    PubMed Central

    Bekliz, Meriem; Verneau, Jonathan; Benamar, Samia; Raoult, Didier; La Scola, Bernard; Colson, Philippe

    2015-01-01

    Virophages replicate within viral factories inside the Acanthamoeba cytoplasm, and decrease the infectivity and replication of their associated giant viruses. Culture isolation and metagenome analyses have suggested that they are common in our environment. By screening metagenomic databases in search of amoebal viruses, we detected virophage-related sequences among sequences generated from the same non-aerated bioreactor metagenome as recently screened by another team for virophage capsid-encoding genes. We describe here the assembled partial genome of a virophage closely related to Zamilon, which infects Acanthamoeba with mimiviruses of lineages B and C but not A. Searches for sequences related to amoebal giant viruses, other Megavirales representatives and virophages were conducted using BLAST against this bioreactor metagenome (PRJNA73603). Comparative genomic and phylogenetic analyses were performed using sequences from previously identified virophages. A total of 72 metagenome contigs generated from the bioreactor were identified as best matching with sequences from Megavirales representatives, mostly Pithovirus sibericum, pandoraviruses and amoebal mimiviruses from three lineages A–C, as well as from virophages. In addition, a partial genome from a Zamilon-like virophage, we named Zamilon 2, was assembled. This genome has a size of 6716 base pairs, corresponding to 39% of the Zamilon genome, and comprises partial or full-length homologs for 15 Zamilon predicted open reading frames (ORFs). Mean nucleotide and amino acid identities for these 15 Zamilon 2 ORFs with their Zamilon counterparts were 89% (range, 81–96%) and 91% (range, 78–99%), respectively. Notably, these ORFs included two encoding a capsid protein and a packaging ATPase. Comparative genomics and phylogenetic analyses indicated that the partial genome was that of a new Zamilon-like virophage. Further studies are needed to gain better knowledge of the tropism and prevalence of virophages in
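
    The size comparison in the abstract above (6716 base pairs, reported as 39% of the Zamilon genome) can be turned into a rough estimate of the full Zamilon genome length. The sketch below is only a worked piece of arithmetic: the function name is hypothetical, and it assumes the 39% figure was rounded to the nearest whole percent, which gives a range rather than a single value.

```python
def implied_full_length(partial_bp, percent, rounding=0.5):
    """Estimate the full genome length implied by a partial length and a
    rounded percentage. Returns (low, mid, high) in base pairs.

    Assumption: `percent` was rounded to the nearest whole percent, so the
    true value lies within +/- `rounding` percentage points.
    """
    if percent - rounding <= 0:
        raise ValueError("percent must be larger than the rounding margin")
    low = partial_bp * 100.0 / (percent + rounding)   # larger share -> shorter genome
    mid = partial_bp * 100.0 / percent
    high = partial_bp * 100.0 / (percent - rounding)  # smaller share -> longer genome
    return low, mid, high


# Figures taken from the abstract: 6716 bp is about 39% of the Zamilon genome.
low, mid, high = implied_full_length(6716, 39)
print(f"implied Zamilon genome length: {mid:.0f} bp "
      f"(range {low:.0f} to {high:.0f} bp)")
```

    Under these assumptions the reference genome comes out at a little over 17 kb, which is the scale against which the 15 partial or full-length ORF homologs should be read.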

  2. A metagenomic window into carbon metabolism at 3 km depth in Precambrian continental crust

    PubMed Central

    Magnabosco, Cara; Ryan, Kathleen; Lau, Maggie C Y; Kuloyo, Olukayode; Sherwood Lollar, Barbara; Kieft, Thomas L; van Heerden, Esta; Onstott, Tullis C

    2016-01-01

    Subsurface microbial communities comprise a significant fraction of the global prokaryotic biomass; however, the carbon metabolisms that support the deep biosphere have been relatively unexplored. In order to determine the predominant carbon metabolisms within a 3-km deep fracture fluid system accessed via the Tau Tona gold mine (Witwatersrand Basin, South Africa), metagenomic and thermodynamic analyses were combined. Within our system of study, the energy-conserving reductive acetyl-CoA (Wood-Ljungdahl) pathway was found to be the most abundant carbon fixation pathway identified in the metagenome. Carbon monoxide dehydrogenase genes that have the potential to participate in (1) both autotrophic and heterotrophic metabolisms through the reversible oxidization of CO and subsequent transfer of electrons for sulfate reduction, (2) direct utilization of H2 and (3) methanogenesis were identified. The most abundant members of the metagenome belonged to Euryarchaeota (22%) and Firmicutes (57%)—by far, the highest relative abundance of Euryarchaeota yet reported from deep fracture fluids in South Africa and one of only five Firmicutes-dominated deep fracture fluids identified in the region. Importantly, by combining the metagenomics data and thermodynamic modeling of this study with previously published isotopic and community composition data from the South African subsurface, we are able to demonstrate that Firmicutes-dominated communities are associated with a particular hydrogeologic environment, specifically the older, more saline and more reducing waters. PMID:26325359

  3. Challenges and opportunities of airborne metagenomics.

    PubMed

    Behzad, Hayedeh; Gojobori, Takashi; Mineta, Katsuhiko

    2015-05-06

    Recent metagenomic studies of environments, such as marine and soil, have significantly enhanced our understanding of the diverse microbial communities living in these habitats and their essential roles in sustaining vast ecosystems. The increase in the number of publications related to soil and marine metagenomics is in sharp contrast to those of air, yet airborne microbes are thought to have significant impacts on many aspects of our lives from their potential roles in atmospheric events such as cloud formation, precipitation, and atmospheric chemistry to their major impact on human health. In this review, we will discuss the current progress in airborne metagenomics, with a special focus on exploring the challenges and opportunities of undertaking such studies. The main challenges of conducting metagenomic studies of airborne microbes are as follows: 1) Low density of microorganisms in the air, 2) efficient retrieval of microorganisms from the air, 3) variability in airborne microbial community composition, 4) the lack of standardized protocols and methodologies, and 5) DNA sequencing and bioinformatics-related challenges. Overcoming these challenges could provide the groundwork for comprehensive analysis of airborne microbes and their potential impact on the atmosphere, global climate, and our health. Metagenomic studies offer a unique opportunity to examine viral and bacterial diversity in the air and monitor their spread locally or across the globe, including threats from pathogenic microorganisms. Airborne metagenomic studies could also lead to discoveries of novel genes and metabolic pathways relevant to meteorological and industrial applications, environmental bioremediation, and biogeochemical cycles. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  4. A new tetracycline efflux gene, tet(40), is located in tandem with tet(O/32/O) in a human gut firmicute bacterium and in metagenomic library clones.

    PubMed

    Kazimierczak, Katarzyna A; Rincon, Marco T; Patterson, Andrea J; Martin, Jennifer C; Young, Pauline; Flint, Harry J; Scott, Karen P

    2008-11-01

    The bacterium Clostridium saccharolyticum K10, isolated from a fecal sample obtained from a healthy donor who had received long-term tetracycline therapy, was found to carry three tetracycline resistance genes: tet(W) and the mosaic tet(O/32/O), both conferring ribosome protection-type resistance, and a novel, closely linked efflux-type resistance gene designated tet(40). tet(40) encodes a predicted membrane-associated protein with 42% amino acid identity to tetA(P). Tetracycline did not accumulate in Escherichia coli cells expressing the Tet(40) efflux protein, and resistance to tetracycline was reduced when cells were incubated with an efflux pump inhibitor. E. coli cells carrying tet(40) had a 50% inhibitory concentration of tetracycline of 60 μg/ml. Analysis of a transconjugant from a mating between donor strain C. saccharolyticum K10 and the recipient human gut commensal bacterium Roseburia inulinivorans suggested that tet(O/32/O) and tet(40) were cotransferred on a mobile element. Sequence analysis of a 37-kb insert identified on the basis of tetracycline resistance from a metagenomic fosmid library again revealed a tandem arrangement of tet(O/32/O) and tet(40), flanked by regions with homology to parts of the VanG operon previously identified in Enterococcus faecalis. At least 10 of the metagenomic inserts that carried tet(O/32/O) also carried tet(40), suggesting that tet(40), although previously undetected, may be an abundant efflux gene.

  5. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ahn, Tae-Hyuk; Chai, Juanjuan; Pan, Chongle

    Motivation: Metagenomic sequencing of clinical samples provides a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can be used to resolve serotypes of a pathogen in biosurveillance. Sigma was developed for strain-level identification and quantification of pathogens using their reference genomes based on metagenomic analysis. Results: Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. In conclusion, the algorithm performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains. Availability and Implementation: Sigma was implemented in C++ with source codes and binaries freely available at http://sigma.omicsbio.org.

  6. Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance

    DOE PAGES

    Ahn, Tae-Hyuk; Chai, Juanjuan; Pan, Chongle

    2014-09-29

    Motivation: Metagenomic sequencing of clinical samples provides a promising technique for direct pathogen detection and characterization in biosurveillance. Taxonomic analysis at the strain level can be used to resolve serotypes of a pathogen in biosurveillance. Sigma was developed for strain-level identification and quantification of pathogens using their reference genomes based on metagenomic analysis. Results: Sigma provides not only accurate strain-level inferences, but also three unique capabilities: (i) Sigma quantifies the statistical uncertainty of its inferences, which includes hypothesis testing of identified genomes and confidence interval estimation of their relative abundances; (ii) Sigma enables strain variant calling by assigning metagenomic reads to their most likely reference genomes; and (iii) Sigma supports parallel computing for fast analysis of large datasets. In conclusion, the algorithm performance was evaluated using simulated mock communities and fecal samples with spike-in pathogen strains. Availability and Implementation: Sigma was implemented in C++ with source codes and binaries freely available at http://sigma.omicsbio.org.

  7. Marine metagenomics: strategies for the discovery of novel enzymes with biotechnological applications from marine environments

    PubMed Central

    Kennedy, Jonathan; Marchesi, Julian R; Dobson, Alan DW

    2008-01-01

    Metagenomic based strategies have previously been successfully employed as powerful tools to isolate and identify enzymes with novel biocatalytic activities from the unculturable component of microbial communities from various terrestrial environmental niches. Both sequence based and function based screening approaches have been employed to identify genes encoding novel biocatalytic activities and metabolic pathways from metagenomic libraries. While much of the focus to date has centred on terrestrial based microbial ecosystems, it is clear that the marine environment has enormous microbial biodiversity that remains largely unstudied. Marine microbes are both extremely abundant and diverse; the environments they occupy likewise consist of very diverse niches. As culture-dependent methods have thus far resulted in the isolation of only a tiny percentage of the marine microbiota the application of metagenomic strategies holds great potential to study and exploit the enormous microbial biodiversity which is present within these marine environments. PMID:18717988

  8. Comparative Viral Metagenomics of Environmental Samples from Korea

    PubMed Central

    Kim, Min-Soo; Whon, Tae Woong

    2013-01-01

    The introduction of metagenomics into the field of virology has facilitated the exploration of viral communities in various natural habitats. Understanding the viral ecology of a variety of sample types throughout the biosphere is important per se, but it also has potential applications in clinical and diagnostic virology. However, the procedures used by viral metagenomics may produce technical errors, such as amplification bias, while public viral databases are very limited, which may hamper the determination of the viral diversity in samples. This review considers the current state of viral metagenomics, based on examples from Korean viral metagenomic studies-i.e., rice paddy soil, fermented foods, human gut, seawater, and the near-surface atmosphere. Viral metagenomics has become widespread due to various methodological developments, and much attention has been focused on studies that consider the intrinsic role of viruses that interact with their hosts. PMID:24124407

  9. A Delphi Technology Foresight Study: Mapping Social Construction of Scientific Evidence on Metagenomics Tests for Water Safety

    PubMed Central

    Birko, Stanislav; Dove, Edward S.; Özdemir, Vural

    2015-01-01

    metagenomics rather than a co-productionist role at the “upstream” scientific design stage of metagenomics tests. In summary, these findings offer strategic foresight to govern metagenomics innovations symmetrically: by identifying areas where acceleration (e.g., consensus areas) and deceleration/reconsideration (e.g., dissensus areas) of the innovation trajectory might be warranted. Additionally, we show how scientific evidence is subject to potential social construction by experts’ value systems and the need for greater upstream public engagement on metagenomics innovations. PMID:26066837

  10. Metagenomic insights into chlorination effects on microbial antibiotic resistance in drinking water.

    PubMed

    Shi, Peng; Jia, Shuyu; Zhang, Xu-Xiang; Zhang, Tong; Cheng, Shupei; Li, Aimin

    2013-01-01

    This study aimed to investigate the effects of chlorination on microbial antibiotic resistance in a drinking water treatment plant. Biochemical identification, 16S rRNA gene cloning and metagenomic analysis consistently indicated that Proteobacteria were the main antibiotic resistant bacteria (ARB) dominating in the drinking water and that chlorine disinfection greatly affected microbial community structure. After chlorination, a higher proportion of the surviving bacteria was resistant to chloramphenicol, trimethoprim and cephalothin. Quantitative real-time PCRs revealed that sulI had the highest abundance among the antibiotic resistance genes (ARGs) detected in the drinking water, followed by tetA and tetG. Chlorination caused enrichment of ampC, aphA2, bla(TEM-1), tetA, tetG, ermA and ermB, but sulI was considerably removed (p < 0.05). Metagenomic analysis confirmed that drinking water chlorination could concentrate various ARGs, as well as plasmids, insertion sequences and integrons involved in horizontal transfer of the ARGs. Water pipeline transportation tended to reduce the abundance of most ARGs, but various ARB and ARGs were still present in the tap water, which deserves more public health concern. The results highlighted the prevalence of ARB and ARGs in chlorinated drinking water, and this study might be technologically useful for detecting the ARGs in water environments. Copyright © 2012 Elsevier Ltd. All rights reserved.

  11. Metagenomic analysis reveals adaptations to a cold-adapted lifestyle in a low-temperature acid mine drainage stream.

    PubMed

    Liljeqvist, Maria; Ossandon, Francisco J; González, Carolina; Rajan, Sukithar; Stell, Adam; Valdes, Jorge; Holmes, David S; Dopson, Mark

    2015-04-01

    An acid mine drainage (pH 2.5-2.7) stream biofilm situated 250 m below ground in the low-temperature (6-10°C) Kristineberg mine, northern Sweden, contained a microbial community equipped for growth at low temperature and acidic pH. Metagenomic sequencing of the biofilm and planktonic fractions identified the most abundant microorganism to be similar to the psychrotolerant acidophile, Acidithiobacillus ferrivorans. In addition, metagenome contigs were most similar to other Acidithiobacillus species, an Acidobacteria-like species, and a Gallionellaceae-like species. Analyses of the metagenomes indicated functional characteristics previously characterized as related to growth at low temperature including cold-shock proteins, several pathways for the production of compatible solutes and an anti-freeze protein. In addition, genes were predicted to encode functions related to pH homeostasis and metal resistance related to growth in the acidic metal-containing mine water. Metagenome analyses identified microorganisms capable of nitrogen fixation and exhibiting a primarily autotrophic lifestyle driven by the oxidation of the ferrous iron and inorganic sulfur compounds contained in the sulfidic mine waters. The study identified a low diversity of abundant microorganisms adapted to a low-temperature acidic environment as well as identifying some of the strategies the microorganisms employ to grow in this extreme environment. © FEMS 2015. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  12. Comparison of normalization methods for the analysis of metagenomic gene abundance data.

    PubMed

    Pereira, Mariana Buongermino; Wallroth, Mikael; Jonsson, Viktor; Kristiansson, Erik

    2018-04-20

    In shotgun metagenomics, microbial communities are studied through direct sequencing of DNA without any prior cultivation. By comparing gene abundances estimated from the generated sequencing reads, functional differences between the communities can be identified. However, gene abundance data is affected by high levels of systematic variability, which can greatly reduce the statistical power and introduce false positives. Normalization, which is the process where systematic variability is identified and removed, is therefore a vital part of the data analysis. A wide range of normalization methods for high-dimensional count data has been proposed but their performance on the analysis of shotgun metagenomic data has not been evaluated. Here, we present a systematic evaluation of nine normalization methods for gene abundance data. The methods were evaluated through resampling of three comprehensive datasets, creating a realistic setting that preserved the unique characteristics of metagenomic data. Performance was measured in terms of the methods' ability to identify differentially abundant genes (DAGs), correctly calculate unbiased p-values and control the false discovery rate (FDR). Our results showed that the choice of normalization method has a large impact on the end results. When the DAGs were asymmetrically present between the experimental conditions, many normalization methods had a reduced true positive rate (TPR) and a high false positive rate (FPR). The methods trimmed mean of M-values (TMM) and relative log expression (RLE) had the overall highest performance and are therefore recommended for the analysis of gene abundance data. For larger sample sizes, CSS also showed satisfactory performance. This study emphasizes the importance of selecting a suitable normalization method in the analysis of data from shotgun metagenomics. Our results also demonstrate that improper methods may result in unacceptably high levels of false positives, which in turn may lead
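
    To make the idea of normalization in the abstract above more concrete, the following is a minimal sketch of RLE-style scaling as it is commonly described (median-of-ratios size factors). It is not the code used in the study: the function names are hypothetical, the counts are made up for illustration, and real analyses would normally rely on an established package rather than this simplified version.

```python
import math
from statistics import median


def rle_size_factors(counts):
    """Compute one scaling factor per sample from a gene-by-sample count table.

    counts: list of rows (genes), each a list of per-sample counts.
    Only genes with a positive count in every sample are used, so that the
    per-gene geometric mean is defined and nonzero.
    """
    n_samples = len(counts[0])
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no gene has positive counts in every sample")
    # Log of the geometric mean of each usable gene across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Median log-ratio of this sample's counts to the per-gene reference.
        log_ratios = [math.log(row[j]) - ref
                      for row, ref in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts, factors):
    """Divide every count by the size factor of its sample."""
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Illustrative data: sample 2 was sequenced twice as deeply as sample 1.
counts = [[10, 20], [100, 200], [5, 10], [0, 7]]
factors = rle_size_factors(counts)
normalized = normalize(counts, factors)
```

    In this toy table the second sample's factor is twice the first, so after scaling the three shared genes have equal abundance in both samples, while the gene observed in only one sample still stands out. Taking the median rather than the mean of the ratios is what keeps a few strongly differential genes from dominating the scaling factor.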

  13. Microbial population index and community structure in saline-alkaline soil using gene targeted metagenomics.

    PubMed

    Keshri, Jitendra; Mishra, Avinash; Jha, Bhavanath

    2013-03-30

    Population indices of bacteria and archaea were investigated from saline-alkaline soil and a possible microbe-environment pattern was established using gene targeted metagenomics. Clone libraries were constructed using 16S rRNA and functional gene(s) involved in carbon fixation (cbbL), nitrogen fixation (nifH), ammonia oxidation (amoA) and sulfur metabolism (apsA). Molecular phylogeny revealed the dominance of Actinobacteria, Firmicutes and Proteobacteria along with archaeal members of Halobacteraceae. The library consisted of novel bacterial (20%) and archaeal (38%) genera showing ≤95% similarity to previously retrieved sequences. Phylogenetic analysis indicated the ability of the inhabitants to survive under stress conditions. The 16S rRNA gene libraries contained novel gene sequences and were distantly homologous with cultured bacteria. Functional gene libraries were found to be unique and most of the clones were distantly related to Proteobacteria, while clones of the nifH gene library also showed homology with Cyanobacteria and Firmicutes. Quantitative real-time PCR showed that bacterial abundance was two orders of magnitude higher than archaeal abundance. The gene(s) quantification indicated the size of the functional guilds harboring relevant key genes. The study provides insights into the microbial ecology and different metabolic interactions occurring in saline-alkaline soil, which harbors phylogenetically diverse groups of bacteria and archaea that may be explored further for gene cataloging and metabolic profiling. Copyright © 2012 Elsevier GmbH. All rights reserved.

  14. Cloning, purification, crystallization and preliminary X-ray studies of a carbohydrate-binding module (CBM_E1) derived from sugarcane soil metagenome.

    PubMed

    Campos, Bruna Medeia; Alvarez, Thabata Maria; Liberato, Marcelo Vizona; Polikarpov, Igor; Gilbert, Harry J; Zeri, Ana Carolina de Mattos; Squina, Fabio Marcio

    2014-09-01

    In recent years, owing to the growing global demand for energy, dependence on fossil fuels, limited natural resources and environmental pollution, biofuels have attracted great interest as a source of renewable energy. However, the production of biofuels from plant biomass is still considered to be an expensive technology. In this context, the study of carbohydrate-binding modules (CBMs), which are involved in guiding the catalytic domains of glycoside hydrolases for polysaccharide degradation, is attracting growing attention. Aiming at the identification of new CBMs, a sugarcane soil metagenomic library was analyzed and an uncharacterized CBM (CBM_E1) was identified. In this study, CBM_E1 was expressed, purified and crystallized. X-ray diffraction data were collected to 1.95 Å resolution. The crystals, which were obtained by the sitting-drop vapour-diffusion method, belonged to space group I23, with unit-cell parameters a = b = c = 88.07 Å.

  15. Product-induced gene expression, a product-responsive reporter assay used to screen metagenomic libraries for enzyme-encoding genes.

    PubMed

    Uchiyama, Taku; Miyazaki, Kentaro

    2010-11-01

    A reporter assay-based screening method for enzymes, which we named product-induced gene expression (PIGEX), was developed and used to screen a metagenomic library for amidases. A benzoate-responsive transcriptional activator, BenR, was placed upstream of the gene encoding green fluorescent protein and used as a sensor. Escherichia coli sensor cells carrying the benR-gfp gene cassette fluoresced in response to benzoate concentrations as low as 10 μM but were completely unresponsive to the substrate benzamide. An E. coli metagenomic library consisting of 96,000 clones was grown in 96-well format in LB medium containing benzamide. The library cells were then cocultivated with sensor cells. Eleven amidase genes were recovered from 143 fluorescent wells; eight of these genes were homologous to known bacterial amidase genes while three were novel genes. In addition to their activity toward benzamide, the enzymes were active toward various substrates, including d- and l-amino acid amides, and displayed enantioselectivity. Thus, we demonstrated that PIGEX is an effective approach for screening novel enzymes based on product detection.

  16. Meta-Storms: efficient search for similar microbial communities based on a novel indexing scheme and similarity score for metagenomic data.

    PubMed

    Su, Xiaoquan; Xu, Jian; Ning, Kang

    2012-10-01

    database management and search system to quickly identify similar metagenomic samples from a large pool of samples. Contact: ningkang@qibebt.ac.cn. Supplementary data are available at Bioinformatics online.

  17. Unlocking the potential of metagenomics through replicated experimental design.

    PubMed

    Knight, Rob; Jansson, Janet; Field, Dawn; Fierer, Noah; Desai, Narayan; Fuhrman, Jed A; Hugenholtz, Phil; van der Lelie, Daniel; Meyer, Folker; Stevens, Rick; Bailey, Mark J; Gordon, Jeffrey I; Kowalchuk, George A; Gilbert, Jack A

    2012-06-07

    Metagenomics holds enormous promise for discovering novel enzymes and organisms that are biomarkers or drivers of processes relevant to disease, industry and the environment. In the past two years, we have seen a paradigm shift in metagenomics to the application of cross-sectional and longitudinal studies enabled by advances in DNA sequencing and high-performance computing. These technologies now make it possible to broadly assess microbial diversity and function, allowing systematic investigation of the largely unexplored frontier of microbial life. To achieve this aim, the global scientific community must collaborate and agree upon common objectives and data standards to enable comparative research across the Earth's microbiome. Improvements in comparability of data will facilitate the study of biotechnologically relevant processes, such as bioprospecting for new glycoside hydrolases or identifying novel energy sources.

  18. A unique circovirus-like genome detected in pig feces

    USDA-ARS's Scientific Manuscript database

    Using a metagenomic approach and molecular cloning methods, we identified, cloned, and sequenced the complete genome of a novel circular DNA virus, porcine stool-associated virus (PoSCV4), from pig feces. Phylogenetic analysis of the deduced replication initiator protein showed that PoSCV4 is most r...

  19. Current Advances on Virus Discovery and Diagnostic Role of Viral Metagenomics in Aquatic Organisms

    PubMed Central

    Munang'andu, Hetron M.; Mugimba, Kizito K.; Byarugaba, Denis K.; Mutoloki, Stephen; Evensen, Øystein

    2017-01-01

    The global expansion of the aquaculture industry has brought with it a corresponding increase of novel viruses infecting different aquatic organisms. These emerging viral pathogens have proved to be a challenge to the use of traditional cell-cultures and immunoassays for identification of new viruses, especially in situations where the novel viruses are unculturable and no antibodies exist for their identification. Viral metagenomics has the potential to identify novel viruses without prior knowledge of their genomic sequence data and may provide a solution for the study of unculturable viruses. This review provides a synopsis of the contribution of viral metagenomics to the discovery of viruses infecting different aquatic organisms as well as its potential role in viral diagnostics. High throughput Next Generation sequencing (NGS) and library construction used in metagenomic projects have simplified the task of generating complete viral genomes, unlike the challenge faced in traditional methods that use multiple primers targeted at different segments and VPs to generate the entire genome of a novel virus. In terms of diagnostics, studies carried out thus far show that viral metagenomics has the potential to serve as a multifaceted tool able to study and identify etiological agents of single infections, co-infections, tissue tropism, profiling viral infections of different aquatic organisms, epidemiological monitoring of disease prevalence, evolutionary phylogenetic analyses, and the study of genomic diversity in quasispecies viruses. With sequencing technologies and bioinformatics analytical tools becoming cheaper and easier, we anticipate that metagenomics will soon become a routine tool for the discovery, study, and identification of novel pathogens including viruses to enable timely disease control for emerging diseases in aquaculture. PMID:28382024

  20. Tracking microbial colonization in fecal microbiota transplantation experiments via genome-resolved metagenomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, Sonny T. M.; Kahn, Stacy A.; Delmont, Tom O.

    Fecal microbiota transplantation (FMT) is an effective treatment for recurrent Clostridium difficile infection and shows promise for treating other medical conditions associated with intestinal dysbioses. However, we lack a sufficient understanding of which microbial populations successfully colonize the recipient gut, and the widely used approaches to study the microbial ecology of FMT experiments fail to provide enough resolution to identify populations that are likely responsible for FMT-derived benefits. Here, we used shotgun metagenomics together with assembly and binning strategies to reconstruct metagenome-assembled genomes (MAGs) from fecal samples of a single FMT donor. We then used metagenomic mapping to track the occurrence and distribution patterns of donor MAGs in two FMT recipients. Our analyses revealed that 22% of the 92 highly complete bacterial MAGs that we identified from the donor successfully colonized and remained abundant in two recipients for at least 8 weeks. Most MAGs with a high colonization rate belonged to the order Bacteroidales. The vast majority of those that lacked evidence of colonization belonged to the order Clostridiales, and colonization success was negatively correlated with the number of genes related to sporulation. Our analysis of 151 publicly available gut metagenomes showed that the donor MAGs that colonized both recipients were prevalent, and the ones that colonized neither were rare across the participants of the Human Microbiome Project. Although our dataset showed a link between taxonomy and the colonization ability of a given MAG, we also identified MAGs that belong to the same taxon with different colonization properties, highlighting the importance of an appropriate level of resolution to explore the functional basis of colonization and to identify targets for cultivation, hypothesis generation, and testing in model systems. Lastly, the analytical strategy adopted in our study can provide genomic insights

  1. Tracking microbial colonization in fecal microbiota transplantation experiments via genome-resolved metagenomics

    DOE PAGES

    Lee, Sonny T. M.; Kahn, Stacy A.; Delmont, Tom O.; ...

    2017-05-04

    Fecal microbiota transplantation (FMT) is an effective treatment for recurrent Clostridium difficile infection and shows promise for treating other medical conditions associated with intestinal dysbioses. However, we lack a sufficient understanding of which microbial populations successfully colonize the recipient gut, and the widely used approaches to study the microbial ecology of FMT experiments fail to provide enough resolution to identify populations that are likely responsible for FMT-derived benefits. Here, we used shotgun metagenomics together with assembly and binning strategies to reconstruct metagenome-assembled genomes (MAGs) from fecal samples of a single FMT donor. We then used metagenomic mapping to track the occurrence and distribution patterns of donor MAGs in two FMT recipients. Our analyses revealed that 22% of the 92 highly complete bacterial MAGs that we identified from the donor successfully colonized and remained abundant in two recipients for at least 8 weeks. Most MAGs with a high colonization rate belonged to the order Bacteroidales. The vast majority of those that lacked evidence of colonization belonged to the order Clostridiales, and colonization success was negatively correlated with the number of genes related to sporulation. Our analysis of 151 publicly available gut metagenomes showed that the donor MAGs that colonized both recipients were prevalent, and the ones that colonized neither were rare across the participants of the Human Microbiome Project. Although our dataset showed a link between taxonomy and the colonization ability of a given MAG, we also identified MAGs that belong to the same taxon with different colonization properties, highlighting the importance of an appropriate level of resolution to explore the functional basis of colonization and to identify targets for cultivation, hypothesis generation, and testing in model systems. Lastly, the analytical strategy adopted in our study can provide genomic insights

  2. Colonic Mucosal Microbiota in Colorectal Cancer: A Single-Center Metagenomic Study in Saudi Arabia.

    PubMed

    Alomair, Ahmed O; Masoodi, Ibrahim; Alyamani, Essam J; Allehibi, Abed A; Qutub, Adel N; Alsayari, Khalid N; Altammami, Musaad A; Alshanqeeti, Ali S

    2018-01-01

    Because genetic and geographic variations in intestinal microbiota are known to exist, the focus of this study was to establish an estimation of microbiota in colorectal cancer (CRC) patients in Saudi Arabia by means of metagenomic studies. From July 2010 to November 2012, colorectal cancer patients attending our hospital were enrolled for the metagenomic studies. All underwent clinical, endoscopic, and histological assessment. Mucosal microbiota samples were collected from each patient by jet-flushing colonic mucosa with distilled water at unified segments of the colon, followed by aspiration, during colonoscopy. Total purified dsDNA was extracted and quantified prior to metagenomic sequencing using an Illumina platform. Satisfactory DNA samples (n = 29) were subjected to metagenomics studies, followed by comprehensive comparative phylogenetic analysis. An equal number of healthy age-matched controls were also examined for colonic mucosal microbiota. Metagenomics data on 29 patients (14 females) in the age range 38-77 years were analyzed. The majority 11 (37%) of our patients were overweight (BMI = 25-30). Rectal bleeding was the presenting symptom in 18/29 (62%), while symptomatic anemia was the presenting symptom in 11/29 (37%). The location of colon cancer was rectal in 14 (48%), while cecal growth was observed in 8 (27%). Hepatic flexure growth was found in 1 (3%), descending colonic growth was found in 2 (6%), and 4 (13%) patients had transverse colon growth. The metagenomics analysis was carried out, and a total of 3.58G reads were sequenced, and about 321.91G data were used in the analysis. This study identified 11 genera specific to colorectal cancer patients when compared to genera in the control group. Bacteroides fragilis and Fusobacterium were found to be significantly prevalent in the carcinoma group when compared to the control group. The current study has given an insight into the microbiota of colorectal cancer patients in Saudi Arabia and has

  3. Ancient DNA analysis identifies marine mollusc shells as new metagenomic archives of the past.

    PubMed

    Der Sarkissian, Clio; Pichereau, Vianney; Dupont, Catherine; Ilsøe, Peter C; Perrigault, Mickael; Butler, Paul; Chauvaud, Laurent; Eiríksson, Jón; Scourse, James; Paillard, Christine; Orlando, Ludovic

    2017-09-01

    Marine mollusc shells enclose a wealth of information on coastal organisms and their environment. Their life history traits as well as (palaeo-) environmental conditions, including temperature, food availability, salinity and pollution, can be traced through the analysis of their shell (micro-) structure and biogeochemical composition. Adding to this list, the DNA entrapped in shell carbonate biominerals potentially offers a novel and complementary proxy both for reconstructing palaeoenvironments and tracking mollusc evolutionary trajectories. Here, we assess this potential by applying DNA extraction, high-throughput shotgun DNA sequencing and metagenomic analyses to marine mollusc shells spanning the last ~7,000 years. We report successful DNA extraction from shells, including a variety of ancient specimens, and find that DNA recovery is highly dependent on their biomineral structure, carbonate layer preservation and disease state. We demonstrate positive taxonomic identification of mollusc species using a combination of mitochondrial DNA genomes, barcodes, genome-scale data and metagenomic approaches. We also find shell biominerals to contain a diversity of microbial DNA from the marine environment. Finally, we reconstruct genomic sequences of organisms closely related to the Vibrio tapetis bacteria from Manila clam shells previously diagnosed with Brown Ring Disease. Our results reveal marine mollusc shells as novel genetic archives of the past, which opens new perspectives in ancient DNA research, with the potential to reconstruct the evolutionary history of molluscs, microbial communities and pathogens in the face of environmental changes. Other future applications include conservation of endangered mollusc species and aquaculture management. © 2017 John Wiley & Sons Ltd.

  4. SUPER-FOCUS: A tool for agile functional analysis of shotgun metagenomic data

    DOE PAGES

    Silva, Genivaldo Gueiros Z.; Green, Kevin T.; Dutilh, Bas E.; ...

    2015-10-09

    Analyzing the functional profile of a microbial community from unannotated shotgun sequencing reads is one of the important goals in metagenomics. Functional profiling has valuable applications in biological research because it identifies the abundances of the functional genes of the organisms present in the original sample, answering the question what they can do. Currently, available tools do not scale well with increasing data volumes, which is important because both the number and lengths of the reads produced by sequencing platforms keep increasing. Here, we introduce SUPER-FOCUS, SUbsystems Profile by databasE Reduction using FOCUS, an agile homology-based approach using a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. We tested SUPER-FOCUS with over 70 real metagenomes, the results showing that it accurately predicts the subsystems present in the profiled microbial communities, and is up to 1000 times faster than other tools.

  5. Prospecting Biotechnologically-Relevant Monooxygenases from Cold Sediment Metagenomes: An In Silico Approach

    DOE PAGES

    Musumeci, Matias A.; Lozada, Mariana; Rial, Daniela V.; ...

    2017-04-09

    The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer-Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. As a result, this work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments.

  6. Prospecting Biotechnologically-Relevant Monooxygenases from Cold Sediment Metagenomes: An In Silico Approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Musumeci, Matias A.; Lozada, Mariana; Rial, Daniela V.

    The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer-Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. As a result, this work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments.

  7. Prospecting Biotechnologically-Relevant Monooxygenases from Cold Sediment Metagenomes: An In Silico Approach.

    PubMed

    Musumeci, Matías A; Lozada, Mariana; Rial, Daniela V; Mac Cormack, Walter P; Jansson, Janet K; Sjöling, Sara; Carroll, JoLynn; Dionisi, Hebe M

    2017-04-09

    The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer-Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. This work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments.

  8. Prospecting Biotechnologically-Relevant Monooxygenases from Cold Sediment Metagenomes: An In Silico Approach

    PubMed Central

    Musumeci, Matías A.; Lozada, Mariana; Rial, Daniela V.; Mac Cormack, Walter P.; Jansson, Janet K.; Sjöling, Sara; Carroll, JoLynn; Dionisi, Hebe M.

    2017-01-01

    The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer–Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. This work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments. PMID:28397770

  9. Exploiting HPC Platforms for Metagenomics: Challenges and Opportunities (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Canon, Shane

    2018-01-24

    DOE JGI's Zhong Wang, chair of the High-performance Computing session, gives a brief introduction before Berkeley Lab's Shane Canon talks about "Exploiting HPC Platforms for Metagenomics: Challenges and Opportunities" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  10. Genomic clones for human cholinesterase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kott, M.; Venta, P.J.; Larsen, J.

    1987-05-01

    A human genomic library was prepared from peripheral white blood cells from a single donor by inserting an MboI partial digest into BamHI poly-linker sites of EMBL3. This library was screened using an oligolabeled human cholinesterase cDNA probe over 700 bp long. The latter probe was obtained from a human basal ganglia cDNA library. Of approximately 2 million clones screened with high stringency conditions several positive clones were identified; two have been plaque purified. One of these clones has been partially mapped using restriction enzymes known to cut within the coded region of the cDNA for human serum cholinesterase. Hybridization of the fragments and their sizes are as expected if the genomic clone is cholinesterase. Sequencing of the DNA fragments in M13 is in progress to verify the identity of the clone and the location of introns.

  11. Ethical issues in animal cloning.

    PubMed

    Fiester, Autumn

    2005-01-01

    The issue of human reproductive cloning has recently received a great deal of attention in public discourse. Bioethicists, policy makers, and the media have been quick to identify the key ethical issues involved in human reproductive cloning and to argue, almost unanimously, for an international ban on such attempts. Meanwhile, scientists have proceeded with extensive research agendas in the cloning of animals. Despite this research, there has been little public discussion of the ethical issues raised by animal cloning projects. Polling data show that the public is decidedly against the cloning of animals. To understand the public's reaction and fill the void of reasoned debate about the issue, we need to review the possible objections to animal cloning and assess the merits of the anti-animal cloning stance. Some objections to animal cloning (e.g., the impact of cloning on the population of unwanted animals) can be easily addressed, while others (e.g., the health of cloned animals) require more serious attention by the public and policy makers.

  12. MetaSort untangles metagenome assembly by reducing microbial community complexity

    PubMed Central

    Ji, Peifeng; Zhang, Yanming; Wang, Jinfeng; Zhao, Fangqing

    2017-01-01

    Most current approaches to analyse metagenomic data rely on reference genomes. Novel microbial communities extend far beyond the coverage of reference databases and de novo metagenome assembly from complex microbial communities remains a great challenge. Here we present a novel experimental and bioinformatic framework, metaSort, for effective construction of bacterial genomes from metagenomic samples. MetaSort provides a sorted mini-metagenome approach based on flow cytometry and single-cell sequencing methodologies, and employs new computational algorithms to efficiently recover high-quality genomes from the sorted mini-metagenome by the complementary of the original metagenome. Through extensive evaluations, we demonstrated that metaSort has an excellent and unbiased performance on genome recovery and assembly. Furthermore, we applied metaSort to an unexplored microflora colonized on the surface of marine kelp and successfully recovered 75 high-quality genomes at one time. This approach will greatly improve access to microbial genomes from complex or novel communities. PMID:28112173

  13. [Construction of large fragment metagenome library of natural mangrove soil].

    PubMed

    Jiang, Yun-Xia; Zheng, Tian-Ling

    2007-11-01

    With our optimized direct extraction method, the percentage of large fragment DNA in the total extracted mangrove soil DNA was significantly increased. The large fragment metagenome library derived from natural mangrove soil over four seasons was successfully constructed by the optimized DNA extraction and electroelution purification method. All of the clones had recombinant Cosmids and each differed in their fragment profiles when Cosmid DNA was extracted from 12 randomly picked colonies and digested with BamHI. The average insert size for this library was larger than 35 kbp. This culturing-independent library encompassed at least 335 Mbp of valuable genetic information of mangrove soil microbes. It allowed mining of valuable intertidal microbial resources to become a reality. It is a recommended method for those researchers who have still not circumvented the large insert environmental libraries or for those beginning research in this field, so that they can avoid repetitive, fussy work.

  14. Metagenomic analysis of viral diversity in respiratory samples from patients with respiratory tract infections in Kuwait.

    PubMed

    Madi, Nada; Al-Nakib, Widad; Mustafa, Abu Salim; Habibi, Nazima

    2018-03-01

    A metagenomic approach based on target independent next-generation sequencing has become a known method for the detection of both known and novel viruses in clinical samples. This study aimed to use the metagenomic sequencing approach to characterize the viral diversity in respiratory samples from patients with respiratory tract infections. We have investigated 86 respiratory samples received from various hospitals in Kuwait between 2015 and 2016 for the diagnosis of respiratory tract infections. A metagenomic approach using the next-generation sequencer to characterize viruses was used. According to the metagenomic analysis, an average of 145,019 reads were identified, and 2% of these reads were of viral origin. Also, metagenomic analysis of the viral sequences revealed many known respiratory viruses, which were detected in 30.2% of the clinical samples. Also, sequences of non-respiratory viruses were detected in 14% of the clinical samples, while sequences of non-human viruses were detected in 55.8% of the clinical samples. The average genome coverage of the viruses was 12% with the highest genome coverage of 99.2% for respiratory syncytial virus, and the lowest was 1% for torque teno midi virus 2. Our results showed 47.7% agreement between multiplex Real-Time PCR and metagenomics sequencing in the detection of respiratory viruses in the clinical samples. Though there are some difficulties in applying this method to clinical samples such as specimen quality, these observations are indicative of the promising utility of the metagenomic sequencing approach for the identification of respiratory viruses in patients with respiratory tract infections. © 2017 Wiley Periodicals, Inc.

  15. Modeling ecological drivers in marine viral communities using comparative metagenomics and network analyses.

    PubMed

    Hurwitz, Bonnie L; Westveld, Anton H; Brum, Jennifer R; Sullivan, Matthew B

    2014-07-22

    Long-standing questions in marine viral ecology are centered on understanding how viral assemblages change along gradients in space and time. However, investigating these fundamental ecological questions has been challenging due to incomplete representation of naturally occurring viral diversity in single gene- or morphology-based studies and an inability to identify up to 90% of reads in viral metagenomes (viromes). Although protein clustering techniques provide a significant advance by helping organize this unknown metagenomic sequence space, they typically use only ∼75% of the data and rely on assembly methods not yet tuned for naturally occurring sequence variation. Here, we introduce an annotation- and assembly-free strategy for comparative metagenomics that combines shared k-mer and social network analyses (regression modeling). This robust statistical framework enables visualization of complex sample networks and determination of ecological factors driving community structure. Application to 32 viromes from the Pacific Ocean Virome dataset identified clusters of samples broadly delineated by photic zone and revealed that geographic region, depth, and proximity to shore were significant predictors of community structure. Within subsets of this dataset, depth, season, and oxygen concentration were significant drivers of viral community structure at a single open ocean station, whereas variability along onshore-offshore transects was driven by oxygen concentration in an area with an oxygen minimum zone and not depth or proximity to shore, as might be expected. Together these results demonstrate that this highly scalable approach using complete metagenomic network-based comparisons can both test and generate hypotheses for ecological investigation of viral and microbial communities in nature.
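
    The shared k-mer idea in the abstract above can be illustrated with a small sketch. This is not the authors' pipeline: it assumes a plain Jaccard index over k-mer sets as the similarity measure, and the function names (kmers, sample_kmers, shared_fraction, similarity_network) are hypothetical. The regression modeling step described in the abstract is not reproduced here.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all length-k substrings of seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sample_kmers(reads, k):
    """Pool the k-mers of every read in one sample into a single set."""
    pooled = set()
    for read in reads:
        pooled |= kmers(read, k)
    return pooled


def shared_fraction(a, b):
    """Jaccard index of two k-mer sets (an assumed similarity measure)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similarity_network(samples, k=4):
    """Map each unordered sample pair to its shared k-mer fraction.

    `samples` maps a sample name to a list of read strings. No assembly or
    annotation is involved; only raw read content is compared.
    """
    profiles = {name: sample_kmers(reads, k) for name, reads in samples.items()}
    return {
        (x, y): shared_fraction(profiles[x], profiles[y])
        for x, y in combinations(sorted(profiles), 2)
    }
```

    The resulting pairwise weights form the edges of a sample network. Real viromes would need a much larger k and a memory-efficient k-mer counter than this toy version provides.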

  16. Modeling ecological drivers in marine viral communities using comparative metagenomics and network analyses

    PubMed Central

    Hurwitz, Bonnie L.; Westveld, Anton H.; Brum, Jennifer R.; Sullivan, Matthew B.

    2014-01-01

    Long-standing questions in marine viral ecology are centered on understanding how viral assemblages change along gradients in space and time. However, investigating these fundamental ecological questions has been challenging due to incomplete representation of naturally occurring viral diversity in single gene- or morphology-based studies and an inability to identify up to 90% of reads in viral metagenomes (viromes). Although protein clustering techniques provide a significant advance by helping organize this unknown metagenomic sequence space, they typically use only ∼75% of the data and rely on assembly methods not yet tuned for naturally occurring sequence variation. Here, we introduce an annotation- and assembly-free strategy for comparative metagenomics that combines shared k-mer and social network analyses (regression modeling). This robust statistical framework enables visualization of complex sample networks and determination of ecological factors driving community structure. Application to 32 viromes from the Pacific Ocean Virome dataset identified clusters of samples broadly delineated by photic zone and revealed that geographic region, depth, and proximity to shore were significant predictors of community structure. Within subsets of this dataset, depth, season, and oxygen concentration were significant drivers of viral community structure at a single open ocean station, whereas variability along onshore–offshore transects was driven by oxygen concentration in an area with an oxygen minimum zone and not depth or proximity to shore, as might be expected. Together these results demonstrate that this highly scalable approach using complete metagenomic network-based comparisons can both test and generate hypotheses for ecological investigation of viral and microbial communities in nature. PMID:25002514

  17. Taxator-tk: precise taxonomic assignment of metagenomes by fast approximation of evolutionary neighborhoods

    PubMed Central

    Dröge, J.; Gregor, I.; McHardy, A. C.

    2015-01-01

    Motivation: Metagenomics characterizes microbial communities by random shotgun sequencing of DNA isolated directly from an environment of interest. An essential step in computational metagenome analysis is taxonomic sequence assignment, which allows identifying the sequenced community members and reconstructing taxonomic bins with sequence data for the individual taxa. For the massive datasets generated by next-generation sequencing technologies, this cannot be performed with de-novo phylogenetic inference methods. We describe an algorithm and the accompanying software, taxator-tk, which performs taxonomic sequence assignment by fast approximate determination of evolutionary neighbors from sequence similarities. Results: Taxator-tk was precise in its taxonomic assignment across all ranks and taxa for a range of evolutionary distances and for short as well as for long sequences. In addition to the taxonomic binning of metagenomes, it is well suited for profiling microbial communities from metagenome samples because it identifies bacterial, archaeal and eukaryotic community members without being affected by varying primer binding strengths, as in marker gene amplification, or copy number variations of marker genes across different taxa. Taxator-tk has an efficient, parallelized implementation that allows the assignment of 6 Gb of sequence data per day on a standard multiprocessor system with 10 CPU cores and microbial RefSeq as the genomic reference data. Availability and implementation: Taxator-tk source and binary program files are publicly available at http://algbio.cs.uni-duesseldorf.de/software/. Contact: Alice.McHardy@uni-duesseldorf.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25388150

  18. Data on partial polyhydroxyalkanoate synthase genes (phaC) mined from Aaptos aaptos marine sponge-associated bacteria metagenome.

    PubMed

    Amelia, Tan Suet May; Amirul, Al-Ashraf Abdullah; Bhubalan, Kesaven

    2018-02-01

    We report data associated with the identification of three polyhydroxyalkanoate synthase genes (phaC) isolated from the marine bacteria metagenome of Aaptos aaptos marine sponge in the waters of Bidong Island, Terengganu, Malaysia. Our data describe the extraction of bacterial metagenome from sponge tissue, measurement of purity and concentration of extracted metagenome, polymerase chain reaction (PCR)-mediated amplification using degenerate primers targeting Class I and II phaC genes, sequencing at First BASE Laboratories Sdn Bhd, and phylogenetic analysis of identified and known phaC genes. The partial nucleotide sequences were aligned, refined, compared with the Basic Local Alignment Search Tool (BLAST) databases, and released online in GenBank. The data include the identified partial putative phaC and their GenBank accession numbers, which are Rhodocista sp. phaC (MF457754), Pseudomonas sp. phaC (MF437016), and an uncultured bacterium AR5-9d_16 phaC (MF457753).

  19. Fast and sensitive taxonomic classification for metagenomics with Kaiju

    PubMed Central

    Menzel, Peter; Ng, Kim Lee; Krogh, Anders

    2016-01-01

    Metagenomics emerged as an important field of research not only in microbial ecology but also for human health and disease, and metagenomic studies are performed on increasingly larger scales. While recent taxonomic classification programs achieve high speed by comparing genomic k-mers, they often lack sensitivity for overcoming evolutionary divergence, so that large fractions of the metagenomic reads remain unclassified. Here we present the novel metagenome classifier Kaiju, which finds maximum (in-)exact matches on the protein-level using the Burrows–Wheeler transform. We show in a genome exclusion benchmark that Kaiju classifies reads with higher sensitivity and similar precision compared with current k-mer-based classifiers, especially in genera that are underrepresented in reference databases. We also demonstrate that Kaiju classifies up to 10 times more reads in real metagenomes. Kaiju can process millions of reads per minute and can run on a standard PC. Source code and web server are available at http://kaiju.binf.ku.dk. PMID:27071849

  20. Fast and sensitive taxonomic classification for metagenomics with Kaiju.

    PubMed

    Menzel, Peter; Ng, Kim Lee; Krogh, Anders

    2016-04-13

    Metagenomics emerged as an important field of research not only in microbial ecology but also for human health and disease, and metagenomic studies are performed on increasingly larger scales. While recent taxonomic classification programs achieve high speed by comparing genomic k-mers, they often lack sensitivity for overcoming evolutionary divergence, so that large fractions of the metagenomic reads remain unclassified. Here we present the novel metagenome classifier Kaiju, which finds maximum (in-)exact matches on the protein-level using the Burrows-Wheeler transform. We show in a genome exclusion benchmark that Kaiju classifies reads with higher sensitivity and similar precision compared with current k-mer-based classifiers, especially in genera that are underrepresented in reference databases. We also demonstrate that Kaiju classifies up to 10 times more reads in real metagenomes. Kaiju can process millions of reads per minute and can run on a standard PC. Source code and web server are available at http://kaiju.binf.ku.dk.

  1. Fluorescent Amplified-Fragment Length Polymorphism Genotyping of Neisseria meningitidis Identifies Clones Associated with Invasive Disease

    PubMed Central

    Goulding, Jonathan N.; Hookey, John V.; Stanley, John; Olver, Will; Neal, Keith R.; Ala'Aldeen, Dlawer A. A.; Arnold, Catherine

    2000-01-01

    Fluorescent amplified-fragment length polymorphism (FAFLP), a genotyping technique with phylogenetic significance, was applied to 123 isolates of Neisseria meningitidis. Nine of these were from an outbreak in a British university; 9 were from a recent outbreak in Pontypridd, Glamorgan; 15 were from sporadic cases of meningococcal disease; 26 were from the National Collection of Type Cultures; 58 were carrier isolates from Ironville, Derbyshire; 1 was a disease isolate from Ironville; and five were representatives of invasive clones of N. meningitidis. FAFLP analysis results were compared with previously published multilocus sequence typing (MLST) and pulsed-field gel electrophoresis (PFGE) results. FAFLP was able to identify hypervirulent, hyperendemic lineages (invasive clones) of N. meningitidis as well as did MLST. PFGE did not discriminate between two strains from the outbreak that were classified as similar but distinct by FAFLP. The results suggest that high resolution of N. meningitidis for outbreak and other epidemiological analyses is more cost efficient by FAFLP than by sequencing procedures. PMID:11101599

  2. MetaCRAST: reference-guided extraction of CRISPR spacers from unassembled metagenomes.

    PubMed

    Moller, Abraham G; Liang, Chun

    2017-01-01

    Clustered regularly interspaced short palindromic repeat (CRISPR) systems are the adaptive immune systems of bacteria and archaea against viral infection. While CRISPRs have been exploited as a tool for genetic engineering, their spacer sequences can also provide valuable insights into microbial ecology by linking environmental viruses to their microbial hosts. Despite this importance, metagenomic CRISPR detection remains a major challenge. Here we present a reference-guided CRISPR spacer detection tool (Metagenomic CRISPR Reference-Aided Search Tool, MetaCRAST) that constrains searches based on user-specified direct repeats (DRs). These DRs could be expected from assembly or taxonomic profiles of metagenomes. We compared the performance of MetaCRAST to those of two existing metagenomic CRISPR detection tools, Crass and MinCED, using both real and simulated acid mine drainage (AMD) and enhanced biological phosphorus removal (EBPR) metagenomes. Our evaluation shows MetaCRAST improves CRISPR spacer detection in real metagenomes compared to the de novo CRISPR detection methods Crass and MinCED. Evaluation on simulated metagenomes shows it performs better than de novo tools for Illumina metagenomes and comparably for 454 metagenomes. It also has comparable performance dependence on read length and community composition, run time, and accuracy to these tools. MetaCRAST is implemented in Perl, parallelizable through the Many Core Engine (MCE), and takes metagenomic sequence reads and direct repeat queries (FASTA or FASTQ) as input. It is freely available for download at https://github.com/molleraj/MetaCRAST.
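
    The idea of constraining a spacer search with a user-specified direct repeat can be illustrated with a minimal sketch. This is not MetaCRAST's implementation (which is written in Perl): the function name `extract_spacers`, the length bounds, the exact-match search, and the toy sequences below are all assumptions made for illustration only.

```python
def extract_spacers(read, direct_repeat, min_len=20, max_len=60):
    """Return sequences lying between consecutive exact copies of a direct repeat.

    Hypothetical helper: a real tool would also tolerate mismatches in the
    repeat and handle reverse-complemented reads.
    """
    # Collect start positions of non-overlapping occurrences of the repeat.
    positions = []
    start = read.find(direct_repeat)
    while start != -1:
        positions.append(start)
        start = read.find(direct_repeat, start + len(direct_repeat))

    # A candidate spacer is the stretch between two neighbouring repeats.
    spacers = []
    for left, right in zip(positions, positions[1:]):
        spacer = read[left + len(direct_repeat):right]
        if min_len <= len(spacer) <= max_len:  # assumed plausibility filter
            spacers.append(spacer)
    return spacers


# Toy data (not from any real CRISPR array).
toy_dr = "GGGAAACCCGGG"
spacer_1 = "ACGTACGTACGTACGTACGTACGT"
spacer_2 = "TTGACCATGGTTGACCATGGTTGACC"
toy_read = toy_dr + spacer_1 + toy_dr + spacer_2 + toy_dr
found = extract_spacers(toy_read, toy_dr)
```

    Reads without at least two copies of the repeat yield an empty list, which is what restricts the search to arrays matching the supplied DR.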

  3. Tentacle: distributed quantification of genes in metagenomes.

    PubMed

    Boulund, Fredrik; Sjögren, Anders; Kristiansson, Erik

    2015-01-01

    In metagenomics, microbial communities are sequenced at increasingly high resolution, generating datasets with billions of DNA fragments. Novel methods that can efficiently process the growing volumes of sequence data are necessary for the accurate analysis and interpretation of existing and upcoming metagenomes. Here we present Tentacle, which is a novel framework that uses distributed computational resources for gene quantification in metagenomes. Tentacle is implemented using a dynamic master-worker approach in which DNA fragments are streamed via a network and processed in parallel on worker nodes. Tentacle is modular, extensible, and comes with support for six commonly used sequence aligners. It is easy to adapt Tentacle to different applications in metagenomics and easy to integrate into existing workflows. Evaluations show that Tentacle scales very well with increasing computing resources. We illustrate the versatility of Tentacle on three different use cases. Tentacle is written for Linux in Python 2.7 and is published as open source under the GNU General Public License (v3). Documentation, tutorials, installation instructions, and the source code are freely available online at: http://bioinformatics.math.chalmers.se/tentacle.

  4. Shotgun metagenomic data streams: surfing without fear

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berendzen, Joel R

    2010-12-06

    Timely information about bio-threat prevalence, consequence, propagation, attribution, and mitigation is needed to support decision-making, both routinely and in a crisis. One DNA sequencer can stream 25 Gbp of information per day, but sampling strategies and analysis techniques are needed to turn raw sequencing power into actionable knowledge. Shotgun metagenomics can enable biosurveillance at the level of a single city, hospital, or airplane. Metagenomics characterizes viruses and bacteria from complex environments such as soil, air filters, or sewage. Unlike targeted-primer-based sequencing, shotgun methods are not blind to sequences that are truly novel, and they can measure absolute prevalence. Shotgun metagenomic sampling can be non-invasive, efficient, and inexpensive while being informative. We have developed analysis techniques for shotgun metagenomic sequencing that rely upon phylogenetic signature patterns. They work by indexing local sequence patterns in a manner similar to web search engines. Our methods are laptop-fast, and favorable scaling properties ensure they will be sustainable as sequencing methods grow. We show examples of application to soil metagenomic samples.

  5. Environmental Metagenomics: The Data Assembly and Data Analysis Perspectives

    NASA Astrophysics Data System (ADS)

    Kumar, Vinay; Maitra, S. S.; Shukla, Rohit Nandan

    2015-03-01

    Novel gene finding is one of the emerging fields in environmental research. In past decades, research focused mainly on the discovery of microorganisms capable of degrading a particular compound. Many methods are available in the literature for the cultivation and screening of these novel microorganisms. All of these methods are efficient for screening microbes that can be cultivated in the laboratory. Microorganisms that live in extreme conditions such as hot springs, frozen glaciers, and acid mine drainage cannot be cultivated in the laboratory because of incomplete knowledge about their growth requirements, such as temperature and nutrients, and their mutual dependence on each other. The microbes that can be cultivated correspond to less than 1% of the total microbes present on Earth. The remaining 99%, the uncultivated majority, remains inaccessible. Metagenomics transcends the culture requirements of microbes. In metagenomics, DNA is extracted directly from environmental samples such as soil, seawater, and acid mine drainage, followed by construction and screening of a metagenomic library. With ongoing research, a huge amount of metagenomic data is accumulating. Understanding these data is an essential step in extracting novel genes of industrial importance. Various bioinformatics tools have been designed to analyze and annotate the data produced from the metagenome. The bioinformatic requirements of metagenomic data analysis differ in theory and practice. This paper reviews the tools that are available for metagenomic data analysis and the capabilities of such tools: what they can do and their web availability.

  6. Evaluating the Quantitative Capabilities of Metagenomic Analysis Software.

    PubMed

    Kerepesi, Csaba; Grolmusz, Vince

    2016-05-01

    DNA sequencing technologies are applied widely and frequently today to describe metagenomes, i.e., microbial communities in environmental or clinical samples, without the need for culturing them. These technologies usually return short (100-300 base-pairs long) DNA reads, and these reads are processed by metagenomic analysis software that assigns phylogenetic composition information to the dataset. Here we evaluate three metagenomic analysis software packages (AmphoraNet, a webserver implementation of AMPHORA2; MG-RAST; and MEGAN5) for their capabilities of assigning quantitative phylogenetic information for the data, describing the frequency of appearance of the microorganisms of the same taxa in the sample. The difficulties of the task arise from the fact that longer genomes produce more reads from the same organism than shorter genomes, and some software assigns higher frequencies to species with longer genomes than to those with shorter ones. This phenomenon is called the "genome length bias." Dozens of complex artificial metagenome benchmarks can be found in the literature. Because of the complexity of those benchmarks, it is usually difficult to judge the resistance of a metagenomic software package to this "genome length bias." Therefore, we have made a simple benchmark for the evaluation of the "taxon-counting" in a metagenomic sample: we have taken the same number of copies of three full bacterial genomes of different lengths, broken them up randomly into short reads with an average length of 150 bp, and mixed the reads, creating our simple benchmark. Because of its simplicity, the benchmark is not supposed to serve as a mock metagenome, but if a software package fails on that simple task, it will surely fail on most real metagenomes. We applied the three software packages to the benchmark. The ideal quantitative solution would assign the same proportion to the three bacterial taxa. We have found that AMPHORA2/AmphoraNet gave the most accurate results and the other two software were under
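
    The genome length bias described above can be reproduced with a small simulation. The sketch below is not the authors' benchmark: the genome lengths, the number of copies, the read-length jitter, and all function names are assumptions chosen only to show why raw read counts over-represent longer genomes and how dividing by genome length restores equal proportions.

```python
import random


def simulate_read_counts(genome_lengths, copies=5, read_len=150, seed=0):
    """Fragment `copies` of each genome into reads averaging `read_len` bp.

    Only read counts are tracked; no actual sequence is generated.
    """
    rng = random.Random(seed)
    counts = {}
    for name, length in genome_lengths.items():
        n_reads = 0
        for _ in range(copies):
            pos = 0
            while pos < length:
                # Read lengths vary uniformly around the mean (an assumption).
                pos += rng.randint(read_len - 50, read_len + 50)
                n_reads += 1
        counts[name] = n_reads
    return counts


def proportions(values):
    """Normalise a dict of non-negative numbers so they sum to 1."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


def length_corrected(counts, genome_lengths):
    """Divide read counts by genome length before normalising."""
    per_bp = {name: counts[name] / genome_lengths[name] for name in counts}
    return proportions(per_bp)


# Three hypothetical genomes present in equal copy number.
genomes = {"taxon_A": 1_000_000, "taxon_B": 2_000_000, "taxon_C": 4_000_000}
counts = simulate_read_counts(genomes)
raw = proportions(counts)                      # skewed toward taxon_C
corrected = length_corrected(counts, genomes)  # close to one third each
```

    A tool that reports `raw` as taxon abundance shows the bias; one that reports something like `corrected` matches the ideal solution the abstract describes.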

  7. DOE JGI Quality Metrics; Approaches to Scaling and Improving Metagenome Assembly (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Copeland, Alex; Brown, C. Titus

    2011-10-13

    DOE JGI's Alex Copeland on "DOE JGI Quality Metrics" and Michigan State University's C. Titus Brown on "Approaches to Scaling and Improving Metagenome Assembly" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  8. DOE JGI Quality Metrics; Approaches to Scaling and Improving Metagenome Assembly (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Copeland, Alex; Brown, C. Titus

    2018-04-27

    DOE JGI's Alex Copeland on "DOE JGI Quality Metrics" and Michigan State University's C. Titus Brown on "Approaches to Scaling and Improving Metagenome Assembly" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  9. Strain-Level Metagenomic Analysis of the Fermented Dairy Beverage Nunu Highlights Potential Food Safety Risks

    PubMed Central

    Walsh, Aaron M.; Crispie, Fiona; Daari, Kareem; O'Sullivan, Orla; Martin, Jennifer C.; Arthur, Cornelius T.; Claesson, Marcus J.; Scott, Karen P.

    2017-01-01

    ABSTRACT The rapid detection of pathogenic strains in food products is essential for the prevention of disease outbreaks. It has already been demonstrated that whole-metagenome shotgun sequencing can be used to detect pathogens in food but, until recently, strain-level detection of pathogens has relied on whole-metagenome assembly, which is a computationally demanding process. Here we demonstrated that three short-read-alignment-based methods, i.e., MetaMLST, PanPhlAn, and StrainPhlAn, could accurately and rapidly identify pathogenic strains in spinach metagenomes that had been intentionally spiked with Shiga toxin-producing Escherichia coli in a previous study. Subsequently, we employed the methods, in combination with other metagenomics approaches, to assess the safety of nunu, a traditional Ghanaian fermented milk product that is produced by the spontaneous fermentation of raw cow milk. We showed that nunu samples were frequently contaminated with bacteria associated with the bovine gut and, worryingly, we detected putatively pathogenic E. coli and Klebsiella pneumoniae strains in a subset of nunu samples. Ultimately, our work establishes that short-read-alignment-based bioinformatics approaches are suitable food safety tools, and we describe a real-life example of their utilization. IMPORTANCE Foodborne pathogens are responsible for millions of illnesses each year. Here we demonstrate that short-read-alignment-based bioinformatics tools can accurately and rapidly detect pathogenic strains in food products by using shotgun metagenomics data. The methods used here are considerably faster than both traditional culturing methods and alternative bioinformatics approaches that rely on metagenome assembly; therefore, they can potentially be used for more high-throughput food safety testing. Overall, our results suggest that whole-metagenome sequencing can be used as a practical food safety tool to prevent diseases or to link outbreaks to specific food products. 

  10. Strain-Level Metagenomic Analysis of the Fermented Dairy Beverage Nunu Highlights Potential Food Safety Risks.

    PubMed

    Walsh, Aaron M; Crispie, Fiona; Daari, Kareem; O'Sullivan, Orla; Martin, Jennifer C; Arthur, Cornelius T; Claesson, Marcus J; Scott, Karen P; Cotter, Paul D

    2017-08-15

    The rapid detection of pathogenic strains in food products is essential for the prevention of disease outbreaks. It has already been demonstrated that whole-metagenome shotgun sequencing can be used to detect pathogens in food but, until recently, strain-level detection of pathogens has relied on whole-metagenome assembly, which is a computationally demanding process. Here we demonstrated that three short-read-alignment-based methods, i.e., MetaMLST, PanPhlAn, and StrainPhlAn, could accurately and rapidly identify pathogenic strains in spinach metagenomes that had been intentionally spiked with Shiga toxin-producing Escherichia coli in a previous study. Subsequently, we employed the methods, in combination with other metagenomics approaches, to assess the safety of nunu, a traditional Ghanaian fermented milk product that is produced by the spontaneous fermentation of raw cow milk. We showed that nunu samples were frequently contaminated with bacteria associated with the bovine gut and, worryingly, we detected putatively pathogenic E. coli and Klebsiella pneumoniae strains in a subset of nunu samples. Ultimately, our work establishes that short-read-alignment-based bioinformatics approaches are suitable food safety tools, and we describe a real-life example of their utilization. IMPORTANCE Foodborne pathogens are responsible for millions of illnesses each year. Here we demonstrate that short-read-alignment-based bioinformatics tools can accurately and rapidly detect pathogenic strains in food products by using shotgun metagenomics data. The methods used here are considerably faster than both traditional culturing methods and alternative bioinformatics approaches that rely on metagenome assembly; therefore, they can potentially be used for more high-throughput food safety testing. Overall, our results suggest that whole-metagenome sequencing can be used as a practical food safety tool to prevent diseases or to link outbreaks to specific food products.

  11. Effective Analysis of NGS Metagenomic Data with Ultra-Fast Clustering Algorithms (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Li, Weizhong

    2018-02-12

    San Diego Supercomputer Center's Weizhong Li on "Effective Analysis of NGS Metagenomic Data with Ultra-fast Clustering Algorithms" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  12. Metagenomics workflow analysis of endophytic bacteria from oil palm fruits

    NASA Astrophysics Data System (ADS)

    Tanjung, Z. A.; Aditama, R.; Sudania, W. M.; Utomo, C.; Liwang, T.

    2017-05-01

    Next-Generation Sequencing (NGS) has become a powerful sequencing tool for microbial studies and has led to the establishment of the field of metagenomics. This study described a workflow to analyze metagenomics data from a Sequence Read Archive (SRA) file under accession ERP004286 deposited by the University of Sao Paulo. The data were generated by direct sequencing on the 454 pyrosequencing platform and originated from endophytic bacteria of oil palm fruits that were cultured using oil-palm-enriched medium. This workflow used SortMeRNA to separate ribosomal read sequences, Newbler (GS Assembler and GS Mapper) to assemble reads and map them to reference genomes, the BLAST package to identify and annotate contig sequences, and QualiMap for statistical analysis. Eight bacterial species were identified in this study. Enterobacter cloacae was the most abundant species, followed by Citrobacter koseri, Serratia marcescens, Lactococcus lactis subsp. lactis, Klebsiella pneumoniae, Citrobacter amalonaticus, Achromobacter xylosoxidans, and Pseudomonas sp., respectively. All of these species have been reported as endophytic bacteria in various plant species, and each has potential as a plant-growth-promoting bacterium or for other applications in agricultural industries.

  13. A bioinformatic analysis of ribonucleotide reductase genes in phage genomes and metagenomes

    PubMed Central

    2013-01-01

    Background Ribonucleotide reductase (RNR), the enzyme responsible for the formation of deoxyribonucleotides from ribonucleotides, is found in all domains of life and many viral genomes. RNRs are also amongst the most abundant genes identified in environmental metagenomes. This study focused on understanding the distribution, diversity, and evolution of RNRs in phages (viruses that infect bacteria). Hidden Markov Model profiles were used to analyze the proteins encoded by 685 completely sequenced double-stranded DNA phages and 22 environmental viral metagenomes to identify RNR homologs in cultured phages and uncultured viral communities, respectively. Results RNRs were identified in 128 phage genomes, nearly tripling the number of phages known to encode RNRs. Class I RNR was the most common RNR class observed in phages (70%), followed by class II (29%) and class III (28%). Twenty-eight percent of the phages contained genes belonging to multiple RNR classes. RNR class distribution varied according to phage type, isolation environment, and the host’s ability to utilize oxygen. The majority of the phages containing RNRs are Myoviridae (65%), followed by Siphoviridae (30%) and Podoviridae (3%). The phylogeny and genomic organization of phage and host RNRs reveal several distinct evolutionary scenarios involving horizontal gene transfer, co-evolution, and differential selection pressure. Several putative split RNR genes interrupted by self-splicing introns or inteins were identified, providing further evidence for the role of frequent genetic exchange. Finally, viral metagenomic data indicate that RNRs are prevalent and highly dynamic in uncultured viral communities, necessitating future research to determine the environmental conditions under which RNRs provide a selective advantage. Conclusions This comprehensive study describes the distribution, diversity, and evolution of RNRs in phage genomes and environmental viral metagenomes. The distinct distributions of

  14. Reconstructing the Genomic Content of Microbiome Taxa through Shotgun Metagenomic Deconvolution

    PubMed Central

    Carr, Rogan; Shen-Orr, Shai S.; Borenstein, Elhanan

    2013-01-01

    Metagenomics has transformed our understanding of the microbial world, allowing researchers to bypass the need to isolate and culture individual taxa and to directly characterize both the taxonomic and gene compositions of environmental samples. However, associating the genes found in a metagenomic sample with the specific taxa of origin remains a critical challenge. Existing binning methods, based on nucleotide composition or alignment to reference genomes allow only a coarse-grained classification and rely heavily on the availability of sequenced genomes from closely related taxa. Here, we introduce a novel computational framework, integrating variation in gene abundances across multiple samples with taxonomic abundance data to deconvolve metagenomic samples into taxa-specific gene profiles and to reconstruct the genomic content of community members. This assembly-free method is not bounded by various factors limiting previously described methods of metagenomic binning or metagenomic assembly and represents a fundamentally different approach to metagenomic-based genome reconstruction. An implementation of this framework is available at http://elbo.gs.washington.edu/software.html. We first describe the mathematical foundations of our framework and discuss considerations for implementing its various components. We demonstrate the ability of this framework to accurately deconvolve a set of metagenomic samples and to recover the gene content of individual taxa using synthetic metagenomic samples. We specifically characterize determinants of prediction accuracy and examine the impact of annotation errors on the reconstructed genomes. We finally apply metagenomic deconvolution to samples from the Human Microbiome Project, successfully reconstructing genus-level genomic content of various microbial genera, based solely on variation in gene count. These reconstructed genera are shown to correctly capture genus-specific properties. With the accumulation of metagenomic

  15. Resolving prokaryotic taxonomy without rRNA: longer oligonucleotide word lengths improve genome and metagenome taxonomic classification.

    PubMed

    Alsop, Eric B; Raymond, Jason

    2013-01-01

    Oligonucleotide signatures, especially tetranucleotide signatures, have been used as a method for homology binning by exploiting an organism's inherent biases towards the use of specific oligonucleotide words. Tetranucleotide signatures have been especially useful in environmental metagenomics samples as many of these samples contain organisms from poorly classified phyla which cannot be easily identified using traditional homology methods, including NCBI BLAST. This study examines oligonucleotide signatures across 1,424 completed genomes from across the tree of life, substantially expanding upon previous work. A comprehensive analysis of mononucleotide through nonanucleotide word lengths suggests that longer word lengths substantially improve the classification of DNA fragments across a range of sizes of relevance to high throughput sequencing. We find that, at present, heptanucleotide signatures represent an optimal balance between prediction accuracy and computational time for resolving taxonomy using both genomic and metagenomic fragments. We directly compare the ability of tetranucleotide and heptanucleotide word lengths (tetranucleotide signatures are the current standard for oligonucleotide word usage analyses) for taxonomic binning of metagenome reads. We present evidence that heptanucleotide word lengths consistently provide more taxonomic resolving power, particularly in distinguishing between closely related organisms that are often present in metagenomic samples. This implies that longer oligonucleotide word lengths should replace tetranucleotide signatures for most analyses. Finally, we show that the application of longer word lengths to metagenomic datasets leads to more accurate taxonomic binning of DNA scaffolds and has the potential to substantially improve taxonomic assignment and assembly of metagenomic data.
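
    An oligonucleotide signature of the kind discussed above is, at its simplest, a normalised vector of k-mer frequencies. The sketch below is a generic illustration rather than the method used in the study: the function names and the cosine comparison are assumptions, and it omits refinements such as merging reverse-complement words. It mainly shows how the vector grows from 4^4 = 256 entries for tetranucleotides to 4^7 = 16,384 for heptanucleotides.

```python
import math
from collections import Counter
from itertools import product


def kmer_signature(seq, k):
    """Return relative frequencies of all 4**k DNA words of length k.

    Words containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[w] for w in words)
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def cosine_similarity(u, v):
    """Cosine of the angle between two signature vectors (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy fragment; real fragments would be hundreds to thousands of bp long.
fragment = "ACGTTGCAACGTAGCTAGCTAGGATCCGATCGATTACG" * 5
tetra = kmer_signature(fragment, 4)   # 256-dimensional
hepta = kmer_signature(fragment, 7)   # 16,384-dimensional
self_similarity = cosine_similarity(tetra, tetra)
```

    The larger vector is what gives longer words more resolving power, and also why they cost more computation and need longer fragments to be populated reliably.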

  16. Gut metagenomes of type 2 diabetic patients have characteristic single-nucleotide polymorphism distribution in Bacteroides coprocola.

    PubMed

    Chen, Yaowen; Li, Zongcheng; Hu, Shuofeng; Zhang, Jian; Wu, Jiaqi; Shao, Ningsheng; Bo, Xiaochen; Ni, Ming; Ying, Xiaomin

    2017-02-01

    Gut microbes play a critical role in human health and disease, and researchers have begun to characterize their genomes, the so-called gut metagenome. Thus far, metagenomics studies have focused on genus- or species-level composition and microbial gene sets, while strain-level composition and single-nucleotide polymorphism (SNP) have been overlooked. The gut metagenomes of type 2 diabetes (T2D) patients have been found to be enriched with butyrate-producing bacteria and sulfate reduction functions. However, it is not known whether the gut metagenomes of T2D patients have characteristic strain patterns or SNP distributions. We downloaded public gut metagenome datasets from 170 T2D patients and 174 healthy controls and performed a systematic comparative analysis of their metagenome SNPs. We found that Bacteroides coprocola, whose relative abundance did not differ between the groups, had a characteristic distribution of SNPs in the T2D patient group. We identified 65 genes, all in B. coprocola, that had remarkably different enrichment of SNPs. The first and sixth ranked genes encode glycosyl hydrolases (GenBank accession EDU99824.1 and EDV02301.1). Interestingly, alpha-glucosidase, which is also a glycosyl hydrolase located in the intestine, is an important drug target of T2D. These results suggest that different strains of B. coprocola may have different roles in human gut and a specific set of B. coprocola strains are correlated with T2D.

  17. Metagenomic studies of the Red Sea.

    PubMed

    Behzad, Hayedeh; Ibarra, Martin Augusto; Mineta, Katsuhiko; Gojobori, Takashi

    2016-02-01

    Metagenomics has significantly advanced the field of marine microbial ecology, revealing the vast diversity of previously unknown microbial life forms in different marine niches. The tremendous amount of data generated has enabled identification of a large number of microbial genes (metagenomes), their community interactions, adaptation mechanisms, and their potential applications in pharmaceutical and biotechnology-based industries. Comparative metagenomics reveals that microbial diversity is a function of the local environment, meaning that unique or unusual environments typically harbor novel microbial species with unique genes and metabolic pathways. The Red Sea has an abundance of unique characteristics; however, its microbiota is one of the least studied among marine environments. The Red Sea harbors approximately 25 hot anoxic brine pools, plus a vibrant coral reef ecosystem. Physiochemical studies describe the Red Sea as an oligotrophic environment that contains one of the warmest and saltiest waters in the world with year-round high UV radiations. These characteristics are believed to have shaped the evolution of microbial communities in the Red Sea. Over-representation of genes involved in DNA repair, high-intensity light responses, and osmoregulation were found in the Red Sea metagenomic databases suggesting acquisition of specific environmental adaptation by the Red Sea microbiota. The Red Sea brine pools harbor a diverse range of halophilic and thermophilic bacterial and archaeal communities, which are potential sources of enzymes for pharmaceutical and biotechnology-based application. Understanding the mechanisms of these adaptations and their function within the larger ecosystem could also prove useful in light of predicted global warming scenarios where global ocean temperatures are expected to rise by 1-3°C in the next few decades. In this review, we provide an overview of the published metagenomic studies that were conducted in the Red Sea, and

  18. Zooplankton community analysis in the Changjiang River estuary by single-gene-targeted metagenomics

    NASA Astrophysics Data System (ADS)

    Cheng, Fangping; Wang, Minxiao; Li, Chaolun; Sun, Song

    2014-07-01

    DNA barcoding provides accurate identification of zooplankton species through all life stages. Single-gene-targeted metagenomic analysis based on DNA barcode databases can facilitate long-term monitoring of zooplankton communities. With the help of the available zooplankton databases, the zooplankton community of the Changjiang (Yangtze) River estuary was studied using a single-gene-targeted metagenomic method to estimate the species richness of this community. A total of 856 mitochondrial cytochrome oxidase subunit 1 (cox1) gene sequences were determined. The environmental barcodes were clustered into 70 molecular operational taxonomic units (MOTUs). Forty-two MOTUs matched barcoded marine organisms with more than 90% similarity and were assigned to either the species (similarity>96%) or genus level (similarity<96%). Sibling species could also be distinguished. Many species that were overlooked by morphological methods were identified by molecular methods, especially gelatinous zooplankton and merozooplankton that were likely sampled at different life history phases. Zooplankton community structures differed significantly among all of the samples. The MOTU spatial distributions were influenced by the ecological habits of the corresponding species. In conclusion, single-gene-targeted metagenomic analysis is a useful tool for zooplankton studies, with which specimens from all life history stages can be identified quickly and effectively with a comprehensive database.
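
    The similarity thresholds quoted in this abstract amount to a simple decision rule, sketched below. The function name is hypothetical, and two details are assumptions because the abstract does not state them: a match of exactly 96% is treated as genus level, and matches at or below 90% are left unassigned.

```python
def assign_motu_rank(similarity_pct):
    """Map the best barcode match similarity (in percent) to a taxonomic rank."""
    if similarity_pct > 96.0:
        return "species"
    if similarity_pct > 90.0:
        # Includes exactly 96%, which the abstract leaves unspecified.
        return "genus"
    return "unassigned"


# Hypothetical best-hit similarities for four MOTUs.
example_hits = {"MOTU_01": 99.2, "MOTU_02": 93.5, "MOTU_03": 96.0, "MOTU_04": 85.0}
ranks = {motu: assign_motu_rank(sim) for motu, sim in example_hits.items()}
```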

  19. Evaluation method for the potential functionome harbored in the genome and metagenome

    PubMed Central

    2012-01-01

    Background: One of the main goals of genomic analysis is to elucidate the comprehensive functions (functionome) in individual organisms or a whole community in various environments. However, a standard evaluation method for discerning the functional potentials harbored within the genome or metagenome has not yet been established. We have developed a new evaluation method for the potential functionome, based on the completion ratio of Kyoto Encyclopedia of Genes and Genomes (KEGG) functional modules. Results: Distribution of the completion ratio of the KEGG functional modules in 768 prokaryotic species varied greatly with the kind of module, and all modules primarily fell into 4 patterns (universal, restricted, diversified and non-prokaryotic modules), indicating the universal and unique nature of each module, and also the versatility of the KEGG Orthology (KO) identifiers mapped to each one. The module completion ratio in 8 phenotypically different bacilli revealed that some modules were shared only in phenotypically similar species. Metagenomes of human gut microbiomes from 13 healthy individuals previously determined by the Sanger method were analyzed based on the module completion ratio. Results led to new discoveries in the nutritional preferences of gut microbes, believed to be one of the mutualistic representations of gut microbiomes to avoid nutritional competition with the host. Conclusions: The method developed in this study could characterize the functionome harbored in genomes and metagenomes. As this method also provided taxonomical information from KEGG modules as well as the gene hosts constructing the modules, interpretation of completion profiles was simplified and we could identify the complementarity between biochemical functions in human hosts and the nutritional preferences in human gut microbiomes. Thus, our method has the potential to be a powerful tool for comparative functional analysis in genomics and metagenomics, able to target unknown
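
    The idea of a module completion ratio can be illustrated with a simplified sketch. It assumes a module is a list of reaction steps, each satisfiable by any one of several alternative KO identifiers, and reports the fraction of steps covered by the KOs found in a genome or metagenome. The authors' actual scoring of KEGG module definitions may differ, and the identifiers below are placeholders, not real KO numbers.

```python
def completion_ratio(module_steps, observed_kos):
    """Fraction of module steps for which at least one alternative KO is present.

    module_steps: list of sets, one set of alternative KO identifiers per step.
    observed_kos: set of KO identifiers annotated in the genome or metagenome.
    """
    if not module_steps:
        raise ValueError("module must contain at least one step")
    satisfied = sum(1 for alternatives in module_steps if alternatives & observed_kos)
    return satisfied / len(module_steps)


# Hypothetical four-step module; identifiers are placeholders.
module = [{"KO_A1", "KO_A2"}, {"KO_B"}, {"KO_C1", "KO_C2"}, {"KO_D"}]
genome = {"KO_A2", "KO_C1", "KO_D", "KO_X"}

ratio = completion_ratio(module, genome)  # steps A, C and D are covered, B is not
```

    A ratio of 1.0 would indicate a complete module under this simplification; intermediate values flag which pathways are only partly encoded.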

  20. Evaluation method for the potential functionome harbored in the genome and metagenome.

    PubMed

    Takami, Hideto; Taniguchi, Takeaki; Moriya, Yuki; Kuwahara, Tomomi; Kanehisa, Minoru; Goto, Susumu

    2012-12-12

    One of the main goals of genomic analysis is to elucidate the comprehensive functions (functionome) in individual organisms or a whole community in various environments. However, a standard evaluation method for discerning the functional potentials harbored within the genome or metagenome has not yet been established. We have developed a new evaluation method for the potential functionome, based on the completion ratio of Kyoto Encyclopedia of Genes and Genomes (KEGG) functional modules. Distribution of the completion ratio of the KEGG functional modules in 768 prokaryotic species varied greatly with the kind of module, and all modules primarily fell into 4 patterns (universal, restricted, diversified and non-prokaryotic modules), indicating the universal and unique nature of each module, and also the versatility of the KEGG Orthology (KO) identifiers mapped to each one. The module completion ratio in 8 phenotypically different bacilli revealed that some modules were shared only in phenotypically similar species. Metagenomes of human gut microbiomes from 13 healthy individuals previously determined by the Sanger method were analyzed based on the module completion ratio. Results led to new discoveries in the nutritional preferences of gut microbes, believed to be one of the mutualistic representations of gut microbiomes to avoid nutritional competition with the host. The method developed in this study could characterize the functionome harbored in genomes and metagenomes. As this method also provided taxonomical information from KEGG modules as well as the gene hosts constructing the modules, interpretation of completion profiles was simplified and we could identify the complementarity between biochemical functions in human hosts and the nutritional preferences in human gut microbiomes. Thus, our method has the potential to be a powerful tool for comparative functional analysis in genomics and metagenomics, able to target unknown environments containing various

  1. Scalable metagenomic taxonomy classification using a reference genome database

    PubMed Central

    Ames, Sasha K.; Hysom, David A.; Gardner, Shea N.; Lloyd, G. Scott; Gokhale, Maya B.; Allen, Jonathan E.

    2013-01-01

    Motivation: Deep metagenomic sequencing of biological samples has the potential to recover otherwise difficult-to-detect microorganisms and accurately characterize biological samples with limited prior knowledge of sample contents. Existing metagenomic taxonomic classification algorithms, however, do not scale well to analyze large metagenomic datasets, and balancing classification accuracy with computational efficiency presents a fundamental challenge. Results: A method is presented to shift computational costs to an off-line computation by creating a taxonomy/genome index that supports scalable metagenomic classification. Scalable performance is demonstrated on real and simulated data to show accurate classification in the presence of novel organisms on samples that include viruses, prokaryotes, fungi and protists. Taxonomic classification of the previously published 150 giga-base Tyrolean Iceman dataset was found to take <20 h on a single node 40 core large memory machine and provide new insights on the metagenomic contents of the sample. Availability: Software was implemented in C++ and is freely available at http://sourceforge.net/projects/lmat Contact: allen99@llnl.gov Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23828782

  2. Forest harvesting reduces the soil metagenomic potential for biomass decomposition.

    PubMed

    Cardenas, Erick; Kranabetter, J M; Hope, Graeme; Maas, Kendra R; Hallam, Steven; Mohn, William W

    2015-11-01

    Soil is the key resource that must be managed to ensure sustainable forest productivity. Soil microbial communities mediate numerous essential ecosystem functions, and recent studies show that forest harvesting alters soil community composition. From a long-term soil productivity study site in a temperate coniferous forest in British Columbia, 21 forest soil shotgun metagenomes were generated, totaling 187 Gb. A method to analyze unassembled metagenome reads from the complex community was optimized and validated. The subsequent metagenome analysis revealed that, 12 years after forest harvesting, there were 16% and 8% reductions in relative abundances of biomass decomposition genes in the organic and mineral soil layers, respectively. Organic and mineral soil layers differed markedly in genetic potential for biomass degradation, with the organic layer having greater potential and being more strongly affected by harvesting. Gene families were disproportionately affected, and we identified 41 gene families consistently affected by harvesting, including families involved in lignin, cellulose, hemicellulose and pectin degradation. The results strongly suggest that harvesting profoundly altered below-ground cycling of carbon and other nutrients at this site, with potentially important consequences for forest regeneration. Thus, it is important to determine whether these changes foreshadow long-term changes in forest productivity or resilience and whether these changes are broadly characteristic of harvested forests.

  3. Screening Currency Notes for Microbial Pathogens and Antibiotic Resistance Genes Using a Shotgun Metagenomic Approach

    PubMed Central

    Jalali, Saakshi; Kohli, Samantha; Latka, Chitra; Bhatia, Sugandha; Vellarikal, Shamsudheen Karuthedath; Sivasubbu, Sridhar; Scaria, Vinod; Ramachandran, Srinivasan

    2015-01-01

    Fomites are a well-known source of microbial infections and previous studies have provided insights into the sojourning microbiome of fomites from various sources. Paper currency notes are among the most commonly exchanged objects and their potential to transmit pathogenic organisms has been well recognized. Approaches to identify the microbiome associated with paper currency notes have been largely limited to culture-dependent approaches. Subsequent studies used 16S ribosomal RNA based approaches, which provided insights into the taxonomical distribution of the microbiome. However, recent techniques including shotgun sequencing provide resolution at the gene level and enable estimation of gene copy numbers in the metagenome. We investigated the microbiome of Indian paper currency notes using a shotgun metagenome sequencing approach. Metagenomic DNA isolated from samples of frequently circulated denominations of Indian currency notes was sequenced using an Illumina HiSeq sequencer. Analysis of the data revealed the presence of species belonging to both eukaryotic and prokaryotic genera. The taxonomic distribution at kingdom level revealed contigs mapping to eukaryota (70%), bacteria (9%), viruses and archaea (~1%). We identified 78 pathogens including Staphylococcus aureus, Corynebacterium glutamicum, Enterococcus faecalis, and 75 cellulose degrading organisms including Acidothermus cellulolyticus, Cellulomonas flavigena and Ruminococcus albus. Additionally, 78 antibiotic resistance genes were identified and 18 of these were found in all the samples. Furthermore, six out of 78 pathogens harbored at least one of the 18 common antibiotic resistance genes. To the best of our knowledge, this is the first report of a shotgun metagenome sequence dataset of paper currency notes, which can be useful for future applications including bio-surveillance of exchangeable fomites for infectious agents. PMID:26035208

  4. Metagenome, metatranscriptome and single-cell sequencing reveal microbial response to Deepwater Horizon oil spill.

    PubMed

    Mason, Olivia U; Hazen, Terry C; Borglin, Sharon; Chain, Patrick S G; Dubinsky, Eric A; Fortney, Julian L; Han, James; Holman, Hoi-Ying N; Hultman, Jenni; Lamendella, Regina; Mackelprang, Rachel; Malfatti, Stephanie; Tom, Lauren M; Tringe, Susannah G; Woyke, Tanja; Zhou, Jizhong; Rubin, Edward M; Jansson, Janet K

    2012-09-01

    The Deepwater Horizon oil spill in the Gulf of Mexico resulted in a deep-sea hydrocarbon plume that caused a shift in the indigenous microbial community composition with unknown ecological consequences. Early in the spill history, a bloom of uncultured, thus uncharacterized, members of the Oceanospirillales was previously detected, but their role in oil disposition was unknown. Here our aim was to determine the functional role of the Oceanospirillales and other active members of the indigenous microbial community using deep sequencing of community DNA and RNA, as well as single-cell genomics. Shotgun metagenomic and metatranscriptomic sequencing revealed that genes for motility, chemotaxis and aliphatic hydrocarbon degradation were significantly enriched and expressed in the hydrocarbon plume samples compared with uncontaminated seawater collected from plume depth. In contrast, although genes coding for degradation of more recalcitrant compounds, such as benzene, toluene, ethylbenzene, total xylenes and polycyclic aromatic hydrocarbons, were identified in the metagenomes, they were expressed at low levels, or not at all based on analysis of the metatranscriptomes. Isolation and sequencing of two Oceanospirillales single cells revealed that both cells possessed genes coding for n-alkane and cycloalkane degradation. Specifically, the near-complete pathway for cyclohexane oxidation in the Oceanospirillales single cells was elucidated and supported by both metagenome and metatranscriptome data. The draft genome also included genes for chemotaxis, motility and nutrient acquisition strategies that were also identified in the metagenomes and metatranscriptomes. These data point towards a rapid response of members of the Oceanospirillales to aliphatic hydrocarbons in the deep sea.

  5. Rapid Detection of Powassan Virus in a Patient With Encephalitis by Metagenomic Sequencing.

    PubMed

    Piantadosi, Anne; Kanjilal, Sanjat; Ganesh, Vijay; Khanna, Arjun; Hyle, Emily P; Rosand, Jonathan; Bold, Tyler; Metsky, Hayden C; Lemieux, Jacob; Leone, Michael J; Freimark, Lisa; Matranga, Christian B; Adams, Gordon; McGrath, Graham; Zamirpour, Siavash; Telford, Sam; Rosenberg, Eric; Cho, Tracey; Frosch, Matthew P; Goldberg, Marcia B; Mukerji, Shibani S; Sabeti, Pardis C

    2018-02-10

    We describe a patient with severe and progressive encephalitis of unknown etiology. We performed rapid metagenomic sequencing from cerebrospinal fluid and identified Powassan virus, an emerging tick-borne flavivirus that has been increasingly detected in the United States.

  6. Extraction of inhibitor-free metagenomic DNA from polluted sediments, compatible with molecular diversity analysis using adsorption and ion-exchange treatments.

    PubMed

    Desai, Chirayu; Madamwar, Datta

    2007-03-01

    PCR inhibitor-free metagenomic DNA of high quality and high yield was extracted from highly polluted sediments using a simple remediation strategy of adsorption and ion-exchange chromatography. The extraction procedure was optimized through a series of steps, which involved gentle mechanical lysis, treatment with powdered activated charcoal (PAC) and ion-exchange chromatography with amberlite resin. The quality of the extracted DNA for molecular diversity analysis was tested by amplifying bacterial 16S rDNA (16S rRNA gene) with eubacterial specific universal primers (8f and 1492r), cloning the amplified 16S rDNA and performing ARDRA (amplified rDNA restriction analysis) on the 16S rDNA clones. The presence of discrete differences in ARDRA banding profiles provided evidence for the expediency of the DNA extraction protocol in molecular diversity studies. A comparison of the optimized protocol with the commercial Ultraclean Soil DNA isolation kit suggested that the method described in this report would be more efficient in removing metallic and organic inhibitors from polluted sediment samples.

  7. Use of a pooled clone method to isolate a novel Bacillus thuringiensis Cry2A toxin with activity against Ostrinia furnacalis.

    PubMed

    Shu, Changlong; Zhang, Jingtao; Chen, Guihua; Liang, Gemei; He, Kanglai; Crickmore, Neil; Huang, Dafang; Zhang, Jie; Song, Fuping

    2013-09-01

    A pooled clone method was developed to screen for cry2A genes. This metagenomic method avoids the need to analyse isolated Bacillus thuringiensis strains by performing gene specific PCR on plasmid-enriched DNA prepared from a pooled soil sample. Using this approach the novel holotype gene cry2Ah1 was cloned and characterized. The toxin gene was over-expressed in Escherichia coli Rosetta (DE3) and the expressed toxin accumulated in both the soluble and insoluble fractions. The soluble Cry2Ah1 was found to have a weight loss activity against Ostrinia furnacalis, and a growth inhibitory activity to both Cry1Ac-susceptible and resistant Helicoverpa armigera populations.

  8. MetaQUAST: evaluation of metagenome assemblies.

    PubMed

    Mikheenko, Alla; Saveliev, Vladislav; Gurevich, Alexey

    2016-04-01

    During the past years we have witnessed the rapid development of new metagenome assembly methods. Although there are many benchmark utilities designed for single-genome assemblies, there is no well-recognized evaluation and comparison tool for metagenomic-specific analogues. In this article, we present MetaQUAST, a modification of QUAST, the state-of-the-art tool for genome assembly evaluation based on alignment of contigs to a reference. MetaQUAST addresses such features of metagenome datasets as (i) unknown species content, by detecting and downloading reference sequences, (ii) huge diversity, by giving comprehensive reports for multiple genomes, and (iii) presence of closely related species, by detecting chimeric contigs. We demonstrate MetaQUAST performance by comparing several leading assemblers on one simulated and two real datasets. Availability: http://bioinf.spbau.ru/metaquast Contact: aleksey.gurevich@spbu.ru Supplementary data are available at Bioinformatics online.

  9. Diverse circovirus-like genome architectures revealed by environmental metagenomics.

    PubMed

    Rosario, Karyna; Duffy, Siobain; Breitbart, Mya

    2009-10-01

    Single-stranded DNA (ssDNA) viruses with circular genomes are the smallest viruses known to infect eukaryotes. The present study identified 10 novel genomes similar to ssDNA circoviruses through data-mining of public viral metagenomes. The metagenomic libraries included samples from reclaimed water and three different marine environments (Chesapeake Bay, British Columbia coastal waters and Sargasso Sea). All the genomes have similarities to the replication (Rep) protein of circoviruses; however, only half have genomic features consistent with known circoviruses. Some of the genomes exhibit a mixture of genomic features associated with different families of ssDNA viruses (i.e. circoviruses, geminiviruses and parvoviruses). Unique genome architectures and phylogenetic analysis of the Rep protein suggest that these viruses belong to novel genera and/or families. Investigating the complex community of ssDNA viruses in the environment can lead to the discovery of divergent species and help elucidate evolutionary links between ssDNA viruses.

  10. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond.

    PubMed

    Hiraoka, Satoshi; Yang, Ching-Chia; Iwasaki, Wataru

    2016-09-29

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives.

  11. Rapid Detection of Powassan Virus in a Patient With Encephalitis by Metagenomic Sequencing

    PubMed Central

    Piantadosi, Anne; Kanjilal, Sanjat; Ganesh, Vijay; Khanna, Arjun; Hyle, Emily P; Rosand, Jonathan; Bold, Tyler; Metsky, Hayden C; Lemieux, Jacob; Leone, Michael J; Freimark, Lisa; Matranga, Christian B; Adams, Gordon; McGrath, Graham; Zamirpour, Siavash; Telford, Sam; Rosenberg, Eric; Cho, Tracey; Frosch, Matthew P; Goldberg, Marcia B; Mukerji, Shibani S; Sabeti, Pardis C

    2018-01-01

    We describe a patient with severe and progressive encephalitis of unknown etiology. We performed rapid metagenomic sequencing from cerebrospinal fluid and identified Powassan virus, an emerging tick-borne flavivirus that has been increasingly detected in the United States. PMID:29020227

  12. Novel high-performance metagenome β-galactosidases for lactose hydrolysis in the dairy industry.

    PubMed

    Erich, Sarah; Kuschel, Beatrice; Schwarz, Thilo; Ewert, Jacob; Böhmer, Nico; Niehaus, Frank; Eck, Jürgen; Lutz-Wahl, Sabine; Stressler, Timo; Fischer, Lutz

    2015-09-20

    The industrially utilised β-galactosidases from Kluyveromyces spp. and Aspergillus spp. feature undesirable kinetic properties in practice, such as an unsatisfactory lactose affinity (KM) and product inhibition (KI) by galactose. In this study, a metagenome library of about 1.3 million clones was investigated with a three-step activity-based screening strategy in order to find new β-galactosidases with more favourable kinetic properties. Six novel metagenome β-galactosidases (M1-M6) were found with an improved lactose hydrolysis performance in original milk when directly compared to the commercial β-galactosidase from Kluyveromyces lactis (GODO-YNL2). The best metagenome candidate, called "M1", was recombinantly produced in Escherichia coli BL21(DE3) in a bioreactor (volume 35 L), resulting in a total β-galactosidase M1 activity of about 1100 μkat(oNPGal, 37 °C) L^-1. Since milk is a sensitive and complex medium, it has to be processed at 5-10 °C in the dairy industry. Therefore, the β-galactosidase M1 was tested at 8 °C in milk and possessed a good stability (t1/2 = 21.8 d), a desirably low apparent KM (lactose, 8 °C) value of 3.8 ± 0.7 mM and a high apparent KI (galactose, 8 °C) value of 196.6 ± 55.5 mM. A lactose hydrolysis process (milk, 40 nkat(lactose) mL(milk, 8 °C)^-1) was conducted at a scale of 0.5 L to compare the performance of M1 with the commercial β-galactosidase from K. lactis (GODO-YNL2). Lactose was completely (>99.99%) hydrolysed by M1 and to 99.6% (w/v) by K. lactis β-galactosidase after 25 h process time. Thus, M1 was able to achieve the limit of <100 mg lactose per litre milk, which is recommended for dairy products labelled as "lactose-free".
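
    To see why a low KM and a high KI are both desirable, the reported apparent constants can be placed in a standard rate law. The sketch below assumes Michaelis-Menten kinetics with competitive inhibition by galactose; the abstract does not state the inhibition mechanism, so that choice, the function name, and the example concentrations are all assumptions. Rates are expressed relative to Vmax.

```python
def relative_rate(lactose_mm, galactose_mm, km_mm=3.8, ki_mm=196.6):
    """Reaction rate as a fraction of Vmax.

    Assumes competitive product inhibition:
        v / Vmax = S / (KM * (1 + P / KI) + S)
    Default KM and KI are the apparent values reported for M1 in milk at 8 C.
    """
    apparent_km = km_mm * (1.0 + galactose_mm / ki_mm)
    return lactose_mm / (apparent_km + lactose_mm)


# Illustrative concentrations (mM); not taken from the source.
no_product = relative_rate(lactose_mm=10.0, galactose_mm=0.0)
with_product = relative_rate(lactose_mm=10.0, galactose_mm=100.0)
```

    Under this model the enzyme keeps working close to Vmax until lactose falls to the order of KM, and accumulating galactose raises the effective KM only modestly when KI is large.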

  13. FY11 Report on Metagenome Analysis using Pathogen Marker Libraries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gardner, Shea N.; Allen, Jonathan E.; McLoughlin, Kevin S.

    2011-06-02

    A method, sequence library, and software suite were invented to rapidly assess whether any member of a pre-specified list of threat organisms or their near neighbors is present in a metagenome. The system was designed to handle mega- to giga-bases of FASTA-formatted raw sequence reads from short or long read next generation sequencing platforms. The approach is to pre-calculate a viral and a bacterial "Pathogen Marker Library" (PML) containing sub-sequences specific to pathogens or their near neighbors. A list of expected matches comparing every bacterial or viral genome against the PML sequences is also pre-calculated. To analyze a metagenome, reads are compared to the PML, and observed PML-metagenome matches are compared to the expected PML-genome matches, and the ratio of observed relative to expected matches is reported. In other words, a 3-way comparison among the PML, metagenome, and existing genome sequences is used to quickly assess which (if any) species included in the PML is likely to be present in the metagenome, based on available sequence data. Our tests showed that the species with the most PML matches correctly indicated the organism sequenced for empirical metagenomes consisting of a cultured, relatively pure isolate. These runs completed in 1 minute to 3 hours on 12 CPU (1 thread/CPU), depending on the metagenome and PML. Using more threads on the same number of CPU resulted in speed improvements roughly proportional to the number of threads. Simulations indicated that detection sensitivity depends on both sequencing coverage levels for a species and the size of the PML: species were correctly detected even at ~0.003x coverage by the large PMLs, and at ~0.03x coverage by the smaller PMLs. Matches to true positive species were 3-4 orders of magnitude higher than to false positives. Simulations with short reads (36 nt and ~260 nt) showed that species were usually detected for metagenome coverage above 0.005x and coverage in the PML above 0
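
    The observed-to-expected comparison described in this report can be sketched in a few lines. The data structures, function names, and counts below are hypothetical; the report does not describe its software interface, so this only illustrates the ratio itself.

```python
def observed_to_expected(observed_matches, expected_matches):
    """Ratio of observed PML-metagenome matches to expected PML-genome matches.

    observed_matches: dict species -> number of PML markers hit by metagenome reads.
    expected_matches: dict species -> number of PML markers that species' genome
                      is pre-calculated to match.
    Species with no expected matches are skipped to avoid division by zero.
    """
    return {
        species: observed_matches.get(species, 0) / expected
        for species, expected in expected_matches.items()
        if expected > 0
    }


def most_likely_species(ratios):
    """Species with the highest observed/expected ratio, or None if empty."""
    return max(ratios, key=ratios.get) if ratios else None


# Hypothetical counts for three species in a marker library.
expected = {"species_A": 200, "species_B": 150, "species_C": 0}
observed = {"species_A": 100, "species_B": 3}

ratios = observed_to_expected(observed, expected)
top = most_likely_species(ratios)
```

    Normalising by the expected count keeps species with many markers in the library from dominating simply because they have more opportunities to match.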

  14. Applying meta-pathway analyses through metagenomics to identify the functional properties of the major bacterial communities of a single spontaneous cocoa bean fermentation process sample.

    PubMed

    Illeghems, Koen; Weckx, Stefan; De Vuyst, Luc

    2015-09-01

    A high-resolution functional metagenomic analysis of a representative single sample of a Brazilian spontaneous cocoa bean fermentation process was carried out to gain insight into its bacterial community functioning. By reconstruction of microbial meta-pathways based on metagenomic data, the current knowledge about the metabolic capabilities of bacterial members involved in the cocoa bean fermentation ecosystem was extended. Functional meta-pathway analysis revealed the distribution of the metabolic pathways between the bacterial members involved. The metabolic capabilities of the lactic acid bacteria present were most associated with the heterolactic fermentation and citrate assimilation pathways. The role of Enterobacteriaceae in the conversion of substrates was shown through the use of the mixed-acid fermentation and methylglyoxal detoxification pathways. Furthermore, several other potential functional roles for Enterobacteriaceae were indicated, such as pectinolysis and citrate assimilation. Concerning acetic acid bacteria, metabolic pathways were partially reconstructed, in particular those related to responses toward stress, explaining their metabolic activities during cocoa bean fermentation processes. Further, the in-depth metagenomic analysis unveiled functionalities involved in bacterial competitiveness, such as the occurrence of CRISPRs and potential bacteriocin production. Finally, comparative analysis of the metagenomic data with bacterial genomes of cocoa bean fermentation isolates revealed the applicability of the selected strains as functional starter cultures.

  15. Random whole metagenomic sequencing for forensic discrimination of soils.

    PubMed

    Khodakova, Anastasia S; Smith, Renee J; Burgoyne, Leigh; Abarno, Damien; Linacre, Adrian

    2014-01-01

    Here we assess the ability of random whole metagenomic sequencing approaches to discriminate between similar soils from two geographically distinct urban sites for application in forensic science. Repeat samples from two parklands in residential areas separated by approximately 3 km were collected and the DNA was extracted. Shotgun, whole genome amplification (WGA) and single arbitrarily primed DNA amplification (AP-PCR) based sequencing techniques were then used to generate soil metagenomic profiles. Full and subsampled metagenomic datasets were then annotated against M5NR/M5RNA (taxonomic classification) and SEED Subsystems (metabolic classification) databases. Further comparative analyses were performed using a number of statistical tools including: hierarchical agglomerative clustering (CLUSTER); similarity profile analysis (SIMPROF); non-metric multidimensional scaling (NMDS); and canonical analysis of principal coordinates (CAP) at all major levels of taxonomic and metabolic classification. Our data showed that shotgun and WGA-based approaches generated highly similar metagenomic profiles for the soil samples such that the soil samples could not be distinguished accurately. An AP-PCR based approach was shown to be successful at obtaining reproducible site-specific metagenomic DNA profiles, which in turn were employed for successful discrimination of visually similar soil samples collected from two different locations.

  16. MetaStorm: A Public Resource for Customizable Metagenomics Annotation

    PubMed Central

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S.; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution. PMID:27632579

  17. MetaStorm: A Public Resource for Customizable Metagenomics Annotation.

    PubMed

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution.

  18. Integrated metagenomic analysis of the rumen microbiome of cattle reveals key biological mechanisms associated with methane traits.

    PubMed

    Wang, Haiying; Zheng, Huiru; Browne, Fiona; Roehe, Rainer; Dewhurst, Richard J; Engel, Felix; Hemmje, Matthias; Lu, Xiangwu; Walsh, Paul

    2017-07-15

    Methane is one of the major contributors to global warming. The rumen microbiota is directly involved in methane production in cattle. The link between variation in rumen microbial communities and host genetics has important applications and implications in bioscience. Having the potential to reveal the full extent of microbial gene diversity and complex microbial interactions, integrated metagenomics and network analysis holds great promise in this endeavour. This study investigates the rumen microbial community in cattle through the integration of metagenomic and network-based approaches. Based on the relative abundance of 1570 microbial genes identified in a metagenomics analysis, the co-abundance network was constructed and functional modules of microbial genes were identified. One of the main contributions is a random matrix theory-based approach that automatically determines the correlation threshold used to construct the co-abundance network. The resulting network, consisting of 549 microbial genes and 3349 connections, exhibits a clear modular structure with certain trait-specific genes highly over-represented in modules. More specifically, all 20 genes previously identified to be associated with methane emissions are found in a module (hypergeometric test, p < 10^-11). One third of genes are involved in methane metabolism pathways. Further examination of the abundance profiles of genes across 8 samples highlights that the revealed pattern of metagenomics abundance has a strong association with methane emissions. Furthermore, the module is significantly enriched with microbial genes encoding enzymes that are directly involved in methanogenesis (hypergeometric test, p < 10^-9).
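
    The hypergeometric enrichment test mentioned in this abstract can be sketched with the standard library alone. The network size (549 genes) and the 20 methane-associated genes come from the abstract; the module size is not given there, so the value of 60 below is purely hypothetical and the resulting p-value is not a reproduction of the published one.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when drawing `draws` items without replacement from a
    population of `population` items that contains `successes` marked items."""
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


NETWORK_GENES = 549   # genes in the co-abundance network (from the abstract)
METHANE_GENES = 20    # genes previously linked to methane emissions (from the abstract)
MODULE_SIZE = 60      # hypothetical: the abstract does not report the module size

# Probability that a random module of this size contains all 20 genes.
p_value = hypergeom_upper_tail(NETWORK_GENES, METHANE_GENES, MODULE_SIZE, METHANE_GENES)
```

    Exact integer binomial coefficients avoid the rounding problems that can arise when very small tail probabilities are computed from floating-point factorials.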

  19. Deciphering viral presences: two novel partial giant viruses detected in marine metagenome and in a mine drainage metagenome.

    PubMed

    Andreani, Julien; Verneau, Jonathan; Raoult, Didier; Levasseur, Anthony; La Scola, Bernard

    2018-04-10

    Nucleo-cytoplasmic large DNA viruses are double-stranded DNA viruses capable of infecting eukaryotic cells. Since the discovery of Mimivirus and Pandoravirus, there has been no doubt about their extraordinary features compared to "classic" viruses. Recently, we reported the expansion of the proposed family Pithoviridae, with the description of Cedratvirus and Orpheovirus, two new viruses related to Pithoviruses. Studying the major capsid protein of Orpheovirus, we detected a homologous sequence in a mine drainage metagenome. The in-depth exploration of this metagenome, using the MG-Digger program, enabled us to retrieve up to 10 contigs with clear evidence of viral sequences. Moreover, phylogenetic analyses further extended our screening with the discovery in another marine metagenome of a second virus closely related to Orpheovirus IHUMI-LCC2. This virus is a misidentified virus confused with and annotated as a Rickettsiales bacterium. It presents a partial genome size of about 170 kbp.

  20. A high throughput screen for biomining cellulase activity from metagenomic libraries.

    PubMed

    Mewis, Keith; Taupp, Marcus; Hallam, Steven J

    2011-02-01

    Cellulose, the most abundant source of organic carbon on the planet, has wide-ranging industrial applications with increasing emphasis on biofuel production (1). Chemical methods to modify or degrade cellulose typically require strong acids and high temperatures. As such, enzymatic methods have become prominent in the bioconversion process. While the identification of active cellulases from bacterial and fungal isolates has been somewhat effective, the vast majority of microbes in nature resist laboratory cultivation. Environmental genomic, also known as metagenomic, screening approaches have great promise in bridging the cultivation gap in the search for novel bioconversion enzymes. Metagenomic screening approaches have successfully recovered novel cellulases from environments as varied as soils (2), buffalo rumen (3) and the termite hind-gut (4) using carboxymethylcellulose (CMC) agar plates stained with congo red dye (based on the method of Teather and Wood (5)). However, the CMC method is limited in throughput, is not quantitative and manifests a low signal to noise ratio (6). Other methods have been reported (7,8) but each use an agar plate-based assay, which is undesirable for high-throughput screening of large insert genomic libraries. Here we present a solution-based screen for cellulase activity using a chromogenic dinitrophenol (DNP)-cellobioside substrate (9). Our library was cloned into the pCC1 copy control fosmid to increase assay sensitivity through copy number induction (10). The method uses one-pot chemistry in 384-well microplates with the final readout provided as an absorbance measurement. This readout is quantitative, sensitive and automated with a throughput of up to 100X 384-well plates per day using a liquid handler and plate reader with attached stacking system.

  1. Environmental Viral Metagenomics Analyses in Aquaculture: Applications in Epidemiology and Disease Control

    PubMed Central

    Munang’andu, Hetron M.

    2016-01-01

    Studies on the epidemiology of viral diseases in aquaculture have for a long time depended on isolation of viruses from infected aquatic organisms. The role of aquatic environments in the epidemiology of viral diseases in aquaculture has not been extensively expounded mainly because of the lack of appropriate tools for environmental studies on aquatic viruses. However, the advent of metagenomics analyses opens great avenues in which environmental samples can be used to study the epidemiology of viral diseases outside their host species. Hence, in this review I have shown that epidemiological factors that influence the composition of viruses in different aquatic environments include ecological factors, anthropogenic activities and stocking densities of cultured organisms based on environmental metagenomics studies carried out thus far. Ballast water transportation and global trade of aquatic organisms are the most common virus dispersal processes identified thus far. In terms of disease control for outdoor aquaculture systems, baseline data on viruses found in different environments intended for aquaculture use can be obtained to enable the design of effective disease control strategies. And as such, high-risk areas having a high specter of pathogenic viruses can be identified as an early warning system. As for the control of viral diseases for indoor recirculation aquaculture systems (RAS), the most effective disinfection methods able to eliminate pathogenic viruses from water used in RAS can be identified. Overall, the synopsis I have put forth in this review shows that environmental samples can be used to study the epidemiology of viral diseases in aquaculture using viral metagenomics analysis as an overture for the design of rational disease control strategies. PMID:28018317

  2. SUPER-FOCUS: a tool for agile functional analysis of shotgun metagenomic data

    PubMed Central

    Green, Kevin T.; Dutilh, Bas E.; Edwards, Robert A.

    2016-01-01

    Summary: Analyzing the functional profile of a microbial community from unannotated shotgun sequencing reads is one of the important goals in metagenomics. Functional profiling has valuable applications in biological research because it identifies the abundances of the functional genes of the organisms present in the original sample, answering the question what they can do. Currently, available tools do not scale well with increasing data volumes, which is important because both the number and lengths of the reads produced by sequencing platforms keep increasing. Here, we introduce SUPER-FOCUS, SUbsystems Profile by databasE Reduction using FOCUS, an agile homology-based approach using a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. SUPER-FOCUS was tested with over 70 real metagenomes, the results showing that it accurately predicts the subsystems present in the profiled microbial communities, and is up to 1000 times faster than other tools. Availability and implementation: SUPER-FOCUS was implemented in Python, and its source code and the tool website are freely available at https://edwards.sdsu.edu/SUPERFOCUS. Contact: redwards@mail.sdsu.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26454280

  3. SUPER-FOCUS: a tool for agile functional analysis of shotgun metagenomic data.

    PubMed

    Silva, Genivaldo Gueiros Z; Green, Kevin T; Dutilh, Bas E; Edwards, Robert A

    2016-02-01

    Analyzing the functional profile of a microbial community from unannotated shotgun sequencing reads is one of the important goals in metagenomics. Functional profiling has valuable applications in biological research because it identifies the abundances of the functional genes of the organisms present in the original sample, answering the question what they can do. Currently, available tools do not scale well with increasing data volumes, which is important because both the number and lengths of the reads produced by sequencing platforms keep increasing. Here, we introduce SUPER-FOCUS, SUbsystems Profile by databasE Reduction using FOCUS, an agile homology-based approach using a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. SUPER-FOCUS was tested with over 70 real metagenomes, the results showing that it accurately predicts the subsystems present in the profiled microbial communities, and is up to 1000 times faster than other tools. SUPER-FOCUS was implemented in Python, and its source code and the tool website are freely available at https://edwards.sdsu.edu/SUPERFOCUS. redwards@mail.sdsu.edu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  4. Discovery and characterization of a novel lipase with transesterification activity from hot spring metagenomic library.

    PubMed

    Yan, Wei; Li, Furong; Wang, Li; Zhu, Yaxin; Dong, Zhiyang; Bai, Linhan

    2017-03-01

    A new gene encoding a lipase (designated Lip-1) was identified from a metagenomic bacterial artificial chromosome (BAC) library prepared from a concentrated water sample collected from a hot spring field in Niujie, Eryuan, Yunnan province, China. The open reading frame of this gene encoded 622 amino acid residues. It was cloned, fused with the oleosin gene and overexpressed in Escherichia coli to prepare the immobilized lipase artificial oil body AOB-sole-lip-1. The monomeric Sole-lip-1 fusion protein presented a molecular mass of 102.4 kDa. Enzyme assays using olive oil and methanol as the substrates in petroleum ether confirmed its transesterification activity. Hexadecanoic acid methyl ester, 8,11-octadecadienoic acid methyl ester, 8-octadecenoic acid methyl ester, and octadecanoic acid methyl ester were detected. It showed favorable transesterification activity with an optimal temperature of 45 °C. In addition, the maximal biodiesel yield was obtained with petroleum ether as the organic solvent, the substrate methanol at 350 mmol/L (at a methanol molar ratio of 10.5:1), and a water content of 1%. In light of these advantages, this lipase presents a promising resource for biodiesel production.

  5. Accessing the Soil Metagenome for Studies of Microbial Diversity

    PubMed Central

    Delmont, Tom O.; Robe, Patrick; Cecillon, Sébastien; Clark, Ian M.; Constancias, Florentin; Simonet, Pascal; Hirsch, Penny R.; Vogel, Timothy M.

    2011-01-01

    Soil microbial communities contain the highest level of prokaryotic diversity of any environment, and metagenomic approaches involving the extraction of DNA from soil can improve our access to these communities. Most analyses of soil biodiversity and function assume that the DNA extracted represents the microbial community in the soil, but subsequent interpretations are limited by the DNA recovered from the soil. Unfortunately, extraction methods do not provide a uniform and unbiased subsample of metagenomic DNA, and as a consequence, accurate species distributions cannot be determined. Moreover, any bias will propagate errors in estimations of overall microbial diversity and may exclude some microbial classes from study and exploitation. To improve metagenomic approaches, investigate DNA extraction biases, and provide tools for assessing the relative abundances of different groups, we explored the biodiversity of the accessible community DNA by fractioning the metagenomic DNA as a function of (i) vertical soil sampling, (ii) density gradients (cell separation), (iii) cell lysis stringency, and (iv) DNA fragment size distribution. Each fraction had a unique genetic diversity, with different predominant and rare species (based on ribosomal intergenic spacer analysis [RISA] fingerprinting and phylochips). All fractions contributed to the number of bacterial groups uncovered in the metagenome, thus increasing the DNA pool for further applications. Indeed, we were able to access a more genetically diverse proportion of the metagenome (a gain of more than 80% compared to the best single extraction method), limit the predominance of a few genomes, and increase the species richness per sequencing effort. This work stresses the difference between extracted DNA pools and the currently inaccessible complete soil metagenome. PMID:21183646

  6. The Challenge and Potential of Metagenomics in the Clinic

    PubMed Central

    Mulcahy-O’Grady, Heidi; Workentine, Matthew L.

    2016-01-01

    resistance identified from the sequencing data. All of these applications rely on sophisticated computational tools, and we also discuss the importance of skilled bioinformatic support for the implementation and use of metagenomics in the clinic. PMID:26870044

  7. Metagenomics: The Next Culture-Independent Game Changer

    PubMed Central

    Forbes, Jessica D.; Knox, Natalie C.; Ronholm, Jennifer; Pagotto, Franco; Reimer, Aleisha

    2017-01-01

    A trend towards the abandonment of obtaining pure culture isolates in frontline laboratories is at a crossroads with the ability of public health agencies to perform their basic mandate of foodborne disease surveillance and response. The implementation of culture-independent diagnostic tests (CIDTs) including nucleic acid and antigen-based assays for acute gastroenteritis is leaving public health agencies without laboratory evidence to link clinical cases to each other and to food or environmental substances. This limits the efficacy of public health epidemiology and surveillance as well as outbreak detection and investigation. Foodborne outbreaks have the potential to remain undetected or have insufficient evidence to support source attribution and may inadvertently increase the incidence of foodborne diseases. Next-generation sequencing of pure culture isolates in clinical microbiology laboratories has the potential to revolutionize the fields of food safety and public health. Metagenomics and other ‘omics’ disciplines could provide the solution to a cultureless future in clinical microbiology, food safety and public health. Data mining of information obtained from metagenomics assays can be particularly useful for the identification of clinical causative agents or foodborne contamination, detection of AMR and/or virulence factors, in addition to providing high-resolution subtyping data. Thus, metagenomics assays may provide a universal test for clinical diagnostics, foodborne pathogen detection, subtyping and investigation. This information has the potential to reform the field of enteric disease diagnostics and surveillance and also infectious diseases as a whole. The aim of this review will be to present the current state of CIDTs in diagnostic and public health laboratories as they relate to foodborne illness and food safety. Moreover, we will also discuss the diagnostic and subtyping utility and concomitant bias limitations of metagenomics and comparable

  8. SPHINX--an algorithm for taxonomic binning of metagenomic sequences.

    PubMed

    Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Singh, Nitin Kumar; Mande, Sharmila S

    2011-01-01

    Compared with composition-based binning algorithms, the binning accuracy and specificity of alignment-based binning algorithms are significantly higher. However, being alignment-based, the latter class of algorithms requires an enormous amount of time and computing resources for binning huge metagenomic datasets. The motivation was to develop a binning approach that can analyze metagenomic datasets as rapidly as composition-based approaches, but nevertheless has the accuracy and specificity of alignment-based algorithms. This article describes a hybrid binning approach (SPHINX) that achieves high binning efficiency by utilizing the principles of both 'composition'- and 'alignment'-based binning algorithms. Validation results with simulated sequence datasets indicate that SPHINX is able to analyze metagenomic sequences as rapidly as composition-based algorithms. Furthermore, the binning efficiency (in terms of accuracy and specificity of assignments) of SPHINX is observed to be comparable with results obtained using alignment-based algorithms. A web server for the SPHINX algorithm is available at http://metagenomics.atc.tcs.com/SPHINX/.

  9. Signal Processing for Metagenomics: Extracting Information from the Soup

    PubMed Central

    Rosen, Gail L.; Sokhansanj, Bahrad A.; Polikar, Robi; Bruns, Mary Ann; Russell, Jacob; Garbarine, Elaine; Essinger, Steve; Yok, Non

    2009-01-01

    Traditionally, studies in microbial genomics have focused on single-genomes from cultured species, thereby limiting their focus to the small percentage of species that can be cultured outside their natural environment. Fortunately, recent advances in high-throughput sequencing and computational analyses have ushered in the new field of metagenomics, which aims to decode the genomes of microbes from natural communities without the need for cultivation. Although metagenomic studies have shed a great deal of insight into bacterial diversity and coding capacity, several computational challenges remain due to the massive size and complexity of metagenomic sequence data. Current tools and techniques are reviewed in this paper which address challenges in 1) genomic fragment annotation, 2) phylogenetic reconstruction, 3) functional classification of samples, and 4) interpreting complementary metaproteomics and metametabolomics data. Also surveyed are important applications of metagenomic studies, including microbial forensics and the roles of microbial communities in shaping human health and soil ecology. PMID:20436876

  10. Meeting Report: 1st International Functional Metagenomics Workshop May 7–8, 2012, St. Jacobs, Ontario, Canada.

    PubMed Central

    Engel, Katja; Ashby, Deborah; Brady, Sean F.; Cowan, Don A.; Doemer, John; Edwards, Elizabeth A.; Fiebig, Klaus; Martens, Eric C.; McCormac, Dennis; Mead, David A.; Miyazaki, Kentaro; Moreno-Hagelsieb, Gabriel; O’Gara, Fergal; Reid, Alexandra; Rose, David R.; Simonet, Pascal; Sjöling, Sara; Smalla, Kornelia; Streit, Wolfgang R.; Tedman-Jones, Jennifer; Valla, Svein; Wellington, Elizabeth M. H.; Wu, Cheng-Cang; Liles, Mark R.; Neufeld, Josh D.; Sessitsch, Angela

    2013-01-01

    This report summarizes the events of the 1st International Functional Metagenomics Workshop. The workshop was held on May 7 and 8, 2012, in St. Jacobs, Ontario, Canada and was focused on building an international functional metagenomics community, exploring strategic research areas, and identifying opportunities for future collaboration and funding. The workshop was initiated by researchers at the University of Waterloo with support from the Ontario Genomics Institute (OGI), Natural Sciences and Engineering Research Council of Canada (NSERC) and the University of Waterloo. PMID:23961315

  11. A combined meta-barcoding and shotgun metagenomic analysis of spontaneous wine fermentation.

    PubMed

    Sternes, Peter R; Lee, Danna; Kutyna, Dariusz R; Borneman, Anthony R

    2017-07-01

    Wine is a complex beverage, comprising hundreds of metabolites produced through the action of yeasts and bacteria in fermenting grape must. Commercially, there is now a growing trend away from using wine yeast (Saccharomyces) starter cultures, toward the historic practice of uninoculated or "wild" fermentation, where the yeasts and bacteria associated with the grapes and/or winery perform the fermentation. It is the varied metabolic contributions of these numerous non-Saccharomyces species that are thought to impart complexity and desirable taste and aroma attributes to wild ferments in comparison to their inoculated counterparts. To map the microflora of spontaneous fermentation, metagenomic techniques were employed to characterize and monitor the progression of fungal species in 5 different wild fermentations. Both amplicon-based ribosomal DNA internal transcribed spacer (ITS) phylotyping and shotgun metagenomics were used to assess community structure across different stages of fermentation. While providing a sensitive and highly accurate means of characterizing the wine microbiome, the shotgun metagenomic data also uncovered a significant overabundance bias in the ITS phylotyping abundance estimations for the common non-Saccharomyces wine yeast genus Metschnikowia. By identifying biases such as that observed for Metschnikowia, abundance measurements from future ITS phylotyping datasets can be corrected to provide more accurate species representation. Ultimately, as more shotgun metagenomic and single-strain de novo assemblies for key wine species become available, the accuracy of both ITS-amplicon and shotgun studies will greatly increase, providing a powerful methodology for deciphering the influence of the microbial community on the wine flavor and aroma. © The Authors 2017. Published by Oxford University Press.

  12. A combined meta-barcoding and shotgun metagenomic analysis of spontaneous wine fermentation

    PubMed Central

    Sternes, Peter R.; Lee, Danna; Kutyna, Dariusz R.

    2017-01-01

    Wine is a complex beverage, comprising hundreds of metabolites produced through the action of yeasts and bacteria in fermenting grape must. Commercially, there is now a growing trend away from using wine yeast (Saccharomyces) starter cultures, toward the historic practice of uninoculated or “wild” fermentation, where the yeasts and bacteria associated with the grapes and/or winery perform the fermentation. It is the varied metabolic contributions of these numerous non-Saccharomyces species that are thought to impart complexity and desirable taste and aroma attributes to wild ferments in comparison to their inoculated counterparts. To map the microflora of spontaneous fermentation, metagenomic techniques were employed to characterize and monitor the progression of fungal species in 5 different wild fermentations. Both amplicon-based ribosomal DNA internal transcribed spacer (ITS) phylotyping and shotgun metagenomics were used to assess community structure across different stages of fermentation. While providing a sensitive and highly accurate means of characterizing the wine microbiome, the shotgun metagenomic data also uncovered a significant overabundance bias in the ITS phylotyping abundance estimations for the common non-Saccharomyces wine yeast genus Metschnikowia. By identifying biases such as that observed for Metschnikowia, abundance measurements from future ITS phylotyping datasets can be corrected to provide more accurate species representation. Ultimately, as more shotgun metagenomic and single-strain de novo assemblies for key wine species become available, the accuracy of both ITS-amplicon and shotgun studies will greatly increase, providing a powerful methodology for deciphering the influence of the microbial community on the wine flavor and aroma. PMID:28595314

  13. Metagenomic Survey for Viruses in Western Arctic Caribou, Alaska, through Iterative Assembly of Taxonomic Units

    PubMed Central

    Schürch, Anita C.; Schipper, Debby; Bijl, Maarten A.; Dau, Jim; Beckmen, Kimberlee B.; Schapendonk, Claudia M. E.; Raj, V. Stalin; Osterhaus, Albert D. M. E.; Haagmans, Bart L.; Tryland, Morten; Smits, Saskia L.

    2014-01-01

    Pathogen surveillance in animals does not provide a sufficient level of vigilance because it is generally confined to surveillance of pathogens with known economic impact in domestic animals and practically nonexistent in wildlife species. As most (re-)emerging viral infections originate from animal sources, it is important to obtain insight into viral pathogens present in the wildlife reservoir from a public health perspective. When monitoring living, free-ranging wildlife for viruses, sample collection can be challenging and availability of nucleic acids isolated from samples is often limited. The development of viral metagenomics platforms allows a more comprehensive inventory of viruses present in wildlife. We report a metagenomic viral survey of the Western Arctic herd of barren ground caribou (Rangifer tarandus granti) in Alaska, USA. The presence of mammalian viruses in eye and nose swabs of 39 free-ranging caribou was investigated by random amplification combined with a metagenomic analysis approach that applied exhaustive iterative assembly of sequencing results to define taxonomic units of each metagenome. Through homology search methods we identified the presence of several mammalian viruses, including different papillomaviruses, a novel parvovirus, polyomavirus, and a virus that potentially represents a member of a novel genus in the family Coronaviridae. PMID:25140520

  14. Chronic Meningitis Investigated via Metagenomic Next-Generation Sequencing.

    PubMed

    Wilson, Michael R; O'Donovan, Brian D; Gelfand, Jeffrey M; Sample, Hannah A; Chow, Felicia C; Betjemann, John P; Shah, Maulik P; Richie, Megan B; Gorman, Mark P; Hajj-Ali, Rula A; Calabrese, Leonard H; Zorn, Kelsey C; Chow, Eric D; Greenlee, John E; Blum, Jonathan H; Green, Gary; Khan, Lillian M; Banerji, Debarko; Langelier, Charles; Bryson-Cahn, Chloe; Harrington, Whitney; Lingappa, Jairam R; Shanbhag, Niraj M; Green, Ari J; Brew, Bruce J; Soldatos, Ariane; Strnad, Luke; Doernberg, Sarah B; Jay, Cheryl A; Douglas, Vanja; Josephson, S Andrew; DeRisi, Joseph L

    2018-04-16

    Identifying infectious causes of subacute or chronic meningitis can be challenging. Enhanced, unbiased diagnostic approaches are needed. To present a case series of patients with diagnostically challenging subacute or chronic meningitis using metagenomic next-generation sequencing (mNGS) of cerebrospinal fluid (CSF) supported by a statistical framework generated from mNGS of control samples from the environment and from patients who were noninfectious. In this case series, mNGS data obtained from the CSF of 94 patients with noninfectious neuroinflammatory disorders and from 24 water and reagent control samples were used to develop and implement a weighted scoring metric based on z scores at the species and genus levels for both nucleotide and protein alignments to prioritize and rank the mNGS results. Total RNA was extracted for mNGS from the CSF of 7 participants with subacute or chronic meningitis who were recruited between September 2013 and March 2017 as part of a multicenter study of mNGS pathogen discovery among patients with suspected neuroinflammatory conditions. The neurologic infections identified by mNGS in these 7 participants represented a diverse array of pathogens. The patients were referred from the University of California, San Francisco Medical Center (n = 2), Zuckerberg San Francisco General Hospital and Trauma Center (n = 2), Cleveland Clinic (n = 1), University of Washington (n = 1), and Kaiser Permanente (n = 1). A weighted z score was used to filter out environmental contaminants and facilitate efficient data triage and analysis. Pathogens identified by mNGS and the ability of a statistical model to prioritize, rank, and simplify mNGS results. The 7 participants ranged in age from 10 to 55 years, and 3 (43%) were female. A parasitic worm (Taenia solium, in 2 participants), a virus (HIV-1), and 4 fungi (Cryptococcus neoformans, Aspergillus oryzae, Histoplasma capsulatum, and Candida dubliniensis) were identified among the 7
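
    The idea of scoring a patient sample against a background built from control samples can be sketched as follows. This is a minimal, unweighted z score per taxon; the abstract does not give the weights or the exact way species-level and genus-level scores are combined, so none of that is reproduced here. The taxon abundances are invented for illustration and are not data from the study.

```python
from statistics import mean, stdev


def z_scores(sample: dict, controls: dict) -> dict:
    """Score each taxon in `sample` against its abundance in control runs."""
    scores = {}
    for taxon, value in sample.items():
        background = controls.get(taxon, [])
        if len(background) < 2:
            continue  # not enough control data to estimate a spread
        spread = stdev(background)
        if spread == 0:
            continue  # constant background: z score is undefined
        scores[taxon] = (value - mean(background)) / spread
    return scores


# Hypothetical abundance values (for example reads per million); not study data.
controls = {
    "Taenia solium": [0.0, 0.1, 0.0, 0.2],
    "Ralstonia pickettii": [40.0, 55.0, 48.0, 52.0],  # stands in for a background organism
}
patient = {"Taenia solium": 35.0, "Ralstonia pickettii": 50.0}

ranked = sorted(z_scores(patient, controls).items(), key=lambda kv: kv[1], reverse=True)
for taxon, z in ranked:
    print(f"{taxon}: z = {z:.1f}")
```

    In this toy input the organism that is abundant in both patient and controls receives a z score near zero, while the one that is rare in controls rises to the top of the ranking, which is the triage behaviour the abstract describes.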

  15. Identifying Keystone Species in the Human Gut Microbiome from Metagenomic Timeseries Using Sparse Linear Regression

    PubMed Central

    Fisher, Charles K.; Mehta, Pankaj

    2014-01-01

    Human associated microbial communities exert tremendous influence over human health and disease. With modern metagenomic sequencing methods it is now possible to follow the relative abundance of microbes in a community over time. These microbial communities exhibit rich ecological dynamics and an important goal of microbial ecology is to infer the ecological interactions between species directly from sequence data. Any algorithm for inferring ecological interactions must overcome three major obstacles: 1) a correlation between the abundances of two species does not imply that those species are interacting, 2) the sum constraint on the relative abundances obtained from metagenomic studies makes it difficult to infer the parameters in timeseries models, and 3) errors due to experimental uncertainty, or mis-assignment of sequencing reads into operational taxonomic units, bias inferences of species interactions due to a statistical problem called “errors-in-variables”. Here we introduce an approach, Learning Interactions from MIcrobial Time Series (LIMITS), that overcomes these obstacles. LIMITS uses sparse linear regression with bootstrap aggregation to infer a discrete-time Lotka-Volterra model for microbial dynamics. We tested LIMITS on synthetic data and showed that it could reliably infer the topology of the inter-species ecological interactions. We then used LIMITS to characterize the species interactions in the gut microbiomes of two individuals and found that the interaction networks varied significantly between individuals. Furthermore, we found that the interaction networks of the two individuals are dominated by distinct “keystone species”, Bacteroides fragilis and Bacteroides stercoris, that have a disproportionate influence on the structure of the gut microbiome even though they are only found in moderate abundance. Based on our results, we hypothesize that the abundances of certain keystone species may be responsible for individuality in the human

  16. Exploring variation-aware contig graphs for (comparative) metagenomics using MaryGold

    PubMed Central

    Nijkamp, Jurgen F.; Pop, Mihai; Reinders, Marcel J. T.; de Ridder, Dick

    2013-01-01

    Motivation: Although many tools are available to study variation and its impact in single genomes, there is a lack of algorithms for finding such variation in metagenomes. This hampers the interpretation of metagenomics sequencing datasets, which are increasingly acquired in research on the (human) microbiome, in environmental studies and in the study of processes in the production of foods and beverages. Existing algorithms often depend on the use of reference genomes, which pose a problem when a metagenome of a priori unknown strain composition is studied. In this article, we develop a method to perform reference-free detection and visual exploration of genomic variation, both within a single metagenome and between metagenomes. Results: We present the MaryGold algorithm and its implementation, which efficiently detects bubble structures in contig graphs using graph decomposition. These bubbles represent variable genomic regions in closely related strains in metagenomic samples. The variation found is presented in a condensed Circos-based visualization, which allows for easy exploration and interpretation of the found variation. We validated the algorithm on two simulated datasets containing three and seven Escherichia coli genomes, respectively, and showed that finding allelic variation in these genomes improves assemblies. Additionally, we applied MaryGold to publicly available real metagenomic datasets, enabling us to find within-sample genomic variation in the metagenomes of a kimchi fermentation process, the microbiome of a premature infant and in microbial communities living on acid mine drainage. Moreover, we used MaryGold for between-sample variation detection and exploration by comparing sequencing data sampled at different time points for both of these datasets. Availability: MaryGold has been written in C++ and Python and can be downloaded from http://bioinformatics.tudelft.nl/software Contact: d.deridder@tudelft.nl PMID:24058058

  17. Ray Meta: scalable de novo metagenome assembly and profiling

    PubMed Central

    2012-01-01

    Voluminous parallel sequencing datasets, especially metagenomic experiments, require distributed computing for de novo assembly and taxonomic profiling. Ray Meta is a massively distributed metagenome assembler that is coupled with Ray Communities, which profiles microbiomes based on uniquely-colored k-mers. It can accurately assemble and profile a three billion read metagenomic experiment representing 1,000 bacterial genomes of uneven proportions in 15 hours with 1,024 processor cores, using only 1.5 GB per core. The software will facilitate the processing of large and complex datasets, and will help in generating biological insights for specific environments. Ray Meta is open source and available at http://denovoassembler.sf.net. PMID:23259615
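
    Profiling with uniquely-colored k-mers can be pictured with a toy example: a k-mer that occurs in exactly one reference genome acts as a marker for that genome, and read k-mers that match a marker are tallied. The sketch below is an assumption-laden illustration, not Ray Communities itself; the function names and sequences are invented, and it ignores reverse complements, sequencing errors, and distributed execution.

```python
from collections import Counter, defaultdict


def kmers(sequence: str, k: int):
    """Yield every k-length substring of `sequence`."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def unique_kmer_index(references: dict, k: int) -> dict:
    """Map each k-mer found in exactly one reference to that reference's name."""
    owners = defaultdict(set)
    for name, sequence in references.items():
        for kmer in kmers(sequence, k):
            owners[kmer].add(name)
    return {kmer: next(iter(names)) for kmer, names in owners.items() if len(names) == 1}


def profile(reads: list, index: dict, k: int) -> Counter:
    """Count, per reference, how many read k-mers match one of its unique k-mers."""
    hits = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            if kmer in index:
                hits[index[kmer]] += 1
    return hits


# Toy sequences, invented for illustration only.
references = {
    "genome_A": "ACGTACGTTAGC",
    "genome_B": "TTGACCGTACGA",
}
reads = ["ACGTTAGC", "GACCGTAC", "CGTACG"]

index = unique_kmer_index(references, k=5)
counts = profile(reads, index, k=5)
print(dict(counts))
```

    The third read consists only of k-mers shared by both genomes, so it contributes nothing to either tally; discarding ambiguous k-mers in this way is what makes the remaining hits attributable to a single genome.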

  18. MetaCAA: A clustering-aided methodology for efficient assembly of metagenomic datasets.

    PubMed

    Reddy, Rachamalla Maheedhar; Mohammed, Monzoorul Haque; Mande, Sharmila S

    2014-01-01

    A key challenge in analyzing metagenomics data pertains to assembly of sequenced DNA fragments (i.e. reads) originating from various microbes in a given environmental sample. Several existing methodologies can assemble reads originating from a single genome. However, these methodologies cannot be applied for efficient assembly of metagenomic sequence datasets. In this study, we present MetaCAA - a clustering-aided methodology which helps in improving the quality of metagenomic sequence assembly. MetaCAA initially groups sequences constituting a given metagenome into smaller clusters. Subsequently, sequences in each cluster are independently assembled using CAP3, an existing single genome assembly program. Contigs formed in each of the clusters along with the unassembled reads are then subjected to another round of assembly for generating the final set of contigs. Validation using simulated and real-world metagenomic datasets indicates that MetaCAA aids in improving the overall quality of assembly. A software implementation of MetaCAA is available at https://metagenomics.atc.tcs.com/MetaCAA. Copyright © 2014 Elsevier Inc. All rights reserved.

  19. New Hydrocarbon Degradation Pathways in the Microbial Metagenome from Brazilian Petroleum Reservoirs

    PubMed Central

    Sierra-García, Isabel Natalia; Correa Alvarez, Javier; Pantaroto de Vasconcellos, Suzan; Pereira de Souza, Anete; dos Santos Neto, Eugenio Vaz; de Oliveira, Valéria Maia

    2014-01-01

    Current knowledge of the microbial diversity and metabolic pathways involved in hydrocarbon degradation in petroleum reservoirs is still limited, mostly due to the difficulty in recovering the complex community from such an extreme environment. Metagenomics is a valuable tool to investigate the genetic and functional diversity of previously uncultured microorganisms in natural environments. Using a function-driven metagenomic approach, we investigated the metabolic abilities of microbial communities in oil reservoirs. Here, we describe novel functional metabolic pathways involved in the biodegradation of aromatic compounds in a metagenomic library obtained from an oil reservoir. Although many of the deduced proteins shared homology with known enzymes of different well-described aerobic and anaerobic catabolic pathways, the metagenomic fragments did not contain the complete clusters known to be involved in hydrocarbon degradation. Instead, the metagenomic fragments comprised genes belonging to different pathways, showing novel gene arrangements. These results reinforce the potential of the metagenomic approach for the identification and elucidation of new genes and pathways in poorly studied environments and contribute to a broader perspective on the hydrocarbon degradation processes in petroleum reservoirs. PMID:24587220

  20. Metagenome assembly through clustering of next-generation sequencing data using protein sequences.

    PubMed

    Sim, Mikang; Kim, Jaebum

    2015-02-01

    The study of environmental microbial communities, called metagenomics, has gained a lot of attention because of the recent advances in next-generation sequencing (NGS) technologies. Microbes play a critical role in changing their environments, and the mode of their effect can be elucidated by investigating metagenomes. However, the complexity of metagenomes, such as the mixture of multiple microbes and differing species abundances, makes metagenome assembly more challenging. In this paper, we developed a new metagenome assembly method by utilizing protein sequences, in addition to the NGS read sequences. Our method (i) builds read clusters by using mapping information against available protein sequences, and (ii) creates contig sequences by finding consensus sequences through probabilistic choices from the read clusters. By using simulated NGS read sequences from real microbial genome sequences, we evaluated our method in comparison with four existing assembly programs. We found that our method could generate relatively long and accurate metagenome assemblies, indicating that the idea of using protein sequences as a guide for the assembly is promising. Copyright © 2015 Elsevier B.V. All rights reserved.

  1. Antibiotic resistance genes across a wide variety of metagenomes.

    PubMed

    Fitzpatrick, David; Walsh, Fiona

    2016-02-01

    The distribution of potential clinically relevant antibiotic resistance (AR) genes across soil, water, animal, plant and human microbiomes is not well understood. We aimed to investigate if there were differences in the distribution and relative abundances of resistance genes across a variety of ecological niches. All sequence reads (human, animal, water, soil, plant and insect metagenomes) from the MG-RAST database were downloaded and assembled into a local sequence database. Using metagenomic analysis, we show that there are many reservoirs of the basic form of resistance genes, e.g. blaTEM, but the human and mammalian gut microbiomes contain the widest diversity of clinically relevant resistance genes. The human microbiomes contained a high relative abundance of resistance genes, while the relative abundances varied greatly in the marine and soil metagenomes, when datasets with greater than one million genes were compared. While these results reflect a bias in the distribution of AR genes across the metagenomes, we note this interpretation with caution. Metagenomic analysis has limits in terms of detection and identification of AR genes in complex and diverse microbiome populations. Therefore, if we do not detect the AR gene, is it in fact not there or just below the limits of our techniques? © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  2. Metagenomics, metatranscriptomics and single cell genomics reveal functional response of active Oceanospirillales to Gulf oil spill

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mason, Olivia U.; Hazen, Terry C.; Borglin, Sharon

    The Deepwater Horizon oil spill in the Gulf of Mexico resulted in a deep-sea hydrocarbon plume that caused a shift in the indigenous microbial community composition with unknown ecological consequences. Early in the spill history, a bloom of uncultured, thus uncharacterized, members of the Oceanospirillales was previously detected, but their role in oil disposition was unknown. Here our aim was to determine the functional role of the Oceanospirillales and other active members of the indigenous microbial community using deep sequencing of community DNA and RNA, as well as single-cell genomics. Shotgun metagenomic and metatranscriptomic sequencing revealed that genes for motility, chemotaxis and aliphatic hydrocarbon degradation were significantly enriched and expressed in the hydrocarbon plume samples compared with uncontaminated seawater collected from plume depth. In contrast, although genes coding for degradation of more recalcitrant compounds, such as benzene, toluene, ethylbenzene, total xylenes and polycyclic aromatic hydrocarbons, were identified in the metagenomes, they were expressed at low levels, or not at all based on analysis of the metatranscriptomes. Isolation and sequencing of two Oceanospirillales single cells revealed that both cells possessed genes coding for n-alkane and cycloalkane degradation. Specifically, the near-complete pathway for cyclohexane oxidation in the Oceanospirillales single cells was elucidated and supported by both metagenome and metatranscriptome data. The draft genome also included genes for chemotaxis, motility and nutrient acquisition strategies that were also identified in the metagenomes and metatranscriptomes. These data point towards a rapid response of members of the Oceanospirillales to aliphatic hydrocarbons in the deep sea.

  3. Unsupervised discovery of microbial population structure within metagenomes using nucleotide base composition

    PubMed Central

    Saeed, Isaam; Tang, Sen-Lin; Halgamuge, Saman K.

    2012-01-01

    An approach to infer the unknown microbial population structure within a metagenome is to cluster nucleotide sequences based on common patterns in base composition, otherwise referred to as binning. When functional roles are assigned to the identified populations, a deeper understanding of microbial communities can be attained, more so than gene-centric approaches that explore overall functionality. In this study, we propose an unsupervised, model-based binning method with two clustering tiers, which uses a novel transformation of the oligonucleotide frequency-derived error gradient and GC content to generate coarse groups at the first tier of clustering; and tetranucleotide frequency to refine these groups at the secondary clustering tier. The proposed method has a demonstrated improvement over PhyloPythia, S-GSOM, TACOA and TaxSOM on all three benchmarks that were used for evaluation in this study. The proposed method is then applied to a pyrosequenced metagenomic library of mud volcano sediment sampled in southwestern Taiwan, with the inferred population structure validated against complementary sequencing of 16S ribosomal RNA marker genes. Finally, the proposed method was further validated against four publicly available metagenomes, including a highly complex Antarctic whale-fall bone sample, which was previously assumed to be too complex for binning prior to functional analysis. PMID:22180538
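
    Two of the composition features named in this abstract, GC content and tetranucleotide frequency, are straightforward to compute. The sketch below shows only those two features; it does not reproduce the paper's oligonucleotide frequency-derived error gradient or its two-tier clustering. Function names are assumptions, and sequences are assumed to be non-empty strings.

```python
from collections import Counter
from itertools import product


def gc_content(seq):
    """Fraction of bases in seq that are G or C (seq must be non-empty)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def tetranucleotide_frequencies(seq):
    """Return a 256-element vector of tetranucleotide frequencies.

    Entries follow the lexicographic order AAAA, AAAC, ..., TTTT.
    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    all_tetramers = ["".join(p) for p in product("ACGT", repeat=4)]
    total = sum(counts[t] for t in all_tetramers)
    return [counts[t] / total if total else 0.0 for t in all_tetramers]


contig = "ACGTACGTAC"  # hypothetical contig
print(gc_content(contig))
print(len(tetranucleotide_frequencies(contig)))
```

    Feature vectors like these are what a clustering step would then group into bins; in practice, a sequence and its reverse complement are often pooled before counting.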

  4. A Novel Multifunctional β-N-Acetylhexosaminidase Revealed through Metagenomics of an Oil-Spilled Mangrove

    PubMed Central

    Soares, Fábio Lino; Marcon, Joelma; Khakhum, Nittaya; Cerdeira, Louise Teixeira; Domingos, Daniela Ferreira; Taketani, Rodrigo Gouvea; de Oliveira, Valéria Maia; Lima, André Oliveira de Souza

    2017-01-01

    The use of culture-independent approaches, such as metagenomics, provides complementary access to environmental microbial diversity. Mangrove environments represent a highly complex system with plenty of opportunities for finding singular functions. In this study we performed a functional screening of fosmid libraries obtained from an oil contaminated mangrove site, with the purpose of identifying clones expressing hydrolytic activities. A novel gene coding for a β-N-acetylhexosaminidase with 355 amino acids and 43KDa was retrieved and characterized. The translated sequence showed only 38% similarity to a β-N-acetylhexosaminidase gene in the genome of Veillonella sp. CAG:933, suggesting that it might constitute a novel enzyme. The enzyme was expressed, purified, and characterized for its enzymatic activity on carboxymethyl cellulose, p-Nitrophenyl-2acetamide-2deoxy-β-d-glucopyranoside, p-Nitrophenyl-2acetamide-2deoxy-β-d-galactopyranoside, and 4-Nitrophenyl β-d-glucopyranoside, presenting β-N-acetylglucosaminidase, β-glucosidase, and β-1,4-endoglucanase activities. The enzyme showed optimum activity at 30 °C and pH 5.5. The characterization of the putative novel β-N-acetylglucosaminidase enzyme reflects similarities to characteristics of the environment explored, which differs from milder conditions environments. This work exemplifies the application of cultivation-independent molecular techniques to the mangrove microbiome for obtaining a novel biotechnological product. PMID:28952541

  5. Metagenomes from two microbial consortia associated with Santa Barbara seep oil.

    PubMed

    Hawley, Erik R; Malfatti, Stephanie A; Pagani, Ioanna; Huntemann, Marcel; Chen, Amy; Foster, Brian; Copeland, Alexander; del Rio, Tijana Glavina; Pati, Amrita; Jansson, Janet R; Gilbert, Jack A; Tringe, Susannah Green; Lorenson, Thomas D; Hess, Matthias

    2014-12-01

    The metagenomes from two microbial consortia associated with natural oils seeping into the Pacific Ocean offshore the coast of Santa Barbara (California, USA) were determined to complement already existing metagenomes generated from microbial communities associated with hydrocarbons that pollute the marine ecosystem. This genomics resource article is the first of two publications reporting a total of four new metagenomes from oils that seep into the Santa Barbara Channel. Copyright © 2014 Elsevier B.V. All rights reserved.

  6. Metagenomes of complex microbial consortia derived from different soils as sources for novel genes conferring formation of carbonyls from short-chain polyols on Escherichia coli.

    PubMed

    Knietsch, Anja; Waschkowitz, Tanja; Bowien, Susanne; Henne, Anke; Daniel, Rolf

    2003-01-01

    Metagenomic DNA libraries from three different soil samples (meadow, sugar beet field, cropland) were constructed. The three unamplified libraries comprised approximately 1,267,000 independent clones and harbored approximately 4.05 Gbp of environmental DNA. Approximately 300,000 recombinant Escherichia coli strains of each library per test substrate were screened for the production of carbonyls from short-chain (C2 to C4) polyols such as 1,2-ethanediol, 2,3-butanediol, and a mixture of glycerol and 1,2-propanediol on indicator agar. Twenty-four positive E. coli clones were obtained during the initial screen. Fifteen of them contained recombinant plasmids, designated pAK201-215, which conferred a stable carbonyl-forming phenotype on E. coli. Sequencing revealed that the inserts of pAK201-215 encoded 26 complete and 14 incomplete predicted protein-encoding genes. Most of these genes were similar to genes with unknown functions from other microorganisms or unrelated to any other known gene. Further analysis focused on the 7 plasmids (pAK204, pAK206, pAK208, and pAK210-213) recovered from the positive clones that exhibited an NAD(H)-dependent alcohol oxidoreductase activity with polyols or the corresponding carbonyls as substrates in crude extracts. Three genes (ORF6, ORF24, and ORF25) conferring this activity were identified during subcloning of the inserts of pAK204, pAK211, and pAK212. The sequences of the three deduced gene products revealed no significant similarities to known alcohol oxidoreductases, but contained putative glycine-rich regions, which are characteristic for binding of nicotinamide cofactors. Copyright 2003 S. Karger AG, Basel

  7. Metagenomics and the protein universe

    PubMed Central

    Godzik, Adam

    2011-01-01

    Metagenomics sequencing projects have dramatically increased our knowledge of the protein universe and provided over one-half of currently known protein sequences; they have also introduced a much broader phylogenetic diversity into the protein databases. The full analysis of metagenomic datasets is only beginning, but it has already led to the discovery of thousands of new protein families, likely representing novel functions specific to given environments. At the same time, a deeper analysis of such novel families, including experimental structure determination of some representatives, suggests that most of them represent distant homologs of already characterized protein families, and thus most of the protein diversity present in the new environments are due to functional divergence of the known protein families rather than the emergence of new ones. PMID:21497084

  8. MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Sakakibara, Yasumbumi

    2018-02-13

    Keio University's Yasumbumi Sakakibara on "MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  9. MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sakakibara, Yasumbumi

    2011-10-13

    Keio University's Yasumbumi Sakakibara on "MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  10. Comparative fecal metagenomics unveils unique functional capacity of the swine gut

    PubMed Central

    2011-01-01

    Background: Uncovering the taxonomic composition and functional capacity within the swine gut microbial consortia is of great importance to animal physiology and health as well as to food and water safety due to the presence of human pathogens in pig feces. Nonetheless, limited information on the functional diversity of the swine gut microbiome is available. Results: Analysis of 637,722 pyrosequencing reads (130 megabases) generated from Yorkshire pig fecal DNA extracts was performed to help better understand the microbial diversity and largely unknown functional capacity of the swine gut microbiome. Swine fecal metagenomic sequences were annotated using both MG-RAST and JGI IMG/M-ER pipelines. Taxonomic analysis of metagenomic reads indicated that swine fecal microbiomes were dominated by Firmicutes and Bacteroidetes phyla. At a finer phylogenetic resolution, Prevotella spp. dominated the swine fecal metagenome, while some genes associated with Treponema and Anaerovibrio species were found exclusively within the pig fecal metagenomic sequences analyzed. Functional analysis revealed that carbohydrate metabolism was the most abundant SEED subsystem, representing 13% of the swine metagenome. Genes associated with stress, virulence, cell wall and cell capsule were also abundant. Virulence factors associated with antibiotic resistance genes with highest sequence homology to genes in Bacteroidetes, Clostridia, and Methanosarcina were numerous within the gene families unique to the swine fecal metagenomes. Other abundant proteins unique to the distal swine gut shared high sequence homology to putative carbohydrate membrane transporters. Conclusions: The results from this metagenomic survey demonstrated the presence of genes associated with resistance to antibiotics and carbohydrate metabolism, suggesting that the swine gut microbiome may be shaped by husbandry practices. PMID:21575148

  11. Studying long 16S rDNA sequences with ultrafast-metagenomic sequence classification using exact alignments (Kraken).

    PubMed

    Valenzuela-González, Fabiola; Martínez-Porchas, Marcel; Villalpando-Canchola, Enrique; Vargas-Albores, Francisco

    2016-03-01

    Ultrafast-metagenomic sequence classification using exact alignments (Kraken) is a novel approach to classify 16S rDNA sequences. The classifier is based on mapping short sequences to the lowest ancestor and performing alignments to form subtrees with specific weights in each taxon node. This study aimed to evaluate the classification performance of Kraken with long 16S rDNA random environmental sequences produced by cloning and then Sanger sequenced. A total of 480 clones were isolated and expanded, and 264 of these clones formed contigs (1352 ± 153 bp). The same sequences were analyzed using the Ribosomal Database Project (RDP) classifier. Deeper classification performance was achieved by Kraken than by the RDP: 73% of the contigs were classified up to the species or variety levels, whereas 67% of these contigs were classified no further than the genus level by the RDP. The results also demonstrated that unassembled sequences analyzed by Kraken provide similar or inclusively deeper information. Moreover, sequences that did not form contigs, which are usually discarded by other programs, provided meaningful information when analyzed by Kraken. Finally, it appears that the assembly step for Sanger sequences can be eliminated when using Kraken. Kraken cumulates the information of both sequence senses, providing additional elements for the classification. In conclusion, the results demonstrate that Kraken is an excellent choice for use in the taxonomic assignment of sequences obtained by Sanger sequencing or based on third generation sequencing, of which the main goal is to generate larger sequences. Copyright © 2016 Elsevier B.V. All rights reserved.
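
    The "lowest ancestor" mapping mentioned in this abstract refers to the lowest common ancestor (LCA) of taxa in a taxonomy tree. The sketch below shows only that LCA step on a toy parent-pointer taxonomy; it is not Kraken's implementation, and the dictionary layout, function names and the small taxonomy are assumptions for illustration.

```python
def lineage(taxon, parent):
    """Return the path from taxon up to the root, starting with taxon itself."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(a, b, parent):
    """Return the deepest taxon that is an ancestor of (or equal to) both a and b."""
    ancestors_of_a = set(lineage(a, parent))
    for taxon in lineage(b, parent):  # walks upward, so the first hit is the lowest
        if taxon in ancestors_of_a:
            return taxon
    return None


# Toy taxonomy as child -> parent pointers (hypothetical, heavily simplified).
parent = {
    "Escherichia coli": "Escherichia",
    "Escherichia fergusonii": "Escherichia",
    "Escherichia": "Enterobacteriaceae",
    "Salmonella enterica": "Salmonella",
    "Salmonella": "Enterobacteriaceae",
    "Enterobacteriaceae": "root",
}

print(lowest_common_ancestor("Escherichia coli", "Escherichia fergusonii", parent))
print(lowest_common_ancestor("Escherichia coli", "Salmonella enterica", parent))
```

    A sequence fragment shared by several species would be assigned to their LCA, which is why conserved regions tend to yield genus-level or higher assignments.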

  12. Insights into resistome and stress responses genes in Bubalus bubalis rumen through metagenomic analysis.

    PubMed

    Reddy, Bhaskar; Singh, Krishna M; Patel, Amrutlal K; Antony, Ancy; Panchasara, Harshad J; Joshi, Chaitanya G

    2014-10-01

    Buffalo rumen microbiota experience a variety of diets and represent a huge reservoir of mobilome, resistome and stress responses. However, knowledge of metagenomic responses to such conditions is still rudimentary. We analyzed the metagenomes of the liquid and solid phases of the rumen biomaterial from river buffalo adapted to varying proportions of concentrate to green or dry roughages, using high-throughput sequencing to determine the occurrence of antibiotic resistance genes and genetic exchange between bacterial populations and environmental reservoirs. A total of 3914.94 MB of data were generated from all three treatment groups. The data were analysed with Metagenome rapid annotation system tools. At the phylum level, Bacteroidetes were dominant in all the treatments, followed by Firmicutes. Genes coding for functional responses to stress (oxidative stress and heat shock proteins) and resistome genes (resistance to antibiotics and toxic compounds, phages, transposable elements and pathogenicity islands) were prevalent in similar proportions in the liquid and solid fractions of rumen metagenomes. The fluoroquinolone resistance, MDR efflux pumps and Methicillin resistance genes were broadly distributed across 11, 9, and 14 bacterial classes, respectively. Bacteria responsible for phage replication, prophages, phage packaging and rlt-like streptococcal phage genes were mostly assigned to phyla Bacteroides, Firmicutes and Proteobacteria. Also, more reads matching the sigma B genes were identified in the buffalo rumen. This study underscores the presence of diverse mechanisms of adaptation to different diets, antibiotics and other stresses in buffalo rumen, reflecting the proportional representation of major bacterial groups.

  13. Exon trapping: a genetic screen to identify candidate transcribed sequences in cloned mammalian genomic DNA.

    PubMed

    Duyk, G M; Kim, S W; Myers, R M; Cox, D R

    1990-11-01

    Identification and recovery of transcribed sequences from cloned mammalian genomic DNA remains an important problem in isolating genes on the basis of their chromosomal location. We have developed a strategy that facilitates the recovery of exons from random pieces of cloned genomic DNA. The basis of this "exon trapping" strategy is that, during a retroviral life cycle, genomic sequences of nonviral origin are correctly spliced and may be recovered as a cDNA copy of the introduced segment. By using this genetic assay for cis-acting sequences required for RNA splicing, we have screened approximately 20 kilobase pairs of cloned genomic DNA and have recovered all four predicted exons.

  14. Exon trapping: a genetic screen to identify candidate transcribed sequences in cloned mammalian genomic DNA.

    PubMed Central

    Duyk, G M; Kim, S W; Myers, R M; Cox, D R

    1990-01-01

    Identification and recovery of transcribed sequences from cloned mammalian genomic DNA remains an important problem in isolating genes on the basis of their chromosomal location. We have developed a strategy that facilitates the recovery of exons from random pieces of cloned genomic DNA. The basis of this "exon trapping" strategy is that, during a retroviral life cycle, genomic sequences of nonviral origin are correctly spliced and may be recovered as a cDNA copy of the introduced segment. By using this genetic assay for cis-acting sequences required for RNA splicing, we have screened approximately 20 kilobase pairs of cloned genomic DNA and have recovered all four predicted exons. PMID:2247475

  15. Cloning and characterization of thermo-alkalistable and surfactant stable endoglucanase from Puga hot spring metagenome of Ladakh (J&K).

    PubMed

    Gupta, Puneet; Mishra, Arjun K; Vakhlu, Jyoti

    2017-10-01

    A thermo-alkalistable and surfactant stable endoglucanase (PHS) gene encoding 554 amino acids was identified from a metagenomic library of Puga hot spring using functional screening. The PHS gene was overexpressed and the protein purified to homogeneity using affinity chromatography. The purified PHS protein presented a single band of 60 kDa on the SDS-PAGE gel and zymogram. The recombinant PHS exhibited activity over a broad range of pH and temperature with optima at pH 8.0 and 65°C, respectively, and had optimum stability at 60°C and pH 8.0. The recombinant PHS showed highest substrate specificity using CMC (218.4 U/mg) as compared with Barley β-glucan (89.2 U/mg) and Avicel (0.8 U/mg). The Km and Vmax of recombinant PHS for CMC were 3.85 mg/ml and 370.37 μmol min⁻¹ mg⁻¹, respectively. The activity of the recombinant PHS was enhanced by treatment with 10 mM non-ionic detergents such as Tween 20, Tween 40, Tween 80, Triton X-100 and PEG and was inhibited by CTAB and SDS. Its functionality was stable in the presence of Fe³⁺ but inhibited by Cu²⁺, Hg²⁺, Mn²⁺ and Zn²⁺. These properties make PHS endoglucanase a potential candidate for use in laundry, textile, paper and pulp industries. Copyright © 2017 Elsevier B.V. All rights reserved.
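
    To make the reported kinetic constants concrete, the sketch below evaluates the Michaelis-Menten rate law, v = Vmax·[S] / (Km + [S]), using the Km and Vmax values quoted in this abstract for CMC. That the enzyme follows simple Michaelis-Menten kinetics is an assumption implied by the reporting of Km and Vmax, not something the abstract states; the function and constant names are illustrative.

```python
def michaelis_menten_rate(substrate, vmax, km):
    """Reaction rate under Michaelis-Menten kinetics.

    substrate and km must share units (here mg/ml); the result has the
    units of vmax (here micromol per minute per mg of enzyme).
    """
    return vmax * substrate / (km + substrate)


KM_CMC = 3.85      # mg/ml, from the abstract
VMAX_CMC = 370.37  # micromol min^-1 mg^-1, from the abstract

# At a substrate concentration equal to Km, the rate is half of Vmax.
half_max = michaelis_menten_rate(KM_CMC, VMAX_CMC, KM_CMC)
print(round(half_max, 3))  # 185.185

# Rates at a few illustrative CMC concentrations (values chosen arbitrarily).
for s in (1.0, 3.85, 10.0, 50.0):
    print(s, round(michaelis_menten_rate(s, VMAX_CMC, KM_CMC), 2))
```

    The rate approaches Vmax only at substrate concentrations well above Km, which is why Km is a convenient measure of how much substrate the enzyme needs to work efficiently.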

  16. Chronic Meningitis Investigated via Metagenomic Next-Generation Sequencing

    PubMed Central

    O’Donovan, Brian D.; Gelfand, Jeffrey M.; Sample, Hannah A.; Chow, Felicia C.; Betjemann, John P.; Shah, Maulik P.; Richie, Megan B.; Gorman, Mark P.; Hajj-Ali, Rula A.; Calabrese, Leonard H.; Zorn, Kelsey C.; Chow, Eric D.; Greenlee, John E.; Blum, Jonathan H.; Green, Gary; Khan, Lillian M.; Banerji, Debarko; Langelier, Charles; Bryson-Cahn, Chloe; Harrington, Whitney; Lingappa, Jairam R.; Shanbhag, Niraj M.; Green, Ari J.; Brew, Bruce J.; Soldatos, Ariane; Strnad, Luke; Doernberg, Sarah B.; Jay, Cheryl A.; Douglas, Vanja; Josephson, S. Andrew; DeRisi, Joseph L.

    2018-01-01

    Importance Identifying infectious causes of subacute or chronic meningitis can be challenging. Enhanced, unbiased diagnostic approaches are needed. Objective To present a case series of patients with diagnostically challenging subacute or chronic meningitis using metagenomic next-generation sequencing (mNGS) of cerebrospinal fluid (CSF) supported by a statistical framework generated from mNGS of control samples from the environment and from patients who were noninfectious. Design, Setting, and Participants In this case series, mNGS data obtained from the CSF of 94 patients with noninfectious neuroinflammatory disorders and from 24 water and reagent control samples were used to develop and implement a weighted scoring metric based on z scores at the species and genus levels for both nucleotide and protein alignments to prioritize and rank the mNGS results. Total RNA was extracted for mNGS from the CSF of 7 participants with subacute or chronic meningitis who were recruited between September 2013 and March 2017 as part of a multicenter study of mNGS pathogen discovery among patients with suspected neuroinflammatory conditions. The neurologic infections identified by mNGS in these 7 participants represented a diverse array of pathogens. The patients were referred from the University of California, San Francisco Medical Center (n = 2), Zuckerberg San Francisco General Hospital and Trauma Center (n = 2), Cleveland Clinic (n = 1), University of Washington (n = 1), and Kaiser Permanente (n = 1). A weighted z score was used to filter out environmental contaminants and facilitate efficient data triage and analysis. Main Outcomes and Measures Pathogens identified by mNGS and the ability of a statistical model to prioritize, rank, and simplify mNGS results. Results The 7 participants ranged in age from 10 to 55 years, and 3 (43%) were female. A parasitic worm (Taenia solium, in 2 participants), a virus (HIV-1), and 4 fungi (Cryptococcus neoformans

  17. MG-Digger: An Automated Pipeline to Search for Giant Virus-Related Sequences in Metagenomes

    PubMed Central

    Verneau, Jonathan; Levasseur, Anthony; Raoult, Didier; La Scola, Bernard; Colson, Philippe

    2016-01-01

    The number of metagenomic studies conducted each year is growing dramatically. Storage and analysis of such big data is difficult and time-consuming. Interestingly, analysis shows that environmental and human metagenomes include a significant amount of non-annotated sequences, representing a ‘dark matter.’ We established a bioinformatics pipeline that automatically detects metagenome reads matching query sequences from a given set and applied this tool to the detection of sequences matching large and giant DNA viral members of the proposed order Megavirales or virophages. A total of 1,045 environmental and human metagenomes (≈ 1 Terabase) were collected, processed, and stored on our bioinformatics server. In addition, nucleotide and protein sequences from 93 Megavirales representatives, including 19 giant viruses of amoeba, and 5 virophages, were collected. The pipeline was generated by scripts written in Python language and entitled MG-Digger. Metagenomes previously found to contain megavirus-like sequences were tested as controls. MG-Digger was able to annotate 100s of metagenome sequences as best matching those of giant viruses. These sequences were most often found to be similar to phycodnavirus or mimivirus sequences, but included reads related to recently available pandoraviruses, Pithovirus sibericum, and faustoviruses. Compared to other tools, MG-Digger combined stand-alone use on Linux or Windows operating systems through a user-friendly interface, implementation of ready-to-use customized metagenome databases and query sequence databases, adjustable parameters for BLAST searches, and creation of output files containing selected reads with best match identification. Compared to Metavir 2, a reference tool in viral metagenome analysis, MG-Digger detected 8% more true positive Megavirales-related reads in a control metagenome. The present work shows that massive, automated and recurrent analyses of metagenomes are effective in improving knowledge about

  18. Dip in the gene pool: metagenomic survey of natural coccolithovirus communities.

    PubMed

    Pagarete, António; Kusonmano, Kanthida; Petersen, Kjell; Kimmance, Susan A; Martínez Martínez, Joaquín; Wilson, William H; Hehemann, Jan-Hendrik; Allen, Michael J; Sandaa, Ruth-Anne

    2014-10-01

    Despite the global oceanic distribution and recognised biogeochemical impact of coccolithoviruses (EhV), their diversity remains poorly understood. Here we employed a metagenomic approach to study the occurrence and progression of natural EhV community genomic variability. Analysis of EhV metagenomes from the early and late stages of an induced bloom led to three main discoveries. First, we observed resilient and specific genomic signatures in the EhV community associated with the Norwegian coast, which reinforce the existence of limitations to the capacity of dispersal and genomic exchange among EhV populations. Second, we identified a hyper-variable region (approximately 21kbp long) in the coccolithovirus genome. Third, we observed a clear trend for EhV relative amino-acid diversity to reduce from early to late stages of the bloom. This study validated two new methodological combinations, and proved very useful in the discovery of new genomic features associated with coccolithovirus natural communities. Copyright © 2014 Elsevier Inc. All rights reserved.

  19. Integrative workflows for metagenomic analysis

    PubMed Central

    Ladoukakis, Efthymios; Kolisis, Fragiskos N.; Chatziioannou, Aristotelis A.

    2014-01-01

    The rapid evolution of sequencing technologies, described by the term Next Generation Sequencing (NGS), has revolutionized metagenomic analysis. These technologies combine high-throughput analytical protocols with delicate measuring techniques in order to discover, properly assemble and map allelic sequences to the correct genomes, achieving particularly high yields for only a fraction of the cost of traditional processes (i.e., Sanger). From a bioinformatic perspective, this means that many GB of data are generated by each single sequencing experiment, making the management, or even the storage, of the data a critical bottleneck for the overall analytical endeavor. The complexity is aggravated further by the versatility of the processing steps available, represented by the numerous bioinformatic tools that each analytical task requires in order to fully unveil the genetic content of a metagenomic dataset. These disparate tasks range from simple, yet non-trivial, quality control of raw data to exceptionally complex protein annotation procedures, and they demand a high level of expertise for their proper application and for the neat implementation of the whole workflow. Furthermore, a bioinformatic analysis of such scale requires large computational resources, which makes the use of cloud computing infrastructures the sole realistic solution. In this review article we discuss the different integrative bioinformatic solutions available that address the aforementioned issues, critically assessing the available automated pipelines for data management, quality control, and annotation of metagenomic data across the major sequencing technologies and applications. PMID:25478562

  20. Making a living while starving in the dark: metagenomic insights into the energy dynamics of a carbonate cave.

    PubMed

    Ortiz, Marianyoly; Legatzki, Antje; Neilson, Julia W; Fryslie, Brandon; Nelson, William M; Wing, Rod A; Soderlund, Carol A; Pryor, Barry M; Maier, Raina M

    2014-02-01

    Carbonate caves represent subterranean ecosystems that are largely devoid of phototrophic primary production. In semiarid and arid regions, allochthonous organic carbon inputs entering caves with vadose-zone drip water are minimal, creating highly oligotrophic conditions; however, past research indicates that carbonate speleothem surfaces in these caves support diverse, predominantly heterotrophic prokaryotic communities. The current study applied a metagenomic approach to elucidate the community structure and potential energy dynamics of microbial communities, colonizing speleothem surfaces in Kartchner Caverns, a carbonate cave in semiarid, southeastern Arizona, USA. Manual inspection of a speleothem metagenome revealed a community genetically adapted to low-nutrient conditions with indications that a nitrogen-based primary production strategy is probable, including contributions from both Archaea and Bacteria. Genes for all six known CO2-fixation pathways were detected in the metagenome and RuBisCo genes representative of the Calvin-Benson-Bassham cycle were over-represented in Kartchner speleothem metagenomes relative to bulk soil, rhizosphere soil and deep-ocean communities. Intriguingly, quantitative PCR found Archaea to be significantly more abundant in the cave communities than in soils above the cave. MEtaGenome ANalyzer (MEGAN) analysis of speleothem metagenome sequence reads found Thaumarchaeota to be the third most abundant phylum in the community, and identified taxonomic associations to this phylum for indicator genes representative of multiple CO2-fixation pathways. The results revealed that this oligotrophic subterranean environment supports a unique chemoautotrophic microbial community with potentially novel nutrient cycling strategies. These strategies may provide key insights into other ecosystems dominated by oligotrophy, including aphotic subsurface soils or aquifers and photic systems such as arid deserts.

  1. Phylogenetic convolutional neural networks in metagenomics.

    PubMed

    Fioravanti, Diego; Giarratano, Ylenia; Maggio, Valerio; Agostinelli, Claudio; Chierici, Marco; Jurman, Giuseppe; Furlanello, Cesare

    2018-03-08

    Convolutional Neural Networks can be effectively used only when data are endowed with an intrinsic concept of neighbourhood in the input space, as is the case of pixels in images. We introduce here Ph-CNN, a novel deep learning architecture for the classification of metagenomics data based on the Convolutional Neural Networks, with the patristic distance defined on the phylogenetic tree being used as the proximity measure. The patristic distance between variables is used together with a sparsified version of MultiDimensional Scaling to embed the phylogenetic tree in a Euclidean space. Ph-CNN is tested with a domain adaptation approach on synthetic data and on a metagenomics collection of gut microbiota of 38 healthy subjects and 222 Inflammatory Bowel Disease patients, divided in 6 subclasses. Classification performance is promising when compared to classical algorithms like Support Vector Machines and Random Forest and a baseline fully connected neural network, e.g. the Multi-Layer Perceptron. Ph-CNN represents a novel deep learning approach for the classification of metagenomics data. Operatively, the algorithm has been implemented as a custom Keras layer taking care of passing to the following convolutional layer not only the data but also the ranked list of neighbourhood of each sample, thus mimicking the case of image data, transparently to the user.
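
    The neighbourhood idea in the abstract above can be illustrated with a minimal sketch. It assumes a toy tree whose node names and branch lengths are invented for illustration, computes the patristic distance (the sum of branch lengths on the path joining two leaves) and ranks each leaf's nearest neighbours. It does not reproduce the MultiDimensional Scaling embedding or the custom Keras layer mentioned in the abstract, and it is not the authors' code.

```python
# Hypothetical toy tree: child -> (parent, branch length to parent).
# Leaves A-D stand in for taxa; all names and lengths are made up.
TREE = {
    "A": ("n1", 0.1),
    "B": ("n1", 0.2),
    "C": ("n2", 0.3),
    "D": ("n2", 0.4),
    "n1": ("root", 0.5),
    "n2": ("root", 0.6),
}


def path_to_root(node):
    """Map every ancestor of `node` (and the node itself) to its distance from `node`."""
    dist = 0.0
    out = {node: 0.0}
    while node in TREE:
        parent, length = TREE[node]
        dist += length
        out[parent] = dist
        node = parent
    return out


def patristic(a, b):
    """Sum of branch lengths on the unique tree path between leaves a and b."""
    pa, pb = path_to_root(a), path_to_root(b)
    # Among shared ancestors, the lowest common one gives the smallest sum.
    return min(pa[n] + pb[n] for n in pa if n in pb)


def ranked_neighbours(leaf, leaves, k):
    """Return the k leaves closest to `leaf`, nearest first."""
    others = [x for x in leaves if x != leaf]
    return sorted(others, key=lambda x: patristic(leaf, x))[:k]


leaves = ["A", "B", "C", "D"]
for leaf in leaves:
    print(leaf, ranked_neighbours(leaf, leaves, 2))
```

    A ranked list of this kind is what lets a convolution slide over "adjacent" taxa in the way it slides over adjacent pixels in an image.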

  2. Metagenomic sequence of saline desert microbiota from wild ass sanctuary, Little Rann of Kutch, Gujarat, India.

    PubMed

    Patel, Rajesh; Mevada, Vishal; Prajapati, Dhaval; Dudhagara, Pravin; Koringa, Prakash; Joshi, C G

    2015-03-01

    We report a metagenome from a saline desert soil sample from the Little Rann of Kutch, Gujarat State, India. The metagenome consisted of 633,760 sequences totalling 141,307,202 bp, with 56% G + C content. The metagenome sequence data are available from the EBI Metagenomics database under accession no. ERP005612. Community metagenomics revealed a total of 1802 species belonging to 43 different phyla, with the genera Marinobacter (48.7%) and Halobacterium (4.6%) dominating the bacterial and archaeal domains, respectively. Remarkably, functional metagenome analysis showed 18.2% of sequences in a poorly characterized group and 4% of genes for various stress responses, along with a versatile presence of commercial enzymes.

  3. The Metagenome of Utricularia gibba's Traps: Into the Microbial Input to a Carnivorous Plant

    PubMed Central

    Alcaraz, Luis David; Martínez-Sánchez, Shamayim; Torres, Ignacio; Ibarra-Laclette, Enrique; Herrera-Estrella, Luis

    2016-01-01

    The genome and transcriptome sequences of the aquatic, rootless, and carnivorous plant Utricularia gibba L. (Lentibulariaceae) were recently determined. Traps are necessary for U. gibba because they help the plant to survive in nutrient-deprived environments. The traps of U. gibba (Ugt) are specialized structures that have been proposed to selectively filter microbial inhabitants. To determine whether the traps indeed have a microbiome that differs, in composition or abundance, from the microbiome in the surrounding environment, we used whole-genome shotgun (WGS) metagenomics to describe both the taxonomic and functional diversity of the Ugt microbiome. We collected U. gibba plants from their natural habitat and directly sequenced the metagenome of the Ugt microbiome and its surrounding water. The total predicted number of species in the Ugt was more than 1,100. Using pan-genome fragment recruitment analysis, we were able to identify some key Ugt players, such as Pseudomonas monteilii, to the species level. Functional analysis of the Ugt metagenome suggests that the trap microbiome plays an important role in nutrient scavenging and assimilation while complementing the hydrolytic functions of the plant. PMID:26859489

  4. Metagenomic Analysis of the Ferret Fecal Viral Flora

    PubMed Central

    Smits, Saskia L.; Raj, V. Stalin; Oduber, Minoushka D.; Schapendonk, Claudia M. E.; Bodewes, Rogier; Provacia, Lisette; Stittelaar, Koert J.; Osterhaus, Albert D. M. E.; Haagmans, Bart L.

    2013-01-01

    Ferrets are widely used as a small animal model for a number of viral infections, including influenza A virus and SARS coronavirus. To further analyze the microbiological status of ferrets, their fecal viral flora was studied using a metagenomics approach. Novel viruses from the families Picorna-, Papilloma-, and Anelloviridae as well as known viruses from the families Astro-, Corona-, Parvo-, and Hepeviridae were identified in different ferret cohorts. Ferret kobu- and hepatitis E virus were mainly present in human household ferrets, whereas coronaviruses were found both in household as well as farm ferrets. Our studies illuminate the viral diversity found in ferrets and provide tools to prescreen for newly identified viruses that potentially could influence disease outcome of experimental virus infections in ferrets. PMID:23977082

  5. MIPE: A metagenome-based community structure explorer and SSU primer evaluation tool

    PubMed Central

    Zhou, Quan

    2017-01-01

    An understanding of microbial community structure is an important issue in the field of molecular ecology. The traditional molecular method involves amplification of small subunit ribosomal RNA (SSU rRNA) genes by polymerase chain reaction (PCR). However, PCR-based amplicon approaches are affected by primer bias and chimeras. With the development of high-throughput sequencing technology, unbiased SSU rRNA gene sequences can be mined from shotgun sequencing-based metagenomic or metatranscriptomic datasets to obtain a reflection of the microbial community structure in specific types of environment and to evaluate SSU primers. However, the use of short reads obtained through next-generation sequencing for primer evaluation has not been well resolved. The software MIPE (MIcrobiota metagenome Primer Explorer) was developed to adapt numerous short reads from metagenomes and metatranscriptomes. Using metagenomic or metatranscriptomic datasets as input, MIPE extracts and aligns rRNA to reveal detailed information on microbial composition and evaluate SSU rRNA primers. A mock dataset, a real Metagenomics Rapid Annotation using Subsystem Technology (MG-RAST) test dataset, two PrimerProspector test datasets and a real metatranscriptomic dataset were used to validate MIPE. The software calls Mothur (v1.33.3) and the SILVA database (v119) for the alignment and classification of rRNA genes from a metagenome or metatranscriptome. MIPE can effectively extract shotgun rRNA reads from a metagenome or metatranscriptome and is capable of classifying these sequences and exhibiting sensitivity to different SSU rRNA PCR primers. Therefore, MIPE can be used to guide primer design for specific environmental samples. PMID:28350876
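
    Primer evaluation against shotgun reads, as described in the abstract above, can be sketched as follows. This is a minimal illustration, not MIPE's algorithm: the primer and reads are invented, the function names are hypothetical, and the sketch only scores reads that contain a full-length binding site, ignoring the partial overlaps that make short reads hard to use. Degenerate primer positions are interpreted with the standard IUPAC nucleotide codes.

```python
# IUPAC nucleotide codes: each primer symbol maps to the bases it accepts.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatches(primer, site):
    """Count positions where the read base is not allowed by the primer symbol."""
    return sum(base not in IUPAC[sym] for sym, base in zip(primer, site))


def best_site(primer, read):
    """Return (fewest mismatches, offset) over all full-length windows.

    Returns None when the read is shorter than the primer.
    """
    n = len(primer)
    if len(read) < n:
        return None
    return min(
        (mismatches(primer, read[i:i + n]), i)
        for i in range(len(read) - n + 1)
    )


def coverage(primer, reads, max_mismatch=0):
    """Fraction of scorable reads with a site within the mismatch budget."""
    hits = [h for h in (best_site(primer, r) for r in reads) if h is not None]
    if not hits:
        return 0.0
    return sum(h[0] <= max_mismatch for h in hits) / len(hits)


# Hypothetical degenerate primer and three made-up reads.
primer = "ACGYT"
reads = ["TTACGCTGG", "TTACGTTGG", "TTACGATGG"]
print(coverage(primer, reads, max_mismatch=0))
print(coverage(primer, reads, max_mismatch=1))
```

    Raising `max_mismatch` shows how tolerant a primer would need to be to cover the reads, which is one simple way to compare candidate primers for a given environment.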

  6. Therapeutic cloning: from consequences to contradiction.

    PubMed

    Coors, Marilyn

    2002-06-01

    The British Parliament legalized therapeutic cloning in December 2000 despite opposition from the European Union. The watershed event in Parliament's move was the active and unprecedented government support for the generation and destruction of human embryonic life merely as a means of medical advancement. This article contends that the utilitarian analysis of this procedure is necessary to identify the real world risks of therapeutic cloning but insufficient to identify the breach of defensible ethical limits that this procedure represents. A value-oriented approach to Kantian ethics demonstrates that the utilitarian endorsement of therapeutic cloning entails a contradiction of the necessity of human vulnerability and a faulty valuation of the human embryo. The concern is that a narrow utilitarian focus ultimately commodifies human embryonic life and preferences outcomes as the sole determinant of moral value.

  7. Functional metagenomic profiling of intestinal microbiome in extreme ageing.

    PubMed

    Rampelli, Simone; Candela, Marco; Turroni, Silvia; Biagi, Elena; Collino, Sebastiano; Franceschi, Claudio; O'Toole, Paul W; Brigidi, Patrizia

    2013-12-01

    Age-related alterations in human gut microbiota composition have been thoroughly described, but a detailed functional description of the intestinal bacterial coding capacity is still missing. In order to elucidate the contribution of the gut metagenome to the complex mosaic of human longevity, we applied shotgun sequencing to total fecal bacterial DNA in a selection of samples belonging to a well-characterized human ageing cohort. The age-related trajectory of the human gut microbiome was characterized by loss of genes for short-chain fatty acid production and an overall decrease in the saccharolytic potential, while proteolytic functions were more abundant than in the intestinal metagenome of younger adults. This altered functional profile was associated with a relevant enrichment in "pathobionts", i.e. opportunistic pro-inflammatory bacteria generally present in the adult gut ecosystem in low numbers. Finally, as a signature for long life we identified 116 microbial genes that significantly correlated with ageing. Collectively, our data emphasize the relationship between intestinal bacteria and human metabolism, by detailing the modifications in the gut microbiota as a consequence of and/or promoter of the physiological changes occurring in the human host upon ageing.

  8. Metagenomics reveals flavour metabolic network of cereal vinegar microbiota.

    PubMed

    Wu, Lin-Huan; Lu, Zhen-Ming; Zhang, Xiao-Juan; Wang, Zong-Min; Yu, Yong-Jian; Shi, Jin-Song; Xu, Zheng-Hong

    2017-04-01

    The multispecies microbial community formed through centuries of repeated batch acetic acid fermentation (AAF) is crucial for the flavour quality of traditional vinegar produced from cereals. However, how this community's metabolism generates and/or formulates the essential flavours is poorly understood. Here we used a metagenomic approach to clarify the in situ metabolic network of the key microbes responsible for flavour synthesis in a typical cereal vinegar, Zhenjiang aromatic vinegar, produced by solid-state fermentation. First, we identified 3 organic acids, 7 amino acids, and 20 volatiles as dominant vinegar metabolites. Second, we revealed the taxonomic and functional composition of the microbiota by metagenomic shotgun sequencing. A total of 86 201 predicted protein-coding genes from 35 phyla (951 genera) were involved in the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of Metabolism (42.3%), Genetic Information Processing (28.3%), and Environmental Information Processing (10.1%). Furthermore, a metabolic network for substrate breakdown and dominant flavour formation in vinegar microbiota was constructed, and the discrepancy in microbial distribution across different metabolic pathways was charted. This study helps to elucidate the different metabolic roles of microbes during flavour formation in vinegar microbiota. Copyright © 2016 Elsevier Ltd. All rights reserved.

  9. Functional metagenomic profiling of intestinal microbiome in extreme ageing

    PubMed Central

    Rampelli, Simone; Candela, Marco; Turroni, Silvia; Biagi, Elena; Collino, Sebastiano; Franceschi, Claudio; O'Toole, Paul W; Brigidi, Patrizia

    2013-01-01

    Age-related alterations in human gut microbiota composition have been thoroughly described, but a detailed functional description of the intestinal bacterial coding capacity is still missing. In order to elucidate the contribution of the gut metagenome to the complex mosaic of human longevity, we applied shotgun sequencing to total fecal bacterial DNA in a selection of samples belonging to a well-characterized human ageing cohort. The age-related trajectory of the human gut microbiome was characterized by loss of genes for short-chain fatty acid production and an overall decrease in the saccharolytic potential, while proteolytic functions were more abundant than in the intestinal metagenome of younger adults. This altered functional profile was associated with a relevant enrichment in “pathobionts”, i.e. opportunistic pro-inflammatory bacteria generally present in the adult gut ecosystem in low numbers. Finally, as a signature for long life we identified 116 microbial genes that significantly correlated with ageing. Collectively, our data emphasize the relationship between intestinal bacteria and human metabolism, by detailing the modifications in the gut microbiota as a consequence of and/or promoter of the physiological changes occurring in the human host upon ageing. PMID:24334635

  10. Microbial survival strategies in ancient permafrost: insights from metagenomics

    USGS Publications Warehouse

    Mackelprang, Rachel; Burkert, Alexander; Haw, Monica; Mahendrarajah, Tara; Conaway, Christopher H.; Douglas, Thomas A.; Waldrop, Mark P.

    2017-01-01

    In permafrost (perennially frozen ground) microbes survive oligotrophic conditions, sub-zero temperatures, low water availability and high salinity over millennia. Viable life exists in permafrost tens of thousands of years old but we know little about the metabolic and physiological adaptations to the challenges presented by life in frozen ground over geologic time. In this study we asked whether increasing age and the associated stressors drive adaptive changes in community composition and function. We conducted deep metagenomic and 16S rRNA gene sequencing across a Pleistocene permafrost chronosequence from 19 000 to 33 000 years before present (kyr). We found that age markedly affected community composition and reduced diversity. Reconstruction of paleovegetation from metagenomic sequence suggests vegetation differences in the paleo record are not responsible for shifts in community composition and function. Rather, we observed shifts consistent with long-term survival strategies in extreme cryogenic environments. These include increased reliance on scavenging detrital biomass, horizontal gene transfer, chemotaxis, dormancy, environmental sensing and stress response. Our results identify traits that may enable survival in ancient cryoenvironments with no influx of energy or new materials.

  11. Microbial survival strategies in ancient permafrost: insights from metagenomics.

    PubMed

    Mackelprang, Rachel; Burkert, Alexander; Haw, Monica; Mahendrarajah, Tara; Conaway, Christopher H; Douglas, Thomas A; Waldrop, Mark P

    2017-10-01

    In permafrost (perennially frozen ground) microbes survive oligotrophic conditions, sub-zero temperatures, low water availability and high salinity over millennia. Viable life exists in permafrost tens of thousands of years old but we know little about the metabolic and physiological adaptations to the challenges presented by life in frozen ground over geologic time. In this study we asked whether increasing age and the associated stressors drive adaptive changes in community composition and function. We conducted deep metagenomic and 16S rRNA gene sequencing across a Pleistocene permafrost chronosequence from 19 000 to 33 000 years before present (kyr). We found that age markedly affected community composition and reduced diversity. Reconstruction of paleovegetation from metagenomic sequence suggests vegetation differences in the paleo record are not responsible for shifts in community composition and function. Rather, we observed shifts consistent with long-term survival strategies in extreme cryogenic environments. These include increased reliance on scavenging detrital biomass, horizontal gene transfer, chemotaxis, dormancy, environmental sensing and stress response. Our results identify traits that may enable survival in ancient cryoenvironments with no influx of energy or new materials.

  12. Microbial survival strategies in ancient permafrost: insights from metagenomics

    PubMed Central

    Mackelprang, Rachel; Burkert, Alexander; Haw, Monica; Mahendrarajah, Tara; Conaway, Christopher H; Douglas, Thomas A; Waldrop, Mark P

    2017-01-01

    In permafrost (perennially frozen ground) microbes survive oligotrophic conditions, sub-zero temperatures, low water availability and high salinity over millennia. Viable life exists in permafrost tens of thousands of years old but we know little about the metabolic and physiological adaptations to the challenges presented by life in frozen ground over geologic time. In this study we asked whether increasing age and the associated stressors drive adaptive changes in community composition and function. We conducted deep metagenomic and 16S rRNA gene sequencing across a Pleistocene permafrost chronosequence from 19 000 to 33 000 years before present (kyr). We found that age markedly affected community composition and reduced diversity. Reconstruction of paleovegetation from metagenomic sequence suggests vegetation differences in the paleo record are not responsible for shifts in community composition and function. Rather, we observed shifts consistent with long-term survival strategies in extreme cryogenic environments. These include increased reliance on scavenging detrital biomass, horizontal gene transfer, chemotaxis, dormancy, environmental sensing and stress response. Our results identify traits that may enable survival in ancient cryoenvironments with no influx of energy or new materials. PMID:28696425

  13. Evaluation of the Cow Rumen Metagenome: Assembly by Single Copy Gene Analysis and Single Cell Genome Assemblies (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Sczyrba, Alex

    2018-02-13

    DOE JGI's Alex Sczyrba on "Evaluation of the Cow Rumen Metagenome" and "Assembly by Single Copy Gene Analysis and Single Cell Genome Assemblies" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  14. Evaluation of the Cow Rumen Metagenome: Assembly by Single Copy Gene Analysis and Single Cell Genome Assemblies (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sczyrba, Alex

    2011-10-13

    DOE JGI's Alex Sczyrba on "Evaluation of the Cow Rumen Metagenome" and "Assembly by Single Copy Gene Analysis and Single Cell Genome Assemblies" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  15. Phytoplankton Diversity and Geologically Relevant Carbon: Using metagenomics to determine phytoplankton biomarker production

    NASA Astrophysics Data System (ADS)

    Kodner, R. B.; Armbrust, E.

    2008-12-01

    Phytoplankton play an important role in the global carbon cycle, on short and long time scales. On long time scales, organic carbon, especially recalcitrant forms of biomass such as lipids, can be preserved and thus sequestered in sediments and rocks on geologic time scales. If the preserved lipids have some taxonomic specificity, they can be used as fossil biomarkers to characterize the community of organisms that contributed to ancient carbon sinks. Currently, it is not well understood how well the complex mixture of organic compounds preserved in geological carbon sinks represents the original community that produced those molecules or how the diversity of organisms in a community is reflected in the lipid biomarkers they collectively synthesize. We have begun to investigate these questions by characterizing lipid biomarker production in modern phytoplankton communities with metagenomic data sets. Here we evaluate the information on community biomarker biosynthesis gathered from this type of data set using sterols as a case study. We have identified genes involved in sterol biosynthesis in a number of metagenomes and placed these genes in a phylogenetic context using a method designed to deal with short metagenomic sequences. The degree of taxonomic diversity of biomarker production measured with gene sequences can be more specific than lipid analysis alone.

  16. Shotgun Pyrosequencing Metagenomic Analyses of Dusts from Swine Confinement and Grain Facilities

    PubMed Central

    Boissy, Robert J.; Romberger, Debra J.; Roughead, William A.; Weissenburger-Moser, Lisa; Poole, Jill A.; LeVan, Tricia D.

    2014-01-01

    Inhalation of agricultural dusts causes inflammatory reactions and symptoms such as headache, fever, and malaise, which can progress to chronic airway inflammation and associated diseases, e.g. asthma, chronic bronchitis, chronic obstructive pulmonary disease, and hypersensitivity pneumonitis. Although in many agricultural environments feed particles are the major constituent of these dusts, the inflammatory responses that they provoke are likely attributable to particle-associated bacteria, archaebacteria, fungi, and viruses. In this study, we performed shotgun pyrosequencing metagenomic analyses of DNA from dusts from swine confinement facilities or grain elevators, with comparisons to dusts from pet-free households. DNA sequence alignment showed that 19% or 62% of shotgun pyrosequencing metagenomic DNA sequence reads from swine facility or household dusts, respectively, were of swine or human origin, respectively. In contrast only 2% of such reads from grain elevator dust were of mammalian origin. These metagenomic shotgun reads of mammalian origin were excluded from our analyses of agricultural dust microbiota. The ten most prevalent bacterial taxa identified in swine facility compared to grain elevator or household dust were comprised of 75%, 16%, and 42% gram-positive organisms, respectively. Four of the top five swine facility dust genera were assignable (Clostridium, Lactobacillus, Ruminococcus, and Eubacterium, ranging from 4% to 19% relative abundance). The relative abundances of these four genera were lower in dust from grain elevators or pet-free households. These analyses also highlighted the predominance in swine facility dust of Firmicutes (70%) at the phylum level, Clostridia (44%) at the Class level, and Clostridiales at the Order level (41%). In summary, shotgun pyrosequencing metagenomic analyses of agricultural dusts show that they differ qualitatively and quantitatively at the level of microbial taxa present, and that the bioinformatic analyses

  17. Shotgun pyrosequencing metagenomic analyses of dusts from swine confinement and grain facilities.

    PubMed

    Boissy, Robert J; Romberger, Debra J; Roughead, William A; Weissenburger-Moser, Lisa; Poole, Jill A; LeVan, Tricia D

    2014-01-01

    Inhalation of agricultural dusts causes inflammatory reactions and symptoms such as headache, fever, and malaise, which can progress to chronic airway inflammation and associated diseases, e.g. asthma, chronic bronchitis, chronic obstructive pulmonary disease, and hypersensitivity pneumonitis. Although in many agricultural environments feed particles are the major constituent of these dusts, the inflammatory responses that they provoke are likely attributable to particle-associated bacteria, archaebacteria, fungi, and viruses. In this study, we performed shotgun pyrosequencing metagenomic analyses of DNA from dusts from swine confinement facilities or grain elevators, with comparisons to dusts from pet-free households. DNA sequence alignment showed that 19% or 62% of shotgun pyrosequencing metagenomic DNA sequence reads from swine facility or household dusts, respectively, were of swine or human origin, respectively. In contrast only 2% of such reads from grain elevator dust were of mammalian origin. These metagenomic shotgun reads of mammalian origin were excluded from our analyses of agricultural dust microbiota. The ten most prevalent bacterial taxa identified in swine facility compared to grain elevator or household dust were comprised of 75%, 16%, and 42% gram-positive organisms, respectively. Four of the top five swine facility dust genera were assignable (Clostridium, Lactobacillus, Ruminococcus, and Eubacterium, ranging from 4% to 19% relative abundance). The relative abundances of these four genera were lower in dust from grain elevators or pet-free households. These analyses also highlighted the predominance in swine facility dust of Firmicutes (70%) at the phylum level, Clostridia (44%) at the Class level, and Clostridiales at the Order level (41%). In summary, shotgun pyrosequencing metagenomic analyses of agricultural dusts show that they differ qualitatively and quantitatively at the level of microbial taxa present, and that the bioinformatic analyses

  18. Exploring nucleo-cytoplasmic large DNA viruses in Tara Oceans microbial metagenomes

    PubMed Central

    Hingamp, Pascal; Grimsley, Nigel; Acinas, Silvia G; Clerissi, Camille; Subirana, Lucie; Poulain, Julie; Ferrera, Isabel; Sarmento, Hugo; Villar, Emilie; Lima-Mendez, Gipsi; Faust, Karoline; Sunagawa, Shinichi; Claverie, Jean-Michel; Moreau, Hervé; Desdevises, Yves; Bork, Peer; Raes, Jeroen; de Vargas, Colomban; Karsenti, Eric; Kandels-Lewis, Stefanie; Jaillon, Olivier; Not, Fabrice; Pesant, Stéphane; Wincker, Patrick; Ogata, Hiroyuki

    2013-01-01

    Nucleo-cytoplasmic large DNA viruses (NCLDVs) constitute a group of eukaryotic viruses that can have crucial ecological roles in the sea by accelerating the turnover of their unicellular hosts or by causing diseases in animals. To better characterize the diversity, abundance and biogeography of marine NCLDVs, we analyzed 17 metagenomes derived from microbial samples (0.2–1.6 μm size range) collected during the Tara Oceans Expedition. The sample set includes ecosystems under-represented in previous studies, such as the Arabian Sea oxygen minimum zone (OMZ) and Indian Ocean lagoons. By combining computationally derived relative abundance and direct prokaryote cell counts, the abundance of NCLDVs was found to be in the order of 10^4–10^5 genomes ml^-1 for the samples from the photic zone and 10^2–10^3 genomes ml^-1 for the OMZ. The Megaviridae and Phycodnaviridae dominated the NCLDV populations in the metagenomes, although most of the reads classified in these families showed large divergence from known viral genomes. Our taxon co-occurrence analysis revealed a potential association between viruses of the Megaviridae family and eukaryotes related to oomycetes. In support of this predicted association, we identified six cases of lateral gene transfer between Megaviridae and oomycetes. Our results suggest that marine NCLDVs probably outnumber eukaryotic organisms in the photic layer (per given water mass) and that metagenomic sequence analyses promise to shed new light on the biodiversity of marine viruses and their interactions with potential hosts. PMID:23575371
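
    The abundance estimate in the abstract above combines a relative abundance derived from sequence data with an absolute cell count. One plausible reading of that step is sketched below: the ratio of viral to prokaryotic marker-gene density in the metagenome is scaled by the measured prokaryote cell concentration. The function name, its parameters and the example numbers are all assumptions made for illustration; the paper's actual normalisation may differ.

```python
def genomes_per_ml(virus_marker_density, prokaryote_marker_density,
                   prokaryote_cells_per_ml):
    """Scale a virus-to-prokaryote marker ratio by an absolute cell count.

    The two densities are assumed to be in the same units (for example
    marker-gene hits per marker per megabase of reads), so that their ratio
    approximates viral genomes per prokaryotic genome in the sample.
    """
    if prokaryote_marker_density <= 0:
        raise ValueError("prokaryote marker density must be positive")
    ratio = virus_marker_density / prokaryote_marker_density
    return ratio * prokaryote_cells_per_ml


# Hypothetical inputs: 3 viral genomes per 100 prokaryotic genomes in the
# metagenome, and 5e5 prokaryote cells per ml from direct counts.
estimate = genomes_per_ml(3.0, 100.0, 5e5)
print(f"{estimate:.2e} genomes per ml")
```

    With these invented inputs the estimate falls in the 10^4 range, the same order of magnitude the abstract reports for photic-zone samples.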

  19. Exploring nucleo-cytoplasmic large DNA viruses in Tara Oceans microbial metagenomes.

    PubMed

    Hingamp, Pascal; Grimsley, Nigel; Acinas, Silvia G; Clerissi, Camille; Subirana, Lucie; Poulain, Julie; Ferrera, Isabel; Sarmento, Hugo; Villar, Emilie; Lima-Mendez, Gipsi; Faust, Karoline; Sunagawa, Shinichi; Claverie, Jean-Michel; Moreau, Hervé; Desdevises, Yves; Bork, Peer; Raes, Jeroen; de Vargas, Colomban; Karsenti, Eric; Kandels-Lewis, Stefanie; Jaillon, Olivier; Not, Fabrice; Pesant, Stéphane; Wincker, Patrick; Ogata, Hiroyuki

    2013-09-01

    Nucleo-cytoplasmic large DNA viruses (NCLDVs) constitute a group of eukaryotic viruses that can have crucial ecological roles in the sea by accelerating the turnover of their unicellular hosts or by causing diseases in animals. To better characterize the diversity, abundance and biogeography of marine NCLDVs, we analyzed 17 metagenomes derived from microbial samples (0.2-1.6 μm size range) collected during the Tara Oceans Expedition. The sample set includes ecosystems under-represented in previous studies, such as the Arabian Sea oxygen minimum zone (OMZ) and Indian Ocean lagoons. By combining computationally derived relative abundance and direct prokaryote cell counts, the abundance of NCLDVs was found to be in the order of 10(4)-10(5) genomes ml(-1) for the samples from the photic zone and 10(2)-10(3) genomes ml(-1) for the OMZ. The Megaviridae and Phycodnaviridae dominated the NCLDV populations in the metagenomes, although most of the reads classified in these families showed large divergence from known viral genomes. Our taxon co-occurrence analysis revealed a potential association between viruses of the Megaviridae family and eukaryotes related to oomycetes. In support of this predicted association, we identified six cases of lateral gene transfer between Megaviridae and oomycetes. Our results suggest that marine NCLDVs probably outnumber eukaryotic organisms in the photic layer (per given water mass) and that metagenomic sequence analyses promise to shed new light on the biodiversity of marine viruses and their interactions with potential hosts.

  20. Cloning

    MedlinePlus

    Cloning describes the processes used to create an exact genetic replica of another cell, tissue or organism. ... named Dolly. There are three different types of cloning: Gene cloning, which creates copies of genes or ...

  1. Metagenomics of Bacterial Diversity in Villa Luz Caves with Sulfur Water Springs

    PubMed Central

    Artacho, Alejandro; Bautista, José S.; Méndez, Roberto; Gamboa, María T.; Gamboa, Jesús R.; Gómez-Cruz, Rodolfo

    2018-01-01

    New biotechnology applications require in-depth preliminary studies of biodiversity. The methods of massive sequencing using metagenomics and bioinformatics tools offer us sufficient and reliable knowledge to understand environmental diversity, to know new microorganisms, and to take advantage of their functional genes. Villa Luz caves, in the southern Mexican state of Tabasco, are fed by at least 26 groundwater inlets, containing 300–500 mg L−1 H2S and <0.1 mg L−1 O2. We extracted environmental DNA for metagenomic analysis of samples collected at five selected Villa Luz cave sites, with pH values from 2.5 to 7. Foreign organisms found in this underground ecosystem can oxidize H2S to H2SO4. These include: biovermiculites, a bacterial association that can grow on the rock walls; snottites, which are whitish, viscous biofilms hanging from the rock walls; and sacks or bags of phlegm, which live within the aquatic environment of the springs. Through tag-encoded FLX amplicon pyrosequencing (TEFAP), a total of 20,901 reads of amplification products from hypervariable regions V1 and V3 of the bacterial 16S rRNA gene in whole and pure metagenomic DNA samples were generated. Seven bacterial phyla were identified. As a result, Proteobacteria was more frequent than Acidobacteria. Finally, acidophilic Proteobacteria was detected in the UJAT5 sample. PMID:29361802

  2. What is Cloning?

    MedlinePlus

    Clones are organisms that are exact genetic copies. ... clones made through modern cloning technologies. How Is Cloning Done? Many people first heard of cloning when ...

  3. Metagenomic Evidence for H2 Oxidation and H2 Production by Serpentinite-Hosted Subsurface Microbial Communities

    PubMed Central

    Brazelton, William J.; Nelson, Bridget; Schrenk, Matthew O.

    2012-01-01

    Ultramafic rocks in the Earth’s mantle represent a tremendous reservoir of carbon and reducing power. Upon tectonic uplift and exposure to fluid flow, serpentinization of these materials generates copious energy, sustains abiogenic synthesis of organic molecules, and releases hydrogen gas (H2). In order to assess the potential for microbial H2 utilization fueled by serpentinization, we conducted metagenomic surveys of a marine serpentinite-hosted hydrothermal chimney (at the Lost City hydrothermal field) and two continental serpentinite-hosted alkaline seeps (at the Tablelands Ophiolite, Newfoundland). Novel [NiFe]-hydrogenase sequences were identified at both the marine and continental sites, and in both cases, phylogenetic analyses indicated aerobic, potentially autotrophic Betaproteobacteria belonging to order Burkholderiales as the most likely H2-oxidizers. Both sites also yielded metagenomic evidence for microbial H2 production catalyzed by [FeFe]-hydrogenases in anaerobic Gram-positive bacteria belonging to order Clostridiales. In addition, we present metagenomic evidence at both sites for aerobic carbon monoxide utilization and anaerobic carbon fixation via the Wood–Ljungdahl pathway. In general, our results point to H2-oxidizing Betaproteobacteria thriving in shallow, oxic–anoxic transition zones and the anaerobic Clostridia thriving in anoxic, deep subsurface habitats. These data demonstrate the feasibility of metagenomic investigations into novel subsurface habitats via surface-exposed seeps and indicate the potential for H2-powered primary production in serpentinite-hosted subsurface habitats. PMID:22232619

  4. Metagenomic evidence for H(2) oxidation and H(2) production by serpentinite-hosted subsurface microbial communities.

    PubMed

    Brazelton, William J; Nelson, Bridget; Schrenk, Matthew O

    2012-01-01

    Ultramafic rocks in the Earth's mantle represent a tremendous reservoir of carbon and reducing power. Upon tectonic uplift and exposure to fluid flow, serpentinization of these materials generates copious energy, sustains abiogenic synthesis of organic molecules, and releases hydrogen gas (H(2)). In order to assess the potential for microbial H(2) utilization fueled by serpentinization, we conducted metagenomic surveys of a marine serpentinite-hosted hydrothermal chimney (at the Lost City hydrothermal field) and two continental serpentinite-hosted alkaline seeps (at the Tablelands Ophiolite, Newfoundland). Novel [NiFe]-hydrogenase sequences were identified at both the marine and continental sites, and in both cases, phylogenetic analyses indicated aerobic, potentially autotrophic Betaproteobacteria belonging to order Burkholderiales as the most likely H(2)-oxidizers. Both sites also yielded metagenomic evidence for microbial H(2) production catalyzed by [FeFe]-hydrogenases in anaerobic Gram-positive bacteria belonging to order Clostridiales. In addition, we present metagenomic evidence at both sites for aerobic carbon monoxide utilization and anaerobic carbon fixation via the Wood-Ljungdahl pathway. In general, our results point to H(2)-oxidizing Betaproteobacteria thriving in shallow, oxic-anoxic transition zones and the anaerobic Clostridia thriving in anoxic, deep subsurface habitats. These data demonstrate the feasibility of metagenomic investigations into novel subsurface habitats via surface-exposed seeps and indicate the potential for H(2)-powered primary production in serpentinite-hosted subsurface habitats.

  5. Bioinformatics tools for quantitative and functional metagenome and metatranscriptome data analysis in microbes.

    PubMed

    Niu, Sheng-Yong; Yang, Jinyu; McDermaid, Adam; Zhao, Jing; Kang, Yu; Ma, Qin

    2017-05-08

    Metagenomic and metatranscriptomic sequencing approaches are more frequently being used to link microbiota to important diseases and ecological changes. Many analyses have been used to compare the taxonomic and functional profiles of microbiota across habitats or individuals. While a large portion of metagenomic analyses focus on species-level profiling, some studies use strain-level metagenomic analyses to investigate the relationship between specific strains and certain circumstances. Metatranscriptomic analysis provides another important insight into activities of genes by examining gene expression levels of microbiota. Hence, combining metagenomic and metatranscriptomic analyses will help understand the activity or enrichment of a given gene set, such as drug-resistant genes among microbiome samples. Here, we summarize existing bioinformatics tools of metagenomic and metatranscriptomic data analysis, the purpose of which is to assist researchers in deciding the appropriate tools for their microbiome studies. Additionally, we propose an Integrated Meta-Function mapping pipeline to incorporate various reference databases and accelerate functional gene mapping procedures for both metagenomic and metatranscriptomic analyses. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  6. EBI metagenomics—a new resource for the analysis and archiving of metagenomic data

    PubMed Central

    Hunter, Sarah; Corbett, Matthew; Denise, Hubert; Fraser, Matthew; Gonzalez-Beltran, Alejandra; Hunter, Christopher; Jones, Philip; Leinonen, Rasko; McAnulla, Craig; Maguire, Eamonn; Maslen, John; Mitchell, Alex; Nuka, Gift; Oisel, Arnaud; Pesseat, Sebastien; Radhakrishnan, Rajesh; Rocca-Serra, Philippe; Scheremetjew, Maxim; Sterk, Peter; Vaughan, Daniel; Cochrane, Guy; Field, Dawn; Sansone, Susanna-Assunta

    2014-01-01

    Metagenomics is a relatively recently established but rapidly expanding field that uses high-throughput next-generation sequencing technologies to characterize the microbial communities inhabiting different ecosystems (including oceans, lakes, soil, tundra, plants and body sites). Metagenomics brings with it a number of challenges, including the management, analysis, storage and sharing of data. In response to these challenges, we have developed a new metagenomics resource (http://www.ebi.ac.uk/metagenomics/) that allows users to easily submit raw nucleotide reads for functional and taxonomic analysis by a state-of-the-art pipeline, and have them automatically stored (together with descriptive, standards-compliant metadata) in the European Nucleotide Archive. PMID:24165880

  7. The rumen microbial metagenome associated with high methane production in cattle.

    PubMed

    Wallace, R John; Rooke, John A; McKain, Nest; Duthie, Carol-Anne; Hyslop, Jimmy J; Ross, David W; Waterhouse, Anthony; Watson, Mick; Roehe, Rainer

    2015-10-23

    Methane represents 16 % of total anthropogenic greenhouse gas emissions. It has been estimated that ruminant livestock produce ca. 29 % of this methane. As individual animals produce consistently different quantities of methane, understanding the basis for these differences may lead to new opportunities for mitigating ruminal methane emissions. Metagenomics is a powerful new tool for understanding the composition and function of complex microbial communities. Here we have applied metagenomics to the rumen microbial community to identify differences in the microbiota and metagenome that lead to high- and low-methane-emitting cattle phenotypes. Four pairs of beef cattle were selected for extreme high and low methane emissions from 72 animals, matched for breed (Aberdeen-Angus or Limousin cross) and diet (high or medium concentrate). Community analysis was carried out by qPCR of 16S and 18S rRNA genes and by alignment of Illumina HiSeq reads to the GREENGENES database. Total genomic reads were aligned to the KEGG genes database for functional analysis. Deep sequencing produced on average 11.3 Gb per sample. 16S rRNA gene abundances indicated that archaea, predominantly Methanobrevibacter, were 2.5× more numerous (P = 0.026) in high emitters, whereas among bacteria Proteobacteria, predominantly Succinivibrionaceae, were 4-fold less abundant (2.7 vs. 11.2 %; P = 0.002). KEGG analysis revealed that archaeal genes leading directly or indirectly to methane production were 2.7-fold more abundant in high emitters. Genes less abundant in high emitters included acetate kinase, electron transport complex proteins RnfC and RnfD and glucose-6-phosphate isomerase. Sequence data were assembled de novo and over 1.5 million proteins were annotated on the subsequent metagenome scaffolds. Less than half of the predicted genes matched a domain within Pfam. Amongst 2774 identified proteins of the 20 KEGG orthologues that correlated with methane emissions, only 16 showed

  8. Identification of nitrogen-fixing genes and gene clusters from metagenomic library of acid mine drainage.

    PubMed

    Dai, Zhimin; Guo, Xue; Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan

    2014-01-01

    Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community.

  9. Identification of Nitrogen-Fixing Genes and Gene Clusters from Metagenomic Library of Acid Mine Drainage

    PubMed Central

    Dai, Zhimin; Guo, Xue; Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan

    2014-01-01

    Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community. PMID:24498417

  10. High frequency of phylogenetically diverse reductive dehalogenase-homologous genes in deep subseafloor sedimentary metagenomes

    PubMed Central

    Kawai, Mikihiko; Futagami, Taiki; Toyoda, Atsushi; Takaki, Yoshihiro; Nishi, Shinro; Hori, Sayaka; Arai, Wataru; Tsubouchi, Taishi; Morono, Yuki; Uchiyama, Ikuo; Ito, Takehiko; Fujiyama, Asao; Inagaki, Fumio; Takami, Hideto

    2014-01-01

    Marine subsurface sediments on the Pacific margin harbor diverse microbial communities even at depths of several hundred meters below the seafloor (mbsf) or more. Previous PCR-based molecular analysis showed the presence of diverse reductive dehalogenase gene (rdhA) homologs in marine subsurface sediment, suggesting that anaerobic respiration of organohalides is one of the possible energy-yielding pathways in the organic-rich sedimentary habitat. However, primer-independent molecular characterization of rdhA has remained to be demonstrated. Here, we studied the diversity and frequency of rdhA homologs by metagenomic analysis of five different depth horizons (0.8, 5.1, 18.6, 48.5, and 107.0 mbsf) at Site C9001 off the Shimokita Peninsula of Japan. From all metagenomic pools, remarkably diverse rdhA-homologous sequences, some of which are affiliated with novel clusters, were observed with high frequency. As a comparison, we also examined the frequency of dissimilatory sulfite reductase genes (dsrAB), key functional genes for microbial sulfate reduction. The dsrAB genes were also widely observed in the metagenomic pools, although their frequency was generally lower than that of rdhA-homologous genes. The phylogenetic composition of rdhA-homologous genes was similar among the five depth horizons. Our metagenomic data revealed that subseafloor rdhA homologs are more diverse than previously identified from PCR-based molecular studies. Spatial distribution of similar rdhA homologs across wide depositional ages indicates that the heterotrophic metabolic processes mediated by the genes can be ecologically important, functioning in the organic-rich subseafloor sedimentary biosphere. PMID:24624126

  11. High frequency of phylogenetically diverse reductive dehalogenase-homologous genes in deep subseafloor sedimentary metagenomes.

    PubMed

    Kawai, Mikihiko; Futagami, Taiki; Toyoda, Atsushi; Takaki, Yoshihiro; Nishi, Shinro; Hori, Sayaka; Arai, Wataru; Tsubouchi, Taishi; Morono, Yuki; Uchiyama, Ikuo; Ito, Takehiko; Fujiyama, Asao; Inagaki, Fumio; Takami, Hideto

    2014-01-01

    Marine subsurface sediments on the Pacific margin harbor diverse microbial communities even at depths of several hundred meters below the seafloor (mbsf) or more. Previous PCR-based molecular analysis showed the presence of diverse reductive dehalogenase gene (rdhA) homologs in marine subsurface sediment, suggesting that anaerobic respiration of organohalides is one of the possible energy-yielding pathways in the organic-rich sedimentary habitat. However, primer-independent molecular characterization of rdhA has remained to be demonstrated. Here, we studied the diversity and frequency of rdhA homologs by metagenomic analysis of five different depth horizons (0.8, 5.1, 18.6, 48.5, and 107.0 mbsf) at Site C9001 off the Shimokita Peninsula of Japan. From all metagenomic pools, remarkably diverse rdhA-homologous sequences, some of which are affiliated with novel clusters, were observed with high frequency. As a comparison, we also examined the frequency of dissimilatory sulfite reductase genes (dsrAB), key functional genes for microbial sulfate reduction. The dsrAB genes were also widely observed in the metagenomic pools, although their frequency was generally lower than that of rdhA-homologous genes. The phylogenetic composition of rdhA-homologous genes was similar among the five depth horizons. Our metagenomic data revealed that subseafloor rdhA homologs are more diverse than previously identified from PCR-based molecular studies. Spatial distribution of similar rdhA homologs across wide depositional ages indicates that the heterotrophic metabolic processes mediated by the genes can be ecologically important, functioning in the organic-rich subseafloor sedimentary biosphere.

  12. Host-Associated Metagenomics: A Guide to Generating Infectious RNA Viromes

    PubMed Central

    Robert, Catherine; Pascalis, Hervé; Michelle, Caroline; Jardot, Priscilla; Charrel, Rémi; Raoult, Didier; Desnues, Christelle

    2015-01-01

    Background: Metagenomic analyses have been widely used in the last decade to describe viral communities in various environments or to identify the etiology of human, animal, and plant pathologies. Here, we present a simple and standardized protocol that allows for the purification and sequencing of RNA viromes from complex biological samples with an important reduction of host DNA and RNA contaminants, while preserving the infectivity of viral particles. Principal Findings: We evaluated different viral purification steps, random reverse transcriptions and sequence-independent amplifications of a pool of representative RNA viruses. Viruses remained infectious after the purification process. We then validated the protocol by sequencing the RNA virome of human body lice engorged in vitro with artificially contaminated human blood. The full genomes of the most abundant viruses absorbed by the lice during the blood meal were successfully sequenced. Interestingly, random amplifications differed in the genome coverage of segmented RNA viruses. Moreover, the majority of reads were taxonomically identified, and only 7–15% of all reads were classified as “unknown”, depending on the random amplification method. Conclusion: The protocol reported here could easily be applied to generate RNA viral metagenomes from complex biological samples of different origins. Our protocol allows further virological characterizations of the described viral communities because it preserves the infectivity of viral particles and allows for the isolation of viruses. PMID:26431175

  13. Bacterial diversity of the American sand fly Lutzomyia intermedia using high-throughput metagenomic sequencing.

    PubMed

    Monteiro, Carolina Cunha; Villegas, Luis Eduardo Martinez; Campolina, Thais Bonifácio; Pires, Ana Clara Machado Araújo; Miranda, Jose Carlos; Pimenta, Paulo Filemon Paolucci; Secundino, Nagila Francinete Costa

    2016-08-31

    Parasites of the genus Leishmania cause a broad spectrum of diseases, collectively known as leishmaniasis, in humans worldwide. American cutaneous leishmaniasis is a neglected disease transmitted by sand fly vectors including Lutzomyia intermedia, a proven vector. The female sand fly can acquire or deliver Leishmania spp. parasites while feeding on a blood meal, which is required for nutrition, egg development and survival. The microbiota composition and abundance vary by food source, life stage and physiological condition. The sand fly microbiota can affect the parasite life-cycle in the vector. We performed a metagenomic analysis for microbiota composition and abundance in Lu. intermedia, from an endemic area in Brazil. The adult insects were collected using CDC light traps, morphologically identified, carefully sterilized, dissected under a microscope and the females separated into groups according to their physiological condition: (i) absence of blood meal (unfed = UN); (ii) presence of blood meal (blood-fed = BF); and (iii) presence of developed ovaries (gravid = GR). They were then processed for metagenomic sequencing on the Illumina HiSeq platform to obtain the taxonomic profiles of the microbiota. Bacterial metagenomic analysis revealed differences in microbiota composition based upon the distinct physiological stages of the adult insect. Sequence identification revealed two phyla (Proteobacteria and Actinobacteria), 11 families and 15 genera; 87 % of the bacteria were Gram-negative, while only one family and two genera were identified as Gram-positive. The genera Ochrobactrum, Bradyrhizobium and Pseudomonas were found across all of the groups. The metagenomic analysis revealed that the microbiota of the Lu. intermedia female sand flies are distinct under specific physiological conditions and consist of 15 bacterial genera. Ochrobactrum, Bradyrhizobium and Pseudomonas were the common genera. Our results detailing

  14. MetAMOS: a modular and open source metagenomic assembly and analysis pipeline

    PubMed Central

    2013-01-01

    We describe MetAMOS, an open source and modular metagenomic assembly and analysis pipeline. MetAMOS represents an important step towards fully automated metagenomic analysis, starting with next-generation sequencing reads and producing genomic scaffolds, open-reading frames and taxonomic or functional annotations. MetAMOS can aid in reducing assembly errors, commonly encountered when assembling metagenomic samples, and improves taxonomic assignment accuracy while also reducing computational cost. MetAMOS can be downloaded from: https://github.com/treangen/MetAMOS. PMID:23320958

  15. Under-detection of endospore-forming Firmicutes in metagenomic data

    DOE PAGES

    Filippidou, Sevasti; Junier, Thomas; Wunderlin, Tina; ...

    2015-04-25

    Microbial diversity studies based on metagenomic sequencing have greatly enhanced our knowledge of the microbial world. However, one caveat is the fact that not all microorganisms are equally well detected, questioning the universality of this approach. Firmicutes are known to be a dominant bacterial group. Several Firmicutes species are endospore formers and this property makes them hardy in potentially harsh conditions, and thus likely to be present in a wide variety of environments, even as residents and not functional players. While metagenomic libraries can be expected to contain endospore formers, endospores are known to be resilient to many traditional methods of DNA isolation and thus potentially undetectable. In this study we evaluated the representation of endospore-forming Firmicutes in 73 published metagenomic datasets using two molecular markers unique to this bacterial group (spo0A and gpr). Both markers were notably absent in well-known habitats of Firmicutes such as soil, with spo0A found only in three mammalian gut microbiomes. A tailored DNA extraction method resulted in the detection of a large diversity of endospore-formers in amplicon sequencing of the 16S rRNA and spo0A genes. However, shotgun classification was still poor with only a minor fraction of the community assigned to Firmicutes. Thus, removing a specific bias in a molecular workflow improves detection in amplicon sequencing, but it was insufficient to overcome the limitations for detecting endospore-forming Firmicutes in whole-genome metagenomics. In conclusion, this study highlights the importance of understanding the specific methodological biases that can contribute to improve the universality of metagenomic approaches.

  16. Under-detection of endospore-forming Firmicutes in metagenomic data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Filippidou, Sevasti; Junier, Thomas; Wunderlin, Tina

    Microbial diversity studies based on metagenomic sequencing have greatly enhanced our knowledge of the microbial world. However, one caveat is the fact that not all microorganisms are equally well detected, questioning the universality of this approach. Firmicutes are known to be a dominant bacterial group. Several Firmicutes species are endospore formers and this property makes them hardy in potentially harsh conditions, and thus likely to be present in a wide variety of environments, even as residents and not functional players. While metagenomic libraries can be expected to contain endospore formers, endospores are known to be resilient to many traditional methods of DNA isolation and thus potentially undetectable. In this study we evaluated the representation of endospore-forming Firmicutes in 73 published metagenomic datasets using two molecular markers unique to this bacterial group (spo0A and gpr). Both markers were notably absent in well-known habitats of Firmicutes such as soil, with spo0A found only in three mammalian gut microbiomes. A tailored DNA extraction method resulted in the detection of a large diversity of endospore-formers in amplicon sequencing of the 16S rRNA and spo0A genes. However, shotgun classification was still poor with only a minor fraction of the community assigned to Firmicutes. Thus, removing a specific bias in a molecular workflow improves detection in amplicon sequencing, but it was insufficient to overcome the limitations for detecting endospore-forming Firmicutes in whole-genome metagenomics. In conclusion, this study highlights the importance of understanding the specific methodological biases that can contribute to improve the universality of metagenomic approaches.

  17. Culture-Independent Identification of Manganese-Oxidizing Genes from Deep-Sea Hydrothermal Vent Chemoautotrophic Ferromanganese Microbial Communities Using a Metagenomic Approach

    NASA Astrophysics Data System (ADS)

    Davis, R.; Tebo, B. M.

    2013-12-01

    Microbial activity has long been recognized as being important to the fate of manganese (Mn) in hydrothermal systems, yet we know very little about the organisms that catalyze Mn oxidation, the mechanisms by which Mn is oxidized or the physiological function that Mn oxidation serves in these hydrothermal systems. Hydrothermal vents with thick ferromanganese microbial mats and Mn oxide-coated rocks observed throughout the Pacific Ring of Fire are ideal models to study the mechanisms of microbial Mn oxidation, as well as primary productivity in these metal-cycling ecosystems. We sampled ferromanganese microbial mats from Vai Lili Vent Field (Tmax=43°C) located on the Eastern Lau Spreading Center and Mn oxide-encrusted rhyolytic pumice (4°C) from Niua South Seamount on the Tonga Volcanic Arc. Metagenomic libraries were constructed and assembled from these samples and key genes known to be involved in Mn oxidation and carbon fixation pathways were identified in the reconstructed genomes. The Vai Lili metagenome assembled to form 121,157 contiguous sequences (contigs) greater than 1000bp in length, with an N50 of 8,261bp and a total metagenome size of 593 Mbp. Contigs were binned using an emergent self-organizing map of tetranucleotide frequencies. Putative homologs of the multicopper Mn-oxidase MnxG were found in the metagenome that were related to both the Pseudomonas-like and Bacillus-like forms of the enzyme. The bins containing the Pseudomonas-like mnxG genes are most closely related to uncultured Deltaproteobacteria and Chloroflexi. The Deltaproteobacteria bin appears to be an obligate anaerobe with possible chemoautotrophic metabolisms, while the Chloroflexi appears to be a heterotrophic organism. The metagenome from the Mn-stained pumice was assembled into 122,092 contigs greater than 1000bp in length with an N50 of 7635 and a metagenome size of 385 Mbp. Both forms of mnxG genes are present in this metagenome as well as the genes encoding the putative Mn
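    The assembly statistics and binning features mentioned in the record above (N50 and tetranucleotide frequencies) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, reverse complements are not merged, and the emergent self-organizing map that would consume these frequency vectors is not implemented.

```python
from collections import Counter
from itertools import product


def n50(lengths):
    """Return the N50: the contig length L such that contigs of length >= L
    together account for at least half of the total assembly size."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths given")


def tetranucleotide_frequencies(sequence):
    """Return relative frequencies of all 256 tetranucleotides.

    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = sequence.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in kmers)
    return {k: (counts[k] / total if total else 0.0) for k in kmers}
```

    Each contig would yield one 256-dimensional frequency vector, and contigs with similar vectors are candidates for the same bin.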

  18. Statistical Methods for Detecting Differentially Abundant Features in Clinical Metagenomic Samples

    PubMed Central

    White, James Robert; Nagarajan, Niranjan; Pop, Mihai

    2009-01-01

    Numerous studies are currently underway to characterize the microbial communities inhabiting our world. These studies aim to dramatically expand our understanding of the microbial biosphere and, more importantly, hope to reveal the secrets of the complex symbiotic relationship between us and our commensal bacterial microflora. An important prerequisite for such discoveries is the availability of computational tools that can rapidly and accurately compare large datasets generated from complex bacterial communities to identify features that distinguish them. We present a statistical method for comparing clinical metagenomic samples from two treatment populations on the basis of count data (e.g. as obtained through sequencing) to detect differentially abundant features. Our method, Metastats, employs the false discovery rate to improve specificity in high-complexity environments, and separately handles sparsely-sampled features using Fisher's exact test. Under a variety of simulations, we show that Metastats performs well compared to previously used methods, and significantly outperforms other methods for features with sparse counts. We demonstrate the utility of our method on several datasets including a 16S rRNA survey of obese and lean human gut microbiomes, COG functional profiles of infant and mature gut microbiomes, and bacterial and viral metabolic subsystem data inferred from random sequencing of 85 metagenomes. The application of our method to the obesity dataset reveals differences between obese and lean subjects not reported in the original study. For the COG and subsystem datasets, we provide the first statistically rigorous assessment of the differences between these populations. The methods described in this paper are the first to address clinical metagenomic datasets comprising samples from multiple subjects. Our methods are robust across datasets of varied complexity and sampling level. While designed for metagenomic applications, our software can also be applied
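
    The two statistical ingredients named in this abstract, Fisher's exact test for sparse counts and false discovery rate control, can be sketched in a few lines. This is an illustrative sketch only, not the Metastats implementation: the function names are hypothetical, the 2x2 table layout (feature count versus all other counts, in each of two populations) is an assumption, and the Benjamini-Hochberg adjustment is a standard FDR procedure swapped in for illustration, which may differ from the one Metastats uses.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical): a/b = feature / other counts in population 1,
    c/d = feature / other counts in population 2.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

    In use, one p-value would be computed per feature and the whole list passed to `benjamini_hochberg`; features whose adjusted value falls below a chosen threshold would be reported as differentially abundant.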

  19. Unbiased Taxonomic Annotation of Metagenomic Samples

    PubMed Central

    Fosso, Bruno; Pesole, Graziano; Rosselló, Francesc

    2018-01-01

    The classification of reads from a metagenomic sample using a reference taxonomy is usually based on first mapping the reads to the reference sequences and then classifying each read at a node under the lowest common ancestor of the candidate sequences in the reference taxonomy with the least classification error. However, this taxonomic annotation can be biased by an imbalanced taxonomy and also by the presence of multiple nodes in the taxonomy with the least classification error for a given read. In this article, we show that the Rand index is a better indicator of classification error than the often used area under the receiver operating characteristic (ROC) curve and F-measure for both balanced and imbalanced reference taxonomies, and we also address the second source of bias by reducing the taxonomic annotation problem for a whole metagenomic sample to a set cover problem, for which a logarithmic approximation can be obtained in linear time and an exact solution can be obtained by integer linear programming. Experimental results with a proof-of-concept implementation of the set cover approach to taxonomic annotation in a next release of the TANGO software show that the set cover approach further reduces ambiguity in the taxonomic annotation obtained with TANGO without distorting the relative abundance profile of the metagenomic sample. PMID:29028181
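
    The set cover formulation mentioned above can be illustrated with the textbook greedy heuristic, which achieves a logarithmic approximation factor. This is a sketch under stated assumptions, not the TANGO implementation: reads are treated as the universe, each candidate taxon as the set of reads it could explain, and the function name and data layout are hypothetical. This simple version is not linear-time.

```python
def greedy_set_cover(universe, candidates):
    """Pick candidate sets greedily until every element of `universe` is covered.

    `candidates` maps a label (e.g. a taxon name) to the set of elements
    (e.g. read identifiers) that the label can account for.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Sorting the labels makes tie-breaking deterministic.
        best = max(sorted(candidates),
                   key=lambda name: len(candidates[name] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            raise ValueError("some elements cannot be covered by any candidate")
        chosen.append(best)
        uncovered -= gained
    return chosen


reads = {"r1", "r2", "r3", "r4", "r5"}
taxa = {
    "A": {"r1", "r2", "r3"},
    "B": {"r3", "r4"},
    "C": {"r4", "r5"},
    "D": {"r1"},
}
print(greedy_set_cover(reads, taxa))  # prints ['A', 'C']
```

    An exact minimum cover would instead require an integer linear programming solver, as the abstract notes.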

  20. ELIXIR pilot action: Marine metagenomics - towards a domain specific set of sustainable services.

    PubMed

    Robertsen, Espen Mikal; Denise, Hubert; Mitchell, Alex; Finn, Robert D; Bongo, Lars Ailo; Willassen, Nils Peder

    2017-01-01

    Metagenomics, the study of genetic material recovered directly from environmental samples, has the potential to provide insight into the structure and function of heterogeneous microbial communities. There has been an increased use of metagenomics to discover and understand the diverse biosynthetic capacities of marine microbes, thereby allowing them to be exploited for industrial, food, and health care products. This ELIXIR pilot action was motivated by the need to establish dedicated data resources and harmonized metagenomics pipelines for the marine domain, in order to enhance the exploration and exploitation of marine genetic resources. In this paper, we summarize some of the results from the ELIXIR pilot action "Marine metagenomics - towards user centric services".

  1. Bioremediation potential of microorganisms derived from petroleum reservoirs.

    PubMed

    Dellagnezze, Bruna Martins; de Sousa, Gabriel Vasconcelos; Martins, Laercio Lopes; Domingos, Daniela Ferreira; Limache, Elmer E G; de Vasconcellos, Suzan Pantaroto; da Cruz, Georgiana Feitosa; de Oliveira, Valéria Maia

    2014-12-15

    Bacterial strains and metagenomic clones, both obtained from petroleum reservoirs, were evaluated for petroleum degradation abilities either individually or in pools using seawater microcosms for 21 days. Gas Chromatography-Flame Ionization Detector (GC-FID) and Gas Chromatography-Mass Spectrometry (GC-MS) analyses were carried out to evaluate crude oil degradation. The results showed that metagenomic clones 1A and 2B were able to biodegrade n-alkanes (C14 to C33) and isoprenoids (phytane and pristane), with rates ranging from 31% to 47%, respectively. The bacteria Dietzia maris CBMAI 705 and Micrococcus sp. CBMAI 636 showed higher rates reaching 99% after 21 days. The metagenomic clone pool biodegraded these compounds at rates ranging from 11% to 45%. Regarding aromatic compound biodegradation, metagenomic clones 2B and 10A were able to biodegrade up to 94% of phenanthrene and methylphenanthrenes (3-MP, 2-MP, 9-MP and 1-MP) with rates ranging from 55% to 70% after 21 days, while the bacteria Dietzia maris CBMAI 705 and Micrococcus sp. CBMAI 636 were able to biodegrade 63% and up to 99% of phenanthrene, respectively, and methylphenanthrenes (3-MP, 2-MP, 9-MP and 1-MP) with rates ranging from 23% to 99% after 21 days. In this work, isolated strains as well as metagenomic clones were capable of degrading several petroleum compounds, revealing an innovative strategy and a great potential for further biotechnological and bioremediation applications. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. Molecular Diagnosis of Orthopedic-Device-Related Infection Directly from Sonication Fluid by Metagenomic Sequencing

    PubMed Central

    Sanderson, Nicholas D.; Atkins, Bridget L.; Brent, Andrew J.; Cole, Kevin; Foster, Dona; McNally, Martin A.; Oakley, Sarah; Peto, Leon; Taylor, Adrian; Peto, Tim E. A.; Crook, Derrick W.; Eyre, David W.

    2017-01-01

    Culture of multiple periprosthetic tissue samples is the current gold standard for microbiological diagnosis of prosthetic joint infections (PJI). Additional diagnostic information may be obtained through culture of sonication fluid from explants. However, current techniques can have relatively low sensitivity, with prior antimicrobial therapy and infection by fastidious organisms influencing results. We assessed whether metagenomic sequencing of total DNA extracts obtained directly from sonication fluid can provide an alternative rapid and sensitive tool for diagnosis of PJI. We compared metagenomic sequencing with standard aerobic and anaerobic culture in 97 sonication fluid samples from prosthetic joint and other orthopedic device infections. Reads from Illumina MiSeq sequencing were taxonomically classified using Kraken. Using 50 derivation samples, we determined optimal thresholds for the number and proportion of bacterial reads required to identify an infection and confirmed our findings in 47 independent validation samples. Compared to results from sonication fluid culture, the species-level sensitivity of metagenomic sequencing was 61/69 (88%; 95% confidence interval [CI], 77 to 94%; for derivation samples 35/38 [92%; 95% CI, 79 to 98%]; for validation samples, 26/31 [84%; 95% CI, 66 to 95%]), and genus-level sensitivity was 64/69 (93%; 95% CI, 84 to 98%). Species-level specificity, adjusting for plausible fastidious causes of infection, species found in concurrently obtained tissue samples, and prior antibiotics, was 85/97 (88%; 95% CI, 79 to 93%; for derivation samples, 43/50 [86%; 95% CI, 73 to 94%]; for validation samples, 42/47 [89%; 95% CI, 77 to 96%]). High levels of human DNA contamination were seen despite the use of laboratory methods to remove it. Rigorous laboratory good practice was required to minimize bacterial DNA contamination. We demonstrate that metagenomic sequencing can provide accurate diagnostic information in PJI. Our findings
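
    Proportions such as 61/69 are reported above with 95% confidence intervals. The abstract does not say which interval method was used, so the sketch below uses the Wilson score interval purely as an assumption; its bounds can differ by a percentage point or two from the published ones. The function name is hypothetical.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion (default z gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Species-level sensitivity counts taken from the abstract: 61 of 69.
low, high = wilson_interval(61, 69)
print(f"{61 / 69:.0%} (Wilson 95% CI, {low:.0%} to {high:.0%})")
```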

  3. Metagenomic sequencing reveals microbiota and its functional potential associated with periodontal disease

    PubMed Central

    Wang, Jinfeng; Qi, Ji; Zhao, Hui; He, Shu; Zhang, Yifei; Wei, Shicheng; Zhao, Fangqing

    2013-01-01

    Although attempts have been made to reveal the relationships between bacteria and human health, little is known about the species and function of the microbial community associated with oral diseases. In this study, we report the sequencing of 16 metagenomic samples collected from dental swabs and plaques representing four periodontal states. Insights into the microbial community structure and the metabolic variation associated with periodontal health and disease were obtained. We observed a strong correlation between community structure and disease status, and described a core disease-associated community. A number of functional genes and metabolic pathways including bacterial chemotaxis and glycan biosynthesis were over-represented in the microbiomes of periodontal disease. A significant number of novel species and genes were identified in the metagenomic assemblies. Our study enriches the understanding of the oral microbiome and sheds light on the contribution of microorganisms to the formation and succession of dental plaques and oral diseases. PMID:23673380

  4. Challenges and opportunities in understanding microbial communities with metagenome assembly (accompanied by IPython Notebook tutorial)

    DOE PAGES

    Howe, Adina; Chain, Patrick S. G.

    2015-07-09

    Metagenomic investigations hold great promise for informing the genetics, physiology, and ecology of environmental microorganisms. Current challenges for metagenomic analysis are related to our ability to connect the dots between sequencing reads, their population of origin, and their encoding functions. Assembly-based methods reduce dataset size by extending overlapping reads into larger contiguous sequences (contigs), providing contextual information for genetic sequences that does not rely on existing references. These methods, however, tend to be computationally intensive and are again challenged by sequencing errors as well as by genomic repeats. While numerous tools have been developed based on these methodological concepts, they present confounding choices and training requirements to metagenomic investigators. To help with accessibility to assembly tools, this review also includes an IPython Notebook metagenomic assembly tutorial. This tutorial has instructions for execution on any operating system using Amazon Elastic Compute Cloud and guides users through downloading, assembly, and mapping reads to contigs of a mock microbiome metagenome. Despite its challenges, metagenomic analysis has already revealed novel insights into many environments on Earth. As software, training, and data continue to emerge, metagenomic data access and its discoveries will continue to grow.
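
    The idea of extending overlapping reads into contigs can be shown with a toy greedy overlap merger. This is a minimal sketch for intuition only, not the method of any assembler covered by the review or its tutorial: it assumes error-free reads on a single strand, and the function names and the `min_len` parameter are hypothetical. Sequencing errors and repeats, the two difficulties named in the abstract, are exactly what break this naive approach.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` equal to a prefix of `b` (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


print(greedy_assemble(["ATGCGT", "CGTACC", "ACCTTG"]))  # prints ['ATGCGTACCTTG']
```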

  5. Challenges and opportunities in understanding microbial communities with metagenome assembly (accompanied by IPython Notebook tutorial)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Howe, Adina; Chain, Patrick S. G.

    Metagenomic investigations hold great promise for informing the genetics, physiology, and ecology of environmental microorganisms. Current challenges for metagenomic analysis are related to our ability to connect the dots between sequencing reads, their population of origin, and their encoding functions. Assembly-based methods reduce dataset size by extending overlapping reads into larger contiguous sequences (contigs), providing contextual information for genetic sequences that does not rely on existing references. These methods, however, tend to be computationally intensive and are again challenged by sequencing errors as well as by genomic repeats. While numerous tools have been developed based on these methodological concepts, they present confounding choices and training requirements to metagenomic investigators. To help with accessibility to assembly tools, this review also includes an IPython Notebook metagenomic assembly tutorial. This tutorial has instructions for execution on any operating system using Amazon Elastic Compute Cloud and guides users through downloading, assembly, and mapping reads to contigs of a mock microbiome metagenome. Despite its challenges, metagenomic analysis has already revealed novel insights into many environments on Earth. As software, training, and data continue to emerge, metagenomic data access and its discoveries will continue to grow.

  6. Harvesting of novel polyhydroxyalkanaote (PHA) synthase encoding genes from a soil metagenome library using phenotypic screening.

    PubMed

    Schallmey, Marcus; Ly, Anh; Wang, Chunxia; Meglei, Gabriela; Voget, Sonja; Streit, Wolfgang R; Driscoll, Brian T; Charles, Trevor C

    2011-08-01

    We previously reported the construction of metagenomic libraries in the IncP cosmid vector pRK7813, enabling heterologous expression of these broad-host-range libraries in multiple bacterial hosts. Expressing these libraries in Sinorhizobium meliloti, we have successfully complemented associated phenotypes of polyhydroxyalkanoate synthesis mutants. DNA sequence analysis of three clones indicates that the complementing genes are homologous to, but substantially different from, known polyhydroxyalkanoate synthase-encoding genes. Thus we have demonstrated the ability to isolate diverse genes for polyhydroxyalkanoate synthesis by functional complementation of defined mutants. Such genes might be of use in the engineering of more efficient systems for the industrial production of bioplastics. The use of functional complementation will also provide a vehicle to probe the genetics of polyhydroxyalkanoate metabolism and its relation to carbon availability in complex microbial assemblages. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.

  7. Genomics and metagenomics in medical microbiology.

    PubMed

    Padmanabhan, Roshan; Mishra, Ajay Kumar; Raoult, Didier; Fournier, Pierre-Edouard

    2013-12-01

    Over the last two decades, sequencing tools have evolved from laborious, time-consuming methodologies to real-time detection and deciphering of genomic DNA. Genome sequencing, especially using next generation sequencing (NGS), has revolutionized the landscape of microbiology and infectious disease. This deluge of sequencing data has not only enabled advances in fundamental biology but also helped improve diagnosis, pathogen typing, virulence and antibiotic resistance detection, and development of new vaccines and culture media. In addition, NGS has enabled efficient analysis of complex human microfloras, both commensal and pathological, through metagenomic methods, thus helping the comprehension and management of human diseases such as obesity. This review summarizes technological advances in genomics and metagenomics relevant to the field of medical microbiology. Copyright © 2013 Elsevier B.V. All rights reserved.

  8. Elucidation of Taste- and Odor-Producing Bacteria and Toxigenic Cyanobacteria in a Midwestern Drinking Water Supply Reservoir by Shotgun Metagenomic Analysis

    PubMed Central

    Graham, Jennifer L.; Harris, Theodore D.

    2016-01-01

    While commonplace in clinical settings, DNA-based assays for identification or enumeration of drinking water pathogens and other biological contaminants remain widely unadopted by the monitoring community. In this study, shotgun metagenomics was used to identify taste-and-odor producers and toxin-producing cyanobacteria over a 2-year period in a drinking water reservoir. The sequencing data implicated several cyanobacteria, including Anabaena spp., Microcystis spp., and an unresolved member of the order Oscillatoriales as the likely principal producers of geosmin, microcystin, and 2-methylisoborneol (MIB), respectively. To further demonstrate this, quantitative PCR (qPCR) assays targeting geosmin-producing Anabaena and microcystin-producing Microcystis were utilized, and these data were fitted using generalized linear models and compared with routine monitoring data, including microscopic cell counts, sonde-based physicochemical analyses, and assays of all inorganic and organic nitrogen and phosphorus forms and fractions. The qPCR assays explained the greatest variation in observed geosmin (adjusted R² = 0.71) and microcystin (adjusted R² = 0.84) concentrations over the study period, highlighting their potential for routine monitoring applications. The origin of the monoterpene cyclase required for MIB biosynthesis was putatively linked to a periphytic cyanobacterial mat attached to the concrete drinking water inflow structure. We conclude that shotgun metagenomics can be used to identify microbial agents involved in water quality deterioration and to guide PCR assay selection or design for routine monitoring purposes. Finally, we offer estimates of microbial diversity and metagenomic coverage of our data sets for reference to others wishing to apply shotgun metagenomics to other lacustrine systems.
    IMPORTANCE Cyanobacterial toxins and microbial taste-and-odor compounds are a growing concern for drinking water utilities reliant upon surface water resources

  9. Metagenomic analysis of faecal microbiome as a tool towards targeted non-invasive biomarkers for colorectal cancer.

    PubMed

    Yu, Jun; Feng, Qiang; Wong, Sunny Hei; Zhang, Dongya; Liang, Qiao Yi; Qin, Youwen; Tang, Longqing; Zhao, Hui; Stenvang, Jan; Li, Yanli; Wang, Xiaokai; Xu, Xiaoqiang; Chen, Ning; Wu, William Ka Kei; Al-Aama, Jumana; Nielsen, Hans Jørgen; Kiilerich, Pia; Jensen, Benjamin Anderschou Holbech; Yau, Tung On; Lan, Zhou; Jia, Huijue; Li, Junhua; Xiao, Liang; Lam, Thomas Yuen Tung; Ng, Siew Chien; Cheng, Alfred Sze-Lok; Wong, Vincent Wai-Sun; Chan, Francis Ka Leung; Xu, Xun; Yang, Huanming; Madsen, Lise; Datz, Christian; Tilg, Herbert; Wang, Jian; Brünner, Nils; Kristiansen, Karsten; Arumugam, Manimozhiyan; Sung, Joseph Jao-Yiu; Wang, Jun

    2017-01-01

    To evaluate the potential for diagnosing colorectal cancer (CRC) from faecal metagenomes. We performed metagenome-wide association studies on faecal samples from 74 patients with CRC and 54 controls from China, and validated the results in 16 patients and 24 controls from Denmark. We further validated the biomarkers in two published cohorts from France and Austria. Finally, we employed targeted quantitative PCR (qPCR) assays to evaluate diagnostic potential of selected biomarkers in an independent Chinese cohort of 47 patients and 109 controls. Besides confirming known associations of Fusobacterium nucleatum and Peptostreptococcus stomatis with CRC, we found significant associations with several species, including Parvimonas micra and Solobacterium moorei. We identified 20 microbial gene markers that differentiated CRC and control microbiomes, and validated 4 markers in the Danish cohort. In the French and Austrian cohorts, these four genes distinguished CRC metagenomes from controls with areas under the receiver-operating curve (AUC) of 0.72 and 0.77, respectively. qPCR measurements of two of these genes accurately classified patients with CRC in the independent Chinese cohort with AUC=0.84 and OR of 23. These genes were enriched in early-stage (I-II) patient microbiomes, highlighting the potential for using faecal metagenomic biomarkers for early diagnosis of CRC. We present the first metagenomic profiling study of CRC faecal microbiomes to discover and validate microbial biomarkers in ethnically different cohorts, and to independently validate selected biomarkers using an affordable clinically relevant technology. Our study thus takes a step further towards affordable non-invasive early diagnostic biomarkers for CRC from faecal samples. Published by the BMJ Publishing Group Limited.
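
    The AUC values quoted above summarize how well a marker score separates patients from controls. As a sketch of what the number means (not of how the study computed it), AUC equals the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The function name and the example scores below are hypothetical.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a win
    return wins / (len(case_scores) * len(control_scores))


# Made-up marker scores, for illustration only.
cases = [0.9, 0.8, 0.4]
controls = [0.5, 0.3, 0.2]
print(round(auc(cases, controls), 3))  # prints 0.889
```

    A value of 0.5 corresponds to a marker no better than chance, and 1.0 to perfect separation.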

  10. Recovering complete and draft population genomes from metagenome datasets

    DOE PAGES

    Sangwan, Naseer; Xia, Fangfang; Gilbert, Jack A.

    2016-03-08

    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost on application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution.
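
    The coverage covariation signal described above can be illustrated with a toy example: contigs from the same genome tend to rise and fall together in coverage across samples. The sketch below groups contigs whose coverage profiles correlate strongly with the first member of a bin. It is a simplification for intuition, not any published binning tool; the function names, the 0.95 threshold, and the coverage values are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no covariation signal
    return sxy / sqrt(sxx * syy)


def bin_by_coverage(coverage, threshold=0.95):
    """Assign each contig to the first bin whose founding contig it correlates with."""
    bins = []
    for name, profile in coverage.items():
        for members in bins:
            if pearson(profile, coverage[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            bins.append([name])  # no existing bin fits: start a new one
    return bins


# Made-up per-sample coverage for four contigs across three samples.
coverage = {
    "c1": [10, 20, 30],
    "c2": [11, 19, 31],
    "c3": [30, 20, 10],
    "c4": [29, 22, 9],
}
print(bin_by_coverage(coverage))  # prints [['c1', 'c2'], ['c3', 'c4']]
```

    Practical binners typically combine coverage with sequence composition and need many more samples than this example uses.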

  11. Recovering complete and draft population genomes from metagenome datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sangwan, Naseer; Xia, Fangfang; Gilbert, Jack A.

    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost on application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution.

  12. Captured metagenomics: large-scale targeting of genes based on ‘sequence capture’ reveals functional diversity in soils

    PubMed Central

    Manoharan, Lokeshwaran; Kushwaha, Sandeep K.; Hedlund, Katarina; Ahrén, Dag

    2015-01-01

    Microbial enzyme diversity is key to understanding many ecosystem processes. Whole metagenome sequencing (WMG) obtains information on functional genes, but it is costly and inefficient due to the large amount of sequencing that is required. In this study, we have applied a captured metagenomics technique for functional genes in soil microorganisms, as an alternative to WMG. Large-scale targeting of functional genes, coding for enzymes related to organic matter degradation, was applied to two agricultural soil communities through captured metagenomics. Captured metagenomics uses custom-designed, hybridization-based oligonucleotide probes that enrich functional genes of interest in metagenomic libraries where only probe-bound DNA fragments are sequenced. The captured metagenomes were highly enriched with targeted genes while maintaining their target diversity, and their taxonomic distribution correlated well with the traditional ribosomal sequencing. The captured metagenomes were highly enriched with genes related to organic matter degradation; at least five times more than similar, publicly available soil WMG projects. This target enrichment technique also preserves the functional representation of the soils, thereby facilitating comparative metagenomics projects. Here, we present the first study that applies the captured metagenomics approach at large scale, and this novel method allows deep investigations of central ecosystem processes by studying functional gene abundances. PMID:26490729

  13. MOCAT: A Metagenomics Assembly and Gene Prediction Toolkit

    PubMed Central

    Li, Junhua; Chen, Weineng; Chen, Hua; Mende, Daniel R.; Arumugam, Manimozhiyan; Pan, Qi; Liu, Binghang; Qin, Junjie; Wang, Jun; Bork, Peer

    2012-01-01

    MOCAT is a highly configurable, modular pipeline for fast, standardized processing of single or paired-end sequencing data generated by the Illumina platform. The pipeline uses state-of-the-art programs to quality control, map, and assemble reads from metagenomic samples sequenced at a depth of several billion base pairs, and predict protein-coding genes on assembled metagenomes. Mapping against reference databases allows for read extraction or removal, as well as abundance calculations. Relevant statistics for each processing step can be summarized into multi-sheet Excel documents and queryable SQL databases. MOCAT runs on UNIX machines and integrates seamlessly with the SGE and PBS queuing systems, commonly used to process large datasets. The open source code and modular architecture allow users to modify or exchange the programs that are utilized in the various processing steps. Individual processing steps and parameters were benchmarked and tested on artificial, real, and simulated metagenomes resulting in an improvement of selected quality metrics. MOCAT can be freely downloaded at http://www.bork.embl.de/mocat/. PMID:23082188

  14. MOCAT: a metagenomics assembly and gene prediction toolkit.

    PubMed

    Kultima, Jens Roat; Sunagawa, Shinichi; Li, Junhua; Chen, Weineng; Chen, Hua; Mende, Daniel R; Arumugam, Manimozhiyan; Pan, Qi; Liu, Binghang; Qin, Junjie; Wang, Jun; Bork, Peer

    2012-01-01

    MOCAT is a highly configurable, modular pipeline for fast, standardized processing of single or paired-end sequencing data generated by the Illumina platform. The pipeline uses state-of-the-art programs to quality control, map, and assemble reads from metagenomic samples sequenced at a depth of several billion base pairs, and predict protein-coding genes on assembled metagenomes. Mapping against reference databases allows for read extraction or removal, as well as abundance calculations. Relevant statistics for each processing step can be summarized into multi-sheet Excel documents and queryable SQL databases. MOCAT runs on UNIX machines and integrates seamlessly with the SGE and PBS queuing systems, commonly used to process large datasets. The open source code and modular architecture allow users to modify or exchange the programs that are utilized in the various processing steps. Individual processing steps and parameters were benchmarked and tested on artificial, real, and simulated metagenomes resulting in an improvement of selected quality metrics. MOCAT can be freely downloaded at http://www.bork.embl.de/mocat/.

  15. MGmapper: Reference based mapping and taxonomy annotation of metagenomics sequence reads.

    PubMed

    Petersen, Thomas Nordahl; Lukjancenko, Oksana; Thomsen, Martin Christen Frølund; Maddalena Sperotto, Maria; Lund, Ole; Møller Aarestrup, Frank; Sicheritz-Pontén, Thomas

    2017-01-01

    An increasing number of species and gene identification studies rely on the use of next generation sequence analysis of either single isolate or metagenomics samples. Several methods are available to perform taxonomic annotations and a previous metagenomics benchmark study has shown that a vast number of false positive species annotations are a problem unless thresholds or post-processing are applied to differentiate between correct and false annotations. MGmapper is a package to process raw next generation sequence data and perform reference based sequence assignment, followed by a post-processing analysis to produce reliable taxonomy annotation at species and strain level resolution. An in-vitro bacterial mock community sample comprised of 8 genera, 11 species and 12 strains was previously used to benchmark metagenomics classification methods. After applying a post-processing filter, we obtained 100% correct taxonomy assignments at species and genus level. Sensitivity and precision of 75% were obtained for strain level annotations. A comparison between MGmapper and Kraken at species level shows that MGmapper assigns taxonomy at species level using 84.8% of the sequence reads, compared to 70.5% for Kraken, and both methods identified all species with no false positives. Extensive read count statistics are provided in plain text and Excel sheets for both rejected and accepted taxonomy annotations. The use of custom databases is possible for the command-line version of MGmapper, and the complete pipeline is freely available as a Bitbucket package (https://bitbucket.org/genomicepidemiology/mgmapper). A web-version (https://cge.cbs.dtu.dk/services/MGmapper) provides the basic functionality for analysis of small fastq datasets.

  16. Recombinational Cloning Using Gateway and In-Fusion Cloning Schemes

    PubMed Central

    Throop, Andrea L.; LaBaer, Joshua

    2015-01-01

    The comprehensive study of protein structure and function, or proteomics, depends on the obtainability of full-length cDNAs in species-specific expression vectors and subsequent functional analysis of the expressed protein. Recombinational cloning is a universal cloning technique based on site-specific recombination that is independent of the insert DNA sequence of interest, which differentiates this method from the classical restriction enzyme-based cloning methods. Recombinational cloning enables rapid and efficient parallel transfer of DNA inserts into multiple expression systems. This unit summarizes strategies for generating expression-ready clones using the most popular recombinational cloning technologies, including the commercially available Gateway® (Life Technologies) and In-Fusion® (Clontech) cloning technologies. PMID:25827088

  17. A metagenomic survey of viral abundance and diversity in mosquitoes from Hubei province.

    PubMed

    Shi, Chenyan; Liu, Yi; Hu, Xiaomin; Xiong, Jinfeng; Zhang, Bo; Yuan, Zhiming

    2015-01-01

    Mosquitoes, as one of the most common and important vectors, have the potential to transmit or acquire many viruses through biting; however, the viral flora in mosquitoes and its impact on mosquito-borne disease transmission have not been well investigated and evaluated. In this study, the metagenomic technique has been successfully employed in analyzing the abundance and diversity of the viral community in three mosquito samples from Hubei, China. Among 92,304 reads produced through a run with the 454 GS FLX system, 39% have high similarities with viral sequences belonging to identified bacterial, fungal, animal, plant and insect viruses, and 0.02% were classed into unidentified viral sequences, demonstrating high abundance and diversity of viruses in mosquitoes. Furthermore, two novel viruses in subfamily Densovirinae and family Dicistroviridae were identified, and six torque teno sus virus 1 in family Anelloviridae, three porcine parvoviruses in subfamily Parvovirinae and a Culex tritaeniorhynchus rhabdovirus in family Rhabdoviridae were preliminarily characterized. The viral metagenomic analysis offered us a deep insight into the viral population of mosquito which played an important role in viral initiative or passive transmission and evolution during the process.

  18. A Metagenomic Survey of Viral Abundance and Diversity in Mosquitoes from Hubei Province

    PubMed Central

    Shi, Chenyan; Liu, Yi; Hu, Xiaomin; Xiong, Jinfeng; Zhang, Bo; Yuan, Zhiming

    2015-01-01

    Mosquitoes, as one of the most common and important vectors, have the potential to transmit or acquire many viruses through biting; however, the viral flora in mosquitoes and its impact on mosquito-borne disease transmission have not been well investigated and evaluated. In this study, the metagenomic technique has been successfully employed in analyzing the abundance and diversity of the viral community in three mosquito samples from Hubei, China. Among 92,304 reads produced through a run with the 454 GS FLX system, 39% have high similarities with viral sequences belonging to identified bacterial, fungal, animal, plant and insect viruses, and 0.02% were classed into unidentified viral sequences, demonstrating high abundance and diversity of viruses in mosquitoes. Furthermore, two novel viruses in subfamily Densovirinae and family Dicistroviridae were identified, and six torque teno sus virus 1 in family Anelloviridae, three porcine parvoviruses in subfamily Parvovirinae and a Culex tritaeniorhynchus rhabdovirus in family Rhabdoviridae were preliminarily characterized. The viral metagenomic analysis offered us a deep insight into the viral population of mosquito which played an important role in viral initiative or passive transmission and evolution during the process. PMID:26030271

  19. Direct Cloning of Yeast Genes from an Ordered Set of Lambda Clones in Saccharomyces Cerevisiae by Recombination in Vivo

    PubMed Central

    Erickson, J. R.; Johnston, M.

    1993-01-01

    We describe a technique that facilitates the isolation of yeast genes that are difficult to clone. This technique utilizes a plasmid vector that rescues lambda clones as yeast centromere plasmids. The source of these lambda clones is a set of clones whose location in the yeast genome has been determined by L. Riles et al. in 1993. The Escherichia coli-yeast shuttle plasmid carries URA3, ARS4 and CEN6, and contains DNA fragments from the lambda vector that flank the cloned yeast insert. When yeast is cotransformed with linearized plasmid and lambda clone DNA, Ura(+) transformants are obtained by a recombination event between the lambda clone and the plasmid vector that generates an autonomously replicating plasmid containing the cloned yeast DNA sequences. Genes whose genetic map positions are known can easily be identified and recovered in this plasmid by testing only those lambda clones that map to the relevant region of the yeast genome for their ability to complement the mutant phenotype. This technique facilitates the isolation of yeast genes that resist cloning either because (1) they are underrepresented in yeast genomic libraries amplified in E. coli, (2) they provide phenotypes that are too marginal to allow selection of the gene by genetic complementation or (3) they provide phenotypes that are laborious to score. We demonstrate the utility of this technique by isolating three genes, GAL83, SSN2 and MAK7, each of which presents one of these problems for cloning. PMID:8514124

  20. The effects of variable sample biomass on comparative metagenomics.

    PubMed

    Chafee, Meghan; Maignien, Loïs; Simmons, Sheri L

    2015-07-01

    Longitudinal studies that integrate samples with variable biomass are essential to understand microbial community dynamics across space or time. Shotgun metagenomics is widely used to investigate these communities at the functional level, but little is known about the effects of combining low and high biomass samples on downstream analysis. We investigated the interacting effects of DNA input and library amplification by polymerase chain reaction on comparative metagenomic analysis using dilutions of a single complex template from an Arabidopsis thaliana-associated microbial community. We modified the Illumina Nextera kit to generate high-quality large-insert (680 bp) paired-end libraries using a range of 50 pg to 50 ng of input DNA. Using assembly-based metagenomic analysis, we demonstrate that DNA input level has a significant impact on community structure due to overrepresentation of low-GC genomic regions following library amplification. In our system, these differences were largely superseded by variations between biological replicates, but our results advocate verifying the influence of library amplification on a case-by-case basis. Overall, this study provides recommendations for quality filtering and de-replication prior to analysis, as well as a practical framework to address the issue of low biomass or biomass heterogeneity in longitudinal metagenomic surveys. © 2014 Society for Applied Microbiology and John Wiley & Sons Ltd.

  1. Deconvoluting simulated metagenomes: the performance of hard- and soft- clustering algorithms applied to metagenomic chromosome conformation capture (3C)

    PubMed Central

    DeMaere, Matthew Z.

    2016-01-01

    Background: Chromosome conformation capture, coupled with high throughput DNA sequencing in protocols like Hi-C and 3C-seq, has been proposed as a viable means of generating data to resolve the genomes of microorganisms living in naturally occurring environments. Metagenomic Hi-C and 3C-seq datasets have begun to emerge, but the feasibility of resolving genomes when closely related organisms (strain-level diversity) are present in the sample has not yet been systematically characterised. Methods: We developed a computational simulation pipeline for metagenomic 3C and Hi-C sequencing to evaluate the accuracy of genomic reconstructions at, above, and below an operationally defined species boundary. We simulated datasets and measured accuracy over a wide range of parameters. Five clustering algorithms were evaluated (2 hard, 3 soft) using an adaptation of the extended B-cubed validation measure. Results: When all genomes in a sample are below 95% sequence identity, all of the tested clustering algorithms performed well. When sequence data contains genomes above 95% identity (our operational definition of strain-level diversity), a naive soft-clustering extension of the Louvain method achieves the highest performance. Discussion: Previously, only hard-clustering algorithms have been applied to metagenomic 3C and Hi-C data, yet none of these perform well when strain-level diversity exists in a metagenomic sample. Our simple extension of the Louvain method performed the best in these scenarios; however, accuracy remained well below the levels observed for samples without strain-level diversity. Strain resolution is also highly dependent on the amount of available 3C sequence data, suggesting that depth of sequencing must be carefully considered during experimental design. Finally, there appears to be great scope to improve the accuracy of strain resolution through further algorithm development. PMID:27843713
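
    The validation measure named in the abstract above can be illustrated with a minimal sketch. The abstract refers to an adaptation of the extended B-cubed measure, which handles overlapping (soft) clusters; the code below is only the plain B-cubed precision and recall for a hard clustering, not the authors' implementation. The function name `bcubed` and the dictionary inputs are assumptions made for illustration.

```python
from collections import defaultdict


def bcubed(truth, predicted):
    """Plain B-cubed precision, recall and F1 for a hard clustering.

    truth:     dict mapping each item (e.g. a contig) to its true genome label
    predicted: dict mapping each item to its assigned cluster id
    Both dicts are hypothetical inputs and must cover the same items.
    """
    by_class = defaultdict(set)
    by_cluster = defaultdict(set)
    for item in truth:
        by_class[truth[item]].add(item)
        by_cluster[predicted[item]].add(item)

    precision = recall = 0.0
    for item in truth:
        same_class = by_class[truth[item]]
        same_cluster = by_cluster[predicted[item]]
        overlap = len(same_class & same_cluster)  # always >= 1 (the item itself)
        precision += overlap / len(same_cluster)
        recall += overlap / len(same_class)

    n = len(truth)
    precision /= n
    recall /= n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

    In this sketch, merging two genomes into one cluster lowers precision while recall stays at 1, and splitting one genome across clusters does the reverse.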

  2. Metagenomic Analyses Reveal That Energy Transfer Gene Abundances Can Predict the Syntrophic Potential of Environmental Microbial Communities.

    PubMed

    Oberding, Lisa; Gieg, Lisa M

    2016-01-05

    Hydrocarbon compounds can be biodegraded by anaerobic microorganisms to form methane through an energetically interdependent metabolic process known as syntrophy. The microorganisms that perform this process as well as the energy transfer mechanisms involved are difficult to study and thus are still poorly understood, especially on an environmental scale. Here, metagenomic data was analyzed for specific clusters of orthologous groups (COGs) related to key energy transfer genes thus far identified in syntrophic bacteria, and principal component analysis was used in order to determine whether potentially syntrophic environments could be distinguished using these syntroph related COGs as opposed to universally present COGs. We found that COGs related to hydrogenase and formate dehydrogenase genes were able to distinguish known syntrophic consortia and environments with the potential for syntrophy from non-syntrophic environments, indicating that these COGs could be used as a tool to identify syntrophic hydrocarbon biodegrading environments using metagenomic data.

  3. Metagenomic Analyses Reveal That Energy Transfer Gene Abundances Can Predict the Syntrophic Potential of Environmental Microbial Communities

    PubMed Central

    Oberding, Lisa; Gieg, Lisa M.

    2016-01-01

    Hydrocarbon compounds can be biodegraded by anaerobic microorganisms to form methane through an energetically interdependent metabolic process known as syntrophy. The microorganisms that perform this process as well as the energy transfer mechanisms involved are difficult to study and thus are still poorly understood, especially on an environmental scale. Here, metagenomic data was analyzed for specific clusters of orthologous groups (COGs) related to key energy transfer genes thus far identified in syntrophic bacteria, and principal component analysis was used in order to determine whether potentially syntrophic environments could be distinguished using these syntroph related COGs as opposed to universally present COGs. We found that COGs related to hydrogenase and formate dehydrogenase genes were able to distinguish known syntrophic consortia and environments with the potential for syntrophy from non-syntrophic environments, indicating that these COGs could be used as a tool to identify syntrophic hydrocarbon biodegrading environments using metagenomic data. PMID:27681901

  4. MPD: a pathogen genome and metagenome database

    PubMed Central

    Zhang, Tingting; Miao, Jiaojiao; Han, Na; Qiang, Yujun; Zhang, Wen

    2018-01-01

    Advances in high-throughput sequencing have led to unprecedented growth in the amount of available genome sequencing data, especially for bacterial genomes, which has been accompanied by a challenge for the storage and management of such huge datasets. To facilitate bacterial research and related studies, we have developed the Mypathogen database (MPD), which provides access to users for searching, downloading, storing and sharing bacterial genomics data. The MPD represents the first pathogenic database for microbial genomes and metagenomes, and currently covers pathogenic microbial genomes (6604 genera, 11 071 species, 41 906 strains) and metagenomic data from host, air, water and other sources (28 816 samples). The MPD also functions as a management system for statistical and storage data that can be used by different organizations, thereby facilitating data sharing among different organizations and research groups. A user-friendly local client tool is provided to maintain the steady transmission of big sequencing data. The MPD is a useful tool for analysis and management in genomic research, especially for clinical Centers for Disease Control and epidemiological studies, and is expected to contribute to advancing knowledge on pathogenic bacteria genomes and metagenomes. Database URL: http://data.mypathogen.org PMID:29917040

  5. A retrospective metagenomics approach to studying Blastocystis.

    PubMed

    Andersen, Lee O'Brien; Bonde, Ida; Nielsen, Henrik Bjørn; Stensvold, Christen Rune

    2015-07-01

    Blastocystis is a common single-celled intestinal parasitic genus, comprising several subtypes. Here, we screened data obtained by metagenomic analysis of faecal DNA for Blastocystis by searching for subtype-specific genes in coabundance gene groups, which are groups of genes that covary across a selection of 316 human faecal samples, hence representing genes originating from a single subtype. The 316 faecal samples were from 236 healthy individuals, 13 patients with Crohn's disease (CD) and 67 patients with ulcerative colitis (UC). The prevalence of Blastocystis was 20.3% in the healthy individuals and 14.9% in patients with UC. Meanwhile, Blastocystis was absent in patients with CD. Individuals with intestinal microbiota dominated by Bacteroides were much less prone to having Blastocystis-positive stool (Matthews correlation coefficient = -0.25, P < 0.0001) than individuals with Ruminococcus- and Prevotella-driven enterotypes. This is the first study to investigate the relationship between Blastocystis and communities of gut bacteria using a metagenomics approach. The study serves as an example of how it is possible to retrospectively investigate microbial eukaryotic communities in the gut using metagenomic datasets targeting the bacterial component of the intestinal microbiome and the interplay between these microbial communities.
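
    The association statistic quoted in the abstract above, the Matthews correlation coefficient, is computed from a 2 × 2 contingency table. The following is a minimal sketch of the standard formula; the function name and argument layout are assumptions, and the study's underlying counts are not given in the abstract, so none are reproduced here.

```python
import math


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient from the four cells of a 2x2 table.

    For an enterotype/parasite question the cells could be read as, e.g.,
    tp = Bacteroides-dominated and Blastocystis-positive, and so on; that
    mapping is an illustrative assumption, not the authors' coding.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom
```

    The coefficient ranges from -1 to 1; a negative value, as reported in the abstract, indicates that the two binary traits tend not to occur together.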

  6. Survival of Skin Graft between Transgenic Cloned Dogs and Non-Transgenic Cloned Dogs

    PubMed Central

    Kim, Geon A; Oh, Hyun Ju; Kim, Min Jung; Jo, Young Kwang; Choi, Jin; Park, Jung Eun; Park, Eun Jung; Lim, Sang Hyun; Yoon, Byung Il; Kang, Sung Keun; Jang, Goo; Lee, Byeong Chun

    2014-01-01

    Whereas it has been assumed that genetically modified tissues or cells derived from somatic cell nuclear transfer (SCNT) should be accepted by a host of the same species, their immune compatibility has not been extensively explored. To identify acceptance of SCNT-derived cells or tissues, skin grafts were performed between cloned dogs that were identical except for their mitochondrial DNA (mtDNA) haplotypes and foreign gene. We showed here that differences in mtDNA haplotypes and genetic modification did not elicit immune responses in these dogs: 1) skin tissues from genetically-modified cloned dogs were successfully transplanted into genetically-modified cloned dogs with different mtDNA haplotype under three successive grafts over 63 days; and 2) non-transgenic cloned tissues were accepted into transgenic cloned syngeneic recipients with different mtDNA haplotypes and vice versa under two successive grafts over 63 days. In addition, expression of the inserted gene was maintained, being functional without eliciting graft rejection. In conclusion, these results show that transplanting genetically-modified tissues into normal, syngeneic or genetically-modified recipient dogs with different mtDNA haplotypes do not elicit skin graft rejection or affect expression of the inserted gene. Therefore, therapeutically valuable tissue derived from SCNT with genetic modification might be used safely in clinical applications for patients with diseased tissues. PMID:25372489

  7. Resolving the Complexity of Human Skin Metagenomes Using Single-Molecule Sequencing

    PubMed Central

    Tsai, Yu-Chih; Deming, Clayton; Segre, Julia A.; Kong, Heidi H.; Korlach, Jonas

    2016-01-01

    Deep metagenomic shotgun sequencing has emerged as a powerful tool to interrogate composition and function of complex microbial communities. Computational approaches to assemble genome fragments have been demonstrated to be an effective tool for de novo reconstruction of genomes from these communities. However, the resultant “genomes” are typically fragmented and incomplete due to the limited ability of short-read sequence data to assemble complex or low-coverage regions. Here, we use single-molecule, real-time (SMRT) sequencing to reconstruct a high-quality, closed genome of a previously uncharacterized Corynebacterium simulans and its companion bacteriophage from a skin metagenomic sample. Considerable improvement in assembly quality occurs in hybrid approaches incorporating short-read data, with even relatively small amounts of long-read data being sufficient to improve metagenome reconstruction. Using short-read data to evaluate strain variation of this C. simulans in its skin community at single-nucleotide resolution, we observed a dominant C. simulans strain with moderate allelic heterozygosity throughout the population. We demonstrate the utility of SMRT sequencing and hybrid approaches in metagenome quantitation, reconstruction, and annotation. PMID:26861018

  8. ELIXIR pilot action: Marine metagenomics – towards a domain specific set of sustainable services

    PubMed Central

    Robertsen, Espen Mikal; Denise, Hubert; Mitchell, Alex; Finn, Robert D.; Bongo, Lars Ailo; Willassen, Nils Peder

    2017-01-01

    Metagenomics, the study of genetic material recovered directly from environmental samples, has the potential to provide insight into the structure and function of heterogeneous microbial communities. There has been an increased use of metagenomics to discover and understand the diverse biosynthetic capacities of marine microbes, thereby allowing them to be exploited for industrial, food, and health care products. This ELIXIR pilot action was motivated by the need to establish dedicated data resources and harmonized metagenomics pipelines for the marine domain, in order to enhance the exploration and exploitation of marine genetic resources. In this paper, we summarize some of the results from the ELIXIR pilot action “Marine metagenomics – towards user centric services”. PMID:28620454

  9. Xander: employing a novel method for efficient gene-targeted metagenomic assembly

    DOE PAGES

    Wang, Qiong; Fish, Jordan A.; Gilman, Mariah; ...

    2015-08-05

    Metagenomics can provide important insight into microbial communities. However, assembling metagenomic datasets has proven to be computationally challenging. Current methods often assemble only fragmented partial genes. We present a novel method for targeting assembly of specific protein-coding genes. This method combines a de Bruijn graph, as used in standard assembly approaches, and a protein profile hidden Markov model (HMM) for the gene of interest, as used in standard annotation approaches. These are used to create a novel combined weighted assembly graph. Xander performs both assembly and annotation concomitantly using information incorporated in this graph. We demonstrate the utility of this approach by assembling contigs for one phylogenetic marker gene and for two functional marker genes, first on Human Microbiome Project (HMP)-defined community Illumina data and then on 21 rhizosphere soil metagenomic datasets from three different crops totaling over 800 Gbp of unassembled data. We compared our method to a recently published bulk metagenome assembly method and a recently published gene-targeted assembler and found our method produced more, longer, and higher quality gene sequences. In conclusion, Xander combines gene assignment with the rapid assembly of full-length or near full-length functional genes from metagenomic data without requiring bulk assembly or post-processing to find genes of interest. HMMs used for assembly can be tailored to the targeted genes, allowing flexibility to improve annotation over generic annotation pipelines.

  10. Xander: employing a novel method for efficient gene-targeted metagenomic assembly.

    PubMed

    Wang, Qiong; Fish, Jordan A; Gilman, Mariah; Sun, Yanni; Brown, C Titus; Tiedje, James M; Cole, James R

    2015-01-01

    Metagenomics can provide important insight into microbial communities. However, assembling metagenomic datasets has proven to be computationally challenging. Current methods often assemble only fragmented partial genes. We present a novel method for targeting assembly of specific protein-coding genes. This method combines a de Bruijn graph, as used in standard assembly approaches, and a protein profile hidden Markov model (HMM) for the gene of interest, as used in standard annotation approaches. These are used to create a novel combined weighted assembly graph. Xander performs both assembly and annotation concomitantly using information incorporated in this graph. We demonstrate the utility of this approach by assembling contigs for one phylogenetic marker gene and for two functional marker genes, first on Human Microbiome Project (HMP)-defined community Illumina data and then on 21 rhizosphere soil metagenomic datasets from three different crops totaling over 800 Gbp of unassembled data. We compared our method to a recently published bulk metagenome assembly method and a recently published gene-targeted assembler and found our method produced more, longer, and higher quality gene sequences. Xander combines gene assignment with the rapid assembly of full-length or near full-length functional genes from metagenomic data without requiring bulk assembly or post-processing to find genes of interest. HMMs used for assembly can be tailored to the targeted genes, allowing flexibility to improve annotation over generic annotation pipelines. This method is implemented as open source software and is available at https://github.com/rdpstaff/Xander_assembler.
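
    One of the two ingredients named in the abstract above, the de Bruijn graph, can be sketched in a few lines. This is an illustration of the general data structure only, under the usual convention that nodes are (k-1)-mers and edges are k-mers; it is not Xander's implementation, and it omits the profile HMM and the combined weighted graph entirely. The function name `build_de_bruijn` is hypothetical.

```python
from collections import Counter, defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph from an iterable of read strings.

    Returns a dict mapping each (k-1)-mer prefix to a Counter of the
    (k-1)-mer suffixes that follow it, where the count is how many times
    the corresponding k-mer was seen across all reads.
    """
    graph = defaultdict(Counter)
    for read in reads:
        # Slide a window of length k along the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph
```

    A node with more than one outgoing edge marks a point where reads diverge, which is where an assembler has to choose a path; a gene-targeted approach would use extra information, such as an HMM score, to guide that choice.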

  11. Pooled assembly of marine metagenomic datasets: enriching annotation through chimerism.

    PubMed

    Magasin, Jonathan D; Gerloff, Dietlind L

    2015-02-01

    Despite advances in high-throughput sequencing, marine metagenomic samples remain largely opaque. A typical sample contains billions of microbial organisms from thousands of genomes and quadrillions of DNA base pairs. Its derived metagenomic dataset underrepresents this complexity by orders of magnitude because of the sparseness and shortness of sequencing reads. Read shortness and sequencing errors pose a major challenge to accurate species and functional annotation. This includes distinguishing known from novel species. Often the majority of reads cannot be annotated and thus cannot help our interpretation of the sample. Here, we demonstrate quantitatively how careful assembly of marine metagenomic reads within, but also across, datasets can alleviate this problem. For 10 simulated datasets, each with species complexity modeled on a real counterpart, chimerism remained within the same species for most contigs (97%). For 42 real pyrosequencing ('454') datasets, assembly increased the proportion of annotated reads, and even more so when datasets were pooled, by on average 1.6% (max 6.6%) for species, 9.0% (max 28.7%) for Pfam protein domains and 9.4% (max 22.9%) for PANTHER gene families. Our results outline exciting prospects for data sharing in the metagenomics community. While chimeric sequences should be avoided in other areas of metagenomics (e.g. biodiversity analyses), conservative pooled assembly is advantageous for annotation specificity and sensitivity. Intriguingly, our experiment also found potential prospects for (low-cost) discovery of new species in 'old' data.

  12. Screening for Cellulase Encoding Clones in Metagenomic Libraries.

    PubMed

    Ilmberger, Nele; Streit, Wolfgang R

    2017-01-01

    For modern biotechnology there is a steady need to identify novel enzymes. In biotechnological applications, however, enzymes often must function under extreme and nonnatural conditions (i.e., in the presence of solvents, high temperature and/or at extreme pH values). Cellulases have many industrial applications from the generation of bioethanol, a realistic long-term energy source, to the finishing of textiles. These industrial processes require cellulolytic activity under a wide range of pH, temperature, and ionic conditions, and they are usually carried out by mixtures of cellulases. Investigation of the broad diversity of cellulolytic enzymes involved in the natural degradation of cellulose is necessary for optimizing these processes.

  13. Diel Metagenomics and Metatranscriptomics of Elkhorn Slough Hypersaline Microbial Mat

    NASA Astrophysics Data System (ADS)

    Lee, J.; Detweiler, A. M.; Everroad, R. C.; Bebout, L. E.; Weber, P. K.; Pett-Ridge, J.; Bebout, B.

    2014-12-01

    To understand the variation in gene expression associated with the daytime oxygenic phototrophic and nighttime fermentation regimes seen in hypersaline microbial mats, a contiguous mat piece was subjected to sampling at regular intervals over a 24-hour diel period. Additionally, to understand the impact of sulfate reduction on biohydrogen consumption, molybdate was added to a parallel experiment in the same run. 4 metagenome and 12 metatranscriptome Illumina HiSeq lanes were completed over day/night and control/molybdate experiments. Preliminary comparative examination of noon and midnight metatranscriptomic samples mapped using bowtie2 to reference genomes has revealed several notable results about the dominant mat-building cyanobacterium Microcoleus chthonoplastes PCC 7420. M. chthonoplastes PCC 7420 shows expression in several pathways for nitrogen scavenging, including nitrogen fixation. Reads mapped to M. chthonoplastes PCC 7420 show expression of two starch storage and utilization pathways, one a starch-trehalose-maltose-glucose pathway and the other a UDP-glucose-cellulose-β-1,4 glucan-glucose pathway. The overall trend of gene expression was primarily light-driven up-regulation followed by down-regulation in the dark, while much of the remaining expression profile appears to be constitutive. Co-assembly of quality-controlled reads from the 4 metagenomes was performed using Ray Meta with progressively smaller k-mer sizes, with bins identified and filtered using principal component analysis of coverages from all libraries and a %GC filter, followed by reassembly of the remaining co-assembly reads and binned reads. Despite relatively similar abundance profiles in each metagenome, this binning approach was able to distinctly resolve bins not only from dominant taxa but also from sulfate-reducing bacteria, which are needed for understanding molybdate inhibition. Bins generated from this iterative assembly process will be used for downstream

  14. MALINA: a web service for visual analytics of human gut microbiota whole-genome metagenomic reads.

    PubMed

    Tyakht, Alexander V; Popenko, Anna S; Belenikin, Maxim S; Altukhov, Ilya A; Pavlenko, Alexander V; Kostryukova, Elena S; Selezneva, Oksana V; Larin, Andrei K; Karpova, Irina Y; Alexeev, Dmitry G

    2012-12-07

    MALINA is a web service for bioinformatic analysis of whole-genome metagenomic data obtained from human gut microbiota sequencing. As input data, it accepts metagenomic reads of various sequencing technologies, including long reads (such as Sanger and 454 sequencing) and next-generation reads (including SOLiD and Illumina). To the authors' knowledge, it is the first metagenomic web service that is capable of processing SOLiD color-space reads. The web service allows phylogenetic and functional profiling of metagenomic samples using coverage depth resulting from the alignment of the reads to the catalogue of reference sequences, which are built into the pipeline and contain prevalent microbial genomes and genes of human gut microbiota. The obtained metagenomic composition vectors are processed by the statistical analysis and visualization module containing methods for clustering, dimension reduction and group comparison. Additionally, the MALINA database includes vectors of bacterial and functional composition for human gut microbiota samples from a large number of existing studies, allowing their comparative analysis together with user samples, namely datasets from the Russian Metagenome project, MetaHIT and the Human Microbiome Project (downloaded from http://hmpdacc.org). MALINA is made freely available on the web at http://malina.metagenome.ru. The website is implemented in JavaScript (using Ext JS), Microsoft .NET Framework, MS SQL, Python, with all major browsers supported.

  15. Metagenomics of urban sewage identifies an extensively shared antibiotic resistome in China.

    PubMed

    Su, Jian-Qiang; An, Xin-Li; Li, Bing; Chen, Qing-Lin; Gillings, Michael R; Chen, Hong; Zhang, Tong; Zhu, Yong-Guan

    2017-07-19

    Antibiotic-resistant pathogens are challenging treatment of infections worldwide. Urban sewage is potentially a major conduit for dissemination of antibiotic resistance genes into various environmental compartments. However, the diversity and abundance of such genes in wastewater are not well known. Here, seasonal and geographical distributions of antibiotic resistance genes and their host bacterial communities from Chinese urban sewage were characterized, using metagenomic analyses and 16S rRNA gene-based Illumina sequencing, respectively. In total, 381 different resistance genes were detected, and these genes were extensively shared across China, with no geographical clustering. Seasonal variation in abundance of resistance genes was observed, with average concentrations of 3.27 × 10^11 and 1.79 × 10^12 copies/L in summer and winter, respectively. Bacterial communities did not exhibit geographical clusters, but did show a significant distance-decay relationship (P < 0.01). The core, shared resistome accounted for 57.7% of the total resistance genes, and was significantly associated with the core microbial community (P < 0.01). The core human gut microbiota was also strongly associated with the shared resistome, demonstrating the potential contribution of human gut microbiota to the dissemination of resistance elements via sewage disposal. This study provides a baseline for investigating environmental dissemination of resistance elements and raises the possibility of using the abundance of resistance genes in sewage as a tool for antibiotic stewardship.

  16. Analysis of composition-based metagenomic classification.

    PubMed

    Higashi, Susan; Barreto, André da Motta Salles; Cantão, Maurício Egidio; de Vasconcelos, Ana Tereza Ribeiro

    2012-01-01

    An essential step of a metagenomic study is the taxonomic classification, that is, the identification of the taxonomic lineage of the organisms in a given sample. The taxonomic classification process involves a series of decisions. Currently, in the context of metagenomics, such decisions are usually based on empirical studies that consider one specific type of classifier. In this study we propose a general framework for analyzing the impact that several decisions can have on the classification problem. Instead of focusing on any specific classifier, we define a generic score function that provides a measure of the difficulty of the classification task. Using this framework, we analyze the impact of the following parameters on the taxonomic classification problem: (i) the length of n-mers used to encode the metagenomic sequences, (ii) the similarity measure used to compare sequences, and (iii) the type of taxonomic classification, which can be conventional or hierarchical, depending on whether the classification process occurs in a single shot or in several steps according to the taxonomic tree. We defined a score function that measures the degree of separability of the taxonomic classes under a given configuration induced by the parameters above. We conducted an extensive computational experiment and found out that reasonable values for the parameters of interest could be (i) intermediate values of n, the length of the n-mers; (ii) any similarity measure, because all of them resulted in similar scores; and (iii) the hierarchical strategy, which performed better in all of the cases. As expected, short n-mers generate lower configuration scores because they give rise to frequency vectors that represent distinct sequences in a similar way. On the other hand, large values for n result in sparse frequency vectors that represent differently metagenomic fragments that are in fact similar, also leading to low configuration scores. Regarding the similarity measure, in
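
    Two of the parameters discussed in the abstract above, the n-mer length used to encode a sequence and the similarity measure used to compare encodings, can be made concrete with a short sketch. It builds a normalized n-mer frequency vector over the alphabet ACGT and compares two vectors by cosine similarity. Cosine similarity is chosen here only as one familiar example; the abstract does not say which measures the authors tested, and the function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def nmer_frequencies(seq, n):
    """Return the normalized frequency vector of all 4**n DNA n-mers in seq.

    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    keys = ["".join(p) for p in product("ACGT", repeat=n)]
    total = sum(counts[key] for key in keys)
    if total == 0:
        return [0.0] * len(keys)
    return [counts[key] / total for key in keys]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

    The vector length grows as 4^n, which illustrates the trade-off the abstract describes: small n gives short, dense vectors in which distinct sequences look alike, while large n gives long, sparse vectors in which similar fragments share few non-zero entries.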

  17. Functional and structural characterization of a novel putative cysteine protease cell wall-modifying multi-domain enzyme selected from a microbial metagenome.

    PubMed

    Faheem, Muhammad; Martins-de-Sa, Diogo; Vidal, Julia F D; Álvares, Alice C M; Brandão-Neto, José; Bird, Louise E; Tully, Mark D; von Delft, Frank; Souto, Betulia M; Quirino, Betania F; Freitas, Sonia M; Barbosa, João Alexandre R G

    2016-12-09

    A current metagenomics focus is to interpret and transform collected genomic data into biological information. By combining structural, functional and genomic data we have assessed a novel bacterial protein selected from a carbohydrate-related activity screen in a microbial metagenomic library from Capra hircus (domestic goat) gut. This uncharacterized protein was predicted as a bacterial cell wall-modifying enzyme (CWME) and shown to contain four domains: an N-terminal, a cysteine protease, a peptidoglycan-binding and an SH3 bacterial domain. We successfully cloned, expressed and purified this putative cysteine protease (PCP), which presented autoproteolytic activity and inhibition by protease inhibitors. We observed cell wall hydrolytic activity and ampicillin binding capacity, a characteristic of most bacterial CWME. Fluorimetric binding analysis yielded a K_b of 1.8 × 10^5 M^-1 for ampicillin. Small-angle X-ray scattering (SAXS) showed a maximum particle dimension of 95 Å with a real-space R_g of 28.35 Å. The elongated molecular envelope corroborates the dynamic light scattering (DLS) estimated size. Furthermore, homology modeling and SAXS allowed the construction of a model that explains the stability and secondary structural changes observed by circular dichroism (CD). In short, we report a novel cell wall-modifying autoproteolytic PCP with insight into its biochemical, biophysical and structural features.

  18. Positional cloning of zebrafish ferroportin1 identifies a conserved vertebrate iron exporter.

    PubMed

    Donovan, A; Brownlie, A; Zhou, Y; Shepard, J; Pratt, S J; Moynihan, J; Paw, B H; Drejer, A; Barut, B; Zapata, A; Law, T C; Brugnara, C; Lux, S E; Pinkus, G S; Pinkus, J L; Kingsley, P D; Palis, J; Fleming, M D; Andrews, N C; Zon, L I

    2000-02-17

    Defects in iron absorption and utilization lead to iron deficiency and overload disorders. Adult mammals absorb iron through the duodenum, whereas embryos obtain iron through placental transport. Iron uptake from the intestinal lumen through the apical surface of polarized duodenal enterocytes is mediated by the divalent metal transporter, DMT1. A second transporter has been postulated to export iron across the basolateral surface to the circulation. Here we have used positional cloning to identify the gene responsible for the hypochromic anaemia of the zebrafish mutant weissherbst. The gene, ferroportin1, encodes a multiple-transmembrane domain protein, expressed in the yolk sac, that is a candidate for the elusive iron exporter. Zebrafish ferroportin1 is required for the transport of iron from maternally derived yolk stores to the circulation and functions as an iron exporter when expressed in Xenopus oocytes. Human Ferroportin1 is found at the basal surface of placental syncytiotrophoblasts, suggesting that it also transports iron from mother to embryo. Mammalian Ferroportin1 is expressed at the basolateral surface of duodenal enterocytes and could export cellular iron into the circulation. We propose that Ferroportin1 function may be perturbed in mammalian disorders of iron deficiency or overload.

  19. The soil microbiome — from metagenomics to metaphenomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jansson, Janet K.; Hofmockel, Kirsten S.

    Soil microorganisms carry out important processes, including support of plant growth and cycling of carbon and other nutrients. However, the majority of soil microbes have not yet been isolated and their functions are largely unknown. Although metagenomic sequencing reveals microbial identities and functional gene information, it includes DNA from microbes with vastly varying physiological states. Therefore, metagenomics is only predictive of community functional potential. We posit that the next frontier lies in understanding the metaphenome, the product of the combined genetic potential of the microbiome and available resources. Here in this paper we describe examples of opportunities towards gaining understanding of the soil metaphenome.

  20. Kraken: ultrafast metagenomic sequence classification using exact alignments

    PubMed Central

    2014-01-01

    Kraken is an ultrafast and highly accurate program for assigning taxonomic labels to metagenomic DNA sequences. Previous programs designed for this task have been relatively slow and computationally expensive, forcing researchers to use faster abundance estimation programs, which only classify small subsets of metagenomic data. Using exact alignment of k-mers, Kraken achieves classification accuracy comparable to the fastest BLAST program. In its fastest mode, Kraken classifies 100 base pair reads at a rate of over 4.1 million reads per minute, 909 times faster than Megablast and 11 times faster than the abundance estimation program MetaPhlAn. Kraken is available at http://ccb.jhu.edu/software/kraken/. PMID:24580807
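
    The idea of classifying reads by exact k-mer matches can be illustrated with a toy example. The sketch below is not Kraken's implementation: it assumes a tiny in-memory index, ignores the taxonomy tree entirely, and simply skips k-mers shared between taxa. The reference sequences, the taxon names and the choice of k = 5 are all hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Map each k-mer to the set of taxa whose reference contains it."""
    index = {}
    for taxon, seq in references.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(taxon)
    return index


def classify(read, index, k):
    """Return the taxon supported by the most taxon-specific k-mers, or None."""
    votes = Counter()
    for km in kmers(read, k):
        taxa = index.get(km)
        # Only exact matches to k-mers found in a single taxon count as votes.
        if taxa and len(taxa) == 1:
            votes[next(iter(taxa))] += 1
    if not votes:
        return None  # no exact k-mer hit: the read stays unclassified
    return votes.most_common(1)[0][0]


# Hypothetical toy references, for illustration only.
references = {
    "taxon_A": "ACGTACGGTCAGTTACGGA",
    "taxon_B": "TTGACCGATAGGCTTAACG",
}
index = build_index(references, k=5)
label = classify("CGGTCAGTTAC", index, k=5)
```

    Because each lookup is a dictionary access rather than an alignment, the cost per read grows only with the number of k-mers in the read, which is the intuition behind the speed reported above.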

  1. The soil microbiome — from metagenomics to metaphenomics

    DOE PAGES

    Jansson, Janet K.; Hofmockel, Kirsten S.

    2018-02-15

    Soil microorganisms carry out important processes, including support of plant growth and cycling of carbon and other nutrients. However, the majority of soil microbes have not yet been isolated and their functions are largely unknown. Although metagenomic sequencing reveals microbial identities and functional gene information, it includes DNA from microbes with vastly varying physiological states. Therefore, metagenomics is only predictive of community functional potential. We posit that the next frontier lies in understanding the metaphenome, the product of the combined genetic potential of the microbiome and available resources. Here in this paper we describe examples of opportunities towards gaining understanding of the soil metaphenome.

  2. Laboratory procedures to generate viral metagenomes.

    PubMed

    Thurber, Rebecca V; Haynes, Matthew; Breitbart, Mya; Wegley, Linda; Rohwer, Forest

    2009-01-01

    This collection of laboratory protocols describes the steps to collect viruses from various samples with the specific aim of generating viral metagenome sequence libraries (viromes). Viral metagenomics, the study of uncultured viral nucleic acid sequences from different biomes, relies on several concentration, purification, extraction, sequencing and heuristic bioinformatic methods. No single technique can provide an all-inclusive approach, and therefore the protocols presented here will be discussed in terms of hypothetical projects. However, care must be taken to individualize each step depending on the source and type of viral-particles. This protocol is a description of the processes we have successfully used to: (i) concentrate viral particles from various types of samples, (ii) eliminate contaminating cells and free nucleic acids and (iii) extract, amplify and purify viral nucleic acids. Overall, a sample can be processed to isolate viral nucleic acids suitable for high-throughput sequencing in approximately 1 week.

  3. MGmapper: Reference based mapping and taxonomy annotation of metagenomics sequence reads

    PubMed Central

    Lukjancenko, Oksana; Thomsen, Martin Christen Frølund; Maddalena Sperotto, Maria; Lund, Ole; Møller Aarestrup, Frank; Sicheritz-Pontén, Thomas

    2017-01-01

    An increasing number of species and gene identification studies rely on the use of next generation sequence analysis of either single isolate or metagenomics samples. Several methods are available to perform taxonomic annotations and a previous metagenomics benchmark study has shown that a vast number of false positive species annotations are a problem unless thresholds or post-processing are applied to differentiate between correct and false annotations. MGmapper is a package to process raw next generation sequence data and perform reference based sequence assignment, followed by a post-processing analysis to produce reliable taxonomy annotation at species and strain level resolution. An in-vitro bacterial mock community sample comprised of 8 genera, 11 species and 12 strains was previously used to benchmark metagenomics classification methods. After applying a post-processing filter, we obtained 100% correct taxonomy assignments at species and genus level. Sensitivity and precision of 75% were obtained for strain level annotations. A comparison between MGmapper and Kraken at species level shows that MGmapper assigns taxonomy at species level using 84.8% of the sequence reads, compared to 70.5% for Kraken, and both methods identified all species with no false positives. Extensive read count statistics are provided in plain text and Excel sheets for both rejected and accepted taxonomy annotations. The use of custom databases is possible for the command-line version of MGmapper, and the complete pipeline is freely available as a Bitbucket package (https://bitbucket.org/genomicepidemiology/mgmapper). A web-version (https://cge.cbs.dtu.dk/services/MGmapper) provides the basic functionality for analysis of small fastq datasets. PMID:28467460
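
    For readers less familiar with the two strain-level metrics quoted above, the sketch below shows how sensitivity and precision are conventionally computed from counts of true positives, false negatives and false positives. The abstract does not report the underlying counts, so the numbers used here (9 of 12 strains recovered, 3 false calls) are hypothetical values chosen only because they reproduce a 75% figure.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of the strains actually present that were reported."""
    return true_pos / (true_pos + false_neg)


def precision(true_pos, false_pos):
    """Fraction of the reported strains that were actually present."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts, not taken from the MGmapper paper.
strain_sensitivity = sensitivity(true_pos=9, false_neg=3)
strain_precision = precision(true_pos=9, false_pos=3)
```

    A post-processing filter of the kind described above typically trades one metric against the other: stricter thresholds remove false positives (raising precision) at the risk of discarding true annotations (lowering sensitivity).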

  4. Human cloning 2001.

    PubMed

    Healy, David L; Weston, Gareth; Pera, Martin F; Rombauts, Luk; Trounson, Alan O

    2002-05-01

    This review summarises human cloning from a clinical perspective. Natural human clones, that is, monozygotic twins, are increasing in the general community. Iatrogenic human clones have been produced for decades in infertile couples given fertility treatment such as ovulation induction. A clear distinction must be made between therapeutic cloning using embryonic stem cells and reproductive cloning attempts. Unlike the early clinical years of in vitro fertilization, with cloning there is no animal model that is safe and dependable. Until there is such a model, 'Dolly'-style human cloning is medically unacceptable.

  5. Comparative (Meta)genomic Analysis and Ecological Profiling of Human Gut-Specific Bacteriophage φB124-14

    PubMed Central

    Ogilvie, Lesley A.; Caplin, Jonathan; Dedi, Cinzia; Diston, David; Cheek, Elizabeth; Bowler, Lucas; Taylor, Huw; Ebdon, James; Jones, Brian V.

    2012-01-01

    Bacteriophage associated with the human gut microbiome are likely to have an important impact on community structure and function, and provide a wealth of biotechnological opportunities. Despite this, knowledge of the ecology and composition of bacteriophage in the gut bacterial community remains poor, with few well characterized gut-associated phage genomes currently available. Here we describe the identification and in-depth (meta)genomic, proteomic, and ecological analysis of a human gut-specific bacteriophage (designated φB124-14). In doing so we illuminate a fraction of the biological dark matter extant in this ecosystem and its surrounding eco-genomic landscape, identifying a novel and uncharted bacteriophage gene-space in this community. φB124-14 infects only a subset of closely related gut-associated Bacteroides fragilis strains, and the circular genome encodes functions previously found to be rare in viral genomes and human gut viral metagenome sequences, including those which potentially confer advantages upon phage and/or host bacteria. Comparative genomic analyses revealed φB124-14 is most closely related to φB40-8, the only other publicly available Bacteroides sp. phage genome, whilst comparative metagenomic analysis of both phage failed to identify any homologous sequences in 136 non-human gut metagenomic datasets searched, supporting the human gut-specific nature of this phage. Moreover, a potential geographic variation in the carriage of these and related phage was revealed by analysis of their distribution and prevalence within 151 human gut microbiomes and viromes from Europe, America and Japan. Finally, ecological profiling of φB124-14 and φB40-8, using both gene-centric alignment-driven phylogenetic analyses, as well as alignment-free gene-independent approaches was undertaken. This not only verified the human gut-specific nature of both phage, but also indicated that these phage populate a distinct and unexplored ecological landscape

  6. A Metagenomic Approach to Cyanobacterial Genomics

    PubMed Central

    Alvarenga, Danillo O.; Fiore, Marli F.; Varani, Alessandro M.

    2017-01-01

    Cyanobacteria, or oxyphotobacteria, are primary producers that establish ecological interactions with a wide variety of organisms. Although their associations with eukaryotes have received most attention, interactions with bacterial and archaeal symbionts have also been occurring for billions of years. Due to these associations, obtaining axenic cultures of cyanobacteria is usually difficult, and most isolation efforts result in unicyanobacterial cultures containing a number of associated microbes, hence composing a microbial consortium. With rising numbers of cyanobacterial blooms due to climate change, demand for genomic evaluations of these microorganisms is increasing. However, standard genomic techniques call for the sequencing of axenic cultures, an approach that not only adds months or even years for culture purification, but also appears to be impossible for some cyanobacteria, which is reflected in the relatively low number of publicly available genomic sequences of this phylum. Under the framework of metagenomics, on the other hand, cumbersome techniques for achieving axenic growth can be circumvented and individual genomes can be successfully obtained from microbial consortia. This review focuses on approaches for the genomic and metagenomic assessment of non-axenic cyanobacterial cultures that bypass requirements for axenity. These methods enable researchers to achieve faster and less costly genomic characterizations of cyanobacterial strains and raise additional information about their associated microorganisms. While non-axenic cultures may have been previously frowned upon in cyanobacteriology, latest advancements in metagenomics have provided new possibilities for in vitro studies of oxyphotobacteria, renewing the value of microbial consortia as a reliable and functional resource for the rapid assessment of bloom-forming cyanobacteria. PMID:28536564

  7. Bayesian mixture analysis for metagenomic community profiling.

    PubMed

    Morfopoulou, Sofia; Plagnol, Vincent

    2015-09-15

    Deep sequencing of clinical samples is now an established tool for the detection of infectious pathogens, with direct medical applications. The large amount of data generated produces an opportunity to detect species even at very low levels, provided that computational tools can effectively profile the relevant metagenomic communities. Data interpretation is complicated by the fact that short sequencing reads can match multiple organisms and by the lack of completeness of existing databases, in particular for viral pathogens. Here we present metaMix, a Bayesian mixture model framework for resolving complex metagenomic mixtures. We show that the use of parallel Monte Carlo Markov chains for the exploration of the species space enables the identification of the set of species most likely to contribute to the mixture. We demonstrate the greater accuracy of metaMix compared with relevant methods, particularly for profiling complex communities consisting of several related species. We designed metaMix specifically for the analysis of deep transcriptome sequencing datasets, with a focus on viral pathogen detection; however, the principles are generally applicable to all types of metagenomic mixtures. metaMix is implemented as a user friendly R package, freely available on CRAN: http://cran.r-project.org/web/packages/metaMix. Contact: sofia.morfopoulou.10@ucl.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  8. Diverse Array of New Viral Sequences Identified in Worldwide Populations of the Asian Citrus Psyllid (Diaphorina citri) Using Viral Metagenomics

    PubMed Central

    Nouri, Shahideh; Salem, Nidá; Nigg, Jared C.

    2015-01-01

    ABSTRACT The Asian citrus psyllid, Diaphorina citri, is the natural vector of the causal agent of Huanglongbing (HLB), or citrus greening disease. Together, HLB and D. citri represent a major threat to world citrus production. As there is no cure for HLB, insect vector management is considered one strategy to help control the disease, and D. citri viruses might be useful. In this study, we used a metagenomic approach to analyze viral sequences associated with the global population of D. citri. By sequencing small RNAs and the transcriptome coupled with bioinformatics analysis, we showed that the virus-like sequences of D. citri are diverse. We identified novel viral sequences belonging to the picornavirus superfamily, the Reoviridae, Parvoviridae, and Bunyaviridae families, and an unclassified positive-sense single-stranded RNA virus. Moreover, a Wolbachia prophage-related sequence was identified. This is the first comprehensive survey to assess the viral community from worldwide populations of an agricultural insect pest. Our results provide valuable information on new putative viruses, some of which may have the potential to be used as biocontrol agents. IMPORTANCE Insects have the most species of all animals, and are hosts to, and vectors of, a great variety of known and unknown viruses. Some of these most likely have the potential to be important fundamental and/or practical resources. In this study, we used high-throughput next-generation sequencing (NGS) technology and bioinformatics analysis to identify putative viruses associated with Diaphorina citri, the Asian citrus psyllid. D. citri is the vector of the bacterium causing Huanglongbing (HLB), currently the most serious threat to citrus worldwide. Here, we report several novel viral sequences associated with D. citri. PMID:26676774

  9. A strategy for clone selection under different production conditions.

    PubMed

    Legmann, Rachel; Benoit, Brian; Fedechko, Ronald W; Deppeler, Cynthia L; Srinivasan, Sriram; Robins, Russell H; McCormick, Ellen L; Ferrick, David A; Rodgers, Seth T; Russo, A Peter

    2011-01-01

    Top performing clones have failed at the manufacturing scale while the true best performer may have been rejected early in the screening process. Therefore, the ability to screen multiple clones in complex fed-batch processes using multiple process variations can be used to assess robustness and to identify critical factors. This dynamic clone-ranking strategy requires the execution of many more parallel experiments than traditional approaches. Therefore, this approach is best suited for micro-bioreactor models which can perform hundreds of experiments quickly and efficiently. In this study, a fully monitored and controlled small scale platform was used to screen eight CHO clones producing a recombinant monoclonal antibody across several process variations, including different feeding strategies, temperature shifts and pH control profiles. The first screen utilized 240 micro-bioreactors run for two weeks to assess the scale-down model as a high-throughput tool for clone evaluation. The richness of the outcome data made it possible to clearly identify the best and worst clones, as well as the best and worst processes, in terms of maximum monoclonal antibody titer. The follow-up comparison study utilized 180 micro-bioreactors in a full factorial design, and a subset of 12 clone/process combinations was selected to be run in parallel in duplicate shake flasks. Good correlation was observed between the micro-bioreactor predictions and those made in shake flasks, with a Pearson correlation value of 0.94. The results also demonstrate that this micro-scale system can perform clone screening and process optimization simultaneously to gain significant titer improvements. This dynamic ranking strategy can support better choices of production clones. Copyright © 2011 American Institute of Chemical Engineers (AIChE).
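
    The agreement between micro-bioreactor and shake-flask results is summarized above by a Pearson correlation coefficient. As a reminder of what that statistic measures, the sketch below computes it from paired values. The titer lists are hypothetical placeholders, since the abstract reports only the coefficient and not the underlying data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator) and the
    # sums of squared deviations for each variable.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired titers (arbitrary units), one pair per clone/process
# combination; these are NOT values from the study.
micro_bioreactor = [1.2, 1.9, 2.4, 3.1, 3.8, 4.4]
shake_flask = [1.0, 2.1, 2.2, 3.3, 3.6, 4.7]
r = pearson(micro_bioreactor, shake_flask)
```

    A coefficient near 1 indicates that conditions ranking high in one system also rank high in the other, which is the property that matters when a scale-down model is used to pre-select clones.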

  10. Validation of Metagenomic Next-Generation Sequencing Tests for Universal Pathogen Detection.

    PubMed

    Schlaberg, Robert; Chiu, Charles Y; Miller, Steve; Procop, Gary W; Weinstock, George

    2017-06-01

    Context: Metagenomic sequencing can be used for detection of any pathogens using unbiased, shotgun next-generation sequencing (NGS), without the need for sequence-specific amplification. Proof-of-concept has been demonstrated in infectious disease outbreaks of unknown causes and in patients with suspected infections but negative results for conventional tests. Metagenomic NGS tests hold great promise to improve infectious disease diagnostics, especially in immunocompromised and critically ill patients. Objective: To discuss challenges and provide example solutions for validating metagenomic pathogen detection tests in clinical laboratories. A summary of current regulatory requirements, largely based on prior guidance for NGS testing in constitutional genetics and oncology, is provided. Data sources: Examples from 2 separate validation studies are provided for steps from assay design, and validation of wet bench and bioinformatics protocols, to quality control and assurance. Conclusions: Although laboratory and data analysis workflows are still complex, metagenomic NGS tests for infectious diseases are increasingly being validated in clinical laboratories. Many parallels exist to NGS tests in other fields. Nevertheless, specimen preparation, rapidly evolving data analysis algorithms, and incomplete reference sequence databases are idiosyncratic to the field of microbiology and often overlooked.

  11. Tracking cashew economically important diseases in the West African region using metagenomics

    PubMed Central

    Monteiro, Filipa; Romeiras, Maria M.; Figueiredo, Andreia; Sebastiana, Mónica; Baldé, Aladje; Catarino, Luís; Batista, Dora

    2015-01-01

    During the last decades, agricultural land-uses in West Africa were marked by dramatic shifts in the coverage of individual crops. Nowadays, cashew (Anacardium occidentale L.) is one of the most export-oriented horticulture crops, notably in Guinea-Bissau. Relying heavily on agriculture to increase their income, developing countries have been following a strong trend of moving on from traditional farming systems toward commercial production. Emerging infectious diseases, driven either by adaptation to local conditions or inadvertent importation of plant pathogens, are able to cause tremendous cashew production losses, with economic and social impacts that are often underestimated in developing countries. Presently, plant genomics, with metagenomics as an emergent tool, presents an enormous potential to better characterize diseases by providing extensive knowledge on plant pathogens at a large scale. In this perspective, we address metagenomics as a promising genomic tool to identify cashew fungal associated diseases as well as to discriminate the causal pathogens, aiming at obtaining tools to help design effective strategies for disease control and thus promote the sustainable production of cashew in the West African region. PMID:26175748

  12. Tracking cashew economically important diseases in the West African region using metagenomics.

    PubMed

    Monteiro, Filipa; Romeiras, Maria M; Figueiredo, Andreia; Sebastiana, Mónica; Baldé, Aladje; Catarino, Luís; Batista, Dora

    2015-01-01

    During the last decades, agricultural land-uses in West Africa were marked by dramatic shifts in the coverage of individual crops. Nowadays, cashew (Anacardium occidentale L.) is one of the most export-oriented horticulture crops, notably in Guinea-Bissau. Relying heavily on agriculture to increase their income, developing countries have been following a strong trend of moving on from traditional farming systems toward commercial production. Emerging infectious diseases, driven either by adaptation to local conditions or inadvertent importation of plant pathogens, are able to cause tremendous cashew production losses, with economic and social impacts that are often underestimated in developing countries. Presently, plant genomics, with metagenomics as an emergent tool, presents an enormous potential to better characterize diseases by providing extensive knowledge on plant pathogens at a large scale. In this perspective, we address metagenomics as a promising genomic tool to identify cashew fungal associated diseases as well as to discriminate the causal pathogens, aiming at obtaining tools to help design effective strategies for disease control and thus promote the sustainable production of cashew in the West African region.

  13. Recovery of a Medieval Brucella melitensis Genome Using Shotgun Metagenomics

    PubMed Central

    Kay, Gemma L.; Sergeant, Martin J.; Giuffra, Valentina; Bandiera, Pasquale; Milanese, Marco; Bramanti, Barbara

    2014-01-01

    ABSTRACT Shotgun metagenomics provides a powerful assumption-free approach to the recovery of pathogen genomes from contemporary and historical material. We sequenced the metagenome of a calcified nodule from the skeleton of a 14th-century middle-aged male excavated from the medieval Sardinian settlement of Geridu. We obtained 6.5-fold coverage of a Brucella melitensis genome. Sequence reads from this genome showed signatures typical of ancient or aged DNA. Despite the relatively low coverage, we were able to use information from single-nucleotide polymorphisms to place the medieval pathogen genome within a clade of B. melitensis strains that included the well-studied Ether strain and two other recent Italian isolates. We confirmed this placement using information from deletions and IS711 insertions. We conclude that metagenomics stands ready to document past and present infections, shedding light on the emergence, evolution, and spread of microbial pathogens. PMID:25028426

  14. Cloning of a newly identified heart-specific troponin I isoform, which lacks the troponin T binding portion, using the yeast hybrid system.

    PubMed

    Suzuki, Hideaki; Arakawa, Yasuhiro; Ito, Masaki; Yamada, Hisashi; Horiguchi-Yamada, Junko

    2006-01-01

    The aim was to elucidate the molecular pathogenesis behind increased levels of laminin in cardiac muscle cells in cardiomyopathy by using a yeast hybrid screen. The present study reports the cloning of a newly identified heart-specific troponin I isoform, which is putatively linked to laminin. Future studies will explore the functional significance of this connection. Yeast two-hybrid screen analysis was performed using MLF1-interacting protein (amino acids 1 to 318) as bait. The human heart complementary DNA library was screened by using the yeast-mating method for overnight culture. Two final positive clones from the heart library were isolated. These two clones encoded the same protein, a short isoform of human cardiac troponin I (TnI) that lacked TnI exons 5 and 6. The TnI isoform has a heart-specific expression pattern and it shares several sequence features with human cardiac TnI; however, it lacks the troponin T binding portion. The heart-specific segment of the human cardiac TnI isoform shares several sequence features with human cardiac TnI, but it lacks the troponin T binding portion. These results suggest that the heart-specific TnI isoform may be involved in cardiac development and disease.

  15. Elucidation of Taste- and Odor-Producing Bacteria and Toxigenic Cyanobacteria in a Midwestern Drinking Water Supply Reservoir by Shotgun Metagenomic Analysis.

    PubMed

    Otten, Timothy G; Graham, Jennifer L; Harris, Theodore D; Dreher, Theo W

    2016-09-01

    While commonplace in clinical settings, DNA-based assays for identification or enumeration of drinking water pathogens and other biological contaminants remain widely unadopted by the monitoring community. In this study, shotgun metagenomics was used to identify taste-and-odor producers and toxin-producing cyanobacteria over a 2-year period in a drinking water reservoir. The sequencing data implicated several cyanobacteria, including Anabaena spp., Microcystis spp., and an unresolved member of the order Oscillatoriales as the likely principal producers of geosmin, microcystin, and 2-methylisoborneol (MIB), respectively. To further demonstrate this, quantitative PCR (qPCR) assays targeting geosmin-producing Anabaena and microcystin-producing Microcystis were utilized, and these data were fitted using generalized linear models and compared with routine monitoring data, including microscopic cell counts, sonde-based physicochemical analyses, and assays of all inorganic and organic nitrogen and phosphorus forms and fractions. The qPCR assays explained the greatest variation in observed geosmin (adjusted R^2 = 0.71) and microcystin (adjusted R^2 = 0.84) concentrations over the study period, highlighting their potential for routine monitoring applications. The origin of the monoterpene cyclase required for MIB biosynthesis was putatively linked to a periphytic cyanobacterial mat attached to the concrete drinking water inflow structure. We conclude that shotgun metagenomics can be used to identify microbial agents involved in water quality deterioration and to guide PCR assay selection or design for routine monitoring purposes. Finally, we offer estimates of microbial diversity and metagenomic coverage of our data sets for reference to others wishing to apply shotgun metagenomics to other lacustrine systems. Cyanobacterial toxins and microbial taste-and-odor compounds are a growing concern for drinking water utilities reliant upon surface water resources. Specific

  16. To clone or not to clone--a Jewish perspective.

    PubMed Central

    Lipschutz, J H

    1999-01-01

    Many new reproductive methods such as artificial insemination, in vitro fertilisation, freezing of human embryos, and surrogate motherhood were at first widely condemned but are now seen in Western society as not just ethically and morally acceptable, but beneficial in that they allow otherwise infertile couples to have children. The idea of human cloning was also quickly condemned but debate is now emerging. This article examines cloning from a Jewish perspective and finds evidence to support the view that there is nothing inherently wrong with the idea of human cloning. A hypothesis is also advanced suggesting that even if a body was cloned, the brain, which is the essence of humanity, would remain unique. This author suggests that the debate should be changed from "Is cloning wrong?" to "When is cloning wrong?". PMID:10226913

  17. Protein Structure Determination using Metagenome sequence data

    PubMed Central

    Ovchinnikov, Sergey; Park, Hahnbeom; Varghese, Neha; Huang, Po-Ssu; Pavlopoulos, Georgios A.; Kim, David E.; Kamisetty, Hetunandan; Kyrpides, Nikos C.; Baker, David

    2017-01-01

    Despite decades of work by structural biologists, there are still ~5200 protein families with unknown structure outside the range of comparative modeling. We show that Rosetta structure prediction guided by residue-residue contacts inferred from evolutionary information can accurately model proteins that belong to large families, and that metagenome sequence data more than triples the number of protein families with sufficient sequences for accurate modeling. We then integrate metagenome data, contact based structure matching and Rosetta structure calculations to generate models for 614 protein families with currently unknown structures; 206 are membrane proteins and 137 have folds not represented in the PDB. This approach provides the representative models for large protein families originally envisioned as the goal of the protein structure initiative at a fraction of the cost. PMID:28104891

  18. Metabolic Reconstruction for Metagenomic Data and Its Application to the Human Microbiome

    PubMed Central

    Abubucker, Sahar; Segata, Nicola; Goll, Johannes; Schubert, Alyxandria M.; Izard, Jacques; Cantarel, Brandi L.; Rodriguez-Mueller, Beltran; Zucker, Jeremy; Thiagarajan, Mathangi; Henrissat, Bernard; White, Owen; Kelley, Scott T.; Methé, Barbara; Schloss, Patrick D.; Gevers, Dirk; Mitreva, Makedonka; Huttenhower, Curtis

    2012-01-01

    Microbial communities carry out the majority of the biochemical activity on the planet, and they play integral roles in processes including metabolism and immune homeostasis in the human microbiome. Shotgun sequencing of such communities' metagenomes provides information complementary to organismal abundances from taxonomic markers, but the resulting data typically comprise short reads from hundreds of different organisms and are at best challenging to assemble comparably to single-organism genomes. Here, we describe an alternative approach to infer the functional and metabolic potential of a microbial community metagenome. We determined the gene families and pathways present or absent within a community, as well as their relative abundances, directly from short sequence reads. We validated this methodology using a collection of synthetic metagenomes, recovering the presence and abundance both of large pathways and of small functional modules with high accuracy. We subsequently applied this method, HUMAnN, to the microbial communities of 649 metagenomes drawn from seven primary body sites on 102 individuals as part of the Human Microbiome Project (HMP). This provided a means to compare functional diversity and organismal ecology in the human microbiome, and we determined a core of 24 ubiquitously present modules. Core pathways were often implemented by different enzyme families within different body sites, and 168 functional modules and 196 metabolic pathways varied in metagenomic abundance specifically to one or more niches within the microbiome. These included glycosaminoglycan degradation in the gut, as well as phosphate and amino acid transport linked to host phenotype (vaginal pH) in the posterior fornix. An implementation of our methodology is available at http://huttenhower.sph.harvard.edu/humann. 
    This provides a means to accurately and efficiently characterize microbial metabolic pathways and functional modules directly from high-throughput sequencing reads.
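
    The idea of estimating gene-family relative abundances and pathway presence directly from short reads can be illustrated with a minimal sketch. This is not the HUMAnN implementation: the function names, the length normalization, and the family identifiers (K00001 and so on) are hypothetical and chosen only for illustration.

```python
from collections import Counter

def family_relative_abundance(read_hits, gene_lengths):
    """read_hits: one gene-family ID per mapped read (hypothetical IDs).
    gene_lengths: family ID -> representative gene length in bp."""
    counts = Counter(read_hits)
    # Length-normalize so that long genes are not over-counted.
    density = {fam: n / gene_lengths[fam] for fam, n in counts.items()}
    total = sum(density.values())
    return {fam: d / total for fam, d in density.items()}

def pathway_coverage(abundance, pathway_members):
    """Fraction of a pathway's member families detected in the sample."""
    present = sum(1 for fam in pathway_members if abundance.get(fam, 0.0) > 0.0)
    return present / len(pathway_members)

# Toy input: six mapped reads hitting three families.
hits = ["K00001", "K00001", "K00002", "K00003", "K00003", "K00003"]
lengths = {"K00001": 1000, "K00002": 500, "K00003": 1500}

abund = family_relative_abundance(hits, lengths)
coverage = pathway_coverage(abund, ["K00001", "K00002", "K00004"])
print(abund)
print(coverage)
```

    In this toy input all three families have the same read density, so each receives one third of the abundance, and the hypothetical three-member pathway is two-thirds covered.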

  19. Assessment of the cPAS-based BGISEQ-500 platform for metagenomic sequencing.

    PubMed

    Fang, Chao; Zhong, Huanzi; Lin, Yuxiang; Chen, Bing; Han, Mo; Ren, Huahui; Lu, Haorong; Luber, Jacob M; Xia, Min; Li, Wangsheng; Stein, Shayna; Xu, Xun; Zhang, Wenwei; Drmanac, Radoje; Wang, Jian; Yang, Huanming; Hammarström, Lennart; Kostic, Aleksandar D; Kristiansen, Karsten; Li, Junhua

    2018-03-01

    More extensive use of metagenomic shotgun sequencing in microbiome research relies on the development of high-throughput, cost-effective sequencing. Here we present a comprehensive evaluation of the performance of the new high-throughput sequencing platform BGISEQ-500 for metagenomic shotgun sequencing and compare its performance with that of 2 Illumina platforms. Using fecal samples from 20 healthy individuals, we evaluated the intra-platform reproducibility for metagenomic sequencing on the BGISEQ-500 platform in a setup comprising 8 library replicates and 8 sequencing replicates. Cross-platform consistency was evaluated by comparing 20 pairwise replicates on the BGISEQ-500 platform vs the Illumina HiSeq 2000 platform and the Illumina HiSeq 4000 platform. In addition, we compared the performance of the 2 Illumina platforms against each other. By a newly developed overall accuracy quality control method, an average of 82.45 million high-quality reads (96.06% of raw reads) per sample, with 90.56% of bases scoring Q30 and above, was obtained using the BGISEQ-500 platform. Quantitative analyses revealed extremely high reproducibility between BGISEQ-500 intra-platform replicates. Cross-platform replicates differed slightly more than intra-platform replicates, yet a high consistency was observed. Only a low percentage (2.02%-3.25%) of genes exhibited significant differences in relative abundance comparing the BGISEQ-500 and HiSeq platforms, with a bias toward genes with higher GC content being enriched on the HiSeq platforms. Our study provides the first set of performance metrics for human gut metagenomic sequencing data using BGISEQ-500. The high accuracy and technical reproducibility confirm the applicability of the new platform for metagenomic studies, though caution is still warranted when combining metagenomic data from different platforms.
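
    The "percentage of bases scoring Q30 and above" metric quoted above can be computed from FASTQ quality strings. The sketch below assumes Phred+33 encoding, which the abstract does not state, and uses a hypothetical helper name; it is not the quality control method developed in the study.

```python
def q30_fraction(quality_strings, offset=33, threshold=30):
    """Fraction of bases whose Phred score is >= threshold.
    Assumes Phred+offset ASCII encoding (offset=33 is an assumption)."""
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - offset >= threshold:
                passing += 1
    return passing / total if total else 0.0

# 'I' encodes Q40, '?' encodes Q30, '5' encodes Q20 under Phred+33.
example = ["IIII5", "??55I"]
frac = q30_fraction(example)
print(frac)  # 7 of 10 bases are at Q30 or above
```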

  20. Metagenomics of Thermophiles with a Focus on Discovery of Novel Thermozymes

    PubMed Central

    DeCastro, María-Eugenia; Rodríguez-Belmonte, Esther; González-Siso, María-Isabel

    2016-01-01

    Microbial populations living in environments with temperatures above 50°C (thermophiles) have been widely studied, increasing our knowledge of the composition and function of these ecological communities. Since these populations express a wide range of heat-resistant enzymes (thermozymes), they also represent an important source of novel biocatalysts that can potentially be used in industrial processes. The integrated study of the whole-community DNA from an environment, known as metagenomics, coupled with the development of next generation sequencing (NGS) technologies, has allowed the generation of large amounts of data from thermophiles. In this review, we summarize the main approaches commonly utilized for assessing the taxonomic and functional diversity of thermophiles through metagenomics, including several bioinformatics tools and some metagenome-derived methods to isolate their thermozymes. PMID:27729905

  1. Metagenomic analysis of carbon cycling and biogenic methane formation in terrestrial serpentinizing fluid springs

    NASA Astrophysics Data System (ADS)

    Woycheese, K. M.; Meyer-Dombard, D. R.; Cardace, D.; Arcilla, C. A.; Ono, S.

    2016-12-01

    The products of serpentinization are proposed to support a hydrogen-driven microbial biosphere in ultrabasic, highly reducing fluids. Shotgun metagenomic analysis of microbial communities collected from terrestrial serpentinizing springs in the Philippines and Turkey suggests that mutualistic relationships may help microbial communities thrive in highly oligotrophic environments. Understanding how these relationships affect production of methane in the deep subsurface is critical to applications such as carbon sequestration and natural gas production. There is conflicting evidence regarding whether methane and C2-C6 alkanes in serpentinizing ecosystems are produced abiogenically or through biotic reactions such as methanogenesis [1, 2]. While geochemical analysis of methane from serpentinizing ecosystems has previously indicated abiogenic and/or mixed formation [3, 4], methanogens have been detected in an increasing number of investigations [2]. Here, putative metabolisms were identified via assembly and annotation of metagenomic sequence data from the Philippines and Turkey. At both sites, hydrogenotrophic methanogenesis and homoacetogenesis were identified as the principal autotrophic carbon fixation pathways. Heterotrophic acetogenesis and acetoclastic methanogenesis were also detected in sequence data. Other heterotrophic metabolic pathways identified included sulfate reduction, methanotrophy, and biodegradation of aromatic carbon compounds. Many of these metabolic pathways have been shown to be favorable under conditions typical of serpentinizing habitats [5]. Metagenomic analysis strongly suggests that at least some of the methane originating from these serpentinizing ecosystems may be biologically derived. Ongoing work will further clarify the mechanisms of methane formation by examining the clumped isotopologue ratios of dissolved methane in serpentinizing fluids. [1] Wang et al. (2015). Science. 348. doi: 10.1126/science.aaa4326 [2] Kohl et al. (2016). JGR. Biogeosci

  2. ROCker: accurate detection and quantification of target genes in short-read metagenomic data sets by modeling sliding-window bitscores

    DOE PAGES

    Orellana, Luis H.; Rodriguez-R, Luis M.; Konstantinidis, Konstantinos T.

    2016-10-07

    Functional annotation of metagenomic and metatranscriptomic data sets relies on similarity searches based on e-value thresholds resulting in an unknown number of false positive and negative matches. To overcome these limitations, we introduce ROCker, aimed at identifying position-specific, most-discriminant thresholds in sliding windows along the sequence of a target protein, accounting for non-discriminative domains shared by unrelated proteins. ROCker employs the receiver operating characteristic (ROC) curve to minimize false discovery rate (FDR) and calculate the best thresholds based on how simulated shotgun metagenomic reads of known composition map onto well-curated reference protein sequences and thus, differs from HMM profiles and related methods. We showcase ROCker using ammonia monooxygenase (amoA) and nitrous oxide reductase (nosZ) genes, mediating oxidation of ammonia and the reduction of the potent greenhouse gas, N2O, to inert N2, respectively. ROCker typically showed 60-fold lower FDR when compared to the common practice of using fixed e-values. Previously uncounted ‘atypical’ nosZ genes were found to be two times more abundant, on average, than their typical counterparts in most soil metagenomes and the abundance of bacterial amoA was quantified against the highly-related particulate methane monooxygenase (pmoA). Therefore, ROCker can reliably detect and quantify target genes in short-read metagenomes.
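
    The idea of position-specific bitscore thresholds learned from simulated reads of known origin can be sketched as follows. This is a simplified stand-in, not the ROCker algorithm: it tiles the protein with non-overlapping windows rather than sliding them, and it picks the cutoff that maximizes Youden's J (true-positive rate minus false-positive rate), which is an assumed criterion. All names and the toy data are hypothetical.

```python
def best_threshold(scores, labels):
    """Return the bitscore cutoff maximizing TPR - FPR (Youden's J)."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= cut)
        fp = sum(1 for s, y in zip(scores, labels) if not y and s >= cut)
        tpr = tp / pos if pos else 0.0
        fpr = fp / neg if neg else 0.0
        if tpr - fpr > best_j:
            best_cut, best_j = cut, tpr - fpr
    return best_cut

def window_thresholds(hits, protein_length, window=20):
    """hits: (alignment position, bitscore, is_target) per simulated read.
    Returns window start -> bitscore cutoff for that window."""
    thresholds = {}
    for start in range(0, protein_length, window):
        in_win = [(s, y) for p, s, y in hits if start <= p < start + window]
        if in_win:
            scores, labels = zip(*in_win)
            thresholds[start] = best_threshold(scores, labels)
    return thresholds

# Toy data: the second window mimics a domain shared with unrelated
# proteins, so non-target reads score higher there and the cutoff rises.
hits = [
    (5, 80, True), (8, 75, True), (10, 40, False), (15, 60, False),
    (22, 120, True), (30, 110, True), (25, 100, False), (35, 90, False),
]
cutoffs = window_thresholds(hits, protein_length=40)
print(cutoffs)  # {0: 75, 20: 110}
```

    A single fixed cutoff of 75 would admit the non-target reads scoring 90 and 100 in the second window, which is the kind of false positive that a position-specific threshold is meant to avoid.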

  3. ROCker: accurate detection and quantification of target genes in short-read metagenomic data sets by modeling sliding-window bitscores

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Orellana, Luis H.; Rodriguez-R, Luis M.; Konstantinidis, Konstantinos T.

    Functional annotation of metagenomic and metatranscriptomic data sets relies on similarity searches based on e-value thresholds resulting in an unknown number of false positive and negative matches. To overcome these limitations, we introduce ROCker, aimed at identifying position-specific, most-discriminant thresholds in sliding windows along the sequence of a target protein, accounting for non-discriminative domains shared by unrelated proteins. ROCker employs the receiver operating characteristic (ROC) curve to minimize false discovery rate (FDR) and calculate the best thresholds based on how simulated shotgun metagenomic reads of known composition map onto well-curated reference protein sequences and thus, differs from HMM profiles and related methods. We showcase ROCker using ammonia monooxygenase (amoA) and nitrous oxide reductase (nosZ) genes, mediating oxidation of ammonia and the reduction of the potent greenhouse gas, N2O, to inert N2, respectively. ROCker typically showed 60-fold lower FDR when compared to the common practice of using fixed e-values. Previously uncounted ‘atypical’ nosZ genes were found to be two times more abundant, on average, than their typical counterparts in most soil metagenomes and the abundance of bacterial amoA was quantified against the highly-related particulate methane monooxygenase (pmoA). Therefore, ROCker can reliably detect and quantify target genes in short-read metagenomes.

  4. ROCker: accurate detection and quantification of target genes in short-read metagenomic data sets by modeling sliding-window bitscores

    PubMed Central

    2017-01-01

    Functional annotation of metagenomic and metatranscriptomic data sets relies on similarity searches based on e-value thresholds resulting in an unknown number of false positive and negative matches. To overcome these limitations, we introduce ROCker, aimed at identifying position-specific, most-discriminant thresholds in sliding windows along the sequence of a target protein, accounting for non-discriminative domains shared by unrelated proteins. ROCker employs the receiver operating characteristic (ROC) curve to minimize false discovery rate (FDR) and calculate the best thresholds based on how simulated shotgun metagenomic reads of known composition map onto well-curated reference protein sequences and thus, differs from HMM profiles and related methods. We showcase ROCker using ammonia monooxygenase (amoA) and nitrous oxide reductase (nosZ) genes, mediating oxidation of ammonia and the reduction of the potent greenhouse gas, N2O, to inert N2, respectively. ROCker typically showed 60-fold lower FDR when compared to the common practice of using fixed e-values. Previously uncounted ‘atypical’ nosZ genes were found to be two times more abundant, on average, than their typical counterparts in most soil metagenomes and the abundance of bacterial amoA was quantified against the highly-related particulate methane monooxygenase (pmoA). Therefore, ROCker can reliably detect and quantify target genes in short-read metagenomes. PMID:28180325

  5. A better sequence-read simulator program for metagenomics.

    PubMed

    Johnson, Stephen; Trost, Brett; Long, Jeffrey R; Pittet, Vanessa; Kusalik, Anthony

    2014-01-01

    There are many programs available for generating simulated whole-genome shotgun sequence reads. The data generated by many of these programs follow predefined models, which limits their use to the authors' original intentions. For example, many models assume that read lengths follow a uniform or normal distribution. Other programs generate models from actual sequencing data, but are limited to reads from single-genome studies. To our knowledge, there are no programs that allow a user to generate simulated data following non-parametric read-length distributions and quality profiles based on empirically-derived information from metagenomics sequencing data. We present BEAR (Better Emulation for Artificial Reads), a program that uses a machine-learning approach to generate reads with lengths and quality values that closely match empirically-derived distributions. BEAR can emulate reads from various sequencing platforms, including Illumina, 454, and Ion Torrent. BEAR requires minimal user input, as it automatically determines appropriate parameter settings from user-supplied data. BEAR also uses a unique method for deriving run-specific error rates, and extracts useful statistics from the metagenomic data itself, such as quality-error models. Many existing simulators are specific to a particular sequencing technology; however, BEAR is not restricted in this way. Because of its flexibility, BEAR is particularly useful for emulating the behaviour of technologies like Ion Torrent, for which no dedicated sequencing simulators are currently available. BEAR is also the first metagenomic sequencing simulator program that automates the process of generating abundances, which can be an arduous task. BEAR is useful for evaluating data processing tools in genomics. It has many advantages over existing comparable software, such as generating more realistic reads and being independent of sequencing technology, and has features particularly useful for metagenomics work.
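
    Sampling read lengths from an empirical, non-parametric distribution (rather than assuming a uniform or normal one) can be illustrated with a short sketch. This is not BEAR, which the abstract describes as using a machine-learning approach and quality-error models; the function below simply resamples observed lengths and cuts error-free reads from a toy sequence. All names and inputs are hypothetical.

```python
import random

def simulate_reads(genome, observed_lengths, n_reads, seed=0):
    """Cut n_reads substrings from genome, with lengths resampled from
    observed_lengths (an empirical distribution, no parametric model)."""
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        # Draw a length exactly as often as it appears in the real data.
        length = min(rng.choice(observed_lengths), len(genome))
        start = rng.randrange(len(genome) - length + 1)
        reads.append(genome[start:start + length])
    return reads

genome = "ACGT" * 50                       # toy 200 bp sequence
observed_lengths = [35, 50, 50, 72, 100]   # lengths seen in a real run
reads = simulate_reads(genome, observed_lengths, n_reads=10)
print([len(r) for r in reads])
```

    Because lengths are drawn from the observed values themselves, a skewed or multimodal length profile is reproduced without fitting any distribution to it.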

  6. Applications of metagenomics for industrial bioproducts

    USDA-ARS's Scientific Manuscript database

    Recent progress in mining the rich genetic resource of non-culturable microbes has led to the discovery of new genes, enzymes, and natural products. The impact of metagenomics is witnessed in the development of commodity and fine chemicals, agrochemicals and pharmaceuticals where the benefit of enz...

  7. Quantitative metagenomics reveals unique gut microbiome biomarkers in ankylosing spondylitis.

    PubMed

    Wen, Chengping; Zheng, Zhijun; Shao, Tiejuan; Liu, Lin; Xie, Zhijun; Le Chatelier, Emmanuelle; He, Zhixing; Zhong, Wendi; Fan, Yongsheng; Zhang, Linshuang; Li, Haichang; Wu, Chunyan; Hu, Changfeng; Xu, Qian; Zhou, Jia; Cai, Shunfeng; Wang, Dawei; Huang, Yun; Breban, Maxime; Qin, Nan; Ehrlich, Stanislav Dusko

    2017-07-27

    The assessment and characterization of the gut microbiome has become a focus of research in the area of human autoimmune diseases. Ankylosing spondylitis is an inflammatory autoimmune disease and evidence showed that ankylosing spondylitis may be a microbiome-driven disease. To investigate the relationship between the gut microbiome and ankylosing spondylitis, a quantitative metagenomics study based on deep shotgun sequencing was performed, using gut microbial DNA from 211 Chinese individuals. A total of 23,709 genes and 12 metagenomic species were shown to be differentially abundant between ankylosing spondylitis patients and healthy controls. Patients were characterized by a form of gut microbial dysbiosis that is more prominent than previously reported cases with inflammatory bowel disease. Specifically, the ankylosing spondylitis patients demonstrated increases in the abundance of Prevotella melaninogenica, Prevotella copri, and Prevotella sp. C561 and decreases in Bacteroides spp. It is noteworthy that the Bifidobacterium genus, which is commonly used in probiotics, accumulated in the ankylosing spondylitis patients. Diagnostic algorithms were established using a subset of these gut microbial biomarkers. Alterations of the gut microbiome are associated with development of ankylosing spondylitis. Our data suggest biomarkers identified in this study might participate in the pathogenesis or development process of ankylosing spondylitis, providing new leads for the development of new diagnostic tools and potential treatments.

  8. Modulations of the Chicken Cecal Microbiome and Metagenome in Response to Anticoccidial and Growth Promoter Treatment

    PubMed Central

    Danzeisen, Jessica L.; Kim, Hyeun Bum; Isaacson, Richard E.; Tu, Zheng Jin; Johnson, Timothy J.

    2011-01-01

    With increasing pressures to reduce or eliminate the use of antimicrobials for growth promotion purposes in production animals, there is a growing need to better understand the effects elicited by these agents in order to identify alternative approaches that might be used to maintain animal health. Antibiotic usage at subtherapeutic levels is postulated to confer a number of modulations in the microbes within the gut that ultimately result in growth promotion and reduced occurrence of disease. This study examined the effects of the coccidiostat monensin and the growth promoters virginiamycin and tylosin on the broiler chicken cecal microbiome and metagenome. Using a longitudinal design, cecal contents of commercial chickens were extracted and examined using 16S rRNA and total DNA shotgun metagenomic pyrosequencing. A number of genus-level enrichments and depletions were observed in response to monensin alone, or monensin in combination with virginiamycin or tylosin. Of note, monensin effects included depletions of Roseburia, Lactobacillus and Enterococcus, and enrichments in Coprococcus and Anaerofilum. The most notable effect observed in the monensin/virginiamycin and monensin/tylosin treatments, but not in the monensin-alone treatments, was enrichments in Escherichia coli. Analysis of the metagenomic dataset identified enrichments in transport system genes, type I fimbrial genes, and type IV conjugative secretion system genes. No significant differences were observed with regard to antimicrobial resistance gene counts. Overall, this study provides a more comprehensive glimpse of the chicken cecum microbial community, the modulations of this community in response to growth promoters, and targets for future efforts to mimic these effects using alternative approaches. PMID:22114729

  9. Metagenomic ventures into outer sequence space.

    PubMed

    Dutilh, Bas E

    Sequencing DNA or RNA directly from the environment often results in many sequencing reads that have no homologs in the database. These are referred to as "unknowns," and reflect the vast unexplored microbial sequence space of our biosphere, also known as "biological dark matter." However, unknowns also exist because metagenomic datasets are not optimally mined. There is a pressure on researchers to publish and move on, and the unknown sequences are often left for what they are, and conclusions drawn based on reads with annotated homologs. This can cause abundant and widespread genomes to be overlooked, such as the recently discovered human gut bacteriophage crAssphage. The unknowns may be enriched for bacteriophage sequences, the most abundant and genetically diverse component of the biosphere and of sequence space. However, it remains an open question, what is the actual size of biological sequence space? The de novo assembly of shotgun metagenomes is the most powerful tool to address this question.

  10. Genovo: De Novo Assembly for Metagenomes

    NASA Astrophysics Data System (ADS)

    Laserson, Jonathan; Jojic, Vladimir; Koller, Daphne

    Next-generation sequencing technologies produce a large number of noisy reads from the DNA in a sample. Metagenomics and population sequencing aim to recover the genomic sequences of the species in the sample, which could be of high diversity. Methods geared towards single sequence reconstruction are not sensitive enough when applied in this setting. We introduce a generative probabilistic model of read generation from environmental samples and present Genovo, a novel de novo sequence assembler that discovers likely sequence reconstructions under the model. A Chinese restaurant process prior accounts for the unknown number of genomes in the sample. Inference is made by applying a series of hill-climbing steps iteratively until convergence. We compare the performance of Genovo to three other short read assembly programs across one synthetic dataset and eight metagenomic datasets created using the 454 platform, the largest of which has 311k reads. Genovo's reconstructions cover more bases and recover more genes than the other methods, and yield a higher assembly score.
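
    The Chinese restaurant process prior mentioned above lets the number of genomes grow with the data instead of being fixed in advance. The sketch below samples read-to-genome assignments from such a prior only; it is not Genovo's model or inference procedure, and the function name and concentration parameter `alpha` are illustrative assumptions.

```python
import random

def crp_assignments(n_items, alpha, seed=0):
    """Sample cluster labels from a Chinese restaurant process.
    Item i joins existing cluster k with probability size_k / (i + alpha)
    and opens a new cluster with probability alpha / (i + alpha)."""
    rng = random.Random(seed)
    cluster_sizes = []
    assignments = []
    for _ in range(n_items):
        # Existing clusters are weighted by size; the last slot is "new".
        weights = cluster_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_sizes):
            cluster_sizes.append(1)
        else:
            cluster_sizes[k] += 1
        assignments.append(k)
    return assignments, cluster_sizes

labels, sizes = crp_assignments(n_items=100, alpha=1.0)
print(len(sizes), sizes)
```

    Larger values of `alpha` favor opening new clusters, which in this setting would correspond to expecting more distinct genomes in the sample.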

  11. Metagenomic Assembly Reveals Hosts of Antibiotic Resistance Genes and the Shared Resistome in Pig, Chicken, and Human Feces.

    PubMed

    Ma, Liping; Xia, Yu; Li, Bing; Yang, Ying; Li, Li-Guan; Tiedje, James M; Zhang, Tong

    2016-01-05

    The risk associated with antibiotic resistance disseminating from animal and human feces is an urgent public issue. In the present study, we sought to establish a pipeline for annotating antibiotic resistance genes (ARGs) based on metagenomic assembly to investigate ARGs and their co-occurrence with associated genetic elements. Genetic elements found on the assembled genomic fragments include mobile genetic elements (MGEs) and metal resistance genes (MRGs). We then explored the hosts of these resistance genes and the shared resistome of pig, chicken and human fecal samples. High levels of tetracycline, multidrug, erythromycin, and aminoglycoside resistance genes were discovered in these fecal samples. In particular, a significantly high level of ARGs (7762 ×/Gb) was detected in adult chicken feces, indicating a higher ARG contamination level than in other fecal samples. Many ARG arrangements (e.g., macA-macB and tetA-tetR) were found to be shared by chicken, pig and human feces. In addition, MGEs such as the aadA5-dfrA17-carrying class 1 integron were identified on an assembled scaffold of chicken feces, and are carried by human pathogens. Differential coverage binning analysis revealed significant ARG enrichment in adult chicken feces. A draft genome, annotated as multidrug resistant Escherichia coli, was retrieved from chicken feces metagenomes and was determined to carry diverse ARGs (multidrug, acriflavine, and macrolide). The present study demonstrates the determination of ARG hosts and the shared resistome from metagenomic data sets and successfully establishes the relationship between ARGs, hosts, and environments. This ARG annotation pipeline based on metagenomic assembly will help to bridge the knowledge gaps regarding ARG-associated genes and ARG hosts with metagenomic data sets. Moreover, this pipeline will facilitate the evaluation of environmental risks in the genetic context of ARGs.
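
    The abstract reports ARG levels in units of ×/Gb without defining them. One plausible reading, assumed here, is the summed read coverage of ARG-like open reading frames divided by the size of the data set in gigabases. The sketch below implements that assumed definition with hypothetical names and toy numbers; it is not taken from the study's pipeline.

```python
def arg_coverage_per_gb(orfs, read_length_bp, dataset_size_gb):
    """orfs: (mapped read count, ORF length in bp) per ARG-like ORF.
    Coverage of one ORF is mapped bases divided by ORF length; the sum
    is then normalized by data set size (assumed definition of x/Gb)."""
    coverage = sum(n_reads * read_length_bp / orf_len
                   for n_reads, orf_len in orfs)
    return coverage / dataset_size_gb

# Toy input: two ARG-like ORFs in a 5 Gb data set of 150 bp reads.
orfs = [(200, 1200), (50, 900)]
abundance = arg_coverage_per_gb(orfs, read_length_bp=150, dataset_size_gb=5.0)
print(round(abundance, 3))
```

    Normalizing by data set size is what makes values comparable across samples sequenced to different depths.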

  12. Extremozymes from metagenome: Potential applications in food processing.

    PubMed

    Khan, Mahejibin; Sathya, T A

    2017-06-12

    The long-established use of enzymes for food processing and product formulation has resulted in an enzyme market growing at a compound annual rate of 7.0%. Advancements in molecular biology, and the recognition that enzymes with specific properties have applications in the industrial production of infant, baby and functional foods, have boosted research toward sourcing the genes of microorganisms for enzymes with distinctive properties. In this regard, functional metagenomics for extremozymes has gained attention on the premise that such enzymes can catalyze specific reactions. Hence, metagenomics, which can isolate functional genes of unculturable extremophilic microorganisms, has attracted growing attention as a promising tool. Developments in this field of research in relation to the food sector are reviewed.

  13. Taxonomic and functional metagenomic profiling of gastrointestinal tract microbiome of the farmed adult turbot (Scophthalmus maximus).

    PubMed

    Xing, Mengxin; Hou, Zhanhui; Yuan, Jianbo; Liu, Yuan; Qu, Yanmei; Liu, Bin

    2013-12-01

    Metagenomics combined with 16S rRNA gene sequence analyses was applied to unveil the taxonomic composition and functional diversity of the farmed adult turbot gastrointestinal (GI) microbiome. Proteobacteria and Firmicutes, which were present in both GI content and mucus, dominated the turbot GI microbiome. 16S rRNA gene sequence analyses also indicated that the turbot GI tract may harbor some bacteria which originated from associated seawater. Functional analyses indicated that the clustering-based subsystem and many metabolic subsystems were dominant in the turbot GI metagenome. Compared with other gut metagenomes, quorum sensing and biofilm formation were overabundant in the turbot GI metagenome. Genes associated with quorum sensing and biofilm formation were found in species within Vibrio, including Vibrio vulnificus, Vibrio cholerae and Vibrio parahaemolyticus. In farmed fish gut metagenomes, the stress response and protein folding subsystems were over-represented and several genes concerning antibiotic and heavy metal resistance were also detected. These data suggested that the turbot GI microbiome may be affected by human factors in aquaculture. Additionally, iron acquisition and the metabolism subsystem were more abundant in the turbot GI metagenome when compared with freshwater fish gut metagenome, suggesting that unique metabolic potential may be observed in marine animal GI microbiomes. © 2013 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.

  14. Use of simulated data sets to evaluate the fidelity of metagenomic processing methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mavromatis, K; Ivanova, N; Barry, Kerrie

    2007-01-01

    Metagenomics is a rapidly emerging field of research for studying microbial communities. To evaluate methods presently used to process metagenomic sequences, we constructed three simulated data sets of varying complexity by combining sequencing reads randomly selected from 113 isolate genomes. These data sets were designed to model real metagenomes in terms of complexity and phylogenetic composition. We assembled sampled reads using three commonly used genome assemblers (Phrap, Arachne and JAZZ), and predicted genes using two popular gene-finding pipelines (fgenesb and CRITICA/GLIMMER). The phylogenetic origins of the assembled contigs were predicted using one sequence similarity-based (blast hit distribution) and two sequence composition-based (PhyloPythia, oligonucleotide frequencies) binning methods. We explored the effects of the simulated community structure and method combinations on the fidelity of each processing step by comparison to the corresponding isolate genomes. The simulated data sets are available online to facilitate standardized benchmarking of tools for metagenomic analysis.

  15. BeerDeCoded: the open beer metagenome project.

    PubMed

    Sobel, Jonathan; Henry, Luc; Rotman, Nicolas; Rando, Gianpaolo

    2017-01-01

    Next generation sequencing has radically changed research in the life sciences, in both academic and corporate laboratories. The potential impact is tremendous, yet a majority of citizens have little or no understanding of the technological and ethical aspects of this widespread adoption. We designed BeerDeCoded as a pretext to discuss the societal issues related to genomic and metagenomic data with fellow citizens, while advancing scientific knowledge of the most popular beverage of all. In the spirit of citizen science, sample collection and DNA extraction were carried out with the participation of non-scientists in the community laboratory of Hackuarium, a not-for-profit organisation that supports unconventional research and promotes the public understanding of science. The dataset presented herein contains the targeted metagenomic profile of 39 bottled beers from 5 countries, based on internal transcribed spacer (ITS) sequencing of fungal species. A preliminary analysis reveals the presence of a large diversity of wild yeast species in commercial brews. With this project, we demonstrate that coupling simple laboratory procedures that can be carried out in a non-professional environment with state-of-the-art sequencing technologies and targeted metagenomic analyses, can lead to the detection and identification of the microbial content in bottled beer.

  16. BeerDeCoded: the open beer metagenome project

    PubMed Central

    Sobel, Jonathan; Henry, Luc; Rotman, Nicolas; Rando, Gianpaolo

    2017-01-01

    Next generation sequencing has radically changed research in the life sciences, in both academic and corporate laboratories. The potential impact is tremendous, yet a majority of citizens have little or no understanding of the technological and ethical aspects of this widespread adoption. We designed BeerDeCoded as a pretext to discuss the societal issues related to genomic and metagenomic data with fellow citizens, while advancing scientific knowledge of the most popular beverage of all. In the spirit of citizen science, sample collection and DNA extraction were carried out with the participation of non-scientists in the community laboratory of Hackuarium, a not-for-profit organisation that supports unconventional research and promotes the public understanding of science. The dataset presented herein contains the targeted metagenomic profile of 39 bottled beers from 5 countries, based on internal transcribed spacer (ITS) sequencing of fungal species. A preliminary analysis reveals the presence of a large diversity of wild yeast species in commercial brews. With this project, we demonstrate that coupling simple laboratory procedures that can be carried out in a non-professional environment with state-of-the-art sequencing technologies and targeted metagenomic analyses, can lead to the detection and identification of the microbial content in bottled beer. PMID:29123645

  17. Metagenomic insight of nitrogen metabolism in a tannery wastewater treatment plant bioaugmented with the microbial consortium BM-S-1.

    PubMed

    Sul, Woo-Jun; Kim, In-Soo; Ekpeghere, Kalu I; Song, Bongkeun; Kim, Bong-Soo; Kim, Hong-Gi; Kim, Jong-Tae; Koh, Sung-Cheol

    2016-11-09

    Nitrogen (N) removal in a tannery wastewater treatment plant was significantly enhanced by the bioaugmentation of the novel consortium BM-S-1. In order to identify dominant taxa responsible for N metabolisms in the different stages of the treatment process, an Illumina MiSeq sequencer was used to conduct metagenome sequencing of the microbial communities in the different stages of the treatment system, including influent (I), buffering (B), primary aeration (PA), secondary aeration (SA) and sludge digestion (SD). Based on MG-RAST analysis, the dominant phyla were Proteobacteria, Bacteroidetes and Firmicutes in B, PA, SA and SD, whereas Firmicutes was the most dominant in I before augmentation. The augmentation increased the abundance of the denitrification genes found in genera such as Ralstonia (nirS, norB and nosZ), Pseudomonas (narG, nirS and norB) and Escherichia (narG) in B and PA. In addition, Bacteroides, Geobacter, Porphyromonas and Wolinella carrying the nrfA gene encoding dissimilatory nitrate reduction to ammonium were abundantly present in B and PA. This was corroborated by the higher total N removal in these two stages. Thus, metagenomic analysis was able to identify the dominant taxa responsible for dissimilatory N metabolisms in the tannery wastewater treatment system undergoing bioaugmentation. This metagenomic insight into the nitrogen metabolism will contribute to a successful monitoring and operation of the eco-friendly tannery wastewater treatment system.

  18. Whole genome re-sequencing identifies a mutation in an ABC transporter (mdr2) in a Plasmodium chabaudi clone with altered susceptibility to antifolate drugs

    PubMed Central

    Martinelli, Axel; Henriques, Gisela; Cravo, Pedro; Hunt, Paul

    2011-01-01

    In malaria parasites, mutations in two genes of folate biosynthesis encoding dihydrofolate reductase (dhfr) and dihydropteroate synthase (dhps) modify responses to antifolate therapies which target these enzymes. However, the involvement of other genes which modify the availability of exogenous folate, for example, has been proposed. Here, we used short-read whole-genome re-sequencing to determine the mutations in a clone of the rodent malaria parasite, Plasmodium chabaudi, which has altered susceptibility to both sulphadoxine and pyrimethamine. This clone bears a previously identified S106N mutation in dhfr and no mutation in dhps. Instead, three additional point mutations in genes on chromosomes 2, 13 and 14 were identified. The mutated gene on chromosome 13 (mdr2 K392Q) encodes an ABC transporter. Because Quantitative Trait Locus analysis previously indicated an association of genetic markers on chromosome 13 with responses to individual and combined antifolates, MDR2 is proposed to modulate antifolate responses, possibly mediated by the transport of folate intermediates. PMID:20858498

  19. [Media, cloning, and bioethics].

    PubMed

    Costa, S I; Diniz, D

    2000-01-01

    This article was based on an analysis of three hundred articles from mainstream Brazilian periodicals over a period of eighteen months, beginning with the announcement of the Dolly case in February 1997. There were two main objectives: to outline the moral constants in the press associated with the possibility of cloning human beings and to identify some of the moral assumptions concerning scientific research with non-human animals that were published carelessly by the media. The authors conclude that there was a haphazard spread of fear concerning the cloning of human beings rather than an ethical debate on the issue, and that there is a serious gap between bioethical reflections and the Brazilian media.

  20. Targeted Metagenomic Survey of the Fe-Cycling Microbial Community at Chocolate Pots Hot Springs, Yellowstone National Park

    NASA Astrophysics Data System (ADS)

    Fortney, N. W.; He, S.; Kulkarni, A.; Friedrich, M. W.; Boyd, E. S.; Roden, E. E.

    2016-12-01

    Chocolate Pots hot springs (CP) is a circumneutral pH, Fe-rich geothermal feature located in Yellowstone National Park. Fe-based metabolic processes are deeply rooted in the tree of life, and studying environments like CP is important for gaining insight into ancient Earth ecosystems. Recently identified features on Mars are indicative of near-surface hydrothermal environments and studies of modern Earth systems like CP allow us a glimpse into how life may have potentially arisen on other rocky worlds. Previous enrichment culture studies of the microbial community present at CP identified close relatives of dissimilatory Fe-reducing bacteria (DIRB), including Geobacter metallireducens and Melioribacter roseus. However, the question still remains as to the composition and activity of the microbial community in situ. Here we used 13C stable isotope probing to gain an understanding of the Fe cycling microbial community at CP. Fe-Si oxide sediments collected from near the hot spring vent were incubated under in situ conditions and amended with 13C-acetate or -bicarbonate to target DIRB and Fe-oxidizing bacteria, respectively. 16S rRNA gene amplicon libraries along with shotgun metagenomic libraries were obtained from both sets of incubations. Differential read coverage mapping of metagenomic reads identified a set of taxonomic bins that showed a response to the incubation treatments. We searched the Fe-reducing incubation bins for homologues of genes involved in known extracellular electron transfer (EET) systems such as Pcc and MtrAB, as well as putative porins proximal to multiheme cytochrome c genes. We also searched bins from the Fe-oxidizing incubations for these EET systems in addition to homologues of the outer membrane cytochrome c Cyc2. The Fe-oxidizing bins were also examined for genes encoding RuBisCo to identify potential chemolithoautotrophs. Our targeted metagenomic analysis will identify which organisms are likely to be part of an active Fe

  1. Metagenomics, metaMicrobesOnline and Kbase Data Integration (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Dehal, Paramvir

    2018-02-06

    Berkeley Lab's Paramvir Dehal on "Managing and Storing large Datasets in MicrobesOnline, metaMicrobesOnline and the DOE Knowledgebase" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  2. Comparison of methods for library construction and short read annotation of shellfish viral metagenomes.

    PubMed

    Wei, Hong-Ying; Huang, Sheng; Wang, Jiang-Yong; Gao, Fang; Jiang, Jing-Zhe

    2018-03-01

    The emergence and widespread use of high-throughput sequencing technologies have promoted metagenomic studies on environmental or animal samples. Library construction for metagenome sequencing and annotation of the produced sequence reads are important steps in such studies and influence the quality of metagenomic data. In this study, we collected some marine mollusk samples, such as Crassostrea hongkongensis, Chlamys farreri, and Ruditapes philippinarum, from coastal areas in South China. These samples were divided into two batches to compare two library construction methods for shellfish viral metagenomes. Our analysis showed that reverse-transcribing RNA into cDNA and then amplifying it simultaneously with DNA by whole genome amplification (WGA) yielded a larger amount of DNA compared to using only WGA or WTA (whole transcriptome amplification). Moreover, higher quality libraries were obtained by agarose gel extraction rather than with AMPure bead size selection. However, the latter can also provide good results if combined with the adjustment of the filter parameters. This, together with its simplicity, makes it a viable alternative. Finally, we compared three annotation tools (BLAST, DIAMOND, and Taxonomer) and two reference databases (NCBI's NR and UniProt's UniRef). Considering the limitations of computing resources and data transfer speed, we propose the use of DIAMOND with UniRef for annotating metagenomic short reads as its running speed can guarantee a good annotation rate. This study may serve as a useful reference for selecting methods for shellfish viral metagenome library construction and read annotation.

  3. DNA-SIP based genome-centric metagenomics identifies key long-chain fatty acid-degrading populations in anaerobic digesters with different feeding frequencies

    PubMed Central

    Ziels, Ryan M; Sousa, Diana Z; Stensel, H David; Beck, David A C

    2018-01-01

    Fats, oils and greases (FOG) are energy-dense wastes that can be added to anaerobic digesters to substantially increase biomethane recovery via their conversion through long-chain fatty acids (LCFAs). However, a better understanding of the ecophysiology of syntrophic LCFA-degrading microbial communities in anaerobic digesters is needed to develop operating strategies that mitigate inhibitory LCFA accumulation from FOG. In this research, DNA stable isotope probing (SIP) was coupled with metagenomic sequencing for a genome-centric comparison of oleate (C18:1)-degrading populations in two anaerobic codigesters operated with either a pulse feeding or continuous-feeding strategy. The pulse-fed codigester microcosms converted oleate into methane at over 20% higher rates than the continuous-fed codigester microcosms. Differential coverage binning was demonstrated for the first time to recover population genome bins (GBs) from DNA-SIP metagenomes. About 70% of the 13C-enriched GBs were taxonomically assigned to the Syntrophomonas genus, thus substantiating the importance of Syntrophomonas species to LCFA degradation in anaerobic digesters. Phylogenetic comparisons of 13C-enriched GBs showed that phylogenetically distinct Syntrophomonas GBs were unique to each codigester. Overall, these results suggest that syntrophic populations in anaerobic digesters can have different adaptive capacities, and that selection for divergent populations may be achieved by adjusting reactor operating conditions to maximize biomethane recovery. PMID:28895946
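
    The idea of flagging 13C-enriched genome bins by comparing coverage between labeled and control metagenomes can be sketched as below. This is a minimal illustration only, not the authors' pipeline: the function name, the bin identifiers, the coverage values and the two-fold threshold are all hypothetical.

```python
def enriched_bins(cov_labeled, cov_control, min_fold=2.0):
    """Return bin IDs whose coverage in the labeled sample is at least
    min_fold times their coverage in the control sample.

    cov_labeled, cov_control: dicts mapping bin ID -> mean read coverage.
    The threshold and all inputs are hypothetical illustration values.
    """
    enriched = []
    for bin_id, labeled in cov_labeled.items():
        control = cov_control.get(bin_id, 0.0)
        if control <= 0.0:
            continue  # no control coverage: a fold change cannot be computed
        if labeled / control >= min_fold:
            enriched.append(bin_id)
    return sorted(enriched)


# Hypothetical mean coverages for three genome bins.
labeled = {"GB1": 40.0, "GB2": 10.0, "GB3": 5.0}
control = {"GB1": 8.0, "GB2": 9.0}
result = enriched_bins(labeled, control)
```

    Bins without control coverage are skipped here rather than treated as infinitely enriched; a real analysis would need an explicit rule for that case.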

  4. Whither or wither geomicrobiology in the era of 'community metagenomics'

    USGS Publications Warehouse

    Oremland, R.S.; Capone, D.G.; Stolz, J.F.; Fuhrman, J.

    2005-01-01

    Molecular techniques are valuable tools that can improve our understanding of the structure of microbial communities. They provide the ability to probe for life in all niches of the biosphere, perhaps even supplanting the need to cultivate microorganisms or to conduct ecophysiological investigations. However, an overemphasis and strict dependence on such large information-driven endeavours as environmental metagenomics could overwhelm the field, to the detriment of microbial ecology. We now call for more balanced, hypothesis-driven research efforts that couple metagenomics with classic approaches.

  5. Metagenomic analysis reveals significant changes of microbial compositions and protective functions during drinking water treatment.

    PubMed

    Chao, Yuanqing; Ma, Liping; Yang, Ying; Ju, Feng; Zhang, Xu-Xiang; Wu, Wei-Min; Zhang, Tong

    2013-12-19

    The metagenomic approach was applied to characterize variations of microbial structure and functions in raw (RW) and treated water (TW) in a drinking water treatment plant (DWTP) at Pearl River Delta, China. Microbial structure was significantly influenced by the treatment processes, shifting from Gammaproteobacteria and Betaproteobacteria in RW to Alphaproteobacteria in TW. Further functional analysis indicated the basic metabolic functions of microorganisms in TW did not vary considerably. However, protective functions, i.e. glutathione synthesis genes in 'oxidative stress' and 'detoxification' subsystems, significantly increased, revealing the surviving bacteria may have higher chlorine resistance. Similar results were also found in glutathione metabolism pathway, which identified the major reaction for glutathione synthesis and supported more genes for glutathione metabolism existed in TW. This metagenomic study largely enhanced our knowledge about the influences of treatment processes, especially chlorination, on bacterial community structure and protective functions (e.g. glutathione metabolism) in ecosystems of DWTPs.

  6. The YNP Metagenome Project: Environmental Parameters Responsible for Microbial Distribution in the Yellowstone Geothermal Ecosystem

    PubMed Central

    Inskeep, William P.; Jay, Zackary J.; Tringe, Susannah G.; Herrgård, Markus J.; Rusch, Douglas B.

    2013-01-01

    The Yellowstone geothermal complex contains over 10,000 diverse geothermal features that host numerous phylogenetically deeply rooted and poorly understood archaea, bacteria, and viruses. Microbial communities in high-temperature environments are generally less diverse than soil, marine, sediment, or lake habitats and therefore offer a tremendous opportunity for studying the structure and function of different model microbial communities using environmental metagenomics. One of the broader goals of this study was to establish linkages among microbial distribution, metabolic potential, and environmental variables. Twenty geochemically distinct geothermal ecosystems representing a broad spectrum of Yellowstone hot-spring environments were used for metagenomic and geochemical analysis and included approximately equal numbers of: (1) phototrophic mats, (2) “filamentous streamer” communities, and (3) archaeal-dominated sediments. The metagenomes were analyzed using a suite of complementary and integrative bioinformatic tools, including phylogenetic and functional analysis of both individual sequence reads and assemblies of predominant phylotypes. This volume identifies major environmental determinants of a large number of thermophilic microbial lineages, many of which have not been fully described in the literature nor previously cultivated to enable functional and genomic analyses. Moreover, protein family abundance comparisons and in-depth analyses of specific genes and metabolic pathways relevant to these hot-spring environments reveal hallmark signatures of metabolic capabilities that parallel the distribution of phylotypes across specific types of geochemical environments. PMID:23653623

  7. Identifying the public's knowledge and intention to use human cloning in Greek urban areas.

    PubMed

    Tzamalouka, Georgia; Soultatou, Pelagia; Papadakaki, Maria; Chatzifotiou, Sevasti; Tarlatzis, Basil; El Chliaoutakis, Joannes

    2005-02-01

    The understanding of the public's knowledge on human cloning (HC) and its acceptability are considered important for the development of evidence-based policy making. The aim of this research study was to investigate the demographic and socioeconomic variables that affect the public's knowledge and intention to use HC in urban areas of Greece. Additionally, the possible association of religiousness with the knowledge and the intention to use HC were also investigated. Individual interviews were conducted with 1020 men and women of urban areas in Greece. Stratified random sampling was performed to select the respondents. Several scientists, experts in HC, evaluated the content of the instrument initially developed. The final questionnaire was consequently the result of a pilot study. Almost half of the respondents (51.5%) believed that "HC is a sort of in vitro fertilization" and 42.9% that "it has already been applied to human being." They were not aware that "the cloned fetus grows in the woman's uterus" (41.5%) and that "HC could regenerate human organs" (41.7%). The acceptability of human cloning for the cure of terminal diseases and transplantation need is very high (70.7% and 58.6%, respectively). The public's intention to have recourse to cloning on the grounds of "bringing" back to life a loved person or because of reproductive disorders was reported as desire by 35% and 32.5%, respectively. The occupational category (scientists, self-employed, and artists), the Intention to use HC, and the number of children are highly significant predictors of valid knowledge about HC. Low rates of church attendance appeared to relate with high reported Intention to use HC, and increasing scores of valid knowledge about HC increased the public's Intention to use HC. A number of specific demographic and socioeconomic characteristics and high scores of knowledge provide a persuasive justification in demonstrating intention toward HC. The current study suggests that these

  8. Local circulating clones of Staphylococcus aureus in Ecuador.

    PubMed

    Zurita, Jeannete; Barba, Pedro; Ortega-Paredes, David; Mora, Marcelo; Rivadeneira, Sebastián

    The spread of pandemic Staphylococcus aureus clones, mainly methicillin-resistant S. aureus (MRSA), must be kept under surveillance to assemble an accurate, local epidemiological analysis. In Ecuador, the prevalence of the USA300 Latin American variant clone (USA300-LV) is well known; however, there is little information about other circulating clones. The aim of this work was to identify the sequence types (ST) using a Multiple-Locus Variable number tandem repeat Analysis 14-locus genotyping approach. We analyzed 132 S. aureus strains that were recovered from 2005 to 2013 and isolated in several clinical settings in Quito, Ecuador. MRSA isolates composed 46.97% (62/132) of the study population. Within MRSA, 37 isolates were related to the USA300-LV clone (ST8-MRSA-IV, Panton-Valentine Leukocidin [PVL] +) and 10 were related to the Brazilian clone (ST239-MRSA-III, PVL-). Additionally, two isolates (ST5-MRSA-II, PVL-) were related to the New York/Japan clone. One isolate was related to the Pediatric clone (ST5-MRSA-IV, PVL-), one isolate (ST45-MRSA-II, PVL-) was related to the USA600 clone, and one (ST22-MRSA-IV, PVL-) was related to the epidemic UK-EMRSA-15 clone. Moreover, the most prevalent MSSA sequence types were ST8 (11 isolates), ST45 (8 isolates), ST30 (8 isolates), ST5 (7 isolates) and ST22 (6 isolates). Additionally, we found one isolate that was related to the livestock associated S. aureus clone ST398. We conclude that in addition to the high prevalence of clone LV-ST8-MRSA-IV, other epidemic clones are circulating in Quito, such as the Brazilian, Pediatric and New York/Japan clones. The USA600 and UK-EMRSA-15 clones, which were not previously described in Ecuador, were also found. Moreover, we found evidence of the presence of the livestock associated clone ST398 in a hospital environment. Copyright © 2016 Sociedade Brasileira de Infectologia. Published by Elsevier Editora Ltda. All rights reserved.
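
    The reported MRSA proportion can be checked from the counts given in the abstract. The short sketch below only restates that arithmetic; the variable names are ours.

```python
# Counts taken from the abstract above.
total_isolates = 132
mrsa_isolates = 62

# MRSA isolates attributed to a named clone in the abstract.
clone_counts = {
    "USA300-LV": 37,
    "Brazilian": 10,
    "New York/Japan": 2,
    "Pediatric": 1,
    "USA600": 1,
    "UK-EMRSA-15": 1,
}

mrsa_percent = round(100 * mrsa_isolates / total_isolates, 2)
assigned = sum(clone_counts.values())
unassigned = mrsa_isolates - assigned
```

    The percentage matches the 46.97% stated above, and the named clones account for 52 of the 62 MRSA isolates, so 10 are not attributed to a named clone in the abstract.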

  9. Photonic quantum simulator for unbiased phase covariant cloning

    NASA Astrophysics Data System (ADS)

    Knoll, Laura T.; López Grande, Ignacio H.; Larotonda, Miguel A.

    2018-01-01

    We present the results of a linear optics photonic implementation of a quantum circuit that simulates a phase covariant cloner, using two different degrees of freedom of a single photon. We experimentally simulate the action of two mirrored 1→2 cloners, each of them biasing the cloned states into opposite regions of the Bloch sphere. We show that by applying a random sequence of these two cloners, an eavesdropper can mitigate the amount of noise added to the original input state and therefore, prepare clones with no bias, but with the same individual fidelity, masking its presence in a quantum key distribution protocol. Input polarization qubit states are cloned into path qubit states of the same photon, which is identified as a potential eavesdropper in a quantum key distribution protocol. The device has the flexibility to produce mirrored versions that optimally clone states on either the northern or southern hemispheres of the Bloch sphere, as well as to simulate optimal and non-optimal cloning machines by tuning the asymmetry on each of the cloning machines.

  10. Defended to the Nines: 25 Years of Resistance Gene Cloning Identifies Nine Mechanisms for R Protein Function.

    PubMed

    Kourelis, Jiorgos; van der Hoorn, Renier A L

    2018-02-01

    Plants have many, highly variable resistance (R) gene loci, which provide resistance to a variety of pathogens. The first R gene to be cloned, maize (Zea mays) Hm1, was published over 25 years ago, and since then, many different R genes have been identified and isolated. The encoded proteins have provided clues to the diverse molecular mechanisms underlying immunity. Here, we present a meta-analysis of 314 cloned R genes. The majority of R genes encode cell surface or intracellular receptors, and we distinguish nine molecular mechanisms by which R proteins can elevate or trigger disease resistance: direct (1) or indirect (2) perception of pathogen-derived molecules on the cell surface by receptor-like proteins and receptor-like kinases; direct (3) or indirect (4) intracellular detection of pathogen-derived molecules by nucleotide binding, leucine-rich repeat receptors, or detection through integrated domains (5); perception of transcription activator-like effectors through activation of executor genes (6); and active (7), passive (8), or host reprogramming-mediated (9) loss of susceptibility. Although the molecular mechanisms underlying the functions of R genes are only understood for a small proportion of known R genes, a clearer understanding of mechanisms is emerging and will be crucial for rational engineering and deployment of novel R genes. © 2018 American Society of Plant Biologists. All rights reserved.

  11. Diverse Array of New Viral Sequences Identified in Worldwide Populations of the Asian Citrus Psyllid (Diaphorina citri) Using Viral Metagenomics.

    PubMed

    Nouri, Shahideh; Salem, Nidá; Nigg, Jared C; Falk, Bryce W

    2015-12-16

    The Asian citrus psyllid, Diaphorina citri, is the natural vector of the causal agent of Huanglongbing (HLB), or citrus greening disease. Together, HLB and D. citri represent a major threat to world citrus production. As there is no cure for HLB, insect vector management is considered one strategy to help control the disease, and D. citri viruses might be useful. In this study, we used a metagenomic approach to analyze viral sequences associated with the global population of D. citri. By sequencing small RNAs and the transcriptome coupled with bioinformatics analysis, we showed that the virus-like sequences of D. citri are diverse. We identified novel viral sequences belonging to the picornavirus superfamily, the Reoviridae, Parvoviridae, and Bunyaviridae families, and an unclassified positive-sense single-stranded RNA virus. Moreover, a Wolbachia prophage-related sequence was identified. This is the first comprehensive survey to assess the viral community from worldwide populations of an agricultural insect pest. Our results provide valuable information on new putative viruses, some of which may have the potential to be used as biocontrol agents. Insects have the most species of all animals, and are hosts to, and vectors of, a great variety of known and unknown viruses. Some of these most likely have the potential to be important fundamental and/or practical resources. In this study, we used high-throughput next-generation sequencing (NGS) technology and bioinformatics analysis to identify putative viruses associated with Diaphorina citri, the Asian citrus psyllid. D. citri is the vector of the bacterium causing Huanglongbing (HLB), currently the most serious threat to citrus worldwide. Here, we report several novel viral sequences associated with D. citri. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  12. Aerially transmitted human fungal pathogens: what can we learn from metagenomics and comparative genomics?

    PubMed

    Aliouat-Denis, Cécile-Marie; Chabé, Magali; Delhaes, Laurence; Dei-Cas, Eduardo

    2014-01-01

    In the last few decades, aerially transmitted human fungal pathogens have been increasingly recognized to impact the clinical course of chronic pulmonary diseases, such as asthma, cystic fibrosis or chronic obstructive pulmonary disease. Thanks to recent development of culture-free high-throughput sequencing methods, the metagenomic approaches are now appropriate to detect, identify and even quantify prokaryotic or eukaryotic microorganism communities inhabiting human respiratory tract and to access the complexity of even low-burden microbe communities that are likely to play a role in chronic pulmonary diseases. In this review, we explore how metagenomics and comparative genomics studies can alleviate fungal culture bottlenecks, improve our knowledge about fungal biology, lift the veil on cross-talks between host lung and fungal microbiota, and gain insights into the pathogenic impact of these aerially transmitted fungi that affect human beings. We reviewed metagenomic studies and comparative genomic analyses of carefully chosen microorganisms, and confirmed the usefulness of such approaches to better delineate biology and pathogenesis of aerially transmitted human fungal pathogens. Efforts to generate and efficiently analyze the enormous amount of data produced by such novel approaches have to be pursued, and will potentially provide the patients suffering from chronic pulmonary diseases with a better management. This manuscript is part of the series of works presented at the "V International Workshop: Molecular genetic approaches to the study of human pathogenic fungi" (Oaxaca, Mexico, 2012). Copyright © 2013 Revista Iberoamericana de Micología. Published by Elsevier Espana. All rights reserved.

  13. Metagenomic detection of phage-encoded platelet-binding factors in the human oral cavity

    PubMed Central

    Willner, Dana; Furlan, Mike; Schmieder, Robert; Grasis, Juris A.; Pride, David T.; Relman, David A.; Angly, Florent E.; McDole, Tracey; Mariella, Ray P.; Rohwer, Forest; Haynes, Matthew

    2011-01-01

    The human oropharynx is a reservoir for many potential pathogens, including streptococcal species that cause endocarditis. Although oropharyngeal microbes have been well described, viral communities are essentially uncharacterized. We conducted a metagenomic study to determine the composition of oropharyngeal DNA viral communities (both phage and eukaryotic viruses) in healthy individuals and to evaluate oropharyngeal swabs as a rapid method for viral detection. Viral DNA was extracted from 19 pooled oropharyngeal swabs and sequenced. Viral communities consisted almost exclusively of phage, and complete genomes of several phage were recovered, including Escherichia coli phage T3, Propionibacterium acnes phage PA6, and Streptococcus mitis phage SM1. Phage relative abundances changed dramatically depending on whether samples were chloroform treated or filtered to remove microbial contamination. pblA and pblB genes of phage SM1 were detected in the metagenomes. pblA and pblB mediate the attachment of S. mitis to platelets and play a significant role in S. mitis virulence in the endocardium, but have never previously been detected in the oral cavity. These genes were also identified in salivary metagenomes from three individuals at three time points and in individual saliva samples by PCR. Additionally, we demonstrate that phage SM1 can be induced by commonly ingested substances. Our results indicate that the oral cavity is a reservoir for pblA and pblB genes and for phage SM1 itself. Further studies will determine the association between pblA and pblB genes in the oral cavity and the risk of endocarditis. PMID:20547834

  14. An Artificial Functional Family Filter in Homolog Searching in Next-generation Sequencing Metagenomics

    PubMed Central

    Du, Ruofei; Mercante, Donald; Fang, Zhide

    2013-01-01

    In functional metagenomics, BLAST homology search is a common method to classify metagenomic reads into protein/domain sequence families such as Clusters of Orthologous Groups of proteins (COGs) in order to quantify the abundance of each COG in the community. The resulting functional profile of the community is then used in downstream analysis to correlate the change in abundance to environmental perturbation, clinical variation, and so on. However, the short read length coupled with next-generation sequencing technologies poses a barrier in this approach, essentially because similarity significance cannot be discerned by searching with short reads. Consequently, artificial functional families are produced, and those with a large number of assigned reads decrease the accuracy of the functional profile dramatically. There is no method available to address this problem. We intended to fill this gap in this paper. We revealed that BLAST similarity scores of homologues for short reads from the coding sequences of COG protein members are distributed differently from the scores of those derived elsewhere. We showed that, by choosing an appropriate score cut-off, we are able to filter out most artificial families and simultaneously to preserve sufficient information in order to build the functional profile. We also showed that, by the combined application of BLAST and RPS-BLAST, some artificial families with large read counts can be further identified after the score cutoff filtration. Evaluated on three experimental metagenomic datasets with different coverages, we found that the proposed method is robust against read coverage and consistently outperforms the other E-value cutoff methods currently used in the literature. PMID:23516532
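
    The score cut-off idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the hit tuples, the family identifiers and the cutoff of 40.0 are hypothetical, and the RPS-BLAST step is not modeled.

```python
from collections import Counter


def build_functional_profile(hits, min_score):
    """Count reads per family, keeping only hits at or above min_score.

    hits: iterable of (read_id, family_id, bit_score) tuples. Each read is
    assigned to its best-scoring surviving family. All names and values
    are hypothetical illustration data.
    """
    best = {}
    for read_id, family_id, score in hits:
        if score < min_score:
            continue  # below the cutoff: treated as a likely spurious match
        if read_id not in best or score > best[read_id][1]:
            best[read_id] = (family_id, score)
    return Counter(family for family, _ in best.values())


# Hypothetical search hits for four short reads.
hits = [
    ("read1", "COG0001", 62.0),
    ("read1", "COG0002", 35.0),
    ("read2", "COG0002", 31.0),
    ("read3", "COG0001", 55.5),
    ("read4", "COG0003", 48.0),
]
profile = build_functional_profile(hits, min_score=40.0)
```

    With this toy input, COG0002 receives only low-scoring hits and drops out of the profile, which is the behavior the cut-off is meant to produce for artificial families.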

  15. Cloning of a newly identified heart-specific troponin I isoform, which lacks the troponin T binding portion, using the yeast hybrid system

    PubMed Central

    Suzuki, Hideaki; Arakawa, Yasuhiro; Ito, Masaki; Yamada, Hisashi; Horiguchi-Yamada, Junko

    2006-01-01

    OBJECTIVE To elucidate the molecular pathogenesis behind increased levels of laminin in cardiac muscle cells in cardiomyopathy by using a yeast hybrid screen. The present study reports the cloning of a newly identified heart-specific troponin I isoform, which is putatively linked to laminin. Future studies will explore the functional significance of this connection. METHODS Yeast two-hybrid screen analysis was performed using MLF1-interacting protein (amino acids 1 to 318) as bait. The human heart complementary DNA library was screened by using the yeast-mating method for overnight culture. RESULTS Two final positive clones from the heart library were isolated. These two clones encoded the same protein, a short isoform of human cardiac troponin I (TnI) that lacked TnI exons 5 and 6. The TnI isoform has a heart-specific expression pattern and it shares several sequence features with human cardiac TnI; however, it lacks the troponin T binding portion. CONCLUSION The heart-specific segment of the human cardiac TnI isoform shares several sequence features with human cardiac TnI, but it lacks the troponin T binding portion. These results suggest that the heart-specific TnI isoform may be involved in cardiac development and disease. PMID:18651010

  16. Culture-independent discovery of natural products from soil metagenomes.

    PubMed

    Katz, Micah; Hover, Bradley M; Brady, Sean F

    2016-03-01

    Bacterial natural products have proven to be invaluable starting points in the development of many currently used therapeutic agents. Unfortunately, traditional culture-based methods for natural product discovery have been deemphasized by pharmaceutical companies due in large part to high rediscovery rates. Culture-independent, or "metagenomic," methods, which rely on the heterologous expression of DNA extracted directly from environmental samples (eDNA), have the potential to provide access to metabolites encoded by a large fraction of the earth's microbial biosynthetic diversity. As soil is both ubiquitous and rich in bacterial diversity, it is an appealing starting point for culture-independent natural product discovery efforts. This review provides an overview of the history of soil metagenome-driven natural product discovery studies and elaborates on the recent development of new tools for sequence-based, high-throughput profiling of environmental samples used in discovering novel natural product biosynthetic gene clusters. We conclude with several examples of these new tools being employed to facilitate the recovery of novel secondary metabolite encoding gene clusters from soil metagenomes and the subsequent heterologous expression of these clusters to produce bioactive small molecules.

  17. Elucidation of taste- and odor-producing bacteria and toxigenic cyanobacteria in a Midwestern drinking water supply reservoir by shotgun metagenomics analysis

    USGS Publications Warehouse

    Otten, Timothy; Graham, Jennifer L.; Harris, Theodore D.; Dreher, Theo

    2016-01-01

    While commonplace in clinical settings, DNA-based assays for identification or enumeration of drinking water pathogens and other biological contaminants remain widely unadopted by the monitoring community. In this study, shotgun metagenomics was used to identify taste-and-odor producers and toxin-producing cyanobacteria over a 2-year period in a drinking water reservoir. The sequencing data implicated several cyanobacteria, including Anabaena spp., Microcystis spp., and an unresolved member of the order Oscillatoriales as the likely principal producers of geosmin, microcystin, and 2-methylisoborneol (MIB), respectively. To further demonstrate this, quantitative PCR (qPCR) assays targeting geosmin-producing Anabaena and microcystin-producing Microcystis were utilized, and these data were fitted using generalized linear models and compared with routine monitoring data, including microscopic cell counts, sonde-based physicochemical analyses, and assays of all inorganic and organic nitrogen and phosphorus forms and fractions. The qPCR assays explained the greatest variation in observed geosmin (adjusted R2 = 0.71) and microcystin (adjusted R2 = 0.84) concentrations over the study period, highlighting their potential for routine monitoring applications. The origin of the monoterpene cyclase required for MIB biosynthesis was putatively linked to a periphytic cyanobacterial mat attached to the concrete drinking water inflow structure. We conclude that shotgun metagenomics can be used to identify microbial agents involved in water quality deterioration and to guide PCR assay selection or design for routine monitoring purposes. Finally, we offer estimates of microbial diversity and metagenomic coverage of our data sets for reference to others wishing to apply shotgun metagenomics to other lacustrine systems.
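
    For readers unfamiliar with the adjusted R2 values quoted above, the sketch below shows the standard least-squares definition, which penalizes R2 for the number of predictors. It assumes that definition applies; the abstract does not state how the statistic was computed for the generalized linear models, and the example inputs are hypothetical.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R2 under the ordinary least-squares definition:
    1 - (1 - R2) * (n - 1) / (n - p - 1).
    """
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


# Hypothetical example: R2 of 0.75 from 21 observations and 1 predictor.
example = adjusted_r2(0.75, n_obs=21, n_predictors=1)
```

    The adjustment always lowers the value relative to the raw R2 when at least one predictor is used and R2 is below 1, so the figures reported above are conservative relative to unadjusted fits.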

  18. Functional Metagenomic Investigations of Microbial Communities in a Shallow-Sea Hydrothermal System

    PubMed Central

    Tang, Kai; Liu, Keshao; Jiao, Nianzhi; Zhang, Yao; Chen, Chen-Tung Arthur

    2013-01-01

    Little is known about the functional capability of microbial communities in shallow-sea hydrothermal systems (water depth of <200 m). This study analyzed two high-throughput pyrosequencing metagenomic datasets from the vent and the surface water in the shallow-sea hydrothermal system offshore NE Taiwan. This system exhibited distinct geochemical parameters. Metagenomic data revealed that the vent and the surface water were predominated by Epsilonproteobacteria (Nautiliales-like organisms) and Gammaproteobacteria (Thiomicrospira-like organisms), respectively. A significant difference in microbial carbon fixation and sulfur metabolism was found between the vent and the surface water. The chemoautotrophic microorganisms in the vent and in the surface water might possess the reverse tricarboxylic acid cycle and the Calvin−Bassham−Benson cycle for carbon fixation in response to carbon dioxide highly enriched in the environment, which is possibly fueled by geochemical energy with sulfur and hydrogen. Comparative analyses of metagenomes showed that the shallow-sea metagenomes contained some genes similar to those present in other extreme environments. This study may serve as a basis for deeply understanding the genetic network and functional capability of the microbial members of shallow-sea hydrothermal systems. PMID:23940820

  19. Productivity and salinity structuring of the microplankton revealed by comparative freshwater metagenomics

    PubMed Central

    Eiler, Alexander; Zaremba-Niedzwiedzka, Katarzyna; Martínez-García, Manuel; McMahon, Katherine D; Stepanauskas, Ramunas; Andersson, Siv G E; Bertilsson, Stefan

    2014-01-01

    Little is known about the diversity and structuring of freshwater microbial communities beyond the patterns revealed by tracing their distribution in the landscape with common taxonomic markers such as the ribosomal RNA. To address this gap in knowledge, metagenomes from temperate lakes were compared to selected marine metagenomes. Taxonomic analyses of rRNA genes in these freshwater metagenomes confirm the previously reported dominance of a limited subset of uncultured lineages of freshwater bacteria, whereas Archaea were rare. Diversification into marine and freshwater microbial lineages was also reflected in phylogenies of functional genes, and there were also significant differences in functional beta-diversity. The pathways and functions that accounted for these differences are involved in osmoregulation, active transport, carbohydrate and amino acid metabolism. Moreover, predicted genes orthologous to active transporters and recalcitrant organic matter degradation were more common in microbial genomes from oligotrophic versus eutrophic lakes. This comparative metagenomic analysis allowed us to formulate a general hypothesis that oceanic- compared with freshwater-dwelling microorganisms, invest more in metabolism of amino acids and that strategies of carbohydrate metabolism differ significantly between marine and freshwater microbial communities. PMID:24118837

  20. Introduction to Metagenomics at DOE JGI (Opening Remarks for the Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Kyrpides, Nikos [DOE JGI]

    2018-05-30

    After a quick introduction by DOE JGI Director Eddy Rubin, DOE JGI's Nikos Kyrpides delivers the opening remarks at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  1. Defended to the Nines: 25 Years of Resistance Gene Cloning Identifies Nine Mechanisms for R Protein Function

    PubMed Central

    2018-01-01

    Plants have many, highly variable resistance (R) gene loci, which provide resistance to a variety of pathogens. The first R gene to be cloned, maize (Zea mays) Hm1, was published over 25 years ago, and since then, many different R genes have been identified and isolated. The encoded proteins have provided clues to the diverse molecular mechanisms underlying immunity. Here, we present a meta-analysis of 314 cloned R genes. The majority of R genes encode cell surface or intracellular receptors, and we distinguish nine molecular mechanisms by which R proteins can elevate or trigger disease resistance: direct (1) or indirect (2) perception of pathogen-derived molecules on the cell surface by receptor-like proteins and receptor-like kinases; direct (3) or indirect (4) intracellular detection of pathogen-derived molecules by nucleotide binding, leucine-rich repeat receptors, or detection through integrated domains (5); perception of transcription activator-like effectors through activation of executor genes (6); and active (7), passive (8), or host reprogramming-mediated (9) loss of susceptibility. Although the molecular mechanisms underlying the functions of R genes are only understood for a small proportion of known R genes, a clearer understanding of mechanisms is emerging and will be crucial for rational engineering and deployment of novel R genes. PMID:29382771

  2. The Clone Factory

    ERIC Educational Resources Information Center

    Stoddard, Beryl

    2005-01-01

    Have humans been cloned? Is it possible? Immediate interest is sparked when students are asked these questions. In response to their curiosity, the clone factory activity was developed to help them understand the process of cloning. In this activity, students reenact the cloning process, in a very simplified simulation. After completing the…

  3. An Integrated Metagenomics/Metaproteomics Investigation of the Microbial Communities and Enzymes in Solid-state Fermentation of Pu-erh tea

    PubMed Central

    Zhao, Ming; Zhang, Dong-lian; Su, Xiao-qin; Duan, Shuang-mei; Wan, Jin-qiong; Yuan, Wen-xia; Liu, Ben-ying; Ma, Yan; Pan, Ying-hong

    2015-01-01

    Microbial enzymes during solid-state fermentation (SSF), which play important roles in the food, chemical, pharmaceutical and environmental fields, remain relatively unknown. In this work, the microbial communities and enzymes in SSF of Pu-erh tea, a well-known traditional Chinese tea, were investigated by an integrated metagenomics/metaproteomics approach. The dominant bacteria and fungi were identified as Proteobacteria (48.42%) and Aspergillus (94.98%), through pyrosequencing-based analyses of the bacterial 16S and fungal 18S rRNA genes, respectively. In total, 335 proteins with at least two unique peptides were identified and classified into 28 Biological Processes and 35 Molecular Function categories using a metaproteomics analysis. The integration of metagenomics and metaproteomics data demonstrated that Aspergillus was the dominant fungus and the major host of identified proteins (50.45%). Enzymes involved in the degradation of the plant cell wall were identified and associated with the soft-rotting of tea leaves. Peroxiredoxins, catalase and peroxidases were associated with the oxidation of catechins. In conclusion, this work greatly advances our understanding of the SSF of Pu-erh tea and provides a powerful tool for studying SSF mechanisms, especially in relation to the microbial communities present. PMID:25974221

  4. Metagenomic Approaches to Assess Bacteriophages in Various Environmental Niches

    PubMed Central

    Hayes, Stephen; Mahony, Jennifer; Nauta, Arjen; van Sinderen, Douwe

    2017-01-01

    Bacteriophages are ubiquitous and numerous parasites of bacteria and play a critical evolutionary role in virtually every ecosystem, yet our understanding of the extent of the diversity and role of phages remains inadequate for many ecological niches, particularly in cases in which the host is unculturable. During the past 15 years, the emergence of the field of viral metagenomics has drastically enhanced our ability to analyse the so-called viral ‘dark matter’ of the biosphere. Here, we review the evolution of viral metagenomic methodologies, as well as providing an overview of some of the most significant applications and findings in this field of research. PMID:28538703

  5. Metagenomic and metatranscriptomic analysis of saliva reveals disease-associated microbiota in patients with periodontitis and dental caries.

    PubMed

    Belstrøm, Daniel; Constancias, Florentin; Liu, Yang; Yang, Liang; Drautz-Moses, Daniela I; Schuster, Stephan C; Kohli, Gurjeet Singh; Jakobsen, Tim Holm; Holmstrup, Palle; Givskov, Michael

    2017-01-01

    The taxonomic composition of the salivary microbiota has been reported to differentiate between oral health and disease. However, information on bacterial activity and gene expression of the salivary microbiota is limited. The purpose of this study was to perform metagenomic and metatranscriptomic characterization of the salivary microbiota and test the hypothesis that salivary microbial presence and activity could be an indicator of the oral health status. Stimulated saliva samples were collected from 30 individuals (periodontitis: n = 10, dental caries: n = 10, oral health: n = 10). Salivary microbiota was characterized using metagenomics and metatranscriptomics in order to compare community composition and the gene expression between the three groups. Streptococcus was the predominant bacterial genus constituting approx. 25 and 50% of all DNA and RNA reads, respectively. A significant disease-associated higher relative abundance of traditional periodontal pathogens such as Porphyromonas gingivalis and Filifactor alocis and salivary microbial activity of F. alocis was associated with periodontitis. Significantly higher relative abundance of caries-associated bacteria such as Streptococcus mutans and Lactobacillus fermentum was identified in saliva from patients with dental caries. Multiple genes involved in carbohydrate metabolism were significantly more expressed in healthy controls compared to periodontitis patients. Using metagenomics and metatranscriptomics we show that relative abundance of specific oral bacterial species and bacterial gene expression in saliva associates with periodontitis and dental caries. Further longitudinal studies are warranted to evaluate if screening of salivary microbial activity of specific oral bacterial species and metabolic gene expression can identify periodontitis and dental caries at preclinical stages.

  6. High Throughput Screening of Esterases, Lipases and Phospholipases in Mutant and Metagenomic Libraries: A Review.

    PubMed

    Peña-García, Carlina; Martínez-Martínez, Mónica; Reyes-Duarte, Dolores; Ferrer, Manuel

    2016-01-01

    Nowadays, enzymes can be efficiently identified and screened from metagenomic resources or mutant libraries. A set of a few hundred new enzymes can be found using a simple substrate within a few months. Hence, the establishment of collections of enzymes is no longer a big hurdle. However, a key problem is the relatively low rate of positive hits and that a timeline of several years from the identification of a gene to the development of a process is the reality rather than the exception. Major problems are related to the time-consuming and cost-intensive screening process that only very few enzymes finally pass. Access to the highest possible enzyme and mutant diversity by different, but complementary approaches is increasingly important. The aim of this review is to deliver the state-of-the-art status of traditional and novel screening protocols for targeting lipases, esterases and phospholipases of industrial relevance, and that can be applied at high throughput scale (HTS) for at least 200 distinct substrates, at a speed of more than 10^5 - 10^8 clones/day. We also review fine-tuning sequence analysis pipelines and in silico tools, which can further improve enzyme selection by an unprecedented speed (up to 10^30 enzymes). If the hit rate in an enzyme collection could be increased by HTS approaches, it can be expected that the further expensive and time-consuming enzyme optimization phase could also be significantly shortened, as the processes of enzyme-candidate selection by such methods can be adapted to conditions most likely similar to the ones needed at industrial scale.

  7. Community and gene composition of a human dental plaque microbiota obtained by metagenomic sequencing

    PubMed Central

    Xie, G.; Chain, P.S.G.; Lo, C.; Liu, K-L.; Gans, J.; Merritt, J.; Qi, F.

    2010-01-01

    Human dental plaque is a complex microbial community containing an estimated 700 to 19,000 species/phylotypes. Despite numerous studies analysing species richness in healthy and diseased human subjects, the true genomic composition of the human dental plaque microbiota remains unknown. Here we report a metagenomic analysis of a healthy human plaque sample using a combination of second-generation sequencing platforms. A total of 860 million base pairs of non-human sequences were generated. Various analysis tools revealed the presence of 12 well-characterized phyla, members of the TM-7 and BRC1 clade, and sequences that could not be classified. Both pathogens and opportunistic pathogens were identified, supporting the ecological plaque hypothesis for oral diseases. Mapping the metagenomic reads to sequenced reference genomes demonstrated that 4% of the reads could be assigned to the sequenced species. Preliminary annotation identified genes belonging to all known functional categories. Interestingly, although 73% of the total assembled contig sequences were predicted to code for proteins, only 51% of them could be assigned a functional role. Furthermore, ~2.8% of the total predicted genes coded for proteins involved in resistance to antibiotics and toxic compounds, suggesting that the oral cavity is an important reservoir for antimicrobial resistance. PMID:21040513

  8. Community and gene composition of a human dental plaque microbiota obtained by metagenomic sequencing.

    PubMed

    Xie, G; Chain, P S G; Lo, C-C; Liu, K-L; Gans, J; Merritt, J; Qi, F

    2010-12-01

    Human dental plaque is a complex microbial community containing an estimated 700 to 19,000 species/phylotypes. Despite numerous studies analysing species richness in healthy and diseased human subjects, the true genomic composition of the human dental plaque microbiota remains unknown. Here we report a metagenomic analysis of a healthy human plaque sample using a combination of second-generation sequencing platforms. A total of 860 million base pairs of non-human sequences were generated. Various analysis tools revealed the presence of 12 well-characterized phyla, members of the TM-7 and BRC1 clade, and sequences that could not be classified. Both pathogens and opportunistic pathogens were identified, supporting the ecological plaque hypothesis for oral diseases. Mapping the metagenomic reads to sequenced reference genomes demonstrated that 4% of the reads could be assigned to the sequenced species. Preliminary annotation identified genes belonging to all known functional categories. Interestingly, although 73% of the total assembled contig sequences were predicted to code for proteins, only 51% of them could be assigned a functional role. Furthermore, ~2.8% of the total predicted genes coded for proteins involved in resistance to antibiotics and toxic compounds, suggesting that the oral cavity is an important reservoir for antimicrobial resistance. © 2010 John Wiley & Sons A/S.

  9. Identification of a Novel Human Papillomavirus by Metagenomic Analysis of Samples from Patients with Febrile Respiratory Illness

    PubMed Central

    Mokili, John L.; Dutilh, Bas E.; Lim, Yan Wei; Schneider, Bradley S.; Taylor, Travis; Haynes, Matthew R.; Metzgar, David; Myers, Christopher A.; Blair, Patrick J.; Nosrat, Bahador; Wolfe, Nathan D.; Rohwer, Forest

    2013-01-01

    As part of a virus discovery investigation using a metagenomic approach, a highly divergent novel Human papillomavirus type was identified in pooled convenience nasal/oropharyngeal swab samples collected from patients with febrile respiratory illness. Phylogenetic analysis of the whole genome and the L1 gene reveals that the new HPV identified in this study clusters with previously described gamma papillomaviruses, sharing only 61.1% (whole genome) and 63.1% (L1) sequence identity with its closest relative in the Papillomavirus episteme (PAVE) database. This new virus was named HPV_SD2 pending official classification. The complete genome of HPV-SD2 is 7,299 bp long (36.3% G/C) and contains 7 open reading frames (L2, L1, E6, E7, E1, E2 and E4) and a non-coding long control region (LCR) between L1 and E6. The metagenomic procedures, coupled with the bioinformatic methods described herein are well suited to detect small circular genomes such as those of human papillomaviruses. PMID:23554892

  10. SPRUCE Deep Peat Heat (DPH) Metagenomes for Peat Samples Collected June 2015

    DOE Data Explorer

    Klumber, Laurel A. [Oak Ridge National Laboratory, U.S. Department of Energy, Oak Ridge, Tennessee, U.S.A.; Yang, Zamin K. [Oak Ridge National Laboratory, U.S. Department of Energy, Oak Ridge, Tennessee, U.S.A.; Schadt, Christopher W. [Oak Ridge National Laboratory, U.S. Department of Energy, Oak Ridge, Tennessee, U.S.A.

    2015-01-01

    This data set provides links to the results of metagenomic analyses of 38 peat core samples collected on 16 June 2015 from SPRUCE experiment treatment plots after approximately one year of belowground heating. These metagenomes are archived in the U.S. Department of Energy Joint Genome Institute (DOE JGI) Integrated Microbial Genomes (IMG) system and are available at the accession numbers provided in the accompanying inventory file.

  11. Characterization of truncated endo-β-1,4-glucanases from a compost metagenomic library and their saccharification potentials.

    PubMed

    Lee, Jae Pil; Lee, Hyun Woo; Na, Han Beur; Lee, Jun-Hee; Hong, Yeo-Jin; Jeon, Jeong-Min; Kwon, Eun Ju; Kim, Sung Kyum; Kim, Hoon

    2018-04-23

    A gene encoding an endo-β-1,4-glucanase (Cel6H-f481) was cloned from a compost metagenomic library. The gene, cel6H-f481, was composed of 1446 bp to encode a fused protein of 481 amino acid residues (50,429 Da), i.e., 445 residues (Cel6H-445) from the metagenome, and 36 residues from the pUC19 vector at the N-terminus. Cel6H-445 belonged to glycosyl hydrolase (GH) family 6 and showed 71% identity with Actinotalea fermentans endoglucanase with low coverage. Several active bands of truncated forms were observed by activity staining of the crude extract. Major truncated enzymes of 35 (Cel6H-p35) and 23 kDa (Cel6H-p23) were separated by HiTrap Q chromatography. The two enzymes had the same optimum temperature (50 °C) and pH (5.5), but Cel6H-p35 was more thermostable than Cel6H-p23 and other GH6 endoglucanases reported. Both enzymes efficiently hydrolyzed carboxymethyl-cellulose (CMC) and barley β-glucan, but hardly hydrolyzed other substrates tested. The Vmax of Cel6H-p35 for CMC was 1.4 times greater than that of Cel6H-p23. The addition of the crude enzymes to a commercial enzyme set increased the saccharification of pretreated rice straw powder by up to 30.9%. These results suggest the N-terminal region of Cel6H-p35 contributes to thermostability and specific activity, and that the enzymes might be a useful additive for saccharification. Copyright © 2018 Elsevier B.V. All rights reserved.

  12. Using populations of human and microbial genomes for organism detection in metagenomes

    DOE PAGES

    Ames, Sasha K.; Gardner, Shea N.; Marti, Jose Manuel; ...

    2015-04-29

    Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) presents a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the known and unknown organism content remains difficult, particularly when SMS is applied as a last resort. The true amount of human DNA that remains in a sample after screening against the human reference genome and filtering nonbiological components left from library preparation has previously been underreported. In this study, we create the most comprehensive collection of microbial and reference-free human genetic variation available in a database optimized for efficient metagenomic search by extracting sequences from GenBank and the 1000 Genomes Project. The results reveal new human sequences found in individual Human Microbiome Project (HMP) samples. Individual samples contain up to 95% human sequence, and 4% of the individual HMP samples contain 10% or more human reads. In conclusion, left unidentified, human reads can complicate and slow down further analysis and lead to inaccurately labeled microbial taxa and ultimately lead to privacy concerns as more human genome data is collected.

  13. Using populations of human and microbial genomes for organism detection in metagenomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ames, Sasha K.; Gardner, Shea N.; Marti, Jose Manuel

    Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) presents a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the known and unknown organism content remains difficult, particularly when SMS is applied as a last resort. The true amount of human DNA that remains in a sample after screening against the human reference genome and filtering nonbiological components left from library preparation has previously been underreported. In this study, we create the most comprehensive collection of microbial and reference-free human genetic variation available in a database optimized for efficient metagenomic search by extracting sequences from GenBank and the 1000 Genomes Project. The results reveal new human sequences found in individual Human Microbiome Project (HMP) samples. Individual samples contain up to 95% human sequence, and 4% of the individual HMP samples contain 10% or more human reads. In conclusion, left unidentified, human reads can complicate and slow down further analysis and lead to inaccurately labeled microbial taxa and ultimately lead to privacy concerns as more human genome data is collected.

  14. Ten years of maintaining and expanding a microbial genome and metagenome analysis system.

    PubMed

    Markowitz, Victor M; Chen, I-Min A; Chu, Ken; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2015-11-01

    Launched in March 2005, the Integrated Microbial Genomes (IMG) system is a comprehensive data management system that supports multidimensional comparative analysis of genomic data. At the core of the IMG system is a data warehouse that contains genome and metagenome datasets sequenced at the Joint Genome Institute or provided by scientific users, as well as public genome datasets available at the National Center for Biotechnology Information Genbank sequence data archive. Genomes and metagenome datasets are processed using IMG's microbial genome and metagenome sequence data processing pipelines and are integrated into the data warehouse using IMG's data integration toolkits. Microbial genome and metagenome application specific data marts and user interfaces provide access to different subsets of IMG's data and analysis toolkits. This review article revisits IMG's original aims, highlights key milestones reached by the system during the past 10 years, and discusses the main challenges faced by a rapidly expanding system, in particular the complexity of maintaining such a system in an academic setting with limited budgets and computing and data management infrastructure. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. High-throughput cloning and expression library creation for functional proteomics.

    PubMed

    Festa, Fernanda; Steel, Jason; Bian, Xiaofang; Labaer, Joshua

    2013-05-01

    The study of protein function usually requires the use of a cloned version of the gene for protein expression and functional assays. This strategy is particularly important when the information available regarding function is limited. The functional characterization of the thousands of newly identified proteins revealed by genomics requires faster methods than traditional single-gene experiments, creating the need for fast, flexible, and reliable cloning systems. These collections of ORF clones can be coupled with high-throughput proteomics platforms, such as protein microarrays and cell-based assays, to answer biological questions. In this tutorial, we provide the background for DNA cloning, discuss the major high-throughput cloning systems (Gateway® Technology, Flexi® Vector Systems, and Creator(TM) DNA Cloning System) and compare them side-by-side. We also report an example of high-throughput cloning study and its application in functional proteomics. This tutorial is part of the International Proteomics Tutorial Programme (IPTP12). © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. Metagenomic characterization of viral communities in corals: mining biological signal from methodological noise.

    PubMed

    Wood-Charlson, Elisha M; Weynberg, Karen D; Suttle, Curtis A; Roux, Simon; van Oppen, Madeleine J H

    2015-10-01

    Reef-building corals form close associations with organisms from all three domains of life and therefore have many potential viral hosts. Yet knowledge of viral communities associated with corals is barely explored. This complexity presents a number of challenges in terms of the metagenomic assessments of coral viral communities and requires specialized methods for purification and amplification of viral nucleic acids, as well as virome annotation. In this minireview, we conduct a meta-analysis of the limited number of existing coral virome studies, as well as available coral transcriptome and metagenome data, to identify trends and potential complications inherent in different methods. The analysis shows that the method used for viral nucleic acid isolation drastically affects the observed viral assemblage and interpretation of the results. Further, the small number of viral reference genomes available, coupled with short sequence read lengths might cause errors in virus identification. Despite these limitations and potential biases, the data show that viral communities associated with corals are diverse, with double- and single-stranded DNA and RNA viruses. The identified viruses are dominated by double-stranded DNA-tailed bacteriophages, but there are also viruses that infect eukaryote hosts, likely the endosymbiotic dinoflagellates, Symbiodinium spp., host coral and other eukaryotes in close association. © 2015 The Authors. Environmental Microbiology published by Society for Applied Microbiology and John Wiley & Sons Ltd.

  17. Identification and Resolution of Microdiversity through Metagenomic Sequencing of Parallel Consortia

    PubMed Central

    Maezato, Yukari; Wu, Yu-Wei; Romine, Margaret F.; Lindemann, Stephen R.

    2015-01-01

    To gain a predictive understanding of the interspecies interactions within microbial communities that govern community function, the genomic complement of every member population must be determined. Although metagenomic sequencing has enabled the de novo reconstruction of some microbial genomes from environmental communities, microdiversity confounds current genome reconstruction techniques. To overcome this issue, we performed short-read metagenomic sequencing on parallel consortia, defined as consortia cultivated under the same conditions from the same natural community with overlapping species composition. The differences in species abundance between the two consortia allowed reconstruction of near-complete (at an estimated >85% of gene complement) genome sequences for 17 of the 20 detected member species. Two Halomonas spp. indistinguishable by amplicon analysis were found to be present within the community. In addition, comparison of metagenomic reads against the consensus scaffolds revealed within-species variation for one of the Halomonas populations, one of the Rhodobacteraceae populations, and the Rhizobiales population. Genomic comparison of these representative instances of inter- and intraspecies microdiversity suggests differences in functional potential that may result in the expression of distinct roles in the community. In addition, isolation and complete genome sequence determination of six member species allowed an investigation into the sensitivity and specificity of genome reconstruction processes, demonstrating robustness across a wide range of sequence coverage (9× to 2,700×) within the metagenomic data set. PMID:26497460

  18. Identification and Resolution of Microdiversity through Metagenomic Sequencing of Parallel Consortia

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nelson, William C.; Maezato, Yukari; Wu, Yu-Wei

    2015-10-23

    To gain a predictive understanding of the interspecies interactions within microbial communities that govern community function, the genomic complement of every member population must be determined. Although metagenomic sequencing has enabled the de novo reconstruction of some microbial genomes from environmental communities, microdiversity confounds current genome reconstruction techniques. To overcome this issue, we performed short-read metagenomic sequencing on parallel consortia, defined as consortia cultivated under the same conditions from the same natural community with overlapping species composition. The differences in species abundance between the two consortia allowed reconstruction of near-complete (at an estimated >85% of gene complement) genome sequences for 17 of the 20 detected member species. Two Halomonas spp. indistinguishable by amplicon analysis were found to be present within the community. In addition, comparison of metagenomic reads against the consensus scaffolds revealed within-species variation for one of the Halomonas populations, one of the Rhodobacteraceae populations, and the Rhizobiales population. Genomic comparison of these representative instances of inter- and intraspecies microdiversity suggests differences in functional potential that may result in the expression of distinct roles in the community. In addition, isolation and complete genome sequence determination of six member species allowed an investigation into the sensitivity and specificity of genome reconstruction processes, demonstrating robustness across a wide range of sequence coverage (9× to 2,700×) within the metagenomic data set.

  19. Fizzy: feature subset selection for metagenomics.

    PubMed

    Ditzler, Gregory; Morrison, J Calvin; Lan, Yemin; Rosen, Gail L

    2015-11-04

    Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- and β-diversity. Feature subset selection--a sub-field of machine learning--can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can obtain the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we have used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. We have developed a new Python command line tool, which is compatible with the widely adopted BIOM format, for microbial ecologists that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.
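
    The idea of information-theoretic feature selection mentioned in this abstract can be illustrated with a minimal sketch. It is not Fizzy's code or interface: the function names, the toy OTU table, and the choice of simply ranking features by mutual information with the class label are all assumptions made for illustration, and the abundances are assumed to be already discretized.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of feature values
    py = Counter(ys)            # marginal counts of labels
    pxy = Counter(zip(xs, ys))  # joint counts
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(table, labels, k):
    """Return the k feature names with the highest mutual information with labels.

    table maps a feature name (e.g. an OTU) to its discretized value per sample.
    """
    scores = {name: mutual_information(vals, labels) for name, vals in table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: presence/absence of three OTUs in six samples.
otu_table = {
    "OTU_a": [1, 1, 1, 0, 0, 0],  # tracks the label exactly
    "OTU_b": [1, 0, 1, 0, 1, 0],  # weakly related to the label
    "OTU_c": [1, 1, 1, 1, 1, 1],  # constant, carries no information
}
labels = ["young", "young", "young", "old", "old", "old"]

top_two = rank_features(otu_table, labels, 2)
print(top_two)
```

    Ranking each feature independently is the simplest information-theoretic criterion; it ignores redundancy between features, which more elaborate subset selection methods are designed to handle.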

  20. Introduction to Metagenomics at DOE JGI: Program Overview and Program Informatics (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Tringe, Susannah

    2018-01-15

    Susannah Tringe of the DOE Joint Genome Institute talks about the Program Overview and Program Informatics at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  1. Computational prediction of CRISPR cassettes in gut metagenome samples from Chinese type-2 diabetic patients and healthy controls.

    PubMed

    Mangericao, Tatiana C; Peng, Zhanhao; Zhang, Xuegong

    2016-01-11

    CRISPR has become a hot topic as a powerful technique for genome editing in humans and other higher organisms. The original CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats coupled with CRISPR-associated proteins) is an important adaptive defence system for prokaryotes that provides resistance against invading elements such as viruses and plasmids. A CRISPR cassette contains short nucleotide sequences called spacers. These unique regions retain a history of the interactions between prokaryotes and their invaders in individual strains and ecosystems. One important ecosystem in the human body is the human gut, a rich habitat populated by a great diversity of microorganisms. Gut microbiomes are important for human physiology and health. Metagenome sequencing has been widely applied for studying the gut microbiomes. Most efforts in metagenome studies have focused on profiling taxa compositions and gene catalogues and identifying their associations with human health. Less attention has been paid to the analysis of the ecosystems of microbiomes themselves, especially their CRISPR composition. We conducted a preliminary analysis of CRISPR sequences in a human gut metagenomic data set from Chinese type-2 diabetes patients and healthy controls. Applying an available CRISPR-identification algorithm, PILER-CR, we identified 3169 CRISPR cassettes in the data, from which we constructed a set of 1302 unique repeat sequences and 36,709 spacers. A more extensive analysis was made for the CRISPR repeats: these repeats were submitted to a more comprehensive clustering and classification using the web server tool CRISPRmap. All repeats were compared with known CRISPRs in the database CRISPRdb. A total of 784 repeats had matches in the database, and the remaining 518 repeats from our set are potentially novel ones. The computational analysis of CRISPR composition based on contigs of metagenome sequencing data is feasible. It provides an efficient

  2. Identifying Group-Specific Sequences for Microbial Communities Using Long k-mer Sequence Signatures

    PubMed Central

    Wang, Ying; Fu, Lei; Ren, Jie; Yu, Zhaoxia; Chen, Ting; Sun, Fengzhu

    2018-01-01

    Comparing metagenomic samples is crucial for understanding microbial communities. For different groups of microbial communities, such as human gut metagenomic samples from patients with a certain disease and healthy controls, identifying group-specific sequences offers essential information for potential biomarker discovery. A sequence that is present, or rich, in one group, but absent, or scarce, in another group is considered “group-specific” in our study. Our main purpose is to discover group-specific sequence regions between control and case groups as disease-associated markers. We developed a long k-mer (k ≥ 30 bps)-based computational pipeline to detect group-specific sequences at strain resolution free from reference sequences, sequence alignments, and metagenome-wide de novo assembly. We called our method MetaGO: Group-specific oligonucleotide analysis for metagenomic samples. An open-source pipeline on Apache Spark was developed with parallel computing. We applied MetaGO to one simulated and three real metagenomic datasets to evaluate the discriminative capability of identified group-specific markers. In the simulated dataset, 99.11% of group-specific logical 40-mers covered 98.89% disease-specific regions from the disease-associated strain. In addition, 97.90% of group-specific numerical 40-mers covered 99.61 and 96.39% of differentially abundant genome and regions between two groups, respectively. For a large-scale metagenomic liver cirrhosis (LC)-associated dataset, we identified 37,647 group-specific 40-mer features. Any one of the features can predict disease status of the training samples with the average of sensitivity and specificity higher than 0.8. The random forests classification using the top 10 group-specific features yielded a higher AUC (from ∼0.8 to ∼0.9) than that of previous studies. All group-specific 40-mers were present in LC patients, but not healthy controls. All the assembled 11 LC-specific sequences can be mapped to two
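
    The notion of a group-specific k-mer in the record above can be sketched in a few lines. The sketch assumes a strict presence/absence criterion (present in every case sample, absent from every control sample), which is narrower than the abstract's "rich" versus "scarce" wording. It is not the MetaGO pipeline: it ignores abundance, reverse complements, statistical testing, and Spark parallelism. The toy reads and k = 5 are hypothetical, whereas the abstract uses k ≥ 30.

```python
def kmers(seq, k):
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sample_kmers(reads, k):
    """Union of the k-mers of every read in one sample."""
    found = set()
    for read in reads:
        found |= kmers(read, k)
    return found


def group_specific_kmers(case_samples, control_samples, k):
    """k-mers found in every case sample and in no control sample."""
    case_sets = [sample_kmers(sample, k) for sample in case_samples]
    control_union = set().union(
        *(sample_kmers(sample, k) for sample in control_samples)
    )
    return set.intersection(*case_sets) - control_union


# Hypothetical toy data: each sample is a list of reads.
cases = [["ACGTTGCA"], ["TTACGTTG"]]
controls = [["ACGTAAAA"]]

specific = group_specific_kmers(cases, controls, k=5)
```

    With real data, requiring presence in every case sample is usually too strict; a frequency threshold per group would be a natural relaxation.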

  3. Microfluidic-based mini-metagenomics enables discovery of novel microbial lineages from complex environmental samples.

    PubMed

    Yu, Feiqiao Brian; Blainey, Paul C; Schulz, Frederik; Woyke, Tanja; Horowitz, Mark A; Quake, Stephen R

    2017-07-05

    Metagenomics and single-cell genomics have enabled genome discovery from unknown branches of life. However, extracting novel genomes from complex mixtures of metagenomic data can still be challenging and represents an ill-posed problem which is generally approached with ad hoc methods. Here we present a microfluidic-based mini-metagenomic method which offers a statistically rigorous approach to extract novel microbial genomes while preserving single-cell resolution. We used this approach to analyze two hot spring samples from Yellowstone National Park and extracted 29 new genomes, including three deeply branching lineages. The single-cell resolution enabled accurate quantification of genome function and abundance, down to 1% in relative abundance. Our analyses of genome level SNP distributions also revealed low to moderate environmental selection. The scale, resolution, and statistical power of microfluidic-based mini-metagenomics make it a powerful tool to dissect the genomic structure of microbial communities while effectively preserving the fundamental unit of biology, the single cell.

  4. Metagenomic approaches to exploit the biotechnological potential of the microbial consortia of marine sponges.

    PubMed

    Kennedy, Jonathan; Marchesi, Julian R; Dobson, Alan D W

    2007-05-01

    Natural products isolated from sponges are an important source of new biologically active compounds. However, the development of these compounds into drugs has been held back by the difficulties in achieving a sustainable supply of these often-complex molecules for pre-clinical and clinical development. Increasing evidence implicates microbial symbionts as the source of many of these biologically active compounds, but the vast majority of the sponge microbial community remain uncultured. Metagenomics offers a biotechnological solution to this supply problem. Metagenomes of sponge microbial communities have been shown to contain genes and gene clusters typical for the biosynthesis of biologically active natural products. Heterologous expression approaches have also led to the isolation of secondary metabolism gene clusters from uncultured microbial symbionts of marine invertebrates and from soil metagenomic libraries. Combining a metagenomic approach with heterologous expression holds much promise for the sustainable exploitation of the chemical diversity present in the sponge microbial community.

  5. MBMC: An Effective Markov Chain Approach for Binning Metagenomic Reads from Environmental Shotgun Sequencing Projects.

    PubMed

    Wang, Ying; Hu, Haiyan; Li, Xiaoman

    2016-08-01

    Metagenomics is a next-generation omics field currently impacting postgenomic life sciences and medicine. Binning metagenomic reads is essential for the understanding of microbial function, compositions, and interactions in given environments. Despite the existence of dozens of computational methods for metagenomic read binning, it is still very challenging to bin reads. This is especially true for reads from unknown species, from species with similar abundance, and/or from low-abundance species in environmental samples. In this study, we developed a novel taxonomy-dependent and alignment-free approach called MBMC (Metagenomic Binning by Markov Chains). Different from all existing methods, MBMC bins reads by measuring the similarity of reads to the trained Markov chains for different taxa instead of directly comparing reads with known genomic sequences. By testing on more than 24 simulated and experimental datasets with species of similar abundance, species of low abundance, and/or unknown species, we report here that MBMC reliably grouped reads from different species into separate bins. Compared with four existing approaches, we demonstrated that the performance of MBMC was comparable with existing approaches when binning reads from sequenced species, and superior to existing approaches when binning reads from unknown species. MBMC is a pivotal tool for binning metagenomic reads in the current era of Big Data and postgenomic integrative biology. The MBMC software can be freely downloaded at http://hulab.ucf.edu/research/projects/metagenomics/MBMC.html .
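
    The binning principle described above, scoring a read against a Markov chain trained per taxon rather than aligning it to genomes, can be sketched as follows. This is an illustration only, not the MBMC software: the chain order, the add-one pseudocount, the training sequences, and all names are assumptions.

```python
from collections import defaultdict
from math import log

ALPHABET = "ACGT"


def train_chain(sequences, order=2, pseudocount=1.0):
    """Count order-k transitions and return a smoothed log-probability function."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1

    def log_prob(context, base):
        row = counts.get(context, {})  # unseen contexts fall back to uniform
        total = sum(row.values()) + pseudocount * len(ALPHABET)
        return log((row.get(base, 0.0) + pseudocount) / total)

    return log_prob


def log_likelihood(read, log_prob, order=2):
    """Log-likelihood of a read under one trained chain."""
    return sum(log_prob(read[i - order:i], read[i])
               for i in range(order, len(read)))


def assign_bin(read, models, order=2):
    """Assign the read to the taxon whose chain gives the highest likelihood."""
    return max(models,
               key=lambda taxon: log_likelihood(read, models[taxon], order))


# Hypothetical training sequences for two taxa with different composition.
models = {
    "AT_rich": train_chain(["ATATATATATATTATAATAT"]),
    "GC_rich": train_chain(["GCGCGGCGCCGCGCGGCGCG"]),
}

bin_a = assign_bin("ATATTATA", models)
bin_b = assign_bin("GCGGCGCC", models)
```

    Real reads would call for higher-order chains trained on whole genomes, and for a rule to leave a read unassigned when no chain fits it well.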

  6. Meeting Report: “Metagenomics, Metadata and Meta-analysis” (M3) Special Interest Group at ISMB 2009

    PubMed Central

    Field, Dawn; Friedberg, Iddo; Sterk, Peter; Kottmann, Renzo; Glöckner, Frank Oliver; Hirschman, Lynette; Garrity, George M.; Cochrane, Guy; Wooley, John; Gilbert, Jack

    2009-01-01

    This report summarizes the proceedings of the “Metagenomics, Metadata and Meta-analysis” (M3) Special Interest Group (SIG) meeting held at the Intelligent Systems for Molecular Biology 2009 conference. The Genomic Standards Consortium (GSC) hosted this meeting to explore the bottlenecks and emerging solutions for obtaining biological insights through large-scale comparative analysis of metagenomic datasets. The M3 SIG included 16 talks, half of which were selected from submitted abstracts, a poster session and a panel discussion involving members of the GSC Board. This report summarizes this one-day SIG, attempts to identify shared themes and recapitulates community recommendations for the future of this field. The GSC will also host an M3 workshop at the Pacific Symposium on Biocomputing (PSB) in January 2010. Further information about the GSC and its range of activities can be found at http://gensc.org/. PMID:21304668

  7. Moleculo Long-Read Sequencing Facilitates Assembly and Genomic Binning from Complex Soil Metagenomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, Richard Allen; Bottos, Eric M.; Roy Chowdhury, Taniya

    ABSTRACT Soil metagenomics has been touted as the “grand challenge” for metagenomics, as the high microbial diversity and spatial heterogeneity of soils make them unamenable to current assembly platforms. Here, we aimed to improve soil metagenomic sequence assembly by applying the Moleculo synthetic long-read sequencing technology. In total, we obtained 267 Gbp of raw sequence data from a native prairie soil; these data included 109.7 Gbp of short-read data (~100 bp) from the Joint Genome Institute (JGI), an additional 87.7 Gbp of rapid-mode read data (~250 bp), plus 69.6 Gbp (>1.5 kbp) from Moleculo sequencing. The Moleculo data alone yielded over 5,600 reads of >10 kbp in length, and over 95% of the unassembled reads mapped to contigs of >1.5 kbp. Hybrid assembly of all data resulted in more than 10,000 contigs over 10 kbp in length. We mapped three replicate metatranscriptomes derived from the same parent soil to the Moleculo subassembly and found that 95% of the predicted genes, based on their assignments to Enzyme Commission (EC) numbers, were expressed. The Moleculo subassembly also enabled binning of >100 microbial genome bins. We obtained via direct binning the first complete genome, that of “Candidatus Pseudomonas sp. strain JKJ-1” from a native soil metagenome. By mapping metatranscriptome sequence reads back to the bins, we found that several bins corresponding to low-relative-abundance Acidobacteria were highly transcriptionally active, whereas bins corresponding to high-relative-abundance Verrucomicrobia were not. These results demonstrate that Moleculo sequencing provides a significant advance for resolving complex soil microbial communities. IMPORTANCE Soil microorganisms carry out key processes for life on our planet, including cycling of carbon and other nutrients and supporting growth of plants. However, there is poor molecular-level understanding of their functional roles in ecosystem stability and responses to environmental

  8. Moleculo Long-Read Sequencing Facilitates Assembly and Genomic Binning from Complex Soil Metagenomes

    PubMed Central

    White, Richard Allen; Bottos, Eric M.; Roy Chowdhury, Taniya; Zucker, Jeremy D.; Brislawn, Colin J.; Nicora, Carrie D.; Fansler, Sarah J.; Glaesemann, Kurt R.; Glass, Kevin

    2016-01-01

    ABSTRACT Soil metagenomics has been touted as the “grand challenge” for metagenomics, as the high microbial diversity and spatial heterogeneity of soils make them unamenable to current assembly platforms. Here, we aimed to improve soil metagenomic sequence assembly by applying the Moleculo synthetic long-read sequencing technology. In total, we obtained 267 Gbp of raw sequence data from a native prairie soil; these data included 109.7 Gbp of short-read data (~100 bp) from the Joint Genome Institute (JGI), an additional 87.7 Gbp of rapid-mode read data (~250 bp), plus 69.6 Gbp (>1.5 kbp) from Moleculo sequencing. The Moleculo data alone yielded over 5,600 reads of >10 kbp in length, and over 95% of the unassembled reads mapped to contigs of >1.5 kbp. Hybrid assembly of all data resulted in more than 10,000 contigs over 10 kbp in length. We mapped three replicate metatranscriptomes derived from the same parent soil to the Moleculo subassembly and found that 95% of the predicted genes, based on their assignments to Enzyme Commission (EC) numbers, were expressed. The Moleculo subassembly also enabled binning of >100 microbial genome bins. We obtained via direct binning the first complete genome, that of “Candidatus Pseudomonas sp. strain JKJ-1” from a native soil metagenome. By mapping metatranscriptome sequence reads back to the bins, we found that several bins corresponding to low-relative-abundance Acidobacteria were highly transcriptionally active, whereas bins corresponding to high-relative-abundance Verrucomicrobia were not. These results demonstrate that Moleculo sequencing provides a significant advance for resolving complex soil microbial communities. IMPORTANCE Soil microorganisms carry out key processes for life on our planet, including cycling of carbon and other nutrients and supporting growth of plants. However, there is poor molecular-level understanding of their functional roles in ecosystem stability and responses to environmental

  9. A potential source for cellulolytic enzyme discovery and environmental aspects revealed through metagenomics of Brazilian mangroves

    PubMed Central

    2013-01-01

    The mangroves are among the most productive and biologically important environments. The possible presence of cellulolytic enzymes and microorganisms useful for biomass degradation as well as taxonomic and functional aspects of two Brazilian mangroves were evaluated using cultivation and metagenomic approaches. From a total of 296 microorganisms with visual differences in colony morphology and growth (including bacteria, yeast and filamentous fungus), 179 (60.5%) and 117 (39.5%) were isolated from the Rio de Janeiro (RJ) and Bahia (BA) samples, respectively. RJ metagenome showed the higher number of microbial isolates, which is consistent with its most conserved state and higher diversity. The metagenomic sequencing data showed similar predominant bacterial phyla in the BA and RJ mangroves with an abundance of Proteobacteria (57.8% and 44.6%), Firmicutes (11% and 12.3%) and Actinobacteria (8.4% and 7.5%). A higher number of enzymes involved in the degradation of polycyclic aromatic compounds were found in the BA mangrove. Specific sequences involved in the cellulolytic degradation, belonging to cellulases, hemicellulases, carbohydrate binding domains, dockerins and cohesins were identified, and it was possible to isolate cultivable fungi and bacteria related to biomass decomposition and with potential applications for the production of biofuels. These results showed that the mangroves possess all fundamental molecular tools required for building the cellulosome, which is required for the efficient degradation of cellulose material and sugar release. PMID:24160319

  10. Activity-based metagenomic screening and biochemical characterization of bovine ruminal protozoan glycoside hydrolases.

    PubMed

    Findley, Seth D; Mormile, Melanie R; Sommer-Hurley, Andrea; Zhang, Xue-Cheng; Tipton, Peter; Arnett, Krista; Porter, James H; Kerley, Monty; Stacey, Gary

    2011-11-01

    The rumen, the foregut of herbivorous ruminant animals such as cattle, functions as a bioreactor to process complex plant material. Among the numerous and diverse microbes involved in ruminal digestion are the ruminal protozoans, which are single-celled, ciliated eukaryotic organisms. An activity-based screen was executed to identify genes encoding fibrolytic enzymes present in the metatranscriptome of a bovine ruminal protozoan-enriched cDNA expression library. Of the four novel genes identified, two were characterized in biochemical assays. Our results provide evidence for the effective use of functional metagenomics to retrieve novel enzymes from microbial populations that cannot be maintained in axenic cultures.

  11. An evaluation of the accuracy and speed of metagenome analysis tools

    PubMed Central

    Lindgreen, Stinus; Adair, Karen L.; Gardner, Paul P.

    2016-01-01

    Metagenome studies are becoming increasingly widespread, yielding important insights into microbial communities covering diverse environments from terrestrial and aquatic ecosystems to human skin and gut. With the advent of high-throughput sequencing platforms, the use of large scale shotgun sequencing approaches is now commonplace. However, a thorough independent benchmark comparing state-of-the-art metagenome analysis tools is lacking. Here, we present a benchmark where the most widely used tools are tested on complex, realistic data sets. Our results clearly show that the most widely used tools are not necessarily the most accurate, that the most accurate tool is not necessarily the most time consuming, and that there is a high degree of variability between available tools. These findings are important as the conclusions of any metagenomics study are affected by errors in the predicted community composition and functional capacity. Data sets and results are freely available from http://www.ucbioinformatics.org/metabenchmark.html PMID:26778510

  12. Host-associated bacterial taxa from Chlorobi, Chloroflexi, GN02, Synergistetes, SR1, TM7, and WPS-2 Phyla/candidate divisions

    PubMed Central

    Camanocha, Anuj; Dewhirst, Floyd E.

    2014-01-01

    Background and objective In addition to the well-known phyla Firmicutes, Proteobacteria, Bacteroidetes, Actinobacteria, Spirochaetes, Fusobacteria, Tenericutes, and Chlamydiae, the oral microbiomes of mammals contain species from the lesser-known phyla or candidate divisions, including Synergistetes, TM7, Chlorobi, Chloroflexi, GN02, SR1, and WPS-2. The objectives of this study were to create phyla-selective 16S rDNA PCR primer pairs, create selective 16S rDNA clone libraries, identify novel oral taxa, and update canine and human oral microbiome databases. Design 16S rRNA gene sequences for members of the lesser-known phyla were downloaded from GenBank and Greengenes databases and aligned with sequences in our RNA databases. Primers with potential phylum level selectivity were designed heuristically with the goal of producing nearly full-length 16S rDNA amplicons. The specificity of primer pairs was examined by making clone libraries from PCR amplicons and determining phyla identity by BLASTN analysis. Results Phylum-selective primer pairs were identified that allowed construction of clone libraries with 96–100% specificity for each of the lesser-known phyla. From these clone libraries, seven human and two canine novel oral taxa were identified and added to their respective taxonomic databases. For each phylum, genome sequences closest to human oral taxa were identified and added to the Human Oral Microbiome Database to facilitate metagenomic, transcriptomic, and proteomic studies that involve tiling sequences to the most closely related taxon. While examining ribosomal operons in lesser-known phyla from single-cell genomes and metagenomes, we identified a novel rRNA operon order (23S-5S-16S) in three SR1 genomes and the splitting of the 23S rRNA gene by an I-CeuI-like homing endonuclease in a WPS-2 genome. Conclusions This study developed useful primer pairs for making phylum-selective 16S rRNA clone libraries. Phylum-specific libraries were shown to be useful

  13. Metagenomics Reveals a Novel Virophage Population in a Tibetan Mountain Lake

    PubMed Central

    Oh, Seungdae; Yoo, Dongwan; Liu, Wen-Tso

    2016-01-01

    Virophages are parasites of giant viruses that infect eukaryotic organisms and may affect the ecology of inland water ecosystems. Despite the potential ecological impact, limited information is available on the distribution, diversity, and hosts of virophages in ecosystems. Metagenomics revealed that virophages were widely distributed in inland waters with various environmental characteristics including salinity and nutrient availability. A novel virophage population was overrepresented in a planktonic microbial community of the Tibetan mountain lake, Lake Qinghai. Our study identified coccolithophores and coccolithovirus-like phycodnaviruses in the same community, which may serve as eukaryotic and viral hosts of the virophage population, respectively. PMID:27151658

  14. Evaluation of rapid and simple techniques for the enrichment of viruses prior to metagenomic virus discovery.

    PubMed

    Hall, Richard J; Wang, Jing; Todd, Angela K; Bissielo, Ange B; Yen, Seiha; Strydom, Hugo; Moore, Nicole E; Ren, Xiaoyun; Huang, Q Sue; Carter, Philip E; Peacey, Matthew

    2014-01-01

    The discovery of new or divergent viruses using metagenomics and high-throughput sequencing has become more commonplace. The preparation of a sample is known to have an effect on the representation of virus sequences within the metagenomic dataset, yet comparatively little attention has been given to this. Physical enrichment techniques are often applied to samples to increase the number of viral sequences and therefore enhance the probability of detection. With the exception of virus ecology studies, there is a paucity of information available to researchers on the type of sample preparation required for a viral metagenomic study that seeks to identify an aetiological virus in an animal or human diagnostic sample. A review of published virus discovery studies revealed the most commonly used enrichment methods, which were usually quick and simple to implement: low-speed centrifugation, filtration, and nuclease treatment (or combinations of these). These have been routinely used, but often without justification. These were applied to a simple and well-characterised artificial sample composed of bacterial and human cells, as well as DNA (adenovirus) and RNA viruses (influenza A and human enterovirus), being either non-enveloped capsid or enveloped viruses. The effect of the enrichment method was assessed by both quantitative real-time PCR and metagenomic analysis that incorporated an amplification step. Reductions in the absolute quantities of bacteria and human cells were observed for each method as determined by qPCR, but the relative abundance of viral sequences in the metagenomic dataset remained largely unchanged. A 3-step method of centrifugation, filtration and nuclease-treatment showed the greatest increase in the proportion of viral sequences. This study provides a starting point for the selection of a purification method in future virus discovery studies, and highlights the need for more data to validate the effect of enrichment methods on different sample

  15. The MG-RAST Metagenomics Database and Portal in 2015

    DOE PAGES

    Wilke, Andreas; Bischof, Jared; Gerlach, Wolfgang; ...

    2015-12-09

    MG-RAST (http://metagenomics.anl.gov) is an open-submission data portal for processing, analyzing, sharing and disseminating metagenomic datasets. Currently, the system hosts over 200 000 datasets and is continuously updated. The volume of submissions has increased 4-fold over the past 24 months, now averaging 4 terabasepairs per month. In addition to several new features, we report changes to the analysis workflow and the technologies used to scale the pipeline up to the required throughput levels. Lastly, to show possible uses for the data from MG-RAST, we present several examples integrating data and analyses from MG-RAST into popular third-party analysis tools or sequence alignment tools.

  16. Metagenomic profiling of microbial composition and antibiotic resistance determinants in Puget Sound.

    PubMed

    Port, Jesse A; Wallace, James C; Griffith, William C; Faustman, Elaine M

    2012-01-01

    Human-health relevant impacts on marine ecosystems are increasing on both spatial and temporal scales. Traditional indicators for environmental health monitoring and microbial risk assessment have relied primarily on single species analyses and have provided only limited spatial and temporal information. More high-throughput, broad-scale approaches to evaluate these impacts are therefore needed to provide a platform for informing public health. This study uses shotgun metagenomics to survey the taxonomic composition and antibiotic resistance determinant content of surface water bacterial communities in the Puget Sound estuary. Metagenomic DNA was collected at six sites in Puget Sound in addition to one wastewater treatment plant (WWTP) that discharges into the Sound and pyrosequenced. A total of ~550 Mbp (1.4 million reads) were obtained, 22 Mbp of which could be assembled into contigs. While the taxonomic and resistance determinant profiles across the open Sound samples were similar, unique signatures were identified when comparing these profiles across the open Sound, a nearshore marina and WWTP effluent. The open Sound was dominated by α-Proteobacteria (in particular Rhodobacterales sp.), γ-Proteobacteria and Bacteroidetes while the marina and effluent had increased abundances of Actinobacteria, β-Proteobacteria and Firmicutes. There was a significant increase in the antibiotic resistance gene signal from the open Sound to marina to WWTP effluent, suggestive of a potential link to human impacts. Mobile genetic elements associated with environmental and pathogenic bacteria were also differentially abundant across the samples. This study is the first comparative metagenomic survey of Puget Sound and provides baseline data for further assessments of community composition and antibiotic resistance determinants in the environment using next generation sequencing technologies. In addition, these genomic signals of potential human impact can be used to guide initial

  17. Metagenomic Profiling of Microbial Composition and Antibiotic Resistance Determinants in Puget Sound

    PubMed Central

    Port, Jesse A.; Wallace, James C.; Griffith, William C.; Faustman, Elaine M.

    2012-01-01

    Human-health relevant impacts on marine ecosystems are increasing on both spatial and temporal scales. Traditional indicators for environmental health monitoring and microbial risk assessment have relied primarily on single species analyses and have provided only limited spatial and temporal information. More high-throughput, broad-scale approaches to evaluate these impacts are therefore needed to provide a platform for informing public health. This study uses shotgun metagenomics to survey the taxonomic composition and antibiotic resistance determinant content of surface water bacterial communities in the Puget Sound estuary. Metagenomic DNA was collected at six sites in Puget Sound in addition to one wastewater treatment plant (WWTP) that discharges into the Sound and pyrosequenced. A total of ∼550 Mbp (1.4 million reads) were obtained, 22 Mbp of which could be assembled into contigs. While the taxonomic and resistance determinant profiles across the open Sound samples were similar, unique signatures were identified when comparing these profiles across the open Sound, a nearshore marina and WWTP effluent. The open Sound was dominated by α-Proteobacteria (in particular Rhodobacterales sp.), γ-Proteobacteria and Bacteroidetes while the marina and effluent had increased abundances of Actinobacteria, β-Proteobacteria and Firmicutes. There was a significant increase in the antibiotic resistance gene signal from the open Sound to marina to WWTP effluent, suggestive of a potential link to human impacts. Mobile genetic elements associated with environmental and pathogenic bacteria were also differentially abundant across the samples. This study is the first comparative metagenomic survey of Puget Sound and provides baseline data for further assessments of community composition and antibiotic resistance determinants in the environment using next generation sequencing technologies. In addition, these genomic signals of potential human impact can be used to guide

  18. Analysis of Metagenomic Sequences: From Megabases to Terabases

    ScienceCinema

    Kyrpides, Nikos

    2018-05-04

    Nikos Kyrpides of the DOE Joint Genome Institute discusses metagenomics and the challenge of dealing with terabases of data on June 4, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

  19. IMG/M-HMP: a metagenome comparative analysis system for the Human Microbiome Project.

    PubMed

    Markowitz, Victor M; Chen, I-Min A; Chu, Ken; Szeto, Ernest; Palaniappan, Krishna; Jacob, Biju; Ratner, Anna; Liolios, Konstantinos; Pagani, Ioanna; Huntemann, Marcel; Mavromatis, Konstantinos; Ivanova, Natalia N; Kyrpides, Nikos C

    2012-01-01

    The Integrated Microbial Genomes and Metagenomes (IMG/M) resource is a data management system that supports the analysis of sequence data from microbial communities in the integrated context of all publicly available draft and complete genomes from the three domains of life as well as a large number of plasmids and viruses. IMG/M currently contains thousands of genomes and metagenome samples with billions of genes. IMG/M-HMP is an IMG/M data mart serving the US National Institutes of Health (NIH) Human Microbiome Project (HMP), focussed on HMP generated metagenome datasets, and is one of the central resources provided from the HMP Data Analysis and Coordination Center (DACC). IMG/M-HMP is available at http://www.hmpdacc-resources.org/imgm_hmp/.

  20. International Standards for Genomes, Transcriptomes, and Metagenomes

    PubMed Central

    Mason, Christopher E.; Afshinnekoo, Ebrahim; Tighe, Scott; Wu, Shixiu; Levy, Shawn

    2017-01-01

    Challenges and biases in preparing, characterizing, and sequencing DNA and RNA can have significant impacts on research in genomics across all kingdoms of life, including experiments in single-cells, RNA profiling, and metagenomics (across multiple genomes). Technical artifacts and contamination can arise at each point of sample manipulation, extraction, sequencing, and analysis. Thus, the measurement and benchmarking of these potential sources of error are of paramount importance as next-generation sequencing (NGS) projects become more global and ubiquitous. Fortunately, a variety of methods, standards, and technologies have recently emerged that improve measurements in genomics and sequencing, from the initial input material to the computational pipelines that process and annotate the data. Here we review current standards and their applications in genomics, including whole genomes, transcriptomes, mixed genomic samples (metagenomes), and the modified bases within each (epigenomes and epitranscriptomes). These standards, tools, and metrics are critical for quantifying the accuracy of NGS methods, which will be essential for robust approaches in clinical genomics and precision medicine. PMID:28337071