Science.gov

Sample records for metagenome fragment classification

  2. Metagenome fragment classification based on multiple motif-occurrence profiles.

    PubMed

    Matsushita, Naoki; Seno, Shigeto; Takenaka, Yoichi; Matsuda, Hideo

    2014-01-01

    A vast amount of metagenomic data has been obtained by extracting multiple genomes simultaneously from microbial communities, including genomes from uncultivable microbes. By analyzing these metagenomic data, novel microbes are discovered and new microbial functions are elucidated. The first step in analyzing these data is sequenced-read classification into reference genomes from which each read can be derived. The Naïve Bayes Classifier is a method for this classification. To identify the derivation of the reads, this method calculates a score based on the occurrence of a DNA sequence motif in each reference genome. However, large differences in the sizes of the reference genomes can bias the scoring of the reads. This bias might cause erroneous classification and decrease the classification accuracy. To address this issue, we have updated the Naïve Bayes Classifier method using multiple sets of occurrence profiles for each reference genome by normalizing the genome sizes, dividing each genome sequence into a set of subsequences of similar length and generating profiles for each subsequence. This multiple profile strategy improves the accuracy of the results generated by the Naïve Bayes Classifier method for simulated and Sargasso Sea datasets. PMID:25210663
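
    The multiple-profile idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the motif length K, the add-one smoothing, the chunk length, and the choice to take the best-scoring chunk per genome are all assumptions introduced here, and every function name is hypothetical.

```python
import math
from collections import Counter

K = 3  # motif (k-mer) length; an assumed value for illustration only


def kmers(seq, k=K):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_profile(seq, k=K):
    """Log-probability of each k-mer in seq, with add-one smoothing (assumed)."""
    counts = Counter(kmers(seq, k))
    total = sum(counts.values()) + 4 ** k
    profile = {kmer: math.log((n + 1) / total) for kmer, n in counts.items()}
    profile[None] = math.log(1 / total)  # fallback score for unseen k-mers
    return profile


def build_profiles(genome, chunk_len, k=K):
    """Split a genome into chunks of similar length; one profile per chunk."""
    n_chunks = max(1, round(len(genome) / chunk_len))
    size = math.ceil(len(genome) / n_chunks)
    return [build_profile(genome[i:i + size], k)
            for i in range(0, len(genome), size)]


def score(read, profile, k=K):
    """Naive Bayes log-score: sum of per-k-mer log-probabilities."""
    return sum(profile.get(kmer, profile[None]) for kmer in kmers(read, k))


def classify(read, references, chunk_len, k=K):
    """Return the reference whose best chunk profile scores the read highest."""
    best_name, best_score = None, -math.inf
    for name, genome in references.items():
        s = max(score(read, p, k) for p in build_profiles(genome, chunk_len, k))
        if s > best_score:
            best_name, best_score = name, s
    return best_name
```

    Because every chunk has a similar length, each profile is estimated from a similar number of motifs, which is the size normalization the abstract describes. In practice the profiles would be built once and cached rather than rebuilt for every read.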

  3. Microbial Community Analysis with Ribosomal Gene Fragments from Shotgun Metagenomes

    PubMed Central

    Guo, Jiarong; Cole, James R.; Zhang, Qingpeng; Brown, C. Titus

    2015-01-01

    Shotgun metagenomic sequencing does not depend on gene-targeted primers or PCR amplification; thus, it is not affected by primer bias or chimeras. However, searching rRNA genes from large shotgun Illumina data sets is computationally expensive, and no approach exists for unsupervised community analysis of small-subunit (SSU) rRNA gene fragments retrieved from shotgun data. We present a pipeline, SSUsearch, that achieves faster identification of small-subunit rRNA gene fragments and enables unsupervised community analysis with shotgun data. It also includes classification and copy number correction, and the output can be used by traditional amplicon analysis platforms. Shotgun metagenome data using this pipeline yielded higher diversity estimates than amplicon data but retained the grouping of samples in ordination analyses. We applied this pipeline to soil samples with paired shotgun and amplicon data and confirmed bias against Verrucomicrobia in a commonly used V6-V8 primer set, as well as discovering likely bias against Actinobacteria and for Verrucomicrobia in a commonly used V4 primer set. This pipeline can utilize all variable regions in SSU rRNA and also can be applied to large-subunit (LSU) rRNA genes for confirmation of community structure. The pipeline can scale to handle large amounts of soil metagenomic data (5 Gb memory and 5 central processing unit hours to process 38 Gb [1 lane] of trimmed Illumina HiSeq2500 data) and is freely available at https://github.com/dib-lab/SSUsearch under a BSD license. PMID:26475107
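
    The abstract mentions copy number correction without giving details. The general idea can be sketched as below, assuming per-taxon SSU rRNA gene copy numbers are available from some lookup table. The function name, the default of one copy for unknown taxa, and the example values are hypothetical and are not taken from SSUsearch.

```python
def correct_copy_number(read_counts, copy_numbers, default_copies=1.0):
    """Convert rRNA read counts into copy-number-corrected relative abundances.

    read_counts:  taxon -> number of SSU rRNA reads assigned to it
    copy_numbers: taxon -> rRNA gene copies per genome (assumed lookup table)
    """
    # Divide each count by the gene copy number so that taxa carrying many
    # rRNA copies per genome are not over-represented.
    corrected = {
        taxon: count / copy_numbers.get(taxon, default_copies)
        for taxon, count in read_counts.items()
    }
    total = sum(corrected.values())
    if total == 0:
        return corrected
    # Renormalize so the corrected abundances sum to 1.
    return {taxon: value / total for taxon, value in corrected.items()}
```

    For example, a taxon with 80 reads and four gene copies and a taxon with 20 reads and one copy both correct to 20 genome equivalents, so each ends up with a relative abundance of 0.5.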

  4. Metagenomic taxonomic classification using extreme learning machines.

    PubMed

    Rasheed, Zeehasham; Rangwala, Huzefa

    2012-10-01

    Next-generation sequencing technologies have allowed researchers to determine the collective genomes of microbial communities co-existing within diverse ecological environments. Varying species abundance, length and complexities within different communities, coupled with discovery of new species makes the problem of taxonomic assignment to short DNA sequence reads extremely challenging. We have developed a new sequence composition-based taxonomic classifier using extreme learning machines referred to as TAC-ELM for metagenomic analysis. TAC-ELM uses the framework of extreme learning machines to quickly and accurately learn the weights for a neural network model. The input features consist of GC content and oligonucleotides. TAC-ELM is evaluated on two metagenomic benchmarks with sequence read lengths reflecting the traditional and current sequencing technologies. Our empirical results indicate the strength of the developed approach, which outperforms state-of-the-art taxonomic classifiers in terms of accuracy and implementation complexity. We also perform experiments that evaluate the pervasive case within metagenome analysis, where a species may not have been previously sequenced or discovered and will not exist in the reference genome databases. TAC-ELM was also combined with BLAST to show improved classification results. Code and Supplementary Results: http://www.cs.gmu.edu/~mlbio/TAC-ELM (BSD License). PMID:22849369
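
    The input features named in this abstract (GC content and oligonucleotide composition) can be computed along the following lines. This sketch covers feature extraction only, not the extreme learning machine itself; the oligonucleotide length k=2, the normalization to frequencies, and the function names are assumptions made for illustration.

```python
from itertools import product


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def oligo_frequencies(seq, k=2):
    """Relative frequency of every length-k oligonucleotide over ACGT.

    Windows containing other symbols (for example N) are skipped.
    The result is ordered alphabetically by oligonucleotide.
    """
    seq = seq.upper()
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    return [counts[key] / total if total else 0.0 for key in sorted(counts)]


def feature_vector(seq, k=2):
    """GC content followed by the 4**k oligonucleotide frequencies."""
    return [gc_content(seq)] + oligo_frequencies(seq, k)
```

    A vector of this kind would be the input to whatever classifier is trained on top, which is a neural network in the case described here.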

  5. Gene prediction in metagenomic fragments based on the SVM algorithm

    PubMed Central

    2013-01-01

    Background: Metagenomic sequencing is becoming a powerful technology for exploring microorganisms from various environments, such as the human body, without isolation and cultivation. Accurately identifying genes from metagenomic fragments is one of the most fundamental issues. Results: In this article, we present a novel gene prediction method named MetaGUN for metagenomic fragments based on a machine learning approach of SVM. It implements a three-stage strategy to predict genes. Firstly, it classifies input fragments into phylogenetic groups by a k-mer based sequence binning method. Then, protein-coding sequences are identified for each group independently with SVM classifiers that integrate entropy density profiles (EDP) of codon usage, translation initiation site (TIS) scores and open reading frame (ORF) length as input patterns. Finally, the TISs are adjusted by employing a modified version of MetaTISA. To identify protein-coding sequences, MetaGUN builds the universal module and the novel module. The former is based on a set of representative species, while the latter is designed to find potential functionary DNA sequences with conserved domains. Conclusions: Comparisons on artificial shotgun fragments with multiple current metagenomic gene finders show that MetaGUN predicts better results on both 3' and 5' ends of genes with fragments of various lengths. Especially, it makes the most reliable predictions among these methods. As an application, MetaGUN was used to predict genes for two samples of human gut microbiome. It identifies thousands of additional genes with significant evidence. Further analysis indicates that MetaGUN tends to predict more potential novel genes than other current metagenomic gene finders. PMID:23735199

  6. CoMeta: Classification of Metagenomes Using k-mers

    PubMed Central

    Kawulok, Jolanta; Deorowicz, Sebastian

    2015-01-01

    Nowadays, the study of environmental samples has been developing rapidly. Characterization of the environment composition broadens the knowledge about the relationship between species composition and environmental conditions. An important element of extracting the knowledge of the sample composition is to compare the extracted fragments of DNA with sequences derived from known organisms. In the presented paper, we introduce an algorithm called CoMeta (Classification of metagenomes), which assigns a query read (a DNA fragment) into one of the groups previously prepared by the user. Typically, each group is a taxonomic rank (e.g., phylum, genus); however, the prepared groups may contain sequences having various functions. In CoMeta, we used the exact method for read classification using short subsequences (k-mers) and a fast program for indexing large sets of k-mers. In contrast to the most popular methods based on BLAST, where the query is compared with each reference sequence, we begin the classification from the top of the taxonomy tree to reduce the number of comparisons. The presented experimental study confirms that CoMeta outperforms other programs used in this context. CoMeta is available at https://github.com/jkawulok/cometa under a free GNU GPL 2 license. PMID:25884504
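
    The top-down strategy described here, where a read is matched against groups starting at the root of the taxonomy tree so that only the children of the winning node are compared next, might look roughly like this. The similarity measure (fraction of the read's k-mers found in a group), the 0.5 threshold, and all names are assumptions; the abstract does not specify CoMeta's actual scoring.

```python
def kmer_set(seq, k):
    """Set of all overlapping k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def match_fraction(read, group_kmers, k):
    """Fraction of the read's distinct k-mers that occur exactly in the group."""
    read_kmers = kmer_set(read, k)
    if not read_kmers:
        return 0.0
    return len(read_kmers & group_kmers) / len(read_kmers)


def classify_top_down(read, tree, index, k, root="root", min_fraction=0.5):
    """Walk the taxonomy from the root, descending into the best child.

    tree:  node -> list of child nodes (leaves are absent or map to [])
    index: node -> set of k-mers from all sequences in that group
    Returns the list of nodes chosen at each rank, which may be empty.
    """
    path = []
    node = root
    while tree.get(node):
        scored = [(match_fraction(read, index[child], k), child)
                  for child in tree[node]]
        best_score, best_child = max(scored)
        if best_score < min_fraction:
            break  # no child is similar enough; stop at the current rank
        path.append(best_child)
        node = best_child
    return path
```

    Only siblings are compared at each step, so the number of comparisons grows with the depth and branching of the tree rather than with the total number of reference sequences.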

  7. [Construction of large fragment metagenome library of natural mangrove soil].

    PubMed

    Jiang, Yun-Xia; Zheng, Tian-Ling

    2007-11-01

    Applying our optimized direct extraction method, the percentage of large fragment DNA in the total extracted mangrove soil DNA was significantly increased. The large fragment metagenome library derived from natural mangrove soil over four seasons was successfully constructed by the optimized DNA extraction and electroelution purification method. All of the clones had recombinant cosmids, and each differed in its fragment profile when cosmid DNA was extracted from 12 randomly picked colonies and digested with BamHI. The average insert size for this library was larger than 35 kbp. This culture-independent library encompassed at least 335 Mbp of valuable genetic information from mangrove soil microbes. It allowed mining of valuable intertidal microbial resources to become a reality. The method is recommended for researchers who have not yet overcome the difficulties of constructing large-insert environmental libraries and for those beginning research in this field, to spare them repetitive, fussy work.

  8. Fast and sensitive taxonomic classification for metagenomics with Kaiju

    PubMed Central

    Menzel, Peter; Ng, Kim Lee; Krogh, Anders

    2016-01-01

    Metagenomics emerged as an important field of research not only in microbial ecology but also for human health and disease, and metagenomic studies are performed on increasingly larger scales. While recent taxonomic classification programs achieve high speed by comparing genomic k-mers, they often lack sensitivity for overcoming evolutionary divergence, so that large fractions of the metagenomic reads remain unclassified. Here we present the novel metagenome classifier Kaiju, which finds maximum (in-)exact matches on the protein-level using the Burrows–Wheeler transform. We show in a genome exclusion benchmark that Kaiju classifies reads with higher sensitivity and similar precision compared with current k-mer-based classifiers, especially in genera that are underrepresented in reference databases. We also demonstrate that Kaiju classifies up to 10 times more reads in real metagenomes. Kaiju can process millions of reads per minute and can run on a standard PC. Source code and web server are available at http://kaiju.binf.ku.dk. PMID:27071849
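
    Matching on the protein level implies that nucleotide reads are first translated. One common way to do this, shown here as an assumption about the preprocessing rather than as Kaiju's code, is to translate each read in all six reading frames with the standard genetic code and split the result at stop codons. The Burrows–Wheeler search itself is not shown, and the function names and the min_len parameter are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_fragments(read, min_len=1):
    """Translate a read in six frames and split each frame at stop codons."""
    read = read.upper()
    fragments = []
    for strand in (read, reverse_complement(read)):
        for offset in range(3):
            protein = translate(strand[offset:])
            fragments.extend(f for f in protein.split("*") if len(f) >= min_len)
    return fragments
```

    Each resulting amino acid fragment would then be a query against the protein reference index. In practice min_len would be raised to discard fragments too short to match reliably.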

  9. Fast and sensitive taxonomic classification for metagenomics with Kaiju.

    PubMed

    Menzel, Peter; Ng, Kim Lee; Krogh, Anders

    2016-04-13

    Metagenomics emerged as an important field of research not only in microbial ecology but also for human health and disease, and metagenomic studies are performed on increasingly larger scales. While recent taxonomic classification programs achieve high speed by comparing genomic k-mers, they often lack sensitivity for overcoming evolutionary divergence, so that large fractions of the metagenomic reads remain unclassified. Here we present the novel metagenome classifier Kaiju, which finds maximum (in-)exact matches on the protein-level using the Burrows-Wheeler transform. We show in a genome exclusion benchmark that Kaiju classifies reads with higher sensitivity and similar precision compared with current k-mer-based classifiers, especially in genera that are underrepresented in reference databases. We also demonstrate that Kaiju classifies up to 10 times more reads in real metagenomes. Kaiju can process millions of reads per minute and can run on a standard PC. Source code and web server are available at http://kaiju.binf.ku.dk.

  10. Taxonomic classification of metagenomic shotgun sequences with CARMA3.

    PubMed

    Gerlach, Wolfgang; Stoye, Jens

    2011-08-01

    The vast majority of microbes are unculturable and thus cannot be sequenced by means of traditional methods. High-throughput sequencing techniques like 454 or Solexa-Illumina make it possible to explore those microbes by studying whole natural microbial communities and analysing their biological diversity as well as the underlying metabolic pathways. Over the past few years, different methods have been developed for the taxonomic and functional characterization of metagenomic shotgun sequences. However, the taxonomic classification of metagenomic sequences from novel species without close homologue in the biological sequence databases poses a challenge due to the high number of wrong taxonomic predictions on lower taxonomic ranks. Here we present CARMA3, a new method for the taxonomic classification of assembled and unassembled metagenomic sequences that has been adapted to work with both BLAST and HMMER3 homology searches. We show that our method makes fewer wrong taxonomic predictions (at the same sensitivity) than other BLAST-based methods. CARMA3 is freely accessible via the web application WebCARMA from http://webcarma.cebitec.uni-bielefeld.de.

  11. De Novo Repeat Classification and Fragment Assembly

    PubMed Central

    Pevzner, Paul A.; Tang, Haixu; Tesler, Glenn

    2004-01-01

    Repetitive sequences make up a significant fraction of almost any genome, and an important and still open question in bioinformatics is how to represent all repeats in DNA sequences. We propose a new approach to repeat classification that represents all repeats in a genome as a mosaic of sub-repeats. Our key algorithmic idea also leads to new approaches to multiple alignment and fragment assembly. In particular, we show that our FragmentGluer assembler improves on Phrap and ARACHNE in assembly of BACs and bacterial genomes. PMID:15342561

  12. Identification and characterization of metagenomic fragments from tidal flat sediment.

    PubMed

    Kim, Byung Kwon; Park, Yoon-Dong; Oh, Hyun-Myung; Chun, Jongsik

    2009-08-01

    Phylogenetic surveys based on cultivation-independent methods have revealed that tidal flat sediments are environments with extensive microbial diversity. Since most of prokaryotes in nature cannot be easily cultivated under general laboratory conditions, our knowledge on prokaryotic dwellers in tidal flat sediment is mainly based on the analysis of metagenomes. Microbial community analysis based on the 16S rRNA gene and other phylogenetic markers has been widely used to provide important information on the role of microorganisms, but it is basically an indirect means, compared with direct sequencing of metagenomic DNAs. In this study, we applied a sequence-based metagenomic approach to characterize uncultivated prokaryotes from tidal flat sediment. Two large-insert genomic libraries based on fosmid were constructed from tidal flat metagenomic DNA. A survey based on end-sequencing of selected fosmid clones resulted in the identification of clones containing 274 bacterial and 16 archaeal homologs in which majority were of proteobacterial origins. Two fosmid clones containing large metagenomic DNAs were completely sequenced using the shotgun method. Both DNA inserts contained more than 20 genes encoding putative proteins which implied their ecological roles in tidal flat sediment. Phylogenetic analyses of evolutionary conserved proteins indicate that these clones are not closely related to known prokaryotes whose genome sequence is known, and genes in tidal flat may be subjected to extensive lateral gene transfer, notably between domains Bacteria and Archaea. This is the first report demonstrating that direct sequencing of metagenomic gene library is useful in underpinning the genetic makeup and functional roles of prokaryotes in tidal flat sediments. PMID:19763413

  13. Large-scale machine learning for metagenomics sequence classification

    PubMed Central

    Vervier, Kévin; Mahé, Pierre; Tournoud, Maud; Veyrieras, Jean-Baptiste; Vert, Jean-Philippe

    2016-01-01

    Motivation: Metagenomics characterizes the taxonomic diversity of microbial communities by sequencing DNA directly from an environmental sample. One of the main challenges in metagenomics data analysis is the binning step, where each sequenced read is assigned to a taxonomic clade. Because of the large volume of metagenomics datasets, binning methods need fast and accurate algorithms that can operate with reasonable computing requirements. While standard alignment-based methods provide state-of-the-art performance, compositional approaches that assign a taxonomic class to a DNA read based on the k-mers it contains have the potential to provide faster solutions. Results: We propose a new rank-flexible machine learning-based compositional approach for taxonomic assignment of metagenomics reads and show that it benefits from increasing the number of fragments sampled from reference genome to tune its parameters, up to a coverage of about 10, and from increasing the k-mer size to about 12. Tuning the method involves training machine learning models on about 10^8 samples in 10^7 dimensions, which is out of reach of standard software but can be done efficiently with modern implementations for large-scale machine learning. The resulting method is competitive in terms of accuracy with well-established alignment and composition-based tools for problems involving a small to moderate number of candidate species and for reasonable amounts of sequencing errors. We show, however, that machine learning-based compositional approaches are still limited in their ability to deal with problems involving a greater number of species and more sensitive to sequencing errors. We finally show that the new method outperforms the state-of-the-art in its ability to classify reads from species of lineage absent from the reference database and confirm that compositional approaches achieve faster prediction times, with a gain of 2–17 times with respect to the BWA-MEM short read mapper, depending

  14. Accurate phylogenetic classification of DNA fragments based on sequence composition

    SciTech Connect

    McHardy, Alice C.; Garcia Martin, Hector; Tsirigos, Aristotelis; Hugenholtz, Philip; Rigoutsos, Isidore

    2006-05-01

    Metagenome studies have retrieved vast amounts of sequence out of a variety of environments, leading to novel discoveries and great insights into the uncultured microbial world. Except for very simple communities, diversity makes sequence assembly and analysis a very challenging problem. To understand the structure and function of microbial communities, a taxonomic characterization of the obtained sequence fragments is highly desirable, yet currently limited mostly to those sequences that contain phylogenetic marker genes. We show that for clades at the rank of domain down to genus, sequence composition allows the very accurate phylogenetic characterization of genomic sequence. We developed a composition-based classifier, PhyloPythia, for de novo phylogenetic sequence characterization and have trained it on a data set of 340 genomes. By extensive evaluation experiments we show that the method is accurate across all taxonomic ranks considered, even for sequences that originate from novel organisms and are as short as 1 kb. Application to two metagenome datasets obtained from samples of phosphorus-removing sludge showed that the method allows the accurate classification at genus level of most sequence fragments from the dominant populations, while at the same time correctly characterizing even larger parts of the samples at higher taxonomic levels.

  15. Classification and quantification of bacteriophage taxa in human gut metagenomes

    PubMed Central

    Waller, Alison S; Yamada, Takuji; Kristensen, David M; Kultima, Jens Roat; Sunagawa, Shinichi; Koonin, Eugene V; Bork, Peer

    2014-01-01

    Bacteriophages have key roles in microbial communities, to a large extent shaping the taxonomic and functional composition of the microbiome, but data on the connections between phage diversity and the composition of communities are scarce. Using taxon-specific marker genes, we identified and monitored 20 viral taxa in 252 human gut metagenomic samples, mostly at the level of genera. On average, five phage taxa were identified in each sample, with up to three of these being highly abundant. The abundances of most phage taxa vary by up to four orders of magnitude between the samples, and several taxa that are highly abundant in some samples are absent in others. Significant correlations exist between the abundances of some phage taxa and human host metadata: for example, ‘Group 936 lactococcal phages' are more prevalent and abundant in Danish samples than in samples from Spain or the United States of America. Quantification of phages that exist as integrated prophages revealed that the abundance profiles of prophages are highly individual-specific and remain unique to an individual over a 1-year time period, and prediction of prophage lysis across the samples identified hundreds of prophages that are apparently active in the gut and vary across the samples, in terms of presence and lytic state. Finally, a prophage–host network of the human gut was established and includes numerous novel host–phage associations. PMID:24621522

  16. Methods for virus classification and the challenge of incorporating metagenomic sequence data.

    PubMed

    Simmonds, Peter

    2015-06-01

    The division of viruses into orders, families, genera and species provides a classification framework that seeks to organize and make sense of the diversity of viruses infecting animals, plants and bacteria. Classifications are based on similarities in genome structure and organization, the presence of homologous genes and sequence motifs and at lower levels such as species, host range, nucleotide and antigenic relatedness and epidemiology. Classification below the level of family must also be consistent with phylogeny and virus evolutionary histories. Recently developed methods such as PASC, DEMaRC and NVR offer alternative strategies for genus and species assignments that are based purely on degrees of divergence between genome sequences. They offer the possibility of automating classification of the vast number of novel virus sequences being generated by next-generation metagenomic sequencing. However, distance-based methods struggle to deal with the complex evolutionary history of virus genomes that are shuffled by recombination and reassortment, and where taxonomic lineages evolve at different rates. In biological terms, classifications based on sequence distances alone are also arbitrary whereas the current system of virus taxonomy is of utility precisely because it is primarily based upon phenotypic characteristics. However, a separate system is clearly needed by which virus variants that lack biological information might be incorporated into the ICTV classification even if based solely on sequence relationships to existing taxa. For these, simplified taxonomic proposals and naming conventions represent a practical way to expand the existing virus classification and catalogue our rapidly increasing knowledge of virus diversity. PMID:26068186

  17. IDENTIFICATION OF AVIAN-SPECIFIC FECAL METAGENOMIC SEQUENCES USING GENOME FRAGMENT ENRICHMENTS

    EPA Science Inventory

    Sequence analysis of microbial genomes has provided biologists the opportunity to compare genetic differences between closely related microorganisms. While random sequencing has also been used to study natural microbial communities, metagenomic comparisons via sequencing analysis...

  18. Metagenomic Classification and Characterization Marine Actinobacteria from the Gulf of Maine without Representative Genomes

    NASA Astrophysics Data System (ADS)

    Sachdeva, R.; Heidelberg, J.

    2012-12-01

    Actinobacteria represent one of the largest and most diverse bacterial phyla and unlike most marine prokaryotes are gram-positive. This phylum encompasses a broad range of physiologies, morphologies, and metabolic properties with a broad array of lifestyles. The marine actinobacterial assemblage is dominated by the orders Actinomycetales and Acidimicrobiales (also known as the marine Actinobacteria clade). The Acidimicrobiales bacteria typically outnumber the Actinomycetales bacteria and are mostly represented by the OCS155 group. Although bacteria of the order Acidimicrobiales make up ~7.6% of the 16S matches from the Global Ocean Survey shotgun metagenomic libraries, very little is known about their potential function and role in biogeochemical cycling. Samples were collected from surface seawater in the Gulf of Maine (GOM) in the summer and winter of 2006. Sanger sequences were generated from the 0.1-0.8 μm fractions using paired-end medium insert shotgun libraries. The resulting 2.2 Gb were assembled using the Celera Assembler package into 280 Mb of non-redundant scaffolds. Putative actinobacterial assemblies were identified using (1) ribosomal RNA genes (16S and 23S), (2) phylogenetically informative non-ribosomal core genes thought to be resistant to horizontal gene transfer (e.g. RecA and RpoB) and (3) compositional binning using oligonucleotide frequency pattern based hierarchical clustering. Binning resulted in 3.6 Mb (4.2X coverage) of actinobacterial scaffolds that were comprised of 15.1 Mb of unassembled reads. Putative actinobacterial assemblies included both summer and winter reads demonstrating that the Actinobacteria are abundant year round. Classification reveals that all of the sampled Actinobacteria are from the orders Acidimicrobiales and Actinomycetales and are similar to those found in the global ocean. The GOM Actinobacteria show a broad range of G+C % content (32-66%) indicating a high level of genomic diversity. Those assemblies

  19. TWARIT: an extremely rapid and efficient approach for phylogenetic classification of metagenomic sequences.

    PubMed

    Reddy, Rachamalla Maheedhar; Mohammed, Monzoorul Haque; Mande, Sharmila S

    2012-09-01

    Phylogenetic assignment of individual sequence reads to their respective taxa, referred to as 'taxonomic binning', constitutes a key step of metagenomic analysis. Existing binning methods have limitations either with respect to time or accuracy/specificity of binning. Given these limitations, development of a method that can bin vast amounts of metagenomic sequence data in a rapid, efficient and computationally inexpensive manner can profoundly influence metagenomic analysis in computational resource poor settings. We introduce TWARIT, a hybrid binning algorithm, that employs a combination of short-read alignment and composition-based signature sorting approaches to achieve rapid binning rates without compromising on binning accuracy and specificity. TWARIT is validated with simulated and real-world metagenomes and the results demonstrate significantly lower overall binning times compared to that of existing methods. Furthermore, the binning accuracy and specificity of TWARIT are observed to be comparable/superior to them. A web server implementing TWARIT algorithm is available at http://metagenomics.atc.tcs.com/Twarit/

  20. Improved ethanol production from biomass by a rumen metagenomic DNA fragment expressed in Escherichia coli MS04 during fermentation.

    PubMed

    Loaces, Inés; Amarelle, Vanesa; Muñoz-Gutierrez, Iván; Fabiano, Elena; Martinez, Alfredo; Noya, Francisco

    2015-11-01

    With the aim of improving current ethanologenic Escherichia coli strains, we screened a metagenomic library from bovine ruminal fluid for cellulolytic enzymes. We isolated one fosmid, termed Csd4, which was able to confer to E. coli the ability to grow on complex cellulosic material as the sole carbon source such as avicel, carboxymethyl cellulose, filter paper, pretreated sugarcane bagasse, and xylan. Glucanolytic activity obtained from E. coli transformed with Csd4 was maximal at 24 h of incubation and was inhibited when glucose or xylose were present in the media. The 34,406-bp DNA fragment of Csd4 was completely sequenced, and a putative endoglucanase, a xylosidase/arabinosidase, and a laccase gene were identified. Comparison analysis revealed that Csd4 derived from an organism closely related to Prevotella ruminicola, but no homologies were found with any of the genomes already sequenced. Csd4 was introduced into the ethanologenic E. coli MS04 strain and ethanol production from CMC, avicel, sugarcane bagasse, or filter paper was observed. Exogenously expressed β-glucosidase had a positive effect on cell growth in agreement with the fact that no putative β-glucosidase was found in Csd4. Ethanol production from sugarcane bagasse was improved threefold by Csd4 after saccharification by commercial Trichoderma reesei cellulases underlining the ability of Csd4 to act as a saccharification enhancer to reduce the enzymatic load and time required for cellulose deconstruction.

  1. Signal Processing for Metagenomics: Extracting Information from the Soup

    PubMed Central

    Rosen, Gail L.; Sokhansanj, Bahrad A.; Polikar, Robi; Bruns, Mary Ann; Russell, Jacob; Garbarine, Elaine; Essinger, Steve; Yok, Non

    2009-01-01

    Traditionally, studies in microbial genomics have focused on single-genomes from cultured species, thereby limiting their focus to the small percentage of species that can be cultured outside their natural environment. Fortunately, recent advances in high-throughput sequencing and computational analyses have ushered in the new field of metagenomics, which aims to decode the genomes of microbes from natural communities without the need for cultivation. Although metagenomic studies have shed a great deal of insight into bacterial diversity and coding capacity, several computational challenges remain due to the massive size and complexity of metagenomic sequence data. Current tools and techniques are reviewed in this paper which address challenges in 1) genomic fragment annotation, 2) phylogenetic reconstruction, 3) functional classification of samples, and 4) interpreting complementary metaproteomics and metametabolomics data. Also surveyed are important applications of metagenomic studies, including microbial forensics and the roles of microbial communities in shaping human health and soil ecology. PMID:20436876

  2. Motif-Based Text Mining of Microbial Metagenome Redundancy Profiling Data for Disease Classification

    PubMed Central

    Wang, Yin; Zhou, Yuhua; Ling, Zongxin; Guo, Xiaokui; Xie, Lu; Liu, Lei

    2016-01-01

    Background. Text data of 16S rRNA are informative for classifications of microbiota-associated diseases. However, the raw text data need to be systematically processed so that features for classification can be defined/extracted; moreover, the high-dimension feature spaces generated by the text data also pose an additional difficulty. Results. Here we present a Phylogenetic Tree-Based Motif Finding algorithm (PMF) to analyze 16S rRNA text data. By integrating phylogenetic rules and other statistical indexes for classification, we can effectively reduce the dimension of the large feature spaces generated by the text datasets. Using the retrieved motifs in combination with common classification methods, we can discriminate different samples of both pneumonia and dental caries better than other existing methods. Conclusions. We extend the phylogenetic approaches to perform supervised learning on microbiota text data to discriminate the pathological states for pneumonia and dental caries. The results have shown that PMF may enhance the efficiency and reliability in analyzing high-dimension text data. PMID:27057545

  3. Classification of fragments of objects by the Fourier masks pattern recognition system

    NASA Astrophysics Data System (ADS)

    Barajas-García, Carolina; Solorza-Calderón, Selene; Álvarez-Borrego, Josué

    2016-05-01

    Automating pattern recognition for fragments of objects is a challenge. For humans it is relatively easy to classify a fragment of an object when it is isolated, although identification can be harder when the fragment is partially overlapped by another object. However, the emulation of the functions of the human eye and brain by a computer is not a trivial issue. This paper presents a pattern recognition digital system based on Fourier binary rings masks in order to classify fragments of objects. The system is invariant to position, scale and rotation, and it is robust in the classification of images that have noise. Moreover, it classifies images that present an occlusion or elimination of approximately 50% of the area of the object.

  4. Novel organic solvent-tolerant esterase isolated by metagenomics: insights into the lipase/esterase classification.

    PubMed

    Berlemont, Renaud; Spee, Olivier; Delsaute, Maud; Lara, Yannick; Schuldes, Jörg; Simon, Carola; Power, Pablo; Daniel, Rolf; Galleni, Moreno

    2013-01-01

    In order to isolate novel organic solvent-tolerant (OST) lipases, a metagenomic library was built using DNA derived from a temperate forest soil sample. A two-step activity-based screening allowed the isolation of a lipolytic clone active in the presence of organic solvents. Sequencing of the plasmid pRBest recovered from the positive clone revealed the presence of a putative lipase/esterase encoding gene. The deduced amino acid sequence (RBest1) contains the conserved lipolytic enzyme signature and is related to the previously described OST lipase from Lysinibacillus sphaericus 205y, which is the sole studied prokaryotic enzyme belonging to the 4.4 α/β hydrolase subgroup (abH04.04). Both in vivo and in vitro studies of the substrate specificity of RBest1, using triacylglycerols or nitrophenyl-esters, respectively, revealed that the enzyme is highly specific for butyrate (C4) compounds, behaving as an esterase rather than a lipase. The RBest1 esterase was purified and biochemically characterized. The optimal esterase activity was observed at pH 6.5 and at temperatures ranging from 38 to 45 °C. Enzymatic activity, determined by hydrolysis of p-nitrophenyl esters, was found to be affected by the presence of different miscible and non-miscible organic solvents, and salts. Noteworthy, RBest1 remains significantly active at high ionic strength. These findings suggest that RBest1 possesses the ability of OST enzymes to molecular adaptation in the presence of organic compounds and resistance of halophilic proteins.

  5. Bounds on the distribution of the number of gaps when circles and lines are covered by fragments: Theory and practical application to genomic and metagenomic projects

    PubMed Central

    Moriarty, John; Marchesi, Julian R; Metcalfe, Anthony

    2007-01-01

    Background The question of how a circle or line segment becomes covered when random arcs are marked off has arisen repeatedly in bioinformatics. The number of uncovered gaps is of particular interest. Approximate distributions for the number of gaps have been given in the literature, one motivation being ease of computation. Error bounds for these approximate distributions have not been given. Results We give bounds on the probability distribution of the number of gaps when a circle is covered by fragments of fixed size. The absolute error in the approximation is typically on the order of 0.1% at 10× coverage depth. The method can be applied to coverage problems on the interval, including edge effects, and applications are given to metagenomic libraries and shotgun sequencing. PMID:17335566
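    The coverage question in this abstract can be explored numerically. The sketch below is not the authors' method and does not reproduce their bounds; it only simulates fixed-length fragments placed uniformly at random on a circle of circumference 1 and compares the mean number of gaps with the expectation n(1 - a)^(n - 1), which follows from linearity of expectation for n fragments of fractional length a. The function names and parameter values are illustrative assumptions.

```python
import random


def count_gaps(starts, frag_len):
    """Count uncovered gaps on a unit circle.

    Each fragment covers [start, start + frag_len) going clockwise.
    A gap follows a fragment when the spacing to the next start point
    exceeds frag_len.
    """
    pts = sorted(s % 1.0 for s in starts)
    # Spacings between consecutive start points, plus the wrap-around spacing.
    spacings = [b - a for a, b in zip(pts, pts[1:])]
    spacings.append(1.0 - (pts[-1] - pts[0]))
    return sum(1 for s in spacings if s > frag_len)


def expected_gaps(n, frag_len):
    """Expected gap count for n uniform fragments (assumes frag_len <= 1)."""
    return n * (1.0 - frag_len) ** (n - 1)


def simulate_mean_gaps(n, frag_len, trials=5000, seed=0):
    """Monte Carlo estimate of the mean number of gaps."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += count_gaps([rng.random() for _ in range(n)], frag_len)
    return total / trials


if __name__ == "__main__":
    n, a = 20, 0.1  # hypothetical values: 20 fragments, 2x coverage depth
    print(simulate_mean_gaps(n, a), expected_gaps(n, a))
```

    Coverage depth in this toy model is n times a, so a 10× depth corresponds to, for example, n = 100 and a = 0.1.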

  6. Carcinogenicity prediction of noncongeneric chemicals by augmented top priority fragment classification.

    PubMed

    Casalegno, Mosè; Sello, Guido

    2016-04-01

    Carcinogenicity prediction is an important process that can be performed to cut down experimental costs and save animal lives. The current reliability of the results is however disputed. Here, a blind exercise in carcinogenicity category assessment is performed using augmented top priority fragment classification. The procedure analyses the applicability domain of the dataset, allocates in clusters the compounds using a leading molecular fragment, and a similarity measure. The exercise is applied to three compound datasets derived from the Lois Gold Carcinogenic Database. The results, showing good agreement with experimental data, are compared with published ones. A final discussion on our viewpoint on the possibilities that the carcinogenicity modelling of chemical compounds offers is presented.

  7. Genomic characterization of Defluviitoga tunisiensis L3, a key hydrolytic bacterium in a thermophilic biogas plant and its abundance as determined by metagenome fragment recruitment.

    PubMed

    Maus, Irena; Cibis, Katharina Gabriela; Bremges, Andreas; Stolze, Yvonne; Wibberg, Daniel; Tomazetto, Geizecler; Blom, Jochen; Sczyrba, Alexander; König, Helmut; Pühler, Alfred; Schlüter, Andreas

    2016-08-20

    The genome sequence of Defluviitoga tunisiensis L3 originating from a thermophilic biogas-production plant was established and recently published as Genome Announcement by our group. The circular chromosome of D. tunisiensis L3 has a size of 2,053,097bp and a mean GC content of 31.38%. To analyze the D. tunisiensis L3 genome sequence in more detail, a phylogenetic analysis of completely sequenced Thermotogae strains based on shared core genes was performed. It appeared that Petrotoga mobilis DSM 10674(T), originally isolated from a North Sea oil-production well, is the closest relative of D. tunisiensis L3. Comparative genome analyses of P. mobilis DSM 10674(T) and D. tunisiensis L3 showed moderate similarities regarding occurrence of orthologous genes. Both genomes share a common set of 1351 core genes. Reconstruction of metabolic pathways important for the biogas production process revealed that the D. tunisiensis L3 genome encodes a large set of genes predicted to facilitate utilization of a variety of complex polysaccharides including cellulose, chitin and xylan. Ethanol, acetate, hydrogen (H2) and carbon dioxide (CO2) were found as possible end-products of the fermentation process. The latter three metabolites are considered to represent substrates for methanogenic Archaea, the key organisms in the final step of the anaerobic digestion process. To determine the degree of relatedness between D. tunisiensis L3 and dominant biogas community members within the thermophilic biogas-production plant, metagenome sequences obtained from the corresponding microbial community were mapped onto the L3 genome sequence. This fragment recruitment revealed that the D. tunisiensis L3 genome is almost completely covered with metagenome sequences featuring high matching accuracy. This result indicates that strains highly related or even identical to the reference strain D. tunisiensis L3 play a dominant role within the community of the thermophilic biogas-production plant. 
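    The statement that the genome is "almost completely covered" by recruited metagenome sequences can be made concrete as a covered fraction. The sketch below is an illustration, not the pipeline used in the study: it assumes alignments are already available as hypothetical (start, end) half-open intervals on a linear coordinate system and ignores both the wrap-around of a circular chromosome and any identity threshold.

```python
def covered_fraction(genome_len, alignments):
    """Fraction of reference positions covered by at least one alignment.

    alignments: iterable of (start, end) pairs, 0-based, half-open.
    Overlapping intervals are merged so positions are counted once.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(alignments):
        # Clip each interval to the reference boundaries.
        start, end = max(0, start), min(genome_len, end)
        if start >= end:
            continue
        if cur_end is None or start > cur_end:
            # Close the previous merged interval and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_len


if __name__ == "__main__":
    # Toy reference of 100 bp with three hypothetical read alignments.
    print(covered_fraction(100, [(0, 50), (40, 80), (90, 100)]))
```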

  9. Localizing text in scene images by boundary clustering, stroke segmentation, and string fragment classification.

    PubMed

    Yi, Chucai; Tian, Yingli

    2012-09-01

    In this paper, we propose a novel framework to extract text regions from scene images with complex backgrounds and multiple text appearances. This framework consists of three main steps: boundary clustering (BC), stroke segmentation, and string fragment classification. In BC, we propose a new bigram-color-uniformity-based method to model both text and attachment surface, and cluster edge pixels based on color pairs and spatial positions into boundary layers. Then, stroke segmentation is performed at each boundary layer by color assignment to extract character candidates. We propose two algorithms to combine the structural analysis of text stroke with color assignment and filter out background interferences. Further, we design a robust string fragment classification based on Gabor-based text features. The features are obtained from feature maps of gradient, stroke distribution, and stroke width. The proposed framework of text localization is evaluated on scene images, born-digital images, broadcast video images, and images of handheld objects captured by blind persons. Experimental results on respective datasets demonstrate that the framework outperforms state-of-the-art localization algorithms.

  10. The classification of complex 4-part humeral fractures revisited: the missing fifth fragment and indications for surgery.

    PubMed

    Russo, Raffaele; Cautiero, Fabio; Della Rotonda, Giuseppe

    2012-05-01

    We describe a new classification of complex 4-part proximal humeral fractures (PHF). Its novelty lies in the involvement of fractures of the calcar area (i.e., the missing fifth fragment) in relation to fragments of the head, tuberosities and shaft. The classification consists of 6 groups (divided into 15 subgroups) of calcar fracture patterns. We hypothesized that this classification could aid surgical decision making in terms of osteosynthesis versus prothesis. To test this hypothesis, two shoulder surgeons, trained in the classification, re-examined the X-rays and CT scans of 100 cases of 4-part PHF to codify each calcar fracture pattern. CT scans proved to be essential for this process. We then theoretically assigned the most appropriate treatment to each subgroup. Subsequent verification of clinical records confirmed our hypothesis that this classification could help the surgeon to decide the best approach to complex 4-part PHF.

  12. Improved glycerol to ethanol conversion by E. coli using a metagenomic fragment isolated from an anaerobic reactor.

    PubMed

    Loaces, Inés; Rodríguez, Cecilia; Amarelle, Vanesa; Fabiano, Elena; Noya, Francisco

    2016-10-01

    Crude glycerol obtained as a by-product of biodiesel production is a reliable feedstock with the potential to be converted into reduced chemicals with high yields. It has been previously shown that ethanol is the primary product of glycerol fermentation by Escherichia coli. However, few efforts were made to enhance this conversion by means of the expression of heterologous genes with the potential to improve glycerol transport or metabolism. In this study, a fosmid-based metagenomic library constructed from an anaerobic reactor purge sludge was screened for genetic elements that promote the use and fermentation of crude glycerol by E. coli. One clone was selected based on its improved growth rate on this feedstock. The corresponding fosmid, named G1, was fully sequenced (41 kbp long) and the gene responsible for the observed phenotype was pinpointed by in vitro insertion mutagenesis. Ethanol production from both pure and crude glycerol was evaluated using the parental G1 clone harboring the ethanologenic plasmid pLOI297 or the industrial strain LY180 complemented with G1. In mineral salts media containing 50 % (v/v) pure glycerol, ethanol concentrations increased two-fold on average when G1 was present in the cells reaching up to 20 g/L after 24 h fermentation. Similar fermentation experiments were done using crude instead of pure glycerol. With an initial OD620 of 8.0, final ethanol concentrations after 24 h were much higher reaching 67 and 75 g/L with LY180 cells carrying the control fosmid or the G1 fosmid, respectively. This translates into a specific ethanol production rate of 0.39 g h(-1) OD(-1) L(-1). PMID:27522660

  13. An advanced fragment analysis-based individualized subtype classification of pediatric acute lymphoblastic leukemia

    PubMed Central

    Zhang, Han; Cheng, Hao; Wang, Qingqing; Zeng, Xianping; Chen, Yanfen; Yan, Jin; Sun, Yanran; Zhao, Xiaoxi; Li, Weijing; Gao, Chao; Gong, Wenyu; Li, Bei; Zhang, Ruidong; Nan, Li; Wu, Yong; Bao, Shilai; Han, Jing-Dong J.; Zheng, Huyong

    2015-01-01

    Pediatric acute lymphoblastic leukemia (ALL) is the most common neoplasm and one of the primary causes of death in children. Its treatment is highly dependent on the correct classification of subtype. Previously, we developed a microarray-based subtype classifier based on the relative expression levels of 62 marker genes, which can predict 7 different ALL subtypes with an accuracy as high as 97% in completely independent samples. Because the classifier is based on gene expression rank values rather than actual values, the classifier enables an individualized diagnosis, without the need to reference the background distribution of the marker genes in a large number of other samples, and also enables cross platform application. Here, we demonstrate that the classifier can be extended from a microarray-based technology to a multiplex qPCR-based technology using the same set of marker genes as the advanced fragment analysis (AFA). Compared to microarray assays, the new assay system makes the convenient, low cost and individualized subtype diagnosis of pediatric ALL a reality and is clinically applicable, particularly in developing countries. PMID:26196328

  14. Enhanced Acylcarnitine Annotation in High-Resolution Mass Spectrometry Data: Fragmentation Analysis for the Classification and Annotation of Acylcarnitines

    PubMed Central

    van der Hooft, Justin J. J.; Ridder, Lars; Barrett, Michael P.; Burgess, Karl E. V.

    2015-01-01

    Metabolite annotation and identification are primary challenges in untargeted metabolomics experiments. Rigorous workflows for reliable annotation of mass features with chemical structures or compound classes are needed to enhance the power of untargeted mass spectrometry. High-resolution mass spectrometry considerably improves the confidence in assigning elemental formulas to mass features in comparison to nominal mass spectrometry, and embedding of fragmentation methods enables more reliable metabolite annotations and facilitates metabolite classification. However, the analysis of mass fragmentation spectra can be a time-consuming step and requires expert knowledge. This study demonstrates how characteristic fragmentations, specific to compound classes, can be used to systematically analyze their presence in complex biological extracts like urine that have undergone untargeted mass spectrometry combined with data dependent or targeted fragmentation. Human urine extracts were analyzed using normal phase liquid chromatography (hydrophilic interaction chromatography) coupled to an Ion Trap-Orbitrap hybrid instrument. Subsequently, mass chromatograms and collision-induced dissociation and higher-energy collisional dissociation (HCD) fragments were annotated using the freely available MAGMa software. Acylcarnitines play a central role in energy metabolism by transporting fatty acids into the mitochondrial matrix. By filtering on a combination of a mass fragment and neutral loss designed based on the MAGMa fragment annotations, we were able to classify and annotate 50 acylcarnitines in human urine extracts, based on high-resolution mass spectrometry HCD fragmentation spectra at different energies for all of them. Of these annotated acylcarnitines, 31 are not described in HMDB yet and for only 4 annotated acylcarnitines the fragmentation spectra could be matched to reference spectra. Therefore, we conclude that the use of mass fragmentation filters within the context

  15. The PhyloPythiaS Web Server for Taxonomic Assignment of Metagenome Sequences

    PubMed Central

    Patil, Kaustubh Raosaheb; Roune, Linus; McHardy, Alice Carolyn

    2012-01-01

    Metagenome sequencing is becoming common and there is an increasing need for easily accessible tools for data analysis. An essential step is the taxonomic classification of sequence fragments. We describe a web server for the taxonomic assignment of metagenome sequences with PhyloPythiaS. PhyloPythiaS is a fast and accurate sequence composition-based classifier that utilizes the hierarchical relationships between clades. Taxonomic assignments with the web server can be made with a generic model, or with sample-specific models that users can specify and create. Several interactive visualization modes and multiple download formats allow quick and convenient analysis and downstream processing of taxonomic assignments. Here, we demonstrate usage of our web server by taxonomic assignment of metagenome samples from an acidophilic biofilm community of an acid mine and of a microbial community from cow rumen. PMID:22745671
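    To illustrate what "sequence composition-based" classification means, the sketch below builds k-mer frequency profiles and assigns a fragment to the reference with the most similar profile. It is a deliberately simplified stand-in and not the PhyloPythiaS model, which is not described in this abstract beyond being composition-based and hierarchy-aware; the function names, the cosine similarity measure, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=4):
    """Relative frequencies of all 4**k DNA k-mers, in a fixed order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # k-mers containing other symbols (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify(fragment, references, k=4):
    """Return the name of the reference whose profile is most similar."""
    prof = kmer_profile(fragment, k)
    return max(
        references,
        key=lambda name: cosine(prof, kmer_profile(references[name], k)),
    )


if __name__ == "__main__":
    # Toy reference "genomes" with very different base composition.
    refs = {
        "AT_rich": "ATATATTTAAATATAT" * 5,
        "GC_rich": "GCGCGGCCGCGCCGGC" * 5,
    }
    print(classify("ATTTATATAA", refs, k=2))
```

    Real classifiers of this kind typically train on many genomes per clade and use longer fragments; the nearest-profile rule here only shows the feature representation.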

  16. The PhyloPythiaS web server for taxonomic assignment of metagenome sequences.

    PubMed

    Patil, Kaustubh Raosaheb; Roune, Linus; McHardy, Alice Carolyn

    2012-01-01

    Metagenome sequencing is becoming common and there is an increasing need for easily accessible tools for data analysis. An essential step is the taxonomic classification of sequence fragments. We describe a web server for the taxonomic assignment of metagenome sequences with PhyloPythiaS. PhyloPythiaS is a fast and accurate sequence composition-based classifier that utilizes the hierarchical relationships between clades. Taxonomic assignments with the web server can be made with a generic model, or with sample-specific models that users can specify and create. Several interactive visualization modes and multiple download formats allow quick and convenient analysis and downstream processing of taxonomic assignments. Here, we demonstrate usage of our web server by taxonomic assignment of metagenome samples from an acidophilic biofilm community of an acid mine and of a microbial community from cow rumen.

  17. The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification

    SciTech Connect

    Reddy, Tatiparthi B. K.; Thomas, Alex D.; Stamatis, Dimitri; Bertsch, Jon; Isbandi, Michelle; Jansson, Jakob; Mallajosyula, Jyothi; Pagani, Ioanna; Lobos, Elizabeth A.; Kyrpides, Nikos C.

    2014-10-27

    The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Within this paper, we report version 5 (v.5) of the database. The newly designed database schema and web user interface supports several new features including the implementation of a four level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. Lastly, GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards.

  18. The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification

    PubMed Central

    Reddy, T.B.K.; Thomas, Alex D.; Stamatis, Dimitri; Bertsch, Jon; Isbandi, Michelle; Jansson, Jakob; Mallajosyula, Jyothi; Pagani, Ioanna; Lobos, Elizabeth A.; Kyrpides, Nikos C.

    2015-01-01

    The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Here we report version 5 (v.5) of the database. The newly designed database schema and web user interface supports several new features including the implementation of a four level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards. PMID:25348402

  19. The Phylogenetic Diversity of Metagenomes

    PubMed Central

    Kembel, Steven W.; Eisen, Jonathan A.; Pollard, Katherine S.; Green, Jessica L.

    2011-01-01

    Phylogenetic diversity—patterns of phylogenetic relatedness among organisms in ecological communities—provides important insights into the mechanisms underlying community assembly. Studies that measure phylogenetic diversity in microbial communities have primarily been limited to a single marker gene approach, using the small subunit of the rRNA gene (SSU-rRNA) to quantify phylogenetic relationships among microbial taxa. In this study, we present an approach for inferring phylogenetic relationships among microorganisms based on the random metagenomic sequencing of DNA fragments. To overcome challenges caused by the fragmentary nature of metagenomic data, we leveraged fully sequenced bacterial genomes as a scaffold to enable inference of phylogenetic relationships among metagenomic sequences from multiple phylogenetic marker gene families. The resulting metagenomic phylogeny can be used to quantify the phylogenetic diversity of microbial communities based on metagenomic data sets. We applied this method to understand patterns of microbial phylogenetic diversity and community assembly along an oceanic depth gradient, and compared our findings to previous studies of this gradient using SSU-rRNA gene and metagenomic analyses. Bacterial phylogenetic diversity was highest at intermediate depths beneath the ocean surface, whereas taxonomic diversity (diversity measured by binning sequences into taxonomically similar groups) showed no relationship with depth. Phylogenetic diversity estimates based on the SSU-rRNA gene and the multi-gene metagenomic phylogeny were broadly concordant, suggesting that our approach will be applicable to other metagenomic data sets for which corresponding SSU-rRNA gene sequences are unavailable. Our approach opens up the possibility of using metagenomic data to study microbial diversity in a phylogenetic context. PMID:21912589

  20. The metagenomic telescope.

    PubMed

    Szalkai, Balázs; Scheer, Ildikó; Nagy, Kinga; Vértessy, Beáta G; Grolmusz, Vince

    2014-01-01

    Next generation sequencing technologies led to the discovery of numerous new microbe species in diverse environmental samples. Some of the new species contain genes never encountered before. Some of these genes encode proteins with novel functions, and some of these genes encode proteins that perform some well-known function in a novel way. A tool, named the Metagenomic Telescope, is described here that applies artificial intelligence methods, and seems to be capable of identifying new protein functions even in the well-studied model organisms. As a proof-of-principle demonstration of the Metagenomic Telescope, we considered DNA repair enzymes in the present work. First we identified proteins in DNA repair in well-known organisms (i.e., proteins in base excision repair, nucleotide excision repair, mismatch repair and DNA break repair); next we applied multiple alignments and then built hidden Markov profiles for each protein separately, across well-researched organisms; next, using public depositories of metagenomes, originating from extreme environments, we identified DNA repair genes in the samples. While the phylogenetic classification of the metagenomic samples is not typically available, we hypothesized that some very special DNA repair strategies need to be applied in bacteria and Archaea living in those extreme circumstances. It is a difficult task to evaluate the results obtained from mostly unknown species; therefore we applied again the hidden Markov profiling: for the identified DNA repair genes in the extreme metagenomes, we prepared new hidden Markov profiles (for each gene separately, subsequent to a cluster analysis); and we searched for similarities to those profiles in model organisms. We have found well known DNA repair proteins, numerous proteins with unknown functions, and also proteins with known, but different functions in the model organisms. PMID:25054802

  2. The Metagenomic Telescope

    PubMed Central

    Szalkai, Balázs; Scheer, Ildikó; Nagy, Kinga; Vértessy, Beáta G.; Grolmusz, Vince

    2014-01-01

    Next-generation sequencing technologies have led to the discovery of numerous new microbe species in diverse environmental samples. Some of the new species contain genes never encountered before. Some of these genes encode proteins with novel functions, and some encode proteins that perform a well-known function in a novel way. A tool, named the Metagenomic Telescope, is described here that applies artificial intelligence methods, and seems to be capable of identifying new protein functions even in well-studied model organisms. As a proof-of-principle demonstration of the Metagenomic Telescope, we considered DNA repair enzymes in the present work. First we identified proteins in DNA repair in well-known organisms (i.e., proteins in base excision repair, nucleotide excision repair, mismatch repair and DNA break repair); next we applied multiple alignments and then built hidden Markov profiles for each protein separately, across well-researched organisms; next, using public depositories of metagenomes originating from extreme environments, we identified DNA repair genes in the samples. While the phylogenetic classification of the metagenomic samples is not typically available, we hypothesized that some very special DNA repair strategies need to be applied in bacteria and Archaea living in those extreme circumstances. It is a difficult task to evaluate the results obtained from mostly unknown species; therefore we applied hidden Markov profiling again: for the identified DNA repair genes in the extreme metagenomes, we prepared new hidden Markov profiles (for each gene separately, subsequent to a cluster analysis); and we searched for similarities to those profiles in model organisms. We have found well-known DNA repair proteins, numerous proteins with unknown functions, and also proteins with known, but different functions in the model organisms. PMID:25054802

  3. Rapid identification and classification of bacteria by 16S rDNA restriction fragment melting curve analyses (RFMCA).

    PubMed

    Rudi, Knut; Kleiberg, Gro H; Heiberg, Ragnhild; Rosnes, Jan T

    2007-08-01

    The aim of this work was to evaluate restriction fragment melting curve analyses (RFMCA) as a novel approach for rapid classification of bacteria during food production. RFMCA was evaluated for bacteria isolated from sous vide food products, and raw materials used for sous vide production. We identified four major bacterial groups in the material analysed (cluster I-Streptococcus, cluster II-Carnobacterium/Bacillus, cluster III-Staphylococcus and cluster IV-Actinomycetales). The accuracy of RFMCA was evaluated by comparison with 16S rDNA sequencing. The strains satisfying the RFMCA quality filtering criteria (73%, n=57), with both 16S rDNA sequence information and RFMCA data (n=45) gave identical group assignments with the two methods. RFMCA enabled rapid and accurate classification of bacteria that is database compatible. Potential application of RFMCA in the food or pharmaceutical industry will include development of classification models for the bacteria expected in a given product, and then to build an RFMCA database as a part of the product quality control. PMID:17367680

  4. Exploration of noncoding sequences in metagenomes.

    PubMed

    Tobar-Tosse, Fabián; Rodríguez, Adrián C; Vélez, Patricia E; Zambrano, María M; Moreno, Pedro A

    2013-01-01

    Environment-dependent genomic features have been defined for different metagenomes, whose genes and their associated processes are related to specific environments. Identification of ORFs and their functional categories is the most common method for associating functional and environmental features. However, this analysis based on finding ORFs misses noncoding sequences and, therefore, some metagenome regulatory or structural information could be discarded. In this work we analyzed 23 whole metagenomes, including coding and noncoding sequences, using the following sequence patterns: (G+C) content, Codon Usage (Cd), Trinucleotide Usage (Tn), and functional assignments for ORF prediction. Herein, we present evidence of a high proportion of noncoding sequences discarded in common similarity-based methods in metagenomics, and the kind of relevant information present in those. We found a high density of trinucleotide repeat sequences (TRS) in noncoding sequences, with a regulatory and adaptive function for metagenome communities. We present associations between trinucleotide values and gene function, where metagenome clustering correlates with microorganism adaptations and kinds of metagenomes. We propose here that noncoding sequences have relevant information to describe metagenomes that could be considered in a whole metagenome analysis in order to improve their organization, classification protocols, and their relation with the environment. PMID:23536879
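
    The two simplest sequence patterns named in this abstract, (G+C) content and trinucleotide usage, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, and counting overlapping 3-mers over unambiguous bases is an assumption made here for illustration.

```python
from collections import Counter

BASES = set("ACGT")


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    counts = Counter(seq.upper())
    acgt = sum(counts[b] for b in "ACGT")
    if acgt == 0:
        return 0.0
    return (counts["G"] + counts["C"]) / acgt


def trinucleotide_usage(seq):
    """Relative frequency of each overlapping 3-mer made only of A, C, G, T."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + 3]
        for i in range(len(seq) - 2)
        if set(seq[i:i + 3]) <= BASES
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tri: n / total for tri, n in counts.items()}


# Toy fragment; a real analysis would iterate over all metagenome sequences.
example = "ATGCGCGCATTTAAACGCGC"
print(round(gc_content(example), 3))
usage = trinucleotide_usage(example)
print(sorted(usage.items(), key=lambda kv: -kv[1])[:3])
```

    Per-metagenome vectors of such frequencies could then be fed to a clustering method; which method the study used is not stated in the abstract.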

  6. METAXA2: improved identification and taxonomic classification of small and large subunit rRNA in metagenomic data.

    PubMed

    Bengtsson-Palme, Johan; Hartmann, Martin; Eriksson, Karl Martin; Pal, Chandan; Thorell, Kaisa; Larsson, Dan Göran Joakim; Nilsson, Rolf Henrik

    2015-11-01

    Ribosomal RNA (rRNA) genes are widely used as genetic markers for taxonomic identification of microbes. The small subunit (SSU; 16S/18S) rRNA gene in particular is frequently used for species- or genus-level identification, but the large subunit (LSU; 23S/28S) rRNA gene is also employed in taxonomic assignment. The METAXA software tool is a popular utility for extracting partial rRNA sequences from large sequencing data sets and assigning them to an archaeal, bacterial, nuclear eukaryote, mitochondrial or chloroplast origin. This study describes a comprehensive update to METAXA - METAXA2 - that extends the capabilities of the tool, introducing support for the LSU rRNA gene, a greatly improved classifier allowing classification down to genus or species level, as well as enhanced support for short-read (100 bp) and paired-end sequences, among other changes. The performance of METAXA2 was compared to other commonly used taxonomic classifiers, showing that METAXA2 often outperforms previous methods in terms of making correct predictions while maintaining a low misclassification rate. METAXA2 is freely available from http://microbiology.se/software/metaxa2/. PMID:25732605

  7. Classification

    ERIC Educational Resources Information Center

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  8. Combined bromodeoxyuridine immunocapture and terminal-restriction fragment length polymorphism analysis highlights differences in the active soil bacterial metagenome due to Glomus mosseae inoculation or plant species.

    PubMed

    Artursson, Veronica; Finlay, Roger D; Jansson, Janet K

    2005-12-01

    High numbers of bacteria are associated with arbuscular mycorrhizal (AM) fungi, but their functions and in situ activities are largely unknown and most have never been characterized. The aim of the present study was to examine the impact of Glomus mosseae inoculation and plant type on the active bacterial communities in soil by using a molecular approach, bromodeoxyuridine (BrdU) immunocapture in combination with terminal-restriction fragment length polymorphism (T-RFLP). This approach, combined with sequence information from clone libraries, enabled the identification of actively growing populations within the total bacterial community. Distinct differences in active bacterial community compositions were found according to G. mosseae inoculation, treatment with an antifungal compound (Benomyl) and plant type. The putative identities of the dominant bacterial species that were activated as a result of G. mosseae inoculation were found to be mostly uncultured bacteria and Paenibacillus species. These populations may represent novel bacterial groups that are able to influence the AM relationship and its subsequent effect on plant growth.

  9. [Expression of the genes CelA and XylA isolated from a fragment of metagenomic DNA in Escherichia coli].

    PubMed

    Shedova, E N; Lunina, N A; Berezina, O V; Zverlov, V V; Schwarz, V; Velikodvorskaia, G A

    2009-01-01

    The glycosyl hydrolase genes cel5A and xyl3A, which we previously isolated within a DNA fragment from a metagenomic library of cow rumen microflora DNA, were subcloned and expressed in E. coli. The recombinant proteins Cel5A and Xyl3A were purified and characterized. Cellulase Cel5A belongs to the Family 5 glycosyl hydrolases and is a one-module 38.2 kDa enzyme that hydrolyses the 1,4-glycoside bonds of soluble cellulose substrates and amorphous cellulose, showing its maximal activity (31200 u/mg) on lichenan, a soluble substrate with mixed (beta-1,3-1,4) bonds. The end product of the amorphous cellulose hydrolysis is cellobiose. Cel5A is inactive toward the crystalline forms of cellulose. Cel5A is an endoglucanase capable of exohydrolysis. The molecular mass of beta-xylosidase Xyl3A, which belongs to the Family 3 glycosyl hydrolases, is 83.7 kDa. The enzyme is active only on xylooligosaccharides, with the maximal activity shown on xylobiose, the end product of the reaction being xylose. No activity on xylan has been observed so far. Recombinant Cel5A and Xyl3A are stable over a wide range of pH and temperatures, their maximal activity being observed at pH 6.5 and at 55 degrees C.

  10. Swine Fecal Metagenomics

    EPA Science Inventory

    Metagenomic approaches are providing rapid and more robust means to investigate the composition and functional genetic potential of complex microbial communities. In this study, we utilized a metagenomic approach to further understand the functional diversity of the swine gut. To...

  11. Evolutionary dynamics of clustered irregularly interspaced short palindromic repeat systems in the ocean metagenome.

    PubMed

    Sorokin, Valery A; Gelfand, Mikhail S; Artamonova, Irena I

    2010-04-01

    Clustered regularly interspaced short palindromic repeats (CRISPRs) form a recently characterized type of prokaryotic antiphage defense system. The phage-host interactions involving CRISPRs have been studied in experiments with selected bacterial or archaeal species and, computationally, in completely sequenced genomes. However, these studies do not allow one to take prokaryotic population diversity and phage-host interaction dynamics into account. This gap can be filled by using metagenomic data: in particular, the largest existing data set, generated from the Sorcerer II Global Ocean Sampling expedition. The application of three publicly available CRISPR recognition programs to the Global Ocean metagenome produced a large proportion of false-positive results. To address this problem, a filtering procedure was designed. It resulted in about 200 reliable CRISPR cassettes, which were then studied in detail. The repeat consensuses were clustered into several stable classes that differed from the existing classification. Short fragments of DNA similar to the cassette spacers were more frequently present in the same geographical location than in other locations (P < 0.0001). We developed a catalogue of elementary CRISPR-forming events and reconstructed the likely evolutionary history of cassettes that had common spacers. Metagenomic collections allow for relatively unbiased analysis of phage-host interactions and CRISPR evolution. The results of this study demonstrate that CRISPR cassettes retain the memory of the local virus population at a particular ocean location. CRISPR evolution may be described using a limited vocabulary of elementary events that have a natural biological interpretation.

  12. Megraft: A software package to graft ribosomal small subunit (16S/18S) fragments onto full-length sequences for accurate species richness and sequencing depth analysis in pyrosequencing-length metagenomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Metagenomic libraries represent subsamples of the total DNA found at a study site and offer unprecedented opportunities to study ecological and functional aspects of microbial communities. To examine the depth of the sequencing effort, rarefaction analysis of the ribosomal small sub-unit (SSU/16S/18...

  13. Classification

    NASA Technical Reports Server (NTRS)

    Oza, Nikunj C.

    2011-01-01

    A supervised learning task involves constructing a mapping from input data (normally described by several features) to the appropriate outputs. Within supervised learning, one type of task is a classification learning task, in which each output is one or more classes to which the input belongs. In supervised learning, a set of training examples---examples with known output values---is used by a learning algorithm to generate a model. This model is intended to approximate the mapping between the inputs and outputs. This model can be used to generate predicted outputs for inputs that have not been seen before. For example, we may have data consisting of observations of sunspots. In a classification learning task, our goal may be to learn to classify sunspots into one of several types. Each example may correspond to one candidate sunspot with various measurements or just an image. A learning algorithm would use the supplied examples to generate a model that approximates the mapping between each supplied set of measurements and the type of sunspot. This model can then be used to classify previously unseen sunspots based on the candidate's measurements. This chapter discusses methods to perform machine learning, with examples involving astronomy.
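
    The train-then-predict workflow described in this abstract can be sketched in a few lines of Python. The nearest-centroid rule, the two numeric features, and the class labels below are all illustrative assumptions; the chapter itself does not prescribe this particular algorithm or data.

```python
from collections import defaultdict
from math import dist


def train_nearest_centroid(examples):
    """Build a model (class label -> mean feature vector) from labeled examples.

    examples: iterable of (feature_vector, class_label) pairs with known outputs.
    """
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical training set: two made-up measurements per candidate object.
training = [
    ((1.0, 0.2), "simple"),
    ((1.2, 0.1), "simple"),
    ((4.8, 2.9), "complex"),
    ((5.1, 3.2), "complex"),
]
model = train_nearest_centroid(training)

# Classify inputs that were not part of the training set.
print(predict(model, (1.1, 0.3)))
print(predict(model, (4.5, 3.0)))
```

    The model here is just one centroid per class, which keeps the mapping from inputs to outputs easy to inspect; more expressive learners follow the same fit-and-predict pattern.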

  14. Livermore Metagenomics Analysis Toolkit

    2012-10-01

    LMAT takes a collection of raw metagenomic sequencer reads as input, searches each read against a reference genome database, assigns a taxonomic label and confidence value to each read, and reports a summary of the predicted taxonomic contents of the metagenomic sample.
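
    The input-to-summary flow described for LMAT (reads in, a label and confidence per read, then a sample-level summary) can be illustrated with a toy Python sketch. This is not LMAT's algorithm or interface: the k-mer overlap score, the value of K, the taxon names, and all function names are assumptions chosen only to make the data flow concrete.

```python
from collections import Counter

K = 4  # toy k-mer length; an assumption for this sketch


def kmers(seq, k=K):
    """Set of all overlapping k-mers in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references):
    """Map each taxon label to the k-mer set of its reference sequence."""
    return {label: kmers(seq) for label, seq in references.items()}


def classify_read(read, index):
    """Return (label, confidence) for one read.

    Confidence is the fraction of the read's k-mers found in the
    best-matching reference; reads with no match are 'unclassified'.
    """
    read_kmers = kmers(read)
    if not read_kmers:
        return ("unclassified", 0.0)
    scores = {
        label: len(read_kmers & ref_kmers) / len(read_kmers)
        for label, ref_kmers in index.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0.0:
        return ("unclassified", 0.0)
    return (best, scores[best])


def summarize(reads, index):
    """Count how many reads were assigned to each label."""
    return Counter(classify_read(read, index)[0] for read in reads)


# Hypothetical reference database and reads.
references = {
    "taxon_A": "ATGCGTACGTTAGCATGCGTACG",
    "taxon_B": "TTTTGGGGCCCCAAAATTTTGGGG",
}
index = build_index(references)
reads = ["GCGTACGTTA", "GGGGCCCCAA", "NNNNNNNNNN"]
for read in reads:
    print(read, classify_read(read, index))
print(summarize(reads, index))
```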

  15. Structural and functional insights from the metagenome of an acidic hot spring microbial planktonic community in the Colombian Andes.

    PubMed

    Jiménez, Diego Javier; Andreote, Fernando Dini; Chaves, Diego; Montaña, José Salvador; Osorio-Forero, Cesar; Junca, Howard; Zambrano, María Mercedes; Baena, Sandra

    2012-01-01

    A taxonomic and annotated functional description of microbial life was deduced from 53 Mb of metagenomic sequence retrieved from a planktonic fraction of the Neotropical high Andean (3,973 meters above sea level) acidic hot spring El Coquito (EC). A classification of unassembled metagenomic reads using different databases showed a high proportion of Gammaproteobacteria and Alphaproteobacteria (in total read affiliation), and through taxonomic affiliation of 16S rRNA gene fragments we observed the presence of Proteobacteria, micro-algae chloroplast and Firmicutes. Reads mapped against the genomes Acidiphilium cryptum JF-5, Legionella pneumophila str. Corby and Acidithiobacillus caldus revealed the presence of transposase-like sequences, potentially involved in horizontal gene transfer. Functional annotation and hierarchical comparison with different datasets obtained by pyrosequencing in different ecosystems showed that the microbial community also contained extensive DNA repair systems, possibly to cope with ultraviolet radiation at such high altitudes. Analysis of genes involved in the nitrogen cycle indicated the presence of dissimilatory nitrate reduction to N2 (narGHI, nirS, norBCDQ and nosZ), associated with Proteobacteria-like sequences. Genes involved in the sulfur cycle (cysDN, cysNC and aprA) indicated adenylsulfate and sulfite production that were affiliated to several bacterial species. In summary, metagenomic sequence data provided insight regarding the structure and possible functions of this hot spring microbial community, describing some groups potentially involved in the nitrogen and sulfur cycling in this environment. PMID:23251687

  16. Structural and Functional Insights from the Metagenome of an Acidic Hot Spring Microbial Planktonic Community in the Colombian Andes

    PubMed Central

    Jiménez, Diego Javier; Andreote, Fernando Dini; Chaves, Diego; Montaña, José Salvador; Osorio-Forero, Cesar; Junca, Howard; Zambrano, María Mercedes; Baena, Sandra

    2012-01-01

    A taxonomic and annotated functional description of microbial life was deduced from 53 Mb of metagenomic sequence retrieved from a planktonic fraction of the Neotropical high Andean (3,973 meters above sea level) acidic hot spring El Coquito (EC). A classification of unassembled metagenomic reads using different databases showed a high proportion of Gammaproteobacteria and Alphaproteobacteria (in total read affiliation), and through taxonomic affiliation of 16S rRNA gene fragments we observed the presence of Proteobacteria, micro-algae chloroplast and Firmicutes. Reads mapped against the genomes Acidiphilium cryptum JF-5, Legionella pneumophila str. Corby and Acidithiobacillus caldus revealed the presence of transposase-like sequences, potentially involved in horizontal gene transfer. Functional annotation and hierarchical comparison with different datasets obtained by pyrosequencing in different ecosystems showed that the microbial community also contained extensive DNA repair systems, possibly to cope with ultraviolet radiation at such high altitudes. Analysis of genes involved in the nitrogen cycle indicated the presence of dissimilatory nitrate reduction to N2 (narGHI, nirS, norBCDQ and nosZ), associated with Proteobacteria-like sequences. Genes involved in the sulfur cycle (cysDN, cysNC and aprA) indicated adenylsulfate and sulfite production that were affiliated to several bacterial species. In summary, metagenomic sequence data provided insight regarding the structure and possible functions of this hot spring microbial community, describing some groups potentially involved in the nitrogen and sulfur cycling in this environment. PMID:23251687

  17. Metagenomics and probiotics.

    PubMed

    Gueimonde, M; Collado, M C

    2012-07-01

    The development of extensive sequencing methods has allowed metagenomic studies on the human gut microbiome to be carried out. This has tremendously increased our knowledge on gut microbiota composition and activity, allowing microbiota aberrations related to different diseases to be identified. These aberrations constitute targets for the development of probiotics directed to correct them. Probiotics are extensively used to modulate gut microbiota. Nevertheless, metagenomic studies on the effects of probiotics are still very scarce. In the near future, the use of metagenomics promises to expand our understanding of probiotic action.

  18. Reference databases for taxonomic assignment in metagenomics.

    PubMed

    Santamaria, Monica; Fosso, Bruno; Consiglio, Arianna; De Caro, Giorgio; Grillo, Giorgio; Licciulli, Flavio; Liuni, Sabino; Marzano, Marinella; Alonso-Alemany, Daniel; Valiente, Gabriel; Pesole, Graziano

    2012-11-01

    Metagenomics is providing an unprecedented access to the environmental microbial diversity. The amplicon-based metagenomics approach involves the PCR-targeted sequencing of a genetic locus fitting different features. Namely, it must be ubiquitous in the taxonomic range of interest, variable enough to discriminate between different species but flanked by highly conserved sequences, and of suitable size to be sequenced through next-generation platforms. The internal transcribed spacers 1 and 2 (ITS1 and ITS2) of the ribosomal DNA operon and one or more hyper-variable regions of 16S ribosomal RNA gene are typically used to identify fungal and bacterial species, respectively. In this context, reliable reference databases and taxonomies are crucial to assign amplicon sequence reads to the correct phylogenetic ranks. Several resources provide consistent phylogenetic classification of publicly available 16S ribosomal DNA sequences, whereas the state of ribosomal internal transcribed spacers reference databases is notably less advanced. In this review, we aim to give an overview of existing reference resources for both types of markers, highlighting strengths and possible shortcomings of their use for metagenomics purposes. Moreover, we present a new database, ITSoneDB, of well annotated and phylogenetically classified ITS1 sequences to be used as a reference collection in metagenomic studies of environmental fungal communities. ITSoneDB is available for download and browsing at http://itsonedb.ba.itb.cnr.it/.

  19. Metagenomics and antibiotics.

    PubMed

    Garmendia, L; Hernandez, A; Sanchez, M B; Martinez, J L

    2012-07-01

    Most of the bacterial species that form part of the biosphere have never been cultivated. In this situation, a comprehensive study of bacterial communities requires the utilization of non-culture-based methods, which have been named metagenomics. In this paper we review the use of different metagenomic techniques for understanding the effect of antibiotics on microbial communities, to synthesize new antimicrobial compounds and to analyse the distribution of antibiotic resistance genes in different ecosystems. These techniques include functional metagenomics, which serves to find new antibiotics or new antibiotic resistance genes, and descriptive metagenomics, which serves to analyse changes in the composition of the microbiota and to track the presence and abundance of already known antibiotic resistance genes in different ecosystems.

  20. Random whole metagenomic sequencing for forensic discrimination of soils.

    PubMed

    Khodakova, Anastasia S; Smith, Renee J; Burgoyne, Leigh; Abarno, Damien; Linacre, Adrian

    2014-01-01

    Here we assess the ability of random whole metagenomic sequencing approaches to discriminate between similar soils from two geographically distinct urban sites for application in forensic science. Repeat samples from two parklands in residential areas separated by approximately 3 km were collected and the DNA was extracted. Shotgun, whole genome amplification (WGA) and single arbitrarily primed DNA amplification (AP-PCR) based sequencing techniques were then used to generate soil metagenomic profiles. Full and subsampled metagenomic datasets were then annotated against M5NR/M5RNA (taxonomic classification) and SEED Subsystems (metabolic classification) databases. Further comparative analyses were performed using a number of statistical tools including: hierarchical agglomerative clustering (CLUSTER); similarity profile analysis (SIMPROF); non-metric multidimensional scaling (NMDS); and canonical analysis of principal coordinates (CAP) at all major levels of taxonomic and metabolic classification. Our data showed that shotgun and WGA-based approaches generated highly similar metagenomic profiles for the soil samples such that the soil samples could not be distinguished accurately. An AP-PCR based approach was shown to be successful at obtaining reproducible site-specific metagenomic DNA profiles, which in turn were employed for successful discrimination of visually similar soil samples collected from two different locations. PMID:25111003

  1. Ocean microbial metagenomics

    NASA Astrophysics Data System (ADS)

    Kerkhof, Lee J.; Goodman, Robert M.

    2009-09-01

    Technology for accessing the genomic DNA of microorganisms, directly from environmental samples without prior cultivation, has opened new vistas to understanding microbial diversity and functions. Especially as applied to soils and the oceans, environments on Earth where microbial diversity is vast, metagenomics and its emergent approaches have the power to transform rapidly our understanding of environmental microbiology. Here we explore select recent applications of the metagenomic suite to ocean microbiology.

  2. A primer on metagenomics.

    PubMed

    Wooley, John C; Godzik, Adam; Friedberg, Iddo

    2010-02-26

    Metagenomics is a discipline that enables the genomic study of uncultured microorganisms. Faster, cheaper sequencing technologies and the ability to sequence uncultured microbes sampled directly from their habitats are expanding and transforming our view of the microbial world. Distilling meaningful information from the millions of new genomic sequences presents a serious challenge to bioinformaticians. In cultured microbes, the genomic data come from a single clone, making sequence assembly and annotation tractable. In metagenomics, the data come from heterogeneous microbial communities, sometimes containing more than 10,000 species, with the sequence data being noisy and partial. From sampling, to assembly, to gene calling and function prediction, bioinformatics faces new demands in interpreting voluminous, noisy, and often partial sequence data. Although metagenomics is a relative newcomer to science, the past few years have seen an explosion in computational methods applied to metagenomic-based research. It is therefore not within the scope of this article to provide an exhaustive review. Rather, we provide here a concise yet comprehensive introduction to the current computational requirements presented by metagenomics, and review the recent progress made. We also note whether there is software that implements any of the methods presented here, and briefly review its utility. Nevertheless, it would be useful if readers of this article would avail themselves of the comment section provided by this journal, and relate their own experiences. Finally, the last section of this article provides a few representative studies illustrating different facets of recent scientific discoveries made using metagenomics.

  3. Reconstruction of Novel Cyanobacterial Siphovirus Genomes from Mediterranean Metagenomic Fosmids

    PubMed Central

    Mizuno, Carolina Megumi; Garcia-Heredia, Inmaculada; Martin-Cuadrado, Ana-Belen; Ghai, Rohit

    2013-01-01

    Cellular metagenomes are primarily used for investigating microbial community structure and function. However, cloned fosmids from such metagenomes capture phage genome fragments that can be used as a source of phage genomes. We show that fosmid cloning from cellular metagenomes and sequencing at a high coverage is a credible alternative to constructing metaviriomes and allows capturing and assembling novel, complete phage genomes. It is likely that phages recovered from cellular metagenomes are those replicating within cells during sample collection and represent “active” phages, naturally amplifying their genomic DNA and increasing chances for cloning. We describe five sets of siphoviral contigs (MEDS1, MEDS2, MEDS3, MEDS4, and MEDS5), obtained by sequencing fosmids from the cellular metagenome of the deep chlorophyll maximum in the Mediterranean. Three of these represent complete siphoviral genomes and two represent partial ones. This is the first set of phage genomes assembled directly from cellular metagenomic fosmid libraries. They exhibit low sequence similarities to one another and to known siphoviruses but are remarkably similar in overall genome architecture. We present evidence suggesting they infect picocyanobacteria, likely Synechococcus. Four of these sets also define a novel branch in the phylogenetic tree of phage large subunit terminases. Moreover, some of these siphoviral groups are globally distributed and abundant in the oceans, comparable to some known myoviruses and podoviruses. This suggests that, as more siphoviral genomes become available, we will be better able to assess the abundance and influence of this diverse and polyphyletic group in the marine habitat. PMID:23160125

  4. Metagenomic mining for microbiologists.

    PubMed

    Delmont, Tom O; Malandain, Cedric; Prestat, Emmanuel; Larose, Catherine; Monier, Jean-Michel; Simonet, Pascal; Vogel, Timothy M

    2011-12-01

    Microbial ecologists can now start digging into the accumulating mountains of metagenomic data to uncover the occurrence of functional genes and their correlations to microbial community members. Limitations and biases in DNA extraction and sequencing technologies impact sequence distributions, and therefore, have to be considered. However, when comparing metagenomes from widely differing environments, these fluctuations have a relatively minor role in microbial community discrimination. As a consequence, any functional gene or species distribution pattern can be compared among metagenomes originating from various environments and projects. In particular, global comparisons would help to define ecosystem specificities, such as involvement and response to climate change (for example, carbon and nitrogen cycle), human health risks (eg, presence of pathogen species, toxin genes and viruses) and biodegradation capacities. Although not all scientists have easy access to high-throughput sequencing technologies, they do have access to the sequences that have been deposited in databases, and therefore, can begin to intensively mine these metagenomic data to generate hypotheses that can be validated experimentally. Information about metabolic functions and microbial species compositions can already be compared among metagenomes from different ecosystems. These comparisons add to our understanding about microbial adaptation and the role of specific microbes in different ecosystems. Concurrent with the rapid growth of sequencing technologies, we have entered a new age of microbial ecology, which will enable researchers to experimentally confirm putative relationships between microbial functions and community structures.

  5. METAGENassist: a comprehensive web server for comparative metagenomics.

    PubMed

    Arndt, David; Xia, Jianguo; Liu, Yifeng; Zhou, You; Guo, An Chi; Cruz, Joseph A; Sinelnikov, Igor; Budwill, Karen; Nesbø, Camilla L; Wishart, David S

    2012-07-01

    With recent improvements in DNA sequencing and sample extraction techniques, the quantity and quality of metagenomic data are now growing exponentially. This abundance of richly annotated metagenomic data and bacterial census information has spawned a new branch of microbiology called comparative metagenomics. Comparative metagenomics involves the comparison of bacterial populations between different environmental samples, different culture conditions or different microbial hosts. However, in order to do comparative metagenomics, one typically requires a sophisticated knowledge of multivariate statistics and/or advanced software programming skills. To make comparative metagenomics more accessible to microbiologists, we have developed a freely accessible, easy-to-use web server for comparative metagenomic analysis called METAGENassist. Users can upload their bacterial census data from a wide variety of common formats, using either amplified 16S rRNA data or shotgun metagenomic data. Metadata concerning environmental, culture, or host conditions can also be uploaded. During the data upload process, METAGENassist also performs an automated taxonomic-to-phenotypic mapping. Phenotypic information covering nearly 20 functional categories such as GC content, genome size, oxygen requirements, energy sources and preferred temperature range is automatically generated from the taxonomic input data. Using this phenotypically enriched data, users can then perform a variety of multivariate and univariate data analyses including fold change analysis, t-tests, PCA, PLS-DA, clustering and classification. To facilitate data processing, users are guided through a step-by-step analysis workflow using a variety of menus, information hyperlinks and check boxes. METAGENassist also generates colorful, publication quality tables and graphs that can be downloaded and used directly in the preparation of scientific papers. METAGENassist is available at http://www.metagenassist.ca.

  6. Metagenomics of extreme environments.

    PubMed

    Cowan, D A; Ramond, J-B; Makhalanyane, T P; De Maayer, P

    2015-06-01

    Whether they are exposed to extremes of heat or cold, or buried deep beneath the Earth's surface, microorganisms have an uncanny ability to survive under these conditions. This ability to survive has fascinated scientists for nearly a century, but the recent development of metagenomics and 'omics' tools has allowed us to make huge leaps in understanding the remarkable complexity and versatility of extremophile communities. Here, in the context of the recently developed metagenomic tools, we discuss recent research on the community composition, adaptive strategies and biological functions of extremophiles. PMID:26048196

  7. Metagenomics of extreme environments.

    PubMed

    Cowan, D A; Ramond, J-B; Makhalanyane, T P; De Maayer, P

    2015-06-01

    Whether they are exposed to extremes of heat or cold, or buried deep beneath the Earth's surface, microorganisms have an uncanny ability to survive under these conditions. This ability to survive has fascinated scientists for nearly a century, but the recent development of metagenomics and 'omics' tools has allowed us to make huge leaps in understanding the remarkable complexity and versatility of extremophile communities. Here, in the context of the recently developed metagenomic tools, we discuss recent research on the community composition, adaptive strategies and biological functions of extremophiles.

  8. Recent progresses in metagenomics

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Metagenomics addresses the collective genetic structure and functional composition of a microbial community at its native habitat. This approach has emerged as a powerful tool to study the structure and function of the microbiota for the past few years and is revolutionizing studies of microbial ec...

  9. A study of the effectiveness of machine learning methods for classification of clinical interview fragments into a large number of categories.

    PubMed

    Hasan, Mehedi; Kotov, Alexander; Idalski Carcone, April; Dong, Ming; Naar, Sylvie; Brogan Hartlieb, Kathryn

    2016-08-01

    This study examines the effectiveness of state-of-the-art supervised machine learning methods in conjunction with different feature types for the task of automatic annotation of fragments of clinical text based on codebooks with a large number of categories. We used a collection of motivational interview transcripts consisting of 11,353 utterances, which were manually annotated by two human coders as the gold standard, and experimented with state-of-art classifiers, including Naïve Bayes, J48 Decision Tree, Support Vector Machine (SVM), Random Forest (RF), AdaBoost, DiscLDA, Conditional Random Fields (CRF) and Convolutional Neural Network (CNN) in conjunction with lexical, contextual (label of the previous utterance) and semantic (distribution of words in the utterance across the Linguistic Inquiry and Word Count dictionaries) features. We found out that, when the number of classes is large, the performance of CNN and CRF is inferior to SVM. When only lexical features were used, interview transcripts were automatically annotated by SVM with the highest classification accuracy among all classifiers of 70.8%, 61% and 53.7% based on the codebooks consisting of 17, 20 and 41 codes, respectively. Using contextual and semantic features, as well as their combination, in addition to lexical ones, improved the accuracy of SVM for annotation of utterances in motivational interview transcripts with a codebook consisting of 17 classes to 71.5%, 74.2%, and 75.1%, respectively. Our results demonstrate the potential of using machine learning methods in conjunction with lexical, semantic and contextual features for automatic annotation of clinical interview transcripts with near-human accuracy.

  10. A study of the effectiveness of machine learning methods for classification of clinical interview fragments into a large number of categories.

    PubMed

    Hasan, Mehedi; Kotov, Alexander; Idalski Carcone, April; Dong, Ming; Naar, Sylvie; Brogan Hartlieb, Kathryn

    2016-08-01

    This study examines the effectiveness of state-of-the-art supervised machine learning methods in conjunction with different feature types for the task of automatic annotation of fragments of clinical text based on codebooks with a large number of categories. We used a collection of motivational interview transcripts consisting of 11,353 utterances, which were manually annotated by two human coders as the gold standard, and experimented with state-of-art classifiers, including Naïve Bayes, J48 Decision Tree, Support Vector Machine (SVM), Random Forest (RF), AdaBoost, DiscLDA, Conditional Random Fields (CRF) and Convolutional Neural Network (CNN) in conjunction with lexical, contextual (label of the previous utterance) and semantic (distribution of words in the utterance across the Linguistic Inquiry and Word Count dictionaries) features. We found out that, when the number of classes is large, the performance of CNN and CRF is inferior to SVM. When only lexical features were used, interview transcripts were automatically annotated by SVM with the highest classification accuracy among all classifiers of 70.8%, 61% and 53.7% based on the codebooks consisting of 17, 20 and 41 codes, respectively. Using contextual and semantic features, as well as their combination, in addition to lexical ones, improved the accuracy of SVM for annotation of utterances in motivational interview transcripts with a codebook consisting of 17 classes to 71.5%, 74.2%, and 75.1%, respectively. Our results demonstrate the potential of using machine learning methods in conjunction with lexical, semantic and contextual features for automatic annotation of clinical interview transcripts with near-human accuracy. PMID:27185608

  11. Beyond Biodiversity: Fish Metagenomes

    PubMed Central

    Ardura, Alba; Planes, Serge; Garcia-Vazquez, Eva

    2011-01-01

    Biodiversity and intra-specific genetic diversity are interrelated and determine the potential of a community to survive and evolve. Both are considered together in Prokaryote communities treated as metagenomes or ensembles of functional variants beyond species limits. Many factors alter biodiversity in higher Eukaryote communities, and human exploitation can be one of the most important for some groups of plants and animals. For example, fisheries can modify both biodiversity and genetic diversity (intra specific). Intra-specific diversity can be drastically altered by overfishing. Intense fishing pressure on one stock may imply extinction of some genetic variants and subsequent loss of intra-specific diversity. The objective of this study was to apply a metagenome approach to fish communities and explore its value for rapid evaluation of biodiversity and genetic diversity at community level. Here we have applied the metagenome approach employing the Barcoding target gene COI as a model sequence in catch from four very different fish assemblages exploited by fisheries: freshwater communities from the Amazon River and northern Spanish rivers, and marine communities from the Cantabric and Mediterranean seas. Treating all sequences obtained from each regional catch as a biological unit (exploited community) we found that metagenomic diversity indices of the Amazonian catch sample here examined were lower than expected. Reduced diversity could be explained, at least partially, by overexploitation of the fish community that had been independently estimated by other methods. We propose using a metagenome approach for estimating diversity in Eukaryote communities and early evaluating genetic variation losses at multi-species level. PMID:21829636
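
    The abstract above refers to "metagenomic diversity indices" without naming them. As a minimal sketch only, assuming a Shannon index computed over sequence variant labels (the function name `shannon_index` and the choice of index are illustrative assumptions, not the authors' stated method), such an index could be calculated as follows:

```python
import math
from collections import Counter

def shannon_index(labels):
    """Shannon diversity H = -sum(p_i * ln p_i) over observed variant labels.

    `labels` is a hypothetical list with one entry per sequence in a catch,
    e.g. a COI haplotype identifier. This is an assumed index, not
    necessarily the one used in the study.
    """
    counts = Counter(labels)
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log(c / n) for c in counts.values())
```

    Treating each regional catch as one pooled unit, as the abstract describes, would mean calling the function once per catch on all of its sequence labels.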

  12. Beyond biodiversity: fish metagenomes.

    PubMed

    Ardura, Alba; Planes, Serge; Garcia-Vazquez, Eva

    2011-01-01

    Biodiversity and intra-specific genetic diversity are interrelated and determine the potential of a community to survive and evolve. Both are considered together in Prokaryote communities treated as metagenomes or ensembles of functional variants beyond species limits. Many factors alter biodiversity in higher Eukaryote communities, and human exploitation can be one of the most important for some groups of plants and animals. For example, fisheries can modify both biodiversity and genetic diversity (intra specific). Intra-specific diversity can be drastically altered by overfishing. Intense fishing pressure on one stock may imply extinction of some genetic variants and subsequent loss of intra-specific diversity. The objective of this study was to apply a metagenome approach to fish communities and explore its value for rapid evaluation of biodiversity and genetic diversity at community level. Here we have applied the metagenome approach employing the barcoding target gene coi as a model sequence in catch from four very different fish assemblages exploited by fisheries: freshwater communities from the Amazon River and northern Spanish rivers, and marine communities from the Cantabric and Mediterranean seas. Treating all sequences obtained from each regional catch as a biological unit (exploited community) we found that metagenomic diversity indices of the Amazonian catch sample here examined were lower than expected. Reduced diversity could be explained, at least partially, by overexploitation of the fish community that had been independently estimated by other methods. We propose using a metagenome approach for estimating diversity in Eukaryote communities and early evaluating genetic variation losses at multi-species level.

  13. Recovering full-length viral genomes from metagenomes

    PubMed Central

    Smits, Saskia L.; Bodewes, Rogier; Ruiz-González, Aritz; Baumgärtner, Wolfgang; Koopmans, Marion P.; Osterhaus, Albert D. M. E.; Schürch, Anita C.

    2015-01-01

    Infectious disease metagenomics is driven by the question: “what is causing the disease?” in contrast to classical metagenome studies which are guided by “what is out there?” In case of a novel virus, a first step to eventually establishing etiology can be to recover a full-length viral genome from a metagenomic sample. However, retrieval of a full-length genome of a divergent virus is technically challenging and can be time-consuming and costly. Here we discuss different assembly and fragment linkage strategies such as iterative assembly, motif searches, k-mer frequency profiling, coverage profile binning, and other strategies used to recover genomes of potential viral pathogens in a timely and cost-effective manner. PMID:26483782
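
    Of the strategies listed in the abstract above, k-mer frequency profiling is the easiest to illustrate. The following is a minimal sketch, assuming tetranucleotide (k = 4) profiles compared by Manhattan distance; the function names, the default k, and the distance measure are illustrative assumptions rather than details taken from the review:

```python
from collections import Counter
from itertools import product

def kmer_profile(seq, k=4):
    """Return normalized k-mer frequencies of `seq` in a fixed ACGT order.

    Windows containing characters other than A, C, G, T (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]

def profile_distance(p, q):
    """Manhattan distance between two profiles of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))
```

    Contigs whose profiles lie close together are candidates for originating from the same genome, which is the intuition behind using such profiles to link fragments.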

  14. Accessing the Soil Metagenome for Studies of Microbial Diversity

    PubMed Central

    Delmont, Tom O.; Robe, Patrick; Cecillon, Sébastien; Clark, Ian M.; Constancias, Florentin; Simonet, Pascal; Hirsch, Penny R.; Vogel, Timothy M.

    2011-01-01

    Soil microbial communities contain the highest level of prokaryotic diversity of any environment, and metagenomic approaches involving the extraction of DNA from soil can improve our access to these communities. Most analyses of soil biodiversity and function assume that the DNA extracted represents the microbial community in the soil, but subsequent interpretations are limited by the DNA recovered from the soil. Unfortunately, extraction methods do not provide a uniform and unbiased subsample of metagenomic DNA, and as a consequence, accurate species distributions cannot be determined. Moreover, any bias will propagate errors in estimations of overall microbial diversity and may exclude some microbial classes from study and exploitation. To improve metagenomic approaches, investigate DNA extraction biases, and provide tools for assessing the relative abundances of different groups, we explored the biodiversity of the accessible community DNA by fractioning the metagenomic DNA as a function of (i) vertical soil sampling, (ii) density gradients (cell separation), (iii) cell lysis stringency, and (iv) DNA fragment size distribution. Each fraction had a unique genetic diversity, with different predominant and rare species (based on ribosomal intergenic spacer analysis [RISA] fingerprinting and phylochips). All fractions contributed to the number of bacterial groups uncovered in the metagenome, thus increasing the DNA pool for further applications. Indeed, we were able to access a more genetically diverse proportion of the metagenome (a gain of more than 80% compared to the best single extraction method), limit the predominance of a few genomes, and increase the species richness per sequencing effort. This work stresses the difference between extracted DNA pools and the currently inaccessible complete soil metagenome. PMID:21183646

  15. Metagenomic Analysis of Bacterial Communities of Antarctic Surface Snow.

    PubMed

    Lopatina, Anna; Medvedeva, Sofia; Shmakov, Sergey; Logacheva, Maria D; Krylenkov, Vjacheslav; Severinov, Konstantin

    2016-01-01

    The diversity of bacteria present in surface snow around four Russian stations in Eastern Antarctica was studied by high throughput sequencing of amplified 16S rRNA gene fragments and shotgun metagenomic sequencing. Considerable class- and genus-level variation between the samples was revealed indicating a presence of inter-site diversity of bacteria in Antarctic snow. Flavobacterium was a major genus in one sampling site and was also detected in other sites. The diversity of flavobacterial type II-C CRISPR spacers in the samples was investigated by metagenome sequencing. Thousands of unique spacers were revealed with less than 35% overlap between the sampling sites, indicating an enormous natural variety of flavobacterial CRISPR spacers and, by extension, high level of adaptive activity of the corresponding CRISPR-Cas system. None of the spacers matched known spacers of flavobacterial isolates from the Northern hemisphere. Moreover, the percentage of spacers with matches with Antarctic metagenomic sequences obtained in this work was significantly higher than with sequences from much larger publically available environmental metagenomic database. The results indicate that despite the overall very high level of diversity, Antarctic Flavobacteria comprise a separate pool that experiences pressures from mobile genetic elements different from those present in other parts of the world. The results also establish analysis of metagenomic CRISPR spacer content as a powerful tool to study bacterial populations diversity. PMID:27064693
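
    The abstract above reports less than 35% overlap of unique spacers between sampling sites but does not define the overlap measure. As a sketch under that caveat, one plausible definition is the number of shared unique spacers divided by the size of the smaller set (the function name `spacer_overlap` and this definition are assumptions):

```python
def spacer_overlap(site_a, site_b):
    """Fraction of unique spacers shared between two sites.

    Normalized by the smaller set of unique spacers; this normalization is
    an assumed definition, not one stated in the abstract.
    """
    a, b = set(site_a), set(site_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))
```

    Under this definition, a value below 0.35 for every pair of sites would correspond to the reported result.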

  16. Metagenomic Analysis of Bacterial Communities of Antarctic Surface Snow

    PubMed Central

    Lopatina, Anna; Medvedeva, Sofia; Shmakov, Sergey; Logacheva, Maria D.; Krylenkov, Vjacheslav; Severinov, Konstantin

    2016-01-01

    The diversity of bacteria present in surface snow around four Russian stations in Eastern Antarctica was studied by high throughput sequencing of amplified 16S rRNA gene fragments and shotgun metagenomic sequencing. Considerable class- and genus-level variation between the samples was revealed indicating a presence of inter-site diversity of bacteria in Antarctic snow. Flavobacterium was a major genus in one sampling site and was also detected in other sites. The diversity of flavobacterial type II-C CRISPR spacers in the samples was investigated by metagenome sequencing. Thousands of unique spacers were revealed with less than 35% overlap between the sampling sites, indicating an enormous natural variety of flavobacterial CRISPR spacers and, by extension, high level of adaptive activity of the corresponding CRISPR-Cas system. None of the spacers matched known spacers of flavobacterial isolates from the Northern hemisphere. Moreover, the percentage of spacers with matches with Antarctic metagenomic sequences obtained in this work was significantly higher than with sequences from much larger publically available environmental metagenomic database. The results indicate that despite the overall very high level of diversity, Antarctic Flavobacteria comprise a separate pool that experiences pressures from mobile genetic elements different from those present in other parts of the world. The results also establish analysis of metagenomic CRISPR spacer content as a powerful tool to study bacterial populations diversity. PMID:27064693

  17. MetaStorm: A Public Resource for Customizable Metagenomics Annotation.

    PubMed

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution. PMID:27632579

  18. MetaStorm: A Public Resource for Customizable Metagenomics Annotation

    PubMed Central

    Arango-Argoty, Gustavo; Singh, Gargi; Heath, Lenwood S.; Pruden, Amy; Xiao, Weidong; Zhang, Liqing

    2016-01-01

    Metagenomics is a trending research area, calling for the need to analyze large quantities of data generated from next generation DNA sequencing technologies. The need to store, retrieve, analyze, share, and visualize such data challenges current online computational systems. Interpretation and annotation of specific information is especially a challenge for metagenomic data sets derived from environmental samples, because current annotation systems only offer broad classification of microbial diversity and function. Moreover, existing resources are not configured to readily address common questions relevant to environmental systems. Here we developed a new online user-friendly metagenomic analysis server called MetaStorm (http://bench.cs.vt.edu/MetaStorm/), which facilitates customization of computational analysis for metagenomic data sets. Users can upload their own reference databases to tailor the metagenomics annotation to focus on various taxonomic and functional gene markers of interest. MetaStorm offers two major analysis pipelines: an assembly-based annotation pipeline and the standard read annotation pipeline used by existing web servers. These pipelines can be selected individually or together. Overall, MetaStorm provides enhanced interactive visualization to allow researchers to explore and manipulate taxonomy and functional annotation at various levels of resolution. PMID:27632579

  19. Hot Spring Metagenomics

    PubMed Central

    López-López, Olalla; Cerdán, María Esperanza; González-Siso, María Isabel

    2013-01-01

    Hot springs have been investigated since the XIX century, but isolation and examination of their thermophilic microbial inhabitants did not start until the 1950s. Many thermophilic microorganisms and their viruses have since been discovered, although the real complexity of thermal communities was envisaged when research based on PCR amplification of the 16S rRNA genes arose. Thereafter, the possibility of cloning and sequencing the total environmental DNA, defined as metagenome, and the study of the genes rescued in the metagenomic libraries and assemblies made it possible to gain a more comprehensive understanding of microbial communities—their diversity, structure, the interactions existing between their components, and the factors shaping the nature of these communities. In the last decade, hot springs have been a source of thermophilic enzymes of industrial interest, encouraging further study of the poorly understood diversity of microbial life in these habitats. PMID:25369743

  20. Hot spring metagenomics.

    PubMed

    López-López, Olalla; Cerdán, María Esperanza; González-Siso, María Isabel

    2013-01-01

    Hot springs have been investigated since the XIX century, but isolation and examination of their thermophilic microbial inhabitants did not start until the 1950s. Many thermophilic microorganisms and their viruses have since been discovered, although the real complexity of thermal communities was envisaged when research based on PCR amplification of the 16S rRNA genes arose. Thereafter, the possibility of cloning and sequencing the total environmental DNA, defined as metagenome, and the study of the genes rescued in the metagenomic libraries and assemblies made it possible to gain a more comprehensive understanding of microbial communities-their diversity, structure, the interactions existing between their components, and the factors shaping the nature of these communities. In the last decade, hot springs have been a source of thermophilic enzymes of industrial interest, encouraging further study of the poorly understood diversity of microbial life in these habitats. PMID:25369743

  1. Databases of the marine metagenomics.

    PubMed

    Mineta, Katsuhiko; Gojobori, Takashi

    2016-02-01

    The metagenomic data obtained from marine environments are highly useful for understanding marine microbial communities. In comparison with the conventional amplicon-based approach of metagenomics, the recent shotgun sequencing-based approach has become a powerful tool that provides an efficient way of grasping the diversity of the entire microbial community at a sampling point in the sea. However, this approach accelerates the accumulation of metagenome data as well as the increase in data complexity. Moreover, when a metagenomic approach is used for monitoring changes over time in marine environments at multiple seawater locations, metagenomic data will accumulate at an enormous rate. Because this situation is becoming a reality at many marine research institutions and stations all over the world, it seems clear that data management and analysis will be confronted by so-called Big Data issues, such as how a database can be constructed efficiently and how useful knowledge should be extracted from a vast amount of data. In this review, we outline all the major marine metagenome databases that are currently publicly available, noting that no database is devoted exclusively to marine metagenomes and that the number of metagenome databases including marine metagenome data is six, which is still unexpectedly small. We also extend our explanation to what we call reference databases, which will be useful for constructing a marine metagenome database and for complementing it with important information. Finally, we point out a number of challenges to be overcome in constructing a marine metagenome database.

  2. Microbial Metagenomics: Beyond the Genome

    NASA Astrophysics Data System (ADS)

    Gilbert, Jack A.; Dupont, Christopher L.

    2011-01-01

    Metagenomics literally means “beyond the genome.” Marine microbial metagenomic databases presently comprise ~400 billion base pairs of DNA, only ~3% of that found in 1 ml of seawater. Very soon a trillion-base-pair sequence run will be feasible, so it is time to reflect on what we have learned from metagenomics. We review the impact of metagenomics on our understanding of marine microbial communities. We consider the studies facilitated by data generated through the Global Ocean Sampling expedition, as well as the revolution wrought at the individual laboratory level through next generation sequencing technologies. We review recent studies and discoveries since 2008, provide a discussion of bioinformatic analyses, including conceptual pipelines and sequence annotation and predict the future of metagenomics, with suggestions of collaborative community studies tailored toward answering some of the fundamental questions in marine microbial ecology.

  3. Microbial metagenomics: beyond the genome.

    PubMed

    Gilbert, Jack A; Dupont, Christopher L

    2011-01-01

    Metagenomics literally means "beyond the genome." Marine microbial metagenomic databases presently comprise approximately 400 billion base pairs of DNA, only approximately 3% of that found in 1 ml of seawater. Very soon a trillion-base-pair sequence run will be feasible, so it is time to reflect on what we have learned from metagenomics. We review the impact of metagenomics on our understanding of marine microbial communities. We consider the studies facilitated by data generated through the Global Ocean Sampling expedition, as well as the revolution wrought at the individual laboratory level through next generation sequencing technologies. We review recent studies and discoveries since 2008, provide a discussion of bioinformatic analyses, including conceptual pipelines and sequence annotation and predict the future of metagenomics, with suggestions of collaborative community studies tailored toward answering some of the fundamental questions in marine microbial ecology.

  4. Use of Substrate-Induced Gene Expression in Metagenomic Analysis of an Aromatic Hydrocarbon-Contaminated Soil

    PubMed Central

    Meier, Matthew J.; Paterson, E. Suzanne

    2015-01-01

    Metagenomics allows the study of genes related to xenobiotic degradation in a culture-independent manner, but many of these studies are limited by the lack of genomic context for metagenomic sequences. This study combined a phenotypic screen known as substrate-induced gene expression (SIGEX) with whole-metagenome shotgun sequencing. SIGEX is a high-throughput promoter-trap method that relies on transcriptional activation of a green fluorescent protein (GFP) reporter gene in response to an inducing compound and subsequent fluorescence-activated cell sorting to isolate individual inducible clones from a metagenomic DNA library. We describe a SIGEX procedure with improved library construction from fragmented metagenomic DNA and improved flow cytometry sorting procedures. We used SIGEX to interrogate an aromatic hydrocarbon (AH)-contaminated soil metagenome. The recovered clones contained sequences with various degrees of similarity to genes (or partial genes) involved in aromatic metabolism, for example, nahG (salicylate oxygenase) family genes and their respective upstream nahR regulators. To obtain a broader context for the recovered fragments, clones were mapped to contigs derived from de novo assembly of shotgun-sequenced metagenomic DNA which, in most cases, contained complete operons involved in aromatic metabolism, providing greater insight into the origin of the metagenomic fragments. A comparable set of contigs was generated using a significantly less computationally intensive procedure in which assembly of shotgun-sequenced metagenomic DNA was directed by the SIGEX-recovered sequences. This methodology may have broad applicability in identifying biologically relevant subsets of metagenomes (including both novel and known sequences) that can be targeted computationally by in silico assembly and prediction tools. PMID:26590287

  5. Use of Substrate-Induced Gene Expression in Metagenomic Analysis of an Aromatic Hydrocarbon-Contaminated Soil.

    PubMed

    Meier, Matthew J; Paterson, E Suzanne; Lambert, Iain B

    2016-02-01

    Metagenomics allows the study of genes related to xenobiotic degradation in a culture-independent manner, but many of these studies are limited by the lack of genomic context for metagenomic sequences. This study combined a phenotypic screen known as substrate-induced gene expression (SIGEX) with whole-metagenome shotgun sequencing. SIGEX is a high-throughput promoter-trap method that relies on transcriptional activation of a green fluorescent protein (GFP) reporter gene in response to an inducing compound and subsequent fluorescence-activated cell sorting to isolate individual inducible clones from a metagenomic DNA library. We describe a SIGEX procedure with improved library construction from fragmented metagenomic DNA and improved flow cytometry sorting procedures. We used SIGEX to interrogate an aromatic hydrocarbon (AH)-contaminated soil metagenome. The recovered clones contained sequences with various degrees of similarity to genes (or partial genes) involved in aromatic metabolism, for example, nahG (salicylate oxygenase) family genes and their respective upstream nahR regulators. To obtain a broader context for the recovered fragments, clones were mapped to contigs derived from de novo assembly of shotgun-sequenced metagenomic DNA which, in most cases, contained complete operons involved in aromatic metabolism, providing greater insight into the origin of the metagenomic fragments. A comparable set of contigs was generated using a significantly less computationally intensive procedure in which assembly of shotgun-sequenced metagenomic DNA was directed by the SIGEX-recovered sequences. This methodology may have broad applicability in identifying biologically relevant subsets of metagenomes (including both novel and known sequences) that can be targeted computationally by in silico assembly and prediction tools. PMID:26590287

  6. Captured metagenomics: large-scale targeting of genes based on ‘sequence capture’ reveals functional diversity in soils

    PubMed Central

    Manoharan, Lokeshwaran; Kushwaha, Sandeep K.; Hedlund, Katarina; Ahrén, Dag

    2015-01-01

    Microbial enzyme diversity is a key to understand many ecosystem processes. Whole metagenome sequencing (WMG) obtains information on functional genes, but it is costly and inefficient due to large amount of sequencing that is required. In this study, we have applied a captured metagenomics technique for functional genes in soil microorganisms, as an alternative to WMG. Large-scale targeting of functional genes, coding for enzymes related to organic matter degradation, was applied to two agricultural soil communities through captured metagenomics. Captured metagenomics uses custom-designed, hybridization-based oligonucleotide probes that enrich functional genes of interest in metagenomic libraries where only probe-bound DNA fragments are sequenced. The captured metagenomes were highly enriched with targeted genes while maintaining their target diversity and their taxonomic distribution correlated well with the traditional ribosomal sequencing. The captured metagenomes were highly enriched with genes related to organic matter degradation; at least five times more than similar, publicly available soil WMG projects. This target enrichment technique also preserves the functional representation of the soils, thereby facilitating comparative metagenomics projects. Here, we present the first study that applies the captured metagenomics approach in large scale, and this novel method allows deep investigations of central ecosystem processes by studying functional gene abundances. PMID:26490729

  7. Captured metagenomics: large-scale targeting of genes based on 'sequence capture' reveals functional diversity in soils.

    PubMed

    Manoharan, Lokeshwaran; Kushwaha, Sandeep K; Hedlund, Katarina; Ahrén, Dag

    2015-12-01

    Microbial enzyme diversity is a key to understand many ecosystem processes. Whole metagenome sequencing (WMG) obtains information on functional genes, but it is costly and inefficient due to large amount of sequencing that is required. In this study, we have applied a captured metagenomics technique for functional genes in soil microorganisms, as an alternative to WMG. Large-scale targeting of functional genes, coding for enzymes related to organic matter degradation, was applied to two agricultural soil communities through captured metagenomics. Captured metagenomics uses custom-designed, hybridization-based oligonucleotide probes that enrich functional genes of interest in metagenomic libraries where only probe-bound DNA fragments are sequenced. The captured metagenomes were highly enriched with targeted genes while maintaining their target diversity and their taxonomic distribution correlated well with the traditional ribosomal sequencing. The captured metagenomes were highly enriched with genes related to organic matter degradation; at least five times more than similar, publicly available soil WMG projects. This target enrichment technique also preserves the functional representation of the soils, thereby facilitating comparative metagenomics projects. Here, we present the first study that applies the captured metagenomics approach in large scale, and this novel method allows deep investigations of central ecosystem processes by studying functional gene abundances.

  8. New Hydrocarbon Degradation Pathways in the Microbial Metagenome from Brazilian Petroleum Reservoirs

    PubMed Central

    Sierra-García, Isabel Natalia; Correa Alvarez, Javier; Pantaroto de Vasconcellos, Suzan; Pereira de Souza, Anete; dos Santos Neto, Eugenio Vaz; de Oliveira, Valéria Maia

    2014-01-01

    Current knowledge of the microbial diversity and metabolic pathways involved in hydrocarbon degradation in petroleum reservoirs is still limited, mostly due to the difficulty in recovering the complex community from such an extreme environment. Metagenomics is a valuable tool to investigate the genetic and functional diversity of previously uncultured microorganisms in natural environments. Using a function-driven metagenomic approach, we investigated the metabolic abilities of microbial communities in oil reservoirs. Here, we describe novel functional metabolic pathways involved in the biodegradation of aromatic compounds in a metagenomic library obtained from an oil reservoir. Although many of the deduced proteins shared homology with known enzymes of different well-described aerobic and anaerobic catabolic pathways, the metagenomic fragments did not contain the complete clusters known to be involved in hydrocarbon degradation. Instead, the metagenomic fragments comprised genes belonging to different pathways, showing novel gene arrangements. These results reinforce the potential of the metagenomic approach for the identification and elucidation of new genes and pathways in poorly studied environments and contribute to a broader perspective on the hydrocarbon degradation processes in petroleum reservoirs. PMID:24587220

  9. IMG/M 4 version of the integrated metagenome comparative analysis system.

    PubMed

    Markowitz, Victor M; Chen, I-Min A; Chu, Ken; Szeto, Ernest; Palaniappan, Krishna; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Pagani, Ioanna; Tringe, Susannah; Huntemann, Marcel; Billis, Konstantinos; Varghese, Neha; Tennessen, Kristin; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2014-01-01

    IMG/M (http://img.jgi.doe.gov/m) provides support for comparative analysis of microbial community aggregate genomes (metagenomes) in the context of a comprehensive set of reference genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG/M's data content and analytical tools have expanded continuously since its first version was released in 2007. Since the last report published in the 2012 NAR Database Issue, IMG/M's database architecture, annotation and data integration pipelines and analysis tools have been extended to cope with the rapid growth in the number and size of metagenome data sets handled by the system. IMG/M data marts provide support for the analysis of publicly available genomes, expert review of metagenome annotations (IMG/M ER: http://img.jgi.doe.gov/mer) and Human Microbiome Project (HMP)-specific metagenome samples (IMG/M HMP: http://img.jgi.doe.gov/imgm_hmp). PMID:24136997

  10. Detection of Helicosporidium spp. in metagenomic DNA.

    PubMed

    Mancera, Norberto; Douma, Lauren G; James, Sheldon; Liu, Stephanie; Van, Amy; Boucias, Drion G; Tartar, Aurélien

    2012-09-15

    Distinct isolates of the invertebrate pathogenic alga Helicosporidium sp., collected from different insect hosts and different geographic locations, were processed to sequence the 18S rDNA and β-tubulin genes. The sequences were analyzed to assess genetic variation within the genus Helicosporidium and to design Helicosporidium-specific 18S rDNA primers. The specificity of these primers was demonstrated by testing not only on the Helicosporidium sp. isolates, but also on two trebouxiophyte algae known to be close Helicosporidium relatives, Prototheca wickerhamii and Prototheca zopfii. The genus-specific primers were used to develop a culture-independent assay aimed at detecting the presence of Helicosporidium spp. in environmental waters. The assay was based on the PCR amplification of 18S rDNA gene fragments from metagenomic DNA preparations, and it resulted in the amplification of detectable products for all sampled sites. Phylogenetic analyses that included the environmental sequences demonstrated that all amplification products clustered in a strongly supported, monophyletic Helicosporidium clade, thereby validating the metagenomic approach and the taxonomic origin of the produced environmental sequences. In addition, the phylogenetic analyses established that Helicosporidium spp. isolated from coleopteran hosts are more closely related to each other than they are to the isolate collected from a dipteran host. Finally, the phylogenetic trees depicted intergeneric relationships that supported a Helicosporidium-Prototheca cluster but did not support a Helicosporidium-Coccomyxa grouping, suggesting that pathogenicity to invertebrates evolved at least twice independently within the trebouxiophyte green algae. PMID:22609409

  11. Explaining diversity in metagenomic datasets by phylogenetic-based feature weighting.

    PubMed

    Albanese, Davide; De Filippo, Carlotta; Cavalieri, Duccio; Donati, Claudio

    2015-03-01

    Metagenomics is revolutionizing our understanding of microbial communities, showing that their structure and composition have profound effects on the ecosystem and in a variety of health and disease conditions. Despite the flourishing of new analysis methods, current approaches based on statistical comparisons between high-level taxonomic classes often fail to identify the microbial taxa that are differentially distributed between sets of samples, since in many cases the taxonomic schema do not allow an adequate description of the structure of the microbiota. This constitutes a severe limitation to the use of metagenomic data in therapeutic and diagnostic applications. To provide a more robust statistical framework, we introduce a class of feature-weighting algorithms that discriminate the taxa responsible for the classification of metagenomic samples. The method unambiguously groups the relevant taxa into clades without relying on pre-defined taxonomic categories, thus including in the analysis also those sequences for which a taxonomic classification is difficult. The phylogenetic clades are weighted and ranked according to their abundance measuring their contribution to the differentiation of the classes of samples, and a criterion is provided to define a reduced set of most relevant clades. Applying the method to public datasets, we show that the data-driven definition of relevant phylogenetic clades accomplished by our ranking strategy identifies features in the samples that are lost if phylogenetic relationships are not considered, improving our ability to mine metagenomic datasets. Comparison with supervised classification methods currently used in metagenomic data analysis highlights the advantages of using phylogenetic information.

  12. Use of object-oriented classification and fragmentation analysis (1985-2008) to identify important areas for conservation in Cockpit Country, Jamaica.

    PubMed

    Newman, Minke E; McLaren, Kurt P; Wilson, Byron S

    2011-01-01

    Forest fragmentation is one of the most important threats to global biodiversity, particularly in tropical developing countries. Identifying priority areas for conservation within these forests is essential to their effective management. However, this requires current, accurate environmental information that is often lacking in developing countries. The Cockpit Country, Jamaica, contains forests of international importance in terms of levels of endemism and overall diversity. These forests are under severe threat from the prospect of bauxite mining and other anthropogenic disturbances. In the absence of adequate, up-to-date ecological information, we used satellite remote sensing data and fragmentation analysis to identify interior forested areas that have experienced little or no change as priority conservation sites. We classified Landsat images from 1985, 1989, 1995, 2002, and 2008, using an object-oriented method, which allowed for the inclusion of roads. We conducted our fragmentation analysis using metrics to quantify changes in forest patch number, area, shape, and aggregation. Deforestation and fragmentation fluctuated within the 23-year period but were mostly confined to the periphery of the forest, close to roads and access trails. An area of core forest that remained intact over the period of study was identified within the largest forest patch, most of which was located within the boundaries of a forest reserve and included the last remaining patches of closed-broadleaf forest. These areas should be given highest priority for conservation, as they constitute important refuges for endemic or threatened biodiversity. Minimizing and controlling access will be important in maintaining this core.

  13. CLaMS: Classifier for Metagenomic Sequences

    SciTech Connect

    Pati, Amrita

    2010-12-01

    CLaMS ("Classifier for Metagenomic Sequences") is a Java application for binning assembled metagenomes using user-specified training sequence sets and other user-specified initial parameters. Because CLaMS analyzes and matches sequence composition-based genomic signatures, it is much faster than binning tools that rely on alignments to homologs; CLaMS can bin ~20,000 sequences in 3 minutes on a laptop with a 2.4 GHz Intel Core 2 Duo processor and 2 GB of RAM. CLaMS is meant to be a desktop application for biologists and can be run on any machine under any operating system on which the Java Runtime Environment is enabled. CLaMS is freely available in both GUI-based and command-line-based forms.

  14. CLaMS: Classifier for Metagenomic Sequences

    2010-12-01

    CLaMS ("Classifier for Metagenomic Sequences") is a Java application for binning assembled metagenomes using user-specified training sequence sets and other user-specified initial parameters. Because CLaMS analyzes and matches sequence composition-based genomic signatures, it is much faster than binning tools that rely on alignments to homologs; CLaMS can bin ~20,000 sequences in 3 minutes on a laptop with a 2.4 GHz Intel Core 2 Duo processor and 2 GB of RAM. CLaMS is meant to be a desktop application for biologists and can be run on any machine under any operating system on which the Java Runtime Environment is enabled. CLaMS is freely available in both GUI-based and command-line-based forms.

  15. A Bioinformatician's Guide to Metagenomics

    SciTech Connect

    Kunin, Victor; Copeland, Alex; Lapidus, Alla; Mavromatis, Konstantinos; Hugenholtz, Philip

    2008-08-01

    As random shotgun metagenomic projects proliferate and become the dominant source of publicly available sequence data, procedures for best practices in their execution and analysis become increasingly important. Based on our experience at the Joint Genome Institute, we describe step-by-step the chain of decisions accompanying a metagenomic project from the viewpoint of a bioinformatician. We guide the reader through a standard workflow for a metagenomic project beginning with pre-sequencing considerations such as community composition and sequence data type that will greatly influence downstream analyses. We proceed with recommendations for sampling and data generation including sample and metadata collection, community profiling, construction of shotgun libraries and sequencing strategies. We then discuss the application of generic sequence processing steps (read preprocessing, assembly, and gene prediction and annotation) to metagenomic datasets by contrast to genome projects. Different types of data analyses particular to metagenomes are then presented including binning, dominant population analysis and gene-centric analysis. Finally data management systems and issues are presented and discussed. We hope that this review will assist bioinformaticians and biologists in making better-informed decisions on their journey during a metagenomic project.

  16. MetaBAT: Metagenome Binning based on Abundance and Tetranucleotide frequence

    SciTech Connect

    Kang, Dongwan; Froula, Jeff; Egan, Rob; Wang, Zhong

    2014-03-21

    Grouping large fragments assembled from shotgun metagenomic sequences to deconvolute complex microbial communities, or metagenome binning, enables the study of individual organisms and their interactions. Here we developed automated metagenome binning software, called MetaBAT, which integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency. On synthetic datasets MetaBAT on average achieves 98% precision and 90% recall at the strain level with 281 near-complete unique genomes. Applying MetaBAT to a human gut microbiome data set we recovered 176 genome bins with 92% precision and 80% recall. Further analyses suggest MetaBAT is able to recover genome fragments missed in reference genomes up to 19%, while 53 genome bins are novel. In summary, we believe MetaBAT is a powerful tool to facilitate comprehensive understanding of complex microbial communities.
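
    As an illustration of the tetranucleotide-frequency signal that composition-based binners such as MetaBAT and CLaMS draw on, the following minimal Python sketch computes a normalized 4-mer profile for a contig and compares two profiles. The function names `tnf` and `tnf_distance` are hypothetical, and plain Euclidean distance is an assumption made here for simplicity; it is a stand-in for, not a reproduction of, the empirical probabilistic distances described in the abstract.

```python
from collections import Counter
from itertools import product
import math

# All 256 tetranucleotides in a fixed order, so profiles are comparable.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]

def tnf(seq):
    """Return the normalized tetranucleotide frequency vector of a DNA sequence."""
    seq = seq.upper()
    # Count every overlapping window of length 4.
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Only windows made purely of A/C/G/T contribute (windows with N are ignored).
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]

def tnf_distance(u, v):
    """Euclidean distance between two frequency vectors (illustrative choice only)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

    Contigs from the same genome tend to have similar profiles, so a small `tnf_distance` between two contigs is weak evidence that they belong in the same bin; a real binner would combine this with coverage (abundance) information.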

  17. Open resource metagenomics: a model for sharing metagenomic libraries

    PubMed Central

    Neufeld, J.D.; Engel, K.; Cheng, J.; Moreno-Hagelsieb, G.; Rose, D.R.; Charles, T.C.

    2011-01-01

    Both sequence-based and activity-based exploitation of environmental DNA have provided unprecedented access to the genomic content of cultivated and uncultivated microorganisms. Although researchers deposit microbial strains in culture collections and DNA sequences in databases, activity-based metagenomic studies typically only publish sequences from the hits retrieved from specific screens. Physical metagenomic libraries, conceptually similar to entire sequence datasets, are usually not straightforward to obtain by interested parties subsequent to publication. In order to facilitate unrestricted distribution of metagenomic libraries, we propose the adoption of open resource metagenomics, in line with the trend towards open access publishing, and similar to culture- and mutant-strain collections that have been the backbone of traditional microbiology and microbial genetics. The concept of open resource metagenomics includes preparation of physical DNA libraries, preferably in versatile vectors that facilitate screening in a diversity of host organisms, and pooling of clones so that single aliquots containing complete libraries can be easily distributed upon request. Database deposition of associated metadata and sequence data for each library provides researchers with information to select the most appropriate libraries for further research projects. As a starting point, we have established the Canadian MetaMicroBiome Library (CM2BL [1]). The CM2BL is a publicly accessible collection of cosmid libraries containing environmental DNA from soils collected from across Canada, spanning multiple biomes. The libraries were constructed such that the cloned DNA can be easily transferred to Gateway® compliant vectors, facilitating functional screening in virtually any surrogate microbial host for which there are available plasmid vectors. The libraries, which we are placing in the public domain, will be distributed upon request without restriction to members of both the

  18. Open resource metagenomics: a model for sharing metagenomic libraries.

    PubMed

    Neufeld, J D; Engel, K; Cheng, J; Moreno-Hagelsieb, G; Rose, D R; Charles, T C

    2011-11-30

    Both sequence-based and activity-based exploitation of environmental DNA have provided unprecedented access to the genomic content of cultivated and uncultivated microorganisms. Although researchers deposit microbial strains in culture collections and DNA sequences in databases, activity-based metagenomic studies typically only publish sequences from the hits retrieved from specific screens. Physical metagenomic libraries, conceptually similar to entire sequence datasets, are usually not straightforward to obtain by interested parties subsequent to publication. In order to facilitate unrestricted distribution of metagenomic libraries, we propose the adoption of open resource metagenomics, in line with the trend towards open access publishing, and similar to culture- and mutant-strain collections that have been the backbone of traditional microbiology and microbial genetics. The concept of open resource metagenomics includes preparation of physical DNA libraries, preferably in versatile vectors that facilitate screening in a diversity of host organisms, and pooling of clones so that single aliquots containing complete libraries can be easily distributed upon request. Database deposition of associated metadata and sequence data for each library provides researchers with information to select the most appropriate libraries for further research projects. As a starting point, we have established the Canadian MetaMicroBiome Library (CM(2)BL [1]). The CM(2)BL is a publicly accessible collection of cosmid libraries containing environmental DNA from soils collected from across Canada, spanning multiple biomes. The libraries were constructed such that the cloned DNA can be easily transferred to Gateway® compliant vectors, facilitating functional screening in virtually any surrogate microbial host for which there are available plasmid vectors. The libraries, which we are placing in the public domain, will be distributed upon request without restriction to members of both the

  19. Challenges of the Unknown: Clinical Application of Microbial Metagenomics.

    PubMed

    Rose, Graham; Wooldridge, David J; Anscombe, Catherine; Mee, Edward T; Misra, Raju V; Gharbia, Saheer

    2015-01-01

    Availability of fast, high throughput and low cost whole genome sequencing holds great promise within public health microbiology, with applications ranging from outbreak detection and tracking transmission events to understanding the role played by microbial communities in health and disease. Within clinical metagenomics, identifying microorganisms from a complex and host enriched background remains a central computational challenge. As proof of principle, we sequenced two metagenomic samples, a known viral mixture of 25 human pathogens and an unknown complex biological model using benchtop technology. The datasets were then analysed using a bioinformatic pipeline developed around recent fast classification methods. A targeted approach was able to detect 20 of the viruses against a background of host contamination from multiple sources and bacterial contamination. An alternative untargeted identification method was highly correlated with these classifications, and over 1,600 species were identified when applied to the complex biological model, including several species captured at over 50% genome coverage. In summary, this study demonstrates the great potential of applying metagenomics within the clinical laboratory setting and that this can be achieved using infrastructure available to nondedicated sequencing centres. PMID:26451363

  20. Challenges of the Unknown: Clinical Application of Microbial Metagenomics

    PubMed Central

    Rose, Graham; Wooldridge, David J.; Anscombe, Catherine; Mee, Edward T.; Misra, Raju V.; Gharbia, Saheer

    2015-01-01

    Availability of fast, high throughput and low cost whole genome sequencing holds great promise within public health microbiology, with applications ranging from outbreak detection and tracking transmission events to understanding the role played by microbial communities in health and disease. Within clinical metagenomics, identifying microorganisms from a complex and host enriched background remains a central computational challenge. As proof of principle, we sequenced two metagenomic samples, a known viral mixture of 25 human pathogens and an unknown complex biological model using benchtop technology. The datasets were then analysed using a bioinformatic pipeline developed around recent fast classification methods. A targeted approach was able to detect 20 of the viruses against a background of host contamination from multiple sources and bacterial contamination. An alternative untargeted identification method was highly correlated with these classifications, and over 1,600 species were identified when applied to the complex biological model, including several species captured at over 50% genome coverage. In summary, this study demonstrates the great potential of applying metagenomics within the clinical laboratory setting and that this can be achieved using infrastructure available to nondedicated sequencing centres. PMID:26451363

  1. Functional metagenomics to decipher food-microbe-host crosstalk.

    PubMed

    Larraufie, Pierre; de Wouters, Tomas; Potocki-Veronese, Gabrielle; Blottière, Hervé M; Doré, Joël

    2015-02-01

    The recent developments of metagenomics permit an extremely high-resolution molecular scan of the intestinal microbiota giving new insights and opening perspectives for clinical applications. Beyond the unprecedented vision of the intestinal microbiota given by large-scale quantitative metagenomics studies, such as the EU MetaHIT project, functional metagenomics tools allow the exploration of fine interactions between food constituents, microbiota and host, leading to the identification of signals and intimate mechanisms of crosstalk, especially between bacteria and human cells. Cloning of large genome fragments, either from complex intestinal communities or from selected bacteria, allows the screening of these biological resources for bioactivity towards complex plant polymers or functional food such as prebiotics. This permitted identification of novel carbohydrate-active enzyme families involved in dietary fibre and host glycan breakdown, and highlighted unsuspected bacterial players at the top of the intestinal microbial food chain. Similarly, exposure of fractions from genomic and metagenomic clones onto human cells engineered with reporter systems to track modulation of immune response, cell proliferation or cell metabolism has allowed the identification of bioactive clones modulating key cell signalling pathways or the induction of specific genes. This opens the possibility to decipher mechanisms by which commensal bacteria or candidate probiotics can modulate the activity of cells in the intestinal epithelium or even in distal organs such as the liver, adipose tissue or the brain. Hence, in spite of our inability to culture many of the dominant microbes of the human intestine, functional metagenomics open a new window for the exploration of food-microbe-host crosstalk. PMID:25417646

  2. Functional metagenomics to decipher food-microbe-host crosstalk.

    PubMed

    Larraufie, Pierre; de Wouters, Tomas; Potocki-Veronese, Gabrielle; Blottière, Hervé M; Doré, Joël

    2015-02-01

    The recent developments of metagenomics permit an extremely high-resolution molecular scan of the intestinal microbiota giving new insights and opening perspectives for clinical applications. Beyond the unprecedented vision of the intestinal microbiota given by large-scale quantitative metagenomics studies, such as the EU MetaHIT project, functional metagenomics tools allow the exploration of fine interactions between food constituents, microbiota and host, leading to the identification of signals and intimate mechanisms of crosstalk, especially between bacteria and human cells. Cloning of large genome fragments, either from complex intestinal communities or from selected bacteria, allows the screening of these biological resources for bioactivity towards complex plant polymers or functional food such as prebiotics. This permitted identification of novel carbohydrate-active enzyme families involved in dietary fibre and host glycan breakdown, and highlighted unsuspected bacterial players at the top of the intestinal microbial food chain. Similarly, exposure of fractions from genomic and metagenomic clones onto human cells engineered with reporter systems to track modulation of immune response, cell proliferation or cell metabolism has allowed the identification of bioactive clones modulating key cell signalling pathways or the induction of specific genes. This opens the possibility to decipher mechanisms by which commensal bacteria or candidate probiotics can modulate the activity of cells in the intestinal epithelium or even in distal organs such as the liver, adipose tissue or the brain. Hence, in spite of our inability to culture many of the dominant microbes of the human intestine, functional metagenomics open a new window for the exploration of food-microbe-host crosstalk.

  3. Optimizing solubility and permeability of a biopharmaceutics classification system (BCS) class 4 antibiotic drug using lipophilic fragments disturbing the crystal lattice.

    PubMed

    Tehler, Ulrika; Fagerberg, Jonas H; Svensson, Richard; Larhed, Mats; Artursson, Per; Bergström, Christel A S

    2013-03-28

    Esterification was used to simultaneously increase solubility and permeability of ciprofloxacin, a biopharmaceutics classification system (BCS) class 4 drug (low solubility/low permeability) with solid-state limited solubility. Molecular flexibility was increased to disturb the crystal lattice, lower the melting point, and thereby improve the solubility, whereas lipophilicity was increased to enhance the intestinal permeability. These structural changes resulted in BCS class 1 analogues (high solubility/high permeability) emphasizing that simple medicinal chemistry may improve both these properties.

  4. IDENTIFICATION OF CHICKEN-SPECIFIC FECAL MICROBIAL SEQUENCES USING A METAGENOMIC APPROACH

    EPA Science Inventory

    In this study, we applied a genome fragment enrichment (GFE) method to select for genomic regions that differ between different fecal metagenomes. Competitive DNA hybridizations were performed between chicken fecal DNA and pig fecal DNA (C-P) and between chicken fecal DNA and an ...

  5. MetaCAA: A clustering-aided methodology for efficient assembly of metagenomic datasets.

    PubMed

    Reddy, Rachamalla Maheedhar; Mohammed, Monzoorul Haque; Mande, Sharmila S

    2014-01-01

    A key challenge in analyzing metagenomics data pertains to assembly of sequenced DNA fragments (i.e. reads) originating from various microbes in a given environmental sample. Several existing methodologies can assemble reads originating from a single genome. However, these methodologies cannot be applied for efficient assembly of metagenomic sequence datasets. In this study, we present MetaCAA - a clustering-aided methodology which helps in improving the quality of metagenomic sequence assembly. MetaCAA initially groups sequences constituting a given metagenome into smaller clusters. Subsequently, sequences in each cluster are independently assembled using CAP3, an existing single genome assembly program. Contigs formed in each of the clusters along with the unassembled reads are then subjected to another round of assembly for generating the final set of contigs. Validation using simulated and real-world metagenomic datasets indicates that MetaCAA aids in improving the overall quality of assembly. A software implementation of MetaCAA is available at https://metagenomics.atc.tcs.com/MetaCAA.

  6. A Novel Abundance-Based Algorithm for Binning Metagenomic Sequences Using l-Tuples

    NASA Astrophysics Data System (ADS)

    Wu, Yu-Wei; Ye, Yuzhen

    Metagenomics is the study of microbial communities sampled directly from their natural environment, without prior culturing. Among the computational tools recently developed for metagenomic sequence analysis, binning tools attempt to classify all (or most) of the sequences in a metagenomic dataset into different bins (i.e., species), based on various DNA composition patterns (e.g., the tetramer frequencies) of various genomes. Composition-based binning methods, however, cannot be used to classify very short fragments, because of the substantial variation of DNA composition patterns within a single genome. We developed a novel approach (AbundanceBin) for metagenomics binning by utilizing the different abundances of species living in the same environment. AbundanceBin is an application of the Lander-Waterman model to metagenomics, which is based on the l-tuple content of the reads. AbundanceBin achieved accurate, unsupervised, clustering of metagenomic sequences into different bins, such that the reads classified in a bin belong to species of identical or very similar abundances in the sample. In addition, AbundanceBin gave accurate estimations of species abundances, as well as their genome sizes - two important parameters for characterizing a microbial community. We also show that AbundanceBin performed well when the sequence lengths are very short (e.g. 75 bp) or have sequencing errors.
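
    The abundance-based idea in the abstract above can be sketched as follows: l-tuples from a more abundant species are seen more often, so l-tuple counts can be modelled as a mixture of Poisson distributions, one per abundance level, and fitted by expectation-maximization. The Python below is a simplified illustration under that assumption; the function names, the fixed number of components and the starting rates are all hypothetical, and it does not attempt the genome-size estimation or read assignment that the abstract attributes to AbundanceBin.

```python
from collections import Counter
import math

def ltuple_counts(reads, l):
    """Count every overlapping l-tuple across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - l + 1):
            counts[read[i:i + l]] += 1
    return counts

def poisson_pmf(k, lam):
    """Poisson probability of observing count k at rate lam (lam must be > 0)."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def fit_poisson_mixture(values, init_rates, iters=50):
    """Fit a Poisson mixture to a list of counts with plain EM.

    No safeguards against empty components or numerical underflow are
    included; this is a sketch for well-separated, small inputs.
    """
    rates = list(init_rates)
    weights = [1.0 / len(rates)] * len(rates)
    resp = []
    for _ in range(iters):
        # E-step: posterior probability of each component for each count.
        resp = []
        for k in values:
            p = [w * poisson_pmf(k, lam) for w, lam in zip(weights, rates)]
            s = sum(p)
            resp.append([x / s for x in p])
        # M-step: re-estimate mixing weights and Poisson rates.
        for j in range(len(rates)):
            rj = sum(r[j] for r in resp)
            weights[j] = rj / len(values)
            rates[j] = sum(r[j] * k for r, k in zip(resp, values)) / rj
    return weights, rates, resp
```

    In use, the values passed to `fit_poisson_mixture` would be the counts returned by `ltuple_counts` for a chosen `l`; the fitted rates then play the role of per-bin abundance levels, and the responsibilities in `resp` indicate which abundance level each l-tuple most likely came from.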

  7. Web Resources for Metagenomics Studies.

    PubMed

    Dudhagara, Pravin; Bhavsar, Sunil; Bhagat, Chintan; Ghelani, Anjana; Bhatt, Shreyas; Patel, Rajesh

    2015-10-01

    The development of next-generation sequencing (NGS) platforms spawned an enormous volume of data. This explosion in data has unearthed new scalability challenges for existing bioinformatics tools. The analysis of metagenomic sequences using bioinformatics pipelines is complicated by the substantial complexity of these data. In this article, we review several commonly used online tools for metagenomics data analysis with respect to their quality and detail of analysis using simulated metagenomics data. There are at least a dozen such software tools presently available in the public domain. Among them, MGRAST, IMG/M, and METAVIR are the most well-known tools according to the number of citations by peer-reviewed scientific media up to mid-2015. Here, we describe 12 online tools with respect to their web link, annotation pipelines, clustering methods, online user support, and availability of data storage. We also rated each tool to identify the most promising and preferable ones, and evaluated the five best tools using a synthetic metagenome. The article comprehensively deals with the contemporary problems and the prospects of metagenomics from a bioinformatics viewpoint. PMID:26602607

  9. riboFrame: An Improved Method for Microbial Taxonomy Profiling from Non-Targeted Metagenomics

    PubMed Central

    Ramazzotti, Matteo; Donati, Claudio; Cavalieri, Duccio

    2015-01-01

    Non-targeted metagenomics offers the unprecedented possibility of simultaneously investigating the microbial profile and the genetic capabilities of a sample by a direct analysis of its entire DNA content. The assessment of the microbial taxonomic composition is frequently obtained by mapping reads to genomic databases that, although growing, are still limited and biased. Here we present riboFrame, a novel procedure for microbial profiling based on the identification and classification of 16S rDNA sequences in non-targeted metagenomics datasets. Reads overlapping the 16S rDNA genes are identified using Hidden Markov Models and a taxonomic assignment is obtained by naïve Bayesian classification. All reads identified as ribosomal are coherently positioned in the 16S rDNA gene, allowing the use of the topology of the gene (i.e., the secondary structure and the location of variable regions) to guide the abundance analysis. We tested and verified the effectiveness of our method on simulated ribosomal data, on simulated metagenomes and on a real dataset. riboFrame exploits the taxonomic potentialities of the 16S rDNA gene in the context of non-targeted metagenomics, giving an accurate perspective on the microbial profile in metagenomic samples. PMID:26635865
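
    The naïve Bayesian assignment step mentioned in this abstract can be sketched as follows. This is a generic k-mer naïve Bayes classifier with add-one smoothing and a uniform prior over taxa, not riboFrame's actual procedure; the word length, the toy training sequences, and the function names are assumptions.

```python
import math
from collections import Counter, defaultdict


def kmers(seq, k):
    """All overlapping words of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def train(labelled, k):
    """Build per-taxon k-mer counts from (taxon, sequence) pairs."""
    counts = defaultdict(Counter)
    for taxon, seq in labelled:
        counts[taxon].update(kmers(seq, k))
    return counts


def classify(read, counts, k):
    """Return the taxon with the highest smoothed log-likelihood."""
    vocab_size = 4 ** k  # number of possible DNA words of length k
    best_taxon, best_score = None, -math.inf
    for taxon, c in counts.items():
        total = sum(c.values())
        score = sum(math.log((c[w] + 1) / (total + vocab_size))
                    for w in kmers(read, k))
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon


# Toy reference sequences, invented for illustration only.
reference = [("taxonA", "ACGTACGTACGTACGT"),
             ("taxonB", "TTGGCCAATTGGCCAA")]
model = train(reference, 3)
print(classify("ACGTACG", model, 3))
```

    The "naïve" part is the assumption that the words in a read occur independently, which lets the read's score be a simple sum of per-word log-probabilities.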

  10. VIROME: a standard operating procedure for analysis of viral metagenome sequences.

    PubMed

    Wommack, K Eric; Bhavsar, Jaysheel; Polson, Shawn W; Chen, Jing; Dumas, Michael; Srinivasiah, Sharath; Furman, Megan; Jamindar, Sanchita; Nasko, Daniel J

    2012-07-30

    One consistent finding among studies using shotgun metagenomics to analyze whole viral communities is that most viral sequences show no significant homology to known sequences. Thus, bioinformatic analyses based on sequence collections such as GenBank nr, which are largely comprised of sequences from known organisms, tend to ignore a majority of sequences within most shotgun viral metagenome libraries. Here we describe a bioinformatic pipeline, the Viral Informatics Resource for Metagenome Exploration (VIROME), that emphasizes the classification of viral metagenome sequences (predicted open-reading frames) based on homology search results against both known and environmental sequences. Functional and taxonomic information is derived from five annotated sequence databases which are linked to the UniRef 100 database. Environmental classifications are obtained from hits against a custom database, MetaGenomes On-Line, which contains 49 million predicted environmental peptides. Each predicted viral metagenomic ORF run through the VIROME pipeline is placed into one of seven ORF classes; thus, every sequence receives a meaningful annotation. Additionally, the pipeline includes quality control measures to remove contaminating and poor-quality sequences and assesses the potential amount of cellular DNA contamination in a viral metagenome library by screening for rRNA genes. Access to the VIROME pipeline and analysis results is provided through a web-application interface that is dynamically linked to a relational back-end database. The VIROME web-application interface is designed to allow users flexibility in retrieving sequences (reads, ORFs, predicted peptides) and search results for focused secondary analyses. PMID:23407591

  11. Metagenomics and novel gene discovery

    PubMed Central

    Culligan, Eamonn P; Sleator, Roy D; Marchesi, Julian R; Hill, Colin

    2014-01-01

    Metagenomics provides a means of assessing the total genetic pool of all the microbes in a particular environment, in a culture-independent manner. It has revealed unprecedented diversity in microbial community composition, which is further reflected in the encoded functional diversity of the genomes, a large proportion of which consists of novel genes. Herein, we review both sequence-based and functional metagenomic methods to uncover novel genes and outline some of the associated problems of each type of approach, as well as potential solutions. Furthermore, we discuss the potential for metagenomic biotherapeutic discovery, with a particular focus on the human gut microbiome and finally, we outline how the discovery of novel genes may be used to create bioengineered probiotics. PMID:24317337

  12. Metagenomic biomarker discovery and explanation

    PubMed Central

    2011-01-01

    This study describes and validates a new method for metagenomic biomarker discovery by way of class comparison, tests of biological consistency and effect size estimation. This addresses the challenge of finding organisms, genes, or pathways that consistently explain the differences between two or more microbial communities, which is a central problem to the study of metagenomics. We extensively validate our method on several microbiomes and a convenient online interface for the method is provided at http://huttenhower.sph.harvard.edu/lefse/. PMID:21702898

  13. GB Virus C/Hepatitis G Virus Groups and Subgroups: Classification by a Restriction Fragment Length Polymorphism Method Based on Phylogenetic Analysis of the 5′ Untranslated Region

    PubMed Central

    Quarleri, J. F.; Mathet, V. L.; Feld, M.; Ferrario, D.; della Latta, M. P.; Verdun, R.; Sánchez, D. O.; Oubiña, J. R.

    1999-01-01

    A phylogenetic tree based on 150 5′ untranslated region sequences deposited in GenBank database allowed segregation of the sequences into three major groups, including two subgroups, i.e., 1, 2a, 2b, and 3, supported by bootstrap analysis. Restriction site analysis of these sequences predicted that HinfI and either AatII or AciI could be used for genomic typing with 99.4% accuracy. cDNA sequencing and subsequent alignment of 21 Argentine GB virus C/hepatitis G virus strains confirmed restriction fragment length polymorphism patterns theoretically predicted. This method may be useful for a rapid screening of samples when either epidemiological or transmission studies of this agent are carried out. PMID:10203483
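
    The restriction-site prediction described in this abstract amounts to an in-silico digest. The sketch below is a rough illustration under stated assumptions: it uses only HinfI (recognition site GANTC, cutting after the leading G), treats the amplicon as linear, and runs on a made-up sequence; it does not reproduce the authors' typing scheme or their AatII/AciI step.

```python
import re

# HinfI recognizes G^ANTC; the lookahead reports each site's start position.
HINFI_SITE = re.compile(r"(?=GA[ACGT]TC)")
HINFI_CUT_OFFSET = 1  # the cut falls after the first base of the site


def fragment_lengths(seq, site_regex, cut_offset):
    """Lengths of the fragments left after cutting a linear sequence."""
    cuts = [m.start() + cut_offset for m in site_regex.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Invented 21-nt sequence with two HinfI sites (GAATC and GACTC).
amplicon = "AAAAGAATCTTTTTGACTCGG"
print(fragment_lengths(amplicon, HINFI_SITE, HINFI_CUT_OFFSET))
```

    Two sequences that differ at a recognition site give different fragment-length patterns, which is what allows groups to be told apart on a gel without sequencing.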

  14. Phylogenetic diversity and metagenomics of candidate division OP3.

    PubMed

    Glöckner, Jana; Kube, Michael; Shrestha, Pravin Malla; Weber, Marc; Glöckner, Frank Oliver; Reinhardt, Richard; Liesack, Werner

    2010-05-01

    Except for environmental 16S rRNA gene sequences, no information is available for members of the candidate division OP3. These bacteria appear to thrive in anoxic environments, such as marine sediments, hypersaline deep sea, freshwater lakes, aquifers, flooded paddy soils and methanogenic bioreactors. The 16S rRNA phylogeny suggests that OP3 belongs to the Planctomycetes/Verrucomicrobia/Chlamydiae (PVC) superphylum. Metagenomic fosmid libraries were constructed from flooded paddy soil and screened for 16S rRNA gene-containing fragments affiliated with the PVC superphylum. The screening of 63 000 clones resulted in 23 assay-positive fosmids, of which three clones were affiliated with OP3. The 16S rRNA gene sequence divergence between the fragments OP3/1, OP3/2 and OP3/3 ranges from 18% to 25%, indicating that they belong to different OP3 subdivisions. The 23S rRNA phylogeny confirmed the membership of OP3 in the PVC superphylum. Sequencing the OP3 fragments resulted in a total of 105 kb of genomic information and 90 ORFs, of which 47 could be assigned a putative function and 11 were conserved hypothetical. Using BLASTP searches, a high proportion of ORFs had best matches to homologues from Deltaproteobacteria, rather than to those of members of the PVC superphylum. On the fragment OP3/3, a cluster of nine ORFs was predicted to encode the bacterial NADH dehydrogenase I. Given the high proportion of homologues present in deltaproteobacteria and anoxic conditions in the natural environment of OP3 bacteria, the detection of NADH dehydrogenase I may suggest an anaerobic respiration mode. Oligonucleotide frequencies calculated for OP3/1, OP3/2 and OP3/3 show high intraphylum correlations. This novel sequence information could therefore be used to identify OP3-related fragments in large metagenomic data sets using marker gene-independent procedures in the future. In addition to the OP3 fragments, a single metagenomic fragment affiliated with the candidate division BRC1 was...
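
    The oligonucleotide-frequency comparison mentioned in this abstract can be sketched as a correlation between frequency vectors. The abstract does not state the word length or the correlation measure, so the use of tetranucleotides and Pearson correlation here is an assumption, as are the toy sequences.

```python
import math
from itertools import product

# All 256 tetranucleotides in a fixed order, so vectors are comparable.
WORDS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_freqs(seq):
    """Relative frequency of each tetranucleotide in seq."""
    counts = dict.fromkeys(WORDS, 0)
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if word in counts:  # skips words containing N or other symbols
            counts[word] += 1
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in WORDS]


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


# Invented fragments: the first two share composition, the third does not.
frag1 = "ACGT" * 20
frag2 = "ACGT" * 15 + "ACG"
frag3 = "AATT" * 20
print(pearson(tetra_freqs(frag1), tetra_freqs(frag2)))
print(pearson(tetra_freqs(frag1), tetra_freqs(frag3)))
```

    Fragments from related genomes are expected to score closer to 1 than fragments from unrelated ones, which is the basis for marker gene-independent identification.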

  15. Practical application of self-organizing maps to interrelate biodiversity and functional data in NGS-based metagenomics.

    PubMed

    Weber, Marc; Teeling, Hanno; Huang, Sixing; Waldmann, Jost; Kassabgy, Mariette; Fuchs, Bernhard M; Klindworth, Anna; Klockow, Christine; Wichels, Antje; Gerdts, Gunnar; Amann, Rudolf; Glöckner, Frank Oliver

    2011-05-01

    Next-generation sequencing (NGS) technologies have enabled the application of broad-scale sequencing in microbial biodiversity and metagenome studies. Biodiversity is usually targeted by classifying 16S ribosomal RNA genes, while metagenomic approaches target metabolic genes. However, both approaches remain isolated, as long as the taxonomic and functional information cannot be interrelated. Techniques like self-organizing maps (SOMs) have been applied to cluster metagenomes into taxon-specific bins in order to link biodiversity with functions, but have not been applied to broad-scale NGS-based metagenomics yet. Here, we provide a novel implementation, demonstrate its potential and practicability, and provide a web-based service for public usage. Evaluation with published data sets mimicking varyingly complex habitats resulted in classification specificities and sensitivities of close to 100% to above 90% from phylum to genus level for assemblies exceeding 8 kb for low and medium complexity data. When applied to five real-world metagenomes of medium complexity from direct pyrosequencing of marine subsurface waters, classifications of assemblies above 2.5 kb were in good agreement with fluorescence in situ hybridizations, indicating that biodiversity was mostly retained within the metagenomes, and confirming high classification specificities. This was validated by two protein-based classification (PBC) methods. SOMs were able to retrieve the relevant taxa down to the genus level, while surpassing PBCs in resolution. In order to make the approach accessible to a broad audience, we implemented a feature-rich web-based SOM application named TaxSOM, which is freely available at http://www.megx.net/toolbox/taxsom. TaxSOM can classify reads or assemblies exceeding 2.5 kb with high accuracy and thus assists in linking biodiversity and functions in metagenome studies, which is a precondition to study microbial ecology in a holistic fashion.

  16. Metagenomic Analysis of the Pygmy Loris Fecal Microbiome Reveals Unique Functional Capacity Related to Metabolism of Aromatic Compounds

    PubMed Central

    Xu, Bo; Xu, Weijiang; Yang, Fuya; Li, Junjun; Yang, Yunjuan; Tang, Xianghua; Mu, Yuelin; Zhou, Junpei; Huang, Zunxi

    2013-01-01

    The animal gastrointestinal tract contains a complex community of microbes, whose composition ultimately reflects the co-evolution of microorganisms with their animal host. An analysis of 78,619 pyrosequencing reads generated from pygmy loris fecal DNA extracts was performed to help better understand the microbial diversity and functional capacity of the pygmy loris gut microbiome. The taxonomic analysis of the metagenomic reads indicated that pygmy loris fecal microbiomes were dominated by Bacteroidetes and Proteobacteria phyla. The hierarchical clustering of several gastrointestinal metagenomes demonstrated the similarities of the microbial community structures of pygmy loris and mouse gut systems despite their differences in functional capacity. The comparative analysis of function classification revealed that the metagenome of the pygmy loris was characterized by an overrepresentation of those sequences involved in aromatic compound metabolism compared with humans and other animals. The key enzymes related to the benzoate degradation pathway were identified based on the Kyoto Encyclopedia of Genes and Genomes pathway assignment. These results would contribute to the limited body of primate metagenome studies and provide a framework for comparative metagenomic analysis between human and non-human primates, as well as a comparative understanding of the evolution of humans and their microbiome. However, future studies on the metagenome sequencing of pygmy loris and other prosimians regarding the effects of age, genetics, and environment on the composition and activity of the metagenomes are required. PMID:23457582

  17. HPMCD: the database of human microbial communities from metagenomic datasets and microbial reference genomes.

    PubMed

    Forster, Samuel C; Browne, Hilary P; Kumar, Nitin; Hunt, Martin; Denise, Hubert; Mitchell, Alex; Finn, Robert D; Lawley, Trevor D

    2016-01-01

    The Human Pan-Microbe Communities (HPMC) database (http://www.hpmcd.org/) provides a manually curated, searchable, metagenomic resource to facilitate investigation of human gastrointestinal microbiota. Over the past decade, the application of metagenome sequencing to elucidate the microbial composition and functional capacity present in the human microbiome has revolutionized many concepts in our basic biology. When sufficient high quality reference genomes are available, whole genome metagenomic sequencing can provide direct biological insights and high-resolution classification. The HPMC database provides species level, standardized phylogenetic classification of over 1800 human gastrointestinal metagenomic samples. This is achieved by combining a manually curated list of bacterial genomes from human faecal samples with over 21000 additional reference genomes representing bacteria, viruses, archaea and fungi with manually curated species classification and enhanced sample metadata annotation. A user-friendly, web-based interface provides the ability to search for (i) microbial groups associated with health or disease state, (ii) health or disease states and community structure associated with a microbial group, (iii) the enrichment of a microbial gene or sequence and (iv) enrichment of a functional annotation. The HPMC database enables detailed analysis of human microbial communities and supports research from basic microbiology and immunology to therapeutic development in human health and disease. PMID:26578596

  19. Multivariate Analysis of Functional Metagenomes

    PubMed Central

    Dinsdale, Elizabeth A.; Edwards, Robert A.; Bailey, Barbara A.; Tuba, Imre; Akhter, Sajia; McNair, Katelyn; Schmieder, Robert; Apkarian, Naneh; Creek, Michelle; Guan, Eric; Hernandez, Mayra; Isaacs, Katherine; Peterson, Chris; Regh, Todd; Ponomarenko, Vadim

    2013-01-01

    Metagenomics is a primary tool for the description of microbial and viral communities. The sheer magnitude of the data generated in each metagenome makes key differences in function and taxonomy between communities difficult to identify. Here we discuss the application of seven different data mining and statistical analyses by comparing and contrasting the metabolic functions of 212 microbial metagenomes within and between 10 environments. Not all approaches are appropriate for all questions, and researchers should decide which approach addresses their questions. This work demonstrated the use of each approach: for example, random forests provided a robust and enlightening description of both the clustering of metagenomes and the metabolic processes that were important in separating microbial communities from different environments. All analyses identified that the presence of phage genes within the microbial community was a predictor of whether the microbial community was host-associated or free-living. Several analyses identified the subtle differences that occur within environments, such as those seen in different regions of the marine environment. PMID:23579547

  20. Estimating richness from phage metagenomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Bacteriophages are important drivers of ecosystem functions, yet little is known about the vast majority of phages. Phage metagenomics, or the study of the collective genome of an assemblage of phages, enables the investigation of broad ecological questions in phage communities. One ecological cha...

  1. Metagenome Assembly at the DOE JGI (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    SciTech Connect

    Chain, Patrick

    2011-10-13

    Patrick Chain of DOE JGI at LANL, Co-Chair of the Metagenome-specific Assembly session, on "Metagenome Assembly at the DOE JGI" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  2. Metagenome Assembly at the DOE JGI (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Chain, Patrick [DOE JGI at LANL]

    2016-07-12

    Patrick Chain of DOE JGI at LANL, Co-Chair of the Metagenome-specific Assembly session, on "Metagenome Assembly at the DOE JGI" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  3. The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes.

    SciTech Connect

    Meyer, F.; Paarmann, D.; D'Souza, M.; Olson, R.; Glass, E. M.; Kubal, M.; Paczian, T.; Stevens, R.; Wilke, A.; Wilkening, J.; Edwards, R. A.; Rodriguez, A.; Mathematics and Computer Science; Univ. of Chicago; San Diego State Univ.

    2008-09-19

    Random community genomes (metagenomes) are now commonly used to study microbes in different environments. Over the past few years, the major challenge associated with metagenomics shifted from generating to analyzing sequences. High-throughput, low-cost next-generation sequencing has provided access to metagenomics to a wide range of researchers. A high-throughput pipeline has been constructed to provide high-performance computing to all researchers interested in using metagenomics. The pipeline produces automated functional assignments of sequences in the metagenome by comparing both protein and nucleotide databases. Phylogenetic and functional summaries of the metagenomes are generated, and tools for comparative metagenomics are incorporated into the standard views. User access is controlled to ensure data privacy, but the collaborative environment underpinning the service provides a framework for sharing datasets between multiple users. In the metagenomics RAST, all users retain full control of their data, and everything is available for download in a variety of formats. The open-source metagenomics RAST service provides a new paradigm for the annotation and analysis of metagenomes. With built-in support for multiple data sources and a back end that houses abstract data types, the metagenomics RAST is stable, extensible, and freely available to all researchers. This service has removed one of the primary bottlenecks in metagenome sequence analysis: the availability of high-performance computing for annotating the data.

  4. Taxonomic and functional assignment of cloned sequences from high Andean forest soil metagenome.

    PubMed

    Montaña, José Salvador; Jiménez, Diego Javier; Hernández, Mónica; Angel, Tatiana; Baena, Sandra

    2012-02-01

    Total metagenomic DNA was isolated from high Andean forest soil and subjected to taxonomical and functional composition analyses by means of clone library generation and sequencing. The obtained yield of 1.7 μg of DNA/g of soil was used to construct a metagenomic library of approximately 20,000 clones (in the plasmid p-Bluescript II SK+) with an average insert size of 4 kb, covering 80 Mb of the total metagenomic DNA. Metagenomic sequences near the plasmid cloning site were sequenced and then trimmed and assembled, obtaining 299 reads and 31 contigs (0.3 Mb). Taxonomic assignment of total sequences was performed by BLASTX, resulting in 68.8, 44.8 and 24.5% classification into taxonomic groups using the metagenomic RAST server v2.0, WebCARMA v1.0 online system and MetaGenome Analyzer v3.8 software, respectively. Most clone sequences were classified as Bacteria belonging to phyla Actinobacteria, Proteobacteria and Acidobacteria. Among the most represented orders were Actinomycetales (34% average), Rhizobiales, Burkholderiales and Myxococcales and with a greater number of sequences in the genus Mycobacterium (7% average), Frankia, Streptomyces and Bradyrhizobium. The vast majority of sequences were associated with the metabolism of carbohydrates, proteins, lipids and catalytic functions, such as phosphatases, glycosyltransferases, dehydrogenases, methyltransferases, dehydratases and epoxide hydrolases. In this study we compared different methods of taxonomic and functional assignment of metagenomic clone sequences to evaluate microbial diversity in an unexplored soil ecosystem, searching for putative enzymes of biotechnological interest and generating important information for further functional screening of clone libraries. PMID:21792685
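
    The library size quoted in this abstract follows from simple arithmetic: about 20,000 clones times an average insert of 4 kb is 80,000 kb, or 80 Mb. A minimal check, with a hypothetical helper name, might look like this:

```python
def library_size_mb(n_clones, mean_insert_kb):
    """Total cloned DNA in megabases (1 Mb = 1000 kb)."""
    return n_clones * mean_insert_kb / 1000


print(library_size_mb(20_000, 4))  # 80.0
```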

  5. Metagenomes from Argonne's MG-RAST Metagenomics Analysis Server

    DOE Data Explorer

    MG-RAST has a large number of datasets that researchers have deposited for public use. As of July, 2014, the number of metagenomes represented by MG-RAST numbered more than 18,500, and the number of available sequences was more than 75 million! The public can browse the collection several different ways, and researchers can login to deposit new data. Researchers have the choice of keeping a dataset private so that it is viewable only by them when logged in, or they can choose to make a dataset public at any time with a simple click of a link. MG-RAST was launched in 2007 by the Mathematics and Computer Science Division at Argonne National Laboratory (ANL). It is part of the toolkit available to the Terragenomics project, which seeks to do a comprehensive metagenomics study of U.S. soil. The Terragenomics project page is located at http://www.mcs.anl.gov/research/projects/terragenomics/.

  6. Metagenomics of the Svalbard Reindeer Rumen Microbiome Reveals Abundance of Polysaccharide Utilization Loci

    PubMed Central

    Pope, Phillip B.; Mackenzie, Alasdair K.; Gregor, Ivan; Smith, Wendy; Sundset, Monica A.; McHardy, Alice C.; Morrison, Mark; Eijsink, Vincent G.H.

    2012-01-01

    Lignocellulosic biomass remains a largely untapped source of renewable energy predominantly due to its recalcitrance and an incomplete understanding of how this is overcome in nature. We present here a compositional and comparative analysis of metagenomic data pertaining to a natural biomass-converting ecosystem adapted to austere arctic nutritional conditions, namely the rumen microbiome of Svalbard reindeer (Rangifer tarandus platyrhynchus). Community analysis showed that deeply-branched cellulolytic lineages affiliated to the Bacteroidetes and Firmicutes are dominant, whilst sequence binning methods facilitated the assemblage of metagenomic sequence for a dominant and novel Bacteroidales clade (SRM-1). Analysis of unassembled metagenomic sequence as well as metabolic reconstruction of SRM-1 revealed the presence of multiple polysaccharide utilization loci-like systems (PULs) as well as members of more than 20 glycoside hydrolase and other carbohydrate-active enzyme families targeting various polysaccharides including cellulose, xylan and pectin. Functional screening of cloned metagenome fragments revealed high cellulolytic activity and an abundance of PULs that are rich in endoglucanases (GH5) but devoid of other common enzymes thought to be involved in cellulose degradation. Combining these results with known and partly re-evaluated metagenomic data strongly indicates that much like the human distal gut, the digestive system of herbivores harbours high numbers of deeply branched and as-yet uncultured members of the Bacteroidetes that depend on PUL-like systems for plant biomass degradation. PMID:22701672

  7. Metagenomics of the Svalbard reindeer rumen microbiome reveals abundance of polysaccharide utilization loci.

    PubMed

    Pope, Phillip B; Mackenzie, Alasdair K; Gregor, Ivan; Smith, Wendy; Sundset, Monica A; McHardy, Alice C; Morrison, Mark; Eijsink, Vincent G H

    2012-01-01

    Lignocellulosic biomass remains a largely untapped source of renewable energy predominantly due to its recalcitrance and an incomplete understanding of how this is overcome in nature. We present here a compositional and comparative analysis of metagenomic data pertaining to a natural biomass-converting ecosystem adapted to austere arctic nutritional conditions, namely the rumen microbiome of Svalbard reindeer (Rangifer tarandus platyrhynchus). Community analysis showed that deeply-branched cellulolytic lineages affiliated to the Bacteroidetes and Firmicutes are dominant, whilst sequence binning methods facilitated the assemblage of metagenomic sequence for a dominant and novel Bacteroidales clade (SRM-1). Analysis of unassembled metagenomic sequence as well as metabolic reconstruction of SRM-1 revealed the presence of multiple polysaccharide utilization loci-like systems (PULs) as well as members of more than 20 glycoside hydrolase and other carbohydrate-active enzyme families targeting various polysaccharides including cellulose, xylan and pectin. Functional screening of cloned metagenome fragments revealed high cellulolytic activity and an abundance of PULs that are rich in endoglucanases (GH5) but devoid of other common enzymes thought to be involved in cellulose degradation. Combining these results with known and partly re-evaluated metagenomic data strongly indicates that much like the human distal gut, the digestive system of herbivores harbours high numbers of deeply branched and as-yet uncultured members of the Bacteroidetes that depend on PUL-like systems for plant biomass degradation. PMID:22701672

  9. Marine metagenomics as a source for bioprospecting.

    PubMed

    Kodzius, Rimantas; Gojobori, Takashi

    2015-12-01

    This review summarizes the use of genome-editing technologies in metagenomic studies; these studies are used to retrieve and modify valuable microorganisms for production, particularly in marine metagenomics. Organisms may be cultivable or uncultivable, and metagenomics provides especially valuable information for uncultivable samples. Novel genes, pathways and genomes can be deduced. Therefore, metagenomics, particularly genome engineering and systems biology, allows for the enhancement of biological and chemical producers and the creation of novel bioresources. With natural resources rapidly depleting, genomics may be an effective way to efficiently produce quantities of known and novel foods, livestock feed, fuels, pharmaceuticals and fine or bulk chemicals. PMID:26204808

  10. Metagenomic Assembly: Overview, Challenges and Applications

    PubMed Central

    Ghurye, Jay S.; Cepeda-Espinoza, Victoria; Pop, Mihai

    2016-01-01

    Advances in sequencing technologies have led to the increased use of high throughput sequencing in characterizing the microbial communities associated with our bodies and our environment. Critical to the analysis of the resulting data are sequence assembly algorithms able to reconstruct genes and organisms from complex mixtures. Metagenomic assembly involves new computational challenges due to the specific characteristics of the metagenomic data. In this survey, we focus on major algorithmic approaches for genome and metagenome assembly, and discuss the new challenges and opportunities afforded by this new field. We also review several applications of metagenome assembly in addressing interesting biological problems.

  11. Marine metagenomics as a source for bioprospecting.

    PubMed

    Kodzius, Rimantas; Gojobori, Takashi

    2015-12-01

    This review summarizes the use of genome-editing technologies in metagenomic studies; these studies are used to retrieve and modify valuable microorganisms for production, particularly in marine metagenomics. Organisms may be cultivable or uncultivable, and metagenomics provides especially valuable information for uncultivable samples. Novel genes, pathways and genomes can be deduced. Therefore, metagenomics, particularly genome engineering and systems biology, allows for the enhancement of biological and chemical producers and the creation of novel bioresources. With natural resources rapidly depleting, genomics may be an effective way to efficiently produce quantities of known and novel foods, livestock feed, fuels, pharmaceuticals and fine or bulk chemicals.

  12. Metagenomic Assembly: Overview, Challenges and Applications

    PubMed Central

    Ghurye, Jay S.; Cepeda-Espinoza, Victoria; Pop, Mihai

    2016-01-01

    Advances in sequencing technologies have led to the increased use of high throughput sequencing in characterizing the microbial communities associated with our bodies and our environment. Critical to the analysis of the resulting data are sequence assembly algorithms able to reconstruct genes and organisms from complex mixtures. Metagenomic assembly involves new computational challenges due to the specific characteristics of the metagenomic data. In this survey, we focus on major algorithmic approaches for genome and metagenome assembly, and discuss the new challenges and opportunities afforded by this new field. We also review several applications of metagenome assembly in addressing interesting biological problems. PMID:27698619

  13. Magma Fragmentation

    NASA Astrophysics Data System (ADS)

    Gonnermann, Helge M.

    2015-05-01

    Magma fragmentation is the breakup of a continuous volume of molten rock into discrete pieces, called pyroclasts. Because magma contains bubbles of compressible magmatic volatiles, decompression of low-viscosity magma leads to rapid expansion. The magma is torn into fragments, as it is stretched into hydrodynamically unstable sheets and filaments. If the magma is highly viscous, resistance to bubble growth will instead lead to excess gas pressure and the magma will deform viscoelastically by fracturing like a glassy solid, resulting in the formation of a violently expanding gas-pyroclast mixture. In either case, fragmentation represents the conversion of potential energy into the surface energy of the newly created fragments and the kinetic energy of the expanding gas-pyroclast mixture. If magma comes into contact with external water, the conversion of thermal energy will vaporize water and quench magma at the melt-water interface, thus creating dynamic stresses that cause fragmentation and the release of kinetic energy. Lastly, shear deformation of highly viscous magma may cause brittle fractures and release seismic energy.

  14. Functional metagenomics of extreme environments.

    PubMed

    Mirete, Salvador; Morgante, Verónica; González-Pastor, José Eduardo

    2016-04-01

    The bioprospecting of enzymes that operate under extreme conditions is of particular interest for many biotechnological and industrial processes. Nevertheless, the retrieval of novel enzymes is considerably limited, as only a small fraction of microorganisms derived from extreme environments can be cultured under standard laboratory conditions. Functional metagenomics has the advantage of requiring neither the cultivation of microorganisms nor prior sequence information on known genes, thus representing a valuable approach for mining enzymes with new features. In this review, we summarize studies showing how functional metagenomics was employed to retrieve genes encoding proteins involved not only in molecular adaptation and resistance to extreme environmental conditions but also in other enzymatic activities of biotechnological interest.

  15. Resolving the Complexity of Human Skin Metagenomes Using Single-Molecule Sequencing

    PubMed Central

    Tsai, Yu-Chih; Deming, Clayton; Segre, Julia A.; Kong, Heidi H.; Korlach, Jonas

    2016-01-01

    Deep metagenomic shotgun sequencing has emerged as a powerful tool to interrogate composition and function of complex microbial communities. Computational approaches to assemble genome fragments have been demonstrated to be an effective tool for de novo reconstruction of genomes from these communities. However, the resultant “genomes” are typically fragmented and incomplete due to the limited ability of short-read sequence data to assemble complex or low-coverage regions. Here, we use single-molecule, real-time (SMRT) sequencing to reconstruct a high-quality, closed genome of a previously uncharacterized Corynebacterium simulans and its companion bacteriophage from a skin metagenomic sample. Considerable improvement in assembly quality occurs in hybrid approaches incorporating short-read data, with even relatively small amounts of long-read data being sufficient to improve metagenome reconstruction. Using short-read data to evaluate strain variation of this C. simulans in its skin community at single-nucleotide resolution, we observed a dominant C. simulans strain with moderate allelic heterozygosity throughout the population. We demonstrate the utility of SMRT sequencing and hybrid approaches in metagenome quantitation, reconstruction, and annotation. PMID:26861018

  16. Subtractive assembly for comparative metagenomics, and its application to type 2 diabetes metagenomes.

    PubMed

    Wang, Mingjie; Doak, Thomas G; Ye, Yuzhen

    2015-11-02

    Comparative metagenomics remains challenging due to the size and complexity of metagenomic datasets. Here we introduce subtractive assembly, a de novo assembly approach for comparative metagenomics that directly assembles only the differential reads that distinguish between two groups of metagenomes. Using simulated datasets, we show it improves both the efficiency of the assembly and the assembly quality of the differential genomes and genes. Further, its application to type 2 diabetes (T2D) metagenomic datasets reveals clear signatures of the T2D gut microbiome, revealing new phylogenetic and functional features of the gut microbial communities associated with T2D.
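
    The idea of keeping only the reads that distinguish one group of metagenomes from another can be illustrated with a toy k-mer filter. This is a minimal sketch under stated assumptions, not the published tool: the abstract does not say how differential reads are selected, so the k-mer criterion, the function names (`kmers`, `differential_reads`), the parameters `k` and `min_fraction`, and the example reads are all hypothetical.

```python
def kmers(seq, k):
    """Yield every k-mer of seq (reverse complements are ignored for brevity)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def differential_reads(group_a, group_b, k=4, min_fraction=0.5):
    """Return reads from group_a whose k-mers are mostly absent from group_b.

    Assumption: a read counts as "differential" when at least min_fraction
    of its k-mers never occur in any read of group_b.
    """
    # Collect every k-mer seen in the comparison (background) group.
    background = set()
    for read in group_b:
        background.update(kmers(read, k))

    kept = []
    for read in group_a:
        read_kmers = list(kmers(read, k))
        if not read_kmers:  # read shorter than k
            continue
        novel = sum(1 for km in read_kmers if km not in background)
        if novel / len(read_kmers) >= min_fraction:
            kept.append(read)
    return kept


# Hypothetical toy reads: the first read of group_a is shared with group_b.
group_a = ["ACGTACGTAC", "TTTTGGGGCC"]
group_b = ["ACGTACGTTT", "ACGTACGAAC"]
selected = differential_reads(group_a, group_b)
print(selected)
```

    Only the selected reads would then be passed to an assembler, which is where the efficiency gain described in the abstract would come from.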

  17. Explaining diversity in metagenomic datasets by phylogenetic-based feature weighting.

    PubMed

    Albanese, Davide; De Filippo, Carlotta; Cavalieri, Duccio; Donati, Claudio

    2015-03-01

    Metagenomics is revolutionizing our understanding of microbial communities, showing that their structure and composition have profound effects on the ecosystem and in a variety of health and disease conditions. Despite the flourishing of new analysis methods, current approaches based on statistical comparisons between high-level taxonomic classes often fail to identify the microbial taxa that are differentially distributed between sets of samples, since in many cases the taxonomic schema do not allow an adequate description of the structure of the microbiota. This constitutes a severe limitation to the use of metagenomic data in therapeutic and diagnostic applications. To provide a more robust statistical framework, we introduce a class of feature-weighting algorithms that discriminate the taxa responsible for the classification of metagenomic samples. The method unambiguously groups the relevant taxa into clades without relying on pre-defined taxonomic categories, thus including in the analysis also those sequences for which a taxonomic classification is difficult. The phylogenetic clades are weighted and ranked according to their abundance measuring their contribution to the differentiation of the classes of samples, and a criterion is provided to define a reduced set of most relevant clades. Applying the method to public datasets, we show that the data-driven definition of relevant phylogenetic clades accomplished by our ranking strategy identifies features in the samples that are lost if phylogenetic relationships are not considered, improving our ability to mine metagenomic datasets. Comparison with supervised classification methods currently used in metagenomic data analysis highlights the advantages of using phylogenetic information. PMID:25815895

  18. Exploring neighborhoods in the metagenome universe.

    PubMed

    Aßhauer, Kathrin P; Klingenberg, Heiner; Lingner, Thomas; Meinicke, Peter

    2014-01-01

    The variety of metagenomes in current databases provides a rapidly growing source of information for comparative studies. However, the quantity and quality of supplementary metadata are still lagging behind. It is therefore important to be able to identify related metagenomes by means of the available sequence data alone. We have studied efficient sequence-based methods for large-scale identification of similar metagenomes within a database retrieval context. In a broad comparison of different profiling methods we found that vector-based distance measures are well suited for the detection of metagenomic neighbors. Our evaluation on more than 1700 publicly available metagenomes indicates that for a query metagenome from a particular habitat, on average nine out of ten nearest neighbors represent the same habitat category, independent of the utilized profiling method or distance measure. While for well-defined labels a neighborhood accuracy of 100% can be achieved, in general the neighbor detection is severely affected by a natural overlap of manually annotated categories. In addition, we present results of a novel visualization method that is able to reflect the similarity of metagenomes in a 2D scatter plot. The visualization method shows a similarly high accuracy in the reduced space as compared with the high-dimensional profile space. Our study suggests that for inspection of metagenome neighborhoods the profiling methods and distance measures can be chosen to provide a convenient interpretation of results in terms of the underlying features. Furthermore, supplementary metadata of metagenome samples in the future needs to comply with readily available ontologies for fine-grained and standardized annotation. To make profile-based k-nearest-neighbor search and the 2D visualization of the metagenome universe available to the research community, we included the proposed methods in our CoMet-Universe server for comparative metagenome analysis. PMID:25026170
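
    Profile-based nearest-neighbor search and the neighborhood accuracy mentioned above can be sketched as follows. This is an illustration only: the abstract does not name the distance measures or profiling methods used, so cosine distance, the three-feature profiles, the habitat labels and all function names here are assumptions.

```python
import math


def cosine_distance(u, v):
    """Cosine distance between two non-zero profile vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def nearest_neighbors(query, database, k=2):
    """Return the names of the k database profiles closest to the query."""
    ranked = sorted(database, key=lambda name: cosine_distance(query, database[name]))
    return ranked[:k]


def neighborhood_accuracy(query_label, neighbors, labels):
    """Fraction of neighbors that carry the same habitat label as the query."""
    hits = sum(1 for name in neighbors if labels[name] == query_label)
    return hits / len(neighbors)


# Hypothetical relative-abundance profiles over three features.
profiles = {
    "soil_1": [0.50, 0.30, 0.20],
    "soil_2": [0.45, 0.35, 0.20],
    "ocean_1": [0.10, 0.20, 0.70],
    "gut_1": [0.20, 0.70, 0.10],
}
labels = {"soil_1": "soil", "soil_2": "soil", "ocean_1": "ocean", "gut_1": "gut"}

query = [0.48, 0.32, 0.20]  # hypothetical query metagenome from a soil habitat
neighbors = nearest_neighbors(query, profiles, k=2)
accuracy = neighborhood_accuracy("soil", neighbors, labels)
print(neighbors, accuracy)
```

    A brute-force scan like this is adequate for a few thousand profiles; a larger database would probably need an index structure, which is outside the scope of this sketch.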

  19. Exploring neighborhoods in the metagenome universe.

    PubMed

    Aßhauer, Kathrin P; Klingenberg, Heiner; Lingner, Thomas; Meinicke, Peter

    2014-07-14

    The variety of metagenomes in current databases provides a rapidly growing source of information for comparative studies. However, the quantity and quality of supplementary metadata are still lagging behind. It is therefore important to be able to identify related metagenomes by means of the available sequence data alone. We have studied efficient sequence-based methods for large-scale identification of similar metagenomes within a database retrieval context. In a broad comparison of different profiling methods we found that vector-based distance measures are well suited for the detection of metagenomic neighbors. Our evaluation on more than 1700 publicly available metagenomes indicates that for a query metagenome from a particular habitat, on average nine out of ten nearest neighbors represent the same habitat category, independent of the utilized profiling method or distance measure. While for well-defined labels a neighborhood accuracy of 100% can be achieved, in general the neighbor detection is severely affected by a natural overlap of manually annotated categories. In addition, we present results of a novel visualization method that is able to reflect the similarity of metagenomes in a 2D scatter plot. The visualization method shows a similarly high accuracy in the reduced space as compared with the high-dimensional profile space. Our study suggests that for inspection of metagenome neighborhoods the profiling methods and distance measures can be chosen to provide a convenient interpretation of results in terms of the underlying features. Furthermore, supplementary metadata of metagenome samples in the future needs to comply with readily available ontologies for fine-grained and standardized annotation. To make profile-based k-nearest-neighbor search and the 2D visualization of the metagenome universe available to the research community, we included the proposed methods in our CoMet-Universe server for comparative metagenome analysis.

  20. Accurate binning of metagenomic contigs via automated clustering sequences using information of genomic signatures and marker genes

    PubMed Central

    Lin, Hsin-Hung; Liao, Yu-Chieh

    2016-01-01

    Metagenomics, the application of shotgun sequencing, facilitates the reconstruction of the genomes of individual species from natural environments. A major challenge in the genome recovery domain is to agglomerate or ‘bin’ sequences assembled from metagenomic reads into individual groups. Metagenomic binning without consideration of reference sequences enables the comprehensive discovery of new microbial organisms and aids in the microbial genome reconstruction process. Here we present MyCC, an automated binning tool that combines genomic signatures, marker genes and optional contig coverages within one or multiple samples, in order to visualize the metagenomes and to identify the reconstructed genomic fragments. We demonstrate the superior performance of MyCC compared to other binning tools including CONCOCT, GroopM, MaxBin and MetaBAT on both synthetic and real human gut communities with a small sample size (one to 11 samples), as well as on a large metagenome dataset (over 250 samples). Moreover, we demonstrate the visualization of metagenomes in MyCC to aid in the reconstruction of genomes from distinct bins. MyCC is freely available at http://sourceforge.net/projects/sb2nhri/files/MyCC/. PMID:27067514
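
    A genomic signature of the kind used for binning is commonly computed as a normalized k-mer frequency vector per contig. The sketch below shows that general idea only and is not MyCC's implementation: the choice of tetranucleotides (k=4), the Euclidean distance, the function names and the toy contigs are assumptions made for illustration.

```python
import math
from itertools import product


def signature(seq, k=4):
    """Return the normalized k-mer frequency vector of a contig.

    The vector has 4**k entries in lexicographic k-mer order; k-mers with
    ambiguous bases (anything other than A, C, G, T) are skipped.
    """
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    counts = [0] * len(index)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        pos = index.get(seq[i:i + k])
        if pos is not None:
            counts[pos] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def euclidean(u, v):
    """Euclidean distance between two signature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical contigs: a and b share composition, c is very different.
contig_a = "ACGT" * 10
contig_b = "ACGT" * 8
contig_c = "GGCC" * 10

sig_a, sig_b, sig_c = signature(contig_a), signature(contig_b), signature(contig_c)
print(euclidean(sig_a, sig_b), euclidean(sig_a, sig_c))
```

    Contigs with similar signatures end up close together, so a clustering step over these vectors (optionally combined with coverage and marker-gene information, as the abstract describes) can group them into bins.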

  1. Current and future resources for functional metagenomics.

    PubMed

    Lam, Kathy N; Cheng, Jiujun; Engel, Katja; Neufeld, Josh D; Charles, Trevor C

    2015-01-01

    Functional metagenomics is a powerful experimental approach for studying gene function, starting from the extracted DNA of mixed microbial populations. A functional approach relies on the construction and screening of metagenomic libraries (physical libraries that contain DNA cloned from environmental metagenomes). The information obtained from functional metagenomics can help in future annotations of gene function and serve as a complement to sequence-based metagenomics. In this Perspective, we begin by summarizing the technical challenges of constructing metagenomic libraries and emphasize their value as resources. We then discuss libraries constructed using the popular cloning vector, pCC1FOS, and highlight the strengths and shortcomings of this system, alongside possible strategies to maximize existing pCC1FOS-based libraries by screening in diverse hosts. Finally, we discuss the known bias of libraries constructed from human gut and marine water samples, present results that suggest bias may also occur for soil libraries, and consider factors that bias metagenomic libraries in general. We anticipate that discussion of current resources and limitations will advance tools and technologies for functional metagenomics research. PMID:26579102

  2. Current and future resources for functional metagenomics

    PubMed Central

    Lam, Kathy N.; Cheng, Jiujun; Engel, Katja; Neufeld, Josh D.; Charles, Trevor C.

    2015-01-01

    Functional metagenomics is a powerful experimental approach for studying gene function, starting from the extracted DNA of mixed microbial populations. A functional approach relies on the construction and screening of metagenomic libraries—physical libraries that contain DNA cloned from environmental metagenomes. The information obtained from functional metagenomics can help in future annotations of gene function and serve as a complement to sequence-based metagenomics. In this Perspective, we begin by summarizing the technical challenges of constructing metagenomic libraries and emphasize their value as resources. We then discuss libraries constructed using the popular cloning vector, pCC1FOS, and highlight the strengths and shortcomings of this system, alongside possible strategies to maximize existing pCC1FOS-based libraries by screening in diverse hosts. Finally, we discuss the known bias of libraries constructed from human gut and marine water samples, present results that suggest bias may also occur for soil libraries, and consider factors that bias metagenomic libraries in general. We anticipate that discussion of current resources and limitations will advance tools and technologies for functional metagenomics research. PMID:26579102

  3. Multisubstrate isotope labeling and metagenomic analysis of active soil bacterial communities.

    PubMed

    Verastegui, Y; Cheng, J; Engel, K; Kolczynski, D; Mortimer, S; Lavigne, J; Montalibet, J; Romantsov, T; Hall, M; McConkey, B J; Rose, D R; Tomashek, J J; Scott, B R; Charles, T C; Neufeld, J D

    2014-07-15

    Soil microbial diversity represents the largest global reservoir of novel microorganisms and enzymes. In this study, we coupled functional metagenomics and DNA stable-isotope probing (DNA-SIP) using multiple plant-derived carbon substrates and diverse soils to characterize active soil bacterial communities and their glycoside hydrolase genes, which have value for industrial applications. We incubated samples from three disparate Canadian soils (tundra, temperate rainforest, and agricultural) with five native carbon (¹²C) or stable-isotope-labeled (¹³C) carbohydrates (glucose, cellobiose, xylose, arabinose, and cellulose). Indicator species analysis revealed high specificity and fidelity for many uncultured and unclassified bacterial taxa in the heavy DNA for all soils and substrates. Among characterized taxa, Actinomycetales (Salinibacterium), Rhizobiales (Devosia), Rhodospirillales (Telmatospirillum), and Caulobacterales (Phenylobacterium and Asticcacaulis) were bacterial indicator species for the heavy substrates and soils tested. Both Actinomycetales and Caulobacterales (Phenylobacterium) were associated with metabolism of cellulose, and Alphaproteobacteria were associated with the metabolism of arabinose; members of the order Rhizobiales were strongly associated with the metabolism of xylose. Annotated metagenomic data suggested diverse glycoside hydrolase gene representation within the pooled heavy DNA. By screening 2,876 cloned fragments derived from the ¹³C-labeled DNA isolated from soils incubated with cellulose, we demonstrate the power of combining DNA-SIP, multiple-displacement amplification (MDA), and functional metagenomics by efficiently isolating multiple clones with activity on carboxymethyl cellulose and fluorogenic proxy substrates for carbohydrate-active enzymes. Importance: The ability to identify genes based on function, instead of sequence homology, allows the discovery of genes that would not be identified through sequence alone.

  4. Metagenomics using next-generation sequencing.

    PubMed

    Bragg, Lauren; Tyson, Gene W

    2014-01-01

    Traditionally, microbial genome sequencing has been restricted to the small number of species that can be grown in pure culture. The progressive development of culture-independent methods over the last 15 years now allows researchers to sequence microbial communities directly from environmental samples. This approach is commonly referred to as "metagenomics" or "community genomics". However, the term metagenomics is applied liberally in the literature to describe any culture-independent analysis of microbial communities. Here, we define metagenomics as shotgun ("random") sequencing of the genomic DNA of a sample taken directly from the environment. The metagenome can be thought of as a sampling of the collective genome of the microbial community. We outline the considerations and analyses that should be undertaken to ensure the success of a metagenomic sequencing project, including the choice of sequencing platform and methods for assembly, binning, annotation, and comparative analysis. PMID:24515370

  5. Metagenomic applications in environmental monitoring and bioremediation.

    PubMed

    Techtmann, Stephen M; Hazen, Terry C

    2016-10-01

    With the rapid advances in sequencing technology, the cost of sequencing has dramatically dropped and the scale of sequencing projects has increased accordingly. This has provided the opportunity for the routine use of sequencing techniques in the monitoring of environmental microbes. While metagenomic applications have been routinely applied to better understand the ecology and diversity of microbes, their use in environmental monitoring and bioremediation is increasingly common. In this review we seek to provide an overview of some of the metagenomic techniques used in environmental systems biology, addressing their application and limitation. We will also provide several recent examples of the application of metagenomics to bioremediation. We discuss examples where microbial communities have been used to predict the presence and extent of contamination, examples of how metagenomics can be used to characterize the process of natural attenuation by unculturable microbes, as well as examples detailing the use of metagenomics to understand the impact of biostimulation on microbial communities. PMID:27558781

  6. Metagenomic applications in environmental monitoring and bioremediation.

    PubMed

    Techtmann, Stephen M; Hazen, Terry C

    2016-10-01

    With the rapid advances in sequencing technology, the cost of sequencing has dramatically dropped and the scale of sequencing projects has increased accordingly. This has provided the opportunity for the routine use of sequencing techniques in the monitoring of environmental microbes. While metagenomic applications have been routinely applied to better understand the ecology and diversity of microbes, their use in environmental monitoring and bioremediation is increasingly common. In this review we seek to provide an overview of some of the metagenomic techniques used in environmental systems biology, addressing their application and limitation. We will also provide several recent examples of the application of metagenomics to bioremediation. We discuss examples where microbial communities have been used to predict the presence and extent of contamination, examples of how metagenomics can be used to characterize the process of natural attenuation by unculturable microbes, as well as examples detailing the use of metagenomics to understand the impact of biostimulation on microbial communities.

  7. Exploring the viral world through metagenomics.

    PubMed

    Rosario, Karyna; Breitbart, Mya

    2011-10-01

    Viral metagenomics, or shotgun sequencing of purified viral particles, has revolutionized the field of environmental virology by allowing the exploration of viral communities in a variety of sample types throughout the biosphere. The introduction of viral metagenomics has demonstrated that dominant viruses in environmental communities are not well-represented by the cultured viruses in existing sequence databases. Viral metagenomic studies have provided insights into viral ecology by elucidating the genetic potential, community structure, and biogeography of environmental viruses. In addition, viral metagenomics has expanded current knowledge of virus-host interactions by uncovering genes that may allow viruses to manipulate their hosts in unexpected ways. The intrinsic potential for virus discovery through viral metagenomics can help advance a wide array of disciplines including evolutionary biology, pathogen surveillance, and biotechnology.

  8. Metagenomics using next-generation sequencing.

    PubMed

    Bragg, Lauren; Tyson, Gene W

    2014-01-01

    Traditionally, microbial genome sequencing has been restricted to the small number of species that can be grown in pure culture. The progressive development of culture-independent methods over the last 15 years now allows researchers to sequence microbial communities directly from environmental samples. This approach is commonly referred to as "metagenomics" or "community genomics". However, the term metagenomics is applied liberally in the literature to describe any culture-independent analysis of microbial communities. Here, we define metagenomics as shotgun ("random") sequencing of the genomic DNA of a sample taken directly from the environment. The metagenome can be thought of as a sampling of the collective genome of the microbial community. We outline the considerations and analyses that should be undertaken to ensure the success of a metagenomic sequencing project, including the choice of sequencing platform and methods for assembly, binning, annotation, and comparative analysis.

  9. Metagenomic applications in environmental monitoring and bioremediation

    SciTech Connect

    Techtmann, Stephen M.; Hazen, Terry C.

    2016-01-01

    With the rapid advances in sequencing technology, the cost of sequencing has dramatically dropped and the scale of sequencing projects has increased accordingly. This has provided the opportunity for the routine use of sequencing techniques in the monitoring of environmental microbes. While metagenomic applications have been routinely applied to better understand the ecology and diversity of microbes, their use in environmental monitoring and bioremediation is increasingly common. In this review we seek to provide an overview of some of the metagenomic techniques used in environmental systems biology, addressing their application and limitation. We will also provide several recent examples of the application of metagenomics to bioremediation. We discuss examples where microbial communities have been used to predict the presence and extent of contamination, examples of how metagenomics can be used to characterize the process of natural attenuation by unculturable microbes, as well as examples detailing the use of metagenomics to understand the impact of biostimulation on microbial communities.

  10. Assembling the Marine Metagenome, One Cell at a Time

    SciTech Connect

    Woyke, Tanja; Xie, Gary; Copeland, Alex; Gonzalez, Jose M.; Han, Cliff; Kiss, Hajnalka; Saw, Jimmy H.; Senin, Pavel; Yang, Chi; Chatterji, Sourav; Cheng, Jan-Fang; Eisen, Jonathan A.; Sieracki, Michael E.; Stepanauskas, Ramunas

    2010-06-24

    The difficulty associated with the cultivation of most microorganisms and the complexity of natural microbial assemblages, such as marine plankton or human microbiome, hinder genome reconstruction of representative taxa using cultivation or metagenomic approaches. Here we used an alternative, single cell sequencing approach to obtain high-quality genome assemblies of two uncultured, numerically significant marine microorganisms. We employed fluorescence-activated cell sorting and multiple displacement amplification to obtain hundreds of micrograms of genomic DNA from individual, uncultured cells of two marine flavobacteria from the Gulf of Maine that were phylogenetically distant from existing cultured strains. Shotgun sequencing and genome finishing yielded 1.9 Mbp in 17 contigs and 1.5 Mbp in 21 contigs for the two flavobacteria, with estimated genome recoveries of about 91% and 78%, respectively. Only 0.24% of the assembling sequences were contaminants and were removed from further analysis using rigorous quality control. In contrast to all cultured strains of marine flavobacteria, the two single cell genomes were excellent Global Ocean Sampling (GOS) metagenome fragment recruiters, demonstrating their numerical significance in the ocean. The geographic distribution of GOS recruits along the Northwest Atlantic coast coincided with ocean surface currents. Metabolic reconstruction indicated diverse potential energy sources, including biopolymer degradation, proteorhodopsin photometabolism, and hydrogen oxidation. Compared to cultured relatives, the two uncultured flavobacteria have small genome sizes, few non-coding nucleotides, and few paralogous genes, suggesting adaptations to narrow ecological niches. These features may have contributed to the abundance of the two taxa in specific regions of the ocean, and may have hindered their cultivation. We demonstrate the power of single cell DNA sequencing to generate reference genomes of uncultured

  11. Metagenomic reconstructions of bacterial CRISPR loci constrain population histories.

    PubMed

    Sun, Christine L; Thomas, Brian C; Barrangou, Rodolphe; Banfield, Jillian F

    2016-04-01

    Bacterial CRISPR-Cas systems provide insight into recent population history because they rapidly incorporate, in a unidirectional manner, short fragments (spacers) from coexisting infective virus populations into host chromosomes. Immunity is achieved by sequence identity between transcripts of spacers and their targets. Here, we used metagenomics to study the stability and dynamics of the type I-E CRISPR-Cas locus of Leptospirillum group II bacteria in biofilms sampled over 5 years from an acid mine drainage (AMD) system. Despite recovery of 452,686 spacers from CRISPR amplicons and metagenomic data, rarefaction curves of spacers show no saturation. The vast repertoire of spacers is attributed to phage/plasmid population diversity and retention of old spacers, despite rapid evolution of the targeted phage/plasmid genome regions (proto-spacers). The oldest spacers (spacers found at the trailer end) are conserved for at least 5 years, and 12% of these retain perfect or near-perfect matches to proto-spacer targets. The majority of proto-spacer regions contain an AAG proto-spacer adjacent motif (PAM). Spacers throughout the locus target the same phage population (AMDV1), but there are blocks of consecutive spacers without AMDV1 target sequences. Results suggest long-term coexistence of Leptospirillum with AMDV1 and periods when AMDV1 was less dominant. Metagenomics can be applied to millions of cells in a single sample to provide an extremely large spacer inventory, allow identification of phage/plasmids and enable analysis of previous phage/plasmid exposure. Thus, this approach can provide insights into prior bacterial environment and genetic interplay between hosts and their viruses. PMID:26394009
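
    The statement that rarefaction curves of spacers show no saturation can be made concrete with a small accumulation-curve sketch. It is a generic illustration, not the authors' procedure: the function name `rarefaction_curve`, its parameters, and the toy spacer identifiers are hypothetical. The curve reports the mean number of distinct spacers seen after sampling a given number of spacer observations, averaged over random orderings.

```python
import random


def rarefaction_curve(observations, step=1, replicates=20, seed=0):
    """Mean count of distinct items at increasing sampling depths.

    Each replicate shuffles the observations and counts the unique items
    accumulated at every depth that is a multiple of `step`.
    """
    rng = random.Random(seed)
    depths = list(range(step, len(observations) + 1, step))
    totals = [0] * len(depths)
    for _ in range(replicates):
        shuffled = observations[:]
        rng.shuffle(shuffled)
        seen = set()
        d = 0  # index of the next depth to record
        for n, item in enumerate(shuffled, start=1):
            seen.add(item)
            if d < len(depths) and n == depths[d]:
                totals[d] += len(seen)
                d += 1
    return [(depth, total / replicates) for depth, total in zip(depths, totals)]


# Hypothetical spacer observations: 8 observations, 6 distinct spacers.
spacers = ["s1", "s2", "s1", "s3", "s4", "s2", "s5", "s6"]
curve = rarefaction_curve(spacers, step=2)
for depth, mean_unique in curve:
    print(depth, mean_unique)
```

    A curve that is still climbing at the deepest sampling point, as reported for the 452,686 recovered spacers, suggests that further sampling would keep uncovering new spacers; a curve that flattens would indicate saturation.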

  12. Assembling The Marine Metagenome, One Cell At A Time

    SciTech Connect

    Xie, Gang; Han, Shunsheng; Kiss, Hajnalka; Saw, Jimmy; Senin, Pavel; Woyke, Tanja; Copeland, Alex; Gonzalez, Jose; Chatterji, Sourav; Cheng, Jan-Fang; Eisen, Jonathan A; Sieracki, Michael E; Stepanauskas, Ramunas

    2008-01-01

    The difficulty associated with the cultivation of most microorganisms and the complexity of natural microbial assemblages, such as marine plankton or human microbiome, hinder genome reconstruction of representative taxa using cultivation or metagenomic approaches. Here we used an alternative, single cell sequencing approach to obtain high-quality genome assemblies of two uncultured, numerically significant marine microorganisms. We employed fluorescence-activated cell sorting and multiple displacement amplification to obtain hundreds of micrograms of genomic DNA from individual, uncultured cells of two marine flavobacteria from the Gulf of Maine that were phylogenetically distant from existing cultured strains. Shotgun sequencing and genome finishing yielded 1.9 Mbp in 17 contigs and 1.5 Mbp in 21 contigs for the two flavobacteria, with estimated genome recoveries of about 91% and 78%, respectively. Only 0.24% of the assembling sequences were contaminants and were removed from further analysis using rigorous quality control. In contrast to all cultured strains of marine flavobacteria, the two single cell genomes were excellent Global Ocean Sampling (GOS) metagenome fragment recruiters, demonstrating their numerical significance in the ocean. The geographic distribution of GOS recruits along the Northwest Atlantic coast coincided with ocean surface currents. Metabolic reconstruction indicated diverse potential energy sources, including biopolymer degradation, proteorhodopsin photometabolism, and hydrogen oxidation. Compared to cultured relatives, the two uncultured flavobacteria have small genome sizes, few non-coding nucleotides, and few paralogous genes, suggesting adaptations to narrow ecological niches. These features may have contributed to the abundance of the two taxa in specific regions of the ocean, and may have hindered their cultivation. We demonstrate the power of single cell DNA sequencing to generate reference genomes of uncultured taxa from a complex

  13. Metagenomic reconstructions of bacterial CRISPR loci constrain population histories.

    PubMed

    Sun, Christine L; Thomas, Brian C; Barrangou, Rodolphe; Banfield, Jillian F

    2016-04-01

    Bacterial CRISPR-Cas systems provide insight into recent population history because they rapidly incorporate, in a unidirectional manner, short fragments (spacers) from coexisting infective virus populations into host chromosomes. Immunity is achieved by sequence identity between transcripts of spacers and their targets. Here, we used metagenomics to study the stability and dynamics of the type I-E CRISPR-Cas locus of Leptospirillum group II bacteria in biofilms sampled over 5 years from an acid mine drainage (AMD) system. Despite recovery of 452,686 spacers from CRISPR amplicons and metagenomic data, rarefaction curves of spacers show no saturation. The vast repertoire of spacers is attributed to phage/plasmid population diversity and retention of old spacers, despite rapid evolution of the targeted phage/plasmid genome regions (proto-spacers). The oldest spacers (spacers found at the trailer end) are conserved for at least 5 years, and 12% of these retain perfect or near-perfect matches to proto-spacer targets. The majority of proto-spacer regions contain an AAG proto-spacer adjacent motif (PAM). Spacers throughout the locus target the same phage population (AMDV1), but there are blocks of consecutive spacers without AMDV1 target sequences. Results suggest long-term coexistence of Leptospirillum with AMDV1 and periods when AMDV1 was less dominant. Metagenomics can be applied to millions of cells in a single sample to provide an extremely large spacer inventory, allow identification of phage/plasmids and enable analysis of previous phage/plasmid exposure. Thus, this approach can provide insights into prior bacterial environment and genetic interplay between hosts and their viruses.

  14. Chitinase genes revealed and compared in bacterial isolates, DNA extracts and a metagenomic library from a phytopathogen suppressive soil

    SciTech Connect

    Hjort, K.; Bergstrom, M.; Adesina, M.F.; Jansson, J.K.; Smalla, K.; Sjoling, S.

    2009-09-01

    Soil that is suppressive to disease caused by fungal pathogens is an interesting source to target for novel chitinases that might be contributing towards disease suppression. In this study we screened for chitinase genes in a phytopathogen-suppressive soil in three ways: (1) from a metagenomic library constructed from microbial cells extracted from soil, (2) from directly extracted DNA and (3) from bacterial isolates with antifungal and chitinase activities. Terminal-restriction fragment length polymorphism (T-RFLP) of chitinase genes revealed differences in amplified chitinase genes from the metagenomic library and the directly extracted DNA, but approximately 40% of the identified chitinase terminal-restriction fragments (TRFs) were found in both sources. All of the chitinase TRFs from the isolates were matched to TRFs in the directly extracted DNA and the metagenomic library. The most abundant chitinase TRF in the soil DNA and the metagenomic library corresponded to the TRF^103 of the isolate, Streptomyces mutomycini and/or Streptomyces clavifer. There were good matches between T-RFLP profiles of chitinase gene fragments obtained from different sources of DNA. However, there were also differences in both the chitinase and the 16S rRNA gene T-RFLP patterns depending on the source of DNA, emphasizing the lack of complete coverage of the gene diversity by any of the approaches used.

  15. A new approach to retrieve full lengths of functional genes from soil by PCR-DGGE and metagenome walking.

    PubMed

    Morimoto, Sho; Fujii, Takeshi

    2009-05-01

    Metagenomes are a vast genetic resource, and various approaches have been developed to explore them. Here, we present a new approach to retrieve full lengths of functional genes from soil DNA using PCR-denaturing gradient gel electrophoresis (DGGE) followed by metagenome walking. Partial fragments of benzoate 1,2-dioxygenase alpha subunit gene (benA) were detected from a 3-chlorobenzoate (3CB)-dosed soil by PCR-DGGE, and one DGGE band induced by 3CB was used as a target fragment for metagenome walking. The walking retrieved the flanking regions of the target fragment from the soil DNA, resulting in recovery of the full length of benA and also downstream gene (benB). The same strategy retrieved another gene, tfdC, and a complete tfdC and two downstream genes were obtained from the same soil. PCR-DGGE allows screening for target genes based on their potential for degrading contaminants in the environment. This feature provides an advantage over other existing metagenomic approaches.

  16. Bioprospecting metagenomes: Glycosyl hydrolases for converting biomass

    SciTech Connect

    Li, L.; van der Lelie, D.; McCorkle, S. R.; Monchy, S.; Taghavi, S.

    2009-05-18

    Throughout immeasurable time, microorganisms evolved and accumulated remarkable physiological and functional heterogeneity, and now constitute the major reserve for genetic diversity on earth. Using metagenomics, namely genetic material recovered directly from environmental samples, this biogenetic diversification can be accessed without the need to cultivate cells. Accordingly, microbial communities and their metagenomes, isolated from biotopes with high turnover rates of recalcitrant biomass, such as lignocellulosic plant cell walls, have become a major resource for bioprospecting; furthermore, this material is a major asset in the search for new biocatalytics (enzymes) for various industrial processes, including the production of biofuels from plant feedstocks. However, despite the contributions from metagenomics technologies consequent upon the discovery of novel enzymes, this relatively new enterprise requires major improvements. In this review, we compare function-based metagenome screening and sequence-based metagenome data mining, discussing the advantages and limitations of both methods. We also describe the unusual enzymes discovered via metagenomics approaches, and discuss the future prospects for metagenome technologies.

  17. Bioprospecting metagenomes: glycosyl hydrolases for converting biomass

    PubMed Central

    Li, Luen-Luen; McCorkle, Sean R; Monchy, Sebastien; Taghavi, Safiyh; van der Lelie, Daniel

    2009-01-01

    Throughout immeasurable time, microorganisms evolved and accumulated remarkable physiological and functional heterogeneity, and now constitute the major reserve for genetic diversity on earth. Using metagenomics, namely genetic material recovered directly from environmental samples, this biogenetic diversification can be accessed without the need to cultivate cells. Accordingly, microbial communities and their metagenomes, isolated from biotopes with high turnover rates of recalcitrant biomass, such as lignocellulosic plant cell walls, have become a major resource for bioprospecting; furthermore, this material is a major asset in the search for new biocatalytics (enzymes) for various industrial processes, including the production of biofuels from plant feedstocks. However, despite the contributions from metagenomics technologies consequent upon the discovery of novel enzymes, this relatively new enterprise requires major improvements. In this review, we compare function-based metagenome screening and sequence-based metagenome data mining, discussing the advantages and limitations of both methods. We also describe the unusual enzymes discovered via metagenomics approaches, and discuss the future prospects for metagenome technologies. PMID:19450243

  18. Bioprospecting metagenomes: glycosyl hydrolases for converting biomass.

    PubMed

    Li, Luen-Luen; McCorkle, Sean R; Monchy, Sebastien; Taghavi, Safiyh; van der Lelie, Daniel

    2009-01-01

    Throughout immeasurable time, microorganisms evolved and accumulated remarkable physiological and functional heterogeneity, and now constitute the major reserve for genetic diversity on earth. Using metagenomics, namely genetic material recovered directly from environmental samples, this biogenetic diversification can be accessed without the need to cultivate cells. Accordingly, microbial communities and their metagenomes, isolated from biotopes with high turnover rates of recalcitrant biomass, such as lignocellulosic plant cell walls, have become a major resource for bioprospecting; furthermore, this material is a major asset in the search for new biocatalytics (enzymes) for various industrial processes, including the production of biofuels from plant feedstocks. However, despite the contributions from metagenomics technologies consequent upon the discovery of novel enzymes, this relatively new enterprise requires major improvements. In this review, we compare function-based metagenome screening and sequence-based metagenome data mining, discussing the advantages and limitations of both methods. We also describe the unusual enzymes discovered via metagenomics approaches, and discuss the future prospects for metagenome technologies. PMID:19450243

  19. Human milk metagenome: a functional capacity analysis

    PubMed Central

    2013-01-01

    Background: Human milk contains a diverse population of bacteria that likely influences colonization of the infant gastrointestinal tract. Recent studies, however, have been limited to characterization of this microbial community by 16S rRNA analysis. In the present study, a metagenomic approach using Illumina sequencing of a pooled milk sample (ten donors) was employed to determine the genera of bacteria and the types of bacterial open reading frames in human milk that may influence bacterial establishment and stability in this primal food matrix. The human milk metagenome was also compared to that of breast-fed and formula-fed infants’ feces (n = 5, each) and mothers’ feces (n = 3) at the phylum level and at a functional level using open reading frame abundance. Additionally, immune-modulatory bacterial-DNA motifs were also searched for within human milk. Results: The bacterial community in human milk contained over 360 prokaryotic genera, with sequences aligning predominantly to the phyla of Proteobacteria (65%) and Firmicutes (34%), and the genera of Pseudomonas (61.1%), Staphylococcus (33.4%) and Streptococcus (0.5%). From assembled human milk-derived contigs, 30,128 open reading frames were annotated and assigned to functional categories. When compared to the metagenome of infants’ and mothers’ feces, the human milk metagenome was less diverse at the phylum level, and contained more open reading frames associated with nitrogen metabolism, membrane transport and stress response (P < 0.05). The human milk metagenome also contained a similar occurrence of immune-modulatory DNA motifs to that of infants’ and mothers’ fecal metagenomes. Conclusions: Our results further expand the complexity of the human milk metagenome and enforce the benefits of human milk ingestion on the microbial colonization of the infant gut and immunity. Discovery of immune-modulatory motifs in the metagenome of human milk indicates more exhaustive analyses of the

  20. Metagenomics and future perspectives in virus discovery.

    PubMed

    Mokili, John L; Rohwer, Forest; Dutilh, Bas E

    2012-02-01

    Monitoring the emergence and re-emergence of viral diseases with the goal of containing the spread of viral agents requires both adequate preparedness and quick response. Identifying the causative agent of a new epidemic is one of the most important steps for effective response to disease outbreaks. Traditionally, virus discovery required propagation of the virus in cell culture, a proven technique responsible for the identification of the vast majority of viruses known to date. However, many viruses cannot be easily propagated in cell culture, thus limiting our knowledge of viruses. Viral metagenomic analyses of environmental samples suggest that the field of virology has explored less than 1% of the extant viral diversity. In the last decade, the culture-independent and sequence-independent metagenomic approach has permitted the discovery of many viruses in a wide range of samples. Phylogenetically, some of these viruses are distantly related to previously discovered viruses. In addition, 60-99% of the sequences generated in different viral metagenomic studies are not homologous to known viruses. In this review, we discuss the advances in the area of viral metagenomics during the last decade and their relevance to virus discovery, clinical microbiology and public health. We discuss the potential of metagenomics for characterization of the normal viral population in a healthy community and identification of viruses that could pose a threat to humans through zoonosis. In addition, we propose a new model of the Koch's postulates named the 'Metagenomic Koch's Postulates'. Unlike the original Koch's postulates and the Molecular Koch's postulates as formulated by Falkow, the metagenomic Koch's postulates focus on the identification of metagenomic traits in disease cases. The metagenomic traits that can be traced after healthy individuals have been exposed to the source of the suspected pathogen.

  1. The future of skin metagenomics.

    PubMed

    Mathieu, Alban; Vogel, Timothy M; Simonet, Pascal

    2014-01-01

    Metagenomics, the direct exploitation of environmental microbial DNA, is complementary to traditional culture-based approaches for deciphering taxonomic and functional microbial diversity in a plethora of ecosystems, including those related to the human body such as the mouth, saliva, teeth, gut or skin. DNA extracted from human skin analyzed by sequencing the PCR-amplified rrs gene has already revealed the taxonomic diversity of microbial communities colonizing the human skin ("skin microbiome"). Each individual possesses his/her own skin microbial community structure, with marked taxonomic differences between different parts of the body and temporal evolution depending on physical and chemical conditions (sweat, washing etc.). However, technical limitations due to the low bacterial density at the surface of the human skin or contamination by human DNA have still inhibited extended use of the metagenomic approach for investigating the skin microbiome at a functional level. These difficulties have been overcome in part by the new generation of sequencing platforms that now provide sequences describing the genes and functions carried out by skin bacteria. These methodological advances should help us understand the mechanisms by which these microorganisms adapt to the specific chemical composition of each skin and thereby lead to a better understanding of bacteria/human host interdependence. This knowledge will pave the way for more systemic and individualized pharmaceutical and cosmetic applications.

  2. Mining metagenomes for novel cellulase genes.

    PubMed

    Duan, Cheng-Jie; Feng, Jia-Xun

    2010-12-01

    Cellulases hydrolyze the β-1,4 linkages of cellulose and are widely used in food, brewing and wine, animal feed, textiles and laundry, and pulp and paper industries, especially for hydrolyzing cellulosic materials into sugars, which can be fermented to produce useful products such as ethanol. Metagenomics has become an alternative approach to conventional culture-dependent methods as it allows exhaustive mining of microbial genomes in their natural environments. This review covers the current state of research and challenges in mining novel cellulase genes from the metagenomes of various environments, and discusses the potential biotechnological applications of metagenome-derived cellulases.

  3. EBI metagenomics--a new resource for the analysis and archiving of metagenomic data.

    PubMed

    Hunter, Sarah; Corbett, Matthew; Denise, Hubert; Fraser, Matthew; Gonzalez-Beltran, Alejandra; Hunter, Christopher; Jones, Philip; Leinonen, Rasko; McAnulla, Craig; Maguire, Eamonn; Maslen, John; Mitchell, Alex; Nuka, Gift; Oisel, Arnaud; Pesseat, Sebastien; Radhakrishnan, Rajesh; Rocca-Serra, Philippe; Scheremetjew, Maxim; Sterk, Peter; Vaughan, Daniel; Cochrane, Guy; Field, Dawn; Sansone, Susanna-Assunta

    2014-01-01

    Metagenomics is a relatively recently established but rapidly expanding field that uses high-throughput next-generation sequencing technologies to characterize the microbial communities inhabiting different ecosystems (including oceans, lakes, soil, tundra, plants and body sites). Metagenomics brings with it a number of challenges, including the management, analysis, storage and sharing of data. In response to these challenges, we have developed a new metagenomics resource (http://www.ebi.ac.uk/metagenomics/) that allows users to easily submit raw nucleotide reads for functional and taxonomic analysis by a state-of-the-art pipeline, and have them automatically stored (together with descriptive, standards-compliant metadata) in the European Nucleotide Archive.

  4. Estimating DNA coverage and abundance in metagenomes using a gamma approximation

    SciTech Connect

    Hooper, Sean D; Dalevi, Daniel; Pati, Amrita; Mavromatis, Konstantinos; Ivanova, Natalia N; Kyrpides, Nikos C

    2010-01-01

    Shotgun sequencing generates large numbers of short DNA reads from either an isolated organism or, in the case of metagenomics projects, from the aggregate genome of a microbial community. These reads are then assembled based on overlapping sequences into larger, contiguous sequences (contigs). The feasibility of assembly and the coverage achieved (reads per nucleotide or distinct sequence of nucleotides) depend on several factors: the number of reads sequenced, the read length and the relative abundances of their source genomes in the microbial community. A low coverage suggests that most of the genomic DNA in the sample has not been sequenced, but it is often difficult to estimate either the extent of the uncaptured diversity or the amount of additional sequencing that would be most efficacious. In this work, we regard a metagenome as a population of DNA fragments (bins), each of which may be covered by one or more reads. We employ a gamma distribution to model this bin population due to its flexibility and ease of use. When a gamma approximation can be found that adequately fits the data, we may estimate the number of bins that were not sequenced and that could potentially be revealed by additional sequencing. We evaluated the performance of this model using simulated metagenomes and demonstrate its applicability on three recent metagenomic datasets.

  5. A field guide to eukaryotic circular single-stranded DNA viruses: insights gained from metagenomics.

    PubMed

    Rosario, Karyna; Duffy, Siobain; Breitbart, Mya

    2012-10-01

    Despite their small size and limited protein-coding capacity, the rapid evolution rates of single-stranded DNA (ssDNA) viruses have led to their emergence as serious plant and animal pathogens. Recently, metagenomics has revealed an unprecedented diversity of ssDNA viruses, expanding their known environmental distributions and host ranges. This review summarizes and contrasts the basic characteristics of known circular ssDNA viral groups, providing a resource for analyzing the wealth of ssDNA viral sequences identified through metagenomics. Since ssDNA viruses are largely identified based on conserved rolling circle replication proteins, this review highlights distinguishing motifs and catalytic residues important for replication. Genomes identified through metagenomics have demonstrated unique ssDNA viral genome architectures and revealed characteristics that blur the boundaries between previously well-defined groups. Metagenomic discovery of ssDNA viruses has created both a challenge to current taxonomic classification schemes and an opportunity to revisit hypotheses regarding the evolutionary history of these viruses.

  6. Toward Accurate and Quantitative Comparative Metagenomics.

    PubMed

    Nayfach, Stephen; Pollard, Katherine S

    2016-08-25

    Shotgun metagenomics and computational analysis are used to compare the taxonomic and functional profiles of microbial communities. Leveraging this approach to understand roles of microbes in human biology and other environments requires quantitative data summaries whose values are comparable across samples and studies. Comparability is currently hampered by the use of abundance statistics that do not estimate a meaningful parameter of the microbial community and biases introduced by experimental protocols and data-cleaning approaches. Addressing these challenges, along with improving study design, data access, metadata standardization, and analysis tools, will enable accurate comparative metagenomics. We envision a future in which microbiome studies are replicable and new metagenomes are easily and rapidly integrated with existing data. Only then can the potential of metagenomics for predictive ecological modeling, well-powered association studies, and effective microbiome medicine be fully realized. PMID:27565341

  7. Toward Accurate and Quantitative Comparative Metagenomics

    PubMed Central

    Nayfach, Stephen; Pollard, Katherine S.

    2016-01-01

    Shotgun metagenomics and computational analysis are used to compare the taxonomic and functional profiles of microbial communities. Leveraging this approach to understand roles of microbes in human biology and other environments requires quantitative data summaries whose values are comparable across samples and studies. Comparability is currently hampered by the use of abundance statistics that do not estimate a meaningful parameter of the microbial community and biases introduced by experimental protocols and data-cleaning approaches. Addressing these challenges, along with improving study design, data access, metadata standardization, and analysis tools, will enable accurate comparative metagenomics. We envision a future in which microbiome studies are replicable and new metagenomes are easily and rapidly integrated with existing data. Only then can the potential of metagenomics for predictive ecological modeling, well-powered association studies, and effective microbiome medicine be fully realized. PMID:27565341

  8. [Pathology and viral metagenomics, a recent history].

    PubMed

    Bernardo, Pauline; Albina, Emmanuel; Eloit, Marc; Roumagnac, Philippe

    2013-05-01

    Human, animal and plant viral diseases have greatly benefited from recent metagenomics developments. Viral metagenomics is a culture-independent approach used to investigate the complete viral genetic populations of a sample. During the last decade, metagenomics concepts and techniques that were first used by ecologists progressively spread into the scientific field of viral pathology. The sample, which for ecologists was first a fraction of an ecosystem, became for pathologists an organism that hosts millions of microbes and viruses. This new approach, which provides high-resolution qualitative and quantitative data on viral diversity without a priori assumptions, is now revolutionizing the way pathologists decipher viral diseases. This review describes the latest improvements in high-throughput next-generation sequencing methods and discusses the applications of viral metagenomics in viral pathology, including discovery of novel viruses, viral surveillance and diagnostics, large-scale molecular epidemiology, and viral evolution.

  9. Finding the Needles in the Metagenome Haystack

    PubMed Central

    Speksnijder, Arjen G. C. L.; Zhang, Kun; Goodman, Robert M.; van Veen, Johannes A.

    2007-01-01

    In the collective genomes (the metagenome) of the microorganisms inhabiting the Earth’s diverse environments is written the history of life on this planet. New molecular tools developed and used for the past 15 years by microbial ecologists are facilitating the extraction, cloning, screening, and sequencing of these genomes. This approach allows microbial ecologists to access and study the full range of microbial diversity, regardless of our ability to culture organisms, and provides an unprecedented access to the breadth of natural products that these genomes encode. However, there is no way that the mere collection of sequences, no matter how expansive, can provide full coverage of the complex world of microbial metagenomes within the foreseeable future. Furthermore, although it is possible to fish out highly informative and useful genes from the sea of gene diversity in the environment, this can be a highly tedious and inefficient procedure. Microbial ecologists must be clever in their pursuit of ecologically relevant, valuable, and niche-defining genomic information within the vast haystack of microbial diversity. In this report, we seek to describe advances and prospects that will help microbial ecologists glean more knowledge from investigations into metagenomes. These include technological advances in sequencing and cloning methodologies, as well as improvements in annotation and comparative sequence analysis. More significant, however, will be ways to focus in on various subsets of the metagenome that may be of particular relevance, either by limiting the target community under study or improving the focus or speed of screening procedures. Lastly, given the cost and infrastructure necessary for large metagenome projects, and the almost inexhaustible amount of data they can produce, trends toward broader use of metagenome data across the research community coupled with the needed investment in bioinformatics infrastructure devoted to metagenomics will no

  10. Finding the needles in the metagenome haystack.

    PubMed

    Kowalchuk, George A; Speksnijder, Arjen G C L; Zhang, Kun; Goodman, Robert M; van Veen, Johannes A

    2007-04-01

    In the collective genomes (the metagenome) of the microorganisms inhabiting the Earth's diverse environments is written the history of life on this planet. New molecular tools developed and used for the past 15 years by microbial ecologists are facilitating the extraction, cloning, screening, and sequencing of these genomes. This approach allows microbial ecologists to access and study the full range of microbial diversity, regardless of our ability to culture organisms, and provides an unprecedented access to the breadth of natural products that these genomes encode. However, there is no way that the mere collection of sequences, no matter how expansive, can provide full coverage of the complex world of microbial metagenomes within the foreseeable future. Furthermore, although it is possible to fish out highly informative and useful genes from the sea of gene diversity in the environment, this can be a highly tedious and inefficient procedure. Microbial ecologists must be clever in their pursuit of ecologically relevant, valuable, and niche-defining genomic information within the vast haystack of microbial diversity. In this report, we seek to describe advances and prospects that will help microbial ecologists glean more knowledge from investigations into metagenomes. These include technological advances in sequencing and cloning methodologies, as well as improvements in annotation and comparative sequence analysis. More significant, however, will be ways to focus in on various subsets of the metagenome that may be of particular relevance, either by limiting the target community under study or improving the focus or speed of screening procedures. Lastly, given the cost and infrastructure necessary for large metagenome projects, and the almost inexhaustible amount of data they can produce, trends toward broader use of metagenome data across the research community coupled with the needed investment in bioinformatics infrastructure devoted to metagenomics will no

  11. [PKS gene screening based on metagenome of Halichondria rugosa].

    PubMed

    Zhang, Xu-sheng; Li, Zhi-yong; Miao, Xiao-ling

    2007-06-01

    Metagenomic DNA was extracted from Halichondria rugosa collected from the South China Sea and kept at -4 degrees C. A PKS gene fragment was amplified by PCR with primers targeting the KS domain of the PKS gene, yielding a DNA fragment about 671 bp in length. The PCR product was examined by agarose gel electrophoresis, recovered from the gel, and cloned into the pUCm-T vector. The vectors were then transformed into competent cells (DH5alpha), and the PKS gene fragment in positive clones was sequenced. The corresponding amino acid sequence was deduced from the nucleotide sequence. BLAST analysis showed that the homology of this amino acid sequence with that deduced from the KS domain of the PKS gene in Rhodobacterales bacterium was up to 96%. Phylogenetic analysis indicated that the obtained PKS gene belongs to the trans-AT KS domains. The result also demonstrated the diversity and differences of microorganisms associated with and around sponges in different sea areas. This is the first report of a bacterial PKS gene in the sponge Halichondria rugosa, which provides strong evidence for the microbial origin hypothesis of sponge active compounds. This study also lays a basis for the utilization of uncultured sponge-associated microorganisms at the gene level.

  12. Challenges and opportunities of airborne metagenomics.

    PubMed

    Behzad, Hayedeh; Gojobori, Takashi; Mineta, Katsuhiko

    2015-05-06

    Recent metagenomic studies of environments, such as marine and soil, have significantly enhanced our understanding of the diverse microbial communities living in these habitats and their essential roles in sustaining vast ecosystems. The increase in the number of publications related to soil and marine metagenomics is in sharp contrast to those of air, yet airborne microbes are thought to have significant impacts on many aspects of our lives from their potential roles in atmospheric events such as cloud formation, precipitation, and atmospheric chemistry to their major impact on human health. In this review, we will discuss the current progress in airborne metagenomics, with a special focus on exploring the challenges and opportunities of undertaking such studies. The main challenges of conducting metagenomic studies of airborne microbes are as follows: 1) Low density of microorganisms in the air, 2) efficient retrieval of microorganisms from the air, 3) variability in airborne microbial community composition, 4) the lack of standardized protocols and methodologies, and 5) DNA sequencing and bioinformatics-related challenges. Overcoming these challenges could provide the groundwork for comprehensive analysis of airborne microbes and their potential impact on the atmosphere, global climate, and our health. Metagenomic studies offer a unique opportunity to examine viral and bacterial diversity in the air and monitor their spread locally or across the globe, including threats from pathogenic microorganisms. Airborne metagenomic studies could also lead to discoveries of novel genes and metabolic pathways relevant to meteorological and industrial applications, environmental bioremediation, and biogeochemical cycles.

  13. Challenges and Opportunities of Airborne Metagenomics

    PubMed Central

    Behzad, Hayedeh; Gojobori, Takashi; Mineta, Katsuhiko

    2015-01-01

    Recent metagenomic studies of environments, such as marine and soil, have significantly enhanced our understanding of the diverse microbial communities living in these habitats and their essential roles in sustaining vast ecosystems. The increase in the number of publications related to soil and marine metagenomics is in sharp contrast to those of air, yet airborne microbes are thought to have significant impacts on many aspects of our lives from their potential roles in atmospheric events such as cloud formation, precipitation, and atmospheric chemistry to their major impact on human health. In this review, we will discuss the current progress in airborne metagenomics, with a special focus on exploring the challenges and opportunities of undertaking such studies. The main challenges of conducting metagenomic studies of airborne microbes are as follows: 1) Low density of microorganisms in the air, 2) efficient retrieval of microorganisms from the air, 3) variability in airborne microbial community composition, 4) the lack of standardized protocols and methodologies, and 5) DNA sequencing and bioinformatics-related challenges. Overcoming these challenges could provide the groundwork for comprehensive analysis of airborne microbes and their potential impact on the atmosphere, global climate, and our health. Metagenomic studies offer a unique opportunity to examine viral and bacterial diversity in the air and monitor their spread locally or across the globe, including threats from pathogenic microorganisms. Airborne metagenomic studies could also lead to discoveries of novel genes and metabolic pathways relevant to meteorological and industrial applications, environmental bioremediation, and biogeochemical cycles. PMID:25953766

  14. FCMM: A comparative metagenomic approach for functional characterization of multiple metagenome samples.

    PubMed

    Lee, Jongin; Lee, Hoon Taek; Hong, Woon-young; Jang, Eunji; Kim, Jaebum

    2015-08-01

    Next-generation sequencing (NGS) technologies make it possible to obtain the entire genomic content of microorganisms in metagenome samples. Thus, many studies have developed methods for the processing and analysis of metagenomic NGS reads, including analyses for predicting functions and their enrichments in environmental metagenome samples. Especially, comparative functional studies by using multi-metagenome samples are essential for identifying and comparing different characteristics of multiple environmental samples. In this paper, we introduce a pipeline for functional characterization of multiple metagenome samples to infer major functions as well as their quantitative scores in a comparative metagenomics manner. The pipeline performs the annotation of functions related to expected proteins in the metagenome samples, calculates their enrichment scores based on the reads per kilobase per million reads (RPKM) measure, and predicts the relative abundance of associated functions by a statistical test. The results from single sample analysis are then used to find common and sample-specific major functions. By applying the pipeline to six different environmental metagenome samples, including two ocean (Antarctica aquatic and Baltic Sea) and four terrestrial (Acid mine drainage, human gut microbiome, Amazon River, and Wasca soil) samples, we were able to predict common functions as well as environment-specific functions. Our pipeline is available at http://bioinfo.konkuk.ac.kr/FCMM/. PMID:26027543
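
    The RPKM-based enrichment score mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard RPKM definition (reads mapped to a feature, scaled by feature length in kilobases and by total mapped reads in millions); the function name, the example counts, and the function labels are hypothetical and are not taken from the FCMM pipeline itself.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase per million mapped reads (standard definition)."""
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total read count must be positive")
    # read_count / (length / 1e3) / (total / 1e6) == read_count * 1e9 / (length * total)
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical example: two functions with equal read counts but
# different gene lengths, in a sample of one million mapped reads.
example_counts = {"func_A": (500, 1000), "func_B": (500, 2000)}
total_reads = 1_000_000
scores = {
    name: rpkm(count, length, total_reads)
    for name, (count, length) in example_counts.items()
}
```

    Normalizing by length means that the longer gene receives the lower score for the same number of reads, which is the reason a length-aware measure is used before samples are compared.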

  15. Alignment-free Visualization of Metagenomic Data by Nonlinear Dimension Reduction

    PubMed Central

    Laczny, Cedric C.; Pinel, Nicolás; Vlassis, Nikos; Wilmes, Paul

    2014-01-01

    The visualization of metagenomic data, especially without prior taxonomic identification of reconstructed genomic fragments, is a challenging problem in computational biology. An ideal visualization method should, among others, enable clear distinction of congruent groups of sequences of closely related taxa, be applicable to fragments of lengths typically achievable following assembly, and allow the efficient analysis of the growing amounts of community genomic sequence data. Here, we report a scalable approach for the visualization of metagenomic data that is based on nonlinear dimension reduction via Barnes-Hut Stochastic Neighbor Embedding of centered log-ratio transformed oligonucleotide signatures extracted from assembled genomic sequence fragments. The approach allows for alignment-free assessment of the data-inherent taxonomic structure, and it can potentially facilitate the downstream binning of genomic fragments into uniform clusters reflecting organismal origin. We demonstrate the performance of our approach by visualizing community genomic sequence data from simulated as well as groundwater, human-derived and marine microbial communities. PMID:24682077
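
    The signature step described in this abstract (oligonucleotide frequencies followed by a centered log-ratio transform) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the k-mer length, the pseudocount, and the function names are hypothetical choices, and reverse-complement merging is omitted.

```python
import math
from collections import Counter
from itertools import product


def kmer_signature(seq, k=4, pseudocount=1.0):
    """Relative k-mer frequencies over the ACGT alphabet (assumed pseudocount)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # k-mers containing ambiguous bases (e.g. N) are simply ignored here.
    raw = [counts.get(kmer, 0) + pseudocount for kmer in kmers]
    total = sum(raw)
    return [value / total for value in raw]


def clr(freqs):
    """Centered log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(value) for value in freqs]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical fragment; real inputs would be assembled contigs.
signature = clr(kmer_signature("ACGTACGTTTGACCAGGT", k=2))
```

    Each fragment then becomes one row of a matrix that a Barnes-Hut t-SNE implementation can embed in two dimensions, for example scikit-learn's TSNE with method="barnes_hut".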

  16. Under-detection of endospore-forming Firmicutes in metagenomic data

    DOE PAGESBeta

    Filippidou, Sevasti; Junier, Thomas; Wunderlin, Tina; Lo, Chien-Chi; Li, Po-E; Chain, Patrick S.; Junier, Pilar

    2015-04-25

    Microbial diversity studies based on metagenomic sequencing have greatly enhanced our knowledge of the microbial world. However, one caveat is the fact that not all microorganisms are equally well detected, questioning the universality of this approach. Firmicutes are known to be a dominant bacterial group. Several Firmicutes species are endospore formers and this property makes them hardy in potentially harsh conditions, and thus likely to be present in a wide variety of environments, even as residents and not functional players. While metagenomic libraries can be expected to contain endospore formers, endospores are known to be resilient to many traditional methods of DNA isolation and thus potentially undetectable. In this study we evaluated the representation of endospore-forming Firmicutes in 73 published metagenomic datasets using two molecular markers unique to this bacterial group (spo0A and gpr). Both markers were notably absent in well-known habitats of Firmicutes such as soil, with spo0A found only in three mammalian gut microbiomes. A tailored DNA extraction method resulted in the detection of a large diversity of endospore-formers in amplicon sequencing of the 16S rRNA and spo0A genes. However, shotgun classification was still poor with only a minor fraction of the community assigned to Firmicutes. Thus, removing a specific bias in a molecular workflow improves detection in amplicon sequencing, but it was insufficient to overcome the limitations for detecting endospore-forming Firmicutes in whole-genome metagenomics. In conclusion, this study highlights the importance of understanding the specific methodological biases that can contribute to improve the universality of metagenomic approaches.

  17. Under-detection of endospore-forming Firmicutes in metagenomic data

    SciTech Connect

    Filippidou, Sevasti; Junier, Thomas; Wunderlin, Tina; Lo, Chien-Chi; Li, Po-E; Chain, Patrick S.; Junier, Pilar

    2015-04-25

    Microbial diversity studies based on metagenomic sequencing have greatly enhanced our knowledge of the microbial world. However, one caveat is the fact that not all microorganisms are equally well detected, questioning the universality of this approach. Firmicutes are known to be a dominant bacterial group. Several Firmicutes species are endospore formers and this property makes them hardy in potentially harsh conditions, and thus likely to be present in a wide variety of environments, even as residents and not functional players. While metagenomic libraries can be expected to contain endospore formers, endospores are known to be resilient to many traditional methods of DNA isolation and thus potentially undetectable. In this study we evaluated the representation of endospore-forming Firmicutes in 73 published metagenomic datasets using two molecular markers unique to this bacterial group (spo0A and gpr). Both markers were notably absent in well-known habitats of Firmicutes such as soil, with spo0A found only in three mammalian gut microbiomes. A tailored DNA extraction method resulted in the detection of a large diversity of endospore-formers in amplicon sequencing of the 16S rRNA and spo0A genes. However, shotgun classification was still poor with only a minor fraction of the community assigned to Firmicutes. Thus, removing a specific bias in a molecular workflow improves detection in amplicon sequencing, but it was insufficient to overcome the limitations for detecting endospore-forming Firmicutes in whole-genome metagenomics. In conclusion, this study highlights the importance of understanding the specific methodological biases that can contribute to improve the universality of metagenomic approaches.

  18. Under-detection of endospore-forming Firmicutes in metagenomic data

    PubMed Central

    Filippidou, Sevasti; Junier, Thomas; Wunderlin, Tina; Lo, Chien-Chi; Li, Po-E; Chain, Patrick S.; Junier, Pilar

    2015-01-01

    Microbial diversity studies based on metagenomic sequencing have greatly enhanced our knowledge of the microbial world. However, one caveat is the fact that not all microorganisms are equally well detected, questioning the universality of this approach. Firmicutes are known to be a dominant bacterial group. Several Firmicutes species are endospore formers and this property makes them hardy in potentially harsh conditions, and thus likely to be present in a wide variety of environments, even as residents and not functional players. While metagenomic libraries can be expected to contain endospore formers, endospores are known to be resilient to many traditional methods of DNA isolation and thus potentially undetectable. In this study we evaluated the representation of endospore-forming Firmicutes in 73 published metagenomic datasets using two molecular markers unique to this bacterial group (spo0A and gpr). Both markers were notably absent in well-known habitats of Firmicutes such as soil, with spo0A found only in three mammalian gut microbiomes. A tailored DNA extraction method resulted in the detection of a large diversity of endospore-formers in amplicon sequencing of the 16S rRNA and spo0A genes. However, shotgun classification was still poor with only a minor fraction of the community assigned to Firmicutes. Thus, removing a specific bias in a molecular workflow improves detection in amplicon sequencing, but it was insufficient to overcome the limitations for detecting endospore-forming Firmicutes in whole-genome metagenomics. In conclusion, this study highlights the importance of understanding the specific methodological biases that can contribute to improve the universality of metagenomic approaches. PMID:25973144

  19. RiboFR-Seq: a novel approach to linking 16S rRNA amplicon profiles to metagenomes

    PubMed Central

    Zhang, Yanming; Ji, Peifeng; Wang, Jinfeng; Zhao, Fangqing

    2016-01-01

    16S rRNA amplicon analysis and shotgun metagenome sequencing are the two main culture-independent strategies to explore the genetic landscape of various microbial communities. Recently, numerous studies have employed these two approaches together, but downstream data analyses were performed separately, which always generated incongruent or conflicting signals on both taxonomic and functional classifications. Here we propose a novel approach, RiboFR-Seq (Ribosomal RNA gene flanking region sequencing), for capturing both ribosomal RNA variable regions and their flanking protein-coding genes simultaneously. Through extensive testing on a clonal bacterial strain, the salivary microbiome and bacterial epibionts of marine kelp, we demonstrated that RiboFR-Seq could detect the vast majority of bacteria not only in well-studied microbiomes but also in novel communities with limited reference genomes. Combined with classical amplicon sequencing and shotgun metagenome sequencing, RiboFR-Seq can link the annotations of 16S rRNA and metagenomic contigs to make a consensus classification. By recognizing almost all 16S rRNA copies, the RiboFR-Seq approach can effectively reduce the taxonomic abundance bias resulting from 16S rRNA copy number variation. We believe that RiboFR-Seq, which provides an integrated view of 16S rRNA profiles and metagenomes, will help us better understand diverse microbial communities. PMID:26984526

  20. Recent progress and new challenges in metagenomics for biotechnology.

    PubMed

    Chistoserdova, Ludmila

    2010-10-01

    A brief historical perspective on metagenomics is given followed by a discussion of the rapid progress in this field largely defined by transition to the next generation sequencing technologies. Problems and challenges connected to this transition are also addressed. The review focuses on recent literature describing metagenomic approaches connecting sequence information to functionality that are especially relevant to biotechnological applications, including metagenomics of specialized or enriched microbial communities, metagenomics combined with specific labeling techniques, metatranscriptomics and metaproteomics.

  1. Fragmentation pathways of protonated peptides.

    PubMed

    Paizs, Béla; Suhai, Sándor

    2005-01-01

    The fragmentation pathways of protonated peptides are reviewed in the present paper, paying special attention to classification of the known fragmentation channels into a simple hierarchy defined according to the chemistry involved. It is shown that the 'mobile proton' model of peptide fragmentation can be used to understand the MS/MS spectra of protonated peptides only in a qualitative manner, rationalizing differences observed for low-energy collision induced dissociation of peptide ions having or lacking a mobile proton. To overcome this limitation, a deeper understanding of the dissociation chemistry of protonated peptides is needed. To this end, use of the 'pathways in competition' (PIC) model, which involves a detailed energetic and kinetic characterization of the major peptide fragmentation pathways (PFPs), is proposed. The known PFPs are described in detail including all the pre-dissociation, dissociation, and post-dissociation events. It is our hope that studies to further extend PIC will lead to a semi-quantitative understanding of the MS/MS spectra of protonated peptides which could be used to develop refined bioinformatics algorithms for MS/MS based proteomics. Experimental and computational data on the fragmentation of protonated peptides are reevaluated from the point of view of the PIC model considering the mechanism, energetics, and kinetics of the major PFPs. Evidence proving semi-quantitative predictability of some of the ion intensity relationships (IIRs) of the MS/MS spectra of protonated peptides is presented. PMID:15389847

  2. Metazen – metadata capture for metagenomes

    DOE PAGESBeta

    Bischof, Jared; Harrison, Travis; Paczian, Tobias; Glass, Elizabeth; Wilke, Andreas; Meyer, Folker

    2014-12-08

    Background: As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. These tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results: Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusion: Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.

  3. Metagenomics: Application of Genomics to Uncultured Microorganisms

    PubMed Central

    Handelsman, Jo

    2004-01-01

    Metagenomics (also referred to as environmental and community genomics) is the genomic analysis of microorganisms by direct extraction and cloning of DNA from an assemblage of microorganisms. The development of metagenomics stemmed from the ineluctable evidence that as-yet-uncultured microorganisms represent the vast majority of organisms in most environments on earth. This evidence was derived from analyses of 16S rRNA gene sequences amplified directly from the environment, an approach that avoided the bias imposed by culturing and led to the discovery of vast new lineages of microbial life. Although the portrait of the microbial world was revolutionized by analysis of 16S rRNA genes, such studies yielded only a phylogenetic description of community membership, providing little insight into the genetics, physiology, and biochemistry of the members. Metagenomics provides a second tier of technical innovation that facilitates study of the physiology and ecology of environmental microorganisms. Novel genes and gene products discovered through metagenomics include the first bacteriorhodopsin of bacterial origin; novel small molecules with antimicrobial activity; and new members of families of known proteins, such as an Na+(Li+)/H+ antiporter, RecA, DNA polymerase, and antibiotic resistance determinants. Reassembly of multiple genomes has provided insight into energy and nutrient cycling within the community, genome structure, gene function, population genetics and microheterogeneity, and lateral gene transfer among members of an uncultured community. The application of metagenomic sequence information will facilitate the design of better culturing strategies to link genomic analysis with pure culture studies. PMID:15590779

  4. Preliminary High-Throughput Metagenome Assembly

    SciTech Connect

    Dusheyko, Serge; Furman, Craig; Pangilinan, Jasmyn; Shapiro, Harris; Tu, Hank

    2007-03-26

    Metagenome data sets present a qualitatively different assembly problem than traditional single-organism whole-genome shotgun (WGS) assembly. The unique aspects of such projects include the presence of a potentially large number of distinct organisms and their representation in the data set at widely different fractions. In addition, multiple closely related strains could be present, which would be difficult to assemble separately. Failure to take these issues into account can result in poor assemblies that either jumble together different strains or which fail to yield useful results. The DOE Joint Genome Institute has sequenced a number of metagenomic projects and plans to considerably increase this number in the coming year. As a result, the JGI has a need for high-throughput tools and techniques for handling metagenome projects. We present the techniques developed to handle metagenome assemblies in a high-throughput environment. This includes a streamlined assembly wrapper, based on the JGI's in-house WGS assembler, Jazz. It also includes the selection of sensible defaults targeted for metagenome data sets, as well as quality control automation for cleaning up the raw results. While analysis is ongoing, we will discuss preliminary assessments of the quality of the assembly results (http://fames.jgi-psf.org).

  5. Metazen – metadata capture for metagenomes

    SciTech Connect

    Bischof, Jared; Harrison, Travis; Paczian, Tobias; Glass, Elizabeth; Wilke, Andreas; Meyer, Folker

    2014-12-08

    Background: As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. These tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results: Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusion: Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility.

  6. Viral Metagenomics: MetaView Software

    SciTech Connect

    Zhou, C; Smith, J

    2007-10-22

    The purpose of this report is to design and develop a tool for analysis of raw sequence read data from viral metagenomics experiments. The tool should compare read sequences of known viral nucleic acid sequence data and enable a user to attempt to determine, with some degree of confidence, what virus groups may be present in the sample. This project was conducted in two phases. In phase 1 we surveyed the literature and examined existing metagenomics tools to educate ourselves and to more precisely define the problem of analyzing raw read data from viral metagenomic experiments. In phase 2 we devised an approach and built a prototype code and database. This code takes viral metagenomic read data in fasta format as input and accesses all complete viral genomes from Kpath for sequence comparison. The system executes at the UNIX command line, producing output that is stored in an Oracle relational database. We provide here a description of the approach we came up with for handling un-assembled, short read data sets from viral metagenomics experiments. We include a discussion of the current MetaView code capabilities and additional functionality that we believe should be added, should additional funding be acquired to continue the work.

  7. Metazen – metadata capture for metagenomes

    PubMed Central

    2014-01-01

    Background: As the impact and prevalence of large-scale metagenomic surveys grow, so does the acute need for more complete and standards compliant metadata. Metadata (data describing data) provides an essential complement to experimental data, helping to answer questions about its source, mode of collection, and reliability. Metadata collection and interpretation have become vital to the genomics and metagenomics communities, but considerable challenges remain, including exchange, curation, and distribution. Currently, tools are available for capturing basic field metadata during sampling, and for storing, updating and viewing it. Unfortunately, these tools are not specifically designed for metagenomic surveys; in particular, they lack the appropriate metadata collection templates, a centralized storage repository, and a unique ID linking system that can be used to easily port complete and compatible metagenomic metadata into widely used assembly and sequence analysis tools. Results: Metazen was developed as a comprehensive framework designed to enable metadata capture for metagenomic sequencing projects. Specifically, Metazen provides a rapid, easy-to-use portal to encourage early deposition of project and sample metadata. Conclusions: Metazen is an interactive tool that aids users in recording their metadata in a complete and valid format. A defined set of mandatory fields captures vital information, while the option to add fields provides flexibility. PMID:25780508

  8. Shotgun metagenomic data streams: surfing without fear

    SciTech Connect

    Berendzen, Joel R

    2010-12-06

    Timely information about bio-threat prevalence, consequence, propagation, attribution, and mitigation is needed to support decision-making, both routinely and in a crisis. One DNA sequencer can stream 25 Gbp of information per day, but sampling strategies and analysis techniques are needed to turn raw sequencing power into actionable knowledge. Shotgun metagenomics can enable biosurveillance at the level of a single city, hospital, or airplane. Metagenomics characterizes viruses and bacteria from complex environments such as soil, air filters, or sewage. Unlike targeted-primer-based sequencing, shotgun methods are not blind to sequences that are truly novel, and they can measure absolute prevalence. Shotgun metagenomic sampling can be non-invasive, efficient, and inexpensive while being informative. We have developed analysis techniques for shotgun metagenomic sequencing that rely upon phylogenetic signature patterns. They work by indexing local sequence patterns in a manner similar to web search engines. Our methods are laptop-fast and favorable scaling properties ensure they will be sustainable as sequencing methods grow. We show examples of application to soil metagenomic samples.

  9. Metagenomic applications in environmental monitoring and bioremediation

    DOE PAGESBeta

    Techtmann, Stephen M.; Hazen, Terry C.

    2016-01-01

    With the rapid advances in sequencing technology, the cost of sequencing has dramatically dropped and the scale of sequencing projects has increased accordingly. This has provided the opportunity for the routine use of sequencing techniques in the monitoring of environmental microbes. While metagenomic applications have been routinely applied to better understand the ecology and diversity of microbes, their use in environmental monitoring and bioremediation is increasingly common. In this review we seek to provide an overview of some of the metagenomic techniques used in environmental systems biology, addressing their application and limitation. We will also provide several recent examples of the application of metagenomics to bioremediation. We discuss examples where microbial communities have been used to predict the presence and extent of contamination, examples of how metagenomics can be used to characterize the process of natural attenuation by unculturable microbes, as well as examples detailing the use of metagenomics to understand the impact of biostimulation on microbial communities.

  10. Viral metagenomics and blood safety.

    PubMed

    Sauvage, V; Eloit, M

    2016-02-01

    The characterization of the human blood-associated viral community (also called blood virome) is essential for epidemiological surveillance and to anticipate new potential threats for blood transfusion safety. Currently, the risk of blood-borne agent transmission of well-known viruses (HBV, HCV, HIV and HTLV) can be considered as under control in high-resource countries. However, other viruses unknown or unsuspected may be transmitted to recipients by blood-derived products. This is particularly relevant considering that a significant proportion of transfused patients are immunocompromised and more frequently subjected to fatal outcomes. Several measures to prevent transfusion transmission of unknown viruses have been implemented including the exclusion of at-risk donors, leukocyte reduction of donor blood, and physicochemical treatment of the different blood components. However, up to now there is no universal method for pathogen inactivation, which would be applicable for all types of blood components and, equally effective for all viral families. In addition, among available inactivation procedures of viral genomes, some of them are recognized to be less effective on non-enveloped viruses, and inadequate to inactivate higher viral titers in plasma pools or derivatives. Given this, there is the need to implement new methodologies for the discovery of unknown viruses that may affect blood transfusion. Viral metagenomics combined with High Throughput Sequencing appears as a promising approach for the identification and global surveillance of new and/or unexpected viruses that could impair blood transfusion safety. PMID:26778104

  11. Viral metagenomics: are we missing the giants?

    PubMed

    Halary, S; Temmam, S; Raoult, D; Desnues, C

    2016-06-01

    Amoeba-infecting giant viruses are recently discovered viruses that have been isolated from diverse environments all around the world. In parallel to isolation efforts, metagenomics confirmed their worldwide distribution from a broad range of environmental and host-associated samples, including humans, depicting them as a major component of eukaryotic viruses in nature and a possible resident of the human/animal virome whose role is still unclear. Nevertheless, metagenomics data about amoeba-infecting giant viruses still remain scarce, mainly because of methodological limitations. Efforts should be pursued both at the metagenomic sample preparation level and on in silico analyses to better understand their roles in the environment and in human/animal health and disease. PMID:26851442

  12. A catalog of the mouse gut metagenome.

    PubMed

    Xiao, Liang; Feng, Qiang; Liang, Suisha; Sonne, Si Brask; Xia, Zhongkui; Qiu, Xinmin; Li, Xiaoping; Long, Hua; Zhang, Jianfeng; Zhang, Dongya; Liu, Chuan; Fang, Zhiwei; Chou, Joyce; Glanville, Jacob; Hao, Qin; Kotowska, Dorota; Colding, Camilla; Licht, Tine Rask; Wu, Donghai; Yu, Jun; Sung, Joseph Jao Yiu; Liang, Qiaoyi; Li, Junhua; Jia, Huijue; Lan, Zhou; Tremaroli, Valentina; Dworzynski, Piotr; Nielsen, H Bjørn; Bäckhed, Fredrik; Doré, Joël; Le Chatelier, Emmanuelle; Ehrlich, S Dusko; Lin, John C; Arumugam, Manimozhiyan; Wang, Jun; Madsen, Lise; Kristiansen, Karsten

    2015-10-01

    We established a catalog of the mouse gut metagenome comprising ∼2.6 million nonredundant genes by sequencing DNA from fecal samples of 184 mice. To secure high microbiome diversity, we used mouse strains of diverse genetic backgrounds, from different providers, kept in different housing laboratories and fed either a low-fat or high-fat diet. Similar to the human gut microbiome, >99% of the cataloged genes are bacterial. We identified 541 metagenomic species and defined a core set of 26 metagenomic species found in 95% of the mice. The mouse gut microbiome is functionally similar to its human counterpart, with 95.2% of its Kyoto Encyclopedia of Genes and Genomes (KEGG) orthologous groups in common. However, only 4.0% of the mouse gut microbial genes were shared (95% identity, 90% coverage) with those of the human gut microbiome. This catalog provides a useful reference for future studies.

  13. Pathway-Based Functional Analysis of Metagenomes

    NASA Astrophysics Data System (ADS)

    Bercovici, Sivan; Sharon, Itai; Pinter, Ron Y.; Shlomi, Tomer

    Metagenomic data enable the study of microbes and viruses through their DNA as retrieved directly from the environment in which they live. Functional analysis of metagenomes explores the abundance of gene families, pathways, and systems, rather than their taxonomy. Through such analysis researchers are able to identify those functional capabilities most important to organisms in the examined environment. Recently, a statistical framework for the functional analysis of metagenomes was described that focuses on gene families. Here we describe two pathway-level computational models for functional analysis that take into account important, yet unaddressed issues such as pathway size, gene length and overlap in gene content among pathways. We test our models on carefully designed simulated data and propose novel approaches for performance evaluation. Our models significantly improve over the current approach with respect to pathway ranking and the computation of the relative abundance of pathways in environments.

  14. Analysis of 23S rRNA genes in metagenomes - a case study from the Global Ocean Sampling Expedition.

    PubMed

    Yilmaz, Pelin; Kottmann, Renzo; Pruesse, Elmar; Quast, Christian; Glöckner, Frank Oliver

    2011-09-01

    As an evolutionary marker, 23S ribosomal RNA (rRNA) offers more diagnostic sequence stretches and greater sequence variation than 16S rRNA. However, 23S rRNA is still not as widely used. Based on 80 metagenome samples from the Global Ocean Sampling (GOS) Expedition, the usefulness and taxonomic resolution of 23S rRNA were compared to those of 16S rRNA. Since 23S rRNA is approximately twice as large as 16S rRNA, twice as many 23S rRNA gene fragments as 16S rRNA gene fragments were retrieved from the GOS reads, with 23S rRNA gene fragments being generally about 100 bp longer. Datasets for 16S and 23S rRNA sequences revealed similar relative abundances for major marine bacterial and archaeal taxa. However, 16S rRNA sequences had a better taxonomic resolution due to their significantly larger reference database. Reevaluation of the specificity of previously published PCR amplification primers and group-specific fluorescence in situ hybridization probes on this metagenomic set of non-amplified 23S rRNA sequences revealed that out of 16 primers investigated, only two had more than 90% target group coverage. Evaluations of two probes, BET42a and GAM42a, were in accordance with previous evaluations, with a discrepancy in the target group coverage of the GAM42a probe when evaluated against the GOS metagenomic dataset.

  15. Comparison of metagenomic samples using sequence signatures

    PubMed Central

    2012-01-01

    Background: Sequence signatures, as defined by the frequencies of k-tuples (or k-mers, k-grams), have been used extensively to compare genomic sequences of individual organisms, to identify cis-regulatory modules, and to study the evolution of regulatory sequences. Recently many next-generation sequencing (NGS) read data sets of metagenomic samples from a variety of different environments have been generated. The assembly of these reads can be difficult, and analysis methods based on mapping reads to genes or pathways are also restricted by the availability and completeness of existing databases. Sequence-signature-based methods, however, do not need complete genomes or existing databases and thus can potentially be very useful for the comparison of metagenomic samples using NGS read data. Still, the applications of sequence signature methods for the comparison of metagenomic samples have not been well studied. Results: We studied several dissimilarity measures for the comparison of metagenomic samples using sequence signatures, including d2, d2* and d2S recently developed by our group, a measure (hereinafter noted as Hao) used in CVTree developed by Hao’s group (Qi et al., 2004), measures based on relative di-, tri-, and tetra-nucleotide frequencies as in Willner et al. (2009), as well as standard lp measures between the frequency vectors. We compared their performance using a series of extensive simulations and three real NGS metagenomic datasets: 39 fecal samples from 33 mammalian host species, 56 marine samples across the world, and 13 fecal samples from human individuals. Results showed that the dissimilarity measure d2S can achieve superior performance when comparing metagenomic samples by clustering them into different groups as well as recovering environmental gradients affecting microbial samples. New insights into the environmental factors affecting microbial compositions in metagenomic samples are obtained through
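    As an illustration of the sequence-signature idea in the record above, the following is a minimal sketch that builds a k-mer relative-frequency vector for each sample and compares two samples with a standard lp distance. It is not the authors' implementation, and it does not reproduce the d2, d2* or d2S measures, whose definitions are not given in the abstract. The function names (`kmer_frequencies`, `lp_distance`) and the toy reads are assumptions made for this example only.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(reads, k):
    """Return relative frequencies of all 4**k DNA k-mers over a set of reads.

    K-mers containing characters other than A, C, G, T (e.g. N) are skipped.
    The result is ordered lexicographically by k-mer (AA, AC, AG, ...).
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    if total == 0:
        return [0.0] * len(all_kmers)
    return [counts[km] / total for km in all_kmers]


def lp_distance(f, g, p=2):
    """Standard l_p distance between two frequency vectors of equal length."""
    return sum(abs(a - b) ** p for a, b in zip(f, g)) ** (1.0 / p)


# Hypothetical toy "samples"; real inputs would be NGS reads from each sample.
sample_a = ["ACGTACGT", "ACGTTTGA"]
sample_b = ["GGGCCCGG", "CCGGGCCC"]

freq_a = kmer_frequencies(sample_a, k=2)
freq_b = kmer_frequencies(sample_b, k=2)

print(lp_distance(freq_a, freq_a))  # identical signatures give distance 0.0
print(lp_distance(freq_a, freq_b))  # differing composition gives a positive distance
```

    Because the signature is computed directly from the reads, no assembly or reference database is involved, which is the property the abstract highlights. A pairwise distance matrix built this way could then be passed to any clustering routine.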

  16. [Metagenomics in studying gastrointestinal tract microorganism].

    PubMed

    Xu, Bo; Yang, Yunjuan; Li, Junjun; Tang, Xianghua; Mu, Yuelin; Huang, Zunxi

    2013-12-01

    The animal gastrointestinal tract contains a complex community of microbes, whose composition ultimately reflects the co-evolution of microorganisms with their animal host. The gut microbial community of humans and animals has received significant attention from researchers because of its association with health and disease. The application of metagenomics technology enables researchers to study not only the microbial composition but also the function of microbes in the gastrointestinal tract. In this paper, drawing on our own findings, we summarize advances in the study of gastrointestinal tract microorganisms using metagenomics and bioinformatics technology.

  17. A metagenomic study of primate insect diet diversity.

    PubMed

    Pickett, Sarah B; Bergey, Christina M; Di Fiore, Anthony

    2012-07-01

    Descriptions of primate diets are generally based on either direct observation of foraging behavior, morphological classification of food remains from feces, or analysis of the stomach contents of deceased individuals. Some diet items (e.g. insect prey), however, are difficult to identify visually, and observation conditions often do not permit adequate quantitative sampling of feeding behavior. Moreover, the taxonomically informative morphology of some food species (e.g. swallowed seeds, insect exoskeletons) may be destroyed by the digestive process. Because of these limitations, we used a metagenomic approach to conduct a preliminary, "proof of concept" study of interspecific variation in the insect component of the diets of six sympatric New World monkeys known, based on observational field studies, to differ markedly in their feeding ecology. We used generalized arthropod polymerase chain reaction (PCR) primers and cloning to sequence mitochondrial DNA (mtDNA) sequences of the arthropod cytochrome b (CYT B) gene from fecal samples of wild woolly, titi, saki, capuchin, squirrel, and spider monkeys collected from a single sampling site in western Amazonia where these genera occur sympatrically. We then assigned preliminary taxonomic identifications to the sequences by basic local alignment search tool (BLAST) comparison to arthropod CYT B sequences present in GenBank. This study is the first to use molecular techniques to identify insect prey in primate diets. The results suggest that a metagenomic approach may prove valuable in augmenting and corroborating observational data and increasing the resolution of primate diet studies, although the lack of comparative reference sequences for many South American insects limits the approach at present. As such reference data become available for more animal and plant taxa, this approach also holds promise for studying additional components of primate diets. PMID:22553123

  18. Metagenomic exploration of viruses throughout the Indian Ocean.

    PubMed

    Williamson, Shannon J; Allen, Lisa Zeigler; Lorenzi, Hernan A; Fadrosh, Douglas W; Brami, Daniel; Thiagarajan, Mathangi; McCrow, John P; Tovchigrechko, Andrey; Yooseph, Shibu; Venter, J Craig

    2012-01-01

    The characterization of global marine microbial taxonomic and functional diversity is a primary goal of the Global Ocean Sampling Expedition. As part of this study, 19 water samples were collected aboard the Sorcerer II sailing vessel from the southern Indian Ocean in an effort to more thoroughly understand the lifestyle strategies of the microbial inhabitants of this ultra-oligotrophic region. No investigations of whole virioplankton assemblages have been conducted on waters collected from the Indian Ocean or across multiple size fractions thus far. Therefore, the goals of this study were to examine the effect of size fractionation on viral consortia structure and function and understand the diversity and functional potential of the Indian Ocean virome. Five samples were selected for comprehensive metagenomic exploration; and sequencing was performed on the microbes captured on 3.0-, 0.8- and 0.1 µm membrane filters as well as the viral fraction (<0.1 µm). Phylogenetic approaches were also used to identify predicted proteins of viral origin in the larger fractions of data from all Indian Ocean samples, which were included in subsequent metagenomic analyses. Taxonomic profiling of viral sequences suggested that size fractionation of marine microbial communities enriches for specific groups of viruses within the different size classes and functional characterization further substantiated this observation. Functional analyses also revealed a relative enrichment for metabolic proteins of viral origin that potentially reflect the physiological condition of host cells in the Indian Ocean including those involved in nitrogen metabolism and oxidative phosphorylation. A novel classification method, MGTAXA, was used to assess virus-host relationships in the Indian Ocean by predicting the taxonomy of putative host genera, with Prochlorococcus, Acanthochlois and members of the SAR86 cluster comprising the most abundant predictions. This is the first study to holistically

  20. Metagenomic Exploration of Viruses throughout the Indian Ocean

    PubMed Central

    Lorenzi, Hernan A.; Fadrosh, Douglas W.; Brami, Daniel; Thiagarajan, Mathangi; McCrow, John P.; Tovchigrechko, Andrey; Yooseph, Shibu; Venter, J. Craig

    2012-01-01

    The characterization of global marine microbial taxonomic and functional diversity is a primary goal of the Global Ocean Sampling Expedition. As part of this study, 19 water samples were collected aboard the Sorcerer II sailing vessel from the southern Indian Ocean in an effort to more thoroughly understand the lifestyle strategies of the microbial inhabitants of this ultra-oligotrophic region. No investigations of whole virioplankton assemblages have been conducted on waters collected from the Indian Ocean or across multiple size fractions thus far. Therefore, the goals of this study were to examine the effect of size fractionation on viral consortia structure and function and understand the diversity and functional potential of the Indian Ocean virome. Five samples were selected for comprehensive metagenomic exploration; and sequencing was performed on the microbes captured on 3.0-, 0.8- and 0.1 µm membrane filters as well as the viral fraction (<0.1 µm). Phylogenetic approaches were also used to identify predicted proteins of viral origin in the larger fractions of data from all Indian Ocean samples, which were included in subsequent metagenomic analyses. Taxonomic profiling of viral sequences suggested that size fractionation of marine microbial communities enriches for specific groups of viruses within the different size classes and functional characterization further substantiated this observation. Functional analyses also revealed a relative enrichment for metabolic proteins of viral origin that potentially reflect the physiological condition of host cells in the Indian Ocean including those involved in nitrogen metabolism and oxidative phosphorylation. A novel classification method, MGTAXA, was used to assess virus-host relationships in the Indian Ocean by predicting the taxonomy of putative host genera, with Prochlorococcus, Acanthochlois and members of the SAR86 cluster comprising the most abundant predictions. This is the first study to holistically

  1. Metagenomics as a Tool for Enzyme Discovery: Hydrolytic Enzymes from Marine-Related Metagenomes.

    PubMed

    Popovic, Ana; Tchigvintsev, Anatoly; Tran, Hai; Chernikova, Tatyana N; Golyshina, Olga V; Yakimov, Michail M; Golyshin, Peter N; Yakunin, Alexander F

    2015-01-01

    This chapter discusses metagenomics and its application for enzyme discovery, with a focus on hydrolytic enzymes from marine metagenomic libraries. With less than one percent of culturable microorganisms in the environment, metagenomics, or the collective study of community genetics, has opened up a rich pool of uncharacterized metabolic pathways, enzymes, and adaptations. This great untapped pool of genes provides the particularly exciting potential to mine for new biochemical activities or novel enzymes with activities tailored to peculiar sets of environmental conditions. Metagenomes also represent a huge reservoir of novel enzymes for applications in biocatalysis, biofuels, and bioremediation. Here we present the results of enzyme discovery for four enzyme activities, of particular industrial or environmental interest, including esterase/lipase, glycosyl hydrolase, protease and dehalogenase. PMID:26621459

  3. Unravelling core microbial metabolisms in the hypersaline microbial mats of Shark Bay using high-throughput metagenomics

    SciTech Connect

    Ruvindy, Rendy; White, Richard A.; Neilan, Brett A.; Burns, Brendan P.

    2015-05-29

    Modern microbial mats are potential analogues of some of Earth’s earliest ecosystems. Excellent examples can be found in Shark Bay, Australia, with mats of various morphologies. To further our understanding of the functional genetic potential of these complex microbial ecosystems, we conducted shotgun metagenomic analyses for the first time. We assembled metagenomic next-generation sequencing data to classify the taxonomic and metabolic potential across diverse morphologies of marine mats in Shark Bay. The microbial community across taxonomic classifications using protein-coding and small subunit rRNA genes directly extracted from the metagenomes suggests that three phyla, Proteobacteria, Cyanobacteria and Bacteroidetes, dominate all marine mats. However, the microbial community structures of the Shark Bay and Highbourne Cay (Bahamas) marine systems appear to be distinct from each other. The metabolic potentials (based on SEED subsystem classifications) of the Shark Bay and Highbourne Cay microbial communities were also distinct. Shark Bay metagenomes have a metabolic pathway profile consisting of both heterotrophic and photosynthetic pathways, whereas Highbourne Cay appears to be dominated almost exclusively by photosynthetic pathways. Alternative non-rubisco-based carbon metabolism including reductive TCA cycle and 3-hydroxypropionate/4-hydroxybutyrate pathways is highly represented in Shark Bay metagenomes while not represented in Highbourne Cay microbial mats or any other mat-forming ecosystems investigated to date. Potentially novel aspects of nitrogen cycling were also observed, as well as putative heavy metal cycling (arsenic, mercury, copper and cadmium). Finally, archaea are highly represented in Shark Bay and may have critical roles in overall ecosystem function in these modern microbial mats.

  4. Unravelling core microbial metabolisms in the hypersaline microbial mats of Shark Bay using high-throughput metagenomics.

    PubMed

    Ruvindy, Rendy; White, Richard Allen; Neilan, Brett Anthony; Burns, Brendan Paul

    2016-01-01

    Modern microbial mats are potential analogues of some of Earth's earliest ecosystems. Excellent examples can be found in Shark Bay, Australia, with mats of various morphologies. To further our understanding of the functional genetic potential of these complex microbial ecosystems, we conducted shotgun metagenomic analyses for the first time. We assembled metagenomic next-generation sequencing data to classify the taxonomic and metabolic potential across diverse morphologies of marine mats in Shark Bay. The microbial community across taxonomic classifications using protein-coding and small subunit rRNA genes directly extracted from the metagenomes suggests that three phyla, Proteobacteria, Cyanobacteria and Bacteroidetes, dominate all marine mats. However, the microbial community structures of the Shark Bay and Highbourne Cay (Bahamas) marine systems appear to be distinct from each other. The metabolic potentials (based on SEED subsystem classifications) of the Shark Bay and Highbourne Cay microbial communities were also distinct. Shark Bay metagenomes have a metabolic pathway profile consisting of both heterotrophic and photosynthetic pathways, whereas Highbourne Cay appears to be dominated almost exclusively by photosynthetic pathways. Alternative non-rubisco-based carbon metabolism including reductive TCA cycle and 3-hydroxypropionate/4-hydroxybutyrate pathways is highly represented in Shark Bay metagenomes while not represented in Highbourne Cay microbial mats or any other mat-forming ecosystems investigated to date. Potentially novel aspects of nitrogen cycling were also observed, as well as putative heavy metal cycling (arsenic, mercury, copper and cadmium). Finally, archaea are highly represented in Shark Bay and may have critical roles in overall ecosystem function in these modern microbial mats. PMID:26023869

  6. Physiological and evolutionary potential of microorganisms from the Canterbury Basin subseafloor, a metagenomic approach.

    PubMed

    Gaboyer, Frédéric; Burgaud, Gaëtan; Alain, Karine

    2015-05-01

    Subseafloor sediments represent a large reservoir of organic matter and are inhabited by microbial groups of the three domains of life. Besides impacting the planetary geochemical cycles, the subsurface biosphere remains poorly understood, notably questions related to possible metabolic pathways and selective advantages that may be deployed by buried microorganisms (sporulation, response to stress, dormancy). In order to better understand physiological potentials and possible lifestyles of subseafloor microbial communities, we analyzed two metagenomes from subseafloor sediments collected at 31 mbsf (meters below the sea floor) and 136 mbsf in the Canterbury Basin. Metagenomic phylogenetic and functional diversities were very similar. Phylogenetic diversity was mostly represented by Chloroflexi, Firmicutes and Proteobacteria for Bacteria and by Thaumarchaeota and Euryarchaeota for Archaea. Predicted anaerobic metabolisms encompassed fermentation, methanogenesis and utilization of fatty acids, aromatic and halogenated substrates. Potential processes that may confer selective advantages for subsurface microorganisms included sporulation, detoxication equipment or osmolyte accumulation. Annotation of genomic fragments described the metabolic versatility of Chloroflexi, Miscellaneous Crenarchaeotic Group and Euryarchaeota and showed frequent recombination events within subsurface taxa. This study confirmed that the subseafloor habitat is unique compared to other habitats at the (meta)-genomic level and described physiological potential of still uncultured groups. PMID:25873465

  7. The Metagenome of Utricularia gibba's Traps: Into the Microbial Input to a Carnivorous Plant

    PubMed Central

    Alcaraz, Luis David; Martínez-Sánchez, Shamayim; Torres, Ignacio; Ibarra-Laclette, Enrique; Herrera-Estrella, Luis

    2016-01-01

    The genome and transcriptome sequences of the aquatic, rootless, and carnivorous plant Utricularia gibba L. (Lentibulariaceae) were recently determined. Traps are necessary for U. gibba because they help the plant to survive in nutrient-deprived environments. The U. gibba traps (Ugt) are specialized structures that have been proposed to selectively filter microbial inhabitants. To determine whether the traps indeed have a microbiome that differs, in composition or abundance, from the microbiome in the surrounding environment, we used whole-genome shotgun (WGS) metagenomics to describe both the taxonomic and functional diversity of the Ugt microbiome. We collected U. gibba plants from their natural habitat and directly sequenced the metagenome of the Ugt microbiome and its surrounding water. The total predicted number of species in the Ugt was more than 1,100. Using pan-genome fragment recruitment analysis, we were able to identify some key Ugt players, such as Pseudomonas monteilii, to the species level. Functional analysis of the Ugt metagenome suggests that the trap microbiome plays an important role in nutrient scavenging and assimilation while complementing the hydrolytic functions of the plant. PMID:26859489

  9. Novel resistance functions uncovered using functional metagenomic investigations of resistance reservoirs

    PubMed Central

    Pehrsson, Erica C.; Forsberg, Kevin J.; Gibson, Molly K.; Ahmadi, Sara; Dantas, Gautam

    2013-01-01

    Rates of infection with antibiotic-resistant bacteria have increased precipitously over the past several decades, with far-reaching healthcare and societal costs. Recent evidence has established a link between antibiotic resistance genes in human pathogens and those found in non-pathogenic, commensal, and environmental organisms, prompting deeper investigation of natural and human-associated reservoirs of antibiotic resistance. Functional metagenomic selections, in which shotgun-cloned DNA fragments are selected for their ability to confer survival to an indicator host, have been increasingly applied to the characterization of many antibiotic resistance reservoirs. These experiments have demonstrated that antibiotic resistance genes are highly diverse and widely distributed, many times bearing little to no similarity to known sequences. Through unbiased selections for survival to antibiotic exposure, functional metagenomics can improve annotations by reducing the discovery of false-positive resistance and by allowing for the identification of previously unrecognizable resistance genes. In this review, we summarize the novel resistance functions uncovered using functional metagenomic investigations of natural and human-impacted resistance reservoirs. Examples of novel antibiotic resistance genes include those highly divergent from known sequences, those for which sequence is entirely unable to predict resistance function, bifunctional resistance genes, and those with unconventional, atypical resistance mechanisms. Overcoming antibiotic resistance in the clinic will require a better understanding of existing resistance reservoirs and the dissemination networks that govern horizontal gene exchange, informing best practices to limit the spread of resistance-conferring genes to human pathogens. PMID:23760651

  10. Multisubstrate Isotope Labeling and Metagenomic Analysis of Active Soil Bacterial Communities

    PubMed Central

    Verastegui, Y.; Cheng, J.; Engel, K.; Kolczynski, D.; Mortimer, S.; Lavigne, J.; Montalibet, J.; Romantsov, T.; Hall, M.; McConkey, B. J.; Rose, D. R.; Tomashek, J. J.; Scott, B. R.

    2014-01-01

    Soil microbial diversity represents the largest global reservoir of novel microorganisms and enzymes. In this study, we coupled functional metagenomics and DNA stable-isotope probing (DNA-SIP) using multiple plant-derived carbon substrates and diverse soils to characterize active soil bacterial communities and their glycoside hydrolase genes, which have value for industrial applications. We incubated samples from three disparate Canadian soils (tundra, temperate rainforest, and agricultural) with five native carbon (12C) or stable-isotope-labeled (13C) carbohydrates (glucose, cellobiose, xylose, arabinose, and cellulose). Indicator species analysis revealed high specificity and fidelity for many uncultured and unclassified bacterial taxa in the heavy DNA for all soils and substrates. Among characterized taxa, Actinomycetales (Salinibacterium), Rhizobiales (Devosia), Rhodospirillales (Telmatospirillum), and Caulobacterales (Phenylobacterium and Asticcacaulis) were bacterial indicator species for the heavy substrates and soils tested. Both Actinomycetales and Caulobacterales (Phenylobacterium) were associated with metabolism of cellulose, and Alphaproteobacteria were associated with the metabolism of arabinose; members of the order Rhizobiales were strongly associated with the metabolism of xylose. Annotated metagenomic data suggested diverse glycoside hydrolase gene representation within the pooled heavy DNA. By screening 2,876 cloned fragments derived from the 13C-labeled DNA isolated from soils incubated with cellulose, we demonstrate the power of combining DNA-SIP, multiple-displacement amplification (MDA), and functional metagenomics by efficiently isolating multiple clones with activity on carboxymethyl cellulose and fluorogenic proxy substrates for carbohydrate-active enzymes. PMID:25028422

  11. The Metagenome of Utricularia gibba's Traps: Into the Microbial Input to a Carnivorous Plant.

    PubMed

    Alcaraz, Luis David; Martínez-Sánchez, Shamayim; Torres, Ignacio; Ibarra-Laclette, Enrique; Herrera-Estrella, Luis

    2016-01-01

    The genome and transcriptome sequences of the aquatic, rootless, and carnivorous plant Utricularia gibba L. (Lentibulariaceae) were recently determined. Traps are necessary for U. gibba because they help the plant to survive in nutrient-deprived environments. The U. gibba traps (Ugt) are specialized structures that have been proposed to selectively filter microbial inhabitants. To determine whether the traps indeed have a microbiome that differs, in composition or abundance, from the microbiome in the surrounding environment, we used whole-genome shotgun (WGS) metagenomics to describe both the taxonomic and functional diversity of the Ugt microbiome. We collected U. gibba plants from their natural habitat and directly sequenced the metagenome of the Ugt microbiome and its surrounding water. The total predicted number of species in the Ugt was more than 1,100. Using pan-genome fragment recruitment analysis, we were able to identify some key Ugt players, such as Pseudomonas monteilii, to the species level. Functional analysis of the Ugt metagenome suggests that the trap microbiome plays an important role in nutrient scavenging and assimilation while complementing the hydrolytic functions of the plant. PMID:26859489

  12. Microbial Diversity and Biochemical Potential Encoded by Thermal Spring Metagenomes Derived from the Kamchatka Peninsula

    PubMed Central

    Wemheuer, Bernd; Taube, Robert; Akyol, Pinar; Wemheuer, Franziska; Daniel, Rolf

    2013-01-01

    Volcanic regions contain a variety of environments suitable for extremophiles. This study was focused on assessing and exploiting the prokaryotic diversity of two microbial communities derived from different Kamchatkian thermal springs by metagenomic approaches. Samples were taken from a thermoacidophilic spring near the Mutnovsky Volcano and from a thermophilic spring in the Uzon Caldera. Environmental DNA for metagenomic analysis was isolated from collected sediment samples by direct cell lysis. The prokaryotic community composition was examined by analysis of archaeal and bacterial 16S rRNA genes. A total number of 1235 16S rRNA gene sequences were obtained and used for taxonomic classification. Most abundant in the samples were members of Thaumarchaeota, Thermotogae, and Proteobacteria. The Mutnovsky hot spring was dominated by the Terrestrial Hot Spring Group, Kosmotoga, and Acidithiobacillus. The Uzon Caldera was dominated by uncultured members of the Miscellaneous Crenarchaeotic Group and Enterobacteriaceae. The remaining 16S rRNA gene sequences belonged to the Aquificae, Dictyoglomi, Euryarchaeota, Korarchaeota, Thermodesulfobacteria, Firmicutes, and some potential new phyla. In addition, the recovered DNA was used for generation of metagenomic libraries, which were subsequently mined for genes encoding lipolytic and proteolytic enzymes. Three novel genes conferring lipolytic and one gene conferring proteolytic activity were identified. PMID:23533327

  13. Towards a more complete metagenomics toolkit

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The emerging scientific discipline of metagenomics has not only created a myriad of opportunities for biologists to reveal new insights into the microbial underpinnings of our environment, but has also presented a number of interesting challenges for bioinformatics algorithms and software developers...

  14. The metagenomic approach and causality in virology.

    PubMed

    Castrignano, Silvana Beres; Nagasse-Sugahara, Teresa Keico

    2015-01-01

    The metagenomic approach has become a very important tool in the discovery of new viruses in environmental and biological samples. Here we discuss how these discoveries may help to elucidate the etiology of diseases and the criteria necessary to establish a causal association between a virus and a disease.

  15. Biomolecular and metagenomic analyses of biofouling communities

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Despite the decades of research that have focused on understanding the formation of biofouling communities, relatively little is known about the soft fouling consortia that are responsible for their formation and function. In this study, we used PhyloChip microbial profiling, metagenomic DNA sequenc...

  16. The metagenomic approach and causality in virology

    PubMed Central

    Castrignano, Silvana Beres; Nagasse-Sugahara, Teresa Keico

    2015-01-01

    The metagenomic approach has become a very important tool in the discovery of new viruses in environmental and biological samples. Here we discuss how these discoveries may help to elucidate the etiology of diseases and the criteria necessary to establish a causal association between a virus and a disease. PMID:25902566

  17. Assembly of viral genomes from metagenomes

    PubMed Central

    Smits, Saskia L.; Bodewes, Rogier; Ruiz-Gonzalez, Aritz; Baumgärtner, Wolfgang; Koopmans, Marion P.; Osterhaus, Albert D. M. E.; Schürch, Anita C.

    2014-01-01

    Viral infections remain a serious global health issue. Metagenomic approaches are increasingly used in the detection of novel viral pathogens but also to generate complete genomes of uncultivated viruses. In silico identification of complete viral genomes from sequence data would allow rapid phylogenetic characterization of these new viruses. Often, however, complete viral genomes are not recovered, but rather several distinct contigs derived from a single entity are, some of which have no sequence homology to any known proteins. De novo assembly of single viruses from a metagenome is challenging, not only because of the lack of a reference genome, but also because of intrapopulation variation and uneven or insufficient coverage. Here we explored different assembly algorithms, remote homology searches, genome-specific sequence motifs, k-mer frequency ranking, and coverage profile binning to detect and obtain viral target genomes from metagenomes. All methods were tested on 454-generated sequencing datasets containing three recently described RNA viruses with relatively large genomes that were divergent from previously known viruses of the viral families Rhabdoviridae and Coronaviridae. Depending on specific characteristics of the target virus and the metagenomic community, different assembly and in silico gap closure strategies were successful in obtaining near-complete viral genomes. PMID:25566226

  18. Functional metagenomics: recent advances and future challenges.

    PubMed

    Chistoserdova, Ludmila

    2010-01-01

    Metagenomics is a relatively new but fast-growing field within environmental biology directed at obtaining knowledge on genomes of environmental microbes as well as of entire microbial communities. With the sequencing technologies improving steadily, generating large amounts of sequence is becoming routine. However, it remains difficult to connect specific microbial phyla to specific functions in the environment. A number of 'functional metagenomics' approaches have been implemented in recent years that allow high-resolution genomic analysis of uncultivated microbes, connecting them to specific functions in the environment. These include analysis of niche-specialized low-complexity communities, reactor enrichments, and the use of labeling technologies. Metatranscriptomics and metaproteomics are the newest sub-disciplines within the metagenomics field that provide further levels of resolution for functional analysis of uncultivated microbes and communities. The recent emergence of new (next-generation) sequencing technologies, resulting in higher sequence output and a dramatic drop in the price of sequencing, will define a new era in metagenomics. The sequencing effort will then be taken to a new level, making it possible to address new, previously unattainable biological questions and to accelerate genome-based discovery for medical and biotechnological applications.

  19. Fractals and fragmentation

    NASA Technical Reports Server (NTRS)

    Turcotte, D. L.

    1986-01-01

    The use of renormalization group techniques on fragmentation problems is examined. The equations which represent fractals and the size-frequency distributions of fragments are presented. Methods for calculating the size distributions of asteroids and meteorites are described; the frequency-mass distributions for these interplanetary objects are due to fragmentation. The application of two renormalization group models to fragmentation is analyzed. It is observed that the models yield a fractal behavior for fragmentation; however, different values for the fractal dimension are produced. It is concluded that fragmentation is a scale-invariant process and that the fractal dimension is a measure of the fragility of the fragmented material.
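
    As an illustration of the size-frequency relation this record refers to: a fractal fragment population is commonly written as N(>r) ∝ r^(-D), where N(>r) is the number of fragments larger than r and D is the fractal dimension. The sketch below is not taken from the report; it assumes a plain list of fragment sizes and estimates D from the slope of log N(>r) against log r by ordinary least squares. The function name and the synthetic data are hypothetical.

```python
import math

def fractal_dimension(sizes):
    """Estimate D in N(>r) ~ r**(-D) from a list of distinct fragment sizes.

    The i-th largest fragment has cumulative count i, so we regress
    log(rank) on log(size) and return the negated slope.
    """
    ordered = sorted(sizes, reverse=True)
    xs = [math.log(r) for r in ordered]                      # log r
    ys = [math.log(i) for i in range(1, len(ordered) + 1)]   # log N(>= r)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

# Synthetic example (not real data): sizes constructed so that the
# i-th largest fragment has size i**(-1/D) with D = 2.5.
synthetic = [i ** (-1 / 2.5) for i in range(1, 201)]
estimated_d = fractal_dimension(synthetic)
```

    On real fragment counts the log-log plot is rarely a perfect line, so the fitted D depends on the size range chosen for the fit.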

  20. Comparative metagenomics of microbial communities inhabiting deep-sea hydrothermal vent chimneys with contrasting chemistries

    PubMed Central

    Xie, Wei; Wang, Fengping; Guo, Lei; Chen, Zeling; Sievert, Stefan M; Meng, Jun; Huang, Guangrui; Li, Yuxin; Yan, Qingyu; Wu, Shan; Wang, Xin; Chen, Shangwu; He, Guangyuan; Xiao, Xiang; Xu, Anlong

    2011-01-01

    Deep-sea hydrothermal vent chimneys harbor a high diversity of largely unknown microorganisms. Although the phylogenetic diversity of these microorganisms has been described previously, the adaptation and metabolic potential of the microbial communities is only beginning to be revealed. A pyrosequencing approach was used to directly obtain sequences from a fosmid library constructed from a black smoker chimney 4143-1 in the Mothra hydrothermal vent field at the Juan de Fuca Ridge. A total of 308 034 reads with an average sequence length of 227 bp were generated. Comparative genomic analyses of metagenomes from a variety of environments by two-way clustering of samples and functional gene categories demonstrated that the 4143-1 metagenome clustered most closely with that from a carbonate chimney from Lost City. Both are highly enriched in genes for mismatch repair and homologous recombination, suggesting that the microbial communities have evolved extensive DNA repair systems to cope with the extreme conditions that have potential deleterious effects on the genomes. As previously reported for the Lost City microbiome, the metagenome of chimney 4143-1 exhibited a high proportion of transposases, implying that horizontal gene transfer may be a common occurrence in the deep-sea vent chimney biosphere. In addition, genes for chemotaxis and flagellar assembly were highly enriched in the chimney metagenomes, reflecting the adaptation of the organisms to the highly dynamic conditions present within the chimney walls. Reconstruction of the metabolic pathways revealed that the microbial community in the wall of chimney 4143-1 was mainly fueled by sulfur oxidation, putatively coupled to nitrate reduction to perform inorganic carbon fixation through the Calvin–Benson–Bassham cycle. On the basis of the genomic organization of the key genes of the carbon fixation and sulfur oxidation pathways contained in the large genomic fragments, both obligate and facultative

  1. Metagenomic Analysis Suggests Modern Freshwater Microbialites Harbor a Distinct Core Microbial Community.

    PubMed

    White, Richard Allen; Chan, Amy M; Gavelis, Gregory S; Leander, Brian S; Brady, Allyson L; Slater, Gregory F; Lim, Darlene S S; Suttle, Curtis A

    2015-01-01

    Modern microbialites are complex microbial communities that interface with abiotic factors to form carbonate-rich organosedimentary structures whose ancestors provide the earliest evidence of life. Past studies primarily on marine microbialites have inventoried diverse taxa and metabolic pathways, but it is unclear which of these are members of the microbialite community and which are introduced from adjacent environments. Here we control for these factors by sampling the surrounding water and nearby sediment, in addition to the microbialites, and use a metagenomics approach to interrogate the microbial community. Our findings suggest that the Pavilion Lake microbialite community profile, metabolic potential and pathway distributions are distinct from those in the neighboring sediments and water. Based on RefSeq classification, members of the Proteobacteria (e.g., alpha and delta classes) were the dominant taxa in the microbialites, and possessed novel functional guilds associated with the metabolism of heavy metals, antibiotic resistance, primary alcohol biosynthesis and urea metabolism; the latter may help drive biomineralization. Urea metabolism within Pavilion Lake microbialites is a feature not previously associated with other microbialites. The microbialite communities were also significantly enriched for cyanobacteria and acidobacteria, which likely play an important role in biomineralization. Additional findings suggest that Pavilion Lake microbialites are under viral selection as genes associated with viral infection (e.g., CRISPR-Cas, phage shock and phage excision) are abundant within the microbialite metagenomes. The morphology of Pavilion Lake microbialites changes dramatically with depth; yet, metagenomic data did not vary significantly by morphology or depth, indicating that microbialite morphology is altered by other factors, perhaps transcriptional differences or abiotic conditions. This work provides a comprehensive metagenomic perspective of the

  2. Metagenomic Analysis Suggests Modern Freshwater Microbialites Harbor a Distinct Core Microbial Community

    PubMed Central

    White, Richard Allen; Chan, Amy M.; Gavelis, Gregory S.; Leander, Brian S.; Brady, Allyson L.; Slater, Gregory F.; Lim, Darlene S. S.; Suttle, Curtis A.

    2016-01-01

    Modern microbialites are complex microbial communities that interface with abiotic factors to form carbonate-rich organosedimentary structures whose ancestors provide the earliest evidence of life. Past studies primarily on marine microbialites have inventoried diverse taxa and metabolic pathways, but it is unclear which of these are members of the microbialite community and which are introduced from adjacent environments. Here we control for these factors by sampling the surrounding water and nearby sediment, in addition to the microbialites, and use a metagenomics approach to interrogate the microbial community. Our findings suggest that the Pavilion Lake microbialite community profile, metabolic potential and pathway distributions are distinct from those in the neighboring sediments and water. Based on RefSeq classification, members of the Proteobacteria (e.g., alpha and delta classes) were the dominant taxa in the microbialites, and possessed novel functional guilds associated with the metabolism of heavy metals, antibiotic resistance, primary alcohol biosynthesis and urea metabolism; the latter may help drive biomineralization. Urea metabolism within Pavilion Lake microbialites is a feature not previously associated with other microbialites. The microbialite communities were also significantly enriched for cyanobacteria and acidobacteria, which likely play an important role in biomineralization. Additional findings suggest that Pavilion Lake microbialites are under viral selection as genes associated with viral infection (e.g., CRISPR-Cas, phage shock and phage excision) are abundant within the microbialite metagenomes. The morphology of Pavilion Lake microbialites changes dramatically with depth; yet, metagenomic data did not vary significantly by morphology or depth, indicating that microbialite morphology is altered by other factors, perhaps transcriptional differences or abiotic conditions. This work provides a comprehensive metagenomic perspective of the

  3. A function-based screen for seeking RubisCO active clones from metagenomes: novel enzymes influencing RubisCO activity.

    PubMed

    Böhnke, Stefanie; Perner, Mirjam

    2015-03-01

    Ribulose-1,5-bisphosphate carboxylase/oxygenase (RubisCO) is a key enzyme of the Calvin cycle, which is responsible for most of Earth's primary production. Although research on RubisCO genes and enzymes in plants, cyanobacteria and bacteria has been ongoing for years, still little is understood about its regulation and activation in bacteria. Even more so, hardly any information exists about the function of metagenomic RubisCOs and the role of the enzymes encoded on the flanking DNA owing to the lack of available function-based screens for seeking active RubisCOs from the environment. Here we present the first solely activity-based approach for identifying RubisCO active fosmid clones from a metagenomic library. We constructed a metagenomic library from hydrothermal vent fluids and screened 1056 fosmid clones. Twelve clones exhibited RubisCO activity and the metagenomic fragments resembled genes from Thiomicrospira crunogena. One of these clones was further analyzed. It contained a 35.2 kb metagenomic insert carrying the RubisCO gene cluster and flanking DNA regions. Knockouts of twelve genes and two intergenic regions on this metagenomic fragment demonstrated that the RubisCO activity was significantly impaired and was attributed to deletions in genes encoding putative transcriptional regulators and those believed to be vital for RubisCO activation. Our new technique revealed a novel link between a poorly characterized gene and RubisCO activity. This screen opens the door to directly investigating RubisCO genes and respective enzymes from environmental samples.

  4. Classification Options

    ERIC Educational Resources Information Center

    Exceptional Children, 1978

    1978-01-01

    The interview presents opinions of Nicholas Hobbs on the classification of exceptional children, including topics such as ecologically oriented classification systems, the role of parents, and need for revision of teacher preparation programs. (IM)

  5. Metagenomic studies of the Red Sea.

    PubMed

    Behzad, Hayedeh; Ibarra, Martin Augusto; Mineta, Katsuhiko; Gojobori, Takashi

    2016-02-01

    Metagenomics has significantly advanced the field of marine microbial ecology, revealing the vast diversity of previously unknown microbial life forms in different marine niches. The tremendous amount of data generated has enabled identification of a large number of microbial genes (metagenomes), their community interactions, adaptation mechanisms, and their potential applications in pharmaceutical and biotechnology-based industries. Comparative metagenomics reveals that microbial diversity is a function of the local environment, meaning that unique or unusual environments typically harbor novel microbial species with unique genes and metabolic pathways. The Red Sea has an abundance of unique characteristics; however, its microbiota is one of the least studied among marine environments. The Red Sea harbors approximately 25 hot anoxic brine pools, plus a vibrant coral reef ecosystem. Physicochemical studies describe the Red Sea as an oligotrophic environment that contains some of the warmest and saltiest waters in the world, with year-round high UV radiation. These characteristics are believed to have shaped the evolution of microbial communities in the Red Sea. Over-representation of genes involved in DNA repair, high-intensity light responses, and osmoregulation was found in the Red Sea metagenomic databases, suggesting acquisition of specific environmental adaptations by the Red Sea microbiota. The Red Sea brine pools harbor a diverse range of halophilic and thermophilic bacterial and archaeal communities, which are potential sources of enzymes for pharmaceutical and biotechnology-based applications. Understanding the mechanisms of these adaptations and their function within the larger ecosystem could also prove useful in light of predicted global warming scenarios where global ocean temperatures are expected to rise by 1-3°C in the next few decades. In this review, we provide an overview of the published metagenomic studies that were conducted in the Red Sea, and

  7. Current and future trends in metagenomics: Development of knowledge bases

    NASA Astrophysics Data System (ADS)

    Mori, Hiroshi; Yamada, Takuji; Kurokawa, Ken

    Microbes are essential for every part of life on Earth. Numerous microbes inhabit the biosphere, many of which are uncharacterized or uncultivable. They form complex microbial communities that deeply affect their surrounding environments. Metagenome analysis provides a radically new way of examining such complex microbial communities without isolation or cultivation of individual community members. In this article, we briefly discuss metagenomics and the development of knowledge bases, as well as future trends in metagenomics.

  8. Algal biogeography: metagenomics shows distribution of a picoplanktonic pelagophyte.

    PubMed

    Raven, John A

    2012-09-11

    How can we determine the distribution of uncultured marine microorganisms? Targeted metagenomics has provided the complete chloroplast genome sequence, and the distribution, for a picoplanktonic pelagophyte alga.

  9. An Experimental Metagenome Data Management and Analysis System

    SciTech Connect

    Markowitz, Victor M.; Korzeniewski, Frank; Palaniappan, Krishna; Szeto, Ernest; Ivanova, Natalia N.; Kyrpides, Nikos C.; Hugenholtz, Philip

    2006-03-01

    The application of shotgun sequencing to environmental samples has revealed a new universe of microbial community genomes (metagenomes) involving previously uncultured organisms. Metagenome analysis, which is expected to provide a comprehensive picture of the gene functions and metabolic capacity of a microbial community, needs to be conducted in the context of a comprehensive data management and analysis system. In this paper we present IMG/M, an experimental metagenome data management and analysis system that is based on the Integrated Microbial Genomes (IMG) system. IMG/M provides tools and viewers for analyzing both metagenomes and isolate genomes individually or in a comparative context.

  10. The porcine gut microbial metagenomic library for mining novel cellulases established from growing pigs fed cellulose-supplemented high-fat diets.

    PubMed

    Wang, W; Archbold, T; Kimber, M S; Li, J; Lam, J S; Fan, M Z

    2012-12-01

    The porcine gut microbiome is a novel genomic resource for screening cellulose-degrading enzymes. A plasmid metagenomic expression library was constructed from the hindgut microbiota of 6 Yorkshire growing pigs (25 to 40 kg) fed a high-fat basal diet supplemented with 10% Solka-Floc for 28 d. Fresh cecal and colonic digesta samples were collected, flash-frozen in liquid N, and stored at -80°C. Metagenomic DNA was extracted, mechanically sheared, and cleaned to remove small DNA fragments (<1.0 kb). The resulting DNA fragments were subjected to blunt-end polishing, fractionation, and purification by using commercial kits. The end-modified DNA fragments were ligated to pCR4Blunt-TOPO vector and transformed into competent Escherichia coli TOPO10 cells. Metagenomic plasmid libraries were screened for carboxymethyl cellulolytic activities by using lysogeny broth agar plates. The average insert size of the resulting library was approximately 4.2 kb. Screening for the ability to hydrolyze carboxymethyl cellulose yielded 14 positive colonies, giving an estimated 430 Mb of metagenomic DNA in the approximately 102,000 E. coli clones with an overall hit rate of 0.14‰. The 11 assembled insert sequences included 4 function-related gene clusters, and a total of 18 putative carbohydrate-active enzyme genes were identified. These included genes encoding 11 cellulases, 4 hemicellulases, 1 polygalacturonase, 1 glycoside hydrolase family 26 mannanase-family 5 cellulase chimeric enzyme, and 1 cellobiose phosphorylase. In conclusion, the coupling of functional metagenomic mining with biochemical characterization of fiber-degrading enzymes is a powerful strategy for exploring the enzymological underpinnings of the anaerobic fermentation of dietary fiber in the complex animal gut environment.
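
    The library figures in this record can be checked with simple arithmetic: about 102,000 clones at an average insert of 4.2 kb is roughly 430 Mb of cloned DNA, and 14 positives out of 102,000 clones is about 0.14 hits per thousand clones. The short sketch below reproduces that calculation; the function name is hypothetical and the inputs are the rounded values quoted in the abstract.

```python
def library_stats(n_clones, mean_insert_kb, n_hits):
    """Return (total cloned DNA in Mb, hits per clone) for a plasmid library."""
    total_mb = n_clones * mean_insert_kb / 1000.0  # kb -> Mb
    hit_fraction = n_hits / n_clones
    return total_mb, hit_fraction

# Rounded values quoted in the abstract above.
total_mb, hit_fraction = library_stats(102_000, 4.2, 14)
print(f"{total_mb:.1f} Mb cloned, {hit_fraction * 1000:.2f} hits per 1000 clones")
```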

  11. Genomics and metagenomics in medical microbiology.

    PubMed

    Padmanabhan, Roshan; Mishra, Ajay Kumar; Raoult, Didier; Fournier, Pierre-Edouard

    2013-12-01

    Over the last two decades, sequencing tools have evolved from laborious, time-consuming methodologies to real-time detection and deciphering of genomic DNA. Genome sequencing, especially using next-generation sequencing (NGS), has revolutionized the landscape of microbiology and infectious disease. This deluge of sequencing data has not only enabled advances in fundamental biology but also helped improve diagnosis, pathogen typing, detection of virulence and antibiotic resistance, and development of new vaccines and culture media. In addition, NGS has enabled efficient analysis of complex human microflora, both commensal and pathological, through metagenomic methods, thus helping the comprehension and management of human diseases such as obesity. This review summarizes technological advances in genomics and metagenomics relevant to the field of medical microbiology.

  12. [Marine microbial metagenomics: progress and prospect].

    PubMed

    Li, Xiang; Qin, Ling; Dai, Shi-kun; Jiang, Shu-mei; Liu, Zhi-heng

    2007-06-01

    Preliminary statistics showed that there are more than one million species of microbes in marine environments, forming a dynamic genetic reservoir, the majority of which have not been revealed or categorized because of barriers in cultivation techniques. However, the situation has changed in recent years because of the rapid development of phylogenetic studies based on small ribosomal RNA and rDNA sequencing independent of standard laboratory cultivation. These changes have significantly altered our understanding of microbial diversity and microbial ecology. In this review, we highlight some of the recent progress and innovation in research on microbial diversity, and propose a metagenomic scheme as an alternative to overcome some of the barriers that still remain for exploitation of marine microbial diversity for its enormous potential in pharmaceutical applications. We believe that rapid progress in marine metagenomics allows direct access to the genomes of numerous non-cultivable microorganisms for their associated chemical prosperity.

  13. MetaPhinder—Identifying Bacteriophage Sequences in Metagenomic Data Sets

    PubMed Central

    Villarroel, Julia; Lund, Ole; Voldby Larsen, Mette; Nielsen, Morten

    2016-01-01

    Bacteriophages are the most abundant biological entity on the planet, but at the same time do not account for much of the genetic material isolated from most environments due to their small genome sizes. They also show great genetic diversity and mosaic genomes, making it challenging to analyze and understand them. Here we present MetaPhinder, a method to identify assembled genomic fragments (i.e., contigs) of phage origin in metagenomic data sets. The method is based on a comparison to a database of whole-genome bacteriophage sequences, integrating hits to multiple genomes to accommodate the mosaic genome structure of many bacteriophages. The method is demonstrated to outperform both BLAST methods based on single hits and methods based on k-mer comparisons. MetaPhinder is available as a web service at the Center for Genomic Epidemiology https://cge.cbs.dtu.dk/services/MetaPhinder/, while the source code can be downloaded from https://bitbucket.org/genomicepidemiology/metaphinder or https://github.com/vanessajurtz/MetaPhinder. PMID:27684958
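
    To illustrate why pooling hits from several genomes helps with mosaic phages, the sketch below merges alignment intervals on a contig, regardless of which database genome they came from, and reports the fraction of the contig covered. This is only an illustration of the general idea under simplifying assumptions (1-based inclusive coordinates, no identity weighting); it is not MetaPhinder's actual scoring, and the function name and example hits are hypothetical.

```python
def merged_coverage(contig_length, hits):
    """Fraction of a contig covered by the union of hit intervals.

    `hits` is an iterable of (start, end) pairs on the contig, 1-based and
    inclusive, pooled from hits to any number of database genomes.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(hits):
        if cur_end is None or start > cur_end + 1:
            # Gap before this hit: close the previous merged block.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent hit: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered / contig_length

# Hypothetical hits from three different phage genomes on a 1,000 bp contig.
pooled = merged_coverage(1000, [(1, 100), (50, 150), (301, 400)])
```

    A single best hit here would cover at most 10% of the contig, while the pooled hits cover 25%, which is the kind of signal a single-hit approach would miss.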

  14. Metagenomic Assembly Reveals Hosts of Antibiotic Resistance Genes and the Shared Resistome in Pig, Chicken, and Human Feces.

    PubMed

    Ma, Liping; Xia, Yu; Li, Bing; Yang, Ying; Li, Li-Guan; Tiedje, James M; Zhang, Tong

    2016-01-01

    The risk associated with antibiotic resistance disseminating from animal and human feces is an urgent public issue. In the present study, we sought to establish a pipeline for annotating antibiotic resistance genes (ARGs) based on metagenomic assembly to investigate ARGs and their co-occurrence with associated genetic elements. Genetic elements found on the assembled genomic fragments include mobile genetic elements (MGEs) and metal resistance genes (MRGs). We then explored the hosts of these resistance genes and the shared resistome of pig, chicken and human fecal samples. High levels of tetracycline, multidrug, erythromycin, and aminoglycoside resistance genes were discovered in these fecal samples. In particular, a significantly high level of ARGs (7762 ×/Gb) was detected in adult chicken feces, indicating a higher ARG contamination level than in other fecal samples. Many ARG arrangements (e.g., macA-macB and tetA-tetR) were found to be shared by chicken, pig and human feces. In addition, MGEs such as the aadA5-dfrA17-carrying class 1 integron were identified on an assembled scaffold of chicken feces, and are carried by human pathogens. Differential coverage binning analysis revealed significant ARG enrichment in adult chicken feces. A draft genome, annotated as multidrug-resistant Escherichia coli, was retrieved from chicken feces metagenomes and was determined to carry diverse ARGs (multidrug, acriflavine, and macrolide). The present study demonstrates the determination of ARG hosts and the shared resistome from metagenomic data sets and successfully establishes the relationship between ARGs, hosts, and environments. This ARG annotation pipeline based on metagenomic assembly will help to bridge the knowledge gaps regarding ARG-associated genes and ARG hosts with metagenomic data sets. Moreover, this pipeline will facilitate the evaluation of environmental risks in the genetic context of ARGs. PMID:26650334

  15. Metagenomic characterization of Chesapeake Bay virioplankton.

    PubMed

    Bench, Shellie R; Hanson, Thomas E; Williamson, Kurt E; Ghosh, Dhritiman; Radosovich, Mark; Wang, Kui; Wommack, K Eric

    2007-12-01

    Viruses are ubiquitous and abundant throughout the biosphere. In marine systems, virus-mediated processes can have significant impacts on microbial diversity and on global biogeochemical cycling. However, viral genetic diversity remains poorly characterized. To address this shortcoming, a metagenomic library was constructed from Chesapeake Bay virioplankton. The resulting sequences constitute the largest collection of long-read double-stranded DNA (dsDNA) viral metagenome data reported to date. BLAST homology comparisons showed that Chesapeake Bay virioplankton contained a high proportion of unknown (homologous only to environmental sequences) and novel (no significant homolog) sequences. This analysis suggests that dsDNA viruses are likely one of the largest reservoirs of unknown genetic diversity in the biosphere. The taxonomic origin of BLAST homologs to viral library sequences agreed well with reported abundances of co-occurring bacterial subphyla within the estuary and indicated that cyanophages were abundant. However, the low proportion of Siphophage homologs contradicts a previous assertion that this family comprises most bacteriophage diversity. Identification and analyses of cyanobacterial homologs of the psbA gene illustrated the value of metagenomic studies of virioplankton. The phylogeny of inferred PsbA protein sequences suggested that Chesapeake Bay cyanophage strains are endemic in that environment. The ratio of psbA homologous sequences to total cyanophage sequences in the metagenome indicated that the psbA gene may be nearly universal in Chesapeake Bay cyanophage genomes. Furthermore, the low frequency of psbD homologs in the library supports the prediction that Chesapeake Bay cyanophage populations are dominated by Podoviridae.

  16. [Comparative Metagenomics of BIOLAK and A2O Activated Sludge Based on Next-generation Sequencing Technology].

    PubMed

    Tian, Mei; Liu, Han-hu; Shen, Xin

    2016-02-15

    This is the first report of comparative metagenomic analyses of BIOLAK sludge and anaerobic/anoxic/oxic (A2O) sludge. In the BIOLAK and A2O sludge metagenomes, 47 and 51 phyla were identified respectively, more than the numbers of phyla identified in Australia EBPR (enhanced biological phosphorus removal), USA EBPR and Bibby sludge. All phyla found in the BIOLAK sludge were detected in the A2O sludge, but four phyla were found exclusively in the A2O sludge. The proportion of the phylum Ignavibacteriae in the A2O sludge was 2.0440%, which was 3.2 times as much as that in the BIOLAK sludge (0.6376%). Meanwhile, the proportion of the bacterial phylum Gemmatimonadetes in the BIOLAK sludge was 2.4673%, which was >17 times as much as that in the A2O sludge (0.1404%). The proportion of the bacterial phylum Chlamydiae in the BIOLAK metagenome (0.2192%) was >6 times as much as that in the A2O metagenome (0.0360%). Furthermore, 167 genera found in the A2O sludge were not detected in the BIOLAK sludge, and 50 genera found in the BIOLAK sludge were not detected in the A2O sludge. At both the phylum and genus levels, there were large differences between the two biological communities of A2O and BIOLAK sludge. However, the proportions of each group of functional genes associated with metabolism of nitrogen, phosphorus, sulfur and aromatic compounds in BIOLAK were very similar to those in A2O sludge. Moreover, the rankings of all six KEGG (Kyoto Encyclopedia of Genes and Genomes) categories were identical in the two sludges. In addition, the analyses of functional classification and of the pathways related to nitrogen metabolism showed that the abundant enzymes had identical rankings in the BIOLAK and A2O metagenomes. Therefore, comparative metagenomics of BIOLAK and A2O activated sludge indicated similar function assignments from the two different biological communities.
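
    The fold differences quoted in this abstract can be rechecked directly from the percentages it reports. The following is a minimal sketch of that arithmetic, not part of the original record; the dictionary layout and the helper name `fold_difference` are illustrative assumptions, while the numbers are those given in the abstract.

```python
# Phylum-level relative abundances (percent of each metagenome), as quoted in the abstract.
abundance = {
    "Ignavibacteriae": {"A2O": 2.0440, "BIOLAK": 0.6376},
    "Gemmatimonadetes": {"A2O": 0.1404, "BIOLAK": 2.4673},
    "Chlamydiae": {"A2O": 0.0360, "BIOLAK": 0.2192},
}


def fold_difference(phylum):
    """Return (sludge with the higher abundance, ratio of higher to lower abundance)."""
    values = abundance[phylum]
    richer = max(values, key=values.get)
    poorer = min(values, key=values.get)
    return richer, values[richer] / values[poorer]


for phylum in abundance:
    richer, ratio = fold_difference(phylum)
    print(f"{phylum}: {ratio:.2f}x more abundant in {richer}")
```

    The ratios are computed as "times as much as" (higher divided by lower), which is the reading under which the abstract's figures of 3.2, >17 and >6 hold.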

  17. New Extremophilic Lipases and Esterases from Metagenomics

    PubMed Central

    López-López, Olalla; Cerdán, Maria E; González Siso, Maria I

    2014-01-01

    Lipolytic enzymes catalyze the hydrolysis of ester bonds in the presence of water. In media with low water content or in organic solvents, they can catalyze synthetic reactions such as esterification and transesterification. Lipases and esterases, in particular those from extremophilic origin, are robust enzymes, functional under the harsh conditions of industrial processes owing to their inherent thermostability and resistance towards organic solvents, which combined with their high chemo-, regio- and enantioselectivity make them very attractive biocatalysts for a variety of industrial applications. Likewise, enzymes from extremophile sources can provide additional features such as activity at extreme temperatures, extreme pH values or high salinity levels, which could be interesting for certain purposes. New lipases and esterases have traditionally been discovered by the isolation of microbial strains producing lipolytic activity. The Genome Projects Era allowed genome mining, exploiting homology with known lipases and esterases, to be used in the search for new enzymes. The Metagenomic Era meant a step forward in this field with the study of the metagenome, the pool of genomes in an environmental microbial community. Current molecular biology techniques make it possible to construct total environmental DNA libraries, including the genomes of unculturable organisms, opening a new window to a vast field of unknown enzymes with new and unique properties. Here, we review the latest advances and findings from research into new extremophilic lipases and esterases, using metagenomic approaches, and their potential industrial and biotechnological applications. PMID:24588890

  18. Generating viral metagenomes from the coral holobiont

    PubMed Central

    Wood-Charlson, Elisha M.; Suttle, Curtis A.; van Oppen, Madeleine J. H.

    2014-01-01

    Reef-building corals comprise multipartite symbioses where the cnidarian animal is host to an array of eukaryotic and prokaryotic organisms, and the viruses that infect them. These viruses are critical elements of the coral holobiont, serving not only as agents of mortality, but also as potential vectors for lateral gene flow, and as elements encoding a variety of auxiliary metabolic functions. Consequently, understanding the functioning and health of the coral holobiont requires detailed knowledge of the associated viral assemblage and its function. Currently, the most tractable way of uncovering viral diversity and function is through metagenomic approaches, which are inherently difficult in corals because of the complex holobiont community, an extracellular mucus layer that all corals secrete, and the variety of sizes and structures of nucleic acids found in viruses. Here we present the first protocol for isolating, purifying and amplifying viral nucleic acids from corals based on mechanical disruption of cells. This method produces at least 50% higher yields of viral nucleic acids, has very low levels of cellular sequence contamination and captures wider viral diversity than previously used chemical-based extraction methods. We demonstrate that our mechanical-based method profiles a greater diversity of DNA and RNA genomes, including virus groups such as Retro-transcribing and ssRNA viruses, which are absent from metagenomes generated via chemical-based methods. In addition, we briefly present (and make publicly available) the first paired DNA and RNA viral metagenomes from the coral Acropora tenuis. PMID:24847321

  19. A retrospective metagenomics approach to studying Blastocystis.

    PubMed

    Andersen, Lee O'Brien; Bonde, Ida; Nielsen, Henrik Bjørn; Stensvold, Christen Rune

    2015-07-01

    Blastocystis is a common single-celled intestinal parasitic genus, comprising several subtypes. Here, we screened data obtained by metagenomic analysis of faecal DNA for Blastocystis by searching for subtype-specific genes in coabundance gene groups, which are groups of genes that covary across a selection of 316 human faecal samples, hence representing genes originating from a single subtype. The 316 faecal samples were from 236 healthy individuals, 13 patients with Crohn's disease (CD) and 67 patients with ulcerative colitis (UC). The prevalence of Blastocystis was 20.3% in the healthy individuals and 14.9% in patients with UC. Meanwhile, Blastocystis was absent in patients with CD. Individuals with intestinal microbiota dominated by Bacteroides were much less prone to having Blastocystis-positive stool (Matthews correlation coefficient = -0.25, P < 0.0001) than individuals with Ruminococcus- and Prevotella-driven enterotypes. This is the first study to investigate the relationship between Blastocystis and communities of gut bacteria using a metagenomics approach. The study serves as an example of how it is possible to retrospectively investigate microbial eukaryotic communities in the gut using metagenomic datasets targeting the bacterial component of the intestinal microbiome and the interplay between these microbial communities.
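
    The association statistic reported here, the Matthews correlation coefficient, is computed from a 2x2 contingency table. The following is a minimal sketch of the standard formula, not the authors' analysis code; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient for a 2x2 table.

    Returns 0.0 when any row or column total is zero (the coefficient is
    undefined there; 0.0 is a common convention, assumed here).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, NOT from the study. Here "positive prediction" means
# a Bacteroides-dominated microbiota and "positive outcome" means
# Blastocystis-positive stool.
tp, fp, fn, tn = 5, 95, 50, 150
print(round(matthews_corrcoef(tp, fp, fn, tn), 3))
```

    A negative value, as in the abstract, indicates that the two binary traits tend not to co-occur; the coefficient ranges from -1 to 1.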

  20. Selectable fragmentation warhead

    SciTech Connect

    Bryan, C.S.; Paisley, D.L.; Montoya, N.I.; Stahl, D.B.

    1992-12-31

    This report discusses a selectable fragmentation warhead which is capable of producing a predetermined number of fragments from a metal plate, and accelerating the fragments toward a target. A first explosive located adjacent to the plate is detonated at a selected number of points by laser-driven slapper detonators. In one embodiment, a smoother-disk and a second explosive, located adjacent to the first explosive, serve to increase acceleration of the fragments toward a target. The ability to produce a selected number of fragments allows for effective destruction of a chosen target.

  1. Selectable fragmentation warhead

    DOEpatents

    Bryan, Courtney S.; Paisley, Dennis L.; Montoya, Nelson I.; Stahl, David B.

    1993-01-01

    A selectable fragmentation warhead capable of producing a predetermined number of fragments from a metal plate, and accelerating the fragments toward a target. A first explosive located adjacent to the plate is detonated at a selected number of points by laser-driven slapper detonators. In one embodiment, a smoother-disk and a second explosive, located adjacent to the first explosive, serve to increase acceleration of the fragments toward a target. The ability to produce a selected number of fragments allows for effective destruction of a chosen target.

  2. [A review on the bioinformatics pipelines for metagenomic research].

    PubMed

    Ye, Dan-Dan; Fan, Meng-Meng; Guan, Qiong; Chen, Hong-Ju; Ma, Zhan-Shan

    2012-12-01

    Metagenome, a term first coined by Handelsman in 1998 as "the genomes of the total microbiota found in nature", refers to sequence data directly sampled from the environment (which may be any habitat in which microbes live, such as the guts of humans and animals, milk, soil, lakes, glaciers, and oceans). Metagenomic technologies originated from environmental microbiology studies and their wide application has been greatly facilitated by next-generation high throughput sequencing technologies. As in genomics studies, the bottleneck of metagenomic research is how to effectively and efficiently analyze the gigantic amount of metagenomic sequence data using bioinformatics pipelines to obtain meaningful biological insights. In this article, we briefly review the state-of-the-art bioinformatics software tools in metagenomic research. Due to the differences between the metagenomic data obtained from whole genome sequencing (i.e., shotgun metagenomics) and amplicon sequencing (i.e., 16S-rRNA and gene-targeted metagenomics) methods, there are significant differences between the corresponding bioinformatics tools for these data; accordingly, we review the computational pipelines separately for these two types of data. PMID:23266976

  3. Functional assignment of metagenomic data: challenges and applications

    PubMed Central

    Prakash, Tulika

    2012-01-01

    Metagenomic sequencing provides a unique opportunity to explore earth’s limitless environments harboring scores of yet unknown and mostly unculturable microbes and other organisms. Functional analysis of the metagenomic data plays a central role in projects aiming to explore the most essential questions in microbiology, namely ‘In a given environment, among the microbes present, what are they doing, and how are they doing it?’ Toward this goal, several large-scale metagenomic projects have recently been conducted or are currently underway. Functional analysis of metagenomic data mainly suffers from the vast amount of data generated in these projects. The sheer amount of data requires much computational time and storage space. These problems are compounded by other factors potentially affecting the functional analysis, including sample preparation, sequencing method and average genome size of the metagenomic samples. In addition, the read-lengths generated during sequencing influence sequence assembly, gene prediction and subsequently the functional analysis. The level of confidence for functional predictions increases with increasing read-length. Usually, the most reliable functional annotations for metagenomic sequences are achieved using homology-based approaches against publicly available reference sequence databases. Here, we present an overview of the current state of functional analysis of metagenomic sequence data, bottlenecks frequently encountered and possible solutions in light of currently available resources and tools. Finally, we provide some examples of applications from recent metagenomic studies which have been successfully conducted in spite of the known difficulties. PMID:22772835

  4. Which Microbial Communities Are Present? Sequence-Based Metagenomics

    NASA Astrophysics Data System (ADS)

    Caffrey, Sean M.

    The use of metagenomic methods that directly sequence environmental samples has revealed the extraordinary microbial diversity missed by traditional culture-based methodologies. Therefore, to develop a complete and representative model of an environment's microbial community and activities, metagenomic analysis is an essential tool.

  5. Universality of fragment shapes

    PubMed Central

    Domokos, Gábor; Kun, Ferenc; Sipos, András Árpád; Szabó, Tímea

    2015-01-01

    The shape of fragments generated by the breakup of solids is central to a wide variety of problems ranging from the geomorphic evolution of boulders to the accumulation of space debris orbiting Earth. Although the statistics of the mass of fragments has been found to show a universal scaling behavior, the comprehensive characterization of fragment shapes has remained a fundamental challenge. We performed a thorough experimental study of the problem, fragmenting various types of materials by slowly proceeding weathering and by rapid breakup due to explosion and hammering. We demonstrate that the shape of fragments obeys an astonishing universality, having the same generic evolution with the fragment size irrespective of material details and loading conditions. There exists a cutoff size below which fragments have an isotropic shape; however, as the size increases an exponential convergence is obtained to a unique elongated form. We show that a discrete stochastic model of fragmentation reproduces both the size and shape of fragments by tuning only a single parameter, which strengthens the general validity of the scaling laws. The dependence of the probability of the crack plane orientation on the linear extension of fragments proved to be essential for the shape selection mechanism. PMID:25772300

  6. Universality of fragment shapes.

    PubMed

    Domokos, Gábor; Kun, Ferenc; Sipos, András Árpád; Szabó, Tímea

    2015-01-01

    The shape of fragments generated by the breakup of solids is central to a wide variety of problems ranging from the geomorphic evolution of boulders to the accumulation of space debris orbiting Earth. Although the statistics of the mass of fragments has been found to show a universal scaling behavior, the comprehensive characterization of fragment shapes has remained a fundamental challenge. We performed a thorough experimental study of the problem, fragmenting various types of materials by slowly proceeding weathering and by rapid breakup due to explosion and hammering. We demonstrate that the shape of fragments obeys an astonishing universality, having the same generic evolution with the fragment size irrespective of material details and loading conditions. There exists a cutoff size below which fragments have an isotropic shape; however, as the size increases an exponential convergence is obtained to a unique elongated form. We show that a discrete stochastic model of fragmentation reproduces both the size and shape of fragments by tuning only a single parameter, which strengthens the general validity of the scaling laws. The dependence of the probability of the crack plane orientation on the linear extension of fragments proved to be essential for the shape selection mechanism. PMID:25772300

  7. Metagenomics: Retrospect and Prospects in High Throughput Age

    PubMed Central

    Kumar, Satish; Krishnani, Kishore Kumar; Bhushan, Bharat; Brahmane, Manoj Pandit

    2015-01-01

    In recent years, metagenomics has emerged as a powerful tool for mining of hidden microbial treasure in a culture independent manner. In the last two decades, metagenomics has been applied extensively to exploit concealed potential of microbial communities from almost all sorts of habitats. A brief historic progress made over the period is discussed in terms of origin of metagenomics to its current state and also the discovery of novel biological functions of commercial importance from metagenomes of diverse habitats. The present review also highlights the paradigm shift of metagenomics from basic study of community composition to insight into the microbial community dynamics for harnessing the full potential of uncultured microbes with more emphasis on the implication of breakthrough developments, namely, Next Generation Sequencing, advanced bioinformatics tools, and systems biology. PMID:26664751

  8. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond

    PubMed Central

    Hiraoka, Satoshi; Yang, Ching-chia; Iwasaki, Wataru

    2016-01-01

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives. PMID:27383682

  9. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond.

    PubMed

    Hiraoka, Satoshi; Yang, Ching-Chia; Iwasaki, Wataru

    2016-09-29

    Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives. PMID:27383682

  10. Comparative Viral Metagenomics of Environmental Samples from Korea

    PubMed Central

    Kim, Min-Soo; Whon, Tae Woong

    2013-01-01

    The introduction of metagenomics into the field of virology has facilitated the exploration of viral communities in various natural habitats. Understanding the viral ecology of a variety of sample types throughout the biosphere is important per se, but it also has potential applications in clinical and diagnostic virology. However, the procedures used by viral metagenomics may produce technical errors, such as amplification bias, while public viral databases are very limited, which may hamper the determination of the viral diversity in samples. This review considers the current state of viral metagenomics, based on examples from Korean viral metagenomic studies, i.e., those of rice paddy soil, fermented foods, human gut, seawater, and the near-surface atmosphere. Viral metagenomics has become widespread due to various methodological developments, and much attention has been focused on studies that consider the intrinsic role of viruses that interact with their hosts. PMID:24124407

  11. Symptomatic atherosclerosis is associated with an altered gut metagenome

    PubMed Central

    Karlsson, Fredrik H.; Fåk, Frida; Nookaew, Intawat; Tremaroli, Valentina; Fagerberg, Björn; Petranovic, Dina; Bäckhed, Fredrik; Nielsen, Jens

    2012-01-01

    Recent findings have implicated the gut microbiota as a contributor to metabolic diseases through the modulation of host metabolism and inflammation. Atherosclerosis is associated with lipid accumulation and inflammation in the arterial wall, and bacteria have been suggested as a causative agent of this disease. Here we use shotgun sequencing of the gut metagenome to demonstrate that the genus Collinsella was enriched in patients with symptomatic atherosclerosis, defined as stenotic atherosclerotic plaques in the carotid artery leading to cerebrovascular events, whereas Roseburia and Eubacterium were enriched in healthy controls. Further characterization of the functional capacity of the metagenomes revealed that patient gut metagenomes were enriched in genes encoding peptidoglycan synthesis and depleted in phytoene dehydrogenase; patients also had reduced serum levels of β-carotene. Our findings suggest that the gut metagenome is associated with the inflammatory status of the host and that patients with symptomatic atherosclerosis harbor characteristic changes in the gut metagenome. PMID:23212374

  12. Exploiting HPC Platforms for Metagenomics: Challenges and Opportunities (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Canon, Shane [LBNL]

    2016-07-12

    DOE JGI's Zhong Wang, chair of the High-performance Computing session, gives a brief introduction before Berkeley Lab's Shane Canon talks about "Exploiting HPC Platforms for Metagenomics: Challenges and Opportunities" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  13. Fragmentation properties of metals

    SciTech Connect

    Grady, D.E.; Kipp, M.E.

    1996-06-01

    In the present study we are developing an experimental fracture material property test method specific to dynamic fragmentation. Spherical test samples of the metals of interest are subjected to controlled impulsive stress loads by acceleration to high velocities with a light-gas launcher facility and subsequent normal impact on thin plates. Motion, deformation and fragmentation of the test samples are diagnosed with multiple flash radiography methods. The impact plate materials are selected to be transparent to the x-ray method so that only test metal material is imaged. Through a systematic series of such tests, both strain-to-failure and fragmentation resistance properties are determined through this experimental method. Fragmentation property data for several steels, copper, aluminum, tantalum and titanium have been obtained to date. Aspects of the dynamic data have been analyzed with computational methods to achieve a better understanding of the processes leading to failure and fragmentation, and to test an existing computational fragmentation model.

  14. Hubble Classification

    NASA Astrophysics Data System (ADS)

    Murdin, P.

    2000-11-01

    A classification scheme for galaxies, devised in its original form in 1925 by Edwin P Hubble (1889-1953), and still widely used today. The Hubble classification recognizes four principal types of galaxy—elliptical, spiral, barred spiral and irregular—and arranges these in a sequence that is called the tuning-fork diagram....

  15. [From molecular to genomic and metagenomic epidemiology].

    PubMed

    Zhebrun, A B

    2014-01-01

    The notion "molecular epidemiology" was introduced into scientific literature by Kilburn E. et al. in 1973. The first period of development of infectious diseases molecular epidemiology may be called "genotypic" (1980-1990s). During this period a methodology of molecular marking of pathogens for purposes of monitoring their spread and detecting outbreaks (novel nomenclature of diphtheria corynebacteria based on ribotyping; international network PulseNet for monitoring food source infections; international database of tuberculosis mycobacteria spoligotypes) was created. The second--"genomic" period started in the 2000s. Molecular epidemiology rapidly went from single markers (genotypes or single genes) to deciphering the whole genome of pathogens. "Mobileome", "resistome", "virulome" etc. took an important place in the studies of emerging and pandemic infections. Knowledge of the genetic mechanisms leading to emergence and global dissemination of novel pathogens gives molecular epidemiology its own scientific content and transforms it from a methodical approach to an independent field of epidemiology. The third--"metagenomic" period starts nowadays, based on the meta-genomic approach that allows determination of the whole set of genomes in the studied sample without the cultivation procedure. In the short term this would lead to a change of a century-long paradigm of diagnostics and control of infections: instead of a search for separate (key) pathogens--characterization of the full spectrum of microorganisms in the material from patients and environmental samples, with identification to any taxonomic depth. In the systems of regional and global epidemiologic control, a universal monitoring of all known and re-emerging pathogens with construction and maintenance of metagenomic passports of human habitats will be realized. PMID:25286516

  17. Metagenomic approaches to identifying infectious agents.

    PubMed

    Höper, D; Mettenleiter, T C; Beer, M

    2016-04-01

    Since the advent of next-generation sequencing (NGS) technologies, the untargeted screening of samples from outbreaks for pathogen identification using metagenomics has become technically and economically feasible. However, various aspects need to be considered in order to exploit the full potential of NGS for virus discovery. Here, the authors summarise those aspects of the main steps that have a significant impact, from sample selection through sample handling and processing, as well as sequencing and finally data analysis, with a special emphasis on existing pitfalls.

  18. THE ROLE OF WATERSHED CLASSIFICATION IN DIAGNOSING CAUSES OF BIOLOGICAL IMPAIRMENT

    EPA Science Inventory

    We compared classification schemes based on watershed storage (wetland + lake area/watershed area) and forest fragmentation with a geographically based classification scheme for two case studies involving 1) Lake Superior tributaries and 2) watersheds of riverine coastal wetlands ...

  19. Machine Learning Meta-analysis of Large Metagenomic Datasets: Tools and Biological Insights.

    PubMed

    Pasolli, Edoardo; Truong, Duy Tin; Malik, Faizan; Waldron, Levi; Segata, Nicola

    2016-07-01

    Shotgun metagenomic analysis of the human associated microbiome provides a rich set of microbial features for prediction and biomarker discovery in the context of human diseases and health conditions. However, the use of such high-resolution microbial features presents new challenges, and validated computational tools for learning tasks are lacking. Moreover, classification rules have scarcely been validated in independent studies, posing questions about the generality and generalization of disease-predictive models across cohorts. In this paper, we comprehensively assess approaches to metagenomics-based prediction tasks and for quantitative assessment of the strength of potential microbiome-phenotype associations. We develop a computational framework for prediction tasks using quantitative microbiome profiles, including species-level relative abundances and presence of strain-specific markers. A comprehensive meta-analysis, with particular emphasis on generalization across cohorts, was performed in a collection of 2424 publicly available metagenomic samples from eight large-scale studies. Cross-validation revealed good disease-prediction capabilities, which were in general improved by feature selection and use of strain-specific markers instead of species-level taxonomic abundance. In cross-study analysis, models transferred between studies were in some cases less accurate than models tested by within-study cross-validation. Interestingly, the addition of healthy (control) samples from other studies to training sets improved disease prediction capabilities. Some microbial species (most notably Streptococcus anginosus) seem to characterize general dysbiotic states of the microbiome rather than connections with a specific disease. Our results in modelling features of the "healthy" microbiome can be considered a first step toward defining general microbial dysbiosis. The software framework, microbiome profiles, and metadata for thousands of samples are publicly
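
    The cross-study analysis described in this abstract, in which a model trained on some studies is tested on another, depends on splitting samples by study rather than at random. The following is a minimal sketch of such a leave-one-study-out split, under the assumption that each sample carries a study label; it is not the authors' software framework, and the function name and the study labels are hypothetical placeholders.

```python
from collections import defaultdict


def leave_one_study_out(study_labels):
    """Yield (held_out_study, train_indices, test_indices), one split per study."""
    by_study = defaultdict(list)
    for index, study in enumerate(study_labels):
        by_study[study].append(index)
    for held_out, test_idx in by_study.items():
        # Train on every sample that does not belong to the held-out study.
        train_idx = [i for s, idx in by_study.items() if s != held_out for i in idx]
        yield held_out, train_idx, test_idx


# Hypothetical study labels for six samples; the names are placeholders.
labels = ["study_A", "study_A", "study_B", "study_B", "study_C", "study_C"]
for held_out, train_idx, test_idx in leave_one_study_out(labels):
    print(held_out, "train:", train_idx, "test:", test_idx)
```

    Within-study cross-validation, by contrast, would partition the samples of a single study into folds, so that training and test samples share the same cohort.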

  20. Machine Learning Meta-analysis of Large Metagenomic Datasets: Tools and Biological Insights

    PubMed Central

    Pasolli, Edoardo; Truong, Duy Tin; Malik, Faizan; Waldron, Levi

    2016-01-01

    Shotgun metagenomic analysis of the human associated microbiome provides a rich set of microbial features for prediction and biomarker discovery in the context of human diseases and health conditions. However, the use of such high-resolution microbial features presents new challenges, and validated computational tools for learning tasks are lacking. Moreover, classification rules have scarcely been validated in independent studies, posing questions about the generality and generalization of disease-predictive models across cohorts. In this paper, we comprehensively assess approaches to metagenomics-based prediction tasks and for quantitative assessment of the strength of potential microbiome-phenotype associations. We develop a computational framework for prediction tasks using quantitative microbiome profiles, including species-level relative abundances and presence of strain-specific markers. A comprehensive meta-analysis, with particular emphasis on generalization across cohorts, was performed in a collection of 2424 publicly available metagenomic samples from eight large-scale studies. Cross-validation revealed good disease-prediction capabilities, which were in general improved by feature selection and use of strain-specific markers instead of species-level taxonomic abundance. In cross-study analysis, models transferred between studies were in some cases less accurate than models tested by within-study cross-validation. Interestingly, the addition of healthy (control) samples from other studies to training sets improved disease prediction capabilities. Some microbial species (most notably Streptococcus anginosus) seem to characterize general dysbiotic states of the microbiome rather than connections with a specific disease. Our results in modelling features of the “healthy” microbiome can be considered a first step toward defining general microbial dysbiosis. The software framework, microbiome profiles, and metadata for thousands of samples are publicly

  1. Machine Learning Meta-analysis of Large Metagenomic Datasets: Tools and Biological Insights.

    PubMed

    Pasolli, Edoardo; Truong, Duy Tin; Malik, Faizan; Waldron, Levi; Segata, Nicola

    2016-07-01

    Shotgun metagenomic analysis of the human associated microbiome provides a rich set of microbial features for prediction and biomarker discovery in the context of human diseases and health conditions. However, the use of such high-resolution microbial features presents new challenges, and validated computational tools for learning tasks are lacking. Moreover, classification rules have scarcely been validated in independent studies, posing questions about the generality and generalization of disease-predictive models across cohorts. In this paper, we comprehensively assess approaches to metagenomics-based prediction tasks and for quantitative assessment of the strength of potential microbiome-phenotype associations. We develop a computational framework for prediction tasks using quantitative microbiome profiles, including species-level relative abundances and presence of strain-specific markers. A comprehensive meta-analysis, with particular emphasis on generalization across cohorts, was performed in a collection of 2424 publicly available metagenomic samples from eight large-scale studies. Cross-validation revealed good disease-prediction capabilities, which were in general improved by feature selection and use of strain-specific markers instead of species-level taxonomic abundance. In cross-study analysis, models transferred between studies were in some cases less accurate than models tested by within-study cross-validation. Interestingly, the addition of healthy (control) samples from other studies to training sets improved disease prediction capabilities. Some microbial species (most notably Streptococcus anginosus) seem to characterize general dysbiotic states of the microbiome rather than connections with a specific disease. Our results in modelling features of the "healthy" microbiome can be considered a first step toward defining general microbial dysbiosis. The software framework, microbiome profiles, and metadata for thousands of samples are publicly
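
    The cross-study analysis mentioned in this abstract, in which a model trained on some cohorts is tested on another, can be set up with a leave-one-study-out split. The following is a minimal sketch under stated assumptions, not the authors' framework: the function name and the toy study labels are hypothetical, and only the index bookkeeping is shown, with no classifier.

```python
def leave_one_study_out(study_labels):
    """Yield (held_out_study, train_indices, test_indices) for each study.

    study_labels: one study identifier per sample (hypothetical input format).
    """
    for held_out in sorted(set(study_labels)):
        train = [i for i, s in enumerate(study_labels) if s != held_out]
        test = [i for i, s in enumerate(study_labels) if s == held_out]
        yield held_out, train, test


# Toy example: five samples drawn from three studies.
splits = list(leave_one_study_out(["s1", "s1", "s2", "s3", "s2"]))
```

    Within-study cross-validation would instead split the samples of a single study, so comparing the two settings shows how much accuracy is lost when a model is transferred between cohorts.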

  2. Metagenomics of an Alkaline Hot Spring in Galicia (Spain): Microbial Diversity Analysis and Screening for Novel Lipolytic Enzymes

    PubMed Central

    López-López, Olalla; Knapik, Kamila; Cerdán, Maria-Esperanza; González-Siso, María-Isabel

    2015-01-01

    A fosmid library was constructed with the metagenomic DNA from the water of the Lobios hot spring (76°C, pH = 8.2) located in Ourense (Spain). Metagenomic sequencing of the fosmid library allowed the assembly of 9722 contigs ranging in size from 500 to 56,677 bp and spanning ~18 Mbp. 23,207 ORFs (Open Reading Frames) were predicted from the assembly. Biodiversity was explored by taxonomic classification and it revealed that bacteria were predominant, while the archaea were less abundant. The six most abundant bacterial phyla were Deinococcus-Thermus, Proteobacteria, Firmicutes, Acidobacteria, Aquificae, and Chloroflexi. Within the archaeal superkingdom, the phylum Thaumarchaeota was predominant with the dominant species “Candidatus Caldiarchaeum subterraneum.” Functional classification revealed the genes associated with one-carbon metabolism as the most abundant. Both taxonomic and functional classifications showed a mixture of different microbial metabolic patterns: aerobic and anaerobic, chemoorganotrophic and chemolithotrophic, autotrophic and heterotrophic. Remarkably, the presence of genes encoding enzymes with potential biotechnological interest, such as xylanases, galactosidases, proteases, and lipases, was also revealed in the metagenomic library. Functional screening of this library was subsequently done looking for genes encoding lipolytic enzymes. Six genes conferring lipolytic activity were identified and one was cloned and characterized. This gene was named LOB4Est and it was expressed in a yeast mesophilic host. LOB4Est codes for a novel esterase of family VIII, with sequence similarity to β-lactamases, but with unusually wide substrate specificity. When the enzyme was purified from the mesophilic host it showed a half-life of 1 h and 43 min at 50°C, and maximal activity at 40°C and pH 7.5 with p-nitrophenyl-laurate as substrate. Interestingly, the enzyme retained more than 80% of maximal activity in a broad range of pH from 6.5 to 8. PMID:26635759

  3. [Research in metagenomics and its applications in translational medicine].

    PubMed

    Jiahuan, Chen; Zheng, Sun; Xiaojun, Wang; Xiaoquan, Su; Kang, Ning

    2015-07-01

    Humans are born with microbiota, which accompany us throughout our life-span. There is an important symbiotic relationship between us and the microbial communities, thus microbial communities are of great importance to our health. All genomic information within this microbiota is referred to as "metagenomics" (also referred to as "human's second genome"). The analysis of high-throughput metagenomic data generated from biomedical experiments would provide new approaches for translational research, and it has several applications in clinics. With the help of next-generation sequencing technology and the emerging metagenomic approach (analysis of all genomic information in microbiota as a whole), we can overcome the pitfalls of the tedious traditional method of isolating and cultivating single microbial species. The metagenomic approach can also help us to analyze the whole microbial community efficiently and offer deep insights into human-microbe relationships as well as new ideas on many biomedical problems. In this review, we summarize frontiers in metagenomic research, including new concepts and methods. Then, we focus on the applications of metagenomic research in medical research and clinical practice in recent years, which clearly show the importance of metagenomic research in the field of translational medicine. PMID:26351164

  4. [Research in metagenomics and its applications in translational medicine].

    PubMed

    Jiahuan, Chen; Zheng, Sun; Xiaojun, Wang; Xiaoquan, Su; Kang, Ning

    2015-07-01

    Humans are born with microbiota, which accompany us throughout our life-span. There is an important symbiotic relationship between us and the microbial communities, thus microbial communities are of great importance to our health. All genomic information within this microbiota is referred to as "metagenomics" (also referred to as "human's second genome"). The analysis of high-throughput metagenomic data generated from biomedical experiments would provide new approaches for translational research, and it has several applications in clinics. With the help of next-generation sequencing technology and the emerging metagenomic approach (analysis of all genomic information in microbiota as a whole), we can overcome the pitfalls of the tedious traditional method of isolating and cultivating single microbial species. The metagenomic approach can also help us to analyze the whole microbial community efficiently and offer deep insights into human-microbe relationships as well as new ideas on many biomedical problems. In this review, we summarize frontiers in metagenomic research, including new concepts and methods. Then, we focus on the applications of metagenomic research in medical research and clinical practice in recent years, which clearly show the importance of metagenomic research in the field of translational medicine.

  5. Effective Analysis of NGS Metagenomic Data with Ultra-Fast Clustering Algorithms (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Li, Weizhong [San Diego Supercomputer Center]

    2016-07-12

    San Diego Supercomputer Center's Weizhong Li on "Effective Analysis of NGS Metagenomic Data with Ultra-fast Clustering Algorithms" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  6. Evaluation of the Cow Rumen Metagenome; Assembly by Single Copy Gene Analysis and Single Cell Genome Assemblies (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Sczyrba, Alex [DOE JGI]

    2016-07-12

    DOE JGI's Alex Sczyrba on "Evaluation of the Cow Rumen Metagenome" and "Assembly by Single Copy Gene Analysis and Single Cell Genome Assemblies" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  7. MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Sakakibara, Yasumbumi [Keio University]

    2016-07-12

    Keio University's Yasumbumi Sakakibara on "MetaVelvet: An Extension of Velvet Assembler to de novo Metagenome Assembly from Short Sequence Reads" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  8. Evaluation of the Cow Rumen Metagenome; Assembly by Single Copy Gene Analysis and Single Cell Genome Assemblies (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    SciTech Connect

    Sczyrba, Alex

    2011-10-13

    DOE JGI's Alex Sczyrba on "Evaluation of the Cow Rumen Metagenome" and "Assembly by Single Copy Gene Analysis and Single Cell Genome Assemblies" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  9. Effective Analysis of NGS Metagenomic Data with Ultra-Fast Clustering Algorithms (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    SciTech Connect

    Li, Weizhong

    2011-10-12

    San Diego Supercomputer Center's Weizhong Li on "Effective Analysis of NGS Metagenomic Data with Ultra-fast Clustering Algorithms" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  10. Size Does Matter: Application-driven Approaches for Soil Metagenomics

    PubMed Central

    Kakirde, Kavita S.; Parsley, Larissa C.; Liles, Mark R.

    2010-01-01

    Metagenomic analyses can provide extensive information on the structure, composition, and predicted gene functions of diverse environmental microbial assemblages. Each environment presents its own unique challenges to metagenomic investigation and requires a specifically designed approach to accommodate physicochemical and biotic factors unique to each environment that can pose technical hurdles and/or bias the metagenomic analyses. In particular, soils harbor an exceptional diversity of prokaryotes that are largely undescribed beyond the level of ribotype and are a potentially vast resource for natural product discovery. The successful application of a soil metagenomic approach depends on selecting the appropriate DNA extraction, purification, and if necessary, cloning methods for the intended downstream analyses. The most important technical considerations in a metagenomic study include obtaining a sufficient yield of high-purity DNA representing the targeted microorganisms within an environmental sample or enrichment and (if required) constructing a metagenomic library in a suitable vector and host. Size does matter in the context of the average insert size within a clone library or the sequence read length for a high-throughput sequencing approach. It is also imperative to select the appropriate metagenomic screening strategy to address the specific question(s) of interest, which should drive the selection of methods used in the earlier stages of a metagenomic project (e.g., DNA size, to clone or not to clone). Here, we present both the promising and problematic nature of soil metagenomics and discuss the factors that should be considered when selecting soil sampling, DNA extraction, purification, and cloning methods to implement based on the ultimate study objectives. PMID:21076656

  11. Identifying personal microbiomes using metagenomic codes

    PubMed Central

    Franzosa, Eric A.; Huang, Katherine; Meadow, James F.; Gevers, Dirk; Lemon, Katherine P.; Bohannan, Brendan J. M.; Huttenhower, Curtis

    2015-01-01

    Community composition within the human microbiome varies across individuals, but it remains unknown if this variation is sufficient to uniquely identify individuals within large populations or stable enough to identify them over time. We investigated this by developing a hitting set-based coding algorithm and applying it to the Human Microbiome Project population. Our approach defined body site-specific metagenomic codes: sets of microbial taxa or genes prioritized to uniquely and stably identify individuals. Codes capturing strain variation in clade-specific marker genes were able to distinguish among 100s of individuals at an initial sampling time point. In comparisons with follow-up samples collected 30–300 d later, ∼30% of individuals could still be uniquely pinpointed using metagenomic codes from a typical body site; coincidental (false positive) matches were rare. Codes based on the gut microbiome were exceptionally stable and pinpointed >80% of individuals. The failure of a code to match its owner at a later time point was largely explained by the loss of specific microbial strains (at current limits of detection) and was only weakly associated with the length of the sampling interval. In addition to highlighting patterns of temporal variation in the ecology of the human microbiome, this work demonstrates the feasibility of microbiome-based identifiability—a result with important ethical implications for microbiome study design. The datasets and code used in this work are available for download from huttenhower.sph.harvard.edu/idability. PMID:25964341
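
    The idea of a hitting set-based code can be illustrated with a small greedy heuristic: for one individual, pick features so that every other individual lacks at least one of them. This is only a sketch under stated assumptions, not the authors' algorithm (which also prioritizes features for stability over time); the function name, the presence/absence representation, and the toy taxa are hypothetical.

```python
def greedy_code(target, others):
    """Return a small subset of `target` that no set in `others` fully contains.

    target: set of features (taxa or genes) present in one individual.
    others: list of feature sets for all other individuals.
    Returns None when the target cannot be separated from someone.
    """
    remaining = list(others)  # individuals not yet excluded by the code
    code = set()
    while remaining:
        candidates = sorted(target - code)
        # Pick the feature that is absent from the most remaining individuals.
        best = max(candidates,
                   key=lambda f: sum(f not in o for o in remaining),
                   default=None)
        if best is None or all(best in o for o in remaining):
            return None  # no unused feature excludes anyone else
        code.add(best)
        remaining = [o for o in remaining if best in o]
    return code


samples = {"A": {"t1", "t2", "t3"}, "B": {"t1", "t2"}, "C": {"t2", "t3"}}
code_a = greedy_code(samples["A"], [samples["B"], samples["C"]])
code_b = greedy_code(samples["B"], [samples["A"], samples["C"]])
```

    In this toy case individual A receives the code {t1, t3}, while B has no code because all of its features are also present in A. This mirrors the abstract's point that a code fails when the strains that made it distinctive are not detected.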

  12. The cystic fibrosis lower airways microbial metagenome

    PubMed Central

    Moran Losada, Patricia; Chouvarine, Philippe; Dorda, Marie; Hedtfeld, Silke; Mielke, Samira; Schulz, Angela; Wiehlmann, Lutz

    2016-01-01

    Chronic airway infections determine most morbidity in people with cystic fibrosis (CF). Herein, we present unbiased quantitative data about the frequency and abundance of DNA viruses, archaea, bacteria, moulds and fungi in CF lower airways. Induced sputa were collected on several occasions from children, adolescents and adults with CF. Deep sputum metagenome sequencing identified, on average, approximately 10 DNA viruses or fungi and several hundred bacterial taxa. The metagenome of a CF patient was typically found to be made up of an individual signature of multiple low-abundance species overlaid by a few disease-associated pathogens, such as Pseudomonas aeruginosa and Staphylococcus aureus, as major components. The host-associated signatures ranged from inconspicuous polymicrobial communities in healthy subjects to low-complexity microbiomes dominated by the typical CF pathogens in patients with advanced lung disease. The DNA virus community in CF lungs mainly consisted of phages and occasionally of human pathogens, such as adeno- and herpesviruses. The S. aureus and P. aeruginosa populations were composed of one major and numerous minor clone types. The rare clones constitute a low copy genetic resource that could rapidly expand as a response to habitat alterations, such as antimicrobial chemotherapy or invasion of novel microbes. PMID:27730195

  13. Genovo: De Novo Assembly for Metagenomes

    NASA Astrophysics Data System (ADS)

    Laserson, Jonathan; Jojic, Vladimir; Koller, Daphne

    Next-generation sequencing technologies produce a large number of noisy reads from the DNA in a sample. Metagenomics and population sequencing aim to recover the genomic sequences of the species in the sample, which could be of high diversity. Methods geared towards single sequence reconstruction are not sensitive enough when applied in this setting. We introduce a generative probabilistic model of read generation from environmental samples and present Genovo, a novel de novo sequence assembler that discovers likely sequence reconstructions under the model. A Chinese restaurant process prior accounts for the unknown number of genomes in the sample. Inference is made by applying a series of hill-climbing steps iteratively until convergence. We compare the performance of Genovo to three other short read assembly programs across one synthetic dataset and eight metagenomic datasets created using the 454 platform, the largest of which has 311k reads. Genovo's reconstructions cover more bases and recover more genes than the other methods, and yield a higher assembly score.
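
    The Chinese restaurant process prior named in this abstract lets the number of clusters (here, genomes) grow with the data instead of being fixed in advance. The following sketch only draws assignments from the prior and does not reproduce Genovo's model or its hill-climbing inference; the function name and the concentration parameter `alpha` are illustrative assumptions.

```python
import random


def sample_crp(n_items, alpha, rng):
    """Draw cluster assignments for n_items from a Chinese restaurant process.

    Item i (0-based) joins existing cluster k with probability
    counts[k] / (i + alpha) and opens a new cluster with probability
    alpha / (i + alpha). alpha must be positive.
    """
    assignments = []
    counts = []  # counts[k] = number of items already in cluster k
    for _ in range(n_items):
        weights = counts + [alpha]  # last slot stands for a new cluster
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments, counts


assignments, counts = sample_crp(50, 1.0, random.Random(0))
```

    Larger values of `alpha` make new clusters more likely, which in the assembly setting would correspond to expecting more distinct genomes in the sample.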

  14. [Metagenomics of the intestinal microbiota: potential applications].

    PubMed

    Dusko Ehrlich, S

    2010-09-01

    A major challenge in the human metagenomics field is to identify associations between bacterial genes and human phenotypes and to modulate microbial populations in order to improve human health and wellbeing. The MetaHIT project addresses this ambitious challenge by developing and integrating a number of necessary approaches within the context of the gut microbiome. Among the first results is the establishment of a broad catalog of human gut microbial genes, which was achieved by an original application of new-generation sequencing technology. The catalog contains 3.3 million non-redundant genes, 150-fold more than the human genome equivalent, and includes a large majority of the gut metagenomic sequences determined across three continents: Europe, America and Asia. Its content corresponds to some 1000 bacterial species, which likely represent a large fraction of the species associated with the human intestinal tract. The catalog enables the development of gene profiling approaches that aim to detect associations between bacterial genes and phenotypes. These should lead to the speedy development of diagnostic and prognostic tools and open avenues to reasoned approaches to the modulation of the individual's microbiota in order to optimize health and well-being.

  15. Fizzy. Feature subset selection for metagenomics

    DOE PAGESBeta

    Ditzler, Gregory; Morrison, J. Calvin; Lan, Yemin; Rosen, Gail L.

    2015-11-04

    Background: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- and β-diversity. Feature subset selection, a sub-field of machine learning, can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can obtain the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we have used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. Results: We have developed a new Python command line tool, which is compatible with the widely adopted BIOM format, for microbial ecologists that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. Conclusions: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.

  16. Fizzy. Feature subset selection for metagenomics

    SciTech Connect

    Ditzler, Gregory; Morrison, J. Calvin; Lan, Yemin; Rosen, Gail L.

    2015-11-04

    Background: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- and β-diversity. Feature subset selection, a sub-field of machine learning, can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can obtain the operational taxonomic units (OTUs), or functional features, that have a high level of influence on the condition being studied. For example, in a previous study we have used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. Results: We have developed a new Python command line tool, which is compatible with the widely adopted BIOM format, for microbial ecologists that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tool's capabilities on publicly available datasets. Conclusions: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.
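
    Information-theoretic feature selection, as described in this abstract, scores features by how much information they carry about the phenotype. The simplest variant ranks each feature by its mutual information with the class label. The sketch below shows that univariate ranking only; it is not Fizzy's implementation or command-line interface, and the function names, the presence/absence encoding, and the toy OTU table are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def rank_features(table, labels, k):
    """Return the k feature names with the highest mutual information with labels."""
    scores = {name: mutual_information(col, labels) for name, col in table.items()}
    # Sort by descending score; break ties by name for reproducibility.
    return sorted(scores, key=lambda name: (-scores[name], name))[:k]


# Toy presence/absence table: one column of sample values per OTU.
table = {
    "otu_a": [1, 1, 0, 0],
    "otu_b": [1, 0, 1, 0],
    "otu_c": [1, 1, 1, 0],
}
labels = ["young", "young", "old", "old"]
top = rank_features(table, labels, 2)
```

    Here `otu_a` separates the two groups perfectly (1 bit) and `otu_b` carries no information, so the ranking returns `otu_a` followed by `otu_c`. Subset selection methods go further than this by also penalizing redundancy among the chosen features.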

  17. RNA Viral Metagenome of Whiteflies Leads to the Discovery and Characterization of a Whitefly-Transmitted Carlavirus in North America

    PubMed Central

    Rosario, Karyna; Capobianco, Heather; Ng, Terry Fei Fan; Breitbart, Mya; Polston, Jane E.

    2014-01-01

    Whiteflies from the Bemisia tabaci species complex have the ability to transmit a large number of plant viruses and are some of the most detrimental pests in agriculture. Although whiteflies are known to transmit both DNA and RNA viruses, most of the diversity has been recorded for the former, specifically for the Begomovirus genus. This study investigated the total diversity of DNA and RNA viruses found in whiteflies collected from a single site in Florida to evaluate if there are additional, previously undetected viral types within the B. tabaci vector. Metagenomic analysis of viral DNA extracted from the whiteflies only resulted in the detection of begomoviruses. In contrast, whiteflies contained sequences similar to RNA viruses from divergent groups, with a diversity that extends beyond currently described viruses. The metagenomic analysis of whiteflies also led to the first report of a whitefly-transmitted RNA virus similar to Cowpea mild mottle virus (CpMMV Florida) (genus Carlavirus) in North America. Further investigation resulted in the detection of CpMMV Florida in native and cultivated plants growing near the original field site of whitefly collection and determination of its experimental host range. Analysis of complete CpMMV Florida genomes recovered from whiteflies and plants suggests that the current classification criteria for carlaviruses need to be reevaluated. Overall, metagenomic analysis supports that DNA plant viruses carried by B. tabaci are dominated by begomoviruses, whereas significantly less is known about RNA viruses present in this damaging insect vector. PMID:24466220

  18. RNA viral metagenome of whiteflies leads to the discovery and characterization of a whitefly-transmitted carlavirus in North America.

    PubMed

    Rosario, Karyna; Capobianco, Heather; Ng, Terry Fei Fan; Breitbart, Mya; Polston, Jane E

    2014-01-01

    Whiteflies from the Bemisia tabaci species complex have the ability to transmit a large number of plant viruses and are some of the most detrimental pests in agriculture. Although whiteflies are known to transmit both DNA and RNA viruses, most of the diversity has been recorded for the former, specifically for the Begomovirus genus. This study investigated the total diversity of DNA and RNA viruses found in whiteflies collected from a single site in Florida to evaluate if there are additional, previously undetected viral types within the B. tabaci vector. Metagenomic analysis of viral DNA extracted from the whiteflies only resulted in the detection of begomoviruses. In contrast, whiteflies contained sequences similar to RNA viruses from divergent groups, with a diversity that extends beyond currently described viruses. The metagenomic analysis of whiteflies also led to the first report of a whitefly-transmitted RNA virus similar to Cowpea mild mottle virus (CpMMV Florida) (genus Carlavirus) in North America. Further investigation resulted in the detection of CpMMV Florida in native and cultivated plants growing near the original field site of whitefly collection and determination of its experimental host range. Analysis of complete CpMMV Florida genomes recovered from whiteflies and plants suggests that the current classification criteria for carlaviruses need to be reevaluated. Overall, metagenomic analysis supports that DNA plant viruses carried by B. tabaci are dominated by begomoviruses, whereas significantly less is known about RNA viruses present in this damaging insect vector.

  19. Metagenomic search strategies for interactions among plants and multiple microbes

    PubMed Central

    Melcher, Ulrich; Verma, Ruchi; Schneider, William L.

    2014-01-01

    Plants harbor multiple microbes. Metagenomics can facilitate understanding of the significance, for the plant, of the microbes, and of the interactions among them. However, current approaches to metagenomic analysis of plants are computationally time consuming. Efforts to speed the discovery process include improvement of computational speed, condensing the sequencing reads into smaller datasets before BLAST searches, simplifying the target database of BLAST searches, and flipping the roles of metagenomic and reference datasets. The latter is exemplified by the e-probe diagnostic nucleic acid analysis approach originally devised for improving analysis during plant quarantine. PMID:24966863

  20. Bioprospecting Potential of the Soil Metagenome: Novel Enzymes and Bioactivities

    PubMed Central

    Lee, Myung Hwan

    2013-01-01

    The microbial diversity in soil ecosystems is higher than in any other microbial ecosystem. The majority of soil microorganisms has not been characterized, because the dominant members have not been readily culturable on standard cultivation media; therefore, the soil ecosystem is a great reservoir for the discovery of novel microbial enzymes and bioactivities. The soil metagenome, the collective microbial genome, could be cloned and sequenced directly from soils to search for novel microbial resources. This review summarizes the microbial diversity in soils and the efforts to search for microbial resources from the soil metagenome, with more emphasis on the potential of bioprospecting metagenomics and recent discoveries. PMID:24124406

  1. Enhancing Metagenomics Investigations of Microbial Interactions with Biofilm Technology

    PubMed Central

    McLean, Robert J. C.; Kakirde, Kavita S.

    2013-01-01

    Investigations of microbial ecology and diversity have been greatly enhanced by the application of culture-independent techniques. One such approach, metagenomics, involves sample collections from soil, water, and other environments. Extracted nucleic acids from bulk environmental samples are sequenced and analyzed, which allows microbial interactions to be inferred on the basis of bioinformatics calculations. In most environments, microbial interactions occur predominately in surface-adherent, biofilm communities. In this review, we address metagenomics sampling and biofilm biology, and propose an experimental strategy whereby the resolving power of metagenomics can be enhanced by incorporating a biofilm-enrichment step during sample acquisition. PMID:24284397

  2. Enhancing metagenomics investigations of microbial interactions with biofilm technology.

    PubMed

    McLean, Robert J C; Kakirde, Kavita S

    2013-11-11

    Investigations of microbial ecology and diversity have been greatly enhanced by the application of culture-independent techniques. One such approach, metagenomics, involves sample collections from soil, water, and other environments. Extracted nucleic acids from bulk environmental samples are sequenced and analyzed, which allows microbial interactions to be inferred on the basis of bioinformatics calculations. In most environments, microbial interactions occur predominately in surface-adherent, biofilm communities. In this review, we address metagenomics sampling and biofilm biology, and propose an experimental strategy whereby the resolving power of metagenomics can be enhanced by incorporating a biofilm-enrichment step during sample acquisition.

  3. Competitive Metagenomic DNA Hybridization Identifies Host-Specific Microbial Genetic Markers in Cow Fecal Samples

    PubMed Central

    Shanks, Orin C.; Santo Domingo, Jorge W.; Lamendella, Regina; Kelty, Catherine A.; Graham, James E.

    2006-01-01

    Several PCR methods have recently been developed to identify fecal contamination in surface waters. In all cases, researchers have relied on one gene or one microorganism for selection of host-specific markers. Here we describe the application of a genome fragment enrichment (GFE) method to identify host-specific genetic markers from fecal microbial community DNA. As a proof of concept, bovine fecal DNA was challenged against a porcine fecal DNA background to select for bovine-specific DNA sequences. Bioinformatic analyses of 380 bovine enriched metagenomic sequences indicated a preponderance of Bacteroidales-like regions predicted to encode membrane-associated and secreted proteins. Oligonucleotide primers capable of annealing to select Bacteroidales-like bovine GFE sequences exhibited extremely high specificity (>99%) in PCR assays with total fecal DNAs from 279 different animal sources. These primers also demonstrated a broad distribution of corresponding genetic markers (81% positive) among 148 different bovine sources. These data demonstrate that direct metagenomic DNA analysis by the competitive solution hybridization approach described is an efficient method for identifying potentially useful fecal genetic markers and for characterizing differences between environmental microbial communities. PMID:16751515

  4. Metagenomic discovery of novel enzymes and biosurfactants in a slaughterhouse biofilm microbial community

    PubMed Central

    Thies, Stephan; Rausch, Sonja Christina; Kovacic, Filip; Schmidt-Thaler, Alexandra; Wilhelm, Susanne; Rosenau, Frank; Daniel, Rolf; Streit, Wolfgang; Pietruszka, Jörg; Jaeger, Karl-Erich

    2016-01-01

    DNA derived from environmental samples is a rich source of novel bioactive molecules. The choice of the habitat to be sampled predefines the properties of the biomolecules to be discovered due to the physiological adaptation of the microbial community to the prevailing environmental conditions. We have constructed a metagenomic library in Escherichia coli DH10b with environmental DNA (eDNA) isolated from the microbial community of a slaughterhouse drain biofilm consisting mainly of species from the family Flavobacteriaceae. By functional screening of this library we have identified several lipases, proteases and two clones (SA343 and SA354) with biosurfactant and hemolytic activities. Sequence analysis of the respective eDNA fragments and subsequent structure homology modelling identified genes encoding putative N-acyl amino acid synthases with a unique two-domain organisation. The produced biosurfactants were identified by NMR spectroscopy as N-acyltyrosines with N-myristoyltyrosine as the predominant species. Critical micelle concentration and reduction of surface tension were similar to those of chemically synthesised N-myristoyltyrosine. Furthermore, we showed that the newly isolated N-acyltyrosines exhibit antibiotic activity against various bacteria. This is the first report describing the successful application of functional high-throughput screening assays for the identification of biosurfactant producing clones within a metagenomic library. PMID:27271534

  6. Functional characteristics of an endophyte community colonizing rice roots as revealed by metagenomic analysis.

    PubMed

    Sessitsch, A; Hardoim, P; Döring, J; Weilharter, A; Krause, A; Woyke, T; Mitter, B; Hauberg-Lotte, L; Friedrich, F; Rahalkar, M; Hurek, T; Sarkar, A; Bodrossy, L; van Overbeek, L; Brar, D; van Elsas, J D; Reinhold-Hurek, B

    2012-01-01

    Roots are the primary site of interaction between plants and microorganisms. To meet food demands in changing climates, improved yields and stress resistance are increasingly important, stimulating efforts to identify factors that affect plant productivity. The role of bacterial endophytes that reside inside plants remains largely unexplored, because analysis of their specific functions is impeded by difficulties in cultivating most prokaryotes. Here, we present the first metagenomic approach to analyze an endophytic bacterial community resident inside roots of rice, one of the most important staple foods. Metagenome sequences were obtained from endophyte cells extracted from roots of field-grown plants. Putative functions were deduced from protein domains or similarity analyses of protein-encoding gene fragments, and allowed insights into the capacities of endophyte cells. This allowed us to predict traits and metabolic processes important for the endophytic lifestyle, suggesting that the endorhizosphere is an exclusive microhabitat requiring numerous adaptations. Prominent features included flagella, plant-polymer-degrading enzymes, protein secretion systems, iron acquisition and storage, quorum sensing, and detoxification of reactive oxygen species. Surprisingly, endophytes might be involved in the entire nitrogen cycle, as protein domains involved in N₂-fixation, denitrification, and nitrification were detected and selected genes expressed. Our data suggest a high potential of the endophyte community for plant-growth promotion, improvement of plant stress resistance, biocontrol against pathogens, and bioremediation, regardless of their culturability.

  7. Characterization of a Soil Metagenome-Derived Gene Encoding Wax Ester Synthase.

    PubMed

    Kim, Nam Hee; Park, Ji-Hye; Chung, Eunsook; So, Hyun-Ah; Lee, Myung Hwan; Kim, Jin-Cheol; Hwang, Eul Chul; Lee, Seon-Woo

    2016-02-01

    A soil metagenome contains the genomes of all microbes included in a soil sample, including those that cannot be cultured. In this study, soil metagenome libraries were searched for microbial genes exhibiting lipolytic activity and those involved in potential lipid metabolism that could yield valuable products in microorganisms. One of the subclones derived from the original fosmid clone, pELP120, was selected for further analysis. A subclone spanning a 3.3 kb DNA fragment was found to encode a lipase/esterase and contained an additional partial open reading frame encoding a wax ester synthase (WES) motif. Consequently, both pELP120 and the full length of the gene potentially encoding WES were sequenced. To determine if the wes gene encoded a functioning WES protein that produced wax esters, gas chromatography-mass spectrometry was conducted using ethyl acetate extract from an Escherichia coli strain that expressed the wes gene and was grown with hexadecanol. The ethyl acetate extract from this E. coli strain did indeed contain wax ester compounds of various carbon-chain lengths. DNA sequence analysis of the full-length gene revealed that the gene cluster may be derived from a member of Proteobacteria, whereas the clone does not contain any clear phylogenetic markers. These results suggest that the wes gene discovered in this study encodes a functional protein in E. coli and produces wax esters through a heterologous expression system.

  8. Metagenome and Metatranscriptome Analyses Using Protein Family Profiles

    PubMed Central

    Zhong, Cuncong; Yooseph, Shibu

    2016-01-01

    Analyses of metagenome data (MG) and metatranscriptome data (MT) are often challenged by a paucity of complete reference genome sequences and the uneven/low sequencing depth of the constituent organisms in the microbial community, which respectively limit the power of reference-based alignment and de novo sequence assembly. These limitations make accurate protein family classification and abundance estimation challenging, which in turn hamper downstream analyses such as abundance profiling of metabolic pathways, identification of differentially encoded/expressed genes, and de novo reconstruction of complete gene and protein sequences from the protein family of interest. The profile hidden Markov model (HMM) framework enables the construction of very useful probabilistic models for protein families that allow for accurate modeling of position specific matches, insertions, and deletions. We present a novel homology detection algorithm that integrates banded Viterbi algorithm for profile HMM parsing with an iterative simultaneous alignment and assembly computational framework. The algorithm searches a given profile HMM of a protein family against a database of fragmentary MG/MT sequencing data and simultaneously assembles complete or near-complete gene and protein sequences of the protein family. The resulting program, HMM-GRASPx, demonstrates superior performance in aligning and assembling homologs when benchmarked on both simulated marine MG and real human saliva MG datasets. On real supragingival plaque and stool MG datasets that were generated from healthy individuals, HMM-GRASPx accurately estimates the abundances of the antimicrobial resistance (AMR) gene families and enables accurate characterization of the resistome profiles of these microbial communities. For real human oral microbiome MT datasets, using the HMM-GRASPx estimated transcript abundances significantly improves detection of differentially expressed (DE) genes. Finally, HMM-GRASPx was used to
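
    The abstract names a banded Viterbi algorithm for profile HMM parsing but gives no implementation details. As a rough illustration only, the sketch below shows plain (unbanded) Viterbi decoding on a generic two-state HMM. The state names, probabilities and function name are hypothetical and are not taken from HMM-GRASPx; a banded variant would, in general terms, evaluate only a restricted band of the dynamic-programming table.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best log-probability, best state path) for an observation sequence."""
    # score[s] is the best log-probability of any path that ends in state s
    score = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    back = []  # one dict of best predecessors per step after the first
    for symbol in obs[1:]:
        new_score, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: score[p] + log_trans[p][s])
            new_score[s] = score[prev] + log_trans[prev][s] + log_emit[s][symbol]
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state
    last = max(states, key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path

def _log_table(table):
    """Convert a nested dict of probabilities to log space."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}

# Hypothetical toy model: a "conserved" state that prefers A and a
# "background" state that prefers G.
STATES = ["conserved", "background"]
LOG_START = {"conserved": math.log(0.5), "background": math.log(0.5)}
LOG_TRANS = _log_table({
    "conserved": {"conserved": 0.8, "background": 0.2},
    "background": {"conserved": 0.2, "background": 0.8},
})
LOG_EMIT = _log_table({
    "conserved": {"A": 0.9, "G": 0.1},
    "background": {"A": 0.1, "G": 0.9},
})

best_score, best_path = viterbi("AAGG", STATES, LOG_START, LOG_TRANS, LOG_EMIT)
```

    Working in log space avoids numerical underflow on long reads, which is why the probabilities are converted before decoding.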

  9. Metagenome and Metatranscriptome Analyses Using Protein Family Profiles.

    PubMed

    Zhong, Cuncong; Edlund, Anna; Yang, Youngik; McLean, Jeffrey S; Yooseph, Shibu

    2016-07-01

    Analyses of metagenome data (MG) and metatranscriptome data (MT) are often challenged by a paucity of complete reference genome sequences and the uneven/low sequencing depth of the constituent organisms in the microbial community, which respectively limit the power of reference-based alignment and de novo sequence assembly. These limitations make accurate protein family classification and abundance estimation challenging, which in turn hamper downstream analyses such as abundance profiling of metabolic pathways, identification of differentially encoded/expressed genes, and de novo reconstruction of complete gene and protein sequences from the protein family of interest. The profile hidden Markov model (HMM) framework enables the construction of very useful probabilistic models for protein families that allow for accurate modeling of position specific matches, insertions, and deletions. We present a novel homology detection algorithm that integrates banded Viterbi algorithm for profile HMM parsing with an iterative simultaneous alignment and assembly computational framework. The algorithm searches a given profile HMM of a protein family against a database of fragmentary MG/MT sequencing data and simultaneously assembles complete or near-complete gene and protein sequences of the protein family. The resulting program, HMM-GRASPx, demonstrates superior performance in aligning and assembling homologs when benchmarked on both simulated marine MG and real human saliva MG datasets. On real supragingival plaque and stool MG datasets that were generated from healthy individuals, HMM-GRASPx accurately estimates the abundances of the antimicrobial resistance (AMR) gene families and enables accurate characterization of the resistome profiles of these microbial communities. For real human oral microbiome MT datasets, using the HMM-GRASPx estimated transcript abundances significantly improves detection of differentially expressed (DE) genes. Finally, HMM-GRASPx was used to

  10. Dirichlet multinomial mixtures: generative models for microbial metagenomics.

    PubMed

    Holmes, Ian; Harris, Keith; Quince, Christopher

    2012-01-01

    We introduce Dirichlet multinomial mixtures (DMM) for the probabilistic modelling of microbial metagenomics data. This data can be represented as a frequency matrix giving the number of times each taxa is observed in each sample. The samples have different size, and the matrix is sparse, as communities are diverse and skewed to rare taxa. Most methods used previously to classify or cluster samples have ignored these features. We describe each community by a vector of taxa probabilities. These vectors are generated from one of a finite number of Dirichlet mixture components each with different hyperparameters. Observed samples are generated through multinomial sampling. The mixture components cluster communities into distinct 'metacommunities', and, hence, determine envirotypes or enterotypes, groups of communities with a similar composition. The model can also deduce the impact of a treatment and be used for classification. We wrote software for the fitting of DMM models using the 'evidence framework' (http://code.google.com/p/microbedmm/). This includes the Laplace approximation of the model evidence. We applied the DMM model to human gut microbe genera frequencies from Obese and Lean twins. From the model evidence four clusters fit this data best. Two clusters were dominated by Bacteroides and were homogenous; two had a more variable community composition. We could not find a significant impact of body mass on community structure. However, Obese twins were more likely to derive from the high variance clusters. We propose that obesity is not associated with a distinct microbiota but increases the chance that an individual derives from a disturbed enterotype. This is an example of the 'Anna Karenina principle (AKP)' applied to microbial communities: disturbed states having many more configurations than undisturbed. We verify this by showing that in a study of inflammatory bowel disease (IBD) phenotypes, ileal Crohn's disease (ICD) is associated with a more variable
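
    The generative model described in this abstract (taxa probabilities drawn from a Dirichlet component, counts drawn by multinomial sampling) has a closed-form likelihood once the taxa probabilities are integrated out. The following is a minimal sketch of that likelihood and of the resulting cluster-membership probabilities, assuming the hyperparameters and mixture weights are already known. The function names and toy numbers are illustrative assumptions and are not taken from the authors' microbedmm software, which fits these parameters.

```python
import math

def dm_log_likelihood(counts, alpha):
    """Log-probability of one sample's taxon counts under a Dirichlet-multinomial."""
    n = sum(counts)
    a0 = sum(alpha)
    # Multinomial coefficient: n! / prod(x_i!)
    log_p = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Terms left after integrating the taxa probabilities out of the Dirichlet prior
    log_p += math.lgamma(a0) - math.lgamma(n + a0)
    log_p += sum(math.lgamma(x + a) - math.lgamma(a) for x, a in zip(counts, alpha))
    return log_p

def responsibilities(counts, weights, alphas):
    """Posterior probability that a sample came from each mixture component."""
    logs = [math.log(w) + dm_log_likelihood(counts, a)
            for w, a in zip(weights, alphas)]
    top = max(logs)  # subtract the maximum before exponentiating for stability
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Toy usage: two taxa, two hypothetical metacommunities with opposite preferences.
example = responsibilities([9, 1], [0.5, 0.5], [[5.0, 1.0], [1.0, 5.0]])
```

    Small hyperparameter sums give high-variance components, which is the sense in which a mixture component can represent a more variable community type.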

  11. Dirichlet Multinomial Mixtures: Generative Models for Microbial Metagenomics

    PubMed Central

    Holmes, Ian; Harris, Keith; Quince, Christopher

    2012-01-01

    We introduce Dirichlet multinomial mixtures (DMM) for the probabilistic modelling of microbial metagenomics data. This data can be represented as a frequency matrix giving the number of times each taxa is observed in each sample. The samples have different size, and the matrix is sparse, as communities are diverse and skewed to rare taxa. Most methods used previously to classify or cluster samples have ignored these features. We describe each community by a vector of taxa probabilities. These vectors are generated from one of a finite number of Dirichlet mixture components each with different hyperparameters. Observed samples are generated through multinomial sampling. The mixture components cluster communities into distinct ‘metacommunities’, and, hence, determine envirotypes or enterotypes, groups of communities with a similar composition. The model can also deduce the impact of a treatment and be used for classification. We wrote software for the fitting of DMM models using the ‘evidence framework’ (http://code.google.com/p/microbedmm/). This includes the Laplace approximation of the model evidence. We applied the DMM model to human gut microbe genera frequencies from Obese and Lean twins. From the model evidence four clusters fit this data best. Two clusters were dominated by Bacteroides and were homogenous; two had a more variable community composition. We could not find a significant impact of body mass on community structure. However, Obese twins were more likely to derive from the high variance clusters. We propose that obesity is not associated with a distinct microbiota but increases the chance that an individual derives from a disturbed enterotype. This is an example of the ‘Anna Karenina principle (AKP)’ applied to microbial communities: disturbed states having many more configurations than undisturbed. We verify this by showing that in a study of inflammatory bowel disease (IBD) phenotypes, ileal Crohn's disease (ICD) is associated with a

  12. Characterization of the gut microbiota of Kawasaki disease patients by metagenomic analysis

    PubMed Central

    Kinumaki, Akiko; Sekizuka, Tsuyoshi; Hamada, Hiromichi; Kato, Kengo; Yamashita, Akifumi; Kuroda, Makoto

    2015-01-01

    Kawasaki disease (KD) is an acute febrile illness of early childhood. Previous reports have suggested that genetic disease susceptibility factors, together with a triggering infectious agent, could be involved in KD pathogenesis; however, the precise etiology of this disease remains unknown. Additionally, previous culture-based studies have suggested a possible role of intestinal microbiota in KD pathogenesis. In this study, we performed metagenomic analysis to comprehensively assess the longitudinal variation in the intestinal microbiota of 28 KD patients. Several notable bacterial genera were commonly extracted during the acute phase, whereas a relative increase in the number of Ruminococcus bacteria was observed during the non-acute phase of KD. The metagenomic analysis results based on bacterial species classification suggested that the number of sequencing reads with similarity to five Streptococcus spp. (S. pneumoniae, S. pseudopneumoniae, S. oralis, S. gordonii, and S. sanguinis), in addition to patient-derived Streptococcus isolates, markedly increased during the acute phase in most patients. Streptococci include a variety of pathogenic bacteria and probiotic bacteria that promote human health; therefore, this further species discrimination could comprehensively illuminate the KD-associated microbiota. The findings of this study suggest that KD-related Streptococci might be involved in the pathogenesis of this disease. PMID:26322033

  13. Metagenomic analysis of the gut microbiota of the Timber Rattlesnake, Crotalus horridus.

    PubMed

    McLaughlin, Richard William; Cochran, Philip A; Dowd, Scot E

    2015-07-01

    Snakes are capable of surviving long periods without food. In this study we characterized the microbiota of a Timber Rattlesnake (Crotalus horridus), devoid of digesta, living in the wild. Pyrosequencing-based metagenomics were used to analyze phylogenetic and metabolic profiles with the aid of the MG-RAST server. Pyrosequencing of samples taken from the stomach, small intestine and colon yielded 691696, 957756 and 700419 high quality sequence reads. Taxonomic analysis of metagenomic reads indicated Eukarya was the most predominant domain, followed by bacteria and then viruses, for all three tissues. The most predominant phylum in the domain Bacteria was Proteobacteria for the tissues examined. Functional classifications by the subsystem database showed cluster-based subsystems were most predominant (10-15 %). Almost equally predominant (10-13 %) was carbohydrate metabolism. To identify bacteria in the colon at a finer taxonomic resolution, a 16S rRNA gene clone library was created. Proteobacteria was again found to be the most predominant phylum. The present study provides a baseline for understanding the microbial ecology of snakes living in the wild. PMID:25663091

  15. Classifying the uncultivated microbial majority: A place for metagenomic data in the Candidatus proposal.

    PubMed

    Konstantinidis, Konstantinos T; Rosselló-Móra, Ramon

    2015-06-01

    Microbial taxonomists have generally been reluctant to accept the valid publication of names of uncultured taxa given that only pure cultures allow for a thorough description of the genealogy, genetics and phenotype of the putative taxa to be classified. The classification of conspicuous uncultured organisms has been considered into the Candidatus provisional status, but this is only possible with organisms for which it is possible to retrieve basic data on phylogeny, morphology, ecology and some metabolic traits that unequivocally identify them. The current developments on modern sequencing techniques, and especially metagenomics, allow the recognition of discrete populations of DNA sequences in environmental samples, which can be considered to belong to individual closely related populations that may be identified as members of yet-to-be described species. The recognition of such populations of (meta)genomes allow the retrieval of valuable taxonomic information, i.e. genealogy, genome, phenotypic coherence with other populations, and ecological relevant traits. Such traits may be included in the Candidatus proposals of environmentally occurring, yet uncultured species not exhibiting exceptional morphologies, phenotypes or ecological relevancies. PMID:25681255

  16. Single Cell and Metagenomic Assemblies: Biology Drives Technical Choices and Goals (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Stepanauskas, Ramunas [Bigelow Laboratory]

    2016-07-12

    DOE JGI's Tanja Woyke, chair of the Single Cells and Metagenomes session, delivers an introduction, followed by Bigelow Laboratory's Ramunas Stepanauskas on "Single Cell and Metagenomic Assemblies: Biology Drives Technical Choices and Goals" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  17. Single Cell and Metagenomic Assemblies: Biology Drives Technical Choices and Goals (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    SciTech Connect

    Stepanauskas, Ramunas

    2011-10-13

    DOE JGI's Tanja Woyke, chair of the Single Cells and Metagenomes session, delivers an introduction, followed by Bigelow Laboratory's Ramunas Stepanauskas on "Single Cell and Metagenomic Assemblies: Biology Drives Technical Choices and Goals" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  18. Selectable fragmentation warhead

    SciTech Connect

    Bryan, C.S.; Paisley, D.L.; Montoya, N.I.; Stahl, D.B.

    1993-07-20

    A selectable fragmentation warhead is described comprising: a case having proximal and distal ends; a fragmenting plate mounted in said distal end of said casing; first explosive means cast adjacent to said fragmenting plate for creating a predetermined number of fragments from said fragmenting plate; three or more first laser-driven slapper detonators located adjacent to said first explosive means for detonating said first explosive means in a predetermined pattern; smoother-disk means located adjacent to said first means for accelerating said fragments; second explosive means cast adjacent to said smoother-disk means for further accelerating said fragments; at least one laser-driven slapper detonators located in said second explosive means; a laser located in said proximal end of said casing; optical fibers connecting said laser to said first and second laser-driven slapper detonators; and optical switch means located in series with said optical fibers connected to said plurality of first laser-driven slapper detonators for blocking or passing light from said laser to said plurality of first laser-driven slapper detonators.

  19. Opaque rock fragments

    SciTech Connect

    Abhijit, B.; Molinaroli, E.; Olsen, J.

    1987-05-01

    The authors describe a new, rare, but petrogenetically significant variety of rock fragments from Holocene detrital sediments. Approximately 50% of the opaque heavy mineral concentrates from Holocene siliciclastic sands are polymineralic Fe-Ti oxide particles, i.e., they are opaque rock fragments. About 40% to 70% of these rock fragments show intergrowth of hm + il, mt + il, and mt + hm ± il. Modal analysis of 23,282 opaque particles in 117 polished thin sections of granitic and metamorphic parent rocks and their daughter sands from semi-arid and humid climates show the following relative abundances. The data show that opaque rock fragments are more common in sands from igneous source rocks and that hm + il fragments are more durable. They assume that equilibrium conditions existed in parent rocks during the growth of these paired minerals, and that the Ti/Fe ratio did not change during oxidation of mt to hm. Geothermometric determinations using electron probe microanalysis of opaque rock fragments in sand samples from Lake Erie and the Adriatic Sea suggest that these rock fragments may have equilibrated at approximately 900° and 525°C, respectively.

  20. Auroral fragmentation into patches

    NASA Astrophysics Data System (ADS)

    Shiokawa, Kazuo; Hashimoto, Ayumi; Hori, Tomoaki; Sakaguchi, Kaori; Ogawa, Yasunobu; Donovan, Eric; Spanswick, Emma; Connors, Martin; Otsuka, Yuichi; Oyama, Shin-Ichiro; Nozawa, Satonori; McWilliams, Kathryn

    2014-10-01

    Auroral patches in diffuse auroras are very common features in the postmidnight local time. However, the processes that produce auroral patches are not yet well understood. In this paper we present two examples of auroral fragmentation which is the process by which uniform aurora is broken into several fragments to form auroral patches. These examples were observed at Athabasca, Canada (geomagnetic latitude: 61.7°N), and Tromsø, Norway (67.1°N). Captured in sequences of images, the auroral fragmentation occurs as finger-like structures developing latitudinally with horizontal-scale sizes of 40-100 km at ionospheric altitudes. The structures tend to develop in a north-south direction with speeds of 150-420 m/s without any shearing motion, suggesting that pressure-driven instability in the balance between the earthward magnetic-tension force and the tailward pressure gradient force in the magnetosphere is the main driving force of the auroral fragmentation. Therefore, these observations indicate that auroral fragmentation associated with pressure-driven instability is a process that creates auroral patches. The observed slow eastward drift of aurora during the auroral fragmentation suggests that fragmentation occurs in low-energy ambient plasma.

  1. Metagenomic profiling of a microbial assemblage associated with the California mussel: a node in networks of carbon and nitrogen cycling.

    PubMed

    Pfister, Catherine A; Meyer, Folker; Antonopoulos, Dionysios A

    2010-05-06

    Mussels are conspicuous and often abundant members of rocky shores and may constitute an important site for the nitrogen cycle due to their feeding and excretion activities. We used shotgun metagenomics of the microbial community associated with the surface of mussels (Mytilus californianus) on Tatoosh Island in Washington state to test whether there is a nitrogen-based microbial assemblage associated with mussels. Analyses of both tidepool mussels and those on emergent benches revealed a diverse community of Bacteria and Archaea with approximately 31 million bp from 6 mussels in each habitat. Using MG-RAST, between 22.5-25.6% were identifiable using the SEED non-redundant database for proteins. Of those fragments that were identifiable through MG-RAST, the composition was dominated by Cyanobacteria and Alpha- and Gamma-proteobacteria. Microbial composition was highly similar between the tidepool and emergent bench mussels, suggesting similar functions across these different microhabitats. One percent of the proteins identified in each sample were related to nitrogen cycling. When normalized to protein discovery rate, the high diversity and abundance of enzymes related to the nitrogen cycle in mussel-associated microbes is as great or greater than that described for other marine metagenomes. In some instances, the nitrogen-utilizing profile of this assemblage was more concordant with soil metagenomes in the Midwestern U.S. than for open ocean system. Carbon fixation and Calvin cycle enzymes further represented 0.65 and 1.26% of all proteins and their abundance was comparable to a number of open ocean marine metagenomes. In sum, the diversity and abundance of nitrogen and carbon cycle related enzymes in the microbes occupying the shells of Mytilus californianus suggest these mussels provide a node for microbial populations and thus biogeochemical processes.

  2. Metagenomics of Glassy-winged Sharpshooter, Homalodisca vitripennis (Hemiptera: Cicadellidae)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Three new insect-infecting viruses, three endosymbiotic bacteria, a fungus, and a bacterial phage were discovered using a metagenomics approach to identify unknown organisms that live in association with the sharpshooter, Homalodisca vitripennis (Hemiptera: Cicadellidae). The genetic composition of ...

  3. Application of DNA microarray for screening metagenome library clones.

    PubMed

    Park, Soo-Je; Chae, Jong-Chan; Rhee, Sung-Keun

    2010-01-01

    Sequence-based tools for screening a metagenome library can expedite metagenome research, given the tremendous diversity of metagenomes. Several critical disadvantages of activity-based screening of metagenome libraries could be overcome by sequence-based screening approaches. DNA microarray technology, widely used for monitoring environmental genes, can be employed for screening environmental fosmid and BAC clones harboring target genes because of its high-throughput nature. DNAs of fosmid clones are extracted and spotted on a glass slide, and fluorescence-labeled probes are hybridized to the microarray. Specific hybridization signals can be obtained only for the fosmid clones that contain the target gene, with high sensitivity (10 ng/μL of fosmid clone DNA) and quantitativeness. PMID:20830574

  4. Metagenomics - a guide from sampling to data analysis

    PubMed Central

    2012-01-01

    Metagenomics applies a suite of genomic technologies and bioinformatics tools to directly access the genetic content of entire communities of organisms. The field of metagenomics has been responsible for substantial advances in microbial ecology, evolution, and diversity over the past 5 to 10 years, and many research laboratories are actively engaged in it now. With the growing numbers of activities also comes a plethora of methodological knowledge and expertise that should guide future developments in the field. This review summarizes the current opinions in metagenomics, and provides practical guidance and advice on sample processing, sequencing technology, assembly, binning, annotation, experimental design, statistical analysis, data storage, and data sharing. As more metagenomic datasets are generated, the availability of standardized procedures and shared data storage and analysis becomes increasingly important to ensure that output of individual projects can be assessed and compared. PMID:22587947

  5. mmnet: An R Package for Metagenomics Systems Biology Analysis.

    PubMed

    Cao, Yang; Zheng, Xiaofei; Li, Fei; Bo, Xiaochen

    2015-01-01

    The human microbiome plays important roles in human health and disease. Previous microbiome studies focused mainly on the function of single pure species and overlooked system-level interactions in complex communities. A recently introduced metagenomic approach integrates metagenomic data with community-level metabolic network modeling, but no comprehensive tool was available for this kind of approach. To facilitate such studies, we developed an R package, mmnet, to implement community-level metabolic network reconstruction. The package also implements a set of functions for automatic analysis pipeline construction including functional annotation of metagenomic reads, abundance estimation of enzymatic genes, community-level metabolic network reconstruction, and integrated network analysis. The result can be represented in an intuitive way and sent to Cytoscape for further exploration. The package has substantial potential in metagenomic studies that focus on identifying system-level variations of the human microbiome associated with disease.

  6. Exploring Metagenomics in the Laboratory of an Introductory Biology Course

    PubMed Central

    Gibbens, Brian B.; Scott, Cheryl L.; Hoff, Courtney D.; Schottel, Janet L.

    2015-01-01

    Four laboratory modules were designed for introductory biology students to explore the field of metagenomics. Students collected microbes from environmental samples, extracted the DNA, and amplified 16S rRNA gene sequences using polymerase chain reaction (PCR). Students designed functional metagenomics screens to determine and compare antibiotic resistance profiles among the samples. Bioinformatics tools were used to generate and interpret phylogenetic trees and identify homologous genes. A pretest and posttest were used to assess learning gains, and the results indicated that these modules increased student performance by an average of 22%. Here we describe ways to engage students in metagenomics-related research and provide readers with ideas for how they can start developing metagenomics exercises for their own classrooms. PMID:25949755

  7. Identifying Differentially Abundant Metabolic Pathways in Metagenomic Datasets

    NASA Astrophysics Data System (ADS)

    Liu, Bo; Pop, Mihai

    Enabled by rapid advances in sequencing technology, metagenomic studies aim to characterize entire communities of microbes, bypassing the need for culturing individual bacterial members. One major goal of such studies is to identify specific functional adaptations of microbial communities to their habitats. Here we describe a powerful analytical method (MetaPath) that can identify differentially abundant pathways in metagenomic datasets, relying on a combination of metagenomic sequence data and prior metabolic pathway knowledge. We show that MetaPath outperforms other common approaches when evaluated on simulated datasets. We also demonstrate the power of our methods in analyzing two publicly available metagenomic datasets: a comparison of the gut microbiome of obese and lean twins, and a comparison of the gut microbiome of infant and adult subjects. We demonstrate that the subpathways identified by our method provide valuable insights into the biological activities of the microbiome.

  8. Functional metagenomics for the investigation of antibiotic resistance.

    PubMed

    Mullany, Peter

    2014-04-01

    Antibiotic resistance is a major threat to human health and well-being. To effectively combat this problem we need to understand the range of different resistance genes that allow bacteria to resist antibiotics. To do this the whole microbiota needs to be investigated. As most bacteria cannot be cultivated in the laboratory, the reservoir of antibiotic resistance genes in the non-cultivatable majority remains relatively unexplored. Currently the only way to study antibiotic resistance in these organisms is to use metagenomic approaches. Furthermore, the only method that does not require any prior knowledge about the resistance genes is functional metagenomics, which involves expressing genes from metagenomic clones in surrogate hosts. In this review the methods and limitations of functional metagenomics to isolate new antibiotic resistance genes and the mobile genetic elements that mediate their spread are explored.

  9. Comparative Metagenomics of Freshwater Microbial Communities

    SciTech Connect

    Hemme, Chris; Deng, Ye; Tu, Qichao; Fields, Matthew; Gentry, Terry; Wu, Liyou; Tringe, Susannah; Watson, David; He, Zhili; Hazen, Terry; Tiedje, James; Rubin, Eddy; Zhou, Jizhong

    2010-05-17

    Previous analyses of a microbial metagenome from uranium and nitric-acid contaminated groundwater (FW106) showed significant environmental effects resulting from the rapid introduction of multiple contaminants. Effects include a massive loss of species and strain biodiversity, accumulation of toxin resistance genes in the metagenome and lateral transfer of toxin resistance genes between community members. To better understand these results in an ecological context, a second metagenome from a pristine groundwater system located along the same geological strike was sequenced and analyzed (FW301). It is hypothesized that FW301 approximates the ancestral FW106 community based on phylogenetic profiles and common geological parameters; however, even if this is not the case, the datasets still permit comparisons between healthy and stressed groundwater ecosystems. Complex carbohydrate metabolism has been almost entirely lost in the stressed ecosystem. In contrast, the pristine system encodes a wide diversity of complex carbohydrate metabolism systems, suggesting that carbon turnover is very rapid and less leaky in the healthy groundwater system. FW301 encodes many (~160+) carbon monoxide dehydrogenase genes while FW106 encodes none. This result suggests that the community is frequently exposed to oxygen from aerated rainwater percolating into the subsurface, with a resulting high rate of carbon metabolism and CO production. When oxygen levels fall, the CO then serves as a major carbon source for the community. FW301 appears to be capable of CO2 fixation via the reductive carboxylase (reverse TCA) cycle and possibly acetogenesis; these activities are lacking in the heterotrophic FW106 system, which relies exclusively on respiration of nitrate and/or oxygen for energy production. FW301 encodes a complete B12 biosynthesis pathway at high abundance, suggesting the use of sodium gradients for energy production in the healthy groundwater community. Overall

  10. Fragment capture device

    DOEpatents

    Payne, Lloyd R.; Cole, David L.

    2010-03-30

    A fragment capture device for use in explosive containment. The device comprises an assembly of at least two rows of bars positioned to eliminate line-of-sight trajectories between the generation point of fragments and a surrounding containment vessel or asset. The device comprises an array of at least two rows of bars, wherein each row is staggered with respect to the adjacent row, and wherein a lateral dimension of each bar and a relative position of each bar in combination provides blockage of a straight-line passage of a solid fragment through the adjacent rows of bars, wherein a generation point of the solid fragment is located within a cavity at least partially enclosed by the array of bars.

  11. Fragmentation in Biaxial Tension

    SciTech Connect

    Campbell, G H; Archbold, G C; Hurricane, O A; Miller, P L

    2006-06-13

    We have carried out an experiment that places a ductile stainless steel in a state of biaxial tension at a high rate of strain. The loading of the ductile metal spherical cap is performed by the detonation of a high explosive layer with a conforming geometry to expand the metal radially outwards. Simulations of the loading and expansion of the metal predict strain rates that compare well with experimental observations. A high percentage of the HE loaded material was recovered through a soft capture process and characterization of the recovered fragments provided high quality data, including uniform strain prior to failure and fragment size. These data were used with a modified fragmentation model to determine a fragmentation energy.

  12. Metagenomic Sequencing of an In Vitro-Simulated Microbial Community

    SciTech Connect

    Morgan, Jenna L.; Darling, Aaron E.; Eisen, Jonathan A.

    2009-12-01

    Background: Microbial life dominates the earth, but many species are difficult or even impossible to study under laboratory conditions. Sequencing DNA directly from the environment, a technique commonly referred to as metagenomics, is an important tool for cataloging microbial life. This culture-independent approach involves collecting samples that include microbes in them, extracting DNA from the samples, and sequencing the DNA. A sample may contain many different microorganisms, macroorganisms, and even free-floating environmental DNA. A fundamental challenge in metagenomics has been estimating the abundance of organisms in a sample based on the frequency with which the organism's DNA was observed in reads generated via DNA sequencing. Methodology/Principal Findings: We created mixtures of ten microbial species for which genome sequences are known. Each mixture contained an equal number of cells of each species. We then extracted DNA from the mixtures, sequenced the DNA, and measured the frequency with which genomic regions from each organism were observed in the sequenced DNA. We found that the observed frequency of reads mapping to each organism did not reflect the equal numbers of cells that were known to be included in each mixture. The relative organism abundances varied significantly depending on the DNA extraction and sequencing protocol utilized. Conclusions/Significance: We describe a new data resource for measuring the accuracy of metagenomic binning methods, created by in vitro-simulation of a metagenomic community. Our in vitro simulation can be used to complement previous in silico benchmark studies. In constructing a synthetic community and sequencing its metagenome, we encountered several sources of observation bias that likely affect most metagenomic experiments to date and present challenges for comparative metagenomic studies. DNA preparation methods have a particularly profound effect in our study, implying that samples prepared with different

  13. Exploration of Metagenome Assemblies with an Interactive Visualization Tool

    SciTech Connect

    Cantor, Michael; Nordberg, Henrik; Smirnova, Tatyana; Andersen, Evan; Tringe, Susannah; Hess, Matthias; Dubchak, Inna

    2014-07-09

    Metagenomics, one of the fastest growing areas of modern genomic science, is the genetic profiling of the entire community of microbial organisms present in an environmental sample. Elviz is a web-based tool for the interactive exploration of metagenome assemblies. Elviz can be used with publicly available data sets from the Joint Genome Institute or with custom user-loaded assemblies. Elviz is available at genome.jgi.doe.gov/viz

  14. Metagenomics for studying unculturable microorganisms: cutting the Gordian knot

    PubMed Central

    Schloss, Patrick D; Handelsman, Jo

    2005-01-01

    More than 99% of prokaryotes in the environment cannot be cultured in the laboratory, a phenomenon that limits our understanding of microbial physiology, genetics, and community ecology. One way around this problem is metagenomics, the culture-independent cloning and analysis of microbial DNA extracted directly from an environmental sample. Recent advances in shotgun sequencing and computational methods for genome assembly have advanced the field of metagenomics to provide glimpses into the life of uncultured microorganisms. PMID:16086859

  15. Metagenomics-based drug discovery and marine microbial diversity.

    PubMed

    Li, Xiang; Qin, Ling

    2005-11-01

    As the global threat of drug-resistant pathogens continues to rise, new strategies and resources are required to accelerate and advance the drug discovery process. We believe that rapid progress in metagenomics has opened up a new era in the study of marine microbial diversity that enables direct access to the genomes of numerous uncultivable microorganisms. This review outlines recent developments and future trends in metagenomics-based drug discovery in marine microbial communities and their associated chemical prosperity.

  16. Assessment of diversity indices for the characterization of the soil prokaryotic community by metagenomic analysis

    NASA Astrophysics Data System (ADS)

    Chernov, T. I.; Tkhakakhova, A. K.; Kutovaya, O. V.

    2015-04-01

    The diversity indices used in ecology for assessing the metagenomes of soil prokaryotic communities at different phylogenetic levels were compared. The following indices were considered: the number of detected taxa and the Shannon, Menhinick, Margalef, Simpson, Chao1, and ACE indices. The diversity analysis of the prokaryotic communities in the upper horizons of a typical chernozem (Haplic Chernozem (Pachic)), a dark chestnut soil (Haplic Kastanozem (Chromic)), and an extremely arid desert soil (Endosalic Calcisol (Yermic)) was based on the analysis of 16S rRNA genes. The Menhinick, Margalef, Chao1, and ACE indices gave similar results for the classification of the communities according to their diversity levels; the Simpson index gave good results only for the high-level taxa (phyla); the best results were obtained with the Shannon index. In general, all the indices used showed a decrease in the diversity of the soil prokaryotes in the following sequence: chernozem > dark chestnut soil > extremely arid desert soil.
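
    The abstract above names the indices it compares but does not give their formulas. As a rough illustration only, the Python sketch below computes five of them from a list of per-taxon counts using common textbook definitions; these may differ from the exact variants the authors used. The function name `diversity_indices`, the Gini-Simpson form of the Simpson index, and the bias-corrected form of Chao1 are assumptions made here, and ACE is left out because it needs an abundance cutoff that the abstract does not state.

```python
import math

def diversity_indices(counts):
    """Compute several diversity indices from per-taxon counts.

    Hypothetical helper for illustration; formulas are common
    textbook forms, not necessarily those used in the study.
    """
    counts = [c for c in counts if c > 0]   # ignore absent taxa
    n = sum(counts)                         # total individuals (e.g. reads)
    s = len(counts)                         # number of detected taxa
    props = [c / n for c in counts]         # relative abundances

    shannon = -sum(p * math.log(p) for p in props)   # natural log
    simpson = 1.0 - sum(p * p for p in props)        # Gini-Simpson form
    menhinick = s / math.sqrt(n)
    margalef = (s - 1) / math.log(n) if n > 1 else 0.0

    # Chao1 (bias-corrected form) from singletons and doubletons.
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    chao1 = s + f1 * (f1 - 1) / (2 * (f2 + 1))

    return {
        "taxa": s,
        "shannon": shannon,
        "simpson": simpson,
        "menhinick": menhinick,
        "margalef": margalef,
        "chao1": chao1,
    }

if __name__ == "__main__":
    # Made-up counts, not data from the study.
    for name, value in diversity_indices([50, 30, 10, 5, 2, 2, 1]).items():
        print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```

    Menhinick and Margalef depend only on the number of taxa and the total count, while Shannon and Simpson also weigh evenness, so the indices can respond differently to the same community.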

  17. MetaProx: the database of metagenomic proximons

    PubMed Central

    Vey, Gregory; Charles, Trevor C.

    2014-01-01

    MetaProx is the database of metagenomic proximons: a searchable repository of proximon objects conceived with two specific goals. The first objective is to accelerate research involving metagenomic functional interactions by providing a database of metagenomic operon candidates. Proximons represent a special subset of directons (series of contiguous co-directional genes) where each member gene is in close proximity to its neighbours with respect to intergenic distance. As a result, proximons represent significant operon candidates where some subset of proximons is the set of true metagenomic operons. Proximons are well suited for the inference of metagenomic functional networks because predicted functional linkages do not rely on homology-dependent information that is frequently unavailable in metagenomic scenarios. The second objective is to explore representations for semistructured biological data that can offer an alternative to the traditional relational database approach. In particular, we use a serialized object implementation and advocate a Data as Data policy where the same serialized objects can be used at all levels (database, search tool and saved user file) without conversion or the use of human-readable markups. MetaProx currently includes 4 210 818 proximons consisting of 8 926 993 total member genes. Database URL: http://metaprox.uwaterloo.ca PMID:25288655
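
    The definition of a proximon given above (a run of contiguous co-directional genes in which each gene lies close to its neighbours) can be sketched in a few lines of Python. This is a minimal illustration, not MetaProx's implementation: the `Gene` record, the function `find_proximons`, the 60 bp `max_gap` threshold, and the rule that a proximon needs at least two genes are all assumptions made for the example.

```python
from typing import List, NamedTuple

class Gene(NamedTuple):
    """Hypothetical gene record on a single contig (1-based, inclusive)."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"

def find_proximons(genes: List[Gene], max_gap: int = 60) -> List[List[Gene]]:
    """Group genes into runs of same-strand neighbours whose intergenic
    distance is at most max_gap. max_gap=60 is an assumed value."""
    ordered = sorted(genes, key=lambda g: g.start)
    proximons: List[List[Gene]] = []
    current: List[Gene] = []
    for gene in ordered:
        if (current
                and gene.strand == current[-1].strand
                and gene.start - current[-1].end - 1 <= max_gap):
            current.append(gene)           # extend the current run
        else:
            if len(current) >= 2:          # keep only multi-gene runs
                proximons.append(current)
            current = [gene]               # start a new run
    if len(current) >= 2:
        proximons.append(current)
    return proximons

if __name__ == "__main__":
    # Made-up coordinates for illustration.
    contig = [
        Gene("a", 1, 100, "+"), Gene("b", 120, 300, "+"),
        Gene("c", 310, 500, "+"), Gene("d", 900, 1000, "+"),
        Gene("e", 1010, 1200, "-"), Gene("f", 1220, 1400, "-"),
    ]
    for run in find_proximons(contig):
        print([g.name for g in run])
```

    Because the grouping uses only gene coordinates and strand, it needs no homology information, which matches the motivation stated in the abstract.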

  18. Beyond the bounds of orthology: functional inference from metagenomic context.

    PubMed

    Vey, Gregory; Moreno-Hagelsieb, Gabriel

    2010-07-01

    The effectiveness of the computational inference of function by genomic context is bounded by the diversity of known microbial genomes. Although metagenomes offer access to previously inaccessible organisms, their fragmentary nature prevents the conventional establishment of orthologous relationships required for reliably predicting functional interactions. We introduce a protocol for the prediction of functional interactions using data sources without information about orthologous relationships. To illustrate this process, we use the Sargasso Sea metagenome to construct a functional interaction network for the Escherichia coli K12 genome. We identify two reliability metrics, target intergenic distance and source interaction count, and apply them to selectively filter the predictions retained to construct the network of functional interactions. The resulting network contains 2297 nodes with 10 072 edges with a positive predictive value of 0.80. The metagenome yielded 8423 functional interactions beyond those found using only the genomic orthologs as a data source. This amounted to a 134% increase in the total number of functional interactions that are predicted by combining the metagenome and the genomic orthologs versus the genomic orthologs alone. In the absence of detectable orthologous relationships it remains feasible to derive a reliable set of predicted functional interactions. This offers a strategy for harnessing other metagenomes and homologs in general. Because metagenomes allow access to previously unreachable microorganisms, this will result in expanding the universe of known functional interactions thus furthering our understanding of functional organization. PMID:20419183

  20. Uncovering oral Neisseria tropism and persistence using metagenomic sequencing.

    PubMed

    Donati, Claudio; Zolfo, Moreno; Albanese, Davide; Tin Truong, Duy; Asnicar, Francesco; Iebba, Valerio; Cavalieri, Duccio; Jousson, Olivier; De Filippo, Carlotta; Huttenhower, Curtis; Segata, Nicola

    2016-01-01

    Microbial epidemiology and population genomics have previously been carried out near-exclusively for organisms grown in vitro. Metagenomics helps to overcome this limitation, but it is still challenging to achieve strain-level characterization of microorganisms from culture-independent data with sufficient resolution for epidemiological modelling. Here, we have developed multiple complementary approaches that can be combined to profile and track individual microbial strains. To specifically profile highly recombinant neisseriae from oral metagenomes, we integrated four metagenomic analysis techniques: single nucleotide polymorphisms in the clade's core genome, DNA uptake sequence signatures, metagenomic multilocus sequence typing and strain-specific marker genes. We applied these tools to 520 oral metagenomes from the Human Microbiome Project, finding evidence of site tropism and temporal intra-subject strain retention. Although the opportunistic pathogen Neisseria meningitidis is enriched for colonization in the throat, N. flavescens and N. subflava populate the tongue dorsum, and N. sicca, N. mucosa and N. elongata the gingival plaque. The buccal mucosa appeared as an intermediate ecological niche between the plaque and the tongue. The resulting approaches to metagenomic strain profiling are generalizable and can be extended to other organisms and microbiomes across environments. PMID:27572971

  1. Developing a metagenomic view of xenobiotic metabolism

    PubMed Central

    Haiser, Henry J.; Turnbaugh, Peter J.

    2012-01-01

    The microbes residing in and on the human body influence human physiology in many ways, particularly through their impact on the metabolism of xenobiotic compounds, including therapeutic drugs, antibiotics, and diet-derived bioactive compounds. Despite the importance of these interactions and the many possibilities for intervention, microbial xenobiotic metabolism remains a largely underexplored component of pharmacology. Here, we discuss the emerging evidence for both direct and indirect effects of the human gut microbiota on xenobiotic metabolism, and the initial links that have been made between specific compounds, diverse members of this complex community, and the microbial genes responsible. Furthermore, we highlight the many parallels to the now well-established field of environmental bioremediation, and the vast potential to leverage emerging metagenomic tools to shed new light on these important microbial biotransformations. PMID:22902524

  2. EBI metagenomics in 2016--an expanding and evolving resource for the analysis and archiving of metagenomic data.

    PubMed

    Mitchell, Alex; Bucchini, Francois; Cochrane, Guy; Denise, Hubert; ten Hoopen, Petra; Fraser, Matthew; Pesseat, Sebastien; Potter, Simon; Scheremetjew, Maxim; Sterk, Peter; Finn, Robert D

    2016-01-01

    EBI metagenomics (https://www.ebi.ac.uk/metagenomics/) is a freely available hub for the analysis and archiving of metagenomic and metatranscriptomic data. Over the last 2 years, the resource has undergone rapid growth, with an increase of over five-fold in the number of processed samples and consequently represents one of the largest resources of analysed shotgun metagenomes. Here, we report the status of the resource in 2016 and give an overview of new developments. In particular, we describe updates to data content, a complete overhaul of the analysis pipeline, streamlining of data presentation via the website and the development of a new web based tool to compare functional analyses of sequence runs within a study. We also highlight two of the higher profile projects that have been analysed using the resource in the last year: the oceanographic projects Ocean Sampling Day and Tara Oceans. PMID:26582919

  3. Metagenomics, metaMicrobesOnline and Kbase Data Integration (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Dehal, Paramvir [LBNL]

    2016-07-12

    Berkeley Lab's Paramvir Dehal on "Managing and Storing large Datasets in MicrobesOnline, metaMicrobesOnline and the DOE Knowledgebase" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  4. Metagenomics, metaMicrobesOnline and Kbase Data Integration (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    SciTech Connect

    Dehal, Paramvir

    2011-10-12

    Berkeley Lab's Paramvir Dehal on "Managing and Storing large Datasets in MicrobesOnline, metaMicrobesOnline and the DOE Knowledgebase" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  5. EBI metagenomics in 2016 - an expanding and evolving resource for the analysis and archiving of metagenomic data

    PubMed Central

    Mitchell, Alex; Bucchini, Francois; Cochrane, Guy; Denise, Hubert; Hoopen, Petra ten; Fraser, Matthew; Pesseat, Sebastien; Potter, Simon; Scheremetjew, Maxim; Sterk, Peter; Finn, Robert D.

    2016-01-01

    EBI metagenomics (https://www.ebi.ac.uk/metagenomics/) is a freely available hub for the analysis and archiving of metagenomic and metatranscriptomic data. Over the last 2 years, the resource has undergone rapid growth, with an increase of over five-fold in the number of processed samples and consequently represents one of the largest resources of analysed shotgun metagenomes. Here, we report the status of the resource in 2016 and give an overview of new developments. In particular, we describe updates to data content, a complete overhaul of the analysis pipeline, streamlining of data presentation via the website and the development of a new web based tool to compare functional analyses of sequence runs within a study. We also highlight two of the higher profile projects that have been analysed using the resource in the last year: the oceanographic projects Ocean Sampling Day and Tara Oceans. PMID:26582919

  6. Introduction to Metagenomics at DOE JGI (Opening Remarks for the Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Kyrpides, Nikos [DOE JGI]

    2016-07-12

    After a quick introduction by DOE JGI Director Eddy Rubin, DOE JGI's Nikos Kyrpides delivers the opening remarks at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  8. Introduction to Metagenomics at DOE JGI (Opening Remarks for the Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    SciTech Connect

    Kyrpides, Nikos

    2011-10-12

    After a quick introduction by DOE JGI Director Eddy Rubin, DOE JGI's Nikos Kyrpides delivers the opening remarks at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  9. Phylogenetic Analysis of a Spontaneous Cocoa Bean Fermentation Metagenome Reveals New Insights into Its Bacterial and Fungal Community Diversity

    PubMed Central

    Illeghems, Koen; De Vuyst, Luc; Papalexandratou, Zoi; Weckx, Stefan

    2012-01-01

    This is the first report on the phylogenetic analysis of the community diversity of a single spontaneous cocoa bean box fermentation sample through a metagenomic approach involving 454 pyrosequencing. Several sequence-based and composition-based taxonomic profiling tools were used and evaluated to avoid software-dependent results and their outcome was validated by comparison with previously obtained culture-dependent and culture-independent data. Overall, this approach revealed a wider bacterial (mainly γ-Proteobacteria) and fungal diversity than previously found. Further, the use of a combination of different classification methods, in a software-independent way, helped to understand the actual composition of the microbial ecosystem under study. In addition, bacteriophage-related sequences were found. The bacterial diversity depended partially on the methods used, as composition-based methods predicted a wider diversity than sequence-based methods, and as classification methods based solely on phylogenetic marker genes predicted a more restricted diversity compared with methods that took all reads into account. The metagenomic sequencing analysis identified Hanseniaspora uvarum, Hanseniaspora opuntiae, Saccharomyces cerevisiae, Lactobacillus fermentum, and Acetobacter pasteurianus as the prevailing species. Also, the presence of occasional members of the cocoa bean fermentation process was revealed (such as Erwinia tasmaniensis, Lactobacillus brevis, Lactobacillus casei, Lactobacillus rhamnosus, Lactococcus lactis, Leuconostoc mesenteroides, and Oenococcus oeni). Furthermore, the sequence reads associated with viral communities were of a restricted diversity, dominated by Myoviridae and Siphoviridae, and reflecting Lactobacillus as the dominant host. To conclude, an accurate overview of all members of a cocoa bean fermentation process sample was revealed, indicating the superiority of metagenomic sequencing over previously used techniques. PMID:22666442

  10. A New Secondary Structure Assignment Algorithm Using Cα Backbone Fragments

    PubMed Central

    Cao, Chen; Wang, Guishen; Liu, An; Xu, Shutan; Wang, Lincong; Zou, Shuxue

    2016-01-01

    The assignment of secondary structure elements in proteins is a key step in the analysis of their structures and functions. We have developed an algorithm, SACF (secondary structure assignment based on Cα fragments), for secondary structure element (SSE) assignment based on the alignment of Cα backbone fragments with central poses derived by clustering known SSE fragments. The assignment algorithm consists of three steps: First, the outlier fragments on known SSEs are detected. Next, the remaining fragments are clustered to obtain the central fragments for each cluster. Finally, the central fragments are used as a template to make assignments. Following a large-scale comparison of 11 secondary structure assignment methods, SACF, KAKSI and PROSS are found to have similar agreement with DSSP, while PCASSO agrees with DSSP best. SACF and PCASSO show preference to reducing residues in N and C cap regions, whereas KAKSI, P-SEA and SEGNO tend to add residues to the terminals when DSSP assignment is taken as standard. Moreover, our algorithm is able to assign subtle helices (3₁₀-helix, π-helix and left-handed helix) and make uniform assignments, as well as to detect rare SSEs in β-sheets or long helices as outlier fragments from other programs. The structural uniformity should be useful for protein structure classification and prediction, while outlier fragments underlie the structure–function relationship. PMID:26978354

  11. Subject Classification.

    ERIC Educational Resources Information Center

    Thompson, Gayle; And Others

    Three newspaper librarians described how they manage the files of newspaper clippings which are a necessary part of their collections. The development of a new subject classification system for the clippings files was outlined. The new subject headings were based on standard subject heading lists and on local need. It was decided to use a computer…

  12. Classifying Classification

    ERIC Educational Resources Information Center

    Novakowski, Janice

    2009-01-01

    This article describes the experience of a group of first-grade teachers as they tackled the science process of classification, a targeted learning objective for the first grade. While the two-year process was not easy and required teachers to teach in a new, more investigation-oriented way, the benefits were great. The project helped teachers and…

  13. Heavy fragment radioactivities

    SciTech Connect

    Price, P.B.

    1987-12-10

    This recently discovered mode of radioactive decay, like alpha decay and spontaneous fission, is believed to involve tunneling through the deformation-energy barrier between a very heavy nucleus and two separated fragments the sum of whose masses is less than the mass of the parent nucleus. In all known cases the heavier of the two fragments is close to doubly magic ²⁰⁸Pb, and the lighter fragment has even Z. Four isotopes of Ra are known to emit ¹⁴C nuclei; several isotopes of U as well as ²³⁰Th and ²³¹Pa emit Ne nuclei; and ²³⁴U exhibits four hadronic decay modes: alpha decay, spontaneous fission, Ne decay and Mg decay.

  14. Okazaki fragment metabolism.

    PubMed

    Balakrishnan, Lata; Bambara, Robert A

    2013-02-01

    Cellular DNA replication requires efficient copying of the double-stranded chromosomal DNA. The leading strand is elongated continuously in the direction of fork opening, whereas the lagging strand is made discontinuously in the opposite direction. The lagging strand needs to be processed to form a functional DNA segment. Genetic analyses and reconstitution experiments identified proteins and multiple pathways responsible for maturation of the lagging strand. In both prokaryotes and eukaryotes the lagging-strand fragments are initiated by RNA primers, which are removed by a joining mechanism involving strand displacement of the primer into a flap, flap removal, and then ligation. Although the prokaryotic fragments are ~1200 nucleotides long, the eukaryotic fragments are much shorter, with lengths determined by nucleosome periodicity. The prokaryotic joining mechanism is simple and efficient. The eukaryotic maturation mechanism involves many enzymes, possibly three pathways, and regulation that can shift from high efficiency to high fidelity.

  15. Allogenous tooth fragment reattachment

    PubMed Central

    Maitin, Nitin; Maitin, Shipra; Rastogi, Khushboo; Bhushan, Rajarshi

    2013-01-01

    Coronal fractures of the anterior teeth are a common form of dental trauma, and its sequelae may impair the establishment and accomplishment of an adequate treatment plan. Among the various treatment options, reattachment of a crown fragment obtained from a previously extracted tooth is a conservative treatment that should be considered for crown fractures of anterior teeth. This article reports reattachment of an allogenous tooth fragment in a fractured maxillary lateral incisor in a 38-year-old patient. It is suggested that allogenous reattachment in a fractured anterior tooth serves as a better alternative and should be further researched. Aesthetic and functional rehabilitation of a complicated anterior crown fracture using an allogenous tooth fragment is a better alternative to other more conventional treatment options. PMID:23845684

  16. Going deeper: metagenome of a hadopelagic microbial community.

    PubMed

    Eloe, Emiley A; Fadrosh, Douglas W; Novotny, Mark; Zeigler Allen, Lisa; Kim, Maria; Lombardo, Mary-Jane; Yee-Greenbaum, Joyclyn; Yooseph, Shibu; Allen, Eric E; Lasken, Roger; Williamson, Shannon J; Bartlett, Douglas H

    2011-01-01

    The paucity of sequence data from pelagic deep-ocean microbial assemblages has severely restricted molecular exploration of the largest biome on Earth. In this study, an analysis is presented of a large-scale 454-pyrosequencing metagenomic dataset from a hadopelagic environment from 6,000 m depth within the Puerto Rico Trench (PRT). A total of 145 Mbp of assembled sequence data was generated and compared to two pelagic deep ocean metagenomes and two representative surface seawater datasets from the Sargasso Sea. In a number of instances, all three deep metagenomes displayed similar trends, but were most magnified in the PRT, including enrichment in functions for two-component signal transduction mechanisms and transcriptional regulation. Overrepresented transporters in the PRT metagenome included outer membrane porins, diverse cation transporters, and di- and tri-carboxylate transporters that matched well with the prevailing catabolic processes such as butanoate, glyoxylate and dicarboxylate metabolism. A surprisingly high abundance of sulfatases for the degradation of sulfated polysaccharides were also present in the PRT. The most dramatic adaptational feature of the PRT microbes appears to be heavy metal resistance, as reflected in the large numbers of transporters present for their removal. As a complement to the metagenome approach, single-cell genomic techniques were utilized to generate partial whole-genome sequence data from four uncultivated cells from members of the dominant phyla within the PRT, Alphaproteobacteria, Gammaproteobacteria, Bacteroidetes and Planctomycetes. The single-cell sequence data provided genomic context for many of the highly abundant functional attributes identified from the PRT metagenome, as well as recruiting heavily the PRT metagenomic sequence data compared to 172 available reference marine genomes. Through these multifaceted sequence approaches, new insights have been provided into the unique functional attributes present in

  18. Going Deeper: Metagenome of a Hadopelagic Microbial Community

    PubMed Central

    Eloe, Emiley A.; Fadrosh, Douglas W.; Novotny, Mark; Zeigler Allen, Lisa; Kim, Maria; Lombardo, Mary-Jane; Yee-Greenbaum, Joyclyn; Yooseph, Shibu; Allen, Eric E.; Lasken, Roger; Williamson, Shannon J.; Bartlett, Douglas H.

    2011-01-01

    The paucity of sequence data from pelagic deep-ocean microbial assemblages has severely restricted molecular exploration of the largest biome on Earth. In this study, an analysis is presented of a large-scale 454-pyrosequencing metagenomic dataset from a hadopelagic environment from 6,000 m depth within the Puerto Rico Trench (PRT). A total of 145 Mbp of assembled sequence data was generated and compared to two pelagic deep ocean metagenomes and two representative surface seawater datasets from the Sargasso Sea. In a number of instances, all three deep metagenomes displayed similar trends, but were most magnified in the PRT, including enrichment in functions for two-component signal transduction mechanisms and transcriptional regulation. Overrepresented transporters in the PRT metagenome included outer membrane porins, diverse cation transporters, and di- and tri-carboxylate transporters that matched well with the prevailing catabolic processes such as butanoate, glyoxylate and dicarboxylate metabolism. A surprisingly high abundance of sulfatases for the degradation of sulfated polysaccharides were also present in the PRT. The most dramatic adaptational feature of the PRT microbes appears to be heavy metal resistance, as reflected in the large numbers of transporters present for their removal. As a complement to the metagenome approach, single-cell genomic techniques were utilized to generate partial whole-genome sequence data from four uncultivated cells from members of the dominant phyla within the PRT, Alphaproteobacteria, Gammaproteobacteria, Bacteroidetes and Planctomycetes. The single-cell sequence data provided genomic context for many of the highly abundant functional attributes identified from the PRT metagenome, as well as recruiting heavily the PRT metagenomic sequence data compared to 172 available reference marine genomes. Through these multifaceted sequence approaches, new insights have been provided into the unique functional attributes present in

  19. Environmental Metagenomics: The Data Assembly and Data Analysis Perspectives

    NASA Astrophysics Data System (ADS)

    Kumar, Vinay; Maitra, S. S.; Shukla, Rohit Nandan

    2015-01-01

    Novel gene finding is an emerging field in environmental research. In past decades, research focused mainly on discovering microorganisms capable of degrading a particular compound. Many methods for cultivating and screening these novel microorganisms are available in the literature. All of these methods are efficient for screening microbes that can be cultivated in the laboratory. Microorganisms that live in extreme conditions such as hot springs, frozen glaciers and acid mine drainage cannot be cultivated in the laboratory because knowledge of their growth requirements, such as temperature, nutrients and their mutual dependence on each other, is incomplete. The microbes that can be cultivated correspond to less than 1% of the total microbes present on Earth. The remaining 99%, the uncultivated majority, remains inaccessible. Metagenomics transcends the culture requirements of microbes. In metagenomics, DNA is extracted directly from environmental samples such as soil, seawater and acid mine drainage, followed by construction and screening of a metagenomic library. With ongoing research, a huge amount of metagenomic data is accumulating. Understanding these data is an essential step in extracting novel genes of industrial importance. Various bioinformatics tools have been designed to analyze and annotate the data produced from the metagenome. The bioinformatic requirements of metagenomic data analysis differ in theory and practice. This paper reviews the tools available for metagenomic data analysis and the capabilities of such tools: what they can do and their web availability.

  20. FIELD TESTS OF GEOGRAPHICALLY-DEPENDENT VS. THRESHOLD-BASED WATERSHED CLASSIFICATION SCHEMES IN THE GREAT LAKES BASIN

    EPA Science Inventory

    We compared classification schemes based on watershed storage (wetland + lake area/watershed area) and forest fragmentation with a geographically-based classification scheme for two case studies involving 1) Lake Superior tributaries and 2) watersheds of riverine coastal wetlands ...

  1. FIELD TESTS OF GEOGRAPHICALLY-DEPENDENT VS. THRESHOLD-BASED WATERSHED CLASSIFICATION SCHEMES IN THE GREAT LAKES BASIN

    EPA Science Inventory

    We compared classification schemes based on watershed storage (wetland + lake area/watershed area) and forest fragmentation with a geographically-based classification scheme for two case studies involving 1) Lake Superior tributaries and 2) watersheds of riverine coastal wetlands...
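
    The watershed storage metric named in this record is a simple area ratio, so a threshold-based classification can be sketched in a few lines. The function names, the example areas and the 0.10 threshold below are hypothetical and chosen only for illustration; the record does not state the thresholds that were actually tested.

```python
def watershed_storage(wetland_area, lake_area, watershed_area):
    """Storage as defined in the record: (wetland + lake area) / watershed area."""
    if watershed_area <= 0:
        raise ValueError("watershed_area must be positive")
    return (wetland_area + lake_area) / watershed_area


# Hypothetical cut-off; the record does not give the thresholds used.
STORAGE_THRESHOLD = 0.10


def classify_by_storage(storage, threshold=STORAGE_THRESHOLD):
    """Assign a watershed to a class by comparing its storage to a threshold."""
    return "high storage" if storage >= threshold else "low storage"


# Illustrative areas in any consistent unit (for example km^2).
example = watershed_storage(wetland_area=12.0, lake_area=3.0, watershed_area=100.0)
print(example, classify_by_storage(example))
```

    All areas must share one unit, since the ratio is dimensionless.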

  2. The single-species metagenome: subtyping Staphylococcus aureus core genome sequences from shotgun metagenomic data

    PubMed Central

    Li, Ben; Petit III, Robert A.; Qin, Zhaohui S.; Darrow, Lyndsey

    2016-01-01

    In this study we developed a genome-based method for detecting Staphylococcus aureus subtypes from metagenome shotgun sequence data. We used a binomial mixture model and the coverage counts at >100,000 known S. aureus SNP (single nucleotide polymorphism) sites derived from prior comparative genomic analysis to estimate the proportion of 40 subtypes in metagenome samples. We were able to obtain >87% sensitivity and >94% specificity at 0.025X coverage for S. aureus. We found that 321 and 149 metagenome samples from the Human Microbiome Project and metaSUB analysis of the New York City subway, respectively, contained S. aureus at genome coverage >0.025. In both projects, CC8 and CC30 were the most common S. aureus clonal complexes encountered. We found evidence that the subtype compositions at different body sites of the same individual were more similar than random sampling and more limited evidence that certain body sites were enriched for particular subtypes. One surprising finding was the apparent high frequency of CC398, a lineage often associated with livestock, in samples from the tongue dorsum. Epidemiologic analysis of the HMP subject population suggested that high BMI (body mass index) and health insurance are possibly associated with S. aureus carriage, but there was limited power to identify factors linked to carriage of even the most common subtype. In the NYC subway data, we found a small signal of geographic distance affecting subtype clustering, but other unknown factors influence taxonomic distribution of the species around the city. PMID:27781166
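
    To give a feel for how subtype proportions might be estimated from allele counts at known SNP sites, the following is a much-simplified mixture-model sketch fitted by expectation-maximization. It is not the authors' implementation: the function name, the fixed sequencing error rate, the two-subtype toy data and the read-level formulation are all assumptions made for illustration.

```python
def estimate_subtype_proportions(alleles, alt_counts, depths, error=0.01, iters=200):
    """Estimate mixture proportions of subtypes by EM.

    alleles[i][j] is 1 if subtype j carries the alternate allele at SNP site i.
    alt_counts[i] is the number of reads showing the alternate allele at site i.
    depths[i] is the total number of reads covering site i.
    """
    n_subtypes = len(alleles[0])
    pi = [1.0 / n_subtypes] * n_subtypes  # start from equal proportions
    total_reads = sum(depths)
    for _ in range(iters):
        expected = [0.0] * n_subtypes
        for site, alt, depth in zip(alleles, alt_counts, depths):
            for observed, n_reads in ((1, alt), (0, depth - alt)):
                if n_reads == 0:
                    continue
                # Probability of the observed allele if the read came from subtype j.
                lik = [(1 - error) if a == observed else error for a in site]
                weights = [p * l for p, l in zip(pi, lik)]
                norm = sum(weights)
                # E-step: share the reads among subtypes by posterior responsibility.
                for j, w in enumerate(weights):
                    expected[j] += n_reads * w / norm
        # M-step: new proportions are the expected share of reads per subtype.
        pi = [e / total_reads for e in expected]
    return pi


# Toy data: two hypothetical subtypes distinguished by two SNP sites.
toy_alleles = [[1, 0], [0, 1]]
toy_pi = estimate_subtype_proportions(toy_alleles, alt_counts=[3, 1], depths=[4, 4])
print([round(p, 3) for p in toy_pi])
```

    In the toy data, six of the eight reads support the first subtype, so its estimated proportion settles near three quarters. A real analysis would use many more sites and subtypes and would need to handle sites with zero coverage.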

  3. IMPACT fragmentation model developments

    NASA Astrophysics Data System (ADS)

    Sorge, Marlon E.; Mains, Deanna L.

    2016-09-01

    The IMPACT fragmentation model has been used by The Aerospace Corporation for more than 25 years to analyze orbital altitude explosions and hypervelocity collisions. The model is semi-empirical, combining mass, energy and momentum conservation laws with empirically derived relationships for fragment characteristics such as number, mass, area-to-mass ratio, and spreading velocity as well as event energy distribution. Model results are used for several types of analysis including assessment of short-term risks to satellites from orbital altitude fragmentations, prediction of the long-term evolution of the orbital debris environment and forensic assessments of breakup events. A new version of IMPACT, version 6, has been completed and incorporates a number of advancements enabled by a multi-year long effort to characterize more than 11,000 debris fragments from more than three dozen historical on-orbit breakup events. These events involved a wide range of causes, energies, and fragmenting objects. Special focus was placed on the explosion model, as the majority of events examined were explosions. Revisions were made to the mass distribution used for explosion events, increasing the number of smaller fragments generated. The algorithm for modeling upper stage large fragment generation was updated. A momentum conserving asymmetric spreading velocity distribution algorithm was implemented to better represent sub-catastrophic events. An approach was developed for modeling sub-catastrophic explosions, those where the majority of the parent object remains intact, based on estimated event energy. Finally, significant modifications were made to the area-to-mass ratio distribution to incorporate the tendencies of different materials to fragment into different shapes. This ability enabled better matches between the observed area-to-mass ratios and those generated by the model. It also opened up additional possibilities for post-event analysis of breakups. The paper will discuss

  4. [Comparative Metagenomics of BIOLAK and A2O Activated Sludge Based on Next-generation Sequencing Technology].

    PubMed

    Tian, Mei; Liu, Han-hu; Shen, Xin

    2016-02-15

    This is the first report of comparative metagenomic analyses of BIOLAK sludge and anaerobic/anoxic/oxic (A2O) sludge. In the BIOLAK and A2O sludge metagenomes, 47 and 51 phyla were identified respectively, more than the numbers of phyla identified in Australia EBPR (enhanced biological phosphorus removal), USA EBPR and Bibby sludge. All phyla found in the BIOLAK sludge were detected in the A2O sludge, but four phyla were found exclusively in the A2O sludge. The proportion of the phylum Ignavibacteriae in the A2O sludge was 2.0440%, 3.2 times that in the BIOLAK sludge (0.6376%). Meanwhile, the proportion of the bacterial phylum Gemmatimonadetes in the BIOLAK sludge was 2.4673%, >17 times that in the A2O sludge (0.1404%). The proportion of the bacterial phylum Chlamydiae in the BIOLAK metagenome (0.2192%) was >6 times that in the A2O metagenome (0.0360%). Furthermore, 167 genera found in the A2O sludge were not detected in the BIOLAK sludge, and 50 genera found in the BIOLAK sludge were not detected in the A2O sludge. At both the phylum and genus levels, there were large differences between the biological communities of A2O and BIOLAK sludge. However, the proportions of each group of functional genes associated with the metabolism of nitrogen, phosphorus, sulfur and aromatic compounds in BIOLAK were very similar to those in A2O sludge. Moreover, the rankings of all six KEGG (Kyoto Encyclopedia of Genes and Genomes) categories were identical in the two sludges. In addition, analyses of functional classification and of pathways related to nitrogen metabolism showed that the abundant enzymes had identical rankings in the BIOLAK and A2O metagenomes. Therefore, comparative metagenomics of BIOLAK and A2O activated sludge indicated similar functional assignments from the two different biological communities. PMID:27363155
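
    The fold differences quoted in this abstract follow directly from the reported percentages, and a short check makes the arithmetic explicit. The dictionary layout and the helper name `fold_difference` are illustrative choices; only the percentages come from the abstract.

```python
# Relative abundances (percent of each metagenome) as reported in the abstract.
reported = {
    "Ignavibacteriae": {"A2O": 2.0440, "BIOLAK": 0.6376},
    "Gemmatimonadetes": {"A2O": 0.1404, "BIOLAK": 2.4673},
    "Chlamydiae": {"A2O": 0.0360, "BIOLAK": 0.2192},
}


def fold_difference(a, b):
    """Return how many times larger the bigger proportion is than the smaller."""
    high, low = max(a, b), min(a, b)
    return high / low


folds = {
    phylum: fold_difference(values["A2O"], values["BIOLAK"])
    for phylum, values in reported.items()
}
for phylum, fold in folds.items():
    print(f"{phylum}: {fold:.2f}-fold")
```

    The three ratios come out at roughly 3.2, 17.6 and 6.1, consistent with the "3.2 times", ">17 times" and ">6 times" statements.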

  5. Target fragmentation in radiobiology

    NASA Technical Reports Server (NTRS)

    Wilson, John W.; Cucinotta, Francis A.; Shinn, Judy L.; Townsend, Lawrence W.

    1993-01-01

    Nuclear reactions in biological systems produce low-energy fragments of the target nuclei seen as local events of high linear energy transfer (LET). A nuclear-reaction formalism is used to evaluate the nuclear-induced fields within biosystems and their effects within several biological models. On the basis of direct ionization interaction, one anticipates high-energy protons to have a quality factor and relative biological effectiveness (RBE) of unity. Target fragmentation contributions raise the effective quality factor of 10 GeV protons to 3.3 in reasonable agreement with RBE values for induced micronuclei in bean sprouts. Application of the Katz model indicates that the relative increase in RBE with decreasing exposure observed in cell survival experiments with 160 MeV protons is related solely to target fragmentation events. Target fragment contributions to lens opacity give an RBE of 1.4 for 2 GeV protons in agreement with the work of Lett and Cox. Predictions are made for the effective RBE for Harderian gland tumors induced by high-energy protons. An exposure model for lifetime cancer risk is derived from NCRP 98 risk tables, and protraction effects are examined for proton and helium ion exposures. The implications of dose rate enhancement effects on space radiation protection are considered.

  6. Fragment Separator ACCULINNA-2

    SciTech Connect

    Krupko, S. A.; Fomichev, A. S.; Chudoba, V.; Daniel, A. V.; Golovkov, M. S.; Gorshkov, V. A.; Oganessian, Yu. Ts.; Sidorchuk, S. I.; Slepnev, R. S.; Stepantsov, S. V.; Ter-Akopian, G. M.; Wolski, R.; Grigorenko, L. V.; Tarasov, O. B.; Ershov, S. N.; Lukyanov, V. K.; Danilin, B. V.; Korsheninnikov, A. A.; Goldberg, V. Z.; Mukha, I. G.

    2010-04-30

    A project for a new in-flight fragment separator is proposed as part of the third-generation DRIBs facilities in Dubna. Compared with the existing separator ACCULINNA, the beam intensity at ACCULINNA-2 should increase by a factor of 10-15, the beam quality should improve, and the RIB assortment should broaden considerably. The research program and structure of the new instrument are outlined.

  7. Comment on diquark fragmentation

    SciTech Connect

    Fredriksson, S.; Larsson, T.

    1983-07-01

    We discuss diquark fragmentation and suggest that a spectator uu system in deep-inelastic lepton-nucleon scattering has a larger breakup probability than a ud system. The reason for this is argued to be that half of the leftover ud systems are in bound (ud)₀ diquark configurations, while no such bound uu diquarks exist.

  8. Cross-roads in the classification of papillomaviruses.

    PubMed

    de Villiers, Ethel-Michele

    2013-10-01

    Acceptance of an official classification for the family Papillomaviridae based purely on DNA sequence relatedness was achieved as late as 2003. The rate of isolation and characterization of new papillomavirus types has depended greatly on, and been subject to, the development of new laboratory techniques. The introduction of every new technique led to a temporary burst in the number of new isolates. In the following, the bumpy road towards achieving a classification system, combined with the controversies of implementing and accepting new techniques, will be summarized. An update of the classification of the 170 human papillomavirus (HPV) types presently known is presented. Arguments towards the implementation of metagenomic sequencing for this rapidly growing family will be presented.

  9. An introduction to the analysis of shotgun metagenomic data

    PubMed Central

    Sharpton, Thomas J.

    2014-01-01

    Environmental DNA sequencing has revealed the expansive biodiversity of microorganisms and clarified the relationship between host-associated microbial communities and host phenotype. Shotgun metagenomic DNA sequencing is a relatively new and powerful environmental sequencing approach that provides insight into community biodiversity and function. However, the analysis of metagenomic sequences is complicated by the complex structure of the data. Fortunately, new tools and data resources have been developed to circumvent these complexities and allow researchers to determine which microbes are present in the community and what they might be doing. This review describes the analytical strategies and specific tools that can be applied to metagenomic data and the considerations and caveats associated with their use. Specifically, it documents how metagenomes can be analyzed to quantify community structure and diversity, assemble novel genomes, identify new taxa and genes, and determine which metabolic pathways are encoded in the community. It also discusses several methods that can be used to compare metagenomes to identify taxa and functions that differentiate communities. PMID:24982662

  10. Assessment of Metagenomic Assembly Using Simulated Next Generation Sequencing Data

    PubMed Central

    Sunagawa, Shinichi; Järvelin, Aino I.; Chan, Michelle M.; Arumugam, Manimozhiyan; Raes, Jeroen; Bork, Peer

    2012-01-01

    Due to the complexity of the protocols and a limited knowledge of the nature of microbial communities, simulating metagenomic sequences plays an important role in testing the performance of existing tools and data analysis methods with metagenomic data. We developed metagenomic read simulators with platform-specific (Sanger, pyrosequencing, Illumina) base-error models, and simulated metagenomes of differing community complexities. We first evaluated the effect of rigorous quality control on Illumina data. Although quality filtering removed a large proportion of the data, it greatly improved the accuracy and contig lengths of resulting assemblies. We then compared the quality-trimmed Illumina assemblies to those from Sanger and pyrosequencing. For the simple community (10 genomes) all sequencing technologies assembled a similar amount and accurately represented the expected functional composition. For the more complex community (100 genomes) Illumina produced the best assemblies and more correctly resembled the expected functional composition. For the most complex community (400 genomes) there was very little assembly of reads from any sequencing technology. However, due to the longer read length the Sanger reads still represented the overall functional composition reasonably well. We further examined the effect of scaffolding of contigs using paired-end Illumina reads. It dramatically increased contig lengths of the simple community and yielded minor improvements to the more complex communities. Although the increase in contig length was accompanied by increased chimericity, it resulted in more complete genes and a better characterization of the functional repertoire. The metagenomic simulators developed for this research are freely available. PMID:22384016
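
    As a rough illustration of what a metagenomic read simulator does, the sketch below samples fixed-length reads from a set of genomes in proportion to abundance and genome length, then applies a uniform substitution error rate. This is a deliberately simplified stand-in: the platform-specific base-error models described in the abstract are not reproduced, and the function name, parameters and toy genomes are hypothetical.

```python
import random


def simulate_reads(genomes, abundances, n_reads, read_len, error_rate, seed=0):
    """Sample reads from genomes and add uniform substitution errors.

    genomes maps a genome name to its sequence; abundances maps the same
    names to relative cell abundances. Returns (source_name, read) pairs.
    """
    rng = random.Random(seed)
    names = list(genomes)
    # Expected read share scales with abundance times genome length.
    weights = [abundances[name] * len(genomes[name]) for name in names]
    reads = []
    for _ in range(n_reads):
        name = rng.choices(names, weights=weights, k=1)[0]
        sequence = genomes[name]
        start = rng.randrange(len(sequence) - read_len + 1)
        bases = list(sequence[start:start + read_len])
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                # Substitute with one of the three other bases.
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append((name, "".join(bases)))
    return reads


# Toy community of two random 500 bp "genomes" (illustrative only).
toy_rng = random.Random(1)
genomes = {
    name: "".join(toy_rng.choice("ACGT") for _ in range(500))
    for name in ("genome_A", "genome_B")
}
abundances = {"genome_A": 0.8, "genome_B": 0.2}
reads = simulate_reads(genomes, abundances, n_reads=20, read_len=50, error_rate=0.01)
print(len(reads), reads[0][0])
```

    Keeping the source genome alongside each read is what lets simulated data serve as ground truth when assemblies or classifiers are evaluated.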

  11. Longitudinal Metagenomic Analysis of Hospital Air Identifies Clinically Relevant Microbes

    PubMed Central

    King, Paula; Pham, Long K.; Waltz, Shannon; Sphar, Dan; Yamamoto, Robert T.; Conrad, Douglas; Taplitz, Randy; Torriani, Francesca

    2016-01-01

    We describe the sampling of sixty-three uncultured hospital air samples collected over a six-month period and analysis using shotgun metagenomic sequencing. Our primary goals were to determine the longitudinal metagenomic variability of this environment, identify and characterize genomes of potential pathogens and determine whether they are atypical to the hospital airborne metagenome. Air samples were collected from eight locations which included patient wards, the main lobby and outside. The resulting DNA libraries produced 972 million sequences representing 51 gigabases. Hierarchical clustering of samples by the most abundant 50 microbial orders generated three major nodes which primarily clustered by type of location. Because the indoor locations were longitudinally consistent, episodic relative increases in microbial genomic signatures related to the opportunistic pathogens Aspergillus, Penicillium and Stenotrophomonas were identified as outliers at specific locations. Further analysis of microbial reads specific for Stenotrophomonas maltophilia indicated homology to a sequenced multi-drug resistant clinical strain and we observed broad sequence coverage of resistance genes. We demonstrate that a shotgun metagenomic sequencing approach can be used to characterize the resistance determinants of pathogen genomes that are uncharacteristic for an otherwise consistent hospital air microbial metagenomic profile. PMID:27482891

  12. Reconstruction of Bacterial and Viral Genomes from Multiple Metagenomes

    PubMed Central

    Gupta, Ankit; Kumar, Sanjiv; Prasoodanan, Vishnu P. K.; Harish, K.; Sharma, Ashok K.; Sharma, Vineet K.

    2016-01-01

    Several metagenomic projects have been accomplished or are in progress. However, in most cases, it is not feasible to generate complete genomic assemblies of species from the metagenomic sequencing of a complex environment. Only a few studies have reported the reconstruction of bacterial genomes from complex metagenomes. In this work, a Binning-Assembly approach has been proposed and demonstrated for the reconstruction of bacterial and viral genomes from 72 human gut metagenomic datasets. A total of 1156 bacterial genomes belonging to 219 bacterial families and 279 viral genomes belonging to 84 viral families could be identified. More than 80% complete draft genome sequences could be reconstructed for a total of 126 bacterial and 11 viral genomes. Selected draft assembled genomes could be validated with 99.8% accuracy using their ORFs. The study provides useful information on the assembly expected for a species given its number of reads and abundance. This approach along with spiking was also demonstrated to be useful in improving the draft assembly of a bacterial genome. The Binning-Assembly approach can be successfully used to reconstruct bacterial and viral genomes from multiple metagenomic datasets obtained from similar environments. PMID:27148174

  13. Metagenomic analysis of permafrost microbial community response to thaw

    SciTech Connect

    Mackelprang, R.; Waldrop, M.P.; DeAngelis, K.M.; David, M.M.; Chavarria, K.L.; Blazewicz, S.J.; Rubin, E.M.; Jansson, J.K.

    2011-07-01

    We employed deep metagenomic sequencing to determine the impact of thaw on microbial phylogenetic and functional genes and related these data to measurements of methane emissions. Metagenomics, the direct sequencing of DNA from the environment, allows for the examination of whole biochemical pathways and associated processes, as opposed to individual pieces of the metabolic puzzle. Our metagenome analyses revealed that during transition from a frozen to a thawed state there were rapid shifts in many microbial, phylogenetic and functional gene abundances and pathways. After one week of incubation at 5°C, permafrost metagenomes converged to be more similar to each other than while they were frozen. We found that multiple genes involved in cycling of carbon and nitrogen shifted rapidly during thaw. We also constructed the first draft genome from a complex soil metagenome, which corresponded to a novel methanogen. Methane previously accumulated in permafrost was released during thaw and subsequently consumed by methanotrophic bacteria. Together these data point towards the importance of rapid cycling of methane and nitrogen in thawing permafrost.

  14. Metagenomic Insights into Transferable Antibiotic Resistance in Oral Bacteria.

    PubMed

    Sukumar, S; Roberts, A P; Martin, F E; Adler, C J

    2016-08-01

    Antibiotic resistance is considered one of the greatest threats to global public health. Resistance is often conferred by the presence of antibiotic resistance genes (ARGs), which are readily found in the oral microbiome. In-depth genetic analyses of the oral microbiome through metagenomic techniques reveal a broad distribution of ARGs (including novel ARGs) in individuals not recently exposed to antibiotics, including humans in isolated indigenous populations. This has resulted in a paradigm shift from focusing on the carriage of antibiotic resistance in pathogenic bacteria to a broader concept of an oral resistome, which includes all resistance genes in the microbiome. Metagenomics is beginning to demonstrate the role of the oral resistome and horizontal gene transfer within and between commensals in the absence of selective pressure, such as an antibiotic. At the chairside, metagenomic data reinforce our need to adhere to current antibiotic guidelines to minimize the spread of resistance, as such data reveal the extent of ARGs without exposure to antimicrobials and the ecologic changes created in the oral microbiome by even a single dose of antibiotics. The aim of this review is to discuss the role of metagenomics in the investigation of the oral resistome, including the transmission of antibiotic resistance in the oral microbiome. Future perspectives, including clinical implications of the findings from metagenomic investigations of oral ARGs, are also considered.

  15. Application of metagenomics in the human gut microbiome.

    PubMed

    Wang, Wei-Lin; Xu, Shao-Yan; Ren, Zhi-Gang; Tao, Liang; Jiang, Jian-Wen; Zheng, Shu-Sen

    2015-01-21

    There are more than 1000 microbial species living in the complex human intestine. The gut microbial community plays an important role in protecting the host against pathogenic microbes, modulating immunity, regulating metabolic processes, and is even regarded as an endocrine organ. However, traditional culture methods are very limited for identifying microbes. With the application of molecular biologic technology in the field of the intestinal microbiome, especially metagenomic sequencing of the next-generation sequencing technology, progress has been made in the study of the human intestinal microbiome. Metagenomics can be used to study intestinal microbiome diversity and dysbiosis, as well as its relationship to health and disease. Moreover, functional metagenomics can identify novel functional genes, microbial pathways, antibiotic resistance genes, functional dysbiosis of the intestinal microbiome, and determine interactions and co-evolution between microbiota and host, though there are still some limitations. Metatranscriptomics, metaproteomics and metabolomics represent enormous complements to the understanding of the human gut microbiome. This review aims to demonstrate that metagenomics can be a powerful tool in studying the human gut microbiome with encouraging prospects. The limitations of metagenomics to be overcome are also discussed. Metatranscriptomics, metaproteomics and metabolomics in relation to the study of the human gut microbiome are also briefly discussed.

  16. Application of metagenomics in the human gut microbiome

    PubMed Central

    Wang, Wei-Lin; Xu, Shao-Yan; Ren, Zhi-Gang; Tao, Liang; Jiang, Jian-Wen; Zheng, Shu-Sen

    2015-01-01

    There are more than 1000 microbial species living in the complex human intestine. The gut microbial community plays an important role in protecting the host against pathogenic microbes, modulating immunity, regulating metabolic processes, and is even regarded as an endocrine organ. However, traditional culture methods are very limited for identifying microbes. With the application of molecular biologic technology in the field of the intestinal microbiome, especially metagenomic sequencing of the next-generation sequencing technology, progress has been made in the study of the human intestinal microbiome. Metagenomics can be used to study intestinal microbiome diversity and dysbiosis, as well as its relationship to health and disease. Moreover, functional metagenomics can identify novel functional genes, microbial pathways, antibiotic resistance genes, functional dysbiosis of the intestinal microbiome, and determine interactions and co-evolution between microbiota and host, though there are still some limitations. Metatranscriptomics, metaproteomics and metabolomics represent enormous complements to the understanding of the human gut microbiome. This review aims to demonstrate that metagenomics can be a powerful tool in studying the human gut microbiome with encouraging prospects. The limitations of metagenomics to be overcome are also discussed. Metatranscriptomics, metaproteomics and metabolomics in relation to the study of the human gut microbiome are also briefly discussed. PMID:25624713

  17. Comparative metagenome analysis of an Alaskan glacier.

    PubMed

    Choudhari, Sulbha; Lohia, Ruchi; Grigoriev, Andrey

    2014-04-01

    The temperature in the Arctic region has been increasing in the recent past accompanied by melting of its glaciers. We took a snapshot of the current microbial inhabitation of an Alaskan glacier (which can be considered as one of the simplest possible ecosystems) by using metagenomic sequencing of 16S rRNA recovered from ice/snow samples. Somewhat contrary to our expectations and earlier estimates, a rich and diverse microbial population of more than 2,500 species was revealed, including several species of Archaea that have been identified for the first time in the glaciers of the Northern hemisphere. The most prominent bacterial groups found were Proteobacteria, Bacteroidetes, and Firmicutes. Firmicutes were not reported in large numbers in a previously studied Alpine glacier but were dominant in an Antarctic subglacial lake. Representatives of Cyanobacteria, Actinobacteria and Planctomycetes were among the most numerous, likely reflecting the dependence of the ecosystem on the energy obtained through photosynthesis and close links with the microbial community of the soil. Principal component analysis (PCA) of nucleotide word frequency revealed distinct sequence clusters for different taxonomic groups in the Alaskan glacier community and separate clusters for the glacial communities from other regions of the world. Comparative analysis of the community composition and bacterial diversity present in the Byron glacier in Alaska with other environments showed larger overlap with an Arctic soil than with a high Arctic lake, indicating patterns of community exchange and suggesting that these bacteria may play an important role in soil development during glacial retreat.
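
    The "nucleotide word frequency" vectors that feed a PCA like the one mentioned above can be sketched as follows. This is only an illustration, not the authors' pipeline: the word length k, the function name word_frequencies and the toy sequence are all assumptions, and the PCA step itself is left out.

```python
from collections import Counter
from itertools import product


def word_frequencies(seq, k=4):
    """Return the relative frequency of every DNA word of length k in seq.

    Hypothetical helper: the abstract does not state the word length used.
    """
    seq = seq.upper()
    # All 4**k possible words, in a fixed lexicographic order (AA, AC, AG, ...).
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing ambiguous bases (e.g. N) are ignored in the total.
    total = sum(counts[w] for w in words)
    return [counts[w] / total if total else 0.0 for w in words]


# Toy sequence (made up); each real read or contig would give one such vector.
profile = word_frequencies("ACGTACGTAC", k=2)
```

    Each sequence yields one fixed-length vector, so a set of sequences forms a matrix whose rows could then be passed to any PCA routine.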

  18. Metagenomic scaffolds enable combinatorial lignin transformation

    PubMed Central

    Strachan, Cameron R.; Singh, Rahul; VanInsberghe, David; Ievdokymenko, Kateryna; Budwill, Karen; Mohn, William W.; Eltis, Lindsay D.; Hallam, Steven J.

    2014-01-01

    Engineering the microbial transformation of lignocellulosic biomass is essential to developing modern biorefining processes that alleviate reliance on petroleum-derived energy and chemicals. Many current bioprocess streams depend on the genetic tractability of Escherichia coli with a primary emphasis on engineering cellulose/hemicellulose catabolism, small molecule production, and resistance to product inhibition. Conversely, bioprocess streams for lignin transformation remain embryonic, with relatively few environmental strains or enzymes implicated. Here we develop a biosensor responsive to monoaromatic lignin transformation products compatible with functional screening in E. coli. We use this biosensor to retrieve metagenomic scaffolds sourced from coal bed bacterial communities conferring an array of lignin transformation phenotypes that synergize in combination. Transposon mutagenesis and comparative sequence analysis of active clones identified genes encoding six functional classes mediating lignin transformation phenotypes that appear to be rearrayed in nature via horizontal gene transfer. Lignin transformation activity was then demonstrated for one of the predicted gene products encoding a multicopper oxidase to validate the screen. These results illuminate cellular and community-wide networks acting on aromatic polymers and expand the toolkit for engineering recombinant lignin transformation based on ecological design principles. PMID:24982175

  19. Metagenomic analysis of stressed coral holobionts.

    PubMed

    Vega Thurber, Rebecca; Willner-Hall, Dana; Rodriguez-Mueller, Beltran; Desnues, Christelle; Edwards, Robert A; Angly, Florent; Dinsdale, Elizabeth; Kelly, Linda; Rohwer, Forest

    2009-08-01

    The coral holobiont is the community of metazoans, protists and microbes associated with scleractinian corals. Disruptions in these associations have been correlated with coral disease, but little is known about the series of events involved in the shift from mutualism to pathogenesis. To evaluate structural and functional changes in coral microbial communities, Porites compressa was exposed to four stressors: increased temperature, elevated nutrients, dissolved organic carbon loading and reduced pH. Microbial metagenomic samples were collected and pyrosequenced. Functional gene analysis demonstrated that stressors increased the abundance of microbial genes involved in virulence, stress resistance, sulfur and nitrogen metabolism, motility and chemotaxis, fatty acid and lipid utilization, and secondary metabolism. Relative changes in taxonomy also demonstrated that coral-associated microbiota (Archaea, Bacteria, protists) shifted from a healthy-associated coral community (e.g. Cyanobacteria, Proteobacteria and the zooxanthellae Symbiodinium) to a community (e.g. Bacteroidetes, Fusobacteria and Fungi) of microbes often found on diseased corals. Additionally, low-abundance Vibrio spp. were found to significantly alter microbiome metabolism, suggesting that the contribution of just a few members of a community can profoundly shift the health status of the coral holobiont.

  20. The oral metagenome in health and disease

    PubMed Central

    Belda-Ferre, Pedro; Alcaraz, Luis David; Cabrera-Rubio, Raúl; Romero, Héctor; Simón-Soro, Aurea; Pignatelli, Miguel; Mira, Alex

    2012-01-01

    The oral cavity of humans is inhabited by hundreds of bacterial species and some of them have a key role in the development of oral diseases, mainly dental caries and periodontitis. We describe for the first time the metagenome of the human oral cavity under health and diseased conditions, with a focus on supragingival dental plaque and cavities. Direct pyrosequencing of eight samples with different oral-health status produced 1 Gbp of sequence without the biases imposed by PCR or cloning. These data show that cavities are not dominated by Streptococcus mutans (the species originally identified as the etiological agent of dental caries) but are in fact a complex community formed by tens of bacterial species, in agreement with the view that caries is a polymicrobial disease. The analysis of the reads indicated that the oral cavity is functionally a different environment from the gut, with many functional categories enriched in one of the two environments and depleted in the other. Individuals who had never suffered from dental caries showed an over-representation of several functional categories, like genes for antimicrobial peptides and quorum sensing. In addition, they did not have mutans streptococci but displayed high recruitment of other species. Several isolates belonging to these dominant bacteria in healthy individuals were cultured and shown to inhibit the growth of cariogenic bacteria, suggesting the use of these commensal bacterial strains as probiotics to promote oral health and prevent dental caries. PMID:21716308

  1. Metagenomic analysis of phosphorus removing sludge communities

    SciTech Connect

    Garcia Martin, Hector; Ivanova, Natalia; Kunin, Victor; Warnecke, Falk; Barry, Kerrie; McHardy, Alice C.; Yeates, Christine; He, Shaomei; Salamov, Asaf; Szeto, Ernest; Dalin, Eileen; Putnam, Nik; Shapiro, Harris J.; Pangilinan, Jasmyn L.; Rigoutsos, Isidore; Kyrpides, Nikos C.; Blackall, Linda Louise; McMahon, Katherine D.; Hugenholtz, Philip

    2006-02-01

    Enhanced Biological Phosphorus Removal (EBPR) is not well understood at the metabolic level despite being one of the best-studied microbially-mediated industrial processes due to its ecological and economic relevance. Here we present a metagenomic analysis of two lab-scale EBPR sludges dominated by the uncultured bacterium, "Candidatus Accumulibacter phosphatis." This analysis resolves several controversies in EBPR metabolic models and provides hypotheses explaining the dominance of A. phosphatis in this habitat, its lifestyle outside EBPR and probable cultivation requirements. Comparison of the same species from different EBPR sludges highlights recent evolutionary dynamics in the A. phosphatis genome that could be linked to mechanisms for environmental adaptation. In spite of an apparent lack of phylogenetic overlap in the flanking communities of the two sludges studied, common functional themes were found, at least one of them complementary to the inferred metabolism of the dominant organism. The present study provides a much-needed blueprint for a systems-level understanding of EBPR and illustrates that metagenomics enables detailed, often novel, insights into even well-studied biological systems.

  2. Recovery of a Medieval Brucella melitensis Genome Using Shotgun Metagenomics

    PubMed Central

    Kay, Gemma L.; Sergeant, Martin J.; Giuffra, Valentina; Bandiera, Pasquale; Milanese, Marco; Bramanti, Barbara

    2014-01-01

    Shotgun metagenomics provides a powerful assumption-free approach to the recovery of pathogen genomes from contemporary and historical material. We sequenced the metagenome of a calcified nodule from the skeleton of a 14th-century middle-aged male excavated from the medieval Sardinian settlement of Geridu. We obtained 6.5-fold coverage of a Brucella melitensis genome. Sequence reads from this genome showed signatures typical of ancient or aged DNA. Despite the relatively low coverage, we were able to use information from single-nucleotide polymorphisms to place the medieval pathogen genome within a clade of B. melitensis strains that included the well-studied Ether strain and two other recent Italian isolates. We confirmed this placement using information from deletions and IS711 insertions. We conclude that metagenomics stands ready to document past and present infections, shedding light on the emergence, evolution, and spread of microbial pathogens. PMID:25028426

  3. Application of metagenomics in understanding oral health and disease

    PubMed Central

    Xu, Ping; Gunsolley, John

    2014-01-01

    Oral diseases including periodontal disease and caries are some of the most prevalent infectious diseases in humans. Different microbial species cohabitate and form a polymicrobial biofilm called dental plaque in the oral cavity. Metagenomics using next generation sequencing technologies has produced bacterial profiles and genomic profiles to study the relationships between microbial diversity, genetic variation, and oral diseases. Several oral metagenomic studies have examined the oral microbiome of periodontal disease and caries. Gene annotations in these studies support the association of specific genes or metabolic pathways with oral health and with specific diseases. The roles of pathogenic species and functions of specific genes in oral disease development have been recognized by metagenomic analysis. A model is proposed in which three levels of interactions occur in the oral microbiome that determines oral health or disease. PMID:24642489

  4. Natural Product Discovery through Improved Functional Metagenomics in Streptomyces.

    PubMed

    Iqbal, Hala A; Low-Beinart, Lila; Obiajulu, Joseph U; Brady, Sean F

    2016-08-01

    Because the majority of environmental bacteria are not easily culturable, access to many bacterially encoded secondary metabolites will be dependent on the development of improved functional metagenomic screening methods. In this study, we examined a collection of diverse Streptomyces species for the best innate ability to heterologously express biosynthetic gene clusters. We then optimized methods for constructing high quality metagenomic cosmid libraries in the best Streptomyces host. An initial screen of a 1.5 million-membered metagenomic library constructed in Streptomyces albus, the species that exhibited the highest propensity for heterologous expression of gene clusters, led to the identification of the novel natural product metatricycloene (1). Metatricycloene is a tricyclic polyene encoded by a reductive, iterative polyketide-like gene cluster. Related gene clusters found in sequenced genomes appear to encode a largely unexplored collection of structurally diverse, polyene-based metabolites. PMID:27447056

  5. Recovering complete and draft population genomes from metagenome datasets

    DOE PAGESBeta

    Sangwan, Naseer; Xia, Fangfang; Gilbert, Jack A.

    2016-03-08

    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost on application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution.

  6. Introduction to the analysis of environmental sequences: metagenomics with MEGAN.

    PubMed

    Huson, Daniel H; Mitra, Suparna

    2012-01-01

    Metagenomics is the study of microbial organisms using sequencing applied directly to environmental samples. Similarly, in metatranscriptomics and metaproteomics, the RNA and protein sequences of such samples are studied. The analysis of these kinds of data often starts by asking the questions of "who is out there?", "what are they doing?", and "how do they compare?". In this chapter, we describe how these computational questions can be addressed using MEGAN, the MEtaGenome ANalyzer program. We first show how to analyze the taxonomic and functional content of a single dataset and then show how such analyses can be performed in a comparative fashion. We demonstrate how to compare different datasets using ecological indices and other distance measures. The discussion is conducted using a number of published marine datasets comprising metagenomic, metatranscriptomic, metaproteomic, and 16S rRNA data.

  7. Recovering complete and draft population genomes from metagenome datasets.

    PubMed

    Sangwan, Naseer; Xia, Fangfang; Gilbert, Jack A

    2016-03-08

    Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost on application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution.
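
    The idea that contigs from the same genome show covarying coverage across datasets can be illustrated with a small sketch. It is not any specific binning tool reviewed in the article: the coverage values, contig names and the 0.95 correlation threshold are invented for illustration, and real binners combine coverage with sequence composition and proper clustering.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length coverage profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def covarying_pairs(coverage, threshold=0.95):
    """List contig pairs whose coverage rises and falls together across samples."""
    names = sorted(coverage)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if pearson(coverage[a], coverage[b]) >= threshold
    ]


# Hypothetical mean coverage of three contigs in three samples.
coverage = {
    "contig_1": [10.0, 40.0, 5.0],
    "contig_2": [12.0, 43.0, 6.0],
    "contig_3": [30.0, 2.0, 25.0],
}
pairs = covarying_pairs(coverage)
```

    Here contig_1 and contig_2 track each other across samples and would be candidates for the same bin, while contig_3 follows the opposite pattern.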

  8. Application of metagenomics in understanding oral health and disease.

    PubMed

    Xu, Ping; Gunsolley, John

    2014-04-01

    Oral diseases including periodontal disease and caries are some of the most prevalent infectious diseases in humans. Different microbial species cohabitate and form a polymicrobial biofilm called dental plaque in the oral cavity. Metagenomics using next generation sequencing technologies has produced bacterial profiles and genomic profiles to study the relationships between microbial diversity, genetic variation, and oral diseases. Several oral metagenomic studies have examined the oral microbiome of periodontal disease and caries. Gene annotations in these studies support the association of specific genes or metabolic pathways with oral health and with specific diseases. The roles of pathogenic species and functions of specific genes in oral disease development have been recognized by metagenomic analysis. A model is proposed in which three levels of interactions occur in the oral microbiome that determines oral health or disease.

  9. Is metagenomics resolving identification of functions in microbial communities?

    PubMed

    Chistoserdova, Ludmila

    2014-01-01

    We are coming up on the tenth anniversary of the broad use of the method involving whole metagenome shotgun sequencing, referred to as metagenomics. The application of this approach has definitely revolutionized microbiology and the related fields, including the realization of the importance of the human microbiome. As such, metagenomics has already provided a novel outlook on the complexity and dynamics of microbial communities that are an important part of the biosphere of the planet. Accumulation of massive amounts of sequence data also caused a surge in the development of bioinformatics tools specially designed to provide pipelines for data analysis and visualization. However, a critical outlook into the field is required to appreciate what could be and what has currently been gained from the massive sequence databases that are being generated with ever-increasing speed.

  10. Polyketide synthases in the microbiome of the marine sponge Plakortis halichondrioides: a metagenomic update.

    PubMed

    Della Sala, Gerardo; Hochmuth, Thomas; Teta, Roberta; Costantino, Valeria; Mangoni, Alfonso

    2014-11-01

    Sponge-associated microorganisms are able to assemble the complex machinery for the production of secondary metabolites such as polyketides, the most important class of marine natural products from a drug discovery perspective. A comprehensive overview of polyketide biosynthetic genes of the sponge Plakortis halichondrioides and its symbionts was obtained in the present study by massively parallel 454 pyrosequencing of complex and heterogeneous PCR (Polymerase Chain Reaction) products amplified from the metagenomic DNA of a specimen of P. halichondrioides collected in the Caribbean Sea. This was accompanied by a survey of the bacterial diversity within the sponge. In line with previous studies, sequences belonging to supA and swfA, two widespread sponge-specific groups of polyketide synthase (PKS) genes were dominant. While they have been previously reported as belonging to Poribacteria (a novel bacterial phylum found exclusively in sponges), re-examination of current genomic sequencing data showed supA and swfA not to be present in the poribacterial genome. Several non-supA, non-swfA type-I PKS fragments were also identified. A significant portion of these fragments resembled type-I PKSs from protists, suggesting that bacteria may not be the only source of polyketides from P. halichondrioides, and that protistan PKSs should receive further investigation as a source of novel polyketides.

  11. Microbial diversity of hypersaline environments: a metagenomic approach.

    PubMed

    Ventosa, Antonio; de la Haba, Rafael R; Sánchez-Porro, Cristina; Papke, R Thane

    2015-06-01

    Recent studies based on metagenomics and other molecular techniques have permitted a detailed knowledge of the microbial diversity and metabolic activities of microorganisms in hypersaline environments. The currently accepted model of community structure in hypersaline environments is that the square archaeon Haloquadratum walsbyi, the bacteroidete Salinibacter ruber and nanohaloarchaea are predominant members at higher salt concentrations, while more diverse archaeal and bacterial taxa are observed in habitats with intermediate salinities. Additionally, metagenomic studies may provide insight into the isolation and characterization of the principal microbes in these habitats, such as the recently described gammaproteobacterium Spiribacter salinus. PMID:26056770

  12. Whither or wither geomicrobiology in the era of 'community metagenomics'

    USGS Publications Warehouse

    Oremland, R.S.; Capone, D.G.; Stolz, J.F.; Fuhrman, J.

    2005-01-01

    Molecular techniques are valuable tools that can improve our understanding of the structure of microbial communities. They provide the ability to probe for life in all niches of the biosphere, perhaps even supplanting the need to cultivate microorganisms or to conduct ecophysiological investigations. However, an overemphasis and strict dependence on such large information-driven endeavours as environmental metagenomics could overwhelm the field, to the detriment of microbial ecology. We now call for more balanced, hypothesis-driven research efforts that couple metagenomics with classic approaches.

  13. Cryobiology of coral fragments.

    PubMed

    Hagedorn, Mary; Farrell, Ann; Carter, Virginia L

    2013-02-01

    Around the world, coral reefs are dying due to human influences, and saving habitat alone may not stop this destruction. This investigation focused on the biological processes that will provide the first steps in understanding the cryobiology of whole coral fragments. Coral fragments are a partnership of coral tissue and endosymbiotic algae, Symbiodinium sp., commonly called zooxanthellae. These data reflected their separate sensitivities to chilling and a cryoprotectant (dimethyl sulfoxide) for the coral Pocillopora damicornis, as measured by tissue loss and Pulse Amplitude Modulated fluorometry 3 weeks post-treatment. Five cryoprotectant treatments maintained the viability of the coral tissue and zooxanthellae at control values (1 M dimethyl sulfoxide at 1.0, 1.5 and 2.0 h exposures, and 1.5 M dimethyl sulfoxide at 1.0 and 1.5 h exposures, P>0.05, ANOVA), whereas 2 M concentrations did not (P<0.05, ANOVA). A seasonal response to chilling was observed in the coral tissue, but not in the zooxanthellae. During the winter when the fragments were chilled, the coral tissue remained relatively intact (∼25% loss) post-treatment, but the zooxanthellae numbers in the tissue declined after 5 min of chilling (P<0.05, ANOVA). However, in the late spring, coral tissue (∼75% loss) and zooxanthellae numbers declined in response to chilling alone (P<0.05, ANOVA). When a cryoprotectant (1 M dimethyl sulfoxide) was used in concert with chilling, it protected the coral against tissue loss after 45 min of cryoprotectant exposure (P>0.05, ANOVA), but it did not protect against the loss of zooxanthellae (P<0.05, ANOVA). The zooxanthellae are the most sensitive element in the coral fragment complex and future cryopreservation protocols must be guided by their greater sensitivity.

  14. Metagenomic analysis of microbial community of an Amazonian geothermal spring in Peru.

    PubMed

    Paul, Sujay; Cortez, Yolanda; Vera, Nadia; Villena, Gretty K; Gutiérrez-Correa, Marcel

    2016-09-01

    Aguas Calientes (AC) is an isolated geothermal spring located deep in the Amazon rainforest (7°21'12″ S, 75°00'54″ W) of Peru. This geothermal spring is slightly acidic (pH 5.0-7.0) in nature, with temperatures varying from 45 to 90 °C, and is continually fed by plant litter, resulting in a relatively high degree of total organic content (TOC). A pooled water sample was analyzed at the 16S rRNA V3-V4 hypervariable region by amplicon metagenome sequencing on the Illumina HiSeq platform. A total of 2,976,534 paired-end reads were generated and assigned to 5434 OTUs. All the resulting 16S rRNA fragments were then classified into 58 bacterial phyla and 2 archaeal phyla. Proteobacteria (88.06%) was the most highly represented phylum, followed by Thermi (6.43%), Firmicutes (3.41%) and Aquificae (1.10%). Crenarchaeota and Euryarchaeota were the only 2 archaeal phyla detected in this study, both at low abundance. Metagenomic sequences were deposited in the NCBI SRA database under accession number SRX1809286. Functional categorization of the assigned OTUs was performed using the PICRUSt tool. In COG analysis, "Amino acid transport and metabolism" (8.5%) was the most highly represented category, whereas among predicted KEGG pathways "Metabolism" (50.6%) was the most abundant. This is the first report of a high-resolution microbial phylogenetic profile of an Amazonian hot spring.

  15. The Systemic Imprint of Growth and Its Uses in Ecological (Meta)Genomics

    PubMed Central

    Vieira-Silva, Sara; Rocha, Eduardo P. C.

    2010-01-01

    Microbial minimal generation times range from a few minutes to several weeks. They are evolutionarily determined by variables such as environment stability, nutrient availability, and community diversity. Selection for fast growth adaptively imprints genomes, resulting in gene amplification, adapted chromosomal organization, and biased codon usage. We found that these growth-related traits in 214 species of bacteria and archaea are highly correlated, suggesting they all result from growth optimization. While modeling their association with maximal growth rates in view of synthetic biology applications, we observed that codon usage biases are better correlates of growth rates than any other trait, including rRNA copy number. Systematic deviations from our model reveal two distinct evolutionary processes. First, genome organization shows more evolutionary inertia than growth rates. This results in over-representation of growth-related traits in fast degrading genomes. Second, selection for these traits depends on optimal growth temperature: for similar generation times purifying selection is stronger in psychrophiles, intermediate in mesophiles, and lower in thermophiles. Using this information, we created a predictor of maximal growth rate adapted to small genome fragments. We applied it to three metagenomic environmental samples to show that a transiently rich environment, such as the human gut, selects for fast-growers, that a toxic environment, such as the acid mine biofilm, selects for low growth rates, whereas a diverse environment, like the soil, shows all ranges of growth rates. We also demonstrate that microbial colonizers of the babies' gut grow faster than stabilized human adults' gut communities. In conclusion, we show that one can predict maximal growth rates from sequence data alone, and we propose that such information can be used to facilitate the manipulation of generation times. Our predictor allows inferring growth rates in the vast majority of uncultivable

  16. Metagenomic analysis of microbial community of an Amazonian geothermal spring in Peru.

    PubMed

    Paul, Sujay; Cortez, Yolanda; Vera, Nadia; Villena, Gretty K; Gutiérrez-Correa, Marcel

    2016-09-01

    Aguas Calientes (AC) is an isolated geothermal spring located deep in the Amazon rainforest (7°21'12″ S, 75°00'54″ W) of Peru. This geothermal spring is slightly acidic (pH 5.0-7.0) in nature, with temperatures varying from 45 to 90 °C, and is continually fed by plant litter, resulting in a relatively high degree of total organic content (TOC). A pooled water sample was analyzed at the 16S rRNA V3-V4 hypervariable region by amplicon metagenome sequencing on the Illumina HiSeq platform. A total of 2,976,534 paired-end reads were generated and assigned to 5434 OTUs. All the resulting 16S rRNA fragments were then classified into 58 bacterial phyla and 2 archaeal phyla. Proteobacteria (88.06%) was the most highly represented phylum, followed by Thermi (6.43%), Firmicutes (3.41%) and Aquificae (1.10%). Crenarchaeota and Euryarchaeota were the only 2 archaeal phyla detected in this study, both at low abundance. Metagenomic sequences were deposited in the NCBI SRA database under accession number SRX1809286. Functional categorization of the assigned OTUs was performed using the PICRUSt tool. In COG analysis, "Amino acid transport and metabolism" (8.5%) was the most highly represented category, whereas among predicted KEGG pathways "Metabolism" (50.6%) was the most abundant. This is the first report of a high-resolution microbial phylogenetic profile of an Amazonian hot spring. PMID:27408814

  17. Characterization of a novel thermostable patatin-like protein from a Guaymas basin metagenomic library.

    PubMed

    Fu, Ling; He, Ying; Xu, Fangdi; Ma, Qun; Wang, Fengping; Xu, Jun

    2015-07-01

    Deep-sea hydrothermal vents are a natural habitat for thermophiles, which contain plenty of enzymes that can function at high temperatures. In this work, we constructed a fosmid library in Escherichia coli using metagenomic DNA isolated from a chimney sample collected in the hydrothermal vents in Guaymas Basin. The library was screened for lipolytic activity and positive clones were subjected to subcloning. A novel patatin-like protein (PLP) that exhibited less than 45 % identity in amino acid sequence to known enzymes was obtained. Common features of the patatin-like proteins, such as four conserved blocks, were detected. Interestingly, there was an Ala at site 42 in PLP instead of the first Gly residue in the consensus sequence Gly-X-Ser-X-Gly found in other PLP homologs. The active sites of PLP were Ser44 and Asp160. Spectrophotometric assays with different p-nitrophenyl esters demonstrated a preference for p-nitrophenyl butyrate (C4) and p-nitrophenyl decanoate (C10). Moreover, PLP demonstrated optimal activity at 70 °C and at pH 9.0 (Tris-HCl). The activation energy from the linear Arrhenius plot was found to be 38.3 ± 0.9 kJ/mol. The Km and Vmax of PLP for C4 were 304 ± 38 μM and 14 ± 0.38 μmol min⁻¹ mg⁻¹, respectively. Gene-mining of the metagenome dataset that was generated by pyrosequencing the same chimney sample resulted in identification of 20 PLP homolog gene fragments, which could represent promising examples of this category of thermostable proteins. PMID:26016814
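
    To make the reported kinetic constants concrete, the sketch below plugs them into the standard Michaelis-Menten and Arrhenius equations. It assumes the enzyme follows simple Michaelis-Menten kinetics for the C4 substrate and that the Arrhenius relation holds over the temperature range compared, which is only reasonable below the 70 °C optimum; the function names and the example temperatures are illustrative, not taken from the paper.

```python
import math

KM_UM = 304.0   # reported Km for p-nitrophenyl butyrate (C4), in micromolar
VMAX = 14.0     # reported Vmax, in micromol per min per mg
EA = 38.3e3     # reported activation energy, in J/mol
R = 8.314       # gas constant, in J/(mol*K)


def mm_rate(substrate_um, km=KM_UM, vmax=VMAX):
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate_um / (km + substrate_um)


def arrhenius_ratio(t1_celsius, t2_celsius, ea=EA):
    """Ratio k(T2)/k(T1) predicted by the Arrhenius equation."""
    t1 = t1_celsius + 273.15
    t2 = t2_celsius + 273.15
    return math.exp(-ea / R * (1.0 / t2 - 1.0 / t1))


# At [S] equal to Km the rate is half of Vmax by definition.
half_max = mm_rate(KM_UM)
# Predicted fold change in rate constant between 60 and 70 degrees C
# (illustrative temperatures, assumed to lie on the linear Arrhenius range).
fold_change = arrhenius_ratio(60.0, 70.0)
```

    The half-maximal check is a quick sanity test of the constants; the fold change only indicates the trend implied by the activation energy and says nothing about behavior above the optimum, where denaturation dominates.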

  18. Fragmentation of cancer cells

    NASA Astrophysics Data System (ADS)

    Vanapalli, Siva; Kamyabi, Nabiollah

    Tumor cells have to travel through blood capillaries to be able to metastasize and colonize in distant organs. Among the numerous cells that are shed by the primary tumor, very few survive in circulation. In vivo studies have shown that tumor cells can undergo breakup at microcapillary junctions, affecting their survival. It is currently unclear what hydrodynamic and biomechanical factors contribute to fragmentation and, moreover, how the breakup dynamics of highly and weakly metastatic cells differ. In this study, we use microfluidics to investigate flow-induced breakup of prostate and breast cancer cells. We observe several different modes of breakup of cancer cells, which have striking similarities with breakup of viscous drops. We quantify the breakup time and find that highly metastatic cancer cells take longer to break up than weakly metastatic cells, suggesting that tumor cells may dynamically modify their deformability to avoid fragmentation. We also identify the role that the cytoskeleton and membrane play in the breakup process. Our study highlights the important role that tumor cell fragmentation plays in cancer metastasis. Cancer Prevention and Research Institute of Texas.

  19. Fracture, failure, and fragmentation

    SciTech Connect

    Dienes, J.K.

    1984-01-01

    Though continuum descriptions of material behavior are useful for many kinds of problems, particularly those involving plastic flow, a more general approach is required when the failure is likely to involve growth and coalescence of a large number of fractures, as in fragmentation. Failures of this kind appear frequently in rapid dynamic processes such as those resulting from impacts and explosions, particularly in the formation of spall fragments. In the first part of this paper an approach to formulating constitutive relations that accounts for the opening, shear and growth of an ensemble of cracks is discussed. The approach also accounts for plastic flow accompanying fragmentation. The resulting constitutive relations have been incorporated into a Lagrangean computer program. In the second part of this paper a theoretical approach to coalescence is described. The simplest formulation makes use of a linear Liouville equation, with crack growth limited by the mean free path of cracks, assumed constant. This approach allows for an anisotropic distribution of cracks. An alternative approach is also described in which the decrease of the mean free path with increasing crack size is accounted for, but the crack distribution is assumed isotropic. A reduction of the governing Liouville equation to an ordinary differential equation of third order is possible, and the result can be used to determine how mean-free-path decreases with increasing crack size.

  20. Utility of Metagenomic Next-Generation Sequencing for Characterization of HIV and Human Pegivirus Diversity

    PubMed Central

    Naccache, Samia N.; Kabre, Beniwende; Federman, Scot; Mbanya, Dora; Kaptué, Lazare; Chiu, Charles Y.; Brennan, Catherine A.; Hackett, John

    2015-01-01

    Given the dynamic changes in HIV-1 complexity and diversity, next-generation sequencing (NGS) has the potential to revolutionize strategies for effective HIV global surveillance. In this study, we explore the utility of metagenomic NGS to characterize divergent strains of HIV-1 and to simultaneously screen for other co-infecting viruses. Thirty-five HIV-1-infected Cameroonian blood donor specimens with viral loads of >4.4 log10 copies/ml were selected to include a diverse representation of group M strains. Random-primed NGS libraries, prepared from plasma specimens, resulted in greater than 90% genome coverage for 88% of specimens. Correct subtype designations based on NGS were concordant with sub-region PCR data in 31 of 35 (89%) cases. Complete genomes were assembled for 25 strains, including circulating recombinant forms with relatively limited data available (7 CRF11_cpx, 2 CRF13_cpx, 1 CRF18_cpx, and 1 CRF37_cpx), as well as 9 unique recombinant forms. HPgV (formerly designated GBV-C) co-infection was detected in 9 of 35 (25%) specimens, of which eight specimens yielded complete genomes. The recovered HPgV genomes formed a diverse cluster with genotype 1 sequences previously reported from Ghana, Uganda, and Japan. The extensive genome coverage obtained by NGS improved accuracy and confidence in phylogenetic classification of the HIV-1 strains present in the study population relative to conventional sub-region PCR. In addition, these data demonstrate the potential for metagenomic analysis to be used for routine characterization of HIV-1 and identification of other viral co-infections. PMID:26599538

  1. Diversity of Virophages in Metagenomic Data Sets

    PubMed Central

    Zhou, Jinglie; Zhang, Weijia; Yan, Shuling; Xiao, Jinzhou; Zhang, Yuanyuan; Li, Bailin; Pan, Yingjie

    2013-01-01

    Virophages, e.g., Sputnik, Mavirus, and Organic Lake virophage (OLV), are unusual parasites of giant double-stranded DNA (dsDNA) viruses, yet little is known about their diversity. Here, we describe the global distribution, abundance, and genetic diversity of virophages based on analyzing and mapping comprehensive metagenomic databases. The results reveal a distinct abundance and worldwide distribution of virophages, involving almost all geographical zones and a variety of unique environments. These environments ranged from deep ocean to inland, iced to hydrothermal lakes, and human gut- to animal-associated habitats. Four complete virophage genomic sequences (Yellowstone Lake virophages [YSLVs]) were obtained, as was one nearly complete sequence (Ace Lake Mavirus [ALM]). The genomes obtained were 27,849 bp long with 26 predicted open reading frames (ORFs) (YSLV1), 23,184 bp with 21 ORFs (YSLV2), 27,050 bp with 23 ORFs (YSLV3), 28,306 bp with 34 ORFs (YSLV4), and 17,767 bp with 22 ORFs (ALM). The homologous counterparts of five genes, including putative FtsK-HerA family DNA packaging ATPase and genes encoding DNA helicase/primase, cysteine protease, major capsid protein (MCP), and minor capsid protein (mCP), were present in all virophages studied thus far. They also shared a conserved gene cluster comprising the two core genes of MCP and mCP. Comparative genomic and phylogenetic analyses showed that YSLVs, having a closer relationship to each other than to the other virophages, were more closely related to OLV than to Sputnik but distantly related to Mavirus and ALM. These findings indicate that virophages appear to be widespread and genetically diverse, with at least 3 major lineages. PMID:23408616

  2. Diversity of virophages in metagenomic data sets.

    PubMed

    Zhou, Jinglie; Zhang, Weijia; Yan, Shuling; Xiao, Jinzhou; Zhang, Yuanyuan; Li, Bailin; Pan, Yingjie; Wang, Yongjie

    2013-04-01

    Virophages, e.g., Sputnik, Mavirus, and Organic Lake virophage (OLV), are unusual parasites of giant double-stranded DNA (dsDNA) viruses, yet little is known about their diversity. Here, we describe the global distribution, abundance, and genetic diversity of virophages based on analyzing and mapping comprehensive metagenomic databases. The results reveal a distinct abundance and worldwide distribution of virophages, involving almost all geographical zones and a variety of unique environments. These environments ranged from deep ocean to inland, iced to hydrothermal lakes, and human gut- to animal-associated habitats. Four complete virophage genomic sequences (Yellowstone Lake virophages [YSLVs]) were obtained, as was one nearly complete sequence (Ace Lake Mavirus [ALM]). The genomes obtained were 27,849 bp long with 26 predicted open reading frames (ORFs) (YSLV1), 23,184 bp with 21 ORFs (YSLV2), 27,050 bp with 23 ORFs (YSLV3), 28,306 bp with 34 ORFs (YSLV4), and 17,767 bp with 22 ORFs (ALM). The homologous counterparts of five genes, including putative FtsK-HerA family DNA packaging ATPase and genes encoding DNA helicase/primase, cysteine protease, major capsid protein (MCP), and minor capsid protein (mCP), were present in all virophages studied thus far. They also shared a conserved gene cluster comprising the two core genes of MCP and mCP. Comparative genomic and phylogenetic analyses showed that YSLVs, having a closer relationship to each other than to the other virophages, were more closely related to OLV than to Sputnik but distantly related to Mavirus and ALM. These findings indicate that virophages appear to be widespread and genetically diverse, with at least 3 major lineages. PMID:23408616

  3. Virtual fragment preparation for computational fragment-based drug design.

    PubMed

    Ludington, Jennifer L

    2015-01-01

    Fragment-based drug design (FBDD) has become an important component of the drug discovery process. The use of fragments can accelerate both the search for a hit molecule and the development of that hit into a lead molecule for clinical testing. In addition to experimental methodologies for FBDD such as NMR and X-ray Crystallography screens, computational techniques are playing an increasingly important role. The success of the computational simulations is due in large part to how the database of virtual fragments is prepared. In order to prepare the fragments appropriately it is necessary to understand how FBDD differs from other approaches and the issues inherent in building up molecules from smaller fragment pieces. The ultimate goal of these calculations is to link two or more simulated fragments into a molecule that has an experimental binding affinity consistent with the additive predicted binding affinities of the virtual fragments. Computationally predicting binding affinities is a complex process, with many opportunities for introducing error. Therefore, care should be taken with the fragment preparation procedure to avoid introducing additional inaccuracies. This chapter is focused on the preparation process used to create a virtual fragment database. Several key issues of fragment preparation which affect the accuracy of binding affinity predictions are discussed. The first issue is the selection of the two-dimensional atomic structure of the virtual fragment. Although the particular usage of the fragment can affect this choice (i.e., whether the fragment will be used for calibration, binding site characterization, hit identification, or lead optimization), general factors such as synthetic accessibility, size, and flexibility are major considerations in selecting the 2D structure. Other aspects of preparing the virtual fragments for simulation are the generation of three-dimensional conformations and the assignment of the associated atomic point charges.

  4. Evaluating techniques for metagenome annotation using simulated sequence data.

    PubMed

    Randle-Boggis, Richard J; Helgason, Thorunn; Sapp, Melanie; Ashton, Peter D

    2016-07-01

    The advent of next-generation sequencing has allowed huge amounts of DNA sequence data to be produced, advancing the capabilities of microbial ecosystem studies. The current challenge is to identify from which microorganisms and genes the DNA originated. Several tools and databases are available for annotating DNA sequences. The tools, databases and parameters used can have a significant impact on the results: naïve choice of these factors can result in a false representation of community composition and function. We use a simulated metagenome to show how different parameters affect annotation accuracy by evaluating the sequence annotation performances of MEGAN, MG-RAST, One Codex and Megablast. This simulated metagenome allowed the recovery of known organism and function abundances to be quantitatively evaluated, which is not possible for environmental metagenomes. The performance of each program and database varied, e.g. One Codex correctly annotated many sequences at the genus level, whereas MG-RAST RefSeq produced many false positive annotations. This effect decreased as the taxonomic level investigated increased. Selecting more stringent parameters decreases the annotation sensitivity, but increases precision. Ultimately, there is a trade-off between taxonomic resolution and annotation accuracy. These results should be considered when annotating metagenomes and interpreting results from previous studies. PMID:27162180
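
    The trade-off between sensitivity and precision described above can be illustrated with a small calculation. This is a minimal sketch with hypothetical counts, not data from the study; with a simulated metagenome the true origin of every read is known, so true positives, false positives and false negatives can be counted directly.

```python
def precision_and_sensitivity(true_pos, false_pos, false_neg):
    """Return (precision, sensitivity) for a set of annotation calls.

    precision   = TP / (TP + FP): fraction of annotations that are correct
    sensitivity = TP / (TP + FN): fraction of known sequences recovered
    """
    called = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / called if called else 0.0
    sensitivity = true_pos / actual if actual else 0.0
    return precision, sensitivity


# Hypothetical counts for 1,000 simulated reads under two parameter settings.
lenient = precision_and_sensitivity(true_pos=900, false_pos=300, false_neg=100)
strict = precision_and_sensitivity(true_pos=700, false_pos=50, false_neg=300)
```

    In this made-up example the stricter setting recovers fewer reads (lower sensitivity) but a larger share of its annotations are correct (higher precision), which is the pattern the abstract reports.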

  5. Metagenomes provide valuable comparative information on soil microeukaryotes.

    PubMed

    Jacquiod, Samuel; Stenbæk, Jonas; Santos, Susana S; Winding, Anne; Sørensen, Søren J; Priemé, Anders

    2016-06-01

    Despite the critical ecological roles of microeukaryotes in terrestrial ecosystems, most descriptive studies of soil microbes published so far focused only on specific groups. Meanwhile, the fast development of metagenome sequencing leads to considerable data accumulation in public repositories, providing microbiologists with substantial amounts of accessible information. We took advantage of public metagenomes in order to investigate microeukaryote communities in a well characterized grassland soil. The data gathered allowed the evaluation of several factors impacting the community structure, including the DNA extraction method, the database choice and also the annotation procedure. While most studies on soil microeukaryotes are based on sequencing of PCR-amplified taxonomic markers (18S rRNA genes, ITS regions), this work represents, to our knowledge, the first report based solely on metagenomic microeukaryote DNA. Choosing the correct annotation procedure and reference database has proven to be crucial, as it considerably limits the risk of wrong assignments. In addition, a significant and pronounced effect of the DNA extraction method on the taxonomical structure of soil microeukaryotes has been identified. Our analyses suggest that publicly available metagenome data can provide valuable information on soil microeukaryotes for comparative purposes when handled appropriately, complementing the current view provided by ribosomal amplicon sequencing methods. PMID:27020245

  6. Two Metagenomes from Late Pleistocene Northeast Siberian Permafrost

    PubMed Central

    Krivushin, Kirill; Kondrashov, Fyodor; Shmakova, Lyubov; Tutukina, Mariya; Petrovskaya, Lada

    2015-01-01

    The present study reports metagenomic shotgun sequencing of microbial communities of two ancient permafrost horizons of the Russian Arctic. Results demonstrate a significant difference in microbial community structure of the analyzed samples in general and microorganisms of the methane cycle in particular. PMID:25555741

  7. Two metagenomes from late pleistocene northeast siberian permafrost.

    PubMed

    Krivushin, Kirill; Kondrashov, Fyodor; Shmakova, Lyubov; Tutukina, Mariya; Petrovskaya, Lada; Rivkina, Elizaveta

    2015-01-01

    The present study reports metagenomic shotgun sequencing of microbial communities of two ancient permafrost horizons of the Russian Arctic. Results demonstrate a significant difference in microbial community structure of the analyzed samples in general and microorganisms of the methane cycle in particular. PMID:25555741

  8. Metagenome sequencing of prokaryotic microbiota collected from Byron Glacier, Alaska.

    PubMed

    Choudhari, Sulbha; Smith, Sean; Owens, Sarah; Gilbert, Jack A; Shain, Daniel H; Dial, Roman J; Grigoriev, Andrey

    2013-03-21

    Cold environments, such as glaciers, are large reservoirs of microbial life. The present study employed 16S rRNA gene amplicon metagenomic sequencing to survey the prokaryotic microbiota on Alaskan glacial ice, revealing a rich and diverse microbial community of some 2,500 species of bacteria and archaea.

  9. Substrate Type Determines Metagenomic Profiles from Diverse Chemical Habitats

    PubMed Central

    Jeffries, Thomas C.; Seymour, Justin R.; Gilbert, Jack A.; Dinsdale, Elizabeth A.; Newton, Kelly; Leterme, Sophie S. C.; Roudnew, Ben; Smith, Renee J.; Seuront, Laurent; Mitchell, James G.

    2011-01-01

    Environmental parameters drive phenotypic and genotypic frequency variations in microbial communities and thus control the extent and structure of microbial diversity. We tested the extent to which microbial community composition changes are controlled by shifting physiochemical properties within a hypersaline lagoon. We sequenced four sediment metagenomes from the Coorong, South Australia from samples which varied in salinity by 99 Practical Salinity Units (PSU), an order of magnitude in ammonia concentration and two orders of magnitude in microbial abundance. Despite the marked divergence in environmental parameters observed between samples, hierarchical clustering of taxonomic and metabolic profiles of these metagenomes showed striking similarity between the samples (>89%). Comparison of these profiles to those derived from a wide variety of publicly available datasets demonstrated that the Coorong sediment metagenomes were similar to other sediment, soil, biofilm and microbial mat samples regardless of salinity (>85% similarity). Overall, clustering of solid substrate and water metagenomes into discrete similarity groups based on functional potential indicated that the dichotomy between water and solid matrices is a fundamental determinant of community microbial metabolism that is not masked by salinity, nutrient concentration or microbial abundance. PMID:21966446

  10. Metagenomic Analyses of Drinking Water Receiving Different Disinfection Treatments

    EPA Science Inventory

    A metagenome-based approach was utilized for assessing the taxonomic affiliation and function potential of microbial populations in free chlorine (CHL) and monochloramine (CHM) treated drinking water (DW). A total of 1,024,242 (averaging 544 bp) and 849,349 (averaging 554 bp) ...

  11. Metagenome Sequencing of the Greater Kudu (Tragelaphus strepsiceros) Rumen Microbiome.

    PubMed

    Dube, Anita N; Moyo, Freeman; Dhlamini, Zephaniah

    2015-01-01

    Ruminant herbivores utilize a symbiotic relationship with microorganisms in their rumen to exploit fibrous foods for nutrition. We report the metagenome sequences of the greater kudu (Tragelaphus strepsiceros) rumen digesta, revealing a diverse community of microbes and some novel hydrolytic enzymes.

  12. MetaGenomic Assembly by Merging (MeGAMerge)

    SciTech Connect

    Scholz, Matthew B.; Lo, Chien-Chi

    2015-08-03

    "MetaGenomic Assembly by Merging" (MeGAMerge) is a novel method for merging multiple genomic assemblies or long-read data sources for assembly. It applies internal trimming/filtering of the data, then uses two third-party tools to merge the data by overlap-based assembly.

  13. Evaluating techniques for metagenome annotation using simulated sequence data

    PubMed Central

    Randle-Boggis, Richard J.; Helgason, Thorunn; Sapp, Melanie; Ashton, Peter D.

    2016-01-01

    The advent of next-generation sequencing has allowed huge amounts of DNA sequence data to be produced, advancing the capabilities of microbial ecosystem studies. The current challenge is to identify from which microorganisms and genes the DNA originated. Several tools and databases are available for annotating DNA sequences. The tools, databases and parameters used can have a significant impact on the results: naïve choice of these factors can result in a false representation of community composition and function. We use a simulated metagenome to show how different parameters affect annotation accuracy by evaluating the sequence annotation performances of MEGAN, MG-RAST, One Codex and Megablast. This simulated metagenome allowed the recovery of known organism and function abundances to be quantitatively evaluated, which is not possible for environmental metagenomes. The performance of each program and database varied, e.g. One Codex correctly annotated many sequences at the genus level, whereas MG-RAST RefSeq produced many false positive annotations. This effect decreased as the taxonomic level investigated increased. Selecting more stringent parameters decreases the annotation sensitivity, but increases precision. Ultimately, there is a trade-off between taxonomic resolution and annotation accuracy. These results should be considered when annotating metagenomes and interpreting results from previous studies. PMID:27162180

  14. Marine Metagenome as A Resource for Novel Enzymes

    PubMed Central

    Alma’abadi, Amani D.; Gojobori, Takashi; Mineta, Katsuhiko

    2015-01-01

    More than 99% of identified prokaryotes, including many from the marine environment, cannot be cultured in the laboratory. This lack of capability restricts our knowledge of microbial genetics and community ecology. Metagenomics, the culture-independent cloning of environmental DNAs that are isolated directly from an environmental sample, has already provided a wealth of information about the uncultured microbial world. It has also facilitated the discovery of novel biocatalysts by allowing researchers to probe directly into a huge diversity of enzymes within natural microbial communities. Recent advances in these studies have led to a great interest in recruiting microbial enzymes for the development of environmentally-friendly industry. Although the metagenomics approach has many limitations, it is expected to provide not only scientific insights but also economic benefits, especially in industry. This review highlights the importance of metagenomics in mining microbial lipases, as an example, by using high-throughput techniques. In addition, we discuss challenges in the metagenomics as an important part of bioinformatics analysis in big data. PMID:26563467

  15. Assessment of quality control approaches for metagenomic data analysis

    NASA Astrophysics Data System (ADS)

    Zhou, Qian; Su, Xiaoquan; Ning, Kang

    2014-11-01

    Currently there is an explosive increase of the next-generation sequencing (NGS) projects and related datasets, which have to be processed by Quality Control (QC) procedures before they could be utilized for omics analysis. QC procedure usually includes identification and filtration of sequencing artifacts such as low-quality reads and contaminating reads, which would significantly affect and sometimes mislead downstream analysis. Quality control of NGS data for microbial communities is especially challenging. In this work, we have evaluated and compared the performance and effects of various QC pipelines on different types of metagenomic NGS data and from different angles, based on which general principles of using QC pipelines were proposed. Results based on both simulated and real metagenomic datasets have shown that: firstly, QC-Chain is superior in its ability for contamination identification for metagenomic NGS datasets with different complexities with high sensitivity and specificity. Secondly, the high performance computing engine enabled QC-Chain to achieve a significant reduction in processing time compared to other pipelines based on serial computing. Thirdly, QC-Chain could outperform other tools in benefiting downstream metagenomic data analysis.

  16. Metagenome Sequencing of the Greater Kudu (Tragelaphus strepsiceros) Rumen Microbiome

    PubMed Central

    Dube, Anita N.; Moyo, Freeman

    2015-01-01

    Ruminant herbivores utilize a symbiotic relationship with microorganisms in their rumen to exploit fibrous foods for nutrition. We report the metagenome sequences of the greater kudu (Tragelaphus strepsiceros) rumen digesta, revealing a diverse community of microbes and some novel hydrolytic enzymes. PMID:26272573

  17. Metagenomic gene annotation by a homology-independent approach

    SciTech Connect

    Froula, Jeff; Zhang, Tao; Salmeen, Annette; Hess, Matthias; Kerfeld, Cheryl A.; Wang, Zhong; Du, Changbin

    2011-06-02

    Fully understanding the genetic potential of a microbial community requires functional annotation of all the genes it encodes. The recently developed deep metagenome sequencing approach has enabled rapid identification of millions of genes from a complex microbial community without cultivation. Current homology-based gene annotation fails to detect distantly-related or structural homologs. Furthermore, homology searches with millions of genes are very computationally intensive. To overcome these limitations, we developed rhModeller, a homology-independent software pipeline to efficiently annotate genes from metagenomic sequencing projects. Using cellulases and carbonic anhydrases as two independent test cases, we demonstrated that rhModeller is much faster than HMMER but with comparable accuracy, at 94.5% and 99.9% accuracy, respectively. More importantly, rhModeller has the ability to detect novel proteins that do not share significant homology to any known protein families. As ~50% of the 2 million genes derived from the cow rumen metagenome failed to be annotated based on sequence homology, we tested whether rhModeller could be used to annotate these genes. Preliminary results suggest that rhModeller is robust in the presence of missense and frameshift mutations, two common errors in metagenomic genes. Applying the pipeline to the cow rumen genes identified 4,990 novel cellulase candidates and 8,196 novel carbonic anhydrase candidates. In summary, we expect rhModeller to dramatically increase the speed and quality of metagenomic gene annotation.

  18. Metagenome Sequencing of the Greater Kudu (Tragelaphus strepsiceros) Rumen Microbiome.

    PubMed

    Dube, Anita N; Moyo, Freeman; Dhlamini, Zephaniah

    2015-01-01

    Ruminant herbivores utilize a symbiotic relationship with microorganisms in their rumen to exploit fibrous foods for nutrition. We report the metagenome sequences of the greater kudu (Tragelaphus strepsiceros) rumen digesta, revealing a diverse community of microbes and some novel hydrolytic enzymes. PMID:26272573

  19. Scaling metagenome sequence assembly with probabilistic de Bruijn graphs

    PubMed Central

    Pell, Jason; Hintze, Arend; Canino-Koning, Rosangela; Howe, Adina; Tiedje, James M.; Brown, C. Titus

    2012-01-01

    Deep sequencing has enabled the investigation of a wide range of environmental microbial ecosystems, but the high memory requirements for de novo assembly of short-read shotgun sequencing data from these complex populations are an increasingly large practical barrier. Here we introduce a memory-efficient graph representation with which we can analyze the k-mer connectivity of metagenomic samples. The graph representation is based on a probabilistic data structure, a Bloom filter, that allows us to efficiently store assembly graphs in as little as 4 bits per k-mer, albeit inexactly. We show that this data structure accurately represents DNA assembly graphs in low memory. We apply this data structure to the problem of partitioning assembly graphs into components as a prelude to assembly, and show that this reduces the overall memory requirements for de novo assembly of metagenomes. On one soil metagenome assembly, this approach achieves a nearly 40-fold decrease in the maximum memory requirements for assembly. This probabilistic graph representation is a significant theoretical advance in storing assembly graphs and also yields immediate leverage on metagenomic assembly. PMID:22847406
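
    The idea of storing an assembly graph in a Bloom filter can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the class and function names, the SHA-256-based hashing and the filter size are all assumptions made for the example. Only k-mers (nodes) are stored; edges are implicit, recovered by asking whether each of the four possible one-base extensions is present, and a query may occasionally return a false positive.

```python
import hashlib


class KmerBloomFilter:
    """A fixed-size bit array queried through several hash functions."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, kmer):
        # Derive num_hashes bit positions by salting one hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer):
        # True for every added k-mer; may also be true for an unseen one.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(kmer)
        )


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def right_neighbors(bloom, kmer):
    """Yield k-mers overlapping kmer by k-1 bases that test positive."""
    for base in "ACGT":
        candidate = kmer[1:] + base
        if candidate in bloom:
            yield candidate


bloom = KmerBloomFilter(num_bits=4096, num_hashes=3)
read = "ACGTTGCA"
for km in kmers(read, 4):
    bloom.add(km)
```

    Traversing the graph then amounts to repeated calls to `right_neighbors` (and a mirrored left-hand version). The memory cost is set by `num_bits`, independent of k, which is what makes the few-bits-per-k-mer figure quoted above possible at the price of some false connectivity.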

  20. Marine Metagenome as A Resource for Novel Enzymes.

    PubMed

    Alma'abadi, Amani D; Gojobori, Takashi; Mineta, Katsuhiko

    2015-10-01

    More than 99% of identified prokaryotes, including many from the marine environment, cannot be cultured in the laboratory. This lack of capability restricts our knowledge of microbial genetics and community ecology. Metagenomics, the culture-independent cloning of environmental DNAs that are isolated directly from an environmental sample, has already provided a wealth of information about the uncultured microbial world. It has also facilitated the discovery of novel biocatalysts by allowing researchers to probe directly into a huge diversity of enzymes within natural microbial communities. Recent advances in these studies have led to a great interest in recruiting microbial enzymes for the development of environmentally-friendly industry. Although the metagenomics approach has many limitations, it is expected to provide not only scientific insights but also economic benefits, especially in industry. This review highlights the importance of metagenomics in mining microbial lipases, as an example, by using high-throughput techniques. In addition, we discuss challenges in the metagenomics as an important part of bioinformatics analysis in big data. PMID:26563467

  1. Metagenomics and other Methods for Measuring Antibiotic Resistance in Agroecosystems

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: There is broad concern regarding antibiotic resistance on farms and in fields, however there is no standard method for defining or measuring antibiotic resistance in environmental samples. Methods: We used metagenomic, culture-based, and molecular methods to characterize the amount, t...

  2. The effects of variable sample biomass on comparative metagenomics.

    PubMed

    Chafee, Meghan; Maignien, Loïs; Simmons, Sheri L

    2015-07-01

    Longitudinal studies that integrate samples with variable biomass are essential to understand microbial community dynamics across space or time. Shotgun metagenomics is widely used to investigate these communities at the functional level, but little is known about the effects of combining low and high biomass samples on downstream analysis. We investigated the interacting effects of DNA input and library amplification by polymerase chain reaction on comparative metagenomic analysis using dilutions of a single complex template from an Arabidopsis thaliana-associated microbial community. We modified the Illumina Nextera kit to generate high-quality large-insert (680 bp) paired-end libraries using a range of 50 pg to 50 ng of input DNA. Using assembly-based metagenomic analysis, we demonstrate that DNA input level has a significant impact on community structure due to overrepresentation of low-GC genomic regions following library amplification. In our system, these differences were largely superseded by variations between biological replicates, but our results advocate verifying the influence of library amplification on a case-by-case basis. Overall, this study provides recommendations for quality filtering and de-replication prior to analysis, as well as a practical framework to address the issue of low biomass or biomass heterogeneity in longitudinal metagenomic surveys.
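
    Checking whether low-GC regions are overrepresented starts with measuring GC content along a sequence. The following is a minimal sketch with hypothetical function names and a toy sequence; it does not reproduce the study's assembly-based analysis.

```python
def gc_fraction(seq):
    """Return the fraction of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0  # empty or fully ambiguous (e.g. all N) sequence
    return (seq.count("G") + seq.count("C")) / counted


def windowed_gc(seq, window, step):
    """Return the GC fraction of each full window along seq."""
    return [
        gc_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


# Toy contig: an AT-rich half followed by a GC-rich half.
profile = windowed_gc("AATTGGCC", window=4, step=4)
```

    Comparing read coverage against such a per-window GC profile across libraries built from different DNA input amounts is one way to spot the amplification bias the abstract describes.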

  3. Identification of a novel coronavirus from guinea fowl using metagenomics.

    PubMed

    Ducatez, Mariette F; Guérin, Jean-Luc

    2015-01-01

    While classical virology techniques such as virus culture, electron microscopy, or classical PCR had been unsuccessful in identifying the causative agent responsible for the fulminating disease of guinea fowl, we identified a novel avian gammacoronavirus associated with the disease using metagenomics. Next-generation sequencing is an unbiased approach that allows the sequencing of virtually all the genetic material present in a given sample.

  4. Aquatic metagenomes implicate Thaumarchaeota in global cobalamin production

    PubMed Central

    Doxey, Andrew C; Kurtz, Daniel A; Lynch, Michael DJ; Sauder, Laura A; Neufeld, Josh D

    2015-01-01

    Cobalamin (vitamin B12) is a complex metabolite and essential cofactor required by many branches of life, including most eukaryotic phytoplankton. Algae and other cobalamin auxotrophs rely on environmental cobalamin supplied from a relatively small set of cobalamin-producing prokaryotic taxa. Although several Bacteria have been implicated in cobalamin biosynthesis and associated with algal symbiosis, the involvement of Archaea in cobalamin production is poorly understood, especially with respect to the Thaumarchaeota. Based on the detection of cobalamin synthesis genes in available thaumarchaeotal genomes, we hypothesized that Thaumarchaeota, which are ubiquitous and abundant in aquatic environments, have an important role in cobalamin biosynthesis within global aquatic ecosystems. To test this hypothesis, we examined cobalamin synthesis genes across sequenced thaumarchaeotal genomes and 430 metagenomes from a diverse range of marine, freshwater and hypersaline environments. Our analysis demonstrates that all available thaumarchaeotal genomes possess cobalamin synthesis genes, predominantly from the anaerobic pathway, suggesting widespread genetic capacity for cobalamin synthesis. Furthermore, although bacterial cobalamin genes dominated most surface marine metagenomes, thaumarchaeotal cobalamin genes dominated metagenomes from polar marine environments, increased with depth in marine water columns, and displayed seasonality, with increased winter abundance observed in time-series datasets (e.g., L4 surface water in the English Channel). Our results also suggest niche partitioning between thaumarchaeotal and cyanobacterial ribosomal and cobalamin synthesis genes across all metagenomic datasets analyzed. These results provide strong evidence for specific biogeographical distributions of thaumarchaeotal cobalamin genes, expanding our understanding of the global biogeochemical roles played by Thaumarchaeota in aquatic environments. PMID:25126756

  5. New Scalings in Nuclear Fragmentation

    SciTech Connect

    Bonnet, E.; Bougault, R.; Galichet, E.; Gagnon-Moisan, F.; Guinet, D.; Lautesse, P.; Marini, P.; Parlog, M.

    2010-10-01

    Fragment partitions of fragmenting hot nuclei produced in central and semiperipheral collisions have been compared in the excitation energy region 4-10 MeV per nucleon where radial collective expansion takes place. It is shown that, for a given total excitation energy per nucleon, the amount of radial collective energy fixes the mean fragment multiplicity. It is also shown that, at a given total excitation energy per nucleon, the different properties of fragment partitions are completely determined by the reduced fragment multiplicity (i.e., normalized to the source size). Freeze-out volumes seem to play a role in the scalings observed.

  6. Fragmentation function measurements at Belle

    SciTech Connect

    Seidl, Ralf; Vossen, Anselm; Leitgab, Martin; Grosse-Perdekamp, Matthias; Giordano, Francesca; Ogawa, Akio

    2011-12-14

    The precision measurement of fragmentation functions is an important requirement to study the spin structure of the nucleon. Unpolarized fragmentation functions at reasonably low scale and high fractional energy are necessary to complement the measurements mostly performed at LEP in order to obtain high enough precision for measurements at semi-inclusive DIS experiments and at RHIC. Those can be obtained from the abundant data collected with the Belle detector at the e⁺e⁻ collider KEKB. In addition one can cleanly measure the transversely polarized fragmentation functions such as the Collins fragmentation function and the interference fragmentation functions. Both have been obtained with great precision at Belle.

  7. Metagenomic Sequencing of the Chronic Obstructive Pulmonary Disease Upper Bronchial Tract Microbiome Reveals Functional Changes Associated with Disease Severity.

    PubMed

    Cameron, Simon J S; Lewis, Keir E; Huws, Sharon A; Lin, Wanchang; Hegarty, Matthew J; Lewis, Paul D; Mur, Luis A J; Pachebat, Justin A

    2016-01-01

    Chronic Obstructive Pulmonary Disease (COPD) is a major source of mortality and morbidity worldwide. The microbiome associated with this disease may be an important component of the disease, though studies to date have been based on sequencing of the 16S rRNA gene and have revealed equivocal results. Here, we employed metagenomic sequencing of the upper bronchial tract (UBT) microbiome to allow for greater elucidation of its taxonomic composition and to reveal functional changes associated with the disease. The bacterial metagenomes within sputum samples from eight COPD patients and ten 'healthy' smokers (Controls) were sequenced, and suggested significant changes in the abundance of bacterial species, particularly within the Streptococcus genus. The functional capacity of the COPD UBT microbiome indicated an increased capacity for bacterial growth, which could be an important feature in bacterial-associated acute exacerbations. Regression analyses correlated COPD severity (FEV1% of predicted) with differences in the abundance of Streptococcus pneumoniae and functional classifications related to a reduced capacity for bacterial sialic acid metabolism. This study suggests that the COPD UBT microbiome could be used in patient risk stratification and in identifying novel monitoring and treatment methods, but study of a longitudinal cohort will be required to unequivocally relate these features of the microbiome with COPD severity. PMID:26872143

  8. Metagenomic Sequencing of the Chronic Obstructive Pulmonary Disease Upper Bronchial Tract Microbiome Reveals Functional Changes Associated with Disease Severity

    PubMed Central

    Cameron, Simon J. S.; Lewis, Keir E.; Huws, Sharon A.; Lin, Wanchang; Hegarty, Matthew J.; Lewis, Paul D.; Mur, Luis A. J.; Pachebat, Justin A.

    2016-01-01

    Chronic Obstructive Pulmonary Disease (COPD) is a major source of mortality and morbidity worldwide. The microbiome associated with this disease may be an important component of the disease, though studies to date have been based on sequencing of the 16S rRNA gene and have revealed equivocal results. Here, we employed metagenomic sequencing of the upper bronchial tract (UBT) microbiome to allow for greater elucidation of its taxonomic composition and to reveal functional changes associated with the disease. The bacterial metagenomes within sputum samples from eight COPD patients and ten ‘healthy’ smokers (Controls) were sequenced, and suggested significant changes in the abundance of bacterial species, particularly within the Streptococcus genus. The functional capacity of the COPD UBT microbiome indicated an increased capacity for bacterial growth, which could be an important feature in bacterial-associated acute exacerbations. Regression analyses correlated COPD severity (FEV1% of predicted) with differences in the abundance of Streptococcus pneumoniae and functional classifications related to a reduced capacity for bacterial sialic acid metabolism. This study suggests that the COPD UBT microbiome could be used in patient risk stratification and in identifying novel monitoring and treatment methods, but study of a longitudinal cohort will be required to unequivocally relate these features of the microbiome with COPD severity. PMID:26872143

  9. Dietary history contributes to enterotype-like clustering and functional metagenomic content in the intestinal microbiome of wild mice.

    PubMed

    Wang, Jun; Linnenbrink, Miriam; Künzel, Sven; Fernandes, Ricardo; Nadeau, Marie-Josée; Rosenstiel, Philip; Baines, John F

    2014-07-01

    Understanding the origins of gut microbial community structure is critical for the identification and interpretation of potential fitness-related traits for the host. The presence of community clusters characterized by differences in the abundance of signature taxa, referred to as enterotypes, is a debated concept first reported in humans and later extended to other mammalian hosts. In this study, we provide a thorough assessment of their existence in wild house mice using a panel of evaluation criteria. We identify support for two clusters that are compositionally similar to clusters identified in humans, chimpanzees, and laboratory mice, characterized by differences in Bacteroides, Robinsoniella, and unclassified genera belonging to the family Lachnospiraceae. To further evaluate these clusters, we (i) monitored community changes associated with moving mice from the natural to a laboratory environment, (ii) performed functional metagenomic sequencing, and (iii) subjected wild-caught samples to stable isotope analysis to reconstruct dietary patterns. This process reveals differences in the proportions of genes involved in carbohydrate versus protein metabolism in the functional metagenome, as well as differences in plant- versus meat-derived food sources between clusters. In conjunction with wild-caught mice quickly changing their enterotype classification upon transfer to a standard laboratory chow diet, these results provide strong evidence that dietary history contributes to the presence of enterotype-like clustering in wild mice.

  11. Remote Sensing Information Classification

    NASA Technical Reports Server (NTRS)

    Rickman, Douglas L.

    2008-01-01

    This viewgraph presentation reviews the classification of Remote Sensing data in relation to epidemiology. Classification is a way to reduce the dimensionality and precision to something a human can understand. Classification changes SCALAR data into NOMINAL data.
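
    The scalar-to-nominal step described above can be sketched in Python as follows. This is a minimal illustration and is not taken from the presentation: the thresholds, the class labels, and the function names are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical class boundaries and labels, chosen only for illustration.
BOUNDARIES = [0.2, 0.5]                # upper edges of the first two classes
LABELS = ["bare", "sparse", "dense"]   # one more label than there are boundaries


def classify(value, boundaries=BOUNDARIES, labels=LABELS):
    """Map a continuous (scalar) measurement to a nominal class label."""
    # bisect_right counts how many boundaries are <= value,
    # which is exactly the index of the class the value falls into.
    return labels[bisect_right(boundaries, value)]


def classify_all(values):
    """Classify a sequence of scalar measurements."""
    return [classify(v) for v in values]


if __name__ == "__main__":
    print(classify_all([0.05, 0.2, 0.49, 0.9]))
```

    In this sketch a value lying exactly on a boundary falls into the higher class; the numeric detail of each measurement is discarded once the label is assigned, which is the loss of precision the abstract refers to.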

  12. Classification and knowledge

    NASA Technical Reports Server (NTRS)

    Kurtz, Michael J.

    1989-01-01

    Automated procedures to classify objects are discussed. The classification problem is reviewed, and the relation of epistemology and classification is considered. The classification of stellar spectra and of resolved images of galaxies is addressed.

  13. Fragment oriented molecular shapes.

    PubMed

    Hain, Ethan; Camacho, Carlos J; Koes, David Ryan

    2016-05-01

    Molecular shape is an important concept in drug design and virtual screening. Shape similarity typically uses either alignment methods, which dynamically optimize molecular poses with respect to the query molecular shape, or feature vector methods, which are computationally less demanding but less accurate. The computational cost of alignment can be reduced by pre-aligning shapes, as is done with the Volumetric-Aligned Molecular Shapes (VAMS) method. Here, we introduce and evaluate fragment oriented molecular shapes (FOMS), where shapes are aligned based on molecular fragments. FOMS enables the use of shape constraints, a novel method for precisely specifying molecular shape queries that provides the ability to perform partial shape matching and supports search algorithms that function on an interactive time scale. When evaluated using the challenging Maximum Unbiased Validation dataset, shape constraints were able to extract significantly enriched subsets of compounds for the majority of targets, and FOMS matched or exceeded the performance of both VAMS and an optimizing alignment method of shape similarity search. PMID:27085751

  14. Diagnosis of Bacterial Bloodstream Infections: A 16S Metagenomics Approach

    PubMed Central

    Van Puyvelde, Sandra; De Block, Tessa; Maltha, Jessica; Palpouguini, Lompo; Tahita, Marc; Tinto, Halidou; Jacobs, Jan; Deborggraeve, Stijn

    2016-01-01

    Background: Bacterial bloodstream infection (bBSI) is one of the leading causes of death in critically ill patients and accurate diagnosis is therefore crucial. We here report a 16S metagenomics approach for diagnosing and understanding bBSI. Methodology/Principal Findings: The proof-of-concept was delivered in 75 children (median age 15 months) with severe febrile illness in Burkina Faso. Standard blood culture and malaria testing were conducted at the time of hospital admission. 16S metagenomics testing was done retrospectively and in duplicate on the blood of all patients. Total DNA was extracted from the blood and the V3–V4 regions of the bacterial 16S rRNA genes were amplified by PCR and deep sequenced on an Illumina MiSeq sequencer. Paired reads were curated, taxonomically labeled, and filtered. Blood culture diagnosed bBSI in 12 patients, but this number increased to 22 patients when combining blood culture and 16S metagenomics results. In addition to superior sensitivity compared to standard blood culture, 16S metagenomics revealed important novel insights into the nature of bBSI. Patients with acute malaria or recovering from malaria had a 7-fold higher risk of presenting polymicrobial bloodstream infections compared to patients with no recent malaria diagnosis (p-value = 0.046). Malaria is known to affect epithelial gut function and may thus facilitate bacterial translocation from the intestinal lumen to the blood. Importantly, patients with such polymicrobial blood infections showed a 9-fold higher risk factor for not surviving their febrile illness (p-value = 0.030). Conclusions/Significance: Our data demonstrate that 16S metagenomics is a powerful approach for the diagnosis and understanding of bBSI. This proof-of-concept study also showed that appropriate control samples are crucial to detect background signals due to environmental contamination. PMID:26927306

  15. Metagenome Skimming of Insect Specimen Pools: Potential for Comparative Genomics.

    PubMed

    Linard, Benjamin; Crampton-Platt, Alex; Gillett, Conrad P D T; Timmermans, Martijn J T N; Vogler, Alfried P

    2015-06-01

    Metagenomic analyses are challenging in metazoans, but high-copy number and repeat regions can be assembled from low-coverage sequencing by "genome skimming," which is applied here as a new way of characterizing metagenomes obtained in an ecological or taxonomic context. Illumina shotgun sequencing data from two pools of Coleoptera (beetles) of approximately 200 species each were assembled into tens of thousands of scaffolds. Repeated low-coverage sequencing recovered similar scaffold sets consistently, although approximately 70% of scaffolds could not be identified against existing genome databases. Identifiable scaffolds included mitochondrial DNA, conserved sequences with hits to expressed sequence tag and protein databases, and known repeat elements of high and low complexity, including numerous copies of rRNA and histone genes. Assemblies of histones captured a diversity of gene order and primary sequence in Coleoptera. Scaffolds with similarity to multiple sites in available coleopteran genome sequences for Dendroctonus and Tribolium revealed high specificity of scaffolds to either of these genomes, in particular for high-copy number repeats. Numerous "clusters" of scaffolds mapped to the same genomic site revealed intra- and/or intergenomic variation within a metagenome pool. In addition to the effect of taxonomic composition of the metagenomes, the number of mapped scaffolds also revealed structural differences between the two reference genomes, although the significance of this striking finding remains unclear. Finally, apparently exogenous sequences were recovered, including potential food plants, fungal pathogens, and bacterial symbionts. The "metagenome skimming" approach is useful for capturing the genomic diversity of poorly studied, species-rich lineages and opens new prospects in environmental genomics. PMID:25979752

  16. gbtools: Interactive Visualization of Metagenome Bins in R.

    PubMed

    Seah, Brandon K B; Gruber-Vodicka, Harald R

    2015-01-01

    Improvements in DNA sequencing technology have increased the amount and quality of sequences that can be obtained from metagenomic samples, making it practical to extract individual microbial genomes from metagenomic assemblies ("binning"). However, while many tools and methods exist for unsupervised binning with various statistical algorithms, there are few options for visualizing the results, even though visualization is vital to exploratory data analysis. We have developed gbtools, a software package that allows users to visualize metagenomic assemblies by plotting coverage (sequencing depth) and GC values of contigs, and also to annotate the plots with taxonomic information. Different sets of annotations, including taxonomic assignments from conserved marker genes or SSU rRNA genes, can be imported simultaneously; users can choose which annotations to plot. Bins can be manually defined from plots, or be imported from third-party binning tools and overlaid onto plots, such that results from different methods can be compared side-by-side. gbtools reports summary statistics of bins including marker gene completeness, and allows the user to add bins to, or subtract bins from, each other. We illustrate some of the functions available in gbtools with two examples: the metagenome of Olavius algarvensis, a marine oligochaete worm that has up to five bacterial symbionts, and the metagenome of a synthetic mock community comprising 64 bacterial and archaeal strains. We show how instances of poor automated binning, sequencer GC% bias, and variation between samples can be quickly diagnosed by visualization, and demonstrate how the results from different binning tools can be combined and refined to yield manually curated bins with higher completeness. gbtools is open-source and written in R. The software package, documentation, and example data are available freely online at https://github.com/kbseah/genome-bin-tools. PMID:26732662
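
    The coverage-GC view described above rests on two numbers per contig. The Python sketch below shows one way such a table could be assembled before plotting. It is an illustration only and does not use or reproduce the gbtools API (gbtools itself is written in R); the function names and the input dictionaries are hypothetical, and coverage values are assumed to have been computed elsewhere, for example from read mapping.

```python
def gc_fraction(seq):
    """Return the G+C fraction of a sequence, ignoring ambiguous bases such as N."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0  # no unambiguous bases; avoid dividing by zero
    return (seq.count("G") + seq.count("C")) / acgt


def coverage_gc_table(contigs, coverage):
    """Build (name, length, GC fraction, coverage) rows, one per contig.

    contigs  -- dict mapping contig name to its sequence (hypothetical input)
    coverage -- dict mapping contig name to mean sequencing depth (hypothetical input)
    """
    rows = []
    for name, seq in contigs.items():
        rows.append((name, len(seq), round(gc_fraction(seq), 3), coverage[name]))
    return rows


if __name__ == "__main__":
    contigs = {"contig_1": "GGCC", "contig_2": "ATAT"}
    coverage = {"contig_1": 10.0, "contig_2": 2.5}
    for row in coverage_gc_table(contigs, coverage):
        print(row)
```

    Plotting the GC column against the coverage column, with point size scaled by contig length, gives the kind of scatter plot in which contigs from the same genome tend to cluster.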

  17. gbtools: Interactive Visualization of Metagenome Bins in R

    PubMed Central

    Seah, Brandon K. B.; Gruber-Vodicka, Harald R.

    2015-01-01

    Improvements in DNA sequencing technology have increased the amount and quality of sequences that can be obtained from metagenomic samples, making it practical to extract individual microbial genomes from metagenomic assemblies (“binning”). However, while many tools and methods exist for unsupervised binning with various statistical algorithms, there are few options for visualizing the results, even though visualization is vital to exploratory data analysis. We have developed gbtools, a software package that allows users to visualize metagenomic assemblies by plotting coverage (sequencing depth) and GC values of contigs, and also to annotate the plots with taxonomic information. Different sets of annotations, including taxonomic assignments from conserved marker genes or SSU rRNA genes, can be imported simultaneously; users can choose which annotations to plot. Bins can be manually defined from plots, or be imported from third-party binning tools and overlaid onto plots, such that results from different methods can be compared side-by-side. gbtools reports summary statistics of bins including marker gene completeness, and allows the user to add bins to, or subtract bins from, each other. We illustrate some of the functions available in gbtools with two examples: the metagenome of Olavius algarvensis, a marine oligochaete worm that has up to five bacterial symbionts, and the metagenome of a synthetic mock community comprising 64 bacterial and archaeal strains. We show how instances of poor automated binning, sequencer GC% bias, and variation between samples can be quickly diagnosed by visualization, and demonstrate how the results from different binning tools can be combined and refined to yield manually curated bins with higher completeness. gbtools is open-source and written in R. The software package, documentation, and example data are available freely online at https://github.com/kbseah/genome-bin-tools. PMID:26732662

  18. Metagenome Skimming of Insect Specimen Pools: Potential for Comparative Genomics.

    PubMed

    Linard, Benjamin; Crampton-Platt, Alex; Gillett, Conrad P D T; Timmermans, Martijn J T N; Vogler, Alfried P

    2015-05-14

    Metagenomic analyses are challenging in metazoans, but high-copy number and repeat regions can be assembled from low-coverage sequencing by "genome skimming," which is applied here as a new way of characterizing metagenomes obtained in an ecological or taxonomic context. Illumina shotgun sequencing data from two pools of Coleoptera (beetles) of approximately 200 species each were assembled into tens of thousands of scaffolds. Repeated low-coverage sequencing recovered similar scaffold sets consistently, although approximately 70% of scaffolds could not be identified against existing genome databases. Identifiable scaffolds included mitochondrial DNA, conserved sequences with hits to expressed sequence tag and protein databases, and known repeat elements of high and low complexity, including numerous copies of rRNA and histone genes. Assemblies of histones captured a diversity of gene order and primary sequence in Coleoptera. Scaffolds with similarity to multiple sites in available coleopteran genome sequences for Dendroctonus and Tribolium revealed high specificity of scaffolds to either of these genomes, in particular for high-copy number repeats. Numerous "clusters" of scaffolds mapped to the same genomic site revealed intra- and/or intergenomic variation within a metagenome pool. In addition to the effect of taxonomic composition of the metagenomes, the number of mapped scaffolds also revealed structural differences between the two reference genomes, although the significance of this striking finding remains unclear. Finally, apparently exogenous sequences were recovered, including potential food plants, fungal pathogens, and bacterial symbionts. The "metagenome skimming" approach is useful for capturing the genomic diversity of poorly studied, species-rich lineages and opens new prospects in environmental genomics.

  20. Do waterbody classifications predict water quality?

    PubMed

    Barclay, Janet R; Tripp, Hannah; Bellucci, Christopher J; Warner, Glenn; Helton, Ashley M

    2016-12-01

    Many states classify waterbodies according to groups of designated uses, which suggests that classifications may be correlated with water quality. The primary assessments of water quality in the United States (the Biennial Integrated Water Quality Reports) do not consider classification, so the relationship between classification and water quality is untested. Additionally, water quality has been shown to be influenced by watershed land use; however, land use is not typically part of waterbody classification systems. To determine the relationships between waterbody classification, water quality, watershed land cover, and forest fragmentation, we analyzed existing water quality data for the State of Connecticut from the United States Geological Survey and the Connecticut Department of Energy and Environmental Protection and land cover data from the National Land Cover Dataset. Connecticut uses a unique classification system that includes separation of drinking water sources (Class AA) and waterbodies receiving waste water discharges (Class B). Using a comparison of multiple means, we found that Class B waters had higher levels of nitrogen, solids, chloride, sodium, dissolved copper, total iron, and dissolved manganese than Class AA waters. Watersheds upstream of Class B segments had less forest cover, more development and more impervious cover than watersheds upstream of Class AA segments. Class A sites had some similarities in water quality and land cover with Class AA sites and some with Class B sites. The subset of Class B waterbodies with "Class AA-like" water quality also had "Class AA-like" land cover. Based on this and a multiple linear regression analysis, we found that water quality is more closely related to watershed land cover and forest fragmentation than to waterbody classification. Our results suggest that watershed land cover likely is a better proxy for water quality than waterbody classification. PMID:27621038

  1. THE WESTERN LAKE SUPERIOR COMPARATIVE WATERSHED FRAMEWORK: A FIELD TEST OF GEOGRAPHICALLY-DEPENDENT VS. THRESHOLD-BASED GEOGRAPHICALLY-INDEPENDENT CLASSIFICATION

    EPA Science Inventory

    Stratified random selection of watersheds allowed us to compare geographically-independent classification schemes based on watershed storage (wetland + lake area/watershed area) and forest fragmentation with a geographically-based classification scheme within the Northern Lakes a...

  2. DOE JGI Quality Metrics; Approaches to Scaling and Improving Metagenome Assembly (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Copeland, Alex [DOE JGI]; Brown, C Titus [Michigan State University]

    2016-07-12

    DOE JGI's Alex Copeland on "DOE JGI Quality Metrics" and Michigan State University's C. Titus Brown on "Approaches to Scaling and Improving Metagenome Assembly" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  3. DOE JGI Quality Metrics; Approaches to Scaling and Improving Metagenome Assembly (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    SciTech Connect

    Copeland, Alex; Brown, C Titus

    2011-10-13

    DOE JGI's Alex Copeland on "DOE JGI Quality Metrics" and Michigan State University's C. Titus Brown on "Approaches to Scaling and Improving Metagenome Assembly" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  4. Toward Cloning of the Magnetotactic Metagenome: Identification of Magnetosome Island Gene Clusters in Uncultivated Magnetotactic Bacteria from Different Aquatic Sediments

    PubMed Central

    Jogler, Christian; Lin, Wei; Meyerdierks, Anke; Kube, Michael; Katzmann, Emanuel; Flies, Christine; Pan, Yongxin; Amann, Rudolf; Reinhardt, Richard; Schüler, Dirk

    2009-01-01

    In this report, we describe the selective cloning of large DNA fragments from magnetotactic metagenomes from various aquatic habitats. This was achieved by a two-step magnetic enrichment which allowed the mass collection of environmental magnetotactic bacteria (MTB) virtually free of nonmagnetic contaminants. Four fosmid libraries were constructed and screened by end sequencing and hybridization analysis using heterologous magnetosome gene probes. A total of 14 fosmids were fully sequenced. We identified and characterized two fosmids, most likely originating from two different alphaproteobacterial strains of MTB that contain several putative operons with homology to the magnetosome island (MAI) of cultivated MTB. This is the first evidence that uncultivated MTB exhibit similar yet differing organizations of the MAI, which may account for the diversity in biomineralization and magnetotaxis observed in MTB from various environments. PMID:19395570

  5. Safety analysis of a Russian phage cocktail: from metagenomic analysis to oral application in healthy human subjects.

    PubMed

    McCallin, Shawna; Alam Sarker, Shafiqul; Barretto, Caroline; Sultana, Shamima; Berger, Bernard; Huq, Sayeda; Krause, Lutz; Bibiloni, Rodrigo; Schmitt, Bertrand; Reuteler, Gloria; Brüssow, Harald

    2013-09-01

    Phage therapy has a long tradition in Eastern Europe, where preparations comprise complex phage cocktails whose compositions have not been described. We investigated the composition of a phage cocktail from the Russian pharmaceutical company Microgen targeting Escherichia coli/Proteus infections. Electron microscopy identified six phage types, with T7-like phages numerically dominating over T4-like phages. A metagenomic approach using taxonomical classification, reference mapping and de novo assembly identified 18 distinct phage types, including 7 genera of Podoviridae, 2 established and 2 proposed genera of Myoviridae, and 2 genera of Siphoviridae. De novo assembly yielded 7 contigs greater than 30 kb, including a 147-kb Myovirus genome and a 42-kb genome of a potentially new phage. Bioinformatic analysis did not reveal undesired genes and a small human volunteer trial did not associate adverse effects with oral phage exposure. PMID:23755967

  6. Virtual metagenome reconstruction from 16S rRNA gene sequences.

    PubMed

    Okuda, Shujiro; Tsuchiya, Yuki; Kiriyama, Chiho; Itoh, Masumi; Morisaki, Hisao

    2012-01-01

    Microbial ecologists have investigated the roles of species richness and diversity in a wide variety of ecosystems. Recently, metagenomics has been developed to measure functions in ecosystems, but this approach is cost-intensive. Here we describe a novel method for the rapid and efficient reconstruction of a virtual metagenome in environmental microbial communities without using large-scale genomic sequencing. We demonstrate this approach using 16S rRNA gene sequences obtained from denaturing gradient gel electrophoresis analysis, mapped to fully sequenced genomes, to reconstruct virtual metagenome-like organizations. Furthermore, we validate a virtual metagenome using a published metagenome for cocoa bean fermentation samples, and show that metagenomes reconstructed from biofilm formation samples allow for the study of the gene pool dynamics that are necessary for biofilm growth.
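
    The idea of building a gene-content profile from 16S assignments mapped to sequenced genomes can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: the abstract does not say how genomes are weighted, so the abundance weighting used here is an assumption, and the function name and both input dictionaries are hypothetical.

```python
from collections import defaultdict


def virtual_metagenome(taxon_abundance, genome_gene_counts):
    """Combine reference-genome gene content into one community-level profile.

    taxon_abundance    -- dict: reference genome -> relative abundance inferred
                          from 16S assignments (hypothetical input)
    genome_gene_counts -- dict: reference genome -> {gene family: copy number}
                          (hypothetical input)
    """
    profile = defaultdict(float)
    for taxon, abundance in taxon_abundance.items():
        # Taxa without a sequenced reference genome contribute nothing.
        for gene, copies in genome_gene_counts.get(taxon, {}).items():
            profile[gene] += abundance * copies  # assumed abundance weighting
    return dict(profile)


if __name__ == "__main__":
    abundance = {"genome_A": 0.75, "genome_B": 0.25}
    genes = {
        "genome_A": {"gene_1": 1, "gene_2": 2},
        "genome_B": {"gene_2": 1, "gene_3": 4},
    }
    print(virtual_metagenome(abundance, genes))
```

    A profile built this way can only contain genes present in the mapped reference genomes, which is one reason a comparison against a sequenced metagenome, as the abstract describes, is a useful check.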

  8. Metagenomic sequence of saline desert microbiota from wild ass sanctuary, Little Rann of Kutch, Gujarat, India.

    PubMed

    Patel, Rajesh; Mevada, Vishal; Prajapati, Dhaval; Dudhagara, Pravin; Koringa, Prakash; Joshi, C G

    2015-03-01

    We report a metagenome from a saline desert soil sample of the Little Rann of Kutch, Gujarat State, India. The metagenome consisted of 633,760 sequences totaling 141,307,202 bp, with 56% G + C content. The metagenome sequence data are available in the EBI Metagenomics database under accession no. ERP005612. Community metagenomics revealed a total of 1802 species belonging to 43 different phyla, dominated by the genera Marinobacter (48.7%) and Halobacterium (4.6%) in the bacterial and archaeal domains, respectively. Remarkably, the functional metagenome analysis showed 18.2% of sequences in a poorly characterized group and 4% of genes associated with various stress responses, along with a versatile presence of commercial enzymes. PMID:26484162

  9. Chapter 4 embedded metal fragments.

    PubMed

    Kalinich, John F; Vane, Elizabeth A; Centeno, Jose A; Gaitens, Joanna M; Squibb, Katherine S; McDiarmid, Melissa A; Kasper, Christine E

    2014-01-01

    The continued evolution of military munitions and armor on the battlefield, as well as the insurgent use of improvised explosive devices, has led to embedded fragment wounds containing metal and metal mixtures whose long-term toxicologic and carcinogenic properties are not as yet known. Advances in medical care have greatly increased the survival from these types of injuries. Standard surgical guidelines suggest leaving embedded fragments in place, thus individuals may carry these retained metal fragments for the rest of their lives. Nursing professionals will be at the forefront in caring for these wounded individuals, both immediately after the trauma and during the healing and rehabilitation process. Therefore, an understanding of the potential health effects of embedded metal fragment wounds is essential. This review will explore the history of embedded fragment wounds, current research in the field, and Department of Defense and Department of Veterans Affairs guidelines for the identification and long-term monitoring of individuals with embedded fragments.

  10. Metagenomes from two microbial consortia associated with Santa Barbara seep oil.

    PubMed

    Hawley, Erik R; Malfatti, Stephanie A; Pagani, Ioanna; Huntemann, Marcel; Chen, Amy; Foster, Brian; Copeland, Alexander; del Rio, Tijana Glavina; Pati, Amrita; Jansson, Janet R; Gilbert, Jack A; Tringe, Susannah Green; Lorenson, Thomas D; Hess, Matthias

    2014-12-01

    The metagenomes from two microbial consortia associated with natural oils seeping into the Pacific Ocean offshore the coast of Santa Barbara (California, USA) were determined to complement already existing metagenomes generated from microbial communities associated with hydrocarbons that pollute the marine ecosystem. This genomics resource article is the first of two publications reporting a total of four new metagenomes from oils that seep into the Santa Barbara Channel. PMID:24958360

  11. Binary stars - Formation by fragmentation

    NASA Technical Reports Server (NTRS)

    Boss, Alan P.

    1988-01-01

    Theories of binary star formation by capture, separate nuclei, fission and fragmentation are compared, assessing the success of theoretical attempts to explain the observed properties of main-sequence binary stars. The theory of formation by fragmentation is examined, discussing the prospects for checking the theory against observations of binary premain-sequence stars. It is concluded that formation by fragmentation is successful at explaining many of the key properties of main-sequence binary stars.

  12. MetLab: An In Silico Experimental Design, Simulation and Analysis Tool for Viral Metagenomics Studies

    PubMed Central

    Gourlé, Hadrien; Bongcam-Rudloff, Erik; Hayer, Juliette

    2016-01-01

    Metagenomics, the sequence characterization of all genomes within a sample, is widely used as a virus discovery tool as well as a tool to study viral diversity of animals. Metagenomics can be considered to have three main steps: sample collection and preparation, sequencing and finally bioinformatics. Bioinformatic analysis of metagenomic datasets is in itself a complex process, involving few standardized methodologies, thereby hampering comparison of metagenomics studies between research groups. In this publication the new bioinformatics framework MetLab is presented, aimed at providing scientists with an integrated tool for experimental design and analysis of viral metagenomes. MetLab provides support in designing the metagenomics experiment by estimating the sequencing depth needed for the complete coverage of a species. This is achieved by applying a methodology to calculate the probability of coverage using an adaptation of Stevens’ theorem. It also provides scientists with several pipelines aimed at simplifying the analysis of viral metagenomes, including quality control, assembly and taxonomic binning. We also implement a tool for simulating metagenomics datasets from several sequencing platforms. The overall aim is to provide virologists with an easy-to-use tool for designing, simulating and analyzing viral metagenomes. The results presented here include a benchmark against other existing software, with emphasis on detection of viruses as well as speed of applications. This is packaged as comprehensive software, readily available for Linux and OSX users at https://github.com/norling/metlab. PMID:27479078

  13. MetLab: An In Silico Experimental Design, Simulation and Analysis Tool for Viral Metagenomics Studies.

    PubMed

    Norling, Martin; Karlsson-Lindsjö, Oskar E; Gourlé, Hadrien; Bongcam-Rudloff, Erik; Hayer, Juliette

    2016-01-01

    Metagenomics, the sequence characterization of all genomes within a sample, is widely used as a virus discovery tool as well as a tool to study viral diversity of animals. Metagenomics can be considered to have three main steps: sample collection and preparation, sequencing and finally bioinformatics. Bioinformatic analysis of metagenomic datasets is in itself a complex process, involving few standardized methodologies, thereby hampering comparison of metagenomics studies between research groups. In this publication the new bioinformatics framework MetLab is presented, aimed at providing scientists with an integrated tool for experimental design and analysis of viral metagenomes. MetLab provides support in designing the metagenomics experiment by estimating the sequencing depth needed for the complete coverage of a species. This is achieved by applying a methodology to calculate the probability of coverage using an adaptation of Stevens' theorem. It also provides scientists with several pipelines aimed at simplifying the analysis of viral metagenomes, including quality control, assembly and taxonomic binning. We also implement a tool for simulating metagenomics datasets from several sequencing platforms. The overall aim is to provide virologists with an easy-to-use tool for designing, simulating and analyzing viral metagenomes. The results presented here include a benchmark against other existing software, with emphasis on detection of viruses as well as speed of applications. This is packaged as comprehensive software, readily available for Linux and OSX users at https://github.com/norling/metlab. PMID:27479078
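    The coverage calculation mentioned in this abstract can be illustrated with the classical form of Stevens' theorem, which gives the probability that n arcs of equal length, placed uniformly at random on a circle, cover it completely. The sketch below is not MetLab's code: the abstract says only that an adaptation of the theorem is used, so the function name and the modelling assumptions (a circular genome, reads of equal length, uniform placement) are assumptions made here for illustration.

```python
from math import comb


def stevens_coverage_probability(n_reads, read_fraction):
    """Probability that n_reads arcs, each spanning read_fraction of a circle
    and placed uniformly at random, cover the whole circle.

    Classical Stevens (1939) formula:
        P = sum_k (-1)^k * C(n, k) * (1 - k*a)^(n - 1),  for all k with k*a < 1
    """
    if n_reads < 1:
        raise ValueError("n_reads must be at least 1")
    if not 0 < read_fraction <= 1:
        raise ValueError("read_fraction must be in (0, 1]")

    total = 0.0
    k = 0
    # Terms with k * a >= 1 are zero by definition and are skipped.
    while k <= n_reads and k * read_fraction < 1:
        total += (-1) ** k * comb(n_reads, k) * (1 - k * read_fraction) ** (n_reads - 1)
        k += 1
    return total


# Two reads that each span 75% of a circular genome.
print(stevens_coverage_probability(2, 0.75))
# Fifty reads that each span 20% of the genome.
print(stevens_coverage_probability(50, 0.2))
```

    The sum alternates in sign, so plain floating-point evaluation loses precision when the number of reads is large and the reads are short; exact rational arithmetic or an approximation would be needed at realistic sequencing depths.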

  15. The MG-RAST Metagenomics Database and Portal in 2015

    SciTech Connect

    Wilke, Andreas; Bischof, Jared; Gerlach, Wolfgang; Glass, Elizabeth; Harrison, Travis; Keegan, Kevin; Paczian, Tobias; Trimble, William L.; Bagchi, Saurabh; Grama, Ananth; Chaterji, Somali; Meyer, Folker

    2015-12-09

    MG-RAST (http://metagenomics.anl.gov) is an open-submission data portal for processing, analyzing, sharing and disseminating metagenomic datasets. Currently, the system hosts over 200 000 datasets and is continuously updated. The volume of submissions has increased 4-fold over the past 24 months, now averaging 4 terabasepairs per month. In addition to several new features, we report changes to the analysis workflow and the technologies used to scale the pipeline up to the required throughput levels. Lastly, to show possible uses for the data from MG-RAST, we present several examples integrating data and analyses from MG-RAST into popular third-party analysis tools or sequence alignment tools.

  16. The MG-RAST Metagenomics Database and Portal in 2015

    DOE PAGESBeta

    Wilke, Andreas; Bischof, Jared; Gerlach, Wolfgang; Glass, Elizabeth; Harrison, Travis; Keegan, Kevin; Paczian, Tobias; Trimble, William L.; Bagchi, Saurabh; Grama, Ananth; et al

    2015-12-09

    MG-RAST (http://metagenomics.anl.gov) is an open-submission data portal for processing, analyzing, sharing and disseminating metagenomic datasets. Currently, the system hosts over 200 000 datasets and is continuously updated. The volume of submissions has increased 4-fold over the past 24 months, now averaging 4 terabasepairs per month. In addition to several new features, we report changes to the analysis workflow and the technologies used to scale the pipeline up to the required throughput levels. Lastly, to show possible uses for the data from MG-RAST, we present several examples integrating data and analyses from MG-RAST into popular third-party analysis tools or sequence alignment tools.

  17. Unlocking the potential of metagenomics through replicated experimental design

    PubMed Central

    Knight, Rob; Jansson, Janet; Field, Dawn; Fierer, Noah; Desai, Narayan; Fuhrman, Jed A.; Hugenholtz, Phil; van der Lelie, Daniel; Meyer, Folker; Stevens, Rick; Bailey, Mark J.; Gordon, Jeffrey I.; Kowalchuk, George A.; Gilbert, Jack A.

    2015-01-01

    Metagenomics holds enormous promise for discovering novel enzymes and organisms that are biomarkers or causes of processes relevant to disease, industry and the environment. In the last two years we have seen a paradigm shift in metagenomics to the application of broad cross-sectional and longitudinal studies enabled by advances in DNA sequencing and high-performance computing. These technologies now make it possible to broadly assess microbial diversity and function, allowing systematic investigation of the largely unexplored frontier of microbial life. To achieve this aim, the global scientific community must collaborate and agree upon common objectives and data standards to enable comparative research across the Earth’s microbiome. Improvements in comparability of data will facilitate the study of biotechnologically relevant processes such as bioprospecting for new glycoside hydrolases or identifying novel energy sources. PMID:22678395

  18. A Statistical Framework for the Functional Analysis of Metagenomes

    SciTech Connect

    Sharon, Itai; Pati, Amrita; Markowitz, Victor; Pinter, Ron Y.

    2008-10-01

    Metagenomic studies consider the genetic makeup of microbial communities as a whole, rather than their individual member organisms. The functional and metabolic potential of microbial communities can be analyzed by comparing the relative abundance of gene families in their collective genomic sequences (metagenome) under different conditions. Such comparisons require accurate estimation of gene family frequencies. They present a statistical framework for assessing these frequencies based on the Lander-Waterman theory developed originally for Whole Genome Shotgun (WGS) sequencing projects. They also provide a novel method for assessing the reliability of the estimations which can be used for removing seemingly unreliable measurements. They tested their method on a wide range of datasets, including simulated genomes and real WGS data from sequencing projects of whole genomes. Results suggest that their framework corrects inherent biases in accepted methods and provides a good approximation to the true statistics of gene families in WGS projects.
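    The Lander-Waterman theory on which this framework builds can be summarized by a few standard quantities. The sketch below computes only those classical quantities; it is not the gene-family frequency estimator of the paper, which the abstract does not specify, and the function and parameter names are illustrative assumptions.

```python
from math import exp


def lander_waterman(num_reads, read_length, genome_length, min_overlap=0):
    """Classical Lander-Waterman expectations for shotgun sequencing.

    Returns (coverage, covered_fraction, expected_contigs) where
      coverage          c = N * L / G
      covered_fraction  1 - exp(-c), the expected fraction of bases
                        covered by at least one read
      expected_contigs  N * exp(-c * sigma), with sigma = 1 - T / L and
                        T the minimum overlap needed to join two reads
    """
    coverage = num_reads * read_length / genome_length
    covered_fraction = 1 - exp(-coverage)
    sigma = 1 - min_overlap / read_length
    expected_contigs = num_reads * exp(-coverage * sigma)
    return coverage, covered_fraction, expected_contigs


# Toy numbers: 1000 reads of 100 bp from a 100 kb genome (1x coverage).
coverage, covered_fraction, expected_contigs = lander_waterman(1000, 100, 100_000)
print(coverage, covered_fraction, expected_contigs)
```

    At 1x coverage roughly 63% of bases are expected to be covered, which illustrates why gene family counts taken from low-coverage metagenomes need statistical correction.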

  19. A Microbial Metagenome (Leucobacter sp.) in Caenorhabditis Whole Genome Sequences.

    PubMed

    Percudani, Riccardo

    2013-01-01

    DNA of apparently recent bacterial origin is found in the genomic sequences of Caenorhabditis angaria and Caenorhabditis remanei. Here we present evidence that the DNA belongs to a single species of the genus Leucobacter (high-GC Gram+ Actinobacteria). Metagenomic tools enabled the assembly of the contaminating sequences in a draft genome of 3.2 Mb harboring 2,826 genes. This information provides insight into a microbial organism intimately associated with Caenorhabditis as well as a solid basis for the reassignment of 3,373 metazoan entries of the public database to a novel bacterial species (Leucobacter sp. AEAR). The application of metagenomic techniques can thus prevent annotation errors and reveal unexpected genetic information in data obtained by conventional genomics. PMID:23585714

  20. Protocol for Metagenomic Virus Detection in Clinical Specimens

    PubMed Central

    Brinkmann, Annika; Dabrowski, Piotr W.; Radonić, Aleksandar; Nitsche, Andreas; Kurth, Andreas

    2015-01-01

    Sixty percent of emerging viruses have a zoonotic origin, making transmission from animals a major threat to public health. Prompt identification and analysis of these pathogens are indispensable to taking action toward prevention and protection of the affected population. We quantifiably compared classical and modern approaches of virus purification and enrichment in theory and experiments. Eventually, we established an unbiased protocol for detection of known and novel emerging viruses from organ tissues (tissue-based universal virus detection for viral metagenomics [TUViD-VM]). The final TUViD-VM protocol was extensively validated by using real-time PCR and next-generation sequencing. We could increase the amount of detectable virus nucleic acids and improved the detection of viruses <75,000-fold compared with other tested approaches. This TUViD-VM protocol can be used in metagenomic and virome studies to increase the likelihood of detecting viruses from any biological source. PMID:25532973

  1. Metagenomic insights into the dynamics of microbial communities in food.

    PubMed

    Kergourlay, Gilles; Taminiau, Bernard; Daube, Georges; Champomier Vergès, Marie-Christine

    2015-11-20

    Metagenomics has proven to be a powerful tool in exploring a large diversity of natural environments such as air, soil, water, and plants, as well as various human microbiota (e.g. digestive tract, lungs, skin). DNA sequencing techniques are becoming increasingly popular and less and less expensive. Given that high-throughput DNA sequencing approaches have only recently started to be used to decipher food microbial ecosystems, there is a significant growth potential for such technologies in the field of food microbiology. The aim of this review is to present a survey of recent food investigations via metagenomics and to illustrate how this approach can be a valuable tool in the better characterization of foods and their transformation, storage and safety. Traditional food in particular has been thoroughly explored by global approaches in order to provide information on multi-species and multi-organism communities. PMID:26414193

  2. Functional metagenomic screen reveals new and diverse microbial rhodopsins.

    PubMed

    Pushkarev, Alina; Béjà, Oded

    2016-09-01

    Ion-translocating retinylidene rhodopsins are widely distributed among marine and freshwater microbes. The translocation is light-driven, contributing to the production of biochemical energy in diverse microbes. Until today, most microbial rhodopsins had been detected using bioinformatics based on homology to other rhodopsins. In the past decade, there has been increased interest in microbial rhodopsins in the field of optogenetics since microbial rhodopsins were found to be most useful in vertebrate neuronal systems. Here we report on a functional metagenomic assay for detecting microbial rhodopsins. Using an array of narrow pH electrodes and light-emitting diode illumination, we were able to screen a metagenomic fosmid library to detect diverse marine proteorhodopsins and an actinorhodopsin based solely on proton-pumping activity. Our assay therefore provides a rather simple phenotypic means to enrich our understanding of microbial rhodopsins without any prior knowledge of the genomic content of the environmental entities screened. PMID:26894445

  3. Metagenome of a Versatile Chemolithoautotroph from Expanding Oceanic Dead Zones

    SciTech Connect

    Walsh, David A.; Zaikova, Elena; Howes, Charles L.; Song, Young; Wright, Jody; Tringe, Susannah G.; Tortell, Philippe D.; Hallam, Steven J.

    2009-07-15

    Oxygen minimum zones (OMZs), also known as oceanic "dead zones", are widespread oceanographic features currently expanding due to global warming and coastal eutrophication. Although inhospitable to metazoan life, OMZs support a thriving but cryptic microbiota whose combined metabolic activity is intimately connected to nutrient and trace gas cycling within the global ocean. Here we report time-resolved metagenomic analyses of a ubiquitous and abundant but uncultivated OMZ microbe (SUP05) closely related to chemoautotrophic gill symbionts of deep-sea clams and mussels. The SUP05 metagenome harbors a versatile repertoire of genes mediating autotrophic carbon assimilation, sulfur-oxidation and nitrate respiration responsive to a wide range of water column redox states. Thus, SUP05 plays integral roles in shaping nutrient and energy flow within oxygen-deficient oceanic waters via carbon sequestration, sulfide detoxification and biological nitrogen loss with important implications for marine productivity and atmospheric greenhouse control.

  5. Gene and translation initiation site prediction in metagenomic sequences

    SciTech Connect

    Hyatt, Philip Douglas; LoCascio, Philip F; Hauser, Loren John; Uberbacher, Edward C

    2012-01-01

    Gene prediction in metagenomic sequences remains a difficult problem. Current sequencing technologies do not achieve sufficient coverage to assemble the individual genomes in a typical sample; consequently, sequencing runs produce a large number of short sequences whose exact origin is unknown. Since these sequences are usually smaller than the average length of a gene, algorithms must make predictions based on very little data. We present MetaProdigal, a metagenomic version of the gene prediction program Prodigal, that can identify genes in short, anonymous coding sequences with a high degree of accuracy. The novel value of the method consists of enhanced translation initiation site identification, ability to identify sequences that use alternate genetic codes and confidence values for each gene call. We compare the results of MetaProdigal with other methods and conclude with a discussion of future improvements.

  6. Metagenomic approaches to understanding phylogenetic diversity in quorum sensing

    PubMed Central

    Kimura, Nobutada

    2014-01-01

    Quorum sensing, a form of cell–cell communication among bacteria, allows bacteria to synchronize their behaviors at the population level in order to control behaviors such as luminescence, biofilm formation, signal turnover, pigment production, antibiotics production, swarming, and virulence. A better understanding of quorum-sensing systems will provide us with greater insight into the complex interaction mechanisms used widely in the Bacteria and even the Archaea domain in the environment. Metagenomics, the use of culture-independent sequencing to study the genomic material of microorganisms, has the potential to provide direct information about the quorum-sensing systems in uncultured bacteria. This article provides an overview of the current knowledge of quorum sensing focused on phylogenetic diversity, and presents examples of studies that have used metagenomic techniques. Future technologies potentially related to quorum-sensing systems are also discussed. PMID:24429899

  7. The MG-RAST metagenomics database and portal in 2015

    PubMed Central

    Wilke, Andreas; Bischof, Jared; Gerlach, Wolfgang; Glass, Elizabeth; Harrison, Travis; Keegan, Kevin P.; Paczian, Tobias; Trimble, William L.; Bagchi, Saurabh; Grama, Ananth; Chaterji, Somali; Meyer, Folker

    2016-01-01

    MG-RAST (http://metagenomics.anl.gov) is an open-submission data portal for processing, analyzing, sharing and disseminating metagenomic datasets. The system currently hosts over 200 000 datasets and is continuously updated. The volume of submissions has increased 4-fold over the past 24 months, now averaging 4 terabasepairs per month. In addition to several new features, we report changes to the analysis workflow and the technologies used to scale the pipeline up to the required throughput levels. To show possible uses for the data from MG-RAST, we present several examples integrating data and analyses from MG-RAST into popular third-party analysis tools or sequence alignment tools. PMID:26656948

  8. Metagenomic analysis of the airborne environment in urban spaces.

    PubMed

    Be, Nicholas A; Thissen, James B; Fofanov, Viacheslav Y; Allen, Jonathan E; Rojas, Mark; Golovko, George; Fofanov, Yuriy; Koshinsky, Heather; Jaing, Crystal J

    2015-02-01

    The organisms in aerosol microenvironments, especially densely populated urban areas, are relevant to maintenance of public health and detection of potential epidemic or biothreat agents. To examine aerosolized microorganisms in this environment, we performed sequencing on the material from an urban aerosol surveillance program. Whole metagenome sequencing was applied to DNA extracted from air filters obtained during periods from each of the four seasons. The composition of bacteria, plants, fungi, invertebrates, and viruses demonstrated distinct temporal shifts. Bacillus thuringiensis serovar kurstaki was detected in samples known to be exposed to aerosolized spores, illustrating the potential utility of this approach for identification of intentionally introduced microbial agents. Together, these data demonstrate the temporally dependent metagenomic complexity of urban aerosols and the potential of genomic analytical techniques for biosurveillance and monitoring of threats to public health. PMID:25351142

  9. Metagenome of a versatile chemolithoautotroph from expanding oceanic dead zones.

    PubMed

    Walsh, David A; Zaikova, Elena; Howes, Charles G; Song, Young C; Wright, Jody J; Tringe, Susannah G; Tortell, Philippe D; Hallam, Steven J

    2009-10-23

    Oxygen minimum zones, also known as oceanic "dead zones," are widespread oceanographic features currently expanding because of global warming. Although inhospitable to metazoan life, they support a cryptic microbiota whose metabolic activities affect nutrient and trace gas cycling within the global ocean. Here, we report metagenomic analyses of a ubiquitous and abundant but uncultivated oxygen minimum zone microbe (SUP05) related to chemoautotrophic gill symbionts of deep-sea clams and mussels. The SUP05 metagenome harbors a versatile repertoire of genes mediating autotrophic carbon assimilation, sulfur oxidation, and nitrate respiration responsive to a wide range of water-column redox states. Our analysis provides a genomic foundation for understanding the ecological and biogeochemical role of pelagic SUP05 in oxygen-deficient oceanic waters and its potential sensitivity to environmental changes. PMID:19900896

  10. MOCAT2: a metagenomic assembly, annotation and profiling framework

    PubMed Central

    Kultima, Jens Roat; Coelho, Luis Pedro; Forslund, Kristoffer; Huerta-Cepas, Jaime; Li, Simone S.; Driessen, Marja; Voigt, Anita Yvonne; Zeller, Georg; Sunagawa, Shinichi; Bork, Peer

    2016-01-01

    Summary: MOCAT2 is a software pipeline for metagenomic sequence assembly and gene prediction with novel features for taxonomic and functional abundance profiling. The automated generation and efficient annotation of non-redundant reference catalogs by propagating pre-computed assignments from 18 databases covering various functional categories allows for fast and comprehensive functional characterization of metagenomes. Availability and Implementation: MOCAT2 is implemented in Perl 5 and Python 2.7, designed for 64-bit UNIX systems and offers support for high-performance computer usage via LSF, PBS or SGE queuing systems; source code is freely available under the GPL3 license at http://mocat.embl.de. Contact: bork@embl.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153620

  11. The MG-RAST metagenomics database and portal in 2015.

    PubMed

    Wilke, Andreas; Bischof, Jared; Gerlach, Wolfgang; Glass, Elizabeth; Harrison, Travis; Keegan, Kevin P; Paczian, Tobias; Trimble, William L; Bagchi, Saurabh; Grama, Ananth; Chaterji, Somali; Meyer, Folker

    2016-01-01

    MG-RAST (http://metagenomics.anl.gov) is an open-submission data portal for processing, analyzing, sharing and disseminating metagenomic datasets. The system currently hosts over 200,000 datasets and is continuously updated. The volume of submissions has increased 4-fold over the past 24 months, now averaging 4 terabasepairs per month. In addition to several new features, we report changes to the analysis workflow and the technologies used to scale the pipeline up to the required throughput levels. To show possible uses for the data from MG-RAST, we present several examples integrating data and analyses from MG-RAST into popular third-party analysis tools or sequence alignment tools.

  12. [Bacterial genomics and metagenomics: clinical applications and medical relevance].

    PubMed

    Diene, S M; Bertelli, C; Pillonel, T; Schrenzel, J; Greub, G

    2014-11-12

    New sequencing technologies provide large amounts of genomic sequence data in a short time and at low cost. These data are useful for applications such as: a) development of diagnostic PCRs and/or serological tests; b) detection of virulence factors (virulome) or genes/SNPs associated with resistance to antibiotics (resistome); and c) investigation of transmission and dissemination of bacterial pathogens. Thus, the genomics of medically important bacteria is useful to clinical microbiologists, to infectious diseases specialists, and to epidemiologists. Determining the microbial composition of a sample by metagenomics is another application of new sequencing technologies, useful for understanding the impact of bacteria on various non-infectious diseases such as obesity, asthma, or diabetes. Genomics and metagenomics will likely become specialized diagnostic analyses.

  15. Biogeography and individuality shape function in the human skin metagenome.

    PubMed

    Oh, Julia; Byrd, Allyson L; Deming, Clay; Conlan, Sean; Kong, Heidi H; Segre, Julia A

    2014-10-01

    The varied topography of human skin offers a unique opportunity to study how the body's microenvironments influence the functional and taxonomic composition of microbial communities. Phylogenetic marker gene-based studies have identified many bacteria and fungi that colonize distinct skin niches. Here metagenomic analyses of diverse body sites in healthy humans demonstrate that local biogeography and strong individuality define the skin microbiome. We developed a relational analysis of bacterial, fungal and viral communities, which showed not only site specificity but also individual signatures. We further identified strain-level variation of dominant species as heterogeneous and multiphyletic. Reference-free analyses captured the uncharacterized metagenome through the development of a multi-kingdom gene catalogue, which was used to uncover genetic signatures of species lacking reference genomes. This work is foundational for human disease studies investigating inter-kingdom interactions, metabolic changes and strain tracking, and defines the dual influence of biogeography and individuality on microbial composition and function. PMID:25279917

  16. Methylotrophs in natural habitats: current insights through metagenomics.

    PubMed

    Chistoserdova, Ludmila

    2015-07-01

    The focus of this review is on the recent data from the omics approaches, measuring the presence of methylotrophs in natural environments. Both Bacteria and Archaea are considered. The data are discussed in the context of the current knowledge on the biochemistry of methylotrophy and the physiology of cultivated methylotrophs. One major issue discussed is the recent metagenomic data pointing toward the activity of "aerobic" methanotrophs, such as Methylobacter, in microoxic or hypoxic conditions. A related issue of the metabolic distinction between aerobic and "anaerobic" methylotrophy is addressed in the light of the genomic and metagenomic data for respective organisms. The role of communities, as opposed to single-organism activities in environmental cycling of single-carbon compounds, such as methane, is also discussed. In addition, the emerging issue of the role of non-traditional methylotrophs in global metabolism of single-carbon compounds and the role of methylotrophy pathways in non-methylotrophs is briefly mentioned.

  17. Identification of an antibacterial protein by functional screening of a human oral metagenomic library.

    PubMed

    Arivaradarajan, Preeti; Warburton, Philip J; Paramasamy, Gunasekaran; Nair, Sean P; Allan, Elaine; Mullany, Peter

    2015-09-01

    Screening of a bacterial artificial chromosome (BAC) library containing metagenomic DNA from human plaque and saliva allowed the isolation of four clones producing antimicrobial activity. Three of these were pigmented and encoded homologues of glutamyl-tRNA reductase (GluTR), an enzyme involved in the C5 pathway leading to tetrapyrrole synthesis, and one clone had antibacterial activity with no pigmentation. The latter contained a BAC with an insert of 15.6 kb. Initial attempts to localize the gene(s) responsible for antimicrobial activity by subcloning into pUC-based vectors failed. A new plasmid for toxic gene expression (pTGEX) was designed enabling localization of the antibacterial activity to a 4.7-kb HindIII fragment. Transposon mutagenesis localized the gene to an open reading frame of 483 bp designated antibacterial protein1 (abp1). Abp1 was 94% identical to a hypothetical protein of Neisseria subflava (accession number WP_004519448.1). An Escherichia coli clone expressing Abp1 exhibited antibacterial activity against Bacillus subtilis BS78H, Staphylococcus epidermidis NCTC 11964 and B4268, and S. aureus NCTC 12493, ATCC 35696 and NCTC 11561. However, no antibacterial activity was observed against Pseudomonas aeruginosa ATCC 9027, N. subflava ATCC A1078, E. coli K12 JM109 and BL21(DE3), Fusobacterium nucleatum ATCC 25586 and NCTC 11326, Prevotella intermedia ATCC 25611, Veillonella parvula ATCC 10790 or Lactobacillus casei NCTC 6375. PMID:26347298

  18. Diversity of putative archaeal RNA viruses in metagenomic datasets of a Yellowstone acidic hot spring.

    PubMed

    Wang, Hongming; Yu, Yongxin; Liu, Taigang; Pan, Yingjie; Yan, Shuling; Wang, Yongjie

    2015-01-01

    Two genomic fragments (5,662 and 1,269 nt in size, GenBank accession no. JQ756122 and JQ756123, respectively) of novel, positive-strand RNA viruses that infect archaea were first discovered in an acidic hot spring in Yellowstone National Park (Bolduc et al., 2012). To investigate the diversity of these newly identified putative archaeal RNA viruses, global metagenomic datasets were searched for sequences that were significantly similar to those of the viruses. A total of 3,757 associated reads were retrieved solely from the Yellowstone datasets and were used to assemble the genomes of the putative archaeal RNA viruses. Nine contigs with lengths ranging from 417 to 5,866 nt were obtained, 4 of which were longer than 2,200 nt; one contig was 204 nt longer than JQ756122, representing the longest genomic sequence of the putative archaeal RNA viruses. These contigs revealed more than 50% sequence similarity to JQ756122 or JQ756123 and may be partial or nearly complete genomes of novel genogroups or genotypes of the putative archaeal RNA viruses. Sequence and phylogenetic analyses indicated that the archaeal RNA viruses are genetically diverse, with at least 3 related viral lineages in the Yellowstone acidic hot spring environment.

  19. Bioprospecting metagenomics of decaying wood: mining for new glycoside hydrolases

    SciTech Connect

    Li L. L.; van der Lelie D.; Taghavi, S.; McCorkle, S. M.; Zhang, Y.-B.; Blewitt, M. G.; Brunecky, R.; Adney, W. S.; Himmel, M. E.; Brumm, P.; Drinkwater, C.; Mead, D. A.; Tringe, S. G.

    2011-08-01

    To efficiently deconstruct recalcitrant plant biomass to fermentable sugars in industrial processes, biocatalysts of higher performance and lower cost are required. The genetic diversity found in the metagenomes of natural microbial biomass decay communities may harbor such enzymes. Our goal was to discover and characterize new glycoside hydrolases (GHases) from microbial biomass decay communities, especially those from unknown or never previously cultivated microorganisms. From the metagenome sequences of an anaerobic microbial community actively decaying poplar biomass, we identified approximately 4,000 GHase homologs. Based on homology to GHase families/activities of interest and the quality of the sequences, candidates were selected for full-length cloning and subsequent expression. As an alternative strategy, a metagenome expression library was constructed and screened for GHase activities. These combined efforts resulted in the cloning of four novel GHases that could be successfully expressed in Escherichia coli. Further characterization showed that two enzymes showed significant activity on p-nitrophenyl-α-L-arabinofuranoside, one enzyme had significant activity against p-nitrophenyl-β-D-glucopyranoside, and one enzyme showed significant activity against p-nitrophenyl-β-D-xylopyranoside. Enzymes were also tested in the presence of ionic liquids. Metagenomics provides a good resource for mining novel biomass degrading enzymes and for screening of cellulolytic enzyme activities. The four GHases that were cloned may have potential application for deconstruction of biomass pretreated with ionic liquids, as they remain active in the presence of up to 20% ionic liquid (except for 1-ethyl-3-methylimidazolium diethyl phosphate). Alternatively, ionic liquids might be used to immobilize or stabilize these enzymes for minimal solvent processing of biomass.

  20. Identification of a novel coronavirus from guinea fowl using metagenomics.

    PubMed

    Ducatez, Mariette F; Guérin, Jean-Luc

    2015-01-01

    While classical virology techniques such as virus culture, electron microscopy, or classical PCR had been unsuccessful in identifying the causative agent responsible for the fulminating disease of guinea fowl, we identified a novel avian gammacoronavirus associated with the disease using metagenomics. Next-generation sequencing is an unbiased approach that allows the sequencing of virtually all the genetic material present in a given sample. PMID:25720467

  1. Bioprospecting metagenomics of decaying wood: mining for new glycoside hydrolases

    PubMed Central

    2011-01-01

    Background To efficiently deconstruct recalcitrant plant biomass to fermentable sugars in industrial processes, biocatalysts of higher performance and lower cost are required. The genetic diversity found in the metagenomes of natural microbial biomass decay communities may harbor such enzymes. Our goal was to discover and characterize new glycoside hydrolases (GHases) from microbial biomass decay communities, especially those from unknown or never previously cultivated microorganisms. Results From the metagenome sequences of an anaerobic microbial community actively decaying poplar biomass, we identified approximately 4,000 GHase homologs. Based on homology to GHase families/activities of interest and the quality of the sequences, candidates were selected for full-length cloning and subsequent expression. As an alternative strategy, a metagenome expression library was constructed and screened for GHase activities. These combined efforts resulted in the cloning of four novel GHases that could be successfully expressed in Escherichia coli. Further characterization showed that two enzymes showed significant activity on p-nitrophenyl-α-L-arabinofuranoside, one enzyme had significant activity against p-nitrophenyl-β-D-glucopyranoside, and one enzyme showed significant activity against p-nitrophenyl-β-D-xylopyranoside. Enzymes were also tested in the presence of ionic liquids. Conclusions Metagenomics provides a good resource for mining novel biomass degrading enzymes and for screening of cellulolytic enzyme activities. The four GHases that were cloned may have potential application for deconstruction of biomass pretreated with ionic liquids, as they remain active in the presence of up to 20% ionic liquid (except for 1-ethyl-3-methylimidazolium diethyl phosphate). Alternatively, ionic liquids might be used to immobilize or stabilize these enzymes for minimal solvent processing of biomass. PMID:21816041

  2. DIME: a novel framework for de novo metagenomic sequence assembly.

    PubMed

    Guo, Xuan; Yu, Ning; Ding, Xiaojun; Wang, Jianxin; Pan, Yi

    2015-02-01

    The recently developed next generation sequencing platforms not only decrease the cost of metagenomics data analysis, but also greatly enlarge the size of metagenomic sequence datasets. A common bottleneck of available assemblers is that the trade-off between the noise of the resulting contigs and the gain in sequence length for better annotation has not received enough attention in large-scale sequencing projects, especially for datasets with low coverage and a large number of nonoverlapping contigs. To address this limitation and promote both accuracy and efficiency, we develop a novel metagenomic sequence assembly framework, DIME, based on the DIvide, conquer, and MErge strategies. In addition, we give two MapReduce implementations of DIME, DIME-cap3 and DIME-genovo, on the Apache Hadoop platform. For a systematic comparison of the performance of the assembly tasks, we tested DIME and five other popular short read assembly programs, Cap3, Genovo, MetaVelvet, SOAPdenovo, and SPAdes, on four synthetic and three real metagenomic sequence datasets ranging in size from fifty thousand to a couple of million reads. The experimental results demonstrate that our method not only partitions the sequence reads with an extremely high accuracy, but also reconstructs more bases, generates higher quality assembled consensus, and yields higher assembly scores, including corrected N50 and BLAST-score-per-base, than other tools with a nearly theoretical speed-up. Results indicate that DIME offers great improvement in assembly across a range of sequence abundances and thus is robust to decreasing coverage. PMID:25684202
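
    The abstract above reports assembly scores such as corrected N50. As a rough illustration of the underlying statistic only, the sketch below computes the plain N50 of a set of contig lengths: the length L such that contigs of length L or longer hold at least half of the assembled bases. The function name and the example lengths are hypothetical, and the "corrected" variant used by the authors is not reproduced here.

```python
def n50(contig_lengths):
    """Return the plain N50 of a collection of contig lengths (in nt)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig downwards until half the bases are covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths, invented purely for illustration.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example_lengths))
```

    Comparing 2 * running with the total keeps the arithmetic in integers, so no rounding decision is needed when the total is odd.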

  3. The Challenge and Potential of Metagenomics in the Clinic

    PubMed Central

    Mulcahy-O’Grady, Heidi; Workentine, Matthew L.

    2016-01-01

    The bacteria, fungi, and viruses that live on and in us have a tremendous impact on our day-to-day health and are often linked to many diseases, including autoimmune disorders and infections. Diagnosing and treating these disorders relies on accurate identification and characterization of the microbial community. Current sequencing technologies allow the sequencing of the entire nucleic acid complement of a sample providing an accurate snapshot of the community members present in addition to the full genetic potential of that microbial community. There are a number of clinical applications that stand to benefit from these data sets, such as the rapid identification of pathogens present in a sample. Other applications include the identification of antibiotic-resistance genes, diagnosis and treatment of gastrointestinal disorders, and many other diseases associated with bacterial, viral, and fungal microbiomes. Metagenomics also allows the physician to probe more complex phenotypes such as microbial dysbiosis with intestinal disorders and disruptions of the skin microbiome that may be associated with skin disorders. Many of these disorders are not associated with a single pathogen but emerge as a result of complex ecological interactions within microbiota. Currently, we understand very little about these complex phenotypes, yet clearly they are important and in some cases, as with fecal microbiota transplants in Clostridium difficile infections, treating the microbiome of the patient is effective. Here, we give an overview of metagenomics and discuss a number of areas where metagenomics is applicable in the clinic, and progress being made in these areas. This includes (1) the identification of unknown pathogens, and those pathogens particularly hard to culture, (2) utilizing functional information and gene content to understand complex infections such as Clostridium difficile, and (3) predicting antimicrobial resistance of the community using genetic determinants of

  4. DIME: A Novel Framework for De Novo Metagenomic Sequence Assembly

    PubMed Central

    Guo, Xuan; Yu, Ning; Ding, Xiaojun; Wang, Jianxin

    2015-01-01

    The recently developed next generation sequencing platforms not only decrease the cost of metagenomics data analysis, but also greatly enlarge the size of metagenomic sequence datasets. A common bottleneck of available assemblers is that the trade-off between the noise of the resulting contigs and the gain in sequence length for better annotation has not received enough attention in large-scale sequencing projects, especially for datasets with low coverage and a large number of nonoverlapping contigs. To address this limitation and promote both accuracy and efficiency, we develop a novel metagenomic sequence assembly framework, DIME, based on the DIvide, conquer, and MErge strategies. In addition, we give two MapReduce implementations of DIME, DIME-cap3 and DIME-genovo, on the Apache Hadoop platform. For a systematic comparison of the performance of the assembly tasks, we tested DIME and five other popular short read assembly programs, Cap3, Genovo, MetaVelvet, SOAPdenovo, and SPAdes, on four synthetic and three real metagenomic sequence datasets ranging in size from fifty thousand to a couple of million reads. The experimental results demonstrate that our method not only partitions the sequence reads with an extremely high accuracy, but also reconstructs more bases, generates higher quality assembled consensus, and yields higher assembly scores, including corrected N50 and BLAST-score-per-base, than other tools with a nearly theoretical speed-up. Results indicate that DIME offers great improvement in assembly across a range of sequence abundances and thus is robust to decreasing coverage. PMID:25684202

  5. Recovery of a medieval Brucella melitensis genome using shotgun metagenomics.

    PubMed

    Kay, Gemma L; Sergeant, Martin J; Giuffra, Valentina; Bandiera, Pasquale; Milanese, Marco; Bramanti, Barbara; Bianucci, Raffaella; Pallen, Mark J

    2014-07-15

    Shotgun metagenomics provides a powerful assumption-free approach to the recovery of pathogen genomes from contemporary and historical material. We sequenced the metagenome of a calcified nodule from the skeleton of a 14th-century middle-aged male excavated from the medieval Sardinian settlement of Geridu. We obtained 6.5-fold coverage of a Brucella melitensis genome. Sequence reads from this genome showed signatures typical of ancient or aged DNA. Despite the relatively low coverage, we were able to use information from single-nucleotide polymorphisms to place the medieval pathogen genome within a clade of B. melitensis strains that included the well-studied Ether strain and two other recent Italian isolates. We confirmed this placement using information from deletions and IS711 insertions. We conclude that metagenomics stands ready to document past and present infections, shedding light on the emergence, evolution, and spread of microbial pathogens. Importance: Infectious diseases have shaped human populations and societies throughout history. The recovery of pathogen DNA sequences from human remains provides an opportunity to identify and characterize the causes of individual and epidemic infections. By sequencing DNA extracted from medieval human remains through shotgun metagenomics, without target-specific capture or amplification, we have obtained a draft genome sequence of an ~700-year-old Brucella melitensis strain. Using a variety of bioinformatic approaches, we have shown that this historical strain is most closely related to recent strains isolated from Italy, confirming the continuity of this zoonotic infection, and even a specific lineage, in the Mediterranean region over the centuries.

  6. Characterization of the Gut Microbiome Using 16S or Shotgun Metagenomics

    PubMed Central

    Jovel, Juan; Patterson, Jordan; Wang, Weiwei; Hotte, Naomi; O'Keefe, Sandra; Mitchel, Troy; Perry, Troy; Kao, Dina; Mason, Andrew L.; Madsen, Karen L.; Wong, Gane K.-S.

    2016-01-01

    The advent of next generation sequencing (NGS) has enabled investigations of the gut microbiome with unprecedented resolution and throughput. This has stimulated the development of sophisticated bioinformatics tools to analyze the massive amounts of data generated. Researchers therefore need a clear understanding of the key concepts required for the design, execution and interpretation of NGS experiments on microbiomes. We conducted a literature review and used our own data to determine which approaches work best. The two main approaches for analyzing the microbiome, 16S ribosomal RNA (rRNA) gene amplicons and shotgun metagenomics, are illustrated with analyses of libraries designed to highlight their strengths and weaknesses. Several methods for taxonomic classification of bacterial sequences are discussed. We present simulations to assess the number of sequences that are required to perform reliable appraisals of bacterial community structure. To the extent that fluctuations in the diversity of gut bacterial populations correlate with health and disease, we emphasize various techniques for the analysis of bacterial communities within samples (α-diversity) and between samples (β-diversity). Finally, we demonstrate techniques to infer the metabolic capabilities of a bacterial community from these 16S and shotgun data. PMID:27148170
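
    To make the within-sample (α-diversity) and between-sample (β-diversity) distinction above concrete, the sketch below computes one common measure of each from per-taxon read counts. The choice of the Shannon index and Bray-Curtis dissimilarity is an assumption made for illustration; the abstract does not say which measures the authors use, and the function names and counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Alpha-diversity: Shannon index H = -sum(p * ln p) over observed taxa."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(sample_a, sample_b):
    """Beta-diversity: Bray-Curtis dissimilarity between two count vectors."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must list the same taxa in the same order")
    denominator = sum(sample_a) + sum(sample_b)
    if denominator <= 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(sample_a, sample_b)) / denominator


# Hypothetical read counts for three taxa in two samples.
gut_a = [50, 30, 20]
gut_b = [10, 10, 80]
print(shannon_index(gut_a), shannon_index(gut_b))
print(bray_curtis(gut_a, gut_b))
```

    The Shannon index is computed per sample, while Bray-Curtis compares a pair of samples and ranges from 0 (identical counts) to 1 (no shared taxa).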

  7. VIP: an integrated pipeline for metagenomics of virus identification and discovery.

    PubMed

    Li, Yang; Wang, Hao; Nie, Kai; Zhang, Chen; Zhang, Yi; Wang, Ji; Niu, Peihua; Ma, Xuejun

    2016-01-01

    Identification and discovery of viruses using next-generation sequencing technology is a fast-developing area with potential wide application in clinical diagnostics, public health monitoring and novel virus discovery. However, the tremendous volume of sequence data from NGS studies poses great challenges for both the accuracy and the speed of NGS applications. Here we describe VIP ("Virus Identification Pipeline"), a one-touch computational pipeline for virus identification and discovery from metagenomic NGS data. VIP performs the following steps to achieve its goal: (i) map and filter out background-related reads, (ii) extensive classification of reads on the basis of nucleotide and remote amino acid homology, (iii) multiple k-mer based de novo assembly and phylogenetic analysis to provide evolutionary insight. We validated the feasibility and veracity of this pipeline with sequencing results of various types of clinical samples and public datasets. VIP has also contributed to timely virus diagnosis (~10 min) in acutely ill patients, demonstrating its potential in unbiased NGS-based clinical studies that demand a short turnaround time. VIP is released under GPLv3 and is available for free download at: https://github.com/keylabivdc/VIP.

  8. VIP: an integrated pipeline for metagenomics of virus identification and discovery

    PubMed Central

    Li, Yang; Wang, Hao; Nie, Kai; Zhang, Chen; Zhang, Yi; Wang, Ji; Niu, Peihua; Ma, Xuejun

    2016-01-01

    Identification and discovery of viruses using next-generation sequencing technology is a fast-developing area with potential wide application in clinical diagnostics, public health monitoring and novel virus discovery. However, the tremendous volume of sequence data from NGS studies poses great challenges for both the accuracy and the speed of NGS applications. Here we describe VIP (“Virus Identification Pipeline”), a one-touch computational pipeline for virus identification and discovery from metagenomic NGS data. VIP performs the following steps to achieve its goal: (i) map and filter out background-related reads, (ii) extensive classification of reads on the basis of nucleotide and remote amino acid homology, (iii) multiple k-mer based de novo assembly and phylogenetic analysis to provide evolutionary insight. We validated the feasibility and veracity of this pipeline with sequencing results of various types of clinical samples and public datasets. VIP has also contributed to timely virus diagnosis (~10 min) in acutely ill patients, demonstrating its potential in unbiased NGS-based clinical studies that demand a short turnaround time. VIP is released under GPLv3 and is available for free download at: https://github.com/keylabivdc/VIP. PMID:27026381

  9. Metagenome Sequencing of the Hadza Hunter-Gatherer Gut Microbiota.

    PubMed

    Rampelli, Simone; Schnorr, Stephanie L; Consolandi, Clarissa; Turroni, Silvia; Severgnini, Marco; Peano, Clelia; Brigidi, Patrizia; Crittenden, Alyssa N; Henry, Amanda G; Candela, Marco

    2015-06-29

    Through human microbiome sequencing, we can better understand how host evolutionary and ontogenetic history is reflected in the microbial function. However, there has been no information on the gut metagenome configuration in hunter-gatherer populations, posing a gap in our knowledge of gut microbiota (GM)-host mutualism arising from a lifestyle that describes over 90% of human evolutionary history. Here, we present the first metagenomic analysis of GM from Hadza hunter-gatherers of Tanzania, showing a unique enrichment in metabolic pathways that aligns with the dietary and environmental factors characteristic of their foraging lifestyle. We found that the Hadza GM is adapted for broad-spectrum carbohydrate metabolism, reflecting the complex polysaccharides in their diet. Furthermore, the Hadza GM is equipped for branched-chain amino acid degradation and aromatic amino acid biosynthesis. Resistome functionality demonstrates the existence of antibiotic resistance genes in a population with little antibiotic exposure, indicating the ubiquitous presence of environmentally derived resistances. Our results demonstrate how the functional specificity of the GM correlates with certain environment and lifestyle factors and how complexity from the exogenous environment can be balanced by endogenous homeostasis. The Hadza gut metagenome structure allows us to appreciate the co-adaptive functional role of the GM in complementing the human physiology, providing a better understanding of the versatility of human life and subsistence. PMID:25981789

  10. Expanding the catalog of cas genes with metagenomes.

    PubMed

    Zhang, Quan; Doak, Thomas G; Ye, Yuzhen

    2014-02-01

    The CRISPR (clusters of regularly interspaced short palindromic repeats)-Cas adaptive immune system is an important defense system in bacteria, providing targeted defense against invasions of foreign nucleic acids. CRISPR-Cas systems consist of CRISPR loci and cas (CRISPR-associated) genes: sequence segments of invaders are incorporated into host genomes at CRISPR loci to generate specificity, while adjacent cas genes encode proteins that mediate the defense process. We pursued an integrated approach to identifying putative cas genes from genomes and metagenomes, combining similarity searches with genomic neighborhood analysis. Application of our approach to bacterial genomes and human microbiome datasets allowed us to significantly expand the collection of cas genes: the sequence space of the Cas9 family, the key player in the recently engineered RNA-guided platforms for genome editing in eukaryotes, is expanded by at least two-fold with metagenomic datasets. We found genes in cas loci encoding other functions, for example, toxins and antitoxins, confirming the recently discovered potential of coupling between adaptive immunity and the dormancy/suicide systems. We further identified 24 novel Cas families; one novel family contains 20 proteins, all identified from the human microbiome datasets, illustrating the importance of metagenomics projects in expanding the diversity of cas genes.

  11. Exploration of community traits as ecological markers in microbial metagenomes.

    PubMed

    Barberán, Albert; Fernández-Guerra, Antoni; Bohannan, Brendan J M; Casamayor, Emilio O

    2012-04-01

    The rate of information collection generated by metagenomics is uncoupled with its meaningful ecological interpretation. New analytical approaches based on functional trait-based ecology may help to bridge this gap and extend the trait approach to the community level in vast and complex environmental genetic data sets. Here, we explored a set of community traits that range from nucleotidic to genomic properties in 53 metagenomic aquatic samples from the Global Ocean Sampling (GOS) expedition. We found significant differences between the community profile derived from the commonly used 16S rRNA gene and from the functional trait set. The traits proved to be valuable ecological markers by discriminating between marine ecosystems (coastal vs. open ocean) and between oceans (Atlantic vs. Indian vs. Pacific). Intertrait relationships were also assessed, and we propose some that could be further used as habitat descriptors or indicators of artefacts during sample processing. Overall, the approach presented here may help to interpret metagenomics data to gain a full understanding of microbial community patterns in a rigorous ecological framework.

  12. Culture-independent discovery of natural products from soil metagenomes.

    PubMed

    Katz, Micah; Hover, Bradley M; Brady, Sean F

    2016-03-01

    Bacterial natural products have proven to be invaluable starting points in the development of many currently used therapeutic agents. Unfortunately, traditional culture-based methods for natural product discovery have been deemphasized by pharmaceutical companies due in large part to high rediscovery rates. Culture-independent, or "metagenomic," methods, which rely on the heterologous expression of DNA extracted directly from environmental samples (eDNA), have the potential to provide access to metabolites encoded by a large fraction of the earth's microbial biosynthetic diversity. As soil is both ubiquitous and rich in bacterial diversity, it is an appealing starting point for culture-independent natural product discovery efforts. This review provides an overview of the history of soil metagenome-driven natural product discovery studies and elaborates on the recent development of new tools for sequence-based, high-throughput profiling of environmental samples used in discovering novel natural product biosynthetic gene clusters. We conclude with several examples of these new tools being employed to facilitate the recovery of novel secondary metabolite encoding gene clusters from soil metagenomes and the subsequent heterologous expression of these clusters to produce bioactive small molecules.

  13. Metagenomic abundance estimation and diagnostic testing on species level

    PubMed Central

    Lindner, Martin S.; Renard, Bernhard Y.

    2013-01-01

    One goal of sequencing-based metagenomic community analysis is the quantitative taxonomic assessment of microbial community compositions. In particular, relative quantification of taxa is of high relevance for metagenomic diagnostics or microbial community comparison. However, the majority of existing approaches quantify at low resolution (e.g. at phylum level), rely on the existence of special genes (e.g. 16S), or have severe problems discerning species with highly similar genome sequences. Yet problems such as metagenomic diagnostics require accurate quantification at the species level. We developed Genome Abundance Similarity Correction (GASiC), a method to estimate true genome abundances via read alignment by considering reference genome similarities in a non-negative LASSO approach. We demonstrate GASiC’s superior performance over existing methods on simulated benchmark data as well as on real data. In addition, we present applications to datasets of both bacterial DNA and viral RNA source. We further discuss our approach as an alternative to PCR-based DNA quantification. PMID:22941661
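
    The idea of correcting observed read counts for reference genome similarity can be sketched as a small non-negative regression. This is not the GASiC implementation: it assumes a hypothetical similarity matrix in which entry [i][j] is the fraction of reads from genome j that also align to reference i, and it swaps in a plain projected-gradient solver with an optional L1 penalty for whatever solver the authors use. All names and numbers are invented for illustration.

```python
def correct_abundances(similarity, observed, l1_penalty=0.0, steps=5000):
    """Estimate non-negative abundances x minimising
    0.5 * ||similarity @ x - observed||^2 + l1_penalty * sum(x)."""
    n = len(observed)
    # 1 / squared Frobenius norm is a safe step size for projected gradient.
    step = 1.0 / sum(v * v for row in similarity for v in row)
    x = [0.0] * n
    for _ in range(steps):
        residual = [
            sum(similarity[i][j] * x[j] for j in range(n)) - observed[i]
            for i in range(n)
        ]
        for j in range(n):
            gradient = sum(similarity[i][j] * residual[i] for i in range(n))
            # Gradient step followed by projection onto x >= 0.
            x[j] = max(0.0, x[j] - step * (gradient + l1_penalty))
    return x


# Two hypothetical, highly similar genomes: 80% of the reads from one genome
# also align to the other reference.
similarity = [[1.0, 0.8],
              [0.8, 1.0]]
# Read counts that would be observed if the true abundances were 100 and 20.
observed = [116.0, 100.0]
print(correct_abundances(similarity, observed))
```

    In this toy case the raw counts (116 and 100) suggest two almost equally abundant species, whereas the corrected estimate approaches the abundances of 100 and 20 that were used to generate the counts.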

  14. ExoMeg1: a new exonuclease from metagenomic library

    PubMed Central

    Silva-Portela, Rita C. B.; Carvalho, Fabíola M.; Pereira, Carolina P. M.; de Souza-Pinto, Nadja C.; Modesti, Mauro; Fuchs, Robert P.; Agnez-Lima, Lucymara F.

    2016-01-01

    DNA repair mechanisms are responsible for maintaining the integrity of DNA and are essential to life. However, our knowledge of DNA repair mechanisms is based on model organisms such as Escherichia coli, and little is known about free-living and uncultured microorganisms. In this study, a functional screening was applied to a metagenomic library with the goal of discovering new genes involved in the maintenance of genomic integrity. One clone was identified and the sequence analysis showed an open reading frame homologous to a hypothetical protein annotated as a member of the Exo_Endo_Phos superfamily. This novel enzyme shows 3′-5′ exonuclease activity on single- and double-stranded DNA substrates and it is divalent metal-dependent, EDTA-sensitive and salt resistant. The clone carrying the hypothetical ORF was able to complement strains deficient in recombination or base excision repair, suggesting that the new enzyme may be acting on the repair of single strand breaks with 3′ blockers, which are substrates for these repair pathways. Because this is the first report of an enzyme obtained from a metagenomic approach showing exonuclease activity, it was named ExoMeg1. The metagenomic approach has proved to be a useful tool for identifying new genes of uncultured microorganisms. PMID:26815639

  15. Functional metagenomic selection of RubisCOs from uncultivated bacteria

    USGS Publications Warehouse

    Varaljay, Vanessa A; Satagopan, Sriram; North, Justin A.; Witteveen, Briana; Dourado, Manuella N.; Anantharaman, Karthik; Arbing, Mark A.; McCann, Shelley; Oremland, Ronald S.; Banfield, Jillian F.; Wrighton, Kelly C.; Tabita, F. Robert

    2016-01-01

    Ribulose 1,5-bisphosphate carboxylase/oxygenase (RubisCO) is a critical yet severely inefficient enzyme that catalyses the fixation of virtually all of the carbon found on Earth. Here, we report a functional metagenomic selection that recovers physiologically active RubisCO molecules directly from uncultivated and largely unknown members of natural microbial communities. Selection is based on CO2-dependent growth in a host strain capable of expressing environmental deoxyribonucleic acid (DNA), precluding the need for pure cultures or screening of recombinant clones for enzymatic activity. Seventeen functional RubisCO-encoded sequences were selected using DNA extracted from soil and river autotrophic enrichments, a photosynthetic biofilm and a subsurface groundwater aquifer. Notably, three related form II RubisCOs were recovered which share high sequence similarity with metagenomic scaffolds from uncultivated members of the Gallionellaceae family. One of the Gallionellaceae RubisCOs was purified and shown to possess CO2/O2 specificity typical of form II enzymes. X-ray crystallography determined that this enzyme is a hexamer, only the second form II multimer ever solved and the first RubisCO structure obtained from an uncultivated bacterium. Functional metagenomic selection leverages natural biological diversity and billions of years of evolution inherent in environmental communities, providing a new window into the discovery of CO2-fixing enzymes not previously characterized.

  16. New viruses in veterinary medicine, detected by metagenomic approaches.

    PubMed

    Belák, Sándor; Karlsson, Oskar E; Blomström, Anne-Lie; Berg, Mikael; Granberg, Fredrik

    2013-07-26

    In our world, which is faced today with exceptional environmental changes and dramatically intensifying globalisation, we are encountering challenges due to many new factors, including the emergence or re-emergence of novel, so far "unknown" infectious diseases. Although a broad arsenal of diagnostic methods is at our disposal, the majority of the conventional diagnostic tests is highly virus-specific or is targeted entirely towards a limited group of infectious agents. This specificity complicates or even hinders the detection of new or unexpected pathogens, such as new, emerging or re-emerging viruses or novel viral variants. The recently developed approaches of viral metagenomics provide an effective novel way to screen samples and detect viruses without previous knowledge of the infectious agent, thereby enabling a better diagnosis and disease control, in line with the "One World, One Health" principles (www.oneworldonehealth.org). Using metagenomic approaches, we have recently identified a broad variety of new viruses, such as novel bocaviruses, Torque Teno viruses, astroviruses, rotaviruses and kobuviruses in porcine disease syndromes, new virus variants in honeybee populations, as well as a range of other infectious agents in further host species. These findings indicate that the metagenomic detection of viral pathogens is becoming now a powerful, cultivation-independent, and useful novel diagnostic tool in veterinary diagnostic virology. PMID:23428379

  18. Forest harvesting reduces the soil metagenomic potential for biomass decomposition.

    PubMed

    Cardenas, Erick; Kranabetter, J M; Hope, Graeme; Maas, Kendra R; Hallam, Steven; Mohn, William W

    2015-11-01

    Soil is the key resource that must be managed to ensure sustainable forest productivity. Soil microbial communities mediate numerous essential ecosystem functions, and recent studies show that forest harvesting alters soil community composition. From a long-term soil productivity study site in a temperate coniferous forest in British Columbia, 21 forest soil shotgun metagenomes were generated, totaling 187 Gb. A method to analyze unassembled metagenome reads from the complex community was optimized and validated. The subsequent metagenome analysis revealed that, 12 years after forest harvesting, there were 16% and 8% reductions in relative abundances of biomass decomposition genes in the organic and mineral soil layers, respectively. Organic and mineral soil layers differed markedly in genetic potential for biomass degradation, with the organic layer having greater potential and being more strongly affected by harvesting. Gene families were disproportionately affected, and we identified 41 gene families consistently affected by harvesting, including families involved in lignin, cellulose, hemicellulose and pectin degradation. The results strongly suggest that harvesting profoundly altered below-ground cycling of carbon and other nutrients at this site, with potentially important consequences for forest regeneration. Thus, it is important to determine whether these changes foreshadow long-term changes in forest productivity or resilience and whether these changes are broadly characteristic of harvested forests. PMID:25909978

  19. Analysis of Peptidoglycan Fragment Release.

    PubMed

    Schaub, Ryan E; Lenz, Jonathan D; Dillard, Joseph P

    2016-01-01

    Most bacteria break down a significant portion of their cell wall peptidoglycan during each round of growth and cell division. This process generates peptidoglycan fragments of various sizes that can either be imported back into the cytoplasm for recycling or released from the cell. Released fragments have been shown to act as microbe-associated molecular patterns for the initiation of immune responses, as triggers for the initiation of mutualistic host-microbe relationships, and as signals for cell-cell communication in bacteria. Characterizing these released peptidoglycan fragments can, therefore, be considered an important step in understanding how microbes communicate with other organisms in their environments. In this chapter, we describe methods for labeling cell wall peptidoglycan, calculating the rate at which peptidoglycan is turned over, and collecting released peptidoglycan to determine the abundance and species of released fragments. Methods are described for both the separation of peptidoglycan fragments by size-exclusion chromatography and further detailed analysis by HPLC.

  20. Fragment Screening and HIV Therapeutics

    PubMed Central

    Bauman, Joseph D.; Patel, Disha; Arnold, Eddy

    2013-01-01

    Fragment screening has proven to be a powerful alternative to traditional methods for drug discovery. Biophysical methods, such as X-ray crystallography, NMR spectroscopy, and surface plasmon resonance, are used to screen a diverse library of small molecule compounds. Although compounds identified via this approach have relatively weak affinity, they provide a good platform for lead development and are highly efficient binders with respect to their size. Fragment screening has been utilized for a wide range of targets, including HIV-1 proteins. Here, we review the fragment screening studies targeting HIV-1 proteins using X-ray crystallography or surface plasmon resonance. These studies have successfully detected binding of novel fragments to either previously established or new sites on HIV-1 protease and reverse transcriptase. In addition, fragment screening against HIV-1 reverse transcriptase has been used as a tool to better understand the complex nature of ligand binding to a flexible target. PMID:21972022

  1. Fragment-based drug design.

    PubMed

    Feyfant, Eric; Cross, Jason B; Paris, Kevin; Tsao, Désirée H H

    2011-01-01

    Fragment-based drug design (FBDD), which comprises both fragment screening and the use of fragment hits to design leads, began more than 15 years ago and has been steadily gaining in popularity and utility. Its origin lies in the fact that the coverage of chemical space and the binding efficiency of hits are directly related to the size of the compounds screened. Nevertheless, FBDD still faces challenges, among them developing fragment screening libraries that ensure optimal coverage of chemical space, physical properties and chemical tractability. Fragment screening also requires sensitive assays, often biophysical in nature, to detect weak binders. In this chapter we will introduce the technologies used to address these challenges and outline the experimental advantages that make FBDD one of the most popular new hit-to-lead processes. PMID:20981527

  2. Clinical and legal significance of fragmentation of bullets in relation to size of wounds: retrospective analysis

    PubMed Central

    Coupland, Robin

    1999-01-01

    Objective: To examine the relation between fragmentation of bullets and size of wounds clinically and in the context of the Hague Declaration of 1899. Design: Retrospective analysis of prospectively collected data on hospital admissions. Setting: Hospitals of the International Committee of the Red Cross. Subjects: 5215 people wounded by bullets in armed conflicts (5933 wounds). Main outcome measures: Grade of wound computed from the Red Cross wound classification and presence of bullet fragments on radiography. Results: Of the 347 wounds with fragmentation of bullets, 251 (72%) were large wounds (grade 2 or 3), that is, those with a clinically detectable cavity. Of the 5586 wounds without fragmentation of bullets, 2915 (52.1%) were large wounds. Only 7.9% (251/3166) of large wounds were associated with fragmentation of bullets. Conclusions: Fragmentation of bullets is associated with large wounds, but most large wounds do not contain bullet fragments. In addition, bullet fragments may occur in wounds that are not defined as large. Fragmentation of bullets is neither a necessary nor sufficient cause of large wounds, and surgeons should not diagnose extensive tissue damage because of the presence of fragments on radiography. Such findings also do not necessarily represent the use of bullets which contravene the law of war. Future legislation should take into account not only the construction of bullets but also their potential to transfer energy to the human body. Key messages: The use of certain bullets has been prohibited in war. Wounds from bullets are caused by transfer of kinetic energy from the bullet to the tissues. The relation between size of wound and fragmentation of bullets can be examined using the Red Cross wound classification system. Fragments of bullets seen on radiographs of wounds sustained in wars do not necessarily represent large wounds or the use of illegal bullets. Existing legislation on the construction of bullets should be supplemented by legislation on
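    The proportions quoted in the Results of the bullet-fragmentation record can be recomputed from the four counts given there. The sketch below uses only those counts; the variable names are illustrative and not taken from the paper.

```python
# Counts as reported in the abstract.
wounds_with_fragments = 347
large_with_fragments = 251
wounds_without_fragments = 5586
large_without_fragments = 2915


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Totals implied by the counts.
total_wounds = wounds_with_fragments + wounds_without_fragments
total_large = large_with_fragments + large_without_fragments

# Share of wounds that are large, with and without bullet fragmentation.
large_given_fragments = percent(large_with_fragments, wounds_with_fragments)
large_given_no_fragments = percent(large_without_fragments, wounds_without_fragments)

# Share of large wounds that contain bullet fragments.
fragments_given_large = percent(large_with_fragments, total_large)

print(total_wounds, total_large)
print(f"{large_given_fragments:.2f} {large_given_no_fragments:.2f} {fragments_given_large:.2f}")
```

    The totals match the 5933 wounds and 3166 large wounds stated in the abstract. The second proportion evaluates to about 52.18%, so the reported 52.1% appears to be truncated rather than rounded; the other two agree with the reported 72% and 7.9%.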

  3. IMG/M: A data management and analysis system for metagenomes

    SciTech Connect

    Markowitz, Victor M.; Ivanova, Natalia N.; Szeto, Ernest; Palaniappan, Krishna; Chu, Ken; Dalevi, Daniel; Chen, I-Min A.; Grechkin, Yuri; Dubchak, Inna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Hugenholtz, Phil; Kyrpides, Nikos C.

    2007-08-01

    IMG/M is a data management and analysis system for microbial community genomes (metagenomes) hosted at the Joint Genome Institute (JGI). IMG/M consists of metagenome data integrated with isolate microbial genomes from the Integrated Microbial Genomes (IMG) system. IMG/M provides IMG's comparative data analysis tools extended to handle metagenome data, together with metagenome-specific analysis tools. IMG/M is available at http://img.jgi.doe.gov/m. Studies of the collective genomes (also known as metagenomes) of environmental microbial communities (also known as microbiomes) are expected to lead to advances in environmental cleanup, agriculture, industrial processes, alternative energy production, and human health (1). Metagenomes of specific microbiome samples are sequenced by organizations worldwide, such as the Department of Energy's (DOE) Joint Genome Institute (JGI), the Venter Institute and the Washington University in St. Louis using different sequencing strategies, technology platforms, and annotation procedures. According to the Genomes OnLine Database, about 28 metagenome studies have been published to date, with over 60 other projects ongoing and more in the process of being launched (2). The Department of Energy's (DOE) Joint Genome Institute (JGI) is one of the major contributors of metagenome sequence data, currently sequencing more than 50% of the reported metagenome projects worldwide. Due to the higher complexity, inherent incompleteness, and lower quality of metagenome sequence data, traditional assembly, gene prediction, and annotation methods do not perform on these datasets as well as they do on isolate microbial genome sequences (3, 4). In spite of these limitations, metagenome data are amenable to a variety of analyses, as illustrated by several recent studies (5-10). Metagenome data analysis is usually set up in the context of reference isolate genomes and considers the questions of composition and functional or metabolic potential of

  4. FAMeS: Fidelity of Analysis of Metagenomic Samples

    DOE Data Explorer

    Metagenomics is a rapidly emerging field of research for studying microbial communities. To evaluate methods currently used to process metagenomic sequences, simulated datasets of varying complexity were constructed by combining sequencing reads randomly selected from 113 isolate genomes. These datasets were designed to model real metagenomes in terms of complexity and phylogenetic composition. Assembly, gene prediction and binning, employing methods commonly used for the analysis of metagenomic datasets at the DOE JGI, were performed. This site provides access to the simulated datasets, and aims to facilitate standardized benchmarking of tools for metagenomic analysis. FAMeS now hosts data coming from a comprehensive study of methodologies used to create OTUs from 16S rRNA targeted studies of microbial communities. Studies of phylogenetic markers at the molecular level have revealed a vast biodiversity of microorganisms living in the sea, land, and even within the human body. Microbial diversity studies of uncharacterized environments typically seek to estimate the richness and diversity of endemic microflora using a 16S rRNA gene sequencing approach. When most of the species in an environment are unknown and cannot be classified through a database search, researchers cluster 16S sequences into operational taxonomic units (OTUs) or phylotypes, thereby providing an estimate of population structure. Using real 16S sequence data, we have performed a critical analysis of OTU clustering methodologies to assess the potential variability in OTU quality. FAMeS provides the sequence data, taxonomic information, multiple sequence alignments, and distance matrices used and described in the core paper, as well as compiled results of more than 700 unique OTU methods. [The above was copied from the FAMeS home page at http://fames.jgi-psf.org/] The core paper behind FAMeS is: Konstantinos Mavromatis, Natalia Ivanova, Kerrie Barry, Harris Shapiro, Eugene Goltsman, Alice C Mc
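    The clustering of 16S sequences into OTUs mentioned in the FAMeS record can be illustrated with one simple scheme, greedy centroid clustering. This is only a sketch of a single, generic approach, not one of the methods evaluated by FAMeS. It assumes sequences that are already aligned to equal length, and the function names and the identity threshold are illustrative choices (a 97% threshold is a common convention for 16S data but is not stated in the record).

```python
def identity(first, second):
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(first) != len(second):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for a, b in zip(first, second) if a == b)
    return matches / len(first)


def greedy_otus(sequences, threshold=0.97):
    """Assign each sequence to the first centroid it matches, else start a new OTU."""
    centroids = []  # one representative sequence per OTU
    clusters = []   # members of each OTU, parallel to centroids
    for sequence in sequences:
        for index, centroid in enumerate(centroids):
            if identity(sequence, centroid) >= threshold:
                clusters[index].append(sequence)
                break
        else:
            # No existing centroid is similar enough: this sequence seeds a new OTU.
            centroids.append(sequence)
            clusters.append([sequence])
    return clusters


# Toy example with short hypothetical sequences and a relaxed threshold.
reads = ["ACGTACGTAC", "ACGTACGTAA", "TTTTTTTTTT"]
otus = greedy_otus(reads, threshold=0.8)
print(len(otus), [len(members) for members in otus])
```

    Because the result depends on input order and on the threshold, different clustering choices can produce different OTU counts from the same data, which is the kind of variability the record describes.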

  5. Metagenomic Insights into the Uncultured Diversity and Physiology of Microbes in Four Hypersaline Soda Lake Brines.

    PubMed

    Vavourakis, Charlotte D; Ghai, Rohit; Rodriguez-Valera, Francisco; Sorokin, Dimitry Y; Tringe, Susannah G; Hugenholtz, Philip; Muyzer, Gerard

    2016-01-01

    Soda lakes are salt lakes with a naturally alkaline pH due to evaporative concentration of sodium carbonates in the absence of major divalent cations. Hypersaline soda brines harbor microbial communities with a high species- and strain-level archaeal diversity and a large proportion of still uncultured poly-extremophiles compared to neutral brines of similar salinities. We present the first "metagenomic snapshots" of microbial communities thriving in the brines of four shallow soda lakes from the Kulunda Steppe (Altai, Russia) covering a salinity range from 170 to 400 g/L. Both amplicon sequencing of 16S rRNA fragments and direct metagenomic sequencing showed that the top-level taxa abundance was linked to the ambient salinity: Bacteroidetes, Alpha-, and Gamma-proteobacteria were dominant below a salinity of 250 g/L, Euryarchaeota at higher salinities. Within these taxa, amplicon sequences related to Halorubrum, Natrinema, Gracilimonas, purple non-sulfur bacteria (Rhizobiales, Rhodobacter, and Rhodobaca) and chemolithotrophic sulfur oxidizers (Thioalkalivibrio) were highly abundant. Twenty-four draft population genomes from novel members and ecotypes within the Nanohaloarchaea, Halobacteria, and Bacteroidetes were reconstructed to explore their metabolic features, environmental abundance and strategies for osmotic adaptation. The Halobacteria- and Bacteroidetes-related draft genomes belong to putative aerobic heterotrophs, likely with the capacity to ferment sugars in the absence of oxygen. Members from both taxonomic groups are likely involved in primary organic carbon degradation, since some of the reconstructed genomes encode the ability to hydrolyze recalcitrant substrates, such as cellulose and chitin. Putative sodium-pumping rhodopsins were found in both a Flavobacteriaceae- and a Chitinophagaceae-related draft genome. The predicted proteomes of both the latter and a Rhodothermaceae-related draft genome were indicative of a "salt-in" strategy of osmotic

  6. Metagenomic Insights into the Uncultured Diversity and Physiology of Microbes in Four Hypersaline Soda Lake Brines

    PubMed Central

    Vavourakis, Charlotte D.; Ghai, Rohit; Rodriguez-Valera, Francisco; Sorokin, Dimitry Y.; Tringe, Susannah G.; Hugenholtz, Philip; Muyzer, Gerard

    2016-01-01

    Soda lakes are salt lakes with a naturally alkaline pH due to evaporative concentration of sodium carbonates in the absence of major divalent cations. Hypersaline soda brines harbor microbial communities with a high species- and strain-level archaeal diversity and a large proportion of still uncultured poly-extremophiles compared to neutral brines of similar salinities. We present the first “metagenomic snapshots” of microbial communities thriving in the brines of four shallow soda lakes from the Kulunda Steppe (Altai, Russia) covering a salinity range from 170 to 400 g/L. Both amplicon sequencing of 16S rRNA fragments and direct metagenomic sequencing showed that the top-level taxa abundance was linked to the ambient salinity: Bacteroidetes, Alpha-, and Gamma-proteobacteria were dominant below a salinity of 250 g/L, Euryarchaeota at higher salinities. Within these taxa, amplicon sequences related to Halorubrum, Natrinema, Gracilimonas, purple non-sulfur bacteria (Rhizobiales, Rhodobacter, and Rhodobaca) and chemolithotrophic sulfur oxidizers (Thioalkalivibrio) were highly abundant. Twenty-four draft population genomes from novel members and ecotypes within the Nanohaloarchaea, Halobacteria, and Bacteroidetes were reconstructed to explore their metabolic features, environmental abundance and strategies for osmotic adaptation. The Halobacteria- and Bacteroidetes-related draft genomes belong to putative aerobic heterotrophs, likely with the capacity to ferment sugars in the absence of oxygen. Members from both taxonomic groups are likely involved in primary organic carbon degradation, since some of the reconstructed genomes encode the ability to hydrolyze recalcitrant substrates, such as cellulose and chitin. Putative sodium-pumping rhodopsins were found in both a Flavobacteriaceae- and a Chitinophagaceae-related draft genome. The predicted proteomes of both the latter and a Rhodothermaceae-related draft genome were indicative of a “salt-in” strategy of

  8. Exploring antibiotic resistance genes and metal resistance genes in plasmid metagenomes from wastewater treatment plants.

    PubMed

    Li, An-Dong; Li, Li-Guan; Zhang, Tong

    2015-01-01

    Plasmids operate as independent genetic elements in microorganism communities. Through horizontal gene transfer (HGT), they can provide their host microorganisms with important functions such as antibiotic resistance and heavy metal resistance. In this study, six metagenomic libraries were constructed with plasmid DNA extracted from influent, activated sludge (AS) and digested sludge (DS) of two wastewater treatment plants (WWTPs). Compared with the metagenomes of the total DNA extracted from the same sectors of the wastewater treatment plant, the plasmid metagenomes had significantly higher annotation rates, indicating that the functional genes on plasmids are commonly shared by those studied microorganisms. Meanwhile, the plasmid metagenomes also encoded many more genes related to defense mechanisms, including antibiotic resistance genes (ARGs). Searching against an ARG database and a metal resistance gene (MRG) database revealed a broad spectrum of ARGs (323 out of a total of 618 subtypes) and MRGs (23 out of a total of 23 types) on these plasmid metagenomes. The influent plasmid metagenomes contained many more resistance genes (both ARGs and MRGs) than the AS and the DS metagenomes. Sixteen novel plasmids with a complete circular structure that carried these resistance genes were assembled from the plasmid metagenomes. The results of this study demonstrated that the plasmids in WWTPs could be important reservoirs for resistance genes, and may play a significant role in the horizontal transfer of these genes. PMID:26441947

  9. Metagenomes obtained by 'deep sequencing' - what do they tell about the enhanced biological phosphorus removal communities?

    PubMed

    Albertsen, Mads; Saunders, Aaron M; Nielsen, Kåre L; Nielsen, Per H

    2013-01-01

    Metagenomics enables studies of the genomic potential of complex microbial communities by sequencing bulk genomic DNA directly from the environment. Knowledge of the genetic potential of a community can be used to formulate and test ecological hypotheses about stability and performance. In this study deep metagenomics and fluorescence in situ hybridization (FISH) were used to study a full-scale wastewater treatment plant with enhanced biological phosphorus removal (EBPR), and the results were compared to an existing EBPR metagenome. EBPR is a widely used process that relies on a complex community of microorganisms to function properly. Insight into community and species level stability and dynamics is valuable for knowledge-driven optimization of the EBPR process. The metagenomes of the EBPR communities were distinct compared to metagenomes of communities from a wide range of other environments, which could be attributed to selection pressures of the EBPR process. The metabolic potential of one of the key microorganisms in the EBPR process, Accumulibacter, was investigated in more detail in the two plants, revealing a potential importance of phage predation on the dynamics of Accumulibacter populations. The results demonstrate that metagenomics can be used as a powerful tool for system-wide characterization of the EBPR community as well as for a deeper understanding of the function of specific community members. Furthermore, we discuss and illustrate some of the general pitfalls in metagenomics and stress the need for additional DNA-extraction-independent information in metagenome studies.

  10. Bead-beating artefacts in the Bacteroidetes to Firmicutes ratio of the human stool metagenome.

    PubMed

    Vebø, Heidi C; Karlsson, Magdalena Kauczynska; Avershina, Ekaterina; Finnby, Lene; Rudi, Knut

    2016-10-01

    We evaluated bead-beating cell-lysis in analysing the human stool metagenome, since this is a key step. We observed that two different bead-beating instruments from the same producer gave a three-fold difference in the Bacteroidetes to Firmicutes ratio. This illustrates that bead-beating can have a major impact on downstream metagenome analyses. PMID:27498349

  11. A metagenomic snapshot of taxonomic and functional diversity in an alpine glacier cryoconite ecosystem

    NASA Astrophysics Data System (ADS)

    Edwards, Arwyn; Pachebat, Justin A.; Swain, Martin; Hegarty, Matt; Hodson, Andrew J.; Irvine-Fynn, Tristram D. L.; Rassner, Sara M. E.; Sattler, Birgit

    2013-09-01

    Cryoconite is a microbe-mineral aggregate which darkens the ice surface of glaciers. Microbial process and marker gene PCR-dependent measurements reveal active and diverse cryoconite microbial communities on polar glaciers. Here, we provide the first report of a cryoconite metagenome and culture-independent study of alpine cryoconite microbial diversity. We assembled 1.2 Gbp of metagenomic DNA sequenced using an Illumina HiScanSQ from cryoconite holes across the ablation zone of Rotmoosferner in the Austrian Alps. The metagenome revealed a bacterially-dominated community, with Proteobacteria (62% of bacterial-assigned contigs) and Bacteroidetes (14%) considerably more abundant than Cyanobacteria (2.5%). Streptophyte DNA dominated the eukaryotic metagenome. Functional genes linked to N, Fe, S and P cycling illustrated an acquisitive trend and a nitrogen cycle based upon efficient ammonia recycling. A comparison of 32 metagenome datasets revealed a similarity in functional profiles between the cryoconite and metagenomes characterized from other cold microbe-mineral aggregates. Overall, the metagenomic snapshot reveals the cryoconite ecosystem of this alpine glacier as dependent on scavenging carbon and nutrients from allochthonous sources, in particular mosses transported by wind from ice-marginal habitats, consistent with net heterotrophy indicated by productivity measurements. A transition from singular snapshots of cryoconite metagenomes to comparative analyses is advocated.

  12. Improved metagenome screening efficiency by random insertion of T7 promoters.

    PubMed

    Kim, Yu Jung; Kim, Haseong; Kim, Seo Hyeon; Rha, Eugene; Choi, Su-Lim; Yeom, Soo-Jin; Kim, Hak-Sung; Lee, Seung-Goo

    2016-07-20

    Metagenomes constitute a major source for the identification of novel enzymes for industrial applications. However, current functional screening methods are hindered by the limited transcription efficiency of foreign metagenomic genes. To overcome this constraint, we introduced the 'Enforced Transcription' technique, which involves the random insertion of the bi-directional T7 promoter into a metagenomic fosmid library. Then the effect of enforced transcription was quantitatively assessed by screening for metagenomic lipolytic genes encoding enzymes whose catalytic activity forms halos on tributyrin agar plates. The metagenomic library containing the enforced transcription system yielded a significantly increased number of screening hits with lipolytic activity compared to the library without random T7 promoter insertions. Additional sequence analysis revealed that the hits from the enforced transcription library had greater genetic diversity than those from the original metagenome library. Enhancing heterologous expression using the T7 promoter should enable the identification of greater numbers of diverse novel biocatalysts from the metagenome than possible using conventional metagenome screening approaches. PMID:27239964

  13. The great screen anomaly--a new frontier in product discovery through functional metagenomics.

    PubMed

    Ekkers, David Matthias; Cretoiu, Mariana Silvia; Kielak, Anna Maria; Elsas, Jan Dirk van

    2012-02-01

    Functional metagenomics, the study of the collective genome of a microbial community by expressing it in a foreign host, is an emerging field in biotechnology. Over the past years, the possibility of novel product discovery through metagenomics has developed rapidly. Thus, metagenomics has been heralded as a promising mining strategy of resources for the biotechnological and pharmaceutical industry. However, in spite of innovative work in the field of functional genomics in recent years, yields from function-based metagenomics studies still fall short of producing significant amounts of new products that are valuable for biotechnological processes. Thus, a new set of strategies is required with respect to fostering gene expression in comparison to the traditional work. These new strategies should address a major issue, that is, how to successfully express a set of unknown genes of unknown origin in a foreign host in high throughput. This article is an opinionating review of functional metagenomic screening of natural microbial communities, with a focus on the optimization of new product discovery. It first summarizes current major bottlenecks in functional metagenomics and then provides an overview of the general metagenomic assessment strategies, with a focus on the challenges that are met in the screening for, and selection of, target genes in metagenomic libraries. To identify possible screening limitations, strategies to achieve optimal gene expression are reviewed, examining the molecular events all the way from the transcription level through to the secretion of the target gene product.

  14. Exploring antibiotic resistance genes and metal resistance genes in plasmid metagenomes from wastewater treatment plants

    PubMed Central

    Li, An-Dong; Li, Li-Guan; Zhang, Tong

    2015-01-01

    Plasmids operate as independent genetic elements in microbial communities. Through horizontal gene transfer (HGT), they can provide their host microorganisms with important functions such as antibiotic resistance and heavy metal resistance. In this study, six metagenomic libraries were constructed with plasmid DNA extracted from influent, activated sludge (AS) and digested sludge (DS) of two wastewater treatment plants (WWTPs). Compared with the metagenomes of the total DNA extracted from the same sectors of the wastewater treatment plant, the plasmid metagenomes had significantly higher annotation rates, indicating that the functional genes on plasmids are commonly shared by the studied microorganisms. Meanwhile, the plasmid metagenomes also encoded many more genes related to defense mechanisms, including antibiotic resistance genes (ARGs). Searching against an ARG database and a metal resistance gene (MRG) database revealed a broad spectrum of ARGs (323 out of a total of 618 subtypes) and MRGs (23 out of a total of 23 types) on these plasmid metagenomes. The influent plasmid metagenomes contained many more resistance genes (both ARGs and MRGs) than the AS and the DS metagenomes. Sixteen novel plasmids with a complete circular structure that carried these resistance genes were assembled from the plasmid metagenomes. The results of this study demonstrated that the plasmids in WWTPs could be important reservoirs for resistance genes, and may play a significant role in the horizontal transfer of these genes. PMID:26441947

  15. Fragmentation and densities of meteoroids

    NASA Technical Reports Server (NTRS)

    Babadzhanov, Pulat B.

    1992-01-01

    Photographic observations of meteors carried out in Dushanbe by the method of instantaneous exposure have shown clearly that meteoroids entering the Earth's atmosphere are subjected to different types of fragmentation. Quasi-continuous fragmentation is the most widespread type. Using the physical theory of meteors, which takes into account the quasi-continuous fragmentation of meteoroids, and on the basis of meteor light curves, the densities of meteoroids of different streams have been determined. The results enable us to conclude that the densities of meteoroids are over an order of magnitude higher than previously assumed. Moreover, they are close to the densities of carbonaceous and ordinary chondrites.

  16. Velocity fluctuations of fission fragments

    NASA Astrophysics Data System (ADS)

    Llanes-Estrada, Felipe J.; Carmona, Belén Martínez; Martínez, Jose L. Muñoz

    2016-02-01

    We propose event-by-event velocity fluctuations of nuclear fission fragments as an additional observable that gives access to the nuclear temperature independently of spectral measurements and relates the diffusion and friction coefficients for the relative fragment coordinate in Kramers-like models (in which some aspects of fission can be understood as the diffusion of a collective variable through a potential barrier). We point out that neutron emission by the heavy fragments can be treated in effective theory if corrections to the velocity distribution are needed.

  17. Woods: A fast and accurate functional annotator and classifier of genomic and metagenomic sequences.

    PubMed

    Sharma, Ashok K; Gupta, Ankit; Kumar, Sanjiv; Dhakan, Darshan B; Sharma, Vineet K

    2015-07-01

    Functional annotation of very large metagenomic datasets is a major time-consuming and computationally demanding task and is currently a bottleneck for efficient analysis. The commonly used homology-based methods to functionally annotate and classify proteins are extremely slow. Therefore, to achieve faster and more accurate functional annotation, we have developed an orthology-based functional classifier, 'Woods', by using a combination of machine learning and similarity-based approaches. Woods displayed a precision of 98.79% on an independent genomic dataset, 96.66% on a simulated metagenomic dataset and >97% on two real metagenomic datasets. In addition, it performed >87 times faster than BLAST on the two real metagenomic datasets. Woods can be used as a highly efficient and accurate classifier with high-throughput capability, which facilitates its use on large metagenomic datasets. PMID:25863333

  18. Metagenomic exploration of the bacterial community structure at Paradip Port, Odisha, India

    PubMed Central

    Pramanik, Arnab; Basak, Pijush; Banerjee, Satabdi; Sengupta, Sanghamitra; Chattopadhyay, Dhrubajyoti; Bhattacharyya, Maitree

    2015-01-01

    This is a pioneering report on the metagenomic exploration of the bacterial diversity from a busy sea port in Paradip, Odisha, India. In our study, high-throughput sequencing of the community 16S rRNA gene amplicon was performed using the 454 GS Junior platform. The metagenome contains 34,121 sequences with 16,677,333 bp and 56.3% G + C content. The metagenome sequence data are now available at NCBI under the Sequence Read Archive (SRA) database with accession no. SRX897055. The community metagenome sequences revealed the presence of 11,705 species belonging to 40 different phyla. Bacteroidetes (23%), Firmicutes (19%), Proteobacteria (17%), Spirochaetes (10%), Nitrospirae (8%), Actinobacteria (7%) and Acidobacteria (3%) are the predominant bacterial phyla in this port soil. Analysis of the metagenomic sequences revealed an interesting distribution of several phyla, which pointed to significant anthropogenic intervention influencing the bacterial community character of this port. PMID:26981374
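
    The summary statistics quoted in this abstract (sequence count, total bases, G + C content) are simple to compute from a set of reads. The sketch below is illustrative only: the function names and the toy reads are assumptions made here, not part of the study, and only the two totals (34,121 sequences, 16,677,333 bp) come from the abstract.

```python
def gc_percent(sequences):
    """Percentage of G and C bases across all sequences (case-insensitive)."""
    total = sum(len(s) for s in sequences)
    if total == 0:
        raise ValueError("no bases to count")
    gc = sum(s.upper().count("G") + s.upper().count("C") for s in sequences)
    return 100.0 * gc / total


def mean_length(n_sequences, total_bases):
    """Average sequence length in base pairs."""
    return total_bases / n_sequences


# Toy reads, invented for illustration (not data from the study).
toy_reads = ["GGCC", "ATAT", "GCAT"]
toy_gc = gc_percent(toy_reads)

# Totals reported in the abstract.
reported_mean = mean_length(34_121, 16_677_333)

print(f"toy G + C content: {toy_gc:.1f}%")
print(f"mean read length from reported totals: {reported_mean:.1f} bp")
```

    With the reported totals, the mean sequence length works out to roughly 489 bp.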

  19. Soil-specific limitations for access and analysis of soil microbial communities by metagenomics.

    PubMed

    Lombard, Nathalie; Prestat, Emmanuel; van Elsas, Jan Dirk; Simonet, Pascal

    2011-10-01

    Metagenomics approaches represent an important way to acquire information on the microbial communities present in complex environments like soil. However, to what extent do these approaches provide us with a true picture of soil microbial diversity? Soil is a challenging environment to work with. Its physicochemical properties affect microbial distributions inside the soil matrix, metagenome extraction and its subsequent analyses. To better understand the bias inherent to soil metagenome 'processing', we focus on soil physicochemical properties and their effects on the perceived bacterial distribution. In the light of this information, each step of soil metagenome processing is then discussed, with an emphasis on strategies for optimal soil sampling. Then, the interaction of cells and DNA with the soil matrix and the consequences for microbial DNA extraction are examined. Soil DNA extraction methods are compared and the veracity of the microbial profiles obtained is discussed. Finally, soil metagenomic sequence analysis and exploitation methods are reviewed.

  20. Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes.

    PubMed

    Nielsen, H Bjørn; Almeida, Mathieu; Juncker, Agnieszka Sierakowska; Rasmussen, Simon; Li, Junhua; Sunagawa, Shinichi; Plichta, Damian R; Gautier, Laurent; Pedersen, Anders G; Le Chatelier, Emmanuelle; Pelletier, Eric; Bonde, Ida; Nielsen, Trine; Manichanh, Chaysavanh; Arumugam, Manimozhiyan; Batto, Jean-Michel; Quintanilha Dos Santos, Marcelo B; Blom, Nikolaj; Borruel, Natalia; Burgdorf, Kristoffer S; Boumezbeur, Fouad; Casellas, Francesc; Doré, Joël; Dworzynski, Piotr; Guarner, Francisco; Hansen, Torben; Hildebrand, Falk; Kaas, Rolf S; Kennedy, Sean; Kristiansen, Karsten; Kultima, Jens Roat; Léonard, Pierre; Levenez, Florence; Lund, Ole; Moumen, Bouziane; Le Paslier, Denis; Pons, Nicolas; Pedersen, Oluf; Prifti, Edi; Qin, Junjie; Raes, Jeroen; Sørensen, Søren; Tap, Julien; Tims, Sebastian; Ussery, David W; Yamada, Takuji; Renault, Pierre; Sicheritz-Ponten, Thomas; Bork, Peer; Wang, Jun; Brunak, Søren; Ehrlich, S Dusko

    2014-08-01

    Most current approaches for analyzing metagenomic data rely on comparisons to reference genomes, but the microbial diversity of many environments extends far beyond what is covered by reference databases. De novo segregation of complex metagenomic data into specific biological entities, such as particular bacterial strains or viruses, remains a largely unsolved problem. Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences. We demonstrate the method on data from 396 human gut microbiome samples and identify 7,381 co-abundance gene groups (CAGs), including 741 metagenomic species (MGS). We use these to assemble 238 high-quality microbial genomes and identify affiliations between MGS and hundreds of viruses or genetic entities. Our method provides the means for comprehensive profiling of the diversity within complex metagenomic samples.
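
    The idea of binning co-abundant genes can be illustrated with a small sketch. This is not the authors' algorithm; it is a simplified greedy grouping by Pearson correlation of abundance profiles across samples. The 0.9 threshold, the function names and the toy abundance values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no co-abundance signal
    return cov / sqrt(vx * vy)


def group_coabundant(profiles, threshold=0.9):
    """Greedy grouping: each unassigned gene seeds a group and collects
    every remaining gene whose profile correlates with the seed at or
    above the threshold (hypothetical value)."""
    unassigned = list(profiles)
    groups = []
    while unassigned:
        seed = unassigned.pop(0)
        group, rest = [seed], []
        for gene in unassigned:
            if pearson(profiles[seed], profiles[gene]) >= threshold:
                group.append(gene)
            else:
                rest.append(gene)
        unassigned = rest
        groups.append(group)
    return groups


# Toy abundances of four genes across four samples (invented values).
profiles = {
    "geneA": [10, 20, 30, 40],
    "geneB": [11, 19, 31, 41],
    "geneC": [40, 30, 20, 10],
    "geneD": [39, 31, 19, 11],
}
groups = group_coabundant(profiles)
print(groups)
```

    Genes whose abundances rise and fall together across samples end up in the same group, which is the intuition behind treating such groups as candidates for a single biological entity.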