Sample records for ultra high-throughput sequencing

  1. Single-cell genome sequencing at ultra-high-throughput with microfluidic droplet barcoding.

    PubMed

    Lan, Freeman; Demaree, Benjamin; Ahmed, Noorsher; Abate, Adam R

    2017-07-01

    The application of single-cell genome sequencing to large cell populations has been hindered by technical challenges in isolating single cells during genome preparation. Here we present single-cell genomic sequencing (SiC-seq), which uses droplet microfluidics to isolate, fragment, and barcode the genomes of single cells, followed by Illumina sequencing of pooled DNA. We demonstrate ultra-high-throughput sequencing of >50,000 cells per run in a synthetic community of Gram-negative and Gram-positive bacteria and fungi. The sequenced genomes can be sorted in silico based on characteristic sequences. We use this approach to analyze the distributions of antibiotic-resistance genes, virulence factors, and phage sequences in microbial communities from an environmental sample. The ability to routinely sequence large populations of single cells will enable the de-convolution of genetic heterogeneity in diverse cell populations.
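
    The in silico sorting step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each read begins with a fixed-length cell barcode; the abstract does not state the barcode length or position, so `barcode_len` and the read layout are hypothetical.

```python
from collections import defaultdict

def group_reads_by_barcode(reads, barcode_len=8):
    """Group reads by a leading barcode (hypothetical layout: barcode first, insert after)."""
    groups = defaultdict(list)
    for read in reads:
        # Skip reads too short to hold a barcode plus at least one insert base.
        if len(read) <= barcode_len:
            continue
        barcode, insert = read[:barcode_len], read[barcode_len:]
        groups[barcode].append(insert)
    return dict(groups)

# Toy reads: two share one barcode, one has another, one is too short.
reads = ["AAAACCCCGATTACA", "AAAACCCCTTGACC", "GGGGTTTTACGTAC", "AAAA"]
cells = group_reads_by_barcode(reads)
```

    Each key of `cells` then stands for one droplet, and its reads can be analyzed as a single low-coverage genome. A real pipeline would also need to tolerate sequencing errors inside the barcode, which this sketch ignores.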

  2. Multiplexed Fragaria chloroplast genome sequencing

    Treesearch

    W. Njuguna; A. Liston; R. Cronn; N.V. Bassil

    2010-01-01

    A method to sequence multiple chloroplast genomes using ultra high throughput sequencing technologies was recently described. Complete chloroplast genome sequences can resolve phylogenetic relationships at low taxonomic levels and identify informative point mutations and indels. The objective of this research was to sequence multiple Fragaria...

  3. Ultra-barcoding in cacao (Theobroma spp.; Malvaceae) using whole chloroplast genomes and nuclear ribosomal DNA

    USDA-ARS's Scientific Manuscript database

    High-throughput next-generation sequencing was used to scan the genome and generate reliable sequence of high copy number regions. Using this method, we examined whole plastid genomes as well as nearly 6000 bases of nuclear ribosomal DNA sequences for nine genotypes of Theobroma cacao and an indivi...

  4. Improved Selection of Internal Transcribed Spacer-Specific Primers Enables Quantitative, Ultra-High-Throughput Profiling of Fungal Communities

    PubMed Central

    Bokulich, Nicholas A.

    2013-01-01

    Ultra-high-throughput sequencing (HTS) of fungal communities has been restricted by short read lengths and primer amplification bias, slowing the adoption of newer sequencing technologies to fungal community profiling. To address these issues, we evaluated the performance of several common internal transcribed spacer (ITS) primers and designed a novel primer set and work flow for simultaneous quantification and species-level interrogation of fungal consortia. Primer comparison and validation were predicted in silico and by sequencing a “mock community” of mixed yeast species to explore the challenges of amplicon length and amplification bias for reconstructing defined yeast community structures. The amplicon size and distribution of this primer set are smaller than for all preexisting ITS primer sets, maximizing sequencing coverage of hypervariable ITS domains by very-short-amplicon, high-throughput sequencing platforms. This feature also enables the optional integration of quantitative PCR (qPCR) directly into the HTS preparatory work flow by substituting qPCR with these primers for standard PCR, yielding quantification of individual community members. The complete work flow described here, utilizing any of the qualified primer sets evaluated, can rapidly profile mixed fungal communities and capably reconstructed well-characterized beer and wine fermentation fungal communities. PMID:23377949

  5. Next Generation Sequencing at the University of Chicago Genomics Core

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Faber, Pieter

    2013-04-24

    The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to state-of-the-art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and microarrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.

  6. High-throughput sequence alignment using Graphics Processing Units

    PubMed Central

    Schatz, Michael C; Trapnell, Cole; Delcher, Arthur L; Varshney, Amitabh

    2007-01-01

    Background The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. Results This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. Conclusion MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU. PMID:18070356

  7. Ultra high-throughput nucleic acid sequencing as a tool for virus discovery in the turkey gut.

    USDA-ARS's Scientific Manuscript database

    Recently, the use of the next generation of nucleic acid sequencing technology (i.e., 454 pyrosequencing, as developed by Roche/454 Life Sciences) has allowed an in-depth look at the uncultivated microorganisms present in complex environmental samples, including samples with agricultural importance....

  8. A technological update of molecular diagnostics for infectious diseases

    PubMed Central

    Liu, Yu-Tsueng

    2008-01-01

    Identification of a causative pathogen is essential for the choice of treatment for most infectious diseases. Many FDA-approved molecular assays, usually more sensitive and specific than traditional tests, have been developed in the last decade. A new trend toward high-throughput and multiplexed assays is emerging thanks to technological developments from the human genome sequencing project. The applications of microarray and ultra high throughput sequencing technologies for diagnostic microbiology are reviewed. The race for the $1000 genome technology by 2014 will have a profound impact on the diagnosis and treatment of infectious diseases in the near future. PMID:18782035

  9. Enabling systematic interrogation of protein-protein interactions in live cells with a versatile ultra-high-throughput biosensor platform | Office of Cancer Genomics

    Cancer.gov

    The vast datasets generated by next generation gene sequencing and expression profiling have transformed biological and translational research. However, technologies to produce large-scale functional genomics datasets, such as high-throughput detection of protein-protein interactions (PPIs), are still in early development. While a number of powerful technologies have been employed to detect PPIs, a singular PPI biosensor platform featured with both high sensitivity and robustness in a mammalian cell environment remains to be established.

  10. Ultra-deep mutant spectrum profiling: improving sequencing accuracy using overlapping read pairs.

    PubMed

    Chen-Harris, Haiyin; Borucki, Monica K; Torres, Clinton; Slezak, Tom R; Allen, Jonathan E

    2013-02-12

    High throughput sequencing is beginning to make a transformative impact in the area of viral evolution. Deep sequencing has the potential to reveal the mutant spectrum within a viral sample at high resolution, thus enabling the close examination of viral mutational dynamics both within and between hosts. The challenge, however, is to accurately model the errors in the sequencing data and differentiate real viral mutations, particularly those that exist at low frequencies, from sequencing errors. We demonstrate that overlapping read pairs (ORP), generated by combining short fragment sequencing libraries and longer sequencing reads, significantly reduce sequencing error rates and improve rare variant detection accuracy. Using this sequencing protocol and an error model optimized for variant detection, we are able to capture a large number of genetic mutations present within a viral population at ultra-low frequency levels (<0.05%). Our rare variant detection strategies have important implications beyond viral evolution and can be applied to any basic and clinical research area that requires the identification of rare mutations.
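
    To see why overlapping read pairs help, consider a simplified model that is not the authors' error model: both reads cover the same bases, their errors are independent with per-base rate e, and a miscalled base is equally likely to be any of the three wrong bases. A false variant then survives only when both reads report the same wrong base, which happens with probability e * (e / 3). The sketch below, with hypothetical function names, masks disagreements and evaluates that expression.

```python
def orp_consensus(read1, read2):
    """Keep bases on which two same-orientation, fully overlapping reads agree; mask the rest as N."""
    if len(read1) != len(read2):
        raise ValueError("reads must cover the same positions")
    return "".join(a if a == b else "N" for a, b in zip(read1, read2))

def concordant_error_rate(e):
    """Probability that both reads show the SAME wrong base, under the stated assumptions."""
    return e * (e / 3)

consensus = orp_consensus("ACGTAC", "ACGAAC")   # the reads disagree at position 3
residual = concordant_error_rate(0.01)          # a 1% raw per-base error rate
```

    Under these assumptions a 1% raw error rate drops to roughly 3.3 per 100,000 bases, which makes variants at frequencies below 0.05% easier to distinguish from noise. Real errors are neither fully independent nor uniform, so this is an optimistic bound.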

  11. 76 FR 28990 - Ultra High Throughput Sequencing for Clinical Diagnostic Applications-Approaches To Assess...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-05-19

    ...: 900). If you have never attended a Connect Pro meeting before, test your connection at: https://collaboration.fda.gov/common/help/en/support/meeting_test.htm . To get a quick overview of the Connect Pro... technologies are currently extensively used in research and are entering clinical diagnostic use; they are...

  12. SUGAR: graphical user interface-based data refiner for high-throughput DNA sequencing.

    PubMed

    Sato, Yukuto; Kojima, Kaname; Nariai, Naoki; Yamaguchi-Kabata, Yumi; Kawai, Yosuke; Takahashi, Mamoru; Mimori, Takahiro; Nagasaki, Masao

    2014-08-08

    Next-generation sequencers (NGSs) have become one of the main tools of current biology. To obtain useful insights from NGS data, it is essential to control for low-quality portions of the data affected by technical errors such as air bubbles in the sequencing fluidics. We developed SUGAR (subtile-based GUI-assisted refiner), software that can handle ultra-high-throughput data with a user-friendly graphical user interface (GUI) and interactive analysis capability. SUGAR generates high-resolution quality heatmaps of the flowcell, enabling users to find possible signals of technical errors during sequencing. The sequencing data generated from the error-affected regions of a flowcell can be selectively removed by automated analysis or GUI-assisted operations implemented in SUGAR. The automated data-cleaning function, based on sequence read quality (Phred) scores, was applied to public whole human genome sequencing data, and the overall mapping quality improved. The detailed data evaluation and cleaning enabled by SUGAR would reduce technical problems in sequence read mapping, improving subsequent variant analysis that requires high-quality sequence data and mapping results. Therefore, the software will be especially useful for controlling the quality of variant calls for low-population cells, e.g., cancers, in a sample affected by technical errors in the sequencing procedure.
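
    The idea of flagging flowcell regions by Phred score can be sketched as follows. This is not SUGAR's implementation: the tile labels, the Phred+33 encoding, and the mean-quality threshold of 20 are all assumptions chosen for illustration. The only fixed fact used is the Phred definition, Q = -10 * log10(p), where p is the probability that a base call is wrong.

```python
from collections import defaultdict

def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string (Phred+33 encoding assumed) into integer scores."""
    return [ord(ch) - offset for ch in quality_string]

def error_probability(q):
    """Invert Q = -10 * log10(p) to get the base-call error probability."""
    return 10 ** (-q / 10)

def low_quality_tiles(reads, min_mean_q=20.0):
    """Return tiles whose mean base quality is below min_mean_q (hypothetical threshold)."""
    totals = defaultdict(lambda: [0, 0])  # tile -> [sum of scores, number of bases]
    for tile, quality_string in reads:
        scores = phred_scores(quality_string)
        totals[tile][0] += sum(scores)
        totals[tile][1] += len(scores)
    return sorted(t for t, (s, n) in totals.items() if n and s / n < min_mean_q)

# Toy input: (tile label, quality string) pairs; the labels are made up.
reads = [("tile_1101", "IIII"), ("tile_1101", "II5I"),
         ("tile_1102", "!!5!"), ("tile_1102", "5!!!")]
flagged = low_quality_tiles(reads)
```

    Reads from the flagged tiles could then be dropped before mapping. SUGAR itself works at subtile resolution and offers interactive inspection, neither of which this sketch attempts.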

  13. Multiplex amplification of large sets of human exons.

    PubMed

    Porreca, Gregory J; Zhang, Kun; Li, Jin Billy; Xie, Bin; Austin, Derek; Vassallo, Sara L; LeProust, Emily M; Peck, Bill J; Emig, Christopher J; Dahl, Fredrik; Gao, Yuan; Church, George M; Shendure, Jay

    2007-11-01

    A new generation of technologies is poised to reduce DNA sequencing costs by several orders of magnitude. But our ability to fully leverage the power of these technologies is crippled by the absence of suitable 'front-end' methods for isolating complex subsets of a mammalian genome at a scale that matches the throughput at which these platforms will routinely operate. We show that targeting oligonucleotides released from programmable microarrays can be used to capture and amplify approximately 10,000 human exons in a single multiplex reaction. Additionally, we show integration of this protocol with ultra-high-throughput sequencing for targeted variation discovery. Although the multiplex capture reaction is highly specific, we found that nonuniform capture is a key issue that will need to be resolved by additional optimization. We anticipate that highly multiplexed methods for targeted amplification will enable the comprehensive resequencing of human exons at a fraction of the cost of whole-genome resequencing.

  14. The Genome Sequencer FLX System--longer reads, more applications, straight forward bioinformatics and more complete data sets.

    PubMed

    Droege, Marcus; Hill, Brendon

    2008-08-31

    The Genome Sequencer FLX System (GS FLX), powered by 454 Sequencing, is a next-generation DNA sequencing technology featuring a unique mix of long reads, exceptional accuracy, and ultra-high throughput. It has been proven to be the most versatile of all currently available next-generation sequencing technologies, supporting many high-profile studies in over seven application categories. GS FLX users have pursued innovative research in de novo sequencing, re-sequencing of whole genomes and target DNA regions, metagenomics, and RNA analysis. 454 Sequencing is a powerful tool for human genetics research: it has recently been used to re-sequence the genome of an individual human, is currently being used to re-sequence the complete human exome and targeted genomic regions with the NimbleGen sequence capture process, and has detected low-frequency somatic mutations linked to cancer.

  15. Clinical validation of an ultra high-throughput spiral microfluidics for the detection and enrichment of viable circulating tumor cells.

    PubMed

    Khoo, Bee Luan; Warkiani, Majid Ebrahimi; Tan, Daniel Shao-Weng; Bhagat, Ali Asgar S; Irwin, Darryl; Lau, Dawn Pingxi; Lim, Alvin S T; Lim, Kiat Hon; Krisna, Sai Sakktee; Lim, Wan-Teck; Yap, Yoon Sim; Lee, Soo Chin; Soo, Ross A; Han, Jongyoon; Lim, Chwee Teck

    2014-01-01

    Circulating tumor cells (CTCs) are cancer cells that can be isolated via liquid biopsy from blood and can be phenotypically and genetically characterized to provide critical information for guiding cancer treatment. Current analysis of CTCs is hindered by the throughput, selectivity and specificity of devices or assays used in CTC detection and isolation. Here, we enriched and characterized putative CTCs from blood samples of patients with both advanced stage metastatic breast and lung cancers using a novel multiplexed spiral microfluidic chip. This system detected putative CTCs with high sensitivity (100%, n = 56) (Breast cancer samples: 12-1275 CTCs/ml; Lung cancer samples: 10-1535 CTCs/ml) rapidly from clinically relevant blood volumes (7.5 ml in under 5 min). Blood samples were completely separated into plasma, CTC and PBMC components, and each fraction was characterized with immunophenotyping (Pan-cytokeratin/CD45, CD44/CD24, EpCAM), fluorescence in-situ hybridization (FISH) (EML4-ALK) or targeted somatic mutation analysis. We used an ultra-sensitive mass spectrometry based system to highlight the presence of an EGFR-activating mutation in both isolated CTCs and plasma cell-free DNA (cf-DNA), and demonstrate concordance with the original tumor-biopsy samples. We have clinically validated our multiplexed microfluidic chip for the ultra high-throughput, low-cost and label-free enrichment of CTCs. Retrieved cells were unlabeled and viable, enabling potential propagation and real-time downstream analysis using next generation sequencing (NGS) or proteomic analysis.

  16. kpLogo: positional k-mer analysis reveals hidden specificity in biological sequences

    PubMed Central

    2017-01-01

    Motifs of only 1–4 letters can play important roles when present at key locations within macromolecules. Because existing motif-discovery tools typically miss these position-specific short motifs, we developed kpLogo, a probability-based logo tool for integrated detection and visualization of position-specific ultra-short motifs from a set of aligned sequences. kpLogo also overcomes the limitations of conventional motif-visualization tools in handling positional interdependencies and utilizing ranked or weighted sequences increasingly available from high-throughput assays. kpLogo can be found at http://kplogo.wi.mit.edu/. PMID:28460012
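
    The raw quantity behind a positional k-mer analysis is a count of each k-mer at each alignment position. The sketch below shows only that counting step, with a hypothetical function name; it does not reproduce kpLogo's statistical tests or its visualization.

```python
from collections import Counter

def positional_kmer_counts(sequences, k=2):
    """Count k-mers starting at each position of equal-length, pre-aligned sequences."""
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    # One Counter per start position 0 .. length - k.
    return [Counter(s[i:i + k] for s in sequences) for i in range(length - k + 1)]

aligned = ["ACGT", "ACGA", "TCGT"]
counts = positional_kmer_counts(aligned, k=2)
```

    A k-mer that dominates one position but is rare elsewhere, such as `CG` at position 1 in this toy set, is the kind of signal a position-specific motif tool would then test for significance.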

  17. The ChIP-exo Method: Identifying Protein-DNA Interactions with Near Base Pair Precision.

    PubMed

    Perreault, Andrea A; Venters, Bryan J

    2016-12-23

    Chromatin immunoprecipitation (ChIP) is an indispensable tool in the fields of epigenetics and gene regulation that isolates specific protein-DNA interactions. ChIP coupled to high throughput sequencing (ChIP-seq) is commonly used to determine the genomic location of proteins that interact with chromatin. However, ChIP-seq is hampered by relatively low mapping resolution of several hundred base pairs and high background signal. The ChIP-exo method is a refined version of ChIP-seq that substantially improves upon both resolution and noise. The key distinction of the ChIP-exo methodology is the incorporation of lambda exonuclease digestion in the library preparation workflow to effectively footprint the left and right 5' DNA borders of the protein-DNA crosslink site. The ChIP-exo libraries are then subjected to high throughput sequencing. The resulting data can be leveraged to provide unique and ultra-high resolution insights into the functional organization of the genome. Here, we describe the ChIP-exo method that we have optimized and streamlined for mammalian systems and next-generation sequencing-by-synthesis platform.

  18. Quantitative Analysis of Focused A-To-I RNA Editing Sites by Ultra-High-Throughput Sequencing in Psychiatric Disorders

    PubMed Central

    Zhu, Hu; Urban, Daniel J.; Blashka, Jared; McPheeters, Matthew T.; Kroeze, Wesley K.; Mieczkowski, Piotr; Overholser, James C.; Jurjus, George J.; Dieter, Lesa; Mahajan, Gouri J.; Rajkowska, Grazyna; Wang, Zefeng; Sullivan, Patrick F.; Stockmeier, Craig A.; Roth, Bryan L.

    2012-01-01

    A-to-I RNA editing is a post-transcriptional modification of single nucleotides in RNA by adenosine deamination, which thereby diversifies the gene products encoded in the genome. Thousands of potential RNA editing sites have been identified by recent studies (e.g. see Li et al, Science 2009); however, only a handful of these sites have been independently confirmed. Here, we systematically and quantitatively examined 109 putative coding region A-to-I RNA editing sites in three sets of normal human brain samples by ultra-high-throughput sequencing (uHTS). Forty of 109 putative sites, including 25 previously confirmed sites, were validated as truly edited in our brain samples, suggesting an overestimation of A-to-I RNA editing in these putative sites by Li et al (2009). To evaluate RNA editing in human disease, we analyzed 29 of the confirmed sites in subjects with major depressive disorder and schizophrenia using uHTS. In striking contrast to many prior studies, we did not find significant alterations in the frequency of RNA editing at any of the editing sites in samples from these patients, including within the 5HT2C serotonin receptor (HTR2C). Our results indicate that uHTS is a fast, quantitative and high-throughput method to assess RNA editing in human physiology and disease and that many prior studies of RNA editing may overestimate both the extent and disease-related variability of RNA editing at the sites we examined in the human brain. PMID:22912834
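
    Quantifying editing at a single site reduces to a simple ratio, because inosine is read as guanosine during sequencing: the editing level is the share of G calls among reads showing A or G at that position. The sketch below is an illustration with a hypothetical function name, not the authors' pipeline; coverage thresholds and statistical testing are omitted.

```python
def editing_level(bases_at_site):
    """Fraction of A/G base calls that are G at a candidate A-to-I site; None if no coverage."""
    a_count = bases_at_site.count("A")   # unedited reads
    g_count = bases_at_site.count("G")   # edited reads (inosine sequenced as G)
    if a_count + g_count == 0:
        return None
    return g_count / (a_count + g_count)

# Toy pileup column: ten reads covering the site, three of them showing G.
level = editing_level("AAGAGAAAGA")
```

    Comparing such per-site levels between patient and control samples is the kind of test the abstract describes, where no significant differences were found.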

  19. Ultra-High-Throughput Screening of an In Vitro-Synthesized Horseradish Peroxidase Displayed on Microbeads Using Cell Sorter

    PubMed Central

    Zhu, Bo; Mizoguchi, Takuro; Kojima, Takaaki; Nakano, Hideo

    2015-01-01

    The C1a isoenzyme of horseradish peroxidase (HRP) is an industrially important heme-containing enzyme that utilizes hydrogen peroxide to oxidize a wide variety of inorganic and organic compounds for practical applications, including synthesis of fine chemicals, medical diagnostics, and bioremediation. To develop an ultra-high-throughput screening system for HRP, we successfully produced active HRP in an Escherichia coli cell-free protein synthesis system, by adding disulfide bond isomerase DsbC and optimizing the concentrations of hemin and calcium ions and the temperature. The biosynthesized HRP was fused with a single-chain Cro (scCro) DNA-binding tag at its N-terminal and C-terminal sites. The addition of the scCro-tag at both ends increased the solubility of the protein. Next, HRP and its fusion proteins were successfully synthesized in a water droplet emulsion by using hexadecane as the oil phase and SunSoft No. 818SK as the surfactant. HRP fusion proteins were displayed on microbeads attached with double-stranded DNA (containing the scCro binding sequence) via scCro-DNA interactions. The activities of the immobilized HRP fusion proteins were detected with a tyramide-based fluorogenic assay using flow cytometry. Moreover, a model microbead library containing wild type hrp (WT) and inactive mutant (MUT) genes was screened using fluorescence-activated cell-sorting, thus efficiently enriching the WT gene from the 1:100 (WT:MUT) library. The technique described here could serve as a novel platform for the ultra-high-throughput discovery of more useful HRP mutants and other heme-containing peroxidases. PMID:25993095

  20. Ultra-high-throughput Production of III-V/Si Wafer for Electronic and Photonic Applications

    PubMed Central

    Geum, Dae-Myeong; Park, Min-Su; Lim, Ju Young; Yang, Hyun-Duk; Song, Jin Dong; Kim, Chang Zoo; Yoon, Euijoon; Kim, SangHyeon; Choi, Won Jun

    2016-01-01

    Si-based integrated circuits have been intensively developed over the past several decades through ultimate device scaling. However, Si technology has reached the physical limits of scaling. These limitations have fuelled the search for alternative active materials (for transistors) and the introduction of optical interconnects (called “Si photonics”). A series of attempts to circumvent the limits of Si technology are based on the use of III-V compound semiconductors owing to their advantages, such as high electron mobility and direct bandgap. To use their physical properties on a Si platform, the formation of high-quality III-V films on Si (III-V/Si) is the basic technology; however, implementing this technology using a high-throughput process is not easy. Here, we report new concepts for an ultra-high-throughput heterogeneous integration of high-quality III-V films on Si using wafer bonding and the epitaxial lift off (ELO) technique. We describe the ultra-fast ELO and also the re-use of the III-V donor wafer after III-V/Si formation. These approaches provide an ultra-high-throughput fabrication of III-V/Si substrates with a high-quality film, which leads to a dramatic cost reduction. As proof-of-concept devices, this paper demonstrates GaAs-based high electron mobility transistors (HEMTs), solar cells, and hetero-junction phototransistors on Si substrates. PMID:26864968

  1. Digital transcriptome profiling using selective hexamer priming for cDNA synthesis.

    PubMed

    Armour, Christopher D; Castle, John C; Chen, Ronghua; Babak, Tomas; Loerch, Patrick; Jackson, Stuart; Shah, Jyoti K; Dey, John; Rohl, Carol A; Johnson, Jason M; Raymond, Christopher K

    2009-09-01

    We developed a procedure for the preparation of whole transcriptome cDNA libraries depleted of ribosomal RNA from only 1 microg of total RNA. The method relies on a collection of short, computationally selected oligonucleotides, called 'not-so-random' (NSR) primers, to obtain full-length, strand-specific representation of nonribosomal RNA transcripts. In this study we validated the technique by profiling human whole brain and universal human reference RNA using ultra-high-throughput sequencing.

  2. An ultra-high-density bin map facilitates high-throughput QTL mapping of horticultural traits in pepper (Capsicum annuum).

    PubMed

    Han, Koeun; Jeong, Hee-Jin; Yang, Hee-Bum; Kang, Sung-Min; Kwon, Jin-Kyung; Kim, Seungill; Choi, Doil; Kang, Byoung-Cheorl

    2016-04-01

    Most agricultural traits are controlled by quantitative trait loci (QTLs); however, there are few studies on QTL mapping of horticultural traits in pepper (Capsicum spp.) due to the lack of high-density molecular maps and sequence information. In this study, an ultra-high-density map and 120 recombinant inbred lines (RILs) derived from a cross between C. annuum 'Perennial' and C. annuum 'Dempsey' were used for QTL mapping of horticultural traits. Parental lines and RILs were resequenced at 18× and 1× coverage, respectively. Using a sliding window approach, an ultra-high-density bin map containing 2,578 bins was constructed. The total length of the map was 1,372 cM, and the average interval between bins was 0.53 cM. A total of 86 significant QTLs controlling 17 horticultural traits were detected. Among these, 32 QTLs controlling 13 traits were major QTLs. Our research shows that the construction of bin maps using low-coverage sequence is a powerful method for QTL mapping, and that the short intervals between bins are helpful for fine-mapping of QTLs. Furthermore, bin maps can be used to improve the quality of reference genomes by elucidating the genetic order of unordered regions and anchoring unassigned scaffolds to linkage groups. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
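
    The reported 0.53 cM interval follows directly from the other two figures (1,372 cM divided by 2,578 bins). The sliding-window idea can also be sketched in a few lines: noisy per-SNP parental calls from low-coverage reads are smoothed by a majority vote over neighbouring SNPs, and runs of identical calls become candidate bins. The window size, the A/B coding of the two parents, and the tie-breaking rule are assumptions; the abstract does not give the authors' parameters.

```python
from itertools import groupby

def smooth_calls(calls, window=5):
    """Majority-vote each SNP call over a centred window (ties go to parent A)."""
    half = window // 2
    smoothed = []
    for i in range(len(calls)):
        segment = calls[max(0, i - half): i + half + 1]
        smoothed.append("A" if segment.count("A") >= segment.count("B") else "B")
    return "".join(smoothed)

def genotype_runs(smoothed):
    """Collapse consecutive identical calls into (parent, run length) segments."""
    return [(parent, len(list(group))) for parent, group in groupby(smoothed)]

# Toy RIL: mostly parent A, then parent B, with three isolated miscalls.
raw_calls = "AAABAAABBBBABBB"
segments = genotype_runs(smooth_calls(raw_calls))

# Consistency check on the reported map statistics.
average_interval_cm = round(1372 / 2578, 2)
```

    In the toy example the isolated miscalls are voted away, leaving a single recombination breakpoint between a run of A calls and a run of B calls. Breakpoints collected across all RILs are what delimit the bins of the map.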

  3. A simple dual online ultra-high pressure liquid chromatography system (sDO-UHPLC) for high throughput proteome analysis.

    PubMed

    Lee, Hangyeore; Mun, Dong-Gi; Bae, Jingi; Kim, Hokeun; Oh, Se Yeon; Park, Young Soo; Lee, Jae-Hyuk; Lee, Sang-Won

    2015-08-21

    We report a new and simple design of a fully automated dual-online ultra-high pressure liquid chromatography system. The system employs only two nano-volume switching valves (a two-position four port valve and a two-position ten port valve) that direct solvent flows from two binary nano-pumps for parallel operation of two analytical columns and two solid phase extraction (SPE) columns. Despite the simple design, the sDO-UHPLC offers many advantageous features that include high duty cycle, back flushing sample injection for fast and narrow zone sample injection, online desalting, high separation resolution and high intra/inter-column reproducibility. This system was applied to analyze proteome samples not only in high throughput deep proteome profiling experiments but also in high throughput MRM experiments.

  4. Ultra-barcoding in cacao (Theobroma spp.; Malvaceae) using whole chloroplast genomes and nuclear ribosomal DNA.

    PubMed

    Kane, Nolan; Sveinsson, Saemundur; Dempewolf, Hannes; Yang, Ji Yong; Zhang, Dapeng; Engels, Johannes M M; Cronk, Quentin

    2012-02-01

    To reliably identify lineages below the species level such as subspecies or varieties, we propose an extension to DNA-barcoding using next-generation sequencing to produce whole organellar genomes and substantial nuclear ribosomal sequence. Because this method uses much longer versions of the traditional DNA-barcoding loci in the plastid and ribosomal DNA, we call our approach ultra-barcoding (UBC). We used high-throughput next-generation sequencing to scan the genome and generate reliable sequence of high copy number regions. Using this method, we examined whole plastid genomes as well as nearly 6000 bases of nuclear ribosomal DNA sequences for nine genotypes of Theobroma cacao and an individual of the related species T. grandiflorum, as well as an additional publicly available whole plastid genome of T. cacao. All individuals of T. cacao examined were uniquely distinguished, and evidence of reticulation and gene flow was observed. Sequence variation was observed in some of the canonical barcoding regions between species, but other regions of the chloroplast were more variable both within species and between species, as were ribosomal spacers. Furthermore, no single region provides the level of data available using the complete plastid genome and rDNA. Our data demonstrate that UBC is a viable, increasingly cost-effective approach for reliably distinguishing varieties and even individual genotypes of T. cacao. This approach shows great promise for applications where very closely related or interbreeding taxa must be distinguished.

  5. An ultra-HTS process for the identification of small molecule modulators of orphan G-protein-coupled receptors.

    PubMed

    Cacace, Angela; Banks, Martyn; Spicer, Timothy; Civoli, Francesca; Watson, John

    2003-09-01

    G-protein-coupled receptors (GPCRs) are the most successful target proteins for drug discovery research to date. More than 150 orphan GPCRs of potential therapeutic interest have been identified for which no activating ligands or biological functions are known. One of the greatest challenges in the pharmaceutical industry is to link these orphan GPCRs with human diseases. Highly automated parallel approaches that integrate ultra-high throughput and focused screening can be used to identify small molecule modulators of orphan GPCRs. These small molecules can then be employed as pharmacological tools to explore the function of orphan receptors in models of human disease. In this review, we describe methods that utilize powerful ultra-high-throughput screening technologies to identify surrogate ligands of orphan GPCRs.

  6. [Current applications of high-throughput DNA sequencing technology in antibody drug research].

    PubMed

    Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong

    2012-03-01

    Since the publication in 2005 of a high-throughput DNA sequencing technology based on PCR carried out in oil emulsions, high-throughput DNA sequencing platforms have evolved into a robust technology for sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation for discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we present a review of current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach to the discovery and development of antibody drugs.

  7. Compound Transfer by Acoustic Droplet Ejection Promotes Quality and Efficiency in Ultra-High-Throughput Screening Campaigns.

    PubMed

    Dawes, Timothy D; Turincio, Rebecca; Jones, Steven W; Rodriguez, Richard A; Gadiagellan, Dhireshan; Thana, Peter; Clark, Kevin R; Gustafson, Amy E; Orren, Linda; Liimatta, Marya; Gross, Daniel P; Maurer, Till; Beresini, Maureen H

    2016-02-01

    Acoustic droplet ejection (ADE) as a means of transferring library compounds has had a dramatic impact on the way in which high-throughput screening campaigns are conducted in many laboratories. Two Labcyte Echo ADE liquid handlers form the core of the compound transfer operation in our 1536-well based ultra-high-throughput screening (uHTS) system. Use of these instruments has promoted flexibility in compound formatting in addition to minimizing waste and eliminating compound carryover. We describe the use of ADE for the generation of assay-ready plates for primary screening as well as for follow-up dose-response evaluations. Custom software has enabled us to harness the information generated by the ADE instrumentation. Compound transfer via ADE also contributes to the screening process outside of the uHTS system. A second fully automated ADE-based system has been used to augment the capacity of the uHTS system as well as to permit efficient use of previously picked compound aliquots for secondary assay evaluations. Essential to the utility of ADE in the high-throughput screening process is the high quality of the resulting data. Examples of data generated at various stages of high-throughput screening campaigns are provided. Advantages and disadvantages of the use of ADE in high-throughput screening are discussed. © 2015 Society for Laboratory Automation and Screening.

  8. Link and Network Layers Design for Ultra-High-Speed Terahertz-Band Communications Networks

    DTIC Science & Technology

    2017-01-01

    throughput, and identify the optimal parameter values for their design (Sec. 6.2.3). Moreover, we validate and test the scheme with experimental data obtained...LINK AND NETWORK LAYERS DESIGN FOR ULTRA-HIGH- SPEED TERAHERTZ-BAND COMMUNICATIONS NETWORKS STATE UNIVERSITY OF NEW YORK (SUNY) AT BUFFALO JANUARY...TYPE FINAL TECHNICAL REPORT 3. DATES COVERED (From - To) FEB 2015 – SEP 2016 4. TITLE AND SUBTITLE LINK AND NETWORK LAYERS DESIGN FOR ULTRA-HIGH

  9. Nucleic Acids for Ultra-Sensitive Protein Detection

    PubMed Central

    Janssen, Kris P. F.; Knez, Karel; Spasic, Dragana; Lammertyn, Jeroen

    2013-01-01

    Major advancements in molecular biology and clinical diagnostics cannot be brought about strictly through the use of genomics-based methods. Improved methods for protein detection and proteomic screening are an absolute necessity to complement the wealth of information offered by novel, high-throughput sequencing technologies. Only then will it be possible to advance insights into clinical processes and to characterize the importance of specific protein biomarkers for disease detection or the realization of “personalized medicine”. Currently, however, large-scale proteomic information is still not as easily obtained as its genomic counterpart, mainly because traditional antibody-based technologies struggle to meet the stringent sensitivity and throughput requirements, whereas mass-spectrometry-based methods may be burdened by the significant costs involved. However, recent years have seen the development of new biodetection strategies linking nucleic acids with existing antibody technology or replacing antibodies with oligonucleotide recognition elements altogether. These advancements have unlocked many new strategies to lower detection limits and dramatically increase throughput of protein detection assays. In this review, an overview of these new strategies will be given. PMID:23337338

  10. A novel ultra high-throughput 16S rRNA gene amplicon sequencing library preparation method for the Illumina HiSeq platform.

    PubMed

    de Muinck, Eric J; Trosvik, Pål; Gilfillan, Gregor D; Hov, Johannes R; Sundaram, Arvind Y M

    2017-07-06

    Advances in sequencing technologies and bioinformatics have made the analysis of microbial communities almost routine. Nonetheless, the need remains to improve on the techniques used for gathering such data, including increasing throughput while lowering cost and benchmarking the techniques so that potential sources of bias can be better characterized. We present a triple-index amplicon sequencing strategy to sequence large numbers of samples at significantly lower cost and in a shorter timeframe compared to existing methods. The design employs a two-stage PCR protocol, incorporating three barcodes to each sample, with the possibility to add a fourth-index. It also includes heterogeneity spacers to overcome low complexity issues faced when sequencing amplicons on Illumina platforms. The library preparation method was extensively benchmarked through analysis of a mock community in order to assess biases introduced by sample indexing, number of PCR cycles, and template concentration. We further evaluated the method through re-sequencing of a standardized environmental sample. Finally, we evaluated our protocol on a set of fecal samples from a small cohort of healthy adults, demonstrating good performance in a realistic experimental setting. Between-sample variation was mainly related to batch effects, such as DNA extraction, while sample indexing was also a significant source of bias. PCR cycle number strongly influenced chimera formation and affected relative abundance estimates of species with high GC content. Libraries were sequenced using the Illumina HiSeq and MiSeq platforms to demonstrate that this protocol is highly scalable to sequence thousands of samples at a very low cost. Here, we provide the most comprehensive study of performance and bias inherent to a 16S rRNA gene amplicon sequencing method to date. Triple-indexing greatly reduces the number of long custom DNA oligos required for library preparation, while the inclusion of variable length heterogeneity spacers minimizes the need for PhiX spike-in. This design results in a significant cost reduction of highly multiplexed amplicon sequencing. The biases we characterize highlight the need for highly standardized protocols. Reassuringly, we find that the biological signal is a far stronger structuring factor than the various sources of bias.
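
    As a rough illustration of why combinatorial indexing scales well, the sketch below counts the unique sample labels obtainable from three index sets and the number of distinct index oligos needed to build them. The set sizes (8, 12, 24) and the names `combinatorial_labels`, `stage1_fwd`, `stage1_rev`, and `stage2` are hypothetical and are not taken from the paper; the actual primer design is given only in the original article.

```python
from itertools import product


def combinatorial_labels(*index_sets):
    """Return every unique combination of one barcode from each index set."""
    return list(product(*index_sets))


# Hypothetical index sets; the sizes are illustrative assumptions only.
stage1_fwd = [f"F{i}" for i in range(8)]
stage1_rev = [f"R{i}" for i in range(12)]
stage2 = [f"P{i}" for i in range(24)]

labels = combinatorial_labels(stage1_fwd, stage1_rev, stage2)

# Labels grow with the product of the set sizes,
# while the oligos to synthesize grow only with their sum.
n_labels = len(labels)
n_oligos = len(stage1_fwd) + len(stage1_rev) + len(stage2)
print(n_labels, n_oligos)
```

    Under these assumed sizes, a few dozen index oligos are enough to label a few thousand samples, which is the general reason multi-index designs can lower per-sample cost.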

  11. BiQ Analyzer HT: locus-specific analysis of DNA methylation by high-throughput bisulfite sequencing

    PubMed Central

    Lutsik, Pavlo; Feuerbach, Lars; Arand, Julia; Lengauer, Thomas; Walter, Jörn; Bock, Christoph

    2011-01-01

    Bisulfite sequencing is a widely used method for measuring DNA methylation in eukaryotic genomes. The assay provides single-base pair resolution and, given sufficient sequencing depth, its quantitative accuracy is excellent. High-throughput sequencing of bisulfite-converted DNA can be applied either genome wide or targeted to a defined set of genomic loci (e.g. using locus-specific PCR primers or DNA capture probes). Here, we describe BiQ Analyzer HT (http://biq-analyzer-ht.bioinf.mpi-inf.mpg.de/), a user-friendly software tool that supports locus-specific analysis and visualization of high-throughput bisulfite sequencing data. The software facilitates the shift from time-consuming clonal bisulfite sequencing to the more quantitative and cost-efficient use of high-throughput sequencing for studying locus-specific DNA methylation patterns. In addition, it is useful for locus-specific visualization of genome-wide bisulfite sequencing data. PMID:21565797
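
    The principle behind locus-specific methylation calling can be sketched in a few lines: after bisulfite conversion, an unmethylated cytosine reads as T and a methylated cytosine still reads as C, so the methylation level at a CpG is the fraction of reads showing C there. The sketch assumes reads that are already aligned to the forward strand and have the same length as the reference; the function names are hypothetical and this is not the BiQ Analyzer HT implementation.

```python
def cpg_positions(reference):
    """Return the index of the C in every CpG dinucleotide of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def methylation_levels(reference, reads):
    """Fraction of reads with an unconverted C at each CpG position.

    Assumes each read is pre-aligned, forward-strand, and as long as
    the reference (a simplification for illustration).
    """
    levels = {}
    for pos in cpg_positions(reference):
        methylated = sum(1 for read in reads if read[pos] == "C")
        unmethylated = sum(1 for read in reads if read[pos] == "T")
        total = methylated + unmethylated
        # None marks a CpG with no informative reads.
        levels[pos] = methylated / total if total else None
    return levels


reference = "ACGTTCGA"
reads = ["ACGTTTGA", "ATGTTTGA", "ACGTTCGA", "ACGTTTGA"]
levels = methylation_levels(reference, reads)
print(levels)
```

    Reads showing any other base at a CpG are ignored here; a real tool would also handle alignment, strand, conversion-rate checks, and sequencing errors.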

  12. Deciphering the genomic targets of alkylating polyamide conjugates using high-throughput sequencing

    PubMed Central

    Chandran, Anandhakumar; Syed, Junetha; Taylor, Rhys D.; Kashiwazaki, Gengo; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

    2016-01-01

    Chemically engineered small molecules targeting specific genomic sequences play an important role in drug development research. Pyrrole-imidazole polyamides (PIPs) are a group of molecules that can bind to the DNA minor-groove and can be engineered to target specific sequences. Their biological effects rely primarily on their selective DNA binding. However, the binding mechanism of PIPs at the chromatinized genome level is poorly understood. Herein, we report a method using high-throughput sequencing to identify the DNA-alkylating sites of PIP-indole-seco-CBI conjugates. High-throughput sequencing analysis of conjugate 2 showed highly similar DNA-alkylating sites on synthetic oligos (histone-free DNA) and on human genomes (chromatinized DNA context). To our knowledge, this is the first report identifying alkylation sites across genomic DNA by alkylating PIP conjugates using high-throughput sequencing. PMID:27098039

  13. Ultra-short pulse laser micro patterning with highest throughput by utilization of a novel multi-beam processing head

    NASA Astrophysics Data System (ADS)

    Homburg, Oliver; Jarczynski, Manfred; Mitra, Thomas; Brüning, Stephan

    2017-02-01

    In the last decade much improvement has been achieved for ultra-short pulse lasers with high repetition rates. This laser technology has vastly matured so that it entered a manifold of industrial applications recently compared to mainly scientific use in the past. Compared to ns-pulse ablation ultra-short pulses in the ps- or even fs regime lead to still colder ablation and further reduced heat-affected zones. This is crucial for micro patterning when structure sizes are getting smaller and requirements are getting stronger at the same time. An additional advantage of ultra-fast processing is its applicability to a large variety of materials, e.g. metals and several high bandgap materials like glass and ceramics. One challenge for ultra-fast micro machining is throughput. The operational capacity of these processes can be maximized by increasing the scan rate or the number of beams - parallel processing. This contribution focuses on process parallelism of ultra-short pulsed lasers with high repetition rate and individually addressable acousto-optical beam modulation. The core of the multi-beam generation is a smooth diffractive beam splitter component with high uniform spots and negligible loss, and a prismatic array compressor to match beam size and pitch. The optical design and the practical realization of an 8 beam processing head in combination with a high average power single mode ultra-short pulsed laser source are presented as well as the currently on-going and promising laboratory research and micro machining results. Finally, an outlook of scaling the processing head to several tens of beams is given.

  14. Unraveling Core Functional Microbiota in Traditional Solid-State Fermentation by High-Throughput Amplicons and Metatranscriptomics Sequencing.

    PubMed

    Song, Zhewei; Du, Hai; Zhang, Yan; Xu, Yan

    2017-01-01

    Fermentation microbiota are specific microorganisms that generate different types of metabolites in many production processes. In traditional solid-state fermentation, the structural composition and functional capacity of the core microbiota determine the quality and quantity of products. As a typical example of food fermentation, Chinese Maotai-flavor liquor production involves a complex of various microorganisms and a wide variety of metabolites. However, the microbial succession and functional shift of the core microbiota in this traditional food fermentation remain unclear. Here, high-throughput amplicon sequencing (16S rRNA gene amplicon sequencing and internal transcribed spacer amplicon sequencing) and metatranscriptomics sequencing technologies were combined to reveal the structure and function of the core microbiota in Chinese soy sauce aroma type liquor production. In addition, ultra-performance liquid chromatography and headspace-solid phase microextraction-gas chromatography-mass spectrometry were employed to provide qualitative and quantitative analysis of the major flavor metabolites. A total of 10 fungal and 11 bacterial genera were identified as the core microbiota. In addition, metatranscriptomic analysis revealed pyruvate metabolism in yeasts (genera Pichia, Schizosaccharomyces, Saccharomyces, and Zygosaccharomyces) and lactic acid bacteria (genus Lactobacillus) classified into two stages in the production of flavor components. Stage I involved high-level alcohol (ethanol) production, with the genus Schizosaccharomyces serving as the core functional microorganism. Stage II involved high-level acid (lactic acid and acetic acid) production, with the genus Lactobacillus serving as the core functional microorganism. The functional shift from the genus Schizosaccharomyces to the genus Lactobacillus drives flavor component conversion from alcohol (ethanol) to acid (lactic acid and acetic acid) in Chinese Maotai-flavor liquor production. Our findings provide insight into the effects of the core functional microbiota in soy sauce aroma type liquor production and the characteristics of the fermentation microbiota under different environmental conditions.

  15. Rapid Catalyst Screening by a Continuous-Flow Microreactor Interfaced with Ultra High Pressure Liquid Chromatography

    PubMed Central

    Fang, Hui; Xiao, Qing; Wu, Fanghui; Floreancig, Paul E.; Weber, Stephen G.

    2010-01-01

    A high-throughput screening system for homogeneous catalyst discovery has been developed by integrating a continuous-flow capillary-based microreactor with ultra-high pressure liquid chromatography (UHPLC) for fast online analysis. Reactions are conducted in distinct and stable zones in a flow stream that allows for time and temperature regulation. UHPLC detection at high temperature allows high throughput online determination of substrate, product, and byproduct concentrations. We evaluated the efficacies of a series of soluble acid catalysts for an intramolecular Friedel-Crafts addition into an acyliminium ion intermediate within one day and with minimal material investment. The effects of catalyst loading, reaction time, and reaction temperature were also screened. This system exhibited high reproducibility for high-throughput catalyst screening and allowed several acid catalysts for the reaction to be identified. Major side products from the reactions were determined through off-line mass spectrometric detection. Er(OTf)3, the catalyst that showed optimal efficiency in the screening, was shown to be effective at promoting the cyclization reaction on a preparative scale. PMID:20666502

  16. mrsFAST-Ultra: a compact, SNP-aware mapper for high performance sequencing applications.

    PubMed

    Hach, Faraz; Sarrafi, Iman; Hormozdiari, Farhad; Alkan, Can; Eichler, Evan E; Sahinalp, S Cenk

    2014-07-01

    High throughput sequencing (HTS) platforms generate unprecedented amounts of data that introduce challenges for processing and downstream analysis. While tools that report the 'best' mapping location of each read provide a fast way to process HTS data, they are not suitable for many types of downstream analysis such as structural variation detection, where it is important to report multiple mapping loci for each read. For this purpose we introduce mrsFAST-Ultra, a fast, cache oblivious, SNP-aware aligner that can handle the multi-mapping of HTS reads very efficiently. mrsFAST-Ultra improves mrsFAST, our first cache oblivious read aligner capable of handling multi-mapping reads, through new and compact index structures that reduce not only the overall memory usage but also the number of CPU operations per alignment. In fact the size of the index generated by mrsFAST-Ultra is 10 times smaller than that of mrsFAST. As importantly, mrsFAST-Ultra introduces new features such as being able to (i) obtain the best mapping loci for each read, and (ii) return all reads that have at most n mapping loci (within an error threshold), together with these loci, for any user specified n. Furthermore, mrsFAST-Ultra is SNP-aware, i.e. it can map reads to the reference genome while discounting the mismatches that occur at common SNP locations provided by dbSNP; this significantly increases the number of reads that can be mapped to the reference genome. Notice that all of the above features are implemented within the index structure and are not simple post-processing steps and thus are performed highly efficiently. Finally, mrsFAST-Ultra utilizes multiple available cores and processors and can be tuned for various memory settings. Our results show that mrsFAST-Ultra is roughly five times faster than its predecessor mrsFAST. In comparison to newly enhanced popular tools such as Bowtie2, it is more sensitive (it can report 10 times or more mappings per read) and much faster (six times or more) in the multi-mapping mode. Furthermore, mrsFAST-Ultra has an index size of 2GB for the entire human reference genome, which is roughly half of that of Bowtie2. mrsFAST-Ultra is open source and it can be accessed at http://mrsfast.sourceforge.net. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
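
    The two ideas named in the abstract, reporting every mapping locus within an error threshold and discounting mismatches at known SNP positions, can be illustrated with a deliberately naive scan. This is a sketch under stated assumptions and not the mrsFAST-Ultra algorithm, which relies on a compact, cache-oblivious index rather than a linear scan. The function name `map_read` and its parameters are hypothetical, and for simplicity any mismatch at a listed SNP position is ignored without checking the alternate allele.

```python
def map_read(read, reference, snp_positions, max_mismatches):
    """Return (start, mismatches) for every window within the error threshold.

    Mismatches at positions listed in snp_positions are not counted.
    Brute-force scan for illustration only.
    """
    hits = []
    for start in range(len(reference) - len(read) + 1):
        mismatches = 0
        for offset, base in enumerate(read):
            pos = start + offset
            if base != reference[pos] and pos not in snp_positions:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many errors, abandon this window
        else:
            # Loop finished without exceeding the threshold.
            hits.append((start, mismatches))
    return hits


reference = "ACGTACGTAC"
read = "ACGA"

# Exact matching only, with a known SNP at reference position 7.
snp_aware_hits = map_read(read, reference, {7}, 0)

# No SNP information, one mismatch allowed: the read is multi-mapping.
multi_hits = map_read(read, reference, set(), 1)
print(snp_aware_hits, multi_hits)
```

    In this toy case the SNP-aware call maps the read with zero counted mismatches even though no exact match exists, and the second call reports both candidate loci instead of a single "best" one.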

  17. High-Throughput Sequencing of Germline and Tumor From Men with Early-Onset Metastatic Prostate Cancer

    DTIC Science & Technology

    2016-12-01

    AWARD NUMBER: W81XWH-13-1-0371 TITLE: High-Throughput Sequencing of Germline and Tumor From Men with Early- Onset Metastatic Prostate Cancer...DATES COVERED 30 Sep 2013 - 29 Sep 2016 4. TITLE AND SUBTITLE 5a. CONTRACT NUMBER High-Throughput Sequencing of Germline and Tumor From Men with...presenting with metastatic prostate cancer at a young age (before age 60 years). Whole exome sequencing identified a panel of germline variants that have

  18. Advances in high throughput DNA sequence data compression.

    PubMed

    Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz

    2016-06-01

    Advances in high throughput sequencing technologies and reduction in cost of sequencing have led to exponential growth in high throughput DNA sequence data. This growth has posed challenges such as storage, retrieval, and transmission of sequencing data. Data compression is used to cope with these challenges. Various methods have been developed to compress genomic and sequencing data. In this article, we present a comprehensive review of compression methods for genome and reads compression. Algorithms are categorized as referential or reference free. Experimental results and comparative analysis of various methods for data compression are presented. Finally, key challenges and research directions in DNA sequence data compression are highlighted.
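
    To make the referential category concrete, the sketch below stores a sequence as a list of differences against a reference and rebuilds it on demand. It is a minimal illustration that assumes equal-length sequences differing only by substitutions; the function names `encode` and `decode` are hypothetical, and real referential compressors also handle insertions, deletions, and entropy coding of the difference list.

```python
def encode(reference, target):
    """Represent target as (position, base) substitutions against reference."""
    if len(reference) != len(target):
        raise ValueError("this sketch only supports equal-length sequences")
    return [(i, b) for i, (a, b) in enumerate(zip(reference, target)) if a != b]


def decode(reference, diffs):
    """Rebuild the target sequence from the reference and the stored diffs."""
    seq = list(reference)
    for position, base in diffs:
        seq[position] = base
    return "".join(seq)


reference = "ACGTACGTACGTACGT"
target = "ACGTACCTACGTACGA"

diffs = encode(reference, target)
restored = decode(reference, diffs)
print(diffs, restored == target)
```

    The closer the target is to the reference, the shorter the difference list; a reference-free method, by contrast, must exploit redundancy within the data itself.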

  19. Molecular characterization of a novel Nucleorhabdovirus from black currant identified by high-throughput sequencing

    USDA-ARS?s Scientific Manuscript database

    Contigs with sequence similarities to several nucleorhabdoviruses were identified by high-throughput sequencing analysis from a black currant (Ribes nigrum L.) cultivar. The complete genomic sequence of this new nucleorhabdovirus is 14,432 nucleotides. Its genomic organization is typical of nucleorh...

  20. High Throughput Plasmid Sequencing with Illumina and CLC Bio (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Athavale, Ajay

    2018-01-04

    Ajay Athavale (Monsanto) presents "High Throughput Plasmid Sequencing with Illumina and CLC Bio" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  1. The application of the high throughput sequencing technology in the transposable elements.

    PubMed

    Liu, Zhen; Xu, Jian-hong

    2015-09-01

    High throughput sequencing technology has dramatically improved the efficiency of DNA sequencing, and decreased the costs to a great extent. Meanwhile, this technology usually has advantages of better specificity, higher sensitivity and accuracy. Therefore, it has been applied to the research on genetic variations, transcriptomics and epigenomics. Recently, this technology has been widely employed in the studies of transposable elements and has achieved fruitful results. In this review, we summarize the application of high throughput sequencing technology in the fields of transposable elements, including the estimation of transposon content, preference of target sites and distribution, insertion polymorphism and population frequency, identification of rare copies, transposon horizontal transfers as well as transposon tagging. We also briefly introduce the major common sequencing strategies and algorithms, their advantages and disadvantages, and the corresponding solutions. Finally, we envision the developing trends of high throughput sequencing technology, especially the third generation sequencing technology, and its application in transposon studies in the future, hopefully providing a comprehensive understanding and reference for related scientific researchers.

  2. Molecular characterization of a novel Luteovirus from peach identified by high-throughput sequencing

    USDA-ARS?s Scientific Manuscript database

    Contigs with sequence homologies to Cherry-associated luteovirus were identified by high-throughput sequencing analysis of two peach accessions undergoing quarantine testing. The complete genomic sequences of the two isolates of this virus are 5,819 and 5,814 nucleotides. Their genome organization i...

  3. Evolutionary Patterns and Processes: Lessons from Ancient DNA.

    PubMed

    Leonardi, Michela; Librado, Pablo; Der Sarkissian, Clio; Schubert, Mikkel; Alfarhan, Ahmed H; Alquraishi, Saleh A; Al-Rasheid, Khaled A S; Gamba, Cristina; Willerslev, Eske; Orlando, Ludovic

    2017-01-01

    Ever since its emergence in 1984, the field of ancient DNA has struggled to overcome the challenges related to the decay of DNA molecules in the fossil record. With the recent development of high-throughput DNA sequencing technologies and molecular techniques tailored to ultra-damaged templates, it has now come of age, merging together approaches in phylogenomics, population genomics, epigenomics, and metagenomics. Leveraging on complete temporal sample series, ancient DNA provides direct access to the most important dimension in evolution—time, allowing a wealth of fundamental evolutionary processes to be addressed at unprecedented resolution. This review taps into the most recent findings in ancient DNA research to present analyses of ancient genomic and metagenomic data.

  4. Evolutionary Patterns and Processes: Lessons from Ancient DNA

    PubMed Central

    Leonardi, Michela; Librado, Pablo; Der Sarkissian, Clio; Schubert, Mikkel; Alfarhan, Ahmed H.; Alquraishi, Saleh A.; Al-Rasheid, Khaled A. S.; Gamba, Cristina; Willerslev, Eske

    2017-01-01

    Ever since its emergence in 1984, the field of ancient DNA has struggled to overcome the challenges related to the decay of DNA molecules in the fossil record. With the recent development of high-throughput DNA sequencing technologies and molecular techniques tailored to ultra-damaged templates, it has now come of age, merging together approaches in phylogenomics, population genomics, epigenomics, and metagenomics. Leveraging on complete temporal sample series, ancient DNA provides direct access to the most important dimension in evolution—time, allowing a wealth of fundamental evolutionary processes to be addressed at unprecedented resolution. This review taps into the most recent findings in ancient DNA research to present analyses of ancient genomic and metagenomic data. PMID:28173586

  5. Identification of Sequence Specificity of 5-Methylcytosine Oxidation by Tet1 Protein with High-Throughput Sequencing.

    PubMed

    Kizaki, Seiichiro; Chandran, Anandhakumar; Sugiyama, Hiroshi

    2016-03-02

    Tet (ten-eleven translocation) family proteins have the ability to oxidize 5-methylcytosine (mC) to 5-hydroxymethylcytosine (hmC), 5-formylcytosine (fC), and 5-carboxycytosine (caC). However, the oxidation reaction of Tet is not understood completely. Evaluation of genomic-level epigenetic changes by Tet protein requires unbiased identification of the highly selective oxidation sites. In this study, we used high-throughput sequencing to investigate the sequence specificity of mC oxidation by Tet1. A 6.6×10^4-member mC-containing random DNA-sequence library was constructed. The library was subjected to Tet-reactive pulldown followed by high-throughput sequencing. Analysis of the obtained sequence data identified the Tet1-reactive sequences. We identified mCpG as a highly reactive sequence of Tet1 protein. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  6. Screening of HIV-1 Protease Using a Combination of an Ultra-High-Throughput Fluorescent-Based Assay and RapidFire Mass Spectrometry.

    PubMed

    Meng, Juncai; Lai, Ming-Tain; Munshi, Vandna; Grobler, Jay; McCauley, John; Zuck, Paul; Johnson, Eric N; Uebele, Victor N; Hermes, Jeffrey D; Adam, Gregory C

    2015-06-01

    HIV-1 protease (PR) represents one of the primary targets for developing antiviral agents for the treatment of HIV-infected patients. To identify novel PR inhibitors, a label-free, high-throughput mass spectrometry (HTMS) assay was developed using the RapidFire platform and applied as an orthogonal assay to confirm hits identified in a fluorescence resonance energy transfer (FRET)-based primary screen of >1 million compounds. For substrate selection, a panel of peptide substrates derived from natural processing sites for PR was evaluated on the RapidFire platform. As a result, KVSLNFPIL, a new substrate measured to have a ~20- and 60-fold improvement in kcat/Km over the frequently used sequences SQNYPIVQ and SQNYPIV, respectively, was identified for the HTMS screen. About 17% of hits from the FRET-based primary screen were confirmed in the HTMS confirmatory assay including all 304 known PR inhibitors in the set, demonstrating that the HTMS assay is effective at triaging false-positives while capturing true hits. Hence, with a sampling rate of ~7 s per well, the RapidFire HTMS assay enables the high-throughput evaluation of peptide substrates and functions as an efficient tool for hits triage in the discovery of novel PR inhibitors. © 2015 Society for Laboratory Automation and Screening.

  7. Diversity of Pico- to Mesoplankton along the 2000 km Salinity Gradient of the Baltic Sea

    PubMed Central

    Hu, Yue O. O.; Karlson, Bengt; Charvet, Sophie; Andersson, Anders F.

    2016-01-01

    Microbial plankton form the productive base of both marine and freshwater ecosystems and are key drivers of global biogeochemical cycles of carbon and nutrients. Plankton diversity is immense with representations from all major phyla within the three domains of life. So far, plankton monitoring has mainly been based on microscopic identification, which has limited sensitivity and reproducibility, not least because of the numerical majority of plankton being unidentifiable under the light microscope. High-throughput sequencing of taxonomic marker genes offers a means to identify taxa inaccessible by traditional methods; thus, recent studies have unveiled an extensive previously unknown diversity of plankton. Here, we conducted ultra-deep Illumina sequencing (average 10^5 sequences/sample) of rRNA gene amplicons of surface water eukaryotic and bacterial plankton communities sampled in summer along a 2000 km transect following the salinity gradient of the Baltic Sea. Community composition was strongly correlated with salinity for both bacterial and eukaryotic plankton assemblages, highlighting the importance of salinity for structuring the biodiversity within this ecosystem. In contrast, no clear trends in alpha-diversity for bacterial or eukaryotic communities could be detected along the transect. The distribution of major planktonic taxa followed expected patterns as observed in monitoring programs, but groups novel to the Baltic Sea were also identified, such as relatives to the coccolithophore Emiliania huxleyi detected in the northern Baltic Sea. This study provides the first ultra-deep sequencing-based survey on eukaryotic and bacterial plankton biogeography in the Baltic Sea. PMID:27242706

  8. Characterization and complete genome sequence of a previously uncharacterized panicovirus from Bermuda grass detected by high throughput sequencing

    USDA-ARS?s Scientific Manuscript database

    Bermuda grass samples were examined by transmission electron microscopy and 28-30 nm spherical virus particles were observed. Total RNA from these plants was subjected to high throughput sequencing (HTS). The nearly full genome sequence of a previously uncharacterized Panicovirus was identified from...

  9. Ultra-High Throughput Synthesis of Nanoparticles with Homogeneous Size Distribution Using a Coaxial Turbulent Jet Mixer

    PubMed Central

    2015-01-01

    High-throughput production of nanoparticles (NPs) with controlled quality is critical for their clinical translation into effective nanomedicines for diagnostics and therapeutics. Here we report a simple and versatile coaxial turbulent jet mixer that can synthesize a variety of NPs at high throughput up to 3 kg/d, while maintaining the advantages of homogeneity, reproducibility, and tunability that are normally accessible only in specialized microscale mixing devices. The device fabrication does not require specialized machining and is easy to operate. As one example, we show reproducible, high-throughput formulation of siRNA-polyelectrolyte polyplex NPs that exhibit effective gene knockdown but exhibit significant dependence on batch size when formulated using conventional methods. The coaxial turbulent jet mixer can accelerate the development of nanomedicines by providing a robust and versatile platform for preparation of NPs at throughputs suitable for in vivo studies, clinical trials, and industrial-scale production. PMID:24824296

  10. Evaluation of Methods for de novo Genome assembly from High-throughput Sequencing Reads Reveals Dependencies that Affect the Quality of the Results

    USDA-ARS?s Scientific Manuscript database

    Recent developments in high-throughput sequencing technology have made low-cost sequencing an attractive approach for many genome analysis tasks. Increasing read lengths, improving quality and the production of increasingly larger numbers of usable sequences per instrument-run continue to make whole...

  11. Efficient Identification of Murine M2 Macrophage Peptide Targeting Ligands by Phage Display and Next-Generation Sequencing.

    PubMed

    Liu, Gary W; Livesay, Brynn R; Kacherovsky, Nataly A; Cieslewicz, Maryelise; Lutz, Emi; Waalkes, Adam; Jensen, Michael C; Salipante, Stephen J; Pun, Suzie H

    2015-08-19

    Peptide ligands are used to increase the specificity of drug carriers to their target cells and to facilitate intracellular delivery. One method to identify such peptide ligands, phage display, enables high-throughput screening of peptide libraries for ligands binding to therapeutic targets of interest. However, conventional methods for identifying target binders in a library by Sanger sequencing are low-throughput, labor-intensive, and provide a limited perspective (<0.01%) of the complete sequence space. Moreover, the small sample space can be dominated by nonspecific, preferentially amplifying "parasitic sequences" and plastic-binding sequences, which may lead to the identification of false positives or exclude the identification of target-binding sequences. To overcome these challenges, we employed next-generation Illumina sequencing to couple high-throughput screening and high-throughput sequencing, enabling more comprehensive access to the phage display library sequence space. In this work, we define the hallmarks of binding sequences in next-generation sequencing data, and develop a method that identifies several target-binding phage clones for murine, alternatively activated M2 macrophages with a high (100%) success rate: sequences and binding motifs were reproducibly present across biological replicates; binding motifs were identified across multiple unique sequences; and an unselected, amplified library accurately filtered out parasitic sequences. In addition, we validate the Multiple Em for Motif Elicitation tool as an efficient and principled means of discovering binding sequences.
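
    Two of the hallmarks listed in the abstract, reproducible presence across biological replicates and filtering against an unselected, amplified library, lend themselves to a small sketch. The code below is an illustrative assumption rather than the authors' pipeline: the function names, the toy peptide labels, and the `min_fold` enrichment threshold of 2.0 are all hypothetical.

```python
from collections import Counter


def frequencies(reads):
    """Relative frequency of each sequence in a list of reads."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def candidate_binders(replicates, unselected, min_fold=2.0):
    """Keep sequences seen in every replicate and enriched over background.

    A sequence absent from the unselected library is kept; one that is
    also abundant there is treated as a likely parasitic sequence.
    """
    rep_freqs = [frequencies(reads) for reads in replicates]
    background = frequencies(unselected)
    shared = set(rep_freqs[0]).intersection(*rep_freqs[1:])
    kept = []
    for seq in sorted(shared):
        mean_freq = sum(freq[seq] for freq in rep_freqs) / len(rep_freqs)
        bg = background.get(seq)
        if bg is None or mean_freq / bg >= min_fold:
            kept.append(seq)
    return kept


# Toy data: "PPP" stands in for a fast-amplifying parasitic sequence.
replicates = [
    ["AAA", "AAA", "BBB", "PPP"],
    ["AAA", "BBB", "PPP", "PPP", "CCC"],
]
unselected = ["PPP", "PPP", "PPP", "BBB", "DDD", "EEE", "FFF", "GGG"]

binders = candidate_binders(replicates, unselected)
print(binders)
```

    Motif discovery across the retained sequences, which the abstract addresses with a dedicated motif-elicitation tool, is outside the scope of this sketch.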

  12. A high-throughput next-generation sequencing-based method for detecting the mutational fingerprint of carcinogens

    PubMed Central

    Besaratinia, Ahmad; Li, Haiqing; Yoon, Jae-In; Zheng, Albert; Gao, Hanlin; Tommasi, Stella

    2012-01-01

    Many carcinogens leave a unique mutational fingerprint in the human genome. These mutational fingerprints manifest as specific types of mutations often clustering at certain genomic loci in tumor genomes from carcinogen-exposed individuals. To develop a high-throughput method for detecting the mutational fingerprint of carcinogens, we have devised a cost-, time- and labor-effective strategy, in which the widely used transgenic Big Blue® mouse mutation detection assay is made compatible with the Roche/454 Genome Sequencer FLX Titanium next-generation sequencing technology. As proof of principle, we have used this novel method to establish the mutational fingerprints of three prominent carcinogens with varying mutagenic potencies, including sunlight ultraviolet radiation, 4-aminobiphenyl and secondhand smoke that are known to be strong, moderate and weak mutagens, respectively. For verification purposes, we have compared the mutational fingerprints of these carcinogens obtained by our newly developed method with those obtained by parallel analyses using the conventional low-throughput approach, that is, standard mutation detection assay followed by direct DNA sequencing using a capillary DNA sequencer. We demonstrate that this high-throughput next-generation sequencing-based method is highly specific and sensitive to detect the mutational fingerprints of the tested carcinogens. The method is reproducible, and its accuracy is comparable with that of the currently available low-throughput method. In conclusion, this novel method has the potential to move the field of carcinogenesis forward by allowing high-throughput analysis of mutations induced by endogenous and/or exogenous genotoxic agents. PMID:22735701

  13. A high-throughput next-generation sequencing-based method for detecting the mutational fingerprint of carcinogens.

    PubMed

    Besaratinia, Ahmad; Li, Haiqing; Yoon, Jae-In; Zheng, Albert; Gao, Hanlin; Tommasi, Stella

    2012-08-01

    Many carcinogens leave a unique mutational fingerprint in the human genome. These mutational fingerprints manifest as specific types of mutations often clustering at certain genomic loci in tumor genomes from carcinogen-exposed individuals. To develop a high-throughput method for detecting the mutational fingerprint of carcinogens, we have devised a cost-, time- and labor-effective strategy, in which the widely used transgenic Big Blue mouse mutation detection assay is made compatible with the Roche/454 Genome Sequencer FLX Titanium next-generation sequencing technology. As proof of principle, we have used this novel method to establish the mutational fingerprints of three prominent carcinogens with varying mutagenic potencies, including sunlight ultraviolet radiation, 4-aminobiphenyl and secondhand smoke that are known to be strong, moderate and weak mutagens, respectively. For verification purposes, we have compared the mutational fingerprints of these carcinogens obtained by our newly developed method with those obtained by parallel analyses using the conventional low-throughput approach, that is, standard mutation detection assay followed by direct DNA sequencing using a capillary DNA sequencer. We demonstrate that this high-throughput next-generation sequencing-based method is highly specific and sensitive to detect the mutational fingerprints of the tested carcinogens. The method is reproducible, and its accuracy is comparable with that of the currently available low-throughput method. In conclusion, this novel method has the potential to move the field of carcinogenesis forward by allowing high-throughput analysis of mutations induced by endogenous and/or exogenous genotoxic agents.

  14. Advanced Virus Detection Technologies Interest Group (AVDTIG): Efforts on High Throughput Sequencing (HTS) for Virus Detection.

    PubMed

    Khan, Arifa S; Vacante, Dominick A; Cassart, Jean-Pol; Ng, Siemon H S; Lambert, Christophe; Charlebois, Robert L; King, Kathryn E

    Several nucleic-acid based technologies have recently emerged with capabilities for broad virus detection. One of these, high throughput sequencing, has the potential for novel virus detection because this method does not depend upon prior viral sequence knowledge. However, the use of high throughput sequencing for testing biologicals poses greater challenges as compared to other newly introduced tests due to its technical complexities and big data bioinformatics. Thus, the Advanced Virus Detection Technologies Users Group was formed as a joint effort by regulatory and industry scientists to facilitate discussions and provide a forum for sharing data and experiences using advanced new virus detection technologies, with a focus on high throughput sequencing technologies. The group was initiated as a task force that was coordinated by the Parenteral Drug Association and subsequently became the Advanced Virus Detection Technologies Interest Group to continue efforts for using new technologies for detection of adventitious viruses with broader participation, including international government agencies, academia, and technology service providers. © PDA, Inc. 2016.

  15. Spectral efficiency in crosstalk-impaired multi-core fiber links

    NASA Astrophysics Data System (ADS)

    Luís, Ruben S.; Puttnam, Benjamin J.; Rademacher, Georg; Klaus, Werner; Agrell, Erik; Awaji, Yoshinari; Wada, Naoya

    2018-02-01

    We review the latest advances in ultra-high throughput transmission using crosstalk-limited single-mode multicore fibers and compare these with the theoretical spectral efficiency of such systems. We relate the crosstalk-imposed spectral efficiency limits to fiber parameters, such as core diameter, core pitch, and trench design. Furthermore, we investigate the potential of techniques such as direction interleaving and high-order MIMO to improve the throughput or reach of these systems when using various modulation formats.

  16. HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing.

    PubMed

    Wan, Shixiang; Zou, Quan

    2017-01-01

    Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The rapid growth of next-generation sequencing data has led to a shortage of efficient ultra-large biological sequence alignment approaches that can cope with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files larger than 1 GB) sequence analyses. Based on HAlign and the Spark distributed computing system, we implement a highly cost-efficient and time-efficient tool, HAlign-II, to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. Experiments on large-scale DNA and protein data sets, with files larger than 1 GB, showed that HAlign-II could save time and space. It outperformed the current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resources. HAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.

  17. High Throughput Sequence Analysis for Disease Resistance in Maize

    USDA-ARS?s Scientific Manuscript database

    Preliminary results of a computational analysis of high throughput sequencing data from Zea mays and the fungus Aspergillus are reported. The Illumina Genome Analyzer was used to sequence RNA samples from two strains of Z. mays (Va35 and Mp313) collected over a time course as well as several specie...

  18. Loeffler 4.0: Diagnostic Metagenomics.

    PubMed

    Höper, Dirk; Wylezich, Claudia; Beer, Martin

    2017-01-01

    A new world of possibilities for "virus discovery" was opened up with high-throughput sequencing becoming available in the last decade. While scientifically metagenomic analysis was established before the start of the era of high-throughput sequencing, the availability of the first second-generation sequencers was the kick-off for diagnosticians to use sequencing for the detection of novel pathogens. Today, diagnostic metagenomics is becoming the standard procedure for the detection and genetic characterization of new viruses or novel virus variants. Here, we provide an overview about technical considerations of high-throughput sequencing-based diagnostic metagenomics together with selected examples of "virus discovery" for animal diseases or zoonoses and metagenomics for food safety or basic veterinary research. © 2017 Elsevier Inc. All rights reserved.

  19. Unraveling Core Functional Microbiota in Traditional Solid-State Fermentation by High-Throughput Amplicons and Metatranscriptomics Sequencing

    PubMed Central

    Song, Zhewei; Du, Hai; Zhang, Yan; Xu, Yan

    2017-01-01

    Fermentation microbiota are the specific microorganisms that generate different types of metabolites in many production processes. In traditional solid-state fermentation, the structural composition and functional capacity of the core microbiota determine the quality and quantity of products. As a typical example of food fermentation, Chinese Maotai-flavor liquor production involves a complex of various microorganisms and a wide variety of metabolites. However, the microbial succession and functional shift of the core microbiota in this traditional food fermentation remain unclear. Here, high-throughput amplicon sequencing (16S rRNA gene amplicon sequencing and internal transcribed spacer amplicon sequencing) and metatranscriptomics sequencing technologies were combined to reveal the structure and function of the core microbiota in Chinese soy sauce aroma type liquor production. In addition, ultra-performance liquid chromatography and headspace-solid phase microextraction-gas chromatography-mass spectrometry were employed to provide qualitative and quantitative analysis of the major flavor metabolites. A total of 10 fungal and 11 bacterial genera were identified as the core microbiota. In addition, metatranscriptomic analysis revealed pyruvate metabolism in yeasts (genera Pichia, Schizosaccharomyces, Saccharomyces, and Zygosaccharomyces) and lactic acid bacteria (genus Lactobacillus) classified into two stages in the production of flavor components. Stage I involved high-level alcohol (ethanol) production, with the genus Schizosaccharomyces serving as the core functional microorganism. Stage II involved high-level acid (lactic acid and acetic acid) production, with the genus Lactobacillus serving as the core functional microorganism. The functional shift from the genus Schizosaccharomyces to the genus Lactobacillus drives flavor component conversion from alcohol (ethanol) to acid (lactic acid and acetic acid) in Chinese Maotai-flavor liquor production. Our findings provide insight into the effects of the core functional microbiota in soy sauce aroma type liquor production and the characteristics of the fermentation microbiota under different environmental conditions. PMID:28769888

  20. UltraNet Target Parameters. Chapter 1

    NASA Technical Reports Server (NTRS)

    Kislitzin, Katherine T.; Blaylock, Bruce T. (Technical Monitor)

    1992-01-01

    The UltraNet is a high speed network capable of rates up to one gigabit per second. It is a hub based network with four optical fiber links connecting each hub. Each link can carry up to 256 megabits of data, and the hub backplane is capable of one gigabit aggregate throughput. Host connections to the hub may be fiber, coax, or channel based. Bus based machines have adapter boards that connect to transceivers in the hub, while channel based machines use a personality module in the hub. One way that the UltraNet achieves its high transfer rates is by off-loading the protocol processing from the hosts to special purpose protocol engines in the UltraNet hubs. In addition, every hub has a PC connected to it by StarLAN for network management purposes. Although there is hub resident and PC resident UltraNet software, this document treats only the host resident UltraNet software.

  1. Comprehensive analysis of the T-cell receptor beta chain gene in rhesus monkey by high throughput sequencing

    PubMed Central

    Li, Zhoufang; Liu, Guangjie; Tong, Yin; Zhang, Meng; Xu, Ying; Qin, Li; Wang, Zhanhui; Chen, Xiaoping; He, Jiankui

    2015-01-01

    Profiling immune repertoires by high throughput sequencing enhances our understanding of immune system complexity and immune-related diseases in humans. Previously, cloning and Sanger sequencing identified limited numbers of T cell receptor (TCR) nucleotide sequences in rhesus monkeys, thus their full immune repertoire is unknown. We applied multiplex PCR and Illumina high throughput sequencing to study the TCRβ of rhesus monkeys. We identified 1.26 million TCRβ sequences corresponding to 643,570 unique TCRβ sequences and 270,557 unique complementarity-determining region 3 (CDR3) gene sequences. Precise measurements of CDR3 length distribution, CDR3 amino acid distribution, length distribution of N nucleotide of junctional region, and TCRV and TCRJ gene usage preferences were performed. A comprehensive profile of rhesus monkey immune repertoire might aid human infectious disease studies using rhesus monkeys. PMID:25961410
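
    The CDR3 length distribution mentioned in this abstract is a simple summary statistic. The Python sketch below shows one way it could be computed; it assumes the distribution is taken over unique CDR3 sequences (the abstract does not say whether reads or unique sequences were counted), and the function name and toy sequences are hypothetical.

```python
from collections import Counter


def cdr3_length_distribution(cdr3_sequences):
    """Fraction of unique CDR3 sequences at each length.

    Assumption: duplicates are collapsed before counting, so each unique
    sequence contributes once regardless of read depth.
    """
    unique = set(cdr3_sequences)
    if not unique:
        return {}
    lengths = Counter(len(seq) for seq in unique)
    total = len(unique)
    return {length: n / total for length, n in sorted(lengths.items())}


# Toy input: three unique sequences, two of length 3 and one of length 4.
print(cdr3_length_distribution(["AAA", "AAA", "CCCC", "GGG"]))
```

    Counting over reads instead of unique sequences would weight the distribution by clonal abundance, which is a different and equally valid view of the repertoire.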

  2. Ultra-Sensitive Detection of Plasmodium falciparum by Amplification of Multi-Copy Subtelomeric Targets

    PubMed Central

    Hofmann, Natalie; Mwingira, Felista; Shekalaghe, Seif; Robinson, Leanne J.; Mueller, Ivo; Felger, Ingrid

    2015-01-01

    Background: Planning and evaluating malaria control strategies relies on accurate definition of parasite prevalence in the population. A large proportion of asymptomatic parasite infections can only be identified by surveillance with molecular methods, yet these infections also contribute to onward transmission to mosquitoes. The sensitivity of molecular detection by PCR is limited by the abundance of the target sequence in a DNA sample; thus, detection becomes imperfect at low densities. We aimed to increase PCR diagnostic sensitivity by targeting multi-copy genomic sequences for reliable detection of low-density infections, and investigated the impact of these PCR assays on community prevalence data. Methods and Findings: Two quantitative PCR (qPCR) assays were developed for ultra-sensitive detection of Plasmodium falciparum, targeting the high-copy telomere-associated repetitive element 2 (TARE-2, ∼250 copies/genome) and the var gene acidic terminal sequence (varATS, 59 copies/genome). Our assays reached a limit of detection of 0.03 to 0.15 parasites/μl blood and were 10× more sensitive than standard 18S rRNA qPCR. In a population cross-sectional study in Tanzania, 295/498 samples tested positive using ultra-sensitive assays. Light microscopy missed 169 infections (57%). 18S rRNA qPCR failed to identify 48 infections (16%), of which 40% carried gametocytes detected by pfs25 quantitative reverse-transcription PCR. To judge the suitability of the TARE-2 and varATS assays for high-throughput screens, their performance was tested on sample pools. Both ultra-sensitive assays correctly detected all pools containing one low-density P. falciparum–positive sample, which went undetected by 18S rRNA qPCR, among nine negatives. TARE-2 and varATS qPCRs improve estimates of prevalence rates, yet other infections might still remain undetected when absent in the limited blood volume sampled. Conclusions: Measured malaria prevalence in communities is largely determined by the sensitivity of the diagnostic tool used. Even when applying standard molecular diagnostics, prevalence in our study population was underestimated by 8% compared to the new assays. Our findings highlight the need for highly sensitive tools such as TARE-2 and varATS qPCR in community surveillance and for monitoring interventions to better describe malaria epidemiology and inform malaria elimination efforts. PMID:25734259
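
    The percentages of missed infections quoted in this abstract follow directly from the reported counts, taking the 295 ultra-sensitive positives as the denominator. The short Python check below reproduces that arithmetic; the variable and function names are illustrative, and only figures stated in the abstract are used.

```python
TESTED = 498      # samples in the cross-sectional study
POSITIVES = 295   # positive by the ultra-sensitive assays
MISSED = {"light microscopy": 169, "18S rRNA qPCR": 48}


def percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100 * part / whole


for method, n_missed in MISSED.items():
    print(f"{method}: missed {percent(n_missed, POSITIVES):.0f}% of infections")

print(f"prevalence by ultra-sensitive assays: {percent(POSITIVES, TESTED):.1f}%")
```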

  3. Sequencing the Connectome

    PubMed Central

    Zador, Anthony M.; Dubnau, Joshua; Oyibo, Hassana K.; Zhan, Huiqing; Cao, Gang; Peikon, Ian D.

    2012-01-01

    Connectivity determines the function of neural circuits. Historically, circuit mapping has usually been viewed as a problem of microscopy, but no current method can achieve high-throughput mapping of entire circuits with single neuron precision. Here we describe a novel approach to determining connectivity. We propose BOINC (“barcoding of individual neuronal connections”), a method for converting the problem of connectivity into a form that can be read out by high-throughput DNA sequencing. The appeal of using sequencing is that its scale—sequencing billions of nucleotides per day is now routine—is a natural match to the complexity of neural circuits. An inexpensive high-throughput technique for establishing circuit connectivity at single neuron resolution could transform neuroscience research. PMID:23109909

  4. A 64Cycles/MB, Luma-Chroma Parallelized H.264/AVC Deblocking Filter for 4K × 2K Applications

    NASA Astrophysics Data System (ADS)

    Shen, Weiwei; Fan, Yibo; Zeng, Xiaoyang

    In this paper, a high-throughput deblocking filter is presented for the H.264/AVC standard, catering to video applications with 4K × 2K (4096 × 2304) ultra-high-definition resolution. In order to strengthen the parallelism without simply increasing the area, we propose a luma-chroma parallel method. Meanwhile, this work reduces the number of processing cycles, the amount of external memory traffic and the working frequency, by using triple four-stage pipeline filters and a luma-chroma interlaced sequence. Furthermore, it eliminates most unnecessary off-chip memory bandwidth with a highly reusable memory scheme, and adopts a “slide window” buffer scheme. As a result, our design can support 4K × 2K at 30 fps applications at a working frequency of only 70.8 MHz.
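
    The quoted working frequency is consistent with the 64 cycles per macroblock given in the title. The Python sketch below works through that arithmetic; it assumes the standard 16 × 16 H.264/AVC macroblock and a design that spends exactly 64 clock cycles on each macroblock with no other overhead, and the function name is illustrative.

```python
MB_SIZE = 16  # H.264/AVC macroblocks cover 16x16 luma samples


def required_frequency_hz(width, height, fps, cycles_per_mb):
    """Clock rate needed to process every macroblock of every frame in real time."""
    mbs_per_frame = (width // MB_SIZE) * (height // MB_SIZE)
    return mbs_per_frame * fps * cycles_per_mb


freq = required_frequency_hz(4096, 2304, 30, 64)
print(f"{freq / 1e6:.1f} MHz")  # 70.8 MHz
```

    A 4096 × 2304 frame holds 256 × 144 = 36,864 macroblocks, so 30 frames per second at 64 cycles each requires 70,778,880 cycles per second.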

  5. A high-throughput and quantitative method to assess the mutagenic potential of translesion DNA synthesis

    PubMed Central

    Taggart, David J.; Camerlengo, Terry L.; Harrison, Jason K.; Sherrer, Shanen M.; Kshetry, Ajay K.; Taylor, John-Stephen; Huang, Kun; Suo, Zucai

    2013-01-01

    Cellular genomes are constantly damaged by endogenous and exogenous agents that covalently and structurally modify DNA to produce DNA lesions. Although most lesions are mended by various DNA repair pathways in vivo, a significant number of damage sites persist during genomic replication. Our understanding of the mutagenic outcomes derived from these unrepaired DNA lesions has been hindered by the low throughput of existing sequencing methods. Therefore, we have developed a cost-effective high-throughput short oligonucleotide sequencing assay that uses next-generation DNA sequencing technology for the assessment of the mutagenic profiles of translesion DNA synthesis catalyzed by any error-prone DNA polymerase. The vast amount of sequencing data produced were aligned and quantified by using our novel software. As an example, the high-throughput short oligonucleotide sequencing assay was used to analyze the types and frequencies of mutations upstream, downstream and at a site-specifically placed cis–syn thymidine–thymidine dimer generated individually by three lesion-bypass human Y-family DNA polymerases. PMID:23470999

  6. AmpliVar: mutation detection in high-throughput sequence from amplicon-based libraries.

    PubMed

    Hsu, Arthur L; Kondrashova, Olga; Lunke, Sebastian; Love, Clare J; Meldrum, Cliff; Marquis-Nicholson, Renate; Corboy, Greg; Pham, Kym; Wakefield, Matthew; Waring, Paul M; Taylor, Graham R

    2015-04-01

    Conventional means of identifying variants in high-throughput sequencing align each read against a reference sequence, and then call variants at each position. Here, we demonstrate an orthogonal means of identifying sequence variation by grouping the reads as amplicons prior to any alignment. We used AmpliVar to make key-value hashes of sequence reads and group reads as individual amplicons using a table of flanking sequences. Low-abundance reads were removed according to a selectable threshold, and reads above this threshold were aligned as groups, rather than as individual reads, permitting the use of sensitive alignment tools. We show that this approach is more sensitive, more specific, and more computationally efficient than comparable methods for the analysis of amplicon-based high-throughput sequencing data. The method can be extended to enable alignment-free confirmation of variants seen in hybridization capture target-enrichment data. © 2015 WILEY PERIODICALS, INC.
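
    The grouping step described in this abstract (hashing reads, assigning them to amplicons via flanking sequences, and dropping low-abundance reads) can be sketched in a few lines of Python. This is not AmpliVar's implementation: the function name, the exact prefix/suffix matching rule, and the toy reads are assumptions made for illustration.

```python
from collections import Counter


def group_reads_by_amplicon(reads, flanks, min_count=2):
    """Group identical reads by amplicon before any alignment.

    reads: iterable of read sequences.
    flanks: dict mapping amplicon name -> (left flank, right flank).
    min_count: reads seen fewer times than this are discarded.
    """
    counts = Counter(reads)  # key-value hash: read sequence -> abundance
    groups = {name: {} for name in flanks}
    for seq, n in counts.items():
        if n < min_count:
            continue  # low-abundance read, below the selectable threshold
        for name, (left, right) in flanks.items():
            if seq.startswith(left) and seq.endswith(right):
                groups[name][seq] = n
                break
    return groups


# Toy data: one amplicon flanked by ACG ... GGA.
reads = ["ACGTTTGGA"] * 5 + ["ACGTATGGA"] * 2 + ["ACGTCTGGA"] + ["TTTTCCCC"] * 3
print(group_reads_by_amplicon(reads, {"amp1": ("ACG", "GGA")}))
```

    Each surviving group could then be aligned once as a unit, which is what makes a slower but more sensitive aligner affordable in this scheme.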

  7. Ultrasensitive Detection of Multiplexed Somatic Mutations Using MALDI-TOF Mass Spectrometry.

    PubMed

    Mosko, Michael J; Nakorchevsky, Aleksey A; Flores, Eunice; Metzler, Heath; Ehrich, Mathias; van den Boom, Dirk J; Sherwood, James L; Nygren, Anders O H

    2016-01-01

    Multiplex detection of low-frequency mutations is becoming a necessary diagnostic tool for clinical laboratories interested in noninvasive prognosis and prediction. Challenges include the detection of minor alleles among abundant wild-type alleles, the heterogeneous nature of tumors, and the limited amount of available tissue. A method that can reliably detect minor variants <1% in a multiplexed reaction using a platform amenable to a variety of throughputs would meet these requirements. We developed a novel approach, UltraSEEK, for high-throughput, multiplexed, ultrasensitive mutation detection and used it for detection of mutant sequence mixtures as low as 0.1% minor allele frequency. The process consisted of multiplex PCR, followed by mutation-specific, single-base extension using chain terminators labeled with a moiety for solid phase capture. The captured and enriched products were then identified using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. For verification, we successfully analyzed ultralow fractions of mutations in a set of characterized cell lines, and included a direct comparison to droplet digital PCR. Finally, we verified the specificity in a set of 122 paired tumor and circulating cell-free DNA samples from melanoma patients. Our results show that the UltraSEEK chemistry is a particularly powerful approach for the detection of somatic variants, with the potential to be an invaluable resource to investigators in saving time and material without compromising analytical sensitivity and accuracy. Copyright © 2016 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  8. Lead discovery for mammalian elongation of long chain fatty acids family 6 using a combination of high-throughput fluorescent-based assay and RapidFire mass spectrometry assay

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Takamiya, Mari; Discovery Technology Laboratories, Sohyaku, Innovative Research Division, Mitsubishi Tanabe Pharma Corporation, Kawagishi, Toda-shi, Saitama; Sakurai, Masaaki

    A high-throughput RapidFire mass spectrometry assay is described for elongation of very long-chain fatty acids family 6 (Elovl6). Elovl6 is a microsomal enzyme that regulates the elongation of C12-16 saturated and monounsaturated fatty acids. Elovl6 may be a new therapeutic target for fat metabolism disorders such as obesity, type 2 diabetes, and nonalcoholic steatohepatitis. To identify new Elovl6 inhibitors, we developed a high-throughput fluorescence screening assay in 1536-well format. However, a number of false positives caused by fluorescent interference have been identified. To pick up the real active compounds among the primary hits from the fluorescence assay, we developed a RapidFire mass spectrometry assay and a conventional radioisotope assay. These assays have the advantage of detecting the main products directly without using fluorescent-labeled substrates. As a result, 276 compounds (30%) of the primary hits (921 compounds) in a fluorescence ultra-high-throughput screening method were identified as common active compounds in these two assays. It is concluded that both methods are very effective to eliminate false positives. Compared with the radioisotope method using an expensive ¹⁴C-labeled substrate, the RapidFire mass spectrometry method using unlabeled substrates is a high-accuracy, high-throughput method. In addition, some of the hit compounds selected from the screening inhibited cellular fatty acid elongation in HEK293 cells expressing Elovl6 transiently. This result suggests that these compounds may be promising lead candidates for therapeutic drugs. Ultra-high-throughput fluorescence screening followed by a RapidFire mass spectrometry assay was a suitable strategy for lead discovery against Elovl6. Highlights: • A novel assay for elongation of very-long-chain fatty acids 6 (Elovl6) is proposed. • RapidFire mass spectrometry (RF-MS) assay is useful to select real screening hits. • RF-MS assay is proved to be beneficial because of its high-throughput and accuracy. • A combination of fluorescent and RF-MS assays is effective for Elovl6 inhibitors.

  9. Fast log P determination by ultra-high-pressure liquid chromatography coupled with UV and mass spectrometry detections.

    PubMed

    Henchoz, Yveline; Guillarme, Davy; Martel, Sophie; Rudaz, Serge; Veuthey, Jean-Luc; Carrupt, Pierre-Alain

    2009-08-01

    Ultra-high-pressure liquid chromatography (UHPLC) systems able to work with columns packed with sub-2 µm particles offer very fast methods to determine the lipophilicity of new chemical entities. The careful development of the most suitable experimental conditions presented here will help medicinal chemists for high-throughput screening (HTS) log P(oct) measurements. The approach was optimized using a well-balanced set of 38 model compounds and a series of 28 basic compounds such as beta-blockers, local anesthetics, piperazines, clonidine, and derivatives. Different organic modifiers and hybrid stationary phases packed with 1.7-µm particles were evaluated in isocratic as well as gradient modes, and the advantages and limitations of tested conditions pointed out. The UHPLC approach offered a significant enhancement over the classical HPLC methods, by a factor of 50 in the lipophilicity determination throughput. The hyphenation of UHPLC with MS detection allowed a further increase in the throughput. Data and results reported herein prove that the UHPLC-MS method can represent a progress in the HTS-measurement of lipophilicity due to its speed (at least a factor of 500 with respect to HPLC approaches) and to an extended field of application.

  10. High-Throughput Block Optical DNA Sequence Identification.

    PubMed

    Sagar, Dodderi Manjunatha; Korshoj, Lee Erik; Hanson, Katrina Bethany; Chowdhury, Partha Pratim; Otoupal, Peter Britton; Chatterjee, Anushree; Nagpal, Prashant

    2018-01-01

    Optical techniques for molecular diagnostics or DNA sequencing generally rely on small molecule fluorescent labels, which utilize light with a wavelength of several hundred nanometers for detection. Developing a label-free optical DNA sequencing technique will require nanoscale focusing of light, a high-throughput and multiplexed identification method, and a data compression technique to rapidly identify sequences and analyze genomic heterogeneity for big datasets. Such a method should identify characteristic molecular vibrations using optical spectroscopy, especially in the "fingerprinting region" from ≈400-1400 cm⁻¹. Here, surface-enhanced Raman spectroscopy is used to demonstrate label-free identification of DNA nucleobases with multiplexed 3D plasmonic nanofocusing. While nanometer-scale mode volumes prevent identification of single nucleobases within a DNA sequence, the block optical technique can identify A, T, G, and C content in DNA k-mers. The content of each nucleotide in a DNA block can be a unique and high-throughput method for identifying sequences, genes, and other biomarkers as an alternative to single-letter sequencing. Additionally, coupling two complementary vibrational spectroscopy techniques (infrared and Raman) can improve block characterization. These results pave the way for developing a novel, high-throughput block optical sequencing method with lossy genomic data compression using k-mer identification from multiplexed optical data acquisition. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
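
    The idea of describing a DNA block by its A, T, G, and C content rather than by its letter order can be shown on the sequence side with a short Python sketch. It says nothing about the optical measurement itself; the function name and the choice of consecutive, non-overlapping blocks are assumptions made for illustration.

```python
from collections import Counter

BASES = "ATGC"


def block_content(sequence, k):
    """Return (A, T, G, C) counts for each consecutive, non-overlapping k-mer.

    Trailing bases that do not fill a whole block are ignored.
    """
    signatures = []
    for start in range(0, len(sequence) - k + 1, k):
        counts = Counter(sequence[start:start + k])
        signatures.append(tuple(counts[base] for base in BASES))
    return signatures


print(block_content("AATGCCGA", 4))  # [(2, 1, 1, 0), (1, 0, 1, 2)]
```

    The representation is lossy in the sense the abstract describes: "AATG" and "GATA" give the same signature, because base order within a block is discarded.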

  11. High-throughput sequencing methods to study neuronal RNA-protein interactions.

    PubMed

    Ule, Jernej

    2009-12-01

    UV-cross-linking and RNase protection, combined with high-throughput sequencing, have provided global maps of RNA sites bound by individual proteins or ribosomes. Using a stringent purification protocol, UV-CLIP (UV-cross-linking and immunoprecipitation) was able to identify intronic and exonic sites bound by splicing regulators in mouse brain tissue. Ribosome profiling has been used to quantify ribosome density on budding yeast mRNAs under different environmental conditions. Post-transcriptional regulation in neurons requires high spatial and temporal precision, as is evident from the role of localized translational control in synaptic plasticity. It remains to be seen if the high-throughput methods can be applied quantitatively to study the dynamics of RNP (ribonucleoprotein) remodelling in specific neuronal populations during the neurodegenerative process. It is certain, however, that applications of new biochemical techniques followed by high-throughput sequencing will continue to provide important insights into the mechanisms of neuronal post-transcriptional regulation.

  12. Optimization of High-Throughput Sequencing Kinetics for determining enzymatic rate constants of thousands of RNA substrates

    PubMed Central

    Niland, Courtney N.; Jankowsky, Eckhard; Harris, Michael E.

    2016-01-01

    Quantification of the specificity of RNA binding proteins and RNA processing enzymes is essential to understanding their fundamental roles in biological processes. High Throughput Sequencing Kinetics (HTS-Kin) uses high throughput sequencing and internal competition kinetics to simultaneously monitor the processing rate constants of thousands of substrates by RNA processing enzymes. This technique has provided unprecedented insight into the substrate specificity of the tRNA processing endonuclease ribonuclease P. Here, we investigate the accuracy and robustness of measurements associated with each step of the HTS-Kin procedure. We examine the effect of substrate concentration on the observed rate constant, determine the optimal kinetic parameters, and provide guidelines for reducing error in amplification of the substrate population. Importantly, we find that high-throughput sequencing and experimental reproducibility contribute their own sources of error, and these are the main sources of imprecision in the quantified results when otherwise optimized guidelines are followed. PMID:27296633

  13. Next-generation sequencing coupled with a cell-free display technology for high-throughput production of reliable interactome data

    PubMed Central

    Fujimori, Shigeo; Hirai, Naoya; Ohashi, Hiroyuki; Masuoka, Kazuyo; Nishikimi, Akihiko; Fukui, Yoshinori; Washio, Takanori; Oshikubo, Tomohiro; Yamashita, Tatsuhiro; Miyamoto-Sato, Etsuko

    2012-01-01

    Next-generation sequencing (NGS) has been applied to various kinds of omics studies, resulting in many biological and medical discoveries. However, high-throughput protein-protein interactome datasets derived from detection by sequencing are scarce, because protein-protein interaction analysis requires many cell manipulations to examine the interactions. The low reliability of the high-throughput data is also a problem. Here, we describe a cell-free display technology combined with NGS that can improve both the coverage and reliability of interactome datasets. The completely cell-free method gives a high-throughput and a large detection space, testing the interactions without using clones. The quantitative information provided by NGS reduces the number of false positives. The method is suitable for the in vitro detection of proteins that interact not only with the bait protein, but also with DNA, RNA and chemical compounds. Thus, it could become a universal approach for exploring the large space of protein sequences and interactome networks. PMID:23056904

  14. High-throughput sequencing: a failure mode analysis.

    PubMed

    Yang, George S; Stott, Jeffery M; Smailus, Duane; Barber, Sarah A; Balasundaram, Miruna; Marra, Marco A; Holt, Robert A

    2005-01-04

    Basic manufacturing principles are becoming increasingly important in high-throughput sequencing facilities where there is a constant drive to increase quality, increase efficiency, and decrease operating costs. While high-throughput centres report failure rates typically on the order of 10%, the causes of sporadic sequencing failures are seldom analyzed in detail and have not, in the past, been formally reported. Here we report the results of a failure mode analysis of our production sequencing facility based on detailed evaluation of 9,216 ESTs generated from two cDNA libraries. Two categories of failures are described; process-related failures (failures due to equipment or sample handling) and template-related failures (failures that are revealed by close inspection of electropherograms and are likely due to properties of the template DNA sequence itself). Preventative action based on a detailed understanding of failure modes is likely to improve the performance of other production sequencing pipelines.

  15. New Challenges of the Computation of Multiple Sequence Alignments in the High-Throughput Era (2010 JGI/ANL HPC Workshop)

    ScienceCinema

    Notredame, Cedric

    2018-05-02

    Cedric Notredame from the Centre for Genomic Regulation gives a presentation on New Challenges of the Computation of Multiple Sequence Alignments in the High-Throughput Era at the JGI/Argonne HPC Workshop on January 26, 2010.

  16. Evaluation of Sequencing Approaches for High-Throughput Transcriptomics - (BOSC)

    EPA Science Inventory

    Whole-genome in vitro transcriptomics has shown the capability to identify mechanisms of action and estimates of potency for chemical-mediated effects in a toxicological framework, but with limited throughput and high cost. The generation of high-throughput global gene expression...

  17. The promise and challenge of high-throughput sequencing of the antibody repertoire

    PubMed Central

    Georgiou, George; Ippolito, Gregory C; Beausang, John; Busse, Christian E; Wardemann, Hedda; Quake, Stephen R

    2014-01-01

    Efforts to determine the antibody repertoire encoded by B cells in the blood or lymphoid organs using high-throughput DNA sequencing technologies have been advancing at an extremely rapid pace and are transforming our understanding of humoral immune responses. Information gained from high-throughput DNA sequencing of immunoglobulin genes (Ig-seq) can be applied to detect B-cell malignancies with high sensitivity, to discover antibodies specific for antigens of interest, to guide vaccine development and to understand autoimmunity. Rapid progress in the development of experimental protocols and informatics analysis tools is helping to reduce sequencing artifacts, to achieve more precise quantification of clonal diversity and to extract the most pertinent biological information. That said, broader application of Ig-seq, especially in clinical settings, will require the development of a standardized experimental design framework that will enable the sharing and meta-analysis of sequencing data generated by different laboratories. PMID:24441474

  18. Pseudouridines have context-dependent mutation and stop rates in high-throughput sequencing.

    PubMed

    Zhou, Katherine I; Clark, Wesley C; Pan, David W; Eckwahl, Matthew J; Dai, Qing; Pan, Tao

    2018-05-11

    The abundant RNA modification pseudouridine (Ψ) has been mapped transcriptome-wide by chemically modifying pseudouridines with carbodiimide and detecting the resulting reverse transcription stops in high-throughput sequencing. However, these methods have limited sensitivity and specificity, in part due to the use of reverse transcription stops. We sought to use mutations rather than just stops in sequencing data to identify pseudouridine sites. Here, we identify reverse transcription conditions that allow read-through of carbodiimide-modified pseudouridine (CMC-Ψ), and we show that pseudouridines in carbodiimide-treated human ribosomal RNA have context-dependent mutation and stop rates in high-throughput sequencing libraries prepared under these conditions. Furthermore, accounting for the context-dependence of mutation and stop rates can enhance the detection of pseudouridine sites. Similar approaches could contribute to the sequencing-based detection of many RNA modifications.

  19. Discovery of viruses and virus-like pathogens in pistachio using high-throughput sequencing

    USDA-ARS's Scientific Manuscript database

    Pistachio (Pistacia vera L.) trees from the National Clonal Germplasm Repository (NCGR) and orchards in California were surveyed for viruses and virus-like agents by high-throughput sequencing (HTS). Analyses of 60 trees including clonal UCB-1 hybrid rootstock (P. atlantica × P. integerrima) identif...

  20. Investigation of Human Cancers for Retrovirus by Low-Stringency Target Enrichment and High-Throughput Sequencing.

    PubMed

    Vinner, Lasse; Mourier, Tobias; Friis-Nielsen, Jens; Gniadecki, Robert; Dybkaer, Karen; Rosenberg, Jacob; Langhoff, Jill Levin; Cruz, David Flores Santa; Fonager, Jannik; Izarzugaza, Jose M G; Gupta, Ramneek; Sicheritz-Ponten, Thomas; Brunak, Søren; Willerslev, Eske; Nielsen, Lars Peter; Hansen, Anders Johannes

    2015-08-19

    Although nearly one fifth of all human cancers have an infectious aetiology, the causes for the majority of cancers remain unexplained. Despite the enormous data output from high-throughput shotgun sequencing, viral DNA in a clinical sample typically constitutes a proportion of host DNA that is too small to be detected. Sequence variation among virus genomes complicates application of sequence-specific, and highly sensitive, PCR methods. Therefore, we aimed to develop and characterize a method that permits sensitive detection of sequences despite considerable variation. We demonstrate that our low-stringency in-solution hybridization method enables detection of <100 viral copies. Furthermore, distantly related proviral sequences may be enriched by orders of magnitude, enabling discovery of hitherto unknown viral sequences by high-throughput sequencing. The sensitivity was sufficient to detect retroviral sequences in clinical samples. We used this method to conduct an investigation for novel retrovirus in samples from three cancer types. In accordance with recent studies our investigation revealed no retroviral infections in human B-cell lymphoma cells, cutaneous T-cell lymphoma or colorectal cancer biopsies. Nonetheless, our generally applicable method makes sensitive detection possible and permits sequencing of distantly related sequences from complex material.

  1. Methods for comparative metagenomics

    PubMed Central

    Huson, Daniel H; Richter, Daniel C; Mitra, Suparna; Auch, Alexander F; Schuster, Stephan C

    2009-01-01

    Background: Metagenomics is a rapidly growing field of research that aims at studying uncultured organisms to understand the true diversity of microbes, their functions, cooperation and evolution, in environments such as soil, water, ancient remains of animals, or the digestive system of animals and humans. The recent development of ultra-high throughput sequencing technologies, which do not require cloning or PCR amplification, and can produce huge numbers of DNA reads at an affordable cost, has boosted the number and scope of metagenomic sequencing projects. Increasingly, there is a need for new ways of comparing multiple metagenomics datasets, and for fast and user-friendly implementations of such approaches. Results: This paper introduces a number of new methods for interactively exploring, analyzing and comparing multiple metagenomic datasets, which will be made freely available in a new, comparative version 2.0 of the stand-alone metagenome analysis tool MEGAN. Conclusion: There is a great need for powerful and user-friendly tools for comparative analysis of metagenomic data and MEGAN 2.0 will help to fill this gap. PMID:19208111

  2. Gene expression profiling of human breast tissue samples using SAGE-Seq.

    PubMed

    Wu, Zhenhua Jeremy; Meyer, Clifford A; Choudhury, Sibgat; Shipitsin, Michail; Maruyama, Reo; Bessarabova, Marina; Nikolskaya, Tatiana; Sukumar, Saraswati; Schwartzman, Armin; Liu, Jun S; Polyak, Kornelia; Liu, X Shirley

    2010-12-01

    We present a powerful application of ultra high-throughput sequencing, SAGE-Seq, for the accurate quantification of normal and neoplastic mammary epithelial cell transcriptomes. We develop data analysis pipelines that allow the mapping of sense and antisense strands of mitochondrial and RefSeq genes, the normalization between libraries, and the identification of differentially expressed genes. We find that the diversity of cancer transcriptomes is significantly higher than that of normal cells. Our analysis indicates that transcript discovery plateaus at 10 million reads/sample, and suggests a minimum desired sequencing depth around five million reads. Comparison of SAGE-Seq and traditional SAGE on normal and cancerous breast tissues reveals higher sensitivity of SAGE-Seq to detect less-abundant genes, including those encoding for known breast cancer-related transcription factors and G protein-coupled receptors (GPCRs). SAGE-Seq is able to identify genes and pathways abnormally activated in breast cancer that traditional SAGE failed to call. SAGE-Seq is a powerful method for the identification of biomarkers and therapeutic targets in human disease.

  3. High-throughput and automated SAXS/USAXS experiment for industrial use at BL19B2 in SPring-8

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Osaka, Keiichi; Inoue, Daisuke; Sato, Masugu

    A highly automated system combining a sample transfer robot with a focused SR beam has been established for small-angle and ultra small-angle X-ray scattering (SAXS/USAXS) measurement at BL19B2 for industrial use of SPring-8. A high-throughput data collection system can be realized by means of an X-ray beam of high photon flux density concentrated by a cylindrical mirror, and a two-dimensional pixel detector PILATUS-2M. For SAXS measurement, we can obtain high-quality data within 1 minute for one exposure using this system. The sample transfer robot has a capacity of 90 samples with a large variety of shapes. The fusion of high-throughput and robotic system has enhanced the usability of SAXS/USAXS capability for industrial application.

  4. Comprehensive analysis of RNA-protein interactions by high-throughput sequencing-RNA affinity profiling.

    PubMed

    Tome, Jacob M; Ozer, Abdullah; Pagano, John M; Gheba, Dan; Schroth, Gary P; Lis, John T

    2014-06-01

    RNA-protein interactions play critical roles in gene regulation, but methods to quantitatively analyze these interactions at a large scale are lacking. We have developed a high-throughput sequencing-RNA affinity profiling (HiTS-RAP) assay by adapting a high-throughput DNA sequencer to quantify the binding of fluorescently labeled protein to millions of RNAs anchored to sequenced cDNA templates. Using HiTS-RAP, we measured the affinity of mutagenized libraries of GFP-binding and NELF-E-binding aptamers to their respective targets and identified critical regions of interaction. Mutations additively affected the affinity of the NELF-E-binding aptamer, whose interaction depended mainly on a single-stranded RNA motif, but not that of the GFP aptamer, whose interaction depended primarily on secondary structure.

  5. High-Throughput Fabrication of Ultradense Annular Nanogap Arrays for Plasmon-Enhanced Spectroscopy.

    PubMed

    Cai, Hongbing; Meng, Qiushi; Zhao, Hui; Li, Mingling; Dai, Yanmeng; Lin, Yue; Ding, Huaiyi; Pan, Nan; Tian, Yangchao; Luo, Yi; Wang, Xiaoping

    2018-06-13

    The confinement of light into nanometer-sized metallic nanogaps can lead to an extremely high field enhancement, resulting in dramatically enhanced absorption, emission, and surface-enhanced Raman scattering (SERS) of molecules embedded in nanogaps. However, low-cost, high-throughput, and reliable fabrication of ultra-high-dense nanogap arrays with precise control of the gap size still remains a challenge. Here, by combining colloidal lithography and atomic layer deposition technique, a reproducible method for fabricating ultra-high-dense arrays of hexagonal close-packed annular nanogaps over large areas is demonstrated. The annular nanogap arrays with a minimum diameter smaller than 100 nm and sub-1 nm gap width have been produced, showing excellent SERS performance with a typical enhancement factor up to 3.1 × 10^6 and a detection limit of 10^-11 M. Moreover, it can also work as a high-quality field enhancement substrate for studying two-dimensional materials, such as MoSe2. Our method provides an attractive approach to produce controllable nanogaps for enhanced light-matter interaction at the nanoscale.

  6. Genome sequencing in microfabricated high-density picolitre reactors.

    PubMed

    Margulies, Marcel; Egholm, Michael; Altman, William E; Attiya, Said; Bader, Joel S; Bemben, Lisa A; Berka, Jan; Braverman, Michael S; Chen, Yi-Ju; Chen, Zhoutao; Dewell, Scott B; Du, Lei; Fierro, Joseph M; Gomes, Xavier V; Godwin, Brian C; He, Wen; Helgesen, Scott; Ho, Chun Heen; Ho, Chun He; Irzyk, Gerard P; Jando, Szilveszter C; Alenquer, Maria L I; Jarvie, Thomas P; Jirage, Kshama B; Kim, Jong-Bum; Knight, James R; Lanza, Janna R; Leamon, John H; Lefkowitz, Steven M; Lei, Ming; Li, Jing; Lohman, Kenton L; Lu, Hong; Makhijani, Vinod B; McDade, Keith E; McKenna, Michael P; Myers, Eugene W; Nickerson, Elizabeth; Nobile, John R; Plant, Ramona; Puc, Bernard P; Ronan, Michael T; Roth, George T; Sarkis, Gary J; Simons, Jan Fredrik; Simpson, John W; Srinivasan, Maithreyan; Tartaro, Karrie R; Tomasz, Alexander; Vogt, Kari A; Volkmer, Greg A; Wang, Shally H; Wang, Yong; Weiner, Michael P; Yu, Pengguang; Begley, Richard F; Rothberg, Jonathan M

    2005-09-15

    The proliferation of large-scale DNA-sequencing projects in recent years has driven a search for alternative methods to reduce time and cost. Here we describe a scalable, highly parallel sequencing system with raw throughput significantly greater than that of state-of-the-art capillary electrophoresis instruments. The apparatus uses a novel fibre-optic slide of individual wells and is able to sequence 25 million bases, at 99% or better accuracy, in one four-hour run. To achieve an approximately 100-fold increase in throughput over current Sanger sequencing technology, we have developed an emulsion method for DNA amplification and an instrument for sequencing by synthesis using a pyrosequencing protocol optimized for solid support and picolitre-scale volumes. Here we show the utility, throughput, accuracy and robustness of this system by shotgun sequencing and de novo assembly of the Mycoplasma genitalium genome with 96% coverage at 99.96% accuracy in one run of the machine.

  7. Quartz-Seq2: a high-throughput single-cell RNA-sequencing method that effectively uses limited sequence reads.

    PubMed

    Sasagawa, Yohei; Danno, Hiroki; Takada, Hitomi; Ebisawa, Masashi; Tanaka, Kaori; Hayashi, Tetsutaro; Kurisaki, Akira; Nikaido, Itoshi

    2018-03-09

    High-throughput single-cell RNA-seq methods assign limited unique molecular identifier (UMI) counts as gene expression values to single cells from shallow sequence reads and detect limited gene counts. We thus developed a high-throughput single-cell RNA-seq method, Quartz-Seq2, to overcome these issues. Our improvements in the reaction steps make it possible to effectively convert initial reads to UMI counts, at a rate of 30-50%, and detect more genes. To demonstrate the power of Quartz-Seq2, we analyzed approximately 10,000 transcriptomes from in vitro embryonic stem cells and an in vivo stromal vascular fraction with a limited number of reads.
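
    The read-to-UMI conversion mentioned above can be illustrated with a minimal sketch. It assumes each read has already been assigned a (cell barcode, gene, UMI) triple and that reads sharing all three fields are duplicates of one molecule; the function names and toy data are hypothetical, and this is not the Quartz-Seq2 pipeline itself. The rate computed here (distinct UMI counts divided by reads) is one plausible reading of the 30-50% figure.

```python
from collections import defaultdict


def umi_counts(reads):
    """Collapse reads into UMI counts per (cell, gene).

    `reads` is an iterable of (cell_barcode, gene, umi) tuples. Reads that
    share all three fields are treated as duplicates of a single molecule.
    No UMI error correction is attempted in this sketch.
    """
    umis = defaultdict(set)
    for cell, gene, umi in reads:
        umis[(cell, gene)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}


def conversion_rate(reads):
    """Fraction of reads that survive as distinct UMI counts."""
    reads = list(reads)
    if not reads:
        return 0.0
    return sum(umi_counts(reads).values()) / len(reads)


# Hypothetical toy data: 5 reads representing 3 distinct molecules.
toy_reads = [
    ("cellA", "Actb", "AACG"),
    ("cellA", "Actb", "AACG"),  # duplicate of the read above
    ("cellA", "Actb", "TTGC"),
    ("cellB", "Actb", "AACG"),
    ("cellB", "Actb", "AACG"),  # duplicate of the read above
]

if __name__ == "__main__":
    print(umi_counts(toy_reads))
    print(conversion_rate(toy_reads))
```

    Keeping a set of UMIs per (cell, gene) pair makes duplicate reads collapse automatically; real pipelines usually also merge UMIs that differ by sequencing errors, which is omitted here.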

  8. Pediatric Glioblastoma Therapies Based on Patient-Derived Stem Cell Resources

    DTIC Science & Technology

    2014-11-01

    genomic DNA and then subjected to Illumina high-throughput sequencing. In this analysis, shRNAs lost in the GSC population represent candidate gene... PRISM 7900 Sequence Detection System (Genomics Resource, FHCRC). Relative transcript abundance was analyzed using the 2^−ΔΔCt method. TRIzol (Invitrogen

  9. Protein Sequence Annotation Tool (PSAT): A centralized web-based meta-server for high-throughput sequence annotations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leung, Elo; Huang, Amy; Cadag, Eithon

    In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein FASTA data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well-annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. Lastly, PSAT is a meta-server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.

  10. Protein Sequence Annotation Tool (PSAT): A centralized web-based meta-server for high-throughput sequence annotations

    DOE PAGES

    Leung, Elo; Huang, Amy; Cadag, Eithon; ...

    2016-01-20

    In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein FASTA data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well-annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. Lastly, PSAT is a meta-server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.

  11. Characterization of the indigenous microflora in raw and pasteurized buffalo milk during storage at refrigeration temperature by high-throughput sequencing

    USDA-ARS's Scientific Manuscript database

    The effect of refrigeration on bacterial communities within raw and pasteurized buffalo milk was studied using high-throughput sequencing. High quality samples of raw buffalo milk were obtained from five dairy farms in the Guangxi province of China. A sample of each milk was pasteurized, and both r...

  12. Targeted Capture and High-Throughput Sequencing Using Molecular Inversion Probes (MIPs).

    PubMed

    Cantsilieris, Stuart; Stessman, Holly A; Shendure, Jay; Eichler, Evan E

    2017-01-01

    Molecular inversion probes (MIPs) in combination with massively parallel DNA sequencing represent a versatile, yet economical tool for targeted sequencing of genomic DNA. Several thousand genomic targets can be selectively captured using long oligonucleotides containing unique targeting arms and universal linkers. The ability to append sequencing adaptors and sample-specific barcodes allows large-scale pooling and subsequent high-throughput sequencing at relatively low cost per sample. Here, we describe a "wet bench" protocol detailing the capture and subsequent sequencing of >2000 genomic targets from 192 samples, representative of a single lane on the Illumina HiSeq 2000 platform.
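
    The role of sample-specific barcodes in pooling can be sketched as a simple demultiplexing step. This is a minimal illustration under stated assumptions, not the protocol's actual software: it assumes the barcode is the first `barcode_len` bases of each read and must match exactly, whereas on Illumina instruments the sample barcode is typically read separately as an index read. The function name, barcodes, and reads are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_len):
    """Group pooled reads by sample using an exact-match prefix barcode.

    Assumption for this sketch: the first `barcode_len` bases of each read
    are the sample barcode. Reads with an unknown barcode are collected
    under the key "unassigned".
    """
    by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical pooled reads and barcode table.
pooled_reads = ["ACGTTTGACC", "TGCAGGATCC", "ACGTCCCAAA", "NNNNACGTAC"]
barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}

if __name__ == "__main__":
    for sample, inserts in demultiplex(pooled_reads, barcodes, 4).items():
        print(sample, inserts)
```

    Exact matching keeps the sketch short; production demultiplexers generally tolerate one or more barcode mismatches, which requires barcodes chosen to be sufficiently far apart.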

  13. Construction of an ultra-high density consensus genetic map, and enhancement of the physical map from genome sequencing in Lupinus angustifolius.

    PubMed

    Zhou, Gaofeng; Jian, Jianbo; Wang, Penghao; Li, Chengdao; Tao, Ye; Li, Xuan; Renshaw, Daniel; Clements, Jonathan; Sweetingham, Mark; Yang, Huaan

    2018-01-01

    An ultra-high density genetic map containing 34,574 sequence-defined markers was developed in Lupinus angustifolius. Markers closely linked to nine genes of agronomic traits were identified. A physical map was improved to cover 560.5 Mb genome sequence. Lupin (Lupinus angustifolius L.) is a recently domesticated legume grain crop. In this study, we applied the restriction-site associated DNA sequencing (RADseq) method to genotype an F9 recombinant inbred line population derived from a wild type × domesticated cultivar (W × D) cross. A high density linkage map was developed based on the W × D population. By integrating sequence-defined DNA markers reported in previous mapping studies, we established an ultra-high density consensus genetic map, which contains 34,574 markers consisting of 3508 loci covering 2399 cM on 20 linkage groups. The largest gap in the entire consensus map was 4.73 cM. The high density W × D map and the consensus map were used to develop an improved physical map, which covered 560.5 Mb of genome sequence data. The ultra-high density consensus linkage map, the improved physical map and the markers linked to genes of breeding interest reported in this study provide a common tool for genome sequence assembly, structural genomics, comparative genomics, functional genomics, QTL mapping, and molecular plant breeding in lupin.

  14. Target enrichment and high-throughput sequencing of 80 ribosomal protein genes to identify mutations associated with Diamond-Blackfan anaemia.

    PubMed

    Gerrard, Gareth; Valgañón, Mikel; Foong, Hui En; Kasperaviciute, Dalia; Iskander, Deena; Game, Laurence; Müller, Michael; Aitman, Timothy J; Roberts, Irene; de la Fuente, Josu; Foroni, Letizia; Karadimitris, Anastasios

    2013-08-01

    Diamond-Blackfan anaemia (DBA) is caused by inactivating mutations in ribosomal protein (RP) genes, with mutations in 13 of the 80 RP genes accounting for 50-60% of cases. The remaining 40-50% of cases may harbour mutations in one of the remaining RP genes, but the very low frequencies render conventional genetic screening challenging. We, therefore, applied custom enrichment technology combined with high-throughput sequencing to screen all 80 RP genes. Using this approach, we identified and validated inactivating mutations in 15/17 (88%) DBA patients. Target enrichment combined with high-throughput sequencing is a robust and improved methodology for the genetic diagnosis of DBA. © 2013 John Wiley & Sons Ltd.

  15. Overcoming bias and systematic errors in next generation sequencing data.

    PubMed

    Taub, Margaret A; Corrada Bravo, Hector; Irizarry, Rafael A

    2010-12-10

    Considerable time and effort has been spent in developing analysis and quality assessment methods to allow the use of microarrays in a clinical setting. As is the case for microarrays and other high-throughput technologies, data from new high-throughput sequencing technologies are subject to technological and biological biases and systematic errors that can impact downstream analyses. Only when these issues can be readily identified and reliably adjusted for will clinical applications of these new technologies be feasible. Although much work remains to be done in this area, we describe consistently observed biases that should be taken into account when analyzing high-throughput sequencing data. In this article, we review current knowledge about these biases, discuss their impact on analysis results, and propose solutions.

  16. Brain MR imaging at ultra-low radiofrequency power.

    PubMed

    Sarkar, Subhendra N; Alsop, David C; Madhuranthakam, Ananth J; Busse, Reed F; Robson, Philip M; Rofsky, Neil M; Hackney, David B

    2011-05-01

    To explore the lower limits for radiofrequency (RF) power-induced specific absorption rate (SAR) achievable at 1.5 T for brain magnetic resonance (MR) imaging without loss of tissue signal or contrast present in high-SAR clinical imaging in order to create a potentially viable MR method at ultra-low RF power to image tissues containing implanted devices. An institutional review board-approved HIPAA-compliant prospective MR study design was used, with written informed consent from all subjects prior to MR sessions. Seven healthy subjects were imaged prospectively at 1.5 T with ultra-low-SAR optimized three-dimensional (3D) fast spin-echo (FSE) and fluid-attenuated inversion-recovery (FLAIR) T2-weighted sequences and an ultra-low-SAR 3D spoiled gradient-recalled acquisition in the steady state T1-weighted sequence. Corresponding high-SAR two-dimensional (2D) clinical sequences were also performed. In addition to qualitative comparisons, absolute signal-to-noise ratios (SNRs) and contrast-to-noise ratios (CNRs) for multicoil, parallel imaging acquisitions were generated by using a Monte Carlo method for quantitative comparison between ultra-low-SAR and high-SAR results. There were minor to moderate differences in the absolute tissue SNR and CNR values and in qualitative appearance of brain images obtained by using ultra-low-SAR and high-SAR techniques. High-SAR 2D T2-weighted imaging produced slightly higher SNR, while ultra-low-SAR 3D technique not only produced higher SNR for T1-weighted and FLAIR images but also higher CNRs for all three sequences for most of the brain tissues. The 3D techniques adopted here led to a decrease in the absorbed RF power by two orders of magnitude at 1.5 T, and still the image quality was preserved within clinically acceptable imaging times. RSNA, 2011

  17. Molecular characterization of a novel rhabdovirus infecting blackcurrant identified by high-throughput sequencing.

    PubMed

    Wu, L-P; Yang, T; Liu, H-W; Postman, J; Li, R

    2018-05-01

    A large contig with sequence similarities to several nucleorhabdoviruses was identified by high-throughput sequencing analysis from a black currant (Ribes nigrum L.) cultivar. The complete genome sequence of this new nucleorhabdovirus is 14,432 nucleotides long. Its genomic organization is very similar to those of unsegmented plant rhabdoviruses, containing six open reading frames in the order 3'-N-P-P3-M-G-L-5'. The virus, which is provisionally named "black currant-associated rhabdovirus", is 41-52% identical in its genome nucleotide sequence to other nucleorhabdoviruses and may represent a new species in the genus Nucleorhabdovirus.

  18. Quantitative Assessment of RNA-Protein Interactions with High Throughput Sequencing - RNA Affinity Profiling (HiTS-RAP)

    PubMed Central

    Ozer, Abdullah; Tome, Jacob M.; Friedman, Robin C.; Gheba, Dan; Schroth, Gary P.; Lis, John T.

    2016-01-01

    Because RNA-protein interactions play a central role in a wide array of biological processes, methods that enable a quantitative assessment of these interactions in a high-throughput manner are in great demand. Recently, we developed the High Throughput Sequencing-RNA Affinity Profiling (HiTS-RAP) assay, which couples sequencing on an Illumina GAIIx with the quantitative assessment of one or several proteins’ interactions with millions of different RNAs in a single experiment. We have successfully used HiTS-RAP to analyze interactions of EGFP and NELF-E proteins with their corresponding canonical and mutant RNA aptamers. Here, we provide a detailed protocol for HiTS-RAP, which can be completed in about a month (8 days hands-on time) including the preparation and testing of recombinant proteins and DNA templates, clustering DNA templates on a flowcell, high-throughput sequencing and protein binding with GAIIx, and finally data analysis. We also highlight aspects of HiTS-RAP that can be further improved and points of comparison between HiTS-RAP and two other recently developed methods, RNA-MaP and RBNS. A successful HiTS-RAP experiment provides the sequence and binding curves for approximately 200 million RNAs in a single experiment. PMID:26182240

  19. A high-throughput multiplex method adapted for GMO detection.

    PubMed

    Chaouachi, Maher; Chupeau, Gaëlle; Berard, Aurélie; McKhann, Heather; Romaniuk, Marcel; Giancola, Sandra; Laval, Valérie; Bertheau, Yves; Brunel, Dominique

    2008-12-24

    A high-throughput multiplex assay for the detection of genetically modified organisms (GMO) was developed on the basis of the existing SNPlex method designed for SNP genotyping. This SNPlex assay allows the simultaneous detection of up to 48 short DNA sequences (approximately 70 bp; "signature sequences") from taxa endogenous reference genes, from GMO constructions, screening targets, construct-specific, and event-specific targets, and finally from donor organisms. This assay avoids certain shortcomings of multiplex PCR-based methods already in widespread use for GMO detection. The assay demonstrated high specificity and sensitivity. The results suggest that this assay is reliable, flexible, and cost- and time-effective for high-throughput GMO detection.

  20. Interaction of DNA and Proteins with Single Nanopores

    NASA Astrophysics Data System (ADS)

    Kasianowicz, J. J.

    2006-03-01

    The bacterial toxins Staphylococcus aureus alpha-hemolysin and Bacillus anthracis protective antigen kill cells in part by forming ion channels in target membranes. We are using electrophysiology, molecular biology/protein biochemistry and computer modeling to study how biopolymers (e.g., single-stranded DNA and proteins) bind to and transport through these nanometer-scale pores. The results provide insight into the mechanism by which these toxins work and are the basis for several potential nanobiotechnology applications including ultra-rapid DNA sequencing, the sensitive and selective detection of a wide range of analytes and high throughput screening of therapeutic agents against several anthrax toxins. In collaboration with V.M. Stanford, M. Misakian, B. Nablo, S.E. Henrickson, NIST, EEEL, Gaithersburg, MD; T. Nguyen, R. Gussio, NCI, Ft. Detrick, MD; and K.M. Halverson, S. Bavari, R.G. Panchal, USAMRIID, Ft. Detrick, MD.

  1. SMARTIV: combined sequence and structure de-novo motif discovery for in-vivo RNA binding data.

    PubMed

    Polishchuk, Maya; Paz, Inbal; Yakhini, Zohar; Mandel-Gutfreund, Yael

    2018-05-25

    Gene expression regulation is highly dependent on binding of RNA-binding proteins (RBPs) to their RNA targets. Growing evidence supports the notion that both RNA primary sequence and its local secondary structure play a role in specific Protein-RNA recognition and binding. Despite the great advance in high-throughput experimental methods for identifying sequence targets of RBPs, predicting the specific sequence and structure binding preferences of RBPs remains a major challenge. We present a novel webserver, SMARTIV, designed for discovering and visualizing combined RNA sequence and structure motifs from high-throughput RNA-binding data, generated from in-vivo experiments. The uniqueness of SMARTIV is that it predicts motifs from enriched k-mers that combine information from ranked RNA sequences and their predicted secondary structure, obtained using various folding methods. Consequently, SMARTIV generates Position Weight Matrices (PWMs) in a combined sequence and structure alphabet with assigned P-values. SMARTIV concisely represents the sequence and structure motif content as a single graphical logo, which is informative and easy for visual perception. SMARTIV was examined extensively on a variety of high-throughput binding experiments for RBPs from different families, generated from different technologies, showing consistent and accurate results. Finally, SMARTIV is a user-friendly webserver, highly efficient in run-time and freely accessible via http://smartiv.technion.ac.il/.

  2. RIKEN Integrated Sequence Analysis (RISA) System—384-Format Sequencing Pipeline with 384 Multicapillary Sequencer

    PubMed Central

    Shibata, Kazuhiro; Itoh, Masayoshi; Aizawa, Katsunori; Nagaoka, Sumiharu; Sasaki, Nobuya; Carninci, Piero; Konno, Hideaki; Akiyama, Junichi; Nishi, Katsuo; Kitsunai, Tokuji; Tashiro, Hideo; Itoh, Mari; Sumi, Noriko; Ishii, Yoshiyuki; Nakamura, Shin; Hazama, Makoto; Nishine, Tsutomu; Harada, Akira; Yamamoto, Rintaro; Matsumoto, Hiroyuki; Sakaguchi, Sumito; Ikegami, Takashi; Kashiwagi, Katsuya; Fujiwake, Syuji; Inoue, Kouji; Togawa, Yoshiyuki; Izawa, Masaki; Ohara, Eiji; Watahiki, Masanori; Yoneda, Yuko; Ishikawa, Tomokazu; Ozawa, Kaori; Tanaka, Takumi; Matsuura, Shuji; Kawai, Jun; Okazaki, Yasushi; Muramatsu, Masami; Inoue, Yorinao; Kira, Akira; Hayashizaki, Yoshihide

    2000-01-01

    The RIKEN high-throughput 384-format sequencing pipeline (RISA system) including a 384-multicapillary sequencer (the so-called RISA sequencer) was developed for the RIKEN mouse encyclopedia project. The RISA system consists of colony picking, template preparation, sequencing reaction, and the sequencing process. A novel high-throughput 384-format capillary sequencer system (RISA sequencer system) was developed for the sequencing process. This system consists of a 384-multicapillary auto sequencer (RISA sequencer), a 384-multicapillary array assembler (CAS), and a 384-multicapillary casting device. The RISA sequencer can simultaneously analyze 384 independent sequencing products. The optical system is a scanning system chosen after careful comparison with an image detection system for the simultaneous detection of the 384-capillary array. This scanning system can be used with any fluorescent-labeled sequencing reaction (chain termination reaction), including transcriptional sequencing based on RNA polymerase, which was originally developed by us, and cycle sequencing based on thermostable DNA polymerase. For long-read sequencing, 380 out of 384 sequences (99.2%) were successfully analyzed and the average read length, with more than 99% accuracy, was 654.4 bp. A single RISA sequencer can analyze 216 kb with >99% accuracy in 2.7 h (90 kb/h). For short-read sequencing to cluster the 3′ end and 5′ end sequencing by reading 350 bp, 384 samples can be analyzed in 1.5 h. We have also developed a RISA inoculator, RISA filtrator and densitometer, RISA plasmid preparator which can handle throughput of 40,000 samples in 17.5 h, and a high-throughput RISA thermal cycler which has four 384-well sites. The combination of these technologies allowed us to construct the RISA system consisting of 16 RISA sequencers, which can process 50,000 DNA samples per day. 
    One haploid genome shotgun sequence of a higher organism, such as human, mouse, rat, domestic animals, and plants, can be revealed by seven RISA systems within one month. PMID:11076861

  3. Increasing ecological inference from high throughput sequencing of fungi in the environment through a tagging approach

    Treesearch

    D. Lee Taylor; Michael G. Booth; Jack W. McFarland; Ian C. Herriott; Niall J. Lennon; Chad Nusbaum; Thomas G. Marr

    2008-01-01

    High throughput sequencing methods are widely used in analyses of microbial diversity but are generally applied to small numbers of samples, which precludes characterization of patterns of microbial diversity across space and time. We have designed a primer-tagging approach that allows pooling and subsequent sorting of numerous samples, which is directed to...

  4. Enzyme catalysis: Evolution made easy

    NASA Astrophysics Data System (ADS)

    Wee, Eugene J. H.; Trau, Matt

    2014-09-01

    Directed evolution is a powerful tool for the development of improved enzyme catalysts. Now, a method that enables an enzyme, its encoding DNA and a fluorescent reaction product to be encapsulated in a gel bead enables the application of directed evolution in an ultra-high-throughput format.

  5. Ultra-high-throughput screening method for the directed evolution of glucose oxidase.

    PubMed

    Ostafe, Raluca; Prodanovic, Radivoje; Nazor, Jovana; Fischer, Rainer

    2014-03-20

    Glucose oxidase (GOx) is used in many industrial processes that could benefit from improved versions of the enzyme. Some improvements like higher activity under physiological conditions and thermal stability could be useful for GOx applications in biosensors and biofuel cells. Directed evolution is one of the currently available methods to engineer improved GOx variants. Here, we describe an ultra-high-throughput screening system for sorting the best enzyme variants generated by directed evolution that incorporates several methodological refinements: flow cytometry, in vitro compartmentalization, yeast surface display, fluorescent labeling of the expressed enzyme, delivery of glucose substrate to the reaction mixture through the oil phase, and covalent labeling of the cells with fluorescein-tyramide. The method enables quantitative screening of gene libraries to identify clones with improved activity and it also allows cells to be selected based not only on the overall activity but also on the specific activity of the enzyme. Copyright © 2014 Elsevier Ltd. All rights reserved.

  6. Ultra-sensitive Sequencing Identifies High Prevalence of Clonal Hematopoiesis-Associated Mutations throughout Adult Life.

    PubMed

    Acuna-Hidalgo, Rocio; Sengul, Hilal; Steehouwer, Marloes; van de Vorst, Maartje; Vermeulen, Sita H; Kiemeney, Lambertus A L M; Veltman, Joris A; Gilissen, Christian; Hoischen, Alexander

    2017-07-06

    Clonal hematopoiesis results from somatic mutations in hematopoietic stem cells, which give an advantage to mutant cells, driving their clonal expansion and potentially leading to leukemia. The acquisition of clonal hematopoiesis-driver mutations (CHDMs) occurs with normal aging and these mutations have been detected in more than 10% of individuals ≥65 years. We aimed to examine the prevalence and characteristics of CHDMs throughout adult life. We developed a targeted re-sequencing assay combining high-throughput with ultra-high sensitivity based on single-molecule molecular inversion probes (smMIPs). Using smMIPs, we screened more than 100 loci for CHDMs in more than 2,000 blood DNA samples from population controls between 20 and 69 years of age. Loci screened included 40 regions known to drive clonal hematopoiesis when mutated and 64 novel candidate loci. We identified 224 somatic mutations throughout our cohort, of which 216 were coding mutations in known driver genes (DNMT3A, JAK2, GNAS, TET2, and ASXL1), including 196 point mutations and 20 indels. Our assay's improved sensitivity allowed us to detect mutations with variant allele frequencies as low as 0.001. CHDMs were identified in more than 20% of individuals 60 to 69 years of age and in 3% of individuals 20 to 29 years of age, approximately double the previously reported prevalence despite screening a limited set of loci. Our findings support the occurrence of clonal hematopoiesis-associated mutations as a widespread mechanism linked with aging, suggesting that mosaicism as a result of clonal evolution of cells harboring somatic mutations is a universal mechanism occurring at all ages in healthy humans. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  7. Large-Scale Biomonitoring of Remote and Threatened Ecosystems via High-Throughput Sequencing

    PubMed Central

    Gibson, Joel F.; Shokralla, Shadi; Curry, Colin; Baird, Donald J.; Monk, Wendy A.; King, Ian; Hajibabaei, Mehrdad

    2015-01-01

    Biodiversity metrics are critical for assessment and monitoring of ecosystems threatened by anthropogenic stressors. Existing sorting and identification methods are too expensive and labour-intensive to be scaled up to meet management needs. Alternately, a high-throughput DNA sequencing approach could be used to determine biodiversity metrics from bulk environmental samples collected as part of a large-scale biomonitoring program. Here we show that both morphological and DNA sequence-based analyses are suitable for recovery of individual taxonomic richness, estimation of proportional abundance, and calculation of biodiversity metrics using a set of 24 benthic samples collected in the Peace-Athabasca Delta region of Canada. The high-throughput sequencing approach was able to recover all metrics with a higher degree of taxonomic resolution than morphological analysis. The reduced cost and increased capacity of DNA sequence-based approaches will finally allow environmental monitoring programs to operate at the geographical and temporal scale required by industrial and regulatory end-users. PMID:26488407

  8. A comparative analysis of high-throughput platforms for validation of a circulating microRNA signature in diabetic retinopathy.

    PubMed

    Farr, Ryan J; Januszewski, Andrzej S; Joglekar, Mugdha V; Liang, Helena; McAulley, Annie K; Hewitt, Alex W; Thomas, Helen E; Loudovaris, Tom; Kay, Thomas W H; Jenkins, Alicia; Hardikar, Anandwardhan A

    2015-06-02

    MicroRNAs are now increasingly recognized as biomarkers of disease progression. Several quantitative real-time PCR (qPCR) platforms have been developed to determine the relative levels of microRNAs in biological fluids. We systematically compared the detection of cellular and circulating microRNA using a standard 96-well platform, a high-content microfluidics platform and two ultra-high content platforms. We used extensive analytical tools to compute inter- and intra-run variability and concordance measured using fidelity scoring, coefficient of variation and cluster analysis. We carried out unprejudiced next generation sequencing to identify a microRNA signature for Diabetic Retinopathy (DR) and systematically assessed the validation of this signature on clinical samples using each of the above four qPCR platforms. The results indicate that sensitivity to measure low copy number microRNAs is inversely related to qPCR reaction volume and that the choice of platform for microRNA biomarker validation should be made based on the abundance of miRNAs of interest.

  9. Characterization and complete genome sequence of a panicovirus from Bermuda grass by high-throughput sequencing.

    PubMed

    Tahir, Muhammad N; Lockhart, Ben; Grinstead, Samuel; Mollov, Dimitre

    2017-04-01

    Bermuda grass samples were examined by transmission electron microscopy and 28-30 nm spherical virus particles were observed. Total RNA from these plants was subjected to high-throughput sequencing (HTS). The nearly full genome sequence of a panicovirus was identified from one HTS scaffold. Sanger sequencing was used to confirm the HTS results and complete the genome sequence of 4404 nt. This virus was provisionally named Bermuda grass latent virus (BGLV). Its predicted open reading frames follow the typical arrangement of the genus Panicovirus. Based on sequence comparisons and phylogenetic analyses BGLV differs from other viruses and therefore taxonomically it is a new member of the genus Panicovirus, family Tombusviridae.

  10. Beam engineering for zero conicity cutting and drilling with ultra fast laser (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Letan, Amelie; Mishchik, Konstantin; Audouard, Eric; Hoenninger, Clemens; Mottay, Eric P.

    2017-03-01

    With the development of high average power, high repetition rate, industrial ultrafast lasers, it is now possible to achieve a high throughput with femtosecond laser processing, provided that the operating parameters are finely tuned to the application. Femtosecond lasers play a key role in these processes, due to their ability to perform high-quality microprocessing. They are able to drill high thickness holes (up to 1 mm) with arbitrary shapes, such as zero-conicity or even inverse taper, but can also perform zero-taper cutting. A clear understanding of all the processing steps necessary to optimize the processing speed is a main challenge for industrial developments. Indeed, the laser parameters are not independent of the beam steering devices. Pulse energy and repetition rate have to be precisely adjusted to the beam angle with the sample, and to the temporal and spatial sequences of pulse superposition. The purpose of the present work is to identify the role of these parameters for high aspect ratio drilling and cutting not only with experimental trials, but also with numerical estimations, using a simple engineering model based on the two-temperature description of ultra-fast ablation. Assuming a nonlinear logarithmic response of the materials to ultrafast pulses, each material can be described by only two adjustable parameters. Simple assumptions make it possible to predict the effect of beam velocity and non-normal incidence, and thus to estimate profile shapes and processing time.
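
    The "logarithmic response with two adjustable parameters" mentioned above can be illustrated with a small sketch. It assumes, as is common in the ultrafast ablation literature but not stated in the abstract, that the two parameters are a threshold fluence and an effective penetration depth, and that a tilted beam spreads its energy over a larger surface area by a factor of 1/cos(angle). All function and parameter names are hypothetical.

```python
import math


def ablation_depth_per_pulse(fluence, threshold_fluence, penetration_depth,
                             incidence_angle_deg=0.0):
    """Depth removed by one pulse under an assumed logarithmic ablation law.

    depth = penetration_depth * ln(F_surface / threshold_fluence) above the
    threshold, and 0 below it. F_surface is the beam fluence projected onto
    the surface, F * cos(angle), which is a geometric assumption.
    """
    surface_fluence = fluence * math.cos(math.radians(incidence_angle_deg))
    if surface_fluence <= threshold_fluence:
        return 0.0  # below threshold: no material removed in this model
    return penetration_depth * math.log(surface_fluence / threshold_fluence)


def pulses_to_drill(thickness, fluence, threshold_fluence, penetration_depth,
                    incidence_angle_deg=0.0):
    """Number of pulses needed to remove `thickness`, or None if no ablation."""
    depth = ablation_depth_per_pulse(fluence, threshold_fluence,
                                     penetration_depth, incidence_angle_deg)
    if depth == 0.0:
        return None
    return math.ceil(thickness / depth)
```

    Thickness and penetration depth only need to share a unit, as do fluence and threshold fluence. The model ignores heat accumulation and pulse overlap, both of which the abstract treats as relevant.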

  11. Minimap2: pairwise alignment for nucleotide sequences.

    PubMed

    Li, Heng

    2018-05-10

    Recent advances in sequencing technologies promise ultra-long reads of ∼100 kilobases (kb) on average, full-length mRNA or cDNA reads in high throughput and genomic contigs over 100 megabases (Mb) in length. Existing alignment programs are unable to process such data at scale, or do so inefficiently, which calls for the development of new alignment algorithms. Minimap2 is a general-purpose alignment program to map DNA or long mRNA sequences against a large reference database. It works with accurate short reads of ≥100 bp in length, ≥1 kb genomic reads at error rate ∼15%, full-length noisy Direct RNA or cDNA reads, and assembly contigs or closely related full chromosomes of hundreds of megabases in length. Minimap2 does split-read alignment, employs concave gap cost for long insertions and deletions (INDELs) and introduces new heuristics to reduce spurious alignments. It is 3-4 times as fast as mainstream short-read mappers at comparable accuracy, and is ≥30 times faster than long-read genomic or cDNA mappers at higher accuracy, surpassing most aligners specialized in one type of alignment. Availability: https://github.com/lh3/minimap2. Contact: hengli@broadinstitute.org.
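
    A concave gap cost makes each additional gap base cheaper once a gap is already long, so long INDELs are not over-penalized. One common way to realize this is a two-piece affine function, the minimum of two affine costs. The sketch below assumes that form; the function name is hypothetical and the numeric penalties are illustrative placeholders, not values taken from the abstract.

```python
def two_piece_gap_cost(length, open1=4, ext1=2, open2=24, ext2=1):
    """Cost of a gap of `length` bases under a two-piece affine model.

    The first piece (cheap to open, expensive to extend) wins for short
    gaps; the second (expensive to open, cheap to extend) wins for long
    gaps. The minimum of two affine functions is concave in `length`.
    """
    if length <= 0:
        return 0
    return min(open1 + length * ext1, open2 + length * ext2)


if __name__ == "__main__":
    for k in (1, 10, 20, 100):
        print(k, two_piece_gap_cost(k))
```

    With these placeholder values the two pieces cross at a gap length of 20, after which each extra base costs 1 instead of 2.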

  12. Using Poisson mixed-effects model to quantify transcript-level gene expression in RNA-Seq.

    PubMed

    Hu, Ming; Zhu, Yu; Taylor, Jeremy M G; Liu, Jun S; Qin, Zhaohui S

    2012-01-01

    RNA sequencing (RNA-Seq) is a powerful new technology for mapping and quantifying transcriptomes using ultra high-throughput next-generation sequencing technologies. Using deep sequencing, gene expression levels of all transcripts including novel ones can be quantified digitally. Although extremely promising, the massive amounts of data generated by RNA-Seq, substantial biases and uncertainty in short read alignment pose challenges for data analysis. In particular, large base-specific variation and between-base dependence make simple approaches, such as those that use averaging to normalize RNA-Seq data and quantify gene expressions, ineffective. In this study, we propose a Poisson mixed-effects (POME) model to characterize base-level read coverage within each transcript. The underlying expression level is included as a key parameter in this model. Since the proposed model is capable of incorporating base-specific variation as well as between-base dependence that affect read coverage profile throughout the transcript, it can lead to improved quantification of the true underlying expression level. POME can be freely downloaded at http://www.stat.purdue.edu/~yuzhu/pome.html. yuzhu@purdue.edu; zhaohui.qin@emory.edu Supplementary data are available at Bioinformatics online.

  13. Ultra Deep Sequencing of Listeria monocytogenes sRNA Transcriptome Revealed New Antisense RNAs

    PubMed Central

    Behrens, Sebastian; Widder, Stefanie; Mannala, Gopala Krishna; Qing, Xiaoxing; Madhugiri, Ramakanth; Kefer, Nathalie; Mraheil, Mobarak Abu; Rattei, Thomas; Hain, Torsten

    2014-01-01

    Listeria monocytogenes, a gram-positive pathogen, and causative agent of listeriosis, has become a widely used model organism for intracellular infections. Recent studies have identified small non-coding RNAs (sRNAs) as important factors for regulating gene expression and pathogenicity of L. monocytogenes. Increased speed and reduced costs of high throughput sequencing (HTS) techniques have made RNA sequencing (RNA-Seq) the state-of-the-art method to study bacterial transcriptomes. We created a large transcriptome dataset of L. monocytogenes containing a total of 21 million reads, using the SOLiD sequencing technology. The dataset contained cDNA sequences generated from L. monocytogenes RNA collected under intracellular and extracellular conditions and was additionally size-fractionated into three size ranges: <40 nt, 40–150 nt and >150 nt. We report here the identification of nine new sRNA candidates of L. monocytogenes and a reevaluation of known sRNAs of L. monocytogenes EGD-e. Automatic comparison to known sRNAs revealed a high recovery rate of 55%, which was increased to 90% by manual revision of the data. Moreover, thorough classification of known sRNAs shed further light on their possible biological functions. Interestingly, among the newly identified sRNA candidates are antisense RNAs (asRNAs) associated with the housekeeping genes purA, fumC and pgi and potentially their regulation, emphasizing the significance of sRNAs for metabolic adaptation in L. monocytogenes. PMID:24498259

  14. Combining high-throughput sequencing with fruit body surveys reveals contrasting life-history strategies in fungi

    PubMed Central

    Ovaskainen, Otso; Schigel, Dmitry; Ali-Kovero, Heini; Auvinen, Petri; Paulin, Lars; Nordén, Björn; Nordén, Jenni

    2013-01-01

    Before the recent revolution in molecular biology, field studies on fungal communities were mostly confined to fruit bodies, whereas mycelial interactions were studied in the laboratory. Here we combine high-throughput sequencing with a fruit body inventory to study simultaneously mycelial and fruit body occurrences in a community of fungi inhabiting dead wood of Norway spruce. We studied mycelial occurrence by extracting DNA from wood samples followed by 454-sequencing of the ITS1 and ITS2 regions and an automated procedure for species identification. In total, we detected 198 species as mycelia and 137 species as fruit bodies. The correlation between mycelial and fruit body occurrences was high for the majority of the species, suggesting that high-throughput sequencing can successfully characterize the dominating fungal communities, despite possible biases related to sampling, PCR, sequencing and molecular identification. We used the fruit body and molecular data to test hypothesized links between life history and population dynamic parameters. We show that the species that have on average a high mycelial abundance also have a high fruiting rate and produce large fruit bodies, leading to a positive feedback loop in their population dynamics. Earlier studies have shown that species with specialized resource requirements are rarely seen fruiting, for which reason they are often classified as red-listed. We show with the help of high-throughput sequencing that some of these species are more abundant as mycelium in wood than what could be expected from their occurrence as fruit bodies. PMID:23575372

  15. A 48Cycles/MB H.264/AVC Deblocking Filter Architecture for Ultra High Definition Applications

    NASA Astrophysics Data System (ADS)

    Zhou, Dajiang; Zhou, Jinjia; Zhu, Jiayi; Goto, Satoshi

    In this paper, a highly parallel deblocking filter architecture for H.264/AVC is proposed to process one macroblock in 48 clock cycles and give real-time support to QFHD@60fps sequences at less than 100MHz. 4 edge filters organized in 2 groups for simultaneously processing vertical and horizontal edges are applied in this architecture to enhance its throughput. While parallelism increases, pipeline hazards arise owing to the latency of edge filters and data dependency of deblocking algorithm. To solve this problem, a zig-zag processing schedule is proposed to eliminate the pipeline bubbles. Data path of the architecture is then derived according to the processing schedule and optimized through data flow merging, so as to minimize the cost of logic and internal buffer. Meanwhile, the architecture's data input rate is designed to be identical to its throughput, while the transmission order of input data can also match the zig-zag processing schedule. Therefore no intercommunication buffer is required between the deblocking filter and its previous component for speed matching or data reordering. As a result, only one 24×64 two-port SRAM as internal buffer is required in this design. When synthesized with SMIC 130nm process, the architecture costs a gate count of 30.2k, which is competitive considering its high performance.
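
    The claim that 48 cycles per macroblock supports QFHD@60fps below 100 MHz can be checked with simple arithmetic. The sketch assumes QFHD means 3840×2160 pixels and that macroblocks are 16×16, as in H.264/AVC; the function name is hypothetical.

```python
def required_clock_hz(width, height, fps, cycles_per_macroblock, mb_size=16):
    """Minimum clock rate needed to filter every macroblock in real time."""
    mbs_per_row = -(-width // mb_size)      # ceiling division
    mbs_per_column = -(-height // mb_size)  # ceiling division
    macroblocks_per_second = mbs_per_row * mbs_per_column * fps
    return macroblocks_per_second * cycles_per_macroblock


if __name__ == "__main__":
    hz = required_clock_hz(3840, 2160, 60, 48)
    print(f"{hz / 1e6:.3f} MHz")  # 93.312 MHz
```

    Under these assumptions a frame holds 240 × 135 = 32,400 macroblocks, giving 1,944,000 macroblocks per second and a required clock of about 93.3 MHz, which is consistent with the stated "less than 100MHz".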

  16. Sources of PCR-induced distortions in high-throughput sequencing data sets

    PubMed Central

    Kebschull, Justus M.; Zador, Anthony M.

    2015-01-01

    PCR permits the exponential and sequence-specific amplification of DNA, even from minute starting quantities. PCR is a fundamental step in preparing DNA samples for high-throughput sequencing. However, there are errors associated with PCR-mediated amplification. Here we examine the effects of four important sources of error—bias, stochasticity, template switches and polymerase errors—on sequence representation in low-input next-generation sequencing libraries. We designed a pool of diverse PCR amplicons with a defined structure, and then used Illumina sequencing to search for signatures of each process. We further developed quantitative models for each process, and compared predictions of these models to our experimental data. We find that PCR stochasticity is the major force skewing sequence representation after amplification of a pool of unique DNA amplicons. Polymerase errors become very common in later cycles of PCR but have little impact on the overall sequence distribution as they are confined to small copy numbers. PCR template switches are rare and confined to low copy numbers. Our results provide a theoretical basis for removing distortions from high-throughput sequencing data. In addition, our findings on PCR stochasticity will have particular relevance to quantification of results from single cell sequencing, in which sequences are represented by only one or a few molecules. PMID:26187991
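
    To make the notion of PCR stochasticity concrete, the following sketch simulates amplification as a simple branching process: in every cycle each molecule is copied with a fixed probability. This is a textbook model used here as an assumption, not the quantitative model developed by the authors, and all names and parameter values are hypothetical.

```python
import random


def amplify(initial_copies, cycles, efficiency, rng):
    """Amplify one template; each molecule is copied with prob. `efficiency`."""
    copies = initial_copies
    for _ in range(cycles):
        # Count how many of the current molecules are duplicated this cycle.
        copies += sum(1 for _ in range(copies) if rng.random() < efficiency)
    return copies


def simulate_pool(n_templates, cycles, efficiency, seed=0):
    """Amplify `n_templates` unique single-molecule templates independently."""
    rng = random.Random(seed)
    return [amplify(1, cycles, efficiency, rng) for _ in range(n_templates)]


if __name__ == "__main__":
    counts = simulate_pool(n_templates=200, cycles=8, efficiency=0.5, seed=1)
    print("min:", min(counts), "max:", max(counts))
```

    Even though every template starts from one molecule and sees the same efficiency, the final copy numbers spread widely, because chance events in the first few cycles are amplified by all later cycles. This matches the abstract's point that stochasticity matters most when sequences are represented by only one or a few molecules.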

  17. Optimisation Issues of High Throughput Medical Data and Video Streaming Traffic in 3G Wireless Environments.

    PubMed

    Istepanian, R S H; Philip, N

    2005-01-01

    In this paper we describe some of the optimisation issues relevant to the requirements of high throughput of medical data and video streaming traffic in 3G wireless environments. In particular we present a challenging 3G mobile health care application that requires a demanding 3G medical data throughput. We also describe the 3G QoS requirement of the mObile Tele-Echography ultra-Light rObot system (OTELO), which is designed to provide seamless 3G connectivity for real-time ultrasound medical video streams and diagnosis from a remote site (robotic and patient station) manipulated by an expert side (specialists) that controls the robotic scanning operation and presents real-time feedback diagnosis using 3G wireless communication links.

  18. Research progress of plant population genomics based on high-throughput sequencing.

    PubMed

    Wang, Yun-sheng

    2016-08-01

    Population genomics, a new paradigm for population genetics, combines the concepts and techniques of genomics with the theoretical system of population genetics and improves our understanding of microevolution through identification of site-specific effects and genome-wide effects using genome-wide polymorphic site genotyping. With the appearance and improvement of next-generation high-throughput sequencing technology, the number of plant species with complete genome sequences has increased rapidly and large-scale resequencing has also been carried out in recent years. Parallel sequencing has also been done in some plant species without complete genome sequences. These studies have greatly promoted the development of population genomics and deepened our understanding of the genetic diversity, level of linkage disequilibrium, selection effect, demographic history and molecular mechanism of complex traits of relevant plant populations at a genomic level. In this review, I briefly introduce the concept and research methods of population genomics and summarize the research progress of plant population genomics based on high-throughput sequencing. I also discuss the prospects as well as existing problems of plant population genomics in order to provide references for related studies.

  19. Novel method for high-throughput colony PCR screening in nanoliter-reactors

    PubMed Central

    Walser, Marcel; Pellaux, Rene; Meyer, Andreas; Bechtold, Matthias; Vanderschuren, Herve; Reinhardt, Richard; Magyar, Joseph; Panke, Sven; Held, Martin

    2009-01-01

    We introduce a technology for the rapid identification and sequencing of conserved DNA elements employing a novel suspension array based on nanoliter (nl)-reactors made from alginate. The reactors have a volume of 35 nl and serve as reaction compartments during monoseptic growth of microbial library clones, colony lysis, thermocycling and screening for sequence motifs via semi-quantitative fluorescence analyses. nl-Reactors were kept in suspension during all high-throughput steps which allowed performing the protocol in a highly space-effective fashion and at negligible expenses of consumables and reagents. As a first application, 11 high-quality microsatellites for polymorphism studies in cassava were isolated and sequenced out of a library of 20 000 clones in 2 days. The technology is widely scalable and we envision that throughputs for nl-reactor based screenings can be increased up to 100 000 and more samples per day thereby efficiently complementing protocols based on established deep-sequencing technologies. PMID:19282448

  20. Mobile element biology – new possibilities with high-throughput sequencing

    PubMed Central

    Xing, Jinchuan; Witherspoon, David J.; Jorde, Lynn B.

    2014-01-01

    Mobile elements compose more than half of the human genome, but until recently their large-scale detection was time-consuming and challenging. With the development of new high-throughput sequencing technologies, the complete spectrum of mobile element variation in humans can now be identified and analyzed. Thousands of new mobile element insertions have been discovered, yielding new insights into mobile element biology, evolution, and genomic variation. We review several high-throughput methods, with an emphasis on techniques that specifically target mobile element insertions in humans, and we highlight recent applications of these methods in evolutionary studies and in the analysis of somatic alterations in human cancers. PMID:23312846

  1. Laser cutting of ultra-thin glasses based on a nonlinear laser interaction effect

    NASA Astrophysics Data System (ADS)

    Chen, Jian; Wu, Zhouling

    2013-07-01

    Glass panel substrates have been widely used in consumer electronics such as in flat panel TVs, laptops, and cell phones. With the advancement in the industry, the glass substrates are becoming thinner and stronger for reduced weight and volume, which brings great challenges for traditional mechanical processes in terms of cut quality, yield, and throughput. Laser glass cutting provides a non-contact process with minimum impact and superior quality compared to the mechanical counterparts. In this paper, we present recent progress in advanced laser processing of ultra-thin glass substrates, especially laser cutting of ultra-thin glasses with a high-power laser through a nonlinear interaction effect. Our results indicate that this technique has great potential for mass production of ultra-thin glass substrates.

  2. Mapping of disease-associated variants in admixed populations

    PubMed Central

    2011-01-01

    Recent developments in high-throughput genotyping and whole-genome sequencing will enhance the identification of disease loci in admixed populations. We discuss how a more refined estimation of ancestry benefits both admixture mapping and association mapping, making disease loci identification in admixed populations more powerful. High-throughput genotyping and sequencing will enable refined estimation of ancestry, thus enhancing disease loci identification in admixed populations. PMID:21635713

  3. Ultra-High-Throughput Screening of Natural Product Extracts to Identify Proapoptotic Inhibitors of Bcl-2 Family Proteins.

    PubMed

    Hassig, Christian A; Zeng, Fu-Yue; Kung, Paul; Kiankarimi, Mehrak; Kim, Sylvia; Diaz, Paul W; Zhai, Dayong; Welsh, Kate; Morshedian, Shana; Su, Ying; O'Keefe, Barry; Newman, David J; Rusman, Yudi; Kaur, Harneet; Salomon, Christine E; Brown, Susan G; Baire, Beeraiah; Michel, Andrew R; Hoye, Thomas R; Francis, Subhashree; Georg, Gunda I; Walters, Michael A; Divlianska, Daniela B; Roth, Gregory P; Wright, Amy E; Reed, John C

    2014-09-01

    Antiapoptotic Bcl-2 family proteins are validated cancer targets composed of six related proteins. From a drug discovery perspective, these are challenging targets that exert their cellular functions through protein-protein interactions (PPIs). Although several isoform-selective inhibitors have been developed using structure-based design or high-throughput screening (HTS) of synthetic chemical libraries, no large-scale screen of natural product collections has been reported. A competitive displacement fluorescence polarization (FP) screen of nearly 150,000 natural product extracts was conducted against all six antiapoptotic Bcl-2 family proteins using fluorochrome-conjugated peptide ligands that mimic functionally relevant PPIs. The screens were conducted in 1536-well format and displayed satisfactory overall HTS statistics, with Z'-factor values ranging from 0.72 to 0.83 and a hit confirmation rate between 16% and 64%. Confirmed active extracts were orthogonally tested in a luminescent assay for caspase-3/7 activation in tumor cells. Active extracts were resupplied, and effort toward the isolation of pure active components was initiated through iterative bioassay-guided fractionation. Several previously described altertoxins were isolated from a microbial source, and the pure compounds demonstrate activity in both Bcl-2 FP and caspase cellular assays. The studies demonstrate the feasibility of ultra-high-throughput screening using natural product sources and highlight some of the challenges associated with this approach. © 2014 Society for Laboratory Automation and Screening.
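
    The Z'-factor quoted above is a standard measure of assay quality, defined as 1 − 3(σ_pos + σ_neg) / |μ_pos − μ_neg| over positive and negative control wells. The sketch below computes it with the Python standard library; the control readings are made-up numbers for illustration only, not data from the study.

```python
import statistics


def z_prime(positive_controls, negative_controls):
    """Z'-factor from control readings; values above 0.5 are usually taken
    to indicate a robust screening assay."""
    mean_pos = statistics.mean(positive_controls)
    mean_neg = statistics.mean(negative_controls)
    sd_pos = statistics.stdev(positive_controls)  # sample standard deviation
    sd_neg = statistics.stdev(negative_controls)
    separation = abs(mean_pos - mean_neg)
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    return 1 - 3 * (sd_pos + sd_neg) / separation


if __name__ == "__main__":
    # Hypothetical plate-reader values for the two control populations.
    print(round(z_prime([100, 102, 98], [10, 12, 8]), 3))  # 0.867
```

    The reported range of 0.72 to 0.83 therefore means the control populations were separated by many standard deviations in every screen.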

  4. Ultra High Throughput Screening of Natural Product Extracts to Identify Pro-apoptotic Inhibitors of Bcl-2 Family Proteins

    PubMed Central

    Hassig, Christian A.; Zeng, Fu-Yue; Kung, Paul; Kiankarimi, Mehrak; Kim, Sylvia; Diaz, Paul W.; Zhai, Dayong; Welsh, Kate; Morshedian, Shana; Su, Ying; O'Keefe, Barry; Newman, David J.; Rusman, Yudi; Kaur, Harneet; Salomon, Christine E.; Brown, Susan G.; Baire, Beeraiah; Michel, Andrew R.; Hoye, Thomas R.; Francis, Subhashree; Georg, Gunda I.; Walters, Michael A.; Divlianska, Daniela B.; Roth, Gregory P.; Wright, Amy E.; Reed, John C.

    2015-01-01

    Anti-apoptotic Bcl-2 family proteins are validated cancer targets comprised of six related proteins. From a drug discovery perspective, these are challenging targets that exert their cellular functions through protein-protein interactions (PPIs). While several isoform-selective inhibitors have been developed using structure-based design or high throughput screening (HTS) of synthetic chemical libraries, no large scale screen of natural product collections has been reported. A competitive displacement fluorescence polarization (FP) screen of nearly 150,000 natural product extracts was conducted against all six anti-apoptotic Bcl-2 family proteins using fluorochrome-conjugated peptide ligands that mimic functionally-relevant PPIs. The screens were conducted in 1,536-well format and displayed satisfactory overall HTS statistics, with Z’-factor values ranging from 0.72 to 0.83, and a hit confirmation rate between 16-64%. Confirmed active extracts were orthogonally tested in a luminescent assay for caspase-3/7 activation in tumor cells. Active extracts were resupplied and effort toward the isolation of pure active components was initiated through iterative bioassay-guided fractionation. Several previously described altertoxins were isolated from a microbial source and the pure compounds demonstrate activity in both Bcl-2 FP and caspase cellular assays. The studies demonstrate the feasibility of ultra high throughput screening using natural product sources and highlight some of the challenges associated with this approach. PMID:24870016

  5. Ultra-high-speed variable focus optics for novel applications in advanced imaging

    NASA Astrophysics Data System (ADS)

    Kang, S.; Dotsenko, E.; Amrhein, D.; Theriault, C.; Arnold, C. B.

    2018-02-01

    With the advancement of ultra-fast manufacturing technologies, high speed imaging with high 3D resolution has become increasingly important. Here we show the use of an ultra-high-speed variable focus optical element, the TAG Lens, to enable new ways to acquire 3D information from an object. The TAG Lens uses sound to adjust the index of refraction profile in a liquid and thereby can achieve focal scanning rates greater than 100 kHz. When combined with a high-speed pulsed LED and a high-speed camera, we can exploit this phenomenon to achieve high-resolution imaging through large depths. By combining the image acquisition with digital image processing, we can extract relevant parameters such as tilt and angle information from objects in the image. Due to the high speeds at which images can be collected and processed, we believe this technique can be used as an efficient method of industrial inspection and metrology for high throughput applications.

  6. HTP-OligoDesigner: An Online Primer Design Tool for High-Throughput Gene Cloning and Site-Directed Mutagenesis.

    PubMed

    Camilo, Cesar M; Lima, Gustavo M A; Maluf, Fernando V; Guido, Rafael V C; Polikarpov, Igor

    2016-01-01

    Following burgeoning genomic and transcriptomic sequencing data, biochemical and molecular biology groups worldwide are implementing high-throughput cloning and mutagenesis facilities in order to obtain a large number of soluble proteins for structural and functional characterization. Since manual primer design can be a time-consuming and error-generating step, particularly when working with hundreds of targets, automation of the primer design process becomes highly desirable. HTP-OligoDesigner was created to provide the scientific community with a simple and intuitive online primer design tool for both laboratory-scale and high-throughput projects of sequence-independent gene cloning and site-directed mutagenesis, along with a Tm calculator for quick queries.
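
    The abstract does not say how the tool's Tm calculator works. As a rough illustration of what a quick Tm query involves, the sketch below uses the Wallace rule (2 °C per A/T and 4 °C per G/C), a rule of thumb for short oligonucleotides. The function name and the choice of rule are assumptions and do not describe HTP-OligoDesigner itself.

```python
def wallace_tm(primer):
    """Estimate melting temperature (deg C) with the Wallace rule.

    Rule of thumb for short oligos: Tm = 2*(A+T) + 4*(G+C).
    """
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 16-nt primer with 8 A/T and 8 G/C bases.
print(wallace_tm("ATGCATGCATGCATGC"))
```

    For longer primers, nearest-neighbor thermodynamic models generally give better estimates than this rule.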

  7. YAMAT-seq: an efficient method for high-throughput sequencing of mature transfer RNAs

    PubMed Central

    Shigematsu, Megumi; Honda, Shozo; Loher, Phillipe; Telonis, Aristeidis G.; Rigoutsos, Isidore

    2017-01-01

    Besides translation, transfer RNAs (tRNAs) play many non-canonical roles in various biological pathways and exhibit highly variable expression profiles. To unravel the emerging complexities of tRNA biology and molecular mechanisms underlying them, an efficient tRNA sequencing method is required. However, the rigid structure of tRNA has presented a challenge to the development of such methods. We report the development of Y-shaped Adapter-ligated MAture TRNA sequencing (YAMAT-seq), an efficient and convenient method for high-throughput sequencing of mature tRNAs. YAMAT-seq circumvents the issue of inefficient adapter ligation, a characteristic of conventional RNA sequencing methods for mature tRNAs, by employing the efficient and specific ligation of a Y-shaped adapter to mature tRNAs using T4 RNA Ligase 2. Subsequent cDNA amplification and next-generation sequencing successfully yield numerous mature tRNA sequences. YAMAT-seq has high specificity for mature tRNAs and high sensitivity to detect most isoacceptors from minute amounts of total RNA. Moreover, YAMAT-seq shows quantitative capability to estimate expression levels of mature tRNAs, and has high reproducibility and broad applicability for various cell lines. YAMAT-seq thus provides a high-throughput technique for identifying tRNA profiles and their regulations in various transcriptomes, which could play important regulatory roles in translation and other biological processes. PMID:28108659

  8. Dual-mode ultraflow access networks: a hybrid solution for the access bottleneck

    NASA Astrophysics Data System (ADS)

    Kazovsky, Leonid G.; Shen, Thomas Shunrong; Dhaini, Ahmad R.; Yin, Shuang; De Leenheer, Marc; Detwiler, Benjamin A.

    2013-12-01

    Optical Flow Switching (OFS) is a promising solution for large Internet data transfers. In this paper, we introduce UltraFlow Access, a novel optical access network architecture that offers dual-mode service to its end-users: IP and OFS. With UltraFlow Access, we design and implement a new dual-mode control plane and a new dual-mode network stack to ensure efficient connection setup and reliable and optimal data transmission. We study the impact of the UltraFlow system's design on the network throughput. Our experimental results show that with an optimized system design, near optimal (around 10 Gb/s) OFS data throughput can be attained when the line rate is 10 Gb/s.

  9. A Protocol for Functional Assessment of Whole-Protein Saturation Mutagenesis Libraries Utilizing High-Throughput Sequencing.

    PubMed

    Stiffler, Michael A; Subramanian, Subu K; Salinas, Victor H; Ranganathan, Rama

    2016-07-03

    Site-directed mutagenesis has long been used as a method to interrogate protein structure, function and evolution. Recent advances in massively-parallel sequencing technology have opened up the possibility of assessing the functional or fitness effects of large numbers of mutations simultaneously. Here, we present a protocol for experimentally determining the effects of all possible single amino acid mutations in a protein of interest utilizing high-throughput sequencing technology, using the 263 amino acid antibiotic resistance enzyme TEM-1 β-lactamase as an example. In this approach, a whole-protein saturation mutagenesis library is constructed by site-directed mutagenic PCR, randomizing each position individually to all possible amino acids. The library is then transformed into bacteria, and selected for the ability to confer resistance to β-lactam antibiotics. The fitness effect of each mutation is then determined by deep sequencing of the library before and after selection. Importantly, this protocol introduces methods which maximize sequencing read depth and permit the simultaneous selection of the entire mutation library, by mixing adjacent positions into groups of length accommodated by high-throughput sequencing read length and utilizing orthogonal primers to barcode each group. Representative results using this protocol are provided by assessing the fitness effects of all single amino acid mutations in TEM-1 at a clinically relevant dosage of ampicillin. The method should be easily extendable to other proteins for which a high-throughput selection assay is in place.
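
    The abstract states that fitness effects are determined from deep sequencing before and after selection but does not give a scoring formula. One common way to express this is the log2 enrichment of a mutant relative to wild type, sketched below. The function name, the pseudocount, and the read counts are assumptions for illustration and are not taken from the protocol.

```python
import math


def relative_fitness(mut_before, mut_after, wt_before, wt_after, pseudocount=0.5):
    """Log2 enrichment of a mutant relative to wild type across selection.

    A pseudocount keeps the ratio finite when a variant has zero reads.
    """
    mut_ratio = (mut_after + pseudocount) / (mut_before + pseudocount)
    wt_ratio = (wt_after + pseudocount) / (wt_before + pseudocount)
    return math.log2(mut_ratio / wt_ratio)


# Hypothetical read counts: the mutant halves in frequency while wild type holds steady.
print(relative_fitness(100, 50, 1000, 1000, pseudocount=0))
```

    A score of 0 means the mutant behaves like wild type under selection, and negative scores indicate depletion.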

  10. eRNA: a graphic user interface-based tool optimized for large data analysis from high-throughput RNA sequencing

    PubMed Central

    2014-01-01

    Background: RNA sequencing (RNA-seq) is emerging as a critical approach in biological research. However, its high-throughput advantage is significantly limited by the capacity of bioinformatics tools. The research community urgently needs user-friendly tools to efficiently analyze the complicated data generated by high throughput sequencers. Results: We developed a standalone tool with graphic user interface (GUI)-based analytic modules, known as eRNA. The capacity of performing parallel processing and sample management facilitates large data analyses by maximizing hardware usage and freeing users from tediously handling sequencing data. The module “miRNA identification” includes GUIs for raw data reading, adapter removal, sequence alignment, and read counting. The module “mRNA identification” includes GUIs for reference sequences, genome mapping, transcript assembling, and differential expression. The module “Target screening” provides expression profiling analyses and graphic visualization. The module “Self-testing” offers the directory setups, sample management, and a check for third-party package dependency. Integration of other GUIs including Bowtie, miRDeep2, and miRspring extends the program’s functionality. Conclusions: eRNA focuses on the common tools required for the mapping and quantification analysis of miRNA-seq and mRNA-seq data. The software package provides an additional choice for scientists who require a user-friendly computing environment and high-throughput capacity for large data analysis. eRNA is available for free download at https://sourceforge.net/projects/erna/?source=directory. PMID:24593312

  11. eRNA: a graphic user interface-based tool optimized for large data analysis from high-throughput RNA sequencing.

    PubMed

    Yuan, Tiezheng; Huang, Xiaoyi; Dittmar, Rachel L; Du, Meijun; Kohli, Manish; Boardman, Lisa; Thibodeau, Stephen N; Wang, Liang

    2014-03-05

    RNA sequencing (RNA-seq) is emerging as a critical approach in biological research. However, its high-throughput advantage is significantly limited by the capacity of bioinformatics tools. The research community urgently needs user-friendly tools to efficiently analyze the complicated data generated by high throughput sequencers. We developed a standalone tool with graphic user interface (GUI)-based analytic modules, known as eRNA. The capacity of performing parallel processing and sample management facilitates large data analyses by maximizing hardware usage and freeing users from tediously handling sequencing data. The module "miRNA identification" includes GUIs for raw data reading, adapter removal, sequence alignment, and read counting. The module "mRNA identification" includes GUIs for reference sequences, genome mapping, transcript assembling, and differential expression. The module "Target screening" provides expression profiling analyses and graphic visualization. The module "Self-testing" offers the directory setups, sample management, and a check for third-party package dependency. Integration of other GUIs including Bowtie, miRDeep2, and miRspring extends the program's functionality. eRNA focuses on the common tools required for the mapping and quantification analysis of miRNA-seq and mRNA-seq data. The software package provides an additional choice for scientists who require a user-friendly computing environment and high-throughput capacity for large data analysis. eRNA is available for free download at https://sourceforge.net/projects/erna/?source=directory.

  12. A new fungal large subunit ribosomal RNA primer for high throughput sequencing surveys

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mueller, Rebecca C.; Gallegos-Graves, La Verne; Kuske, Cheryl R.

    The inclusion of phylogenetic metrics in community ecology has provided insights into important ecological processes, particularly when combined with high-throughput sequencing methods; however, these approaches have not been widely used in studies of fungal communities relative to other microbial groups. Two obstacles have been considered: (1) the internal transcribed spacer (ITS) region has limited utility for constructing phylogenies and (2) most PCR primers that target the large subunit (LSU) ribosomal unit generate amplicons that exceed current limits of high-throughput sequencing platforms. We designed and tested a PCR primer (LR22R) to target an approximately 300–400 bp region of the D2 hypervariable region of the fungal LSU for use with the Illumina MiSeq platform. Both in silico and empirical analyses showed that the LR22R–LR3 pair captured a broad range of fungal taxonomic groups with a small fraction of non-fungal groups. Phylogenetic placement of publicly available LSU D2 sequences showed broad agreement with taxonomic classification. Comparisons of the LSU D2 and the ITS2 ribosomal regions from environmental samples and known communities showed similar discriminatory abilities of the two primer sets. Altogether, these findings show that the LR22R–LR3 primer pair has utility for phylogenetic analyses of fungal communities using high-throughput sequencing methods.

  13. A new fungal large subunit ribosomal RNA primer for high throughput sequencing surveys

    DOE PAGES

    Mueller, Rebecca C.; Gallegos-Graves, La Verne; Kuske, Cheryl R.

    2015-12-09

    The inclusion of phylogenetic metrics in community ecology has provided insights into important ecological processes, particularly when combined with high-throughput sequencing methods; however, these approaches have not been widely used in studies of fungal communities relative to other microbial groups. Two obstacles have been considered: (1) the internal transcribed spacer (ITS) region has limited utility for constructing phylogenies and (2) most PCR primers that target the large subunit (LSU) ribosomal unit generate amplicons that exceed current limits of high-throughput sequencing platforms. We designed and tested a PCR primer (LR22R) to target an approximately 300–400 bp region of the D2 hypervariable region of the fungal LSU for use with the Illumina MiSeq platform. Both in silico and empirical analyses showed that the LR22R–LR3 pair captured a broad range of fungal taxonomic groups with a small fraction of non-fungal groups. Phylogenetic placement of publicly available LSU D2 sequences showed broad agreement with taxonomic classification. Comparisons of the LSU D2 and the ITS2 ribosomal regions from environmental samples and known communities showed similar discriminatory abilities of the two primer sets. Altogether, these findings show that the LR22R–LR3 primer pair has utility for phylogenetic analyses of fungal communities using high-throughput sequencing methods.

  14. High-Throughput Sequencing for Detection of Subpopulations of Bacteria Not Previously Associated with Artisanal Cheeses

    PubMed Central

    Quigley, Lisa; O'Sullivan, Orla; Beresford, Tom P.; Ross, R. Paul; Fitzgerald, Gerald F.

    2012-01-01

    Here, high-throughput sequencing was employed to reveal the highly diverse bacterial populations present in 62 Irish artisanal cheeses and, in some cases, associated cheese rinds. Using this approach, we revealed the presence of several genera not previously associated with cheese, including Faecalibacterium, Prevotella, and Helcococcus and, for the first time, detected the presence of Arthrobacter and Brachybacterium in goats' milk cheese. Our analysis confirmed many previously observed patterns, such as the dominance of typical cheese bacteria, the fact that the microbiota of raw and pasteurized milk cheeses differ, and that the level of cheese maturation has a significant influence on Lactobacillus populations. It was also noted that cheeses containing adjunct ingredients had lower proportions of Lactococcus species. It is thus apparent that high-throughput sequencing-based investigations can provide valuable insights into the microbial populations of artisanal foods. PMID:22685131

  15. High-throughput sequencing for detection of subpopulations of bacteria not previously associated with artisanal cheeses.

    PubMed

    Quigley, Lisa; O'Sullivan, Orla; Beresford, Tom P; Ross, R Paul; Fitzgerald, Gerald F; Cotter, Paul D

    2012-08-01

    Here, high-throughput sequencing was employed to reveal the highly diverse bacterial populations present in 62 Irish artisanal cheeses and, in some cases, associated cheese rinds. Using this approach, we revealed the presence of several genera not previously associated with cheese, including Faecalibacterium, Prevotella, and Helcococcus and, for the first time, detected the presence of Arthrobacter and Brachybacterium in goats' milk cheese. Our analysis confirmed many previously observed patterns, such as the dominance of typical cheese bacteria, the fact that the microbiota of raw and pasteurized milk cheeses differ, and that the level of cheese maturation has a significant influence on Lactobacillus populations. It was also noted that cheeses containing adjunct ingredients had lower proportions of Lactococcus species. It is thus apparent that high-throughput sequencing-based investigations can provide valuable insights into the microbial populations of artisanal foods.

  16. Short-read, high-throughput sequencing technology for STR genotyping

    PubMed Central

    Bornman, Daniel M.; Hester, Mark E.; Schuetter, Jared M.; Kasoji, Manjula D.; Minard-Smith, Angela; Barden, Curt A.; Nelson, Scott C.; Godbold, Gene D.; Baker, Christine H.; Yang, Boyu; Walther, Jacquelyn E.; Tornes, Ivan E.; Yan, Pearlly S.; Rodriguez, Benjamin; Bundschuh, Ralf; Dickens, Michael L.; Young, Brian A.; Faith, Seth A.

    2013-01-01

    DNA-based methods for human identification principally rely upon genotyping of short tandem repeat (STR) loci. Electrophoretic-based techniques for variable-length classification of STRs are universally utilized, but are limited in that they have relatively low throughput and do not yield nucleotide sequence information. High-throughput sequencing technology may provide a more powerful instrument for human identification, but is not currently validated for forensic casework. Here, we present a systematic method to perform high-throughput genotyping analysis of the Combined DNA Index System (CODIS) STR loci using short-read (150 bp) massively parallel sequencing technology. Open source reference alignment tools were optimized to evaluate PCR-amplified STR loci using a custom designed STR genome reference. Evaluation of this approach demonstrated that the 13 CODIS STR loci and amelogenin (AMEL) locus could be accurately called from individual and mixture samples. Sensitivity analysis showed that as few as 18,500 reads, aligned to an in silico referenced genome, were required to genotype an individual (>99% confidence) for the CODIS loci. The power of this technology was further demonstrated by identification of variant alleles containing single nucleotide polymorphisms (SNPs) and the development of quantitative measurements (reads) for resolving mixed samples. PMID:25621315
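
    The study genotypes STR loci by aligning reads to a custom STR reference. As a much simpler illustration of what an STR allele call represents, the sketch below counts the repeat units in the longest uninterrupted run of a motif within a read. The function name, the motif, and the read are hypothetical, and this string-matching approach is not the alignment-based method the authors describe.

```python
import re


def longest_tandem_run(read, motif):
    """Return the repeat-unit count of the longest uninterrupted run of motif."""
    if not motif:
        raise ValueError("motif must be non-empty")
    pattern = f"(?:{re.escape(motif.upper())})+"
    runs = re.findall(pattern, read.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical read: four GATA units flanked by two bases on each side.
read = "TT" + "GATA" * 4 + "CC"
print(longest_tandem_run(read, "GATA"))
```

    Real STR alleles can include partial repeats and sequence variants inside the repeat region, which is one reason sequence-level genotyping carries more information than length-based calls.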

  17. A family-based probabilistic method for capturing de novo mutations from high-throughput short-read sequencing data.

    PubMed

    Cartwright, Reed A; Hussin, Julie; Keebler, Jonathan E M; Stone, Eric A; Awadalla, Philip

    2012-01-06

    Recent advances in high-throughput DNA sequencing technologies and associated statistical analyses have enabled in-depth analysis of whole-genome sequences. As this technology is applied to a growing number of individual human genomes, entire families are now being sequenced. Information contained within the pedigree of a sequenced family can be leveraged when inferring the donors' genotypes. The presence of a de novo mutation within the pedigree is indicated by a violation of Mendelian inheritance laws. Here, we present a method for probabilistically inferring genotypes across a pedigree using high-throughput sequencing data and producing the posterior probability of de novo mutation at each genomic site examined. This framework can be used to disentangle the effects of germline and somatic mutational processes and to simultaneously estimate the effect of sequencing error and the initial genetic variation in the population from which the founders of the pedigree arise. This approach is examined in detail through simulations and areas for method improvement are noted. By applying this method to data from members of a well-defined nuclear family with accurate pedigree information, the stage is set to make the most direct estimates of the human mutation rate to date.
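
    The Mendelian-violation signal mentioned above can be illustrated with a deterministic check on a parent-child trio. The sketch below tests whether a child's genotype could arise from one allele of each parent. It ignores sequencing error and therefore is not the probabilistic method the abstract describes. The function name and genotype encoding (two-character strings) are assumptions.

```python
from itertools import product


def is_mendelian_consistent(child, mother, father):
    """True if the child's diploid genotype can be formed from one maternal
    and one paternal allele. Genotypes are two-allele strings such as "AG"."""
    possible = {tuple(sorted(pair)) for pair in product(mother, father)}
    return tuple(sorted(child)) in possible


# A heterozygous child of two homozygous AA parents is a candidate de novo site.
print(is_mendelian_consistent("AG", "AA", "AA"))
```

    In practice an apparent violation may also come from a genotyping error, which is why a posterior probability of de novo mutation is more informative than a yes/no check.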

  18. A confidence interval analysis of sampling effort, sequencing depth, and taxonomic resolution of fungal community ecology in the era of high-throughput sequencing.

    PubMed

    Oono, Ryoko

    2017-01-01

    High-throughput sequencing technology has helped microbial community ecologists explore ecological and evolutionary patterns at unprecedented scales. The benefits of a large sample size still typically outweigh those of greater sequencing depths per sample for accurate estimations of ecological inferences. However, excluding or not sequencing rare taxa may mislead the answers to the questions 'how and why are communities different?' This study evaluates the confidence intervals of ecological inferences from high-throughput sequencing data of foliar fungal endophytes as case studies through a range of sampling efforts, sequencing depths, and taxonomic resolutions to understand how technical and analytical practices may affect our interpretations. Increasing sampling size reliably decreased confidence intervals across multiple community comparisons. However, the effects of sequencing depths on confidence intervals depended on how rare taxa influenced the dissimilarity estimates among communities and did not significantly decrease confidence intervals for all community comparisons. A comparison of simulated communities under random drift suggests that sequencing depths are important in estimating dissimilarities between microbial communities under neutral selective processes. Confidence interval analyses reveal important biases as well as biological trends in microbial community studies that otherwise may be ignored when communities are only compared for statistically significant differences.
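
    To make the idea of a confidence interval on a dissimilarity estimate concrete, the sketch below computes Bray-Curtis dissimilarity between two count vectors and a percentile bootstrap interval for the mean of a set of dissimilarities. The abstract does not name the dissimilarity measure or the interval method used, so both choices, along with the function names and the example values, are assumptions.

```python
import random


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    if len(a) != len(b):
        raise ValueError("samples must cover the same taxa")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of values."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Hypothetical pairwise dissimilarities between replicate samples.
values = [0.42, 0.55, 0.61, 0.48, 0.50, 0.67, 0.39, 0.58]
print(bootstrap_ci(values))
```

    Adding more samples to `values` typically narrows the interval, which mirrors the abstract's observation that larger sampling effort reliably decreased confidence intervals.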

  19. A confidence interval analysis of sampling effort, sequencing depth, and taxonomic resolution of fungal community ecology in the era of high-throughput sequencing

    PubMed Central

    2017-01-01

    High-throughput sequencing technology has helped microbial community ecologists explore ecological and evolutionary patterns at unprecedented scales. The benefits of a large sample size still typically outweigh those of greater sequencing depths per sample for accurate estimations of ecological inferences. However, excluding or not sequencing rare taxa may mislead the answers to the questions ‘how and why are communities different?’ This study evaluates the confidence intervals of ecological inferences from high-throughput sequencing data of foliar fungal endophytes as case studies through a range of sampling efforts, sequencing depths, and taxonomic resolutions to understand how technical and analytical practices may affect our interpretations. Increasing sampling size reliably decreased confidence intervals across multiple community comparisons. However, the effects of sequencing depths on confidence intervals depended on how rare taxa influenced the dissimilarity estimates among communities and did not significantly decrease confidence intervals for all community comparisons. A comparison of simulated communities under random drift suggests that sequencing depths are important in estimating dissimilarities between microbial communities under neutral selective processes. Confidence interval analyses reveal important biases as well as biological trends in microbial community studies that otherwise may be ignored when communities are only compared for statistically significant differences. PMID:29253889

  20. High Diversity of Myocyanophage in Various Aquatic Environments Revealed by High-Throughput Sequencing of Major Capsid Protein Gene With a New Set of Primers.

    PubMed

    Hou, Weiguo; Wang, Shang; Briggs, Brandon R; Li, Gaoyuan; Xie, Wei; Dong, Hailiang

    2018-01-01

    Myocyanophages, a group of viruses infecting cyanobacteria, are abundant and play important roles in elemental cycling. Here we investigated the particle-associated viral communities retained on 0.2 μm filters and in sediment samples (representing ancient cyanophage communities) from four ocean and three lake locations, using high-throughput sequencing and a newly designed primer pair targeting a gene fragment (∼145-bp in length) encoding the cyanophage gp23 major capsid protein (MCP). Diverse viral communities were detected in all samples. The fragments of 142-, 145-, and 148-bp in length were most abundant in the amplicons, and most sequences (>92%) belonged to cyanophages. Additionally, different sequencing depths resulted in different diversity estimates of the viral community. Operational taxonomic units obtained from deep sequencing of the MCP gene covered the majority of those obtained from shallow sequencing, suggesting that deep sequencing provided a more complete picture of the cyanophage community than shallow sequencing. Our results also revealed a wide geographic distribution of marine myocyanophages, i.e., higher dissimilarities of the myocyanophage communities corresponded with the larger distances between the sampling sites. Collectively, this study suggests that the newly designed primer pair can be effectively used to study the community and diversity of myocyanophage from different environments, and that high-throughput sequencing is a good method for understanding viral diversity.

  1. High Diversity of Myocyanophage in Various Aquatic Environments Revealed by High-Throughput Sequencing of Major Capsid Protein Gene With a New Set of Primers

    PubMed Central

    Hou, Weiguo; Wang, Shang; Briggs, Brandon R.; Li, Gaoyuan; Xie, Wei; Dong, Hailiang

    2018-01-01

    Myocyanophages, a group of viruses infecting cyanobacteria, are abundant and play important roles in elemental cycling. Here we investigated the particle-associated viral communities retained on 0.2 μm filters and in sediment samples (representing ancient cyanophage communities) from four ocean and three lake locations, using high-throughput sequencing and a newly designed primer pair targeting a gene fragment (∼145-bp in length) encoding the cyanophage gp23 major capsid protein (MCP). Diverse viral communities were detected in all samples. The fragments of 142-, 145-, and 148-bp in length were most abundant in the amplicons, and most sequences (>92%) belonged to cyanophages. Additionally, different sequencing depths resulted in different diversity estimates of the viral community. Operational taxonomic units obtained from deep sequencing of the MCP gene covered the majority of those obtained from shallow sequencing, suggesting that deep sequencing provided a more complete picture of the cyanophage community than shallow sequencing. Our results also revealed a wide geographic distribution of marine myocyanophages, i.e., higher dissimilarities of the myocyanophage communities corresponded with the larger distances between the sampling sites. Collectively, this study suggests that the newly designed primer pair can be effectively used to study the community and diversity of myocyanophage from different environments, and that high-throughput sequencing is a good method for understanding viral diversity.

  2. Development of a Rapid Fluorescence-Based High-Throughput Screening Assay to Identify Novel Kynurenine 3-Monooxygenase Inhibitor Scaffolds.

    PubMed

    Jacobs, K R; Guillemin, G J; Lovejoy, D B

    2018-02-01

    Kynurenine 3-monooxygenase (KMO) is a well-validated therapeutic target for the treatment of neurodegenerative diseases, including Alzheimer's disease (AD) and Huntington's disease (HD). This work reports a facile fluorescence-based KMO assay optimized for high-throughput screening (HTS) that achieves a throughput approximately 20-fold higher than the fastest KMO assay currently reported. The screen was run with excellent performance (average Z' value of 0.80) from 110,000 compounds across 341 plates and exceeded all statistical parameters used to describe a robust HTS assay. A subset of molecules was selected for validation by ultra-high-performance liquid chromatography, resulting in the confirmation of a novel hit with an IC50 comparable to that of the well-described KMO inhibitor Ro-61-8048. A medicinal chemistry program is currently underway to further develop our novel KMO inhibitor scaffolds.

  3. Role of APOE Isoforms in the Pathogenesis of TBI induced Alzheimer’s Disease

    DTIC Science & Technology

    2016-10-01

    deletion, APOE targeted replacement, complex breeding, CCI model optimization, mRNA library generation, high throughput massive parallel sequencing...demonstrate that the lack of Abca1 increases amyloid plaques and decreased APOE protein levels in AD-model mice. In this proposal we will test the hypothesis...injury, inflammatory reaction, transcriptome, high throughput massive parallel sequencing, mRNA-seq., behavioral testing, memory impairment, recovery 3

  4. Ultra-high throughput detection of single cell β-galactosidase activity in droplets using micro-optical lens array

    NASA Astrophysics Data System (ADS)

    Lim, Jiseok; Vrignon, Jérémy; Gruner, Philipp; Karamitros, Christos S.; Konrad, Manfred; Baret, Jean-Christophe

    2013-11-01

    We demonstrate the use of a hybrid microfluidic-micro-optical system for the screening of enzymatic activity at the single cell level. Escherichia coli β-galactosidase activity is revealed by a fluorogenic assay in 100 pl droplets. Individual droplets containing cells are screened by measuring their fluorescence signal using a high-speed camera. The measurement is parallelized over 100 channels equipped with microlenses and analyzed by image processing. A reinjection rate of 1 ml of emulsion per minute was reached, corresponding to more than 10^5 droplets per second, an analytical throughput larger than those obtained using flow cytometry.
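
    The throughput figure follows from the reported reinjection rate (1 ml of emulsion per minute) and droplet volume (100 pl). The sketch below works through the unit conversion. It assumes, as an upper bound, that the whole emulsion volume consists of droplets; the `droplet_fraction` parameter and the function name are hypothetical additions for illustration.

```python
def droplets_per_second(flow_ml_per_min, droplet_volume_pl, droplet_fraction=1.0):
    """Droplet rate implied by an emulsion flow rate and a droplet volume.

    droplet_fraction is the share of emulsion volume occupied by droplets
    (1.0 is an upper bound; the real packing fraction is lower).
    """
    pl_per_ml = 1e9  # 1 ml = 1e-3 l and 1 pl = 1e-12 l
    droplets_per_min = flow_ml_per_min * pl_per_ml * droplet_fraction / droplet_volume_pl
    return droplets_per_min / 60.0


# 1 ml/min of 100 pl droplets gives an upper bound on the order of 10^5 per second.
print(round(droplets_per_second(1.0, 100.0)))
```

    Spread over the 100 parallel channels mentioned in the abstract, this bound corresponds to under 2,000 droplets per second per channel.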

  5. Improving Cardiac Action Potential Measurements: 2D and 3D Cell Culture.

    PubMed

    Daily, Neil J; Yin, Yue; Kemanli, Pinar; Ip, Brian; Wakatsuki, Tetsuro

    2015-11-01

    Progress in the development of assays for measuring cardiac action potential is crucial for the discovery of drugs for treating cardiac disease and assessing cardiotoxicity. Recently, high-throughput methods for assessing action potential using induced pluripotent stem cell (iPSC) derived cardiomyocytes in both two-dimensional monolayer cultures and three-dimensional tissues have been developed. We describe an improved method for assessing cardiac action potential using an ultra-fast, cost-effective plate reader with commercially available dyes. Our methods dramatically improve the detection of the fluorescence signal from these dyes and make way for the development of more high-throughput methods for cardiac drug discovery and cardiotoxicity.

  6. History, applications, and challenges of immune repertoire research.

    PubMed

    Liu, Xiao; Wu, Jinghua

    2018-02-27

    The diversity of T and B cells in terms of their receptor sequences is huge in the vertebrate immune system and provides broad protection against the vast diversity of pathogens. The immune repertoire is defined as the sum of T cell receptors and B cell receptors (also named immunoglobulins) that make up the organism's adaptive immune system. Before the emergence of high-throughput sequencing, studies of the immune repertoire were limited by underdeveloped methodologies, since it was impossible to capture the whole picture with low-throughput tools. Massively parallel sequencing technology is well suited to research on the immune repertoire. In this article, we review the history of immune repertoire studies in terms of technologies and research applications. In particular, we discuss several challenges in this field and highlight efforts to develop potential solutions in the era of high-throughput sequencing of the immune repertoire.

  7. Mapping the miRNA interactome by crosslinking ligation and sequencing of hybrids (CLASH)

    PubMed Central

    Helwak, Aleksandra; Tollervey, David

    2014-01-01

    RNA-RNA interactions play critical roles in many cellular processes but studying them is difficult and laborious. Here, we describe an experimental procedure, termed crosslinking ligation and sequencing of hybrids (CLASH), which allows high-throughput identification of sites of RNA-RNA interaction. During CLASH, a tagged bait protein is UV crosslinked in vivo to stabilise RNA interactions and purified under denaturing conditions. RNAs associated with the bait protein are partially truncated, and the ends of RNA-duplexes are ligated together. Following linker addition, cDNA library preparation and high-throughput sequencing, the ligated duplexes give rise to chimeric cDNAs, which unambiguously identify RNA-RNA interaction sites independent of bioinformatic predictions. This protocol is optimized for studying miRNA targets bound by Argonaute proteins, but should be easily adapted for other RNA-binding proteins and classes of RNA. The protocol requires around 5 days to complete, excluding the time required for high-throughput sequencing and bioinformatic analyses. PMID:24577361

  8. High Throughput Computing Impact on Meta Genomics (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Gore, Brooklin

    2018-02-01

    This presentation includes a brief background on High Throughput Computing, correlating gene transcription factors, optical mapping, genotype to phenotype mapping via QTL analysis, and current work on next gen sequencing.

  9. Low-Cost, High-Throughput Sequencing of DNA Assemblies Using a Highly Multiplexed Nextera Process.

    PubMed

    Shapland, Elaine B; Holmes, Victor; Reeves, Christopher D; Sorokin, Elena; Durot, Maxime; Platt, Darren; Allen, Christopher; Dean, Jed; Serber, Zach; Newman, Jack; Chandran, Sunil

    2015-07-17

    In recent years, next-generation sequencing (NGS) technology has greatly reduced the cost of sequencing whole genomes, whereas the cost of sequence verification of plasmids via Sanger sequencing has remained high. Consequently, industrial-scale strain engineers either limit the number of designs or take short cuts in quality control. Here, we show that over 4000 plasmids can be completely sequenced in one Illumina MiSeq run for less than $3 each (15× coverage), which is a 20-fold reduction over using Sanger sequencing (2× coverage). We reduced the volume of the Nextera tagmentation reaction by 100-fold and developed an automated workflow to prepare thousands of samples for sequencing. We also developed software to track the samples and associated sequence data and to rapidly identify correctly assembled constructs having the fewest defects. As DNA synthesis and assembly become a centralized commodity, this NGS quality control (QC) process will be essential to groups operating high-throughput pipelines for DNA construction.

  10. High-throughput sequencing of forensic genetic samples using punches of FTA cards with buccal swabs.

    PubMed

    Kampmann, Marie-Louise; Buchard, Anders; Børsting, Claus; Morling, Niels

    2016-01-01

    Here, we demonstrate that punches from buccal swab samples preserved on FTA cards can be used for high-throughput DNA sequencing, also known as massively parallel sequencing (MPS). We typed 44 reference samples with the HID-Ion AmpliSeq Identity Panel using washed 1.2 mm punches from FTA cards with buccal swabs and compared the results with those obtained with DNA extracted using the EZ1 DNA Investigator Kit. Concordant profiles were obtained for all samples. Our protocol includes simple punch, wash, and PCR steps, reducing cost and hands-on time in the laboratory. Furthermore, it facilitates automation of DNA sequencing.

  11. Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis

    PubMed Central

    2012-01-01

    Background Multiplexing becomes the major limitation of next-generation sequencing (NGS) when applied to low-complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach permits simultaneous analysis of only up to several dozen samples. Results Here we introduce pair-barcode sequencing (PBS), an economical and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using the SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously distinguished by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated, and about 64% of them were assigned to both barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns were captured from the two clinical groups. The strong correlation between different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid. Conclusions By employing the PBS approach in NGS, large-scale multiplexed pooled samples can be practically analyzed in parallel, so that high-throughput sequencing economically meets the requirements of samples with low sequencing-throughput demand. PMID:22276739

  12. Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis.

    PubMed

    Tu, Jing; Ge, Qinyu; Wang, Shengqin; Wang, Lei; Sun, Beili; Yang, Qi; Bai, Yunfei; Lu, Zuhong

    2012-01-25

    Multiplexing becomes the major limitation of next-generation sequencing (NGS) when applied to low-complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach permits simultaneous analysis of only up to several dozen samples. Here we introduce pair-barcode sequencing (PBS), an economical and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using the SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously distinguished by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated, and about 64% of them were assigned to both barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns were captured from the two clinical groups. The strong correlation between different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid. By employing the PBS approach in NGS, large-scale multiplexed pooled samples can be practically analyzed in parallel, so that high-throughput sequencing economically meets the requirements of samples with low sequencing-throughput demand.
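
    The pairing arithmetic in this abstract (4 forward barcodes combined with 8 reverse barcodes giving 32 libraries) can be illustrated with a minimal Python sketch. The barcode sequences, the library names, and the read layout (forward barcode at the start of the read, reverse barcode at the end) are hypothetical; the abstract gives only the barcode counts, and the real SOLiD read structure may differ.

```python
from itertools import product

# Hypothetical barcode sequences; the abstract states only the counts (4 and 8).
FORWARD = ["AACC", "CCGG", "GGTT", "TTAA"]
REVERSE = ["ACAC", "AGAG", "ATAT", "CACA", "CTCT", "GAGA", "GTGT", "TCTC"]

# Each (forward, reverse) pair identifies exactly one library: 4 x 8 = 32.
PAIR_TO_LIBRARY = {
    pair: f"library_{i + 1:02d}"
    for i, pair in enumerate(product(FORWARD, REVERSE))
}


def assign_read(read, fwd_len=4, rev_len=4):
    """Return the library for a read, or None if either barcode is unknown.

    Assumed layout (illustrative only): forward barcode, insert, reverse barcode.
    """
    fwd, rev = read[:fwd_len], read[-rev_len:]
    return PAIR_TO_LIBRARY.get((fwd, rev))
```

    A read is assigned only when both barcodes are recognised, which mirrors the abstract's remark that about 64% of reads were assigned to both barcodes.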

  13. YAMAT-seq: an efficient method for high-throughput sequencing of mature transfer RNAs.

    PubMed

    Shigematsu, Megumi; Honda, Shozo; Loher, Phillipe; Telonis, Aristeidis G; Rigoutsos, Isidore; Kirino, Yohei

    2017-05-19

    Besides translation, transfer RNAs (tRNAs) play many non-canonical roles in various biological pathways and exhibit highly variable expression profiles. To unravel the emerging complexities of tRNA biology and the molecular mechanisms underlying them, an efficient tRNA sequencing method is required. However, the rigid structure of tRNA has presented a challenge to the development of such methods. We report the development of Y-shaped Adapter-ligated MAture TRNA sequencing (YAMAT-seq), an efficient and convenient method for high-throughput sequencing of mature tRNAs. YAMAT-seq circumvents the issue of inefficient adapter ligation, a characteristic of conventional RNA sequencing methods for mature tRNAs, by employing the efficient and specific ligation of a Y-shaped adapter to mature tRNAs using T4 RNA Ligase 2. Subsequent cDNA amplification and next-generation sequencing successfully yield numerous mature tRNA sequences. YAMAT-seq has high specificity for mature tRNAs and high sensitivity to detect most isoacceptors from minute amounts of total RNA. Moreover, YAMAT-seq shows quantitative capability to estimate expression levels of mature tRNAs, and has high reproducibility and broad applicability for various cell lines. YAMAT-seq thus provides a high-throughput technique for identifying tRNA profiles and their regulation in various transcriptomes, which could play important regulatory roles in translation and other biological processes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. High-throughput analysis of the protein sequence-stability landscape using a quantitative "yeast surface two-hybrid" system and fragment reconstitution

    PubMed Central

    Dutta, Sanjib; Koide, Akiko; Koide, Shohei

    2008-01-01

    Stability evaluation of many mutants can lead to a better understanding of the sequence determinants of a structural motif and of factors governing protein stability and protein evolution. The traditional biophysical analysis of protein stability is low throughput, limiting our ability to widely explore the sequence space in a quantitative manner. In this study, we have developed a high-throughput library screening method for quantifying stability changes, which is based on protein fragment reconstitution and yeast surface display. Our method exploits the thermodynamic linkage between protein stability and fragment reconstitution and the ability of the yeast surface display technique to quantitatively evaluate protein-protein interactions. The method was applied to a fibronectin type III (FN3) domain. Characterization of fragment reconstitution was facilitated by the co-expression of two FN3 fragments, thus establishing a "yeast surface two-hybrid" method. Importantly, our method does not rely on competition between clones and thus eliminates a common limitation of high-throughput selection methods in which the most stable variants are predominantly recovered. Thus, it allows for the isolation of sequences that exhibit a desired level of stability. We identified over one hundred unique sequences for a β-bulge motif, which was significantly more informative than natural sequences of the FN3 family in revealing the sequence determinants for the β-bulge. Our method provides a powerful means to rapidly assess the stability of many variants, to systematically assess the contribution of different factors to protein stability and to enhance protein stability. PMID:18674545

  15. Use of the melting curve assay as a means for high-throughput quantification of Illumina sequencing libraries.

    PubMed

    Shinozuka, Hiroshi; Forster, John W

    2016-01-01

    Background. Multiplexed sequencing is commonly performed on massively parallel short-read sequencing platforms such as Illumina, and the efficiency of library normalisation can affect the quality of the output dataset. Although several library normalisation approaches have been established, none are ideal for highly multiplexed sequencing due to issues of cost and/or processing time. Methods. An inexpensive and high-throughput library quantification method has been developed, based on an adaptation of the melting curve assay. Sequencing libraries were subjected to the assay using the Bio-Rad Laboratories CFX Connect™ Real-Time PCR Detection System. The library quantity was calculated through summation of reduction of relative fluorescence units between 86 and 95 °C. Results. PCR-enriched sequencing libraries are suitable for this quantification without pre-purification of DNA. Short DNA molecules, which ideally should be eliminated from the library for subsequent processing, were differentiated from the target DNA in a mixture on the basis of differences in melting temperature. Quantification results for long sequences targeted using the melting curve assay were correlated with those from existing methods (R² > 0.77), and that observed from MiSeq sequencing (R² = 0.82). Discussion. The results of multiplexed sequencing suggested that the normalisation performance of the described method is equivalent to that of another recently reported high-throughput bead-based method, BeNUS. However, costs for the melting curve assay are considerably lower and processing times shorter than those of other existing methods, suggesting greater suitability for highly multiplexed sequencing applications.
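
    The quantity calculation described in the Methods (summing the reduction in relative fluorescence units between 86 and 95 °C) might be sketched as follows in Python. The function name, the input format, and the choice to count only decreases between consecutive readings are assumptions made for illustration; the abstract does not specify these details.

```python
def library_quantity(temps_c, rfu, low=86.0, high=95.0):
    """Sum the drops in RFU between consecutive readings within [low, high] °C.

    temps_c and rfu are parallel sequences ordered by increasing temperature.
    Assumption: increases in fluorescence are ignored rather than subtracted.
    """
    window = [f for t, f in zip(temps_c, rfu) if low <= t <= high]
    total = 0.0
    for previous, current in zip(window, window[1:]):
        drop = previous - current
        if drop > 0:  # count reductions only
            total += drop
    return total


# Made-up melting curve: readings at 85 and 96 °C fall outside the window.
temps = [85, 86, 87, 88, 95, 96]
signal = [1000, 900, 700, 710, 100, 50]
print(library_quantity(temps, signal))  # 810.0
```

    Restricting the sum to the upper temperature window is what lets the assay separate long target molecules from short ones, which the abstract says are distinguished by melting temperature.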

  16. "First generation" automated DNA sequencing technology.

    PubMed

    Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

    2011-10-01

    Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.

  17. False positives complicate ancient pathogen identifications using high-throughput shotgun sequencing

    PubMed Central

    2014-01-01

    Background Identification of historic pathogens is challenging since false positives and negatives are a serious risk. Environmental non-pathogenic contaminants are ubiquitous. Furthermore, public genetic databases contain limited information regarding these species. High-throughput sequencing may help reliably detect and identify historic pathogens. Results We shotgun-sequenced eight 16th-century Mixtec individuals from the site of Teposcolula Yucundaa (Oaxaca, Mexico) who are reported to have died from the huey cocoliztli (‘Great Pestilence’ in Nahuatl), an unknown disease that decimated native Mexican populations during the Spanish colonial period, in order to identify the pathogen. Comparison of these sequences with those deriving from the surrounding soil and from 4 precontact individuals from the site found a wide variety of contaminant organisms that confounded analyses. Without the comparative sequence data from the precontact individuals and soil, false positives for Yersinia pestis and rickettsiosis could have been reported. Conclusions False positives and negatives remain problematic in ancient DNA analyses despite the application of high-throughput sequencing. Our results suggest that several studies claiming the discovery of ancient pathogens may need further verification. Additionally, true single molecule sequencing’s short read lengths, inability to sequence through DNA lesions, and limited ancient-DNA-specific technical development hinder its application to palaeopathology. PMID:24568097

  18. Leveraging the Power of High Performance Computing for Next Generation Sequencing Data Analysis: Tricks and Twists from a High Throughput Exome Workflow

    PubMed Central

    Wonczak, Stephan; Thiele, Holger; Nieroda, Lech; Jabbari, Kamel; Borowski, Stefan; Sinha, Vishal; Gunia, Wilfried; Lang, Ulrich; Achter, Viktor; Nürnberg, Peter

    2015-01-01

    Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. In order to run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files. PMID:25942438

  19. Evaluation of sequencing approaches for high-throughput toxicogenomics (SOT)

    EPA Science Inventory

    Whole-genome in vitro transcriptomics has shown the capability to identify mechanisms of action and estimates of potency for chemical-mediated effects in a toxicological framework, but with limited throughput and high cost. We present the evaluation of three toxicogenomics platfo...

  20. Advances in High-Throughput Speed, Low-Latency Communication for Embedded Instrumentation (7th Annual SFAF Meeting, 2012)

    ScienceCinema

    Jordan, Scott

    2018-01-24

    Scott Jordan on "Advances in high-throughput speed, low-latency communication for embedded instrumentation" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  1. Jeffamine Derivatized TentaGel Beads and PDMS Microbead Cassettes for Ultra-high Throughput in situ Releasable Solution-Phase Cell-based Screening of OBOC Combinatorial Small Molecule Libraries

    PubMed Central

    Townsend, Jared B.; Shaheen, Farzana; Liu, Ruiwu; Lam, Kit S.

    2011-01-01

    A method to efficiently immobilize and partition large quantities of microbeads in an array format in microfabricated polydimethylsiloxane (PDMS) cassette for high-throughput in situ releasable solution-phase cell-based screening of one-bead-one-compound (OBOC) combinatorial libraries is described. Commercially available Jeffamine triamine T-403 (∼440 Da) was derivatized such that two of its amino groups were protected by Fmoc and the remaining amino group capped with succinic anhydride to generate a carboxyl group. This resulting tri-functional hydrophilic polymer was then sequentially coupled two times to the outer layer of topologically segregated bilayer TentaGel (TG) beads with solid phase peptide synthesis chemistry, resulting in beads with increased loading capacity, hydrophilicity and porosity at the outer layer. We have found that such bead configuration can facilitate ultra high-throughput in situ releasable solution-phase screening of OBOC libraries. An encoded releasable OBOC small molecule library was constructed on Jeffamine derivatized TG beads with library compounds tethered to the outer layer via a disulfide linker and coding tags in the interior of the beads. Compound-beads could be efficiently loaded (5-10 minutes) into a 5 cm diameter Petri dish containing a 10,000-well PDMS microbead cassette, such that over 90% of the microwells were each filled with only one compound-bead. Jurkat T-lymphoid cancer cells suspended in Matrigel® were then layered over the microbead cassette to immobilize the compound-beads. After 24 hours of incubation at 37°C, dithiothreitol was added to trigger the release of library compounds. Forty-eight hours later, MTT reporter assay was used to identify regions of reduced cell viability surrounding each positive bead. From a total of about 20,000 beads screened, 3 positive beads were detected and physically isolated for decoding. A strong consensus motif was identified for these three positive compounds. 
    These compounds were re-synthesized and found to be cytotoxic (IC50 50-150 μM) against two T-lymphoma cell lines and less so against the MDA-MB 231 breast cancer cell line. This novel ultra high-throughput OBOC releasable method can potentially be adapted to many existing 96- or 384-well solution-phase cell-based or biochemical assays. PMID:20593859

  2. Quick, sensitive and specific detection and evaluation of quantification of minor variants by high-throughput sequencing.

    PubMed

    Leung, Ross Ka-Kit; Dong, Zhi Qiang; Sa, Fei; Chong, Cheong Meng; Lei, Si Wan; Tsui, Stephen Kwok-Wing; Lee, Simon Ming-Yuen

    2014-02-01

    Minor variants have significant implications in quasispecies evolution, early cancer detection and non-invasive fetal genotyping but their accurate detection by next-generation sequencing (NGS) is hampered by sequencing errors. We generated sequencing data from mixtures at predetermined ratios in order to provide insight into sequencing errors and variations that can arise for which simulation cannot be performed. The information also enables better parameterization in depth of coverage, read quality and heterogeneity, library preparation techniques, technical repeatability for mathematical modeling, theory development and simulation experimental design. We devised minor variant authentication rules that achieved 100% accuracy in both testing and validation experiments. The rules are free from tedious inspection of alignment accuracy, sequencing read quality or errors introduced by homopolymers. The authentication processes only require minor variants to: (1) have minimum depth of coverage larger than 30; (2) be reported by (a) four or more variant callers, or (b) DiBayes or LoFreq, plus SNVer (or BWA when no results are returned by SNVer), and with the interassay coefficient of variation (CV) no larger than 0.1. Quantification accuracy undermined by sequencing errors could neither be overcome by ultra-deep sequencing, nor recruiting more variant callers to reach a consensus, such that consistent underestimation and overestimation (i.e. low CV) were observed. To accommodate stochastic error and adjust the observed ratio within a specified accuracy, we presented a proof of concept for the use of a double calibration curve for quantification, which provides an important reference towards potential industrial-scale fabrication of calibrants for NGS.
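
    The authentication rules listed in this abstract can be written as a small predicate. The Python sketch below is one reading of the rule text, not the authors' implementation: it assumes the coefficient-of-variation limit applies to both branch (a) and branch (b), and it uses a hypothetical flag, snver_returned_results, to represent the "BWA when no results are returned by SNVer" fallback.

```python
def is_authentic(depth, callers, cv, snver_returned_results=True):
    """Apply the minor-variant authentication rules as read from the abstract.

    depth   -- minimum depth of coverage at the variant site
    callers -- names of the variant callers that reported the variant
    cv      -- interassay coefficient of variation
    snver_returned_results -- hypothetical flag: False if SNVer produced no output
    """
    callers = set(callers)
    if depth <= 30:  # rule (1): depth must be larger than 30
        return False
    if cv > 0.1:     # assumption: the CV limit applies to both branches of rule (2)
        return False
    if len(callers) >= 4:  # rule (2a): four or more variant callers
        return True
    # Rule (2b): DiBayes or LoFreq, plus SNVer (or BWA if SNVer returned nothing).
    primary = bool(callers & {"DiBayes", "LoFreq"})
    partner = "SNVer" if snver_returned_results else "BWA"
    return primary and partner in callers
```

    Under this reading, a variant at depth 50 reported by LoFreq and BWA passes only when SNVer returned no results for the sample.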

  3. Ultra-Structure database design methodology for managing systems biology data and analyses

    PubMed Central

    Maier, Christopher W; Long, Jeffrey G; Hemminger, Bradley M; Giddings, Morgan C

    2009-01-01

    Background Modern, high-throughput biological experiments generate copious, heterogeneous, interconnected data sets. Research is dynamic, with frequently changing protocols, techniques, instruments, and file formats. Because of these factors, systems designed to manage and integrate modern biological data sets often end up as large, unwieldy databases that become difficult to maintain or evolve. The novel rule-based approach of the Ultra-Structure design methodology presents a potential solution to this problem. By representing both data and processes as formal rules within a database, an Ultra-Structure system constitutes a flexible framework that enables users to explicitly store domain knowledge in both a machine- and human-readable form. End users themselves can change the system's capabilities without programmer intervention, simply by altering database contents; no computer code or schemas need be modified. This provides flexibility in adapting to change, and allows integration of disparate, heterogenous data sets within a small core set of database tables, facilitating joint analysis and visualization without becoming unwieldy. Here, we examine the application of Ultra-Structure to our ongoing research program for the integration of large proteomic and genomic data sets (proteogenomic mapping). Results We transitioned our proteogenomic mapping information system from a traditional entity-relationship design to one based on Ultra-Structure. Our system integrates tandem mass spectrum data, genomic annotation sets, and spectrum/peptide mappings, all within a small, general framework implemented within a standard relational database system. General software procedures driven by user-modifiable rules can perform tasks such as logical deduction and location-based computations. The system is not tied specifically to proteogenomic research, but is rather designed to accommodate virtually any kind of biological research. 
    Conclusion We find Ultra-Structure offers substantial benefits for biological information systems, the largest being the integration of diverse information sources into a common framework. This facilitates systems biology research by integrating data from disparate high-throughput techniques. It also enables us to readily incorporate new data types, sources, and domain knowledge with no change to the database structure or associated computer code. Ultra-Structure may be a significant step towards solving the hard problem of data management and integration in the systems biology era. PMID:19691849

  4. Method validation for 243 pesticides and environmental contaminants in meats and poultry by tandem mass spectrometry coupled to low-pressure gas chromatography and ultra high-performance liquid chromatography

    USDA-ARS?s Scientific Manuscript database

    An easy and reliable high-throughput analysis method was developed and validated for 192 diverse pesticides and 51 environmental contaminants (13 PCB congeners, 14 PAHs, 7 PBDE congeners, and 17 novel flame retardants) in cattle, swine, and poultry muscle. Sample preparation was based on the “quick,...

  5. High-Throughput Tabular Data Processor - Platform independent graphical tool for processing large data sets.

    PubMed

    Madanecki, Piotr; Bałut, Magdalena; Buckley, Patrick G; Ochocka, J Renata; Bartoszewski, Rafał; Crossman, David K; Messiaen, Ludwine M; Piotrowski, Arkadiusz

    2018-01-01

    High-throughput technologies generate considerable amounts of data, which often require bioinformatic expertise to analyze. Here we present High-Throughput Tabular Data Processor (HTDP), a platform-independent Java program. HTDP works on any character-delimited column data (e.g. BED, GFF, GTF, PSL, WIG, VCF) from multiple text files and supports merging, filtering and converting of data that is produced in the course of high-throughput experiments. HTDP can also utilize itemized sets of conditions from external files for complex or repetitive filtering/merging tasks. The program is intended to aid global, real-time processing of large data sets using a graphical user interface (GUI). Therefore, no prior expertise in programming, regular expressions, or command line usage is required of the user. Additionally, no a priori assumptions are imposed on the internal file composition. We demonstrate the flexibility and potential of HTDP in real-life research tasks including microarray and massively parallel sequencing, i.e. identification of disease-predisposing variants in next-generation sequencing data as well as comprehensive concurrent analysis of microarray and sequencing results. We also show the utility of HTDP in technical tasks including data merging, reduction and filtering with external criteria files. HTDP was developed to address functionality that is missing or rudimentary in other GUI software for processing character-delimited column data from high-throughput technologies. Flexibility, in terms of input file handling, provides long-term potential functionality in high-throughput analysis pipelines, as the program is not limited by the currently existing applications and data formats. HTDP is available as open-source software (https://github.com/pmadanecki/htdp).

  6. High-Throughput Tabular Data Processor – Platform independent graphical tool for processing large data sets

    PubMed Central

    Bałut, Magdalena; Buckley, Patrick G.; Ochocka, J. Renata; Bartoszewski, Rafał; Crossman, David K.; Messiaen, Ludwine M.; Piotrowski, Arkadiusz

    2018-01-01

    High-throughput technologies generate considerable amounts of data, which often require bioinformatic expertise to analyze. Here we present High-Throughput Tabular Data Processor (HTDP), a platform-independent Java program. HTDP works on any character-delimited column data (e.g. BED, GFF, GTF, PSL, WIG, VCF) from multiple text files and supports merging, filtering and converting of data that is produced in the course of high-throughput experiments. HTDP can also utilize itemized sets of conditions from external files for complex or repetitive filtering/merging tasks. The program is intended to aid global, real-time processing of large data sets using a graphical user interface (GUI). Therefore, no prior expertise in programming, regular expressions, or command line usage is required of the user. Additionally, no a priori assumptions are imposed on the internal file composition. We demonstrate the flexibility and potential of HTDP in real-life research tasks including microarray and massively parallel sequencing, i.e. identification of disease-predisposing variants in next-generation sequencing data as well as comprehensive concurrent analysis of microarray and sequencing results. We also show the utility of HTDP in technical tasks including data merging, reduction and filtering with external criteria files. HTDP was developed to address functionality that is missing or rudimentary in other GUI software for processing character-delimited column data from high-throughput technologies. Flexibility, in terms of input file handling, provides long-term potential functionality in high-throughput analysis pipelines, as the program is not limited by the currently existing applications and data formats. HTDP is available as open-source software (https://github.com/pmadanecki/htdp). PMID:29432475
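
    As a rough illustration of the kind of operation described here (filtering character-delimited column data against an itemized set of conditions), the Python sketch below keeps rows whose value in one column appears in an allowed set. It is not HTDP code and does not reflect HTDP's interface; the function name, the comment-line convention, and the example rows are assumptions.

```python
import csv


def filter_rows(lines, column, allowed, delimiter="\t"):
    """Yield rows whose value in `column` (0-based) is in `allowed`.

    Lines starting with '#' are treated as comments (an assumed convention),
    and rows too short to contain `column` are skipped.
    """
    allowed = set(allowed)
    for row in csv.reader(lines, delimiter=delimiter):
        if not row or row[0].startswith("#"):
            continue
        if len(row) > column and row[column] in allowed:
            yield row


# Made-up BED-like rows; the criteria set could equally be read from a file.
bed_lines = [
    "chr1\t100\t200\tgeneA",
    "chr2\t5\t50\tgeneB",
    "#comment",
    "chr1\t300\t400\tgeneC",
]
kept = list(filter_rows(bed_lines, column=0, allowed={"chr1"}))
```

    Because the function takes any iterable of lines, the same call works on an open file handle, so a large file can be streamed rather than loaded into memory.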

  7. The autism sequencing consortium: large-scale, high-throughput sequencing in autism spectrum disorders.

    PubMed

    Buxbaum, Joseph D; Daly, Mark J; Devlin, Bernie; Lehner, Thomas; Roeder, Kathryn; State, Matthew W

    2012-12-20

    Research during the past decade has seen significant progress in the understanding of the genetic architecture of autism spectrum disorders (ASDs), with gene discovery accelerating as the characterization of genomic variation has become increasingly comprehensive. At the same time, this research has highlighted ongoing challenges. Here we address the enormous impact of high-throughput sequencing (HTS) on ASD gene discovery, outline a consensus view for leveraging this technology, and describe a large multisite collaboration developed to accomplish these goals. Similar approaches could prove effective for severe neurodevelopmental disorders more broadly. Copyright © 2012 Elsevier Inc. All rights reserved.

  8. High-throughput sequencing enhanced phage display enables the identification of patient-specific epitope motifs in serum.

    PubMed

    Christiansen, Anders; Kringelum, Jens V; Hansen, Christian S; Bøgh, Katrine L; Sullivan, Eric; Patel, Jigar; Rigby, Neil M; Eiwegger, Thomas; Szépfalusi, Zsolt; de Masi, Federico; Nielsen, Morten; Lund, Ole; Dufva, Martin

    2015-08-06

    Phage display is a prominent screening technique with a multitude of applications including therapeutic antibody development and mapping of antigen epitopes. In this study, phages were selected based on their interaction with patient serum and exhaustively characterised by high-throughput sequencing. A bioinformatics approach was developed in order to identify peptide motifs of interest based on clustering and contrasting to control samples. Comparison of patient and control samples confirmed a major issue in phage display, namely the selection of unspecific peptides. The potential of the bioinformatic approach was demonstrated by identifying epitopes of a prominent peanut allergen, Ara h 1, in sera from patients with severe peanut allergy. The identified epitopes were confirmed by high-density peptide micro-arrays. The present study demonstrates that high-throughput sequencing can empower phage display by (i) enabling the analysis of complex biological samples, (ii) circumventing the traditional laborious picking and functional testing of individual phage clones and (iii) reducing the number of selection rounds.

  9. A challenge to the striking genotypic heterogeneity of retinitis pigmentosa: a better understanding of the pathophysiology using the newest genetic strategies

    PubMed Central

    Sorrentino, F S; Gallenga, C E; Bonifazzi, C; Perri, P

    2016-01-01

    Retinitis pigmentosa (RP) is a group of inherited retinal disorders characterized by a complex association between tremendous genotypic multiplicity and great phenotypic heterogeneity. The severity of the clinical manifestation depends on the penetrance and expressivity of the disease gene. Also, various interactions between gene expression and environmental factors have been hypothesized. More than 250 genes with ~4500 causative mutations have been reported to be involved in different RP-related mechanisms. Currently, no more than 50% of RP cases are attributable to identified genes, whereas the remaining molecular defects are still undetectable, especially in populations where few genetic screenings have been performed. Therefore, new genetic strategies can be a remarkably useful tool to aid clinical diagnosis, potentially modifying treatment options and family counseling. Genome-wide analytical techniques (array comparative genomic hybridization and single-nucleotide polymorphism genotyping) and DNA sequencing strategies (arrayed primer extension, Sanger sequencing, and ultra high-throughput sequencing) are successfully used to make an early molecular diagnosis by detecting single or multiple mutations in the huge heterogeneity of RPs. Further research needs to be carried out to better investigate the genotype/phenotype correlation, putting together genetic and clinical findings to provide detailed information concerning the risk of RP development and novel effective treatments. PMID:27564722

  10. ToTem: a tool for variant calling pipeline optimization.

    PubMed

    Tom, Nikola; Tom, Ondrej; Malcikova, Jitka; Pavlova, Sarka; Kubesova, Blanka; Rausch, Tobias; Kolarik, Miroslav; Benes, Vladimir; Bystry, Vojtech; Pospisilova, Sarka

    2018-06-26

    High-throughput bioinformatics analyses of next generation sequencing (NGS) data often require challenging pipeline optimization. The key problem is choosing appropriate tools and selecting the best parameters for optimal precision and recall. Here we introduce ToTem, a tool for automated pipeline optimization. ToTem is a stand-alone web application with a comprehensive graphical user interface (GUI). ToTem is written in Java and PHP with an underlying connection to a MySQL database. Its primary role is to automatically generate, execute and benchmark different variant calling pipeline settings. Our tool allows an analysis to be started from any level of the process and with the possibility of plugging almost any tool or code. To prevent an over-fitting of pipeline parameters, ToTem ensures the reproducibility of these by using cross validation techniques that penalize the final precision, recall and F-measure. The results are interpreted as interactive graphs and tables allowing an optimal pipeline to be selected, based on the user's priorities. Using ToTem, we were able to optimize somatic variant calling from ultra-deep targeted gene sequencing (TGS) data and germline variant detection in whole genome sequencing (WGS) data. ToTem is a tool for automated pipeline optimization which is freely available as a web application at  https://totem.software .
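    The precision, recall and F-measure used to benchmark pipeline settings can be illustrated with a minimal sketch. This is not ToTem's code: the function name `benchmark`, the tuple representation of a variant and the toy call sets are assumptions made only for illustration.

```python
def benchmark(called, truth):
    """Return (precision, recall, f_measure) for called variants vs. a truth set."""
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    denom = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Hypothetical variants written as (chromosome, position, ref, alt).
truth = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
         ("chr2", 40, "G", "A"), ("chr2", 90, "T", "C")}
called = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
          ("chr3", 7, "A", "C")}

precision, recall, f_measure = benchmark(called, truth)
print(precision, recall, f_measure)
```

    Running the same function on each candidate pipeline setting, ideally on held-out data, gives comparable scores from which a setting can be chosen according to whether precision or recall matters more.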

  11. Genomic analysis suggests that mRNA destabilization by the microprocessor is specialized for the auto-regulation of Dgcr8.

    PubMed

    Shenoy, Archana; Blelloch, Robert

    2009-09-11

    The Microprocessor, containing the RNA binding protein Dgcr8 and RNase III enzyme Drosha, is responsible for processing primary microRNAs to precursor microRNAs. The Microprocessor regulates its own levels by cleaving hairpins in the 5'UTR and coding region of the Dgcr8 mRNA, thereby destabilizing the mature transcript. To determine whether the Microprocessor has a broader role in directly regulating other coding mRNA levels, we integrated results from expression profiling and ultra high-throughput deep sequencing of small RNAs. Expression analysis of mRNAs in wild-type, Dgcr8 knockout, and Dicer knockout mouse embryonic stem (ES) cells uncovered mRNAs that were specifically upregulated in the Dgcr8 null background. A number of these transcripts had evolutionarily conserved predicted hairpin targets for the Microprocessor. However, analysis of deep sequencing data of 18 to 200nt small RNAs in mouse ES, HeLa, and HepG2 indicates that exonic sequence reads that map in a pattern consistent with Microprocessor activity are unique to Dgcr8. We conclude that the Microprocessor's role in directly destabilizing coding mRNAs is likely specifically targeted to Dgcr8 itself, suggesting a specialized cellular mechanism for gene auto-regulation.

  12. ISRNA: an integrative online toolkit for short reads from high-throughput sequencing data.

    PubMed

    Luo, Guan-Zheng; Yang, Wei; Ma, Ying-Ke; Wang, Xiu-Jie

    2014-02-01

    Integrative Short Reads NAvigator (ISRNA) is an online toolkit for analyzing high-throughput small RNA sequencing data. Besides the high-speed genome mapping function, ISRNA provides statistics for genomic location, length distribution and nucleotide composition bias analysis of sequence reads. Number of reads mapped to known microRNAs and other classes of short non-coding RNAs, coverage of short reads on genes, expression abundance of sequence reads as well as some other analysis functions are also supported. The versatile search functions enable users to select sequence reads according to their sub-sequences, expression abundance, genomic location, relationship to genes, etc. A specialized genome browser is integrated to visualize the genomic distribution of short reads. ISRNA also supports management and comparison among multiple datasets. ISRNA is implemented in Java/C++/Perl/MySQL and can be freely accessed at http://omicslab.genetics.ac.cn/ISRNA/.

  13. Coexistence of enhanced mobile broadband communications and ultra-reliable low-latency communications in mobile front-haul

    NASA Astrophysics Data System (ADS)

    Ying, Kai; Kowalski, John M.; Nogami, Toshizo; Yin, Zhanping; Sheng, Jia

    2018-01-01

    5G systems are supposed to support coexistence of multiple services such as ultra reliable low latency communications (URLLC) and enhanced mobile broadband (eMBB) communications. The target of eMBB communications is to meet the high-throughput requirement while URLLC are used for some high priority services. Due to the sporadic nature and low latency requirement, URLLC transmission may pre-empt the resource of eMBB transmission. Our work is to analyze the URLLC impact on eMBB transmission in mobile front-haul. Then, some solutions are proposed to guarantee the reliability/latency requirements for URLLC services and minimize the impact to eMBB services at the same time.

  14. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.

    PubMed

    Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

    2012-07-15

    In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1% accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license.
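    The k-mer searching step can be pictured as ranking reference sequences by how many k-mers they share with a query. The sketch below shows only that general idea; it is not SINA's implementation, it omits the partial order alignment stage entirely, and the names `kmers` and `rank_references`, the choice k = 4 and the toy sequences are assumptions.

```python
def kmers(seq, k):
    """Return the set of all overlapping substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_references(query, references, k=4):
    """Order reference names by the number of k-mers shared with the query."""
    query_kmers = kmers(query, k)
    scores = [(len(query_kmers & kmers(seq, k)), name)
              for name, seq in references.items()]
    # Highest shared count first; ties broken alphabetically by name.
    return [name for _, name in sorted(scores, key=lambda s: (-s[0], s[1]))]


# Toy reference set (hypothetical sequences).
references = {
    "refA": "ACGTACGTGGCCTTAA",
    "refB": "ACGTACGTGGCATTAA",
    "refC": "TTTTGGGGCCCCAAAA",
}
query = "ACGTACGTGGCCTTAA"

ranking = rank_references(query, references)
print(ranking)
```

    Only the best-ranked references would then be passed on to the more expensive alignment step, which is what keeps this kind of approach fast on large databases.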

  15. Fast multiclonal clusterization of V(D)J recombinations from high-throughput sequencing.

    PubMed

    Giraud, Mathieu; Salson, Mikaël; Duez, Marc; Villenet, Céline; Quief, Sabine; Caillault, Aurélie; Grardel, Nathalie; Roumier, Christophe; Preudhomme, Claude; Figeac, Martin

    2014-05-28

    V(D)J recombinations in lymphocytes are essential for immunological diversity. They are also useful markers of pathologies. In leukemia, they are used to quantify the minimal residual disease during patient follow-up. However, the full breadth of lymphocyte diversity is not fully understood. We propose new algorithms that process high-throughput sequencing (HTS) data to extract unnamed V(D)J junctions and gather them into clones for quantification. This analysis is based on a seed heuristic and is fast and scalable because in the first phase, no alignment is performed with germline database sequences. The algorithms were applied to TR γ HTS data from a patient with acute lymphoblastic leukemia, and also on data simulating hypermutations. Our methods identified the main clone, as well as additional clones that were not identified with standard protocols. The proposed algorithms provide new insight into the analysis of high-throughput sequencing data for leukemia, and also to the quantitative assessment of any immunological profile. The methods described here are implemented in a C++ open-source program called Vidjil.

  16. REDItools: high-throughput RNA editing detection made easy.

    PubMed

    Picardi, Ernesto; Pesole, Graziano

    2013-07-15

    The reliable detection of RNA editing sites from massive sequencing data remains challenging and, although several methodologies have been proposed, no computational tools have been released to date. Here, we introduce REDItools, a suite of Python scripts to perform high-throughput investigation of RNA editing using next-generation sequencing data. REDItools are written in the Python programming language and are freely available at http://code.google.com/p/reditools/. Contact: ernesto.picardi@uniba.it or graziano.pesole@uniba.it. Supplementary data are available at Bioinformatics online.

  17. High throughput protein production screening

    DOEpatents

    Beernink, Peter T [Walnut Creek, CA]; Coleman, Matthew A [Oakland, CA]; Segelke, Brent W [San Ramon, CA]

    2009-09-08

    Methods, compositions, and kits for the cell-free production and analysis of proteins are provided. The invention allows for the production of proteins from prokaryotic sequences or eukaryotic sequences, including human cDNAs, using PCR and IVT methods and detecting the proteins through fluorescence or immunoblot techniques. This invention can be used to identify optimized PCR and IVT conditions, codon usages and mutations. The methods are readily automated and can be used for high throughput analysis of protein expression levels, interactions, and functional states.

  18. High-Throughput Sequencing: A Roadmap Toward Community Ecology

    PubMed Central

    Poisot, Timothée; Péquin, Bérangère; Gravel, Dominique

    2013-01-01

    High-throughput sequencing is becoming increasingly important in microbial ecology, yet it is surprisingly under-used to generate or test biogeographic hypotheses. In this contribution, we highlight how adding these methods to the ecologist toolbox will allow the detection of new patterns, and will help our understanding of the structure and dynamics of diversity. Starting with a review of ecological questions that can be addressed, we move on to the technical and analytical issues that will benefit from an increased collaboration between different disciplines. PMID:23610649

  19. ShortRead: a bioconductor package for input, quality assessment and exploration of high-throughput sequence data

    PubMed Central

    Morgan, Martin; Anders, Simon; Lawrence, Michael; Aboyoun, Patrick; Pagès, Hervé; Gentleman, Robert

    2009-01-01

    Summary: ShortRead is a package for input, quality assessment, manipulation and output of high-throughput sequencing data. ShortRead is provided in the R and Bioconductor environments, allowing ready access to additional facilities for advanced statistical analysis, data transformation, visualization and integration with diverse genomic resources. Availability and Implementation: This package is implemented in R and available at the Bioconductor web site; the package contains a ‘vignette’ outlining typical work flows. Contact: mtmorgan@fhcrc.org PMID:19654119

  20. Forecasting Ecological Genomics: High-Tech Animal Instrumentation Meets High-Throughput Sequencing

    PubMed Central

    Shafer, Aaron B. A.; Northrup, Joseph M.; Wikelski, Martin; Wittemyer, George; Wolf, Jochen B. W.

    2016-01-01

    Recent advancements in animal tracking technology and high-throughput sequencing are rapidly changing the questions and scope of research in the biological sciences. The integration of genomic data with high-tech animal instrumentation comes as a natural progression of traditional work in ecological genetics, and we provide a framework for linking the separate data streams from these technologies. Such a merger will elucidate the genetic basis of adaptive behaviors like migration and hibernation and advance our understanding of fundamental ecological and evolutionary processes such as pathogen transmission, population responses to environmental change, and communication in natural populations. PMID:26745372

  1. Frequency Based Design Partitioning to Achieve Higher Throughput in Digital Cross Correlator for Aperture Synthesis Passive MMW Imager.

    PubMed

    Asif, Muhammad; Guo, Xiangzhou; Zhang, Jing; Miao, Jungang

    2018-04-17

    Digital cross-correlation is central to many applications including but not limited to Digital Image Processing, Satellite Navigation and Remote Sensing. With recent advancements in digital technology, the computational demands of such applications have increased enormously. In this paper we are presenting a high throughput digital cross correlator, capable of processing 1-bit digitized stream, at the rate of up to 2 GHz, simultaneously on 64 channels i.e., approximately 4 Trillion correlation and accumulation operations per second. In order to achieve higher throughput, we have focused on frequency based partitioning of our design and tried to minimize and localize high frequency operations. This correlator is designed for a Passive Millimeter Wave Imager intended for the detection of contraband items concealed on human body. The goals are to increase the system bandwidth, achieve video rate imaging, improve sensitivity and reduce the size. Design methodology is detailed in subsequent sections, elaborating the techniques enabling high throughput. The design is verified for Xilinx Kintex UltraScale device in simulation and the implementation results are given in terms of device utilization and power consumption estimates. Our results show considerable improvements in throughput as compared to our baseline design, while the correlator successfully meets the functional requirements.
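    For 1-bit digitized streams, one common way to correlate two channels is to XNOR the bit streams and count the agreements. The sketch below illustrates that general technique and checks the quoted throughput figure; it is not the authors' FPGA design. The function name `correlate_1bit` and the toy bit patterns are assumptions, as is the reading that every pair of the 64 channels is correlated once per sample.

```python
from math import comb


def correlate_1bit(a, b, nbits):
    """Count sample positions where two 1-bit streams (packed in ints) agree."""
    mask = (1 << nbits) - 1
    agree = ~(a ^ b) & mask          # XNOR: bit is 1 where the samples match
    return bin(agree).count("1")     # popcount = accumulated agreements


# Two hypothetical 8-sample streams, one bit per sample.
a = 0b10110010
b = 0b10011010
agreements = correlate_1bit(a, b, 8)
# Mapping bits to +1/-1, the correlation sum is agreements minus disagreements.
correlation = 2 * agreements - 8

# Assumed accounting: one correlate-and-accumulate per channel pair per sample.
pairs = comb(64, 2)
ops_per_second = pairs * 2_000_000_000
print(agreements, correlation, pairs, ops_per_second)
```

    Under that assumption, 2016 channel pairs at 2 GHz give about 4.03 × 10^12 operations per second, which is consistent with the "approximately 4 Trillion" figure in the abstract.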

  2. The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing.

    PubMed

    Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P; Panitz, Frank; Bendixen, Christian; Nielsen, Rasmus; Willerslev, Eske

    2007-02-14

    The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5' tag analysis. We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (mis-assignment rate <0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in a single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regard to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5' primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products. The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics.
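    Tracing each read back to its source by its 5' tag amounts to a lookup on the first bases of the read. The following is a minimal sketch of that step, assuming exact matching of a dinucleotide tag; the function name `demultiplex`, the tag-to-specimen table and the reads are hypothetical and are not taken from the study.

```python
def demultiplex(reads, tag_to_sample, tag_len=2):
    """Assign reads to samples by their 5' tag and strip the tag from each read."""
    assigned = {sample: [] for sample in tag_to_sample.values()}
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        sample = tag_to_sample.get(tag)
        if sample is None:
            unassigned.append(read)      # unknown tag, e.g. a sequencing anomaly
        else:
            assigned[sample].append(insert)
    return assigned, unassigned


# Hypothetical dinucleotide tags and reads.
tag_to_sample = {"AC": "specimen_1", "TG": "specimen_2"}
reads = ["ACGGATTACA", "TGGGATTGCA", "ACGGATTACC", "NNGGATTACA"]

assigned, unassigned = demultiplex(reads, tag_to_sample)
print(assigned, unassigned)
```

    Counting the reads per tag in `assigned` is also a simple way to look for the kind of tag-dependent representation bias that the abstract reports.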

  3. High-Throughput Sequencing, a Versatile Weapon to Support Genome-Based Diagnosis in Infectious Diseases: Applications to Clinical Bacteriology

    PubMed Central

    Caboche, Ségolène; Audebert, Christophe; Hot, David

    2014-01-01

    Recent progress in high-throughput sequencing (HTS) technologies enables easy and lower-cost access to whole genome sequencing (WGS) or re-sequencing. HTS associated with adapted, automatic and fast bioinformatics solutions for sequencing applications promises an accurate and timely identification and characterization of pathogenic agents. Many studies have demonstrated that data obtained from HTS analysis have allowed genome-based diagnosis, which has been consistent with phenotypic observations. These proofs of concept are probably the first steps toward the future of clinical microbiology. From concept to routine use, many parameters need to be considered to promote HTS as a powerful tool to help physicians and clinicians in microbiological investigations. This review highlights the milestones to be completed toward this purpose. PMID:25437800

  4. [Genetic analysis of two children patients affected with CHARGE syndrome].

    PubMed

    Li, Guoqiang; Li, Niu; Xu, Yufei; Li, Juan; Ding, Yu; Shen, Yiping; Wang, Xiumin; Wang, Jian

    2018-04-10

    To analyze two Chinese pediatric patients with multiple malformations and delayed growth and development. Both patients were subjected to targeted gene sequencing, and the results were analyzed with Ingenuity Variant Analysis software. Suspected pathogenic variations were verified by Sanger sequencing. High-throughput sequencing showed that both patients carried heterozygous variants of the CHD7 gene. Patient 1 carried a nonsense mutation in exon 36 (c.7957C>T, p.Arg2653*), while patient 2 carried a nonsense mutation in exon 2 (c.718C>T, p.Gln240*). Sanger sequencing confirmed the above mutations in both patients, while their parents were wild-type at the corresponding sites, indicating that the two mutations arose de novo. Both patients were diagnosed with CHARGE syndrome by high-throughput sequencing.

  5. High-throughput gene mapping in Caenorhabditis elegans.

    PubMed

    Swan, Kathryn A; Curtis, Damian E; McKusick, Kathleen B; Voinov, Alexander V; Mapa, Felipa A; Cancilla, Michael R

    2002-07-01

    Positional cloning of mutations in model genetic systems is a powerful method for the identification of targets of medical and agricultural importance. To facilitate the high-throughput mapping of mutations in Caenorhabditis elegans, we have identified a further 9602 putative new single nucleotide polymorphisms (SNPs) between two C. elegans strains, Bristol N2 and the Hawaiian mapping strain CB4856, by sequencing inserts from a CB4856 genomic DNA library and using an informatics pipeline to compare sequences with the canonical N2 genomic sequence. When combined with data from other laboratories, our marker set of 17,189 SNPs provides even coverage of the complete worm genome. To date, we have confirmed >1099 evenly spaced SNPs (one every 91 +/- 56 kb) across the six chromosomes and validated the utility of our SNP marker set and new fluorescence polarization-based genotyping methods for systematic and high-throughput identification of genes in C. elegans by cloning several proprietary genes. We illustrate our approach by recombination mapping and confirmation of the mutation in the cloned gene, dpy-18.

  6. SEQADAPT: an adaptable system for the tracking, storage and analysis of high throughput sequencing experiments.

    PubMed

    Burdick, David B; Cavnor, Chris C; Handcock, Jeremy; Killcoyne, Sarah; Lin, Jake; Marzolf, Bruz; Ramsey, Stephen A; Rovira, Hector; Bressler, Ryan; Shmulevich, Ilya; Boyle, John

    2010-07-14

    High throughput sequencing has become an increasingly important tool for biological research. However, the existing software systems for managing and processing these data have not provided the flexible infrastructure that research requires. Existing software solutions provide static and well-established algorithms in a restrictive package. However as high throughput sequencing is a rapidly evolving field, such static approaches lack the ability to readily adopt the latest advances and techniques which are often required by researchers. We have used a loosely coupled, service-oriented infrastructure to develop SeqAdapt. This system streamlines data management and allows for rapid integration of novel algorithms. Our approach also allows computational biologists to focus on developing and applying new methods instead of writing boilerplate infrastructure code. The system is based around the Addama service architecture and is available at our website as a demonstration web application, an installable single download and as a collection of individual customizable services.

  7. SEQADAPT: an adaptable system for the tracking, storage and analysis of high throughput sequencing experiments

    PubMed Central

    2010-01-01

    Background High throughput sequencing has become an increasingly important tool for biological research. However, the existing software systems for managing and processing these data have not provided the flexible infrastructure that research requires. Results Existing software solutions provide static and well-established algorithms in a restrictive package. However as high throughput sequencing is a rapidly evolving field, such static approaches lack the ability to readily adopt the latest advances and techniques which are often required by researchers. We have used a loosely coupled, service-oriented infrastructure to develop SeqAdapt. This system streamlines data management and allows for rapid integration of novel algorithms. Our approach also allows computational biologists to focus on developing and applying new methods instead of writing boilerplate infrastructure code. Conclusion The system is based around the Addama service architecture and is available at our website as a demonstration web application, an installable single download and as a collection of individual customizable services. PMID:20630057

  8. Library Design-Facilitated High-Throughput Sequencing of Synthetic Peptide Libraries.

    PubMed

    Vinogradov, Alexander A; Gates, Zachary P; Zhang, Chi; Quartararo, Anthony J; Halloran, Kathryn H; Pentelute, Bradley L

    2017-11-13

    A methodology to achieve high-throughput de novo sequencing of synthetic peptide mixtures is reported. The approach leverages shotgun nanoliquid chromatography coupled with tandem mass spectrometry-based de novo sequencing of library mixtures (up to 2000 peptides) as well as automated data analysis protocols to filter away incorrect assignments, noise, and synthetic side-products. For increasing the confidence in the sequencing results, mass spectrometry-friendly library designs were developed that enabled unambiguous decoding of up to 600 peptide sequences per hour while maintaining greater than 85% sequence identification rates in most cases. The reliability of the reported decoding strategy was additionally confirmed by matching fragmentation spectra for select authentic peptides identified from library sequencing samples. The methods reported here are directly applicable to screening techniques that yield mixtures of active compounds, including particle sorting of one-bead one-compound libraries and affinity enrichment of synthetic library mixtures performed in solution.

  9. Mapping DNA polymerase errors by single-molecule sequencing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, David F.; Lu, Jenny; Chang, Seungwoo

    Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replication product is tagged with a unique nucleotide sequence before amplification. Here, this allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases.
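    The barcoding idea, comparing several reads of the same tagged product so that sequencing errors can be filtered out, can be sketched as a per-barcode majority vote. This is an illustrative simplification, not the authors' pipeline: it assumes reads of equal length that are already aligned to each other, and the function name `consensus_by_barcode`, the `min_reads` threshold and the toy reads are assumptions.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(tagged_reads, min_reads=3):
    """Group reads by barcode and take a per-position majority vote in each group."""
    groups = defaultdict(list)
    for barcode, seq in tagged_reads:
        groups[barcode].append(seq)

    consensus = {}
    for barcode, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few reads to separate sequencing errors from real ones
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Hypothetical (barcode, read) pairs; the template is assumed to be ACGTTGCA.
tagged_reads = [
    ("AAGT", "ACGTTGCA"),
    ("AAGT", "ACGTTGCA"),
    ("AAGT", "ACGATGCA"),  # mismatch in one read only: likely a sequencing error
    ("CCTA", "ACGTAGCA"),
    ("CCTA", "ACGTAGCA"),
    ("CCTA", "ACGTAGCA"),  # same change in every read: candidate polymerase error
    ("GGAC", "ACGTTGCA"),  # single read, skipped
]

result = consensus_by_barcode(tagged_reads)
print(result)
```

    A difference from the template that survives in the consensus is shared by all reads of one replication product, so it is more plausibly a polymerase error than a sequencing artifact.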

  10. Mapping DNA polymerase errors by single-molecule sequencing

    DOE PAGES

    Lee, David F.; Lu, Jenny; Chang, Seungwoo; ...

    2016-05-16

    Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replication product is tagged with a unique nucleotide sequence before amplification. Here, this allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases.

  11. Deep sequencing in library selection projects: what insight does it bring?

    PubMed

    Glanville, J; D'Angelo, S; Khan, T A; Reddy, S T; Naranjo, L; Ferrara, F; Bradbury, A R M

    2015-08-01

    High throughput sequencing is poised to change all aspects of the way antibodies and other binders are discovered and engineered. Millions of available sequence reads provide an unprecedented sampling depth able to guide the design and construction of effective, high quality naïve libraries containing tens of billions of unique molecules. Furthermore, during selections, high throughput sequencing enables quantitative tracing of enriched clones and position-specific guidance to amino acid variation under positive selection during antibody engineering. Successful application of the technologies relies on specific PCR reagent design, correct sequencing platform selection, and effective use of computational tools and statistical measures to remove error, identify antibodies, estimate diversity, and extract signatures of selection from the clone down to individual structural positions. Here we review these considerations and discuss some of the remaining challenges to the widespread adoption of the technology. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Deep sequencing in library selection projects: what insight does it bring?

    PubMed Central

    Glanville, J; D’Angelo, S; Khan, T.A.; Reddy, S. T.; Naranjo, L.; Ferrara, F.; Bradbury, A.R.M.

    2015-01-01

    High throughput sequencing is poised to change all aspects of the way antibodies and other binders are discovered and engineered. Millions of available sequence reads provide an unprecedented sampling depth able to guide the design and construction of effective, high quality naïve libraries containing tens of billions of unique molecules. Furthermore, during selections, high throughput sequencing enables quantitative tracing of enriched clones and position-specific guidance to amino acid variation under positive selection during antibody engineering. Successful application of the technologies relies on specific PCR reagent design, correct sequencing platform selection, and effective use of computational tools and statistical measures to remove error, identify antibodies, estimate diversity, and extract signatures of selection from the clone down to individual structural positions. Here we review these considerations and discuss some of the remaining challenges to the widespread adoption of the technology. PMID:26451649

  13. Characterization of DNA-protein interactions using high-throughput sequencing data from pulldown experiments

    NASA Astrophysics Data System (ADS)

    Moreland, Blythe; Oman, Kenji; Curfman, John; Yan, Pearlly; Bundschuh, Ralf

    Methyl-binding domain (MBD) protein pulldown experiments have been a valuable tool in measuring the levels of methylated CpG dinucleotides. Due to the frequent use of this technique, high-throughput sequencing data sets are available that allow a detailed quantitative characterization of the underlying interaction between methylated DNA and MBD proteins. Analyzing such data sets, we first found that two such proteins cannot bind closer to each other than 2 bp, consistent with structural models of the DNA-protein interaction. Second, the large amount of sequencing data allowed us to find rather weak but nevertheless clearly statistically significant sequence preferences for several bases around the required CpG. These results demonstrate that pulldown sequencing is a high-precision tool in characterizing DNA-protein interactions. This material is based upon work supported by the National Science Foundation under Grant No. DMR-1410172.

  14. High-throughput sequencing of complete human mtDNA genomes from the Caucasus and West Asia: high diversity and demographic inferences.

    PubMed

    Schönberg, Anna; Theunert, Christoph; Li, Mingkun; Stoneking, Mark; Nasidze, Ivan

    2011-09-01

    To investigate the demographic history of human populations from the Caucasus and surrounding regions, we used high-throughput sequencing to generate 147 complete mtDNA genome sequences from random samples of individuals from three groups from the Caucasus (Armenians, Azeri and Georgians), and one group each from Iran and Turkey. Overall diversity is very high, with 144 different sequences that fall into 97 different haplogroups found among the 147 individuals. Bayesian skyline plots (BSPs) of population size change through time show a population expansion around 40-50 kya, followed by a constant population size, and then another expansion around 15-18 kya for the groups from the Caucasus and Iran. The BSP for Turkey differs the most from the others, with an increase from 35 to 50 kya followed by a prolonged period of constant population size, and no indication of a second period of growth. An approximate Bayesian computation approach was used to estimate divergence times between each pair of populations; the oldest divergence times were between Turkey and the other four groups from the South Caucasus and Iran (~400-600 generations), while the divergence time of the three Caucasus groups from each other was comparable to their divergence time from Iran (average of ~360 generations). These results illustrate the value of random sampling of complete mtDNA genome sequences that can be obtained with high-throughput sequencing platforms.

  15. PLAN: a web platform for automating high-throughput BLAST searches and for managing and mining results.

    PubMed

    He, Ji; Dai, Xinbin; Zhao, Xuechun

    2007-02-09

    BLAST searches are widely used for sequence alignment. The search results are commonly adopted for various functional and comparative genomics tasks such as annotating unknown sequences, investigating gene models and comparing two sequence sets. Advances in sequencing technologies pose challenges for high-throughput analysis of large-scale sequence data. A number of programs and hardware solutions exist for efficient BLAST searching, but there is a lack of generic software solutions for mining and personalized management of the results. Systematically reviewing the results and identifying information of interest remains tedious and time-consuming. Personal BLAST Navigator (PLAN) is a versatile web platform that helps users to carry out various personalized pre- and post-BLAST tasks, including: (1) query and target sequence database management, (2) automated high-throughput BLAST searching, (3) indexing and searching of results, (4) filtering results online, (5) managing results of personal interest in favorite categories, (6) automated sequence annotation (such as NCBI NR and ontology-based annotation). PLAN integrates, by default, the Decypher hardware-based BLAST solution provided by Active Motif Inc. with a greatly improved efficiency over conventional BLAST software. BLAST results are visualized by spreadsheets and graphs and are full-text searchable. BLAST results and sequence annotations can be exported, in part or in full, in various formats including Microsoft Excel and FASTA. Sequences and BLAST results are organized in projects, the data publication levels of which are controlled by the registered project owners. In addition, all analytical functions are provided to public users without registration. PLAN has proved a valuable addition to the community for automated high-throughput BLAST searches, and, more importantly, for knowledge discovery, management and sharing based on sequence alignment results. 
The PLAN web interface is platform-independent, user-intuitive, easily configurable and capable of comprehensive expansion. PLAN is freely available to academic users at http://bioinfo.noble.org/plan/. The source code for local deployment is provided under free license. Full support for system utilization, installation, configuration and customization is provided to academic users.
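
    The abstract does not say how PLAN filters results internally, so the following is only a minimal sketch of the general idea of post-BLAST filtering. It assumes the hits are in the standard 12-column BLAST tabular layout; the function name, thresholds and sample rows are hypothetical.

```python
import csv
import io

# Standard 12-column layout of BLAST tabular output.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(handle, max_evalue=1e-5, min_identity=90.0):
    """Yield hits (as dicts) that pass the e-value and identity thresholds.

    The thresholds are illustrative defaults, not values taken from PLAN.
    """
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        hit["pident"] = float(hit["pident"])
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        if hit["evalue"] <= max_evalue and hit["pident"] >= min_identity:
            yield hit


# Hypothetical hits, made up for illustration only.
SAMPLE = (
    "q1\tsubjA\t98.5\t200\t3\t0\t1\t200\t11\t210\t1e-50\t350\n"
    "q1\tsubjB\t75.0\t120\t30\t2\t5\t124\t40\t159\t2e-04\t60.2\n"
    "q2\tsubjC\t92.0\t150\t12\t0\t1\t150\t1\t150\t3e-30\t210\n"
)

kept = list(filter_hits(io.StringIO(SAMPLE)))
for hit in kept:
    print(hit["qseqid"], hit["sseqid"], hit["evalue"])
```

    Keeping each hit as a dictionary keyed by column name makes it straightforward to add further criteria, such as a minimum alignment length, without changing the parsing step.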

  16. PLAN: a web platform for automating high-throughput BLAST searches and for managing and mining results

    PubMed Central

    He, Ji; Dai, Xinbin; Zhao, Xuechun

    2007-01-01

    Background BLAST searches are widely used for sequence alignment. The search results are commonly adopted for various functional and comparative genomics tasks such as annotating unknown sequences, investigating gene models and comparing two sequence sets. Advances in sequencing technologies pose challenges for high-throughput analysis of large-scale sequence data. A number of programs and hardware solutions exist for efficient BLAST searching, but there is a lack of generic software solutions for mining and personalized management of the results. Systematically reviewing the results and identifying information of interest remains tedious and time-consuming. Results Personal BLAST Navigator (PLAN) is a versatile web platform that helps users to carry out various personalized pre- and post-BLAST tasks, including: (1) query and target sequence database management, (2) automated high-throughput BLAST searching, (3) indexing and searching of results, (4) filtering results online, (5) managing results of personal interest in favorite categories, (6) automated sequence annotation (such as NCBI NR and ontology-based annotation). PLAN integrates, by default, the Decypher hardware-based BLAST solution provided by Active Motif Inc. with a greatly improved efficiency over conventional BLAST software. BLAST results are visualized by spreadsheets and graphs and are full-text searchable. BLAST results and sequence annotations can be exported, in part or in full, in various formats including Microsoft Excel and FASTA. Sequences and BLAST results are organized in projects, the data publication levels of which are controlled by the registered project owners. In addition, all analytical functions are provided to public users without registration. Conclusion PLAN has proved a valuable addition to the community for automated high-throughput BLAST searches, and, more importantly, for knowledge discovery, management and sharing based on sequence alignment results. 
The PLAN web interface is platform-independent, user-intuitive, easily configurable and capable of comprehensive expansion. PLAN is freely available to academic users at http://bioinfo.noble.org/plan/. The source code for local deployment is provided under free license. Full support for system utilization, installation, configuration and customization is provided to academic users. PMID:17291345

  17. High-Throughput Next-Generation Sequencing of Polioviruses

    PubMed Central

    Montmayeur, Anna M.; Schmidt, Alexander; Zhao, Kun; Magaña, Laura; Iber, Jane; Castro, Christina J.; Chen, Qi; Henderson, Elizabeth; Ramos, Edward; Shaw, Jing; Tatusov, Roman L.; Dybdahl-Sissoko, Naomi; Endegue-Zanga, Marie Claire; Adeniji, Johnson A.; Oberste, M. Steven; Burns, Cara C.

    2016-01-01

    ABSTRACT The poliovirus (PV) is currently targeted for worldwide eradication and containment. Sanger-based sequencing of the viral protein 1 (VP1) capsid region is currently the standard method for PV surveillance. However, the whole-genome sequence is sometimes needed for higher resolution global surveillance. In this study, we optimized whole-genome sequencing protocols for poliovirus isolates and FTA cards using next-generation sequencing (NGS), aiming for high sequence coverage, efficiency, and throughput. We found that DNase treatment of poliovirus RNA followed by random reverse transcription (RT), amplification, and the use of the Nextera XT DNA library preparation kit produced significantly better results than other preparations. The average viral reads per total reads, a measurement of efficiency, was as high as 84.2% ± 15.6%. PV genomes covering >99 to 100% of the reference length were obtained and validated with Sanger sequencing. A total of 52 PV genomes were generated, multiplexing as many as 64 samples in a single Illumina MiSeq run. This high-throughput, sequence-independent NGS approach facilitated the detection of a diverse range of PVs, especially for those in vaccine-derived polioviruses (VDPV), circulating VDPV, or immunodeficiency-related VDPV. In contrast to results from previous studies on other viruses, our results showed that filtration and nuclease treatment did not discernibly increase the sequencing efficiency of PV isolates. However, DNase treatment after nucleic acid extraction to remove host DNA significantly improved the sequencing results. This NGS method has been successfully implemented to generate PV genomes for molecular epidemiology of the most recent PV isolates. Additionally, the ability to obtain full PV genomes from FTA cards will aid in facilitating global poliovirus surveillance. PMID:27927929
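
    The efficiency measure named in the abstract, viral reads per total reads, can be illustrated with a short sketch. The sample names and read counts below are hypothetical and are not data from the study; the summary is assumed to be a mean with a sample standard deviation.

```python
from statistics import mean, stdev


def viral_read_fraction(viral_reads, total_reads):
    """Return the fraction of reads in a sample that map to the virus."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return viral_reads / total_reads


# Hypothetical (viral reads, total reads) per sample, for illustration only.
samples = {
    "sample_A": (900, 1000),
    "sample_B": (500, 1000),
    "sample_C": (700, 1000),
}

fractions = [viral_read_fraction(v, t) for v, t in samples.values()]
avg = mean(fractions)
sd = stdev(fractions)  # sample standard deviation
print(f"efficiency: {100 * avg:.1f}% +/- {100 * sd:.1f}%")
```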

  18. Whole-Genome Sequencing and Assembly with High-Throughput, Short-Read Technologies

    PubMed Central

    Sundquist, Andreas; Ronaghi, Mostafa; Tang, Haixu; Pevzner, Pavel; Batzoglou, Serafim

    2007-01-01

    While recently developed short-read sequencing technologies may dramatically reduce the sequencing cost and eventually achieve the $1000 goal for re-sequencing, their limitations prevent the de novo sequencing of eukaryotic genomes with the standard shotgun sequencing protocol. We present SHRAP (SHort Read Assembly Protocol), a sequencing protocol and assembly methodology that utilizes high-throughput short-read technologies. We describe a variation on hierarchical sequencing with two crucial differences: (1) we select a clone library from the genome randomly rather than as a tiling path and (2) we sample clones from the genome at high coverage and reads from the clones at low coverage. We assume that 200 bp read lengths with a 1% error rate and inexpensive random fragment cloning on whole mammalian genomes is feasible. Our assembly methodology is based on first ordering the clones and subsequently performing read assembly in three stages: (1) local assemblies of regions significantly smaller than a clone size, (2) clone-sized assemblies of the results of stage 1, and (3) chromosome-sized assemblies. By aggressively localizing the assembly problem during the first stage, our method succeeds in assembling short, unpaired reads sampled from repetitive genomes. We tested our assembler using simulated reads from D. melanogaster and human chromosomes 1, 11, and 21, and produced assemblies with large sets of contiguous sequence and a misassembly rate comparable to other draft assemblies. Tested on D. melanogaster and the entire human genome, our clone-ordering method produces accurate maps, thereby localizing fragment assembly and enabling the parallelization of the subsequent steps of our pipeline. Thus, we have demonstrated that truly inexpensive de novo sequencing of mammalian genomes will soon be possible with high-throughput, short-read technologies using our methodology. PMID:17534434

  19. A new arenavirus in a cluster of fatal transplant-associated diseases.

    PubMed

    Palacios, Gustavo; Druce, Julian; Du, Lei; Tran, Thomas; Birch, Chris; Briese, Thomas; Conlan, Sean; Quan, Phenix-Lan; Hui, Jeffrey; Marshall, John; Simons, Jan Fredrik; Egholm, Michael; Paddock, Christopher D; Shieh, Wun-Ju; Goldsmith, Cynthia S; Zaki, Sherif R; Catton, Mike; Lipkin, W Ian

    2008-03-06

    Three patients who received visceral-organ transplants from a single donor on the same day died of a febrile illness 4 to 6 weeks after transplantation. Culture, polymerase-chain-reaction (PCR) and serologic assays, and oligonucleotide microarray analysis for a wide range of infectious agents were not informative. We evaluated RNA obtained from the liver and kidney transplant recipients. Unbiased high-throughput sequencing was used to identify microbial sequences not found by means of other methods. The specificity of sequences for a new candidate pathogen was confirmed by means of culture and by means of PCR, immunohistochemical, and serologic analyses. High-throughput sequencing yielded 103,632 sequences, of which 14 represented an Old World arenavirus. Additional sequence analysis showed that this new arenavirus was related to lymphocytic choriomeningitis viruses. Specific PCR assays based on a unique sequence confirmed the presence of the virus in the kidneys, liver, blood, and cerebrospinal fluid of the recipients. Immunohistochemical analysis revealed arenavirus antigen in the liver and kidney transplants in the recipients. IgM and IgG antiviral antibodies were detected in the serum of the donor. Seroconversion was evident in serum specimens obtained from one recipient at two time points. Unbiased high-throughput sequencing is a powerful tool for the discovery of pathogens. The use of this method during an outbreak of disease facilitated the identification of a new arenavirus transmitted through solid-organ transplantation. Copyright 2008 Massachusetts Medical Society.
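
    To put the reported read counts in proportion, the arithmetic can be written out as a short sketch. Only the two counts come from the abstract; the variable names are illustrative.

```python
total_sequences = 103_632     # sequences reported in the abstract
arenavirus_sequences = 14     # sequences representing the Old World arenavirus

fraction = arenavirus_sequences / total_sequences
percent = 100 * fraction
one_in = total_sequences / arenavirus_sequences

print(f"{percent:.4f}% of sequences, roughly 1 in {one_in:.0f}")
```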

  20. Exploring fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing

    NASA Astrophysics Data System (ADS)

    Zhang, Xiao-Yong; Wang, Guang-Hua; Xu, Xin-Ya; Nong, Xu-Hua; Wang, Jie; Amin, Muhammad; Qi, Shu-Hua

    2016-10-01

    The present study investigated the fungal diversity in four different deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing of the nuclear ribosomal internal transcribed spacer-1 (ITS1). A total of 40,297 fungal ITS1 sequences clustered into 420 operational taxonomic units (OTUs) with 97% sequence similarity and 170 taxa were recovered from these sediments. Most ITS1 sequences (78%) belonged to the phylum Ascomycota, followed by Basidiomycota (17.3%), Zygomycota (1.5%) and Chytridiomycota (0.8%), and a small proportion (2.4%) belonged to unassigned fungal phyla. Compared with previous studies of fungal diversity in deep-sea sediments that used culture-dependent approaches and clone library analysis, the present results suggest that Illumina sequencing has dramatically accelerated the discovery of fungal communities in deep-sea sediments. Furthermore, our results revealed that Sordariomycetes was the most diverse and abundant fungal class in this study, challenging the traditional view that the diversity of Sordariomycetes phylotypes is low in deep-sea environments. In addition, more than 12 taxa, accounting for 21.5% of sequences, have rarely been reported as deep-sea fungi, suggesting that the deep-sea sediments from Okinawa Trough harbor a plethora of fungal communities that differ from those of other deep-sea environments. To our knowledge, this study is the first exploration of the fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing.
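
    As a quick consistency check on the phylum-level shares quoted in the abstract, the sketch below sums the percentages and converts them to approximate sequence counts. The shares are rounded in the abstract, so the derived counts are estimates, not reported values.

```python
import math

TOTAL_ITS1_SEQUENCES = 40_297  # from the abstract

# Percent of ITS1 sequences per group, as quoted in the abstract.
PHYLUM_SHARE_PERCENT = {
    "Ascomycota": 78.0,
    "Basidiomycota": 17.3,
    "Zygomycota": 1.5,
    "Chytridiomycota": 0.8,
    "unassigned": 2.4,
}

total_percent = sum(PHYLUM_SHARE_PERCENT.values())
shares_are_complete = math.isclose(total_percent, 100.0)

# Approximate counts only: the published shares are rounded.
approx_reads = {
    phylum: round(TOTAL_ITS1_SEQUENCES * share / 100)
    for phylum, share in PHYLUM_SHARE_PERCENT.items()
}

print(f"shares sum to {total_percent:.1f}%")
for phylum, n in approx_reads.items():
    print(f"{phylum}: ~{n} sequences")
```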

  1. DDBJ read annotation pipeline: a cloud computing-based pipeline for high-throughput analysis of next-generation sequencing data.

    PubMed

    Nagasaki, Hideki; Mochizuki, Takako; Kodama, Yuichi; Saruhashi, Satoshi; Morizaki, Shota; Sugawara, Hideaki; Ohyanagi, Hajime; Kurata, Nori; Okubo, Kousaku; Takagi, Toshihisa; Kaminuma, Eli; Nakamura, Yasukazu

    2013-08-01

    High-performance next-generation sequencing (NGS) technologies are advancing genomics and molecular biological research. However, the immense amount of sequence data requires computational skills and suitable hardware resources that are a challenge to molecular biologists. The DNA Data Bank of Japan (DDBJ) of the National Institute of Genetics (NIG) has initiated a cloud computing-based analytical pipeline, the DDBJ Read Annotation Pipeline (DDBJ Pipeline), for a high-throughput annotation of NGS reads. The DDBJ Pipeline offers a user-friendly graphical web interface and processes massive NGS datasets using decentralized processing by NIG supercomputers currently free of charge. The proposed pipeline consists of two analysis components: basic analysis for reference genome mapping and de novo assembly and subsequent high-level analysis of structural and functional annotations. Users may smoothly switch between the two components in the pipeline, facilitating web-based operations on a supercomputer for high-throughput data analysis. Moreover, public NGS reads of the DDBJ Sequence Read Archive located on the same supercomputer can be imported into the pipeline through the input of only an accession number. This proposed pipeline will facilitate research by utilizing unified analytical workflows applied to the NGS data. The DDBJ Pipeline is accessible at http://p.ddbj.nig.ac.jp/.

  2. DDBJ Read Annotation Pipeline: A Cloud Computing-Based Pipeline for High-Throughput Analysis of Next-Generation Sequencing Data

    PubMed Central

    Nagasaki, Hideki; Mochizuki, Takako; Kodama, Yuichi; Saruhashi, Satoshi; Morizaki, Shota; Sugawara, Hideaki; Ohyanagi, Hajime; Kurata, Nori; Okubo, Kousaku; Takagi, Toshihisa; Kaminuma, Eli; Nakamura, Yasukazu

    2013-01-01

    High-performance next-generation sequencing (NGS) technologies are advancing genomics and molecular biological research. However, the immense amount of sequence data requires computational skills and suitable hardware resources that are a challenge to molecular biologists. The DNA Data Bank of Japan (DDBJ) of the National Institute of Genetics (NIG) has initiated a cloud computing-based analytical pipeline, the DDBJ Read Annotation Pipeline (DDBJ Pipeline), for a high-throughput annotation of NGS reads. The DDBJ Pipeline offers a user-friendly graphical web interface and processes massive NGS datasets using decentralized processing by NIG supercomputers currently free of charge. The proposed pipeline consists of two analysis components: basic analysis for reference genome mapping and de novo assembly and subsequent high-level analysis of structural and functional annotations. Users may smoothly switch between the two components in the pipeline, facilitating web-based operations on a supercomputer for high-throughput data analysis. Moreover, public NGS reads of the DDBJ Sequence Read Archive located on the same supercomputer can be imported into the pipeline through the input of only an accession number. This proposed pipeline will facilitate research by utilizing unified analytical workflows applied to the NGS data. The DDBJ Pipeline is accessible at http://p.ddbj.nig.ac.jp/. PMID:23657089

  3. HybPiper: Extracting coding sequence and introns for phylogenetics from high-throughput sequencing reads using target enrichment

    PubMed Central

    Johnson, Matthew G.; Gardner, Elliot M.; Liu, Yang; Medina, Rafael; Goffinet, Bernard; Shaw, A. Jonathan; Zerega, Nyree J. C.; Wickett, Norman J.

    2016-01-01

    Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). Methods and Results: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus. Conclusions: HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper. PMID:27437175

  4. High-throughput sequencing reveals unprecedented diversities of Aspergillus species in outdoor air.

    PubMed

    Lee, S; An, C; Xu, S; Lee, S; Yamamoto, N

    2016-09-01

    This study used the Illumina MiSeq to analyse compositions and diversities of Aspergillus species in outdoor air. The seasonal air samplings were performed at two locations in Seoul, South Korea. The results showed the relative abundances of all Aspergillus species combined ranging from 0·20 to 18% and from 0·19 to 21% based on the number of the internal transcribed spacer 1 (ITS1) and β-tubulin (BenA) gene sequences respectively. Aspergillus fumigatus was the most dominant species with the mean relative abundances of 1·2 and 5·5% based on the number of the ITS1 and BenA sequences respectively. A total of 29 Aspergillus species were detected and identified down to the species rank, among which nine species were known opportunistic pathogens. Remarkably, eight of the nine pathogenic species were detected by either one of the two markers, suggesting the need to use multiple markers and/or primer pairs when assessments are based on high-throughput sequencing. Owing to the diversity of species within the genus Aspergillus, high-throughput sequencing was useful to characterize their compositions and diversities in outdoor air, which are thought to be difficult to characterize accurately by conventional culture and/or Sanger sequencing-based techniques. Aspergillus is a diverse genus of fungi with more than 300 species reported in the literature. Aspergillus is important since some species are known allergens and opportunistic human pathogens. Traditionally, growth-dependent methods have been used to detect Aspergillus species in air. However, these methods are limited in the number of isolates that can be analysed for their identities, resulting in inaccurate characterizations of Aspergillus diversities. This study used high-throughput sequencing to explore Aspergillus diversities in outdoor air, which are thought to be difficult to characterize accurately by traditional growth-dependent techniques. © 2016 The Society for Applied Microbiology.

  5. High Throughput Plasmid Sequencing with Illumina and CLC Bio

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Athavale, Ajay

    Ajay Athavale (Monsanto) presents "High Throughput Plasmid Sequencing with Illumina and CLC Bio" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  6. Environmental microbiology through the lens of high-throughput DNA sequencing: synopsis of current platforms and bioinformatics approaches.

    PubMed

    Logares, Ramiro; Haverkamp, Thomas H A; Kumar, Surendra; Lanzén, Anders; Nederbragt, Alexander J; Quince, Christopher; Kauserud, Håvard

    2012-10-01

    The incursion of High-Throughput Sequencing (HTS) in environmental microbiology brings unique opportunities and challenges. HTS now allows a high-resolution exploration of the vast taxonomic and metabolic diversity present in the microbial world, which can provide an exceptional insight on global ecosystem functioning, ecological processes and evolution. This exploration has also economic potential, as we will have access to the evolutionary innovation present in microbial metabolisms, which could be used for biotechnological development. HTS is also challenging the research community, and the current bottleneck is present in the data analysis side. At the moment, researchers are in a sequence data deluge, with sequencing throughput advancing faster than the computer power needed for data analysis. However, new tools and approaches are being developed constantly and the whole process could be depicted as a fast co-evolution between sequencing technology, informatics and microbiologists. In this work, we examine the most popular and recently commercialized HTS platforms as well as bioinformatics methods for data handling and analysis used in microbial metagenomics. This non-exhaustive review is intended to serve as a broad state-of-the-art guide to researchers expanding into this rapidly evolving field. Copyright © 2012 Elsevier B.V. All rights reserved.

  7. Bacterial Pathogens and Community Composition in Advanced Sewage Treatment Systems Revealed by Metagenomics Analysis Based on High-Throughput Sequencing

    PubMed Central

    Lu, Xin; Zhang, Xu-Xiang; Wang, Zhu; Huang, Kailong; Wang, Yuan; Liang, Weigang; Tan, Yunfei; Liu, Bo; Tang, Junying

    2015-01-01

    This study used 454 pyrosequencing, Illumina high-throughput sequencing and metagenomic analysis to investigate bacterial pathogens and their potential virulence in a sewage treatment plant (STP) applying both conventional and advanced treatment processes. Pyrosequencing and Illumina sequencing consistently demonstrated that the genus Arcobacter accounted for over 43.42% of the total abundance of potential pathogens in the STP. At the species level, the potential pathogens Arcobacter butzleri, Aeromonas hydrophila and Klebsiella pneumoniae dominated in raw sewage, which was also confirmed by quantitative real-time PCR. Illumina sequencing also revealed the prevalence of various types of pathogenicity islands and virulence proteins in the STP. Most of the potential pathogens and virulence factors were eliminated in the STP, and the removal efficiency mainly depended on the oxidation ditch. Compared with sand filtration, magnetic resin seemed to achieve higher removal of most of the potential pathogens and virulence factors. However, the presence of residual A. butzleri in the final effluent still deserves more concern. The findings indicate that sewage acts as an important source of environmental pathogens, but STPs can effectively control their spread in the environment. Joint use of the high-throughput sequencing technologies is considered a reliable method for a deep and comprehensive overview of environmental bacterial virulence. PMID:25938416

  8. Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing.

    PubMed

    Hu, Jiazhi; Meyers, Robin M; Dong, Junchao; Panchakshari, Rohit A; Alt, Frederick W; Frock, Richard L

    2016-05-01

    Unbiased, high-throughput assays for detecting and quantifying DNA double-stranded breaks (DSBs) across the genome in mammalian cells will facilitate basic studies of the mechanisms that generate and repair endogenous DSBs. They will also enable more applied studies, such as those to evaluate the on- and off-target activities of engineered nucleases. Here we describe a linear amplification-mediated high-throughput genome-wide translocation sequencing (LAM-HTGTS) method for the detection of genome-wide 'prey' DSBs via their translocation in cultured mammalian cells to a fixed 'bait' DSB. Bait-prey junctions are cloned directly from isolated genomic DNA using LAM-PCR and unidirectionally ligated to bridge adapters; subsequent PCR steps amplify the single-stranded DNA junction library in preparation for Illumina MiSeq paired-end sequencing. A custom bioinformatics pipeline identifies prey sequences that contribute to junctions and maps them across the genome. LAM-HTGTS differs from related approaches because it detects a wide range of broken end structures with nucleotide-level resolution. Familiarity with nucleic acid methods and next-generation sequencing analysis is necessary for library generation and data interpretation. LAM-HTGTS assays are sensitive, reproducible, relatively inexpensive, scalable and straightforward to implement with a turnaround time of <1 week.

  9. High-Throughput Mapping of Single-Neuron Projections by Sequencing of Barcoded RNA.

    PubMed

    Kebschull, Justus M; Garcia da Silva, Pedro; Reid, Ashlan P; Peikon, Ian D; Albeanu, Dinu F; Zador, Anthony M

    2016-09-07

    Neurons transmit information to distant brain regions via long-range axonal projections. In the mouse, area-to-area connections have only been systematically mapped using bulk labeling techniques, which obscure the diverse projections of intermingled single neurons. Here we describe MAPseq (Multiplexed Analysis of Projections by Sequencing), a technique that can map the projections of thousands or even millions of single neurons by labeling large sets of neurons with random RNA sequences ("barcodes"). Axons are filled with barcode mRNA, each putative projection area is dissected, and the barcode mRNA is extracted and sequenced. Applying MAPseq to the locus coeruleus (LC), we find that individual LC neurons have preferred cortical targets. By recasting neuroanatomy, which is traditionally viewed as a problem of microscopy, as a problem of sequencing, MAPseq harnesses advances in sequencing technology to permit high-throughput interrogation of brain circuits. Copyright © 2016 Elsevier Inc. All rights reserved.
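
    The idea of reading projections from barcode counts can be sketched as follows. This is not the MAPseq analysis code; it assumes each sequenced read has already been reduced to a (barcode, area) pair, and the barcodes and area names are hypothetical.

```python
from collections import Counter, defaultdict


def count_barcodes(reads):
    """Build {barcode: Counter({area: n_reads})} from (barcode, area) pairs."""
    matrix = defaultdict(Counter)
    for barcode, area in reads:
        matrix[barcode][area] += 1
    return matrix


def projection_strengths(area_counts):
    """Normalize one barcode's read counts so they sum to 1 across areas."""
    total = sum(area_counts.values())
    return {area: n / total for area, n in area_counts.items()}


def preferred_target(area_counts):
    """Return the area in which this barcode was sequenced most often."""
    return max(area_counts, key=area_counts.get)


# Hypothetical reads: each barcode stands for one labeled neuron.
reads = [
    ("AAGT", "area_1"), ("AAGT", "area_1"), ("AAGT", "area_1"),
    ("AAGT", "area_2"),
    ("CCTA", "area_2"), ("CCTA", "area_2"),
]

matrix = count_barcodes(reads)
for barcode, counts in matrix.items():
    print(barcode, preferred_target(counts), projection_strengths(counts))
```

    Treating each barcode as a row and each dissected area as a column turns the anatomical question into a counting problem, which is the recasting the abstract describes.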

  10. Application of whole genome shotgun sequencing for detection and characterization of genetically modified organisms and derived products.

    PubMed

    Holst-Jensen, Arne; Spilsberg, Bjørn; Arulandhu, Alfred J; Kok, Esther; Shi, Jianxin; Zel, Jana

    2016-07-01

    The emergence of high-throughput, massive or next-generation sequencing technologies has created a completely new foundation for molecular analyses. Various selective enrichment processes are commonly applied to facilitate detection of predefined (known) targets. Such approaches, however, inevitably introduce a bias and are prone to miss unknown targets. Here we review the application of high-throughput sequencing technologies and the preparation of fit-for-purpose whole genome shotgun sequencing libraries for the detection and characterization of genetically modified organisms and derived products. The potential impact of these new sequencing technologies for the characterization, breeding selection, risk assessment, and traceability of genetically modified organisms and genetically modified products is yet to be fully acknowledged. The published literature is reviewed, and the prospects for future developments and use of the new sequencing technologies for these purposes are discussed.

  11. Development of a high-throughput SNP resource to advance genomic, genetic and breeding research in carrot (Daucus carota L.)

    USDA-ARS's Scientific Manuscript database

    The rapid advancement in high-throughput SNP genotyping technologies along with next generation sequencing (NGS) platforms has decreased the cost, improved the quality of large-scale genome surveys, and allowed specialty crops with limited genomic resources such as carrot (Daucus carota) to access t...

  12. High-throughput tetrad analysis.

    PubMed

    Ludlow, Catherine L; Scott, Adrian C; Cromie, Gareth A; Jeffery, Eric W; Sirr, Amy; May, Patrick; Lin, Jake; Gilbert, Teresa L; Hays, Michelle; Dudley, Aimée M

    2013-07-01

    Tetrad analysis has been a gold-standard genetic technique for several decades. Unfortunately, the need to manually isolate, disrupt and space tetrads has relegated its application to small-scale studies and limited its integration with high-throughput DNA sequencing technologies. We have developed a rapid, high-throughput method, called barcode-enabled sequencing of tetrads (BEST), that uses (i) a meiosis-specific GFP fusion protein to isolate tetrads by FACS and (ii) molecular barcodes that are read during genotyping to identify spores derived from the same tetrad. Maintaining tetrad information allows accurate inference of missing genetic markers and full genotypes of missing (and presumably nonviable) individuals. An individual researcher was able to isolate over 3,000 yeast tetrads in 3 h, an output equivalent to that of almost 1 month of manual dissection. BEST is transferable to other microorganisms for which meiotic mapping is significantly more laborious.
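
    The inference of a missing spore's genotype from its tetrad can be illustrated with a small sketch. It assumes biallelic markers that segregate 2:2 within each tetrad (no gene conversion) and genotypes written as one character per marker; the barcodes, allele symbols and function names are hypothetical and not taken from BEST.

```python
from collections import Counter, defaultdict


def group_by_tetrad(spores):
    """Group (tetrad_barcode, genotype) pairs into {barcode: [genotypes]}."""
    tetrads = defaultdict(list)
    for barcode, genotype in spores:
        tetrads[barcode].append(genotype)
    return dict(tetrads)


def infer_fourth_spore(genotypes, alleles=("A", "a")):
    """Infer the missing spore of a tetrad from the three recovered spores.

    Under 2:2 segregation, the allele seen only once among three spores
    must be the allele carried by the fourth.
    """
    if len(genotypes) != 3:
        raise ValueError("exactly three recovered spores are required")
    inferred = []
    for calls in zip(*genotypes):  # iterate marker by marker
        counts = Counter(calls)
        missing = [allele for allele in alleles if counts[allele] == 1]
        if len(missing) != 1:
            raise ValueError("marker does not fit 2:2 segregation")
        inferred.append(missing[0])
    return "".join(inferred)


# Hypothetical genotyped spores; one spore of tetrad "T1" was not recovered.
spores = [("T1", "AaA"), ("T1", "Aaa"), ("T1", "aAA")]
tetrads = group_by_tetrad(spores)
fourth = infer_fourth_spore(tetrads["T1"])
print(fourth)
```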

  13. A Robust Framework for Microbial Archaeology

    PubMed Central

    Warinner, Christina; Herbig, Alexander; Mann, Allison; Yates, James A. Fellows; Weiß, Clemens L.; Burbano, Hernán A.; Orlando, Ludovic; Krause, Johannes

    2017-01-01

    Microbial archaeology is flourishing in the era of high-throughput sequencing, revealing the agents behind devastating historical plagues, identifying the cryptic movements of pathogens in prehistory, and reconstructing the ancestral microbiota of humans. Here, we introduce the fundamental concepts and theoretical framework of the discipline, then discuss applied methodologies for pathogen identification and microbiome characterization from archaeological samples. We give special attention to the process of identifying, validating, and authenticating ancient microbes using high-throughput DNA sequencing data. Finally, we outline standards and precautions to guide future research in the field. PMID:28460196

  14. High throughput 16SrRNA gene sequencing reveals the correlation between Propionibacterium acnes and sarcoidosis.

    PubMed

    Zhao, Meng-Meng; Du, Shan-Shan; Li, Qiu-Hong; Chen, Tao; Qiu, Hui; Wu, Qin; Chen, Shan-Shan; Zhou, Ying; Zhang, Yuan; Hu, Yang; Su, Yi-Liang; Shen, Li; Zhang, Fen; Weng, Dong; Li, Hui-Ping

    2017-02-01

    This study aims to use high throughput 16S rRNA gene sequencing to examine the bacterial profile of lymph node biopsy samples of patients with sarcoidosis and to further verify the association between Propionibacterium acnes (P. acnes) and sarcoidosis. A total of 36 mediastinal lymph node biopsy specimens were collected from 17 cases of sarcoidosis, 8 tuberculosis (TB group), and 11 non-infectious lung diseases (control group). The V4 region of the bacterial 16S rRNA gene in the specimens was amplified and sequenced using the high throughput sequencing platform MiSeq, and a bacterial profile was established. The data analysis software QIIME and Metastats were used to compare bacterial relative abundance in the three patient groups. Overall, 545 genera were identified; 38 showed significantly lower and 29 had significantly higher relative abundance in the sarcoidosis group than in the TB and control groups (P < 0.01). P. acnes 16S rRNA was found in all 17 samples of the sarcoidosis group, whereas it was not detected in the TB and control groups. The relative abundance of P. acnes in the sarcoidosis group (0.16% ± 0.11%) was significantly higher than that in the TB (Metastats analysis: P = 0.0010, q = 0.0044) and control groups (Metastats analysis: P = 0.0010, q = 0.0038). The relative abundance of P. granulosum was only 0.0022% ± 0.0044% in the sarcoidosis group. P. granulosum 16S rRNA was not detected in the other two groups. High throughput 16S rRNA gene sequencing appears to be a useful tool to investigate the bacterial profile of sarcoidosis specimens. The results suggest that P. acnes may be involved in sarcoidosis development.

  15. High-throughput sequencing in veterinary infection biology and diagnostics.

    PubMed

    Belák, S; Karlsson, O E; Leijon, M; Granberg, F

    2013-12-01

    Sequencing methods have improved rapidly since the first versions of the Sanger techniques, facilitating the development of very powerful tools for detecting and identifying various pathogens, such as viruses, bacteria and other microbes. The ongoing development of high-throughput sequencing (HTS; also known as next-generation sequencing) technologies has resulted in a dramatic reduction in DNA sequencing costs, making the technology more accessible to the average laboratory. In this White Paper of the World Organisation for Animal Health (OIE) Collaborating Centre for the Biotechnology-based Diagnosis of Infectious Diseases in Veterinary Medicine (Uppsala, Sweden), several approaches and examples of HTS are summarised, and their diagnostic applicability is briefly discussed. Selected future aspects of HTS are outlined, including the need for bioinformatic resources, with a focus on improving the diagnosis and control of infectious diseases in veterinary medicine.

  16. DnaSAM: Software to perform neutrality testing for large datasets with complex null models.

    PubMed

    Eckert, Andrew J; Liechty, John D; Tearse, Brandon R; Pande, Barnaly; Neale, David B

    2010-05-01

    Patterns of DNA sequence polymorphisms can be used to understand the processes of demography and adaptation within natural populations. High-throughput generation of DNA sequence data has historically been the bottleneck with respect to data processing and experimental inference. Advances in marker technologies have largely solved this problem. Currently, the limiting step is computational, with most molecular population genetic software allowing a gene-by-gene analysis through a graphical user interface. An easy-to-use analysis program that allows both high-throughput processing of multiple sequence alignments and the flexibility to simulate data under complex demographic scenarios is currently lacking. We introduce a new program, named DnaSAM, which allows high-throughput estimation of DNA sequence diversity and neutrality statistics from experimental data along with the ability to test those statistics via Monte Carlo coalescent simulations. These simulations are conducted using the ms program, which is able to incorporate several genetic parameters (e.g. recombination) and demographic scenarios (e.g. population bottlenecks). The output is a set of diversity and neutrality statistics with associated probability values under a user-specified null model that are stored in an easy-to-manipulate text file. © 2009 Blackwell Publishing Ltd.
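
    As a rough illustration of the kind of diversity statistics mentioned above, the following Python sketch computes the number of segregating sites, nucleotide diversity (pi), and Watterson's theta from a small gap-free alignment. The function names and the toy alignment are assumptions made for this example; the abstract does not state which statistics DnaSAM reports or how it computes them.

```python
from itertools import combinations


def segregating_sites(alignment):
    """Count alignment columns that contain more than one distinct base."""
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


def nucleotide_diversity(alignment):
    """Average number of pairwise differences per site (pi)."""
    pairs = list(combinations(alignment, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / len(pairs) / len(alignment[0])


def watterson_theta(alignment):
    """Watterson's estimator per site: S / (a_n * L), a_n = sum of 1/i for i < n."""
    n = len(alignment)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(alignment) / a_n / len(alignment[0])


# Hypothetical toy alignment: four sequences, eight sites, no gaps.
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
print(segregating_sites(toy), nucleotide_diversity(toy), watterson_theta(toy))
```

    The sketch assumes equal-length sequences without gaps or ambiguity codes; significance testing against coalescent simulations, as described in the abstract, is not shown.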

  17. Ultra-high-throughput microarray generation and liquid dispensing using multiple disposable piezoelectric ejectors.

    PubMed

    Hsieh, Huangpin Ben; Fitch, John; White, Dave; Torres, Frank; Roy, Joy; Matusiak, Robert; Krivacic, Bob; Kowalski, Bob; Bruce, Richard; Elrod, Scott

    2004-03-01

    The authors have constructed an array of 12 piezoelectric ejectors for printing biological materials. A single-ejector footprint is 8 mm in diameter, standing 4 mm high with 2 reservoirs totaling 76 µL. These ejectors have been tested by dispensing various fluids in several environmental conditions. Reliable drop ejection can be expected in both humidity-controlled and ambient environments over extended periods of time and in hot and cold room temperatures. In a prototype system, 12 ejectors are arranged in a rack, together with an X-Y stage, to allow printing any pattern desired. Printed arrays of features are created with a biological solution containing bovine serum albumin conjugated oligonucleotides, dye, and salty buffer. This ejector system is designed for the ultra-high-throughput generation of arrays on a variety of surfaces. These single or racked ejectors could be used as long-term storage vessels for materials such as small molecules, nucleic acids, proteins, or cell libraries, which would allow for efficient preprogrammed selection of individual clones and greatly reduce the chance of cross-contamination and loss due to transfer. A new generation of design ideas includes plastic injection molded ejectors that are inexpensive and disposable and handheld personal pipettes for liquid transfer in the nanoliter regime.

  18. A comprehensive insight into bacterial virulence in drinking water using 454 pyrosequencing and Illumina high-throughput sequencing.

    PubMed

    Huang, Kailong; Zhang, Xu-Xiang; Shi, Peng; Wu, Bing; Ren, Hongqiang

    2014-11-01

    In order to comprehensively investigate bacterial virulence in drinking water, 454 pyrosequencing and Illumina high-throughput sequencing were used to detect potential pathogenic bacteria and virulence factors (VFs) in a full-scale drinking water treatment and distribution system. 16S rRNA gene pyrosequencing revealed high bacterial diversity in the drinking water (441-586 operational taxonomic units). Bacterial diversity decreased after chlorine disinfection, but increased after pipeline distribution. α-Proteobacteria was the most dominant taxonomic class. Alignment against the established pathogen database showed that several types of putative pathogens were present in the drinking water and Pseudomonas aeruginosa had the highest abundance (over 11‰ of total sequencing reads). Many pathogens disappeared after chlorine disinfection, but P. aeruginosa and Leptospira interrogans were still detected in the tap water. High-throughput sequencing revealed prevalence of various pathogenicity islands and virulence proteins in the drinking water, and translocases, transposons, Clp proteases and flagellar motor switch proteins were the predominant VFs. Both diversity and abundance of the detectable VFs increased after the chlorination, and decreased after the pipeline distribution. This study indicates that joint use of 454 pyrosequencing and Illumina sequencing can comprehensively characterize environmental pathogenesis, and several types of putative pathogens and various VFs are prevalent in drinking water. Copyright © 2014 Elsevier Inc. All rights reserved.

  19. Digital gene expression for non-model organisms

    PubMed Central

    Hong, Lewis Z.; Li, Jun; Schmidt-Küntzel, Anne; Warren, Wesley C.; Barsh, Gregory S.

    2011-01-01

    Next-generation sequencing technologies offer new approaches for global measurements of gene expression but are mostly limited to organisms for which a high-quality assembled reference genome sequence is available. We present a method for gene expression profiling called EDGE, or EcoP15I-tagged Digital Gene Expression, based on ultra-high-throughput sequencing of 27-bp cDNA fragments that uniquely tag the corresponding gene, thereby allowing direct quantification of transcript abundance. We show that EDGE is capable of assaying for expression in >99% of genes in the genome and achieves saturation after 6–8 million reads. EDGE exhibits very little technical noise, reveals a large (10^6) dynamic range of gene expression, and is particularly suited for quantification of transcript abundance in non-model organisms where a high-quality annotated genome is not available. In a direct comparison with RNA-seq, both methods provide similar assessments of relative transcript abundance, but EDGE does better at detecting gene expression differences for poorly expressed genes and does not exhibit transcript length bias. Applying EDGE to laboratory mice, we show that a loss-of-function mutation in the melanocortin 1 receptor (Mc1r), recognized as a Mendelian determinant of yellow hair color in many different mammals, also causes reduced expression of genes involved in the interferon response. To illustrate the application of EDGE to a non-model organism, we examine skin biopsy samples from a cheetah (Acinonyx jubatus) and identify genes likely to control differences in the color of spotted versus non-spotted regions. PMID:21844123

  20. Assembly and diploid architecture of an individual human genome via single-molecule technologies

    PubMed Central

    Pendleton, Matthew; Sebra, Robert; Pang, Andy Wing Chun; Ummat, Ajay; Franzen, Oscar; Rausch, Tobias; Stütz, Adrian M; Stedman, William; Anantharaman, Thomas; Hastie, Alex; Dai, Heng; Fritz, Markus Hsi-Yang; Cao, Han; Cohain, Ariella; Deikus, Gintaras; Durrett, Russell E; Blanchard, Scott C; Altman, Roger; Chin, Chen-Shan; Guo, Yan; Paxinos, Ellen E; Korbel, Jan O; Darnell, Robert B; McCombie, W Richard; Kwok, Pui-Yan; Mason, Christopher E; Schadt, Eric E; Bashir, Ali

    2015-01-01

    We present the first comprehensive analysis of a diploid human genome that combines single-molecule sequencing with single-molecule genome maps. Our hybrid assembly markedly improves upon the contiguity observed from traditional shotgun sequencing approaches, with scaffold N50 values approaching 30 Mb, and we identified complex structural variants (SVs) missed by other high-throughput approaches. Furthermore, by combining Illumina short-read data with long reads, we phased both single-nucleotide variants and SVs, generating haplotypes with over 99% consistency with previous trio-based studies. Our work shows that it is now possible to integrate single-molecule and high-throughput sequence data to generate de novo assembled genomes that approach reference quality. PMID:26121404
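
    The scaffold N50 quoted above is a standard contiguity metric: the length L such that scaffolds of length L or longer contain at least half of the assembled bases. A minimal Python sketch of that definition follows; the function name and the example lengths are illustrative and are not taken from the paper.

```python
def n50(lengths):
    """Return the length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases until half is reached.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths in arbitrary units.
example = [80, 70, 50, 40, 30, 20]
print(n50(example))
```

    For the example list, the two longest scaffolds (80 and 70) already cover more than half of the 290 total, so the N50 is 70.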

  1. Assembly and diploid architecture of an individual human genome via single-molecule technologies.

    PubMed

    Pendleton, Matthew; Sebra, Robert; Pang, Andy Wing Chun; Ummat, Ajay; Franzen, Oscar; Rausch, Tobias; Stütz, Adrian M; Stedman, William; Anantharaman, Thomas; Hastie, Alex; Dai, Heng; Fritz, Markus Hsi-Yang; Cao, Han; Cohain, Ariella; Deikus, Gintaras; Durrett, Russell E; Blanchard, Scott C; Altman, Roger; Chin, Chen-Shan; Guo, Yan; Paxinos, Ellen E; Korbel, Jan O; Darnell, Robert B; McCombie, W Richard; Kwok, Pui-Yan; Mason, Christopher E; Schadt, Eric E; Bashir, Ali

    2015-08-01

    We present the first comprehensive analysis of a diploid human genome that combines single-molecule sequencing with single-molecule genome maps. Our hybrid assembly markedly improves upon the contiguity observed from traditional shotgun sequencing approaches, with scaffold N50 values approaching 30 Mb, and we identified complex structural variants (SVs) missed by other high-throughput approaches. Furthermore, by combining Illumina short-read data with long reads, we phased both single-nucleotide variants and SVs, generating haplotypes with over 99% consistency with previous trio-based studies. Our work shows that it is now possible to integrate single-molecule and high-throughput sequence data to generate de novo assembled genomes that approach reference quality.

  2. HTSstation: a web application and open-access libraries for high-throughput sequencing data analysis.

    PubMed

    David, Fabrice P A; Delafontaine, Julien; Carat, Solenne; Ross, Frederick J; Lefebvre, Gregory; Jarosz, Yohan; Sinclair, Lucas; Noordermeer, Daan; Rougemont, Jacques; Leleu, Marion

    2014-01-01

    The HTSstation analysis portal is a suite of simple web forms coupled to modular analysis pipelines for various applications of High-Throughput Sequencing including ChIP-seq, RNA-seq, 4C-seq and re-sequencing. HTSstation offers biologists the possibility to rapidly investigate their HTS data using an intuitive web application with heuristically pre-defined parameters. A number of open-source software components have been implemented and can be used to build, configure and run HTS analysis pipelines reactively. Besides, our programming framework empowers developers with the possibility to design their own workflows and integrate additional third-party software. The HTSstation web application is accessible at http://htsstation.epfl.ch.

  3. HTSstation: A Web Application and Open-Access Libraries for High-Throughput Sequencing Data Analysis

    PubMed Central

    David, Fabrice P. A.; Delafontaine, Julien; Carat, Solenne; Ross, Frederick J.; Lefebvre, Gregory; Jarosz, Yohan; Sinclair, Lucas; Noordermeer, Daan; Rougemont, Jacques; Leleu, Marion

    2014-01-01

    The HTSstation analysis portal is a suite of simple web forms coupled to modular analysis pipelines for various applications of High-Throughput Sequencing including ChIP-seq, RNA-seq, 4C-seq and re-sequencing. HTSstation offers biologists the possibility to rapidly investigate their HTS data using an intuitive web application with heuristically pre-defined parameters. A number of open-source software components have been implemented and can be used to build, configure and run HTS analysis pipelines reactively. Besides, our programming framework empowers developers with the possibility to design their own workflows and integrate additional third-party software. The HTSstation web application is accessible at http://htsstation.epfl.ch. PMID:24475057

  4. High-Throughput rRNA Gene Sequencing Reveals High and Complex Bacterial Diversity Associated with Brazilian Coffee Bean Fermentation

    PubMed Central

    Vinícius de Melo, Gilberto

    2018-01-01

    Coffee bean fermentation is a spontaneous, on-farm process involving the action of different microbial groups, including bacteria and fungi. In this study, a high-throughput sequencing approach was employed to study the diversity and dynamics of bacteria associated with Brazilian coffee bean fermentation. The total DNA from fermenting coffee samples was extracted at different time points, and the 16S rRNA gene with segments around the V4 variable region was sequenced on an Illumina high-throughput platform. Using this approach, the presence of over eighty bacterial genera was determined, many of which have been detected for the first time during coffee bean fermentation, including Fructobacillus, Pseudonocardia, Pedobacter, Sphingomonas and Hymenobacter. The presence of Fructobacillus suggests an influence of these bacteria on fructose metabolism during coffee fermentation. Temporal analysis showed a strong dominance of lactic acid bacteria with over 97% of read sequences at the end of fermentation, mainly represented by Leuconostoc and Lactococcus. Metabolism of lactic acid bacteria was associated with the high formation of lactic acid during fermentation, as determined by HPLC analysis. The results reported in this study confirm the underestimation of bacterial diversity associated with coffee fermentation. New microbial groups reported in this study may be explored as functional starter cultures for on-farm coffee processing.

  5. Heat*seq: an interactive web tool for high-throughput sequencing experiment comparison with public data.

    PubMed

    Devailly, Guillaume; Mantsoki, Anna; Joshi, Anagha

    2016-11-01

    Better protocols and decreasing costs have made high-throughput sequencing experiments now accessible even to small experimental laboratories. However, comparing one or few experiments generated by an individual lab to the vast amount of relevant data freely available in the public domain might be limited due to lack of bioinformatics expertise. Though several tools, including genome browsers, allow such comparison at a single gene level, they do not provide a genome-wide view. We developed Heat*seq, a web-tool that allows genome scale comparison of high throughput experiments (chromatin immuno-precipitation followed by sequencing, RNA-sequencing and Cap Analysis of Gene Expression) provided by a user, to the data in the public domain. Heat*seq currently contains over 12 000 experiments across diverse tissues and cell types in human, mouse and drosophila. Heat*seq displays interactive correlation heatmaps, with an ability to dynamically subset datasets to contextualize user experiments. High quality figures and tables are produced and can be downloaded in multiple formats. Web application: http://www.heatstarseq.roslin.ed.ac.uk/ Source code: https://github.com/gdevailly Contact: Guillaume.Devailly@roslin.ed.ac.uk or Anagha.Joshi@roslin.ed.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  6. RAMICS: trainable, high-speed and biologically relevant alignment of high-throughput sequencing reads to coding DNA

    PubMed Central

    Wright, Imogen A.; Travers, Simon A.

    2014-01-01

    The challenge presented by high-throughput sequencing necessitates the development of novel tools for accurate alignment of reads to reference sequences. Current approaches focus on using heuristics to map reads quickly to large genomes, rather than generating highly accurate alignments in coding regions. Such approaches are, thus, unsuited for applications such as amplicon-based analysis and the realignment phase of exome sequencing and RNA-seq, where accurate and biologically relevant alignment of coding regions is critical. To facilitate such analyses, we have developed a novel tool, RAMICS, that is tailored to mapping large numbers of sequence reads to short lengths (<10 000 bp) of coding DNA. RAMICS utilizes profile hidden Markov models to discover the open reading frame of each sequence and aligns to the reference sequence in a biologically relevant manner, distinguishing between genuine codon-sized indels and frameshift mutations. This approach facilitates the generation of highly accurate alignments, accounting for the error biases of the sequencing machine used to generate reads, particularly at homopolymer regions. Performance improvements are gained through the use of graphics processing units, which increase the speed of mapping through parallelization. RAMICS substantially outperforms all other mapping approaches tested in terms of alignment quality while maintaining highly competitive speed performance. PMID:24861618
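
    The distinction drawn above between codon-sized indels and frameshift mutations comes down to whether a gap length is a multiple of three. The Python sketch below applies only that criterion to an already aligned sequence; it is not RAMICS's profile hidden Markov model approach, and the function name and example string are assumptions made for illustration.

```python
import re


def classify_gaps(aligned_seq):
    """Return (start, length, label) for each run of '-' in an aligned coding sequence."""
    results = []
    for match in re.finditer(r"-+", aligned_seq):
        length = match.end() - match.start()
        # A gap whose length is a multiple of 3 preserves the reading frame.
        label = "codon-sized" if length % 3 == 0 else "frameshift"
        results.append((match.start(), length, label))
    return results


# Hypothetical aligned read: one 3-base gap and one 1-base gap.
print(classify_gaps("ATG---GCC-TAA"))
```

    A real aligner would also weigh whether a short gap near a homopolymer run is more likely a sequencing error than a genuine mutation, which this sketch ignores.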

  7. LSGermOPA, a custom OPA of 384 EST-derived SNPs for high-throughput lettuce (Lactuca sativa L.) germplasm fingerprinting

    USDA-ARS?s Scientific Manuscript database

    We assessed the genetic diversity and population structure among 148 cultivated lettuce (Lactuca sativa L.) accessions using the high-throughput GoldenGate assay and 384 EST (Expressed Sequence Tag)-derived SNP (single nucleotide polymorphism) markers. A custom OPA (Oligo Pool All), LSGermOPA was fo...

  8. Genome sequencing of a single tardigrade Hypsibius dujardini individual

    PubMed Central

    Arakawa, Kazuharu; Yoshida, Yuki; Tomita, Masaru

    2016-01-01

    Tardigrades are ubiquitous microscopic animals that play an important role in the study of metazoan phylogeny. Most terrestrial tardigrades can withstand extreme environments by entering an ametabolic desiccated state termed anhydrobiosis. Due to their small size and the non-axenic nature of laboratory cultures, molecular studies of tardigrades are prone to contamination. To minimize the possibility of microbial contaminations and to obtain high-quality genomic information, we have developed an ultra-low input library sequencing protocol to enable the genome sequencing of a single tardigrade Hypsibius dujardini individual. Here, we describe the details of our sequencing data and the ultra-low input library preparation methodologies. PMID:27529330

  9. Genome sequencing of a single tardigrade Hypsibius dujardini individual.

    PubMed

    Arakawa, Kazuharu; Yoshida, Yuki; Tomita, Masaru

    2016-08-16

    Tardigrades are ubiquitous microscopic animals that play an important role in the study of metazoan phylogeny. Most terrestrial tardigrades can withstand extreme environments by entering an ametabolic desiccated state termed anhydrobiosis. Due to their small size and the non-axenic nature of laboratory cultures, molecular studies of tardigrades are prone to contamination. To minimize the possibility of microbial contaminations and to obtain high-quality genomic information, we have developed an ultra-low input library sequencing protocol to enable the genome sequencing of a single tardigrade Hypsibius dujardini individual. Here, we describe the details of our sequencing data and the ultra-low input library preparation methodologies.

  10. Dissecting enzyme function with microfluidic-based deep mutational scanning.

    PubMed

    Romero, Philip A; Tran, Tuan M; Abate, Adam R

    2015-06-09

    Natural enzymes are incredibly proficient catalysts, but engineering them to have new or improved functions is challenging due to the complexity of how an enzyme's sequence relates to its biochemical properties. Here, we present an ultrahigh-throughput method for mapping enzyme sequence-function relationships that combines droplet microfluidic screening with next-generation DNA sequencing. We apply our method to map the activity of millions of glycosidase sequence variants. Microfluidic-based deep mutational scanning provides a comprehensive and unbiased view of the enzyme function landscape. The mapping displays expected patterns of mutational tolerance and a strong correspondence to sequence variation within the enzyme family, but also reveals previously unreported sites that are crucial for glycosidase function. We modified the screening protocol to include a high-temperature incubation step, and the resulting thermotolerance landscape allowed the discovery of mutations that enhance enzyme thermostability. Droplet microfluidics provides a general platform for enzyme screening that, when combined with DNA-sequencing technologies, enables high-throughput mapping of enzyme sequence space.

  11. Fine definition of the pedigree haplotypes of closely related rice cultivars by means of genome-wide discovery of single-nucleotide polymorphisms.

    PubMed

    Yamamoto, Toshio; Nagasaki, Hideki; Yonemaru, Jun-ichi; Ebana, Kaworu; Nakajima, Maiko; Shibaya, Taeko; Yano, Masahiro

    2010-04-27

    To create useful gene combinations in crop breeding, it is necessary to clarify the dynamics of the genome composition created by breeding practices. A large quantity of single-nucleotide polymorphism (SNP) data is required to permit discrimination of chromosome segments among modern cultivars, which are genetically related. Here, we used a high-throughput sequencer to conduct whole-genome sequencing of an elite Japanese rice cultivar, Koshihikari, which is closely related to Nipponbare, whose genome sequencing has been completed. Then we designed a high-throughput typing array based on the SNP information by comparison of the two sequences. Finally, we applied this array to analyze historical representative rice cultivars to understand the dynamics of their genome composition. The total 5.89-Gb sequence for Koshihikari, equivalent to 15.7x the entire rice genome, was mapped using the Pseudomolecules 4.0 database for Nipponbare. The resultant Koshihikari genome sequence corresponded to 80.1% of the Nipponbare sequence and led to the identification of 67,051 SNPs. A high-throughput typing array consisting of 1917 SNP sites distributed throughout the genome was designed to genotype 151 representative Japanese cultivars that have been grown during the past 150 years. We could identify the ancestral origin of the pedigree haplotypes in 60.9% of the Koshihikari genome and 18 consensus haplotype blocks which are inherited from traditional landraces to current improved varieties. Moreover, it was predicted that modern breeding practices have generally decreased genetic diversity. Detection of genome-wide SNPs by both high-throughput sequencer and typing array made it possible to evaluate genomic composition of genetically related rice varieties. With the aid of their pedigree information, we clarified the dynamics of chromosome recombination during the historical rice breeding process. We also found several genomic regions decreasing genetic diversity which might be caused by a recent human selection in rice breeding. The definition of pedigree haplotypes by means of genome-wide SNPs will facilitate next-generation breeding of rice and other crops.
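
    The coverage figure in this abstract can be checked with simple arithmetic: fold coverage is total sequenced bases divided by genome size, so 5.89 Gb at 15.7-fold implies a genome of roughly 375 Mb. The short Python sketch below does that calculation; the function name is an assumption for this example, and the implied genome size is derived here rather than stated in the abstract.

```python
def implied_genome_size(total_bases, fold_coverage):
    """Genome size implied by a sequencing yield and its reported fold coverage."""
    return total_bases / fold_coverage


# Values quoted in the abstract: 5.89 Gb of sequence at 15.7-fold coverage.
size = implied_genome_size(5.89e9, 15.7)
print(f"{size / 1e6:.0f} Mb")
```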

  12. Ultra high molecular weight polyethylene: Optical features at millimeter wavelengths

    NASA Astrophysics Data System (ADS)

    D'Alessandro, G.; Paiella, A.; Coppolecchia, A.; Castellano, M. G.; Colantoni, I.; de Bernardis, P.; Lamagna, L.; Masi, S.

    2018-05-01

    The next generation of experiments for the measurement of the Cosmic Microwave Background (CMB) requires more and more the use of advanced materials, with specific physical and structural properties. An example is the material used for receiver's cryostat windows and internal lenses. The large throughput of current CMB experiments requires a large diameter (of the order of 0.5 m) of these parts, resulting in heavy structural and optical requirements on the material to be used. Ultra High Molecular Weight (UHMW) polyethylene (PE) features high resistance to traction and good transmissivity in the frequency range of interest. In this paper, we discuss the possibility of using UHMW PE for windows and lenses in experiments working at millimeter wavelengths, by measuring its optical properties: emissivity, transmission and refraction index. Our measurements show that the material is well suited to this purpose.

  13. TriageTools: tools for partitioning and prioritizing analysis of high-throughput sequencing data.

    PubMed

    Fimereli, Danai; Detours, Vincent; Konopka, Tomasz

    2013-04-01

    High-throughput sequencing is becoming a popular research tool but carries with it considerable costs in terms of computation time, data storage and bandwidth. Meanwhile, some research applications focusing on individual genes or pathways do not necessitate processing of a full sequencing dataset. Thus, it is desirable to partition a large dataset into smaller, manageable, but relevant pieces. We present a toolkit for partitioning raw sequencing data that includes a method for extracting reads that are likely to map onto pre-defined regions of interest. We show the method can be used to extract information about genes of interest from DNA or RNA sequencing samples in a fraction of the time and disk space required to process and store a full dataset. We report speedup factors between 2.6 and 96, depending on settings and samples used. The software is available at http://www.sourceforge.net/projects/triagetools/.
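
    One simple way to pick out reads that are likely to map onto regions of interest, without running a full aligner, is to keep reads that share enough exact k-mers with the target sequences. The Python sketch below illustrates that general idea only. The abstract does not describe how TriageTools selects reads, and the function names, the k-mer length, and the hit threshold are all assumptions chosen for this example.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def triage_reads(reads, targets, k=8, min_hits=2):
    """Keep reads that share at least min_hits k-mers with any target sequence."""
    index = set()
    for target in targets:
        index |= kmers(target, k)
    kept = []
    for read in reads:
        # Count k-mer positions in the read that occur in the target index.
        hits = sum(
            1 for i in range(len(read) - k + 1) if read[i:i + k] in index
        )
        if hits >= min_hits:
            kept.append(read)
    return kept


# Hypothetical target region and two reads: one drawn from it, one unrelated.
target = "ACGTACGTTAGCCGATAGGCTA"
reads = [target[2:14], "TTTTTTTTTTTT"]
print(triage_reads(reads, [target]))
```

    The sketch ignores reverse-complement matches and sequencing errors, both of which a practical tool would need to handle.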

  14. Measuring Sister Chromatid Cohesion Protein Genome Occupancy in Drosophila melanogaster by ChIP-seq.

    PubMed

    Dorsett, Dale; Misulovin, Ziva

    2017-01-01

    This chapter presents methods to conduct and analyze genome-wide chromatin immunoprecipitation of the cohesin complex and the Nipped-B cohesin loading factor in Drosophila cells using high-throughput DNA sequencing (ChIP-seq). Procedures for isolation of chromatin, immunoprecipitation, and construction of sequencing libraries for the Ion Torrent Proton high throughput sequencer are detailed, and computational methods to calculate occupancy as input-normalized fold-enrichment are described. The results obtained by ChIP-seq are compared to those obtained by ChIP-chip (genomic ChIP using tiling microarrays), and the effects of sequencing depth on the accuracy are analyzed. ChIP-seq provides similar sensitivity and reproducibility as ChIP-chip, and identifies the same broad regions of occupancy. The locations of enrichment peaks, however, can differ between ChIP-chip and ChIP-seq, and low sequencing depth can splinter broad regions of occupancy into distinct peaks.
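
    Input-normalized fold-enrichment, as mentioned above, is commonly computed per genomic bin as the ChIP read fraction divided by the input read fraction. The Python sketch below shows one such calculation. The function name, the pseudocount, and the scaling by total reads are assumptions for illustration; the abstract does not give the chapter's exact procedure.

```python
def fold_enrichment(chip_counts, input_counts, pseudocount=1.0):
    """Per-bin ratio of depth-normalized ChIP reads to depth-normalized input reads."""
    chip_total = sum(chip_counts)
    input_total = sum(input_counts)
    enrichment = []
    for chip, inp in zip(chip_counts, input_counts):
        # The pseudocount avoids division by zero in bins with no input reads.
        chip_rate = (chip + pseudocount) / chip_total
        input_rate = (inp + pseudocount) / input_total
        enrichment.append(chip_rate / input_rate)
    return enrichment


# Hypothetical read counts for two bins.
print(fold_enrichment([9, 1], [4, 4]))
```

    With these toy counts the first bin is enriched about 1.6-fold and the second is depleted to about 0.32-fold relative to input.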

  15. New Tools For Understanding Microbial Diversity Using High-throughput Sequence Data

    NASA Astrophysics Data System (ADS)

    Knight, R.; Hamady, M.; Liu, Z.; Lozupone, C.

    2007-12-01

    High-throughput sequencing techniques such as 454 are straining the limits of tools traditionally used to build trees, choose OTUs, and perform other essential sequencing tasks. We have developed a workflow for phylogenetic analysis of large-scale sequence data sets that combines existing tools, such as the Arb phylogeny package and the NAST multiple sequence alignment tool, with new methods for choosing and clustering OTUs and for performing phylogenetic community analysis with UniFrac. This talk discusses the cyberinfrastructure we are developing to support the human microbiome project, and the application of these workflows to analyze very large data sets that contrast the gut microbiota with a range of physical environments. These tools will ultimately help to define core and peripheral microbiomes in a range of environments, and will allow us to understand the physical and biotic factors that contribute most to differences in microbial diversity.

  16. Applications and Case Studies of the Next-Generation Sequencing Technologies in Food, Nutrition and Agriculture.

    USDA-ARS?s Scientific Manuscript database

    Next-generation sequencing technologies are able to produce high-throughput short sequence reads in a cost-effective fashion. The emergence of these technologies has not only facilitated genome sequencing but also changed the landscape of life sciences. Here I survey their major applications ranging...

  17. Recent Applications of DNA Sequencing Technologies in Food, Nutrition and Agriculture

    USDA-ARS?s Scientific Manuscript database

    Next-generation DNA sequencing technologies are able to produce millions of short sequence reads in a high-throughput, cost-effective fashion. The emergence of these technologies has not only facilitated genome sequencing but also changed the landscape of life sciences. This review surveys their rec...

  18. Next generation sequencers: methods and applications in food-borne pathogens

    USDA-ARS?s Scientific Manuscript database

    Next generation sequencers are able to produce millions of short sequence reads in a high-throughput, low-cost way. The emergence of these technologies has not only facilitated genome sequencing but also started to change the landscape of life sciences. This chapter will survey their methods and app...

  19. Genome-wide identification of conserved microRNA and their response to drought stress in Dongxiang wild rice (Oryza rufipogon Griff.).

    PubMed

    Zhang, Fantao; Luo, Xiangdong; Zhou, Yi; Xie, Jiankun

    2016-04-01

    To identify drought stress-responsive conserved microRNA (miRNA) from Dongxiang wild rice (Oryza rufipogon Griff., DXWR) on a genome-wide scale, high-throughput sequencing technology was used to sequence libraries of DXWR samples, treated with and without drought stress. 505 conserved miRNAs corresponding to 215 families were identified. 17 were significantly down-regulated and 16 were up-regulated under drought stress. Stem-loop qRT-PCR revealed the same expression patterns as high-throughput sequencing, suggesting the accuracy of the sequencing result was high. Potential target genes of the drought-responsive miRNA were predicted to be involved in diverse biological processes. Furthermore, 16 miRNA families were first identified to be involved in drought stress response from plants. These results present a comprehensive view of the conserved miRNA and their expression patterns under drought stress for DXWR, which will provide valuable information and sequence resources for future basis studies.

  20. The high throughput biomedicine unit at the institute for molecular medicine Finland: high throughput screening meets precision medicine.

    PubMed

    Pietiainen, Vilja; Saarela, Jani; von Schantz, Carina; Turunen, Laura; Ostling, Paivi; Wennerberg, Krister

    2014-05-01

    The High Throughput Biomedicine (HTB) unit at the Institute for Molecular Medicine Finland FIMM was established in 2010 to serve as a national and international academic screening unit providing access to state of the art instrumentation for chemical and RNAi-based high throughput screening. The initial focus of the unit was multiwell plate based chemical screening and high content microarray-based siRNA screening. However, over the first four years of operation, the unit has moved to a more flexible service platform where both chemical and siRNA screening is performed at different scales primarily in multiwell plate-based assays with a wide range of readout possibilities with a focus on ultraminiaturization to allow for affordable screening for the academic users. In addition to high throughput screening, the equipment of the unit is also used to support miniaturized, multiplexed and high throughput applications for other types of research such as genomics, sequencing and biobanking operations. Importantly, with the translational research goals at FIMM, an increasing part of the operations at the HTB unit is being focused on high throughput systems biological platforms for functional profiling of patient cells in personalized and precision medicine projects.

  1. WebPrInSeS: automated full-length clone sequence identification and verification using high-throughput sequencing data.

    PubMed

    Massouras, Andreas; Decouttere, Frederik; Hens, Korneel; Deplancke, Bart

    2010-07-01

    High-throughput sequencing (HTS) is revolutionizing our ability to obtain cheap, fast and reliable sequence information. Many experimental approaches are expected to benefit from the incorporation of such sequencing features in their pipeline. Consequently, software tools that facilitate such an incorporation should be of great interest. In this context, we developed WebPrInSeS, a web server tool allowing automated full-length clone sequence identification and verification using HTS data. WebPrInSeS encompasses two separate software applications. The first is WebPrInSeS-C which performs automated sequence verification of user-defined open-reading frame (ORF) clone libraries. The second is WebPrInSeS-E, which identifies positive hits in cDNA or ORF-based library screening experiments such as yeast one- or two-hybrid assays. Both tools perform de novo assembly using HTS data from any of the three major sequencing platforms. Thus, WebPrInSeS provides a highly integrated, cost-effective and efficient way to sequence-verify or identify clones of interest. WebPrInSeS is available at http://webprinses.epfl.ch/ and is open to all users.

  2. WebPrInSeS: automated full-length clone sequence identification and verification using high-throughput sequencing data

    PubMed Central

    Massouras, Andreas; Decouttere, Frederik; Hens, Korneel; Deplancke, Bart

    2010-01-01

    High-throughput sequencing (HTS) is revolutionizing our ability to obtain cheap, fast and reliable sequence information. Many experimental approaches are expected to benefit from the incorporation of such sequencing features in their pipeline. Consequently, software tools that facilitate such an incorporation should be of great interest. In this context, we developed WebPrInSeS, a web server tool allowing automated full-length clone sequence identification and verification using HTS data. WebPrInSeS encompasses two separate software applications. The first is WebPrInSeS-C which performs automated sequence verification of user-defined open-reading frame (ORF) clone libraries. The second is WebPrInSeS-E, which identifies positive hits in cDNA or ORF-based library screening experiments such as yeast one- or two-hybrid assays. Both tools perform de novo assembly using HTS data from any of the three major sequencing platforms. Thus, WebPrInSeS provides a highly integrated, cost-effective and efficient way to sequence-verify or identify clones of interest. WebPrInSeS is available at http://webprinses.epfl.ch/ and is open to all users. PMID:20501601

  3. Saturated linkage map construction in Rubus idaeus using genotyping by sequencing and genome-independent imputation

    USDA-ARS's Scientific Manuscript database

    Rapid development of highly saturated genetic maps aids molecular breeding, which can accelerate gain per breeding cycle in woody perennial plants such as Rubus idaeus (red raspberry). Recently, robust genotyping methods based on high-throughput sequencing were developed, which provide high marker d...

  4. GENETIC-BASED ANALYTICAL METHODS FOR BACTERIA AND FUNGI

    EPA Science Inventory

    In the past two decades, advances in high-throughput sequencing technologies have led to a veritable explosion in the generation of nucleic acid sequence information (1). While these advances are illustrated most prominently by the successful sequencing of the human genome, they...

  5. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)-A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes.

    PubMed

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes.

  6. Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq)—A Method for High-Throughput Analysis of Differentially Methylated CCGG Sites in Plants with Large Genomes

    PubMed Central

    Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw

    2017-01-01

    Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes. PMID:29250096

  7. In silico assessment of primers for eDNA studies using PrimerTree and application to characterize the biodiversity surrounding the Cuyahoga River

    NASA Astrophysics Data System (ADS)

    Cannon, M. V.; Hester, J.; Shalkhauser, A.; Chan, E. R.; Logue, K.; Small, S. T.; Serre, D.

    2016-03-01

    Analysis of environmental DNA (eDNA) enables the detection of species of interest from water and soil samples, typically using species-specific PCR. Here, we describe a method to characterize the biodiversity of a given environment by amplifying eDNA using primer pairs targeting a wide range of taxa and high-throughput sequencing for species identification. We tested this approach on 91 water samples of 40 mL collected along the Cuyahoga River (Ohio, USA). We amplified eDNA using 12 primer pairs targeting mammals, fish, amphibians, birds, bryophytes, arthropods, copepods, plants and several microorganism taxa and sequenced all PCR products simultaneously by high-throughput sequencing. Overall, we identified DNA sequences from 15 species of fish, 17 species of mammals, 8 species of birds, 15 species of arthropods, one turtle and one salamander. Interestingly, in addition to aquatic and semi-aquatic animals, we identified DNA from terrestrial species that live near the Cuyahoga River. We also identified DNA from one Asian carp species invasive to the Great Lakes but that had not been previously reported in the Cuyahoga River. Our study shows that analysis of eDNA extracted from small water samples using wide-range PCR amplification combined with high-throughput sequencing can provide a broad perspective on biological diversity.

  8. In silico assessment of primers for eDNA studies using PrimerTree and application to characterize the biodiversity surrounding the Cuyahoga River

    PubMed Central

    Cannon, M. V.; Hester, J.; Shalkhauser, A.; Chan, E. R.; Logue, K.; Small, S. T.; Serre, D.

    2016-01-01

    Analysis of environmental DNA (eDNA) enables the detection of species of interest from water and soil samples, typically using species-specific PCR. Here, we describe a method to characterize the biodiversity of a given environment by amplifying eDNA using primer pairs targeting a wide range of taxa and high-throughput sequencing for species identification. We tested this approach on 91 water samples of 40 mL collected along the Cuyahoga River (Ohio, USA). We amplified eDNA using 12 primer pairs targeting mammals, fish, amphibians, birds, bryophytes, arthropods, copepods, plants and several microorganism taxa and sequenced all PCR products simultaneously by high-throughput sequencing. Overall, we identified DNA sequences from 15 species of fish, 17 species of mammals, 8 species of birds, 15 species of arthropods, one turtle and one salamander. Interestingly, in addition to aquatic and semi-aquatic animals, we identified DNA from terrestrial species that live near the Cuyahoga River. We also identified DNA from one Asian carp species invasive to the Great Lakes but that had not been previously reported in the Cuyahoga River. Our study shows that analysis of eDNA extracted from small water samples using wide-range PCR amplification combined with high-throughput sequencing can provide a broad perspective on biological diversity. PMID:26965911

  9. Comparison of next generation sequencing technologies for transcriptome characterization

    PubMed Central

    2009-01-01

    Background: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for the most complete and cost-effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. Results: The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online webtool that allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. Conclusion: NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, NG sequencing is a dramatic advance over capillary-based sequencing, but NG sequencing also presents significant challenges in assembly and sequence accuracy due to short read lengths, method-specific sequencing errors, and the absence of physical clones. These problems may be overcome by hybrid sequencing strategies using a mixture of sequencing methodologies, by new assemblers, and by sequencing more deeply. Sequencing and microarray outcomes from multiple experiments suggest that our simulator will be useful for guiding NG transcriptome sequencing projects in a wide range of organisms. PMID:19646272

  10. High-throughput, pooled sequencing identifies mutations in NUBPL and FOXRED1 in human complex I deficiency

    PubMed Central

    Calvo, Sarah E; Tucker, Elena J; Compton, Alison G; Kirby, Denise M; Crawford, Gabriel; Burtt, Noel P; Rivas, Manuel A; Guiducci, Candace; Bruno, Damien L; Goldberger, Olga A; Redman, Michelle C; Wiltshire, Esko; Wilson, Callum J; Altshuler, David; Gabriel, Stacey B; Daly, Mark J; Thorburn, David R; Mootha, Vamsi K

    2010-01-01

    Discovering the molecular basis of mitochondrial respiratory chain disease is challenging given the large number of both mitochondrial and nuclear genes involved. We report a strategy of focused candidate gene prediction, high-throughput sequencing, and experimental validation to uncover the molecular basis of mitochondrial complex I (CI) disorders. We created five pools of DNA from a cohort of 103 patients and then performed deep sequencing of 103 candidate genes to spotlight 151 rare variants predicted to impact protein function. We used confirmatory experiments to establish genetic diagnoses in 22% of previously unsolved cases, and discovered that defects in NUBPL and FOXRED1 can cause CI deficiency. Our study illustrates how large-scale sequencing, coupled with functional prediction and experimental validation, can reveal novel disease-causing mutations in individual patients. PMID:20818383

  11. Novel genetic tools for studying food-borne Salmonella.

    PubMed

    Andrews-Polymenis, Helene L; Santiviago, Carlos A; McClelland, Michael

    2009-04-01

    Nontyphoidal Salmonellae are highly prevalent food-borne pathogens. High-throughput sequencing of Salmonella genomes is expanding our knowledge of the evolution of serovars and epidemic isolates. Genome sequences have also allowed the creation of complete microarrays. Microarrays have improved the throughput of in vivo expression technology (IVET) used to uncover promoters active during infection. In another method, signature tagged mutagenesis (STM), pools of mutants are subjected to selection. Changes in the population are monitored on a microarray, revealing genes under selection. Complete genome sequences permit the construction of pools of targeted in-frame deletions that have improved STM by minimizing the number of clones and the polarity of each mutant. Together, genome sequences and the continuing development of new tools for functional genomics will drive a revolution in the understanding of Salmonellae in many different niches that are critical for food safety.

  12. Personalized Oncology Through Integrative High-Throughput Sequencing: A Pilot Study

    PubMed Central

    Roychowdhury, Sameek; Iyer, Matthew K.; Robinson, Dan R.; Lonigro, Robert J.; Wu, Yi-Mi; Cao, Xuhong; Kalyana-Sundaram, Shanker; Sam, Lee; Balbin, O. Alejandro; Quist, Michael J.; Barrette, Terrence; Everett, Jessica; Siddiqui, Javed; Kunju, Lakshmi P.; Navone, Nora; Araujo, John C.; Troncoso, Patricia; Logothetis, Christopher J.; Innis, Jeffrey W.; Smith, David C.; Lao, Christopher D.; Kim, Scott Y.; Roberts, J. Scott; Gruber, Stephen B.; Pienta, Kenneth J.; Talpaz, Moshe; Chinnaiyan, Arul M.

    2012-01-01

    Individual cancers harbor a set of genetic aberrations that can be informative for identifying rational therapies currently available or in clinical trials. We implemented a pilot study to explore the practical challenges of applying high-throughput sequencing in clinical oncology. We enrolled patients with advanced or refractory cancer who were eligible for clinical trials. For each patient, we performed whole-genome sequencing of the tumor, targeted whole-exome sequencing of tumor and normal DNA, and transcriptome sequencing (RNA-Seq) of the tumor to identify potentially informative mutations in a clinically relevant time frame of 3 to 4 weeks. With this approach, we detected several classes of cancer mutations including structural rearrangements, copy number alterations, point mutations, and gene expression alterations. A multidisciplinary Sequencing Tumor Board (STB) deliberated on the clinical interpretation of the sequencing results obtained. We tested our sequencing strategy on human prostate cancer xenografts. Next, we enrolled two patients into the clinical protocol and were able to review the results at our STB within 24 days of biopsy. The first patient had metastatic colorectal cancer in which we identified somatic point mutations in NRAS, TP53, AURKA, FAS, and MYH11, plus amplification and overexpression of cyclin-dependent kinase 8 (CDK8). The second patient had malignant melanoma, in which we identified a somatic point mutation in HRAS and a structural rearrangement affecting CDKN2C. The STB identified the CDK8 amplification and Ras mutation as providing a rationale for clinical trials with CDK inhibitors or MEK (mitogen-activated or extracellular signal–regulated protein kinase kinase) and PI3K (phosphatidylinositol 3-kinase) inhibitors, respectively. Integrative high-throughput sequencing of patients with advanced cancer generates a comprehensive, individual mutational landscape to facilitate biomarker-driven clinical trials in oncology. PMID:22133722

  13. A radial flow microfluidic device for ultra-high-throughput affinity-based isolation of circulating tumor cells.

    PubMed

    Murlidhar, Vasudha; Zeinali, Mina; Grabauskiene, Svetlana; Ghannad-Rezaie, Mostafa; Wicha, Max S; Simeone, Diane M; Ramnath, Nithya; Reddy, Rishindra M; Nagrath, Sunitha

    2014-12-10

    Circulating tumor cells (CTCs) are believed to play an important role in metastasis, a process responsible for the majority of cancer-related deaths. But their rarity in the bloodstream makes microfluidic isolation complex and time-consuming. Additionally, the low processing speeds can be a hindrance to obtaining higher yields of CTCs, limiting their potential use as biomarkers for early diagnosis. Here, a high throughput microfluidic technology, the OncoBean Chip, is reported. It employs radial flow that introduces a varying shear profile across the device, enabling efficient cell capture by affinity at high flow rates. The recovery from whole blood is validated with cancer cell lines H1650 and MCF7, achieving a mean efficiency >80% at a throughput of 10 mL h⁻¹ in contrast to a flow rate of 1 mL h⁻¹ standardly reported with other microfluidic devices. Cells are recovered with a viability rate of 93% at these high speeds, increasing the ability to use captured CTCs for downstream analysis. Broad clinical application is demonstrated using comparable flow rates from blood specimens obtained from breast, pancreatic, and lung cancer patients. Comparable CTC numbers are recovered in all the samples at the two flow rates, demonstrating the ability of the technology to perform at high throughputs. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Lensless on-chip imaging of cells provides a new tool for high-throughput cell-biology and medical diagnostics.

    PubMed

    Mudanyali, Onur; Erlinger, Anthony; Seo, Sungkyu; Su, Ting-Wei; Tseng, Derek; Ozcan, Aydogan

    2009-12-14

    Conventional optical microscopes image cells by use of objective lenses that work together with other lenses and optical components. While quite effective, this classical approach has certain limitations for miniaturization of the imaging platform to make it compatible with the advanced state of the art in microfluidics. In this report, we introduce experimental details of a lensless on-chip imaging concept termed LUCAS (Lensless Ultra-wide field-of-view Cell monitoring Array platform based on Shadow imaging) that does not require any microscope objectives or other bulky optical components to image a heterogeneous cell solution over an ultra-wide field of view that can span as large as approximately 18 cm². Moreover, unlike conventional microscopes, LUCAS can image a heterogeneous cell solution of interest over a depth-of-field of approximately 5 mm without the need for refocusing, which corresponds to up to approximately 9 mL sample volume. This imaging platform records the shadows (i.e., lensless digital holograms) of each cell of interest within its field of view, and automated digital processing of these cell shadows can determine the type, the count and the relative positions of cells within the solution. Because it does not require any bulky optical components or mechanical scanning stages, it offers a significantly miniaturized platform that at the same time reduces the cost, which is especially important for point-of-care diagnostic tools. Furthermore, the imaging throughput of this platform is orders of magnitude better than conventional optical microscopes, which could be exceedingly valuable for high-throughput cell-biology experiments.

  15. Lensless On-chip Imaging of Cells Provides a New Tool for High-throughput Cell-Biology and Medical Diagnostics

    PubMed Central

    Mudanyali, Onur; Erlinger, Anthony; Seo, Sungkyu; Su, Ting-Wei; Tseng, Derek; Ozcan, Aydogan

    2009-01-01

    Conventional optical microscopes image cells by use of objective lenses that work together with other lenses and optical components. While quite effective, this classical approach has certain limitations for miniaturization of the imaging platform to make it compatible with the advanced state of the art in microfluidics. In this report, we introduce experimental details of a lensless on-chip imaging concept termed LUCAS (Lensless Ultra-wide field-of-view Cell monitoring Array platform based on Shadow imaging) that does not require any microscope objectives or other bulky optical components to image a heterogeneous cell solution over an ultra-wide field of view that can span as large as ~18 cm². Moreover, unlike conventional microscopes, LUCAS can image a heterogeneous cell solution of interest over a depth-of-field of ~5 mm without the need for refocusing, which corresponds to up to ~9 mL sample volume. This imaging platform records the shadows (i.e., lensless digital holograms) of each cell of interest within its field of view, and automated digital processing of these cell shadows can determine the type, the count and the relative positions of cells within the solution. Because it does not require any bulky optical components or mechanical scanning stages, it offers a significantly miniaturized platform that at the same time reduces the cost, which is especially important for point-of-care diagnostic tools. Furthermore, the imaging throughput of this platform is orders of magnitude better than conventional optical microscopes, which could be exceedingly valuable for high-throughput cell-biology experiments. PMID:20010542

  16. Restructuring of the Aquatic Bacterial Community by Hydric Dynamics Associated with Superstorm Sandy

    PubMed Central

    Ulrich, Nikea; Rosenberger, Abigail; Brislawn, Colin; Wright, Justin; Kessler, Collin; Toole, David; Solomon, Caroline; Strutt, Steven; McClure, Erin

    2016-01-01

    Bacterial community composition and longitudinal fluctuations were monitored in a riverine system during and after Superstorm Sandy to better characterize inter- and intracommunity responses associated with the disturbance associated with a 100-year storm event. High-throughput sequencing of the 16S rRNA gene was used to assess microbial community structure within water samples from Muddy Creek Run, a second-order stream in Huntingdon, PA, at 12 different time points during the storm event (29 October to 3 November 2012) and under seasonally matched baseline conditions. High-throughput sequencing of the 16S rRNA gene was used to track changes in bacterial community structure and divergence during and after Superstorm Sandy. Bacterial community dynamics were correlated to measured physicochemical parameters and fecal indicator bacteria (FIB) concentrations. Bioinformatics analyses of 2.1 million 16S rRNA gene sequences revealed a significant increase in bacterial diversity in samples taken during peak discharge of the storm. Beta-diversity analyses revealed longitudinal shifts in the bacterial community structure. Successional changes were observed, in which Betaproteobacteria and Gammaproteobacteria decreased in 16S rRNA gene relative abundance, while the relative abundance of members of the Firmicutes increased. Furthermore, 16S rRNA gene sequences matching pathogenic bacteria, including strains of Legionella, Campylobacter, Arcobacter, and Helicobacter, as well as bacteria of fecal origin (e.g., Bacteroides), exhibited an increase in abundance after peak discharge of the storm. This study revealed a significant restructuring of in-stream bacterial community structure associated with hydric dynamics of a storm event. IMPORTANCE: In order to better understand the microbial risks associated with freshwater environments during a storm event, a more comprehensive understanding of the variations in aquatic bacterial diversity is warranted. This study investigated the bacterial communities during and after Superstorm Sandy to provide fine time point resolution of dynamic changes in bacterial composition. This study adds to the current literature by revealing the variation in bacterial community structure during the course of a storm. This study employed high-throughput DNA sequencing, which generated a deep analysis of inter- and intracommunity responses during a significant storm event. This study has highlighted the utility of applying high-throughput sequencing for water quality monitoring purposes, as this approach enabled a more comprehensive investigation of the bacterial community structure. Altogether, these data suggest a drastic restructuring of the stream bacterial community during a storm event and highlight the potential of high-throughput sequencing approaches for assessing the microbiological quality of our environment. PMID:27060115

  17. Large-scale DNA Barcode Library Generation for Biomolecule Identification in High-throughput Screens.

    PubMed

    Lyons, Eli; Sheridan, Paul; Tremmel, Georg; Miyano, Satoru; Sugano, Sumio

    2017-10-24

    High-throughput screens allow for the identification of specific biomolecules with characteristics of interest. In barcoded screens, DNA barcodes are linked to target biomolecules in a manner allowing for the target molecules making up a library to be identified by sequencing the DNA barcodes using Next Generation Sequencing. To be useful in experimental settings, the DNA barcodes in a library must satisfy certain constraints related to GC content, homopolymer length, Hamming distance, and blacklisted subsequences. Here we report a novel framework to quickly generate large-scale libraries of DNA barcodes for use in high-throughput screens. We show that our framework dramatically reduces the computation time required to generate large-scale DNA barcode libraries, compared with a naïve approach to DNA barcode library generation. As a proof of concept, we demonstrate that our framework is able to generate a library consisting of one million DNA barcodes for use in a fragment antibody phage display screening experiment. We also report generating a general purpose one billion DNA barcode library, the largest such library yet reported in the literature. Our results demonstrate the value of our novel large-scale DNA barcode library generation framework for use in high-throughput screening applications.
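
    The four constraints named in this abstract (GC content, homopolymer length, Hamming distance, blacklisted subsequences) can be illustrated with a minimal sketch. This is the simple greedy approach, not the authors' framework, and every threshold, default value, and function name below is a hypothetical choice made for illustration only.

```python
from itertools import groupby


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def max_homopolymer(seq):
    """Length of the longest run of a single repeated base."""
    return max(len(list(run)) for _, run in groupby(seq))


def hamming(a, b):
    """Number of mismatching positions between two equal-length barcodes."""
    return sum(x != y for x, y in zip(a, b))


def passes_filters(seq, gc_range=(0.4, 0.6), max_run=2, blacklist=("GAATTC",)):
    """Per-barcode checks; all thresholds are illustrative assumptions."""
    low, high = gc_range
    return (
        low <= gc_content(seq) <= high
        and max_homopolymer(seq) <= max_run
        and not any(bad in seq for bad in blacklist)
    )


def build_library(candidates, min_distance=3, **filters):
    """Greedy selection: keep a candidate only if it passes the per-barcode
    filters and is at least min_distance away from every barcode kept so far.
    The pairwise check makes this quadratic in library size."""
    library = []
    for seq in candidates:
        if passes_filters(seq, **filters) and all(
            hamming(seq, kept) >= min_distance for kept in library
        ):
            library.append(seq)
    return library


candidates = ["ACGTACGT", "ACGTACGA", "TGCATGCA", "AAAACCGG"]
print(build_library(candidates))
```

    The pairwise distance check is what makes this simple approach slow for large libraries, which is the cost the abstract says the reported framework reduces.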

  18. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    PubMed

    Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca

    2015-01-01

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
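
    The idea of finding read pairs that span a genome/T-DNA junction can be sketched as follows. This is not the authors' tool: it assumes read pairs have already been aligned to a combined reference, and the `ReadPair` record, its field names, and the reference names are all hypothetical.

```python
from collections import namedtuple

# Hypothetical record for an aligned read pair: the reference sequence and
# position each mate mapped to in a combined genome + T-DNA reference.
ReadPair = namedtuple("ReadPair", "name mate1_ref mate1_pos mate2_ref mate2_pos")


def junction_pairs(pairs, tdna_refs):
    """Return (pair name, genome reference, genome position) for every pair in
    which exactly one mate maps to a T-DNA reference and the other to the genome."""
    hits = []
    for pair in pairs:
        mate1_is_tdna = pair.mate1_ref in tdna_refs
        mate2_is_tdna = pair.mate2_ref in tdna_refs
        if mate1_is_tdna != mate2_is_tdna:  # exactly one mate on the T-DNA
            if mate1_is_tdna:
                hits.append((pair.name, pair.mate2_ref, pair.mate2_pos))
            else:
                hits.append((pair.name, pair.mate1_ref, pair.mate1_pos))
    return hits


pairs = [
    ReadPair("r1", "Chr1", 1200, "Chr1", 1450),       # both mates in the genome
    ReadPair("r2", "Chr3", 88010, "TDNA_A", 35),      # spans a junction
    ReadPair("r3", "TDNA_A", 400, "TDNA_A", 610),     # both mates in the T-DNA
    ReadPair("r4", "TDNA_B", 12, "Chr5", 7021),       # spans a junction
]
print(junction_pairs(pairs, tdna_refs={"TDNA_A", "TDNA_B"}))
```

    Clustering the reported genome positions would then suggest candidate insertion sites; pairs with both mates on the same side of the junction carry no junction information and are skipped.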

  19. Improved Protocols for Illumina Sequencing

    PubMed Central

    Bronner, Iraad F.; Quail, Michael A.; Turner, Daniel J.; Swerdlow, Harold

    2013-01-01

    In this unit, we describe a set of improvements we have made to the standard Illumina protocols to make the sequencing process more reliable in a high-throughput environment, reduce amplification bias, narrow the distribution of insert sizes, and reliably obtain high yields of data. PMID:19582764

  20. Application of rad-sequencing to linkage mapping in citrus

    USDA-ARS's Scientific Manuscript database

    High density linkage maps can be developed for modest cost using high-throughput DNA sequencing to genotype a defined fraction (representation) of the genome. We developed linkage maps in two citrus populations using the RAD (Restriction site Associated DNA) genotyping method which involves restrict...

  1. MToolBox: a highly automated pipeline for heteroplasmy annotation and prioritization analysis of human mitochondrial variants in high-throughput sequencing

    PubMed Central

    Diroma, Maria Angela; Santorsola, Mariangela; Guttà, Cristiano; Gasparre, Giuseppe; Picardi, Ernesto; Pesole, Graziano; Attimonelli, Marcella

    2014-01-01

    Motivation: The increasing availability of mitochondria-targeted and off-target sequencing data in whole-exome and whole-genome sequencing studies (WXS and WGS) has raised the demand for effective pipelines to accurately measure heteroplasmy and to easily recognize the most functionally important mitochondrial variants among a huge number of candidates. To this purpose, we developed MToolBox, a highly automated pipeline to reconstruct and analyze human mitochondrial DNA from high-throughput sequencing data. Results: MToolBox implements an effective computational strategy for mitochondrial genome assembly and haplogroup assignment, also including a prioritization analysis of detected variants. MToolBox provides a Variant Call Format file featuring, for the first time, allele-specific heteroplasmy and annotation files with prioritized variants. MToolBox was tested on simulated samples and applied to 1000 Genomes WXS datasets. Availability and implementation: MToolBox package is available at https://sourceforge.net/projects/mtoolbox/. Contact: marcella.attimonelli@uniba.it Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25028726

  2. A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection

    PubMed Central

    Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike

    2018-01-01

    ABSTRACT Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2, with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publically available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank. 
IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have developed a new reference viral database (RVDB) that provides a broad representation of different virus species from eukaryotes by including all viral, virus-like, and virus-related sequences (excluding bacteriophages), regardless of their size. In particular, RVDB contains endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Sequences were clustered to reduce redundancy while retaining high viral sequence diversity. A particularly useful feature of RVDB is the reduction of cellular sequences, which can enhance the run efficiency of large transcriptomic and genomic data analysis and increase the specificity of virus detection. PMID:29564396

  3. A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection.

    PubMed

    Goodacre, Norman; Aljanahi, Aisha; Nandakumar, Subhiksha; Mikailov, Mike; Khan, Arifa S

    2018-01-01

    Detection of distantly related viruses by high-throughput sequencing (HTS) is bioinformatically challenging because of the lack of a public database containing all viral sequences, without abundant nonviral sequences, which can extend runtime and obscure viral hits. Our reference viral database (RVDB) includes all viral, virus-related, and virus-like nucleotide sequences (excluding bacterial viruses), regardless of length, and with overall reduced cellular sequences. Semantic selection criteria (SEM-I) were used to select viral sequences from GenBank, resulting in a first-generation viral database (VDB). This database was manually and computationally reviewed, resulting in refined, semantic selection criteria (SEM-R), which were applied to a new download of updated GenBank sequences to create a second-generation VDB. Viral entries in the latter were clustered at 98% by CD-HIT-EST to reduce redundancy while retaining high viral sequence diversity. The viral identity of the clustered representative sequences (creps) was confirmed by BLAST searches in NCBI databases and HMMER searches in PFAM and DFAM databases. The resulting RVDB contained a broad representation of viral families, sequence diversity, and a reduced cellular content; it includes full-length and partial sequences and endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Testing of RVDBv10.2, with an in-house HTS transcriptomic data set indicated a significantly faster run for virus detection than interrogating the entirety of the NCBI nonredundant nucleotide database, which contains all viral sequences but also nonviral sequences. RVDB is publically available for facilitating HTS analysis, particularly for novel virus detection. It is meant to be updated on a regular basis to include new viral sequences added to GenBank. 
IMPORTANCE To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have developed a new reference viral database (RVDB) that provides a broad representation of different virus species from eukaryotes by including all viral, virus-like, and virus-related sequences (excluding bacteriophages), regardless of their size. In particular, RVDB contains endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Sequences were clustered to reduce redundancy while retaining high viral sequence diversity. A particularly useful feature of RVDB is the reduction of cellular sequences, which can enhance the run efficiency of large transcriptomic and genomic data analysis and increase the specificity of virus detection.

  4. Recovery of divergent avian bornaviruses from cases of proventricular dilatation disease: Identification of a candidate etiologic agent

    PubMed Central

    Kistler, Amy L; Gancz, Ady; Clubb, Susan; Skewes-Cox, Peter; Fischer, Kael; Sorber, Katherine; Chiu, Charles Y; Lublin, Avishai; Mechani, Sara; Farnoushi, Yigal; Greninger, Alexander; Wen, Christopher C; Karlene, Scott B; Ganem, Don; DeRisi, Joseph L

    2008-01-01

    Background Proventricular dilatation disease (PDD) is a fatal disorder threatening domesticated and wild psittacine birds worldwide. It is characterized by lymphoplasmacytic infiltration of the ganglia of the central and peripheral nervous system, leading to central nervous system disorders as well as disordered enteric motility and associated wasting. For almost 40 years, a viral etiology for PDD has been suspected, but to date no candidate etiologic agent has been reproducibly linked to the disease. Results Analysis of 2 PDD case-control series collected independently on different continents using a pan-viral microarray revealed a bornavirus hybridization signature in 62.5% of the PDD cases (5/8) and none of the controls (0/8). Ultra high throughput sequencing was utilized to recover the complete viral genome sequence from one of the virus-positive PDD cases. This revealed a bornavirus-like genome organization for this agent with a high degree of sequence divergence from all prior bornavirus isolates. We propose the name avian bornavirus (ABV) for this agent. Further specific ABV PCR analysis of an additional set of independently collected PDD cases and controls yielded a significant difference in ABV detection rate among PDD cases (71%, n = 7) compared to controls (0%, n = 14) (P = 0.01; Fisher's Exact Test). Partial sequence analysis of a total of 16 ABV isolates we have now recovered from these and an additional set of cases reveals at least 5 distinct ABV genetic subgroups. Conclusion These studies clearly demonstrate the existence of an avian reservoir of remarkably diverse bornaviruses and provide a compelling candidate in the search for an etiologic agent of PDD. PMID:18671869

  5. Evaluating imputation algorithms for low-depth genotyping-by-sequencing (GBS) data

    USDA-ARS's Scientific Manuscript database

    Well-powered genomic studies require genome-wide marker coverage across many individuals. For non-model species with few genomic resources, high-throughput sequencing (HTS) methods, such as Genotyping-By-Sequencing (GBS), offer an inexpensive alternative to array-based genotyping. Although affordabl...

  6. Diff-seq: A high throughput sequencing-based mismatch detection assay for DNA variant enrichment and discovery

    PubMed Central

    Karas, Vlad O; Sinnott-Armstrong, Nicholas A; Varghese, Vici; Shafer, Robert W; Greenleaf, William J; Sherlock, Gavin

    2018-01-01

    Abstract Much of the within species genetic variation is in the form of single nucleotide polymorphisms (SNPs), typically detected by whole genome sequencing (WGS) or microarray-based technologies. However, WGS produces mostly uninformative reads that perfectly match the reference, while microarrays require genome-specific reagents. We have developed Diff-seq, a sequencing-based mismatch detection assay for SNP discovery without the requirement for specialized nucleic-acid reagents. Diff-seq leverages the Surveyor endonuclease to cleave mismatched DNA molecules that are generated after cross-annealing of a complex pool of DNA fragments. Sequencing libraries enriched for Surveyor-cleaved molecules result in increased coverage at the variant sites. Diff-seq detected all mismatches present in an initial test substrate, with specific enrichment dependent on the identity and context of the variation. Application to viral sequences resulted in increased observation of variant alleles in a biologically relevant context. Diff-Seq has the potential to increase the sensitivity and efficiency of high-throughput sequencing in the detection of variation. PMID:29361139

  7. FMLRC: Hybrid long read error correction using an FM-index.

    PubMed

    Wang, Jeremy R; Holt, James; McMillan, Leonard; Jones, Corbin D

    2018-02-09

    Long read sequencing is changing the landscape of genomic research, especially de novo assembly. Despite the high error rate inherent to long read technologies, increased read lengths dramatically improve the continuity and accuracy of genome assemblies. However, the cost and throughput of these technologies limit their application to complex genomes. One solution is to decrease the cost and time to assemble novel genomes by leveraging "hybrid" assemblies that use long reads for scaffolding and short reads for accuracy. We describe a novel method leveraging a multi-string Burrows-Wheeler Transform with auxiliary FM-index to correct errors in long read sequences using a set of complementary short reads. We demonstrate that our method efficiently produces significantly more high quality corrected sequence than existing hybrid error-correction methods. We also show that our method produces more contiguous assemblies, in many cases, than existing state-of-the-art hybrid and long-read only de novo assembly methods. Our method accurately corrects long read sequence data using complementary short reads. We demonstrate higher total throughput of corrected long reads and a corresponding increase in contiguity of the resulting de novo assemblies. Improved throughput and computational efficiency relative to existing methods will help make more economical use of emerging long read sequencing technologies.

  8. Analysis of Illumina Microbial Assemblies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Clum, Alicia; Foster, Brian; Froula, Jeff

    2010-05-28

    Since the emergence of second generation sequencing technologies, the evaluation of different sequencing approaches and their assembly strategies for different types of genomes has become an important undertaking. Next generation sequencing technologies dramatically increase sequence throughput while decreasing cost, making them an attractive tool for whole genome shotgun sequencing. To compare different approaches for de novo whole genome assembly, appropriate tools and a solid understanding of both quantity and quality of the underlying sequence data are crucial. Here, we performed an in-depth analysis of short-read Illumina sequence assembly strategies for bacterial and archaeal genomes. Different types of Illumina libraries as well as different trim parameters and assemblers were evaluated. Results of the comparative analysis and sequencing platforms will be presented. The goal of this analysis is to develop a cost-effective approach for the increased throughput of the generation of high quality microbial genomes.

  9. Mutation detection using automated fluorescence-based sequencing.

    PubMed

    Montgomery, Kate T; Iartchouck, Oleg; Li, Li; Perera, Anoja; Yassin, Yosuf; Tamburino, Alex; Loomis, Stephanie; Kucherlapati, Raju

    2008-04-01

    The development of high-throughput DNA sequencing techniques has made direct DNA sequencing of PCR-amplified genomic DNA a rapid and economical approach to the identification of polymorphisms that may play a role in disease. Point mutations as well as small insertions or deletions are readily identified by DNA sequencing. The mutations may be heterozygous (occurring in one allele while the other allele retains the normal sequence) or homozygous (occurring in both alleles). Sequencing alone cannot discriminate between true homozygosity and apparent homozygosity due to the loss of one allele due to a large deletion. In this unit, strategies are presented for using PCR amplification and automated fluorescence-based sequencing to identify sequence variation. The size of the project and laboratory preference and experience will dictate how the data is managed and which software tools are used for analysis. A high-throughput protocol is given that has been used to search for mutations in over 200 different genes at the Harvard Medical School - Partners Center for Genetics and Genomics (HPCGG, http://www.hpcgg.org/). Copyright 2008 by John Wiley & Sons, Inc.

  10. A computational method for estimating the PCR duplication rate in DNA and RNA-seq experiments.

    PubMed

    Bansal, Vikas

    2017-03-14

    PCR amplification is an important step in the preparation of DNA sequencing libraries prior to high-throughput sequencing. PCR amplification introduces redundant reads in the sequence data and estimating the PCR duplication rate is important to assess the frequency of such reads. Existing computational methods do not distinguish PCR duplicates from "natural" read duplicates that represent independent DNA fragments and therefore, over-estimate the PCR duplication rate for DNA-seq and RNA-seq experiments. In this paper, we present a computational method to estimate the average PCR duplication rate of high-throughput sequence datasets that accounts for natural read duplicates by leveraging heterozygous variants in an individual genome. Analysis of simulated data and exome sequence data from the 1000 Genomes project demonstrated that our method can accurately estimate the PCR duplication rate on paired-end as well as single-end read datasets which contain a high proportion of natural read duplicates. Further, analysis of exome datasets prepared using the Nextera library preparation method indicated that 45-50% of read duplicates correspond to natural read duplicates likely due to fragmentation bias. Finally, analysis of RNA-seq datasets from individuals in the 1000 Genomes project demonstrated that 70-95% of read duplicates observed in such datasets correspond to natural duplicates sampled from genes with high expression and identified outlier samples with a 2-fold greater PCR duplication rate than other samples. The method described here is a useful tool for estimating the PCR duplication rate of high-throughput sequence datasets and for assessing the fraction of read duplicates that correspond to natural read duplicates. An implementation of the method is available at https://github.com/vibansal/PCRduplicates .
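
    The abstract above describes using heterozygous variants to separate PCR duplicates from natural duplicates. The sketch below is a deliberately simplified illustration of that reasoning, not the published estimator: it assumes duplicates come in pairs, that both alleles are sampled equally, and that sequencing errors are negligible. Under those assumptions a PCR duplicate pair always shows the same allele, while a natural duplicate pair shows different alleles half the time, so the natural fraction is roughly twice the discordant fraction. The function name estimate_pcr_fraction is hypothetical.

```python
def estimate_pcr_fraction(allele_pairs):
    """Estimate the fraction of duplicate pairs that are PCR duplicates.

    allele_pairs: iterable of (allele_a, allele_b), the alleles seen at a
    heterozygous site in two reads that share identical mapping coordinates.
    Simplifying assumptions (not from the source): equal allele sampling,
    no sequencing error, duplicates occur only in pairs.
    """
    pairs = list(allele_pairs)
    if not pairs:
        raise ValueError("need at least one duplicate pair")
    # Discordant pairs can only arise from independent (natural) fragments.
    discordant = sum(1 for a, b in pairs if a != b)
    # Natural pairs are discordant with probability 1/2, so scale by 2
    # and cap at 1 to absorb sampling noise.
    natural_fraction = min(1.0, 2.0 * discordant / len(pairs))
    return 1.0 - natural_fraction
```

    For example, if 2 of 8 duplicate pairs carry different alleles, the sketch attributes about half of the pairs to natural duplication and the other half to PCR.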

  11. Isolating Viral and Host RNA Sequences from Archival Material and Production of cDNA Libraries for High-Throughput DNA Sequencing

    PubMed Central

    Xiao, Yongli; Sheng, Zong-Mei; Taubenberger, Jeffery K.

    2015-01-01

    The vast majority of surgical biopsy and post-mortem tissue samples are formalin-fixed and paraffin-embedded (FFPE), but this process leads to RNA degradation that limits gene expression analysis. As an example, the viral RNA genome of the 1918 pandemic influenza A virus was previously determined in a 9-year effort by overlapping RT-PCR from post-mortem samples. Using the protocols described here, the full genome of the 1918 virus at high coverage was determined in one high-throughput sequencing run of a cDNA library derived from total RNA of a 1918 FFPE sample after duplex-specific nuclease treatments. This basic methodological approach should assist in the analysis of FFPE tissue samples isolated over the past century from a variety of infectious diseases. PMID:26344216

  12. High throughput screening of particle conditioning operations: I. System design and method development.

    PubMed

    Noyes, Aaron; Huffman, Ben; Godavarti, Ranga; Titchener-Hooker, Nigel; Coffman, Jonathan; Sunasara, Khurram; Mukhopadhyay, Tarit

    2015-08-01

    The biotech industry is under increasing pressure to decrease both time to market and development costs. Simultaneously, regulators are expecting increased process understanding. High throughput process development (HTPD) employs small volumes, parallel processing, and high throughput analytics to reduce development costs and speed the development of novel therapeutics. As such, HTPD is increasingly viewed as integral to improving developmental productivity and deepening process understanding. Particle conditioning steps such as precipitation and flocculation may be used to aid the recovery and purification of biological products. In this first part of two articles, we describe an ultra scale-down system (USD) for high throughput particle conditioning (HTPC) composed of off-the-shelf components. The apparatus is comprised of a temperature-controlled microplate with magnetically driven stirrers and integrated with a Tecan liquid handling robot. With this system, 96 individual reaction conditions can be evaluated in parallel, including downstream centrifugal clarification. A comprehensive suite of high throughput analytics enables measurement of product titer, product quality, impurity clearance, clarification efficiency, and particle characterization. HTPC at the 1 mL scale was evaluated with fermentation broth containing a vaccine polysaccharide. The response profile was compared with the Pilot-scale performance of a non-geometrically similar, 3 L reactor. An engineering characterization of the reactors and scale-up context examines theoretical considerations for comparing this USD system with larger scale stirred reactors. In the second paper, we will explore application of this system to industrially relevant vaccines and test different scale-up heuristics. © 2015 Wiley Periodicals, Inc.

  13. RAMICS: trainable, high-speed and biologically relevant alignment of high-throughput sequencing reads to coding DNA.

    PubMed

    Wright, Imogen A; Travers, Simon A

    2014-07-01

    The challenge presented by high-throughput sequencing necessitates the development of novel tools for accurate alignment of reads to reference sequences. Current approaches focus on using heuristics to map reads quickly to large genomes, rather than generating highly accurate alignments in coding regions. Such approaches are, thus, unsuited for applications such as amplicon-based analysis and the realignment phase of exome sequencing and RNA-seq, where accurate and biologically relevant alignment of coding regions is critical. To facilitate such analyses, we have developed a novel tool, RAMICS, that is tailored to mapping large numbers of sequence reads to short lengths (<10 000 bp) of coding DNA. RAMICS utilizes profile hidden Markov models to discover the open reading frame of each sequence and aligns to the reference sequence in a biologically relevant manner, distinguishing between genuine codon-sized indels and frameshift mutations. This approach facilitates the generation of highly accurate alignments, accounting for the error biases of the sequencing machine used to generate reads, particularly at homopolymer regions. Performance improvements are gained through the use of graphics processing units, which increase the speed of mapping through parallelization. RAMICS substantially outperforms all other mapping approaches tested in terms of alignment quality while maintaining highly competitive speed performance. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
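
    The distinction the abstract draws between codon-sized indels and frameshift mutations comes down to whether the indel length is a multiple of three. The snippet below illustrates only that arithmetic; it does not reproduce RAMICS or its profile hidden Markov models, and classify_indel is a hypothetical helper name.

```python
def classify_indel(length_bp: int) -> str:
    """Label an indel in coding DNA by its effect on the reading frame."""
    if length_bp <= 0:
        raise ValueError("indel length must be a positive number of bases")
    # A length divisible by 3 adds or removes whole codons and keeps the frame.
    return "codon-sized" if length_bp % 3 == 0 else "frameshift"
```

    A 3 bp or 6 bp gap would be labelled codon-sized, whereas a 1 bp gap in a homopolymer run, a common sequencing error, would be labelled a frameshift.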

  14. High-Throughput Identification of Loss-of-Function Mutations for Anti-Interferon Activity in the Influenza A Virus NS Segment

    PubMed Central

    Wu, Nicholas C.; Young, Arthur P.; Al-Mawsawi, Laith Q.; Olson, C. Anders; Feng, Jun; Qi, Hangfei; Luan, Harding H.; Li, Xinmin; Wu, Ting-Ting

    2014-01-01

    ABSTRACT Viral proteins often display several functions which require multiple assays to dissect their genetic basis. Here, we describe a systematic approach to screen for loss-of-function mutations that confer a fitness disadvantage under a specified growth condition. Our methodology was achieved by genetically monitoring a mutant library under two growth conditions, with and without interferon, by deep sequencing. We employed a molecular tagging technique to distinguish true mutations from sequencing error. This approach enabled us to identify mutations that were negatively selected against, in addition to those that were positively selected for. Using this technique, we identified loss-of-function mutations in the influenza A virus NS segment that were sensitive to type I interferon in a high-throughput fashion. Mechanistic characterization further showed that a single substitution, D92Y, resulted in the inability of NS to inhibit RIG-I ubiquitination. The approach described in this study can be applied under any specified condition for any virus that can be genetically manipulated. IMPORTANCE Traditional genetics focuses on a single genotype-phenotype relationship, whereas high-throughput genetics permits phenotypic characterization of numerous mutants in parallel. High-throughput genetics often involves monitoring of a mutant library with deep sequencing. However, deep sequencing suffers from a high error rate (∼0.1 to 1%), which is usually higher than the occurrence frequency for individual point mutations within a mutant library. Therefore, only mutations that confer a fitness advantage can be identified with confidence due to an enrichment in the occurrence frequency. In contrast, it is impossible to identify deleterious mutations using most next-generation sequencing techniques. In this study, we have applied a molecular tagging technique to distinguish true mutations from sequencing errors. 
It enabled us to identify mutations that underwent negative selection, in addition to mutations that experienced positive selection. This study provides a proof of concept by screening for loss-of-function mutations on the influenza A virus NS segment that are involved in its anti-interferon activity. PMID:24965464
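
    The molecular tagging idea described above can be sketched as a per-tag consensus: reads sharing a tag derive from the same original molecule, so a base seen consistently across them is likely a true mutation, while a base seen in only some of them is likely a sequencing error. The abstract does not give the tag design or thresholds, so the function name and the min_reads and min_agreement defaults below are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def consensus_by_tag(reads, min_reads=2, min_agreement=0.9):
    """Call one consensus base per molecular tag at a single position.

    reads: iterable of (tag, base) observations.
    Tags with fewer than min_reads reads, or whose most common base falls
    below min_agreement, are left uncalled (thresholds are hypothetical).
    """
    # Collect all bases observed for each tag.
    by_tag = defaultdict(list)
    for tag, base in reads:
        by_tag[tag].append(base)

    calls = {}
    for tag, bases in by_tag.items():
        if len(bases) < min_reads:
            continue  # a single read cannot be checked against a sibling
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            calls[tag] = base
    return calls
```

    Counting how many tags support each base before and after selection would then give mutation frequencies that are less inflated by sequencing error than raw read counts.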

  15. First report of Beet western yellows virus infecting Epiphyllum spp

    USDA-ARS's Scientific Manuscript database

    Beet western yellows virus (BWYV) was identified from an orchid cactus (Epiphyllum spp.) hybrid without obvious symptoms by high-throughput sequencing. The nearly complete genomic sequence of 5,458 nucleotides of the virus was determined. The isolate has the highest nucleotide sequence identity (93%)...

  16. Polymerase chain reaction-hybridization method using urease gene sequences for high-throughput Ureaplasma urealyticum and Ureaplasma parvum detection and differentiation.

    PubMed

    Xu, Chen; Zhang, Nan; Huo, Qianyu; Chen, Minghui; Wang, Rengfeng; Liu, Zhili; Li, Xue; Liu, Yunde; Bao, Huijing

    2016-04-15

    In this article, we discuss the polymerase chain reaction (PCR)-hybridization assay that we developed for high-throughput simultaneous detection and differentiation of Ureaplasma urealyticum and Ureaplasma parvum using one set of primers and two specific DNA probes based on urease gene nucleotide sequence differences. First, U. urealyticum and U. parvum DNA samples were specifically amplified using one set of biotin-labeled primers. Furthermore, amine-modified DNA probes, which can specifically react with U. urealyticum or U. parvum DNA, were covalently immobilized to a DNA-BIND plate surface. The plate was then incubated with the PCR products to facilitate sequence-specific DNA binding. Horseradish peroxidase-streptavidin conjugation and a colorimetric assay were used. Based on the results, the PCR-hybridization assay we developed can specifically differentiate U. urealyticum and U. parvum with high sensitivity (95%) compared with cultivation (72.5%). Hence, this study demonstrates a new method for high-throughput simultaneous differentiation and detection of U. urealyticum and U. parvum with high sensitivity. Based on these observations, the PCR-hybridization assay developed in this study is ideal for detecting and discriminating U. urealyticum and U. parvum in clinical applications. Copyright © 2016 Elsevier Inc. All rights reserved.

  17. Using High-Throughput Sequencing to Leverage Surveillance of Genetic Diversity and Oseltamivir Resistance: A Pilot Study during the 2009 Influenza A(H1N1) Pandemic

    PubMed Central

    Téllez-Sosa, Juan; Rodríguez, Mario Henry; Gómez-Barreto, Rosa E.; Valdovinos-Torres, Humberto; Hidalgo, Ana Cecilia; Cruz-Hervert, Pablo; Luna, René Santos; Carrillo-Valenzo, Erik; Ramos, Celso; García-García, Lourdes; Martínez-Barnetche, Jesús

    2013-01-01

    Background Influenza viruses display a high mutation rate and complex evolutionary patterns. Next-generation sequencing (NGS) has been widely used for qualitative and semi-quantitative assessment of genetic diversity in complex biological samples. The “deep sequencing” approach, enabled by the enormous throughput of current NGS platforms, allows the identification of rare genetic viral variants in targeted genetic regions, but is usually limited to a small number of samples. Methodology and Principal Findings We designed a proof-of-principle study to test whether redistributing sequencing throughput from a high depth-small sample number towards a low depth-large sample number approach is feasible and contributes to influenza epidemiological surveillance. Using 454-Roche sequencing, we sequenced at a rather low depth, a 307 bp amplicon of the neuraminidase gene of the Influenza A(H1N1) pandemic (A(H1N1)pdm) virus from cDNA amplicons pooled in 48 barcoded libraries obtained from nasal swab samples of infected patients (n  =  299) taken from May to November, 2009 pandemic period in Mexico. This approach revealed that during the transition from the first (May-July) to second wave (September-November) of the pandemic, the initial genetic variants were replaced by the N248D mutation in the NA gene, and enabled the establishment of temporal and geographic associations with genetic diversity and the identification of mutations associated with oseltamivir resistance. Conclusions NGS sequencing of a short amplicon from the NA gene at low sequencing depth allowed genetic screening of a large number of samples, providing insights to viral genetic diversity dynamics and the identification of genetic variants associated with oseltamivir resistance. Further research is needed to explain the observed replacement of the genetic variants seen during the second wave. 
As sequencing throughput rises and library multiplexing and automation improves, we foresee that the approach presented here can be scaled up for global genetic surveillance of influenza and other infectious diseases. PMID:23843978

  18. High-throughput analysis using non-depletive SPME: challenges and applications to the determination of free and total concentrations in small sample volumes.

    PubMed

    Boyacı, Ezel; Bojko, Barbara; Reyes-Garcés, Nathaly; Poole, Justen J; Gómez-Ríos, Germán Augusto; Teixeira, Alexandre; Nicol, Beate; Pawliszyn, Janusz

    2018-01-18

    In vitro high-throughput non-depletive quantitation of chemicals in biofluids is of growing interest in many areas. Some of the challenges facing researchers include the limited volume of biofluids, rapid and high-throughput sampling requirements, and the lack of reliable methods. Coupled to the above, growing interest in the monitoring of kinetics and dynamics of miniaturized biosystems has spurred the demand for development of novel and revolutionary methodologies for analysis of biofluids. The applicability of solid-phase microextraction (SPME) is investigated as a potential technology to fulfill the aforementioned requirements. As analytes with sufficient diversity in their physicochemical features, nicotine, N,N-Diethyl-meta-toluamide, and diclofenac were selected as test compounds for the study. The objective was to develop methodologies that would allow repeated non-depletive sampling from 96-well plates, using 100 µL of sample. Initially, thin film-SPME was investigated. Results revealed substantial depletion and consequent disruption in the system. Therefore, new ultra-thin coated fibers were developed. The applicability of this device to the described sampling scenario was tested by determining the protein binding of the analytes. Results showed good agreement with rapid equilibrium dialysis. The presented method allows high-throughput analysis using small volumes, enabling fast reliable free and total concentration determinations without disruption of system equilibrium.
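
    Protein binding is conventionally derived from the free and total concentrations mentioned above as the share of analyte that is not free. The helper below shows that standard calculation only; the function name and the example values in the note are illustrative and are not taken from the study.

```python
def percent_protein_bound(free_conc: float, total_conc: float) -> float:
    """Percent of analyte bound, from free and total concentrations.

    Both concentrations must be in the same units.
    """
    if total_conc <= 0:
        raise ValueError("total concentration must be positive")
    if not 0 <= free_conc <= total_conc:
        raise ValueError("free concentration must lie between 0 and total")
    # Bound fraction is whatever part of the total is not free.
    return 100.0 * (1.0 - free_conc / total_conc)
```

    With a hypothetical free concentration of 2 and a total of 10 in the same units, the helper reports about 80% bound.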

  19. High-Throughput Phenotyping of Human Induced Pluripotent Stem Cell-Derived Cardiomyocytes and Neurons Using Electric Field Stimulation and High-Speed Fluorescence Imaging

    PubMed Central

    Daily, Neil J.; Du, Zhong-Wei

    2017-01-01

    Electrophysiology of excitable cells, including muscle cells and neurons, has been measured by making direct contact with a single cell using a micropipette electrode. To increase the assay throughput, optical devices such as microscopes and microplate readers have been used to analyze electrophysiology of multiple cells. We have established a high-throughput (HTP) analysis of action potentials (APs) in highly enriched motor neurons and cardiomyocytes (CMs) that are differentiated from human induced pluripotent stem cells (iPSCs). A multichannel electric field stimulation (EFS) device enabled the ability to electrically stimulate cells and measure dynamic changes in APs of excitable cells ultra-rapidly (>100 data points per second) by imaging entire 96-well plates. We found that the activities of both neurons and CMs and their response to EFS and chemicals are readily discerned by our fluorescence imaging-based HTP phenotyping assay. The latest generation of calcium (Ca2+) indicator dyes, FLIPR Calcium 6 and Cal-520, with the HTP device enables physiological analysis of human iPSC-derived samples highlighting its potential application for understanding disease mechanisms and discovering new therapeutic treatments. PMID:28525289

  20. The utility of ultra-high performance supercritical fluid chromatography-tandem mass spectrometry (UHPSFC-MS/MS) for clinically relevant steroid analysis.

    PubMed

    Storbeck, Karl-Heinz; Gilligan, Lorna; Jenkinson, Carl; Baranowski, Elizabeth S; Quanson, Jonathan L; Arlt, Wiebke; Taylor, Angela E

    2018-05-15

    Liquid chromatography tandem mass spectrometry (LC-MS/MS) assays are considered the reference standard for serum steroid hormone analyses, while full urinary steroid profiles are only achievable by gas chromatography (GC-MS). Both LC-MS/MS and GC-MS have well documented strengths and limitations. Recently, commercial ultra-high performance supercritical fluid chromatography-tandem mass spectrometry (UHPSFC-MS/MS) systems have been developed. These systems combine the resolution of GC with the high-throughput capabilities of UHPLC. Uptake of this new technology into research and clinical labs has been slow, possibly due to the perceived increase in complexity. Here we therefore present fundamental principles of UHPSFC-MS/MS and the likely applications for this technology in the clinical research setting, while commenting on potential hurdles based on our experience to date. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.

  1. Genome-Wide Identification of Different Dormant Medicago sativa L. MicroRNAs in Response to Fall Dormancy

    PubMed Central

    Du, Hongqi; Sun, Xiaoge; Shi, Yinghua; Wang, Chengzhang

    2014-01-01

    Background: MicroRNAs (miRNAs) are a class of regulatory small RNAs (sRNAs) that regulate gene post-transcriptional expression in plants and animals. High-throughput sequencing technology is capable of identifying small RNAs in plant species. Alfalfa (Medicago sativa L.) is one of the most widely cultivated perennial forage legumes worldwide, and fall dormancy is an adaptive characteristic related to the biomass production and winter survival in alfalfa. Here, we applied high-throughput sRNA sequencing to identify some miRNAs that were responsive to fall dormancy in standard variety (Maverick and CUF101) of alfalfa. Results: Four sRNA libraries were generated and sequenced from alfalfa leaves in two typical varieties at distinct seasons. Through integrative analysis, we identified 51 novel miRNA candidates of 206 families. Additionally, we identified 28 miRNAs associated with fall dormancy in standard variety (Maverick and CUF101), including 20 known miRNAs and eight novel miRNAs. Both high-throughput sequencing and RT-qPCR confirmed that eight known miRNA members were up-regulated and six known miRNA members were down-regulated in response to fall dormancy in standard variety (Maverick and CUF101). Among the 51 novel miRNA candidates, five miRNAs were up-regulated and three miRNAs were down-regulated in response to fall dormancy in standard variety (Maverick and CUF101), and five of them were confirmed by Northern blot analysis. Conclusion: We identified 20 known miRNAs and eight new miRNA candidates that were responsive to fall dormancy in standard variety (Maverick and CUF101) by high-throughput sequencing of small RNAs from Medicago sativa. Our data provide a useful resource for investigating miRNA-mediated regulatory mechanisms of fall dormancy in alfalfa, and these findings are important for our understanding of the roles played by miRNAs in the response of plants to abiotic stress in general and fall dormancy in alfalfa. PMID:25473944

  2. Genome-wide identification of different dormant Medicago sativa L. MicroRNAs in response to fall dormancy.

    PubMed

    Fan, Wenna; Zhang, Senhao; Du, Hongqi; Sun, Xiaoge; Shi, Yinghua; Wang, Chengzhang

    2014-01-01

    MicroRNAs (miRNAs) are a class of regulatory small RNAs (sRNAs) that regulate gene post-transcriptional expression in plants and animals. High-throughput sequencing technology is capable of identifying small RNAs in plant species. Alfalfa (Medicago sativa L.) is one of the most widely cultivated perennial forage legumes worldwide, and fall dormancy is an adaptive characteristic related to the biomass production and winter survival in alfalfa. Here, we applied high-throughput sRNA sequencing to identify some miRNAs that were responsive to fall dormancy in standard variety (Maverick and CUF101) of alfalfa. Four sRNA libraries were generated and sequenced from alfalfa leaves in two typical varieties at distinct seasons. Through integrative analysis, we identified 51 novel miRNA candidates of 206 families. Additionally, we identified 28 miRNAs associated with fall dormancy in standard variety (Maverick and CUF101), including 20 known miRNAs and eight novel miRNAs. Both high-throughput sequencing and RT-qPCR confirmed that eight known miRNA members were up-regulated and six known miRNA members were down-regulated in response to fall dormancy in standard variety (Maverick and CUF101). Among the 51 novel miRNA candidates, five miRNAs were up-regulated and three miRNAs were down-regulated in response to fall dormancy in standard variety (Maverick and CUF101), and five of them were confirmed by Northern blot analysis. We identified 20 known miRNAs and eight new miRNA candidates that were responsive to fall dormancy in standard variety (Maverick and CUF101) by high-throughput sequencing of small RNAs from Medicago sativa. Our data provide a useful resource for investigating miRNA-mediated regulatory mechanisms of fall dormancy in alfalfa, and these findings are important for our understanding of the roles played by miRNAs in the response of plants to abiotic stress in general and fall dormancy in alfalfa.

  3. Comparing Sanger sequencing and high-throughput metabarcoding for inferring photobiont diversity in lichens.

    PubMed

    Paul, Fiona; Otte, Jürgen; Schmitt, Imke; Dal Grande, Francesco

    2018-06-05

    The implementation of HTS (high-throughput sequencing) approaches is rapidly changing our understanding of the lichen symbiosis, by uncovering high bacterial and fungal diversity, which is often host-specific. Recently, HTS methods revealed the presence of multiple photobionts inside a single thallus in several lichen species. This differs from Sanger technology, which typically yields a single, unambiguous algal sequence per individual. Here we compared HTS and Sanger methods for estimating the diversity of green algal symbionts within lichen thalli using 240 lichen individuals belonging to two species of lichen-forming fungi. According to HTS data, Sanger technology consistently yielded the most abundant photobiont sequence in the sample. However, if the second most abundant photobiont exceeded 30% of the total HTS reads in a sample, Sanger sequencing generally failed. Our results suggest that most lichen individuals in the two analyzed species, Lasallia hispanica and L. pustulata, indeed contain a single, predominant green algal photobiont. We conclude that Sanger sequencing is a valid approach to detect the dominant photobionts in lichen individuals and populations. We discuss which research areas in lichen ecology and evolution will continue to benefit from Sanger sequencing, and which areas will profit from HTS approaches to assessing symbiont diversity.
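
    The 30% observation above can be sketched as a simple per-sample check. This is a minimal illustration, not the authors' pipeline: the function name, the dictionary input of read counts per photobiont, and the default threshold parameter are assumptions made for the example.

```python
def summarize_photobionts(read_counts, threshold=0.30):
    """Summarize HTS read counts for one lichen sample.

    read_counts: dict mapping a photobiont label to its HTS read count
                 (hypothetical input format).
    Returns (dominant_label, second_fraction, sanger_at_risk), where
    sanger_at_risk is True when the second most abundant photobiont
    exceeds `threshold` of the total reads.
    """
    total = sum(read_counts.values())
    if total <= 0:
        raise ValueError("sample has no reads")
    # Rank photobionts from most to least abundant.
    ranked = sorted(read_counts.items(), key=lambda kv: kv[1], reverse=True)
    dominant = ranked[0][0]
    second_fraction = ranked[1][1] / total if len(ranked) > 1 else 0.0
    return dominant, second_fraction, second_fraction > threshold
```

    A sample with 800 and 200 reads for two photobionts would not be flagged, whereas one with 600 and 400 reads would be.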

  4. Genome-derived vaccines.

    PubMed

    De Groot, Anne S; Rappuoli, Rino

    2004-02-01

    Vaccine research entered a new era when the complete genome of a pathogenic bacterium was published in 1995. Since then, more than 97 bacterial pathogens have been sequenced and at least 110 additional projects are now in progress. Genome sequencing has also dramatically accelerated: high-throughput facilities can draft the sequence of an entire microbe (two to four megabases) in 1 to 2 days. Vaccine developers are using microarrays, immunoinformatics, proteomics and high-throughput immunology assays to reduce the truly unmanageable volume of information available in genome databases to a manageable size. Vaccines composed of novel antigens discovered from genome mining are already in clinical trials. Within 5 years we can expect to see a novel class of vaccines composed of genome-predicted, assembled and engineered T- and B-cell epitopes. This article addresses the convergence of three forces--microbial genome sequencing, computational immunology and new vaccine technologies--that are shifting genome mining for vaccines onto the forefront of immunology research.

  5. Droplet barcoding for single cell transcriptomics applied to embryonic stem cells

    PubMed Central

    Klein, Allon M; Mazutis, Linas; Akartuna, Ilke; Tallapragada, Naren; Veres, Adrian; Li, Victor; Peshkin, Leonid; Weitz, David A; Kirschner, Marc W

    2015-01-01

    It has long been the dream of biologists to map gene expression at the single cell level. With such data one might track heterogeneous cell sub-populations, and infer regulatory relationships between genes and pathways. Recently, RNA sequencing has achieved single cell resolution. What is limiting is an effective way to routinely isolate and process large numbers of individual cells for quantitative in-depth sequencing. We have developed a high-throughput droplet-microfluidic approach for barcoding the RNA from thousands of individual cells for subsequent analysis by next-generation sequencing. The method shows a surprisingly low noise profile and is readily adaptable to other sequencing-based assays. We analyzed mouse embryonic stem cells, revealing in detail the population structure and the heterogeneous onset of differentiation after LIF withdrawal. The reproducibility of these high-throughput single cell data allowed us to deconstruct cell populations and infer gene expression relationships. PMID:26000487

  6. A High-Throughput Process for the Solid-Phase Purification of Synthetic DNA Sequences

    PubMed Central

    Grajkowski, Andrzej; Cieślak, Jacek; Beaucage, Serge L.

    2017-01-01

    An efficient process for the purification of synthetic phosphorothioate and native DNA sequences is presented. The process is based on the use of an aminopropylated silica gel support functionalized with aminooxyalkyl functions to enable capture of DNA sequences through an oximation reaction with the keto function of a linker conjugated to the 5′-terminus of DNA sequences. Deoxyribonucleoside phosphoramidites carrying this linker, as a 5′-hydroxyl protecting group, have been synthesized for incorporation into DNA sequences during the last coupling step of a standard solid-phase synthesis protocol executed on a controlled pore glass (CPG) support. Solid-phase capture of the nucleobase- and phosphate-deprotected DNA sequences released from the CPG support is demonstrated to proceed near quantitatively. Shorter than full-length DNA sequences are first washed away from the capture support; the solid-phase purified DNA sequences are then released from this support upon reaction with tetra-n-butylammonium fluoride in dry dimethylsulfoxide (DMSO) and precipitated in tetrahydrofuran (THF). The purity of solid-phase-purified DNA sequences exceeds 98%. The simulated high-throughput and scalability features of the solid-phase purification process are demonstrated without sacrificing purity of the DNA sequences. PMID:28628204

  7. Illumina GA IIx & HiSeq 2000 Production Sequencing and QC Analysis Pipelines at the DOE Joint Genome Institute

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Daum, Christopher; Zane, Matthew; Han, James

    2011-01-31

    The U.S. Department of Energy (DOE) Joint Genome Institute's (JGI) Production Sequencing group is committed to the generation of high-quality genomic DNA sequence to support the mission areas of renewable energy generation, global carbon management, and environmental characterization and clean-up. Within the JGI's Production Sequencing group, a robust Illumina Genome Analyzer and HiSeq pipeline has been established. Optimization of the sequencer pipelines has been ongoing with the aim of continual process improvement of the laboratory workflow, reducing operational costs and project cycle times to increase sample throughput, and improving the overall quality of the sequence generated. A sequence QC analysis pipeline has been implemented to automatically generate read and assembly level quality metrics. The foremost of these optimization projects, along with sequencing and operational strategies, throughput numbers, and sequencing quality results will be presented.

  8. High-Throughput Non-Contact Vitrification of Cell-Laden Droplets Based on Cell Printing

    NASA Astrophysics Data System (ADS)

    Shi, Meng; Ling, Kai; Yong, Kar Wey; Li, Yuhui; Feng, Shangsheng; Zhang, Xiaohui; Pingguan-Murphy, Belinda; Lu, Tian Jian; Xu, Feng

    2015-12-01

    Cryopreservation is the most promising way for long-term storage of biological samples e.g., single cells and cellular structures. Among various cryopreservation methods, vitrification is advantageous by employing high cooling rate to avoid the formation of harmful ice crystals in cells. Most existing vitrification methods adopt direct contact of cells with liquid nitrogen to obtain high cooling rates, which however causes the potential contamination and difficult cell collection. To address these limitations, we developed a non-contact vitrification device based on an ultra-thin freezing film to achieve high cooling/warming rate and avoid direct contact between cells and liquid nitrogen. A high-throughput cell printer was employed to rapidly generate uniform cell-laden microdroplets into the device, where the microdroplets were hung on one side of the film and then vitrified by pouring the liquid nitrogen onto the other side via boiling heat transfer. Through theoretical and experimental studies on vitrification processes, we demonstrated that our device offers a high cooling/warming rate for vitrification of the NIH 3T3 cells and human adipose-derived stem cells (hASCs) with maintained cell viability and differentiation potential. This non-contact vitrification device provides a novel and effective way to cryopreserve cells at high throughput and avoid the contamination and collection problems.

  9. High-Throughput Non-Contact Vitrification of Cell-Laden Droplets Based on Cell Printing

    PubMed Central

    Shi, Meng; Ling, Kai; Yong, Kar Wey; Li, Yuhui; Feng, Shangsheng; Zhang, Xiaohui; Pingguan-Murphy, Belinda; Lu, Tian Jian; Xu, Feng

    2015-01-01

    Cryopreservation is the most promising way for long-term storage of biological samples e.g., single cells and cellular structures. Among various cryopreservation methods, vitrification is advantageous by employing high cooling rate to avoid the formation of harmful ice crystals in cells. Most existing vitrification methods adopt direct contact of cells with liquid nitrogen to obtain high cooling rates, which however causes the potential contamination and difficult cell collection. To address these limitations, we developed a non-contact vitrification device based on an ultra-thin freezing film to achieve high cooling/warming rate and avoid direct contact between cells and liquid nitrogen. A high-throughput cell printer was employed to rapidly generate uniform cell-laden microdroplets into the device, where the microdroplets were hung on one side of the film and then vitrified by pouring the liquid nitrogen onto the other side via boiling heat transfer. Through theoretical and experimental studies on vitrification processes, we demonstrated that our device offers a high cooling/warming rate for vitrification of the NIH 3T3 cells and human adipose-derived stem cells (hASCs) with maintained cell viability and differentiation potential. This non-contact vitrification device provides a novel and effective way to cryopreserve cells at high throughput and avoid the contamination and collection problems. PMID:26655688

  10. Future ultra-speed tube-flight

    NASA Astrophysics Data System (ADS)

    Salter, Robert M.

    1994-05-01

    Future long-link, ultra-speed, surface transport systems will require electromagnetically (EM) driven and restrained vehicles operating under reduced-atmosphere in very straight tubes. Such tube-flight trains will be safe, energy conservative, pollution-free, and in a protected environment. Hypersonic (and even hyperballistic) speeds are theoretically achievable. Ultimate system choices will represent tradeoffs between amortized capital costs (ACC) and operating costs. For example, long coasting links might employ aerodynamic lift coupled with EM restraint and drag make-up. Optimized, combined EM lift, and thrust vectors could reduce energy costs but at increased ACC. (Repulsive levitation can produce lift-over-drag l/d ratios a decade greater than aerodynamic.) Alternatively, vehicle-emanated, induced-mirror fields in a conducting (aluminum sheet) road bed could reduce ACC but at substantial energy costs. Ultra-speed tube flight will demand fast-acting, high-precision sensors and computerized magnetic shimming. This same control system can maintain a magnetic 'guide way' invariant in inertial space with inertial detectors imbedded in tube structures to sense and correct for earth tremors. Ultra-speed tube flight can compete with aircraft for transit time and can provide even greater passenger convenience by single-model connections with local subways and feeder lines. Although cargo transport generally will not need to be performed at ultra speeds, such speeds may well be desirable for high throughput to optimize channel costs. Thus, a large and expensive pipeline might be replaced with small EM-driven pallets at high speeds.

  11. Future ultra-speed tube-flight

    NASA Technical Reports Server (NTRS)

    Salter, Robert M.

    1994-01-01

    Future long-link, ultra-speed, surface transport systems will require electromagnetically (EM) driven and restrained vehicles operating under reduced-atmosphere in very straight tubes. Such tube-flight trains will be safe, energy conservative, pollution-free, and in a protected environment. Hypersonic (and even hyperballistic) speeds are theoretically achievable. Ultimate system choices will represent tradeoffs between amortized capital costs (ACC) and operating costs. For example, long coasting links might employ aerodynamic lift coupled with EM restraint and drag make-up. Optimized, combined EM lift, and thrust vectors could reduce energy costs but at increased ACC. (Repulsive levitation can produce lift-over-drag l/d ratios a decade greater than aerodynamic.) Alternatively, vehicle-emanated, induced-mirror fields in a conducting (aluminum sheet) road bed could reduce ACC but at substantial energy costs. Ultra-speed tube flight will demand fast-acting, high-precision sensors and computerized magnetic shimming. This same control system can maintain a magnetic 'guide way' invariant in inertial space with inertial detectors imbedded in tube structures to sense and correct for earth tremors. Ultra-speed tube flight can compete with aircraft for transit time and can provide even greater passenger convenience by single-model connections with local subways and feeder lines. Although cargo transport generally will not need to be performed at ultra speeds, such speeds may well be desirable for high throughput to optimize channel costs. Thus, a large and expensive pipeline might be replaced with small EM-driven pallets at high speeds.

  12. Genotyping-by-sequencing (GBS) revealed molecular genetic diversity of Iranian wheat landraces and cultivars

    USDA-ARS?s Scientific Manuscript database

    Genetic diversity is an essential resource for breeders to improve new cultivars with desirable characteristics. Recently genotyping-by-sequencing (GBS), a next generation sequencing (NGS) based technology that can simplify complex genomes, has been used as a high-throughput and cost-effective molec...

  13. High-Throughput resequencing of maize landraces at genomic regions associated with flowering time

    USDA-ARS?s Scientific Manuscript database

    Despite the reduction in the price of sequencing, it remains expensive to sequence and assemble whole, complex genomes of multiple samples for population studies, particularly for large genomes like those of many crop species. Enrichment of target genome regions coupled with next generation sequenci...

  14. High-Throughput Gene Mapping in Caenorhabditis elegans

    PubMed Central

    Swan, Kathryn A.; Curtis, Damian E.; McKusick, Kathleen B.; Voinov, Alexander V.; Mapa, Felipa A.; Cancilla, Michael R.

    2002-01-01

    Positional cloning of mutations in model genetic systems is a powerful method for the identification of targets of medical and agricultural importance. To facilitate the high-throughput mapping of mutations in Caenorhabditis elegans, we have identified a further 9602 putative new single nucleotide polymorphisms (SNPs) between two C. elegans strains, Bristol N2 and the Hawaiian mapping strain CB4856, by sequencing inserts from a CB4856 genomic DNA library and using an informatics pipeline to compare sequences with the canonical N2 genomic sequence. When combined with data from other laboratories, our marker set of 17,189 SNPs provides even coverage of the complete worm genome. To date, we have confirmed >1099 evenly spaced SNPs (one every 91 ± 56 kb) across the six chromosomes and validated the utility of our SNP marker set and new fluorescence polarization-based genotyping methods for systematic and high-throughput identification of genes in C. elegans by cloning several proprietary genes. We illustrate our approach by recombination mapping and confirmation of the mutation in the cloned gene, dpy-18. [The sequence data described in this paper have been submitted to the NCBI dbSNP data library under accession nos. 4388625–4389689 and GenBank dbSTS under accession nos. 973810–974874. The following individuals and institutions kindly provided reagents, samples, or unpublished information as indicated in the paper: The C. elegans Sequencing Consortium and The Caenorhabditis Genetics Center.] PMID:12097347

  15. Unravelling the complexity of microRNA-mediated gene regulation in black pepper (Piper nigrum L.) using high-throughput small RNA profiling.

    PubMed

    Asha, Srinivasan; Sreekumar, Sweda; Soniya, E V

    2016-01-01

    Analysis of high-throughput small RNA deep sequencing data, in combination with black pepper transcriptome sequences revealed microRNA-mediated gene regulation in black pepper (Piper nigrum L.). Black pepper is an important spice crop and its berries are used worldwide as a natural food additive that contributes unique flavour to foods. In the present study to characterize microRNAs from black pepper, we generated a small RNA library from black pepper leaf and sequenced it by Illumina high-throughput sequencing technology. MicroRNAs belonging to a total of 303 conserved miRNA families were identified from the sRNAome data. Subsequent analysis from recently sequenced black pepper transcriptome confirmed precursor sequences of 50 conserved miRNAs and four potential novel miRNA candidates. Stem-loop qRT-PCR experiments demonstrated differential expression of eight conserved miRNAs in black pepper. Computational analysis of targets of the miRNAs showed 223 potential black pepper unigene targets that encode diverse transcription factors and enzymes involved in plant development, disease resistance, metabolic and signalling pathways. RLM-RACE experiments further mapped miRNA-mediated cleavage at five of the mRNA targets. In addition, miRNA isoforms corresponding to 18 miRNA families were also identified from black pepper. This study presents the first large-scale identification of microRNAs from black pepper and provides the foundation for the future studies of miRNA-mediated gene regulation of stress responses and diverse metabolic processes in black pepper.

  16. LOCATE: a mouse protein subcellular localization database

    PubMed Central

    Fink, J. Lynn; Aturaliya, Rajith N.; Davis, Melissa J.; Zhang, Fasheng; Hanson, Kelly; Teasdale, Melvena S.; Kai, Chikatoshi; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide; Teasdale, Rohan D.

    2006-01-01

    We present here LOCATE, a curated, web-accessible database that houses data describing the membrane organization and subcellular localization of proteins from the FANTOM3 Isoform Protein Sequence set. Membrane organization is predicted by the high-throughput, computational pipeline MemO. The subcellular locations of selected proteins from this set were determined by a high-throughput, immunofluorescence-based assay and by manually reviewing >1700 peer-reviewed publications. LOCATE represents the first effort to catalogue the experimentally verified subcellular location and membrane organization of mammalian proteins using a high-throughput approach and provides localization data for ∼40% of the mouse proteome. It is available at . PMID:16381849

  17. Investigation of rare and low-frequency variants using high-throughput sequencing with pooled DNA samples

    PubMed Central

    Wang, Jingwen; Skoog, Tiina; Einarsdottir, Elisabet; Kaartokallio, Tea; Laivuori, Hannele; Grauers, Anna; Gerdhem, Paul; Hytönen, Marjo; Lohi, Hannes; Kere, Juha; Jiao, Hong

    2016-01-01

    High-throughput sequencing using pooled DNA samples can facilitate genome-wide studies on rare and low-frequency variants in a large population. Some major questions concerning the pooling sequencing strategy are whether rare and low-frequency variants can be detected reliably, and whether estimated minor allele frequencies (MAFs) can represent the actual values obtained from individually genotyped samples. In this study, we evaluated MAF estimates using three variant detection tools with two sets of pooled whole exome sequencing (WES) and one set of pooled whole genome sequencing (WGS) data. Both GATK and Freebayes displayed high sensitivity, specificity and accuracy when detecting rare or low-frequency variants. For the WGS study, 56% of the low-frequency variants in Illumina array have identical MAFs and 26% have one allele difference between sequencing and individual genotyping data. The MAF estimates from WGS correlated well (r = 0.94) with those from Illumina arrays. The MAFs from the pooled WES data also showed high concordance (r = 0.88) with those from the individual genotyping data. In conclusion, the MAFs estimated from pooled DNA sequencing data reflect the MAFs in individually genotyped samples well. The pooling strategy can thus be a rapid and cost-effective approach for the initial screening in large-scale association studies. PMID:27633116
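
    The comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's workflow: the function names are hypothetical, the pool is assumed to contain diploid individuals contributing equal DNA, and the pooled allele frequency is taken as the plain fraction of reads supporting the alternate allele.

```python
from math import sqrt


def pooled_maf(alt_reads, total_reads):
    """Minor allele frequency estimated from pooled read counts."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    af = alt_reads / total_reads
    return min(af, 1.0 - af)


def allele_count_difference(maf_pool, maf_genotyped, n_individuals):
    """Difference in implied minor-allele counts for a diploid pool."""
    chromosomes = 2 * n_individuals
    return abs(round(maf_pool * chromosomes) - round(maf_genotyped * chromosomes))


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of MAFs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(vx * vy)
```

    In this sketch, a variant counts as "identical" when allele_count_difference returns 0 and as "one allele difference" when it returns 1; pearson_r gives the kind of concordance value reported as r in the abstract.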

  18. Novel Method for High-Throughput Full-Length IGHV-D-J Sequencing of the Immune Repertoire from Bulk B-Cells with Single-Cell Resolution.

    PubMed

    Vergani, Stefano; Korsunsky, Ilya; Mazzarello, Andrea Nicola; Ferrer, Gerardo; Chiorazzi, Nicholas; Bagnara, Davide

    2017-01-01

    Efficient and accurate high-throughput DNA sequencing of the adaptive immune receptor repertoire (AIRR) is necessary to study immune diversity in healthy subjects and disease-related conditions. The high complexity and diversity of the AIRR coupled with the limited amount of starting material, which can compromise identification of the full biological diversity, makes such sequencing particularly challenging. AIRR sequencing protocols often fail to fully capture the sampled AIRR diversity, especially for samples containing restricted numbers of B lymphocytes. Here, we describe a library preparation method for immunoglobulin sequencing that results in an exhaustive full-length repertoire where virtually every sampled B-cell is sequenced. This maximizes the likelihood of identifying and quantifying the entire IGHV-D-J repertoire of a sample, including the detection of rearrangements present in only one cell in the starting population. The methodology establishes the importance of circumventing genetic material dilution in the preamplification phases and incorporates the use of certain described concepts: (1) balancing the starting material amount and depth of sequencing, (2) avoiding IGHV gene-specific amplification, and (3) using Unique Molecular Identifiers. Together, this methodology is highly efficient, in particular for detecting rare rearrangements in the sampled population and when only a limited amount of starting material is available.
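
    The role of Unique Molecular Identifiers (UMIs) in point (3) can be sketched as follows. This is a generic illustration, not the authors' protocol: the function name and the (umi, sequence) input format are assumptions, and UMIs are matched exactly with no error correction.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct starting molecules per rearrangement.

    reads: iterable of (umi, sequence) pairs (hypothetical input format).
    Reads sharing both UMI and sequence are treated as PCR copies of one
    molecule, so each sequence is counted once per distinct UMI.
    """
    umis_per_sequence = defaultdict(set)
    for umi, sequence in reads:
        umis_per_sequence[sequence].add(umi)
    return {seq: len(umis) for seq, umis in umis_per_sequence.items()}
```

    A rearrangement seen in many reads that all carry the same UMI is counted as one molecule, which is what allows a rearrangement present in a single cell to be distinguished from amplification noise.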

  19. High-Throughput, Data-Rich Cellular RNA Device Engineering

    PubMed Central

    Townshend, Brent; Kennedy, Andrew B.; Xiang, Joy S.; Smolke, Christina D.

    2015-01-01

    Methods for rapidly assessing sequence-structure-function landscapes and developing conditional gene-regulatory devices are critical to our ability to manipulate and interface with biology. We describe a framework for engineering RNA devices from preexisting aptamers that exhibit ligand-responsive ribozyme tertiary interactions. Our methodology utilizes cell sorting, high-throughput sequencing, and statistical data analyses to enable parallel measurements of the activities of hundreds of thousands of sequences from RNA device libraries in the absence and presence of ligands. Our tertiary interaction RNA devices exhibit improved performance in terms of gene silencing, activation ratio, and ligand sensitivity as compared to optimized RNA devices that rely on secondary structure changes. We apply our method to building biosensors for diverse ligands and determine consensus sequences that enable ligand-responsive tertiary interactions. These methods advance our ability to develop broadly applicable genetic tools and to elucidate understanding of the underlying sequence-structure-function relationships that empower rational design of complex biomolecules. PMID:26258292

  20. Nanowire-nanopore transistor sensor for DNA detection during translocation

    NASA Astrophysics Data System (ADS)

    Xie, Ping; Xiong, Qihua; Fang, Ying; Qing, Quan; Lieber, Charles

    2011-03-01

    Nanopore sequencing, a promising low-cost, high-throughput sequencing technique, was proposed more than a decade ago. Because of the incompatibility between the small ionic current signal and the fast translocation speed, and the technical difficulties of large-scale integration of nanopores for direct ionic current sequencing, alternative methods that rely on integrated DNA sensors have been proposed, such as capacitive coupling or tunnelling current. None of them has yet been experimentally demonstrated. Here we show that, for the first time, an amplified sensor signal has been experimentally recorded from a nanowire-nanopore field effect transistor sensor during DNA translocation. Independent multi-channel recording was also demonstrated for the first time. Our results suggest that the signal arises from a highly localized potential change caused by DNA translocation in a non-balanced buffer condition. Given that this method may produce a larger signal for smaller nanopores, we hope our experiment can be a starting point for a new generation of nanopore sequencing devices with larger signal, higher bandwidth and large-scale multiplexing capability, and finally realize the ultimate goal of low-cost, high-throughput sequencing.

  1. High-throughput assays for DNA gyrase and other topoisomerases

    PubMed Central

    Maxwell, Anthony; Burton, Nicolas P.; O'Hagan, Natasha

    2006-01-01

    We have developed high-throughput microtitre plate-based assays for DNA gyrase and other DNA topoisomerases. These assays exploit the fact that negatively supercoiled plasmids form intermolecular triplexes more efficiently than when they are relaxed. Two assays are presented, one using capture of a plasmid containing a single triplex-forming sequence by an oligonucleotide tethered to the surface of a microtitre plate and subsequent detection by staining with a DNA-specific fluorescent dye. The other uses capture of a plasmid containing two triplex-forming sequences by an oligonucleotide tethered to the surface of a microtitre plate and subsequent detection by a second oligonucleotide that is radiolabelled. The assays are shown to be appropriate for assaying DNA supercoiling by Escherichia coli DNA gyrase and DNA relaxation by eukaryotic topoisomerases I and II, and E.coli topoisomerase IV. The assays are readily adaptable to other enzymes that change DNA supercoiling (e.g. restriction enzymes) and are suitable for use in a high-throughput format. PMID:16936317

  2. High-throughput assays for DNA gyrase and other topoisomerases.

    PubMed

    Maxwell, Anthony; Burton, Nicolas P; O'Hagan, Natasha

    2006-01-01

    We have developed high-throughput microtitre plate-based assays for DNA gyrase and other DNA topoisomerases. These assays exploit the fact that negatively supercoiled plasmids form intermolecular triplexes more efficiently than when they are relaxed. Two assays are presented, one using capture of a plasmid containing a single triplex-forming sequence by an oligonucleotide tethered to the surface of a microtitre plate and subsequent detection by staining with a DNA-specific fluorescent dye. The other uses capture of a plasmid containing two triplex-forming sequences by an oligonucleotide tethered to the surface of a microtitre plate and subsequent detection by a second oligonucleotide that is radiolabelled. The assays are shown to be appropriate for assaying DNA supercoiling by Escherichia coli DNA gyrase and DNA relaxation by eukaryotic topoisomerases I and II, and E.coli topoisomerase IV. The assays are readily adaptable to other enzymes that change DNA supercoiling (e.g. restriction enzymes) and are suitable for use in a high-throughput format.

  3. Combining high-throughput phenotyping and genome-wide association studies to reveal natural genetic variation in rice

    PubMed Central

    Yang, Wanneng; Guo, Zilong; Huang, Chenglong; Duan, Lingfeng; Chen, Guoxing; Jiang, Ni; Fang, Wei; Feng, Hui; Xie, Weibo; Lian, Xingming; Wang, Gongwei; Luo, Qingming; Zhang, Qifa; Liu, Qian; Xiong, Lizhong

    2014-01-01

    Even as the study of plant genomics rapidly develops through the use of high-throughput sequencing techniques, traditional plant phenotyping lags far behind. Here we develop a high-throughput rice phenotyping facility (HRPF) to monitor 13 traditional agronomic traits and 2 newly defined traits during the rice growth period. Using genome-wide association studies (GWAS) of the 15 traits, we identify 141 associated loci, 25 of which contain known genes such as the Green Revolution semi-dwarf gene, SD1. Based on a performance evaluation of the HRPF and GWAS results, we demonstrate that high-throughput phenotyping has the potential to replace traditional phenotyping techniques and can provide valuable gene identification information. The combination of the multifunctional phenotyping tools HRPF and GWAS provides deep insights into the genetic architecture of important traits. PMID:25295980

  4. High-throughput analysis of T-DNA location and structure using sequence capture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Inagaki, Soichi; Henry, Isabelle M.; Lieberman, Meric C.

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA–genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. As a result, our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  5. High-throughput analysis of T-DNA location and structure using sequence capture

    DOE PAGES

    Inagaki, Soichi; Henry, Isabelle M.; Lieberman, Meric C.; ...

    2015-10-07

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA–genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. As a result, our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  6. Human genetics and genomics a decade after the release of the draft sequence of the human genome.

    PubMed

    Naidoo, Nasheen; Pawitan, Yudi; Soong, Richie; Cooper, David N; Ku, Chee-Seng

    2011-10-01

    Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade.

  7. Human genetics and genomics a decade after the release of the draft sequence of the human genome

    PubMed Central

    2011-01-01

    Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade. PMID:22155605

  8. Re-engineering adenovirus vector systems to enable high-throughput analyses of gene function.

    PubMed

    Stanton, Richard J; McSharry, Brian P; Armstrong, Melanie; Tomasec, Peter; Wilkinson, Gavin W G

    2008-12-01

    With the enhanced capacity of bioinformatics to interrogate extensive banks of sequence data, more efficient technologies are needed to test gene function predictions. Replication-deficient recombinant adenovirus (Ad) vectors are widely used in expression analysis since they provide for extremely efficient expression of transgenes in a wide range of cell types. To facilitate rapid, high-throughput generation of recombinant viruses, we have re-engineered an adenovirus vector (designated AdZ) to allow single-step, directional gene insertion using recombineering technology. Recombineering allows for direct insertion into the Ad vector of PCR products, synthesized sequences, or oligonucleotides encoding shRNAs without requirement for a transfer vector. Vectors were optimized for high-throughput applications by making them "self-excising" through incorporating the I-SceI homing endonuclease into the vector, removing the need to linearize vectors prior to transfection into packaging cells. AdZ vectors allow genes to be expressed in their native form or with strep, V5, or GFP tags. Insertion of tetracycline operators downstream of the human cytomegalovirus major immediate early (HCMV MIE) promoter permits silencing of transgenes in helper cells expressing the tet repressor, thus making the vector compatible with the cloning of toxic gene products. The AdZ vector system is robust, straightforward, and suited to both sporadic and high-throughput applications.

  9. PCR Primers to Study the Diversity of Expressed Fungal Genes Encoding Lignocellulolytic Enzymes in Soils Using High-Throughput Sequencing

    PubMed Central

    Barbi, Florian; Bragalini, Claudia; Vallon, Laurent; Prudent, Elsa; Dubost, Audrey; Fraissinet-Tachet, Laurence; Marmeisse, Roland; Luis, Patricia

    2014-01-01

    Plant biomass degradation in soil is one of the key steps of carbon cycling in terrestrial ecosystems. Fungal saprotrophic communities play an essential role in this process by producing hydrolytic enzymes active on the main components of plant organic matter. Open questions in this field regard the diversity of the species involved, the major biochemical pathways implicated and how these are affected by external factors such as litter quality or climate changes. This can be tackled by environmental genomic approaches involving the systematic sequencing of key enzyme-coding gene families using soil-extracted RNA as material. Such an approach necessitates the design and evaluation of gene family-specific PCR primers producing sequence fragments compatible with high-throughput sequencing approaches. In the present study, we developed and evaluated PCR primers for the specific amplification of fungal CAZy Glycoside Hydrolase gene families GH5 (subfamily 5) and GH11 encoding endo-β-1,4-glucanases and endo-β-1,4-xylanases respectively as well as Basidiomycota class II peroxidases, corresponding to the CAZy Auxiliary Activity family 2 (AA2), active on lignin. These primers were experimentally validated using DNA extracted from a wide range of Ascomycota and Basidiomycota species including 27 with sequenced genomes. Along with the published primers for Glycoside Hydrolase GH7 encoding enzymes active on cellulose, the newly designed primers were shown to be compatible with the Illumina MiSeq sequencing technology. Sequences obtained from RNA extracted from beech or spruce forest soils showed a high diversity and were uniformly distributed in gene trees featuring the global diversity of these gene families. This high-throughput sequencing approach using several degenerate primers constitutes a robust method, which allows the simultaneous characterization of the diversity of different fungal transcripts involved in plant organic matter degradation and may lead to the discovery of complex patterns in gene expression of soil fungal communities. PMID:25545363

  10. Discovery of DNA viruses in wild-caught mosquitoes using small RNA high throughput sequencing.

    PubMed

    Ma, Maijuan; Huang, Yong; Gong, Zhengda; Zhuang, Lu; Li, Cun; Yang, Hong; Tong, Yigang; Liu, Wei; Cao, Wuchun

    2011-01-01

    Mosquito-borne infectious diseases pose a severe threat to public health in many areas of the world. Current methods for pathogen detection and surveillance are usually dependent on prior knowledge of the etiologic agents involved. Hence, efficient approaches are required for screening wild mosquito populations to detect known and unknown pathogens. In this study, we explored the use of Next Generation Sequencing to identify viral agents in wild-caught mosquitoes. We extracted total RNA from different mosquito species from South China. Small 18-30 bp length RNA molecules were purified, reverse-transcribed into cDNA and sequenced using Illumina GAIIx instrumentation. Bioinformatic analyses to identify putative viral agents were conducted and the results confirmed by PCR. We identified a non-enveloped single-stranded DNA densovirus in the wild-caught Culex pipiens molestus mosquitoes. The majority of the viral transcripts (>80% of the region) were covered by the small viral RNAs, with a few peaks of very high coverage obtained. The +/- strand sequence ratio of the small RNAs was approximately 7:1, indicating that the molecules were mainly derived from the viral RNA transcripts. The small viral RNAs overlapped, enabling contig assembly of the viral genome sequence. We identified some small RNAs in the reverse repeat regions of the viral 5'- and 3'-untranslated regions where no transcripts were expected. Our results demonstrate for the first time that high throughput sequencing of small RNA is feasible for identifying viral agents in wild-caught mosquitoes. Our results show that it is possible to detect DNA viruses by sequencing the small RNAs obtained from insects, although the underlying mechanism of small viral RNA biogenesis is unclear. Our data and those of other researchers show that high throughput small RNA sequencing can be used for pathogen surveillance in wild mosquito vectors.

  11. Microbial community analysis of the hypersaline water of the Dead Sea using high-throughput amplicon sequencing.

    PubMed

    Jacob, Jacob H; Hussein, Emad I; Shakhatreh, Muhamad Ali K; Cornelison, Christopher T

    2017-10-01

    Amplicon sequencing using next-generation technology (bTEFAP®) has been utilized in describing the diversity of Dead Sea microbiota. The investigated area is a well-known salt lake in the western part of Jordan found in the lowest geographical location in the world (more than 420 m below sea level) and characterized by extreme salinity (approximately 34%) in addition to other extreme conditions (low pH, unique ionic composition different from sea water). DNA was extracted from Dead Sea water. A total of 314,310 small subunit rRNA (SSU rRNA) sequences were parsed, and 288,452 sequences were then clustered. For alpha diversity analysis, the sample was rarefied to 3,000 sequences. The Shannon-Wiener index curve plot reached a plateau at approximately 3,000 sequences, indicating that sequencing depth was sufficient to capture the full scope of microbial diversity. Archaea were found to dominate the sequences (52%), whereas Bacteria constituted 45% of the sequences. Altogether, prokaryotic sequences (which constitute 97% of all sequences) were found to predominate. The findings expand on previous studies by using high-throughput amplicon sequencing to describe the microbial community in an environment which in recent years has been shown to hide some interesting diversity. © 2017 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.
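
    The alpha-diversity step mentioned in this abstract (rarefying a sample to a fixed depth and computing the Shannon-Wiener index) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the random seed, and the taxon counts are hypothetical and chosen only for illustration, and the index is computed with the natural logarithm, which is an assumption.

```python
import math
import random
from collections import Counter


def shannon_index(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def rarefy(counts, depth, seed=0):
    """Subsample `depth` sequences without replacement; return per-taxon counts."""
    # Expand counts into one entry per sequence, labelled by taxon index.
    pool = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    rng = random.Random(seed)
    sampled = Counter(rng.sample(pool, depth))
    return [sampled.get(t, 0) for t in range(len(counts))]


# Hypothetical taxon counts, NOT data from the study.
example_counts = [5200, 4500, 200, 100]
rarefied = rarefy(example_counts, depth=3000)
print(round(shannon_index(rarefied), 3))
```

    Repeating the calculation at increasing depths and plotting H' against depth gives the kind of curve whose plateau the abstract describes.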

  12. Regulatory Efficacy of Brown Seaweed Lessonia nigrescens Extract on the Gene Expression Profile and Intestinal Microflora in Type 2 Diabetic Mice.

    PubMed

    Zhao, Chao; Yang, Chengfeng; Chen, Mingjun; Lv, Xucong; Liu, Bin; Yi, Lunzhao; Cornara, Laura; Wei, Ming-Chi; Yang, Yu-Chiao; Tundis, Rosa; Xiao, Jianbo

    2018-02-01

    In this study, the antidiabetic activity of Lessonia nigrescens ethanolic extract (LNE) is investigated in streptozotocin (SZT)-induced type 2 diabetic mice fed with a high-sucrose/high-fat diet. Ultra high performance liquid chromatography coupled with photo-DAD and electrospray ionization-mass spectrometry (ESI-MS) is employed to analyze the major compounds in LNE. The components of the intestinal microflora in type 2 diabetic mice are analyzed by high-throughput next-generation 16S rRNA gene sequencing. Fasting blood glucose levels in diabetic mice are significantly decreased after LNE administration. The histology reveals that LNE could protect the cellular architecture of liver and kidney. LNE treatment significantly increases Bacteroidetes and decreases Firmicutes populations in intestinal microflora. Specifically, it could selectively enrich the amounts of beneficial bacteria, Barnesiella, as well as reduce the abundances of Clostridium and Alistipes. The increased gene and protein expression levels of phosphatidylinositol 3-kinase (PI3K) in the liver are observed in LNE treatment groups, while the expressions of c-Jun N-terminal kinase (JNK) are significantly downregulated. The above findings suggest that LNE could be considered as a functional food for reducing blood glucose and regulating intestinal microflora. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Evolutionary dynamics of retrotransposons assessed by high-throughput sequencing in wild relatives of wheat.

    PubMed

    Senerchia, Natacha; Wicker, Thomas; Felber, François; Parisod, Christian

    2013-01-01

    Transposable elements (TEs) represent a major fraction of plant genomes and drive their evolution. An improved understanding of genome evolution requires the dynamics of a large number of TE families to be considered. We put forward an approach bypassing the required step of a complete reference genome to assess the evolutionary trajectories of high copy number TE families from genome snapshots obtained with high-throughput sequencing. Low coverage sequencing of the complex genomes of Aegilops cylindrica and Ae. geniculata using 454 identified more than 70% of the sequences as known TEs, mainly long terminal repeat (LTR) retrotransposons. Comparing the abundance of reads as well as patterns of sequence diversity and divergence within and among genomes assessed the dynamics of 44 major LTR retrotransposon families of the 165 identified. In particular, molecular population genetics on individual TE copies distinguished recently active from quiescent families and highlighted different evolutionary trajectories of retrotransposons among related species. This work presents a suite of tools suitable for current sequencing data, making it possible to address the genome-wide evolutionary dynamics of TEs at the family level and advancing our understanding of the evolution of nonmodel genomes.

  14. Characterizing ncRNAs in Human Pathogenic Protists Using High-Throughput Sequencing Technology

    PubMed Central

    Collins, Lesley Joan

    2011-01-01

    ncRNAs are key genes in many human diseases including cancer and viral infection, as well as providing critical functions in pathogenic organisms such as fungi, bacteria, viruses, and protists. Until now the identification and characterization of ncRNAs associated with disease has been slow or inaccurate requiring many years of testing to understand complicated RNA and protein gene relationships. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, small nucleolar RNAs (snoRNAs), and long ncRNAs on a genomic scale, making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not an application of high priority for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how it relates to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, where large evolutionary distance has meant difficulties in comparing ncRNAs with those from model eukaryotes. A combination of biological, computational, and sequencing approaches has enabled easier classification of ncRNA classes such as snoRNAs, but has also aided the identification of novel classes. It is hoped that a higher level of understanding of ncRNA expression and interaction may aid in the development of less harsh treatment for protist-based diseases. PMID:22303390

  15. Dynamic contrast-enhanced breast MRI at 7 Tesla utilizing a single-loop coil: a feasibility trial.

    PubMed

    Umutlu, Lale; Maderwald, Stefan; Kraff, Oliver; Theysohn, Jens M; Kuemmel, Sherko; Hauth, Elke A; Forsting, Michael; Antoch, Gerald; Ladd, Mark E; Quick, Harald H; Lauenstein, Thomas C

    2010-08-01

    The aim of this study was to assess the feasibility of dynamic contrast-enhanced ultra-high-field breast imaging at 7 Tesla. A total of 15 subjects, including 5 patients with histologically proven breast cancer, were examined on a 7 Tesla whole-body magnetic resonance imaging system using a unilateral linearly polarized single-loop coil. Subjects were placed in prone position on a biopsy support system, with the coil placed directly below the region of interest. The examination protocol included the following sequences: 1) T2-weighted turbo spin echo sequence; 2) six dynamic T1-weighted spoiled gradient-echo sequences; and 3) subtraction imaging. Contrast-enhanced T1-weighted imaging at 7 Tesla could be obtained at high spatial resolution with short acquisition times, providing good image accuracy and a conclusively good delineation of small anatomical and pathological structures. T2-weighted imaging could be obtained with high spatial resolution at adequate acquisition times. Because of coil limitations, four high-field magnetic resonance examinations showed decreased diagnostic value. This first scientific approach of dynamic contrast-enhanced breast magnetic resonance imaging at 7 Tesla demonstrates the complexity of ultra-high-field breast magnetic resonance imaging and countenances the implementation of further advanced bilateral coil concepts to circumvent current limitations from the coil and ultra-high-field magnetic strength. 2010 AUR. Published by Elsevier Inc. All rights reserved.

  16. Phased genotyping-by-sequencing enhances analysis of genetic diversity and reveals divergent copy number variants in maize

    USDA-ARS?s Scientific Manuscript database

    High-throughput sequencing of reduced representation genomic libraries has ushered in an era of genotyping-by-sequencing (GBS), where genome-wide genotype data can be obtained for nearly any species. However, there remains a need for imputation-free GBS methods for genotyping large samples taken fr...

  17. OSG-GEM: Gene Expression Matrix Construction Using the Open Science Grid.

    PubMed

    Poehlman, William L; Rynge, Mats; Branton, Chris; Balamurugan, D; Feltus, Frank A

    2016-01-01

    High-throughput DNA sequencing technology has revolutionized the study of gene expression while introducing significant computational challenges for biologists. These computational challenges include access to sufficient computer hardware and functional data processing workflows. Both these challenges are addressed with our scalable, open-source Pegasus workflow for processing high-throughput DNA sequence datasets into a gene expression matrix (GEM) using computational resources available to U.S.-based researchers on the Open Science Grid (OSG). We describe the usage of the workflow (OSG-GEM), discuss workflow design, inspect performance data, and assess accuracy in mapping paired-end sequencing reads to a reference genome. A target OSG-GEM user is proficient with the Linux command line and possesses basic bioinformatics experience. The user may run this workflow directly on the OSG or adapt it to novel computing environments.

  18. OSG-GEM: Gene Expression Matrix Construction Using the Open Science Grid

    PubMed Central

    Poehlman, William L.; Rynge, Mats; Branton, Chris; Balamurugan, D.; Feltus, Frank A.

    2016-01-01

    High-throughput DNA sequencing technology has revolutionized the study of gene expression while introducing significant computational challenges for biologists. These computational challenges include access to sufficient computer hardware and functional data processing workflows. Both these challenges are addressed with our scalable, open-source Pegasus workflow for processing high-throughput DNA sequence datasets into a gene expression matrix (GEM) using computational resources available to U.S.-based researchers on the Open Science Grid (OSG). We describe the usage of the workflow (OSG-GEM), discuss workflow design, inspect performance data, and assess accuracy in mapping paired-end sequencing reads to a reference genome. A target OSG-GEM user is proficient with the Linux command line and possesses basic bioinformatics experience. The user may run this workflow directly on the OSG or adapt it to novel computing environments. PMID:27499617

  19. A Proteomic Workflow Using High-Throughput De Novo Sequencing Towards Complementation of Genome Information for Improved Comparative Crop Science.

    PubMed

    Turetschek, Reinhard; Lyon, David; Desalegn, Getinet; Kaul, Hans-Peter; Wienkoop, Stefanie

    2016-01-01

    The proteomic study of non-model organisms, such as many crop plants, is challenging due to the lack of comprehensive genome information. Changing environmental conditions require the study and selection of adapted cultivars. Mutations, inherent to cultivars, hamper protein identification and thus considerably complicate the qualitative and quantitative comparison in large-scale systems biology approaches. With this workflow, cultivar-specific mutations are detected from high-throughput comparative MS analyses, by extracting sequence polymorphisms with de novo sequencing. Stringent criteria are suggested to filter for high-confidence mutations. Subsequently, these polymorphisms complement the initially used database, which is ready to use with any preferred database search algorithm. In our example, we thereby identified 26 specific mutations in two cultivars of Pisum sativum and achieved an increased number (17%) of peptide spectrum matches.

  20. Identification of antigen-specific human monoclonal antibodies using high-throughput sequencing of the antibody repertoire.

    PubMed

    Liu, Ju; Li, Ruihua; Liu, Kun; Li, Liangliang; Zai, Xiaodong; Chi, Xiangyang; Fu, Ling; Xu, Junjie; Chen, Wei

    2016-04-22

    High-throughput sequencing of the antibody repertoire provides a large number of antibody variable region sequences that can be used to generate human monoclonal antibodies. However, current screening methods for identifying antigen-specific antibodies are inefficient. In the present study, we developed an antibody clone screening strategy based on clone dynamics and relative frequency, and used it to identify antigen-specific human monoclonal antibodies. Enzyme-linked immunosorbent assay showed that at least 52% of putative positive immunoglobulin heavy chains composed antigen-specific antibodies. Combining information on dynamics and relative frequency improved identification of positive clones and elimination of negative clones, and increased the credibility of putative positive clones. Therefore, the screening strategy could simplify the subsequent experimental screening and may facilitate the generation of antigen-specific antibodies. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. Arioc: high-throughput read alignment with GPU-accelerated exploration of the seed-and-extend search space

    PubMed Central

    Budavari, Tamas; Langmead, Ben; Wheelan, Sarah J.; Salzberg, Steven L.; Szalay, Alexander S.

    2015-01-01

    When computing alignments of DNA sequences to a large genome, a key element in achieving high processing throughput is to prioritize locations in the genome where high-scoring mappings might be expected. We formulated this task as a series of list-processing operations that can be efficiently performed on graphics processing unit (GPU) hardware. We followed this approach in implementing a read aligner called Arioc that uses GPU-based parallel sort and reduction techniques to identify high-priority locations where potential alignments may be found. We then carried out a read-by-read comparison of Arioc’s reported alignments with the alignments found by several leading read aligners. With simulated reads, Arioc has comparable or better accuracy than the other read aligners we tested. With human sequencing reads, Arioc demonstrates significantly greater throughput than the other aligners we evaluated across a wide range of sensitivity settings. The Arioc software is available at https://github.com/RWilton/Arioc. It is released under a BSD open-source license. PMID:25780763
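
    The idea of prioritizing candidate genome locations through sort and reduction steps can be sketched on the CPU as follows. This is an illustrative toy only, not Arioc's GPU implementation or its actual scoring: the function name, the input layout, and the heuristic of ranking candidate locations by the number of supporting seed hits are all assumptions made for the example.

```python
from itertools import groupby


def prioritize_candidates(seed_hits):
    """Rank candidate alignment locations for one read.

    seed_hits: list of (read_offset, ref_pos) pairs, one per seed match.
    Returns a list of (diagonal, n_seeds), best-supported location first.
    """
    # Each seed hit implies a candidate start position ("diagonal") on the reference.
    diagonals = sorted(ref_pos - read_offset for read_offset, ref_pos in seed_hits)
    # Reduce: collapse runs of equal diagonals into (diagonal, count) pairs.
    counted = [(d, sum(1 for _ in grp)) for d, grp in groupby(diagonals)]
    # Sort by descending seed support so the extension stage visits
    # the most promising locations first (ties broken by position).
    return sorted(counted, key=lambda dc: (-dc[1], dc[0]))


# Hypothetical seed hits for a single read.
hits = [(0, 1000), (10, 1010), (20, 1020), (5, 52005), (30, 1030), (12, 52012)]
print(prioritize_candidates(hits))
```

    Both the sort and the run-length reduction are operations with well-known data-parallel formulations, which is presumably what makes this kind of list processing a good fit for GPU hardware.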

  2. Tracking single hematopoietic stem cells in vivo using high-throughput sequencing in conjunction with viral genetic barcoding

    PubMed Central

    Lu, Rong; Neff, Norma F.; Quake, Stephen R.; Weissman, Irving L.

    2011-01-01

    Disentangling cellular heterogeneity is a challenge in many fields, particularly in the stem cell and cancer biology fields. Here, we demonstrate how to combine viral genetic barcoding with high-throughput sequencing to track single cells in a heterogeneous population. We use this technique to track the in vivo differentiation of unitary hematopoietic stem cells (HSCs). The results are consistent with single cell transplantation studies, but require two orders of magnitude fewer mice. In addition to its high throughput, the high sensitivity of the technique allows for a direct examination of the clonality of sparse cell populations such as HSCs. We show how these capabilities offer a clonal perspective of the HSC differentiation process. In particular, our data suggests that HSCs do not equally contribute to blood cells after irradiation-mediated transplantation, and that two distinct HSC differentiation patterns co-exist in the same recipient mouse post irradiation. This technique can be applied to any viral accessible cell type for both in vitro and in vivo processes. PMID:21964413
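
    The bookkeeping behind barcode-based clonal tracking can be pictured with a small sketch: count the reads carrying each barcode in a sorted cell population and convert the counts to fractional contributions. This is a simplified illustration under stated assumptions, not the authors' analysis; the barcode strings and population names are hypothetical, and a real pipeline would also need steps such as sequencing-error correction and read-count thresholds, which are omitted here.

```python
from collections import Counter


def clone_fractions(barcode_reads):
    """Return {barcode: fraction of reads} for one sequenced cell population."""
    counts = Counter(barcode_reads)
    total = sum(counts.values())
    return {barcode: n / total for barcode, n in counts.items()}


# Hypothetical barcode reads recovered from two blood cell populations.
populations = {
    "granulocytes": ["ACGT", "ACGT", "ACGT", "TTAG"],
    "B_cells": ["TTAG", "TTAG", "GGCA", "ACGT"],
}

for name, reads in populations.items():
    print(name, clone_fractions(reads))
```

    Comparing the fractions for the same barcode across populations shows whether a given clone contributes evenly or is biased toward particular lineages.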

  3. Process in manufacturing high efficiency AlGaAs/GaAs solar cells by MO-CVD

    NASA Technical Reports Server (NTRS)

    Yeh, Y. C. M.; Chang, K. I.; Tandon, J.

    1984-01-01

    Manufacturing technology for mass producing high efficiency GaAs solar cells is discussed. A process using a high throughput MO-CVD reactor to produce high efficiency GaAs solar cells is discussed. Thickness and doping concentration uniformity of metal-organic chemical vapor deposition (MO-CVD) GaAs and AlGaAs layer growth are discussed. In addition, new tooling designs are given which increase the throughput of solar cell processing. To date, 2 cm x 2 cm AlGaAs/GaAs solar cells with efficiency up to 16.5% were produced. In order to meet throughput goals for mass producing GaAs solar cells, a large MO-CVD system (Cambridge Instrument Model MR-200) with a susceptor which was initially capable of processing 20 wafers (up to 75 mm diameter) during a single growth run was installed. In the MR-200, the sequencing of the gases and the heating power are controlled by a microprocessor-based programmable control console. Hence, operator errors can be reduced, leading to a more reproducible production sequence.

  4. Developing High-Throughput HIV Incidence Assay with Pyrosequencing Platform

    PubMed Central

    Park, Sung Yong; Goeken, Nolan; Lee, Hyo Jin; Bolan, Robert; Dubé, Michael P.

    2014-01-01

    ABSTRACT Human immunodeficiency virus (HIV) incidence is an important measure for monitoring the epidemic and evaluating the efficacy of intervention and prevention trials. This study developed a high-throughput, single-measure incidence assay by implementing a pyrosequencing platform. We devised a signal-masking bioinformatics pipeline, which yielded a process error rate of 5.8 × 10⁻⁴ per base. The pipeline was then applied to analyze 18,434 envelope gene segments (HXB2 7212 to 7601) obtained from 12 incident and 24 chronic patients who had documented HIV-negative and/or -positive tests. The pyrosequencing data were cross-checked by using the single-genome-amplification (SGA) method to independently obtain 302 sequences from 13 patients. Using two genomic biomarkers that probe for the presence of similar sequences, the pyrosequencing platform correctly classified all 12 incident subjects (100% sensitivity) and 23 of 24 chronic subjects (96% specificity). One misclassified subject's chronic infection was correctly classified by conducting the same analysis with SGA data. The biomarkers were statistically associated across the two platforms, suggesting the assay's reproducibility and robustness. Sampling simulations showed that the biomarkers were tolerant of sequencing errors and template resampling, two factors most likely to affect the accuracy of pyrosequencing results. We observed comparable biomarker scores between AIDS and non-AIDS chronic patients (multivariate analysis of variance [MANOVA], P = 0.12), indicating that the stage of HIV disease itself does not affect the classification scheme. The high-throughput genomic HIV incidence assay marks a significant step toward determining incidence from a single measure in cross-sectional surveys. IMPORTANCE Annual HIV incidence, the number of newly infected individuals within a year, is the key measure of monitoring the epidemic's rise and decline. Developing reliable assays differentiating recent from chronic infections has been a long-standing quest in the HIV community. Over the past 15 years, these assays have traditionally measured various HIV-specific antibodies, but recent technological advancements have expanded the diversity of proposed accurate, user-friendly, and financially viable tools. Here we designed a high-throughput genomic HIV incidence assay based on the signature imprinted in the HIV gene sequence population. By combining next-generation sequencing techniques with bioinformatics analysis, we demonstrated that genomic fingerprints are capable of distinguishing recently infected patients from chronically infected patients with high precision. Our high-throughput platform is expected to allow us to process many patients' samples from a single experiment, permitting the assay to be cost-effective for routine surveillance. PMID:24371062
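
    The sensitivity and specificity figures quoted in this abstract follow directly from the reported counts. A minimal sketch of the arithmetic, treating incident infection as the positive class (the function names are illustrative, not taken from the paper):

```python
def sensitivity(true_pos, false_neg):
    """Fraction of incident (recent) infections classified as incident."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of chronic infections classified as chronic."""
    return true_neg / (true_neg + false_pos)

# Counts reported in the abstract: 12 of 12 incident and 23 of 24 chronic subjects.
sens = sensitivity(true_pos=12, false_neg=0)
spec = specificity(true_neg=23, false_pos=1)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```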

  5. An integrated approach to exploit linkage disequilibrium for ultra high dimensional genome-wide data

    USDA-ARS's Scientific Manuscript database

    With the advent of recent DNA sequencing methods (determining molecule order) that quickly produce millions of DNA sequences, variation among sequences in a genome (all the DNA contained in chromosomes of an organism) can be tested for association with traits of economic interest on a relatively lar...

  6. SeqTrim: a high-throughput pipeline for pre-processing any type of sequence read

    PubMed Central

    2010-01-01

    Background High-throughput automated sequencing has enabled an exponential growth rate of sequencing data. This requires increasing sequence quality and reliability in order to avoid database contamination with artefactual sequences. The arrival of pyrosequencing enhances this problem and necessitates customisable pre-processing algorithms. Results SeqTrim has been implemented both as a Web and as a standalone command line application. Already-published and newly-designed algorithms have been included to identify sequence inserts, to remove low quality, vector, adaptor, low complexity and contaminant sequences, and to detect chimeric reads. The availability of several input and output formats allows its inclusion in sequence processing workflows. Due to its specific algorithms, SeqTrim outperforms other pre-processors implemented as Web services or standalone applications. It performs equally well with sequences from EST libraries, SSH libraries, genomic DNA libraries and pyrosequencing reads and does not lead to over-trimming. Conclusions SeqTrim is an efficient pipeline designed for pre-processing of any type of sequence read, including next-generation sequencing. It is easily configurable and provides a friendly interface that allows users to know what happened with sequences at every pre-processing stage, and to verify pre-processing of an individual sequence if desired. The recommended pipeline reveals more information about each sequence than previously described pre-processors and can discard more sequencing or experimental artefacts. PMID:20089148
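
    One pre-processing stage of the kind listed in this abstract (adaptor removal followed by low-quality trimming) can be sketched as follows. This is not SeqTrim's algorithm: the function trim_read, the min_q threshold, and the example read and adaptor are assumptions chosen only to illustrate the idea.

```python
def trim_read(seq, quals, adaptor, min_q=20):
    """Remove an exact adaptor match and trailing low-quality bases.

    seq:     read sequence as a string
    quals:   per-base quality scores, same length as seq
    adaptor: adaptor sequence to search for (exact match only in this sketch)
    """
    # Cut the read at the first exact occurrence of the adaptor, if any.
    cut = seq.find(adaptor)
    if cut != -1:
        seq, quals = seq[:cut], quals[:cut]
    # Trim bases from the 3' end while their quality is below the threshold.
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]

read = "ACGTACGT" + "AGATCGGAAG"          # insert followed by a toy adaptor
scores = [30] * 6 + [12, 8] + [30] * 10  # two low-quality bases before the adaptor
trimmed_seq, trimmed_quals = trim_read(read, scores, "AGATCGGAAG")
print(trimmed_seq)
```

    A production pre-processor would also need inexact adaptor matching, vector and contaminant screening, and chimera detection, which this sketch leaves out.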

  7. Generation of a large volume of clinically relevant nanometre-sized ultra-high-molecular-weight polyethylene wear particles for cell culture studies

    PubMed Central

    Liu, Aiqin; Ingham, Eileen; Fisher, John; Tipper, Joanne L

    2014-01-01

    It has recently been shown that the wear of ultra-high-molecular-weight polyethylene in hip and knee prostheses leads to the generation of nanometre-sized particles, in addition to micron-sized particles. The biological activity of nanometre-sized ultra-high-molecular-weight polyethylene wear particles has not, however, previously been studied due to difficulties in generating sufficient volumes of nanometre-sized ultra-high-molecular-weight polyethylene wear particles suitable for cell culture studies. In this study, wear simulation methods were investigated to generate a large volume of endotoxin-free clinically relevant nanometre-sized ultra-high-molecular-weight polyethylene wear particles. Both single-station and six-station multidirectional pin-on-plate wear simulators were used to generate ultra-high-molecular-weight polyethylene wear particles under sterile and non-sterile conditions. Microbial contamination and endotoxin levels in the lubricants were determined. The results indicated that microbial contamination was absent and endotoxin levels were low and within acceptable limits for the pharmaceutical industry, when a six-station pin-on-plate wear simulator was used to generate ultra-high-molecular-weight polyethylene wear particles in a non-sterile environment. Different pore-sized polycarbonate filters were investigated to isolate nanometre-sized ultra-high-molecular-weight polyethylene wear particles from the wear test lubricants. The use of the filter sequence of 10, 1, 0.1, 0.1 and 0.015 µm pore sizes allowed successful isolation of ultra-high-molecular-weight polyethylene wear particles with a size range of < 100 nm, which was suitable for cell culture studies. PMID:24658586

  8. Generation of a large volume of clinically relevant nanometre-sized ultra-high-molecular-weight polyethylene wear particles for cell culture studies.

    PubMed

    Liu, Aiqin; Ingham, Eileen; Fisher, John; Tipper, Joanne L

    2014-04-01

    It has recently been shown that the wear of ultra-high-molecular-weight polyethylene in hip and knee prostheses leads to the generation of nanometre-sized particles, in addition to micron-sized particles. The biological activity of nanometre-sized ultra-high-molecular-weight polyethylene wear particles has not, however, previously been studied due to difficulties in generating sufficient volumes of nanometre-sized ultra-high-molecular-weight polyethylene wear particles suitable for cell culture studies. In this study, wear simulation methods were investigated to generate a large volume of endotoxin-free clinically relevant nanometre-sized ultra-high-molecular-weight polyethylene wear particles. Both single-station and six-station multidirectional pin-on-plate wear simulators were used to generate ultra-high-molecular-weight polyethylene wear particles under sterile and non-sterile conditions. Microbial contamination and endotoxin levels in the lubricants were determined. The results indicated that microbial contamination was absent and endotoxin levels were low and within acceptable limits for the pharmaceutical industry, when a six-station pin-on-plate wear simulator was used to generate ultra-high-molecular-weight polyethylene wear particles in a non-sterile environment. Different pore-sized polycarbonate filters were investigated to isolate nanometre-sized ultra-high-molecular-weight polyethylene wear particles from the wear test lubricants. The use of the filter sequence of 10, 1, 0.1, 0.1 and 0.015 µm pore sizes allowed successful isolation of ultra-high-molecular-weight polyethylene wear particles with a size range of < 100 nm, which was suitable for cell culture studies.

  9. Network issues for large mass storage requirements

    NASA Technical Reports Server (NTRS)

    Perdue, James

    1992-01-01

    File Servers and Supercomputing environments need high performance networks to balance the I/O requirements seen in today's demanding computing scenarios. UltraNet is one solution which permits both high aggregate transfer rates and high task-to-task transfer rates as demonstrated in actual tests. UltraNet provides this capability as both a Server-to-Server and Server-to-Client access network, giving the supercomputing center the following advantages: highest performance Transport Level connections (to 40 MBytes/sec effective rates); matches the throughput of the emerging high performance disk technologies, such as RAID, parallel head transfer devices and software striping; supports standard network and file system applications using a SOCKET-based application program interface, such as FTP, rcp, rdump, etc.; supports access to the Network File System (NFS) and LARGE aggregate bandwidth for large NFS usage; provides access to a distributed, hierarchical data server capability using the DISCOS UniTree product; supports file server solutions available from multiple vendors, including Cray, Convex, Alliant, FPS, IBM, and others.

  10. A Low-Power High-Speed Smart Sensor Design for Space Exploration Missions

    NASA Technical Reports Server (NTRS)

    Fang, Wai-Chi

    1997-01-01

    A low-power high-speed smart sensor system based on a large format active pixel sensor (APS) integrated with a programmable neural processor for space exploration missions is presented. The concept of building an advanced smart sensing system is demonstrated by a system-level microchip design that is composed of an APS sensor, a programmable neural processor, and an embedded microprocessor in a SOI CMOS technology. This ultra-fast smart sensor system-on-a-chip design mimics what is inherent in biological vision systems. Moreover, it is programmable and capable of performing ultra-fast machine vision processing at all levels, such as image acquisition, image fusion, image analysis, scene interpretation, and control functions. The system provides about one tera-operation-per-second computing power, which is a two-order-of-magnitude increase over that of state-of-the-art microcomputers. Its high performance is due to massively parallel computing structures, high data throughput rates, fast learning capabilities, and advanced VLSI system-on-a-chip implementation.

  11. Mechanical diagnosis of human erythrocytes by ultra-high speed manipulation unraveled critical time window for global cytoskeletal remodeling

    NASA Astrophysics Data System (ADS)

    Ito, Hiroaki; Murakami, Ryo; Sakuma, Shinya; Tsai, Chia-Hung Dylan; Gutsmann, Thomas; Brandenburg, Klaus; Pöschl, Johannes M. B.; Arai, Fumihito; Kaneko, Makoto; Tanaka, Motomu

    2017-02-01

    Large deformability of erythrocytes in microvasculature is a prerequisite to realize smooth circulation. We develop a novel tool for the three-step “Catch-Load-Launch” manipulation of a human erythrocyte based on an ultra-high speed position control by a microfluidic “robotic pump”. Quantification of the erythrocyte shape recovery as a function of loading time uncovered the critical time window for the transition between fast and slow recoveries. The comparison with erythrocytes under depletion of adenosine triphosphate revealed that the cytoskeletal remodeling over a whole cell occurs on a timescale 3 orders of magnitude longer than the local dissociation-reassociation of a single spectrin node. Finally, we modeled septic conditions by incubating erythrocytes with endotoxin, and found that the exposure to endotoxin results in a significant delay in the characteristic transition time for cytoskeletal remodeling. The high speed manipulation of erythrocytes with a robotic pump technique allows for high throughput mechanical diagnosis of blood-related diseases.

  12. RAD sequencing yields a high success rate for westslope cutthroat and rainbow trout species-diagnostic SNP assays

    USGS Publications Warehouse

    Amish, Stephen J.; Hohenlohe, Paul A.; Painter, Sally; Leary, Robb F.; Muhlfeld, Clint C.; Allendorf, Fred W.; Luikart, Gordon

    2012-01-01

    Hybridization with introduced rainbow trout threatens most native westslope cutthroat trout populations. Understanding the genetic effects of hybridization and introgression requires a large set of high-throughput, diagnostic genetic markers to inform conservation and management. Recently, we identified several thousand candidate single-nucleotide polymorphism (SNP) markers based on RAD sequencing of 11 westslope cutthroat trout and 13 rainbow trout individuals. Here, we used flanking sequence for 56 of these candidate SNP markers to design high-throughput genotyping assays. We validated the assays on a total of 92 individuals from 22 populations and seven hatchery strains. Forty-six assays (82%) amplified consistently and allowed easy identification of westslope cutthroat and rainbow trout alleles as well as heterozygote controls. The 46 SNPs will provide high power for early detection of population admixture and improved identification of hybrid and nonhybridized individuals. This technique shows promise as a very low-cost, reliable and relatively rapid method for developing and testing SNP markers for nonmodel organisms with limited genomic resources.

  13. Calculating the quality of public high-throughput sequencing data to obtain a suitable subset for reanalysis from the Sequence Read Archive

    PubMed Central

    Nakazato, Takeru; Bono, Hidemasa

    2017-01-01

    Abstract It is important for public data repositories to promote the reuse of archived data. In the growing field of omics science, however, the increasing number of submissions of high-throughput sequencing (HTSeq) data to public repositories prevents users from choosing a suitable data set from among the large number of search results. Repository users need to be able to set a threshold to reduce the number of results to obtain a suitable subset of high-quality data for reanalysis. We calculated the quality of sequencing data archived in a public data repository, the Sequence Read Archive (SRA), by using the quality control software FastQC. We obtained quality values for 1 171 313 experiments, which can be used to evaluate the suitability of data for reuse. We also visualized the data distribution in SRA by integrating the quality information and metadata of experiments and samples. We provide quality information of all of the archived sequencing data, which enable users to obtain sufficient quality sequencing data for reanalyses. The calculated quality data are available to the public in various formats. Our data also provide an example of enhancing the reuse of public data by adding metadata to published research data by a third party. PMID:28449062
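
    The threshold-based selection this abstract describes can be sketched in a few lines. The experiment identifiers, the quality scores, and the threshold of 30 below are invented placeholders, not values from the SRA or from the authors' data set.

```python
def select_experiments(mean_quality_by_experiment, threshold=30.0):
    """Return the identifiers of experiments whose quality meets the threshold."""
    return sorted(
        accession
        for accession, quality in mean_quality_by_experiment.items()
        if quality >= threshold
    )

# Hypothetical per-experiment quality values, for illustration only.
toy_scores = {"EXP_A": 35.2, "EXP_B": 18.7, "EXP_C": 30.0, "EXP_D": 27.9}
selected = select_experiments(toy_scores)
print(selected)
```

    Raising the threshold shrinks the result set, which is the trade-off a repository user would tune when choosing data for reanalysis.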

  14. Genomic variation in macrophage-cultured European porcine reproductive and respiratory syndrome virus Olot/91 revealed using ultra-deep next generation sequencing.

    PubMed

    Lu, Zen H; Brown, Alexander; Wilson, Alison D; Calvert, Jay G; Balasch, Monica; Fuentes-Utrilla, Pablo; Loecherbach, Julia; Turner, Frances; Talbot, Richard; Archibald, Alan L; Ait-Ali, Tahar

    2014-03-04

    Porcine Reproductive and Respiratory Syndrome (PRRS) is a disease of major economic impact worldwide. The etiologic agent of this disease is the PRRS virus (PRRSV). Increasing evidence suggests that microevolution within a coexisting quasispecies population can give rise to high sequence heterogeneity in PRRSV. We developed a pipeline based on the ultra-deep next generation sequencing approach to first construct the complete genome of a European PRRSV, strain Olot/91, cultured on macrophages and then capture the rare variants representative of the mixed quasispecies population. Olot/91 differs from the reference Lelystad strain by about 5% and a total of 88 variants, with frequencies as low as 1%, were detected in the mixed population. These variants included 16 non-synonymous variants concentrated in the genes encoding structural and nonstructural proteins, including Glycoprotein 2a and 5. Using an ultra-deep sequencing methodology, the complete genome of Olot/91 was constructed without any prior knowledge of the sequence. Rare variants that constitute minor fractions of the heterogeneous PRRSV population could successfully be detected to allow further exploration of microevolutionary events.

  15. Application of Genomic Technologies to the Breeding of Trees

    PubMed Central

    Badenes, Maria L.; Fernández i Martí, Angel; Ríos, Gabino; Rubio-Cabetas, María J.

    2016-01-01

    The recent introduction of next generation sequencing (NGS) technologies represents a major revolution in providing new tools for identifying the genes and/or genomic intervals controlling important traits for selection in breeding programs. In perennial fruit trees with long generation times and large sizes of adult plants, the impact of these techniques is even more important. High-throughput DNA sequencing technologies have provided complete annotated sequences in many important tree species. Most of the high-throughput genotyping platforms described are being used for studies of genetic diversity and population structure. Dissection of complex traits became possible through the availability of genome sequences along with phenotypic variation data, which make it possible to elucidate the causative genetic differences that give rise to observed phenotypic variation. Association mapping facilitates the association between genetic markers and phenotype in unstructured and complex populations, identifying molecular markers for assisted selection and breeding. Also, genomic data provide in silico identification and characterization of genes and gene families related to important traits, enabling new tools for molecular marker assisted selection in tree breeding. Deep sequencing of transcriptomes is also a powerful tool for the analysis of precise expression levels of each gene in a sample. It consists of quantifying short cDNA reads, obtained by NGS technologies, in order to compare the entire transcriptomes between genotypes and environmental conditions. The miRNAs are non-coding short RNAs involved in the regulation of different physiological processes, which can be identified by high-throughput sequencing of RNA libraries obtained by reverse transcription of purified short RNAs, and by in silico comparison with known miRNAs from other species. Altogether, NGS techniques and their applications have increased the resources for plant breeding in tree species, closing the former gap of genetic tools between trees and annual species. PMID:27895664

  16. Using high throughput sequencing to explore the biodiversity in oral bacterial communities.

    PubMed

    Diaz, P I; Dupuy, A K; Abusleme, L; Reese, B; Obergfell, C; Choquette, L; Dongari-Bagtzoglou, A; Peterson, D E; Terzi, E; Strausbaugh, L D

    2012-06-01

    High throughput sequencing of 16S ribosomal RNA gene amplicons is a cost-effective method for characterization of oral bacterial communities. However, before undertaking large-scale studies, it is necessary to understand the technique-associated limitations and intrinsic variability of the oral ecosystem. In this work we evaluated bias in species representation using an in vitro-assembled mock community of oral bacteria. We then characterized the bacterial communities in saliva and buccal mucosa of five healthy subjects to investigate the power of high throughput sequencing in revealing their diversity and biogeography patterns. Mock community analysis showed primer and DNA isolation biases and an overestimation of diversity that was reduced after eliminating singleton operational taxonomic units (OTUs). Sequencing of salivary and mucosal communities found a total of 455 OTUs (0.3% dissimilarity) with only 78 of these present in all subjects. We demonstrate that this variability was partly the result of incomplete richness coverage even at great sequencing depths, and so comparing communities by their structure was more effective than comparisons based solely on membership. With respect to oral biogeography, we found inter-subject variability in community structure was lower than site differences between salivary and mucosal communities within subjects. These differences were evident at very low sequencing depths and were mostly caused by the abundance of Streptococcus mitis and Gemella haemolysans in mucosa. In summary, we present an experimental and data analysis framework that will facilitate design and interpretation of pyrosequencing-based studies. Despite challenges associated with this technique, we demonstrate its power for evaluation of oral diversity and biogeography patterns. © 2012 John Wiley & Sons A/S.
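
    The abstract reports that removing singleton OTUs reduced the overestimation of diversity. A minimal sketch of that filtering step, using invented OTU labels and read counts rather than the study's data:

```python
def drop_singletons(otu_counts):
    """Discard OTUs supported by a single read, a common source of inflated richness."""
    return {otu: n for otu, n in otu_counts.items() if n > 1}

# Hypothetical read counts per OTU for one sample.
toy_counts = {"OTU_1": 120, "OTU_2": 1, "OTU_3": 7, "OTU_4": 1, "OTU_5": 2}
filtered = drop_singletons(toy_counts)

richness_before = len(toy_counts)  # observed richness including singletons
richness_after = len(filtered)     # observed richness after filtering
print(richness_before, richness_after)
```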

  17. Application of Genomic Technologies to the Breeding of Trees.

    PubMed

    Badenes, Maria L; Fernández I Martí, Angel; Ríos, Gabino; Rubio-Cabetas, María J

    2016-01-01

    The recent introduction of next generation sequencing (NGS) technologies represents a major revolution in providing new tools for identifying the genes and/or genomic intervals controlling important traits for selection in breeding programs. In perennial fruit trees with long generation times and large sizes of adult plants, the impact of these techniques is even more important. High-throughput DNA sequencing technologies have provided complete annotated sequences in many important tree species. Most of the high-throughput genotyping platforms described are being used for studies of genetic diversity and population structure. Dissection of complex traits became possible through the availability of genome sequences along with phenotypic variation data, which make it possible to elucidate the causative genetic differences that give rise to observed phenotypic variation. Association mapping facilitates the association between genetic markers and phenotype in unstructured and complex populations, identifying molecular markers for assisted selection and breeding. Also, genomic data provide in silico identification and characterization of genes and gene families related to important traits, enabling new tools for molecular marker assisted selection in tree breeding. Deep sequencing of transcriptomes is also a powerful tool for the analysis of precise expression levels of each gene in a sample. It consists of quantifying short cDNA reads, obtained by NGS technologies, in order to compare the entire transcriptomes between genotypes and environmental conditions. The miRNAs are non-coding short RNAs involved in the regulation of different physiological processes, which can be identified by high-throughput sequencing of RNA libraries obtained by reverse transcription of purified short RNAs, and by in silico comparison with known miRNAs from other species. Altogether, NGS techniques and their applications have increased the resources for plant breeding in tree species, closing the former gap of genetic tools between trees and annual species.

  18. A fluorescence-based thiol quantification assay for ultra-high-throughput screening for inhibitors of coenzyme A production.

    PubMed

    Chung, Christine C; Ohwaki, Kenji; Schneeweis, Jonathan E; Stec, Erica; Varnerin, Jeffrey P; Goudreau, Paul N; Chang, Amy; Cassaday, Jason; Yang, Lihu; Yamakawa, Takeru; Kornienko, Oleg; Hodder, Peter; Inglese, James; Ferrer, Marc; Strulovici, Berta; Kusunoki, Jun; Tota, Michael R; Takagi, Toshimitsu

    2008-06-01

    Here we report the development and miniaturization of a cell-free enzyme assay for ultra-high-throughput screening (uHTS) for inhibitors of two potential drug targets for obesity and cancer: fatty acid synthase (FAS) and acetyl-coenzyme A (CoA) carboxylase (ACC) 2. This assay detects CoA, a product of the FAS-catalyzed condensation of malonyl-CoA and acetyl-CoA. The free thiol of CoA can react with 7-diethylamino-3-(4'-maleimidylphenyl)-4-methylcoumarin (CPM), a profluorescent coumarin maleimide derivative that becomes fluorescent upon reaction with thiols. FAS produces long-chain fatty acid and CoA from the condensation of malonyl-CoA and acetyl-CoA. In our FAS assay, CoA released in the FAS reaction forms a fluorescence adduct with CPM that emits at 530 nm when excited at 405 nm. Using this detection method for CoA, we measured the activity of sequential enzymes in the fatty acid synthesis pathway to develop an ACC2/FAS-coupled assay where ACC2 produces malonyl-CoA from acetyl-CoA. We miniaturized the FAS and ACC2/FAS assays to 3,456- and 1,536-well plate format, respectively, and completed uHTSs for small molecule inhibitors of this enzyme system. This report shows the results of assay development, miniaturization, and inhibitor screening for these potential drug targets.

  19. Diversity and distribution of unicellular opisthokonts along the European coast analyzed using high-throughput sequencing

    PubMed Central

    del Campo, Javier; Mallo, Diego; Massana, Ramon; de Vargas, Colomban; Richards, Thomas A.; Ruiz-Trillo, Iñaki

    2015-01-01

    Summary The opisthokonts are one of the major super-groups of eukaryotes. They comprise two major clades: 1) the Metazoa and their unicellular relatives and 2) the Fungi and their unicellular relatives. There is, however, little knowledge of the role of opisthokont microbes in many natural environments, especially among non-metazoan and non-fungal opisthokonts. Here we begin to address this gap by analyzing high throughput 18S rDNA and 18S rRNA sequencing data from different European coastal sites, sampled at different size fractions and depths. In particular, we analyze the diversity and abundance of choanoflagellates, filastereans, ichthyosporeans, nucleariids, corallochytreans and their related lineages. Our results show the great diversity of choanoflagellates in coastal waters as well as a relevant role of the ichthyosporeans and the uncultured marine opisthokonts (MAOP). Furthermore, we describe a new lineage of marine fonticulids (MAFO) that appears to be abundant in sediments. Therefore, our work points to a greater potential ecological role for unicellular opisthokonts than previously appreciated in marine environments, both in water column and sediments, and also provides evidence of novel opisthokont phylogenetic lineages. This study highlights the importance of high throughput sequencing approaches to unravel the diversity and distribution of both known and novel eukaryotic lineages. PMID:25556908

  20. Networking Omic Data to Envisage Systems Biological Regulation.

    PubMed

    Kalapanulak, Saowalak; Saithong, Treenut; Thammarongtham, Chinae

    To understand how biological processes work, it is necessary to explore the systematic regulation governing the behaviour of the processes. Besides driving the normal behavior of organisms, this systematic regulation evidently underlies the temporal responses to surrounding environments (dynamics) and long-term phenotypic adaptation (evolution). The systematic regulation is, in effect, formulated from regulatory components that work together as a network. In the drive to decipher this code of life, a spectrum of technologies has continuously been developed in the post-genomic era. With current advances, high-throughput sequencing technologies are tremendously powerful for facilitating genomics and systems biology studies in the attempt to understand system regulation inside the cells. The ability to explore relevant regulatory components which infer transcriptional and signaling regulation, driving core cellular processes, is thus enhanced. This chapter reviews high-throughput sequencing technologies, including second and third generation sequencing technologies, which support the investigation of genomics and transcriptomics data. Utilization of this high-throughput data to form the virtual network of systems regulation is explained, particularly transcriptional regulatory networks. Analysis of the resulting regulatory networks could lead to an understanding of cellular systems regulation at the mechanistic and dynamics levels. The great contribution of the biological networking approach to envisage systems regulation is finally demonstrated by a broad range of examples.

  1. High-throughput, Highly Sensitive Analyses of Bacterial Morphogenesis Using Ultra Performance Liquid Chromatography

    PubMed Central

    Desmarais, Samantha M.; Tropini, Carolina; Miguel, Amanda; Cava, Felipe; Monds, Russell D.; de Pedro, Miguel A.; Huang, Kerwyn Casey

    2015-01-01

    The bacterial cell wall is a network of glycan strands cross-linked by short peptides (peptidoglycan); it is responsible for the mechanical integrity of the cell and shape determination. Liquid chromatography can be used to measure the abundance of the muropeptide subunits composing the cell wall. Characteristics such as the degree of cross-linking and average glycan strand length are known to vary across species. However, a systematic comparison among strains of a given species has yet to be undertaken, making it difficult to assess the origins of variability in peptidoglycan composition. We present a protocol for muropeptide analysis using ultra performance liquid chromatography (UPLC) and demonstrate that UPLC achieves resolution comparable with that of HPLC while requiring orders of magnitude less injection volume and a fraction of the elution time. We also developed a software platform to automate the identification and quantification of chromatographic peaks, which we demonstrate has improved accuracy relative to other software. This combined experimental and computational methodology revealed that peptidoglycan composition was approximately maintained across strains from three Gram-negative species despite taxonomical and morphological differences. Peptidoglycan composition and density were maintained after we systematically altered cell size in Escherichia coli using the antibiotic A22, indicating that cell shape is largely decoupled from the biochemistry of peptidoglycan synthesis. High-throughput, sensitive UPLC combined with our automated software for chromatographic analysis will accelerate the discovery of peptidoglycan composition and the molecular mechanisms of cell wall structure determination. PMID:26468288

  2. High-throughput sequencing of TCR repertoires in multiple sclerosis reveals intrathecal enrichment of EBV-reactive CD8+ T cells.

    PubMed

    Lossius, Andreas; Johansen, Jorunn N; Vartdal, Frode; Robins, Harlan; Benth, Jūratė Šaltytė; Holmøy, Trygve; Olweus, Johanna

    2014-11-01

    Epstein-Barr virus (EBV) has long been suggested as a pathogen in multiple sclerosis (MS). Here, we used high-throughput sequencing to determine the diversity, compartmentalization, persistence, and EBV-reactivity of the T-cell receptor (TCR) repertoires in MS. TCR-β genes were sequenced in paired samples of cerebrospinal fluid (CSF) and blood from patients with MS and controls with other inflammatory neurological diseases. The TCR repertoires were highly diverse in both compartments and patient groups. Expanded T-cell clones, represented by TCR-β sequences >0.1%, were of different identity in CSF and blood of MS patients, and persisted for more than a year. Reference TCR-β libraries generated from peripheral blood T cells reactive against autologous EBV-transformed B cells were highly enriched for public EBV-specific sequences and were used to quantify EBV-reactive TCR-β sequences in CSF. TCR-β sequences of EBV-reactive CD8+ T cells, including several public EBV-specific sequences, were intrathecally enriched in MS patients only, whereas those of EBV-reactive CD4+ T cells were also enriched in CSF of controls. These data provide evidence for a clonally diverse, yet compartmentalized and persistent, intrathecal T-cell response in MS. The presented strategy links TCR sequence to intrathecal T-cell specificity, demonstrating enrichment of EBV-reactive CD8+ T cells in MS. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  3. Non-biological synthetic spike-in controls and the AMPtk software pipeline improve mycobiome data

    Treesearch

    Jonathan M. Palmer; Michelle A. Jusino; Mark T. Banik; Daniel L. Lindner

    2018-01-01

    High-throughput amplicon sequencing (HTAS) of conserved DNA regions is a powerful technique to characterize microbial communities. Recently, spike-in mock communities have been used to measure accuracy of sequencing platforms and data analysis pipelines. To assess the ability of sequencing platforms and data processing pipelines using fungal internal transcribed spacer...

  4. A re-sequencing based assessment of genomic heterogeneity and fast neutron-induced deletions in a common bean cultivar

    USDA-ARS's Scientific Manuscript database

    A small fast neutron mutant population has been established from Phaseolus vulgaris cv. Red Hawk. We leveraged the available P. vulgaris genome sequence and high throughput next generation DNA sequencing to examine the genomic structure of five Phaseolus vulgaris cv. Red Hawk fast neutron mutants wi...

  5. A fungal mock community control for amplicon sequencing experiments

    USDA-ARS's Scientific Manuscript database

    The field of microbial ecology has been profoundly advanced by the ability to profile the composition of complex microbial communities by means of high throughput amplicon sequencing of marker genes amplified directly from environmental genomic DNA extracts. However, it has become increasingly clear...

  6. HTSeq--a Python framework to work with high-throughput sequencing data.

    PubMed

    Anders, Simon; Pyl, Paul Theodor; Huber, Wolfgang

    2015-01-15

    A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed. We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. HTSeq is released as an open-source software under the GNU General Public Licence and available from http://www-huber.embl.de/HTSeq or from the Python Package Index at https://pypi.python.org/pypi/HTSeq. © The Author 2014. Published by Oxford University Press.
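    The read-counting idea behind htseq-count can be illustrated with a minimal, self-contained sketch. It is plain Python and does not use or reproduce the HTSeq API; the gene table, the `count_reads` function, and the `no_feature` and `ambiguous` labels are hypothetical names chosen for illustration, and the rule shown (a read is counted only when it overlaps exactly one gene) is a simplifying assumption.

```python
from collections import Counter

# Hypothetical toy annotation: (gene name, chromosome, start, end).
# Coordinates are half-open intervals [start, end).
GENES = [
    ("geneA", "chr1", 100, 200),
    ("geneB", "chr1", 180, 300),
    ("geneC", "chr2", 50, 150),
]


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def count_reads(reads, genes=GENES):
    """Count aligned reads per gene.

    Reads that overlap no gene, or more than one gene, are tallied
    under separate labels instead of being assigned to a gene.
    """
    counts = Counter()
    for chrom, start, end in reads:
        hits = [
            name
            for name, g_chrom, g_start, g_end in genes
            if g_chrom == chrom and overlaps(start, end, g_start, g_end)
        ]
        if not hits:
            counts["no_feature"] += 1
        elif len(hits) > 1:
            counts["ambiguous"] += 1
        else:
            counts[hits[0]] += 1
    return counts


# Hypothetical aligned reads: (chromosome, start, end).
reads = [("chr1", 110, 160), ("chr1", 185, 195), ("chr1", 250, 290), ("chr2", 10, 40)]
print(dict(count_reads(reads)))
```

    The linear scan over genes keeps the sketch short; a real annotation would likely call for an interval index that supports queries by genomic coordinate.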

  7. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells.

    PubMed

    Klein, Allon M; Mazutis, Linas; Akartuna, Ilke; Tallapragada, Naren; Veres, Adrian; Li, Victor; Peshkin, Leonid; Weitz, David A; Kirschner, Marc W

    2015-05-21

    It has long been the dream of biologists to map gene expression at the single-cell level. With such data one might track heterogeneous cell sub-populations, and infer regulatory relationships between genes and pathways. Recently, RNA sequencing has achieved single-cell resolution. What is limiting is an effective way to routinely isolate and process large numbers of individual cells for quantitative in-depth sequencing. We have developed a high-throughput droplet-microfluidic approach for barcoding the RNA from thousands of individual cells for subsequent analysis by next-generation sequencing. The method shows a surprisingly low noise profile and is readily adaptable to other sequencing-based assays. We analyzed mouse embryonic stem cells, revealing in detail the population structure and the heterogeneous onset of differentiation after leukemia inhibitory factor (LIF) withdrawal. The reproducibility of these high-throughput single-cell data allowed us to deconstruct cell populations and infer gene expression relationships. Copyright © 2015 Elsevier Inc. All rights reserved.

  8. Empirical analysis of RNA robustness and evolution using high-throughput sequencing of ribozyme reactions.

    PubMed

    Hayden, Eric J

    2016-08-15

    RNA molecules provide a realistic but tractable model of a genotype to phenotype relationship. This relationship has been extensively investigated computationally using secondary structure prediction algorithms. Enzymatic RNA molecules, or ribozymes, offer access to genotypic and phenotypic information in the laboratory. Advancements in high-throughput sequencing technologies have enabled the analysis of sequences in the lab that now rivals what can be accomplished computationally. This has motivated a resurgence of in vitro selection experiments and opened new doors for the analysis of the distribution of RNA functions in genotype space. A body of computational experiments has investigated the persistence of specific RNA structures despite changes in the primary sequence, and how this mutational robustness can promote adaptations. This article summarizes recent approaches that were designed to investigate the role of mutational robustness during the evolution of RNA molecules in the laboratory, and presents theoretical motivations, experimental methods and approaches to data analysis. Copyright © 2016 Elsevier Inc. All rights reserved.

  9. Viruses associated with Antarctic wildlife: From serology based detection to identification of genomes using high throughput sequencing.

    PubMed

    Smeele, Zoe E; Ainley, David G; Varsani, Arvind

    2018-01-02

    The Antarctic, sub-Antarctic islands and surrounding sea-ice provide a unique environment for the existence of organisms. Nonetheless, birds and seals of a variety of species inhabit them, particularly during their breeding seasons. Early research on Antarctic wildlife health, using serology-based assays, showed exposure to viruses in the families Birnaviridae, Flaviviridae, Herpesviridae, Orthomyxoviridae and Paramyxoviridae circulating in seals (Phocidae), penguins (Spheniscidae), petrels (Procellariidae) and skuas (Stercorariidae). It is only during the last decade or so that polymerase chain reaction-based assays have been used to characterize viruses associated with Antarctic animals. Furthermore, it is only during the last five years that full/whole genomes of viruses (adenoviruses, anelloviruses, orthomyxoviruses, a papillomavirus, paramyxoviruses, polyomaviruses and a togavirus) have been sequenced using Sanger sequencing or high throughput sequencing (HTS) approaches. This review summarizes the knowledge of animal Antarctic virology and discusses potential future directions with the advent of HTS in virus discovery and ecology. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. High-purity circular RNA isolation method (RPAD) reveals vast collection of intronic circRNAs.

    PubMed

    Panda, Amaresh C; De, Supriyo; Grammatikakis, Ioannis; Munk, Rachel; Yang, Xiaoling; Piao, Yulan; Dudekula, Dawood B; Abdelmohsen, Kotb; Gorospe, Myriam

    2017-07-07

    High-throughput RNA sequencing methods coupled with specialized bioinformatic analyses have recently uncovered tens of thousands of unique circular (circ)RNAs, but their complete sequences, genes of origin and functions are largely unknown. Given that circRNAs lack free ends and are thus relatively stable, their association with microRNAs (miRNAs) and RNA-binding proteins (RBPs) can influence gene expression programs. While exoribonuclease treatment is widely used to degrade linear RNAs and enrich circRNAs in RNA samples, it does not efficiently eliminate all linear RNAs. Here, we describe a novel method for the isolation of highly pure circRNA populations involving RNase R treatment followed by Polyadenylation and poly(A)+ RNA Depletion (RPAD), which removes linear RNA to near completion. High-throughput sequencing of RNA prepared using RPAD from human cervical carcinoma HeLa cells and mouse C2C12 myoblasts led to two surprising discoveries: (i) many exonic circRNA (EcircRNA) isoforms share an identical backsplice sequence but have different body sizes and sequences, and (ii) thousands of novel intronic circular RNAs (IcircRNAs) are expressed in cells. In sum, isolating high-purity circRNAs using the RPAD method can enable quantitative and qualitative analyses of circRNA types and sequence composition, paving the way for the elucidation of circRNA functions. Published by Oxford University Press on behalf of Nucleic Acids Research 2017.

  11. High-purity circular RNA isolation method (RPAD) reveals vast collection of intronic circRNAs

    PubMed Central

    De, Supriyo; Grammatikakis, Ioannis; Munk, Rachel; Yang, Xiaoling; Piao, Yulan; Dudekula, Dawood B.; Gorospe, Myriam

    2017-01-01

    High-throughput RNA sequencing methods coupled with specialized bioinformatic analyses have recently uncovered tens of thousands of unique circular (circ)RNAs, but their complete sequences, genes of origin and functions are largely unknown. Given that circRNAs lack free ends and are thus relatively stable, their association with microRNAs (miRNAs) and RNA-binding proteins (RBPs) can influence gene expression programs. While exoribonuclease treatment is widely used to degrade linear RNAs and enrich circRNAs in RNA samples, it does not efficiently eliminate all linear RNAs. Here, we describe a novel method for the isolation of highly pure circRNA populations involving RNase R treatment followed by Polyadenylation and poly(A)+ RNA Depletion (RPAD), which removes linear RNA to near completion. High-throughput sequencing of RNA prepared using RPAD from human cervical carcinoma HeLa cells and mouse C2C12 myoblasts led to two surprising discoveries: (i) many exonic circRNA (EcircRNA) isoforms share an identical backsplice sequence but have different body sizes and sequences, and (ii) thousands of novel intronic circular RNAs (IcircRNAs) are expressed in cells. In sum, isolating high-purity circRNAs using the RPAD method can enable quantitative and qualitative analyses of circRNA types and sequence composition, paving the way for the elucidation of circRNA functions. PMID:28444238

  12. The draft genome sequence of cork oak

    PubMed Central

    Ramos, António Marcos; Usié, Ana; Barbosa, Pedro; Barros, Pedro M.; Capote, Tiago; Chaves, Inês; Simões, Fernanda; Abreu, Isabl; Carrasquinho, Isabel; Faro, Carlos; Guimarães, Joana B.; Mendonça, Diogo; Nóbrega, Filomena; Rodrigues, Leandra; Saibo, Nelson J. M.; Varela, Maria Carolina; Egas, Conceição; Matos, José; Miguel, Célia M.; Oliveira, M. Margarida; Ricardo, Cândido P.; Gonçalves, Sónia

    2018-01-01

    Cork oak (Quercus suber) is native to southwest Europe and northwest Africa where it plays a crucial environmental and economical role. To tackle the cork oak production and industrial challenges, advanced research is imperative but dependent on the availability of a sequenced genome. To address this, we produced the first draft version of the cork oak genome. We followed a de novo assembly strategy based on high-throughput sequence data, which generated a draft genome comprising 23,347 scaffolds and 953.3 Mb in size. A total of 79,752 genes and 83,814 transcripts were predicted, including 33,658 high-confidence genes. An InterPro signature assignment was detected for 69,218 transcripts, which represented 82.6% of the total. Validation studies demonstrated the genome assembly and annotation completeness and highlighted the usefulness of the draft genome for read mapping of high-throughput sequence data generated using different protocols. All data generated is available through the public databases where it was deposited, being therefore ready to use by the academic and industry communities working on cork oak and/or related species. PMID:29786699

  13. The draft genome sequence of cork oak.

    PubMed

    Ramos, António Marcos; Usié, Ana; Barbosa, Pedro; Barros, Pedro M; Capote, Tiago; Chaves, Inês; Simões, Fernanda; Abreu, Isabl; Carrasquinho, Isabel; Faro, Carlos; Guimarães, Joana B; Mendonça, Diogo; Nóbrega, Filomena; Rodrigues, Leandra; Saibo, Nelson J M; Varela, Maria Carolina; Egas, Conceição; Matos, José; Miguel, Célia M; Oliveira, M Margarida; Ricardo, Cândido P; Gonçalves, Sónia

    2018-05-22

    Cork oak (Quercus suber) is native to southwest Europe and northwest Africa where it plays a crucial environmental and economical role. To tackle the cork oak production and industrial challenges, advanced research is imperative but dependent on the availability of a sequenced genome. To address this, we produced the first draft version of the cork oak genome. We followed a de novo assembly strategy based on high-throughput sequence data, which generated a draft genome comprising 23,347 scaffolds and 953.3 Mb in size. A total of 79,752 genes and 83,814 transcripts were predicted, including 33,658 high-confidence genes. An InterPro signature assignment was detected for 69,218 transcripts, which represented 82.6% of the total. Validation studies demonstrated the genome assembly and annotation completeness and highlighted the usefulness of the draft genome for read mapping of high-throughput sequence data generated using different protocols. All data generated is available through the public databases where it was deposited, being therefore ready to use by the academic and industry communities working on cork oak and/or related species.

  14. Using high-throughput barcode sequencing to efficiently map connectomes.

    PubMed

    Peikon, Ian D; Kebschull, Justus M; Vagin, Vasily V; Ravens, Diana I; Sun, Yu-Chi; Brouzes, Eric; Corrêa, Ivan R; Bressan, Dario; Zador, Anthony M

    2017-07-07

    The function of a neural circuit is determined by the details of its synaptic connections. At present, the only available method for determining a neural wiring diagram with single synapse precision (a 'connectome') is based on imaging methods that are slow, labor-intensive and expensive. Here, we present SYNseq, a method for converting the connectome into a form that can exploit the speed and low cost of modern high-throughput DNA sequencing. In SYNseq, each neuron is labeled with a unique random nucleotide sequence (an RNA 'barcode'), which is targeted to the synapse using engineered proteins. Barcodes in pre- and postsynaptic neurons are then associated through protein-protein crosslinking across the synapse, extracted from the tissue, and joined into a form suitable for sequencing. Although our failure to develop an efficient barcode joining scheme precludes the widespread application of this approach, we expect that with further development SYNseq will enable tracing of complex circuits at high speed and low cost. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. SELMAP - SELEX affinity landscape MAPping of transcription factor binding sites using integrated microfluidics

    PubMed Central

    Chen, Dana; Orenstein, Yaron; Golodnitsky, Rada; Pellach, Michal; Avrahami, Dorit; Wachtel, Chaim; Ovadia-Shochat, Avital; Shir-Shapira, Hila; Kedmi, Adi; Juven-Gershon, Tamar; Shamir, Ron; Gerber, Doron

    2016-01-01

    Transcription factors (TFs) alter gene expression in response to changes in the environment through sequence-specific interactions with the DNA. These interactions are best portrayed as a landscape of TF binding affinities. Current methods to study sequence-specific binding preferences suffer from limited dynamic range, sequence bias, lack of specificity and limited throughput. We have developed a microfluidic-based device for SELEX Affinity Landscape MAPping (SELMAP) of TF binding, which allows high-throughput measurement of 16 proteins in parallel. We used it to measure the relative affinities of Pho4, AtERF2 and Btd full-length proteins to millions of different DNA binding sites, and detected both high and low-affinity interactions in equilibrium conditions, generating a comprehensive landscape of the relative TF affinities to all possible DNA 6-mers, and even DNA 10-mers with increased sequencing depth. Low quantities of both the TFs and DNA oligomers were sufficient for obtaining high-quality results, significantly reducing experimental costs. SELMAP allows in-depth screening of hundreds of TFs, and provides a means for better understanding of the regulatory processes that govern gene expression. PMID:27628341

  16. PANGEA: pipeline for analysis of next generation amplicons

    PubMed Central

    Giongo, Adriana; Crabb, David B; Davis-Richardson, Austin G; Chauliac, Diane; Mobberley, Jennifer M; Gano, Kelsey A; Mukherjee, Nabanita; Casella, George; Roesch, Luiz FW; Walts, Brandon; Riva, Alberto; King, Gary; Triplett, Eric W

    2010-01-01

    High-throughput DNA sequencing can identify organisms and describe population structures in many environmental and clinical samples. Current technologies generate millions of reads in a single run, requiring extensive computational strategies to organize, analyze and interpret those sequences. A series of bioinformatics tools for high-throughput sequencing analysis, including preprocessing, clustering, database matching and classification, have been compiled into a pipeline called PANGEA. The PANGEA pipeline was written in Perl and can be run on Mac OSX, Windows or Linux. With PANGEA, sequences obtained directly from the sequencer can be processed quickly to provide the files needed for sequence identification by BLAST and for comparison of microbial communities. Two different sets of bacterial 16S rRNA sequences were used to show the efficiency of this workflow. The first set of 16S rRNA sequences is derived from various soils from Hawaii Volcanoes National Park. The second set is derived from stool samples collected from diabetes-resistant and diabetes-prone rats. The workflow described here allows the investigator to quickly assess libraries of sequences on personal computers with customized databases. PANGEA is provided for users as individual scripts for each step in the process or as a single script where all processes, except the χ² step, are joined into one program called the ‘backbone’. PMID:20182525

  17. PANGEA: pipeline for analysis of next generation amplicons.

    PubMed

    Giongo, Adriana; Crabb, David B; Davis-Richardson, Austin G; Chauliac, Diane; Mobberley, Jennifer M; Gano, Kelsey A; Mukherjee, Nabanita; Casella, George; Roesch, Luiz F W; Walts, Brandon; Riva, Alberto; King, Gary; Triplett, Eric W

    2010-07-01

    High-throughput DNA sequencing can identify organisms and describe population structures in many environmental and clinical samples. Current technologies generate millions of reads in a single run, requiring extensive computational strategies to organize, analyze and interpret those sequences. A series of bioinformatics tools for high-throughput sequencing analysis, including pre-processing, clustering, database matching and classification, have been compiled into a pipeline called PANGEA. The PANGEA pipeline was written in Perl and can be run on Mac OSX, Windows or Linux. With PANGEA, sequences obtained directly from the sequencer can be processed quickly to provide the files needed for sequence identification by BLAST and for comparison of microbial communities. Two different sets of bacterial 16S rRNA sequences were used to show the efficiency of this workflow. The first set of 16S rRNA sequences is derived from various soils from Hawaii Volcanoes National Park. The second set is derived from stool samples collected from diabetes-resistant and diabetes-prone rats. The workflow described here allows the investigator to quickly assess libraries of sequences on personal computers with customized databases. PANGEA is provided for users as individual scripts for each step in the process or as a single script where all processes, except the χ² step, are joined into one program called the 'backbone'.

  18. High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing.

    PubMed

    Lagarde, Julien; Uszczynska-Ratajczak, Barbara; Carbonell, Silvia; Pérez-Lluch, Sílvia; Abad, Amaya; Davis, Carrie; Gingeras, Thomas R; Frankish, Adam; Harrow, Jennifer; Guigo, Roderic; Johnson, Rory

    2017-12-01

    Accurate annotation of genes and their transcripts is a foundation of genomics, but currently no annotation technique combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, and thousands more remain uncataloged, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), which combines targeted RNA capture with third-generation long-read sequencing. Here we present an experimental reannotation of the GENCODE intergenic lncRNA populations in matched human and mouse tissues that resulted in novel transcript models for 3,574 and 561 gene loci, respectively. CLS approximately doubled the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enabled us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a long-standing bottleneck in transcriptome annotation and generates manual-quality full-length transcript models at high-throughput scales.

  19. 'PACLIMS': a component LIM system for high-throughput functional genomic analysis.

    PubMed

    Donofrio, Nicole; Rajagopalon, Ravi; Brown, Douglas; Diener, Stephen; Windham, Donald; Nolin, Shelly; Floyd, Anna; Mitchell, Thomas; Galadima, Natalia; Tucker, Sara; Orbach, Marc J; Patel, Gayatri; Farman, Mark; Pampanwar, Vishal; Soderlund, Cari; Lee, Yong-Hwan; Dean, Ralph A

    2005-04-12

    Recent advances in sequencing techniques leading to cost reduction have resulted in the generation of a growing number of sequenced eukaryotic genomes. Computational tools greatly assist in defining open reading frames and assigning tentative annotations. However, gene functions cannot be asserted without biological support through, among other things, mutational analysis. In taking a genome-wide approach to functionally annotate an entire organism, in this application the approximately 11,000 predicted genes in the rice blast fungus (Magnaporthe grisea), an effective platform for tracking and storing both the biological materials created and the data produced across several participating institutions was required. The platform designed, named PACLIMS, was built to support our high throughput pipeline for generating 50,000 random insertion mutants of Magnaporthe grisea. To be a useful tool for materials and data tracking and storage, PACLIMS was designed to be simple to use, modifiable to accommodate refinement of research protocols, and cost-efficient. Data entry into PACLIMS was simplified through the use of barcodes and scanners, thus reducing the potential human error, time constraints, and labor. This platform was designed in concert with our experimental protocol so that it leads the researchers through each step of the process from mutant generation through phenotypic assays, thus ensuring that every mutant produced is handled in an identical manner and all necessary data is captured. Many sequenced eukaryotes have reached the point where computational analyses are no longer sufficient and require biological support for their predicted genes. Consequently, there is an increasing need for platforms that support high throughput genome-wide mutational analyses. While PACLIMS was designed specifically for this project, the source and ideas present in its implementation can be used as a model for other high throughput mutational endeavors.

  20. 'PACLIMS': A component LIM system for high-throughput functional genomic analysis

    PubMed Central

    Donofrio, Nicole; Rajagopalon, Ravi; Brown, Douglas; Diener, Stephen; Windham, Donald; Nolin, Shelly; Floyd, Anna; Mitchell, Thomas; Galadima, Natalia; Tucker, Sara; Orbach, Marc J; Patel, Gayatri; Farman, Mark; Pampanwar, Vishal; Soderlund, Cari; Lee, Yong-Hwan; Dean, Ralph A

    2005-01-01

    Background: Recent advances in sequencing techniques leading to cost reduction have resulted in the generation of a growing number of sequenced eukaryotic genomes. Computational tools greatly assist in defining open reading frames and assigning tentative annotations. However, gene functions cannot be asserted without biological support through, among other things, mutational analysis. In taking a genome-wide approach to functionally annotate an entire organism, in this application the ~11,000 predicted genes in the rice blast fungus (Magnaporthe grisea), an effective platform for tracking and storing both the biological materials created and the data produced across several participating institutions was required. Results: The platform designed, named PACLIMS, was built to support our high throughput pipeline for generating 50,000 random insertion mutants of Magnaporthe grisea. To be a useful tool for materials and data tracking and storage, PACLIMS was designed to be simple to use, modifiable to accommodate refinement of research protocols, and cost-efficient. Data entry into PACLIMS was simplified through the use of barcodes and scanners, thus reducing the potential human error, time constraints, and labor. This platform was designed in concert with our experimental protocol so that it leads the researchers through each step of the process from mutant generation through phenotypic assays, thus ensuring that every mutant produced is handled in an identical manner and all necessary data is captured. Conclusion: Many sequenced eukaryotes have reached the point where computational analyses are no longer sufficient and require biological support for their predicted genes. Consequently, there is an increasing need for platforms that support high throughput genome-wide mutational analyses. While PACLIMS was designed specifically for this project, the source and ideas present in its implementation can be used as a model for other high throughput mutational endeavors. PMID:15826298

  1. Whole Wiskott‑Aldrich syndrome protein gene deletion identified by high throughput sequencing.

    PubMed

    He, Xiangling; Zou, Runying; Zhang, Bing; You, Yalan; Yang, Yang; Tian, Xin

    2017-11-01

    Wiskott‑Aldrich syndrome (WAS) is a rare X‑linked recessive immunodeficiency disorder, characterized by thrombocytopenia, small platelets, eczema and recurrent infections associated with increased risk of autoimmunity and malignancy disorders. Mutations in the WAS protein (WASP) gene are responsible for WAS. To date, WASP mutations, including missense/nonsense, splicing, small deletions, small insertions, gross deletions, and gross insertions have been identified in patients with WAS. In addition, WASP‑interacting proteins are suspected in patients with clinical features of WAS, in whom the WASP gene sequence and mRNA levels are normal. The present study aimed to investigate the application of next generation sequencing in definitive diagnosis and clinical therapy for WAS. A 5 month‑old child with WAS who displayed symptoms of thrombocytopenia was examined. Whole exome sequence analysis of genomic DNA showed that the coverage and depth of WASP were extremely low. Quantitative polymerase chain reaction indicated total WASP gene deletion in the proband. In conclusion, high throughput sequencing is useful for the verification of WAS on the genetic profile, and has implications for family planning guidance and establishment of clinical programs.

  2. The use of museum specimens with high-throughput DNA sequencers

    PubMed Central

    Burrell, Andrew S.; Disotell, Todd R.; Bergey, Christina M.

    2015-01-01

    Natural history collections have long been used by morphologists, anatomists, and taxonomists to probe the evolutionary process and describe biological diversity. These biological archives also offer great opportunities for genetic research in taxonomy, conservation, systematics, and population biology. They allow assays of past populations, including those of extinct species, giving context to present patterns of genetic variation and direct measures of evolutionary processes. Despite this potential, museum specimens are difficult to work with because natural postmortem processes and preservation methods fragment and damage DNA. These problems have restricted geneticists’ ability to use natural history collections primarily by limiting how much of the genome can be surveyed. Recent advances in DNA sequencing technology, however, have radically changed this, making truly genomic studies from museum specimens possible. We review the opportunities and drawbacks of the use of museum specimens, and suggest how to best execute projects when incorporating such samples. Several high-throughput (HT) sequencing methodologies, including whole genome shotgun sequencing, sequence capture, and restriction digests (demonstrated here), can be used with archived biomaterials. PMID:25532801

  3. Rapid and reliable high-throughput methods of DNA extraction for use in barcoding and molecular systematics of mushrooms.

    PubMed

    Dentinger, Bryn T M; Margaritescu, Simona; Moncalvo, Jean-Marc

    2010-07-01

    We present two methods for DNA extraction from fresh and dried mushrooms that are adaptable to high-throughput sequencing initiatives, such as DNA barcoding. Our results show that these protocols yield ∼85% sequencing success from recently collected materials. Tests with both recent (<2 year) and older (>100 years) specimens reveal that older collections have low success rates and may be an inefficient resource for populating a barcode database. However, our method of extracting DNA from herbarium samples using small amount of tissue is reliable and could be used for important historical specimens. The application of these protocols greatly reduces time, and therefore cost, of generating DNA sequences from mushrooms and other fungi vs. traditional extraction methods. The efficiency of these methods illustrates that standardization and streamlining of sample processing should be shifted from the laboratory to the field. © 2009 Blackwell Publishing Ltd.

  4. High-Throughput Single-Cell RNA Sequencing and Data Analysis.

    PubMed

    Sagar; Herman, Josip Stefan; Pospisilik, John Andrew; Grün, Dominic

    2018-01-01

    Understanding biological systems at single-cell resolution may reveal novel insights that remain masked by conventional population-based techniques, which provide only an average readout of the behavior of cells. Single-cell transcriptome sequencing holds the potential to identify novel cell types and characterize the cellular composition of any organ or tissue in health and disease. Here, we describe a customized high-throughput protocol for single-cell RNA-sequencing (scRNA-seq) combining flow cytometry and a nanoliter-scale robotic system. Since scRNA-seq requires amplification of a low amount of endogenous cellular RNA, leading to substantial technical noise in the dataset, downstream data filtering and analysis require special care. Therefore, we also briefly describe in-house state-of-the-art data analysis algorithms developed to identify cellular subpopulations, including rare cell types, as well as to derive lineage trees by ordering the identified subpopulations of cells along the inferred differentiation trajectories.

  5. Quasispecies variant of pre-S/S gene in HBV-related hepatocellular carcinoma with HBs antigen positive and occult infection.

    PubMed

    Hatazawa, Yuri; Yano, Yoshihiko; Okada, Rina; Tanahashi, Toshihito; Hayashi, Hiroki; Hirano, Hirotaka; Minami, Akihiro; Kawano, Yuki; Tanaka, Motofumi; Fukumoto, Takumi; Murakami, Yoshiki; Yoshida, Masaru; Hayashi, Yoshitake

    2018-01-01

    Hepatocellular carcinoma (HCC) can develop in patients who are negative for the hepatitis B surface antigen (HBsAg) in serum but positive for hepatitis B virus (HBV) DNA in the liver, referred to as occult HBV infection (OBI). Previous reports showed that HBV variants in OBI-related HCC are different from those in HBsAg-positive HCC. In the present study, HBV quasispecies based on the pre-S/S gene in OBI-related HCC patients were examined by high throughput sequencing and compared with those in HBsAg-positive HCC. Nineteen tissue samples (9 OBI-related and 10 HBsAg-positive non-cancerous tissues) were collected at the time of surgery at Kobe University Hospital. The quasispecies with more than 1% variation in the pre-S/S region were isolated and analysed by ultra-deep sequencing. There were no significant differences in the major HBV populations, which exhibit more than 20% variation within the entire pre-S/S region, between OBI-related HCC and HBsAg-positive HCC. However, the prevalences of major populations with pre-S2 region mutations and of minor populations with polymerized human serum albumin-binding domain mutations were significantly higher in OBI-related HCC than in HBsAg-positive HCC. Moreover, the major variant populations associated with the B-cell epitope, located within the pre-S1 region, and the a determinant domain, located in the S region, were detected frequently in HBsAg-positive HCC. The minor populations of variants harbouring the W4R, L30S, Q118R/Stop, N123D and S124F/P mutations in the pre-S region and the L21F/S and L42F/S mutations in the S region were detected more frequently in OBI-related HCC than in HBsAg-positive HCC. Ultra-deep sequencing revealed that the B-cell epitope domain in the pre-S1 region and alpha determinant domain in the S region were variable in HBsAg-positive HCC, although the quasispecies associated with the pre-S2 region were highly prevalent in OBI-related HCC. 
    Ref: R000034382/UMIN000030113; Retrospectively registered 25 November 2017.
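
    The two cut-offs quoted in the abstract above (more than 1% and more than 20%) can be illustrated with a minimal sketch. It assumes the percentages are per-variant read frequencies within a sample, which the abstract does not state explicitly. The function name, the "below_cutoff" label and the example frequencies are hypothetical; only the mutation names and the two thresholds come from the abstract.

```python
from typing import Dict

# Thresholds quoted in the abstract; treating them as read-frequency
# cut-offs is an assumption made for this illustration.
MINOR_THRESHOLD = 0.01   # "more than 1%"
MAJOR_THRESHOLD = 0.20   # "more than 20%"


def classify_variants(freqs: Dict[str, float]) -> Dict[str, str]:
    """Label each variant as 'major', 'minor' or 'below_cutoff' by frequency."""
    labels = {}
    for name, freq in freqs.items():
        if freq > MAJOR_THRESHOLD:
            labels[name] = "major"
        elif freq > MINOR_THRESHOLD:
            labels[name] = "minor"
        else:
            labels[name] = "below_cutoff"
    return labels


# Hypothetical frequencies for three mutations named in the abstract.
example = {"W4R": 0.035, "L30S": 0.27, "N123D": 0.004}
labels = classify_variants(example)
```

    With these made-up inputs, L30S would count as a major population, W4R as a minor one, and N123D would fall below the detection cut-off.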

  6. Epigenetics and Epigenomics of Plants.

    PubMed

    Yadav, Chandra Bhan; Pandey, Garima; Muthamilarasan, Mehanathan; Prasad, Manoj

    2018-01-23

    The genetic material DNA in association with histone proteins forms the complex structure called chromatin, which is prone to undergo modification through certain epigenetic mechanisms including cytosine DNA methylation, histone modifications, and small RNA-mediated methylation. Alterations in chromatin structure lead to inaccessibility of genomic DNA to various regulatory proteins such as transcription factors, which eventually modulates gene expression. Advancements in high-throughput sequencing technologies have provided the opportunity to study the epigenetic mechanisms at genome-wide levels. Epigenomic studies using high-throughput technologies will widen the understanding of mechanisms as well as functions of regulatory pathways in plant genomes, which will further help in manipulating these pathways using genetic and biochemical approaches. This technology could be a potential research tool for displaying the systematic associations of genetic and epigenetic variations, especially in terms of cytosine methylation onto the genomic region in a specific cell or tissue. A comprehensive study of plant populations to correlate genotype to epigenotype and to phenotype, and also the study of methyl quantitative trait loci (QTL) or epiGWAS, is possible by using high-throughput sequencing methods, which will further accelerate molecular breeding programs for crop improvement.

  7. Development and use of molecular markers: past and present.

    PubMed

    Grover, Atul; Sharma, P C

    2016-01-01

    Molecular markers, due to their stability, cost-effectiveness and ease of use, provide an immensely popular tool for a variety of applications including genome mapping, gene tagging, genetic diversity analysis, phylogenetic analysis and forensic investigations. In the last three decades, a number of molecular marker techniques have been developed and exploited worldwide in different systems. However, only a handful of these techniques, namely RFLPs, RAPDs, AFLPs, ISSRs, SSRs and SNPs, have received global acceptance. A recent revolution in DNA sequencing techniques has taken the discovery and application of molecular markers to high-throughput and ultrahigh-throughput levels. Although the choice of marker will obviously depend on the targeted use, microsatellites, SNPs and genotyping by sequencing (GBS) largely fulfill most of the user requirements. Further, modern transcriptomic and functional markers, in combination with other high-throughput techniques, will lead future ventures into high-density genetic map construction, identification of QTLs, and breeding and conservation strategies. This review presents an overview of different marker technologies and their variants with a comparative account of their characteristic features and applications.

  8. Filling reference gaps via assembling DNA barcodes using high-throughput sequencing-moving toward barcoding the world.

    PubMed

    Liu, Shanlin; Yang, Chentao; Zhou, Chengran; Zhou, Xin

    2017-12-01

    Over the past decade, biodiversity researchers have dedicated tremendous efforts to constructing DNA reference barcodes for rapid species registration and identification. Although analytical cost for standard DNA barcoding has been significantly reduced since early 2000, further dramatic reduction in barcoding costs is unlikely because Sanger sequencing is approaching its limits in throughput and chemistry cost. Constraints in barcoding cost not only led to unbalanced barcoding efforts around the globe, but also prevented high-throughput sequencing (HTS)-based taxonomic identification from applying binomial species names, which provide crucial linkages to biological knowledge. We developed an Illumina-based pipeline, HIFI-Barcode, to produce full-length Cytochrome c oxidase subunit I (COI) barcodes from pooled polymerase chain reaction amplicons generated by individual specimens. The new pipeline generated accurate barcode sequences that were comparable to Sanger standards, even for different haplotypes of the same species that differed from each other by only a few nucleotides. Additionally, the new pipeline was much more sensitive in recovering amplicons at low quantity. The HIFI-Barcode pipeline successfully recovered barcodes from more than 78% of the polymerase chain reactions that didn't show clear bands on the electrophoresis gel. Moreover, sequencing results based on the single-molecule sequencing platform PacBio confirmed the accuracy of the HIFI-Barcode results. Altogether, the new pipeline can provide an improved solution to produce full-length reference barcodes at about one-tenth of the current cost, enabling construction of comprehensive barcode libraries for local fauna, leading to a feasible direction for DNA barcoding global biomes. © The Authors 2017. Published by Oxford University Press.

  9. Transcription profile of boar spermatozoa as revealed by RNA-sequencing

    USDA-ARS?s Scientific Manuscript database

    High-throughput RNA sequencing (RNA-Seq) overcomes the limitations of the current hybridization-based techniques to detect the actual pool of RNA transcripts in spermatozoa. The application of this technology in livestock can speed the discovery of potential predictors of male fertility. As a first ...

  10. Biofilm-Growing Bacteria Involved in the Corrosion of Concrete Wastewater Pipes: Protocols for Comparative Metagenomic Analyses

    EPA Science Inventory

    Advances in high-throughput next-generation sequencing (NGS) technology for direct sequencing of environmental DNA (i.e. shotgun metagenomics) are transforming the field of microbiology. NGS technologies are now regularly being applied in comparative metagenomic studies, which pr...

  11. A highly efficient, high-throughput lipidomics platform for the quantitative detection of eicosanoids in human whole blood.

    PubMed

    Song, Jiao; Liu, Xuejun; Wu, Jiejun; Meehan, Michael J; Blevitt, Jonathan M; Dorrestein, Pieter C; Milla, Marcos E

    2013-02-15

    We have developed an ultra-performance liquid chromatography-multiple reaction monitoring/mass spectrometry (UPLC-MRM/MS)-based, high-content, high-throughput platform that enables simultaneous profiling of multiple lipids produced ex vivo in human whole blood (HWB) on treatment with calcium ionophore and its modulation with pharmacological agents. HWB samples were processed in a 96-well plate format compatible with high-throughput sample processing instrumentation. We employed a scheduled MRM (sMRM) method, with a triple-quadrupole mass spectrometer coupled to a UPLC system, to measure absolute amounts of 122 distinct eicosanoids using deuterated internal standards. In a 6.5-min run, we resolved and detected with high sensitivity (lower limit of quantification in the range of 0.4-460 pg) all targeted analytes from a very small HWB sample (2.5 μl). Approximately 90% of the analytes exhibited a dynamic range exceeding 1000. We also developed a tailored software package that dramatically sped up the overall data quantification and analysis process with superior consistency and accuracy. Matrix effects from HWB and precision of the calibration curve were evaluated using this newly developed automation tool. This platform was successfully applied to the global quantification of changes on all 122 eicosanoids in HWB samples from healthy donors in response to calcium ionophore stimulation. Copyright © 2012 Elsevier Inc. All rights reserved.

  12. Eight-Channel AC Magnetosusceptometer of Magnetic Nanoparticles for High-Throughput and Ultra-High-Sensitivity Immunoassay

    PubMed Central

    Chieh, Jen-Jie; Wei, Wen-Chun; Chen, Hsin-Hsein; Lee, Yen-Fu; Lin, Feng-Chun; Chiang, Ming-Hsien; Chiu, Ming-Jang; Horng, Herng-Er; Yang, Shieh-Yueh

    2018-01-01

    An alternating-current magnetosusceptometer of antibody-functionalized magnetic nanoparticles (MNPs) was developed for immunomagnetic reduction (IMR). A high-sensitivity, high-critical-temperature superconducting quantum interference device was used in the magnetosusceptometer. Minute levels of biomarkers of early-stage neurodegenerative diseases were detectable in serum, but measuring each biomarker required approximately 4 h. Hence, an eight-channel platform was developed in this study to fit minimal screening requirements for Alzheimer’s disease. Two consistent results were measured for three biomarkers, namely Aβ40, Aβ42, and tau protein, per human specimen. This paper presents the instrument configuration as well as critical characteristics, such as the low noise level variations among channels, a high signal-to-noise ratio, and the coefficient of variation for the biomarkers’ IMR values. The instrument’s ultrahigh sensitivity levels for the three biomarkers and the substantially shorter total measurement time in comparison with the previous single- and four-channel platforms were also demonstrated in this study. Thus, the eight-channel instrument may serve as a powerful tool for clinical high-throughput screening of Alzheimer’s disease. PMID:29601532

  13. High-Throughput Sequencing and Metagenomics: Moving Forward in the Culture-Independent Analysis of Food Microbial Ecology

    PubMed Central

    2013-01-01

    Following recent trends in environmental microbiology, food microbiology has benefited from the advances in molecular biology and adopted novel strategies to detect, identify, and monitor microbes in food. An in-depth study of the microbial diversity in food can now be achieved by using high-throughput sequencing (HTS) approaches after direct nucleic acid extraction from the sample to be studied. In this review, the workflow of applying culture-independent HTS to food matrices is described. The current scenario and future perspectives of HTS uses to study food microbiota are presented, and the decision-making process leading to the best choice of working conditions to fulfill the specific needs of food research is described. PMID:23475615

  14. Using high-throughput barcode sequencing to efficiently map connectomes

    PubMed Central

    Peikon, Ian D.; Kebschull, Justus M.; Vagin, Vasily V.; Ravens, Diana I.; Sun, Yu-Chi; Brouzes, Eric; Corrêa, Ivan R.; Bressan, Dario

    2017-01-01

    The function of a neural circuit is determined by the details of its synaptic connections. At present, the only available method for determining a neural wiring diagram with single synapse precision (a ‘connectome’) is based on imaging methods that are slow, labor-intensive and expensive. Here, we present SYNseq, a method for converting the connectome into a form that can exploit the speed and low cost of modern high-throughput DNA sequencing. In SYNseq, each neuron is labeled with a unique random nucleotide sequence (an RNA ‘barcode’), which is targeted to the synapse using engineered proteins. Barcodes in pre- and postsynaptic neurons are then associated through protein-protein crosslinking across the synapse, extracted from the tissue, and joined into a form suitable for sequencing. Although our failure to develop an efficient barcode joining scheme precludes the widespread application of this approach, we expect that with further development SYNseq will enable tracing of complex circuits at high speed and low cost. PMID:28449067

  15. RIPiT-Seq: A high-throughput approach for footprinting RNA:protein complexes

    PubMed Central

    Singh, Guramrit; Ricci, Emiliano P.; Moore, Melissa J.

    2013-01-01

    Development of high-throughput approaches to map the RNA interaction sites of individual RNA binding proteins (RBPs) transcriptome-wide is rapidly transforming our understanding of post-transcriptional gene regulatory mechanisms. Here we describe a ribonucleoprotein (RNP) footprinting approach we recently developed for identifying occupancy sites of both individual RBPs and multi-subunit RNP complexes. RNA:protein immunoprecipitation in tandem (RIPiT) yields highly specific RNA footprints of cellular RNPs isolated via two sequential purifications; the resulting RNA footprints can then be identified by high-throughput sequencing (Seq). RIPiT-Seq is broadly applicable to all RBPs regardless of their RNA binding mode and thus provides a means to map the RNA binding sites of RBPs with poor inherent ultraviolet (UV) crosslinkability. Further, among current high-throughput approaches, RIPiT has the unique capacity to differentiate binding sites of RNPs with overlapping protein composition. It is therefore particularly suited for studying dynamic RNP assemblages whose composition evolves as gene expression proceeds. PMID:24096052

  16. Denoising DNA deep sequencing data—high-throughput sequencing errors and their correction

    PubMed Central

    Laehnemann, David; Borkhardt, Arndt

    2016-01-01

    Characterizing the errors generated by common high-throughput sequencing platforms and telling true genetic variation from technical artefacts are two interdependent steps, essential to many analyses such as single nucleotide variant calling, haplotype inference, sequence assembly and evolutionary studies. Both random and systematic errors can show a specific occurrence profile for each of the six prominent sequencing platforms surveyed here: 454 pyrosequencing, Complete Genomics DNA nanoball sequencing, Illumina sequencing by synthesis, Ion Torrent semiconductor sequencing, Pacific Biosciences single-molecule real-time sequencing and Oxford Nanopore sequencing. There is a large variety of programs available for error removal in sequencing read data, which differ in the error models and statistical techniques they use, the features of the data they analyse, the parameters they determine from them and the data structures and algorithms they use. We highlight the assumptions they make and the data types for which these hold, providing guidance on which tools to consider for benchmarking with regard to the data properties. While no benchmarking results are included here, such specific benchmarks would greatly inform tool choices and future software development. The development of stand-alone error correctors, as well as single nucleotide variant and haplotype callers, could also benefit from using more of the knowledge about error profiles and from (re)combining ideas from the existing approaches presented here. PMID:26026159

  17. Ancient pathogen DNA in archaeological samples detected with a Microbial Detection Array.

    PubMed

    Devault, Alison M; McLoughlin, Kevin; Jaing, Crystal; Gardner, Shea; Porter, Teresita M; Enk, Jacob M; Thissen, James; Allen, Jonathan; Borucki, Monica; DeWitte, Sharon N; Dhody, Anna N; Poinar, Hendrik N

    2014-03-06

    Ancient human remains of paleopathological interest typically contain highly degraded DNA in which pathogenic taxa are often minority components, making sequence-based metagenomic characterization costly. Microarrays may hold a potential solution to these challenges, offering a rapid, affordable, and highly informative snapshot of microbial diversity in complex samples without the lengthy analysis and/or high cost associated with high-throughput sequencing. Their versatility is well established for modern clinical specimens, but they have yet to be applied to ancient remains. Here we report bacterial profiles of archaeological and historical human remains using the Lawrence Livermore Microbial Detection Array (LLMDA). The array successfully identified previously-verified bacterial human pathogens, including Vibrio cholerae (cholera) in a 19th century intestinal specimen and Yersinia pestis ("Black Death" plague) in a medieval tooth, which represented only minute fractions (0.03% and 0.08% alignable high-throughput shotgun sequencing reads) of their respective DNA content. This demonstrates that the LLMDA can identify primary and/or co-infecting bacterial pathogens in ancient samples, thereby serving as a rapid and inexpensive paleopathological screening tool to study health across both space and time.

  18. BarraCUDA - a fast short read sequence aligner using graphics processing units

    PubMed Central

    2012-01-01

    Background: With the maturation of next-generation DNA sequencing (NGS) technologies, the throughput of DNA sequencing reads has soared to over 600 gigabases from a single instrument run. General-purpose computing on graphics processing units (GPGPU) extracts the computing power from hundreds of parallel stream processors within graphics processing cores and provides a cost-effective and energy-efficient alternative to traditional high-performance computing (HPC) clusters. In this article, we describe the implementation of BarraCUDA, a GPGPU sequence alignment software that is based on BWA, to accelerate the alignment of sequencing reads generated by these instruments to a reference DNA sequence. Findings: Using the NVIDIA Compute Unified Device Architecture (CUDA) software development environment, we ported the most computationally intensive alignment component of BWA to GPU to take advantage of the massive parallelism. As a result, BarraCUDA offers an order-of-magnitude performance boost in alignment throughput when compared to a CPU core while delivering the same level of alignment fidelity. The software is also capable of supporting multiple CUDA devices in parallel to further accelerate the alignment throughput. Conclusions: BarraCUDA is designed to take advantage of the parallelism of GPU to accelerate the alignment of millions of sequencing reads generated by NGS instruments. By doing this, we could, at least in part, streamline the current bioinformatics pipeline so that the wider scientific community could benefit from the sequencing technology. BarraCUDA is currently available from http://seqbarracuda.sf.net PMID:22244497
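
    BWA, on which BarraCUDA is based, locates reads using a Burrows-Wheeler transform index. The sketch below shows only the core idea, exact-match backward search, in plain Python on the CPU. It is not BarraCUDA's or BWA's code: the function names are hypothetical, the index is built naively by sorting suffixes, and neither inexact matching nor GPU parallelism is modelled.

```python
from typing import List, Tuple


def bwt_index(text: str) -> Tuple[str, List[int]]:
    """Build the BWT and suffix array of text (naive suffix sort, demo only)."""
    text += "$"  # sentinel, lexicographically smaller than A/C/G/T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # character preceding each suffix
    return bwt, sa


def backward_search(bwt: str, sa: List[int], pattern: str) -> List[int]:
    """Return sorted start positions of exact occurrences of pattern."""
    # first_row[c] = number of characters in the text strictly smaller than c
    counts = {}
    for ch in bwt:
        counts[ch] = counts.get(ch, 0) + 1
    first_row, total = {}, 0
    for ch in sorted(counts):
        first_row[ch] = total
        total += counts[ch]

    lo, hi = 0, len(bwt)  # half-open interval of matching suffix-array rows
    for ch in reversed(pattern):  # extend the match one character to the left
        if ch not in first_row:
            return []
        # bwt[:k].count(ch) is a slow stand-in for a precomputed occurrence table
        lo = first_row[ch] + bwt[:lo].count(ch)
        hi = first_row[ch] + bwt[:hi].count(ch)
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


bwt, sa = bwt_index("ACGTACGTTACG")
hits = backward_search(bwt, sa, "ACG")
```

    Each read is searched independently of every other read, which is the property that makes this kind of alignment a natural fit for the massive parallelism the abstract describes.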

  19. Durable silver thin film coating for diffraction gratings

    DOEpatents

    Wolfe, Jesse D [Discovery Bay, CA; Britten, Jerald A [Oakley, CA; Komashko, Aleksey M [San Diego, CA

    2006-05-30

    A durable silver thin-film-coated non-planar optical element has been developed to replace gold as a material for fabricating such devices. Such a coating and the resultant optical element have increased efficiency and are resistant to tarnishing, can be easily stripped and re-deposited without modifying the underlying grating structure, improve the throughput and power loading of short-pulse compressor designs for ultra-fast laser systems, and can be utilized in a variety of optical and spectrophotometric systems, particularly high-end spectrometers that require maximized efficiency.

  20. An efficient interpolation filter VLSI architecture for HEVC standard

    NASA Astrophysics Data System (ADS)

    Zhou, Wei; Zhou, Xin; Lian, Xiaocong; Liu, Zhenyu; Liu, Xiaoxiang

    2015-12-01

    The next-generation video coding standard, High-Efficiency Video Coding (HEVC), is especially efficient for coding high-resolution video such as 8K ultra-high-definition (UHD) video. Fractional motion estimation in HEVC presents a significant challenge in clock latency and area cost, as it consumes more than 40% of the total encoding time and thus results in high computational complexity. To support 8K-UHD video applications, an efficient interpolation filter VLSI architecture for HEVC is proposed in this paper. First, a new interpolation filter algorithm based on the 8-pixel interpolation unit is proposed. It saves 19.7% of processing time on average with acceptable coding quality degradation. Based on the proposed algorithm, an efficient interpolation filter VLSI architecture, composed of a reused interpolation data path, an efficient memory organization, and a reconfigurable pipeline interpolation filter engine, is presented to reduce the hardware implementation area and achieve high throughput. The final VLSI implementation requires only 37.2k gates in a standard 90-nm CMOS technology at an operating frequency of 240 MHz. The proposed architecture can be reused for either half-pixel or quarter-pixel interpolation, which reduces the area cost by about 131,040 bits of RAM. The processing latency of the proposed VLSI architecture supports real-time processing of 4:2:0 format 7680 × 4320@78fps video sequences.
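
    The throughput figures quoted above can be put in perspective with a short back-of-the-envelope calculation. This is a sketch of the arithmetic only, not a model of the proposed architecture: the function names are hypothetical, and the only inputs are the resolution, frame rate, chroma format and clock frequency stated in the abstract. In 4:2:0 video each of the two chroma planes has one quarter of the luma resolution, so a frame carries 1.5 times as many samples as its luma plane.

```python
WIDTH, HEIGHT, FPS = 7680, 4320, 78   # from the abstract
CLOCK_HZ = 240_000_000                # 240 MHz operating frequency


def luma_samples_per_second(width: int, height: int, fps: int) -> int:
    """Luma samples that must be handled each second for real-time video."""
    return width * height * fps


def total_samples_per_second_420(width: int, height: int, fps: int) -> int:
    """Luma plus chroma samples per second for 4:2:0 sampling (1.5x luma)."""
    return luma_samples_per_second(width, height, fps) * 3 // 2


luma_rate = luma_samples_per_second(WIDTH, HEIGHT, FPS)
total_rate = total_samples_per_second_420(WIDTH, HEIGHT, FPS)

# Average number of samples the data path must cover per clock cycle.
luma_per_cycle = luma_rate / CLOCK_HZ
total_per_cycle = total_rate / CLOCK_HZ

print(f"luma samples/s:  {luma_rate:,}")
print(f"total samples/s: {total_rate:,}")
print(f"luma samples per cycle:  {luma_per_cycle:.2f}")
print(f"total samples per cycle: {total_per_cycle:.2f}")
```

    Under these figures the luma plane alone amounts to roughly 2.6 billion samples per second, or on average close to eleven luma samples per clock cycle at 240 MHz, which suggests why a wide, parallel interpolation unit is needed.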

  1. Commercial aspects of epitaxial thin film growth in outer space

    NASA Technical Reports Server (NTRS)

    Ignatiev, Alex; Chu, C. W.

    1988-01-01

    A new concept for materials processing in space exploits the ultra vacuum component of space for thin film epitaxial growth. The unique low earth orbit space environment is expected to yield pressures of 10^-14 torr or better, semi-infinite pumping speeds, and a large ultra vacuum volume (about 100 m^3) without walls. These space ultra vacuum properties promise major improvements in the quality, unique nature, and throughput of epitaxially grown materials, especially semiconductors for microelectronics use. A very large value added is expected for such thin film materials from space ultra vacuum processing, and as a result the application of epitaxial thin film growth technology to space could lead to major commercial efforts in space.

  2. High-Throughput SNP Discovery through Deep Resequencing of a Reduced Representation Library to Anchor and Orient Scaffolds in the Soybean Whole Genome Sequence

    USDA-ARS?s Scientific Manuscript database

    The soybean Consensus Map 4.0 facilitated the anchoring of 95.6% of the soybean whole genome sequence developed by the Joint Genome Institute, Department of Energy but only properly oriented 66% of the sequence scaffolds. To find additional single nucleotide polymorphism (SNP) markers for additiona...

  3. Comparison of Burrows-Wheeler transform-based mapping algorithms used in high-throughput whole-genome sequencing: application to Illumina data for livestock genomes

    USDA-ARS?s Scientific Manuscript database

    Ongoing developments and cost decreases in next-generation sequencing (NGS) technologies have led to an increase in their application, which has greatly enhanced the fields of genetics and genomics. Mapping sequence reads onto a reference genome is a fundamental step in the analysis of NGS data. Eff...

  4. PipeCraft: Flexible open-source toolkit for bioinformatics analysis of custom high-throughput amplicon sequencing data.

    PubMed

    Anslan, Sten; Bahram, Mohammad; Hiiesalu, Indrek; Tedersoo, Leho

    2017-11-01

    High-throughput sequencing methods have become a routine analysis tool in environmental sciences as well as in the public and private sectors. These methods provide vast amounts of data, which need to be analysed in several steps. Although the bioinformatics may be applied using several public tools, many analytical pipelines allow too few options for the optimal analysis of more complicated or customized designs. Here, we introduce PipeCraft, a flexible and handy bioinformatics pipeline with a user-friendly graphical interface that links several public tools for analysing amplicon sequencing data. Users are able to customize the pipeline by selecting the most suitable tools and options to process raw sequences from Illumina, Pacific Biosciences, Ion Torrent and Roche 454 sequencing platforms. We describe the design and options of PipeCraft and evaluate its performance by analysing data sets from three different sequencing platforms. We demonstrate that PipeCraft is able to process large data sets within 24 hr. The graphical user interface and the automated links between various bioinformatics tools enable easy customization of the workflow. All analytical steps and options are recorded in log files and are easily traceable. © 2017 John Wiley & Sons Ltd.

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Davenport, Karen

    Karen Davenport of Los Alamos National Laboratory discusses a high-throughput next generation genome finishing pipeline on June 3, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

  6. Single Nucleobase Identification Using Biophysical Signatures from Nanoelectronic Quantum Tunneling.

    PubMed

    Korshoj, Lee E; Afsari, Sepideh; Khan, Sajida; Chatterjee, Anushree; Nagpal, Prashant

    2017-03-01

    Nanoelectronic DNA sequencing can provide an important alternative to sequencing-by-synthesis by reducing sample preparation time, cost, and complexity as a high-throughput next-generation technique with accurate single-molecule identification. However, sample noise and signature overlap continue to prevent high-resolution and accurate sequencing results. Probing the molecular orbitals of chemically distinct DNA nucleobases offers a path for facile sequence identification, but molecular entropy (from nucleotide conformations) makes such identification difficult when relying only on the energies of lowest-unoccupied and highest-occupied molecular orbitals (LUMO and HOMO). Here, nine biophysical parameters are developed to better characterize molecular orbitals of individual nucleobases, intended for single-molecule DNA sequencing using quantum tunneling of charges. For this analysis, theoretical models for quantum tunneling are combined with transition voltage spectroscopy to obtain measurable parameters unique to the molecule within an electronic junction. Scanning tunneling spectroscopy is then used to measure these nine biophysical parameters for DNA nucleotides, and a modified machine learning algorithm identified nucleobases. The new parameters significantly improve base calling over merely using LUMO and HOMO frontier orbital energies. Furthermore, high accuracies for identifying DNA nucleobases were observed at different pH conditions. These results have significant implications for developing a robust and accurate high-throughput nanoelectronic DNA sequencing technique. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Lights, camera, action: high-throughput plant phenotyping is ready for a close-up

    USDA-ARS?s Scientific Manuscript database

    Modern techniques for crop improvement rely on both DNA sequencing and accurate quantification of plant traits to identify genes and germplasm of interest. With rapid advances in DNA sequencing technologies, plant phenotyping is now a bottleneck in advancing crop yields [1,2]. Furthermore, the envir...

  8. Alfalfa virus S, a new species in the family Alphaflexiviridae

    USDA-ARS?s Scientific Manuscript database

    A new species of the family Alphaflexiviridae provisionally named alfalfa virus S (AVS) was discovered in alfalfa samples originating from Sudan. A complete nucleotide sequence of the viral genome consisting of 8,349 nucleotides excluding the 3’ poly(A) tail was determined by high throughput sequenc...

  9. High-throughput illumina strand-specific RNA sequencing library preparation

    USDA-ARS?s Scientific Manuscript database

    Conventional Illumina RNA-Seq does not have the resolution to decode the complex eukaryote transcriptome due to the lack of RNA polarity information. Strand-specific RNA sequencing (ssRNA-Seq) can overcome these limitations and as such is better suited for genome annotation, de novo transcriptome as...

  10. Application of genotyping-by-sequencing for mapping disease resistance in grapevine breeding families

    USDA-ARS?s Scientific Manuscript database

    Genotyping-by-Sequencing (GBS) is a low-cost, high-throughput method for genome-wide polymorphism discovery and genotyping adjacent to restriction sites. Since 2010, GBS has been applied for the genotyping of over 12,000 grape breeding lines, with a primary focus on identifying markers predictive ...

  11. Effort versus reward: preparing samples for fungal community characterization in high-throughput sequencing surveys of soils

    USDA-ARS?s Scientific Manuscript database

    Next generation fungal amplicon sequencing is being used with increasing frequency to study fungal diversity in various ecosystems; however, the influence of sample preparation on the characterization of fungal community is poorly understood. We investigated the effects of four procedural modificati...

  12. High-throughput interpretation of gene structure changes in human and nonhuman resequencing data, using ACE

    USDA-ARS's Scientific Manuscript database

    We describe a suite of software tools for identifying possible functional changes in gene structure that may result from sequence variants. ACE (“Assessing Changes to Exons”) converts phased genotype calls to a collection of explicit haplotype sequences, maps transcript annotations onto them, detect...

  13. SNP-based genotyping in lentil: linking sequence information with phenotypes

    USDA-ARS's Scientific Manuscript database

    Lentil (Lens culinaris) has been late to enter the world of high throughput molecular analysis due to a general lack of genomic resources. Using a 454 sequencing-based approach, SNPs have been identified in genes across the lentil genome. Several hundred have been turned into single SNP KASP assay...

  14. Mining conifers’ mega-genome using rapid and efficient multiplexed high-throughput genotyping-by-sequencing (GBS) SNP discovery platform

    USDA-ARS's Scientific Manuscript database

    Next-generation sequencing (NGS) technologies are revolutionizing both medical and biological research through generation of massive SNP data sets for identifying heritable genome variation underlying key traits, from rare human diseases to important agronomic phenotypes in crop species. We evaluate...

  15. Maize HapMap2 identifies extant variation from a genome in flux

    USDA-ARS's Scientific Manuscript database

    The maize genome is the largest, most diverse and complex plant genome sequenced to date. Using high-throughput sequencing to access genetic variation and a population genetics model to score the polymorphisms, we characterize and unite the diversity of the world’s key breeding germplasm, wild rela...

  16. [Community composition and diversity of endophytic fungi from roots of Sinopodophyllum hexandrum in forest of Upper-north mountain of Qinghai province].

    PubMed

    Ning, Yi; Li, Yan-Ling; Zhou, Guo-Ying; Yang, Lu-Cun; Xu, Wen-Hua

    2016-04-01

    High-throughput sequencing technology, also called next-generation sequencing (NGS), can sequence hundreds and thousands of sequences from different samples at the same time. In the present study, culture-independent high-throughput sequencing was applied to the fungal internal transcribed spacer 1 (ITS1) region of metagenomic DNA from the roots of Sinopodophyllum hexandrum. After quality control, 22 565 reads remained. Clustering at 97% sequence similarity yielded 517 OTUs for the three samples (LD1, LD2 and LD3). Using the RDP classifier with a classification threshold of 0.8, the fungi identified from the OTU reads were assigned to 13 classes, 35 orders, 44 families and 55 genera. Among these, Tetracladium was the dominant genus in all samples (35.49%, 68.55% and 12.96%). The Shannon diversity indices and the Simpson indices of the endophytic fungi in the samples ranged from 1.75 to 2.92 and from 0.11 to 0.32, respectively. This is the first application of high-throughput sequencing technology to analyze the community composition and diversity of endophytic fungi in this medicinal plant, and the results showed high diversity and high community composition complexity of endophytic fungi in the roots of S. hexandrum. The results also show that high-throughput sequencing technology has a great advantage for analyzing the community composition and diversity of endophytes in plants. Copyright © by the Chinese Pharmaceutical Association.
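
    As an illustration of the diversity indices reported in this abstract, the following minimal sketch computes a Shannon index, H' = -sum(p_i ln p_i), and a Simpson index, D = sum(p_i^2), from a list of OTU read counts. The counts are invented for the example, and the abstract does not state which logarithm base or which variant of the Simpson index the authors used, so both choices here are assumptions.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson_index(counts):
    """Simpson dominance D = sum(p_i^2); lower values indicate higher diversity."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)

# Hypothetical OTU read counts for one sample (not data from the study).
otu_counts = [50, 30, 15, 5]
print(round(shannon_index(otu_counts), 3))
print(round(simpson_index(otu_counts), 3))
```

    A community dominated by one OTU gives a Shannon index near 0 and a Simpson dominance near 1, while an even community pushes the two values in the opposite directions.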

  17. Coding Complete Genome for the Mogiana Tick Virus, a Jingmenvirus Isolated from Ticks in Brazil

    DTIC Science & Technology

    2017-05-04

    sequences for all four genome segments. We downloaded the raw Illumina sequence reads from the NCBI Short Read Archive (GenBank...MGTV genome segments through sequence similarity (BLASTN) to the published genome of Jingmen tick virus (JMTV) isolate SY84 (GenBank: KJ001579-KJ001582...2014. Standards for sequencing viral genomes in the era of high-throughput sequencing . MBio 5:e01360–14. 8. Bankevich A, Nurk S, Antipov

  18. Transcriptome-based differentiation of closely-related Miscanthus lines.

    PubMed

    Chouvarine, Philippe; Cooksey, Amanda M; McCarthy, Fiona M; Ray, David A; Baldwin, Brian S; Burgess, Shane C; Peterson, Daniel G

    2012-01-01

    Distinguishing between individuals is critical to those conducting animal/plant breeding, food safety/quality research, diagnostic and clinical testing, and evolutionary biology studies. Classical genetic identification studies are based on marker polymorphisms, but polymorphism-based techniques are time and labor intensive and often cannot distinguish between closely related individuals. Illumina sequencing technologies provide the detailed sequence data required for rapid and efficient differentiation of related species, lines/cultivars, and individuals in a cost-effective manner. Here we describe the use of Illumina high-throughput exome sequencing, coupled with SNP mapping, as a rapid means of distinguishing between related cultivars of the lignocellulosic bioenergy crop giant miscanthus (Miscanthus × giganteus). We provide the first exome sequence database for Miscanthus species complete with Gene Ontology (GO) functional annotations. A SNP comparative analysis of rhizome-derived cDNA sequences was successfully utilized to distinguish three Miscanthus × giganteus cultivars from each other and from other Miscanthus species. Moreover, the resulting phylogenetic tree generated from SNP frequency data parallels the known breeding history of the plants examined. Some of the giant miscanthus plants exhibit considerable sequence divergence. Here we describe an analysis of Miscanthus in which high-throughput exome sequencing was utilized to differentiate between closely related genotypes despite the current lack of a reference genome sequence. We functionally annotated the exome sequences and provide resources to support Miscanthus systems biology. In addition, we demonstrate the use of the commercial high-performance cloud computing to do computational GO annotation.

  19. Accounting for Errors in Low Coverage High-Throughput Sequencing Data When Constructing Genetic Maps Using Biparental Outcrossed Populations

    PubMed Central

    Bilton, Timothy P.; Schofield, Matthew R.; Black, Michael A.; Chagné, David; Wilcox, Phillip L.; Dodds, Ken G.

    2018-01-01

    Next-generation sequencing is an efficient method that allows for substantially more markers than previous technologies, providing opportunities for building high-density genetic linkage maps, which facilitate the development of nonmodel species’ genomic assemblies and the investigation of their genes. However, constructing genetic maps using data generated via high-throughput sequencing technology (e.g., genotyping-by-sequencing) is complicated by the presence of sequencing errors and genotyping errors resulting from missing parental alleles due to low sequencing depth. If unaccounted for, these errors lead to inflated genetic maps. In addition, map construction in many species is performed using full-sibling family populations derived from the outcrossing of two individuals, where unknown parental phase and varying segregation types further complicate construction. We present a new methodology for modeling low coverage sequencing data in the construction of genetic linkage maps using full-sibling populations of diploid species, implemented in a package called GUSMap. Our model is based on the Lander–Green hidden Markov model but extended to account for errors present in sequencing data. We were able to obtain accurate estimates of the recombination fractions and overall map distance using GUSMap, while most existing mapping packages produced inflated genetic maps in the presence of errors. Our results demonstrate the feasibility of using low coverage sequencing data to produce genetic maps without requiring extensive filtering of potentially erroneous genotypes, provided that the associated errors are correctly accounted for in the model. PMID:29487138
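
    The "missing parental alleles due to low sequencing depth" problem described in this abstract can be illustrated with a small calculation. The sketch below is not the GUSMap model; it only shows, under the simplifying assumptions that each read samples either allele of a heterozygote with probability 1/2 and that there is no sequencing error, how often a heterozygote would appear homozygous at a given read depth. The function name is hypothetical.

```python
def prob_het_seen_as_hom(depth):
    """Probability that all `depth` reads from a heterozygote show one allele.

    Assumes each read samples either allele with probability 1/2 and
    ignores sequencing error (a simplifying assumption).
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # Either allele can be the one that is seen exclusively, hence the factor 2.
    return 2 * 0.5 ** depth

for d in (1, 2, 4, 8):
    print(d, prob_het_seen_as_hom(d))
```

    At a depth of one or two reads a heterozygote is miscalled at least half the time under these assumptions, which is why unmodeled low-coverage genotypes can distort recombination estimates.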

  20. Accounting for Errors in Low Coverage High-Throughput Sequencing Data When Constructing Genetic Maps Using Biparental Outcrossed Populations.

    PubMed

    Bilton, Timothy P; Schofield, Matthew R; Black, Michael A; Chagné, David; Wilcox, Phillip L; Dodds, Ken G

    2018-05-01

    Next-generation sequencing is an efficient method that allows for substantially more markers than previous technologies, providing opportunities for building high-density genetic linkage maps, which facilitate the development of nonmodel species' genomic assemblies and the investigation of their genes. However, constructing genetic maps using data generated via high-throughput sequencing technology (e.g., genotyping-by-sequencing) is complicated by the presence of sequencing errors and genotyping errors resulting from missing parental alleles due to low sequencing depth. If unaccounted for, these errors lead to inflated genetic maps. In addition, map construction in many species is performed using full-sibling family populations derived from the outcrossing of two individuals, where unknown parental phase and varying segregation types further complicate construction. We present a new methodology for modeling low coverage sequencing data in the construction of genetic linkage maps using full-sibling populations of diploid species, implemented in a package called GUSMap. Our model is based on the Lander-Green hidden Markov model but extended to account for errors present in sequencing data. We were able to obtain accurate estimates of the recombination fractions and overall map distance using GUSMap, while most existing mapping packages produced inflated genetic maps in the presence of errors. Our results demonstrate the feasibility of using low coverage sequencing data to produce genetic maps without requiring extensive filtering of potentially erroneous genotypes, provided that the associated errors are correctly accounted for in the model. Copyright © 2018 Bilton et al.

  1. Calculating the quality of public high-throughput sequencing data to obtain a suitable subset for reanalysis from the Sequence Read Archive.

    PubMed

    Ohta, Tazro; Nakazato, Takeru; Bono, Hidemasa

    2017-06-01

    It is important for public data repositories to promote the reuse of archived data. In the growing field of omics science, however, the increasing number of submissions of high-throughput sequencing (HTSeq) data to public repositories prevents users from choosing a suitable data set from among the large number of search results. Repository users need to be able to set a threshold to reduce the number of results to obtain a suitable subset of high-quality data for reanalysis. We calculated the quality of sequencing data archived in a public data repository, the Sequence Read Archive (SRA), by using the quality control software FastQC. We obtained quality values for 1 171 313 experiments, which can be used to evaluate the suitability of data for reuse. We also visualized the data distribution in SRA by integrating the quality information and metadata of experiments and samples. We provide quality information of all of the archived sequencing data, which enable users to obtain sufficient quality sequencing data for reanalyses. The calculated quality data are available to the public in various formats. Our data also provide an example of enhancing the reuse of public data by adding metadata to published research data by a third party. © The Authors 2017. Published by Oxford University Press.

  2. Discovery of precursor and mature microRNAs and their putative gene targets using high-throughput sequencing in pineapple (Ananas comosus var. comosus).

    PubMed

    Yusuf, Noor Hydayaty Md; Ong, Wen Dee; Redwan, Raimi Mohamed; Latip, Mariam Abd; Kumar, S Vijay

    2015-10-15

    MicroRNAs (miRNAs) are a class of small, endogenous non-coding RNAs that negatively regulate gene expression, resulting in the silencing of target mRNA transcripts through mRNA cleavage or translational inhibition. MiRNAs play significant roles in various biological and physiological processes in plants. However, the miRNA-mediated gene regulatory network in pineapple, the model tropical non-climacteric fruit, remains largely unexplored. Here, we report a complete list of pineapple mature miRNAs obtained from high-throughput small RNA sequencing and precursor miRNAs (pre-miRNAs) obtained from ESTs. Two small RNA libraries were constructed from pineapple fruits and leaves, respectively, using Illumina's Solexa technology. Sequence similarity analysis using miRBase revealed 579,179 reads homologous to 153 miRNAs from 41 miRNA families. In addition, a pineapple fruit transcriptome library consisting of approximately 30,000 EST contigs constructed using Solexa sequencing was used for the discovery of pre-miRNAs. In all, four pre-miRNAs were identified (MIR156, MIR399, MIR444 and MIR2673). Furthermore, the same pineapple transcriptome was used to dissect the function of the miRNAs in pineapple by predicting their putative targets in conjunction with their regulatory networks. In total, 23 metabolic pathways were found to be regulated by miRNAs in pineapple. The use of high-throughput sequencing in pineapples to unveil the presence of miRNAs and their regulatory pathways provides insight into the repertoire of miRNA regulation used exclusively in this non-climacteric model plant. Copyright © 2015 Elsevier B.V. All rights reserved.

  3. Development of a High-Throughput Resequencing Array for the Detection of Pathogenic Mutations in Osteogenesis Imperfecta

    PubMed Central

    Wang, Yao; Cui, Yazhou; Zhou, Xiaoyan; Han, Jinxiang

    2015-01-01

    Objective: Osteogenesis imperfecta (OI) is a rare inherited skeletal disease, characterized by bone fragility and low bone density. The mutations in this disorder have been widely reported to be on various exonic hotspots of the candidate genes, including COL1A1, COL1A2, CRTAP, LEPRE1, and FKBP10, thus creating a great demand for precise genetic tests. However, large genome sizes make the process daunting and the analyses inefficient and expensive. Therefore, we aimed to develop a fast, accurate, efficient, and cheaper sequencing platform for OI diagnosis; to this end, use of an advanced array-based technique was proposed. Method: A CustomSeq Affymetrix Resequencing Array was established for high-throughput sequencing of five genes simultaneously. Genomic DNA extraction from 13 OI patients and 85 normal controls and amplification using long-range PCR (LR-PCR) were followed by DNA fragmentation and chip hybridization, according to standard Affymetrix protocols. Hybridization signals were determined using GeneChip Sequence Analysis Software (GSEQ). To examine feasibility, the outcome from the new resequencing approach was validated by the conventional capillary sequencing method. Result: Overall call rates using the resequencing array were 96–98% and the agreement between microarray and capillary sequencing was 99.99%. 11 out of 13 OI patients with pathogenic mutations were successfully detected by the chip analysis without adjustment, and one mutation could also be identified using manual visual inspection. Conclusion: A high-throughput resequencing array was developed that detects the disease-associated mutations in OI, providing a potential tool to facilitate large-scale genetic screening for OI patients. Through this method, a novel mutation was also found. PMID:25742658

  4. Spent Fuel Assay with an Ultra-High Rate HPGe Spectrometer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fast, James; Fulsom, Bryan; Pitts, Karl

    2015-07-01

    Traditional verification of spent nuclear fuel (SNF) includes determination of initial enrichment, burnup and cool down time (IE, BU, CT). Along with neutron measurements, passive gamma assay provides important information for determining BU and CT. Other gamma-ray-based assay methods such as passive tomography and active delayed gamma offer the potential to measure the spatial distribution of fission products and the fissile isotopic concentration of the fuel, respectively. All fuel verification methods involving gamma-ray spectroscopy require that the spectrometers manage very high count rates while extracting the signatures of interest. PNNL has developed new digital filtering and analysis techniques to produce an ultra-high rate gamma-ray spectrometer from a standard coaxial high-purity germanium (HPGe) crystal. This 37% relative efficiency detector has been operated for SNF measurements at input count rates of 500-1300 kcps and throughput in excess of 150 kcps. Optimized filtering algorithms preserve the spectroscopic capability of the system even at these high rates. This paper will present the results of both passive and active SNF measurement performed with this system at PNNL. (authors)

  5. Characterization of Capsicum annuum Genetic Diversity and Population Structure Based on Parallel Polymorphism Discovery with a 30K Unigene Pepper GeneChip

    PubMed Central

    Hill, Theresa A.; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W.; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome-wide transcript-based markers to assess genetic and genomic features among Capsicum annuum. PMID:23409153

  6. Characterization of Capsicum annuum genetic diversity and population structure based on parallel polymorphism discovery with a 30K unigene Pepper GeneChip.

    PubMed

    Hill, Theresa A; Ashrafi, Hamid; Reyes-Chin-Wo, Sebastian; Yao, JiQiang; Stoffel, Kevin; Truco, Maria-Jose; Kozik, Alexander; Michelmore, Richard W; Van Deynze, Allen

    2013-01-01

    The widely cultivated pepper, Capsicum spp., important as a vegetable and spice crop world-wide, is one of the most diverse crops. To enhance breeding programs, a detailed characterization of Capsicum diversity including morphological, geographical and molecular data is required. Currently, molecular data characterizing Capsicum genetic diversity is limited. The development and application of high-throughput genome-wide markers in Capsicum will facilitate more detailed molecular characterization of germplasm collections, genetic relationships, and the generation of ultra-high density maps. We have developed the Pepper GeneChip® array from Affymetrix for polymorphism detection and expression analysis in Capsicum. Probes on the array were designed from 30,815 unigenes assembled from expressed sequence tags (ESTs). Our array design provides a maximum redundancy of 13 probes per base pair position allowing integration of multiple hybridization values per position to detect single position polymorphism (SPP). Hybridization of genomic DNA from 40 diverse C. annuum lines, used in breeding and research programs, and a representative from three additional cultivated species (C. frutescens, C. chinense and C. pubescens) detected 33,401 SPP markers within 13,323 unigenes. Among the C. annuum lines, 6,426 SPPs covering 3,818 unigenes were identified. An estimated three-fold reduction in diversity was detected in non-pungent compared with pungent lines; however, we were able to detect 251 highly informative markers across these C. annuum lines. In addition, an 8.7 cM region without polymorphism was detected around Pun1 in non-pungent C. annuum. An analysis of genetic relatedness and diversity using the software Structure revealed clustering of the germplasm which was confirmed with statistical support by principal components analysis (PCA) and phylogenetic analysis. This research demonstrates the effectiveness of parallel high-throughput discovery and application of genome-wide transcript-based markers to assess genetic and genomic features among Capsicum annuum.

  7. An ultra-high density linkage map and QTL mapping for sex and growth-related traits of common carp (Cyprinus carpio)

    PubMed Central

    Peng, Wenzhu; Xu, Jian; Zhang, Yan; Feng, Jianxin; Dong, Chuanju; Jiang, Likun; Feng, Jingyan; Chen, Baohua; Gong, Yiwen; Chen, Lin; Xu, Peng

    2016-01-01

    High density genetic linkage maps are essential for QTL fine mapping, comparative genomics and high quality genome sequence assembly. In this study, we constructed a high-density and high-resolution genetic linkage map with 28,194 SNP markers on 14,146 distinct loci for common carp based on high-throughput genotyping with the carp 250 K single nucleotide polymorphism (SNP) array in a mapping family. The genetic length of the consensus map was 10,595.94 cM with an average locus interval of 0.75 cM and an average marker interval of 0.38 cM. Comparative genomic analysis revealed high level of conserved syntenies between common carp and the closely related model species zebrafish and medaka. The genome scaffolds were anchored to the high-density linkage map, spanning 1,357 Mb of common carp reference genome. QTL mapping and association analysis identified 22 QTLs for growth-related traits and 7 QTLs for sex dimorphism. Candidate genes underlying growth-related traits were identified, including important regulators such as KISS2, IGF1, SMTLB, NPFFR1 and CPE. Candidate genes associated with sex dimorphism were also identified including 3KSR and DMRT2b. The high-density and high-resolution genetic linkage map provides an important tool for QTL fine mapping and positional cloning of economically important traits, and improving common carp genome assembly. PMID:27225429
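
    The average intervals quoted in this abstract can be checked with simple arithmetic. The sketch below divides the reported map length by the number of loci and by the number of markers; treating the averages as plain ratios (rather than, say, dividing by the number of gaps within each linkage group) is an assumption, since the abstract does not state how they were computed.

```python
# Figures taken from the abstract.
map_length_cm = 10595.94   # genetic length of the consensus map, in cM
n_loci = 14146             # distinct loci
n_markers = 28194          # SNP markers

# Assumed definition: average interval = total length / number of loci (or markers).
locus_interval = map_length_cm / n_loci
marker_interval = map_length_cm / n_markers

print(f"{locus_interval:.2f} cM per locus, {marker_interval:.2f} cM per marker")
```

    Under this assumption the ratios round to the 0.75 cM and 0.38 cM figures given in the abstract.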

  8. Position-specific automated processing of V3 env ultra-deep pyrosequencing data for predicting HIV-1 tropism

    PubMed Central

    Jeanne, Nicolas; Saliou, Adrien; Carcenac, Romain; Lefebvre, Caroline; Dubois, Martine; Cazabat, Michelle; Nicot, Florence; Loiseau, Claire; Raymond, Stéphanie; Izopet, Jacques; Delobel, Pierre

    2015-01-01

    HIV-1 coreceptor usage must be accurately determined before starting CCR5 antagonist-based treatment as the presence of undetected minor CXCR4-using variants can cause subsequent virological failure. Ultra-deep pyrosequencing of HIV-1 V3 env allows detection of low levels of CXCR4-using variants that current genotypic approaches miss. However, the computation of the mass of sequence data and the need to identify true minor variants while excluding artifactual sequences generated during amplification and ultra-deep pyrosequencing is rate-limiting. Arbitrary fixed cut-offs below which minor variants are discarded are currently used, but the errors generated during ultra-deep pyrosequencing are sequence-dependent rather than random. We have developed an automated procedure for processing HIV-1 V3 env ultra-deep pyrosequencing data that uses biological filters to discard artifactual or non-functional V3 sequences followed by statistical filters to determine position-specific sensitivity thresholds, rather than arbitrary fixed cut-offs. It retains authentic sequences with point mutations at V3 positions of interest and discards artifactual ones with accurate sensitivity thresholds. PMID:26585833

  9. Position-specific automated processing of V3 env ultra-deep pyrosequencing data for predicting HIV-1 tropism.

    PubMed

    Jeanne, Nicolas; Saliou, Adrien; Carcenac, Romain; Lefebvre, Caroline; Dubois, Martine; Cazabat, Michelle; Nicot, Florence; Loiseau, Claire; Raymond, Stéphanie; Izopet, Jacques; Delobel, Pierre

    2015-11-20

    HIV-1 coreceptor usage must be accurately determined before starting CCR5 antagonist-based treatment as the presence of undetected minor CXCR4-using variants can cause subsequent virological failure. Ultra-deep pyrosequencing of HIV-1 V3 env allows detection of low levels of CXCR4-using variants that current genotypic approaches miss. However, the computation of the mass of sequence data and the need to identify true minor variants while excluding artifactual sequences generated during amplification and ultra-deep pyrosequencing is rate-limiting. Arbitrary fixed cut-offs below which minor variants are discarded are currently used, but the errors generated during ultra-deep pyrosequencing are sequence-dependent rather than random. We have developed an automated procedure for processing HIV-1 V3 env ultra-deep pyrosequencing data that uses biological filters to discard artifactual or non-functional V3 sequences followed by statistical filters to determine position-specific sensitivity thresholds, rather than arbitrary fixed cut-offs. It retains authentic sequences with point mutations at V3 positions of interest and discards artifactual ones with accurate sensitivity thresholds.

  10. Read count-based method for high-throughput allelic genotyping of transposable elements and structural variants.

    PubMed

    Kuhn, Alexandre; Ong, Yao Min; Quake, Stephen R; Burkholder, William F

    2015-07-08

    Like other structural variants, transposable element insertions can be highly polymorphic across individuals. Their functional impact, however, remains poorly understood. Current genome-wide approaches for genotyping insertion-site polymorphisms based on targeted or whole-genome sequencing remain very expensive and can lack accuracy, hence new large-scale genotyping methods are needed. We describe a high-throughput method for genotyping transposable element insertions and other types of structural variants that can be assayed by breakpoint PCR. The method relies on next-generation sequencing of multiplex, site-specific PCR amplification products and read count-based genotype calls. We show that this method is flexible, efficient (it does not require rounds of optimization), cost-effective and highly accurate. This method can benefit a wide range of applications from the routine genotyping of animal and plant populations to the functional study of structural variants in humans.
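
    To make the idea of "read count-based genotype calls" concrete, here is a minimal sketch that assigns a biallelic genotype at one breakpoint from the counts of reads supporting each allele. It is not the authors' calling procedure: the function name, the minimum depth and the allele-fraction thresholds are all illustrative assumptions.

```python
def call_genotype(ref_reads, alt_reads, min_depth=10, low=0.2, high=0.8):
    """Call a biallelic genotype from read counts at one breakpoint.

    ref_reads: reads supporting the allele without the insertion
    alt_reads: reads supporting the allele with the insertion
    The depth and fraction thresholds are illustrative assumptions.
    """
    depth = ref_reads + alt_reads
    if depth < min_depth:
        return "no_call"          # too few reads to call reliably
    alt_fraction = alt_reads / depth
    if alt_fraction < low:
        return "ref/ref"          # insertion essentially absent
    if alt_fraction > high:
        return "alt/alt"          # insertion on both chromosomes
    return "ref/alt"              # intermediate fraction: heterozygous

# Hypothetical read counts (reference-supporting, insertion-supporting).
for counts in [(100, 2), (48, 52), (1, 99), (3, 2)]:
    print(counts, call_genotype(*counts))
```

    In practice the thresholds would need to be tuned per assay, for example to account for unequal amplification efficiency of the two PCR products.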

  11. NCBI GEO: archive for high-throughput functional genomic data.

    PubMed

    Barrett, Tanya; Troup, Dennis B; Wilhite, Stephen E; Ledoux, Pierre; Rudnev, Dmitry; Evangelista, Carlos; Kim, Irene F; Soboleva, Alexandra; Tomashevsky, Maxim; Marshall, Kimberly A; Phillippy, Katherine H; Sherman, Patti M; Muertter, Rolf N; Edgar, Ron

    2009-01-01

    The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest public repository for high-throughput gene expression data. Additionally, GEO hosts other categories of high-throughput functional genomic data, including those that examine genome copy number variations, chromatin structure, methylation status and transcription factor binding. These data are generated by the research community using high-throughput technologies like microarrays and, more recently, next-generation sequencing. The database has a flexible infrastructure that can capture fully annotated raw and processed data, enabling compliance with major community-derived scientific reporting standards such as 'Minimum Information About a Microarray Experiment' (MIAME). In addition to serving as a centralized data storage hub, GEO offers many tools and features that allow users to effectively explore, analyze and download expression data from both gene-centric and experiment-centric perspectives. This article summarizes the GEO repository structure, content and operating procedures, as well as recently introduced data mining features. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/.

  12. A high-throughput Sanger strategy for human mitochondrial genome sequencing

    PubMed Central

    2013-01-01

    Background: A population reference database of complete human mitochondrial genome (mtGenome) sequences is needed to enable the use of mitochondrial DNA (mtDNA) coding region data in forensic casework applications. However, the development of entire mtGenome haplotypes to forensic data quality standards is difficult and laborious. A Sanger-based amplification and sequencing strategy that is designed for automated processing, yet routinely produces high quality sequences, is needed to facilitate high-volume production of these mtGenome data sets. Results: We developed a robust 8-amplicon Sanger sequencing strategy that regularly produces complete, forensic-quality mtGenome haplotypes in the first pass of data generation. The protocol works equally well on samples representing diverse mtDNA haplogroups and DNA input quantities ranging from 50 pg to 1 ng, and can be applied to specimens of varying DNA quality. The complete workflow was specifically designed for implementation on robotic instrumentation, which increases throughput and reduces both the opportunities for error inherent to manual processing and the cost of generating full mtGenome sequences. Conclusions: The described strategy will assist efforts to generate complete mtGenome haplotypes which meet the highest data quality expectations for forensic genetic and other applications. Additionally, high-quality data produced using this protocol can be used to assess mtDNA data developed using newer technologies and chemistries. Further, the amplification strategy can be used to enrich for mtDNA as a first step in sample preparation for targeted next-generation sequencing. PMID:24341507

  13. Diversity and distribution of unicellular opisthokonts along the European coast analysed using high-throughput sequencing.

    PubMed

    Del Campo, Javier; Mallo, Diego; Massana, Ramon; de Vargas, Colomban; Richards, Thomas A; Ruiz-Trillo, Iñaki

    2015-09-01

    The opisthokonts are one of the major supergroups of eukaryotes. They comprise two major clades: (i) the Metazoa and their unicellular relatives and (ii) the Fungi and their unicellular relatives. There is, however, little knowledge of the role of opisthokont microbes in many natural environments, especially among non-metazoan and non-fungal opisthokonts. Here, we begin to address this gap by analysing high-throughput 18S rDNA and 18S rRNA sequencing data from different European coastal sites, sampled at different size fractions and depths. In particular, we analyse the diversity and abundance of choanoflagellates, filastereans, ichthyosporeans, nucleariids, corallochytreans and their related lineages. Our results show the great diversity of choanoflagellates in coastal waters as well as a relevant representation of the ichthyosporeans and the uncultured marine opisthokonts (MAOP). Furthermore, we describe a new lineage of marine fonticulids (MAFO) that appears to be abundant in sediments. Taken together, our work points to a greater potential ecological role for unicellular opisthokonts than previously appreciated in marine environments, both in the water column and in sediments, and also provides evidence of novel opisthokont phylogenetic lineages. This study highlights the importance of high-throughput sequencing approaches to unravel the diversity and distribution of both known and novel eukaryotic lineages. © 2014 Society for Applied Microbiology and John Wiley & Sons Ltd.

  14. Quantitative assessment of RNA-protein interactions with high-throughput sequencing-RNA affinity profiling.

    PubMed

    Ozer, Abdullah; Tome, Jacob M; Friedman, Robin C; Gheba, Dan; Schroth, Gary P; Lis, John T

    2015-08-01

    Because RNA-protein interactions have a central role in a wide array of biological processes, methods that enable a quantitative assessment of these interactions in a high-throughput manner are in great demand. Recently, we developed the high-throughput sequencing-RNA affinity profiling (HiTS-RAP) assay that couples sequencing on an Illumina GAIIx genome analyzer with the quantitative assessment of protein-RNA interactions. This assay is able to analyze interactions between one or possibly several proteins with millions of different RNAs in a single experiment. We have successfully used HiTS-RAP to analyze interactions of the EGFP and negative elongation factor subunit E (NELF-E) proteins with their corresponding canonical and mutant RNA aptamers. Here we provide a detailed protocol for HiTS-RAP that can be completed in about a month (8 d hands-on time). This includes the preparation and testing of recombinant proteins and DNA templates, clustering DNA templates on a flowcell, HiTS and protein binding with a GAIIx instrument, and finally data analysis. We also highlight aspects of HiTS-RAP that can be further improved and points of comparison between HiTS-RAP and two other recently developed methods, quantitative analysis of RNA on a massively parallel array (RNA-MaP) and RNA Bind-n-Seq (RBNS), for quantitative analysis of RNA-protein interactions.

  15. Methods for processing high-throughput RNA sequencing data.

    PubMed

    Ares, Manuel

    2014-11-03

    High-throughput sequencing (HTS) methods for analyzing RNA populations (RNA-Seq) are gaining rapid application to many experimental situations. The steps in an RNA-Seq experiment require thought and planning, especially because the expense in time and materials is currently higher and the protocols are far less routine than those used for other high-throughput methods, such as microarrays. As always, good experimental design will make analysis and interpretation easier. Having a clear biological question, an idea about the best way to do the experiment, and an understanding of the number of replicates needed will make the entire process more satisfying. Whether the goal is capturing transcriptome complexity from a tissue or identifying small fragments of RNA cross-linked to a protein of interest, conversion of the RNA to cDNA followed by direct sequencing using the latest methods is a developing practice, with new technical modifications and applications appearing every day. Even more rapid are the development and improvement of methods for analysis of the very large amounts of data that arrive at the end of an RNA-Seq experiment, making considerations regarding reproducibility, validation, visualization, and interpretation increasingly important. This introduction is designed to review and emphasize a pathway of analysis from experimental design through data presentation that is likely to be successful, with the recognition that better methods are right around the corner. © 2014 Cold Spring Harbor Laboratory Press.

  16. Detection and Tracking of NY-ESO-1-Specific CD8+ T Cells by High-Throughput T Cell Receptor β (TCRB) Gene Rearrangements Sequencing in a Peptide-Vaccinated Patient.

    PubMed

    Miyai, Manami; Eikawa, Shingo; Hosoi, Akihiro; Iino, Tamaki; Matsushita, Hirokazu; Isobe, Midori; Uenaka, Akiko; Udono, Heiichiro; Nakajima, Jun; Nakayama, Eiichi; Kakimi, Kazuhiro

    2015-01-01

    Comprehensive immunological evaluation is crucial for monitoring patients undergoing antigen-specific cancer immunotherapy. The identification and quantification of T cell responses is most important for the further development of such therapies. Using well-characterized clinical samples from a high responder patient (TK-f01) in an NY-ESO-1f peptide vaccine study, we performed high-throughput T cell receptor β-chain (TCRB) gene next generation sequencing (NGS) to monitor the frequency of NY-ESO-1-specific CD8+ T cells. We compared these results with those of conventional immunological assays, such as IFN-γ capture, tetramer binding and limiting dilution clonality assays. We sequenced human TCRB complementarity-determining region 3 (CDR3) rearrangements of two NY-ESO-1f-specific CD8+ T cell clones, 6-8L and 2F6, as well as PBMCs over the course of peptide vaccination. Clone 6-8L possessed the TCRB CDR3 gene TCRBV11-03*01 and BJ02-01*01 with amino acid sequence CASSLRGNEQFF, whereas 2F6 possessed TCRBV05-08*01 and BJ02-04*01 (CASSLVGTNIQYF). Using these two sequences as models, we evaluated the frequency of NY-ESO-1-specific CD8+ T cells in PBMCs ex vivo. The 6-8L CDR3 sequence was the second most frequent in PBMC and was present at high frequency (0.7133%) even prior to vaccination, and sustained over the course of vaccination. Despite a marked expansion of NY-ESO-1-specific CD8+ T cells detected from the first through 6th vaccination by tetramer staining and IFN-γ capture assays, as evaluated by CDR3 sequencing the frequency did not increase with increasing rounds of peptide vaccination. By clonal analysis using 12 day in vitro stimulation, the frequency of B*52:01-restricted NY-ESO-1f peptide-specific CD8+ T cells in PBMCs was estimated as only 0.0023%, far below the 0.7133% by NGS sequencing. Thus, assays requiring in vitro stimulation might be underestimating the frequency of clones with lower proliferation potential. High-throughput TCRB sequencing using NGS can potentially better estimate the actual frequency of antigen-specific T cells and thus provide more accurate patient monitoring.
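    The gap between the two frequency estimates quoted above can be made concrete with a one-line ratio. This is only an arithmetic illustration: both percentages come from the abstract, while the variable names are hypothetical and the fold difference itself is not a figure the authors report.

```python
# Frequencies of the NY-ESO-1-specific clone in PBMCs, as quoted in the abstract.
ngs_frequency_percent = 0.7133       # CDR3 sequencing (NGS)
clonal_frequency_percent = 0.0023    # clonal analysis after 12 day in vitro stimulation

# How many times higher the NGS estimate is than the stimulation-based estimate.
fold_difference = ngs_frequency_percent / clonal_frequency_percent
print(f"NGS estimate is about {fold_difference:.0f}-fold higher")
```

    A difference of roughly two orders of magnitude is what motivates the authors' remark that stimulation-dependent assays may underestimate clones with lower proliferation potential.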

  17. Detection and Tracking of NY-ESO-1-Specific CD8+ T Cells by High-Throughput T Cell Receptor β (TCRB) Gene Rearrangements Sequencing in a Peptide-Vaccinated Patient

    PubMed Central

    Miyai, Manami; Eikawa, Shingo; Hosoi, Akihiro; Iino, Tamaki; Matsushita, Hirokazu; Isobe, Midori; Uenaka, Akiko; Udono, Heiichiro; Nakajima, Jun; Nakayama, Eiichi; Kakimi, Kazuhiro

    2015-01-01

    Comprehensive immunological evaluation is crucial for monitoring patients undergoing antigen-specific cancer immunotherapy. The identification and quantification of T cell responses is most important for the further development of such therapies. Using well-characterized clinical samples from a high responder patient (TK-f01) in an NY-ESO-1f peptide vaccine study, we performed high-throughput T cell receptor β-chain (TCRB) gene next generation sequencing (NGS) to monitor the frequency of NY-ESO-1-specific CD8+ T cells. We compared these results with those of conventional immunological assays, such as IFN-γ capture, tetramer binding and limiting dilution clonality assays. We sequenced human TCRB complementarity-determining region 3 (CDR3) rearrangements of two NY-ESO-1f-specific CD8+ T cell clones, 6-8L and 2F6, as well as PBMCs over the course of peptide vaccination. Clone 6-8L possessed the TCRB CDR3 gene TCRBV11-03*01 and BJ02-01*01 with amino acid sequence CASSLRGNEQFF, whereas 2F6 possessed TCRBV05-08*01 and BJ02-04*01 (CASSLVGTNIQYF). Using these two sequences as models, we evaluated the frequency of NY-ESO-1-specific CD8+ T cells in PBMCs ex vivo. The 6-8L CDR3 sequence was the second most frequent in PBMC and was present at high frequency (0.7133%) even prior to vaccination, and sustained over the course of vaccination. Despite a marked expansion of NY-ESO-1-specific CD8+ T cells detected from the first through 6th vaccination by tetramer staining and IFN-γ capture assays, as evaluated by CDR3 sequencing the frequency did not increase with increasing rounds of peptide vaccination. By clonal analysis using 12 day in vitro stimulation, the frequency of B*52:01-restricted NY-ESO-1f peptide-specific CD8+ T cells in PBMCs was estimated as only 0.0023%, far below the 0.7133% by NGS sequencing. Thus, assays requiring in vitro stimulation might be underestimating the frequency of clones with lower proliferation potential. High-throughput TCRB sequencing using NGS can potentially better estimate the actual frequency of antigen-specific T cells and thus provide more accurate patient monitoring. PMID:26291626

  18. Polymorphism discovery and allele frequency estimation using high-throughput DNA sequencing of target-enriched pooled DNA samples

    PubMed Central

    2012-01-01

    Background The central role of the somatotrophic axis in animal post-natal growth, development and fertility is well established. Therefore, the identification of genetic variants affecting quantitative traits within this axis is an attractive goal. However, large sample numbers are a pre-requisite for the identification of genetic variants underlying complex traits and although technologies are improving rapidly, high-throughput sequencing of large numbers of complete individual genomes remains prohibitively expensive. Therefore using a pooled DNA approach coupled with target enrichment and high-throughput sequencing, the aim of this study was to identify polymorphisms and estimate allele frequency differences across 83 candidate genes of the somatotrophic axis, in 150 Holstein-Friesian dairy bulls divided into two groups divergent for genetic merit for fertility. Results In total, 4,135 SNPs and 893 indels were identified during the resequencing of the 83 candidate genes. Nineteen percent (n = 952) of variants were located within 5' and 3' UTRs. Seventy-two percent (n = 3,612) were intronic and 9% (n = 464) were exonic, including 65 indels and 236 SNPs resulting in non-synonymous substitutions (NSS). Significant (P < 0.01) mean allele frequency differentials between the low and high fertility groups were observed for 720 SNPs (58 NSS). Allele frequencies for 43 of the SNPs were also determined by genotyping the 150 individual animals (Sequenom® MassARRAY). No significant differences (P > 0.1) were observed between the two methods for any of the 43 SNPs across both pools (i.e., 86 tests in total). Conclusions The results of the current study support previous findings of the use of DNA sample pooling and high-throughput sequencing as a viable strategy for polymorphism discovery and allele frequency estimation. Using this approach we have characterised the genetic variation within genes of the somatotrophic axis and related pathways, central to mammalian post-natal growth and development and subsequent lactogenesis and fertility. We have identified a large number of variants segregating at significantly different frequencies between cattle groups divergent for calving interval plausibly harbouring causative variants contributing to heritable variation. To our knowledge, this is the first report describing sequencing of targeted genomic regions in any livestock species using groups with divergent phenotypes for an economically important trait. PMID:22235840
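    The variant tallies reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the counts are taken from the abstract, while the variable names are hypothetical.

```python
# Counts quoted in the abstract (resequencing of 83 candidate genes).
snps, indels = 4135, 893
total_variants = snps + indels  # 5,028 variants in total

# Variants per genomic category, as reported.
by_region = {"UTR": 952, "intronic": 3612, "exonic": 464}

# The three categories should account for every identified variant.
assert sum(by_region.values()) == total_variants

# Share of all variants falling in each category, in percent.
percent = {region: 100 * n / total_variants for region, n in by_region.items()}
for region, pct in percent.items():
    print(f"{region}: {pct:.1f}%")
```

    Rounded to whole numbers, the shares agree with the 19%, 72% and 9% stated in the abstract.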

  19. CloVR: a virtual machine for automated and portable sequence analysis from the desktop using cloud computing.

    PubMed

    Angiuoli, Samuel V; Matalka, Malcolm; Gussman, Aaron; Galens, Kevin; Vangala, Mahesh; Riley, David R; Arze, Cesar; White, James R; White, Owen; Fricke, W Florian

    2011-08-30

    Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom built virtual machines to distribute pre-packaged, pre-configured software. We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition, CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. The CloVR VM and associated architecture lower the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high throughput data processing.

  20. Coprolites as a source of information on the genome and diet of the cave hyena

    PubMed Central

    Bon, Céline; Berthonaud, Véronique; Maksud, Frédéric; Labadie, Karine; Poulain, Julie; Artiguenave, François; Wincker, Patrick; Aury, Jean-Marc; Elalouf, Jean-Marc

    2012-01-01

    We performed high-throughput sequencing of DNA from fossilized faeces to evaluate this material as a source of information on the genome and diet of Pleistocene carnivores. We analysed coprolites derived from the extinct cave hyena (Crocuta crocuta spelaea), and sequenced 90 million DNA fragments from two specimens. The DNA reads enabled a reconstruction of the cave hyena mitochondrial genome with up to a 158-fold coverage. This genome, and those sequenced from extant spotted (Crocuta crocuta) and striped (Hyaena hyaena) hyena specimens, allows for the establishment of a robust phylogeny that supports a close relationship between the cave and the spotted hyena. We also demonstrate that high-throughput sequencing yields data for cave hyena multi-copy and single-copy nuclear genes, and that about 50 per cent of the coprolite DNA can be ascribed to this species. Analysing the data for additional species to indicate the cave hyena diet, we retrieved abundant sequences for the red deer (Cervus elaphus), and characterized its mitochondrial genome with up to a 3.8-fold coverage. In conclusion, we have demonstrated the presence of abundant ancient DNA in the coprolites surveyed. Shotgun sequencing of this material yielded a wealth of DNA sequences for a Pleistocene carnivore and allowed unbiased identification of diet. PMID:22456883

  1. Characterization of a new apple luteovirus identified by high-throughput sequencing.

    PubMed

    Liu, Huawei; Wu, Liping; Nikolaeva, Ekaterina; Peter, Kari; Liu, Zongrang; Mollov, Dimitre; Cao, Mengji; Li, Ruhui

    2018-05-15

    'Rapid Apple Decline' (RAD) is a newly emerging problem of young, dwarf apple trees in the Northeastern USA. The affected trees show trunk necrosis, cracking and canker before collapse in summer. In this study, we discovered and characterized a new luteovirus from apple trees in RAD-affected orchards using high-throughput sequencing (HTS) technology and subsequent Sanger sequencing. Illumina NextSeq sequencing was applied to total RNAs prepared from three diseased apple trees. Sequence reads were de novo assembled, and contigs were annotated by BLASTx. RT-PCR and 5'/3' RACE sequencing were used to obtain the complete genome of a new virus. RT-PCR was used to detect the virus. Three common apple viruses and a new luteovirus were identified from the diseased trees by HTS and RT-PCR. Sequence analyses of the complete genome of the new virus show that it is a new species of the genus Luteovirus in the family Luteoviridae. The virus is graft transmissible and detected by RT-PCR in apple trees in a couple of orchards. A new luteovirus and/or three known viruses were found to be associated with RAD. Molecular characterization of the new luteovirus provides important information for further investigation of its distribution and etiological role.

  2. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

    PubMed

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy

    2015-05-01

    We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.

  3. Alignment of high-throughput sequencing data inside in-memory databases.

    PubMed

    Firnkorn, Daniel; Knaup-Gregori, Petra; Lorenzo Bermejo, Justo; Ganzinger, Matthias

    2014-01-01

    In times of high-throughput DNA sequencing techniques, performance-capable analysis of DNA sequences is of high importance. Computer supported DNA analysis is still an intensive time-consuming task. In this paper we explore the potential of a new In-Memory database technology by using SAP's High Performance Analytic Appliance (HANA). We focus on read alignment as one of the first steps in DNA sequence analysis. In particular, we examined the widely used Burrows-Wheeler Aligner (BWA) and implemented stored procedures in both HANA and the free database system MySQL to compare execution time and memory management. To ensure that the results are comparable, MySQL has been running in memory as well, utilizing its integrated memory engine for database table creation. We implemented stored procedures containing exact and inexact searching of DNA reads within the reference genome GRCh37. Due to technical restrictions in SAP HANA concerning recursion, the inexact matching problem could not be implemented on this platform. Hence, performance analysis between HANA and MySQL was made by comparing the execution time of the exact search procedures. Here, HANA was approximately 27 times faster than MySQL, which suggests that there is a high potential within the new In-Memory concepts, leading to further developments of DNA analysis procedures in the future.
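    For readers unfamiliar with the exact-search step mentioned above, the following is a minimal Python sketch of exact read matching by backward search over a Burrows-Wheeler transform, the general idea that BWA builds on. It is not the authors' stored procedure and not BWA's implementation: the function names and the toy reference string are hypothetical, and the naive suffix sorting used here would not scale to GRCh37.

```python
from collections import Counter


def build_fm_index(reference):
    """Build a small FM-index (suffix array, first-column offsets, occurrence table)."""
    text = reference + "$"  # "$" is a sentinel that sorts before A, C, G, T
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$" for i == 0).
    bwt = "".join(text[i - 1] for i in suffix_array)
    counts = Counter(text)
    # first[c]: number of characters in the text strictly smaller than c.
    first, running = {}, 0
    for c in sorted(counts):
        first[c] = running
        running += counts[c]
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return suffix_array, first, occ


def exact_search(read, index):
    """Return sorted start positions of all exact occurrences of read."""
    suffix_array, first, occ = index
    lo, hi = 0, len(suffix_array)  # half-open interval of matching suffixes
    for ch in reversed(read):      # extend the match one character to the left
        if ch not in first:
            return []
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[lo:hi])


reference = "ACGTACGTTAGC"  # toy reference, made up for illustration
index = build_fm_index(reference)
print(exact_search("ACG", index))
print(exact_search("TAG", index))
```

    Inexact matching in this family of methods extends the same backward search with a bounded number of mismatches, which is typically written recursively; that is the part the abstract reports could not be implemented on HANA.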

  4. Massively parallel whole genome amplification for single-cell sequencing using droplet microfluidics.

    PubMed

    Hosokawa, Masahito; Nishikawa, Yohei; Kogawa, Masato; Takeyama, Haruko

    2017-07-12

    Massively parallel single-cell genome sequencing is required to further understand genetic diversities in complex biological systems. Whole genome amplification (WGA) is the first step for single-cell sequencing, but its throughput and accuracy are insufficient in conventional reaction platforms. Here, we introduce single droplet multiple displacement amplification (sd-MDA), a method that enables massively parallel amplification of single cell genomes while maintaining sequence accuracy and specificity. Tens of thousands of single cells are compartmentalized in millions of picoliter droplets and then subjected to lysis and WGA by passive droplet fusion in microfluidic channels. Because single cells are isolated in compartments, their genomes are amplified to saturation without contamination. This enables the high-throughput acquisition of contamination-free and cell specific sequence reads from single cells (21,000 single-cells/h), resulting in enhancement of the sequence data quality compared to conventional methods. This method allowed WGA of both single bacterial cells and human cancer cells. The obtained sequencing coverage rivals those of conventional techniques with superior sequence quality. In addition, we also demonstrate de novo assembly of uncultured soil bacteria and obtain draft genomes from single cell sequencing. This sd-MDA is promising for flexible and scalable use in single-cell sequencing.

  5. Assessment of the cPAS-based BGISEQ-500 platform for metagenomic sequencing.

    PubMed

    Fang, Chao; Zhong, Huanzi; Lin, Yuxiang; Chen, Bing; Han, Mo; Ren, Huahui; Lu, Haorong; Luber, Jacob M; Xia, Min; Li, Wangsheng; Stein, Shayna; Xu, Xun; Zhang, Wenwei; Drmanac, Radoje; Wang, Jian; Yang, Huanming; Hammarström, Lennart; Kostic, Aleksandar D; Kristiansen, Karsten; Li, Junhua

    2018-03-01

    More extensive use of metagenomic shotgun sequencing in microbiome research relies on the development of high-throughput, cost-effective sequencing. Here we present a comprehensive evaluation of the performance of the new high-throughput sequencing platform BGISEQ-500 for metagenomic shotgun sequencing and compare its performance with that of 2 Illumina platforms. Using fecal samples from 20 healthy individuals, we evaluated the intra-platform reproducibility for metagenomic sequencing on the BGISEQ-500 platform in a setup comprising 8 library replicates and 8 sequencing replicates. Cross-platform consistency was evaluated by comparing 20 pairwise replicates on the BGISEQ-500 platform vs the Illumina HiSeq 2000 platform and the Illumina HiSeq 4000 platform. In addition, we compared the performance of the 2 Illumina platforms against each other. By a newly developed overall accuracy quality control method, an average of 82.45 million high-quality reads (96.06% of raw reads) per sample, with 90.56% of bases scoring Q30 and above, was obtained using the BGISEQ-500 platform. Quantitative analyses revealed extremely high reproducibility between BGISEQ-500 intra-platform replicates. Cross-platform replicates differed slightly more than intra-platform replicates, yet a high consistency was observed. Only a low percentage (2.02%-3.25%) of genes exhibited significant differences in relative abundance comparing the BGISEQ-500 and HiSeq platforms, with a bias toward genes with higher GC content being enriched on the HiSeq platforms. Our study provides the first set of performance metrics for human gut metagenomic sequencing data using BGISEQ-500. The high accuracy and technical reproducibility confirm the applicability of the new platform for metagenomic studies, though caution is still warranted when combining metagenomic data from different platforms.
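    Two of the figures above lend themselves to a quick check. A Phred quality score Q corresponds to a base-calling error probability of 10^(-Q/10), so Q30 means an error probability of 1 in 1,000; and the stated 96.06% retention implies a raw read count per sample. The helper name below is hypothetical, and the raw-read figure is derived here rather than reported in the abstract.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)


q30_error = phred_to_error_probability(30)

# Figures quoted in the abstract for the BGISEQ-500 platform.
high_quality_reads_millions = 82.45   # average per sample after quality control
retained_fraction = 0.9606            # 96.06% of raw reads

# Derived value (not stated in the abstract): average raw reads per sample.
implied_raw_reads_millions = high_quality_reads_millions / retained_fraction

print(f"Q30 error probability: {q30_error:.3f}")
print(f"Implied raw reads per sample: {implied_raw_reads_millions:.2f} million")
```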

  6. Differential Expression and Functional Analysis of High-Throughput -Omics Data Using Open Source Tools.

    PubMed

    Kebschull, Moritz; Fittler, Melanie Julia; Demmer, Ryan T; Papapanou, Panos N

    2017-01-01

    Today, -omics analyses, including the systematic cataloging of messenger RNA and microRNA sequences or DNA methylation patterns in a cell population, organ, or tissue sample, allow for an unbiased, comprehensive genome-level analysis of complex diseases, offering a large advantage over earlier "candidate" gene or pathway analyses. A primary goal in the analysis of these high-throughput assays is the detection of those features among several thousand that differ between different groups of samples. In the context of oral biology, our group has successfully utilized -omics technology to identify key molecules and pathways in different diagnostic entities of periodontal disease. A major issue when inferring biological information from high-throughput -omics studies is the fact that the sheer volume of high-dimensional data generated by contemporary technology is not appropriately analyzed using common statistical methods employed in the biomedical sciences. In this chapter, we outline a robust and well-accepted bioinformatics workflow for the initial analysis of -omics data generated using microarrays or next-generation sequencing technology using open-source tools. Starting with quality control measures and necessary preprocessing steps for data originating from different -omics technologies, we next outline a differential expression analysis pipeline that can be used for data from both microarray and sequencing experiments, and offers the possibility to account for random or fixed effects. Finally, we present an overview of the possibilities for a functional analysis of the obtained data.

  7. Improved growth of GaN layers on ultra thin silicon nitride/Si (1 1 1) by RF-MBE

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kumar, Mahesh; Roul, Basanta; Central Research Laboratory, Bharat Electronics, Bangalore 560013

    High-quality GaN epilayers were grown on Si (1 1 1) substrates by molecular beam epitaxy using a new growth process sequence which involved a substrate nitridation at low temperatures, annealing at high temperatures, followed by nitridation at high temperatures, deposition of a low-temperature buffer layer, and a high-temperature overgrowth. The material quality of the GaN films was also investigated as a function of nitridation time and temperature. Crystallinity and surface roughness of GaN were found to improve when the Si substrate was treated under the new growth process sequence. Micro-Raman and photoluminescence (PL) measurement results indicate that the GaN film grown by the new process sequence has less tensile stress and is optically good. The surface and interface structures of an ultra thin silicon nitride film grown on the Si surface are investigated by core-level photoelectron spectroscopy and it clearly indicates that the quality of silicon nitride notably affects the properties of GaN growth.

  8. Development of a Rapid Identification Method for a Variety of Antibody Candidates Using High-throughput Sequencing.

    PubMed

    Ito, Yuji

    2017-01-01

    As an alternative to hybridoma technology, the antibody phage library system can also be used for antibody selection. This method enables the isolation of antigen-specific binders through an in vitro selection process known as biopanning. While it has several advantages, such as an avoidance of animal immunization, the phage cloning and screening steps of biopanning are time-consuming and problematic. Here, we introduce a novel biopanning method combined with high-throughput sequencing (HTS) using a next-generation sequencer (NGS) to save time and effort in antibody selection, and to increase the diversity of acquired antibody sequences. Biopannings against a target antigen were performed using a human single chain Fv (scFv) antibody phage library. VH genes in pooled phages at each round of biopanning were analyzed by HTS on a NGS. The obtained data were trimmed, merged, and translated into amino acid sequences. The frequencies (%) of the respective VH sequences at each biopanning step were calculated, and the amplification factor (change of frequency through biopanning) was obtained to estimate the potential for antigen binding. A phylogenetic tree was drawn using the top 50 VH sequences with high amplification factors. Representative VH sequences forming the cluster were then picked up and used to reconstruct scFv genes harboring these VHs. Their derived scFv-Fc fusion proteins showed clear antigen binding activity. These results indicate that a combination of biopanning and HTS enables the rapid and comprehensive identification of specific binders from antibody phage libraries.
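    The frequency and amplification-factor calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract defines the amplification factor only as the change of frequency through biopanning, so the ratio of late-round to early-round frequency used here is an assumption, and the function names and toy VH labels are hypothetical.

```python
from collections import Counter


def frequencies_percent(sequences):
    """Frequency (%) of each distinct sequence among the reads of one round."""
    counts = Counter(sequences)
    total = sum(counts.values())
    return {seq: 100 * n / total for seq, n in counts.items()}


def amplification_factors(early_round, late_round):
    """Assumed definition: late-round frequency divided by early-round frequency."""
    early = frequencies_percent(early_round)
    late = frequencies_percent(late_round)
    # Only sequences observed in both rounds have a defined ratio.
    return {seq: late[seq] / early[seq] for seq in late if seq in early}


# Toy translated VH reads from two biopanning rounds (made-up labels).
round1 = ["VH_A"] * 2 + ["VH_B"] * 6 + ["VH_C"] * 2
round3 = ["VH_A"] * 8 + ["VH_B"] * 1 + ["VH_C"] * 1

factors = amplification_factors(round1, round3)
# Rank candidates by enrichment, highest first.
ranked = sorted(factors, key=factors.get, reverse=True)
print(ranked, factors)
```

    In this toy example VH_A is the least abundant sequence at the start but the most enriched, which is the kind of candidate a frequency-only ranking of the first round would miss.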

  9. An ultra-high-throughput spiral microfluidic biochip for the enrichment of circulating tumor cells.

    PubMed

    Warkiani, Majid Ebrahimi; Khoo, Bee Luan; Tan, Daniel Shao-Weng; Bhagat, Ali Asgar S; Lim, Wan-Teck; Yap, Yoon Sim; Lee, Soo Chin; Soo, Ross A; Han, Jongyoon; Lim, Chwee Teck

    2014-07-07

    The detection and characterization of rare circulating tumor cells (CTCs) from the blood of cancer patients can potentially provide critical insights into tumor biology and hold great promise for cancer management. The ability to collect a large number of viable CTCs for various downstream assays such as quantitative measurements of specific biomarkers or targeted somatic mutation analysis is increasingly important in medical oncology. Here, we present a simple yet reliable microfluidic device for the ultra-high-throughput, label-free, size-based isolation of CTCs from clinically relevant blood volumes. The fast processing time of the technique (7.5 mL blood in less than 10 min) and the ability to collect more CTCs from larger blood volumes lend themselves to a broad range of potential genomic and transcriptomic applications. A critical advantage of this protocol is the ability to return all fractions of blood (i.e., plasma (centrifugation), CTCs and white blood cells (WBCs) (size-based sorting)) that can be utilized for diverse biomarker studies or time-sensitive molecular assays such as RT-PCR. The clinical use of this biochip was demonstrated by detecting CTCs from 100% (10/10) of blood samples collected from patients with advanced-stage metastatic breast and lung cancers. The CTC recovery rate ranged from 20 to 135 CTCs/mL and was obtained under high purity (1 CTC out of every 30-100 WBCs, which gives ∼4 log depletion of WBCs). They were identified with immunofluorescence assays (pan-cytokeratin+/CD45-) and molecular probes such as HER2/neu.

  10. Filling reference gaps via assembling DNA barcodes using high-throughput sequencing—moving toward barcoding the world

    PubMed Central

    Zhou, Chengran

    2017-01-01

    Over the past decade, biodiversity researchers have dedicated tremendous efforts to constructing DNA reference barcodes for rapid species registration and identification. Although analytical cost for standard DNA barcoding has been significantly reduced since early 2000, further dramatic reduction in barcoding costs is unlikely because Sanger sequencing is approaching its limits in throughput and chemistry cost. Constraints in barcoding cost not only led to unbalanced barcoding efforts around the globe, but also prevented high-throughput sequencing (HTS)–based taxonomic identification from applying binomial species names, which provide crucial linkages to biological knowledge. We developed an Illumina-based pipeline, HIFI-Barcode, to produce full-length Cytochrome c oxidase subunit I (COI) barcodes from pooled polymerase chain reaction amplicons generated by individual specimens. The new pipeline generated accurate barcode sequences that were comparable to Sanger standards, even for different haplotypes of the same species that were only a few nucleotides different from each other. Additionally, the new pipeline was much more sensitive in recovering amplicons at low quantity. The HIFI-Barcode pipeline successfully recovered barcodes from more than 78% of the polymerase chain reactions that did not show clear bands on the electrophoresis gel. Moreover, sequencing results based on the single molecular sequencing platform PacBio confirmed the accuracy of the HIFI-Barcode results. Altogether, the new pipeline can provide an improved solution to produce full-length reference barcodes at about one-tenth of the current cost, enabling construction of comprehensive barcode libraries for local fauna, leading to a feasible direction for DNA barcoding global biomes. PMID:29077841

  11. Report for the NGFA-5 project.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jaing, C; Jackson, P; Thissen, J

    The objective of this project is to provide DHS a comprehensive evaluation of the current genomic technologies including genotyping, TaqMan PCR, multiple locus variable tandem repeat analysis (MLVA), microarray and high-throughput DNA sequencing in the analysis of biothreat agents from complex environmental samples. To effectively compare the sensitivity and specificity of the different genomic technologies, we used SNP TaqMan PCR, MLVA, microarray and high-throughput Illumina and 454 sequencing to test various strains from B. anthracis, B. thuringiensis, BioWatch aerosol filter extracts or soil samples that were spiked with B. anthracis, and samples that were previously collected during DHS and EPA environmental release exercises that were known to contain B. thuringiensis spores. The results of all the samples against the various assays are discussed in this report.

  12. Insights into the microbial diversity and community dynamics of Chinese traditional fermented foods from using high-throughput sequencing approaches

    PubMed Central

    He, Guo-qing; Liu, Tong-jie; Sadiq, Faizan A.; Gu, Jing-si; Zhang, Guo-hua

    2017-01-01

    Chinese traditional fermented foods have a very long history dating back thousands of years and have become an indispensable part of Chinese dietary culture. A plethora of research has been conducted to unravel the composition and dynamics of microbial consortia associated with Chinese traditional fermented foods using culture-dependent as well as culture-independent methods, like different high-throughput sequencing (HTS) techniques. These HTS techniques enable us to understand the relationship between a food product and its microbes to a greater extent than ever before. Considering the importance of Chinese traditional fermented products, the objective of this paper is to review the diversity and dynamics of microbiota in Chinese traditional fermented foods revealed by HTS approaches. PMID:28378567

  13. Gold nanoparticles for high-throughput genotyping of long-range haplotypes

    NASA Astrophysics Data System (ADS)

    Chen, Peng; Pan, Dun; Fan, Chunhai; Chen, Jianhua; Huang, Ke; Wang, Dongfang; Zhang, Honglu; Li, You; Feng, Guoyin; Liang, Peiji; He, Lin; Shi, Yongyong

    2011-10-01

    Completion of the Human Genome Project and the HapMap Project has led to increasing demands for mapping complex traits in humans to understand the aetiology of diseases. Identifying variations in the DNA sequence, which affect how we develop disease and respond to pathogens and drugs, is important for this purpose, but it is difficult to identify these variations in large sample sets. Here we show that through a combination of capillary sequencing and polymerase chain reaction assisted by gold nanoparticles, it is possible to identify several DNA variations that are associated with age-related macular degeneration and psoriasis on significant regions of human genomic DNA. Our method is accurate and promising for large-scale and high-throughput genetic analysis of susceptibility towards disease and drug resistance.

  14. Ultra-low-cost 3D gaze estimation: an intuitive high information throughput compliment to direct brain-machine interfaces

    NASA Astrophysics Data System (ADS)

    Abbott, W. W.; Faisal, A. A.

    2012-08-01

    Eye movements are highly correlated with motor intentions and are often retained by patients with serious motor deficiencies. Despite this, eye tracking is not widely used as a control interface for movement in impaired patients due to poor signal interpretation and lack of control flexibility. We propose that tracking the gaze position in 3D rather than 2D provides a considerably richer signal for human machine interfaces by allowing direct interaction with the environment rather than via computer displays. We demonstrate here that by using mass-produced video-game hardware, it is possible to produce an ultra-low-cost binocular eye-tracker with comparable performance to commercial systems, yet 800 times cheaper. Our head-mounted system has 30 USD material costs and operates at over 120 Hz sampling rate with a 0.5-1 degree of visual angle resolution. We perform 2D and 3D gaze estimation, controlling a real-time volumetric cursor essential for driving complex user interfaces. Our approach yields an information throughput of 43 bits/s, more than ten times that of invasive and semi-invasive brain-machine interfaces (BMIs) that are vastly more expensive. Unlike many BMIs our system yields effective real-time closed loop control of devices (10 ms latency), after just ten minutes of training, which we demonstrate through a novel BMI benchmark: the control of the video arcade game ‘Pong’.

  15. A Dual-Mode Large-Arrayed CMOS ISFET Sensor for Accurate and High-Throughput pH Sensing in Biomedical Diagnosis.

    PubMed

    Huang, Xiwei; Yu, Hao; Liu, Xu; Jiang, Yu; Yan, Mei; Wu, Dongping

    2015-09-01

    The existing ISFET-based DNA sequencing detects hydrogen ions released during the polymerization of DNA strands on microbeads, which are scattered into microwell array above the ISFET sensor with unknown distribution. However, false pH detection happens at empty microwells due to crosstalk from neighboring microbeads. In this paper, a dual-mode CMOS ISFET sensor is proposed to have accurate pH detection toward DNA sequencing. Dual-mode sensing, optical and chemical modes, is realized by integrating a CMOS image sensor (CIS) with ISFET pH sensor, and is fabricated in a standard 0.18-μm CIS process. With accurate determination of microbead physical locations with CIS pixel by contact imaging, the dual-mode sensor can correlate local pH for one DNA slice at one location-determined microbead, which can result in improved pH detection accuracy. Moreover, toward a high-throughput DNA sequencing, a correlated-double-sampling readout that supports large array for both modes is deployed to reduce pixel-to-pixel nonuniformity such as threshold voltage mismatch. The proposed CMOS dual-mode sensor is experimentally examined to show a well correlated pH map and optical image for microbeads with a pH sensitivity of 26.2 mV/pH, a fixed pattern noise (FPN) reduction from 4% to 0.3%, and a readout speed of 1200 frames/s. A dual-mode CMOS ISFET sensor with suppressed FPN for accurate large-arrayed pH sensing is proposed and demonstrated with state-of-the-art measured results toward accurate and high-throughput DNA sequencing. The developed dual-mode CMOS ISFET sensor has great potential for future personal genome diagnostics with high accuracy and low cost.
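    The correlated-double-sampling idea mentioned in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the per-pixel offset spread (40 mV), the read-noise level (1 mV), and the function names are invented for illustration, and only the 26.2 mV/pH sensitivity comes from the abstract. Each pixel is sampled twice, once at a reference level and once with the signal, and the difference cancels the fixed per-pixel offset.

```python
import random
import statistics

SENSITIVITY_MV_PER_PH = 26.2  # pH sensitivity reported in the abstract


def simulate_array(n_pixels, ph, seed=0):
    """Return (raw, cds) readings in mV for an array exposed to a uniform pH."""
    rng = random.Random(seed)
    raw, cds = [], []
    for _ in range(n_pixels):
        # Hypothetical fixed offset of this pixel (e.g. threshold voltage mismatch).
        offset = rng.gauss(0.0, 40.0)
        # Sample 1: reference level, carries the offset plus read noise.
        reset = offset + rng.gauss(0.0, 1.0)
        # Sample 2: signal level, carries the same offset plus the pH response.
        signal = offset + SENSITIVITY_MV_PER_PH * ph + rng.gauss(0.0, 1.0)
        raw.append(signal)
        cds.append(signal - reset)  # the fixed offset cancels in the difference
    return raw, cds


def spread(values):
    """Pixel-to-pixel spread (population standard deviation) in mV."""
    return statistics.pstdev(values)


raw, cds = simulate_array(n_pixels=1000, ph=7.0)
print(f"spread without CDS: {spread(raw):.1f} mV, with CDS: {spread(cds):.1f} mV")
```

    In this toy model the residual spread after subtraction comes only from the uncorrelated read noise of the two samples, which is why the difference is far more uniform than the raw readings.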

  16. Pathogenic bacteria in sewage treatment plants as revealed by 454 pyrosequencing.

    PubMed

    Ye, Lin; Zhang, Tong

    2011-09-01

    This study applied 454 high-throughput pyrosequencing to analyze potentially pathogenic bacteria in activated sludge from 14 municipal wastewater treatment plants (WWTPs) across four countries (China, U.S., Canada, and Singapore), plus the influent and effluent of one of the 14 WWTPs. A total of 370,870 16S rRNA gene sequences with average length of 207 bps were obtained and all of them were assigned to corresponding taxonomic ranks by using RDP classifier and MEGAN. It was found that the most abundant potentially pathogenic bacteria in the WWTPs were affiliated with the genera of Aeromonas and Clostridium. Aeromonas veronii, Aeromonas hydrophila, and Clostridium perfringens were species most similar to the potentially pathogenic bacteria found in this study. Some sequences highly similar (>99%) to Corynebacterium diphtheriae were found in the influent and activated sludge samples from a saline WWTP. Overall, the percentage of the sequences closely related (>99%) to known pathogenic bacteria sequences was about 0.16% of the total sequences. Additionally, a platform-independent Java application (BAND) was developed for graphical visualization of the data of microbial abundance generated by high-throughput pyrosequencing. The approach demonstrated in this study could examine most of the potentially pathogenic bacteria simultaneously instead of one-by-one detection by other methods.

  17. Accurate Sample Assignment in a Multiplexed, Ultrasensitive, High-Throughput Sequencing Assay for Minimal Residual Disease.

    PubMed

    Bartram, Jack; Mountjoy, Edward; Brooks, Tony; Hancock, Jeremy; Williamson, Helen; Wright, Gary; Moppett, John; Goulden, Nick; Hubank, Mike

    2016-07-01

    High-throughput sequencing (HTS) (next-generation sequencing) of the rearranged Ig and T-cell receptor genes promises to be less expensive and more sensitive than current methods of monitoring minimal residual disease (MRD) in patients with acute lymphoblastic leukemia. However, the adoption of new approaches by clinical laboratories requires careful evaluation of all potential sources of error and the development of strategies to ensure the highest accuracy. Timely and efficient clinical use of HTS platforms will depend on combining multiple samples (multiplexing) in each sequencing run. Here we examine the Ig heavy-chain gene HTS on the Illumina MiSeq platform for MRD. We identify errors associated with multiplexing that could potentially impact the accuracy of MRD analysis. We optimize a strategy that combines high-purity, sequence-optimized oligonucleotides, dual indexing, and an error-aware demultiplexing approach to minimize errors and maximize sensitivity. We present a probability-based, demultiplexing pipeline Error-Aware Demultiplexer that is suitable for all MiSeq strategies and accurately assigns samples to the correct identifier without excessive loss of data. Finally, using controls quantified by digital PCR, we show that HTS-MRD can accurately detect as few as 1 in 10^6 copies of specific leukemic MRD. Crown Copyright © 2016. Published by Elsevier Inc. All rights reserved.
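    To make the dual-indexing idea in the abstract above concrete, here is a minimal sketch of conservative sample assignment. It is not the authors' Error-Aware Demultiplexer, which the abstract describes as probability-based; this stand-in uses a simple mismatch threshold instead. The sample names, index sequences, and the `max_mismatch` parameter are all hypothetical.

```python
from typing import Dict, Optional, Tuple


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("index reads must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(
    i7: str,
    i5: str,
    sheet: Dict[str, Tuple[str, str]],
    max_mismatch: int = 1,
) -> Optional[str]:
    """Return the one sample whose index pair matches both reads, else None."""
    hits = [
        name
        for name, (exp_i7, exp_i5) in sheet.items()
        if hamming(i7, exp_i7) <= max_mismatch and hamming(i5, exp_i5) <= max_mismatch
    ]
    # Reads matching no sample, or more than one, are discarded rather than guessed.
    return hits[0] if len(hits) == 1 else None


# Hypothetical sample sheet: sample name -> (i7 index, i5 index).
SHEET = {
    "patient_A": ("ACGTACGT", "TTGCAAGC"),
    "patient_B": ("GTCAGTCA", "CCATGGTA"),
}

print(assign_sample("ACGTACGA", "TTGCAAGC", SHEET))  # one mismatch in i7
print(assign_sample("ACGTACGT", "CCATGGTA", SHEET))  # swapped index pair
```

    Requiring both indexes to agree means a read carrying the i7 index of one sample and the i5 index of another is left unassigned, which is the kind of cross-sample error that matters when looking for very rare sequences.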

  18. Fluorescence-based high-throughput screening of dicer cleavage activity.

    PubMed

    Podolska, Katerina; Sedlak, David; Bartunek, Petr; Svoboda, Petr

    2014-03-01

    Production of small RNAs by ribonuclease III Dicer is a key step in microRNA and RNA interference pathways, which employ Dicer-produced small RNAs as sequence-specific silencing guides. Further studies and manipulations of microRNA and RNA interference pathways would benefit from identification of small-molecule modulators. Here, we report a study of a fluorescence-based in vitro Dicer cleavage assay, which was adapted for high-throughput screening. The kinetic assay can be performed under single-turnover conditions (35 nM substrate and 70 nM Dicer) in a small volume (5 µL), which makes it suitable for high-throughput screening in a 1536-well format. As a proof of principle, a small library of bioactive compounds was analyzed, demonstrating potential of the assay.

  19. Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing

    PubMed Central

    Hykin, Sarah M.; Bi, Ke; McGuire, Jimmy A.

    2015-01-01

    For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens, particularly for use in phylogenetic analyses, have been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due either to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for genetic analysis. PMID:26505622

  20. Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing.

    PubMed

    Hykin, Sarah M; Bi, Ke; McGuire, Jimmy A

    2015-01-01

    For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens, particularly for use in phylogenetic analyses, have been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due either to the conditions of preservation and/or the amount of tissue used for extraction purposes. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for genetic analysis.

  1. High-throughput physical mapping of chromosomes using automated in situ hybridization.

    PubMed

    George, Phillip; Sharakhova, Maria V; Sharakhov, Igor V

    2012-06-28

    Projects to obtain whole-genome sequences for 10,000 vertebrate species and for 5,000 insect and related arthropod species are expected to take place over the next 5 years. For example, the sequencing of the genomes for 15 malaria mosquito species is currently being done using an Illumina platform. This Anopheles species cluster includes both vectors and non-vectors of malaria. When the genome assemblies become available, researchers will have the unique opportunity to perform comparative analysis for inferring evolutionary changes relevant to vector ability. However, it has proven difficult to use next-generation sequencing reads to generate high-quality de novo genome assemblies. Moreover, the existing genome assemblies for Anopheles gambiae, although obtained using the Sanger method, are gapped or fragmented. Success of comparative genomic analyses will be limited if researchers deal with numerous sequencing contigs, rather than with chromosome-based genome assemblies. Fragmented, unmapped sequences create problems for genomic analyses because: (i) unidentified gaps cause incorrect or incomplete annotation of genomic sequences; (ii) unmapped sequences lead to confusion between paralogous genes and genes from different haplotypes; and (iii) the lack of chromosome assignment and orientation of the sequencing contigs does not allow for reconstructing rearrangement phylogeny and studying chromosome evolution. Developing high-resolution physical maps for species with newly sequenced genomes is a timely and cost-effective investment that will facilitate genome annotation, evolutionary analysis, and re-sequencing of individual genomes from natural populations. Here, we present innovative approaches to chromosome preparation, fluorescent in situ hybridization (FISH), and imaging that facilitate rapid development of physical maps. Using An. gambiae as an example, we demonstrate that the development of physical chromosome maps can potentially improve genome assemblies and, thus, the quality of genomic analyses. First, we use a high-pressure method to prepare polytene chromosome spreads. This method, originally developed for Drosophila, allows the user to visualize more details on chromosomes than the regular squashing technique. Second, a fully automated, front-end system for FISH is used for high-throughput physical genome mapping. The automated slide staining system runs multiple assays simultaneously and dramatically reduces hands-on time. Third, an automatic fluorescent imaging system, which includes a motorized slide stage, automatically scans and photographs labeled chromosomes after FISH. This system is especially useful for identifying and visualizing multiple chromosomal plates on the same slide. In addition, the scanning process captures a more uniform FISH result. Overall, the automated high-throughput physical mapping protocol is more efficient than a standard manual protocol.

  2. CRISPR-Cas9-Edited Site Sequencing (CRES-Seq): An Efficient and High-Throughput Method for the Selection of CRISPR-Cas9-Edited Clones.

    PubMed

    Veeranagouda, Yaligara; Debono-Lagneaux, Delphine; Fournet, Hamida; Thill, Gilbert; Didier, Michel

    2018-01-16

    The emergence of clustered regularly interspaced short palindromic repeats-Cas9 (CRISPR-Cas9) gene editing systems has enabled the creation of specific mutants at low cost, in a short time and with high efficiency, in eukaryotic cells. Since a CRISPR-Cas9 system typically creates an array of mutations in targeted sites, a successful gene editing project requires careful selection of edited clones. This process can be very challenging, especially when working with multiallelic genes and/or polyploid cells (such as cancer and plant cells). Here we describe a next-generation sequencing method called CRISPR-Cas9 Edited Site Sequencing (CRES-Seq) for the efficient and high-throughput screening of CRISPR-Cas9-edited clones. CRES-Seq facilitates the precise genotyping of up to 96 CRISPR-Cas9-edited sites (CRES) in a single MiniSeq (Illumina) run with an approximate sequencing cost of $6/clone. CRES-Seq is particularly useful when multiple genes are simultaneously targeted by CRISPR-Cas9, and also for screening of clones generated from multiallelic genes/polyploid cells. © 2018 by John Wiley & Sons, Inc.

  3. Identification and removal of low-complexity sites in allele-specific analysis of ChIP-seq data.

    PubMed

    Waszak, Sebastian M; Kilpinen, Helena; Gschwind, Andreas R; Orioli, Andrea; Raghav, Sunil K; Witwicki, Robert M; Migliavacca, Eugenia; Yurovsky, Alisa; Lappalainen, Tuuli; Hernandez, Nouria; Reymond, Alexandre; Dermitzakis, Emmanouil T; Deplancke, Bart

    2014-01-15

    High-throughput sequencing technologies enable the genome-wide analysis of the impact of genetic variation on molecular phenotypes at unprecedented resolution. However, although powerful, these technologies can also introduce unexpected artifacts. We investigated the impact of library amplification bias on the identification of allele-specific (AS) molecular events from high-throughput sequencing data derived from chromatin immunoprecipitation assays (ChIP-seq). Putative AS DNA binding activity for RNA polymerase II was determined using ChIP-seq data derived from lymphoblastoid cell lines of two parent-daughter trios. We found that, at high sequencing depth, many significant AS binding sites suffered from an amplification bias, as evidenced by a larger number of clonal reads representing one of the two alleles. To alleviate this bias, we devised an amplification bias detection strategy, which filters out sites with low read complexity and sites featuring a significant excess of clonal reads. This method will be useful for AS analyses involving ChIP-seq and other functional sequencing assays. The R package absfilter for library clonality simulations and detection of amplification-biased sites is available from http://updepla1srv1.epfl.ch/waszaks/absfilter
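    The filtering idea in the abstract above (drop sites with low read complexity, where clonal reads inflate one allele) can be sketched as follows. This is an illustrative toy, not the R package named in the abstract: the `Read` record, the complexity definition (distinct start/strand combinations divided by total reads), and the 0.5 threshold are all assumptions made for the example.

```python
from collections import Counter
from typing import List, NamedTuple


class Read(NamedTuple):
    start: int    # mapped start position
    strand: str   # "+" or "-"
    allele: str   # "ref" or "alt" at the heterozygous site


def read_complexity(reads: List[Read]) -> float:
    """Fraction of reads with a distinct (start, strand); 1.0 means no clonal reads."""
    if not reads:
        return 0.0
    return len({(r.start, r.strand) for r in reads}) / len(reads)


def collapsed_allele_counts(reads: List[Read]) -> Counter:
    """Allele counts after keeping one read per (start, strand, allele)."""
    unique = {(r.start, r.strand, r.allele) for r in reads}
    return Counter(allele for _, _, allele in unique)


def keep_site(reads: List[Read], min_complexity: float = 0.5) -> bool:
    """Hypothetical filter: keep the site only if its read complexity is high enough."""
    return read_complexity(reads) >= min_complexity


# Ten identical "alt" reads (likely PCR duplicates) plus two distinct "ref" reads.
site = [Read(100, "+", "alt")] * 10 + [Read(90, "+", "ref"), Read(110, "-", "ref")]
print(Counter(r.allele for r in site))   # raw counts suggest a strong alt bias
print(collapsed_allele_counts(site))     # the bias disappears after collapsing
print(keep_site(site))                   # low complexity, so the site is dropped
```

    The example site would look strongly allele-specific from raw counts alone, yet nearly all of its support comes from copies of a single fragment, which is the situation the abstract attributes to amplification bias.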

  4. Human papillomavirus detection using the Abbott RealTime high-risk HPV tests compared with conventional nested PCR coupled to high-throughput sequencing of amplification products in cervical smear specimens from a Gabonese female population.

    PubMed

    Moussavou-Boundzanga, Pamela; Koumakpayi, Ismaël Hervé; Labouba, Ingrid; Leroy, Eric M; Belembaogo, Ernest; Berthet, Nicolas

    2017-12-21

    Cervical cancer is the fourth most common malignancy in women worldwide. However, screening with human papillomavirus (HPV) molecular tests holds promise for reducing cervical cancer incidence and mortality in low- and middle-income countries. The performance of the Abbott RealTime High-Risk HPV test (AbRT) was evaluated in 83 cervical smear specimens and compared with a conventional nested PCR coupled to high-throughput sequencing (HTS) to identify the amplicons. The AbRT assay detected at least one HPV genotype in 44.57% of women regardless of the grade of cervical abnormalities. Except for one case, good concordance was observed for the genotypes detected with the AbRT assay in the high-risk HPV category determined with HTS of the amplicon generated by conventional nested PCR. The AbRT test is an easy and reliable molecular tool and was as sensitive as conventional nested PCR in cervical smear specimens for detection of HPVs associated with high-grade lesions. Moreover, sequencing amplicons using an HTS approach effectively identified the genotype of the hrHPV identified with the AbRT test.

  5. Simultaneous identification and molecular characterization of viruses associated with an apple tree with mosaic symptom

    USDA-ARS?s Scientific Manuscript database

    We conducted genomic sequencing to identify viruses associated with mosaic disease of an apple tree using the high-throughput sequencing (HTS) Illumina RNA-seq platform. The objective was to examine if rapid identification and characterization of viruses could be effectively achieved by RNA-seq anal...

  6. Analysis of Petunia hybrida in response to salt stress using high throughput RNA sequencing

    USDA-ARS?s Scientific Manuscript database

    Salt and drought are among the greatest challenges to crop and native plants in meeting their yield and reproductive potentials. DNA sequencing-enabled transcriptome profiling provides a means of assessing what genes are responding to salt or drought stress so as to better understand the molecular ...

  7. A Computational Framework for High-Throughput Isotopic Natural Abundance Correction of Omics-Level Ultra-High Resolution FT-MS Datasets

    PubMed Central

    Carreer, William J.; Flight, Robert M.; Moseley, Hunter N. B.

    2013-01-01

    New metabolomics applications of ultra-high resolution and accuracy mass spectrometry can provide thousands of detectable isotopologues, with the number of potentially detectable isotopologues increasing exponentially with the number of stable isotopes used in newer isotope tracing methods like stable isotope-resolved metabolomics (SIRM) experiments. This huge increase in usable data requires software capable of correcting the large number of isotopologue peaks resulting from SIRM experiments in a timely manner. We describe the design of a new algorithm and software system capable of handling these high volumes of data, while including quality control methods for maintaining data quality. We validate this new algorithm against a previous single isotope correction algorithm in a two-step cross-validation. Next, we demonstrate the algorithm and correct for the effects of natural abundance for both 13C and 15N isotopes on a set of raw isotopologue intensities of UDP-N-acetyl-D-glucosamine derived from a 13C/15N-tracing experiment. Finally, we demonstrate the algorithm on a full omics-level dataset. PMID:24404440
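    As a rough illustration of what natural abundance correction means, the sketch below handles a single isotope (13C only) with a binomial model. It is not the algorithm described in the abstract above, which corrects for multiple isotopes; the function names are hypothetical, the 13C natural abundance is taken as approximately 1.07%, and the six-carbon molecule in the example is invented.

```python
from math import comb
from typing import List

P13C = 0.0107  # approximate natural abundance of carbon-13


def shift_prob(n: int, k: int, m: int, p: float = P13C) -> float:
    """P(m heavy carbons observed | k of n carbons are tracer-labelled)."""
    if m < k:
        return 0.0
    # The n - k unlabelled carbons each carry 13C with probability p.
    return comb(n - k, m - k) * p ** (m - k) * (1 - p) ** (n - m)


def add_natural_abundance(true: List[float], n: int, p: float = P13C) -> List[float]:
    """Forward model: smear true isotopologue intensities toward heavier peaks."""
    return [
        sum(true[k] * shift_prob(n, k, m, p) for k in range(m + 1))
        for m in range(n + 1)
    ]


def correct_natural_abundance(observed: List[float], n: int, p: float = P13C) -> List[float]:
    """Invert the forward model peak by peak, starting from the lightest."""
    true: List[float] = []
    for m in range(n + 1):
        # Intensity that lighter isotopologues contributed to peak m.
        spill = sum(true[k] * shift_prob(n, k, m, p) for k in range(m))
        true.append((observed[m] - spill) / shift_prob(n, m, m, p))
    return true


# Hypothetical six-carbon metabolite: mostly unlabelled, some fully labelled.
n_carbons = 6
true_intensities = [70.0, 0.0, 0.0, 10.0, 0.0, 0.0, 20.0]
observed = add_natural_abundance(true_intensities, n_carbons)
recovered = correct_natural_abundance(observed, n_carbons)
print([round(x, 3) for x in observed])
print([round(x, 3) for x in recovered])
```

    Because a peak can only receive natural-abundance signal from lighter isotopologues, the forward model is triangular and can be inverted by simple substitution. With two tracer isotopes the peaks are indexed by pairs of label counts and their number grows quickly, which is part of the scaling problem the abstract describes.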

  8. Enantioselective ultra high performance liquid and supercritical fluid chromatography: The race to the shortest chromatogram.

    PubMed

    Ciogli, Alessia; Ismail, Omar H; Mazzoccanti, Giulia; Villani, Claudio; Gasparrini, Francesco

    2018-03-01

    The ever-increasing need for enantiomerically pure chiral compounds has greatly expanded the number of enantioselective separation methods available for the precise and accurate measurement of enantiomeric purity. The introduction of chiral stationary phases for liquid chromatography in the last decades has revolutionized the routine methods to determine enantiomeric purity of chiral drugs, agrochemicals, fragrances, and in general of organic and organometallic compounds. In recent years, additional efforts have been placed on faster, enantioselective analytical methods capable of fulfilling the high throughput requirements of modern screening procedures. Efforts in this field, capitalizing on improved chromatographic particle technology and dedicated instrumentation, have led to highly efficient separations that are routinely completed on the seconds time scale. An overview of the recent achievements in the field of ultra-high-resolution chromatography on columns packed with chiral stationary phases, both based on sub-2 μm fully porous and sub-3 μm superficially porous particles, will be given, with an emphasis on very recent studies on ultrafast chiral separations. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. [Study on Microbial Diversity of Peri-implantitis Subgingival by High-throughput Sequencing].

    PubMed

    Li, Zhi-jie; Wang, Shao-guo; Li, Yue-hong; Tu, Dong-xiang; Liu, Shi-yun; Nie, Hong-bing; Li, Zhi-qiang; Zhang, Ju-mei

    2015-07-01

    To study microbial diversity of peri-implantitis subgingival with high-throughput sequencing, and investigate microbiological etiology of peri-implantitis. Subgingival plaques were sampled from the patients with peri-implantitis (D group) and non-peri-implantitis subjects (N group). The microbiological diversity of the subgingival plaques was detected by sequencing V4 region of 16S rRNA with Illumina MiSeq platform. The diversity of the community structure was analyzed using Mothur software. A total of 156,507 gene sequences were detected in nine samples and 4,402 operational taxonomic units (OTUs) were found. Selenomonas, Pseudomonas, and Fusobacterium were dominant bacteria in D group, while Fusobacterium, Veillonella and Streptococcus were dominant bacteria in N group. Differences between peri-implantitis and non-peri-implantitis bacterial communities were observed at all phylogenetic levels by LEfSe, which was also found in PCoA test. The occurrence of peri-implantitis is not only related to periodontitis pathogenic microbes, but also related to the changes of oral microbial community structure. Treponema, Herbaspirillum, Butyricimonas and Phaeobacte may be closely related to the occurrence and development of peri-implantitis.

  10. High-throughput sequencing of natively paired antibody chains provides evidence for original antigenic sin shaping the antibody response to influenza vaccination.

    PubMed

    Tan, Yann-Chong; Blum, Lisa K; Kongpachith, Sarah; Ju, Chia-Hsin; Cai, Xiaoyong; Lindstrom, Tamsin M; Sokolove, Jeremy; Robinson, William H

    2014-03-01

    We developed a DNA barcoding method to enable high-throughput sequencing of the cognate heavy- and light-chain pairs of the antibodies expressed by individual B cells. We used this approach to elucidate the plasmablast antibody response to influenza vaccination. We show that >75% of the rationally selected plasmablast antibodies bind and neutralize influenza, and that antibodies from clonal families, defined by sharing both heavy-chain VJ and light-chain VJ sequence usage, do so most effectively. Vaccine-induced heavy-chain VJ regions contained on average >20 nucleotide mutations as compared to their predicted germline gene sequences, and some vaccine-induced antibodies exhibited higher binding affinities for hemagglutinins derived from prior years' seasonal influenza as compared to their affinities for the immunization strains. Our results show that influenza vaccination induces the recall of memory B cells that express antibodies that previously underwent affinity maturation against prior years' seasonal influenza, suggesting that 'original antigenic sin' shapes the antibody response to influenza vaccination. Published by Elsevier Inc.

  11. Brain Connectivity as a DNA Sequencing Problem

    NASA Astrophysics Data System (ADS)

    Zador, Anthony

    The mammalian cortex consists of millions or billions of neurons, each connected to thousands of other neurons. Traditional methods for determining the brain connectivity rely on microscopy to visualize neuronal connections, but such methods are slow, labor-intensive and often lack single neuron resolution. We have recently developed a new method, MAPseq, to recast the determination of brain wiring into a form that can exploit the tremendous recent advances in high-throughput DNA sequencing. DNA sequencing technology has outpaced even Moore's law, so that the cost of sequencing the human genome has dropped from a billion dollars in 2001 to below a thousand dollars today. MAPseq works by introducing random sequences of DNA ("barcodes") to tag neurons uniquely. With MAPseq, we can determine the connectivity of over 50K single neurons in a single mouse cortex in about a week, an unprecedented throughput, ushering in the era of "big data" for brain wiring. We are now developing analytical tools and algorithms to make sense of these novel data sets.
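    Why random barcodes can tag neurons uniquely comes down to simple counting, sketched below. The model is an assumption made for illustration: every neuron draws one barcode uniformly at random from all 4^L sequences of length L, which real barcode libraries only approximate. The barcode lengths used are illustrative and are not taken from the abstract; only the figure of about 50,000 neurons is.

```python
from math import exp, log1p


def unique_fraction(n_neurons: int, barcode_len: int) -> float:
    """Expected fraction of neurons whose barcode is shared with no other neuron.

    Assumes each neuron draws one barcode uniformly from 4**barcode_len sequences.
    """
    n_codes = 4 ** barcode_len
    # (1 - 1/N)**(k - 1), evaluated via log1p to keep precision when N is huge.
    return exp((n_neurons - 1) * log1p(-1.0 / n_codes))


for length in (8, 12, 30):
    print(length, unique_fraction(50_000, length))
```

    Under this model a short barcode leaves many neurons sharing a label, while a long one makes collisions negligible even for tens of thousands of neurons.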

  12. High-throughput automated microfluidic sample preparation for accurate microbial genomics

    PubMed Central

    Kim, Soohong; De Jonghe, Joachim; Kulesa, Anthony B.; Feldman, David; Vatanen, Tommi; Bhattacharyya, Roby P.; Berdy, Brittany; Gomez, James; Nolan, Jill; Epstein, Slava; Blainey, Paul C.

    2017-01-01

    Low-cost shotgun DNA sequencing is transforming the microbial sciences. Sequencing instruments are so effective that sample preparation is now the key limiting factor. Here, we introduce a microfluidic sample preparation platform that integrates the key steps in cells to sequence library sample preparation for up to 96 samples and reduces DNA input requirements 100-fold while maintaining or improving data quality. The general-purpose microarchitecture we demonstrate supports workflows with arbitrary numbers of reaction and clean-up or capture steps. By reducing the sample quantity requirements, we enabled low-input (∼10,000 cells) whole-genome shotgun (WGS) sequencing of Mycobacterium tuberculosis and soil micro-colonies with superior results. We also leveraged the enhanced throughput to sequence ∼400 clinical Pseudomonas aeruginosa libraries and demonstrate excellent single-nucleotide polymorphism detection performance that explained phenotypically observed antibiotic resistance. Fully-integrated lab-on-chip sample preparation overcomes technical barriers to enable broader deployment of genomics across many basic research and translational applications. PMID:28128213

  13. Genome-wide mapping of autonomous promoter activity in human cells

    PubMed Central

    van Arensbergen, Joris; FitzPatrick, Vincent D.; de Haas, Marcel; Pagie, Ludo; Sluimer, Jasper; Bussemaker, Harmen J.; van Steensel, Bas

    2017-01-01

    Previous methods to systematically characterize sequence-intrinsic activity of promoters have been limited by relatively low throughput and the length of sequences that could be tested. Here we present Survey of Regulatory Elements (SuRE), a method to assay more than 10^8 DNA fragments, each 0.2–2 kb in size, for their ability to drive transcription autonomously. In SuRE, a plasmid library is constructed of random genomic fragments upstream of a 20 bp barcode and decoded by paired-end sequencing. This library is then transfected into cells and transcribed barcodes are quantified in the RNA by high throughput sequencing. When applied to the human genome, we achieved a 55-fold genome coverage, allowing us to map autonomous promoter activity genome-wide. By computational modeling we delineated subregions within promoters that are relevant for their activity. For instance, we show that antisense promoter transcription is generally dependent on the sense core promoter sequences, and that most enhancers and several families of repetitive elements act as autonomous transcription initiation sites. PMID:28024146

  14. MetaUniDec: High-Throughput Deconvolution of Native Mass Spectra

    NASA Astrophysics Data System (ADS)

    Reid, Deseree J.; Diesing, Jessica M.; Miller, Matthew A.; Perry, Scott M.; Wales, Jessica A.; Montfort, William R.; Marty, Michael T.

    2018-04-01

    The expansion of native mass spectrometry (MS) methods for both academic and industrial applications has created a substantial need for analysis of large native MS datasets. Existing software tools are poorly suited for high-throughput deconvolution of native electrospray mass spectra from intact proteins and protein complexes. The UniDec Bayesian deconvolution algorithm is uniquely well suited for high-throughput analysis due to its speed and robustness but was previously tailored towards individual spectra. Here, we optimized UniDec for deconvolution, analysis, and visualization of large data sets. This new module, MetaUniDec, centers around a hierarchical data format 5 (HDF5) format for storing datasets that significantly improves speed, portability, and file size. It also includes code optimizations to improve speed and a new graphical user interface for visualization, interaction, and analysis of data. To demonstrate the utility of MetaUniDec, we applied the software to analyze automated collision voltage ramps with a small bacterial heme protein and large lipoprotein nanodiscs. Upon increasing collisional activation, bacterial heme-nitric oxide/oxygen binding (H-NOX) protein shows a discrete loss of bound heme, and nanodiscs show a continuous loss of lipids and charge. By using MetaUniDec to track changes in peak area or mass as a function of collision voltage, we explore the energetic profile of collisional activation in an ultra-high mass range Orbitrap mass spectrometer.

  15. Tempo and mode of genomic mutations unveil human evolutionary history.

    PubMed

    Hara, Yuichiro

    2015-01-01

    Mutations that have occurred in human genomes provide insight into various aspects of evolutionary history such as speciation events and degrees of natural selection. Comparing genome sequences between human and great apes or among humans is a feasible approach for inferring human evolutionary history. Recent advances in high-throughput or so-called 'next-generation' DNA sequencing technologies have enabled the sequencing of thousands of individual human genomes, as well as a variety of reference genomes of hominids, many of which are publicly available. These sequence data can help to unveil the detailed demographic history of the lineage leading to humans as well as the explosion of modern human population size in the last several thousand years. In addition, high-throughput sequencing illustrates the tempo and mode of de novo mutations, which are producing human genetic variation at this moment. Pedigree-based human genome sequencing has shown that mutation rates vary significantly across the human genome. These studies have also provided an improved timescale of human evolution, because the mutation rate estimated from pedigree analysis is half that estimated from traditional analyses based on molecular phylogeny. Because of the dramatic reduction in sequencing cost, sequencing on-demand samples designed for specific studies is now also becoming popular. To produce data of sufficient quality to meet the requirements of the study, it is necessary to set an explicit sequencing plan that includes the choice of sample collection methods, sequencing platforms, and number of sequence reads.

  16. Sharing of photobionts in sympatric populations of Thamnolia and Cetraria lichens: evidence from high-throughput sequencing.

    PubMed

    Onuț-Brännström, Ioana; Benjamin, Mitchell; Scofield, Douglas G; Heiðmarsson, Starri; Andersson, Martin G I; Lindström, Eva S; Johannesson, Hanna

    2018-03-13

    In this study, we explored the diversity of green algal symbionts (photobionts) in sympatric populations of the cosmopolitan lichen-forming fungi Thamnolia and Cetraria. We sequenced with both Sanger and Ion Torrent High-Throughput Sequencing technologies the photobiont ITS-region of 30 lichen thalli from two islands: Iceland and Öland. While Sanger recovered just one photobiont genotype from each thallus, the Ion Torrent data recovered 10-18 OTUs for each pool of 5 lichen thalli, suggesting that individual lichens can contain heterogeneous photobiont populations. Both methods showed evidence for photobiont sharing between Thamnolia and Cetraria on Iceland. In contrast, our data suggest that on Öland the two mycobionts associate with distinct photobiont communities, with few shared OTUs revealed by Ion Torrent sequencing. Furthermore, by comparing our sequences with public data, we identified closely related photobionts from geographically distant localities. Taken together, we suggest that the photobiont composition in Thamnolia and Cetraria results from both photobiont-mycobiont codispersal and local acquisition during mycobiont establishment and/or lichen growth. We hypothesize that this is a successful strategy for lichens to be flexible in the use of the most adapted photobiont for the environment.
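
    The comparison of shared OTUs between the two lichens can be sketched as a set operation. This is an illustrative sketch only: the OTU labels are invented, and the Jaccard index is an assumed summary statistic that the abstract does not mention.

```python
def shared_otus(otus_a, otus_b):
    """Return the OTUs found in both samples and their Jaccard similarity."""
    a, b = set(otus_a), set(otus_b)
    shared = a & b
    union = a | b
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard


# Hypothetical OTU labels for one island.
thamnolia = {"OTU1", "OTU2", "OTU3", "OTU4"}
cetraria = {"OTU3", "OTU4", "OTU5"}
shared, jaccard = shared_otus(thamnolia, cetraria)
```

    A high Jaccard value would correspond to the photobiont sharing reported for Iceland, and a low value to the distinct communities reported for Öland.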

  17. Integrated analysis of RNA-binding protein complexes using in vitro selection and high-throughput sequencing and sequence specificity landscapes (SEQRS).

    PubMed

    Lou, Tzu-Fang; Weidmann, Chase A; Killingsworth, Jordan; Tanaka Hall, Traci M; Goldstrohm, Aaron C; Campbell, Zachary T

    2017-04-15

    RNA-binding proteins (RBPs) collaborate to control virtually every aspect of RNA function. Tremendous progress has been made in the area of global assessment of RBP specificity using next-generation sequencing approaches both in vivo and in vitro. Understanding how protein-protein interactions enable precise combinatorial regulation of RNA remains a significant problem. Addressing this challenge requires tools that can quantitatively determine the specificities of both individual proteins and multimeric complexes in an unbiased and comprehensive way. One approach utilizes in vitro selection, high-throughput sequencing, and sequence-specificity landscapes (SEQRS). We outline a SEQRS experiment focused on obtaining the specificity of a multi-protein complex between Drosophila RBPs Pumilio (Pum) and Nanos (Nos). We discuss the necessary controls in this type of experiment and examine how the resulting data can be complemented with structural and cell-based reporter assays. Additionally, SEQRS data can be integrated with functional genomics data to uncover biological function. Finally, we propose extensions of the technique that will enhance our understanding of multi-protein regulatory complexes assembled onto RNA. Copyright © 2016 Elsevier Inc. All rights reserved.

  18. Terminator oligo blocking efficiently eliminates rRNA from Drosophila small RNA sequencing libraries.

    PubMed

    Wickersheim, Michelle L; Blumenstiel, Justin P

    2013-11-01

    A large number of methods are available to deplete ribosomal RNA reads from high-throughput RNA sequencing experiments. Such methods are critical for sequencing Drosophila small RNAs between 20 and 30 nucleotides because size selection is not typically sufficient to exclude the highly abundant class of 30 nucleotide 2S rRNA. Here we demonstrate that pre-annealing terminator oligos complementary to Drosophila 2S rRNA prior to 5' adapter ligation and reverse transcription efficiently depletes 2S rRNA sequences from the sequencing reaction in a simple and inexpensive way. This depletion is highly specific and is achieved with minimal perturbation of miRNA and piRNA profiles.

  19. Electrochemical Corrosion Properties of Commercial Ultra-Thin Copper Foils

    NASA Astrophysics Data System (ADS)

    Yen, Ming-Hsuan; Liu, Jen-Hsiang; Song, Jenn-Ming; Lin, Shih-Ching

    2017-08-01

    Ultra-thin electrodeposited Cu foils have been developed for substrate thinning for mobile devices. Considering the corrosion by residual etchants from the lithography process for high-density circuit wiring, this study investigates the microstructural features of ultra-thin electrodeposited Cu foils with a thickness of 3 μm and their electrochemical corrosion performance in CuCl2-based etching solution. X-ray diffraction and electron backscatter diffraction analyses verify that ultra-thin Cu foils exhibit a random texture and equi-axed grains. Polarization curves show that ultra-thin foils exhibit a higher corrosion potential and a lower corrosion current density compared with conventional (220)-oriented foils with fan-like distributed fine-elongated columnar grains. Chronoamperometric results also suggest that ultra-thin foils possess superior corrosion resistance. The passive layer, mainly composed of CuCl and Cu2O, forms and dissolves in sequence during polarization.

  20. High-resolution phylogenetic microbial community profiling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Singer, Esther; Bushnell, Brian; Coleman-Derr, Devin

    Over the past decade, high-throughput short-read 16S rRNA gene amplicon sequencing has eclipsed clone-dependent long-read Sanger sequencing for microbial community profiling. The transition to new technologies has provided more quantitative information at the expense of taxonomic resolution with implications for inferring metabolic traits in various ecosystems. We applied single-molecule real-time sequencing for microbial community profiling, generating full-length 16S rRNA gene sequences at high throughput, which we propose to name PhyloTags. We benchmarked and validated this approach using a defined microbial community. When further applied to samples from the water column of meromictic Sakinaw Lake, we show that while community structures at the phylum level are comparable between PhyloTags and Illumina V4 16S rRNA gene sequences (iTags), variance increases with community complexity at greater water depths. PhyloTags moreover allowed less ambiguous classification. Last, a platform-independent comparison of PhyloTags and in silico generated partial 16S rRNA gene sequences demonstrated significant differences in community structure and phylogenetic resolution across multiple taxonomic levels, including a severe underestimation in the abundance of specific microbial genera involved in nitrogen and methane cycling across the Lake's water column. Thus, PhyloTags provide a reliable adjunct or alternative to cost-effective iTags, enabling more accurate phylogenetic resolution of microbial communities and predictions on their metabolic potential.

  1. Squeezing water from a stone: high-throughput sequencing from a 145-year old holotype resolves (barely) a cryptic species problem in flying lizards.

    PubMed

    McGuire, Jimmy A; Cotoras, Darko D; O'Connell, Brendan; Lawalata, Shobi Z S; Wang-Claypool, Cynthia Y; Stubbs, Alexander; Huang, Xiaoting; Wogan, Guinevere O U; Hykin, Sarah M; Reilly, Sean B; Bi, Ke; Riyanto, Awal; Arida, Evy; Smith, Lydia L; Milne, Heather; Streicher, Jeffrey W; Iskandar, Djoko T

    2018-01-01

    We used Massively Parallel High-Throughput Sequencing to obtain genetic data from a 145-year old holotype specimen of the flying lizard, Draco cristatellus. Obtaining genetic data from this holotype was necessary to resolve an otherwise intractable taxonomic problem involving the status of this species relative to closely related sympatric Draco species that cannot otherwise be distinguished from one another on the basis of museum specimens. Initial analyses suggested that the DNA present in the holotype sample was so degraded as to be unusable for sequencing. However, we used a specialized extraction procedure developed for highly degraded ancient DNA samples and MiSeq shotgun sequencing to obtain just enough low-coverage mitochondrial DNA (721 base pairs) to conclusively resolve the species status of the holotype as well as a second known specimen of this species. The holotype was prepared before the advent of formalin-fixation and therefore was most likely originally fixed with ethanol and never exposed to formalin. Whereas conventional wisdom suggests that formalin-fixed samples should be the most challenging for DNA sequencing, we propose that evaporation during long-term alcohol storage and consequent water-exposure may subject older ethanol-fixed museum specimens to hydrolytic damage. If so, this may pose an even greater challenge for sequencing efforts involving historical samples.

  2. High-resolution phylogenetic microbial community profiling

    DOE PAGES

    Singer, Esther; Bushnell, Brian; Coleman-Derr, Devin; ...

    2016-02-09

    Over the past decade, high-throughput short-read 16S rRNA gene amplicon sequencing has eclipsed clone-dependent long-read Sanger sequencing for microbial community profiling. The transition to new technologies has provided more quantitative information at the expense of taxonomic resolution with implications for inferring metabolic traits in various ecosystems. We applied single-molecule real-time sequencing for microbial community profiling, generating full-length 16S rRNA gene sequences at high throughput, which we propose to name PhyloTags. We benchmarked and validated this approach using a defined microbial community. When further applied to samples from the water column of meromictic Sakinaw Lake, we show that while community structures at the phylum level are comparable between PhyloTags and Illumina V4 16S rRNA gene sequences (iTags), variance increases with community complexity at greater water depths. PhyloTags moreover allowed less ambiguous classification. Last, a platform-independent comparison of PhyloTags and in silico generated partial 16S rRNA gene sequences demonstrated significant differences in community structure and phylogenetic resolution across multiple taxonomic levels, including a severe underestimation in the abundance of specific microbial genera involved in nitrogen and methane cycling across the Lake's water column. Thus, PhyloTags provide a reliable adjunct or alternative to cost-effective iTags, enabling more accurate phylogenetic resolution of microbial communities and predictions on their metabolic potential.

  3. Unmanned aerial vehicles for high-throughput phenotyping and agronomic research

    USDA-ARS?s Scientific Manuscript database

    Advances in automation and data science have led agriculturists to seek real-time, high-quality, high-volume crop data to accelerate crop improvement through breeding and to optimize agronomic practices. Breeders have recently gained massive data-collection capability in genome sequencing of plants....

  4. Hierarchical assembly of viral nanotemplates with encoded microparticles via nucleic acid hybridization.

    PubMed

    Tan, Wui Siew; Lewis, Christina L; Horelik, Nicholas E; Pregibon, Daniel C; Doyle, Patrick S; Yi, Hyunmin

    2008-11-04

    We demonstrate hierarchical assembly of tobacco mosaic virus (TMV)-based nanotemplates with hydrogel-based encoded microparticles via nucleic acid hybridization. TMV nanotemplates possess a highly defined structure and a genetically engineered high density thiol functionality. The encoded microparticles are produced in a high throughput microfluidic device via stop-flow lithography (SFL) and consist of spatially discrete regions containing encoded identity information, an internal control, and capture DNAs. For the hybridization-based assembly, partially disassembled TMVs were programmed with linker DNAs that contain sequences complementary to both the virus 5' end and a selected capture DNA. Fluorescence microscopy, atomic force microscopy (AFM), and confocal microscopy results clearly indicate facile assembly of TMV nanotemplates onto microparticles with high spatial and sequence selectivity. We anticipate that our hybridization-based assembly strategy could be employed to create multifunctional viral-synthetic hybrid materials in a rapid and high-throughput manner. Additionally, we believe that these viral-synthetic hybrid microparticles may find broad applications in high capacity, multiplexed target sensing.

  5. High-Throughput Sequencing Reveals Principles of Adeno-Associated Virus Serotype 2 Integration

    PubMed Central

    Janovitz, Tyler; Klein, Isaac A.; Oliveira, Thiago; Mukherjee, Piali; Nussenzweig, Michel C.; Sadelain, Michel

    2013-01-01

    Viral integrations are important in human biology, yet genome-wide integration profiles have not been determined for many viruses. Adeno-associated virus (AAV) infects most of the human population and is a prevalent gene therapy vector. AAV integrates into the human genome with preference for a single locus, termed AAVS1. However, the genome-wide integration of AAV has not been defined, and the principles underlying this recombination remain unclear. Using a novel high-throughput approach, integrant capture sequencing, nearly 12 million AAV junctions were recovered from a human cell line, providing five orders of magnitude more data than were previously available. Forty-five percent of integrations occurred near AAVS1, and several thousand novel integration hotspots were identified computationally. Most of these occurred in genes, with dozens of hotspots targeting known oncogenes. Viral replication protein binding sites (RBS) and transcriptional activity were major factors favoring integration. In a first for eukaryotic viruses, the data reveal a unique asymmetric integration profile with distinctive directional orientation of viral genomes. These studies provide a new understanding of AAV integration biology through the use of unbiased high-throughput data acquisition and bioinformatics. PMID:23720718

  6. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays

    PubMed Central

    Aryee, Martin J.; Jaffe, Andrew E.; Corrada-Bravo, Hector; Ladd-Acosta, Christine; Feinberg, Andrew P.; Hansen, Kasper D.; Irizarry, Rafael A.

    2014-01-01

    Motivation: The recently released Infinium HumanMethylation450 array (the ‘450k’ array) provides a high-throughput assay to quantify DNA methylation (DNAm) at ∼450 000 loci across a range of genomic features. Although less comprehensive than high-throughput sequencing-based techniques, this product is more cost-effective and promises to be the most widely used DNAm high-throughput measurement technology over the next several years. Results: Here we describe a suite of computational tools that incorporate state-of-the-art statistical techniques for the analysis of DNAm data. The software is structured to easily adapt to future versions of the technology. We include methods for preprocessing, quality assessment and detection of differentially methylated regions from the kilobase to the megabase scale. We show how our software provides a powerful and flexible development platform for future methods. We also illustrate how our methods empower the technology to make discoveries previously thought to be possible only with sequencing-based methods. Availability and implementation: http://bioconductor.org/packages/release/bioc/html/minfi.html. Contact: khansen@jhsph.edu; rafa@jimmy.harvard.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24478339

  7. Microbial forensics: fiber optic microarray subtyping of Bacillus anthracis

    NASA Astrophysics Data System (ADS)

    Shepard, Jason R. E.

    2009-05-01

    The past decade has seen increased development and subsequent adoption of rapid molecular techniques involving DNA analysis for detection of pathogenic microorganisms, also termed microbial forensics. The continued accumulation of microbial sequence information in genomic databases now better positions the field of high-throughput DNA analysis to proceed in a more manageable fashion. The potential to build off of these databases exists as technology continues to develop, which will enable more rapid, cost effective analyses. This wealth of genetic information, along with new technologies, has the potential to better address some of the current problems and solve the key issues involved in DNA analysis of pathogenic microorganisms. To this end, a high density fiber optic microarray has been employed, housing numerous DNA sequences simultaneously for detection of various pathogenic microorganisms, including Bacillus anthracis, among others. Each organism is analyzed with multiple sequences and can be sub-typed against other closely related organisms. For public health labs, real-time PCR methods have been developed as an initial preliminary screen, but culture and growth are still considered the gold standard. Technologies employing higher throughput than these standard methods are better suited to capitalize on the limitless potential garnered from the sequence information. Microarray analyses are one such format positioned to exploit this potential, and our array platform is reusable, allowing repetitive tests on a single array, providing an increase in throughput and decrease in cost, along with a certainty of detection, down to the individual strain level.

  8. UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences.

    PubMed

    Du, Pu-Feng; Zhao, Wei; Miao, Yang-Yang; Wei, Le-Yi; Wang, Likun

    2017-11-14

    With the avalanche of biological sequences in public databases, one of the most challenging problems in computational biology is to predict their biological functions and cellular attributes. Most of the existing prediction algorithms can only handle fixed-length numerical vectors. Therefore, it is important to be able to represent biological sequences with various lengths using fixed-length numerical vectors. Although several algorithms, as well as software implementations, have been developed to address this problem, these existing programs can only provide a fixed number of representation modes. Every time a new sequence representation mode is developed, a new program will be needed. In this paper, we propose the UltraPse as a universal software platform for this problem. The function of the UltraPse is not only to generate various existing sequence representation modes, but also to simplify all future programming works in developing novel representation modes. The extensibility of UltraPse is particularly enhanced. It allows the users to define their own representation mode, their own physicochemical properties, or even their own types of biological sequences. Moreover, UltraPse is also the fastest software of its kind. The source code package, as well as the executables for both Linux and Windows platforms, can be downloaded from the GitHub repository.
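
    The idea of mapping sequences of different lengths to fixed-length numerical vectors can be illustrated with amino acid composition, one of the simplest representation modes. This is a minimal sketch and does not use or reproduce the UltraPse interface; the function name and the example peptides are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(sequence):
    """Return a 20-element vector of residue frequencies.

    Non-standard characters are ignored, so the vector length is
    always 20 regardless of the sequence length.
    """
    sequence = sequence.upper()
    counts = [sequence.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


short_vec = composition("MKV")             # 3 residues
long_vec = composition("MKVLAAGIVGLLLA")   # 14 residues
```

    Both vectors have the same length, so they can be passed to prediction algorithms that accept only fixed-length input.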

  9. A Comprehensive Analysis of In Vitro and In Vivo Genetic Fitness of Pseudomonas aeruginosa Using High-Throughput Sequencing of Transposon Libraries

    PubMed Central

    Aschard, Hugues; Cattoir, Vincent; Yoder-Himes, Deborah; Lory, Stephen; Pier, Gerald B.

    2013-01-01

    High-throughput sequencing of transposon (Tn) libraries created within entire genomes identifies and quantifies the contribution of individual genes and operons to the fitness of organisms in different environments. We used insertion-sequencing (INSeq) to analyze the contribution to fitness of all non-essential genes in the chromosome of Pseudomonas aeruginosa strain PA14 based on a library of ∼300,000 individual Tn insertions. In vitro growth in LB provided a baseline for comparison with the survival of the Tn insertion strains following 6 days of colonization of the murine gastrointestinal tract as well as a comparison with Tn-inserts subsequently able to systemically disseminate to the spleen following induction of neutropenia. Sequencing was performed following DNA extraction from the recovered bacteria, digestion with the MmeI restriction enzyme that hydrolyzes DNA 16 bp away from the end of the Tn insert, and fractionation into oligonucleotides of 1,200–1,500 bp that were prepared for high-throughput sequencing. Changes in frequency of Tn inserts into the P. aeruginosa genome were used to quantify in vivo fitness resulting from loss of a gene. 636 genes had <10 sequencing reads in LB, thus defined as unable to grow in this medium. During in vivo infection there were major losses of strains with Tn inserts in almost all known virulence factors, as well as respiration, energy utilization, ion pumps, nutritional genes and prophages. Many new candidates for virulence factors were also identified. There were consistent changes in the recovery of Tn inserts in genes within most operons and Tn insertions into some genes enhanced in vivo fitness. Strikingly, 90% of the non-essential genes were required for in vivo survival following systemic dissemination during neutropenia. These experiments resulted in the identification of the P. aeruginosa strain PA14 genes necessary for optimal survival in the mucosal and systemic environments of a mammalian host. 
    PMID:24039572
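
    The read-count logic in this abstract can be sketched as follows. The threshold of 10 reads in LB comes from the abstract. The log2 ratio of normalized frequencies, the pseudocount, and the gene names are assumptions made for illustration and are not the authors' published analysis.

```python
import math

MIN_LB_READS = 10  # genes below this in LB are treated as unable to grow


def fitness_table(lb_counts, vivo_counts, pseudocount=1):
    """Map each gene to log2(in vivo frequency / LB frequency).

    Genes with fewer than MIN_LB_READS reads in LB get None because
    they have no usable baseline.
    """
    lb_total = sum(lb_counts.values())
    vivo_total = sum(vivo_counts.values())
    table = {}
    for gene, lb in lb_counts.items():
        if lb < MIN_LB_READS:
            table[gene] = None
            continue
        vivo = vivo_counts.get(gene, 0)
        lb_freq = (lb + pseudocount) / lb_total
        vivo_freq = (vivo + pseudocount) / vivo_total
        table[gene] = math.log2(vivo_freq / lb_freq)
    return table


# Hypothetical read counts per gene.
lb = {"geneA": 499, "geneB": 499, "geneC": 2}
vivo = {"geneA": 999, "geneB": 0}
table = fitness_table(lb, vivo)
```

    Negative values indicate insertions that were lost in vivo, and positive values indicate insertions that enhanced in vivo fitness.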

  10. A data set from flash X-ray imaging of carboxysomes

    NASA Astrophysics Data System (ADS)

    Hantke, Max F.; Hasse, Dirk; Ekeberg, Tomas; John, Katja; Svenda, Martin; Loh, Duane; Martin, Andrew V.; Timneanu, Nicusor; Larsson, Daniel S. D.; van der Schot, Gijs; Carlsson, Gunilla H.; Ingelman, Margareta; Andreasson, Jakob; Westphal, Daniel; Iwan, Bianca; Uetrecht, Charlotte; Bielecki, Johan; Liang, Mengning; Stellato, Francesco; Deponte, Daniel P.; Bari, Sadia; Hartmann, Robert; Kimmel, Nils; Kirian, Richard A.; Seibert, M. Marvin; Mühlig, Kerstin; Schorb, Sebastian; Ferguson, Ken; Bostedt, Christoph; Carron, Sebastian; Bozek, John D.; Rolles, Daniel; Rudenko, Artem; Foucar, Lutz; Epp, Sascha W.; Chapman, Henry N.; Barty, Anton; Andersson, Inger; Hajdu, Janos; Maia, Filipe R. N. C.

    2016-08-01

    Ultra-intense femtosecond X-ray pulses from X-ray lasers permit structural studies on single particles and biomolecules without crystals. We present a large data set on inherently heterogeneous, polyhedral carboxysome particles. Carboxysomes are cell organelles that vary in size and facilitate up to 40% of Earth’s carbon fixation by cyanobacteria and certain proteobacteria. Variation in size hinders crystallization. Carboxysomes appear icosahedral in the electron microscope. A protein shell encapsulates a large number of Rubisco molecules in paracrystalline arrays inside the organelle. We used carboxysomes with a mean diameter of 115±26 nm from Halothiobacillus neapolitanus. A new aerosol sample-injector allowed us to record 70,000 low-noise diffraction patterns in 12 min. Every diffraction pattern is a unique structure measurement and high-throughput imaging allows sampling the space of structural variability. The different structures can be separated and phased directly from the diffraction data and open a way for accurate, high-throughput studies on structures and structural heterogeneity in biology and elsewhere.

  11. High throughput CIGS solar cell fabrication via ultra-thin absorber layer with optical confinement and (Cd, CBD)-free heterojunction partner

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Marsillac, Sylvain

    2015-11-30

    The main objective of this proposal was to use several pathways to reduce the production cost of Cu(In,Ga)Se2 (CIGS) PV modules and therefore the levelized cost of energy (LCOE) associated with this technology. Three high cost drivers were identified, namely: 1) materials cost and availability; 2) large-scale uniformity; 3) improved throughput. These three cost drivers were targeted using the following pathways: 1) reducing the thickness of the CIGS layer while enhancing materials quality; 2) developing and applying enhanced in-situ metrology via real-time spectroscopic ellipsometry; 3) looking into alternative heterojunction partner, back contact and anti-reflection (AR) coating. Eleven main Tasks were then defined to achieve these goals (5 in Phase 1 and 6 in Phase 2), with 11 Milestones and 2 Go/No-go decision points at the end of Phase 1. The key results are summarized below.

  12. Single nucleotide polymorphism discovery in rainbow trout by deep sequencing of a reduced representation library.

    PubMed

    Sánchez, Cecilia Castaño; Smith, Timothy P L; Wiedmann, Ralph T; Vallejo, Roger L; Salem, Mohamed; Yao, Jianbo; Rexroad, Caird E

    2009-11-25

    To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time consuming methodology of SNP identification in rainbow trout, therefore not suitable for high throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population. The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6 fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses were genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. 
    In addition, 2% of the sequences from the validated markers were associated with rainbow trout transcripts. The use of reduced representation libraries and pyrosequencing technology proved to be an effective strategy for the discovery of a high number of putative SNPs in rainbow trout; however, modifications to the technique to decrease the false discovery rate resulting from the evolutionarily recent genome duplication would be desirable.
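
    The figures quoted in this abstract can be checked with a short calculation. The sketch assumes that each read samples one fragment end, which is a simplification; all numbers are taken from the abstract.

```python
# Coverage: reads divided by the number of unique fragment ends.
reads = 2_000_000
fragment_ends = 300_000
coverage = reads / fragment_ends

# Validation: putative SNPs confirmed by genotyping individual fish.
genotyped = 384
validated = 183
mapped = 167
validation_rate = validated / genotyped
mapped_fraction = mapped / validated

print(f"coverage per fragment end: {coverage:.2f}x")
print(f"validation rate: {validation_rate:.1%}")
print(f"validated SNPs placed on the linkage map: {mapped_fraction:.1%}")
```

    The simple ratio gives a coverage slightly above the roughly 6-fold average quoted, and 183 of 384 reproduces the reported validation rate of approximately 48%.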

  13. High throughput sequencing identifies chilling responsive genes in sweetpotato (Ipomoea batatas Lam.) during storage.

    PubMed

    Xie, Zeyi; Zhou, Zhilin; Li, Hongmin; Yu, Jingjing; Jiang, Jiaojiao; Tang, Zhonghou; Ma, Daifu; Zhang, Baohong; Han, Yonghua; Li, Zongyun

    2018-05-21

    Sweetpotato (Ipomoea batatas L.) is a globally important economic food crop. It belongs to the Convolvulaceae family and originated in the tropics; however, sweetpotato is sensitive to cold stress during storage. In this study, we performed transcriptome sequencing to investigate the sweetpotato response to chilling stress during storage. A total of 110,110 unigenes were generated via high-throughput sequencing. Differentially expressed genes (DEGs) analysis showed that 18,681 genes were up-regulated and 21,983 genes were down-regulated under low-temperature conditions. Many DEGs were related to the cell membrane system, antioxidant enzymes, carbohydrate metabolism, and hormone metabolism, which are potentially associated with sweetpotato resistance to low temperature. The existence of DEGs suggests a molecular basis for the biochemical and physiological consequences of sweetpotato in low temperature storage conditions. Our analysis will provide a new target for enhancement of sweetpotato cold stress tolerance in postharvest storage through genetic manipulation. Copyright © 2018. Published by Elsevier Inc.

  14. ImmuneDB: a system for the analysis and exploration of high-throughput adaptive immune receptor sequencing data.

    PubMed

    Rosenfeld, Aaron M; Meng, Wenzhao; Luning Prak, Eline T; Hershberg, Uri

    2017-01-15

    As high-throughput sequencing of B cells becomes more common, the need for tools to analyze the large quantity of data also increases. This article introduces ImmuneDB, a system for analyzing vast amounts of heavy chain variable region sequences and exploring the resulting data. It can take as input raw FASTA/FASTQ data, identify genes, determine clones, construct lineages, as well as provide information such as selection pressure and mutation analysis. It uses an industry-leading database, MySQL, to provide fast analysis and avoid the complexities of using error-prone flat files. ImmuneDB is freely available at http://immunedb.com. A demo of the ImmuneDB web interface is available at http://immunedb.com/demo. Contact: Uh25@drexel.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  15. Surveying the repair of ancient DNA from bones via high-throughput sequencing.

    PubMed

    Mouttham, Nathalie; Klunk, Jennifer; Kuch, Melanie; Fourney, Ron; Poinar, Hendrik

    2015-07-01

    DNA damage in the form of abasic sites, chemically altered nucleotides, and strand fragmentation is the foremost limitation in obtaining genetic information from many ancient samples. Upon cell death, DNA continues to endure various chemical attacks such as hydrolysis and oxidation, but repair pathways found in vivo no longer operate. By incubating degraded DNA with specific enzyme combinations adopted from these pathways, it is possible to reverse some of the post-mortem nucleic acid damage prior to downstream analyses such as library preparation, targeted enrichment, and high-throughput sequencing. Here, we evaluate the performance of two available repair protocols on previously characterized DNA extracts from four mammoths. Both methods use endonucleases and glycosylases along with a DNA polymerase-ligase combination. PreCR Repair Mix increases the number of molecules converted to sequencing libraries, leading to an increase in endogenous content and a decrease in cytosine-to-thymine transitions due to cytosine deamination. However, the effects of Nelson Repair Mix on repair of DNA damage remain inconclusive.

  16. Genome Sequence of the Novel Marine Member of the Gammaproteobacteria Strain HTCC5015

    PubMed Central

    Thrash, J. Cameron; Stingl, Ulrich; Cho, Jang-Cheon; Ferriera, Steve; Johnson, Justin; Vergin, Kevin L.; Giovannoni, Stephen J.

    2010-01-01

    HTCC5015 is a novel, highly divergent marine member of the Gammaproteobacteria, currently without a cultured representative with greater than 89% 16S rRNA gene identity to itself. The organism was isolated from water collected from Hydrostation S south of Bermuda using high-throughput dilution-to-extinction culturing techniques. Here we present the genome sequence of the unique Gammaproteobacterium strain HTCC5015. PMID:20472792

  17. The largest subunit of RNA polymerase II as a new marker gene to study assemblages of arbuscular mycorrhizal fungi in the field.

    PubMed

    Stockinger, Herbert; Peyret-Guzzon, Marine; Koegel, Sally; Bouffaud, Marie-Lara; Redecker, Dirk

    2014-01-01

    Due to the potential of arbuscular mycorrhizal fungi (AMF, Glomeromycota) to improve plant growth and soil quality, the influence of agricultural practice on their diversity continues to be an important research question. Up to now studies of community diversity in AMF have exclusively been based on nuclear ribosomal gene regions, which in AMF show high intra-organism polymorphism, seriously complicating interpretation of these data. We designed specific PCR primers for 454 sequencing of a region of the largest subunit of RNA polymerase II gene, and established a new reference dataset comprising all major AMF lineages. This gene is known to be monomorphic within fungal isolates but shows an excellent barcode gap between species. We designed a primer set to amplify all known lineages of AMF and demonstrated its applicability in combination with high-throughput sequencing in a long-term tillage experiment. The PCR primers showed a specificity of 99.94% for glomeromycotan sequences. We found evidence of significant shifts of the AMF communities caused by soil management and showed that tillage effects on different AMF taxa are clearly more complex than previously thought. The high resolving power of high-throughput sequencing highlights the need for quantitative measurements to efficiently detect these effects.

  18. High-throughput sequencing-based analysis of endogenetic fungal communities inhabiting the Chinese Cordyceps reveals unexpectedly high fungal diversity

    PubMed Central

    Xia, Fei; Chen, Xin; Guo, Meng-Yuan; Bai, Xiao-Hui; Liu, Yan; Shen, Guang-Rong; Li, Yu-Ling; Lin, Juan; Zhou, Xuan-Wei

    2016-01-01

    Chinese Cordyceps, known in Chinese as “DongChong XiaCao”, is a parasitic complex of a fungus (Ophiocordyceps sinensis) and a caterpillar. The current study explored the endogenetic fungal communities inhabiting Chinese Cordyceps. Samples were collected from five different geographical regions of Qinghai and Tibet, and the nuclear ribosomal internal transcribed spacer-1 sequences from each sample were obtained using Illumina high-throughput sequencing. The results showed that Ascomycota was the dominant fungal phylum in Chinese Cordyceps and its soil microhabitat from different sampling regions. Among the Ascomycota, 65 genera were identified, and the abundant operational taxonomic units showed the strongest sequence similarity to Ophiocordyceps, Verticillium, Pseudallescheria, Candida and Ilyonectria. Not surprisingly, the genus Ophiocordyceps was the largest among the fungal communities identified in the fruiting bodies and external mycelial cortices of Chinese Cordyceps. In addition, fungal communities in the soil microhabitats were clustered separately from the external mycelial cortices and fruiting bodies of Chinese Cordyceps from different sampling regions. There was no significant structural difference in the fungal communities between the fruiting bodies and external mycelial cortices of Chinese Cordyceps. This study revealed an unexpectedly high diversity of fungal communities inhabiting the Chinese Cordyceps and its microhabitats. PMID:27625176

  19. Alignment-free design of highly discriminatory diagnostic primer sets for Escherichia coli O104:H4 outbreak strains.

    PubMed

    Pritchard, Leighton; Holden, Nicola J; Bielaszewska, Martina; Karch, Helge; Toth, Ian K

    2012-01-01

    An Escherichia coli O104:H4 outbreak in Germany in summer 2011 caused 53 deaths, over 4000 individual infections across Europe, and considerable economic, social and political impact. This outbreak was the first in a position to exploit rapid, benchtop high-throughput sequencing (HTS) technologies and crowdsourced data analysis early in its investigation, establishing a new paradigm for rapid response to disease threats. We describe a novel strategy for design of diagnostic PCR primers that exploited this rapid draft bacterial genome sequencing to distinguish between E. coli O104:H4 outbreak isolates and other pathogenic E. coli isolates, including the historical hæmolytic uræmic syndrome (HUSEC) E. coli HUSEC041 O104:H4 strain, which possesses the same serotype as the outbreak isolates. Primers were designed using a novel alignment-free strategy against eleven draft whole genome assemblies of E. coli O104:H4 German outbreak isolates from the E. coli O104:H4 Genome Analysis Crowd-Sourcing Consortium website, and a negative sequence set containing 69 E. coli chromosome and plasmid sequences from public databases. Validation in vitro against 21 'positive' E. coli O104:H4 outbreak and 32 'negative' non-outbreak EHEC isolates indicated that individual primer sets exhibited 100% sensitivity for outbreak isolates, with false positive rates of between 9% and 22%. A minimal combination of two primers discriminated between outbreak and non-outbreak E. coli isolates with 100% sensitivity and 100% specificity. Draft genomes of isolates of disease outbreak bacteria enable high throughput primer design and enhanced diagnostic performance in comparison to traditional molecular assays. Future outbreak investigations will be able to harness HTS rapidly to generate draft genome sequences and diagnostic primer sets, greatly facilitating epidemiology and clinical diagnostics. 
We expect that high throughput primer design strategies will enable faster, more precise responses to future disease outbreaks of bacterial origin, and help to mitigate their societal impact.
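
    The sensitivity and specificity figures in this abstract follow from simple confusion-matrix counts. The sketch below uses the panel sizes from the abstract (21 outbreak and 32 non-outbreak isolates), but the per-isolate call vectors are invented for illustration and are not the study's data. The rule of calling an isolate positive only when both primer sets amplify is also our assumption about how two primers could be combined.

```python
def sensitivity(calls, truth):
    """Fraction of true positives that were called positive."""
    true_pos = sum(c and t for c, t in zip(calls, truth))
    return true_pos / sum(truth)


def specificity(calls, truth):
    """Fraction of true negatives that were called negative."""
    true_neg = sum((not c) and (not t) for c, t in zip(calls, truth))
    return true_neg / (len(truth) - sum(truth))


# Panel sizes from the abstract: 21 outbreak, 32 non-outbreak isolates.
truth = [True] * 21 + [False] * 32

# Hypothetical calls: both primer sets detect every outbreak isolate,
# but cross-react with different non-outbreak isolates.
primer_a = [True] * 21 + [i < 3 for i in range(32)]
primer_b = [True] * 21 + [10 <= i < 17 for i in range(32)]

# Assumed combination rule: positive only if both primer sets amplify.
combined = [a and b for a, b in zip(primer_a, primer_b)]

for name, calls in [("A", primer_a), ("B", primer_b), ("A and B", combined)]:
    fpr = 1 - specificity(calls, truth)
    print(f"{name}: sensitivity={sensitivity(calls, truth):.0%}, "
          f"false positive rate={fpr:.0%}")
```

    In this toy panel the two primer sets cross-react with disjoint sets of non-outbreak isolates, so requiring both signals removes every false positive while keeping sensitivity at 100%.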

  20. Bimodal imprint chips for peptide screening: integration of high-throughput sequencing by MS and affinity analyses by surface plasmon resonance imaging.

    PubMed

    Wang, Weizhi; Li, Menglin; Wei, Zewen; Wang, Zihua; Bu, Xiangli; Lai, Wenjia; Yang, Shu; Gong, He; Zheng, Hui; Wang, Yuqiao; Liu, Ying; Li, Qin; Fang, Qiaojun; Hu, Zhiyuan

    2014-04-15

    Peptide probes and drugs have widespread applications in disease diagnostics and therapy. The demand for peptide ligands with high affinity and high specificity toward various targets has surged in the biomedical field in recent years. The traditional peptide screening procedure involves selection, sequencing, and characterization steps, and each step is manual and tedious. Herein, we developed a bimodal imprint microarray system to embrace the whole peptide screening process. A silver-sputtered silicon chip fabricated with a microwell array can trap and pattern the candidate peptide beads in a one-well-one-bead manner. Peptides on beads were photocleaved in situ. A portion of the peptide in each well was transferred to a gold-coated chip to print the peptide array for high-throughput affinity analyses by surface plasmon resonance imaging (SPRi), and the peptide left in the silver-sputtered chip was ready for in situ single bead sequencing by matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF-MS). Using the bimodal imprint chip system, affinity peptides toward AHA were efficiently screened out from the 7 × 10^4 peptide library. The method provides a solution for high efficiency peptide screening.

  1. Investigating the viral ecology of global bee communities with high-throughput metagenomics.

    PubMed

    Galbraith, David A; Fuller, Zachary L; Ray, Allyson M; Brockmann, Axel; Frazier, Maryann; Gikungu, Mary W; Martinez, J Francisco Iturralde; Kapheim, Karen M; Kerby, Jeffrey T; Kocher, Sarah D; Losyev, Oleksiy; Muli, Elliud; Patch, Harland M; Rosa, Cristina; Sakamoto, Joyce M; Stanley, Scott; Vaudo, Anthony D; Grozinger, Christina M

    2018-06-11

    Bee viral ecology is a fascinating emerging area of research: viruses exert a range of effects on their hosts, exacerbate impacts of other environmental stressors, and, importantly, are readily shared across multiple bee species in a community. However, our understanding of bee viral communities is limited, as it is primarily derived from studies of North American and European Apis mellifera populations. Here, we examined viruses in populations of A. mellifera and 11 other bee species from 9 countries, across 4 continents and Oceania. We developed a novel pipeline to rapidly and inexpensively screen for bee viruses. This pipeline includes purification of encapsulated RNA/DNA viruses, sequence-independent amplification, high throughput sequencing, integrated assembly of contigs, and filtering to identify contigs specifically corresponding to viral sequences. We identified sequences for (+)ssRNA, (-)ssRNA, dsRNA, and ssDNA viruses. Overall, we found 127 contigs corresponding to novel viruses (i.e. previously not observed in bees), with 27 represented by >0.1% of the reads in a given sample, and 7 contained an RdRp or replicase sequence which could be used for robust phylogenetic analysis. This study provides a sequence-independent pipeline for viral metagenomics analysis, and greatly expands our understanding of the diversity of viruses found in bee communities.
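
    The abundance criterion mentioned in this abstract (contigs represented by more than 0.1% of the reads in a sample) can be expressed as a simple filter. This is a minimal sketch with made-up contig names and read counts; it is not the authors' pipeline, and the function name and data layout are hypothetical.

```python
def abundant_contigs(read_counts, total_reads, min_fraction=0.001):
    """Return contig names whose share of the sample's reads exceeds min_fraction.

    read_counts maps contig name -> reads assigned to it in one sample;
    0.001 corresponds to the 0.1% threshold quoted in the abstract.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return sorted(
        name for name, n in read_counts.items()
        if n / total_reads > min_fraction
    )


# Invented example sample, for illustration only.
sample_counts = {"contig_a": 5_000, "contig_b": 900, "contig_c": 12}
kept = abundant_contigs(sample_counts, total_reads=1_000_000)
print(kept)
```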

  2. Accuracy of the high-throughput amplicon sequencing to identify species within the genus Aspergillus.

    PubMed

    Lee, Seungeun; Yamamoto, Naomichi

    2015-12-01

    This study characterized the accuracy of high-throughput amplicon sequencing to identify species within the genus Aspergillus. To this end, we sequenced the internal transcribed spacer 1 (ITS1), β-tubulin (BenA), and calmodulin (CaM) gene encoding sequences as DNA markers from eight reference Aspergillus strains with known identities using 300-bp sequencing on the Illumina MiSeq platform, and compared them with the BLASTn outputs. The identifications with the sequences longer than 250 bp were accurate at the section rank, with some ambiguities observed at the species rank due to mostly cross detection of sibling species. Additionally, in silico analysis was performed to predict the identification accuracy for all species in the genus Aspergillus, where 107, 210, and 187 species were predicted to be identifiable down to the species rank based on ITS1, BenA, and CaM, respectively. Finally, air filter samples were analysed to quantify the relative abundances of Aspergillus species in outdoor air. The results were reproducible across biological duplicates both at the species and section ranks, but not strongly correlated between ITS1 and BenA, suggesting the Aspergillus detection can be taxonomically biased depending on the selection of the DNA markers and/or primers. Copyright © 2015 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.

  3. BlackOPs: increasing confidence in variant detection through mappability filtering.

    PubMed

    Cabanski, Christopher R; Wilkerson, Matthew D; Soloway, Matthew; Parker, Joel S; Liu, Jinze; Prins, Jan F; Marron, J S; Perou, Charles M; Hayes, D Neil

    2013-10-01

    Identifying variants using high-throughput sequencing data is currently a challenge because true biological variants can be indistinguishable from technical artifacts. One source of technical artifact results from incorrectly aligning experimentally observed sequences to their true genomic origin ('mismapping') and inferring differences in mismapped sequences to be true variants. We developed BlackOPs, an open-source tool that simulates experimental RNA-seq and DNA whole exome sequences derived from the reference genome, aligns these sequences by custom parameters, detects variants and outputs a blacklist of positions and alleles caused by mismapping. Blacklists contain thousands of artifact variants that are indistinguishable from true variants and, for a given sample, are expected to be almost completely false positives. We show that these blacklist positions are specific to the alignment algorithm and read length used, and BlackOPs allows users to generate a blacklist specific to their experimental setup. We queried the dbSNP and COSMIC variant databases and found numerous variants indistinguishable from mapping errors. We demonstrate how filtering against blacklist positions reduces the number of potential false variants using an RNA-seq glioblastoma cell line data set. In summary, accounting for mapping-caused variants tuned to experimental setups reduces false positives and, therefore, improves genome characterization by high-throughput sequencing.
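
    The filtering step described in this abstract amounts to removing called variants whose position and allele appear in a blacklist. The sketch below illustrates that idea only; the record layout, names, and example values are hypothetical and do not reflect BlackOPs' actual input or output formats.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Variant(NamedTuple):
    chrom: str
    pos: int   # 1-based position
    alt: str   # alternate allele


def filter_blacklisted(
    variants: Iterable[Variant], blacklist: Set[Variant]
) -> Tuple[List[Variant], List[Variant]]:
    """Split variants into (kept, removed) by exact chrom/pos/allele match."""
    kept, removed = [], []
    for v in variants:
        (removed if v in blacklist else kept).append(v)
    return kept, removed


# Invented example data: one call coincides with a blacklisted artifact.
blacklist = {Variant("chr1", 12345, "T"), Variant("chr7", 555, "G")}
calls = [
    Variant("chr1", 12345, "T"),   # same position and allele: removed
    Variant("chr1", 12345, "C"),   # same position, different allele: kept
    Variant("chr2", 999, "A"),     # not blacklisted: kept
]

kept, removed = filter_blacklisted(calls, blacklist)
print(f"kept {len(kept)} of {len(calls)} calls")
```

    Matching on the allele as well as the position follows the abstract's description of blacklists as lists of "positions and alleles"; a stricter position-only filter would also be possible.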

  4. High throughput sequencing analysis of RNA libraries reveals the influences of initial library and PCR methods on SELEX efficiency.

    PubMed

    Takahashi, Mayumi; Wu, Xiwei; Ho, Michelle; Chomchan, Pritsana; Rossi, John J; Burnett, John C; Zhou, Jiehua

    2016-09-22

    The systematic evolution of ligands by exponential enrichment (SELEX) technique is a powerful and effective aptamer-selection procedure. However, modifications to the process can dramatically improve selection efficiency and aptamer performance. For example, droplet digital PCR (ddPCR) has been recently incorporated into SELEX selection protocols to putatively reduce the propagation of byproducts and avoid selection bias that results from differences in PCR efficiency of sequences within the random library. However, a detailed, parallel comparison of the efficacy of conventional solution PCR versus the ddPCR modification in the RNA aptamer-selection process is needed to understand effects on overall SELEX performance. In the present study, we took advantage of powerful high throughput sequencing technology and bioinformatics analysis coupled with SELEX (HT-SELEX) to thoroughly investigate the effects of initial library and PCR methods in the RNA aptamer identification. Our analysis revealed that distinct "biased sequences" and nucleotide composition existed in the initial, unselected libraries purchased from two different manufacturers and that the fate of the "biased sequences" was target-dependent during selection. Our comparison of solution PCR- and ddPCR-driven HT-SELEX demonstrated that PCR method affected not only the nucleotide composition of the enriched sequences, but also the overall SELEX efficiency and aptamer efficacy.

  5. Alterations of microbiota in urine from women with interstitial cystitis

    PubMed Central

    2012-01-01

    Background: Interstitial Cystitis (IC) is a chronic inflammatory condition of the bladder with unknown etiology. The aim of this study was to characterize the microbial community present in the urine from IC female patients by 454 high throughput sequencing of the 16S variable regions V1V2 and V6. The taxonomical composition, richness and diversity of the IC microbiota were determined and compared to the microbial profile of asymptomatic healthy female (HF) urine. Results: The composition and distribution of bacterial sequences differed between the urine microbiota of IC patients and HFs. Reduced sequence richness and diversity were found in IC patient urine, and a significant difference in the community structure of IC urine in relation to HF urine was observed. More than 90% of the IC sequence reads were identified as belonging to the bacterial genus Lactobacillus, a marked increase compared to 60% in HF urine. Conclusion: The 16S rDNA sequence data demonstrates a shift in the composition of the bacterial community in IC urine. The reduced microbial diversity and richness is accompanied by a higher abundance of the bacterial genus Lactobacillus, compared to HF urine. This study demonstrates that high throughput sequencing analysis of urine microbiota in IC patients is a powerful tool towards a better understanding of this enigmatic disease. PMID:22974186

  6. Identification and correction of systematic error in high-throughput sequence data

    PubMed Central

    2011-01-01

    Background: A feature common to all DNA sequencing technologies is the presence of base-call errors in the sequenced reads. The implications of such errors are application specific, ranging from minor informatics nuisances to major problems affecting biological inferences. Recently developed "next-gen" sequencing technologies have greatly reduced the cost of sequencing, but have been shown to be more error prone than previous technologies. Both position specific (depending on the location in the read) and sequence specific (depending on the sequence in the read) errors have been identified in Illumina and Life Technology sequencing platforms. We describe a new type of systematic error that manifests as statistically unlikely accumulations of errors at specific genome (or transcriptome) locations. Results: We characterize and describe systematic errors using overlapping paired reads from high-coverage data. We show that such errors occur in approximately 1 in 1000 base pairs, and that they are highly replicable across experiments. We identify motifs that are frequent at systematic error sites, and describe a classifier that distinguishes heterozygous sites from systematic error. Our classifier is designed to accommodate data from experiments in which the allele frequencies at heterozygous sites are not necessarily 0.5 (such as in the case of RNA-Seq), and can be used with single-end datasets. Conclusions: Systematic errors can easily be mistaken for heterozygous sites in individuals, or for SNPs in population analyses. Systematic errors are particularly problematic in low coverage experiments, or in estimates of allele-specific expression from RNA-Seq data. Our characterization of systematic error has allowed us to develop a program, called SysCall, for identifying and correcting such errors. We conclude that correction of systematic errors is important to consider in the design and interpretation of high-throughput sequencing experiments. PMID:22099972

  7. High-throughput full-length single-cell mRNA-seq of rare cells.

    PubMed

    Ooi, Chin Chun; Mantalas, Gary L; Koh, Winston; Neff, Norma F; Fuchigami, Teruaki; Wong, Dawson J; Wilson, Robert J; Park, Seung-Min; Gambhir, Sanjiv S; Quake, Stephen R; Wang, Shan X

    2017-01-01

    Single-cell characterization techniques, such as mRNA-seq, have been applied to a diverse range of applications in cancer biology, yielding great insight into mechanisms leading to therapy resistance and tumor clonality. While single-cell techniques can yield a wealth of information, a common bottleneck is the lack of throughput, with many current processing methods being limited to the analysis of small volumes of single cell suspensions with cell densities on the order of 10^7 per mL. In this work, we present a high-throughput full-length mRNA-seq protocol incorporating a magnetic sifter and magnetic nanoparticle-antibody conjugates for rare cell enrichment, and Smart-seq2 chemistry for sequencing. We evaluate the efficiency and quality of this protocol with a simulated circulating tumor cell system, whereby non-small-cell lung cancer cell lines (NCI-H1650 and NCI-H1975) are spiked into whole blood, before being enriched for single-cell mRNA-seq by EpCAM-functionalized magnetic nanoparticles and the magnetic sifter. We obtain high efficiency (> 90%) capture and release of these simulated rare cells via the magnetic sifter, with reproducible transcriptome data. In addition, while mRNA-seq data is typically only used for gene expression analysis of transcriptomic data, we demonstrate the use of full-length mRNA-seq chemistries like Smart-seq2 to facilitate variant analysis of expressed genes. This enables the use of mRNA-seq data for differentiating cells in a heterogeneous population by both their phenotypic and variant profile. In a simulated heterogeneous mixture of circulating tumor cells in whole blood, we utilize this high-throughput protocol to differentiate these heterogeneous cells by both their phenotype (lung cancer versus white blood cells), and mutational profile (H1650 versus H1975 cells), in a single sequencing run. 
This high-throughput method can help facilitate single-cell analysis of rare cell populations, such as circulating tumor or endothelial cells, with demonstrably high-quality transcriptomic data.

  8. High-throughput sequencing of microbial community diversity in soil, grapes, leaves, grape juice and wine of grapevine from China.

    PubMed

    Wei, Yu-Jie; Wu, Yun; Yan, Yin-Zhuo; Zou, Wan; Xue, Jie; Ma, Wen-Rui; Wang, Wei; Tian, Ge; Wang, Li-Ye

    2018-01-01

    In this study, Illumina MiSeq sequencing was performed to investigate microbial diversity in soil, leaves, grape, grape juice and wine. A total of 1,043,102 fungal Internal Transcribed Spacer (ITS) reads and 2,422,188 high quality bacterial 16S rDNA sequences were used for taxonomic classification, revealing five fungal and eight bacterial phyla. At the genus level, the dominant fungi were Ascomycota, Sordariales, Tetracladium and Geomyces in soil, Aureobasidium and Pleosporaceae in grape leaves, Aureobasidium in grape and grape juice. The dominant bacteria were Kaistobacter, Arthrobacter, Skermanella and Sphingomonas in soil, Pseudomonas, Acinetobacter and Kaistobacter in grape and grape leaves, and Oenococcus in grape juice and wine. Principal coordinate analysis showed structural separation between the composition of fungi and bacteria in all samples. This is the first study to characterize the microbiome in soil, grape, grape leaves, grape juice and wine in Xinjiang through high-throughput sequencing and to identify microorganisms like Saccharomyces cerevisiae and Oenococcus spp. that may contribute to the quality and flavor of wine.

  9. High-throughput sequencing of microbial community diversity in soil, grapes, leaves, grape juice and wine of grapevine from China

    PubMed Central

    Yan, Yin-zhuo; Zou, Wan; Ma, Wen-rui; Wang, Wei; Tian, Ge; Wang, Li-ye

    2018-01-01

    In this study, Illumina MiSeq sequencing was performed to investigate microbial diversity in soil, leaves, grape, grape juice and wine. A total of 1,043,102 fungal Internal Transcribed Spacer (ITS) reads and 2,422,188 high quality bacterial 16S rDNA sequences were used for taxonomic classification, revealing five fungal and eight bacterial phyla. At the genus level, the dominant fungi were Ascomycota, Sordariales, Tetracladium and Geomyces in soil, Aureobasidium and Pleosporaceae in grape leaves, Aureobasidium in grape and grape juice. The dominant bacteria were Kaistobacter, Arthrobacter, Skermanella and Sphingomonas in soil, Pseudomonas, Acinetobacter and Kaistobacter in grape and grape leaves, and Oenococcus in grape juice and wine. Principal coordinate analysis showed structural separation between the composition of fungi and bacteria in all samples. This is the first study to characterize the microbiome in soil, grape, grape leaves, grape juice and wine in Xinjiang through high-throughput sequencing and to identify microorganisms like Saccharomyces cerevisiae and Oenococcus spp. that may contribute to the quality and flavor of wine. PMID:29565999

  10. FusionAnalyser: a new graphical, event-driven tool for fusion rearrangements discovery

    PubMed Central

    Piazza, Rocco; Pirola, Alessandra; Spinelli, Roberta; Valletta, Simona; Redaelli, Sara; Magistroni, Vera; Gambacorti-Passerini, Carlo

    2012-01-01

    Gene fusions are common driver events in leukaemias and solid tumours; here we present FusionAnalyser, a tool dedicated to the identification of driver fusion rearrangements in human cancer through the analysis of paired-end high-throughput transcriptome sequencing data. We initially tested FusionAnalyser by using a set of in silico randomly generated sequencing data from 20 known human translocations occurring in cancer and subsequently using transcriptome data from three chronic and three acute myeloid leukaemia samples. In all cases, our tool was able to detect the presence of the correct driver fusion event(s) with high specificity. In one of the acute myeloid leukaemia samples, FusionAnalyser identified a novel, cryptic, in-frame ETS2–ERG fusion. A fully event-driven graphical interface and a flexible filtering system allow complex analyses to be run in the absence of any a priori programming or scripting knowledge. Therefore, we propose FusionAnalyser as an efficient and robust graphical tool for the identification of functional rearrangements in the context of high-throughput transcriptome sequencing data. PMID:22570408

  11. FusionAnalyser: a new graphical, event-driven tool for fusion rearrangements discovery.

    PubMed

    Piazza, Rocco; Pirola, Alessandra; Spinelli, Roberta; Valletta, Simona; Redaelli, Sara; Magistroni, Vera; Gambacorti-Passerini, Carlo

    2012-09-01

    Gene fusions are common driver events in leukaemias and solid tumours; here we present FusionAnalyser, a tool dedicated to the identification of driver fusion rearrangements in human cancer through the analysis of paired-end high-throughput transcriptome sequencing data. We initially tested FusionAnalyser by using a set of in silico randomly generated sequencing data from 20 known human translocations occurring in cancer and subsequently using transcriptome data from three chronic and three acute myeloid leukaemia samples. In all cases, our tool was able to detect the presence of the correct driver fusion event(s) with high specificity. In one of the acute myeloid leukaemia samples, FusionAnalyser identified a novel, cryptic, in-frame ETS2-ERG fusion. A fully event-driven graphical interface and a flexible filtering system allow complex analyses to be run in the absence of any a priori programming or scripting knowledge. Therefore, we propose FusionAnalyser as an efficient and robust graphical tool for the identification of functional rearrangements in the context of high-throughput transcriptome sequencing data.

  12. Library construction for next-generation sequencing: Overviews and challenges

    PubMed Central

    Head, Steven R.; Komori, H. Kiyomi; LaMere, Sarah A.; Whisenant, Thomas; Van Nieuwerburgh, Filip; Salomon, Daniel R.; Ordoukhanian, Phillip

    2014-01-01

    High-throughput sequencing, also known as next-generation sequencing (NGS), has revolutionized genomic research. In recent years, NGS technology has steadily improved, with costs dropping and the number and range of sequencing applications increasing exponentially. Here, we examine the critical role of sequencing library quality and consider important challenges when preparing NGS libraries from DNA and RNA sources. Factors such as the quantity and physical characteristics of the RNA or DNA source material as well as the desired application (i.e., genome sequencing, targeted sequencing, RNA-seq, ChIP-seq, RIP-seq, and methylation) are addressed in the context of preparing high quality sequencing libraries. In addition, the current methods for preparing NGS libraries from single cells are also discussed. PMID:24502796

  13. High-throughput neuroimaging-genetics computational infrastructure

    PubMed Central

    Dinov, Ivo D.; Petrosyan, Petros; Liu, Zhizhong; Eggert, Paul; Hobel, Sam; Vespa, Paul; Woo Moon, Seok; Van Horn, John D.; Franco, Joseph; Toga, Arthur W.

    2014-01-01

    Many contemporary neuroscientific investigations face significant challenges in terms of data management, computational processing, data mining, and results interpretation. These four pillars define the core infrastructure necessary to plan, organize, orchestrate, validate, and disseminate novel scientific methods, computational resources, and translational healthcare findings. Data management includes protocols for data acquisition, archival, query, transfer, retrieval, and aggregation. Computational processing involves the necessary software, hardware, and networking infrastructure required to handle large amounts of heterogeneous neuroimaging, genetics, clinical, and phenotypic data and meta-data. Data mining refers to the process of automatically extracting data features, characteristics and associations, which are not readily visible by human exploration of the raw dataset. Result interpretation includes scientific visualization, community validation of findings and reproducible findings. In this manuscript we describe the novel high-throughput neuroimaging-genetics computational infrastructure available at the Institute for Neuroimaging and Informatics (INI) and the Laboratory of Neuro Imaging (LONI) at University of Southern California (USC). INI and LONI include ultra-high-field and standard-field MRI brain scanners along with an imaging-genetics database for storing the complete provenance of the raw and derived data and meta-data. In addition, the institute provides a large number of software tools for image and shape analysis, mathematical modeling, genomic sequence processing, and scientific visualization. A unique feature of this architecture is the Pipeline environment, which integrates the data management, processing, transfer, and visualization. 
    Through its client-server architecture, the Pipeline environment provides a graphical user interface for designing, executing, monitoring, validating, and disseminating complex protocols that utilize diverse suites of software tools and web-services. These pipeline workflows are represented as portable XML objects which transfer the execution instructions and user specifications from the client user machine to remote pipeline servers for distributed computing. Using Alzheimer's and Parkinson's data, we provide several examples of translational applications using this infrastructure. PMID:24795619

  14. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences

    PubMed Central

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong

    2015-01-01

    We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate—slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory. PMID:25549288

  15. New approach for the study of mite reproduction: the first transcriptome analysis of a mite, Phytoseiulus persimilis (Acari: Phytoseiidae)

    USDA-ARS's Scientific Manuscript database

    Many species of mites and ticks are of agricultural and medical importance. Much can be learned from the study of transcriptomes of acarines which can generate DNA-sequence information of potential target genes for the control of acarine pests. High throughput transcriptome sequencing can also yie...

  16. Evaluation of a new modified QuEChERS method for the monitoring of carbamate residues in high-fat cheeses by using UHPLC-MS/MS.

    PubMed

    Hamed, Ahmed M; Moreno-González, David; Gámiz-Gracia, Laura; García-Campaña, Ana M

    2017-01-01

    A simple and efficient method for the determination of 28 carbamates in high-fat cheeses is proposed. The methodology is based on a modified quick, easy, cheap, effective, rugged, and safe (QuEChERS) procedure as sample treatment using a new sorbent (Z-Sep+) followed by ultra-high performance liquid chromatography with tandem mass spectrometry determination. The method has been validated in different kinds of cheese (Gorgonzola, Roquefort, and Camembert), achieving recoveries of 70-115%, relative standard deviations lower than 13% and limits of quantification lower than 5.4 μg/kg, below the maximum residue levels tolerated for these compounds by the European legislation. The matrix effect was lower than ±30% for all the studied pesticides. The combination of ultra-high performance liquid chromatography and tandem mass spectrometry with this modified QuEChERS procedure using Z-Sep+ allowed a high sample throughput and an efficient cleaning of extracts for the control of these residues in cheeses with a high fat content. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
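
    The validation figures quoted in this abstract (recovery and relative standard deviation) follow from simple arithmetic on replicate measurements. The sketch below is illustrative only: the spiking level, the replicate values, and the function names are hypothetical and do not come from the paper, and the 70-115% and 13% bounds are simply the ranges reported in the abstract, reused here as assumed acceptance limits.

```python
from statistics import mean, stdev


def recovery_percent(measured, spiked):
    """Recovery (%) = measured concentration / spiked concentration * 100."""
    return 100.0 * measured / spiked


def rsd_percent(values):
    """Relative standard deviation (%) = sample std. dev. / mean * 100."""
    return 100.0 * stdev(values) / mean(values)


def within_limits(mean_recovery, rsd, rec_range=(70.0, 115.0), max_rsd=13.0):
    """Check results against assumed limits (taken from the abstract's ranges)."""
    low, high = rec_range
    return low <= mean_recovery <= high and rsd < max_rsd


# Hypothetical data: five replicates (ug/kg) of a sample spiked at 10 ug/kg.
spiked_level = 10.0
replicates = [9.1, 9.4, 8.8, 9.6, 9.0]

mean_recovery = mean(recovery_percent(m, spiked_level) for m in replicates)
rsd = rsd_percent(replicates)
acceptable = within_limits(mean_recovery, rsd)

print(f"mean recovery = {mean_recovery:.1f}%, RSD = {rsd:.1f}%, ok = {acceptable}")
```

    The sample standard deviation (n - 1 denominator) is used here, which is a common choice for small replicate sets; the paper's exact calculation is not stated in the abstract.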

  17. Analysis of bacterial and fungal communities in Marcha and Thiat, traditionally prepared amylolytic starters of India.

    PubMed

    Sha, Shankar Prasad; Jani, Kunal; Sharma, Avinash; Anupma, Anu; Pradhan, Pooja; Shouche, Yogesh; Tamang, Jyoti Prakash

    2017-09-08

    Marcha and thiat are traditionally prepared amylolytic starters used for the production of various ethnic alcoholic beverages in the Sikkim and Meghalaya states of India. In the present study, we investigated the bacterial and fungal community composition of marcha and thiat using high throughput sequencing. Characterization of the bacterial community showed that the phylum Proteobacteria is the most dominant in both marcha (91.4%) and thiat (53.8%), followed by Firmicutes and Actinobacteria. Estimates of fungal community composition showed Ascomycota as the dominant phylum. The presence of Zygomycota in marcha distinguishes it from thiat. The results of NGS analysis revealed a dominance of yeasts in marcha, whereas molds outnumber yeasts in thiat. This is the first report on microbial communities of traditionally prepared amylolytic starters of India using high throughput sequencing.

  18. The main challenges that remain in applying high-throughput sequencing to clinical diagnostics.

    PubMed

    Loeffelholz, Michael; Fofanov, Yuriy

    2015-01-01

    Over the last 10 years, the quality, price and availability of high-throughput sequencing instruments have improved to the point that this technology may be close to becoming a routine tool in the diagnostic microbiology laboratory. Two groups of challenges, however, have to be resolved in order to move this powerful research technology into routine use in the clinical microbiology laboratory. The computational/bioinformatics challenges include data storage cost and privacy concerns, requiring analysis to be performed without access to cloud storage or expensive computational infrastructure. The logistical challenges include interpretation of complex results and acceptance and understanding of the advantages and limitations of this technology by the medical community. This article focuses on the approaches to address these challenges, such as file formats, algorithms, data collection, reporting and good laboratory practices.

  19. Mapping specificity landscapes of RNA-protein interactions by high throughput sequencing.

    PubMed

    Jankowsky, Eckhard; Harris, Michael E

    2017-04-15

    To function in a biological setting, RNA binding proteins (RBPs) have to discriminate between alternative binding sites in RNAs. This discrimination can occur in the ground state of an RNA-protein binding reaction, in its transition state, or in both. The extent to which RBPs discriminate at these reaction states defines RBP specificity landscapes. Here, we describe the HiTS-Kin and HiTS-EQ techniques, which combine kinetic and equilibrium binding experiments with high throughput sequencing to quantitatively assess substrate discrimination for large numbers of substrate variants at ground and transition states of RNA-protein binding reactions. We discuss experimental design, practical considerations and data analysis and outline how a combination of HiTS-Kin and HiTS-EQ allows the mapping of RBP specificity landscapes. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. The Grapevine and Wine Microbiome: Insights from High-Throughput Amplicon Sequencing

    PubMed Central

    Morgan, Horatio H.; du Toit, Maret; Setati, Mathabatha E.

    2017-01-01

    From the time when microbial activity in wine fermentation was first demonstrated, the microbial ecology of the vineyard, grape, and wine has been extensively investigated using culture-based methods. However, the last 2 decades have been characterized by an important change in the approaches used for microbial examination, due to the introduction of DNA-based community fingerprinting methods such as DGGE, SSCP, T-RFLP, and ARISA. These approaches allowed for the exploration of microbial community structures without the need to cultivate, and have been extensively applied to decipher the microbial populations associated with the grapevine as well as the microbial dynamics throughout grape berry ripening and wine fermentation. These techniques are well-established for the rapid, more sensitive profiling of microbial communities; however, they often do not provide direct taxonomic information and possess limited ability to detect the presence of rare taxa and taxa with low abundance. Consequently, the past 5 years have seen an upsurge in the application of high-throughput sequencing methods for the in-depth assessment of the grapevine and wine microbiome. Although a relatively new approach in wine sciences, these methods reveal a considerably greater diversity than previously reported, and have identified several species that had not yet been reported. The aim of the current review is to highlight the contribution of high-throughput next generation sequencing and metagenomics approaches to vineyard microbial ecology, especially in unraveling the influence of vineyard management practices on microbial diversity. PMID:28553266
