high-throughput sequencing platform: Topics by Science.gov

Sample records for high-throughput sequencing platform

[Current applications of high-throughput DNA sequencing technology in antibody drug research].

PubMed

Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong

2012-03-01

Since the publication of a high-throughput DNA sequencing technology based on PCR reaction was carried out in oil emulsions in 2005, high-throughput DNA sequencing platforms have been evolved to a robust technology in sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation of discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we present a review of current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach of discovery and development of antibody drugs.
Assessment of the cPAS-based BGISEQ-500 platform for metagenomic sequencing.

PubMed

Fang, Chao; Zhong, Huanzi; Lin, Yuxiang; Chen, Bing; Han, Mo; Ren, Huahui; Lu, Haorong; Luber, Jacob M; Xia, Min; Li, Wangsheng; Stein, Shayna; Xu, Xun; Zhang, Wenwei; Drmanac, Radoje; Wang, Jian; Yang, Huanming; Hammarström, Lennart; Kostic, Aleksandar D; Kristiansen, Karsten; Li, Junhua

2018-03-01

More extensive use of metagenomic shotgun sequencing in microbiome research relies on the development of high-throughput, cost-effective sequencing. Here we present a comprehensive evaluation of the performance of the new high-throughput sequencing platform BGISEQ-500 for metagenomic shotgun sequencing and compare its performance with that of 2 Illumina platforms. Using fecal samples from 20 healthy individuals, we evaluated the intra-platform reproducibility for metagenomic sequencing on the BGISEQ-500 platform in a setup comprising 8 library replicates and 8 sequencing replicates. Cross-platform consistency was evaluated by comparing 20 pairwise replicates on the BGISEQ-500 platform vs the Illumina HiSeq 2000 platform and the Illumina HiSeq 4000 platform. In addition, we compared the performance of the 2 Illumina platforms against each other. By a newly developed overall accuracy quality control method, an average of 82.45 million high-quality reads (96.06% of raw reads) per sample, with 90.56% of bases scoring Q30 and above, was obtained using the BGISEQ-500 platform. Quantitative analyses revealed extremely high reproducibility between BGISEQ-500 intra-platform replicates. Cross-platform replicates differed slightly more than intra-platform replicates, yet a high consistency was observed. Only a low percentage (2.02%-3.25%) of genes exhibited significant differences in relative abundance comparing the BGISEQ-500 and HiSeq platforms, with a bias toward genes with higher GC content being enriched on the HiSeq platforms. Our study provides the first set of performance metrics for human gut metagenomic sequencing data using BGISEQ-500. The high accuracy and technical reproducibility confirm the applicability of the new platform for metagenomic studies, though caution is still warranted when combining metagenomic data from different platforms.
Protein Sequence Annotation Tool (PSAT): A centralized web-based meta-server for high-throughput sequence annotations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leung, Elo; Huang, Amy; Cadag, Eithon

In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein fasta data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resultingmore » functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. Lastly, PSAT is a meta server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequencebased genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.« less
Protein Sequence Annotation Tool (PSAT): A centralized web-based meta-server for high-throughput sequence annotations

DOE PAGES

Leung, Elo; Huang, Amy; Cadag, Eithon; ...

2016-01-20

In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein fasta data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resultingmore » functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. Lastly, PSAT is a meta server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequencebased genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.« less
Non-biological synthetic spike-in controls and the AMPtk software pipeline improve mycobiome data

Treesearch

Jonathan M. Palmer; Michelle A. Jusino; Mark T. Banik; Daniel L. Lindner

2018-01-01

High-throughput amplicon sequencing (HTAS) of conserved DNA regions is a powerful technique to characterize microbial communities. Recently, spike-in mock communities have been used to measure accuracy of sequencing platforms and data analysis pipelines. To assess the ability of sequencing platforms and data processing pipelines using fungal internal transcribed spacer...
Targeted Capture and High-Throughput Sequencing Using Molecular Inversion Probes (MIPs).

PubMed

Cantsilieris, Stuart; Stessman, Holly A; Shendure, Jay; Eichler, Evan E

2017-01-01

Molecular inversion probes (MIPs) in combination with massively parallel DNA sequencing represent a versatile, yet economical tool for targeted sequencing of genomic DNA. Several thousand genomic targets can be selectively captured using long oligonucleotides containing unique targeting arms and universal linkers. The ability to append sequencing adaptors and sample-specific barcodes allows large-scale pooling and subsequent high-throughput sequencing at relatively low cost per sample. Here, we describe a "wet bench" protocol detailing the capture and subsequent sequencing of >2000 genomic targets from 192 samples, representative of a single lane on the Illumina HiSeq 2000 platform.
A new fungal large subunit ribosomal RNA primer for high throughput sequencing surveys

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mueller, Rebecca C.; Gallegos-Graves, La Verne; Kuske, Cheryl R.

The inclusion of phylogenetic metrics in community ecology has provided insights into important ecological processes, particularly when combined with high-throughput sequencing methods; however, these approaches have not been widely used in studies of fungal communities relative to other microbial groups. Two obstacles have been considered: (1) the internal transcribed spacer (ITS) region has limited utility for constructing phylogenies and (2) most PCR primers that target the large subunit (LSU) ribosomal unit generate amplicons that exceed current limits of high-throughput sequencing platforms. We designed and tested a PCR primer (LR22R) to target approximately 300–400 bp region of the D2 hypervariable regionmore » of the fungal LSU for use with the Illumina MiSeq platform. Both in silico and empirical analyses showed that the LR22R–LR3 pair captured a broad range of fungal taxonomic groups with a small fraction of non-fungal groups. Phylogenetic placement of publically available LSU D2 sequences showed broad agreement with taxonomic classification. Comparisons of the LSU D2 and the ITS2 ribosomal regions from environmental samples and known communities showed similar discriminatory abilities of the two primer sets. Altogether, these findings show that the LR22R–LR3 primer pair has utility for phylogenetic analyses of fungal communities using high-throughput sequencing methods.« less
A new fungal large subunit ribosomal RNA primer for high throughput sequencing surveys

DOE PAGES

Mueller, Rebecca C.; Gallegos-Graves, La Verne; Kuske, Cheryl R.

2015-12-09

The inclusion of phylogenetic metrics in community ecology has provided insights into important ecological processes, particularly when combined with high-throughput sequencing methods; however, these approaches have not been widely used in studies of fungal communities relative to other microbial groups. Two obstacles have been considered: (1) the internal transcribed spacer (ITS) region has limited utility for constructing phylogenies and (2) most PCR primers that target the large subunit (LSU) ribosomal unit generate amplicons that exceed current limits of high-throughput sequencing platforms. We designed and tested a PCR primer (LR22R) to target approximately 300–400 bp region of the D2 hypervariable regionmore » of the fungal LSU for use with the Illumina MiSeq platform. Both in silico and empirical analyses showed that the LR22R–LR3 pair captured a broad range of fungal taxonomic groups with a small fraction of non-fungal groups. Phylogenetic placement of publically available LSU D2 sequences showed broad agreement with taxonomic classification. Comparisons of the LSU D2 and the ITS2 ribosomal regions from environmental samples and known communities showed similar discriminatory abilities of the two primer sets. Altogether, these findings show that the LR22R–LR3 primer pair has utility for phylogenetic analyses of fungal communities using high-throughput sequencing methods.« less
Development of a high-throughput SNP resource to advance genomic, genetic and breeding research in carrot (Daucus carota L.)

USDA-ARS?s Scientific Manuscript database

The rapid advancement in high-throughput SNP genotyping technologies along with next generation sequencing (NGS) platforms has decreased the cost, improved the quality of large-scale genome surveys, and allowed specialty crops with limited genomic resources such as carrot (Daucus carota) to access t...
Enabling systematic interrogation of protein-protein interactions in live cells with a versatile ultra-high-throughput biosensor platform | Office of Cancer Genomics

Cancer.gov

The vast datasets generated by next generation gene sequencing and expression profiling have transformed biological and translational research. However, technologies to produce large-scale functional genomics datasets, such as high-throughput detection of protein-protein interactions (PPIs), are still in early development. While a number of powerful technologies have been employed to detect PPIs, a singular PPI biosensor platform featured with both high sensitivity and robustness in a mammalian cell environment remains to be established.
'PACLIMS': a component LIM system for high-throughput functional genomic analysis.

PubMed

Donofrio, Nicole; Rajagopalon, Ravi; Brown, Douglas; Diener, Stephen; Windham, Donald; Nolin, Shelly; Floyd, Anna; Mitchell, Thomas; Galadima, Natalia; Tucker, Sara; Orbach, Marc J; Patel, Gayatri; Farman, Mark; Pampanwar, Vishal; Soderlund, Cari; Lee, Yong-Hwan; Dean, Ralph A

2005-04-12

Recent advances in sequencing techniques leading to cost reduction have resulted in the generation of a growing number of sequenced eukaryotic genomes. Computational tools greatly assist in defining open reading frames and assigning tentative annotations. However, gene functions cannot be asserted without biological support through, among other things, mutational analysis. In taking a genome-wide approach to functionally annotate an entire organism, in this application the approximately 11,000 predicted genes in the rice blast fungus (Magnaporthe grisea), an effective platform for tracking and storing both the biological materials created and the data produced across several participating institutions was required. The platform designed, named PACLIMS, was built to support our high throughput pipeline for generating 50,000 random insertion mutants of Magnaporthe grisea. To be a useful tool for materials and data tracking and storage, PACLIMS was designed to be simple to use, modifiable to accommodate refinement of research protocols, and cost-efficient. Data entry into PACLIMS was simplified through the use of barcodes and scanners, thus reducing the potential human error, time constraints, and labor. This platform was designed in concert with our experimental protocol so that it leads the researchers through each step of the process from mutant generation through phenotypic assays, thus ensuring that every mutant produced is handled in an identical manner and all necessary data is captured. Many sequenced eukaryotes have reached the point where computational analyses are no longer sufficient and require biological support for their predicted genes. Consequently, there is an increasing need for platforms that support high throughput genome-wide mutational analyses. While PACLIMS was designed specifically for this project, the source and ideas present in its implementation can be used as a model for other high throughput mutational endeavors.
'PACLIMS': A component LIM system for high-throughput functional genomic analysis

PubMed Central

Donofrio, Nicole; Rajagopalon, Ravi; Brown, Douglas; Diener, Stephen; Windham, Donald; Nolin, Shelly; Floyd, Anna; Mitchell, Thomas; Galadima, Natalia; Tucker, Sara; Orbach, Marc J; Patel, Gayatri; Farman, Mark; Pampanwar, Vishal; Soderlund, Cari; Lee, Yong-Hwan; Dean, Ralph A

2005-01-01

Background Recent advances in sequencing techniques leading to cost reduction have resulted in the generation of a growing number of sequenced eukaryotic genomes. Computational tools greatly assist in defining open reading frames and assigning tentative annotations. However, gene functions cannot be asserted without biological support through, among other things, mutational analysis. In taking a genome-wide approach to functionally annotate an entire organism, in this application the ~11,000 predicted genes in the rice blast fungus (Magnaporthe grisea), an effective platform for tracking and storing both the biological materials created and the data produced across several participating institutions was required. Results The platform designed, named PACLIMS, was built to support our high throughput pipeline for generating 50,000 random insertion mutants of Magnaporthe grisea. To be a useful tool for materials and data tracking and storage, PACLIMS was designed to be simple to use, modifiable to accommodate refinement of research protocols, and cost-efficient. Data entry into PACLIMS was simplified through the use of barcodes and scanners, thus reducing the potential human error, time constraints, and labor. This platform was designed in concert with our experimental protocol so that it leads the researchers through each step of the process from mutant generation through phenotypic assays, thus ensuring that every mutant produced is handled in an identical manner and all necessary data is captured. Conclusion Many sequenced eukaryotes have reached the point where computational analyses are no longer sufficient and require biological support for their predicted genes. Consequently, there is an increasing need for platforms that support high throughput genome-wide mutational analyses. While PACLIMS was designed specifically for this project, the source and ideas present in its implementation can be used as a model for other high throughput mutational endeavors. PMID:15826298
Developing High-Throughput HIV Incidence Assay with Pyrosequencing Platform

PubMed Central

Park, Sung Yong; Goeken, Nolan; Lee, Hyo Jin; Bolan, Robert; Dubé, Michael P.

2014-01-01

ABSTRACT Human immunodeficiency virus (HIV) incidence is an important measure for monitoring the epidemic and evaluating the efficacy of intervention and prevention trials. This study developed a high-throughput, single-measure incidence assay by implementing a pyrosequencing platform. We devised a signal-masking bioinformatics pipeline, which yielded a process error rate of 5.8 × 10−4 per base. The pipeline was then applied to analyze 18,434 envelope gene segments (HXB2 7212 to 7601) obtained from 12 incident and 24 chronic patients who had documented HIV-negative and/or -positive tests. The pyrosequencing data were cross-checked by using the single-genome-amplification (SGA) method to independently obtain 302 sequences from 13 patients. Using two genomic biomarkers that probe for the presence of similar sequences, the pyrosequencing platform correctly classified all 12 incident subjects (100% sensitivity) and 23 of 24 chronic subjects (96% specificity). One misclassified subject's chronic infection was correctly classified by conducting the same analysis with SGA data. The biomarkers were statistically associated across the two platforms, suggesting the assay's reproducibility and robustness. Sampling simulations showed that the biomarkers were tolerant of sequencing errors and template resampling, two factors most likely to affect the accuracy of pyrosequencing results. We observed comparable biomarker scores between AIDS and non-AIDS chronic patients (multivariate analysis of variance [MANOVA], P = 0.12), indicating that the stage of HIV disease itself does not affect the classification scheme. The high-throughput genomic HIV incidence marks a significant step toward determining incidence from a single measure in cross-sectional surveys. IMPORTANCE Annual HIV incidence, the number of newly infected individuals within a year, is the key measure of monitoring the epidemic's rise and decline. Developing reliable assays differentiating recent from chronic infections has been a long-standing quest in the HIV community. Over the past 15 years, these assays have traditionally measured various HIV-specific antibodies, but recent technological advancements have expanded the diversity of proposed accurate, user-friendly, and financially viable tools. Here we designed a high-throughput genomic HIV incidence assay based on the signature imprinted in the HIV gene sequence population. By combining next-generation sequencing techniques with bioinformatics analysis, we demonstrated that genomic fingerprints are capable of distinguishing recently infected patients from chronically infected patients with high precision. Our high-throughput platform is expected to allow us to process many patients' samples from a single experiment, permitting the assay to be cost-effective for routine surveillance. PMID:24371062
PLAN: a web platform for automating high-throughput BLAST searches and for managing and mining results.

PubMed

He, Ji; Dai, Xinbin; Zhao, Xuechun

2007-02-09

BLAST searches are widely used for sequence alignment. The search results are commonly adopted for various functional and comparative genomics tasks such as annotating unknown sequences, investigating gene models and comparing two sequence sets. Advances in sequencing technologies pose challenges for high-throughput analysis of large-scale sequence data. A number of programs and hardware solutions exist for efficient BLAST searching, but there is a lack of generic software solutions for mining and personalized management of the results. Systematically reviewing the results and identifying information of interest remains tedious and time-consuming. Personal BLAST Navigator (PLAN) is a versatile web platform that helps users to carry out various personalized pre- and post-BLAST tasks, including: (1) query and target sequence database management, (2) automated high-throughput BLAST searching, (3) indexing and searching of results, (4) filtering results online, (5) managing results of personal interest in favorite categories, (6) automated sequence annotation (such as NCBI NR and ontology-based annotation). PLAN integrates, by default, the Decypher hardware-based BLAST solution provided by Active Motif Inc. with a greatly improved efficiency over conventional BLAST software. BLAST results are visualized by spreadsheets and graphs and are full-text searchable. BLAST results and sequence annotations can be exported, in part or in full, in various formats including Microsoft Excel and FASTA. Sequences and BLAST results are organized in projects, the data publication levels of which are controlled by the registered project owners. In addition, all analytical functions are provided to public users without registration. PLAN has proved a valuable addition to the community for automated high-throughput BLAST searches, and, more importantly, for knowledge discovery, management and sharing based on sequence alignment results. The PLAN web interface is platform-independent, easily configurable and capable of comprehensive expansion, and user-intuitive. PLAN is freely available to academic users at http://bioinfo.noble.org/plan/. The source code for local deployment is provided under free license. Full support on system utilization, installation, configuration and customization are provided to academic users.
PLAN: a web platform for automating high-throughput BLAST searches and for managing and mining results

PubMed Central

He, Ji; Dai, Xinbin; Zhao, Xuechun

2007-01-01

Background BLAST searches are widely used for sequence alignment. The search results are commonly adopted for various functional and comparative genomics tasks such as annotating unknown sequences, investigating gene models and comparing two sequence sets. Advances in sequencing technologies pose challenges for high-throughput analysis of large-scale sequence data. A number of programs and hardware solutions exist for efficient BLAST searching, but there is a lack of generic software solutions for mining and personalized management of the results. Systematically reviewing the results and identifying information of interest remains tedious and time-consuming. Results Personal BLAST Navigator (PLAN) is a versatile web platform that helps users to carry out various personalized pre- and post-BLAST tasks, including: (1) query and target sequence database management, (2) automated high-throughput BLAST searching, (3) indexing and searching of results, (4) filtering results online, (5) managing results of personal interest in favorite categories, (6) automated sequence annotation (such as NCBI NR and ontology-based annotation). PLAN integrates, by default, the Decypher hardware-based BLAST solution provided by Active Motif Inc. with a greatly improved efficiency over conventional BLAST software. BLAST results are visualized by spreadsheets and graphs and are full-text searchable. BLAST results and sequence annotations can be exported, in part or in full, in various formats including Microsoft Excel and FASTA. Sequences and BLAST results are organized in projects, the data publication levels of which are controlled by the registered project owners. In addition, all analytical functions are provided to public users without registration. Conclusion PLAN has proved a valuable addition to the community for automated high-throughput BLAST searches, and, more importantly, for knowledge discovery, management and sharing based on sequence alignment results. The PLAN web interface is platform-independent, easily configurable and capable of comprehensive expansion, and user-intuitive. PLAN is freely available to academic users at . The source code for local deployment is provided under free license. Full support on system utilization, installation, configuration and customization are provided to academic users. PMID:17291345
Environmental microbiology through the lens of high-throughput DNA sequencing: synopsis of current platforms and bioinformatics approaches.

PubMed

Logares, Ramiro; Haverkamp, Thomas H A; Kumar, Surendra; Lanzén, Anders; Nederbragt, Alexander J; Quince, Christopher; Kauserud, Håvard

2012-10-01

The incursion of High-Throughput Sequencing (HTS) in environmental microbiology brings unique opportunities and challenges. HTS now allows a high-resolution exploration of the vast taxonomic and metabolic diversity present in the microbial world, which can provide an exceptional insight on global ecosystem functioning, ecological processes and evolution. This exploration has also economic potential, as we will have access to the evolutionary innovation present in microbial metabolisms, which could be used for biotechnological development. HTS is also challenging the research community, and the current bottleneck is present in the data analysis side. At the moment, researchers are in a sequence data deluge, with sequencing throughput advancing faster than the computer power needed for data analysis. However, new tools and approaches are being developed constantly and the whole process could be depicted as a fast co-evolution between sequencing technology, informatics and microbiologists. In this work, we examine the most popular and recently commercialized HTS platforms as well as bioinformatics methods for data handling and analysis used in microbial metagenomics. This non-exhaustive review is intended to serve as a broad state-of-the-art guide to researchers expanding into this rapidly evolving field. Copyright © 2012 Elsevier B.V. All rights reserved.
The high throughput biomedicine unit at the institute for molecular medicine Finland: high throughput screening meets precision medicine.

PubMed

Pietiainen, Vilja; Saarela, Jani; von Schantz, Carina; Turunen, Laura; Ostling, Paivi; Wennerberg, Krister

2014-05-01

The High Throughput Biomedicine (HTB) unit at the Institute for Molecular Medicine Finland FIMM was established in 2010 to serve as a national and international academic screening unit providing access to state of the art instrumentation for chemical and RNAi-based high throughput screening. The initial focus of the unit was multiwell plate based chemical screening and high content microarray-based siRNA screening. However, over the first four years of operation, the unit has moved to a more flexible service platform where both chemical and siRNA screening is performed at different scales primarily in multiwell plate-based assays with a wide range of readout possibilities with a focus on ultraminiaturization to allow for affordable screening for the academic users. In addition to high throughput screening, the equipment of the unit is also used to support miniaturized, multiplexed and high throughput applications for other types of research such as genomics, sequencing and biobanking operations. Importantly, with the translational research goals at FIMM, an increasing part of the operations at the HTB unit is being focused on high throughput systems biological platforms for functional profiling of patient cells in personalized and precision medicine projects.
High-Throughput Tabular Data Processor - Platform independent graphical tool for processing large data sets.

PubMed

Madanecki, Piotr; Bałut, Magdalena; Buckley, Patrick G; Ochocka, J Renata; Bartoszewski, Rafał; Crossman, David K; Messiaen, Ludwine M; Piotrowski, Arkadiusz

2018-01-01

High-throughput technologies generate considerable amount of data which often requires bioinformatic expertise to analyze. Here we present High-Throughput Tabular Data Processor (HTDP), a platform independent Java program. HTDP works on any character-delimited column data (e.g. BED, GFF, GTF, PSL, WIG, VCF) from multiple text files and supports merging, filtering and converting of data that is produced in the course of high-throughput experiments. HTDP can also utilize itemized sets of conditions from external files for complex or repetitive filtering/merging tasks. The program is intended to aid global, real-time processing of large data sets using a graphical user interface (GUI). Therefore, no prior expertise in programming, regular expression, or command line usage is required of the user. Additionally, no a priori assumptions are imposed on the internal file composition. We demonstrate the flexibility and potential of HTDP in real-life research tasks including microarray and massively parallel sequencing, i.e. identification of disease predisposing variants in the next generation sequencing data as well as comprehensive concurrent analysis of microarray and sequencing results. We also show the utility of HTDP in technical tasks including data merge, reduction and filtering with external criteria files. HTDP was developed to address functionality that is missing or rudimentary in other GUI software for processing character-delimited column data from high-throughput technologies. Flexibility, in terms of input file handling, provides long term potential functionality in high-throughput analysis pipelines, as the program is not limited by the currently existing applications and data formats. HTDP is available as the Open Source software (https://github.com/pmadanecki/htdp).
High-Throughput Tabular Data Processor – Platform independent graphical tool for processing large data sets

PubMed Central

Bałut, Magdalena; Buckley, Patrick G.; Ochocka, J. Renata; Bartoszewski, Rafał; Crossman, David K.; Messiaen, Ludwine M.; Piotrowski, Arkadiusz

2018-01-01

High-throughput technologies generate considerable amount of data which often requires bioinformatic expertise to analyze. Here we present High-Throughput Tabular Data Processor (HTDP), a platform independent Java program. HTDP works on any character-delimited column data (e.g. BED, GFF, GTF, PSL, WIG, VCF) from multiple text files and supports merging, filtering and converting of data that is produced in the course of high-throughput experiments. HTDP can also utilize itemized sets of conditions from external files for complex or repetitive filtering/merging tasks. The program is intended to aid global, real-time processing of large data sets using a graphical user interface (GUI). Therefore, no prior expertise in programming, regular expression, or command line usage is required of the user. Additionally, no a priori assumptions are imposed on the internal file composition. We demonstrate the flexibility and potential of HTDP in real-life research tasks including microarray and massively parallel sequencing, i.e. identification of disease predisposing variants in the next generation sequencing data as well as comprehensive concurrent analysis of microarray and sequencing results. We also show the utility of HTDP in technical tasks including data merge, reduction and filtering with external criteria files. HTDP was developed to address functionality that is missing or rudimentary in other GUI software for processing character-delimited column data from high-throughput technologies. Flexibility, in terms of input file handling, provides long term potential functionality in high-throughput analysis pipelines, as the program is not limited by the currently existing applications and data formats. HTDP is available as the Open Source software (https://github.com/pmadanecki/htdp). PMID:29432475
CloVR: a virtual machine for automated and portable sequence analysis from the desktop using cloud computing.

PubMed

Angiuoli, Samuel V; Matalka, Malcolm; Gussman, Aaron; Galens, Kevin; Vangala, Mahesh; Riley, David R; Arze, Cesar; White, James R; White, Owen; Fricke, W Florian

2011-08-30

Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom built virtual machines to distribute pre-packaged with pre-configured software. We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. The CloVR VM and associated architecture lowers the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high throughput data processing.

Towards Clinical Molecular Diagnosis of Inherited Cardiac Conditions: A Comparison of Bench-Top Genome DNA Sequencers

PubMed Central

Wilkinson, Samuel L.; John, Shibu; Walsh, Roddy; Novotny, Tomas; Valaskova, Iveta; Gupta, Manu; Game, Laurence; Barton, Paul J R.; Cook, Stuart A.; Ware, James S.

2013-01-01

Background Molecular genetic testing is recommended for diagnosis of inherited cardiac disease, to guide prognosis and treatment, but access is often limited by cost and availability. Recently introduced high-throughput bench-top DNA sequencing platforms have the potential to overcome these limitations. Methodology/Principal Findings We evaluated two next-generation sequencing (NGS) platforms for molecular diagnostics. The protein-coding regions of six genes associated with inherited arrhythmia syndromes were amplified from 15 human samples using parallelised multiplex PCR (Access Array, Fluidigm), and sequenced on the MiSeq (Illumina) and Ion Torrent PGM (Life Technologies). Overall, 97.9% of the target was sequenced adequately for variant calling on the MiSeq, and 96.8% on the Ion Torrent PGM. Regions missed tended to be of high GC-content, and most were problematic for both platforms. Variant calling was assessed using 107 variants detected using Sanger sequencing: within adequately sequenced regions, variant calling on both platforms was highly accurate (Sensitivity: MiSeq 100%, PGM 99.1%. Positive predictive value: MiSeq 95.9%, PGM 95.5%). At the time of the study the Ion Torrent PGM had a lower capital cost and individual runs were cheaper and faster. The MiSeq had a higher capacity (requiring fewer runs), with reduced hands-on time and simpler laboratory workflows. Both provide significant cost and time savings over conventional methods, even allowing for adjunct Sanger sequencing to validate findings and sequence exons missed by NGS. Conclusions/Significance MiSeq and Ion Torrent PGM both provide accurate variant detection as part of a PCR-based molecular diagnostic workflow, and provide alternative platforms for molecular diagnosis of inherited cardiac conditions. Though there were performance differences at this throughput, platforms differed primarily in terms of cost, scalability, protocol stability and ease of use. Compared with current molecular genetic diagnostic tests for inherited cardiac arrhythmias, these NGS approaches are faster, less expensive, and yet more comprehensive. PMID:23861798
"First generation" automated DNA sequencing technology.

PubMed

Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

2011-10-01

Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.
GenomicTools: a computational platform for developing high-throughput analytics in genomics.

PubMed

Tsirigos, Aristotelis; Haiminen, Niina; Bilal, Erhan; Utro, Filippo

2012-01-15

Recent advances in sequencing technology have resulted in the dramatic increase of sequencing data, which, in turn, requires efficient management of computational resources, such as computing time, memory requirements as well as prototyping of computational pipelines. We present GenomicTools, a flexible computational platform, comprising both a command-line set of tools and a C++ API, for the analysis and manipulation of high-throughput sequencing data such as DNA-seq, RNA-seq, ChIP-seq and MethylC-seq. GenomicTools implements a variety of mathematical operations between sets of genomic regions thereby enabling the prototyping of computational pipelines that can address a wide spectrum of tasks ranging from pre-processing and quality control to meta-analyses. Additionally, the GenomicTools platform is designed to analyze large datasets of any size by minimizing memory requirements. In practical applications, where comparable, GenomicTools outperforms existing tools in terms of both time and memory usage. The GenomicTools platform (version 2.0.0) was implemented in C++. The source code, documentation, user manual, example datasets and scripts are available online at http://code.google.com/p/ibm-cbc-genomic-tools.
An innovative SNP genotyping method adapting to multiple platforms and throughputs.

PubMed

Long, Y M; Chao, W S; Ma, G J; Xu, S S; Qi, L L

2017-03-01

An innovative genotyping method designated as semi-thermal asymmetric reverse PCR (STARP) was developed for genotyping individual SNPs with improved accuracy, flexible throughputs, low operational costs, and high platform compatibility. Multiplex chip-based technology for genome-scale genotyping of single nucleotide polymorphisms (SNPs) has made great progress in the past two decades. However, PCR-based genotyping of individual SNPs still remains problematic in accuracy, throughput, simplicity, and/or operational costs as well as the compatibility with multiple platforms. Here, we report a novel SNP genotyping method designated semi-thermal asymmetric reverse PCR (STARP). In this method, genotyping assay was performed under unique PCR conditions using two universal priming element-adjustable primers (PEA-primers) and one group of three locus-specific primers: two asymmetrically modified allele-specific primers (AMAS-primers) and their common reverse primer. The two AMAS-primers each were substituted one base in different positions at their 3' regions to significantly increase the amplification specificity of the two alleles and tailed at 5' ends to provide priming sites for PEA-primers. The two PEA-primers were developed for common use in all genotyping assays to stringently target the PCR fragments generated by the two AMAS-primers with similar PCR efficiencies and for flexible detection using either gel-free fluorescence signals or gel-based size separation. The state-of-the-art primer design and unique PCR conditions endowed STARP with all the major advantages of high accuracy, flexible throughputs, simple assay design, low operational costs, and platform compatibility. In addition to SNPs, STARP can also be employed in genotyping of indels (insertion-deletion polymorphisms). As vast variations in DNA sequences are being unearthed by many genome sequencing projects and genotyping by sequencing, STARP will have wide applications across all biological organisms in agriculture, medicine, and forensics.
CloVR: A virtual machine for automated and portable sequence analysis from the desktop using cloud computing

PubMed Central

2011-01-01

Background Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom built virtual machines to distribute pre-packaged with pre-configured software. Results We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. Conclusion The CloVR VM and associated architecture lowers the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high throughput data processing. PMID:21878105
Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies

DOE PAGES

Utturkar, Sagar M.; Klingeman, Dawn Marie; Bruno-Barcena, José M.; ...

2015-04-14

During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms which are characterized by low cost, high throughput and shorter read lengths for example, Illumina. The emergence and development of so called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequencemore » datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data.« less
Mining conifers’ mega-genome using rapid and efficient multiplexed high-throughput genotyping-by-sequencing (GBS) SNP discovery platform

USDA-ARS?s Scientific Manuscript database

Next-generation sequencing (NGS) technologies are revolutionizing both medical and biological research through generation of massive SNP data sets for identifying heritable genome variation underlying key traits, from rare human diseases to important agronomic phenotypes in crop species. We evaluate...
Evaluation of sequencing approaches for high-throughput ...

EPA Pesticide Factsheets

Whole-genome in vitro transcriptomics has shown the capability to identify mechanisms of action and estimates of potency for chemical-mediated effects in a toxicological framework, but with limited throughput and high cost. We present the evaluation of three toxicogenomics platforms for potential application to high-throughput screening: 1. TempO-Seq utilizing custom designed paired probes per gene; 2. Targeted sequencing (TSQ) utilizing Illumina’s TruSeq RNA Access Library Prep Kit containing tiled exon-specific probe sets; 3. Low coverage whole transcriptome sequencing (LSQ) using Illumina’s TruSeq Stranded mRNA Kit. Each platform was required to cover the ~20,000 genes of the full transcriptome, operate directly with cell lysates, and be automatable with 384-well plates. Technical reproducibility was assessed using MAQC control RNA samples A and B, while functional utility for chemical screening was evaluated using six treatments at a single concentration after 6 hr in MCF7 breast cancer cells: 10 µM chlorpromazine, 10 µM ciclopriox, 10 µM genistein, 100 nM sirolimus, 1 µM tanespimycin, and 1 µM trichostatin A. All RNA samples and chemical treatments were run with 5 technical replicates. The three platforms achieved different read depths, with the TempO-Seq having ~34M mapped reads per sample, while TSQ and LSQ averaged 20M and 11M aligned reads per sample, respectively. Inter-replicate correlation averaged ≥0.95 for raw log2 expression values i
Next Generation MUT-MAP, a High-Sensitivity High-Throughput Microfluidics Chip-Based Mutation Analysis Panel

PubMed Central

Patel, Rajesh; Tsan, Alison; Sumiyoshi, Teiko; Fu, Ling; Desai, Rupal; Schoenbrunner, Nancy; Myers, Thomas W.; Bauer, Keith; Smith, Edward; Raja, Rajiv

2014-01-01

Molecular profiling of tumor tissue to detect alterations, such as oncogenic mutations, plays a vital role in determining treatment options in oncology. Hence, there is an increasing need for a robust and high-throughput technology to detect oncogenic hotspot mutations. Although commercial assays are available to detect genetic alterations in single genes, only a limited amount of tissue is often available from patients, requiring multiplexing to allow for simultaneous detection of mutations in many genes using low DNA input. Even though next-generation sequencing (NGS) platforms provide powerful tools for this purpose, they face challenges such as high cost, large DNA input requirement, complex data analysis, and long turnaround times, limiting their use in clinical settings. We report the development of the next generation mutation multi-analyte panel (MUT-MAP), a high-throughput microfluidic, panel for detecting 120 somatic mutations across eleven genes of therapeutic interest (AKT1, BRAF, EGFR, FGFR3, FLT3, HRAS, KIT, KRAS, MET, NRAS, and PIK3CA) using allele-specific PCR (AS-PCR) and Taqman technology. This mutation panel requires as little as 2 ng of high quality DNA from fresh frozen or 100 ng of DNA from formalin-fixed paraffin-embedded (FFPE) tissues. Mutation calls, including an automated data analysis process, have been implemented to run 88 samples per day. Validation of this platform using plasmids showed robust signal and low cross-reactivity in all of the newly added assays and mutation calls in cell line samples were found to be consistent with the Catalogue of Somatic Mutations in Cancer (COSMIC) database allowing for direct comparison of our platform to Sanger sequencing. High correlation with NGS when compared to the SuraSeq500 panel run on the Ion Torrent platform in a FFPE dilution experiment showed assay sensitivity down to 0.45%. This multiplexed mutation panel is a valuable tool for high-throughput biomarker discovery in personalized medicine and cancer drug development. PMID:24658394
PipeCraft: Flexible open-source toolkit for bioinformatics analysis of custom high-throughput amplicon sequencing data.

PubMed

Anslan, Sten; Bahram, Mohammad; Hiiesalu, Indrek; Tedersoo, Leho

2017-11-01

High-throughput sequencing methods have become a routine analysis tool in environmental sciences as well as in public and private sector. These methods provide vast amount of data, which need to be analysed in several steps. Although the bioinformatics may be applied using several public tools, many analytical pipelines allow too few options for the optimal analysis for more complicated or customized designs. Here, we introduce PipeCraft, a flexible and handy bioinformatics pipeline with a user-friendly graphical interface that links several public tools for analysing amplicon sequencing data. Users are able to customize the pipeline by selecting the most suitable tools and options to process raw sequences from Illumina, Pacific Biosciences, Ion Torrent and Roche 454 sequencing platforms. We described the design and options of PipeCraft and evaluated its performance by analysing the data sets from three different sequencing platforms. We demonstrated that PipeCraft is able to process large data sets within 24 hr. The graphical user interface and the automated links between various bioinformatics tools enable easy customization of the workflow. All analytical steps and options are recorded in log files and are easily traceable. © 2017 John Wiley & Sons Ltd.
Simultaneous identification and molecular characterization of viruses associated with an apple tree with mosaic symptom

USDA-ARS?s Scientific Manuscript database

We conducted genomic sequencing to identify viruses associated with mosaic disease of an apple tree using the high-throughput sequencing (HTS) Illumina RNA-seq platform. The objective was to examine if rapid identification and characterization of viruses could be effectively achieved by RNA-seq anal...
Deep sequencing in library selection projects: what insight does it bring?

PubMed

Glanville, J; D'Angelo, S; Khan, T A; Reddy, S T; Naranjo, L; Ferrara, F; Bradbury, A R M

2015-08-01

High throughput sequencing is poised to change all aspects of the way antibodies and other binders are discovered and engineered. Millions of available sequence reads provide an unprecedented sampling depth able to guide the design and construction of effective, high quality naïve libraries containing tens of billions of unique molecules. Furthermore, during selections, high throughput sequencing enables quantitative tracing of enriched clones and position-specific guidance to amino acid variation under positive selection during antibody engineering. Successful application of the technologies relies on specific PCR reagent design, correct sequencing platform selection, and effective use of computational tools and statistical measures to remove error, identify antibodies, estimate diversity, and extract signatures of selection from the clone down to individual structural positions. Here we review these considerations and discuss some of the remaining challenges to the widespread adoption of the technology. Copyright © 2015 Elsevier Ltd. All rights reserved.
Deep sequencing in library selection projects: what insight does it bring?

PubMed Central

Glanville, J; D’Angelo, S; Khan, T.A.; Reddy, S. T.; Naranjo, L.; Ferrara, F.; Bradbury, A.R.M.

2015-01-01

High throughput sequencing is poised to change all aspects of the way antibodies and other binders are discovered and engineered. Millions of available sequence reads provide an unprecedented sampling depth able to guide the design and construction of effective, high quality naïve libraries containing tens of billions of unique molecules. Furthermore, during selections, high throughput sequencing enables quantitative tracing of enriched clones and position-specific guidance to amino acid variation under positive selection during antibody engineering. Successful application of the technologies relies on specific PCR reagent design, correct sequencing platform selection, and effective use of computational tools and statistical measures to remove error, identify antibodies, estimate diversity, and extract signatures of selection from the clone down to individual structural positions. Here we review these considerations and discuss some of the remaining challenges to the widespread adoption of the technology. PMID:26451649
A remark on copy number variation detection methods.

PubMed

Li, Shuo; Dou, Xialiang; Gao, Ruiqi; Ge, Xinzhou; Qian, Minping; Wan, Lin

2018-01-01

Copy number variations (CNVs) are gain and loss of DNA sequence of a genome. High throughput platforms such as microarrays and next generation sequencing technologies (NGS) have been applied for genome wide copy number losses. Although progress has been made in both approaches, the accuracy and consistency of CNV calling from the two platforms remain in dispute. In this study, we perform a deep analysis on copy number losses on 254 human DNA samples, which have both SNP microarray data and NGS data publicly available from Hapmap Project and 1000 Genomes Project respectively. We show that the copy number losses reported from Hapmap Project and 1000 Genome Project only have < 30% overlap, while these reports are required to have cross-platform (e.g. PCR, microarray and high-throughput sequencing) experimental supporting by their corresponding projects, even though state-of-art calling methods were employed. On the other hand, copy number losses are found directly from HapMap microarray data by an accurate algorithm, i.e. CNVhac, almost all of which have lower read mapping depth in NGS data; furthermore, 88% of which can be supported by the sequences with breakpoint in NGS data. Our results suggest the ability of microarray calling CNVs and the possible introduction of false negatives from the unessential requirement of the additional cross-platform supporting. The inconsistency of CNV reports from Hapmap Project and 1000 Genomes Project might result from the inadequate information containing in microarray data, the inconsistent detection criteria, or the filtration effect of cross-platform supporting. The statistical test on CNVs called from CNVhac show that the microarray data can offer reliable CNV reports, and majority of CNV candidates can be confirmed by raw sequences. Therefore, the CNV candidates given by a good caller could be highly reliable without cross-platform supporting, so additional experimental information should be applied in need instead of necessarily.
Denoising DNA deep sequencing data—high-throughput sequencing errors and their correction

PubMed Central

Laehnemann, David; Borkhardt, Arndt

2016-01-01

Characterizing the errors generated by common high-throughput sequencing platforms and telling true genetic variation from technical artefacts are two interdependent steps, essential to many analyses such as single nucleotide variant calling, haplotype inference, sequence assembly and evolutionary studies. Both random and systematic errors can show a specific occurrence profile for each of the six prominent sequencing platforms surveyed here: 454 pyrosequencing, Complete Genomics DNA nanoball sequencing, Illumina sequencing by synthesis, Ion Torrent semiconductor sequencing, Pacific Biosciences single-molecule real-time sequencing and Oxford Nanopore sequencing. There is a large variety of programs available for error removal in sequencing read data, which differ in the error models and statistical techniques they use, the features of the data they analyse, the parameters they determine from them and the data structures and algorithms they use. We highlight the assumptions they make and for which data types these hold, providing guidance which tools to consider for benchmarking with regard to the data properties. While no benchmarking results are included here, such specific benchmarks would greatly inform tool choices and future software development. The development of stand-alone error correctors, as well as single nucleotide variant and haplotype callers, could also benefit from using more of the knowledge about error profiles and from (re)combining ideas from the existing approaches presented here. PMID:26026159
Computing Platforms for Big Biological Data Analytics: Perspectives and Challenges.

PubMed

Yin, Zekun; Lan, Haidong; Tan, Guangming; Lu, Mian; Vasilakos, Athanasios V; Liu, Weiguo

2017-01-01

The last decade has witnessed an explosion in the amount of available biological sequence data, due to the rapid progress of high-throughput sequencing projects. However, the biological data amount is becoming so great that traditional data analysis platforms and methods can no longer meet the need to rapidly perform data analysis tasks in life sciences. As a result, both biologists and computer scientists are facing the challenge of gaining a profound insight into the deepest biological functions from big biological data. This in turn requires massive computational resources. Therefore, high performance computing (HPC) platforms are highly needed as well as efficient and scalable algorithms that can take advantage of these platforms. In this paper, we survey the state-of-the-art HPC platforms for big biological data analytics. We first list the characteristics of big biological data and popular computing platforms. Then we provide a taxonomy of different biological data analysis applications and a survey of the way they have been mapped onto various computing platforms. After that, we present a case study to compare the efficiency of different computing platforms for handling the classical biological sequence alignment problem. At last we discuss the open issues in big biological data analytics.
High-throughput sequencing of three Lemnoideae (duckweeds) chloroplast genomes from total DNA.

PubMed

Wang, Wenqin; Messing, Joachim

2011-01-01

Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaption for aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily of Lemnoideae represents such a collection of aquatic species that because of photosynthesis represents one of the fastest growing plant species on earth. We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. Over 1,000-time coverage of chloroplast from total DNA were reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating a greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power.
High-Throughput Sequencing of Three Lemnoideae (Duckweeds) Chloroplast Genomes from Total DNA

PubMed Central

Wang, Wenqin; Messing, Joachim

2011-01-01

Background Chloroplast genomes provide a wealth of information for evolutionary and population genetic studies. Chloroplasts play a particularly important role in the adaption for aquatic plants because they float on water and their major surface is exposed continuously to sunlight. The subfamily of Lemnoideae represents such a collection of aquatic species that because of photosynthesis represents one of the fastest growing plant species on earth. Methods We sequenced the chloroplast genomes from three different genera of Lemnoideae, Spirodela polyrhiza, Wolffiella lingulata and Wolffia australiana by high-throughput DNA sequencing of genomic DNA using the SOLiD platform. Unfractionated total DNA contains high copies of plastid DNA so that sequences from the nucleus and mitochondria can easily be filtered computationally. Remaining sequence reads were assembled into contiguous sequences (contigs) using SOLiD software tools. Contigs were mapped to a reference genome of Lemna minor and gaps, selected by PCR, were sequenced on the ABI3730xl platform. Conclusions This combinatorial approach yielded whole genomic contiguous sequences in a cost-effective manner. Over 1,000-time coverage of chloroplast from total DNA were reached by the SOLiD platform in a single spot on a quadrant slide without purification. Comparative analysis indicated that the chloroplast genome was conserved in gene number and organization with respect to the reference genome of L. minor. However, higher nucleotide substitution, abundant deletions and insertions occurred in non-coding regions of these genomes, indicating a greater genomic dynamics than expected from the comparison of other related species in the Pooideae. Noticeably, there was no transition bias over transversion in Lemnoideae. The data should have immediate applications in evolutionary biology and plant taxonomy with increased resolution and statistical power. PMID:21931804
Accurate Sample Assignment in a Multiplexed, Ultrasensitive, High-Throughput Sequencing Assay for Minimal Residual Disease.

PubMed

Bartram, Jack; Mountjoy, Edward; Brooks, Tony; Hancock, Jeremy; Williamson, Helen; Wright, Gary; Moppett, John; Goulden, Nick; Hubank, Mike

2016-07-01

High-throughput sequencing (HTS) (next-generation sequencing) of the rearranged Ig and T-cell receptor genes promises to be less expensive and more sensitive than current methods of monitoring minimal residual disease (MRD) in patients with acute lymphoblastic leukemia. However, the adoption of new approaches by clinical laboratories requires careful evaluation of all potential sources of error and the development of strategies to ensure the highest accuracy. Timely and efficient clinical use of HTS platforms will depend on combining multiple samples (multiplexing) in each sequencing run. Here we examine the Ig heavy-chain gene HTS on the Illumina MiSeq platform for MRD. We identify errors associated with multiplexing that could potentially impact the accuracy of MRD analysis. We optimize a strategy that combines high-purity, sequence-optimized oligonucleotides, dual indexing, and an error-aware demultiplexing approach to minimize errors and maximize sensitivity. We present a probability-based, demultiplexing pipeline Error-Aware Demultiplexer that is suitable for all MiSeq strategies and accurately assigns samples to the correct identifier without excessive loss of data. Finally, using controls quantified by digital PCR, we show that HTS-MRD can accurately detect as few as 1 in 10(6) copies of specific leukemic MRD. Crown Copyright © 2016. Published by Elsevier Inc. All rights reserved.
Web-based visual analysis for high-throughput genomics

PubMed Central

2013-01-01

Background Visualization plays an essential role in genomics research by making it possible to observe correlations and trends in large datasets as well as communicate findings to others. Visual analysis, which combines visualization with analysis tools to enable seamless use of both approaches for scientific investigation, offers a powerful method for performing complex genomic analyses. However, there are numerous challenges that arise when creating rich, interactive Web-based visualizations/visual analysis applications for high-throughput genomics. These challenges include managing data flow from Web server to Web browser, integrating analysis tools and visualizations, and sharing visualizations with colleagues. Results We have created a platform simplifies the creation of Web-based visualization/visual analysis applications for high-throughput genomics. This platform provides components that make it simple to efficiently query very large datasets, draw common representations of genomic data, integrate with analysis tools, and share or publish fully interactive visualizations. Using this platform, we have created a Circos-style genome-wide viewer, a generic scatter plot for correlation analysis, an interactive phylogenetic tree, a scalable genome browser for next-generation sequencing data, and an application for systematically exploring tool parameter spaces to find good parameter values. All visualizations are interactive and fully customizable. The platform is integrated with the Galaxy (http://galaxyproject.org) genomics workbench, making it easy to integrate new visual applications into Galaxy. Conclusions Visualization and visual analysis play an important role in high-throughput genomics experiments, and approaches are needed to make it easier to create applications for these activities. Our framework provides a foundation for creating Web-based visualizations and integrating them into Galaxy. Finally, the visualizations we have created using the framework are useful tools for high-throughput genomics experiments. PMID:23758618

HybPiper: Extracting coding sequence and introns for phylogenetics from high-throughput sequencing reads using target enrichment1

PubMed Central

Johnson, Matthew G.; Gardner, Elliot M.; Liu, Yang; Medina, Rafael; Goffinet, Bernard; Shaw, A. Jonathan; Zerega, Nyree J. C.; Wickett, Norman J.

2016-01-01

Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). Methods and Results: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus. Conclusions: HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper. PMID:27437175
Analysis of Illumina Microbial Assemblies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Clum, Alicia; Foster, Brian; Froula, Jeff

2010-05-28

Since the emerging of second generation sequencing technologies, the evaluation of different sequencing approaches and their assembly strategies for different types of genomes has become an important undertaken. Next generation sequencing technologies dramatically increase sequence throughput while decreasing cost, making them an attractive tool for whole genome shotgun sequencing. To compare different approaches for de-novo whole genome assembly, appropriate tools and a solid understanding of both quantity and quality of the underlying sequence data are crucial. Here, we performed an in-depth analysis of short-read Illumina sequence assembly strategies for bacterial and archaeal genomes. Different types of Illumina libraries as wellmore » as different trim parameters and assemblers were evaluated. Results of the comparative analysis and sequencing platforms will be presented. The goal of this analysis is to develop a cost-effective approach for the increased throughput of the generation of high quality microbial genomes.« less
Dissecting enzyme function with microfluidic-based deep mutational scanning.

PubMed

Romero, Philip A; Tran, Tuan M; Abate, Adam R

2015-06-09

Natural enzymes are incredibly proficient catalysts, but engineering them to have new or improved functions is challenging due to the complexity of how an enzyme's sequence relates to its biochemical properties. Here, we present an ultrahigh-throughput method for mapping enzyme sequence-function relationships that combines droplet microfluidic screening with next-generation DNA sequencing. We apply our method to map the activity of millions of glycosidase sequence variants. Microfluidic-based deep mutational scanning provides a comprehensive and unbiased view of the enzyme function landscape. The mapping displays expected patterns of mutational tolerance and a strong correspondence to sequence variation within the enzyme family, but also reveals previously unreported sites that are crucial for glycosidase function. We modified the screening protocol to include a high-temperature incubation step, and the resulting thermotolerance landscape allowed the discovery of mutations that enhance enzyme thermostability. Droplet microfluidics provides a general platform for enzyme screening that, when combined with DNA-sequencing technologies, enables high-throughput mapping of enzyme sequence space.
Germplasm Management in the Post-genomics Era-a case study with lettuce

USDA-ARS?s Scientific Manuscript database

High-throughput genotyping platforms and next-generation sequencing technologies revolutionized our ways in germplasm characterization. In collaboration with UC Davis Genome Center, we completed a project of genotyping the entire cultivated lettuce (Lactuca sativa L.) collection of 1,066 accessions ...
A world of opportunities with nanopore sequencing.

PubMed

Leggett, Richard M; Clark, Matthew D

2017-11-28

Oxford Nanopore Technologies' MinION sequencer was launched in pre-release form in 2014 and represents an exciting new sequencing paradigm. The device offers multi-kilobase reads and a streamed mode of operation that allows processing of reads as they are generated. Crucially, it is an extremely compact device that is powered from the USB port of a laptop computer, enabling it to be taken out of the lab and facilitating previously impossible in-field sequencing experiments to be undertaken. Many of the initial publications concerning the platform focused on provision of tools to access and analyse the new sequence formats and then demonstrating the assembly of microbial genomes. More recently, as throughput and accuracy have increased, it has been possible to begin work involving more complex genomes and metagenomes. With the release of the high-throughput GridION X5 and PromethION platforms, the sequencing of large genomes will become more cost efficient, and enable the leveraging of extremely long (>100 kb) reads for resolution of complex genomic structures. This review provides a brief overview of nanopore sequencing technology, describes the growing range of nanopore bioinformatics tools, and highlights some of the most influential publications that have emerged over the last 2 years. Finally, we look to the future and the potential the platform has to disrupt work in human, microbiome, and plant genomics. © The Author 2017. Published by Oxford University Press on behalf of the Society for Experimental Biology. All rights reserved. For permissions, please email: journals.permissions@oup.com.
SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

USGS Publications Warehouse

Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

2013-01-01

SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data.

PubMed

Miller, Mark P; Knaus, Brian J; Mullins, Thomas D; Haig, Susan M

2013-01-01

SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
Preparation of next-generation sequencing libraries using Nextera™ technology: simultaneous DNA fragmentation and adaptor tagging by in vitro transposition.

PubMed

Caruccio, Nicholas

2011-01-01

DNA library preparation is a common entry point and bottleneck for next-generation sequencing. Current methods generally consist of distinct steps that often involve significant sample loss and hands-on time: DNA fragmentation, end-polishing, and adaptor-ligation. In vitro transposition with Nextera™ Transposomes simultaneously fragments and covalently tags the target DNA, thereby combining these three distinct steps into a single reaction. Platform-specific sequencing adaptors can be added, and the sample can be enriched and bar-coded using limited-cycle PCR to prepare di-tagged DNA fragment libraries. Nextera technology offers a streamlined, efficient, and high-throughput method for generating bar-coded libraries compatible with multiple next-generation sequencing platforms.
Use of the melting curve assay as a means for high-throughput quantification of Illumina sequencing libraries.

PubMed

Shinozuka, Hiroshi; Forster, John W

2016-01-01

Background. Multiplexed sequencing is commonly performed on massively parallel short-read sequencing platforms such as Illumina, and the efficiency of library normalisation can affect the quality of the output dataset. Although several library normalisation approaches have been established, none are ideal for highly multiplexed sequencing due to issues of cost and/or processing time. Methods. An inexpensive and high-throughput library quantification method has been developed, based on an adaptation of the melting curve assay. Sequencing libraries were subjected to the assay using the Bio-Rad Laboratories CFX Connect(TM) Real-Time PCR Detection System. The library quantity was calculated through summation of reduction of relative fluorescence units between 86 and 95 °C. Results.PCR-enriched sequencing libraries are suitable for this quantification without pre-purification of DNA. Short DNA molecules, which ideally should be eliminated from the library for subsequent processing, were differentiated from the target DNA in a mixture on the basis of differences in melting temperature. Quantification results for long sequences targeted using the melting curve assay were correlated with those from existing methods (R (2) > 0.77), and that observed from MiSeq sequencing (R (2) = 0.82). Discussion.The results of multiplexed sequencing suggested that the normalisation performance of the described method is equivalent to that of another recently reported high-throughput bead-based method, BeNUS. However, costs for the melting curve assay are considerably lower and processing times shorter than those of other existing methods, suggesting greater suitability for highly multiplexed sequencing applications.
Improved Selection of Internal Transcribed Spacer-Specific Primers Enables Quantitative, Ultra-High-Throughput Profiling of Fungal Communities

PubMed Central

Bokulich, Nicholas A.

2013-01-01

Ultra-high-throughput sequencing (HTS) of fungal communities has been restricted by short read lengths and primer amplification bias, slowing the adoption of newer sequencing technologies to fungal community profiling. To address these issues, we evaluated the performance of several common internal transcribed spacer (ITS) primers and designed a novel primer set and work flow for simultaneous quantification and species-level interrogation of fungal consortia. Primer comparison and validation were predicted in silico and by sequencing a “mock community” of mixed yeast species to explore the challenges of amplicon length and amplification bias for reconstructing defined yeast community structures. The amplicon size and distribution of this primer set are smaller than for all preexisting ITS primer sets, maximizing sequencing coverage of hypervariable ITS domains by very-short-amplicon, high-throughput sequencing platforms. This feature also enables the optional integration of quantitative PCR (qPCR) directly into the HTS preparatory work flow by substituting qPCR with these primers for standard PCR, yielding quantification of individual community members. The complete work flow described here, utilizing any of the qualified primer sets evaluated, can rapidly profile mixed fungal communities and capably reconstructed well-characterized beer and wine fermentation fungal communities. PMID:23377949
A novel ultra high-throughput 16S rRNA gene amplicon sequencing library preparation method for the Illumina HiSeq platform.

PubMed

de Muinck, Eric J; Trosvik, Pål; Gilfillan, Gregor D; Hov, Johannes R; Sundaram, Arvind Y M

2017-07-06

Advances in sequencing technologies and bioinformatics have made the analysis of microbial communities almost routine. Nonetheless, the need remains to improve on the techniques used for gathering such data, including increasing throughput while lowering cost and benchmarking the techniques so that potential sources of bias can be better characterized. We present a triple-index amplicon sequencing strategy to sequence large numbers of samples at significantly lower c ost and in a shorter timeframe compared to existing methods. The design employs a two-stage PCR protocol, incorpo rating three barcodes to each sample, with the possibility to add a fourth-index. It also includes heterogeneity spacers to overcome low complexity issues faced when sequencing amplicons on Illumina platforms. The library preparation method was extensively benchmarked through analysis of a mock community in order to assess biases introduced by sample indexing, number of PCR cycles, and template concentration. We further evaluated the method through re-sequencing of a standardized environmental sample. Finally, we evaluated our protocol on a set of fecal samples from a small cohort of healthy adults, demonstrating good performance in a realistic experimental setting. Between-sample variation was mainly related to batch effects, such as DNA extraction, while sample indexing was also a significant source of bias. PCR cycle number strongly influenced chimera formation and affected relative abundance estimates of species with high GC content. Libraries were sequenced using the Illumina HiSeq and MiSeq platforms to demonstrate that this protocol is highly scalable to sequence thousands of samples at a very low cost. Here, we provide the most comprehensive study of performance and bias inherent to a 16S rRNA gene amplicon sequencing method to date. Triple-indexing greatly reduces the number of long custom DNA oligos required for library preparation, while the inclusion of variable length heterogeneity spacers minimizes the need for PhiX spike-in. This design results in a significant cost reduction of highly multiplexed amplicon sequencing. The biases we characterize highlight the need for highly standardized protocols. Reassuringly, we find that the biological signal is a far stronger structuring factor than the various sources of bias.
WebPrInSeS: automated full-length clone sequence identification and verification using high-throughput sequencing data.

PubMed

Massouras, Andreas; Decouttere, Frederik; Hens, Korneel; Deplancke, Bart

2010-07-01

High-throughput sequencing (HTS) is revolutionizing our ability to obtain cheap, fast and reliable sequence information. Many experimental approaches are expected to benefit from the incorporation of such sequencing features in their pipeline. Consequently, software tools that facilitate such an incorporation should be of great interest. In this context, we developed WebPrInSeS, a web server tool allowing automated full-length clone sequence identification and verification using HTS data. WebPrInSeS encompasses two separate software applications. The first is WebPrInSeS-C which performs automated sequence verification of user-defined open-reading frame (ORF) clone libraries. The second is WebPrInSeS-E, which identifies positive hits in cDNA or ORF-based library screening experiments such as yeast one- or two-hybrid assays. Both tools perform de novo assembly using HTS data from any of the three major sequencing platforms. Thus, WebPrInSeS provides a highly integrated, cost-effective and efficient way to sequence-verify or identify clones of interest. WebPrInSeS is available at http://webprinses.epfl.ch/ and is open to all users.
WebPrInSeS: automated full-length clone sequence identification and verification using high-throughput sequencing data

PubMed Central

Massouras, Andreas; Decouttere, Frederik; Hens, Korneel; Deplancke, Bart

2010-01-01

High-throughput sequencing (HTS) is revolutionizing our ability to obtain cheap, fast and reliable sequence information. Many experimental approaches are expected to benefit from the incorporation of such sequencing features in their pipeline. Consequently, software tools that facilitate such an incorporation should be of great interest. In this context, we developed WebPrInSeS, a web server tool allowing automated full-length clone sequence identification and verification using HTS data. WebPrInSeS encompasses two separate software applications. The first is WebPrInSeS-C which performs automated sequence verification of user-defined open-reading frame (ORF) clone libraries. The second is WebPrInSeS-E, which identifies positive hits in cDNA or ORF-based library screening experiments such as yeast one- or two-hybrid assays. Both tools perform de novo assembly using HTS data from any of the three major sequencing platforms. Thus, WebPrInSeS provides a highly integrated, cost-effective and efficient way to sequence-verify or identify clones of interest. WebPrInSeS is available at http://webprinses.epfl.ch/ and is open to all users. PMID:20501601
High-Throughput rRNA Gene Sequencing Reveals High and Complex Bacterial Diversity Associated with Brazilian Coffee Bean Fermentation

PubMed Central

Vinícius de Melo, Gilberto

2018-01-01

Summary Coffee bean fermentation is a spontaneous, on-farm process involving the action of different microbial groups, including bacteria and fungi. In this study, high-throughput sequencing approach was employed to study the diversity and dynamics of bacteria associated with Brazilian coffee bean fermentation. The total DNA from fermenting coffee samples was extracted at different time points, and the 16S rRNA gene with segments around the V4 variable region was sequenced by Illumina high-throughput platform. Using this approach, the presence of over eighty bacterial genera was determined, many of which have been detected for the first time during coffee bean fermentation, including Fructobacillus, Pseudonocardia, Pedobacter, Sphingomonas and Hymenobacter. The presence of Fructobacillus suggests an influence of these bacteria on fructose metabolism during coffee fermentation. Temporal analysis showed a strong dominance of lactic acid bacteria with over 97% of read sequences at the end of fermentation, mainly represented by the Leuconostoc and Lactococcus. Metabolism of lactic acid bacteria was associated with the high formation of lactic acid during fermentation, as determined by HPLC analysis. The results reported in this study confirm the underestimation of bacterial diversity associated with coffee fermentation. New microbial groups reported in this study may be explored as functional starter cultures for on-farm coffee processing.
A Multicenter Study To Evaluate the Performance of High-Throughput Sequencing for Virus Detection

PubMed Central

Ng, Siemon H. S.; Vandeputte, Olivier; Aljanahi, Aisha; Deyati, Avisek; Cassart, Jean-Pol; Charlebois, Robert L.; Taliaferro, Lanyn P.

2017-01-01

ABSTRACT The capability of high-throughput sequencing (HTS) for detection of known and unknown viruses makes it a powerful tool for broad microbial investigations, such as evaluation of novel cell substrates that may be used for the development of new biological products. However, like any new assay, regulatory applications of HTS need method standardization. Therefore, our three laboratories initiated a study to evaluate performance of HTS for potential detection of viral adventitious agents by spiking model viruses in different cellular matrices to mimic putative materials for manufacturing of biologics. Four model viruses were selected based upon different physical and biochemical properties and commercial availability: human respiratory syncytial virus (RSV), Epstein-Barr virus (EBV), feline leukemia virus (FeLV), and human reovirus (REO). Additionally, porcine circovirus (PCV) was tested by one laboratory. Independent samples were prepared for HTS by spiking intact viruses or extracted viral nucleic acids, singly or mixed, into different HeLa cell matrices (resuspended whole cells, cell lysate, or total cellular RNA). Data were obtained using different sequencing platforms (Roche 454, Illumina HiSeq1500 or HiSeq2500). Bioinformatic analyses were performed independently by each laboratory using available tools, pipelines, and databases. The results showed that comparable virus detection was obtained in the three laboratories regardless of sample processing, library preparation, sequencing platform, and bioinformatic analysis: between 0.1 and 3 viral genome copies per cell were detected for all of the model viruses used. This study highlights the potential for using HTS for sensitive detection of adventitious viruses in complex biological samples containing cellular background. IMPORTANCE Recent high-throughput sequencing (HTS) investigations have resulted in unexpected discoveries of known and novel viruses in a variety of sample types, including research materials, clinical materials, and biological products. Therefore, HTS can be a powerful tool for supplementing current methods for demonstrating the absence of adventitious or unwanted viruses in biological products, particularly when using a new cell line. However, HTS is a complex technology with different platforms, which needs standardization for evaluation of biologics. This collaborative study was undertaken to investigate detection of different virus types using two different HTS platforms. The results of the independently performed studies demonstrated a similar sensitivity of virus detection, regardless of the different sample preparation and processing procedures and bioinformatic analyses done in the three laboratories. Comparable HTS detection of different virus types supports future development of reference virus materials for standardization and validation of different HTS platforms. PMID:28932815
High-throughput sequencing of complete human mtDNA genomes from the Caucasus and West Asia: high diversity and demographic inferences.

PubMed

Schönberg, Anna; Theunert, Christoph; Li, Mingkun; Stoneking, Mark; Nasidze, Ivan

2011-09-01

To investigate the demographic history of human populations from the Caucasus and surrounding regions, we used high-throughput sequencing to generate 147 complete mtDNA genome sequences from random samples of individuals from three groups from the Caucasus (Armenians, Azeri and Georgians), and one group each from Iran and Turkey. Overall diversity is very high, with 144 different sequences that fall into 97 different haplogroups found among the 147 individuals. Bayesian skyline plots (BSPs) of population size change through time show a population expansion around 40-50 kya, followed by a constant population size, and then another expansion around 15-18 kya for the groups from the Caucasus and Iran. The BSP for Turkey differs the most from the others, with an increase from 35 to 50 kya followed by a prolonged period of constant population size, and no indication of a second period of growth. An approximate Bayesian computation approach was used to estimate divergence times between each pair of populations; the oldest divergence times were between Turkey and the other four groups from the South Caucasus and Iran (~400-600 generations), while the divergence time of the three Caucasus groups from each other was comparable to their divergence time from Iran (average of ~360 generations). These results illustrate the value of random sampling of complete mtDNA genome sequences that can be obtained with high-throughput sequencing platforms.
The fast changing landscape of sequencing technologies and their impact on microbial genome assemblies and annotation.

PubMed

Mavromatis, Konstantinos; Land, Miriam L; Brettin, Thomas S; Quest, Daniel J; Copeland, Alex; Clum, Alicia; Goodwin, Lynne; Woyke, Tanja; Lapidus, Alla; Klenk, Hans Peter; Cottingham, Robert W; Kyrpides, Nikos C

2012-01-01

The emergence of next generation sequencing (NGS) has provided the means for rapid and high throughput sequencing and data generation at low cost, while concomitantly creating a new set of challenges. The number of available assembled microbial genomes continues to grow rapidly and their quality reflects the quality of the sequencing technology used, but also of the analysis software employed for assembly and annotation. In this work, we have explored the quality of the microbial draft genomes across various sequencing technologies. We have compared the draft and finished assemblies of 133 microbial genomes sequenced at the Department of Energy-Joint Genome Institute and finished at the Los Alamos National Laboratory using a variety of combinations of sequencing technologies, reflecting the transition of the institute from Sanger-based sequencing platforms to NGS platforms. The quality of the public assemblies and of the associated gene annotations was evaluated using various metrics. Results obtained with the different sequencing technologies, as well as their effects on downstream processes, were analyzed. Our results demonstrate that the Illumina HiSeq 2000 sequencing system, the primary sequencing technology currently used for de novo genome sequencing and assembly at JGI, has various advantages in terms of total sequence throughput and cost, but it also introduces challenges for the downstream analyses. In all cases assembly results although on average are of high quality, need to be viewed critically and consider sources of errors in them prior to analysis. These data follow the evolution of microbial sequencing and downstream processing at the JGI from draft genome sequences with large gaps corresponding to missing genes of significant biological role to assemblies with multiple small gaps (Illumina) and finally to assemblies that generate almost complete genomes (Illumina+PacBio).
GlycoExtractor: a web-based interface for high throughput processing of HPLC-glycan data.

PubMed

Artemenko, Natalia V; Campbell, Matthew P; Rudd, Pauline M

2010-04-05

Recently, an automated high-throughput HPLC platform has been developed that can be used to fully sequence and quantify low concentrations of N-linked sugars released from glycoproteins, supported by an experimental database (GlycoBase) and analytical tools (autoGU). However, commercial packages that support the operation of HPLC instruments and data storage lack platforms for the extraction of large volumes of data. The lack of resources and agreed formats in glycomics is now a major limiting factor that restricts the development of bioinformatic tools and automated workflows for high-throughput HPLC data analysis. GlycoExtractor is a web-based tool that interfaces with a commercial HPLC database/software solution to facilitate the extraction of large volumes of processed glycan profile data (peak number, peak areas, and glucose unit values). The tool allows the user to export a series of sample sets to a set of file formats (XML, JSON, and CSV) rather than a collection of disconnected files. This approach not only reduces the amount of manual refinement required to export data into a suitable format for data analysis but also opens the field to new approaches for high-throughput data interpretation and storage, including biomarker discovery and validation and monitoring of online bioprocessing conditions for next generation biotherapeutics.
Comparative Genomics of Bacillus species and its Relevance in Industrial Microbiology.

PubMed

Sharma, Archana; Satyanarayana, T

2013-01-01

With the advent of high throughput sequencing platforms and relevant analytical tools, the rate of microbial genome sequencing has accelerated which has in turn led to better understanding of microbial molecular biology and genetics. The complete genome sequences of important industrial organisms provide opportunities for human health, industry, and the environment. Bacillus species are the dominant workhorses in industrial fermentations. Today, genome sequences of several Bacillus species are available, and comparative genomics of this genus helps in understanding their physiology, biochemistry, and genetics. The genomes of these bacterial species are the sources of many industrially important enzymes and antibiotics and, therefore, provide an opportunity to tailor enzymes with desired properties to suit a wide range of applications. A comparative account of strengths and weaknesses of the different sequencing platforms are also highlighted in the review.
High-throughput screening based on label-free detection of small molecule microarrays

NASA Astrophysics Data System (ADS)

Zhu, Chenggang; Fei, Yiyan; Zhu, Xiangdong

2017-02-01

Based on small-molecule microarrays (SMMs) and oblique-incidence reflectivity difference (OI-RD) scanner, we have developed a novel high-throughput drug preliminary screening platform based on label-free monitoring of direct interactions between target proteins and immobilized small molecules. The screening platform is especially attractive for screening compounds against targets of unknown function and/or structure that are not compatible with functional assay development. In this screening platform, OI-RD scanner serves as a label-free detection instrument which is able to monitor about 15,000 biomolecular interactions in a single experiment without the need to label any biomolecule. Besides, SMMs serves as a novel format for high-throughput screening by immobilization of tens of thousands of different compounds on a single phenyl-isocyanate functionalized glass slide. Based on the high-throughput screening platform, we sequentially screened five target proteins (purified target proteins or cell lysate containing target protein) in high-throughput and label-free mode. We found hits for respective target protein and the inhibition effects for some hits were confirmed by following functional assays. Compared to traditional high-throughput screening assay, the novel high-throughput screening platform has many advantages, including minimal sample consumption, minimal distortion of interactions through label-free detection, multi-target screening analysis, which has a great potential to be a complementary screening platform in the field of drug discovery.

Coverage Bias and Sensitivity of Variant Calling for Four Whole-genome Sequencing Technologies

PubMed Central

Lasitschka, Bärbel; Jones, David; Northcott, Paul; Hutter, Barbara; Jäger, Natalie; Kool, Marcel; Taylor, Michael; Lichter, Peter; Pfister, Stefan; Wolf, Stephan; Brors, Benedikt; Eils, Roland

2013-01-01

The emergence of high-throughput, next-generation sequencing technologies has dramatically altered the way we assess genomes in population genetics and in cancer genomics. Currently, there are four commonly used whole-genome sequencing platforms on the market: Illumina’s HiSeq2000, Life Technologies’ SOLiD 4 and its completely redesigned 5500xl SOLiD, and Complete Genomics’ technology. A number of earlier studies have compared a subset of those sequencing platforms or compared those platforms with Sanger sequencing, which is prohibitively expensive for whole genome studies. Here we present a detailed comparison of the performance of all currently available whole genome sequencing platforms, especially regarding their ability to call SNVs and to evenly cover the genome and specific genomic regions. Unlike earlier studies, we base our comparison on four different samples, allowing us to assess the between-sample variation of the platforms. We find a pronounced GC bias in GC-rich regions for Life Technologies’ platforms, with Complete Genomics performing best here, while we see the least bias in GC-poor regions for HiSeq2000 and 5500xl. HiSeq2000 gives the most uniform coverage and displays the least sample-to-sample variation. In contrast, Complete Genomics exhibits by far the smallest fraction of bases not covered, while the SOLiD platforms reveal remarkable shortcomings, especially in covering CpG islands. When comparing the performance of the four platforms for calling SNPs, HiSeq2000 and Complete Genomics achieve the highest sensitivity, while the SOLiD platforms show the lowest false positive rate. Finally, we find that integrating sequencing data from different platforms offers the potential to combine the strengths of different technologies. In summary, our results detail the strengths and weaknesses of all four whole-genome sequencing platforms. It indicates application areas that call for a specific sequencing platform and disallow other platforms. This helps to identify the proper sequencing platform for whole genome studies with different application scopes. PMID:23776689
XPAT: a toolkit to conduct cross-platform association studies with heterogeneous sequencing datasets.

PubMed

Yu, Yao; Hu, Hao; Bohlender, Ryan J; Hu, Fulan; Chen, Jiun-Sheng; Holt, Carson; Fowler, Jerry; Guthery, Stephen L; Scheet, Paul; Hildebrandt, Michelle A T; Yandell, Mark; Huff, Chad D

2018-04-06

High-throughput sequencing data are increasingly being made available to the research community for secondary analyses, providing new opportunities for large-scale association studies. However, heterogeneity in target capture and sequencing technologies often introduce strong technological stratification biases that overwhelm subtle signals of association in studies of complex traits. Here, we introduce the Cross-Platform Association Toolkit, XPAT, which provides a suite of tools designed to support and conduct large-scale association studies with heterogeneous sequencing datasets. XPAT includes tools to support cross-platform aware variant calling, quality control filtering, gene-based association testing and rare variant effect size estimation. To evaluate the performance of XPAT, we conducted case-control association studies for three diseases, including 783 breast cancer cases, 272 ovarian cancer cases, 205 Crohn disease cases and 3507 shared controls (including 1722 females) using sequencing data from multiple sources. XPAT greatly reduced Type I error inflation in the case-control analyses, while replicating many previously identified disease-gene associations. We also show that association tests conducted with XPAT using cross-platform data have comparable performance to tests using matched platform data. XPAT enables new association studies that combine existing sequencing datasets to identify genetic loci associated with common diseases and other complex traits.
Next Generation Sequencing Technologies: The Doorway to the Unexplored Genomics of Non-Model Plants

PubMed Central

Unamba, Chibuikem I. N.; Nag, Akshay; Sharma, Ram K.

2015-01-01

Non-model plants i.e., the species which have one or all of the characters such as long life cycle, difficulty to grow in the laboratory or poor fecundity, have been schemed out of sequencing projects earlier, due to high running cost of Sanger sequencing. Consequently, the information about their genomics and key biological processes are inadequate. However, the advent of fast and cost effective next generation sequencing (NGS) platforms in the recent past has enabled the unearthing of certain characteristic gene structures unique to these species. It has also aided in gaining insight about mechanisms underlying processes of gene expression and secondary metabolism as well as facilitated development of genomic resources for diversity characterization, evolutionary analysis and marker assisted breeding even without prior availability of genomic sequence information. In this review we explore how different Next Gen Sequencing platforms, as well as recent advances in NGS based high throughput genotyping technologies are rewarding efforts on de-novo whole genome/transcriptome sequencing, development of genome wide sequence based markers resources for improvement of non-model crops that are less costly than phenotyping. PMID:26734016
Progress in ion torrent semiconductor chip based sequencing.

PubMed

Merriman, Barry; Rothberg, Jonathan M

2012-12-01

In order for next-generation sequencing to become widely used as a diagnostic in the healthcare industry, sequencing instrumentation will need to be mass produced with a high degree of quality and economy. One way to achieve this is to recast DNA sequencing in a format that fully leverages the manufacturing base created for computer chips, complementary metal-oxide semiconductor chip fabrication, which is the current pinnacle of large scale, high quality, low-cost manufacturing of high technology. To achieve this, ideally the entire sensory apparatus of the sequencer would be embodied in a standard semiconductor chip, manufactured in the same fab facilities used for logic and memory chips. Recently, such a sequencing chip, and the associated sequencing platform, has been developed and commercialized by Ion Torrent, a division of Life Technologies, Inc. Here we provide an overview of this semiconductor chip based sequencing technology, and summarize the progress made since its commercial introduction. We described in detail the progress in chip scaling, sequencing throughput, read length, and accuracy. We also summarize the enhancements in the associated platform, including sample preparation, data processing, and engagement of the broader development community through open source and crowdsourcing initiatives. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
High throughput 16SrRNA gene sequencing reveals the correlation between Propionibacterium acnes and sarcoidosis.

PubMed

Zhao, Meng-Meng; Du, Shan-Shan; Li, Qiu-Hong; Chen, Tao; Qiu, Hui; Wu, Qin; Chen, Shan-Shan; Zhou, Ying; Zhang, Yuan; Hu, Yang; Su, Yi-Liang; Shen, Li; Zhang, Fen; Weng, Dong; Li, Hui-Ping

2017-02-01

This study aims to use high throughput 16SrRNA gene sequencing to examine the bacterial profile of lymph node biopsy samples of patients with sarcoidosis and to further verify the association between Propionibacterium acnes (P. acnes) and sarcoidosis. A total of 36 mediastinal lymph node biopsy specimens were collected from 17 cases of sarcoidosis, 8 tuberculosis (TB group), and 11 non-infectious lung diseases (control group). The V4 region of the bacterial 16SrRNA gene in the specimens was amplified and sequenced using the high throughput sequencing platform MiSeq, and bacterial profile was established. The data analysis software QIIME and Metastats were used to compare bacterial relative abundance in the three patient groups. Overall, 545 genera were identified; 38 showed significantly lower and 29 had significantly higher relative abundance in the sarcoidosis group than in the TB and control groups (P < 0.01). P. acnes 16SrRNA was exclusively found in all the 17 samples of the sarcoidosis group, whereas was not detected in the TB and control groups. The relative abundance of P. acnes in the sarcoidosis group (0.16% ± 0. 11%) was significantly higher than that in the TB (Metastats analysis: P = 0.0010, q = 0.0044) and control groups (Metastats analysis: P = 0.0010, q = 0.0038). The relative abundance of P. granulosum was only 0.0022% ± 0. 0044% in the sarcoidosis group. P. granulosum 16SrRNA was not detected in the other two groups. High throughput 16SrRNA gene sequencing appears to be a useful tool to investigate the bacterial profile of sarcoidosis specimens. The results suggest that P. acnes may be involved in sarcoidosis development.
Toward a mtDNA locus-specific mutation database using the LOVD platform.

PubMed

Elson, Joanna L; Sweeney, Mary G; Procaccio, Vincent; Yarham, John W; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H; Pitceathly, Robert D S; Thorburn, David R; Lott, Marie T; Wallace, Douglas C; Taylor, Robert W; McFarland, Robert

2012-09-01

The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. © 2012 Wiley Periodicals, Inc.
Toward a mtDNA Locus-Specific Mutation Database Using the LOVD Platform

PubMed Central

Elson, Joanna L.; Sweeney, Mary G.; Procaccio, Vincent; Yarham, John W.; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H.; Pitceathly, Robert D.S.; Thorburn, David R.; Lott, Marie T.; Wallace, Douglas C.; Taylor, Robert W.; McFarland, Robert

2015-01-01

The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. PMID:22581690
The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing.

PubMed

Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P; Panitz, Frank; Bendixen, Christian; Nielsen, Rasmus; Willerslev, Eske

2007-02-14

The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5'tag-analysis. We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (miss-assignment rate<0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regards to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5'primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products. The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics.
Evaluation of the Bacterial Diversity in the Human Tongue Coating Based on Genus-Specific Primers for 16S rRNA Sequencing.

PubMed

Sun, Beili; Zhou, Dongrui; Tu, Jing; Lu, Zuhong

2017-01-01

The characteristics of tongue coating are very important symbols for disease diagnosis in traditional Chinese medicine (TCM) theory. As a habitat of oral microbiota, bacteria on the tongue dorsum have been proved to be the cause of many oral diseases. The high-throughput next-generation sequencing (NGS) platforms have been widely applied in the analysis of bacterial 16S rRNA gene. We developed a methodology based on genus-specific multiprimer amplification and ligation-based sequencing for microbiota analysis. In order to validate the efficiency of the approach, we thoroughly analyzed six tongue coating samples from lung cancer patients with different TCM types, and more than 600 genera of bacteria were detected by this platform. The results showed that ligation-based parallel sequencing combined with enzyme digestion and multiamplification could expand the effective length of sequencing reads and could be applied in the microbiota analysis.
Multiplex amplification of large sets of human exons.

PubMed

Porreca, Gregory J; Zhang, Kun; Li, Jin Billy; Xie, Bin; Austin, Derek; Vassallo, Sara L; LeProust, Emily M; Peck, Bill J; Emig, Christopher J; Dahl, Fredrik; Gao, Yuan; Church, George M; Shendure, Jay

2007-11-01

A new generation of technologies is poised to reduce DNA sequencing costs by several orders of magnitude. But our ability to fully leverage the power of these technologies is crippled by the absence of suitable 'front-end' methods for isolating complex subsets of a mammalian genome at a scale that matches the throughput at which these platforms will routinely operate. We show that targeting oligonucleotides released from programmable microarrays can be used to capture and amplify approximately 10,000 human exons in a single multiplex reaction. Additionally, we show integration of this protocol with ultra-high-throughput sequencing for targeted variation discovery. Although the multiplex capture reaction is highly specific, we found that nonuniform capture is a key issue that will need to be resolved by additional optimization. We anticipate that highly multiplexed methods for targeted amplification will enable the comprehensive resequencing of human exons at a fraction of the cost of whole-genome resequencing.
Inertial-ordering-assisted droplet microfluidics for high-throughput single-cell RNA-sequencing.

PubMed

Moon, Hui-Sung; Je, Kwanghwi; Min, Jae-Woong; Park, Donghyun; Han, Kyung-Yeon; Shin, Seung-Ho; Park, Woong-Yang; Yoo, Chang Eun; Kim, Shin-Hyun

2018-02-27

Single-cell RNA-seq reveals the cellular heterogeneity inherent in the population of cells, which is very important in many clinical and research applications. Recent advances in droplet microfluidics have achieved the automatic isolation, lysis, and labeling of single cells in droplet compartments without complex instrumentation. However, barcoding errors occurring in the cell encapsulation process because of the multiple-beads-in-droplet and insufficient throughput because of the low concentration of beads for avoiding multiple-beads-in-a-droplet remain important challenges for precise and efficient expression profiling of single cells. In this study, we developed a new droplet-based microfluidic platform that significantly improved the throughput while reducing barcoding errors through deterministic encapsulation of inertially ordered beads. Highly concentrated beads containing oligonucleotide barcodes were spontaneously ordered in a spiral channel by an inertial effect, which were in turn encapsulated in droplets one-by-one, while cells were simultaneously encapsulated in the droplets. The deterministic encapsulation of beads resulted in a high fraction of single-bead-in-a-droplet and rare multiple-beads-in-a-droplet although the bead concentration increased to 1000 μl -1 , which diminished barcoding errors and enabled accurate high-throughput barcoding. We successfully validated our device with single-cell RNA-seq. In addition, we found that multiple-beads-in-a-droplet, generated using a normal Drop-Seq device with a high concentration of beads, underestimated transcript numbers and overestimated cell numbers. This accurate high-throughput platform can expand the capability and practicality of Drop-Seq in single-cell analysis.
Automatic and integrated micro-enzyme assay (AIμEA) platform for highly sensitive thrombin analysis via an engineered fluorescence protein-functionalized monolithic capillary column.

PubMed

Lin, Lihua; Liu, Shengquan; Nie, Zhou; Chen, Yingzhuang; Lei, Chunyang; Wang, Zhen; Yin, Chao; Hu, Huiping; Huang, Yan; Yao, Shouzhuo

2015-04-21

Nowadays, large-scale screening for enzyme discovery, engineering, and drug discovery processes require simple, fast, and sensitive enzyme activity assay platforms with high integration and potential for high-throughput detection. Herein, a novel automatic and integrated micro-enzyme assay (AIμEA) platform was proposed based on a unique microreaction system fabricated by a engineered green fluorescence protein (GFP)-functionalized monolithic capillary column, with thrombin as an example. The recombinant GFP probe was rationally engineered to possess a His-tag and a substrate sequence of thrombin, which enable it to be immobilized on the monolith via metal affinity binding, and to be released after thrombin digestion. Combined with capillary electrophoresis-laser-induced fluorescence (CE-LIF), all the procedures, including thrombin injection, online enzymatic digestion in the microreaction system, and label-free detection of the released GFP, were integrated in a single electrophoretic process. By taking advantage of the ultrahigh loading capacity of the AIμEA platform and the CE automatic programming setup, one microreaction column was sufficient for many times digestion without replacement. The novel microreaction system showed significantly enhanced catalytic efficiency, about 30 fold higher than that of the equivalent bulk reaction. Accordingly, the AIμEA platform was highly sensitive with a limit of detection down to 1 pM of thrombin. Moreover, the AIμEA platform was robust and reliable to detect thrombin in human serum samples and its inhibition by hirudin. Hence, this AIμEA platform exhibits great potential for high-throughput analysis in future biological application, disease diagnostics, and drug screening.
An automated Genomes-to-Natural Products platform (GNP) for the discovery of modular natural products.

PubMed

Johnston, Chad W; Skinnider, Michael A; Wyatt, Morgan A; Li, Xiang; Ranieri, Michael R M; Yang, Lian; Zechel, David L; Ma, Bin; Magarvey, Nathan A

2015-09-28

Bacterial natural products are a diverse and valuable group of small molecules, and genome sequencing indicates that the vast majority remain undiscovered. The prediction of natural product structures from biosynthetic assembly lines can facilitate their discovery, but highly automated, accurate, and integrated systems are required to mine the broad spectrum of sequenced bacterial genomes. Here we present a genome-guided natural products discovery tool to automatically predict, combinatorialize and identify polyketides and nonribosomal peptides from biosynthetic assembly lines using LC-MS/MS data of crude extracts in a high-throughput manner. We detail the directed identification and isolation of six genetically predicted polyketides and nonribosomal peptides using our Genome-to-Natural Products platform. This highly automated, user-friendly programme provides a means of realizing the potential of genetically encoded natural products.
High-throughput automated microfluidic sample preparation for accurate microbial genomics

PubMed Central

Kim, Soohong; De Jonghe, Joachim; Kulesa, Anthony B.; Feldman, David; Vatanen, Tommi; Bhattacharyya, Roby P.; Berdy, Brittany; Gomez, James; Nolan, Jill; Epstein, Slava; Blainey, Paul C.

2017-01-01

Low-cost shotgun DNA sequencing is transforming the microbial sciences. Sequencing instruments are so effective that sample preparation is now the key limiting factor. Here, we introduce a microfluidic sample preparation platform that integrates the key steps in cells to sequence library sample preparation for up to 96 samples and reduces DNA input requirements 100-fold while maintaining or improving data quality. The general-purpose microarchitecture we demonstrate supports workflows with arbitrary numbers of reaction and clean-up or capture steps. By reducing the sample quantity requirements, we enabled low-input (∼10,000 cells) whole-genome shotgun (WGS) sequencing of Mycobacterium tuberculosis and soil micro-colonies with superior results. We also leveraged the enhanced throughput to sequence ∼400 clinical Pseudomonas aeruginosa libraries and demonstrate excellent single-nucleotide polymorphism detection performance that explained phenotypically observed antibiotic resistance. Fully-integrated lab-on-chip sample preparation overcomes technical barriers to enable broader deployment of genomics across many basic research and translational applications. PMID:28128213
A comparative analysis of high-throughput platforms for validation of a circulating microRNA signature in diabetic retinopathy.

PubMed

Farr, Ryan J; Januszewski, Andrzej S; Joglekar, Mugdha V; Liang, Helena; McAulley, Annie K; Hewitt, Alex W; Thomas, Helen E; Loudovaris, Tom; Kay, Thomas W H; Jenkins, Alicia; Hardikar, Anandwardhan A

2015-06-02

MicroRNAs are now increasingly recognized as biomarkers of disease progression. Several quantitative real-time PCR (qPCR) platforms have been developed to determine the relative levels of microRNAs in biological fluids. We systematically compared the detection of cellular and circulating microRNA using a standard 96-well platform, a high-content microfluidics platform and two ultra-high content platforms. We used extensive analytical tools to compute inter- and intra-run variability and concordance measured using fidelity scoring, coefficient of variation and cluster analysis. We carried out unprejudiced next generation sequencing to identify a microRNA signature for Diabetic Retinopathy (DR) and systematically assessed the validation of this signature on clinical samples using each of the above four qPCR platforms. The results indicate that sensitivity to measure low copy number microRNAs is inversely related to qPCR reaction volume and that the choice of platform for microRNA biomarker validation should be made based on the abundance of miRNAs of interest.
Diversity Arrays Technology (DArT) Marker Platforms for Diversity Analysis and Linkage Mapping in a Complex Crop, the Octoploid Cultivated Strawberry (Fragaria × ananassa)

PubMed Central

Sánchez-Sevilla, José F.; Horvath, Aniko; Botella, Miguel A.; Gaston, Amèlia; Folta, Kevin; Kilian, Andrzej; Denoyes, Beatrice; Amaya, Iraida

2015-01-01

Cultivated strawberry (Fragaria × ananassa) is a genetically complex allo-octoploid crop with 28 pairs of chromosomes (2n = 8x = 56) for which a genome sequence is not yet available. The diploid Fragaria vesca is considered the donor species of one of the octoploid sub-genomes and its available genome sequence can be used as a reference for genomic studies. A wide number of strawberry cultivars are stored in ex situ germplasm collections world-wide but a number of previous studies have addressed the genetic diversity present within a limited number of these collections. Here, we report the development and application of two platforms based on the implementation of Diversity Array Technology (DArT) markers for high-throughput genotyping in strawberry. The first DArT microarray was used to evaluate the genetic diversity of 62 strawberry cultivars that represent a wide range of variation based on phenotype, geographical and temporal origin and pedigrees. A total of 603 DArT markers were used to evaluate the diversity and structure of the population and their cluster analyses revealed that these markers were highly efficient in classifying the accessions in groups based on historical, geographical and pedigree-based cues. The second DArTseq platform took benefit of the complexity reduction method optimized for strawberry and the development of next generation sequencing technologies. The strawberry DArTseq was used to generate a total of 9,386 SNP markers in the previously developed ‘232’ × ‘1392’ mapping population, of which, 4,242 high quality markers were further selected to saturate this map after several filtering steps. The high-throughput platforms here developed for genotyping strawberry will facilitate genome-wide characterizations of large accessions sets and complement other available options. PMID:26675207
Vidjil: A Web Platform for Analysis of High-Throughput Repertoire Sequencing.

PubMed

Duez, Marc; Giraud, Mathieu; Herbert, Ryan; Rocher, Tatiana; Salson, Mikaël; Thonier, Florian

2016-01-01

The B and T lymphocytes are white blood cells playing a key role in the adaptive immunity. A part of their DNA, called the V(D)J recombinations, is specific to each lymphocyte, and enables recognition of specific antigenes. Today, with new sequencing techniques, one can get billions of DNA sequences from these regions. With dedicated Repertoire Sequencing (RepSeq) methods, it is now possible to picture population of lymphocytes, and to monitor more accurately the immune response as well as pathologies such as leukemia. Vidjil is an open-source platform for the interactive analysis of high-throughput sequencing data from lymphocyte recombinations. It contains an algorithm gathering reads into clonotypes according to their V(D)J junctions, a web application made of a sample, experiment and patient database and a visualization for the analysis of clonotypes along the time. Vidjil is implemented in C++, Python and Javascript and licensed under the GPLv3 open-source license. Source code, binaries and a public web server are available at http://www.vidjil.org and at http://bioinfo.lille.inria.fr/vidjil. Using the Vidjil web application consists of four steps: 1. uploading a raw sequence file (typically a FASTQ); 2. running RepSeq analysis software; 3. visualizing the results; 4. annotating the results and saving them for future use. For the end-user, the Vidjil web application needs no specific installation and just requires a connection and a modern web browser. Vidjil is used by labs in hematology or immunology for research and clinical applications.
Vidjil: A Web Platform for Analysis of High-Throughput Repertoire Sequencing

PubMed Central

Duez, Marc; Herbert, Ryan; Rocher, Tatiana; Salson, Mikaël; Thonier, Florian

2016-01-01

Background The B and T lymphocytes are white blood cells playing a key role in the adaptive immunity. A part of their DNA, called the V(D)J recombinations, is specific to each lymphocyte, and enables recognition of specific antigenes. Today, with new sequencing techniques, one can get billions of DNA sequences from these regions. With dedicated Repertoire Sequencing (RepSeq) methods, it is now possible to picture population of lymphocytes, and to monitor more accurately the immune response as well as pathologies such as leukemia. Methods and Results Vidjil is an open-source platform for the interactive analysis of high-throughput sequencing data from lymphocyte recombinations. It contains an algorithm gathering reads into clonotypes according to their V(D)J junctions, a web application made of a sample, experiment and patient database and a visualization for the analysis of clonotypes along the time. Vidjil is implemented in C++, Python and Javascript and licensed under the GPLv3 open-source license. Source code, binaries and a public web server are available at http://www.vidjil.org and at http://bioinfo.lille.inria.fr/vidjil. Using the Vidjil web application consists of four steps: 1. uploading a raw sequence file (typically a FASTQ); 2. running RepSeq analysis software; 3. visualizing the results; 4. annotating the results and saving them for future use. For the end-user, the Vidjil web application needs no specific installation and just requires a connection and a modern web browser. Vidjil is used by labs in hematology or immunology for research and clinical applications. PMID:27835690
The ChIP-exo Method: Identifying Protein-DNA Interactions with Near Base Pair Precision.

PubMed

Perreault, Andrea A; Venters, Bryan J

2016-12-23

Chromatin immunoprecipitation (ChIP) is an indispensable tool in the fields of epigenetics and gene regulation that isolates specific protein-DNA interactions. ChIP coupled to high throughput sequencing (ChIP-seq) is commonly used to determine the genomic location of proteins that interact with chromatin. However, ChIP-seq is hampered by relatively low mapping resolution of several hundred base pairs and high background signal. The ChIP-exo method is a refined version of ChIP-seq that substantially improves upon both resolution and noise. The key distinction of the ChIP-exo methodology is the incorporation of lambda exonuclease digestion in the library preparation workflow to effectively footprint the left and right 5' DNA borders of the protein-DNA crosslink site. The ChIP-exo libraries are then subjected to high throughput sequencing. The resulting data can be leveraged to provide unique and ultra-high resolution insights into the functional organization of the genome. Here, we describe the ChIP-exo method that we have optimized and streamlined for mammalian systems and next-generation sequencing-by-synthesis platform.
Single nucleotide polymorphism (SNP) discovery in rainbow trout using restriction site associated DNA (RAD) sequencing of doubled haploids and assessment of polymorphism in a population survey

USDA-ARS?s Scientific Manuscript database

Background: Our goal is to produce a high-throughput SNP genotyping platform for genomic analyses in rainbow trout that will enable fine mapping of QTL, whole genome association studies, genomic selection for improved aquaculture production traits, and genetic analyses of wild populations that aid ...

Development and preliminary evaluation of a 90K Axiom® SNP array for the allo-octoploid cultivated strawberry Fragaria ×ananassa

USDA-ARS?s Scientific Manuscript database

A high-throughput genotyping platform is needed to enable marker-assisted breeding in the allo-octoploid cultivated strawberry Fragaria ×ananassa. Short-read sequences from one diploid and 19 octoploid accessions were aligned to the diploid Fragaria vesca ‘Hawaii 4’ reference genome to identify sing...
[Study on Microbial Diversity of Peri-implantitis Subgingival by High-throughput Sequencing].

PubMed

Li, Zhi-jie; Wang, Shao-guo; Li, Yue-hong; Tu, Dong-xiang; Liu, Shi-yun; Nie, Hong-bing; Li, Zhi-qiang; Zhang, Ju-mei

2015-07-01

To study microbial diversity of peri-implantitis subgingival with high-throughput sequencing, and investigate microbiological etiology of peri-implantitis. Subgingival plaques were sampled from the patients with peri-implantitis (D group) and non-peri-implantitis subjects (N group). The microbiological diversity of the subgingival plaques was detected by sequencing V4 region of 16S rRNA with Illumina Miseq platform. The diversity of the community structure was analyzed using Mothur software. A total of 156 507 gene sequences were detected in nine samples and 4 402 operational taxonomic units (OTUs) were found. Selenomonas, Pseudomonas, and Fusobacterium were dominant bacteria in D group, while Fusobacterium, Veillonella and Streptococcus were dominant bacteria in N group. Differences between peri-implantitis and non-peri-implantitis bacterial communities were observed at all phylogenetic levels by LEfSe, which was also found in PcoA test. The occurrence of peri-implantitis is not only related to periodontitis pathogenic microbe, but also related with the changes of oral microbial community structure. Treponema, Herbaspirillum, Butyricimonas and Phaeobacte may be closely related to the occurrence and development of peri-implantitis.
Massively parallel whole genome amplification for single-cell sequencing using droplet microfluidics.

PubMed

Hosokawa, Masahito; Nishikawa, Yohei; Kogawa, Masato; Takeyama, Haruko

2017-07-12

Massively parallel single-cell genome sequencing is required to further understand genetic diversities in complex biological systems. Whole genome amplification (WGA) is the first step for single-cell sequencing, but its throughput and accuracy are insufficient in conventional reaction platforms. Here, we introduce single droplet multiple displacement amplification (sd-MDA), a method that enables massively parallel amplification of single cell genomes while maintaining sequence accuracy and specificity. Tens of thousands of single cells are compartmentalized in millions of picoliter droplets and then subjected to lysis and WGA by passive droplet fusion in microfluidic channels. Because single cells are isolated in compartments, their genomes are amplified to saturation without contamination. This enables the high-throughput acquisition of contamination-free and cell specific sequence reads from single cells (21,000 single-cells/h), resulting in enhancement of the sequence data quality compared to conventional methods. This method allowed WGA of both single bacterial cells and human cancer cells. The obtained sequencing coverage rivals those of conventional techniques with superior sequence quality. In addition, we also demonstrate de novo assembly of uncultured soil bacteria and obtain draft genomes from single cell sequencing. This sd-MDA is promising for flexible and scalable use in single-cell sequencing.
Large-scale microfluidics providing high-resolution and high-throughput screening of Caenorhabditis elegans poly-glutamine aggregation model

NASA Astrophysics Data System (ADS)

Mondal, Sudip; Hegarty, Evan; Martin, Chris; Gökçe, Sertan Kutal; Ghorashian, Navid; Ben-Yakar, Adela

2016-10-01

Next generation drug screening could benefit greatly from in vivo studies, using small animal models such as Caenorhabditis elegans for hit identification and lead optimization. Current in vivo assays can operate either at low throughput with high resolution or with low resolution at high throughput. To enable both high-throughput and high-resolution imaging of C. elegans, we developed an automated microfluidic platform. This platform can image 15 z-stacks of ~4,000 C. elegans from 96 different populations using a large-scale chip with a micron resolution in 16 min. Using this platform, we screened ~100,000 animals of the poly-glutamine aggregation model on 25 chips. We tested the efficacy of ~1,000 FDA-approved drugs in improving the aggregation phenotype of the model and identified four confirmed hits. This robust platform now enables high-content screening of various C. elegans disease models at the speed and cost of in vitro cell-based assays.
Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays

PubMed Central

Aryee, Martin J.; Jaffe, Andrew E.; Corrada-Bravo, Hector; Ladd-Acosta, Christine; Feinberg, Andrew P.; Hansen, Kasper D.; Irizarry, Rafael A.

2014-01-01

Motivation: The recently released Infinium HumanMethylation450 array (the ‘450k’ array) provides a high-throughput assay to quantify DNA methylation (DNAm) at ∼450 000 loci across a range of genomic features. Although less comprehensive than high-throughput sequencing-based techniques, this product is more cost-effective and promises to be the most widely used DNAm high-throughput measurement technology over the next several years. Results: Here we describe a suite of computational tools that incorporate state-of-the-art statistical techniques for the analysis of DNAm data. The software is structured to easily adapt to future versions of the technology. We include methods for preprocessing, quality assessment and detection of differentially methylated regions from the kilobase to the megabase scale. We show how our software provides a powerful and flexible development platform for future methods. We also illustrate how our methods empower the technology to make discoveries previously thought to be possible only with sequencing-based methods. Availability and implementation: http://bioconductor.org/packages/release/bioc/html/minfi.html. Contact: khansen@jhsph.edu; rafa@jimmy.harvard.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24478339
Alignment of high-throughput sequencing data inside in-memory databases.

PubMed

Firnkorn, Daniel; Knaup-Gregori, Petra; Lorenzo Bermejo, Justo; Ganzinger, Matthias

2014-01-01

In times of high-throughput DNA sequencing techniques, performance-capable analysis of DNA sequences is of high importance. Computer supported DNA analysis is still an intensive time-consuming task. In this paper we explore the potential of a new In-Memory database technology by using SAP's High Performance Analytic Appliance (HANA). We focus on read alignment as one of the first steps in DNA sequence analysis. In particular, we examined the widely used Burrows-Wheeler Aligner (BWA) and implemented stored procedures in both, HANA and the free database system MySQL, to compare execution time and memory management. To ensure that the results are comparable, MySQL has been running in memory as well, utilizing its integrated memory engine for database table creation. We implemented stored procedures, containing exact and inexact searching of DNA reads within the reference genome GRCh37. Due to technical restrictions in SAP HANA concerning recursion, the inexact matching problem could not be implemented on this platform. Hence, performance analysis between HANA and MySQL was made by comparing the execution time of the exact search procedures. Here, HANA was approximately 27 times faster than MySQL which means, that there is a high potential within the new In-Memory concepts, leading to further developments of DNA analysis procedures in the future.
Pathogenic bacteria in sewage treatment plants as revealed by 454 pyrosequencing.

PubMed

Ye, Lin; Zhang, Tong

2011-09-01

This study applied 454 high-throughput pyrosequencing to analyze potentially pathogenic bacteria in activated sludge from 14 municipal wastewater treatment plants (WWTPs) across four countries (China, U.S., Canada, and Singapore), plus the influent and effluent of one of the 14 WWTPs. A total of 370,870 16S rRNA gene sequences with average length of 207 bps were obtained and all of them were assigned to corresponding taxonomic ranks by using RDP classifier and MEGAN. It was found that the most abundant potentially pathogenic bacteria in the WWTPs were affiliated with the genera of Aeromonas and Clostridium. Aeromonas veronii, Aeromonas hydrophila, and Clostridium perfringens were species most similar to the potentially pathogenic bacteria found in this study. Some sequences highly similar (>99%) to Corynebacterium diphtheriae were found in the influent and activated sludge samples from a saline WWTP. Overall, the percentage of the sequences closely related (>99%) to known pathogenic bacteria sequences was about 0.16% of the total sequences. Additionally, a platform-independent Java application (BAND) was developed for graphical visualization of the data of microbial abundance generated by high-throughput pyrosequencing. The approach demonstrated in this study could examine most of the potentially pathogenic bacteria simultaneously instead of one-by-one detection by other methods.
Compartmental genomics in living cells revealed by single-cell nanobiopsy.

PubMed

Actis, Paolo; Maalouf, Michelle M; Kim, Hyunsung John; Lohith, Akshar; Vilozny, Boaz; Seger, R Adam; Pourmand, Nader

2014-01-28

The ability to study the molecular biology of living single cells in heterogeneous cell populations is essential for next generation analysis of cellular circuitry and function. Here, we developed a single-cell nanobiopsy platform based on scanning ion conductance microscopy (SICM) for continuous sampling of intracellular content from individual cells. The nanobiopsy platform uses electrowetting within a nanopipette to extract cellular material from living cells with minimal disruption of the cellular milieu. We demonstrate the subcellular resolution of the nanobiopsy platform by isolating small subpopulations of mitochondria from single living cells, and quantify mutant mitochondrial genomes in those single cells with high throughput sequencing technology. These findings may provide the foundation for dynamic subcellular genomic analysis.
Improving Hierarchical Models Using Historical Data with Applications in High-Throughput Genomics Data Analysis.

PubMed

Li, Ben; Li, Yunxiao; Qin, Zhaohui S

2017-06-01

Modern high-throughput biotechnologies such as microarray and next generation sequencing produce a massive amount of information for each sample assayed. However, in a typical high-throughput experiment, only limited amount of data are observed for each individual feature, thus the classical 'large p , small n ' problem. Bayesian hierarchical model, capable of borrowing strength across features within the same dataset, has been recognized as an effective tool in analyzing such data. However, the shrinkage effect, the most prominent feature of hierarchical features, can lead to undesirable over-correction for some features. In this work, we discuss possible causes of the over-correction problem and propose several alternative solutions. Our strategy is rooted in the fact that in the Big Data era, large amount of historical data are available which should be taken advantage of. Our strategy presents a new framework to enhance the Bayesian hierarchical model. Through simulation and real data analysis, we demonstrated superior performance of the proposed strategy. Our new strategy also enables borrowing information across different platforms which could be extremely useful with emergence of new technologies and accumulation of data from different platforms in the Big Data era. Our method has been implemented in R package "adaptiveHM", which is freely available from https://github.com/benliemory/adaptiveHM.
Improving Hierarchical Models Using Historical Data with Applications in High-Throughput Genomics Data Analysis

PubMed Central

Li, Ben; Li, Yunxiao; Qin, Zhaohui S.

2016-01-01

Modern high-throughput biotechnologies such as microarray and next generation sequencing produce a massive amount of information for each sample assayed. However, in a typical high-throughput experiment, only limited amount of data are observed for each individual feature, thus the classical ‘large p, small n’ problem. Bayesian hierarchical model, capable of borrowing strength across features within the same dataset, has been recognized as an effective tool in analyzing such data. However, the shrinkage effect, the most prominent feature of hierarchical features, can lead to undesirable over-correction for some features. In this work, we discuss possible causes of the over-correction problem and propose several alternative solutions. Our strategy is rooted in the fact that in the Big Data era, large amount of historical data are available which should be taken advantage of. Our strategy presents a new framework to enhance the Bayesian hierarchical model. Through simulation and real data analysis, we demonstrated superior performance of the proposed strategy. Our new strategy also enables borrowing information across different platforms which could be extremely useful with emergence of new technologies and accumulation of data from different platforms in the Big Data era. Our method has been implemented in R package “adaptiveHM”, which is freely available from https://github.com/benliemory/adaptiveHM. PMID:28919931
Robustness of Massively Parallel Sequencing Platforms

PubMed Central

Kavak, Pınar; Yüksel, Bayram; Aksu, Soner; Kulekci, M. Oguzhan; Güngör, Tunga; Hach, Faraz; Şahinalp, S. Cenk; Alkan, Can; Sağıroğlu, Mahmut Şamil

2015-01-01

The improvements in high throughput sequencing technologies (HTS) made clinical sequencing projects such as ClinSeq and Genomics England feasible. Although there are significant improvements in accuracy and reproducibility of HTS based analyses, the usability of these types of data for diagnostic and prognostic applications necessitates a near perfect data generation. To assess the usability of a widely used HTS platform for accurate and reproducible clinical applications in terms of robustness, we generated whole genome shotgun (WGS) sequence data from the genomes of two human individuals in two different genome sequencing centers. After analyzing the data to characterize SNPs and indels using the same tools (BWA, SAMtools, and GATK), we observed significant number of discrepancies in the call sets. As expected, the most of the disagreements between the call sets were found within genomic regions containing common repeats and segmental duplications, albeit only a small fraction of the discordant variants were within the exons and other functionally relevant regions such as promoters. We conclude that although HTS platforms are sufficiently powerful for providing data for first-pass clinical tests, the variant predictions still need to be confirmed using orthogonal methods before using in clinical applications. PMID:26382624
High-Performance Integrated Virtual Environment (HIVE) Tools and Applications for Big Data Analysis.

PubMed

Simonyan, Vahan; Mazumder, Raja

2014-09-30

The High-performance Integrated Virtual Environment (HIVE) is a high-throughput cloud-based infrastructure developed for the storage and analysis of genomic and associated biological data. HIVE consists of a web-accessible interface for authorized users to deposit, retrieve, share, annotate, compute and visualize Next-generation Sequencing (NGS) data in a scalable and highly efficient fashion. The platform contains a distributed storage library and a distributed computational powerhouse linked seamlessly. Resources available through the interface include algorithms, tools and applications developed exclusively for the HIVE platform, as well as commonly used external tools adapted to operate within the parallel architecture of the system. HIVE is composed of a flexible infrastructure, which allows for simple implementation of new algorithms and tools. Currently, available HIVE tools include sequence alignment and nucleotide variation profiling tools, metagenomic analyzers, phylogenetic tree-building tools using NGS data, clone discovery algorithms, and recombination analysis algorithms. In addition to tools, HIVE also provides knowledgebases that can be used in conjunction with the tools for NGS sequence and metadata analysis.
High-Performance Integrated Virtual Environment (HIVE) Tools and Applications for Big Data Analysis

PubMed Central

Simonyan, Vahan; Mazumder, Raja

2014-01-01

The High-performance Integrated Virtual Environment (HIVE) is a high-throughput cloud-based infrastructure developed for the storage and analysis of genomic and associated biological data. HIVE consists of a web-accessible interface for authorized users to deposit, retrieve, share, annotate, compute and visualize Next-generation Sequencing (NGS) data in a scalable and highly efficient fashion. The platform contains a distributed storage library and a distributed computational powerhouse linked seamlessly. Resources available through the interface include algorithms, tools and applications developed exclusively for the HIVE platform, as well as commonly used external tools adapted to operate within the parallel architecture of the system. HIVE is composed of a flexible infrastructure, which allows for simple implementation of new algorithms and tools. Currently, available HIVE tools include sequence alignment and nucleotide variation profiling tools, metagenomic analyzers, phylogenetic tree-building tools using NGS data, clone discovery algorithms, and recombination analysis algorithms. In addition to tools, HIVE also provides knowledgebases that can be used in conjunction with the tools for NGS sequence and metadata analysis. PMID:25271953
Evaluation of second-generation sequencing of 19 dilated cardiomyopathy genes for clinical applications.

PubMed

Gowrisankar, Sivakumar; Lerner-Ellis, Jordan P; Cox, Stephanie; White, Emily T; Manion, Megan; LeVan, Kevin; Liu, Jonathan; Farwell, Lisa M; Iartchouk, Oleg; Rehm, Heidi L; Funke, Birgit H

2010-11-01

Medical sequencing for diseases with locus and allelic heterogeneities has been limited by the high cost and low throughput of traditional sequencing technologies. "Second-generation" sequencing (SGS) technologies allow the parallel processing of a large number of genes and, therefore, offer great promise for medical sequencing; however, their use in clinical laboratories is still in its infancy. Our laboratory offers clinical resequencing for dilated cardiomyopathy (DCM) using an array-based platform that interrogates 19 of more than 30 genes known to cause DCM. We explored both the feasibility and cost effectiveness of using PCR amplification followed by SGS technology for sequencing these 19 genes in a set of five samples enriched for known sequence alterations (109 unique substitutions and 27 insertions and deletions). While the analytical sensitivity for substitutions was comparable to that of the DCM array (98%), SGS technology performed better than the DCM array for insertions and deletions (90.6% versus 58%). Overall, SGS performed substantially better than did the current array-based testing platform; however, the operational cost and projected turnaround time do not meet our current standards. Therefore, efficient capture methods and/or sample pooling strategies that shorten the turnaround time and decrease reagent and labor costs are needed before implementing this platform into routine clinical applications.
Accuracy of the high-throughput amplicon sequencing to identify species within the genus Aspergillus.

PubMed

Lee, Seungeun; Yamamoto, Naomichi

2015-12-01

This study characterized the accuracy of high-throughput amplicon sequencing to identify species within the genus Aspergillus. To this end, we sequenced the internal transcribed spacer 1 (ITS1), β-tubulin (BenA), and calmodulin (CaM) gene encoding sequences as DNA markers from eight reference Aspergillus strains with known identities using 300-bp sequencing on the Illumina MiSeq platform, and compared them with the BLASTn outputs. The identifications with the sequences longer than 250 bp were accurate at the section rank, with some ambiguities observed at the species rank due to mostly cross detection of sibling species. Additionally, in silico analysis was performed to predict the identification accuracy for all species in the genus Aspergillus, where 107, 210, and 187 species were predicted to be identifiable down to the species rank based on ITS1, BenA, and CaM, respectively. Finally, air filter samples were analysed to quantify the relative abundances of Aspergillus species in outdoor air. The results were reproducible across biological duplicates both at the species and section ranks, but not strongly correlated between ITS1 and BenA, suggesting the Aspergillus detection can be taxonomically biased depending on the selection of the DNA markers and/or primers. Copyright © 2015 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Application of Genomic Technologies to the Breeding of Trees

PubMed Central

Badenes, Maria L.; Fernández i Martí, Angel; Ríos, Gabino; Rubio-Cabetas, María J.

2016-01-01

The recent introduction of next generation sequencing (NGS) technologies represents a major revolution in providing new tools for identifying the genes and/or genomic intervals controlling important traits for selection in breeding programs. In perennial fruit trees with long generation times and large sizes of adult plants, the impact of these techniques is even more important. High-throughput DNA sequencing technologies have provided complete annotated sequences in many important tree species. Most of the high-throughput genotyping platforms described are being used for studies of genetic diversity and population structure. Dissection of complex traits became possible through the availability of genome sequences along with phenotypic variation data, which allow to elucidate the causative genetic differences that give rise to observed phenotypic variation. Association mapping facilitates the association between genetic markers and phenotype in unstructured and complex populations, identifying molecular markers for assisted selection and breeding. Also, genomic data provide in silico identification and characterization of genes and gene families related to important traits, enabling new tools for molecular marker assisted selection in tree breeding. Deep sequencing of transcriptomes is also a powerful tool for the analysis of precise expression levels of each gene in a sample. It consists in quantifying short cDNA reads, obtained by NGS technologies, in order to compare the entire transcriptomes between genotypes and environmental conditions. The miRNAs are non-coding short RNAs involved in the regulation of different physiological processes, which can be identified by high-throughput sequencing of RNA libraries obtained by reverse transcription of purified short RNAs, and by in silico comparison with known miRNAs from other species. All together, NGS techniques and their applications have increased the resources for plant breeding in tree species, closing the former gap of genetic tools between trees and annual species. PMID:27895664
Application of Genomic Technologies to the Breeding of Trees.

PubMed

Badenes, Maria L; Fernández I Martí, Angel; Ríos, Gabino; Rubio-Cabetas, María J

2016-01-01

The recent introduction of next generation sequencing (NGS) technologies represents a major revolution in providing new tools for identifying the genes and/or genomic intervals controlling important traits for selection in breeding programs. In perennial fruit trees with long generation times and large sizes of adult plants, the impact of these techniques is even more important. High-throughput DNA sequencing technologies have provided complete annotated sequences in many important tree species. Most of the high-throughput genotyping platforms described are being used for studies of genetic diversity and population structure. Dissection of complex traits became possible through the availability of genome sequences along with phenotypic variation data, which allow to elucidate the causative genetic differences that give rise to observed phenotypic variation. Association mapping facilitates the association between genetic markers and phenotype in unstructured and complex populations, identifying molecular markers for assisted selection and breeding. Also, genomic data provide in silico identification and characterization of genes and gene families related to important traits, enabling new tools for molecular marker assisted selection in tree breeding. Deep sequencing of transcriptomes is also a powerful tool for the analysis of precise expression levels of each gene in a sample. It consists in quantifying short cDNA reads, obtained by NGS technologies, in order to compare the entire transcriptomes between genotypes and environmental conditions. The miRNAs are non-coding short RNAs involved in the regulation of different physiological processes, which can be identified by high-throughput sequencing of RNA libraries obtained by reverse transcription of purified short RNAs, and by in silico comparison with known miRNAs from other species. All together, NGS techniques and their applications have increased the resources for plant breeding in tree species, closing the former gap of genetic tools between trees and annual species.
Filling reference gaps via assembling DNA barcodes using high-throughput sequencing-moving toward barcoding the world.

PubMed

Liu, Shanlin; Yang, Chentao; Zhou, Chengran; Zhou, Xin

2017-12-01

Over the past decade, biodiversity researchers have dedicated tremendous efforts to constructing DNA reference barcodes for rapid species registration and identification. Although analytical cost for standard DNA barcoding has been significantly reduced since early 2000, further dramatic reduction in barcoding costs is unlikely because Sanger sequencing is approaching its limits in throughput and chemistry cost. Constraints in barcoding cost not only led to unbalanced barcoding efforts around the globe, but also prevented high-throughput sequencing (HTS)-based taxonomic identification from applying binomial species names, which provide crucial linkages to biological knowledge. We developed an Illumina-based pipeline, HIFI-Barcode, to produce full-length Cytochrome c oxidase subunit I (COI) barcodes from pooled polymerase chain reaction amplicons generated by individual specimens. The new pipeline generated accurate barcode sequences that were comparable to Sanger standards, even for different haplotypes of the same species that were only a few nucleotides different from each other. Additionally, the new pipeline was much more sensitive in recovering amplicons at low quantity. The HIFI-Barcode pipeline successfully recovered barcodes from more than 78% of the polymerase chain reactions that didn't show clear bands on the electrophoresis gel. Moreover, sequencing results based on the single molecular sequencing platform Pacbio confirmed the accuracy of the HIFI-Barcode results. Altogether, the new pipeline can provide an improved solution to produce full-length reference barcodes at about one-tenth of the current cost, enabling construction of comprehensive barcode libraries for local fauna, leading to a feasible direction for DNA barcoding global biomes. © The Authors 2017. Published by Oxford University Press.
Mining high-throughput experimental data to link gene and function

PubMed Central

Blaby-Haas, Crysten E.; de Crécy-Lagard, Valérie

2011-01-01

Nearly 2200 genomes encoding some 6 million proteins have now been sequenced. Around 40% of these proteins are of unknown function even when function is loosely and minimally defined as “belonging to a superfamily”. In addition to in silico methods, the swelling stream of high-throughput experimental data can give valuable clues for linking these “unknowns” with precise biological roles. The goal is to develop integrative data-mining platforms that allow the scientific community at large to access and utilize this rich source of experimental knowledge. To this end, we review recent advances in generating whole-genome experimental datasets, where this data can be accessed, and how it can be used to drive prediction of gene function. PMID:21310501
Using High-Throughput Sequencing to Leverage Surveillance of Genetic Diversity and Oseltamivir Resistance: A Pilot Study during the 2009 Influenza A(H1N1) Pandemic

PubMed Central

Téllez-Sosa, Juan; Rodríguez, Mario Henry; Gómez-Barreto, Rosa E.; Valdovinos-Torres, Humberto; Hidalgo, Ana Cecilia; Cruz-Hervert, Pablo; Luna, René Santos; Carrillo-Valenzo, Erik; Ramos, Celso; García-García, Lourdes; Martínez-Barnetche, Jesús

2013-01-01

Background Influenza viruses display a high mutation rate and complex evolutionary patterns. Next-generation sequencing (NGS) has been widely used for qualitative and semi-quantitative assessment of genetic diversity in complex biological samples. The “deep sequencing” approach, enabled by the enormous throughput of current NGS platforms, allows the identification of rare genetic viral variants in targeted genetic regions, but is usually limited to a small number of samples. Methodology and Principal Findings We designed a proof-of-principle study to test whether redistributing sequencing throughput from a high depth-small sample number towards a low depth-large sample number approach is feasible and contributes to influenza epidemiological surveillance. Using 454-Roche sequencing, we sequenced at a rather low depth, a 307 bp amplicon of the neuraminidase gene of the Influenza A(H1N1) pandemic (A(H1N1)pdm) virus from cDNA amplicons pooled in 48 barcoded libraries obtained from nasal swab samples of infected patients (n = 299) taken from May to November, 2009 pandemic period in Mexico. This approach revealed that during the transition from the first (May-July) to second wave (September-November) of the pandemic, the initial genetic variants were replaced by the N248D mutation in the NA gene, and enabled the establishment of temporal and geographic associations with genetic diversity and the identification of mutations associated with oseltamivir resistance. Conclusions NGS sequencing of a short amplicon from the NA gene at low sequencing depth allowed genetic screening of a large number of samples, providing insights to viral genetic diversity dynamics and the identification of genetic variants associated with oseltamivir resistance. Further research is needed to explain the observed replacement of the genetic variants seen during the second wave. As sequencing throughput rises and library multiplexing and automation improves, we foresee that the approach presented here can be scaled up for global genetic surveillance of influenza and other infectious diseases. PMID:23843978

High-resolution phylogenetic microbial community profiling

DOE Office of Scientific and Technical Information (OSTI.GOV)

Singer, Esther; Bushnell, Brian; Coleman-Derr, Devin

Over the past decade, high-throughput short-read 16S rRNA gene amplicon sequencing has eclipsed clone-dependent long-read Sanger sequencing for microbial community profiling. The transition to new technologies has provided more quantitative information at the expense of taxonomic resolution with implications for inferring metabolic traits in various ecosystems. We applied single-molecule real-time sequencing for microbial community profiling, generating full-length 16S rRNA gene sequences at high throughput, which we propose to name PhyloTags. We benchmarked and validated this approach using a defined microbial community. When further applied to samples from the water column of meromictic Sakinaw Lake, we show that while community structuresmore » at the phylum level are comparable between PhyloTags and Illumina V4 16S rRNA gene sequences (iTags), variance increases with community complexity at greater water depths. PhyloTags moreover allowed less ambiguous classification. Last, a platform-independent comparison of PhyloTags and in silico generated partial 16S rRNA gene sequences demonstrated significant differences in community structure and phylogenetic resolution across multiple taxonomic levels, including a severe underestimation in the abundance of specific microbial genera involved in nitrogen and methane cycling across the Lake's water column. Thus, PhyloTags provide a reliable adjunct or alternative to cost-effective iTags, enabling more accurate phylogenetic resolution of microbial communities and predictions on their metabolic potential.« less
High-resolution phylogenetic microbial community profiling

DOE PAGES

Singer, Esther; Bushnell, Brian; Coleman-Derr, Devin; ...

2016-02-09

Over the past decade, high-throughput short-read 16S rRNA gene amplicon sequencing has eclipsed clone-dependent long-read Sanger sequencing for microbial community profiling. The transition to new technologies has provided more quantitative information at the expense of taxonomic resolution with implications for inferring metabolic traits in various ecosystems. We applied single-molecule real-time sequencing for microbial community profiling, generating full-length 16S rRNA gene sequences at high throughput, which we propose to name PhyloTags. We benchmarked and validated this approach using a defined microbial community. When further applied to samples from the water column of meromictic Sakinaw Lake, we show that while community structuresmore » at the phylum level are comparable between PhyloTags and Illumina V4 16S rRNA gene sequences (iTags), variance increases with community complexity at greater water depths. PhyloTags moreover allowed less ambiguous classification. Last, a platform-independent comparison of PhyloTags and in silico generated partial 16S rRNA gene sequences demonstrated significant differences in community structure and phylogenetic resolution across multiple taxonomic levels, including a severe underestimation in the abundance of specific microbial genera involved in nitrogen and methane cycling across the Lake's water column. Thus, PhyloTags provide a reliable adjunct or alternative to cost-effective iTags, enabling more accurate phylogenetic resolution of microbial communities and predictions on their metabolic potential.« less
High-Throughput Screening Enhances Kidney Organoid Differentiation from Human Pluripotent Stem Cells and Enables Automated Multidimensional Phenotyping.

PubMed

Czerniecki, Stefan M; Cruz, Nelly M; Harder, Jennifer L; Menon, Rajasree; Annis, James; Otto, Edgar A; Gulieva, Ramila E; Islas, Laura V; Kim, Yong Kyun; Tran, Linh M; Martins, Timothy J; Pippin, Jeffrey W; Fu, Hongxia; Kretzler, Matthias; Shankland, Stuart J; Himmelfarb, Jonathan; Moon, Randall T; Paragas, Neal; Freedman, Benjamin S

2018-05-15

Organoids derived from human pluripotent stem cells are a potentially powerful tool for high-throughput screening (HTS), but the complexity of organoid cultures poses a significant challenge for miniaturization and automation. Here, we present a fully automated, HTS-compatible platform for enhanced differentiation and phenotyping of human kidney organoids. The entire 21-day protocol, from plating to differentiation to analysis, can be performed automatically by liquid-handling robots, or alternatively by manual pipetting. High-content imaging analysis reveals both dose-dependent and threshold effects during organoid differentiation. Immunofluorescence and single-cell RNA sequencing identify previously undetected parietal, interstitial, and partially differentiated compartments within organoids and define conditions that greatly expand the vascular endothelium. Chemical modulation of toxicity and disease phenotypes can be quantified for safety and efficacy prediction. Screening in gene-edited organoids in this system reveals an unexpected role for myosin in polycystic kidney disease. Organoids in HTS formats thus establish an attractive platform for multidimensional phenotypic screening. Copyright © 2018 Elsevier Inc. All rights reserved.
The European Classical Swine Fever Virus Database: Blueprint for a Pathogen-Specific Sequence Database with Integrated Sequence Analysis Tools

PubMed Central

Postel, Alexander; Schmeiser, Stefanie; Zimmermann, Bernd; Becher, Paul

2016-01-01

Molecular epidemiology has become an indispensable tool in the diagnosis of diseases and in tracing the infection routes of pathogens. Due to advances in conventional sequencing and the development of high throughput technologies, the field of sequence determination is in the process of being revolutionized. Platforms for sharing sequence information and providing standardized tools for phylogenetic analyses are becoming increasingly important. The database (DB) of the European Union (EU) and World Organisation for Animal Health (OIE) Reference Laboratory for classical swine fever offers one of the world’s largest semi-public virus-specific sequence collections combined with a module for phylogenetic analysis. The classical swine fever (CSF) DB (CSF-DB) became a valuable tool for supporting diagnosis and epidemiological investigations of this highly contagious disease in pigs with high socio-economic impacts worldwide. The DB has been re-designed and now allows for the storage and analysis of traditionally used, well established genomic regions and of larger genomic regions including complete viral genomes. We present an application example for the analysis of highly similar viral sequences obtained in an endemic disease situation and introduce the new geographic “CSF Maps” tool. The concept of this standardized and easy-to-use DB with an integrated genetic typing module is suited to serve as a blueprint for similar platforms for other human or animal viruses. PMID:27827988
High-Throughput, Motility-Based Sorter for Microswimmers and Gene Discovery Platform

NASA Astrophysics Data System (ADS)

Yuan, Jinzhou; Raizen, David; Bau, Haim

2015-11-01

Animal motility varies with genotype, disease progression, aging, and environmental conditions. In many studies, it is desirable to carry out high throughput motility-based sorting to isolate rare animals for, among other things, forward genetic screens to identify genetic pathways that regulate phenotypes of interest. Many commonly used screening processes are labor-intensive, lack sensitivity, and require extensive investigator training. Here, we describe a sensitive, high throughput, automated, motility-based method for sorting nematodes. Our method was implemented in a simple microfluidic device capable of sorting many thousands of animals per hour per module, and is amenable to parallelism. The device successfully enriched for known C. elegans motility mutants. Furthermore, using this device, we isolated low-abundance mutants capable of suppressing the somnogenic effects of the flp-13 gene, which regulates sleep-like quiescence in C. elegans. Subsequent genomic sequencing led to the identification of a flp-13-suppressor gene. This research was supported, in part, by NIH NIA Grant 5R03AG042690-02.
Evaluation of high throughput gene expression platforms using a genomic biomarker signature for prediction of skin sensitization.

PubMed

Forreryd, Andy; Johansson, Henrik; Albrekt, Ann-Sofie; Lindstedt, Malin

2014-05-16

Allergic contact dermatitis (ACD) develops upon exposure to certain chemical compounds termed skin sensitizers. To reduce the occurrence of skin sensitizers, chemicals are regularly screened for their capacity to induce sensitization. The recently developed Genomic Allergen Rapid Detection (GARD) assay is an in vitro alternative to animal testing for identification of skin sensitizers, classifying chemicals by evaluating transcriptional levels of a genomic biomarker signature. During assay development and biomarker identification, genome-wide expression analysis was applied using microarrays covering approximately 30,000 transcripts. However, the microarray platform suffers from drawbacks in terms of low sample throughput, high cost per sample and time consuming protocols and is a limiting factor for adaption of GARD into a routine assay for screening of potential sensitizers. With the purpose to simplify assay procedures, improve technical parameters and increase sample throughput, we assessed the performance of three high throughput gene expression platforms--nCounter®, BioMark HD™ and OpenArray®--and correlated their performance metrics against our previously generated microarray data. We measured the levels of 30 transcripts from the GARD biomarker signature across 48 samples. Detection sensitivity, reproducibility, correlations and overall structure of gene expression measurements were compared across platforms. Gene expression data from all of the evaluated platforms could be used to classify most of the sensitizers from non-sensitizers in the GARD assay. Results also showed high data quality and acceptable reproducibility for all platforms but only medium to poor correlations of expression measurements across platforms. In addition, evaluated platforms were superior to the microarray platform in terms of cost efficiency, simplicity of protocols and sample throughput. We evaluated the performance of three non-array based platforms using a limited set of transcripts from the GARD biomarker signature. We demonstrated that it was possible to achieve acceptable discriminatory power in terms of separation between sensitizers and non-sensitizers in the GARD assay while reducing assay costs, simplify assay procedures and increase sample throughput by using an alternative platform, providing a first step towards the goal to prepare GARD for formal validation and adaption of the assay for industrial screening of potential sensitizers.
Historical Perspective, Development and Applications of Next-Generation Sequencing in Plant Virology

PubMed Central

Barba, Marina; Czosnek, Henryk; Hadidi, Ahmed

2014-01-01

Next-generation high throughput sequencing technologies became available at the onset of the 21st century. They provide a highly efficient, rapid, and low cost DNA sequencing platform beyond the reach of the standard and traditional DNA sequencing technologies developed in the late 1970s. They are continually improved to become faster, more efficient and cheaper. They have been used in many fields of biology since 2004. In 2009, next-generation sequencing (NGS) technologies began to be applied to several areas of plant virology including virus/viroid genome sequencing, discovery and detection, ecology and epidemiology, replication and transcription. Identification and characterization of known and unknown viruses and/or viroids in infected plants are currently among the most successful applications of these technologies. It is expected that NGS will play very significant roles in many research and non-research areas of plant virology. PMID:24399207
Tempo and mode of genomic mutations unveil human evolutionary history.

PubMed

Hara, Yuichiro

2015-01-01

Mutations that have occurred in human genomes provide insight into various aspects of evolutionary history such as speciation events and degrees of natural selection. Comparing genome sequences between human and great apes or among humans is a feasible approach for inferring human evolutionary history. Recent advances in high-throughput or so-called 'next-generation' DNA sequencing technologies have enabled the sequencing of thousands of individual human genomes, as well as a variety of reference genomes of hominids, many of which are publicly available. These sequence data can help to unveil the detailed demographic history of the lineage leading to humans as well as the explosion of modern human population size in the last several thousand years. In addition, high-throughput sequencing illustrates the tempo and mode of de novo mutations, which are producing human genetic variation at this moment. Pedigree-based human genome sequencing has shown that mutation rates vary significantly across the human genome. These studies have also provided an improved timescale of human evolution, because the mutation rate estimated from pedigree analysis is half that estimated from traditional analyses based on molecular phylogeny. Because of the dramatic reduction in sequencing cost, sequencing on-demand samples designed for specific studies is now also becoming popular. To produce data of sufficient quality to meet the requirements of the study, it is necessary to set an explicit sequencing plan that includes the choice of sample collection methods, sequencing platforms, and number of sequence reads.
Technological advancements and their importance for nematode identification

NASA Astrophysics Data System (ADS)

Ahmed, Mohammed; Sapp, Melanie; Prior, Thomas; Karssen, Gerrit; Back, Matthew Alan

2016-06-01

Nematodes represent a species-rich and morphologically diverse group of metazoans known to inhabit both aquatic and terrestrial environments. Their role as biological indicators and as key players in nutrient cycling has been well documented. Some plant-parasitic species are also known to cause significant losses to crop production. In spite of this, there still exists a huge gap in our knowledge of their diversity due to the enormity of time and expertise often involved in characterising species using phenotypic features. Molecular methodology provides useful means of complementing the limited number of reliable diagnostic characters available for morphology-based identification. We discuss herein some of the limitations of traditional taxonomy and how molecular methodologies, especially the use of high-throughput sequencing, have assisted in carrying out large-scale nematode community studies and characterisation of phytonematodes through rapid identification of multiple taxa. We also provide brief descriptions of some the current and almost-outdated high-throughput sequencing platforms and their applications in both plant nematology and soil ecology.
QSRA: a quality-value guided de novo short read assembler.

PubMed

Bryant, Douglas W; Wong, Weng-Keen; Mockler, Todd C

2009-02-24

New rapid high-throughput sequencing technologies have sparked the creation of a new class of assembler. Since all high-throughput sequencing platforms incorporate errors in their output, short-read assemblers must be designed to account for this error while utilizing all available data. We have designed and implemented an assembler, Quality-value guided Short Read Assembler, created to take advantage of quality-value scores as a further method of dealing with error. Compared to previous published algorithms, our assembler shows significant improvements not only in speed but also in output quality. QSRA generally produced the highest genomic coverage, while being faster than VCAKE. QSRA is extremely competitive in its longest contig and N50/N80 contig lengths, producing results of similar quality to those of EDENA and VELVET. QSRA provides a step closer to the goal of de novo assembly of complex genomes, improving upon the original VCAKE algorithm by not only drastically reducing runtimes but also increasing the viability of the assembly algorithm through further error handling capabilities.
VARiD: a variation detection framework for color-space and letter-space platforms.

PubMed

Dalca, Adrian V; Rumble, Stephen M; Levy, Samuel; Brudno, Michael

2010-06-15

High-throughput sequencing (HTS) technologies are transforming the study of genomic variation. The various HTS technologies have different sequencing biases and error rates, and while most HTS technologies sequence the residues of the genome directly, generating base calls for each position, the Applied Biosystem's SOLiD platform generates dibase-coded (color space) sequences. While combining data from the various platforms should increase the accuracy of variation detection, to date there are only a few tools that can identify variants from color space data, and none that can analyze color space and regular (letter space) data together. We present VARiD--a probabilistic method for variation detection from both letter- and color-space reads simultaneously. VARiD is based on a hidden Markov model and uses the forward-backward algorithm to accurately identify heterozygous, homozygous and tri-allelic SNPs, as well as micro-indels. Our analysis shows that VARiD performs better than the AB SOLiD toolset at detecting variants from color-space data alone, and improves the calls dramatically when letter- and color-space reads are combined. The toolset is freely available at http://compbio.cs.utoronto.ca/varid.
Microbial forensics: fiber optic microarray subtyping of Bacillus anthracis

NASA Astrophysics Data System (ADS)

Shepard, Jason R. E.

2009-05-01

The past decade has seen increased development and subsequent adoption of rapid molecular techniques involving DNA analysis for detection of pathogenic microorganisms, also termed microbial forensics. The continued accumulation of microbial sequence information in genomic databases now better positions the field of high-throughput DNA analysis to proceed in a more manageable fashion. The potential to build off of these databases exists as technology continues to develop, which will enable more rapid, cost effective analyses. This wealth of genetic information, along with new technologies, has the potential to better address some of the current problems and solve the key issues involved in DNA analysis of pathogenic microorganisms. To this end, a high density fiber optic microarray has been employed, housing numerous DNA sequences simultaneously for detection of various pathogenic microorganisms, including Bacillus anthracis, among others. Each organism is analyzed with multiple sequences and can be sub-typed against other closely related organisms. For public health labs, real-time PCR methods have been developed as an initial preliminary screen, but culture and growth are still considered the gold standard. Technologies employing higher throughput than these standard methods are better suited to capitalize on the limitless potential garnered from the sequence information. Microarray analyses are one such format positioned to exploit this potential, and our array platform is reusable, allowing repetitive tests on a single array, providing an increase in throughput and decrease in cost, along with a certainty of detection, down to the individual strain level.
Compartmental Genomics in Living Cells Revealed by Single-Cell Nanobiopsy

PubMed Central

Actis, Paolo; Maalouf, Michelle; Kim, Hyunsung John; Lohith, Akshar; Vilozny, Boaz; Seger, R. Adam; Pourmand, Nader

2014-01-01

The ability to study the molecular biology of living single cells in heterogeneous cell populations is essential for next generation analysis of cellular circuitry and function. Here, we developed a single-cell nanobiopsy platform based on scanning ion conductance microscopy (SICM) for continuous sampling of intracellular content from individual cells. The nanobiopsy platform uses electrowetting within a nanopipette to extract cellular material from living cells with minimal disruption of the cellular milieu. We demonstrate the subcellular resolution of the nanobiopsy platform by isolating small subpopulations of mitochondria from single living cells, and quantify mutant mitochondrial genomes in those single cells with high throughput sequencing technology. These findings may provide the foundation for dynamic subcellular genomic analysis. PMID:24279711
A field ornithologist’s guide to genomics: Practical considerations for ecology and conservation

USGS Publications Warehouse

Oyler-McCance, Sara J.; Oh, Kevin; Langin, Kathryn; Aldridge, Cameron L.

2016-01-01

Vast improvements in sequencing technology have made it practical to simultaneously sequence millions of nucleotides distributed across the genome, opening the door for genomic studies in virtually any species. Ornithological research stands to benefit in three substantial ways. First, genomic methods enhance our ability to parse and simultaneously analyze both neutral and non-neutral genomic regions, thus providing insight into adaptive evolution and divergence. Second, the sheer quantity of sequence data generated by current sequencing platforms allows increased precision and resolution in analyses. Third, high-throughput sequencing can benefit applications that focus on a small number of loci that are otherwise prohibitively expensive, time-consuming, and technically difficult using traditional sequencing methods. These advances have improved our ability to understand evolutionary processes like speciation and local adaptation, but they also offer many practical applications in the fields of population ecology, migration tracking, conservation planning, diet analyses, and disease ecology. This review provides a guide for field ornithologists interested in incorporating genomic approaches into their research program, with an emphasis on techniques related to ecology and conservation. We present a general overview of contemporary genomic approaches and methods, as well as important considerations when selecting a genomic technique. We also discuss research questions that are likely to benefit from utilizing high-throughput sequencing instruments, highlighting select examples from recent avian studies.
Filling reference gaps via assembling DNA barcodes using high-throughput sequencing—moving toward barcoding the world

PubMed Central

Zhou, Chengran

2017-01-01

Abstract Over the past decade, biodiversity researchers have dedicated tremendous efforts to constructing DNA reference barcodes for rapid species registration and identification. Although analytical cost for standard DNA barcoding has been significantly reduced since early 2000, further dramatic reduction in barcoding costs is unlikely because Sanger sequencing is approaching its limits in throughput and chemistry cost. Constraints in barcoding cost not only led to unbalanced barcoding efforts around the globe, but also prevented high-throughput sequencing (HTS)–based taxonomic identification from applying binomial species names, which provide crucial linkages to biological knowledge. We developed an Illumina-based pipeline, HIFI-Barcode, to produce full-length Cytochrome c oxidase subunit I (COI) barcodes from pooled polymerase chain reaction amplicons generated by individual specimens. The new pipeline generated accurate barcode sequences that were comparable to Sanger standards, even for different haplotypes of the same species that were only a few nucleotides different from each other. Additionally, the new pipeline was much more sensitive in recovering amplicons at low quantity. The HIFI-Barcode pipeline successfully recovered barcodes from more than 78% of the polymerase chain reactions that didn’t show clear bands on the electrophoresis gel. Moreover, sequencing results based on the single molecular sequencing platform Pacbio confirmed the accuracy of the HIFI-Barcode results. Altogether, the new pipeline can provide an improved solution to produce full-length reference barcodes at about one-tenth of the current cost, enabling construction of comprehensive barcode libraries for local fauna, leading to a feasible direction for DNA barcoding global biomes. PMID:29077841
Development of a High-Throughput Resequencing Array for the Detection of Pathogenic Mutations in Osteogenesis Imperfecta

PubMed Central

Wang, Yao; Cui, Yazhou; Zhou, Xiaoyan; Han, Jinxiang

2015-01-01

Objective Osteogenesis imperfecta (OI) is a rare inherited skeletal disease, characterized by bone fragility and low bone density. The mutations in this disorder have been widely reported to be on various exonal hotspots of the candidate genes, including COL1A1, COL1A2, CRTAP, LEPRE1, and FKBP10, thus creating a great demand for precise genetic tests. However, large genome sizes make the process daunting and the analyses, inefficient and expensive. Therefore, we aimed at developing a fast, accurate, efficient, and cheaper sequencing platform for OI diagnosis; and to this end, use of an advanced array-based technique was proposed. Method A CustomSeq Affymetrix Resequencing Array was established for high-throughput sequencing of five genes simultaneously. Genomic DNA extraction from 13 OI patients and 85 normal controls and amplification using long-range PCR (LR-PCR) were followed by DNA fragmentation and chip hybridization, according to standard Affymetrix protocols. Hybridization signals were determined using GeneChip Sequence Analysis Software (GSEQ). To examine the feasibility, the outcome from new resequencing approach was validated by conventional capillary sequencing method. Result Overall call rates using resequencing array was 96–98% and the agreement between microarray and capillary sequencing was 99.99%. 11 out of 13 OI patients with pathogenic mutations were successfully detected by the chip analysis without adjustment, and one mutation could also be identified using manual visual inspection. Conclusion A high-throughput resequencing array was developed that detects the disease-associated mutations in OI, providing a potential tool to facilitate large-scale genetic screening for OI patients. Through this method, a novel mutation was also found. PMID:25742658
Overview of Next-generation Sequencing Platforms Used in Published Draft Plant Genomes in Light of Genotypization of Immortelle Plant (Helichrysium Arenarium)

PubMed Central

Hodzic, Jasin; Gurbeta, Lejla; Omanovic-Miklicanin, Enisa; Badnjevic, Almir

2017-01-01

Introduction: Major advancements in DNA sequencing methods introduced in the first decade of the new millennium initiated a rapid expansion of sequencing studies, which yielded a tremendous amount of DNA sequence data, including whole sequenced genomes of various species, including plants. A set of novel sequencing platforms, often collectively named as “next-generation sequencing” (NGS) completely transformed the life sciences, by allowing extensive throughput, while greatly reducing the necessary time, labor and cost of any sequencing endeavor. Purpose: of this paper is to present an overview NGS platforms used to produce the current compendium of published draft genomes of various plants, namely the Roche/454, ABI/SOLiD, and Solexa/Illumina, and to determine the most frequently used platform for the whole genome sequencing of plants in light of genotypization of immortelle plant. Materials and methods: 45 papers were selected (with 47 presented plant genome draft sequences), and utilized sequencing techniques and NGS platforms (Roche/454, ABI/SOLiD and Illumina/Solexa) in selected papers were determined. Subsequently, frequency of usage of each platform or combination of platforms was calculated. Results: Illumina/Solexa platforms are by used either as sole sequencing tool in 40.42% of published genomes, or in combination with other platforms - additional 48.94% of published genomes, followed by Roche/454 platforms, used in combination with traditional Sanger sequencing method (10.64%), and never as a sole tool. ABI/SOLiD was only used in combination with Illumina/Solexa and Roche/454 in 4.25% of publications. Conclusions: Illumina/Solexa platforms are by far most preferred by researchers, most probably due to most affordable sequencing costs. Taking into consideration the current economic situation in the Balkans region, Illumina Solexa is the best (if not the only) platform choice if the sequencing of immortelle plant (Helichrysium arenarium) is to be performed by the researchers in this region. PMID:28974852
Development and Validation of an Automated High-Throughput System for Zebrafish In Vivo Screenings

PubMed Central

Virto, Juan M.; Holgado, Olaia; Diez, Maria; Izpisua Belmonte, Juan Carlos; Callol-Massot, Carles

2012-01-01

The zebrafish is a vertebrate model compatible with the paradigms of drug discovery. The small size and transparency of zebrafish embryos make them amenable for the automation necessary in high-throughput screenings. We have developed an automated high-throughput platform for in vivo chemical screenings on zebrafish embryos that includes automated methods for embryo dispensation, compound delivery, incubation, imaging and analysis of the results. At present, two different assays to detect cardiotoxic compounds and angiogenesis inhibitors can be automatically run in the platform, showing the versatility of the system. A validation of these two assays with known positive and negative compounds, as well as a screening for the detection of unknown anti-angiogenic compounds, have been successfully carried out in the system developed. We present a totally automated platform that allows for high-throughput screenings in a vertebrate organism. PMID:22615792
Screening of HIV-1 Protease Using a Combination of an Ultra-High-Throughput Fluorescent-Based Assay and RapidFire Mass Spectrometry.

PubMed

Meng, Juncai; Lai, Ming-Tain; Munshi, Vandna; Grobler, Jay; McCauley, John; Zuck, Paul; Johnson, Eric N; Uebele, Victor N; Hermes, Jeffrey D; Adam, Gregory C

2015-06-01

HIV-1 protease (PR) represents one of the primary targets for developing antiviral agents for the treatment of HIV-infected patients. To identify novel PR inhibitors, a label-free, high-throughput mass spectrometry (HTMS) assay was developed using the RapidFire platform and applied as an orthogonal assay to confirm hits identified in a fluorescence resonance energy transfer (FRET)-based primary screen of > 1 million compounds. For substrate selection, a panel of peptide substrates derived from natural processing sites for PR was evaluated on the RapidFire platform. As a result, KVSLNFPIL, a new substrate measured to have a ~ 20- and 60-fold improvement in k cat/K m over the frequently used sequences SQNYPIVQ and SQNYPIV, respectively, was identified for the HTMS screen. About 17% of hits from the FRET-based primary screen were confirmed in the HTMS confirmatory assay including all 304 known PR inhibitors in the set, demonstrating that the HTMS assay is effective at triaging false-positives while capturing true hits. Hence, with a sampling rate of ~7 s per well, the RapidFire HTMS assay enables the high-throughput evaluation of peptide substrates and functions as an efficient tool for hits triage in the discovery of novel PR inhibitors. © 2015 Society for Laboratory Automation and Screening.
Mining high-throughput experimental data to link gene and function.

PubMed

Blaby-Haas, Crysten E; de Crécy-Lagard, Valérie

2011-04-01

Nearly 2200 genomes that encode around 6 million proteins have now been sequenced. Around 40% of these proteins are of unknown function, even when function is loosely and minimally defined as 'belonging to a superfamily'. In addition to in silico methods, the swelling stream of high-throughput experimental data can give valuable clues for linking these unknowns with precise biological roles. The goal is to develop integrative data-mining platforms that allow the scientific community at large to access and utilize this rich source of experimental knowledge. To this end, we review recent advances in generating whole-genome experimental datasets, where this data can be accessed, and how it can be used to drive prediction of gene function. Copyright © 2011 Elsevier Ltd. All rights reserved.

Microfluidic single-cell whole-transcriptome sequencing.

PubMed

Streets, Aaron M; Zhang, Xiannian; Cao, Chen; Pang, Yuhong; Wu, Xinglong; Xiong, Liang; Yang, Lu; Fu, Yusi; Zhao, Liang; Tang, Fuchou; Huang, Yanyi

2014-05-13

Single-cell whole-transcriptome analysis is a powerful tool for quantifying gene expression heterogeneity in populations of cells. Many techniques have, thus, been recently developed to perform transcriptome sequencing (RNA-Seq) on individual cells. To probe subtle biological variation between samples with limiting amounts of RNA, more precise and sensitive methods are still required. We adapted a previously developed strategy for single-cell RNA-Seq that has shown promise for superior sensitivity and implemented the chemistry in a microfluidic platform for single-cell whole-transcriptome analysis. In this approach, single cells are captured and lysed in a microfluidic device, where mRNAs with poly(A) tails are reverse-transcribed into cDNA. Double-stranded cDNA is then collected and sequenced using a next generation sequencing platform. We prepared 94 libraries consisting of single mouse embryonic cells and technical replicates of extracted RNA and thoroughly characterized the performance of this technology. Microfluidic implementation increased mRNA detection sensitivity as well as improved measurement precision compared with tube-based protocols. With 0.2 M reads per cell, we were able to reconstruct a majority of the bulk transcriptome with 10 single cells. We also quantified variation between and within different types of mouse embryonic cells and found that enhanced measurement precision, detection sensitivity, and experimental throughput aided the distinction between biological variability and technical noise. With this work, we validated the advantages of an early approach to single-cell RNA-Seq and showed that the benefits of combining microfluidic technology with high-throughput sequencing will be valuable for large-scale efforts in single-cell transcriptome analysis.
Multiplex Reverse Transcription-PCR for Simultaneous Surveillance of Influenza A and B Viruses

PubMed Central

Zhou, Bin; Barnes, John R.; Sessions, October M.; Chou, Tsui-Wen; Wilson, Malania; Stark, Thomas J.; Volk, Michelle; Spirason, Natalie; Halpin, Rebecca A.; Kamaraj, Uma Sangumathi; Ding, Tao; Stockwell, Timothy B.; Ghedin, Elodie; Barr, Ian G.

2017-01-01

ABSTRACT Influenza A and B viruses are the causative agents of annual influenza epidemics that can be severe, and influenza A viruses intermittently cause pandemics. Sequence information from influenza virus genomes is instrumental in determining mechanisms underpinning antigenic evolution and antiviral resistance. However, due to sequence diversity and the dynamics of influenza virus evolution, rapid and high-throughput sequencing of influenza viruses remains a challenge. We developed a single-reaction influenza A/B virus (FluA/B) multiplex reverse transcription-PCR (RT-PCR) method that amplifies the most critical genomic segments (hemagglutinin [HA], neuraminidase [NA], and matrix [M]) of seasonal influenza A and B viruses for next-generation sequencing, regardless of viral type, subtype, or lineage. Herein, we demonstrate that the strategy is highly sensitive and robust. The strategy was validated on thousands of seasonal influenza A and B virus-positive specimens using multiple next-generation sequencing platforms. PMID:28978683
Modeling read counts for CNV detection in exome sequencing data.

PubMed

Love, Michael I; Myšičková, Alena; Sun, Ruping; Kalscheuer, Vera; Vingron, Martin; Haas, Stefan A

2011-11-08

Varying depth of high-throughput sequencing reads along a chromosome makes it possible to observe copy number variants (CNVs) in a sample relative to a reference. In exome and other targeted sequencing projects, technical factors increase variation in read depth while reducing the number of observed locations, adding difficulty to the problem of identifying CNVs. We present a hidden Markov model for detecting CNVs from raw read count data, using background read depth from a control set as well as other positional covariates such as GC-content. The model, exomeCopy, is applied to a large chromosome X exome sequencing project identifying a list of large unique CNVs. CNVs predicted by the model and experimentally validated are then recovered using a cross-platform control set from publicly available exome sequencing data. Simulations show high sensitivity for detecting heterozygous and homozygous CNVs, outperforming normalization and state-of-the-art segmentation methods.
Personalized In Vitro and In Vivo Cancer Models to Guide Precision Medicine

PubMed Central

Pauli, Chantal; Hopkins, Benjamin D.; Prandi, Davide; Shaw, Reid; Fedrizzi, Tarcisio; Sboner, Andrea; Sailer, Verena; Augello, Michael; Puca, Loredana; Rosati, Rachele; McNary, Terra J.; Churakova, Yelena; Cheung, Cynthia; Triscott, Joanna; Pisapia, David; Rao, Rema; Mosquera, Juan Miguel; Robinson, Brian; Faltas, Bishoy M.; Emerling, Brooke E.; Gadi, Vijayakrishna K.; Bernard, Brady; Elemento, Olivier; Beltran, Himisha; Dimichelis, Francesca; Kemp, Christopher J.; Grandori, Carla; Cantley, Lewis C.; Rubin, Mark A.

2017-01-01

Precision Medicine is an approach that takes into account the influence of individuals' genes, environment and lifestyle exposures to tailor interventions. Here, we describe the development of a robust precision cancer care platform, which integrates whole exome sequencing (WES) with a living biobank that enables high throughput drug screens on patient-derived tumor organoids. To date, 56 tumor-derived organoid cultures, and 19 patient-derived xenograft (PDX) models have been established from the 769 patients enrolled in an IRB approved clinical trial. Because genomics alone was insufficient to identify therapeutic options for the majority of patients with advanced disease, we used high throughput drug screening effective strategies. Analysis of tumor derived cells from four cases, two uterine malignancies and two colon cancers, identified effective drugs and drug combinations that were subsequently validated using 3D cultures and PDX models. This platform thereby promotes the discovery of novel therapeutic approaches that can be assessed in clinical trials and provides personalized therapeutic options for individual patients where standard clinical options have been exhausted. PMID:28331002
Crop 3D-a LiDAR based platform for 3D high-throughput crop phenotyping.

PubMed

Guo, Qinghua; Wu, Fangfang; Pang, Shuxin; Zhao, Xiaoqian; Chen, Linhai; Liu, Jin; Xue, Baolin; Xu, Guangcai; Li, Le; Jing, Haichun; Chu, Chengcai

2018-03-01

With the growing population and the reducing arable land, breeding has been considered as an effective way to solve the food crisis. As an important part in breeding, high-throughput phenotyping can accelerate the breeding process effectively. Light detection and ranging (LiDAR) is an active remote sensing technology that is capable of acquiring three-dimensional (3D) data accurately, and has a great potential in crop phenotyping. Given that crop phenotyping based on LiDAR technology is not common in China, we developed a high-throughput crop phenotyping platform, named Crop 3D, which integrated LiDAR sensor, high-resolution camera, thermal camera and hyperspectral imager. Compared with traditional crop phenotyping techniques, Crop 3D can acquire multi-source phenotypic data in the whole crop growing period and extract plant height, plant width, leaf length, leaf width, leaf area, leaf inclination angle and other parameters for plant biology and genomics analysis. In this paper, we described the designs, functions and testing results of the Crop 3D platform, and briefly discussed the potential applications and future development of the platform in phenotyping. We concluded that platforms integrating LiDAR and traditional remote sensing techniques might be the future trend of crop high-throughput phenotyping.
Using optical mapping data for the improvement of vertebrate genome assemblies.

PubMed

Howe, Kerstin; Wood, Jonathan M D

2015-01-01

Optical mapping is a technology that gathers long-range information on genome sequences similar to ordered restriction digest maps. Because it is not subject to cloning, amplification, hybridisation or sequencing bias, it is ideally suited to the improvement of fragmented genome assemblies that can no longer be improved by classical methods. In addition, its low cost and rapid turnaround make it equally useful during the scaffolding process of de novo assembly from high throughput sequencing reads. We describe how optical mapping has been used in practice to produce high quality vertebrate genome assemblies. In particular, we detail the efforts undertaken by the Genome Reference Consortium (GRC), which maintains the reference genomes for human, mouse, zebrafish and chicken, and uses different optical mapping platforms for genome curation.
Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform.

PubMed

Schirmer, Melanie; Ijaz, Umer Z; D'Amore, Rosalinda; Hall, Neil; Sloan, William T; Quince, Christopher

2015-03-31

With read lengths of currently up to 2 × 300 bp, high throughput and low sequencing costs Illumina's MiSeq is becoming one of the most utilized sequencing platforms worldwide. The platform is manageable and affordable even for smaller labs. This enables quick turnaround on a broad range of applications such as targeted gene sequencing, metagenomics, small genome sequencing and clinical molecular diagnostics. However, Illumina error profiles are still poorly understood and programs are therefore not designed for the idiosyncrasies of Illumina data. A better knowledge of the error patterns is essential for sequence analysis and vital if we are to draw valid conclusions. Studying true genetic variation in a population sample is fundamental for understanding diseases, evolution and origin. We conducted a large study on the error patterns for the MiSeq based on 16S rRNA amplicon sequencing data. We tested state-of-the-art library preparation methods for amplicon sequencing and showed that the library preparation method and the choice of primers are the most significant sources of bias and cause distinct error patterns. Furthermore we tested the efficiency of various error correction strategies and identified quality trimming (Sickle) combined with error correction (BayesHammer) followed by read overlapping (PANDAseq) as the most successful approach, reducing substitution error rates on average by 93%. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
From cheek swabs to consensus sequences: an A to Z protocol for high-throughput DNA sequencing of complete human mitochondrial genomes

PubMed Central

2014-01-01

Background Next-generation DNA sequencing (NGS) technologies have made huge impacts in many fields of biological research, but especially in evolutionary biology. One area where NGS has shown potential is for high-throughput sequencing of complete mtDNA genomes (of humans and other animals). Despite the increasing use of NGS technologies and a better appreciation of their importance in answering biological questions, there remain significant obstacles to the successful implementation of NGS-based projects, especially for new users. Results Here we present an ‘A to Z’ protocol for obtaining complete human mitochondrial (mtDNA) genomes – from DNA extraction to consensus sequence. Although designed for use on humans, this protocol could also be used to sequence small, organellar genomes from other species, and also nuclear loci. This protocol includes DNA extraction, PCR amplification, fragmentation of PCR products, barcoding of fragments, sequencing using the 454 GS FLX platform, and a complete bioinformatics pipeline (primer removal, reference-based mapping, output of coverage plots and SNP calling). Conclusions All steps in this protocol are designed to be straightforward to implement, especially for researchers who are undertaking next-generation sequencing for the first time. The molecular steps are scalable to large numbers (hundreds) of individuals and all steps post-DNA extraction can be carried out in 96-well plate format. Also, the protocol has been assembled so that individual ‘modules’ can be swapped out to suit available resources. PMID:24460871
SSR_pipeline--computer software for the identification of microsatellite sequences from paired-end Illumina high-throughput DNA sequence data

USGS Publications Warehouse

Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

2013-01-01

SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (SSRs; for example, microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains three analysis modules along with a fourth control module that can be used to automate analyses of large volumes of data. The modules are used to (1) identify the subset of paired-end sequences that pass quality standards, (2) align paired-end reads into a single composite DNA sequence, and (3) identify sequences that possess microsatellites conforming to user specified parameters. Each of the three separate analysis modules also can be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc). All modules are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, Windows). The program suite relies on a compiled Python extension module to perform paired-end alignments. Instructions for compiling the extension from source code are provided in the documentation. Users who do not have Python installed on their computers or who do not have the ability to compile software also may choose to download packaged executable files. These files include all Python scripts, a copy of the compiled extension module, and a minimal installation of Python in a single binary executable. See program documentation for more information.
Identification and correction of systematic error in high-throughput sequence data

PubMed Central

2011-01-01

Background A feature common to all DNA sequencing technologies is the presence of base-call errors in the sequenced reads. The implications of such errors are application specific, ranging from minor informatics nuisances to major problems affecting biological inferences. Recently developed "next-gen" sequencing technologies have greatly reduced the cost of sequencing, but have been shown to be more error prone than previous technologies. Both position specific (depending on the location in the read) and sequence specific (depending on the sequence in the read) errors have been identified in Illumina and Life Technology sequencing platforms. We describe a new type of systematic error that manifests as statistically unlikely accumulations of errors at specific genome (or transcriptome) locations. Results We characterize and describe systematic errors using overlapping paired reads from high-coverage data. We show that such errors occur in approximately 1 in 1000 base pairs, and that they are highly replicable across experiments. We identify motifs that are frequent at systematic error sites, and describe a classifier that distinguishes heterozygous sites from systematic error. Our classifier is designed to accommodate data from experiments in which the allele frequencies at heterozygous sites are not necessarily 0.5 (such as in the case of RNA-Seq), and can be used with single-end datasets. Conclusions Systematic errors can easily be mistaken for heterozygous sites in individuals, or for SNPs in population analyses. Systematic errors are particularly problematic in low coverage experiments, or in estimates of allele-specific expression from RNA-Seq data. Our characterization of systematic error has allowed us to develop a program, called SysCall, for identifying and correcting such errors. We conclude that correction of systematic errors is important to consider in the design and interpretation of high-throughput sequencing experiments. PMID:22099972
Novel Phenotype-Genotype Correlations of Restrictive Cardiomyopathy With Myosin-Binding Protein C (MYBPC3) Gene Mutations Tested by Next-Generation Sequencing.

PubMed

Wu, Wei; Lu, Chao-Xia; Wang, Yi-Ning; Liu, Fang; Chen, Wei; Liu, Yong-Tai; Han, Ye-Chen; Cao, Jian; Zhang, Shu-Yang; Zhang, Xue

2015-07-10

MYBPC3 dysfunctions have been proven to induce dilated cardiomyopathy, hypertrophic cardiomyopathy, and/or left ventricular noncompaction; however, the genotype-phenotype correlation between MYBPC3 and restrictive cardiomyopathy (RCM) has not been established. The newly developed next-generation sequencing method is capable of broad genomic DNA sequencing with high throughput and can help explore novel correlations between genetic variants and cardiomyopathies. A proband from a multigenerational family with 3 live patients and 1 unrelated patient with clinical diagnoses of RCM underwent a next-generation sequencing workflow based on a custom AmpliSeq panel, including 64 candidate pathogenic genes for cardiomyopathies, on the Ion Personal Genome Machine high-throughput sequencing benchtop instrument. The selected panel contained a total of 64 genes that were reportedly associated with inherited cardiomyopathies. All patients fulfilled strict criteria for RCM with clinical characteristics, echocardiography, and/or cardiac magnetic resonance findings. The multigenerational family with 3 adult RCM patients carried an identical nonsense MYBPC3 mutation, and the unrelated patient carried a missense mutation in the MYBPC3 gene. All of these results were confirmed by the Sanger sequencing method. This study demonstrated that MYBPC3 gene mutations, revealed by next-generation sequencing, were associated with familial and sporadic RCM patients. It is suggested that the next-generation sequencing platform with a selected panel provides a highly efficient approach for molecular diagnosis of hereditary and idiopathic RCM and helps build new genotype-phenotype correlations. © 2015 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley Blackwell.
Comparative aerial- and ground-based high-throughput phenotyping for the genetic dissection of NDVI as a proxy for drought-adaptive traits in durum wheat

USDA-ARS?s Scientific Manuscript database

High-throughput phenotyping platforms (HTPPs) provide novel opportunities to more effectively dissect the genetic basis of drought-adaptive traits. This genome-wide association study (GWAS) compares the results obtained with two Unmanned Aerial Vehicles (UAVs) and a ground-based platform used to mea...
A Distributed Amplifier System for Bilayer Lipid Membrane (BLM) Arrays With Noise and Individual Offset Cancellation.

PubMed

Crescentini, Marco; Thei, Frederico; Bennati, Marco; Saha, Shimul; de Planque, Maurits R R; Morgan, Hywel; Tartagni, Marco

2015-06-01

Lipid bilayer membrane (BLM) arrays are required for high throughput analysis, for example drug screening or advanced DNA sequencing. Complex microfluidic devices are being developed but these are restricted in terms of array size and structure or have integrated electronic sensing with limited noise performance. We present a compact and scalable multichannel electrophysiology platform based on a hybrid approach that combines integrated state-of-the-art microelectronics with low-cost disposable fluidics providing a platform for high-quality parallel single ion channel recording. Specifically, we have developed a new integrated circuit amplifier based on a novel noise cancellation scheme that eliminates flicker noise derived from devices under test and amplifiers. The system is demonstrated through the simultaneous recording of ion channel activity from eight bilayer membranes. The platform is scalable and could be extended to much larger array sizes, limited only by electronic data decimation and communication capabilities.
Digital imaging of root traits (DIRT): a high-throughput computing and collaboration platform for field-based root phenomics.

PubMed

Das, Abhiram; Schneider, Hannah; Burridge, James; Ascanio, Ana Karine Martinez; Wojciechowski, Tobias; Topp, Christopher N; Lynch, Jonathan P; Weitz, Joshua S; Bucksch, Alexander

2015-01-01

Plant root systems are key drivers of plant function and yield. They are also under-explored targets to meet global food and energy demands. Many new technologies have been developed to characterize crop root system architecture (CRSA). These technologies have the potential to accelerate the progress in understanding the genetic control and environmental response of CRSA. Putting this potential into practice requires new methods and algorithms to analyze CRSA in digital images. Most prior approaches have solely focused on the estimation of root traits from images, yet no integrated platform exists that allows easy and intuitive access to trait extraction and analysis methods from images combined with storage solutions linked to metadata. Automated high-throughput phenotyping methods are increasingly used in laboratory-based efforts to link plant genotype with phenotype, whereas similar field-based studies remain predominantly manual low-throughput. Here, we present an open-source phenomics platform "DIRT", as a means to integrate scalable supercomputing architectures into field experiments and analysis pipelines. DIRT is an online platform that enables researchers to store images of plant roots, measure dicot and monocot root traits under field conditions, and share data and results within collaborative teams and the broader community. The DIRT platform seamlessly connects end-users with large-scale compute "commons" enabling the estimation and analysis of root phenotypes from field experiments of unprecedented size. DIRT is an automated high-throughput computing and collaboration platform for field based crop root phenomics. The platform is accessible at http://www.dirt.iplantcollaborative.org/ and hosted on the iPlant cyber-infrastructure using high-throughput grid computing resources of the Texas Advanced Computing Center (TACC). DIRT is a high volume central depository and high-throughput RSA trait computation platform for plant scientists working on crop roots. It enables scientists to store, manage and share crop root images with metadata and compute RSA traits from thousands of images in parallel. It makes high-throughput RSA trait computation available to the community with just a few button clicks. As such it enables plant scientists to spend more time on science rather than on technology. All stored and computed data is easily accessible to the public and broader scientific community. We hope that easy data accessibility will attract new tool developers and spur creative data usage that may even be applied to other fields of science.
High-throughput cell-based screening reveals a role for ZNF131 as a repressor of ERalpha signaling

PubMed Central

Han, Xiao; Guo, Jinhai; Deng, Weiwei; Zhang, Chenying; Du, Peige; Shi, Taiping; Ma, Dalong

2008-01-01

Background Estrogen receptor α (ERα) is a transcription factor whose activity is affected by multiple regulatory cofactors. In an effort to identify the human genes involved in the regulation of ERα, we constructed a high-throughput, cell-based, functional screening platform by linking a response element (ERE) with a reporter gene. This allowed the cellular activity of ERα, in cells cotransfected with the candidate gene, to be quantified in the presence or absence of its cognate ligand E2. Results From a library of 570 human cDNA clones, we identified zinc finger protein 131 (ZNF131) as a repressor of ERα mediated transactivation. ZNF131 is a typical member of the BTB/POZ family of transcription factors, and shows both ubiquitous expression and a high degree of sequence conservation. The luciferase reporter gene assay revealed that ZNF131 inhibits ligand-dependent transactivation by ERα in a dose-dependent manner. Electrophoretic mobility shift assay clearly demonstrated that the interaction between ZNF131 and ERα interrupts or prevents ERα binding to the estrogen response element (ERE). In addition, ZNF131 was able to suppress the expression of pS2, an ERα target gene. Conclusion We suggest that the functional screening platform we constructed can be applied for high-throughput genomic screening candidate ERα-related genes. This in turn may provide new insights into the underlying molecular mechanisms of ERα regulation in mammalian cells. PMID:18847501
Design and characterization of a nanopore-coupled polymerase for single-molecule DNA sequencing by synthesis on an electrode array

PubMed Central

Stranges, P. Benjamin; Palla, Mirkó; Kalachikov, Sergey; Nivala, Jeff; Dorwart, Michael; Trans, Andrew; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Tao, Chuanjuan; Morozova, Irina; Li, Zengmin; Shi, Shundi; Aberra, Aman; Arnold, Cleoma; Yang, Alexander; Aguirre, Anne; Harada, Eric T.; Korenblum, Daniel; Pollard, James; Bhat, Ashwini; Gremyachinskiy, Dmitriy; Bibillo, Arek; Chen, Roger; Davis, Randy; Russo, James J.; Fuller, Carl W.; Roever, Stefan; Ju, Jingyue; Church, George M.

2016-01-01

Scalable, high-throughput DNA sequencing is a prerequisite for precision medicine and biomedical research. Recently, we presented a nanopore-based sequencing-by-synthesis (Nanopore-SBS) approach, which used a set of nucleotides with polymer tags that allow discrimination of the nucleotides in a biological nanopore. Here, we designed and covalently coupled a DNA polymerase to an α-hemolysin (αHL) heptamer using the SpyCatcher/SpyTag conjugation approach. These porin–polymerase conjugates were inserted into lipid bilayers on a complementary metal oxide semiconductor (CMOS)-based electrode array for high-throughput electrical recording of DNA synthesis. The designed nanopore construct successfully detected the capture of tagged nucleotides complementary to a DNA base on a provided template. We measured over 200 tagged-nucleotide signals for each of the four bases and developed a classification method to uniquely distinguish them from each other and background signals. The probability of falsely identifying a background event as a true capture event was less than 1.2%. In the presence of all four tagged nucleotides, we observed sequential additions in real time during polymerase-catalyzed DNA synthesis. Single-polymerase coupling to a nanopore, in combination with the Nanopore-SBS approach, can provide the foundation for a low-cost, single-molecule, electronic DNA-sequencing platform. PMID:27729524
SEED 2: a user-friendly platform for amplicon high-throughput sequencing data analyses.

PubMed

Vetrovský, Tomáš; Baldrian, Petr; Morais, Daniel; Berger, Bonnie

2018-02-14

Modern molecular methods have increased our ability to describe microbial communities. Along with the advances brought by new sequencing technologies, we now require intensive computational resources to make sense of the large numbers of sequences continuously produced. The software developed by the scientific community to address this demand, although very useful, require experience of the command-line environment, extensive training and have steep learning curves, limiting their use. We created SEED 2, a graphical user interface for handling high-throughput amplicon-sequencing data under Windows operating systems. SEED 2 is the only sequence visualizer that empowers users with tools to handle amplicon-sequencing data of microbial community markers. It is suitable for any marker genes sequences obtained through Illumina, IonTorrent or Sanger sequencing. SEED 2 allows the user to process raw sequencing data, identify specific taxa, produce of OTU-tables, create sequence alignments and construct phylogenetic trees. Standard dual core laptops with 8 GB of RAM can handle ca. 8 million of Illumina PE 300 bp sequences, ca. 4GB of data. SEED 2 was implemented in Object Pascal and uses internal functions and external software for amplicon data processing. SEED 2 is a freeware software, available at http://www.biomed.cas.cz/mbu/lbwrf/seed/ as a self-contained file, including all the dependencies, and does not require installation. Supplementary data contain a comprehensive list of supported functions. daniel.morais@biomed.cas.cz. Supplementary data are available at Bioinformatics online. © The Author(s) 2018. Published by Oxford University Press.
PCR cycles above routine numbers do not compromise high-throughput DNA barcoding results.

PubMed

Vierna, J; Doña, J; Vizcaíno, A; Serrano, D; Jovani, R

2017-10-01

High-throughput DNA barcoding has become essential in ecology and evolution, but some technical questions still remain. Increasing the number of PCR cycles above the routine 20-30 cycles is a common practice when working with old-type specimens, which provide little amounts of DNA, or when facing annealing issues with the primers. However, increasing the number of cycles can raise the number of artificial mutations due to polymerase errors. In this work, we sequenced 20 COI libraries in the Illumina MiSeq platform. Libraries were prepared with 40, 45, 50, 55, and 60 PCR cycles from four individuals belonging to four species of four genera of cephalopods. We found no relationship between the number of PCR cycles and the number of mutations despite using a nonproofreading polymerase. Moreover, even when using a high number of PCR cycles, the resulting number of mutations was low enough not to be an issue in the context of high-throughput DNA barcoding (but may still remain an issue in DNA metabarcoding due to chimera formation). We conclude that the common practice of increasing the number of PCR cycles should not negatively impact the outcome of a high-throughput DNA barcoding study in terms of the occurrence of point mutations.
Current status and future perspectives on molecular and serological methods in diagnostic mycology.

PubMed

Lau, Anna; Chen, Sharon; Sleiman, Sue; Sorrell, Tania

2009-11-01

Invasive fungal infections are an important cause of infectious morbidity. Nonculture-based methods are increasingly used for rapid, accurate diagnosis to improve patient outcomes. New and existing DNA amplification platforms have high sensitivity and specificity for direct detection and identification of fungi in clinical specimens. Since laboratories are increasingly reliant on DNA sequencing for fungal identification, measures to improve sequence interpretation should support validation of reference isolates and quality control in public gene repositories. Novel technologies (e.g., isothermal and PNA FISH methods), platforms enabling high-throughput analyses (e.g., DNA microarrays and Luminex xMAP) and/or commercial PCR assays warrant further evaluation for routine diagnostic use. Notwithstanding the advantages of molecular tests, serological assays remain clinically useful for patient management. The serum Aspergillus galactomannan test has been incorporated into diagnostic algorithms of invasive aspergillosis. Both the galactomannan and the serum beta-D-glucan test have value for diagnosing infection and monitoring therapeutic response.
CisSERS: Customizable in silico sequence evaluation for restriction sites

DOE PAGES

Sharpe, Richard M.; Koepke, Tyson; Harper, Artemus; ...

2016-04-12

High-throughput sequencing continues to produce an immense volume of information that is processed and assembled into mature sequence data. Here, data analysis tools are urgently needed that leverage the embedded DNA sequence polymorphisms and consequent changes to restriction sites or sequence motifs in a high-throughput manner to enable biological experimentation. CisSERS was developed as a standalone open source tool to analyze sequence datasets and provide biologists with individual or comparative genome organization information in terms of presence and frequency of patterns or motifs such as restriction enzymes. Predicted agarose gel visualization of the custom analyses results was also integrated tomore » enhance the usefulness of the software. CisSERS offers several novel functionalities, such as handling of large and multiple datasets in parallel, multiple restriction enzyme site detection and custom motif detection features, which are seamlessly integrated with real time agarose gel visualization. Using a simple fasta-formatted file as input, CisSERS utilizes the REBASE enzyme database. Results from CisSERSenable the user to make decisions for designing genotyping by sequencing experiments, reduced representation sequencing, 3’UTR sequencing, and cleaved amplified polymorphic sequence (CAPS) molecular markers for large sample sets. CisSERS is a java based graphical user interface built around a perl backbone. Several of the applications of CisSERS including CAPS molecular marker development were successfully validated using wet-lab experimentation. Here, we present the tool CisSERSand results from in-silico and corresponding wet-lab analyses demonstrating that CisSERS is a technology platform solution that facilitates efficient data utilization in genomics and genetics studies.« less

CisSERS: Customizable in silico sequence evaluation for restriction sites

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sharpe, Richard M.; Koepke, Tyson; Harper, Artemus

High-throughput sequencing continues to produce an immense volume of information that is processed and assembled into mature sequence data. Here, data analysis tools are urgently needed that leverage the embedded DNA sequence polymorphisms and consequent changes to restriction sites or sequence motifs in a high-throughput manner to enable biological experimentation. CisSERS was developed as a standalone open source tool to analyze sequence datasets and provide biologists with individual or comparative genome organization information in terms of presence and frequency of patterns or motifs such as restriction enzymes. Predicted agarose gel visualization of the custom analyses results was also integrated tomore » enhance the usefulness of the software. CisSERS offers several novel functionalities, such as handling of large and multiple datasets in parallel, multiple restriction enzyme site detection and custom motif detection features, which are seamlessly integrated with real time agarose gel visualization. Using a simple fasta-formatted file as input, CisSERS utilizes the REBASE enzyme database. Results from CisSERSenable the user to make decisions for designing genotyping by sequencing experiments, reduced representation sequencing, 3’UTR sequencing, and cleaved amplified polymorphic sequence (CAPS) molecular markers for large sample sets. CisSERS is a java based graphical user interface built around a perl backbone. Several of the applications of CisSERS including CAPS molecular marker development were successfully validated using wet-lab experimentation. Here, we present the tool CisSERSand results from in-silico and corresponding wet-lab analyses demonstrating that CisSERS is a technology platform solution that facilitates efficient data utilization in genomics and genetics studies.« less
Deep Sequencing Analysis of RNAs from Citrus Plants Grown in a Citrus Sudden Death-Affected Area Reveals Diverse Known and Putative Novel Viruses.

PubMed

Matsumura, Emilyn E; Coletta-Filho, Helvecio D; Nouri, Shahideh; Falk, Bryce W; Nerva, Luca; Oliveira, Tiago S; Dorta, Silvia O; Machado, Marcos A

2017-04-24

Citrus sudden death (CSD) has caused the death of approximately four million orange trees in a very important citrus region in Brazil. Although its etiology is still not completely clear, symptoms and distribution of affected plants indicate a viral disease. In a search for viruses associated with CSD, we have performed a comparative high-throughput sequencing analysis of the transcriptome and small RNAs from CSD-symptomatic and -asymptomatic plants using the Illumina platform. The data revealed mixed infections that included Citrus tristeza virus (CTV) as the most predominant virus, followed by the Citrus sudden death-associated virus (CSDaV), Citrus endogenous pararetrovirus (CitPRV) and two putative novel viruses tentatively named Citrus jingmen-like virus (CJLV), and Citrus virga-like virus (CVLV). The deep sequencing analyses were sensitive enough to differentiate two genotypes of both viruses previously associated with CSD-affected plants: CTV and CSDaV. Our data also showed a putative association of the CSD-symptomatic plants with a specific CSDaV genotype and a likely association with CitPRV as well, whereas the two putative novel viruses showed to be more associated with CSD-asymptomatic plants. This is the first high-throughput sequencing-based study of the viral sequences present in CSD-affected citrus plants, and generated valuable information for further CSD studies.
Genome-wide mapping of DNase I hypersensitive sites in rare cell populations using single-cell DNase sequencing.

PubMed

Cooper, James; Ding, Yi; Song, Jiuzhou; Zhao, Keji

2017-11-01

Increased chromatin accessibility is a feature of cell-type-specific cis-regulatory elements; therefore, mapping of DNase I hypersensitive sites (DHSs) enables the detection of active regulatory elements of transcription, including promoters, enhancers, insulators and locus-control regions. Single-cell DNase sequencing (scDNase-seq) is a method of detecting genome-wide DHSs when starting with either single cells or <1,000 cells from primary cell sources. This technique enables genome-wide mapping of hypersensitive sites in a wide range of cell populations that cannot be analyzed using conventional DNase I sequencing because of the requirement for millions of starting cells. Fresh cells, formaldehyde-cross-linked cells or cells recovered from formalin-fixed paraffin-embedded (FFPE) tissue slides are suitable for scDNase-seq assays. To generate scDNase-seq libraries, cells are lysed and then digested with DNase I. Circular carrier plasmid DNA is included during subsequent DNA purification and library preparation steps to prevent loss of the small quantity of DHS DNA. Libraries are generated for high-throughput sequencing on the Illumina platform using standard methods. Preparation of scDNase-seq libraries requires only 2 d. The materials and molecular biology techniques described in this protocol should be accessible to any general molecular biology laboratory. Processing of high-throughput sequencing data requires basic bioinformatics skills and uses publicly available bioinformatics software.
BiQ Analyzer HT: locus-specific analysis of DNA methylation by high-throughput bisulfite sequencing

PubMed Central

Lutsik, Pavlo; Feuerbach, Lars; Arand, Julia; Lengauer, Thomas; Walter, Jörn; Bock, Christoph

2011-01-01

Bisulfite sequencing is a widely used method for measuring DNA methylation in eukaryotic genomes. The assay provides single-base pair resolution and, given sufficient sequencing depth, its quantitative accuracy is excellent. High-throughput sequencing of bisulfite-converted DNA can be applied either genome wide or targeted to a defined set of genomic loci (e.g. using locus-specific PCR primers or DNA capture probes). Here, we describe BiQ Analyzer HT (http://biq-analyzer-ht.bioinf.mpi-inf.mpg.de/), a user-friendly software tool that supports locus-specific analysis and visualization of high-throughput bisulfite sequencing data. The software facilitates the shift from time-consuming clonal bisulfite sequencing to the more quantitative and cost-efficient use of high-throughput sequencing for studying locus-specific DNA methylation patterns. In addition, it is useful for locus-specific visualization of genome-wide bisulfite sequencing data. PMID:21565797
Deciphering the genomic targets of alkylating polyamide conjugates using high-throughput sequencing

PubMed Central

Chandran, Anandhakumar; Syed, Junetha; Taylor, Rhys D.; Kashiwazaki, Gengo; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

2016-01-01

Chemically engineered small molecules targeting specific genomic sequences play an important role in drug development research. Pyrrole-imidazole polyamides (PIPs) are a group of molecules that can bind to the DNA minor-groove and can be engineered to target specific sequences. Their biological effects rely primarily on their selective DNA binding. However, the binding mechanism of PIPs at the chromatinized genome level is poorly understood. Herein, we report a method using high-throughput sequencing to identify the DNA-alkylating sites of PIP-indole-seco-CBI conjugates. High-throughput sequencing analysis of conjugate 2 showed highly similar DNA-alkylating sites on synthetic oligos (histone-free DNA) and on human genomes (chromatinized DNA context). To our knowledge, this is the first report identifying alkylation sites across genomic DNA by alkylating PIP conjugates using high-throughput sequencing. PMID:27098039
Identification of QTLs for 14 Agronomically Important Traits in Setaria italica Based on SNPs Generated from High-Throughput Sequencing

PubMed Central

Zhang, Kai; Fan, Guangyu; Zhang, Xinxin; Zhao, Fang; Wei, Wei; Du, Guohua; Feng, Xiaolei; Wang, Xiaoming; Wang, Feng; Song, Guoliang; Zou, Hongfeng; Zhang, Xiaolei; Li, Shuangdong; Ni, Xuemei; Zhang, Gengyun; Zhao, Zhihai

2017-01-01

Foxtail millet (Setaria italica) is an important crop possessing C4 photosynthesis capability. The S. italica genome was de novo sequenced in 2012, but the sequence lacked high-density genetic maps with agronomic and yield trait linkages. In the present study, we resequenced a foxtail millet population of 439 recombinant inbred lines (RILs) and developed high-resolution bin map and high-density SNP markers, which could provide an effective approach for gene identification. A total of 59 QTL for 14 agronomic traits in plants grown under long- and short-day photoperiods were identified. The phenotypic variation explained ranged from 4.9 to 43.94%. In addition, we suggested that there may be segregation distortion on chromosome 6 that is significantly distorted toward Zhang gu. The newly identified QTL will provide a platform for sequence-based research on the S. italica genome, and for molecular marker-assisted breeding. PMID:28364039
Identification of QTLs for 14 Agronomically Important Traits in Setaria italica Based on SNPs Generated from High-Throughput Sequencing.

PubMed

Zhang, Kai; Fan, Guangyu; Zhang, Xinxin; Zhao, Fang; Wei, Wei; Du, Guohua; Feng, Xiaolei; Wang, Xiaoming; Wang, Feng; Song, Guoliang; Zou, Hongfeng; Zhang, Xiaolei; Li, Shuangdong; Ni, Xuemei; Zhang, Gengyun; Zhao, Zhihai

2017-05-05

Foxtail millet ( Setaria italica ) is an important crop possessing C4 photosynthesis capability. The S. italica genome was de novo sequenced in 2012, but the sequence lacked high-density genetic maps with agronomic and yield trait linkages. In the present study, we resequenced a foxtail millet population of 439 recombinant inbred lines (RILs) and developed high-resolution bin map and high-density SNP markers, which could provide an effective approach for gene identification. A total of 59 QTL for 14 agronomic traits in plants grown under long- and short-day photoperiods were identified. The phenotypic variation explained ranged from 4.9 to 43.94%. In addition, we suggested that there may be segregation distortion on chromosome 6 that is significantly distorted toward Zhang gu. The newly identified QTL will provide a platform for sequence-based research on the S. italica genome, and for molecular marker-assisted breeding. Copyright © 2017 Zhang et al.
Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput.

PubMed

Gierahn, Todd M; Wadsworth, Marc H; Hughes, Travis K; Bryson, Bryan D; Butler, Andrew; Satija, Rahul; Fortune, Sarah; Love, J Christopher; Shalek, Alex K

2017-04-01

Single-cell RNA-seq can precisely resolve cellular states, but applying this method to low-input samples is challenging. Here, we present Seq-Well, a portable, low-cost platform for massively parallel single-cell RNA-seq. Barcoded mRNA capture beads and single cells are sealed in an array of subnanoliter wells using a semipermeable membrane, enabling efficient cell lysis and transcript capture. We use Seq-Well to profile thousands of primary human macrophages exposed to Mycobacterium tuberculosis.
NG6: Integrated next generation sequencing storage and processing environment.

PubMed

Mariette, Jérôme; Escudié, Frédéric; Allias, Nicolas; Salin, Gérald; Noirot, Céline; Thomas, Sylvain; Klopp, Christophe

2012-09-09

Next generation sequencing platforms are now well implanted in sequencing centres and some laboratories. Upcoming smaller scale machines such as the 454 junior from Roche or the MiSeq from Illumina will increase the number of laboratories hosting a sequencer. In such a context, it is important to provide these teams with an easily manageable environment to store and process the produced reads. We describe a user-friendly information system able to manage large sets of sequencing data. It includes, on one hand, a workflow environment already containing pipelines adapted to different input formats (sff, fasta, fastq and qseq), different sequencers (Roche 454, Illumina HiSeq) and various analyses (quality control, assembly, alignment, diversity studies,…) and, on the other hand, a secured web site giving access to the results. The connected user will be able to download raw and processed data and browse through the analysis result statistics. The provided workflows can easily be modified or extended and new ones can be added. Ergatis is used as a workflow building, running and monitoring system. The analyses can be run locally or in a cluster environment using Sun Grid Engine. NG6 is a complete information system designed to answer the needs of a sequencing platform. It provides a user-friendly interface to process, store and download high-throughput sequencing data.
The challenges of sequencing by synthesis.

PubMed

Fuller, Carl W; Middendorf, Lyle R; Benner, Steven A; Church, George M; Harris, Timothy; Huang, Xiaohua; Jovanovich, Stevan B; Nelson, John R; Schloss, Jeffery A; Schwartz, David C; Vezenov, Dmitri V

2009-11-01

DNA sequencing-by-synthesis (SBS) technology, using a polymerase or ligase enzyme as its core biochemistry, has already been incorporated in several second-generation DNA sequencing systems with significant performance. Notwithstanding the substantial success of these SBS platforms, challenges continue to limit the ability to reduce the cost of sequencing a human genome to $100,000 or less. Achieving dramatically reduced cost with enhanced throughput and quality will require the seamless integration of scientific and technological effort across disciplines within biochemistry, chemistry, physics and engineering. The challenges include sample preparation, surface chemistry, fluorescent labels, optimizing the enzyme-substrate system, optics, instrumentation, understanding tradeoffs of throughput versus accuracy, and read-length/phasing limitations. By framing these challenges in a manner accessible to a broad community of scientists and engineers, we hope to solicit input from the broader research community on means of accelerating the advancement of genome sequencing technology.
Improving molecular diagnosis in epilepsy by a dedicated high-throughput sequencing platform.

PubMed

Della Mina, Erika; Ciccone, Roberto; Brustia, Francesca; Bayindir, Baran; Limongelli, Ivan; Vetro, Annalisa; Iascone, Maria; Pezzoli, Laura; Bellazzi, Riccardo; Perotti, Gianfranco; De Giorgis, Valentina; Lunghi, Simona; Coppola, Giangennaro; Orcesi, Simona; Merli, Pietro; Savasta, Salvatore; Veggiotti, Pierangelo; Zuffardi, Orsetta

2015-03-01

We analyzed by next-generation sequencing (NGS) 67 epilepsy genes in 19 patients with different types of either isolated or syndromic epileptic disorders and in 15 controls to investigate whether a quick and cheap molecular diagnosis could be provided. The average number of nonsynonymous and splice site mutations per subject was similar in the two cohorts indicating that, even with relatively small targeted platforms, finding the disease gene is not an univocal process. Our diagnostic yield was 47% with nine cases in which we identified a very likely causative mutation. In most of them no interpretation would have been possible in absence of detailed phenotype and familial information. Seven out of 19 patients had a phenotype suggesting the involvement of a specific gene. Disease-causing mutations were found in six of these cases. Among the remaining patients, we could find a probably causative mutation only in three. None of the genes affected in the latter cases had been suspected a priori. Our protocol requires 8-10 weeks including the investigation of the parents with a cost per patient comparable to sequencing of 1-2 medium-to-large-sized genes by conventional techniques. The platform we used, although providing much less information than whole-exome or whole-genome sequencing, has the advantage that can also be run on 'benchtop' sequencers combining rapid turnaround times with higher manageability.
Species classifier choice is a key consideration when analysing low-complexity food microbiome data.

PubMed

Walsh, Aaron M; Crispie, Fiona; O'Sullivan, Orla; Finnegan, Laura; Claesson, Marcus J; Cotter, Paul D

2018-03-20

The use of shotgun metagenomics to analyse low-complexity microbial communities in foods has the potential to be of considerable fundamental and applied value. However, there is currently no consensus with respect to choice of species classification tool, platform, or sequencing depth. Here, we benchmarked the performances of three high-throughput short-read sequencing platforms, the Illumina MiSeq, NextSeq 500, and Ion Proton, for shotgun metagenomics of food microbiota. Briefly, we sequenced six kefir DNA samples and a mock community DNA sample, the latter constructed by evenly mixing genomic DNA from 13 food-related bacterial species. A variety of bioinformatic tools were used to analyse the data generated, and the effects of sequencing depth on these analyses were tested by randomly subsampling reads. Compositional analysis results were consistent between the platforms at divergent sequencing depths. However, we observed pronounced differences in the predictions from species classification tools. Indeed, PERMANOVA indicated that there was no significant differences between the compositional results generated by the different sequencers (p = 0.693, R 2 = 0.011), but there was a significant difference between the results predicted by the species classifiers (p = 0.01, R 2 = 0.127). The relative abundances predicted by the classifiers, apart from MetaPhlAn2, were apparently biased by reference genome sizes. Additionally, we observed varying false-positive rates among the classifiers. MetaPhlAn2 had the lowest false-positive rate, whereas SLIMM had the greatest false-positive rate. Strain-level analysis results were also similar across platforms. Each platform correctly identified the strains present in the mock community, but accuracy was improved slightly with greater sequencing depth. Notably, PanPhlAn detected the dominant strains in each kefir sample above 500,000 reads per sample. Again, the outputs from functional profiling analysis using SUPER-FOCUS were generally accordant between the platforms at different sequencing depths. Finally, and expectedly, metagenome assembly completeness was significantly lower on the MiSeq than either on the NextSeq (p = 0.03) or the Proton (p = 0.011), and it improved with increased sequencing depth. Our results demonstrate a remarkable similarity in the results generated by the three sequencing platforms at different sequencing depths, and, in fact, the choice of bioinformatics methodology had a more evident impact on results than the choice of sequencer did.
Robo-Lector - a novel platform for automated high-throughput cultivations in microtiter plates with high information content.

PubMed

Huber, Robert; Ritter, Daniel; Hering, Till; Hillmer, Anne-Kathrin; Kensy, Frank; Müller, Carsten; Wang, Le; Büchs, Jochen

2009-08-01

In industry and academic research, there is an increasing demand for flexible automated microfermentation platforms with advanced sensing technology. However, up to now, conventional platforms cannot generate continuous data in high-throughput cultivations, in particular for monitoring biomass and fluorescent proteins. Furthermore, microfermentation platforms are needed that can easily combine cost-effective, disposable microbioreactors with downstream processing and analytical assays. To meet this demand, a novel automated microfermentation platform consisting of a BioLector and a liquid-handling robot (Robo-Lector) was sucessfully built and tested. The BioLector provides a cultivation system that is able to permanently monitor microbial growth and the fluorescence of reporter proteins under defined conditions in microtiter plates. Three examplary methods were programed on the Robo-Lector platform to study in detail high-throughput cultivation processes and especially recombinant protein expression. The host/vector system E. coli BL21(DE3) pRhotHi-2-EcFbFP, expressing the fluorescence protein EcFbFP, was hereby investigated. With the method 'induction profiling' it was possible to conduct 96 different induction experiments (varying inducer concentrations from 0 to 1.5 mM IPTG at 8 different induction times) simultaneously in an automated way. The method 'biomass-specific induction' allowed to automatically induce cultures with different growth kinetics in a microtiter plate at the same biomass concentration, which resulted in a relative standard deviation of the EcFbFP production of only +/- 7%. The third method 'biomass-specific replication' enabled to generate equal initial biomass concentrations in main cultures from precultures with different growth kinetics. This was realized by automatically transferring an appropiate inoculum volume from the different preculture microtiter wells to respective wells of the main culture plate, where subsequently similar growth kinetics could be obtained. The Robo-Lector generates extensive kinetic data in high-throughput cultivations, particularly for biomass and fluorescence protein formation. Based on the non-invasive on-line-monitoring signals, actions of the liquid-handling robot can easily be triggered. This interaction between the robot and the BioLector (Robo-Lector) combines high-content data generation with systematic high-throughput experimentation in an automated fashion, offering new possibilities to study biological production systems. The presented platform uses a standard liquid-handling workstation with widespread automation possibilities. Thus, high-throughput cultivations can now be combined with small-scale downstream processing techniques and analytical assays. Ultimately, this novel versatile platform can accelerate and intensify research and development in the field of systems biology as well as modelling and bioprocess optimization.
High-Throughput Sequencing of Germline and Tumor From Men with Early-Onset Metastatic Prostate Cancer

DTIC Science & Technology

2016-12-01

AWARD NUMBER: W81XWH-13-1-0371 TITLE: High-Throughput Sequencing of Germline and Tumor From Men with Early- Onset Metastatic Prostate Cancer...DATES COVERED 30 Sep 2013 - 29 Sep 2016 4. TITLE AND SUBTITLE 5a. CONTRACT NUMBER High-Throughput Sequencing of Germline and Tumor From Men with...presenting with metastatic prostate cancer at a young age (before age 60 years). Whole exome sequencing identified a panel of germline variants that have
High-throughput sequencing of the chloroplast and mitochondrion of Chlamydomonas reinhardtii to generate improved de novo assemblies, analyze expression patterns and transcript speciation, and evaluate diversity among laboratory strains and wild isolates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gallaher, Sean D.; Fitz-Gibbon, Sorel T.; Strenkert, Daniela

Chlamydomonas reinhardtii is a unicellular chlorophyte alga that is widely studied as a reference organism for understanding photosynthesis, sensory and motile cilia, and for development of an algal-based platform for producing biofuels and bio-products. Its highly repetitive, ~205-kbp circular chloroplast genome and ~15.8-kbp linear mitochondrial genome were sequenced prior to the advent of high-throughput sequencing technologies. Here, high coverage shotgun sequencing was used to assemble both organellar genomes de novo. These new genomes correct dozens of errors in the prior genome sequences and annotations. Gen-ome sequencing coverage indicates that each cell contains on average 83 copies of the chloroplast genomemore » and 130 copies of the mitochondrial genome. Using protocols and analyses optimized for organellar tran-scripts, RNA-Seq was used to quantify their relative abundances across 12 different growth conditions. Forty-six percent of total cellular mRNA is attributable to high expression from a few dozen chloroplast genes. RNA-Seq data were used to guide gene annotation, to demonstrate polycistronic gene expression, and to quantify splicing of psaA and psbA introns. In contrast to a conclusion from a recent study, we found that chloroplast transcripts are not edited. Unexpectedly, cytosine-rich polynucleotide tails were observed at the 3’-end of all mitochondrial transcripts. A comparative genomics analysis of eight laboratory strains and 11 wild isolates of C. reinhardtii identified 2658 variants in the organellargenomes, which is 1/10th as much genetic diversity as is found in the nucleus.« less
Novel throughput phenotyping platforms in plant genetic studies.

PubMed

Montes, Juan M; Melchinger, Albrecht E; Reif, Jochen C

2007-10-01

Unraveling the genetic basis of complex traits in plants is limited by the lack of appropriate phenotyping platforms that enable high-throughput screening of many genotypes in multilocation field trials. Near-infrared spectroscopy on agricultural harvesters and spectral reflectance of plant canopies have recently been reported as promising components of novel phenotyping platforms. Understanding the genetic basis of complex traits is now within reach with the use of these new techniques.
Advances in high throughput DNA sequence data compression.

PubMed

Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz

2016-06-01

Advances in high throughput sequencing technologies and reduction in cost of sequencing have led to exponential growth in high throughput DNA sequence data. This growth has posed challenges such as storage, retrieval, and transmission of sequencing data. Data compression is used to cope with these challenges. Various methods have been developed to compress genomic and sequencing data. In this article, we present a comprehensive review of compression methods for genome and reads compression. Algorithms are categorized as referential or reference free. Experimental results and comparative analysis of various methods for data compression are presented. Finally, key challenges and research directions in DNA sequence data compression are highlighted.
Illumina MiSeq Sequencing for Preliminary Analysis of Microbiome Causing Primary Endodontic Infections in Egypt

PubMed Central

Azab, Marwa Mohamed; Fayyad, Dalia Mukhtar

2018-01-01

The use of high throughput next generation technologies has allowed more comprehensive analysis than traditional Sanger sequencing. The specific aim of this study was to investigate the microbial diversity of primary endodontic infections using Illumina MiSeq sequencing platform in Egyptian patients. Samples were collected from 19 patients in Suez Canal University Hospital (Endodontic Department) using sterile # 15K file and paper points. DNA was extracted using Mo Bio power soil DNA isolation extraction kit followed by PCR amplification and agarose gel electrophoresis. The microbiome was characterized on the basis of the V3 and V4 hypervariable region of the 16S rRNA gene by using paired-end sequencing on Illumina MiSeq device. MOTHUR software was used in sequence filtration and analysis of sequenced data. A total of 1858 operational taxonomic units at 97% similarity were assigned to 26 phyla, 245 families, and 705 genera. Four main phyla Firmicutes, Bacteroidetes, Proteobacteria, and Synergistetes were predominant in all samples. At genus level, Prevotella, Bacillus, Porphyromonas, Streptococcus, and Bacteroides were the most abundant. Illumina MiSeq platform sequencing can be used to investigate oral microbiome composition of endodontic infections. Elucidating the ecology of endodontic infections is a necessary step in developing effective intracanal antimicrobials. PMID:29849646
BioVLAB-MMIA-NGS: microRNA-mRNA integrated analysis using high-throughput sequencing data.

PubMed

Chae, Heejoon; Rhee, Sungmin; Nephew, Kenneth P; Kim, Sun

2015-01-15

It is now well established that microRNAs (miRNAs) play a critical role in regulating gene expression in a sequence-specific manner, and genome-wide efforts are underway to predict known and novel miRNA targets. However, the integrated miRNA-mRNA analysis remains a major computational challenge, requiring powerful informatics systems and bioinformatics expertise. The objective of this study was to modify our widely recognized Web server for the integrated mRNA-miRNA analysis (MMIA) and its subsequent deployment on the Amazon cloud (BioVLAB-MMIA) to be compatible with high-throughput platforms, including next-generation sequencing (NGS) data (e.g. RNA-seq). We developed a new version called the BioVLAB-MMIA-NGS, deployed on both Amazon cloud and on a high-performance publicly available server called MAHA. By using NGS data and integrating various bioinformatics tools and databases, BioVLAB-MMIA-NGS offers several advantages. First, sequencing data is more accurate than array-based methods for determining miRNA expression levels. Second, potential novel miRNAs can be detected by using various computational methods for characterizing miRNAs. Third, because miRNA-mediated gene regulation is due to hybridization of an miRNA to its target mRNA, sequencing data can be used to identify many-to-many relationship between miRNAs and target genes with high accuracy. http://epigenomics.snu.ac.kr/biovlab_mmia_ngs/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Development and evaluation of the first high-throughput SNP array for common carp (Cyprinus carpio)

PubMed Central

2014-01-01

Background A large number of single nucleotide polymorphisms (SNPs) have been identified in common carp (Cyprinus carpio) but, as yet, no high-throughput genotyping platform is available for this species. C. carpio is an important aquaculture species that accounts for nearly 14% of freshwater aquaculture production worldwide. We have developed an array for C. carpio with 250,000 SNPs and evaluated its performance using samples from various strains of C. carpio. Results The SNPs used on the array were selected from two resources: the transcribed sequences from RNA-seq data of four strains of C. carpio, and the genome re-sequencing data of five strains of C. carpio. The 250,000 SNPs on the resulting array are distributed evenly across the reference C.carpio genome with an average spacing of 6.6 kb. To evaluate the SNP array, 1,072 C. carpio samples were collected and tested. Of the 250,000 SNPs on the array, 185,150 (74.06%) were found to be polymorphic sites. Genotyping accuracy was checked using genotyping data from a group of full-siblings and their parents, and over 99.8% of the qualified SNPs were found to be reliable. Analysis of the linkage disequilibrium on all samples and on three domestic C.carpio strains revealed that the latter had the longer haplotype blocks. We also evaluated our SNP array on 80 samples from eight species related to C. carpio, with from 53,526 to 71,984 polymorphic SNPs. An identity by state analysis divided all the samples into three clusters; most of the C. carpio strains formed the largest cluster. Conclusions The Carp SNP array described here is the first high-throughput genotyping platform for C. carpio. Our evaluation of this array indicates that it will be valuable for farmed carp and for genetic and population biology studies in C. carpio and related species. PMID:24762296

Development and evaluation of the first high-throughput SNP array for common carp (Cyprinus carpio).

PubMed

Xu, Jian; Zhao, Zixia; Zhang, Xiaofeng; Zheng, Xianhu; Li, Jiongtang; Jiang, Yanliang; Kuang, Youyi; Zhang, Yan; Feng, Jianxin; Li, Chuangju; Yu, Juhua; Li, Qiang; Zhu, Yuanyuan; Liu, Yuanyuan; Xu, Peng; Sun, Xiaowen

2014-04-24

A large number of single nucleotide polymorphisms (SNPs) have been identified in common carp (Cyprinus carpio) but, as yet, no high-throughput genotyping platform is available for this species. C. carpio is an important aquaculture species that accounts for nearly 14% of freshwater aquaculture production worldwide. We have developed an array for C. carpio with 250,000 SNPs and evaluated its performance using samples from various strains of C. carpio. The SNPs used on the array were selected from two resources: the transcribed sequences from RNA-seq data of four strains of C. carpio, and the genome re-sequencing data of five strains of C. carpio. The 250,000 SNPs on the resulting array are distributed evenly across the reference C.carpio genome with an average spacing of 6.6 kb. To evaluate the SNP array, 1,072 C. carpio samples were collected and tested. Of the 250,000 SNPs on the array, 185,150 (74.06%) were found to be polymorphic sites. Genotyping accuracy was checked using genotyping data from a group of full-siblings and their parents, and over 99.8% of the qualified SNPs were found to be reliable. Analysis of the linkage disequilibrium on all samples and on three domestic C.carpio strains revealed that the latter had the longer haplotype blocks. We also evaluated our SNP array on 80 samples from eight species related to C. carpio, with from 53,526 to 71,984 polymorphic SNPs. An identity by state analysis divided all the samples into three clusters; most of the C. carpio strains formed the largest cluster. The Carp SNP array described here is the first high-throughput genotyping platform for C. carpio. Our evaluation of this array indicates that it will be valuable for farmed carp and for genetic and population biology studies in C. carpio and related species.
Reverse Genetics and High Throughput Sequencing Methodologies for Plant Functional Genomics

PubMed Central

Ben-Amar, Anis; Daldoul, Samia; Reustle, Götz M.; Krczal, Gabriele; Mliki, Ahmed

2016-01-01

In the post-genomic era, increasingly sophisticated genetic tools are being developed with the long-term goal of understanding how the coordinated activity of genes gives rise to a complex organism. With the advent of the next generation sequencing associated with effective computational approaches, wide variety of plant species have been fully sequenced giving a wealth of data sequence information on structure and organization of plant genomes. Since thousands of gene sequences are already known, recently developed functional genomics approaches provide powerful tools to analyze plant gene functions through various gene manipulation technologies. Integration of different omics platforms along with gene annotation and computational analysis may elucidate a complete view in a system biology level. Extensive investigations on reverse genetics methodologies were deployed for assigning biological function to a specific gene or gene product. We provide here an updated overview of these high throughout strategies highlighting recent advances in the knowledge of functional genomics in plants. PMID:28217003
A Fully Automated Microfluidic Femtosecond Laser Axotomy Platform for Nerve Regeneration Studies in C. elegans

PubMed Central

Gokce, Sertan Kutal; Guo, Samuel X.; Ghorashian, Navid; Everett, W. Neil; Jarrell, Travis; Kottek, Aubri; Bovik, Alan C.; Ben-Yakar, Adela

2014-01-01

Femtosecond laser nanosurgery has been widely accepted as an axonal injury model, enabling nerve regeneration studies in the small model organism, Caenorhabditis elegans. To overcome the time limitations of manual worm handling techniques, automation and new immobilization technologies must be adopted to improve throughput in these studies. While new microfluidic immobilization techniques have been developed that promise to reduce the time required for axotomies, there is a need for automated procedures to minimize the required amount of human intervention and accelerate the axotomy processes crucial for high-throughput. Here, we report a fully automated microfluidic platform for performing laser axotomies of fluorescently tagged neurons in living Caenorhabditis elegans. The presented automation process reduces the time required to perform axotomies within individual worms to ∼17 s/worm, at least one order of magnitude faster than manual approaches. The full automation is achieved with a unique chip design and an operation sequence that is fully computer controlled and synchronized with efficient and accurate image processing algorithms. The microfluidic device includes a T-shaped architecture and three-dimensional microfluidic interconnects to serially transport, position, and immobilize worms. The image processing algorithms can identify and precisely position axons targeted for ablation. There were no statistically significant differences observed in reconnection probabilities between axotomies carried out with the automated system and those performed manually with anesthetics. The overall success rate of automated axotomies was 67.4±3.2% of the cases (236/350) at an average processing rate of 17.0±2.4 s. This fully automated platform establishes a promising methodology for prospective genome-wide screening of nerve regeneration in C. elegans in a truly high-throughput manner. PMID:25470130
High-throughput sequence alignment using Graphics Processing Units

PubMed Central

Schatz, Michael C; Trapnell, Cole; Delcher, Arthur L; Varshney, Amitabh

2007-01-01

Background The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. Results This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. Conclusion MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU. PMID:18070356
Molecular characterization of a novel Nucleorhabdovirus from black currant identified by high-throughput sequencing

USDA-ARS?s Scientific Manuscript database

Contigs with sequence similarities to several nucleorhabdoviruses were identified by high-throughput sequencing analysis from a black currant (Ribes nigrum L.) cultivar. The complete genomic sequence of this new nucleorhabdovirus is 14,432 nucleotides. Its genomic organization is typical of nucleorh...
High Throughput Plasmid Sequencing with Illumina and CLC Bio (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

ScienceCinema

Athavale, Ajay

2018-01-04

Ajay Athavale (Monsanto) presents "High Throughput Plasmid Sequencing with Illumina and CLC Bio" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.
Microarray platform affords improved product analysis in mammalian cell growth studies

PubMed Central

Li, Lingyun; Migliore, Nicole; Schaefer, Eugene; Sharfstein, Susan T.; Dordick, Jonathan S.; Linhardt, Robert J.

2014-01-01

High throughput (HT) platforms serve as cost-efficient and rapid screening method for evaluating the effect of cell culture conditions and screening of chemicals. The aim of the current study was to develop a high-throughput cell-based microarray platform to assess the effect of culture conditions on Chinese hamster ovary (CHO) cells. Specifically, growth, transgene expression and metabolism of a GS/MSX CHO cell line, which produces a therapeutic monoclonal antibody, was examined using microarray system in conjunction with conventional shake flask platform in a non-proprietary medium. The microarray system consists of 60 nl spots of cells encapsulated in alginate and separated in groups via an 8-well chamber system attached to the chip. Results show the non-proprietary medium developed allows cell growth, production and normal glycosylation of recombinant antibody and metabolism of the recombinant CHO cells in both the microarray and shake flask platforms. In addition, 10.3 mM glutamate addition to the defined base media results in lactate metabolism shift in the recombinant GS/MSX CHO cells in the shake flask platform. Ultimately, the results demonstrate that the high-throughput microarray platform has the potential to be utilized for evaluating the impact of media additives on cellular processes, such as, cell growth, metabolism and productivity. PMID:24227746
High-Throughput Mechanobiology Screening Platform Using Micro- and Nanotopography.

PubMed

Hu, Junqiang; Gondarenko, Alexander A; Dang, Alex P; Bashour, Keenan T; O'Connor, Roddy S; Lee, Sunwoo; Liapis, Anastasia; Ghassemi, Saba; Milone, Michael C; Sheetz, Michael P; Dustin, Michael L; Kam, Lance C; Hone, James C

2016-04-13

We herein demonstrate the first 96-well plate platform to screen effects of micro- and nanotopographies on cell growth and proliferation. Existing high-throughput platforms test a limited number of factors and are not fully compatible with multiple types of testing and assays. This platform is compatible with high-throughput liquid handling, high-resolution imaging, and all multiwell plate-based instrumentation. We use the platform to screen for topographies and drug-topography combinations that have short- and long-term effects on T cell activation and proliferation. We coated nanofabricated "trench-grid" surfaces with anti-CD3 and anti-CD28 antibodies to activate T cells and assayed for interleukin 2 (IL-2) cytokine production. IL-2 secretion was enhanced at 200 nm trench width and >2.3 μm grating pitch; however, the secretion was suppressed at 100 nm width and <0.5 μm pitch. The enhancement on 200 nm grid trench was further amplified with the addition of blebbistatin to reduce contractility. The 200 nm grid pattern was found to triple the number of T cells in long-term expansion, a result with direct clinical applicability in adoptive immunotherapy.
Proteome Studies of Filamentous Fungi

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, Scott E.; Panisko, Ellen A.

2011-04-20

The continued fast pace of fungal genome sequence generation has enabled proteomic analysis of a wide breadth of organisms that span the breadth of the Kingdom Fungi. There is some phylogenetic bias to the current catalog of fungi with reasonable DNA sequence databases (genomic or EST) that could be analyzed at a global proteomic level. However, the rapid development of next generation sequencing platforms has lowered the cost of genome sequencing such that in the near future, having a genome sequence will no longer be a time or cost bottleneck for downstream proteomic (and transcriptomic) analyses. High throughput, non-gel basedmore » proteomics offers a snapshot of proteins present in a given sample at a single point in time. There are a number of different variations on the general method and technologies for identifying peptides in a given sample. We present a method that can serve as a “baseline” for proteomic studies of fungi.« less
Proteome studies of filamentous fungi.

PubMed

Baker, Scott E; Panisko, Ellen A

2011-01-01

The continued fast pace of fungal genome sequence generation has enabled proteomic analysis of a wide variety of organisms that span the breadth of the Kingdom Fungi. There is some phylogenetic bias to the current catalog of fungi with reasonable DNA sequence databases (genomic or EST) that could be analyzed at a global proteomic level. However, the rapid development of next generation sequencing platforms has lowered the cost of genome sequencing such that in the near future, having a genome sequence will no longer be a time or cost bottleneck for downstream proteomic (and transcriptomic) analyses. High throughput, nongel-based proteomics offers a snapshot of proteins present in a given sample at a single point in time. There are a number of variations on the general methods and technologies for identifying peptides in a given sample. We present a method that can serve as a "baseline" for proteomic studies of fungi.
Development and preliminary evaluation of a multiplexed amplification and next generation sequencing method for viral hemorrhagic fever diagnostics

PubMed Central

Radonić, Aleksandar; Kocak Tufan, Zeliha; Domingo, Cristina

2017-01-01

Background We describe the development and evaluation of a novel method for targeted amplification and Next Generation Sequencing (NGS)-based identification of viral hemorrhagic fever (VHF) agents and assess the feasibility of this approach in diagnostics. Methodology An ultrahigh-multiplex panel was designed with primers to amplify all known variants of VHF-associated viruses and relevant controls. The performance of the panel was evaluated via serially quantified nucleic acids from Yellow fever virus, Rift Valley fever virus, Crimean-Congo hemorrhagic fever (CCHF) virus, Ebola virus, Junin virus and Chikungunya virus in a semiconductor-based sequencing platform. A comparison of direct NGS and targeted amplification-NGS was performed. The panel was further tested via a real-time nanopore sequencing-based platform, using clinical specimens from CCHF patients. Principal findings The multiplex primer panel comprises two pools of 285 and 256 primer pairs for the identification of 46 virus species causing hemorrhagic fevers, encompassing 6,130 genetic variants of the strains involved. In silico validation revealed that the panel detected over 97% of all known genetic variants of the targeted virus species. High levels of specificity and sensitivity were observed for the tested virus strains. Targeted amplification ensured viral read detection in specimens with the lowest virus concentration (1–10 genome equivalents) and enabled significant increases in specific reads over background for all viruses investigated. In clinical specimens, the panel enabled detection of the causative agent and its characterization within 10 minutes of sequencing, with sample-to-result time of less than 3.5 hours. Conclusions Virus enrichment via targeted amplification followed by NGS is an applicable strategy for the diagnosis of VHFs which can be adapted for high-throughput or nanopore sequencing platforms and employed for surveillance or outbreak monitoring. PMID:29155823
Development and preliminary evaluation of a multiplexed amplification and next generation sequencing method for viral hemorrhagic fever diagnostics.

PubMed

Brinkmann, Annika; Ergünay, Koray; Radonić, Aleksandar; Kocak Tufan, Zeliha; Domingo, Cristina; Nitsche, Andreas

2017-11-01

We describe the development and evaluation of a novel method for targeted amplification and Next Generation Sequencing (NGS)-based identification of viral hemorrhagic fever (VHF) agents and assess the feasibility of this approach in diagnostics. An ultrahigh-multiplex panel was designed with primers to amplify all known variants of VHF-associated viruses and relevant controls. The performance of the panel was evaluated via serially quantified nucleic acids from Yellow fever virus, Rift Valley fever virus, Crimean-Congo hemorrhagic fever (CCHF) virus, Ebola virus, Junin virus and Chikungunya virus in a semiconductor-based sequencing platform. A comparison of direct NGS and targeted amplification-NGS was performed. The panel was further tested via a real-time nanopore sequencing-based platform, using clinical specimens from CCHF patients. The multiplex primer panel comprises two pools of 285 and 256 primer pairs for the identification of 46 virus species causing hemorrhagic fevers, encompassing 6,130 genetic variants of the strains involved. In silico validation revealed that the panel detected over 97% of all known genetic variants of the targeted virus species. High levels of specificity and sensitivity were observed for the tested virus strains. Targeted amplification ensured viral read detection in specimens with the lowest virus concentration (1-10 genome equivalents) and enabled significant increases in specific reads over background for all viruses investigated. In clinical specimens, the panel enabled detection of the causative agent and its characterization within 10 minutes of sequencing, with sample-to-result time of less than 3.5 hours. Virus enrichment via targeted amplification followed by NGS is an applicable strategy for the diagnosis of VHFs which can be adapted for high-throughput or nanopore sequencing platforms and employed for surveillance or outbreak monitoring.
The application of the high throughput sequencing technology in the transposable elements.

PubMed

Liu, Zhen; Xu, Jian-hong

2015-09-01

High throughput sequencing technology has dramatically improved the efficiency of DNA sequencing, and decreased the costs to a great extent. Meanwhile, this technology usually has advantages of better specificity, higher sensitivity and accuracy. Therefore, it has been applied to the research on genetic variations, transcriptomics and epigenomics. Recently, this technology has been widely employed in the studies of transposable elements and has achieved fruitful results. In this review, we summarize the application of high throughput sequencing technology in the fields of transposable elements, including the estimation of transposon content, preference of target sites and distribution, insertion polymorphism and population frequency, identification of rare copies, transposon horizontal transfers as well as transposon tagging. We also briefly introduce the major common sequencing strategies and algorithms, their advantages and disadvantages, and the corresponding solutions. Finally, we envision the developing trends of high throughput sequencing technology, especially the third generation sequencing technology, and its application in transposon studies in the future, hopefully providing a comprehensive understanding and reference for related scientific researchers.
Molecular characterization of a novel Luteovirus from peach identified by high-throughput sequencing

USDA-ARS?s Scientific Manuscript database

Contigs with sequence homologies to Cherry-associated luteovirus were identified by high-throughput sequencing analysis of two peach accessions undergoing quarantine testing. The complete genomic sequences of the two isolates of this virus are 5,819 and 5,814 nucleotides. Their genome organization i...
Gene expression profiling of flax (Linum usitatissimum L.) under edaphic stress.

PubMed

Dmitriev, Alexey A; Kudryavtseva, Anna V; Krasnov, George S; Koroban, Nadezhda V; Speranskaya, Anna S; Krinitsina, Anastasia A; Belenikin, Maxim S; Snezhkina, Anastasiya V; Sadritdinova, Asiya F; Kishlyan, Natalya V; Rozhmina, Tatiana A; Yurkevich, Olga Yu; Muravenko, Olga V; Bolsheva, Nadezhda L; Melnikova, Nataliya V

2016-11-16

Cultivated flax (Linum usitatissimum L.) is widely used for production of textile, food, chemical and pharmaceutical products. However, various stresses decrease flax production. Search for genes, which are involved in stress response, is necessary for breeding of adaptive cultivars. Imbalanced concentration of nutrient elements in soil decrease flax yields and also results in heritable changes in some flax lines. The appearance of Linum Insertion Sequence 1 (LIS-1) is the most studied modification. However, LIS-1 function is still unclear. High-throughput sequencing of transcriptome of flax plants grown under normal (N), phosphate deficient (P), and nutrient excess (NPK) conditions was carried out using Illumina platform. The assembly of transcriptome was performed, and a total of 34924, 33797, and 33698 unique transcripts for N, P, and NPK sequencing libraries were identified, respectively. We have not revealed any LIS-1 derived mRNA in our sequencing data. The analysis of high-throughput sequencing data allowed us to identify genes with potentially differential expression under imbalanced nutrition. For further investigation with qPCR, 15 genes were chosen and their expression levels were evaluated in the extended sampling of 31 flax plants. Significant expression alterations were revealed for genes encoding WRKY and JAZ protein families under P and NPK conditions. Moreover, the alterations of WRKY family genes differed depending on LIS-1 presence in flax plant genome. Besides, we revealed slight and LIS-1 independent mRNA level changes of KRP2 and ING1 genes, which are adjacent to LIS-1, under nutrition stress. Differentially expressed genes were identified in flax plants, which were grown under phosphate deficiency and excess nutrition, on the basis of high-throughput sequencing and qPCR data. We showed that WRKY and JAS gene families participate in flax response to imbalanced nutrient content in soil. Besides, we have not identified any mRNA, which could be derived from LIS-1, in our transcriptome sequencing data. Expression of LIS-1 flanking genes, ING1 and KRP2, was suggested not to be nutrient stress-induced. Obtained results provide new insights into edaphic stress response in flax and the role of LIS-1 in these process.
Identification of Sequence Specificity of 5-Methylcytosine Oxidation by Tet1 Protein with High-Throughput Sequencing.

PubMed

Kizaki, Seiichiro; Chandran, Anandhakumar; Sugiyama, Hiroshi

2016-03-02

Tet (ten-eleven translocation) family proteins have the ability to oxidize 5-methylcytosine (mC) to 5-hydroxymethylcytosine (hmC), 5-formylcytosine (fC), and 5-carboxycytosine (caC). However, the oxidation reaction of Tet is not understood completely. Evaluation of genomic-level epigenetic changes by Tet protein requires unbiased identification of the highly selective oxidation sites. In this study, we used high-throughput sequencing to investigate the sequence specificity of mC oxidation by Tet1. A 6.6×10(4) -member mC-containing random DNA-sequence library was constructed. The library was subjected to Tet-reactive pulldown followed by high-throughput sequencing. Analysis of the obtained sequence data identified the Tet1-reactive sequences. We identified mCpG as a highly reactive sequence of Tet1 protein. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Label-Free Discovery Array Platform for the Characterization of Glycan Binding Proteins and Glycoproteins.

PubMed

Gray, Christopher J; Sánchez-Ruíz, Antonio; Šardzíková, Ivana; Ahmed, Yassir A; Miller, Rebecca L; Reyes Martinez, Juana E; Pallister, Edward; Huang, Kun; Both, Peter; Hartmann, Mirja; Roberts, Hannah N; Šardzík, Robert; Mandal, Santanu; Turnbull, Jerry E; Eyers, Claire E; Flitsch, Sabine L

2017-04-18

The identification of carbohydrate-protein interactions is central to our understanding of the roles of cell-surface carbohydrates (the glycocalyx), fundamental for cell-recognition events. Therefore, there is a need for fast high-throughput biochemical tools to capture the complexity of these biological interactions. Here, we describe a rapid method for qualitative label-free detection of carbohydrate-protein interactions on arrays of simple synthetic glycans, more complex natural glycosaminoglycans (GAG), and lectins/carbohydrate binding proteins using matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry. The platform can unequivocally identify proteins that are captured from either purified or complex sample mixtures, including biofluids. Identification of proteins bound to the functionalized array is achieved by analyzing either the intact protein mass or, after on-chip proteolytic digestion, the peptide mass fingerprint and/or tandem mass spectrometry of selected peptides, which can yield highly diagnostic sequence information. The platform described here should be a valuable addition to the limited analytical toolbox that is currently available for glycomics.
Design and Development of a Technology Platform for DNA-Encoded Library Production and Affinity Selection.

PubMed

Castañón, Jesús; Román, José Pablo; Jessop, Theodore C; de Blas, Jesús; Haro, Rubén

2018-06-01

DNA-encoded libraries (DELs) have emerged as an efficient and cost-effective drug discovery tool for the exploration and screening of very large chemical space using small-molecule collections of unprecedented size. Herein, we report an integrated automation and informatics system designed to enhance the quality, efficiency, and throughput of the production and affinity selection of these libraries. The platform is governed by software developed according to a database-centric architecture to ensure data consistency, integrity, and availability. Through its versatile protocol management functionalities, this application captures the wide diversity of experimental processes involved with DEL technology, keeps track of working protocols in the database, and uses them to command robotic liquid handlers for the synthesis of libraries. This approach provides full traceability of building-blocks and DNA tags in each split-and-pool cycle. Affinity selection experiments and high-throughput sequencing reads are also captured in the database, and the results are automatically deconvoluted and visualized in customizable representations. Researchers can compare results of different experiments and use machine learning methods to discover patterns in data. As of this writing, the platform has been validated through the generation and affinity selection of various libraries, and it has become the cornerstone of the DEL production effort at Lilly.
Chipster: user-friendly analysis software for microarray and other high-throughput data.

PubMed

Kallio, M Aleksi; Tuimala, Jarno T; Hupponen, Taavi; Klemelä, Petri; Gentile, Massimiliano; Scheinin, Ilari; Koski, Mikko; Käki, Janne; Korpelainen, Eija I

2011-10-14

The growth of high-throughput technologies such as microarrays and next generation sequencing has been accompanied by active research in data analysis methodology, producing new analysis methods at a rapid pace. While most of the newly developed methods are freely available, their use requires substantial computational skills. In order to enable non-programming biologists to benefit from the method development in a timely manner, we have created the Chipster software. Chipster (http://chipster.csc.fi/) brings a powerful collection of data analysis methods within the reach of bioscientists via its intuitive graphical user interface. Users can analyze and integrate different data types such as gene expression, miRNA and aCGH. The analysis functionality is complemented with rich interactive visualizations, allowing users to select datapoints and create new gene lists based on these selections. Importantly, users can save the performed analysis steps as reusable, automatic workflows, which can also be shared with other users. Being a versatile and easily extendable platform, Chipster can be used for microarray, proteomics and sequencing data. In this article we describe its comprehensive collection of analysis and visualization tools for microarray data using three case studies. Chipster is a user-friendly analysis software for high-throughput data. Its intuitive graphical user interface enables biologists to access a powerful collection of data analysis and integration tools, and to visualize data interactively. Users can collaborate by sharing analysis sessions and workflows. Chipster is open source, and the server installation package is freely available.
Chipster: user-friendly analysis software for microarray and other high-throughput data

PubMed Central

2011-01-01

Background The growth of high-throughput technologies such as microarrays and next generation sequencing has been accompanied by active research in data analysis methodology, producing new analysis methods at a rapid pace. While most of the newly developed methods are freely available, their use requires substantial computational skills. In order to enable non-programming biologists to benefit from the method development in a timely manner, we have created the Chipster software. Results Chipster (http://chipster.csc.fi/) brings a powerful collection of data analysis methods within the reach of bioscientists via its intuitive graphical user interface. Users can analyze and integrate different data types such as gene expression, miRNA and aCGH. The analysis functionality is complemented with rich interactive visualizations, allowing users to select datapoints and create new gene lists based on these selections. Importantly, users can save the performed analysis steps as reusable, automatic workflows, which can also be shared with other users. Being a versatile and easily extendable platform, Chipster can be used for microarray, proteomics and sequencing data. In this article we describe its comprehensive collection of analysis and visualization tools for microarray data using three case studies. Conclusions Chipster is a user-friendly analysis software for high-throughput data. Its intuitive graphical user interface enables biologists to access a powerful collection of data analysis and integration tools, and to visualize data interactively. Users can collaborate by sharing analysis sessions and workflows. Chipster is open source, and the server installation package is freely available. PMID:21999641

High-throughput physical mapping of chromosomes using automated in situ hybridization.

PubMed

George, Phillip; Sharakhova, Maria V; Sharakhov, Igor V

2012-06-28

Projects to obtain whole-genome sequences for 10,000 vertebrate species and for 5,000 insect and related arthropod species are expected to take place over the next 5 years. For example, the sequencing of the genomes for 15 malaria mosquitospecies is currently being done using an Illumina platform. This Anopheles species cluster includes both vectors and non-vectors of malaria. When the genome assemblies become available, researchers will have the unique opportunity to perform comparative analysis for inferring evolutionary changes relevant to vector ability. However, it has proven difficult to use next-generation sequencing reads to generate high-quality de novo genome assemblies. Moreover, the existing genome assemblies for Anopheles gambiae, although obtained using the Sanger method, are gapped or fragmented. Success of comparative genomic analyses will be limited if researchers deal with numerous sequencing contigs, rather than with chromosome-based genome assemblies. Fragmented, unmapped sequences create problems for genomic analyses because: (i) unidentified gaps cause incorrect or incomplete annotation of genomic sequences; (ii) unmapped sequences lead to confusion between paralogous genes and genes from different haplotypes; and (iii) the lack of chromosome assignment and orientation of the sequencing contigs does not allow for reconstructing rearrangement phylogeny and studying chromosome evolution. Developing high-resolution physical maps for species with newly sequenced genomes is a timely and cost-effective investment that will facilitate genome annotation, evolutionary analysis, and re-sequencing of individual genomes from natural populations. Here, we present innovative approaches to chromosome preparation, fluorescent in situ hybridization (FISH), and imaging that facilitate rapid development of physical maps. Using An. gambiae as an example, we demonstrate that the development of physical chromosome maps can potentially improve genome assemblies and, thus, the quality of genomic analyses. First, we use a high-pressure method to prepare polytene chromosome spreads. This method, originally developed for Drosophila, allows the user to visualize more details on chromosomes than the regular squashing technique. Second, a fully automated, front-end system for FISH is used for high-throughput physical genome mapping. The automated slide staining system runs multiple assays simultaneously and dramatically reduces hands-on time. Third, an automatic fluorescent imaging system, which includes a motorized slide stage, automatically scans and photographs labeled chromosomes after FISH. This system is especially useful for identifying and visualizing multiple chromosomal plates on the same slide. In addition, the scanning process captures a more uniform FISH result. Overall, the automated high-throughput physical mapping protocol is more efficient than a standard manual protocol.
Microfluidics for cell-based high throughput screening platforms - A review.

PubMed

Du, Guansheng; Fang, Qun; den Toonder, Jaap M J

2016-01-15

In the last decades, the basic techniques of microfluidics for the study of cells such as cell culture, cell separation, and cell lysis, have been well developed. Based on cell handling techniques, microfluidics has been widely applied in the field of PCR (Polymerase Chain Reaction), immunoassays, organ-on-chip, stem cell research, and analysis and identification of circulating tumor cells. As a major step in drug discovery, high-throughput screening allows rapid analysis of thousands of chemical, biochemical, genetic or pharmacological tests in parallel. In this review, we summarize the application of microfluidics in cell-based high throughput screening. The screening methods mentioned in this paper include approaches using the perfusion flow mode, the droplet mode, and the microarray mode. We also discuss the future development of microfluidic based high throughput screening platform for drug discovery. Copyright © 2015 Elsevier B.V. All rights reserved.
Personalized In Vitro and In Vivo Cancer Models to Guide Precision Medicine | Office of Cancer Genomics

Cancer.gov

Precision medicine is an approach that takes into account the influence of individuals' genes, environment, and lifestyle exposures to tailor interventions. Here, we describe the development of a robust precision cancer care platform that integrates whole-exome sequencing with a living biobank that enables high-throughput drug screens on patient-derived tumor organoids. To date, 56 tumor-derived organoid cultures and 19 patient-derived xenograft (PDX) models have been established from the 769 patients enrolled in an Institutional Review Board-approved clinical trial.
High-throughput assay of oxygen radical absorbance capacity (ORAC) using a multichannel liquid handling system coupled with a microplate fluorescence reader in 96-well format.

PubMed

Huang, Dejian; Ou, Boxin; Hampsch-Woodill, Maureen; Flanagan, Judith A; Prior, Ronald L

2002-07-31

The oxygen radical absorbance capacity (ORAC) assay has been widely accepted as a standard tool to measure the antioxidant activity in the nutraceutical, pharmaceutical, and food industries. However, the ORAC assay has been criticized for a lack of accessibility due to the unavailability of the COBAS FARA II analyzer, an instrument discontinued by the manufacturer. In addition, the manual sample preparation is time-consuming and labor-intensive. The objective of this study was to develop a high-throughput instrument platform that can fully automate the ORAC assay procedure. The new instrument platform consists of a robotic eight-channel liquid handling system and a microplate fluorescence reader. By using the high-throughput platform, the efficiency of the assay is improved with at least a 10-fold increase in sample throughput over the current procedure. The mean of intra- and interday CVs was
Nano-LC FTICR tandem mass spectrometry for top-down proteomics: routine baseline unit mass resolution of whole cell lysate proteins up to 72 kDa.

PubMed

Tipton, Jeremiah D; Tran, John C; Catherman, Adam D; Ahlf, Dorothy R; Durbin, Kenneth R; Lee, Ji Eun; Kellie, John F; Kelleher, Neil L; Hendrickson, Christopher L; Marshall, Alan G

2012-03-06

Current high-throughput top-down proteomic platforms provide routine identification of proteins less than 25 kDa with 4-D separations. This short communication reports the application of technological developments over the past few years that improve protein identification and characterization for masses greater than 25 kDa. Advances in separation science have allowed increased numbers of proteins to be identified, especially by nanoliquid chromatography (nLC) prior to mass spectrometry (MS) analysis. Further, a goal of high-throughput top-down proteomics is to extend the mass range for routine nLC MS analysis up to 80 kDa because gene sequence analysis predicts that ~70% of the human proteome is transcribed to be less than 80 kDa. Normally, large proteins greater than 50 kDa are identified and characterized by top-down proteomics through fraction collection and direct infusion at relatively low throughput. Further, other MS-based techniques provide top-down protein characterization, however at low resolution for intact mass measurement. Here, we present analysis of standard (up to 78 kDa) and whole cell lysate proteins by Fourier transform ion cyclotron resonance mass spectrometry (nLC electrospray ionization (ESI) FTICR MS). The separation platform reduced the complexity of the protein matrix so that, at 14.5 T, proteins from whole cell lysate up to 72 kDa are baseline mass resolved on a nano-LC chromatographic time scale. Further, the results document routine identification of proteins at improved throughput based on accurate mass measurement (less than 10 ppm mass error) of precursor and fragment ions for proteins up to 50 kDa.
Spatial tuning of acoustofluidic pressure nodes by altering net sonic velocity enables high-throughput, efficient cell sorting

DOE PAGES

Jung, Seung-Yong; Notton, Timothy; Fong, Erika; ...

2015-01-07

Particle sorting using acoustofluidics has enormous potential but widespread adoption has been limited by complex device designs and low throughput. Here, we report high-throughput separation of particles and T lymphocytes (600 μL min -1) by altering the net sonic velocity to reposition acoustic pressure nodes in a simple two-channel device. Finally, the approach is generalizable to other microfluidic platforms for rapid, high-throughput analysis.
Robo-Lector – a novel platform for automated high-throughput cultivations in microtiter plates with high information content

PubMed Central

Huber, Robert; Ritter, Daniel; Hering, Till; Hillmer, Anne-Kathrin; Kensy, Frank; Müller, Carsten; Wang, Le; Büchs, Jochen

2009-01-01

Background In industry and academic research, there is an increasing demand for flexible automated microfermentation platforms with advanced sensing technology. However, up to now, conventional platforms cannot generate continuous data in high-throughput cultivations, in particular for monitoring biomass and fluorescent proteins. Furthermore, microfermentation platforms are needed that can easily combine cost-effective, disposable microbioreactors with downstream processing and analytical assays. Results To meet this demand, a novel automated microfermentation platform consisting of a BioLector and a liquid-handling robot (Robo-Lector) was sucessfully built and tested. The BioLector provides a cultivation system that is able to permanently monitor microbial growth and the fluorescence of reporter proteins under defined conditions in microtiter plates. Three examplary methods were programed on the Robo-Lector platform to study in detail high-throughput cultivation processes and especially recombinant protein expression. The host/vector system E. coli BL21(DE3) pRhotHi-2-EcFbFP, expressing the fluorescence protein EcFbFP, was hereby investigated. With the method 'induction profiling' it was possible to conduct 96 different induction experiments (varying inducer concentrations from 0 to 1.5 mM IPTG at 8 different induction times) simultaneously in an automated way. The method 'biomass-specific induction' allowed to automatically induce cultures with different growth kinetics in a microtiter plate at the same biomass concentration, which resulted in a relative standard deviation of the EcFbFP production of only ± 7%. The third method 'biomass-specific replication' enabled to generate equal initial biomass concentrations in main cultures from precultures with different growth kinetics. This was realized by automatically transferring an appropiate inoculum volume from the different preculture microtiter wells to respective wells of the main culture plate, where subsequently similar growth kinetics could be obtained. Conclusion The Robo-Lector generates extensive kinetic data in high-throughput cultivations, particularly for biomass and fluorescence protein formation. Based on the non-invasive on-line-monitoring signals, actions of the liquid-handling robot can easily be triggered. This interaction between the robot and the BioLector (Robo-Lector) combines high-content data generation with systematic high-throughput experimentation in an automated fashion, offering new possibilities to study biological production systems. The presented platform uses a standard liquid-handling workstation with widespread automation possibilities. Thus, high-throughput cultivations can now be combined with small-scale downstream processing techniques and analytical assays. Ultimately, this novel versatile platform can accelerate and intensify research and development in the field of systems biology as well as modelling and bioprocess optimization. PMID:19646274
Toward a new paradigm of DNA writing using a massively parallel sequencing platform and degenerate oligonucleotide

PubMed Central

Hwang, Byungjin; Bang, Duhee

2016-01-01

All synthetic DNA materials require prior programming of the building blocks of the oligonucleotide sequences. The development of a programmable microarray platform provides cost-effective and time-efficient solutions in the field of data storage using DNA. However, the scalability of the synthesis is not on par with the accelerating sequencing capacity. Here, we report on a new paradigm of generating genetic material (writing) using a degenerate oligonucleotide and optomechanical retrieval method that leverages sequencing (reading) throughput to generate the desired number of oligonucleotides. As a proof of concept, we demonstrate the feasibility of our concept in digital information storage in DNA. In simulation, the ability to store data is expected to exponentially increase with increase in degenerate space. The present study highlights the major framework change in conventional DNA writing paradigm as a sequencer itself can become a potential source of making genetic materials. PMID:27876825
Toward a new paradigm of DNA writing using a massively parallel sequencing platform and degenerate oligonucleotide.

PubMed

Hwang, Byungjin; Bang, Duhee

2016-11-23

All synthetic DNA materials require prior programming of the building blocks of the oligonucleotide sequences. The development of a programmable microarray platform provides cost-effective and time-efficient solutions in the field of data storage using DNA. However, the scalability of the synthesis is not on par with the accelerating sequencing capacity. Here, we report on a new paradigm of generating genetic material (writing) using a degenerate oligonucleotide and optomechanical retrieval method that leverages sequencing (reading) throughput to generate the desired number of oligonucleotides. As a proof of concept, we demonstrate the feasibility of our concept in digital information storage in DNA. In simulation, the ability to store data is expected to exponentially increase with increase in degenerate space. The present study highlights the major framework change in conventional DNA writing paradigm as a sequencer itself can become a potential source of making genetic materials.
Integrated genome browser: visual analytics platform for genomics.

PubMed

Freese, Nowlan H; Norris, David C; Loraine, Ann E

2016-07-15

Genome browsers that support fast navigation through vast datasets and provide interactive visual analytics functions can help scientists achieve deeper insight into biological systems. Toward this end, we developed Integrated Genome Browser (IGB), a highly configurable, interactive and fast open source desktop genome browser. Here we describe multiple updates to IGB, including all-new capabilities to display and interact with data from high-throughput sequencing experiments. To demonstrate, we describe example visualizations and analyses of datasets from RNA-Seq, ChIP-Seq and bisulfite sequencing experiments. Understanding results from genome-scale experiments requires viewing the data in the context of reference genome annotations and other related datasets. To facilitate this, we enhanced IGB's ability to consume data from diverse sources, including Galaxy, Distributed Annotation and IGB-specific Quickload servers. To support future visualization needs as new genome-scale assays enter wide use, we transformed the IGB codebase into a modular, extensible platform for developers to create and deploy all-new visualizations of genomic data. IGB is open source and is freely available from http://bioviz.org/igb aloraine@uncc.edu. © The Author 2016. Published by Oxford University Press.
Diversity arrays technology (DArT) markers in apple for genetic linkage maps.

PubMed

Schouten, Henk J; van de Weg, W Eric; Carling, Jason; Khan, Sabaz Ali; McKay, Steven J; van Kaauwen, Martijn P W; Wittenberg, Alexander H J; Koehorst-van Putten, Herma J J; Noordijk, Yolanda; Gao, Zhongshan; Rees, D Jasper G; Van Dyk, Maria M; Jaccoud, Damian; Considine, Michael J; Kilian, Andrzej

2012-03-01

Diversity Arrays Technology (DArT) provides a high-throughput whole-genome genotyping platform for the detection and scoring of hundreds of polymorphic loci without any need for prior sequence information. The work presented here details the development and performance of a DArT genotyping array for apple. This is the first paper on DArT in horticultural trees. Genetic mapping of DArT markers in two mapping populations and their integration with other marker types showed that DArT is a powerful high-throughput method for obtaining accurate and reproducible marker data, despite the low cost per data point. This method appears to be suitable for aligning the genetic maps of different segregating populations. The standard complexity reduction method, based on the methylation-sensitive PstI restriction enzyme, resulted in a high frequency of markers, although there was 52-54% redundancy due to the repeated sampling of highly similar sequences. Sequencing of the marker clones showed that they are significantly enriched for low-copy, genic regions. The genome coverage using the standard method was 55-76%. For improved genome coverage, an alternative complexity reduction method was examined, which resulted in less redundancy and additional segregating markers. The DArT markers proved to be of high quality and were very suitable for genetic mapping at low cost for the apple, providing moderate genome coverage. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s11032-011-9579-5) contains supplementary material, which is available to authorized users.
Direct assembling methodologies for high-throughput bioscreening

PubMed Central

Rodríguez-Dévora, Jorge I.; Shi, Zhi-dong; Xu, Tao

2012-01-01

Over the last few decades, high-throughput (HT) bioscreening, a technique that allows rapid screening of biochemical compound libraries against biological targets, has been widely used in drug discovery, stem cell research, development of new biomaterials, and genomics research. To achieve these ambitions, scaffold-free (or direct) assembly of biological entities of interest has become critical. Appropriate assembling methodologies are required to build an efficient HT bioscreening platform. The development of contact and non-contact assembling systems as a practical solution has been driven by a variety of essential attributes of the bioscreening system, such as miniaturization, high throughput, and high precision. The present article reviews recent progress on these assembling technologies utilized for the construction of HT bioscreening platforms. PMID:22021162
GAP Final Technical Report 12-14-04

DOE Office of Scientific and Technical Information (OSTI.GOV)

Andrew J. Bordner, PhD, Senior Research Scientist

2004-12-14

The Genomics Annotation Platform (GAP) was designed to develop new tools for high throughput functional annotation and characterization of protein sequences and structures resulting from genomics and structural proteomics, benchmarking and application of those tools. Furthermore, this platform integrated the genomic scale sequence and structural analysis and prediction tools with the advanced structure prediction and bioinformatics environment of ICM. The development of GAP was primarily oriented towards the annotation of new biomolecular structures using both structural and sequence data. Even though the amount of protein X-ray crystal data is growing exponentially, the volume of sequence data is growing even moremore » rapidly. This trend was exploited by leveraging the wealth of sequence data to provide functional annotation for protein structures. The additional information provided by GAP is expected to assist the majority of the commercial users of ICM, who are involved in drug discovery, in identifying promising drug targets as well in devising strategies for the rational design of therapeutics directed at the protein of interest. The GAP also provided valuable tools for biochemistry education, and structural genomics centers. In addition, GAP incorporates many novel prediction and analysis methods not available in other molecular modeling packages. This development led to signing the first Molsoft agreement in the structural genomics annotation area with the University of oxford Structural Genomics Center. This commercial agreement validated the Molsoft efforts under the GAP project and provided the basis for further development of the large scale functional annotation platform.« less
Characterization and complete genome sequence of a previously uncharacterized panicovirus from Bermuda grass detected by high throughput sequencing

USDA-ARS?s Scientific Manuscript database

Bermuda grass samples were examined by transmission electron microscopy and 28-30 nm spherical virus particles were observed. Total RNA from these plants was subjected to high throughput sequencing (HTS). The nearly full genome sequence of a previously uncharacterized Panicovirus was identified from...
Leishmania genome analysis and high-throughput immunological screening identifies tuzin as a novel vaccine candidate against visceral leishmaniasis.

PubMed

Lakshmi, Bhavana Sethu; Wang, Ruobing; Madhubala, Rentala

2014-06-24

Leishmaniasis is a neglected tropical disease caused by Leishmania species. It is a major health concern affecting 88 countries and threatening 350 million people globally. Unfortunately, there are no vaccines and there are limitations associated with the current therapeutic regimens for leishmaniasis. The emerging cases of drug-resistance further aggravate the situation, demanding rapid drug and vaccine development. The genome sequence of Leishmania, provides access to novel genes that hold potential as chemotherapeutic targets or vaccine candidates. In this study, we selected 19 antigenic genes from about 8000 common Leishmania genes based on the Leishmania major and Leishmania infantum genome information available in the pathogen databases. Potential vaccine candidates thus identified were screened using an in vitro high throughput immunological platform developed in the laboratory. Four candidate genes coding for tuzin, flagellar glycoprotein-like protein (FGP), phospholipase A1-like protein (PLA1) and potassium voltage-gated channel protein (K VOLT) showed a predominant protective Th1 response over disease exacerbating Th2. We report the immunogenic properties and protective efficacy of one of the four antigens, tuzin, as a DNA vaccine against Leishmania donovani challenge. Our results show that administration of tuzin DNA protected BALB/c mice against L. donovani challenge and that protective immunity was associated with higher levels of IFN-γ and IL-12 production in comparison to IL-4 and IL-10. Our study presents a simple approach to rapidly identify potential vaccine candidates using the exhaustive information stored in the genome and an in vitro high-throughput immunological platform. Copyright © 2014. Published by Elsevier Ltd.
Development of a high-throughput microscale cell disruption platform for Pichia pastoris in rapid bioprocess design.

PubMed

Bláha, Benjamin A F; Morris, Stephen A; Ogonah, Olotu W; Maucourant, Sophie; Crescente, Vincenzo; Rosenberg, William; Mukhopadhyay, Tarit K

2018-01-01

The time and cost benefits of miniaturized fermentation platforms can only be gained by employing complementary techniques facilitating high-throughput at small sample volumes. Microbial cell disruption is a major bottleneck in experimental throughput and is often restricted to large processing volumes. Moreover, for rigid yeast species, such as Pichia pastoris, no effective high-throughput disruption methods exist. The development of an automated, miniaturized, high-throughput, noncontact, scalable platform based on adaptive focused acoustics (AFA) to disrupt P. pastoris and recover intracellular heterologous protein is described. Augmented modes of AFA were established by investigating vessel designs and a novel enzymatic pretreatment step. Three different modes of AFA were studied and compared to the performance high-pressure homogenization. For each of these modes of cell disruption, response models were developed to account for five different performance criteria. Using multiple responses not only demonstrated that different operating parameters are required for different response optima, with highest product purity requiring suboptimal values for other criteria, but also allowed for AFA-based methods to mimic large-scale homogenization processes. These results demonstrate that AFA-mediated cell disruption can be used for a wide range of applications including buffer development, strain selection, fermentation process development, and whole bioprocess integration. © 2017 American Institute of Chemical Engineers Biotechnol. Prog., 34:130-140, 2018. © 2017 American Institute of Chemical Engineers.
Targeted gene enrichment and high-throughput sequencing for environmental biomonitoring: a case study using freshwater macroinvertebrates.

PubMed

Dowle, Eddy J; Pochon, Xavier; C Banks, Jonathan; Shearer, Karen; Wood, Susanna A

2016-09-01

Recent studies have advocated biomonitoring using DNA techniques. In this study, two high-throughput sequencing (HTS)-based methods were evaluated: amplicon metabarcoding of the cytochrome C oxidase subunit I (COI) mitochondrial gene and gene enrichment using MYbaits (targeting nine different genes including COI). The gene-enrichment method does not require PCR amplification and thus avoids biases associated with universal primers. Macroinvertebrate samples were collected from 12 New Zealand rivers. Macroinvertebrates were morphologically identified and enumerated, and their biomass determined. DNA was extracted from all macroinvertebrate samples and HTS undertaken using the illumina miseq platform. Macroinvertebrate communities were characterized from sequence data using either six genes (three of the original nine were not used) or just the COI gene in isolation. The gene-enrichment method (all genes) detected the highest number of taxa and obtained the strongest Spearman rank correlations between the number of sequence reads, abundance and biomass in 67% of the samples. Median detection rates across rare (<1% of the total abundance or biomass), moderately abundant (1-5%) and highly abundant (>5%) taxa were highest using the gene-enrichment method (all genes). Our data indicated primer biases occurred during amplicon metabarcoding with greater than 80% of sequence reads originating from one taxon in several samples. The accuracy and sensitivity of both HTS methods would be improved with more comprehensive reference sequence databases. The data from this study illustrate the challenges of using PCR amplification-based methods for biomonitoring and highlight the potential benefits of using approaches, such as gene enrichment, which circumvent the need for an initial PCR step. © 2015 John Wiley & Sons Ltd.
Industrializing electrophysiology: HT automated patch clamp on SyncroPatch® 96 using instant frozen cells.

PubMed

Polonchuk, Liudmila

2014-01-01

Patch-clamping is a powerful technique for investigating the ion channel function and regulation. However, its low throughput hampered profiling of large compound series in early drug development. Fortunately, automation has revolutionized the area of experimental electrophysiology over the past decade. Whereas the first automated patch-clamp instruments using the planar patch-clamp technology demonstrated rather a moderate throughput, few second-generation automated platforms recently launched by various companies have significantly increased ability to form a high number of high-resistance seals. Among them is SyncroPatch(®) 96 (Nanion Technologies GmbH, Munich, Germany), a fully automated giga-seal patch-clamp system with the highest throughput on the market. By recording from up to 96 cells simultaneously, the SyncroPatch(®) 96 allows to substantially increase throughput without compromising data quality. This chapter describes features of the innovative automated electrophysiology system and protocols used for a successful transfer of the established hERG assay to this high-throughput automated platform.
Continuous inertial microparticle and blood cell separation in straight channels with local microstructures.

PubMed

Wu, Zhenlong; Chen, Yu; Wang, Moran; Chung, Aram J

2016-02-07

Fluid inertia which has conventionally been neglected in microfluidics has been gaining much attention for particle and cell manipulation because inertia-based methods inherently provide simple, passive, precise and high-throughput characteristics. Particularly, the inertial approach has been applied to blood separation for various biomedical research studies mainly using spiral microchannels. For higher throughput, parallelization is essential; however, it is difficult to realize using spiral channels because of their large two dimensional layouts. In this work, we present a novel inertial platform for continuous sheathless particle and blood cell separation in straight microchannels containing microstructures. Microstructures within straight channels exert secondary flows to manipulate particle positions similar to Dean flow in curved channels but with higher controllability. Through a balance between inertial lift force and microstructure-induced secondary flow, we deterministically position microspheres and cells based on their sizes to be separated downstream. Using our inertial platform, we successfully sorted microparticles and fractionized blood cells with high separation efficiencies, high purities and high throughputs. The inertial separation platform developed here can be operated to process diluted blood with a throughput of 10.8 mL min(-1)via radially arrayed single channels with one inlet and two rings of outlets.
Lens-free shadow image based high-throughput continuous cell monitoring technique.

PubMed

Jin, Geonsoo; Yoo, In-Hwa; Pack, Seung Pil; Yang, Ji-Woon; Ha, Un-Hwan; Paek, Se-Hwan; Seo, Sungkyu

2012-01-01

A high-throughput continuous cell monitoring technique which does not require any labeling reagents or destruction of the specimen is demonstrated. More than 6000 human alveolar epithelial A549 cells are monitored for up to 72 h simultaneously and continuously with a single digital image within a cost and space effective lens-free shadow imaging platform. In an experiment performed within a custom built incubator integrated with the lens-free shadow imaging platform, the cell nucleus division process could be successfully characterized by calculating the signal-to-noise ratios (SNRs) and the shadow diameters (SDs) of the cell shadow patterns. The versatile nature of this platform also enabled a single cell viability test followed by live cell counting. This study firstly shows that the lens-free shadow imaging technique can provide a continuous cell monitoring without any staining/labeling reagent and destruction of the specimen. This high-throughput continuous cell monitoring technique based on lens-free shadow imaging may be widely utilized as a compact, low-cost, and high-throughput cell monitoring tool in the fields of drug and food screening or cell proliferation and viability testing. Copyright © 2012 Elsevier B.V. All rights reserved.

Multi-step high-throughput conjugation platform for the development of antibody-drug conjugates.

PubMed

Andris, Sebastian; Wendeler, Michaela; Wang, Xiangyang; Hubbuch, Jürgen

2018-07-20

Antibody-drug conjugates (ADCs) form a rapidly growing class of biopharmaceuticals which attracts a lot of attention throughout the industry due to its high potential for cancer therapy. They combine the specificity of a monoclonal antibody (mAb) and the cell-killing capacity of highly cytotoxic small molecule drugs. Site-specific conjugation approaches involve a multi-step process for covalent linkage of antibody and drug via a linker. Despite the range of parameters that have to be investigated, high-throughput methods are scarcely used so far in ADC development. In this work an automated high-throughput platform for a site-specific multi-step conjugation process on a liquid-handling station is presented by use of a model conjugation system. A high-throughput solid-phase buffer exchange was successfully incorporated for reagent removal by utilization of a batch cation exchange step. To ensure accurate screening of conjugation parameters, an intermediate UV/Vis-based concentration determination was established including feedback to the process. For conjugate characterization, a high-throughput compatible reversed-phase chromatography method with a runtime of 7 min and no sample preparation was developed. Two case studies illustrate the efficient use for mapping the operating space of a conjugation process. Due to the degree of automation and parallelization, the platform is capable of significantly reducing process development efforts and material demands and shorten development timelines for antibody-drug conjugates. Copyright © 2018 Elsevier B.V. All rights reserved.
DynaMIT: the dynamic motif integration toolkit

PubMed Central

Dassi, Erik; Quattrone, Alessandro

2016-01-01

De-novo motif search is a frequently applied bioinformatics procedure to identify and prioritize recurrent elements in sequences sets for biological investigation, such as the ones derived from high-throughput differential expression experiments. Several algorithms have been developed to perform motif search, employing widely different approaches and often giving divergent results. In order to maximize the power of these investigations and ultimately be able to draft solid biological hypotheses, there is the need for applying multiple tools on the same sequences and merge the obtained results. However, motif reporting formats and statistical evaluation methods currently make such an integration task difficult to perform and mostly restricted to specific scenarios. We thus introduce here the Dynamic Motif Integration Toolkit (DynaMIT), an extremely flexible platform allowing to identify motifs employing multiple algorithms, integrate them by means of a user-selected strategy and visualize results in several ways; furthermore, the platform is user-extendible in all its aspects. DynaMIT is freely available at http://cibioltg.bitbucket.org. PMID:26253738
A scalable double-barcode sequencing platform for characterization of dynamic protein-protein interactions.

PubMed

Schlecht, Ulrich; Liu, Zhimin; Blundell, Jamie R; St Onge, Robert P; Levy, Sasha F

2017-05-25

Several large-scale efforts have systematically catalogued protein-protein interactions (PPIs) of a cell in a single environment. However, little is known about how the protein interactome changes across environmental perturbations. Current technologies, which assay one PPI at a time, are too low throughput to make it practical to study protein interactome dynamics. Here, we develop a highly parallel protein-protein interaction sequencing (PPiSeq) platform that uses a novel double barcoding system in conjunction with the dihydrofolate reductase protein-fragment complementation assay in Saccharomyces cerevisiae. PPiSeq detects PPIs at a rate that is on par with current assays and, in contrast with current methods, quantitatively scores PPIs with enough accuracy and sensitivity to detect changes across environments. Both PPI scoring and the bulk of strain construction can be performed with cell pools, making the assay scalable and easily reproduced across environments. PPiSeq is therefore a powerful new tool for large-scale investigations of dynamic PPIs.
GiNA, an efficient and high-throughput software for horticultural phenotyping

USDA-ARS?s Scientific Manuscript database

Traditional methods for trait phenotyping have been a bottleneck for research in many crop species due to their intensive labor, high cost, complex implementation, lack of reproducibility and propensity to subjective bias. Recently, multiple high-throughput phenotyping platforms have been developed,...
The first FDA marketing authorizations of next-generation sequencing technology and tests: challenges, solutions and impact for future assays.

PubMed

Bijwaard, Karen; Dickey, Jennifer S; Kelm, Kellie; Težak, Živana

2015-01-01

The rapid emergence and clinical translation of novel high-throughput sequencing technologies created a need to clarify the regulatory pathway for the evaluation and authorization of these unique technologies. Recently, the US FDA authorized for marketing four next generation sequencing (NGS)-based diagnostic devices which consisted of two heritable disease-specific assays, library preparation reagents and a NGS platform that are intended for human germline targeted sequencing from whole blood. These first authorizations can serve as a case study in how different types of NGS-based technology are reviewed by the FDA. In this manuscript we describe challenges associated with the evaluation of these novel technologies and provide an overview of what was reviewed. Besides making validated NGS-based devices available for in vitro diagnostic use, these first authorizations create a regulatory path for similar future instruments and assays.
Accounting for uncertainty in DNA sequencing data.

PubMed

O'Rawe, Jason A; Ferson, Scott; Lyon, Gholson J

2015-02-01

Science is defined in part by an honest exposition of the uncertainties that arise in measurements and propagate through calculations and inferences, so that the reliabilities of its conclusions are made apparent. The recent rapid development of high-throughput DNA sequencing technologies has dramatically increased the number of measurements made at the biochemical and molecular level. These data come from many different DNA-sequencing technologies, each with their own platform-specific errors and biases, which vary widely. Several statistical studies have tried to measure error rates for basic determinations, but there are no general schemes to project these uncertainties so as to assess the surety of the conclusions drawn about genetic, epigenetic, and more general biological questions. We review here the state of uncertainty quantification in DNA sequencing applications, describe sources of error, and propose methods that can be used for accounting and propagating these errors and their uncertainties through subsequent calculations. Copyright © 2014 Elsevier Ltd. All rights reserved.
Phenotype classification of single cells using SRS microscopy, RNA sequencing, and microfluidics (Conference Presentation)

NASA Astrophysics Data System (ADS)

Streets, Aaron M.; Cao, Chen; Zhang, Xiannian; Huang, Yanyi

2016-03-01

Phenotype classification of single cells reveals biological variation that is masked in ensemble measurement. This heterogeneity is found in gene and protein expression as well as in cell morphology. Many techniques are available to probe phenotypic heterogeneity at the single cell level, for example quantitative imaging and single-cell RNA sequencing, but it is difficult to perform multiple assays on the same single cell. In order to directly track correlation between morphology and gene expression at the single cell level, we developed a microfluidic platform for quantitative coherent Raman imaging and immediate RNA sequencing (RNA-Seq) of single cells. With this device we actively sort and trap cells for analysis with stimulated Raman scattering microscopy (SRS). The cells are then processed in parallel pipelines for lysis, and preparation of cDNA for high-throughput transcriptome sequencing. SRS microscopy offers three-dimensional imaging with chemical specificity for quantitative analysis of protein and lipid distribution in single cells. Meanwhile, the microfluidic platform facilitates single-cell manipulation, minimizes contamination, and furthermore, provides improved RNA-Seq detection sensitivity and measurement precision, which is necessary for differentiating biological variability from technical noise. By combining coherent Raman microscopy with RNA sequencing, we can better understand the relationship between cellular morphology and gene expression at the single-cell level.
Evaluation of Methods for de novo Genome assembly from High-throughput Sequencing Reads Reveals Dependencies that Affect the Quality of the Results

USDA-ARS?s Scientific Manuscript database

Recent developments in high-throughput sequencing technology have made low-cost sequencing an attractive approach for many genome analysis tasks. Increasing read lengths, improving quality and the production of increasingly larger numbers of usable sequences per instrument-run continue to make whole...
Next generation platforms for high-throughput biodosimetry

PubMed Central

Repin, Mikhail; Turner, Helen C.; Garty, Guy; Brenner, David J.

2014-01-01

Here the general concept of the combined use of plates and tubes in racks compatible with the American National Standards Institute/the Society for Laboratory Automation and Screening microplate formats as the next generation platforms for increasing the throughput of biodosimetry assays was described. These platforms can be used at different stages of biodosimetry assays starting from blood collection into microtubes organised in standardised racks and ending with the cytogenetic analysis of samples in standardised multiwell and multichannel plates. Robotically friendly platforms can be used for different biodosimetry assays in minimally equipped laboratories and on cost-effective automated universal biotech systems. PMID:24837249
Technical Considerations for Reduced Representation Bisulfite Sequencing with Multiplexed Libraries

PubMed Central

Chatterjee, Aniruddha; Rodger, Euan J.; Stockwell, Peter A.; Weeks, Robert J.; Morison, Ian M.

2012-01-01

Reduced representation bisulfite sequencing (RRBS), which couples bisulfite conversion and next generation sequencing, is an innovative method that specifically enriches genomic regions with a high density of potential methylation sites and enables investigation of DNA methylation at single-nucleotide resolution. Recent advances in the Illumina DNA sample preparation protocol and sequencing technology have vastly improved sequencing throughput capacity. Although the new Illumina technology is now widely used, the unique challenges associated with multiplexed RRBS libraries on this platform have not been previously described. We have made modifications to the RRBS library preparation protocol to sequence multiplexed libraries on a single flow cell lane of the Illumina HiSeq 2000. Furthermore, our analysis incorporates a bioinformatics pipeline specifically designed to process bisulfite-converted sequencing reads and evaluate the output and quality of the sequencing data generated from the multiplexed libraries. We obtained an average of 42 million paired-end reads per sample for each flow-cell lane, with a high unique mapping efficiency to the reference human genome. Here we provide a roadmap of modifications, strategies, and trouble shooting approaches we implemented to optimize sequencing of multiplexed libraries on an a RRBS background. PMID:23193365
Efficient Identification of Murine M2 Macrophage Peptide Targeting Ligands by Phage Display and Next-Generation Sequencing.

PubMed

Liu, Gary W; Livesay, Brynn R; Kacherovsky, Nataly A; Cieslewicz, Maryelise; Lutz, Emi; Waalkes, Adam; Jensen, Michael C; Salipante, Stephen J; Pun, Suzie H

2015-08-19

Peptide ligands are used to increase the specificity of drug carriers to their target cells and to facilitate intracellular delivery. One method to identify such peptide ligands, phage display, enables high-throughput screening of peptide libraries for ligands binding to therapeutic targets of interest. However, conventional methods for identifying target binders in a library by Sanger sequencing are low-throughput, labor-intensive, and provide a limited perspective (<0.01%) of the complete sequence space. Moreover, the small sample space can be dominated by nonspecific, preferentially amplifying "parasitic sequences" and plastic-binding sequences, which may lead to the identification of false positives or exclude the identification of target-binding sequences. To overcome these challenges, we employed next-generation Illumina sequencing to couple high-throughput screening and high-throughput sequencing, enabling more comprehensive access to the phage display library sequence space. In this work, we define the hallmarks of binding sequences in next-generation sequencing data, and develop a method that identifies several target-binding phage clones for murine, alternatively activated M2 macrophages with a high (100%) success rate: sequences and binding motifs were reproducibly present across biological replicates; binding motifs were identified across multiple unique sequences; and an unselected, amplified library accurately filtered out parasitic sequences. In addition, we validate the Multiple Em for Motif Elicitation tool as an efficient and principled means of discovering binding sequences.
A high-throughput next-generation sequencing-based method for detecting the mutational fingerprint of carcinogens

PubMed Central

Besaratinia, Ahmad; Li, Haiqing; Yoon, Jae-In; Zheng, Albert; Gao, Hanlin; Tommasi, Stella

2012-01-01

Many carcinogens leave a unique mutational fingerprint in the human genome. These mutational fingerprints manifest as specific types of mutations often clustering at certain genomic loci in tumor genomes from carcinogen-exposed individuals. To develop a high-throughput method for detecting the mutational fingerprint of carcinogens, we have devised a cost-, time- and labor-effective strategy, in which the widely used transgenic Big Blue® mouse mutation detection assay is made compatible with the Roche/454 Genome Sequencer FLX Titanium next-generation sequencing technology. As proof of principle, we have used this novel method to establish the mutational fingerprints of three prominent carcinogens with varying mutagenic potencies, including sunlight ultraviolet radiation, 4-aminobiphenyl and secondhand smoke that are known to be strong, moderate and weak mutagens, respectively. For verification purposes, we have compared the mutational fingerprints of these carcinogens obtained by our newly developed method with those obtained by parallel analyses using the conventional low-throughput approach, that is, standard mutation detection assay followed by direct DNA sequencing using a capillary DNA sequencer. We demonstrate that this high-throughput next-generation sequencing-based method is highly specific and sensitive to detect the mutational fingerprints of the tested carcinogens. The method is reproducible, and its accuracy is comparable with that of the currently available low-throughput method. In conclusion, this novel method has the potential to move the field of carcinogenesis forward by allowing high-throughput analysis of mutations induced by endogenous and/or exogenous genotoxic agents. PMID:22735701
A high-throughput next-generation sequencing-based method for detecting the mutational fingerprint of carcinogens.

PubMed

Besaratinia, Ahmad; Li, Haiqing; Yoon, Jae-In; Zheng, Albert; Gao, Hanlin; Tommasi, Stella

2012-08-01

Many carcinogens leave a unique mutational fingerprint in the human genome. These mutational fingerprints manifest as specific types of mutations often clustering at certain genomic loci in tumor genomes from carcinogen-exposed individuals. To develop a high-throughput method for detecting the mutational fingerprint of carcinogens, we have devised a cost-, time- and labor-effective strategy, in which the widely used transgenic Big Blue mouse mutation detection assay is made compatible with the Roche/454 Genome Sequencer FLX Titanium next-generation sequencing technology. As proof of principle, we have used this novel method to establish the mutational fingerprints of three prominent carcinogens with varying mutagenic potencies, including sunlight ultraviolet radiation, 4-aminobiphenyl and secondhand smoke that are known to be strong, moderate and weak mutagens, respectively. For verification purposes, we have compared the mutational fingerprints of these carcinogens obtained by our newly developed method with those obtained by parallel analyses using the conventional low-throughput approach, that is, standard mutation detection assay followed by direct DNA sequencing using a capillary DNA sequencer. We demonstrate that this high-throughput next-generation sequencing-based method is highly specific and sensitive to detect the mutational fingerprints of the tested carcinogens. The method is reproducible, and its accuracy is comparable with that of the currently available low-throughput method. In conclusion, this novel method has the potential to move the field of carcinogenesis forward by allowing high-throughput analysis of mutations induced by endogenous and/or exogenous genotoxic agents.
The growing impact of lyophilized cell-free protein expression systems

PubMed Central

Hunt, J. Porter; Yang, Seung Ook; Wilding, Kristen M.; Bundy, Bradley C.

2017-01-01

ABSTRACT Recently reported shelf-stable, on-demand protein synthesis platforms are enabling new possibilities in biotherapeutics, biosensing, biocatalysis, and high throughput protein expression. Lyophilized cell-free protein expression systems not only overcome cold-storage limitations, but also enable stockpiling for on-demand synthesis and completely sterilize the protein synthesis platform. Recently reported high-yield synthesis of cytotoxic protein Onconase from lyophilized E. coli extract preparations demonstrates the utility of lyophilized cell-free protein expression and its potential for creating on-demand biotherapeutics, vaccines, biosensors, biocatalysts, and high throughput protein synthesis. PMID:27791452
A flexible and economical barcoding approach for highly multiplexed amplicon sequencing of diverse target genes

PubMed Central

Herbold, Craig W.; Pelikan, Claus; Kuzyk, Orest; Hausmann, Bela; Angel, Roey; Berry, David; Loy, Alexander

2015-01-01

High throughput sequencing of phylogenetic and functional gene amplicons provides tremendous insight into the structure and functional potential of complex microbial communities. Here, we introduce a highly adaptable and economical PCR approach to barcoding and pooling libraries of numerous target genes. In this approach, we replace gene- and sequencing platform-specific fusion primers with general, interchangeable barcoding primers, enabling nearly limitless customized barcode-primer combinations. Compared to barcoding with long fusion primers, our multiple-target gene approach is more economical because it overall requires lower number of primers and is based on short primers with generally lower synthesis and purification costs. To highlight our approach, we pooled over 900 different small-subunit rRNA and functional gene amplicon libraries obtained from various environmental or host-associated microbial community samples into a single, paired-end Illumina MiSeq run. Although the amplicon regions ranged in size from approximately 290 to 720 bp, we found no significant systematic sequencing bias related to amplicon length or gene target. Our results indicate that this flexible multiplexing approach produces large, diverse, and high quality sets of amplicon sequence data for modern studies in microbial ecology. PMID:26236305
Parallel processing of genomics data

NASA Astrophysics Data System (ADS)

Agapito, Giuseppe; Guzzi, Pietro Hiram; Cannataro, Mario

2016-10-01

The availability of high-throughput experimental platforms for the analysis of biological samples, such as mass spectrometry, microarrays and Next Generation Sequencing, have made possible to analyze a whole genome in a single experiment. Such platforms produce an enormous volume of data per single experiment, thus the analysis of this enormous flow of data poses several challenges in term of data storage, preprocessing, and analysis. To face those issues, efficient, possibly parallel, bioinformatics software needs to be used to preprocess and analyze data, for instance to highlight genetic variation associated with complex diseases. In this paper we present a parallel algorithm for the parallel preprocessing and statistical analysis of genomics data, able to face high dimension of data and resulting in good response time. The proposed system is able to find statistically significant biological markers able to discriminate classes of patients that respond to drugs in different ways. Experiments performed on real and synthetic genomic datasets show good speed-up and scalability.
A simple method for semi-random DNA amplicon fragmentation using the methylation-dependent restriction enzyme MspJI.

PubMed

Shinozuka, Hiroshi; Cogan, Noel O I; Shinozuka, Maiko; Marshall, Alexis; Kay, Pippa; Lin, Yi-Han; Spangenberg, German C; Forster, John W

2015-04-11

Fragmentation at random nucleotide locations is an essential process for preparation of DNA libraries to be used on massively parallel short-read DNA sequencing platforms. Although instruments for physical shearing, such as the Covaris S2 focused-ultrasonicator system, and products for enzymatic shearing, such as the Nextera technology and NEBNext dsDNA Fragmentase kit, are commercially available, a simple and inexpensive method is desirable for high-throughput sequencing library preparation. MspJI is a recently characterised restriction enzyme which recognises the sequence motif CNNR (where R = G or A) when the first base is modified to 5-methylcytosine or 5-hydroxymethylcytosine. A semi-random enzymatic DNA amplicon fragmentation method was developed based on the unique cleavage properties of MspJI. In this method, random incorporation of 5-methyl-2'-deoxycytidine-5'-triphosphate is achieved through DNA amplification with DNA polymerase, followed by DNA digestion with MspJI. Due to the recognition sequence of the enzyme, DNA amplicons are fragmented in a relatively sequence-independent manner. The size range of the resulting fragments was capable of control through optimisation of 5-methyl-2'-deoxycytidine-5'-triphosphate concentration in the reaction mixture. A library suitable for sequencing using the Illumina MiSeq platform was prepared and processed using the proposed method. Alignment of generated short reads to a reference sequence demonstrated a relatively high level of random fragmentation. The proposed method may be performed with standard laboratory equipment. Although the uniformity of coverage was slightly inferior to the Covaris physical shearing procedure, due to efficiencies of cost and labour, the method may be more suitable than existing approaches for implementation in large-scale sequencing activities, such as bacterial artificial chromosome (BAC)-based genome sequence assembly, pan-genomic studies and locus-targeted genotyping-by-sequencing.
Advanced Virus Detection Technologies Interest Group (AVDTIG): Efforts on High Throughput Sequencing (HTS) for Virus Detection.

PubMed

Khan, Arifa S; Vacante, Dominick A; Cassart, Jean-Pol; Ng, Siemon H S; Lambert, Christophe; Charlebois, Robert L; King, Kathryn E

Several nucleic-acid based technologies have recently emerged with capabilities for broad virus detection. One of these, high throughput sequencing, has the potential for novel virus detection because this method does not depend upon prior viral sequence knowledge. However, the use of high throughput sequencing for testing biologicals poses greater challenges as compared to other newly introduced tests due to its technical complexities and big data bioinformatics. Thus, the Advanced Virus Detection Technologies Users Group was formed as a joint effort by regulatory and industry scientists to facilitate discussions and provide a forum for sharing data and experiences using advanced new virus detection technologies, with a focus on high throughput sequencing technologies. The group was initiated as a task force that was coordinated by the Parenteral Drug Association and subsequently became the Advanced Virus Detection Technologies Interest Group to continue efforts for using new technologies for detection of adventitious viruses with broader participation, including international government agencies, academia, and technology service providers. © PDA, Inc. 2016.
Microfluidic droplet platform for ultrahigh-throughput single-cell screening of biodiversity.

PubMed

Terekhov, Stanislav S; Smirnov, Ivan V; Stepanova, Anastasiya V; Bobik, Tatyana V; Mokrushina, Yuliana A; Ponomarenko, Natalia A; Belogurov, Alexey A; Rubtsova, Maria P; Kartseva, Olga V; Gomzikova, Marina O; Moskovtsev, Alexey A; Bukatin, Anton S; Dubina, Michael V; Kostryukova, Elena S; Babenko, Vladislav V; Vakhitova, Maria T; Manolov, Alexander I; Malakhova, Maja V; Kornienko, Maria A; Tyakht, Alexander V; Vanyushkina, Anna A; Ilina, Elena N; Masson, Patrick; Gabibov, Alexander G; Altman, Sidney

2017-03-07

Ultrahigh-throughput screening (uHTS) techniques can identify unique functionality from millions of variants. To mimic the natural selection mechanisms that occur by compartmentalization in vivo, we developed a technique based on single-cell encapsulation in droplets of a monodisperse microfluidic double water-in-oil-in-water emulsion (MDE). Biocompatible MDE enables in-droplet cultivation of different living species. The combination of droplet-generating machinery with FACS followed by next-generation sequencing and liquid chromatography-mass spectrometry analysis of the secretomes of encapsulated organisms yielded detailed genotype/phenotype descriptions. This platform was probed with uHTS for biocatalysts anchored to yeast with enrichment close to the theoretically calculated limit and cell-to-cell interactions. MDE-FACS allowed the identification of human butyrylcholinesterase mutants that undergo self-reactivation after inhibition by the organophosphorus agent paraoxon. The versatility of the platform allowed the identification of bacteria, including slow-growing oral microbiota species that suppress the growth of a common pathogen, Staphylococcus aureus , and predicted which genera were associated with inhibitory activity.
A Mixture Modeling Framework for Differential Analysis of High-Throughput Data

PubMed Central

Taslim, Cenny; Lin, Shili

2014-01-01

The inventions of microarray and next generation sequencing technologies have revolutionized research in genomics; platforms have led to massive amount of data in gene expression, methylation, and protein-DNA interactions. A common theme among a number of biological problems using high-throughput technologies is differential analysis. Despite the common theme, different data types have their own unique features, creating a “moving target” scenario. As such, methods specifically designed for one data type may not lead to satisfactory results when applied to another data type. To meet this challenge so that not only currently existing data types but also data from future problems, platforms, or experiments can be analyzed, we propose a mixture modeling framework that is flexible enough to automatically adapt to any moving target. More specifically, the approach considers several classes of mixture models and essentially provides a model-based procedure whose model is adaptive to the particular data being analyzed. We demonstrate the utility of the methodology by applying it to three types of real data: gene expression, methylation, and ChIP-seq. We also carried out simulations to gauge the performance and showed that the approach can be more efficient than any individual model without inflating type I error. PMID:25057284

Identification of species by multiplex analysis of variable-length sequences

PubMed Central

Pereira, Filipe; Carneiro, João; Matthiesen, Rune; van Asch, Barbara; Pinto, Nádia; Gusmão, Leonor; Amorim, António

2010-01-01

The quest for a universal and efficient method of identifying species has been a longstanding challenge in biology. Here, we show that accurate identification of species in all domains of life can be accomplished by multiplex analysis of variable-length sequences containing multiple insertion/deletion variants. The new method, called SPInDel, is able to discriminate 93.3% of eukaryotic species from 18 taxonomic groups. We also demonstrate that the identification of prokaryotic and viral species with numeric profiles of fragment lengths is generally straightforward. A computational platform is presented to facilitate the planning of projects and includes a large data set with nearly 1800 numeric profiles for species in all domains of life (1556 for eukaryotes, 105 for prokaryotes and 130 for viruses). Finally, a SPInDel profiling kit for discrimination of 10 mammalian species was successfully validated on highly processed food products with species mixtures and proved to be easily adaptable to multiple screening procedures routinely used in molecular biology laboratories. These results suggest that SPInDel is a reliable and cost-effective method for broad-spectrum species identification that is appropriate for use in suboptimal samples and is amenable to different high-throughput genotyping platforms without the need for DNA sequencing. PMID:20923781
Identifying drought adaptive traits in upland cotton using a proximal sensing cart for high-throughput phenotyping

USDA-ARS?s Scientific Manuscript database

Field-based high-throughput phenotyping is an emerging approach to characterize difficult, time-sensitive plant traits in relevant growing conditions. Proximal sensing carts have been developed as an alternative platform to more costly high-clearance tractors for phenotyping dynamic traits in the fi...
Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array

PubMed Central

Fuller, Carl W.; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P. Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T.; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J.; Kasianowicz, John J.; Davis, Randy; Roever, Stefan; Church, George M.; Ju, Jingyue

2016-01-01

DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5′-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962
Fundamental Bounds for Sequence Reconstruction from Nanopore Sequencers.

PubMed

Magner, Abram; Duda, Jarosław; Szpankowski, Wojciech; Grama, Ananth

2016-06-01

Nanopore sequencers are emerging as promising new platforms for high-throughput sequencing. As with other technologies, sequencer errors pose a major challenge for their effective use. In this paper, we present a novel information theoretic analysis of the impact of insertion-deletion (indel) errors in nanopore sequencers. In particular, we consider the following problems: (i) for given indel error characteristics and rate, what is the probability of accurate reconstruction as a function of sequence length; (ii) using replicated extrusion (the process of passing a DNA strand through the nanopore), what is the number of replicas needed to accurately reconstruct the true sequence with high probability? Our results provide a number of important insights: (i) the probability of accurate reconstruction of a sequence from a single sample in the presence of indel errors tends quickly (i.e., exponentially) to zero as the length of the sequence increases; and (ii) replicated extrusion is an effective technique for accurate reconstruction. We show that for typical distributions of indel errors, the required number of replicas is a slow function (polylogarithmic) of sequence length - implying that through replicated extrusion, we can sequence large reads using nanopore sequencers. Moreover, we show that in certain cases, the required number of replicas can be related to information-theoretic parameters of the indel error distributions.
De novo assembly of the Japanese lawngrass (Zoysia japonica Steud.) root transcriptome and identification of candidate unigenes related to early responses under salt stress

PubMed Central

Xie, Qi; Niu, Jun; Xu, Xilin; Xu, Lixin; Zhang, Yinbing; Fan, Bo; Liang, Xiaohong; Zhang, Lijuan; Yin, Shuxia; Han, Liebao

2015-01-01

Japanese lawngrass (Zoysia japonica Steud.) is an important warm-season turfgrass that is able to survive in a range of soils, from infertile sands to clays, and to grow well under saline conditions. However, little is known about the molecular mechanisms involved in its resistance to salt stress. Here, we used high-throughput RNA sequencing (RNA-seq) to investigate the changes in gene expression of Zoysia grass at high NaCl concentrations. We first constructed two sequencing libraries, including control and NaCl-treated samples, and sequenced them using the Illumina HiSeq™ 2000 platform. Approximately 157.20 million paired-end reads with a total length of 68.68 Mb were obtained. Subsequently, 32,849 unigenes with an N50 length of 1781 bp were assembled using Trinity. Furthermore, three public databases, the Kyoto Encyclopedia of Genes and Genomes (KEGG), Swiss-prot, and Clusters of Orthologous Groups (COGs), were used for gene function analysis and enrichment. The annotated genes included 57 Gene Ontology (GO) terms, 120 KEGG pathways, and 24 COGs. Compared with the control, 1455 genes were significantly different (false discovery rate ≤0.01, |log2Ratio |≥1) in the NaCl-treated samples. These genes were enriched in 10 KEGG pathways and 73 GO terms, and subjected to 25 COG categories. Using high-throughput next-generation sequencing, we built a database as a global transcript resource for Z. japonica Steud. roots. The results of this study will advance our understanding of the early salt response in Japanese lawngrass roots. PMID:26347751
High Throughput Sequence Analysis for Disease Resistance in Maize

USDA-ARS?s Scientific Manuscript database

Preliminary results of a computational analysis of high throughput sequencing data from Zea mays and the fungus Aspergillus are reported. The Illumina Genome Analyzer was used to sequence RNA samples from two strains of Z. mays (Va35 and Mp313) collected over a time course as well as several specie...
[Research on soil bacteria under the impact of sealed CO2 leakage by high-throughput sequencing technology].

PubMed

Tian, Di; Ma, Xin; Li, Yu-E; Zha, Liang-Song; Wu, Yang; Zou, Xiao-Xia; Liu, Shuang

2013-10-01

Carbon dioxide Capture and Storage has provided a new option for mitigating global anthropogenic CO2 emission with its unique advantages. However, there is a risk of the sealed CO2 leakage, bringing a serious threat to the ecology system. It is widely known that soil microorganisms are closely related to soil health, while the study on the impact of sequestered CO2 leakage on soil microorganisms is quite deficient. In this study, the leakage scenarios of sealed CO2 were constructed and the 16S rRNA genes of soil bacteria were sequenced by Illumina high-throughput sequencing technology on Miseq platform, and related biological analysis was conducted to explore the changes of soil bacterial abundance, diversity and structure. There were 486,645 reads for 43,017 OTUs of 15 soil samples and the results of biological analysis showed that there were differences in the abundance, diversity and community structure of soil bacterial community under different CO, leakage scenarios while the abundance and diversity of the bacterial community declined with the amplification of CO2 leakage quantity and leakage time, and some bacteria species became the dominant bacteria species in the bacteria community, therefore the increase of Acidobacteria species would be a biological indicator for the impact of sealed CO2 leakage on soil ecology system.
Micropillar arrays as a high-throughput screening platform for therapeutics in multiple sclerosis.

PubMed

Mei, Feng; Fancy, Stephen P J; Shen, Yun-An A; Niu, Jianqin; Zhao, Chao; Presley, Bryan; Miao, Edna; Lee, Seonok; Mayoral, Sonia R; Redmond, Stephanie A; Etxeberria, Ainhoa; Xiao, Lan; Franklin, Robin J M; Green, Ari; Hauser, Stephen L; Chan, Jonah R

2014-08-01

Functional screening for compounds that promote remyelination represents a major hurdle in the development of rational therapeutics for multiple sclerosis. Screening for remyelination is problematic, as myelination requires the presence of axons. Standard methods do not resolve cell-autonomous effects and are not suited for high-throughput formats. Here we describe a binary indicant for myelination using micropillar arrays (BIMA). Engineered with conical dimensions, micropillars permit resolution of the extent and length of membrane wrapping from a single two-dimensional image. Confocal imaging acquired from the base to the tip of the pillars allows for detection of concentric wrapping observed as 'rings' of myelin. The platform is formatted in 96-well plates, amenable to semiautomated random acquisition and automated detection and quantification. Upon screening 1,000 bioactive molecules, we identified a cluster of antimuscarinic compounds that enhance oligodendrocyte differentiation and remyelination. Our findings demonstrate a new high-throughput screening platform for potential regenerative therapeutics in multiple sclerosis.
Competitive Genomic Screens of Barcoded Yeast Libraries

PubMed Central

Urbanus, Malene; Proctor, Michael; Heisler, Lawrence E.; Giaever, Guri; Nislow, Corey

2011-01-01

By virtue of advances in next generation sequencing technologies, we have access to new genome sequences almost daily. The tempo of these advances is accelerating, promising greater depth and breadth. In light of these extraordinary advances, the need for fast, parallel methods to define gene function becomes ever more important. Collections of genome-wide deletion mutants in yeasts and E. coli have served as workhorses for functional characterization of gene function, but this approach is not scalable, current gene-deletion approaches require each of the thousands of genes that comprise a genome to be deleted and verified. Only after this work is complete can we pursue high-throughput phenotyping. Over the past decade, our laboratory has refined a portfolio of competitive, miniaturized, high-throughput genome-wide assays that can be performed in parallel. This parallelization is possible because of the inclusion of DNA 'tags', or 'barcodes,' into each mutant, with the barcode serving as a proxy for the mutation and one can measure the barcode abundance to assess mutant fitness. In this study, we seek to fill the gap between DNA sequence and barcoded mutant collections. To accomplish this we introduce a combined transposon disruption-barcoding approach that opens up parallel barcode assays to newly sequenced, but poorly characterized microbes. To illustrate this approach we present a new Candida albicans barcoded disruption collection and describe how both microarray-based and next generation sequencing-based platforms can be used to collect 10,000 - 1,000,000 gene-gene and drug-gene interactions in a single experiment. PMID:21860376
High-throughput electrophysiological assays for voltage gated ion channels using SyncroPatch 768PE.

PubMed

Li, Tianbo; Lu, Gang; Chiang, Eugene Y; Chernov-Rogan, Tania; Grogan, Jane L; Chen, Jun

2017-01-01

Ion channels regulate a variety of physiological processes and represent an important class of drug target. Among the many methods of studying ion channel function, patch clamp electrophysiology is considered the gold standard by providing the ultimate precision and flexibility. However, its utility in ion channel drug discovery is impeded by low throughput. Additionally, characterization of endogenous ion channels in primary cells remains technical challenging. In recent years, many automated patch clamp (APC) platforms have been developed to overcome these challenges, albeit with varying throughput, data quality and success rate. In this study, we utilized SyncroPatch 768PE, one of the latest generation APC platforms which conducts parallel recording from two-384 modules with giga-seal data quality, to push these 2 boundaries. By optimizing various cell patching parameters and a two-step voltage protocol, we developed a high throughput APC assay for the voltage-gated sodium channel Nav1.7. By testing a group of Nav1.7 reference compounds' IC50, this assay was proved to be highly consistent with manual patch clamp (R > 0.9). In a pilot screening of 10,000 compounds, the success rate, defined by > 500 MΩ seal resistance and >500 pA peak current, was 79%. The assay was robust with daily throughput ~ 6,000 data points and Z' factor 0.72. Using the same platform, we also successfully recorded endogenous voltage-gated potassium channel Kv1.3 in primary T cells. Together, our data suggest that SyncroPatch 768PE provides a powerful platform for ion channel research and drug discovery.
Loeffler 4.0: Diagnostic Metagenomics.

PubMed

Höper, Dirk; Wylezich, Claudia; Beer, Martin

2017-01-01

A new world of possibilities for "virus discovery" was opened up with high-throughput sequencing becoming available in the last decade. While scientifically metagenomic analysis was established before the start of the era of high-throughput sequencing, the availability of the first second-generation sequencers was the kick-off for diagnosticians to use sequencing for the detection of novel pathogens. Today, diagnostic metagenomics is becoming the standard procedure for the detection and genetic characterization of new viruses or novel virus variants. Here, we provide an overview about technical considerations of high-throughput sequencing-based diagnostic metagenomics together with selected examples of "virus discovery" for animal diseases or zoonoses and metagenomics for food safety or basic veterinary research. © 2017 Elsevier Inc. All rights reserved.
tcpl: The ToxCast Pipeline for High-Throughput Screening Data

EPA Science Inventory

Motivation: The large and diverse high-throughput chemical screening efforts carried out by the US EPAToxCast program requires an efficient, transparent, and reproducible data pipeline.Summary: The tcpl R package and its associated MySQL database provide a generalized platform fo...
High-throughput screening of filamentous fungi using nanoliter-range droplet-based microfluidics

NASA Astrophysics Data System (ADS)

Beneyton, Thomas; Wijaya, I. Putu Mahendra; Postros, Prexilia; Najah, Majdi; Leblond, Pascal; Couvent, Angélique; Mayot, Estelle; Griffiths, Andrew D.; Drevelle, Antoine

2016-06-01

Filamentous fungi are an extremely important source of industrial enzymes because of their capacity to secrete large quantities of proteins. Currently, functional screening of fungi is associated with low throughput and high costs, which severely limits the discovery of novel enzymatic activities and better production strains. Here, we describe a nanoliter-range droplet-based microfluidic system specially adapted for the high-throughput sceening (HTS) of large filamentous fungi libraries for secreted enzyme activities. The platform allowed (i) compartmentalization of single spores in ~10 nl droplets, (ii) germination and mycelium growth and (iii) high-throughput sorting of fungi based on enzymatic activity. A 104 clone UV-mutated library of Aspergillus niger was screened based on α-amylase activity in just 90 minutes. Active clones were enriched 196-fold after a single round of microfluidic HTS. The platform is a powerful tool for the development of new production strains with low cost, space and time footprint and should bring enormous benefit for improving the viability of biotechnological processes.
Comprehensive analysis of the T-cell receptor beta chain gene in rhesus monkey by high throughput sequencing

PubMed Central

Li, Zhoufang; Liu, Guangjie; Tong, Yin; Zhang, Meng; Xu, Ying; Qin, Li; Wang, Zhanhui; Chen, Xiaoping; He, Jiankui

2015-01-01

Profiling immune repertoires by high throughput sequencing enhances our understanding of immune system complexity and immune-related diseases in humans. Previously, cloning and Sanger sequencing identified limited numbers of T cell receptor (TCR) nucleotide sequences in rhesus monkeys, thus their full immune repertoire is unknown. We applied multiplex PCR and Illumina high throughput sequencing to study the TCRβ of rhesus monkeys. We identified 1.26 million TCRβ sequences corresponding to 643,570 unique TCRβ sequences and 270,557 unique complementarity-determining region 3 (CDR3) gene sequences. Precise measurements of CDR3 length distribution, CDR3 amino acid distribution, length distribution of N nucleotide of junctional region, and TCRV and TCRJ gene usage preferences were performed. A comprehensive profile of rhesus monkey immune repertoire might aid human infectious disease studies using rhesus monkeys. PMID:25961410
Three-Dimensional Root Phenotyping with a Novel Imaging and Software Platform1[C][W][OA

PubMed Central

Clark, Randy T.; MacCurdy, Robert B.; Jung, Janelle K.; Shaff, Jon E.; McCouch, Susan R.; Aneshansley, Daniel J.; Kochian, Leon V.

2011-01-01

A novel imaging and software platform was developed for the high-throughput phenotyping of three-dimensional root traits during seedling development. To demonstrate the platform’s capacity, plants of two rice (Oryza sativa) genotypes, Azucena and IR64, were grown in a transparent gellan gum system and imaged daily for 10 d. Rotational image sequences consisting of 40 two-dimensional images were captured using an optically corrected digital imaging system. Three-dimensional root reconstructions were generated and analyzed using a custom-designed software, RootReader3D. Using the automated and interactive capabilities of RootReader3D, five rice root types were classified and 27 phenotypic root traits were measured to characterize these two genotypes. Where possible, measurements from the three-dimensional platform were validated and were highly correlated with conventional two-dimensional measurements. When comparing gellan gum-grown plants with those grown under hydroponic and sand culture, significant differences were detected in morphological root traits (P < 0.05). This highly flexible platform provides the capacity to measure root traits with a high degree of spatial and temporal resolution and will facilitate novel investigations into the development of entire root systems or selected components of root systems. In combination with the extensive genetic resources that are now available, this platform will be a powerful resource to further explore the molecular and genetic determinants of root system architecture. PMID:21454799
[Complete genome sequencing of polymalic acid-producing strain Aureobasidium pullulans CCTCC M2012223].

PubMed

Wang, Yongkang; Song, Xiaodan; Li, Xiaorong; Yang, Sang-tian; Zou, Xiang

2017-01-04

To explore the genome sequence of Aureobasidium pullulans CCTCC M2012223, analyze the key genes related to the biosynthesis of important metabolites, and provide genetic background for metabolic engineering. Complete genome of A. pullulans CCTCC M2012223 was sequenced by Illumina HiSeq high throughput sequencing platform. Then, fragment assembly, gene prediction, functional annotation, and GO/COG cluster were analyzed in comparison with those of other five A. pullulans varieties. The complete genome sequence of A. pullulans CCTCC M2012223 was 30756831 bp with an average GC content of 47.49%, and 9452 genes were successfully predicted. Genome-wide analysis showed that A. pullulans CCTCC M2012223 had the biggest genome assembly size. Protein sequences involved in the pullulan and polymalic acid pathway were highly conservative in all of six A. pullulans varieties. Although both A. pullulans CCTCC M2012223 and A. pullulans var. melanogenum have a close affinity, some point mutation and inserts were occurred in protein sequences involved in melanin biosynthesis. Genome information of A. pullulans CCTCC M2012223 was annotated and genes involved in melanin, pullulan and polymalic acid pathway were compared, which would provide a theoretical basis for genetic modification of metabolic pathway in A. pullulans.
Sequencing the Connectome

PubMed Central

Zador, Anthony M.; Dubnau, Joshua; Oyibo, Hassana K.; Zhan, Huiqing; Cao, Gang; Peikon, Ian D.

2012-01-01

Connectivity determines the function of neural circuits. Historically, circuit mapping has usually been viewed as a problem of microscopy, but no current method can achieve high-throughput mapping of entire circuits with single neuron precision. Here we describe a novel approach to determining connectivity. We propose BOINC (“barcoding of individual neuronal connections”), a method for converting the problem of connectivity into a form that can be read out by high-throughput DNA sequencing. The appeal of using sequencing is that its scale—sequencing billions of nucleotides per day is now routine—is a natural match to the complexity of neural circuits. An inexpensive high-throughput technique for establishing circuit connectivity at single neuron resolution could transform neuroscience research. PMID:23109909
FastaValidator: an open-source Java library to parse and validate FASTA formatted sequences.

PubMed

Waldmann, Jost; Gerken, Jan; Hankeln, Wolfgang; Schweer, Timmy; Glöckner, Frank Oliver

2014-06-14

Advances in sequencing technologies challenge the efficient importing and validation of FASTA formatted sequence data which is still a prerequisite for most bioinformatic tools and pipelines. Comparative analysis of commonly used Bio*-frameworks (BioPerl, BioJava and Biopython) shows that their scalability and accuracy is hampered. FastaValidator represents a platform-independent, standardized, light-weight software library written in the Java programming language. It targets computer scientists and bioinformaticians writing software which needs to parse quickly and accurately large amounts of sequence data. For end-users FastaValidator includes an interactive out-of-the-box validation of FASTA formatted files, as well as a non-interactive mode designed for high-throughput validation in software pipelines. The accuracy and performance of the FastaValidator library qualifies it for large data sets such as those commonly produced by massive parallel (NGS) technologies. It offers scientists a fast, accurate and standardized method for parsing and validating FASTA formatted sequence data.
Haplotag: Software for Haplotype-Based Genotyping-by-Sequencing Analysis

PubMed Central

Tinker, Nicholas A.; Bekele, Wubishet A.; Hattori, Jiro

2016-01-01

Genotyping-by-sequencing (GBS), and related methods, are based on high-throughput short-read sequencing of genomic complexity reductions followed by discovery of single nucleotide polymorphisms (SNPs) within sequence tags. This provides a powerful and economical approach to whole-genome genotyping, facilitating applications in genomics, diversity analysis, and molecular breeding. However, due to the complexity of analyzing large data sets, applications of GBS may require substantial time, expertise, and computational resources. Haplotag, the novel GBS software described here, is freely available, and operates with minimal user-investment on widely available computer platforms. Haplotag is unique in fulfilling the following set of criteria: (1) operates without a reference genome; (2) can be used in a polyploid species; (3) provides a discovery mode, and a production mode; (4) discovers polymorphisms based on a model of tag-level haplotypes within sequenced tags; (5) reports SNPs as well as haplotype-based genotypes; and (6) provides an intuitive visual “passport” for each inferred locus. Haplotag is optimized for use in a self-pollinating plant species. PMID:26818073
Annotating Protein Functional Residues by Coupling High-Throughput Fitness Profile and Homologous-Structure Analysis.

PubMed

Du, Yushen; Wu, Nicholas C; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting; Sun, Ren

2016-11-01

Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available. Copyright © 2016 Du et al.

Intersection of toxicogenomics and high throughput screening in the Tox21 program: an NIEHS perspective.

PubMed

Merrick, B Alex; Paules, Richard S; Tice, Raymond R

Humans are exposed to thousands of chemicals with inadequate toxicological data. Advances in computational toxicology, robotic high throughput screening (HTS), and genome-wide expression have been integrated into the Tox21 program to better predict the toxicological effects of chemicals. Tox21 is a collaboration among US government agencies initiated in 2008 that aims to shift chemical hazard assessment from traditional animal toxicology to target-specific, mechanism-based, biological observations using in vitro assays and lower organism models. HTS uses biocomputational methods for probing thousands of chemicals in in vitro assays for gene-pathway response patterns predictive of adverse human health outcomes. In 1999, NIEHS began exploring the application of toxicogenomics to toxicology and recent advances in NextGen sequencing should greatly enhance the biological content obtained from HTS platforms. We foresee an intersection of new technologies in toxicogenomics and HTS as an innovative development in Tox21. Tox21 goals, priorities, progress, and challenges will be reviewed.
Advances in the Study of Heart Development and Disease Using Zebrafish

PubMed Central

Brown, Daniel R.; Samsa, Leigh Ann; Qian, Li; Liu, Jiandong

2016-01-01

Animal models of cardiovascular disease are key players in the translational medicine pipeline used to define the conserved genetic and molecular basis of disease. Congenital heart diseases (CHDs) are the most common type of human birth defect and feature structural abnormalities that arise during cardiac development and maturation. The zebrafish, Danio rerio, is a valuable vertebrate model organism, offering advantages over traditional mammalian models. These advantages include the rapid, stereotyped and external development of transparent embryos produced in large numbers from inexpensively housed adults, vast capacity for genetic manipulation, and amenability to high-throughput screening. With the help of modern genetics and a sequenced genome, zebrafish have led to insights in cardiovascular diseases ranging from CHDs to arrhythmia and cardiomyopathy. Here, we discuss the utility of zebrafish as a model system and summarize zebrafish cardiac morphogenesis with emphasis on parallels to human heart diseases. Additionally, we discuss the specific tools and experimental platforms utilized in the zebrafish model including forward screens, functional characterization of candidate genes, and high throughput applications. PMID:27335817
High-throughput sequencing analysis of the bacteria in the dust storm which passed over Canberra, Australia on 22-23 September 2009

NASA Astrophysics Data System (ADS)

Munday, Chris; De Deckker, Patrick; Tapper, Nigel; Allison, Gwen

2014-05-01

Following a prolonged drought in Australia in the first decade of the 21st century, several dust storms affected the heavily populated East coast of Australia. The largest such storm occurred on 22-23 September 2009 and had a front of an estimated 3000km. A 24hr average PM10 concentration of over 2,000μg/m3 was recorded in several locations and an hourly peak of over 15,000μg/m3 was recorded (Leys et al. 2011). Over two time periods duplicate aerosol samples were collected on 47mm diameter cellulose nitrate membranes at a location removed from anthropogenic influences. One set of samples was collected in the afternoon the dust event started and another was collected overnight. Additionally, overnight rainfall was collected in a sterile bottle.DNA was directly extracted one membrane from each time point for molecular cloning and high throughput sequencing, while the other was cultivated on Tryptic Soy Agar (TSA). High throughput sequencing was performed using the 454 Titanium platform. From the three samples, 19,945 curated sequences were obtained representing 942 OTUS, with the three samples approximately equal in number. Unclassified Rhizobiales and Stenotrophomonas were the most abundant groups which could be attributed names. A total of 942 OTUs were identified (cutoff = 0.03), and despite the temporal relation of the samples, only eleven were found in all three samples, indicating that the dust storm evolved in composition as it passed over the region. Approximately 800 and 500 CFU/m3 were found in the two cultivated samples, tenfold more than was collected from previous dust events (Lim et al, 2011). Identification of cultivars revealed a dominance of the gram positive Firmicutes phylum, while the clone library showed a more even distribution of taxa, with Actinobacteria the most common and Firmicutes comprising less than 10% of sequences. Collectively, the analyses indicate that the concentration of cultivable organisms during the dust storm dramatically relative to calm conditions. A diverse and variable population of microorganisms were present reflecting the vast source and dynamic nature of the storm.
High-throughput two-dimensional root system phenotyping platform facilitates genetic analysis of root growth and development.

PubMed

Clark, Randy T; Famoso, Adam N; Zhao, Keyan; Shaff, Jon E; Craft, Eric J; Bustamante, Carlos D; McCouch, Susan R; Aneshansley, Daniel J; Kochian, Leon V

2013-02-01

High-throughput phenotyping of root systems requires a combination of specialized techniques and adaptable plant growth, root imaging and software tools. A custom phenotyping platform was designed to capture images of whole root systems, and novel software tools were developed to process and analyse these images. The platform and its components are adaptable to a wide range root phenotyping studies using diverse growth systems (hydroponics, paper pouches, gel and soil) involving several plant species, including, but not limited to, rice, maize, sorghum, tomato and Arabidopsis. The RootReader2D software tool is free and publicly available and was designed with both user-guided and automated features that increase flexibility and enhance efficiency when measuring root growth traits from specific roots or entire root systems during large-scale phenotyping studies. To demonstrate the unique capabilities and high-throughput capacity of this phenotyping platform for studying root systems, genome-wide association studies on rice (Oryza sativa) and maize (Zea mays) root growth were performed and root traits related to aluminium (Al) tolerance were analysed on the parents of the maize nested association mapping (NAM) population. © 2012 Blackwell Publishing Ltd.
High throughput MLVA-16 typing for Brucella based on the microfluidics technology

PubMed Central

2011-01-01

Background Brucellosis, a zoonosis caused by the genus Brucella, has been eradicated in Northern Europe, Australia, the USA and Canada, but remains endemic in most areas of the world. The strain and biovar typing of Brucella field samples isolated in outbreaks is useful for tracing back source of infection and may be crucial for discriminating naturally occurring outbreaks versus bioterrorist events, being Brucella a potential biological warfare agent. In the last years MLVA-16 has been described for Brucella spp. genotyping. The MLVA band profiles may be resolved by different techniques i.e. the manual agarose gels, the capillary electrophoresis sequencing systems or the microfluidic Lab-on-Chip electrophoresis. In this paper we described a high throughput system of MLVA-16 typing for Brucella spp. by using of the microfluidics technology. Results The Caliper LabChip 90 equipment was evaluated for MLVA-16 typing of sixty-three Brucella samples. Furthermore, in order to validate the system, DNA samples previously resolved by sequencing system and Agilent technology, were de novo genotyped. The comparison of the MLVA typing data obtained by the Caliper equipment and those previously obtained by the other analysis methods showed a good correlation. However the outputs were not accurate as the Caliper DNA fragment sizes showed discrepancies compared with real data and a conversion table from observed to expected data was created. Conclusion In this paper we described the MLVA-16 using a rapid, sophisticated microfluidics technology for detection of amplification product sizes. The comparison of the MLVA typing data produced by Caliper LabChip 90 system with the data obtained by different techniques showed a general concordance of the results. Furthermore this platform represents a significant improvement in terms of handling, data acquiring, computational efficiency and rapidity, allowing to perform the strain genotyping in a time equal to one sixth respect to other microfluidics systems as e.g. the Agilent 2100 bioanalyzer. Finally, this platform can be considered a valid alternative to standard genotyping techniques, particularly useful dealing with a large number of samples in short time. These data confirmed that this technology represents a significative advancement in high-throughput accurate Brucella genotyping. PMID:21435217
A high-throughput and quantitative method to assess the mutagenic potential of translesion DNA synthesis

PubMed Central

Taggart, David J.; Camerlengo, Terry L.; Harrison, Jason K.; Sherrer, Shanen M.; Kshetry, Ajay K.; Taylor, John-Stephen; Huang, Kun; Suo, Zucai

2013-01-01

Cellular genomes are constantly damaged by endogenous and exogenous agents that covalently and structurally modify DNA to produce DNA lesions. Although most lesions are mended by various DNA repair pathways in vivo, a significant number of damage sites persist during genomic replication. Our understanding of the mutagenic outcomes derived from these unrepaired DNA lesions has been hindered by the low throughput of existing sequencing methods. Therefore, we have developed a cost-effective high-throughput short oligonucleotide sequencing assay that uses next-generation DNA sequencing technology for the assessment of the mutagenic profiles of translesion DNA synthesis catalyzed by any error-prone DNA polymerase. The vast amount of sequencing data produced were aligned and quantified by using our novel software. As an example, the high-throughput short oligonucleotide sequencing assay was used to analyze the types and frequencies of mutations upstream, downstream and at a site-specifically placed cis–syn thymidine–thymidine dimer generated individually by three lesion-bypass human Y-family DNA polymerases. PMID:23470999
AmpliVar: mutation detection in high-throughput sequence from amplicon-based libraries.

PubMed

Hsu, Arthur L; Kondrashova, Olga; Lunke, Sebastian; Love, Clare J; Meldrum, Cliff; Marquis-Nicholson, Renate; Corboy, Greg; Pham, Kym; Wakefield, Matthew; Waring, Paul M; Taylor, Graham R

2015-04-01

Conventional means of identifying variants in high-throughput sequencing align each read against a reference sequence, and then call variants at each position. Here, we demonstrate an orthogonal means of identifying sequence variation by grouping the reads as amplicons prior to any alignment. We used AmpliVar to make key-value hashes of sequence reads and group reads as individual amplicons using a table of flanking sequences. Low-abundance reads were removed according to a selectable threshold, and reads above this threshold were aligned as groups, rather than as individual reads, permitting the use of sensitive alignment tools. We show that this approach is more sensitive, more specific, and more computationally efficient than comparable methods for the analysis of amplicon-based high-throughput sequencing data. The method can be extended to enable alignment-free confirmation of variants seen in hybridization capture target-enrichment data. © 2015 WILEY PERIODICALS, INC.
A BSL-4 high-throughput screen identifies sulfonamide inhibitors of Nipah virus.

PubMed

Tigabu, Bersabeh; Rasmussen, Lynn; White, E Lucile; Tower, Nichole; Saeed, Mohammad; Bukreyev, Alexander; Rockx, Barry; LeDuc, James W; Noah, James W

2014-04-01

Nipah virus is a biosafety level 4 (BSL-4) pathogen that causes severe respiratory illness and encephalitis in humans. To identify novel small molecules that target Nipah virus replication as potential therapeutics, Southern Research Institute and Galveston National Laboratory jointly developed an automated high-throughput screening platform that is capable of testing 10,000 compounds per day within BSL-4 biocontainment. Using this platform, we screened a 10,080-compound library using a cell-based, high-throughput screen for compounds that inhibited the virus-induced cytopathic effect. From this pilot effort, 23 compounds were identified with EC50 values ranging from 3.9 to 20.0 μM and selectivities >10. Three sulfonamide compounds with EC50 values <12 μM were further characterized for their point of intervention in the viral replication cycle and for broad antiviral efficacy. Development of HTS capability under BSL-4 containment changes the paradigm for drug discovery for highly pathogenic agents because this platform can be readily modified to identify prophylactic and postexposure therapeutic candidates against other BSL-4 pathogens, particularly Ebola, Marburg, and Lassa viruses.
A BSL-4 High-Throughput Screen Identifies Sulfonamide Inhibitors of Nipah Virus

PubMed Central

Tigabu, Bersabeh; Rasmussen, Lynn; White, E. Lucile; Tower, Nichole; Saeed, Mohammad; Bukreyev, Alexander; Rockx, Barry; LeDuc, James W.

2014-01-01

Abstract Nipah virus is a biosafety level 4 (BSL-4) pathogen that causes severe respiratory illness and encephalitis in humans. To identify novel small molecules that target Nipah virus replication as potential therapeutics, Southern Research Institute and Galveston National Laboratory jointly developed an automated high-throughput screening platform that is capable of testing 10,000 compounds per day within BSL-4 biocontainment. Using this platform, we screened a 10,080-compound library using a cell-based, high-throughput screen for compounds that inhibited the virus-induced cytopathic effect. From this pilot effort, 23 compounds were identified with EC50 values ranging from 3.9 to 20.0 μM and selectivities >10. Three sulfonamide compounds with EC50 values <12 μM were further characterized for their point of intervention in the viral replication cycle and for broad antiviral efficacy. Development of HTS capability under BSL-4 containment changes the paradigm for drug discovery for highly pathogenic agents because this platform can be readily modified to identify prophylactic and postexposure therapeutic candidates against other BSL-4 pathogens, particularly Ebola, Marburg, and Lassa viruses. PMID:24735442
Raspberry Pi-powered imaging for plant phenotyping.

PubMed

Tovar, Jose C; Hoyer, J Steen; Lin, Andy; Tielking, Allison; Callen, Steven T; Elizabeth Castillo, S; Miller, Michael; Tessman, Monica; Fahlgren, Noah; Carrington, James C; Nusinow, Dmitri A; Gehan, Malia A

2018-03-01

Image-based phenomics is a powerful approach to capture and quantify plant diversity. However, commercial platforms that make consistent image acquisition easy are often cost-prohibitive. To make high-throughput phenotyping methods more accessible, low-cost microcomputers and cameras can be used to acquire plant image data. We used low-cost Raspberry Pi computers and cameras to manage and capture plant image data. Detailed here are three different applications of Raspberry Pi-controlled imaging platforms for seed and shoot imaging. Images obtained from each platform were suitable for extracting quantifiable plant traits (e.g., shape, area, height, color) en masse using open-source image processing software such as PlantCV. This protocol describes three low-cost platforms for image acquisition that are useful for quantifying plant diversity. When coupled with open-source image processing tools, these imaging platforms provide viable low-cost solutions for incorporating high-throughput phenomics into a wide range of research programs.
PRADA: pipeline for RNA sequencing data analysis.

PubMed

Torres-García, Wandaliz; Zheng, Siyuan; Sivachenko, Andrey; Vegesna, Rahulsimham; Wang, Qianghu; Yao, Rong; Berger, Michael F; Weinstein, John N; Getz, Gad; Verhaak, Roel G W

2014-08-01

Technological advances in high-throughput sequencing necessitate improved computational tools for processing and analyzing large-scale datasets in a systematic automated manner. For that purpose, we have developed PRADA (Pipeline for RNA-Sequencing Data Analysis), a flexible, modular and highly scalable software platform that provides many different types of information available by multifaceted analysis starting from raw paired-end RNA-seq data: gene expression levels, quality metrics, detection of unsupervised and supervised fusion transcripts, detection of intragenic fusion variants, homology scores and fusion frame classification. PRADA uses a dual-mapping strategy that increases sensitivity and refines the analytical endpoints. PRADA has been used extensively and successfully in the glioblastoma and renal clear cell projects of The Cancer Genome Atlas program. http://sourceforge.net/projects/prada/ gadgetz@broadinstitute.org or rverhaak@mdanderson.org Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
High-Throughput Block Optical DNA Sequence Identification.

PubMed

Sagar, Dodderi Manjunatha; Korshoj, Lee Erik; Hanson, Katrina Bethany; Chowdhury, Partha Pratim; Otoupal, Peter Britton; Chatterjee, Anushree; Nagpal, Prashant

2018-01-01

Optical techniques for molecular diagnostics or DNA sequencing generally rely on small molecule fluorescent labels, which utilize light with a wavelength of several hundred nanometers for detection. Developing a label-free optical DNA sequencing technique will require nanoscale focusing of light, a high-throughput and multiplexed identification method, and a data compression technique to rapidly identify sequences and analyze genomic heterogeneity for big datasets. Such a method should identify characteristic molecular vibrations using optical spectroscopy, especially in the "fingerprinting region" from ≈400-1400 cm -1 . Here, surface-enhanced Raman spectroscopy is used to demonstrate label-free identification of DNA nucleobases with multiplexed 3D plasmonic nanofocusing. While nanometer-scale mode volumes prevent identification of single nucleobases within a DNA sequence, the block optical technique can identify A, T, G, and C content in DNA k-mers. The content of each nucleotide in a DNA block can be a unique and high-throughput method for identifying sequences, genes, and other biomarkers as an alternative to single-letter sequencing. Additionally, coupling two complementary vibrational spectroscopy techniques (infrared and Raman) can improve block characterization. These results pave the way for developing a novel, high-throughput block optical sequencing method with lossy genomic data compression using k-mer identification from multiplexed optical data acquisition. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Large-scale prediction of ADAR-mediated effective human A-to-I RNA editing.

PubMed

Yao, Li; Wang, Heming; Song, Yuanyuan; Dai, Zhen; Yu, Hao; Yin, Ming; Wang, Dongxu; Yang, Xin; Wang, Jinlin; Wang, Tiedong; Cao, Nan; Zhu, Jimin; Shen, Xizhong; Song, Guangqi; Zhao, Yicheng

2017-08-10

Adenosine-to-inosine (A-to-I) editing by adenosine deaminase acting on the RNA (ADAR) proteins is one of the most frequent modifications during post- and co-transcription. To facilitate the assignment of biological functions to specific editing sites, we designed an automatic online platform to annotate A-to-I RNA editing sites in pre-mRNA splicing signals, microRNAs (miRNAs) and miRNA target untranslated regions (3' UTRs) from human (Homo sapiens) high-throughput sequencing data and predict their effects based on large-scale bioinformatic analysis. After analysing plenty of previously reported RNA editing events and human normal tissues RNA high-seq data, >60 000 potentially effective RNA editing events on functional genes were found. The RNA Editing Plus platform is available for free at https://www.rnaeditplus.org/, and we believe our platform governing multiple optimized methods will improve further studies of A-to-I-induced editing post-transcriptional regulation. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Transcriptome sequencing and annotation of the halophytic microalga Dunaliella salina * #

PubMed Central

Hong, Ling; Liu, Jun-li; Midoun, Samira Z.; Miller, Philip C.

2017-01-01

The unicellular green alga Dunaliella salina is well adapted to salt stress and contains compounds (including β-carotene and vitamins) with potential commercial value. A large transcriptome database of D. salina during the adjustment, exponential and stationary growth phases was generated using a high throughput sequencing platform. We characterized the metabolic processes in D. salina with a focus on valuable metabolites, with the aim of manipulating D. salina to achieve greater economic value in large-scale production through a bioengineering strategy. Gene expression profiles under salt stress verified using quantitative polymerase chain reaction (qPCR) implied that salt can regulate the expression of key genes. This study generated a substantial fraction of D. salina transcriptional sequences for the entire growth cycle, providing a basis for the discovery of novel genes. This first full-scale transcriptome study of D. salina establishes a foundation for further comparative genomic studies. PMID:28990374
MetaDP: a comprehensive web server for disease prediction of 16S rRNA metagenomic datasets.

PubMed

Xu, Xilin; Wu, Aiping; Zhang, Xinlei; Su, Mingming; Jiang, Taijiao; Yuan, Zhe-Ming

2016-01-01

High-throughput sequencing-based metagenomics has garnered considerable interest in recent years. Numerous methods and tools have been developed for the analysis of metagenomic data. However, it is still a daunting task to install a large number of tools and complete a complicated analysis, especially for researchers with minimal bioinformatics backgrounds. To address this problem, we constructed an automated software named MetaDP for 16S rRNA sequencing data analysis, including data quality control, operational taxonomic unit clustering, diversity analysis, and disease risk prediction modeling. Furthermore, a support vector machine-based prediction model for intestinal bowel syndrome (IBS) was built by applying MetaDP to microbial 16S sequencing data from 108 children. The success of the IBS prediction model suggests that the platform may also be applied to other diseases related to gut microbes, such as obesity, metabolic syndrome, or intestinal cancer, among others (http://metadp.cn:7001/).
High-throughput sequencing methods to study neuronal RNA-protein interactions.

PubMed

Ule, Jernej

2009-12-01

UV-cross-linking and RNase protection, combined with high-throughput sequencing, have provided global maps of RNA sites bound by individual proteins or ribosomes. Using a stringent purification protocol, UV-CLIP (UV-cross-linking and immunoprecipitation) was able to identify intronic and exonic sites bound by splicing regulators in mouse brain tissue. Ribosome profiling has been used to quantify ribosome density on budding yeast mRNAs under different environmental conditions. Post-transcriptional regulation in neurons requires high spatial and temporal precision, as is evident from the role of localized translational control in synaptic plasticity. It remains to be seen if the high-throughput methods can be applied quantitatively to study the dynamics of RNP (ribonucleoprotein) remodelling in specific neuronal populations during the neurodegenerative process. It is certain, however, that applications of new biochemical techniques followed by high-throughput sequencing will continue to provide important insights into the mechanisms of neuronal post-transcriptional regulation.
Optimization of High-Throughput Sequencing Kinetics for determining enzymatic rate constants of thousands of RNA substrates

PubMed Central

Niland, Courtney N.; Jankowsky, Eckhard; Harris, Michael E.

2016-01-01

Quantification of the specificity of RNA binding proteins and RNA processing enzymes is essential to understanding their fundamental roles in biological processes. High Throughput Sequencing Kinetics (HTS-Kin) uses high throughput sequencing and internal competition kinetics to simultaneously monitor the processing rate constants of thousands of substrates by RNA processing enzymes. This technique has provided unprecedented insight into the substrate specificity of the tRNA processing endonuclease ribonuclease P. Here, we investigate the accuracy and robustness of measurements associated with each step of the HTS-Kin procedure. We examine the effect of substrate concentration on the observed rate constant, determine the optimal kinetic parameters, and provide guidelines for reducing error in amplification of the substrate population. Importantly, we find that high-throughput sequencing, and experimental reproducibility contribute their own sources of error, and these are the main sources of imprecision in the quantified results when otherwise optimized guidelines are followed. PMID:27296633
Next-generation sequencing coupled with a cell-free display technology for high-throughput production of reliable interactome data

PubMed Central

Fujimori, Shigeo; Hirai, Naoya; Ohashi, Hiroyuki; Masuoka, Kazuyo; Nishikimi, Akihiko; Fukui, Yoshinori; Washio, Takanori; Oshikubo, Tomohiro; Yamashita, Tatsuhiro; Miyamoto-Sato, Etsuko

2012-01-01

Next-generation sequencing (NGS) has been applied to various kinds of omics studies, resulting in many biological and medical discoveries. However, high-throughput protein-protein interactome datasets derived from detection by sequencing are scarce, because protein-protein interaction analysis requires many cell manipulations to examine the interactions. The low reliability of the high-throughput data is also a problem. Here, we describe a cell-free display technology combined with NGS that can improve both the coverage and reliability of interactome datasets. The completely cell-free method gives a high-throughput and a large detection space, testing the interactions without using clones. The quantitative information provided by NGS reduces the number of false positives. The method is suitable for the in vitro detection of proteins that interact not only with the bait protein, but also with DNA, RNA and chemical compounds. Thus, it could become a universal approach for exploring the large space of protein sequences and interactome networks. PMID:23056904
Droplet microfluidic technology for single-cell high-throughput screening.

PubMed

Brouzes, Eric; Medkova, Martina; Savenelli, Neal; Marran, Dave; Twardowski, Mariusz; Hutchison, J Brian; Rothberg, Jonathan M; Link, Darren R; Perrimon, Norbert; Samuels, Michael L

2009-08-25

We present a droplet-based microfluidic technology that enables high-throughput screening of single mammalian cells. This integrated platform allows for the encapsulation of single cells and reagents in independent aqueous microdroplets (1 pL to 10 nL volumes) dispersed in an immiscible carrier oil and enables the digital manipulation of these reactors at a very high-throughput. Here, we validate a full droplet screening workflow by conducting a droplet-based cytotoxicity screen. To perform this screen, we first developed a droplet viability assay that permits the quantitative scoring of cell viability and growth within intact droplets. Next, we demonstrated the high viability of encapsulated human monocytic U937 cells over a period of 4 days. Finally, we developed an optically-coded droplet library enabling the identification of the droplets composition during the assay read-out. Using the integrated droplet technology, we screened a drug library for its cytotoxic effect against U937 cells. Taken together our droplet microfluidic platform is modular, robust, uses no moving parts, and has a wide range of potential applications including high-throughput single-cell analyses, combinatorial screening, and facilitating small sample analyses.
Real-time imaging of microparticles and living cells with CMOS nanocapacitor arrays

NASA Astrophysics Data System (ADS)

Laborde, C.; Pittino, F.; Verhoeven, H. A.; Lemay, S. G.; Selmi, L.; Jongsma, M. A.; Widdershoven, F. P.

2015-09-01

Platforms that offer massively parallel, label-free biosensing can, in principle, be created by combining all-electrical detection with low-cost integrated circuits. Examples include field-effect transistor arrays, which are used for mapping neuronal signals and sequencing DNA. Despite these successes, however, bioelectronics has so far failed to deliver a broadly applicable biosensing platform. This is due, in part, to the fact that d.c. or low-frequency signals cannot be used to probe beyond the electrical double layer formed by screening salt ions, which means that under physiological conditions the sensing of a target analyte located even a short distance from the sensor (∼1 nm) is severely hampered. Here, we show that high-frequency impedance spectroscopy can be used to detect and image microparticles and living cells under physiological salt conditions. Our assay employs a large-scale, high-density array of nanoelectrodes integrated with CMOS electronics on a single chip and the sensor response depends on the electrical properties of the analyte, allowing impedance-based fingerprinting. With our platform, we image the dynamic attachment and micromotion of BEAS, THP1 and MCF7 cancer cell lines in real time at submicrometre resolution in growth medium, demonstrating the potential of the platform for label/tracer-free high-throughput screening of anti-tumour drug candidates.

Initial steps towards a production platform for DNA sequence analysis on the grid.

PubMed

Luyf, Angela C M; van Schaik, Barbera D C; de Vries, Michel; Baas, Frank; van Kampen, Antoine H C; Olabarriaga, Silvia D

2010-12-14

Bioinformatics is confronted with a new data explosion due to the availability of high throughput DNA sequencers. Data storage and analysis becomes a problem on local servers, and therefore it is needed to switch to other IT infrastructures. Grid and workflow technology can help to handle the data more efficiently, as well as facilitate collaborations. However, interfaces to grids are often unfriendly to novice users. In this study we reused a platform that was developed in the VL-e project for the analysis of medical images. Data transfer, workflow execution and job monitoring are operated from one graphical interface. We developed workflows for two sequence alignment tools (BLAST and BLAT) as a proof of concept. The analysis time was significantly reduced. All workflows and executables are available for the members of the Dutch Life Science Grid and the VL-e Medical virtual organizations All components are open source and can be transported to other grid infrastructures. The availability of in-house expertise and tools facilitates the usage of grid resources by new users. Our first results indicate that this is a practical, powerful and scalable solution to address the capacity and collaboration issues raised by the deployment of next generation sequencers. We currently adopt this methodology on a daily basis for DNA sequencing and other applications. More information and source code is available via http://www.bioinformaticslaboratory.nl/
Development of a phenotyping platform for high throughput screening of nodal root angle in sorghum.

PubMed

Joshi, Dinesh C; Singh, Vijaya; Hunt, Colleen; Mace, Emma; van Oosterom, Erik; Sulman, Richard; Jordan, David; Hammer, Graeme

2017-01-01

In sorghum, the growth angle of nodal roots is a major component of root system architecture. It strongly influences the spatial distribution of roots of mature plants in the soil profile, which can impact drought adaptation. However, selection for nodal root angle in sorghum breeding programs has been restricted by the absence of a suitable high throughput phenotyping platform. The aim of this study was to develop a phenotyping platform for the rapid, non-destructive and digital measurement of nodal root angle of sorghum at the seedling stage. The phenotyping platform comprises of 500 soil filled root chambers (50 × 45 × 0.3 cm in size), made of transparent perspex sheets that were placed in metal tubs and covered with polycarbonate sheets. Around 3 weeks after sowing, once the first flush of nodal roots was visible, roots were imaged in situ using an imaging box that included two digital cameras that were remotely controlled by two android tablets. Free software ( openGelPhoto.tcl ) allowed precise measurement of nodal root angle from the digital images. The reliability and efficiency of the platform was evaluated by screening a large nested association mapping population of sorghum and a set of hybrids in six independent experimental runs that included up to 500 plants each. The platform revealed extensive genetic variation and high heritability (repeatability) for nodal root angle. High genetic correlations and consistent ranking of genotypes across experimental runs confirmed the reproducibility of the platform. This low cost, high throughput root phenotyping platform requires no sophisticated equipment, is adaptable to most glasshouse environments and is well suited to dissect the genetic control of nodal root angle of sorghum. The platform is suitable for use in sorghum breeding programs aiming to improve drought adaptation through root system architecture manipulation.
Resequencing of the common marmoset genome improves genome assemblies and gene-coding sequence analysis.

PubMed

Sato, Kengo; Kuroki, Yoko; Kumita, Wakako; Fujiyama, Asao; Toyoda, Atsushi; Kawai, Jun; Iriki, Atsushi; Sasaki, Erika; Okano, Hideyuki; Sakakibara, Yasubumi

2015-11-20

The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium. The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacches-3.2.1, but there still exist 187,214 undetermined gap regions and supercontigs and relatively short contigs that are unmapped to chromosomes in the draft genome. We performed resequencing and assembly of the genome of common marmoset by deep sequencing with high-throughput sequencing technology. Several different sequence runs using Illumina sequencing platforms were executed, and 181 Gbp of high-quality bases including mate-pairs with long insert lengths of 3, 8, 20, and 40 Kbp were obtained, that is, approximately 60× coverage. The resequencing significantly improved the MGSAC draft genome sequence. The N50 of the contigs, which is a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs, and the improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled.
An evaluation of the accuracy and speed of metagenome analysis tools

PubMed Central

Lindgreen, Stinus; Adair, Karen L.; Gardner, Paul P.

2016-01-01

Metagenome studies are becoming increasingly widespread, yielding important insights into microbial communities covering diverse environments from terrestrial and aquatic ecosystems to human skin and gut. With the advent of high-throughput sequencing platforms, the use of large scale shotgun sequencing approaches is now commonplace. However, a thorough independent benchmark comparing state-of-the-art metagenome analysis tools is lacking. Here, we present a benchmark where the most widely used tools are tested on complex, realistic data sets. Our results clearly show that the most widely used tools are not necessarily the most accurate, that the most accurate tool is not necessarily the most time consuming, and that there is a high degree of variability between available tools. These findings are important as the conclusions of any metagenomics study are affected by errors in the predicted community composition and functional capacity. Data sets and results are freely available from http://www.ucbioinformatics.org/metabenchmark.html PMID:26778510
High-throughput sequencing: a failure mode analysis.

PubMed

Yang, George S; Stott, Jeffery M; Smailus, Duane; Barber, Sarah A; Balasundaram, Miruna; Marra, Marco A; Holt, Robert A

2005-01-04

Basic manufacturing principles are becoming increasingly important in high-throughput sequencing facilities where there is a constant drive to increase quality, increase efficiency, and decrease operating costs. While high-throughput centres report failure rates typically on the order of 10%, the causes of sporadic sequencing failures are seldom analyzed in detail and have not, in the past, been formally reported. Here we report the results of a failure mode analysis of our production sequencing facility based on detailed evaluation of 9,216 ESTs generated from two cDNA libraries. Two categories of failures are described; process-related failures (failures due to equipment or sample handling) and template-related failures (failures that are revealed by close inspection of electropherograms and are likely due to properties of the template DNA sequence itself). Preventative action based on a detailed understanding of failure modes is likely to improve the performance of other production sequencing pipelines.
New Challenges of the Computation of Multiple Sequence Alignments in the High-Throughput Era (2010 JGI/ANL HPC Workshop)

ScienceCinema

Notredame, Cedric

2018-05-02

Cedric Notredame from the Centre for Genomic Regulation gives a presentation on New Challenges of the Computation of Multiple Sequence Alignments in the High-Throughput Era at the JGI/Argonne HPC Workshop on January 26, 2010.
Evaluation of Sequencing Approaches for High-Throughput Transcriptomics - (BOSC)

EPA Science Inventory

Whole-genome in vitro transcriptomics has shown the capability to identify mechanisms of action and estimates of potency for chemical-mediated effects in a toxicological framework, but with limited throughput and high cost. The generation of high-throughput global gene expression...
High-Throughput Platform for Synthesis of Melamine-Formaldehyde Microcapsules.

PubMed

Çakir, Seda; Bauters, Erwin; Rivero, Guadalupe; Parasote, Tom; Paul, Johan; Du Prez, Filip E

2017-07-10

The synthesis of microcapsules via in situ polymerization is a labor-intensive and time-consuming process, where many composition and process factors affect the microcapsule formation and its morphology. Herein, we report a novel combinatorial technique for the preparation of melamine-formaldehyde microcapsules, using a custom-made and automated high-throughput platform (HTP). After performing validation experiments for ensuring the accuracy and reproducibility of the novel platform, a design of experiment study was performed. The influence of different encapsulation parameters was investigated, such as the effect of the surfactant, surfactant type, surfactant concentration and core/shell ratio. As a result, this HTP-platform is suitable to be used for the synthesis of different types of microcapsules in an automated and controlled way, allowing the screening of different reaction parameters in a shorter time compared to the manual synthetic techniques.
The promise and challenge of high-throughput sequencing of the antibody repertoire

PubMed Central

Georgiou, George; Ippolito, Gregory C; Beausang, John; Busse, Christian E; Wardemann, Hedda; Quake, Stephen R

2014-01-01

Efforts to determine the antibody repertoire encoded by B cells in the blood or lymphoid organs using high-throughput DNA sequencing technologies have been advancing at an extremely rapid pace and are transforming our understanding of humoral immune responses. Information gained from high-throughput DNA sequencing of immunoglobulin genes (Ig-seq) can be applied to detect B-cell malignancies with high sensitivity, to discover antibodies specific for antigens of interest, to guide vaccine development and to understand autoimmunity. Rapid progress in the development of experimental protocols and informatics analysis tools is helping to reduce sequencing artifacts, to achieve more precise quantification of clonal diversity and to extract the most pertinent biological information. That said, broader application of Ig-seq, especially in clinical settings, will require the development of a standardized experimental design framework that will enable the sharing and meta-analysis of sequencing data generated by different laboratories. PMID:24441474
High-throughput GPU-based LDPC decoding

NASA Astrophysics Data System (ADS)

Chang, Yang-Lang; Chang, Cheng-Chun; Huang, Min-Yu; Huang, Bormin

2010-08-01

Low-density parity-check (LDPC) code is a linear block code known to approach the Shannon limit via the iterative sum-product algorithm. LDPC codes have been adopted in most current communication systems such as DVB-S2, WiMAX, WI-FI and 10GBASE-T. LDPC for the needs of reliable and flexible communication links for a wide variety of communication standards and configurations have inspired the demand for high-performance and flexibility computing. Accordingly, finding a fast and reconfigurable developing platform for designing the high-throughput LDPC decoder has become important especially for rapidly changing communication standards and configurations. In this paper, a new graphic-processing-unit (GPU) LDPC decoding platform with the asynchronous data transfer is proposed to realize this practical implementation. Experimental results showed that the proposed GPU-based decoder achieved 271x speedup compared to its CPU-based counterpart. It can serve as a high-throughput LDPC decoder.
The future scalability of pH-based genome sequencers: A theoretical perspective

NASA Astrophysics Data System (ADS)

Go, Jonghyun; Alam, Muhammad A.

2013-10-01

Sequencing of human genome is an essential prerequisite for personalized medicine and early prognosis of various genetic diseases. The state-of-art, high-throughput genome sequencing technologies provide improved sequencing; however, their reliance on relatively expensive optical detection schemes has prevented wide-spread adoption of the technology in routine care. In contrast, the recently announced pH-based electronic genome sequencers achieve fast sequencing at low cost because of the compatibility with the current microelectronics technology. While the progress in technology development has been rapid, the physics of the sequencing chips and the potential for future scaling (and therefore, cost reduction) remain unexplored. In this article, we develop a theoretical framework and a scaling theory to explain the principle of operation of the pH-based sequencing chips and use the framework to explore various perceived scaling limits of the technology related to signal to noise ratio, well-to-well crosstalk, and sequencing accuracy. We also address several limitations inherent to the key steps of pH-based genome sequencers, which are widely shared by many other sequencing platforms in the market but remained unexplained properly so far.
Pseudouridines have context-dependent mutation and stop rates in high-throughput sequencing.

PubMed

Zhou, Katherine I; Clark, Wesley C; Pan, David W; Eckwahl, Matthew J; Dai, Qing; Pan, Tao

2018-05-11

The abundant RNA modification pseudouridine (Ψ) has been mapped transcriptome-wide by chemically modifying pseudouridines with carbodiimide and detecting the resulting reverse transcription stops in high-throughput sequencing. However, these methods have limited sensitivity and specificity, in part due to the use of reverse transcription stops. We sought to use mutations rather than just stops in sequencing data to identify pseudouridine sites. Here, we identify reverse transcription conditions that allow read-through of carbodiimide-modified pseudouridine (CMC-Ψ), and we show that pseudouridines in carbodiimide-treated human ribosomal RNA have context-dependent mutation and stop rates in high-throughput sequencing libraries prepared under these conditions. Furthermore, accounting for the context-dependence of mutation and stop rates can enhance the detection of pseudouridine sites. Similar approaches could contribute to the sequencing-based detection of many RNA modifications.
Genetic high throughput screening in Retinitis Pigmentosa based on high resolution melting (HRM) analysis.

PubMed

Anasagasti, Ander; Barandika, Olatz; Irigoyen, Cristina; Benitez, Bruno A; Cooper, Breanna; Cruchaga, Carlos; López de Munain, Adolfo; Ruiz-Ederra, Javier

2013-11-01

Retinitis Pigmentosa (RP) involves a group of genetically determined retinal diseases caused by a large number of mutations that result in rod photoreceptor cell death followed by gradual death of cone cells. Most cases of RP are monogenic, with more than 80 associated genes identified so far. The high number of genes and variants involved in RP, among other factors, is making the molecular characterization of RP a real challenge for many patients. Although HRM has been used for the analysis of isolated variants or single RP genes, as far as we are concerned, this is the first study that uses HRM analysis for a high-throughput screening of several RP genes. Our main goal was to test the suitability of HRM analysis as a genetic screening technique in RP, and to compare its performance with two of the most widely used NGS platforms, Illumina and PGM-Ion Torrent technologies. RP patients (n = 96) were clinically diagnosed at the Ophthalmology Department of Donostia University Hospital, Spain. We analyzed a total of 16 RP genes that meet the following inclusion criteria: 1) size: genes with transcripts of less than 4 kb; 2) number of exons: genes with up to 22 exons; and 3) prevalence: genes reported to account for, at least, 0.4% of total RP cases worldwide. For comparison purposes, RHO gene was also sequenced with Illumina (GAII; Illumina), Ion semiconductor technologies (PGM; Life Technologies) and Sanger sequencing (ABI 3130xl platform; Applied Biosystems). Detected variants were confirmed in all cases by Sanger sequencing and tested for co-segregation in the family of affected probands. We identified a total of 65 genetic variants, 15 of which (23%) were novel, in 49 out of 96 patients. Among them, 14 (4 novel) are probable disease-causing genetic variants in 7 RP genes, affecting 15 patients. Our HRM analysis-based study, proved to be a cost-effective and rapid method that provides an accurate identification of genetic RP variants. This approach is effective for medium sized (<4 kb transcript) RP genes, which constitute over 80% of the total of known RP genes.
Single-cell barcoding and sequencing using droplet microfluidics.

PubMed

Zilionis, Rapolas; Nainys, Juozas; Veres, Adrian; Savova, Virginia; Zemmour, David; Klein, Allon M; Mazutis, Linas

2017-01-01

Single-cell RNA sequencing has recently emerged as a powerful tool for mapping cellular heterogeneity in diseased and healthy tissues, yet high-throughput methods are needed for capturing the unbiased diversity of cells. Droplet microfluidics is among the most promising candidates for capturing and processing thousands of individual cells for whole-transcriptome or genomic analysis in a massively parallel manner with minimal reagent use. We recently established a method called inDrops, which has the capability to index >15,000 cells in an hour. A suspension of cells is first encapsulated into nanoliter droplets with hydrogel beads (HBs) bearing barcoding DNA primers. Cells are then lysed and mRNA is barcoded (indexed) by a reverse transcription (RT) reaction. Here we provide details for (i) establishing an inDrops platform (1 d); (ii) performing hydrogel bead synthesis (4 d); (iii) encapsulating and barcoding cells (1 d); and (iv) RNA-seq library preparation (2 d). inDrops is a robust and scalable platform, and it is unique in its ability to capture and profile >75% of cells in even very small samples, on a scale of thousands or tens of thousands of cells.
Combining clinical and genomics queries using i2b2 – Three methods

PubMed Central

Murphy, Shawn N.; Avillach, Paul; Bellazzi, Riccardo; Phillips, Lori; Gabetta, Matteo; Eran, Alal; McDuffie, Michael T.; Kohane, Isaac S.

2017-01-01

We are fortunate to be living in an era of twin biomedical data surges: a burgeoning representation of human phenotypes in the medical records of our healthcare systems, and high-throughput sequencing making rapid technological advances. The difficulty representing genomic data and its annotations has almost by itself led to the recognition of a biomedical “Big Data” challenge, and the complexity of healthcare data only compounds the problem to the point that coherent representation of both systems on the same platform seems insuperably difficult. We investigated the capability for complex, integrative genomic and clinical queries to be supported in the Informatics for Integrating Biology and the Bedside (i2b2) translational software package. Three different data integration approaches were developed: The first is based on Sequence Ontology, the second is based on the tranSMART engine, and the third on CouchDB. These novel methods for representing and querying complex genomic and clinical data on the i2b2 platform are available today for advancing precision medicine. PMID:28388645
The Mutation-Associated Neoantigen Functional Expansion of Specific T cells (MANAFEST) assay: a sensitive platform for monitoring antitumor immunity.

PubMed

Danilova, Ludmila; Anagnostou, Valsamo; Caushi, Justina X; Sidhom, John-William; Guo, Haidan; Chan, Hok Yee; Suri, Prerna; Tam, Ada J; Zhang, Jiajia; El Asmar, Margueritta; Marrone, Kristen A; Naidoo, Jarushka; Brahmer, Julie R; Forde, Patrick M; Baras, Alexander S; Cope, Leslie; Velculescu, Victor E; Pardoll, Drew; Housseau, Franck; Smith, Kellie N

2018-06-12

Mutation-associated neoantigens (MANAs) are a target of antitumor T-cell immunity. Sensitive, simple, and standardized assays are needed to assess the repertoire of functional MANA-specific T cells in oncology. Assays analyzing in vitro cytokine production such as ELISpot and intracellular cytokine staining (ICS) have been useful but have limited sensitivity in assessing tumor-specific T-cell responses and do not analyze antigen-specific T-cell repertoires. The FEST (Functional Expansion of Specific T cells) assay described herein integrates TCR sequencing of short-term, peptide-stimulated cultures with a bioinformatic platform to identify antigen-specific clonotypic amplifications. This assay can be adapted for all types of antigens, including mutation associated neoantigens (MANAs) via tumor exome-guided prediction of MANAs. Following in vitro identification by the MANAFEST assay, the MANA-specific CDR3 sequence can be used as a molecular barcode to detect and monitor the dynamics of these clonotypes in blood, tumor, and normal tissue of patients receiving immunotherapy. MANAFEST is compatible with high-throughput routine clinical and lab practices. Copyright ©2018, American Association for Cancer Research.
High-throughput sequencing of fecal DNA to identify insects consumed by wild Weddell's saddleback tamarins (Saguinus weddelli, Cebidae, Primates) in Bolivia.

PubMed

Mallott, E K; Malhi, R S; Garber, P A

2015-03-01

The genus Saguinus represents a successful radiation of over 20 species of small-bodied New World monkeys. Studies of the tamarin diet indicate that insects and small vertebrates account for ∼16-45% of total feeding and foraging time, and represent an important source of lipids, protein, and metabolizable energy. Although tamarins are reported to commonly consume large-bodied insects such as grasshoppers and walking sticks (Orthoptera), little is known concerning the degree to which smaller or less easily identifiable arthropod prey comprises an important component of their diet. To better understand tamarin arthropod feeding behavior, fecal samples from 20 wild Bolivian saddleback tamarins (members of five groups) were collected over a 3 week period in June 2012, and analyzed for the presence of arthropod DNA. DNA was extracted using a Qiagen stool extraction kit, and universal insect primers were created and used to amplify a ∼280 bp section of the COI mitochondrial gene. Amplicons were sequenced on the Roche 454 sequencing platform using high-throughput sequencing techniques. An analysis of these samples indicated the presence of 43 taxa of arthropods including 10 orders, 15 families, and 12 identified genera. Many of these taxa had not been previously identified in the tamarin diet. These results highlight molecular analysis of fecal DNA as an important research tool for identifying anthropod feeding patterns in primates, and reveal broad diversity in the taxa, foraging microhabitats, and size of arthropods consumed by tamarin monkeys. © 2014 Wiley Periodicals, Inc.
Discovery of viruses and virus-like pathogens in pistachio using high-throughput sequencing

USDA-ARS?s Scientific Manuscript database

Pistachio (Pistacia vera L.) trees from the National Clonal Germplasm Repository (NCGR) and orchards in California were surveyed for viruses and virus-like agents by high-throughput sequencing (HTS). Analyses of 60 trees including clonal UCB-1 hybrid rootstock (P. atlantica × P. integerrima) identif...
Integrated automation for continuous high-throughput synthetic chromosome assembly and transformation to identify improved yeast strains for industrial production of peptide sweetener brazzein

USDA-ARS?s Scientific Manuscript database

Production and recycling of recombinant sweetener peptides in industrial biorefineries involves the evaluation of large numbers of genes and proteins. High-throughput integrated robotic molecular biology platforms that have the capacity to rapidly synthesize, clone, and express heterologous gene ope...
Multitrait, random regression, or simple repeatability model in high-throughput phenotyping data improve genomic prediction for wheat grain yield

USDA-ARS?s Scientific Manuscript database

High-throughput phenotyping (HTP) platforms can be used to measure traits that are genetically correlated with wheat (Triticum aestivum L.) grain yield across time. Incorporating such secondary traits in the multivariate pedigree and genomic prediction models would be desirable to improve indirect s...

MicroRNA from Moringa oleifera: Identification by High Throughput Sequencing and Their Potential Contribution to Plant Medicinal Value.

PubMed

Pirrò, Stefano; Zanella, Letizia; Kenzo, Maurice; Montesano, Carla; Minutolo, Antonella; Potestà, Marina; Sobze, Martin Sanou; Canini, Antonella; Cirilli, Marco; Muleo, Rosario; Colizzi, Vittorio; Galgani, Andrea

2016-01-01

Moringa oleifera is a widespread plant with substantial nutritional and medicinal value. We postulated that microRNAs (miRNAs), which are endogenous, noncoding small RNAs regulating gene expression at the post-transcriptional level, might contribute to the medicinal properties of plants of this species after ingestion into human body, regulating human gene expression. However, the knowledge is scarce about miRNA in Moringa. Furthermore, in order to test the hypothesis on the pharmacological potential properties of miRNA, we conducted a high-throughput sequencing analysis using the Illumina platform. A total of 31,290,964 raw reads were produced from a library of small RNA isolated from M. oleifera seeds. We identified 94 conserved and two novel miRNAs that were validated by qRT-PCR assays. Results from qRT-PCR trials conducted on the expression of 20 Moringa miRNA showed that are conserved across multiple plant species as determined by their detection in tissue of other common crop plants. In silico analyses predicted target genes for the conserved miRNA that in turn allowed to relate the miRNAs to the regulation of physiological processes. Some of the predicted plant miRNAs have functional homology to their mammalian counterparts and regulated human genes when they were transfected into cell lines. To our knowledge, this is the first report of discovering M. oleifera miRNAs based on high-throughput sequencing and bioinformatics analysis and we provided new insight into a potential cross-species control of human gene expression. The widespread cultivation and consumption of M. oleifera, for nutritional and medicinal purposes, brings humans into close contact with products and extracts of this plant species. The potential for miRNA transfer should be evaluated as one possible mechanism of action to account for beneficial properties of this valuable species.
Complete chloroplast and ribosomal sequences for 30 accessions elucidate evolution of Oryza AA genome species

PubMed Central

Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Yu, Yeisoo; Yang, Kiwoung; Choi, Beom-Soon; Koh, Hee-Jong; Waminal, Nomar Espinosa; Choi, Hong-Il; Kim, Nam-Hoon; Jang, Woojong; Park, Hyun-Seung; Lee, Jonghoon; Lee, Hyun Oh; Joh, Ho Jun; Lee, Hyeon Ju; Park, Jee Young; Perumal, Sampath; Jayakodi, Murukarthick; Lee, Yun Sun; Kim, Backki; Copetti, Dario; Kim, Soonok; Kim, Sunggil; Lim, Ki-Byung; Kim, Young-Dong; Lee, Jungho; Cho, Kwang-Su; Park, Beom-Seok; Wing, Rod A.; Yang, Tae-Jin

2015-01-01

Cytoplasmic chloroplast (cp) genomes and nuclear ribosomal DNA (nR) are the primary sequences used to understand plant diversity and evolution. We introduce a high-throughput method to simultaneously obtain complete cp and nR sequences using Illumina platform whole-genome sequence. We applied the method to 30 rice specimens belonging to nine Oryza species. Concurrent phylogenomic analysis using cp and nR of several of specimens of the same Oryza AA genome species provides insight into the evolution and domestication of cultivated rice, clarifying three ambiguous but important issues in the evolution of wild Oryza species. First, cp-based trees clearly classify each lineage but can be biased by inter-subspecies cross-hybridization events during speciation. Second, O. glumaepatula, a South American wild rice, includes two cytoplasm types, one of which is derived from a recent interspecies hybridization with O. longistminata. Third, the Australian O. rufipogan-type rice is a perennial form of O. meridionalis. PMID:26506948
CNV-seq, a new method to detect copy number variation using high-throughput sequencing.

PubMed

Xie, Chao; Tammi, Martti T

2009-03-06

DNA copy number variation (CNV) has been recognized as an important source of genetic variation. Array comparative genomic hybridization (aCGH) is commonly used for CNV detection, but the microarray platform has a number of inherent limitations. Here, we describe a method to detect copy number variation using shotgun sequencing, CNV-seq. The method is based on a robust statistical model that describes the complete analysis procedure and allows the computation of essential confidence values for detection of CNV. Our results show that the number of reads, not the length of the reads is the key factor determining the resolution of detection. This favors the next-generation sequencing methods that rapidly produce large amount of short reads. Simulation of various sequencing methods with coverage between 0.1x to 8x show overall specificity between 91.7 - 99.9%, and sensitivity between 72.2 - 96.5%. We also show the results for assessment of CNV between two individual human genomes.
Investigation of Human Cancers for Retrovirus by Low-Stringency Target Enrichment and High-Throughput Sequencing.

PubMed

Vinner, Lasse; Mourier, Tobias; Friis-Nielsen, Jens; Gniadecki, Robert; Dybkaer, Karen; Rosenberg, Jacob; Langhoff, Jill Levin; Cruz, David Flores Santa; Fonager, Jannik; Izarzugaza, Jose M G; Gupta, Ramneek; Sicheritz-Ponten, Thomas; Brunak, Søren; Willerslev, Eske; Nielsen, Lars Peter; Hansen, Anders Johannes

2015-08-19

Although nearly one fifth of all human cancers have an infectious aetiology, the causes for the majority of cancers remain unexplained. Despite the enormous data output from high-throughput shotgun sequencing, viral DNA in a clinical sample typically constitutes a proportion of host DNA that is too small to be detected. Sequence variation among virus genomes complicates application of sequence-specific, and highly sensitive, PCR methods. Therefore, we aimed to develop and characterize a method that permits sensitive detection of sequences despite considerable variation. We demonstrate that our low-stringency in-solution hybridization method enables detection of <100 viral copies. Furthermore, distantly related proviral sequences may be enriched by orders of magnitude, enabling discovery of hitherto unknown viral sequences by high-throughput sequencing. The sensitivity was sufficient to detect retroviral sequences in clinical samples. We used this method to conduct an investigation for novel retrovirus in samples from three cancer types. In accordance with recent studies our investigation revealed no retroviral infections in human B-cell lymphoma cells, cutaneous T-cell lymphoma or colorectal cancer biopsies. Nonetheless, our generally applicable method makes sensitive detection possible and permits sequencing of distantly related sequences from complex material.
High-throughput microsatellite genotyping in ecology: improved accuracy, efficiency, standardization and success with low-quantity and degraded DNA.

PubMed

De Barba, M; Miquel, C; Lobréaux, S; Quenette, P Y; Swenson, J E; Taberlet, P

2017-05-01

Microsatellite markers have played a major role in ecological, evolutionary and conservation research during the past 20 years. However, technical constrains related to the use of capillary electrophoresis and a recent technological revolution that has impacted other marker types have brought to question the continued use of microsatellites for certain applications. We present a study for improving microsatellite genotyping in ecology using high-throughput sequencing (HTS). This approach entails selection of short markers suitable for HTS, sequencing PCR-amplified microsatellites on an Illumina platform and bioinformatic treatment of the sequence data to obtain multilocus genotypes. It takes advantage of the fact that HTS gives direct access to microsatellite sequences, allowing unambiguous allele identification and enabling automation of the genotyping process through bioinformatics. In addition, the massive parallel sequencing abilities expand the information content of single experimental runs far beyond capillary electrophoresis. We illustrated the method by genotyping brown bear samples amplified with a multiplex PCR of 13 new microsatellite markers and a sex marker. HTS of microsatellites provided accurate individual identification and parentage assignment and resulted in a significant improvement of genotyping success (84%) of faecal degraded DNA and costs reduction compared to capillary electrophoresis. The HTS approach holds vast potential for improving success, accuracy, efficiency and standardization of microsatellite genotyping in ecological and conservation applications, especially those that rely on profiling of low-quantity/quality DNA and on the construction of genetic databases. We discuss and give perspectives for the implementation of the method in the light of the challenges encountered in wildlife studies. © 2016 John Wiley & Sons Ltd.
[Review of Second Generation Sequencing and Its Application in Forensic Genetics].

PubMed

Zhang, S H; Bian, Y N; Zhao, Q; Li, C T

2016-08-01

The rapid development of second generation sequencing （SGS） within the past few years has led to the increasement of data throughput and read length while at the same time brought down substantially the sequencing cost. This made new breakthrough in the area of biology and ushered the forensic genetics into a new era. Based on the history of sequencing application in forensic genetics, this paper reviews the importance of sequencing technologies for genetic marker detection. The application status and potential of SGS in forensic genetics are discussed based on the already explored SGS platforms of Roche, Illumina and Life Technologies. With these platforms, DNA markers （SNP, STR）, RNA markers （mRNA, microRNA） and whole mtDNA can be sequenced. However, development and validation of application kits, maturation of analysis software, connection to the existing databases and the possible ethical issues occurred with big data will be the key factors that determine whether this technology can substitute or supplement PCR-CE, the mature technology, and be widely used for cases detection. Copyright© by the Editorial Department of Journal of Forensic Medicine.
Analyzing Immunoglobulin Repertoires

PubMed Central

Chaudhary, Neha; Wesemann, Duane R.

2018-01-01

Somatic assembly of T cell receptor and B cell receptor (BCR) genes produces a vast diversity of lymphocyte antigen recognition capacity. The advent of efficient high-throughput sequencing of lymphocyte antigen receptor genes has recently generated unprecedented opportunities for exploration of adaptive immune responses. With these opportunities have come significant challenges in understanding the analysis techniques that most accurately reflect underlying biological phenomena. In this regard, sample preparation and sequence analysis techniques, which have largely been borrowed and adapted from other fields, continue to evolve. Here, we review current methods and challenges of library preparation, sequencing and statistical analysis of lymphocyte receptor repertoire studies. We discuss the general steps in the process of immune repertoire generation including sample preparation, platforms available for sequencing, processing of sequencing data, measurable features of the immune repertoire, and the statistical tools that can be used for analysis and interpretation of the data. Because BCR analysis harbors additional complexities, such as immunoglobulin (Ig) (i.e., antibody) gene somatic hypermutation and class switch recombination, the emphasis of this review is on Ig/BCR sequence analysis. PMID:29593723
Evaluation of a PCR/ESI-MS platform to identify respiratory viruses from nasopharyngeal aspirates.

PubMed

Lin, Yong; Fu, Yongfeng; Xu, Menghua; Su, Liyun; Cao, Lingfeng; Xu, Jin; Cheng, Xunjia

2015-11-01

Acute respiratory tract infection is a major cause of morbidity and mortality worldwide, particularly in infants and young children. High-throughput, accurate, broad-range tools for etiologic diagnosis are critical for effective epidemic control. In this study, the diagnostic capacities of an Ibis platform based on the PCR/ESI-MS assay were evaluated using clinical samples. Nasopharyngeal aspirates (NPAs) were collected from 120 children (<5 years old) who were hospitalized with lower respiratory tract infections between November 2010 and October 2011. The respiratory virus detection assay was performed using the PCR/ESI-MS assay and the DFA. The discordant PCR/ESI-MS and DFA results were resolved with RT-PCR plus sequencing. The overall agreement for PCR/ESI-MS and DFA was 98.3% (118/120). Compared with the results from DFA, the sensitivity and specificity of the PCR/ESI-MS assay were 100% and 97.5%, respectively. The PCR/ESI-MS assay also detected more multiple virus infections and revealed more detailed subtype information than DFA. Among the 12 original specimens with discordant results between PCR/ESI-MS and DFA, 11 had confirmed PCR/ESI-MS results. Thus, the PCR/ESI-MS assay is a high-throughput, sensitive, specific and promising method to detect and subtype conventional viruses in respiratory tract infections and allows rapid identification of mixed pathogens. © 2015 Wiley Periodicals, Inc.
Modeling congenital disease and inborn errors of development in Drosophila melanogaster

PubMed Central

Moulton, Matthew J.; Letsou, Anthea

2016-01-01

ABSTRACT Fly models that faithfully recapitulate various aspects of human disease and human health-related biology are being used for research into disease diagnosis and prevention. Established and new genetic strategies in Drosophila have yielded numerous substantial successes in modeling congenital disorders or inborn errors of human development, as well as neurodegenerative disease and cancer. Moreover, although our ability to generate sequence datasets continues to outpace our ability to analyze these datasets, the development of high-throughput analysis platforms in Drosophila has provided access through the bottleneck in the identification of disease gene candidates. In this Review, we describe both the traditional and newer methods that are facilitating the incorporation of Drosophila into the human disease discovery process, with a focus on the models that have enhanced our understanding of human developmental disorders and congenital disease. Enviable features of the Drosophila experimental system, which make it particularly useful in facilitating the much anticipated move from genotype to phenotype (understanding and predicting phenotypes directly from the primary DNA sequence), include its genetic tractability, the low cost for high-throughput discovery, and a genome and underlying biology that are highly evolutionarily conserved. In embracing the fly in the human disease-gene discovery process, we can expect to speed up and reduce the cost of this process, allowing experimental scales that are not feasible and/or would be too costly in higher eukaryotes. PMID:26935104
Research Associate | Center for Cancer Research

Cancer.gov

The Basic Science Program (BSP) at the Frederick National Laboratory for Cancer Research (FNLCR) pursues independent, multidisciplinary research programs in basic or applied molecular biology, immunology, retrovirology, cancer biology or human genetics. As part of the BSP, the Microbiome and Genetics Core (the Core) characterizes microbiomes by next-generation sequencing to determine their composition and variation, as influenced by immune, genetic, and host health factors. The Core provides support across a spectrum of processes, from nucleic acid isolation through bioinformatics and statistical analysis. KEY ROLES/RESPONSIBILITIES The Research Associate II will provide support in the areas of automated isolation, preparation, PCR and sequencing of DNA on next generation platforms (Illumina MiSeq and NextSeq). An opportunity exists to join the Core’s team of highly trained experimentalists and bioinformaticians working to characterize microbiome samples. The following represent requirements of the position: A minimum of five (5) years related of biomedical experience. Experience with high-throughput nucleic acid (DNA/RNA) extraction. Experience in performing PCR amplification (including quantitative real-time PCR). Experience or familiarity with robotic liquid handling protocols (especially on the Eppendorf epMotion 5073 or 5075 platforms). Experience in operating and maintaining benchtop Illumina sequencers (MiSeq and NextSeq). Ability to evaluate experimental quality and to troubleshoot molecular biology protocols. Experience with sample tracking, inventory management and biobanking. Ability to operate and communicate effectively in a team-oriented work environment.
An Efficient Method for Electroporation of Small Interfering RNAs into ENCODE Project Tier 1 GM12878 and K562 Cell Lines.

PubMed

Muller, Ryan Y; Hammond, Ming C; Rio, Donald C; Lee, Yeon J

2015-12-01

The Encyclopedia of DNA Elements (ENCODE) Project aims to identify all functional sequence elements in the human genome sequence by use of high-throughput DNA/cDNA sequencing approaches. To aid the standardization, comparison, and integration of data sets produced from different technologies and platforms, the ENCODE Consortium selected several standard human cell lines to be used by the ENCODE Projects. The Tier 1 ENCODE cell lines include GM12878, K562, and H1 human embryonic stem cell lines. GM12878 is a lymphoblastoid cell line, transformed with the Epstein-Barr virus, that was selected by the International HapMap Project for whole genome and transcriptome sequencing by use of the Illumina platform. K562 is an immortalized myelogenous leukemia cell line. The GM12878 cell line is attractive for the ENCODE Projects, as it offers potential synergy with the International HapMap Project. Despite the vast amount of sequencing data available on the GM12878 cell line through the ENCODE Project, including transcriptome, chromatin immunoprecipitation-sequencing for histone marks, and transcription factors, no small interfering siRNA-mediated knockdown studies have been performed in the GM12878 cell line, as cationic lipid-mediated transfection methods are inefficient for lymphoid cell lines. Here, we present an efficient and reproducible method for transfection of a variety of siRNAs into the GM12878 and K562 cell lines, which subsequently results in targeted protein depletion.
Microscale screening systems for 3D cellular microenvironments: platforms, advances, and challenges

PubMed Central

Montanez-Sauri, Sara I.; Beebe, David J.; Sung, Kyung Eun

2015-01-01

The increasing interest in studying cells using more in vivo-like three-dimensional (3D) microenvironments has created a need for advanced 3D screening platforms with enhanced functionalities and increased throughput. 3D screening platforms that better mimic in vivo microenvironments with enhanced throughput would provide more in-depth understanding of the complexity and heterogeneity of microenvironments. The platforms would also better predict the toxicity and efficacy of potential drugs in physiologically relevant conditions. Traditional 3D culture models (e.g. spinner flasks, gyratory rotation devices, non-adhesive surfaces, polymers) were developed to create 3D multicellular structures. However, these traditional systems require large volumes of reagents and cells, and are not compatible with high throughput screening (HTS) systems. Microscale technology offers the miniaturization of 3D cultures and allows efficient screening of various conditions. This review will discuss the development, most influential works, and current advantages and challenges of microscale culture systems for screening cells in 3D microenvironments. PMID:25274061
Comprehensive analysis of RNA-protein interactions by high-throughput sequencing-RNA affinity profiling.

PubMed

Tome, Jacob M; Ozer, Abdullah; Pagano, John M; Gheba, Dan; Schroth, Gary P; Lis, John T

2014-06-01

RNA-protein interactions play critical roles in gene regulation, but methods to quantitatively analyze these interactions at a large scale are lacking. We have developed a high-throughput sequencing-RNA affinity profiling (HiTS-RAP) assay by adapting a high-throughput DNA sequencer to quantify the binding of fluorescently labeled protein to millions of RNAs anchored to sequenced cDNA templates. Using HiTS-RAP, we measured the affinity of mutagenized libraries of GFP-binding and NELF-E-binding aptamers to their respective targets and identified critical regions of interaction. Mutations additively affected the affinity of the NELF-E-binding aptamer, whose interaction depended mainly on a single-stranded RNA motif, but not that of the GFP aptamer, whose interaction depended primarily on secondary structure.
Comparison of the gut microbiota composition between wild and captive sika deer (Cervus nippon hortulorum) from feces by high-throughput sequencing.

PubMed

Guan, Yu; Yang, Haitao; Han, Siyu; Feng, Limin; Wang, Tianming; Ge, Jianping

2017-11-23

The gut microbiota is characterized as a complex ecosystem that has effects on health and diseases of host with the interactions of many other factors together. Sika deer is the national level for the protection of wild animals in China. The available sequencing data of gut microbiota from feces of wild sika deer, especially for Cervus nippon hortulorum in Northeast China, are limited. Here, we characterized the gastrointestinal bacterial communities of wild (7 samples) and captive (12 samples) sika deer from feces, and compared their gut microbiota by analyzing the V3-V4 region of 16S rRNA gene using high-throughput sequencing technology on the Illumina Hiseq platform. Firmicutes (77.624%), Bacteroidetes (18.288%) and Tenericutes (1.342%) were the most predominant phyla in wild sika deer. While in captive sika deer, Firmicutes (50.710%) was the dominant phylum, followed by Bacteroidetes (31.996%) and Proteobacteria (4.806%). A total of 9 major phyla, 22 families and 30 genera among gastrointestinal bacterial communities showed significant differences between wild and captive sika deer. The specific function and mechanism of Tenericutes in wild sika deer need further study. Our results indicated that captive sika deer in farm had higher fecal bacterial diversity than the wild. Abundance and quantity of diet source for sika deer played crucial role in shaping the composition and structure of gut microbiota.
High-Throughput Genotyping of Single Nucleotide Polymorphisms in the Plasmodium falciparum dhfr Gene by Asymmetric PCR and Melt-Curve Analysis▿

PubMed Central

Cruz, Rochelle E.; Shokoples, Sandra E.; Manage, Dammika P.; Yanow, Stephanie K.

2010-01-01

Mutations within the Plasmodium falciparum dihydrofolate reductase gene (Pfdhfr) contribute to resistance to antimalarials such as sulfadoxine-pyrimethamine (SP). Of particular importance are the single nucleotide polymorphisms (SNPs) within codons 51, 59, 108, and 164 in the Pfdhfr gene that are associated with SP treatment failure. Given that traditional genotyping methods are time-consuming and laborious, we developed an assay that provides the rapid, high-throughput analysis of parasite DNA isolated from clinical samples. This assay is based on asymmetric real-time PCR and melt-curve analysis (MCA) performed on the LightCycler platform. Unlabeled probes specific to each SNP are included in the reaction mixture and hybridize differentially to the mutant and wild-type sequences within the amplicon, generating distinct melting curves. Since the probe is present throughout PCR and MCA, the assay proceeds seamlessly with no further addition of reagents. This assay was validated for analytical sensitivity and specificity using plasmids, purified genomic DNA from reference strains, and parasite cultures. For all four SNPs, correct genotypes were identified with 100 copies of the template. The performance of the assay was evaluated with a blind panel of clinical isolates from travelers with low-level parasitemia. The concordance between our assay and DNA sequencing ranged from 84 to 100% depending on the SNP. We also directly compared our MCA assay to a published TaqMan real-time PCR assay and identified major issues with the specificity of the TaqMan probes. Our assay provides a number of technical improvements that facilitate the high-throughput screening of patient samples to identify SP-resistant malaria. PMID:20631115
Genome sequencing in microfabricated high-density picolitre reactors.

PubMed

Margulies, Marcel; Egholm, Michael; Altman, William E; Attiya, Said; Bader, Joel S; Bemben, Lisa A; Berka, Jan; Braverman, Michael S; Chen, Yi-Ju; Chen, Zhoutao; Dewell, Scott B; Du, Lei; Fierro, Joseph M; Gomes, Xavier V; Godwin, Brian C; He, Wen; Helgesen, Scott; Ho, Chun Heen; Ho, Chun He; Irzyk, Gerard P; Jando, Szilveszter C; Alenquer, Maria L I; Jarvie, Thomas P; Jirage, Kshama B; Kim, Jong-Bum; Knight, James R; Lanza, Janna R; Leamon, John H; Lefkowitz, Steven M; Lei, Ming; Li, Jing; Lohman, Kenton L; Lu, Hong; Makhijani, Vinod B; McDade, Keith E; McKenna, Michael P; Myers, Eugene W; Nickerson, Elizabeth; Nobile, John R; Plant, Ramona; Puc, Bernard P; Ronan, Michael T; Roth, George T; Sarkis, Gary J; Simons, Jan Fredrik; Simpson, John W; Srinivasan, Maithreyan; Tartaro, Karrie R; Tomasz, Alexander; Vogt, Kari A; Volkmer, Greg A; Wang, Shally H; Wang, Yong; Weiner, Michael P; Yu, Pengguang; Begley, Richard F; Rothberg, Jonathan M

2005-09-15

The proliferation of large-scale DNA-sequencing projects in recent years has driven a search for alternative methods to reduce time and cost. Here we describe a scalable, highly parallel sequencing system with raw throughput significantly greater than that of state-of-the-art capillary electrophoresis instruments. The apparatus uses a novel fibre-optic slide of individual wells and is able to sequence 25 million bases, at 99% or better accuracy, in one four-hour run. To achieve an approximately 100-fold increase in throughput over current Sanger sequencing technology, we have developed an emulsion method for DNA amplification and an instrument for sequencing by synthesis using a pyrosequencing protocol optimized for solid support and picolitre-scale volumes. Here we show the utility, throughput, accuracy and robustness of this system by shotgun sequencing and de novo assembly of the Mycoplasma genitalium genome with 96% coverage at 99.96% accuracy in one run of the machine.
Quartz-Seq2: a high-throughput single-cell RNA-sequencing method that effectively uses limited sequence reads.

PubMed

Sasagawa, Yohei; Danno, Hiroki; Takada, Hitomi; Ebisawa, Masashi; Tanaka, Kaori; Hayashi, Tetsutaro; Kurisaki, Akira; Nikaido, Itoshi

2018-03-09

High-throughput single-cell RNA-seq methods assign limited unique molecular identifier (UMI) counts as gene expression values to single cells from shallow sequence reads and detect limited gene counts. We thus developed a high-throughput single-cell RNA-seq method, Quartz-Seq2, to overcome these issues. Our improvements in the reaction steps make it possible to effectively convert initial reads to UMI counts, at a rate of 30-50%, and detect more genes. To demonstrate the power of Quartz-Seq2, we analyzed approximately 10,000 transcriptomes from in vitro embryonic stem cells and an in vivo stromal vascular fraction with a limited number of reads.
E2FM: an encrypted and compressed full-text index for collections of genomic sequences.

PubMed

Montecuollo, Ferdinando; Schmid, Giovannni; Tagliaferri, Roberto

2017-09-15

Next Generation Sequencing (NGS) platforms and, more generally, high-throughput technologies are giving rise to an exponential growth in the size of nucleotide sequence databases. Moreover, many emerging applications of nucleotide datasets-as those related to personalized medicine-require the compliance with regulations about the storage and processing of sensitive data. We have designed and carefully engineered E 2 FM -index, a new full-text index in minute space which was optimized for compressing and encrypting nucleotide sequence collections in FASTA format and for performing fast pattern-search queries. E 2 FM -index allows to build self-indexes which occupy till to 1/20 of the storage required by the input FASTA file, thus permitting to save about 95% of storage when indexing collections of highly similar sequences; moreover, it can exactly search the built indexes for patterns in times ranging from few milliseconds to a few hundreds milliseconds, depending on pattern length. Source code is available at https://github.com/montecuollo/E2FM . ferdinando.montecuollo@unicampania.it. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Generation of genetically modified mice using CRISPR/Cas9 and haploid embryonic stem cell systems

PubMed Central

JIN, Li-Fang; LI, Jin-Song

2016-01-01

With the development of high-throughput sequencing technology in the post-genomic era, researchers have concentrated their efforts on elucidating the relationships between genes and their corresponding functions. Recently, important progress has been achieved in the generation of genetically modified mice based on CRISPR/Cas9 and haploid embryonic stem cell (haESC) approaches, which provide new platforms for gene function analysis, human disease modeling, and gene therapy. Here, we review the CRISPR/Cas9 and haESC technology for the generation of genetically modified mice and discuss the key challenges in the application of these approaches. PMID:27469251
Pediatric Glioblastoma Therapies Based on Patient-Derived Stem Cell Resources

DTIC Science & Technology

2014-11-01

genomic DNA and then subjected to Illumina high-throughput sequencing . In this analysis, shRNAs lost in the GSC population represent candidate gene...and genomic DNA and then subjected to Illumina high-throughput sequencing . In this analysis, shRNAs lost in the GSC population represent candidate...PRISM 7900 Sequence Detection System ( Genomics Resource, FHCRC). Relative transcript abundance was analyzed using the 2−ΔΔCt method. TRIzol (Invitrogen

A novel hanging spherical drop system for the generation of cellular spheroids and high throughput combinatorial drug screening.

PubMed

Neto, A I; Correia, C R; Oliveira, M B; Rial-Hermida, M I; Alvarez-Lorenzo, C; Reis, R L; Mano, J F

2015-04-01

We propose a novel hanging spherical drop system for anchoring arrays of droplets of cell suspension based on the use of biomimetic superhydrophobic flat substrates, with controlled positional adhesion and minimum contact with a solid substrate. By facing down the platform, it was possible to generate independent spheroid bodies in a high throughput manner, in order to mimic in vivo tumour models on the lab-on-chip scale. To validate this system for drug screening purposes, the toxicity of the anti-cancer drug doxorubicin in cell spheroids was tested and compared to cells in 2D culture. The advantages presented by this platform, such as feasibility of the system and the ability to control the size uniformity of the spheroid, emphasize its potential to be used as a new low cost toolbox for high-throughput drug screening and in cell or tissue engineering.
Single-Molecule Flow Platform for the Quantification of Biomolecules Attached to Single Nanoparticles.

PubMed

Jung, Seung-Ryoung; Han, Rui; Sun, Wei; Jiang, Yifei; Fujimoto, Bryant S; Yu, Jiangbo; Kuo, Chun-Ting; Rong, Yu; Zhou, Xing-Hua; Chiu, Daniel T

2018-05-15

We describe here a flow platform for quantifying the number of biomolecules on individual fluorescent nanoparticles. The platform combines line-confocal fluorescence detection with near nanoscale channels (1-2 μm in width and height) to achieve high single-molecule detection sensitivity and throughput. The number of biomolecules present on each nanoparticle was determined by deconvolving the fluorescence intensity distribution of single-nanoparticle-biomolecule complexes with the intensity distribution of single biomolecules. We demonstrate this approach by quantifying the number of streptavidins on individual semiconducting polymer dots (Pdots); streptavidin was rendered fluorescent using biotin-Alexa647. This flow platform has high-throughput (hundreds to thousands of nanoparticles detected per second) and requires minute amounts of sample (∼5 μL at a dilute concentration of 10 pM). This measurement method is an additional tool for characterizing synthetic or biological nanoparticles.
Next generation platforms for high-throughput biodosimetry.

PubMed

Repin, Mikhail; Turner, Helen C; Garty, Guy; Brenner, David J

2014-06-01

Here the general concept of the combined use of plates and tubes in racks compatible with the American National Standards Institute/the Society for Laboratory Automation and Screening microplate formats as the next generation platforms for increasing the throughput of biodosimetry assays was described. These platforms can be used at different stages of biodosimetry assays starting from blood collection into microtubes organised in standardised racks and ending with the cytogenetic analysis of samples in standardised multiwell and multichannel plates. Robotically friendly platforms can be used for different biodosimetry assays in minimally equipped laboratories and on cost-effective automated universal biotech systems. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Xi-cam: a versatile interface for data visualization and analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pandolfi, Ronald J.; Allan, Daniel B.; Arenholz, Elke

Xi-cam is an extensible platform for data management, analysis and visualization.Xi-camaims to provide a flexible and extensible approach to synchrotron data treatment as a solution to rising demands for high-volume/high-throughput processing pipelines. The core ofXi-camis an extensible plugin-based graphical user interface platform which provides users with an interactive interface to processing algorithms. Plugins are available for SAXS/WAXS/GISAXS/GIWAXS, tomography and NEXAFS data. WithXi-cam's `advanced' mode, data processing steps are designed as a graph-based workflow, which can be executed live, locally or remotely. Remote execution utilizes high-performance computing or de-localized resources, allowing for the effective reduction of high-throughput data.Xi-cam's plugin-based architecture targetsmore » cross-facility and cross-technique collaborative development, in support of multi-modal analysis.Xi-camis open-source and cross-platform, and available for download on GitHub.« less
Xi-cam: a versatile interface for data visualization and analysis

DOE PAGES

Pandolfi, Ronald J.; Allan, Daniel B.; Arenholz, Elke; ...

2018-05-31

Xi-cam is an extensible platform for data management, analysis and visualization.Xi-camaims to provide a flexible and extensible approach to synchrotron data treatment as a solution to rising demands for high-volume/high-throughput processing pipelines. The core ofXi-camis an extensible plugin-based graphical user interface platform which provides users with an interactive interface to processing algorithms. Plugins are available for SAXS/WAXS/GISAXS/GIWAXS, tomography and NEXAFS data. WithXi-cam's `advanced' mode, data processing steps are designed as a graph-based workflow, which can be executed live, locally or remotely. Remote execution utilizes high-performance computing or de-localized resources, allowing for the effective reduction of high-throughput data.Xi-cam's plugin-based architecture targetsmore » cross-facility and cross-technique collaborative development, in support of multi-modal analysis.Xi-camis open-source and cross-platform, and available for download on GitHub.« less
Abnormal plasma DNA profiles in early ovarian cancer using a non-invasive prenatal testing platform: implications for cancer screening.

PubMed

Cohen, Paul A; Flowers, Nicola; Tong, Stephen; Hannan, Natalie; Pertile, Mark D; Hui, Lisa

2016-08-24

Non-invasive prenatal testing (NIPT) identifies fetal aneuploidy by sequencing cell-free DNA in the maternal plasma. Pre-symptomatic maternal malignancies have been incidentally detected during NIPT based on abnormal genomic profiles. This low coverage sequencing approach could have potential for ovarian cancer screening in the non-pregnant population. Our objective was to investigate whether plasma DNA sequencing with a clinical whole genome NIPT platform can detect early- and late-stage high-grade serous ovarian carcinomas (HGSOC). This is a case control study of prospectively-collected biobank samples comprising preoperative plasma from 32 women with HGSOC (16 'early cancer' (FIGO I-II) and 16 'advanced cancer' (FIGO III-IV)) and 32 benign controls. Plasma DNA from cases and controls were sequenced using a commercial NIPT platform and chromosome dosage measured. Sequencing data were blindly analyzed with two methods: (1) Subchromosomal changes were called using an open source algorithm WISECONDOR (WIthin-SamplE COpy Number aberration DetectOR). Genomic gains or losses ≥ 15 Mb were prespecified as "screen positive" calls, and mapped to recurrent copy number variations reported in an ovarian cancer genome atlas. (2) Selected whole chromosome gains or losses were reported using the routine NIPT pipeline for fetal aneuploidy. We detected 13/32 cancer cases using the subchromosomal analysis (sensitivity 40.6 %, 95 % CI, 23.7-59.4 %), including 6/16 early and 7/16 advanced HGSOC cases. Two of 32 benign controls had subchromosomal gains ≥ 15 Mb (specificity 93.8 %, 95 % CI, 79.2-99.2 %). Twelve of the 13 true positive cancer cases exhibited specific recurrent changes reported in HGSOC tumors. The NIPT pipeline resulted in one "monosomy 18" call from the cancer group, and two "monosomy X" calls in the controls. Low coverage plasma DNA sequencing used for prenatal testing detected 40.6 % of all HGSOC, including 38 % of early stage cases. Our findings demonstrate the potential of a high throughput sequencing platform to screen for early HGSOC in plasma based on characteristic multiple segmental chromosome gains and losses. The performance of this approach may be further improved by refining bioinformatics algorithms and targeting selected cancer copy number variations.
iCopyDAV: Integrated platform for copy number variations—Detection, annotation and visualization

PubMed Central

Vogeti, Sriharsha

2018-01-01

Discovery of copy number variations (CNVs), a major category of structural variations, have dramatically changed our understanding of differences between individuals and provide an alternate paradigm for the genetic basis of human diseases. CNVs include both copy gain and copy loss events and their detection genome-wide is now possible using high-throughput, low-cost next generation sequencing (NGS) methods. However, accurate detection of CNVs from NGS data is not straightforward due to non-uniform coverage of reads resulting from various systemic biases. We have developed an integrated platform, iCopyDAV, to handle some of these issues in CNV detection in whole genome NGS data. It has a modular framework comprising five major modules: data pre-treatment, segmentation, variant calling, annotation and visualization. An important feature of iCopyDAV is the functional annotation module that enables the user to identify and prioritize CNVs encompassing various functional elements, genomic features and disease-associations. Parallelization of the segmentation algorithms makes the iCopyDAV platform even accessible on a desktop. Here we show the effect of sequencing coverage, read length, bin size, data pre-treatment and segmentation approaches on accurate detection of the complete spectrum of CNVs. Performance of iCopyDAV is evaluated on both simulated data and real data for different sequencing depths. It is an open-source integrated pipeline available at https://github.com/vogetihrsh/icopydav and as Docker’s image at http://bioinf.iiit.ac.in/icopydav/. PMID:29621297
Characterization of the indigenous microflora in raw and pasteurized buffalo milk during storage at refrigeration temperature by high-throughput sequencing

USDA-ARS?s Scientific Manuscript database

The effect of refrigeration on bacterial communities within raw and pasteurized buffalo milk was studied using high-throughput sequencing. High quality samples of raw buffalo milk were obtained from five dairy farms in the Guangxi province of China. A sample of each milk was pasteurized, and both r...
ChIP-chip versus ChIP-seq: Lessons for experimental design and data analysis

PubMed Central

2011-01-01

Background Chromatin immunoprecipitation (ChIP) followed by microarray hybridization (ChIP-chip) or high-throughput sequencing (ChIP-seq) allows genome-wide discovery of protein-DNA interactions such as transcription factor bindings and histone modifications. Previous reports only compared a small number of profiles, and little has been done to compare histone modification profiles generated by the two technologies or to assess the impact of input DNA libraries in ChIP-seq analysis. Here, we performed a systematic analysis of a modENCODE dataset consisting of 31 pairs of ChIP-chip/ChIP-seq profiles of the coactivator CBP, RNA polymerase II (RNA PolII), and six histone modifications across four developmental stages of Drosophila melanogaster. Results Both technologies produce highly reproducible profiles within each platform, ChIP-seq generally produces profiles with a better signal-to-noise ratio, and allows detection of more peaks and narrower peaks. The set of peaks identified by the two technologies can be significantly different, but the extent to which they differ varies depending on the factor and the analysis algorithm. Importantly, we found that there is a significant variation among multiple sequencing profiles of input DNA libraries and that this variation most likely arises from both differences in experimental condition and sequencing depth. We further show that using an inappropriate input DNA profile can impact the average signal profiles around genomic features and peak calling results, highlighting the importance of having high quality input DNA data for normalization in ChIP-seq analysis. Conclusions Our findings highlight the biases present in each of the platforms, show the variability that can arise from both technology and analysis methods, and emphasize the importance of obtaining high quality and deeply sequenced input DNA libraries for ChIP-seq analysis. PMID:21356108
Bioinformatics and Microarray Data Analysis on the Cloud.

PubMed

Calabrese, Barbara; Cannataro, Mario

2016-01-01

High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, on-demand anytime and anywhere access to resources and applications, and thus, it may represent the key technology for facing those issues. In fact, in the recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in the industry. Although this, cloud computing presents several issues regarding the security and privacy of data, that are particularly important when analyzing patients data, such as in personalized medicine. This chapter reviews main academic and industrial cloud-based bioinformatics solutions; with a special focus on microarray data analysis solutions and underlines main issues and problems related to the use of such platforms for the storage and analysis of patients data.
High-Throughput Quantitative Lipidomics Analysis of Nonesterified Fatty Acids in Human Plasma.

PubMed

Christinat, Nicolas; Morin-Rivron, Delphine; Masoodi, Mojgan

2016-07-01

We present a high-throughput, nontargeted lipidomics approach using liquid chromatography coupled to high-resolution mass spectrometry for quantitative analysis of nonesterified fatty acids. We applied this method to screen a wide range of fatty acids from medium-chain to very long-chain (8 to 24 carbon atoms) in human plasma samples. The method enables us to chromatographically separate branched-chain species from their straight-chain isomers as well as separate biologically important ω-3 and ω-6 polyunsaturated fatty acids. We used 51 fatty acid species to demonstrate the quantitative capability of this method with quantification limits in the nanomolar range; however, this method is not limited only to these fatty acid species. High-throughput sample preparation was developed and carried out on a robotic platform that allows extraction of 96 samples simultaneously within 3 h. This high-throughput platform was used to assess the influence of different types of human plasma collection and preparation on the nonesterified fatty acid profile of healthy donors. Use of the anticoagulants EDTA and heparin has been compared with simple clotting, and only limited changes have been detected in most nonesterified fatty acid concentrations.
An Automated Method for High-Throughput Screening of Arabidopsis Rosette Growth in Multi-Well Plates and Its Validation in Stress Conditions.

PubMed

De Diego, Nuria; Fürst, Tomáš; Humplík, Jan F; Ugena, Lydia; Podlešáková, Kateřina; Spíchal, Lukáš

2017-01-01

High-throughput plant phenotyping platforms provide new possibilities for automated, fast scoring of several plant growth and development traits, followed over time using non-invasive sensors. Using Arabidops is as a model offers important advantages for high-throughput screening with the opportunity to extrapolate the results obtained to other crops of commercial interest. In this study we describe the development of a highly reproducible high-throughput Arabidopsis in vitro bioassay established using our OloPhen platform, suitable for analysis of rosette growth in multi-well plates. This method was successfully validated on example of multivariate analysis of Arabidopsis rosette growth in different salt concentrations and the interaction with varying nutritional composition of the growth medium. Several traits such as changes in the rosette area, relative growth rate, survival rate and homogeneity of the population are scored using fully automated RGB imaging and subsequent image analysis. The assay can be used for fast screening of the biological activity of chemical libraries, phenotypes of transgenic or recombinant inbred lines, or to search for potential quantitative trait loci. It is especially valuable for selecting genotypes or growth conditions that improve plant stress tolerance.
TotalReCaller: improved accuracy and performance via integrated alignment and base-calling.

PubMed

Menges, Fabian; Narzisi, Giuseppe; Mishra, Bud

2011-09-01

Currently, re-sequencing approaches use multiple modules serially to interpret raw sequencing data from next-generation sequencing platforms, while remaining oblivious to the genomic information until the final alignment step. Such approaches fail to exploit the full information from both raw sequencing data and the reference genome that can yield better quality sequence reads, SNP-calls, variant detection, as well as an alignment at the best possible location in the reference genome. Thus, there is a need for novel reference-guided bioinformatics algorithms for interpreting analog signals representing sequences of the bases ({A, C, G, T}), while simultaneously aligning possible sequence reads to a source reference genome whenever available. Here, we propose a new base-calling algorithm, TotalReCaller, to achieve improved performance. A linear error model for the raw intensity data and Burrows-Wheeler transform (BWT) based alignment are combined utilizing a Bayesian score function, which is then globally optimized over all possible genomic locations using an efficient branch-and-bound approach. The algorithm has been implemented in soft- and hardware [field-programmable gate array (FPGA)] to achieve real-time performance. Empirical results on real high-throughput Illumina data were used to evaluate TotalReCaller's performance relative to its peers-Bustard, BayesCall, Ibis and Rolexa-based on several criteria, particularly those important in clinical and scientific applications. Namely, it was evaluated for (i) its base-calling speed and throughput, (ii) its read accuracy and (iii) its specificity and sensitivity in variant calling. A software implementation of TotalReCaller as well as additional information, is available at: http://bioinformatics.nyu.edu/wordpress/projects/totalrecaller/ fabian.menges@nyu.edu.
High-throughput cultivation and screening platform for unicellular phototrophs.

PubMed

Tillich, Ulrich M; Wolter, Nick; Schulze, Katja; Kramer, Dan; Brödel, Oliver; Frohme, Marcus

2014-09-16

High-throughput cultivation and screening methods allow a parallel, miniaturized and cost efficient processing of many samples. These methods however, have not been generally established for phototrophic organisms such as microalgae or cyanobacteria. In this work we describe and test high-throughput methods with the model organism Synechocystis sp. PCC6803. The required technical automation for these processes was achieved with a Tecan Freedom Evo 200 pipetting robot. The cultivation was performed in 2.2 ml deepwell microtiter plates within a cultivation chamber outfitted with programmable shaking conditions, variable illumination, variable temperature, and an adjustable CO2 atmosphere. Each microtiter-well within the chamber functions as a separate cultivation vessel with reproducible conditions. The automated measurement of various parameters such as growth, full absorption spectrum, chlorophyll concentration, MALDI-TOF-MS, as well as a novel vitality measurement protocol, have already been established and can be monitored during cultivation. Measurement of growth parameters can be used as inputs for the system to allow for periodic automatic dilutions and therefore a semi-continuous cultivation of hundreds of cultures in parallel. The system also allows the automatic generation of mid and long term backups of cultures to repeat experiments or to retrieve strains of interest. The presented platform allows for high-throughput cultivation and screening of Synechocystis sp. PCC6803. The platform should be usable for many phototrophic microorganisms as is, and be adaptable for even more. A variety of analyses are already established and the platform is easily expandable both in quality, i.e. with further parameters to screen for additional targets and in quantity, i.e. size or number of processed samples.
Why barcode? High-throughput multiplex sequencing of mitochondrial genomes for molecular systematics.

PubMed

Timmermans, M J T N; Dodsworth, S; Culverwell, C L; Bocak, L; Ahrens, D; Littlewood, D T J; Pons, J; Vogler, A P

2010-11-01

Mitochondrial genome sequences are important markers for phylogenetics but taxon sampling remains sporadic because of the great effort and cost required to acquire full-length sequences. Here, we demonstrate a simple, cost-effective way to sequence the full complement of protein coding mitochondrial genes from pooled samples using the 454/Roche platform. Multiplexing was achieved without the need for expensive indexing tags ('barcodes'). The method was trialled with a set of long-range polymerase chain reaction (PCR) fragments from 30 species of Coleoptera (beetles) sequenced in a 1/16th sector of a sequencing plate. Long contigs were produced from the pooled sequences with sequencing depths ranging from ∼10 to 100× per contig. Species identity of individual contigs was established via three 'bait' sequences matching disparate parts of the mitochondrial genome obtained by conventional PCR and Sanger sequencing. This proved that assembly of contigs from the sequencing pool was correct. Our study produced sequences for 21 nearly complete and seven partial sets of protein coding mitochondrial genes. Combined with existing sequences for 25 taxa, an improved estimate of basal relationships in Coleoptera was obtained. The procedure could be employed routinely for mitochondrial genome sequencing at the species level, to provide improved species 'barcodes' that currently use the cox1 gene only.
Microfluidics for genome-wide studies involving next generation sequencing

PubMed Central

Murphy, Travis W.; Lu, Chang

2017-01-01

Next-generation sequencing (NGS) has revolutionized how molecular biology studies are conducted. Its decreasing cost and increasing throughput permit profiling of genomic, transcriptomic, and epigenomic features for a wide range of applications. Microfluidics has been proven to be highly complementary to NGS technology with its unique capabilities for handling small volumes of samples and providing platforms for automation, integration, and multiplexing. In this article, we review recent progress on applying microfluidics to facilitate genome-wide studies. We emphasize on several technical aspects of NGS and how they benefit from coupling with microfluidic technology. We also summarize recent efforts on developing microfluidic technology for genomic, transcriptomic, and epigenomic studies, with emphasis on single cell analysis. We envision rapid growth in these directions, driven by the needs for testing scarce primary cell samples from patients in the context of precision medicine. PMID:28396707
Illumina Synthetic Long Read Sequencing Allows Recovery of Missing Sequences even in the “Finished” C. elegans Genome

PubMed Central

Li, Runsheng; Hsieh, Chia-Ling; Young, Amanda; Zhang, Zhihong; Ren, Xiaoliang; Zhao, Zhongying

2015-01-01

Most next-generation sequencing platforms permit acquisition of high-throughput DNA sequences, but the relatively short read length limits their use in genome assembly or finishing. Illumina has recently released a technology called Synthetic Long-Read Sequencing that can produce reads of unusual length, i.e., predominately around 10 Kb. However, a systematic assessment of their use in genome finishing and assembly is still lacking. We evaluate the promise and deficiency of the long reads in these aspects using isogenic C. elegans genome with no gap. First, the reads are highly accurate and capable of recovering most types of repetitive sequences. However, the presence of tandem repetitive sequences prevents pre-assembly of long reads in the relevant genomic region. Second, the reads are able to reliably detect missing but not extra sequences in the C. elegans genome. Third, the reads of smaller size are more capable of recovering repetitive sequences than those of bigger size. Fourth, at least 40 Kbp missing genomic sequences are recovered in the C. elegans genome using the long reads. Finally, an N50 contig size of at least 86 Kbp can be achieved with 24×reads but with substantial mis-assembly errors, highlighting a need for novel assembly algorithm for the long reads. PMID:26039588
A Microfluidic Platform for High-Throughput Multiplexed Protein Quantitation

PubMed Central

Volpetti, Francesca; Garcia-Cordero, Jose; Maerkl, Sebastian J.

2015-01-01

We present a high-throughput microfluidic platform capable of quantitating up to 384 biomarkers in 4 distinct samples by immunoassay. The microfluidic device contains 384 unit cells, which can be individually programmed with pairs of capture and detection antibody. Samples are quantitated in each unit cell by four independent MITOMI detection areas, allowing four samples to be analyzed in parallel for a total of 1,536 assays per device. We show that the device can be pre-assembled and stored for weeks at elevated temperature and we performed proof-of-concept experiments simultaneously quantitating IL-6, IL-1β, TNF-α, PSA, and GFP. Finally, we show that the platform can be used to identify functional antibody combinations by screening 64 antibody combinations requiring up to 384 unique assays per device. PMID:25680117
Portable and Error-Free DNA-Based Data Storage.

PubMed

Yazdi, S M Hossein Tabatabaei; Gabrys, Ryan; Milenkovic, Olgica

2017-07-10

DNA-based data storage is an emerging nonvolatile memory technology of potentially unprecedented density, durability, and replication efficiency. The basic system implementation steps include synthesizing DNA strings that contain user information and subsequently retrieving them via high-throughput sequencing technologies. Existing architectures enable reading and writing but do not offer random-access and error-free data recovery from low-cost, portable devices, which is crucial for making the storage technology competitive with classical recorders. Here we show for the first time that a portable, random-access platform may be implemented in practice using nanopore sequencers. The novelty of our approach is to design an integrated processing pipeline that encodes data to avoid costly synthesis and sequencing errors, enables random access through addressing, and leverages efficient portable sequencing via new iterative alignment and deletion error-correcting codes. Our work represents the only known random access DNA-based data storage system that uses error-prone nanopore sequencers, while still producing error-free readouts with the highest reported information rate/density. As such, it represents a crucial step towards practical employment of DNA molecules as storage media.
Real-time detection of BRAF V600E mutation from archival hairy cell leukemia FFPE tissue by nanopore sequencing.

PubMed

Vacca, Davide; Cancila, Valeria; Gulino, Alessandro; Lo Bosco, Giosuè; Belmonte, Beatrice; Di Napoli, Arianna; Florena, Ada Maria; Tripodo, Claudio; Arancio, Walter

2018-02-01

The MinION is a miniaturized high-throughput next generation sequencing platform of novel conception. The use of nucleic acids derived from formalin-fixed paraffin-embedded samples is highly desirable, but their adoption for molecular assays is hurdled by the high degree of fragmentation and by the chemical-induced mutations stemming from the fixation protocols. In order to investigate the suitability of MinION sequencing on formalin-fixed paraffin-embedded samples, the presence and frequency of BRAF c.1799T > A mutation was investigated in two archival tissue specimens of Hairy cell leukemia and Hairy cell leukemia Variant. Despite the poor quality of the starting DNA, BRAF mutation was successfully detected in the Hairy cell leukemia sample with around 50% of the reads obtained within 2 h of the sequencing start. Notably, the mutational burden of the Hairy cell leukemia sample as derived from nanopore sequencing proved to be comparable to a sensitive method for the detection of point mutations, namely the Digital PCR, using a validated assay. Nanopore sequencing can be adopted for targeted sequencing of genetic lesions on critical DNA samples such as those extracted from archival routine formalin-fixed paraffin-embedded samples. This result let speculating about the possibility that the nanopore sequencing could be trustably adopted for the real-time targeted sequencing of genetic lesions. Our report opens the window for the adoption of nanopore sequencing in molecular pathology for research and diagnostics.

An integrated pipeline for next generation sequencing and annotation of the complete mitochondrial genome of the giant intestinal fluke, Fasciolopsis buski (Lankester, 1857) Looss, 1899

PubMed Central

Biswal, Devendra Kumar; Ghatani, Sudeep; Shylla, Jollin A.; Sahu, Ranjana; Mullapudi, Nandita

2013-01-01

Helminths include both parasitic nematodes (roundworms) and platyhelminths (trematode and cestode flatworms) that are abundant, and are of clinical importance. The genetic characterization of parasitic flatworms using advanced molecular tools is central to the diagnosis and control of infections. Although the nuclear genome houses suitable genetic markers (e.g., in ribosomal (r) DNA) for species identification and molecular characterization, the mitochondrial (mt) genome consistently provides a rich source of novel markers for informative systematics and epidemiological studies. In the last decade, there have been some important advances in mtDNA genomics of helminths, especially lung flukes, liver flukes and intestinal flukes. Fasciolopsis buski, often called the giant intestinal fluke, is one of the largest digenean trematodes infecting humans and found primarily in Asia, in particular the Indian subcontinent. Next-generation sequencing (NGS) technologies now provide opportunities for high throughput sequencing, assembly and annotation within a short span of time. Herein, we describe a high-throughput sequencing and bioinformatics pipeline for mt genomics for F. buski that emphasizes the utility of short read NGS platforms such as Ion Torrent and Illumina in successfully sequencing and assembling the mt genome using innovative approaches for PCR primer design as well as assembly. We took advantage of our NGS whole genome sequence data (unpublished so far) for F. buski and its comparison with available data for the Fasciola hepatica mtDNA as the reference genome for design of precise and specific primers for amplification of mt genome sequences from F. buski. A long-range PCR was carried out to create an NGS library enriched in mt DNA sequences. Two different NGS platforms were employed for complete sequencing, assembly and annotation of the F. buski mt genome. The complete mt genome sequences of the intestinal fluke comprise 14,118 bp and is thus the shortest trematode mitochondrial genome sequenced to date. The noncoding control regions are separated into two parts by the tRNA-Gly gene and don’t contain either tandem repeats or secondary structures, which are typical for trematode control regions. The gene content and arrangement are identical to that of F. hepatica. The F. buski mtDNA genome has a close resemblance with F. hepatica and has a similar gene order tallying with that of other trematodes. The mtDNA for the intestinal fluke is reported herein for the first time by our group that would help investigate Fasciolidae taxonomy and systematics with the aid of mtDNA NGS data. More so, it would serve as a resource for comparative mitochondrial genomics and systematic studies of trematode parasites. PMID:24255820
Target enrichment and high-throughput sequencing of 80 ribosomal protein genes to identify mutations associated with Diamond-Blackfan anaemia.

PubMed

Gerrard, Gareth; Valgañón, Mikel; Foong, Hui En; Kasperaviciute, Dalia; Iskander, Deena; Game, Laurence; Müller, Michael; Aitman, Timothy J; Roberts, Irene; de la Fuente, Josu; Foroni, Letizia; Karadimitris, Anastasios

2013-08-01

Diamond-Blackfan anaemia (DBA) is caused by inactivating mutations in ribosomal protein (RP) genes, with mutations in 13 of the 80 RP genes accounting for 50-60% of cases. The remaining 40-50% cases may harbour mutations in one of the remaining RP genes, but the very low frequencies render conventional genetic screening as challenging. We, therefore, applied custom enrichment technology combined with high-throughput sequencing to screen all 80 RP genes. Using this approach, we identified and validated inactivating mutations in 15/17 (88%) DBA patients. Target enrichment combined with high-throughput sequencing is a robust and improved methodology for the genetic diagnosis of DBA. © 2013 John Wiley & Sons Ltd.
Overcoming bias and systematic errors in next generation sequencing data.

PubMed

Taub, Margaret A; Corrada Bravo, Hector; Irizarry, Rafael A

2010-12-10

Considerable time and effort has been spent in developing analysis and quality assessment methods to allow the use of microarrays in a clinical setting. As is the case for microarrays and other high-throughput technologies, data from new high-throughput sequencing technologies are subject to technological and biological biases and systematic errors that can impact downstream analyses. Only when these issues can be readily identified and reliably adjusted for will clinical applications of these new technologies be feasible. Although much work remains to be done in this area, we describe consistently observed biases that should be taken into account when analyzing high-throughput sequencing data. In this article, we review current knowledge about these biases, discuss their impact on analysis results, and propose solutions.
Validation of Biomarker Proteins Using Reverse Capture Protein Microarrays.

PubMed

Jozwik, Catherine; Eidelman, Ofer; Starr, Joshua; Pollard, Harvey B; Srivastava, Meera

2017-01-01

Genomics has revolutionized large-scale and high-throughput sequencing and has led to the discovery of thousands of new proteins. Protein chip technology is emerging as a miniaturized and highly parallel platform that is suited to rapid, simultaneous screening of large numbers of proteins and the analysis of various protein-binding activities, enzyme substrate relationships, and posttranslational modifications. Specifically, reverse capture protein microarrays provide the most appropriate platform for identifying low-abundance, disease-specific biomarker proteins in a sea of high-abundance proteins from biological fluids such as blood, serum, plasma, saliva, urine, and cerebrospinal fluid as well as tissues and cells obtained by biopsy. Samples from hundreds of patients can be spotted in serial dilutions on many replicate glass slides. Each slide can then be probed with one specific antibody to the biomarker of interest. That antibody's titer can then be determined quantitatively for each patient, allowing for the statistical assessment and validation of the diagnostic or prognostic utility of that particular antigen. As the technology matures and the availability of validated, platform-compatible antibodies increases, the platform will move further into the desirable realm of discovery science for detecting and quantitating low-abundance signaling proteins. In this chapter, we describe methods for the successful application of the reverse capture protein microarray platform for which we have made substantial contributions to the development and application of this method, particularly in the use of body fluids other than serum/plasma.
Molecular beacons for DNA binding proteins: an emerging technology for detection of DNA binding proteins and their ligands.

PubMed

Dummitt, Benjamin; Chang, Yie-Hwa

2006-06-01

Quantitation of the level or activity of specific proteins is one of the most commonly performed experiments in biomedical research. Protein detection has historically been difficult to adapt to high throughput platforms because of heavy reliance upon antibodies for protein detection. Molecular beacons for DNA binding proteins is a recently developed technology that attempts to overcome such limitations. Protein detection is accomplished using inexpensive, easy-to-synthesize oligonucleotides, accompanied by a fluorescence readout. Importantly, detection of the protein and reporting of the signal occur simultaneously, allowing for one-step protocols and increased potential for use in high throughput analysis. While the initial iteration of the technology allowed only for the detection of sequence-specific DNA binding proteins, more recent adaptations allow for the possibility of development of beacons for any protein, independent of native DNA binding activity. Here, we discuss the development of the technology, the mechanism of the reaction, and recent improvements and modifications made to improve the assay in terms of sensitivity, potential for multiplexing, and broad applicability.
TERRA REF: Advancing phenomics with high resolution, open access sensor and genomics data

NASA Astrophysics Data System (ADS)

LeBauer, D.; Kooper, R.; Burnette, M.; Willis, C.

2017-12-01

Automated plant measurement has the potential to improve understanding of genetic and environmental controls on plant traits (phenotypes). The application of sensors and software in the automation of high throughput phenotyping reflects a fundamental shift from labor intensive hand measurements to drone, tractor, and robot mounted sensing platforms. These tools are expected to speed the rate of crop improvement by enabling plant breeders to more accurately select plants with improved yields, resource use efficiency, and stress tolerance. However, there are many challenges facing high throughput phenomics: sensors and platforms are expensive, currently there are few standard methods of data collection and storage, and the analysis of large data sets requires high performance computers and automated, reproducible computing pipelines. To overcome these obstacles and advance the science of high throughput phenomics, the TERRA Phenotyping Reference Platform (TERRA-REF) team is developing an open-access database of high resolution sensor data. TERRA REF is an integrated field and greenhouse phenotyping system that includes: a reference field scanner with fifteen sensors that can generate terrabytes of data each day at mm resolution; UAV, tractor, and fixed field sensing platforms; and an automated controlled-environment scanner. These platforms will enable investigation of diverse sensing modalities, and the investigation of traits under controlled and field environments. It is the goal of TERRA REF to lower the barrier to entry for academic and industry researchers by providing high-resolution data, open source software, and online computing resources. Our project is unique in that all data will be made fully public in November 2018, and is already available to early adopters through the beta-user program. We will describe the datasets and how to use them as well as the databases and computing pipeline and how these can be reused and remixed in other phenomics pipelines. Finally, we will describe the National Data Service workbench, a cloud computing platform that can access the petabyte scale data while supporting reproducible research.
A web platform for the network analysis of high-throughput data in melanoma and its use to investigate mechanisms of resistance to anti-PD1 immunotherapy.

PubMed

Dreyer, Florian S; Cantone, Martina; Eberhardt, Martin; Jaitly, Tanushree; Walter, Lisa; Wittmann, Jürgen; Gupta, Shailendra K; Khan, Faiz M; Wolkenhauer, Olaf; Pützer, Brigitte M; Jäck, Hans-Martin; Heinzerling, Lucie; Vera, Julio

2018-06-01

Cellular phenotypes are established and controlled by complex and precisely orchestrated molecular networks. In cancer, mutations and dysregulations of multiple molecular factors perturb the regulation of these networks and lead to malignant transformation. High-throughput technologies are a valuable source of information to establish the complex molecular relationships behind the emergence of malignancy, but full exploitation of this massive amount of data requires bioinformatics tools that rely on network-based analyses. In this report we present the Virtual Melanoma Cell, an online tool developed to facilitate the mining and interpretation of high-throughput data on melanoma by biomedical researches. The platform is based on a comprehensive, manually generated and expert-validated regulatory map composed of signaling pathways important in malignant melanoma. The Virtual Melanoma Cell is a tool designed to accept, visualize and analyze user-generated datasets. It is available at: https://www.vcells.net/melanoma. To illustrate the utilization of the web platform and the regulatory map, we have analyzed a large publicly available dataset accounting for anti-PD1 immunotherapy treatment of malignant melanoma patients. Copyright © 2018 Elsevier B.V. All rights reserved.
Diversity amongst trigeminal neurons revealed by high throughput single cell sequencing

PubMed Central

Nguyen, Minh Q.; Wu, Youmei; Bonilla, Lauren S.; von Buchholtz, Lars J.

2017-01-01

The trigeminal ganglion contains somatosensory neurons that detect a range of thermal, mechanical and chemical cues and innervate unique sensory compartments in the head and neck including the eyes, nose, mouth, meninges and vibrissae. We used single-cell sequencing and in situ hybridization to examine the cellular diversity of the trigeminal ganglion in mice, defining thirteen clusters of neurons. We show that clusters are well conserved in dorsal root ganglia suggesting they represent distinct functional classes of somatosensory neurons and not specialization associated with their sensory targets. Notably, functionally important genes (e.g. the mechanosensory channel Piezo2 and the capsaicin gated ion channel Trpv1) segregate into multiple clusters and often are expressed in subsets of cells within a cluster. Therefore, the 13 genetically-defined classes are likely to be physiologically heterogeneous rather than highly parallel (i.e., redundant) lines of sensory input. Our analysis harnesses the power of single-cell sequencing to provide a unique platform for in silico expression profiling that complements other approaches linking gene-expression with function and exposes unexpected diversity in the somatosensory system. PMID:28957441
Molecular characterization of a novel rhabdovirus infecting blackcurrant identified by high-throughput sequencing.

PubMed

Wu, L-P; Yang, T; Liu, H-W; Postman, J; Li, R

2018-05-01

A large contig with sequence similarities to several nucleorhabdoviruses was identified by high-throughput sequencing analysis from a black currant (Ribes nigrum L.) cultivar. The complete genome sequence of this new nucleorhabdovirus is 14,432 nucleotides long. Its genomic organization is very similar to those of unsegmented plant rhabdoviruses, containing six open reading frames in the order 3'-N-P-P3-M-G-L-5. The virus, which is provisionally named "black currant-associated rhabdovirus", is 41-52% identical in its genome nucleotide sequence to other nucleorhabdoviruses and may represent a new species in the genus Nucleorhabdovirus.
Quantitative Assessment of RNA-Protein Interactions with High Throughput Sequencing - RNA Affinity Profiling (HiTS-RAP)

PubMed Central

Ozer, Abdullah; Tome, Jacob M.; Friedman, Robin C.; Gheba, Dan; Schroth, Gary P.; Lis, John T.

2016-01-01

Because RNA-protein interactions play a central role in a wide-array of biological processes, methods that enable a quantitative assessment of these interactions in a high-throughput manner are in great demand. Recently, we developed the High Throughput Sequencing-RNA Affinity Profiling (HiTS-RAP) assay, which couples sequencing on an Illumina GAIIx with the quantitative assessment of one or several proteins’ interactions with millions of different RNAs in a single experiment. We have successfully used HiTS-RAP to analyze interactions of EGFP and NELF-E proteins with their corresponding canonical and mutant RNA aptamers. Here, we provide a detailed protocol for HiTS-RAP, which can be completed in about a month (8 days hands-on time) including the preparation and testing of recombinant proteins and DNA templates, clustering DNA templates on a flowcell, high-throughput sequencing and protein binding with GAIIx, and finally data analysis. We also highlight aspects of HiTS-RAP that can be further improved and points of comparison between HiTS-RAP and two other recently developed methods, RNA-MaP and RBNS. A successful HiTS-RAP experiment provides the sequence and binding curves for approximately 200 million RNAs in a single experiment. PMID:26182240
A high-throughput multiplex method adapted for GMO detection.

PubMed

Chaouachi, Maher; Chupeau, Gaëlle; Berard, Aurélie; McKhann, Heather; Romaniuk, Marcel; Giancola, Sandra; Laval, Valérie; Bertheau, Yves; Brunel, Dominique

2008-12-24

A high-throughput multiplex assay for the detection of genetically modified organisms (GMO) was developed on the basis of the existing SNPlex method designed for SNP genotyping. This SNPlex assay allows the simultaneous detection of up to 48 short DNA sequences (approximately 70 bp; "signature sequences") from taxa endogenous reference genes, from GMO constructions, screening targets, construct-specific, and event-specific targets, and finally from donor organisms. This assay avoids certain shortcomings of multiplex PCR-based methods already in widespread use for GMO detection. The assay demonstrated high specificity and sensitivity. The results suggest that this assay is reliable, flexible, and cost- and time-effective for high-throughput GMO detection.
SMARTIV: combined sequence and structure de-novo motif discovery for in-vivo RNA binding data.

PubMed

Polishchuk, Maya; Paz, Inbal; Yakhini, Zohar; Mandel-Gutfreund, Yael

2018-05-25

Gene expression regulation is highly dependent on binding of RNA-binding proteins (RBPs) to their RNA targets. Growing evidence supports the notion that both RNA primary sequence and its local secondary structure play a role in specific Protein-RNA recognition and binding. Despite the great advance in high-throughput experimental methods for identifying sequence targets of RBPs, predicting the specific sequence and structure binding preferences of RBPs remains a major challenge. We present a novel webserver, SMARTIV, designed for discovering and visualizing combined RNA sequence and structure motifs from high-throughput RNA-binding data, generated from in-vivo experiments. The uniqueness of SMARTIV is that it predicts motifs from enriched k-mers that combine information from ranked RNA sequences and their predicted secondary structure, obtained using various folding methods. Consequently, SMARTIV generates Position Weight Matrices (PWMs) in a combined sequence and structure alphabet with assigned P-values. SMARTIV concisely represents the sequence and structure motif content as a single graphical logo, which is informative and easy for visual perception. SMARTIV was examined extensively on a variety of high-throughput binding experiments for RBPs from different families, generated from different technologies, showing consistent and accurate results. Finally, SMARTIV is a user-friendly webserver, highly efficient in run-time and freely accessible via http://smartiv.technion.ac.il/.
Size Matters: Assessing Optimum Soil Sample Size for Fungal and Bacterial Community Structure Analyses Using High Throughput Sequencing of rRNA Gene Amplicons

PubMed Central

Penton, C. Ryan; Gupta, Vadakattu V. S. R.; Yu, Julian; Tiedje, James M.

2016-01-01

We examined the effect of different soil sample sizes obtained from an agricultural field, under a single cropping system uniform in soil properties and aboveground crop responses, on bacterial and fungal community structure and microbial diversity indices. DNA extracted from soil sample sizes of 0.25, 1, 5, and 10 g using MoBIO kits and from 10 and 100 g sizes using a bead-beating method (SARDI) were used as templates for high-throughput sequencing of 16S and 28S rRNA gene amplicons for bacteria and fungi, respectively, on the Illumina MiSeq and Roche 454 platforms. Sample size significantly affected overall bacterial and fungal community structure, replicate dispersion and the number of operational taxonomic units (OTUs) retrieved. Richness, evenness and diversity were also significantly affected. The largest diversity estimates were always associated with the 10 g MoBIO extractions with a corresponding reduction in replicate dispersion. For the fungal data, smaller MoBIO extractions identified more unclassified Eukaryota incertae sedis and unclassified glomeromycota while the SARDI method retrieved more abundant OTUs containing unclassified Pleosporales and the fungal genera Alternaria and Cercophora. Overall, these findings indicate that a 10 g soil DNA extraction is most suitable for both soil bacterial and fungal communities for retrieving optimal diversity while still capturing rarer taxa in concert with decreasing replicate variation. PMID:27313569
A quantitative assessment of the Hadoop framework for analyzing massively parallel DNA sequencing data.

PubMed

Siretskiy, Alexey; Sundqvist, Tore; Voznesenskiy, Mikhail; Spjuth, Ola

2015-01-01

New high-throughput technologies, such as massively parallel sequencing, have transformed the life sciences into a data-intensive field. The most common e-infrastructure for analyzing this data consists of batch systems that are based on high-performance computing resources; however, the bioinformatics software that is built on this platform does not scale well in the general case. Recently, the Hadoop platform has emerged as an interesting option to address the challenges of increasingly large datasets with distributed storage, distributed processing, built-in data locality, fault tolerance, and an appealing programming methodology. In this work we introduce metrics and report on a quantitative comparison between Hadoop and a single node of conventional high-performance computing resources for the tasks of short read mapping and variant calling. We calculate efficiency as a function of data size and observe that the Hadoop platform is more efficient for biologically relevant data sizes in terms of computing hours for both split and un-split data files. We also quantify the advantages of the data locality provided by Hadoop for NGS problems, and show that a classical architecture with network-attached storage will not scale when computing resources increase in numbers. Measurements were performed using ten datasets of different sizes, up to 100 gigabases, using the pipeline implemented in Crossbow. To make a fair comparison, we implemented an improved preprocessor for Hadoop with better performance for splittable data files. For improved usability, we implemented a graphical user interface for Crossbow in a private cloud environment using the CloudGene platform. All of the code and data in this study are freely available as open source in public repositories. From our experiments we can conclude that the improved Hadoop pipeline scales better than the same pipeline on high-performance computing resources, we also conclude that Hadoop is an economically viable option for the common data sizes that are currently used in massively parallel sequencing. Given that datasets are expected to increase over time, Hadoop is a framework that we envision will have an increasingly important role in future biological data analysis.
RIKEN Integrated Sequence Analysis (RISA) System—384-Format Sequencing Pipeline with 384 Multicapillary Sequencer

PubMed Central

Shibata, Kazuhiro; Itoh, Masayoshi; Aizawa, Katsunori; Nagaoka, Sumiharu; Sasaki, Nobuya; Carninci, Piero; Konno, Hideaki; Akiyama, Junichi; Nishi, Katsuo; Kitsunai, Tokuji; Tashiro, Hideo; Itoh, Mari; Sumi, Noriko; Ishii, Yoshiyuki; Nakamura, Shin; Hazama, Makoto; Nishine, Tsutomu; Harada, Akira; Yamamoto, Rintaro; Matsumoto, Hiroyuki; Sakaguchi, Sumito; Ikegami, Takashi; Kashiwagi, Katsuya; Fujiwake, Syuji; Inoue, Kouji; Togawa, Yoshiyuki; Izawa, Masaki; Ohara, Eiji; Watahiki, Masanori; Yoneda, Yuko; Ishikawa, Tomokazu; Ozawa, Kaori; Tanaka, Takumi; Matsuura, Shuji; Kawai, Jun; Okazaki, Yasushi; Muramatsu, Masami; Inoue, Yorinao; Kira, Akira; Hayashizaki, Yoshihide

2000-01-01

The RIKEN high-throughput 384-format sequencing pipeline (RISA system) including a 384-multicapillary sequencer (the so-called RISA sequencer) was developed for the RIKEN mouse encyclopedia project. The RISA system consists of colony picking, template preparation, sequencing reaction, and the sequencing process. A novel high-throughput 384-format capillary sequencer system (RISA sequencer system) was developed for the sequencing process. This system consists of a 384-multicapillary auto sequencer (RISA sequencer), a 384-multicapillary array assembler (CAS), and a 384-multicapillary casting device. The RISA sequencer can simultaneously analyze 384 independent sequencing products. The optical system is a scanning system chosen after careful comparison with an image detection system for the simultaneous detection of the 384-capillary array. This scanning system can be used with any fluorescent-labeled sequencing reaction (chain termination reaction), including transcriptional sequencing based on RNA polymerase, which was originally developed by us, and cycle sequencing based on thermostable DNA polymerase. For long-read sequencing, 380 out of 384 sequences (99.2%) were successfully analyzed and the average read length, with more than 99% accuracy, was 654.4 bp. A single RISA sequencer can analyze 216 kb with >99% accuracy in 2.7 h (90 kb/h). For short-read sequencing to cluster the 3′ end and 5′ end sequencing by reading 350 bp, 384 samples can be analyzed in 1.5 h. We have also developed a RISA inoculator, RISA filtrator and densitometer, RISA plasmid preparator which can handle throughput of 40,000 samples in 17.5 h, and a high-throughput RISA thermal cycler which has four 384-well sites. The combination of these technologies allowed us to construct the RISA system consisting of 16 RISA sequencers, which can process 50,000 DNA samples per day. One haploid genome shotgun sequence of a higher organism, such as human, mouse, rat, domestic animals, and plants, can be revealed by seven RISA systems within one month. PMID:11076861
Increasing ecological inference from high throughput sequencing of fungi in the environment through a tagging approach

Treesearch

D. Lee Taylor; Michael G. Booth; Jack W. McFarland; Ian C. Herriott; Niall J. Lennon; Chad Nusbaum; Thomas G. Marr

2008-01-01

High throughput sequencing methods are widely used in analyses of microbial diversity but are generally applied to small numbers of samples, which precludes charaterization of patterns of microbial diversity across space and time. We have designed a primer-tagging approach that allows pooling and subsequent sorting of numerous samples, which is directed to...
Global assessment of genomic variation in cattle by genome resequencing and high-throughput genotyping

PubMed Central

2011-01-01

Background Integration of genomic variation with phenotypic information is an effective approach for uncovering genotype-phenotype associations. This requires an accurate identification of the different types of variation in individual genomes. Results We report the integration of the whole genome sequence of a single Holstein Friesian bull with data from single nucleotide polymorphism (SNP) and comparative genomic hybridization (CGH) array technologies to determine a comprehensive spectrum of genomic variation. The performance of resequencing SNP detection was assessed by combining SNPs that were identified to be either in identity by descent (IBD) or in copy number variation (CNV) with results from SNP array genotyping. Coding insertions and deletions (indels) were found to be enriched for size in multiples of 3 and were located near the N- and C-termini of proteins. For larger indels, a combination of split-read and read-pair approaches proved to be complementary in finding different signatures. CNVs were identified on the basis of the depth of sequenced reads, and by using SNP and CGH arrays. Conclusions Our results provide high resolution mapping of diverse classes of genomic variation in an individual bovine genome and demonstrate that structural variation surpasses sequence variation as the main component of genomic variability. Better accuracy of SNP detection was achieved with little loss of sensitivity when algorithms that implemented mapping quality were used. IBD regions were found to be instrumental for calculating resequencing SNP accuracy, while SNP detection within CNVs tended to be less reliable. CNV discovery was affected dramatically by platform resolution and coverage biases. The combined data for this study showed that at a moderate level of sequencing coverage, an ensemble of platforms and tools can be applied together to maximize the accurate detection of sequence and structural variants. PMID:22082336
Large-Scale Biomonitoring of Remote and Threatened Ecosystems via High-Throughput Sequencing

PubMed Central

Gibson, Joel F.; Shokralla, Shadi; Curry, Colin; Baird, Donald J.; Monk, Wendy A.; King, Ian; Hajibabaei, Mehrdad

2015-01-01

Biodiversity metrics are critical for assessment and monitoring of ecosystems threatened by anthropogenic stressors. Existing sorting and identification methods are too expensive and labour-intensive to be scaled up to meet management needs. Alternately, a high-throughput DNA sequencing approach could be used to determine biodiversity metrics from bulk environmental samples collected as part of a large-scale biomonitoring program. Here we show that both morphological and DNA sequence-based analyses are suitable for recovery of individual taxonomic richness, estimation of proportional abundance, and calculation of biodiversity metrics using a set of 24 benthic samples collected in the Peace-Athabasca Delta region of Canada. The high-throughput sequencing approach was able to recover all metrics with a higher degree of taxonomic resolution than morphological analysis. The reduced cost and increased capacity of DNA sequence-based approaches will finally allow environmental monitoring programs to operate at the geographical and temporal scale required by industrial and regulatory end-users. PMID:26488407
Characterization and complete genome sequence of a panicovirus from Bermuda grass by high-throughput sequencing.

PubMed

Tahir, Muhammad N; Lockhart, Ben; Grinstead, Samuel; Mollov, Dimitre

2017-04-01

Bermuda grass samples were examined by transmission electron microscopy and 28-30 nm spherical virus particles were observed. Total RNA from these plants was subjected to high-throughput sequencing (HTS). The nearly full genome sequence of a panicovirus was identified from one HTS scaffold. Sanger sequencing was used to confirm the HTS results and complete the genome sequence of 4404 nt. This virus was provisionally named Bermuda grass latent virus (BGLV). Its predicted open reading frames follow the typical arrangement of the genus Panicovirus. Based on sequence comparisons and phylogenetic analyses BGLV differs from other viruses and therefore taxonomically it is a new member of the genus Panicovirus, family Tombusviridae.
Extending the BEAGLE library to a multi-FPGA platform.

PubMed

Jin, Zheming; Bakos, Jason D

2013-01-19

Maximum Likelihood (ML)-based phylogenetic inference using Felsenstein's pruning algorithm is a standard method for estimating the evolutionary relationships amongst a set of species based on DNA sequence data, and is used in popular applications such as RAxML, PHYLIP, GARLI, BEAST, and MrBayes. The Phylogenetic Likelihood Function (PLF) and its associated scaling and normalization steps comprise the computational kernel for these tools. These computations are data intensive but contain fine grain parallelism that can be exploited by coprocessor architectures such as FPGAs and GPUs. A general purpose API called BEAGLE has recently been developed that includes optimized implementations of Felsenstein's pruning algorithm for various data parallel architectures. In this paper, we extend the BEAGLE API to a multiple Field Programmable Gate Array (FPGA)-based platform called the Convey HC-1. The core calculation of our implementation, which includes both the phylogenetic likelihood function (PLF) and the tree likelihood calculation, has an arithmetic intensity of 130 floating-point operations per 64 bytes of I/O, or 2.03 ops/byte. Its performance can thus be calculated as a function of the host platform's peak memory bandwidth and the implementation's memory efficiency, as 2.03 × peak bandwidth × memory efficiency. Our FPGA-based platform has a peak bandwidth of 76.8 GB/s and our implementation achieves a memory efficiency of approximately 50%, which gives an average throughput of 78 Gflops. This represents a ~40X speedup when compared with BEAGLE's CPU implementation on a dual Xeon 5520 and 3X speedup versus BEAGLE's GPU implementation on a Tesla T10 GPU for very large data sizes. The power consumption is 92 W, yielding a power efficiency of 1.7 Gflops per Watt. The use of data parallel architectures to achieve high performance for likelihood-based phylogenetic inference requires high memory bandwidth and a design methodology that emphasizes high memory efficiency. To achieve this objective, we integrated 32 pipelined processing elements (PEs) across four FPGAs. For the design of each PE, we developed a specialized synthesis tool to generate a floating-point pipeline with resource and throughput constraints to match the target platform. We have found that using low-latency floating-point operators can significantly reduce FPGA area and still meet timing requirement on the target platform. We found that this design methodology can achieve performance that exceeds that of a GPU-based coprocessor.

Combining high-throughput sequencing with fruit body surveys reveals contrasting life-history strategies in fungi

PubMed Central

Ovaskainen, Otso; Schigel, Dmitry; Ali-Kovero, Heini; Auvinen, Petri; Paulin, Lars; Nordén, Björn; Nordén, Jenni

2013-01-01

Before the recent revolution in molecular biology, field studies on fungal communities were mostly confined to fruit bodies, whereas mycelial interactions were studied in the laboratory. Here we combine high-throughput sequencing with a fruit body inventory to study simultaneously mycelial and fruit body occurrences in a community of fungi inhabiting dead wood of Norway spruce. We studied mycelial occurrence by extracting DNA from wood samples followed by 454-sequencing of the ITS1 and ITS2 regions and an automated procedure for species identification. In total, we detected 198 species as mycelia and 137 species as fruit bodies. The correlation between mycelial and fruit body occurrences was high for the majority of the species, suggesting that high-throughput sequencing can successfully characterize the dominating fungal communities, despite possible biases related to sampling, PCR, sequencing and molecular identification. We used the fruit body and molecular data to test hypothesized links between life history and population dynamic parameters. We show that the species that have on average a high mycelial abundance also have a high fruiting rate and produce large fruit bodies, leading to a positive feedback loop in their population dynamics. Earlier studies have shown that species with specialized resource requirements are rarely seen fruiting, for which reason they are often classified as red-listed. We show with the help of high-throughput sequencing that some of these species are more abundant as mycelium in wood than what could be expected from their occurrence as fruit bodies. PMID:23575372
Towards sensitive, high-throughput, biomolecular assays based on fluorescence lifetime

NASA Astrophysics Data System (ADS)

Ioanna Skilitsi, Anastasia; Turko, Timothé; Cianfarani, Damien; Barre, Sophie; Uhring, Wilfried; Hassiepen, Ulrich; Léonard, Jérémie

2017-09-01

Time-resolved fluorescence detection for robust sensing of biomolecular interactions is developed by implementing time-correlated single photon counting in high-throughput conditions. Droplet microfluidics is used as a promising platform for the very fast handling of low-volume samples. We illustrate the potential of this very sensitive and cost-effective technology in the context of an enzymatic activity assay based on fluorescently-labeled biomolecules. Fluorescence lifetime detection by time-correlated single photon counting is shown to enable reliable discrimination between positive and negative control samples at a throughput as high as several hundred samples per second.
Sources of PCR-induced distortions in high-throughput sequencing data sets

PubMed Central

Kebschull, Justus M.; Zador, Anthony M.

2015-01-01

PCR permits the exponential and sequence-specific amplification of DNA, even from minute starting quantities. PCR is a fundamental step in preparing DNA samples for high-throughput sequencing. However, there are errors associated with PCR-mediated amplification. Here we examine the effects of four important sources of error—bias, stochasticity, template switches and polymerase errors—on sequence representation in low-input next-generation sequencing libraries. We designed a pool of diverse PCR amplicons with a defined structure, and then used Illumina sequencing to search for signatures of each process. We further developed quantitative models for each process, and compared predictions of these models to our experimental data. We find that PCR stochasticity is the major force skewing sequence representation after amplification of a pool of unique DNA amplicons. Polymerase errors become very common in later cycles of PCR but have little impact on the overall sequence distribution as they are confined to small copy numbers. PCR template switches are rare and confined to low copy numbers. Our results provide a theoretical basis for removing distortions from high-throughput sequencing data. In addition, our findings on PCR stochasticity will have particular relevance to quantification of results from single cell sequencing, in which sequences are represented by only one or a few molecules. PMID:26187991
Development and preliminary evaluation of a 90 K Axiom® SNP array for the allo-octoploid cultivated strawberry Fragaria × ananassa.

PubMed

Bassil, Nahla V; Davis, Thomas M; Zhang, Hailong; Ficklin, Stephen; Mittmann, Mike; Webster, Teresa; Mahoney, Lise; Wood, David; Alperin, Elisabeth S; Rosyara, Umesh R; Koehorst-Vanc Putten, Herma; Monfort, Amparo; Sargent, Daniel J; Amaya, Iraida; Denoyes, Beatrice; Bianco, Luca; van Dijk, Thijs; Pirani, Ali; Iezzoni, Amy; Main, Dorrie; Peace, Cameron; Yang, Yilong; Whitaker, Vance; Verma, Sujeet; Bellon, Laurent; Brew, Fiona; Herrera, Raul; van de Weg, Eric

2015-03-07

A high-throughput genotyping platform is needed to enable marker-assisted breeding in the allo-octoploid cultivated strawberry Fragaria × ananassa. Short-read sequences from one diploid and 19 octoploid accessions were aligned to the diploid Fragaria vesca 'Hawaii 4' reference genome to identify single nucleotide polymorphisms (SNPs) and indels for incorporation into a 90 K Affymetrix® Axiom® array. We report the development and preliminary evaluation of this array. About 36 million sequence variants were identified in a 19 member, octoploid germplasm panel. Strategies and filtering pipelines were developed to identify and incorporate markers of several types: di-allelic SNPs (66.6%), multi-allelic SNPs (1.8%), indels (10.1%), and ploidy-reducing "haploSNPs" (11.7%). The remaining SNPs included those discovered in the diploid progenitor F. iinumae (3.9%), and speculative "codon-based" SNPs (5.9%). In genotyping 306 octoploid accessions, SNPs were assigned to six classes with Affymetrix's "SNPolisher" R package. The highest quality classes, PolyHigh Resolution (PHR), No Minor Homozygote (NMH), and Off-Target Variant (OTV) comprised 25%, 38%, and 1% of array markers, respectively. These markers were suitable for genetic studies as demonstrated in the full-sib family 'Holiday' × 'Korona' with the generation of a genetic linkage map consisting of 6,594 PHR SNPs evenly distributed across 28 chromosomes with an average density of approximately one marker per 0.5 cM, thus exceeding our goal of one marker per cM. The Affymetrix IStraw90 Axiom array is the first high-throughput genotyping platform for cultivated strawberry and is commercially available to the worldwide scientific community. The array's high success rate is likely driven by the presence of naturally occurring variation in ploidy level within the nominally octoploid genome, and by effectiveness of the employed array design and ploidy-reducing strategies. This array enables genetic analyses including generation of high-density linkage maps, identification of quantitative trait loci for economically important traits, and genome-wide association studies, thus providing a basis for marker-assisted breeding in this high value crop.
High-throughput sequencing of mGluR signaling pathway genes reveals enrichment of rare variants in autism.

PubMed

Kelleher, Raymond J; Geigenmüller, Ute; Hovhannisyan, Hayk; Trautman, Edwin; Pinard, Robert; Rathmell, Barbara; Carpenter, Randall; Margulies, David

2012-01-01

Identification of common molecular pathways affected by genetic variation in autism is important for understanding disease pathogenesis and devising effective therapies. Here, we test the hypothesis that rare genetic variation in the metabotropic glutamate-receptor (mGluR) signaling pathway contributes to autism susceptibility. Single-nucleotide variants in genes encoding components of the mGluR signaling pathway were identified by high-throughput multiplex sequencing of pooled samples from 290 non-syndromic autism cases and 300 ethnically matched controls on two independent next-generation platforms. This analysis revealed significant enrichment of rare functional variants in the mGluR pathway in autism cases. Higher burdens of rare, potentially deleterious variants were identified in autism cases for three pathway genes previously implicated in syndromic autism spectrum disorder, TSC1, TSC2, and SHANK3, suggesting that genetic variation in these genes also contributes to risk for non-syndromic autism. In addition, our analysis identified HOMER1, which encodes a postsynaptic density-localized scaffolding protein that interacts with Shank3 to regulate mGluR activity, as a novel autism-risk gene. Rare, potentially deleterious HOMER1 variants identified uniquely in the autism population affected functionally important protein regions or regulatory sequences and co-segregated closely with autism among children of affected families. We also identified rare ASD-associated coding variants predicted to have damaging effects on components of the Ras/MAPK cascade. Collectively, these findings suggest that altered signaling downstream of mGluRs contributes to the pathogenesis of non-syndromic autism.
High-Throughput Sequencing of mGluR Signaling Pathway Genes Reveals Enrichment of Rare Variants in Autism

PubMed Central

Hovhannisyan, Hayk; Trautman, Edwin; Pinard, Robert; Rathmell, Barbara; Carpenter, Randall; Margulies, David

2012-01-01

Identification of common molecular pathways affected by genetic variation in autism is important for understanding disease pathogenesis and devising effective therapies. Here, we test the hypothesis that rare genetic variation in the metabotropic glutamate-receptor (mGluR) signaling pathway contributes to autism susceptibility. Single-nucleotide variants in genes encoding components of the mGluR signaling pathway were identified by high-throughput multiplex sequencing of pooled samples from 290 non-syndromic autism cases and 300 ethnically matched controls on two independent next-generation platforms. This analysis revealed significant enrichment of rare functional variants in the mGluR pathway in autism cases. Higher burdens of rare, potentially deleterious variants were identified in autism cases for three pathway genes previously implicated in syndromic autism spectrum disorder, TSC1, TSC2, and SHANK3, suggesting that genetic variation in these genes also contributes to risk for non-syndromic autism. In addition, our analysis identified HOMER1, which encodes a postsynaptic density-localized scaffolding protein that interacts with Shank3 to regulate mGluR activity, as a novel autism-risk gene. Rare, potentially deleterious HOMER1 variants identified uniquely in the autism population affected functionally important protein regions or regulatory sequences and co-segregated closely with autism among children of affected families. We also identified rare ASD-associated coding variants predicted to have damaging effects on components of the Ras/MAPK cascade. Collectively, these findings suggest that altered signaling downstream of mGluRs contributes to the pathogenesis of non-syndromic autism. PMID:22558107
Diversity arrays technology: a generic genome profiling technology on open platforms.

PubMed

Kilian, Andrzej; Wenzl, Peter; Huttner, Eric; Carling, Jason; Xia, Ling; Blois, Hélène; Caig, Vanessa; Heller-Uszynska, Katarzyna; Jaccoud, Damian; Hopper, Colleen; Aschenbrenner-Kilian, Malgorzata; Evers, Margaret; Peng, Kaiman; Cayla, Cyril; Hok, Puthick; Uszynski, Grzegorz

2012-01-01

In the last 20 years, we have observed an exponential growth of the DNA sequence data and simular increase in the volume of DNA polymorphism data generated by numerous molecular marker technologies. Most of the investment, and therefore progress, concentrated on human genome and genomes of selected model species. Diversity Arrays Technology (DArT), developed over a decade ago, was among the first "democratizing" genotyping technologies, as its performance was primarily driven by the level of DNA sequence variation in the species rather than by the level of financial investment. DArT also proved more robust to genome size and ploidy-level differences among approximately 60 organisms for which DArT was developed to date compared to other high-throughput genotyping technologies. The success of DArT in a number of organisms, including a wide range of "orphan crops," can be attributed to the simplicity of underlying concepts: DArT combines genome complexity reduction methods enriching for genic regions with a highly parallel assay readout on a number of "open-access" microarray platforms. The quantitative nature of the assay enabled a number of applications in which allelic frequencies can be estimated from DArT arrays. A typical DArT assay tests for polymorphism tens of thousands of genomic loci with the final number of markers reported (hundreds to thousands) reflecting the level of DNA sequence variation in the tested loci. Detailed DArT methods, protocols, and a range of their application examples as well as DArT's evolution path are presented.
High-Throughput Cancer Cell Sphere Formation for 3D Cell Culture.

PubMed

Chen, Yu-Chih; Yoon, Euisik

2017-01-01

Three-dimensional (3D) cell culture is critical in studying cancer pathology and drug response. Though 3D cancer sphere culture can be performed in low-adherent dishes or well plates, the unregulated cell aggregation may skew the results. On contrary, microfluidic 3D culture can allow precise control of cell microenvironments, and provide higher throughput by orders of magnitude. In this chapter, we will look into engineering innovations in a microfluidic platform for high-throughput cancer cell sphere formation and review the implementation methods in detail.
A highly efficient, high-throughput lipidomics platform for the quantitative detection of eicosanoids in human whole blood.

PubMed

Song, Jiao; Liu, Xuejun; Wu, Jiejun; Meehan, Michael J; Blevitt, Jonathan M; Dorrestein, Pieter C; Milla, Marcos E

2013-02-15

We have developed an ultra-performance liquid chromatography-multiple reaction monitoring/mass spectrometry (UPLC-MRM/MS)-based, high-content, high-throughput platform that enables simultaneous profiling of multiple lipids produced ex vivo in human whole blood (HWB) on treatment with calcium ionophore and its modulation with pharmacological agents. HWB samples were processed in a 96-well plate format compatible with high-throughput sample processing instrumentation. We employed a scheduled MRM (sMRM) method, with a triple-quadrupole mass spectrometer coupled to a UPLC system, to measure absolute amounts of 122 distinct eicosanoids using deuterated internal standards. In a 6.5-min run, we resolved and detected with high sensitivity (lower limit of quantification in the range of 0.4-460 pg) all targeted analytes from a very small HWB sample (2.5 μl). Approximately 90% of the analytes exhibited a dynamic range exceeding 1000. We also developed a tailored software package that dramatically sped up the overall data quantification and analysis process with superior consistency and accuracy. Matrix effects from HWB and precision of the calibration curve were evaluated using this newly developed automation tool. This platform was successfully applied to the global quantification of changes on all 122 eicosanoids in HWB samples from healthy donors in response to calcium ionophore stimulation. Copyright © 2012 Elsevier Inc. All rights reserved.
First TILLING Platform in Cucurbita pepo: A New Mutant Resource for Gene Function and Crop Improvement

PubMed Central

Vicente-Dólera, Nelly; Troadec, Christelle; Moya, Manuel; del Río-Celestino, Mercedes; Pomares-Viciana, Teresa; Bendahmane, Abdelhafid; Picó, Belén; Román, Belén; Gómez, Pedro

2014-01-01

Although the availability of genetic and genomic resources for Cucurbita pepo has increased significantly, functional genomic resources are still limited for this crop. In this direction, we have developed a high throughput reverse genetic tool: the first TILLING (Targeting Induced Local Lesions IN Genomes) resource for this species. Additionally, we have used this resource to demonstrate that the previous EMS mutant population we developed has the highest mutation density compared with other cucurbits mutant populations. The overall mutation density in this first C. pepo TILLING platform was estimated to be 1/133 Kb by screening five additional genes. In total, 58 mutations confirmed by sequencing were identified in the five targeted genes, thirteen of which were predicted to have an impact on the function of the protein. The genotype/phenotype correlation was studied in a peroxidase gene, revealing that the phenotype of seedling homozygous for one of the isolated mutant alleles was albino. These results indicate that the TILLING approach in this species was successful at providing new mutations and can address the major challenge of linking sequence information to biological function and also the identification of novel variation for crop breeding. PMID:25386735
Toward a Genome-Wide Systems Biology Analysis of Host-Pathogen Interactions in Group A Streptococcus

PubMed Central

Musser, James M.; DeLeo, Frank R.

2005-01-01

Genome-wide analysis of microbial pathogens and molecular pathogenesis processes has become an area of considerable activity in the last 5 years. These studies have been made possible by several advances, including completion of the human genome sequence, publication of genome sequences for many human pathogens, development of microarray technology and high-throughput proteomics, and maturation of bioinformatics. Despite these advances, relatively little effort has been expended in the bacterial pathogenesis arena to develop and use integrated research platforms in a systems biology approach to enhance our understanding of disease processes. This review discusses progress made in exploiting an integrated genome-wide research platform to gain new knowledge about how the human bacterial pathogen group A Streptococcus causes disease. Results of these studies have provided many new avenues for basic pathogenesis research and translational research focused on development of an efficacious human vaccine and novel therapeutics. One goal in summarizing this line of study is to bring exciting new findings to the attention of the investigative pathology community. In addition, we hope the review will stimulate investigators to consider using analogous approaches for analysis of the molecular pathogenesis of other microbes. PMID:16314461
BioLayout(Java): versatile network visualisation of structural and functional relationships.

PubMed

Goldovsky, Leon; Cases, Ildefonso; Enright, Anton J; Ouzounis, Christos A

2005-01-01

Visualisation of biological networks is becoming a common task for the analysis of high-throughput data. These networks correspond to a wide variety of biological relationships, such as sequence similarity, metabolic pathways, gene regulatory cascades and protein interactions. We present a general approach for the representation and analysis of networks of variable type, size and complexity. The application is based on the original BioLayout program (C-language implementation of the Fruchterman-Rheingold layout algorithm), entirely re-written in Java to guarantee portability across platforms. BioLayout(Java) provides broader functionality, various analysis techniques, extensions for better visualisation and a new user interface. Examples of analysis of biological networks using BioLayout(Java) are presented.
Research progress of plant population genomics based on high-throughput sequencing.

PubMed

Wang, Yun-sheng

2016-08-01

Population genomics, a new paradigm for population genetics, combine the concepts and techniques of genomics with the theoretical system of population genetics and improve our understanding of microevolution through identification of site-specific effect and genome-wide effects using genome-wide polymorphic sites genotypeing. With the appearance and improvement of the next generation high-throughput sequencing technology, the numbers of plant species with complete genome sequences increased rapidly and large scale resequencing has also been carried out in recent years. Parallel sequencing has also been done in some plant species without complete genome sequences. These studies have greatly promoted the development of population genomics and deepened our understanding of the genetic diversity, level of linking disequilibium, selection effect, demographical history and molecular mechanism of complex traits of relevant plant population at a genomic level. In this review, I briely introduced the concept and research methods of population genomics and summarized the research progress of plant population genomics based on high-throughput sequencing. I also discussed the prospect as well as existing problems of plant population genomics in order to provide references for related studies.
Microfluidics-assisted in vitro drug screening and carrier production

PubMed Central

Tsui, Jonathan H.; Lee, Woohyuk; Pun, Suzie H.; Kim, Jungkyu; Kim, Deok-Ho

2013-01-01

Microfluidic platforms provide several unique advantages for drug development. In the production of drug carriers, physical properties such as size and shape, and chemical properties such as drug composition and pharmacokinetic parameters, can be modified simply and effectively by tuning the flow rate and geometries. Large numbers of carriers can then be fabricated with minimal effort and with little to no batch-to-batch variation. Additionally, cell or tissue culture models in microfluidic systems can be used as in vitro drug screening tools. Compared to in vivo animal models, microfluidic drug screening platforms allow for high-throughput and reproducible screening at a significantly lower cost, and when combined with current advances in tissue engineering, are also capable of mimicking native tissues. In this review, various microfluidic platforms for drug and gene carrier fabrication are reviewed to provide guidelines for designing appropriate carriers. In vitro microfluidic drug screening platforms designed for high-throughput analysis and replication of in vivo conditions are also reviewed to highlight future directions for drug research and development. PMID:23856409
Novel method for high-throughput colony PCR screening in nanoliter-reactors

PubMed Central

Walser, Marcel; Pellaux, Rene; Meyer, Andreas; Bechtold, Matthias; Vanderschuren, Herve; Reinhardt, Richard; Magyar, Joseph; Panke, Sven; Held, Martin

2009-01-01

We introduce a technology for the rapid identification and sequencing of conserved DNA elements employing a novel suspension array based on nanoliter (nl)-reactors made from alginate. The reactors have a volume of 35 nl and serve as reaction compartments during monoseptic growth of microbial library clones, colony lysis, thermocycling and screening for sequence motifs via semi-quantitative fluorescence analyses. nl-Reactors were kept in suspension during all high-throughput steps which allowed performing the protocol in a highly space-effective fashion and at negligible expenses of consumables and reagents. As a first application, 11 high-quality microsatellites for polymorphism studies in cassava were isolated and sequenced out of a library of 20 000 clones in 2 days. The technology is widely scalable and we envision that throughputs for nl-reactor based screenings can be increased up to 100 000 and more samples per day thereby efficiently complementing protocols based on established deep-sequencing technologies. PMID:19282448
Systematic Identification of Combinatorial Drivers and Targets in Cancer Cell Lines

PubMed Central

Tabchy, Adel; Eltonsy, Nevine; Housman, David E.; Mills, Gordon B.

2013-01-01

There is an urgent need to elicit and validate highly efficacious targets for combinatorial intervention from large scale ongoing molecular characterization efforts of tumors. We established an in silico bioinformatic platform in concert with a high throughput screening platform evaluating 37 novel targeted agents in 669 extensively characterized cancer cell lines reflecting the genomic and tissue-type diversity of human cancers, to systematically identify combinatorial biomarkers of response and co-actionable targets in cancer. Genomic biomarkers discovered in a 141 cell line training set were validated in an independent 359 cell line test set. We identified co-occurring and mutually exclusive genomic events that represent potential drivers and combinatorial targets in cancer. We demonstrate multiple cooperating genomic events that predict sensitivity to drug intervention independent of tumor lineage. The coupling of scalable in silico and biologic high throughput cancer cell line platforms for the identification of co-events in cancer delivers rational combinatorial targets for synthetic lethal approaches with a high potential to pre-empt the emergence of resistance. PMID:23577104
Systematic identification of combinatorial drivers and targets in cancer cell lines.

PubMed

Tabchy, Adel; Eltonsy, Nevine; Housman, David E; Mills, Gordon B

2013-01-01

There is an urgent need to elicit and validate highly efficacious targets for combinatorial intervention from large scale ongoing molecular characterization efforts of tumors. We established an in silico bioinformatic platform in concert with a high throughput screening platform evaluating 37 novel targeted agents in 669 extensively characterized cancer cell lines reflecting the genomic and tissue-type diversity of human cancers, to systematically identify combinatorial biomarkers of response and co-actionable targets in cancer. Genomic biomarkers discovered in a 141 cell line training set were validated in an independent 359 cell line test set. We identified co-occurring and mutually exclusive genomic events that represent potential drivers and combinatorial targets in cancer. We demonstrate multiple cooperating genomic events that predict sensitivity to drug intervention independent of tumor lineage. The coupling of scalable in silico and biologic high throughput cancer cell line platforms for the identification of co-events in cancer delivers rational combinatorial targets for synthetic lethal approaches with a high potential to pre-empt the emergence of resistance.
Mobile element biology – new possibilities with high-throughput sequencing

PubMed Central

Xing, Jinchuan; Witherspoon, David J.; Jorde, Lynn B.

2014-01-01

Mobile elements compose more than half of the human genome, but until recently their large-scale detection was time-consuming and challenging. With the development of new high-throughput sequencing technologies, the complete spectrum of mobile element variation in humans can now be identified and analyzed. Thousands of new mobile element insertions have been discovered, yielding new insights into mobile element biology, evolution, and genomic variation. We review several high-throughput methods, with an emphasis on techniques that specifically target mobile element insertions in humans, and we highlight recent applications of these methods in evolutionary studies and in the analysis of somatic alterations in human cancers. PMID:23312846
High-throughput SNP genotyping in the highly heterozygous genome of Eucalyptus: assay success, polymorphism and transferability across species

PubMed Central

2011-01-01

Background High-throughput SNP genotyping has become an essential requirement for molecular breeding and population genomics studies in plant species. Large scale SNP developments have been reported for several mainstream crops. A growing interest now exists to expand the speed and resolution of genetic analysis to outbred species with highly heterozygous genomes. When nucleotide diversity is high, a refined diagnosis of the target SNP sequence context is needed to convert queried SNPs into high-quality genotypes using the Golden Gate Genotyping Technology (GGGT). This issue becomes exacerbated when attempting to transfer SNPs across species, a scarcely explored topic in plants, and likely to become significant for population genomics and inter specific breeding applications in less domesticated and less funded plant genera. Results We have successfully developed the first set of 768 SNPs assayed by the GGGT for the highly heterozygous genome of Eucalyptus from a mixed Sanger/454 database with 1,164,695 ESTs and the preliminary 4.5X draft genome sequence for E. grandis. A systematic assessment of in silico SNP filtering requirements showed that stringent constraints on the SNP surrounding sequences have a significant impact on SNP genotyping performance and polymorphism. SNP assay success was high for the 288 SNPs selected with more rigorous in silico constraints; 93% of them provided high quality genotype calls and 71% of them were polymorphic in a diverse panel of 96 individuals of five different species. SNP reliability was high across nine Eucalyptus species belonging to three sections within subgenus Symphomyrtus and still satisfactory across species of two additional subgenera, although polymorphism declined as phylogenetic distance increased. Conclusions This study indicates that the GGGT performs well both within and across species of Eucalyptus notwithstanding its nucleotide diversity ≥2%. The development of a much larger array of informative SNPs across multiple Eucalyptus species is feasible, although strongly dependent on having a representative and sufficiently deep collection of sequences from many individuals of each target species. A higher density SNP platform will be instrumental to undertake genome-wide phylogenetic and population genomics studies and to implement molecular breeding by Genomic Selection in Eucalyptus. PMID:21492434
GST-PRIME: an algorithm for genome-wide primer design.

PubMed

Leister, Dario; Varotto, Claudio

2007-01-01

The profiling of mRNA expression based on DNA arrays has become a powerful tool to study genome-wide transcription of genes in a number of organisms. GST-PRIME is a software package created to facilitate large-scale primer design for the amplification of probes to be immobilized on arrays for transcriptome analyses, even though it can be also applied in low-throughput approaches. GST-PRIME allows highly efficient, direct amplification of gene-sequence tags (GSTs) from genomic DNA (gDNA), starting from annotated genome or transcript sequences. GST-PRIME provides a customer-friendly platform for automatic primer design, and despite the relative simplicity of the algorithm, experimental tests in the model plant species Arabidopsis thaliana confirmed the reliability of the software. This chapter describes the algorithm used for primer design, its input and output files, and the installation of the standalone package and its use.

Correcting for batch effects in case-control microbiome studies

PubMed Central

Gibbons, Sean M.; Duvallet, Claire

2018-01-01

High-throughput data generation platforms, like mass-spectrometry, microarrays, and second-generation sequencing are susceptible to batch effects due to run-to-run variation in reagents, equipment, protocols, or personnel. Currently, batch correction methods are not commonly applied to microbiome sequencing datasets. In this paper, we compare different batch-correction methods applied to microbiome case-control studies. We introduce a model-free normalization procedure where features (i.e. bacterial taxa) in case samples are converted to percentiles of the equivalent features in control samples within a study prior to pooling data across studies. We look at how this percentile-normalization method compares to traditional meta-analysis methods for combining independent p-values and to limma and ComBat, widely used batch-correction models developed for RNA microarray data. Overall, we show that percentile-normalization is a simple, non-parametric approach for correcting batch effects and improving sensitivity in case-control meta-analyses. PMID:29684016
A framework for human microbiome research

PubMed Central

Methé, Barbara A.; Nelson, Karen E.; Pop, Mihai; Creasy, Heather H.; Giglio, Michelle G.; Huttenhower, Curtis; Gevers, Dirk; Petrosino, Joseph F.; Abubucker, Sahar; Badger, Jonathan H.; Chinwalla, Asif T.; Earl, Ashlee M.; FitzGerald, Michael G.; Fulton, Robert S.; Hallsworth-Pepin, Kymberlie; Lobos, Elizabeth A.; Madupu, Ramana; Magrini, Vincent; Martin, John C.; Mitreva, Makedonka; Muzny, Donna M.; Sodergren, Erica J.; Versalovic, James; Wollam, Aye M.; Worley, Kim C.; Wortman, Jennifer R.; Young, Sarah K.; Zeng, Qiandong; Aagaard, Kjersti M.; Abolude, Olukemi O.; Allen-Vercoe, Emma; Alm, Eric J.; Alvarado, Lucia; Andersen, Gary L.; Anderson, Scott; Appelbaum, Elizabeth; Arachchi, Harindra M.; Armitage, Gary; Arze, Cesar A.; Ayvaz, Tulin; Baker, Carl C.; Begg, Lisa; Belachew, Tsegahiwot; Bhonagiri, Veena; Bihan, Monika; Blaser, Martin J.; Bloom, Toby; Vivien Bonazzi, J.; Brooks, Paul; Buck, Gregory A.; Buhay, Christian J.; Busam, Dana A.; Campbell, Joseph L.; Canon, Shane R.; Cantarel, Brandi L.; Chain, Patrick S.; Chen, I-Min A.; Chen, Lei; Chhibba, Shaila; Chu, Ken; Ciulla, Dawn M.; Clemente, Jose C.; Clifton, Sandra W.; Conlan, Sean; Crabtree, Jonathan; Cutting, Mary A.; Davidovics, Noam J.; Davis, Catherine C.; DeSantis, Todd Z.; Deal, Carolyn; Delehaunty, Kimberley D.; Dewhirst, Floyd E.; Deych, Elena; Ding, Yan; Dooling, David J.; Dugan, Shannon P.; Dunne, Wm. Michael; Durkin, A. Scott; Edgar, Robert C.; Erlich, Rachel L.; Farmer, Candace N.; Farrell, Ruth M.; Faust, Karoline; Feldgarden, Michael; Felix, Victor M.; Fisher, Sheila; Fodor, Anthony A.; Forney, Larry; Foster, Leslie; Di Francesco, Valentina; Friedman, Jonathan; Friedrich, Dennis C.; Fronick, Catrina C.; Fulton, Lucinda L.; Gao, Hongyu; Garcia, Nathalia; Giannoukos, Georgia; Giblin, Christina; Giovanni, Maria Y.; Goldberg, Jonathan M.; Goll, Johannes; Gonzalez, Antonio; Griggs, Allison; Gujja, Sharvari; Haas, Brian J.; Hamilton, Holli A.; Harris, Emily L.; Hepburn, Theresa A.; Herter, Brandi; Hoffmann, Diane E.; Holder, Michael E.; Howarth, Clinton; Huang, Katherine H.; Huse, Susan M.; Izard, Jacques; Jansson, Janet K.; Jiang, Huaiyang; Jordan, Catherine; Joshi, Vandita; Katancik, James A.; Keitel, Wendy A.; Kelley, Scott T.; Kells, Cristyn; Kinder-Haake, Susan; King, Nicholas B.; Knight, Rob; Knights, Dan; Kong, Heidi H.; Koren, Omry; Koren, Sergey; Kota, Karthik C.; Kovar, Christie L.; Kyrpides, Nikos C.; La Rosa, Patricio S.; Lee, Sandra L.; Lemon, Katherine P.; Lennon, Niall; Lewis, Cecil M.; Lewis, Lora; Ley, Ruth E.; Li, Kelvin; Liolios, Konstantinos; Liu, Bo; Liu, Yue; Lo, Chien-Chi; Lozupone, Catherine A.; Lunsford, R. Dwayne; Madden, Tessa; Mahurkar, Anup A.; Mannon, Peter J.; Mardis, Elaine R.; Markowitz, Victor M.; Mavrommatis, Konstantinos; McCorrison, Jamison M.; McDonald, Daniel; McEwen, Jean; McGuire, Amy L.; McInnes, Pamela; Mehta, Teena; Mihindukulasuriya, Kathie A.; Miller, Jason R.; Minx, Patrick J.; Newsham, Irene; Nusbaum, Chad; O’Laughlin, Michelle; Orvis, Joshua; Pagani, Ioanna; Palaniappan, Krishna; Patel, Shital M.; Pearson, Matthew; Peterson, Jane; Podar, Mircea; Pohl, Craig; Pollard, Katherine S.; Priest, Margaret E.; Proctor, Lita M.; Qin, Xiang; Raes, Jeroen; Ravel, Jacques; Reid, Jeffrey G.; Rho, Mina; Rhodes, Rosamond; Riehle, Kevin P.; Rivera, Maria C.; Rodriguez-Mueller, Beltran; Rogers, Yu-Hui; Ross, Matthew C.; Russ, Carsten; Sanka, Ravi K.; Pamela Sankar, J.; Sathirapongsasuti, Fah; Schloss, Jeffery A.; Schloss, Patrick D.; Schmidt, Thomas M.; Scholz, Matthew; Schriml, Lynn; Schubert, Alyxandria M.; Segata, Nicola; Segre, Julia A.; Shannon, William D.; Sharp, Richard R.; Sharpton, Thomas J.; Shenoy, Narmada; Sheth, Nihar U.; Simone, Gina A.; Singh, Indresh; Smillie, Chris S.; Sobel, Jack D.; Sommer, Daniel D.; Spicer, Paul; Sutton, Granger G.; Sykes, Sean M.; Tabbaa, Diana G.; Thiagarajan, Mathangi; Tomlinson, Chad M.; Torralba, Manolito; Treangen, Todd J.; Truty, Rebecca M.; Vishnivetskaya, Tatiana A.; Walker, Jason; Wang, Lu; Wang, Zhengyuan; Ward, Doyle V.; Warren, Wesley; Watson, Mark A.; Wellington, Christopher; Wetterstrand, Kris A.; White, James R.; Wilczek-Boney, Katarzyna; Wu, Yuan Qing; Wylie, Kristine M.; Wylie, Todd; Yandava, Chandri; Ye, Liang; Ye, Yuzhen; Yooseph, Shibu; Youmans, Bonnie P.; Zhang, Lan; Zhou, Yanjiao; Zhu, Yiming; Zoloth, Laurie; Zucker, Jeremy D.; Birren, Bruce W.; Gibbs, Richard A.; Highlander, Sarah K.; Weinstock, George M.; Wilson, Richard K.; White, Owen

2012-01-01

A variety of microbial communities and their genes (microbiome) exist throughout the human body, playing fundamental roles in human health and disease. The NIH funded Human Microbiome Project (HMP) Consortium has established a population-scale framework which catalyzed significant development of metagenomic protocols resulting in a broad range of quality-controlled resources and data including standardized methods for creating, processing and interpreting distinct types of high-throughput metagenomic data available to the scientific community. Here we present resources from a population of 242 healthy adults sampled at 15 to 18 body sites up to three times, which to date, have generated 5,177 microbial taxonomic profiles from 16S rRNA genes and over 3.5 Tb of metagenomic sequence. In parallel, approximately 800 human-associated reference genomes have been sequenced. Collectively, these data represent the largest resource to date describing the abundance and variety of the human microbiome, while providing a platform for current and future studies. PMID:22699610
Mapping of disease-associated variants in admixed populations

PubMed Central

2011-01-01

Recent developments in high-throughput genotyping and whole-genome sequencing will enhance the identification of disease loci in admixed populations. We discuss how a more refined estimation of ancestry benefits both admixture mapping and association mapping, making disease loci identification in admixed populations more powerful. High-throughput genotyping and sequencing will enable refined estimation of ancestry, thus enhancing disease loci identification in admixed populations PMID:21635713
Versatile de novo enzyme activity in capsid proteins from an engineered M13 bacteriophage library.

PubMed

Casey, John P; Barbero, Roberto J; Heldman, Nimrod; Belcher, Angela M

2014-11-26

Biocatalysis has grown rapidly in recent decades as a solution to the evolving demands of industrial chemical processes. Mounting environmental pressures and shifting supply chains underscore the need for novel chemical activities, while rapid biotechnological progress has greatly increased the utility of enzymatic methods. Enzymes, though capable of high catalytic efficiency and remarkable reaction selectivity, still suffer from relative instability, high costs of scaling, and functional inflexibility. Herein, we developed a biochemical platform for engineering de novo semisynthetic enzymes, functionally modular and widely stable, based on the M13 bacteriophage. The hydrolytic bacteriophage described in this paper catalyzes a range of carboxylic esters, is active from 25 to 80 °C, and demonstrates greater efficiency in DMSO than in water. The platform complements biocatalysts with characteristics of heterogeneous catalysis, yielding high-surface area, thermostable biochemical structures readily adaptable to reactions in myriad solvents. As the viral structure ensures semisynthetic enzymes remain linked to the genetic sequences responsible for catalysis, future work will tailor the biocatalysts to high-demand synthetic processes by evolving new activities, utilizing high-throughput screening technology and harnessing M13's multifunctionality.
HTP-OligoDesigner: An Online Primer Design Tool for High-Throughput Gene Cloning and Site-Directed Mutagenesis.

PubMed

Camilo, Cesar M; Lima, Gustavo M A; Maluf, Fernando V; Guido, Rafael V C; Polikarpov, Igor

2016-01-01

Following burgeoning genomic and transcriptomic sequencing data, biochemical and molecular biology groups worldwide are implementing high-throughput cloning and mutagenesis facilities in order to obtain a large number of soluble proteins for structural and functional characterization. Since manual primer design can be a time-consuming and error-generating step, particularly when working with hundreds of targets, the automation of primer design process becomes highly desirable. HTP-OligoDesigner was created to provide the scientific community with a simple and intuitive online primer design tool for both laboratory-scale and high-throughput projects of sequence-independent gene cloning and site-directed mutagenesis and a Tm calculator for quick queries.
Designing deep sequencing experiments: detecting structural variation and estimating transcript abundance.

PubMed

Bashir, Ali; Bansal, Vikas; Bafna, Vineet

2010-06-18

Massively parallel DNA sequencing technologies have enabled the sequencing of several individual human genomes. These technologies are also being used in novel ways for mRNA expression profiling, genome-wide discovery of transcription-factor binding sites, small RNA discovery, etc. The multitude of sequencing platforms, each with their unique characteristics, pose a number of design challenges, regarding the technology to be used and the depth of sequencing required for a particular sequencing application. Here we describe a number of analytical and empirical results to address design questions for two applications: detection of structural variations from paired-end sequencing and estimating mRNA transcript abundance. For structural variation, our results provide explicit trade-offs between the detection and resolution of rearrangement breakpoints, and the optimal mix of paired-read insert lengths. Specifically, we prove that optimal detection and resolution of breakpoints is achieved using a mix of exactly two insert library lengths. Furthermore, we derive explicit formulae to determine these insert length combinations, enabling a 15% improvement in breakpoint detection at the same experimental cost. On empirical short read data, these predictions show good concordance with Illumina 200 bp and 2 Kbp insert length libraries. For transcriptome sequencing, we determine the sequencing depth needed to detect rare transcripts from a small pilot study. With only 1 Million reads, we derive corrections that enable almost perfect prediction of the underlying expression probability distribution, and use this to predict the sequencing depth required to detect low expressed genes with greater than 95% probability. Together, our results form a generic framework for many design considerations related to high-throughput sequencing. We provide software tools http://bix.ucsd.edu/projects/NGS-DesignTools to derive platform independent guidelines for designing sequencing experiments (amount of sequencing, choice of insert length, mix of libraries) for novel applications of next generation sequencing.
YAMAT-seq: an efficient method for high-throughput sequencing of mature transfer RNAs

PubMed Central

Shigematsu, Megumi; Honda, Shozo; Loher, Phillipe; Telonis, Aristeidis G.; Rigoutsos, Isidore

2017-01-01

Abstract Besides translation, transfer RNAs (tRNAs) play many non-canonical roles in various biological pathways and exhibit highly variable expression profiles. To unravel the emerging complexities of tRNA biology and molecular mechanisms underlying them, an efficient tRNA sequencing method is required. However, the rigid structure of tRNA has been presenting a challenge to the development of such methods. We report the development of Y-shaped Adapter-ligated MAture TRNA sequencing (YAMAT-seq), an efficient and convenient method for high-throughput sequencing of mature tRNAs. YAMAT-seq circumvents the issue of inefficient adapter ligation, a characteristic of conventional RNA sequencing methods for mature tRNAs, by employing the efficient and specific ligation of Y-shaped adapter to mature tRNAs using T4 RNA Ligase 2. Subsequent cDNA amplification and next-generation sequencing successfully yield numerous mature tRNA sequences. YAMAT-seq has high specificity for mature tRNAs and high sensitivity to detect most isoacceptors from minute amount of total RNA. Moreover, YAMAT-seq shows quantitative capability to estimate expression levels of mature tRNAs, and has high reproducibility and broad applicability for various cell lines. YAMAT-seq thus provides high-throughput technique for identifying tRNA profiles and their regulations in various transcriptomes, which could play important regulatory roles in translation and other biological processes. PMID:28108659
Ultra-High Throughput Synthesis of Nanoparticles with Homogeneous Size Distribution Using a Coaxial Turbulent Jet Mixer

PubMed Central

2015-01-01

High-throughput production of nanoparticles (NPs) with controlled quality is critical for their clinical translation into effective nanomedicines for diagnostics and therapeutics. Here we report a simple and versatile coaxial turbulent jet mixer that can synthesize a variety of NPs at high throughput up to 3 kg/d, while maintaining the advantages of homogeneity, reproducibility, and tunability that are normally accessible only in specialized microscale mixing devices. The device fabrication does not require specialized machining and is easy to operate. As one example, we show reproducible, high-throughput formulation of siRNA-polyelectrolyte polyplex NPs that exhibit effective gene knockdown but exhibit significant dependence on batch size when formulated using conventional methods. The coaxial turbulent jet mixer can accelerate the development of nanomedicines by providing a robust and versatile platform for preparation of NPs at throughputs suitable for in vivo studies, clinical trials, and industrial-scale production. PMID:24824296
A Protocol for Functional Assessment of Whole-Protein Saturation Mutagenesis Libraries Utilizing High-Throughput Sequencing.

PubMed

Stiffler, Michael A; Subramanian, Subu K; Salinas, Victor H; Ranganathan, Rama

2016-07-03

Site-directed mutagenesis has long been used as a method to interrogate protein structure, function and evolution. Recent advances in massively-parallel sequencing technology have opened up the possibility of assessing the functional or fitness effects of large numbers of mutations simultaneously. Here, we present a protocol for experimentally determining the effects of all possible single amino acid mutations in a protein of interest utilizing high-throughput sequencing technology, using the 263 amino acid antibiotic resistance enzyme TEM-1 β-lactamase as an example. In this approach, a whole-protein saturation mutagenesis library is constructed by site-directed mutagenic PCR, randomizing each position individually to all possible amino acids. The library is then transformed into bacteria, and selected for the ability to confer resistance to β-lactam antibiotics. The fitness effect of each mutation is then determined by deep sequencing of the library before and after selection. Importantly, this protocol introduces methods which maximize sequencing read depth and permit the simultaneous selection of the entire mutation library, by mixing adjacent positions into groups of length accommodated by high-throughput sequencing read length and utilizing orthogonal primers to barcode each group. Representative results using this protocol are provided by assessing the fitness effects of all single amino acid mutations in TEM-1 at a clinically relevant dosage of ampicillin. The method should be easily extendable to other proteins for which a high-throughput selection assay is in place.
eRNA: a graphic user interface-based tool optimized for large data analysis from high-throughput RNA sequencing

PubMed Central

2014-01-01

Background RNA sequencing (RNA-seq) is emerging as a critical approach in biological research. However, its high-throughput advantage is significantly limited by the capacity of bioinformatics tools. The research community urgently needs user-friendly tools to efficiently analyze the complicated data generated by high throughput sequencers. Results We developed a standalone tool with graphic user interface (GUI)-based analytic modules, known as eRNA. The capacity of performing parallel processing and sample management facilitates large data analyses by maximizing hardware usage and freeing users from tediously handling sequencing data. The module miRNA identification” includes GUIs for raw data reading, adapter removal, sequence alignment, and read counting. The module “mRNA identification” includes GUIs for reference sequences, genome mapping, transcript assembling, and differential expression. The module “Target screening” provides expression profiling analyses and graphic visualization. The module “Self-testing” offers the directory setups, sample management, and a check for third-party package dependency. Integration of other GUIs including Bowtie, miRDeep2, and miRspring extend the program’s functionality. Conclusions eRNA focuses on the common tools required for the mapping and quantification analysis of miRNA-seq and mRNA-seq data. The software package provides an additional choice for scientists who require a user-friendly computing environment and high-throughput capacity for large data analysis. eRNA is available for free download at https://sourceforge.net/projects/erna/?source=directory. PMID:24593312
eRNA: a graphic user interface-based tool optimized for large data analysis from high-throughput RNA sequencing.

PubMed

Yuan, Tiezheng; Huang, Xiaoyi; Dittmar, Rachel L; Du, Meijun; Kohli, Manish; Boardman, Lisa; Thibodeau, Stephen N; Wang, Liang

2014-03-05

RNA sequencing (RNA-seq) is emerging as a critical approach in biological research. However, its high-throughput advantage is significantly limited by the capacity of bioinformatics tools. The research community urgently needs user-friendly tools to efficiently analyze the complicated data generated by high throughput sequencers. We developed a standalone tool with graphic user interface (GUI)-based analytic modules, known as eRNA. The capacity of performing parallel processing and sample management facilitates large data analyses by maximizing hardware usage and freeing users from tediously handling sequencing data. The module miRNA identification" includes GUIs for raw data reading, adapter removal, sequence alignment, and read counting. The module "mRNA identification" includes GUIs for reference sequences, genome mapping, transcript assembling, and differential expression. The module "Target screening" provides expression profiling analyses and graphic visualization. The module "Self-testing" offers the directory setups, sample management, and a check for third-party package dependency. Integration of other GUIs including Bowtie, miRDeep2, and miRspring extend the program's functionality. eRNA focuses on the common tools required for the mapping and quantification analysis of miRNA-seq and mRNA-seq data. The software package provides an additional choice for scientists who require a user-friendly computing environment and high-throughput capacity for large data analysis. eRNA is available for free download at https://sourceforge.net/projects/erna/?source=directory.
Mapping of MPEG-4 decoding on a flexible architecture platform

NASA Astrophysics Data System (ADS)

van der Tol, Erik B.; Jaspers, Egbert G.

2001-12-01

In the field of consumer electronics, the advent of new features such as Internet, games, video conferencing, and mobile communication has triggered the convergence of television and computers technologies. This requires a generic media-processing platform that enables simultaneous execution of very diverse tasks such as high-throughput stream-oriented data processing and highly data-dependent irregular processing with complex control flows. As a representative application, this paper presents the mapping of a Main Visual profile MPEG-4 for High-Definition (HD) video onto a flexible architecture platform. A stepwise approach is taken, going from the decoder application toward an implementation proposal. First, the application is decomposed into separate tasks with self-contained functionality, clear interfaces, and distinct characteristics. Next, a hardware-software partitioning is derived by analyzing the characteristics of each task such as the amount of inherent parallelism, the throughput requirements, the complexity of control processing, and the reuse potential over different applications and different systems. Finally, a feasible implementation is proposed that includes amongst others a very-long-instruction-word (VLIW) media processor, one or more RISC processors, and some dedicated processors. The mapping study of the MPEG-4 decoder proves the flexibility and extensibility of the media-processing platform. This platform enables an effective HW/SW co-design yielding a high performance density.
A comparison of sequencing platforms and bioinformatics pipelines for compositional analysis of the gut microbiome.

PubMed

Allali, Imane; Arnold, Jason W; Roach, Jeffrey; Cadenas, Maria Belen; Butz, Natasha; Hassan, Hosni M; Koci, Matthew; Ballou, Anne; Mendoza, Mary; Ali, Rizwana; Azcarate-Peril, M Andrea

2017-09-13

Advancements in Next Generation Sequencing (NGS) technologies regarding throughput, read length and accuracy had a major impact on microbiome research by significantly improving 16S rRNA amplicon sequencing. As rapid improvements in sequencing platforms and new data analysis pipelines are introduced, it is essential to evaluate their capabilities in specific applications. The aim of this study was to assess whether the same project-specific biological conclusions regarding microbiome composition could be reached using different sequencing platforms and bioinformatics pipelines. Chicken cecum microbiome was analyzed by 16S rRNA amplicon sequencing using Illumina MiSeq, Ion Torrent PGM, and Roche 454 GS FLX Titanium platforms, with standard and modified protocols for library preparation. We labeled the bioinformatics pipelines included in our analysis QIIME1 and QIIME2 (de novo OTU picking [not to be confused with QIIME version 2 commonly referred to as QIIME2]), QIIME3 and QIIME4 (open reference OTU picking), UPARSE1 and UPARSE2 (each pair differs only in the use of chimera depletion methods), and DADA2 (for Illumina data only). GS FLX+ yielded the longest reads and highest quality scores, while MiSeq generated the largest number of reads after quality filtering. Declines in quality scores were observed starting at bases 150-199 for GS FLX+ and bases 90-99 for MiSeq. Scores were stable for PGM-generated data. Overall microbiome compositional profiles were comparable between platforms; however, average relative abundance of specific taxa varied depending on sequencing platform, library preparation method, and bioinformatics analysis. Specifically, QIIME with de novo OTU picking yielded the highest number of unique species and alpha diversity was reduced with UPARSE and DADA2 compared to QIIME. The three platforms compared in this study were capable of discriminating samples by treatment, despite differences in diversity and abundance, leading to similar biological conclusions. Our results demonstrate that while there were differences in depth of coverage and phylogenetic diversity, all workflows revealed comparable treatment effects on microbial diversity. To increase reproducibility and reliability and to retain consistency between similar studies, it is important to consider the impact on data quality and relative abundance of taxa when selecting NGS platforms and analysis tools for microbiome studies.
High-Throughput Sequencing for Detection of Subpopulations of Bacteria Not Previously Associated with Artisanal Cheeses

PubMed Central

Quigley, Lisa; O'Sullivan, Orla; Beresford, Tom P.; Ross, R. Paul; Fitzgerald, Gerald F.

2012-01-01

Here, high-throughput sequencing was employed to reveal the highly diverse bacterial populations present in 62 Irish artisanal cheeses and, in some cases, associated cheese rinds. Using this approach, we revealed the presence of several genera not previously associated with cheese, including Faecalibacterium, Prevotella, and Helcococcus and, for the first time, detected the presence of Arthrobacter and Brachybacterium in goats' milk cheese. Our analysis confirmed many previously observed patterns, such as the dominance of typical cheese bacteria, the fact that the microbiota of raw and pasteurized milk cheeses differ, and that the level of cheese maturation has a significant influence on Lactobacillus populations. It was also noted that cheeses containing adjunct ingredients had lower proportions of Lactococcus species. It is thus apparent that high-throughput sequencing-based investigations can provide valuable insights into the microbial populations of artisanal foods. PMID:22685131
High-throughput sequencing for detection of subpopulations of bacteria not previously associated with artisanal cheeses.

PubMed

Quigley, Lisa; O'Sullivan, Orla; Beresford, Tom P; Ross, R Paul; Fitzgerald, Gerald F; Cotter, Paul D

2012-08-01

Here, high-throughput sequencing was employed to reveal the highly diverse bacterial populations present in 62 Irish artisanal cheeses and, in some cases, associated cheese rinds. Using this approach, we revealed the presence of several genera not previously associated with cheese, including Faecalibacterium, Prevotella, and Helcococcus and, for the first time, detected the presence of Arthrobacter and Brachybacterium in goats' milk cheese. Our analysis confirmed many previously observed patterns, such as the dominance of typical cheese bacteria, the fact that the microbiota of raw and pasteurized milk cheeses differ, and that the level of cheese maturation has a significant influence on Lactobacillus populations. It was also noted that cheeses containing adjunct ingredients had lower proportions of Lactococcus species. It is thus apparent that high-throughput sequencing-based investigations can provide valuable insights into the microbial populations of artisanal foods.
Short-read, high-throughput sequencing technology for STR genotyping

PubMed Central

Bornman, Daniel M.; Hester, Mark E.; Schuetter, Jared M.; Kasoji, Manjula D.; Minard-Smith, Angela; Barden, Curt A.; Nelson, Scott C.; Godbold, Gene D.; Baker, Christine H.; Yang, Boyu; Walther, Jacquelyn E.; Tornes, Ivan E.; Yan, Pearlly S.; Rodriguez, Benjamin; Bundschuh, Ralf; Dickens, Michael L.; Young, Brian A.; Faith, Seth A.

2013-01-01

DNA-based methods for human identification principally rely upon genotyping of short tandem repeat (STR) loci. Electrophoretic-based techniques for variable-length classification of STRs are universally utilized, but are limited in that they have relatively low throughput and do not yield nucleotide sequence information. High-throughput sequencing technology may provide a more powerful instrument for human identification, but is not currently validated for forensic casework. Here, we present a systematic method to perform high-throughput genotyping analysis of the Combined DNA Index System (CODIS) STR loci using short-read (150 bp) massively parallel sequencing technology. Open source reference alignment tools were optimized to evaluate PCR-amplified STR loci using a custom designed STR genome reference. Evaluation of this approach demonstrated that the 13 CODIS STR loci and amelogenin (AMEL) locus could be accurately called from individual and mixture samples. Sensitivity analysis showed that as few as 18,500 reads, aligned to an in silico referenced genome, were required to genotype an individual (>99% confidence) for the CODIS loci. The power of this technology was further demonstrated by identification of variant alleles containing single nucleotide polymorphisms (SNPs) and the development of quantitative measurements (reads) for resolving mixed samples. PMID:25621315
A family-based probabilistic method for capturing de novo mutations from high-throughput short-read sequencing data.

PubMed

Cartwright, Reed A; Hussin, Julie; Keebler, Jonathan E M; Stone, Eric A; Awadalla, Philip

2012-01-06

Recent advances in high-throughput DNA sequencing technologies and associated statistical analyses have enabled in-depth analysis of whole-genome sequences. As this technology is applied to a growing number of individual human genomes, entire families are now being sequenced. Information contained within the pedigree of a sequenced family can be leveraged when inferring the donors' genotypes. The presence of a de novo mutation within the pedigree is indicated by a violation of Mendelian inheritance laws. Here, we present a method for probabilistically inferring genotypes across a pedigree using high-throughput sequencing data and producing the posterior probability of de novo mutation at each genomic site examined. This framework can be used to disentangle the effects of germline and somatic mutational processes and to simultaneously estimate the effect of sequencing error and the initial genetic variation in the population from which the founders of the pedigree arise. This approach is examined in detail through simulations and areas for method improvement are noted. By applying this method to data from members of a well-defined nuclear family with accurate pedigree information, the stage is set to make the most direct estimates of the human mutation rate to date.
A confidence interval analysis of sampling effort, sequencing depth, and taxonomic resolution of fungal community ecology in the era of high-throughput sequencing.

PubMed

Oono, Ryoko

2017-01-01

High-throughput sequencing technology has helped microbial community ecologists explore ecological and evolutionary patterns at unprecedented scales. The benefits of a large sample size still typically outweigh that of greater sequencing depths per sample for accurate estimations of ecological inferences. However, excluding or not sequencing rare taxa may mislead the answers to the questions 'how and why are communities different?' This study evaluates the confidence intervals of ecological inferences from high-throughput sequencing data of foliar fungal endophytes as case studies through a range of sampling efforts, sequencing depths, and taxonomic resolutions to understand how technical and analytical practices may affect our interpretations. Increasing sampling size reliably decreased confidence intervals across multiple community comparisons. However, the effects of sequencing depths on confidence intervals depended on how rare taxa influenced the dissimilarity estimates among communities and did not significantly decrease confidence intervals for all community comparisons. A comparison of simulated communities under random drift suggests that sequencing depths are important in estimating dissimilarities between microbial communities under neutral selective processes. Confidence interval analyses reveal important biases as well as biological trends in microbial community studies that otherwise may be ignored when communities are only compared for statistically significant differences.
A confidence interval analysis of sampling effort, sequencing depth, and taxonomic resolution of fungal community ecology in the era of high-throughput sequencing

PubMed Central

2017-01-01

High-throughput sequencing technology has helped microbial community ecologists explore ecological and evolutionary patterns at unprecedented scales. The benefits of a large sample size still typically outweigh that of greater sequencing depths per sample for accurate estimations of ecological inferences. However, excluding or not sequencing rare taxa may mislead the answers to the questions ‘how and why are communities different?’ This study evaluates the confidence intervals of ecological inferences from high-throughput sequencing data of foliar fungal endophytes as case studies through a range of sampling efforts, sequencing depths, and taxonomic resolutions to understand how technical and analytical practices may affect our interpretations. Increasing sampling size reliably decreased confidence intervals across multiple community comparisons. However, the effects of sequencing depths on confidence intervals depended on how rare taxa influenced the dissimilarity estimates among communities and did not significantly decrease confidence intervals for all community comparisons. A comparison of simulated communities under random drift suggests that sequencing depths are important in estimating dissimilarities between microbial communities under neutral selective processes. Confidence interval analyses reveal important biases as well as biological trends in microbial community studies that otherwise may be ignored when communities are only compared for statistically significant differences. PMID:29253889
High-throughput detection of ethanol-producing cyanobacteria in a microdroplet platform.

PubMed

Abalde-Cela, Sara; Gould, Anna; Liu, Xin; Kazamia, Elena; Smith, Alison G; Abell, Chris

2015-05-06

Ethanol production by microorganisms is an important renewable energy source. Most processes involve fermentation of sugars from plant feedstock, but there is increasing interest in direct ethanol production by photosynthetic organisms. To facilitate this, a high-throughput screening technique for the detection of ethanol is required. Here, a method for the quantitative detection of ethanol in a microdroplet-based platform is described that can be used for screening cyanobacterial strains to identify those with the highest ethanol productivity levels. The detection of ethanol by enzymatic assay was optimized both in bulk and in microdroplets. In parallel, the encapsulation of engineered ethanol-producing cyanobacteria in microdroplets and their growth dynamics in microdroplet reservoirs were demonstrated. The combination of modular microdroplet operations including droplet generation for cyanobacteria encapsulation, droplet re-injection and pico-injection, and laser-induced fluorescence, were used to create this new platform to screen genetically engineered strains of cyanobacteria with different levels of ethanol production.

Microfabrication of a platform to measure and manipulate the mechanics of engineered microtissues.

PubMed

Ramade, Alexandre; Legant, Wesley R; Picart, Catherine; Chen, Christopher S; Boudou, Thomas

2014-01-01

Engineered tissues can be used to understand fundamental features of biology, develop organotypic in vitro model systems, and as engineered tissue constructs for replacing damaged tissue in vivo. However, a key limitation is an inability to test the wide range of parameters that might impact the engineered tissue in a high-throughput manner and in an environment that mimics the three-dimensional (3D) native architecture. We developed a microfabricated platform to generate arrays of microtissues embedded within 3D micropatterned matrices. Microcantilevers simultaneously constrain microtissue formation and report forces generated by the microtissues in real time, opening the possibility to use high-throughput, low-volume screening for studies on engineered tissues. Thanks to the micrometer scale of the microtissues, this platform is also suitable for high-throughput monitoring of drug-induced effect on architecture and contractility in engineered tissues. Moreover, independent variations of the mechanical stiffness of the cantilevers and collagen matrix allow the measurement and manipulation of the mechanics of the microtissues. Thus, our approach will likely provide valuable opportunities to elucidate how biomechanical, electrical, biochemical, and genetic/epigenetic cues modulate the formation and maturation of 3D engineered tissues. In this chapter, we describe the microfabrication, preparation, and experimental use of such microfabricated tissue gauges. Copyright © 2014 Elsevier Inc. All rights reserved.
High Diversity of Myocyanophage in Various Aquatic Environments Revealed by High-Throughput Sequencing of Major Capsid Protein Gene With a New Set of Primers.

PubMed

Hou, Weiguo; Wang, Shang; Briggs, Brandon R; Li, Gaoyuan; Xie, Wei; Dong, Hailiang

2018-01-01

Myocyanophages, a group of viruses infecting cyanobacteria, are abundant and play important roles in elemental cycling. Here we investigated the particle-associated viral communities retained on 0.2 μm filters and in sediment samples (representing ancient cyanophage communities) from four ocean and three lake locations, using high-throughput sequencing and a newly designed primer pair targeting a gene fragment (∼145-bp in length) encoding the cyanophage gp23 major capsid protein (MCP). Diverse viral communities were detected in all samples. The fragments of 142-, 145-, and 148-bp in length were most abundant in the amplicons, and most sequences (>92%) belonged to cyanophages. Additionally, different sequencing depths resulted in different diversity estimates of the viral community. Operational taxonomic units obtained from deep sequencing of the MCP gene covered the majority of those obtained from shallow sequencing, suggesting that deep sequencing exhibited a more complete picture of cyanophage community than shallow sequencing. Our results also revealed a wide geographic distribution of marine myocyanophages, i.e., higher dissimilarities of the myocyanophage communities corresponded with the larger distances between the sampling sites. Collectively, this study suggests that the newly designed primer pair can be effectively used to study the community and diversity of myocyanophage from different environments, and the high-throughput sequencing represents a good method to understand viral diversity.
High Diversity of Myocyanophage in Various Aquatic Environments Revealed by High-Throughput Sequencing of Major Capsid Protein Gene With a New Set of Primers

PubMed Central

Hou, Weiguo; Wang, Shang; Briggs, Brandon R.; Li, Gaoyuan; Xie, Wei; Dong, Hailiang

2018-01-01

Myocyanophages, a group of viruses infecting cyanobacteria, are abundant and play important roles in elemental cycling. Here we investigated the particle-associated viral communities retained on 0.2 μm filters and in sediment samples (representing ancient cyanophage communities) from four ocean and three lake locations, using high-throughput sequencing and a newly designed primer pair targeting a gene fragment (∼145-bp in length) encoding the cyanophage gp23 major capsid protein (MCP). Diverse viral communities were detected in all samples. The fragments of 142-, 145-, and 148-bp in length were most abundant in the amplicons, and most sequences (>92%) belonged to cyanophages. Additionally, different sequencing depths resulted in different diversity estimates of the viral community. Operational taxonomic units obtained from deep sequencing of the MCP gene covered the majority of those obtained from shallow sequencing, suggesting that deep sequencing exhibited a more complete picture of cyanophage community than shallow sequencing. Our results also revealed a wide geographic distribution of marine myocyanophages, i.e., higher dissimilarities of the myocyanophage communities corresponded with the larger distances between the sampling sites. Collectively, this study suggests that the newly designed primer pair can be effectively used to study the community and diversity of myocyanophage from different environments, and the high-throughput sequencing represents a good method to understand viral diversity.
Role of APOE Isoforms in the Pathogenesis of TBI induced Alzheimer’s Disease

DTIC Science & Technology

2016-10-01

deletion, APOE targeted replacement, complex breeding, CCI model optimization, mRNA library generation, high throughput massive parallel sequencing...demonstrate that the lack of Abca1 increases amyloid plaques and decreased APOE protein levels in AD-model mice. In this proposal we will test the hypothesis...injury, inflammatory reaction, transcriptome, high throughput massive parallel sequencing, mRNA-seq., behavioral testing, memory impairment, recovery 3
Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline.

PubMed

Reid, Jeffrey G; Carroll, Andrew; Veeraraghavan, Narayanan; Dahdouli, Mahmoud; Sundquist, Andreas; English, Adam; Bainbridge, Matthew; White, Simon; Salerno, William; Buhay, Christian; Yu, Fuli; Muzny, Donna; Daly, Richard; Duyk, Geoff; Gibbs, Richard A; Boerwinkle, Eric

2014-01-29

Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples.
High-density fiber optic biosensor arrays

NASA Astrophysics Data System (ADS)

Epstein, Jason R.; Walt, David R.

2002-02-01

Novel approaches are required to coordinate the immense amounts of information derived from diverse genomes. This concept has influenced the expanded role of high-throughput DNA detection and analysis in the biological sciences. A high-density fiber optic DNA biosensor was developed consisting of oligonucleotide-functionalized, 3.1 mm diameter microspheres deposited into the etched wells on the distal face of a 500 micrometers imaging fiber bundle. Imaging fiber bundles containing thousands of optical fibers, each associated with a unique oligonucleotide probe sequence, were the foundation for an optically connected, individually addressable DNA detection platform. Different oligonucleotide-functionalized microspheres were combined in a stock solution, and randomly dispersed into the etched wells. Microsphere positions were registered from optical dyes incorporated onto the microspheres. The distribution process provided an inherent redundancy that increases the signal-to-noise ratio as the square root of the number of sensors examined. The representative amount of each probe-type in the array was dependent on their initial stock solution concentration, and as other sequences of interest arise, new microsphere elements can be added to arrays without altering the existing detection capabilities. The oligonucleotide probe sequences hybridize to fluorescently-labeled, complementary DNA target solutions. Fiber optic DNA microarray research has included DNA-protein interaction profiles, microbial strain differentiation, non-labeled target interrogation with molecular beacons, and single cell-based assays. This biosensor array is proficient in DNA detection linked to specific disease states, single nucleotide polymorphism (SNP's) discrimination, and gene expression analysis. This array platform permits multiple detection formats, provides smaller feature sizes, and enables sensor design flexibility. High-density fiber optic microarray biosensors provide a fast, reversible format with the detection limit of a few hundred molecules.
SPIM-fluid: open source light-sheet based platform for high-throughput imaging

PubMed Central

Gualda, Emilio J.; Pereira, Hugo; Vale, Tiago; Estrada, Marta Falcão; Brito, Catarina; Moreno, Nuno

2015-01-01

Light sheet fluorescence microscopy has recently emerged as the technique of choice for obtaining high quality 3D images of whole organisms/embryos with low photodamage and fast acquisition rates. Here we present an open source unified implementation based on Arduino and Micromanager, which is capable of operating Light Sheet Microscopes for automatized 3D high-throughput imaging on three-dimensional cell cultures and model organisms like zebrafish, oriented to massive drug screening. PMID:26601007
Ultrafast DNA sequencing on a microchip by a hybrid separation mechanism that gives 600 bases in 6.5 minutes.

PubMed

Fredlake, Christopher P; Hert, Daniel G; Kan, Cheuk-Wai; Chiesl, Thomas N; Root, Brian E; Forster, Ryan E; Barron, Annelise E

2008-01-15

To realize the immense potential of large-scale genomic sequencing after the completion of the second human genome (Venter's), the costs for the complete sequencing of additional genomes must be dramatically reduced. Among the technologies being developed to reduce sequencing costs, microchip electrophoresis is the only new technology ready to produce the long reads most suitable for the de novo sequencing and assembly of large and complex genomes. Compared with the current paradigm of capillary electrophoresis, microchip systems promise to reduce sequencing costs dramatically by increasing throughput, reducing reagent consumption, and integrating the many steps of the sequencing pipeline onto a single platform. Although capillary-based systems require approximately 70 min to deliver approximately 650 bases of contiguous sequence, we report sequencing up to 600 bases in just 6.5 min by microchip electrophoresis with a unique polymer matrix/adsorbed polymer wall coating combination. This represents a two-thirds reduction in sequencing time over any previously published chip sequencing result, with comparable read length and sequence quality. We hypothesize that these ultrafast long reads on chips can be achieved because the combined polymer system engenders a recently discovered "hybrid" mechanism of DNA electromigration, in which DNA molecules alternate rapidly between repeating through the intact polymer network and disrupting network entanglements to drag polymers through the solution, similar to dsDNA dynamics we observe in single-molecule DNA imaging studies. Most importantly, these results reveal the surprisingly powerful ability of microchip electrophoresis to provide ultrafast Sanger sequencing, which will translate to increased system throughput and reduced costs.
Ultrafast DNA sequencing on a microchip by a hybrid separation mechanism that gives 600 bases in 6.5 minutes

PubMed Central

Fredlake, Christopher P.; Hert, Daniel G.; Kan, Cheuk-Wai; Chiesl, Thomas N.; Root, Brian E.; Forster, Ryan E.; Barron, Annelise E.

2008-01-01

To realize the immense potential of large-scale genomic sequencing after the completion of the second human genome (Venter's), the costs for the complete sequencing of additional genomes must be dramatically reduced. Among the technologies being developed to reduce sequencing costs, microchip electrophoresis is the only new technology ready to produce the long reads most suitable for the de novo sequencing and assembly of large and complex genomes. Compared with the current paradigm of capillary electrophoresis, microchip systems promise to reduce sequencing costs dramatically by increasing throughput, reducing reagent consumption, and integrating the many steps of the sequencing pipeline onto a single platform. Although capillary-based systems require ≈70 min to deliver ≈650 bases of contiguous sequence, we report sequencing up to 600 bases in just 6.5 min by microchip electrophoresis with a unique polymer matrix/adsorbed polymer wall coating combination. This represents a two-thirds reduction in sequencing time over any previously published chip sequencing result, with comparable read length and sequence quality. We hypothesize that these ultrafast long reads on chips can be achieved because the combined polymer system engenders a recently discovered “hybrid” mechanism of DNA electromigration, in which DNA molecules alternate rapidly between reptating through the intact polymer network and disrupting network entanglements to drag polymers through the solution, similar to dsDNA dynamics we observe in single-molecule DNA imaging studies. Most importantly, these results reveal the surprisingly powerful ability of microchip electrophoresis to provide ultrafast Sanger sequencing, which will translate to increased system throughput and reduced costs. PMID:18184818
A robust robotic high-throughput antibody purification platform.

PubMed

Schmidt, Peter M; Abdo, Michael; Butcher, Rebecca E; Yap, Min-Yin; Scotney, Pierre D; Ramunno, Melanie L; Martin-Roussety, Genevieve; Owczarek, Catherine; Hardy, Matthew P; Chen, Chao-Guang; Fabri, Louis J

2016-07-15

Monoclonal antibodies (mAbs) have become the fastest growing segment in the drug market with annual sales of more than 40 billion US$ in 2013. The selection of lead candidate molecules involves the generation of large repertoires of antibodies from which to choose a final therapeutic candidate. Improvements in the ability to rapidly produce and purify many antibodies in sufficient quantities reduces the lead time for selection which ultimately impacts on the speed with which an antibody may transition through the research stage and into product development. Miniaturization and automation of chromatography using micro columns (RoboColumns(®) from Atoll GmbH) coupled to an automated liquid handling instrument (ALH; Freedom EVO(®) from Tecan) has been a successful approach to establish high throughput process development platforms. Recent advances in transient gene expression (TGE) using the high-titre Expi293F™ system have enabled recombinant mAb titres of greater than 500mg/L. These relatively high protein titres reduce the volume required to generate several milligrams of individual antibodies for initial biochemical and biological downstream assays, making TGE in the Expi293F™ system ideally suited to high throughput chromatography on an ALH. The present publication describes a novel platform for purifying Expi293F™-expressed recombinant mAbs directly from cell-free culture supernatant on a Perkin Elmer JANUS-VariSpan ALH equipped with a plate shuttle device. The purification platform allows automated 2-step purification (Protein A-desalting/size exclusion chromatography) of several hundred mAbs per week. The new robotic method can purify mAbs with high recovery (>90%) at sub-milligram level with yields of up to 2mg from 4mL of cell-free culture supernatant. Copyright © 2016 Elsevier B.V. All rights reserved.
High Throughput Transcriptomics @ USEPA (Toxicology ...

EPA Pesticide Factsheets

The ideal chemical testing approach will provide complete coverage of all relevant toxicological responses. It should be sensitive and specific It should identify the mechanism/mode-of-action (with dose-dependence). It should identify responses relevant to the species of interest. Responses should ideally be translated into tissue-, organ-, and organism-level effects. It must be economical and scalable. Using a High Throughput Transcriptomics platform within US EPA provides broader coverage of biological activity space and toxicological MOAs and helps fill the toxicological data gap. Slide presentation at the 2016 ToxForum on using High Throughput Transcriptomics at US EPA for broader coverage biological activity space and toxicological MOAs.
History, applications, and challenges of immune repertoire research.

PubMed

Liu, Xiao; Wu, Jinghua

2018-02-27

The diversity of T and B cells in terms of their receptor sequences is huge in the vertebrate's immune system and provides broad protection against the vast diversity of pathogens. Immune repertoire is defined as the sum of T cell receptors and B cell receptors (also named immunoglobulin) that makes the organism's adaptive immune system. Before the emergence of high-throughput sequencing, the studies on immune repertoire were limited by the underdeveloped methodologies, since it was impossible to capture the whole picture by the low-throughput tools. The massive paralleled sequencing technology suits perfectly the researches on immune repertoire. In this article, we review the history of immune repertoire studies, in terms of technologies and research applications. Particularly, we discuss several aspects of challenges in this field and highlight the efforts to develop potential solutions, in the era of high-throughput sequencing of the immune repertoire.
Simultaneous Measurements of Auto-Immune and Infectious Disease Specific Antibodies Using a High Throughput Multiplexing Tool

PubMed Central

Asati, Atul; Kachurina, Olga; Kachurin, Anatoly

2012-01-01

Considering importance of ganglioside antibodies as biomarkers in various immune-mediated neuropathies and neurological disorders, we developed a high throughput multiplexing tool for the assessment of gangliosides-specific antibodies based on Biolpex/Luminex platform. In this report, we demonstrate that the ganglioside high throughput multiplexing tool is robust, highly specific and demonstrating ∼100-fold higher concentration sensitivity for IgG detection than ELISA. In addition to the ganglioside-coated array, the high throughput multiplexing tool contains beads coated with influenza hemagglutinins derived from H1N1 A/Brisbane/59/07 and H1N1 A/California/07/09 strains. Influenza beads provided an added advantage of simultaneous detection of ganglioside- and influenza-specific antibodies, a capacity important for the assay of both infectious antigen-specific and autoimmune antibodies following vaccination or disease. Taken together, these results support the potential adoption of the ganglioside high throughput multiplexing tool for measuring ganglioside antibodies in various neuropathic and neurological disorders. PMID:22952605
Alkahest NuclearBLAST : a user-friendly BLAST management and analysis system

PubMed Central

Diener, Stephen E; Houfek, Thomas D; Kalat, Sam E; Windham, DE; Burke, Mark; Opperman, Charles; Dean, Ralph A

2005-01-01

Background - Sequencing of EST and BAC end datasets is no longer limited to large research groups. Drops in per-base pricing have made high throughput sequencing accessible to individual investigators. However, there are few options available which provide a free and user-friendly solution to the BLAST result storage and data mining needs of biologists. Results - Here we describe NuclearBLAST, a batch BLAST analysis, storage and management system designed for the biologist. It is a wrapper for NCBI BLAST which provides a user-friendly web interface which includes a request wizard and the ability to view and mine the results. All BLAST results are stored in a MySQL database which allows for more advanced data-mining through supplied command-line utilities or direct database access. NuclearBLAST can be installed on a single machine or clustered amongst a number of machines to improve analysis throughput. NuclearBLAST provides a platform which eases data-mining of multiple BLAST results. With the supplied scripts, the program can export data into a spreadsheet-friendly format, automatically assign Gene Ontology terms to sequences and provide bi-directional best hits between two datasets. Users with SQL experience can use the database to ask even more complex questions and extract any subset of data they require. Conclusion - This tool provides a user-friendly interface for requesting, viewing and mining of BLAST results which makes the management and data-mining of large sets of BLAST analyses tractable to biologists. PMID:15958161
Mapping the miRNA interactome by crosslinking ligation and sequencing of hybrids (CLASH)

PubMed Central

Helwak, Aleksandra; Tollervey, David

2014-01-01

RNA-RNA interactions play critical roles in many cellular processes but studying them is difficult and laborious. Here, we describe an experimental procedure, termed crosslinking ligation and sequencing of hybrids (CLASH), which allows high-throughput identification of sites of RNA-RNA interaction. During CLASH, a tagged bait protein is UV crosslinked in vivo to stabilise RNA interactions and purified under denaturing conditions. RNAs associated with the bait protein are partially truncated, and the ends of RNA-duplexes are ligated together. Following linker addition, cDNA library preparation and high-throughput sequencing, the ligated duplexes give rise to chimeric cDNAs, which unambiguously identify RNA-RNA interaction sites independent of bioinformatic predictions. This protocol is optimized for studying miRNA targets bound by Argonaute proteins, but should be easily adapted for other RNA-binding proteins and classes of RNA. The protocol requires around 5 days to complete, excluding the time required for high-throughput sequencing and bioinformatic analyses. PMID:24577361
Suspension arrays based on nanoparticle-encoded microspheres for high-throughput multiplexed detection

PubMed Central

Leng, Yuankui

2017-01-01

Spectrometrically or optically encoded microsphere based suspension array technology (SAT) is applicable to the high-throughput, simultaneous detection of multiple analytes within a small, single sample volume. Thanks to the rapid development of nanotechnology, tremendous progress has been made in the multiplexed detecting capability, sensitivity, and photostability of suspension arrays. In this review, we first focus on the current stock of nanoparticle-based barcodes as well as the manufacturing technologies required for their production. We then move on to discuss all existing barcode-based bioanalysis patterns, including the various labels used in suspension arrays, label-free platforms, signal amplification methods, and fluorescence resonance energy transfer (FRET)-based platforms. We then introduce automatic platforms for suspension arrays that use superparamagnetic nanoparticle-based microspheres. Finally, we summarize the current challenges and their proposed solutions, which are centered on improving encoding capacities, alternative probe possibilities, nonspecificity suppression, directional immobilization, and “point of care” platforms. Throughout this review, we aim to provide a comprehensive guide for the design of suspension arrays, with the goal of improving their performance in areas such as multiplexing capacity, throughput, sensitivity, and cost effectiveness. We hope that our summary on the state-of-the-art development of these arrays, our commentary on future challenges, and some proposed avenues for further advances will help drive the development of suspension array technology and its related fields. PMID:26021602
A powerful and flexible statistical framework for testing hypotheses of allele-specific gene expression from RNA-seq data

PubMed Central

Skelly, Daniel A.; Johansson, Marnie; Madeoy, Jennifer; Wakefield, Jon; Akey, Joshua M.

2011-01-01

Variation in gene expression is thought to make a significant contribution to phenotypic diversity among individuals within populations. Although high-throughput cDNA sequencing offers a unique opportunity to delineate the genome-wide architecture of regulatory variation, new statistical methods need to be developed to capitalize on the wealth of information contained in RNA-seq data sets. To this end, we developed a powerful and flexible hierarchical Bayesian model that combines information across loci to allow both global and locus-specific inferences about allele-specific expression (ASE). We applied our methodology to a large RNA-seq data set obtained in a diploid hybrid of two diverse Saccharomyces cerevisiae strains, as well as to RNA-seq data from an individual human genome. Our statistical framework accurately quantifies levels of ASE with specified false-discovery rates, achieving high reproducibility between independent sequencing platforms. We pinpoint loci that show unusual and biologically interesting patterns of ASE, including allele-specific alternative splicing and transcription termination sites. Our methodology provides a rigorous, quantitative, and high-resolution tool for profiling ASE across whole genomes. PMID:21873452
High Throughput Computing Impact on Meta Genomics (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

ScienceCinema

Gore, Brooklin

2018-02-01

This presentation includes a brief background on High Throughput Computing, correlating gene transcription factors, optical mapping, genotype to phenotype mapping via QTL analysis, and current work on next gen sequencing.
Low-Cost, High-Throughput Sequencing of DNA Assemblies Using a Highly Multiplexed Nextera Process.

PubMed

Shapland, Elaine B; Holmes, Victor; Reeves, Christopher D; Sorokin, Elena; Durot, Maxime; Platt, Darren; Allen, Christopher; Dean, Jed; Serber, Zach; Newman, Jack; Chandran, Sunil

2015-07-17

In recent years, next-generation sequencing (NGS) technology has greatly reduced the cost of sequencing whole genomes, whereas the cost of sequence verification of plasmids via Sanger sequencing has remained high. Consequently, industrial-scale strain engineers either limit the number of designs or take short cuts in quality control. Here, we show that over 4000 plasmids can be completely sequenced in one Illumina MiSeq run for less than $3 each (15× coverage), which is a 20-fold reduction over using Sanger sequencing (2× coverage). We reduced the volume of the Nextera tagmentation reaction by 100-fold and developed an automated workflow to prepare thousands of samples for sequencing. We also developed software to track the samples and associated sequence data and to rapidly identify correctly assembled constructs having the fewest defects. As DNA synthesis and assembly become a centralized commodity, this NGS quality control (QC) process will be essential to groups operating high-throughput pipelines for DNA construction.
A High-Throughput Automated Microfluidic Platform for Calcium Imaging of Taste Sensing.

PubMed

Hsiao, Yi-Hsing; Hsu, Chia-Hsien; Chen, Chihchen

2016-07-08

The human enteroendocrine L cell line NCI-H716, expressing taste receptors and taste signaling elements, constitutes a unique model for the studies of cellular responses to glucose, appetite regulation, gastrointestinal motility, and insulin secretion. Targeting these gut taste receptors may provide novel treatments for diabetes and obesity. However, NCI-H716 cells are cultured in suspension and tend to form multicellular aggregates, preventing high-throughput calcium imaging due to interferences caused by laborious immobilization and stimulus delivery procedures. Here, we have developed an automated microfluidic platform that is capable of trapping more than 500 single cells into microwells with a loading efficiency of 77% within two minutes, delivering multiple chemical stimuli and performing calcium imaging with enhanced spatial and temporal resolutions when compared to bath perfusion systems. Results revealed the presence of heterogeneity in cellular responses to the type, concentration, and order of applied sweet and bitter stimuli. Sucralose and denatonium benzoate elicited robust increases in the intracellular Ca(2+) concentration. However, glucose evoked a rapid elevation of intracellular Ca(2+) followed by reduced responses to subsequent glucose stimulation. Using Gymnema sylvestre as a blocking agent for the sweet taste receptor confirmed that different taste receptors were utilized for sweet and bitter tastes. This automated microfluidic platform is cost-effective, easy to fabricate and operate, and may be generally applicable for high-throughput and high-content single-cell analysis and drug screening.

Single-cell genome sequencing at ultra-high-throughput with microfluidic droplet barcoding.

PubMed

Lan, Freeman; Demaree, Benjamin; Ahmed, Noorsher; Abate, Adam R

2017-07-01

The application of single-cell genome sequencing to large cell populations has been hindered by technical challenges in isolating single cells during genome preparation. Here we present single-cell genomic sequencing (SiC-seq), which uses droplet microfluidics to isolate, fragment, and barcode the genomes of single cells, followed by Illumina sequencing of pooled DNA. We demonstrate ultra-high-throughput sequencing of >50,000 cells per run in a synthetic community of Gram-negative and Gram-positive bacteria and fungi. The sequenced genomes can be sorted in silico based on characteristic sequences. We use this approach to analyze the distributions of antibiotic-resistance genes, virulence factors, and phage sequences in microbial communities from an environmental sample. The ability to routinely sequence large populations of single cells will enable the de-convolution of genetic heterogeneity in diverse cell populations.
Is this the real time for genomics?

PubMed

Guarnaccia, Maria; Gentile, Giulia; Alessi, Enrico; Schneider, Claudio; Petralia, Salvatore; Cavallaro, Sebastiano

2014-01-01

In the last decades, molecular biology has moved from gene-by-gene analysis to more complex studies using a genome-wide scale. Thanks to high-throughput genomic technologies, such as microarrays and next-generation sequencing, a huge amount of information has been generated, expanding our knowledge on the genetic basis of various diseases. Although some of this information could be transferred to clinical diagnostics, the technologies available are not suitable for this purpose. In this review, we will discuss the drawbacks associated with the use of traditional DNA microarrays in diagnostics, pointing out emerging platforms that could overcome these obstacles and offer a more reproducible, qualitative and quantitative multigenic analysis. New miniaturized and automated devices, called Lab-on-Chip, begin to integrate PCR and microarray on the same platform, offering integrated sample-to-result systems. The introduction of this kind of innovative devices may facilitate the transition of genome-based tests into clinical routine. Copyright © 2014. Published by Elsevier Inc.
High-throughput sequencing of forensic genetic samples using punches of FTA cards with buccal swabs.

PubMed

Kampmann, Marie-Louise; Buchard, Anders; Børsting, Claus; Morling, Niels

2016-01-01

Here, we demonstrate that punches from buccal swab samples preserved on FTA cards can be used for high-throughput DNA sequencing, also known as massively parallel sequencing (MPS). We typed 44 reference samples with the HID-Ion AmpliSeq Identity Panel using washed 1.2 mm punches from FTA cards with buccal swabs and compared the results with those obtained with DNA extracted using the EZ1 DNA Investigator Kit. Concordant profiles were obtained for all samples. Our protocol includes simple punch, wash, and PCR steps, reducing cost and hands-on time in the laboratory. Furthermore, it facilitates automation of DNA sequencing.
Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis

PubMed Central

2012-01-01

Background The multiplexing becomes the major limitation of the next-generation sequencing (NGS) in application to low complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach only permits simultaneously analysis of up to several dozen samples. Results Here we introduce pair-barcode sequencing (PBS), an economic and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously discovered by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated and about 64% of them are assigned to both of the barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns are captured from the two clinical groups. The strong correlation using different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrates that PBS approach is valid. Conclusions By employing PBS approach in NGS, large-scale multiplexed pooled samples could be practically analyzed in parallel so that high-throughput sequencing economically meets the requirements of samples which are low sequencing throughput demand. PMID:22276739
Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis.

PubMed

Tu, Jing; Ge, Qinyu; Wang, Shengqin; Wang, Lei; Sun, Beili; Yang, Qi; Bai, Yunfei; Lu, Zuhong

2012-01-25

The multiplexing becomes the major limitation of the next-generation sequencing (NGS) in application to low complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach only permits simultaneously analysis of up to several dozen samples. Here we introduce pair-barcode sequencing (PBS), an economic and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously discovered by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated and about 64% of them are assigned to both of the barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns are captured from the two clinical groups. The strong correlation using different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrates that PBS approach is valid. By employing PBS approach in NGS, large-scale multiplexed pooled samples could be practically analyzed in parallel so that high-throughput sequencing economically meets the requirements of samples which are low sequencing throughput demand.
A High Throughput Model of Post-Traumatic Osteoarthritis using Engineered Cartilage Tissue Analogs

PubMed Central

Mohanraj, Bhavana; Meloni, Gregory R.; Mauck, Robert L.; Dodge, George R.

2014-01-01

(1) Objective A number of in vitro models of post-traumatic osteoarthritis (PTOA) have been developed to study the effect of mechanical overload on the processes that regulate cartilage degeneration. While such frameworks are critical for the identification therapeutic targets, existing technologies are limited in their throughput capacity. Here, we validate a test platform for high-throughput mechanical injury incorporating engineered cartilage. (2) Method We utilized a high throughput mechanical testing platform to apply injurious compression to engineered cartilage and determined their strain and strain rate dependent responses to injury. Next, we validated this response by applying the same injury conditions to cartilage explants. Finally, we conducted a pilot screen of putative PTOA therapeutic compounds. (3) Results Engineered cartilage response to injury was strain dependent, with a 2-fold increase in GAG loss at 75% compared to 50% strain. Extensive cell death was observed adjacent to fissures, with membrane rupture corroborated by marked increases in LDH release. Testing of established PTOA therapeutics showed that pan-caspase inhibitor (ZVF) was effective at reducing cell death, while the amphiphilic polymer (P188) and the free-radical scavenger (NAC) reduced GAG loss as compared to injury alone. (4) Conclusions The injury response in this engineered cartilage model replicated key features of the response from cartilage explants, validating this system for application of physiologically relevant injurious compression. This study establishes a novel tool for the discovery of mechanisms governing cartilage injury, as well as a screening platform for the identification of new molecules for the treatment of PTOA. PMID:24999113
High-Throughput Nano-Biofilm Microarray for Antifungal Drug Discovery

DTIC Science & Technology

2013-06-25

High-Throughput Nano-Biofilm Microarray for Antifungal Drug Discovery Anand Srinivasan,a, c Kai P. Leung,d Jose L. Lopez-Ribot,b, c Anand K...Ramasubramaniana, c Departments of Biomedical Engineeringa and Biologyb and South Texas Center for Emerging Infectious Diseases, c The University of Texas at San...of the opportunistic fungal pathogen Candida albicans on a microarray platform. The mi- croarray consists of 1,200 individual cultures of 30 nl of C
Identification and characterization of EBV genomes in spontaneously immortalized human peripheral blood B lymphocytes by NGS technology.

PubMed

Lei, Haiyan; Li, Tianwei; Hung, Guo-Chiuan; Li, Bingjie; Tsai, Shien; Lo, Shyh-Ching

2013-11-19

We conducted genomic sequencing to identify Epstein Barr Virus (EBV) genomes in 2 human peripheral blood B lymphocytes that underwent spontaneous immortalization promoted by mycoplasma infections in culture, using the high-throughput sequencing (HTS) Illumina MiSeq platform. The purpose of this study was to examine if rapid detection and characterization of a viral agent could be effectively achieved by HTS using a platform that has become readily available in general biology laboratories. Raw read sequences, averaging 175 bps in length, were mapped with DNA databases of human, bacteria, fungi and virus genomes using the CLC Genomics Workbench bioinformatics tool. Overall 37,757 out of 49,520,834 total reads in one lymphocyte line (# K4413-Mi) and 28,178 out of 45,335,960 reads in the other lymphocyte line (# K4123-Mi) were identified as EBV sequences. The two EBV genomes with estimated 35.22-fold and 31.06-fold sequence coverage respectively, designated K4413-Mi EBV and K4123-Mi EBV (GenBank accession number KC440852 and KC440851 respectively), are characteristic of type-1 EBV. Sequence comparison and phylogenetic analysis among K4413-Mi EBV, K4123-Mi EBV and the EBV genomes previously reported to GenBank as well as the NA12878 EBV genome assembled from database of the 1000 Genome Project showed that these 2 EBVs are most closely related to B95-8, an EBV previously isolated from a patient with infectious mononucleosis and WT-EBV. They are less similar to EBVs associated with nasopharyngeal carcinoma (NPC) from Hong Kong and China as well as the Akata strain of a case of Burkitt's lymphoma from Japan. They are most different from type 2 EBV found in Western African Burkitt's lymphoma.
EZH2 and CD79B mutational status over time in B-cell non-Hodgkin lymphomas detected by high-throughput sequencing using minimal samples

PubMed Central

Saieg, Mauro Ajaj; Geddie, William R; Boerner, Scott L; Bailey, Denis; Crump, Michael; da Cunha Santos, Gilda

2013-01-01

BACKGROUND: Numerous genomic abnormalities in B-cell non-Hodgkin lymphomas (NHLs) have been revealed by novel high-throughput technologies, including recurrent mutations in EZH2 (enhancer of zeste homolog 2) and CD79B (B cell antigen receptor complex-associated protein beta chain) genes. This study sought to determine the evolution of the mutational status of EZH2 and CD79B over time in different samples from the same patient in a cohort of B-cell NHLs, through use of a customized multiplex mutation assay. METHODS: DNA that was extracted from cytological material stored on FTA cards as well as from additional specimens, including archived frozen and formalin-fixed histological specimens, archived stained smears, and cytospin preparations, were submitted to a multiplex mutation assay specifically designed for the detection of point mutations involving EZH2 and CD79B, using MassARRAY spectrometry followed by Sanger sequencing. RESULTS: All 121 samples from 80 B-cell NHL cases were successfully analyzed. Mutations in EZH2 (Y646) and CD79B (Y196) were detected in 13.2% and 8% of the samples, respectively, almost exclusively in follicular lymphomas and diffuse large B-cell lymphomas. In one-third of the positive cases, a wild type was detected in a different sample from the same patient during follow-up. CONCLUSIONS: Testing multiple minimal tissue samples using a high-throughput multiplex platform exponentially increases tissue availability for molecular analysis and might facilitate future studies of tumor progression and the related molecular events. Mutational status of EZH2 and CD79B may vary in B-cell NHL samples over time and support the concept that individualized therapy should be based on molecular findings at the time of treatment, rather than on results obtained from previous specimens. Cancer (Cancer Cytopathol) 2013;121:377–386. © 2013 American Cancer Society. PMID:23361872
High throughput on-chip analysis of high-energy charged particle tracks using lensfree imaging

DOE Office of Scientific and Technical Information (OSTI.GOV)

Luo, Wei; Shabbir, Faizan; Gong, Chao

2015-04-13

We demonstrate a high-throughput charged particle analysis platform, which is based on lensfree on-chip microscopy for rapid ion track analysis using allyl diglycol carbonate, i.e., CR-39 plastic polymer as the sensing medium. By adopting a wide-area opto-electronic image sensor together with a source-shifting based pixel super-resolution technique, a large CR-39 sample volume (i.e., 4 cm × 4 cm × 0.1 cm) can be imaged in less than 1 min using a compact lensfree on-chip microscope, which detects partially coherent in-line holograms of the ion tracks recorded within the CR-39 detector. After the image capture, using highly parallelized reconstruction and ion track analysis algorithms running on graphics processingmore » units, we reconstruct and analyze the entire volume of a CR-39 detector within ∼1.5 min. This significant reduction in the entire imaging and ion track analysis time not only increases our throughput but also allows us to perform time-resolved analysis of the etching process to monitor and optimize the growth of ion tracks during etching. This computational lensfree imaging platform can provide a much higher throughput and more cost-effective alternative to traditional lens-based scanning optical microscopes for ion track analysis using CR-39 and other passive high energy particle detectors.« less
YAMAT-seq: an efficient method for high-throughput sequencing of mature transfer RNAs.

PubMed

Shigematsu, Megumi; Honda, Shozo; Loher, Phillipe; Telonis, Aristeidis G; Rigoutsos, Isidore; Kirino, Yohei

2017-05-19

Besides translation, transfer RNAs (tRNAs) play many non-canonical roles in various biological pathways and exhibit highly variable expression profiles. To unravel the emerging complexities of tRNA biology and molecular mechanisms underlying them, an efficient tRNA sequencing method is required. However, the rigid structure of tRNA has been presenting a challenge to the development of such methods. We report the development of Y-shaped Adapter-ligated MAture TRNA sequencing (YAMAT-seq), an efficient and convenient method for high-throughput sequencing of mature tRNAs. YAMAT-seq circumvents the issue of inefficient adapter ligation, a characteristic of conventional RNA sequencing methods for mature tRNAs, by employing the efficient and specific ligation of Y-shaped adapter to mature tRNAs using T4 RNA Ligase 2. Subsequent cDNA amplification and next-generation sequencing successfully yield numerous mature tRNA sequences. YAMAT-seq has high specificity for mature tRNAs and high sensitivity to detect most isoacceptors from minute amount of total RNA. Moreover, YAMAT-seq shows quantitative capability to estimate expression levels of mature tRNAs, and has high reproducibility and broad applicability for various cell lines. YAMAT-seq thus provides high-throughput technique for identifying tRNA profiles and their regulations in various transcriptomes, which could play important regulatory roles in translation and other biological processes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Xi-CAM v1.2.3

DOE Office of Scientific and Technical Information (OSTI.GOV)

PANDOLFI, RONALD; KUMAR, DINESH; VENKATAKRISHNAN, SINGANALLUR

Xi-CAM aims to provide a community driven platform for multimodal analysis in synchrotron science. The platform core provides a robust plugin infrastructure for extensibility, allowing continuing development to simply add further functionality. Current modules include tools for characterization with (GI)SAXS, Tomography, and XAS. This will continue to serve as a development base as algorithms for multimodal analysis develop. Seamless remote data access, visualization and analysis are key elements of Xi-CAM, and will become critical to synchrotron data infrastructure as expectations for future data volume and acquisition rates rise with continuously increasing throughputs. The highly interactive design elements of Xi-cam willmore » similarly support a generation of users which depend on immediate data quality feedback during high-throughput or burst acquisition modes.« less
ElectroTaxis-on-a-Chip (ETC): an integrated quantitative high-throughput screening platform for electrical field-directed cell migration.

PubMed

Zhao, Siwei; Zhu, Kan; Zhang, Yan; Zhu, Zijie; Xu, Zhengping; Zhao, Min; Pan, Tingrui

2014-11-21

Both endogenous and externally applied electrical stimulation can affect a wide range of cellular functions, including growth, migration, differentiation and division. Among those effects, the electrical field (EF)-directed cell migration, also known as electrotaxis, has received broad attention because it holds great potential in facilitating clinical wound healing. Electrotaxis experiment is conventionally conducted in centimetre-sized flow chambers built in Petri dishes. Despite the recent efforts to adapt microfluidics for electrotaxis studies, the current electrotaxis experimental setup is still cumbersome due to the needs of an external power supply and EF controlling/monitoring systems. There is also a lack of parallel experimental systems for high-throughput electrotaxis studies. In this paper, we present a first independently operable microfluidic platform for high-throughput electrotaxis studies, integrating all functional components for cell migration under EF stimulation (except microscopy) on a compact footprint (the same as a credit card), referred to as ElectroTaxis-on-a-Chip (ETC). Inspired by the R-2R resistor ladder topology in digital signal processing, we develop a systematic approach to design an infinitely expandable microfluidic generator of EF gradients for high-throughput and quantitative studies of EF-directed cell migration. Furthermore, a vacuum-assisted assembly method is utilized to allow direct and reversible attachment of our device to existing cell culture media on biological surfaces, which separates the cell culture and device preparation/fabrication steps. We have demonstrated that our ETC platform is capable of screening human cornea epithelial cell migration under the stimulation of an EF gradient spanning over three orders of magnitude. The screening results lead to the identification of the EF-sensitive range of that cell type, which can provide valuable guidance to the clinical application of EF-facilitated wound healing.
Size Matters: Assessing Optimum Soil Sample Size for Fungal and Bacterial Community Structure Analyses Using High Throughput Sequencing of rRNA Gene Amplicons

DOE PAGES

Penton, C. Ryan; Gupta, Vadakattu V. S. R.; Yu, Julian; ...

2016-06-02

We examined the effect of different soil sample sizes obtained from an agricultural field, under a single cropping system uniform in soil properties and aboveground crop responses, on bacterial and fungal community structure and microbial diversity indices. DNA extracted from soil sample sizes of 0.25, 1, 5, and 10 g using MoBIO kits and from 10 and 100 g sizes using a bead-beating method (SARDI) were used as templates for high-throughput sequencing of 16S and 28S rRNA gene amplicons for bacteria and fungi, respectively, on the Illumina MiSeq and Roche 454 platforms. Sample size significantly affected overall bacterial and fungalmore » community structure, replicate dispersion and the number of operational taxonomic units (OTUs) retrieved. Richness, evenness and diversity were also significantly affected. The largest diversity estimates were always associated with the 10 g MoBIO extractions with a corresponding reduction in replicate dispersion. For the fungal data, smaller MoBIO extractions identified more unclassified Eukaryota incertae sedis and unclassified glomeromycota while the SARDI method retrieved more abundant OTUs containing unclassified Pleosporales and the fungal genera Alternaria and Cercophora. Overall, these findings indicate that a 10 g soil DNA extraction is most suitable for both soil bacterial and fungal communities for retrieving optimal diversity while still capturing rarer taxa in concert with decreasing replicate variation.« less
Size Matters: Assessing Optimum Soil Sample Size for Fungal and Bacterial Community Structure Analyses Using High Throughput Sequencing of rRNA Gene Amplicons

DOE Office of Scientific and Technical Information (OSTI.GOV)

Penton, C. Ryan; Gupta, Vadakattu V. S. R.; Yu, Julian

We examined the effect of different soil sample sizes obtained from an agricultural field, under a single cropping system uniform in soil properties and aboveground crop responses, on bacterial and fungal community structure and microbial diversity indices. DNA extracted from soil sample sizes of 0.25, 1, 5, and 10 g using MoBIO kits and from 10 and 100 g sizes using a bead-beating method (SARDI) were used as templates for high-throughput sequencing of 16S and 28S rRNA gene amplicons for bacteria and fungi, respectively, on the Illumina MiSeq and Roche 454 platforms. Sample size significantly affected overall bacterial and fungalmore » community structure, replicate dispersion and the number of operational taxonomic units (OTUs) retrieved. Richness, evenness and diversity were also significantly affected. The largest diversity estimates were always associated with the 10 g MoBIO extractions with a corresponding reduction in replicate dispersion. For the fungal data, smaller MoBIO extractions identified more unclassified Eukaryota incertae sedis and unclassified glomeromycota while the SARDI method retrieved more abundant OTUs containing unclassified Pleosporales and the fungal genera Alternaria and Cercophora. Overall, these findings indicate that a 10 g soil DNA extraction is most suitable for both soil bacterial and fungal communities for retrieving optimal diversity while still capturing rarer taxa in concert with decreasing replicate variation.« less
Accelerating the design of solar thermal fuel materials through high throughput simulations.

PubMed

Liu, Yun; Grossman, Jeffrey C

2014-12-10

Solar thermal fuels (STF) store the energy of sunlight, which can then be released later in the form of heat, offering an emission-free and renewable solution for both solar energy conversion and storage. However, this approach is currently limited by the lack of low-cost materials with high energy density and high stability. In this Letter, we present an ab initio high-throughput computational approach to accelerate the design process and allow for searches over a broad class of materials. The high-throughput screening platform we have developed can run through large numbers of molecules composed of earth-abundant elements and identifies possible metastable structures of a given material. Corresponding isomerization enthalpies associated with the metastable structures are then computed. Using this high-throughput simulation approach, we have discovered molecular structures with high isomerization enthalpies that have the potential to be new candidates for high-energy density STF. We have also discovered physical principles to guide further STF materials design through structural analysis. More broadly, our results illustrate the potential of using high-throughput ab initio simulations to design materials that undergo targeted structural transitions.
Systematic Optimization of Long Gradient Chromatography Mass Spectrometry for Deep Analysis of Brain Proteome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Hong; Yang, Yanling; Li, Yuxin

2015-02-06

Development of high resolution liquid chromatography (LC) is essential for improving the sensitivity and throughput of mass spectrometry (MS)-based proteomics. Here we present systematic optimization of a long gradient LC-MS/MS platform to enhance protein identification from a complex mixture. The platform employed an in-house fabricated, reverse phase column (100 μm x 150 cm) coupled with Q Exactive MS. The column was capable of achieving a peak capacity of approximately 700 in a 720 min gradient of 10-45% acetonitrile. The optimal loading level was about 6 micrograms of peptides, although the column allowed loading as many as 20 micrograms. Gas phasemore » fractionation of peptide ions further increased the number of peptide identification by ~10%. Moreover, the combination of basic pH LC pre-fractionation with the long gradient LC-MS/MS platform enabled the identification of 96,127 peptides and 10,544 proteins at 1% protein false discovery rate in a postmortem brain sample of Alzheimer’s disease. As deep RNA sequencing of the same specimen suggested that ~16,000 genes were expressed, current analysis covered more than 60% of the expressed proteome. Further improvement strategies of the LC/LC-MS/MS platform were also discussed.« less
Year 2 Report: Protein Function Prediction Platform

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhou, C E

2012-04-27

Upon completion of our second year of development in a 3-year development cycle, we have completed a prototype protein structure-function annotation and function prediction system: Protein Function Prediction (PFP) platform (v.0.5). We have met our milestones for Years 1 and 2 and are positioned to continue development in completion of our original statement of work, or a reasonable modification thereof, in service to DTRA Programs involved in diagnostics and medical countermeasures research and development. The PFP platform is a multi-scale computational modeling system for protein structure-function annotation and function prediction. As of this writing, PFP is the only existing fullymore » automated, high-throughput, multi-scale modeling, whole-proteome annotation platform, and represents a significant advance in the field of genome annotation (Fig. 1). PFP modules perform protein functional annotations at the sequence, systems biology, protein structure, and atomistic levels of biological complexity (Fig. 2). Because these approaches provide orthogonal means of characterizing proteins and suggesting protein function, PFP processing maximizes the protein functional information that can currently be gained by computational means. Comprehensive annotation of pathogen genomes is essential for bio-defense applications in pathogen characterization, threat assessment, and medical countermeasure design and development in that it can short-cut the time and effort required to select and characterize protein biomarkers.« less
High-throughput analysis of the protein sequence-stability landscape using a quantitative "yeast surface two-hybrid" system and fragment reconstitution

PubMed Central

Dutta, Sanjib; Koide, Akiko; Koide, Shohei

2008-01-01

Stability evaluation of many mutants can lead to a better understanding of the sequence determinants of a structural motif and of factors governing protein stability and protein evolution. The traditional biophysical analysis of protein stability is low throughput, limiting our ability to widely explore the sequence space in a quantitative manner. In this study, we have developed a high-throughput library screening method for quantifying stability changes, which is based on protein fragment reconstitution and yeast surface display. Our method exploits the thermodynamic linkage between protein stability and fragment reconstitution and the ability of the yeast surface display technique to quantitatively evaluate protein-protein interactions. The method was applied to a fibronectin type III (FN3) domain. Characterization of fragment reconstitution was facilitated by the co-expression of two FN3 fragments, thus establishing a "yeast surface two-hybrid" method. Importantly, our method does not rely on competition between clones and thus eliminates a common limitation of high-throughput selection methods in which the most stable variants are predominantly recovered. Thus, it allows for the isolation of sequences that exhibits a desired level of stability. We identified over one hundred unique sequences for a β-bulge motif, which was significantly more informative than natural sequences of the FN3 family in revealing the sequence determinants for the β-bulge. Our method provides a powerful means to rapidly assess stability of many variants, to systematically assess contribution of different factors to protein stability and to enhance protein stability. PMID:18674545
Towards High-throughput Immunomics for Infectious Diseases: Use of Next-generation Peptide Microarrays for Rapid Discovery and Mapping of Antigenic Determinants*

PubMed Central

Carmona, Santiago J.; Nielsen, Morten; Schafer-Nielsen, Claus; Mucci, Juan; Altcheh, Jaime; Balouz, Virginia; Tekiel, Valeria; Frasch, Alberto C.; Campetella, Oscar; Buscaglia, Carlos A.; Agüero, Fernán

2015-01-01

Complete characterization of antibody specificities associated to natural infections is expected to provide a rich source of serologic biomarkers with potential applications in molecular diagnosis, follow-up of chemotherapeutic treatments, and prioritization of targets for vaccine development. Here, we developed a highly-multiplexed platform based on next-generation high-density peptide microarrays to map these specificities in Chagas Disease, an exemplar of a human infectious disease caused by the protozoan Trypanosoma cruzi. We designed a high-density peptide microarray containing more than 175,000 overlapping 15mer peptides derived from T. cruzi proteins. Peptides were synthesized in situ on microarray slides, spanning the complete length of 457 parasite proteins with fully overlapped 15mers (1 residue shift). Screening of these slides with antibodies purified from infected patients and healthy donors demonstrated both a high technical reproducibility as well as epitope mapping consistency when compared with earlier low-throughput technologies. Using a conservative signal threshold to classify positive (reactive) peptides we identified 2,031 disease-specific peptides and 97 novel parasite antigens, effectively doubling the number of known antigens and providing a 10-fold increase in the number of fine mapped antigenic determinants for this disease. Finally, further analysis of the chip data showed that optimizing the amount of sequence overlap of displayed peptides can increase the protein space covered in a single chip by at least ∼threefold without sacrificing sensitivity. In conclusion, we show the power of high-density peptide chips for the discovery of pathogen-specific linear B-cell epitopes from clinical samples, thus setting the stage for high-throughput biomarker discovery screenings and proteome-wide studies of immune responses against pathogens. PMID:25922409

Towards High-throughput Immunomics for Infectious Diseases: Use of Next-generation Peptide Microarrays for Rapid Discovery and Mapping of Antigenic Determinants.

PubMed

Carmona, Santiago J; Nielsen, Morten; Schafer-Nielsen, Claus; Mucci, Juan; Altcheh, Jaime; Balouz, Virginia; Tekiel, Valeria; Frasch, Alberto C; Campetella, Oscar; Buscaglia, Carlos A; Agüero, Fernán

2015-07-01

Complete characterization of antibody specificities associated to natural infections is expected to provide a rich source of serologic biomarkers with potential applications in molecular diagnosis, follow-up of chemotherapeutic treatments, and prioritization of targets for vaccine development. Here, we developed a highly-multiplexed platform based on next-generation high-density peptide microarrays to map these specificities in Chagas Disease, an exemplar of a human infectious disease caused by the protozoan Trypanosoma cruzi. We designed a high-density peptide microarray containing more than 175,000 overlapping 15 mer peptides derived from T. cruzi proteins. Peptides were synthesized in situ on microarray slides, spanning the complete length of 457 parasite proteins with fully overlapped 15 mers (1 residue shift). Screening of these slides with antibodies purified from infected patients and healthy donors demonstrated both a high technical reproducibility as well as epitope mapping consistency when compared with earlier low-throughput technologies. Using a conservative signal threshold to classify positive (reactive) peptides we identified 2,031 disease-specific peptides and 97 novel parasite antigens, effectively doubling the number of known antigens and providing a 10-fold increase in the number of fine mapped antigenic determinants for this disease. Finally, further analysis of the chip data showed that optimizing the amount of sequence overlap of displayed peptides can increase the protein space covered in a single chip by at least ∼ threefold without sacrificing sensitivity. In conclusion, we show the power of high-density peptide chips for the discovery of pathogen-specific linear B-cell epitopes from clinical samples, thus setting the stage for high-throughput biomarker discovery screenings and proteome-wide studies of immune responses against pathogens. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
HTS techniques for patch clamp-based ion channel screening - advances and economy.

PubMed

Farre, Cecilia; Fertig, Niels

2012-06-01

Ten years ago, the first publication appeared showing patch clamp recordings performed on a planar glass chip instead of using a conventional patch clamp pipette. "Going planar" proved to revolutionize ion channel drug screening as we know it, by allowing high quality measurements of ion channels and their effectors at a higher throughput and at the same time de-skilling the highly laborious technique. Over the years, platforms evolved in response to user requirements regarding experimental features, data handling plus storage, and suitable target diversity. This article gives a snapshot image of patch clamp-based ion channel screening with focus on platforms developed to meet requirements of high-throughput screening environments. The commercially available platforms are described, along with their benefits and drawbacks in ion channel drug screening. Automated patch clamp (APC) platforms allow faster investigation of a larger number of ion channel active compounds or cell clones than previously possible. Since patch clamp is the only method allowing direct, real-time measurements of ion channel activity, APC holds the promise of picking up high quality leads, where they otherwise would have been overseen using indirect methods. In addition, drug candidate safety profiling can be performed earlier in the drug discovery process, avoiding late-phase compound withdrawal due to safety liability issues, which is highly costly and inefficient.
Simultaneous co-detection of wild-type and vaccine strain measles virus using the BD MAX system.

PubMed

Thapa, Kiran; Ellem, Justin A; Basile, Kerri; Carter, Ian; Olma, Tom; Chen, Sharon C-A; Dwyer, Dominic E; Kok, Jen

2018-06-01

Despite the reported elimination of measles virus in Australia, importation of cases from endemic countries continues to lead to secondary local transmission and outbreaks. Rapid laboratory confirmation of measles is paramount for individual patient management and outbreak responses. Further, it is important to rapidly distinguish infection from wild-type virus or vaccine strains to guide public health responses. We developed a high throughput, TaqMan-based multiplex reverse-transcription-polymerase chain reaction (PCR) assay using the BD MAX platform (Becton Dickinson) that simultaneously detects measles virus and differentiates between wild-type and vaccine strains without the need for sequencing. Copyright © 2018 Royal College of Pathologists of Australasia. Published by Elsevier B.V. All rights reserved.
Functional genomic analysis of drug sensitivity pathways to guide adjuvant strategies in breast cancer

PubMed Central

Swanton, Charles; Szallasi, Zoltan; Brenton, James D; Downward, Julian

2008-01-01

The widespread introduction of high throughput RNA interference screening technology has revealed tumour drug sensitivity pathways to common cytotoxics such as paclitaxel, doxorubicin and 5-fluorouracil, targeted agents such as trastuzumab and inhibitors of AKT and Poly(ADP-ribose) polymerase (PARP) as well as endocrine therapies such as tamoxifen. Given the limited power of microarray signatures to predict therapeutic response in associative studies of small clinical trial cohorts, the use of functional genomic data combined with expression or sequence analysis of genes and microRNAs implicated in drug response in human tumours may provide a more robust method to guide adjuvant treatment strategies in breast cancer that are transferable across different expression platforms and patient cohorts. PMID:18986507
Lensless on-chip imaging of cells provides a new tool for high-throughput cell-biology and medical diagnostics.

PubMed

Mudanyali, Onur; Erlinger, Anthony; Seo, Sungkyu; Su, Ting-Wei; Tseng, Derek; Ozcan, Aydogan

2009-12-14

Conventional optical microscopes image cells by use of objective lenses that work together with other lenses and optical components. While quite effective, this classical approach has certain limitations for miniaturization of the imaging platform to make it compatible with the advanced state of the art in microfluidics. In this report, we introduce experimental details of a lensless on-chip imaging concept termed LUCAS (Lensless Ultra-wide field-of-view Cell monitoring Array platform based on Shadow imaging) that does not require any microscope objectives or other bulky optical components to image a heterogeneous cell solution over an ultra-wide field of view that can span as large as approximately 18 cm(2). Moreover, unlike conventional microscopes, LUCAS can image a heterogeneous cell solution of interest over a depth-of-field of approximately 5 mm without the need for refocusing which corresponds to up to approximately 9 mL sample volume. This imaging platform records the shadows (i.e., lensless digital holograms) of each cell of interest within its field of view, and automated digital processing of these cell shadows can determine the type, the count and the relative positions of cells within the solution. Because it does not require any bulky optical components or mechanical scanning stages it offers a significantly miniaturized platform that at the same time reduces the cost, which is quite important for especially point of care diagnostic tools. Furthermore, the imaging throughput of this platform is orders of magnitude better than conventional optical microscopes, which could be exceedingly valuable for high-throughput cell-biology experiments.
Lensless On-chip Imaging of Cells Provides a New Tool for High-throughput Cell-Biology and Medical Diagnostics

PubMed Central

Mudanyali, Onur; Erlinger, Anthony; Seo, Sungkyu; Su, Ting-Wei; Tseng, Derek; Ozcan, Aydogan

2009-01-01

Conventional optical microscopes image cells by use of objective lenses that work together with other lenses and optical components. While quite effective, this classical approach has certain limitations for miniaturization of the imaging platform to make it compatible with the advanced state of the art in microfluidics. In this report, we introduce experimental details of a lensless on-chip imaging concept termed LUCAS (Lensless Ultra-wide field-of-view Cell monitoring Array platform based on Shadow imaging) that does not require any microscope objectives or other bulky optical components to image a heterogeneous cell solution over an ultra-wide field of view that can span as large as ~18 cm2. Moreover, unlike conventional microscopes, LUCAS can image a heterogeneous cell solution of interest over a depth-of-field of ~5 mm without the need for refocusing which corresponds to up to ~9 mL sample volume. This imaging platform records the shadows (i.e., lensless digital holograms) of each cell of interest within its field of view, and automated digital processing of these cell shadows can determine the type, the count and the relative positions of cells within the solution. Because it does not require any bulky optical components or mechanical scanning stages it offers a significantly miniaturized platform that at the same time reduces the cost, which is quite important for especially point of care diagnostic tools. Furthermore, the imaging throughput of this platform is orders of magnitude better than conventional optical microscopes, which could be exceedingly valuable for high-throughput cell-biology experiments. PMID:20010542
Comparison of mapping algorithms used in high-throughput sequencing: application to Ion Torrent data

PubMed Central

2014-01-01

Background The rapid evolution in high-throughput sequencing (HTS) technologies has opened up new perspectives in several research fields and led to the production of large volumes of sequence data. A fundamental step in HTS data analysis is the mapping of reads onto reference sequences. Choosing a suitable mapper for a given technology and a given application is a subtle task because of the difficulty of evaluating mapping algorithms. Results In this paper, we present a benchmark procedure to compare mapping algorithms used in HTS using both real and simulated datasets and considering four evaluation criteria: computational resource and time requirements, robustness of mapping, ability to report positions for reads in repetitive regions, and ability to retrieve true genetic variation positions. To measure robustness, we introduced a new definition for a correctly mapped read taking into account not only the expected start position of the read but also the end position and the number of indels and substitutions. We developed CuReSim, a new read simulator, that is able to generate customized benchmark data for any kind of HTS technology by adjusting parameters to the error types. CuReSim and CuReSimEval, a tool to evaluate the mapping quality of the CuReSim simulated reads, are freely available. We applied our benchmark procedure to evaluate 14 mappers in the context of whole genome sequencing of small genomes with Ion Torrent data for which such a comparison has not yet been established. Conclusions A benchmark procedure to compare HTS data mappers is introduced with a new definition for the mapping correctness as well as tools to generate simulated reads and evaluate mapping quality. The application of this procedure to Ion Torrent data from the whole genome sequencing of small genomes has allowed us to validate our benchmark procedure and demonstrate that it is helpful for selecting a mapper based on the intended application, questions to be addressed, and the technology used. This benchmark procedure can be used to evaluate existing or in-development mappers as well as to optimize parameters of a chosen mapper for any application and any sequencing platform. PMID:24708189
Microbial communities and their predicted metabolic characteristics in deep fracture groundwaters of the crystalline bedrock at Olkiluoto, Finland

NASA Astrophysics Data System (ADS)

Bomberg, Malin; Lamminmäki, Tiina; Itävaara, Merja

2016-11-01

The microbial diversity in oligotrophic isolated crystalline Fennoscandian Shield bedrock fracture groundwaters is high, but the core community has not been identified. Here we characterized the bacterial and archaeal communities in 12 water conductive fractures situated at depths between 296 and 798 m by high throughput amplicon sequencing using the Illumina HiSeq platform. Between 1.7 × 104 and 1.2 × 106 bacterial or archaeal sequence reads per sample were obtained. These sequences revealed that up to 95 and 99 % of the bacterial and archaeal sequences obtained from the 12 samples, respectively, belonged to only a few common species, i.e. the core microbiome. However, the remaining rare microbiome contained over 3- and 6-fold more bacterial and archaeal taxa. The metabolic properties of the microbial communities were predicted using PICRUSt. The approximate estimation showed that the metabolic pathways commonly included fermentation, fatty acid oxidation, glycolysis/gluconeogenesis, oxidative phosphorylation, and methanogenesis/anaerobic methane oxidation, but carbon fixation through the Calvin cycle, reductive TCA cycle, and the Wood-Ljungdahl pathway was also predicted. The rare microbiome is an unlimited source of genomic functionality in all ecosystems. It may consist of remnants of microbial communities prevailing in earlier environmental conditions, but could also be induced again if changes in their living conditions occur.
Phylogenetic analysis of higher-level relationships within Hydroidolina (Cnidaria: Hydrozoa) using mitochondrial genome data and insight into their mitochondrial transcription.

PubMed

Kayal, Ehsan; Bentlage, Bastian; Cartwright, Paulyn; Yanagihara, Angel A; Lindsay, Dhugal J; Hopcroft, Russell R; Collins, Allen G

2015-01-01

Hydrozoans display the most morphological diversity within the phylum Cnidaria. While recent molecular studies have provided some insights into their evolutionary history, sister group relationships remain mostly unresolved, particularly at mid-taxonomic levels. Specifically, within Hydroidolina, the most speciose hydrozoan subclass, the relationships and sometimes integrity of orders are highly unsettled. Here we obtained the near complete mitochondrial sequence of twenty-six hydroidolinan hydrozoan species from a range of sources (DNA and RNA-seq data, long-range PCR). Our analyses confirm previous inference of the evolution of mtDNA in Hydrozoa while introducing a novel genome organization. Using RNA-seq data, we propose a mechanism for the expression of mitochondrial mRNA in Hydroidolina that can be extrapolated to the other medusozoan taxa. Phylogenetic analyses using the full set of mitochondrial gene sequences provide some insights into the order-level relationships within Hydroidolina, including siphonophores as the first diverging clade, a well-supported clade comprised of Leptothecata-Filifera III-IV, and a second clade comprised of Aplanulata-Capitata s.s.-Filifera I-II. Finally, we describe our relatively inexpensive and accessible multiplexing strategy to sequence long-range PCR amplicons that can be adapted to most high-throughput sequencing platforms.
Phylogenetic analysis of higher-level relationships within Hydroidolina (Cnidaria: Hydrozoa) using mitochondrial genome data and insight into their mitochondrial transcription

PubMed Central

Bentlage, Bastian; Cartwright, Paulyn; Yanagihara, Angel A.; Lindsay, Dhugal J.; Hopcroft, Russell R.; Collins, Allen G.

2015-01-01

Hydrozoans display the most morphological diversity within the phylum Cnidaria. While recent molecular studies have provided some insights into their evolutionary history, sister group relationships remain mostly unresolved, particularly at mid-taxonomic levels. Specifically, within Hydroidolina, the most speciose hydrozoan subclass, the relationships and sometimes integrity of orders are highly unsettled. Here we obtained the near complete mitochondrial sequence of twenty-six hydroidolinan hydrozoan species from a range of sources (DNA and RNA-seq data, long-range PCR). Our analyses confirm previous inference of the evolution of mtDNA in Hydrozoa while introducing a novel genome organization. Using RNA-seq data, we propose a mechanism for the expression of mitochondrial mRNA in Hydroidolina that can be extrapolated to the other medusozoan taxa. Phylogenetic analyses using the full set of mitochondrial gene sequences provide some insights into the order-level relationships within Hydroidolina, including siphonophores as the first diverging clade, a well-supported clade comprised of Leptothecata-Filifera III–IV, and a second clade comprised of Aplanulata-Capitata s.s.-Filifera I–II. Finally, we describe our relatively inexpensive and accessible multiplexing strategy to sequence long-range PCR amplicons that can be adapted to most high-throughput sequencing platforms. PMID:26618080
Advances in single-cell experimental design made possible by automated imaging platforms with feedback through segmentation.

PubMed

Crick, Alex J; Cammarota, Eugenia; Moulang, Katie; Kotar, Jurij; Cicuta, Pietro

2015-01-01

Live optical microscopy has become an essential tool for studying the dynamical behaviors and variability of single cells, and cell-cell interactions. However, experiments and data analysis in this area are often extremely labor intensive, and it has often not been achievable or practical to perform properly standardized experiments on a statistically viable scale. We have addressed this challenge by developing automated live imaging platforms, to help standardize experiments, increasing throughput, and unlocking previously impossible ones. Our real-time cell tracking programs communicate in feedback with microscope and camera control software, and they are highly customizable, flexible, and efficient. As examples of our current research which utilize these automated platforms, we describe two quite different applications: egress-invasion interactions of malaria parasites and red blood cells, and imaging of immune cells which possess high motility and internal dynamics. The automated imaging platforms are able to track a large number of motile cells simultaneously, over hours or even days at a time, greatly increasing data throughput and opening up new experimental possibilities. Copyright © 2015 Elsevier Inc. All rights reserved.
False positives complicate ancient pathogen identifications using high-throughput shotgun sequencing

PubMed Central

2014-01-01

Background Identification of historic pathogens is challenging since false positives and negatives are a serious risk. Environmental non-pathogenic contaminants are ubiquitous. Furthermore, public genetic databases contain limited information regarding these species. High-throughput sequencing may help reliably detect and identify historic pathogens. Results We shotgun-sequenced 8 16th-century Mixtec individuals from the site of Teposcolula Yucundaa (Oaxaca, Mexico) who are reported to have died from the huey cocoliztli (‘Great Pestilence’ in Nahautl), an unknown disease that decimated native Mexican populations during the Spanish colonial period, in order to identify the pathogen. Comparison of these sequences with those deriving from the surrounding soil and from 4 precontact individuals from the site found a wide variety of contaminant organisms that confounded analyses. Without the comparative sequence data from the precontact individuals and soil, false positives for Yersinia pestis and rickettsiosis could have been reported. Conclusions False positives and negatives remain problematic in ancient DNA analyses despite the application of high-throughput sequencing. Our results suggest that several studies claiming the discovery of ancient pathogens may need further verification. Additionally, true single molecule sequencing’s short read lengths, inability to sequence through DNA lesions, and limited ancient-DNA-specific technical development hinder its application to palaeopathology. PMID:24568097
Leveraging the Power of High Performance Computing for Next Generation Sequencing Data Analysis: Tricks and Twists from a High Throughput Exome Workflow

PubMed Central

Wonczak, Stephan; Thiele, Holger; Nieroda, Lech; Jabbari, Kamel; Borowski, Stefan; Sinha, Vishal; Gunia, Wilfried; Lang, Ulrich; Achter, Viktor; Nürnberg, Peter

2015-01-01

Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. In order to run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files. PMID:25942438
Vertical silicon nanowires as a universal platform for delivering biomolecules into living cells

PubMed Central

Shalek, Alex K.; Robinson, Jacob T.; Karp, Ethan S.; Lee, Jin Seok; Ahn, Dae-Ro; Yoon, Myung-Han; Sutton, Amy; Jorgolli, Marsela; Gertner, Rona S.; Gujral, Taranjit S.; MacBeath, Gavin; Yang, Eun Gyeong; Park, Hongkun

2010-01-01

A generalized platform for introducing a diverse range of biomolecules into living cells in high-throughput could transform how complex cellular processes are probed and analyzed. Here, we demonstrate spatially localized, efficient, and universal delivery of biomolecules into immortalized and primary mammalian cells using surface-modified vertical silicon nanowires. The method relies on the ability of the silicon nanowires to penetrate a cell’s membrane and subsequently release surface-bound molecules directly into the cell’s cytosol, thus allowing highly efficient delivery of biomolecules without chemical modification or viral packaging. This modality enables one to assess the phenotypic consequences of introducing a broad range of biological effectors (DNAs, RNAs, peptides, proteins, and small molecules) into almost any cell type. We show that this platform can be used to guide neuronal progenitor growth with small molecules, knock down transcript levels by delivering siRNAs, inhibit apoptosis using peptides, and introduce targeted proteins to specific organelles. We further demonstrate codelivery of siRNAs and proteins on a single substrate in a microarray format, highlighting this technology’s potential as a robust, monolithic platform for high-throughput, miniaturized bioassays. PMID:20080678
Development of high-throughput and high sensitivity capillary gel electrophoresis platform method for Western, Eastern, and Venezuelan equine encephalitis (WEVEE) virus like particles (VLPs) purity determination and characterization.

PubMed

Gollapudi, Deepika; Wycuff, Diane L; Schwartz, Richard M; Cooper, Jonathan W; Cheng, K C

2017-10-01

In this paper, we describe development of a high-throughput, highly sensitive method based on Lab Chip CGE-SDS platform for purity determination and characterization of virus-like particle (VLP) vaccines. A capillary gel electrophoresis approach requiring about 41 s per sample for analysis and demonstrating sensitivity to protein initial concentrations as low as 20 μg/mL, this method has been used previously to evaluate monoclonal antibodies, but this application for lot release assay of VLPs using this platform is unique. The method was qualified and shown to be accurate for the quantitation of VLP purity. Assay repeatability was confirmed to be less than 2% relative standard deviation of the mean (% RSD) with interday precision less than 2% RSD. The assay can evaluate purified VLPs in a concentration range of 20-249 μg/mL for VEE and 20-250 μg/mL for EEE and WEE VLPs. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
High resolution melting analytical platform for rapid prenatal and postnatal diagnosis of β-thalassemia common among Southeast Asian population.

PubMed

Prajantasen, Thanet; Fucharoen, Supan; Fucharoen, Goonnapa

2015-02-20

High resolution melting (HRM) analysis is a powerful technology for scanning sequence alteration. We have applied this HRM assay to screen common β-thalassemia mutations found among Southeast Asian population. Known DNA samples with 8 common mutations were used in initial development of the methods including -28 A-G, codon 17 A-T, IVSI-1G-T, IVSI-5G-C, codon 26G-A (Hb E), codons 41/42 -TTCT, codons 71/72+A and IVSII-654 C-T. Further validation was done on 60 postnatal and 6 prenatal diagnoses of β-thalassemia. Each mutation has specific HRM profile which could be used in rapid screening. Apart from those with DNA deletions, the results of HRM assay matched 100% with those of routine diagnosis made by routine allele specific PCR. In addition, the HRM assay could initially recognize three unknown mutations including a hitherto un-described one in Thai population. The established HRM assay should prove useful for rapid and high throughput platform for screening and prenatal diagnosis of β-thalassemia common in the region. Copyright © 2014 Elsevier B.V. All rights reserved.
Evaluation of sequencing approaches for high-throughput toxicogenomics (SOT)

EPA Science Inventory

Whole-genome in vitro transcriptomics has shown the capability to identify mechanisms of action and estimates of potency for chemical-mediated effects in a toxicological framework, but with limited throughput and high cost. We present the evaluation of three toxicogenomics platfo...
Advances in High-Throughput Speed, Low-Latency Communication for Embedded Instrumentation (7th Annual SFAF Meeting, 2012)

ScienceCinema

Jordan, Scott

2018-01-24

Scott Jordan on "Advances in high-throughput speed, low-latency communication for embedded instrumentation" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.
High-throughput analysis of yeast replicative aging using a microfluidic system

PubMed Central

Jo, Myeong Chan; Liu, Wei; Gu, Liang; Dang, Weiwei; Qin, Lidong

2015-01-01

Saccharomyces cerevisiae has been an important model for studying the molecular mechanisms of aging in eukaryotic cells. However, the laborious and low-throughput methods of current yeast replicative lifespan assays limit their usefulness as a broad genetic screening platform for research on aging. We address this limitation by developing an efficient, high-throughput microfluidic single-cell analysis chip in combination with high-resolution time-lapse microscopy. This innovative design enables, to our knowledge for the first time, the determination of the yeast replicative lifespan in a high-throughput manner. Morphological and phenotypical changes during aging can also be monitored automatically with a much higher throughput than previous microfluidic designs. We demonstrate highly efficient trapping and retention of mother cells, determination of the replicative lifespan, and tracking of yeast cells throughout their entire lifespan. Using the high-resolution and large-scale data generated from the high-throughput yeast aging analysis (HYAA) chips, we investigated particular longevity-related changes in cell morphology and characteristics, including critical cell size, terminal morphology, and protein subcellular localization. In addition, because of the significantly improved retention rate of yeast mother cell, the HYAA-Chip was capable of demonstrating replicative lifespan extension by calorie restriction. PMID:26170317
The autism sequencing consortium: large-scale, high-throughput sequencing in autism spectrum disorders.

PubMed

Buxbaum, Joseph D; Daly, Mark J; Devlin, Bernie; Lehner, Thomas; Roeder, Kathryn; State, Matthew W

2012-12-20

Research during the past decade has seen significant progress in the understanding of the genetic architecture of autism spectrum disorders (ASDs), with gene discovery accelerating as the characterization of genomic variation has become increasingly comprehensive. At the same time, this research has highlighted ongoing challenges. Here we address the enormous impact of high-throughput sequencing (HTS) on ASD gene discovery, outline a consensus view for leveraging this technology, and describe a large multisite collaboration developed to accomplish these goals. Similar approaches could prove effective for severe neurodevelopmental disorders more broadly. Copyright © 2012 Elsevier Inc. All rights reserved.

High-throughput sequencing enhanced phage display enables the identification of patient-specific epitope motifs in serum.

PubMed

Christiansen, Anders; Kringelum, Jens V; Hansen, Christian S; Bøgh, Katrine L; Sullivan, Eric; Patel, Jigar; Rigby, Neil M; Eiwegger, Thomas; Szépfalusi, Zsolt; de Masi, Federico; Nielsen, Morten; Lund, Ole; Dufva, Martin

2015-08-06

Phage display is a prominent screening technique with a multitude of applications including therapeutic antibody development and mapping of antigen epitopes. In this study, phages were selected based on their interaction with patient serum and exhaustively characterised by high-throughput sequencing. A bioinformatics approach was developed in order to identify peptide motifs of interest based on clustering and contrasting to control samples. Comparison of patient and control samples confirmed a major issue in phage display, namely the selection of unspecific peptides. The potential of the bioinformatic approach was demonstrated by identifying epitopes of a prominent peanut allergen, Ara h 1, in sera from patients with severe peanut allergy. The identified epitopes were confirmed by high-density peptide micro-arrays. The present study demonstrates that high-throughput sequencing can empower phage display by (i) enabling the analysis of complex biological samples, (ii) circumventing the traditional laborious picking and functional testing of individual phage clones and (iii) reducing the number of selection rounds.
SNP Discovery in the Transcriptome of White Pacific Shrimp Litopenaeus vannamei by Next Generation Sequencing

PubMed Central

Yu, Yang; Wei, Jiankai; Zhang, Xiaojun; Liu, Jingwen; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai

2014-01-01

The application of next generation sequencing technology has greatly facilitated high throughput single nucleotide polymorphism (SNP) discovery and genotyping in genetic research. In the present study, SNPs were discovered based on two transcriptomes of Litopenaeus vannamei (L. vannamei) generated from Illumina sequencing platform HiSeq 2000. One transcriptome of L. vannamei was obtained through sequencing on the RNA from larvae at mysis stage and its reference sequence was de novo assembled. The data from another transcriptome were downloaded from NCBI and the reads of the two transcriptomes were mapped separately to the assembled reference by BWA. SNP calling was performed using SAMtools. A total of 58,717 and 36,277 SNPs with high quality were predicted from the two transcriptomes, respectively. SNP calling was also performed using the reads of two transcriptomes together, and a total of 96,040 SNPs with high quality were predicted. Among these 96,040 SNPs, 5,242 and 29,129 were predicted as non-synonymous and synonymous SNPs respectively. Characterization analysis of the predicted SNPs in L. vannamei showed that the estimated SNP frequency was 0.21% (one SNP per 476 bp) and the estimated ratio for transition to transversion was 2.0. Fifty SNPs were randomly selected for validation by Sanger sequencing after PCR amplification and 76% of SNPs were confirmed, which indicated that the SNPs predicted in this study were reliable. These SNPs will be very useful for genetic study in L. vannamei, especially for the high density linkage map construction and genome-wide association studies. PMID:24498047
Image Harvest: an open-source platform for high-throughput plant image processing and analysis

PubMed Central

Knecht, Avi C.; Campbell, Malachy T.; Caprez, Adam; Swanson, David R.; Walia, Harkamal

2016-01-01

High-throughput plant phenotyping is an effective approach to bridge the genotype-to-phenotype gap in crops. Phenomics experiments typically result in large-scale image datasets, which are not amenable for processing on desktop computers, thus creating a bottleneck in the image-analysis pipeline. Here, we present an open-source, flexible image-analysis framework, called Image Harvest (IH), for processing images originating from high-throughput plant phenotyping platforms. Image Harvest is developed to perform parallel processing on computing grids and provides an integrated feature for metadata extraction from large-scale file organization. Moreover, the integration of IH with the Open Science Grid provides academic researchers with the computational resources required for processing large image datasets at no cost. Image Harvest also offers functionalities to extract digital traits from images to interpret plant architecture-related characteristics. To demonstrate the applications of these digital traits, a rice (Oryza sativa) diversity panel was phenotyped and genome-wide association mapping was performed using digital traits that are used to describe different plant ideotypes. Three major quantitative trait loci were identified on rice chromosomes 4 and 6, which co-localize with quantitative trait loci known to regulate agronomically important traits in rice. Image Harvest is an open-source software for high-throughput image processing that requires a minimal learning curve for plant biologists to analyzephenomics datasets. PMID:27141917
An Intelligent Automation Platform for Rapid Bioprocess Design.

PubMed

Wu, Tianyi; Zhou, Yuhong

2014-08-01

Bioprocess development is very labor intensive, requiring many experiments to characterize each unit operation in the process sequence to achieve product safety and process efficiency. Recent advances in microscale biochemical engineering have led to automated experimentation. A process design workflow is implemented sequentially in which (1) a liquid-handling system performs high-throughput wet lab experiments, (2) standalone analysis devices detect the data, and (3) specific software is used for data analysis and experiment design given the user's inputs. We report an intelligent automation platform that integrates these three activities to enhance the efficiency of such a workflow. A multiagent intelligent architecture has been developed incorporating agent communication to perform the tasks automatically. The key contribution of this work is the automation of data analysis and experiment design and also the ability to generate scripts to run the experiments automatically, allowing the elimination of human involvement. A first-generation prototype has been established and demonstrated through lysozyme precipitation process design. All procedures in the case study have been fully automated through an intelligent automation platform. The realization of automated data analysis and experiment design, and automated script programming for experimental procedures has the potential to increase lab productivity. © 2013 Society for Laboratory Automation and Screening.
An Intelligent Automation Platform for Rapid Bioprocess Design

PubMed Central

Wu, Tianyi

2014-01-01

Bioprocess development is very labor intensive, requiring many experiments to characterize each unit operation in the process sequence to achieve product safety and process efficiency. Recent advances in microscale biochemical engineering have led to automated experimentation. A process design workflow is implemented sequentially in which (1) a liquid-handling system performs high-throughput wet lab experiments, (2) standalone analysis devices detect the data, and (3) specific software is used for data analysis and experiment design given the user’s inputs. We report an intelligent automation platform that integrates these three activities to enhance the efficiency of such a workflow. A multiagent intelligent architecture has been developed incorporating agent communication to perform the tasks automatically. The key contribution of this work is the automation of data analysis and experiment design and also the ability to generate scripts to run the experiments automatically, allowing the elimination of human involvement. A first-generation prototype has been established and demonstrated through lysozyme precipitation process design. All procedures in the case study have been fully automated through an intelligent automation platform. The realization of automated data analysis and experiment design, and automated script programming for experimental procedures has the potential to increase lab productivity. PMID:24088579
Bicycle: a bioinformatics pipeline to analyze bisulfite sequencing data.

PubMed

Graña, Osvaldo; López-Fernández, Hugo; Fdez-Riverola, Florentino; González Pisano, David; Glez-Peña, Daniel

2018-04-15

High-throughput sequencing of bisulfite-converted DNA is a technique used to measure DNA methylation levels. Although a considerable number of computational pipelines have been developed to analyze such data, none of them tackles all the peculiarities of the analysis together, revealing limitations that can force the user to manually perform additional steps needed for a complete processing of the data. This article presents bicycle, an integrated, flexible analysis pipeline for bisulfite sequencing data. Bicycle analyzes whole genome bisulfite sequencing data, targeted bisulfite sequencing data and hydroxymethylation data. To show how bicycle overtakes other available pipelines, we compared them on a defined number of features that are summarized in a table. We also tested bicycle with both simulated and real datasets, to show its level of performance, and compared it to different state-of-the-art methylation analysis pipelines. Bicycle is publicly available under GNU LGPL v3.0 license at http://www.sing-group.org/bicycle. Users can also download a customized Ubuntu LiveCD including bicycle and other bisulfite sequencing data pipelines compared here. In addition, a docker image with bicycle and its dependencies, which allows a straightforward use of bicycle in any platform (e.g. Linux, OS X or Windows), is also available. ograna@cnio.es or dgpena@uvigo.es. Supplementary data are available at Bioinformatics online.
Impact of Next Generation Sequencing Techniques in Food Microbiology

PubMed Central

Mayo, Baltasar; Rachid, Caio T. C. C; Alegría, Ángel; Leite, Analy M. O; Peixoto, Raquel S; Delgado, Susana

2014-01-01

Understanding the Maxam-Gilbert and Sanger sequencing as the first generation, in recent years there has been an explosion of newly-developed sequencing strategies, which are usually referred to as next generation sequencing (NGS) techniques. NGS techniques have high-throughputs and produce thousands or even millions of sequences at the same time. These sequences allow for the accurate identification of microbial taxa, including uncultivable organisms and those present in small numbers. In specific applications, NGS provides a complete inventory of all microbial operons and genes present or being expressed under different study conditions. NGS techniques are revolutionizing the field of microbial ecology and have recently been used to examine several food ecosystems. After a short introduction to the most common NGS systems and platforms, this review addresses how NGS techniques have been employed in the study of food microbiota and food fermentations, and discusses their limits and perspectives. The most important findings are reviewed, including those made in the study of the microbiota of milk, fermented dairy products, and plant-, meat- and fish-derived fermented foods. The knowledge that can be gained on microbial diversity, population structure and population dynamics via the use of these technologies could be vital in improving the monitoring and manipulation of foods and fermented food products. They should also improve their safety. PMID:25132799
Scanning the Effects of Ethyl Methanesulfonate on the Whole Genome of Lotus japonicus Using Second-Generation Sequencing Analysis

PubMed Central

Mohd-Yusoff, Nur Fatihah; Ruperao, Pradeep; Tomoyoshi, Nurain Emylia; Edwards, David; Gresshoff, Peter M.; Biswas, Bandana; Batley, Jacqueline

2015-01-01

Genetic structure can be altered by chemical mutagenesis, which is a common method applied in molecular biology and genetics. Second-generation sequencing provides a platform to reveal base alterations occurring in the whole genome due to mutagenesis. A model legume, Lotus japonicus ecotype Miyakojima, was chemically mutated with alkylating ethyl methanesulfonate (EMS) for the scanning of DNA lesions throughout the genome. Using second-generation sequencing, two individually mutated third-generation progeny (M3, named AM and AS) were sequenced and analyzed to identify single nucleotide polymorphisms and reveal the effects of EMS on nucleotide sequences in these mutant genomes. Single-nucleotide polymorphisms were found in every 208 kb (AS) and 202 kb (AM) with a bias mutation of G/C-to-A/T changes at low percentage. Most mutations were intergenic. The mutation spectrum of the genomes was comparable in their individual chromosomes; however, each mutated genome has unique alterations, which are useful to identify causal mutations for their phenotypic changes. The data obtained demonstrate that whole genomic sequencing is applicable as a high-throughput tool to investigate genomic changes due to mutagenesis. The identification of these single-point mutations will facilitate the identification of phenotypically causative mutations in EMS-mutated germplasm. PMID:25660167
International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

PubMed Central

Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.

2015-01-01

This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030
Profile and Fate of Bacterial Pathogens in Sewage Treatment Plants Revealed by High-Throughput Metagenomic Approach.

PubMed

Li, Bing; Ju, Feng; Cai, Lin; Zhang, Tong

2015-09-01

The broad-spectrum profile of bacterial pathogens and their fate in sewage treatment plants (STPs) were investigated using high-throughput sequencing based metagenomic approach. This novel approach could provide a united platform to standardize bacterial pathogen detection and realize direct comparison among different samples. Totally, 113 bacterial pathogen species were detected in eight samples including influent, effluent, activated sludge (AS), biofilm, and anaerobic digestion sludge with the abundances ranging from 0.000095% to 4.89%. Among these 113 bacterial pathogens, 79 species were reported in STPs for the first time. Specially, compared to AS in bulk mixed liquor, more pathogen species and higher total abundance were detected in upper foaming layer of AS. This suggests that the foaming layer of AS might impose more threat to onsite workers and citizens in the surrounding areas of STPs because pathogens in foaming layer are easily transferred into air and cause possible infections. The high removal efficiency (98.0%) of total bacterial pathogens suggests that AS treatment process is effective to remove most bacterial pathogens. Remarkable similarities of bacterial pathogen compositions between influent and human gut indicated that bacterial pathogen profiles in influents could well reflect the average bacterial pathogen communities of urban resident guts within the STP catchment area.
ISRNA: an integrative online toolkit for short reads from high-throughput sequencing data.

PubMed

Luo, Guan-Zheng; Yang, Wei; Ma, Ying-Ke; Wang, Xiu-Jie

2014-02-01

Integrative Short Reads NAvigator (ISRNA) is an online toolkit for analyzing high-throughput small RNA sequencing data. Besides the high-speed genome mapping function, ISRNA provides statistics for genomic location, length distribution and nucleotide composition bias analysis of sequence reads. Number of reads mapped to known microRNAs and other classes of short non-coding RNAs, coverage of short reads on genes, expression abundance of sequence reads as well as some other analysis functions are also supported. The versatile search functions enable users to select sequence reads according to their sub-sequences, expression abundance, genomic location, relationship to genes, etc. A specialized genome browser is integrated to visualize the genomic distribution of short reads. ISRNA also supports management and comparison among multiple datasets. ISRNA is implemented in Java/C++/Perl/MySQL and can be freely accessed at http://omicslab.genetics.ac.cn/ISRNA/.
Indigenous species barcode database improves the identification of zooplankton

PubMed Central

Yang, Jianghua; Zhang, Wanwan; Sun, Jingying; Xie, Yuwei; Zhang, Yimin; Burton, G. Allen; Yu, Hongxia

2017-01-01

Incompleteness and inaccuracy of DNA barcode databases is considered an important hindrance to the use of metabarcoding in biodiversity analysis of zooplankton at the species-level. Species barcoding by Sanger sequencing is inefficient for organisms with small body sizes, such as zooplankton. Here mitochondrial cytochrome c oxidase I (COI) fragment barcodes from 910 freshwater zooplankton specimens (87 morphospecies) were recovered by a high-throughput sequencing platform, Ion Torrent PGM. Intraspecific divergence of most zooplanktons was < 5%, except Branchionus leydign (Rotifer, 14.3%), Trichocerca elongate (Rotifer, 11.5%), Lecane bulla (Rotifer, 15.9%), Synchaeta oblonga (Rotifer, 5.95%) and Schmackeria forbesi (Copepod, 6.5%). Metabarcoding data of 28 environmental samples from Lake Tai were annotated by both an indigenous database and NCBI Genbank database. The indigenous database improved the taxonomic assignment of metabarcoding of zooplankton. Most zooplankton (81%) with barcode sequences in the indigenous database were identified by metabarcoding monitoring. Furthermore, the frequency and distribution of zooplankton were also consistent between metabarcoding and morphology identification. Overall, the indigenous database improved the taxonomic assignment of zooplankton. PMID:28977035
Identification of differentially expressed genes in cucumber (Cucumis sativus L.) root under waterlogging stress by digital gene expression profile.

PubMed

Qi, Xiao-Hua; Xu, Xue-Wen; Lin, Xiao-Jian; Zhang, Wen-Jie; Chen, Xue-Hao

2012-03-01

High-throughput tag-sequencing (Tag-seq) analysis based on the Solexa Genome Analyzer platform was applied to analyze the gene expression profiling of cucumber plant at 5 time points over a 24h period of waterlogging treatment. Approximately 5.8 million total clean sequence tags per library were obtained with 143013 distinct clean tag sequences. Approximately 23.69%-29.61% of the distinct clean tags were mapped unambiguously to the unigene database, and 53.78%-60.66% of the distinct clean tags were mapped to the cucumber genome database. Analysis of the differentially expressed genes revealed that most of the genes were down-regulated in the waterlogging stages, and the differentially expressed genes mainly linked to carbon metabolism, photosynthesis, reactive oxygen species generation/scavenging, and hormone synthesis/signaling. Finally, quantitative real-time polymerase chain reaction using nine genes independently verified the tag-mapped results. This present study reveals the comprehensive mechanisms of waterlogging-responsive transcription in cucumber. Copyright Â© 2011 Elsevier Inc. All rights reserved.
Diversity of the TLR4 Immunity Receptor in Czech Native Cattle Breeds Revealed Using the Pacific Biosciences Sequencing Platform.

PubMed

Novák, Karel; Pikousová, Jitka; Czerneková, Vladimíra; Mátlová, Věra

2017-07-03

The allelic variants of immunity genes in historical breeds likely reflect local infection pressure and therefore represent a reservoir for breeding. Screening to determine the diversity of the Toll-like receptor gene TLR4 was conducted in two conserved cattle breeds: Czech Red and Czech Red Pied. High-throughput sequencing of pooled PCR amplicons using the PacBio platform revealed polymorphisms, which were subsequently confirmed via genotyping techniques. Eight SNPs found in coding and adjacent regions were grouped into 18 haplotypes, representing a significant portion of the known diversity in the global breed panel and presumably exceeding diversity in production populations. Notably, the ancient Czech Red breed appeared to possess greater haplotype diversity than the Czech Red Pied breed, a Simmental variant, although the haplotype frequencies might have been distorted by significant crossbreeding and bottlenecks in the history of Czech Red cattle. The differences in haplotype frequencies validated the phenotypic distinctness of the local breeds. Due to the availability of Czech Red Pied production herds, the effect of intensive breeding on TLR diversity can be evaluated in this model. The advantages of the Pacific Biosciences technology for the resequencing of long PCR fragments with subsequent direct phasing were independently validated.
High Throughput Light Absorber Discovery, Part 2: Establishing Structure-Band Gap Energy Relationships.

PubMed

Suram, Santosh K; Newhouse, Paul F; Zhou, Lan; Van Campen, Douglas G; Mehta, Apurva; Gregoire, John M

2016-11-14

Combinatorial materials science strategies have accelerated materials development in a variety of fields, and we extend these strategies to enable structure-property mapping for light absorber materials, particularly in high order composition spaces. High throughput optical spectroscopy and synchrotron X-ray diffraction are combined to identify the optical properties of Bi-V-Fe oxides, leading to the identification of Bi 4 V 1.5 Fe 0.5 O 10.5 as a light absorber with direct band gap near 2.7 eV. The strategic combination of experimental and data analysis techniques includes automated Tauc analysis to estimate band gap energies from the high throughput spectroscopy data, providing an automated platform for identifying new optical materials.
SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.

PubMed

Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

2012-07-15

In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1 accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license.
Fast multiclonal clusterization of V(D)J recombinations from high-throughput sequencing.

PubMed

Giraud, Mathieu; Salson, Mikaël; Duez, Marc; Villenet, Céline; Quief, Sabine; Caillault, Aurélie; Grardel, Nathalie; Roumier, Christophe; Preudhomme, Claude; Figeac, Martin

2014-05-28

V(D)J recombinations in lymphocytes are essential for immunological diversity. They are also useful markers of pathologies. In leukemia, they are used to quantify the minimal residual disease during patient follow-up. However, the full breadth of lymphocyte diversity is not fully understood. We propose new algorithms that process high-throughput sequencing (HTS) data to extract unnamed V(D)J junctions and gather them into clones for quantification. This analysis is based on a seed heuristic and is fast and scalable because in the first phase, no alignment is performed with germline database sequences. The algorithms were applied to TR γ HTS data from a patient with acute lymphoblastic leukemia, and also on data simulating hypermutations. Our methods identified the main clone, as well as additional clones that were not identified with standard protocols. The proposed algorithms provide new insight into the analysis of high-throughput sequencing data for leukemia, and also to the quantitative assessment of any immunological profile. The methods described here are implemented in a C++ open-source program called Vidjil.
Developing a novel fiber optic fluorescence device for multiplexed high-throughput cytotoxic screening.

PubMed

Lee, Dennis; Barnes, Stephen

2010-01-01

The need for new pharmacological agents is unending. Yet the drug discovery process has changed substantially over the past decade and continues to evolve in response to new technologies. There is presently a high demand to reduce discovery time by improving specific lab disciplines and developing new technology platforms in the area of cell-based assay screening. Here we present the developmental concept and early stage testing of the Ab-Sniffer, a novel fiber optic fluorescence device for high-throughput cytotoxicity screening using an immobilized whole cell approach. The fused silica fibers are chemically functionalized with biotin to provide interaction with fluorescently labeled, streptavidin functionalized alginate-chitosan microspheres. The microspheres are also functionalized with Concanavalin A to facilitate binding to living cells. By using lymphoma cells and rituximab in an adaptation of a well-known cytotoxicity protocol we demonstrate the utility of the Ab-Sniffer for functional screening of potential drug compounds rather than indirect, non-functional screening via binding assay. The platform can be extended to any assay capable of being tied to a fluorescence response including multiple target cells in each well of a multi-well plate for high-throughput screening.
The Emory Chemical Biology Discovery Center: leveraging academic innovation to advance novel targets through HTS and beyond.

PubMed

Johns, Margaret A; Meyerkord-Belton, Cheryl L; Du, Yuhong; Fu, Haian

2014-03-01

The Emory Chemical Biology Discovery Center (ECBDC) aims to accelerate high throughput biology and translation of biomedical research discoveries into therapeutic targets and future medicines by providing high throughput research platforms to scientific collaborators worldwide. ECBDC research is focused at the interface of chemistry and biology, seeking to fundamentally advance understanding of disease-related biology with its HTS/HCS platforms and chemical tools, ultimately supporting drug discovery. Established HTS/HCS capabilities, university setting, and expertise in diverse assay formats, including protein-protein interaction interrogation, have enabled the ECBDC to contribute to national chemical biology efforts, empower translational research, and serve as a training ground for young scientists. With these resources, the ECBDC is poised to leverage academic innovation to advance biology and therapeutic discovery.
Xi-cam: Flexible High Throughput Data Processing for GISAXS

NASA Astrophysics Data System (ADS)

Pandolfi, Ronald; Kumar, Dinesh; Venkatakrishnan, Singanallur; Sarje, Abinav; Krishnan, Hari; Pellouchoud, Lenson; Ren, Fang; Fournier, Amanda; Jiang, Zhang; Tassone, Christopher; Mehta, Apurva; Sethian, James; Hexemer, Alexander

With increasing capabilities and data demand for GISAXS beamlines, supporting software is under development to handle larger data rates, volumes, and processing needs. We aim to provide a flexible and extensible approach to GISAXS data treatment as a solution to these rising needs. Xi-cam is the CAMERA platform for data management, analysis, and visualization. The core of Xi-cam is an extensible plugin-based GUI platform which provides users an interactive interface to processing algorithms. Plugins are available for SAXS/GISAXS data and data series visualization, as well as forward modeling and simulation through HipGISAXS. With Xi-cam's advanced mode, data processing steps are designed as a graph-based workflow, which can be executed locally or remotely. Remote execution utilizes HPC or de-localized resources, allowing for effective reduction of high-throughput data. Xi-cam is open-source and cross-platform. The processing algorithms in Xi-cam include parallel cpu and gpu processing optimizations, also taking advantage of external processing packages such as pyFAI. Xi-cam is available for download online.

Integrated nanopore sensing platform with sub-microsecond temporal resolution

PubMed Central

Rosenstein, Jacob K; Wanunu, Meni; Merchant, Christopher A; Drndic, Marija; Shepard, Kenneth L

2012-01-01

Nanopore sensors have attracted considerable interest for high-throughput sensing of individual nucleic acids and proteins without the need for chemical labels or complex optics. A prevailing problem in nanopore applications is that the transport kinetics of single biomolecules are often faster than the measurement time resolution. Methods to slow down biomolecular transport can be troublesome and are at odds with the natural goal of high-throughput sensing. Here we introduce a low-noise measurement platform that integrates a complementary metal-oxide semiconductor (CMOS) preamplifier with solid-state nanopores in thin silicon nitride membranes. With this platform we achieved a signal-to-noise ratio exceeding five at a bandwidth of 1 MHz, which to our knowledge is the highest bandwidth nanopore recording to date. We demonstrate transient signals as brief as 1 μs from short DNA molecules as well as current signatures during molecular passage events that shed light on submolecular DNA configurations in small nanopores. PMID:22426489
High-throughput detection of ethanol-producing cyanobacteria in a microdroplet platform

PubMed Central

Abalde-Cela, Sara; Gould, Anna; Liu, Xin; Kazamia, Elena; Smith, Alison G.; Abell, Chris

2015-01-01

Ethanol production by microorganisms is an important renewable energy source. Most processes involve fermentation of sugars from plant feedstock, but there is increasing interest in direct ethanol production by photosynthetic organisms. To facilitate this, a high-throughput screening technique for the detection of ethanol is required. Here, a method for the quantitative detection of ethanol in a microdroplet-based platform is described that can be used for screening cyanobacterial strains to identify those with the highest ethanol productivity levels. The detection of ethanol by enzymatic assay was optimized both in bulk and in microdroplets. In parallel, the encapsulation of engineered ethanol-producing cyanobacteria in microdroplets and their growth dynamics in microdroplet reservoirs were demonstrated. The combination of modular microdroplet operations including droplet generation for cyanobacteria encapsulation, droplet re-injection and pico-injection, and laser-induced fluorescence, were used to create this new platform to screen genetically engineered strains of cyanobacteria with different levels of ethanol production. PMID:25878135
REDItools: high-throughput RNA editing detection made easy.

PubMed

Picardi, Ernesto; Pesole, Graziano

2013-07-15

The reliable detection of RNA editing sites from massive sequencing data remains challenging and, although several methodologies have been proposed, no computational tools have been released to date. Here, we introduce REDItools a suite of python scripts to perform high-throughput investigation of RNA editing using next-generation sequencing data. REDItools are in python programming language and freely available at http://code.google.com/p/reditools/. ernesto.picardi@uniba.it or graziano.pesole@uniba.it Supplementary data are available at Bioinformatics online.
High throughput protein production screening

DOEpatents

Beernink, Peter T [Walnut Creek, CA; Coleman, Matthew A [Oakland, CA; Segelke, Brent W [San Ramon, CA

2009-09-08

Methods, compositions, and kits for the cell-free production and analysis of proteins are provided. The invention allows for the production of proteins from prokaryotic sequences or eukaryotic sequences, including human cDNAs using PCR and IVT methods and detecting the proteins through fluorescence or immunoblot techniques. This invention can be used to identify optimized PCR and WT conditions, codon usages and mutations. The methods are readily automated and can be used for high throughput analysis of protein expression levels, interactions, and functional states.
High-Throughput Sequencing: A Roadmap Toward Community Ecology

PubMed Central

Poisot, Timothée; Péquin, Bérangère; Gravel, Dominique

2013-01-01

High-throughput sequencing is becoming increasingly important in microbial ecology, yet it is surprisingly under-used to generate or test biogeographic hypotheses. In this contribution, we highlight how adding these methods to the ecologist toolbox will allow the detection of new patterns, and will help our understanding of the structure and dynamics of diversity. Starting with a review of ecological questions that can be addressed, we move on to the technical and analytical issues that will benefit from an increased collaboration between different disciplines. PMID:23610649
ShortRead: a bioconductor package for input, quality assessment and exploration of high-throughput sequence data

PubMed Central

Morgan, Martin; Anders, Simon; Lawrence, Michael; Aboyoun, Patrick; Pagès, Hervé; Gentleman, Robert

2009-01-01

Summary: ShortRead is a package for input, quality assessment, manipulation and output of high-throughput sequencing data. ShortRead is provided in the R and Bioconductor environments, allowing ready access to additional facilities for advanced statistical analysis, data transformation, visualization and integration with diverse genomic resources. Availability and Implementation: This package is implemented in R and available at the Bioconductor web site; the package contains a ‘vignette’ outlining typical work flows. Contact: mtmorgan@fhcrc.org PMID:19654119
NiftyPET: a High-throughput Software Platform for High Quantitative Accuracy and Precision PET Imaging and Analysis.

PubMed

Markiewicz, Pawel J; Ehrhardt, Matthias J; Erlandsson, Kjell; Noonan, Philip J; Barnes, Anna; Schott, Jonathan M; Atkinson, David; Arridge, Simon R; Hutton, Brian F; Ourselin, Sebastien

2018-01-01

We present a standalone, scalable and high-throughput software platform for PET image reconstruction and analysis. We focus on high fidelity modelling of the acquisition processes to provide high accuracy and precision quantitative imaging, especially for large axial field of view scanners. All the core routines are implemented using parallel computing available from within the Python package NiftyPET, enabling easy access, manipulation and visualisation of data at any processing stage. The pipeline of the platform starts from MR and raw PET input data and is divided into the following processing stages: (1) list-mode data processing; (2) accurate attenuation coefficient map generation; (3) detector normalisation; (4) exact forward and back projection between sinogram and image space; (5) estimation of reduced-variance random events; (6) high accuracy fully 3D estimation of scatter events; (7) voxel-based partial volume correction; (8) region- and voxel-level image analysis. We demonstrate the advantages of this platform using an amyloid brain scan where all the processing is executed from a single and uniform computational environment in Python. The high accuracy acquisition modelling is achieved through span-1 (no axial compression) ray tracing for true, random and scatter events. Furthermore, the platform offers uncertainty estimation of any image derived statistic to facilitate robust tracking of subtle physiological changes in longitudinal studies. The platform also supports the development of new reconstruction and analysis algorithms through restricting the axial field of view to any set of rings covering a region of interest and thus performing fully 3D reconstruction and corrections using real data significantly faster. All the software is available as open source with the accompanying wiki-page and test data.
Variant discovery in the sheepmeat odour and flavour in javanese fat tailed sheep using RNA sequencing

NASA Astrophysics Data System (ADS)

Abuzahra, M. A. M.; Jakaria; Listyarini, K.; Furqon, A.; Sumantri, C.; Uddin, M. J.; Gunawan, A.

2018-05-01

High-throughput RNA sequencing (RNA-Seq) reveals new challenges for the detection of transcriptome variants (SNPs) in different tissues and species. The aims of this study was to characterize a SNP discovery analysis in the sheep meat odour and flavour transcriptome using RNA-Seq. Six liver samples from divergent sheep meat odour and flavour were analyzed using the Illumina Genome Hiseq 2500 Analyzer. The SNP detection analysis revealed 142 SNPs in sheep meat samples, and a large number of those corresponded to differences between high and low sheep meat odour and flavour ovis genome assembly OAR v4.0. Among them, about 90.4% of genes had multiple polymorphisms within 12 genes (JAML, ANGPTL8, LOC101103463, SEPW1, SCN5A, LOC101113036, DOCK6, GTSE1, KIF12, KCTD17, KANK2, CYP2A6). Several of the SNPs (JAML, CYP2A6, SEPW1, and KIF12) found in this study could be included as suitable markers in genotyping platforms to perform association analyses in commercial populations and apply genomic selection protocols in the sheep meat production.
Pneumatic Microvalve-Based Hydrodynamic Sample Injection for High-Throughput, Quantitative Zone Electrophoresis in Capillaries

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kelly, Ryan T.; Wang, Chenchen; Rausch, Sarah J.

2014-07-01

A hybrid microchip/capillary CE system was developed to allow unbiased and lossless sample loading and high throughput repeated injections. This new hybrid CE system consists of a polydimethylsiloxane (PDMS) microchip sample injector featuring a pneumatic microvalve that separates a sample introduction channel from a short sample loading channel and a fused silica capillary separation column that connects seamlessly to the sample loading channel. The sample introduction channel is pressurized such that when the pneumatic microvalve opens briefly, a variable-volume sample plug is introduced into the loading channel. A high voltage for CE separation is continuously applied across the loading channelmore » and the fused silica capillary separation column. Analytes are rapidly separated in the fused silica capillary with high resolution. High sensitivity MS detection after CE separation is accomplished via a sheathless CE/ESI-MS interface. The performance evaluation of the complete CE/ESI-MS platform demonstrated that reproducible sample injection with well controlled sample plug volumes could be achieved by using the PDMS microchip injector. The absence of band broadening from microchip to capillary indicated a minimum dead volume at the junction. The capabilities of the new CE/ESI-MS platform in performing high throughput and quantitative sample analyses were demonstrated by the repeated sample injection without interrupting an ongoing separation and a good linear dependence of the total analyte ion abundance on the sample plug volume using a mixture of peptide standards. The separation efficiency of the new platform was also evaluated systematically at different sample injection times, flow rates and CE separation voltages.« less
A platform for efficient genotyping in Musa using microsatellite markers

PubMed Central

Christelová, Pavla; Valárik, Miroslav; Hřibová, Eva; Van den houwe, Ines; Channelière, Stéphanie; Roux, Nicolas; Doležel, Jaroslav

2011-01-01

Background and aims Bananas and plantains (Musa spp.) are one of the major fruit crops worldwide with acknowledged importance as a staple food for millions of people. The rich genetic diversity of this crop is, however, endangered by diseases, adverse environmental conditions and changed farming practices, and the need for its characterization and preservation is urgent. With the aim of providing a simple and robust approach for molecular characterization of Musa species, we developed an optimized genotyping platform using 19 published simple sequence repeat markers. Methodology The genotyping system is based on 19 microsatellite loci, which are scored using fluorescently labelled primers and high-throughput capillary electrophoresis separation with high resolution. This genotyping platform was tested and optimized on a set of 70 diploid and 38 triploid banana accessions. Principal results The marker set used in this study provided enough polymorphism to discriminate between individual species, subspecies and subgroups of all accessions of Musa. Likewise, the capability of identifying duplicate samples was confirmed. Based on the results of a blind test, the genotyping system was confirmed to be suitable for characterization of unknown accessions. Conclusions Here we report on the first complex and standardized platform for molecular characterization of Musa germplasm that is ready to use for the wider Musa research and breeding community. We believe that this genotyping system offers a versatile tool that can accommodate all possible requirements for characterizing Musa diversity, and is economical for samples ranging from one to many accessions. PMID:22476494
Ultra-High-Throughput Screening of an In Vitro-Synthesized Horseradish Peroxidase Displayed on Microbeads Using Cell Sorter

PubMed Central

Zhu, Bo; Mizoguchi, Takuro; Kojima, Takaaki; Nakano, Hideo

2015-01-01

The C1a isoenzyme of horseradish peroxidase (HRP) is an industrially important heme-containing enzyme that utilizes hydrogen peroxide to oxidize a wide variety of inorganic and organic compounds for practical applications, including synthesis of fine chemicals, medical diagnostics, and bioremediation. To develop a ultra-high-throughput screening system for HRP, we successfully produced active HRP in an Escherichia coli cell-free protein synthesis system, by adding disulfide bond isomerase DsbC and optimizing the concentrations of hemin and calcium ions and the temperature. The biosynthesized HRP was fused with a single-chain Cro (scCro) DNA-binding tag at its N-terminal and C-terminal sites. The addition of the scCro-tag at both ends increased the solubility of the protein. Next, HRP and its fusion proteins were successfully synthesized in a water droplet emulsion by using hexadecane as the oil phase and SunSoft No. 818SK as the surfactant. HRP fusion proteins were displayed on microbeads attached with double-stranded DNA (containing the scCro binding sequence) via scCro-DNA interactions. The activities of the immobilized HRP fusion proteins were detected with a tyramide-based fluorogenic assay using flow cytometry. Moreover, a model microbead library containing wild type hrp (WT) and inactive mutant (MUT) genes was screened using fluorescence-activated cell-sorting, thus efficiently enriching the WT gene from the 1:100 (WT:MUT) library. The technique described here could serve as a novel platform for the ultra-high-throughput discovery of more useful HRP mutants and other heme-containing peroxidases. PMID:25993095
Scaling up the 454 Titanium Library Construction and Pooling of Barcoded Libraries

DOE Office of Scientific and Technical Information (OSTI.GOV)

Phung, Wilson; Hack, Christopher; Shapiro, Harris

2009-03-23

We have been developing a high throughput 454 library construction process at the Joint Genome Institute to meet the needs of de novo sequencing a large number of microbial and eukaryote genomes, EST, and metagenome projects. We have been focusing efforts in three areas: (1) modifying the current process to allow the construction of 454 standard libraries on a 96-well format; (2) developing a robotic platform to perform the 454 library construction; and (3) designing molecular barcodes to allow pooling and sorting of many different samples. In the development of a high throughput process to scale up the number ofmore » libraries by adapting the process to a 96-well plate format, the key process change involves the replacement of gel electrophoresis for size selection with Solid Phase Reversible Immobilization (SPRI) beads. Although the standard deviation of the insert sizes increases, the overall quality sequence and distribution of the reads in the genome has not changed. The manual process of constructing 454 shotgun libraries on 96-well plates is a time-consuming, labor-intensive, and ergonomically hazardous process; we have been experimenting to program a BioMek robot to perform the library construction. This will not only enable library construction to be completed in a single day, but will also minimize any ergonomic risk. In addition, we have implemented a set of molecular barcodes (AKA Multiple Identifiers or MID) and a pooling process that allows us to sequence many targets simultaneously. Here we will present the testing of pooling a set of selected fosmids derived from the endomycorrhizal fungus Glomus intraradices. By combining the robotic library construction process and the use of molecular barcodes, it is now possible to sequence hundreds of fosmids that represent a minimal tiling path of this genome. Here we present the progress and the challenges of developing these scaled-up processes.« less
Forecasting Ecological Genomics: High-Tech Animal Instrumentation Meets High-Throughput Sequencing

PubMed Central

Shafer, Aaron B. A.; Northrup, Joseph M.; Wikelski, Martin; Wittemyer, George; Wolf, Jochen B. W.

2016-01-01

Recent advancements in animal tracking technology and high-throughput sequencing are rapidly changing the questions and scope of research in the biological sciences. The integration of genomic data with high-tech animal instrumentation comes as a natural progression of traditional work in ecological genetics, and we provide a framework for linking the separate data streams from these technologies. Such a merger will elucidate the genetic basis of adaptive behaviors like migration and hibernation and advance our understanding of fundamental ecological and evolutionary processes such as pathogen transmission, population responses to environmental change, and communication in natural populations. PMID:26745372
Multiplexed fragaria chloroplast genome sequencing

Treesearch

W. Njuguna; A. Liston; R. Cronn; N.V. Bassil

2010-01-01

A method to sequence multiple chloroplast genomes using ultra high throughput sequencing technologies was recently described. Complete chloroplast genome sequences can resolve phylogenetic relationships at low taxonomic levels and identify informative point mutations and indels. The objective of this research was to sequence multiple Fragaria...
A simple cell-based high throughput screening (HTS) assay for inhibitors of Salmonella enterica RNA polymerase containing the general stress response regulator RpoS (σS).

PubMed

Campos-Gomez, Javier; Benitez, Jorge A

2018-07-01

RNA polymerase containing the stress response regulator σ S subunit (RpoS) plays a key role in bacterial survival in hostile environments in nature and during infection. Here we devise and validate a simple cell-based high throughput luminescence assay for this holoenzyme suitable for screening large chemical libraries in a robotic platform. Copyright © 2018 Elsevier B.V. All rights reserved.
High-Throughput Sequencing, a Versatile Weapon to Support Genome-Based Diagnosis in Infectious Diseases: Applications to Clinical Bacteriology

PubMed Central

Caboche, Ségolène; Audebert, Christophe; Hot, David

2014-01-01

The recent progresses of high-throughput sequencing (HTS) technologies enable easy and cost-reduced access to whole genome sequencing (WGS) or re-sequencing. HTS associated with adapted, automatic and fast bioinformatics solutions for sequencing applications promises an accurate and timely identification and characterization of pathogenic agents. Many studies have demonstrated that data obtained from HTS analysis have allowed genome-based diagnosis, which has been consistent with phenotypic observations. These proofs of concept are probably the first steps toward the future of clinical microbiology. From concept to routine use, many parameters need to be considered to promote HTS as a powerful tool to help physicians and clinicians in microbiological investigations. This review highlights the milestones to be completed toward this purpose. PMID:25437800
[Genetic analysis of two children patients affected with CHARGE syndrome].

PubMed

Li, Guoqiang; Li, Niu; Xu, Yufei; Li, Juan; Ding, Yu; Shen, Yiping; Wang, Xiumin; Wang, Jian

2018-04-10

To analyze two Chinese pediatric patients with multiple malformations and growth and development delay. Both patients were subjected to targeted gene sequencing, and the results were analyzed with Ingenuity Variant Analysis software. Suspected pathogenic variations were verified by Sanger sequencing. High-throughput sequencing showed that both patients have carried heterozygous variants of the CHD7 gene. Patient 1 carried a nonsense mutation in exon 36 (c.7957C>T, p.Arg2653*), while patient 2 carried a nonsense mutation of exon 2 (c.718C>T, p.Gln240*). Sanger sequencing confirmed the above mutations in both patients, while their parents were of wild-type for the corresponding sites, indicating that the two mutations have happened de novo. Two patients were diagnosed with CHARGE syndrome by high-throughput sequencing.
Identification of a novel NHS mutation in a Chinese family with Nance-Horan syndrome.

PubMed

Li, Aijun; Li, Bingzhen; Wu, Lemeng; Yang, Liping; Chen, Ningning; Ma, Zhizhong

2015-04-01

To identiy the disease causing mutation in a Chinese family presenting with early-onset cataract and dental anomalies. A specific Hereditary Eye Disease Enrichment Panel (HEDEP) (personalized customization by MyGenostics, Baltimore, MD) based on targeted exome capture technology was used to collect the protein coding regions of 30 early-onset cataract associated genes, and high throughput sequencing was done with Illumina HiSeq 2000 platform. The identified variant was confirmed with Sanger sequencing. A novel deletion in exon 4 (c.852delG) of NHS gene was identified; the identified 1 bp deletion altered the reading frame and was predicted to result in a premature stop codon after the addition of twelve novel amino acid (p.S285PfsX13). This mutation co-segregated in affected males and obligate female carriers, but was absent in 100 matched controls. Our findings broaden the spectrum of NHS mutations causing Nance-Horan syndrome and phenotypic spectrum of the disease in Chinese patients.
Development and Evaluation of a 9K SNP Array for Peach by Internationally Coordinated SNP Detection and Validation in Breeding Germplasm

PubMed Central

Scalabrin, Simone; Gilmore, Barbara; Lawley, Cynthia T.; Gasic, Ksenija; Micheletti, Diego; Rosyara, Umesh R.; Cattonaro, Federica; Vendramin, Elisa; Main, Dorrie; Aramini, Valeria; Blas, Andrea L.; Mockler, Todd C.; Bryant, Douglas W.; Wilhelm, Larry; Troggio, Michela; Sosinski, Bryon; Aranzana, Maria José; Arús, Pere; Iezzoni, Amy; Morgante, Michele; Peace, Cameron

2012-01-01

Although a large number of single nucleotide polymorphism (SNP) markers covering the entire genome are needed to enable molecular breeding efforts such as genome wide association studies, fine mapping, genomic selection and marker-assisted selection in peach [Prunus persica (L.) Batsch] and related Prunus species, only a limited number of genetic markers, including simple sequence repeats (SSRs), have been available to date. To address this need, an international consortium (The International Peach SNP Consortium; IPSC) has pursued a coordinated effort to perform genome-scale SNP discovery in peach using next generation sequencing platforms to develop and characterize a high-throughput Illumina Infinium® SNP genotyping array platform. We performed whole genome re-sequencing of 56 peach breeding accessions using the Illumina and Roche/454 sequencing technologies. Polymorphism detection algorithms identified a total of 1,022,354 SNPs. Validation with the Illumina GoldenGate® assay was performed on a subset of the predicted SNPs, verifying ∼75% of genic (exonic and intronic) SNPs, whereas only about a third of intergenic SNPs were verified. Conservative filtering was applied to arrive at a set of 8,144 SNPs that were included on the IPSC peach SNP array v1, distributed over all eight peach chromosomes with an average spacing of 26.7 kb between SNPs. Use of this platform to screen a total of 709 accessions of peach in two separate evaluation panels identified a total of 6,869 (84.3%) polymorphic SNPs. The almost 7,000 SNPs verified as polymorphic through extensive empirical evaluation represent an excellent source of markers for future studies in genetic relatedness, genetic mapping, and dissecting the genetic architecture of complex agricultural traits. The IPSC peach SNP array v1 is commercially available and we expect that it will be used worldwide for genetic studies in peach and related stone fruit and nut species. PMID:22536421
Evaluation of multiplex assay platforms for detection of influenza hemagglutinin subtype specific antibody responses.

PubMed

Li, Zhu-Nan; Weber, Kimberly M; Limmer, Rebecca A; Horne, Bobbi J; Stevens, James; Schwerzmann, Joy; Wrammert, Jens; McCausland, Megan; Phipps, Andrew J; Hancock, Kathy; Jernigan, Daniel B; Levine, Min; Katz, Jacqueline M; Miller, Joseph D

2017-05-01

Influenza hemagglutination inhibition (HI) and virus microneutralization assays (MN) are widely used for seroprevalence studies. However, these assays have limited field portability and are difficult to fully automate for high throughput laboratory testing. To address these issues, three multiplex influenza subtype-specific antibody detection assays were developed using recombinant hemagglutinin antigens in combination with Chembio, Luminex ® , and ForteBio ® platforms. Assay sensitivity, specificity, and subtype cross-reactivity were evaluated using a panel of well characterized human sera. Compared to the traditional HI, assay sensitivity ranged from 87% to 92% and assay specificity in sera collected from unexposed persons ranged from 65% to 100% across the platforms. High assay specificity (86-100%) for A(H5N1) rHA was achieved for sera from exposed or unexposed to hetorosubtype influenza HAs. In contrast, assay specificity for A(H1N1)pdm09 rHA using sera collected from A/Vietnam/1204/2004 (H5N1) vaccinees in 2008 was low (22-30%) in all platforms. Although cross-reactivity against rHA subtype proteins was observed in each assay platform, the correct subtype specific responses were identified 78%-94% of the time when paired samples were available for analysis. These results show that high throughput and portable multiplex assays that incorporate rHA can be used to identify influenza subtype specific infections. Published by Elsevier B.V.

High-throughput gene mapping in Caenorhabditis elegans.

PubMed

Swan, Kathryn A; Curtis, Damian E; McKusick, Kathleen B; Voinov, Alexander V; Mapa, Felipa A; Cancilla, Michael R

2002-07-01

Positional cloning of mutations in model genetic systems is a powerful method for the identification of targets of medical and agricultural importance. To facilitate the high-throughput mapping of mutations in Caenorhabditis elegans, we have identified a further 9602 putative new single nucleotide polymorphisms (SNPs) between two C. elegans strains, Bristol N2 and the Hawaiian mapping strain CB4856, by sequencing inserts from a CB4856 genomic DNA library and using an informatics pipeline to compare sequences with the canonical N2 genomic sequence. When combined with data from other laboratories, our marker set of 17,189 SNPs provides even coverage of the complete worm genome. To date, we have confirmed >1099 evenly spaced SNPs (one every 91 +/- 56 kb) across the six chromosomes and validated the utility of our SNP marker set and new fluorescence polarization-based genotyping methods for systematic and high-throughput identification of genes in C. elegans by cloning several proprietary genes. We illustrate our approach by recombination mapping and confirmation of the mutation in the cloned gene, dpy-18.
SEQADAPT: an adaptable system for the tracking, storage and analysis of high throughput sequencing experiments.

PubMed

Burdick, David B; Cavnor, Chris C; Handcock, Jeremy; Killcoyne, Sarah; Lin, Jake; Marzolf, Bruz; Ramsey, Stephen A; Rovira, Hector; Bressler, Ryan; Shmulevich, Ilya; Boyle, John

2010-07-14

High throughput sequencing has become an increasingly important tool for biological research. However, the existing software systems for managing and processing these data have not provided the flexible infrastructure that research requires. Existing software solutions provide static and well-established algorithms in a restrictive package. However as high throughput sequencing is a rapidly evolving field, such static approaches lack the ability to readily adopt the latest advances and techniques which are often required by researchers. We have used a loosely coupled, service-oriented infrastructure to develop SeqAdapt. This system streamlines data management and allows for rapid integration of novel algorithms. Our approach also allows computational biologists to focus on developing and applying new methods instead of writing boilerplate infrastructure code. The system is based around the Addama service architecture and is available at our website as a demonstration web application, an installable single download and as a collection of individual customizable services.
SEQADAPT: an adaptable system for the tracking, storage and analysis of high throughput sequencing experiments

PubMed Central

2010-01-01

Background High throughput sequencing has become an increasingly important tool for biological research. However, the existing software systems for managing and processing these data have not provided the flexible infrastructure that research requires. Results Existing software solutions provide static and well-established algorithms in a restrictive package. However as high throughput sequencing is a rapidly evolving field, such static approaches lack the ability to readily adopt the latest advances and techniques which are often required by researchers. We have used a loosely coupled, service-oriented infrastructure to develop SeqAdapt. This system streamlines data management and allows for rapid integration of novel algorithms. Our approach also allows computational biologists to focus on developing and applying new methods instead of writing boilerplate infrastructure code. Conclusion The system is based around the Addama service architecture and is available at our website as a demonstration web application, an installable single download and as a collection of individual customizable services. PMID:20630057
Library Design-Facilitated High-Throughput Sequencing of Synthetic Peptide Libraries.

PubMed

Vinogradov, Alexander A; Gates, Zachary P; Zhang, Chi; Quartararo, Anthony J; Halloran, Kathryn H; Pentelute, Bradley L

2017-11-13

A methodology to achieve high-throughput de novo sequencing of synthetic peptide mixtures is reported. The approach leverages shotgun nanoliquid chromatography coupled with tandem mass spectrometry-based de novo sequencing of library mixtures (up to 2000 peptides) as well as automated data analysis protocols to filter away incorrect assignments, noise, and synthetic side-products. For increasing the confidence in the sequencing results, mass spectrometry-friendly library designs were developed that enabled unambiguous decoding of up to 600 peptide sequences per hour while maintaining greater than 85% sequence identification rates in most cases. The reliability of the reported decoding strategy was additionally confirmed by matching fragmentation spectra for select authentic peptides identified from library sequencing samples. The methods reported here are directly applicable to screening techniques that yield mixtures of active compounds, including particle sorting of one-bead one-compound libraries and affinity enrichment of synthetic library mixtures performed in solution.
Mapping DNA polymerase errors by single-molecule sequencing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lee, David F.; Lu, Jenny; Chang, Seungwoo

Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replicationmore » product is tagged with a unique nucleotide sequence before amplification. Here, this allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases.« less
Mapping DNA polymerase errors by single-molecule sequencing

DOE PAGES

Lee, David F.; Lu, Jenny; Chang, Seungwoo; ...

2016-05-16

Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replicationmore » product is tagged with a unique nucleotide sequence before amplification. Here, this allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases.« less
The Einstein Genome Gateway using WASP - a high throughput multi-layered life sciences portal for XSEDE.

PubMed

Golden, Aaron; McLellan, Andrew S; Dubin, Robert A; Jing, Qiang; O Broin, Pilib; Moskowitz, David; Zhang, Zhengdong; Suzuki, Masako; Hargitai, Joseph; Calder, R Brent; Greally, John M

2012-01-01

Massively-parallel sequencing (MPS) technologies and their diverse applications in genomics and epigenomics research have yielded enormous new insights into the physiology and pathophysiology of the human genome. The biggest hurdle remains the magnitude and diversity of the datasets generated, compromising our ability to manage, organize, process and ultimately analyse data. The Wiki-based Automated Sequence Processor (WASP), developed at the Albert Einstein College of Medicine (hereafter Einstein), uniquely manages to tightly couple the sequencing platform, the sequencing assay, sample metadata and the automated workflows deployed on a heterogeneous high performance computing cluster infrastructure that yield sequenced, quality-controlled and 'mapped' sequence data, all within the one operating environment accessible by a web-based GUI interface. WASP at Einstein processes 4-6 TB of data per week and since its production cycle commenced it has processed ~ 1 PB of data overall and has revolutionized user interactivity with these new genomic technologies, who remain blissfully unaware of the data storage, management and most importantly processing services they request. The abstraction of such computational complexity for the user in effect makes WASP an ideal middleware solution, and an appropriate basis for the development of a grid-enabled resource - the Einstein Genome Gateway - as part of the Extreme Science and Engineering Discovery Environment (XSEDE) program. In this paper we discuss the existing WASP system, its proposed middleware role, and its planned interaction with XSEDE to form the Einstein Genome Gateway.
Multi-function microfluidic platform for sensor integration.

PubMed

Fernandes, Ana C; Semenova, Daria; Panjan, Peter; Sesay, Adama M; Gernaey, Krist V; Krühne, Ulrich

2018-03-06

The limited availability of metabolite-specific sensors for continuous sampling and monitoring is one of the main bottlenecks contributing to failures in bioprocess development. Furthermore, only a limited number of approaches exist to connect currently available measurement systems with high throughput reactor units. This is especially relevant in the biocatalyst screening and characterization stage of process development. In this work, a strategy for sensor integration in microfluidic platforms is demonstrated, to address the need for rapid, cost-effective and high-throughput screening in bioprocesses. This platform is compatible with different sensor formats by enabling their replacement and was built in order to be highly flexible and thus suitable for a wide range of applications. Moreover, this re-usable platform can easily be connected to analytical equipment, such as HPLC, laboratory scale reactors or other microfluidic chips through the use of standardized fittings. In addition, the developed platform includes a two-sensor system interspersed with a mixing channel, which allows the detection of samples that might be outside the first sensor's range of detection, through dilution of the sample solution up to 10 times. In order to highlight the features of the proposed platform, inline monitoring of glucose levels is presented and discussed. Glucose was chosen due to its importance in biotechnology as a relevant substrate. The platform demonstrated continuous measurement of substrate solutions for up to 12 h. Furthermore, the influence of the fluid velocity on substrate diffusion was observed, indicating the need for in-flow calibration to achieve a good quantitative output. Copyright © 2018 Elsevier B.V. All rights reserved.
Accelerating the Design of Solar Thermal Fuel Materials through High Throughput Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, Y; Grossman, JC

2014-12-01

Solar thermal fuels (STF) store the energy of sunlight, which can then be released later in the form of heat, offering an emission-free and renewable solution for both solar energy conversion and storage. However, this approach is currently limited by the lack of low-cost materials with high energy density and high stability. In this Letter, we present an ab initio high-throughput computational approach to accelerate the design process and allow for searches over a broad class of materials. The high-throughput screening platform we have developed can run through large numbers of molecules composed of earth-abundant elements and identifies possible metastablemore » structures of a given material. Corresponding isomerization enthalpies associated with the metastable structures are then computed. Using this high-throughput simulation approach, we have discovered molecular structures with high isomerization enthalpies that have the potential to be new candidates for high-energy density STF. We have also discovered physical principles to guide further STF materials design through structural analysis. More broadly, our results illustrate the potential of using high-throughput ab initio simulations to design materials that undergo targeted structural transitions.« less
Characterization of DNA-protein interactions using high-throughput sequencing data from pulldown experiments

NASA Astrophysics Data System (ADS)

Moreland, Blythe; Oman, Kenji; Curfman, John; Yan, Pearlly; Bundschuh, Ralf

Methyl-binding domain (MBD) protein pulldown experiments have been a valuable tool in measuring the levels of methylated CpG dinucleotides. Due to the frequent use of this technique, high-throughput sequencing data sets are available that allow a detailed quantitative characterization of the underlying interaction between methylated DNA and MBD proteins. Analyzing such data sets, we first found that two such proteins cannot bind closer to each other than 2 bp, consistent with structural models of the DNA-protein interaction. Second, the large amount of sequencing data allowed us to find rather weak but nevertheless clearly statistically significant sequence preferences for several bases around the required CpG. These results demonstrate that pulldown sequencing is a high-precision tool in characterizing DNA-protein interactions. This material is based upon work supported by the National Science Foundation under Grant No. DMR-1410172.
Hybridization chain reaction amplification for highly sensitive fluorescence detection of DNA with dextran coated microarrays.

PubMed

Chao, Jie; Li, Zhenhua; Li, Jing; Peng, Hongzhen; Su, Shao; Li, Qian; Zhu, Changfeng; Zuo, Xiaolei; Song, Shiping; Wang, Lianhui; Wang, Lihua

2016-07-15

Microarrays of biomolecules hold great promise in the fields of genomics, proteomics, and clinical assays on account of their remarkably parallel and high-throughput assay capability. However, the fluorescence detection used in most conventional DNA microarrays is still limited by sensitivity. In this study, we have demonstrated a novel universal and highly sensitive platform for fluorescent detection of sequence specific DNA at the femtomolar level by combining dextran-coated microarrays with hybridization chain reaction (HCR) signal amplification. Three-dimensional dextran matrix was covalently coated on glass surface as the scaffold to immobilize DNA recognition probes to increase the surface binding capacity and accessibility. DNA nanowire tentacles were formed on the matrix surface for efficient signal amplification by capturing multiple fluorescent molecules in a highly ordered way. By quantifying microscopic fluorescent signals, the synergetic effects of dextran and HCR greatly improved sensitivity of DNA microarrays, with a detection limit of 10fM (1×10(5) molecules). This detection assay could recognize one-base mismatch with fluorescence signals dropped down to ~20%. This cost-effective microarray platform also worked well with samples in serum and thus shows great potential for clinical diagnosis. Copyright © 2016 Elsevier B.V. All rights reserved.
Pneumatic Microvalve-Based Hydrodynamic Sample Injection for High-Throughput, Quantitative Zone Electrophoresis in Capillaries

PubMed Central

2015-01-01

A hybrid microchip/capillary electrophoresis (CE) system was developed to allow unbiased and lossless sample loading and high-throughput repeated injections. This new hybrid CE system consists of a poly(dimethylsiloxane) (PDMS) microchip sample injector featuring a pneumatic microvalve that separates a sample introduction channel from a short sample loading channel, and a fused-silica capillary separation column that connects seamlessly to the sample loading channel. The sample introduction channel is pressurized such that when the pneumatic microvalve opens briefly, a variable-volume sample plug is introduced into the loading channel. A high voltage for CE separation is continuously applied across the loading channel and the fused-silica capillary separation column. Analytes are rapidly separated in the fused-silica capillary, and following separation, high-sensitivity MS detection is accomplished via a sheathless CE/ESI-MS interface. The performance evaluation of the complete CE/ESI-MS platform demonstrated that reproducible sample injection with well controlled sample plug volumes could be achieved by using the PDMS microchip injector. The absence of band broadening from microchip to capillary indicated a minimum dead volume at the junction. The capabilities of the new CE/ESI-MS platform in performing high-throughput and quantitative sample analyses were demonstrated by the repeated sample injection without interrupting an ongoing separation and a linear dependence of the total analyte ion abundance on the sample plug volume using a mixture of peptide standards. The separation efficiency of the new platform was also evaluated systematically at different sample injection times, flow rates, and CE separation voltages. PMID:24865952
GiNA, an Efficient and High-Throughput Software for Horticultural Phenotyping

PubMed Central

Diaz-Garcia, Luis; Covarrubias-Pazaran, Giovanny; Schlautman, Brandon; Zalapa, Juan

2016-01-01

Traditional methods for trait phenotyping have been a bottleneck for research in many crop species due to their intensive labor, high cost, complex implementation, lack of reproducibility and propensity to subjective bias. Recently, multiple high-throughput phenotyping platforms have been developed, but most of them are expensive, species-dependent, complex to use, and available only for major crops. To overcome such limitations, we present the open-source software GiNA, which is a simple and free tool for measuring horticultural traits such as shape- and color-related parameters of fruits, vegetables, and seeds. GiNA is multiplatform software available in both R and MATLAB® programming languages and uses conventional images from digital cameras with minimal requirements. It can process up to 11 different horticultural morphological traits such as length, width, two-dimensional area, volume, projected skin, surface area, RGB color, among other parameters. Different validation tests produced highly consistent results under different lighting conditions and camera setups making GiNA a very reliable platform for high-throughput phenotyping. In addition, five-fold cross validation between manually generated and GiNA measurements for length and width in cranberry fruits were 0.97 and 0.92. In addition, the same strategy yielded prediction accuracies above 0.83 for color estimates produced from images of cranberries analyzed with GiNA compared to total anthocyanin content (TAcy) of the same fruits measured with the standard methodology of the industry. Our platform provides a scalable, easy-to-use and affordable tool for massive acquisition of phenotypic data of fruits, seeds, and vegetables. PMID:27529547
GiNA, an Efficient and High-Throughput Software for Horticultural Phenotyping.

PubMed

Diaz-Garcia, Luis; Covarrubias-Pazaran, Giovanny; Schlautman, Brandon; Zalapa, Juan

2016-01-01

Traditional methods for trait phenotyping have been a bottleneck for research in many crop species due to their intensive labor, high cost, complex implementation, lack of reproducibility and propensity to subjective bias. Recently, multiple high-throughput phenotyping platforms have been developed, but most of them are expensive, species-dependent, complex to use, and available only for major crops. To overcome such limitations, we present the open-source software GiNA, which is a simple and free tool for measuring horticultural traits such as shape- and color-related parameters of fruits, vegetables, and seeds. GiNA is multiplatform software available in both R and MATLAB® programming languages and uses conventional images from digital cameras with minimal requirements. It can process up to 11 different horticultural morphological traits such as length, width, two-dimensional area, volume, projected skin, surface area, RGB color, among other parameters. Different validation tests produced highly consistent results under different lighting conditions and camera setups making GiNA a very reliable platform for high-throughput phenotyping. In addition, five-fold cross validation between manually generated and GiNA measurements for length and width in cranberry fruits were 0.97 and 0.92. In addition, the same strategy yielded prediction accuracies above 0.83 for color estimates produced from images of cranberries analyzed with GiNA compared to total anthocyanin content (TAcy) of the same fruits measured with the standard methodology of the industry. Our platform provides a scalable, easy-to-use and affordable tool for massive acquisition of phenotypic data of fruits, seeds, and vegetables.
Detection of Bacillus anthracis DNA in Complex Soil and Air Samples Using Next-Generation Sequencing

PubMed Central

Be, Nicholas A.; Thissen, James B.; Gardner, Shea N.; McLoughlin, Kevin S.; Fofanov, Viacheslav Y.; Koshinsky, Heather; Ellingson, Sally R.; Brettin, Thomas S.; Jackson, Paul J.; Jaing, Crystal J.

2013-01-01

Bacillus anthracis is the potentially lethal etiologic agent of anthrax disease, and is a significant concern in the realm of biodefense. One of the cornerstones of an effective biodefense strategy is the ability to detect infectious agents with a high degree of sensitivity and specificity in the context of a complex sample background. The nature of the B. anthracis genome, however, renders specific detection difficult, due to close homology with B. cereus and B. thuringiensis. We therefore elected to determine the efficacy of next-generation sequencing analysis and microarrays for detection of B. anthracis in an environmental background. We applied next-generation sequencing to titrated genome copy numbers of B. anthracis in the presence of background nucleic acid extracted from aerosol and soil samples. We found next-generation sequencing to be capable of detecting as few as 10 genomic equivalents of B. anthracis DNA per nanogram of background nucleic acid. Detection was accomplished by mapping reads to either a defined subset of reference genomes or to the full GenBank database. Moreover, sequence data obtained from B. anthracis could be reliably distinguished from sequence data mapping to either B. cereus or B. thuringiensis. We also demonstrated the efficacy of a microbial census microarray in detecting B. anthracis in the same samples, representing a cost-effective and high-throughput approach, complementary to next-generation sequencing. Our results, in combination with the capacity of sequencing for providing insights into the genomic characteristics of complex and novel organisms, suggest that these platforms should be considered important components of a biosurveillance strategy. PMID:24039948
Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline

PubMed Central

2014-01-01

Background Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. Results To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. Conclusions By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples. PMID:24475911
Custom Super-Resolution Microscope for the Structural Analysis of Nanostructures

DTIC Science & Technology

2018-05-29

research community. As part of our validation of the new design approach, we performed two - color imaging of pairs of adjacent oligo probes hybridized...nanostructures and biological targets. Our microscope features a large field of view and custom optics that facilitate 3D imaging and enhanced contrast in...our imaging throughput by creating two microscopy platforms for high-throughput, super-resolution materials characterization, with the AO set-up being
Phylogenetic Placement of Exact Amplicon Sequences Improves Associations with Clinical Information

PubMed Central

McDonald, Daniel; Gonzalez, Antonio; Navas-Molina, Jose A.; Jiang, Lingjing; Xu, Zhenjiang Zech; Winker, Kevin; Kado, Deborah M.; Orwoll, Eric; Manary, Mark; Mirarab, Siavash

2018-01-01

ABSTRACT Recent algorithmic advances in amplicon-based microbiome studies enable the inference of exact amplicon sequence fragments. These new methods enable the investigation of sub-operational taxonomic units (sOTU) by removing erroneous sequences. However, short (e.g., 150-nucleotide [nt]) DNA sequence fragments do not contain sufficient phylogenetic signal to reproduce a reasonable tree, introducing a barrier in the utilization of critical phylogenetically aware metrics such as Faith’s PD or UniFrac. Although fragment insertion methods do exist, those methods have not been tested for sOTUs from high-throughput amplicon studies in insertions against a broad reference phylogeny. We benchmarked the SATé-enabled phylogenetic placement (SEPP) technique explicitly against 16S V4 sequence fragments and showed that it outperforms the conceptually problematic but often-used practice of reconstructing de novo phylogenies. In addition, we provide a BSD-licensed QIIME2 plugin (https://github.com/biocore/q2-fragment-insertion) for SEPP and integration into the microbial study management platform QIITA. IMPORTANCE The move from OTU-based to sOTU-based analysis, while providing additional resolution, also introduces computational challenges. We demonstrate that one popular method of dealing with sOTUs (building a de novo tree from the short sequences) can provide incorrect results in human gut metagenomic studies and show that phylogenetic placement of the new sequences with SEPP resolves this problem while also yielding other benefits over existing methods. PMID:29719869
ScaffoldSeq: Software for characterization of directed evolution populations.

PubMed

Woldring, Daniel R; Holec, Patrick V; Hackel, Benjamin J

2016-07-01

ScaffoldSeq is software designed for the numerous applications-including directed evolution analysis-in which a user generates a population of DNA sequences encoding for partially diverse proteins with related functions and would like to characterize the single site and pairwise amino acid frequencies across the population. A common scenario for enzyme maturation, antibody screening, and alternative scaffold engineering involves naïve and evolved populations that contain diversified regions, varying in both sequence and length, within a conserved framework. Analyzing the diversified regions of such populations is facilitated by high-throughput sequencing platforms; however, length variability within these regions (e.g., antibody CDRs) encumbers the alignment process. To overcome this challenge, the ScaffoldSeq algorithm takes advantage of conserved framework sequences to quickly identify diverse regions. Beyond this, unintended biases in sequence frequency are generated throughout the experimental workflow required to evolve and isolate clones of interest prior to DNA sequencing. ScaffoldSeq software uniquely handles this issue by providing tools to quantify and remove background sequences, cluster similar protein families, and dampen the impact of dominant clones. The software produces graphical and tabular summaries for each region of interest, allowing users to evaluate diversity in a site-specific manner as well as identify epistatic pairwise interactions. The code and detailed information are freely available at http://research.cems.umn.edu/hackel. Proteins 2016; 84:869-874. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
Precision production: enabling deterministic throughput for precision aspheres with MRF

NASA Astrophysics Data System (ADS)

Maloney, Chris; Entezarian, Navid; Dumas, Paul

2017-10-01

Aspherical lenses offer advantages over spherical optics by improving image quality or reducing the number of elements necessary in an optical system. Aspheres are no longer being used exclusively by high-end optical systems but are now replacing spherical optics in many applications. The need for a method of production-manufacturing of precision aspheres has emerged and is part of the reason that the optics industry is shifting away from artisan-based techniques towards more deterministic methods. Not only does Magnetorheological Finishing (MRF) empower deterministic figure correction for the most demanding aspheres but it also enables deterministic and efficient throughput for series production of aspheres. The Q-flex MRF platform is designed to support batch production in a simple and user friendly manner. Thorlabs routinely utilizes the advancements of this platform and has provided results from using MRF to finish a batch of aspheres as a case study. We have developed an analysis notebook to evaluate necessary specifications for implementing quality control metrics. MRF brings confidence to optical manufacturing by ensuring high throughput for batch processing of aspheres.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.