Science.gov

Sample records for generation sequencing platforms

  1. Next-Generation Sequencing Platforms

    NASA Astrophysics Data System (ADS)

    Mardis, Elaine R.

    2013-06-01

    Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.

  2. Toward Complete Bacterial Genome Sequencing Through the Combined Use of Multiple Next-Generation Sequencing Platforms.

    PubMed

    Jeong, Haeyoung; Lee, Dae-Hee; Ryu, Choong-Min; Park, Seung-Hwan

    2016-01-01

    PacBio's long-read sequencing technologies can be successfully used for a complete bacterial genome assembly using recently developed non-hybrid assemblers in the absence of second-generation, high-quality short reads. However, standardized procedures that take into account multiple pre-existing second-generation sequencing platforms are scarce. In addition to Illumina HiSeq and Ion Torrent PGM-based genome sequencing results derived from previous studies, we generated further sequencing data, including from the PacBio RS II platform, and applied various bioinformatics tools to obtain complete genome assemblies for five bacterial strains. Our approach revealed that the hierarchical genome assembly process (HGAP) non-hybrid assembler resulted in nearly complete assemblies at a moderate coverage of ~75x, but that different versions produced non-compatible results requiring post-processing. The other two platforms further improved the PacBio assembly through scaffolding and a final error correction. PMID:26464377
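
    The ~75x coverage figure above can be related to read count and read length with the usual depth formula (total sequenced bases divided by genome length). The sketch below is illustrative only: the 4 Mb genome size and 8 kb mean read length are assumed values, not numbers from the study, and the function names are hypothetical.

```python
import math


def mean_coverage(n_reads: int, mean_read_length: float, genome_size: int) -> float:
    """Expected mean depth: total sequenced bases divided by genome length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * mean_read_length / genome_size


def reads_needed(target_coverage: float, mean_read_length: float, genome_size: int) -> int:
    """Smallest whole number of reads that reaches the target mean depth."""
    if mean_read_length <= 0:
        raise ValueError("mean_read_length must be positive")
    return math.ceil(target_coverage * genome_size / mean_read_length)


if __name__ == "__main__":
    # Assumed example values: a 4 Mb bacterial genome and 8 kb mean read length.
    n = reads_needed(75, 8000, 4_000_000)
    print(n, mean_coverage(n, 8000, 4_000_000))
```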

  3. Use of Four Next-Generation Sequencing Platforms to Determine HIV-1 Coreceptor Tropism

    PubMed Central

    Henry, Kenneth; Winner, Dane; Gibson, Richard; Lee, Lawrence; Paxinos, Ellen; Arts, Eric J.; Robertson, David L.; Mimms, Larry; Quiñones-Mateu, Miguel E.

    2012-01-01

    HIV-1 coreceptor tropism assays are required to rule out the presence of CXCR4-tropic (non-R5) viruses prior to treatment with CCR5 antagonists. Phenotypic (e.g., Trofile™, Monogram Biosciences) and genotypic (e.g., population sequencing linked to bioinformatic algorithms) assays are the most widely used. Although several next-generation sequencing (NGS) platforms are available, to date all published deep sequencing HIV-1 tropism studies have used the 454™ Life Sciences/Roche platform. In this study, HIV-1 co-receptor usage was predicted for twelve patients scheduled to start a maraviroc-based antiretroviral regimen. The V3 region of the HIV-1 env gene was sequenced using four NGS platforms: 454™, PacBio® RS (Pacific Biosciences), Illumina®, and Ion Torrent™ (Life Technologies). Cross-platform variation was evaluated, including number of reads, read length and error rates. HIV-1 tropism was inferred using Geno2Pheno, Web PSSM, and the 11/24/25 rule and compared with Trofile™ and virologic response to antiretroviral therapy. Error rates related to insertions/deletions (indels) and nucleotide substitutions introduced by the four NGS platforms were low compared to the actual HIV-1 sequence variation. Each platform detected all major virus variants within the HIV-1 population with similar frequencies. Identification of non-R5 viruses was comparable among the four platforms, with minor differences attributable to the algorithms used to infer HIV-1 tropism. All NGS platforms showed similar concordance with virologic response to the maraviroc-based regimen (75% to 80% range depending on the algorithm used), compared to Trofile (80%) and population sequencing (70%). In conclusion, all four NGS platforms were able to detect minority non-R5 variants at comparable levels suggesting that any NGS-based method can be used to predict HIV-1 coreceptor usage. PMID:23166726
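
    Of the three inference methods named above, the 11/24/25 rule is simple enough to sketch directly: a V3 loop is called non-R5 when a positively charged residue (arginine or lysine) occupies position 11, 24, or 25. The code below is a minimal illustration, not the pipeline used in the study. It assumes the input is an ungapped 35-residue V3 amino acid sequence already in standard V3 numbering; real reads would first need translation and alignment. The function name and example sequences are hypothetical.

```python
BASIC_RESIDUES = {"R", "K"}        # positively charged amino acids
RULE_POSITIONS = (11, 24, 25)      # 1-based V3 loop positions


def rule_11_24_25(v3_aa: str) -> str:
    """Classify a V3 loop as 'non-R5' or 'R5' using the 11/24/25 rule.

    Assumes v3_aa is ungapped and already in standard V3 numbering.
    """
    v3_aa = v3_aa.upper()
    if len(v3_aa) < max(RULE_POSITIONS):
        raise ValueError("V3 sequence too short for positions 11, 24 and 25")
    if any(v3_aa[pos - 1] in BASIC_RESIDUES for pos in RULE_POSITIONS):
        return "non-R5"
    return "R5"


if __name__ == "__main__":
    # Illustrative 35-residue sequences (S at 11, G at 24, E at 25 in the first).
    print(rule_11_24_25("CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"))
    print(rule_11_24_25("CTRPNNNTRKRIHIGPGRAFYTTGEIIGDIRQAHC"))
```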

  4. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers

    PubMed Central

    2012-01-01

    Background: Next generation sequencing (NGS) technology has revolutionized genomic and genetic research. The pace of change in this area is rapid with three major new sequencing platforms having been released in 2011: Ion Torrent’s PGM, Pacific Biosciences’ RS and the Illumina MiSeq. Here we compare the results obtained with those platforms to the performance of the Illumina HiSeq, the current market leader. In order to compare these platforms, and get sufficient coverage depth to allow meaningful analysis, we have sequenced a set of 4 microbial genomes with mean GC content ranging from 19.3 to 67.7%. Together, these represent a comprehensive range of genome content. Here we report our analysis of that sequence data in terms of coverage distribution, bias, GC distribution, variant detection and accuracy. Results: Sequence generated by Ion Torrent, MiSeq and Pacific Biosciences technologies displays near perfect coverage behaviour on GC-rich, neutral and moderately AT-rich genomes, but a profound bias was observed upon sequencing the extremely AT-rich genome of Plasmodium falciparum on the PGM, resulting in no coverage for approximately 30% of the genome. We analysed the ability to call variants from each platform and found that we could call slightly more variants from Ion Torrent data compared to MiSeq data, but at the expense of a higher false positive rate. Variant calling from Pacific Biosciences data was possible but higher coverage depth was required. Context specific errors were observed in both PGM and MiSeq data, but not in that from the Pacific Biosciences platform. Conclusions: All three fast turnaround sequencers evaluated here were able to generate usable sequence. However there are key differences between the quality of that data and the applications it will support. PMID:22827831
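
    The genomes above are characterised by mean GC content (19.3 to 67.7%). As a small illustration of how that quantity is computed, the sketch below counts G and C among the unambiguous bases of a sequence. Ignoring ambiguous symbols such as N is an assumption made here for simplicity; the function name is hypothetical and this is not the authors' analysis code.

```python
def gc_content(seq: str) -> float:
    """Return GC content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (counts["G"] + counts["C"]) / total


if __name__ == "__main__":
    print(gc_content("ATGCATGC"))    # balanced toy sequence
    print(gc_content("AATTAATTNN"))  # AT-only toy sequence; N is ignored
```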

  5. FLEXBAR-Flexible Barcode and Adapter Processing for Next-Generation Sequencing Platforms.

    PubMed

    Dodt, Matthias; Roehr, Johannes T; Ahmed, Rina; Dieterich, Christoph

    2012-01-01

    Quantitative and systems biology approaches benefit from the unprecedented depth of next-generation sequencing. A typical experiment yields millions of short reads, which oftentimes carry particular sequence tags. These tags may be: (a) specific to the sequencing platform and library construction method (e.g., adapter sequences); (b) have been introduced by experimental design (e.g., sample barcodes); or (c) constitute some biological signal (e.g., splice leader sequences in nematodes). Our software FLEXBAR enables accurate recognition, sorting and trimming of sequence tags with maximal flexibility, based on exact overlap sequence alignment. The software supports data formats from all current sequencing platforms, including color-space reads. FLEXBAR maintains read pairings and processes separate barcode reads on demand. Our software facilitates the fine-grained adjustment of sequence tag detection parameters and search regions. FLEXBAR is a multi-threaded software and combines speed with precision. Even complex read processing scenarios might be executed with a single command line call. We demonstrate the utility of the software in terms of read mapping applications, library demultiplexing and splice leader detection. FLEXBAR and additional information is available for academic use from the website: http://sourceforge.net/projects/flexbar/. PMID:24832523
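
    To make the idea of overlap-based tag trimming concrete, here is a deliberately simplified sketch of 3' adapter removal. It is not FLEXBAR's algorithm or interface: it does an ungapped comparison of the adapter prefix against each read suffix, with a mismatch tolerance, and removes the leftmost acceptable match. The function name, parameters, and default thresholds are all assumptions made for illustration.

```python
def trim_3prime_adapter(read: str, adapter: str,
                        min_overlap: int = 3,
                        max_mismatch_rate: float = 0.1) -> str:
    """Remove a (possibly partial) 3' adapter from a read.

    Scans start positions left to right; at each one, the read suffix is
    compared without gaps to the adapter prefix of the same length.
    """
    for start in range(len(read)):
        overlap = min(len(adapter), len(read) - start)
        if overlap < min_overlap:
            break  # remaining suffix is too short to trust
        window = read[start:start + overlap]
        mismatches = sum(1 for a, b in zip(window, adapter) if a != b)
        if mismatches <= max_mismatch_rate * overlap:
            return read[:start]
    return read


if __name__ == "__main__":
    adapter = "AGATCGGAAGAGC"
    print(trim_3prime_adapter("ACGTACGTTT" + adapter, adapter))  # full adapter
    print(trim_3prime_adapter("ACGTACGTTTAGATC", adapter))       # partial adapter
```

    A production tool would also handle gapped alignment, quality values, paired reads, and barcodes, which this sketch leaves out.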

  6. FLEXBAR—Flexible Barcode and Adapter Processing for Next-Generation Sequencing Platforms

    PubMed Central

    Dodt, Matthias; Roehr, Johannes T.; Ahmed, Rina; Dieterich, Christoph

    2012-01-01

    Quantitative and systems biology approaches benefit from the unprecedented depth of next-generation sequencing. A typical experiment yields millions of short reads, which oftentimes carry particular sequence tags. These tags may be: (a) specific to the sequencing platform and library construction method (e.g., adapter sequences); (b) have been introduced by experimental design (e.g., sample barcodes); or (c) constitute some biological signal (e.g., splice leader sequences in nematodes). Our software FLEXBAR enables accurate recognition, sorting and trimming of sequence tags with maximal flexibility, based on exact overlap sequence alignment. The software supports data formats from all current sequencing platforms, including color-space reads. FLEXBAR maintains read pairings and processes separate barcode reads on demand. Our software facilitates the fine-grained adjustment of sequence tag detection parameters and search regions. FLEXBAR is a multi-threaded software and combines speed with precision. Even complex read processing scenarios might be executed with a single command line call. We demonstrate the utility of the software in terms of read mapping applications, library demultiplexing and splice leader detection. FLEXBAR and additional information is available for academic use from the website: http://sourceforge.net/projects/flexbar/. PMID:24832523

  7. Preparation of Fragment Libraries for Next-Generation Sequencing on the Applied Biosystems SOLiD Platform

    PubMed Central

    Yegnasubramanian, Srinivasan

    2014-01-01

    The primary purpose of this protocol is to prepare genomic DNA libraries that can then be analyzed by massively parallel next-generation sequencing on the Applied Biosystems SOLiD platform. This protocol can be adapted to next-generation sequencing workflows to ultimately generate up to 1 billion 50 bp sequence tags from the ends of each of the DNA molecules in the library in a single next-generation sequencing run. PMID:24011046

  8. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses

    PubMed Central

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-01-01

    Due to the upcoming deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analysis tools, and efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements; therefore, biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analysis tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. PMID:24462600

  9. A Microfluidic DNA Library Preparation Platform for Next-Generation Sequencing

    PubMed Central

    Sinha, Anupama; Bent, Zachary W.; Solberg, Owen D.; Williams, Kelly P.; Langevin, Stanley A.; Renzi, Ronald F.; Van De Vreugde, James L.; Meagher, Robert J.; Schoeniger, Joseph S.; Lane, Todd W.; Branda, Steven S.; Bartsch, Michael S.; Patel, Kamlesh D.

    2013-01-01

    Next-generation sequencing (NGS) is emerging as a powerful tool for elucidating genetic information for a wide range of applications. Unfortunately, the surging popularity of NGS has not yet been accompanied by an improvement in automated techniques for preparing formatted sequencing libraries. To address this challenge, we have developed a prototype microfluidic system for preparing sequencer-ready DNA libraries for analysis by Illumina sequencing. Our system combines droplet-based digital microfluidic (DMF) sample handling with peripheral modules to create a fully-integrated, sample-in library-out platform. In this report, we use our automated system to prepare NGS libraries from samples of human and bacterial genomic DNA. E. coli libraries prepared on-device from 5 ng of total DNA yielded excellent sequence coverage over the entire bacterial genome, with >99% alignment to the reference genome, even genome coverage, and good quality scores. Furthermore, we produced a de novo assembly on a previously unsequenced multi-drug resistant Klebsiella pneumoniae strain BAA-2146 (KpnNDM). The new method described here is fast, robust, scalable, and automated. Our device for library preparation will assist in the integration of NGS technology into a wide variety of laboratories, including small research laboratories and clinical laboratories. PMID:23894387

  10. Clinical analysis of genome next-generation sequencing data using the Omicia platform

    PubMed Central

    Coonrod, Emily M; Margraf, Rebecca L; Russell, Archie; Voelkerding, Karl V; Reese, Martin G

    2013-01-01

    Aims: Next-generation sequencing is being implemented in the clinical laboratory environment for the purposes of candidate causal variant discovery in patients affected with a variety of genetic disorders. The successful implementation of this technology for diagnosing genetic disorders requires a rapid, user-friendly method to annotate variants and generate short lists of clinically relevant variants of interest. This report describes Omicia’s Opal platform, a new software tool designed for variant discovery and interpretation in a clinical laboratory environment. The software allows clinical scientists to process, analyze, interpret and report on personal genome files. Materials & Methods: To demonstrate the software, the authors describe the interactive use of the system for the rapid discovery of disease-causing variants using three cases. Results & Conclusion: Here, the authors show the features of the Opal system and their use in uncovering variants of clinical significance. PMID:23895124

  11. StatsDB: platform-agnostic storage and understanding of next generation sequencing run metrics.

    PubMed

    Ramirez-Gonzalez, Ricardo H; Leggett, Richard M; Waite, Darren; Thanki, Anil; Drou, Nizar; Caccamo, Mario; Davey, Robert

    2013-01-01

    Modern sequencing platforms generate enormous quantities of data in ever-decreasing amounts of time. Additionally, techniques such as multiplex sequencing allow one run to contain hundreds of different samples. With such data comes a significant challenge to understand its quality and to understand how the quality and yield are changing across instruments and over time. As well as the desire to understand historical data, sequencing centres often have a duty to provide clear summaries of individual run performance to collaborators or customers. We present StatsDB, an open-source software package for storage and analysis of next generation sequencing run metrics. The system has been designed for incorporation into a primary analysis pipeline, either at the programmatic level or via integration into existing user interfaces. Statistics are stored in an SQL database and APIs provide the ability to store and access the data while abstracting the underlying database design. This abstraction allows simpler, wider querying across multiple fields than is possible by the manual steps and calculation required to dissect individual reports, e.g. "provide metrics about nucleotide bias in libraries using adaptor barcode X, across all runs on sequencer A, within the last month". The software is supplied with modules for storage of statistics from FastQC, a commonly used tool for analysis of sequence reads, but the open nature of the database schema means it can be easily adapted to other tools. Currently at The Genome Analysis Centre (TGAC), reports are accessed through our LIMS system or through a standalone GUI tool, but the API and supplied examples make it easy to develop custom reports and to interface with other packages. PMID:24627795
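
    The example query quoted above (a metric for barcode X, across all runs on sequencer A, within the last month) maps naturally onto a single SQL statement once run metrics sit in one table. The sketch below uses Python's built-in sqlite3 module with an invented one-table schema and made-up rows; it does not reproduce the StatsDB schema or its API, and every table, column, and function name is hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical run-metrics table."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE run_metric ("
        "instrument TEXT, barcode TEXT, run_date TEXT, metric TEXT, value REAL)"
    )
    rows = [  # made-up values for illustration only
        ("A", "X", "2013-05-02", "gc_percent", 41.0),
        ("A", "X", "2013-05-20", "gc_percent", 43.0),
        ("A", "Y", "2013-05-10", "gc_percent", 50.0),
        ("B", "X", "2013-05-11", "gc_percent", 60.0),
        ("A", "X", "2013-03-01", "gc_percent", 30.0),
    ]
    con.executemany("INSERT INTO run_metric VALUES (?, ?, ?, ?, ?)", rows)
    return con


def mean_metric(con, metric, instrument, barcode, since):
    """Average a metric for one instrument and barcode from a given ISO date on.

    Returns None when no rows match.
    """
    row = con.execute(
        "SELECT AVG(value) FROM run_metric "
        "WHERE metric = ? AND instrument = ? AND barcode = ? AND run_date >= ?",
        (metric, instrument, barcode, since),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    con = build_demo_db()
    print(mean_metric(con, "gc_percent", "A", "X", "2013-05-01"))
```

    Storing dates as ISO 8601 strings keeps the date filter a plain string comparison, which is one simple design choice among several.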

  12. Next-Generation Sequencing Workflow for NSCLC Critical Samples Using a Targeted Sequencing Approach by Ion Torrent PGM™ Platform

    PubMed Central

    Vanni, Irene; Coco, Simona; Truini, Anna; Rusmini, Marta; Dal Bello, Maria Giovanna; Alama, Angela; Banelli, Barbara; Mora, Marco; Rijavec, Erika; Barletta, Giulia; Genova, Carlo; Biello, Federica; Maggioni, Claudia; Grossi, Francesco

    2015-01-01

    Next-generation sequencing (NGS) is a cost-effective technology capable of screening several genes simultaneously; however, its application in a clinical context requires an established workflow to acquire reliable sequencing results. Here, we report an optimized NGS workflow analyzing 22 lung cancer-related genes to sequence critical samples such as DNA from formalin-fixed paraffin-embedded (FFPE) blocks and circulating free DNA (cfDNA). Snap frozen and matched FFPE gDNA from 12 non-small cell lung cancer (NSCLC) patients, whose gDNA fragmentation status was previously evaluated using a multiplex PCR-based quality control, were successfully sequenced with Ion Torrent PGM™. The robust bioinformatic pipeline allowed us to correctly call both Single Nucleotide Variants (SNVs) and indels with a detection limit of 5%, achieving 100% specificity and 96% sensitivity. This workflow was also validated in 13 FFPE NSCLC biopsies. Furthermore, a specific protocol for low input gDNA capable of producing good sequencing data with high coverage, high uniformity, and a low error rate was also optimized. In conclusion, we demonstrate the feasibility of obtaining gDNA from FFPE samples suitable for NGS by performing appropriate quality controls. The optimized workflow, capable of screening low input gDNA, highlights NGS as a potential tool in the detection, disease monitoring, and treatment of NSCLC. PMID:26633390
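
    The sensitivity, specificity, and 5% detection limit reported above follow standard definitions, sketched below. The counts in the example are invented to reproduce the reported percentages and are not taken from the study; the function names are hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real variants that were called: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-variant sites correctly left uncalled: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


def passes_detection_limit(alt_reads: int, depth: int, limit: float = 0.05) -> bool:
    """True when the variant allele fraction is at or above the detection limit."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth >= limit


if __name__ == "__main__":
    # Hypothetical counts: 24 of 25 known variants found, no false positives.
    print(sensitivity(24, 1), specificity(1000, 0))
    print(passes_detection_limit(60, 1000), passes_detection_limit(40, 1000))
```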

  13. A platform for leveraging next generation sequencing for routine microbiology and public health use.

    PubMed

    Rusu, Laura I; Wyres, Kelly L; Reumann, Matthias; Queiroz, Carlos; Bojovschi, Alexe; Conway, Tom; Garg, Saurabh; Edwards, David J; Hogg, Geoff; Holt, Kathryn E

    2015-01-01

    Even with the advent of next-generation sequencing (NGS) technologies which have revolutionised the field of bacterial genomics in recent years, a major barrier still exists to the implementation of NGS for routine microbiological use (in public health and clinical microbiology laboratories). Such routine use would make a big difference to investigations of pathogen transmission and prevention/control of (sometimes lethal) infections. The inherent complexity and high frequency of data analyses on very large sets of bacterial DNA sequence data, the ability to ensure data provenance and automatically track and log all analyses for audit purposes, the need for quick and accurate results, together with an essential user-friendly interface for regular non-technical laboratory staff, are all critical requirements for routine use in a public health setting. There are currently no systems to answer positively to all these requirements, in an integrated manner. In this paper, we describe a system for sequence analysis and interpretation that is highly automated and tackles the issues raised earlier, and that is designed for use in diagnostic laboratories by healthcare workers with no specialist bioinformatics knowledge. PMID:25870761

  14. Towards a Next-Generation Sequencing Diagnostic Service for Tumour Genotyping: A Comparison of Panels and Platforms.

    PubMed

    Burghel, George J; Hurst, Carolyn D; Watson, Christopher M; Chambers, Phillip A; Dickinson, Helen; Roberts, Paul; Knowles, Margaret A

    2015-01-01

    Detection of clinically actionable mutations in diagnostic tumour specimens aids in the selection of targeted therapeutics. With an ever increasing number of clinically significant mutations identified, tumour genetic diagnostics is moving from single to multigene analysis. As it is still not feasible for routine diagnostic laboratories to perform sequencing of the entire cancer genome, our approach was to undertake targeted mutation detection. To optimise our diagnostic workflow, we evaluated three target enrichment strategies using two next-generation sequencing (NGS) platforms (Illumina MiSeq and Ion PGM). The target enrichment strategies were Fluidigm Access Array custom amplicon panel including 13 genes (MiSeq sequencing), the Oxford Gene Technologies (OGT) SureSeq Solid Tumour hybridisation panel including 60 genes (MiSeq sequencing), and an Ion AmpliSeq Cancer Hotspot Panel including 50 genes (Ion PGM sequencing). DNA extracted from formalin-fixed paraffin-embedded (FFPE) blocks of eight previously characterised cancer cell lines was tested using the three panels. Matching genomic DNA from fresh cultures of these cell lines was also tested using the custom Fluidigm panel and the OGT SureSeq Solid Tumour panel. Each panel allowed mutation detection of core cancer genes including KRAS, BRAF, and EGFR. Our results indicate that the panels enable accurate variant detection despite sequencing from FFPE DNA. PMID:26351634

  15. Performance Comparison of Illumina and Ion Torrent Next-Generation Sequencing Platforms for 16S rRNA-Based Bacterial Community Profiling

    PubMed Central

    Kawashima, Toana; Rosenthal, Christopher; Hoogestraat, Daniel R.; Cummings, Lisa A.; Sengupta, Dhruba J.; Harkins, Timothy T.; Cookson, Brad T.

    2014-01-01

    High-throughput sequencing of the taxonomically informative 16S rRNA gene provides a powerful approach for exploring microbial diversity. Here we compare the performances of two common “benchtop” sequencing platforms, Illumina MiSeq and Ion Torrent Personal Genome Machine (PGM), for bacterial community profiling by 16S rRNA (V1-V2) amplicon sequencing. We benchmarked performance by using a 20-organism mock bacterial community and a collection of primary human specimens. We observed comparatively higher error rates with the Ion Torrent platform and report a pattern of premature sequence truncation specific to semiconductor sequencing. Read truncation was dependent on both the directionality of sequencing and the target species, resulting in organism-specific biases in community profiles. We found that these sequencing artifacts could be minimized by using bidirectional amplicon sequencing and an optimized flow order on the Ion Torrent platform. Results of bacterial community profiling performed on the mock community and a collection of 18 human-derived microbiological specimens were generally in good agreement for both platforms; however, in some cases, results differed significantly. Disparities could be attributed to the failure to generate full-length reads for particular organisms on the Ion Torrent platform, organism-dependent differences in sequence error rates affecting classification of certain species, or some combination of these factors. This study demonstrates the potential for differential bias in bacterial community profiles resulting from the choice of sequencing platform alone. PMID:25261520

  16. A comprehensive transcriptome assembly of pigeonpea (Cajanus cajan L.) using Sanger and second-generation sequencing platforms

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A comprehensive transcriptome assembly for pigeonpea has been developed by analyzing 128.9 million short Illumina GA IIx single-end reads, 2.19 million single-end FLX/454 reads, and 18,353 Sanger expressed sequence tags (ESTs) from more than 16 genotypes. The resultant transcriptome assembly, refer...

  17. Evaluation and comparison of two commercially available targeted next-generation sequencing platforms to assist oncology decision making

    PubMed Central

    Weiss, Glen J; Hoff, Brandi R; Whitehead, Robert P; Sangal, Ashish; Gingrich, Susan A; Penny, Robert J; Mallery, David W; Morris, Scott M; Thompson, Eric J; Loesch, David M; Khemka, Vivek

    2015-01-01

    Background: It is widely acknowledged that there is value in examining cancers for genomic aberrations via next-generation sequencing (NGS). How commercially available NGS platforms compare with each other, and the clinical utility of the reported actionable results, are not well known. During the course of the current study, the Foundation One (F1) test generated data on a combination of somatic mutations, insertion and deletion polymorphisms, chromosomal abnormalities, and deoxyribonucleic acid (DNA) copy number changes at ~250× coverage, while the Paradigm Cancer Diagnostic (PCDx) test generated the same type of data at >5,000× coverage, plus provided messenger RNA (mRNA) expression levels. We sought to compare and evaluate paired formalin-fixed paraffin-embedded tumor tissue using these two platforms. Methods: Samples from patients with advanced solid tumors were submitted to both the F1 and PCDx vendors for NGS analysis. Turnaround time (TAT) was calculated. Biomarkers were considered clinically actionable if they had a published association with treatment response in humans and were assigned to the following categories: commercially available drug (CA), clinical trial drug (CT), or neither option (hereafter referred to as “None”). Results: The demographics of the 21 unique patient tumor samples included ten men and eleven women, with a median age of 56 years. Due to insufficient archival tissue from the same collection period, in one case, we used samples from different collections. PCDx reported first results faster than F1 in 20 cases. When received at both vendors on the same day, PCDx reported first results for 14 of 15 cases, with a median TAT of 9 days earlier than F1 (P<0.0001). Categorization of CA compared to CT and None significantly favored PCDx (P=0.012). Conclusion: In the current analysis, commercially available NGS platforms provided clinically relevant actionable targets (CA or CT) in 47%–67% of diverse cancer types. In the samples

  18. Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo) genome assembly and analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next-generation sequencing technologies were used to rapidly and efficiently sequence the genome of the domestic turkey (Meleagris gallopavo). The current genome assembly (~1.1 Gb) includes 917 Mb of sequence assigned to chromosomes. Innate heterozygosity of the sequenced bird allowed discovery of...

  19. Parallel tagged amplicon sequencing of transcriptome-based genetic markers for Triturus newts with the Ion Torrent next-generation sequencing platform

    PubMed Central

    Wielstra, B; Duijm, E; Lagler, P; Lammers, Y; Meilink, W R M; Ziermann, J M; Arntzen, J W

    2014-01-01

    Next-generation sequencing is a fast and cost-effective way to obtain sequence data for nonmodel organisms for many markers and for many individuals. We describe a protocol through which we obtain orthologous markers for the crested newts (Amphibia: Salamandridae: Triturus), suitable for analysis of interspecific hybridization. We use transcriptome data of a single Triturus species and design 96 primer pairs that amplify c. 180 bp fragments positioned in 3-prime untranslated regions. Next, these markers are tested with uniplex PCR for a set of species spanning the taxonomical width of the genus Triturus. The 52 markers that consistently show a single band of expected length on gel electrophoresis for all tested crested newt species are then amplified in five multiplex PCRs (with a plexity of ten or eleven) for 132 individual newts: a set of 84 representing the seven (candidate) species and a set of 48 from a presumed hybrid population. After pooling multiplexes per individual, unique tags are ligated to link amplicons to individuals. Subsequently, individuals are pooled equimolar and sequenced on the Ion Torrent next-generation sequencing platform. A bioinformatics pipeline identifies the alleles and recodes these to a genotypic format. Next, we test the utility of our markers. BAPS allocates the 84 crested newt individuals representing (candidate) species to their expected (candidate) species, confirming the markers are suitable for species delineation. NewHybrids, a hybrid index and HIest confirm the 48 individuals from the presumed hybrid population to be genetically admixed, illustrating the potential of the markers to identify interspecific hybridization. We expect the set of markers we designed to provide a high resolving power for analysis of hybridization in Triturus. PMID:24571307

  20. Multi-platform and cross-methodological reproducibility of transcriptome profiling by RNA-seq in the ABRF Next-Generation Sequencing Study

    PubMed Central

    Nicolet, Charles M.; Grove, Deborah; Levy, Shawn; Farmerie, William; Viale, Agnes; Wright, Chris; Schweitzer, Peter A.; Gao, Yuan; Kim, Dewey; Boland, Joe; Hicks, Belynda; Kim, Ryan; Chhangawala, Sagar; Jafari, Nadereh; Raghavachari, Nalini; Gandara, Jorge; Garcia-Reyero, Natàlia; Hendrickson, Cynthia; Roberson, David; Rosenfeld, Jeffrey; Smith, Todd; Underwood, Jason G.; Wang, May; Zumbo, Paul; Baldwin, Don A.; Grills, George S.; Mason, Christopher E.

    2014-01-01

    High-throughput RNA sequencing (RNA-seq) dramatically expands the potential for novel genomics discoveries, but the wide variety of platforms, protocols and performance has created the need for comprehensive reference data. Here we describe the Association of Biomolecular Resource Facilities next-generation sequencing (ABRF-NGS) study on RNA-seq. We tested replicate experiments across 15 laboratory sites using reference RNA standards to test four protocols (polyA-selected, ribo-depleted, size-selected and degraded) on five sequencing platforms (Illumina HiSeq, Life Technologies’ PGM and Proton, Pacific Biosciences RS and Roche’s 454). The results show high intra-platform and inter-platform concordance for expression measures across the deep-count platforms, but highly variable efficiency and cost for splice junction and variant detection between all platforms. These data also demonstrate that ribosomal RNA depletion can both enable effective analysis of degraded RNA samples and be readily compared to polyA-enriched fractions. This study provides a broad foundation for cross-platform standardization, evaluation and improvement of RNA-seq. PMID:25150835

  1. Efficacy of a 3rd generation high-throughput sequencing platform for analyses of 16S rRNA genes from environmental samples.

    PubMed

    Mosher, Jennifer J; Bernberg, Erin L; Shevchenko, Olga; Kan, Jinjun; Kaplan, Louis A

    2013-11-01

    Longer sequences of the bacterial 16S rRNA gene could provide greater phylogenetic and taxonomic resolutions and advance knowledge of population dynamics within complex natural communities. We assessed the accuracy of a Pacific Biosciences (PacBio) single molecule, real time (SMRT) sequencing based on DNA polymerization, a promising 3rd generation high-throughput technique, and compared this to the 2nd generation Roche 454 pyrosequencing platform. Amplicons of the 16S rRNA gene from a known isolate, Shewanella oneidensis MR1, and environmental samples from two streambed habitats, rocks and sediments, and a riparian zone soil, were analyzed. On the PacBio we analyzed ~500 bp amplicons that covered the V1-V3 regions and the full 1500 bp amplicons of the V1-V9 regions. On the Roche 454 we analyzed the ~500 bp amplicons. Error rates associated with the isolate were lowest with the Roche 454 method (2%), increased by more than 2-fold for the 500 bp amplicons with the PacBio SMRT chip (4-5%), and by more than 8-fold for the full gene with the PacBio SMRT chip (17-18%). Higher error rates with the PacBio SMRT chip artificially inflated estimates of richness and lowered estimates of coverage for environmental samples. The 3rd generation sequencing technology we evaluated does not provide greater phylogenetic and taxonomic resolutions for studies of microbial ecology. PMID:23999276
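
    The fold increases quoted above follow directly from dividing each PacBio error rate by the 2% Roche 454 baseline. The short sketch below just reproduces that arithmetic from the percentages given in the abstract; the function name is hypothetical.

```python
def fold_change(rate: float, baseline: float) -> float:
    """Ratio of an error rate to a baseline error rate."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return rate / baseline


ROCHE_454_ERROR = 2.0  # percent, as reported in the abstract

# PacBio error-rate ranges (percent) as reported in the abstract.
PACBIO_RANGES = {
    "~500 bp amplicons (V1-V3)": (4.0, 5.0),
    "full 1500 bp amplicons (V1-V9)": (17.0, 18.0),
}

if __name__ == "__main__":
    for label, (low, high) in PACBIO_RANGES.items():
        low_fold = fold_change(low, ROCHE_454_ERROR)
        high_fold = fold_change(high, ROCHE_454_ERROR)
        print(f"{label}: {low_fold:.1f}- to {high_fold:.1f}-fold the 454 rate")
```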

  2. An effective screening strategy for deafness in combination with a next-generation sequencing platform: a consecutive analysis

    PubMed Central

    Sakuma, Naoko; Moteki, Hideaki; Takahashi, Masahiro; Nishio, Shin-ya; Arai, Yasuhiro; Yamashita, Yukiko; Oridate, Nobuhiko; Usami, Shin-ichi

    2016-01-01

    The diagnosis of the genetic etiology of deafness contributes to the clinical management of patients. We performed the following four genetic tests in three stages for 52 consecutive deafness subjects in one facility. We used the Invader assay for 46 mutations in 13 genes and Sanger sequencing for the GJB2 gene or SLC26A4 gene in the first-stage test, the TaqMan genotyping assay in the second-stage test and targeted exon sequencing using massively parallel DNA sequencing in the third-stage test. Overall, we identified the genetic cause in 40% (21/52) of patients. The diagnostic rates of autosomal dominant, autosomal recessive and sporadic cases were 50%, 60% and 34%, respectively. When the sporadic cases with congenital and severe hearing loss were selected, the diagnostic rate rose to 48%. The combination approach using these genetic tests appears to be useful as a diagnostic tool for deafness patients. We recommended that genetic testing for the screening of common mutations in deafness genes using the Invader assay or TaqMan genotyping assay be performed as the initial evaluation. For the remaining undiagnosed cases, targeted exon sequencing using massively parallel DNA sequencing is clinically and economically beneficial. PMID:26763877

  3. Comprehensive transcriptome assembly of chickpea (Cicer arietinum L.) using Sanger and next generation sequencing platforms: development and applications

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A high-quality transcriptome assembly for chickpea has been developed using ~135 million Illumina single-end reads, 7.12 million single-end FLX/454 reads, and 139 thousand Sanger expressed sequence tags (ESTs). This hybrid transcriptome assembly, which we refer to as the "Cicer arietinum Transcripto...

  4. Primer ID Informs Next-Generation Sequencing Platforms and Reveals Preexisting Drug Resistance Mutations in the HIV-1 Reverse Transcriptase Coding Domain

    PubMed Central

    Keys, Jessica R.; Zhou, Shuntai; Anderson, Jeffrey A.; Eron, Joseph J.; Rackoff, Lauren A.; Jabara, Cassandra

    2015-01-01

    Sequencing of a bulk polymerase chain reaction (PCR) product to identify drug resistance mutations informs antiretroviral therapy selection but has limited sensitivity for minority variants. Alternatively, deep sequencing is capable of detecting minority variants but is subject to sequencing errors and PCR resampling due to low input templates. We screened for resistance mutations among 184 HIV-1-infected, therapy-naive subjects using the 454 sequencing platform to sequence two amplicons spanning HIV-1 reverse transcriptase codons 34–245. Samples from 19 subjects were also analyzed using the MiSeq sequencing platform for comparison. Errors and PCR resampling were addressed by tagging each HIV-1 RNA template copy (i.e., cDNA) with a unique sequence tag (Primer ID), allowing a consensus sequence to be constructed for each original template from resampled sequences. In control reactions, Primer ID reduced 454 and MiSeq errors from 71 to 2.6 and from 24 to 1.2 errors/10,000 nucleotides, respectively. MiSeq also allowed accurate sequencing of codon 65, an important drug resistance position embedded in a homopolymeric run that is poorly resolved by the 454 platform. Excluding homopolymeric positions, 14% of subjects had evidence of ≥1 resistance mutation among Primer ID consensus sequences, compared to 2.7% by bulk population sequencing. When calls were restricted to mutations that appeared twice among consensus sequence populations, 6% of subjects had detectable resistance mutations. The use of Primer ID revealed 5–15% template utilization on average, limiting the depth of deep sequencing sampling and revealing sampling variation due to low template utilization. Primer ID addresses important limitations of deep sequencing and produces less biased estimates of low-level resistance mutations in the viral population. PMID:25748056

  5. Primer ID Informs Next-Generation Sequencing Platforms and Reveals Preexisting Drug Resistance Mutations in the HIV-1 Reverse Transcriptase Coding Domain.

    PubMed

    Keys, Jessica R; Zhou, Shuntai; Anderson, Jeffrey A; Eron, Joseph J; Rackoff, Lauren A; Jabara, Cassandra; Swanstrom, Ronald

    2015-06-01

    Sequencing of a bulk polymerase chain reaction (PCR) product to identify drug resistance mutations informs antiretroviral therapy selection but has limited sensitivity for minority variants. Alternatively, deep sequencing is capable of detecting minority variants but is subject to sequencing errors and PCR resampling due to low input templates. We screened for resistance mutations among 184 HIV-1-infected, therapy-naive subjects using the 454 sequencing platform to sequence two amplicons spanning HIV-1 reverse transcriptase codons 34-245. Samples from 19 subjects were also analyzed using the MiSeq sequencing platform for comparison. Errors and PCR resampling were addressed by tagging each HIV-1 RNA template copy (i.e., cDNA) with a unique sequence tag (Primer ID), allowing a consensus sequence to be constructed for each original template from resampled sequences. In control reactions, Primer ID reduced 454 and MiSeq errors from 71 to 2.6 and from 24 to 1.2 errors/10,000 nucleotides, respectively. MiSeq also allowed accurate sequencing of codon 65, an important drug resistance position embedded in a homopolymeric run that is poorly resolved by the 454 platform. Excluding homopolymeric positions, 14% of subjects had evidence of ≥1 resistance mutation among Primer ID consensus sequences, compared to 2.7% by bulk population sequencing. When calls were restricted to mutations that appeared twice among consensus sequence populations, 6% of subjects had detectable resistance mutations. The use of Primer ID revealed 5-15% template utilization on average, limiting the depth of deep sequencing sampling and revealing sampling variation due to low template utilization. Primer ID addresses important limitations of deep sequencing and produces less biased estimates of low-level resistance mutations in the viral population. PMID:25748056
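
    The consensus step described in this abstract (one consensus sequence per tagged template, built from its resampled reads) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `primer_id_consensus`, the `min_reads` threshold, and the assumption that reads sharing a tag are already aligned and of comparable length are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def primer_id_consensus(reads, min_reads=3):
    """Build one majority-vote consensus per Primer ID tag.

    reads: iterable of (primer_id, sequence) pairs; sequences sharing a tag
    are assumed (hypothetically) to be pre-aligned to the same start.
    min_reads: illustrative threshold, not a value taken from the abstract.
    """
    groups = defaultdict(list)
    for primer_id, seq in reads:
        groups[primer_id].append(seq)

    consensus = {}
    for primer_id, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few resampled reads to outvote a sequencing error
        length = min(len(s) for s in seqs)  # compare only shared positions
        consensus[primer_id] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


reads = [
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACCTAC"),  # one read carries an error at position 2
    ("CCTA", "ACGTAC"),  # tag seen only once, so it is dropped
]
print(primer_id_consensus(reads))
```

    Because each tag marks one original template, an error present in a minority of that tag's reads is outvoted, while a true minority variant survives as the consensus of its own tag.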

  6. Profile of bacterial communities in South African mine-water samples using Illumina next-generation sequencing platform.

    PubMed

    Keshri, Jitendra; Mankazana, Boitumelo B J; Momba, Maggy N B

    2015-04-01

    Mine water is an example of an extreme environment that contains a large number of diverse and specific bacteria. It is imperative to gain an understanding of these bacterial communities in order to develop effective strategies for the bioremediation of polluted aquatic systems. In this study, the high-throughput sequencing approach was used to characterize the bacterial communities in two different mine waters of South Africa: vanadium and gold mine water. Over 2629 operational taxonomic units (OTUs) were recovered from 15,802 reads of the 16S ribosomal RNA (rRNA) gene. They represented 8 phyla, 43 orders, 84 families and 105 genera. Proteobacteria and unclassified bacterial sequences were the most dominant. Apart from these, Firmicutes, Bacteroidetes, Actinobacteria, Candidate phylum OD1, Cyanobacteria, Verrucomicrobia and Deinococcus-Thermus were the recovered phyla, although their relative abundance differed between both the mine-water samples. Yet, diversity indices suggested that the bacterial communities inhabiting the vanadium mine water were more diverse than those in gold mine water. Interestingly, substantial percentages of the reads from either sample (58 % in vanadium and 17 % in gold mine water) could not be assigned to any phylum and remained unclassified, suggesting hitherto unidentified populations, and vast untapped microbial diversity. Overall, the results of this study exhibited bacterial community structures with high diversity in mine water, which can be explored further for their role in bioremediation and environmental management. PMID:25416590

  7. Direct Chloroplast Sequencing: Comparison of Sequencing Platforms and Analysis Tools for Whole Chloroplast Barcoding

    PubMed Central

    Brozynska, Marta; Furtado, Agnelo; Henry, Robert James

    2014-01-01

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis. PMID:25329378

  8. Regulation of next generation sequencing.

    PubMed

    Javitt, Gail H; Carner, Katherine Strong

    2014-01-01

    Next generation sequencing raises new questions within the context of an existing and still evolving regulatory landscape for device manufacturers and clinical laboratories. FDA cleared the first NGS sequencing platform in November 2013, but it is unclear what lies ahead for this technology. NGS will require new types of training and expertise to interpret the vast quantities of genetic data so as to provide meaningful clinical information to physicians and patients. This paper will describe the current regulatory landscape for NGS technologies, identify the regulatory challenges they present, and consider whether new regulatory paradigms are needed to accommodate NGS technologies and services. PMID:25298288

  9. Comprehensive transcriptome assembly of Chickpea (Cicer arietinum L.) using sanger and next generation sequencing platforms: development and applications.

    PubMed

    Kudapa, Himabindu; Azam, Sarwar; Sharpe, Andrew G; Taran, Bunyamin; Li, Rong; Deonovic, Benjamin; Cameron, Connor; Farmer, Andrew D; Cannon, Steven B; Varshney, Rajeev K

    2014-01-01

    A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags (ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as Cicer arietinum Transcriptome Assembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptomes/cicar/lista_cicar-201201), comprising 46,369 transcript assembly contigs (TACs) has an N50 length of 1,726 bp and a maximum contig size of 15,644 bp. Putative functions were determined for 32,869 (70.8%) of the TACs and gene ontology assignments were determined for 21,471 (46.3%). The new transcriptome assembly was compared with the previously available chickpea transcriptome assemblies as well as to the chickpea genome. Comparative analysis of CaTA v2 against transcriptomes of three legumes - Medicago, soybean and common bean, resulted in 27,771 TACs common to all three legumes indicating strong conservation of genes across legumes. CaTA v2 was also used for identification of simple sequence repeats (SSRs) and intron spanning regions (ISRs) for developing molecular markers. ISRs were identified by aligning TACs to the Medicago genome, and their putative mapping positions at chromosomal level were identified using transcript map of chickpea. Primer pairs were designed for 4,990 ISRs, each representing a single contig for which predicted positions are inferred and distributed across eight linkage groups. A subset of randomly selected ISRs representing all eight chickpea linkage groups were validated on five chickpea genotypes and showed 20% polymorphism with average polymorphic information content (PIC) of 0.27. In summary, the hybrid transcriptome assembly developed and novel markers identified can be used for a variety of applications such as gene discovery, marker-trait association, diversity analysis etc., to advance genetics research and breeding applications in

  10. A comprehensive transcriptome assembly of Pigeonpea (Cajanus cajan L.) using sanger and second-generation sequencing platforms.

    PubMed

    Kudapa, Himabindu; Bharti, Arvind K; Cannon, Steven B; Farmer, Andrew D; Mulaosmanovic, Benjamin; Kramer, Robin; Bohra, Abhishek; Weeks, Nathan T; Crow, John A; Tuteja, Reetu; Shah, Trushar; Dutta, Sutapa; Gupta, Deepak K; Singh, Archana; Gaikwad, Kishor; Sharma, Tilak R; May, Gregory D; Singh, Nagendra K; Varshney, Rajeev K

    2012-09-01

    A comprehensive transcriptome assembly for pigeonpea has been developed by analyzing 128.9 million short Illumina GA IIx single end reads, 2.19 million single end FLX/454 reads, and 18 353 Sanger expressed sequence tags from more than 16 genotypes. The resultant transcriptome assembly, referred to as CcTA v2, comprised 21 434 transcript assembly contigs (TACs) with an N50 of 1510 bp, the largest one being ~8 kb. Of the 21 434 TACs, 16 622 (77.5%) could be mapped on to the soybean genome build 1.0.9 under fairly stringent alignment parameters. Based on knowledge of intron junctions, 10 009 primer pairs were designed from 5033 TACs for amplifying intron spanning regions (ISRs). By using in silico mapping of BAC-end-derived SSR loci of pigeonpea on the soybean genome as a reference, putative mapping positions at the chromosome level were predicted for 6284 ISR markers, covering all 11 pigeonpea chromosomes. A subset of 128 ISR markers were analyzed on a set of eight genotypes. While 116 markers were validated, 70 markers showed one to three alleles, with an average of 0.16 polymorphism information content (PIC) value. In summary, the CcTA v2 transcript assembly and ISR markers will serve as a useful resource to accelerate genetic research and breeding applications in pigeonpea. PMID:22241453
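
    The N50 statistic quoted for these assemblies is the contig length at which the longest contigs, taken in descending order, first account for at least half of all assembled bases. A minimal sketch of that standard definition follows; the function name `n50` and the toy contig lengths are illustrative and are not data from either study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of all bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Toy example: 290 bases in total, so half is 145.
# 80 + 70 = 150 reaches that threshold, giving an N50 of 70.
print(n50([80, 70, 50, 40, 30, 20]))
```

    A few very long contigs can raise the N50 on their own, which is why these abstracts report it alongside the number of contigs and the maximum contig size.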

  11. Automatic Command Sequence Generation

    NASA Technical Reports Server (NTRS)

    Fisher, Forest; Gladden, Roy; Khanampompan, Teerapat

    2007-01-01

    Automatic Sequence Generator (Autogen) Version 3.0 software automatically generates command sequences for the Mars Reconnaissance Orbiter (MRO) and several other JPL spacecraft operated by the multi-mission support team. Autogen uses standard JPL sequencing tools like APGEN, ASP, SEQGEN, and the DOM database to automate the generation of uplink command products, Spacecraft Command Message Format (SCMF) files, and the corresponding ground command products, DSN Keywords Files (DKF). Autogen supports all the major multi-mission phases, including the cruise, aerobraking, mapping/science, and relay mission phases. Autogen is a Perl script, which functions within the mission operations UNIX environment. It consists of two parts: a set of model files and the autogen Perl script. Autogen encodes the behaviors of the system into a model and encodes algorithms for context-sensitive customizations of the modeled behaviors. The model includes knowledge of different mission phases and how the resultant command products must differ for these phases. The executable software portion of Autogen automates the setup and use of APGEN for constructing a spacecraft activity sequence file (SASF). The setup includes file retrieval through the DOM (Distributed Object Manager), an object database used to store project files. This step retrieves all the needed input files for generating the command products. Depending on the mission phase, Autogen also uses the ASP (Automated Sequence Processor) and SEQGEN to generate the command product sent to the spacecraft. Autogen also provides the means for customizing sequences through the use of configuration files. By automating the majority of the sequence generation process, Autogen eliminates many sequence generation errors commonly introduced by manually constructing spacecraft command sequences. Through the layering of commands into the sequence by a series of scheduling algorithms, users are able to rapidly and reliably construct the

  12. Targeted Exome Sequencing Outcome Variations of Colorectal Tumors within and across Two Sequencing Platforms

    PubMed Central

    Ashktorab, Hassan; Azimi, Hamed; Nickerson, Michael L.; Bass, Sara; Varma, Sudhir; Brim, Hassan

    2016-01-01

    Background and Aim Next generation sequencing (NGS) has quickly become the tool of choice for genome and exome data generation. The multitude of sequencing platforms as well as the variabilities within each platform need to be assessed. In this paper we used two platforms (Ion Torrent and Illumina) to assess single nucleotide variants (SNVs) in colorectal cancer (CRC) specimens. Methods CRC specimens (n = 13) collected from 6 CRC (cancer and matched normal) patients were used to establish the mutational profile using the Ion Torrent and Illumina sequencing platforms. We analyzed a set of formalin-fixed paraffin-embedded (FFPE) and fresh frozen (FF) samples on both platforms to assess the effect of sample nature (FFPE vs. FF) on sequencing outcome and to evaluate the similarities and differences of SNVs across the two platforms. In addition, duplicates of FF samples were sequenced on each platform to assess variability within platform. Results The comparison of FF replicates to each other gave a concordance of 77% (± 15.3%) in Ion Torrent and 70% (± 3.7%) in Illumina. FFPE vs. FF replicates gave a concordance of 40% (± 32%) in Ion Torrent and 49% (± 19%) in Illumina. Cross-platform concordance averaged 75% (± 9.8%) for FFPE samples and 67% (± 32%) for FF samples, with an overall average of 70% (± 26.8%). Conclusion Our data show a significant variability within and across platforms. Also, the number of detected variants depends on the nature of the specimen (FF vs. FFPE). Validation of NGS-discovered mutations is a must to rule out false-positive mutations. This validation might be performed either through a second NGS platform or through Sanger sequencing.

  13. Relay Sequence Generation Software

    NASA Technical Reports Server (NTRS)

    Gladden, Roy E.; Khanampompan, Teerapat

    2009-01-01

    Due to thermal and electromagnetic interactivity between the UHF (ultrahigh frequency) radio onboard the Mars Reconnaissance Orbiter (MRO), which performs relay sessions with the Martian landers, and the remainder of the MRO payloads, it is required to integrate and de-conflict relay sessions with the MRO science plan. The MRO relay SASF/PTF (spacecraft activity sequence file/payload target file) generation software facilitates this process by generating a PTF that is needed to integrate the periods of time during which MRO supports relay activities with the rest of the MRO science plans. The software also generates the needed command products that initiate the relay sessions, some features of which are provided by the lander team, some managed by MRO internally, and some derived.

  14. MIG-seq: an effective PCR-based method for genome-wide single-nucleotide polymorphism genotyping using the next-generation sequencing platform

    PubMed Central

    Suyama, Yoshihisa; Matsuki, Yu

    2015-01-01

    Restriction-enzyme (RE)-based next-generation sequencing methods have revolutionized marker-assisted genetic studies; however, the use of REs has limited their widespread adoption, especially in field samples with low-quality DNA and/or small quantities of DNA. Here, we developed a PCR-based procedure to construct reduced representation libraries without RE digestion steps, representing de novo single-nucleotide polymorphism discovery, and its genotyping using next-generation sequencing. Using multiplexed inter-simple sequence repeat (ISSR) primers, thousands of genome-wide regions were amplified effectively from a wide variety of genomes, without prior genetic information. We demonstrated: 1) Mendelian gametic segregation of the discovered variants; 2) reproducibility of genotyping by checking its applicability for individual identification; and 3) applicability in a wide variety of species by checking standard population genetic analysis. This approach, called multiplexed ISSR genotyping by sequencing, should be applicable to many marker-assisted genetic studies with a wide range of DNA qualities and quantities. PMID:26593239

  15. Next generation sequencing of viral RNA genomes

    PubMed Central

    2013-01-01

    Background With the advent of Next Generation Sequencing (NGS) technologies, the ability to generate large amounts of sequence data has revolutionized the genomics field. Most RNA viruses have relatively small genomes in comparison to other organisms and as such, would appear to be an obvious success story for the use of NGS technologies. However, due to the relatively low abundance of viral RNA in relation to host RNA, RNA viruses have proved relatively difficult to sequence using NGS technologies. Here we detail a simple, robust methodology, without the use of ultra-centrifugation, filtration or viral enrichment protocols, to prepare RNA from diagnostic clinical tissue samples, cell monolayers and tissue culture supernatant, for subsequent sequencing on the Roche 454 platform. Results As representative RNA viruses, full genome sequence was successfully obtained from known lyssaviruses belonging to recognized species and a novel lyssavirus species using these protocols and assembling the reads using de novo algorithms. Furthermore, genome sequences were generated from considerably less than 200 ng RNA, indicating that manufacturers’ minimum template guidance is conservative. In addition to obtaining genome consensus sequence, a high proportion of SNPs (Single Nucleotide Polymorphisms) were identified in the majority of samples analyzed. Conclusions The approaches reported clearly facilitate successful full genome lyssavirus sequencing and can be universally applied to discovering and obtaining consensus genome sequences of RNA viruses from a variety of sources. PMID:23822119

  16. AB118. Validation of next generation sequencing by Sanger sequencing

    PubMed Central

    Low, Meow Hong Wendy; Lai, Hwei Meeng Angeline; Jamuar, Saumya Shekhar; Law, Hai Yang

    2015-01-01

    Background and objective Development of the next generation sequencing (NGS) platform was driven by the completion of the Human Genome Project in 2003. With the availability of NGS, the time taken to sequence very large genomic regions was greatly reduced, and the data generated per unit of DNA also increased significantly. Though the cost of using NGS in a clinical setting is far from ideal, there has been a significant decrease in the average cost per sequenced base. The objective was to validate NGS findings on mutations detected in FBN1, TGFBR2, RAF1, RTEL1, LMNA, MID2, KCNK9, DMD, SMARCA2 and IQSEC2 using the gold standard, Sanger sequencing. Methods The coordinate of the mutation identified by NGS was used to retrieve the adjacent genomic sequence in the UCSC Genome Browser (available from URL: https://genome.ucsc.edu/). Targeted primers were designed with Primer 3 software (available from URL: http://primer3.ut.ee/) based on the genomic sequence obtained from UCSC. The next step involved optimizing a polymerase chain reaction (PCR) with the designed primers to amplify the desired DNA template for the targeted region. Upon optimization, the template was purified and subjected to dye terminator sequencing to generate multiple DNA fragments of varying sizes. Lastly, the DNA fragments were purified and analysed with an automated sequencer, which separates the DNA fragments by size using capillary electrophoresis. Results A total of 28 cases were validated with Sanger sequencing. Of them, 25 (89.3%) cases concurred with the findings from NGS and 3 (10.7%) cases were false-positive calls. Conclusions NGS shows promise for future molecular diagnostics; however, at present, it needs to be done concurrently with Sanger sequencing for clinical applications.

  17. Quasi-Random Sequence Generators.

    Energy Science and Technology Software Center (ESTSC)

    1994-03-01

    Version 00 LPTAU generates quasi-random sequences. The sequences are uniformly distributed sets of L=2**30 points in the N-dimensional unit cube: I**N=[0,1]**N. The sequences are used as nodes for multidimensional integration, as searching points in global optimization, as trial points in multicriteria decision making, and as quasi-random points for quasi-Monte Carlo algorithms.
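
    To illustrate what a quasi-random (low-discrepancy) point set in the unit cube looks like, the sketch below generates a Halton sequence. This is a deliberately swapped-in technique: it does not attempt to reproduce LPTAU's own generator, and the function names `van_der_corput` and `halton` and the default prime bases are assumptions made for the example.

```python
def van_der_corput(index, base):
    """Radical inverse of a positive integer in the given base; lies in [0, 1)."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom  # mirror the base-b digits about the radix point
    return value


def halton(n_points, primes=(2, 3)):
    """Return n_points Halton points; each coordinate uses a distinct prime base."""
    return [
        tuple(van_der_corput(i, b) for b in primes)
        for i in range(1, n_points + 1)
    ]


# Use the points as integration nodes: estimate the integral of x*y over
# the unit square, whose exact value is 1/4.
points = halton(1000)
estimate = sum(x * y for x, y in points) / len(points)
print(estimate)
```

    Unlike pseudo-random samples, successive points fill the gaps left by earlier ones, which is what makes such sequences attractive as integration nodes and as search or trial points.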

  18. Sequencing platform and library preparation choices impact viral metagenomes

    PubMed Central

    2013-01-01

    Background Microbes drive the biogeochemistry that fuels the planet. Microbial viruses modulate their hosts directly through mortality and horizontal gene transfer, and indirectly by re-programming host metabolisms during infection. However, our ability to study these virus-host interactions is limited by methods that are low-throughput and heavily reliant upon the subset of organisms that are in culture. One way forward are culture-independent metagenomic approaches, but these novel methods are rarely rigorously tested, especially for studies of environmental viruses, air microbiomes, extreme environment microbiology and other areas with constrained sample amounts. Here we perform replicated experiments to evaluate Roche 454, Illumina HiSeq, and Ion Torrent PGM sequencing and library preparation protocols on virus metagenomes generated from as little as 10pg of DNA. Results Using %G + C content to compare metagenomes, we find that (i) metagenomes are highly replicable, (ii) some treatment effects are minimal, e.g., sequencing technology choice has 6-fold less impact than varying input DNA amount, and (iii) when restricted to a limited DNA concentration (<1μg), changing the amount of amplification produces little variation. These trends were also observed when examining the metagenomes for gene function and assembly performance, although the latter more closely aligned to sequencing effort and read length than preparation steps tested. Among Illumina library preparation options, transposon-based libraries diverged from all others and adaptor ligation was a critical step for optimizing sequencing yields. Conclusions These data guide researchers in generating systematic, comparative datasets to understand complex ecosystems, and suggest that neither varied amplification nor sequencing platforms will deter such efforts. PMID:23663384

  19. Bioinformatics for Next Generation Sequencing Data

    PubMed Central

    Magi, Alberto; Benelli, Matteo; Gozzini, Alessia; Girolami, Francesca; Torricelli, Francesca; Brandi, Maria Luisa

    2010-01-01

    The emergence of next-generation sequencing (NGS) platforms imposes increasing demands on statistical methods and bioinformatic tools for the analysis and the management of the huge amounts of data generated by these technologies. Even at the early stages of their commercial availability, a large number of software tools already exist for analyzing NGS data. These tools can be fit into many general categories including alignment of sequence reads to a reference, base-calling and/or polymorphism detection, de novo assembly from paired or unpaired reads, structural variant detection and genome browsing. This manuscript aims to guide readers in the choice of the available computational tools that can be used to face the several steps of the data analysis workflow. PMID:24710047

  20. Application of genotyping-by-sequencing on semiconductor sequencing platforms: A comparison of genetic and reference-based marker ordering in barley

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The rapid development of next generation sequencing platforms has enabled the use of sequencing for routine genotyping across a range of genetics studies and breeding applications. Genotyping-by-sequencing (GBS), a low-cost, reduced representation sequencing method, is becoming a common approach fo...

  1. Wolfcampian sequence stratigraphy of eastern Central Basin platform, Texas

    SciTech Connect

    Candelaria, M.P.; Entzminger, D.J.; Behnken, F.H.; Sarg, J.F.; Wilde, G.L.

    1992-04-01

    Integrated study of well logs, cores, high-resolution seismic data, and biostratigraphy has established the sequence framework of the Atokan (Early Pennsylvanian)-Wolfcampian (Early Permian) stratigraphic section along the eastern margin of the Central Basin platform in the Permian basin. Sequence interpretation of high-resolution, high-fold seismic data through this stratigraphic interval has revealed a complex progradational/retrogradational evolution of the platform margin that has demonstrated overall progradation of at least 12 km during early-middle Wolfcampian. Sequence stratigraphic study of the Wolfcamp interval has revealed details of the internal architecture and morphologic evolution of the contemporaneous platform margin. Two generalized seismic facies assemblages are recognized in the Wolfcampian. Platform interior facies are characterized by high-amplitude, laterally continuous parallel reflections; platform margin facies consist of progradational sigmoidal to oblique clinoforms and are characterized by discontinuous, low-amplitude reflections. Sequence interpretation of carbonate platform-to-basin strata geometries helps in predicting subtle stratigraphic trapping relationships and potential reservoir facies distribution. Moreover, this interpretive method assists in describing complex reservoir heterogeneities that can contribute to significant reserve additions from within existing fields.

  2. Solving the problem of comparing whole bacterial genomes across different sequencing platforms.

    PubMed

    Kaas, Rolf S; Leekitcharoenphon, Pimlapas; Aarestrup, Frank M; Lund, Ole

    2014-01-01

    Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one technology because each technology has a systematic bias making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets and sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. It is concluded that the cause of the success of these new procedures is due to a validation of all informative sites that are included in the analysis. The procedures are available as web tools. PMID:25110940

  3. Underlying Data for Sequencing the Mitochondrial Genome with the Massively Parallel Sequencing Platform Ion Torrent™ PGM™

    PubMed Central

    2015-01-01

    Background Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per nucleotide basis and indeed on a per sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, San Francisco, CA), the output data were assessed, and the results were compared with data previously generated on the MiSeq™ (Illumina, San Diego, CA). The objectives of this paper were to determine the feasibility, accuracy, and reliability of sequence data obtained from the PGM. Results 24 samples were multiplexed (in groups of six) and sequenced on the at least 10 megabase throughput 314 chip. The depth of coverage pattern was similar among all 24 samples; however the coverage across the genome varied. For strand bias, the average ratio of coverage between the forward and reverse strands at each nucleotide position indicated that two-thirds of the positions of the genome had ratios that were greater than 0.5. A few sites had more extreme strand bias. Another observation was that 156 positions had a false deletion rate greater than 0.15 in one or more individuals. There were 31-98 (SNP) mtGenome variants observed per sample for the 24 samples analyzed. The total 1237 (SNP) variants were concordant between the results from the PGM and MiSeq. The quality scores for haplogroup assignment for all 24 samples ranged between 88.8%-100%. Conclusions In this study, mtDNA sequence data generated from the PGM were analyzed and the output evaluated. Depth of coverage variation and strand bias were identified but generally were infrequent and did not impact reliability of variant calls. Multiplexing of samples was demonstrated, which can improve throughput and reduce cost per sample analyzed.

  4. Sequence data for Clostridium autoethanogenum using three generations of sequencing technologies

    PubMed Central

    Utturkar, Sagar M; Klingeman, Dawn M; Bruno-Barcena, José M; Chinn, Mari S; Grunden, Amy M; Köpke, Michael; Brown, Steven D

    2015-01-01

    During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms which are characterized by low cost, high throughput and shorter read lengths for example, Illumina. The emergence and development of so called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequence datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data. PMID:25977818

  5. Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies

    SciTech Connect

    Utturkar, Sagar M.; Klingeman, Dawn Marie; Bruno-Barcena, José M.; Chinn, Mari S.; Grunden, Amy; Köpke, Michael; Brown, Steven D.

    2015-04-14

    During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms which are characterized by low cost, high throughput and shorter read lengths for example, Illumina. The emergence and development of so called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequence datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data.

  6. Detection of BRAF Mutations Using a Fully Automated Platform and Comparison with High Resolution Melting, Real-Time Allele Specific Amplification, Immunohistochemistry and Next Generation Sequencing Assays, for Patients with Metastatic Melanoma

    PubMed Central

    Harlé, Alexandre; Salleron, Julia; Franczak, Claire; Dubois, Cindy; Filhine-Tressarieu, Pierre; Leroux, Agnès; Merlin, Jean-Louis

    2016-01-01

    Background Metastatic melanoma is a severe disease with one of the highest mortality rates among skin diseases. Overall survival has significantly improved with immunotherapy and targeted therapies. Kinase inhibitors targeting BRAF V600 showed promising results. BRAF genotyping is mandatory for the prescription of anti-BRAF therapies. Methods Fifty-nine formalin-fixed paraffin-embedded melanoma samples were assessed using High-Resolution-Melting (HRM) PCR, Real-time allele-specific amplification (RT-ASA) PCR, Next generation sequencing (NGS), immunohistochemistry (IHC) and the fully-automated molecular diagnostics platform Idylla™. Sensitivity, specificity, positive predictive value and negative predictive value were calculated using NGS as the reference standard to compare the different assays. Results BRAF mutations were found in 28 (47.5%), 29 (49.2%), 31 (52.5%), 29 (49.2%) and 27 (45.8%) samples with HRM, RT-ASA, NGS, Idylla™ and IHC respectively. Twenty-six (81.2%) samples were found bearing a c.1799T>A (p.Val600Glu) mutation, three (9.4%) with a c.1798_1799delinsAA (p.Val600Lys) mutation and one with c.1789_1790delinsTC (p.Leu597Ser) mutation. Two samples were found bearing complex mutations. Conclusions HRM appears to be the least sensitive assay for the detection of BRAF V600 mutations. The RT-ASA, Idylla™ and IHC assays are suitable for routine molecular diagnostics aiming at the prescription of anti-BRAF therapies. The Idylla™ assay is fully automated, requires less than 2 minutes for sample preparation and is the fastest of the tested assays. PMID:27111917
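
    The four performance measures named in the abstract above follow from a 2x2 comparison of each assay against the reference standard. The block below is a minimal sketch of that arithmetic, assuming the usual definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts.

    tp/fn: reference-positive samples the assay called positive/negative.
    tn/fp: reference-negative samples the assay called negative/positive.
    """
    def ratio(num, den):
        # Guard against empty categories instead of raising ZeroDivisionError.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical counts for illustration only (not data from the abstract).
example = diagnostic_metrics(tp=18, fp=1, fn=2, tn=19)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

    Returning NaN for an empty category is a design choice of this sketch; a real evaluation would more likely report confidence intervals alongside each value.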

  7. Expression Profiling Using New Generation Sequencing Technologies

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Microarray hybridization technology has become widely used in parallel analysis of gene expression. Recent advances in genome sequencing platforms point to an alternate approach through digital quantitation of sequencing reads produced from cDNA samples. This presentation will compare advantages a...

  8. Replacement Sequence of Events Generator

    NASA Technical Reports Server (NTRS)

    Fisher, Forest; Gladden, Daniel Wenkert Roy; Khanampompan, Teerpat

    2008-01-01

    The soeWINDOW program automates the generation of an ITAR (International Traffic in Arms Regulations)-compliant sub-RSOE (Replacement Sequence of Events) by extracting a specified temporal window from an RSOE while maintaining page header information. RSOEs contain a significant amount of information that is not ITAR-compliant, yet that foreign partners need to see for command details to their instrument, as well as the surrounding commands that provide context for validation. soeWINDOW can serve as an example of how command support products can be made ITAR-compliant for future missions. This software is a Perl script intended for use in the mission operations UNIX environment. It is designed for use to support the MRO (Mars Reconnaissance Orbiter) instrument team. The tool also provides automated DOM (Distributed Object Manager) storage into the special ITAR-okay DOM collection, and can be used for creating focused RSOEs for product review by any of the MRO teams.

  9. Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies

    DOE PAGESBeta

    Utturkar, Sagar M.; Klingeman, Dawn Marie; Bruno-Barcena, José M.; Chinn, Mari S.; Grunden, Amy; Köpke, Michael; Brown, Steven D.

    2015-04-14

    During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms which are characterized by low cost, high throughput and shorter read lengths for example, Illumina. The emergence and development of so called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequence datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data.

  10. Next generation sequencing methodologies--an overview.

    PubMed

    Pickrell, William O; Rees, Mark I; Chung, Seo-Kyung

    2012-01-01

    Gene discovery has been one of the most important advances in our understanding of human disorders. Early linkage and positional cloning strategies have now given way to next generation sequencing (NGS) with age-old help from biostatistical and bioinformatical input. In this chapter, we present the importance of getting the basics right, namely, how the best phenotyping in the clinical domain will provide a higher chance of a successful NGS experiment. In addition, we show getting the correct submission of DNA samples to NGS providers is dependent on the type of inheritance pattern that may or may not be apparent. We discuss one of the most crucial decisions for investigators when designing a study, namely choosing a trio, quad or cohort for analysis. Following on from this, we compare and contrast the underlying technology adopted by provider companies as they vie for customers and submissions. Each platform has advantages and disadvantages based on false calls, coverage, and read depth; however, some of these issues may be solved with the third wave of sequencing technology development in early commercial roll-out. Lastly, we provide a bioinformatic filtering overview of a "quad"-based submission and show how 3 million SNPs and indels can be reduced to a biologically plausible and experimentally manageable n≤50 gene variants. PMID:23046880

  11. Construction of a rationally designed antibody platform for sequencing-assisted selection.

    PubMed

    Larman, H Benjamin; Xu, George Jing; Pavlova, Natalya N; Elledge, Stephen J

    2012-11-01

    Antibody discovery platforms have become an important source of both therapeutic biomolecules and research reagents. Massively parallel DNA sequencing can be used to assist antibody selection by comprehensively monitoring libraries during selection, thus greatly expanding the power of these systems. We have therefore constructed a rationally designed, fully defined single-chain variable fragment (scFv) library and analysis platform optimized for analysis with short-read deep sequencing. Sequence-defined oligonucleotide libraries encoding three complementarity-determining regions (L3 from the light chain, H2 and H3 from the heavy chain) were synthesized on a programmable microarray and combinatorially cloned into a single scFv framework for molecular display. Our unique complementarity-determining region sequence design optimizes for protein binding by utilizing a hidden Markov model that was trained on all antibody-antigen cocrystal structures in the Protein Data Bank. The resultant ~10^12-member library was produced in ribosome-display format, and comprehensively analyzed over four rounds of antigen selections by multiplex paired-end Illumina sequencing. The hidden Markov model scFv library generated multiple binders against an emerging cancer antigen and is the basis for a next-generation antibody production platform. PMID:23064642

  12. Membrane platforms for biological nanopore sensing and sequencing.

    PubMed

    Schmidt, Jacob

    2016-06-01

    In the past two decades, biological nanopores have been developed and explored for use in sensing applications as a result of their exquisite sensitivity and easily engineered, reproducible, and economically manufactured structures. Nanopore sensing has been shown to differentiate between highly similar analytes, measure polymer size, detect the presence of specific genes, and rapidly sequence nucleic acids translocating through the pore. Devices featuring protein nanopores have been limited in part by the membrane support containing the nanopore, the shortcomings of which have been addressed in recent work developing new materials, approaches, and apparatus resulting in membrane platforms featuring automatability and increased robustness, lifetime, and measurement throughput. PMID:26773300

  13. ACMG clinical laboratory standards for next-generation sequencing

    PubMed Central

    Rehm, Heidi L.; Bale, Sherri J; Bayrak-Toydemir, Pinar; Berg, Jonathan S; Brown, Kerry K; Deignan, Joshua L; Friez, Michael J; Funke, Birgit H; Hegde, Madhuri R; Lyon, Elaine

    2014-01-01

    Next-generation sequencing technologies have been and continue to be deployed in clinical laboratories, enabling rapid transformations in genomic medicine. These technologies have reduced the cost of large-scale sequencing by several orders of magnitude, and continuous advances are being made. It is now feasible to analyze an individual's near-complete exome or genome to assist in the diagnosis of a wide array of clinical scenarios. Next-generation sequencing technologies are also facilitating further advances in therapeutic decision making and disease prediction for at-risk patients. However, with rapid advances come additional challenges involving the clinical validation and use of these constantly evolving technologies and platforms in clinical laboratories. To assist clinical laboratories with the validation of next-generation sequencing methods and platforms, the ongoing monitoring of next-generation sequencing testing to ensure quality results, and the interpretation and reporting of variants found using these technologies, the American College of Medical Genetics and Genomics has developed the following professional standards and guidelines. PMID:23887774

  14. Metagenomic next-generation sequencing of viruses infecting grapevines.

    PubMed

    Burger, Johan T; Maree, Hans J

    2015-01-01

    Next-generation sequencing (NGS) technologies, for the first time, provide a truly "complete" representation of the viral (and other) pathogens present in a host organism. This is achieved in an unbiased way, and without any prior biological or molecular knowledge of these pathogen(s). During recent years a number of broad approaches, for most of the popular NGS platforms, have been developed. Here we describe such a protocol, one that accurately and reliably analyzes viruses (and viroids) infecting grapevine. Our strategy relies on the synthesis of cDNA sequencing libraries from dsRNA extracted from diseased grapevine tissues, the sequencing of these on an Illumina platform, and a streamlined bioinformatics pipeline to analyze the NGS data, yielding the virus composition (virome) of a specific grapevine tissue type, organ, entire plant, or even a vineyard. PMID:25981264

  15. Utilization of Benchtop Next Generation Sequencing Platforms Ion Torrent PGM and MiSeq in Noninvasive Prenatal Testing for Chromosome 21 Trisomy and Testing of Impact of In Silico and Physical Size Selection on Its Analytical Performance

    PubMed Central

    Minarik, Gabriel; Repiska, Gabriela; Hyblova, Michaela; Nagyova, Emilia; Soltys, Katarina; Budis, Jaroslav; Duris, Frantisek; Sysak, Rastislav; Gerykova Bujalkova, Maria; Vlkova-Izrael, Barbora; Biro, Orsolya; Nagy, Balint; Szemes, Tomas

    2015-01-01

    Objectives The aims of this study were to test the utility of benchtop NGS platforms for NIPT for trisomy 21 using previously published z score calculation methods and to optimize the sample preparation and data analysis with use of in silico and physical size selection methods. Methods Samples from 130 pregnant women were analyzed by whole genome sequencing on benchtop NGS systems Ion Torrent PGM and MiSeq. The targeted yield of 3 million raw reads on each platform was used for z score calculation. The impact of in silico and physical size selection on analytical performance of the test was studied. Results Using a z score value of 3 as the cut-off, 98.11% - 100% (104-106/106) specificity and 100% (24/24) sensitivity and 99.06% - 100% (105-106/106) specificity and 100% (24/24) sensitivity were observed for Ion Torrent PGM and MiSeq, respectively. After in silico based size selection both platforms reached 100% specificity and sensitivity. Following the physical size selection z scores of tested trisomic samples increased significantly—p = 0.0141 and p = 0.025 for Ion Torrent PGM and MiSeq, respectively. Conclusions Noninvasive prenatal testing for chromosome 21 trisomy with the utilization of benchtop NGS systems led to results equivalent to previously published studies performed on high-to-ultrahigh throughput NGS systems. The in silico size selection led to higher specificity of the test. Physical size selection performed on isolated DNA led to significant increase in z scores. The observed results could represent a basis for increasing of cost effectiveness of the test and thus help with its penetration worldwide. PMID:26669558
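
    The z score test described in the abstract above can be sketched as follows. This is a minimal illustration, assuming the commonly used form in which a sample's fraction of reads mapped to chromosome 21 is compared with the mean and standard deviation of that fraction in euploid reference samples; the exact calculation used in the study may differ, and all function names and numbers below are hypothetical.

```python
from statistics import mean, stdev


def chr21_fraction(chr21_reads, total_reads):
    """Fraction of a sample's mapped reads that fall on chromosome 21."""
    return chr21_reads / total_reads


def z_score(sample_fraction, reference_fractions):
    """Standardize a sample's chr21 fraction against euploid references."""
    mu = mean(reference_fractions)
    sigma = stdev(reference_fractions)  # sample standard deviation (n - 1)
    return (sample_fraction - mu) / sigma


def call_trisomy21(z, cutoff=3.0):
    """Flag a sample when its z score exceeds the cut-off (3 in the abstract)."""
    return z > cutoff


# Hypothetical euploid reference fractions, for illustration only.
reference = [0.0130, 0.0131, 0.0129, 0.0130, 0.0132, 0.0128]
sample = chr21_fraction(chr21_reads=41_400, total_reads=3_000_000)
z = z_score(sample, reference)
print(f"z = {z:.2f}, flagged = {call_trisomy21(z)}")
```

    In this form, size selection (in silico or physical) would change only the reads that enter `chr21_fraction`; the standardization step stays the same.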

  16. Concept For Generation Of Long Pseudorandom Sequences

    NASA Technical Reports Server (NTRS)

    Wang, C. C.

    1990-01-01

    Conceptual very-large-scale integrated (VLSI) digital circuit performs exponentiation in finite field. Algorithm that generates unusually long sequences of pseudorandom numbers executed by digital processor that includes such circuits. Concepts particularly advantageous for such applications as spread-spectrum communications, cryptography, and generation of ranging codes, synthetic noise, and test data, where usually desirable to make pseudorandom sequences as long as possible.

  17. Metagenomics using next-generation sequencing.

    PubMed

    Bragg, Lauren; Tyson, Gene W

    2014-01-01

    Traditionally, microbial genome sequencing has been restricted to the small number of species that can be grown in pure culture. The progressive development of culture-independent methods over the last 15 years now allows researchers to sequence microbial communities directly from environmental samples. This approach is commonly referred to as "metagenomics" or "community genomics". However, the term metagenomics is applied liberally in the literature to describe any culture-independent analysis of microbial communities. Here, we define metagenomics as shotgun ("random") sequencing of the genomic DNA of a sample taken directly from the environment. The metagenome can be thought of as a sampling of the collective genome of the microbial community. We outline the considerations and analyses that should be undertaken to ensure the success of a metagenomic sequencing project, including the choice of sequencing platform and methods for assembly, binning, annotation, and comparative analysis. PMID:24515370

  18. Next-Generation Sequencing for Cancer Diagnostics: a Practical Perspective

    PubMed Central

    Meldrum, Cliff; Doyle, Maria A; Tothill, Richard W

    2011-01-01

    Next-generation sequencing (NGS) is arguably one of the most significant technological advances in the biological sciences of the last 30 years. The second generation sequencing platforms have advanced rapidly to the point that several genomes can now be sequenced simultaneously in a single instrument run in under two weeks. Targeted DNA enrichment methods allow even higher genome throughput at a reduced cost per sample. Medical research has embraced the technology and the cancer field is at the forefront of these efforts given the genetic aspects of the disease. World-wide efforts to catalogue mutations in multiple cancer types are underway and this is likely to lead to new discoveries that will be translated to new diagnostic, prognostic and therapeutic targets. NGS is now maturing to the point where it is being considered by many laboratories for routine diagnostic use. The sensitivity, speed and reduced cost per sample make it a highly attractive platform compared to other sequencing modalities. Moreover, as we identify more genetic determinants of cancer there is a greater need to adopt multi-gene assays that can quickly and reliably sequence complete genes from individual patient samples. Whilst widespread and routine use of whole genome sequencing is likely to be a few years away, there are immediate opportunities to implement NGS for clinical use. Here we review the technology, methods and applications that can be immediately considered and some of the challenges that lie ahead. PMID:22147957

  19. ADS: The Next Generation Search Platform

    NASA Astrophysics Data System (ADS)

    Accomazzi, A.; Kurtz, M. J.; Henneken, E. A.; Chyla, R.; Luker, J.; Grant, C. S.; Thompson, D. M.; Holachek, A.; Dave, R.; Murray, S. S.

    2015-04-01

    Four years after the last LISA meeting, the NASA Astrophysics Data System (ADS) finds itself in the middle of major changes to the infrastructure and contents of its database. In this paper we highlight a number of features of great importance to librarians and discuss the additional functionality that we are currently developing. Our citation coverage has doubled since 2010 and now consists of over 10 million citations. We are normalizing the affiliation information in our records and we have started collecting and linking funding sources with papers in our system. At the same time, we are undergoing major technology changes in the ADS platform. We have rolled out and are now enhancing a new high-performance search engine capable of performing full-text as well as metadata searches using an intuitive query language. We are currently able to index acknowledgments, affiliations, citations, and funding sources. While this effort is still ongoing, some of its benefits are already available through the ADS Labs user interface and API at http://adslabs.org/adsabs/.

  20. Multifunctional pulse sequence generator for pulse NMR

    NASA Astrophysics Data System (ADS)

    Wang, Dongsheng

    1988-06-01

    A new multifunctional pulse sequence generator has been designed and constructed. It can conveniently generate various pulse sequences used in nuclear-magnetic resonance (NMR) to measure the spin-lattice relaxation time T1, the spin-spin relaxation time T2, and the spin-locking relaxation time T1ρ. It can also be used in pulse Fourier transform NMR and double resonance. The intervals of pulses can increase automatically with sequence repetitions and the generator can be used in two-dimensional spectrum measurement and spin-density imaging research. The sequences can be generated through four different triggering methods and there are two synchronous pulse outputs and fifteen auxiliary pulse outputs, so the generator can be conveniently interfaced with a computer or other instruments. The circuitry, functions, and features of the generator are described in this article.

  1. A window into third-generation sequencing.

    PubMed

    Schadt, Eric E; Turner, Steve; Kasarskis, Andrew

    2010-10-15

    First- and second-generation sequencing technologies have led the way in revolutionizing the field of genomics and beyond, motivating an astonishing number of scientific advances, including enabling a more complete understanding of whole genome sequences and the information encoded therein, a more complete characterization of the methylome and transcriptome and a better understanding of interactions between proteins and DNA. Nevertheless, there are sequencing applications and aspects of genome biology that are presently beyond the reach of current sequencing technologies, leaving fertile ground for additional innovation in this space. In this review, we describe a new generation of single-molecule sequencing technologies (third-generation sequencing) that is emerging to fill this space, with the potential for dramatically longer read lengths, shorter time to result and lower overall cost. PMID:20858600

  2. NG6: Integrated next generation sequencing storage and processing environment

    PubMed Central

    2012-01-01

    Background Next generation sequencing platforms are now well established in sequencing centres and some laboratories. Upcoming smaller scale machines such as the 454 junior from Roche or the MiSeq from Illumina will increase the number of laboratories hosting a sequencer. In such a context, it is important to provide these teams with an easily manageable environment to store and process the produced reads. Results We describe a user-friendly information system able to manage large sets of sequencing data. It includes, on one hand, a workflow environment already containing pipelines adapted to different input formats (sff, fasta, fastq and qseq), different sequencers (Roche 454, Illumina HiSeq) and various analyses (quality control, assembly, alignment, diversity studies, …) and, on the other hand, a secured web site giving access to the results. The connected user will be able to download raw and processed data and browse through the analysis result statistics. The provided workflows can easily be modified or extended and new ones can be added. Ergatis is used as a workflow building, running and monitoring system. The analyses can be run locally or in a cluster environment using Sun Grid Engine. Conclusions NG6 is a complete information system designed to answer the needs of a sequencing platform. It provides a user-friendly interface to process, store and download high-throughput sequencing data. PMID:22958229

  3. Next generation platforms for high-throughput biodosimetry

    PubMed Central

    Repin, Mikhail; Turner, Helen C.; Garty, Guy; Brenner, David J.

    2014-01-01

    Here the general concept of the combined use of plates and tubes in racks compatible with the American National Standards Institute/the Society for Laboratory Automation and Screening microplate formats as the next generation platforms for increasing the throughput of biodosimetry assays was described. These platforms can be used at different stages of biodosimetry assays starting from blood collection into microtubes organised in standardised racks and ending with the cytogenetic analysis of samples in standardised multiwell and multichannel plates. Robotically friendly platforms can be used for different biodosimetry assays in minimally equipped laboratories and on cost-effective automated universal biotech systems. PMID:24837249

  4. Iterative method for generating correlated binary sequences

    NASA Astrophysics Data System (ADS)

    Usatenko, O. V.; Melnik, S. S.; Apostolov, S. S.; Makarov, N. M.; Krokhin, A. A.

    2014-11-01

    We propose an efficient iterative method for generating random correlated binary sequences with a prescribed correlation function. The method is based on consecutive linear modulations of an initially uncorrelated sequence into a correlated one. Each step of modulation increases the correlations until the desired level has been reached. The robustness and efficiency of the proposed algorithm are tested by generating sequences with inverse power-law correlations. The substantial increase in the strength of correlation in the iterative method with respect to single-step filtering generation is shown for all studied correlation functions. Our results can be used for design of disordered superlattices, waveguides, and surfaces with selective transport properties.

  5. Next-generation sequencing in the clinic: promises and challenges.

    PubMed

    Xuan, Jiekun; Yu, Ying; Qing, Tao; Guo, Lei; Shi, Leming

    2013-11-01

    The advent of next generation sequencing (NGS) technologies has revolutionized the field of genomics, enabling fast and cost-effective generation of genome-scale sequence data with exquisite resolution and accuracy. Over the past years, rapid technological advances led by academic institutions and companies have continued to broaden NGS applications from research to the clinic. A recent crop of discoveries have highlighted the medical impact of NGS technologies on Mendelian and complex diseases, particularly cancer. However, the ever-increasing pace of NGS adoption presents enormous challenges in terms of data processing, storage, management and interpretation as well as sequencing quality control, which hinder the translation from sequence data into clinical practice. In this review, we first summarize the technical characteristics and performance of current NGS platforms. We further highlight advances in the applications of NGS technologies towards the development of clinical diagnostics and therapeutics. Common issues in NGS workflows are also discussed to guide the selection of NGS platforms and pipelines for specific research purposes. PMID:23174106

  6. Double-digest RAD sequencing using Ion Proton semiconductor platform (ddRADseq-ion) with nonmodel organisms.

    PubMed

    Recknagel, Hans; Jacobs, Arne; Herzyk, Pawel; Elmer, Kathryn R

    2015-11-01

    Research in evolutionary biology involving nonmodel organisms is rapidly shifting from using traditional molecular markers such as mtDNA and microsatellites to higher throughput SNP genotyping methodologies to address questions in population genetics, phylogenetics and genetic mapping. Restriction site associated DNA sequencing (RAD sequencing or RADseq) has become an established method for SNP genotyping on Illumina sequencing platforms. Here, we developed a protocol and adapters for double-digest RAD sequencing for Ion Torrent (Life Technologies; Ion Proton, Ion PGM) semiconductor sequencing. We sequenced thirteen genomic libraries of three different nonmodel vertebrate species on Ion Proton with PI chips: Arctic charr Salvelinus alpinus, European whitefish Coregonus lavaretus and common lizard Zootoca vivipara. This resulted in ~962 million single-end reads overall and a mean of ~74 million reads per library. We filtered the genomic data using Stacks, a bioinformatic tool to process RAD sequencing data. On average, we obtained ~11,000 polymorphic loci per library of 6-30 individuals. We validate our new method by technical and biological replication, by reconstructing phylogenetic relationships, and using a hybrid genetic cross to track genomic variants. Finally, we discuss the differences between using the different sequencing platforms in the context of RAD sequencing, assessing possible advantages and disadvantages. We show that our protocol can be used for Ion semiconductor sequencing platforms for the rapid and cost-effective generation of variable and reproducible genetic markers. PMID:25808755

  7. Comparison of Next-Generation Sequencing Systems

    PubMed Central

    Liu, Lin; Li, Yinhu; Li, Siliang; Hu, Ni; He, Yimin; Pong, Ray; Lin, Danni; Lu, Lihua; Law, Maggie

    2012-01-01

    With fast development and wide applications of next-generation sequencing (NGS) technologies, genomic sequence information is within reach to aid the achievement of goals to decode life mysteries, make better crops, detect pathogens, and improve life qualities. NGS systems are typically represented by SOLiD/Ion Torrent PGM from Life Sciences, Genome Analyzer/HiSeq 2000/MiSeq from Illumina, and GS FLX Titanium/GS Junior from Roche. Beijing Genomics Institute (BGI), which possesses the world's biggest sequencing capacity, has multiple NGS systems including 137 HiSeq 2000, 27 SOLiD, one Ion Torrent PGM, one MiSeq, and one 454 sequencer. We have accumulated extensive experience in sample handling, sequencing, and bioinformatics analysis. In this paper, technologies of these systems are reviewed, and first-hand data from extensive experience is summarized and analyzed to discuss the advantages and specifics associated with each sequencing system. At last, applications of NGS are summarized. PMID:22829749

  8. Theory of Periodic-Binary-Sequence Generators

    NASA Technical Reports Server (NTRS)

    Perlman, M.

    1987-01-01

    Algorithms yield feedback shift registers with maximum regularity. Report provides extensive mathematical treatment of new and previous results related to generation of pseudo-noise binary sequences by feedback shift registers. Generator architectures amenable to efficient implementation in very-large-scale integrated (VLSI) circuits. Report includes literature references to applications of such sequences in random-number generation, radar, VLSI testing, data encryption and decryption, algebraic error-detection and error-correction encoding and decoding, and feedback-shift-register synthesis of sequential machines.
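    As an illustration of the kind of generator the report describes, the following is a minimal sketch of a Fibonacci-style feedback shift register that produces a pseudo-noise binary sequence. It is not taken from the report: the function names, the 4-stage register, and the tap positions (4, 3) are assumptions chosen only because they give a short maximal-length sequence.

```python
def lfsr_bits(taps, state, count):
    """Return `count` output bits from a Fibonacci feedback shift register.

    taps  -- 1-based stage positions whose bits are XORed to form the feedback
    state -- initial register contents as a list of 0/1 (must not be all zero)
    """
    reg = list(state)
    out = []
    for _ in range(count):
        out.append(reg[-1])            # output is taken from the last stage
        feedback = 0
        for t in taps:
            feedback ^= reg[t - 1]     # XOR of the tapped stages
        reg = [feedback] + reg[:-1]    # shift right, feed back into stage 1
    return out


def lfsr_period(taps, state):
    """Count shifts until the register returns to its initial state."""
    start = list(state)
    reg = list(state)
    steps = 0
    while True:
        feedback = 0
        for t in taps:
            feedback ^= reg[t - 1]
        reg = [feedback] + reg[:-1]
        steps += 1
        if reg == start:
            return steps


if __name__ == "__main__":
    # Hypothetical 4-stage register with taps at stages 4 and 3.
    print(lfsr_period((4, 3), [1, 0, 0, 0]))
    print(lfsr_bits((4, 3), [1, 0, 0, 0], 15))
```

    With these assumed taps the register cycles through all 15 nonzero states before repeating, which is the maximum possible for four stages.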

  9. Reproducibility of Variant Calls in Replicate Next Generation Sequencing Experiments

    PubMed Central

    Qi, Yuan; Liu, Xiuping; Liu, Chang-gong; Wang, Bailing; Hess, Kenneth R.; Symmans, W. Fraser; Shi, Weiwei; Pusztai, Lajos

    2015-01-01

    Nucleotide alterations detected by next generation sequencing are not always true biological changes but could represent sequencing errors. Even highly accurate methods can yield substantial error rates when applied to millions of nucleotides. In this study, we examined the reproducibility of nucleotide variant calls in replicate sequencing experiments of the same genomic DNA. We performed targeted sequencing of all known human protein kinase genes (kinome) (~3.2 Mb) using the SOLiD v4 platform. Seventeen breast cancer samples were sequenced in duplicate (n=14) or triplicate (n=3) to assess concordance of all calls and single nucleotide variant (SNV) calls. The concordance rates over the entire sequenced region were >99.99%, while the concordance rates for SNVs were 54.3-75.5%. There was substantial variation in basic sequencing metrics from experiment to experiment. The type of nucleotide substitution and genomic location of the variant had little impact on concordance but concordance increased with coverage level, variant allele count (VAC), variant allele frequency (VAF), variant allele quality and p-value of SNV-call. The most important determinants of concordance were VAC and VAF. Even using the highest stringency of QC metrics the reproducibility of SNV calls was around 80% suggesting that erroneous variant calling can be as high as 20-40% in a single experiment. The sequence data have been deposited into the European Genome-phenome Archive (EGA) with accession number EGAS00001000826. PMID:26136146
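    To make the notion of SNV-call concordance between replicates concrete, here is a small sketch. The abstract does not state the exact formula the authors used, so the definition below (calls shared by both replicates divided by calls made in either) is an assumption, and the function name and the variant tuples are hypothetical.

```python
def snv_concordance(calls_a, calls_b):
    """Fraction of SNV calls shared by two replicates.

    Each call is a hashable key such as (chromosome, position, ref, alt).
    Assumed definition: |A intersect B| / |A union B|.
    """
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        raise ValueError("no SNV calls in either replicate")
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Made-up calls for illustration only; not data from the study.
    rep1 = [("chr1", 100, "A", "G"), ("chr2", 250, "C", "T"),
            ("chr3", 75, "G", "A"), ("chr5", 910, "T", "C")]
    rep2 = [("chr1", 100, "A", "G"), ("chr2", 250, "C", "T"),
            ("chr3", 75, "G", "A"), ("chr7", 42, "A", "T")]
    print(snv_concordance(rep1, rep2))
```

    Under this definition, a call made in only one replicate lowers concordance even when every shared call agrees exactly, which is one way a region-wide agreement above 99.99% can coexist with much lower SNV concordance.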

  10. Massively parallel multiplex DNA sequencing for specimen identification using an Illumina MiSeq platform.

    PubMed

    Shokralla, Shadi; Porter, Teresita M; Gibson, Joel F; Dobosz, Rafal; Janzen, Daniel H; Hallwachs, Winnie; Golding, G Brian; Hajibabaei, Mehrdad

    2015-01-01

    Genetic information is a valuable component of biosystematics, especially specimen identification through the use of species-specific DNA barcodes. Although many genomics applications have shifted to High-Throughput Sequencing (HTS) or Next-Generation Sequencing (NGS) technologies, sample identification (e.g., via DNA barcoding) is still most often done with Sanger sequencing. Here, we present a scalable double dual-indexing approach using an Illumina MiSeq platform to sequence DNA barcode markers. We achieved 97.3% success by using half of an Illumina MiSeq flowcell to obtain 658 base pairs of the cytochrome c oxidase I DNA barcode in 1,010 specimens from eleven orders of arthropods. Our approach recovers a greater proportion of DNA barcode sequences from individuals than does conventional Sanger sequencing, while at the same time reducing both per specimen costs and labor time by nearly 80%. In addition, the use of HTS allows the recovery of multiple sequences per specimen, for deeper analysis of genetic variation in target gene regions. PMID:25884109

  11. Improving molecular diagnosis in epilepsy by a dedicated high-throughput sequencing platform

    PubMed Central

    Mina, Erika Della; Ciccone, Roberto; Brustia, Francesca; Bayindir, Baran; Limongelli, Ivan; Vetro, Annalisa; Iascone, Maria; Pezzoli, Laura; Bellazzi, Riccardo; Perotti, Gianfranco; De Giorgis, Valentina; Lunghi, Simona; Coppola, Giangennaro; Orcesi, Simona; Merli, Pietro; Savasta, Salvatore; Veggiotti, Pierangelo; Zuffardi, Orsetta

    2015-01-01

    We analyzed by next-generation sequencing (NGS) 67 epilepsy genes in 19 patients with different types of either isolated or syndromic epileptic disorders and in 15 controls to investigate whether a quick and cheap molecular diagnosis could be provided. The average number of nonsynonymous and splice site mutations per subject was similar in the two cohorts indicating that, even with relatively small targeted platforms, finding the disease gene is not a univocal process. Our diagnostic yield was 47% with nine cases in which we identified a very likely causative mutation. In most of them no interpretation would have been possible in the absence of detailed phenotype and familial information. Seven out of 19 patients had a phenotype suggesting the involvement of a specific gene. Disease-causing mutations were found in six of these cases. Among the remaining patients, we could find a probably causative mutation only in three. None of the genes affected in the latter cases had been suspected a priori. Our protocol requires 8–10 weeks including the investigation of the parents with a cost per patient comparable to sequencing of 1–2 medium-to-large-sized genes by conventional techniques. The platform we used, although providing much less information than whole-exome or whole-genome sequencing, has the advantage that it can also be run on ‘benchtop’ sequencers combining rapid turnaround times with higher manageability. PMID:24848745

  12. Improving molecular diagnosis in epilepsy by a dedicated high-throughput sequencing platform.

    PubMed

    Della Mina, Erika; Ciccone, Roberto; Brustia, Francesca; Bayindir, Baran; Limongelli, Ivan; Vetro, Annalisa; Iascone, Maria; Pezzoli, Laura; Bellazzi, Riccardo; Perotti, Gianfranco; De Giorgis, Valentina; Lunghi, Simona; Coppola, Giangennaro; Orcesi, Simona; Merli, Pietro; Savasta, Salvatore; Veggiotti, Pierangelo; Zuffardi, Orsetta

    2015-03-01

    We analyzed by next-generation sequencing (NGS) 67 epilepsy genes in 19 patients with different types of either isolated or syndromic epileptic disorders and in 15 controls to investigate whether a quick and cheap molecular diagnosis could be provided. The average number of nonsynonymous and splice site mutations per subject was similar in the two cohorts indicating that, even with relatively small targeted platforms, finding the disease gene is not a univocal process. Our diagnostic yield was 47% with nine cases in which we identified a very likely causative mutation. In most of them no interpretation would have been possible in the absence of detailed phenotype and familial information. Seven out of 19 patients had a phenotype suggesting the involvement of a specific gene. Disease-causing mutations were found in six of these cases. Among the remaining patients, we could find a probably causative mutation only in three. None of the genes affected in the latter cases had been suspected a priori. Our protocol requires 8-10 weeks including the investigation of the parents with a cost per patient comparable to sequencing of 1-2 medium-to-large-sized genes by conventional techniques. The platform we used, although providing much less information than whole-exome or whole-genome sequencing, has the advantage that it can also be run on 'benchtop' sequencers combining rapid turnaround times with higher manageability. PMID:24848745

  13. A comparison of rumen microbial profiles in dairy cows as retrieved by 454 Roche and Ion Torrent (PGM) sequencing platforms.

    PubMed

    Indugu, Nagaraju; Bittinger, Kyle; Kumar, Sanjay; Vecchiarelli, Bonnie; Pitta, Dipti

    2016-01-01

    Next generation sequencing (NGS) technology is a widely accepted tool used by microbial ecologists to explore complex microbial communities in different ecosystems. As new NGS platforms continue to become available, it becomes imperative to compare data obtained from different platforms and analyze their effect on microbial community structure. In the present study, we compared sequencing data from both the 454 and Ion Torrent (PGM) platforms on the same DNA samples obtained from the rumen of dairy cows during their transition period. Despite the substantial difference in the number of reads, error rate and length of reads between the two platforms, we identified similar community composition between the two data sets. Procrustes analysis revealed similar correlations (M² = 0.319; P = 0.001) in the microbial community composition between the two platforms. Both platforms revealed the abundance of the same bacterial phyla which were Bacteroidetes and Firmicutes; however, PGM recovered an additional four phyla. Comparisons made at the genus level by each platform revealed differences in only a few genera such as Prevotella, Ruminococcus, Succiniclasticum and Treponema (p < 0.05; chi square test). Collectively, we conclude that the output generated from PGM and 454 yielded concurrent results, provided stringent bioinformatics pipelines are employed. PMID:26870608

  14. A comparison of rumen microbial profiles in dairy cows as retrieved by 454 Roche and Ion Torrent (PGM) sequencing platforms

    PubMed Central

    Indugu, Nagaraju; Bittinger, Kyle; Kumar, Sanjay; Vecchiarelli, Bonnie

    2016-01-01

    Next generation sequencing (NGS) technology is a widely accepted tool used by microbial ecologists to explore complex microbial communities in different ecosystems. As new NGS platforms continue to become available, it becomes imperative to compare data obtained from different platforms and analyze their effect on microbial community structure. In the present study, we compared sequencing data from both the 454 and Ion Torrent (PGM) platforms on the same DNA samples obtained from the rumen of dairy cows during their transition period. Despite the substantial difference in the number of reads, error rate and length of reads between the two platforms, we identified similar community composition between the two data sets. Procrustes analysis revealed similar correlations (M² = 0.319; P = 0.001) in the microbial community composition between the two platforms. Both platforms revealed the abundance of the same bacterial phyla which were Bacteroidetes and Firmicutes; however, PGM recovered an additional four phyla. Comparisons made at the genus level by each platform revealed differences in only a few genera such as Prevotella, Ruminococcus, Succiniclasticum and Treponema (p < 0.05; chi square test). Collectively, we conclude that the output generated from PGM and 454 yielded concurrent results, provided stringent bioinformatics pipelines are employed. PMID:26870608

  15. Open-Phylo: a customizable crowd-computing platform for multiple sequence alignment.

    PubMed

    Kwak, Daniel; Kam, Alfred; Becerra, David; Zhou, Qikuan; Hops, Adam; Zarour, Eleyine; Kam, Arthur; Sarmenta, Luis; Blanchette, Mathieu; Waldispühl, Jérôme

    2013-01-01

    Citizen science games such as Galaxy Zoo, Foldit, and Phylo aim to harness the intelligence and processing power generated by crowds of online gamers to solve scientific problems. However, the selection of the data to be analyzed through these games is under the exclusive control of the game designers, and so are the results produced by gamers. Here, we introduce Open-Phylo, a freely accessible crowd-computing platform that enables any scientist to enter our system and use crowds of gamers to assist computer programs in solving one of the most fundamental problems in genomics: the multiple sequence alignment problem. PMID:24148814

  16. SNP Discovery through Next-Generation Sequencing and Its Applications

    PubMed Central

    Kumar, Santosh; Banks, Travis W.; Cloutier, Sylvie

    2012-01-01

    The decreasing cost along with rapid progress in next-generation sequencing and related bioinformatics computing resources has facilitated large-scale discovery of SNPs in various model and nonmodel plant species. Large numbers and genome-wide availability of SNPs make them the marker of choice in partially or completely sequenced genomes. Although excellent reviews have been published on next-generation sequencing, its associated bioinformatics challenges, and the applications of SNPs in genetic studies, a comprehensive review connecting these three intertwined research areas is needed. This paper touches upon various aspects of SNP discovery, highlighting key points in availability and selection of appropriate sequencing platforms, bioinformatics pipelines, SNP filtering criteria, and applications of SNPs in genetic analyses. The use of next-generation sequencing methodologies in many non-model crops leading to discovery and implementation of SNPs in various genetic studies is discussed. Development and improvement of bioinformatics software that are open source and freely available have accelerated the SNP discovery while reducing the associated cost. Key considerations for SNP filtering and associated pipelines are discussed in specific topics. A list of commonly used software and their sources is compiled for easy access and reference. PMID:23227038

  17. Next-Generation Sequence Assembly: Four Stages of Data Processing and Computational Challenges

    PubMed Central

    El-Metwally, Sara; Hamza, Taher; Zakaria, Magdi; Helmy, Mohamed

    2013-01-01

    Decoding DNA symbols using next-generation sequencers was a major breakthrough in genomic research. Despite the many advantages of next-generation sequencers, e.g., the high-throughput sequencing rate and relatively low cost of sequencing, the assembly of the reads produced by these sequencers still remains a major challenge. In this review, we address the basic framework of next-generation genome sequence assemblers, which comprises four basic stages: preprocessing filtering, a graph construction process, a graph simplification process, and postprocessing filtering. Here we discuss them as a framework of four stages for data analysis and processing and survey a variety of techniques, algorithms, and software tools used during each stage. We also discuss the challenges that face current assemblers in the next-generation environment to determine the current state-of-the-art. We recommend a layered architecture approach for constructing a general assembler that can handle the sequences generated by different sequencing platforms. PMID:24348224
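    The graph construction stage named in this abstract can be sketched as follows. The abstract does not say which graph model is meant, so this example assumes a de Bruijn graph built from k-mers, one common choice among assemblers; the function name, the reads, and the value of k are hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges come from k-mers.

    Returns a dict mapping each prefix (k-1)-mer to the set of suffix
    (k-1)-mers that follow it in at least one read.
    """
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of width k along the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


if __name__ == "__main__":
    # Two overlapping toy reads; real input would first pass a quality filter.
    graph = build_de_bruijn(["ACGT", "CGTA"], k=3)
    for node in sorted(graph):
        print(node, "->", sorted(graph[node]))
```

    In a fuller pipeline, this step would sit between a preprocessing filter and a graph simplification pass that, for example, merges unbranched paths; neither is shown here.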

  18. Platforms.

    PubMed

    Josko, Deborah

    2014-01-01

    The advent of DNA sequencing technologies and the various applications that can be performed will have a dramatic effect on medicine and healthcare in the near future. There are several DNA sequencing platforms available on the market for research and clinical use. Based on the medical laboratory scientist or researcher's needs and taking into consideration laboratory space and budget, one can choose which platform will be beneficial to their institution and their patient population. Although some of the instrument costs seem high, diagnosing a patient quickly and accurately will save hospitals money with fewer hospital stays and targeted treatment based on an individual's genetic make-up. By determining the type of disease an individual has, based on the mutations present or having the ability to prescribe the appropriate antimicrobials based on the knowledge of the organism's resistance patterns, the clinician will be better able to treat and diagnose a patient which ultimately will improve patient outcomes and prognosis. PMID:25219075

  19. Impact of next generation sequencing techniques in food microbiology.

    PubMed

    Mayo, Baltasar; Rachid, Caio T C C; Alegría, Angel; Leite, Analy M O; Peixoto, Raquel S; Delgado, Susana

    2014-08-01

    Taking Maxam-Gilbert and Sanger sequencing as the first generation, in recent years there has been an explosion of newly-developed sequencing strategies, which are usually referred to as next generation sequencing (NGS) techniques. NGS techniques are high-throughput and produce thousands or even millions of sequences at the same time. These sequences allow for the accurate identification of microbial taxa, including uncultivable organisms and those present in small numbers. In specific applications, NGS provides a complete inventory of all microbial operons and genes present or being expressed under different study conditions. NGS techniques are revolutionizing the field of microbial ecology and have recently been used to examine several food ecosystems. After a short introduction to the most common NGS systems and platforms, this review addresses how NGS techniques have been employed in the study of food microbiota and food fermentations, and discusses their limits and perspectives. The most important findings are reviewed, including those made in the study of the microbiota of milk, fermented dairy products, and plant-, meat- and fish-derived fermented foods. The knowledge that can be gained on microbial diversity, population structure and population dynamics via the use of these technologies could be vital in improving the monitoring and manipulation of foods and fermented food products. They should also improve their safety. PMID:25132799

  20. Impact of Next Generation Sequencing Techniques in Food Microbiology

    PubMed Central

    Mayo, Baltasar; Rachid, Caio T. C. C; Alegría, Ángel; Leite, Analy M. O; Peixoto, Raquel S; Delgado, Susana

    2014-01-01

    Taking Maxam-Gilbert and Sanger sequencing as the first generation, in recent years there has been an explosion of newly-developed sequencing strategies, which are usually referred to as next generation sequencing (NGS) techniques. NGS techniques are high-throughput and produce thousands or even millions of sequences at the same time. These sequences allow for the accurate identification of microbial taxa, including uncultivable organisms and those present in small numbers. In specific applications, NGS provides a complete inventory of all microbial operons and genes present or being expressed under different study conditions. NGS techniques are revolutionizing the field of microbial ecology and have recently been used to examine several food ecosystems. After a short introduction to the most common NGS systems and platforms, this review addresses how NGS techniques have been employed in the study of food microbiota and food fermentations, and discusses their limits and perspectives. The most important findings are reviewed, including those made in the study of the microbiota of milk, fermented dairy products, and plant-, meat- and fish-derived fermented foods. The knowledge that can be gained on microbial diversity, population structure and population dynamics via the use of these technologies could be vital in improving the monitoring and manipulation of foods and fermented food products. They should also improve their safety. PMID:25132799

  1. Next Generation Sequencing Reveals the Hidden Diversity of Zooplankton Assemblages

    PubMed Central

    Harmer, Rachel A.; Somerfield, Paul J.; Atkinson, Angus

    2013-01-01

    Background Zooplankton play an important role in our oceans, in biogeochemical cycling and providing a food source for commercially important fish larvae. However, difficulties in correctly identifying zooplankton hinder our understanding of their roles in marine ecosystem functioning, and can prevent detection of long term changes in their community structure. The advent of massively parallel next generation sequencing technology allows DNA sequence data to be recovered directly from whole community samples. Here we assess the ability of such sequencing to quantify richness and diversity of a mixed zooplankton assemblage from a productive time series site in the Western English Channel. Methodology/Principal Findings Plankton net hauls (200 µm) were taken at the Western Channel Observatory station L4 in September 2010 and January 2011. These samples were analysed by microscopy and metagenetic analysis of the 18S nuclear small subunit ribosomal RNA gene using the 454 pyrosequencing platform. Following quality control a total of 419,041 sequences were obtained for all samples. The sequences clustered into 205 operational taxonomic units (OTUs) using a 97% similarity cut-off. Allocation of taxonomy by comparison with the National Centre for Biotechnology Information database identified 135 OTUs to species level, 11 to genus level and 1 to order; <2.5% of sequences were classified as unknowns. By comparison a skilled microscopic analyst was able to routinely enumerate only 58 taxonomic groups. Conclusions Metagenetics reveals a previously hidden taxonomic richness, especially for Copepoda and hard-to-identify meroplankton such as Bivalvia, Gastropoda and Polychaeta. It also reveals rare species and parasites. We conclude that Next Generation Sequencing of 18S amplicons is a powerful tool for elucidating the true diversity and species richness of zooplankton communities. While this approach allows for broad diversity assessments of plankton it may become increasingly

  2. Next-Generation Sequencing for Binary Protein–Protein Interactions

    PubMed Central

    Suter, Bernhard; Zhang, Xinmin; Pesce, C. Gustavo; Mendelsohn, Andrew R.; Dinesh-Kumar, Savithramma P.; Mao, Jian-Hua

    2015-01-01

    The yeast two-hybrid (Y2H) system exploits host cell genetics in order to display binary protein–protein interactions (PPIs) via defined and selectable phenotypes. Numerous improvements have been made to this method, adapting the screening principle for diverse applications, including drug discovery and the scale-up for proteome wide interaction screens in human and other organisms. Here we discuss a systematic workflow and analysis scheme for screening data generated by Y2H and related assays that includes high-throughput selection procedures, readout of comprehensive results via next-generation sequencing (NGS), and the interpretation of interaction data via quantitative statistics. The novel assays and tools will serve the broader scientific community to harness the power of NGS technology to address PPI networks in health and disease. We discuss examples of how this next-generation platform can be applied to address specific questions in diverse fields of biology and medicine. PMID:26734059

  3. A repetitive sequence assembler based on next-generation sequencing.

    PubMed

    Lian, S; Tu, Y; Wang, Y; Chen, X; Wang, L

    2016-01-01

    Repetitive sequences of variable length are common in almost all eukaryotic genomes, and most of them are presumed to have important biomedical functions and can cause genomic instability. Next-generation sequencing (NGS) technologies provide the possibility of identifying and capturing these repetitive sequences directly from the NGS data. In this study, we assessed the performance of leading assemblers, such as Velvet, SOAPdenovo, SGA, MSR-CA, Bambus2, ALLPATHS-LG, and ABySS, in identifying and capturing repeats using three real NGS datasets. Our results indicated that most of them performed poorly in capturing the repeats. Consequently, we proposed a repetitive sequence assembler, named NGSReper, for capturing repeats from NGS data. Simulated datasets were used to validate the feasibility of NGSReper. The results indicate that the completeness of capturing repeats is up to 99%. Cross validation was performed in three real NGS datasets, and extensive comparisons indicate that NGSReper performed best in terms of completeness and accuracy in capturing repeats. In conclusion, NGSReper is an appropriate and suitable tool for capturing repeats directly from NGS data. PMID:27525861

  4. Next-Generation Sequencing in Intellectual Disability.

    PubMed

    Carvill, Gemma L; Mefford, Heather C

    2015-09-01

    Next-generation sequencing technologies have revolutionized gene discovery in patients with intellectual disability (ID) and led to an unprecedented expansion in the number of genes implicated in this disorder. We discuss the strategies that have been used to identify these novel genes for both syndromic and nonsyndromic ID and highlight the phenotypic and genetic heterogeneity that underpin this condition. Finally, we discuss the future of defining the genetic etiology of ID, including the role of whole-genome sequencing, mosaicism, and the importance of diagnostic testing in ID. PMID:27617123

  5. Microfluidic Platform Generates Oxygen Landscapes for Localized Hypoxic Activation

    PubMed Central

    Rexius, Megan L.; Mauleon, Gerardo; Malik, Asrar B.; Rehman, Jalees; Eddington, David T.

    2014-01-01

    An open-well microfluidic platform generates an oxygen landscape using gas-perfused networks which diffuse across a membrane. The device enables real-time analysis of cellular and tissue responses to oxygen tension to define how cells adapt to heterogeneous oxygen conditions found in the physiological setting. We demonstrate that localized hypoxic activation of cells elicited specific metabolic and gene responses in human microvascular endothelial cells and bone marrow-derived mesenchymal stem cells. A robust demonstration of the compatibility of the device with standard laboratory techniques demonstrates the wide utility of the method. This platform is ideally suited to study real-time cell responses and cell-cell interactions within physiologically relevant oxygen landscapes. PMID:25315003

  6. deepTools: a flexible platform for exploring deep-sequencing data

    PubMed Central

    Ramírez, Fidel; Dündar, Friederike; Diehl, Sarah; Grüning, Björn A.; Manke, Thomas

    2014-01-01

    We present a Galaxy based web server for processing and visualizing deeply sequenced data. The web server's core functionality consists of a suite of newly developed tools, called deepTools, that enable users with little bioinformatic background to explore the results of their sequencing experiments in a standardized setting. Users can upload pre-processed files with continuous data in standard formats and generate heatmaps and summary plots in a straight-forward, yet highly customizable manner. In addition, we offer several tools for the analysis of files containing aligned reads and enable efficient and reproducible generation of normalized coverage files. As a modular and open-source platform, deepTools can easily be expanded and customized to future demands and developments. The deepTools webserver is freely available at http://deeptools.ie-freiburg.mpg.de and is accompanied by extensive documentation and tutorials aimed at conveying the principles of deep-sequencing data analysis. The web server can be used without registration. deepTools can be installed locally either stand-alone or as part of Galaxy. PMID:24799436

  7. deepTools: a flexible platform for exploring deep-sequencing data.

    PubMed

    Ramírez, Fidel; Dündar, Friederike; Diehl, Sarah; Grüning, Björn A; Manke, Thomas

    2014-07-01

    We present a Galaxy based web server for processing and visualizing deeply sequenced data. The web server's core functionality consists of a suite of newly developed tools, called deepTools, that enable users with little bioinformatic background to explore the results of their sequencing experiments in a standardized setting. Users can upload pre-processed files with continuous data in standard formats and generate heatmaps and summary plots in a straight-forward, yet highly customizable manner. In addition, we offer several tools for the analysis of files containing aligned reads and enable efficient and reproducible generation of normalized coverage files. As a modular and open-source platform, deepTools can easily be expanded and customized to future demands and developments. The deepTools webserver is freely available at http://deeptools.ie-freiburg.mpg.de and is accompanied by extensive documentation and tutorials aimed at conveying the principles of deep-sequencing data analysis. The web server can be used without registration. deepTools can be installed locally either stand-alone or as part of Galaxy. PMID:24799436

  8. Capturing genomic signatures of DNA sequence variation using a standard anonymous microarray platform

    PubMed Central

    Cannon, C. H.; Kua, C. S.; Lobenhofer, E. K.; Hurban, P.

    2006-01-01

    Comparative genomics, using the model organism approach, has provided powerful insights into the structure and evolution of whole genomes. Unfortunately, only a small fraction of Earth's biodiversity will have its genome sequenced in the foreseeable future. Most wild organisms have radically different life histories and evolutionary genomics than current model systems. A novel technique is needed to expand comparative genomics to a wider range of organisms. Here, we describe a novel approach using an anonymous DNA microarray platform that gathers genomic samples of sequence variation from any organism. Oligonucleotide probe sequences placed on a custom 44 K array were 25 bp long and designed using a simple set of criteria to maximize their complexity and dispersion in sequence probability space. Using whole genomic samples from three known genomes (mouse, rat and human) and one unknown (Gonystylus bancanus), we demonstrate and validate its power, reliability, transitivity and sensitivity. Using two separate statistical analyses, a large number of genomic ‘indicator’ probes were discovered. The construction of a genomic signature database based upon this technique would allow virtual comparisons and simple queries could generate optimal subsets of markers to be used in large-scale assays, using simple downstream techniques. Biologists from a wide range of fields, studying almost any organism, could efficiently perform genomic comparisons, at potentially any phylogenetic level after performing a small number of standardized DNA microarray hybridizations. Possibilities for refining and expanding the approach are discussed. PMID:17000641

  9. Initial steps towards a production platform for DNA sequence analysis on the grid

    PubMed Central

    2010-01-01

    Background Bioinformatics is confronted with a new data explosion due to the availability of high throughput DNA sequencers. Data storage and analysis become a problem on local servers, and it is therefore necessary to switch to other IT infrastructures. Grid and workflow technology can help to handle the data more efficiently, as well as facilitate collaborations. However, interfaces to grids are often unfriendly to novice users. Results In this study we reused a platform that was developed in the VL-e project for the analysis of medical images. Data transfer, workflow execution and job monitoring are operated from one graphical interface. We developed workflows for two sequence alignment tools (BLAST and BLAT) as a proof of concept. The analysis time was significantly reduced. All workflows and executables are available for the members of the Dutch Life Science Grid and the VL-e Medical virtual organizations. All components are open source and can be transported to other grid infrastructures. Conclusions The availability of in-house expertise and tools facilitates the usage of grid resources by new users. Our first results indicate that this is a practical, powerful and scalable solution to address the capacity and collaboration issues raised by the deployment of next generation sequencers. We currently adopt this methodology on a daily basis for DNA sequencing and other applications. More information and source code is available via http://www.bioinformaticslaboratory.nl/ PMID:21156038

  10. On the study of microbial transcriptomes using second- and third-generation sequencing technologies.

    PubMed

    Choi, Sang Chul

    2016-08-01

    Second-generation sequencing technologies transformed the study of microbial transcriptomes. They helped reveal the transcription start sites and antisense transcripts of microbial species, improving the microbial genome annotation. Quantification of genome-wide gene expression levels allowed for functional studies of microbial research. Ever-evolving sequencing technologies are reshaping approaches to studying microbial transcriptomes. Recently, Oxford Nanopore Technologies delivered a sequencing platform called MinION, a third-generation sequencing technology, to the research community. We expect it to be the next sequencing technology that enables breakthroughs in life science fields. The studies of microbial transcriptomes will be no exception. In this paper, we review microbial transcriptomics studies using second-generation sequencing technology. We also discuss the prospect of microbial transcriptomics studies with third-generation sequencing. PMID:27480632

  11. Clinical Integration of Next Generation Sequencing Technology

    PubMed Central

    Gullapalli, R.R.; Lyons-Weiler, M.; Petrosko, P.; Dhir, R.; Becich, M.J.; LaFramboise, W.A.

    2012-01-01

    Abstract/Synopsis Recent technological advances in Next Generation Sequencing (NGS) methods have substantially reduced cost and operational complexity leading to the production of bench top sequencers and commercial software solutions for implementation in small research and clinical laboratories. This chapter summarizes requirements and hurdles to the successful implementation of these systems including 1) calibration, validation and optimization of the instrumentation, experimental paradigm and primary readout, 2) secure transfer, storage and secondary processing of the data, 3) implementation of software tools for targeted analysis, and 4) training of research and clinical personnel to evaluate data fidelity and interpret the molecular significance of the genomic output. In light of the commercial and technological impetus to bring NGS technology into the clinical domain, it is critical that novel tests incorporate rigid protocols with built-in calibration standards and that data transfer and processing occur under exacting security measures for interpretation by clinicians with specialized training in molecular diagnostics. PMID:23078661

  12. Next Generation Sequencing in Endocrine Practice

    PubMed Central

    Forlenza, Gregory P.; Calhoun, Amy; Beckman, Kenneth B.; Halvorsen, Tanya; Hamdoun, Elwaseila; Zierhut, Heather; Sarafoglou, Kyriakie; Polgreen, Lynda E.; Miller, Bradley S.; Nathan, Brandon; Petryk, Anna

    2016-01-01

    With the completion of the Human Genome Project and advances in genomic sequencing technologies, the use of clinical molecular diagnostics has grown tremendously over the last decade. Next-generation sequencing (NGS) has overcome many of the practical roadblocks that had slowed the adoption of molecular testing for routine clinical diagnosis. In endocrinology, targeted NGS now complements biochemical testing and imaging studies. The goal of this review is to provide clinicians with a guide to the application of NGS to genetic testing for endocrine conditions, by compiling a list of established gene mutations detectable by NGS, and highlighting key phenotypic features of these disorders. As we outline in this review, the clinical utility of NGS-based molecular testing for endocrine disorders is very high. Identifying an exact genetic etiology improves understanding of the disease, provides clear explanation to families about the cause, and guides decisions about screening, prevention and/or treatment. PMID:25958132

  13. Next generation sequencing and its applications in forensic genetics.

    PubMed

    Børsting, Claus; Morling, Niels

    2015-09-01

    It has been almost a decade since the first next generation sequencing (NGS) technologies emerged and quickly changed the way genetic research is conducted. Today, full genomes are mapped and published almost weekly and with ever increasing speed and decreasing costs. NGS methods and platforms have matured during the last 10 years, and the quality of the sequences has reached a level where NGS is used in clinical diagnostics of humans. Forensic genetic laboratories have also explored NGS technologies and especially in the last year, there has been a small explosion in the number of scientific articles and presentations at conferences with forensic aspects of NGS. These contributions have demonstrated that NGS offers new possibilities for forensic genetic case work. More information may be obtained from unique samples in a single experiment by analyzing combinations of markers (STRs, SNPs, insertion/deletions, mRNA) that cannot be analyzed simultaneously with the standard PCR-CE methods used today. The true variation in core forensic STR loci has been uncovered, and previously unknown STR alleles have been discovered. The detailed sequence information may aid mixture interpretation and will increase the statistical weight of the evidence. In this review, we will give an introduction to NGS and single-molecule sequencing, and we will discuss the possible applications of NGS in forensic genetics. PMID:25704953

  14. Revealing the Complexity of Breast Cancer by Next Generation Sequencing

    PubMed Central

    Verigos, John; Magklara, Angeliki

    2015-01-01

    Over the last few years the increasing usage of “-omic” platforms, supported by next-generation sequencing, in the analysis of breast cancer samples has tremendously advanced our understanding of the disease. New driver and passenger mutations, rare chromosomal rearrangements and other genomic aberrations identified by whole genome and exome sequencing are providing missing pieces of the genomic architecture of breast cancer. High resolution maps of breast cancer methylomes and sequencing of the miRNA microworld are beginning to paint the epigenomic landscape of the disease. Transcriptomic profiling is giving us a glimpse into the gene regulatory networks that govern the fate of the breast cancer cell. At the same time, integrative analysis of sequencing data confirms an extensive intertumor and intratumor heterogeneity and plasticity in breast cancer arguing for a new approach to the problem. In this review, we report on the latest findings on the molecular characterization of breast cancer using NGS technologies, and we discuss their potential implications for the improvement of existing therapies. PMID:26561834

  15. Whole-transcriptome sequencing of Pinellia ternata using the Illumina platform.

    PubMed

    Huang, X; Jing, Y; Liu, D J; Yang, B Y; Chen, H; Li, M

    2016-01-01

    Pinelliae rhizoma is the dried tuber of Pinellia ternata (Thunb.) Breit., and has been used for thousands of years as a traditional Chinese medicine. However, little is known about its genomic background. With the development of high-throughput genomic sequencing, it is now easy and cheap to obtain genomic information. In this study, 193,032,910 high-quality clean reads were generated using the Illumina HiSeq 2000 platform. A total of 53,544 unigenes were identified from the contigs assembled. Functional annotation analysis annotated 37,318, 27,697, 23,043, 22,869, 23,328, and 27,415 unigenes. KEGG analysis revealed that five pathways (169 genes) were associated with alkaloid synthesis, 201 unigenes were related to fatty acid biosynthesis (ko00061), and 133 unigenes were involved in the biosynthesis of unsaturated fatty acids (ko01040). In addition, 6703 simple sequence repeats were designed based on the unigene sequences for screening germplasm resources in the future. These data are a valuable resource for genomic studies on Pinellia plants. PMID:27420994

  16. Next-generation sequencing for mitochondrial disorders

    PubMed Central

    Carroll, C J; Brilhante, V; Suomalainen, A

    2014-01-01

    A great deal of our understanding of mitochondrial function has come from studies of inherited mitochondrial diseases, but the majority of patients still lack a molecular diagnosis. Furthermore, effective treatments for mitochondrial disorders do not exist. Development of therapies has been complicated by the fact that the diseases are extremely heterogeneous, and collecting large enough cohorts of similarly affected individuals to assess new therapies properly has been difficult. Next-generation sequencing technologies have in the last few years been shown to be an effective method for the genetic diagnosis of inherited mitochondrial diseases. Here we review the strategies and findings from studies applying next-generation sequencing methods for the genetic diagnosis of mitochondrial disorders. Detailed knowledge of molecular causes also enables collection of homogeneous cohorts of patients for therapy trials, and therefore boosts development of interventions. Linked Articles This article is part of a themed issue on Mitochondrial Pharmacology: Energy, Injury & Beyond. To view the other articles in this issue visit http://dx.doi.org/10.1111/bph.2014.171.issue-8 PMID:24138576

  17. Strategies for complete mitochondrial genome sequencing on Ion Torrent PGM™ platform in forensic sciences.

    PubMed

    Zhou, Yishu; Guo, Fei; Yu, Jiao; Liu, Feng; Zhao, Jinling; Shen, Hongying; Zhao, Bin; Jia, Fei; Sun, Zhu; Song, He; Jiang, Xianhua

    2016-05-01

    Next generation sequencing (NGS) is a time-saving and cost-efficient method to detect the complete mitochondrial genome (mtGenome) compared to Sanger sequencing. In this study we focused on developing strategies for mtGenome sequencing on the Ion Torrent PGM™ platform and NGS data analysis. In our experience, 4, 15 and 30 samples could be loaded onto Ion 314™, Ion 316™ and Ion 318™ chips respectively at a pooling concentration of 26 pM, achieving sufficient average coverage of ≥1500× and a good strand balance of 1.05. Data processing software is essential to NGS mega data analysis. In-house Perl scripts were developed for primary data analysis to screen out uncertain positions and samples from variant call format (VCF) reports and for pedigree study to perform pairwise comparisons. The Integrative Genomics Viewer (IGV) and the NextGENe software were introduced for secondary data analysis. The mthap and EMMA tools were employed for haplogroup assignment. The dataset was reviewed and approved by the EMPOP as the final version, which showed a 2.66% error rate generated from the Torrent Variant Caller (TVC). Across the mtGenome, 4022 variants were found at 725 nucleotide positions, where the ratio of transitions to transversions was estimated at 20.89:1 and 22.18% of variants were concentrated at hypervariable segments I and II (HVS-I and HVS-II). In total, 107 complete mtGenome haplotypes were observed from 107 Northern Chinese Han and assigned to 88 haplogroups. The random match probability (RMP) of complete mtGenome was calculated as 0.009345794, decreasing 26.19% by comparison to that of HVS-I only, and the haplotype diversity (HD) was evaluated as 1, increasing 0.33% by comparison to that of HVS-I only. Principal component analysis (PCA) showed that our population clustered with East and Southeast Asians. The strategies in this study are suitable for complete mtGenome sequencing on Ion Torrent PGM™ platform and Northern Chinese Han (EMP00670) is the first
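
    The RMP and HD figures quoted in this record can be reproduced with a short calculation. The sketch below assumes the conventional definitions, which the abstract does not state: RMP as the sum of squared haplotype frequencies, and HD as n/(n-1) times (1 - RMP). The function names and the placeholder haplotype labels are illustrative only.

```python
from collections import Counter

def random_match_probability(haplotypes):
    """Sum of squared haplotype frequencies (assumed definition of RMP)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return sum((c / n) ** 2 for c in counts.values())

def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    return n / (n - 1) * (1 - random_match_probability(haplotypes))

# 107 individuals, each carrying a distinct haplotype (placeholder labels).
sample = [f"hap{i}" for i in range(107)]
rmp = random_match_probability(sample)
hd = haplotype_diversity(sample)
print(round(rmp, 9), round(hd, 6))
```

    With 107 distinct haplotypes in 107 individuals, every frequency is 1/107, so the RMP reduces to 1/107 and the diversity to 1, which agrees with the values reported above.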

  18. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens

    PubMed Central

    Shokralla, Shadi; Gibson, Joel F; Nikbakht, Hamid; Janzen, Daniel H; Hallwachs, Winnie; Hajibabaei, Mehrdad

    2014-01-01

    DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large-scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next-generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high-target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next-generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10-mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full-length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16 lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full-length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next-generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content. PMID:24641208
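
    The tagging step described in this record, in which each specimen receives a unique 10-mer during PCR, implies a demultiplexing pass that sorts pooled reads back to their specimens. The following is a minimal sketch of that idea, not the authors' pipeline: it assumes the tag sits at the 5' end of each read and must match exactly, and the tag sequences, specimen names and reads are all hypothetical.

```python
from collections import defaultdict

def demultiplex(reads, tag_to_specimen, tag_len=10):
    """Assign each read to a specimen by its leading tag; strip the tag on success."""
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        specimen = tag_to_specimen.get(tag)
        if specimen is None:
            bins["unassigned"].append(read)   # keep the full read for inspection
        else:
            bins[specimen].append(insert)
    return dict(bins)

# Hypothetical 10-mer tags and toy reads.
tags = {"ACGTACGTAC": "specimen_A", "TTGCATTGCA": "specimen_B"}
pooled = [
    "ACGTACGTAC" + "GGGAAA",
    "TTGCATTGCA" + "CCCTTT",
    "ACGTACGTAC" + "GGGAAT",
    "NNNNNNNNNN" + "AAAA",
]
by_specimen = demultiplex(pooled, tags)
reads_per_specimen = {name: len(r) for name, r in by_specimen.items()}
print(reads_per_specimen)
```

    A production workflow would usually tolerate sequencing errors in the tag and check both read orientations; exact matching is used here only to keep the sketch short.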

  19. A Modular Assembly Platform for Rapid Generation of DNA Constructs

    PubMed Central

    Akama-Garren, Elliot H.; Joshi, Nikhil S.; Tammela, Tuomas; Chang, Gregory P.; Wagner, Bethany L.; Lee, Da-Yae; Rideout III, William M.; Papagiannakopoulos, Thales; Xue, Wen; Jacks, Tyler

    2016-01-01

    Traditional cloning methods have limitations on the number of DNA fragments that can be simultaneously manipulated, which dramatically slows the pace of molecular assembly. Here we describe GMAP, a Gibson assembly-based modular assembly platform consisting of a collection of promoters and genes, which allows for one-step production of DNA constructs. GMAP facilitates rapid assembly of expression and viral constructs using modular genetic components, as well as increasingly complicated genetic tools using contextually relevant genomic elements. Our data demonstrate the applicability of GMAP toward the validation of synthetic promoters, identification of potent RNAi constructs, establishment of inducible lentiviral systems, tumor initiation in genetically engineered mouse models, and gene-targeting for the generation of knock-in mice. GMAP represents a recombinant DNA technology designed for widespread circulation and easy adaptation for other uses, such as synthetic biology, genetic screens, and CRISPR-Cas9. PMID:26887506

  20. A Modular Assembly Platform for Rapid Generation of DNA Constructs.

    PubMed

    Akama-Garren, Elliot H; Joshi, Nikhil S; Tammela, Tuomas; Chang, Gregory P; Wagner, Bethany L; Lee, Da-Yae; Rideout, William M; Papagiannakopoulos, Thales; Xue, Wen; Jacks, Tyler

    2016-01-01

    Traditional cloning methods have limitations on the number of DNA fragments that can be simultaneously manipulated, which dramatically slows the pace of molecular assembly. Here we describe GMAP, a Gibson assembly-based modular assembly platform consisting of a collection of promoters and genes, which allows for one-step production of DNA constructs. GMAP facilitates rapid assembly of expression and viral constructs using modular genetic components, as well as increasingly complicated genetic tools using contextually relevant genomic elements. Our data demonstrate the applicability of GMAP toward the validation of synthetic promoters, identification of potent RNAi constructs, establishment of inducible lentiviral systems, tumor initiation in genetically engineered mouse models, and gene-targeting for the generation of knock-in mice. GMAP represents a recombinant DNA technology designed for widespread circulation and easy adaptation for other uses, such as synthetic biology, genetic screens, and CRISPR-Cas9. PMID:26887506

  1. Second-generation Sequencing for Marker Development in Sugarcane

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Second generation sequencing (also known as next-generation or massively parallel sequencing) involves the simultaneous generation of millions of short DNA sequences. The impact and applications of this technology are still emerging; however, strategies that reduce the complexity of the DNA sample p...

  2. Periodic binary sequence generators: VLSI circuits considerations

    NASA Technical Reports Server (NTRS)

    Perlman, M.

    1984-01-01

    Feedback shift registers are efficient periodic binary sequence generators. Polynomials of degree r over a Galois field of characteristic 2 (GF(2)) characterize the behavior of shift registers with linear logic feedback. The algorithmic determination of the trinomial of lowest degree, when it exists, that contains a given irreducible polynomial over GF(2) as a factor is presented. This corresponds to embedding the behavior of an r-stage shift register with linear logic feedback into that of an n-stage shift register with a single two-input modulo 2 summer (i.e., Exclusive-OR gate) in its feedback. This leads to Very Large Scale Integrated (VLSI) circuit architecture of maximal regularity (i.e., identical cells) with intercell communications serialized to a maximal degree.
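
    The search for a lowest-degree trinomial multiple can be illustrated by brute force. The sketch below is not the algorithm of the report, which the abstract does not describe. It represents a polynomial over GF(2) as an integer bitmask and uses the fact that x^n + x^k + 1 is divisible by p(x) exactly when x^n mod p equals (x^k mod p) XOR 1. The function name and the `max_degree` guard are assumptions made for illustration.

```python
def lowest_trinomial_multiple(poly, max_degree=1 << 16):
    """Return (n, k) for the lowest-degree x^n + x^k + 1 divisible by poly, else None.

    poly is a bitmask over GF(2) (bit i is the coefficient of x^i) and is
    assumed to have a nonzero constant term, as any irreducible of degree >= 2 does.
    """
    degree = poly.bit_length() - 1
    seen = {}        # residue of x^k mod poly -> smallest k >= 1
    residue = 1      # x^0 mod poly
    for n in range(1, max_degree + 1):
        residue <<= 1                    # multiply by x
        if (residue >> degree) & 1:
            residue ^= poly              # reduce modulo poly
        if residue == 1:
            return None                  # residues now repeat; no trinomial exists
        k = seen.get(residue ^ 1)        # need x^n + 1 == x^k (mod poly)
        if k is not None:
            return n, k
        seen.setdefault(residue, n)
    return None

print(lowest_trinomial_multiple(0b111101))  # x^5 + x^4 + x^3 + x^2 + 1
print(lowest_trinomial_multiple(0b11111))   # x^4 + x^3 + x^2 + x + 1
```

    For x^5 + x^4 + x^3 + x^2 + 1 the search returns (8, 5), since (x^5 + x^4 + x^3 + x^2 + 1)(x^3 + x^2 + 1) = x^8 + x^5 + 1 over GF(2). For x^4 + x^3 + x^2 + x + 1 it returns None, an instance of the "when it exists" caveat in the abstract.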

  3. Long period pseudo random number sequence generator

    NASA Technical Reports Server (NTRS)

    Wang, Charles C. (Inventor)

    1989-01-01

    A circuit for generating a sequence of pseudo random numbers, (A sub K). There is an exponentiator in GF(2 sup m) for the normal basis representation of elements in a finite field GF(2 sup m) each represented by m binary digits and having two inputs and an output from which the sequence (A sub K) of pseudo random numbers is taken. One of the two inputs is connected to receive the outputs (E sub K) of a maximal length shift register of n stages. There is a switch having a pair of inputs and an output. The switch output is connected to the other of the two inputs of the exponentiator. One of the switch inputs is connected for initially receiving a primitive element (A sub O) in GF(2 sup m). Finally, there is a delay circuit having an input and an output. The delay circuit output is connected to the other of the switch inputs and the delay circuit input is connected to the output of the exponentiator. Thus, after the exponentiator initially receives the primitive element (A sub O) in GF(2 sup m) through the switch, the switch can be switched to cause the exponentiator to receive as its input a delayed output A(K-1) from the exponentiator, thereby generating (A sub K) continuously at the output of the exponentiator. The exponentiator in GF(2 sup m) is novel and comprises a cyclic-shift circuit; a Massey-Omura multiplier; and a control logic circuit, all operably connected together to perform the function U(sub i) = 92(sup i) (for n(sub i) = 1) or 1 (for n(sub i) = 0).
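
    The maximal length shift register that feeds the exponentiator can be illustrated in software. The sketch below models only that component, not the exponentiator or the Massey-Omura multiplier, and it is not taken from the patent. It assumes a 4-stage register with the recurrence s[t+4] = s[t] XOR s[t+1], which corresponds to the primitive polynomial x^4 + x + 1; the function names and the starting state are illustrative.

```python
def step(state, taps):
    """Advance a Fibonacci LFSR by one clock: shift left, append XOR of tapped bits."""
    new_bit = 0
    for i in taps:
        new_bit ^= state[i]
    return state[1:] + [new_bit]

def lfsr_bits(taps, state, count):
    """Return the first `count` output bits, reading the oldest stage each clock."""
    state = list(state)
    out = []
    for _ in range(count):
        out.append(state[0])
        state = step(state, taps)
    return out

def period(taps, state):
    """Number of clocks until the register returns to its starting state."""
    start = list(state)
    current = step(start, taps)
    clocks = 1
    while current != start:
        current = step(current, taps)
        clocks += 1
    return clocks

TAPS = (0, 1)              # s[t+4] = s[t] ^ s[t+1]
SEED = [1, 0, 0, 0]        # any nonzero state works
print(period(TAPS, SEED))
print(lfsr_bits(TAPS, SEED, 15))
```

    An n-stage register is maximal length when its period is 2^n - 1, which here is 15: every nonzero state is visited once before the sequence repeats.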

  4. Global Transcriptome Sequencing Using the Illumina Platform and the Development of EST-SSR Markers in Autotetraploid Alfalfa

    PubMed Central

    Liu, Zhipeng; Chen, Tianlong; Ma, Lichao; Zhao, Zhiguang; Zhao, Patrick X.; Nan, Zhibiao; Wang, Yanrong

    2013-01-01

    Background Alfalfa is the most widely cultivated forage legume and one of the most economically valuable crops in the world. The large size and complexity of the alfalfa genome have delayed the development of genomic resources for alfalfa research. Second-generation Illumina transcriptome sequencing is an efficient method for generating a global transcriptome sequence dataset for gene discovery and molecular marker development in alfalfa. Methodology/Principal Findings More than 28 million sequencing reads (5.64 Gb of clean nucleotides) were generated by Illumina paired-end sequencing from 15 different alfalfa tissue samples. In total, 40,433 unigenes with an average length of 803 bp were obtained by de novo assembly. Based on a sequence similarity search of known proteins, a total of 36,684 (90.73%) unigenes were annotated. In addition, 1,649 potential EST-SSRs were identified as potential molecular markers from unigenes with lengths exceeding 1 kb. A total of 100 pairs of PCR primers were randomly selected to validate the assembly quality and develop EST-SSR markers from genomic DNA. Of these primer pairs, 82 were able to amplify sequences in initial screening tests, and 27 primer pairs successfully amplified DNA fragments and detected significant amounts of polymorphism among 10 alfalfa accessions. Conclusions/Significance The present study provides global sequence data for autotetraploid alfalfa and demonstrates that the Illumina platform is a fast and effective approach to EST-SSR marker development in alfalfa. The use of these transcriptome datasets will serve as a valuable public information platform to accelerate studies of the alfalfa genome. PMID:24349529
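
    Identifying candidate EST-SSRs in assembled unigenes amounts to scanning each sequence for short motifs repeated in tandem. The abstract does not name the tool or thresholds used, so the sketch below is only an illustration under assumed settings: motifs of 2 to 6 bp, at least five tandem copies, and homopolymer runs ignored. The function name and the toy sequence are hypothetical.

```python
import re

def find_ssrs(sequence, min_repeats=5):
    """Return (start, motif, repeat_count) for tandem repeats of 2-6 bp motifs."""
    # Lazy motif match so the shortest repeating unit is reported first.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits

toy_unigene = "GGCATATATATATCCG"
print(find_ssrs(toy_unigene))
```

    On the toy sequence the scan reports one dinucleotide repeat, (AT) repeated five times starting at offset 3. Real marker development would add flanking-sequence checks and primer design, which are outside the scope of this sketch.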

  5. Preparation of SELEX Samples for Next-Generation Sequencing.

    PubMed

    Tolle, Fabian; Mayer, Günter

    2016-01-01

    Fuelled by massive whole genome sequencing projects such as the human genome project, enormous technological advancements and therefore tremendous price drops could be achieved, rendering next-generation sequencing very attractive for deep sequencing of SELEX libraries. Herein we describe the preparation of SELEX samples for Illumina sequencing, based on the already established whole genome sequencing workflow. We describe the addition of barcode sequences for multiplexing and the adapter ligation, avoiding associated pitfalls. PMID:26552817

  6. A novel three-round multiplex PCR for SNP genotyping with next generation sequencing.

    PubMed

    Chen, Ke; Zhou, Yu-Xun; Li, Kai; Qi, Li-Xin; Zhang, Qi-Fei; Wang, Mao-Chun; Xiao, Jun-Hua

    2016-06-01

    Owing to its high throughput and low cost, next generation sequencing has attracted much attention from researchers for SNP genotyping. Here, we introduce a new method based on three-round multiplex PCR to precisely genotype SNPs with next generation sequencing. This method consumes, as far as possible, equivalent amounts of each pair of specific primers, largely eliminating the amplification discrepancy between different loci. After the PCR amplification, the products can be directly subjected to a next generation sequencing platform. We simultaneously amplified 37 SNP loci of 757 samples and sequenced all amplicons on the Ion Torrent PGM platform; 90.5 % of the target SNP loci were accurately genotyped (at least 15×) and 90.4 % of amplicons had uniform coverage with a variation less than 50-fold. Ligase detection reaction (LDR) was performed to genotype 19 SNP loci (as part of the 37 SNP loci) with 91 samples randomly selected from the 757 samples, and 99.5 % of the genotyping data were consistent with the next generation sequencing results. Our results demonstrate that three-round PCR coupled with next generation sequencing is an efficient and economical genotyping approach. Graphical Abstract The schematic diagram of three-round PCR. PMID:27113460

  7. The impact of next-generation sequencing on genomics

    PubMed Central

    Zhang, Jun; Chiodini, Rod; Badr, Ahmed; Zhang, Genfa

    2011-01-01

    This article reviews basic concepts, general applications, and the potential impact of next-generation sequencing (NGS) technologies on genomics, with particular reference to currently available and possible future platforms and bioinformatics. NGS technologies have demonstrated the capacity to sequence DNA at unprecedented speed, thereby enabling previously unimaginable scientific achievements and novel biological applications. However, the massive data produced by NGS also present a significant challenge for data storage, analyses, and management solutions. Advanced bioinformatic tools are essential for the successful application of NGS technology. As evidenced throughout this review, NGS technologies will have a striking impact on genomic research and the entire biological field. With its ability to tackle the unsolved challenges unconquered by previous genomic technologies, NGS is likely to unravel the complexity of the human genome in terms of genetic variations, some of which may be confined to susceptible loci for some common human conditions. The impact of NGS technologies on genomics will be far reaching and likely change the field for years to come. PMID:21477781

  8. Applications of next-generation sequencing techniques in plant biology

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The last several years have seen revolutionary advances in DNA sequencing technologies with the advent of next generation sequencing (NGS) techniques. NGS methods now allow millions of bases to be sequenced in one round, at a fraction of the cost relative to traditional Sanger sequencing, allowing u...

  9. Guidelines for diagnostic next-generation sequencing

    PubMed Central

    Matthijs, Gert; Souche, Erika; Alders, Mariëlle; Corveleyn, Anniek; Eck, Sebastian; Feenstra, Ilse; Race, Valérie; Sistermans, Erik; Sturm, Marc; Weiss, Marjan; Yntema, Helger; Bakker, Egbert; Scheffer, Hans; Bauer, Peter

    2016-01-01

    We present, on behalf of EuroGentest and the European Society of Human Genetics, guidelines for the evaluation and validation of next-generation sequencing (NGS) applications for the diagnosis of genetic disorders. The work was performed by a group of laboratory geneticists and bioinformaticians, and discussed with clinical geneticists, industry and patients' representatives, and other stakeholders in the field of human genetics. The statements that were written during the elaboration of the guidelines are presented here. The background document and full guidelines are available as supplementary material. They include many examples to assist the laboratories in the implementation of NGS and accreditation of this service. The work and ideas presented by others in guidelines that have emerged elsewhere in the course of the past few years were also considered and are acknowledged in the full text. Interestingly, a few new insights that have not been cited before have emerged during the preparation of the guidelines. The most important new feature is the presentation of a ‘rating system' for NGS-based diagnostic tests. The guidelines and statements have been applauded by the genetic diagnostic community, and thus seem to be valuable for the harmonization and quality assurance of NGS diagnostics in Europe. PMID:26508566

  10. Next-Generation Sequencing: A Review of Technologies and Tools for Wound Microbiome Research

    PubMed Central

    Hodkinson, Brendan P.; Grice, Elizabeth A.

    2015-01-01

    Significance: The colonization of wounds by specific microbes or communities of microbes may delay healing and/or lead to infection-related complication. Studies of wound-associated microbial communities (microbiomes) to date have primarily relied upon culture-based methods, which are known to have extreme biases and are not reliable for the characterization of microbiomes. Biofilms are very resistant to culture and are therefore especially difficult to study with techniques that remain standard in clinical settings. Recent Advances: Culture-independent approaches employing next-generation DNA sequencing have provided researchers and clinicians a window into wound-associated microbiomes that could not be achieved before and has begun to transform our view of wound-associated biodiversity. Within the past decade, many platforms have arisen for performing this type of sequencing, with various types of applications for microbiome research being possible on each. Critical Issues: Wound care incorporating knowledge of microbiomes gained from next-generation sequencing could guide clinical management and treatments. The purpose of this review is to outline the current platforms, their applications, and the steps necessary to undertake microbiome studies using next-generation sequencing. Future Directions: As DNA sequencing technology progresses, platforms will continue to produce longer reads and more reads per run at lower costs. A major future challenge is to implement these technologies in clinical settings for more precise and rapid identification of wound bioburden. PMID:25566414

  11. Generating Functions for the Powers of Fibonacci Sequences

    ERIC Educational Resources Information Center

    Terrana, D.; Chen, H.

    2007-01-01

    In this note, based on the Binet formulas and the power-reducing techniques, closed forms of generating functions for the powers of Fibonacci sequences are presented. The corresponding results are extended to some other famous sequences as well.

  12. An evaluation of the PacBio RS platform for sequencing and de novo assembly of a chloroplast genome

    PubMed Central

    2013-01-01

    Background Second generation sequencing has permitted detailed sequence characterisation at the whole genome level of a growing number of non-model organisms, but the data produced have short read-lengths and biased genome coverage leading to fragmented genome assemblies. The PacBio RS long-read sequencing platform offers the promise of increased read length and unbiased genome coverage and thus the potential to produce genome sequence data of a finished quality containing fewer gaps and longer contigs. However, these advantages come at a much greater cost per nucleotide and with a perceived increase in error-rate. In this investigation, we evaluated the performance of the PacBio RS sequencing platform through the sequencing and de novo assembly of the Potentilla micrantha chloroplast genome. Results Following error-correction, a total of 28,638 PacBio RS reads were recovered with a mean read length of 1,902 bp totalling 54,492,250 nucleotides and representing an average depth of coverage of 320× the chloroplast genome. The dataset covered the entire 154,959 bp of the chloroplast genome in a single contig (100% coverage) compared to seven contigs (90.59% coverage) recovered from the Illumina data, and revealed no bias in coverage of GC rich regions. Post-assembly the data were largely concordant with the Illumina data generated and allowed 187 ambiguities in the Illumina data to be resolved. The additional read length also permitted small differences in the two inverted repeat regions to be assigned unambiguously. Conclusions This is the first report to our knowledge of a chloroplast genome assembled de novo using PacBio sequence data. The PacBio RS data generated here were assembled into a single large contig spanning the P. micrantha chloroplast genome, with a higher degree of accuracy than an Illumina dataset generated at a much greater depth of coverage, due to longer read lengths and lower GC bias in the data. The results we present suggest PacBio data will be

  13. Depositional sequence evolution, Paleozoic and early Mesozoic of the central Saharan platform, North Africa

    SciTech Connect

    Sprague, A.R.G.

    1991-08-01

    Over 30 depositional sequences have been identified in the Paleozoic and lower Mesozoic of the Ghadames basin of eastern Algeria, southern Tunisia, and western Libya. Well logs and lithologic information from more than 500 wells were used to correlate the 30 sequences throughout the basin (total area more than 1 million km²). Based on systematic change in the log response of strata in successively younger sequences, five groups of sequences with distinctive characteristics have been identified: Cambro-Ordovician, Upper Silurian-Middle Devonian, Upper Devonian, Carboniferous, and Middle Triassic-Middle Jurassic. Each sequence group is terminated by a major, tectonically enhanced sequence boundary that is immediately overlain (except for the Carboniferous) by a shale-prone interval deposited in response to basin-wide flooding. The four Paleozoic sequence groups were deposited on the Saharan platform, a north-facing, clastic-dominated shelf that covered most of North Africa during the Paleozoic. The sequence boundary at the top of the Carboniferous sequence group is one of several Permian-Carboniferous angular unconformities in North Africa related to the Hercynian orogeny. The youngest sequence group (Middle Triassic to Middle Jurassic) is a clastic-evaporite package that onlaps southward onto the top-of-Paleozoic sequence boundary. The progressive change, from the Cambrian to the Jurassic, in the nature of the Ghadames basin sequences is a reflection of the interplay between basin morphology and tectonics, vegetation, eustasy, climate, and sediment supply.

  14. Economic regulation of next-generation sequencing.

    PubMed

    Evans, Barbara J

    2014-01-01

    Next-generation sequencing broadens the debate about appropriate regulatory oversight of genetic testing and may force scholars to move beyond familiar privacy and health and safety regulatory issues to address new problems with industry structure and economic regulation. The genetic testing industry is passing through a period of profound structural change in response to shifts in technology and in the legal environment. Making genetic testing safe and effective for consumers increasingly requires access to comprehensive genomic data infrastructures that can support accurate, state-of-the-art interpretation of genetic test results. At present, there are significant barriers to access and there is no sector-specific regulator with power to ensure appropriate data access. Without it, genetic testing will not be safe for consumers even when it is performed at CLIA-certified laboratories using tests that have been FDA-cleared or approved. This article explores the emerging structure of the genetic testing industry and describes its present economic regulatory vacuum. In view of this gap in regulation, the article explores whether generally applicable law, particularly antitrust law, may offer solutions to the industry's data access problems. It concludes that courts may have a useful role to play, particularly in Europe and other jurisdictions where the essential facilities doctrine enjoys continued vitality. After Verizon Communications v. Law Offices of Curtis V. Trinko, the role of U.S. federal courts is less certain. Congress has demonstrated willingness to address access issues as they emerged in other infrastructure industries in recent decades. This article expresses no preference between legislative and judicial solutions. Its aim is simply to highlight an emerging economic regulatory issue which, if left unresolved, presents real health and safety concerns for consumers who receive genetic tests. PMID:25298291

  15. Historical Perspective, Development and Applications of Next-Generation Sequencing in Plant Virology

    PubMed Central

    Barba, Marina; Czosnek, Henryk; Hadidi, Ahmed

    2014-01-01

    Next-generation high throughput sequencing technologies became available at the onset of the 21st century. They provide a highly efficient, rapid, and low cost DNA sequencing platform beyond the reach of the standard and traditional DNA sequencing technologies developed in the late 1970s. They are continually improved to become faster, more efficient and cheaper. They have been used in many fields of biology since 2004. In 2009, next-generation sequencing (NGS) technologies began to be applied to several areas of plant virology including virus/viroid genome sequencing, discovery and detection, ecology and epidemiology, replication and transcription. Identification and characterization of known and unknown viruses and/or viroids in infected plants are currently among the most successful applications of these technologies. It is expected that NGS will play very significant roles in many research and non-research areas of plant virology. PMID:24399207

  16. Historical perspective, development and applications of next-generation sequencing in plant virology.

    PubMed

    Barba, Marina; Czosnek, Henryk; Hadidi, Ahmed

    2014-01-01

    Next-generation high throughput sequencing technologies became available at the onset of the 21st century. They provide a highly efficient, rapid, and low cost DNA sequencing platform beyond the reach of the standard and traditional DNA sequencing technologies developed in the late 1970s. They are continually improved to become faster, more efficient and cheaper. They have been used in many fields of biology since 2004. In 2009, next-generation sequencing (NGS) technologies began to be applied to several areas of plant virology including virus/viroid genome sequencing, discovery and detection, ecology and epidemiology, replication and transcription. Identification and characterization of known and unknown viruses and/or viroids in infected plants are currently among the most successful applications of these technologies. It is expected that NGS will play very significant roles in many research and non-research areas of plant virology. PMID:24399207

  17. Detection of Genomic Structural Variants from Next-Generation Sequencing Data

    PubMed Central

    Tattini, Lorenzo; D’Aurizio, Romina; Magi, Alberto

    2015-01-01

    Structural variants are genomic rearrangements larger than 50 bp accounting for around 1% of the variation among human genomes. They impact on phenotypic diversity and play a role in various diseases including neurological/neurocognitive disorders and cancer development and progression. Dissecting structural variants from next-generation sequencing data presents several challenges and a number of approaches have been proposed in the literature. In this mini review, we describe and summarize the latest tools – and their underlying algorithms – designed for the analysis of whole-genome sequencing, whole-exome sequencing, custom captures, and amplicon sequencing data, pointing out the major advantages/drawbacks. We also report a summary of the most recent applications of third-generation sequencing platforms. This assessment provides a guided indication – with particular emphasis on human genetics and copy number variants – for researchers involved in the investigation of these genomic events. PMID:26161383

  18. Primer and platform effects on 16S rRNA tag sequencing

    DOE PAGESBeta

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  19. Primer and platform effects on 16S rRNA tag sequencing

    SciTech Connect

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  20. Primer and platform effects on 16S rRNA tag sequencing

    PubMed Central

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-01-01

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. Beta diversity metrics are surprisingly robust to both primer and sequencing platform biases. PMID:26300854

  1. Ad 2.0: a novel recombineering platform for high-throughput generation of tailored adenoviruses.

    PubMed

    Mück-Häusl, Martin; Solanki, Manish; Zhang, Wenli; Ruzsics, Zsolt; Ehrhardt, Anja

    2015-04-30

    Recombinant adenoviruses containing a double-stranded DNA genome of 26-45 kb were broadly explored in basic virology, for vaccination purposes, for treatment of tumors based on oncolytic virotherapy, or simply as a tool for efficient gene transfer. However, the majority of recombinant adenoviral vectors (AdVs) is based on a small fraction of adenovirus types and their genetic modification. Recombineering techniques provide powerful tools for arbitrary engineering of recombinant DNA. Here, we adopted a seamless recombineering technology for high-throughput and arbitrary genetic engineering of recombinant adenoviral DNA molecules. Our cloning platform which also includes a novel recombination pipeline is based on bacterial artificial chromosomes (BACs). It enables generation of novel recombinant adenoviruses from different sources and switching between commonly used early generation AdVs and the last generation high-capacity AdVs lacking all viral coding sequences making them attractive candidates for clinical use. In combination with a novel recombination pipeline allowing cloning of AdVs containing large and complex transgenes and the possibility to generate arbitrary chimeric capsid-modified adenoviruses, these techniques allow generation of tailored AdVs with distinct features. Our technologies will pave the way toward broader applications of AdVs in molecular medicine including gene therapy and vaccination studies. PMID:25609697

  2. A research roadmap for next-generation sequencing informatics.

    PubMed

    Altman, Russ B; Prabhu, Snehit; Sidow, Arend; Zook, Justin M; Goldfeder, Rachel; Litwack, David; Ashley, Euan; Asimenos, George; Bustamante, Carlos D; Donigan, Katherine; Giacomini, Kathleen M; Johansen, Elaine; Khuri, Natalia; Lee, Eunice; Liang, Xueying Sharon; Salit, Marc; Serang, Omar; Tezak, Zivana; Wall, Dennis P; Mansfield, Elizabeth; Kass-Hout, Taha

    2016-04-20

    Next-generation sequencing technologies are fueling a wave of new diagnostic tests. Progress on a key set of nine research challenge areas will help generate the knowledge required to advance effectively these diagnostics to the clinic. PMID:27099173

  3. Polynomials Generated by the Fibonacci Sequence

    NASA Astrophysics Data System (ADS)

    Garth, David; Mills, Donald; Mitchell, Patrick

    2007-06-01

    The Fibonacci sequence's initial terms are F_0=0 and F_1=1, with F_n=F_{n-1}+F_{n-2} for n>=2. We define the polynomial sequence p by setting p_0(x)=1 and p_{n}(x)=x*p_{n-1}(x)+F_{n+1} for n>=1, with p_{n}(x)= sum_{k=0}^{n} F_{k+1}x^{n-k}. We call p_n(x) the Fibonacci-coefficient polynomial (FCP) of order n. The FCP sequence is distinct from the well-known Fibonacci polynomial sequence. We answer several questions regarding these polynomials. Specifically, we show that each even-degree FCP has no real zeros, while each odd-degree FCP has a unique, and (for degree at least 3) irrational, real zero. Further, we show that this sequence of unique real zeros converges monotonically to the negative of the golden ratio. Using Rouche's theorem, we prove that the zeros of the FCP's approach the golden ratio in modulus. We also prove a general result that gives the Mahler measures of an infinite subsequence of the FCP sequence whose coefficients are reduced modulo an integer m>=2. We then apply this to the case that m=L_n, the nth Lucas number, showing that the Mahler measure of the subsequence is phi^{n-1}, where phi=(1+sqrt 5)/2.
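
    The recurrence stated in this abstract can be checked numerically. The sketch below is not the authors' method: it evaluates p_n(x) directly from the recurrence and locates the real zero of an odd-degree FCP by plain bisection. The function names and the bracket [-2, 0] are assumptions made for illustration (the bracket relies on p_n(-2) < 0 < p_n(0) for odd n, which the code checks before iterating).

```python
from math import sqrt


def fcp_value(n, x):
    """Evaluate p_n(x) using p_0 = 1 and p_n = x * p_{n-1} + F_{n+1}."""
    fib_prev, fib_curr = 1, 1  # F_1, F_2
    value = 1.0  # p_0(x)
    for _ in range(n):
        value = x * value + fib_curr  # adds F_{k+1} at step k
        fib_prev, fib_curr = fib_curr, fib_prev + fib_curr
    return value


def real_zero(n, lo=-2.0, hi=0.0, iterations=200):
    """Bisection for the real zero of an odd-degree FCP (illustrative helper)."""
    if n % 2 == 0:
        raise ValueError("even-degree FCPs have no real zeros")
    if not (fcp_value(n, lo) < 0 < fcp_value(n, hi)):
        raise ValueError("assumed bracket does not contain a sign change")
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if fcp_value(n, mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    phi = (1 + sqrt(5)) / 2
    for degree in (1, 3, 5, 21, 201):
        print(degree, real_zero(degree), -phi)
```

    Printing the zeros for increasing odd degree shows them decreasing slowly toward the negative of the golden ratio, in line with the monotone convergence described in the abstract.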

  4. Next Generation Sequencing at the University of Chicago Genomics Core

    SciTech Connect

    Faber, Pieter

    2013-04-24

    The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to State-of-the-Art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and micro-arrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.

  5. JVM: Java Visual Mapping tool for next generation sequencing read.

    PubMed

    Yang, Ye; Liu, Juan

    2015-01-01

    We developed a program, JVM (Java Visual Mapping), for mapping next generation sequencing reads to a reference sequence. The program is implemented in Java and is designed to align millions of short reads generated by Illumina sequencing technology. It employs a seed index strategy and octal encoding operations for sequence alignment. JVM is useful for DNA-Seq and RNA-Seq when dealing with single-end resequencing. JVM is a desktop application that supports read datasets from 1 MB to 10 GB. PMID:25387956

  6. SNP Discovery Using Next Generation Transcriptomic Sequencing.

    PubMed

    De Wit, Pierre

    2016-01-01

    In this chapter, I will guide the user through methods to find new SNP markers from expressed sequence (RNA-Seq) data, focusing on the sample preparation and also on the bioinformatic analyses needed to sort through the immense flood of data from high-throughput sequencing machines. The general steps included are as follows: sample preparation, sequencing, quality control of data, assembly, mapping, SNP discovery, filtering, validation. The first few steps are traditional laboratory protocols, whereas steps following the sequencing are of bioinformatic nature. The bioinformatics described herein are by no means exhaustive, rather they serve as one example of a simple way of analyzing high-throughput sequence data to find SNP markers. Ideally, one would like to run through this protocol several times with a new dataset, while varying software parameters slightly, in order to determine the robustness of the results. The final validation step, although not described in much detail here, is also quite critical as that will be the final test of the accuracy of the assumptions made in silico. There is a plethora of downstream applications of a SNP dataset, not covered in this chapter. For an example of a more thorough protocol also including differential gene expression and functional enrichment analyses, BLAST annotation and downstream applications of SNP markers, a good starting point could be the "Simple Fool's Guide to population genomics via RNA-Seq," which is available at http://sfg.stanford.edu. PMID:27460371

  7. Analyzing the safety of removal sequences for piles of an offshore jacket platform

    NASA Astrophysics Data System (ADS)

    Pan, Xin-Ying; Zhang, Zhao-De

    2009-12-01

    An inevitable consequence of the development of the offshore petroleum industry is the eventual obsolescence of large offshore structures. Proper methods for removal of decommissioned offshore platforms are becoming an important topic that the oil and gas industry must pay increasing attention to. While removing sections from a decommissioned jacket platform, the stability of the remaining parts is critical. The jacket danger indices D_σ and D_s defined in this paper are very useful for analyzing the safety of any procedure planned for disassembling a jacket platform. The safest pile-cutting sequence can be determined easily by comparing every column of D_σ and D_s or simply analyzing the figures of every row of D_σ and D_s.

  8. Image encryption using random sequence generated from generalized information domain

    NASA Astrophysics Data System (ADS)

    Xia-Yan, Zhang; Guo-Ji, Zhang; Xuan, Li; Ya-Zhou, Ren; Jie-Hua, Wu

    2016-05-01

    A novel image encryption method based on the random sequence generated from the generalized information domain and permutation–diffusion architecture is proposed. The random sequence is generated by reconstruction from the generalized information file and discrete trajectory extraction from the data stream. The trajectory address sequence is used to generate a P-box to shuffle the plain image while random sequences are treated as keystreams. A new factor called drift factor is employed to accelerate and enhance the performance of the random sequence generator. An initial value is introduced to make the encryption method an approximately one-time pad. Experimental results show that the random sequences pass the NIST statistical test with a high ratio and extensive analysis demonstrates that the new encryption scheme has superior security.

  9. Variable Speed Wind Turbine Generator with Zero-sequence Filter

    DOEpatents

    Muljadi, Eduard

    1998-08-25

    A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero sequence sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility.
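
    The two-component commanded current described above can be illustrated in the time domain. The following is only a sketch under stated assumptions, not the patented controller: the amplitudes, the 37 Hz excitation frequency, and the function names are hypothetical. It shows that averaging the three phase currents recovers the common (zero sequence) 60 Hz term, because the balanced positive sequence terms cancel when summed.

```python
import math


def commanded_currents(t, i_pos, f_pos, i_zero, f_zero=60.0):
    """Three phase currents: balanced positive sequence plus a common zero sequence term."""
    zero = i_zero * math.cos(2 * math.pi * f_zero * t)
    return [
        i_pos * math.cos(2 * math.pi * f_pos * t - k * 2 * math.pi / 3) + zero
        for k in range(3)
    ]


def split_sequences(currents):
    """Return the instantaneous zero sequence term and the balanced remainder per phase."""
    zero = sum(currents) / 3.0
    balanced = [i - zero for i in currents]
    return zero, balanced


if __name__ == "__main__":
    # Hypothetical values: 10 A excitation at 37 Hz, 4 A zero sequence at 60 Hz.
    for step in range(5):
        t = step * 1e-3
        zero, balanced = split_sequences(commanded_currents(t, 10.0, 37.0, 4.0))
        print(f"t={t:.3f} s  zero={zero:+.4f} A  sum(balanced)={sum(balanced):+.2e} A")
```

    In this picture, the star connected stator windings see only the balanced remainder (which sums to zero at the neutral), while the common term is what the zero sequence filter passes on to the utility.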

  10. Variable speed wind turbine generator with zero-sequence filter

    DOEpatents

    Muljadi, E.

    1998-08-25

    A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero sequence sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility. 14 figs.

  11. Variable speed wind turbine generator with zero-sequence filter

    DOEpatents

    Muljadi, Eduard

    1998-01-01

    A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero sequence sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility.

  12. A Study on Sequence Generation Powers of Small Cellular Automata

    NASA Astrophysics Data System (ADS)

    Kamikawa, Naoki; Umeo, Hiroshi

    A model of cellular automata (CA) is considered to be a well-studied non-linear model of complex systems in which an infinite one-dimensional array of finite state machines (cells) updates itself in a synchronous manner according to a uniform local rule. A sequence generation problem on the CAs has been studied and many scholars proposed several real-time sequence generation algorithms for a variety of non-regular sequences such as prime, Fibonacci, and {2^n | n=1,2,3,...} sequences etc. The paper describes the sequence generation powers of CAs having a small number of states, focusing on the CAs with one, two, and three internal states, respectively. The authors enumerate all of the sequences generated by two-state CAs and present several non-regular sequences that can be generated in real-time by three-state CAs, but not generated by any two-state CA. It is shown that there exists a sequence generation gap among the powers of those small CAs.

  13. Comparative depositional geometries and facies within windward rimmed platform and carbonate ramp sequences

    SciTech Connect

    Boss, S.K.; Rasmussen, K.A.; Neumann, A.C.

    1992-01-01

    Northern Great Bahama Bank (NGBB) combines geomorphic aspects of rimmed platforms and carbonate ramps in a windward (high-energy) environment. Analysis of Holocene sediment cores, seismic reflection mapping of the Holocene-Pleistocene unconformity and transgressive Holocene deposits and petrographic study of excavated Holocene submarine-cemented horizons provides an integrated view of evolving depositional geometries within both rimmed platform and ramp settings. Cores display gross textural and compositional homogeneity; all sediments are medium to coarse sands comprised of composite peloids, Halimeda sp., benthic foraminifera and molluscs. Three-dimensional seismic mapping reveals that this basal unconformity exhibits variation in topographic relief related to both constructional and erosional processes; rimmed portions of the platform are associated with topographic "plateaus" with fringing eolianite ridges or (rarely) reefs. These "plateaus" are separated by a somewhat deeper (ca. 5 m deep) "trough" exhibiting little relief, but sloping seaward to form a ramp. Multiple intrasequence cemented horizons are a common feature of the thinner deposits of the NGBB ramp where tidal exchange is vigorous and sediment deposition is episodic or in dynamic balance with sediment export. Thus, rimmed carbonate platform facies are thick marine sands with relatively little submarine cementation while open, unsheltered ramp facies are characterized by thin sediment sequences containing numerous, discontinuous submarine-cemented horizons. In the absence of other obvious facies or geomorphic indicators (e.g. preserved reefal rims), the preservation of similar depositional features in ancient limestones may serve as a useful discriminant of rimmed platform versus carbonate ramp settings.

  14. The Feasibility Study of Non-Invasive Fetal Trisomy 18 and 21 Detection with Semiconductor Sequencing Platform

    PubMed Central

    Guo, Qiwei; Chen, Jinchun; Quan, Shengmao; Zhang, Ahong; Zheng, Hailing; Zhu, Xingqiang; Lin, Jin; Xu, Huan; Wu, Ayang; Park, Sin-Gi; Kim, Byung Chul; Joo, Hee Jae; Chen, Hongliang; Bhak, Jong

    2014-01-01

    Objective Recent non-invasive prenatal testing (NIPT) technologies are based on next-generation sequencing (NGS). NGS allows rapid and effective clinical diagnoses to be determined with two common sequencing systems: Illumina and Ion Torrent platforms. The majority of NIPT technology is associated with Illumina platform. We investigated whether fetal trisomy 18 and 21 were sensitively and specifically detectable by semiconductor sequencer: Ion Proton. Methods From March 2012 to October 2013, we enrolled 155 pregnant women with fetuses who were diagnosed as high risk of fetal defects at Xiamen Maternal & Child Health Care Hospital (Xiamen, Fujian, China). Adapter-ligated DNA libraries were analyzed by the Ion Proton™ System (Life Technologies, Grand Island, NY, USA) with an average 0.3× sequencing coverage per nucleotide. Average total raw reads per sample was 6.5 million and mean rate of uniquely mapped reads was 59.0%. The results of this study were derived from BWA mapping. Z-score was used for fetal trisomy 18 and 21 detection. Results Interactive dot diagrams showed the minimal z-score values to discriminate negative versus positive cases of fetal trisomy 18 and 21. For fetal trisomy 18, the minimal z-score value of 2.459 showed 100% positive predictive and negative predictive values. The minimal z-score of 2.566 was used to classify negative versus positive cases of fetal trisomy 21. Conclusion These results provide the evidence that fetal trisomy 18 and 21 detection can be performed with semiconductor sequencer. Our data also suggest that a prospective study should be performed with a larger cohort of clinically diverse obstetrics patients. PMID:25329639
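
    The abstract does not say how the z-score was computed. A common formulation, assumed here purely for illustration, compares the fraction of uniquely mapped reads on the chromosome of interest in a test sample against the mean and standard deviation of that fraction in a set of euploid reference samples. All read counts and function names below are hypothetical; only the 2.566 cutoff for trisomy 21 is taken from the abstract.

```python
from statistics import mean, stdev

T21_CUTOFF = 2.566  # minimal z-score reported in the abstract for trisomy 21


def chromosome_fraction(read_counts, chromosome):
    """Fraction of all mapped reads that fall on the given chromosome."""
    return read_counts[chromosome] / sum(read_counts.values())


def z_score(sample_counts, reference_counts, chromosome):
    """Standardize the sample's read fraction against euploid reference samples."""
    reference = [chromosome_fraction(c, chromosome) for c in reference_counts]
    sample = chromosome_fraction(sample_counts, chromosome)
    return (sample - mean(reference)) / stdev(reference)


if __name__ == "__main__":
    # Hypothetical counts: reads on chr21 versus all other chromosomes.
    reference = [
        {"chr21": n, "other": 1_000_000 - n}
        for n in (13000, 13100, 12900, 13000, 13200, 12800)
    ]
    test_sample = {"chr21": 13700, "other": 986300}
    z = z_score(test_sample, reference, "chr21")
    print(f"z = {z:.3f}, flagged = {z > T21_CUTOFF}")
```

    A production pipeline would typically also correct for GC content and fetal fraction; those steps are omitted from this sketch.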

  15. Parallel tagged amplicon sequencing of relatively long PCR products using the Illumina HiSeq platform and transcriptome assembly.

    PubMed

    Feng, Yan-Jie; Liu, Qing-Feng; Chen, Meng-Yun; Liang, Dan; Zhang, Peng

    2016-01-01

    In phylogenetics and population genetics, a large number of loci are often needed to accurately resolve species relationships. Normally, loci are enriched by PCR and sequenced by Sanger sequencing, which is expensive when the number of amplicons is large. Next-generation sequencing (NGS) techniques are increasingly used for parallel amplicon sequencing, which reduces sequencing costs tremendously, but has not reduced preparation costs very much. Moreover, for most current NGS methods, amplicons need to be purified and quantified before sequencing and their lengths are also restricted (normally <700 bp). Here, we describe an approach to sequence pooled amplicons of any length using the Illumina platform. Using this method, amplicons are pooled at equal volume rather than at equal concentration, thus eliminating the laborious purification and quantification steps. We then shear the pooled amplicons, repair the ends, add sample identifying linkers and pool multiple samples prior to Illumina library preparation. Data are then assembled using the transcriptome assembly program trinity, which is optimized to deal with templates of highly varying quantities. We demonstrated the utility of our approach by recovering 93.5% of the target amplicons (size up to 1650 bp) in full length for a 16 taxa × 101 loci project, using ~2.0 GB of Illumina HiSeq paired-end 90-bp data. Overall, we validate a rapid, cost-effective and scalable approach to sequence a large number of targeted loci from a large number of samples that is particularly suitable for both phylogenetics and population genetics studies that require a modest scale of data. PMID:25959587

  16. Next Generation Sequencing Technologies: The Doorway to the Unexplored Genomics of Non-Model Plants

    PubMed Central

    Unamba, Chibuikem I. N.; Nag, Akshay; Sharma, Ram K.

    2015-01-01

    Non-model plants i.e., the species which have one or all of the characters such as long life cycle, difficulty to grow in the laboratory or poor fecundity, have been schemed out of sequencing projects earlier, due to high running cost of Sanger sequencing. Consequently, the information about their genomics and key biological processes are inadequate. However, the advent of fast and cost effective next generation sequencing (NGS) platforms in the recent past has enabled the unearthing of certain characteristic gene structures unique to these species. It has also aided in gaining insight about mechanisms underlying processes of gene expression and secondary metabolism as well as facilitated development of genomic resources for diversity characterization, evolutionary analysis and marker assisted breeding even without prior availability of genomic sequence information. In this review we explore how different Next Gen Sequencing platforms, as well as recent advances in NGS based high throughput genotyping technologies are rewarding efforts on de-novo whole genome/transcriptome sequencing, development of genome wide sequence based markers resources for improvement of non-model crops that are less costly than phenotyping. PMID:26734016

  17. Next Generation Sequencing for the Diagnosis of Cardiac Arrhythmia Syndromes

    PubMed Central

    Lubitz, Steven A.; Ellinor, Patrick T.

    2015-01-01

    Inherited arrhythmia syndromes are collectively associated with substantial morbidity, yet our understanding of the genetic architecture of these conditions remains limited. Recent technological advances in DNA sequencing have led to the commercialization of genetic testing now widely available in clinical practice. In particular, next generation sequencing allows the large-scale and rapid assessment of entire genomes. Although next generation sequencing represents a major technological advance, it has introduced numerous challenges with respect to the interpretation of genetic variation, and has opened a veritable floodgate of biological data of unknown clinical significance to practitioners. In this review, we discuss current genetic testing indications for inherited arrhythmia syndromes, broadly outline characteristics of next generation sequencing techniques, and highlight challenges associated with such testing. We further summarize future directions that will be necessary to address to enable the widespread adoption of next generation sequencing in the routine management of patients with inherited arrhythmia syndromes. PMID:25625719

  18. Methods in virus diagnostics: from ELISA to next generation sequencing.

    PubMed

    Boonham, Neil; Kreuze, Jan; Winter, Stephan; van der Vlugt, René; Bergervoet, Jan; Tomlinson, Jenny; Mumford, Rick

    2014-06-24

    Despite the seemingly continuous development of newer and ever more elaborate methods for detecting and identifying viruses, very few of these new methods are adopted for routine use in testing laboratories, often despite the many and varied advantages claimed for them. Understanding why the rate of uptake of new technologies is so low requires, to begin with, a strong understanding of what makes a good routine diagnostic tool. This can be gained by looking at the two most successfully established plant virus detection methods: enzyme-linked immunosorbent assay (ELISA) and the more recently introduced real-time polymerase chain reaction (PCR). By examining the characteristics of this pair of technologies, it becomes clear that they share many benefits, such as an industry standard format and high levels of repeatability and reproducibility. These combine to make methods that are accessible to testing labs, easy to establish and robust in use, even with new and inexperienced users. Hence, to ensure the establishment of new techniques it is necessary not only to provide benefits not found with ELISA or real-time PCR, but also to provide a platform that is easy to establish and use. In plant virus diagnostics, recent developments can be clustered into three core areas: (1) techniques that can be performed in the field or in resource-poor locations (e.g., loop-mediated isothermal amplification, LAMP); (2) multiplex methods that are able to detect many viruses in a single test (e.g., Luminex bead arrays); and (3) methods suited to virus discovery (e.g., next generation sequencing, NGS). Field-based methods are not new; Lateral Flow Devices (LFDs) for detection have been available for a number of years now. However, the widespread uptake of this technology remains poor. LAMP does offer significant advantages over LFDs in terms of sensitivity and generic application, but still faces challenges in terms of establishment.
It is likely that the main barrier to the

  19. Non-random DNA fragmentation in next-generation sequencing

    PubMed Central

    Poptsova, Maria S.; Il'icheva, Irina A.; Nechipurenko, Dmitry Yu.; Panchenko, Larisa A.; Khodikov, Mingian V.; Oparina, Nina Y.; Polozov, Robert V.; Nechipurenko, Yury D.; Grokhovsky, Sergei L.

    2014-01-01

    Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments, and their massive parallel sequencing. The multiple overlapping segments termed “reads” are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, previously we showed that for the sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of the hydrodynamic forces on DNA, produce similar bias. Consideration of this non-random DNA fragmentation may allow one to unravel what factors and to what extent influence the non-uniform coverage of various genomic regions. PMID:24681819
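
    The requirement above that every region be sequenced "several dozen times" is usually expressed as mean depth of coverage. The following is a minimal sketch, not taken from the paper, of the standard expectation c = N * L / G and of the Poisson estimate of uncovered bases. Both hold only under the random, sequence-independent fragmentation assumption that the abstract calls into question. The function names and the example numbers are illustrative assumptions.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Expected depth c = N * L / G, assuming reads start uniformly at random."""
    return num_reads * read_length / genome_size


def fraction_uncovered(coverage: float) -> float:
    """Poisson approximation: probability that a given base is covered by zero reads."""
    return math.exp(-coverage)


# Illustrative numbers only (not from the paper): 1.5 million 100-bp reads, 5-Mb genome.
c = mean_coverage(num_reads=1_500_000, read_length=100, genome_size=5_000_000)
print(c)                      # 30.0
print(fraction_uncovered(c))  # vanishingly small if fragmentation is truly random
```

    If fragmentation is sequence-dependent, observed coverage would deviate from this uniform expectation, which is the kind of non-uniformity the abstract attributes to fragmentation bias.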

  20. Non-random DNA fragmentation in next-generation sequencing

    NASA Astrophysics Data System (ADS)

    Poptsova, Maria S.; Il'Icheva, Irina A.; Nechipurenko, Dmitry Yu.; Panchenko, Larisa A.; Khodikov, Mingian V.; Oparina, Nina Y.; Polozov, Robert V.; Nechipurenko, Yury D.; Grokhovsky, Sergei L.

    2014-03-01

    Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments, and their massive parallel sequencing. The multiple overlapping segments termed “reads” are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, previously we showed that for the sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of the hydrodynamic forces on DNA, produce similar bias. Consideration of this non-random DNA fragmentation may allow one to unravel what factors and to what extent influence the non-uniform coverage of various genomic regions.

  1. Computer program to generate attitude error equations for a gimballed platform

    NASA Technical Reports Server (NTRS)

    Hall, W. A., Jr.; Morris, T. D.; Rone, K. Y.

    1972-01-01

    Computer program for solving attitude error equations related to gimballed platform is described. Program generates matrix elements of attitude error equations when initial matrices and trigonometric identities have been defined. Program is written for IBM 360 computer.

  2. Bioelectrochemical system platform for sustainable environmental remediation and energy generation.

    PubMed

    Wang, Heming; Luo, Haiping; Fallgren, Paul H; Jin, Song; Ren, Zhiyong Jason

    2015-01-01

    The increasing awareness of the energy-environment nexus is compelling the development of technologies that reduce environmental impacts during energy production as well as energy consumption during environmental remediation. Countries spend billions in pollution cleanup projects, and new technologies with low energy and chemical consumption are needed for sustainable remediation practice. This perspective review provides a comprehensive summary on the mechanisms of the new bioelectrochemical system (BES) platform technology for efficient and low cost remediation, including petroleum hydrocarbons, chlorinated solvents, perchlorate, azo dyes, and metals, and it also discusses the potential new uses of BES approach for some emerging contaminants remediation, such as CO2 in air and nutrients and micropollutants in water. The unique feature of BES for environmental remediation is the use of electrodes as non-exhaustible electron acceptors, or even donors, for contaminant degradation, which requires minimum energy or chemicals but instead produces sustainable energy for monitoring and other onsite uses. BES provides both oxidation (anode) and reduction (cathode) reactions that integrate microbial-electro-chemical removal mechanisms, so complex contaminants with different characteristics can be removed. We believe the BES platform carries great potential for sustainable remediation and hope this perspective provides background and insights for future research and development. PMID:25886880

  3. Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives

    PubMed Central

    Knapp, Michael; Hofreiter, Michael

    2010-01-01

    The invention of next-generation-sequencing has revolutionized almost all fields of genetics, but few have profited from it as much as the field of ancient DNA research. From its beginnings as an interesting but rather marginal discipline, ancient DNA research is now on its way into the centre of evolutionary biology. In less than a year from its invention next-generation-sequencing had increased the amount of DNA sequence data available from extinct organisms by several orders of magnitude. Ancient DNA research is now not only adding a temporal aspect to evolutionary studies and allowing for the observation of evolution in real time, it also provides important data to help understand the origins of our own species. Here we review progress that has been made in next-generation-sequencing of ancient DNA over the past five years and evaluate sequencing strategies and future directions. PMID:24710043

  4. Mining frequent biological sequences based on bitmap without candidate sequence generation.

    PubMed

    Wang, Qian; Davis, Darryl N; Ren, Jiadong

    2016-02-01

    Biological sequences carry a great deal of important genetic information about organisms. Furthermore, there is an inheritance law related to protein function and structure which is useful for applications such as disease prediction. Frequent sequence mining is a core technique for association rule discovery, but existing algorithms suffer from low efficiency or high error rates because biological sequences have more characteristics than general sequences. In this paper, an algorithm for mining Frequent Biological Sequences based on Bitmap, FBSB, is proposed. FBSB uses bitmaps as a simple data structure and transforms each row into a quicksort list (QS-list) for sequence growth. To meet the continuity and accuracy requirements of biological sequence mining, the sequences tested during the mining process of FBSB are real ones instead of generated candidates, and all the frequent sequences can be mined without any errors. Compared with other algorithms, the experimental results show that FBSB achieves better performance in both run time and scalability. PMID:26773937
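
    The bitmap idea described above can be illustrated with a short sketch. This is not the FBSB algorithm itself: the QS-list structure is not reproduced, and every function name is an assumption made for illustration. Each symbol gets one integer bitmap per input sequence, and a contiguous pattern is grown by shifting its end-position bitmap by one and intersecting it with the next symbol's bitmap, so support is always counted from real occurrences.

```python
from collections import defaultdict


def build_bitmaps(sequences):
    """Map each symbol to per-sequence bitmaps (bit i set if the symbol is at position i)."""
    bitmaps = defaultdict(lambda: [0] * len(sequences))
    for seq_id, seq in enumerate(sequences):
        for pos, symbol in enumerate(seq):
            bitmaps[symbol][seq_id] |= 1 << pos
    return dict(bitmaps)


def support(end_bitmaps):
    """Number of sequences in which the pattern occurs at least once."""
    return sum(1 for bits in end_bitmaps if bits)


def mine_frequent(sequences, min_support):
    """Return {pattern: support} for contiguous patterns meeting min_support."""
    bitmaps = build_bitmaps(sequences)
    frequent = {}
    # Each stack entry holds a pattern and the bitmaps of positions where it ends.
    stack = [(s, bm) for s, bm in bitmaps.items() if support(bm) >= min_support]
    while stack:
        pattern, ends = stack.pop()
        frequent[pattern] = support(ends)
        for symbol, symbol_bitmaps in bitmaps.items():
            # Shift end positions right by one and keep those where `symbol` occurs.
            new_ends = [(e << 1) & b for e, b in zip(ends, symbol_bitmaps)]
            if support(new_ends) >= min_support:
                stack.append((pattern + symbol, new_ends))
    return frequent


print(mine_frequent(["ACGT", "ACGA", "TACG"], min_support=3))
```

    Because Python integers are arbitrary-precision, one integer per sequence serves as the bitmap here; a production implementation would more likely use fixed-width words.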

  5. Building a next generation platform for association studies in cacao

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The drastic reductions in cost and time associated with the collection of DNA sequence and genotype data have revolutionized genetic mapping in model systems (e.g. humans, Arabidopsis) and also promise to significantly enhance the power and resolution of genetic mapping in agricultural systems. Prog...

  6. Integrated platform for detection of DNA sequence variants using capillary array electrophoresis

    SciTech Connect

    Qingbro, Li; Liu, Zhaowei; Monroe, Heidi M; Culiat, Cymbeline T

    2002-08-01

    We have developed a highly versatile platform that performs temperature gradient capillary electrophoresis (TGCE) for mutation/single-nucleotide polymorphism (SNP) detection, sequencing and mutation/SNP genotyping for identification of sequence variants on an automated 24-, 96- or 192-capillary array instrument. In the first mode, multiple DNA samples consisting of homoduplexes and heteroduplexes are separated by CE, during which a temperature gradient is applied that covers all possible temperatures of 50% melting equilibrium (Tms) for the samples. The differences in Tms result in separation of homoduplexes from heteroduplexes, thereby identifying the presence of DNA variants. The sequencing mode is then used to determine the exact location of the mutation/SNPs in the DNA variants. The first two modes allow the rapid identification of variants from the screening of a large number of samples. Only the variants need to be sequenced. The third mode utilizes multiplexed single-base extensions (SBEs) to survey mutations and SNPs at the known sites of DNA sequence. The TGCE approach combined with sequencing and SBE is fast and cost-effective for high-throughput mutation/SNP detection.

  7. Next-generation sequencing in clinical virology: Discovery of new viruses

    PubMed Central

    Datta, Sibnarayan; Budhauliya, Raghvendra; Das, Bidisha; Chatterjee, Soumya; Vanlalhmuaka; Veer, Vijay

    2015-01-01

    Viruses are a cause of significant health problems worldwide, especially in developing nations. Owing to various human activities, populations are exposed to different viral pathogens, many of which emerge as outbreaks. In such situations, the discovery of novel viruses is of the utmost importance for deciding prevention and treatment strategies. Since the last century, a number of virus discovery methods, based on cell culture inoculation and sequence-independent PCR, have been used to identify a variety of viruses. However, the recent emergence and commercial availability of next-generation sequencers (NGS) has entirely changed the field of virus discovery. These massively parallel sequencing platforms can sequence a very heterogeneous mixture of genetic materials with high sensitivity. Moreover, these platforms work in a sequence-independent manner, making them ideal tools for virus discovery. However, for their application in clinics, sample preparation or enrichment is necessary to detect low-abundance virus populations. A number of techniques have also been developed for the enrichment of viral nucleic acids. In this manuscript, we review the evolution of sequencing, the NGS technologies available today, and widely used virus enrichment technologies. We also discuss the challenges associated with their application in clinical virus discovery. PMID:26279987

  8. Multiple nuclear ortholog next generation sequencing phylogeny of Daucus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencing is helping to solve the data insufficiency problem hindering well-resolved dominant gene phylogenies. We used Roche 454 technology to obtain DNA sequences from 93 nuclear orthologs, dispersed throughout all linkage groups of Daucus. Of these 93 orthologs, ten were designed...

  9. QColors: an algorithm for conservative viral quasispecies reconstruction from short and non-contiguous next generation sequencing reads.

    PubMed

    Huang, Austin; Kantor, Rami; DeLong, Allison; Schreier, Leeann; Istrail, Sorin

    Next generation sequencing technologies have recently been applied to characterize mutational spectra of the heterogeneous population of viral genotypes (known as a quasispecies) within HIV-infected patients. Such information is clinically relevant because minority genetic subpopulations of HIV within patients enable viral escape from selection pressures such as the immune response and antiretroviral therapy. However, methods for quasispecies sequence reconstruction from next generation sequencing reads are not yet widely used and remain an emerging area of research. Furthermore, the majority of research methodology in HIV has focused on 454 sequencing, while many next-generation sequencing platforms used in practice are limited to shorter read lengths relative to 454 sequencing. Little work has been done in determining how best to address the read length limitations of other platforms. The approach described here incorporates graph representations of both read differences and read overlap to conservatively determine the regions of the sequence with sufficient variability to separate quasispecies sequences. Within these tractable regions of quasispecies inference, we use constraint programming to solve for an optimal quasispecies subsequence determination via vertex coloring of the conflict graph, a representation which also lends itself to data with non-contiguous reads such as paired-end sequencing. We demonstrate the utility of the method by applying it to simulations based on actual intra-patient clonal HIV-1 sequencing data. PMID:23202421
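
    The conflict-graph formulation above can be sketched in a few lines. Note the substitution: the abstract describes constraint programming to find an optimal coloring, whereas this illustration uses a simple greedy coloring, which may use more colors than the optimum. The read representation (start position, bases) and all function names are assumptions, not the QColors implementation.

```python
from itertools import combinations


def conflicts(read_a, read_b):
    """True if two reads overlap in reference coordinates and disagree at a shared position."""
    (start_a, seq_a), (start_b, seq_b) = read_a, read_b
    lo = max(start_a, start_b)
    hi = min(start_a + len(seq_a), start_b + len(seq_b))
    return any(seq_a[p - start_a] != seq_b[p - start_b] for p in range(lo, hi))


def greedy_coloring(reads):
    """Assign each read a color so that conflicting reads never share one."""
    n = len(reads)
    adjacency = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if conflicts(reads[i], reads[j]):
            adjacency[i].add(j)
            adjacency[j].add(i)
    colors = {}
    # Visit high-degree reads first; each read takes the smallest unused color.
    for i in sorted(adjacency, key=lambda v: -len(adjacency[v])):
        used = {colors[j] for j in adjacency[i] if j in colors}
        colors[i] = next(c for c in range(n) if c not in used)
    return colors


# Four hypothetical reads drawn from two variants that differ at position 2.
reads = [(0, "ACGT"), (2, "GTAA"), (0, "ACCT"), (2, "CTAA")]
print(greedy_coloring(reads))
```

    Reads that receive the same color are mutually compatible and could belong to the same quasispecies sequence; reads with different colors cannot.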

  10. Collaborative Effort for a Centralized Worldwide Tuberculosis Relational Sequencing Data Platform.

    PubMed

    Starks, Angela M; Avilés, Enrique; Cirillo, Daniela M; Denkinger, Claudia M; Dolinger, David L; Emerson, Claudia; Gallarda, Jim; Hanna, Debra; Kim, Peter S; Liwski, Richard; Miotto, Paolo; Schito, Marco; Zignol, Matteo

    2015-10-15

    Continued progress in addressing challenges associated with detection and management of tuberculosis requires new diagnostic tools. These tools must be able to provide rapid and accurate information for detecting resistance to guide selection of the treatment regimen for each patient. To achieve this goal, globally representative genotypic, phenotypic, and clinical data are needed in a standardized and curated data platform. A global partnership of academic institutions, public health agencies, and nongovernmental organizations has been established to develop a tuberculosis relational sequencing data platform (ReSeqTB) that seeks to increase understanding of the genetic basis of resistance by correlating molecular data with results from drug susceptibility testing and, optimally, associated patient outcomes. These data will inform development of new diagnostics, facilitate clinical decision making, and improve surveillance for drug resistance. ReSeqTB offers an opportunity for collaboration to achieve improved patient outcomes and to advance efforts to prevent and control this devastating disease. PMID:26409275

  11. DTUsat-2 - The Next Generation Animal Migration Research Platform

    NASA Astrophysics Data System (ADS)

    Bjarnø, J. B.; Fléron, R. W.

    2008-08-01

    The DTUsat-2 project aims to demonstrate pico-class (<1kg) satellites as a viable platform for high value scientific investigations. Through the development of both ground and space segments in a generic miniature tracking system, the project specifically targets a major impediment faced by the biological research community, namely the lack of accurate intercontinental tracking solutions for smaller migratory species. Resolving this issue will not only push the boundaries of migratory research, but also entails the capability of bringing global remote tracking access to hitherto inaccessible sciences. This paper outlines the scope of the DTUsat-2 project and the organizational framework behind the mission. Moreover, the system level designs are discussed in relation to the latest advances on actual project implementations and development milestones.

  12. Next Generation Sequencing: Potential and Application in Drug Discovery

    PubMed Central

    Yadav, Navneet Kumar; Shukla, Pooja; Omer, Ankur; Pareek, Shruti; Singh, R. K.

    2014-01-01

    The world has now entered a new era of genomics because of the continued advancements in next generation high-throughput sequencing technologies, which include sequencing by synthesis-fluorescent in situ sequencing (FISSEQ), pyrosequencing, sequencing by ligation using polony amplification, supported oligonucleotide detection (SOLiD), sequencing by hybridization along with sequencing by ligation, and nanopore technology. These methods are having a great impact on solving genome-related problems in the plant and animal kingdoms, which will open the door to a new era of genomics. They may ultimately supersede Sanger sequencing, which ruled for 30 years. NGS is expected to advance and make the drug discovery process more rapid. PMID:24688432

  13. NGS QC Toolkit: A Toolkit for Quality Control of Next Generation Sequencing Data

    PubMed Central

    Patel, Ravi K.; Jain, Mukesh

    2012-01-01

    Next generation sequencing (NGS) technologies provide a high-throughput means to generate large amounts of sequence data. However, quality control (QC) of sequence data generated from these technologies is extremely important for meaningful downstream analysis. Further, highly efficient and fast processing tools are required to handle the large volume of datasets. Here, we have developed an application, NGS QC Toolkit, for quality check and filtering of high-quality data. This toolkit is a standalone and open source application freely available at http://www.nipgr.res.in/ngsqctoolkit.html. All the tools in the application have been implemented in the Perl programming language. The toolkit comprises user-friendly tools for QC of sequencing data generated using Roche 454 and Illumina platforms, and additional tools to aid QC (sequence format converter and trimming tools) and analysis (statistics tools). A variety of options have been provided to facilitate QC with user-defined parameters. The toolkit is expected to be very useful for the QC of NGS data to facilitate better downstream analysis. PMID:22312429
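
    To make the kind of filtering described above concrete, here is a minimal sketch of read-level quality filtering on FASTQ records. It is not the NGS QC Toolkit's Perl code or its interface; the function names, the Phred+33 encoding assumption, and the default cutoffs (`min_quality`, `min_fraction`) are all illustrative assumptions.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 encoding assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def passes_qc(quality_string, min_quality=20, min_fraction=0.7):
    """Keep a read if at least min_fraction of its bases have quality >= min_quality."""
    scores = phred_scores(quality_string)
    if not scores:
        return False
    good = sum(q >= min_quality for q in scores)
    return good / len(scores) >= min_fraction


def filter_fastq(lines, **cutoffs):
    """Yield (header, sequence, quality) for each 4-line FASTQ record that passes QC."""
    records = zip(*[iter(lines)] * 4)  # group the line stream into 4-line records
    for header, sequence, _plus, quality in records:
        if passes_qc(quality.strip(), **cutoffs):
            yield header.strip(), sequence.strip(), quality.strip()


# Two hypothetical reads: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
fastq = ["@r1", "ACGT", "+", "IIII", "@r2", "ACGT", "+", "I###"]
print(list(filter_fastq(fastq)))  # only r1 passes with the default cutoffs
```

    Older Illumina pipelines used a Phred+64 offset, so the `offset` parameter would need to match the data at hand.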

  14. Next generation sequencing: implications in personalized medicine and pharmacogenomics.

    PubMed

    Rabbani, Bahareh; Nakaoka, Hirofumi; Akhondzadeh, Shahin; Tekin, Mustafa; Mahdieh, Nejat

    2016-05-24

    A breakthrough in next generation sequencing (NGS) in the last decade provided an unprecedented opportunity to investigate genetic variations in humans and their roles in health and disease. NGS offers regional genomic sequencing such as whole exome sequencing of coding regions of all genes, as well as whole genome sequencing. RNA-seq offers sequencing of the entire transcriptome and ChIP-seq allows for sequencing the epigenetic architecture of the genome. Identifying genetic variations in individuals can be used to predict disease risk, with the potential to halt or retard disease progression. NGS can also be used to predict the response to or adverse effects of drugs or to calculate appropriate drug dosage. Such a personalized medicine also provides the possibility to treat diseases based on the genetic makeup of the patient. Here, we review the basics of NGS technologies and their application in human diseases to foster human healthcare and personalized medicine. PMID:27066891

  15. A Real-Time de novo DNA Sequencing Assembly Platform Based on an FPGA Implementation.

    PubMed

    Hu, Yuanqi; Georgiou, Pantelis

    2016-01-01

    This paper presents an FPGA based DNA comparison platform which can be run concurrently with the sensing phase of DNA sequencing and shortens the overall time needed for de novo DNA assembly. A hybrid overlap searching algorithm is applied which is scalable and can deal with incremental detection of new bases. To handle the incomplete data set which gradually increases during sequencing time, all-against-all comparisons are broken down into successive window-against-window comparison phases and executed using a novel dynamic suffix comparison algorithm combined with a partitioned dynamic programming method. The complete system has been designed to facilitate parallel processing in hardware, which allows real-time comparison and full scalability as well as a decrease in the number of computations required. A base pair comparison rate of 51.2 G/s is achieved when implemented on an FPGA with successful DNA comparison when using data sets from real genomes. PMID:27045828
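
    The elementary operation behind the overlap search described above is finding how far the end of one read matches the start of another. The sketch below shows a naive software version of that suffix-prefix comparison. It is not the paper's dynamic suffix comparison algorithm, its partitioned dynamic programming, or its hardware design; the function names and the `min_length` parameter are assumptions.

```python
def overlap_length(a: str, b: str, min_length: int = 3) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b` (0 if below min_length)."""
    shortest = max(min_length, 1)  # guard: a[-0:] would be the whole string
    for k in range(min(len(a), len(b)), shortest - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def all_overlaps(reads, min_length=3):
    """All-against-all comparison: {(i, j): overlap} for ordered pairs with an overlap."""
    overlaps = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                k = overlap_length(a, b, min_length)
                if k:
                    overlaps[(i, j)] = k
    return overlaps


print(overlap_length("ACGTAC", "TACGGA"))    # 3, the shared "TAC"
print(all_overlaps(["ACGTAC", "TACGGA"]))    # {(0, 1): 3}
```

    The all-against-all loop is quadratic in the number of reads, which is the cost the abstract addresses by splitting the work into successive window-against-window phases executed in parallel hardware.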

  16. Next-Generation Sequencing in Clinical Molecular Diagnostics of Cancer: Advantages and Challenges

    PubMed Central

    Luthra, Rajyalakshmi; Chen, Hui; Roy-Chowdhuri, Sinchita; Singh, R. Rajesh

    2015-01-01

    The application of next-generation sequencing (NGS) to characterize cancer genomes has resulted in the discovery of numerous genetic markers. Consequently, the number of markers that warrant routine screening in molecular diagnostic laboratories, often from limited tumor material, has increased. This increased demand has been difficult to manage by traditional low- and/or medium-throughput sequencing platforms. Massively parallel sequencing capabilities of NGS provide a much-needed alternative for mutation screening in multiple genes with a single low investment of DNA. However, implementation of NGS technologies, most of which are for research use only (RUO), in a diagnostic laboratory, needs extensive validation in order to establish Clinical Laboratory Improvement Amendments (CLIA) and College of American Pathologists (CAP)-compliant performance characteristics. Here, we have reviewed approaches for validation of NGS technology for routine screening of tumors. We discuss the criteria for selecting gene markers to include in the NGS panel and the deciding factors for selecting target capture approaches and sequencing platforms. We also discuss challenges in result reporting, storage and retrieval of the voluminous sequencing data and the future potential of clinical NGS. PMID:26473927

  17. Next-Generation Sequencing in Clinical Molecular Diagnostics of Cancer: Advantages and Challenges.

    PubMed

    Luthra, Rajyalakshmi; Chen, Hui; Roy-Chowdhuri, Sinchita; Singh, R Rajesh

    2015-01-01

    The application of next-generation sequencing (NGS) to characterize cancer genomes has resulted in the discovery of numerous genetic markers. Consequently, the number of markers that warrant routine screening in molecular diagnostic laboratories, often from limited tumor material, has increased. This increased demand has been difficult to manage by traditional low- and/or medium-throughput sequencing platforms. Massively parallel sequencing capabilities of NGS provide a much-needed alternative for mutation screening in multiple genes with a single low investment of DNA. However, implementation of NGS technologies, most of which are for research use only (RUO), in a diagnostic laboratory, needs extensive validation in order to establish Clinical Laboratory Improvement Amendments (CLIA) and College of American Pathologists (CAP)-compliant performance characteristics. Here, we have reviewed approaches for validation of NGS technology for routine screening of tumors. We discuss the criteria for selecting gene markers to include in the NGS panel and the deciding factors for selecting target capture approaches and sequencing platforms. We also discuss challenges in result reporting, storage and retrieval of the voluminous sequencing data and the future potential of clinical NGS. PMID:26473927

  18. A Pulse Generator Based on an Arduino Platform for Ultrasonic Applications

    NASA Astrophysics Data System (ADS)

    Acevedo, Pedro; Vázquez, Mónica; Durán, Joel; Petrearce, Rodolfo

    The objective of this work is to use the Arduino platform as an ultrasonic pulse generator to excite PVDF ultrasonic arrays in transmission. An experimental setup was implemented using a through-transmission configuration to evaluate the performance of the generator.

  19. Third Generation Sequencing Techniques and Applications to Drug Discovery

    PubMed Central

    Ozsolak, Fatih

    2012-01-01

    Introduction: There is an immediate need for functional and molecular studies to decipher differences between disease and “normal” settings to identify large quantities of validated targets with the highest therapeutic utilities. Furthermore, drug mechanism of action and biomarkers to predict drug efficacy and safety need to be identified for effective design of clinical trials, decreasing attrition rates, regulatory agency approval process and drug repositioning. By expanding the power of genetics and pharmacogenetics studies, next generation nucleic acid sequencing technologies have started to play an important role in all stages of drug discovery. Areas covered: This article reviews the first and second generation sequencing technologies (SGSTs) and challenges they pose to biomedicine. The article then focuses on the emerging third generation sequencing technologies (TGSTs), their technological foundations and potential contributions to drug discovery. Expert opinion: Despite the scientific and commercial success of SGSTs, the goal of rapid, comprehensive and unbiased sequencing of nucleic acids has not been achieved. TGSTs promise to increase sequencing throughput and read lengths, decrease costs, run times and error rates, eliminate biases inherent in SGSTs, and offer capabilities beyond nucleic acid sequencing. Such changes will have positive impact in all sequencing applications to drug discovery. PMID:22468954

  20. Pittosporum cryptic virus 1: genome sequence completion using next-generation sequencing.

    PubMed

    Elbeaino, Toufic; Kubaa, Raied Abou; Tuzlali, Hasan Tuna; Digiaro, Michele

    2016-07-01

    Next-generation sequencing (NGS) was applied to dsRNAs extracted from an Italian pittosporum plant infected with pittosporum cryptic virus 1 (PiCV1). NGS allowed assembly of the full genome sequence of PiCV1, comprising dsRNA1 (1.9 kbp) and dsRNA2 (1.5 kbp), which encode the RNA-dependent RNA polymerase and capsid protein genes, respectively. Phylogenetic and sequence analyses confirmed that PiCV1 is a new member of the genus Deltapartitivirus, family Partitiviridae. From the same plant, NGS also permitted assembly of the complete genome sequence of eggplant mottled dwarf virus (EMDV), which shared 86% to 98% nucleotide sequence identity with complete and partial sequences (ca. 6750 nt) of other known EMDV isolates with sequences available in the GenBank database. PMID:27087112

  1. Repetitive reef to ooid sequences near leeward margin of Caicos Platform, British West Indies

    SciTech Connect

    Waltz, M.; Rossinsky, V.; Wanless, H.R.

    1987-05-01

    Drill core transects and outcrops near the leeward margin of the Caicos Platform, BWI, reveal repetitive (one Holocene and two Pleistocene) shallowing-upward sequences of either (a) reefal boundstones overlain by layered oolitic grainstones or (b) burrowed oolitic grainstones overlain by layered oolitic grainstones. Each sediment sequence is separated from the other by a calcrete exposure surface. A transect, perpendicular to the trend of an exposed Pleistocene barrier reef/ooid sand complex, shows two separate sediment packages of reefal boundstones and reef-derived skeletal packstones overlain by layered oolitic grainstones. The well-exposed upper package consists of a shallowing-upward barrier reef, which is immediately overlain by burrowed and cross-bedded oolitic grainstones, beach rock blocks, and coral rubble, capped by layered oolitic grainstones. Separated by an exposure horizon, the lowermost package consists of coral and skeletal sands overlain by layered oolitic grainstones. Cores from a transect in a non-reefal setting north of the barrier reef complex reveal highly burrowed oolitic grainstones capped by layered oolitic grainstones. As a Holocene example, immediately offshore of this transect, modern reefs and bioturbated oolitic grainstones are presently being buried beneath coral rubble, beach rock blocks, and prograding oolitic beaches. Deposition of the capping layered oolitic grainstones appears to occur during stable and falling sea levels. This co-occurrence of reefal sediment and ooid sands suggests that the two are not mutually exclusive and that reef-ooid succession is a reoccurring part of leeward margin platform margin-building.

  2. Transcriptome sequencing and development of an expression microarray platform for the domestic ferret

    PubMed Central

    2010-01-01

    Background: The ferret (Mustela putorius furo) represents an attractive animal model for the study of respiratory diseases, including influenza. Despite its importance for biomedical research, the number of reagents for molecular and immunological analysis is restricted. We present here a parallel sequencing effort to produce an extensive EST (expressed sequence tags) dataset derived from a normalized ferret cDNA library made from mRNA from ferret blood, liver, lung, spleen and brain. Results: We produced more than 500000 sequence reads that were assembled into 16000 partial ferret genes. These genes were combined with the available ferret sequences in the GenBank to develop a ferret specific microarray platform. Using this array, we detected tissue specific expression patterns which were confirmed by quantitative real time PCR assays. We also present a set of 41 ferret genes with even transcription profiles across the tested tissues, indicating their usefulness as housekeeping genes. Conclusion: The tools developed in this study allow for functional genomic analysis and make further development of reagents for the ferret model possible. PMID:20403183

  3. Neural Sequence Generation Using Spatiotemporal Patterns of Inhibition.

    PubMed

    Cannon, Jonathan; Kopell, Nancy; Gardner, Timothy; Markowitz, Jeffrey

    2015-11-01

    Stereotyped sequences of neural activity are thought to underlie reproducible behaviors and cognitive processes ranging from memory recall to arm movement. One of the most prominent theoretical models of neural sequence generation is the synfire chain, in which pulses of synchronized spiking activity propagate robustly along a chain of cells connected by highly redundant feedforward excitation. But recent experimental observations in the avian song production pathway during song generation have shown excitatory activity interacting strongly with the firing patterns of inhibitory neurons, suggesting a process of sequence generation more complex than feedforward excitation. Here we propose a model of sequence generation inspired by these observations in which a pulse travels along a spatially recurrent excitatory chain, passing repeatedly through zones of local feedback inhibition. In this model, synchrony and robust timing are maintained not through redundant excitatory connections, but rather through the interaction between the pulse and the spatiotemporal pattern of inhibition that it creates as it circulates the network. These results suggest that spatially and temporally structured inhibition may play a key role in sequence generation. PMID:26536029

  4. Neural Sequence Generation Using Spatiotemporal Patterns of Inhibition

    PubMed Central

    Cannon, Jonathan; Kopell, Nancy; Gardner, Timothy; Markowitz, Jeffrey

    2015-01-01

    Stereotyped sequences of neural activity are thought to underlie reproducible behaviors and cognitive processes ranging from memory recall to arm movement. One of the most prominent theoretical models of neural sequence generation is the synfire chain, in which pulses of synchronized spiking activity propagate robustly along a chain of cells connected by highly redundant feedforward excitation. But recent experimental observations in the avian song production pathway during song generation have shown excitatory activity interacting strongly with the firing patterns of inhibitory neurons, suggesting a process of sequence generation more complex than feedforward excitation. Here we propose a model of sequence generation inspired by these observations in which a pulse travels along a spatially recurrent excitatory chain, passing repeatedly through zones of local feedback inhibition. In this model, synchrony and robust timing are maintained not through redundant excitatory connections, but rather through the interaction between the pulse and the spatiotemporal pattern of inhibition that it creates as it circulates the network. These results suggest that spatially and temporally structured inhibition may play a key role in sequence generation. PMID:26536029

  5. Sequencing, De novo Assembly, Functional Annotation and Analysis of Phyllanthus amarus Leaf Transcriptome Using the Illumina Platform

    PubMed Central

    Bose Mazumdar, Aparupa; Chattopadhyay, Sharmila

    2016-01-01

    Phyllanthus amarus Schum. and Thonn., a widely distributed annual medicinal herb, has been used in traditional medicine for over 2000 years. However, the lack of genomic data for P. amarus, a non-model organism, hinders research at the molecular level. In the present study, high-throughput sequencing technology has been employed to improve understanding of this herb and provide comprehensive genomic information for future work. Here, the P. amarus leaf transcriptome was sequenced using the Illumina MiSeq platform. We assembled 85,927 non-redundant (nr) “unitranscript” sequences with an average length of 1548 bp, from 18,060,997 raw reads. Sequence similarity analyses and annotation of these unitranscripts were performed against databases including the green plants nr protein database, Gene Ontology (GO), Clusters of Orthologous Groups (COG), PlnTFDB, and KEGG. As a result, 69,394 GO terms, 583 enzyme codes (EC), 134 KEGG maps, and 59 Transcription Factor (TF) families were generated. Functional and comparative analyses of assembled unitranscripts were also performed with the most closely related species, such as Populus trichocarpa and Ricinus communis, using TRAPID. KEGG analysis showed that a number of assembled unitranscripts were involved in secondary metabolite pathways, mainly the phenylpropanoid, flavonoid, terpenoid, alkaloid, and lignan biosynthetic pathways, which have significant medicinal attributes. Further, Fragments Per Kilobase of transcript per Million mapped reads (FPKM) values of the identified secondary metabolite pathway genes were determined, and Reverse Transcription PCR (RT-PCR) of a few of these genes was performed to validate the de novo assembled leaf transcriptome dataset. In addition, 65,273 simple sequence repeats (SSRs) were identified. To the best of our knowledge, this is the first transcriptomic dataset of P. amarus to date. Our study provides the largest genetic resource that will lead to drug development and pave

  6. The 2013 seismic sequence close to gas injection platform of the Castor project, offshore Spain

    NASA Astrophysics Data System (ADS)

    Cesca, Simone; Grigoli, Francesco; Heimann, Sebastian; Gonzalez, Alvaro; Buforn, Elisa; Maghsoudi, Samira; Blanch, Estefania; Dahm, Torsten

    2014-05-01

    A spatially localized seismic sequence originated a few tens of kilometres off the Mediterranean coast of Spain, starting on September 5, 2013, and lasting at least until October 2013. The sequence culminated in a maximal moment magnitude Mw 4.3 earthquake on October 1, 2013. The epicentral region is located near the offshore platform of the Castor project, where gas is conducted through a pipeline from the mainland and where it was recently injected into a depleted oil reservoir at about 2 km depth. We analyse the temporal evolution of the seismic sequence and use full waveform techniques to derive absolute and relative locations, estimate depths and focal mechanisms for the largest events in the sequence (with magnitude mbLg larger than 3), and compare them to a previous event (April 8, 2012, mbLg 3.3) that took place in the same region prior to the gas injection. Moment tensor inversion results show that the overall seismicity in this sequence is characterized by oblique mechanisms with a normal fault component, with a 30° low-dip angle plane oriented NNE-SSW and a sub-vertical plane oriented NW-SE. The combined analysis of hypocentral location and focal mechanisms could indicate that the seismic sequence corresponds to rupture processes along sub-horizontal shallow surfaces, which could have been triggered by the gas injection in the reservoir. An alternative scenario includes the iterated triggering of a system of steep faults oriented NW-SE, which were identified by prior marine seismic investigations. The most relevant seismogenic feature in the area is the Fosa de Amposta fault system, which includes different strands mapped at different distances from the coast, with a general NE-SW orientation, roughly parallel to the coastline. No significant known historical seismicity has involved this fault in the past. Both of our scenarios exclude its activation, as its known orientation is inconsistent with the focal mechanism results.

  7. [The application of next generation sequencing on epigenetic study].

    PubMed

    Shen, Sheng; Qu, Yanchun; Zhang, Jun

    2014-03-01

    The application of next generation sequencing (NGS) techniques has had a great impact on epigenetic studies. Coupled with NGS, a number of sequencing-based methodologies have been developed and applied in epigenetic studies, such as Whole Genome Bisulfite Sequencing (WGBS), Reduced Representation Bisulfite Sequencing (RRBS), Methylated DNA Immunoprecipitation Sequencing (MeDIP-seq), Chromatin Immunoprecipitation Sequencing (ChIP-seq), Tet-assisted Bisulfite Sequencing (TAB-seq), Chromosome Conformation Capture Sequencing (3C-seq) and various 3C-seq derivatives, DNase1-seq/MNase-seq/FAIRE-seq, and RNA Sequencing (RNA-seq). These new techniques were used to identify DNA methylation patterns and a broad range of protein/nucleic acid interactions, and to analyze chromatin conformation. With these new technologies, researchers have gained a broader view and better tools to investigate the distributions and dynamic changes of epigenetic markers affected by both internal and external factors. The principles and characteristics of major applications of NGS technologies in epigenetics were summarized, and the recent advances and future directions in NGS-based epigenetic studies were further discussed. PMID:24846966

  8. Palindromic sequence artifacts generated during next generation sequencing library preparation from historic and ancient DNA.

    PubMed

    Star, Bastiaan; Nederbragt, Alexander J; Hansen, Marianne H S; Skage, Morten; Gilfillan, Gregor D; Bradbury, Ian R; Pampoulie, Christophe; Stenseth, Nils Chr; Jakobsen, Kjetill S; Jentoft, Sissel

    2014-01-01

    Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5'- and 3'-ends of sequencing reads. The palindromic sequences themselves have specific properties: the bases at the 5'-end align well to the reference genome, whereas extensive misalignments exist among the bases at the terminal 3'-end. The terminal 3' bases are artificial extensions likely caused by the occurrence of hairpin loops in single stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3'-end of DNA strands, with the 5'-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias. PMID:24608104
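
    The artifact described in this record, a 3'-end that is the reverse complement of the 5'-end of the same read, can be illustrated with a minimal sketch. The function names, the exact-match criterion, and the `min_len` threshold are assumptions made for illustration; they are not the authors' detection method, and real reads would likely need some tolerance for mismatches.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def terminal_palindrome_length(read: str, min_len: int = 4) -> int:
    """Length of the longest 3'-terminal stretch that exactly equals the
    reverse complement of the 5'-terminal stretch of the same length.

    Returns 0 when no stretch of at least `min_len` bases matches.
    `min_len` is a hypothetical threshold chosen for this example.
    """
    read = read.upper()
    best = 0
    # The two terminal stretches must not overlap, so k is at most len // 2.
    for k in range(min_len, len(read) // 2 + 1):
        if read[-k:] == reverse_complement(read[:k]):
            best = k
    return best


# A read whose last 8 bases mirror its first 8 bases, separated by a spacer
# (an "interrupted" palindrome).
five_prime = "GATTACAG"
artifact_read = five_prime + "TTTTT" + reverse_complement(five_prime)
print(terminal_palindrome_length(artifact_read))
```

    Reads flagged this way could then be trimmed or excluded before alignment, though the appropriate handling depends on the library protocol.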

  9. Manipulating attentional load in sequence learning through random number generation

    PubMed Central

    Wierzchoń, Michał; Gaillard, Vinciane; Asanowicz, Dariusz; Cleeremans, Axel

    2012-01-01

    Implicit learning is often assumed to be an effortless process. However, some artificial grammar learning and sequence learning studies using dual tasks seem to suggest that attention is essential for implicit learning to occur. This discrepancy probably results from the specific type of secondary task that is used. Different secondary tasks may engage attentional resources differently and therefore may bias performance on the primary task in different ways. Here, we used a random number generation (RNG) task, which may allow for a closer monitoring of a participant’s engagement in a secondary task than the popular secondary task in sequence learning studies: tone counting (TC). In the first two experiments, we investigated the interference associated with performing RNG concurrently with a serial reaction time (SRT) task. In a third experiment, we compared the effects of RNG and TC. In all three experiments, we directly evaluated participants’ knowledge of the sequence with a subsequent sequence generation task. Sequence learning was consistently observed in all experiments, but was impaired under dual-task conditions. Most importantly, our data suggest that RNG is more demanding and impairs learning to a greater extent than TC. Nevertheless, we failed to observe effects of the secondary task in subsequent sequence generation. Our studies indicate that RNG is a promising task to explore the involvement of attention in the SRT task. PMID:22723816

  10. Clinical Next Generation Sequencing for Precision Medicine in Cancer.

    PubMed

    Dong, Ling; Wang, Wanheng; Li, Alvin; Kansal, Rina; Chen, Yuhan; Chen, Hong; Li, Xinmin

    2015-08-01

    Rapid adoption of next generation sequencing (NGS) in genomic medicine has been driven by low cost, high throughput sequencing and rapid advances in our understanding of the genetic bases of human diseases. Today, the NGS method has dominated sequencing space in genomic research, and quickly entered clinical practice. Because unique features of NGS perfectly meet the clinical reality (need to do more with less), the NGS technology is becoming a driving force to realize the dream of precision medicine. This article describes the strengths of NGS, NGS panels used in precision medicine, current applications of NGS in cytology, and its challenges and future directions for routine clinical use. PMID:27006629

  11. Clinical Next Generation Sequencing for Precision Medicine in Cancer

    PubMed Central

    Dong, Ling; Wang, Wanheng; Li, Alvin; Kansal, Rina; Chen, Yuhan; Chen, Hong; Li, Xinmin

    2015-01-01

    Rapid adoption of next generation sequencing (NGS) in genomic medicine has been driven by low cost, high throughput sequencing and rapid advances in our understanding of the genetic bases of human diseases. Today, the NGS method has dominated sequencing space in genomic research, and quickly entered clinical practice. Because unique features of NGS perfectly meet the clinical reality (need to do more with less), the NGS technology is becoming a driving force to realize the dream of precision medicine. This article describes the strengths of NGS, NGS panels used in precision medicine, current applications of NGS in cytology, and its challenges and future directions for routine clinical use. PMID:27006629

  12. Cross-platform compatibility of Hi-Plex, a streamlined approach for targeted massively parallel sequencing.

    PubMed

    Nguyen-Dumont, Tú; Pope, Bernard J; Hammet, Fleur; Mahmoodi, Maryam; Tsimiklis, Helen; Southey, Melissa C; Park, Daniel J

    2013-11-15

    Although per-base sequencing costs have decreased during recent years, library preparation for targeted massively parallel sequencing remains constrained by high reagent cost, limited design flexibility, and protocol complexity. To address these limitations, we previously developed Hi-Plex, a polymerase chain reaction (PCR) massively parallel sequencing strategy for screening panels of genomic target regions. Here, we demonstrate that Hi-Plex applied with hybrid adapters can generate a library suitable for sequencing with both the Ion Torrent and the TruSeq chemistries and that adjusting primer concentrations improves coverage uniformity. These results expand Hi-Plex capabilities as an accurate, affordable, flexible, and rapid approach for various genetic screening applications. PMID:23933242

  13. Cross-platform compatibility of Hi-Plex, a streamlined approach for targeted massively parallel sequencing

    PubMed Central

    Nguyen-Dumont, Tú; Pope, Bernard J.; Hammet, Fleur; Mahmoodi, Maryam; Tsimiklis, Helen; Southey, Melissa C.; Park, Daniel J.

    2013-01-01

    Although per-base sequencing costs have decreased during recent years, library preparation for targeted massively parallel sequencing remains constrained by high reagent cost, limited design flexibility, and protocol complexity. To address these limitations, we previously developed Hi-Plex, a polymerase chain reaction (PCR) massively parallel sequencing strategy for screening panels of genomic target regions. Here, we demonstrate that Hi-Plex applied with hybrid adapters can generate a library suitable for sequencing with both the Ion Torrent and the TruSeq chemistries and that adjusting primer concentrations improves coverage uniformity. These results expand Hi-Plex capabilities as an accurate, affordable, flexible, and rapid approach for various genetic screening applications. PMID:23933242

  14. DNA immunoprecipitation semiconductor sequencing (DIP-SC-seq) as a rapid method to generate genome wide epigenetic signatures.

    PubMed

    Thomson, John P; Fawkes, Angie; Ottaviano, Raffaele; Hunter, Jennifer M; Shukla, Ruchi; Mjoseng, Heidi K; Clark, Richard; Coutts, Audrey; Murphy, Lee; Meehan, Richard R

    2015-01-01

    Modification of DNA resulting in 5-methylcytosine (5 mC) or 5-hydroxymethylcytosine (5hmC) has been shown to influence the local chromatin environment and affect transcription. Although recent advances in next generation sequencing technology allow researchers to map epigenetic modifications across the genome, such experiments are often time-consuming and cost prohibitive. Here we present a rapid and cost effective method of generating genome wide DNA modification maps utilising commercially available semiconductor based technology (DNA immunoprecipitation semiconductor sequencing; "DIP-SC-seq") on the Ion Proton sequencer. Focussing on the 5hmC mark we demonstrate, by directly comparing with alternative sequencing strategies, that this platform can successfully generate genome wide 5hmC patterns from as little as 500 ng of genomic DNA in less than 4 days. Such a method can therefore facilitate the rapid generation of multiple genome wide epigenetic datasets. PMID:25985418

  15. DNA immunoprecipitation semiconductor sequencing (DIP-SC-seq) as a rapid method to generate genome wide epigenetic signatures

    PubMed Central

    Thomson, John P.; Fawkes, Angie; Ottaviano, Raffaele; Hunter, Jennifer M.; Shukla, Ruchi; Mjoseng, Heidi K.; Clark, Richard; Coutts, Audrey; Murphy, Lee; Meehan, Richard R.

    2015-01-01

    Modification of DNA resulting in 5-methylcytosine (5 mC) or 5-hydroxymethylcytosine (5hmC) has been shown to influence the local chromatin environment and affect transcription. Although recent advances in next generation sequencing technology allow researchers to map epigenetic modifications across the genome, such experiments are often time-consuming and cost prohibitive. Here we present a rapid and cost effective method of generating genome wide DNA modification maps utilising commercially available semiconductor based technology (DNA immunoprecipitation semiconductor sequencing; “DIP-SC-seq”) on the Ion Proton sequencer. Focussing on the 5hmC mark we demonstrate, by directly comparing with alternative sequencing strategies, that this platform can successfully generate genome wide 5hmC patterns from as little as 500 ng of genomic DNA in less than 4 days. Such a method can therefore facilitate the rapid generation of multiple genome wide epigenetic datasets. PMID:25985418

  16. Using Illumina next generation sequencing technologies to sequence multigene families in de novo species.

    PubMed

    Hughes, Graham M; Gang, Li; Murphy, William J; Higgins, Desmond G; Teeling, Emma C

    2013-05-01

    The advent of Next Generation Sequencing Technology (NGST) has revolutionized molecular biology research, allowing for rapid gene/genome sequencing from a multitude of diverse species. As high throughput sequencing becomes more accessible, more efficient workflows must be developed to deal with the amounts of data produced and better assemble the genomes of de novo lineages. We combine traditional laboratory methods with Illumina NGST to amplify and sequence the largest mammalian multigene family, the Olfactory Receptor gene family, for species with and without a reference genome. We develop novel assembly methods to annotate and filter these data, which can be utilized for any gene family or any species. We find no significant difference between the ratio of genes within their respective gene families of our data compared with available genomic data. Using simulated data we explore the limitations of short-read sequence data and our assembly in recovering this gene family. We highlight the benefits and shortcomings of these methods. Compared with data generated from traditional polymerase chain reaction, cloning and Sanger sequencing methodologies, sequence data generated using our pipeline increases yield and sequencing efficiency without reducing the number of unique genes amplified. A cloning step is not required, therefore shortening data generation time. The novel downstream methodologies and workflows described provide a tool to be utilized by many fields of biology, to access and analyze the vast quantities of data generated. By combining laboratory and in silico methods, we provide a means of extracting genomic information for multigene families without complete genome sequencing. PMID:23480365

  17. Next-Generation Sequencing in the Understanding of Kaposi's Sarcoma-Associated Herpesvirus (KSHV) Biology.

    PubMed

    Strahan, Roxanne; Uppal, Timsy; Verma, Subhash C

    2016-01-01

    Non-Sanger-based novel nucleic acid sequencing techniques, referred to as Next-Generation Sequencing (NGS), provide a rapid, reliable, high-throughput, and massively parallel sequencing methodology that has improved our understanding of human cancers and cancer-related viruses. NGS has become a quintessential research tool for more effective characterization of complex viral and host genomes through its ever-expanding repertoire, which consists of whole-genome sequencing, whole-transcriptome sequencing, and whole-epigenome sequencing. These new NGS platforms provide a comprehensive and systematic genome-wide analysis of genomic sequences and a full transcriptional profile at a single nucleotide resolution. When combined, these techniques help unlock the function of novel genes and the related pathways that contribute to the overall viral pathogenesis. Ongoing research in the field of virology endeavors to identify the role of various underlying mechanisms that control the regulation of the herpesvirus biphasic lifecycle in order to discover potential therapeutic targets and treatment strategies. In this review, we have compiled the most recent findings about the application of NGS in Kaposi's sarcoma-associated herpesvirus (KSHV) biology, including identification of novel genomic features and whole-genome KSHV diversities, global gene regulatory network profiling for intricate transcriptome analyses, and surveying of epigenetic marks (DNA methylation, modified histones, and chromatin remodelers) during de novo, latent, and productive KSHV infections. PMID:27043613

  18. A resampling procedure for generating conditioned daily weather sequences

    USGS Publications Warehouse

    Clark, M.P.; Gangopadhyay, S.; Brandon, D.; Werner, K.; Hay, L.; Rajagopalan, B.; Yates, D.

    2004-01-01

    A method is introduced to generate conditioned daily precipitation and temperature time series at multiple stations. The method resamples data from the historical record "nens" times for the period of interest (nens = number of ensemble members) and reorders the ensemble members to reconstruct the observed spatial (intersite) and temporal correlation statistics. The weather generator model is applied to 2307 stations in the contiguous United States and is shown to reproduce the observed spatial correlation between neighboring stations, the observed correlation between variables (e.g., between precipitation and temperature), and the observed temporal correlation between subsequent days in the generated weather sequence. The weather generator model is extended to produce sequences of weather that are conditioned on climate indices (in this case the Niño 3.4 index). Example illustrations of conditioned weather sequences are provided for a station in Arizona (Petrified Forest, 34.8°N, 109.9°W), where El Niño and La Niña conditions have a strong effect on winter precipitation. The conditioned weather sequences generated using the methods described in this paper are appropriate for use as input to hydrologic models to produce multiseason forecasts of streamflow.
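
    The resample-then-reorder idea in this record can be sketched as follows. This is only an illustration under stated assumptions, not the paper's algorithm: the function names, the rank-matching rule, and the use of one shared set of randomly chosen historical dates as the ordering template are all hypothetical. Conditioning on a climate index could, for instance, be approximated by restricting `history` to years with a similar index value, which is also an assumption.

```python
import random


def rank_order(values):
    """ranks[i] is the rank of values[i], with 0 for the smallest value."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks


def reorder_like(sample, template):
    """Reorder `sample` so that its rank pattern matches that of `template`."""
    ordered = sorted(sample)
    return [ordered[rank] for rank in rank_order(template)]


def resampled_ensemble(history, nens, rng):
    """Build `nens` ensemble members per station.

    history: dict mapping station name -> list of values, where index d
             refers to the same historical date at every station.
    Each station is resampled independently, then reordered to follow the
    rank pattern observed on a shared set of historical dates, which carries
    the intersite rank structure into the ensemble.
    """
    n_dates = len(next(iter(history.values())))
    template_dates = [rng.randrange(n_dates) for _ in range(nens)]
    ensemble = {}
    for station, series in history.items():
        sample = [rng.choice(series) for _ in range(nens)]
        template = [series[d] for d in template_dates]
        ensemble[station] = reorder_like(sample, template)
    return ensemble


# Toy record for two stations (made-up numbers, for illustration only).
history = {
    "station_a": [0.0, 1.2, 3.4, 0.0, 5.6, 2.1, 0.3, 4.4],
    "station_b": [0.1, 1.0, 3.0, 0.0, 6.1, 2.5, 0.2, 4.0],
}
members = resampled_ensemble(history, nens=5, rng=random.Random(0))
```

    Because only the ordering of each station's resampled values changes, the marginal distribution of the resample at each station is left untouched.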

  19. Generation and analysis of expressed sequence tags from the ciliate protozoan parasite Ichthyophthirius multifiliis

    PubMed Central

    Abernathy, Jason W; Xu, Peng; Li, Ping; Xu, De-Hai; Kucuktas, Huseyin; Klesius, Phillip; Arias, Covadonga; Liu, Zhanjiang

    2007-01-01

    Background The ciliate protozoan Ichthyophthirius multifiliis (Ich) is an important parasite of freshwater fish that causes 'white spot disease' leading to significant losses. A genomic resource for large-scale studies of this parasite has been lacking. To study gene expression involved in Ich pathogenesis and virulence, our goal was to generate expressed sequence tags (ESTs) for the development of a powerful microarray platform for the analysis of global gene expression in this species. Here, we initiated a project to sequence and analyze over 10,000 ESTs. Results We sequenced 10,368 EST clones using a normalized cDNA library made from pooled samples of the trophont, tomont, and theront life-cycle stages, and generated 9,769 sequences (94.2% success rate). Post-sequencing processing led to 8,432 high quality sequences. Clustering analysis of these ESTs allowed identification of 4,706 unique sequences containing 976 contigs and 3,730 singletons. These unique sequences represent over two million base pairs (~10% of Plasmodium falciparum genome, a phylogenetically related protozoan). BLASTX searches produced 2,518 significant (E-value < 10⁻⁵) hits and further Gene Ontology (GO) analysis annotated 1,008 of these genes. The ESTs were analyzed comparatively against the genomes of the related protozoa Tetrahymena thermophila and P. falciparum, allowing putative identification of additional genes. All the EST sequences were deposited by dbEST in GenBank (GenBank: EG957858–EG966289). Gene discovery and annotations are presented and discussed. Conclusion This set of ESTs represents a significant proportion of the Ich transcriptome, and provides a material basis for the development of microarrays useful for gene expression studies concerning Ich development, pathogenesis, and virulence. PMID:17577414
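
    The counts reported in this record are internally consistent, as a short arithmetic check shows. The variable names are chosen here for illustration; the numbers are taken directly from the abstract.

```python
# Figures quoted in the abstract.
clones_sequenced = 10_368
sequences_generated = 9_769
contigs = 976
singletons = 3_730
blastx_hits = 2_518
go_annotated = 1_008

# Sequencing success rate, as a percentage of clones sequenced.
success_rate = 100 * sequences_generated / clones_sequenced

# Unique sequences are the contigs plus the singletons.
unique_sequences = contigs + singletons

# Share of significant BLASTX hits that received a GO annotation.
go_fraction = 100 * go_annotated / blastx_hits

print(f"success rate: {success_rate:.1f}%")
print(f"unique sequences: {unique_sequences}")
print(f"GO-annotated share of BLASTX hits: {go_fraction:.1f}%")
```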

  20. Use of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells.

    PubMed

    Xin, Yurong; Kim, Jinrang; Ni, Min; Wei, Yi; Okamoto, Haruka; Lee, Joseph; Adler, Christina; Cavino, Katie; Murphy, Andrew J; Yancopoulos, George D; Lin, Hsin Chieh; Gromada, Jesper

    2016-03-22

    This study provides an assessment of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells. The system combines microfluidic technology and nanoliter-scale reactions. We sequenced 622 cells, allowing identification of 341 islet cells with high-quality gene expression profiles. The cells clustered into populations of α-cells (5%), β-cells (92%), δ-cells (1%), and pancreatic polypeptide cells (2%). We identified cell-type-specific transcription factors and pathways primarily involved in nutrient sensing and oxidation and cell signaling. Unexpectedly, 281 cells had to be removed from the analysis due to low viability, low sequencing quality, or contamination resulting in the detection of more than one islet hormone. Collectively, we provide a resource for identification of high-quality gene expression datasets to help expand insights into genes and pathways characterizing islet cell types. We reveal limitations in the C1 Fluidigm cell capture process resulting in contaminated cells with altered gene expression patterns. This calls for caution when interpreting single-cell transcriptomics data using the C1 Fluidigm system. PMID:26951663

  1. Nanomicroarray and Multiplex Next-Generation Sequencing for Simultaneous Identification and Characterization of Influenza Viruses

    PubMed Central

    Ragupathy, Viswanath; Liu, Jikun; Wang, Xue; Vemula, Sai Vikram; El Mubarak, Haja Sittana; Ye, Zhiping; Landry, Marie L.

    2015-01-01

    Conventional methods for detection and discrimination of influenza viruses are time consuming and labor intensive. We developed a diagnostic platform for simultaneous identification and characterization of influenza viruses that uses a combination of nanomicroarray for screening and multiplex next-generation sequencing (NGS) assays for laboratory confirmation. The nanomicroarray was developed to target hemagglutinin, neuraminidase, and matrix genes to identify influenza A and B viruses. PCR amplicons synthesized by using an adapted universal primer for all 8 gene segments of 9 influenza A subtypes were detected in the nanomicroarray and confirmed by the NGS assays. This platform can simultaneously detect and differentiate multiple influenza A subtypes in a single sample. Use of these methods as part of a new diagnostic algorithm for detection and confirmation of influenza infections may provide ongoing public health benefits by assisting with future epidemiologic studies and improving preparedness for potential influenza pandemics. PMID:25694248

  2. Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline

    PubMed Central

    2014-01-01

    Background Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. Results To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. Conclusions By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples. PMID:24475911

  3. Application of next-generation sequencing in gastrointestinal and liver tumors.

    PubMed

    Mikhail, Sameh; Faltas, Bishoy; Salem, Mohamed E; Bekaii-Saab, Tanios

    2016-05-01

    Malignant transformation of normal cells is associated with the evolution of genomic alterations. This concept has led to the development of molecular testing platforms to identify genomic alterations that can be targeted with novel therapies. Next generation sequencing (NGS) has heralded a new era in precision medicine in which tumor genes can be studied efficiently. Recent developments in NGS have allowed investigators to identify genomic predictive markers and hereditary mutations to guide treatment decisions. The application of NGS in gastrointestinal cancers is being extensively studied but continues to face substantial challenges. In our review, we discuss various NGS platforms and highlight their role in identifying familial mutations and markers of response or resistance to cancer therapy. We also provide a balanced discussion of the challenges that limit the routine use of NGS in clinical practice. PMID:26916979

  4. Visualizing next-generation sequencing data with JBrowse

    PubMed Central

    Westesson, Oscar; Skinner, Mitchell

    2013-01-01

    JBrowse is a web-based genome browser, allowing many sources of data to be visualized, interpreted and navigated in a coherent visual framework. JBrowse uses efficient data structures, pre-generation of image tiles and client-side rendering to provide a fast, interactive browsing experience. Many of JBrowse's design features make it well suited for visualizing high-volume data, such as aligned next-generation sequencing reads. PMID:22411711

  5. Pattern Recognition on Read Positioning in Next Generation Sequencing

    PubMed Central

    Byeon, Boseon; Kovalchuk, Igor

    2016-01-01

    The usefulness and the utility of the next generation sequencing (NGS) technology are based on the assumption that the DNA or cDNA cleavage required to generate short sequence reads is random. Several previous reports suggest the existence of sequencing bias of NGS reads. To address this question in greater detail, we analyze NGS data from four organisms with different GC content, Plasmodium falciparum (19.39%), Arabidopsis thaliana (36.03%), Homo sapiens (40.91%) and Streptomyces coelicolor (72.00%). Using machine learning techniques, we recognize the pattern that the NGS read start is positioned in the local region where the nucleotide distribution is dissimilar from the global nucleotide distribution. We also demonstrate that the mono-nucleotide distribution underestimates sequencing bias, and the recognized pattern is explained largely by the distribution of multi-nucleotides (di-, tri-, and tetra- nucleotides) rather than mono-nucleotides. This implies that the correction of sequencing bias needs to be performed on the basis of the multi-nucleotide distribution. Providing companion software to quantify the effect of the recognized pattern on read positioning, we exemplify that the bias correction based on the mono-nucleotide distribution may not be sufficient to clean sequencing bias. PMID:27299343
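
    The claim that mono-nucleotide distributions can underestimate bias that multi-nucleotide distributions reveal can be illustrated with a small sketch. The study itself uses machine learning techniques and companion software; the k-mer counting, the total variation distance, and the toy sequences below are assumptions chosen for illustration only.

```python
from collections import Counter


def kmer_distribution(seq: str, k: int) -> dict:
    """Relative frequency of each overlapping k-mer in `seq`."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def total_variation(p: dict, q: dict) -> float:
    """Total variation distance between two discrete distributions (0 to 1)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in keys)


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in `seq`."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy "global" sequence and a "local" window around a hypothetical read start.
global_seq = "ACGT" * 10
local_window = "AACCGGTT"

# Both have identical mono-nucleotide composition ...
mono = total_variation(kmer_distribution(local_window, 1),
                       kmer_distribution(global_seq, 1))
# ... but clearly different di-nucleotide composition.
di = total_variation(kmer_distribution(local_window, 2),
                     kmer_distribution(global_seq, 2))
print(mono, di)
```

    In this toy case the mono-nucleotide distance is zero while the di-nucleotide distance is not, which is the kind of discrepancy the abstract points to when it argues for correcting bias on the multi-nucleotide distribution.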

  6. PepTool and GeneTool: platform-independent tools for biological sequence analysis.

    PubMed

    Wishart, D S; Stothard, P; Van Domselaar, G H

    2000-01-01

    Although we are unable to discuss all of the functionality available in PepTool and GeneTool, it should be evident from this brief review that both packages offer a great deal in terms of functionality and ease-of-use. Furthermore, a number of useful innovations including platform-independent GUI design, networked parallelism, direct internet connectivity, database compression, and a variety of enhanced or improved algorithms should make these two programs particularly useful in the rapidly changing world of biological sequence analysis. More complete descriptions of the programs, algorithms and operation of PepTool and GeneTool are available on the BioTools web site (www.biotools.com), in the associated program user manuals and in the on-line Help pages. PMID:10547833

  7. An Evolution Based Biosensor Receptor DNA Sequence Generation Algorithm

    PubMed Central

    Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M.; Lee, Jaewan; Zang, Yupeng

    2010-01-01

    A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements. PMID:22315543

  8. Generation of allocation sequences in randomised trials: chance, not choice.

    PubMed

    Schulz, Kenneth F; Grimes, David A

    2002-02-01

    The randomised controlled trial sets the gold standard of clinical research. However, randomisation persists as perhaps the least-understood aspect of a trial. Moreover, anything short of proper randomisation courts selection and confounding biases. Researchers should spurn all systematic, non-random methods of allocation. Trial participants should be assigned to comparison groups based on a random process. Simple (unrestricted) randomisation, analogous to repeated fair coin-tossing, is the most basic of sequence generation approaches. Furthermore, no other approach, irrespective of its complexity and sophistication, surpasses simple randomisation for prevention of bias. Investigators should, therefore, use this method more often than they do, and readers should expect and accept disparities in group sizes. Several other complicated restricted randomisation procedures limit the likelihood of undesirable sample size imbalances in the intervention groups. The most frequently used restricted sequence generation procedure is blocked randomisation. If this method is used, investigators should randomly vary the block sizes and use larger block sizes, particularly in an unblinded trial. Other restricted procedures, such as urn randomisation, combine beneficial attributes of simple and restricted randomisation by preserving most of the unpredictability while achieving some balance. The effectiveness of stratified randomisation depends on use of a restricted randomisation approach to balance the allocation sequences for each stratum. Generation of a proper randomisation sequence takes little time and effort but affords big rewards in scientific accuracy and credibility. Investigators should devote appropriate resources to the generation of properly randomised trials and reporting their methods clearly. PMID:11853818
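
    The two sequence generation approaches named above can be sketched as follows. This is a minimal illustration and not a validated trial tool; the function names, the arm labels and the `block_multiples` parameter are assumptions, and a real trial would also need allocation concealment and an auditable random source.

```python
import random


def simple_randomisation(n, arms=("A", "B"), seed=None):
    """Unrestricted randomisation: each participant is an independent fair draw."""
    rng = random.Random(seed)
    return [rng.choice(arms) for _ in range(n)]


def blocked_randomisation(n, arms=("A", "B"), block_multiples=(2, 3, 4), seed=None):
    """Permuted-block randomisation with randomly varied block sizes.

    Each block holds `per_arm` copies of every arm, shuffled, so every
    complete block is balanced. `block_multiples` is a hypothetical
    parameter listing the allowed values of `per_arm`.
    """
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n:
        per_arm = rng.choice(block_multiples)  # vary the block size at random
        block = list(arms) * per_arm
        rng.shuffle(block)
        sequence.extend(block)
    return sequence[:n]  # the final block may be truncated


simple = simple_randomisation(40, seed=1)
blocked = blocked_randomisation(40, seed=1)
print("simple :", simple.count("A"), "vs", simple.count("B"))
print("blocked:", blocked.count("A"), "vs", blocked.count("B"))
```

    Simple randomisation may leave the groups unequal in size, which the abstract argues readers should accept; the blocked variant bounds the imbalance by the largest per-arm block multiple.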

  9. Microfluidic platform for isolating nucleic acid targets using sequence specific hybridization

    PubMed Central

    Wang, Jingjing; Morabito, Kenneth; Tang, Jay X.; Tripathi, Anubhav

    2013-01-01

    The separation of target nucleic acid sequences from biological samples has emerged as a significant process in today's diagnostics and detection strategies. In addition to the possible clinical applications, the fundamental understanding of target and sequence specific hybridization on surface modified magnetic beads is of high value. In this paper, we describe a novel microfluidic platform that utilizes a mobile magnetic field in static microfluidic channels, where single stranded DNA (ssDNA) molecules are isolated via nucleic acid hybridization. We first established efficient isolation of biotinylated capture probe (BP) using streptavidin-coated magnetic beads. Subsequently, we investigated the hybridization of target ssDNA with BP bound to beads and explained these hybridization kinetics using a dual-species kinetic model. The number of hybridized target ssDNA molecules was determined to be about 6.5 times less than that of BP on the bead surface, due to steric hindrance effects. The hybridization of target ssDNA with non-complementary BP bound to beads was also examined, and non-specific hybridization was found to be insignificant. Finally, we demonstrated highly efficient capture and isolation of target ssDNA in the presence of non-target ssDNA, where as little as 1% target ssDNA can be detected in a mixture. The microfluidic method described in this paper is broadly applicable, especially towards point-of-care biological diagnostic platforms that require binding and separation of known target biomolecules, such as RNA, ssDNA, or protein. PMID:24404041

  10. Next Generation Sequencing to Characterize Mitochondrial Genomic DNA Heteroplasmy

    PubMed Central

    Huang, Taosheng

    2015-01-01

    This protocol describes a methodology for characterizing mitochondrial DNA (mtDNA) heteroplasmy with parallel sequencing. Mitochondria play a very important role in cellular functions. Each eukaryotic cell contains hundreds of mitochondria with hundreds of mitochondrial genomes. Mutant and wild-type mtDNA may co-exist as heteroplasmy and cause human disease. The purpose of this methodology is to simultaneously determine the mtDNA sequence and quantify the heteroplasmy level. The protocol includes PCR amplification of the mitochondrial genome in two fragments. The PCR products are then mixed at an equimolar ratio. The samples are barcoded and sequenced with high-throughput next-generation sequencing technology. We found that this technology is highly sensitive, specific, and accurate in determining mtDNA mutations and the level of heteroplasmy. PMID:21975941

  11. Application of next generation sequencing technology in Mendelian movement disorders.

    PubMed

    Wang, Yumin; Pan, Xuya; Xue, Dan; Li, Yuwei; Zhang, Xueying; Kuang, Biao; Zheng, Jiabo; Deng, Hao; Li, Xiaoling; Xiong, Wei; Zeng, Zhaoyang; Li, Guiyuan

    2016-02-01

    Next generation sequencing (NGS) has developed very rapidly in the last decade. Compared with Sanger sequencing, NGS has the advantages of high sensitivity and high throughput. Movement disorders are a common type of neurological disease. Although traditional linkage analysis has become a standard method to identify the pathogenic genes in diseases, it is getting difficult to find new pathogenic genes in rare Mendelian disorders, such as movement disorders, due to a lack of appropriate families with high penetrance or enough affected individuals. Thus, NGS is an ideal approach to identify the causal alleles for inherited disorders. NGS is used to identify genes in several diseases and new mutant sites in Mendelian movement disorders. This article reviewed the recent progress in NGS and the use of NGS in Mendelian movement disorders from genome sequencing and transcriptome sequencing. A perspective on how NGS could be employed in rare Mendelian disorders is also provided. PMID:26932219

  12. Plant virology and next generation sequencing: experiences with a Potyvirus.

    PubMed

    Kehoe, Monica A; Coutts, Brenda A; Buirchell, Bevan J; Jones, Roger A C

    2014-01-01

    Next generation sequencing is quickly emerging as the go-to tool for plant virologists when sequencing whole virus genomes, and undertaking plant metagenomic studies for new virus discoveries. This study aims to compare the genomic and biological properties of Bean yellow mosaic virus (BYMV) (genus Potyvirus), isolates from Lupinus angustifolius plants with black pod syndrome (BPS), systemic necrosis or non-necrotic symptoms, and from two other plant species. When one Clover yellow vein virus (ClYVV) (genus Potyvirus) and 22 BYMV isolates were sequenced on the Illumina HiSeq2000, one new ClYVV and 23 new BYMV sequences were obtained. When the 23 new BYMV genomes were compared with 17 other BYMV genomes available on Genbank, phylogenetic analysis provided strong support for existence of nine phylogenetic groupings. Biological studies involving seven isolates of BYMV and one of ClYVV gave no symptoms or reactions that could be used to distinguish BYMV isolates from L. angustifolius plants with black pod syndrome from other isolates. Here, we propose that the current system of nomenclature based on biological properties be replaced by numbered groups (I-IX). This is because use of whole genomes revealed that the previous phylogenetic grouping system based on partial sequences of virus genomes and original isolation hosts was unsustainable. This study also demonstrated that, where next generation sequencing is used to obtain complete plant virus genomes, consideration needs to be given to issues regarding sample preparation, adequate levels of coverage across a genome and methods of assembly. It also provided important lessons that will be helpful to other plant virologists using next generation sequencing in the future. PMID:25102175

  13. Plant Virology and Next Generation Sequencing: Experiences with a Potyvirus

    PubMed Central

    Kehoe, Monica A.; Coutts, Brenda A.; Buirchell, Bevan J.; Jones, Roger A. C.

    2014-01-01

    Next generation sequencing is quickly emerging as the go-to tool for plant virologists when sequencing whole virus genomes, and undertaking plant metagenomic studies for new virus discoveries. This study aims to compare the genomic and biological properties of Bean yellow mosaic virus (BYMV) (genus Potyvirus), isolates from Lupinus angustifolius plants with black pod syndrome (BPS), systemic necrosis or non-necrotic symptoms, and from two other plant species. When one Clover yellow vein virus (ClYVV) (genus Potyvirus) and 22 BYMV isolates were sequenced on the Illumina HiSeq2000, one new ClYVV and 23 new BYMV sequences were obtained. When the 23 new BYMV genomes were compared with 17 other BYMV genomes available on Genbank, phylogenetic analysis provided strong support for existence of nine phylogenetic groupings. Biological studies involving seven isolates of BYMV and one of ClYVV gave no symptoms or reactions that could be used to distinguish BYMV isolates from L. angustifolius plants with black pod syndrome from other isolates. Here, we propose that the current system of nomenclature based on biological properties be replaced by numbered groups (I–IX). This is because use of whole genomes revealed that the previous phylogenetic grouping system based on partial sequences of virus genomes and original isolation hosts was unsustainable. This study also demonstrated that, where next generation sequencing is used to obtain complete plant virus genomes, consideration needs to be given to issues regarding sample preparation, adequate levels of coverage across a genome and methods of assembly. It also provided important lessons that will be helpful to other plant virologists using next generation sequencing in the future. PMID:25102175

  14. Modeling Pseudorandom Sequence Generators using Cellular Automata: The Alternating Step Generator

    NASA Astrophysics Data System (ADS)

    Pazo-Robles, María Eugenia; Fúster-Sabater, Amparo

    2007-12-01

    Stream ciphers are pseudorandom bit generators whose output sequences are combined with the sensitive information by means of a mathematical function, currently addition modulo 2. The Alternating Step Generator is a pseudorandom sequence generator with good cryptographic properties and non-linear structure. In this work, we propose two different ways to model such a generator by using linear and discrete mathematical functions, e.g. Cellular Automata. One of these ways deals with the realization of a linear model from a pair of basic automata provided by the Cattell and Muzio algorithm. The other way is a new approach based on the addition of automata, consisting of the realization of a new automaton with a non-primitive polynomial and short length. Both methods provide linear models able to generate the output sequence of the Alternating Step Generator.
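
    For orientation, the generator being modelled can be sketched directly with three linear feedback shift registers (LFSRs). This is the conventional shift-register description of the Alternating Step Generator, not the cellular automata models proposed in the paper; the `LFSR` class, the tap positions and the seed states are toy assumptions and far too short for cryptographic use.

```python
class LFSR:
    """Fibonacci-style linear feedback shift register over GF(2)."""

    def __init__(self, state, taps):
        self.state = list(state)   # list of 0/1 bits
        self.taps = tuple(taps)    # indices XORed to form the feedback bit

    @property
    def output(self):
        # The register's current output is its last cell.
        return self.state[-1]

    def clock(self):
        feedback = 0
        for t in self.taps:
            feedback ^= self.state[t]
        # Shift right by one cell and insert the feedback bit at the front.
        self.state = [feedback] + self.state[:-1]
        return self.output


def alternating_step_generator(control, reg_a, reg_b, n):
    """Produce n keystream bits.

    The control register is clocked every step. Its output decides which
    of the two other registers advances; the idle register keeps its
    previous output. The keystream bit is the XOR (addition modulo 2)
    of both outputs.
    """
    bits = []
    for _ in range(n):
        if control.clock() == 1:
            reg_a.clock()
        else:
            reg_b.clock()
        bits.append(reg_a.output ^ reg_b.output)
    return bits


keystream = alternating_step_generator(
    LFSR([1, 0], taps=(0, 1)),
    LFSR([1, 0, 0], taps=(0, 2)),
    LFSR([0, 1], taps=(0, 1)),
    n=6,
)
print(keystream)
```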

  15. RAD in the realm of next-generation sequencing technologies.

    PubMed

    Rowe, H C; Renaut, S; Guggisberg, A

    2011-09-01

    The first North American RAD Sequencing and Genomics Symposium, sponsored by Floragenex (http://www.floragenex.com/radmeeting/), took place in Portland, Oregon (USA) on 19 April 2011. This symposium was convened to promote and discuss the use of restriction-site-associated DNA (RAD) sequencing technologies. RAD sequencing is one of several strategies recently developed to increase the power of data generated via short-read sequencing technologies by reducing their complexity (Baird et al. 2008; Huang et al. 2009; Andolfatto et al. 2011; Elshire et al. 2011). RAD sequencing, as a form of genotyping by sequencing, has been effectively applied in genetic mapping and quantitative trait loci (QTL) analyses in a range of organisms including nonmodel, genetically highly heterogeneous organisms (Table 1; Baird et al. 2008; Baxter et al. 2011; Chutimanitsakun et al. 2011; Pfender et al. 2011). RAD sequencing has recently found applications in phylogeography (Emerson et al. 2010) and population genomics (Hohenlohe et al. 2010). Considering the diversity of talks presented during this meeting, more developments are to be expected in the very near future. PMID:21991593

  16. HIVE-Hexagon: High-Performance, Parallelized Sequence Alignment for Next-Generation Sequencing Data Analysis

    PubMed Central

    Santana-Quintero, Luis; Dingerdissen, Hayley; Thierry-Mieg, Jean; Mazumder, Raja; Simonyan, Vahan

    2014-01-01

    Due to the size of Next-Generation Sequencing data, the computational challenge of sequence alignment has been vast. Inexact alignments can take up to 90% of total CPU time in bioinformatics pipelines. High-performance Integrated Virtual Environment (HIVE), a cloud-based environment optimized for storage and analysis of extra-large data, presents an algorithmic solution: the HIVE-hexagon DNA sequence aligner. HIVE-hexagon implements novel approaches to exploit both characteristics of sequence space and CPU, RAM and Input/Output (I/O) architecture to quickly compute accurate alignments. Key components of HIVE-hexagon include non-redundification and sorting of sequences; floating diagonals of linearized dynamic programming matrices; and consideration of cross-similarity to minimize computations. Availability: https://hive.biochemistry.gwu.edu/hive/ PMID:24918764

  17. Automatic generation of primary sequence patterns from sets of related protein sequences.

    PubMed

    Smith, R F; Smith, T F

    1990-01-01

    We have developed a computer algorithm that can extract the pattern of conserved primary sequence elements common to all members of a homologous protein family. The method involves clustering the pairwise similarity scores among a set of related sequences to generate a binary dendrogram (tree). The tree is then reduced in a stepwise manner by progressively replacing the node connecting the two most similar termini by one common pattern until only a single common "root" pattern remains. A pattern is generated at a node by (i) performing a local optimal alignment on the sequence/pattern pair connected by the node with the use of an extended dynamic programming algorithm and then (ii) constructing a single common pattern from this alignment with a nested hierarchy of amino acid classes to identify the minimal inclusive amino acid class covering each paired set of elements in the alignment. Gaps within an alignment are created and/or extended using a "pay once" gap penalty rule, and gapped positions are converted into gap characters that function as 0 or 1 amino acid of any type during subsequent alignment. This method has been used to generate a library of covering patterns for homologous families in the National Biomedical Research Foundation/Protein Identification Resource protein sequence data base. We show that a covering pattern can be more diagnostic for sequence family membership than any of the individual sequences used to construct the pattern. PMID:2296575
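
    Step (ii) above, choosing the minimal inclusive amino acid class for each aligned pair, can be sketched as below. The class table is hypothetical and is not the hierarchy used by the authors; gap handling and the alignment step itself are omitted, so the two inputs are assumed to be already aligned and of equal length.

```python
# Hypothetical amino acid classes for illustration only.
AA_CLASSES = [
    ("acidic", set("DE")),
    ("basic", set("KRH")),
    ("charged", set("DEKRH")),
    ("hydroxyl", set("ST")),
    ("polar", set("DEKRHSTNQ")),
    ("aliphatic", set("ILV")),
    ("aromatic", set("FWY")),
    ("hydrophobic", set("ILVFWYMA")),
    ("any", set("ACDEFGHIKLMNPQRSTVWY")),
]


def minimal_class(a, b):
    """Smallest class containing both residues; identical residues stay as they are."""
    if a == b:
        return a
    candidates = [(len(members), name) for name, members in AA_CLASSES
                  if a in members and b in members]
    return min(candidates)[1]


def covering_pattern(aligned_1, aligned_2):
    """Column-wise covering pattern for two gap-free aligned sequences."""
    if len(aligned_1) != len(aligned_2):
        raise ValueError("sequences must be aligned to equal length")
    return [minimal_class(a, b) for a, b in zip(aligned_1, aligned_2)]


print(covering_pattern("DIGS", "EFWS"))
```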

  18. Identification of virus encoding microRNAs using 454 FLX sequencing platform.

    PubMed

    Kong, Byung-Whi

    2011-01-01

    MicroRNAs are a class of small noncoding RNA molecules that play a pivotal role in the regulation of gene expression at the posttranscriptional level. Most large double-stranded DNA viruses, mainly the herpesvirus family, are known to express miRNAs. Viral miRNAs can regulate both viral- and cellular transcripts. By eliminating cloning steps for large number of Sanger sequencing reactions, recent development of massively parallel next-generation sequencing methods has accelerated identification of small RNA species expressed from viruses, prokaryotes, and eukaryotes. The miRNAs expressed from infectious laryngotracheitis virus (ILTV), which is an alphaherpesvirus belonging to the herpesviridae family and which causes an acute respiratory disorder in chicken, were identified by small RNA enrichment and the 454 FLX sequencing method. PMID:21431764

  19. Histoimmunogenetics Markup Language 1.0: Reporting next generation sequencing-based HLA and KIR genotyping.

    PubMed

    Milius, Robert P; Heuer, Michael; Valiga, Daniel; Doroschak, Kathryn J; Kennedy, Caleb J; Bolon, Yung-Tsi; Schneider, Joel; Pollack, Jane; Kim, Hwa Ran; Cereb, Nezih; Hollenbach, Jill A; Mack, Steven J; Maiers, Martin

    2015-12-01

    We present an electronic format for exchanging data for HLA and KIR genotyping with extensions for next-generation sequencing (NGS). This format addresses NGS data exchange by refining the Histoimmunogenetics Markup Language (HML) to conform to the proposed Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) reporting guidelines (miring.immunogenomics.org). Our refinements of HML include two major additions. First, NGS is supported by new XML structures to capture additional NGS data and metadata required to produce a genotyping result, including analysis-dependent (dynamic) and method-dependent (static) components. A full genotype, consensus sequence, and the surrounding metadata are included directly, while the raw sequence reads and platform documentation are externally referenced. Second, genotype ambiguity is fully represented by integrating Genotype List Strings, which use a hierarchical set of delimiters to represent allele and genotype ambiguity in a complete and accurate fashion. HML also continues to enable the transmission of legacy methods (e.g. site-specific oligonucleotide, sequence-specific priming, and Sequence Based Typing (SBT)), adding features such as allowing multiple group-specific sequencing primers, and fully leveraging techniques that combine multiple methods to obtain a single result, such as SBT integrated with NGS. PMID:26319908
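
    The hierarchical delimiters of Genotype List Strings lend themselves to a short recursive parser. The sketch below assumes the commonly documented delimiter precedence ("^" between loci, "|" between ambiguous genotypes, "+" between gene copies, "~" within a haplotype, "/" between ambiguous alleles); the function name and the nested-dictionary output are illustrative choices and are not part of HML.

```python
# Delimiters ordered from lowest to highest precedence.
GL_DELIMITERS = ["^", "|", "+", "~", "/"]


def parse_gl_string(text, level=0):
    """Split a GL String into a nested structure, one delimiter level at a time.

    A level that has only one part is collapsed, so a plain allele name
    is returned as a string.
    """
    if level == len(GL_DELIMITERS):
        return text
    delimiter = GL_DELIMITERS[level]
    children = [parse_gl_string(part, level + 1) for part in text.split(delimiter)]
    if len(children) == 1:
        return children[0]
    return {delimiter: children}


# One gene copy is ambiguous between two alleles; the other is known.
print(parse_gl_string("HLA-A*01:01/HLA-A*01:02+HLA-A*24:02"))
```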

  20. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae)

    PubMed Central

    Bonatelli, Isabel A. S.; Carstens, Bryan C.; Moraes, Evandro M.

    2015-01-01

    Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms. PMID:26561396
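
    Locating candidate microsatellites in sequence reads can be approximated with a regular expression for tandem repeats. This is a minimal sketch and not the pipeline used in the study; the function name, the motif length limits and the minimum repeat count are assumptions.

```python
import re


def find_microsatellites(seq, min_repeats=4, min_motif=2, max_motif=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest motif first,
    so ATATATAT is reported as AT x 4 rather than ATAT x 2.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


read = "GGC" + "AT" * 5 + "TTC" + "CAG" * 4 + "T"
for start, motif, count in find_microsatellites(read):
    print(start, motif, count)
```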

  1. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

    PubMed

    Bonatelli, Isabel A S; Carstens, Bryan C; Moraes, Evandro M

    2015-01-01

    Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms. PMID:26561396

  2. A Multi-Site Study Employing High Resolution HLA Genotyping by Next Generation Sequencing

    PubMed Central

    Holcomb, C. L.; Höglund, B.; Anderson, M. W.; Blake, L.A.; Böhme, I.; Egholm, M.; Ferriola, D.; Gabriel, C.; Gelber, S. E.; Goodridge, D.; Hawbecker, S.; Klein, R.; Ladner, M.; Lind, C.; Monos, D.; Pando, M. J.; Pröll, J.; Sayer, D. C.; Schmitz-Agheguian, G.; Simen, B. B.; Thiele, B.; Trachtenberg, E. A.; Tyan, D. B.; Wassmuth, R.; White, S.; Erlich, H. A.

    2014-01-01

    The high degree of polymorphism at HLA class I and class II loci makes high resolution HLA typing challenging. Current typing methods, including Sanger sequencing, yield ambiguous typing results due to incomplete genomic coverage and inability to set phase for HLA haplotype determination. The 454 Life Sciences GS FLX next generation sequencing system coupled with Conexio ATF software can provide very high resolution HLA genotyping. High throughput genotyping can be achieved by use of primers with multiplex identifier (MID) tags to allow pooling of the amplicons generated from different individuals prior to sequencing. We have conducted a double blind study in which eight laboratory sites performed amplicon sequencing using GS FLX standard chemistry and genotyped the same 20 samples for HLA-A, -B, -C, DPB1, DQA1, DQB1, DRB1, and DRB3, DRB4 and DRB5 (DRB3/4/5) in a single sequencing run. The average sequence read length was 250 base pairs (bp) and the average number of sequence reads per amplicon was 672, providing confidence in the allele assignments. Of the 1280 genotypes considered, assignment was possible in 95% of the cases. Failure to assign genotypes was the result of researcher procedural error or the presence of a novel allele rather than a failure of sequencing technology. Concordance with known genotypes, in cases where assignment was possible, ranged from 95.3% to 99.4% for the eight sites, with overall concordance of 97.2%. We conclude that clonal pyrosequencing using the GS FLX platform and Conexio ATF software allows reliable identification of HLA genotypes at high resolution. PMID:21299525

  3. Assessment of Metagenomic Assembly Using Simulated Next Generation Sequencing Data

    PubMed Central

    Sunagawa, Shinichi; Järvelin, Aino I.; Chan, Michelle M.; Arumugam, Manimozhiyan; Raes, Jeroen; Bork, Peer

    2012-01-01

    Due to the complexity of the protocols and a limited knowledge of the nature of microbial communities, simulating metagenomic sequences plays an important role in testing the performance of existing tools and data analysis methods with metagenomic data. We developed metagenomic read simulators with platform-specific (Sanger, pyrosequencing, Illumina) base-error models, and simulated metagenomes of differing community complexities. We first evaluated the effect of rigorous quality control on Illumina data. Although quality filtering removed a large proportion of the data, it greatly improved the accuracy and contig lengths of resulting assemblies. We then compared the quality-trimmed Illumina assemblies to those from Sanger and pyrosequencing. For the simple community (10 genomes) all sequencing technologies assembled a similar amount and accurately represented the expected functional composition. For the more complex community (100 genomes) Illumina produced the best assemblies and more correctly resembled the expected functional composition. For the most complex community (400 genomes) there was very little assembly of reads from any sequencing technology. However, due to the longer read length the Sanger reads still represented the overall functional composition reasonably well. We further examined the effect of scaffolding of contigs using paired-end Illumina reads. It dramatically increased contig lengths of the simple community and yielded minor improvements to the more complex communities. Although the increase in contig length was accompanied by increased chimericity, it resulted in more complete genes and a better characterization of the functional repertoire. The metagenomic simulators developed for this research are freely available. PMID:22384016

  4. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance

    PubMed Central

    Ahrenfeldt, Johanne; Cisneros, Jose Luis Bellod; Jurtz, Vanessa; Larsen, Mette Voldby; Hasman, Henrik; Aarestrup, Frank Møller; Lund, Ole

    2016-01-01

    Recent advances in whole genome sequencing have made the technology available for routine use in microbiological laboratories. However, a major obstacle for using this technology is the availability of simple and automatic bioinformatics tools. Based on previously published and already available web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the individual services. The reported results enable a rapid overview of the major results, and comparing that to the previously found results showed that the platform is reliable and able to correctly predict the species and find most of the expected genes automatically. In conclusion, a combined bioinformatics platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https://cge.cbs.dtu.dk/services/CGEpipeline-1.1 and it is the intention that it will continue to be expanded with new features as these become available. PMID:27327771

  5. Preliminary Sequence stratigraphy framework of the SW part of the Actopan Platform, Lower Cretaceous, Hidalgo, Mexico

    NASA Astrophysics Data System (ADS)

    Abascal, G.; Murillo-Muñeton, G.

    2013-05-01

    The oldest sedimentary rocks in what is known as the Actopan Platform, in the State of Hidalgo, Mexico, are superbly exposed toward the southwestern part of the platform. A detailed stratigraphic/sedimentologic study was carried out on a 623 m-thick section; the study focused on establishing a sequence stratigraphic framework. The base of the section consists of a Lower Cretaceous 6223-m thick, mixed siliciclastic-carbonate sedimentary succession that has been named the Santuario Formation. The terrigenous facies of this unit correspond to red beds that consist of shales, sandstones and a few conglomerates deposited under continental (fluvial) conditions. White and yellowish sandstones, possibly deposited by deltaic systems, occur in minor amounts. A tuff layer is found in its lower part. The carbonate facies of the Santuario Formation consist mainly of skeletal mudstones/wackestones with bioclasts and peloids, and subordinate quantities of sandy dolostones, skeletal packstones/grainstones and rudist (requieniid) boundstones. The middle and upper parts of the studied stratigraphic section correspond to an essentially carbonate succession that is known as the El Abra Formation. This unit comprises the following facies: skeletal mudstones/wackestones, skeletal packstones/grainstones, and minor rudist (requieniid and Chondrodonta) boundstones and cryptalgal laminites deposited in shallow subtidal lagoon to tidal flat conditions. At this location, a "Middle" Cretaceous age (Albian-Cenomanian) has been assigned to the El Abra Formation. However, the common presence of the benthic foraminifer Choffatella decipiens Schlumberger in these facies indicates that their age extends, at least, to the Lower Cretaceous (Barremian). This age was confirmed by dating zircons from the tuff deposited at the base of the section. The carbonate facies of the Santuario Formation stack to form fifth-order subtidal cycles or parasequences. While the carbonate facies of the El Abra Formation also stack

  6. Rapid evaluation and quality control of next generation sequencing data with FaQCs

    DOE PAGESBeta

    Lo, Chien-Chi; Chain, Patrick S. G.

    2014-12-01

    Background: Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Results: Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. Conclusion: FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.
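
    The abstract above notes that read quality often declines toward the end of a read and that poor-quality data should be removed. The following is a minimal sketch of that general idea, not FaQCs's own algorithm: it assumes Phred+33 quality encoding, and the function names and the threshold of 20 are hypothetical choices made for illustration.

```python
def phred33_scores(quality):
    """Decode a Phred+33 quality string into a list of integer scores."""
    return [ord(ch) - 33 for ch in quality]


def trim_3prime(sequence, quality, min_q=20):
    """Drop trailing bases whose quality score is below min_q.

    Assumption: a simple end-clipping rule, chosen for illustration only.
    """
    scores = phred33_scores(quality)
    end = len(scores)
    # Walk back from the 3' end while the last kept base is low quality.
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return sequence[:end], quality[:end]


# Example read: five high-quality bases followed by three low-quality ones.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGT", "IIIII###")
```

    A production tool would also handle paired reads, adapter or control-sequence filtering and reporting; this sketch covers only the end-clipping step.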

  7. Rapid evaluation and quality control of next generation sequencing data with FaQCs

    SciTech Connect

    Lo, Chien-Chi; Chain, Patrick S. G.

    2014-12-01

    Background: Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Results: Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. Conclusion: FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.

  8. Generating Researcher Networks with Identified Persons on a Semantic Service Platform

    NASA Astrophysics Data System (ADS)

    Jung, Hanmin; Lee, Mikyoung; Kim, Pyung; Lee, Seungwoo

    This paper describes a Semantic Web-based method to acquire researcher networks by means of an identification scheme, an ontology, and reasoning. Three steps are required to realize it: resolving co-references, finding experts, and generating researcher networks. We adopt OntoFrame as an underlying semantic service platform and apply reasoning to make direct relations between far-off classes in the ontology schema. A test set of 453,124 Elsevier journal articles with metadata and full-text documents in the information technology and biomedical domains has been loaded and served on the platform.

  9. Seismic stratigraphy of the western Florida carbonate platform above the Mid-Cretaceous sequence boundary (MCSB)

    SciTech Connect

    Jee, J.L. (Dept. of Geology)

    1993-03-01

    From the Apalachicola Basin (AB) to the Sarasota Arch, a web of multifold seismic and 29 wells were analyzed to determine Upper Cretaceous-Cenozoic stratigraphy. Concordant reflection geometries above and below the MCSB throughout most of the study area do not suggest prolonged subaerial exposure of the platform as some have considered. The configuration of the MCSB surface influenced the distribution of overlying sediment such that the section is thick in the basins and thin on the highs. The three main units recognized are Upper Cretaceous, Paleocene-Eocene, and post-Eocene. The Upper Cretaceous has two subunits, KU1 and KU2. KU1 corresponds in age to the Tuscaloosa-Eutaw lithostratigraphic units, has continuous, parallel seismic facies, and tends to thicken in depressions on the MCSB. KU2 is age-equivalent to part of the Selma Gp. Maastrichtian strata are locally thin to partly absent. In the AB, KU2 appears intensely faulted. Sonic velocities in KU2 show southeastward change to more carbonate rock across the Middle Ground Arch, where hummocky-to-contorted seismic facies and thickening on the structural high suggest constructional accumulation. In wells, Paleocene strata lie unconformably on the Upper Cretaceous. The Paleocene section is thin and not easy to resolve on seismic sections. In the AB, the lowermost Eocene sequence is a wedge that thickens dramatically to the west. In the eastern AB, younger Eocene sequences are stacked to form broad en echelon mounds. Post-Eocene strata in the AB are continuous, parallel and drape the upper Eocene surface. Along the southeastern, up-dip margin of the Tampa Embayment (TE), a belt of west-prograding clinoforms marks the Eocene shelf edge. Landward of this, a seismic marbled zone suggests dolomitic facies. In the post-Eocene section of the TE, Oligocene-Lower Miocene strata form successive sequences of progradational clinoforms that steepen as they impinge on the FL Escarpment.

  10. [Application of next-generation semiconductor sequencing technologies in genetic diagnosis of inherited cardiomyopathies].

    PubMed

    Yue, Zhao; Hong, Zhang; Xueshan, Xia

    2015-07-01

    Inherited cardiomyopathy is the most common hereditary cardiac disease. It also causes a significant proportion of sudden cardiac deaths in young adults and athletes. So far, approximately one hundred genes have been reported to be involved in cardiomyopathies through different mechanisms. Therefore, the identification of the genetic basis and disease mechanisms of cardiomyopathies is important for establishing a clinical diagnosis and genetic testing. The next-generation semiconductor sequencing (NGSS) technology platform is a high-throughput sequencer capable of analyzing clinically derived genomes with high productivity, sensitivity and specificity. It was launched in 2010 by Life Technologies of the USA, and it is based on a high-density semiconductor chip covered with tens of thousands of wells. NGSS has been successfully used in candidate gene mutation screening to identify hereditary disease. In this review, we summarize the genetic variations involved, the challenges and applications of NGSS in inherited cardiomyopathy, and its value in disease diagnosis, prevention and treatment. PMID:26351163

  11. A Semiconductor Chip-Based Next Generation Sequencing Procedure for the Main Pulmonary Hypertension Genes.

    PubMed

    Gómez, Juan; Reguero, Julian R; Alvarez, Celso; Junquera, Manuel R; Arango, Ana; Morís, César; Coto, Eliecer

    2015-08-01

    The aim of this study was to characterize the mutational spectrum of pulmonary hypertension (PH) patients through a next generation sequencing platform. In a total of 22 patients, the BMPR2, SMAD9, CAV1, KCNK3, and EIF2AK4 genes were sequenced with semiconductor chips and the Ion Torrent Personal Genome Machine. We found six putative mutations in SMAD9 (p.R263Q), BMPR2 (p.S301P, p.T493I), CAV1 (p.V155I), and EIF2AK4 (p.L489P, p.P1115L) in five patients. One patient was compound heterozygous for BMPR2 + SMAD9 mutations, and one patient was homozygous for EIF2AK4 p.P1115L. The reported procedure would facilitate the rapid mutational screening of large cohorts of PH patients. PMID:25917481

  12. Modification of the Transplex WTA2 Amplification Product for Next Generation Sequencing

    PubMed Central

    Ward, B.; Fenoglio, D.; Heuermann, K.

    2011-01-01

    Transplex Whole Transcriptome Amplification (WTA2) exponentially amplifies RNA, producing a double-stranded cDNA library while precisely maintaining differential levels of individual transcripts in test and reference samples. Though originally designed to amplify nanogram quantities of RNA, Transplex WTA2 has been shown to be exceedingly effective for amplification from damaged RNA template (FFPE and laser captured tissue samples) and single-cell input quantities (picograms). The efficacy of Transplex WTA2 amplification for downstream applications, primarily qPCR and expression microarray analysis, is well-documented. It follows that the utilization of next-generation sequencing for gene expression research and diagnostics would be well served by Transplex amplification of RNA isolated from samples of severely restricted quantity or quality. Strategies for the integration of Transplex WTA2 with next-generation sequencing are examined, with particular emphasis on elimination of the characteristic fixed primer sequence associated with each amplicon in the amplification library. Removal of these sites will allow direct entry of the resulting product into the sequencing workflow. Methods under consideration will enable the WTA2 amplicon to feed into the current sample prep protocols for the Illumina GA and GAII, SOLiD 5500/5500xl, and Roche-454 GS FLX/Junior platforms.

  13. Nanopore-based fourth-generation DNA sequencing technology.

    PubMed

    Feng, Yanxiao; Zhang, Yuechuan; Ying, Cuifeng; Wang, Deqiang; Du, Chunlei

    2015-02-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications. PMID:25743089

  14. Nanopore-based Fourth-generation DNA Sequencing Technology

    PubMed Central

    Feng, Yanxiao; Zhang, Yuechuan; Ying, Cuifeng; Wang, Deqiang; Du, Chunlei

    2015-01-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications. PMID:25743089

  15. Machine-Checked Sequencer for Critical Embedded Code Generator

    NASA Astrophysics Data System (ADS)

    Izerrouken, Nassima; Pantel, Marc; Thirioux, Xavier

    This paper presents the development of a correct-by-construction block sequencer for GeneAuto, a qualifiable (according to the DO-178B/ED-12B recommendation) automatic code generator. It transforms Simulink models to MISRA C code for safety-critical systems. Our approach, which combines a classical development process with formal specification and verification using proof assistants, led to preliminary fruitful exchanges with certification authorities. We present parts of the classical user and tool requirements and the derived formal specifications, implementation and verification for the correctness and termination of the block sequencer. This sequencer has been successfully applied to real-size industrial use cases from various transportation domain partners and led to the detection of requirement errors and a correct-by-construction implementation.

  16. Mapping Sensorimotor Sequences to Word Sequences: A Connectionist Model of Language Acquisition and Sentence Generation

    ERIC Educational Resources Information Center

    Takac, Martin; Benuskova, Lubica; Knott, Alistair

    2012-01-01

    In this article we present a neural network model of sentence generation. The network has both technical and conceptual innovations. Its main technical novelty is in its semantic representations: the messages which form the input to the network are structured as sequences, so that message elements are delivered to the network one at a time. Rather…

  17. Control for stochastic sampling variation and qualitative sequencing error in next generation sequencing

    PubMed Central

    Blomquist, Thomas; Crawford, Erin L.; Yeo, Jiyoun; Zhang, Xiaolu; Willey, James C.

    2015-01-01

    Background: Clinical implementation of Next-Generation Sequencing (NGS) is challenged by poor control for stochastic sampling, library preparation biases and qualitative sequencing error. To address these challenges we developed and tested two hypotheses. Methods: Hypothesis 1: Analytical variation in quantification is predicted by stochastic sampling effects at input of (a) amplifiable nucleic acid target molecules into the library preparation, (b) amplicons from library into sequencer, or (c) both. We derived equations using Monte Carlo simulation to predict assay coefficient of variation (CV) based on these three working models and tested them against NGS data from specimens with well characterized molecule inputs and sequence counts prepared using competitive multiplex-PCR amplicon-based NGS library preparation method comprising synthetic internal standards (IS). Hypothesis 2: Frequencies of technically-derived qualitative sequencing errors (i.e., base substitution, insertion and deletion) observed at each base position in each target native template (NT) are concordant with those observed in respective competitive synthetic IS present in the same reaction. We measured error frequencies at each base position within amplicons from each of 30 target NT, then tested whether they correspond to those within the 30 respective IS. Results: For hypothesis 1, the Monte Carlo model derived from both sampling events best predicted CV and explained 74% of observed assay variance. For hypothesis 2, observed frequency and type of sequence variation at each base position within each IS was concordant with that observed in respective NTs (R² = 0.93). Conclusion: In targeted NGS, synthetic competitive IS control for stochastic sampling at input of both target into library preparation and of target library product into sequencer, and control for qualitative errors generated during library preparation and sequencing. These controls enable accurate clinical diagnostic reporting of
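
    The two-stage sampling idea in Hypothesis 1 can be illustrated with a small Monte Carlo sketch. It does not reproduce the authors' equations, which the abstract does not give. It assumes Poisson sampling at both stages (molecules into the library preparation, then amplicons into the sequencer), and all function and parameter names are hypothetical. Under that assumption the expected CV is sqrt(1/N + 1/M) for mean molecule input N and mean sequence count M, which the simulation can be compared against.

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Draw from Poisson(lam) by counting unit-rate exponential arrivals up to lam."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_cv(mean_molecules, mean_reads, trials=4000, seed=1):
    """Estimate the CV of sequence counts when both sampling stages are Poisson."""
    rng = random.Random(seed)
    reads = []
    for _ in range(trials):
        # Stage (a): target molecules that enter the library preparation.
        molecules = poisson(rng, mean_molecules)
        # Stage (b): reads drawn in proportion to the molecules captured.
        expected_reads = mean_reads * molecules / mean_molecules
        reads.append(poisson(rng, expected_reads))
    return statistics.pstdev(reads) / statistics.mean(reads)


def analytic_cv(mean_molecules, mean_reads):
    """CV implied by two independent Poisson sampling stages."""
    return math.sqrt(1.0 / mean_molecules + 1.0 / mean_reads)


simulated = simulate_cv(50, 100)
predicted = analytic_cv(50, 100)
```

    Setting either mean very large recovers the single-stage models (a) or (b), because the corresponding term in the square root vanishes.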

  18. All-optical pseudorandom bit sequences generator based on TOADs

    NASA Astrophysics Data System (ADS)

    Sun, Zhenchao; Wang, Zhi; Wu, Chongqing; Wang, Fu; Li, Qiang

    2016-03-01

    A scheme for an all-optical pseudorandom bit sequence (PRBS) generator is demonstrated with an optical 'XNOR' logic gate and an all-optical wavelength converter based on cascaded Terahertz Optical Asymmetric Demultiplexers (TOADs). Its feasibility is verified by generation of a return-to-zero on-off keying (RZ-OOK) 2^63-1 PRBS at a speed of 1 Gb/s with a 10% duty ratio. The high randomness of the ultra-long-cycle PRBS is validated by successfully passing the standard benchmark test.
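
    The logic behind an XNOR-based PRBS generator can be sketched in software as a linear feedback shift register (LFSR). This is an illustration of the feedback logic only, not of the optical implementation. The tap positions are assumptions taken from commonly tabulated maximal-length LFSR configurations, since the abstract does not state the feedback polynomial, and the function names are hypothetical.

```python
def step(state, n_bits, taps):
    """Advance an n-bit Fibonacci LFSR by one clock using complemented-XOR feedback.

    With two taps the feedback is exactly XNOR; the all-zero state is valid and
    the all-one state is the single lock-up state.
    """
    feedback = 1
    for tap in taps:  # taps are 1-indexed register stages
        feedback ^= (state >> (tap - 1)) & 1
    return ((state << 1) | feedback) & ((1 << n_bits) - 1)


def period(n_bits, taps):
    """Count clocks until the register returns to the all-zero state."""
    state, clocks = step(0, n_bits, taps), 1
    while state != 0:
        state = step(state, n_bits, taps)
        clocks += 1
    return clocks


def prbs_bits(n_bits, taps, count, state=0):
    """Return `count` output bits taken from the last register stage."""
    bits = []
    for _ in range(count):
        bits.append((state >> (n_bits - 1)) & 1)
        state = step(state, n_bits, taps)
    return bits


# Small registers can be checked exhaustively: the period should be 2**n - 1.
short_period = period(7, (7, 6))
# Assumed taps for a 63-stage register; the period is far too long to enumerate.
sample = prbs_bits(63, (63, 62), 128)
```

    Checking a 7-stage register exhaustively gives some confidence in the feedback logic before it is scaled to 63 stages, where the full cycle cannot be enumerated.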

  19. New Generations: Sequencing Machines and Their Computational Challenges

    PubMed Central

    Schwartz, David C.; Waterman, Michael S.

    2011-01-01

    New generation sequencing systems are changing how molecular biology is practiced. The widely promoted $1000 genome will be a reality with attendant changes for healthcare, including personalized medicine. More broadly the genomes of many new organisms with large samplings from populations will be commonplace. What is less appreciated is the explosive demands on computation, both for CPU cycles and storage as well as the need for new computational methods. In this article we will survey some of these developments and demands. PMID:22121326

  20. Continuous flow generation of magnetoliposomes in a low-cost portable microfluidic platform.

    PubMed

    Conde, Alvaro J; Batalla, Milena; Cerda, Belén; Mykhaylyk, Olga; Plank, Christian; Podhajcer, Osvaldo; Cabaleiro, Juan M; Madrid, Rossana E; Policastro, Lucia

    2014-12-01

    We present a low-cost, portable microfluidic platform that uses laminated polymethylmethacrylate chips, peristaltic micropumps and LEGO® Mindstorms components for the generation of magnetoliposomes that does not require extrusion steps. Mixtures of lipids reconstituted in ethanol and an aqueous phase were injected independently in order to generate a combination of laminar flows in such a way that we could effectively achieve four hydrodynamic focused nanovesicle generation streams. Monodisperse magnetoliposomes with characteristics comparable to those obtained by traditional methods have been obtained. The magnetoliposomes are responsive to external magnetic field gradients, a result that suggests that the nanovesicles can be used in research and applications in nanomedicine. PMID:25257193

  1. Next generation sequencing in synovial sarcoma reveals novel gene mutations.

    PubMed

    Vlenterie, Myrella; Hillebrandt-Roeffen, Melissa H S; Flucke, Uta E; Groenen, Patricia J T A; Tops, Bastiaan B J; Kamping, Eveline J; Pfundt, Rolph; de Bruijn, Diederik R H; Geurts van Kessel, Ad H M; van Krieken, Han J H J M; van der Graaf, Winette T A; Versleijen-Jonkers, Yvonne M H

    2015-10-27

    Over 95% of all synovial sarcomas (SS) share a unique translocation, t(X;18); however, they show heterogeneous clinical behavior. We analyzed multiple SS to reveal additional genetic alterations besides the translocation. Twenty-six SS from 22 patients were sequenced for 409 cancer-related genes using the Comprehensive Cancer Panel (Life Technologies, USA) on an Ion Torrent platform. The detected variants were verified by Sanger sequencing and compared to matched normal DNAs. Copy number variation was assessed in six tumors using the Oncoscan array (Affymetrix, USA). In total, eight somatic mutations were detected in eight samples. These mutations have not been reported previously in SS. Two of these, in KRAS and CCND1, represent known oncogenic mutations in other malignancies. Additional mutations were detected in RNF213, SEPT9, KDR, CSMD3, MLH1 and ERBB4. DNA alterations occurred more often in adult tumors. A distinctive loss of 6q was found in a metastatic lesion progressing under pazopanib, but not in the responding lesion. Our results emphasize t(X;18) as a single initiating event in SS and as the main oncogenic driver. Our results also show the occurrence of additional genetic events, mutations or chromosomal aberrations, occurring more frequently in SS with an onset in adults. PMID:26415226

  2. Next generation sequencing in synovial sarcoma reveals novel gene mutations

    PubMed Central

    Vlenterie, Myrella; Hillebrandt-Roeffen, Melissa H.S.; Flucke, Uta E.; Groenen, Patricia J.T.A.; Tops, Bastiaan B.J.; Kamping, Eveline J.; Pfundt, Rolph; de Bruijn, Diederik R.H.; van Kessel, Ad H.M. Geurts; van Krieken, Han J.H.J.M.; van der Graaf, Winette T.A.; Versleijen-Jonkers, Yvonne M.H.

    2015-01-01

    Over 95% of all synovial sarcomas (SS) share a unique translocation, t(X;18); however, they show heterogeneous clinical behavior. We analyzed multiple SS to reveal additional genetic alterations besides the translocation. Twenty-six SS from 22 patients were sequenced for 409 cancer-related genes using the Comprehensive Cancer Panel (Life Technologies, USA) on an Ion Torrent platform. The detected variants were verified by Sanger sequencing and compared to matched normal DNAs. Copy number variation was assessed in six tumors using the Oncoscan array (Affymetrix, USA). In total, eight somatic mutations were detected in eight samples. These mutations have not been reported previously in SS. Two of these, in KRAS and CCND1, represent known oncogenic mutations in other malignancies. Additional mutations were detected in RNF213, SEPT9, KDR, CSMD3, MLH1 and ERBB4. DNA alterations occurred more often in adult tumors. A distinctive loss of 6q was found in a metastatic lesion progressing under pazopanib, but not in the responding lesion. Our results emphasize t(X;18) as a single initiating event in SS and as the main oncogenic driver. Our results also show the occurrence of additional genetic events, mutations or chromosomal aberrations, occurring more frequently in SS with an onset in adults. PMID:26415226

  3. Statistical Quantification of Methylation Levels by Next-Generation Sequencing

    PubMed Central

    Wu, Guodong; Yi, Nengjun; Absher, Devin; Zhi, Degui

    2011-01-01

    Background/Aims: Recently, next-generation sequencing-based technologies have enabled DNA methylation profiling at high resolution and low cost. Methyl-Seq and Reduced Representation Bisulfite Sequencing (RRBS) are two such technologies that interrogate methylation levels at CpG sites throughout the entire human genome. With rapid reduction of sequencing costs, these technologies will enable epigenotyping of large cohorts for phenotypic association studies. Existing quantification methods for sequencing-based methylation profiling are simplistic and do not deal with the noise due to the random sampling nature of sequencing and various experimental artifacts. Therefore, there is a need to investigate the statistical issues related to the quantification of methylation levels for these emerging technologies, with the goal of developing an accurate quantification method. Methods: In this paper, we propose two methods for Methyl-Seq quantification. The first method, the Maximum Likelihood estimate, is both conceptually intuitive and computationally simple. However, this estimate is biased at extreme methylation levels and does not provide variance estimation. The second method, based on a Bayesian hierarchical model, allows variance estimation of methylation levels, and provides a flexible framework to adjust technical bias in the sequencing process. Results: We compare the previously proposed binary method, the Maximum Likelihood (ML) method, and the Bayesian method. In both simulation and real data analysis of Methyl-Seq data, the Bayesian method offers the most accurate quantification. The ML method is slightly less accurate than the Bayesian method. However, both of our proposed methods outperform the original binary method in Methyl-Seq. In addition, we applied these quantification methods to simulation data and show that, with sequencing depth above 40–300 (which varies with different tissue samples) per cleavage site, Methyl-Seq offers a comparable quantification

  4. Analysis of Metagenomics Next Generation Sequence Data for Fungal ITS Barcoding: Do You Need Advance Bioinformatics Experience?

    PubMed Central

    Ahmed, Abdalla

    2016-01-01

    During the last few decades, most microbiology laboratories have become familiar with analyzing Sanger sequence data for ITS barcoding. However, with the availability of next-generation sequencing platforms in many centers, it has become important for medical mycologists to know how to make sense of the massive sequence data generated by these new sequencing technologies. In many reference laboratories, the analysis of such data is not a big deal, since suitable IT infrastructure and well-trained bioinformatics scientists are always available. However, in small research laboratories and clinical microbiology laboratories such resources are typically lacking. In this report, a simple and user-friendly bioinformatics workflow is suggested for fast and reproducible ITS barcoding of fungi. PMID:27507959

  5. Incipiently drowned platform deposit in cyclic Ordovician shelf sequence: Lower Ordovician Chepultepec Formation, Virginia

    SciTech Connect

    Bova, J.A.; Read, J.F.

    1983-03-01

    The Chepultepec interval, 145 to 260 m (476 to 853 ft) thick, in Virginia contains the Lower Member up to 150 m (492 ft) thick, and the Upper Member, up to 85 m (279 ft) thick, of peritidal cyclic limestone and dolomite, and a Middle Member, up to 110 m (360 ft) thick, of subtidal limestone and bioherms, passing northwestward into cyclic facies. Calculated long term subsidence rates were 4 to 5 cm/1000 yr (mature passive margin rates), shelf gradients were 6 cm/km, and average duration of cycles was 140,00 years. Peritidal cyclic sequences are upward shallowing sequences of pellet-skeletal limestone, thrombolites, rippled calcisiltites and intraclast grainstone, and laminite caps. They formed by rapid transgression with apparent submergence increments averaging approximately 2 m (6.5 ft) in Lower Member and 3.5 m (11.4 ft), Upper Member. Deposition during Middle Member time was dominated by skeletal limestone-mudstone, calcisiltite with storm generated fining-upward sequences, and burrow-mixed units that were formed near fair-weather wave base, along with thrombolite bioherms. Locally, there are upward shallowing sequences, of basal wackestone/mudstone to calcisiltite to bioherm complexes (locally with erosional scalloped tops). Following each submergence, carbonate sedimentation was able to build to sea level prior to renewed submergence. Large submergence events caused tidal flats to be shifted far to the west, and they were unable to prograde out onto the open shelf because of insufficient time before subsidence was renewed, and because the open shelf setting inhibited tidal flat deposition. The Middle Member represents an incipiently drowned sequence that developed by repeated submergence events.

  6. Next-generation sequencing for diagnosis of rare diseases in the neonatal intensive care unit

    PubMed Central

    Daoud, Hussein; Luco, Stephanie M.; Li, Rui; Bareke, Eric; Beaulieu, Chandree; Jarinova, Olga; Carson, Nancy; Nikkel, Sarah M.; Graham, Gail E.; Richer, Julie; Armour, Christine; Bulman, Dennis E.; Chakraborty, Pranesh; Geraghty, Michael; Lines, Matthew A.; Lacaze-Masmonteil, Thierry; Majewski, Jacek; Boycott, Kym M.; Dyment, David A.

    2016-01-01

    Background: Rare diseases often present in the first days and weeks of life and may require complex management in the setting of a neonatal intensive care unit (NICU). Exhaustive consultations and traditional genetic or metabolic investigations are costly and often fail to arrive at a final diagnosis when no recognizable syndrome is suspected. For this pilot project, we assessed the feasibility of next-generation sequencing as a tool to improve the diagnosis of rare diseases in newborns in the NICU. Methods: We retrospectively identified and prospectively recruited newborns and infants admitted to the NICU of the Children’s Hospital of Eastern Ontario and the Ottawa Hospital, General Campus, who had been referred to the medical genetics or metabolics inpatient consult service and had features suggesting an underlying genetic or metabolic condition. DNA from the newborns and parents was enriched for a panel of clinically relevant genes and sequenced on a MiSeq sequencing platform (Illumina Inc.). The data were interpreted with a standard informatics pipeline and reported to care providers, who assessed the importance of genotype–phenotype correlations. Results: Of 20 newborns studied, 8 received a diagnosis on the basis of next-generation sequencing (diagnostic rate 40%). The diagnoses were renal tubular dysgenesis, SCN1A-related encephalopathy syndrome, myotubular myopathy, FTO deficiency syndrome, cranioectodermal dysplasia, congenital myasthenic syndrome, autosomal dominant intellectual disability syndrome type 7 and Denys–Drash syndrome. Interpretation: This pilot study highlighted the potential of next-generation sequencing to deliver molecular diagnoses rapidly with a high success rate. With broader use, this approach has the potential to alter health care delivery in the NICU. PMID:27241786

  7. Suppression Subtractive Hybridization Versus Next-Generation Sequencing in Plant Genetic Engineering: Challenges and Perspectives.

    PubMed

    Sahebi, Mahbod; Hanafi, Mohamed M; Azizi, Parisa; Hakim, Abdul; Ashkani, Sadegh; Abiri, Rambod

    2015-10-01

    Suppression subtractive hybridization (SSH) is an effective method to identify different genes with different expression levels involved in a variety of biological processes. This method has often been used to study molecular mechanisms of plants in complex relationships with different pathogens and a variety of biotic stresses. Compared to other techniques used in gene expression profiling, SSH needs relatively smaller amounts of the initial materials, with lower costs, and fewer false positives present within the results. Extraction of total RNA from plant species rich in phenolic compounds, carbohydrates, and polysaccharides that easily bind to nucleic acids through cellular mechanisms is difficult and needs to be considered. Remarkable advancement has been achieved in the next-generation sequencing (NGS) field. As a result of progress within fields related to molecular chemistry and biology as well as specialized engineering, parallelization in the sequencing reaction has exceptionally enhanced the overall read number of generated sequences per run. Currently available sequencing platforms support an earlier unparalleled view directly into complex mixes associated with RNA in addition to DNA samples. NGS technology has demonstrated the ability to sequence DNA with remarkable swiftness, therefore allowing previously unthinkable scientific accomplishments along with novel biological purposes. However, the massive amounts of data generated by NGS impose a substantial challenge with regard to data safe-keeping and analysis. This review examines some simple but vital points involved in preparing the initial material for SSH and introduces this method as well as its associated applications to detect different novel genes from different plant species. This review evaluates general concepts, basic applications, plus the probable results of NGS technology in genomics, with unique mention of feasible potential tools as well as bioinformatics. PMID:26271955

  8. Complete plastid genome sequence of Vaccinium macrocarpon: structure, gene content and rearrangements revealed by next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete plastid genome sequence of the American cranberry was reconstructed using next-generation sequencing data by in silico procedures. We used Roche 454 shotgun sequence data to isolate cranberry plastid-specific sequences of the cultivar ‘HyRed’ via homology comparisons with complete seque...

  9. Metre-scale cyclicity in Middle Eocene platform carbonates in northern Egypt: Implications for facies development and sequence stratigraphy

    NASA Astrophysics Data System (ADS)

    Tawfik, Mohamed; El-Sorogy, Abdelbaset; Moussa, Mahmoud

    2016-07-01

    The shallow-water carbonates of the Middle Eocene in northern Egypt represent a Tethyan reef-rimmed carbonate platform with bedded inner-platform facies. Based on extensive micro- and biofacies documentation, five lithofacies associations were defined and their respective depositional environments were interpreted. Investigated sections were subdivided into three third-order sequences, named S1, S2 and S3. Sequence S1 is interpreted to correspond to the Lutetian, S2 corresponds to the Late Lutetian and Early Bartonian, and S3 represents the Late Bartonian. Each of the three sequences was further subdivided into fourth-order cycle sets and fifth-order cycles. The complete hierarchy of cycles can be correlated along 190 km across the study area, highlighting a general "layer-cake" stratigraphic architecture. The documentation of the studied outcrops may contribute to a better regional understanding of the Middle Eocene formations in northern Egypt and to Tethyan pericratonic carbonate models in general.

  10. Humans cannot consciously generate random numbers sequences: Polemic study.

    PubMed

    Figurska, Małgorzata; Stańczyk, Maciej; Kulesza, Kamil

    2008-01-01

    It is widely believed that randomness exists in Nature. In fact, such an assumption underlies many scientific theories and is embedded in the foundations of quantum mechanics. Assuming that this hypothesis is valid, one can use natural phenomena, like radioactive decay, to generate random numbers. Today, computers are capable of generating so-called pseudorandom numbers. Such series of numbers are only seemingly random (bias in the randomness quality can be observed). The question of whether people can produce random numbers has been investigated by many scientists in recent years. The paper "Humans can consciously generate random numbers sequences..." published recently in Medical Hypotheses made claims that were in many ways contrary to the state of the art; it also stated far-reaching hypotheses. So, we decided to repeat the experiments reported, with special care being taken to follow proper laboratory procedures. Here, we present the results and discuss possible implications for computer science and other sciences. PMID:17888582
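
    The abstract above notes that bias in a supposedly random series can be observed. As a hedged illustration only (the abstract does not say which tests the authors ran), the sketch below computes a Pearson chi-square statistic for how evenly symbols are used. The function name `chi_square_uniform` and the toy inputs are assumptions made for this example.

```python
from collections import Counter

def chi_square_uniform(sequence, alphabet):
    """Pearson chi-square statistic of `sequence` against a uniform
    distribution over `alphabet` (hypothetical helper for illustration)."""
    counts = Counter(sequence)
    expected = len(sequence) / len(alphabet)
    return sum((counts.get(symbol, 0) - expected) ** 2 / expected
               for symbol in alphabet)

digits = "0123456789"
balanced = digits * 10   # every digit appears exactly 10 times
skewed = "7" * 100       # a single digit repeated: maximal frequency bias

balanced_stat = chi_square_uniform(balanced, digits)
skewed_stat = chi_square_uniform(skewed, digits)
print(balanced_stat, skewed_stat)
```

    A frequency test like this only checks how often each symbol appears; it says nothing about serial dependence between consecutive symbols, which would need additional tests.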

  11. Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers.

    PubMed

    Myer, Phillip R; Kim, MinSeok; Freetly, Harvey C; Smith, Timothy P L

    2016-08-01

    Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplification primer selection, and read length, which can affect the apparent microbial community. In this study, we compared short read 16S rRNA variable regions, V1-V3, with that of near-full length 16S regions, V1-V8, using highly diverse steer rumen microbial communities, in order to examine the impact of technology selection on phylogenetic profiles. Short paired-end reads from the Illumina MiSeq platform were used to generate V1-V3 sequence, while long "circular consensus" reads from the Pacific Biosciences RSII instrument were used to generate V1-V8 data. The two platforms revealed similar microbial operational taxonomic units (OTUs), as well as similar species richness, Good's coverage, and Shannon diversity metrics. However, the V1-V8 amplified ruminal community resulted in significant increases in several orders of taxa, such as phyla Proteobacteria and Verrucomicrobia (P < 0.05). Taxonomic classification accuracy was also greater in the near full-length read. UniFrac distance matrices using jackknifed UPGMA clustering also noted differences between the communities. These data support the consensus that longer reads result in a finer phylogenetic resolution that may not be achieved by shorter 16S rRNA gene fragments. Our work on the cattle rumen bacterial community demonstrates that utilizing near full-length 16S reads may be useful in conducting a more thorough study, or for developing a niche-specific database to use in analyzing data from shorter read technologies when budgetary constraints preclude use of near-full length 16S sequencing. PMID:27282101
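
    The abstract above compares platforms using Good's coverage and Shannon diversity. As a minimal sketch under standard textbook definitions (not the authors' pipeline), the code below computes both from a list of per-OTU read counts. The function names and the toy counts are assumptions for illustration.

```python
import math

def shannon_index(otu_counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with reads."""
    total = sum(otu_counts)
    return -sum((c / total) * math.log(c / total)
                for c in otu_counts if c > 0)

def goods_coverage(otu_counts):
    """Good's coverage = 1 - (number of singleton OTUs / total reads)."""
    total = sum(otu_counts)
    singletons = sum(1 for c in otu_counts if c == 1)
    return 1.0 - singletons / total

even_community = [10, 10, 10, 10]   # toy counts, four equally abundant OTUs
sparse_community = [5, 3, 1, 1]     # toy counts with two singleton OTUs

print(shannon_index(even_community))     # equals ln(4) for four even OTUs
print(goods_coverage(sparse_community))
```

    The natural logarithm is used here; some tools report Shannon diversity in base 2, so values are comparable only when the base matches.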

  12. Next generation sequencing applications for breast cancer research

    PubMed Central

    PETRIC, ROXANA COJOCNEANU; POP, LAURA-ANCUTA; JURJ, ANCUTA; RADULY, LAJOS; DUMITRASCU, DAN; DRAGOS, NICOLAE; NEAGOE, IOANA BERINDAN

    2015-01-01

    For some time, cancer has not been thought of as a disease, but as a multifaceted, heterogeneous complex of genotypic and phenotypic manifestations leading to tumorigenesis. Due to recent technological progress, the outcome of cancer patients can be greatly improved by introducing into clinical practice the advantages brought about by the development of next generation sequencing techniques. Biomedical suppliers have come up with various applications which medical researchers can use to characterize a patient’s disease from a molecular and genetic point of view in order to provide caregivers with rapid and relevant information to guide them in choosing the most appropriate course of treatment, with maximum efficiency and minimal side effects. Breast cancer, whose incidence has risen dramatically, is a good candidate for these novel diagnostic and therapeutic approaches, particularly when referring to specific sequencing panels which are designed to detect germline or somatic mutations in genes that are involved in breast cancer tumorigenesis and progression. Benchtop next generation sequencing machines are becoming a more common presence in the clinical setting, empowering physicians to better treat their patients by offering early diagnosis alternatives and targeted remedies, and bringing medicine a step closer to achieving its ultimate goal, personalized therapy. PMID:26609257

  13. Next generation sequencing applications for breast cancer research.

    PubMed

    Petric, Roxana Cojocneanu; Pop, Laura-Ancuta; Jurj, Ancuta; Raduly, Lajos; Dumitrascu, Dan; Dragos, Nicolae; Neagoe, Ioana Berindan

    2015-01-01

    For some time, cancer has not been thought of as a disease, but as a multifaceted, heterogeneous complex of genotypic and phenotypic manifestations leading to tumorigenesis. Due to recent technological progress, the outcome of cancer patients can be greatly improved by introducing into clinical practice the advantages brought about by the development of next generation sequencing techniques. Biomedical suppliers have come up with various applications which medical researchers can use to characterize a patient's disease from a molecular and genetic point of view in order to provide caregivers with rapid and relevant information to guide them in choosing the most appropriate course of treatment, with maximum efficiency and minimal side effects. Breast cancer, whose incidence has risen dramatically, is a good candidate for these novel diagnostic and therapeutic approaches, particularly when referring to specific sequencing panels which are designed to detect germline or somatic mutations in genes that are involved in breast cancer tumorigenesis and progression. Benchtop next generation sequencing machines are becoming a more common presence in the clinical setting, empowering physicians to better treat their patients by offering early diagnosis alternatives and targeted remedies, and bringing medicine a step closer to achieving its ultimate goal, personalized therapy. PMID:26609257

  14. Unraveling genomic variation from next generation sequencing data

    PubMed Central

    2013-01-01

    Elucidating the content of a DNA sequence is critical to understanding and decoding the genetic information of any biological system. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of the areas shaped by the new technological advances are evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAS), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, the analysis and interpretation of the vast amount of data produced is not an easy or trivial task and remains a great challenge in the field of bioinformatics. Therefore, efficient tools that cope with information overload, tackle the high complexity and provide meaningful visualizations to make knowledge extraction easier are essential. In this article, we briefly refer to the sequencing methodologies and the available equipment that serve these analyses, and we describe the data formats of the files they produce. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data, with emphasis on structural variation analysis and comparative genomics. Finally, we comment on their functionality, strengths and weaknesses, and we discuss how future applications could further develop in this field. PMID:23885890
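
    The abstract above mentions the data formats of the files that sequencers produce but does not name them here. As a hedged illustration, the sketch below reads the widely used four-line FASTQ layout (header, sequence, separator, quality string). The function name `read_fastq` and the two toy records are assumptions for this example, and real files may need more tolerant parsing.

```python
import io

def read_fastq(handle):
    """Yield (read_id, sequence, quality) tuples from four-line FASTQ records."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:          # end of input
            return
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record near: " + header)
        if len(sequence) != len(quality):
            raise ValueError("sequence/quality length mismatch in: " + header)
        yield header[1:], sequence, quality

# Toy in-memory input standing in for a file on disk.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!II\n")
records = list(read_fastq(example))
print(records)
```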

  15. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach

    PubMed Central

    Mammadov, Jafar; Ye, Liang; Soe, Khaing; Richey, Kimberly; Cruse, James; Zhuang, Meibao; Gao, Zhifang; Evans, Clive; Rounsley, Steve; Kumpatla, Siva P.

    2016-01-01

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterizing transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive, cost- and labor-effective alternative for molecular characterization compared to traditional Southern blot analysis. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions. PMID:26908260

  16. Genetic markers, genotyping methods & next generation sequencing in Mycobacterium tuberculosis

    PubMed Central

    Desikan, Srinidhi; Narayanan, Sujatha

    2015-01-01

    Molecular epidemiology (ME) is one of the main areas in tuberculosis research and is widely used to study the transmission epidemics and outbreaks of tubercle bacilli. It exploits the presence of various polymorphisms in the genome of the bacteria that can be widely used as genetic markers. Many DNA typing methods apply these genetic markers to differentiate various strains and to study the evolutionary relationships between them. The three widely used genotyping tools to differentiate Mycobacterium tuberculosis strains are IS6110 restriction fragment length polymorphism (RFLP), spacer oligotyping (Spoligotyping), and mycobacterial interspersed repeat units - variable number of tandem repeats (MIRU-VNTR). A new prospect for ME was introduced with the development of whole genome sequencing (WGS) and next generation sequencing (NGS) methods, in which the entire genome is sequenced; this not only helps in pointing out minute differences between the various sequences but also saves time and cost. NGS is also found to be useful in identifying single nucleotide polymorphisms (SNPs), in comparative genomics, and in studying various aspects of transmission dynamics. These techniques enable the identification of mycobacterial strains and also facilitate the study of their phylogenetic and evolutionary traits. PMID:26205019
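
    The abstract above describes using whole-genome data to point out minute differences, such as SNPs, between strains. As a minimal sketch, assuming two sequences that are already aligned to the same coordinates, the code below lists positions where unambiguous bases differ. The function name `snp_positions` and the short toy sequences are assumptions for illustration and are not Mycobacterium tuberculosis data.

```python
def snp_positions(reference, sample):
    """Return 0-based positions where two pre-aligned sequences differ,
    ignoring positions where either base is not an unambiguous A/C/G/T."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return [i for i, (ref_base, alt_base) in enumerate(zip(reference, sample))
            if ref_base != alt_base and ref_base in valid and alt_base in valid]

toy_reference = "ACGTACGT"
toy_isolate = "ACGAACGC"
differences = snp_positions(toy_reference, toy_isolate)
print(differences, len(differences))   # positions and pairwise SNP distance
```

    The length of the returned list can serve as a simple pairwise SNP distance; real analyses first call variants from reads against a reference, which this sketch does not attempt.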

  17. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach.

    PubMed

    Guttikonda, Satish K; Marri, Pradeep; Mammadov, Jafar; Ye, Liang; Soe, Khaing; Richey, Kimberly; Cruse, James; Zhuang, Meibao; Gao, Zhifang; Evans, Clive; Rounsley, Steve; Kumpatla, Siva P

    2016-01-01

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterizing transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive, cost- and labor-effective alternative for molecular characterization compared to traditional Southern blot analysis. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions. PMID:26908260

  18. BING: biomedical informatics pipeline for Next Generation Sequencing.

    PubMed

    Kriseman, Jeffrey; Busick, Christopher; Szelinger, Szabolcs; Dinu, Valentin

    2010-06-01

    High throughput parallel genomic sequencing (Next Generation Sequencing, NGS) shifts the bottleneck in sequencing processes from experimental data production to computationally intensive informatics-based data analysis. This manuscript introduces a biomedical informatics pipeline (BING) for the analysis of NGS data that offers several novel computational approaches to (1) image alignment; (2) signal correlation, compensation, separation, and pixel-based cluster registration; (3) signal measurement and base calling; and (4) quality control and accuracy measurement. These approaches address many of the informatics challenges, including image processing, computational performance, and accuracy. These new algorithms are benchmarked against the Illumina Genome Analysis Pipeline. BING is one of the first software tools to perform pixel-based analysis of NGS data. When compared to the Illumina informatics tool, BING's pixel-based approach produces a significant increase in the number of sequence reads, while reducing the computational time per experiment and error rate (<2%). This approach has the potential of increasing the density and throughput of NGS technologies. PMID:19925883

  19. Metagenome of microorganisms associated with the toxic Cyanobacteria Microcystis aeruginosa analyzed using the 454 sequencing platform

    NASA Astrophysics Data System (ADS)

    Li, Nan; Zhang, Lei; Li, Fuchao; Wang, Yuezhu; Zhu, Yongqiang; Kang, Hui; Wang, Shengyue; Qin, Song

    2011-05-01

    In this study, the 454 pyrosequencing technology was used to analyze the DNA of the Microcystis aeruginosa symbiosis system from cyanobacterial algal blooms in Taihu Lake, China. We generated 183 228 reads with an average length of 248 bp. Running the 454 assembly algorithm over our sequences yielded 22 239 significant contigs. After excluding the M. aeruginosa sequences, we obtained 1 322 assembled contigs longer than 1 000 bp. Taxonomic analysis indicated that four kingdoms were represented in the community: Archaea (n = 9; 0.01%), Bacteria (n = 98 921; 99.6%), Eukaryota (n = 373; 3.7%), and Viruses (n = 18; 0.02%). The bacterial sequences were predominantly Alphaproteobacteria (n = 41 805; 83.3%), Betaproteobacteria (n = 5 254; 10.5%) and Gammaproteobacteria (n = 1 180; 2.4%). Gene annotations and assignment of COG (clusters of orthologous groups) functional categories indicate that a large number of the predicted genes are involved in metabolic, genetic, and environmental information processes. Our results demonstrate the extraordinary diversity of a microbial community in an ectosymbiotic system and further establish the tremendous utility of pyrosequencing.
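
    The kingdom-level shares quoted above can be re-derived from the reported counts. The sketch below assumes the denominator is the sum of the four kingdom counts, which the abstract does not state. Under that assumption the Archaea, Bacteria and Virus percentages reproduce as printed, while the Eukaryota count of 373 works out to about 0.38% rather than the printed 3.7%. This is an observation about the arithmetic, not a correction of the source.

```python
# Counts as reported in the abstract; the shared denominator is an assumption.
kingdom_counts = {
    "Archaea": 9,
    "Bacteria": 98921,
    "Eukaryota": 373,
    "Viruses": 18,
}

total_assigned = sum(kingdom_counts.values())
shares = {kingdom: 100.0 * n / total_assigned
          for kingdom, n in kingdom_counts.items()}

for kingdom, percent in shares.items():
    print(f"{kingdom}: {percent:.2f}%")
```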

  20. Integrated Next-Generation Sequencing and Avatar Mouse Models for Personalized Cancer Treatment

    PubMed Central

    Garralda, Elena; Paz, Keren; López-Casas, Pedro P.; Jones, Siân; Katz, Amanda; Kann, Lisa M.; López-Rios, Fernando; Sarno, Francesca; Al-Shahrour, Fátima; Vasquez, David; Bruckheimer, Elizabeth; Angiuoli, Samuel V.; Calles, Antonio; Diaz, Luis A.; Velculescu, Victor E.; Valencia, Alfonso; Sidransky, David; Hidalgo, Manuel

    2015-01-01

    Background: Current technology permits an unbiased massive analysis of somatic genetic alterations from tumor DNA as well as the generation of individualized mouse xenografts (Avatar models). This work aimed to evaluate our experience integrating these two strategies to personalize the treatment of patients with cancer. Methods: We performed whole-exome sequencing analysis of 25 patients with advanced solid tumors to identify putatively actionable tumor-specific genomic alterations. Avatar models were used as an in vivo platform to test proposed treatment strategies. Results: Successful exome sequencing analyses have been obtained for 23 patients. Tumor-specific mutations and copy-number variations were identified. All samples profiled contained relevant genomic alterations. Tumor was implanted to create an Avatar model from 14 patients and 10 succeeded. Occasionally, actionable alterations such as mutations in NF1, PI3KA, and DDR2 failed to provide any benefit when a targeted drug was tested in the Avatar and, accordingly, treatment of the patients with these drugs was not effective. To date, 13 patients have received a personalized treatment and 6 achieved durable partial remissions. Prior testing of candidate treatments in Avatar models correlated with clinical response and helped to select empirical treatments in some patients with no actionable mutations. Conclusion: The use of full genomic analysis for cancer care is encouraging but presents important challenges that will need to be solved for broad clinical application. Avatar models are a promising investigational platform for therapeutic decision making. While limitations still exist, this strategy should be further tested. PMID:24634382

  1. Computational characterisation of cancer molecular profiles derived using next generation sequencing

    PubMed Central

    Oleksiewicz, Urszula; Tomczak, Katarzyna; Woropaj, Jakub; Markowska, Monika; Stępniak, Piotr

    2015-01-01

    Our current understanding of cancer genetics is grounded in the principle that cancer arises from a clone that has accumulated the requisite somatically acquired genetic aberrations, leading to malignant transformation. It also results in aberrant gene and protein expression. Next generation sequencing (NGS) or deep sequencing platforms are being used to create large catalogues of changes in copy numbers, mutations, structural variations, gene fusions, gene expression, and other types of information for cancer patients. However, inferring different types of biological changes from raw reads generated by sequencing experiments is algorithmically and computationally challenging. In this article, we outline common steps for the quality control and processing of NGS data. We highlight the importance of accurate and application-specific alignment of these reads and the methodological steps and challenges in obtaining different types of information. We comment on the importance of integrating these data and building infrastructure to analyse them. We also provide exhaustive lists of available software to obtain information and point readers to articles comparing software for deeper insight in specialised areas. We hope that the article will guide readers in choosing the right tools for analysing oncogenomic datasets. PMID:25691827
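
    The abstract above refers to common quality-control steps for NGS reads without listing them here. One frequently used step is filtering reads by base quality, sketched below under the assumption of Phred+33 encoded quality strings. The function names, the threshold of 20 and the toy strings are illustrative choices and are not taken from the article.

```python
def phred_scores(quality, offset=33):
    """Decode an ASCII quality string into integer Phred scores."""
    return [ord(ch) - offset for ch in quality]

def passes_quality(quality, min_mean=20.0, offset=33):
    """Keep a read only if its mean Phred score reaches `min_mean`.
    Empty quality strings are rejected."""
    scores = phred_scores(quality, offset)
    return bool(scores) and sum(scores) / len(scores) >= min_mean

toy_qualities = ["IIII", "!!!!", "I!5I"]   # hypothetical quality strings
kept = [q for q in toy_qualities if passes_quality(q)]
print(kept)
```

    A Phred score Q corresponds to an estimated error probability of 10^(-Q/10), so a mean of 20 means roughly one expected error per 100 bases.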

  2. Second-generation environmental sequencing unmasks marine metazoan biodiversity

    PubMed Central

    Fonseca, Vera G.; Carvalho, Gary R.; Sung, Way; Johnson, Harriet F.; Power, Deborah M.; Neill, Simon P.; Packer, Margaret; Blaxter, Mark L.; Lambshead, P. John D.; Thomas, W. Kelley; Creer, Simon

    2010-01-01

    Biodiversity is of crucial importance for ecosystem functioning, sustainability and resilience, but the magnitude and organization of marine diversity at a range of spatial and taxonomic scales are undefined. In this paper, we use second-generation sequencing to unmask putatively diverse marine metazoan biodiversity in a Scottish temperate benthic ecosystem. We show that remarkable differences in diversity occurred at microgeographical scales and refute currently accepted ecological and taxonomic paradigms of meiofaunal identity, rank abundance and concomitant understanding of trophic dynamics. Richness estimates from the current benchmarked Operational Clustering of Taxonomic Units from Parallel UltraSequencing analyses are broadly aligned with those derived from morphological assessments. However, the slope of taxon rarefaction curves for many phyla remains incomplete, suggesting that the true alpha diversity is likely to exceed current perceptions. The approaches provide a rapid, objective and cost-effective taxonomic framework for exploring links between ecosystem structure and function of all hitherto intractable, but ecologically important, communities. PMID:20981026

  3. Improved timing sequence generator on the DIII-D tokamak

    NASA Astrophysics Data System (ADS)

    Colio, R. A.; Finkenthal, D. F.; Deterly, T. M.

    2011-10-01

    The DIII-D tokamak uses a central clock source and trigger system to synchronize plant operations and diagnostics. The system uses a bi-phase encoding technique to send both clock and trigger signals to remote receivers, and supports both pre-programmed sequences of triggers as well as event-driven triggers. A 1 MHz timebase is used and triggers are encoded as eight-bit hexadecimal words. Currently, the system relies on a cascaded series of CAMAC-based delay generators to produce the trigger sequence. We present a modern and more versatile implementation based on a single FPGA (field programmable gate array) capable of providing clock rates upward of 100 MHz while maintaining compatibility with existing equipment. A proposal for system clock synchronization with GPS for improved precision is also presented. Work supported in part by US DOE under DE-FC02-04ER54698 and the National Undergraduate Fellowship in Fusion Science and Engineering.
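
    The abstract above says the system uses a bi-phase encoding to carry clock and trigger information on one line, but it does not specify the variant. As a hedged sketch, the code below assumes biphase mark coding: every bit cell starts with a level transition, and a 1 bit adds a second transition in the middle of the cell. The function names and the example word 0xA5 are assumptions for illustration and do not describe the actual DIII-D hardware.

```python
def byte_to_bits(value):
    """Split an eight-bit word into a list of bits, most significant first."""
    return [(value >> shift) & 1 for shift in range(7, -1, -1)]

def biphase_mark_encode(bits, level=0):
    """Encode bits as two half-cell line levels per bit (biphase mark code)."""
    half_cells = []
    for bit in bits:
        level ^= 1                 # transition at the start of every bit cell
        half_cells.append(level)
        if bit:
            level ^= 1             # extra mid-cell transition encodes a 1
        half_cells.append(level)
    return half_cells

def biphase_mark_decode(half_cells):
    """Recover bits: a level change inside a cell means 1, otherwise 0."""
    return [1 if half_cells[i] != half_cells[i + 1] else 0
            for i in range(0, len(half_cells), 2)]

trigger_word = 0xA5                # hypothetical eight-bit trigger code
line_levels = biphase_mark_encode(byte_to_bits(trigger_word))
print(line_levels)
print(biphase_mark_decode(line_levels) == byte_to_bits(trigger_word))
```

    Because every bit cell begins with a transition, a receiver can recover the clock from the data stream itself, which is the property that lets one signal carry both timing and trigger codes.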

  4. Next-Generation Sequencing: Role in Gynecologic Cancers.

    PubMed

    Evans, Tarra; Matulonis, Ursula

    2016-09-01

    Next-generation sequencing (NGS) has risen to the forefront of tumor analysis and has enabled unprecedented advances in the molecular profiling of solid tumors. Through massively parallel sequencing, previously unrecognized genomic alterations have been unveiled in many malignancies, including gynecologic cancers, thus expanding the potential repertoire for the use of targeted therapies. NGS has expanded the understanding of the genomic foundation of gynecologic malignancies and has allowed identification of germline and somatic mutations associated with cancer development, enabled tumor reclassification, and helped determine mechanisms of treatment resistance. NGS has also facilitated rational therapeutic strategies based on actionable molecular aberrations. However, issues remain regarding cost and clinical utility. This review covers NGS analysis of gynecologic cancers, specifically ovarian, endometrial, cervical, and vulvar cancers, and its impact on them thus far. PMID:27587626

  5. Perspectives of integrative cancer genomics in next generation sequencing era.

    PubMed

    Kwon, So Mee; Cho, Hyunwoo; Choi, Ji Hye; Jee, Byul A; Jo, Yuna; Woo, Hyun Goo

    2012-06-01

    The explosive development of genomics technologies including microarrays and next generation sequencing (NGS) has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of the genetic aberrations could reveal candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establish a huge cancer genome compendium and integrative omics analyses, so-called "integromics", have extended our understanding of the cancer genome, showing its daunting complexity and heterogeneity. However, the challenges of the structured integration, sharing, and interpretation of big omics data remain to be resolved. Here, we review several issues raised in cancer omics data analysis, including NGS, focusing particularly on study design and analysis strategies. This review might help readers understand the current trends and strategies of rapidly evolving cancer genomics research. PMID:23105932

  6. Tablet: Visualizing Next-Generation Sequence Assemblies and Mappings.

    PubMed

    Milne, Iain; Bayer, Micha; Stephen, Gordon; Cardle, Linda; Marshall, David

    2016-01-01

    This chapter is designed to be a practical guide to using Tablet for the visualization of next/second-generation sequencing (NGS) data. NGS data is being produced more frequently and in greater volumes every year. As such, it is increasingly important to have tools which enable biologists and bioinformaticians to understand and gain key insights into their data. Visualization can play a key role in the exploration of such data as well as aid in the visual validation of sequence assemblies and features such as single nucleotide polymorphisms (SNPs). We aim to show several use cases which demonstrate Tablet's ability to visually highlight various situations of interest which can arise in NGS data. PMID:26519411

  7. Evaluation of next generation sequencing for the analysis of Eimeria communities in wildlife.

    PubMed

    Vermeulen, Elke T; Lott, Matthew J; Eldridge, Mark D B; Power, Michelle L

    2016-05-01

    Next-generation sequencing (NGS) techniques are well-established for studying bacterial communities but not yet for microbial eukaryotes. Parasite communities remain poorly studied, due in part to the lack of reliable and accessible molecular methods to analyse eukaryotic communities. We aimed to develop and evaluate a methodology to analyse communities of the protozoan parasite Eimeria from populations of the Australian marsupial Petrogale penicillata (brush-tailed rock-wallaby) using NGS. An oocyst purification method for small sample sizes and polymerase chain reaction (PCR) protocol for the 18S rRNA locus targeting Eimeria was developed and optimised prior to sequencing on the Illumina MiSeq platform. A data analysis approach was developed by modifying methods from bacterial metagenomics and utilising existing Eimeria sequences in GenBank. Operational taxonomic unit (OTU) assignment at a high similarity threshold (97%) was more accurate at assigning Eimeria contigs into Eimeria OTUs but at a lower threshold (95%) there was greater resolution between OTU consensus sequences. The assessment of two amplification PCR methods prior to Illumina MiSeq, single and nested PCR, determined that single PCR was more sensitive to Eimeria as more Eimeria OTUs were detected in single amplicons. We have developed a simple and cost-effective approach to a data analysis pipeline for community analysis of eukaryotic organisms using Eimeria communities as a model. The pipeline provides a basis for evaluation using other eukaryotic organisms and potential for diverse community analysis studies. PMID:26944624
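
    The abstract above contrasts OTU assignment at 97% and 95% similarity thresholds. The sketch below is a simplified illustration of how the threshold changes the grouping, using greedy centroid clustering on equal-length toy sequences. It is not the authors' pipeline, which would rely on alignment-based tools, and the function names and sequences are assumptions for this example.

```python
def identity(seq_a, seq_b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch only handles equal-length sequences")
    return sum(a == b for a, b in zip(seq_a, seq_b)) / len(seq_a)

def greedy_otus(sequences, threshold):
    """Assign each sequence to the first centroid it matches at or above
    `threshold`; otherwise it becomes the centroid of a new OTU."""
    centroids = []
    assignment = []
    for seq in sequences:
        for index, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                assignment.append(index)
                break
        else:
            centroids.append(seq)
            assignment.append(len(centroids) - 1)
    return assignment

toy_reads = [
    "A" * 20,          # reference-like read
    "A" * 19 + "C",    # 95% identical to the first read
    "A" * 18 + "CC",   # 90% identical to the first read
]
print(greedy_otus(toy_reads, 0.97))
print(greedy_otus(toy_reads, 0.95))
```

    In this toy case the stricter threshold keeps all three reads in separate OTUs, while the looser one merges the first two.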

  8. SRAdb: query and use public next-generation sequencing data from within R

    PubMed Central

    2013-01-01

    Background: The Sequence Read Archive (SRA) is the largest public repository of sequencing data from the next generation of sequencing platforms including Illumina (Genome Analyzer, HiSeq, MiSeq, etc.), Roche 454 GS System, Applied Biosystems SOLiD System, Helicos Heliscope, PacBio RS, and others. Results: SRAdb is an attempt to make queries of the metadata associated with SRA submission, study, sample, experiment and run more robust and precise, and to make access to sequencing data in the SRA easier. We have parsed all the SRA metadata into a SQLite database that is routinely updated and can be easily distributed. The SRAdb R/Bioconductor package then utilizes this SQLite database for querying and accessing metadata. Full text search functionality makes querying metadata very flexible and powerful. Fastq files associated with query results can be downloaded easily for local analysis. The package also includes an interface from R to a popular genome browser, the Integrative Genomics Viewer. Conclusions: The SRAdb Bioconductor package provides a convenient and integrated framework to query and access SRA metadata quickly and powerfully from within R. PMID:23323543
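
    The abstract above describes storing metadata in a SQLite database and searching it by keyword. The sketch below illustrates that general idea with Python's standard `sqlite3` module rather than the SRAdb R package itself. The table name `sra_meta_demo`, its columns, and the accession values are all hypothetical and do not reflect the real SRAdb schema.

```python
import sqlite3

# In-memory database with a made-up, simplified metadata table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sra_meta_demo ("
    "run_accession TEXT, platform TEXT, study_title TEXT)"
)
conn.executemany(
    "INSERT INTO sra_meta_demo VALUES (?, ?, ?)",
    [
        ("RUN0001", "ILLUMINA", "Breast cancer exome study"),
        ("RUN0002", "LS454", "Rumen microbiome survey"),
        ("RUN0003", "ILLUMINA", "Soil microbiome time series"),
    ],
)

def search_runs(connection, keyword):
    """Return run accessions whose study title contains `keyword`."""
    rows = connection.execute(
        "SELECT run_accession FROM sra_meta_demo "
        "WHERE study_title LIKE ? ORDER BY run_accession",
        (f"%{keyword}%",),
    )
    return [row[0] for row in rows]

hits = search_runs(conn, "microbiome")
print(hits)
```

    A plain `LIKE` match is used here for portability; a true full-text index, as the abstract describes, would behave differently on multi-word queries.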

  9. Next-generation sequencing technology in clinical virology.

    PubMed

    Capobianchi, M R; Giombini, E; Rozera, G

    2013-01-01

    Recent advances in nucleic acid sequencing technologies, referred to as 'next-generation' sequencing (NGS), have produced a true revolution and opened new perspectives for research and diagnostic applications, owing to the high speed and throughput of data generation. So far, NGS has been applied to metagenomics-based strategies for the discovery of novel viruses and the characterization of viral communities. Additional applications include whole viral genome sequencing, detection of viral genome variability, and the study of viral dynamics. These applications are particularly suitable for viruses such as human immunodeficiency virus, hepatitis B virus, and hepatitis C virus, whose error-prone replication machinery, combined with the high replication rate, results, in each infected individual, in the formation of many genetically related viral variants referred to as quasi-species. The viral quasi-species, in turn, represents the substrate for the selective pressure exerted by the immune system or by antiviral drugs. With traditional approaches, it is difficult to detect and quantify minority genomes present in viral quasi-species that, in fact, may have biological and clinical relevance. NGS provides, for each patient, a dataset of clonal sequences that is some orders of magnitude larger than those obtained with conventional approaches. Hence, NGS is an extremely powerful tool with which to investigate previously inaccessible aspects of viral dynamics, such as the contribution of different viral reservoirs to replicating virus in the course of the natural history of the infection, co-receptor usage in minority viral populations harboured by different cell lineages, the dynamics of development of drug resistance, and the re-emergence of hidden genomes after treatment interruptions. The diagnostic application of NGS is just around the corner. PMID:23279287

  10. Using next generation transcriptome sequencing to predict an ectomycorrhizal metabolome

    PubMed Central

    2011-01-01

    Background: Mycorrhizae, symbiotic interactions between soil fungi and tree roots, are ubiquitous in terrestrial ecosystems. The fungi contribute phosphorus, nitrogen and mobilized nutrients from organic matter in the soil and in return the fungus receives photosynthetically-derived carbohydrates. This union of plant and fungal metabolisms is the mycorrhizal metabolome. Understanding this symbiotic relationship at a molecular level provides important contributions to the understanding of forest ecosystems and global carbon cycling. Results: We generated next generation short-read transcriptomic sequencing data from fully-formed ectomycorrhizae between Laccaria bicolor and aspen (Populus tremuloides) roots. The transcriptomic data was used to identify statistically significantly expressed gene models using a bootstrap-style approach, and these expressed genes were mapped to specific metabolic pathways. Integration of expressed genes that code for metabolic enzymes and the set of expressed membrane transporters generates a predictive model of the ectomycorrhizal metabolome. The generated model of the mycorrhizal metabolome predicts that the specific compounds glycine, glutamate, and allantoin are synthesized by L. bicolor and that these compounds or their metabolites may be used for the benefit of aspen in exchange for the photosynthetically-derived sugars fructose and glucose. Conclusions: The analysis illustrates an approach to generate testable biological hypotheses to investigate the complex molecular interactions that drive ectomycorrhizal symbiosis. These models are consistent with experimental environmental data and provide insight into the molecular exchange processes for organisms in this complex ecosystem. The method used here for predicting metabolomic models of mycorrhizal systems from deep RNA sequencing data can be generalized and is broadly applicable to transcriptomic data derived from complex systems. PMID:21569493
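
The abstract does not spell out its bootstrap-style procedure, so the following Python sketch is only one plausible reading, under stated assumptions: reads are resampled with replacement, and a gene model is called expressed when it reaches a minimum read count in a large fraction of the resamples. The function name, gene names, pathway labels, and all thresholds are hypothetical and are not the authors' method or data.

```python
import random
from collections import Counter


def bootstrap_expressed(read_labels, genes, n_boot=200, min_count=5,
                        support=0.95, seed=0):
    """Call each gene expressed or not by resampling reads with replacement.

    read_labels: one label per read (the gene it maps to, or "unassigned").
    A gene is called expressed if it gets at least `min_count` reads in at
    least a `support` fraction of the `n_boot` resamples. All defaults are
    illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_reads = len(read_labels)
    hits = Counter()
    for _ in range(n_boot):
        sample = Counter(rng.choices(read_labels, k=n_reads))
        for gene in genes:
            if sample[gene] >= min_count:
                hits[gene] += 1
    return {gene: hits[gene] / n_boot >= support for gene in genes}


# Hypothetical toy data: geneA is well covered, geneB barely, geneC not at all.
reads = ["geneA"] * 200 + ["geneB"] * 3 + ["unassigned"] * 797
calls = bootstrap_expressed(reads, ["geneA", "geneB", "geneC"])

# Hypothetical gene-to-pathway table, standing in for a real annotation source.
pathway_of = {"geneA": "glycine biosynthesis", "geneB": "allantoin metabolism"}
expressed_pathways = sorted({pathway_of[g] for g, ok in calls.items()
                             if ok and g in pathway_of})
print(calls)
print(expressed_pathways)
```

Requiring support across resamples, rather than a single count threshold, keeps genes with only a few stray reads from entering the pathway model.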