Science.gov

Sample records for generation sequencing platforms

  1. Next-Generation Sequencing Platforms

    NASA Astrophysics Data System (ADS)

    Mardis, Elaine R.

    2013-06-01

    Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.

  2. Toward Complete Bacterial Genome Sequencing Through the Combined Use of Multiple Next-Generation Sequencing Platforms.

    PubMed

    Jeong, Haeyoung; Lee, Dae-Hee; Ryu, Choong-Min; Park, Seung-Hwan

    2016-01-01

    PacBio's long-read sequencing technologies can be successfully used for a complete bacterial genome assembly using recently developed non-hybrid assemblers in the absence of second-generation, high-quality short reads. However, standardized procedures that take into account multiple pre-existing second-generation sequencing platforms are scarce. In addition to Illumina HiSeq and Ion Torrent PGM-based genome sequencing results derived from previous studies, we generated further sequencing data, including from the PacBio RS II platform, and applied various bioinformatics tools to obtain complete genome assemblies for five bacterial strains. Our approach revealed that the hierarchical genome assembly process (HGAP) non-hybrid assembler resulted in nearly complete assemblies at a moderate coverage of ~75x, but that different versions produced non-compatible results requiring post-processing. The other two platforms further improved the PacBio assembly through scaffolding and a final error correction. PMID:26464377

  3. Use of Four Next-Generation Sequencing Platforms to Determine HIV-1 Coreceptor Tropism

    PubMed Central

    Henry, Kenneth; Winner, Dane; Gibson, Richard; Lee, Lawrence; Paxinos, Ellen; Arts, Eric J.; Robertson, David L.; Mimms, Larry; Quiñones-Mateu, Miguel E.

    2012-01-01

    HIV-1 coreceptor tropism assays are required to rule out the presence of CXCR4-tropic (non-R5) viruses prior to treatment with CCR5 antagonists. Phenotypic (e.g., Trofile™, Monogram Biosciences) and genotypic (e.g., population sequencing linked to bioinformatic algorithms) assays are the most widely used. Although several next-generation sequencing (NGS) platforms are available, to date all published deep sequencing HIV-1 tropism studies have used the 454™ Life Sciences/Roche platform. In this study, HIV-1 co-receptor usage was predicted for twelve patients scheduled to start a maraviroc-based antiretroviral regimen. The V3 region of the HIV-1 env gene was sequenced using four NGS platforms: 454™, PacBio® RS (Pacific Biosciences), Illumina®, and Ion Torrent™ (Life Technologies). Cross-platform variation was evaluated, including number of reads, read length and error rates. HIV-1 tropism was inferred using Geno2Pheno, Web PSSM, and the 11/24/25 rule and compared with Trofile™ and virologic response to antiretroviral therapy. Error rates related to insertions/deletions (indels) and nucleotide substitutions introduced by the four NGS platforms were low compared to the actual HIV-1 sequence variation. Each platform detected all major virus variants within the HIV-1 population with similar frequencies. Identification of non-R5 viruses was comparable among the four platforms, with minor differences attributable to the algorithms used to infer HIV-1 tropism. All NGS platforms showed similar concordance with virologic response to the maraviroc-based regimen (75% to 80% range depending on the algorithm used), compared to Trofile (80%) and population sequencing (70%). In conclusion, all four NGS platforms were able to detect minority non-R5 variants at comparable levels, suggesting that any NGS-based method can be used to predict HIV-1 coreceptor usage. PMID:23166726

  4. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers

    PubMed Central

    2012-01-01

    Background: Next generation sequencing (NGS) technology has revolutionized genomic and genetic research. The pace of change in this area is rapid with three major new sequencing platforms having been released in 2011: Ion Torrent’s PGM, Pacific Biosciences’ RS and the Illumina MiSeq. Here we compare the results obtained with those platforms to the performance of the Illumina HiSeq, the current market leader. In order to compare these platforms, and get sufficient coverage depth to allow meaningful analysis, we have sequenced a set of 4 microbial genomes with mean GC content ranging from 19.3 to 67.7%. Together, these represent a comprehensive range of genome content. Here we report our analysis of that sequence data in terms of coverage distribution, bias, GC distribution, variant detection and accuracy. Results: Sequence generated by Ion Torrent, MiSeq and Pacific Biosciences technologies displays near perfect coverage behaviour on GC-rich, neutral and moderately AT-rich genomes, but a profound bias was observed upon sequencing the extremely AT-rich genome of Plasmodium falciparum on the PGM, resulting in no coverage for approximately 30% of the genome. We analysed the ability to call variants from each platform and found that we could call slightly more variants from Ion Torrent data compared to MiSeq data, but at the expense of a higher false positive rate. Variant calling from Pacific Biosciences data was possible but higher coverage depth was required. Context specific errors were observed in both PGM and MiSeq data, but not in that from the Pacific Biosciences platform. Conclusions: All three fast turnaround sequencers evaluated here were able to generate usable sequence. However there are key differences between the quality of that data and the applications it will support. PMID:22827831
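
    The GC content figures quoted in this abstract (19.3 to 67.7%) are the percentage of G and C bases in a genome. A minimal Python sketch of that calculation follows; the function name and the toy sequence are illustrative assumptions, not data or code from the study.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among the A, C, G and T bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # ambiguous bases such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


toy_sequence = "ATATATGCAT"  # hypothetical AT-rich example, not a real genome
print(f"{gc_content(toy_sequence):.1f}% GC")  # prints "20.0% GC"
```

    Ignoring ambiguous bases in the denominator is one possible convention; other tools may count them differently.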

  5. FLEXBAR—Flexible Barcode and Adapter Processing for Next-Generation Sequencing Platforms

    PubMed Central

    Dodt, Matthias; Roehr, Johannes T.; Ahmed, Rina; Dieterich, Christoph

    2012-01-01

    Quantitative and systems biology approaches benefit from the unprecedented depth of next-generation sequencing. A typical experiment yields millions of short reads, which oftentimes carry particular sequence tags. These tags may be: (a) specific to the sequencing platform and library construction method (e.g., adapter sequences); (b) introduced by experimental design (e.g., sample barcodes); or (c) some biological signal (e.g., splice leader sequences in nematodes). Our software FLEXBAR enables accurate recognition, sorting and trimming of sequence tags with maximal flexibility, based on exact overlap sequence alignment. The software supports data formats from all current sequencing platforms, including color-space reads. FLEXBAR maintains read pairings and processes separate barcode reads on demand. Our software facilitates the fine-grained adjustment of sequence tag detection parameters and search regions. FLEXBAR is multi-threaded software and combines speed with precision. Even complex read processing scenarios might be executed with a single command line call. We demonstrate the utility of the software in terms of read mapping applications, library demultiplexing and splice leader detection. FLEXBAR and additional information are available for academic use from the website: http://sourceforge.net/projects/flexbar/. PMID:24832523
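
    To illustrate what trimming a sequence tag by overlap means, here is a minimal Python sketch. It is not FLEXBAR's algorithm: it uses a simplified, ungapped comparison with a mismatch tolerance, and the function name, parameters, read and adapter are all hypothetical.

```python
def trim_3prime_adapter(read: str, adapter: str, min_overlap: int = 3,
                        max_mismatch_rate: float = 0.1) -> str:
    """Remove a 3' adapter (possibly truncated by the read end) from a read.

    Every start position is tried from left to right. The read suffix is
    compared, without gaps, to the adapter prefix of the same length, and
    the read is cut at the first position within the mismatch tolerance.
    """
    for start in range(len(read) - min_overlap + 1):
        overlap = min(len(adapter), len(read) - start)
        mismatches = sum(
            1 for r, a in zip(read[start:start + overlap], adapter[:overlap])
            if r != a
        )
        if mismatches <= max_mismatch_rate * overlap:
            return read[:start]
    return read  # no adapter found, read returned unchanged


adapter = "AGATCGGAAGAGC"              # hypothetical adapter sequence
read = "ACGTACGTTTGACC" + adapter[:10]  # insert followed by a partial adapter
print(trim_3prime_adapter(read, adapter))  # prints "ACGTACGTTTGACC"
```

    A gapped alignment, as used by dedicated tools, would also tolerate insertions and deletions inside the tag; the ungapped version above only handles substitutions.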

  6. FLEXBAR-Flexible Barcode and Adapter Processing for Next-Generation Sequencing Platforms.

    PubMed

    Dodt, Matthias; Roehr, Johannes T; Ahmed, Rina; Dieterich, Christoph

    2012-01-01

    Quantitative and systems biology approaches benefit from the unprecedented depth of next-generation sequencing. A typical experiment yields millions of short reads, which oftentimes carry particular sequence tags. These tags may be: (a) specific to the sequencing platform and library construction method (e.g., adapter sequences); (b) introduced by experimental design (e.g., sample barcodes); or (c) some biological signal (e.g., splice leader sequences in nematodes). Our software FLEXBAR enables accurate recognition, sorting and trimming of sequence tags with maximal flexibility, based on exact overlap sequence alignment. The software supports data formats from all current sequencing platforms, including color-space reads. FLEXBAR maintains read pairings and processes separate barcode reads on demand. Our software facilitates the fine-grained adjustment of sequence tag detection parameters and search regions. FLEXBAR is multi-threaded software and combines speed with precision. Even complex read processing scenarios might be executed with a single command line call. We demonstrate the utility of the software in terms of read mapping applications, library demultiplexing and splice leader detection. FLEXBAR and additional information are available for academic use from the website: http://sourceforge.net/projects/flexbar/. PMID:24832523

  7. Preparation of Fragment Libraries for Next-Generation Sequencing on the Applied Biosystems SOLiD Platform

    PubMed Central

    Yegnasubramanian, Srinivasan

    2014-01-01

    The primary purpose of this protocol is to prepare genomic DNA libraries that can then be analyzed by massively parallel next-generation sequencing on the Applied Biosystems SOLiD platform. This protocol can be adapted to next-generation sequencing workflows to ultimately generate up to 1 billion 50 bp sequence tags from the ends of each of the DNA molecules in the library in a single next-generation sequencing run. PMID:24011046

  8. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses

    PubMed Central

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-01-01

    The growing deluge of genome data poses significant challenges for storing and processing large-scale data sets, for easy access to biomedical analysis tools, and for efficient data sharing and retrieval. Because variable data volumes lead to variable computing and storage requirements, biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analysis tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as a performance evaluation are presented to validate the feasibility of the proposed approach. PMID:24462600

  9. A Microfluidic DNA Library Preparation Platform for Next-Generation Sequencing

    PubMed Central

    Sinha, Anupama; Bent, Zachary W.; Solberg, Owen D.; Williams, Kelly P.; Langevin, Stanley A.; Renzi, Ronald F.; Van De Vreugde, James L.; Meagher, Robert J.; Schoeniger, Joseph S.; Lane, Todd W.; Branda, Steven S.; Bartsch, Michael S.; Patel, Kamlesh D.

    2013-01-01

    Next-generation sequencing (NGS) is emerging as a powerful tool for elucidating genetic information for a wide range of applications. Unfortunately, the surging popularity of NGS has not yet been accompanied by an improvement in automated techniques for preparing formatted sequencing libraries. To address this challenge, we have developed a prototype microfluidic system for preparing sequencer-ready DNA libraries for analysis by Illumina sequencing. Our system combines droplet-based digital microfluidic (DMF) sample handling with peripheral modules to create a fully-integrated, sample-in library-out platform. In this report, we use our automated system to prepare NGS libraries from samples of human and bacterial genomic DNA. E. coli libraries prepared on-device from 5 ng of total DNA yielded excellent sequence coverage over the entire bacterial genome, with >99% alignment to the reference genome, even genome coverage, and good quality scores. Furthermore, we produced a de novo assembly on a previously unsequenced multi-drug resistant Klebsiella pneumoniae strain BAA-2146 (KpnNDM). The new method described here is fast, robust, scalable, and automated. Our device for library preparation will assist in the integration of NGS technology into a wide variety of laboratories, including small research laboratories and clinical laboratories. PMID:23894387

  10. Clinical analysis of genome next-generation sequencing data using the Omicia platform

    PubMed Central

    Coonrod, Emily M; Margraf, Rebecca L; Russell, Archie; Voelkerding, Karl V; Reese, Martin G

    2013-01-01

    Aims: Next-generation sequencing is being implemented in the clinical laboratory environment for the purposes of candidate causal variant discovery in patients affected with a variety of genetic disorders. The successful implementation of this technology for diagnosing genetic disorders requires a rapid, user-friendly method to annotate variants and generate short lists of clinically relevant variants of interest. This report describes Omicia’s Opal platform, a new software tool designed for variant discovery and interpretation in a clinical laboratory environment. The software allows clinical scientists to process, analyze, interpret and report on personal genome files. Materials & Methods: To demonstrate the software, the authors describe the interactive use of the system for the rapid discovery of disease-causing variants using three cases. Results & Conclusion: Here, the authors show the features of the Opal system and their use in uncovering variants of clinical significance. PMID:23895124

  11. StatsDB: platform-agnostic storage and understanding of next generation sequencing run metrics.

    PubMed

    Ramirez-Gonzalez, Ricardo H; Leggett, Richard M; Waite, Darren; Thanki, Anil; Drou, Nizar; Caccamo, Mario; Davey, Robert

    2013-01-01

    Modern sequencing platforms generate enormous quantities of data in ever-decreasing amounts of time. Additionally, techniques such as multiplex sequencing allow one run to contain hundreds of different samples. With such data comes a significant challenge to understand its quality and to understand how the quality and yield are changing across instruments and over time. As well as the desire to understand historical data, sequencing centres often have a duty to provide clear summaries of individual run performance to collaborators or customers. We present StatsDB, an open-source software package for storage and analysis of next generation sequencing run metrics. The system has been designed for incorporation into a primary analysis pipeline, either at the programmatic level or via integration into existing user interfaces. Statistics are stored in an SQL database and APIs provide the ability to store and access the data while abstracting the underlying database design. This abstraction allows simpler, wider querying across multiple fields than is possible by the manual steps and calculation required to dissect individual reports, e.g. "provide metrics about nucleotide bias in libraries using adaptor barcode X, across all runs on sequencer A, within the last month". The software is supplied with modules for storage of statistics from FastQC, a commonly used tool for analysis of sequence reads, but the open nature of the database schema means it can be easily adapted to other tools. Currently at The Genome Analysis Centre (TGAC), reports are accessed through our LIMS system or through a standalone GUI tool, but the API and supplied examples make it easy to develop custom reports and to interface with other packages. PMID:24627795
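
    The example question quoted in this abstract (metrics for barcode X on sequencer A within the last month) maps naturally onto a filtered SQL aggregate. The Python sketch below uses the standard sqlite3 module with an invented single-table schema; the table name, column names, helper function and values are assumptions for illustration and do not reflect the actual StatsDB schema or API.

```python
import sqlite3

# Hypothetical, simplified schema: one row per run-level metric value.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE run_metric (
           run_date TEXT, instrument TEXT, barcode TEXT,
           metric TEXT, value REAL)"""
)
rows = [
    ("2013-05-02", "sequencer_A", "barcode_X", "percent_gc", 41.0),
    ("2013-05-20", "sequencer_A", "barcode_X", "percent_gc", 43.0),
    ("2013-03-11", "sequencer_A", "barcode_X", "percent_gc", 55.0),  # too old
    ("2013-05-15", "sequencer_B", "barcode_X", "percent_gc", 60.0),  # other instrument
    ("2013-05-18", "sequencer_A", "barcode_Y", "percent_gc", 39.0),  # other barcode
]
conn.executemany("INSERT INTO run_metric VALUES (?, ?, ?, ?, ?)", rows)


def mean_metric(conn, metric, instrument, barcode, since, until):
    """Return (mean value, number of rows) for one metric, filtered by
    instrument, barcode and an inclusive ISO date range."""
    cur = conn.execute(
        """SELECT AVG(value), COUNT(*) FROM run_metric
           WHERE metric = ? AND instrument = ? AND barcode = ?
             AND run_date BETWEEN ? AND ?""",
        (metric, instrument, barcode, since, until),
    )
    return cur.fetchone()


mean_gc, n_runs = mean_metric(
    conn, "percent_gc", "sequencer_A", "barcode_X", "2013-05-01", "2013-05-31"
)
print(mean_gc, n_runs)  # prints "42.0 2"
```

    Storing dates as ISO 8601 text keeps the range filter correct under plain string comparison, which is why BETWEEN works here without a date type.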

  12. Next-Generation Sequencing Workflow for NSCLC Critical Samples Using a Targeted Sequencing Approach by Ion Torrent PGM™ Platform

    PubMed Central

    Vanni, Irene; Coco, Simona; Truini, Anna; Rusmini, Marta; Dal Bello, Maria Giovanna; Alama, Angela; Banelli, Barbara; Mora, Marco; Rijavec, Erika; Barletta, Giulia; Genova, Carlo; Biello, Federica; Maggioni, Claudia; Grossi, Francesco

    2015-01-01

    Next-generation sequencing (NGS) is a cost-effective technology capable of screening several genes simultaneously; however, its application in a clinical context requires an established workflow to acquire reliable sequencing results. Here, we report an optimized NGS workflow analyzing 22 lung cancer-related genes to sequence critical samples such as DNA from formalin-fixed paraffin-embedded (FFPE) blocks and circulating free DNA (cfDNA). Snap frozen and matched FFPE gDNA from 12 non-small cell lung cancer (NSCLC) patients, whose gDNA fragmentation status was previously evaluated using a multiplex PCR-based quality control, were successfully sequenced with Ion Torrent PGM™. The robust bioinformatic pipeline allowed us to correctly call both Single Nucleotide Variants (SNVs) and indels with a detection limit of 5%, achieving 100% specificity and 96% sensitivity. This workflow was also validated in 13 FFPE NSCLC biopsies. Furthermore, a specific protocol for low input gDNA capable of producing good sequencing data with high coverage, high uniformity, and a low error rate was also optimized. In conclusion, we demonstrate the feasibility of obtaining gDNA from FFPE samples suitable for NGS by performing appropriate quality controls. The optimized workflow, capable of screening low input gDNA, highlights NGS as a potential tool in the detection, disease monitoring, and treatment of NSCLC. PMID:26633390

  13. A platform for leveraging next generation sequencing for routine microbiology and public health use.

    PubMed

    Rusu, Laura I; Wyres, Kelly L; Reumann, Matthias; Queiroz, Carlos; Bojovschi, Alexe; Conway, Tom; Garg, Saurabh; Edwards, David J; Hogg, Geoff; Holt, Kathryn E

    2015-01-01

    Even with the advent of next-generation sequencing (NGS) technologies, which have revolutionised the field of bacterial genomics in recent years, a major barrier still exists to the implementation of NGS for routine microbiological use (in public health and clinical microbiology laboratories). Such routine use would make a big difference to investigations of pathogen transmission and prevention/control of (sometimes lethal) infections. The inherent complexity and high frequency of data analyses on very large sets of bacterial DNA sequence data, the ability to ensure data provenance and automatically track and log all analyses for audit purposes, the need for quick and accurate results, together with an essential user-friendly interface for regular non-technical laboratory staff, are all critical requirements for routine use in a public health setting. No current system meets all of these requirements in an integrated manner. In this paper, we describe a system for sequence analysis and interpretation that is highly automated and tackles the issues raised earlier, and that is designed for use in diagnostic laboratories by healthcare workers with no specialist bioinformatics knowledge. PMID:25870761

  14. Towards a Next-Generation Sequencing Diagnostic Service for Tumour Genotyping: A Comparison of Panels and Platforms.

    PubMed

    Burghel, George J; Hurst, Carolyn D; Watson, Christopher M; Chambers, Phillip A; Dickinson, Helen; Roberts, Paul; Knowles, Margaret A

    2015-01-01

    Detection of clinically actionable mutations in diagnostic tumour specimens aids in the selection of targeted therapeutics. With an ever increasing number of clinically significant mutations identified, tumour genetic diagnostics is moving from single to multigene analysis. As it is still not feasible for routine diagnostic laboratories to perform sequencing of the entire cancer genome, our approach was to undertake targeted mutation detection. To optimise our diagnostic workflow, we evaluated three target enrichment strategies using two next-generation sequencing (NGS) platforms (Illumina MiSeq and Ion PGM). The target enrichment strategies were Fluidigm Access Array custom amplicon panel including 13 genes (MiSeq sequencing), the Oxford Gene Technologies (OGT) SureSeq Solid Tumour hybridisation panel including 60 genes (MiSeq sequencing), and an Ion AmpliSeq Cancer Hotspot Panel including 50 genes (Ion PGM sequencing). DNA extracted from formalin-fixed paraffin-embedded (FFPE) blocks of eight previously characterised cancer cell lines was tested using the three panels. Matching genomic DNA from fresh cultures of these cell lines was also tested using the custom Fluidigm panel and the OGT SureSeq Solid Tumour panel. Each panel allowed mutation detection of core cancer genes including KRAS, BRAF, and EGFR. Our results indicate that the panels enable accurate variant detection despite sequencing from FFPE DNA. PMID:26351634

  15. Performance Comparison of Illumina and Ion Torrent Next-Generation Sequencing Platforms for 16S rRNA-Based Bacterial Community Profiling

    PubMed Central

    Kawashima, Toana; Rosenthal, Christopher; Hoogestraat, Daniel R.; Cummings, Lisa A.; Sengupta, Dhruba J.; Harkins, Timothy T.; Cookson, Brad T.

    2014-01-01

    High-throughput sequencing of the taxonomically informative 16S rRNA gene provides a powerful approach for exploring microbial diversity. Here we compare the performances of two common “benchtop” sequencing platforms, Illumina MiSeq and Ion Torrent Personal Genome Machine (PGM), for bacterial community profiling by 16S rRNA (V1-V2) amplicon sequencing. We benchmarked performance by using a 20-organism mock bacterial community and a collection of primary human specimens. We observed comparatively higher error rates with the Ion Torrent platform and report a pattern of premature sequence truncation specific to semiconductor sequencing. Read truncation was dependent on both the directionality of sequencing and the target species, resulting in organism-specific biases in community profiles. We found that these sequencing artifacts could be minimized by using bidirectional amplicon sequencing and an optimized flow order on the Ion Torrent platform. Results of bacterial community profiling performed on the mock community and a collection of 18 human-derived microbiological specimens were generally in good agreement for both platforms; however, in some cases, results differed significantly. Disparities could be attributed to the failure to generate full-length reads for particular organisms on the Ion Torrent platform, organism-dependent differences in sequence error rates affecting classification of certain species, or some combination of these factors. This study demonstrates the potential for differential bias in bacterial community profiles resulting from the choice of sequencing platform alone. PMID:25261520

  16. A comprehensive transcriptome assembly of pigeonpea (Cajanus cajan L.) using Sanger and second-generation sequencing platforms

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A comprehensive transcriptome assembly for pigeonpea has been developed by analyzing 128.9 million short Illumina GA IIx single end reads, 2.19 million single end FLX/454 reads, and 18,353 Sanger expressed sequence tags (ESTs) from more than 16 genotypes. The resultant transcriptome assembly, refer...

  17. Evaluation and comparison of two commercially available targeted next-generation sequencing platforms to assist oncology decision making

    PubMed Central

    Weiss, Glen J; Hoff, Brandi R; Whitehead, Robert P; Sangal, Ashish; Gingrich, Susan A; Penny, Robert J; Mallery, David W; Morris, Scott M; Thompson, Eric J; Loesch, David M; Khemka, Vivek

    2015-01-01

    Background: It is widely acknowledged that there is value in examining cancers for genomic aberrations via next-generation sequencing (NGS). How commercially available NGS platforms compare with each other, and the clinical utility of the reported actionable results, are not well known. During the course of the current study, the Foundation One (F1) test generated data on a combination of somatic mutations, insertion and deletion polymorphisms, chromosomal abnormalities, and deoxyribonucleic acid (DNA) copy number changes at ~250× coverage, while the Paradigm Cancer Diagnostic (PCDx) test generated the same type of data at >5,000× coverage, plus provided messenger RNA (mRNA) expression levels. We sought to compare and evaluate paired formalin-fixed paraffin-embedded tumor tissue using these two platforms. Methods: Samples from patients with advanced solid tumors were submitted to both the F1 and PCDx vendors for NGS analysis. Turnaround time (TAT) was calculated. Biomarkers were considered clinically actionable if they had a published association with treatment response in humans and were assigned to the following categories: commercially available drug (CA), clinical trial drug (CT), or neither option (hereafter referred to as “None”). Results: The demographics of the 21 unique patient tumor samples included ten men and eleven women, with a median age of 56 years. Due to insufficient archival tissue from the same collection period, in one case, we used samples from different collections. PCDx reported first results faster than F1 in 20 cases. When received at both vendors on the same day, PCDx reported first results for 14 of 15 cases, with a median TAT of 9 days earlier than F1 (P<0.0001). Categorization of CA compared to CT and none significantly favored PCDx (P=0.012). Conclusion: In the current analysis, commercially available NGS platforms provided clinically relevant actionable targets (CA or CT) in 47%–67% of diverse cancer types. In the samples

  18. Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo) genome assembly and analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next-generation sequencing technologies were used to rapidly and efficiently sequence the genome of the domestic turkey (Meleagris gallopavo). The current genome assembly (~1.1 Gb) includes 917 Mb of sequence assigned to chromosomes. Innate heterozygosity of the sequenced bird allowed discovery of...

  19. Parallel tagged amplicon sequencing of transcriptome-based genetic markers for Triturus newts with the Ion Torrent next-generation sequencing platform

    PubMed Central

    Wielstra, B; Duijm, E; Lagler, P; Lammers, Y; Meilink, W R M; Ziermann, J M; Arntzen, J W

    2014-01-01

    Next-generation sequencing is a fast and cost-effective way to obtain sequence data for nonmodel organisms for many markers and for many individuals. We describe a protocol through which we obtain orthologous markers for the crested newts (Amphibia: Salamandridae: Triturus), suitable for analysis of interspecific hybridization. We use transcriptome data of a single Triturus species and design 96 primer pairs that amplify c. 180 bp fragments positioned in 3-prime untranslated regions. Next, these markers are tested with uniplex PCR for a set of species spanning the taxonomical width of the genus Triturus. The 52 markers that consistently show a single band of expected length at gel electrophoresis for all tested crested newt species are then amplified in five multiplex PCRs (with a plexity of ten or eleven) for 132 individual newts: a set of 84 representing the seven (candidate) species and a set of 48 from a presumed hybrid population. After pooling multiplexes per individual, unique tags are ligated to link amplicons to individuals. Subsequently, individuals are pooled equimolar and sequenced on the Ion Torrent next-generation sequencing platform. A bioinformatics pipeline identifies the alleles and recodes these to a genotypic format. Next, we test the utility of our markers. BAPS allocates the 84 crested newt individuals representing (candidate) species to their expected (candidate) species, confirming the markers are suitable for species delineation. NewHybrids, a hybrid index and HIest confirm the 48 individuals from the presumed hybrid population to be genetically admixed, illustrating the potential of the markers to identify interspecific hybridization. We expect the set of markers we designed to provide a high resolving power for analysis of hybridization in Triturus. PMID:24571307
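
    The multiplex layout described in this abstract (52 markers in five multiplex PCRs with a plexity of ten or eleven) is what an even split produces. The Python sketch below only checks that arithmetic; the helper name and marker labels are hypothetical, and the authors' actual assignment of markers to multiplexes is not given in the abstract.

```python
def split_into_pools(items, n_pools):
    """Split items into n_pools consecutive groups whose sizes differ by at most one."""
    base, extra = divmod(len(items), n_pools)
    pools, start = [], 0
    for i in range(n_pools):
        size = base + (1 if i < extra else 0)  # the first `extra` pools get one more
        pools.append(items[start:start + size])
        start += size
    return pools


markers = [f"marker_{i:02d}" for i in range(1, 53)]  # 52 placeholder marker names
pools = split_into_pools(markers, 5)
print([len(pool) for pool in pools])  # prints "[11, 11, 10, 10, 10]"
```

    Two pools of eleven and three of ten account for all 52 markers, consistent with the plexity stated above.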

  20. Multi-platform and cross-methodological reproducibility of transcriptome profiling by RNA-seq in the ABRF Next-Generation Sequencing Study

    PubMed Central

    Nicolet, Charles M.; Grove, Deborah; Levy, Shawn; Farmerie, William; Viale, Agnes; Wright, Chris; Schweitzer, Peter A.; Gao, Yuan; Kim, Dewey; Boland, Joe; Hicks, Belynda; Kim, Ryan; Chhangawala, Sagar; Jafari, Nadereh; Raghavachari, Nalini; Gandara, Jorge; Garcia-Reyero, Natàlia; Hendrickson, Cynthia; Roberson, David; Rosenfeld, Jeffrey; Smith, Todd; Underwood, Jason G.; Wang, May; Zumbo, Paul; Baldwin, Don A.; Grills, George S.; Mason, Christopher E.

    2014-01-01

    High-throughput RNA sequencing (RNA-seq) dramatically expands the potential for novel genomics discoveries, but the wide variety of platforms, protocols and performance has created the need for comprehensive reference data. Here we describe the Association of Biomolecular Resource Facilities next-generation sequencing (ABRF-NGS) study on RNA-seq. We tested replicate experiments across 15 laboratory sites using reference RNA standards to test four protocols (polyA-selected, ribo-depleted, size-selected and degraded) on five sequencing platforms (Illumina HiSeq, Life Technologies’ PGM and Proton, Pacific Biosciences RS and Roche’s 454). The results show high intra-platform and inter-platform concordance for expression measures across the deep-count platforms, but highly variable efficiency and cost for splice junction and variant detection between all platforms. These data also demonstrate that ribosomal RNA depletion can both enable effective analysis of degraded RNA samples and be readily compared to polyA-enriched fractions. This study provides a broad foundation for cross-platform standardization, evaluation and improvement of RNA-seq. PMID:25150835

  1. Efficacy of a 3rd generation high-throughput sequencing platform for analyses of 16S rRNA genes from environmental samples.

    PubMed

    Mosher, Jennifer J; Bernberg, Erin L; Shevchenko, Olga; Kan, Jinjun; Kaplan, Louis A

    2013-11-01

    Longer sequences of the bacterial 16S rRNA gene could provide greater phylogenetic and taxonomic resolution and advance knowledge of population dynamics within complex natural communities. We assessed the accuracy of Pacific Biosciences (PacBio) single-molecule, real-time (SMRT) sequencing, a promising 3rd generation high-throughput technique based on DNA polymerization, and compared it to the 2nd generation Roche 454 pyrosequencing platform. Amplicons of the 16S rRNA gene from a known isolate, Shewanella oneidensis MR-1, and environmental samples from two streambed habitats, rocks and sediments, and a riparian zone soil, were analyzed. On the PacBio we analyzed ~500 bp amplicons that covered the V1-V3 regions and the full 1500 bp amplicons of the V1-V9 regions. On the Roche 454 we analyzed the ~500 bp amplicons. Error rates associated with the isolate were lowest with the Roche 454 method (2%), increased by more than 2-fold for the 500 bp amplicons with the PacBio SMRT chip (4-5%), and by more than 8-fold for the full gene with the PacBio SMRT chip (17-18%). Higher error rates with the PacBio SMRT chip artificially inflated estimates of richness and lowered estimates of coverage for environmental samples. The 3rd generation sequencing technology we evaluated does not provide greater phylogenetic and taxonomic resolution for studies of microbial ecology. PMID:23999276
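
    The fold increases quoted in this abstract follow directly from the reported error rates. A short Python check of that arithmetic, using only the percentages stated above (the variable and function names are illustrative):

```python
# Error rates in percent, as reported in the abstract.
roche_454 = 2.0
pacbio_500bp = (4.0, 5.0)    # reported range for the ~500 bp amplicons
pacbio_full = (17.0, 18.0)   # reported range for the full-length gene


def fold_range(rates, baseline):
    """Return each rate divided by the baseline rate."""
    return tuple(rate / baseline for rate in rates)


print(fold_range(pacbio_500bp, roche_454))  # prints "(2.0, 2.5)"
print(fold_range(pacbio_full, roche_454))   # prints "(8.5, 9.0)"
```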

  2. An effective screening strategy for deafness in combination with a next-generation sequencing platform: a consecutive analysis

    PubMed Central

    Sakuma, Naoko; Moteki, Hideaki; Takahashi, Masahiro; Nishio, Shin-ya; Arai, Yasuhiro; Yamashita, Yukiko; Oridate, Nobuhiko; Usami, Shin-ichi

    2016-01-01

    The diagnosis of the genetic etiology of deafness contributes to the clinical management of patients. We performed the following four genetic tests in three stages for 52 consecutive deafness subjects in one facility. We used the Invader assay for 46 mutations in 13 genes and Sanger sequencing for the GJB2 gene or SLC26A4 gene in the first-stage test, the TaqMan genotyping assay in the second-stage test and targeted exon sequencing using massively parallel DNA sequencing in the third-stage test. Overall, we identified the genetic cause in 40% (21/52) of patients. The diagnostic rates of autosomal dominant, autosomal recessive and sporadic cases were 50%, 60% and 34%, respectively. When the sporadic cases with congenital and severe hearing loss were selected, the diagnostic rate rose to 48%. The combination approach using these genetic tests appears to be useful as a diagnostic tool for deafness patients. We recommended that genetic testing for the screening of common mutations in deafness genes using the Invader assay or TaqMan genotyping assay be performed as the initial evaluation. For the remaining undiagnosed cases, targeted exon sequencing using massively parallel DNA sequencing is clinically and economically beneficial. PMID:26763877

  3. Comprehensive transcriptome assembly of chickpea (Cicer arietinum L.) using Sanger and next generation sequencing platforms: development and applications

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A high-quality transcriptome assembly for chickpea has been developed using ~135 million Illumina single-end reads, 7.12 million single-end FLX/454 reads, and 139 thousand Sanger expressed sequence tags (ESTs). This hybrid transcriptome assembly, which we refer to as the "Cicer arietinum Transcripto...

  4. Primer ID Informs Next-Generation Sequencing Platforms and Reveals Preexisting Drug Resistance Mutations in the HIV-1 Reverse Transcriptase Coding Domain

    PubMed Central

    Keys, Jessica R.; Zhou, Shuntai; Anderson, Jeffrey A.; Eron, Joseph J.; Rackoff, Lauren A.; Jabara, Cassandra

    2015-01-01

    Sequencing of a bulk polymerase chain reaction (PCR) product to identify drug resistance mutations informs antiretroviral therapy selection but has limited sensitivity for minority variants. Alternatively, deep sequencing is capable of detecting minority variants but is subject to sequencing errors and PCR resampling due to low input templates. We screened for resistance mutations among 184 HIV-1-infected, therapy-naive subjects using the 454 sequencing platform to sequence two amplicons spanning HIV-1 reverse transcriptase codons 34–245. Samples from 19 subjects were also analyzed using the MiSeq sequencing platform for comparison. Errors and PCR resampling were addressed by tagging each HIV-1 RNA template copy (i.e., cDNA) with a unique sequence tag (Primer ID), allowing a consensus sequence to be constructed for each original template from resampled sequences. In control reactions, Primer ID reduced 454 and MiSeq errors from 71 to 2.6 and from 24 to 1.2 errors/10,000 nucleotides, respectively. MiSeq also allowed accurate sequencing of codon 65, an important drug resistance position embedded in a homopolymeric run that is poorly resolved by the 454 platform. Excluding homopolymeric positions, 14% of subjects had evidence of ≥1 resistance mutation among Primer ID consensus sequences, compared to 2.7% by bulk population sequencing. When calls were restricted to mutations that appeared twice among consensus sequence populations, 6% of subjects had detectable resistance mutations. The use of Primer ID revealed 5–15% template utilization on average, limiting the depth of deep sequencing sampling and revealing sampling variation due to low template utilization. Primer ID addresses important limitations of deep sequencing and produces less biased estimates of low-level resistance mutations in the viral population. PMID:25748056
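
    The consensus step described in this abstract (one consensus sequence per tagged template, built from its resampled reads) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `primer_id_consensus`, the `min_reads` threshold, the per-column majority vote, and the toy reads are all assumptions made for illustration, and reads sharing a tag are assumed to be pre-aligned to equal length.

```python
from collections import Counter, defaultdict


def primer_id_consensus(reads, min_reads=3):
    """Group (primer_id, sequence) pairs by tag and majority-vote each column.

    Hypothetical helper: assumes reads with the same tag are already aligned
    and of equal length. min_reads is an illustrative cutoff, not a value
    taken from the abstract.
    """
    groups = defaultdict(list)
    for primer_id, seq in reads:
        groups[primer_id].append(seq)

    consensus = {}
    for primer_id, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few resampled reads to outvote a sequencing error
        # Take the most common base at each aligned position.
        consensus[primer_id] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


# Toy data: tags and sequences are made up for the example.
reads = [
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACGAAC"),  # one read carries an error at the fourth base
    ("CCTA", "ACGTTC"),
    ("CCTA", "ACGTTC"),
    ("CCTA", "ACGTTC"),
    ("GGAT", "ACGTAC"),  # singleton tag, dropped by the min_reads cutoff
]
result = primer_id_consensus(reads)
```

    In this sketch the erroneous read under tag AAGT is outvoted by its two siblings, so each template contributes a single corrected sequence regardless of how many times PCR resampled it.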

  5. Primer ID Informs Next-Generation Sequencing Platforms and Reveals Preexisting Drug Resistance Mutations in the HIV-1 Reverse Transcriptase Coding Domain.

    PubMed

    Keys, Jessica R; Zhou, Shuntai; Anderson, Jeffrey A; Eron, Joseph J; Rackoff, Lauren A; Jabara, Cassandra; Swanstrom, Ronald

    2015-06-01

    Sequencing of a bulk polymerase chain reaction (PCR) product to identify drug resistance mutations informs antiretroviral therapy selection but has limited sensitivity for minority variants. Alternatively, deep sequencing is capable of detecting minority variants but is subject to sequencing errors and PCR resampling due to low input templates. We screened for resistance mutations among 184 HIV-1-infected, therapy-naive subjects using the 454 sequencing platform to sequence two amplicons spanning HIV-1 reverse transcriptase codons 34-245. Samples from 19 subjects were also analyzed using the MiSeq sequencing platform for comparison. Errors and PCR resampling were addressed by tagging each HIV-1 RNA template copy (i.e., cDNA) with a unique sequence tag (Primer ID), allowing a consensus sequence to be constructed for each original template from resampled sequences. In control reactions, Primer ID reduced 454 and MiSeq errors from 71 to 2.6 and from 24 to 1.2 errors/10,000 nucleotides, respectively. MiSeq also allowed accurate sequencing of codon 65, an important drug resistance position embedded in a homopolymeric run that is poorly resolved by the 454 platform. Excluding homopolymeric positions, 14% of subjects had evidence of ≥1 resistance mutation among Primer ID consensus sequences, compared to 2.7% by bulk population sequencing. When calls were restricted to mutations that appeared twice among consensus sequence populations, 6% of subjects had detectable resistance mutations. The use of Primer ID revealed 5-15% template utilization on average, limiting the depth of deep sequencing sampling and revealing sampling variation due to low template utilization. Primer ID addresses important limitations of deep sequencing and produces less biased estimates of low-level resistance mutations in the viral population. PMID:25748056

  6. Profile of bacterial communities in South African mine-water samples using Illumina next-generation sequencing platform.

    PubMed

    Keshri, Jitendra; Mankazana, Boitumelo B J; Momba, Maggy N B

    2015-04-01

    Mine water is an example of an extreme environment that contains a large number of diverse and specific bacteria. It is imperative to gain an understanding of these bacterial communities in order to develop effective strategies for the bioremediation of polluted aquatic systems. In this study, the high-throughput sequencing approach was used to characterize the bacterial communities in two different mine waters of South Africa: vanadium and gold mine water. Over 2629 operational taxonomic units (OTUs) were recovered from 15,802 reads of the 16S ribosomal RNA (rRNA) gene. They represented 8 phyla, 43 orders, 84 families and 105 genera. Proteobacteria and unclassified bacterial sequences were the most dominant. Apart from these, Firmicutes, Bacteroidetes, Actinobacteria, Candidate phylum OD1, Cyanobacteria, Verrucomicrobia and Deinococcus-Thermus were the recovered phyla, although their relative abundance differed between both the mine-water samples. Yet, diversity indices suggested that the bacterial communities inhabiting the vanadium mine water were more diverse than those in gold mine water. Interestingly, substantial percentages of the reads from either sample (58 % in vanadium and 17 % in gold mine water) could not be assigned to any phylum and remained unclassified, suggesting hitherto unidentified populations, and vast untapped microbial diversity. Overall, the results of this study exhibited bacterial community structures with high diversity in mine water, which can be explored further for their role in bioremediation and environmental management. PMID:25416590

  7. Direct Chloroplast Sequencing: Comparison of Sequencing Platforms and Analysis Tools for Whole Chloroplast Barcoding

    PubMed Central

    Brozynska, Marta; Furtado, Agnelo; Henry, Robert James

    2014-01-01

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis. PMID:25329378

  8. Regulation of next generation sequencing.

    PubMed

    Javitt, Gail H; Carner, Katherine Strong

    2014-01-01

    Next generation sequencing raises new questions within the context of an existing and still evolving regulatory landscape for device manufacturers and clinical laboratories. FDA cleared the first NGS sequencing platform in November 2013, but it is unclear what lies ahead for this technology. NGS will require new types of training and expertise to interpret the vast quantities of genetic data so as to provide meaningful clinical information to physicians and patients. This paper will describe the current regulatory landscape for NGS technologies, identify the regulatory challenges they present, and consider whether new regulatory paradigms are needed to accommodate NGS technologies and services. PMID:25298288

  9. Comprehensive transcriptome assembly of Chickpea (Cicer arietinum L.) using sanger and next generation sequencing platforms: development and applications.

    PubMed

    Kudapa, Himabindu; Azam, Sarwar; Sharpe, Andrew G; Taran, Bunyamin; Li, Rong; Deonovic, Benjamin; Cameron, Connor; Farmer, Andrew D; Cannon, Steven B; Varshney, Rajeev K

    2014-01-01

    A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags (ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as Cicer arietinum Transcriptome Assembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptomes/cicar/lista_cicar-201201), comprising 46,369 transcript assembly contigs (TACs) has an N50 length of 1,726 bp and a maximum contig size of 15,644 bp. Putative functions were determined for 32,869 (70.8%) of the TACs and gene ontology assignments were determined for 21,471 (46.3%). The new transcriptome assembly was compared with the previously available chickpea transcriptome assemblies as well as to the chickpea genome. Comparative analysis of CaTA v2 against transcriptomes of three legumes - Medicago, soybean and common bean, resulted in 27,771 TACs common to all three legumes indicating strong conservation of genes across legumes. CaTA v2 was also used for identification of simple sequence repeats (SSRs) and intron spanning regions (ISRs) for developing molecular markers. ISRs were identified by aligning TACs to the Medicago genome, and their putative mapping positions at chromosomal level were identified using transcript map of chickpea. Primer pairs were designed for 4,990 ISRs, each representing a single contig for which predicted positions are inferred and distributed across eight linkage groups. A subset of randomly selected ISRs representing all eight chickpea linkage groups were validated on five chickpea genotypes and showed 20% polymorphism with average polymorphic information content (PIC) of 0.27. In summary, the hybrid transcriptome assembly developed and novel markers identified can be used for a variety of applications such as gene discovery, marker-trait association, diversity analysis etc., to advance genetics research and breeding applications in

  10. A comprehensive transcriptome assembly of Pigeonpea (Cajanus cajan L.) using sanger and second-generation sequencing platforms.

    PubMed

    Kudapa, Himabindu; Bharti, Arvind K; Cannon, Steven B; Farmer, Andrew D; Mulaosmanovic, Benjamin; Kramer, Robin; Bohra, Abhishek; Weeks, Nathan T; Crow, John A; Tuteja, Reetu; Shah, Trushar; Dutta, Sutapa; Gupta, Deepak K; Singh, Archana; Gaikwad, Kishor; Sharma, Tilak R; May, Gregory D; Singh, Nagendra K; Varshney, Rajeev K

    2012-09-01

    A comprehensive transcriptome assembly for pigeonpea has been developed by analyzing 128.9 million short Illumina GA IIx single end reads, 2.19 million single end FLX/454 reads, and 18 353 Sanger expressed sequence tags from more than 16 genotypes. The resultant transcriptome assembly, referred to as CcTA v2, comprised 21 434 transcript assembly contigs (TACs) with an N50 of 1510 bp, the largest one being ~8 kb. Of the 21 434 TACs, 16 622 (77.5%) could be mapped on to the soybean genome build 1.0.9 under fairly stringent alignment parameters. Based on knowledge of intron junctions, 10 009 primer pairs were designed from 5033 TACs for amplifying intron spanning regions (ISRs). By using in silico mapping of BAC-end-derived SSR loci of pigeonpea on the soybean genome as a reference, putative mapping positions at the chromosome level were predicted for 6284 ISR markers, covering all 11 pigeonpea chromosomes. A subset of 128 ISR markers were analyzed on a set of eight genotypes. While 116 markers were validated, 70 markers showed one to three alleles, with an average of 0.16 polymorphism information content (PIC) value. In summary, the CcTA v2 transcript assembly and ISR markers will serve as a useful resource to accelerate genetic research and breeding applications in pigeonpea. PMID:22241453
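
    The N50 statistic quoted in this and the preceding chickpea record can be computed from a list of contig lengths. The sketch below uses the common definition (the length of the contig at which the running total, taken from longest to shortest, first reaches half of all assembled bases); the function name `n50` and the example lengths are illustrative assumptions, not data from either assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half the total bases.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Made-up contig lengths in bp, totalling 12,000 bp.
contig_lengths = [4000, 3000, 1500, 1500, 1000, 500, 500]
assembly_n50 = n50(contig_lengths)
```

    For these lengths the two longest contigs together hold 7,000 of 12,000 bp, so the N50 is 3,000 bp. A higher N50 generally indicates a less fragmented assembly, although it says nothing about correctness.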

  11. Automatic Command Sequence Generation

    NASA Technical Reports Server (NTRS)

    Fisher, Forest; Gladden, Roy; Khanampompan, Teerapat

    2007-01-01

    Automatic Sequence Generator (Autogen) Version 3.0 software automatically generates command sequences for the Mars Reconnaissance Orbiter (MRO) and several other JPL spacecraft operated by the multi-mission support team. Autogen uses standard JPL sequencing tools like APGEN, ASP, SEQGEN, and the DOM database to automate the generation of uplink command products, Spacecraft Command Message Format (SCMF) files, and the corresponding ground command products, DSN Keywords Files (DKF). Autogen supports all the major multi-mission mission phases including the cruise, aerobraking, mapping/science, and relay mission phases. Autogen is a Perl script, which functions within the mission operations UNIX environment. It consists of two parts: a set of model files and the autogen Perl script. Autogen encodes the behaviors of the system into a model and encodes algorithms for context sensitive customizations of the modeled behaviors. The model includes knowledge of different mission phases and how the resultant command products must differ for these phases. The executable software portion of Autogen automates the setup and use of APGEN for constructing a spacecraft activity sequence file (SASF). The setup includes file retrieval through the DOM (Distributed Object Manager), an object database used to store project files. This step retrieves all the needed input files for generating the command products. Depending on the mission phase, Autogen also uses the ASP (Automated Sequence Processor) and SEQGEN to generate the command product sent to the spacecraft. Autogen also provides the means for customizing sequences through the use of configuration files. By automating the majority of the sequencing generation process, Autogen eliminates many sequence generation errors commonly introduced by manually constructing spacecraft command sequences. Through the layering of commands into the sequence by a series of scheduling algorithms, users are able to rapidly and reliably construct the

  12. Targeted Exome Sequencing Outcome Variations of Colorectal Tumors within and across Two Sequencing Platforms

    PubMed Central

    Ashktorab, Hassan; Azimi, Hamed; Nickerson, Michael L.; Bass, Sara; Varma, Sudhir; Brim, Hassan

    2016-01-01

    Background and Aim Next generation sequencing (NGS) has quickly become the tool of choice for genome and exome data generation. The multitude of sequencing platforms as well as the variabilities within each platform need to be assessed. In this paper we used two platforms (Ion Torrent and Illumina) to assess single nucleotide variants (SNVs) in colorectal cancer (CRC) specimens. Methods CRC specimens (n = 13) collected from 6 CRC (cancer and matched normal) patients were used to establish the mutational profile using Ion Torrent and Illumina sequencing platforms. We analyzed a set of Formalin Fixed Paraffin Embedded (FFPE) and fresh frozen (FF) samples on both platforms to assess the effect of sample nature (FFPE vs. FF) on sequencing outcome and to evaluate the similarities and differences of SNVs across the two platforms. In addition, duplicates of FF samples were sequenced on each platform to assess variability within platform. Results The comparison of FF replicates to each other gave a concordance of 77% (± 15.3%) in Ion Torrent and 70% (± 3.7%) in Illumina. FFPE vs. FF replicates gave a concordance of 40% (± 32%) in Ion Torrent and 49% (± 19%) in Illumina. Cross-platform concordance averaged 75% (± 9.8%) for FFPE samples and 67% (± 32%) for FF samples, with an overall average of 70% (± 26.8%). Conclusion Our data show a significant variability within and across platforms. Also, the number of detected variants depends on the nature of the specimen (FF vs. FFPE). Validation of NGS-discovered mutations is a must to rule out false-positive mutations. This validation might be performed either through a second NGS platform or through Sanger sequencing.
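
    The abstract reports concordance percentages without stating how concordance was computed. One plausible reading, sketched below purely as an assumption, treats each call set as a set of (chromosome, position, reference allele, alternate allele) tuples and reports shared calls as a percentage of all distinct calls. The function name `concordance` and the variant coordinates are made up for illustration.

```python
def concordance(calls_a, calls_b):
    """Percentage of distinct variant calls that appear in both call sets.

    Assumed definition (shared calls / union of calls); the source abstract
    does not give its formula.
    """
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        raise ValueError("both call sets are empty")
    return 100.0 * len(a & b) / len(union)


# Hypothetical SNV calls for the same sample on two platforms.
ion_torrent_calls = {
    ("chr1", 1000, "A", "G"),
    ("chr2", 2000, "C", "T"),
    ("chr3", 3000, "G", "A"),
}
illumina_calls = {
    ("chr1", 1000, "A", "G"),
    ("chr2", 2000, "C", "T"),
    ("chr4", 4000, "T", "C"),
}
cross_platform = concordance(ion_torrent_calls, illumina_calls)
```

    Here two of four distinct calls are shared, giving 50% concordance. Other definitions (for example, shared calls divided by the calls of one platform only) would give different numbers, so the choice of denominator should be stated when comparing studies.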

  13. Relay Sequence Generation Software

    NASA Technical Reports Server (NTRS)

    Gladden, Roy E.; Khanampompan, Teerapat

    2009-01-01

    Due to thermal and electromagnetic interactivity between the UHF (ultrahigh frequency) radio onboard the Mars Reconnaissance Orbiter (MRO), which performs relay sessions with the Martian landers, and the remainder of the MRO payloads, it is required to integrate and de-conflict relay sessions with the MRO science plan. The MRO relay SASF/PTF (spacecraft activity sequence file/payload target file) generation software facilitates this process by generating a PTF that is needed to integrate the periods of time during which MRO supports relay activities with the rest of the MRO science plans. The software also generates the command products needed to initiate the relay sessions; some features of these products are provided by the lander team, some are managed internally by MRO, and some are derived.

  14. MIG-seq: an effective PCR-based method for genome-wide single-nucleotide polymorphism genotyping using the next-generation sequencing platform

    PubMed Central

    Suyama, Yoshihisa; Matsuki, Yu

    2015-01-01

    Restriction-enzyme (RE)-based next-generation sequencing methods have revolutionized marker-assisted genetic studies; however, the use of REs has limited their widespread adoption, especially in field samples with low-quality DNA and/or small quantities of DNA. Here, we developed a PCR-based procedure to construct reduced representation libraries without RE digestion steps, representing de novo single-nucleotide polymorphism discovery, and its genotyping using next-generation sequencing. Using multiplexed inter-simple sequence repeat (ISSR) primers, thousands of genome-wide regions were amplified effectively from a wide variety of genomes, without prior genetic information. We demonstrated: 1) Mendelian gametic segregation of the discovered variants; 2) reproducibility of genotyping by checking its applicability for individual identification; and 3) applicability in a wide variety of species by checking standard population genetic analysis. This approach, called multiplexed ISSR genotyping by sequencing, should be applicable to many marker-assisted genetic studies with a wide range of DNA qualities and quantities. PMID:26593239

  15. Next generation sequencing of viral RNA genomes

    PubMed Central

    2013-01-01

    Background With the advent of Next Generation Sequencing (NGS) technologies, the ability to generate large amounts of sequence data has revolutionized the genomics field. Most RNA viruses have relatively small genomes in comparison to other organisms and as such, would appear to be an obvious success story for the use of NGS technologies. However, due to the relatively low abundance of viral RNA in relation to host RNA, RNA viruses have proved relatively difficult to sequence using NGS technologies. Here we detail a simple, robust methodology, without the use of ultra-centrifugation, filtration or viral enrichment protocols, to prepare RNA from diagnostic clinical tissue samples, cell monolayers and tissue culture supernatant, for subsequent sequencing on the Roche 454 platform. Results As representative RNA viruses, full genome sequence was successfully obtained from known lyssaviruses belonging to recognized species and a novel lyssavirus species using these protocols and assembling the reads using de novo algorithms. Furthermore, genome sequences were generated from considerably less than 200 ng RNA, indicating that manufacturers’ minimum template guidance is conservative. In addition to obtaining genome consensus sequence, a high proportion of SNPs (Single Nucleotide Polymorphisms) were identified in the majority of samples analyzed. Conclusions The approaches reported clearly facilitate successful full genome lyssavirus sequencing and can be universally applied to discovering and obtaining consensus genome sequences of RNA viruses from a variety of sources. PMID:23822119

  16. AB118. Validation of next generation sequencing by Sanger sequencing

    PubMed Central

    Low, Meow Hong Wendy; Lai, Hwei Meeng Angeline; Jamuar, Saumya Shekhar; Law, Hai Yang

    2015-01-01

    Background and objective Development of the next generation sequencing (NGS) platform was driven by the completion of the Human Genome Project in 2003. With the availability of NGS, the time taken to sequence large genomic regions was greatly reduced and the data generated per unit of DNA was significantly increased. Though the cost of using NGS in a clinical setting is far from ideal, there is a significant decrease in the average cost per sequenced base. The objective was to validate NGS findings on mutations detected in FBN1, TGFBR2, RAF1, RTEL1, LMNA, MID2, KCNK9, DMD, SMARCA2 and IQSEC2 using the gold standard, Sanger sequencing. Methods The coordinate of the mutation identified by NGS was used to retrieve the adjacent genomic sequence in UCSC Genome Browser (Available from URL: https://genome.ucsc.edu/). Targeted primers were designed with Primer 3 software (Available from URL: http://primer3.ut.ee/) based on the genomic sequence obtained from UCSC. The following step involves the optimization of a Polymerase Chain Reaction (PCR) with the designed primers to amplify the desired DNA template for the targeted region. Upon optimization, the template is purified and subjected to dye terminator sequencing to generate multiple DNA fragments of varying sizes. Lastly, the DNA fragments are purified and analysed with an automated sequencer. The sequencer separates the DNA fragments based on their size by carrying out capillary electrophoresis. Results A total of 28 cases were validated with Sanger sequencing. Of them, 25 (89.3%) cases concurred with the findings from NGS and 3 (10.7%) cases were false-positive calls. Conclusions NGS shows promise for future molecular diagnostics; however, at present it needs to be done concurrently with Sanger sequencing for clinical applications.

  17. Quasi-Random Sequence Generators.

    Energy Science and Technology Software Center (ESTSC)

    1994-03-01

    Version 00 LPTAU generates quasi-random sequences. The sequences are uniformly distributed sets of L=2**30 points in the N-dimensional unit cube: I**N=[0,1]**N. The sequences are used as nodes for multidimensional integration, as searching points in global optimization, as trial points in multicriteria decision making, and as quasi-random points for quasi-Monte Carlo algorithms.
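
    To illustrate how quasi-random points serve as nodes for multidimensional integration, the sketch below uses a Halton sequence, a different and simpler low-discrepancy construction swapped in for illustration. It is not LPTAU's algorithm, and the function names `radical_inverse`, `halton`, and `integrate` are hypothetical.

```python
def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` about the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points, bases=(2, 3)):
    """First n_points of a Halton sequence in the unit cube, one prime base per dimension."""
    return [
        tuple(radical_inverse(i, b) for b in bases)
        for i in range(1, n_points + 1)
    ]


def integrate(f, points):
    """Estimate the integral of f over the unit cube as the mean of f at the nodes."""
    return sum(f(*p) for p in points) / len(points)


points = halton(4096)
# Integrand x*y over [0,1]^2; the exact value of the integral is 1/4.
estimate = integrate(lambda x, y: x * y, points)
```

    Because the points fill the cube more evenly than pseudo-random draws, the estimate typically converges faster than plain Monte Carlo for smooth integrands in low dimension.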

  18. Sequencing platform and library preparation choices impact viral metagenomes

    PubMed Central

    2013-01-01

    Background Microbes drive the biogeochemistry that fuels the planet. Microbial viruses modulate their hosts directly through mortality and horizontal gene transfer, and indirectly by re-programming host metabolisms during infection. However, our ability to study these virus-host interactions is limited by methods that are low-throughput and heavily reliant upon the subset of organisms that are in culture. One way forward is culture-independent metagenomic approaches, but these novel methods are rarely rigorously tested, especially for studies of environmental viruses, air microbiomes, extreme environment microbiology and other areas with constrained sample amounts. Here we perform replicated experiments to evaluate Roche 454, Illumina HiSeq, and Ion Torrent PGM sequencing and library preparation protocols on virus metagenomes generated from as little as 10 pg of DNA. Results Using %G + C content to compare metagenomes, we find that (i) metagenomes are highly replicable, (ii) some treatment effects are minimal, e.g., sequencing technology choice has 6-fold less impact than varying input DNA amount, and (iii) when restricted to a limited DNA concentration (<1 μg), changing the amount of amplification produces little variation. These trends were also observed when examining the metagenomes for gene function and assembly performance, although the latter more closely aligned to sequencing effort and read length than preparation steps tested. Among Illumina library preparation options, transposon-based libraries diverged from all others and adaptor ligation was a critical step for optimizing sequencing yields. Conclusions These data guide researchers in generating systematic, comparative datasets to understand complex ecosystems, and suggest that neither varied amplification nor sequencing platforms will deter such efforts. PMID:23663384

  19. Bioinformatics for Next Generation Sequencing Data

    PubMed Central

    Magi, Alberto; Benelli, Matteo; Gozzini, Alessia; Girolami, Francesca; Torricelli, Francesca; Brandi, Maria Luisa

    2010-01-01

    The emergence of next-generation sequencing (NGS) platforms imposes increasing demands on statistical methods and bioinformatic tools for the analysis and the management of the huge amounts of data generated by these technologies. Even at the early stages of their commercial availability, a large number of software tools already exist for analyzing NGS data. These tools fit into several general categories, including alignment of sequence reads to a reference, base-calling and/or polymorphism detection, de novo assembly from paired or unpaired reads, structural variant detection and genome browsing. This manuscript aims to guide readers in choosing among the available computational tools for the several steps of the data analysis workflow. PMID:24710047

  20. Application of genotyping-by-sequencing on semiconductor sequencing platforms: A comparison of genetic and reference-based marker ordering in barley

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The rapid development of next generation sequencing platforms has enabled the use of sequencing for routine genotyping across a range of genetics studies and breeding applications. Genotyping-by-sequencing (GBS), a low-cost, reduced representation sequencing method, is becoming a common approach fo...

  1. Wolfcampian sequence stratigraphy of eastern Central Basin platform, Texas

    SciTech Connect

    Candelaria, M.P.; Entzminger, D.J.; Behnken, F.H.; Sarg, J.F.; Wilde, G.L.

    1992-04-01

    Integrated study of well logs, cores, high-resolution seismic data, and biostratigraphy has established the sequence framework of the Atokan (Early Pennsylvanian)-Wolfcampian (Early Permian) stratigraphic section along the eastern margin of the Central Basin platform in the Permian basin. Sequence interpretation of high-resolution, high-fold seismic data through this stratigraphic interval has revealed a complex progradational/retrogradational evolution of the platform margin that has demonstrated overall progradation of at least 12 km during early-middle Wolfcampian. Sequence stratigraphic study of the Wolfcamp interval has revealed details of the internal architecture and morphologic evolution of the contemporaneous platform margin. Two generalized seismic facies assemblages are recognized in the Wolfcampian. Platform interior facies are characterized by high-amplitude, laterally continuous parallel reflections; platform margin facies consist of progradational sigmoidal to oblique clinoforms and are characterized by discontinuous, low-amplitude reflections. Sequence interpretation of carbonate platform-to-basin strata geometries helps in predicting subtle stratigraphic trapping relationships and potential reservoir facies distribution. Moreover, this interpretive method assists in describing complex reservoir heterogeneities that can contribute to significant reserve additions from within existing fields.

  2. Solving the problem of comparing whole bacterial genomes across different sequencing platforms.

    PubMed

    Kaas, Rolf S; Leekitcharoenphon, Pimlapas; Aarestrup, Frank M; Lund, Ole

    2014-01-01

    Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far, studies have focused on using one technology because each technology has a systematic bias, making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. We conclude that these new procedures succeed because all informative sites included in the analysis are validated. The procedures are available as web tools. PMID:25110940

  3. Underlying Data for Sequencing the Mitochondrial Genome with the Massively Parallel Sequencing Platform Ion Torrent™ PGM™

    PubMed Central

    2015-01-01

    Background Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per nucleotide basis and indeed on a per sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, San Francisco, CA), the output data were assessed, and the results were compared with data previously generated on the MiSeq™ (Illumina, San Diego, CA). The objectives of this paper were to determine the feasibility, accuracy, and reliability of sequence data obtained from the PGM. Results 24 samples were multiplexed (in groups of six) and sequenced on the at least 10 megabase throughput 314 chip. The depth of coverage pattern was similar among all 24 samples; however, the coverage across the genome varied. For strand bias, the average ratio of coverage between the forward and reverse strands at each nucleotide position indicated that two-thirds of the positions of the genome had ratios that were greater than 0.5. A few sites had more extreme strand bias. Another observation was that 156 positions had a false deletion rate greater than 0.15 in one or more individuals. There were 31-98 (SNP) mtGenome variants observed per sample for the 24 samples analyzed. The total 1237 (SNP) variants were concordant between the results from the PGM and MiSeq. The quality scores for haplogroup assignment for all 24 samples ranged between 88.8%-100%. Conclusions In this study, mtDNA sequence data generated from the PGM were analyzed and the output evaluated. Depth of coverage variation and strand bias were identified but generally were infrequent and did not impact reliability of variant calls. Multiplexing of samples was demonstrated, which can improve throughput and reduce cost per sample analyzed.

  4. Sequence data for Clostridium autoethanogenum using three generations of sequencing technologies

    PubMed Central

    Utturkar, Sagar M; Klingeman, Dawn M; Bruno-Barcena, José M; Chinn, Mari S; Grunden, Amy M; Köpke, Michael; Brown, Steven D

    2015-01-01

    During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms, which are characterized by low cost, high throughput and shorter read lengths (for example, Illumina). The emergence and development of so-called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequence datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data. PMID:25977818

  5. Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies

    SciTech Connect

    Utturkar, Sagar M.; Klingeman, Dawn Marie; Bruno-Barcena, José M.; Chinn, Mari S.; Grunden, Amy; Köpke, Michael; Brown, Steven D.

    2015-04-14

    During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms, which are characterized by low cost, high throughput and shorter read lengths (for example, Illumina). The emergence and development of so-called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequence datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data.

  6. Detection of BRAF Mutations Using a Fully Automated Platform and Comparison with High Resolution Melting, Real-Time Allele Specific Amplification, Immunohistochemistry and Next Generation Sequencing Assays, for Patients with Metastatic Melanoma

    PubMed Central

    Harlé, Alexandre; Salleron, Julia; Franczak, Claire; Dubois, Cindy; Filhine-Tressarieu, Pierre; Leroux, Agnès; Merlin, Jean-Louis

    2016-01-01

    Background Metastatic melanoma is a severe disease with one of the highest mortality rates among skin diseases. Overall survival has significantly improved with immunotherapy and targeted therapies. Kinase inhibitors targeting BRAF V600 showed promising results. BRAF genotyping is mandatory for the prescription of anti-BRAF therapies. Methods Fifty-nine formalin-fixed paraffin-embedded melanoma samples were assessed using High-Resolution-Melting (HRM) PCR, Real-time allele-specific amplification (RT-ASA) PCR, Next generation sequencing (NGS), immunohistochemistry (IHC) and the fully-automated molecular diagnostics platform Idylla™. Sensitivity, specificity, positive predictive value and negative predictive value were calculated using NGS as the reference standard to compare the different assays. Results BRAF mutations were found in 28 (47.5%), 29 (49.2%), 31 (52.5%), 29 (49.2%) and 27 (45.8%) samples with HRM, RT-ASA, NGS, Idylla™ and IHC respectively. Twenty-six (81.2%) samples were found bearing a c.1799T>A (p.Val600Glu) mutation, three (9.4%) with a c.1798_1799delinsAA (p.Val600Lys) mutation and one with c.1789_1790delinsTC (p.Leu597Ser) mutation. Two samples were found bearing complex mutations. Conclusions HRM appears to be the least sensitive assay for the detection of BRAF V600 mutations. The RT-ASA, Idylla™ and IHC assays are suitable for routine molecular diagnostics aiming at the prescription of anti-BRAF therapies. The Idylla™ assay is fully automated, requires less than 2 minutes for sample preparation, and is the fastest of the tested assays. PMID:27111917
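
    The four measures named in the Methods above can be sketched as follows. This is a minimal illustration, assuming each assay's calls have already been tabulated against the reference standard as true/false positives and negatives; the function name and the counts in the usage line are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from a 2x2 table.

    tp/fp/fn/tn are counts of an assay's calls judged against a
    reference standard. A ratio with a zero denominator is reported
    as None rather than raising an error.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # mutated samples correctly detected
        "specificity": ratio(tn, tn + fp),  # wild-type samples correctly cleared
        "ppv": ratio(tp, tp + fp),          # positive calls that are correct
        "npv": ratio(tn, tn + fn),          # negative calls that are correct
    }


# Hypothetical counts, for illustration only (not data from the abstract).
example = diagnostic_metrics(tp=9, fp=1, fn=1, tn=9)
```

    Because predictive values depend on how common the mutation is in the tested cohort, PPV and NPV from one sample set do not transfer directly to a population with a different mutation frequency.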

  7. Expression Profiling Using New Generation Sequencing Technologies

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Microarray hybridization technology has become widely used in parallel analysis of gene expression. Recent advances in genome sequencing platforms point to an alternate approach through digital quantitation of sequencing reads produced from cDNA samples. This presentation will compare advantages a...

  8. Replacement Sequence of Events Generator

    NASA Technical Reports Server (NTRS)

    Fisher, Forest; Gladden, Daniel Wenkert Roy; Khanampompan, Teerpat

    2008-01-01

    The soeWINDOW program automates the generation of an ITAR (International Traffic in Arms Regulations)-compliant sub-RSOE (Replacement Sequence of Events) by extracting a specified temporal window from an RSOE while maintaining page header information. RSOEs contain a significant amount of information that is not ITAR-compliant, yet that foreign partners need to see for command details to their instrument, as well as the surrounding commands that provide context for validation. soeWINDOW can serve as an example of how command support products can be made ITAR-compliant for future missions. This software is a Perl script intended for use in the mission operations UNIX environment. It is designed for use to support the MRO (Mars Reconnaissance Orbiter) instrument team. The tool also provides automated DOM (Distributed Object Manager) storage into the special ITAR-okay DOM collection, and can be used for creating focused RSOEs for product review by any of the MRO teams.

  9. Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies

    DOE PAGESBeta

    Utturkar, Sagar M.; Klingeman, Dawn Marie; Bruno-Barcena, José M.; Chinn, Mari S.; Grunden, Amy; Köpke, Michael; Brown, Steven D.

    2015-04-14

    During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms, which are characterized by low cost, high throughput and shorter read lengths (for example, Illumina). The emergence and development of so-called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequence datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data.

  10. Next generation sequencing methodologies--an overview.

    PubMed

    Pickrell, William O; Rees, Mark I; Chung, Seo-Kyung

    2012-01-01

    Gene discovery has been one of the most important advances in our understanding of human disorders. Early linkage and positional cloning strategies have now given way to next generation sequencing (NGS) with age-old help from biostatistical and bioinformatical input. In this chapter, we present the importance of getting the basics right, namely, how the best phenotyping in the clinical domain will provide a higher chance of a successful NGS experiment. In addition, we show getting the correct submission of DNA samples to NGS providers is dependent on the type of inheritance pattern that may or may not be apparent. We discuss one of the most crucial decisions for investigators when designing a study, namely choosing a trio, quad or cohort for analysis. Following on from this, we compare and contrast the underlying technology adopted by provider companies as they vie for customers and submissions. Each platform has advantages and disadvantages based on false calls, coverage, and read depth; however, some of these issues may be solved with the third wave of sequencing technology development in early commercial roll-out. Lastly, we provide a bioinformatic filtering overview of a "quad"-based submission and show how 3 million SNPs and indels can be reduced to a biologically plausible and experimentally manageable n≤50 gene variants. PMID:23046880

  11. Construction of a rationally designed antibody platform for sequencing-assisted selection.

    PubMed

    Larman, H Benjamin; Xu, George Jing; Pavlova, Natalya N; Elledge, Stephen J

    2012-11-01

    Antibody discovery platforms have become an important source of both therapeutic biomolecules and research reagents. Massively parallel DNA sequencing can be used to assist antibody selection by comprehensively monitoring libraries during selection, thus greatly expanding the power of these systems. We have therefore constructed a rationally designed, fully defined single-chain variable fragment (scFv) library and analysis platform optimized for analysis with short-read deep sequencing. Sequence-defined oligonucleotide libraries encoding three complementarity-determining regions (L3 from the light chain, H2 and H3 from the heavy chain) were synthesized on a programmable microarray and combinatorially cloned into a single scFv framework for molecular display. Our unique complementarity-determining region sequence design optimizes for protein binding by utilizing a hidden Markov model that was trained on all antibody-antigen cocrystal structures in the Protein Data Bank. The resultant ~10^12-member library was produced in ribosome-display format, and comprehensively analyzed over four rounds of antigen selections by multiplex paired-end Illumina sequencing. The hidden Markov model scFv library generated multiple binders against an emerging cancer antigen and is the basis for a next-generation antibody production platform. PMID:23064642

  12. Membrane platforms for biological nanopore sensing and sequencing.

    PubMed

    Schmidt, Jacob

    2016-06-01

    In the past two decades, biological nanopores have been developed and explored for use in sensing applications as a result of their exquisite sensitivity and easily engineered, reproducible, and economically manufactured structures. Nanopore sensing has been shown to differentiate between highly similar analytes, measure polymer size, detect the presence of specific genes, and rapidly sequence nucleic acids translocating through the pore. Devices featuring protein nanopores have been limited in part by the membrane support containing the nanopore, the shortcomings of which have been addressed in recent work developing new materials, approaches, and apparatus resulting in membrane platforms featuring automatability and increased robustness, lifetime, and measurement throughput. PMID:26773300

  13. ACMG clinical laboratory standards for next-generation sequencing

    PubMed Central

    Rehm, Heidi L.; Bale, Sherri J; Bayrak-Toydemir, Pinar; Berg, Jonathan S; Brown, Kerry K; Deignan, Joshua L; Friez, Michael J; Funke, Birgit H; Hegde, Madhuri R; Lyon, Elaine

    2014-01-01

    Next-generation sequencing technologies have been and continue to be deployed in clinical laboratories, enabling rapid transformations in genomic medicine. These technologies have reduced the cost of large-scale sequencing by several orders of magnitude, and continuous advances are being made. It is now feasible to analyze an individual's near-complete exome or genome to assist in the diagnosis of a wide array of clinical scenarios. Next-generation sequencing technologies are also facilitating further advances in therapeutic decision making and disease prediction for at-risk patients. However, with rapid advances come additional challenges involving the clinical validation and use of these constantly evolving technologies and platforms in clinical laboratories. To assist clinical laboratories with the validation of next-generation sequencing methods and platforms, the ongoing monitoring of next-generation sequencing testing to ensure quality results, and the interpretation and reporting of variants found using these technologies, the American College of Medical Genetics and Genomics has developed the following professional standards and guidelines. PMID:23887774

  14. Utilization of Benchtop Next Generation Sequencing Platforms Ion Torrent PGM and MiSeq in Noninvasive Prenatal Testing for Chromosome 21 Trisomy and Testing of Impact of In Silico and Physical Size Selection on Its Analytical Performance

    PubMed Central

    Minarik, Gabriel; Repiska, Gabriela; Hyblova, Michaela; Nagyova, Emilia; Soltys, Katarina; Budis, Jaroslav; Duris, Frantisek; Sysak, Rastislav; Gerykova Bujalkova, Maria; Vlkova-Izrael, Barbora; Biro, Orsolya; Nagy, Balint; Szemes, Tomas

    2015-01-01

    Objectives The aims of this study were to test the utility of benchtop NGS platforms for NIPT for trisomy 21 using previously published z score calculation methods and to optimize the sample preparation and data analysis with use of in silico and physical size selection methods. Methods Samples from 130 pregnant women were analyzed by whole genome sequencing on benchtop NGS systems Ion Torrent PGM and MiSeq. The targeted yield of 3 million raw reads on each platform was used for z score calculation. The impact of in silico and physical size selection on analytical performance of the test was studied. Results Using a z score value of 3 as the cut-off, 98.11% - 100% (104-106/106) specificity and 100% (24/24) sensitivity and 99.06% - 100% (105-106/106) specificity and 100% (24/24) sensitivity were observed for Ion Torrent PGM and MiSeq, respectively. After in silico based size selection both platforms reached 100% specificity and sensitivity. Following the physical size selection z scores of tested trisomic samples increased significantly—p = 0.0141 and p = 0.025 for Ion Torrent PGM and MiSeq, respectively. Conclusions Noninvasive prenatal testing for chromosome 21 trisomy with the utilization of benchtop NGS systems led to results equivalent to previously published studies performed on high-to-ultrahigh throughput NGS systems. The in silico size selection led to higher specificity of the test. Physical size selection performed on isolated DNA led to significant increase in z scores. The observed results could represent a basis for increasing of cost effectiveness of the test and thus help with its penetration worldwide. PMID:26669558
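
    The z score test mentioned above can be sketched as follows. The abstract does not give the formula, so this assumes one common form: the sample's fraction of reads mapping to chromosome 21 is compared with the mean and standard deviation of the same fraction in a set of euploid reference samples. The function names, the inclusive treatment of the cut-off, and the reference values in the usage lines are assumptions for illustration, not details from the study.

```python
from statistics import mean, stdev


def chr21_z_score(sample_fraction, reference_fractions):
    """Z score of a sample's chr21 read fraction against euploid references.

    sample_fraction: reads mapped to chr21 / all mapped reads, for the test sample.
    reference_fractions: the same ratio for known euploid samples (at least two).
    """
    mu = mean(reference_fractions)
    sigma = stdev(reference_fractions)  # sample standard deviation
    if sigma == 0:
        raise ValueError("reference fractions have zero variance")
    return (sample_fraction - mu) / sigma


def call_trisomy21(z, cutoff=3.0):
    """Flag a sample when z reaches the cut-off (3, as in the abstract).

    Treating the boundary as inclusive is an assumption of this sketch.
    """
    return z >= cutoff


# Hypothetical reference fractions, for illustration only.
reference = [0.0130, 0.0131, 0.0129, 0.0130, 0.0130]
z_elevated = chr21_z_score(0.0136, reference)
z_typical = chr21_z_score(0.0130, reference)
```

    In this form, size selection would act by enriching the fetal DNA fraction, which raises the chr21 read fraction of trisomic samples relative to the reference distribution and so increases their z scores.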

  15. Metagenomic next-generation sequencing of viruses infecting grapevines.

    PubMed

    Burger, Johan T; Maree, Hans J

    2015-01-01

    Next-generation sequencing (NGS) technologies, for the first time, provide a truly "complete" representation of the viral (and other) pathogens present in a host organism. This is achieved in an unbiased way, and without any prior biological or molecular knowledge of these pathogen(s). During recent years a number of broad approaches, for most of the popular NGS platforms, have been developed. Here we describe such a protocol, one that accurately and reliably analyzes viruses (and viroids) infecting grapevine. Our strategy relies on the synthesis of cDNA sequencing libraries from dsRNA extracted from diseased grapevine tissues, the sequencing of these on an Illumina platform, and a streamlined bioinformatics pipeline to analyze the NGS data, yielding the virus composition (virome) of a specific grapevine tissue type, organ, entire plant, or even a vineyard. PMID:25981264

  16. Concept For Generation Of Long Pseudorandom Sequences

    NASA Technical Reports Server (NTRS)

    Wang, C. C.

    1990-01-01

    Conceptual very-large-scale integrated (VLSI) digital circuit performs exponentiation in finite field. Algorithm that generates unusually long sequences of pseudorandom numbers executed by digital processor that includes such circuits. Concepts particularly advantageous for such applications as spread-spectrum communications, cryptography, and generation of ranging codes, synthetic noise, and test data, where usually desirable to make pseudorandom sequences as long as possible.

  17. Metagenomics using next-generation sequencing.

    PubMed

    Bragg, Lauren; Tyson, Gene W

    2014-01-01

    Traditionally, microbial genome sequencing has been restricted to the small number of species that can be grown in pure culture. The progressive development of culture-independent methods over the last 15 years now allows researchers to sequence microbial communities directly from environmental samples. This approach is commonly referred to as "metagenomics" or "community genomics". However, the term metagenomics is applied liberally in the literature to describe any culture-independent analysis of microbial communities. Here, we define metagenomics as shotgun ("random") sequencing of the genomic DNA of a sample taken directly from the environment. The metagenome can be thought of as a sampling of the collective genome of the microbial community. We outline the considerations and analyses that should be undertaken to ensure the success of a metagenomic sequencing project, including the choice of sequencing platform and methods for assembly, binning, annotation, and comparative analysis. PMID:24515370

  18. ADS: The Next Generation Search Platform

    NASA Astrophysics Data System (ADS)

    Accomazzi, A.; Kurtz, M. J.; Henneken, E. A.; Chyla, R.; Luker, J.; Grant, C. S.; Thompson, D. M.; Holachek, A.; Dave, R.; Murray, S. S.

    2015-04-01

    Four years after the last LISA meeting, the NASA Astrophysics Data System (ADS) finds itself in the middle of major changes to the infrastructure and contents of its database. In this paper we highlight a number of features of great importance to librarians and discuss the additional functionality that we are currently developing. Our citation coverage has doubled since 2010 and now consists of over 10 million citations. We are normalizing the affiliation information in our records and we have started collecting and linking funding sources with papers in our system. At the same time, we are undergoing major technology changes in the ADS platform. We have rolled out and are now enhancing a new high-performance search engine capable of performing full-text as well as metadata searches using an intuitive query language. We are currently able to index acknowledgments, affiliations, citations, and funding sources. While this effort is still ongoing, some of its benefits are already available through the ADS Labs user interface and API at http://adslabs.org/adsabs/.

  19. Next-Generation Sequencing for Cancer Diagnostics: a Practical Perspective

    PubMed Central

    Meldrum, Cliff; Doyle, Maria A; Tothill, Richard W

    2011-01-01

    Next-generation sequencing (NGS) is arguably one of the most significant technological advances in the biological sciences of the last 30 years. The second generation sequencing platforms have advanced rapidly to the point that several genomes can now be sequenced simultaneously in a single instrument run in under two weeks. Targeted DNA enrichment methods allow even higher genome throughput at a reduced cost per sample. Medical research has embraced the technology and the cancer field is at the forefront of these efforts given the genetic aspects of the disease. World-wide efforts to catalogue mutations in multiple cancer types are underway and this is likely to lead to new discoveries that will be translated to new diagnostic, prognostic and therapeutic targets. NGS is now maturing to the point where it is being considered by many laboratories for routine diagnostic use. The sensitivity, speed and reduced cost per sample make it a highly attractive platform compared to other sequencing modalities. Moreover, as we identify more genetic determinants of cancer there is a greater need to adopt multi-gene assays that can quickly and reliably sequence complete genes from individual patient samples. Whilst widespread and routine use of whole genome sequencing is likely to be a few years away, there are immediate opportunities to implement NGS for clinical use. Here we review the technology, methods and applications that can be immediately considered and some of the challenges that lie ahead. PMID:22147957

  20. Multifunctional pulse sequence generator for pulse NMR

    NASA Astrophysics Data System (ADS)

    Wang, Dongsheng

    1988-06-01

    A new multifunctional pulse sequence generator has been designed and constructed. It can conveniently generate various pulse sequences used in nuclear-magnetic resonance (NMR) to measure the spin-lattice relaxation time T1, the spin-spin relaxation time T2, and the spin-locking relaxation time T1ρ. It can also be used in pulse Fourier transform NMR and double resonance. The intervals of pulses can increase automatically with sequence repetitions and the generator can be used in two-dimensional spectrum measurement and spin-density imaging research. The sequences can be generated through four different triggering methods and there are two synchronous pulse outputs and fifteen auxiliary pulse outputs, so the generator can be conveniently interfaced with a computer or other instruments. The circuitry, functions, and features of the generator are described in this article.

  1. A window into third-generation sequencing.

    PubMed

    Schadt, Eric E; Turner, Steve; Kasarskis, Andrew

    2010-10-15

    First- and second-generation sequencing technologies have led the way in revolutionizing the field of genomics and beyond, motivating an astonishing number of scientific advances, including enabling a more complete understanding of whole genome sequences and the information encoded therein, a more complete characterization of the methylome and transcriptome and a better understanding of interactions between proteins and DNA. Nevertheless, there are sequencing applications and aspects of genome biology that are presently beyond the reach of current sequencing technologies, leaving fertile ground for additional innovation in this space. In this review, we describe a new generation of single-molecule sequencing technologies (third-generation sequencing) that is emerging to fill this space, with the potential for dramatically longer read lengths, shorter time to result and lower overall cost. PMID:20858600

  2. NG6: Integrated next generation sequencing storage and processing environment

    PubMed Central

    2012-01-01

    Background Next generation sequencing platforms are now well implanted in sequencing centres and some laboratories. Upcoming smaller scale machines such as the 454 junior from Roche or the MiSeq from Illumina will increase the number of laboratories hosting a sequencer. In such a context, it is important to provide these teams with an easily manageable environment to store and process the produced reads. Results We describe a user-friendly information system able to manage large sets of sequencing data. It includes, on one hand, a workflow environment already containing pipelines adapted to different input formats (sff, fasta, fastq and qseq), different sequencers (Roche 454, Illumina HiSeq) and various analyses (quality control, assembly, alignment, diversity studies,…) and, on the other hand, a secured web site giving access to the results. The connected user will be able to download raw and processed data and browse through the analysis result statistics. The provided workflows can easily be modified or extended and new ones can be added. Ergatis is used as a workflow building, running and monitoring system. The analyses can be run locally or in a cluster environment using Sun Grid Engine. Conclusions NG6 is a complete information system designed to answer the needs of a sequencing platform. It provides a user-friendly interface to process, store and download high-throughput sequencing data. PMID:22958229

  3. Next generation platforms for high-throughput biodosimetry

    PubMed Central

    Repin, Mikhail; Turner, Helen C.; Garty, Guy; Brenner, David J.

    2014-01-01

    Here the general concept of the combined use of plates and tubes in racks compatible with the American National Standards Institute/the Society for Laboratory Automation and Screening microplate formats as the next generation platforms for increasing the throughput of biodosimetry assays was described. These platforms can be used at different stages of biodosimetry assays starting from blood collection into microtubes organised in standardised racks and ending with the cytogenetic analysis of samples in standardised multiwell and multichannel plates. Robotically friendly platforms can be used for different biodosimetry assays in minimally equipped laboratories and on cost-effective automated universal biotech systems. PMID:24837249

  4. Iterative method for generating correlated binary sequences

    NASA Astrophysics Data System (ADS)

    Usatenko, O. V.; Melnik, S. S.; Apostolov, S. S.; Makarov, N. M.; Krokhin, A. A.

    2014-11-01

    We propose an efficient iterative method for generating random correlated binary sequences with a prescribed correlation function. The method is based on consecutive linear modulations of an initially uncorrelated sequence into a correlated one. Each step of modulation increases the correlations until the desired level has been reached. The robustness and efficiency of the proposed algorithm are tested by generating sequences with inverse power-law correlations. The substantial increase in the strength of correlation in the iterative method with respect to single-step filtering generation is shown for all studied correlation functions. Our results can be used for design of disordered superlattices, waveguides, and surfaces with selective transport properties.

  5. Next-generation sequencing in the clinic: promises and challenges.

    PubMed

    Xuan, Jiekun; Yu, Ying; Qing, Tao; Guo, Lei; Shi, Leming

    2013-11-01

    The advent of next generation sequencing (NGS) technologies has revolutionized the field of genomics, enabling fast and cost-effective generation of genome-scale sequence data with exquisite resolution and accuracy. Over the past years, rapid technological advances led by academic institutions and companies have continued to broaden NGS applications from research to the clinic. A recent crop of discoveries have highlighted the medical impact of NGS technologies on Mendelian and complex diseases, particularly cancer. However, the ever-increasing pace of NGS adoption presents enormous challenges in terms of data processing, storage, management and interpretation as well as sequencing quality control, which hinder the translation from sequence data into clinical practice. In this review, we first summarize the technical characteristics and performance of current NGS platforms. We further highlight advances in the applications of NGS technologies towards the development of clinical diagnostics and therapeutics. Common issues in NGS workflows are also discussed to guide the selection of NGS platforms and pipelines for specific research purposes. PMID:23174106

  6. Double-digest RAD sequencing using Ion Proton semiconductor platform (ddRADseq-ion) with nonmodel organisms.

    PubMed

    Recknagel, Hans; Jacobs, Arne; Herzyk, Pawel; Elmer, Kathryn R

    2015-11-01

    Research in evolutionary biology involving nonmodel organisms is rapidly shifting from using traditional molecular markers such as mtDNA and microsatellites to higher throughput SNP genotyping methodologies to address questions in population genetics, phylogenetics and genetic mapping. Restriction site associated DNA sequencing (RAD sequencing or RADseq) has become an established method for SNP genotyping on Illumina sequencing platforms. Here, we developed a protocol and adapters for double-digest RAD sequencing for Ion Torrent (Life Technologies; Ion Proton, Ion PGM) semiconductor sequencing. We sequenced thirteen genomic libraries of three different nonmodel vertebrate species on Ion Proton with PI chips: Arctic charr Salvelinus alpinus, European whitefish Coregonus lavaretus and common lizard Zootoca vivipara. This resulted in ~962 million single-end reads overall and a mean of ~74 million reads per library. We filtered the genomic data using Stacks, a bioinformatic tool to process RAD sequencing data. On average, we obtained ~11,000 polymorphic loci per library of 6-30 individuals. We validate our new method by technical and biological replication, by reconstructing phylogenetic relationships, and using a hybrid genetic cross to track genomic variants. Finally, we discuss the differences between using the different sequencing platforms in the context of RAD sequencing, assessing possible advantages and disadvantages. We show that our protocol can be used for Ion semiconductor sequencing platforms for the rapid and cost-effective generation of variable and reproducible genetic markers. PMID:25808755

  7. Comparison of Next-Generation Sequencing Systems

    PubMed Central

    Liu, Lin; Li, Yinhu; Li, Siliang; Hu, Ni; He, Yimin; Pong, Ray; Lin, Danni; Lu, Lihua; Law, Maggie

    2012-01-01

    With the fast development and wide application of next-generation sequencing (NGS) technologies, genomic sequence information is within reach to aid the achievement of goals to decode life mysteries, make better crops, detect pathogens, and improve life qualities. NGS systems are typically represented by SOLiD/Ion Torrent PGM from Life Sciences, Genome Analyzer/HiSeq 2000/MiSeq from Illumina, and GS FLX Titanium/GS Junior from Roche. Beijing Genomics Institute (BGI), which possesses the world's biggest sequencing capacity, has multiple NGS systems including 137 HiSeq 2000, 27 SOLiD, one Ion Torrent PGM, one MiSeq, and one 454 sequencer. We have accumulated extensive experience in sample handling, sequencing, and bioinformatics analysis. In this paper, technologies of these systems are reviewed, and first-hand data from extensive experience is summarized and analyzed to discuss the advantages and specifics associated with each sequencing system. Finally, applications of NGS are summarized. PMID:22829749

  8. Theory of Periodic-Binary-Sequence Generators

    NASA Technical Reports Server (NTRS)

    Perlman, M.

    1987-01-01

    Algorithms yield feedback shift registers with maximum regularity. Report provides extensive mathematical treatment of new and previous results related to generation of pseudo-noise binary sequences by feedback shift registers. Generator architectures amenable to efficient implementation in very-large-scale integrated (VLSI) circuits. Report includes literature references to applications of such sequences in random-number generation, radar, VLSI testing, data encryption and decryption, algebraic error-detection and error-correction encoding and decoding, and feedback-shift-register synthesis of sequential machines.
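
    The idea of generating a pseudo-noise binary sequence with a feedback shift register can be illustrated with a minimal sketch. It is not taken from the report: the Fibonacci-style register layout, the function names, and the 4-stage tap choice (4, 3) are assumptions made for illustration only.

```python
def lfsr_step(register, taps):
    """Advance a Fibonacci-style linear feedback shift register by one step.

    register -- list of 0/1 bits, stage 1 first
    taps     -- 1-based stage numbers whose bits are XORed into the feedback
    """
    feedback = 0
    for t in taps:
        feedback ^= register[t - 1]
    # Shift every stage one place to the right and insert the feedback bit.
    return [feedback] + register[:-1]


def lfsr_sequence(taps, seed, length):
    """Return `length` output bits, reading the last stage before each step."""
    register = list(seed)
    bits = []
    for _ in range(length):
        bits.append(register[-1])
        register = lfsr_step(register, taps)
    return bits


def lfsr_period(taps, seed):
    """Count the steps needed for the register to return to its seed state."""
    start = list(seed)
    limit = 2 ** len(start)  # a register of n stages has at most 2**n states
    register = lfsr_step(start, taps)
    steps = 1
    while register != start:
        if steps > limit:
            raise ValueError("register never returns to the seed state")
        register = lfsr_step(register, taps)
        steps += 1
    return steps


if __name__ == "__main__":
    # Hypothetical 4-stage register with taps on stages 4 and 3.
    print(lfsr_period((4, 3), [1, 0, 0, 0]))
    print(lfsr_sequence((4, 3), [1, 0, 0, 0], 15))
```

    With these assumed taps the register visits all 15 nonzero states before repeating, which is the longest period a 4-stage register can have; other tap choices can give shorter periods.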

  9. Reproducibility of Variant Calls in Replicate Next Generation Sequencing Experiments

    PubMed Central

    Qi, Yuan; Liu, Xiuping; Liu, Chang-gong; Wang, Bailing; Hess, Kenneth R.; Symmans, W. Fraser; Shi, Weiwei; Pusztai, Lajos

    2015-01-01

    Nucleotide alterations detected by next generation sequencing are not always true biological changes but could represent sequencing errors. Even highly accurate methods can yield substantial error rates when applied to millions of nucleotides. In this study, we examined the reproducibility of nucleotide variant calls in replicate sequencing experiments of the same genomic DNA. We performed targeted sequencing of all known human protein kinase genes (kinome) (~3.2 Mb) using the SOLiD v4 platform. Seventeen breast cancer samples were sequenced in duplicate (n=14) or triplicate (n=3) to assess concordance of all calls and single nucleotide variant (SNV) calls. The concordance rates over the entire sequenced region were >99.99%, while the concordance rates for SNVs were 54.3-75.5%. There was substantial variation in basic sequencing metrics from experiment to experiment. The type of nucleotide substitution and genomic location of the variant had little impact on concordance but concordance increased with coverage level, variant allele count (VAC), variant allele frequency (VAF), variant allele quality and p-value of SNV-call. The most important determinants of concordance were VAC and VAF. Even using the highest stringency of QC metrics the reproducibility of SNV calls was around 80% suggesting that erroneous variant calling can be as high as 20-40% in a single experiment. The sequence data have been deposited into the European Genome-phenome Archive (EGA) with accession number EGAS00001000826. PMID:26136146
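
    The relationship between call concordance and variant allele frequency described above can be sketched in a few lines. This is a hedged illustration, not the study's method: concordance is assumed here to be the number of shared calls divided by the number of calls seen in either replicate, and the variant records, the `min_vaf` parameter, and all names are hypothetical.

```python
def snv_concordance(calls_a, calls_b, min_vaf=0.0):
    """Fraction of SNV calls shared by two replicates (shared / union).

    calls_a, calls_b -- dicts mapping (chrom, pos, ref, alt) to the
                        variant allele frequency (VAF) observed in that run
    min_vaf          -- calls below this VAF are dropped before comparing
    """
    kept_a = {variant for variant, vaf in calls_a.items() if vaf >= min_vaf}
    kept_b = {variant for variant, vaf in calls_b.items() if vaf >= min_vaf}
    union = kept_a | kept_b
    if not union:
        raise ValueError("no variant calls pass the filter")
    return len(kept_a & kept_b) / len(union)


# Hypothetical replicate call sets, invented for this example only.
REPLICATE_1 = {
    ("chr1", 100, "A", "G"): 0.45,
    ("chr2", 200, "C", "T"): 0.08,
    ("chr3", 300, "G", "A"): 0.50,
}
REPLICATE_2 = {
    ("chr1", 100, "A", "G"): 0.47,
    ("chr3", 300, "G", "A"): 0.52,
    ("chr4", 400, "T", "C"): 0.06,
}

if __name__ == "__main__":
    print(snv_concordance(REPLICATE_1, REPLICATE_2))
    print(snv_concordance(REPLICATE_1, REPLICATE_2, min_vaf=0.2))
```

    In this toy data the two low-VAF calls each appear in only one replicate, so raising the VAF threshold removes the discordant calls and the concordance rises.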

  10. Massively parallel multiplex DNA sequencing for specimen identification using an Illumina MiSeq platform.

    PubMed

    Shokralla, Shadi; Porter, Teresita M; Gibson, Joel F; Dobosz, Rafal; Janzen, Daniel H; Hallwachs, Winnie; Golding, G Brian; Hajibabaei, Mehrdad

    2015-01-01

    Genetic information is a valuable component of biosystematics, especially specimen identification through the use of species-specific DNA barcodes. Although many genomics applications have shifted to High-Throughput Sequencing (HTS) or Next-Generation Sequencing (NGS) technologies, sample identification (e.g., via DNA barcoding) is still most often done with Sanger sequencing. Here, we present a scalable double dual-indexing approach using an Illumina MiSeq platform to sequence DNA barcode markers. We achieved 97.3% success by using half of an Illumina MiSeq flowcell to obtain 658 base pairs of the cytochrome c oxidase I DNA barcode in 1,010 specimens from eleven orders of arthropods. Our approach recovers a greater proportion of DNA barcode sequences from individuals than does conventional Sanger sequencing, while at the same time reducing both per specimen costs and labor time by nearly 80%. In addition, the use of HTS allows the recovery of multiple sequences per specimen, for deeper analysis of genetic variation in target gene regions. PMID:25884109

  11. Improving molecular diagnosis in epilepsy by a dedicated high-throughput sequencing platform.

    PubMed

    Della Mina, Erika; Ciccone, Roberto; Brustia, Francesca; Bayindir, Baran; Limongelli, Ivan; Vetro, Annalisa; Iascone, Maria; Pezzoli, Laura; Bellazzi, Riccardo; Perotti, Gianfranco; De Giorgis, Valentina; Lunghi, Simona; Coppola, Giangennaro; Orcesi, Simona; Merli, Pietro; Savasta, Salvatore; Veggiotti, Pierangelo; Zuffardi, Orsetta

    2015-03-01

    We analyzed 67 epilepsy genes by next-generation sequencing (NGS) in 19 patients with different types of either isolated or syndromic epileptic disorders and in 15 controls to investigate whether a quick and cheap molecular diagnosis could be provided. The average number of nonsynonymous and splice site mutations per subject was similar in the two cohorts, indicating that, even with relatively small targeted platforms, finding the disease gene is not a univocal process. Our diagnostic yield was 47% with nine cases in which we identified a very likely causative mutation. In most of them no interpretation would have been possible in the absence of detailed phenotype and familial information. Seven out of 19 patients had a phenotype suggesting the involvement of a specific gene. Disease-causing mutations were found in six of these cases. Among the remaining patients, we could find a probably causative mutation only in three. None of the genes affected in the latter cases had been suspected a priori. Our protocol requires 8-10 weeks including the investigation of the parents with a cost per patient comparable to sequencing of 1-2 medium-to-large-sized genes by conventional techniques. The platform we used, although providing much less information than whole-exome or whole-genome sequencing, has the advantage that it can also be run on 'benchtop' sequencers combining rapid turnaround times with higher manageability. PMID:24848745

  12. Improving molecular diagnosis in epilepsy by a dedicated high-throughput sequencing platform

    PubMed Central

    Mina, Erika Della; Ciccone, Roberto; Brustia, Francesca; Bayindir, Baran; Limongelli, Ivan; Vetro, Annalisa; Iascone, Maria; Pezzoli, Laura; Bellazzi, Riccardo; Perotti, Gianfranco; De Giorgis, Valentina; Lunghi, Simona; Coppola, Giangennaro; Orcesi, Simona; Merli, Pietro; Savasta, Salvatore; Veggiotti, Pierangelo; Zuffardi, Orsetta

    2015-01-01

    We analyzed 67 epilepsy genes by next-generation sequencing (NGS) in 19 patients with different types of either isolated or syndromic epileptic disorders and in 15 controls to investigate whether a quick and cheap molecular diagnosis could be provided. The average number of nonsynonymous and splice site mutations per subject was similar in the two cohorts, indicating that, even with relatively small targeted platforms, finding the disease gene is not a univocal process. Our diagnostic yield was 47% with nine cases in which we identified a very likely causative mutation. In most of them no interpretation would have been possible in the absence of detailed phenotype and familial information. Seven out of 19 patients had a phenotype suggesting the involvement of a specific gene. Disease-causing mutations were found in six of these cases. Among the remaining patients, we could find a probably causative mutation only in three. None of the genes affected in the latter cases had been suspected a priori. Our protocol requires 8–10 weeks including the investigation of the parents with a cost per patient comparable to sequencing of 1–2 medium-to-large-sized genes by conventional techniques. The platform we used, although providing much less information than whole-exome or whole-genome sequencing, has the advantage that it can also be run on ‘benchtop’ sequencers combining rapid turnaround times with higher manageability. PMID:24848745

  13. A comparison of rumen microbial profiles in dairy cows as retrieved by 454 Roche and Ion Torrent (PGM) sequencing platforms.

    PubMed

    Indugu, Nagaraju; Bittinger, Kyle; Kumar, Sanjay; Vecchiarelli, Bonnie; Pitta, Dipti

    2016-01-01

    Next generation sequencing (NGS) technology is a widely accepted tool used by microbial ecologists to explore complex microbial communities in different ecosystems. As new NGS platforms continue to become available, it becomes imperative to compare data obtained from different platforms and analyze their effect on microbial community structure. In the present study, we compared sequencing data from both the 454 and Ion Torrent (PGM) platforms on the same DNA samples obtained from the rumen of dairy cows during their transition period. Despite the substantial difference in the number of reads, error rate and length of reads between the two platforms, we identified similar community composition between the two data sets. Procrustes analysis revealed similar correlations (M² = 0.319; P = 0.001) in the microbial community composition between the two platforms. Both platforms revealed the abundance of the same bacterial phyla which were Bacteroidetes and Firmicutes; however, PGM recovered an additional four phyla. Comparisons made at the genus level for each platform revealed differences in only a few genera such as Prevotella, Ruminococcus, Succiniclasticum and Treponema (p < 0.05; chi square test). Collectively, we conclude that the output generated from PGM and 454 yielded concurrent results, provided stringent bioinformatics pipelines are employed. PMID:26870608

  14. A comparison of rumen microbial profiles in dairy cows as retrieved by 454 Roche and Ion Torrent (PGM) sequencing platforms

    PubMed Central

    Indugu, Nagaraju; Bittinger, Kyle; Kumar, Sanjay; Vecchiarelli, Bonnie

    2016-01-01

    Next generation sequencing (NGS) technology is a widely accepted tool used by microbial ecologists to explore complex microbial communities in different ecosystems. As new NGS platforms continue to become available, it becomes imperative to compare data obtained from different platforms and analyze their effect on microbial community structure. In the present study, we compared sequencing data from both the 454 and Ion Torrent (PGM) platforms on the same DNA samples obtained from the rumen of dairy cows during their transition period. Despite the substantial difference in the number of reads, error rate and length of reads between the two platforms, we identified similar community composition between the two data sets. Procrustes analysis revealed similar correlations (M² = 0.319; P = 0.001) in the microbial community composition between the two platforms. Both platforms revealed the abundance of the same bacterial phyla which were Bacteroidetes and Firmicutes; however, PGM recovered an additional four phyla. Comparisons made at the genus level for each platform revealed differences in only a few genera such as Prevotella, Ruminococcus, Succiniclasticum and Treponema (p < 0.05; chi square test). Collectively, we conclude that the output generated from PGM and 454 yielded concurrent results, provided stringent bioinformatics pipelines are employed. PMID:26870608

  15. Open-Phylo: a customizable crowd-computing platform for multiple sequence alignment.

    PubMed

    Kwak, Daniel; Kam, Alfred; Becerra, David; Zhou, Qikuan; Hops, Adam; Zarour, Eleyine; Kam, Arthur; Sarmenta, Luis; Blanchette, Mathieu; Waldispühl, Jérôme

    2013-01-01

    Citizen science games such as Galaxy Zoo, Foldit, and Phylo aim to harness the intelligence and processing power generated by crowds of online gamers to solve scientific problems. However, the selection of the data to be analyzed through these games is under the exclusive control of the game designers, and so are the results produced by gamers. Here, we introduce Open-Phylo, a freely accessible crowd-computing platform that enables any scientist to enter our system and use crowds of gamers to assist computer programs in solving one of the most fundamental problems in genomics: the multiple sequence alignment problem. PMID:24148814

  16. SNP Discovery through Next-Generation Sequencing and Its Applications

    PubMed Central

    Kumar, Santosh; Banks, Travis W.; Cloutier, Sylvie

    2012-01-01

    The decreasing cost along with rapid progress in next-generation sequencing and related bioinformatics computing resources has facilitated large-scale discovery of SNPs in various model and nonmodel plant species. Large numbers and genome-wide availability of SNPs make them the marker of choice in partially or completely sequenced genomes. Although excellent reviews have been published on next-generation sequencing, its associated bioinformatics challenges, and the applications of SNPs in genetic studies, a comprehensive review connecting these three intertwined research areas is needed. This paper touches upon various aspects of SNP discovery, highlighting key points in availability and selection of appropriate sequencing platforms, bioinformatics pipelines, SNP filtering criteria, and applications of SNPs in genetic analyses. The use of next-generation sequencing methodologies in many non-model crops leading to discovery and implementation of SNPs in various genetic studies is discussed. The development and improvement of open-source, freely available bioinformatics software have accelerated SNP discovery while reducing the associated cost. Key considerations for SNP filtering and associated pipelines are discussed in specific topics. A list of commonly used software and their sources is compiled for easy access and reference. PMID:23227038

  17. Next-Generation Sequence Assembly: Four Stages of Data Processing and Computational Challenges

    PubMed Central

    El-Metwally, Sara; Hamza, Taher; Zakaria, Magdi; Helmy, Mohamed

    2013-01-01

    Decoding DNA symbols using next-generation sequencers was a major breakthrough in genomic research. Despite the many advantages of next-generation sequencers, e.g., the high-throughput sequencing rate and relatively low cost of sequencing, the assembly of the reads produced by these sequencers still remains a major challenge. In this review, we address the basic framework of next-generation genome sequence assemblers, which comprises four basic stages: preprocessing filtering, a graph construction process, a graph simplification process, and postprocessing filtering. Here we discuss them as a framework of four stages for data analysis and processing and survey a variety of techniques, algorithms, and software tools used during each stage. We also discuss the challenges that face current assemblers in the next-generation environment to determine the current state-of-the-art. We recommend a layered architecture approach for constructing a general assembler that can handle the sequences generated by different sequencing platforms. PMID:24348224
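
    The first two of the four stages named above (preprocessing filtering and graph construction) can be sketched as follows. The abstract does not say which graph model the review favours; a de Bruijn graph over k-mers is assumed here only because it is a common choice, and the function names, the rule of discarding reads that contain N, and the toy reads are all hypothetical.

```python
from collections import defaultdict


def filter_reads(reads):
    """Preprocessing sketch: keep only reads made of unambiguous A/C/G/T."""
    return [read for read in reads if read and set(read) <= set("ACGT")]


def build_de_bruijn(reads, k):
    """Graph construction sketch: map each (k-1)-mer to its successor (k-1)-mers.

    Every k-mer in a read contributes one edge from its prefix of length k-1
    to its suffix of length k-1; duplicate edges collapse into a set.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    toy_reads = ["ACGT", "CGTA", "ACNT"]  # invented reads, one with an N
    graph = build_de_bruijn(filter_reads(toy_reads), k=3)
    for node in sorted(graph):
        print(node, "->", sorted(graph[node]))
```

    Graph simplification and postprocessing would then operate on the returned dictionary; they are left out of this sketch.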

  18. Platforms.

    PubMed

    Josko, Deborah

    2014-01-01

    The advent of DNA sequencing technologies and the various applications that can be performed will have a dramatic effect on medicine and healthcare in the near future. There are several DNA sequencing platforms available on the market for research and clinical use. Based on the medical laboratory scientist or researcher's needs and taking into consideration laboratory space and budget, one can choose which platform will be beneficial to their institution and their patient population. Although some of the instrument costs seem high, diagnosing a patient quickly and accurately will save hospitals money with fewer hospital stays and targeted treatment based on an individual's genetic make-up. By determining the type of disease an individual has, based on the mutations present or having the ability to prescribe the appropriate antimicrobials based on the knowledge of the organism's resistance patterns, the clinician will be better able to treat and diagnose a patient which ultimately will improve patient outcomes and prognosis. PMID:25219075

  19. Impact of next generation sequencing techniques in food microbiology.

    PubMed

    Mayo, Baltasar; Rachid, Caio T C C; Alegría, Angel; Leite, Analy M O; Peixoto, Raquel S; Delgado, Susana

    2014-08-01

    Taking Maxam-Gilbert and Sanger sequencing as the first generation, recent years have seen an explosion of newly developed sequencing strategies, which are usually referred to as next generation sequencing (NGS) techniques. NGS techniques are high-throughput and produce thousands or even millions of sequences at the same time. These sequences allow for the accurate identification of microbial taxa, including uncultivable organisms and those present in small numbers. In specific applications, NGS provides a complete inventory of all microbial operons and genes present or being expressed under different study conditions. NGS techniques are revolutionizing the field of microbial ecology and have recently been used to examine several food ecosystems. After a short introduction to the most common NGS systems and platforms, this review addresses how NGS techniques have been employed in the study of food microbiota and food fermentations, and discusses their limits and perspectives. The most important findings are reviewed, including those made in the study of the microbiota of milk, fermented dairy products, and plant-, meat- and fish-derived fermented foods. The knowledge that can be gained on microbial diversity, population structure and population dynamics via the use of these technologies could be vital in improving the monitoring and manipulation of foods and fermented food products. It should also improve the safety of these products. PMID:25132799

  20. Impact of Next Generation Sequencing Techniques in Food Microbiology

    PubMed Central

    Mayo, Baltasar; Rachid, Caio T. C. C; Alegría, Ángel; Leite, Analy M. O; Peixoto, Raquel S; Delgado, Susana

    2014-01-01

    Taking Maxam-Gilbert and Sanger sequencing as the first generation, recent years have seen an explosion of newly developed sequencing strategies, which are usually referred to as next generation sequencing (NGS) techniques. NGS techniques are high-throughput and produce thousands or even millions of sequences at the same time. These sequences allow for the accurate identification of microbial taxa, including uncultivable organisms and those present in small numbers. In specific applications, NGS provides a complete inventory of all microbial operons and genes present or being expressed under different study conditions. NGS techniques are revolutionizing the field of microbial ecology and have recently been used to examine several food ecosystems. After a short introduction to the most common NGS systems and platforms, this review addresses how NGS techniques have been employed in the study of food microbiota and food fermentations, and discusses their limits and perspectives. The most important findings are reviewed, including those made in the study of the microbiota of milk, fermented dairy products, and plant-, meat- and fish-derived fermented foods. The knowledge that can be gained on microbial diversity, population structure and population dynamics via the use of these technologies could be vital in improving the monitoring and manipulation of foods and fermented food products. It should also improve the safety of these products. PMID:25132799

  1. Next Generation Sequencing Reveals the Hidden Diversity of Zooplankton Assemblages

    PubMed Central

    Harmer, Rachel A.; Somerfield, Paul J.; Atkinson, Angus

    2013-01-01

    Background: Zooplankton play an important role in our oceans, in biogeochemical cycling and providing a food source for commercially important fish larvae. However, difficulties in correctly identifying zooplankton hinder our understanding of their roles in marine ecosystem functioning, and can prevent detection of long term changes in their community structure. The advent of massively parallel next generation sequencing technology allows DNA sequence data to be recovered directly from whole community samples. Here we assess the ability of such sequencing to quantify richness and diversity of a mixed zooplankton assemblage from a productive time series site in the Western English Channel. Methodology/Principal Findings: Plankton net hauls (200 µm) were taken at the Western Channel Observatory station L4 in September 2010 and January 2011. These samples were analysed by microscopy and metagenetic analysis of the 18S nuclear small subunit ribosomal RNA gene using the 454 pyrosequencing platform. Following quality control a total of 419,041 sequences were obtained for all samples. The sequences clustered into 205 operational taxonomic units (OTUs) using a 97% similarity cut-off. Allocation of taxonomy by comparison with the National Centre for Biotechnology Information database identified 135 OTUs to species level, 11 to genus level and 1 to order; <2.5% of sequences were classified as unknowns. By comparison a skilled microscopic analyst was able to routinely enumerate only 58 taxonomic groups. Conclusions: Metagenetics reveals a previously hidden taxonomic richness, especially for Copepoda and hard-to-identify meroplankton such as Bivalvia, Gastropoda and Polychaeta. It also reveals rare species and parasites. We conclude that Next Generation Sequencing of 18S amplicons is a powerful tool for elucidating the true diversity and species richness of zooplankton communities. While this approach allows for broad diversity assessments of plankton it may become increasingly

  2. Next-Generation Sequencing for Binary Protein–Protein Interactions

    PubMed Central

    Suter, Bernhard; Zhang, Xinmin; Pesce, C. Gustavo; Mendelsohn, Andrew R.; Dinesh-Kumar, Savithramma P.; Mao, Jian-Hua

    2015-01-01

    The yeast two-hybrid (Y2H) system exploits host cell genetics in order to display binary protein–protein interactions (PPIs) via defined and selectable phenotypes. Numerous improvements have been made to this method, adapting the screening principle for diverse applications, including drug discovery and the scale-up for proteome wide interaction screens in human and other organisms. Here we discuss a systematic workflow and analysis scheme for screening data generated by Y2H and related assays that includes high-throughput selection procedures, readout of comprehensive results via next-generation sequencing (NGS), and the interpretation of interaction data via quantitative statistics. The novel assays and tools will serve the broader scientific community to harness the power of NGS technology to address PPI networks in health and disease. We discuss examples of how this next-generation platform can be applied to address specific questions in diverse fields of biology and medicine. PMID:26734059

  3. A repetitive sequence assembler based on next-generation sequencing.

    PubMed

    Lian, S; Tu, Y; Wang, Y; Chen, X; Wang, L

    2016-01-01

    Repetitive sequences of variable length are common in almost all eukaryotic genomes, and most of them are presumed to have important biomedical functions and can cause genomic instability. Next-generation sequencing (NGS) technologies make it possible to identify and capture these repetitive sequences directly from the NGS data. In this study, we assessed the performance of leading assemblers, such as Velvet, SOAPdenovo, SGA, MSR-CA, Bambus2, ALLPATHS-LG, and ABySS, in identifying and capturing repeats using three real NGS datasets. Our results indicated that most of them performed poorly in capturing the repeats. Consequently, we proposed a repetitive sequence assembler, named NGSReper, for capturing repeats from NGS data. Simulated datasets were used to validate the feasibility of NGSReper. The results indicate that the completeness of repeat capture is up to 99%. Cross validation was performed on three real NGS datasets, and extensive comparisons indicate that NGSReper performed best in terms of completeness and accuracy in capturing repeats. In conclusion, NGSReper is an appropriate and suitable tool for capturing repeats directly from NGS data. PMID:27525861

  4. Next-Generation Sequencing in Intellectual Disability.

    PubMed

    Carvill, Gemma L; Mefford, Heather C

    2015-09-01

    Next-generation sequencing technologies have revolutionized gene discovery in patients with intellectual disability (ID) and led to an unprecedented expansion in the number of genes implicated in this disorder. We discuss the strategies that have been used to identify these novel genes for both syndromic and nonsyndromic ID and highlight the phenotypic and genetic heterogeneity that underpin this condition. Finally, we discuss the future of defining the genetic etiology of ID, including the role of whole-genome sequencing, mosaicism, and the importance of diagnostic testing in ID. PMID:27617123

  5. Microfluidic Platform Generates Oxygen Landscapes for Localized Hypoxic Activation

    PubMed Central

    Rexius, Megan L.; Mauleon, Gerardo; Malik, Asrar B.; Rehman, Jalees; Eddington, David T.

    2014-01-01

    An open-well microfluidic platform generates an oxygen landscape using gas-perfused networks which diffuse across a membrane. The device enables real-time analysis of cellular and tissue responses to oxygen tension to define how cells adapt to heterogeneous oxygen conditions found in the physiological setting. We demonstrate that localized hypoxic activation of cells elicited specific metabolic and gene responses in human microvascular endothelial cells and bone marrow-derived mesenchymal stem cells. A robust demonstration of the compatibility of the device with standard laboratory techniques demonstrates the wide utility of the method. This platform is ideally suited to study real-time cell responses and cell-cell interactions within physiologically relevant oxygen landscapes. PMID:25315003

  6. deepTools: a flexible platform for exploring deep-sequencing data.

    PubMed

    Ramírez, Fidel; Dündar, Friederike; Diehl, Sarah; Grüning, Björn A; Manke, Thomas

    2014-07-01

    We present a Galaxy based web server for processing and visualizing deeply sequenced data. The web server's core functionality consists of a suite of newly developed tools, called deepTools, that enable users with little bioinformatic background to explore the results of their sequencing experiments in a standardized setting. Users can upload pre-processed files with continuous data in standard formats and generate heatmaps and summary plots in a straight-forward, yet highly customizable manner. In addition, we offer several tools for the analysis of files containing aligned reads and enable efficient and reproducible generation of normalized coverage files. As a modular and open-source platform, deepTools can easily be expanded and customized to future demands and developments. The deepTools webserver is freely available at http://deeptools.ie-freiburg.mpg.de and is accompanied by extensive documentation and tutorials aimed at conveying the principles of deep-sequencing data analysis. The web server can be used without registration. deepTools can be installed locally either stand-alone or as part of Galaxy. PMID:24799436
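
    What a "normalized coverage file" contains can be illustrated with a small sketch. This is not deepTools code and does not reproduce its algorithms: it assumes reads are represented only by their 0-based start positions, that coverage is counted in fixed-size bins, and that normalization means scaling to counts per million reads. All function names and numbers are hypothetical.

```python
def bin_counts(read_starts, genome_length, bin_size):
    """Count reads per fixed-size bin using each read's 0-based start."""
    n_bins = -(-genome_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for start in read_starts:
        if 0 <= start < genome_length:
            counts[start // bin_size] += 1
    return counts


def counts_per_million(counts):
    """Scale bin counts so that they sum to one million reads."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [count * 1_000_000 / total for count in counts]


if __name__ == "__main__":
    # Invented example: four reads on a 30 bp sequence, 10 bp bins.
    raw = bin_counts([0, 5, 12, 25], genome_length=30, bin_size=10)
    print(raw)
    print(counts_per_million(raw))
```

    Scaling by the total read count makes samples sequenced to different depths comparable bin by bin, which is the general purpose of this kind of normalization.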

  7. deepTools: a flexible platform for exploring deep-sequencing data

    PubMed Central

    Ramírez, Fidel; Dündar, Friederike; Diehl, Sarah; Grüning, Björn A.; Manke, Thomas

    2014-01-01

    We present a Galaxy based web server for processing and visualizing deeply sequenced data. The web server's core functionality consists of a suite of newly developed tools, called deepTools, that enable users with little bioinformatic background to explore the results of their sequencing experiments in a standardized setting. Users can upload pre-processed files with continuous data in standard formats and generate heatmaps and summary plots in a straight-forward, yet highly customizable manner. In addition, we offer several tools for the analysis of files containing aligned reads and enable efficient and reproducible generation of normalized coverage files. As a modular and open-source platform, deepTools can easily be expanded and customized to future demands and developments. The deepTools webserver is freely available at http://deeptools.ie-freiburg.mpg.de and is accompanied by extensive documentation and tutorials aimed at conveying the principles of deep-sequencing data analysis. The web server can be used without registration. deepTools can be installed locally either stand-alone or as part of Galaxy. PMID:24799436

  8. Initial steps towards a production platform for DNA sequence analysis on the grid

    PubMed Central

    2010-01-01

    Background: Bioinformatics is confronted with a new data explosion due to the availability of high throughput DNA sequencers. Data storage and analysis become a problem on local servers, and it is therefore necessary to switch to other IT infrastructures. Grid and workflow technology can help to handle the data more efficiently, as well as facilitate collaborations. However, interfaces to grids are often unfriendly to novice users. Results: In this study we reused a platform that was developed in the VL-e project for the analysis of medical images. Data transfer, workflow execution and job monitoring are operated from one graphical interface. We developed workflows for two sequence alignment tools (BLAST and BLAT) as a proof of concept. The analysis time was significantly reduced. All workflows and executables are available for the members of the Dutch Life Science Grid and the VL-e Medical virtual organizations. All components are open source and can be transported to other grid infrastructures. Conclusions: The availability of in-house expertise and tools facilitates the usage of grid resources by new users. Our first results indicate that this is a practical, powerful and scalable solution to address the capacity and collaboration issues raised by the deployment of next generation sequencers. We currently adopt this methodology on a daily basis for DNA sequencing and other applications. More information and source code is available via http://www.bioinformaticslaboratory.nl/ PMID:21156038

  9. Capturing genomic signatures of DNA sequence variation using a standard anonymous microarray platform

    PubMed Central

    Cannon, C. H.; Kua, C. S.; Lobenhofer, E. K.; Hurban, P.

    2006-01-01

    Comparative genomics, using the model organism approach, has provided powerful insights into the structure and evolution of whole genomes. Unfortunately, only a small fraction of Earth's biodiversity will have its genome sequenced in the foreseeable future. Most wild organisms have radically different life histories and evolutionary genomics than current model systems. A novel technique is needed to expand comparative genomics to a wider range of organisms. Here, we describe a novel approach using an anonymous DNA microarray platform that gathers genomic samples of sequence variation from any organism. Oligonucleotide probe sequences placed on a custom 44 K array were 25 bp long and designed using a simple set of criteria to maximize their complexity and dispersion in sequence probability space. Using whole genomic samples from three known genomes (mouse, rat and human) and one unknown (Gonystylus bancanus), we demonstrate and validate its power, reliability, transitivity and sensitivity. Using two separate statistical analyses, a large number of genomic ‘indicator’ probes were discovered. The construction of a genomic signature database based upon this technique would allow virtual comparisons and simple queries could generate optimal subsets of markers to be used in large-scale assays, using simple downstream techniques. Biologists from a wide range of fields, studying almost any organism, could efficiently perform genomic comparisons, at potentially any phylogenetic level after performing a small number of standardized DNA microarray hybridizations. Possibilities for refining and expanding the approach are discussed. PMID:17000641

  10. On the study of microbial transcriptomes using second- and third-generation sequencing technologies.

    PubMed

    Choi, Sang Chul

    2016-08-01

    Second-generation sequencing technologies transformed the study of microbial transcriptomes. They helped reveal the transcription start sites and antisense transcripts of microbial species, improving the microbial genome annotation. Quantification of genome-wide gene expression levels allowed for functional studies of microbial research. Ever-evolving sequencing technologies are reshaping approaches to studying microbial transcriptomes. Recently, Oxford Nanopore Technologies delivered a sequencing platform called MinION, a third-generation sequencing technology, to the research community. We expect it to be the next sequencing technology that enables breakthroughs in life science fields. The studies of microbial transcriptomes will be no exception. In this paper, we review microbial transcriptomics studies using second-generation sequencing technology. We also discuss the prospect of microbial transcriptomics studies with third-generation sequencing. PMID:27480632

  11. Clinical Integration of Next Generation Sequencing Technology

    PubMed Central

    Gullapalli, R.R.; Lyons-Weiler, M.; Petrosko, P.; Dhir, R.; Becich, M.J.; LaFramboise, W.A.

    2012-01-01

    Abstract/Synopsis Recent technological advances in Next Generation Sequencing (NGS) methods have substantially reduced cost and operational complexity leading to the production of bench top sequencers and commercial software solutions for implementation in small research and clinical laboratories. This chapter summarizes requirements and hurdles to the successful implementation of these systems including 1) calibration, validation and optimization of the instrumentation, experimental paradigm and primary readout, 2) secure transfer, storage and secondary processing of the data, 3) implementation of software tools for targeted analysis, and 4) training of research and clinical personnel to evaluate data fidelity and interpret the molecular significance of the genomic output. In light of the commercial and technological impetus to bring NGS technology into the clinical domain, it is critical that novel tests incorporate rigid protocols with built-in calibration standards and that data transfer and processing occur under exacting security measures for interpretation by clinicians with specialized training in molecular diagnostics. PMID:23078661

  12. Next Generation Sequencing in Endocrine Practice

    PubMed Central

    Forlenza, Gregory P.; Calhoun, Amy; Beckman, Kenneth B.; Halvorsen, Tanya; Hamdoun, Elwaseila; Zierhut, Heather; Sarafoglou, Kyriakie; Polgreen, Lynda E.; Miller, Bradley S.; Nathan, Brandon; Petryk, Anna

    2016-01-01

    With the completion of the Human Genome Project and advances in genomic sequencing technologies, the use of clinical molecular diagnostics has grown tremendously over the last decade. Next-generation sequencing (NGS) has overcome many of the practical roadblocks that had slowed the adoption of molecular testing for routine clinical diagnosis. In endocrinology, targeted NGS now complements biochemical testing and imaging studies. The goal of this review is to provide clinicians with a guide to the application of NGS to genetic testing for endocrine conditions, by compiling a list of established gene mutations detectable by NGS, and highlighting key phenotypic features of these disorders. As we outline in this review, the clinical utility of NGS-based molecular testing for endocrine disorders is very high. Identifying an exact genetic etiology improves understanding of the disease, provides clear explanation to families about the cause, and guides decisions about screening, prevention and/or treatment. PMID:25958132

  13. Revealing the Complexity of Breast Cancer by Next Generation Sequencing

    PubMed Central

    Verigos, John; Magklara, Angeliki

    2015-01-01

    Over the last few years the increasing usage of “-omic” platforms, supported by next-generation sequencing, in the analysis of breast cancer samples has tremendously advanced our understanding of the disease. New driver and passenger mutations, rare chromosomal rearrangements and other genomic aberrations identified by whole genome and exome sequencing are providing missing pieces of the genomic architecture of breast cancer. High resolution maps of breast cancer methylomes and sequencing of the miRNA microworld are beginning to paint the epigenomic landscape of the disease. Transcriptomic profiling is giving us a glimpse into the gene regulatory networks that govern the fate of the breast cancer cell. At the same time, integrative analysis of sequencing data confirms an extensive intertumor and intratumor heterogeneity and plasticity in breast cancer arguing for a new approach to the problem. In this review, we report on the latest findings on the molecular characterization of breast cancer using NGS technologies, and we discuss their potential implications for the improvement of existing therapies. PMID:26561834

  14. Next generation sequencing and its applications in forensic genetics.

    PubMed

    Børsting, Claus; Morling, Niels

    2015-09-01

    It has been almost a decade since the first next generation sequencing (NGS) technologies emerged and quickly changed the way genetic research is conducted. Today, full genomes are mapped and published almost weekly and with ever increasing speed and decreasing costs. NGS methods and platforms have matured during the last 10 years, and the quality of the sequences has reached a level where NGS is used in clinical diagnostics of humans. Forensic genetic laboratories have also explored NGS technologies and especially in the last year, there has been a small explosion in the number of scientific articles and presentations at conferences with forensic aspects of NGS. These contributions have demonstrated that NGS offers new possibilities for forensic genetic case work. More information may be obtained from unique samples in a single experiment by analyzing combinations of markers (STRs, SNPs, insertion/deletions, mRNA) that cannot be analyzed simultaneously with the standard PCR-CE methods used today. The true variation in core forensic STR loci has been uncovered, and previously unknown STR alleles have been discovered. The detailed sequence information may aid mixture interpretation and will increase the statistical weight of the evidence. In this review, we will give an introduction to NGS and single-molecule sequencing, and we will discuss the possible applications of NGS in forensic genetics. PMID:25704953

  15. Whole-transcriptome sequencing of Pinellia ternata using the Illumina platform.

    PubMed

    Huang, X; Jing, Y; Liu, D J; Yang, B Y; Chen, H; Li, M

    2016-01-01

    Pinelliae rhizoma is the dried tuber of Pinellia ternata (Thunb.) Breit., and has been used for thousands of years as a traditional Chinese medicine. However, its genomic background is little known. With the development of high-throughput genomic sequencing, it is now easy and cheap to obtain genomic information. In this study, 193,032,910 high-quality clean reads were generated using the Illumina Hiseq 2000 platform. A total of 53,544 unigenes were identified from the contigs assembled. Functional annotation analysis annotated 37,318, 27,697, 23,043, 22,869, 23,328, and 27,415 unigenes. KEGG analysis revealed that five pathways (169 genes) were associated with alkaloid synthesis, 201 unigenes were related to fatty acid biosynthesis (ko00061), and 133 unigenes were involved in the biosynthesis of unsaturated fatty acids (ko01040). In addition, 6703 simple sequence repeats were designed based on the unigene sequences for screening germplasm resources in the future. These data are a valuable resource for genomic studies on Pinellia plants. PMID:27420994

  16. Next-generation sequencing for mitochondrial disorders

    PubMed Central

    Carroll, C J; Brilhante, V; Suomalainen, A

    2014-01-01

    A great deal of our understanding of mitochondrial function has come from studies of inherited mitochondrial diseases, but the majority of patients still lack a molecular diagnosis. Furthermore, effective treatments for mitochondrial disorders do not exist. Development of therapies has been complicated by the fact that the diseases are extremely heterogeneous, and collecting large enough cohorts of similarly affected individuals to assess new therapies properly has been difficult. Next-generation sequencing technologies have in the last few years been shown to be an effective method for the genetic diagnosis of inherited mitochondrial diseases. Here we review the strategies and findings from studies applying next-generation sequencing methods for the genetic diagnosis of mitochondrial disorders. Detailed knowledge of molecular causes also enables collection of homogeneous cohorts of patients for therapy trials, and therefore boosts development of interventions. Linked Articles This article is part of a themed issue on Mitochondrial Pharmacology: Energy, Injury & Beyond. To view the other articles in this issue visit http://dx.doi.org/10.1111/bph.2014.171.issue-8 PMID:24138576

  17. Strategies for complete mitochondrial genome sequencing on Ion Torrent PGM™ platform in forensic sciences.

    PubMed

    Zhou, Yishu; Guo, Fei; Yu, Jiao; Liu, Feng; Zhao, Jinling; Shen, Hongying; Zhao, Bin; Jia, Fei; Sun, Zhu; Song, He; Jiang, Xianhua

    2016-05-01

    Next generation sequencing (NGS) is a time-saving and cost-efficient method to detect the complete mitochondrial genome (mtGenome) compared to Sanger sequencing. In this study we focused on developing strategies for mtGenome sequencing on the Ion Torrent PGM™ platform and NGS data analysis. In our experience, 4, 15 and 30 samples could be loaded onto Ion 314™, Ion 316™ and Ion 318™ chips respectively at a pooling concentration of 26 pM, achieving a sufficient average coverage of ≥1500× and a good strand balance of 1.05. Data processing software is essential to NGS mega data analysis. In-house Perl scripts were developed for primary data analysis to screen out uncertain positions and samples from variant call format (VCF) reports and for pedigree study to perform pairwise comparisons. The Integrative Genomic Viewer (IGV) and the NextGENe software were introduced for secondary data analysis. The mthap and EMMA tools were employed for haplogroup assignment. The dataset was reviewed and approved by the EMPOP as the final version, which showed a 2.66% error rate generated from the Torrent Variant Caller (TVC). Across the mtGenome, 4022 variants were found at 725 nucleotide positions, where the ratio of transitions to transversions was estimated at 20.89:1 and 22.18% of variants were concentrated at hypervariable segments I and II (HVS-I and HVS-II). In total, 107 complete mtGenome haplotypes were observed from 107 Northern Chinese Han and assigned to 88 haplogroups. The random match probability (RMP) of complete mtGenome was calculated as 0.009345794, decreasing 26.19% by comparison to that of HVS-I only, and the haplotype diversity (HD) was evaluated as 1, increasing 0.33% by comparison to that of HVS-I only. Principal component analysis (PCA) showed that our population clustered with East and Southeast Asians. The strategies in this study are suitable for complete mtGenome sequencing on Ion Torrent PGM™ platform and Northern Chinese Han (EMP00670) is the first
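
    The RMP and HD figures quoted in this record can be reproduced with a short calculation. The sketch below assumes the conventional definitions (RMP as the sum of squared haplotype frequencies, HD as Nei's unbiased estimator); the abstract does not state which formulas the authors used, and the haplotype labels are placeholders.

```python
from collections import Counter


def random_match_probability(haplotypes):
    """Sum of squared haplotype frequencies (assumed definition of RMP)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return sum((c / n) ** 2 for c in counts.values())


def haplotype_diversity(haplotypes):
    """Nei's unbiased estimator: n / (n - 1) * (1 - sum of p_i squared)."""
    n = len(haplotypes)
    return n / (n - 1) * (1 - random_match_probability(haplotypes))


# 107 individuals, each carrying a distinct haplotype (placeholder labels).
sample = [f"hap{i}" for i in range(107)]
rmp = random_match_probability(sample)
hd = haplotype_diversity(sample)
print(round(rmp, 9), round(hd, 6))
```

    With 107 distinct haplotypes in 107 individuals, the RMP reduces to 1/107, which matches the reported 0.009345794, and HD evaluates to 1.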

  18. A Modular Assembly Platform for Rapid Generation of DNA Constructs.

    PubMed

    Akama-Garren, Elliot H; Joshi, Nikhil S; Tammela, Tuomas; Chang, Gregory P; Wagner, Bethany L; Lee, Da-Yae; Rideout, William M; Papagiannakopoulos, Thales; Xue, Wen; Jacks, Tyler

    2016-01-01

    Traditional cloning methods have limitations on the number of DNA fragments that can be simultaneously manipulated, which dramatically slows the pace of molecular assembly. Here we describe GMAP, a Gibson assembly-based modular assembly platform consisting of a collection of promoters and genes, which allows for one-step production of DNA constructs. GMAP facilitates rapid assembly of expression and viral constructs using modular genetic components, as well as increasingly complicated genetic tools using contextually relevant genomic elements. Our data demonstrate the applicability of GMAP toward the validation of synthetic promoters, identification of potent RNAi constructs, establishment of inducible lentiviral systems, tumor initiation in genetically engineered mouse models, and gene-targeting for the generation of knock-in mice. GMAP represents a recombinant DNA technology designed for widespread circulation and easy adaptation for other uses, such as synthetic biology, genetic screens, and CRISPR-Cas9. PMID:26887506

  19. A Modular Assembly Platform for Rapid Generation of DNA Constructs

    PubMed Central

    Akama-Garren, Elliot H.; Joshi, Nikhil S.; Tammela, Tuomas; Chang, Gregory P.; Wagner, Bethany L.; Lee, Da-Yae; Rideout III, William M.; Papagiannakopoulos, Thales; Xue, Wen; Jacks, Tyler

    2016-01-01

    Traditional cloning methods have limitations on the number of DNA fragments that can be simultaneously manipulated, which dramatically slows the pace of molecular assembly. Here we describe GMAP, a Gibson assembly-based modular assembly platform consisting of a collection of promoters and genes, which allows for one-step production of DNA constructs. GMAP facilitates rapid assembly of expression and viral constructs using modular genetic components, as well as increasingly complicated genetic tools using contextually relevant genomic elements. Our data demonstrate the applicability of GMAP toward the validation of synthetic promoters, identification of potent RNAi constructs, establishment of inducible lentiviral systems, tumor initiation in genetically engineered mouse models, and gene-targeting for the generation of knock-in mice. GMAP represents a recombinant DNA technology designed for widespread circulation and easy adaptation for other uses, such as synthetic biology, genetic screens, and CRISPR-Cas9. PMID:26887506

  20. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens

    PubMed Central

    Shokralla, Shadi; Gibson, Joel F; Nikbakht, Hamid; Janzen, Daniel H; Hallwachs, Winnie; Hajibabaei, Mehrdad

    2014-01-01

    DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large-scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next-generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high-target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next-generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10-mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full-length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16 lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full-length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next-generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content. PMID:24641208
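
    The tagging step described in this record implies a demultiplexing pass that assigns each read back to its specimen. The following is a minimal sketch of that idea, assuming the 10-mer tag sits at the start of each read and is matched exactly; the tag sequences, specimen names and function name are hypothetical and are not taken from the study.

```python
from collections import defaultdict

TAG_LENGTH = 10  # the abstract describes unique 10-mer tags


def demultiplex(reads, tag_to_specimen, tag_length=TAG_LENGTH):
    """Group reads by specimen using an exact match on the leading tag."""
    bins = defaultdict(list)
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        specimen = tag_to_specimen.get(tag)
        if specimen is None:
            unassigned.append(read)  # unknown or error-containing tag
        else:
            bins[specimen].append(insert)  # store the read without its tag
    return dict(bins), unassigned


# Hypothetical tags and reads, for illustration only.
tags = {"ACGTACGTAC": "specimen_A", "TTGGCCAATT": "specimen_B"}
reads = [
    "ACGTACGTACGGGTTTAAA",
    "TTGGCCAATTCCCAAAGGG",
    "ACGTACGTACGGGTTTAAC",
    "NNNNNNNNNNACGT",
]
bins, unassigned = demultiplex(reads, tags)
print({name: len(group) for name, group in bins.items()}, len(unassigned))
```

    A production pipeline would usually tolerate a small number of tag mismatches and also check the primer sequence; exact matching is used here only to keep the sketch short.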

  1. Second-generation Sequencing for Marker Development in Sugarcane

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Second generation sequencing (also known as next-generation or massively parallel sequencing) involves the simultaneous generation of millions of short DNA sequences. The impact and applications of this technology are still emerging; however, strategies that reduce the complexity of the DNA sample p...

  2. Periodic binary sequence generators: VLSI circuits considerations

    NASA Technical Reports Server (NTRS)

    Perlman, M.

    1984-01-01

    Feedback shift registers are efficient periodic binary sequence generators. Polynomials of degree r over the Galois field of characteristic 2, GF(2), characterize the behavior of shift registers with linear logic feedback. The algorithmic determination of the trinomial of lowest degree, when it exists, that contains a given irreducible polynomial over GF(2) as a factor is presented. This corresponds to embedding the behavior of an r-stage shift register with linear logic feedback into that of an n-stage shift register with a single two-input modulo 2 summer (i.e., Exclusive-OR gate) in its feedback. This leads to Very Large Scale Integrated (VLSI) circuit architecture of maximal regularity (i.e., identical cells) with intercell communications serialized to a maximal degree.
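
    To illustrate what an n-stage shift register with a single Exclusive-OR gate in its feedback looks like in software, here is a small sketch for a register described by a trinomial x^n + x^k + 1. It does not implement the report's algorithm for finding the lowest-degree trinomial; the function names and the choice of x^4 + x + 1 are illustrative assumptions.

```python
from itertools import islice


def step(state, k):
    """Shift once: the new bit is state[0] XOR state[k] (one two-input XOR gate)."""
    return state[1:] + (state[0] ^ state[k],)


def lfsr_bits(seed, k):
    """Yield the output sequence a_t, where a_(t+n) = a_(t+k) XOR a_t."""
    state = tuple(seed)
    while True:
        yield state[0]
        state = step(state, k)


def lfsr_period(seed, k):
    """Count shifts until the register returns to its starting state."""
    start = tuple(seed)
    state, steps = step(start, k), 1
    while state != start:
        state, steps = step(state, k), steps + 1
    return steps


# x^4 + x + 1 is a primitive trinomial over GF(2): n = 4, k = 1.
seed = (1, 0, 0, 0)
first_bits = list(islice(lfsr_bits(seed, 1), 15))
print(first_bits, lfsr_period(seed, 1))
```

    Because x^4 + x + 1 is primitive, any nonzero seed cycles through all 2^4 - 1 = 15 nonzero states before repeating.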

  3. Long period pseudo random number sequence generator

    NASA Technical Reports Server (NTRS)

    Wang, Charles C. (Inventor)

    1989-01-01

    A circuit for generating a sequence of pseudo random numbers, (A sub K). There is an exponentiator in GF(2 sup m) for the normal basis representation of elements in a finite field GF(2 sup m), each represented by m binary digits, having two inputs and an output from which the sequence (A sub K) of pseudo random numbers is taken. One of the two inputs is connected to receive the outputs (E sub K) of a maximal length shift register of n stages. There is a switch having a pair of inputs and an output. The switch output is connected to the other of the two inputs of the exponentiator. One of the switch inputs is connected for initially receiving a primitive element (A sub O) in GF(2 sup m). Finally, there is a delay circuit having an input and an output. The delay circuit output is connected to the other of the switch inputs and the delay circuit input is connected to the output of the exponentiator. Thus, after the exponentiator initially receives the primitive element (A sub O) in GF(2 sup m) through the switch, the switch can be switched to cause the exponentiator to receive as its input a delayed output A(K-1) from the exponentiator, thereby generating (A sub K) continuously at the output of the exponentiator. The exponentiator in GF(2 sup m) is novel and comprises a cyclic-shift circuit, a Massey-Omura multiplier, and a control logic circuit, all operably connected together to perform the function U(sub i) = 92(sup i) (for n(sub i) = 1) or 1 (for n(sub i) = 0).

  4. Global Transcriptome Sequencing Using the Illumina Platform and the Development of EST-SSR Markers in Autotetraploid Alfalfa

    PubMed Central

    Liu, Zhipeng; Chen, Tianlong; Ma, Lichao; Zhao, Zhiguang; Zhao, Patrick X.; Nan, Zhibiao; Wang, Yanrong

    2013-01-01

    Background Alfalfa is the most widely cultivated forage legume and one of the most economically valuable crops in the world. The large size and complexity of the alfalfa genome have delayed the development of genomic resources for alfalfa research. Second-generation Illumina transcriptome sequencing is an efficient method for generating a global transcriptome sequence dataset for gene discovery and molecular marker development in alfalfa. Methodology/Principal Findings More than 28 million sequencing reads (5.64 Gb of clean nucleotides) were generated by Illumina paired-end sequencing from 15 different alfalfa tissue samples. In total, 40,433 unigenes with an average length of 803 bp were obtained by de novo assembly. Based on a sequence similarity search of known proteins, a total of 36,684 (90.73%) unigenes were annotated. In addition, 1,649 potential EST-SSRs were identified as potential molecular markers from unigenes with lengths exceeding 1 kb. A total of 100 pairs of PCR primers were randomly selected to validate the assembly quality and develop EST-SSR markers from genomic DNA. Of these primer pairs, 82 were able to amplify sequences in initial screening tests, and 27 primer pairs successfully amplified DNA fragments and detected significant amounts of polymorphism among 10 alfalfa accessions. Conclusions/Significance The present study provides global sequence data for autotetraploid alfalfa and demonstrates that the Illumina platform is a fast and effective approach to EST-SSR marker development in alfalfa. The use of these transcriptome datasets will serve as a valuable public information platform to accelerate studies of the alfalfa genome. PMID:24349529
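
    Identifying candidate EST-SSRs in assembled unigenes amounts to scanning each sequence for short motifs repeated in tandem. The sketch below shows one simple way to do this with regular expressions; the minimum repeat counts and the function names are assumptions made for illustration, and the abstract does not say which tool or thresholds the authors used.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem repeats.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. not AGAG)."""
    n = len(motif)
    return not any(n % d == 0 and motif[:d] * (n // d) == motif for d in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # Capture one motif, then require at least min_rep - 1 further copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(1)) // motif_len))
    return sorted(hits)


# Toy sequence: an (AG)7 repeat and a (GAT)5 repeat separated by flanking bases.
toy = "TTGC" + "AG" * 7 + "CCAC" + "GAT" * 5 + "TTAA"
ssrs = find_ssrs(toy)
print(ssrs)
```

    The primitive-motif check prevents a dinucleotide repeat from being reported a second time as a tetranucleotide or hexanucleotide repeat, and it also excludes homopolymer runs.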

  5. Preparation of SELEX Samples for Next-Generation Sequencing.

    PubMed

    Tolle, Fabian; Mayer, Günter

    2016-01-01

    Fuelled by massive whole genome sequencing projects such as the human genome project, enormous technological advancements and therefore tremendous price drops could be achieved, rendering next-generation sequencing very attractive for deep sequencing of SELEX libraries. Herein we describe the preparation of SELEX samples for Illumina sequencing, based on the already established whole genome sequencing workflow. We describe the addition of barcode sequences for multiplexing and the adapter ligation, avoiding associated pitfalls. PMID:26552817

  6. A novel three-round multiplex PCR for SNP genotyping with next generation sequencing.

    PubMed

    Chen, Ke; Zhou, Yu-Xun; Li, Kai; Qi, Li-Xin; Zhang, Qi-Fei; Wang, Mao-Chun; Xiao, Jun-Hua

    2016-06-01

    Owing to its high throughput and low cost, next generation sequencing has attracted much attention from researchers for SNP genotyping applications. Here, we introduce a new method based on three-round multiplex PCR to precisely genotype SNPs with next generation sequencing. This method consumes, as far as possible, equivalent amounts of each pair of specific primers, largely eliminating the amplification discrepancy between different loci. After the PCR amplification, the products can be directly subjected to a next generation sequencing platform. We simultaneously amplified 37 SNP loci of 757 samples and sequenced all amplicons on the Ion Torrent PGM platform; 90.5 % of the target SNP loci were accurately genotyped (at least 15×) and 90.4 % of amplicons had uniform coverage with a variation of less than 50-fold. Ligase detection reaction (LDR) was performed to genotype 19 SNP loci (as part of the 37 SNP loci) with 91 samples randomly selected from the 757 samples, and 99.5 % of genotyping data were consistent with the next generation sequencing results. Our results demonstrate that three-round PCR coupled with next generation sequencing is an efficient and economical genotyping approach. Graphical Abstract The schematic diagram of three-round PCR. PMID:27113460

  7. The impact of next-generation sequencing on genomics

    PubMed Central

    Zhang, Jun; Chiodini, Rod; Badr, Ahmed; Zhang, Genfa

    2011-01-01

    This article reviews basic concepts, general applications, and the potential impact of next-generation sequencing (NGS) technologies on genomics, with particular reference to currently available and possible future platforms and bioinformatics. NGS technologies have demonstrated the capacity to sequence DNA at unprecedented speed, thereby enabling previously unimaginable scientific achievements and novel biological applications. However, the massive data produced by NGS also present a significant challenge for data storage, analyses, and management solutions. Advanced bioinformatic tools are essential for the successful application of NGS technology. As evidenced throughout this review, NGS technologies will have a striking impact on genomic research and the entire biological field. With its ability to tackle challenges left unsolved by previous genomic technologies, NGS is likely to unravel the complexity of the human genome in terms of genetic variations, some of which may be confined to susceptible loci for some common human conditions. The impact of NGS technologies on genomics will be far reaching and likely change the field for years to come. PMID:21477781

  8. Guidelines for diagnostic next-generation sequencing

    PubMed Central

    Matthijs, Gert; Souche, Erika; Alders, Mariëlle; Corveleyn, Anniek; Eck, Sebastian; Feenstra, Ilse; Race, Valérie; Sistermans, Erik; Sturm, Marc; Weiss, Marjan; Yntema, Helger; Bakker, Egbert; Scheffer, Hans; Bauer, Peter

    2016-01-01

    We present, on behalf of EuroGentest and the European Society of Human Genetics, guidelines for the evaluation and validation of next-generation sequencing (NGS) applications for the diagnosis of genetic disorders. The work was performed by a group of laboratory geneticists and bioinformaticians, and discussed with clinical geneticists, industry and patients' representatives, and other stakeholders in the field of human genetics. The statements that were written during the elaboration of the guidelines are presented here. The background document and full guidelines are available as supplementary material. They include many examples to assist the laboratories in the implementation of NGS and accreditation of this service. The work and ideas presented by others in guidelines that have emerged elsewhere in the course of the past few years were also considered and are acknowledged in the full text. Interestingly, a few new insights that have not been cited before have emerged during the preparation of the guidelines. The most important new feature is the presentation of a ‘rating system’ for NGS-based diagnostic tests. The guidelines and statements have been applauded by the genetic diagnostic community, and thus seem to be valuable for the harmonization and quality assurance of NGS diagnostics in Europe. PMID:26508566

  9. Applications of next-generation sequencing techniques in plant biology

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The last several years have seen revolutionary advances in DNA sequencing technologies with the advent of next generation sequencing (NGS) techniques. NGS methods now allow millions of bases to be sequenced in one round, at a fraction of the cost relative to traditional Sanger sequencing, allowing u...

  10. Next-Generation Sequencing: A Review of Technologies and Tools for Wound Microbiome Research

    PubMed Central

    Hodkinson, Brendan P.; Grice, Elizabeth A.

    2015-01-01

    Significance: The colonization of wounds by specific microbes or communities of microbes may delay healing and/or lead to infection-related complications. Studies of wound-associated microbial communities (microbiomes) to date have primarily relied upon culture-based methods, which are known to have extreme biases and are not reliable for the characterization of microbiomes. Biofilms are very resistant to culture and are therefore especially difficult to study with techniques that remain standard in clinical settings. Recent Advances: Culture-independent approaches employing next-generation DNA sequencing have provided researchers and clinicians a window into wound-associated microbiomes that could not be achieved before and have begun to transform our view of wound-associated biodiversity. Within the past decade, many platforms have arisen for performing this type of sequencing, with various types of applications for microbiome research being possible on each. Critical Issues: Wound care incorporating knowledge of microbiomes gained from next-generation sequencing could guide clinical management and treatments. The purpose of this review is to outline the current platforms, their applications, and the steps necessary to undertake microbiome studies using next-generation sequencing. Future Directions: As DNA sequencing technology progresses, platforms will continue to produce longer reads and more reads per run at lower costs. A major future challenge is to implement these technologies in clinical settings for more precise and rapid identification of wound bioburden. PMID:25566414

  11. Generating Functions for the Powers of Fibonacci Sequences

    ERIC Educational Resources Information Center

    Terrana, D.; Chen, H.

    2007-01-01

    In this note, based on the Binet formulas and the power-reducing techniques, closed forms of generating functions for the powers of Fibonacci sequences are presented. The corresponding results are extended to some other famous sequences as well.
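
    As a concrete instance of the kind of result this record describes, the well-known closed forms x / (1 - x - x^2) for the Fibonacci numbers and x(1 - x) / (1 - 2x - 2x^2 + x^3) for their squares can be checked numerically. The sketch below expands each rational function as a power series and compares coefficients; it does not reproduce the note's derivation, and the helper names are illustrative.

```python
def series_coefficients(numerator, denominator, terms):
    """Power-series coefficients of numerator/denominator.

    Both polynomials are integer lists, lowest degree first, with denominator[0] == 1.
    """
    assert denominator[0] == 1
    coeffs = []
    for n in range(terms):
        value = numerator[n] if n < len(numerator) else 0
        for j in range(1, min(n, len(denominator) - 1) + 1):
            value -= denominator[j] * coeffs[n - j]
        coeffs.append(value)
    return coeffs


def fibonacci(terms):
    """Return [F_0, F_1, ..., F_(terms-1)] with F_0 = 0 and F_1 = 1."""
    seq, a, b = [], 0, 1
    for _ in range(terms):
        seq.append(a)
        a, b = b, a + b
    return seq


N = 20
fib = fibonacci(N)
# x / (1 - x - x^2)
first_power = series_coefficients([0, 1], [1, -1, -1], N)
# x(1 - x) / (1 - 2x - 2x^2 + x^3)
squares = series_coefficients([0, 1, -1], [1, -2, -2, 1], N)
print(first_power[:8], squares[:8])
```

    The denominator of the second function factors as (1 + x)(1 - 3x + x^2), which is why the squares satisfy a third-order linear recurrence.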

  12. An evaluation of the PacBio RS platform for sequencing and de novo assembly of a chloroplast genome

    PubMed Central

    2013-01-01

    Background Second generation sequencing has permitted detailed sequence characterisation at the whole genome level of a growing number of non-model organisms, but the data produced have short read-lengths and biased genome coverage leading to fragmented genome assemblies. The PacBio RS long-read sequencing platform offers the promise of increased read length and unbiased genome coverage and thus the potential to produce genome sequence data of a finished quality containing fewer gaps and longer contigs. However, these advantages come at a much greater cost per nucleotide and with a perceived increase in error-rate. In this investigation, we evaluated the performance of the PacBio RS sequencing platform through the sequencing and de novo assembly of the Potentilla micrantha chloroplast genome. Results Following error-correction, a total of 28,638 PacBio RS reads were recovered with a mean read length of 1,902 bp totalling 54,492,250 nucleotides and representing an average depth of coverage of 320× the chloroplast genome. The dataset covered the entire 154,959 bp of the chloroplast genome in a single contig (100% coverage) compared to seven contigs (90.59% coverage) recovered from Illumina data, and revealed no bias in coverage of GC rich regions. Post-assembly the data were largely concordant with the Illumina data generated and allowed 187 ambiguities in the Illumina data to be resolved. The additional read length also permitted small differences in the two inverted repeat regions to be assigned unambiguously. Conclusions This is the first report to our knowledge of a chloroplast genome assembled de novo using PacBio sequence data. The PacBio RS data generated here were assembled into a single large contig spanning the P. micrantha chloroplast genome, with a higher degree of accuracy than an Illumina dataset generated at a much greater depth of coverage, due to longer read lengths and lower GC bias in the data. The results we present suggest PacBio data will be

  13. Depositional sequence evolution, Paleozoic and early Mesozoic of the central Saharan platform, North Africa

    SciTech Connect

    Sprague, A.R.G.

    1991-08-01

    Over 30 depositional sequences have been identified in the Paleozoic and lower Mesozoic of the Ghadames basin of eastern Algeria, southern Tunisia, and western Libya. Well logs and lithologic information from more than 500 wells were used to correlate the 30 sequences throughout the basin (total area more than 1 million km²). Based on systematic change in the log response of strata in successively younger sequences, five groups of sequences with distinctive characteristics have been identified: Cambro-Ordovician, Upper Silurian-Middle Devonian, Upper Devonian, Carboniferous, and Middle Triassic-Middle Jurassic. Each sequence group is terminated by a major, tectonically enhanced sequence boundary that is immediately overlain (except for the Carboniferous) by a shale-prone interval deposited in response to basin-wide flooding. The four Paleozoic sequence groups were deposited on the Saharan platform, a north-facing, clastic-dominated shelf that covered most of North Africa during the Paleozoic. The sequence boundary at the top of the Carboniferous sequence group is one of several Permian-Carboniferous angular unconformities in North Africa related to the Hercynian orogeny. The youngest sequence group (Middle Triassic to Middle Jurassic) is a clastic-evaporite package that onlaps southward onto the top-of-Paleozoic sequence boundary. The progressive change, from the Cambrian to the Jurassic, in the nature of the Ghadames basin sequences is a reflection of the interplay between basin morphology and tectonics, vegetation, eustasy, climate, and sediment supply.

  14. Economic regulation of next-generation sequencing.

    PubMed

    Evans, Barbara J

    2014-01-01

    Next-generation sequencing broadens the debate about appropriate regulatory oversight of genetic testing and may force scholars to move beyond familiar privacy and health and safety regulatory issues to address new problems with industry structure and economic regulation. The genetic testing industry is passing through a period of profound structural change in response to shifts in technology and in the legal environment. Making genetic testing safe and effective for consumers increasingly requires access to comprehensive genomic data infrastructures that can support accurate, state-of-the-art interpretation of genetic test results. At present, there are significant barriers to access and there is no sector-specific regulator with power to ensure appropriate data access. Without it, genetic testing will not be safe for consumers even when it is performed at CLIA-certified laboratories using tests that have been FDA-cleared or approved. This article explores the emerging structure of the genetic testing industry and describes its present economic regulatory vacuum. In view of this gap in regulation, the article explores whether generally applicable law, particularly antitrust law, may offer solutions to the industry's data access problems. It concludes that courts may have a useful role to play, particularly in Europe and other jurisdictions where the essential facilities doctrine enjoys continued vitality. After Verizon Communications v. Law Offices of Curtis V. Trinko, the role of U.S. federal courts is less certain. Congress has demonstrated willingness to address access issues as they emerged in other infrastructure industries in recent decades. This article expresses no preference between legislative and judicial solutions. Its aim is simply to highlight an emerging economic regulatory issue which, if left unresolved, presents real health and safety concerns for consumers who receive genetic tests. PMID:25298291

  15. Historical Perspective, Development and Applications of Next-Generation Sequencing in Plant Virology

    PubMed Central

    Barba, Marina; Czosnek, Henryk; Hadidi, Ahmed

    2014-01-01

    Next-generation high throughput sequencing technologies became available at the onset of the 21st century. They provide a highly efficient, rapid, and low cost DNA sequencing platform beyond the reach of the standard and traditional DNA sequencing technologies developed in the late 1970s. They are continually improved to become faster, more efficient and cheaper. They have been used in many fields of biology since 2004. In 2009, next-generation sequencing (NGS) technologies began to be applied to several areas of plant virology including virus/viroid genome sequencing, discovery and detection, ecology and epidemiology, replication and transcription. Identification and characterization of known and unknown viruses and/or viroids in infected plants are currently among the most successful applications of these technologies. It is expected that NGS will play very significant roles in many research and non-research areas of plant virology. PMID:24399207

  16. Historical perspective, development and applications of next-generation sequencing in plant virology.

    PubMed

    Barba, Marina; Czosnek, Henryk; Hadidi, Ahmed

    2014-01-01

    Next-generation high throughput sequencing technologies became available at the onset of the 21st century. They provide a highly efficient, rapid, and low cost DNA sequencing platform beyond the reach of the standard and traditional DNA sequencing technologies developed in the late 1970s. They are continually improved to become faster, more efficient and cheaper. They have been used in many fields of biology since 2004. In 2009, next-generation sequencing (NGS) technologies began to be applied to several areas of plant virology including virus/viroid genome sequencing, discovery and detection, ecology and epidemiology, replication and transcription. Identification and characterization of known and unknown viruses and/or viroids in infected plants are currently among the most successful applications of these technologies. It is expected that NGS will play very significant roles in many research and non-research areas of plant virology. PMID:24399207

  17. Detection of Genomic Structural Variants from Next-Generation Sequencing Data

    PubMed Central

    Tattini, Lorenzo; D’Aurizio, Romina; Magi, Alberto

    2015-01-01

    Structural variants are genomic rearrangements larger than 50 bp accounting for around 1% of the variation among human genomes. They impact on phenotypic diversity and play a role in various diseases including neurological/neurocognitive disorders and cancer development and progression. Dissecting structural variants from next-generation sequencing data presents several challenges and a number of approaches have been proposed in the literature. In this mini review, we describe and summarize the latest tools – and their underlying algorithms – designed for the analysis of whole-genome sequencing, whole-exome sequencing, custom captures, and amplicon sequencing data, pointing out the major advantages/drawbacks. We also report a summary of the most recent applications of third-generation sequencing platforms. This assessment provides a guided indication – with particular emphasis on human genetics and copy number variants – for researchers involved in the investigation of these genomic events. PMID:26161383

  18. Primer and platform effects on 16S rRNA tag sequencing

    SciTech Connect

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  19. Primer and platform effects on 16S rRNA tag sequencing

    PubMed Central

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-01-01

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. Beta diversity metrics are surprisingly robust to both primer and sequencing platform biases. PMID:26300854

  20. Primer and platform effects on 16S rRNA tag sequencing

    DOE PAGESBeta

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  1. Ad 2.0: a novel recombineering platform for high-throughput generation of tailored adenoviruses.

    PubMed

    Mück-Häusl, Martin; Solanki, Manish; Zhang, Wenli; Ruzsics, Zsolt; Ehrhardt, Anja

    2015-04-30

    Recombinant adenoviruses containing a double-stranded DNA genome of 26-45 kb were broadly explored in basic virology, for vaccination purposes, for treatment of tumors based on oncolytic virotherapy, or simply as a tool for efficient gene transfer. However, the majority of recombinant adenoviral vectors (AdVs) is based on a small fraction of adenovirus types and their genetic modification. Recombineering techniques provide powerful tools for arbitrary engineering of recombinant DNA. Here, we adopted a seamless recombineering technology for high-throughput and arbitrary genetic engineering of recombinant adenoviral DNA molecules. Our cloning platform which also includes a novel recombination pipeline is based on bacterial artificial chromosomes (BACs). It enables generation of novel recombinant adenoviruses from different sources and switching between commonly used early generation AdVs and the last generation high-capacity AdVs lacking all viral coding sequences making them attractive candidates for clinical use. In combination with a novel recombination pipeline allowing cloning of AdVs containing large and complex transgenes and the possibility to generate arbitrary chimeric capsid-modified adenoviruses, these techniques allow generation of tailored AdVs with distinct features. Our technologies will pave the way toward broader applications of AdVs in molecular medicine including gene therapy and vaccination studies. PMID:25609697

  2. A research roadmap for next-generation sequencing informatics.

    PubMed

    Altman, Russ B; Prabhu, Snehit; Sidow, Arend; Zook, Justin M; Goldfeder, Rachel; Litwack, David; Ashley, Euan; Asimenos, George; Bustamante, Carlos D; Donigan, Katherine; Giacomini, Kathleen M; Johansen, Elaine; Khuri, Natalia; Lee, Eunice; Liang, Xueying Sharon; Salit, Marc; Serang, Omar; Tezak, Zivana; Wall, Dennis P; Mansfield, Elizabeth; Kass-Hout, Taha

    2016-04-20

    Next-generation sequencing technologies are fueling a wave of new diagnostic tests. Progress on a key set of nine research challenge areas will help generate the knowledge required to advance effectively these diagnostics to the clinic. PMID:27099173

  3. Polynomials Generated by the Fibonacci Sequence

    NASA Astrophysics Data System (ADS)

    Garth, David; Mills, Donald; Mitchell, Patrick

    2007-06-01

    The Fibonacci sequence's initial terms are F_0=0 and F_1=1, with F_n=F_{n-1}+F_{n-2} for n>=2. We define the polynomial sequence p by setting p_0(x)=1 and p_{n}(x)=x*p_{n-1}(x)+F_{n+1} for n>=1, with p_{n}(x)= sum_{k=0}^{n} F_{k+1}x^{n-k}. We call p_n(x) the Fibonacci-coefficient polynomial (FCP) of order n. The FCP sequence is distinct from the well-known Fibonacci polynomial sequence. We answer several questions regarding these polynomials. Specifically, we show that each even-degree FCP has no real zeros, while each odd-degree FCP has a unique, and (for degree at least 3) irrational, real zero. Further, we show that this sequence of unique real zeros converges monotonically to the negative of the golden ratio. Using Rouche's theorem, we prove that the zeros of the FCP's approach the golden ratio in modulus. We also prove a general result that gives the Mahler measures of an infinite subsequence of the FCP sequence whose coefficients are reduced modulo an integer m>=2. We then apply this to the case that m=L_n, the nth Lucas number, showing that the Mahler measure of the subsequence is phi^{n-1}, where phi=(1+sqrt 5)/2.
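
    The definitions in this abstract can be checked numerically. The following is a minimal sketch, not the authors' method: it builds the coefficients F_1, ..., F_{n+1} of p_n(x) and locates the real zero of an odd-degree FCP by bisection. The function names, the bracket [-3, 0], and the tolerance are illustrative assumptions; the bracket relies on p_n(0) = F_{n+1} > 0 and p_n(-3) < 0 for odd n.

```python
PHI = (1 + 5 ** 0.5) / 2  # golden ratio


def fcp_coefficients(n):
    """Coefficients of p_n(x), highest degree first: F_1, F_2, ..., F_{n+1}."""
    coeffs = []
    a, b = 1, 1  # F_1, F_2
    for _ in range(n + 1):
        coeffs.append(a)
        a, b = b, a + b
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial at x with Horner's rule."""
    value = 0.0
    for c in coeffs:
        value = value * x + c
    return value


def real_zero(n, tol=1e-12):
    """Bisection for the real zero of an odd-degree FCP (assumed bracket [-3, 0])."""
    if n % 2 == 0:
        raise ValueError("per the abstract, even-degree FCPs have no real zeros")
    coeffs = fcp_coefficients(n)
    lo, hi = -3.0, 0.0  # p_n(lo) < 0 < p_n(hi) for odd n
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if evaluate(coeffs, mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for n in (1, 3, 5, 9, 19, 49, 99):
        z = real_zero(n)
        print(f"n={n:3d}  zero={z:.6f}  distance to -phi={z + PHI:.6f}")
```

    Running the script prints zeros that decrease toward -phi as n grows, in line with the monotone convergence stated above; the convergence is slow, so the distance is still visible at n = 99.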

  4. Next Generation Sequencing at the University of Chicago Genomics Core

    SciTech Connect

    Faber, Pieter

    2013-04-24

    The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to State-of-the-Art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and micro-arrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.

  5. JVM: Java Visual Mapping tool for next generation sequencing read.

    PubMed

    Yang, Ye; Liu, Juan

    2015-01-01

    We developed a program, JVM (Java Visual Mapping), for mapping next generation sequencing reads to a reference sequence. The program is implemented in Java and is designed to handle the millions of short reads generated by Illumina sequencing technology. It employs a seed index strategy and octal encoding operations for sequence alignment. JVM is useful for DNA-Seq and RNA-Seq when dealing with single-end resequencing. JVM is a desktop application that supports read data sets from 1 MB to 10 GB. PMID:25387956

  6. SNP Discovery Using Next Generation Transcriptomic Sequencing.

    PubMed

    De Wit, Pierre

    2016-01-01

    In this chapter, I will guide the user through methods to find new SNP markers from expressed sequence (RNA-Seq) data, focusing on the sample preparation and also on the bioinformatic analyses needed to sort through the immense flood of data from high-throughput sequencing machines. The general steps included are as follows: sample preparation, sequencing, quality control of data, assembly, mapping, SNP discovery, filtering, validation. The first few steps are traditional laboratory protocols, whereas steps following the sequencing are of bioinformatic nature. The bioinformatics described herein are by no means exhaustive; rather, they serve as one example of a simple way of analyzing high-throughput sequence data to find SNP markers. Ideally, one would like to run through this protocol several times with a new dataset, while varying software parameters slightly, in order to determine the robustness of the results. The final validation step, although not described in much detail here, is also quite critical as that will be the final test of the accuracy of the assumptions made in silico. There is a plethora of downstream applications of a SNP dataset, not covered in this chapter. For an example of a more thorough protocol also including differential gene expression and functional enrichment analyses, BLAST annotation and downstream applications of SNP markers, a good starting point could be the "Simple Fool's Guide to population genomics via RNA-Seq," which is available at http://sfg.stanford.edu. PMID:27460371

  7. Analyzing the safety of removal sequences for piles of an offshore jacket platform

    NASA Astrophysics Data System (ADS)

    Pan, Xin-Ying; Zhang, Zhao-De

    2009-12-01

    An inevitable consequence of the development of the offshore petroleum industry is the eventual obsolescence of large offshore structures. Proper methods for removal of decommissioned offshore platforms are becoming an important topic that the oil and gas industry must pay increasing attention to. While removing sections from a decommissioned jacket platform, the stability of the remaining parts is critical. The jacket danger indices D_σ and D_s defined in this paper are very useful for analyzing the safety of any procedure planned for disassembling a jacket platform. The safest pile-cutting sequence can be determined easily by comparing every column of D_σ and D_s or simply analyzing the figures of every row of D_σ and D_s.

  8. Image encryption using random sequence generated from generalized information domain

    NASA Astrophysics Data System (ADS)

    Xia-Yan, Zhang; Guo-Ji, Zhang; Xuan, Li; Ya-Zhou, Ren; Jie-Hua, Wu

    2016-05-01

    A novel image encryption method based on the random sequence generated from the generalized information domain and permutation–diffusion architecture is proposed. The random sequence is generated by reconstruction from the generalized information file and discrete trajectory extraction from the data stream. The trajectory address sequence is used to generate a P-box to shuffle the plain image while random sequences are treated as keystreams. A new factor called drift factor is employed to accelerate and enhance the performance of the random sequence generator. An initial value is introduced to make the encryption method an approximately one-time pad. Experimental results show that the random sequences pass the NIST statistical test with a high ratio and extensive analysis demonstrates that the new encryption scheme has superior security.

  9. Variable Speed Wind Turbine Generator with Zero-sequence Filter

    DOEpatents

    Muljadi, Eduard

    1998-08-25

    A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero frequency sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility.

  10. Variable speed wind turbine generator with zero-sequence filter

    DOEpatents

    Muljadi, E.

    1998-08-25

    A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero frequency sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility. 14 figs.

  11. Variable speed wind turbine generator with zero-sequence filter

    DOEpatents

    Muljadi, Eduard

    1998-01-01

    A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero frequency sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility.
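
    The separation of the two current components described in this abstract can be illustrated with a short time-domain sketch. It is not taken from the patent: the amplitudes, the 23.5 Hz positive-sequence frequency, and the function names are hypothetical. It shows only that a balanced positive-sequence set sums to zero across the three phases at every instant, so averaging the phase currents leaves the common (zero-sequence) 60 Hz term.

```python
import math


def phase_currents(t, i_pos, f_pos, i_zero, f_zero=60.0):
    """Instantaneous currents in phases a, b, c at time t (seconds).

    Each phase carries a positive-sequence term (phases displaced by
    120 degrees) plus a zero-sequence term that is identical in all phases.
    """
    shift = 2 * math.pi / 3
    theta = 2 * math.pi * f_pos * t
    zero = i_zero * math.cos(2 * math.pi * f_zero * t)
    return tuple(i_pos * math.cos(theta - k * shift) + zero for k in range(3))


def zero_sequence(ia, ib, ic):
    """Zero-sequence component: the average of the three phase currents."""
    return (ia + ib + ic) / 3.0


if __name__ == "__main__":
    # Hypothetical values: 10 A positive sequence at 23.5 Hz, 4 A zero sequence at 60 Hz.
    for t in (0.0, 0.001, 0.004, 0.0125):
        ia, ib, ic = phase_currents(t, 10.0, 23.5, 4.0)
        expected = 4.0 * math.cos(2 * math.pi * 60.0 * t)
        print(f"t={t:.4f} s  recovered={zero_sequence(ia, ib, ic):+.6f}  expected={expected:+.6f}")
```

    Because the three currents of a star-connected winding without a neutral return must sum to zero, the same identity suggests why such a winding cannot carry the zero-sequence term.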

  12. A Study on Sequence Generation Powers of Small Cellular Automata

    NASA Astrophysics Data System (ADS)

    Kamikawa, Naoki; Umeo, Hiroshi

    A model of cellular automata (CA) is considered to be a well-studied non-linear model of complex systems in which an infinite one-dimensional array of finite state machines (cells) updates itself in a synchronous manner according to a uniform local rule. A sequence generation problem on the CAs has been studied and many scholars proposed several real-time sequence generation algorithms for a variety of non-regular sequences such as prime, Fibonacci, and {2^n | n=1,2,3,...} sequences etc. The paper describes the sequence generation powers of CAs having a small number of states, focusing on the CAs with one, two, and three internal states, respectively. The authors enumerate all of the sequences generated by two-state CAs and present several non-regular sequences that can be generated in real-time by three-state CAs, but not generated by any two-state CA. It is shown that there exists a sequence generation gap among the powers of those small CAs.

  13. Comparative depositional geometries and facies within windward rimmed platform and carbonate ramp sequences

    SciTech Connect

    Boss, S.K.; Rasmussen, K.A.; Neumann, A.C.

    1992-01-01

    Northern Great Bahama Bank (NGBB) combines geomorphic aspects of rimmed platforms and carbonate ramps in a windward (high-energy) environment. Analysis of Holocene sediment cores, seismic reflection mapping of the Holocene-Pleistocene unconformity and transgressive Holocene deposits and petrographic study of excavated Holocene submarine-cemented horizons provides an integrated view of evolving depositional geometries within both rimmed platform and ramp settings. Cores display gross textural and compositional homogeneity; all sediments are medium to coarse sands comprised of composite peloids, Halimeda sp., benthic foraminifera and molluscs. Three-dimensional seismic mapping reveals that this basal unconformity exhibits variation in topographic relief related to both constructional and erosional processes; rimmed portions of the platform are associated with topographic "plateaus" with fringing eolianite ridges or (rarely) reefs. These "plateaus" are separated by a somewhat deeper (ca. 5 m deep) "trough" exhibiting little relief, but sloping seaward to form a ramp. Multiple intrasequence cemented horizons are a common feature of the thinner deposits of the NGBB ramp where tidal exchange is vigorous and sediment deposition is episodic or in dynamic balance with sediment export. Thus, rimmed carbonate platform facies are thick marine sands with relatively little submarine cementation while open, unsheltered ramp facies are characterized by thin sediment sequences containing numerous, discontinuous submarine-cemented horizons. In the absence of other obvious facies or geomorphic indicators (e.g. preserved reefal rims), the preservation of similar depositional features in ancient limestones may serve as a useful discriminant of rimmed platform versus carbonate ramp settings.

  14. The Feasibility Study of Non-Invasive Fetal Trisomy 18 and 21 Detection with Semiconductor Sequencing Platform

    PubMed Central

    Guo, Qiwei; Chen, Jinchun; Quan, Shengmao; Zhang, Ahong; Zheng, Hailing; Zhu, Xingqiang; Lin, Jin; Xu, Huan; Wu, Ayang; Park, Sin-Gi; Kim, Byung Chul; Joo, Hee Jae; Chen, Hongliang; Bhak, Jong

    2014-01-01

    Objective Recent non-invasive prenatal testing (NIPT) technologies are based on next-generation sequencing (NGS). NGS allows rapid and effective clinical diagnoses to be determined with two common sequencing systems: Illumina and Ion Torrent platforms. The majority of NIPT technology is associated with Illumina platform. We investigated whether fetal trisomy 18 and 21 were sensitively and specifically detectable by semiconductor sequencer: Ion Proton. Methods From March 2012 to October 2013, we enrolled 155 pregnant women with fetuses who were diagnosed as high risk of fetal defects at Xiamen Maternal & Child Health Care Hospital (Xiamen, Fujian, China). Adapter-ligated DNA libraries were analyzed by the Ion Proton™ System (Life Technologies, Grand Island, NY, USA) with an average 0.3× sequencing coverage per nucleotide. Average total raw reads per sample was 6.5 million and mean rate of uniquely mapped reads was 59.0%. The results of this study were derived from BWA mapping. Z-score was used for fetal trisomy 18 and 21 detection. Results Interactive dot diagrams showed the minimal z-score values to discriminate negative versus positive cases of fetal trisomy 18 and 21. For fetal trisomy 18, the minimal z-score value of 2.459 showed 100% positive predictive and negative predictive values. The minimal z-score of 2.566 was used to classify negative versus positive cases of fetal trisomy 21. Conclusion These results provide the evidence that fetal trisomy 18 and 21 detection can be performed with semiconductor sequencer. Our data also suggest that a prospective study should be performed with a larger cohort of clinically diverse obstetrics patients. PMID:25329639
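
    The abstract does not state how the z-score was computed. The sketch below assumes one common formulation, in which the fraction of uniquely mapped reads assigned to the chromosome of interest is standardized against a set of euploid reference samples. The function names and all read fractions are hypothetical; only the cutoffs 2.459 (trisomy 18) and 2.566 (trisomy 21) come from the record above.

```python
from statistics import mean, stdev

# Minimal z-score values reported in the abstract above.
CUTOFF_T18 = 2.459
CUTOFF_T21 = 2.566


def chromosome_z_score(sample_fraction, reference_fractions):
    """Standardize a sample's chromosome read fraction against euploid references.

    Assumed formulation: z = (sample - reference mean) / reference sample SD.
    """
    mu = mean(reference_fractions)
    sigma = stdev(reference_fractions)
    return (sample_fraction - mu) / sigma


def exceeds_cutoff(z, cutoff):
    """Flag a sample whose z-score is at or above the chosen cutoff."""
    return z >= cutoff


if __name__ == "__main__":
    # Hypothetical chromosome 21 read fractions from euploid reference samples.
    reference = [0.0130, 0.0131, 0.0129, 0.0130, 0.0132, 0.0128]
    for fraction in (0.0131, 0.0136):
        z = chromosome_z_score(fraction, reference)
        print(f"fraction={fraction:.4f}  z={z:.3f}  flagged={exceeds_cutoff(z, CUTOFF_T21)}")
```

    At the low coverage reported here (about 0.3×), the score depends entirely on read counts per chromosome, so the size and composition of the reference set drive the standard deviation in the denominator.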

  15. Parallel tagged amplicon sequencing of relatively long PCR products using the Illumina HiSeq platform and transcriptome assembly.

    PubMed

    Feng, Yan-Jie; Liu, Qing-Feng; Chen, Meng-Yun; Liang, Dan; Zhang, Peng

    2016-01-01

    In phylogenetics and population genetics, a large number of loci are often needed to accurately resolve species relationships. Normally, loci are enriched by PCR and sequenced by Sanger sequencing, which is expensive when the number of amplicons is large. Next-generation sequencing (NGS) techniques are increasingly used for parallel amplicon sequencing, which reduces sequencing costs tremendously, but has not reduced preparation costs very much. Moreover, for most current NGS methods, amplicons need to be purified and quantified before sequencing and their lengths are also restricted (normally <700 bp). Here, we describe an approach to sequence pooled amplicons of any length using the Illumina platform. Using this method, amplicons are pooled at equal volume rather than at equal concentration, thus eliminating the laborious purification and quantification steps. We then shear the pooled amplicons, repair the ends, add sample identifying linkers and pool multiple samples prior to Illumina library preparation. Data are then assembled using the transcriptome assembly program trinity, which is optimized to deal with templates of highly varying quantities. We demonstrated the utility of our approach by recovering 93.5% of the target amplicons (size up to 1650 bp) in full length for a 16 taxa × 101 loci project, using ~2.0 GB of Illumina HiSeq paired-end 90-bp data. Overall, we validate a rapid, cost-effective and scalable approach to sequence a large number of targeted loci from a large number of samples that is particularly suitable for both phylogenetics and population genetics studies that require a modest scale of data. PMID:25959587

  16. Next Generation Sequencing Technologies: The Doorway to the Unexplored Genomics of Non-Model Plants

    PubMed Central

    Unamba, Chibuikem I. N.; Nag, Akshay; Sharma, Ram K.

    2015-01-01

    Non-model plants, i.e., species with one or all of the characteristics of a long life cycle, difficulty of cultivation in the laboratory, or poor fecundity, were previously left out of sequencing projects because of the high running cost of Sanger sequencing. Consequently, information about their genomics and key biological processes is inadequate. However, the recent advent of fast and cost-effective next generation sequencing (NGS) platforms has enabled the unearthing of certain characteristic gene structures unique to these species. It has also aided in gaining insight into the mechanisms underlying gene expression and secondary metabolism, and has facilitated the development of genomic resources for diversity characterization, evolutionary analysis, and marker-assisted breeding, even without prior availability of genomic sequence information. In this review we explore how different NGS platforms, as well as recent advances in NGS-based high-throughput genotyping technologies, are rewarding efforts on de novo whole genome/transcriptome sequencing and on the development of genome-wide sequence-based marker resources for improvement of non-model crops that are less costly than phenotyping. PMID:26734016

  17. Next Generation Sequencing for the Diagnosis of Cardiac Arrhythmia Syndromes

    PubMed Central

    Lubitz, Steven A.; Ellinor, Patrick T.

    2015-01-01

    Inherited arrhythmia syndromes are collectively associated with substantial morbidity, yet our understanding of the genetic architecture of these conditions remains limited. Recent technological advances in DNA sequencing have led to the commercialization of genetic testing now widely available in clinical practice. In particular, next generation sequencing allows the large-scale and rapid assessment of entire genomes. Although next generation sequencing represents a major technological advance, it has introduced numerous challenges with respect to the interpretation of genetic variation, and has opened a veritable floodgate of biological data of unknown clinical significance to practitioners. In this review, we discuss current genetic testing indications for inherited arrhythmia syndromes, broadly outline characteristics of next generation sequencing techniques, and highlight challenges associated with such testing. We further summarize future directions that will be necessary to address to enable the widespread adoption of next generation sequencing in the routine management of patients with inherited arrhythmia syndromes. PMID:25625719

  18. Methods in virus diagnostics: from ELISA to next generation sequencing.

    PubMed

    Boonham, Neil; Kreuze, Jan; Winter, Stephan; van der Vlugt, René; Bergervoet, Jan; Tomlinson, Jenny; Mumford, Rick

    2014-06-24

    Despite the seemingly continuous development of newer and ever more elaborate methods for detecting and identifying viruses, very few of these new methods are adopted for routine use in testing laboratories, often despite the many and varied advantages claimed for them. Understanding why the rate of uptake of new technologies is so low requires a strong understanding of what makes a good routine diagnostic tool to begin with. This can be done by looking at the two most successfully established plant virus detection methods: enzyme-linked immunosorbent assay (ELISA) and the more recently introduced real-time polymerase chain reaction (PCR). By examining the characteristics of this pair of technologies, it becomes clear that they share many benefits, such as an industry standard format and high levels of repeatability and reproducibility. These combine to make methods that are accessible to testing labs, easy to establish and robust in use, even with new and inexperienced users. Hence, to ensure the establishment of new techniques it is necessary not only to provide benefits not found with ELISA or real-time PCR, but also to provide a platform that is easy to establish and use. In plant virus diagnostics, recent developments can be clustered into three core areas: (1) techniques that can be performed in the field or in resource-poor locations (e.g., loop-mediated isothermal amplification, LAMP); (2) multiplex methods that are able to detect many viruses in a single test (e.g., Luminex bead arrays); and (3) methods suited to virus discovery (e.g., next generation sequencing, NGS). Field-based methods are not new, with Lateral Flow Devices (LFDs) for detection having been available for a number of years now. However, the widespread uptake of this technology remains poor. LAMP does offer significant advantages over LFDs in terms of sensitivity and generic application, but still faces challenges in terms of establishment. It is likely that the main barrier to the...

  19. Non-random DNA fragmentation in next-generation sequencing

    PubMed Central

    Poptsova, Maria S.; Il'icheva, Irina A.; Nechipurenko, Dmitry Yu.; Panchenko, Larisa A.; Khodikov, Mingian V.; Oparina, Nina Y.; Polozov, Robert V.; Nechipurenko, Yury D.; Grokhovsky, Sergei L.

    2014-01-01

    Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments, and their massive parallel sequencing. The multiple overlapping segments termed “reads” are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, previously we showed that for the sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of the hydrodynamic forces on DNA, produce similar bias. Consideration of this non-random DNA fragmentation may allow one to unravel what factors and to what extent influence the non-uniform coverage of various genomic regions. PMID:24681819
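
    The "several dozen times" requirement in the abstract above can be made concrete with the usual back-of-the-envelope depth estimate, C = N * L / G (reads times read length over genome size). The sketch below is illustrative only: the function names and example numbers are assumptions, not taken from the paper, and the Poisson estimate of uncovered bases relies on exactly the uniform, sequence-independent fragmentation that the authors call into question.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Expected mean depth C = N * L / G."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the target mean depth."""
    return math.ceil(target_coverage * genome_size / read_length)


def fraction_uncovered(coverage: float) -> float:
    """Poisson estimate of the fraction of bases with zero reads.

    Assumes read start positions are uniform and sequence-independent;
    biased fragmentation would make real coverage less even than this.
    """
    return math.exp(-coverage)


if __name__ == "__main__":
    # Hypothetical run: one million 100 bp reads on a 5 Mbp genome.
    print(mean_coverage(1_000_000, 100, 5_000_000))
    print(reads_needed(30, 150, 3_000_000))
    print(fraction_uncovered(30))
```

    Under non-random fragmentation the mean depth is unchanged, but the per-region depth deviates from the Poisson expectation, which is the non-uniform coverage the abstract refers to.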

  20. Non-random DNA fragmentation in next-generation sequencing

    NASA Astrophysics Data System (ADS)

    Poptsova, Maria S.; Il'Icheva, Irina A.; Nechipurenko, Dmitry Yu.; Panchenko, Larisa A.; Khodikov, Mingian V.; Oparina, Nina Y.; Polozov, Robert V.; Nechipurenko, Yury D.; Grokhovsky, Sergei L.

    2014-03-01

    Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments, and their massive parallel sequencing. The multiple overlapping segments termed “reads” are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, previously we showed that for the sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of the hydrodynamic forces on DNA, produce similar bias. Consideration of this non-random DNA fragmentation may allow one to unravel what factors and to what extent influence the non-uniform coverage of various genomic regions.

  1. Computer program to generate attitude error equations for a gimballed platform

    NASA Technical Reports Server (NTRS)

    Hall, W. A., Jr.; Morris, T. D.; Rone, K. Y.

    1972-01-01

    Computer program for solving attitude error equations related to gimballed platform is described. Program generates matrix elements of attitude error equations when initial matrices and trigonometric identities have been defined. Program is written for IBM 360 computer.

  2. Bioelectrochemical system platform for sustainable environmental remediation and energy generation.

    PubMed

    Wang, Heming; Luo, Haiping; Fallgren, Paul H; Jin, Song; Ren, Zhiyong Jason

    2015-01-01

    The increasing awareness of the energy-environment nexus is compelling the development of technologies that reduce environmental impacts during energy production as well as energy consumption during environmental remediation. Countries spend billions in pollution cleanup projects, and new technologies with low energy and chemical consumption are needed for sustainable remediation practice. This perspective review provides a comprehensive summary on the mechanisms of the new bioelectrochemical system (BES) platform technology for efficient and low cost remediation, including petroleum hydrocarbons, chlorinated solvents, perchlorate, azo dyes, and metals, and it also discusses the potential new uses of BES approach for some emerging contaminants remediation, such as CO2 in air and nutrients and micropollutants in water. The unique feature of BES for environmental remediation is the use of electrodes as non-exhaustible electron acceptors, or even donors, for contaminant degradation, which requires minimum energy or chemicals but instead produces sustainable energy for monitoring and other onsite uses. BES provides both oxidation (anode) and reduction (cathode) reactions that integrate microbial-electro-chemical removal mechanisms, so complex contaminants with different characteristics can be removed. We believe the BES platform carries great potential for sustainable remediation and hope this perspective provides background and insights for future research and development. PMID:25886880

  3. Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives

    PubMed Central

    Knapp, Michael; Hofreiter, Michael

    2010-01-01

    The invention of next-generation-sequencing has revolutionized almost all fields of genetics, but few have profited from it as much as the field of ancient DNA research. From its beginnings as an interesting but rather marginal discipline, ancient DNA research is now on its way into the centre of evolutionary biology. In less than a year from its invention next-generation-sequencing had increased the amount of DNA sequence data available from extinct organisms by several orders of magnitude. Ancient DNA research is now not only adding a temporal aspect to evolutionary studies and allowing for the observation of evolution in real time, it also provides important data to help understand the origins of our own species. Here we review progress that has been made in next-generation-sequencing of ancient DNA over the past five years and evaluate sequencing strategies and future directions. PMID:24710043

  4. Mining frequent biological sequences based on bitmap without candidate sequence generation.

    PubMed

    Wang, Qian; Davis, Darryl N; Ren, Jiadong

    2016-02-01

    Biological sequences carry a lot of important genetic information of organisms. Furthermore, there is an inheritance law related to protein function and structure which is useful for applications such as disease prediction. Frequent sequence mining is a core technique for association rule discovery, but existing algorithms suffer from low efficiency or poor error rate because biological sequences differ from general sequences with more characteristics. In this paper, an algorithm for mining Frequent Biological Sequence based on Bitmap, FBSB, is proposed. FBSB uses bitmaps as the simple data structure and transforms each row into a quicksort list QS-list for sequence growth. For the continuity and accuracy requirement of biological sequence mining, tested sequences used during the mining process of FBSB are real ones instead of generated candidates, and all the frequent sequences can be mined without any errors. Comparing with other algorithms, the experimental results show that FBSB can achieve a better performance on both run time and scalability. PMID:26773937
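
    The general idea of bitmap-based sequence growth described above can be sketched as follows. This is not the FBSB algorithm itself (there is no QS-list here, and every name is a hypothetical choice for illustration); it only shows how position bitmaps let a contiguous pattern be extended with a shift and an AND, so that only patterns actually present in the data are ever counted.

```python
from collections import defaultdict


def symbol_bitmaps(seq):
    """Map each symbol to an int whose bit i is set when seq[i] is that symbol."""
    maps = defaultdict(int)
    for i, ch in enumerate(seq):
        maps[ch] |= 1 << i
    return maps


def frequent_substrings(sequences, min_support):
    """Return {pattern: support} for contiguous patterns found in at least
    min_support sequences, grown one symbol at a time from frequent ones."""
    per_seq = [symbol_bitmaps(s) for s in sequences]
    alphabet = sorted({ch for s in sequences for ch in s})

    # frontier maps a pattern to its end-position bitmaps, one per sequence.
    frontier = {}
    for ch in alphabet:
        ends = [m.get(ch, 0) for m in per_seq]
        if sum(1 for e in ends if e) >= min_support:
            frontier[ch] = ends

    result = {}
    while frontier:
        next_frontier = {}
        for pattern, ends in frontier.items():
            result[pattern] = sum(1 for e in ends if e)
            for ch in alphabet:
                # Shift end positions by one and keep those where ch occurs.
                grown = [(e << 1) & m.get(ch, 0) for e, m in zip(ends, per_seq)]
                if sum(1 for g in grown if g) >= min_support:
                    next_frontier[pattern + ch] = grown
        frontier = next_frontier
    return result


if __name__ == "__main__":
    print(frequent_substrings(["ACGT", "ACGA", "TACG"], min_support=3))
```

    Because a pattern is only extended from bitmaps of real occurrences, every counted pattern exists in the input, which mirrors the abstract's point about testing real sequences rather than generated candidates.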

  5. Building a next generation platform for association studies in cacao

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The drastic reductions in cost and time associated with the collection of DNA sequence and genotype data have revolutionized genetic mapping in model systems (e.g. humans, Arabidopsis) and also promise to significantly enhance the power and resolution of genetic mapping in agricultural systems. Prog...

  6. Integrated platform for detection of DNA sequence variants using capillary array electrophoresis

    SciTech Connect

    Qingbro, Li; Liu, Zhaowei; Monroe, Heidi M; Culiat, Cymbeline T

    2002-08-01

    We have developed a highly versatile platform that performs temperature gradient capillary electrophoresis (TGCE) for mutation/single-nucleotide polymorphism (SNP) detection, sequencing and mutation/SNP genotyping for identification of sequence variants on an automated 24-, 96- or 192-capillary array instrument. In the first mode, multiple DNA samples consisting of homoduplexes and heteroduplexes are separated by CE, during which a temperature gradient is applied that covers all possible temperatures of 50% melting equilibrium (Tms) for the samples. The differences in Tms result in separation of homoduplexes from heteroduplexes, thereby identifying the presence of DNA variants. The sequencing mode is then used to determine the exact location of the mutation/SNPs in the DNA variants. The first two modes allow the rapid identification of variants from the screening of a large number of samples. Only the variants need to be sequenced. The third mode utilizes multiplexed single-base extensions (SBEs) to survey mutations and SNPs at the known sites of DNA sequence. The TGCE approach combined with sequencing and SBE is fast and cost-effective for high-throughput mutation/SNP detection.

  7. Next-generation sequencing in clinical virology: Discovery of new viruses

    PubMed Central

    Datta, Sibnarayan; Budhauliya, Raghvendra; Das, Bidisha; Chatterjee, Soumya; Vanlalhmuaka; Veer, Vijay

    2015-01-01

    Viruses are a cause of significant health problems worldwide, especially in developing nations. Due to various anthropogenic activities, human populations are exposed to different viral pathogens, many of which emerge as outbreaks. In such situations, discovery of novel viruses is of utmost importance for deciding prevention and treatment strategies. Since the last century, a number of virus discovery methods based on cell culture inoculation and sequence-independent PCR have been used to identify a variety of viruses. However, the recent emergence and commercial availability of next-generation sequencers (NGS) has entirely changed the field of virus discovery. These massively parallel sequencing platforms can sequence genetic material from a very heterogeneous mix with high sensitivity. Moreover, these platforms work in a sequence-independent manner, making them ideal tools for virus discovery. However, for their application in clinics, sample preparation or enrichment is necessary to detect low-abundance virus populations. A number of techniques have also been developed for enrichment of viral nucleic acids. In this manuscript, we review the evolution of sequencing, the NGS technologies available today, and widely used virus enrichment technologies. We also discuss the challenges associated with their application in clinical virus discovery. PMID:26279987

  8. Multiple nuclear ortholog next generation sequencing phylogeny of Daucus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencing is helping to solve the data insufficiency problem hindering well-resolved dominant gene phylogenies. We used Roche 454 technology to obtain DNA sequences from 93 nuclear orthologs, dispersed throughout all linkage groups of Daucus. Of these 93 orthologs, ten were designed...

  9. QColors: an algorithm for conservative viral quasispecies reconstruction from short and non-contiguous next generation sequencing reads.

    PubMed

    Huang, Austin; Kantor, Rami; DeLong, Allison; Schreier, Leeann; Istrail, Sorin

    Next generation sequencing technologies have recently been applied to characterize mutational spectra of the heterogeneous population of viral genotypes (known as a quasispecies) within HIV-infected patients. Such information is clinically relevant because minority genetic subpopulations of HIV within patients enable viral escape from selection pressures such as the immune response and antiretroviral therapy. However, methods for quasispecies sequence reconstruction from next generation sequencing reads are not yet widely used, and this remains an emerging area of research. Furthermore, the majority of research methodology in HIV has focused on 454 sequencing, while many next-generation sequencing platforms used in practice are limited to shorter read lengths relative to 454 sequencing. Little work has been done in determining how best to address the read length limitations of other platforms. The approach described here incorporates graph representations of both read differences and read overlap to conservatively determine the regions of the sequence with sufficient variability to separate quasispecies sequences. Within these tractable regions of quasispecies inference, we use constraint programming to solve for an optimal quasispecies subsequence determination via vertex coloring of the conflict graph, a representation which also lends itself to data with non-contiguous reads such as paired-end sequencing. We demonstrate the utility of the method by applying it to simulations based on actual intra-patient clonal HIV-1 sequencing data. PMID:23202421
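
    A conflict graph of the kind mentioned above can be illustrated with a small sketch. Reads are modelled as hypothetical (start, bases) pairs; two reads conflict when they overlap and disagree at some shared position, so they cannot come from the same quasispecies sequence. The abstract describes solving the coloring optimally with constraint programming; the sketch below swaps in a simple greedy coloring, which yields a valid but not necessarily minimal grouping.

```python
def conflicts(a, b):
    """True when two (start, bases) reads disagree at an overlapping position."""
    (sa, ra), (sb, rb) = a, b
    lo = max(sa, sb)
    hi = min(sa + len(ra), sb + len(rb))
    return any(ra[i - sa] != rb[i - sb] for i in range(lo, hi))


def greedy_coloring(reads):
    """Assign each read a color so that conflicting reads never share one."""
    n = len(reads)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if conflicts(reads[i], reads[j]):
                adj[i].add(j)
                adj[j].add(i)

    colors = {}
    # Color the most constrained reads first (highest conflict degree).
    for v in sorted(adj, key=lambda v: -len(adj[v])):
        used = {colors[u] for u in adj[v] if u in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors


if __name__ == "__main__":
    reads = [(0, "ACGT"), (2, "GTAA"), (2, "GAAA"), (10, "TTTT")]
    print(greedy_coloring(reads))
```

    Reads that do not overlap at all never conflict, which is why the same representation extends to non-contiguous data such as paired-end reads.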

  10. DTUsat-2 - The Next Generation Animal Migration Research Platform

    NASA Astrophysics Data System (ADS)

    Bjarnø, J. B.; Fléron, R. W.

    2008-08-01

    The DTUsat-2 project aims to demonstrate pico-class (<1kg) satellites as a viable platform for high value scientific investigations. Through the development of both ground and space segments in a generic miniature tracking system, the project specifically targets a major impediment faced by the biological research community, namely the lack of accurate intercontinental tracking solutions for smaller migratory species. Resolving this issue will not only push the boundaries of migratory research, but also entails the capability of bringing global remote tracking access to hitherto inaccessible sciences. This paper outlines the scope of the DTUsat-2 project and the organizational framework behind the mission. Moreover, the system level designs are discussed in relation to the latest advances on actual project implementations and development milestones.

  11. Collaborative Effort for a Centralized Worldwide Tuberculosis Relational Sequencing Data Platform.

    PubMed

    Starks, Angela M; Avilés, Enrique; Cirillo, Daniela M; Denkinger, Claudia M; Dolinger, David L; Emerson, Claudia; Gallarda, Jim; Hanna, Debra; Kim, Peter S; Liwski, Richard; Miotto, Paolo; Schito, Marco; Zignol, Matteo

    2015-10-15

    Continued progress in addressing challenges associated with detection and management of tuberculosis requires new diagnostic tools. These tools must be able to provide rapid and accurate information for detecting resistance to guide selection of the treatment regimen for each patient. To achieve this goal, globally representative genotypic, phenotypic, and clinical data are needed in a standardized and curated data platform. A global partnership of academic institutions, public health agencies, and nongovernmental organizations has been established to develop a tuberculosis relational sequencing data platform (ReSeqTB) that seeks to increase understanding of the genetic basis of resistance by correlating molecular data with results from drug susceptibility testing and, optimally, associated patient outcomes. These data will inform development of new diagnostics, facilitate clinical decision making, and improve surveillance for drug resistance. ReSeqTB offers an opportunity for collaboration to achieve improved patient outcomes and to advance efforts to prevent and control this devastating disease. PMID:26409275

  12. Next Generation Sequencing: Potential and Application in Drug Discovery

    PubMed Central

    Yadav, Navneet Kumar; Shukla, Pooja; Omer, Ankur; Pareek, Shruti; Singh, R. K.

    2014-01-01

    The world has now entered a new era of genomics because of continued advancements in next generation high-throughput sequencing technologies, which include sequencing by synthesis-fluorescent in situ sequencing (FISSEQ), pyrosequencing, sequencing by ligation using polony amplification, sequencing by oligonucleotide ligation and detection (SOLiD), sequencing by hybridization along with sequencing by ligation, and nanopore technology. These methods have had a great impact on solving genome-related problems in the plant and animal kingdoms, opening the door to a new era of genomics. They may ultimately supersede Sanger sequencing, which has dominated for 30 years. NGS is expected to advance and accelerate the drug discovery process. PMID:24688432

  13. NGS QC Toolkit: A Toolkit for Quality Control of Next Generation Sequencing Data

    PubMed Central

    Patel, Ravi K.; Jain, Mukesh

    2012-01-01

    Next generation sequencing (NGS) technologies provide a high-throughput means to generate large amounts of sequence data. However, quality control (QC) of sequence data generated from these technologies is extremely important for meaningful downstream analysis. Further, highly efficient and fast processing tools are required to handle the large volume of datasets. Here, we have developed an application, NGS QC Toolkit, for quality check and filtering of high-quality data. This toolkit is a standalone and open source application freely available at http://www.nipgr.res.in/ngsqctoolkit.html. All the tools in the application have been implemented in the Perl programming language. The toolkit comprises user-friendly tools for QC of sequencing data generated using Roche 454 and Illumina platforms, and additional tools to aid QC (sequence format converter and trimming tools) and analysis (statistics tools). A variety of options have been provided to facilitate the QC at user-defined parameters. The toolkit is expected to be very useful for the QC of NGS data to facilitate better downstream analysis. PMID:22312429
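
    To illustrate what read-level quality filtering typically involves, the following is a minimal sketch in Python. It is not the toolkit's Perl code, and its function names, thresholds and defaults are assumptions chosen for the example. It assumes FASTQ records of four lines each with Phred+33 quality encoding, and keeps a read when a minimum fraction of its bases meets a quality cutoff.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores."""
    return [ord(c) - offset for c in quality]


def is_high_quality(quality, min_q=20, min_fraction=0.7, offset=33):
    """True when at least min_fraction of bases have Phred score >= min_q."""
    scores = phred_scores(quality, offset)
    if not scores:
        return False
    good = sum(1 for q in scores if q >= min_q)
    return good / len(scores) >= min_fraction


def filter_fastq(lines, **thresholds):
    """Keep the 4-line FASTQ records whose quality line passes the filter."""
    records = [lines[i:i + 4] for i in range(0, len(lines) - 3, 4)]
    return [rec for rec in records if is_high_quality(rec[3], **thresholds)]


if __name__ == "__main__":
    fastq = [
        "@read1", "ACGT", "+", "IIII",   # all bases at Q40
        "@read2", "ACGT", "+", "!!II",   # half the bases at Q0
    ]
    for record in filter_fastq(fastq):
        print("\n".join(record))
```

    A cutoff of Q20 corresponds to an estimated 1% error probability per base; the appropriate cutoff and fraction depend on the platform and the downstream analysis.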

  14. Next generation sequencing: implications in personalized medicine and pharmacogenomics.

    PubMed

    Rabbani, Bahareh; Nakaoka, Hirofumi; Akhondzadeh, Shahin; Tekin, Mustafa; Mahdieh, Nejat

    2016-05-24

    A breakthrough in next generation sequencing (NGS) in the last decade provided an unprecedented opportunity to investigate genetic variations in humans and their roles in health and disease. NGS offers regional genomic sequencing such as whole exome sequencing of coding regions of all genes, as well as whole genome sequencing. RNA-seq offers sequencing of the entire transcriptome and ChIP-seq allows for sequencing the epigenetic architecture of the genome. Identifying genetic variations in individuals can be used to predict disease risk, with the potential to halt or retard disease progression. NGS can also be used to predict the response to or adverse effects of drugs or to calculate appropriate drug dosage. Such a personalized medicine also provides the possibility to treat diseases based on the genetic makeup of the patient. Here, we review the basics of NGS technologies and their application in human diseases to foster human healthcare and personalized medicine. PMID:27066891

  15. A Real-Time de novo DNA Sequencing Assembly Platform Based on an FPGA Implementation.

    PubMed

    Hu, Yuanqi; Georgiou, Pantelis

    2016-01-01

    This paper presents an FPGA based DNA comparison platform which can be run concurrently with the sensing phase of DNA sequencing and shortens the overall time needed for de novo DNA assembly. A hybrid overlap searching algorithm is applied which is scalable and can deal with incremental detection of new bases. To handle the incomplete data set which gradually increases during sequencing time, all-against-all comparisons are broken down into successive window-against-window comparison phases and executed using a novel dynamic suffix comparison algorithm combined with a partitioned dynamic programming method. The complete system has been designed to facilitate parallel processing in hardware, which allows real-time comparison and full scalability as well as a decrease in the number of computations required. A base pair comparison rate of 51.2 G/s is achieved when implemented on an FPGA with successful DNA comparison when using data sets from real genomes. PMID:27045828
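
    The overlap search at the heart of de novo assembly can be shown in software with a short sketch. This is a naive suffix-prefix comparison written for clarity, not the paper's dynamic suffix comparison algorithm or its hardware implementation; the function names and the minimum overlap length are assumptions.

```python
def longest_overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 when no overlap of at least min_len bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, max(min_len, 1) - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def all_overlaps(reads, min_len=3):
    """All-against-all comparison: {(i, j): overlap length} for ordered pairs."""
    overlaps = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i == j:
                continue
            k = longest_overlap(a, b, min_len)
            if k:
                overlaps[(i, j)] = k
    return overlaps


if __name__ == "__main__":
    print(all_overlaps(["ACGTAC", "TACGGA", "GGATTT"]))
```

    The quadratic all-against-all loop is the cost that the abstract's window-against-window decomposition and parallel hardware are meant to reduce.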

  16. Next-Generation Sequencing in Clinical Molecular Diagnostics of Cancer: Advantages and Challenges

    PubMed Central

    Luthra, Rajyalakshmi; Chen, Hui; Roy-Chowdhuri, Sinchita; Singh, R. Rajesh

    2015-01-01

    The application of next-generation sequencing (NGS) to characterize cancer genomes has resulted in the discovery of numerous genetic markers. Consequently, the number of markers that warrant routine screening in molecular diagnostic laboratories, often from limited tumor material, has increased. This increased demand has been difficult to manage by traditional low- and/or medium-throughput sequencing platforms. Massively parallel sequencing capabilities of NGS provide a much-needed alternative for mutation screening in multiple genes with a single low investment of DNA. However, implementation of NGS technologies, most of which are for research use only (RUO), in a diagnostic laboratory, needs extensive validation in order to establish Clinical Laboratory Improvement Amendments (CLIA) and College of American Pathologists (CAP)-compliant performance characteristics. Here, we have reviewed approaches for validation of NGS technology for routine screening of tumors. We discuss the criteria for selecting gene markers to include in the NGS panel and the deciding factors for selecting target capture approaches and sequencing platforms. We also discuss challenges in result reporting, storage and retrieval of the voluminous sequencing data and the future potential of clinical NGS. PMID:26473927

  17. Next-Generation Sequencing in Clinical Molecular Diagnostics of Cancer: Advantages and Challenges.

    PubMed

    Luthra, Rajyalakshmi; Chen, Hui; Roy-Chowdhuri, Sinchita; Singh, R Rajesh

    2015-01-01

    The application of next-generation sequencing (NGS) to characterize cancer genomes has resulted in the discovery of numerous genetic markers. Consequently, the number of markers that warrant routine screening in molecular diagnostic laboratories, often from limited tumor material, has increased. This increased demand has been difficult to manage by traditional low- and/or medium-throughput sequencing platforms. Massively parallel sequencing capabilities of NGS provide a much-needed alternative for mutation screening in multiple genes with a single low investment of DNA. However, implementation of NGS technologies, most of which are for research use only (RUO), in a diagnostic laboratory, needs extensive validation in order to establish Clinical Laboratory Improvement Amendments (CLIA) and College of American Pathologists (CAP)-compliant performance characteristics. Here, we have reviewed approaches for validation of NGS technology for routine screening of tumors. We discuss the criteria for selecting gene markers to include in the NGS panel and the deciding factors for selecting target capture approaches and sequencing platforms. We also discuss challenges in result reporting, storage and retrieval of the voluminous sequencing data and the future potential of clinical NGS. PMID:26473927

  18. A Pulse Generator Based on an Arduino Platform for Ultrasonic Applications

    NASA Astrophysics Data System (ADS)

    Acevedo, Pedro; Vázquez, Mónica; Durán, Joel; Petrearce, Rodolfo

    The objective of this work is to use the Arduino platform as an ultrasonic pulse generator to excite PVDF ultrasonic arrays in transmission. An experimental setup was implemented using a through-transmission configuration to evaluate the performance of the generator.

  19. Third Generation Sequencing Techniques and Applications to Drug Discovery

    PubMed Central

    Ozsolak, Fatih

    2012-01-01

    Introduction: There is an immediate need for functional and molecular studies to decipher differences between disease and “normal” settings to identify large quantities of validated targets with the highest therapeutic utilities. Furthermore, drug mechanism of action and biomarkers to predict drug efficacy and safety need to be identified for effective design of clinical trials, decreasing attrition rates, regulatory agency approval process and drug repositioning. By expanding the power of genetics and pharmacogenetics studies, next generation nucleic acid sequencing technologies have started to play an important role in all stages of drug discovery. Areas covered: This article reviews the first and second generation sequencing technologies (SGSTs) and challenges they pose to biomedicine. The article then focuses on the emerging third generation sequencing technologies (TGSTs), their technological foundations and potential contributions to drug discovery. Expert opinion: Despite the scientific and commercial success of SGSTs, the goal of rapid, comprehensive and unbiased sequencing of nucleic acids has not been achieved. TGSTs promise to increase sequencing throughput and read lengths, decrease costs, run times and error rates, eliminate biases inherent in SGSTs, and offer capabilities beyond nucleic acid sequencing. Such changes will have positive impact in all sequencing applications to drug discovery. PMID:22468954

  20. Pittosporum cryptic virus 1: genome sequence completion using next-generation sequencing.

    PubMed

    Elbeaino, Toufic; Kubaa, Raied Abou; Tuzlali, Hasan Tuna; Digiaro, Michele

    2016-07-01

    Next-generation sequencing (NGS) was applied to dsRNAs extracted from an Italian pittosporum plant infected with pittosporum cryptic virus 1 (PiCV1). NGS allowed assembly of the full genome sequence of PiCV1, comprising dsRNA1 (1.9 kbp) and dsRNA2 (1.5 kbp), which encode the RNA-dependent RNA polymerase and capsid protein genes, respectively. Phylogenetic and sequence analyses confirmed that PiCV1 is a new member of the genus Deltapartitivirus, family Partitiviridae. From the same plant, NGS also permitted assembly of the complete genome sequence of eggplant mottled dwarf virus (EMDV), which shared 86 % to 98 % nucleotide sequence identity with complete and partial sequences (ca 6750 nt) of other known EMDV isolates with sequences available in the GenBank database. PMID:27087112

  1. Repetitive reef to ooid sequences near leeward margin of Caicos Platform, British West Indies

    SciTech Connect

    Waltz, M.; Rossinsky, V.; Wanless, H.R.

    1987-05-01

    Drill core transects and outcrops near the leeward margin of the Caicos Platform, BWI, reveal repetitive (one Holocene and two Pleistocene) shallowing-upward sequences of either (a) reefal boundstones overlain by layered oolitic grainstones or (b) burrowed oolitic grainstones overlain by layered oolitic grainstones. Each sediment sequence is separated from the other by a calcrete exposure surface. A transect, perpendicular to the trend of an exposed Pleistocene barrier reef/ooid sand complex, shows two separate sediment packages of reefal boundstones and reef-derived skeletal packstones overlain by layered oolitic grainstones. The well-exposed upper package consists of a shallowing-upward barrier reef, which is immediately overlain by burrowed and cross-bedded oolitic grainstones, beach rock blocks, and coral rubble, capped by layered oolitic grainstones. Separated by an exposure horizon, the lowermost package consists of coral and skeletal sands overlain by layered oolitic grainstones. Cores from a transect in a non-reefal setting north of the barrier reef complex reveal highly burrowed oolitic grainstones capped by layered oolitic grainstones. As a Holocene example, immediately offshore of this transect, modern reefs and bioturbated oolitic grainstones are presently being buried beneath coral rubble, beach rock blocks, and prograding oolitic beaches. Deposition of the capping layered oolitic grainstones appears to occur during stable and falling sea levels. This co-occurrence of reefal sediment and ooid sands suggests that the two are not mutually exclusive and that reef-ooid succession is a reoccurring part of leeward margin platform margin-building.

  2. Transcriptome sequencing and development of an expression microarray platform for the domestic ferret

    PubMed Central

    2010-01-01

    Background: The ferret (Mustela putorius furo) represents an attractive animal model for the study of respiratory diseases, including influenza. Despite its importance for biomedical research, the number of reagents for molecular and immunological analysis is restricted. We present here a parallel sequencing effort to produce an extensive EST (expressed sequence tags) dataset derived from a normalized ferret cDNA library made from mRNA from ferret blood, liver, lung, spleen and brain. Results: We produced more than 500000 sequence reads that were assembled into 16000 partial ferret genes. These genes were combined with the available ferret sequences in GenBank to develop a ferret-specific microarray platform. Using this array, we detected tissue-specific expression patterns which were confirmed by quantitative real-time PCR assays. We also present a set of 41 ferret genes with even transcription profiles across the tested tissues, indicating their usefulness as housekeeping genes. Conclusion: The tools developed in this study allow for functional genomic analysis and make further development of reagents for the ferret model possible. PMID:20403183

  3. Neural Sequence Generation Using Spatiotemporal Patterns of Inhibition.

    PubMed

    Cannon, Jonathan; Kopell, Nancy; Gardner, Timothy; Markowitz, Jeffrey

    2015-11-01

    Stereotyped sequences of neural activity are thought to underlie reproducible behaviors and cognitive processes ranging from memory recall to arm movement. One of the most prominent theoretical models of neural sequence generation is the synfire chain, in which pulses of synchronized spiking activity propagate robustly along a chain of cells connected by highly redundant feedforward excitation. But recent experimental observations in the avian song production pathway during song generation have shown excitatory activity interacting strongly with the firing patterns of inhibitory neurons, suggesting a process of sequence generation more complex than feedforward excitation. Here we propose a model of sequence generation inspired by these observations in which a pulse travels along a spatially recurrent excitatory chain, passing repeatedly through zones of local feedback inhibition. In this model, synchrony and robust timing are maintained not through redundant excitatory connections, but rather through the interaction between the pulse and the spatiotemporal pattern of inhibition that it creates as it circulates the network. These results suggest that spatially and temporally structured inhibition may play a key role in sequence generation. PMID:26536029

  4. Neural Sequence Generation Using Spatiotemporal Patterns of Inhibition

    PubMed Central

    Cannon, Jonathan; Kopell, Nancy; Gardner, Timothy; Markowitz, Jeffrey

    2015-01-01

    Stereotyped sequences of neural activity are thought to underlie reproducible behaviors and cognitive processes ranging from memory recall to arm movement. One of the most prominent theoretical models of neural sequence generation is the synfire chain, in which pulses of synchronized spiking activity propagate robustly along a chain of cells connected by highly redundant feedforward excitation. But recent experimental observations in the avian song production pathway during song generation have shown excitatory activity interacting strongly with the firing patterns of inhibitory neurons, suggesting a process of sequence generation more complex than feedforward excitation. Here we propose a model of sequence generation inspired by these observations in which a pulse travels along a spatially recurrent excitatory chain, passing repeatedly through zones of local feedback inhibition. In this model, synchrony and robust timing are maintained not through redundant excitatory connections, but rather through the interaction between the pulse and the spatiotemporal pattern of inhibition that it creates as it circulates the network. These results suggest that spatially and temporally structured inhibition may play a key role in sequence generation. PMID:26536029

  5. Sequencing, De novo Assembly, Functional Annotation and Analysis of Phyllanthus amarus Leaf Transcriptome Using the Illumina Platform

    PubMed Central

    Bose Mazumdar, Aparupa; Chattopadhyay, Sharmila

    2016-01-01

    Phyllanthus amarus Schum. and Thonn., a widely distributed annual medicinal herb, has a long history of use in the traditional system of medicine for over 2000 years. However, the lack of genomic data for P. amarus, a non-model organism, hinders research at the molecular level. In the present study, high-throughput sequencing technology has been employed to improve understanding of this herb and provide comprehensive genomic information for future work. Here P. amarus leaf transcriptome was sequenced using the Illumina MiSeq platform. We assembled 85,927 non-redundant (nr) “unitranscript” sequences with an average length of 1548 bp, from 18,060,997 raw reads. Sequence similarity analyses and annotation of these unitranscripts were performed against databases including the green plants nr protein database, Gene Ontology (GO), Clusters of Orthologous Groups (COG), PlnTFDB, and KEGG. As a result, 69,394 GO terms, 583 enzyme codes (EC), 134 KEGG maps, and 59 Transcription Factor (TF) families were generated. Functional and comparative analyses of assembled unitranscripts were also performed with the most closely related species like Populus trichocarpa and Ricinus communis using TRAPID. KEGG analysis showed that a number of assembled unitranscripts were involved in secondary metabolites, mainly phenylpropanoid, flavonoid, terpenoids, alkaloids, and lignan biosynthetic pathways that have significant medicinal attributes. Further, Fragments Per Kilobase of transcript per Million mapped reads (FPKM) values of the identified secondary metabolite pathway genes were determined and Reverse Transcription PCR (RT-PCR) of a few of these genes was performed to validate the de novo assembled leaf transcriptome dataset. In addition, 65,273 simple sequence repeats (SSRs) were also identified. To the best of our knowledge, this is the first transcriptomic dataset of P. amarus to date. Our study provides the largest genetic resource that will lead to drug development and pave

  6. The 2013 seismic sequence close to gas injection platform of the Castor project, offshore Spain

    NASA Astrophysics Data System (ADS)

    Cesca, Simone; Grigoli, Francesco; Heimann, Sebastian; Gonzalez, Alvaro; Buforn, Elisa; Maghsoudi, Samira; Blanch, Estefania; Dahm, Torsten

    2014-05-01

    A spatially localized seismic sequence originated a few tens of kilometres offshore the Mediterranean coast of Spain, starting on September 5, 2013, and lasting at least until October 2013. The sequence culminated in a maximal moment magnitude Mw 4.3 earthquake, on October 1, 2013. The epicentral region is located near the offshore platform of the Castor project, where gas is conducted through a pipeline from the mainland and where it was recently injected into a depleted oil reservoir, at about 2 km depth. We analyse the temporal evolution of the seismic sequence and use full waveform techniques to derive absolute and relative locations, estimate depths and focal mechanisms for the largest events in the sequence (with magnitude mbLg larger than 3), and compare them to a previous event (April 8, 2012, mbLg 3.3) taking place in the same region prior to the gas injection. Moment tensor inversion results show that the overall seismicity in this sequence is characterized by oblique mechanisms with a normal fault component, with a 30° low-dip angle plane oriented NNE-SSW and a sub-vertical plane oriented NW-SE. The combined analysis of hypocentral location and focal mechanisms could indicate that the seismic sequence corresponds to rupture processes along sub-horizontal shallow surfaces, which could have been triggered by the gas injection in the reservoir. An alternative scenario includes the iterated triggering of a system of steep faults oriented NW-SE, which were identified by prior marine seismic investigations. The most relevant seismogenic feature in the area is the Fosa de Amposta fault system, which includes different strands mapped at different distances to the coast, with a general NE-SW orientation, roughly parallel to the coastline. No significant known historical seismicity has involved this fault in the past. Both of our scenarios exclude its activation, as its known orientation is inconsistent with the focal mechanism results.

  7. [The application of next generation sequencing on epigenetic study].

    PubMed

    Shen, Sheng; Qu, Yanchun; Zhang, Jun

    2014-03-01

    The application of next generation sequencing (NGS) techniques has had a great impact on epigenetic studies. Coupled with NGS, a number of sequencing-based methodologies have been developed and applied in epigenetic studies, such as Whole Genome Bisulfite Sequencing (WGBS), Reduced Representation Bisulfite Sequencing (RRBS), Methylated DNA Immunoprecipitation Sequencing (MeDIP-seq), Chromatin Immunoprecipitation-Sequencing (ChIP-seq), TAB-seq (Tet-assisted Bisulfite Sequencing), Chromosome Conformation Capture Sequencing (3C-seq) and various 3C-seq derivatives, DNase1-seq/MNase-seq/FAIRE-seq, and RNA Sequencing (RNA-seq). These new techniques were used to identify DNA methylation patterns and a broad range of protein/nucleic acid interactions, and to analyze chromatin conformation. With these new technologies, researchers have gained a broader view and better tools to investigate the distributions and dynamic changes of epigenetic markers affected by both internal and external factors. The principles and characteristics of major applications of NGS technologies in epigenetics are summarized, and the recent advances and future directions in NGS-based epigenetic studies are further discussed. PMID:24846966

  8. Palindromic sequence artifacts generated during next generation sequencing library preparation from historic and ancient DNA.

    PubMed

    Star, Bastiaan; Nederbragt, Alexander J; Hansen, Marianne H S; Skage, Morten; Gilfillan, Gregor D; Bradbury, Ian R; Pampoulie, Christophe; Stenseth, Nils Chr; Jakobsen, Kjetill S; Jentoft, Sissel

    2014-01-01

    Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5' and 3'-ends of sequencing reads. The palindromic sequences themselves have specific properties - the bases at the 5'-end align well to the reference genome, whereas extensive misalignments exist among the bases at the terminal 3'-end. The terminal 3' bases are artificial extensions likely caused by the occurrence of hairpin loops in single stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3'-end of DNA strands, with the 5'-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias. PMID:24608104

  9. Manipulating attentional load in sequence learning through random number generation

    PubMed Central

    Wierzchoń, Michał; Gaillard, Vinciane; Asanowicz, Dariusz; Cleeremans, Axel

    2012-01-01

    Implicit learning is often assumed to be an effortless process. However, some artificial grammar learning and sequence learning studies using dual tasks seem to suggest that attention is essential for implicit learning to occur. This discrepancy probably results from the specific type of secondary task that is used. Different secondary tasks may engage attentional resources differently and therefore may bias performance on the primary task in different ways. Here, we used a random number generation (RNG) task, which may allow for a closer monitoring of a participant’s engagement in a secondary task than the popular secondary task in sequence learning studies: tone counting (TC). In the first two experiments, we investigated the interference associated with performing RNG concurrently with a serial reaction time (SRT) task. In a third experiment, we compared the effects of RNG and TC. In all three experiments, we directly evaluated participants’ knowledge of the sequence with a subsequent sequence generation task. Sequence learning was consistently observed in all experiments, but was impaired under dual-task conditions. Most importantly, our data suggest that RNG is more demanding and impairs learning to a greater extent than TC. Nevertheless, we failed to observe effects of the secondary task in subsequent sequence generation. Our studies indicate that RNG is a promising task to explore the involvement of attention in the SRT task. PMID:22723816

  10. Clinical Next Generation Sequencing for Precision Medicine in Cancer

    PubMed Central

    Dong, Ling; Wang, Wanheng; Li, Alvin; Kansal, Rina; Chen, Yuhan; Chen, Hong; Li, Xinmin

    2015-01-01

    Rapid adoption of next generation sequencing (NGS) in genomic medicine has been driven by low cost, high throughput sequencing and rapid advances in our understanding of the genetic bases of human diseases. Today, NGS dominates the sequencing space in genomic research and has quickly entered clinical practice. Because the unique features of NGS closely match the clinical reality (the need to do more with less), NGS technology is becoming a driving force to realize the dream of precision medicine. This article describes the strengths of NGS, NGS panels used in precision medicine, current applications of NGS in cytology, and its challenges and future directions for routine clinical use. PMID:27006629

  11. Clinical Next Generation Sequencing for Precision Medicine in Cancer.

    PubMed

    Dong, Ling; Wang, Wanheng; Li, Alvin; Kansal, Rina; Chen, Yuhan; Chen, Hong; Li, Xinmin

    2015-08-01

    Rapid adoption of next generation sequencing (NGS) in genomic medicine has been driven by low cost, high throughput sequencing and rapid advances in our understanding of the genetic bases of human diseases. Today, NGS dominates the sequencing space in genomic research and has quickly entered clinical practice. Because the unique features of NGS closely match the clinical reality (the need to do more with less), NGS technology is becoming a driving force to realize the dream of precision medicine. This article describes the strengths of NGS, NGS panels used in precision medicine, current applications of NGS in cytology, and its challenges and future directions for routine clinical use. PMID:27006629

  12. Cross-platform compatibility of Hi-Plex, a streamlined approach for targeted massively parallel sequencing.

    PubMed

    Nguyen-Dumont, Tú; Pope, Bernard J; Hammet, Fleur; Mahmoodi, Maryam; Tsimiklis, Helen; Southey, Melissa C; Park, Daniel J

    2013-11-15

    Although per-base sequencing costs have decreased during recent years, library preparation for targeted massively parallel sequencing remains constrained by high reagent cost, limited design flexibility, and protocol complexity. To address these limitations, we previously developed Hi-Plex, a polymerase chain reaction (PCR) massively parallel sequencing strategy for screening panels of genomic target regions. Here, we demonstrate that Hi-Plex applied with hybrid adapters can generate a library suitable for sequencing with both the Ion Torrent and the TruSeq chemistries and that adjusting primer concentrations improves coverage uniformity. These results expand Hi-Plex capabilities as an accurate, affordable, flexible, and rapid approach for various genetic screening applications. PMID:23933242

  13. Cross-platform compatibility of Hi-Plex, a streamlined approach for targeted massively parallel sequencing

    PubMed Central

    Nguyen-Dumont, Tú; Pope, Bernard J.; Hammet, Fleur; Mahmoodi, Maryam; Tsimiklis, Helen; Southey, Melissa C.; Park, Daniel J.

    2013-01-01

    Although per-base sequencing costs have decreased during recent years, library preparation for targeted massively parallel sequencing remains constrained by high reagent cost, limited design flexibility, and protocol complexity. To address these limitations, we previously developed Hi-Plex, a polymerase chain reaction (PCR) massively parallel sequencing strategy for screening panels of genomic target regions. Here, we demonstrate that Hi-Plex applied with hybrid adapters can generate a library suitable for sequencing with both the Ion Torrent and the TruSeq chemistries and that adjusting primer concentrations improves coverage uniformity. These results expand Hi-Plex capabilities as an accurate, affordable, flexible, and rapid approach for various genetic screening applications. PMID:23933242

  14. DNA immunoprecipitation semiconductor sequencing (DIP-SC-seq) as a rapid method to generate genome wide epigenetic signatures.

    PubMed

    Thomson, John P; Fawkes, Angie; Ottaviano, Raffaele; Hunter, Jennifer M; Shukla, Ruchi; Mjoseng, Heidi K; Clark, Richard; Coutts, Audrey; Murphy, Lee; Meehan, Richard R

    2015-01-01

    Modification of DNA resulting in 5-methylcytosine (5mC) or 5-hydroxymethylcytosine (5hmC) has been shown to influence the local chromatin environment and affect transcription. Although recent advances in next generation sequencing technology allow researchers to map epigenetic modifications across the genome, such experiments are often time-consuming and cost prohibitive. Here we present a rapid and cost effective method of generating genome wide DNA modification maps utilising commercially available semiconductor based technology (DNA immunoprecipitation semiconductor sequencing; "DIP-SC-seq") on the Ion Proton sequencer. Focussing on the 5hmC mark we demonstrate, by directly comparing with alternative sequencing strategies, that this platform can successfully generate genome wide 5hmC patterns from as little as 500 ng of genomic DNA in less than 4 days. Such a method can therefore facilitate the rapid generation of multiple genome wide epigenetic datasets. PMID:25985418

  15. DNA immunoprecipitation semiconductor sequencing (DIP-SC-seq) as a rapid method to generate genome wide epigenetic signatures

    PubMed Central

    Thomson, John P.; Fawkes, Angie; Ottaviano, Raffaele; Hunter, Jennifer M.; Shukla, Ruchi; Mjoseng, Heidi K.; Clark, Richard; Coutts, Audrey; Murphy, Lee; Meehan, Richard R.

    2015-01-01

    Modification of DNA resulting in 5-methylcytosine (5mC) or 5-hydroxymethylcytosine (5hmC) has been shown to influence the local chromatin environment and affect transcription. Although recent advances in next generation sequencing technology allow researchers to map epigenetic modifications across the genome, such experiments are often time-consuming and cost prohibitive. Here we present a rapid and cost effective method of generating genome wide DNA modification maps utilising commercially available semiconductor based technology (DNA immunoprecipitation semiconductor sequencing; “DIP-SC-seq”) on the Ion Proton sequencer. Focussing on the 5hmC mark we demonstrate, by directly comparing with alternative sequencing strategies, that this platform can successfully generate genome wide 5hmC patterns from as little as 500 ng of genomic DNA in less than 4 days. Such a method can therefore facilitate the rapid generation of multiple genome wide epigenetic datasets. PMID:25985418

  16. Using Illumina next generation sequencing technologies to sequence multigene families in de novo species.

    PubMed

    Hughes, Graham M; Gang, Li; Murphy, William J; Higgins, Desmond G; Teeling, Emma C

    2013-05-01

    The advent of Next Generation Sequencing Technology (NGST) has revolutionized molecular biology research, allowing for rapid gene/genome sequencing from a multitude of diverse species. As high throughput sequencing becomes more accessible, more efficient workflows must be developed to deal with the amounts of data produced and better assemble the genomes of de novo lineages. We combine traditional laboratory methods with Illumina NGST to amplify and sequence the largest mammalian multigene family, the Olfactory Receptor gene family, for species with and without a reference genome. We develop novel assembly methods to annotate and filter these data, which can be utilized for any gene family or any species. We find no significant difference between the ratio of genes within their respective gene families of our data compared with available genomic data. Using simulated data we explore the limitations of short-read sequence data and our assembly in recovering this gene family. We highlight the benefits and shortcomings of these methods. Compared with data generated from traditional polymerase chain reaction, cloning and Sanger sequencing methodologies, sequence data generated using our pipeline increases yield and sequencing efficiency without reducing the number of unique genes amplified. A cloning step is not required, therefore shortening data generation time. The novel downstream methodologies and workflows described provide a tool to be utilized by many fields of biology, to access and analyze the vast quantities of data generated. By combining laboratory and in silico methods, we provide a means of extracting genomic information for multigene families without complete genome sequencing. PMID:23480365

  17. Next-Generation Sequencing in the Understanding of Kaposi's Sarcoma-Associated Herpesvirus (KSHV) Biology.

    PubMed

    Strahan, Roxanne; Uppal, Timsy; Verma, Subhash C

    2016-01-01

    Non-Sanger-based novel nucleic acid sequencing techniques, referred to as Next-Generation Sequencing (NGS), provide a rapid, reliable, high-throughput, and massively parallel sequencing methodology that has improved our understanding of human cancers and cancer-related viruses. NGS has become a quintessential research tool for more effective characterization of complex viral and host genomes through its ever-expanding repertoire, which consists of whole-genome sequencing, whole-transcriptome sequencing, and whole-epigenome sequencing. These new NGS platforms provide a comprehensive and systematic genome-wide analysis of genomic sequences and a full transcriptional profile at a single nucleotide resolution. When combined, these techniques help unlock the function of novel genes and the related pathways that contribute to the overall viral pathogenesis. Ongoing research in the field of virology endeavors to identify the role of various underlying mechanisms that control the regulation of the herpesvirus biphasic lifecycle in order to discover potential therapeutic targets and treatment strategies. In this review, we have compiled the most recent findings about the application of NGS in Kaposi's sarcoma-associated herpesvirus (KSHV) biology, including identification of novel genomic features and whole-genome KSHV diversities, global gene regulatory network profiling for intricate transcriptome analyses, and surveying of epigenetic marks (DNA methylation, modified histones, and chromatin remodelers) during de novo, latent, and productive KSHV infections. PMID:27043613

  18. A resampling procedure for generating conditioned daily weather sequences

    USGS Publications Warehouse

    Clark, M.P.; Gangopadhyay, S.; Brandon, D.; Werner, K.; Hay, L.; Rajagopalan, B.; Yates, D.

    2004-01-01

    A method is introduced to generate conditioned daily precipitation and temperature time series at multiple stations. The method resamples data from the historical record "nens" times for the period of interest (nens = number of ensemble members) and reorders the ensemble members to reconstruct the observed spatial (intersite) and temporal correlation statistics. The weather generator model is applied to 2307 stations in the contiguous United States and is shown to reproduce the observed spatial correlation between neighboring stations, the observed correlation between variables (e.g., between precipitation and temperature), and the observed temporal correlation between subsequent days in the generated weather sequence. The weather generator model is extended to produce sequences of weather that are conditioned on climate indices (in this case the Niño 3.4 index). Example illustrations of conditioned weather sequences are provided for a station in Arizona (Petrified Forest, 34.8°N, 109.9°W), where El Niño and La Niña conditions have a strong effect on winter precipitation. The conditioned weather sequences generated using the methods described in this paper are appropriate for use as input to hydrologic models to produce multiseason forecasts of streamflow.
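
    The resample-then-reorder idea in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: the abstract gives no implementation details, so the rank-based reordering used here, the function names (resample_ensemble, ranks, reorder_like), and the synthetic two-station record are all hypothetical.

```python
import random

def resample_ensemble(history, nens, rng):
    """Draw nens values per station, independently, from the historical record.

    history is a list of days; each day is a list with one value per station.
    Independent draws lose the correlation between stations.
    """
    n_stations = len(history[0])
    return [[rng.choice(history)[s] for s in range(n_stations)]
            for _ in range(nens)]

def ranks(values):
    """Return the rank (0 = smallest) of each element of values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank
    return result

def reorder_like(ensemble, template):
    """Reorder each station's members so their rank order matches the template.

    The template holds observations taken on the same dates at every station,
    so copying its rank structure restores the inter-station dependence.
    """
    n_stations = len(ensemble[0])
    out = [[None] * n_stations for _ in ensemble]
    for s in range(n_stations):
        sorted_vals = sorted(row[s] for row in ensemble)
        for member, rank in enumerate(ranks([row[s] for row in template])):
            out[member][s] = sorted_vals[rank]
    return out

# Hypothetical synthetic record: station 2 is correlated with station 1.
rng = random.Random(0)
history = []
for _ in range(365):
    x = rng.gammavariate(2.0, 3.0)
    history.append([x, x + rng.gammavariate(2.0, 3.0)])

nens = 50
ensemble = resample_ensemble(history, nens, rng)
template = rng.sample(history, nens)      # same dates used for all stations
reordered = reorder_like(ensemble, template)
```

    Reordering changes only which member holds which value: each station keeps exactly the values that were resampled for it, while the pairing of values across stations follows the historical template.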

  19. Use of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells.

    PubMed

    Xin, Yurong; Kim, Jinrang; Ni, Min; Wei, Yi; Okamoto, Haruka; Lee, Joseph; Adler, Christina; Cavino, Katie; Murphy, Andrew J; Yancopoulos, George D; Lin, Hsin Chieh; Gromada, Jesper

    2016-03-22

    This study provides an assessment of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells. The system combines microfluidic technology and nanoliter-scale reactions. We sequenced 622 cells, allowing identification of 341 islet cells with high-quality gene expression profiles. The cells clustered into populations of α-cells (5%), β-cells (92%), δ-cells (1%), and pancreatic polypeptide cells (2%). We identified cell-type-specific transcription factors and pathways primarily involved in nutrient sensing and oxidation and cell signaling. Unexpectedly, 281 cells had to be removed from the analysis due to low viability, low sequencing quality, or contamination resulting in the detection of more than one islet hormone. Collectively, we provide a resource for identification of high-quality gene expression datasets to help expand insights into genes and pathways characterizing islet cell types. We reveal limitations in the C1 Fluidigm cell capture process resulting in contaminated cells with altered gene expression patterns. This calls for caution when interpreting single-cell transcriptomics data using the C1 Fluidigm system. PMID:26951663

  20. Generation and analysis of expressed sequence tags from the ciliate protozoan parasite Ichthyophthirius multifiliis

    PubMed Central

    Abernathy, Jason W; Xu, Peng; Li, Ping; Xu, De-Hai; Kucuktas, Huseyin; Klesius, Phillip; Arias, Covadonga; Liu, Zhanjiang

    2007-01-01

    Background The ciliate protozoan Ichthyophthirius multifiliis (Ich) is an important parasite of freshwater fish that causes 'white spot disease' leading to significant losses. A genomic resource for large-scale studies of this parasite has been lacking. To study gene expression involved in Ich pathogenesis and virulence, our goal was to generate expressed sequence tags (ESTs) for the development of a powerful microarray platform for the analysis of global gene expression in this species. Here, we initiated a project to sequence and analyze over 10,000 ESTs. Results We sequenced 10,368 EST clones using a normalized cDNA library made from pooled samples of the trophont, tomont, and theront life-cycle stages, and generated 9,769 sequences (94.2% success rate). Post-sequencing processing led to 8,432 high quality sequences. Clustering analysis of these ESTs allowed identification of 4,706 unique sequences containing 976 contigs and 3,730 singletons. These unique sequences represent over two million base pairs (~10% of Plasmodium falciparum genome, a phylogenetically related protozoan). BLASTX searches produced 2,518 significant (E-value < 10^-5) hits and further Gene Ontology (GO) analysis annotated 1,008 of these genes. The ESTs were analyzed comparatively against the genomes of the related protozoa Tetrahymena thermophila and P. falciparum, allowing putative identification of additional genes. All the EST sequences were deposited by dbEST in GenBank (GenBank: EG957858–EG966289). Gene discovery and annotations are presented and discussed. Conclusion This set of ESTs represents a significant proportion of the Ich transcriptome, and provides a material basis for the development of microarrays useful for gene expression studies concerning Ich development, pathogenesis, and virulence. PMID:17577414
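
    The counts reported in the abstract above are internally consistent, which a short check makes explicit. The variable names are illustrative only; the numbers are taken directly from the abstract.

```python
# Figures quoted in the abstract.
clones = 10_368        # EST clones sequenced
sequences = 9_769      # sequences generated
contigs = 976
singletons = 3_730

success_rate = 100 * sequences / clones   # percentage of clones yielding a sequence
unique = contigs + singletons             # unique sequences after clustering

print(f"{success_rate:.1f}%")   # 94.2%
print(unique)                   # 4706
```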

  1. Application of next-generation sequencing in gastrointestinal and liver tumors.

    PubMed

    Mikhail, Sameh; Faltas, Bishoy; Salem, Mohamed E; Bekaii-Saab, Tanios

    2016-05-01

    Malignant transformation of normal cells is associated with the evolution of genomic alterations. This concept has led to the development of molecular testing platforms to identify genomic alterations that can be targeted with novel therapies. Next generation sequencing (NGS) has heralded a new era in precision medicine in which tumor genes can be studied efficiently. Recent developments in NGS have allowed investigators to identify genomic predictive markers and hereditary mutations to guide treatment decisions. The application of NGS in gastrointestinal cancers is being extensively studied but continues to face substantial challenges. In our review, we discuss various NGS platforms and highlight their role in identifying familial mutations and markers of response or resistance to cancer therapy. We also provide a balanced discussion of the challenges that limit the routine use of NGS in clinical practice. PMID:26916979

  2. Nanomicroarray and Multiplex Next-Generation Sequencing for Simultaneous Identification and Characterization of Influenza Viruses

    PubMed Central

    Ragupathy, Viswanath; Liu, Jikun; Wang, Xue; Vemula, Sai Vikram; El Mubarak, Haja Sittana; Ye, Zhiping; Landry, Marie L.

    2015-01-01

    Conventional methods for detection and discrimination of influenza viruses are time consuming and labor intensive. We developed a diagnostic platform for simultaneous identification and characterization of influenza viruses that uses a combination of nanomicroarray for screening and multiplex next-generation sequencing (NGS) assays for laboratory confirmation. The nanomicroarray was developed to target hemagglutinin, neuraminidase, and matrix genes to identify influenza A and B viruses. PCR amplicons synthesized by using an adapted universal primer for all 8 gene segments of 9 influenza A subtypes were detected in the nanomicroarray and confirmed by the NGS assays. This platform can simultaneously detect and differentiate multiple influenza A subtypes in a single sample. Use of these methods as part of a new diagnostic algorithm for detection and confirmation of influenza infections may provide ongoing public health benefits by assisting with future epidemiologic studies and improving preparedness for potential influenza pandemics. PMID:25694248

  3. Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline

    PubMed Central

    2014-01-01

    Background Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. Results To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. Conclusions By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples. PMID:24475911

  4. Visualizing next-generation sequencing data with JBrowse

    PubMed Central

    Westesson, Oscar; Skinner, Mitchell

    2013-01-01

    JBrowse is a web-based genome browser, allowing many sources of data to be visualized, interpreted and navigated in a coherent visual framework. JBrowse uses efficient data structures, pre-generation of image tiles and client-side rendering to provide a fast, interactive browsing experience. Many of JBrowse's design features make it well suited for visualizing high-volume data, such as aligned next-generation sequencing reads. PMID:22411711

  5. Pattern Recognition on Read Positioning in Next Generation Sequencing

    PubMed Central

    Byeon, Boseon; Kovalchuk, Igor

    2016-01-01

    The usefulness and the utility of the next generation sequencing (NGS) technology are based on the assumption that the DNA or cDNA cleavage required to generate short sequence reads is random. Several previous reports suggest the existence of sequencing bias of NGS reads. To address this question in greater detail, we analyze NGS data from four organisms with different GC content, Plasmodium falciparum (19.39%), Arabidopsis thaliana (36.03%), Homo sapiens (40.91%) and Streptomyces coelicolor (72.00%). Using machine learning techniques, we recognize the pattern that the NGS read start is positioned in the local region where the nucleotide distribution is dissimilar from the global nucleotide distribution. We also demonstrate that the mono-nucleotide distribution underestimates sequencing bias, and the recognized pattern is explained largely by the distribution of multi-nucleotides (di-, tri-, and tetra- nucleotides) rather than mono-nucleotides. This implies that the correction of sequencing bias needs to be performed on the basis of the multi-nucleotide distribution. Providing companion software to quantify the effect of the recognized pattern on read positioning, we exemplify that the bias correction based on the mono-nucleotide distribution may not be sufficient to clean sequencing bias. PMID:27299343
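
    The claim that mono-nucleotide distributions can underestimate bias is easy to illustrate with a toy comparison. The sketch below is not the authors' machine learning method or their companion software; the helper names (kmer_freqs, total_variation), the choice of total variation distance, and the two short sequences are assumptions made purely for illustration.

```python
from collections import Counter

def kmer_freqs(seq, k):
    """Relative frequency of every overlapping k-mer in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

def total_variation(p, q):
    """Total variation distance between two frequency tables (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(key, 0.0) - q.get(key, 0.0)) for key in keys)

# Two toy sequences with the same base composition (50% A, 50% C)
# but a different ordering of bases.
local_region = "ACACACAC"
global_region = "AACCAACC"

mono = total_variation(kmer_freqs(local_region, 1), kmer_freqs(global_region, 1))
di = total_variation(kmer_freqs(local_region, 2), kmer_freqs(global_region, 2))
print(mono, di)
```

    Here the mono-nucleotide distance is zero while the di-nucleotide distance is 4/7, so a comparison based only on single bases would report no dissimilarity between the two regions.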

  6. PepTool and GeneTool: platform-independent tools for biological sequence analysis.

    PubMed

    Wishart, D S; Stothard, P; Van Domselaar, G H

    2000-01-01

    Although we are unable to discuss all of the functionality available in PepTool and GeneTool, it should be evident from this brief review that both packages offer a great deal in terms of functionality and ease-of-use. Furthermore, a number of useful innovations including platform-independent GUI design, networked parallelism, direct internet connectivity, database compression, and a variety of enhanced or improved algorithms should make these two programs particularly useful in the rapidly changing world of biological sequence analysis. More complete descriptions of the programs, algorithms and operation of PepTool and GeneTool are available on the BioTools web site (www.biotools.com), in the associated program user manuals and in the on-line Help pages. PMID:10547833

  7. An Evolution Based Biosensor Receptor DNA Sequence Generation Algorithm

    PubMed Central

    Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M.; Lee, Jaewan; Zang, Yupeng

    2010-01-01

    A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements. PMID:22315543

  8. Generation of allocation sequences in randomised trials: chance, not choice.

    PubMed

    Schulz, Kenneth F; Grimes, David A

    2002-02-01

    The randomised controlled trial sets the gold standard of clinical research. However, randomisation persists as perhaps the least-understood aspect of a trial. Moreover, anything short of proper randomisation courts selection and confounding biases. Researchers should spurn all systematic, non-random methods of allocation. Trial participants should be assigned to comparison groups based on a random process. Simple (unrestricted) randomisation, analogous to repeated fair coin-tossing, is the most basic of sequence generation approaches. Furthermore, no other approach, irrespective of its complexity and sophistication, surpasses simple randomisation for prevention of bias. Investigators should, therefore, use this method more often than they do, and readers should expect and accept disparities in group sizes. Several other complicated restricted randomisation procedures limit the likelihood of undesirable sample size imbalances in the intervention groups. The most frequently used restricted sequence generation procedure is blocked randomisation. If this method is used, investigators should randomly vary the block sizes and use larger block sizes, particularly in an unblinded trial. Other restricted procedures, such as urn randomisation, combine beneficial attributes of simple and restricted randomisation by preserving most of the unpredictability while achieving some balance. The effectiveness of stratified randomisation depends on use of a restricted randomisation approach to balance the allocation sequences for each stratum. Generation of a proper randomisation sequence takes little time and effort but affords big rewards in scientific accuracy and credibility. Investigators should devote appropriate resources to the generation of properly randomised trials and reporting their methods clearly. PMID:11853818
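
    The two sequence generation approaches named above, simple randomisation and blocked randomisation with randomly varied block sizes, can be sketched as follows. This is a minimal illustration under stated assumptions, not a validated trial tool: the arm labels, the default block sizes, and the function names are hypothetical, and a real trial would also need allocation concealment and an auditable random source.

```python
import random


def simple_randomisation(n, arms=("A", "B"), rng=None):
    """Assign each of n participants independently, like repeated fair coin tosses."""
    rng = rng or random.Random()
    return [rng.choice(arms) for _ in range(n)]


def blocked_randomisation(n, arms=("A", "B"), block_sizes=(4, 6, 8), rng=None):
    """Permuted-block allocation with randomly varied block sizes.

    Every block contains each arm equally often, in shuffled order.
    The last block is truncated if n is reached part-way through it.
    """
    rng = rng or random.Random()
    if any(size % len(arms) for size in block_sizes):
        raise ValueError("each block size must be a multiple of the number of arms")
    sequence = []
    while len(sequence) < n:
        size = rng.choice(block_sizes)            # vary the block size at random
        block = list(arms) * (size // len(arms))  # balanced block
        rng.shuffle(block)                        # random order within the block
        sequence.extend(block)
    return sequence[:n]
```

    With two arms and a largest block of 8, the blocked sequence can never be more than 4 allocations out of balance, whereas the simple sequence carries no such guarantee; that is the disparity in group sizes the abstract asks readers to accept.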

  9. Microfluidic platform for isolating nucleic acid targets using sequence specific hybridization

    PubMed Central

    Wang, Jingjing; Morabito, Kenneth; Tang, Jay X.; Tripathi, Anubhav

    2013-01-01

    The separation of target nucleic acid sequences from biological samples has emerged as a significant process in today's diagnostics and detection strategies. In addition to the possible clinical applications, the fundamental understanding of target and sequence specific hybridization on surface modified magnetic beads is of high value. In this paper, we describe a novel microfluidic platform that utilizes a mobile magnetic field in static microfluidic channels, where single stranded DNA (ssDNA) molecules are isolated via nucleic acid hybridization. We first established efficient isolation of biotinylated capture probe (BP) using streptavidin-coated magnetic beads. Subsequently, we investigated the hybridization of target ssDNA with BP bound to beads and explained these hybridization kinetics using a dual-species kinetic model. The number of hybridized target ssDNA molecules was determined to be about 6.5 times less than that of BP on the bead surface, due to steric hindrance effects. The hybridization of target ssDNA with non-complementary BP bound to bead was also examined, and non-specific hybridization was found to be insignificant. Finally, we demonstrated highly efficient capture and isolation of target ssDNA in the presence of non-target ssDNA, where as low as 1% target ssDNA can be detected from mixture. The microfluidic method described in this paper is significantly relevant and is broadly applicable, especially towards point-of-care biological diagnostic platforms that require binding and separation of known target biomolecules, such as RNA, ssDNA, or protein. PMID:24404041

  10. Next Generation Sequencing to Characterize Mitochondrial Genomic DNA Heteroplasmy

    PubMed Central

    Huang, Taosheng

    2015-01-01

    This protocol describes a methodology for characterizing mitochondrial DNA (mtDNA) heteroplasmy with parallel sequencing. Mitochondria play a very important role in cellular functions. Each eukaryotic cell contains hundreds of mitochondria with hundreds of mitochondrial genomes. The mutant mtDNA and the wild type may co-exist as heteroplasmy and cause human disease. The purpose of this methodology is to simultaneously determine the mtDNA sequence and quantify the heteroplasmy level. The protocol includes two-fragment mitochondrial genome DNA PCR amplification. The PCR products are then mixed at an equimolar ratio. The samples are barcoded and sequenced with high-throughput next-generation sequencing technology. We found that this technology is highly sensitive, specific, and accurate in determining mtDNA mutations and the level of heteroplasmy. PMID:21975941

  11. Application of next generation sequencing technology in Mendelian movement disorders.

    PubMed

    Wang, Yumin; Pan, Xuya; Xue, Dan; Li, Yuwei; Zhang, Xueying; Kuang, Biao; Zheng, Jiabo; Deng, Hao; Li, Xiaoling; Xiong, Wei; Zeng, Zhaoyang; Li, Guiyuan

    2016-02-01

    Next generation sequencing (NGS) has developed very rapidly in the last decade. Compared with Sanger sequencing, NGS has the advantages of high sensitivity and high throughput. Movement disorders are a common type of neurological disease. Although traditional linkage analysis has become a standard method to identify the pathogenic genes in diseases, it is getting difficult to find new pathogenic genes in rare Mendelian disorders, such as movement disorders, due to a lack of appropriate families with high penetrance or enough affected individuals. Thus, NGS is an ideal approach to identify the causal alleles for inherited disorders. NGS is used to identify genes in several diseases and new mutant sites in Mendelian movement disorders. This article reviewed the recent progress in NGS and the use of NGS in Mendelian movement disorders from genome sequencing and transcriptome sequencing. A perspective on how NGS could be employed in rare Mendelian disorders is also provided. PMID:26932219

  12. Plant virology and next generation sequencing: experiences with a Potyvirus.

    PubMed

    Kehoe, Monica A; Coutts, Brenda A; Buirchell, Bevan J; Jones, Roger A C

    2014-01-01

    Next generation sequencing is quickly emerging as the go-to tool for plant virologists when sequencing whole virus genomes, and undertaking plant metagenomic studies for new virus discoveries. This study aims to compare the genomic and biological properties of Bean yellow mosaic virus (BYMV) (genus Potyvirus), isolates from Lupinus angustifolius plants with black pod syndrome (BPS), systemic necrosis or non-necrotic symptoms, and from two other plant species. When one Clover yellow vein virus (ClYVV) (genus Potyvirus) and 22 BYMV isolates were sequenced on the Illumina HiSeq2000, one new ClYVV and 23 new BYMV sequences were obtained. When the 23 new BYMV genomes were compared with 17 other BYMV genomes available on Genbank, phylogenetic analysis provided strong support for existence of nine phylogenetic groupings. Biological studies involving seven isolates of BYMV and one of ClYVV gave no symptoms or reactions that could be used to distinguish BYMV isolates from L. angustifolius plants with black pod syndrome from other isolates. Here, we propose that the current system of nomenclature based on biological properties be replaced by numbered groups (I-IX). This is because use of whole genomes revealed that the previous phylogenetic grouping system based on partial sequences of virus genomes and original isolation hosts was unsustainable. This study also demonstrated that, where next generation sequencing is used to obtain complete plant virus genomes, consideration needs to be given to issues regarding sample preparation, adequate levels of coverage across a genome and methods of assembly. It also provided important lessons that will be helpful to other plant virologists using next generation sequencing in the future. PMID:25102175

  13. Plant Virology and Next Generation Sequencing: Experiences with a Potyvirus

    PubMed Central

    Kehoe, Monica A.; Coutts, Brenda A.; Buirchell, Bevan J.; Jones, Roger A. C.

    2014-01-01

    Next generation sequencing is quickly emerging as the go-to tool for plant virologists when sequencing whole virus genomes, and undertaking plant metagenomic studies for new virus discoveries. This study aims to compare the genomic and biological properties of Bean yellow mosaic virus (BYMV) (genus Potyvirus), isolates from Lupinus angustifolius plants with black pod syndrome (BPS), systemic necrosis or non-necrotic symptoms, and from two other plant species. When one Clover yellow vein virus (ClYVV) (genus Potyvirus) and 22 BYMV isolates were sequenced on the Illumina HiSeq2000, one new ClYVV and 23 new BYMV sequences were obtained. When the 23 new BYMV genomes were compared with 17 other BYMV genomes available on Genbank, phylogenetic analysis provided strong support for existence of nine phylogenetic groupings. Biological studies involving seven isolates of BYMV and one of ClYVV gave no symptoms or reactions that could be used to distinguish BYMV isolates from L. angustifolius plants with black pod syndrome from other isolates. Here, we propose that the current system of nomenclature based on biological properties be replaced by numbered groups (I–IX). This is because use of whole genomes revealed that the previous phylogenetic grouping system based on partial sequences of virus genomes and original isolation hosts was unsustainable. This study also demonstrated that, where next generation sequencing is used to obtain complete plant virus genomes, consideration needs to be given to issues regarding sample preparation, adequate levels of coverage across a genome and methods of assembly. It also provided important lessons that will be helpful to other plant virologists using next generation sequencing in the future. PMID:25102175

  14. Modeling Pseudorandom Sequence Generators using Cellular Automata: The Alternating Step Generator

    NASA Astrophysics Data System (ADS)

    Pazo-Robles, María Eugenia; Fúster-Sabater, Amparo

    2007-12-01

    Stream ciphers are pseudorandom bit generators whose output sequences are combined with the sensitive information by means of a mathematical function, currently addition modulo 2. The Alternating Step Generator is a pseudorandom sequence generator with good cryptographic properties and a non-linear structure. In this work, we propose two different ways to model such a generator by using linear and discrete mathematical functions, e.g. Cellular Automata. One of these ways deals with the realization of a linear model from a pair of basic automata provided by the Catell and Muzio algorithm. The other way is a new approach based on the addition of automata, consisting in the realization of a new automaton with a non-primitive polynomial and short length. Both methods provide linear models able to generate the output sequence of the Alternating Step Generator.
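
    For readers unfamiliar with the generator being modelled, the following is a minimal sketch of a conventional Alternating Step Generator built from three linear feedback shift registers, together with the modulo-2 combination of keystream and message. It does not implement the cellular-automata models proposed in the paper. The `LFSR` class, the tap positions, and the seeds are illustrative assumptions and are not claimed to correspond to primitive polynomials.

```python
class LFSR:
    """Fibonacci-style linear feedback shift register over bits (illustrative)."""

    def __init__(self, taps, state):
        self.taps = taps          # 0-based positions XORed to form the feedback bit
        self.state = list(state)  # must not be all zeros
        self.out = 0              # last output bit; repeated while not clocked

    def clock(self):
        feedback = 0
        for t in self.taps:
            feedback ^= self.state[t]
        self.out = self.state.pop()     # bit shifted out becomes the output
        self.state.insert(0, feedback)  # feedback bit shifted in
        return self.out


def alternating_step_generator(control, r2, r3, n):
    """Produce n keystream bits.

    The control register is clocked every step. If it outputs 1, r2 is
    clocked; otherwise r3 is clocked. The register that is not clocked
    repeats its previous output. The keystream bit is the XOR of both outputs.
    """
    bits = []
    for _ in range(n):
        if control.clock():
            r2.clock()
        else:
            r3.clock()
        bits.append(r2.out ^ r3.out)
    return bits


def xor_bits(message, keystream):
    """Combine message and keystream by addition modulo 2 (bitwise XOR)."""
    return [m ^ k for m, k in zip(message, keystream)]
```

    Because addition modulo 2 is its own inverse, applying `xor_bits` a second time with the same keystream recovers the original message.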

  15. RAD in the realm of next-generation sequencing technologies.

    PubMed

    Rowe, H C; Renaut, S; Guggisberg, A

    2011-09-01

    The first North American RAD Sequencing and Genomics Symposium, sponsored by Floragenex (http://www.floragenex.com/radmeeting/), took place in Portland, Oregon (USA) on 19 April 2011. This symposium was convened to promote and discuss the use of restriction-site-associated DNA (RAD) sequencing technologies. RAD sequencing is one of several strategies recently developed to increase the power of data generated via short-read sequencing technologies by reducing their complexity (Baird et al. 2008; Huang et al. 2009; Andolfatto et al. 2011; Elshire et al. 2011). RAD sequencing, as a form of genotyping by sequencing, has been effectively applied in genetic mapping and quantitative trait loci (QTL) analyses in a range of organisms including nonmodel, genetically highly heterogeneous organisms (Table 1; Baird et al. 2008; Baxter et al. 2011; Chutimanitsakun et al. 2011; Pfender et al. 2011). RAD sequencing has recently found applications in phylogeography (Emerson et al. 2010) and population genomics (Hohenlohe et al. 2010). Considering the diversity of talks presented during this meeting, more developments are to be expected in the very near future. PMID:21991593

  16. HIVE-Hexagon: High-Performance, Parallelized Sequence Alignment for Next-Generation Sequencing Data Analysis

    PubMed Central

    Santana-Quintero, Luis; Dingerdissen, Hayley; Thierry-Mieg, Jean; Mazumder, Raja; Simonyan, Vahan

    2014-01-01

    Due to the size of Next-Generation Sequencing data, the computational challenge of sequence alignment has been vast. Inexact alignments can take up to 90% of total CPU time in bioinformatics pipelines. High-performance Integrated Virtual Environment (HIVE), a cloud-based environment optimized for storage and analysis of extra-large data, presents an algorithmic solution: the HIVE-hexagon DNA sequence aligner. HIVE-hexagon implements novel approaches to exploit both characteristics of sequence space and CPU, RAM and Input/Output (I/O) architecture to quickly compute accurate alignments. Key components of HIVE-hexagon include non-redundification and sorting of sequences; floating diagonals of linearized dynamic programming matrices; and consideration of cross-similarity to minimize computations. Availability https://hive.biochemistry.gwu.edu/hive/ PMID:24918764

  17. Automatic generation of primary sequence patterns from sets of related protein sequences.

    PubMed

    Smith, R F; Smith, T F

    1990-01-01

    We have developed a computer algorithm that can extract the pattern of conserved primary sequence elements common to all members of a homologous protein family. The method involves clustering the pairwise similarity scores among a set of related sequences to generate a binary dendrogram (tree). The tree is then reduced in a stepwise manner by progressively replacing the node connecting the two most similar termini by one common pattern until only a single common "root" pattern remains. A pattern is generated at a node by (i) performing a local optimal alignment on the sequence/pattern pair connected by the node with the use of an extended dynamic programming algorithm and then (ii) constructing a single common pattern from this alignment with a nested hierarchy of amino acid classes to identify the minimal inclusive amino acid class covering each paired set of elements in the alignment. Gaps within an alignment are created and/or extended using a "pay once" gap penalty rule, and gapped positions are converted into gap characters that function as 0 or 1 amino acid of any type during subsequent alignment. This method has been used to generate a library of covering patterns for homologous families in the National Biomedical Research Foundation/Protein Identification Resource protein sequence data base. We show that a covering pattern can be more diagnostic for sequence family membership than any of the individual sequences used to construct the pattern. PMID:2296575

  18. Identification of virus encoding microRNAs using 454 FLX sequencing platform.

    PubMed

    Kong, Byung-Whi

    2011-01-01

    MicroRNAs are a class of small noncoding RNA molecules that play a pivotal role in the regulation of gene expression at the posttranscriptional level. Most large double-stranded DNA viruses, mainly the herpesvirus family, are known to express miRNAs. Viral miRNAs can regulate both viral and cellular transcripts. By eliminating cloning steps for large numbers of Sanger sequencing reactions, the recent development of massively parallel next-generation sequencing methods has accelerated identification of small RNA species expressed from viruses, prokaryotes, and eukaryotes. The miRNAs expressed from infectious laryngotracheitis virus (ILTV), which is an alphaherpesvirus belonging to the Herpesviridae family and which causes an acute respiratory disorder in chickens, were identified by small RNA enrichment and the 454 FLX sequencing method. PMID:21431764

  19. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

    PubMed

    Bonatelli, Isabel A S; Carstens, Bryan C; Moraes, Evandro M

    2015-01-01

    Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms. PMID:26561396

  20. Histoimmunogenetics Markup Language 1.0: Reporting next generation sequencing-based HLA and KIR genotyping.

    PubMed

    Milius, Robert P; Heuer, Michael; Valiga, Daniel; Doroschak, Kathryn J; Kennedy, Caleb J; Bolon, Yung-Tsi; Schneider, Joel; Pollack, Jane; Kim, Hwa Ran; Cereb, Nezih; Hollenbach, Jill A; Mack, Steven J; Maiers, Martin

    2015-12-01

    We present an electronic format for exchanging data for HLA and KIR genotyping with extensions for next-generation sequencing (NGS). This format addresses NGS data exchange by refining the Histoimmunogenetics Markup Language (HML) to conform to the proposed Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) reporting guidelines (miring.immunogenomics.org). Our refinements of HML include two major additions. First, NGS is supported by new XML structures to capture additional NGS data and metadata required to produce a genotyping result, including analysis-dependent (dynamic) and method-dependent (static) components. A full genotype, consensus sequence, and the surrounding metadata are included directly, while the raw sequence reads and platform documentation are externally referenced. Second, genotype ambiguity is fully represented by integrating Genotype List Strings, which use a hierarchical set of delimiters to represent allele and genotype ambiguity in a complete and accurate fashion. HML also continues to enable the transmission of legacy methods (e.g. site-specific oligonucleotide, sequence-specific priming, and Sequence Based Typing (SBT)), adding features such as allowing multiple group-specific sequencing primers, and fully leveraging techniques that combine multiple methods to obtain a single result, such as SBT integrated with NGS. PMID:26319908

  1. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae)

    PubMed Central

    Bonatelli, Isabel A. S.; Carstens, Bryan C.; Moraes, Evandro M.

    2015-01-01

    Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms. PMID:26561396

  2. A Multi-Site Study Employing High Resolution HLA Genotyping by Next Generation Sequencing

    PubMed Central

    Holcomb, C. L.; Höglund, B.; Anderson, M. W.; Blake, L.A.; Böhme, I.; Egholm, M.; Ferriola, D.; Gabriel, C.; Gelber, S. E.; Goodridge, D.; Hawbecker, S.; Klein, R.; Ladner, M.; Lind, C.; Monos, D.; Pando, M. J.; Pröll, J.; Sayer, D. C.; Schmitz-Agheguian, G.; Simen, B. B.; Thiele, B.; Trachtenberg, E. A.; Tyan, D. B.; Wassmuth, R.; White, S.; Erlich, H. A.

    2014-01-01

    The high degree of polymorphism at HLA class I and class II loci makes high resolution HLA typing challenging. Current typing methods, including Sanger sequencing, yield ambiguous typing results due to incomplete genomic coverage and inability to set phase for HLA haplotype determination. The 454 Life Sciences GS FLX next generation sequencing system coupled with Conexio ATF software can provide very high resolution HLA genotyping. High throughput genotyping can be achieved by use of primers with multiplex identifier (MID) tags to allow pooling of the amplicons generated from different individuals prior to sequencing. We have conducted a double blind study in which eight laboratory sites performed amplicon sequencing using GS FLX standard chemistry and genotyped the same 20 samples for HLA-A, -B, -C, DPB1, DQA1, DQB1, DRB1, and DRB3, DRB4 and DRB5 (DRB3/4/5) in a single sequencing run. The average sequence read length was 250 base pairs (bp) and the average number of sequence reads per amplicon was 672, providing confidence in the allele assignments. Of the 1280 genotypes considered, assignment was possible in 95% of the cases. Failure to assign genotypes was the result of researcher procedural error or the presence of a novel allele rather than a failure of sequencing technology. Concordance with known genotypes, in cases where assignment was possible, ranged from 95.3% to 99.4% for the eight sites, with overall concordance of 97.2%. We conclude that clonal pyrosequencing using the GS FLX platform and Conexio ATF software allows reliable identification of HLA genotypes at high resolution. PMID:21299525

  3. Assessment of Metagenomic Assembly Using Simulated Next Generation Sequencing Data

    PubMed Central

    Sunagawa, Shinichi; Järvelin, Aino I.; Chan, Michelle M.; Arumugam, Manimozhiyan; Raes, Jeroen; Bork, Peer

    2012-01-01

    Due to the complexity of the protocols and a limited knowledge of the nature of microbial communities, simulating metagenomic sequences plays an important role in testing the performance of existing tools and data analysis methods with metagenomic data. We developed metagenomic read simulators with platform-specific (Sanger, pyrosequencing, Illumina) base-error models, and simulated metagenomes of differing community complexities. We first evaluated the effect of rigorous quality control on Illumina data. Although quality filtering removed a large proportion of the data, it greatly improved the accuracy and contig lengths of resulting assemblies. We then compared the quality-trimmed Illumina assemblies to those from Sanger and pyrosequencing. For the simple community (10 genomes) all sequencing technologies assembled a similar amount and accurately represented the expected functional composition. For the more complex community (100 genomes) Illumina produced the best assemblies and more correctly resembled the expected functional composition. For the most complex community (400 genomes) there was very little assembly of reads from any sequencing technology. However, due to the longer read length the Sanger reads still represented the overall functional composition reasonably well. We further examined the effect of scaffolding of contigs using paired-end Illumina reads. It dramatically increased contig lengths of the simple community and yielded minor improvements to the more complex communities. Although the increase in contig length was accompanied by increased chimericity, it resulted in more complete genes and a better characterization of the functional repertoire. The metagenomic simulators developed for this research are freely available. PMID:22384016

  4. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance

    PubMed Central

    Ahrenfeldt, Johanne; Cisneros, Jose Luis Bellod; Jurtz, Vanessa; Larsen, Mette Voldby; Hasman, Henrik; Aarestrup, Frank Møller; Lund, Ole

    2016-01-01

    Recent advances in whole genome sequencing have made the technology available for routine use in microbiological laboratories. However, a major obstacle for using this technology is the availability of simple and automatic bioinformatics tools. Based on previously published and already available web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the individual services. The reported results enable a rapid overview of the major results, and comparing that to the previously found results showed that the platform is reliable and able to correctly predict the species and find most of the expected genes automatically. In conclusion, a combined bioinformatics platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https://cge.cbs.dtu.dk/services/CGEpipeline-1.1 and it is the intention that it will continue to be expanded with new features as these become available. PMID:27327771

  5. Preliminary Sequence stratigraphy framework of the SW part of the Actopan Platform, Lower Cretaceous, Hidalgo, Mexico

    NASA Astrophysics Data System (ADS)

    Abascal, G.; Murillo-Muñeton, G.

    2013-05-01

    The oldest sedimentary rocks in what is known as the Actopan Platform, in the State of Hidalgo, Mexico, are superbly exposed toward the southwestern part of the platform. A detailed stratigraphic/sedimentologic study was carried out on a 623-m-thick section, with the aim of establishing a sequence stratigraphic framework. The base of the section consists of a Lower Cretaceous 6223-m thick, mixed siliciclastic-carbonate sedimentary succession that has been named Santuario Formation. The terrigenous facies of this unit correspond to red beds that consist of shales, sandstones and few conglomerates deposited under continental (fluvial) conditions. White and yellowish sandstones, possibly deposited by deltaic systems, occur in minor amounts. A tuff layer is found in its lower part. The carbonate facies of the Santuario Formation consist mainly of bioclast-peloid skeletal mudstones/wackestones and subordinate quantities of sandy dolostones, skeletal packstones/grainstones and rudist (requeniids) boundstones. The middle and upper parts of the studied stratigraphic section correspond to an essentially carbonate succession that is known as El Abra Formation. This unit is comprised of the following facies: skeletal mudstones/wackestones, skeletal packstones/grainstones, and minor rudist (requeniid and Chondrodonta) boundstones and cryptalgal laminites deposited in shallow subtidal lagoon to tidal flat conditions. At this location, a "Middle" Cretaceous age (Albian-Cenomanian) has been assigned to the El Abra Formation. However, the common presence of the benthic foraminifer Choffatella decipiens Schlumberger in these facies indicates that their age extends, at least, to the Lower Cretaceous (Barremian). This age was confirmed by dating zircons in the tuff deposited at the base of the section. The carbonate facies of the Santuario Formation stack forming fifth-order subtidal cycles or parasequences. While the carbonate facies of the El Abra Formation also stack

  6. Generating Researcher Networks with Identified Persons on a Semantic Service Platform

    NASA Astrophysics Data System (ADS)

    Jung, Hanmin; Lee, Mikyoung; Kim, Pyung; Lee, Seungwoo

    This paper describes a Semantic Web-based method to acquire researcher networks by means of an identification scheme, ontology, and reasoning. Three steps are required to realize it: resolving co-references, finding experts, and generating researcher networks. We adopt OntoFrame as an underlying semantic service platform and apply reasoning to make direct relations between far-off classes in the ontology schema. 453,124 Elsevier journal articles with metadata and full-text documents in information technology and biomedical domains have been loaded and served on the platform as a test set.

  7. Rapid evaluation and quality control of next generation sequencing data with FaQCs

    DOE PAGESBeta

    Lo, Chien-Chi; Chain, Patrick S. G.

    2014-12-01

    Background: Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Results: Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. Conclusion: FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.
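
    The idea of removing poor quality data from the 3' end of a read, where quality tends to decline, can be illustrated with a minimal sketch. This is not the FaQCs algorithm, which the abstract does not specify; it assumes Phred+33 encoded quality strings, and the function names and thresholds (trim_3prime, passes_filter, min_q, min_len, min_mean_q) are hypothetical.

```python
def phred33(qual):
    """Decode a Phred+33 quality string into integer scores (assumed encoding)."""
    return [ord(ch) - 33 for ch in qual]


def trim_3prime(seq, qual, min_q=20):
    """Drop trailing bases whose quality is below min_q (hypothetical rule)."""
    scores = phred33(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


def passes_filter(qual, min_len=4, min_mean_q=20):
    """Keep a read only if it is long enough and its mean quality is adequate."""
    scores = phred33(qual)
    return len(scores) >= min_len and sum(scores) / len(scores) >= min_mean_q


if __name__ == "__main__":
    seq, qual = trim_3prime("ACGTACGT", "IIIIII##")
    print(seq, qual, passes_filter(qual))
```

    A production tool would also handle adapters, ambiguous bases, paired reads and other quality encodings; the sketch only shows the trimming step itself.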

  8. Rapid evaluation and quality control of next generation sequencing data with FaQCs

    SciTech Connect

    Lo, Chien-Chi; Chain, Patrick S. G.

    2014-12-01

    Background: Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Results: Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. Conclusion: FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.

  9. Seismic stratigraphy of the western Florida carbonate platform above the Mid-Cretaceous sequence boundary (MCSB)

    SciTech Connect

    Jee, J.L. (Dept. of Geology)

    1993-03-01

    From the Apalachicola Basin (AB) to the Sarasota Arch, a web of multifold seismic and 29 wells were analyzed to determine Upper Cretaceous-Cenozoic stratigraphy. Concordant reflection geometries above and below the MCSB throughout most of the study area do not suggest prolonged subaerial exposure of the platform as some have considered. The configuration of the MCSB surface influenced the distribution of overlying sediment such that the section is thick in the basins and thin on the highs. The three main units recognized are Upper Cretaceous, Paleocene-Eocene, and post-Eocene. The Upper Cretaceous has two subunits, KU1 and KU2. KU1 corresponds in age to the Tuscaloosa-Eutaw lithostratigraphic units, has continuous, parallel seismic facies, and tends to thicken in depressions on the MCSB. KU2 is age-equivalent to part of the Selma Gp. Maastrichtian strata are locally thin to partly absent. In the AB, KU2 appears intensely faulted. Sonic velocities in KU2 show southeastward change to more carbonate rock across the Middle Ground Arch, where hummocky-to-contorted seismic facies and thickening on the structural high suggest constructional accumulation. In wells, Paleocene strata lie unconformably on the Upper Cretaceous. The Paleocene section is thin and not easy to resolve on seismic sections. In the AB, the lowermost Eocene sequence is a wedge that thickens dramatically to the west. In the eastern AB, younger Eocene sequences are stacked to form broad en echelon mounds. Post-Eocene strata in the AB are continuous, parallel and drape the upper Eocene surface. Along the southeastern, up-dip margin of the Tampa Embayment (TE), a belt of west-prograding clinoforms marks the Eocene shelf edge. Landward of this, a seismic marbled zone suggests dolomitic facies. In the post-Eocene section of the TE, Oligocene-Lower Miocene strata form successive sequences of progradational clinoforms that steepen as they impinge on the FL Escarpment.

  10. A Semiconductor Chip-Based Next Generation Sequencing Procedure for the Main Pulmonary Hypertension Genes.

    PubMed

    Gómez, Juan; Reguero, Julian R; Alvarez, Celso; Junquera, Manuel R; Arango, Ana; Morís, César; Coto, Eliecer

    2015-08-01

    The aim of this study was to characterize the mutational spectrum of pulmonary hypertension (PH) patients through a next generation sequencing platform. In a total of 22 patients, the BMPR2, SMAD9, CAV1, KCNK3, and EIF2AK4 genes were sequenced with semiconductor chips and the Ion Torrent Personal Genome Machine. We found six putative mutations in SMAD (p.R263Q), BMPR2 (p.S301P, p.T493I), CAV1 (p.V155I), and EIF2AK4 (p.L489P, p.P1115L) in five patients. One patient was compound heterozygous for BMPR2 + SMAD mutations, and one patient was homozygous for EIF2AK4 p.P1115L. The reported procedure would facilitate the rapid mutational screening of large cohorts of PH patients. PMID:25917481

  11. [Application of next-generation semiconductor sequencing technologies in genetic diagnosis of inherited cardiomyopathies].

    PubMed

    Yue, Zhao; Hong, Zhang; Xueshan, Xia

    2015-07-01

    Inherited cardiomyopathy is the most common hereditary cardiac disease. It also causes a significant proportion of sudden cardiac deaths in young adults and athletes. So far, approximately one hundred genes have been reported to be involved in cardiomyopathies through different mechanisms. Therefore, the identification of the genetic basis and disease mechanisms of cardiomyopathies is important for establishing a clinical diagnosis and genetic testing. The next-generation semiconductor sequencing (NGSS) technology platform is a high-throughput sequencer capable of analyzing clinically derived genomes with high productivity, sensitivity and specificity. It was launched in 2010 by Life Technologies (USA), and it is based on a high-density semiconductor chip covered with tens of thousands of wells. NGSS has been successfully used in candidate gene mutation screening to identify hereditary disease. In this review, we summarize these genetic variations, the challenges and applications of NGSS in inherited cardiomyopathy, and its value in disease diagnosis, prevention and treatment. PMID:26351163

  12. Modification of the Transplex WTA2 Amplification Product for Next Generation Sequencing

    PubMed Central

    Ward, B.; Fenoglio, D.; Heuermann, K.

    2011-01-01

    Transplex Whole Transcriptome Amplification (WTA2) exponentially amplifies RNA producing a double-stranded cDNA library while precisely maintaining differential levels of individual transcripts in test and reference samples. Though originally designed to amplify nanogram quantities of RNA, Transplex WTA2 has been shown to be exceedingly effective for amplification from damaged RNA template (FFPE and laser captured tissue samples) and single-cell input quantities (picograms). The efficacy of Transplex WTA2 amplification for downstream applications, primarily qPCR and expression microarray analysis, is well-documented. It follows that the utilization of next-generation sequencing for gene expression research and diagnostics would be well served by Transplex amplification of RNA isolated from samples of severely restricted quantity or quality. Strategies for the integration of Transplex WTA2 with next-generation sequencing are examined, with particular emphasis on elimination of the characteristic fixed primer sequence associated with each amplicon in the amplification library. Removal of these sites will allow direct entry of the resulting product into the sequencing workflow. Methods under consideration will enable the WTA2 amplicon to feed into the current sample prep protocols for the Illumina GA and GAII, SOLiD 5500/5500xl, and Roche-454 GS FLX/Junior platforms.

  13. Nanopore-based Fourth-generation DNA Sequencing Technology

    PubMed Central

    Feng, Yanxiao; Zhang, Yuechuan; Ying, Cuifeng; Wang, Deqiang; Du, Chunlei

    2015-01-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications. PMID:25743089

  14. Machine-Checked Sequencer for Critical Embedded Code Generator

    NASA Astrophysics Data System (ADS)

    Izerrouken, Nassima; Pantel, Marc; Thirioux, Xavier

    This paper presents the development of a correct-by-construction block sequencer for GeneAuto, a qualifiable (according to DO178B/ED12B recommendation) automatic code generator. It transforms Simulink models to MISRA C code for safety critical systems. Our approach, which combines a classical development process with formal specification and verification using proof-assistants, led to preliminary fruitful exchanges with certification authorities. We present parts of the classical user and tools requirements and derived formal specifications, implementation and verification for the correctness and termination of the block sequencer. This sequencer has been successfully applied to real-size industrial use cases from various transportation domain partners and led to requirement errors detection and a correct-by-construction implementation.

  15. Nanopore-based fourth-generation DNA sequencing technology.

    PubMed

    Feng, Yanxiao; Zhang, Yuechuan; Ying, Cuifeng; Wang, Deqiang; Du, Chunlei

    2015-02-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications. PMID:25743089

  16. Mapping Sensorimotor Sequences to Word Sequences: A Connectionist Model of Language Acquisition and Sentence Generation

    ERIC Educational Resources Information Center

    Takac, Martin; Benuskova, Lubica; Knott, Alistair

    2012-01-01

    In this article we present a neural network model of sentence generation. The network has both technical and conceptual innovations. Its main technical novelty is in its semantic representations: the messages which form the input to the network are structured as sequences, so that message elements are delivered to the network one at a time. Rather…

  17. Control for stochastic sampling variation and qualitative sequencing error in next generation sequencing

    PubMed Central

    Blomquist, Thomas; Crawford, Erin L.; Yeo, Jiyoun; Zhang, Xiaolu; Willey, James C.

    2015-01-01

    Background Clinical implementation of Next-Generation Sequencing (NGS) is challenged by poor control for stochastic sampling, library preparation biases and qualitative sequencing error. To address these challenges we developed and tested two hypotheses. Methods Hypothesis 1: Analytical variation in quantification is predicted by stochastic sampling effects at input of (a) amplifiable nucleic acid target molecules into the library preparation, (b) amplicons from library into sequencer, or (c) both. We derived equations using Monte Carlo simulation to predict assay coefficient of variation (CV) based on these three working models and tested them against NGS data from specimens with well characterized molecule inputs and sequence counts prepared using competitive multiplex-PCR amplicon-based NGS library preparation method comprising synthetic internal standards (IS). Hypothesis 2: Frequencies of technically-derived qualitative sequencing errors (i.e., base substitution, insertion and deletion) observed at each base position in each target native template (NT) are concordant with those observed in respective competitive synthetic IS present in the same reaction. We measured error frequencies at each base position within amplicons from each of 30 target NT, then tested whether they correspond to those within the 30 respective IS. Results For hypothesis 1, the Monte Carlo model derived from both sampling events best predicted CV and explained 74% of observed assay variance. For hypothesis 2, observed frequency and type of sequence variation at each base position within each IS was concordant with that observed in respective NTs (R² = 0.93). Conclusion In targeted NGS, synthetic competitive IS control for stochastic sampling at input of both target into library preparation and of target library product into sequencer, and control for qualitative errors generated during library preparation and sequencing. These controls enable accurate clinical diagnostic reporting of
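
    The two-stage sampling model in Hypothesis 1 (c) can be explored with a small Monte Carlo sketch. The equations derived in the paper are not given in the abstract, so this is only an illustration under stated assumptions: both sampling events are treated as Poisson, the closed form CV^2 = 1/mean_input + 1/mean_reads follows from that assumption alone, and all names (poisson, simulate_cv, predicted_cv) are hypothetical.

```python
import math
import random
import statistics


def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k = 0
    p = rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_cv(mean_input, mean_reads, trials=3000, seed=1):
    """Simulate molecules entering library prep, then reads entering the sequencer."""
    rng = random.Random(seed)
    counts = []
    for _ in range(trials):
        # Stage 1: amplifiable target molecules sampled into the library.
        molecules = poisson(mean_input, rng)
        # Stage 2: reads sampled in proportion to the stage-1 outcome.
        expected_reads = mean_reads * molecules / mean_input
        counts.append(poisson(expected_reads, rng))
    return statistics.pstdev(counts) / statistics.mean(counts)


def predicted_cv(mean_input, mean_reads):
    """Closed form for two independent Poisson sampling stages (assumed model)."""
    return math.sqrt(1.0 / mean_input + 1.0 / mean_reads)


if __name__ == "__main__":
    print(simulate_cv(50, 200), predicted_cv(50, 200))
```

    Under these assumptions the smaller of the two sampling depths dominates the variation, which is why low molecule input cannot be rescued by sequencing more deeply.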

  18. All-optical pseudorandom bit sequences generator based on TOADs

    NASA Astrophysics Data System (ADS)

    Sun, Zhenchao; Wang, Zhi; Wu, Chongqing; Wang, Fu; Li, Qiang

    2016-03-01

    A scheme for an all-optical pseudorandom bit sequence (PRBS) generator is demonstrated with an optical logic gate 'XNOR' and an all-optical wavelength converter based on cascaded Tera-Hertz Optical Asymmetric Demultiplexers (TOADs). Its feasibility is verified by generation of a return-to-zero on-off keying (RZ-OOK) 2^63-1 PRBS at the speed of 1 Gb/s with 10% duty ratio. The high randomness of the ultra-long cycle PRBS is validated by successfully passing the standard benchmark test.
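
    The logic behind an XNOR-feedback PRBS generator can be sketched in software with a linear feedback shift register. The abstract does not give the feedback taps of the demonstrated generator, so this sketch assumes the much shorter, widely used PRBS7 polynomial x^7 + x^6 + 1 purely for illustration; the function names are hypothetical and nothing here models the optical hardware.

```python
def step(state):
    """Advance a 7-bit shift register by one clock using XNOR feedback."""
    tap7 = (state >> 6) & 1
    tap6 = (state >> 5) & 1
    feedback = 1 ^ tap7 ^ tap6  # XNOR of the two tapped bits
    return ((state << 1) | feedback) & 0x7F


def prbs7_xnor(n_bits, state=0):
    """Return n_bits output bits, reading the most significant register bit."""
    bits = []
    for _ in range(n_bits):
        bits.append((state >> 6) & 1)
        state = step(state)
    return bits


def cycle_length(start=0):
    """Count clocks until the register returns to its starting state."""
    state = step(start)
    n = 1
    while state != start:
        state = step(state)
        n += 1
    return n


if __name__ == "__main__":
    print(cycle_length())          # number of states visited from the all-zero seed
    print(prbs7_xnor(16))
```

    With XNOR feedback the all-zero register is a valid seed and the all-ones register is the lock-up state, the reverse of the more familiar XOR arrangement.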

  19. New Generations: Sequencing Machines and Their Computational Challenges

    PubMed Central

    Schwartz, David C.; Waterman, Michael S.

    2011-01-01

    New generation sequencing systems are changing how molecular biology is practiced. The widely promoted $1000 genome will be a reality with attendant changes for healthcare, including personalized medicine. More broadly the genomes of many new organisms with large samplings from populations will be commonplace. What is less appreciated is the explosive demands on computation, both for CPU cycles and storage as well as the need for new computational methods. In this article we will survey some of these developments and demands. PMID:22121326

  20. Continuous flow generation of magnetoliposomes in a low-cost portable microfluidic platform.

    PubMed

    Conde, Alvaro J; Batalla, Milena; Cerda, Belén; Mykhaylyk, Olga; Plank, Christian; Podhajcer, Osvaldo; Cabaleiro, Juan M; Madrid, Rossana E; Policastro, Lucia

    2014-12-01

    We present a low-cost, portable microfluidic platform that uses laminated polymethylmethacrylate chips, peristaltic micropumps and LEGO® Mindstorms components for the generation of magnetoliposomes that does not require extrusion steps. Mixtures of lipids reconstituted in ethanol and an aqueous phase were injected independently in order to generate a combination of laminar flows in such a way that we could effectively achieve four hydrodynamic focused nanovesicle generation streams. Monodisperse magnetoliposomes with characteristics comparable to those obtained by traditional methods have been obtained. The magnetoliposomes are responsive to external magnetic field gradients, a result that suggests that the nanovesicles can be used in research and applications in nanomedicine. PMID:25257193

  1. Next generation sequencing in synovial sarcoma reveals novel gene mutations.

    PubMed

    Vlenterie, Myrella; Hillebrandt-Roeffen, Melissa H S; Flucke, Uta E; Groenen, Patricia J T A; Tops, Bastiaan B J; Kamping, Eveline J; Pfundt, Rolph; de Bruijn, Diederik R H; Geurts van Kessel, Ad H M; van Krieken, Han J H J M; van der Graaf, Winette T A; Versleijen-Jonkers, Yvonne M H

    2015-10-27

    Over 95% of all synovial sarcomas (SS) share a unique translocation, t(X;18), however, they show heterogeneous clinical behavior. We analyzed multiple SS to reveal additional genetic alterations besides the translocation. Twenty-six SS from 22 patients were sequenced for 409 cancer-related genes using the Comprehensive Cancer Panel (Life Technologies, USA) on an Ion Torrent platform. The detected variants were verified by Sanger sequencing and compared to matched normal DNAs. Copy number variation was assessed in six tumors using the Oncoscan array (Affymetrix, USA). In total, eight somatic mutations were detected in eight samples. These mutations have not been reported previously in SS. Two of these, in KRAS and CCND1, represent known oncogenic mutations in other malignancies. Additional mutations were detected in RNF213, SEPT9, KDR, CSMD3, MLH1 and ERBB4. DNA alterations occurred more often in adult tumors. A distinctive loss of 6q was found in a metastatic lesion progressing under pazopanib, but not in the responding lesion. Our results emphasize t(X;18) as a single initiating event in SS and as the main oncogenic driver. Our results also show the occurrence of additional genetic events, mutations or chromosomal aberrations, occurring more frequently in SS with an onset in adults. PMID:26415226

  2. Next generation sequencing in synovial sarcoma reveals novel gene mutations

    PubMed Central

    Vlenterie, Myrella; Hillebrandt-Roeffen, Melissa H.S.; Flucke, Uta E.; Groenen, Patricia J.T.A.; Tops, Bastiaan B.J.; Kamping, Eveline J.; Pfundt, Rolph; de Bruijn, Diederik R.H.; van Kessel, Ad H.M. Geurts; van Krieken, Han J.H.J.M.; van der Graaf, Winette T.A.; Versleijen-Jonkers, Yvonne M.H.

    2015-01-01

    Over 95% of all synovial sarcomas (SS) share a unique translocation, t(X;18), however, they show heterogeneous clinical behavior. We analyzed multiple SS to reveal additional genetic alterations besides the translocation. Twenty-six SS from 22 patients were sequenced for 409 cancer-related genes using the Comprehensive Cancer Panel (Life Technologies, USA) on an Ion Torrent platform. The detected variants were verified by Sanger sequencing and compared to matched normal DNAs. Copy number variation was assessed in six tumors using the Oncoscan array (Affymetrix, USA). In total, eight somatic mutations were detected in eight samples. These mutations have not been reported previously in SS. Two of these, in KRAS and CCND1, represent known oncogenic mutations in other malignancies. Additional mutations were detected in RNF213, SEPT9, KDR, CSMD3, MLH1 and ERBB4. DNA alterations occurred more often in adult tumors. A distinctive loss of 6q was found in a metastatic lesion progressing under pazopanib, but not in the responding lesion. Our results emphasize t(X;18) as a single initiating event in SS and as the main oncogenic driver. Our results also show the occurrence of additional genetic events, mutations or chromosomal aberrations, occurring more frequently in SS with an onset in adults. PMID:26415226

  3. Statistical Quantification of Methylation Levels by Next-Generation Sequencing

    PubMed Central

    Wu, Guodong; Yi, Nengjun; Absher, Devin; Zhi, Degui

    2011-01-01

    Background/Aims Recently, next-generation sequencing-based technologies have enabled DNA methylation profiling at high resolution and low cost. Methyl-Seq and Reduced Representation Bisulfite Sequencing (RRBS) are two such technologies that interrogate methylation levels at CpG sites throughout the entire human genome. With rapid reduction of sequencing costs, these technologies will enable epigenotyping of large cohorts for phenotypic association studies. Existing quantification methods for sequencing-based methylation profiling are simplistic and do not deal with the noise due to the random sampling nature of sequencing and various experimental artifacts. Therefore, there is a need to investigate the statistical issues related to the quantification of methylation levels for these emerging technologies, with the goal of developing an accurate quantification method. Methods In this paper, we propose two methods for Methyl-Seq quantification. The first method, the Maximum Likelihood estimate, is both conceptually intuitive and computationally simple. However, this estimate is biased at extreme methylation levels and does not provide variance estimation. The second method, based on Bayesian hierarchical model, allows variance estimation of methylation levels, and provides a flexible framework to adjust technical bias in the sequencing process. Results We compare the previously proposed binary method, the Maximum Likelihood (ML) method, and the Bayesian method. In both simulation and real data analysis of Methyl-Seq data, the Bayesian method offers the most accurate quantification. The ML method is slightly less accurate than the Bayesian method. But both our proposed methods outperform the original binary method in Methyl-Seq. In addition, we applied these quantification methods to simulation data and show that, with sequencing depth above 40–300 (which varies with different tissue samples) per cleavage site, Methyl-Seq offers a comparable quantification

  4. Analysis of Metagenomics Next Generation Sequence Data for Fungal ITS Barcoding: Do You Need Advance Bioinformatics Experience?

    PubMed Central

    Ahmed, Abdalla

    2016-01-01

    During the last few decades, most microbiology laboratories have become familiar with analyzing Sanger sequence data for ITS barcoding. However, with the availability of next-generation sequencing platforms in many centers, it has become important for medical mycologists to know how to make sense of the massive sequence data generated by these new sequencing technologies. In many reference laboratories, the analysis of such data is not a big deal, since suitable IT infrastructure and well-trained bioinformatics scientists are always available. However, in small research laboratories and clinical microbiology laboratories such resources are always lacking. In this report, a simple and user-friendly bioinformatics workflow is suggested for fast and reproducible ITS barcoding of fungi. PMID:27507959

  5. Incipiently drowned platform deposit in cyclic Ordovician shelf sequence: Lower Ordovician Chepultepec Formation, Virginia

    SciTech Connect

    Bova, J.A.; Read, J.F.

    1983-03-01

    The Chepultepec interval, 145 to 260 m (476 to 853 ft) thick, in Virginia contains the Lower Member up to 150 m (492 ft) thick, and the Upper Member, up to 85 m (279 ft) thick, of peritidal cyclic limestone and dolomite, and a Middle Member, up to 110 m (360 ft) thick, of subtidal limestone and bioherms, passing northwestward into cyclic facies. Calculated long term subsidence rates were 4 to 5 cm/1000 yr (mature passive margin rates), shelf gradients were 6 cm/km, and average duration of cycles was 140,00 years. Peritidal cyclic sequences are upward shallowing sequences of pellet-skeletal limestone, thrombolites, rippled calcisiltites and intraclast grainstone, and laminite caps. They formed by rapid transgression with apparent submergence increments averaging approximately 2 m (6.5 ft) in Lower Member and 3.5 m (11.4 ft), Upper Member. Deposition during Middle Member time was dominated by skeletal limestone-mudstone, calcisiltite with storm generated fining-upward sequences, and burrow-mixed units that were formed near fair-weather wave base, along with thrombolite bioherms. Locally, there are upward shallowing sequences, of basal wackestone/mudstone to calcisiltite to bioherm complexes (locally with erosional scalloped tops). Following each submergence, carbonate sedimentation was able to build to sea level prior to renewed submergence. Large submergence events caused tidal flats to be shifted far to the west, and they were unable to prograde out onto the open shelf because of insufficient time before subsidence was renewed, and because the open shelf setting inhibited tidal flat deposition. The Middle Member represents an incipiently drowned sequence that developed by repeated submergence events.

  6. Next-generation sequencing for diagnosis of rare diseases in the neonatal intensive care unit

    PubMed Central

    Daoud, Hussein; Luco, Stephanie M.; Li, Rui; Bareke, Eric; Beaulieu, Chandree; Jarinova, Olga; Carson, Nancy; Nikkel, Sarah M.; Graham, Gail E.; Richer, Julie; Armour, Christine; Bulman, Dennis E.; Chakraborty, Pranesh; Geraghty, Michael; Lines, Matthew A.; Lacaze-Masmonteil, Thierry; Majewski, Jacek; Boycott, Kym M.; Dyment, David A.

    2016-01-01

    Background: Rare diseases often present in the first days and weeks of life and may require complex management in the setting of a neonatal intensive care unit (NICU). Exhaustive consultations and traditional genetic or metabolic investigations are costly and often fail to arrive at a final diagnosis when no recognizable syndrome is suspected. For this pilot project, we assessed the feasibility of next-generation sequencing as a tool to improve the diagnosis of rare diseases in newborns in the NICU. Methods: We retrospectively identified and prospectively recruited newborns and infants admitted to the NICU of the Children’s Hospital of Eastern Ontario and the Ottawa Hospital, General Campus, who had been referred to the medical genetics or metabolics inpatient consult service and had features suggesting an underlying genetic or metabolic condition. DNA from the newborns and parents was enriched for a panel of clinically relevant genes and sequenced on a MiSeq sequencing platform (Illumina Inc.). The data were interpreted with a standard informatics pipeline and reported to care providers, who assessed the importance of genotype–phenotype correlations. Results: Of 20 newborns studied, 8 received a diagnosis on the basis of next-generation sequencing (diagnostic rate 40%). The diagnoses were renal tubular dysgenesis, SCN1A-related encephalopathy syndrome, myotubular myopathy, FTO deficiency syndrome, cranioectodermal dysplasia, congenital myasthenic syndrome, autosomal dominant intellectual disability syndrome type 7 and Denys–Drash syndrome. Interpretation: This pilot study highlighted the potential of next-generation sequencing to deliver molecular diagnoses rapidly with a high success rate. With broader use, this approach has the potential to alter health care delivery in the NICU. PMID:27241786

  7. Suppression Subtractive Hybridization Versus Next-Generation Sequencing in Plant Genetic Engineering: Challenges and Perspectives.

    PubMed

    Sahebi, Mahbod; Hanafi, Mohamed M; Azizi, Parisa; Hakim, Abdul; Ashkani, Sadegh; Abiri, Rambod

    2015-10-01

    Suppression subtractive hybridization (SSH) is an effective method to identify different genes with different expression levels involved in a variety of biological processes. This method has often been used to study molecular mechanisms of plants in complex relationships with different pathogens and a variety of biotic stresses. Compared to other techniques used in gene expression profiling, SSH needs relatively smaller amounts of the initial materials, with lower costs, and fewer false positives present within the results. Extraction of total RNA from plant species rich in phenolic compounds, carbohydrates, and polysaccharides that easily bind to nucleic acids through cellular mechanisms is difficult and needs to be considered. Remarkable advancement has been achieved in the next-generation sequencing (NGS) field. As a result of progress within fields related to molecular chemistry and biology as well as specialized engineering, parallelization in the sequencing reaction has exceptionally enhanced the overall read number of generated sequences per run. Currently available sequencing platforms support an earlier unparalleled view directly into complex mixes associated with RNA in addition to DNA samples. NGS technology has demonstrated the ability to sequence DNA with remarkable swiftness, therefore allowing previously unthinkable scientific accomplishments along with novel biological purposes. However, the massive amounts of data generated by NGS impose a substantial challenge with regard to data safe-keeping and analysis. This review examines some simple but vital points involved in preparing the initial material for SSH and introduces this method as well as its associated applications to detect different novel genes from different plant species. This review evaluates general concepts, basic applications, plus the probable results of NGS technology in genomics, with unique mention of feasible potential tools as well as bioinformatics. PMID:26271955

  8. Complete plastid genome sequence of Vaccinium macrocarpon: structure, gene content and rearrangements revealed by next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete plastid genome sequence of the American cranberry was reconstructed using next-generation sequencing data by in silico procedures. We used Roche 454 shotgun sequence data to isolate cranberry plastid-specific sequences of the cultivar ‘HyRed’ via homology comparisons with complete seque...

  9. Metre-scale cyclicity in Middle Eocene platform carbonates in northern Egypt: Implications for facies development and sequence stratigraphy

    NASA Astrophysics Data System (ADS)

    Tawfik, Mohamed; El-Sorogy, Abdelbaset; Moussa, Mahmoud

    2016-07-01

    The shallow-water carbonates of the Middle Eocene in northern Egypt represent a Tethyan reef-rimmed carbonate platform with bedded inner-platform facies. Based on extensive micro- and biofacies documentation, five lithofacies associations were defined and their respective depositional environments were interpreted. Investigated sections were subdivided into three third-order sequences, named S1, S2 and S3. Sequence S1 is interpreted to correspond to the Lutetian, S2 corresponds to the Late Lutetian and Early Bartonian, and S3 represents the Late Bartonian. Each of the three sequences was further subdivided into fourth-order cycle sets and fifth-order cycles. The complete hierarchy of cycles can be correlated along 190 km across the study area, highlighting a general "layer-cake" stratigraphic architecture. The documentation of the studied outcrops may contribute to a better regional understanding of the Middle Eocene formations in northern Egypt and to Tethyan pericratonic carbonate models in general.

  10. Humans cannot consciously generate random numbers sequences: Polemic study.

    PubMed

    Figurska, Małgorzata; Stańczyk, Maciej; Kulesza, Kamil

    2008-01-01

    It is widely believed that randomness exists in Nature. In fact, such an assumption underlies many scientific theories and is embedded in the foundations of quantum mechanics. Assuming that this hypothesis is valid, one can use natural phenomena, like radioactive decay, to generate random numbers. Today, computers are capable of generating so-called pseudorandom numbers. Such series of numbers are only seemingly random (bias in the randomness quality can be observed). The question of whether people can produce random numbers has been investigated by many scientists in recent years. The paper "Humans can consciously generate random numbers sequences..." published recently in Medical Hypotheses made claims that were in many ways contrary to the state of the art; it also stated far-reaching hypotheses. So, we decided to repeat the experiments reported, taking special care to follow proper laboratory procedures. Here, we present the results and discuss possible implications in computer and other sciences. PMID:17888582

  11. Next generation sequencing applications for breast cancer research

    PubMed Central

    PETRIC, ROXANA COJOCNEANU; POP, LAURA-ANCUTA; JURJ, ANCUTA; RADULY, LAJOS; DUMITRASCU, DAN; DRAGOS, NICOLAE; NEAGOE, IOANA BERINDAN

    2015-01-01

    For some time, cancer has not been thought of as a disease, but as a multifaceted, heterogeneous complex of genotypic and phenotypic manifestations leading to tumorigenesis. Due to recent technological progress, the outcome of cancer patients can be greatly improved by introducing into clinical practice the advantages brought about by the development of next generation sequencing techniques. Biomedical suppliers have come up with various applications which medical researchers can use to characterize a patient’s disease from a molecular and genetic point of view in order to provide caregivers with rapid and relevant information to guide them in choosing the most appropriate course of treatment, with maximum efficiency and minimal side effects. Breast cancer, whose incidence has risen dramatically, is a good candidate for these novel diagnostic and therapeutic approaches, particularly when referring to specific sequencing panels which are designed to detect germline or somatic mutations in genes that are involved in breast cancer tumorigenesis and progression. Benchtop next generation sequencing machines are becoming a more common presence in the clinical setting, empowering physicians to better treat their patients by offering early diagnosis alternatives and targeted remedies, and bringing medicine a step closer to achieving its ultimate goal, personalized therapy. PMID:26609257

  12. Next generation sequencing applications for breast cancer research.

    PubMed

    Petric, Roxana Cojocneanu; Pop, Laura-Ancuta; Jurj, Ancuta; Raduly, Lajos; Dumitrascu, Dan; Dragos, Nicolae; Neagoe, Ioana Berindan

    2015-01-01

    For some time, cancer has not been thought of as a disease, but as a multifaceted, heterogeneous complex of genotypic and phenotypic manifestations leading to tumorigenesis. Due to recent technological progress, the outcome of cancer patients can be greatly improved by introducing into clinical practice the advantages brought about by the development of next generation sequencing techniques. Biomedical suppliers have come up with various applications which medical researchers can use to characterize a patient's disease from a molecular and genetic point of view in order to provide caregivers with rapid and relevant information to guide them in choosing the most appropriate course of treatment, with maximum efficiency and minimal side effects. Breast cancer, whose incidence has risen dramatically, is a good candidate for these novel diagnostic and therapeutic approaches, particularly when referring to specific sequencing panels which are designed to detect germline or somatic mutations in genes that are involved in breast cancer tumorigenesis and progression. Benchtop next generation sequencing machines are becoming a more common presence in the clinical setting, empowering physicians to better treat their patients by offering early diagnosis alternatives and targeted remedies, and bringing medicine a step closer to achieving its ultimate goal, personalized therapy. PMID:26609257

  13. Unraveling genomic variation from next generation sequencing data

    PubMed Central

    2013-01-01

    Elucidating the content of a DNA sequence is critical to understanding and decoding the genetic information of any biological system more deeply. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of the areas shaped by the new technological advances are the evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAS), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, the analysis and interpretation of the vast amount of data produced is not an easy or trivial task and remains a great challenge in the field of bioinformatics. Therefore, efficient tools that cope with information overload, tackle the high complexity and provide meaningful visualizations to make knowledge extraction easier are essential. In this article, we briefly refer to the sequencing methodologies and the available equipment that serve these analyses, and we describe the data formats of the files they produce. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data, with emphasis on structural variation analysis and comparative genomics. We finally comment on their functionality, strengths and weaknesses, and we discuss how future applications could further develop in this field. PMID:23885890

  14. Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers.

    PubMed

    Myer, Phillip R; Kim, MinSeok; Freetly, Harvey C; Smith, Timothy P L

    2016-08-01

    Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplification primer selection, and read length, which can affect the apparent microbial community. In this study, we compared short read 16S rRNA variable regions, V1-V3, with that of near-full length 16S regions, V1-V8, using highly diverse steer rumen microbial communities, in order to examine the impact of technology selection on phylogenetic profiles. Short paired-end reads from the Illumina MiSeq platform were used to generate V1-V3 sequence, while long "circular consensus" reads from the Pacific Biosciences RSII instrument were used to generate V1-V8 data. The two platforms revealed similar microbial operational taxonomic units (OTUs), as well as similar species richness, Good's coverage, and Shannon diversity metrics. However, the V1-V8 amplified ruminal community resulted in significant increases in several orders of taxa, such as phyla Proteobacteria and Verrucomicrobia (P < 0.05). Taxonomic classification accuracy was also greater in the near full-length read. UniFrac distance matrices using jackknifed UPGMA clustering also noted differences between the communities. These data support the consensus that longer reads result in a finer phylogenetic resolution that may not be achieved by shorter 16S rRNA gene fragments. Our work on the cattle rumen bacterial community demonstrates that utilizing near full-length 16S reads may be useful in conducting a more thorough study, or for developing a niche-specific database to use in analyzing data from shorter read technologies when budgetary constraints preclude use of near-full length 16S sequencing. PMID:27282101
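
    The richness, Good's coverage, and Shannon diversity metrics named in this record can be illustrated with a minimal sketch. It uses the standard textbook definitions and a made-up vector of per-OTU read counts; it is not the authors' pipeline, and the function names are hypothetical.

```python
import math

def observed_richness(otu_counts):
    """Number of OTUs with at least one read."""
    return sum(1 for c in otu_counts if c > 0)

def goods_coverage(otu_counts):
    """Good's coverage: 1 - (number of singleton OTUs / total reads)."""
    total = sum(otu_counts)
    if total == 0:
        raise ValueError("no reads in sample")
    singletons = sum(1 for c in otu_counts if c == 1)
    return 1 - singletons / total

def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over non-empty OTUs."""
    total = sum(otu_counts)
    if total == 0:
        raise ValueError("no reads in sample")
    return -sum((c / total) * math.log(c / total) for c in otu_counts if c > 0)

# Hypothetical per-OTU read counts for one sample (illustration only).
counts = [50, 30, 15, 3, 1, 1]
print(observed_richness(counts), goods_coverage(counts), shannon_index(counts))
```

    Comparing these values between two platforms on the same sample is one simple way to see whether read length changes the apparent community, which is the kind of comparison the abstract describes.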

  15. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach

    PubMed Central

    Mammadov, Jafar; Ye, Liang; Soe, Khaing; Richey, Kimberly; Cruse, James; Zhuang, Meibao; Gao, Zhifang; Evans, Clive; Rounsley, Steve; Kumpatla, Siva P.

    2016-01-01

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterizing transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive, cost- and labor-effective alternative to traditional Southern blot analysis for molecular characterization. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions. PMID:26908260

  16. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach.

    PubMed

    Guttikonda, Satish K; Marri, Pradeep; Mammadov, Jafar; Ye, Liang; Soe, Khaing; Richey, Kimberly; Cruse, James; Zhuang, Meibao; Gao, Zhifang; Evans, Clive; Rounsley, Steve; Kumpatla, Siva P

    2016-01-01

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterizing transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive, cost- and labor-effective alternative to traditional Southern blot analysis for molecular characterization. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions. PMID:26908260

  17. Genetic markers, genotyping methods & next generation sequencing in Mycobacterium tuberculosis

    PubMed Central

    Desikan, Srinidhi; Narayanan, Sujatha

    2015-01-01

    Molecular epidemiology (ME) is one of the main areas in tuberculosis research and is widely used to study the transmission, epidemics and outbreaks of tubercle bacilli. It exploits the presence of various polymorphisms in the genome of the bacteria that can be widely used as genetic markers. Many DNA typing methods apply these genetic markers to differentiate various strains and to study the evolutionary relationships between them. The three widely used genotyping tools to differentiate Mycobacterium tuberculosis strains are IS6110 restriction fragment length polymorphism (RFLP), spacer oligotyping (Spoligotyping), and mycobacterial interspersed repeat units - variable number of tandem repeats (MIRU-VNTR). A new prospect for ME was introduced with the development of whole genome sequencing (WGS) and next generation sequencing (NGS) methods, in which the entire genome is sequenced; this not only helps in pointing out minute differences between the various sequences but also saves time and cost. NGS is also found to be useful in identifying single nucleotide polymorphisms (SNPs), in comparative genomics, and in studying various aspects of transmission dynamics. These techniques enable the identification of mycobacterial strains and also facilitate the study of their phylogenetic and evolutionary traits. PMID:26205019

  18. BING: biomedical informatics pipeline for Next Generation Sequencing.

    PubMed

    Kriseman, Jeffrey; Busick, Christopher; Szelinger, Szabolcs; Dinu, Valentin

    2010-06-01

    High throughput parallel genomic sequencing (Next Generation Sequencing, NGS) shifts the bottleneck in sequencing processes from experimental data production to computationally intensive informatics-based data analysis. This manuscript introduces a biomedical informatics pipeline (BING) for the analysis of NGS data that offers several novel computational approaches to (1) image alignment; (2) signal correlation, compensation, separation, and pixel-based cluster registration; (3) signal measurement and base calling; and (4) quality control and accuracy measurement. These approaches address many of the informatics challenges, including image processing, computational performance, and accuracy. These new algorithms are benchmarked against the Illumina Genome Analysis Pipeline. BING is one of the first software tools to perform pixel-based analysis of NGS data. When compared to the Illumina informatics tool, BING's pixel-based approach produces a significant increase in the number of sequence reads, while reducing the computational time per experiment and error rate (<2%). This approach has the potential of increasing the density and throughput of NGS technologies. PMID:19925883

  19. Metagenome of microorganisms associated with the toxic Cyanobacteria Microcystis aeruginosa analyzed using the 454 sequencing platform

    NASA Astrophysics Data System (ADS)

    Li, Nan; Zhang, Lei; Li, Fuchao; Wang, Yuezhu; Zhu, Yongqiang; Kang, Hui; Wang, Shengyue; Qin, Song

    2011-05-01

    In this study, the 454 pyrosequencing technology was used to analyze the DNA of the Microcystis aeruginosa symbiosis system from cyanobacterial algal blooms in Taihu Lake, China. We generated 183 228 reads with an average length of 248 bp. Running the 454 assembly algorithm over our sequences yielded 22 239 significant contigs. After excluding the M. aeruginosa sequences, we obtained 1 322 assembled contigs longer than 1 000 bp. Taxonomic analysis indicated that four kingdoms were represented in the community: Archaea ( n = 9; 0.01%), Bacteria ( n = 98 921; 99.6%), Eukaryota ( n = 373; 3.7%), and Viruses ( n = 18; 0.02%). The bacterial sequences were predominantly Alphaproteobacteria ( n = 41 805; 83.3%), Betaproteobacteria ( n = 5 254; 10.5%) and Gammaproteobacteria ( n = 1 180; 2.4%). Gene annotations and assignment of COG (clusters of orthologous groups) functional categories indicate that a large number of the predicted genes are involved in metabolic, genetic, and environmental information processes. Our results demonstrate the extraordinary diversity of a microbial community in an ectosymbiotic system and further establish the tremendous utility of pyrosequencing.
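
    The kingdom-level shares quoted in this record can be re-derived from the read counts. The sketch below assumes the denominator is simply the sum of the four kingdom counts, which the abstract does not state. Under that assumption Eukaryota come to roughly 0.4% of classified reads, so the 3.7% printed in the record may reflect a misplaced decimal point.

```python
# Read counts per kingdom as quoted in the abstract.
kingdom_reads = {
    "Archaea": 9,
    "Bacteria": 98921,
    "Eukaryota": 373,
    "Viruses": 18,
}

def percent_shares(counts):
    """Return each category's share of the total, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}

# Assumption: the four kingdoms together make up the whole denominator.
shares = percent_shares(kingdom_reads)
for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```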

  20. Integrated Next-Generation Sequencing and Avatar Mouse Models for Personalized Cancer Treatment

    PubMed Central

    Garralda, Elena; Paz, Keren; López-Casas, Pedro P.; Jones, Siân; Katz, Amanda; Kann, Lisa M.; López-Rios, Fernando; Sarno, Francesca; Al-Shahrour, Fátima; Vasquez, David; Bruckheimer, Elizabeth; Angiuoli, Samuel V.; Calles, Antonio; Diaz, Luis A.; Velculescu, Victor E.; Valencia, Alfonso; Sidransky, David; Hidalgo, Manuel

    2015-01-01

    Background: Current technology permits an unbiased massive analysis of somatic genetic alterations from tumor DNA as well as the generation of individualized mouse xenografts (Avatar models). This work aimed to evaluate our experience integrating these two strategies to personalize the treatment of patients with cancer. Methods: We performed whole-exome sequencing analysis of 25 patients with advanced solid tumors to identify putatively actionable tumor-specific genomic alterations. Avatar models were used as an in vivo platform to test proposed treatment strategies. Results: Successful exome sequencing analyses have been obtained for 23 patients. Tumor-specific mutations and copy-number variations were identified. All samples profiled contained relevant genomic alterations. Tumor was implanted to create an Avatar model from 14 patients and 10 succeeded. Occasionally, actionable alterations such as mutations in NF1, PI3KA, and DDR2 failed to provide any benefit when a targeted drug was tested in the Avatar and, accordingly, treatment of the patients with these drugs was not effective. To date, 13 patients have received a personalized treatment and 6 achieved durable partial remissions. Prior testing of candidate treatments in Avatar models correlated with clinical response and helped to select empirical treatments in some patients with no actionable mutations. Conclusion: The use of full genomic analysis for cancer care is encouraging but presents important challenges that will need to be solved for broad clinical application. Avatar models are a promising investigational platform for therapeutic decision making. While limitations still exist, this strategy should be further tested. PMID:24634382

  1. Computational characterisation of cancer molecular profiles derived using next generation sequencing

    PubMed Central

    Oleksiewicz, Urszula; Tomczak, Katarzyna; Woropaj, Jakub; Markowska, Monika; Stępniak, Piotr

    2015-01-01

    Our current understanding of cancer genetics is grounded in the principle that cancer arises from a clone that has accumulated the requisite somatically acquired genetic aberrations, leading to malignant transformation. It also results in aberrant gene and protein expression. Next generation sequencing (NGS) or deep sequencing platforms are being used to create large catalogues of changes in copy numbers, mutations, structural variations, gene fusions, gene expression, and other types of information for cancer patients. However, inferring different types of biological changes from the raw reads generated in sequencing experiments is algorithmically and computationally challenging. In this article, we outline common steps for the quality control and processing of NGS data. We highlight the importance of accurate and application-specific alignment of these reads and the methodological steps and challenges in obtaining different types of information. We comment on the importance of integrating these data and building infrastructure to analyse them. We also provide exhaustive lists of available software to obtain information and point the readers to articles comparing software for deeper insight in specialised areas. We hope that the article will guide readers in choosing the right tools for analysing oncogenomic datasets. PMID:25691827

  2. Improved timing sequence generator on the DIII-D tokamak

    NASA Astrophysics Data System (ADS)

    Colio, R. A.; Finkenthal, D. F.; Deterly, T. M.

    2011-10-01

    The DIII-D tokamak uses a central clock source and trigger system to synchronize plant operations and diagnostics. The system uses a bi-phase encoding technique to send both clock and trigger signals to remote receivers, and supports both pre-programmed sequences of triggers as well as event-driven triggers. A 1 MHz timebase is used and triggers are encoded as eight-bit hexadecimal words. Currently, the system relies on a cascaded series of CAMAC-based delay generators to produce the trigger sequence. We present a modern and more versatile implementation based on a single FPGA (field programmable gate array) capable of providing clock rates upward of 100 MHz while maintaining compatibility with existing equipment. A proposal for system clock synchronization with GPS for improved precision is also presented. Work supported in part by US DOE under DE-FC02-04ER54698 and the National Undergraduate Fellowship in Fusion Science and Engineering.
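
    Bi-phase encoding lets one line carry both clock and data because every bit cell contains a guaranteed level transition. The abstract does not say which bi-phase variant DIII-D uses, so the following is only a sketch of one common variant, biphase-mark coding, applied to an eight-bit trigger word; the function names and the example word are hypothetical.

```python
def biphase_mark_encode(word, bits=8, start_level=0):
    """Encode an integer as a list of half-bit line levels, MSB first.

    Biphase-mark rule: the level toggles at the start of every bit cell
    (this carries the clock); a 1 bit adds a second toggle mid-cell.
    """
    if not 0 <= word < (1 << bits):
        raise ValueError("word does not fit in the given number of bits")
    level = start_level
    halves = []
    for i in range(bits - 1, -1, -1):
        bit = (word >> i) & 1
        level ^= 1              # clock transition at the cell boundary
        halves.append(level)
        if bit:
            level ^= 1          # data transition in the middle of the cell
        halves.append(level)
    return halves

def biphase_mark_decode(halves, bits=8):
    """Recover the word: a cell whose two halves differ encodes a 1."""
    if len(halves) != 2 * bits:
        raise ValueError("unexpected number of half-bit samples")
    word = 0
    for i in range(bits):
        first, second = halves[2 * i], halves[2 * i + 1]
        word = (word << 1) | (1 if first != second else 0)
    return word

trigger = 0xA5  # hypothetical eight-bit trigger word
line = biphase_mark_encode(trigger)
print(line, hex(biphase_mark_decode(line)))
```

    A receiver can recover the timebase from the cell-boundary transitions and the trigger word from the mid-cell ones, which is why a single encoded stream can serve both purposes.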

  3. Second-generation environmental sequencing unmasks marine metazoan biodiversity

    PubMed Central

    Fonseca, Vera G.; Carvalho, Gary R.; Sung, Way; Johnson, Harriet F.; Power, Deborah M.; Neill, Simon P.; Packer, Margaret; Blaxter, Mark L.; Lambshead, P. John D.; Thomas, W. Kelley; Creer, Simon

    2010-01-01

    Biodiversity is of crucial importance for ecosystem functioning, sustainability and resilience, but the magnitude and organization of marine diversity at a range of spatial and taxonomic scales are undefined. In this paper, we use second-generation sequencing to unmask putatively diverse marine metazoan biodiversity in a Scottish temperate benthic ecosystem. We show that remarkable differences in diversity occurred at microgeographical scales and refute currently accepted ecological and taxonomic paradigms of meiofaunal identity, rank abundance and concomitant understanding of trophic dynamics. Richness estimates from the current benchmarked Operational Clustering of Taxonomic Units from Parallel UltraSequencing analyses are broadly aligned with those derived from morphological assessments. However, the slope of taxon rarefaction curves for many phyla remains incomplete, suggesting that the true alpha diversity is likely to exceed current perceptions. The approaches provide a rapid, objective and cost-effective taxonomic framework for exploring links between ecosystem structure and function of all hitherto intractable, but ecologically important, communities. PMID:20981026

  4. Perspectives of integrative cancer genomics in next generation sequencing era.

    PubMed

    Kwon, So Mee; Cho, Hyunwoo; Choi, Ji Hye; Jee, Byul A; Jo, Yuna; Woo, Hyun Goo

    2012-06-01

    The explosive development of genomics technologies including microarrays and next generation sequencing (NGS) has provided comprehensive maps of cancer genomes, including the expression of mRNAs and microRNAs, DNA copy numbers, sequence variations, and epigenetic changes. These genome-wide profiles of genetic aberrations could reveal candidates for diagnostic and/or prognostic biomarkers as well as mechanistic insights into tumor development and progression. Recent efforts to establish a huge cancer genome compendium and integrative omics analyses, so-called "integromics", have extended our understanding of the cancer genome, showing its daunting complexity and heterogeneity. However, the challenges of structured integration, sharing, and interpretation of big omics data remain to be resolved. Here, we review several issues raised in cancer omics data analysis, including NGS, focusing particularly on study design and analysis strategies. This may help readers understand the current trends and strategies of rapidly evolving cancer genomics research. PMID:23105932

  5. Tablet: Visualizing Next-Generation Sequence Assemblies and Mappings.

    PubMed

    Milne, Iain; Bayer, Micha; Stephen, Gordon; Cardle, Linda; Marshall, David

    2016-01-01

    This chapter is designed to be a practical guide to using Tablet for the visualization of next/second-generation (NGS) sequencing data. NGS data is being produced more frequently and in greater data volumes every year. As such, it is increasingly important to have tools which enable biologists and bioinformaticians to understand and gain key insights into their data. Visualization can play a key role in the exploration of such data as well as aid in the visual validation of sequence assemblies and features such as single nucleotide polymorphisms (SNPs). We aim to show several use cases which demonstrate Tablet's ability to visually highlight various situations of interest which can arise in NGS data. PMID:26519411

  6. Next-Generation Sequencing: Role in Gynecologic Cancers.

    PubMed

    Evans, Tarra; Matulonis, Ursula

    2016-09-01

    Next-generation sequencing (NGS) has risen to the forefront of tumor analysis and has enabled unprecedented advances in the molecular profiling of solid tumors. Through massively parallel sequencing, previously unrecognized genomic alterations have been unveiled in many malignancies, including gynecologic cancers, thus expanding the potential repertoire for the use of targeted therapies. NGS has expanded the understanding of the genomic foundation of gynecologic malignancies and has allowed identification of germline and somatic mutations associated with cancer development, enabled tumor reclassification, and helped determine mechanisms of treatment resistance. NGS has also facilitated rational therapeutic strategies based on actionable molecular aberrations. However, issues remain regarding cost and clinical utility. This review covers NGS analysis of gynecologic cancers, specifically ovarian, endometrial, cervical, and vulvar cancers, and its impact on them thus far. PMID:27587626

  7. Evaluation of next generation sequencing for the analysis of Eimeria communities in wildlife.

    PubMed

    Vermeulen, Elke T; Lott, Matthew J; Eldridge, Mark D B; Power, Michelle L

    2016-05-01

    Next-generation sequencing (NGS) techniques are well-established for studying bacterial communities but not yet for microbial eukaryotes. Parasite communities remain poorly studied, due in part to the lack of reliable and accessible molecular methods to analyse eukaryotic communities. We aimed to develop and evaluate a methodology to analyse communities of the protozoan parasite Eimeria from populations of the Australian marsupial Petrogale penicillata (brush-tailed rock-wallaby) using NGS. An oocyst purification method for small sample sizes and polymerase chain reaction (PCR) protocol for the 18S rRNA locus targeting Eimeria was developed and optimised prior to sequencing on the Illumina MiSeq platform. A data analysis approach was developed by modifying methods from bacterial metagenomics and utilising existing Eimeria sequences in GenBank. Operational taxonomic unit (OTU) assignment at a high similarity threshold (97%) was more accurate at assigning Eimeria contigs into Eimeria OTUs but at a lower threshold (95%) there was greater resolution between OTU consensus sequences. The assessment of two amplification PCR methods prior to Illumina MiSeq, single and nested PCR, determined that single PCR was more sensitive to Eimeria as more Eimeria OTUs were detected in single amplicons. We have developed a simple and cost-effective approach to a data analysis pipeline for community analysis of eukaryotic organisms using Eimeria communities as a model. The pipeline provides a basis for evaluation using other eukaryotic organisms and potential for diverse community analysis studies. PMID:26944624
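
    How the similarity threshold changes OTU assignment can be seen in a toy example. The sketch below is a generic greedy centroid clustering over equal-length, pre-aligned sequences with identity measured as the fraction of matching positions. It is an assumption-laden illustration, not the authors' pipeline, and the sequences are made up.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_otus(seqs, threshold):
    """Assign each sequence to the first centroid it matches at or above
    the threshold; otherwise it founds a new OTU and becomes its centroid."""
    centroids, clusters = [], []
    for s in seqs:
        for idx, c in enumerate(centroids):
            if identity(s, c) >= threshold:
                clusters[idx].append(s)
                break
        else:
            centroids.append(s)
            clusters.append([s])
    return clusters

# Hypothetical 100-bp reads: 0, 2 and 4 mismatches relative to the first.
reads = ["A" * 100, "CC" + "A" * 98, "CCCC" + "A" * 96]
print(len(greedy_otus(reads, 0.97)), len(greedy_otus(reads, 0.95)))
```

    With these toy reads the stricter 97% threshold splits off the most divergent read into its own OTU, while 95% merges all three, which is the basic trade-off behind choosing between the two thresholds.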

  8. SRAdb: query and use public next-generation sequencing data from within R

    PubMed Central

    2013-01-01

    Background: The Sequence Read Archive (SRA) is the largest public repository of sequencing data from the next generation of sequencing platforms including Illumina (Genome Analyzer, HiSeq, MiSeq, etc.), Roche 454 GS System, Applied Biosystems SOLiD System, Helicos Heliscope, PacBio RS, and others. Results: SRAdb is an attempt to make queries of the metadata associated with SRA submission, study, sample, experiment and run more robust and precise, and to make access to sequencing data in the SRA easier. We have parsed all the SRA metadata into a SQLite database that is routinely updated and can be easily distributed. The SRAdb R/Bioconductor package then utilizes this SQLite database for querying and accessing metadata. Full text search functionality makes querying metadata very flexible and powerful. Fastq files associated with query results can be downloaded easily for local analysis. The package also includes an interface from R to a popular genome browser, the Integrative Genomics Viewer. Conclusions: The SRAdb Bioconductor package provides a convenient and integrated framework to query and access SRA metadata quickly and powerfully from within R. PMID:23323543
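
    The idea of parsing run metadata into a SQLite database and querying it by keyword can be sketched as follows. SRAdb itself is an R/Bioconductor package; this Python example uses an in-memory database with a made-up table, made-up accessions and a simple `LIKE` search, so it does not reflect the real SRAdb schema or its full-text search.

```python
import sqlite3

# In-memory toy database; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE run_metadata ("
    "run_accession TEXT PRIMARY KEY, platform TEXT, study_title TEXT)"
)
conn.executemany(
    "INSERT INTO run_metadata VALUES (?, ?, ?)",
    [
        ("RUN0001", "ILLUMINA", "Rumen 16S amplicon survey"),
        ("RUN0002", "LS454", "Cranberry plastid genome"),
        ("RUN0003", "PACBIO_SMRT", "Bacterial genome assembly"),
    ],
)

def find_runs(conn, keyword, platform=None):
    """Return run accessions whose study title contains the keyword,
    optionally restricted to one platform. Parameters are bound, not
    interpolated, to avoid SQL injection."""
    sql = "SELECT run_accession FROM run_metadata WHERE study_title LIKE ?"
    params = [f"%{keyword}%"]
    if platform is not None:
        sql += " AND platform = ?"
        params.append(platform)
    sql += " ORDER BY run_accession"
    return [row[0] for row in conn.execute(sql, params)]

print(find_runs(conn, "genome"))
print(find_runs(conn, "genome", platform="LS454"))
```

    Shipping the metadata as a single SQLite file is what makes this kind of local, scriptable querying possible without repeated calls to a remote service.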

  9. Next-generation sequencing technology in clinical virology.

    PubMed

    Capobianchi, M R; Giombini, E; Rozera, G

    2013-01-01

    Recent advances in nucleic acid sequencing technologies, referred to as 'next-generation' sequencing (NGS), have produced a true revolution and opened new perspectives for research and diagnostic applications, owing to the high speed and throughput of data generation. So far, NGS has been applied to metagenomics-based strategies for the discovery of novel viruses and the characterization of viral communities. Additional applications include whole viral genome sequencing, detection of viral genome variability, and the study of viral dynamics. These applications are particularly suitable for viruses such as human immunodeficiency virus, hepatitis B virus, and hepatitis C virus, whose error-prone replication machinery, combined with the high replication rate, results, in each infected individual, in the formation of many genetically related viral variants referred to as quasi-species. The viral quasi-species, in turn, represents the substrate for the selective pressure exerted by the immune system or by antiviral drugs. With traditional approaches, it is difficult to detect and quantify minority genomes present in viral quasi-species that, in fact, may have biological and clinical relevance. NGS provides, for each patient, a dataset of clonal sequences that is some orders of magnitude larger than those obtained with conventional approaches. Hence, NGS is an extremely powerful tool with which to investigate previously inaccessible aspects of viral dynamics, such as the contribution of different viral reservoirs to replicating virus in the course of the natural history of the infection, co-receptor usage in minority viral populations harboured by different cell lineages, the dynamics of development of drug resistance, and the re-emergence of hidden genomes after treatment interruptions. The diagnostic application of NGS is just around the corner. PMID:23279287

  10. Using next generation transcriptome sequencing to predict an ectomycorrhizal metabolome

    PubMed Central

    2011-01-01

    Background: Mycorrhizae, symbiotic interactions between soil fungi and tree roots, are ubiquitous in terrestrial ecosystems. The fungi contribute phosphorus, nitrogen and mobilized nutrients from organic matter in the soil and in return the fungus receives photosynthetically-derived carbohydrates. This union of plant and fungal metabolisms is the mycorrhizal metabolome. Understanding this symbiotic relationship at a molecular level provides important contributions to the understanding of forest ecosystems and global carbon cycling. Results: We generated next generation short-read transcriptomic sequencing data from fully-formed ectomycorrhizae between Laccaria bicolor and aspen (Populus tremuloides) roots. The transcriptomic data was used to identify statistically significantly expressed gene models using a bootstrap-style approach, and these expressed genes were mapped to specific metabolic pathways. Integration of expressed genes that code for metabolic enzymes and the set of expressed membrane transporters generates a predictive model of the ectomycorrhizal metabolome. The generated model of mycorrhizal metabolome predicts that the specific compounds glycine, glutamate, and allantoin are synthesized by L. bicolor and that these compounds or their metabolites may be used for the benefit of aspen in exchange for the photosynthetically-derived sugars fructose and glucose. Conclusions: The analysis illustrates an approach to generate testable biological hypotheses to investigate the complex molecular interactions that drive ectomycorrhizal symbiosis. These models are consistent with experimental environmental data and provide insight into the molecular exchange processes for organisms in this complex ecosystem. The method used here for predicting metabolomic models of mycorrhizal systems from deep RNA sequencing data can be generalized and is broadly applicable to transcriptomic data derived from complex systems. PMID:21569493

  11. Using next generation transcriptome sequencing to predict an ectomycorrhizal metabolome.

    SciTech Connect

    Larsen, P. E.; Sreedasyam, A.; Trivedi, G.; Podila, G. K.; Cseke, L. J.; Collart, F. R.

    2011-05-13

    Mycorrhizae, symbiotic interactions between soil fungi and tree roots, are ubiquitous in terrestrial ecosystems. The fungi contribute phosphorus, nitrogen and mobilized nutrients from organic matter in the soil and in return the fungus receives photosynthetically-derived carbohydrates. This union of plant and fungal metabolisms is the mycorrhizal metabolome. Understanding this symbiotic relationship at a molecular level provides important contributions to the understanding of forest ecosystems and global carbon cycling. We generated next generation short-read transcriptomic sequencing data from fully-formed ectomycorrhizae between Laccaria bicolor and aspen (Populus tremuloides) roots. The transcriptomic data was used to identify statistically significantly expressed gene models using a bootstrap-style approach, and these expressed genes were mapped to specific metabolic pathways. Integration of expressed genes that code for metabolic enzymes and the set of expressed membrane transporters generates a predictive model of the ectomycorrhizal metabolome. The generated model of mycorrhizal metabolome predicts that the specific compounds glycine, glutamate, and allantoin are synthesized by L. bicolor and that these compounds or their metabolites may be used for the benefit of aspen in exchange for the photosynthetically-derived sugars fructose and glucose. The analysis illustrates an approach to generate testable biological hypotheses to investigate the complex molecular interactions that drive ectomycorrhizal symbiosis. These models are consistent with experimental environmental data and provide insight into the molecular exchange processes for organisms in this complex ecosystem. The method used here for predicting metabolomic models of mycorrhizal systems from deep RNA sequencing data can be generalized and is broadly applicable to transcriptomic data derived from complex systems.

  12. Generation and functional assessment of 3D multicellular spheroids in droplet based microfluidics platform.

    PubMed

    Sabhachandani, P; Motwani, V; Cohen, N; Sarkar, S; Torchilin, V; Konry, T

    2016-02-01

    Here we describe a robust microfluidic technique to generate and analyze 3D tumor spheroids, which resemble the tumor microenvironment and can be used as a more effective preclinical drug testing and screening model. Monodisperse cell-laden alginate droplets were generated in polydimethylsiloxane (PDMS) microfluidic devices that combine T-junction droplet generation and external gelation for spheroid formation. The proposed approach has the capability to incorporate multiple cell types. For the purposes of our study, we generated spheroids with breast cancer cell lines (MCF-7 drug sensitive and resistant) and co-culture spheroids of MCF-7 together with a fibroblast cell line (HS-5). The device has the capability to house 1000 spheroids on chip for drug screening and other functional analysis. Cellular viability of spheroids in the array part of the device was maintained for two weeks by continuous perfusion of complete media into the device. The functional performance of our 3D tumor models and the dose-dependent response to a standard chemotherapeutic drug, doxorubicin (Dox), and to the standard drug combination of Dox and paclitaxel (PCT) were analyzed on our chip-based platform. Altogether, our work provides a simple and novel in vitro platform to generate, image and analyze uniform, 3D monodisperse alginate hydrogel tumors for various omic studies and therapeutic efficiency screening, an important translational step before in vivo studies. PMID:26686985

  13. Optimization of next-generation sequencing transcriptome annotation for species lacking sequenced genomes.

    PubMed

    Ockendon, Nina F; O'Connell, Lauren A; Bush, Stephen J; Monzón-Sandoval, Jimena; Barnes, Holly; Székely, Tamás; Hofmann, Hans A; Dorus, Steve; Urrutia, Araxi O

    2016-03-01

    Next-generation sequencing methods, such as RNA-seq, have permitted the exploration of gene expression in a range of organisms which have been studied in ecological contexts but lack a sequenced genome. However, the efficacy and accuracy of RNA-seq annotation methods using reference genomes from related species have yet to be robustly characterized. Here we conduct a comprehensive power analysis employing RNA-seq data from Drosophila melanogaster in conjunction with 11 additional genomes from related Drosophila species to compare annotation methods and quantify the impact of evolutionary divergence between transcriptome and the reference genome. Our analyses demonstrate that, regardless of the level of sequence divergence, direct genome mapping (DGM), where transcript short reads are aligned directly to the reference genome, significantly outperforms the widely used de novo and guided assembly-based methods in both the quantity and accuracy of gene detection. Our analysis also reveals that DGM recovers a more representative profile of Gene Ontology functional categories, which are often used to interpret emergent patterns in genomewide expression analyses. Lastly, analysis of available primate RNA-seq data demonstrates the applicability of our observations across diverse taxa. Our quantification of annotation accuracy and reduced gene detection associated with sequence divergence thus provides empirically derived guidelines for the design of future gene expression studies in species without sequenced genomes. PMID:26358618

  14. Controls on facies and sequence stratigraphy of an upper Miocene carbonate ramp and platform, Melilla basin, NE Morocco

    USGS Publications Warehouse

    Cunningham, K.J.; Collins, Luke S.

    2002-01-01

    Upwelling of cool seawater, paleoceanographic circulation, paleoclimate, local tectonics and relative sea-level change controlled the lithofacies and sequence stratigraphy of a carbonate ramp and overlying platform that are part of a temporally well constrained carbonate complex in the Melilla basin, northeastern Morocco. At Melilla, from oldest to youngest, a third-order depositional sequence within the carbonate complex contains (1) a retrogradational, transgressive, warm temperate-type rhodalgal ramp; (2) an early highstand, progradational, bioclastic platform composed mainly of a temperate-type, bivalve-rich molechfor facies; and (3) late highstand, progradational to downstepping, subtropical/tropical-type chlorozoan fringing Porites reefs. The change from rhodalgal ramp to molechfor platform occurred at 7.0 ± 0.14 Ma near the Tortonian/Messinian boundary. During a late stage in the development of the bioclastic platform a transition from temperate-type molechfor facies to subtropical/tropical-type chlorozoan facies occurred and is bracketed by chron 3An.2n (~6.3-6.6 Ma). Comparison to a well-dated carbonate complex in southeastern Spain at Cabo de Gata suggests that upwelling of cool seawater influenced production of temperate-type limestone within the ramp and platform at Melilla during postulated late Tortonian-early Messinian subtropical/tropical paleoclimatic conditions in the western Paleo-Mediterranean region. The upwelling of cool seawater across the bioclastic platform at Melilla could be related to the beginning of 'siphoning' of deep, cold Atlantic waters into the Paleo-Mediterranean Sea at 7.17 Ma. The facies change within the bioclastic platform from molechfor to chlorozoan facies may be coincident with a reduction of the siphoning of Atlantic waters and the end of upwelling at Melilla during chron 3An.2n. The ramp contains one retrogradational parasequence and the bioclastic platform three progradational parasequences.
Minor erosional surfaces

  15. NGS-Trex: Next Generation Sequencing Transcriptome profile explorer

    PubMed Central

    2013-01-01

    Background: Next-Generation Sequencing (NGS) technology has exceptionally increased the ability to sequence DNA in a massively parallel and cost-effective manner. Nevertheless, NGS data analysis requires bioinformatics skills and computational resources well beyond the means of many "wet biology" laboratories. Moreover, most projects require only a few sequencing cycles and standard tools or workflows to carry out suitable analyses for the identification and annotation of genes, transcripts and splice variants found in the biological samples under investigation. These projects can benefit from the availability of easy-to-use systems that automatically analyse sequences and mine data without requiring a strong bioinformatics background or hardware infrastructure. Results: To address this issue we developed an automatic system targeted at the analysis of NGS data obtained from large-scale transcriptome studies. This system, which we named NGS-Trex (NGS Transcriptome profile explorer), is available through a simple web interface (http://www.ngs-trex.org) and allows the user to upload raw sequences and easily obtain an accurate characterization of the transcriptome profile after setting the few parameters required to tune the analysis procedure. The system is also able to assess differential expression at both the gene and transcript level (i.e. splicing isoforms) by comparing the expression profiles of different samples. By using simple query forms the user can obtain lists of genes, transcripts and splice sites ranked and filtered according to several criteria. Data can be viewed as tables, text files or through a simple genome browser which helps the visual inspection of the data. Conclusions: NGS-Trex is a simple tool for RNA-Seq data analysis mainly targeted at "wet biology" researchers with limited bioinformatics skills. It offers simple data mining tools to explore transcriptome profiles of samples investigated taking advantage of NGS technologies

  16. Translating next generation sequencing to practice: opportunities and necessary steps.

    PubMed

    Kamalakaran, Sitharthan; Varadan, Vinay; Janevski, Angel; Banerjee, Nilanjana; Tuck, David; McCombie, W Richard; Dimitrova, Nevenka; Harris, Lyndsay N

    2013-08-01

    Next-generation sequencing (NGS) approaches for measuring RNA and DNA benefit from greatly increased sensitivity, dynamic range and detection of novel transcripts. These technologies are rapidly becoming the standard for molecular assays and represent huge potential value to the practice of oncology. However, many challenges exist in the transition of these technologies from research application to clinical practice. This review discusses the value of NGS in detecting mutations, copy number changes and RNA quantification and their applications in oncology, the challenges for adoption and the relevant steps that are needed for translating this potential to routine practice. PMID:23769412

  17. Clinical integration of next generation sequencing: a policy analysis.

    PubMed

    Kaufman, David; Curnutte, Margaret; McGuire, Amy L

    2014-01-01

    Clinical next generation sequencing (NGS) technologies are challenging existing regulatory paradigms. We advocate a coordinated policy approach, which first requires a comprehensive understanding of the existing regulatory and legal structures. This paper introduces four key policy domains - including quality assurance, insurance coverage, intellectual property management, and data sharing - that must be addressed to ensure high quality clinical NGS. In bringing these policy issues into conversation through this special issue for the Journal of Law, Medicine & Ethics, we hope to lay the foundation for further discussion by a range of stakeholder groups with diverse and strong interests in the governance of NGS. PMID:25298287

  18. SNP Discovery in the Transcriptome of White Pacific Shrimp Litopenaeus vannamei by Next Generation Sequencing

    PubMed Central

    Yu, Yang; Wei, Jiankai; Zhang, Xiaojun; Liu, Jingwen; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai

    2014-01-01

    The application of next generation sequencing technology has greatly facilitated high throughput single nucleotide polymorphism (SNP) discovery and genotyping in genetic research. In the present study, SNPs were discovered based on two transcriptomes of Litopenaeus vannamei (L. vannamei) generated on the Illumina HiSeq 2000 sequencing platform. One transcriptome of L. vannamei was obtained by sequencing RNA from larvae at the mysis stage, and its reference sequence was de novo assembled. The data from another transcriptome were downloaded from NCBI and the reads of the two transcriptomes were mapped separately to the assembled reference by BWA. SNP calling was performed using SAMtools. A total of 58,717 and 36,277 SNPs with high quality were predicted from the two transcriptomes, respectively. SNP calling was also performed using the reads of the two transcriptomes together, and a total of 96,040 SNPs with high quality were predicted. Among these 96,040 SNPs, 5,242 and 29,129 were predicted as non-synonymous and synonymous SNPs, respectively. Characterization analysis of the predicted SNPs in L. vannamei showed that the estimated SNP frequency was 0.21% (one SNP per 476 bp) and the estimated ratio of transitions to transversions was 2.0. Fifty SNPs were randomly selected for validation by Sanger sequencing after PCR amplification and 76% of SNPs were confirmed, which indicated that the SNPs predicted in this study were reliable. These SNPs will be very useful for genetic study in L. vannamei, especially for high density linkage map construction and genome-wide association studies. PMID:24498047
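    The summary statistics quoted in this abstract can be reproduced with simple arithmetic. The following is a minimal sketch, not the authors' pipeline: the function names and the toy list of reference/alternate base pairs are hypothetical and chosen only to illustrate how a SNP frequency, a transition/transversion ratio and a validation count are derived.

```python
# Illustrative helpers only; names and toy data are assumptions, not from the study.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def snp_frequency_percent(bp_per_snp):
    """Convert an average spacing (bp per SNP) into a frequency in percent."""
    return 100.0 / bp_per_snp


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine changes (ref != alt assumed)."""
    pair = {ref.upper(), alt.upper()}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps):
    """Ratio of transitions to transversions over (ref, alt) base pairs."""
    ts = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ts
    if tv == 0:
        raise ValueError("no transversions in input; ratio is undefined")
    return ts / tv


# One SNP per 476 bp corresponds to roughly 0.21%.
frequency = round(snp_frequency_percent(476), 2)

# Hypothetical toy calls: four transitions and two transversions.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"), ("A", "C"), ("G", "T")]
ratio = ts_tv_ratio(toy_snps)

# 76% of the 50 SNPs tested by Sanger sequencing.
confirmed = round(50 * 0.76)

print(frequency, ratio, confirmed)
```

    With these inputs, 76% of 50 tested SNPs corresponds to 38 confirmed SNPs, and one SNP per 476 bp rounds to 0.21%.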

  19. Microfluidic platform for on-demand generation of spatially indexed combinatorial droplets

    PubMed Central

    Zec, Helena; Rane, Tushar D.; Wang, Tza-Huei

    2012-01-01

    We propose a highly versatile and programmable nanolitre droplet-based platform that accepts an unlimited number of sample plugs from a multi-well plate, performs digitization of these sample plugs into smaller daughter droplets and subsequent synchronization-free, robust injection of multiple reagents into the sample daughter droplets on-demand. This platform combines excellent control of valve-based microfluidics with the high-throughput capability of droplet microfluidics. We demonstrate the functioning of a proof-of-concept device which generates combinatorial mixture droplets from a linear array of sample plugs and four different reagents, using food dyes to mimic samples and reagents. Generation of a one dimensional array of the combinatorial mixture droplets on the device leads to automatic spatial indexing of these droplets, precluding the need to include a barcode in each droplet to identify its contents. We expect this platform to further expand the range of applications of droplet microfluidics to include applications requiring a high degree of multiplexing as well as high throughput analysis of multiple samples. PMID:22810353

  20. Estimates of acoustic noise generated by supply vessels working with oil-drilling platforms

    NASA Astrophysics Data System (ADS)

    Rutenko, A. N.; Ushchipovskii, V. G.

    2015-09-01

    The paper presents results on spatial measurements of acoustic noise generated by two types of tugs during their movement near the Molikpaq platform and in a dynamic positioning mode during operation with the PA-B platform. Based on the results of these measurements with the aid of simulation and preliminary research of the loss function conducted on acoustic profiles spanning from the platforms to the nearshore Piltun gray whale summer-fall feeding area, the spectra of equivalent point sources are constructed, which make it possible to construct the 1/3-octave spectra of anthropogenic noise at any point of the western profile and estimate the value of their level in a given frequency band with an accuracy of up to 2 dB. Field measurements have shown that in the dynamic positioning mode, the tugs generate 10 dB more noise than during movement; in fact, a diesel electric tug in both modes produced approximately 5 dB less noise than a diesel tug.
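    The level differences quoted above (10 dB and 5 dB) can be translated into linear ratios. The sketch below assumes the standard decibel definitions (a factor of 10^(ΔL/10) for power-like quantities and 10^(ΔL/20) for pressure amplitude); the function names are hypothetical and nothing here reproduces the authors' measurement or modelling procedure.

```python
def power_ratio(delta_db):
    """Linear ratio of acoustic power (or intensity) for a level difference in dB."""
    return 10 ** (delta_db / 10)


def pressure_ratio(delta_db):
    """Linear ratio of sound-pressure amplitude for a level difference in dB."""
    return 10 ** (delta_db / 20)


# 10 dB more noise in dynamic positioning mode than during movement.
dp_vs_transit_power = power_ratio(10)

# Approximately 5 dB between the diesel and the diesel-electric tug.
diesel_vs_electric_power = power_ratio(5)

print(round(dp_vs_transit_power, 2), round(diesel_vs_electric_power, 2))
```

    Under these definitions, a 10 dB increase is a tenfold increase in acoustic power, and a 5 dB difference is a factor of about 3.16.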

  1. The complete mitogenome of Cylindrus obtusus (Helicidae, Ariantinae) using Illumina next generation sequencing

    PubMed Central

    2012-01-01

    Background: This study describes how the complete mitogenome of a terrestrial snail, Cylindrus obtusus (Draparnaud, 1805), was sequenced without PCRs from a collection specimen that had been in 70% ethanol for 8 years. The mitogenome was obtained with Illumina GAIIx shotgun sequencing. Although the specimen used was collected relatively recently and kept in a DNA-friendly preservative (not formalin as frequently used with old museum specimens), we believe that the exclusion of PCRs as facilitated by NGS (Next Generation Sequencing) removes a great obstacle in DNA sequencing of collection specimens. A brief comparison is made between our Illumina GAIIx approach and a similar study that made use of the Roche 454-FLX platform. Results: The mtDNA sequence of C. obtusus is 14,610 bases in length (about 0.5 kb larger than other stylommatophoran mitogenomes reported hitherto) and contains the 37 genes (13 protein coding genes, two rRNAs and 22 tRNAs) typical for metazoans. Except for a swap between the position of tRNA-Pro and tRNA-Ala, the gene arrangement of C. obtusus is identical to that reported for Cepaea nemoralis. The 'aberrant' rearrangement of tRNA-Thr and COIII compared to that of other Sigmurethra (and the majority of gastropods), is not unique for C. nemoralis (subfamily Helicinae), but is also shown to occur in C. obtusus (subfamily Ariantinae) and might be a synapomorphy for the family Helicidae. Conclusions: Natural history collections potentially harbor a wealth of information for the field of evolutionary genetics, but it can be difficult to amplify DNA from such specimens (due to DNA degradation for instance). Because NGS techniques do not rely on primer-directed amplification (PCR) and allow DNA to be fragmented (DNA gets sheared during library preparation), NGS could be a valuable tool for retrieving DNA sequence data from such specimens.
A comparison between Illumina GAIIx and the Roche 454 platform suggests that the former might be more suited for de

  2. Small RNAs in angiosperms: sequence characteristics, distribution and generation.

    PubMed

    Chen, Dijun; Meng, Yijun; Ma, Xiaoxia; Mao, Chuanzao; Bai, Youhuang; Cao, Junjie; Gu, Haibin; Wu, Ping; Chen, Ming

    2010-06-01

    High-throughput sequencing (HTS) has opened up a new era for small RNA (sRNA) exploration. Using HTS data for a global survey of sRNAs in 26 angiosperms, elevated GC contents were detected in the monocots, whereas the 5'-terminal compositions were quite uniform among the angiosperms. Chromosome-wide distribution patterns of sRNAs were investigated by using scrolling-window analysis. We performed de novo natural antisense transcript (NAT) prediction, and found that the overlapping regions of trans-NATs, but not cis-NATs, were hotspots for sRNA generation. One cis-NAT generates phased natural antisense short interfering RNAs (nat-siRNAs) specifically from flowers in Arabidopsis, while one in rice produces phased nat-siRNAs from grains, suggesting their organ-specific regulatory roles. PMID:20378553

  3. Genetic sequence relationships of Winnipegosis platform carbonates, Southern Elk Point basin, North Dakota

    SciTech Connect

    Shanley, K.W.; Cross, T.A.

    1988-07-01

    Examination of cores and well-log data from the Winnipegosis Formation (Givetian) within a study area of approximately 11,500 mi² (30,000 km²) in northern North Dakota allows recognition of seven time-stratigraphic progradational units within the Winnipegosis Formation. Together with the underlying Ashern Formation, these units are arranged in landward-stepping, vertical stacking, and seaward-stepping geometric patterns, which reflect changes in relative sea level. Abrupt juxtaposition of shallow over deeper water lithologies, evidence for subaerial exposure, and onlap geometries further suggest that these progradational units form two larger Vail-type sequences separated by regionally persistent unconformities or their correlative conformities. Sea level rise during the early Eifelian caused southeastward onlap of the Ashern Formation onto Middle Silurian carbonates of the Interlake Formation. Maximum flooding, expressed by deepest marine facies and a hardground surface, suggests the existence of a condensed section at the top of the Ashern Formation. This section was developed during the maximum rate of sea level rise. A decrease in the rate of sea level rise resulted in aggradation of lower Winnipegosis units on a gently dipping ramp. These units are represented by nodular and burrowed open-marine limestones with scattered stromatoporoid patch reefs and grainstone shoals. During the subsequent sea level fall, represented by Temple units, a shelf margin with pronounced depositional topography and adjacent starved basin were developed. Temple strata include coral-brachiopod-stromatoporoid reefs and productive fore-reef talus deposits along the shelf-margin rim. With increased rates of sea level fall, the platform interior and shelf margin were subaerially exposed, slope carbonates were dolomitized, and the E-shale was deposited as a lowstand wedge.

  4. Bacterial signaling systems as platforms for rational design of new generations of biosensors.

    PubMed

    Checa, Susana K; Zurbriggen, Matias D; Soncini, Fernando C

    2012-10-01

    Bacterial signal-responsive regulatory circuits have been employed as a platform to design and construct whole-cell bacterial biosensors for reporting toxicity. A new generation of biosensors with improved performance and a wide application range has emerged after the application of synthetic biology concepts to biosensor design. Site-directed mutagenesis, directed evolution and domain swapping were applied to upgrade signal detection or to create novel sensor modules. Rewiring of the genetic circuits improves the determinations and reduces the heterogeneity of the response between individual reporter cells. Moreover, the assembly of natural or engineered modules into biosensor platforms provides innovative outputs, expanding the range of application of these devices, from monitoring toxics and bioremediation to killing targeted cells. PMID:22658939

  5. Clinical applications of next-generation sequencing in colorectal cancers

    PubMed Central

    Kim, Tae-Min; Lee, Sug-Hyung; Chung, Yeun-Jun

    2013-01-01

    Like other solid tumors, colorectal cancer (CRC) is a genomic disorder in which various types of genomic alterations, such as point mutations, genomic rearrangements, gene fusions, or chromosomal copy number alterations, can contribute to the initiation and progression of the disease. The advent of a new DNA sequencing technology known as next-generation sequencing (NGS) has revolutionized the speed and throughput of cataloguing such cancer-related genomic alterations. Now the challenge is how to exploit this advanced technology to better understand the underlying molecular mechanism of colorectal carcinogenesis and to identify clinically relevant genetic biomarkers for diagnosis and personalized therapeutics. In this review, we will introduce NGS-based cancer genomics studies focusing on those of CRC, including a recent large-scale report from the Cancer Genome Atlas. We will mainly discuss how NGS-based exome-, whole genome- and methylome-sequencing have extended our understanding of colorectal carcinogenesis. We will also introduce the unique genomic features of CRC discovered by NGS technologies, such as the relationship with bacterial pathogens and the massive genomic rearrangements of chromothripsis. Finally, we will discuss the necessary steps prior to development of a clinical application of NGS-related findings for the advanced management of patients with CRC. PMID:24187453

  6. Application of next-generation sequencing technologies in Neurology

    PubMed Central

    Jiang, Teng; Tan, Meng-Shan

    2014-01-01

    Genetic risk factors that underlie many rare and common neurological diseases remain poorly understood because of the multi-factorial and heterogeneous nature of these disorders. Although genome-wide association studies (GWAS) have successfully uncovered numerous susceptibility genes for these diseases, odds ratios associated with risk alleles are generally low and account for only a small proportion of estimated heritability. These results implied that rare (present in <5% of the population) but not causative variants, which usually have large effect sizes and cannot be captured by GWAS, are involved in the pathogenesis of these diseases. With the decreasing cost of next-generation sequencing (NGS) technologies, whole-genome sequencing (WGS) and whole-exome sequencing (WES) have enabled the rapid identification of rare variants with large effect sizes, which has led to huge progress in understanding the basis of many Mendelian neurological conditions as well as complex neurological diseases. In this article, recent NGS-based studies that aimed to investigate genetic causes of neurological diseases, including Alzheimer’s disease, Parkinson’s disease, epilepsy, multiple sclerosis, stroke, amyotrophic lateral sclerosis and spinocerebellar ataxias, are reviewed. In addition, we discuss the future directions of NGS applications. PMID:25568878

  7. Comparison of DNA Quantification Methods for Next Generation Sequencing

    PubMed Central

    Robin, Jérôme D.; Ludlow, Andrew T.; LaRanger, Ryan; Wright, Woodring E.; Shay, Jerry W.

    2016-01-01

    Next Generation Sequencing (NGS) is a powerful tool that depends on loading a precise amount of DNA onto a flowcell. NGS strategies have expanded our ability to investigate genomic phenomena by referencing mutations in cancer and diseases through large-scale genotyping, developing methods to map rare chromatin interactions (4C; 5C and Hi-C) and identifying chromatin features associated with regulatory elements (ChIP-seq, Bis-Seq, ChiA-PET). While many methods are available for DNA library quantification, there is no unambiguous gold standard. Most techniques use PCR to amplify DNA libraries to obtain sufficient quantities for optical density measurement. However, increased PCR cycles can distort the library’s heterogeneity and prevent the detection of rare variants. In this analysis, we compared new digital PCR technologies (droplet digital PCR; ddPCR, ddPCR-Tail) with standard methods for the titration of NGS libraries. DdPCR-Tail is comparable to qPCR and fluorometry (QuBit) and allows sensitive quantification by analysis of barcode repartition after sequencing of multiplexed samples. This study provides a direct comparison between quantification methods throughout a complete sequencing experiment and provides the impetus to use ddPCR-based quantification for improvement of NGS quality. PMID:27048884

  8. Comparison of DNA Quantification Methods for Next Generation Sequencing.

    PubMed

    Robin, Jérôme D; Ludlow, Andrew T; LaRanger, Ryan; Wright, Woodring E; Shay, Jerry W

    2016-01-01

    Next Generation Sequencing (NGS) is a powerful tool that depends on loading a precise amount of DNA onto a flowcell. NGS strategies have expanded our ability to investigate genomic phenomena by referencing mutations in cancer and diseases through large-scale genotyping, developing methods to map rare chromatin interactions (4C; 5C and Hi-C) and identifying chromatin features associated with regulatory elements (ChIP-seq, Bis-Seq, ChiA-PET). While many methods are available for DNA library quantification, there is no unambiguous gold standard. Most techniques use PCR to amplify DNA libraries to obtain sufficient quantities for optical density measurement. However, increased PCR cycles can distort the library's heterogeneity and prevent the detection of rare variants. In this analysis, we compared new digital PCR technologies (droplet digital PCR; ddPCR, ddPCR-Tail) with standard methods for the titration of NGS libraries. DdPCR-Tail is comparable to qPCR and fluorometry (QuBit) and allows sensitive quantification by analysis of barcode repartition after sequencing of multiplexed samples. This study provides a direct comparison between quantification methods throughout a complete sequencing experiment and provides the impetus to use ddPCR-based quantification for improvement of NGS quality. PMID:27048884

  9. Next Generation Sequencing in Predicting Gene Function in Podophyllotoxin Biosynthesis

    PubMed Central

    Marques, Joaquim V.; Kim, Kye-Won; Lee, Choonseok; Costa, Michael A.; May, Gregory D.; Crow, John A.; Davin, Laurence B.; Lewis, Norman G.

    2013-01-01

    Podophyllum species are sources of (−)-podophyllotoxin, an aryltetralin lignan used for semi-synthesis of various powerful and extensively employed cancer-treating drugs. Its biosynthetic pathway, however, remains largely unknown, with the last unequivocally demonstrated intermediate being (−)-matairesinol. Herein, massively parallel sequencing of Podophyllum hexandrum and Podophyllum peltatum transcriptomes and subsequent bioinformatics analyses of the corresponding assemblies were carried out. Validation of the assembly process was first achieved through confirmation of assembled sequences with those of various genes previously established as involved in podophyllotoxin biosynthesis as well as other candidate biosynthetic pathway genes. This contribution describes characterization of two of the latter, namely the cytochrome P450s, CYP719A23 from P. hexandrum and CYP719A24 from P. peltatum. Both enzymes were capable of converting (−)-matairesinol into (−)-pluviatolide by catalyzing methylenedioxy bridge formation and did not act on other possible substrates tested. Interestingly, the enzymes described herein were highly similar to methylenedioxy bridge-forming enzymes from alkaloid biosynthesis, whereas candidates more similar to lignan biosynthetic enzymes were catalytically inactive with the substrates employed. This overall strategy has thus enabled facile further identification of enzymes putatively involved in (−)-podophyllotoxin biosynthesis and underscores the deductive power of next generation sequencing and bioinformatics to probe and deduce medicinal plant biosynthetic pathways. PMID:23161544

  10. Next generation sequencing technologies: tool to study avian virus diversity.

    PubMed

    Kapgate, S S; Barbuddhe, S B; Kumanan, K

    2015-03-01

    Increased globalisation, climatic changes and the wildlife-livestock interface have led to the emergence of novel viral pathogens or zoonoses that have become a serious concern to avian, animal and human health. High biodiversity and bird migration facilitate spread of the pathogen and provide reservoirs for emerging infectious diseases. Current classical diagnostic methods, which are designed to be virus-specific or are limited to a group of viral agents, hinder the identification of novel viruses or viral variants. Recently developed approaches of next-generation sequencing (NGS) provide culture-independent methods that are useful for understanding viral diversity and discovering novel viruses, thereby enabling better diagnosis and disease control. This review discusses the different possible steps of an NGS study utilizing sequence-independent amplification, high-throughput sequencing and bioinformatics approaches to identify novel avian viruses and their diversity. NGS has led to the identification of a wide range of new viruses such as picobirnavirus, picornavirus, orthoreovirus and avian gamma coronavirus associated with fulminating disease in guinea fowl and is also used in describing viral diversity among avian species. The review also briefly discusses areas of viral-host interaction and disease associated causalities with newly identified avian viruses. PMID:25790045

  11. Deletion of the Pichia pastoris KU70 homologue facilitates platform strain generation for gene expression and synthetic biology.

    PubMed

    Näätsaari, Laura; Mistlberger, Beate; Ruth, Claudia; Hajek, Tanja; Hartner, Franz S; Glieder, Anton

    2012-01-01

    Targeted gene replacement to generate knock-outs and knock-ins is a commonly used method to study the function of unknown genes. In the methylotrophic yeast Pichia pastoris, the importance of specific gene targeting has increased since the genome sequencing projects of the most commonly used strains have been accomplished, but rapid progress in the field has been impeded by inefficient mechanisms for accurate integration. To improve gene targeting efficiency in P. pastoris, we identified and deleted the P. pastoris KU70 homologue. We observed a substantial increase in the targeting efficiency using the two commonly known and used integration loci HIS4 and ADE1, reaching over 90% targeting efficiencies with only 250-bp flanking homologous DNA. Although the ku70 deletion strain was noted to be more sensitive to UV rays than the corresponding wild-type strain, no lethality, severe growth retardation or loss of gene copy numbers could be detected during repetitive rounds of cultivation and induction of heterologous protein production. Furthermore, we demonstrated the use of the ku70 deletion strain for fast and simple screening of genes in the search of new auxotrophic markers by targeting dihydroxyacetone synthase and glycerol kinase genes. Precise knock-out strains for the well-known P. pastoris AOX1, ARG4 and HIS4 genes and a whole series of expression vectors were generated based on the wild-type platform strain, providing a broad spectrum of precise tools for both intracellular and secreted production of heterologous proteins utilizing various selection markers and integration strategies for targeted or random integration of single and multiple genes. The simplicity of targeted integration in the ku70 deletion strain will further support protein production strain generation and synthetic biology using P. pastoris strains as platform hosts. PMID:22768112

  12. Application of next-generation sequencing technologies in virology

    PubMed Central

    Chapman, David; Dixon, Linda; Chantrey, Julian; Darby, Alistair C.; Hall, Neil

    2012-01-01

    The progress of science is punctuated by the advent of revolutionary technologies that provide new ways and scales to formulate scientific questions and advance knowledge. Following on from electron microscopy, cell culture and PCR, next-generation sequencing is one of these methodologies that is now changing the way that we understand viruses, particularly in the areas of genome sequencing, evolution, ecology, discovery and transcriptomics. Possibilities for these methodologies are only limited by our scientific imagination and, to some extent, by their cost, which has restricted their use to relatively small numbers of samples. Challenges remain, including the storage and analysis of the large amounts of data generated. As the chemistries employed mature, costs will decrease. In addition, improved methods for analysis will become available, opening yet further applications in virology including routine diagnostic work on individuals, and new understanding of the interaction between viral and host transcriptomes. An exciting era of viral exploration has begun, and will set us new challenges to understand the role of newly discovered viral diversity in both disease and health. PMID:22647373

  13. Next-Generation Sequencing and Genome Editing in Plant Virology.

    PubMed

    Hadidi, Ahmed; Flores, Ricardo; Candresse, Thierry; Barba, Marina

    2016-01-01

    Next-generation sequencing (NGS) has been applied to plant virology since 2009. NGS provides highly efficient, rapid, low cost DNA, or RNA high-throughput sequencing of the genomes of plant viruses and viroids and of the specific small RNAs generated during the infection process. These small RNAs, which cover frequently the whole genome of the infectious agent, are 21-24 nt long and are known as vsRNAs for viruses and vd-sRNAs for viroids. NGS has been used in a number of studies in plant virology including, but not limited to, discovery of novel viruses and viroids as well as detection and identification of those pathogens already known, analysis of genome diversity and evolution, and study of pathogen epidemiology. The genome engineering editing method, clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system has been successfully used recently to engineer resistance to DNA geminiviruses (family, Geminiviridae) by targeting different viral genome sequences in infected Nicotiana benthamiana or Arabidopsis plants. The DNA viruses targeted include tomato yellow leaf curl virus and merremia mosaic virus (begomovirus); beet curly top virus and beet severe curly top virus (curtovirus); and bean yellow dwarf virus (mastrevirus). The technique has also been used against the RNA viruses zucchini yellow mosaic virus, papaya ringspot virus and turnip mosaic virus (potyvirus) and cucumber vein yellowing virus (ipomovirus, family, Potyviridae) by targeting the translation initiation genes eIF4E in cucumber or Arabidopsis plants. From these recent advances of major importance, it is expected that NGS and CRISPR-Cas technologies will play a significant role in the very near future in advancing the field of plant virology and connecting it with other related fields of biology. PMID:27617007

  14. Next-Generation Sequencing and Genome Editing in Plant Virology

    PubMed Central

    Hadidi, Ahmed; Flores, Ricardo; Candresse, Thierry; Barba, Marina

    2016-01-01

    Next-generation sequencing (NGS) has been applied to plant virology since 2009. NGS provides highly efficient, rapid, low cost DNA, or RNA high-throughput sequencing of the genomes of plant viruses and viroids and of the specific small RNAs generated during the infection process. These small RNAs, which cover frequently the whole genome of the infectious agent, are 21–24 nt long and are known as vsRNAs for viruses and vd-sRNAs for viroids. NGS has been used in a number of studies in plant virology including, but not limited to, discovery of novel viruses and viroids as well as detection and identification of those pathogens already known, analysis of genome diversity and evolution, and study of pathogen epidemiology. The genome engineering editing method, clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system has been successfully used recently to engineer resistance to DNA geminiviruses (family, Geminiviridae) by targeting different viral genome sequences in infected Nicotiana benthamiana or Arabidopsis plants. The DNA viruses targeted include tomato yellow leaf curl virus and merremia mosaic virus (begomovirus); beet curly top virus and beet severe curly top virus (curtovirus); and bean yellow dwarf virus (mastrevirus). The technique has also been used against the RNA viruses zucchini yellow mosaic virus, papaya ringspot virus and turnip mosaic virus (potyvirus) and cucumber vein yellowing virus (ipomovirus, family, Potyviridae) by targeting the translation initiation genes eIF4E in cucumber or Arabidopsis plants. From these recent advances of major importance, it is expected that NGS and CRISPR-Cas technologies will play a significant role in the very near future in advancing the field of plant virology and connecting it with other related fields of biology. PMID:27617007

  15. SMITH: a LIMS for handling next-generation sequencing workflows

    PubMed Central

    2014-01-01

    Background: Life-science laboratories make increasing use of Next Generation Sequencing (NGS) for studying bio-macromolecules and their interactions. Array-based methods for measuring gene expression or protein-DNA interactions are being replaced by RNA-Seq and ChIP-Seq. Sequencing is generally performed by specialized facilities that have to keep track of sequencing requests, trace samples, ensure quality and make data available according to predefined privileges. An integrated tool helps to troubleshoot problems, to maintain a high quality standard, and to reduce time and costs. Commercial and non-commercial tools called LIMS (Laboratory Information Management Systems) are available for this purpose. However, they often come at prohibitive cost and/or lack the flexibility and scalability needed to adjust seamlessly to the frequently changing protocols employed. In order to manage the flow of sequencing data produced at the Genomic Unit of the Italian Institute of Technology (IIT), we developed SMITH (Sequencing Machine Information Tracking and Handling). Methods: SMITH is a web application with a MySQL server at the backend. SMITH was developed by wet-lab scientists of the Centre for Genomic Science and database experts from the Politecnico of Milan, in the context of a Genomic Data Model Project. The database schema stores all the information of an NGS experiment, including the descriptions of all protocols and algorithms used in the process. Notably, an attribute-value table allows associating an unconstrained textual description to each sample and all the data produced afterwards. This method permits the creation of metadata that can be used to search the database for specific files as well as for statistical analyses. Results: SMITH runs automatically and limits direct human interaction mainly to administrative tasks. SMITH data-delivery procedures were standardized, making it easier for biologists and analysts to navigate the data. Automation also helps save time. The

  16. Addressing Benefits, Risks and Consent in Next Generation Sequencing Studies

    PubMed Central

    Meller, R

    2016-01-01

    The sequencing of the human genome and technological advances in DNA sequencing have led to a revolution with respect to DNA sequencing and its potential to diagnose genetic disorders. However, requests for open access to genomic data must be balanced against the guiding principles of the Common Rule for human subject research. Unfortunately, the risks to patients involved in genomic studies are still evolving and as such may not be clear to learned and well-intentioned scientists. Central to this issue are the strategies that enable human participants in such studies to remain anonymous, or de-identified. The wealth of genomic data on the Internet in genomic data repositories and other databases has enabled de-identified data to be broken and research subjects to be identified. The security of de-identification neglects the fact that DNA itself is an identifying element. Therefore, it is questionable whether data security standards can ever truly protect the identity of a patient, under the current conditions or in the future. As Big Data methodologies advance, additional sources of data may enable the re-identification of patients enrolled in next-generation sequencing (NGS) studies. As such, it is time to re-evaluate the risks of sharing genomic data and establish new guidelines for good practices. In this commentary, I address the challenges facing federally funded investigators who need to strike a balance between compliance with federal (US) rules for human subjects and the recent requirement for open access/sharing of data from National Institutes of Health (NIH)-funded studies involving human subjects. PMID:27375922

  17. Characterization of sequence-specific errors in various next-generation sequencing systems.

    PubMed

    Shin, Sunguk; Park, Joonhong

    2016-03-01

    Next-generation sequencing (NGS) is a popular method for assessing the molecular diversity of microbial communities without cultivation, for identifying polymorphisms in populations, and for comparing genomes and transcriptomes. However, sequence-specific errors (SSEs) by NGS systems can result in genome mis-assembly, overestimation of diversity in microbial community analyses, and false polymorphism discovery. SSEs can be particularly problematic due to rich microbial biodiversity and genomes containing frequent repeats. In this study, SSEs in public data from all popular NGS systems were discovered using a Markov chain model and hotspots for sequence errors were identified. Deletion errors were frequently preceded by homopolymers in non-Illumina NGS systems, such as GS FLX+. Substitution errors were often related to high GC contents and long G/C homopolymers in Illumina sequencing systems such as HiSeq. After removal of long G/C homopolymers in HiSeq, the average lengths of contigs and average SNP quality increased. SSEs were selectively removed from our mock community data by quality filtering, and a bias against specific microbes was identified. Our findings provide a scientific basis for filtering poor-quality reads, correcting deletion errors, preventing genome mis-assembly, and accurately assessing microbial community compositions and polymorphisms. PMID:26790373
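
    The abstract above links deletion errors to homopolymers and substitution errors to high GC content and long G/C homopolymers. The sketch below shows one simple way to flag those two sequence features in a read. It is not the Markov chain model used in the study. The function names, the example read, and the minimum run length of 4 are assumptions for illustration.

```python
import re

# One alternative per base, so each match is a maximal run of a single base.
_RUN = re.compile(r"A+|C+|G+|T+")


def homopolymer_runs(seq, min_len=4):
    """Return (start, base, length) for every single-base run of at least min_len."""
    runs = []
    for match in _RUN.finditer(seq.upper()):
        run = match.group()
        if len(run) >= min_len:
            runs.append((match.start(), run[0], len(run)))
    return runs


def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


read = "ACGGGGGGTTACCCCCAT"  # hypothetical read
print(homopolymer_runs(read))
print(round(gc_content(read), 3))
```

    A filter built on these two measures could, for example, mask or down-weight reads whose G/C runs exceed a chosen length, with the threshold tuned per platform.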

  18. Molecular diagnostics of a single drug-resistant multiple myeloma case using targeted next-generation sequencing

    PubMed Central

    Ikeda, Hiroshi; Ishiguro, Kazuya; Igarashi, Tetsuyuki; Aoki, Yuka; Hayashi, Toshiaki; Ishida, Tadao; Sasaki, Yasushi; Tokino, Takashi; Shinomura, Yasuhisa

    2015-01-01

    A 69-year-old man was diagnosed with IgG λ-type multiple myeloma (MM), Stage II in October 2010. He was treated with one cycle of high-dose dexamethasone. After three cycles of bortezomib, the patient exhibited slow elevations in the free light-chain levels and developed a significant new increase of serum M protein. Bone marrow cytogenetic analysis revealed a complex karyotype characteristic of malignant plasma cells. To better understand the molecular pathogenesis of this patient, we sequenced for mutations in the entire coding regions of 409 cancer-related genes using a semiconductor-based sequencing platform. Sequencing analysis revealed eight nonsynonymous somatic mutations in addition to several copy number variants, including CCND1 and RB1. These alterations may play roles in the pathobiology of this disease. This targeted next-generation sequencing can allow for the prediction of drug resistance and facilitate improvements in the treatment of MM patients. PMID:26491355

  19. De novo transcriptome analysis of an imminent biofuel crop, Camelina sativa L. using Illumina GAIIX sequencing platform and identification of SSR markers.

    PubMed

    Mudalkar, Shalini; Golla, Ramesh; Ghatty, Sreenivas; Reddy, Attipalli Ramachandra

    2014-01-01

    Camelina sativa L. is an emerging biofuel crop with potential applications in industry, medicine, cosmetics and human nutrition. The crop is unexploited owing to very limited availability of transcriptome and genomic data. In order to analyse the various metabolic pathways, we performed de novo assembly of the transcriptome on the Illumina GAIIX platform with paired end sequencing for obtaining short reads. The sequencing output generated a FastQ file size of 2.97 GB with 10.83 million reads having a maximum read length of 101 nucleotides. The number of contigs generated was 53,854 with maximum and minimum lengths of 10,086 and 200 nucleotides respectively. These transcripts were annotated using BLAST search against the AraCyc, Swiss-Prot, TrEMBL, gene ontology and clusters of orthologous groups (KOG) databases. The genes involved in lipid metabolism were studied and the transcription factors were identified. Sequence similarity studies of Camelina with the other related organisms indicated the close relatedness of Camelina with Arabidopsis. In addition, bioinformatics analysis revealed the presence of a total of 19,379 simple sequence repeats. This is the first report on Camelina sativa L. in which the transcriptome of the entire plant, including seedlings, seed, root, leaves and stem, was analysed. Our data establish an excellent resource for gene discovery and provide useful information for functional and comparative genomic studies in this promising biofuel crop. PMID:24002439
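
    The abstract reports 19,379 simple sequence repeats (SSRs) but does not say which tool or thresholds were used. The sketch below shows one common way to find perfect SSRs in a contig with a back-referencing regular expression. The function names, the unit lengths of 2 to 6 nucleotides, and the minimum of 4 repeats are all assumptions for illustration.

```python
import re


def _is_primitive(unit):
    """True if unit is not itself a repeat of a shorter motif (e.g. 'AGAG' is not)."""
    k = len(unit)
    return not any(k % d == 0 and unit == unit[:d] * (k // d) for d in range(1, k))


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return sorted (start, unit, repeat_count) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A k-base unit followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if _is_primitive(unit):  # skip homopolymers and composite units
                hits.append((match.start(), unit, len(match.group()) // k))
    return sorted(hits)


# Hypothetical contig with an (AG)5 and an (ATG)5 repeat.
contig = "TTGAC" + "AG" * 5 + "CCT" + "ATG" * 5 + "CA"
print(find_ssrs(contig))
```

    The primitive-unit check keeps a dinucleotide repeat from being reported a second time as a tetranucleotide or hexanucleotide repeat.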

  20. Low Diversity in the Mitogenome of Sperm Whales Revealed by Next-Generation Sequencing

    PubMed Central

    Alexander, Alana; Steel, Debbie; Slikas, Beth; Hoekzema, Kendra; Carraher, Colm; Parks, Matthew; Cronn, Richard; Baker, C. Scott

    2013-01-01

    Large population sizes and global distributions generally associate with high mitochondrial DNA control region (CR) diversity. The sperm whale (Physeter macrocephalus) is an exception, showing low CR diversity relative to other cetaceans; however, diversity levels throughout the remainder of the sperm whale mitogenome are unknown. We sequenced 20 mitogenomes from 17 sperm whales representative of worldwide diversity using Next Generation Sequencing (NGS) technologies (Illumina GAIIx, Roche 454 GS Junior). Resequencing of three individuals with both NGS platforms and partial Sanger sequencing showed low discrepancy rates (454-Illumina: 0.0071%; Sanger-Illumina: 0.0034%; and Sanger-454: 0.0023%) confirming suitability of both NGS platforms for investigating low mitogenomic diversity. Using the 17 sperm whale mitogenomes in a phylogenetic reconstruction with 41 other species, including 11 new dolphin mitogenomes, we tested two hypotheses for the low CR diversity. First, the hypothesis that CR-specific constraints have reduced diversity solely in the CR was rejected as diversity was low throughout the mitogenome, not just in the CR (overall diversity π = 0.096%; protein-coding 3rd codon = 0.22%; CR = 0.35%), and CR phylogenetic signal was congruent with protein-coding regions. Second, the hypothesis that slow substitution rates reduced diversity throughout the sperm whale mitogenome was rejected as sperm whales had significantly higher rates of CR evolution and no evidence of slow coding region evolution relative to other cetaceans. The estimated time to most recent common ancestor for sperm whale mitogenomes was 72,800 to 137,400 years ago (95% highest probability density interval), consistent with previous hypotheses of a bottleneck or selective sweep as likely causes of low mitogenome diversity. PMID:23254394
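
    The diversity values quoted above are nucleotide diversity (π), the average proportion of sites that differ between two sequences drawn from the sample. The sketch below computes π from a small set of aligned sequences. It assumes equal-length, gap-free sequences and is not the software used in the study. The function name and the toy alignment are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(aligned):
    """Average pairwise differences per site over all pairs of aligned sequences."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and of equal length")
    total_diffs = 0
    n_pairs = 0
    for a, b in combinations(aligned, 2):
        total_diffs += sum(1 for x, y in zip(a, b) if x != y)
        n_pairs += 1
    return total_diffs / (n_pairs * length)


# Toy alignment: four 10-bp haplotypes, two of them identical.
haplotypes = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGTACGAAC"]
print(f"pi = {nucleotide_diversity(haplotypes):.3f}")
```

    On real mitogenome alignments, sites with gaps or ambiguous bases would need an explicit rule (for example pairwise deletion), which this sketch leaves out.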

  1. Identification of conserved genomic regions and variation therein amongst Cetartiodactyla species using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background Next Generation Sequencing has created an opportunity to genetically characterize an individual both inexpensively and comprehensively. In earlier work produced in our collaboration [1], it was demonstrated that, for animals without a reference genome, their Next Generation Sequence data ...

  2. Sensor Web Approach For Next-Generation Research Aircraft Platform Data Systems

    NASA Astrophysics Data System (ADS)

    Sorenson, C.; Freudinger, L.; Yarbrough, S.

    2006-12-01

    A NASA project creating sensor web tools has resulted in systems which provide network telemetry for airborne experiments, while also providing traditional data system functions. The Research Environment for Vehicle-Embedded Analysis on Linux (REVEAL) software and hardware has now been tested on many Airborne Science campaigns and on several aircraft platforms, both manned and UAV, with participants across the Internet monitoring and contributing to the success of each mission. The software is a self-configuring, self-documenting framework based on open-standards XML which can wrap any schema such as SensorML to automatically generate metadata. Experimenters can participate by providing real-time instrument data in whatever format is most convenient to them. Buffering middleware enables efficient data distribution across the network, where applications can add value and create live displays, despite very limited air-ground bandwidth for telemetering the merged data streams. The system architecture is described, and plans are described to replace or install systems on several NASA Airborne Science platform aircraft. These new systems will be transparent to legacy instruments while enabling sensor web and telepresence applications, and provide a common interface across the platforms.

  3. FaceTOON: a unified platform for feature-based cartoon expression generation

    NASA Astrophysics Data System (ADS)

    Zaharia, Titus; Marre, Olivier; Prêteux, Françoise; Monjaux, Perrine

    2008-02-01

    This paper presents the FaceTOON system, a semi-automatic platform dedicated to the creation of verbal and emotional facial expressions, within the applicative framework of 2D cartoon production. The proposed FaceTOON platform makes it possible to rapidly create 3D facial animations with a minimum amount of user interaction. In contrast with existing commercial 3D modeling software, which usually requires advanced 3D graphics skills from its users, the FaceTOON system is based exclusively on 2D interaction mechanisms, the 3D modeling stage being completely transparent for the user. The system takes as input a neutral 3D face model, free of any facial feature, and a set of 2D drawings, representing the desired facial features. A 2D/3D virtual mapping procedure makes it possible to obtain a ready-for-animation model which can be directly manipulated and deformed for generating expressions. The platform includes a complete set of dedicated tools for 2D/3D interactive deformation, pose management, key-frame interpolation and MPEG-4 compliant animation and rendering. The proposed FaceTOON system is currently considered for industrial evaluation and commercialization by the Quadraxis company.

  4. A distributed system for fast alignment of next-generation sequencing data

    PubMed Central

    Srimani, Jaydeep K.; Wu, Po-Yen; Phan, John H.; Wang, May D.

    2016-01-01

    We developed a scalable distributed computing system using the Berkeley Open Infrastructure for Network Computing (BOINC) to align next-generation sequencing (NGS) data quickly and accurately. NGS technology is emerging as a promising platform for gene expression analysis due to its high sensitivity compared to traditional genomic microarray technology. However, despite the benefits, NGS datasets can be prohibitively large, requiring significant computing resources to obtain sequence alignment results. Moreover, as the data and alignment algorithms become more prevalent, it will become necessary to examine the effect of the multitude of alignment parameters on various NGS systems. We validate the distributed software system by (1) computing simple timing results to show the speed-up gained by using multiple computers, (2) optimizing alignment parameters using simulated NGS data, and (3) computing NGS expression levels for a single biological sample using optimal parameters and comparing these expression levels to that of a microarray sample. Results indicate that the distributed alignment system achieves approximately a linear speed-up and correctly distributes sequence data to and gathers alignment results from multiple compute clients.

  5. QuRe: software for viral quasispecies reconstruction from next-generation sequencing data

    PubMed Central

    Prosperi, Mattia C. F.; Salemi, Marco

    2012-01-01

    Summary: Next-generation sequencing (NGS) is an ideal framework for the characterization of highly variable pathogens, with a deep resolution able to capture minority variants. However, the reconstruction of all variants of a viral population infecting a host is a challenging task for genome regions larger than the average NGS read length. QuRe is a program for viral quasispecies reconstruction, specifically developed to analyze long read (>100 bp) NGS data. The software performs alignments of sequence fragments against a reference genome, finds an optimal division of the genome into sliding windows based on coverage and diversity and attempts to reconstruct all the individual sequences of the viral quasispecies—along with their prevalence—using a heuristic algorithm, which matches multinomial distributions of distinct viral variants overlapping across the genome division. QuRe comes with a built-in Poisson error correction method and a post-reconstruction probabilistic clustering, both parameterized on given error rates in homopolymeric and non-homopolymeric regions. Availability: QuRe is platform-independent, multi-threaded software implemented in Java. It is distributed under the GNU General Public License, available at https://sourceforge.net/projects/qure/. Contact: ahnven@yahoo.it; ahnven@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22088846

  6. A Practical Comparison of De Novo Genome Assembly Software Tools for Next-Generation Sequencing Technologies

    PubMed Central

    Zhang, Wenyu; Chen, Jiajia; Yang, Yang; Tang, Yifei; Shang, Jing; Shen, Bairong

    2011-01-01

    The advent of next-generation sequencing technologies is accompanied by the development of many whole-genome sequence assembly methods and software, especially for de novo fragment assembly. Due to the poor knowledge about the applicability and performance of these software tools, choosing a befitting assembler becomes a tough task. Here, we provide the information of adaptivity for each program, then above all, compare the performance of eight distinct tools against eight groups of simulated datasets from the Solexa sequencing platform. Considering the computational time, maximum random access memory (RAM) occupancy, assembly accuracy and integrity, our study indicates that string-based assemblers and overlap-layout-consensus (OLC) assemblers are well suited for very short reads and longer reads of small genomes, respectively. For large datasets of more than a hundred million short reads, De Bruijn graph-based assemblers would be more appropriate. In terms of software implementation, string-based assemblers are superior to graph-based ones, of which SOAPdenovo is complex for the creation of configuration file. Our comparison study will assist researchers in selecting a well-suited assembler and offer essential information for the improvement of existing assemblers or the development of novel assemblers. PMID:21423806
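
    To make the De Bruijn graph approach mentioned above concrete, the sketch below builds the graph used by that family of assemblers: every k-mer in a read becomes an edge from its (k-1)-base prefix to its (k-1)-base suffix. It is a minimal illustration that ignores sequencing errors, reverse complements and coverage, and it does not reproduce any of the eight tools compared in the study. The function name, the reads and the value of k are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    if k < 2:
        raise ValueError("k must be at least 2")
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])  # edge: prefix -> suffix
    return graph


# Two overlapping toy reads from the sequence ACGTCA.
reads = ["ACGTC", "CGTCA"]
for node, successors in sorted(de_bruijn_graph(reads, k=3).items()):
    print(node, "->", sorted(successors))
```

    Because overlapping reads share k-mers, they collapse onto the same edges, which is why memory use in this approach grows with the number of distinct k-mers and not with the number of reads.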

  7. Challenges and opportunities in estimating viral genetic diversity from next-generation sequencing data

    PubMed Central

    Beerenwinkel, Niko; Günthard, Huldrych F.; Roth, Volker; Metzner, Karin J.

    2012-01-01

    Many viruses, including the clinically relevant RNA viruses HIV (human immunodeficiency virus) and HCV (hepatitis C virus), exist in large populations and display high genetic heterogeneity within and between infected hosts. Assessing intra-patient viral genetic diversity is essential for understanding the evolutionary dynamics of viruses, for designing effective vaccines, and for the success of antiviral therapy. Next-generation sequencing (NGS) technologies allow the rapid and cost-effective acquisition of thousands to millions of short DNA sequences from a single sample. However, this approach entails several challenges in experimental design and computational data analysis. Here, we review the entire process of inferring viral diversity from sample collection to computing measures of genetic diversity. We discuss sample preparation, including reverse transcription and amplification, and the effect of experimental conditions on diversity estimates due to in vitro base substitutions, insertions, deletions, and recombination. The use of different NGS platforms and their sequencing error profiles are compared in the context of various applications of diversity estimation, ranging from the detection of single nucleotide variants (SNVs) to the reconstruction of whole-genome haplotypes. We describe the statistical and computational challenges arising from these technical artifacts, and we review existing approaches, including available software, for their solution. Finally, we discuss open problems, and highlight successful biomedical applications and potential future clinical use of NGS to estimate viral diversity. PMID:22973268

  8. Mutation Detection in Patients with Retinal Dystrophies Using Targeted Next Generation Sequencing

    PubMed Central

    Weisschuh, Nicole; Mayer, Anja K.; Strom, Tim M.; Kohl, Susanne; Glöckle, Nicola; Schubach, Max; Andreasson, Sten; Bernd, Antje; Birch, David G.; Hamel, Christian P.; Heckenlively, John R.; Jacobson, Samuel G.; Kamme, Christina; Kellner, Ulrich; Kunstmann, Erdmute; Maffei, Pietro; Reiff, Charlotte M.; Rohrschneider, Klaus; Rosenberg, Thomas; Rudolph, Günther; Vámos, Rita; Varsányi, Balázs; Weleber, Richard G.; Wissinger, Bernd

    2016-01-01

    Retinal dystrophies (RD) constitute a group of blinding diseases that are characterized by clinical variability and pronounced genetic heterogeneity. The different nonsyndromic and syndromic forms of RD can be attributed to mutations in more than 200 genes. Consequently, next generation sequencing (NGS) technologies are among the most promising approaches to identify mutations in RD. We screened a large cohort of patients comprising 89 independent cases and families with various subforms of RD applying different NGS platforms. While mutation screening in 50 cases was performed using a RD gene capture panel, 47 cases were analyzed using whole exome sequencing. One family was analyzed using whole genome sequencing. A detection rate of 61% was achieved including mutations in 34 known and two novel RD genes. A total of 69 distinct mutations were identified, including 39 novel mutations. Notably, genetic findings in several families were not consistent with the initial clinical diagnosis. Clinical reassessment resulted in refinement of the clinical diagnosis in some of these families and confirmed the broad clinical spectrum associated with mutations in RD genes. PMID:26766544

  9. Challenges and opportunities of next-generation sequencing: a cytopathologist's perspective.

    PubMed

    Vigliar, E; Malapelle, U; de Luca, C; Bellevicine, C; Troncone, G

    2015-10-01

    Molecular cytopathology has gene sequencing as its core technology. Until recently, cytological samples were only tested by sequential single-gene mutational tests. Today, with the better understanding of the molecular events involved in malignancy and the mechanisms of pharmacotherapy, larger gene panels are more informative than a single biomarker. Next-generation sequencing (NGS), matched with the multiplex capture of targeted gene regions and analysed by sophisticated bioinformatics tools, enables the simultaneous detection of multiple mutations in multiple genes. With the development of miniaturised technology and benchtop sequencers, it is not unlikely that NGS will soon be adopted for routine molecular diagnostics, including cytological samples. This review addresses (1) the most relevant methodological and technical aspects of the NGS analysis workflow and the diverse platforms available; (2) the issues related to daily practice implementation, namely, the cytological sample requirement and the validation procedures; and (3) the opportunities that NGS offers in different fields of cytopathology, to increase mutation detection sensitivity in paucicellular smears and to extend the analysis to a larger number of gene regions. Cytopathologists' involvement and coordination in this rapidly evolving field are crucial for the effective implementation of NGS in the present and future cytological practice. PMID:26399861

  10. A proposed Next Generation Service Delivery Platform (NG-SDP) for eHealth domain.

    PubMed

    Andriopoulou, Foteini Gr; Lazarou, Nicolaos G; Lymberopoulos, Dimitrios K

    2012-01-01

    Nowadays, providing personalized healthcare services in the user's intelligent space is an important issue for improving personal health, supporting predictive care and saving medical costs. In this paper, we propose an architecture for the Next Generation Service Delivery Platform (NG-SDP), suitable for composing and delivering personalized healthcare services. The core component of NG-SDP is a Context Decision Making Enabler (CDME) that assesses user contextual and bio information to yield personalized services. A prototype implementation of the proposed NG-SDP is also demonstrated. Finally, a real case study demonstrates the CDME performance. PMID:23367312

  11. Defining a sample preparation workflow for advanced virus detection and understanding sensitivity by next-generation sequencing.

    PubMed

    Wang, Christopher J; Feng, Szi Fei; Duncan, Paul

    2014-01-01

    The application of next-generation sequencing (also known as deep sequencing or massively parallel sequencing) for adventitious agent detection is an evolving field that is steadily gaining acceptance in the biopharmaceutical industry. In order for this technology to be successfully applied, a robust method that can isolate viral nucleic acids from a variety of biological samples (such as host cell substrates, cell-free culture fluids, viral vaccine harvests, and animal-derived raw materials) must be established by demonstrating recovery of model virus spikes. In this report, we implement the sample preparation workflow developed by Feng et al. and assess the sensitivity of virus detection in a next-generation sequencing readout using the Illumina MiSeq platform. We describe a theoretical model to estimate the detection of a target virus in a cell lysate or viral vaccine harvest sample. We show that nuclease treatment can be used for samples that contain a high background of non-relevant nucleic acids (e.g., host cell DNA) in order to effectively increase the sensitivity of sequencing target viruses and reduce the complexity of data analysis. Finally, we demonstrate that at defined spike levels, nucleic acids from a panel of model viruses spiked into representative cell lysate and viral vaccine harvest samples can be confidently recovered by next-generation sequencing. PMID:25475632

  12. Management of Incidental Findings in the Era of Next-generation Sequencing

    PubMed Central

    Blackburn, Heather L.; Schroeder, Bradley; Turner, Clesson; Shriver, Craig D.; Ellsworth, Darrell L.; Ellsworth, Rachel E.

    2015-01-01

    Next-generation sequencing (NGS) technologies allow for the generation of whole exome or whole genome sequencing data, which can be used to identify novel genetic alterations associated with defined phenotypes or to expedite discovery of functional variants for improved patient care. Because this robust technology has the ability to identify all mutations within a genome, incidental findings (IF), genetic alterations associated with conditions or diseases unrelated to the patient’s present condition for which current tests are being performed, may have important clinical ramifications. The current debate among genetic scientists and clinicians focuses on the following questions: 1) should any IF be disclosed to patients, and 2) which IF should be disclosed: actionable mutations, variants of unknown significance, or all IF? Policies for disclosure of IF are being developed for when and how to convey these findings and whether adults, minors, or individuals unable to provide consent have the right to refuse receipt of IF. In this review, we detail current NGS technology platforms, discuss pressing issues regarding disclosure of IF, and describe how IF are currently being handled in prenatal, pediatric, and adult patients. PMID:26069456

  13. Plasmid-Based Materials as Multiplex Quality Controls and Calibrators for Clinical Next-Generation Sequencing Assays.

    PubMed

    Sims, David J; Harrington, Robin D; Polley, Eric C; Forbes, Thomas D; Mehaffey, Michele G; McGregor, Paul M; Camalier, Corinne E; Harper, Kneshay N; Bouk, Courtney H; Das, Biswajit; Conley, Barbara A; Doroshow, James H; Williams, P Mickey; Lih, Chih-Jian

    2016-05-01

    Although next-generation sequencing technologies have been widely adapted for clinical diagnostic applications, an urgent need exists for multianalyte calibrator materials and controls to evaluate the performance of these assays. Control materials will also play a major role in the assessment, development, and selection of appropriate alignment and variant calling pipelines. We report an approach to provide effective multianalyte controls for next-generation sequencing assays, referred to as the control plasmid spiked-in genome (CPSG). Control plasmids that contain approximately 1000 bases of human genomic sequence with a specific mutation of interest positioned near the middle of the insert and a nearby 6-bp molecular barcode were synthesized, linearized, quantitated, and spiked into genomic DNA derived from formalin-fixed, paraffin-embedded-prepared HapMap cell lines at defined copy number ratios. Serial titration experiments demonstrated that the CPSGs performed with similar efficiency of variant detection as formalin-fixed, paraffin-embedded cell line genomic DNA. Repetitive analyses of one lot of CPSGs 90 times during 18 months revealed that the reagents were stable with consistent detection of each of the plasmids at similar variant allele frequencies. CPSGs are designed to work across most next-generation sequencing methods, platforms, and data analysis pipelines. CPSGs are robust controls and can be used to evaluate the performance of different next-generation sequencing diagnostic assays, assess data analysis pipelines, and ensure robust assay performance metrics. PMID:27105923

  14. Next generation sequencing and the future of genetic diagnosis.

    PubMed

    Lohmann, Katja; Klein, Christine

    2014-10-01

    The introduction of next generation sequencing (NGS) has led to an exponential increase of elucidated genetic causes in both extremely rare diseases and common but heterogeneous disorders. It can be applied to the whole or to selected parts of the genome (genome or exome sequencing, gene panels). NGS is not only useful in large extended families with linkage information, but may also be applied to detect de novo mutations or mosaicism in sporadic patients without a prior hypothesis about the mutated gene. Currently, NGS is applied in both research and clinical settings, and there is a rapid transition of research findings to diagnostic applications. These developments may greatly help to minimize the "diagnostic odyssey" for patients as whole-genome analysis can be performed in a few days at reasonable costs compared with gene-by-gene analysis based on Sanger sequencing following diverse clinical tests. Despite the enthusiasm about NGS, one has to keep in mind its limitations, such as a coverage and accuracy of < 100%, resulting in missing variants and false positive findings. In addition, variant interpretation is challenging as there is usually more than one candidate variant found. Therefore, there is an urgent need to define standards for NGS with respect to run quality and variant interpretation, as well as mechanisms of quality control. Further, there are ethical challenges including incidental findings and how to guide unaffected probands seeking direct-to-customer testing. However, taken together, the application of NGS in research and diagnostics provides a tremendous opportunity to better serve our patients. PMID:25052068

  15. Applications of Next-generation Sequencing in Systemic Autoimmune Diseases

    PubMed Central

    Ma, Yiyangzi; Shi, Na; Li, Mengtao; Chen, Fei; Niu, Haitao

    2015-01-01

    Systemic autoimmune diseases are a group of heterogeneous disorders caused by both genetic and environmental factors. Although numerous causal genes have been identified by genome-wide association studies (GWAS), these susceptibility genes are correlated to a relatively low disease risk, indicating that environmental factors also play an important role in the pathogenesis of disease. The intestinal microbiome, as the main symbiotic ecosystem between the host and host-associated microorganisms, has been demonstrated to regulate the development of the body’s immune system and is likely related to genetic mutations in systemic autoimmune diseases. Next-generation sequencing (NGS) technology, with high-throughput capacity and accuracy, provides a powerful tool to discover genomic mutations, abnormal transcription and intestinal microbiome identification for autoimmune diseases. In this review, we briefly outlined the applications of NGS in systemic autoimmune diseases. This review may provide a reference for future studies in the pathogenesis of systemic autoimmune diseases. PMID:26432094

  16. Unrevealed mosaicism in the next-generation sequencing era.

    PubMed

    Gajecka, Marzena

    2016-04-01

    Mosaicism refers to the presence in an individual of normal and abnormal cells that are genotypically distinct and are derived from a single zygote. The incidence of mosaicism events in the human body is underestimated as the genotypes in the mosaic ratio, especially in the low-grade mosaicism, stay unrevealed. This review summarizes various research outcomes and diagnostic questions in relation to different types of mosaicism. The impact of both tested biological material and applied method on the mosaicism detection rate is especially highlighted. As next-generation sequencing technologies constitute a promising methodological solution in mosaicism detection in the coming years, revisions in current diagnostic protocols are necessary to increase the detection rate of the unrevealed mosaicism events. Since mosaicism identification is a complex process, numerous examples of multistep mosaicism investigations are presented and discussed. PMID:26481646

  17. Application of Next-generation Sequencing Technology in Forensic Science

    PubMed Central

    Yang, Yaran; Xie, Bingbing; Yan, Jiangwei

    2014-01-01

    Next-generation sequencing (NGS) technology, with its high-throughput capacity and low cost, has developed rapidly in recent years and become an important analytical tool for many genomics researchers. New opportunities in the research domain of the forensic studies emerge by harnessing the power of NGS technology, which can be applied to simultaneously analyzing multiple loci of forensic interest in different genetic contexts, such as autosomes, mitochondrial and sex chromosomes. Furthermore, NGS technology can also have potential applications in many other aspects of research. These include DNA database construction, ancestry and phenotypic inference, monozygotic twin studies, body fluid and species identification, and forensic animal, plant and microbiological analyses. Here we review the application of NGS technology in the field of forensic science with the aim of providing a reference for future forensics studies and practice. PMID:25462152

  18. The road from next-generation sequencing to personalized medicine

    PubMed Central

    Gonzalez-Garay, Manuel L.

    2015-01-01

    Moving from a traditional medical model of treating pathologies to an individualized predictive and preventive model of personalized medicine promises to reduce the healthcare cost on an overburdened and overwhelmed system. Next-generation sequencing (NGS) has the potential to accelerate the early detection of disorders and the identification of pharmacogenetics markers to customize treatments. This review explains the historical facts that led to the development of NGS along with the strengths and weaknesses of NGS, with a special emphasis on the analytical aspects used to process NGS data. There are solutions to all the steps necessary for performing NGS in the clinical context, and the majority of them are very efficient, but some crucial steps in the process need immediate attention. PMID:26000024

  19. Clinical application of next-generation sequencing for Mendelian diseases.

    PubMed

    Jamuar, Saumya Shekhar; Tan, Ene-Choo

    2015-01-01

    Over the past decade, next-generation sequencing (NGS) has led to an exponential increase in our understanding of the genetic basis of Mendelian diseases. NGS allows for the analysis of multiple regions of the genome in one single reaction and has been shown to be a cost-effective and efficient tool in investigating patients with Mendelian diseases. More recently, NGS has been successfully deployed in the clinics, with a reported diagnostic yield of ~25%. However, recommendations on clinical implementation of NGS are still evolving with numerous key challenges that impede the widespread use of genetics in everyday medicine. These challenges include when to order, on whom to order, what type of test to order, and how to interpret and communicate the results, including incidental findings, to the patient and family. In this review, we discuss these challenges and suggest guidelines on implementing NGS in the routine clinical workflow. PMID:26076878

  20. Next-Generation Sequencing in Genetic Hearing Loss

    PubMed Central

    Yan, Denise; Tekin, Mustafa; Blanton, Susan H.

    2013-01-01

    The advent of the $1000 genome has the potential to revolutionize the identification of genes and their mutations underlying genetic disorders. This is especially true for extremely heterogeneous Mendelian conditions such as deafness, where the mutation, and indeed the gene, may be private. The recent technological advances in target-enrichment methods and next generation sequencing offer a unique opportunity to break through the barriers of limitations imposed by gene arrays. These approaches now allow for the complete analysis of all known deafness-causing genes and will result in a new wave of discoveries of the remaining genes for Mendelian disorders. In this review, we describe commonly used genomic technologies as well as the application of these technologies to the genetic diagnosis of hearing loss (HL) and to the discovery of novel genes for syndromic and nonsyndromic HL. PMID:23738631

  1. Applications of Next-generation Sequencing in Systemic Autoimmune Diseases.

    PubMed

    Ma, Yiyangzi; Shi, Na; Li, Mengtao; Chen, Fei; Niu, Haitao

    2015-08-01

    Systemic autoimmune diseases are a group of heterogeneous disorders caused by both genetic and environmental factors. Although numerous causal genes have been identified by genome-wide association studies (GWAS), these susceptibility genes are correlated to a relatively low disease risk, indicating that environmental factors also play an important role in the pathogenesis of disease. The intestinal microbiome, as the main symbiotic ecosystem between the host and host-associated microorganisms, has been demonstrated to regulate the development of the body's immune system and is likely related to genetic mutations in systemic autoimmune diseases. Next-generation sequencing (NGS) technology, with high-throughput capacity and accuracy, provides a powerful tool to discover genomic mutations, abnormal transcription and intestinal microbiome identification for autoimmune diseases. In this review, we briefly outlined the applications of NGS in systemic autoimmune diseases. This review may provide a reference for future studies in the pathogenesis of systemic autoimmune diseases. PMID:26432094

  2. In vivo generation of DNA sequence diversity for cellular barcoding

    PubMed Central

    Peikon, Ian D.; Gizatullina, Diana I.; Zador, Anthony M.

    2014-01-01

    Heterogeneity is a ubiquitous feature of biological systems. A complete understanding of such systems requires a method for uniquely identifying and tracking individual components and their interactions with each other. We have developed a novel method of uniquely tagging individual cells in vivo with a genetic ‘barcode’ that can be recovered by DNA sequencing. Our method is a two-component system comprised of a genetic barcode cassette whose fragments are shuffled by Rci, a site-specific DNA invertase. The system is highly scalable, with the potential to generate theoretical diversities in the billions. We demonstrate the feasibility of this technique in Escherichia coli. Currently, this method could be employed to track the dynamics of populations of microbes through various bottlenecks. Advances of this method should prove useful in tracking interactions of cells within a network, and/or heterogeneity within complex biological samples. PMID:25013177

  3. Recommendations on e-infrastructures for next-generation sequencing.

    PubMed

    Spjuth, Ola; Bongcam-Rudloff, Erik; Dahlberg, Johan; Dahlö, Martin; Kallio, Aleksi; Pireddu, Luca; Vezzi, Francesco; Korpelainen, Eija

    2016-01-01

    With ever-increasing amounts of data being produced by next-generation sequencing (NGS) experiments, the requirements placed on supporting e-infrastructures have grown. In this work, we provide recommendations based on the collective experiences from participants in the EU COST Action SeqAhead for the tasks of data preprocessing, upstream processing, data delivery, and downstream analysis, as well as long-term storage and archiving. We cover demands on computational and storage resources, networks, software stacks, automation of analysis, education, and also discuss emerging trends in the field. E-infrastructures for NGS require substantial effort to set up and maintain over time, and with sequencing technologies and best practices for data analysis evolving rapidly it is important to prioritize both processing capacity and e-infrastructure flexibility when making strategic decisions to support the data analysis demands of tomorrow. Due to increasingly demanding technical requirements we recommend that e-infrastructure development and maintenance be handled by a professional service unit, be it internal or external to the organization, and emphasis should be placed on collaboration between researchers and IT professionals. PMID:27267963

  4. Applications of Next Generation Sequencing to Blood and Marrow Transplantation

    PubMed Central

    Chapman, Michael; Warren, Edus H.; Wu, Catherine J.

    2011-01-01

    Since the advent of next-generation sequencing (NGS) in 2005, there has been an explosion of published studies employing the technology to tackle previously intractable questions in many disparate biological fields. This has been coupled with technology development that has occurred at a remarkable pace. This review discusses the potential impact of this new technology on the field of blood and marrow stem cell transplantation. Hematologic malignancies have been at the forefront of those cancers whose genomes have been the subject of NGS. Hence, these studies have opened novel areas of biology that can be exploited for prognostic, diagnostic, and therapeutic means. Because of the unprecedented depth, resolution and accuracy achievable by NGS, this technology is well-suited for providing detailed information on the diversity of receptors that govern antigen recognition; this approach has the potential to contribute important insights into understanding the biologic effects of transplantation. Finally, the ability to perform comprehensive tumor sequencing provides a systematic approach to the discovery of genetic alterations that can encode peptides with restricted tumor expression, and hence serve as potential target antigens of GvL responses. Altogether, this increasingly affordable technology will undoubtedly impact the future practice and care of patients with hematologic malignancies. PMID:22226099

  5. Second-generation sequencing for gene discovery in the Brassicaceae.

    PubMed

    Hayward, Alice; Vighnesh, Guru; Delay, Christina; Samian, Mohd Rafizan; Manoli, Sahana; Stiller, Jiri; McKenzie, Megan; Edwards, David; Batley, Jacqueline

    2012-08-01

    The Brassicaceae contains the most diverse collection of agriculturally important crop species of all plant families. Yet, this is one of the few families that do not form functional symbiotic associations with mycorrhizal fungi in the soil for improved nutrient acquisition. The genes involved in this symbiosis were more recently recruited by legumes for symbiotic association with nitrogen-fixing rhizobia bacteria. This study applied second-generation sequencing (SGS) and analysis tools to discover that two such genes, NSP1 (Nodulation Signalling Pathway 1) and NSP2, remain conserved in diverse members of the Brassicaceae despite the absence of these symbioses. We demonstrate the utility of SGS data for the discovery of putative gene homologs and their analysis in complex polyploid crop genomes with little prior sequence information. Furthermore, we show how this data can be applied to enhance downstream reverse genetics analyses. We hypothesize that Brassica NSP genes may function in the root in other plant-microbe interaction pathways that were recruited for mycorrhizal and rhizobial symbioses during evolution. PMID:22765874

  6. A novel method for the multiplexed target enrichment of MinION next generation sequencing libraries using PCR-generated baits

    PubMed Central

    Karamitros, Timokratis; Magiorkinis, Gkikas

    2015-01-01

    The enrichment of targeted regions within complex next generation sequencing libraries commonly uses biotinylated baits to capture the desired sequences. This method results in high read coverage over the targets and their flanking regions. Oxford Nanopore Technologies recently released a USB 3.0-interfaced sequencer, the MinION. To date no particular method for enriching MinION libraries has been standardized. Here, using biotinylated PCR-generated baits in a novel approach, we describe a simple and efficient way for multiplexed enrichment of MinION libraries, overcoming technical limitations related to the chemistry of the sequencing adapters and the length of the DNA fragments. Using Phage Lambda and Escherichia coli as models we selectively enrich for specific targets, significantly increasing the corresponding read coverage and eliminating unwanted regions. We show that by capturing genomic fragments that contain the target sequences, we recover reads that extend beyond the targeted regions and can thus be used to determine potentially unknown flanking sequences. By pooling enriched libraries derived from two distinct E. coli strains and analyzing them in parallel, we demonstrate the efficiency of this method in multiplexed format. Crucially, we evaluated the optimal bait size for large fragment libraries, and we describe for the first time a standardized method for target enrichment on the MinION platform. PMID:26240383

  7. A novel method for the multiplexed target enrichment of MinION next generation sequencing libraries using PCR-generated baits.

    PubMed

    Karamitros, Timokratis; Magiorkinis, Gkikas

    2015-12-15

    The enrichment of targeted regions within complex next generation sequencing libraries commonly uses biotinylated baits to capture the desired sequences. This method results in high read coverage over the targets and their flanking regions. Oxford Nanopore Technologies recently released a USB 3.0-interfaced sequencer, the MinION. To date no particular method for enriching MinION libraries has been standardized. Here, using biotinylated PCR-generated baits in a novel approach, we describe a simple and efficient way for multiplexed enrichment of MinION libraries, overcoming technical limitations related to the chemistry of the sequencing adapters and the length of the DNA fragments. Using Phage Lambda and Escherichia coli as models we selectively enrich for specific targets, significantly increasing the corresponding read coverage and eliminating unwanted regions. We show that by capturing genomic fragments that contain the target sequences, we recover reads that extend beyond the targeted regions and can thus be used to determine potentially unknown flanking sequences. By pooling enriched libraries derived from two distinct E. coli strains and analyzing them in parallel, we demonstrate the efficiency of this method in multiplexed format. Crucially, we evaluated the optimal bait size for large fragment libraries, and we describe for the first time a standardized method for target enrichment on the MinION platform. PMID:26240383

  8. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms

    PubMed Central

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared the obtained sequence information with the available donkey draft genome (and the Illumina reads from which it originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions, we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated by combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450

  9. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    PubMed

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared the obtained sequence information with the available donkey draft genome (and the Illumina reads from which it originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions, we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated by combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450

  10. A targeted next-generation sequencing method for identifying clinically relevant mutation profiles in lung adenocarcinoma

    PubMed Central

    Shao, Di; Lin, Yongping; Liu, Jilong; Wan, Liang; Liu, Zu; Cheng, Shaomin; Fei, Lingna; Deng, Rongqing; Wang, Jian; Chen, Xi; Liu, Liping; Gu, Xia; Liang, Wenhua; He, Ping; Wang, Jun; Ye, Mingzhi; He, Jianxing

    2016-01-01

    Molecular profiling of lung cancer has become essential for prediction of an individual’s response to targeted therapies. Next-generation sequencing (NGS) is a promising technique for routine diagnostics, but has not been sufficiently evaluated in terms of feasibility, reliability, cost and capacity with routine diagnostic formalin-fixed, paraffin-embedded (FFPE) materials. Here, we report the validation and application of a test based on Ion Proton technology for the rapid characterisation of single nucleotide variations (SNVs), short insertions and deletions (InDels), copy number variations (CNVs), and gene rearrangements in 145 genes with FFPE clinical specimens. The validation study, using 61 previously profiled clinical tumour samples, showed a concordance rate of 100% between results obtained by NGS and conventional test platforms. Analysis of tumour cell lines indicated reliable mutation detection in samples with 5% tumour content. Furthermore, application of the panel to 58 clinical cases, identified at least one actionable mutation in 43 cases, 1.4 times the number of actionable alterations detected by current diagnostic tests. We demonstrated that targeted NGS is a cost-effective and rapid platform to detect multiple mutations simultaneously in various genes with high reproducibility and sensitivity. PMID:26936516

  11. Ocean colour products from geostationary platforms, opportunities with Meteosat Second and Third Generation

    NASA Astrophysics Data System (ADS)

    Kwiatkowska, E. J.; Ruddick, K.; Ramon, D.; Vanhellemont, Q.; Brockmann, C.; Lebreton, C.; Bonekamp, H. G.

    2015-12-01

    Ocean colour applications from medium-resolution polar-orbiting satellite sensors have now matured and evolved into operational services. The examples include the Sentinel-3 OLCI missions of the European Earth Observation Copernicus programme and the VIIRS missions of the US Joint Polar Satellite System programme. Key drivers for Copernicus ocean colour services are the national obligations of the EU member states to report on the quality of marine, coastal and inland waters for the EU Water Framework Directive and Marine Strategy Framework Directive. Further applications include CO2 sequestration, carbon cycle and climate, fisheries and aquaculture management, near-real-time alerting to harmful algae blooms, environmental monitoring and forecasting, and assessment of sediment transport in coastal waters. Ocean colour data from polar-orbiting satellite platforms, however, suffer from fractional coverage, primarily due to clouds, and inadequate resolution of quickly varying processes. Ocean colour remote sensing from geostationary platforms can provide significant improvements in coverage and sampling frequency and support new applications and services. EUMETSAT's SEVIRI instrument on the geostationary Meteosat Second Generation platforms (MSG) is not designed to meet ocean colour mission requirements; however, it has been demonstrated to provide a valuable contribution, particularly in combination with dedicated ocean colour polar observations. This paper describes the ongoing effort to develop operational ocean colour water turbidity and related products and user services from SEVIRI. A survey of user requirements and a study of technical capabilities and limitations of the SEVIRI instruments are the basis for this development and are described in this paper. The products will support monitoring of sediment transport, water clarity, and tidal dynamics. Further products and services are anticipated from EUMETSAT's FCI instruments on Meteosat Third Generation

  12. Ocean colour opportunities from Meteosat Second and Third Generation geostationary platforms

    NASA Astrophysics Data System (ADS)

    Kwiatkowska, Ewa J.; Ruddick, Kevin; Ramon, Didier; Vanhellemont, Quinten; Brockmann, Carsten; Lebreton, Carole; Bonekamp, Hans G.

    2016-05-01

    Ocean colour applications from medium-resolution polar-orbiting satellite sensors have now matured and evolved into operational services. These applications are enabled by the Sentinel-3 OLCI space sensors of the European Earth Observation Copernicus programme and the VIIRS sensors of the US Joint Polar Satellite System programme. Key drivers for the Copernicus ocean colour services are the national obligations of the EU member states to report on the quality of marine, coastal and inland waters for the EU Water Framework Directive and Marine Strategy Framework Directive. Further applications include CO2 sequestration, carbon cycle and climate, fisheries and aquaculture management, near-real-time alerting to harmful algal blooms, environmental monitoring and forecasting, and assessment of sediment transport in coastal waters. Ocean colour data from polar-orbiting satellite platforms, however, suffer from fractional coverage, primarily due to clouds, and inadequate resolution of quickly varying processes. Ocean colour remote sensing from geostationary platforms can provide significant improvements in coverage and sampling frequency and support new applications and services. EUMETSAT's SEVIRI instrument on the geostationary Meteosat Second Generation platforms (MSG) is not designed to meet ocean colour mission requirements; however, it has been shown to provide a valuable contribution, particularly in combination with dedicated ocean colour polar observations. This paper describes the ongoing effort to develop operational ocean colour water turbidity and related products and user services from SEVIRI. SEVIRI's multi-temporal capabilities can benefit users requiring improved local-area coverage and frequent diurnal observations. A survey of user requirements and a study of technical capabilities and limitations of the SEVIRI instruments are the basis for this development and are described in this paper. The products will support monitoring of sediment transport

  13. Strategies for Achieving High Sequencing Accuracy for Low Diversity Samples and Avoiding Sample Bleeding Using Illumina Platform

    PubMed Central

    Mitra, Abhishek; Skrzypczak, Magdalena; Ginalski, Krzysztof; Rowicka, Maga

    2015-01-01

    Sequencing microRNA, reduced representation sequencing, Hi-C technology and any method requiring the use of in-house barcodes result in sequencing libraries with low initial sequence diversity. Sequencing such libraries on the Illumina platform typically produces low quality data due to the limitations of the Illumina cluster calling algorithm. Moreover, even in the case of diverse samples, these limitations cause substantial inaccuracies in multiplexed sample assignment (sample bleeding). Such inaccuracies are unacceptable in clinical applications, and in some other fields (e.g. detection of rare variants). Here, we discuss how both the quality problems with low-diversity samples and sample bleeding are caused by incorrect detection of clusters on the flowcell during initial sequencing cycles. We propose simple software modifications (Long Template Protocol) that overcome this problem. We present experimental results showing that our Long Template Protocol remarkably increases data quality for low diversity samples, as compared with the standard analysis protocol; it also substantially reduces sample bleeding for all samples. For comprehensiveness, we also discuss and compare experimental results from alternative approaches to sequencing low diversity samples. First, we discuss how the low diversity problem, if caused by barcodes, can be avoided altogether at the barcode design stage. Second and third, we present modified guidelines, which are more stringent than the manufacturer’s, for mixing low diversity samples with diverse samples and lowering cluster density, which in our experience consistently produces high quality data from low diversity samples. Fourth and fifth, we present rescue strategies that can be applied when sequencing results in low quality data and when there is no more biological material available. In such cases, we propose that the flowcell be re-hybridized and sequenced again using our Long Template Protocol. Alternatively, we discuss how

  14. Second generation drug-eluting stents: a review of the everolimus-eluting platform.

    PubMed

    Whitbeck, Matthew G; Applegate, Robert J

    2013-01-01

    Everolimus-eluting stents (EES) represent the next generation of drug-eluting stents (DES). Important design modifications include thin strut stent backbones, less inflammatory and more biocompatible polymers, and lower drug dosing. The cobalt chromium EES fluoropolymer XIENCE V stent has been the most extensively studied of such stents. In animal models, this stent demonstrated minimal vessel inflammation, a biologically active endothelium with strut coverage similar to a bare metal stent, and inhibition of intimal hyperplasia comparable to that seen with sirolimus-eluting stents. The SPIRIT family of clinical trials demonstrated low rates of late loss and clinical restenosis, as well as low rates of very late stent thrombosis. These excellent clinical outcomes addressed limitations of the first-generation DES and substantiated widespread clinical use of the EES platform. PMID:23926441

  15. Evolution of a Reconfigurable Processing Platform for a Next Generation Space Software Defined Radio

    NASA Technical Reports Server (NTRS)

    Kacpura, Thomas J.; Downey, Joseph A.; Anderson, Keffery R.; Baldwin, Keith

    2014-01-01

    The National Aeronautics and Space Administration (NASA) Harris Ka-Band Software Defined Radio (SDR) is the first fully reprogrammable space-qualified SDR operating in the Ka-Band frequency range. Providing exceptionally higher data communication rates than previously possible, this SDR offers in-orbit reconfiguration, multi-waveform operation, and fast deployment due to its highly modular hardware and software architecture. Currently in operation on the International Space Station (ISS), this new paradigm of reconfigurable technology is enabling experimenters to investigate navigation and networking in the space environment. The modular SDR and the NASA-developed Space Telecommunications Radio System (STRS) architecture standard are the basis for Harris' reusable digital signal processing space platform, trademarked as AppSTAR. As a result, two new space radio products are a synthetic aperture radar payload and an Automatic Dependent Surveillance-Broadcast (ADS-B) receiver. In addition, Harris is currently developing many new products similar to the Ka-Band software defined radio for other applications. For NASA's next generation flight Ka-Band radio development, leveraging these advancements could lead to a more robust and more capable software defined radio. The space environment has special considerations different from terrestrial applications that must be considered for any system operated in space. Each space mission has unique requirements that can make these systems unique. These unique requirements can make products that are expensive and limited in reuse. Space systems put a premium on size, weight and power. A key trade is the amount of reconfigurability in a space system. The more reconfigurable the hardware platform, the easier it is to adapt the platform to the next mission, and this reduces the amount of non-recurring engineering costs. However, the more reconfigurable platforms often use more spacecraft resources. Software has similar considerations

  16. Scanning the effects of ethyl methanesulfonate on the whole genome of Lotus japonicus using second-generation sequencing analysis.

    PubMed

    Mohd-Yusoff, Nur Fatihah; Ruperao, Pradeep; Tomoyoshi, Nurain Emylia; Edwards, David; Gresshoff, Peter M; Biswas, Bandana; Batley, Jacqueline

    2015-04-01

    Genetic structure can be altered by chemical mutagenesis, which is a common method applied in molecular biology and genetics. Second-generation sequencing provides a platform to reveal base alterations occurring in the whole genome due to mutagenesis. A model legume, Lotus japonicus ecotype Miyakojima, was chemically mutated with alkylating ethyl methanesulfonate (EMS) for the scanning of DNA lesions throughout the genome. Using second-generation sequencing, two individually mutated third-generation progeny (M3, named AM and AS) were sequenced and analyzed to identify single nucleotide polymorphisms and reveal the effects of EMS on nucleotide sequences in these mutant genomes. Single-nucleotide polymorphisms were found once every 208 kb (AS) and 202 kb (AM), with a mutational bias toward G/C-to-A/T changes at low percentage. Most mutations were intergenic. The mutation spectrum of the genomes was comparable in their individual chromosomes; however, each mutated genome has unique alterations, which are useful to identify causal mutations for their phenotypic changes. The data obtained demonstrate that whole genomic sequencing is applicable as a high-throughput tool to investigate genomic changes due to mutagenesis. The identification of these single-point mutations will facilitate the identification of phenotypically causative mutations in EMS-mutated germplasm. PMID:25660167

  17. Integrating chemical mutagenesis and whole genome sequencing as a platform for forward and reverse genetic analysis of Chlamydia

    PubMed Central

    Kokes, Marcela; Dunn, Joe Dan; Granek, Joshua A.; Nguyen, Bidong D.; Barker, Jeffrey R.; Valdivia, Raphael H.; Bastidas, Robert J.

    2015-01-01

    SUMMARY Gene inactivation by transposon insertion or allelic exchange is a powerful approach to probe gene function. Unfortunately, many microbes, including Chlamydia, are not amenable to routine molecular genetic manipulations. Here we describe an arrayed library of chemically-induced mutants of the genetically-intransigent pathogen Chlamydia trachomatis, in which all mutations have been identified by whole genome sequencing, providing a platform for reverse genetic applications. An analysis of possible loss-of-function mutations in the collection uncovered plasticity in the central metabolic properties of this obligate intracellular pathogen. We also describe the use of the library in a forward genetic screen that identified InaC as a bacterial factor that binds host ARF and 14-3-3 proteins to modulate F-actin assembly and Golgi redistribution around the pathogenic vacuole. This work provides a robust platform for reverse and forward genetic approaches in Chlamydia and should serve as a valuable resource to the community. PMID:25920978

  18. Splicing Express: a software suite for alternative splicing analysis using next-generation sequencing data

    PubMed Central

    Kroll, Jose E.; Kim, Jihoon; Ohno-Machado, Lucila

    2015-01-01

    Motivation. Alternative splicing events (ASEs) are prevalent in the transcriptome of eukaryotic species and are known to influence many biological phenomena. The identification and quantification of these events are crucial for a better understanding of biological processes. Next-generation DNA sequencing technologies have allowed deep characterization of transcriptomes and made it possible to address these issues. ASE analysis, however, represents a challenging task, especially when many different samples need to be compared. Some popular tools for the analysis of ASEs are known to report thousands of events without annotations and/or graphical representations. A new tool for the identification and visualization of ASEs is described here, which can be used by biologists without a solid bioinformatics background. Results. A software suite named Splicing Express was created to perform ASE analysis from transcriptome sequencing data derived from next-generation DNA sequencing platforms. Its major goal is to serve the needs of biomedical researchers who do not have bioinformatics skills. Splicing Express performs automatic annotation of transcriptome data (GTF files) using gene coordinates available from the UCSC genome browser and allows the analysis of data from all available species. The identification of ASEs is done by a known algorithm previously implemented in another tool named Splooce. As a final result, Splicing Express creates a set of HTML files composed of graphics and tables designed to describe the expression profile of ASEs among all analyzed samples. By using RNA-Seq data from the Illumina Human Body Map and the Rat Body Map, we show that Splicing Express is able to perform all tasks in a straightforward way, identifying well-known specific events. Availability and Implementation. Splicing Express is written in Perl and runs only on UNIX-like systems. More details can be found at: http

  19. Authentication of Herbal Supplements Using Next-Generation Sequencing

    PubMed Central

    Braukmann, Thomas W. A.; Borisenko, Alex V.; Zakharov, Evgeny V.

    2016-01-01

    Background DNA-based testing has been gaining acceptance as a tool for authentication of a wide range of food products; however, its applicability for testing of herbal supplements remains contentious. Methods We utilized Sanger and Next-Generation Sequencing (NGS) for taxonomic authentication of fifteen herbal supplements representing three different producers from five medicinal plants: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-graecum. Experimental design included three modifications of DNA extraction, two lysate dilutions, Internal Amplification Control, and multiple negative controls to exclude background contamination. Ginkgo supplements were also analyzed using HPLC-MS for the presence of active medicinal components. Results All supplements yielded DNA from multiple species, rendering Sanger sequencing results for rbcL and ITS2 regions either uninterpretable or non-reproducible between the experimental replicates. Overall, DNA from the manufacturer-listed medicinal plants was successfully detected in seven out of eight dry herb form supplements; however, low or poor DNA recovery due to degradation was observed in most plant extracts (none detected by Sanger; three out of seven by NGS). NGS also revealed a diverse community of fungi, known to be associated with live plant material and/or the fermentation process used in the production of plant extracts. HPLC-MS testing demonstrated that Ginkgo supplements with degraded DNA contained ten key medicinal components. Conclusion Quality control of herbal supplements should utilize a synergetic approach targeting both DNA and bioactive components, especially for standardized extracts with degraded DNA. The NGS workflow developed in this study enables reliable detection of plant and fungal DNA and can be utilized by manufacturers for quality assurance of raw plant materials, contamination control during the production process, and the final product

  20. Statistical signatures of aftershock sequences generated by supershear mainshocks

    NASA Astrophysics Data System (ADS)

    Bhattacharya, P.; Shcherbakov, R.; Tiampo, K. F.; Mansinha, L.

    2010-12-01

    The rupture process during supershear earthquakes generates a seismic shock wave that redistributes stress away from the fault, resembling the sonic boom produced by a supersonic aircraft. This leads to a relative quiescence in aftershock activity along the supershear segment of the rupture. The occurrence of supershear ruptures is also generally associated with a region of local high pre-stress and an unusually smooth friction profile over the supershear segment, leading to a conspicuous absence of high frequency ground motions. We have considered the aftershock sequences of five well-known supershear earthquakes from around the world (1979 Imperial Valley, 1992 Landers, 1999 Izmit and Duzce and 2002 Denali earthquakes) to test whether the aftershock statistics around the supershear rupture are different from the statistics in the rest of the region due to the aforementioned stress conditions and redistributions. Specifically, we have looked at the frequency-magnitude distribution in order to study the variation of the b value for each of the sequences and observe statistically significant variations. In particular, we have determined that the b value is always higher in the zone surrounding a supershear segment than in the rest of the aftershock region. The Omori Law, however, does not show such clear trends. We also looked at the average difference in magnitude between the mainshock and the largest aftershock and found it is larger than that predicted by Båth's law. The results certainly point towards a relationship between aftershock statistics and the mainshock rupture process and might facilitate a physical-process-based understanding of the empirical laws of earthquake statistics.
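
    The b value discussed in this record is the slope of the Gutenberg-Richter frequency-magnitude relation. The abstract does not say how the authors estimated it; the sketch below assumes the widely used Aki maximum-likelihood estimator with Utsu's correction for binned magnitudes. The function name, parameters and sample magnitudes are hypothetical and are not taken from the study.

```python
import math


def b_value_mle(magnitudes, completeness_mag, bin_width=0.0):
    """Aki/Utsu maximum-likelihood b value (assumed estimator, not the authors').

    magnitudes       -- iterable of event magnitudes
    completeness_mag -- magnitude of completeness Mc; smaller events are ignored
    bin_width        -- magnitude rounding interval (0.0 for unbinned data)
    """
    selected = [m for m in magnitudes if m >= completeness_mag]
    if not selected:
        raise ValueError("no events at or above the completeness magnitude")
    mean_mag = sum(selected) / len(selected)
    # Utsu's correction shifts Mc down by half a bin for rounded magnitudes.
    denom = mean_mag - (completeness_mag - bin_width / 2.0)
    if denom <= 0:
        raise ValueError("mean magnitude must exceed the corrected Mc")
    return math.log10(math.e) / denom


# Hypothetical magnitudes, for illustration only.
example_b = b_value_mle([3.0, 3.5, 4.0, 4.5], completeness_mag=3.0)
```

    Comparing estimates for events inside and outside a zone of interest, with the same completeness magnitude, is one way such a contrast in b value could be examined.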

  1. The first FDA marketing authorizations of next-generation sequencing technology and tests: challenges, solutions and impact for future assays.

    PubMed

    Bijwaard, Karen; Dickey, Jennifer S; Kelm, Kellie; Težak, Živana

    2015-01-01

    The rapid emergence and clinical translation of novel high-throughput sequencing technologies created a need to clarify the regulatory pathway for the evaluation and authorization of these unique technologies. Recently, the US FDA authorized for marketing four next generation sequencing (NGS)-based diagnostic devices, which consisted of two heritable disease-specific assays, library preparation reagents and an NGS platform that are intended for human germline targeted sequencing from whole blood. These first authorizations can serve as a case study in how different types of NGS-based technology are reviewed by the FDA. In this manuscript we describe challenges associated with the evaluation of these novel technologies and provide an overview of what was reviewed. Besides making validated NGS-based devices available for in vitro diagnostic use, these first authorizations create a regulatory path for similar future instruments and assays. PMID:25370936

  2. Building a geological reference platform using sequence stratigraphy combined with geostatistical tools

    NASA Astrophysics Data System (ADS)

    Bourgine, Bernard; Lasseur, Éric; Leynet, Aurélien; Badinier, Guillaume; Ortega, Carole; Issautier, Benoit; Bouchet, Valentin

    2015-04-01

    In 2012 BRGM launched an extensive program to build the new French Geological Reference platform (RGF). Among the objectives of this program is to provide the public with validated, reliable and 3D-consistent geological data, with estimation of uncertainty. Approx. 100,000 boreholes over the whole French national territory provide a preliminary interpretation in terms of depths of main geological interfaces, but with an unchecked, unknown and often low reliability. The aim of this paper is to present the procedure that has been tested on two areas in France, in order to validate (or not) these boreholes, with the aim of being generalized as much as possible to the nearly 100,000 boreholes waiting for validation. The approach is based on the following steps, and includes the management of uncertainty at different steps: (a) Selection of a loose network of boreholes having logging or coring information that enables a reliable interpretation. This first interpretation is based on the correlation of well log data and allows a 3D sequence stratigraphic framework identifying isochronous surfaces to be defined. A litho-stratigraphic interpretation is also performed. Let "A" be the collection of all boreholes used for this step (typically 3 % of the total number of holes to be validated) and "B" the other boreholes to validate, (b) Geostatistical analysis of characteristic geological interfaces. The analysis is carried out firstly on the "A" type data (to validate the variogram model), then on the "B" type data and finally on "B" knowing "A". It is based on cross-validation tests and evaluation of the uncertainty associated with each geological interface. In this step, we take into account inequality constraints provided by boreholes that do not intersect all interfaces, as well as the "litho-stratigraphic pile" defining the formations and their relationships (depositing surfaces or erosion). The goal is to identify quickly and semi-automatically potential errors among the data, up to

  3. SNP Discovery Using Next Generation Transcriptomic Sequencing in Atlantic Herring (Clupea harengus)

    PubMed Central

    Bekkevold, Dorte; Babbucci, Massimiliano; van Houdt, Jeroen; Maes, Gregory E.; Bargelloni, Luca; Nielsen, Rasmus O.; Taylor, Martin I.; Ogden, Rob; Cariani, Alessia; Carvalho, Gary R.; Consortium, FishPopTrace; Panitz, Frank

    2012-01-01

    The introduction of Next Generation Sequencing (NGS) has revolutionised population genetics, providing studies of non-model species with unprecedented genomic coverage, allowing evolutionary biologists to address questions previously far beyond the reach of available resources. Furthermore, the simple mutation model of Single Nucleotide Polymorphisms (SNPs) permits cost-effective high-throughput genotyping in thousands of individuals simultaneously. Genomic resources are scarce for the Atlantic herring (Clupea harengus), a small pelagic species that sustains high revenue fisheries. This paper details the development of 578 SNPs using a combined NGS and high-throughput genotyping approach. Eight individuals covering the species distribution in the eastern Atlantic were bar-coded and multiplexed into a single cDNA library and sequenced using the 454 GS FLX platform. SNP discovery was performed by de novo sequence clustering and contig assembly, followed by the mapping of reads against consensus contig sequences. Selection of candidate SNPs for genotyping was conducted using an in silico approach. SNP validation and genotyping were performed simultaneously using an Illumina 1,536 GoldenGate assay. Although the conversion rate of candidate SNPs in the genotyping assay cannot be predicted in advance, this approach has the potential to maximise cost and time efficiencies by avoiding expensive and time-consuming laboratory stages of SNP validation. Additionally, the in silico approach leads to lower ascertainment bias in the resulting SNP panel as marker selection is based only on the ability to design primers and the predicted presence of intron-exon boundaries. Consequently SNPs with a wider spectrum of minor allele frequencies (MAFs) will be genotyped in the final panel. The genomic resources presented here represent a valuable multi-purpose resource for developing informative marker panels for population discrimination, microarray development and for population

  4. A systems approach to designing next generation vaccines: combining α-galactose modified antigens with nanoparticle platforms

    NASA Astrophysics Data System (ADS)

    Phanse, Yashdeep; Carrillo-Conde, Brenda R.; Ramer-Tait, Amanda E.; Broderick, Scott; Kong, Chang Sun; Rajan, Krishna; Flick, Ramon; Mandell, Robert B.; Narasimhan, Balaji; Wannemuehler, Michael J.

    2014-01-01

    Innovative vaccine platforms are needed to develop effective countermeasures against emerging and re-emerging diseases. These platforms should direct antigen internalization by antigen presenting cells and promote immunogenic responses. This work describes an innovative systems approach combining two novel platforms, αGalactose (αGal)-modification of antigens and amphiphilic polyanhydride nanoparticles as vaccine delivery vehicles, to rationally design vaccine formulations. Regimens comprising soluble αGal-modified antigen and nanoparticle-encapsulated unmodified antigen induced a high titer, high avidity antibody response with broader epitope recognition of antigenic peptides than other regimens. Proliferation of antigen-specific CD4+ T cells was also enhanced compared to a traditional adjuvant. Combining the technology platforms and augmenting immune response studies with peptide arrays and informatics analysis provides a new paradigm for rational, systems-based design of next generation vaccine platforms against emerging and re-emerging pathogens.

  5. A systems approach to designing next generation vaccines: combining α-galactose modified antigens with nanoparticle platforms

    PubMed Central

    Phanse, Yashdeep; Carrillo-Conde, Brenda R.; Ramer-Tait, Amanda E.; Broderick, Scott; Kong, Chang Sun; Rajan, Krishna; Flick, Ramon; Mandell, Robert B.; Narasimhan, Balaji; Wannemuehler, Michael J.

    2014-01-01

    Innovative vaccine platforms are needed to develop effective countermeasures against emerging and re-emerging diseases. These platforms should direct antigen internalization by antigen presenting cells and promote immunogenic responses. This work describes an innovative systems approach combining two novel platforms, αGalactose (αGal)-modification of antigens and amphiphilic polyanhydride nanoparticles as vaccine delivery vehicles, to rationally design vaccine formulations. Regimens comprising soluble αGal-modified antigen and nanoparticle-encapsulated unmodified antigen induced a high titer, high avidity antibody response with broader epitope recognition of antigenic peptides than other regimens. Proliferation of antigen-specific CD4+ T cells was also enhanced compared to a traditional adjuvant. Combining the technology platforms and augmenting immune response studies with peptide arrays and informatics analysis provides a new paradigm for rational, systems-based design of next generation vaccine platforms against emerging and re-emerging pathogens. PMID:24441019

  6. RefNetBuilder: a platform for construction of integrated reference gene regulatory networks from expressed sequence tags

    PubMed Central

    2011-01-01

    Background Gene Regulatory Networks (GRNs) provide integrated views of gene interactions that control biological processes. Many public databases contain biological interactions extracted from experimentally validated literature reports, but most furnish information for only a few genetic model organisms. In order to provide a bioinformatic tool for researchers who work with non-model organisms, we developed RefNetBuilder, a new platform that allows construction of putative reference pathways or GRNs from expressed sequence tags (ESTs). Results RefNetBuilder was designed to have the flexibility to extract and archive pathway or GRN information from public databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG). It features sequence alignment tools such as BLAST to allow mapping ESTs to pathways and GRNs in model organisms. A scoring algorithm was incorporated to rank and select the best match for each query EST. We validated RefNetBuilder using DNA sequences of Caenorhabditis elegans, a model organism having manually curated KEGG pathways. Using the earthworm Eisenia fetida as an example, we demonstrated the functionalities and features of RefNetBuilder. Conclusions RefNetBuilder provides a standalone application for building reference GRNs for non-model organisms on a number of operating system platforms with standard desktop computer hardware. As a new bioinformatic tool aimed at constructing putative GRNs for non-model organisms that have only ESTs available, RefNetBuilder is especially useful for exploring pathway- or network-related information in these organisms. PMID:22166047

  7. DNA Qualification Workflow for Next Generation Sequencing of Histopathological Samples

    PubMed Central

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T.; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double-stranded DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instruments to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for

  8. DNA qualification workflow for next generation sequencing of histopathological samples.

    PubMed

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double-stranded DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instruments to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for

  9. Coupled high-throughput functional screening and next generation sequencing for identification of plant polymer decomposing enzymes in metagenomic libraries

    PubMed Central

    Nyyssönen, Mari; Tran, Huu M.; Karaoz, Ulas; Weihe, Claudia; Hadi, Masood Z.; Martiny, Jennifer B. H.; Martiny, Adam C.; Brodie, Eoin L.

    2013-01-01

    Recent advances in sequencing technologies generate new predictions and hypotheses about the functional roles of environmental microorganisms. Yet, until we can test these predictions at a scale that matches our ability to generate them, most of them will remain as hypotheses. Function-based mining of metagenomic libraries can provide direct linkages between genes, metabolic traits and microbial taxa and thus bridge this gap between sequence data generation and functional predictions. Here we developed high-throughput screening assays for function-based characterization of activities involved in plant polymer decomposition from environmental metagenomic libraries. The multiplexed assays use fluorogenic and chromogenic substrates, combine automated liquid handling and use a genetically modified expression host to enable simultaneous screening of 12,160 clones for 14 activities in a total of 170,240 reactions. Using this platform we identified 374 (0.26%) cellulose, hemicellulose, chitin, starch, phosphate and protein hydrolyzing clones from fosmid libraries prepared from decomposing leaf litter. Sequencing on the Illumina MiSeq platform, followed by assembly and gene prediction of a subset of 95 fosmid clones, identified a broad range of bacterial phyla, including Actinobacteria, Bacteroidetes, multiple Proteobacteria sub-phyla in addition to some Fungi. Carbohydrate-active enzyme genes from 20 different glycoside hydrolase (GH) families were detected. Using tetranucleotide frequency (TNF) binning of fosmid sequences, multiple enzyme activities from distinct fosmids were linked, demonstrating how biochemically-confirmed functional traits in environmental metagenomes may be attributed to groups of specific organisms. Overall, our results demonstrate how functional screening of metagenomic libraries can be used to connect microbial functionality to community composition and, as a result, complement large-scale metagenomic sequencing efforts. PMID:24069019
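
    The tetranucleotide frequency (TNF) comparison mentioned above can be illustrated with a minimal sketch. It assumes plain normalized 4-mer frequency vectors compared by Pearson correlation; the abstract does not state which binning procedure or similarity measure the authors used, and the names tnf_vector and pearson are hypothetical.

```python
from itertools import product
import math

# All 256 tetranucleotides in a fixed order, so vectors are comparable.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {t: i for i, t in enumerate(TETRAMERS)}


def tnf_vector(seq):
    """Return the normalized tetranucleotide frequency vector of a sequence."""
    counts = [0] * len(TETRAMERS)
    seq = seq.upper()
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])  # windows with N or other symbols are skipped
        if idx is not None:
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(TETRAMERS)


def pearson(u, v):
    """Pearson correlation between two equal-length vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0
```

    Under these assumptions, fosmids whose TNF vectors correlate strongly would be candidates for the same bin; a real analysis would typically also merge reverse-complement tetramers and apply a clustering step.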

  10. Pash 2.0: scaleable sequence anchoring for next-generation sequencing technologies.

    PubMed

    Coarfa, Cristian; Milosavljevic, Aleksandar

    2008-01-01

    Many applications of next-generation sequencing technologies involve anchoring of a sequence fragment or a tag onto a corresponding position on a reference genome assembly. The Positional Hashing method, implemented in the Pash 2.0 program, is specifically designed for the task of high-volume anchoring. In this article we present multi-diagonal gapped k-mer collation and other improvements introduced in Pash 2.0 that further improve the accuracy and speed of Positional Hashing. The goal of this article is to show that gapped k-mer matching with cross-diagonal collation suffices for anchoring across close evolutionary distances and for the purpose of human resequencing. We propose a benchmark for evaluating the performance of anchoring programs that captures key parameters in specific applications, including the duplicative structure of genomes of humans and other species. We demonstrate speedups of up to tenfold in large-scale anchoring experiments achieved by Pash 2.0 when compared to BLAT, another similarity search program frequently used for anchoring. PMID:18229679
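
    The general idea of gapped k-mer matching with diagonal collation can be sketched as follows. This is a simplified illustration and not the Pash 2.0 implementation: the seed mask, the function names, and the band parameter (which sums hits over neighboring diagonals as a rough stand-in for cross-diagonal collation) are all assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical spaced-seed mask: 1 = sampled position, 0 = ignored ("gap").
MASK = "1101011"


def gapped_kmer(seq, start, mask=MASK):
    """Extract the characters of seq under the sampled positions of the mask."""
    return "".join(seq[start + i] for i, m in enumerate(mask) if m == "1")


def build_index(reference, mask=MASK):
    """Map each gapped k-mer to the reference positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - len(mask) + 1):
        index[gapped_kmer(reference, pos, mask)].append(pos)
    return index


def anchor(read, index, mask=MASK, band=0):
    """Return (reference_offset, hit_count) for the best-supported diagonal.

    With band > 0, hits on diagonals within +/- band of the candidate are
    added to its score, which tolerates small insertions and deletions.
    """
    diagonals = Counter()
    for read_pos in range(len(read) - len(mask) + 1):
        for ref_pos in index.get(gapped_kmer(read, read_pos, mask), ()):
            diagonals[ref_pos - read_pos] += 1  # hits on one diagonal share this offset
    if not diagonals:
        return None

    def banded(d):
        return sum(diagonals.get(d + s, 0) for s in range(-band, band + 1))

    best = max(diagonals, key=lambda d: (banded(d), diagonals[d]))
    return best, banded(best)
```

    A read is anchored at the reference offset whose diagonal collects the most seed hits; because the mask ignores some positions, a mismatch under a gap does not break the seed.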

  11. Next-Generation Sequencing in the Understanding of Kaposi’s Sarcoma-Associated Herpesvirus (KSHV) Biology

    PubMed Central

    Strahan, Roxanne; Uppal, Timsy; Verma, Subhash C.

    2016-01-01

    Non-Sanger-based novel nucleic acid sequencing techniques, referred to as Next-Generation Sequencing (NGS), provide a rapid, reliable, high-throughput, and massively parallel sequencing methodology that has improved our understanding of human cancers and cancer-related viruses. NGS has become a quintessential research tool for more effective characterization of complex viral and host genomes through its ever-expanding repertoire, which consists of whole-genome sequencing, whole-transcriptome sequencing, and whole-epigenome sequencing. These new NGS platforms provide a comprehensive and systematic genome-wide analysis of genomic sequences and a full transcriptional profile at a single nucleotide resolution. When combined, these techniques help unlock the function of novel genes and the related pathways that contribute to the overall viral pathogenesis. Ongoing research in the field of virology endeavors to identify the role of various underlying mechanisms that control the regulation of the herpesvirus biphasic lifecycle in order to discover potential therapeutic targets and treatment strategies. In this review, we have compiled the most recent findings about the application of NGS in Kaposi’s sarcoma-associated herpesvirus (KSHV) biology, including identification of novel genomic features and whole-genome KSHV diversities, global gene regulatory network profiling for intricate transcriptome analyses, and surveying of epigenetic marks (DNA methylation, modified histones, and chromatin remodelers) during de novo, latent, and productive KSHV infections. PMID:27043613

  12. Using a CD-like microfluidic platform for uniform calcium alginate drug carrier generation

    NASA Astrophysics Data System (ADS)

    Liu, Ming-Kai; Huang, Keng-Shiang; Chang, Jia-Yaw; Wu, Chun-Han; Lin, Yu-Cheng

    2007-01-01

    In this paper the manipulation of monodisperse Ca-alginate microparticles using a polymer-based CD-like microfluidic platform and a reaction of external gelation is presented. Our strategy was based on associating the rapid injection molding process for cross-junction microchannels with the sheath focusing effect to form uniform water-in-oil (w/o) emulsions. These fine emulsions, consisting of 1.5% w/v Na-alginate, were then dripped into an oil solution containing 20% w/v calcium chloride (CaCl2) to produce Ca-alginate microspheres in an efficient manner. We have demonstrated that one can control the size of Ca-alginate microparticles from 20 µm to 50 µm in diameter (with a variation of less than 10%) by altering the relative sheath/sample flow rate ratio. Experimental data showed that for a given fixed dispersed phase flow (sample flow), the emulsion size decreased as the average velocity of the continuous phase (oil flow) increased. The proposed CD-like microfluidic platform is capable of generating relatively uniform microdroplets and has the advantages of active control of droplet diameter, a simple and low-cost process, and high throughput.

  13. A platform for rapid generation of single and multiplexed reporters in human iPSC lines

    PubMed Central

    Pei, Ying; Sierra, Guadalupe; Sivapatham, Renuka; Swistowski, Andrzej; Rao, Mahendra S.; Zeng, Xianmin

    2015-01-01

    Induced pluripotent stem cells (iPSC) are important tools for drug discovery assays and toxicology screens. In this manuscript, we design high-efficiency TALEN and ZFN to target two safe harbor sites on chromosomes 13 and 19 in a widely available and well-characterized integration-free iPSC line. We show that these sites can be targeted in multiple iPSC lines to generate reporter systems while retaining pluripotent characteristics. We extend this concept to making lineage reporters using a C-terminal targeting strategy to endogenous genes that are expressed in a lineage-specific fashion. Furthermore, we demonstrate that we can develop a master cell line strategy and then use a Cre-recombinase induced cassette exchange strategy to rapidly exchange reporter cassettes to develop new reporter lines in the same isogenic background at high efficiency. Equally important, we show that this recombination strategy allows targeting at progenitor cell stages, further increasing the utility of the platform system. Together, the results provide a novel platform for rapidly developing custom single or dual reporter systems for screening assays. PMID:25777362

  14. A Computer Program for Generating Sequences of Primary Arithmetic Facts in Random Order.

    ERIC Educational Resources Information Center

    Burns, Edward

    A computer program which generates randomly sequenced problems for testing the abilities of students to add, subtract, and multiply one-digit numbers is described. Appendices provide tables of random sequences with directions for using the tables. The 54-statement FORTRAN program which can be used in generating additional sequences is also…
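
    The kind of program described above can be sketched in a few lines. The sketch below is written in Python rather than the FORTRAN of the original report, and it is not a reconstruction of that program: the function names, the seed parameter, and the choice to keep only non-negative differences for subtraction are assumptions made for illustration.

```python
import random


def arithmetic_facts(operation):
    """Return every one-digit fact for '+', '-' or '*' as (a, b, answer) tuples."""
    facts = []
    for a in range(10):
        for b in range(10):
            if operation == "+":
                facts.append((a, b, a + b))
            elif operation == "-" and a >= b:  # assumption: keep differences non-negative
                facts.append((a, b, a - b))
            elif operation == "*":
                facts.append((a, b, a * b))
    return facts


def random_sequence(operation, seed=None):
    """Return all facts for one operation in random order, each appearing once."""
    facts = arithmetic_facts(operation)
    random.Random(seed).shuffle(facts)  # a fixed seed makes the sequence reproducible
    return facts
```

    Passing a seed allows the same test sequence to be regenerated later, which plays a role similar to the printed tables of random sequences mentioned in the abstract.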

  15. Whole genome sequencing of enriched chloroplast DNA using the Illumina GAII platform

    PubMed Central

    2010-01-01

    Background: Complete chloroplast genome sequences provide a valuable source of molecular markers for studies in molecular ecology and evolution of plants. To obtain complete genome sequences, recent studies have made use of the polymerase chain reaction to amplify overlapping fragments from conserved gene loci. However, this approach is time consuming and can be more difficult to implement where gene organisation differs among plants. An alternative approach is to first isolate chloroplasts and then use the capacity of high-throughput sequencing to obtain complete genome sequences. We report our findings from studies of the latter approach, which used a simple chloroplast isolation procedure, multiply-primed rolling circle amplification of chloroplast DNA, Illumina Genome Analyzer II sequencing, and de novo assembly of paired-end sequence reads. Results: A modified rapid chloroplast isolation protocol was used to obtain plant DNA that was enriched for chloroplast DNA, but nevertheless contained nuclear and mitochondrial DNA. Multiply-primed rolling circle amplification of this mixed template produced sufficient quantities of chloroplast DNA, even when the amount of starting material was small, and improved the template quality for Illumina Genome Analyzer II (hereafter Illumina GAII) sequencing. We demonstrate, using independent samples of karaka (Corynocarpus laevigatus), that there is high fidelity in the sequence obtained from this template. Although less than 20% of our sequenced reads could be mapped to the chloroplast genome, it was relatively easy to assemble complete chloroplast genome sequences from the mixture of nuclear, mitochondrial and chloroplast reads. Conclusions: We report successful whole genome sequencing of chloroplast DNA from karaka, obtained efficiently and with high fidelity. PMID:20920211

  16. IDBA-MT: de novo assembler for metatranscriptomic data generated from next-generation sequencing technology.

    PubMed

    Leung, Henry C M; Yiu, Siu-Ming; Parkinson, John; Chin, Francis Y L

    2013-07-01

    High-throughput next-generation sequencing technology provides a great opportunity for analyzing metatranscriptomic data. However, the reads produced by these technologies are short and an assembly step is required to combine the short reads into longer contigs. As there are many repeat patterns in mRNAs from different genomes and the abundance ratio of mRNAs in a sample varies widely, existing assemblers for genomic data, transcriptomic data, and metagenomic data do not work on metatranscriptomic data and produce chimeric contigs, that is, incorrect contigs formed by merging multiple mRNA sequences. To the best of our knowledge, there is no assembler designed for metatranscriptomic data. In this article, we introduce an assembler called IDBA-MT, which is designed for assembling reads from metatranscriptomic data. IDBA-MT produces far fewer chimeric contigs (a reduction of 50% or more) when compared with existing assemblers such as Oases, IDBA-UD, and Trinity. PMID:23829653

  17. Illumina next generation sequencing data and expression microarrays data from retinoblastoma and medulloblastoma tissues.

    PubMed

    García-Chequer, A J; Méndez-Tenorio, A; Olguín-López, G; Sánchez-Vallejo, C; Isa, P; Arias, C F; Torres, J; Hernández-Angeles, A; Ramírez-Ortiz, M A; Lara, C; Cabrera-Muñoz, Ma de L; Sadowinski-Pine, S; Bravo-Ortiz, J C; Ramón-García, G; Diegopérez-Ramírez, J; Ramírez-Reyes, G; Casarrubias-Islas, R; Ramírez, J; Orjuela, M; Ponce-Castañeda, M V

    2016-03-01

    Retinoblastoma (Rb) is a pediatric intraocular malignancy and probably the most robust clinical model on which genetic predisposition to develop cancer has been demonstrated. Since deletions in chromosome 13 have been described in this tumor, we performed next generation sequencing to test whether recurrent losses could be detected in low coverage data. We used the Illumina platform for 13 tumor tissue samples: two pools of 4 retinoblastoma cases each and one pool of 5 medulloblastoma cases (raw data can be found at http://www.ebi.ac.uk/ena/data/view/PRJEB6630). We first created an in silico reference profile generated from a human sequenced genome (GRCh37p5). From these data we calculated an integrity score to get an overview of gains and losses in all chromosomes; we next analyzed each chromosome in windows of 40 kb length, calculating for each window the log2 ratio between reads from the tumor pool and the in silico reference. Finally we generated panoramic maps with all the windows, whether lost or gained, along each chromosome, associated with its cytogenetic bands to facilitate interpretation. Expression microarray analysis was performed on the same samples, and a list of over- and under-expressed genes is presented here. For this detection a significance analysis was done and a log2 fold change was chosen as significant (raw data can be found at http://www.ncbi.nlm.nih.gov/geo/ under accession number GSE11488). The complete research article can be found in the journal Cancer Genetics (Garcia-Chequer et al., in press) [1]. In summary, here we provide an overview with visual graphics of gains and losses chromosome by chromosome in retinoblastoma and medulloblastoma, as well as the integrity score analysis and a list of genes with relevant associated expression. This material can be useful to researchers who may want to explore gains and losses in other malignant tumors with this approach or compare their data with retinoblastoma. PMID:26937470
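
    The per-window log2 ratio described above can be sketched as follows. The 40 kb window size comes from the abstract; the normalization by total read count, the pseudocount, and the function names are assumptions for illustration, since the abstract does not say how the authors normalized or handled empty windows.

```python
import math
from collections import Counter

WINDOW = 40_000  # 40 kb windows, as stated in the abstract


def window_counts(read_starts, window=WINDOW):
    """Count reads per window from a list of read start coordinates."""
    return Counter(pos // window for pos in read_starts)


def log2_ratios(tumor_starts, reference_starts, window=WINDOW, pseudocount=1.0):
    """Return {window_index: log2 ratio} of depth-normalized tumor vs reference counts."""
    tumor = window_counts(tumor_starts, window)
    ref = window_counts(reference_starts, window)
    n_tumor = max(len(tumor_starts), 1)
    n_ref = max(len(reference_starts), 1)
    ratios = {}
    for w in sorted(set(tumor) | set(ref)):
        t = (tumor[w] + pseudocount) / n_tumor  # assumed pseudocount avoids log2(0)
        r = (ref[w] + pseudocount) / n_ref
        ratios[w] = math.log2(t / r)
    return ratios
```

    In this sketch, windows with a clearly negative ratio would be candidate losses and windows with a clearly positive ratio candidate gains; any threshold for calling them is left to the analyst.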

  18. Illumina next generation sequencing data and expression microarrays data from retinoblastoma and medulloblastoma tissues

    PubMed Central

    García-Chequer, A.J.; Méndez-Tenorio, A.; Olguín-López, G.; Sánchez-Vallejo, C.; Isa, P.; Arias, C.F.; Torres, J.; Hernández-Angeles, A.; Ramírez-Ortiz, M.A.; Lara, C.; Cabrera-Muñoz, Ma.de.L.; Sadowinski-Pine, S.; Bravo-Ortiz, J.C.; Ramón-García, G.; Diegopérez-Ramírez, J.; Ramírez-Reyes, G.; Casarrubias-Islas, R.; Ramírez, J.; Orjuela, M.; Ponce-Castañeda, M.V.

    2016-01-01

    Retinoblastoma (Rb) is a pediatric intraocular malignancy and probably the most robust clinical model on which genetic predisposition to develop cancer has been demonstrated. Since deletions in chromosome 13 have been described in this tumor, we performed next generation sequencing to test whether recurrent losses could be detected in low coverage data. We used the Illumina platform for 13 tumor tissue samples: two pools of 4 retinoblastoma cases each and one pool of 5 medulloblastoma cases (raw data can be found at http://www.ebi.ac.uk/ena/data/view/PRJEB6630). We first created an in silico reference profile generated from a human sequenced genome (GRCh37p5). From these data we calculated an integrity score to get an overview of gains and losses in all chromosomes; we next analyzed each chromosome in windows of 40 kb length, calculating for each window the log2 ratio between reads from the tumor pool and the in silico reference. Finally we generated panoramic maps with all the windows, whether lost or gained, along each chromosome, associated with its cytogenetic bands to facilitate interpretation. Expression microarray analysis was performed on the same samples, and a list of over- and under-expressed genes is presented here. For this detection a significance analysis was done and a log2 fold change was chosen as significant (raw data can be found at http://www.ncbi.nlm.nih.gov/geo/ under accession number GSE11488). The complete research article can be found in the journal Cancer Genetics (Garcia-Chequer et al., in press) [1]. In summary, here we provide an overview with visual graphics of gains and losses chromosome by chromosome in retinoblastoma and medulloblastoma, as well as the integrity score analysis and a list of genes with relevant associated expression. This material can be useful to researchers who may want to explore gains and losses in other malignant tumors with this approach or compare their data with retinoblastoma. PMID:26937470

  19. High-Throughput Microdissection for Next-Generation Sequencing

    PubMed Central

    Rosenberg, Avi Z.; Armani, Michael D.; Fetsch, Patricia A.; Xi, Liqiang; Pham, Tina Thu; Raffeld, Mark; Chen, Yun; O’Flaherty, Neil; Stussman, Rebecca; Blackler, Adele R.; Du, Qiang; Hanson, Jeffrey C.; Roth, Mark J.; Filie, Armando C.; Roh, Michael H.; Emmert-Buck, Michael R.; Hipp, Jason D.; Tangrea, Michael A.

    2016-01-01

    Precision medicine promises to enhance patient treatment through the use of emerging molecular technologies, including genomics, transcriptomics, and proteomics. However, current tools in surgical pathology lack the capability to efficiently isolate specific cell populations in complex tissues/tumors, which can confound molecular results. Expression microdissection (xMD) is an immuno-based cell/subcellular isolation tool that procures targets of interest from a cytological or histological specimen. In this study, we demonstrate the accuracy and precision of xMD by rapidly isolating immunostained targets, including cytokeratin AE1/AE3, p53, and estrogen receptor (ER) positive cells and nuclei from tissue sections. Other targets procured included green fluorescent protein (GFP) expressing fibroblasts, in situ hybridization positive Epstein-Barr virus nuclei, and silver stained fungi. In order to assess the effect on molecular data, xMD was utilized to isolate specific targets from a mixed population of cells where the targets constituted only 5% of the sample. Target enrichment from this admixed cell population prior to next-generation sequencing (NGS) produced a minimum 13-fold increase in mutation allele frequency detection. These data suggest a role for xMD in a wide range of molecular pathology studies, as well as in the clinical workflow for samples where tumor cell enrichment is needed, or for those with a relative paucity of target cells. PMID:26999048

  20. Generation of BAC-end sequences for rainbow trout genome analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    For non-sequenced genomes, BAC end sequences (BES) provide a valuable sample of repetitive elements and gene content. Here we report the results of BAC end sequencing of just over half of the rainbow trout (Oncorhynchus mykiss) Swanson HindIII library. We sequenced 177,860 BAC ends that generated 17...

  1. GOblet: a platform for Gene Ontology annotation of anonymous sequence data

    PubMed Central

    Groth, Detlef; Lehrach, Hans; Hennig, Steffen

    2004-01-01

    GOblet is a comprehensive web server application providing the annotation of anonymous sequence data with Gene Ontology (GO) terms. It uses a variety of different protein databases (human, murines, invertebrates, plants, sp-trembl) and their respective GO mappings. The user selects the appropriate database and alignment threshold and thereafter submits single or multiple nucleotide or protein sequences. Results are shown in different ways, e.g. as survey statistics for the main GO categories for all sequences or as detailed results for each single sequence that has been submitted. In its newest version, GOblet allows the batch submission of sequences and provides an improved display of results with the aid of Java applets. All output data, together with the Java applet, are packed to a downloadable archive for local installation and analysis. GOblet can be accessed freely at http://goblet.molgen.mpg.de. PMID:15215401

  2. Generation of droplets to serpentine threads on a rotating compact-disk platform

    NASA Astrophysics Data System (ADS)

    Kar, Shantimoy; Joshi, Sumit; Chaudhary, Kaustav; Maiti, Tapas Kumar; Chakraborty, Suman

    2015-12-01

    We generate stable monodisperse droplets of nano-liter volumes and long serpentine liquid threads in a single, simple "Y"-shaped microchannel mounted on a rotationally actuated lab-on-a-compact-disk platform. Exploitation of Coriolis force offers versatile modus operandi of the present setup, without involving any design complications. Based on the fundamental understanding and subsequent analysis, we present scaling theories consistent with the experimental observations. We also outline specific applications of this technique, in the biological as well as in the physical domain, including digital polymerase chain reaction (PCR), controlled release of medical components, digital counting of colony forming units, hydrogel engineering, optical sensors and scaffolds for living tissues, to name a few.

  3. A microfluidic platform for size-dependent generation of droplet interface bilayer networks on rails

    PubMed Central

    Carreras, P.; Elani, Y.; Law, R. V.; Brooks, N. J.; Seddon, J. M.; Ces, O.

    2015-01-01

    Droplet interface bilayer (DIB) networks are emerging as a cornerstone technology for the bottom up construction of cell-like and tissue-like structures and bio-devices. They are an exciting and versatile model-membrane platform, seeing increasing use in the disciplines of synthetic biology, chemical biology, and membrane biophysics. DIBs are formed when lipid-coated water-in-oil droplets are brought together—oil is excluded from the interface, resulting in a bilayer. Perhaps the greatest feature of the DIB platform is the ability to generate bilayer networks by connecting multiple droplets together, which can in turn be used in applications ranging from tissue mimics, multicellular models, and bio-devices. For such applications, the construction and release of DIB networks of defined size and composition on-demand is crucial. We have developed a droplet-based microfluidic method for the generation of different sized DIB networks (300–1500 pl droplets) on-chip. We do this by employing a droplet-on-rails strategy where droplets are guided down designated paths of a chip with the aid of microfabricated grooves or “rails,” and droplets of set sizes are selectively directed to specific rails using auxiliary flows. In this way we can uniquely produce parallel bilayer networks of defined sizes. By trapping several droplets in a rail, extended DIB networks containing up to 20 sequential bilayers could be constructed. The trapped DIB arrays can be composed of different lipid types and can be released on-demand and regenerated within seconds. We show that chemical signals can be propagated across the bio-network by transplanting enzymatic reaction cascades for inter-droplet communication. PMID:26759638

  4. A microfluidic platform for size-dependent generation of droplet interface bilayer networks on rails.

    PubMed

    Carreras, P; Elani, Y; Law, R V; Brooks, N J; Seddon, J M; Ces, O

    2015-11-01

    Droplet interface bilayer (DIB) networks are emerging as a cornerstone technology for the bottom up construction of cell-like and tissue-like structures and bio-devices. They are an exciting and versatile model-membrane platform, seeing increasing use in the disciplines of synthetic biology, chemical biology, and membrane biophysics. DIBs are formed when lipid-coated water-in-oil droplets are brought together: oil is excluded from the interface, resulting in a bilayer. Perhaps the greatest feature of the DIB platform is the ability to generate bilayer networks by connecting multiple droplets together, which can in turn be used in applications ranging from tissue mimics, multicellular models, and bio-devices. For such applications, the construction and release of DIB networks of defined size and composition on-demand is crucial. We have developed a droplet-based microfluidic method for the generation of different sized DIB networks (300-1500 pl droplets) on-chip. We do this by employing a droplet-on-rails strategy where droplets are guided down designated paths of a chip with the aid of microfabricated grooves or "rails," and droplets of set sizes are selectively directed to specific rails using auxiliary flows. In this way we can uniquely produce parallel bilayer networks of defined sizes. By trapping several droplets in a rail, extended DIB networks containing up to 20 sequential bilayers could be constructed. The trapped DIB arrays can be composed of different lipid types and can be released on-demand and regenerated within seconds. We show that chemical signals can be propagated across the bio-network by transplanting enzymatic reaction cascades for inter-droplet communication. PMID:26759638

  5. Generation and Characterization of an IgG4 Monomeric Fc Platform

    PubMed Central

    Shan, Lu; Colazet, Magali; Rosenthal, Kim L.; Yu, Xiang-Qing; Bee, Jared S.; Ferguson, Andrew; Damschroder, Melissa M.; Wu, Herren; Dall’Acqua, William F.; Tsui, Ping

    2016-01-01

    The immunoglobulin Fc region is a homodimer consisting of two sets of CH2 and CH3 domains and has been exploited to generate two-arm protein fusions with high expression yields, simplified purification processes and extended serum half-life. However, attempts to generate one-arm fusion proteins with monomeric Fc, with one set of CH2 and CH3 domains, are often plagued with challenges such as weakened binding to FcRn or partial monomer formation. Here, we demonstrate the generation of a stable IgG4 Fc monomer with a unique combination of mutations at the CH3-CH3 interface using rational design combined with in vitro evolution methodologies. In addition to size-exclusion chromatography and analytical ultracentrifugation, we used multi-angle light scattering (MALS) to show that the engineered Fc monomer exhibits excellent monodispersity. Furthermore, crystal structure analysis (PDB ID: 5HVW) reveals monomeric properties supported by disrupted interactions at the CH3-CH3 interface. Monomeric Fc fusions with Fab or scFv achieved FcRn binding and serum half-life comparable to wildtype IgG. These results demonstrate that this monomeric IgG4 Fc is a promising therapeutic platform to extend the serum half-life of proteins in a monovalent format. PMID:27479095

  6. Sequence stratigraphy and systems tract development of the Latemar platform, Middle Triassic of the dolomites: Outcrop calibration keyed by cycle stacking patterns

    SciTech Connect

    Goldhammer, R.K.; Dunn, P.A.; Harris, M.T.; Hardie, L.A.

    1991-03-01

    The Middle Triassic Latemar platform provides a seismic-scale outcrop example of an intact carbonate shelf-to-basin transition, ideal for integrating sequence stratigraphy with facies and cyclic stratigraphy. This subcircular, high-relief buildup records two third-order accommodation sequences within the platform interior: the lower Ladinian sequence and the upper Ladinian sequence. Sequence L1 developed atop a widespread, low-relief Middle Anisian carbonate bank (60 m thick). Underlying subtidal bank cycles thin upward into the basal, subaerial sequence boundary (type 1) reflecting decreasing third-order accommodation; above it, platform-interior facies of sequence L1 retrograde. This results in superimposition of Ladinian basinal and foreslope facies atop the underlying, horizontal, shallow-water bank along its periphery. The transgressive (TST) and highstand systems tract (HST) of sequence L1 (as well as L2) are marked by long-term, systematic vertical facies changes and variation in stacking patterns of aggradational high-frequency, 20 kyr cycles within the platform interior. The maximum flooding surface (MFS) is a marine hardground surface displaying evidence of very slow sedimentation and is the platform expression of the condensed section. A type 2 SB caps sequence L1, marked by an interval of vertically superimposed thin subaerial tepees; beneath this, high-frequency cycles are thinning-upward, and above they are thickening-upward. Only the transgressive systems tract of sequence L2 is preserved at the Latemar owing to late Ladinian-Early Carnian volcanism and tectonism which terminated carbonate platform deposition.

  7. Using next generation sequencing approaches for the isolation of simple sequence repeats (SSR) in the plant sciences

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The application of next-generation sequencing (NGS) technologies for the development of simple sequence repeat (SSR) or microsatellite loci for genetic research in the botanical sciences is described. The major advantage of using NGS methods to isolate SSR loci is their ability to quickly and cost-e...

  8. SNP discovery in non-model organisms using 454 next generation sequencing.

    PubMed

    Wheat, Christopher W

    2012-01-01

    Roche 454 sequencing of the transcriptome has become a standard approach for efficiently obtaining single nucleotide polymorphisms (SNPs) in non-model species. In this chapter, the primary issues facing the development of SNPs from the transcriptome in non-model species are presented: tissue and sampling choices, mRNA preparation, considerations of normalization, pooling and barcoding, how much to sequence, how to assemble the data and assess the assembly, calling transcriptome SNPs, developing these into genomic SNPs, and publishing the work. Discussion also covers the comparison of this approach to RAD-tag sequencing and the potential of using other sequencing platforms for SNP development. PMID:22665274

  9. Next-Generation Sequencing Tech Panel (7th Annual SFAF Meeting, 2012)

    SciTech Connect

    Rhodes, Michael; Fiske, Haley; Knight, Jim; Turner, Steve (Pacific Biosciences)

    2012-06-01

    Representatives from several next-generation sequencer manufacturers take part in a panel discussion at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  10. Sample Prep, Workflow Automation and Nucleic Acid Fractionation for Next Generation Sequencing

    SciTech Connect

    Roskey, Mark

    2010-06-03

    Mark Roskey of Caliper LifeSciences discusses how the company's technologies fit into the next generation sequencing workflow on June 3, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

  11. Next generation sequencing (NGS) technologies and applications

    SciTech Connect

    Vuyisich, Momchilo

    2012-09-11

    NGS technology overview: (1) NGS library preparation: nucleic acid extraction, sample quality control, RNA conversion to cDNA, addition of sequencing adapters, quality control of library; (2) Sequencing: clonal amplification of library fragments (except PacBio), sequencing by synthesis, data output (reads and quality); and (3) Data analysis: read mapping, genome assembly, gene expression, operon structure, sRNA discovery, and epigenetic analyses.

  12. Next-generation sequencing identifies the natural killer cell microRNA transcriptome

    PubMed Central

    Fehniger, Todd A.; Wylie, Todd; Germino, Elizabeth; Leong, Jeffrey W.; Magrini, Vincent J.; Koul, Sunita; Keppel, Catherine R.; Schneider, Stephanie E.; Koboldt, Daniel C.; Sullivan, Ryan P.; Heinz, Michael E.; Crosby, Seth D.; Nagarajan, Rakesh; Ramsingh, Giridharan; Link, Daniel C.; Ley, Timothy J.; Mardis, Elaine R.

    2010-01-01

    Natural killer (NK) cells are innate lymphocytes important for early host defense against infectious pathogens and surveillance against malignant transformation. Resting murine NK cells regulate the translation of effector molecule mRNAs (e.g., granzyme B, GzmB) through unclear molecular mechanisms. MicroRNAs (miRNAs) are small noncoding RNAs that post-transcriptionally regulate the translation of their mRNA targets, and are therefore candidates for mediating this control process. While the expression and importance of miRNAs in T and B lymphocytes have been established, little is known about miRNAs in NK cells. Here, we used two next-generation sequencing (NGS) platforms to define the miRNA transcriptomes of resting and cytokine-activated primary murine NK cells, with confirmation by quantitative real-time PCR (qRT-PCR) and microarrays. We delineate a bioinformatics analysis pipeline that identified 302 known and 21 novel mature miRNAs from sequences obtained from NK cell small RNA libraries. These miRNAs are expressed over a broad range and exhibit isomiR complexity, and a subset is differentially expressed following cytokine activation. Using these miRNA NGS data, miR-223 was identified as a mature miRNA present in resting NK cells with decreased expression following cytokine activation. Furthermore, we demonstrate that miR-223 specifically targets the 3′ untranslated region of murine GzmB in vitro, indicating that this miRNA may contribute to control of GzmB translation in resting NK cells. Thus, the sequenced NK cell miRNA transcriptome provides a valuable framework for further elucidation of miRNA expression and function in NK cell biology. PMID:20935160

  13. Alignment-Free Sequence Comparison Based on Next-Generation Sequencing Reads

    PubMed Central

    Song, Kai; Ren, Jie; Zhai, Zhiyuan; Liu, Xuemei

    2013-01-01

    Next-generation sequencing (NGS) technologies have generated enormous amounts of shotgun read data, and assembly of the reads can be challenging, especially for organisms without template sequences. We study the power of genome comparison based on shotgun read data without assembly using three alignment-free sequence comparison statistics, $D_2$, $D_2^{*}$, and $D_2^{S}$, both theoretically and by simulations. Theoretical formulas for the power of detecting the relationship between two sequences related through a common motif model are derived. It is shown that both $D_2^{*}$ and
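
    As a rough illustration of the statistics named above, the following sketch computes $D_2$ and $D_2^{S}$ directly from unassembled reads. It assumes the definitions commonly given in the alignment-free literature ($D_2$ as the inner product of k-mer count vectors, $D_2^{S}$ as centred counts normalised term by term), an i.i.d. base model for the expected counts, and reads containing only A, C, G and T. The function names are hypothetical and this is not the authors' implementation.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_counts(reads, k):
    """Count every word of length k across a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def d2(counts_x, counts_y):
    """D2: inner product of the two k-mer count vectors."""
    return sum(c * counts_y[w] for w, c in counts_x.items())


def base_freqs(reads):
    """Empirical base frequencies (assumes only A, C, G, T occur)."""
    counts = Counter("".join(reads))
    total = sum(counts[b] for b in "ACGT")
    return {b: counts[b] / total for b in "ACGT"}


def word_prob(word, freqs):
    """Probability of a word under an i.i.d. base model (an assumption)."""
    p = 1.0
    for b in word:
        p *= freqs[b]
    return p


def d2s(counts_x, counts_y, freqs_x, freqs_y, k):
    """D2S: sum over all 4**k words of x~ * y~ / sqrt(x~**2 + y~**2),
    where x~ and y~ are counts minus their expected values."""
    nx = sum(counts_x.values())
    ny = sum(counts_y.values())
    total = 0.0
    for w in map("".join, product("ACGT", repeat=k)):
        xt = counts_x[w] - nx * word_prob(w, freqs_x)
        yt = counts_y[w] - ny * word_prob(w, freqs_y)
        denom = sqrt(xt * xt + yt * yt)
        if denom > 0:
            total += xt * yt / denom
    return total
```

    Enumerating all 4**k words is only practical for small k; larger word sizes would need a sparse formulation.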

  14. Generation of a Transcriptome in a Model Lepidopteran Pest, Heliothis virescens, Using Multiple Sequencing Strategies for Profiling Midgut Gene Expression

    PubMed Central

    Popham, Holly J. R.; Gould, Fred; Adang, Michael J.; Jurat-Fuentes, Juan Luis

    2015-01-01

    Heliothine pests such as the tobacco budworm, Heliothis virescens (F.), pose a significant threat to production of a variety of crops and ornamental plants and are models for developmental and physiological studies. The efforts to develop new control measures for H. virescens, as well as its use as a relevant biological model, are hampered by a lack of molecular resources. The present work demonstrates the utility of next-generation sequencing technologies for rapid molecular resource generation for this species, which lacks a sequenced genome. In order to amass a de novo transcriptome for this moth, transcript sequences generated from Illumina, Roche 454, and Sanger sequencing platforms were merged into a single de novo transcriptome assembly. This pooling strategy allowed a thorough sampling of transcripts produced under diverse environmental conditions, developmental stages, tissues, and infections with entomopathogens used for biological control, to provide the most complete transcriptome to date for this species. Over 138 million reads from the three platforms were assembled into the final set of 63,648 contigs. Of these, 29,978 had significant BLAST scores indicating orthologous relationships to transcripts of other insect species, with the top-hit species being the monarch butterfly (Danaus plexippus) and silkworm (Bombyx mori). Among identified H. virescens orthologs were immune effectors, signal transduction pathways, olfactory receptors, hormone biosynthetic pathways, peptide hormones and their receptors, digestive enzymes, and insecticide resistance enzymes. As an example, we demonstrate the utility of this transcriptomic resource to study gene expression profiling of larval midguts and detect transcripts of putative Bacillus thuringiensis (Bt) Cry toxin receptors. The substantial molecular resources described in this study will facilitate development of H. virescens as a relevant biological model for functional genomics and for new biological

  15. Generation of a Transcriptome in a Model Lepidopteran Pest, Heliothis virescens, Using Multiple Sequencing Strategies for Profiling Midgut Gene Expression.

    PubMed

    Perera, Omaththage P; Shelby, Kent S; Popham, Holly J R; Gould, Fred; Adang, Michael J; Jurat-Fuentes, Juan Luis

    2015-01-01

    Heliothine pests such as the tobacco budworm, Heliothis virescens (F.), pose a significant threat to production of a variety of crops and ornamental plants and are models for developmental and physiological studies. The efforts to develop new control measures for H. virescens, as well as its use as a relevant biological model, are hampered by a lack of molecular resources. The present work demonstrates the utility of next-generation sequencing technologies for rapid molecular resource generation for this species, which lacks a sequenced genome. In order to amass a de novo transcriptome for this moth, transcript sequences generated from Illumina, Roche 454, and Sanger sequencing platforms were merged into a single de novo transcriptome assembly. This pooling strategy allowed a thorough sampling of transcripts produced under diverse environmental conditions, developmental stages, tissues, and infections with entomopathogens used for biological control, to provide the most complete transcriptome to date for this species. Over 138 million reads from the three platforms were assembled into the final set of 63,648 contigs. Of these, 29,978 had significant BLAST scores indicating orthologous relationships to transcripts of other insect species, with the top-hit species being the monarch butterfly (Danaus plexippus) and silkworm (Bombyx mori). Among identified H. virescens orthologs were immune effectors, signal transduction pathways, olfactory receptors, hormone biosynthetic pathways, peptide hormones and their receptors, digestive enzymes, and insecticide resistance enzymes. As an example, we demonstrate the utility of this transcriptomic resource to study gene expression profiling of larval midguts and detect transcripts of putative Bacillus thuringiensis (Bt) Cry toxin receptors. The substantial molecular resources described in this study will facilitate development of H. virescens as a relevant biological model for functional genomics and for new biological

  16. A practical, bioinformatic workflow system for large data sets generated by next-generation sequencing

    PubMed Central

    Cantacessi, Cinzia; Jex, Aaron R.; Hall, Ross S.; Young, Neil D.; Campbell, Bronwyn E.; Joachim, Anja; Nolan, Matthew J.; Abubucker, Sahar; Sternberg, Paul W.; Ranganathan, Shoba; Mitreva, Makedonka; Gasser, Robin B.

    2010-01-01

    Transcriptomics (at the level of single cells, tissues and/or whole organisms) underpins many fields of biomedical science, from understanding the basic cellular function in model organisms, to the elucidation of the biological events that govern the development and progression of human diseases, and the exploration of the mechanisms of survival, drug-resistance and virulence of pathogens. Next-generation sequencing (NGS) technologies are contributing to a massive expansion of transcriptomics in all fields and are reducing the cost, time and performance barriers presented by conventional approaches. However, bioinformatic tools for the analysis of the sequence data sets produced by these technologies can be daunting to researchers with limited or no expertise in bioinformatics. Here, we constructed a semi-automated, bioinformatic workflow system, and critically evaluated it for the analysis and annotation of large-scale sequence data sets generated by NGS. We demonstrated its utility for the exploration of differences in the transcriptomes among various stages and both sexes of an economically important parasitic worm (Oesophagostomum dentatum) as well as the prediction and prioritization of essential molecules (including GTPases, protein kinases and phosphatases) as novel drug target candidates. This workflow system provides a practical tool for the assembly, annotation and analysis of NGS data sets, also to researchers with a limited bioinformatic expertise. The custom-written Perl, Python and Unix shell computer scripts used can be readily modified or adapted to suit many different applications. This system is now utilized routinely for the analysis of data sets from pathogens of major socio-economic importance and can, in principle, be applied to transcriptomics data sets from any organism. PMID:20682560

  17. Assessment of Target Enrichment Platforms Using Massively Parallel Sequencing for the Mutation Detection for Congenital Muscular Dystrophy

    PubMed Central

    Valencia, C. Alexander; Rhodenizer, Devin; Bhide, Shruti; Chin, Ephrem; Littlejohn, Martin Robert; Keong, Lisa Mari; Rutkowski, Anne; Bonnemann, Carsten; Hegde, Madhuri

    2012-01-01

    Sequencing individual genes by Sanger sequencing is a time-consuming and costly approach to resolve clinically heterogeneous genetic disorders. Panel testing offers the ability to efficiently and cost-effectively screen all of the genes for a particular genetic disorder. We assessed the analytical sensitivity and specificity of two different enrichment technologies, solution-based hybridization and microdroplet-based PCR target enrichment, in conjunction with next-generation sequencing (NGS), to identify mutations in 321 exons representing 12 different genes involved with congenital muscular dystrophies. Congenital muscular dystrophies present diagnostic challenges due to phenotypic variability, lack of standard access to and inherent difficulties with muscle immunohistochemical stains, and a general lack of clinician awareness. NGS results were analyzed across several parameters, including sequencing metrics and genotype concordance with Sanger sequencing. Genotyping data showed that both enrichment technologies produced suitable calls for use in clinical laboratories. However, microdroplet-based PCR target enrichment is more appropriate for a clinical laboratory, due to excellent sequence specificity and uniformity, reproducibility, high coverage of the target exons, and the ability to distinguish the active gene versus known pseudogenes. Regardless of the method, exons with highly repetitive and high GC regions are not well enriched and require Sanger sequencing for completeness. Our study demonstrates the successful application of targeted sequencing in conjunction with NGS to screen for mutations in hundreds of exons in a genetically heterogeneous human disorder. PMID:22426012
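
    Analytical sensitivity and specificity against a Sanger reference, as assessed above, reduce to counting agreements and disagreements per site. The sketch below is only an illustration under assumed inputs: each call set is a dictionary mapping a hypothetical site identifier to True (variant) or False (reference), and a site missing from the NGS calls is treated as a reference call.

```python
def sensitivity_specificity(ngs_calls, sanger_calls):
    """Compare NGS genotype calls with Sanger calls taken as the truth.

    Both arguments map a site identifier to True (variant present)
    or False (reference). Returns (sensitivity, specificity).
    """
    tp = fp = tn = fn = 0
    for site, truth in sanger_calls.items():
        called = ngs_calls.get(site, False)  # assumption: missing = reference
        if truth and called:
            tp += 1
        elif truth and not called:
            fn += 1
        elif not truth and called:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```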

  18. Identification and Characterization of Microsatellite Loci in Maqui (Aristotelia chilensis [Molina] Stunz) Using Next-Generation Sequencing (NGS)

    PubMed Central

    Bastías, Adriana; Correa, Francisco; Rojas, Pamela; Almada, Rubén; Muñoz, Carlos; Sagredo, Boris

    2016-01-01

    Maqui (Aristotelia chilensis [Molina] Stunz) is a small dioecious tree native to South America with edible fruit characterized by very high antioxidant capacity and anthocyanin content. To preserve maqui as a genetic resource it is essential to study its genetic diversity. However, the complete genome is unknown and only a few gene sequences are available in databases. Simple sequence repeat (SSR) markers, which are neutral, co-dominant, reproducible and highly variable, are desirable to support genetic studies in maqui populations. By identifying and characterizing microsatellite loci from a maqui genotype using 454 sequencing technology, we developed a set of SSR markers for this species. Obtaining a total of 165,043 shotgun genome sequences, with an average read length of 387 bases, we covered 64 Mb of the maqui genome. Reads were assembled into 4,832 contigs, while 98,546 reads remained as singletons, generating a total of 103,378 consensus genomic sequences. A total of 24,494 SSR maqui markers were identified. Of them, 15,950 SSR maqui markers were classified as perfect. The most common SSR motifs were dinucleotide (31%), followed by tetranucleotide (26%) and trinucleotide motifs (24%). The motif AG/CT (28.4%) was the most abundant, while the motif AC (89 bp) was the longest. Eleven polymorphic SSRs were selected and used to analyze a population of 40 maqui genotypes. Polymorphism information content (PIC) ranged from 0.117 to 0.82, with an average of 0.58. No significant groups were observed in the maqui population, indicating a panmictic genetic structure. In addition, we predicted 11,150 putative genes and 3 microRNAs (miRNAs) in maqui sequences. These results, including partial sequences of genes, some miRNAs and SSR markers from high-throughput next-generation sequencing (NGS) of maqui genomic DNA, constitute the first platform to undertake genetic and molecular studies of this important species. PMID:27459734
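
    The abstract reports PIC values without stating the formula used. A common definition (an assumption here) is PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i are the allele frequencies at one locus. A minimal sketch of that calculation, with a hypothetical function name:

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one locus.

    allele_freqs: allele frequencies that sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(2 * p * p * q * q
                     for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term
```

    For two equally frequent alleles this gives 0.375, and a monomorphic locus gives 0.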

  19. Identification and Characterization of Microsatellite Loci in Maqui (Aristotelia chilensis [Molina] Stunz) Using Next-Generation Sequencing (NGS).

    PubMed

    Bastías, Adriana; Correa, Francisco; Rojas, Pamela; Almada, Rubén; Muñoz, Carlos; Sagredo, Boris

    2016-01-01

    Maqui (Aristotelia chilensis [Molina] Stunz) is a small dioecious tree native to South America with edible fruit characterized by very high antioxidant capacity and anthocyanin content. To preserve maqui as a genetic resource it is essential to study its genetic diversity. However, the complete genome is unknown and only a few gene sequences are available in databases. Simple sequence repeat (SSR) markers, which are neutral, co-dominant, reproducible and highly variable, are desirable to support genetic studies in maqui populations. By identifying and characterizing microsatellite loci from a maqui genotype using 454 sequencing technology, we developed a set of SSR markers for this species. Obtaining a total of 165,043 shotgun genome sequences, with an average read length of 387 bases, we covered 64 Mb of the maqui genome. Reads were assembled into 4,832 contigs, while 98,546 reads remained as singletons, generating a total of 103,378 consensus genomic sequences. A total of 24,494 SSR maqui markers were identified. Of them, 15,950 SSR maqui markers were classified as perfect. The most common SSR motifs were dinucleotide (31%), followed by tetranucleotide (26%) and trinucleotide motifs (24%). The motif AG/CT (28.4%) was the most abundant, while the motif AC (89 bp) was the longest. Eleven polymorphic SSRs were selected and used to analyze a population of 40 maqui genotypes. Polymorphism information content (PIC) ranged from 0.117 to 0.82, with an average of 0.58. No significant groups were observed in the maqui population, indicating a panmictic genetic structure. In addition, we predicted 11,150 putative genes and 3 microRNAs (miRNAs) in maqui sequences. These results, including partial sequences of genes, some miRNAs and SSR markers from high-throughput next-generation sequencing (NGS) of maqui genomic DNA, constitute the first platform to undertake genetic and molecular studies of this important species. PMID:27459734
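
    Perfect SSRs of the kind counted above (uninterrupted tandem repeats of a di-, tri- or tetranucleotide motif) can be located with a back-referencing regular expression. The sketch below is illustrative only: the minimum repeat count and motif-length limits are assumed parameters, since the abstract does not give the search criteria, and it is not the tool used in the study.

```python
import re


def find_perfect_ssrs(seq, min_repeats=5, min_unit=2, max_unit=4):
    """Return (start, motif, repeat_count) for each perfect SSR in seq.

    The lazy quantifier tries the shortest motif first, so a run such as
    ACACACAC is reported as a dinucleotide rather than a tetranucleotide.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits
```

    Motif classes can then be tallied by `len(motif)` to obtain proportions of di-, tri- and tetranucleotide repeats.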

  20. Targeted Next-generation Sequencing of Advanced Prostate Cancer Identifies Potential Therapeutic Targets and Disease Heterogeneity

    PubMed Central

    Beltran, Himisha; Yelensky, Roman; Frampton, Garrett M.; Park, Kyung; Downing, Sean R.; MacDonald, Theresa Y.; Jarosz, Mirna; Lipson, Doron; Tagawa, Scott T.; Nanus, David M.; Stephens, Philip J.; Mosquera, Juan Miguel; Cronin, Maureen T.; Rubin, Mark A.

    2012-01-01

    Background Most personalized cancer care strategies involving DNA sequencing are highly reliant on acquiring sufficient fresh or frozen tissue. It has been challenging to comprehensively evaluate the genome of advanced prostate cancer (PCa) because of limited access to metastatic tissue. Objective To demonstrate the feasibility of a novel next-generation sequencing (NGS) based platform that can be used with archival formalin-fixed paraffin-embedded (FFPE) biopsy tissue to evaluate the spectrum of DNA alterations seen in advanced PCa. Design, setting, and participants FFPE samples (including archival prostatectomies and prostate needle biopsies) were obtained from 45 patients representing the spectrum of disease: localized PCa, metastatic hormone-naive PCa, and metastatic castration-resistant PCa (CRPC). We also assessed paired primaries and metastases to understand disease heterogeneity and disease progression. Intervention At least 50 ng of tumor DNA was extracted from FFPE samples and used for hybridization capture and NGS using the Illumina HiSeq 2000 platform. Outcome measurements and statistical analysis A total of 3320 exons of 182 cancer-associated genes and 37 introns of 14 commonly rearranged genes were evaluated for genomic alterations. Results and limitations We obtained an average sequencing depth of >900X. Overall, 44% of CRPCs harbored genomic alterations involving the androgen receptor gene (AR), including AR copy number gain (24% of CRPCs) or AR point mutation (20% of CRPCs). Other recurrent mutations included transmembrane protease, serine 2 gene (TMPRSS2):v-ets erythroblastosis virus E26 oncogene homolog (avian) gene (ERG) fusion (44%); phosphatase and tensin homolog gene (PTEN) loss (44%); tumor protein p53 gene (TP53) mutation (40%); retinoblastoma gene (RB) loss (28%); v-myc myelocytomatosis viral oncogene homolog (avian) gene (MYC) gain (12%); and phosphatidylinositol-4,5-bisphosphate 3-kinase, catalytic subunit α gene (PIK3CA) mutation (4

  1. SNP discovery through de novo deep sequencing using the next generation of DNA sequencers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The production of high volumes of DNA sequence data using new technologies has permitted more efficient identification of single nucleotide polymorphisms in vertebrate genomes. This chapter presented practical methodology for production and analysis of DNA sequence data for SNP discovery....

  2. Enhanced microbial diversity in the saliva microbiome induced by short-term probiotic intake revealed by 16S rRNA sequencing on the IonTorrent PGM platform.

    PubMed

    Dassi, Erik; Ballarini, Annalisa; Covello, Giuseppina; Quattrone, Alessandro; Jousson, Olivier; De Sanctis, Veronica; Bertorelli, Roberto; Denti, Michela Alessandra; Segata, Nicola

    2014-11-20

    Microbial communities populating several human body habitats are important determinants of human health. Cultivation-free community-wide approaches like bacterial 16S rRNA sequencing recently revolutionized the study of such human-associated microbial diversity, and the continuously decreasing cost/throughput ratio of current sequencing platforms is further enhancing the availability and effectiveness of microbiome research. The IonTorrent PGM platform is among the latest available commercial high-throughput sequencing tools, but it is just starting to be used for 16S rRNA surveys with only episodic assessments of its performance. We present here the first IonTorrent profiling of the human saliva microbiome collected from 12 healthy individuals. In this cohort, a subset of the volunteers was asked to take a probiotic product, in order to investigate its impact on the composition and the structure of the saliva microbiome. Analysis of the generated dataset suggests the suitability of the IonTorrent platform for 16S rRNA surveys, even though some platform-specific choices are required to optimize the consistency of the obtained bacterial profiles. Interestingly, we found a marked and statistically significant increase of the overall bacterial diversity in the saliva of individuals who received the probiotic product compared to the control group, suggesting a short-term effect of probiotic product administration on oral microbiome composition. PMID:24670254
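
    The abstract does not say which diversity measure was used. One widely used choice for 16S rRNA profiles is the Shannon index, H = -sum(p_i ln p_i) over the relative abundances p_i of the observed taxa. The sketch below assumes that measure and a simple list of per-taxon read counts for one sample.

```python
from math import log


def shannon_index(taxon_counts):
    """Shannon diversity (natural log) from per-taxon read counts."""
    total = sum(taxon_counts)
    if total == 0:
        raise ValueError("sample contains no reads")
    return -sum((c / total) * log(c / total)
                for c in taxon_counts if c > 0)
```

    Comparing groups (for example probiotic versus control) would then involve computing the index per sample and applying a statistical test of the analyst's choosing.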

  3. Applications of Next-Generation Sequencing Technologies to Diagnostic Virology

    PubMed Central

    Barzon, Luisa; Lavezzo, Enrico; Militello, Valentina; Toppo, Stefano; Palù, Giorgio

    2011-01-01

    Novel DNA sequencing techniques, referred to as “next-generation” sequencing (NGS), provide high speed and throughput that can produce an enormous volume of sequences with many possible applications in research and diagnostic settings. In this article, we provide an overview of the many applications of NGS in diagnostic virology. NGS techniques have been used for high-throughput whole viral genome sequencing, such as sequencing of new influenza viruses, for detection of viral genome variability and evolution within the host, such as investigation of human immunodeficiency virus and human hepatitis C virus quasispecies, and monitoring of low-abundance antiviral drug-resistance mutations. NGS techniques have been applied to metagenomics-based strategies for the detection of unexpected disease-associated viruses and for the discovery of novel human viruses, including cancer-related viruses. Finally, the human virome in healthy and disease conditions has been described by NGS-based metagenomics. PMID:22174638

  4. Utility of different massive parallel sequencing platforms for mutation profiling in clinical samples and identification of pitfalls using FFPE tissue.

    PubMed

    Fassunke, Jana; Haller, Florian; Hebele, Simone; Moskalev, Evgeny A; Penzel, Roland; Pfarr, Nicole; Merkelbach-Bruse, Sabine; Endris, Volker

    2015-11-01

    In the growing field of personalised medicine, the analysis of numerous potential targets is becoming a challenge in terms of work load, tissue availability, as well as costs. The molecular analysis of non-small cell lung cancer (NSCLC) has shifted from the analysis of the epidermal growth factor receptor (EGFR) mutation status to the analysis of different gene regions, including resistance mutations or translocations. Massive parallel sequencing (MPS) allows rapid comprehensive mutation testing in routine molecular pathological diagnostics even on small formalin-fixed, paraffin‑embedded (FFPE) biopsies. In this study, we compared and evaluated currently used MPS platforms for their application in routine pathological diagnostics. We initiated a first round‑robin testing of 30 cases diagnosed with NSCLC and a known EGFR gene mutation status. In this study, three pathology institutes from Germany received FFPE tumour sections that had been individually processed. Fragment libraries were prepared by targeted multiplex PCR using institution‑specific gene panels. Sequencing was carried out using three MPS systems: MiSeq™, GS Junior and PGM Ion Torrent™. In two institutes, data analysis was performed with the platform-specific software and the Integrative Genomics Viewer. In one institute, data analysis was carried out using an in-house software system. Of 30 samples, 26 were analysed by all institutes. Concerning the EGFR mutation status, concordance was found in 26 out of 26 samples. The analysis of a few samples failed due to poor DNA quality in alternating institutes. We found 100% concordance when comparing the results of the EGFR mutation status. A total of 38 additional mutations were identified in the 26 samples. In two samples, minor variants were found which could not be confirmed by qPCR. Other characteristic variants were identified as fixation artefacts by reanalyzing the respective sample by Sanger sequencing. Overall, the results of this study

  5. Utility of different massive parallel sequencing platforms for mutation profiling in clinical samples and identification of pitfalls using FFPE tissue

    PubMed Central

    FASSUNKE, JANA; HALLER, FLORIAN; HEBELE, SIMONE; MOSKALEV, EVGENY A.; PENZEL, ROLAND; PFARR, NICOLE; MERKELBACH-BRUSE, SABINE; ENDRIS, VOLKER

    2015-01-01

    In the growing field of personalised medicine, the analysis of numerous potential targets is becoming a challenge in terms of work load, tissue availability, as well as costs. The molecular analysis of non-small cell lung cancer (NSCLC) has shifted from the analysis of the epidermal growth factor receptor (EGFR) mutation status to the analysis of different gene regions, including resistance mutations or translocations. Massive parallel sequencing (MPS) allows rapid comprehensive mutation testing in routine molecular pathological diagnostics even on small formalin-fixed, paraffin-embedded (FFPE) biopsies. In this study, we compared and evaluated currently used MPS platforms for their application in routine pathological diagnostics. We initiated a first round-robin testing of 30 cases diagnosed with NSCLC and a known EGFR gene mutation status. In this study, three pathology institutes from Germany received FFPE tumour sections that had been individually processed. Fragment libraries were prepared by targeted multiplex PCR using institution-specific gene panels. Sequencing was carried out using three MPS systems: MiSeq™, GS Junior and PGM Ion Torrent™. In two institutes, data analysis was performed with the platform-specific software and the Integrative Genomics Viewer. In one institute, data analysis was carried out using an in-house software system. Of 30 samples, 26 were analysed by all institutes. Concerning the EGFR mutation status, concordance was found in 26 out of 26 samples. The analysis of a few samples failed due to poor DNA quality in alternating institutes. We found 100% concordance when comparing the results of the EGFR mutation status. A total of 38 additional mutations were identified in the 26 samples. In two samples, minor variants were found which could not be confirmed by qPCR. Other characteristic variants were identified as fixation artefacts by reanalyzing the respective sample by Sanger sequencing. Overall, the results of this study

  6. Observation Platforms and Data Streams of the Arctic Next Generation Ecosystem Experiment (NGEE-Arctic)

    NASA Astrophysics Data System (ADS)

    Hinzman, L. D.; Wullschleger, S. D.; Graham, D. E.; Hubbard, S. S.; Norby, R. J.; Rogers, A.; Torn, M. S.; Wilson, C. J.

    2013-12-01

    The goal of the Arctic Next Generation Ecosystem Experiment (NGEE-Arctic) is to deliver a process-rich ecosystem model, extending from bedrock to the top of the vegetative canopy, in which the evolution of Arctic ecosystems in a changing climate can be modeled at the scale of a high resolution Earth System Model grid cell. Increasing our confidence in climate projections for high-latitude regions of the world requires a coordinated set of observation platforms that target improved process understanding and model representation of important ecosystem-climate feedbacks. The Next-Generation Ecosystem Experiments (NGEE Arctic) seeks to address this challenge by quantifying the physical, chemical, and biological behavior of terrestrial ecosystems in Alaska. Initial research has focused upon the highly dynamic landscapes of the North Slope (Barrow, Alaska) where thaw lakes, drained thaw lake basins, and ice-rich polygonal ground offer distinct land units for investigation and modeling. This vision includes mechanistic studies in the field and in the laboratory; modeling of critical and interrelated water, nitrogen, carbon, and energy dynamics; and characterization of important interactions from molecular to landscape scales that drive feedbacks to the climate system. To complete these investigations, an integrated program of field monitoring has been initiated. These include observations of meteorological, hydrological, ecological and geophysical processes. These data streams are intended to supplement and extend existing polar data sets to advance our understanding of the Arctic environment and its response to a rapidly changing climate.

  7. Complete genome sequence of southern tomato virus identified from China using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Complete genome sequence of a double-stranded RNA (dsRNA) virus, southern tomato virus (STV), on tomatoes in China, was elucidated using small RNAs deep sequencing. The identified STV_CN12 shares 99% sequence identity to other isolates from Mexico, France, Spain, and U.S. This is the first report ...

  8. Beating heart on a chip: a novel microfluidic platform to generate functional 3D cardiac microtissues.

    PubMed

    Marsano, Anna; Conficconi, Chiara; Lemme, Marta; Occhetta, Paola; Gaudiello, Emanuele; Votta, Emiliano; Cerino, Giulia; Redaelli, Alberto; Rasponi, Marco

    2016-02-01

    In the past few years, microfluidic-based technology has developed microscale models recapitulating key physical and biological cues typical of the native myocardium. However, the application of controlled physiological uniaxial cyclic strains on a defined three-dimension cellular environment is not yet possible. Two-dimension mechanical stimulation was particularly investigated, neglecting the complex three-dimensional cell-cell and cell-matrix interactions. For this purpose, we developed a heart-on-a-chip platform, which recapitulates the physiologic mechanical environment experienced by cells in the native myocardium. The device includes an array of hanging posts to confine cell-laden gels, and a pneumatic actuation system to induce homogeneous uniaxial cyclic strains to the 3D cell constructs during culture. The device was used to generate mature and highly functional micro-engineered cardiac tissues (μECTs), from both neonatal rat and human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CM), strongly suggesting the robustness of our engineered cardiac micro-niche. Our results demonstrated that the cyclic strain was effectively highly uniaxial and uniformly transferred to cells in culture. As compared to control, stimulated μECTs showed superior cardiac differentiation, as well as electrical and mechanical coupling, owing to a remarkable increase in junction complexes. Mechanical stimulation also promoted early spontaneous synchronous beating and better contractile capability in response to electric pacing. Pacing analyses of hiPSC-CM constructs upon controlled administration of isoprenaline showed further promising applications of our platform in drug discovery, delivery and toxicology fields. The proposed heart-on-a-chip device represents a relevant step forward in the field, providing a standard functional three-dimensional cardiac model to possibly predict signs of hypertrophic changes in cardiac phenotype by mechanical and biochemical co

  9. A Framework for the Generation and Dissemination of Drop Size Distribution (DSD) Characteristics Using Multiple Platforms

    NASA Technical Reports Server (NTRS)

    Wolf, David B.; Tokay, Ali; Petersen, Walt; Williams, Christopher; Gatlin, Patrick; Wingo, Mathew

    2010-01-01

    Proper characterization of the precipitation drop size distribution (DSD) is integral to providing realistic and accurate space- and ground-based precipitation retrievals. Current technology allows for the development of DSD products from a variety of platforms, including disdrometers, vertical profilers and dual-polarization radars. Up to now, however, the dissemination or availability of such products has been limited to individual sites and/or field campaigns, in a variety of formats, often using inconsistent algorithms for computing the integral DSD parameters, such as the median- and mass-weighted drop diameter, total number concentration, liquid water content, rain rate, etc. We propose to develop a framework for the generation and dissemination of DSD characteristic products using a unified structure, capable of handling the myriad collection of disdrometers, profilers, and dual-polarization radar data currently available and to be collected during several upcoming GPM Ground Validation field campaigns. This DSD super-structure paradigm is an adaptation of the radar super-structure developed for NASA's Radar Software Library (RSL) and RSL_in_IDL. The goal is to provide the DSD products in a well-documented format, most likely NetCDF, along with tools to ingest and analyze the products. In so doing, we can develop a robust archive of DSD products from multiple sites and platforms, which should greatly benefit the development and validation of precipitation retrieval algorithms for GPM and other precipitation missions. An outline of this proposed framework will be provided as well as a discussion of the algorithms used to calculate the DSD parameters.
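
    Several of the integral DSD parameters listed above are moments of the binned distribution. The sketch below shows one conventional way to compute total number concentration, mass-weighted mean diameter and liquid water content from disdrometer-style bins. The input layout and function name are assumptions for illustration, and the framework's own algorithms may differ.

```python
from math import pi


def dsd_parameters(diam_mm, conc, width_mm):
    """Integral parameters from a binned drop size distribution.

    diam_mm  : bin-centre diameters D_i in mm
    conc     : N(D_i) in m^-3 mm^-1
    width_mm : bin widths dD_i in mm
    """
    bins = list(zip(diam_mm, conc, width_mm))
    m0 = sum(n * dd for d, n, dd in bins)            # zeroth moment
    m3 = sum(n * d ** 3 * dd for d, n, dd in bins)   # third moment
    m4 = sum(n * d ** 4 * dd for d, n, dd in bins)   # fourth moment
    return {
        "total_number_concentration_m3": m0,
        "mass_weighted_mean_diameter_mm": m4 / m3,
        # water density 1 g cm^-3; 1e-3 converts mm^3 per m^3 to g per m^3
        "liquid_water_content_g_m3": (pi / 6.0) * 1e-3 * m3,
    }
```

    Rain rate additionally needs a fall-speed relation v(D), which is one place where inconsistent choices between sites can arise.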

  10. Efficient mapping of genomic sequences to optimize multiple pairwise alignment in hybrid cluster platforms.

    PubMed

    Montañola, Alberto; Roig, Concepció; Hernández, Porfidio

    2014-01-01

    Multiple sequence alignment (MSA), used in biocomputing to study similarities between genomic sequences, is known to require substantial memory and computation resources. Researchers now align thousands of sequences, which creates new challenges in solving the problem efficiently with the available resources. Determining the right amount of resources to allocate is important to avoid waste and thereby reduce the economic cost of running, for example, a specific cloud instance. Pairwise alignment, the initial key step of the MSA problem, computes all the pair alignments needed. We present a method to determine the optimal amount of memory and computation resources to allocate for the pairwise alignment, and we validate it through a set of experimental results for different possible inputs. These results allow us to determine the best parameters for configuring the applications so that they use the available resources of a given system effectively. PMID:25339085
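
    To make the scale of the all-pairs step concrete, the following sketch counts the pair alignments and estimates the size of a full dynamic-programming matrix per pair. The cost model (one matrix cell per residue pair, a fixed number of bytes per cell) and the function name are assumptions for illustration, not the method presented in the record.

```python
def pairwise_workload(lengths, bytes_per_cell=4):
    """Return (pair count, total DP cells, peak bytes for one pair).

    Assumes a full (len_a + 1) x (len_b + 1) matrix per pair, as in a
    textbook global alignment; real aligners may use far less memory.
    """
    n = len(lengths)
    pair_count = n * (n - 1) // 2
    total_cells = 0
    peak_bytes = 0
    for i in range(n):
        for j in range(i + 1, n):
            cells = (lengths[i] + 1) * (lengths[j] + 1)
            total_cells += cells
            peak_bytes = max(peak_bytes, cells * bytes_per_cell)
    return pair_count, total_cells, peak_bytes

pairs, cells, peak = pairwise_workload([3, 4, 5])
```

    Because the pair count grows quadratically with the number of sequences, a per-pair estimate like this is the kind of input a resource-allocation decision would need.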

  11. A Systems Approach towards an Intelligent and Self-Controlling Platform for Integrated Continuous Reaction Sequences

    PubMed Central

    Ingham, Richard J; Battilocchio, Claudio; Fitzpatrick, Daniel E; Sliwinski, Eric; Hawkins, Joel M; Ley, Steven V

    2015-01-01

    Performing reactions in flow can offer major advantages over batch methods. However, laboratory flow chemistry processes are currently often limited to single steps or short sequences due to the complexity involved with operating a multi-step process. Using new modular components for downstream processing, coupled with control technologies, more advanced multi-step flow sequences can be realized. These tools are applied to the synthesis of 2-aminoadamantane-2-carboxylic acid. A system comprising three chemistry steps and three workup steps was developed, having sufficient autonomy and self-regulation to be managed by a single operator. PMID:25377747

  12. NexGen Production – Sequencing and Analysis

    SciTech Connect

    Muzny, Donna

    2010-06-02

    Donna Muzny of the Baylor College of Medicine Human Genome Sequencing Center discusses next generation sequencing platforms and evaluating pipeline performance on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  13. Burrow-generated false facies and phantom sequences

    SciTech Connect

    Wanless, H.R.; Tagett, M.

    1986-05-01

    Callianassa (=Ophiomorpha) and other burrowers deeply rework shallow marine sequences. Through in-situ reworking, they create false sedimentary facies and stratigraphic sequences. Callianassa's key to effectiveness is that it expels sand and mud from burrow excavations but concentrates coarse material at the base of the burrow complex. Coarse material can be derived by falling into the burrow entrance, by reworking the existing sediment sequence, or by a combination of both. Examples come from shallow marine carbonate environments of south Florida and the Turks and Caicos Islands, British West Indies. Many mudbanks in south Florida are formed as stacks of layered mudstone units 20-100 cm thick. Between events, seagrasses may recolonize, and a burrowing benthic community may repopulate the substrate. The layered mudstone beneath older areas of mudbank flats can gradually be converted to a bioturbated skeletal wackestone by the deep burrowing community. Burrowing also causes mixing of faunal assemblages. On Caicos Bank, an extensive carbonate tidal flat (3-4 m thick) is slowly being transgressed. About 1 m of tidal-flat sequence is eroded at the shoreline. The remaining 2-3 m could be preserved as part of the transgressive sequence. Callianassa burrowing, however, quickly reworks the sequence, replacing tidal-flat sands and muds with marine peloidal and skeletal sediment. Within 100 m of the shoreline, the only evidence of the tidal-flat sequence is a concentration of high-spired gastropods in Callianassa burrows at the base of the Holocene sequence and a few patches of tidal-flat sediment that burrowers missed. What looks like a basal transgressive lag is in fact a biogenic concentrate from in-situ reworking of a now phantom sequence.

  14. Sequence-matched probes produce increased cross-platform consistency and more reproducible biological results in microarray-based gene expression measurements

    PubMed Central

    Mecham, Brigham H.; Klus, Gregory T.; Strovel, Jeffrey; Augustus, Meena; Byrne, David; Bozso, Peter; Wetmore, Daniel Z.; Mariani, Thomas J.; Kohane, Isaac S.; Szallasi, Zoltan

    2004-01-01

    Cancer derived microarray data sets are routinely produced by various platforms that are either commercially available or manufactured by academic groups. The fundamental difference in their probe selection strategies holds the promise that identical observations produced by more than one platform prove to be more robust when validated by biology. However, cross-platform comparison requires matching corresponding probe sets. Here we introduce sequence-based matching of probes instead of gene identifier-based matching. We analyzed breast cancer cell line derived RNA aliquots using Agilent cDNA and Affymetrix oligonucleotide microarray platforms to assess the advantage of this method. We show that, at different levels of the analysis, including gene expression ratios and difference calls, cross-platform consistency is significantly improved by sequence-based matching. We also present evidence that sequence-based probe matching produces more consistent results when comparing similar biological data sets obtained by different microarray platforms. This strategy allowed a more efficient transfer of classification of breast cancer samples between data sets produced by cDNA microarray and Affymetrix gene-chip platforms. PMID:15161944
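
    The idea of matching probes by sequence rather than by gene identifier can be sketched as follows. This is a deliberately simplified illustration: it assumes exact substring containment on one strand, whereas a real comparison would presumably use alignment and consider both strands. All identifiers and sequences below are made up.

```python
def match_probes_by_sequence(cdna_clones, oligo_probes):
    """Pair each oligo probe with every cDNA clone whose sequence contains it.

    cdna_clones  -- dict of clone id -> clone sequence (hypothetical data)
    oligo_probes -- dict of probe id -> probe sequence (hypothetical data)
    """
    matches = []
    for clone_id, clone_seq in cdna_clones.items():
        for probe_id, probe_seq in oligo_probes.items():
            # Exact containment stands in for a proper sequence alignment.
            if probe_seq in clone_seq:
                matches.append((clone_id, probe_id))
    return sorted(matches)

cdna = {"cloneA": "ATGGCGTACGTTAGC", "cloneB": "TTGACCGTAAGGCTA"}
oligo = {"probe1": "GCGTACG", "probe2": "CCGTAAG", "probe3": "AAAAAAA"}
matched = match_probes_by_sequence(cdna, oligo)
```

    Probes that share a gene identifier but not a target sequence drop out under this scheme, which is the situation the abstract argues harms cross-platform consistency.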

  15. Sequence-matched probes produce increased cross-platform consistency and more reproducible biological results in microarray-based gene expression measurements.

    PubMed

    Mecham, Brigham H; Klus, Gregory T; Strovel, Jeffrey; Augustus, Meena; Byrne, David; Bozso, Peter; Wetmore, Daniel Z; Mariani, Thomas J; Kohane, Isaac S; Szallasi, Zoltan

    2004-01-01

    Cancer derived microarray data sets are routinely produced by various platforms that are either commercially available or manufactured by academic groups. The fundamental difference in their probe selection strategies holds the promise that identical observations produced by more than one platform prove to be more robust when validated by biology. However, cross-platform comparison requires matching corresponding probe sets. Here we introduce sequence-based matching of probes instead of gene identifier-based matching. We analyzed breast cancer cell line derived RNA aliquots using Agilent cDNA and Affymetrix oligonucleotide microarray platforms to assess the advantage of this method. We show that, at different levels of the analysis, including gene expression ratios and difference calls, cross-platform consistency is significantly improved by sequence-based matching. We also present evidence that sequence-based probe matching produces more consistent results when comparing similar biological data sets obtained by different microarray platforms. This strategy allowed a more efficient transfer of classification of breast cancer samples between data sets produced by cDNA microarray and Affymetrix gene-chip platforms. PMID:15161944

  16. Spread-spectrum communications using sequences generated by phase filters

    NASA Astrophysics Data System (ADS)

    Bouvet, M.

    The principal characteristic of spread-spectrum communications is the extension of the signal spectrum in order to combat jammers and other interference. The 'noise-like' emitted signal must have a power spectral density that is as flat as possible. It is shown that the impulse response of an ARMA phase filter can be considered an infinite sequence with this desirable spectral property. Such sequences are studied as alternatives for spread-spectrum communications signal design. Characteristics of these signals, such as their autocorrelation, spectrum, and intercorrelation, are investigated. Some comparisons with other pseudorandom sequences are given.
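
    The flat-spectrum property of a phase (all-pass) filter's impulse response can be checked numerically. The sketch below assumes the simplest case, a first-order all-pass section H(z) = (a + z^-1) / (1 + a z^-1) with |a| < 1; the record does not say which filter orders were studied, so the filter and both function names are illustrative assumptions.

```python
import cmath
import math

def allpass_impulse_response(a, length):
    """First `length` samples of the impulse response of
    H(z) = (a + z^-1) / (1 + a z^-1), i.e. y[n] = a x[n] + x[n-1] - a y[n-1]."""
    h = []
    prev_x = 0.0
    prev_y = 0.0
    for n in range(length):
        x = 1.0 if n == 0 else 0.0   # unit impulse input
        y = a * x + prev_x - a * prev_y
        h.append(y)
        prev_x, prev_y = x, y
    return h

def magnitude_spectrum(h, n_freqs=16):
    """|H(e^{jw})| on a grid of frequencies w in [0, pi)."""
    mags = []
    for k in range(n_freqs):
        w = math.pi * k / n_freqs
        s = sum(v * cmath.exp(-1j * w * n) for n, v in enumerate(h))
        mags.append(abs(s))
    return mags

h = allpass_impulse_response(0.5, 64)
spectrum = magnitude_spectrum(h)
```

    With the response truncated after 64 samples, the remaining tail is negligible for a = 0.5, so every entry of `spectrum` is close to 1: the sequence is noise-like in the flat-spectrum sense while being fully deterministic.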

  17. Radiation Reduction Capabilities of a Next-Generation Pediatric Imaging Platform.

    PubMed

    Lamers, Luke J; Moran, Martine; Torgeson, Jenna N; Hokanson, John S

    2016-01-01

    The aims of this study were to quantify patient radiation exposure for a single interventional procedure during transition from an adult catheterization laboratory to a next-generation imaging system with pediatric settings, and to compare this radiation data to published benchmarks. Radiation exposure occurs with any X-ray-directed pediatric catheterization. Technologies and imaging techniques that limit dose while preserving image quality benefit patient care. Patient radiation dose metrics, air kerma, and dose-area product (DAP) were retrospectively obtained for patients <20 kg who underwent patent ductus arteriosus (PDA) closure on a standard imaging system (Group 1, n = 11) and a next-generation pediatric imaging system (Group 2, n = 10) with air-gap technique. Group 2 radiation dose metrics were then compared to published benchmarks. Patient demographics, procedural technique, PDA dimensions, closure devices, and fluoroscopy time were similar for the two groups. Air kerma and DAP decreased by 65-70% in Group 2 (p values <0.001). The average number of angiograms approached statistical significance (p value = 0.06); therefore, analysis of covariance (ANCOVA) was conducted that confirmed significantly lower dose measures in Group 2. This degree of dose reduction was similar when Group 2 data (Kerma 28 mGy, DAP 199 µGy·m²) was compared to published benchmarks for PDA closure (Kerma 76 mGy, DAP 500 µGy·m²). This is the first clinical study documenting the radiation reduction capabilities of a next-generation pediatric imaging platform. The true benefit of this dose reduction will be seen in patients requiring complex and often recurrent catheterizations. PMID:26215767

  18. Genome cluster database. A sequence family analysis platform for Arabidopsis and rice.

    PubMed

    Horan, Kevin; Lauricha, Josh; Bailey-Serres, Julia; Raikhel, Natasha; Girke, Thomas

    2005-05-01

    The genome-wide protein sequences from Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa ssp. japonica) were clustered into families using sequence similarity and domain-based clustering. The two fundamentally different methods resulted in separate cluster sets with complementary properties to compensate for the limitations of each in accurate family analysis. Functional names for the identified families were assigned with an efficient computational approach that uses the description of the most common molecular function gene ontology node within each cluster. Subsequently, multiple alignments and phylogenetic trees were calculated for the assembled families. All clustering results and their underlying sequences were organized in the Web-accessible Genome Cluster Database (http://bioinfo.ucr.edu/projects/GCD) with rich interactive and user-friendly sequence family mining tools to facilitate the analysis of any given family of interest for the plant science community. An automated clustering pipeline ensures current information for future updates in the annotations of the two genomes and clustering improvements. The analysis allowed the first systematic identification of family and singlet proteins present in both organisms as well as those restricted to one of them. In addition, the established Web resources for mining these data provide a road map for future studies of the composition and structure of protein families between the two species. PMID:15888677
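
    The two steps described here, grouping proteins into families and naming each family after its most common molecular function term, can be sketched as below. Single-linkage grouping over a list of similar pairs is an assumption made for brevity and is not stated to be the database's clustering method; the protein identifiers and annotations are hypothetical.

```python
from collections import Counter

def cluster_by_similarity(ids, similar_pairs):
    """Single-linkage families: proteins joined by any chain of similar pairs."""
    parent = {i: i for i in ids}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        parent[find(a)] = find(b)

    families = {}
    for i in ids:
        families.setdefault(find(i), []).append(i)
    return sorted(sorted(members) for members in families.values())

def name_family(members, go_terms):
    """Name a family after its most frequent annotation, if it has any."""
    counts = Counter(go_terms[m] for m in members if m in go_terms)
    return counts.most_common(1)[0][0] if counts else "unnamed"

families = cluster_by_similarity(["p1", "p2", "p3", "p4"], [("p1", "p2"), ("p2", "p3")])
go = {"p1": "kinase activity", "p2": "kinase activity", "p3": "DNA binding"}
names = [name_family(f, go) for f in families]
```

    A protein that ends up alone in its group, such as `p4` here, corresponds to what the abstract calls a singlet.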

  19. Wasabi: An Integrated Platform for Evolutionary Sequence Analysis and Data Visualization.

    PubMed

    Veidenberg, Andres; Medlar, Alan; Löytynoja, Ari

    2016-04-01

    Wasabi is an open source, web-based environment for evolutionary sequence analysis. Wasabi visualizes sequence data together with a phylogenetic tree within a modern, user-friendly interface: The interface hides extraneous options, supports context sensitive menus, drag-and-drop editing, and displays additional information, such as ancestral sequences, associated with specific tree nodes. The Wasabi environment supports reproducibility by automatically storing intermediate analysis steps and includes built-in functions to share data between users and publish analysis results. For computational analysis, Wasabi supports PRANK and PAGAN for phylogeny-aware alignment and alignment extension, and it can be easily extended with other tools. Along with drag-and-drop import of local files, Wasabi can access remote data through URL and import sequence data, GeneTrees and EPO alignments directly from Ensembl. To demonstrate a typical workflow using Wasabi, we reproduce key findings from recent comparative genomics studies, including a reanalysis of the EGLN1 gene from the tiger genome study: These case studies can be browsed within Wasabi at http://wasabiapp.org:8000?id=usecases. Wasabi runs inside a web browser and does not require any installation. One can start using it at http://wasabiapp.org. All source code is licensed under the AGPLv3. PMID:26635364

  20. Recurrent Network Models of Sequence Generation and Memory.

    PubMed

    Rajan, Kanaka; Harvey, Christopher D; Tank, David W

    2016-04-01

    Sequential activation of neurons is a common feature of network activity during a variety of behaviors, including working memory and decision making. Previous network models for sequences and memory emphasized specialized architectures in which a principled mechanism is pre-wired into their connectivity. Here we demonstrate that, starting from random connectivity and modifying a small fraction of connections, a largely disordered recurrent network can produce sequences and implement working memory efficiently. We use this process, called Partial In-Network Training (PINning), to model and match cellular resolution imaging data from the posterior parietal cortex during a virtual memory-guided two-alternative forced-choice task. Analysis of the connectivity reveals that sequences propagate by the cooperation between recurrent synaptic interactions and external inputs, rather than through feedforward or asymmetric connections. Together our results suggest that neural sequences may emerge through learning from largely unstructured network architectures. PMID:26971945

  1. A next-generation marker genotyping platform (AmpSeq) in heterozygous crops: a case study for marker-assisted selection in grapevine

    PubMed Central

    Yang, Shanshan; Fresnedo-Ramírez, Jonathan; Wang, Minghui; Cote, Linda; Schweitzer, Peter; Barba, Paola; Takacs, Elizabeth M; Clark, Matthew; Luby, James; Manns, David C; Sacks, Gavin; Mansfield, Anna Katharine; Londo, Jason; Fennell, Anne; Gadoury, David; Reisch, Bruce; Cadle-Davidson, Lance; Sun, Qi

    2016-01-01

    Marker-assisted selection (MAS) is often employed in crop breeding programs to accelerate and enhance cultivar development, via selection during the juvenile phase and parental selection prior to crossing. Next-generation sequencing and its derivative technologies have been used for genome-wide molecular marker discovery. To bridge the gap between marker development and MAS implementation, this study developed a novel practical strategy with a semi-automated pipeline that incorporates trait-associated single nucleotide polymorphism marker discovery, low-cost genotyping through amplicon sequencing (AmpSeq) and decision making. The results document the development of a MAS package derived from genotyping-by-sequencing using three traits (flower sex, disease resistance and acylated anthocyanins) in grapevine breeding. The vast majority of sequence reads (⩾99%) were from the targeted regions. Across 380 individuals and up to 31 amplicons sequenced in each lane of MiSeq data, most amplicons (83 to 87%) had <10% missing data, and read depth had a median of 220–244×. Several strengths of the AmpSeq platform that make this approach of broad interest in diverse crop species include accuracy, flexibility, speed, high-throughput, low-cost and easily automated analysis. PMID:27257505

  2. Targeted multiplex next-generation sequencing: advances in techniques of mitochondrial and nuclear DNA sequencing for population genomics.

    PubMed

    Hancock-Hanser, Brittany L; Frey, Amy; Leslie, Matthew S; Dutton, Peter H; Archer, Frederick I; Morin, Phillip A

    2013-03-01

    Next-generation sequencing (NGS) is emerging as an efficient and cost-effective tool in population genomic analyses of nonmodel organisms, allowing simultaneous resequencing of many regions of multi-genomic DNA from multiplexed samples. Here, we detail our synthesis of protocols for targeted resequencing of mitochondrial and nuclear loci by generating indexed genomic libraries for multiplexing up to 100 individuals in a single sequencing pool, and then enriching the pooled library using custom DNA capture arrays. Our use of DNA sequence from one species to capture and enrich the sequencing libraries of another species (i.e. cross-species DNA capture) indicates that efficient enrichment occurs when sequences are up to about 12% divergent, allowing us to take advantage of genomic information in one species to sequence orthologous regions in related species. In addition to a complete mitochondrial genome on each array, we have included between 43 and 118 nuclear loci for low-coverage sequencing of between 18 kb and 87 kb of DNA sequence per individual for single nucleotide polymorphism discovery from 50 to 100 individuals in a single sequencing lane. Using this method, we have generated a total of over 500 whole mitochondrial genomes from seven cetacean species and green sea turtles. The greater variation detected in mitogenomes relative to short mtDNA sequences is helping to resolve genetic structure ranging from geographic to species-level differences. These NGS and analysis techniques have allowed for simultaneous population genomic studies of mtDNA and nDNA with greater genomic coverage and phylogeographic resolution than has previously been possible in marine mammals and turtles. PMID:23351075
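
    The roughly 12% divergence limit for cross-species capture suggests a simple pre-check on candidate bait sequences. The sketch below computes percent divergence as the share of mismatched positions between two pre-aligned sequences, ignoring gapped columns; that definition and the function name are assumptions, since the record does not state how divergence was measured.

```python
def percent_divergence(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped aligned positions at which two sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * mismatches / compared

# One difference in ten compared positions.
divergence = percent_divergence("ACGTACGTAC", "ACGTACGTAA")
likely_capturable = divergence <= 12.0  # threshold taken from the abstract
```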

  3. Analysis of Pre-Analytic Factors Affecting the Success of Clinical Next-Generation Sequencing of Solid Organ Malignancies

    PubMed Central

    Chen, Hui; Luthra, Rajyalakshmi; Goswami, Rashmi S.; Singh, Rajesh R.; Roy-Chowdhuri, Sinchita

    2015-01-01

    Application of next-generation sequencing (NGS) technology to routine clinical practice has enabled characterization of personalized cancer genomes to identify patients likely to have a response to targeted therapy. The proper selection of tumor sample for downstream NGS based mutational analysis is critical to generate accurate results and to guide therapeutic intervention. However, multiple pre-analytic factors come into play in determining the success of NGS testing. In this review, we discuss pre-analytic requirements for AmpliSeq PCR-based sequencing using Ion Torrent Personal Genome Machine (PGM) (Life Technologies), a NGS sequencing platform that is often used by clinical laboratories for sequencing solid tumors because of its low input DNA requirement from formalin fixed and paraffin embedded tissue. The success of NGS mutational analysis is affected not only by the input DNA quantity but also by several other factors, including the specimen type, the DNA quality, and the tumor cellularity. Here, we review tissue requirements for solid tumor NGS based mutational analysis, including procedure types, tissue types, tumor volume and fraction, decalcification, and treatment effects. PMID:26343728

  4. Applications and Case Studies of the Next-Generation Sequencing Technologies in Food, Nutrition and Agriculture.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next-generation sequencing technologies are able to produce high-throughput short sequence reads in a cost-effective fashion. The emergence of these technologies has not only facilitated genome sequencing but also changed the landscape of life sciences. Here I survey their major applications ranging...

  5. Next generation sequencers: methods and applications in food-borne pathogens

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencers are able to produce millions of short sequence reads in a high-throughput, low-cost way. The emergence of these technologies has not only facilitated genome sequencing but also started to change the landscape of life sciences. This chapter will survey their methods and app...

  6. Adaptation to an automated platform of algorithmic combinations of advantageous mutations in genes generated using amino acid scanning mutational strategy.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Recent mutational strategies for generating and screening of genes for optimized traits, including directed evolution, domain shuffling, random mutagenesis, and site-directed mutagenesis, have been adapted for automated platforms. Here we discuss the amino acid scanning mutational strategy and its ...

  7. A direct comparison of next generation sequencing enrichment methods using an aortopathy gene panel- clinical diagnostics perspective

    PubMed Central

    2012-01-01

    Background: Aortopathies are a group of disorders characterized by aneurysms, dilation, and tortuosity of the aorta. Because of the phenotypic overlap and genetic heterogeneity of diseases featuring aortopathy, molecular testing is often required for timely and correct diagnosis of affected individuals. In this setting next generation sequencing (NGS) offers several advantages over traditional molecular techniques. Methods: The purpose of our study was to compare NGS enrichment methods for a clinical assay targeting the nine genes known to be associated with aortopathy. RainDance emulsion PCR and SureSelect RNA-bait hybridization capture enrichment methods were directly compared by enriching DNA from eight samples. Enriched samples were barcoded, pooled, and sequenced on the Illumina HiSeq2000 platform. Depth of coverage, consistency of coverage across samples, and the overlap of variants identified were assessed. This data was also compared to whole-exome sequencing data from ten individuals. Results: Read depth was greater and less variable among samples that had been enriched using the RNA-bait hybridization capture enrichment method. In addition, samples enriched by hybridization capture had fewer exons with mean coverage less than 10, reducing the need for follow-up Sanger sequencing. Variant sets produced were 77% concordant, with both techniques yielding similar numbers of discordant variants. Conclusions: When comparing the design flexibility, performance, and cost of the targeted enrichment methods to whole-exome sequencing, the RNA-bait hybridization capture enrichment gene panel offers the better solution for interrogating the aortopathy genes in a clinical laboratory setting. PMID:23148498
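
    Two of the comparisons reported here, variant-set concordance and exons falling below a mean coverage of 10, can be sketched as follows. Concordance is taken as shared calls divided by all distinct calls from either method, which is an assumed definition; the record does not say how the 77% figure was computed. Variant tuples and exon labels are hypothetical.

```python
def variant_concordance(calls_a, calls_b):
    """Return (shared fraction, calls only in A, calls only in B).

    Each call is a (chromosome, position, ref, alt) tuple.
    """
    set_a, set_b = set(calls_a), set(calls_b)
    union = set_a | set_b
    shared = set_a & set_b
    fraction = len(shared) / len(union) if union else 1.0
    return fraction, sorted(set_a - set_b), sorted(set_b - set_a)

def low_coverage_exons(mean_depth_by_exon, min_depth=10):
    """Exons whose mean depth is below the threshold (candidates for Sanger follow-up)."""
    return sorted(e for e, depth in mean_depth_by_exon.items() if depth < min_depth)

pcr_calls = [("15", 100, "A", "G"), ("15", 200, "C", "T"), ("3", 50, "G", "A")]
capture_calls = [("15", 100, "A", "G"), ("15", 200, "C", "T"), ("9", 7, "T", "C")]
fraction, only_pcr, only_capture = variant_concordance(pcr_calls, capture_calls)
flagged = low_coverage_exons({"exon_1": 8.5, "exon_2": 120.0, "exon_3": 10.0})
```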

  8. Using ultra-sensitive next generation sequencing to dissect DNA damage-induced mutagenesis.

    PubMed

    Wang, Kaile; Ma, Xiaolu; Zhang, Xue; Wu, Dafei; Sun, Chenyi; Sun, Yazhou; Lu, Xuemei; Wu, Chung-I; Guo, Caixia; Ruan, Jue

    2016-01-01

    Next generation sequencing (NGS) technologies have dramatically improved studies in biology and biomedical science. However, no optimal NGS approach is available to conveniently analyze low frequency mutations caused by DNA damage treatments. Here, by developing an exquisite ultra-sensitive NGS (USNGS) platform "EasyMF" and incorporating it with a widely used supF shuttle vector-based mutagenesis system, we can conveniently dissect roles of lesion bypass polymerases in damage-induced mutagenesis. In this improved mutagenesis analysis pipeline, the initial steps are the same as in the supF mutation assay, involving damaging the pSP189 plasmid followed by its transfection into human 293T cells to allow replication to occur. Then "EasyMF" is employed to replace downstream MBM7070 bacterial transformation and other steps for analyzing damage-induced mutation frequencies and spectra. This pipeline was validated by using UV damaged plasmid after its replication in lesion bypass polymerase-deficient 293T cells. The increased throughput and reduced cost of this system will allow us to conveniently screen regulators of translesion DNA synthesis pathway and monitor environmental genotoxic substances, which can ultimately provide insight into the mechanisms of genome stability and mutagenesis. PMID:27122023

  9. Using ultra-sensitive next generation sequencing to dissect DNA damage-induced mutagenesis

    PubMed Central

    Wang, Kaile; Ma, Xiaolu; Zhang, Xue; Wu, Dafei; Sun, Chenyi; Sun, Yazhou; Lu, Xuemei; Wu, Chung-I; Guo, Caixia; Ruan, Jue

    2016-01-01

    Next generation sequencing (NGS) technologies have dramatically improved studies in biology and biomedical science. However, no optimal NGS approach is available to conveniently analyze low frequency mutations caused by DNA damage treatments. Here, by developing an exquisite ultra-sensitive NGS (USNGS) platform “EasyMF” and incorporating it with a widely used supF shuttle vector-based mutagenesis system, we can conveniently dissect roles of lesion bypass polymerases in damage-induced mutagenesis. In this improved mutagenesis analysis pipeline, the initial steps are the same as in the supF mutation assay, involving damaging the pSP189 plasmid followed by its transfection into human 293T cells to allow replication to occur. Then “EasyMF” is employed to replace downstream MBM7070 bacterial transformation and other steps for analyzing damage-induced mutation frequencies and spectra. This pipeline was validated by using UV damaged plasmid after its replication in lesion bypass polymerase-deficient 293T cells. The increased throughput and reduced cost of this system will allow us to conveniently screen regulators of translesion DNA synthesis pathway and monitor environmental genotoxic substances, which can ultimately provide insight into the mechanisms of genome stability and mutagenesis. PMID:27122023

  10. Vy-PER: eliminating false positive detection of virus integration events in next generation sequencing data

    PubMed Central

    Forster, Michael; Szymczak, Silke; Ellinghaus, David; Hemmrich, Georg; Rühlemann, Malte; Kraemer, Lars; Mucha, Sören; Wienbrandt, Lars; Stanulla, Martin; Franke, Andre

    2015-01-01

    Several pathogenic viruses such as hepatitis B and human immunodeficiency viruses may integrate into the host genome. These virus/host integrations are detectable using paired-end next generation sequencing. However, the low number of expected true virus integrations may be difficult to distinguish from the noise of many false positive candidates. Here, we propose a novel filtering approach that increases specificity without compromising sensitivity for virus/host chimera detection. Our detection pipeline termed Vy-PER (Virus integration detection bY Paired End Reads) outperforms existing similar tools in speed and accuracy. We analysed whole genome data from childhood acute lymphoblastic leukemia (ALL), which is characterised by genomic rearrangements and usually associated with radiation exposure. This analysis was motivated by the recently reported virus integrations at genomic rearrangement sites and association with chromosomal instability in liver cancer. However, as expected, our analysis of 20 tumour and matched germline genomes from ALL patients finds no significant evidence for integrations by known viruses. Nevertheless, our method eliminates 12,800 false positives per genome (80× coverage) and only our method detects singleton human-phiX174-chimeras caused by optical errors of the Illumina HiSeq platform. This high accuracy is useful for detecting low virus integration levels as well as non-integrated viruses. PMID:26166306

  11. Vy-PER: eliminating false positive detection of virus integration events in next generation sequencing data.

    PubMed

    Forster, Michael; Szymczak, Silke; Ellinghaus, David; Hemmrich, Georg; Rühlemann, Malte; Kraemer, Lars; Mucha, Sören; Wienbrandt, Lars; Stanulla, Martin; Franke, Andre

    2015-01-01

    Several pathogenic viruses such as hepatitis B and human immunodeficiency viruses may integrate into the host genome. These virus/host integrations are detectable using paired-end next generation sequencing. However, the low number of expected true virus integrations may be difficult to distinguish from the noise of many false positive candidates. Here, we propose a novel filtering approach that increases specificity without compromising sensitivity for virus/host chimera detection. Our detection pipeline termed Vy-PER (Virus integration detection bY Paired End Reads) outperforms existing similar tools in speed and accuracy. We analysed whole genome data from childhood acute lymphoblastic leukemia (ALL), which is characterised by genomic rearrangements and usually associated with radiation exposure. This analysis was motivated by the recently reported virus integrations at genomic rearrangement sites and association with chromosomal instability in liver cancer. However, as expected, our analysis of 20 tumour and matched germline genomes from ALL patients finds no significant evidence for integrations by known viruses. Nevertheless, our method eliminates 12,800 false positives per genome (80× coverage) and only our method detects singleton human-phiX174-chimeras caused by optical errors of the Illumina HiSeq platform. This high accuracy is useful for detecting low virus integration levels as well as non-integrated viruses. PMID:26166306

  12. PileLine: a toolbox to handle genome position information in next-generation sequencing studies

    PubMed Central

    2011-01-01

    Background: Genomic position (GP) files currently used in next-generation sequencing (NGS) studies are always difficult to manipulate due to their huge size and the lack of appropriate tools to properly manage them. The structure of these flat files is based on representing one line per position that has been covered by at least one aligned read, imposing significant restrictions from a computational performance perspective. Results: PileLine implements a flexible command-line toolkit providing specific support to the management, filtering, comparison and annotation of GP files produced by NGS experiments. PileLine tools are coded in Java and run on both UNIX (Linux, Mac OS) and Windows platforms. The set of tools comprising PileLine are designed to be memory efficient by performing fast seek on-disk operations over sorted GP files. Conclusions: Our novel toolbox has been extensively tested taking into consideration performance issues. It is publicly available at http://sourceforge.net/projects/pilelinetools under the GNU LGPL license. Full documentation including common use cases and guided analysis workflows is available at http://sing.ei.uvigo.es/pileline. PMID:21261974
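
    The benefit of keeping GP records sorted is that a position range can be located by binary search instead of a full scan. The sketch below shows the idea on an in-memory list; it is an analogue only, written in Python rather than PileLine's Java, and it is not PileLine's on-disk implementation. The class name and record layout are assumptions, and chromosomes are assumed to be sorted as plain strings.

```python
from bisect import bisect_left, bisect_right

class SortedPositionIndex:
    """Range lookup over records sorted by (chromosome, position)."""

    def __init__(self, records):
        # records: (chromosome, position, line) tuples, already sorted.
        self.records = list(records)
        self.keys = [(chrom, pos) for chrom, pos, _ in self.records]

    def query(self, chrom, start, end):
        """All records on `chrom` with start <= position <= end."""
        lo = bisect_left(self.keys, (chrom, start))
        hi = bisect_right(self.keys, (chrom, end))
        return self.records[lo:hi]

index = SortedPositionIndex([
    ("chr1", 5, "a"), ("chr1", 10, "b"), ("chr1", 20, "c"), ("chr2", 3, "d"),
])
hits = index.query("chr1", 6, 20)
```

    Each query costs two logarithmic searches plus the size of the result, which is what makes working with one-line-per-position files tractable.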

  13. HLA genotyping in the clinical laboratory: comparison of next-generation sequencing methods.

    PubMed

    Profaizer, T; Lázár-Molnár, E; Close, D W; Delgado, J C; Kumánovics, A

    2016-07-01

    Implementation of human leukocyte antigen (HLA) genotyping by next-generation sequencing (NGS) in the clinical lab brings new challenges to the laboratories performing this testing. With the advent of commercially available HLA-NGS typing kits, labs must make numerous decisions concerning capital equipment and address labor considerations. Therefore, careful and unbiased evaluation of available methods is imperative. In this report, we compared our in-house developed HLA NGS typing with two commercially available kits from Illumina and Omixon using 10 International Histocompatibility Working Group (IHWG) and 36 clinical samples. Although all three methods employ long range polymerase chain reaction (PCR) and have been developed on the Illumina MiSeq platform, the methodologies for library preparation show significant variations. There was 100% typing concordance between all three methods at the first field when a HLA type could be assigned. Overall, HLA typing by NGS using in-house or commercially available methods is now feasible in clinical laboratories. However, technical variables such as hands-on time and indexing strategies are sufficiently different among these approaches to impact the workflow of the clinical laboratory. PMID:27524804

  14. Big data challenges in bone research: genome-wide association studies and next-generation sequencing

    PubMed Central

    Alonso, Nerea; Lucas, Gavin; Hysi, Pirro

    2015-01-01

    Genome-wide association studies (GWAS) have been developed as a practical method to identify genetic loci associated with disease by scanning multiple markers across the genome. Significant advances in the genetics of complex diseases have been made owing to advances in genotyping technologies, the progress of projects such as HapMap and 1000G and the emergence of genetics as a collaborative discipline. Because of its great potential to be used in parallel by multiple collaborators, it is important to adhere to strict protocols assuring data quality and analyses. Quality control analyses must be applied to each sample and each single-nucleotide polymorphism (SNP). The software package PLINK is capable of performing the whole range of necessary quality control tests. Genotype imputation has also been developed to substantially increase the power of GWAS methodology. Imputation permits the investigation of associations at genetic markers that are not directly genotyped. Results of individual GWAS reports can be combined through meta-analysis. Finally, next-generation sequencing (NGS) has gained popularity in recent years through its capacity to analyse a much greater number of markers across the genome. Although NGS platforms are capable of examining a higher number of SNPs compared with GWA studies, the results obtained by NGS require careful interpretation, as their biological correlation is incompletely understood. In this article, we will discuss the basic features of such protocols. PMID:25709812

  15. Mutation based treatment recommendations from next generation sequencing data: a comparison of web tools

    PubMed Central

    Patel, Jaymin M.; Knopf, Joshua; Reiner, Eric; Bossuyt, Veerle; Epstein, Lianne; DiGiovanna, Michael; Chung, Gina; Silber, Andrea; Sanft, Tara; Hofstatter, Erin; Mougalian, Sarah; Abu-Khalaf, Maysa; Platt, James; Shi, Weiwei; Gershkovich, Peter; Hatzis, Christos; Pusztai, Lajos

    2016-01-01

    Interpretation of complex cancer genome data, generated by tumor target profiling platforms, is key for the success of personalized cancer therapy. How to draw therapeutic conclusions from tumor profiling results is not standardized and may vary among commercial and academically-affiliated recommendation tools. We performed targeted sequencing of 315 genes from 75 metastatic breast cancer biopsies using the FoundationOne assay. Results were run through 4 different web tools including the Drug-Gene Interaction Database (DGidb), My Cancer Genome (MCG), Personalized Cancer Therapy (PCT), and cBioPortal, for drug and clinical trial recommendations. These recommendations were compared amongst each other and to those provided by FoundationOne. The identification of a gene as targetable varied across the different recommendation sources. Only 33% of cases had 4 or more sources recommend the same drug for at least one of the usually several altered genes found in tumor biopsies. These results indicate further development and standardization of broadly applicable software tools that assist in our therapeutic interpretation of genomic data is needed. Existing algorithms for data acquisition, integration and interpretation will likely need to incorporate artificial intelligence tools to improve both content and real-time status. PMID:26980737

  16. CDH1 mutations in gastric cancer patients from northern Brazil identified by Next-Generation Sequencing (NGS)

    PubMed Central

    El-Husny, Antonette; Raiol-Moraes, Milene; Amador, Marcos; Ribeiro-dos-Santos, André M.; Montagnini, André; Barbosa, Silvanira; Silva, Artur; Assumpção, Paulo; Ishak, Geraldo; Santos, Sidney; Pinto, Pablo; Cruz, Aline; Ribeiro-dos-Santos, Ândrea

    2016-01-01

    Gastric cancer is considered to be the fifth highest incident tumor worldwide and the third leading cause of cancer deaths. Developing regions report a higher number of sporadic cases, but there are only a few local studies related to hereditary cases of gastric cancer in Brazil to confirm this fact. CDH1 germline mutations have been described both in familial and sporadic cases, but there is only one recent molecular description of individuals from Brazil. In this study we performed Next Generation Sequencing (NGS) to assess CDH1 germline mutations in individuals who match the clinical criteria for Hereditary Diffuse Gastric Cancer (HDGC), or who exhibit very early diagnosis of gastric cancer. Among five probands we detected CDH1 germline mutations in two cases (40%). The mutation c.1023T > G was found in an HDGC family and the mutation c.1849G > A, which is nearly exclusive to African populations, was found in an early-onset case of gastric adenocarcinoma. The mutations described highlight the existence of gastric cancer cases caused by CDH1 germline mutations in northern Brazil, although such information is frequently ignored due to the existence of a large number of environmental factors locally. Our report represents the first CDH1 mutations in HDGC described from Brazil by an NGS platform. PMID:27192129

  17. CDH1 mutations in gastric cancer patients from northern Brazil identified by Next-Generation Sequencing (NGS).

    PubMed

    El-Husny, Antonette; Raiol-Moraes, Milene; Amador, Marcos; Ribeiro-Dos-Santos, André M; Montagnini, André; Barbosa, Silvanira; Silva, Artur; Assumpção, Paulo; Ishak, Geraldo; Santos, Sidney; Pinto, Pablo; Cruz, Aline; Ribeiro-Dos-Santos, Ândrea

    2016-05-13

    Gastric cancer is considered to be the fifth highest incident tumor worldwide and the third leading cause of cancer deaths. Developing regions report a higher number of sporadic cases, but there are only a few local studies related to hereditary cases of gastric cancer in Brazil to confirm this fact. CDH1 germline mutations have been described both in familial and sporadic cases, but there is only one recent molecular description of individuals from Brazil. In this study we performed Next Generation Sequencing (NGS) to assess CDH1 germline mutations in individuals who match the clinical criteria for Hereditary Diffuse Gastric Cancer (HDGC), or who exhibit very early diagnosis of gastric cancer. Among five probands we detected CDH1 germline mutations in two cases (40%). The mutation c.1023T > G was found in an HDGC family and the mutation c.1849G > A, which is nearly exclusive to African populations, was found in an early-onset case of gastric adenocarcinoma. The mutations described highlight the existence of gastric cancer cases caused by CDH1 germline mutations in northern Brazil, although such information is frequently ignored due to the existence of a large number of environmental factors locally. Our report represents the first CDH1 mutations in HDGC described from Brazil by an NGS platform. PMID:27192129

  18. Mutation based treatment recommendations from next generation sequencing data: a comparison of web tools.

    PubMed

    Patel, Jaymin M; Knopf, Joshua; Reiner, Eric; Bossuyt, Veerle; Epstein, Lianne; DiGiovanna, Michael; Chung, Gina; Silber, Andrea; Sanft, Tara; Hofstatter, Erin; Mougalian, Sarah; Abu-Khalaf, Maysa; Platt, James; Shi, Weiwei; Gershkovich, Peter; Hatzis, Christos; Pusztai, Lajos

    2016-04-19

    Interpretation of complex cancer genome data, generated by tumor target profiling platforms, is key for the success of personalized cancer therapy. How to draw therapeutic conclusions from tumor profiling results is not standardized and may vary among commercial and academically-affiliated recommendation tools. We performed targeted sequencing of 315 genes from 75 metastatic breast cancer biopsies using the FoundationOne assay. Results were run through 4 different web tools including the Drug-Gene Interaction Database (DGidb), My Cancer Genome (MCG), Personalized Cancer Therapy (PCT), and cBioPortal, for drug and clinical trial recommendations. These recommendations were compared amongst each other and to those provided by FoundationOne. The identification of a gene as targetable varied across the different recommendation sources. Only 33% of cases had 4 or more sources recommend the same drug for at least one of the usually several altered genes found in tumor biopsies. These results indicate further development and standardization of broadly applicable software tools that assist in our therapeutic interpretation of genomic data is needed. Existing algorithms for data acquisition, integration and interpretation will likely need to incorporate artificial intelligence tools to improve both content and real-time status. PMID:26980737

  19. A new, improved and generalizable approach for the analysis of biological data generated by -omic platforms.

    PubMed

    Pleasants, A B; Wake, G C; Shorten, P R; Hassell-Sweatman, C Z W; McLean, C A; Holbrook, J D; Gluckman, P D; Sheppard, A M

    2015-02-01

    The principles embodied by the Developmental Origins of Health and Disease (DOHaD) view of 'life history' trajectory are increasingly underpinned by biological data arising from molecular-based epigenomic and transcriptomic studies. Although a number of 'omic' platforms are now routinely and widely used in biology and medicine, data generation is frequently confounded by a frequency distribution in the measurement error (an inherent feature of the chemistry and physics of the measurement process), which adversely affects the accuracy of estimation and thus the inference of relationships to other biological measures such as phenotype. Based on empirically derived data, we have previously derived a probability density function to capture such errors and thus improve the confidence of estimation and inference based on such data. Here we use published open source data sets to calculate parameter values relevant to the most widely used epigenomic and transcriptomic technologies. Then, by using our own data sets, we illustrate the benefits of this approach by specific application, to measurement of DNA methylation in this instance, in cases where levels of methylation at specific genomic sites represent either (1) a response variable or (2) an independent variable. Further, we extend this formulation to consideration of the 'bivariate' case, in which the co-dependency of methylation levels at two distinct genomic sites is tested for biological significance. These tools not only allow greater accuracy of measurement and improved confidence of functional inference, but in the case of epigenomic data at least, also reveal otherwise cryptic information. PMID:25335490

  20. Generating long sequences of high-intensity femtosecond pulses.

    PubMed

    Bitter, M; Milner, V

    2016-02-01

    We present an approach to creating pulse sequences extending beyond 150 ps in duration, comprised of 100 μJ femtosecond pulses. A quarter of the pulse train is produced by a high-resolution pulse shaper, which allows full controllability over the timing of each pulse. Two nested Michelson interferometers follow to quadruple the pulse number and the sequence duration. To boost the pulse energy, the long train is sent through a multipass Ti:sapphire amplifier, followed by an external compressor. A periodic sequence of 84 pulses of 120 fs width and an average pulse energy of 107 μJ, separated by 2 ps, is demonstrated as a proof of principle. PMID:26836087
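    The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is only an illustration, not the authors' analysis: the function names are hypothetical, and the train span is taken as the time from the first to the last pulse, which is an assumption about how the duration is defined.

```python
# Figures quoted in the abstract above.
TOTAL_PULSES = 84          # pulses in the demonstrated periodic train
SEPARATION_PS = 2.0        # spacing between neighbouring pulses, in picoseconds
INTERFEROMETER_FACTOR = 4  # two nested Michelson interferometers quadruple the count


def shaper_pulses(total: int, factor: int) -> int:
    """Pulses the shaper must supply before the interferometers multiply them."""
    if total % factor:
        raise ValueError("total pulse count must be divisible by the factor")
    return total // factor


def train_span_ps(n_pulses: int, separation_ps: float) -> float:
    """Time from the first to the last pulse of a periodic train (assumed definition)."""
    return (n_pulses - 1) * separation_ps


if __name__ == "__main__":
    print(shaper_pulses(TOTAL_PULSES, INTERFEROMETER_FACTOR))  # a quarter of the train
    print(train_span_ps(TOTAL_PULSES, SEPARATION_PS))          # span in picoseconds
```

    Under these assumptions the shaper supplies 21 of the 84 pulses and the train spans 166 ps, consistent with the stated duration "beyond 150 ps".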

  1. Autonomously generating operations sequences for a Mars Rover using AI-based planning

    NASA Technical Reports Server (NTRS)

    Sherwood, Rob; Mishkin, Andrew; Estlin, Tara; Chien, Steve; Backes, Paul; Cooper, Brian; Maxwell, Scott; Rabideau, Gregg

    2001-01-01

    This paper discusses a proof-of-concept prototype for ground-based automatic generation of validated rover command sequences from high-level science and engineering activities. This prototype is based on ASPEN, the Automated Scheduling and Planning Environment. This Artificial Intelligence (AI) based planning and scheduling system will automatically generate a command sequence that will execute within resource constraints and satisfy flight rules.

  2. Coinfection of Fusobacterium nucleatum and Actinomyces israelii in Mastoiditis Diagnosed by Next-Generation DNA Sequencing

    PubMed Central

    Hoogestraat, Daniel R.; Abbott, April N.; SenGupta, Dhruba J.; Cummings, Lisa A.; Butler-Wu, Susan M.; Stephens, Karen; Cookson, Brad T.; Hoffman, Noah G.

    2014-01-01

    Some bacterial infections involve potentially complex mixtures of species that can now be distinguished using next-generation DNA sequencing. We present a case of mastoiditis where Gram stain, culture, and molecular diagnosis were nondiagnostic or discrepant. Next-generation sequencing implicated coinfection of Fusobacterium nucleatum and Actinomyces israelii, resolving these diagnostic discrepancies. PMID:24574281

  3. Microsatellites from Fosterella christophii (Bromeliaceae) by de novo transcriptome sequencing on the Pacific Biosciences RS platform

    PubMed Central

    Wöhrmann, Tina; Huettel, Bruno; Wagner, Natascha; Weising, Kurt

    2016-01-01

    Premise of the study: Microsatellite markers were developed in Fosterella christophii (Bromeliaceae) to investigate the genetic diversity and population structure within the F. micrantha group, comprising F. christophii, F. micrantha, and F. villosula. Methods and Results: Full-length cDNAs were isolated from F. christophii and sequenced on a Pacific Biosciences RS platform. A total of 1590 high-quality consensus isoforms were assembled into 971 unigenes containing 421 perfect microsatellites. Thirty primer sets were designed, of which 13 revealed a high level of polymorphism in three populations of F. christophii, with four to nine alleles per locus. Each of these 13 loci cross-amplified in the closely related species F. micrantha and F. villosula, with one to six and one to 11 alleles per locus, respectively. Conclusions: The new markers are promising tools to study the population genetics of F. christophii and to discover species boundaries within the F. micrantha group. PMID:26819858
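    The abstract does not say how "perfect microsatellites" were detected in the unigenes. As a rough illustration only, the sketch below finds uninterrupted tandem repeats with a regular expression. The function name, the motif lengths (2 to 6 bp) and the minimum of five repeats are assumptions chosen for the example, not the criteria used in the study.

```python
import re


def find_perfect_ssrs(seq, min_motif=2, max_motif=6, min_repeats=5):
    """Return (start, motif, repeat_count) for each perfect tandem repeat in seq.

    A repeat is 'perfect' here when the same motif recurs back to back with
    no interruption. Thresholds are illustrative assumptions.
    """
    # ([ACGT]{2,6}?) captures the shortest motif; \1{n,} demands n more copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip single-base runs such as AAAAAAAAAA.
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


if __name__ == "__main__":
    example = "TTG" + "AC" * 6 + "GG"  # made-up sequence, not from the study
    print(find_perfect_ssrs(example))
```

    Dedicated microsatellite search tools apply motif-specific thresholds and also handle compound repeats, so this sketch should be read as a simplification of the idea.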

  4. Validation of targeted next-generation sequencing for RAS mutation detection in FFPE colorectal cancer tissues: comparison with Sanger sequencing and ARMS-Scorpion real-time PCR

    PubMed Central

    Gao, Jie; Wu, Huanwen; Wang, Li; Zhang, Hui; Duan, Huanli; Lu, Junliang; Liang, Zhiyong

    2016-01-01

    Objective To validate the targeted next-generation sequencing (NGS) platform, Ion Torrent PGM, for detection of KRAS exon 2 and expanded RAS mutations in formalin-fixed paraffin-embedded (FFPE) colorectal cancer (CRC) specimens, in comparison with Sanger sequencing and ARMS-Scorpion real-time PCR. Setting Beijing, China. Participants 51 archived FFPE CRC samples (36 men, 15 women) were retrospectively randomly selected and then checked by an experienced pathologist for sequencing based on histological confirmation of CRC and availability of sufficient tissue. Methods RAS mutations were detected in the 51 FFPE CRC samples by PGM analysis, Sanger sequencing and the Therascreen KRAS assay, respectively. Agreement among the 3 methods was assessed. Assay sensitivity was further determined by sequencing serially diluted DNA from FFPE cell lines with known mutation statuses. Results 13 of 51 (25.5%) cases had a mutation in KRAS exon 2, as determined by PGM analysis. PGM analysis showed 100% (51/51) concordance with Sanger sequencing (κ=1.000, 95% CI 1 to 1) and 98.04% (50/51) agreement with the Therascreen assay (κ=0.947, 95% CI 0.844 to 1) for detecting KRAS exon 2 mutations, respectively. The only discrepant case harboured a KRAS exon 2 mutation (c.37G>T) that was not covered by the Therascreen kit. The dilution series experiment results showed that PGM was able to detect KRAS mutations at a frequency as low as 1%. Importantly, RAS mutations other than KRAS exon 2 mutations were also detected in 10 samples by PGM. Furthermore, mutations in other CRC-related genes could be simultaneously detected in a single test by PGM. Conclusions The targeted NGS platform is specific and sensitive for KRAS exon 2 mutation detection and is appropriate for use in routine clinical testing. Moreover, it is sample-saving, cost-efficient and time-efficient, and has great potential for clinical application to expand testing to include mutations in RAS and other CRC-related genes. PMID
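    The agreement statistics in this abstract can be reproduced with Cohen's kappa on a 2x2 table. The sketch below is an illustration under stated assumptions: the cell counts are inferred from the abstract (13 PGM-positive cases out of 51, and a single discrepant case that was positive by PGM and missed by Therascreen), not taken from the paper's tables, and the function name is hypothetical.

```python
def cohen_kappa(both_pos, only_a, only_b, both_neg):
    """Cohen's kappa for two binary raters, given the four cells of a 2x2 table."""
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n   # raw agreement
    a_pos = (both_pos + only_a) / n        # positive rate of method A
    b_pos = (both_pos + only_b) / n        # positive rate of method B
    # Agreement expected by chance if the two methods were independent.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    if expected == 1.0:
        return 1.0  # degenerate table: both methods give one constant answer
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Assumed tables (A = PGM). Sanger: all 51 calls agree, 13 positive.
    print(round(cohen_kappa(13, 0, 0, 38), 3))
    # Therascreen: one PGM-positive case not called by the kit.
    print(round(cohen_kappa(12, 1, 0, 38), 3))
```

    With these inferred counts the second table gives 912/963, or about 0.947, which matches the kappa reported for the Therascreen comparison; the first gives 1.000, as reported for Sanger sequencing.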

  5. Structural variation detection using next-generation sequencing data: A comparative technical review.

    PubMed

    Guan, Peiyong; Sung, Wing-Kin

    2016-06-01

    Structural variations (SVs) are mutations in the genome at least fifty nucleotides in size. They contribute to the phenotypic differences among healthy individuals and cause severe diseases, including cancers, by breaking or linking genes. Thus, it is crucial to systematically profile SVs in the genome. In the past decade, many next-generation sequencing (NGS)-based SV detection methods have been proposed due to the significant cost reduction of NGS experiments and their ability to detect SVs without bias at base-pair resolution. These SV detection methods vary in both sensitivity and specificity, since they use different SV-property-dependent and library-property-dependent features. As a result, predictions from different SV callers are often inconsistent. In addition, noise in the data (both platform-specific sequencing errors and artificial chimeric reads) impedes the specificity of SV detection. Poorly characterized regions in the human genome (e.g., repeat regions) greatly impact read mapping and in turn affect SV calling accuracy. Calling of complex SVs requires specialized SV callers. Apart from accuracy, the processing speed of an SV caller is another factor deciding its usability. Knowing the pros and cons of different SV calling techniques and the objectives of the biological study is essential for biologists and bioinformaticians to make informed decisions. This paper describes different components in the SV calling pipeline and reviews the techniques used by existing SV callers. Through a simulation study, we also demonstrate that library properties, especially insert size, greatly impact the sensitivity of different SV callers. We hope the community can benefit from this work both in designing new SV calling methods and in selecting the appropriate SV caller for specific biological studies. PMID:26845461

  6. Complete mitochondrial genome sequence of Heteropneustes fossilis obtained by paired end next generation sequencing.

    PubMed

    Sahoo, Lakshman; Kumar, Santosh; Das, Sofia P; Patnaik, Siddhi; Bit, Amrita; Sundaray, Jitendra Kumar; Jayasankar, P; Das, Paramananda

    2016-07-01

    In the present study, the complete mitochondrial genome sequence of Heteropneustes fossilis is reported using massively parallel sequencing technology. The complete mitogenome of H. fossilis, 16,489 bp in length, was obtained by de novo assembly of paired-end Illumina sequences using CLC Genomics Workbench version 7.0.4. It comprises 13 protein-coding genes, 22 tRNAs, 2 rRNA genes and a putative control region, with gene order and organization similar to those of most vertebrates. The mitogenome in the present study has 99% similarity to the complete mitogenome sequence of H. fossilis reported earlier. Phylogenetic analysis of Siluriformes showed that Heteropneustids were closer to Clariids. The mitogenome sequence of H. fossilis contributes to a better understanding of the population genetics, phylogenetics and evolution of Indian catfish species. PMID:26016883

  7. Effort required to finish shotgun-generated genome sequences differs significantly among vertebrates

    PubMed Central

    2010-01-01

    Background The approaches for shotgun-based sequencing of vertebrate genomes are now well-established, and have resulted in the generation of numerous draft whole-genome sequence assemblies. In contrast, the process of refining those assemblies to improve contiguity and increase accuracy (known as 'sequence finishing') remains tedious, labor-intensive, and expensive. As a result, the vast majority of vertebrate genome sequences generated to date remain at a draft stage. Results To date, our genome sequencing efforts have focused on comparative studies of targeted genomic regions, requiring sequence finishing of large blocks of orthologous sequence (average size 0.5-2 Mb) from various subsets of 75 vertebrates. This experience has provided a unique opportunity to compare the relative effort required to finish shotgun-generated genome sequence assemblies from different species, which we report here. Importantly, we found that the sequence assemblies generated for the same orthologous regions from various vertebrates show substantial variation with respect to misassemblies and, in particular, the frequency and characteristics of sequence gaps. As a consequence, the work required to finish different species' sequences varied greatly. Application of the same standardized methods for finishing provided a novel opportunity to "assay" characteristics of genome sequences among many vertebrate species. It is important to note that many of the problems we have encountered during sequence finishing reflect unique architectural features of a particular vertebrate's genome, which in some cases may have important functional and/or evolutionary implications. Finally, based on our analyses, we have been able to improve our procedures to overcome some of these problems and to increase the overall efficiency of the sequence-finishing process, although significant challenges still remain. 
    Conclusion Our findings have important implications for the eventual finishing of the draft whole

  8. Using NS5B Sequencing for Hepatitis C Virus Genotyping Reveals Discordances with Commercial Platforms

    PubMed Central

    Chueca, Natalia; Rivadulla, Isidro; Lovatti, Rubén; Reina, Gabriel; Blanco, Ana; Fernandez-Caballero, Jose Angel; Cardeñoso, Laura; Rodriguez-Granjer, Javier; Fernandez-Alonso, Miriam; Aguilera, Antonio; Alvarez, Marta

    2016-01-01

    We aimed to evaluate the correct assignment of HCV genotypes by three commercial methods—Trugene HCV genotyping kit (Siemens), VERSANT HCV Genotype 2.0 assay (Siemens), and Real-Time HCV genotype II (Abbott)—compared to NS5B sequencing. We studied 327 clinical samples that carried representative HCV genotypes of the most frequent geno/subtypes in Spain. After commercial genotyping, the sequencing of a 367 bp fragment in the NS5B gene was used to assign genotypes. Major discrepancies were defined, e.g. differences in the assigned genotype by one of the three methods and NS5B sequencing, including misclassification of subtypes 1a and 1b. Minor discrepancies were considered when differences at subtype levels, other than 1a and 1b, were observed. The overall discordance with the reference method was 34% for Trugene and 15% for VERSANT HCV2.0. The Abbott assay correctly identified all 1a and 1b subtypes, but did not subtype all the 2, 3, 4 and 5 (34%) genotypes. Major discordances were found in 16% of cases for Trugene HCV, and the majority were 1b- to 1a-related discordances; major discordances were found for VERSANT HCV 2.0 in 6% of cases, which were all but one 1b to 1a cases. These results indicated that the Trugene assay especially, and to a lesser extent, Versant HCV 2.0, can fail to differentiate HCV subtypes 1a and 1b, and lead to critical errors in clinical practice for correctly using directly acting antiviral agents. PMID:27097040

  9. Using NS5B Sequencing for Hepatitis C Virus Genotyping Reveals Discordances with Commercial Platforms.

    PubMed

    Chueca, Natalia; Rivadulla, Isidro; Lovatti, Rubén; Reina, Gabriel; Blanco, Ana; Fernandez-Caballero, Jose Angel; Cardeñoso, Laura; Rodriguez-Granjer, Javier; Fernandez-Alonso, Miriam; Aguilera, Antonio; Alvarez, Marta; Galán, Juan Carlos; García, Federico

    2016-01-01

    We aimed to evaluate the correct assignment of HCV genotypes by three commercial methods-Trugene HCV genotyping kit (Siemens), VERSANT HCV Genotype 2.0 assay (Siemens), and Real-Time HCV genotype II (Abbott)-compared to NS5B sequencing. We studied 327 clinical samples that carried representative HCV genotypes of the most frequent geno/subtypes in Spain. After commercial genotyping, the sequencing of a 367 bp fragment in the NS5B gene was used to assign genotypes. Major discrepancies were defined, e.g. differences in the assigned genotype by one of the three methods and NS5B sequencing, including misclassification of subtypes 1a and 1b. Minor discrepancies were considered when differences at subtype levels, other than 1a and 1b, were observed. The overall discordance with the reference method was 34% for Trugene and 15% for VERSANT HCV2.0. The Abbott assay correctly identified all 1a and 1b subtypes, but did not subtype all the 2, 3, 4 and 5 (34%) genotypes. Major discordances were found in 16% of cases for Trugene HCV, and the majority were 1b- to 1a-related discordances; major discordances were found for VERSANT HCV 2.0 in 6% of cases, which were all but one 1b to 1a cases. These results indicated that the Trugene assay especially, and to a lesser extent, Versant HCV 2.0, can fail to differentiate HCV subtypes 1a and 1b, and lead to critical errors in clinical practice for correctly using directly acting antiviral agents. PMID:27097040

  10. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    PubMed

    Alva, Vikram; Nam, Seung-Zin; Söding, Johannes; Lupas, Andrei N

    2016-07-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment. PMID:27131380

  11. Simple and efficient identification of rare recessive pathologically important sequence variants from next generation exome sequence data.

    PubMed

    Carr, Ian M; Morgan, Joanne; Watson, Christopher; Melnik, Svitlana; Diggle, Christine P; Logan, Clare V; Harrison, Sally M; Taylor, Graham R; Pena, Sergio D J; Markham, Alexander F; Alkuraya, Fowzan S; Black, Graeme C M; Ali, Manir; Bonthron, David T

    2013-07-01

    Massively parallel ("next generation") DNA sequencing (NGS) has quickly become the method of choice for seeking pathogenic mutations in rare uncharacterized monogenic diseases. Typically, before DNA sequencing, protein-coding regions are enriched from patient genomic DNA, representing either the entire genome ("exome sequencing") or selected mapped candidate loci. Sequence variants, identified as differences between the patient's and the human genome reference sequences, are then filtered according to various quality parameters. Changes are screened against datasets of known polymorphisms, such as dbSNP and the 1000 Genomes Project, in the effort to narrow the list of candidate causative variants. An increasing number of commercial services now offer to both generate and align NGS data to a reference genome. This potentially allows small groups with limited computing infrastructure and informatics skills to utilize this technology. However, the capability to effectively filter and assess sequence variants is still an important bottleneck in the identification of deleterious sequence variants in both research and diagnostic settings. We have developed an approach to this problem comprising a user-friendly suite of programs that can interactively analyze, filter and screen data from enrichment-capture NGS data. These programs ("Agile Suite") are particularly suitable for small-scale gene discovery or for diagnostic analysis. PMID:23554237
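    The filtering steps described in this abstract (quality thresholds, then screening against known polymorphisms) can be sketched as a short pipeline. This is not the Agile Suite implementation: the `Variant` fields, the thresholds and the homozygous-only rule for a recessive model are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Minimal variant record; the field names are illustrative assumptions."""
    chrom: str
    pos: int
    ref: str
    alt: str
    quality: float
    depth: int
    homozygous: bool


def candidate_variants(variants, known_polymorphisms, min_quality=30.0, min_depth=10):
    """Keep well-supported, previously unreported, homozygous variants."""
    kept = []
    for v in variants:
        if v.quality < min_quality or v.depth < min_depth:
            continue  # fails the quality filters
        if (v.chrom, v.pos, v.ref, v.alt) in known_polymorphisms:
            continue  # already listed as a known polymorphism
        if not v.homozygous:
            continue  # simple recessive model: homozygous calls only
        kept.append(v)
    return kept


if __name__ == "__main__":
    # Made-up records for demonstration only.
    known = {("1", 1000, "A", "G")}
    calls = [
        Variant("1", 1000, "A", "G", 60.0, 40, True),   # known polymorphism
        Variant("2", 2000, "C", "T", 12.0, 35, True),   # low quality
        Variant("3", 3000, "G", "A", 55.0, 30, False),  # heterozygous
        Variant("4", 4000, "T", "C", 70.0, 50, True),   # survives all filters
    ]
    print(candidate_variants(calls, known))
```

    A fuller treatment of recessive disease would also consider compound heterozygous variants in the same gene, which this sketch deliberately leaves out.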

  12. Sea level and geostrophic current control on carbonate shelf-slope depositional sequences and erosional patterns, south Florida platform margin

    SciTech Connect

    Locker, S.D.; Hine, A.C.; Shinn, E.A.

    1991-03-01

    High-resolution seismic reflection profiles across the shelf-slope margin between the Dry Tortugas and Key West, Florida, indicate that sea-level fluctuations and the eastward flowing Florida Current are major controls on late Quaternary sequence stratigraphy. The study area, a transition zone between the open south Florida shelf and the lower Florida Keys island/reef system, is typified by a shallow shelf with reef margin adjacent to a deeper lower-shelf/slope. The lower-shelf/slope is composed of stacked or prograding sequences that downlap and pinch out on the Pourtales Terrace. Strike-oriented stratigraphic sections exhibit many sea-level controlled features such as lowstand erosion, transgressive unconformities, and highstand system tracts. Lowstand reefs, notches, or barriers are observed as deep as 150 m below present sea level. Depositional styles change along-slope from west to east. The western portion of the study area is characterized by thick, low-amplitude, prograding sequences related to abundant supply of sediment through off-shelf transport during high sea levels as well as along-slope reworking by the Florida Current. Part of this section has been severely eroded by along-slope currents, producing localized cut-and-fill structures and widespread erosional unconformities. To the east, a thinner section of high-amplitude reflections is common seaward of the lower Florida Keys reef tract system. This study provides new evidence of how strong geostrophic boundary currents and fluctuating sea levels have interacted to control depositional sequences on a carbonate slope in the Florida/Bahamas platform complex.

  13. Assessing kinetic and epitopic diversity across orthogonal monoclonal antibody generation platforms

    PubMed Central

    Abdiche, Yasmina Noubia; Harriman, Rian; Deng, Xiaodi; Yeung, Yik Andy; Miles, Adam; Morishige, Winse; Boustany, Leila; Zhu, Lei; Izquierdo, Shelley Mettler; Harriman, William

    2016-01-01

    The ability of monoclonal antibodies (mAbs) to target specific antigens with high precision has led to an increasing demand to generate them for therapeutic use in many disease areas. Historically, the discovery of therapeutic mAbs has relied upon the immunization of mammals and various in vitro display technologies. While the routine immunization of rodents yields clones that are stable in serum and have been selected against vast arrays of endogenous, non-target self-antigens, it is often difficult to obtain species cross-reactive mAbs owing to the generally high sequence similarity shared across human antigens and their mammalian orthologs. In vitro display technologies bypass this limitation, but lack an in vivo screening mechanism, and thus may potentially generate mAbs with undesirable binding specificity and stability issues. Chicken immunization is emerging as an attractive mAb discovery method because it combines the benefits of both in vivo and in vitro display methods. Since chickens are phylogenetically separated from mammals, their proteins share less sequence homology with those of humans, so human proteins are often immunogenic and can readily elicit rodent cross-reactive clones, which are necessary for in vivo proof of mechanism studies. Here, we compare the binding characteristics of mAbs isolated from chicken immunization, mouse immunization, and phage display of human antibody libraries. Our results show that chicken-derived mAbs not only recapitulate the kinetic diversity of mAbs sourced from other methods, but appear to offer an expanded repertoire of epitopes. Further, chicken-derived mAbs can bind their native serum antigen with very high affinity, highlighting their therapeutic potential. PMID:26652308

  14. Assessing kinetic and epitopic diversity across orthogonal monoclonal antibody generation platforms.

    PubMed

    Abdiche, Yasmina Noubia; Harriman, Rian; Deng, Xiaodi; Yeung, Yik Andy; Miles, Adam; Morishige, Winse; Boustany, Leila; Zhu, Lei; Izquierdo, Shelley Mettler; Harriman, William

    2016-01-01

    The ability of monoclonal antibodies (mAbs) to target specific antigens with high precision has led to an increasing demand to generate them for therapeutic use in many disease areas. Historically, the discovery of therapeutic mAbs has relied upon the immunization of mammals and various in vitro display technologies. While the routine immunization of rodents yields clones that are stable in serum and have been selected against vast arrays of endogenous, non-target self-antigens, it is often difficult to obtain species cross-reactive mAbs owing to the generally high sequence similarity shared across human antigens and their mammalian orthologs. In vitro display technologies bypass this limitation, but lack an in vivo screening mechanism, and thus may potentially generate mAbs with undesirable binding specificity and stability issues. Chicken immunization is emerging as an attractive mAb discovery method because it combines the benefits of both in vivo and in vitro display methods. Since chickens are phylogenetically separated from mammals, their proteins share less sequence homology with those of humans, so human proteins are often immunogenic and can readily elicit rodent cross-reactive clones, which are necessary for in vivo proof of mechanism studies. Here, we compare the binding characteristics of mAbs isolated from chicken immunization, mouse immunization, and phage display of human antibody libraries. Our results show that chicken-derived mAbs not only recapitulate the kinetic diversity of mAbs sourced from other methods, but appear to offer an expanded repertoire of epitopes. Further, chicken-derived mAbs can bind their native serum antigen with very high affinity, highlighting their therapeutic potential. PMID:26652308

  15. Functional characterization of a monoclonal antibody epitope using a lambda phage display-deep sequencing platform

    PubMed Central

    Domina, Maria; Lanza Cariccio, Veronica; Benfatto, Salvatore; Venza, Mario; Venza, Isabella; Borgogni, Erica; Castellino, Flora; Midiri, Angelina; Galbo, Roberta; Romeo, Letizia; Biondo, Carmelo; Masignani, Vega; Teti, Giuseppe; Felici, Franco; Beninati, Concetta

    2016-01-01

    We have recently described a method, named PROFILER, for the identification of antigenic regions preferentially targeted by polyclonal antibody responses after vaccination. To test the ability of the technique to provide insights into the functional properties of monoclonal antibody (mAb) epitopes, we used here a well-characterized epitope of meningococcal factor H binding protein (fHbp), which is recognized by mAb 12C1. An fHbp library, engineered on a lambda phage vector enabling surface expression of polypeptides of widely different length, was subjected to massive parallel sequencing of the phage inserts after affinity selection with the 12C1 mAb. We detected dozens of unique antibody-selected sequences, the most enriched of which (designated as FrC) could largely recapitulate the ability of fHbp to bind mAb 12C1. Computational analysis of the cumulative enrichment of single amino acids in the antibody-selected fragments identified two overrepresented stretches of residues (H248-K254 and S140-G154), whose presence was subsequently found to be required for binding of FrC to mAb 12C1. Collectively, these results suggest that the PROFILER technology can rapidly and reliably identify, in the context of complex conformational epitopes, discrete “hot spots” with a crucial role in antigen-antibody interactions, thereby providing useful clues for the functional characterization of the epitope. PMID:27530334
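The per-residue analysis described in this abstract can be illustrated with a minimal sketch. This is not the authors' PROFILER implementation, which the abstract does not specify. It assumes each antibody-selected fragment is given as a hypothetical `(start, end, read_count)` tuple in 1-based inclusive antigen coordinates, and it reports only the fraction of selected reads covering each residue, without normalizing against the unselected library. The function name and the toy data are invented for illustration.

```python
def residue_coverage(fragments, protein_length):
    """Return, for each residue 1..protein_length, the fraction of
    selected reads whose fragment covers that residue.

    fragments: list of (start, end, read_count), 1-based inclusive
    coordinates (a hypothetical input format).
    """
    depth = [0] * (protein_length + 1)  # index 0 is unused
    total_reads = 0
    for start, end, count in fragments:
        total_reads += count
        for pos in range(start, end + 1):
            depth[pos] += count
    if total_reads == 0:
        return [0.0] * protein_length
    return [d / total_reads for d in depth[1:]]


# Toy data: two overlapping fragments on a 10-residue antigen.
toy_fragments = [(2, 5, 3), (4, 8, 1)]
coverage = residue_coverage(toy_fragments, protein_length=10)

# Residues covered by nearly all selected reads are candidate "hot spots".
hot_spots = [i + 1 for i, frac in enumerate(coverage) if frac >= 0.9]
print(hot_spots)
```

In this toy case only residues 4 and 5 are shared by both fragments, so they are the only positions that pass the 0.9 threshold; the threshold itself is an arbitrary choice for the example.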

  16. Functional characterization of a monoclonal antibody epitope using a lambda phage display-deep sequencing platform.

    PubMed

    Domina, Maria; Lanza Cariccio, Veronica; Benfatto, Salvatore; Venza, Mario; Venza, Isabella; Borgogni, Erica; Castellino, Flora; Midiri, Angelina; Galbo, Roberta; Romeo, Letizia; Biondo, Carmelo; Masignani, Vega; Teti, Giuseppe; Felici, Franco; Beninati, Concetta

    2016-01-01

    We have recently described a method, named PROFILER, for the identification of antigenic regions preferentially targeted by polyclonal antibody responses after vaccination. To test the ability of the technique to provide insights into the functional properties of monoclonal antibody (mAb) epitopes, we used here a well-characterized epitope of meningococcal factor H binding protein (fHbp), which is recognized by mAb 12C1. An fHbp library, engineered on a lambda phage vector enabling surface expression of polypeptides of widely different length, was subjected to massive parallel sequencing of the phage inserts after affinity selection with the 12C1 mAb. We detected dozens of unique antibody-selected sequences, the most enriched of which (designated as FrC) could largely recapitulate the ability of fHbp to bind mAb 12C1. Computational analysis of the cumulative enrichment of single amino acids in the antibody-selected fragments identified two overrepresented stretches of residues (H248-K254 and S140-G154), whose presence was subsequently found to be required for binding of FrC to mAb 12C1. Collectively, these results suggest that the PROFILER technology can rapidly and reliably identify, in the context of complex conformational epitopes, discrete "hot spots" with a crucial role in antigen-antibody interactions, thereby providing useful clues for the functional characterization of the epitope. PMID:27530334

  17. Long-range PCR in next-generation sequencing: comparison of six enzymes and evaluation on the MiSeq sequencer.

    PubMed

    Jia, Haiying; Guo, Yunfei; Zhao, Weiwei; Wang, Kai

    2014-01-01

    Long-range PCR remains a flexible, fast, efficient and cost-effective choice for sequencing candidate genomic regions in a small number of samples, especially when combined with next-generation sequencing (NGS) platforms. Several long-range DNA polymerases are advertised as being able to amplify up to 15 kb or longer genomic DNA. However, their real-world performance characteristics and their suitability for NGS remain unclear. We evaluated six long-range DNA polymerases (Invitrogen SequalPrep, Invitrogen AccuPrime, TaKaRa PrimeSTAR GXL, TaKaRa LA Taq Hot Start, KAPA Long Range HotStart and QIAGEN LongRange PCR Polymerase) to amplify three amplicons, with sizes of 12.9 kb, 9.7 kb, and 5.8 kb, respectively. Subsequently, we used the PrimeSTAR enzyme to amplify entire BRCA1 (83.2 kb) and BRCA2 (84.2 kb) genes from nine subjects and sequenced them on an Illumina MiSeq sequencer. We found that the TaKaRa PrimeSTAR GXL DNA polymerase can amplify almost all amplicons with different sizes and Tm values under identical PCR conditions. Other enzymes require alteration of PCR conditions to obtain optimal performance. From the MiSeq run, we identified multiple intronic and exonic single-nucleotide variations (SNVs), including one mutation (c.5946delT in BRCA2) in a positive control. Our study provided useful results for sequencing research focused on large genomic regions. PMID:25034901

  18. Quantifying Next Generation Sequencing Sample Pre-Processing Bias in HIV-1 Complete Genome Sequencing.

    PubMed

    Vrancken, Bram; Trovão, Nídia Sequeira; Baele, Guy; van Wijngaerden, Eric; Vandamme, Anne-Mieke; van Laethem, Kristel; Lemey, Philippe

    2016-01-01

    Genetic analyses play a central role in infectious disease research. Massively parallelized "mechanical cloning" and sequencing technologies were quickly adopted by HIV researchers in order to broaden the understanding of the clinical importance of minor drug-resistant variants. These efforts have, however, remained largely limited to small genomic regions. The growing need to monitor multiple genome regions for drug resistance testing, as well as the obvious benefit for studying evolutionary and epidemic processes makes complete genome sequencing an important goal in viral research. In addition, a major drawback for NGS applications to RNA viruses is the need for large quantities of input DNA. Here, we use a generic overlapping amplicon-based near full-genome amplification protocol to compare low-input enzymatic fragmentation (Nextera™) with conventional mechanical shearing for Roche 454 sequencing. We find that the fragmentation method has only a modest impact on the characterization of the population composition and that for reliable results, the variation introduced at all steps of the procedure--from nucleic acid extraction to sequencing--should be taken into account, a finding that is also relevant for NGS technologies that are now more commonly used. Furthermore, by applying our protocol to deep sequence a number of pre-therapy plasma and PBMC samples, we illustrate the potential benefits of a near complete genome sequencing approach in routine genotyping. PMID:26751471

  19. SeqReporter: automating next-generation sequencing result interpretation and reporting workflow in a clinical laboratory.

    PubMed

    Roy, Somak; Durso, Mary Beth; Wald, Abigail; Nikiforov, Yuri E; Nikiforova, Marina N

    2014-01-01

    A wide repertoire of bioinformatics applications exists for next-generation sequencing data analysis; however, certain requirements of the clinical molecular laboratory limit their use: i) comprehensive report generation, ii) compatibility with existing laboratory information systems and computer operating system, iii) knowledgebase development, iv) quality management, and v) data security. SeqReporter is a web-based application developed using ASP.NET framework version 4.0. The client-side was designed using HTML5, CSS3, and JavaScript. The server-side processing (VB.NET) relied on interaction with a customized SQL Server 2008 R2 database. Overall, 104 cases (1062 variant calls) were analyzed by SeqReporter. Each variant call was classified into one of five report levels: i) known clinical significance, ii) uncertain clinical significance, iii) pending pathologists' review, iv) synonymous and deep intronic, and v) platform and panel-specific sequence errors. SeqReporter correctly annotated and classified 99.9% (859 of 860) of sequence variants, including 68.7% synonymous single-nucleotide variants, 28.3% nonsynonymous single-nucleotide variants, 1.7% insertions, and 1.3% deletions. One variant of potential clinical significance was re-classified after pathologist review. Laboratory information system-compatible clinical reports were generated automatically. SeqReporter also facilitated quality management activities. SeqReporter is an example of a customized and well-designed informatics solution to optimize and automate the downstream analysis of clinical next-generation sequencing data. We propose it as a model that may envisage the development of a comprehensive clinical informatics solution. PMID:24220144

  20. Application of next generation sequencing to molecular diagnosis of inherited diseases.

    PubMed

    Zhang, Wei; Cui, Hong; Wong, Lee-Jun C

    2014-01-01

    Recent development of high throughput, massively parallel sequencing (MPS or next generation sequencing, NGS) technology has revolutionized the molecular diagnosis of human genetic disease. The ability to generate an enormous amount of sequence data in a short time at an affordable cost makes this approach ideal for a wide range of applications from sequencing a group of candidate genes, all coding regions (known as exome sequencing) to the entire human genome. The technology brings about an unprecedented application to the identification of the molecular basis of hard-to-diagnose genetic disorders. This chapter reviews the up-to-date published application of next generation sequencing in clinical molecular diagnostic laboratories. We also emphasize the various target gene enrichment methods and their advantages and shortcomings. Obstacles to compliance with regulatory authorities like CLIA/CAP in clinical settings are also briefly discussed. PMID:22576358

  1. Co-detection and sequencing of genes and transcripts from the same single cells facilitated by a microfluidics platform

    NASA Astrophysics Data System (ADS)

    Han, Lin; Zi, Xiaoyuan; Garmire, Lana X.; Wu, Yu; Weissman, Sherman M.; Pan, Xinghua; Fan, Rong

    2014-09-01

    Despite the recent advance of single-cell gene expression analyses, co-measurement of both genomic and transcriptional signatures at the single-cell level has not been realized. However, such analysis is necessary in order to accurately delineate how genetic information is transcribed, expressed, and regulated to give rise to an enormously diverse range of cell phenotypes. Here we report on a microfluidics-facilitated approach that allows for controlled separation of cytoplasmic and nuclear contents of a single cell followed by on-chip amplification of genomic DNA and cytoplasmic mRNA. When coupled with off-chip polymerase chain reaction, gel electrophoresis and Sanger sequencing, a panel of genes and transcripts from the same single cell can be co-detected and sequenced. This platform is potentially an enabling tool to permit multiple genomic measurements performed on the same single cells and opens new opportunities to tackle a range of fundamental biology questions including non-genetic cell-to-cell variability, epigenetic regulation, and stem cell fate control. It also helps address clinical challenges such as diagnosing intra-tumor heterogeneity and dissecting complex cellular immune responses.

  2. Next generation sequencing technologies and the changing landscape of phage genomics

    PubMed Central

    Klumpp, Jochen; Fouts, Derrick E.; Sozhamannan, Shanmuga

    2012-01-01

    The dawn of next generation sequencing technologies has opened up exciting possibilities for whole genome sequencing of a plethora of organisms. The 2nd and 3rd generation sequencing technologies, based on cloning-free, massively parallel sequencing, have enabled the generation of a deluge of genomic sequences of both prokaryotic and eukaryotic origin in the last seven years. However, whole genome sequencing of bacterial viruses has not kept pace with this revolution, despite the fact that their genomes are orders of magnitude smaller in size compared with bacteria and other organisms. Sequencing phage genomes poses several challenges; (1) obtaining pure phage genomic material, (2) PCR amplification biases and (3) complex nature of their genetic material due to features such as methylated bases and repeats that are inherently difficult to sequence and assemble. Here we describe conclusions drawn from our efforts in sequencing hundreds of bacteriophage genomes from a variety of Gram-positive and Gram-negative bacteria using Sanger, 454, Illumina and PacBio technologies. Based on our experience we propose several general considerations regarding sample quality, the choice of technology and a “blended approach” for generating reliable whole genome sequences of phages. PMID:23275870

  3. Next generation sequencing technologies and the changing landscape of phage genomics.

    PubMed

    Klumpp, Jochen; Fouts, Derrick E; Sozhamannan, Shanmuga

    2012-07-01

    The dawn of next generation sequencing technologies has opened up exciting possibilities for whole genome sequencing of a plethora of organisms. The 2nd and 3rd generation sequencing technologies, based on cloning-free, massively parallel sequencing, have enabled the generation of a deluge of genomic sequences of both prokaryotic and eukaryotic origin in the last seven years. However, whole genome sequencing of bacterial viruses has not kept pace with this revolution, despite the fact that their genomes are orders of magnitude smaller in size compared with bacteria and other organisms. Sequencing phage genomes poses several challenges; (1) obtaining pure phage genomic material, (2) PCR amplification biases and (3) complex nature of their genetic material due to features such as methylated bases and repeats that are inherently difficult to sequence and assemble. Here we describe conclusions drawn from our efforts in sequencing hundreds of bacteriophage genomes from a variety of Gram-positive and Gram-negative bacteria using Sanger, 454, Illumina and PacBio technologies. Based on our experience we propose several general considerations regarding sample quality, the choice of technology and a "blended approach" for generating reliable whole genome sequences of phages. PMID:23275870

  4. De novo Sequence Assembly and Characterization of Lycoris aurea Transcriptome Using GS FLX Titanium Platform of 454 Pyrosequencing

    PubMed Central

    Wang, Ren; Xu, Sheng; Jiang, Yumei; Jiang, Jingwei; Li, Xiaodan; Liang, Lijian; He, Jia; Peng, Feng; Xia, Bing

    2013-01-01

    Background: Lycoris aurea, also called Golden Magic Lily, is an ornamentally and medicinally important species of the Amaryllidaceae family. To date, no whole-genome sequence is available for this non-model organism. Transcriptomic information is also scarce for this species. In this study, we performed de novo transcriptome sequencing to produce the first comprehensive expressed sequence tag (EST) dataset for L. aurea using high-throughput sequencing technology. Methodology and Principal Findings: Total RNA was isolated from leaves with sodium nitroprusside (SNP), salicylic acid (SA), or methyl jasmonate (MeJA) treatment, stems, and flowers at the bud, blooming, and wilting stages. Equal quantities of RNA from each tissue and stage were pooled to construct a cDNA library. Using 454 pyrosequencing technology, a total of 937,990 high quality reads (308.63 Mb) with an average read length of 329 bp were generated. Clustering and assembly of these reads produced a non-redundant set of 141,111 unique sequences, comprising 24,604 contigs and 116,507 singletons. All of the unique sequences were involved in the biological process, cellular component and molecular function categories by GO analysis. Potential genes and their functions were predicted by KEGG pathway mapping and COG analysis. Based on our sequence analysis and published literature, many putative genes involved in Amaryllidaceae alkaloids synthesis, including PAL, TYDC, OMT, NMT, P450, and other potentially important candidate genes, were identified for the first time in this Lycoris. Furthermore, 6,386 SSRs and 18,107 high-confidence SNPs were identified in this EST dataset. Conclusions: The transcriptome provides invaluable new data for a functional genomics resource and future biological research in L. aurea. The molecular markers identified in this study will provide a material basis for future genetic linkage and quantitative trait loci analyses, and will provide useful information for functional
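The read and assembly figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract and assumes that "Mb" means 10^6 bases; the variable names are illustrative.

```python
# Figures as reported in the abstract.
reads = 937_990
total_bases = 308.63e6        # 308.63 Mb, assuming 1 Mb = 1e6 bases
contigs = 24_604
singletons = 116_507

# Mean read length implied by total bases and read count.
mean_read_length = total_bases / reads

# Unique sequences are reported as contigs plus singletons.
unique_sequences = contigs + singletons

print(f"mean read length: {mean_read_length:.1f} bp")
print(f"unique sequences: {unique_sequences:,}")
```

Under that assumption, both derived values agree with the reported 329 bp average read length and 141,111 unique sequences.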

  5. Architecting Prodiguer: the next generation French climate modelling data management platform

    NASA Astrophysics Data System (ADS)

    Morgan, Mark; Denvil, Sebastien; Bhardwaj, Ashish

    2010-05-01

    The Pierre Simon Laplace Institute (IPSL), like many other climate modeling groups, is involved in the international development of a comprehensive Earth System Model (ESM) to study the interactions between chemical, physical, and biological processes. This work entails the coupling of different components (land, ocean, atmosphere, chemistry, etc.) and requires an execution environment platform that can tackle the entire range of interdependent model configurations. Furthermore, the ever-increasing number of simulations, executed against model configurations within scientific computing centres, is generating a huge volume of data and meta-data that must be made available to the international community of researchers, modelers, students and general users. IPSL is in the process of implementing a French national project called Prodiguer whose objective is to ensure that the data and meta-data can be delivered to the French & international communities in a timely and appropriate fashion, hence achieving the strategic goals outlined above. Prodiguer aims to leverage, extend and build upon the work of international projects such as Earth System Grid, METAFOR and IS-ENES. Thus Prodiguer is to be seen as one actor amongst many attempting the difficult task of information integration within a complex enterprise space. We will present the technical architecture being put in place to achieve the goals of Prodiguer. Such an architecture necessarily encompasses many aspects of Service / Resource Orientated Architectural practice. From security to messaging patterns, from message queues to failover strategies, we will illustrate how pragmatism is inevitably the main driver behind such an architecture. We will also illustrate that as the number of actors increases so does workflow complexity, and as a consequence simplicity becomes an important guiding factor in itself.

  6. Next-generation sequencing for the diagnosis of cardiac arrhythmia syndromes.

    PubMed

    Lubitz, Steven A; Ellinor, Patrick T

    2015-05-01

    Inherited arrhythmia syndromes are collectively associated with substantial morbidity, yet our understanding of the genetic architecture of these conditions remains limited. Recent technological advances in DNA sequencing have led to the commercialization of genetic testing now widely available in clinical practice. In particular, next-generation sequencing allows the large-scale and rapid assessment of entire genomes. Although next-generation sequencing represents a major technological advance, it has introduced numerous challenges with respect to the interpretation of genetic variation and has opened a veritable floodgate of biological data of unknown clinical significance to practitioners. In this review, we discuss current genetic testing indications for inherited arrhythmia syndromes, broadly outline characteristics of next-generation sequencing techniques, and highlight challenges associated with such testing. We further summarize future directions that will be necessary to address to enable the widespread adoption of next-generation sequencing in the routine management of patients with inherited arrhythmia syndromes. PMID:25625719

  7. Next-generation sequencing-based method shows increased mutation detection sensitivity in an Indian retinoblastoma cohort

    PubMed Central

    Singh, Jaya; Mishra, Avshesh; Pandian, Arunachalam Jayamuruga; Mallipatna, Ashwin C.; Khetan, Vikas; Sripriya, S.; Kapoor, Suman; Agarwal, Smita; Sankaran, Satish; Katragadda, Shanmukh; Veeramachaneni, Vamsi; Hariharan, Ramesh; Subramanian, Kalyanasundaram

    2016-01-01

    Purpose: Retinoblastoma (Rb) is the most common primary intraocular cancer of childhood and one of the major causes of blindness in children. India has the highest number of patients with Rb in the world. Mutations in the RB1 gene are the primary cause of Rb, and heterogeneous mutations are distributed throughout the entire length of the gene. Therefore, genetic testing requires screening of the entire gene, which by conventional sequencing is time consuming and expensive. Methods: In this study, we screened the RB1 gene in the DNA isolated from blood or saliva samples of 50 unrelated patients with Rb using the TruSight Cancer panel. Next-generation sequencing (NGS) was done on the Illumina MiSeq platform. Genetic variations were identified using the Strand NGS software and interpreted using the StrandOmics platform. Results: We were able to detect germline pathogenic mutations in 66% (33/50) of the cases, 12 of which were novel. We were able to detect all types of mutations, including missense, nonsense, splice site, indel, and structural variants. When we considered bilateral Rb cases only, the mutation detection rate increased to 100% (22/22). In unilateral Rb cases, the mutation detection rate was 30% (6/20). Conclusions: Our study suggests that NGS-based approaches increase the sensitivity of mutation detection in the RB1 gene, making it fast and cost-effective compared to the conventional tests performed in a reflex-testing mode. PMID:27582626
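The detection rates in this abstract follow directly from the reported counts. As a minimal sketch, the helper below recomputes each percentage from the (detected, total) pairs given in the text; the function and dictionary names are illustrative and not part of the study.

```python
def detection_rate(detected, total):
    """Mutation detection rate as a percentage."""
    return 100 * detected / total


# (detected, total) pairs as reported in the abstract.
cohorts = {
    "all cases": (33, 50),
    "bilateral": (22, 22),
    "unilateral": (6, 20),
}

rates = {name: detection_rate(d, n) for name, (d, n) in cohorts.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.0f}%")

# The bilateral and unilateral subgroups together account for 42 of the
# 50 patients; the abstract does not break out the remaining 8.
subgroup_total = cohorts["bilateral"][1] + cohorts["unilateral"][1]
```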

  8. Generation of Triple-Transgenic Forsythia Cell Cultures as a Platform for the Efficient, Stable, and Sustainable Production of Lignans

    PubMed Central

    Murata, Jun; Matsumoto, Erika; Morimoto, Kinuyo; Koyama, Tomotsugu; Satake, Honoo

    2015-01-01

    Sesamin is a furofuran lignan biosynthesized from the precursor lignan pinoresinol specifically in sesame seeds. This lignan is shown to exhibit anti-hypertensive activity, protect the liver from damage by ethanol and lipid oxidation, and reduce lung tumor growth. Despite rapidly elevating demand, plant sources of lignans are frequently limited because of the high cost of locating and collecting plants. Indeed, the acquisition of sesamin exclusively depends on the conventional extraction of particular Sesamum seeds. In this study, we have created an efficient, stable and sustainable sesamin production system using triple-transgenic Forsythia koreana cell suspension cultures, U18i-CPi-Fk. These transgenic cell cultures were generated by stably introducing an RNAi sequence against the pinoresinol-glucosylating enzyme, UGT71A18, into existing CPi-Fk cells, which had been created by introducing Sesamum indicum sesamin synthase (CYP81Q1) and an RNA interference (RNAi) sequence against pinoresinol/lariciresinol reductase (PLR) into F. koreana cells. Compared to its transgenic prototype, U18i-CPi-Fk displayed 5-fold higher production of pinoresinol aglycone and 1.4-fold higher production of sesamin, respectively, while the wildtype cannot produce sesamin due to a lack of any intrinsic sesamin synthase. Moreover, red LED irradiation of U18i-CPi-Fk specifically resulted in 3.0-fold greater production in both pinoresinol aglycone and sesamin than production of these lignans under the dark condition, whereas pinoresinol production was decreased in the wildtype under red LED. Moreover, we developed a procedure for sodium alginate-based long-term storage of U18i-CPi-Fk in liquid nitrogen. Production of sesamin in U18i-CPi-Fk re-thawed after six-month cryopreservation was equivalent to that of non-cryopreserved U18i-CPi-Fk. These data warrant on-demand production of sesamin anytime and anywhere. Collectively, the present study provides evidence that U18i-CPi-Fk is an

  9. Designing next-generation platforms for evaluating scientific output: what scientists can learn from the social web

    PubMed Central

    Yarkoni, Tal

    2012-01-01

    Traditional pre-publication peer review of scientific output is a slow, inefficient, and unreliable process. Efforts to replace or supplement traditional evaluation models with open evaluation platforms that leverage advances in information technology are slowly gaining traction, but remain in the early stages of design and implementation. Here I discuss a number of considerations relevant to the development of such platforms. I focus particular attention on three core elements that next-generation evaluation platforms should strive to emphasize, including (1) open and transparent access to accumulated evaluation data, (2) personalized and highly customizable performance metrics, and (3) appropriate short-term incentivization of the userbase. Because all of these elements have already been successfully implemented on a large scale in hundreds of existing social web applications, I argue that development of new scientific evaluation platforms should proceed largely by adapting existing techniques rather than engineering entirely new evaluation mechanisms. Successful implementation of open evaluation platforms has the potential to substantially advance both the pace and the quality of scientific publication and evaluation, and the scientific community has a vested interest in shifting toward such models as soon as possible. PMID:23060783

  10. Next-Generation Sequencing Approaches in Cancer: Where Have They Brought Us and Where Will They Take Us?

    PubMed Central

    LeBlanc, Veronique G.; Marra, Marco A.

    2015-01-01

    Next-generation sequencing (NGS) technologies and data have revolutionized cancer research and are increasingly being deployed to guide clinicians in treatment decision-making. NGS technologies have allowed us to take an “omics” approach to cancer in order to reveal genomic, transcriptomic, and epigenomic landscapes of individual malignancies. Integrative multi-platform analyses are increasingly used in large-scale projects that aim to fully characterize individual tumours as well as general cancer types and subtypes. In this review, we examine how NGS technologies in particular have contributed to “omics” approaches in cancer research, allowing for large-scale integrative analyses that consider hundreds of tumour samples. These types of studies have provided us with an unprecedented wealth of information, providing the background knowledge needed to make small-scale (including “N of 1”) studies informative and relevant. We also take a look at emerging opportunities provided by NGS and state-of-the-art third-generation sequencing technologies, particularly in the context of translational research. Cancer research and care are currently poised to experience significant progress catalyzed by accessible sequencing technologies that will benefit both clinical- and research-based efforts. PMID:26404381

  11. Quantifying Next Generation Sequencing Sample Pre-Processing Bias in HIV-1 Complete Genome Sequencing

    PubMed Central

    Vrancken, Bram; Trovão, Nídia Sequeira; Baele, Guy; van Wijngaerden, Eric; Vandamme, Anne-Mieke; van Laethem, Kristel; Lemey, Philippe

    2016-01-01

    Genetic analyses play a central role in infectious disease research. Massively parallelized “mechanical cloning” and sequencing technologies were quickly adopted by HIV researchers in order to broaden the understanding of the clinical importance of minor drug-resistant variants. These efforts have, however, remained largely limited to small genomic regions. The growing need to monitor multiple genome regions for drug resistance testing, as well as the obvious benefit for studying evolutionary and epidemic processes makes complete genome sequencing an important goal in viral research. In addition, a major drawback for NGS applications to RNA viruses is the need for large quantities of input DNA. Here, we use a generic overlapping amplicon-based near full-genome amplification protocol to compare low-input enzymatic fragmentation (Nextera™) with conventional mechanical shearing for Roche 454 sequencing. We find that the fragmentation method has only a modest impact on the characterization of the population composition and that for reliable results, the variation introduced at all steps of the procedure—from nucleic acid extraction to sequencing—should be taken into account, a finding that is also relevant for NGS technologies that are now more commonly used. Furthermore, by applying our protocol to deep sequence a number of pre-therapy plasma and PBMC samples, we illustrate the potential benefits of a near complete genome sequencing approach in routine genotyping. PMID:26751471

  12. NGSView: an extensible open source editor for next-generation sequencing data

    PubMed Central

    Arner, Erik; Hayashizaki, Yoshihide; Daub, Carsten O.

    2010-01-01

    Summary: High-throughput sequencing technologies introduce novel demands on tools available for data analysis. We have developed NGSView (Next Generation Sequence View), a generally applicable, flexible and extensible next-generation sequence alignment editor. The software allows for visualization and manipulation of millions of sequences simultaneously on a desktop computer, through a graphical interface. NGSView is available under an open source license and can be extended through a well documented API. Availability: http://ngsview.sourceforge.net Contact: arner@gsc.riken.jp PMID:19855106

  13. Genetic sequence relationships of Winnipegosis platform carbonates, southern Elk Point basin, North Dakota

    SciTech Connect

    Shanley, K.W.; Cross, T.A.

    1988-01-01

    Examination of cores and well log data from the Winnipegosis Formation (Givetian) within a study area of approximately 11,500 mi² (30,000 km²) in northern North Dakota allows recognition of seven time-stratigraphic progradational units within the Winnipegosis Formation. Together with the underlying Ashern Formation, these units are arranged in landward-stepping, vertical stacking, and seaward-stepping geometric patterns, which reflect changes in relative sea level. Abrupt juxtaposition of shallow over deeper water lithologies, evidence for subaerial exposure, and onlap geometries further suggest that these progradational units form two larger, Vail-type sequences separated by regionally persistent unconformities or their correlative conformities. Sea level rise during the early Eifelian caused southeastward onlap of the Ashern Formation onto Middle Silurian carbonates of the Interlake Formation. Maximum flooding, expressed by deepest marine facies and a hardground surface, suggests the existence of a condensed section at the top of the Ashern Formation. This was developed during the maximum rate of sea level rise. A decrease in the rate of sea level rise resulted in aggradation of lower Winnipegosis units on a gently dipping ramp. These are represented by nodular and burrowed open marine limestones with scattered stromatoporoid patch reefs and grainstone shoals. During the subsequent sea level fall, represented by Temple units, a shelf margin with pronounced depositional topography and adjacent starved basin were developed. Temple strata include coral-brachiopod-stromatoporoid reefs and productive fore-reef talus deposits along the shelf margin rim.

  14. De novo genome assembly of the economically important weed horseweed using integrated data from multiple sequencing platforms.

    PubMed

    Peng, Yanhui; Lai, Zhao; Lane, Thomas; Nageswara-Rao, Madhugiri; Okada, Miki; Jasieniuk, Marie; O'Geen, Henriette; Kim, Ryan W; Sammons, R Douglas; Rieseberg, Loren H; Stewart, C Neal

    2014-11-01

    Horseweed (Conyza canadensis), a member of the Compositae (Asteraceae) family, was the first broadleaf weed to evolve resistance to glyphosate. Horseweed, one of the most problematic weeds in the world, is a true diploid (2n = 2x = 18), with the smallest genome of any known agricultural weed (335 Mb). Thus, it is an appropriate candidate to help us understand the genetic and genomic bases of weediness. We undertook a draft de novo genome assembly of horseweed by combining data from multiple sequencing platforms (454 GS-FLX, Illumina HiSeq 2000, and PacBio RS) using various libraries with different insertion sizes (approximately 350 bp, 600 bp, 3 kb, and 10 kb) of a Tennessee-accessed, glyphosate-resistant horseweed biotype. From 116.3 Gb (approximately 350× coverage) of data, the genome was assembled into 13,966 scaffolds with 50% of the assembly = 33,561 bp. The assembly covered 92.3% of the genome, including the complete chloroplast genome (approximately 153 kb) and a nearly complete mitochondrial genome (approximately 450 kb in 120 scaffolds). The nuclear genome is composed of 44,592 protein-coding genes. Genome resequencing of seven additional horseweed biotypes was performed. These sequence data were assembled and used to analyze genome variation. Simple sequence repeat and single-nucleotide polymorphisms were surveyed. Genomic patterns were detected that associated with glyphosate-resistant or -susceptible biotypes. The draft genome will be useful to better understand weediness and the evolution of herbicide resistance and to devise new management strategies. The genome will also be useful as another reference genome in the Compositae. To our knowledge, this article represents the first published draft genome of an agricultural weed. PMID:25209985
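
    The coverage and contiguity figures quoted in this abstract can be checked with a short sketch. It assumes that "50% of the assembly = 33,561 bp" refers to an N50-style statistic (the abstract does not name it), and the function names and the toy scaffold lengths are illustrative only, not taken from the paper.

```python
def coverage(total_bases, genome_size):
    """Average sequencing depth: total sequenced bases divided by genome size."""
    return total_bases / genome_size


def n50(lengths):
    """Length L such that scaffolds of length >= L hold at least half the assembly."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Figures from the abstract: 116.3 Gb of data over a 335 Mb genome.
depth = coverage(116.3e9, 335e6)
print(f"approximate coverage: {depth:.0f}x")

# Toy scaffold lengths (hypothetical), only to show how N50 is computed.
print("toy N50:", n50([2, 3, 4, 5, 6]))
```

    The computed depth is close to the "approximately 350×" stated in the abstract; the N50 call uses made-up lengths because the real scaffold lengths are not given here.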

  15. Lessons learned from implementing a national infrastructure in Sweden for storage and analysis of next-generation sequencing data

    PubMed Central

    2013-01-01

    Analyzing and storing data and results from next-generation sequencing (NGS) experiments is a challenging task, hampered by ever-increasing data volumes and frequent updates of analysis methods and tools. Storage and computation have grown beyond the capacity of personal computers and there is a need for suitable e-infrastructures for processing. Here we describe UPPNEX, an implementation of such an infrastructure, tailored to the needs of data storage and analysis of NGS data in Sweden serving various labs and multiple instruments from the major sequencing technology platforms. UPPNEX comprises resources for high-performance computing, large-scale and high-availability storage, an extensive bioinformatics software suite, up-to-date reference genomes and annotations, a support function with system and application experts as well as a web portal and support ticket system. UPPNEX applications are numerous and diverse, and include whole genome-, de novo- and exome sequencing, targeted resequencing, SNP discovery, RNASeq, and methylation analysis. There are over 300 projects that utilize UPPNEX and include large undertakings such as the sequencing of the flycatcher and Norwegian spruce. We describe the strategic decisions made when investing in hardware, setting up maintenance and support, allocating resources, and illustrate major challenges such as managing data growth. We conclude with summarizing our experiences and observations with UPPNEX to date, providing insights into the successful and less successful decisions made. PMID:23800020

  16. Identification and characterization of Highlands J virus from a Mississippi sandhill crane using unbiased next-generation sequencing

    USGS Publications Warehouse

    Ip, Hon S.; Wiley, Michael R.; Long, Renee; Gustavo, Palacios; Shearn-Bochsler, Valerie; Whitehouse, Chris A.

    2014-01-01

    Advances in massively parallel DNA sequencing platforms, commonly termed next-generation sequencing (NGS) technologies, have greatly reduced time, labor, and cost associated with DNA sequencing. Thus, NGS has become a routine tool for new viral pathogen discovery and will likely become the standard for routine laboratory diagnostics of infectious diseases in the near future. This study demonstrated the application of NGS for the rapid identification and characterization of a virus isolated from the brain of an endangered Mississippi sandhill crane. This bird was part of a population restoration effort and was found in an emaciated state several days after Hurricane Isaac passed over the refuge in Mississippi in 2012. Post-mortem examination had identified trichostrongyliasis as the possible cause of death, but because a virus with morphology consistent with a togavirus was isolated from the brain of the bird, an arboviral etiology was strongly suspected. Because individual molecular assays for several known arboviruses were negative, unbiased NGS by Illumina MiSeq was used to definitively identify and characterize the causative viral agent. Whole genome sequencing and phylogenetic analysis revealed the viral isolate to be the Highlands J virus, a known avian pathogen. This study demonstrates the use of unbiased NGS for the rapid detection and characterization of an unidentified viral pathogen and the application of this technology to wildlife disease diagnostics and conservation medicine.

  17. Spatially localized generation of nucleotide sequence-specific DNA damage

    PubMed Central

    Oh, Dennis H.; King, Brett A.; Boxer, Steven G.; Hanawalt, Philip C.

    2001-01-01

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen–DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320–400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA–psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen–TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  18. Spatially localized generation of nucleotide sequence-specific DNA damage.

    PubMed

    Oh, D H; King, B A; Boxer, S G; Hanawalt, P C

    2001-09-25

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen-DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320-400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA-psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen-TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  19. Disease vectors in the era of next generation sequencing.

    PubMed

    Rinker, David C; Pitts, R Jason; Zwiebel, Laurence J

    2016-01-01

    Almost 20 % of all infectious human diseases are vector borne and, together, are responsible for over one million deaths per annum. Over the past decade, the decreasing costs of massively parallel sequencing technologies have facilitated the agnostic interrogation of insect vector genomes, giving medical entomologists access to an ever-expanding volume of high-quality genomic and transcriptomic data. In this review, we highlight how genomics resources have provided new insights into the physiology, behavior, and evolution of human disease vectors within the context of the global health landscape. PMID:27154554

  20. The massive dolomitization of platformal and basinal sequences: proposed models from the Paleocene, Northeast Sirte Basin, Libya

    NASA Astrophysics Data System (ADS)

    Mresah, Mohamed H.

    1998-03-01

    The Paleocene carbonate succession in the Northeast Sirte Basin is composed of two shallowing-upward ramp cycles, where each cycle is under- and overlain by deeper-water, pelagic facies. A significant proportion of each of these two cycles is dolomitized. Petrographic study, supported by geochemical data (stoichiometry, stable isotopes, trace elements, and fluid inclusions), and integrated with broader tectono-sedimentary information, has provided the basis for interpreting these Paleocene dolomites. The use of this integrated approach in the study of dolomites suggests that, despite the much publicized uncertainties in interpreting geochemical analyses of ancient dolomites, the results of the Paleocene dolomites show that the geochemical characteristics are generally consistent with regional stratigraphic distribution and petrographic observations. Four distinct types of dolomite have been recognized in this part of the Sirte Basin. Based on the stratigraphic position and petrographic criteria, two of these types have a platformal setting and the other two are basinal. The platform varieties consist of dolomicrites and pervasive stratal dolomites. The dolomicrites, interpreted to be of syn-sedimentary origin, were probably a product of reflux of seawater, with elevated salinity, as suggested by palaeoenvironmental analysis and supported by geochemical evidence (the average δ18O value is -0.1‰ PDB; the average Sr content is 639 ppm). The pervasive dolomites were formed during the progradation of the platform sequences, and probably stabilized and augmented during shallow burial. A meteoric-marine mixing-zone is thought to have been the most likely process for the formation of these dolomites. This interpretation is supported by geochemical evidence (the average δ18O is -2.4‰ PDB; the average Sr content is 72 ppm) combined with a favourable stratigraphic position. The most characteristic feature related to both mixing-zone and reflux dolomitization is the

  1. Next-generation sequencing technologies and applications for human genetic history and forensics

    PubMed Central

    2011-01-01

    Rapid advances in the development of sequencing technologies in recent years have enabled an increasing number of applications in biology and medicine. Here, we review key technical aspects of the preparation of DNA templates for sequencing, the biochemical reaction principles and assay formats underlying next-generation sequencing systems, methods for imaging and base calling, quality control, and bioinformatic approaches for sequence alignment, variant calling and assembly. We also discuss some of the most important advances that the new sequencing technologies have brought to the fields of human population genetics, human genetic history and forensic genetics. PMID:22115430

  2. Short Communication: Investigating a Chain of HIV Transmission Events Due to Homosexual Exposure and Blood Transfusion Based on a Next Generation Sequencing Method.

    PubMed

    Zhao, Qi; Zhang, Chen; Jiang, Yan; Wen, Yujie; Pan, Pinliang; Li, Yang; Zhang, Guiyun; Zhang, Lei; Qiu, Maofeng

    2015-12-01

    This study investigates a chain of HIV transmission events due to homosexual exposure and blood transfusion in China. The MiSeq platform, a next generation sequencing (NGS) system, was used to obtain genetic details of the HIV-1 env region (336 base pairs). Evolutionary analysis combined with epidemiologic evidence suggests a transmission chain from patient T3 to T2 through homosexual exposure and subsequently to T1 through blood transfusion. More importantly, a phylogenetic study suggested a likely genetic bottleneck for HIV in homosexual transmission from T3 to T2, while T1 inherited the majority of variants from T2. The result from the MiSeq platform is consistent with findings from the epidemiologic survey. The MiSeq platform is a powerful tool for tracing HIV transmissions and intrapersonal evolution. PMID:26355677

  3. Storm-generated bedforms and relict dissolution pits and channels on the Yucatan carbonate platform

    NASA Astrophysics Data System (ADS)

    Gulick, S. P.; Goff, J. A.; Stewart, H. A.; Perez-Cruz, L. L.; Davis, M. B.; Duncan, D.; Saustrup, S.; Sanford, J. C.; Fucugauchi, J. U.

    2013-12-01

    survey area. Therefore, none of these dissolution pits appear to be underlain by a cenote or sink hole. The NW sector of the survey area exhibits a more complex morphology than the alternating ribbon/bare rock morphology elsewhere, including linear scarps (up to ~1 m relief), deeper pitting (up to ~1 m relief), and sinuous, dendritic channeling (up to ~2 m relief). The geologic origin of these features will require further investigation. Sand drifts are present in this region, but are thinner and cover less area. These observations show the dominant modern sediment formation and transport processes on this starved platform are from large storms and hurricanes that place large regions of the platform at wave base. Remaining observed features were generated during times of lower sea level.

  4. Oncogenic viruses: Lessons learned using next-generation sequencing technologies.

    PubMed

    Flippot, Ronan; Malouf, Gabriel G; Su, Xiaoping; Khayat, David; Spano, Jean-Philippe

    2016-07-01

    Fifteen percent of cancers are driven by oncogenic human viruses. Four of those viruses, hepatitis B virus, human papillomavirus, Merkel cell polyomavirus, and human T-cell lymphotropic virus, integrate into the host genome. Viral oncogenesis is the result of epigenetic and genetic alterations that happen during viral integration. So far, little data have been available regarding integration mechanisms and modifications in the host genome. However, the emergence of high-throughput sequencing and bioinformatic tools enables researchers to establish the landscape of genomic alterations and predict the events that follow viral integration. Cooperative working groups are currently investigating these factors in large data sets. Herein, we provide novel insights into the initiating events of cancer onset during infection with integrative viruses. Although much remains to be discovered, many improvements are expected from the clinical point of view, from better prognosis classifications to better therapeutic strategies. PMID:27156225

  5. Microbial profiling of South African acid mine water samples using next generation sequencing platform.

    PubMed

    Kamika, I; Azizi, S; Tekere, M

    2016-07-01

    This study monitored changes in bacterial and fungal community structure in mine water on a monthly basis over 4 months. Over the 4-month study period, mine water samples contained more bacteria (91.06 %) compared to fungi (8.94 %). For bacteria, mine water samples were dominated by Proteobacteria (39.14 to 65.06 %) followed by Firmicutes (26.34 to 28.9 %) in summer, and Cyanobacteria (27.05 %) in winter. In the collected samples, 18 % of bacteria could not be assigned to a phylum and remained unclassified, suggesting hitherto vast untapped microbial diversity, especially during winter. Fungi were the only eukaryotic microorganisms found in the mine water samples, with unclassified fungi (68.2 to 91 %) as the predominant group, followed by Basidiomycota (6.9 to 27.8 %). The time of collection, which was linked to the weather, had a higher impact on the bacterial community than on the fungal community. The bacterial operational taxonomic units (OTUs) ranged from 865 to 4052 over the 4-month sampling period, while fungal OTUs varied from 73 to 249. The diversity indices suggested that the bacterial community inhabiting the mine water samples was more diverse than the fungal community. The canonical correspondence analysis (CCA) results highlighted that the bacterial community variance had the strongest relationship with water temperature, conductivity, pH, and dissolved oxygen (DO) content, as compared to fungi, and that water characteristics had the greatest contribution to both bacterial and fungal community variance. The results describe the relationships between the microbial community and environmental variables at the studied mining sites. PMID:26980100
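
    The abstract refers to diversity indices without naming them. As a minimal sketch, assuming the widely used Shannon index is among them, the comparison between a more even and a more skewed community could look like the following. The OTU counts are invented for illustration and are not data from the study.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical OTU abundance tables (not taken from the paper).
even_community = [25, 25, 25, 25]
skewed_community = [97, 1, 1, 1]

print("even community H':  ", round(shannon_index(even_community), 3))
print("skewed community H':", round(shannon_index(skewed_community), 3))
```

    A higher value indicates a community whose reads are spread across more OTUs more evenly, which is the sense in which one community is called "more diverse" than another.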

  6. Transcriptome de novo assembly sequencing and analysis of the toxic dinoflagellate Alexandrium catenella using the Illumina platform.

    PubMed

    Zhang, Shu; Sui, Zhenghong; Chang, Lianpeng; Kang, Kyoungho; Ma, Jinhua; Kong, Fanna; Zhou, Wei; Wang, Jinguo; Guo, Liliang; Geng, Huili; Zhong, Jie; Ma, Qingxia

    2014-03-10

    In this article, high-throughput de novo transcriptomic sequencing was performed in Alexandrium catenella, which provided the first view of the gene repertoire in this dinoflagellate based on next-generation sequencing (NGS) technologies. A total of 118,304 unigenes were identified with an average length of 673 bp (base pairs). Of these unigenes, 77,936 (65.9%) were annotated with known proteins based on sequence similarities, among which 24,149 and 22,956 unigenes were assigned to gene ontology categories (GO) and clusters of orthologous groups (COGs), respectively. Furthermore, 16,467 unigenes were mapped onto 322 pathways using the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG). We also detected 1143 simple sequence repeats (SSRs), in which the tri-nucleotide repeat motif (69.3%) was the most abundant. The genetic facts and significance derived from the transcriptome dataset were suggested and discussed. All four core nucleosomal histones and linker histones were detected, in addition to the unigenes involved in histone modifications. In total, 190 unigenes were identified as being involved in the endocytosis pathway, and clathrin-dependent endocytosis was suggested to play a role in the heterotrophy of A. catenella. A conserved 22-nt spliced leader (SL) was identified in 21 unigenes, which suggested the existence of trans-splicing processing of mRNA in A. catenella. PMID:24440238
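
    Simple sequence repeats of the kind counted in this abstract are tandem copies of a short motif. The sketch below shows one common way to locate them with a regular-expression backreference. The function name, the example sequence and the repeat threshold are assumptions for illustration; the abstract does not state which tool or thresholds the authors used.

```python
import re


def find_ssrs(seq, motif_len, min_repeats):
    """Return (start, motif, copies) for tandem repeats of a motif of the given length."""
    # ([ACGT]{k}) captures one motif; \1{n-1,} requires at least n-1 further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        # Skip homopolymer runs such as AAAAAA reported as a "tri-nucleotide" repeat.
        if motif_len > 1 and len(set(motif)) == 1:
            continue
        hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Hypothetical sequence with four tandem copies of the tri-nucleotide motif ACG.
print(find_ssrs("TTACGACGACGACGTT", motif_len=3, min_repeats=4))
```

    Running the same function for motif lengths 1 through 6 and tallying the hits per length would give the kind of motif-class breakdown the abstract reports.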

  7. A water-stable metal-organic framework of a zwitterionic carboxylate with dysprosium: a sensing platform for Ebolavirus RNA sequences.

    PubMed

    Qin, Liang; Lin, Li-Xian; Fang, Zhi-Ping; Yang, Shui-Ping; Qiu, Gui-Hua; Chen, Jin-Xiang; Chen, Wen-Hua

    2016-01-01

    We herein report a water-stable 3D dysprosium-based metal-organic framework (MOF) that can non-covalently interact with probe ss-DNA. The formed system can serve as an effective fluorescence sensing platform for the detection of complementary Ebolavirus RNA sequences with the detection limit of 160 pM. PMID:26502791

  8. Thermal Test of an Improved Platform for Silicon Nanowire-Based Thermoelectric Micro-generators

    NASA Astrophysics Data System (ADS)

    Calaza, C.; Fonseca, L.; Salleras, M.; Donmez, I.; Tarancón, A.; Morata, A.; Santos, J. D.; Gadea, G.

    2016-03-01

    This work reports on an improved design intended to enhance the thermal isolation between the hot and cold parts of a silicon-based thermoelectric microgenerator. Micromachining techniques and silicon on insulator substrates are used to obtain a suspended silicon platform surrounded by a bulk silicon rim, in which arrays of bottom-up silicon nanowires are integrated later on to join both parts with a thermoelectric active material. In previous designs the platform was linked to the rim by means of bulk silicon bridges, used as mechanical support and holder for the electrical connections. Such supports severely reduce platform thermal isolation and penalise the functional area due to the need for longer supports. A new technological route is planned to obtain low thermal conductance supports, making use of a particular geometrical design and a wet bulk micromachining process to selectively remove silicon, shaping a thin dielectric membrane. Thermal conductance measurements have been performed to analyse the influence of the different design parameters of the suspended platform (support type, bridge/membrane length, separation between platform and silicon rim) on overall thermal isolation. A thermal conductance reduction from 1.82 mW/K to 1.03 mW/K has been obtained on tested devices by changing the support type, even though its length has been halved.

  9. Combining next-generation sequencing and online databases for microsatellite development in non-model organisms.

    PubMed

    Rico, Ciro; Normandeau, Eric; Dion-Côté, Anne-Marie; Rico, María Inés; Côté, Guillaume; Bernatchez, Louis

    2013-01-01

    Next-generation sequencing (NGS) is revolutionising marker development and the rapidly increasing amount of transcriptomes published across a wide variety of taxa is providing valuable sequence databases for the identification of genetic markers without the need to generate new sequences. Microsatellites are still the most important source of polymorphic markers in ecology and evolution. Motivated by our long-term interest in the adaptive radiation of a non-model species complex of whitefishes (Coregonus spp.), in this study, we focus on microsatellite characterisation and multiplex optimisation using transcriptome sequences generated by Illumina® and Roche-454, as well as online databases of Expressed Sequence Tags (EST) for the study of whitefish evolution and demographic history. We identified and optimised 40 polymorphic loci in multiplex PCR reactions and validated the robustness of our analyses by testing several population genetics and phylogeographic predictions using 494 fish from five lakes and 2 distinct ecotypes. PMID:24296905

  10. Combining next-generation sequencing and online databases for microsatellite development in non-model organisms

    PubMed Central

    Rico, Ciro; Normandeau, Eric; Dion-Côté, Anne-Marie; Rico, María Inés; Côté, Guillaume; Bernatchez, Louis

    2013-01-01

    Next-generation sequencing (NGS) is revolutionising marker development and the rapidly increasing amount of transcriptomes published across a wide variety of taxa is providing valuable sequence databases for the identification of genetic markers without the need to generate new sequences. Microsatellites are still the most important source of polymorphic markers in ecology and evolution. Motivated by our long-term interest in the adaptive radiation of a non-model species complex of whitefishes (Coregonus spp.), in this study, we focus on microsatellite characterisation and multiplex optimisation using transcriptome sequences generated by Illumina® and Roche-454, as well as online databases of Expressed Sequence Tags (EST) for the study of whitefish evolution and demographic history. We identified and optimised 40 polymorphic loci in multiplex PCR reactions and validated the robustness of our analyses by testing several population genetics and phylogeographic predictions using 494 fish from five lakes and 2 distinct ecotypes. PMID:24296905

  11. Assessment of Epstein-Barr virus nucleic acids in gastric but not in breast cancer by next-generation sequencing of pooled Mexican samples.

    PubMed

    Fuentes-Pananá, Ezequiel M; Larios-Serrato, Violeta; Méndez-Tenorio, Alfonso; Morales-Sánchez, Abigail; Arias, Carlos F; Torres, Javier

    2016-03-01

    Gastric (GC) and breast (BrC) cancer are two of the most common and deadly tumours. Different lines of evidence suggest a possible causative role of viral infections for both GC and BrC. Wide genome sequencing (WGS) technologies allow searching for viral agents in tissues of patients with cancer. These technologies have already contributed to establishing virus-cancer associations as well as to discovering new tumour viruses. The objective of this study was to document possible associations of viral infection with GC and BrC in Mexican patients. To gain an idea of cost-effective conditions for experimental sequencing, we first carried out an in silico simulation of WGS. The next-generation platform Illumina GAIIx was then used to sequence GC and BrC tumour samples. While we did not find viral sequences in tissues from BrC patients, multiple reads matching Epstein-Barr virus (EBV) sequences were found in GC tissues. An end-point polymerase chain reaction confirmed an enrichment of EBV sequences in one of the GC samples sequenced, validating the next-generation sequencing-bioinformatics pipeline. PMID:26910355

  12. Assessment of Epstein-Barr virus nucleic acids in gastric but not in breast cancer by next-generation sequencing of pooled Mexican samples

    PubMed Central

    Fuentes-Pananá, Ezequiel M; Larios-Serrato, Violeta; Méndez-Tenorio, Alfonso; Morales-Sánchez, Abigail; Arias, Carlos F; Torres, Javier

    2016-01-01

    Gastric (GC) and breast (BrC) cancer are two of the most common and deadly tumours. Different lines of evidence suggest a possible causative role of viral infections for both GC and BrC. Wide genome sequencing (WGS) technologies allow searching for viral agents in tissues of patients with cancer. These technologies have already contributed to establishing virus-cancer associations as well as to discovering new tumour viruses. The objective of this study was to document possible associations of viral infection with GC and BrC in Mexican patients. To gain an idea of cost-effective conditions for experimental sequencing, we first carried out an in silico simulation of WGS. The next-generation platform Illumina GAIIx was then used to sequence GC and BrC tumour samples. While we did not find viral sequences in tissues from BrC patients, multiple reads matching Epstein-Barr virus (EBV) sequences were found in GC tissues. An end-point polymerase chain reaction confirmed an enrichment of EBV sequences in one of the GC samples sequenced, validating the next-generation sequencing-bioinformatics pipeline. PMID:26910355

  13. Computer aided graphics simulation modelling using seismogeologic approach in sequence stratigraphy of Early Cretaceous Punjab platform, Central Indus Basin, Pakistan

    SciTech Connect

    Qureshi, T.M.; Khan, K.A.

    1996-08-01

    Modelling stratigraphic sequence by using seismo-geologic approach, integrated with cyclic transgressive-regressive deposits, helps to identify a number of non-structural subtle traps. Most of the hydrocarbons found in Early Cretaceous of Central Indus Basin pertain to structural entrapments of upper transgressive sands. A few wells are producing from middle and basal regressive sands, but the massive regressive sands have not been tested so far. The possibility of stratigraphic traps like wedging or pinch-out, a lateral gradation, an uplift, truncation and overlapping of reservoir rocks is quite promising. The natural basin physiography at times has been modified by extensional episodic events into tectono-morphic terrain. Thus, seismo scanning of tectonically controlled sedimentation might delineate some subtle stratigraphic traps. Amplitude maps representing stratigraphic sequences are generated to identify the traps. Seismic expressions indicate the reservoir quality in terms of amplitude increase or decrease. The data is modelled on computer using graphics simulation techniques.

  14. HPV-QUEST: A highly customized system for automated HPV sequence analysis capable of processing Next Generation sequencing data set.

    PubMed

    Yin, Li; Yao, Jiqiang; Gardner, Brent P; Chang, Kaifen; Yu, Fahong; Goodenow, Maureen M

    2012-01-01

    Next Generation sequencing (NGS) applied to human papilloma viruses (HPV) can provide sensitive methods to investigate the molecular epidemiology of multiple-type HPV infection. Currently a genotyping system with a comprehensive collection of updated HPV reference sequences and a capacity to handle NGS data sets is lacking. HPV-QUEST was developed as an automated and rapid HPV genotyping system. The web-based HPV-QUEST subtyping algorithm was developed using HTML, PHP, Perl scripting language, and MySQL as the database backend. HPV-QUEST includes a database of annotated HPV reference sequences with updated nomenclature covering 5 genera, 14 species and 150 mucosal and cutaneous types to genotype blasted query sequences. HPV-QUEST processes up to 10 megabases of sequences within 1 to 2 minutes. Results are reported in HTML, text and Excel formats and display e-value, blast score, and local and coverage identities; provide genus, species, type, infection site and risk for the best matched reference HPV sequence; and produce results ready for additional analyses. PMID:22570520

  15. Now and Next-Generation Sequencing Techniques: Future of Sequence Analysis Using Cloud Computing

    PubMed Central

    Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav

    2012-01-01

    Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed “cloud computing”) has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows. PMID:23248640

  16. From FASTQ to Function: In Silico Methods for Processing Next-Generation Sequencing Data.

    PubMed

    Preston, Mark D; Stabler, Richard A

    2016-01-01

    This chapter presents a method to process C. difficile whole-genome next-generation sequencing data straight from the sequencer. Quality control processing and de novo assembly of these data enable downstream analyses such as gene annotation and in silico multi-locus strain-type identification. PMID:27507331
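
    To make the quality-control step concrete, here is a minimal sketch of FASTQ parsing followed by a mean-quality filter. It is not the chapter's pipeline: it assumes four-line FASTQ records, Phred+33 quality encoding, and an arbitrary mean-quality threshold of 20, and the function names are hypothetical.

```python
import io

def read_fastq(handle):
    """Yield (name, sequence, quality_string) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual  # drop the leading '@'

def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string, assuming Phred+33 encoding."""
    return sum(ord(c) - offset for c in qual) / len(qual)

def quality_filter(handle, min_mean_q=20):
    """Keep reads whose mean Phred score reaches the (assumed) threshold."""
    return [(name, seq) for name, seq, qual in read_fastq(handle)
            if qual and mean_phred(qual) >= min_mean_q]

# Two toy reads: 'I' encodes Phred 40, '!' encodes Phred 0.
SAMPLE = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n!!!!\n"
kept = quality_filter(io.StringIO(SAMPLE))
print(kept)
```

    Reads that pass a filter of this kind would then be handed to a de novo assembler; dedicated QC tools also trim adapters and low-quality tails, which this sketch does not attempt.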

  17. Next generation sequencing provides rapid access to the genome of wheat stripe rust

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: The wheat stripe rust fungus (Puccinia striiformis f. sp. tritici, PST) is responsible for significant yield losses in wheat production worldwide. In spite of its economic importance, the PST genomic sequence is not currently available. Fortunately Next Generation Sequencing (NGS) has ra...

  18. Next generation sequencing of DNA-launched Chikungunya vaccine virus.

    PubMed

    Hidajat, Rachmat; Nickols, Brian; Forrester, Naomi; Tretyakova, Irina; Weaver, Scott; Pushko, Peter

    2016-03-01

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3' untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. PMID:26855330
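
    The reversion frequencies quoted above can be compared directly. The short calculation below uses only the four percentages from the abstract and expresses each as a fold difference between the 181/25 virus and the DNA-launched CHIKV-7 virus; the variable and function names are illustrative.

```python
# Reversion frequencies (percent) quoted in the abstract:
# site -> (DNA-launched CHIKV-7, parental 181/25)
REVERSION_PCT = {"E2-12": (0.064, 0.179), "E2-82": (0.086, 0.133)}

def fold_reduction(launched_pct, parental_pct):
    """How many times lower the reversion frequency is in the DNA-launched virus."""
    return parental_pct / launched_pct

folds = {site: fold_reduction(*pair) for site, pair in REVERSION_PCT.items()}
for site, fold in folds.items():
    print(f"{site}: {fold:.2f}-fold lower in CHIKV-7")
```

    On these figures the reduction is roughly 2.8-fold at E2-12 and 1.5-fold at E2-82.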

  19. FACS purification of Drosophila larval Neuroblasts for next generation sequencing

    PubMed Central

    Conder, Ryan; Schmauss, Gerald; Knoblich, Juergen A.

    2014-01-01

    Elegant tools are available for the genetic analysis of neural stem cell lineages in Drosophila, but a methodology for purifying stem cells and their differentiated progeny for transcriptome analysis is currently missing. Previous attempts to overcome this problem either involved using RNA isolated from whole larval brain tissue or co-transcriptional in vivo mRNA tagging. As both methods have limited cell type specificity, we developed a protocol for the isolation of Drosophila neural stem cells (neuroblasts, NBs) and their differentiated sibling cells by FACS. We dissected larval brains from fly strains expressing GFP under the control of a NB lineage-specific GAL4 line. Upon dissociation, we made use of differences in GFP intensity and cell size to separate NBs and neurons. The resulting cell populations are over 98% pure and can readily be used for live imaging or gene expression analysis. Our method is optimized for neural stem cells, but it can also be applied to other Drosophila cell types. Primary cell suspensions and sorted cell populations can be obtained within 1 d; material for deep-sequencing library preparation can be obtained within 4 d. PMID:23660757

  20. Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplificatio...

  1. Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplificatio...

  2. High conopeptide diversity in Conus tribblei revealed through analysis of venom duct transcriptome using two high-throughput sequencing platforms

    PubMed Central

    Barghi, Neda; Concepcion, Gisela P.; Olivera, Baldomero M.; Lluisma, Arturo O.

    2015-01-01

    The venom of each species of Conus contains different kinds of pharmacologically-active peptides which are mostly unique to that species. Collectively, the ~500–700 species of Conus produce a large number of these peptides, perhaps exceeding 140,000 different types in total. To date, however, only a small fraction of this diversity has been characterized via transcriptome sequencing. In addition, the sampling of this chemical diversity has not been uniform across the different lineages in the genus. In this study, we used a high-throughput transcriptome sequencing approach to further investigate the diversity of Conus venom peptides. We chose a species, Conus tribblei, as a representative of a poorly studied clade of Conus. Using the Roche 454 and Illumina platforms, we discovered 136 unique and novel putative conopeptides belonging to 30 known gene superfamilies and 6 new conopeptide groups, the greatest diversity so far observed from a transcriptome. Most of the identified peptides exhibited divergence from the known conopeptides and some contained cysteine frameworks observed for the first time in cone snails. In addition, several enzymes involved in post-translational modification of conopeptides and some proteins involved in efficient delivery of the conopeptides to prey were identified. Interestingly, a number of conopeptides highly similar to the conopeptides identified in a phylogenetically distant species, the generalist feeder Conus californicus, were observed. The high diversity of conopeptides and the presence of conopeptides similar to those in C. californicus suggest that C. tribblei may have a broad range of prey preferences. PMID:25117477

  3. RNA editing generates cellular subsets with diverse sequence within populations.

    PubMed

    Harjanto, Dewi; Papamarkou, Theodore; Oates, Chris J; Rayon-Estrada, Violeta; Papavasiliou, F Nina; Papavasiliou, Anastasia

    2016-01-01

    RNA editing is a mutational mechanism that specifically alters the nucleotide content in transcribed RNA. However, editing rates vary widely, and could result from equivalent editing amongst individual cells, or represent an average of variable editing within a population. Here we present a hierarchical Bayesian model that quantifies the variance of editing rates at specific sites using RNA-seq data from both single cells, and a cognate bulk sample to distinguish between these two possibilities. The model predicts high variance for specific edited sites in murine macrophages and dendritic cells, findings that we validated experimentally by using targeted amplification of specific editable transcripts from single cells. The model also predicts changes in variance in editing rates for specific sites in dendritic cells during the course of LPS stimulation. Our data demonstrate substantial variance in editing signatures amongst single cells, supporting the notion that RNA editing generates diversity within cellular populations. PMID:27418407
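
    The two possibilities contrasted above can be shown with a toy calculation. This is not the authors' hierarchical Bayesian model; it only demonstrates, with made-up per-cell editing rates, that two populations can share the same bulk average while differing completely in cell-to-cell variance.

```python
from statistics import mean, pvariance

def summarize(per_cell_rates):
    """Return (bulk-like average, cell-to-cell variance) of per-cell editing rates."""
    return mean(per_cell_rates), pvariance(per_cell_rates)

# Scenario A (assumed): every cell edits the site at the same 50% rate.
uniform_cells = [0.5] * 10
# Scenario B (assumed): half the cells never edit, half always edit.
mixed_cells = [0.0] * 5 + [1.0] * 5

uniform_summary = summarize(uniform_cells)
mixed_summary = summarize(mixed_cells)
print("uniform:", uniform_summary)
print("mixed:  ", mixed_summary)
```

    A bulk sample alone sees only the shared mean, which is why single-cell data are needed to tell the scenarios apart.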

  4. RNA editing generates cellular subsets with diverse sequence within populations

    PubMed Central

    Harjanto, Dewi; Papamarkou, Theodore; Oates, Chris J.; Rayon-Estrada, Violeta; Papavasiliou, F. Nina; Papavasiliou, Anastasia

    2016-01-01

    RNA editing is a mutational mechanism that specifically alters the nucleotide content in transcribed RNA. However, editing rates vary widely, and could result from equivalent editing amongst individual cells, or represent an average of variable editing within a population. Here we present a hierarchical Bayesian model that quantifies the variance of editing rates at specific sites using RNA-seq data from both single cells, and a cognate bulk sample to distinguish between these two possibilities. The model predicts high variance for specific edited sites in murine macrophages and dendritic cells, findings that we validated experimentally by using targeted amplification of specific editable transcripts from single cells. The model also predicts changes in variance in editing rates for specific sites in dendritic cells during the course of LPS stimulation. Our data demonstrate substantial variance in editing signatures amongst single cells, supporting the notion that RNA editing generates diversity within cellular populations. PMID:27418407

  5. Navigating the Rapids: The Development of Regulated Next-Generation Sequencing-Based Clinical Trial Assays and Companion Diagnostics

    PubMed Central

    Pant, Saumya; Weiner, Russell; Marton, Matthew J.

    2014-01-01

    Over the past decade, next-generation sequencing (NGS) technology has experienced meteoric growth in the aspects of platform, technology, and supporting bioinformatics development allowing its widespread and rapid uptake in research settings. More recently, NGS-based genomic data have been exploited to better understand disease development and patient characteristics that influence response to a given therapeutic intervention. Cancer, as a disease characterized by and driven by the tumor genetic landscape, is particularly amenable to NGS-based diagnostic (Dx) approaches. NGS-based technologies are particularly well suited to studying cancer disease development, progression and emergence of resistance, all key factors in the development of next-generation cancer Dxs. Yet, to achieve the promise of NGS-based patient treatment, drug developers will need to overcome a number of operational, technical, regulatory, and strategic challenges. Here, we provide a succinct overview of the state of the clinical NGS field in terms of the available clinically targeted platforms and sequencing technologies. We discuss the various operational and practical aspects of clinical NGS testing that will facilitate or limit the uptake of such assays in routine clinical care. We examine the current strategies for analytical validation and Food and Drug Administration (FDA)-approval of NGS-based assays and ongoing efforts to standardize clinical NGS and build quality control standards for the same. The rapidly evolving companion diagnostic (CDx) landscape for NGS-based assays will be reviewed, highlighting the key areas of concern and suggesting strategies to mitigate risk. The review will conclude with a series of strategic questions that face drug developers and a discussion of the likely future course of NGS-based CDx development efforts. PMID:24860780

  6. Next generation sequencing in clinical medicine: Challenges and lessons for pathology and biomedical informatics

    PubMed Central

    Gullapalli, Rama R.; Desai, Ketaki V.; Santana-Santos, Lucas; Kant, Jeffrey A.; Becich, Michael J.

    2012-01-01

    The Human Genome Project (HGP) provided the initial draft of mankind's DNA sequence in 2001. The HGP was produced by 23 collaborating laboratories using Sanger sequencing of mapped regions as well as shotgun sequencing techniques in a process that occupied 13 years at a cost of ~$3 billion. Today, Next Generation Sequencing (NGS) techniques represent the next phase in the evolution of DNA sequencing technology at dramatically reduced cost compared to traditional Sanger sequencing. A single laboratory today can sequence the entire human genome in a few days for a few thousand dollars in reagents and staff time. Routine whole exome or even whole genome sequencing of clinical patients is well within the realm of affordability for many academic institutions across the country. This paper reviews current sequencing technology methods and upcoming advancements in sequencing technology as well as challenges associated with data generation, data manipulation and data storage. Implementation of routine NGS data in cancer genomics is discussed along with potential pitfalls in the interpretation of the NGS data. The overarching importance of bioinformatics in the clinical implementation of NGS is emphasized. We also review the issue of physician education which also is an important consideration for the successful implementation of NGS in the clinical workplace. NGS technologies represent a golden opportunity for the next generation of pathologists to be at the leading edge of the personalized medicine approaches coming our way. Often under-emphasized issues of data access and control as well as potential ethical implications of whole genome NGS sequencing are also discussed. Despite some challenges, it's hard not to be optimistic about the future of personalized genome sequencing and its potential impact on patient care and the advancement of knowledge of human biology and disease in the near future. PMID:23248761

  7. Efficient generation of transgenic cattle using the DNA transposon and their analysis by next-generation sequencing.

    PubMed

    Yum, Soo-Young; Lee, Song-Jeon; Kim, Hyun-Min; Choi, Woo-Jae; Park, Ji-Hyun; Lee, Won-Wu; Kim, Hee-Soo; Kim, Hyeong-Jong; Bae, Seong-Hun; Lee, Je-Hyeong; Moon, Joo-Yeong; Lee, Ji-Hyun; Lee, Choong-Il; Son, Bong-Jun; Song, Sang-Hoon; Ji, Su-Min; Kim, Seong-Jin; Jang, Goo

    2016-01-01

    Here, we efficiently generated transgenic cattle using two transposon systems (Sleeping Beauty and piggyBac) and analyzed their genomes by next-generation sequencing (NGS). Blastocysts derived from microinjection of DNA transposons were selected and transferred into recipient cows. Nine transgenic cattle have been generated and, with two exceptions, have grown up to date without any health issues. Some of them expressed strong fluorescence, and the transgene was detected by PCR and sequencing in oocytes from one superovulating animal. To investigate genomic variants caused by transgene transposition, whole genomic DNA was analyzed by NGS. We identified the preferred transposon integration sites (TA or TTAA) in their genomes. Even though multiple copies (i.e., fifteen) were confirmed, there was no significant difference in genome instability. In conclusion, we demonstrated that transgenic cattle could be efficiently generated using the DNA transposon system, and that all those animals could be a valuable resource for agriculture and veterinary science. PMID:27324781

  8. Efficient generation of transgenic cattle using the DNA transposon and their analysis by next-generation sequencing

    PubMed Central

    Yum, Soo-Young; Lee, Song-Jeon; Kim, Hyun-Min; Choi, Woo-Jae; Park, Ji-Hyun; Lee, Won-Wu; Kim, Hee-Soo; Kim, Hyeong-Jong; Bae, Seong-Hun; Lee, Je-Hyeong; Moon, Joo-Yeong; Lee, Ji-Hyun; Lee, Choong-Il; Son, Bong-Jun; Song, Sang-Hoon; Ji, Su-Min; Kim, Seong-Jin; Jang, Goo

    2016-01-01

    Here, we efficiently generated transgenic cattle using two transposon systems (Sleeping Beauty and piggyBac) and analyzed their genomes by next-generation sequencing (NGS). Blastocysts derived from microinjection of DNA transposons were selected and transferred into recipient cows. Nine transgenic cattle have been generated and, with two exceptions, have grown up to date without any health issues. Some of them expressed strong fluorescence, and the transgene was detected by PCR and sequencing in oocytes from one superovulating animal. To investigate genomic variants caused by transgene transposition, whole genomic DNA was analyzed by NGS. We identified the preferred transposon integration sites (TA or TTAA) in their genomes. Even though multiple copies (i.e., fifteen) were confirmed, there was no significant difference in genome instability. In conclusion, we demonstrated that transgenic cattle could be efficiently generated using the DNA transposon system, and that all those animals could be a valuable resource for agriculture and veterinary science. PMID:27324781

  9. Consensus Rules in Variant Detection from Next-Generation Sequencing Data

    PubMed Central

    Jia, Peilin; Li, Fei; Xia, Jufeng; Chen, Haiquan; Ji, Hongbin; Pao, William; Zhao, Zhongming

    2012-01-01

    A critical step in detecting variants from next-generation sequencing data is post hoc filtering of putative variants called or predicted by computational tools. Here, we highlight four critical parameters that could enhance the accuracy of called single nucleotide variants and insertions/deletions: quality and depth, refinement and improvement of initial mapping, allele/strand balance, and examination of spurious genes. Appropriate use of these sequence features in variant filtering could greatly improve validation rates, thereby saving time and costs in next-generation sequencing projects. PMID:22715385
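
    Two of the parameters named above, quality/depth and allele/strand balance, lend themselves to a simple post hoc filter. The sketch below is an illustration under assumed thresholds, not the consensus rules from the paper: the `Call` record, its field names, and every cutoff value are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Call:
    name: str
    qual: float      # variant quality score from the caller
    depth: int       # total reads covering the site
    alt_fwd: int     # variant-supporting reads on the forward strand
    alt_rev: int     # variant-supporting reads on the reverse strand
    ref_reads: int   # reference-supporting reads

def passes(call, min_qual=30.0, min_depth=10, min_alt_frac=0.2, min_strand_frac=0.1):
    """Apply quality/depth, allele-balance and strand-balance checks (assumed cutoffs)."""
    alt = call.alt_fwd + call.alt_rev
    if call.qual < min_qual or call.depth < min_depth or alt == 0:
        return False
    allele_frac = alt / (alt + call.ref_reads)            # allele balance
    strand_frac = min(call.alt_fwd, call.alt_rev) / alt   # strand balance
    return allele_frac >= min_alt_frac and strand_frac >= min_strand_frac

calls = [
    Call("balanced", 50.0, 40, 10, 9, 21),
    Call("strand_biased", 50.0, 40, 19, 0, 21),
    Call("shallow", 50.0, 4, 2, 2, 0),
]
kept_names = [c.name for c in calls if passes(c)]
print(kept_names)
```

    Mapping refinement and screening of spurious genes, the other two parameters, depend on the aligner and on curated gene lists and are outside the scope of this sketch.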

  10. Generation of multivariate autoregressive sequences with emphasis on initial values

    NASA Astrophysics Data System (ADS)

    Ula, Taylan A.

    1992-12-01

    Certain aspects of data generation are studied through multivariate autoregressive (AR) models. The main emphasis is on the preservation of certain desired moments and the effect of initial values on these moments. The problem of preservation of moments is approached in a nontraditional way by starting with the initial values. For this purpose, general AR processes with a random start and with time-varying parameters are introduced to lay a foundation for the analysis of all types of AR processes, including the periodic cases. It is shown that an AR process with a random start and with parameters obtained from the moment equations is capable of generating jointly multivariate normal vectors with any specified means and covariance matrices, and with any specified autocovariance matrices up to a given lag. With a random start, there is no transition period involved for achieving these moments. A simple solution is proposed for matrix equations of the form BB^T = A which appear in the moment equations. The aggregation properties of general AR processes are also studied. A more detailed analysis is given for the two-period first-order periodic autoregressive model, PAR_2(1). For the PAR_2(1) process with a random start and with parameters obtained from the moment equations, it is shown that the autocovariance function depends only on the period and the lag, and therefore the process is periodic (covariance) stationary. The PAR_2(1) process with a fixed start is also studied. It is shown that the moments of this process depend on the absolute time, in addition to the period and the lag, and therefore the process is not periodic stationary. This dependence diminishes with time, and periodic stationarity is realized if the AR parameters satisfy certain conditions. In that case, the PAR_2(1) process with a fixed start converges to that with a random start, but only after a certain transition period. This proves the superiority of a random start over a fixed start.
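
    The contrast between a random and a fixed start can be checked numerically for the simplest case, a stationary first-order model X_t = Phi X_{t-1} + B e_t with constant parameters. The sketch below is an illustration only: the abstract does not say which solution of BB^T = A the paper proposes, so a Cholesky factor is used as a stand-in, and the matrices `PHI` and `M0` are arbitrary example values. It builds A from the moment equation A = M0 - Phi M0 Phi^T, factors it, and then propagates the covariance one step from each kind of start.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def cholesky(a):
    """Lower-triangular L with L L^T = a (a must be symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

PHI = [[0.5, 0.1], [0.0, 0.3]]   # assumed AR(1) coefficient matrix
M0 = [[1.0, 0.5], [0.5, 2.0]]    # assumed target covariance of X_t

# Moment equation: B B^T = A = M0 - PHI M0 PHI^T
phi_m0_phit = matmul(matmul(PHI, M0), transpose(PHI))
A = [[M0[i][j] - phi_m0_phit[i][j] for j in range(2)] for i in range(2)]
B = cholesky(A)
BBT = matmul(B, transpose(B))

def next_cov(cov):
    """Covariance of X_t given the covariance of X_{t-1}."""
    prop = matmul(matmul(PHI, cov), transpose(PHI))
    return [[prop[i][j] + BBT[i][j] for j in range(2)] for i in range(2)]

random_start = next_cov(M0)                       # X_0 drawn with covariance M0
fixed_start = next_cov([[0.0, 0.0], [0.0, 0.0]])  # X_0 fixed, so zero covariance
print("random start, t=1:", random_start)
print("fixed start,  t=1:", fixed_start)
```

    With the random start the covariance equals M0 from the first step, while the fixed start begins at BB^T and only approaches M0 over a transition period, which matches the behaviour the abstract describes.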

  11. Comparative study of aCGH and Next Generation Sequencing (NGS) for chromosomal microdeletion and microduplication screening

    PubMed Central

    Russo, Claudio Dello; Di Giacomo, Gianluca; Cignini, Pietro; Padula, Francesco; Mangiafico, Lucia; Mesoraca, Alvaro; D’Emidio, Laura; McCluskey, Megan R.; Paganelli, Arianna; Giorlandino, Claudio

    2014-01-01

    Background: Prenatal genetic diagnosis of rare disorders has been significantly enhanced in recent years through the application of massively parallel sequencing methods. Despite the quantity and quality of the data produced, only a few analytical tools and software packages have been developed to identify structural and numerical chromosomal anomalies through NGS, and most are not compatible with benchtop NGS platforms and routine clinical diagnosis. Methods: We developed technical, bioinformatic, interpretive and validation pipelines for Next Generation Sequencing to identify SNPs, indels, aneuploidies, and CNVs (Copy Number Variations). Results: We show a new targeted resequencing approach applied to prenatal diagnosis. For sample processing we used an enrichment method for library preparation covering 4,813 genes; after sequencing, our bioinformatic pipelines allowed both SNP analysis for approximately thirty diseases or disease families involved in fetal development and screening for numerical chromosomal anomalies. Conclusions: The results are compatible with those obtained through the gold standard technique, aCGH, and additionally allow identification of genes involved in chromosome deletions or duplications and exclusion of point mutations on the allele not affected by chromosome aberrations. PMID:26266003

  12. Next-generation sequencing for disorders of low and high bone mineral density

    PubMed Central

    Sule, Gautam; Campeau, Philippe M.; Zhang, Victor Wei; Nagamani, Sandesh C.S.; Dawson, Brian C.; Grover, Monica; Bacino, Carlos A.; Sutton, V. Reid; Brunetti-Pierri, Nicola; Lu, James T.; Lemire, Edmond; Gibbs, Richard A.; Cohn, Dan H.; Cui, Hong; Wong, Lee-Jun C.; Lee, Brendan H.

    2013-01-01

    Introduction: Osteogenesis imperfecta (OI), Ehlers-Danlos syndrome (EDS), and osteopetrosis (OPT) are collectively common inherited skeletal diseases. Evaluation of subjects with these conditions often includes molecular testing, which has important counseling, therapeutic and sometimes legal implications. Since several different genes have been implicated in these conditions, Sanger sequencing of each gene can be a prohibitively expensive and time-consuming way to reach a molecular diagnosis. Methods: To circumvent these problems, we designed and tested an NGS platform that allows simultaneous sequencing, on a single diagnostic platform, of different genes implicated in OI, OPT, EDS, and other inherited conditions leading to low or high bone mineral density. We used a liquid-phase probe library that captures 602 exons (~100 kb) of 34 selected genes and applied it to test clinical samples from patients with bone disorders. Results: NGS of the captured exons by Illumina HiSeq2000 resulted in an average coverage of over 900X. The platform was successfully validated by identifying mutations in 6 patients with known mutations. Moreover, in 4 patients with OI or OPT without a prior molecular diagnosis, the assay was able to detect the causative mutations. Conclusions: Our NGS panel provides a fast and accurate method to arrive at a molecular diagnosis in most patients with inherited high or low bone mineral density disorders. PMID:23443412
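
    The coverage figure above follows from simple arithmetic: mean coverage is the number of on-target sequenced bases divided by the target size. The sketch below works this through for a ~100 kb target; the read count, the 100 bp read length and the on-target fraction are assumed values chosen for illustration, not numbers from the study.

```python
def mean_coverage(n_reads, read_length, target_bp, on_target_fraction=1.0):
    """Average depth over the target: on-target bases divided by target size."""
    return n_reads * read_length * on_target_fraction / target_bp

TARGET_BP = 100_000   # ~100 kb of captured exons, as stated in the abstract
READ_LENGTH = 100     # assumed read length in bases
N_READS = 900_000     # assumed number of on-target reads

coverage = mean_coverage(N_READS, READ_LENGTH, TARGET_BP)
print(f"mean coverage: {coverage:.0f}X")
```

    Put another way, 900X over 100 kb corresponds to about 90 Mb of on-target sequence, a small share of a sequencing run, which is part of why small capture panels can reach such high depth.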

  13. Short reads and nonmodel species: exploring the complexities of next-generation sequence assembly and SNP discovery in the absence of a reference genome.

    PubMed

    Everett, M V; Grau, E D; Seeb, J E

    2011-03-01

    How practical is gene and SNP discovery in a nonmodel species using short read sequences? Next-generation sequencing technologies are being applied to an increasing number of species with no reference genome. For nonmodel species, the cost, availability of existing genetic resources, genome complexity and the planned method of assembly must all be considered when selecting a sequencing platform. Our goal was to examine the feasibility and optimal methodology for SNP and gene discovery in the sockeye salmon (Oncorhynchus nerka) using short read sequences. SOLiD short reads (up to 50 bp) were generated from single- and pooled-tissue transcriptome libraries from ten sockeye salmon. The individuals were from five distinct populations from the Wood River Lakes and Mendeltna Creek, Alaska. As no reference genome was available for sockeye salmon, the SOLiD sequence reads were assembled to publicly available EST reference sequences from sockeye salmon and two closely related species, rainbow trout (Oncorhynchus mykiss) and Atlantic salmon (Salmo salar). Addi