Science.gov

Sample records for generation sequencing platforms

  1. Next-Generation Sequencing Platforms

    NASA Astrophysics Data System (ADS)

    Mardis, Elaine R.

    2013-06-01

    Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.

  2. Toward Complete Bacterial Genome Sequencing Through the Combined Use of Multiple Next-Generation Sequencing Platforms.

    PubMed

    Jeong, Haeyoung; Lee, Dae-Hee; Ryu, Choong-Min; Park, Seung-Hwan

    2016-01-01

    PacBio's long-read sequencing technologies can be successfully used for a complete bacterial genome assembly using recently developed non-hybrid assemblers in the absence of second-generation, high-quality short reads. However, standardized procedures that take into account multiple pre-existing second-generation sequencing platforms are scarce. In addition to Illumina HiSeq and Ion Torrent PGM-based genome sequencing results derived from previous studies, we generated further sequencing data, including from the PacBio RS II platform, and applied various bioinformatics tools to obtain complete genome assemblies for five bacterial strains. Our approach revealed that the hierarchical genome assembly process (HGAP) non-hybrid assembler resulted in nearly complete assemblies at a moderate coverage of ~75x, but that different versions produced non-compatible results requiring post-processing. The other two platforms further improved the PacBio assembly through scaffolding and a final error correction.
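The "~75x" coverage figure in the abstract above is the usual ratio of sequenced bases to genome length. A minimal sketch of that arithmetic follows; the function names and the 5 Mb genome size are assumptions chosen for illustration and do not come from the study.

```python
def mean_coverage(total_sequenced_bases: int, genome_size: int) -> float:
    """Average depth: total bases sequenced divided by genome length."""
    return total_sequenced_bases / genome_size


def bases_needed(target_coverage: float, genome_size: int) -> int:
    """Total bases required to reach a target average depth."""
    return int(target_coverage * genome_size)


# Hypothetical 5 Mb bacterial genome (not a value from the abstract).
GENOME_SIZE = 5_000_000
depth = mean_coverage(375_000_000, GENOME_SIZE)
required = bases_needed(75, GENOME_SIZE)
print(depth, required)
```

Average depth says nothing about how evenly reads are distributed, so it is only a first-pass planning number.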

  3. Use of four next-generation sequencing platforms to determine HIV-1 coreceptor tropism.

    PubMed

    Archer, John; Weber, Jan; Henry, Kenneth; Winner, Dane; Gibson, Richard; Lee, Lawrence; Paxinos, Ellen; Arts, Eric J; Robertson, David L; Mimms, Larry; Quiñones-Mateu, Miguel E

    2012-01-01

    HIV-1 coreceptor tropism assays are required to rule out the presence of CXCR4-tropic (non-R5) viruses prior to treatment with CCR5 antagonists. Phenotypic (e.g., Trofile™, Monogram Biosciences) and genotypic (e.g., population sequencing linked to bioinformatic algorithms) assays are the most widely used. Although several next-generation sequencing (NGS) platforms are available, to date all published deep sequencing HIV-1 tropism studies have used the 454™ Life Sciences/Roche platform. In this study, HIV-1 co-receptor usage was predicted for twelve patients scheduled to start a maraviroc-based antiretroviral regimen. The V3 region of the HIV-1 env gene was sequenced using four NGS platforms: 454™, PacBio® RS (Pacific Biosciences), Illumina®, and Ion Torrent™ (Life Technologies). Cross-platform variation was evaluated, including number of reads, read length and error rates. HIV-1 tropism was inferred using Geno2Pheno, Web PSSM, and the 11/24/25 rule and compared with Trofile™ and virologic response to antiretroviral therapy. Error rates related to insertions/deletions (indels) and nucleotide substitutions introduced by the four NGS platforms were low compared to the actual HIV-1 sequence variation. Each platform detected all major virus variants within the HIV-1 population with similar frequencies. Identification of non-R5 viruses was comparable among the four platforms, with minor differences attributable to the algorithms used to infer HIV-1 tropism. All NGS platforms showed similar concordance with virologic response to the maraviroc-based regimen (75% to 80% range depending on the algorithm used), compared to Trofile (80%) and population sequencing (70%). In conclusion, all four NGS platforms were able to detect minority non-R5 variants at comparable levels, suggesting that any NGS-based method can be used to predict HIV-1 coreceptor usage.
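Of the three inference methods named in the abstract above, the 11/24/25 rule is simple enough to sketch. As commonly stated, it calls a V3 loop non-R5 when a positively charged residue (arginine or lysine) occupies position 11, 24, or 25. The sketch below assumes the input is an ungapped V3 amino-acid string already in reference numbering; real sequences contain indels and need alignment first. The function name and the example sequences are illustrative, not taken from the study.

```python
POSITIVE_RESIDUES = {"R", "K"}  # arginine, lysine
RULE_POSITIONS = (11, 24, 25)   # 1-based V3 positions


def predict_tropism_11_24_25(v3: str) -> str:
    """Return 'non-R5' if R or K is at V3 position 11, 24, or 25, else 'R5'.

    Assumes `v3` is ungapped and numbered like the reference V3 loop.
    """
    for pos in RULE_POSITIONS:
        if len(v3) >= pos and v3[pos - 1] in POSITIVE_RESIDUES:
            return "non-R5"
    return "R5"


# Illustrative 35-residue V3-like sequences (not patient data).
example_r5 = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
example_x4 = "CTRPNNNTRKRIHIGPGRAFYTTGEIIGDIRQAHC"  # S -> R at position 11
print(predict_tropism_11_24_25(example_r5), predict_tropism_11_24_25(example_x4))
```

With deep sequencing data, a rule like this would be applied per read or per variant, and the fraction of non-R5 calls compared against a chosen threshold.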

  4. FLEXBAR—Flexible Barcode and Adapter Processing for Next-Generation Sequencing Platforms

    PubMed Central

    Dodt, Matthias; Roehr, Johannes T.; Ahmed, Rina; Dieterich, Christoph

    2012-01-01

    Quantitative and systems biology approaches benefit from the unprecedented depth of next-generation sequencing. A typical experiment yields millions of short reads, which oftentimes carry particular sequence tags. These tags may (a) be specific to the sequencing platform and library construction method (e.g., adapter sequences); (b) have been introduced by experimental design (e.g., sample barcodes); or (c) constitute some biological signal (e.g., splice leader sequences in nematodes). Our software FLEXBAR enables accurate recognition, sorting and trimming of sequence tags with maximal flexibility, based on exact overlap sequence alignment. The software supports data formats from all current sequencing platforms, including color-space reads. FLEXBAR maintains read pairings and processes separate barcode reads on demand. Our software facilitates the fine-grained adjustment of sequence tag detection parameters and search regions. FLEXBAR is multi-threaded software and combines speed with precision. Even complex read processing scenarios can be executed with a single command line call. We demonstrate the utility of the software in terms of read mapping applications, library demultiplexing and splice leader detection. FLEXBAR and additional information are available for academic use from the website: http://sourceforge.net/projects/flexbar/. PMID:24832523
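To make the idea of tag trimming concrete, here is a deliberately simplified sketch of 3' adapter removal. It is not FLEXBAR's algorithm: it uses plain string matching with no mismatches or gaps, whereas the abstract describes an overlap alignment. The function name, the `min_overlap` parameter, and the example reads are assumptions for illustration.

```python
def trim_3prime_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove a 3' adapter, or a partial adapter at the read end, by exact matching."""
    # Case 1: the full adapter occurs inside the read; cut at its first occurrence.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends with a prefix of the adapter; try the longest prefix first.
    for k in range(min(len(adapter), len(read)), min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No adapter evidence of at least min_overlap bases: leave the read unchanged.
    return read


ADAPTER = "AGATCGGAAGAGC"  # illustrative adapter sequence
full = trim_3prime_adapter("ACGT" + ADAPTER + "TT", ADAPTER)
partial = trim_3prime_adapter("ACGTACGTAGATCGG", ADAPTER)
clean = trim_3prime_adapter("ACGTACGT", ADAPTER)
print(full, partial, clean)
```

The `min_overlap` guard trades sensitivity for specificity: a very short suffix match is likely to occur by chance, so requiring a few matching bases avoids clipping genuine sequence.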

  5. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses

    PubMed Central

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-01-01

    Due to the upcoming deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analysis tools, and efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements; therefore, biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analysis tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. PMID:24462600

  6. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

    PubMed

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-06-01

    Due to the upcoming deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analysis tools, and efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements; therefore, biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analysis tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach.

  7. A Microfluidic DNA Library Preparation Platform for Next-Generation Sequencing

    PubMed Central

    Sinha, Anupama; Bent, Zachary W.; Solberg, Owen D.; Williams, Kelly P.; Langevin, Stanley A.; Renzi, Ronald F.; Van De Vreugde, James L.; Meagher, Robert J.; Schoeniger, Joseph S.; Lane, Todd W.; Branda, Steven S.; Bartsch, Michael S.; Patel, Kamlesh D.

    2013-01-01

    Next-generation sequencing (NGS) is emerging as a powerful tool for elucidating genetic information for a wide range of applications. Unfortunately, the surging popularity of NGS has not yet been accompanied by an improvement in automated techniques for preparing formatted sequencing libraries. To address this challenge, we have developed a prototype microfluidic system for preparing sequencer-ready DNA libraries for analysis by Illumina sequencing. Our system combines droplet-based digital microfluidic (DMF) sample handling with peripheral modules to create a fully-integrated, sample-in library-out platform. In this report, we use our automated system to prepare NGS libraries from samples of human and bacterial genomic DNA. E. coli libraries prepared on-device from 5 ng of total DNA yielded excellent sequence coverage over the entire bacterial genome, with >99% alignment to the reference genome, even genome coverage, and good quality scores. Furthermore, we produced a de novo assembly on a previously unsequenced multi-drug resistant Klebsiella pneumoniae strain BAA-2146 (KpnNDM). The new method described here is fast, robust, scalable, and automated. Our device for library preparation will assist in the integration of NGS technology into a wide variety of laboratories, including small research laboratories and clinical laboratories. PMID:23894387

  8. A microfluidic DNA library preparation platform for next-generation sequencing.

    PubMed

    Kim, Hanyoup; Jebrail, Mais J; Sinha, Anupama; Bent, Zachary W; Solberg, Owen D; Williams, Kelly P; Langevin, Stanley A; Renzi, Ronald F; Van De Vreugde, James L; Meagher, Robert J; Schoeniger, Joseph S; Lane, Todd W; Branda, Steven S; Bartsch, Michael S; Patel, Kamlesh D

    2013-01-01

    Next-generation sequencing (NGS) is emerging as a powerful tool for elucidating genetic information for a wide range of applications. Unfortunately, the surging popularity of NGS has not yet been accompanied by an improvement in automated techniques for preparing formatted sequencing libraries. To address this challenge, we have developed a prototype microfluidic system for preparing sequencer-ready DNA libraries for analysis by Illumina sequencing. Our system combines droplet-based digital microfluidic (DMF) sample handling with peripheral modules to create a fully-integrated, sample-in library-out platform. In this report, we use our automated system to prepare NGS libraries from samples of human and bacterial genomic DNA. E. coli libraries prepared on-device from 5 ng of total DNA yielded excellent sequence coverage over the entire bacterial genome, with >99% alignment to the reference genome, even genome coverage, and good quality scores. Furthermore, we produced a de novo assembly on a previously unsequenced multi-drug resistant Klebsiella pneumoniae strain BAA-2146 (KpnNDM). The new method described here is fast, robust, scalable, and automated. Our device for library preparation will assist in the integration of NGS technology into a wide variety of laboratories, including small research laboratories and clinical laboratories.

  9. Clinical analysis of genome next-generation sequencing data using the Omicia platform

    PubMed Central

    Coonrod, Emily M; Margraf, Rebecca L; Russell, Archie; Voelkerding, Karl V; Reese, Martin G

    2013-01-01

    Aims Next-generation sequencing is being implemented in the clinical laboratory environment for the purposes of candidate causal variant discovery in patients affected with a variety of genetic disorders. The successful implementation of this technology for diagnosing genetic disorders requires a rapid, user-friendly method to annotate variants and generate short lists of clinically relevant variants of interest. This report describes Omicia’s Opal platform, a new software tool designed for variant discovery and interpretation in a clinical laboratory environment. The software allows clinical scientists to process, analyze, interpret and report on personal genome files. Materials & Methods To demonstrate the software, the authors describe the interactive use of the system for the rapid discovery of disease-causing variants using three cases. Results & Conclusion Here, the authors show the features of the Opal system and their use in uncovering variants of clinical significance. PMID:23895124

  10. Performance comparison of next-generation sequencing platforms for determining HIV-1 coreceptor use

    PubMed Central

    Raymond, Stéphanie; Nicot, Florence; Jeanne, Nicolas; Delfour, Olivier; Carcenac, Romain; Lefebvre, Caroline; Cazabat, Michelle; Sauné, Karine; Delobel, Pierre; Izopet, Jacques

    2017-01-01

    The coreceptor used by HIV-1 must be determined before a CCR5 antagonist, part of the arsenal of antiretroviral drugs, is prescribed because viruses that enter cells using the CXCR4 coreceptor are responsible for treatment failure. HIV-1 tropism is also correlated with disease progression and so must be determined for virological studies. Tropism can be determined by next-generation sequencing (NGS), but not all of these new technologies have been fully validated for use in clinical practice. The Illumina NGS technology is used in many laboratories but its ability to predict HIV-1 tropism has not been evaluated, while the 454 GS-Junior (Roche) is used for routine diagnosis. The genotypic prediction of HIV-1 tropism is based on sequencing the V3 region and interpreting the results with an appropriate algorithm. We compared the performances of the MiSeq (Illumina) and 454 GS-Junior (Roche) systems with a reference phenotypic assay. We used clinical samples for the NGS tropism predictions and assessed their ability to quantify CXCR4-using variants. The data show that the Illumina platform can be used to detect minor CXCR4-using variants in clinical practice but technical optimizations are needed to improve quantification. PMID:28186189

  11. Next-Generation Sequencing Workflow for NSCLC Critical Samples Using a Targeted Sequencing Approach by Ion Torrent PGM™ Platform

    PubMed Central

    Vanni, Irene; Coco, Simona; Truini, Anna; Rusmini, Marta; Dal Bello, Maria Giovanna; Alama, Angela; Banelli, Barbara; Mora, Marco; Rijavec, Erika; Barletta, Giulia; Genova, Carlo; Biello, Federica; Maggioni, Claudia; Grossi, Francesco

    2015-01-01

    Next-generation sequencing (NGS) is a cost-effective technology capable of screening several genes simultaneously; however, its application in a clinical context requires an established workflow to acquire reliable sequencing results. Here, we report an optimized NGS workflow analyzing 22 lung cancer-related genes to sequence critical samples such as DNA from formalin-fixed paraffin-embedded (FFPE) blocks and circulating free DNA (cfDNA). Snap frozen and matched FFPE gDNA from 12 non-small cell lung cancer (NSCLC) patients, whose gDNA fragmentation status was previously evaluated using a multiplex PCR-based quality control, were successfully sequenced with Ion Torrent PGM™. The robust bioinformatic pipeline allowed us to correctly call both Single Nucleotide Variants (SNVs) and indels with a detection limit of 5%, achieving 100% specificity and 96% sensitivity. This workflow was also validated in 13 FFPE NSCLC biopsies. Furthermore, a specific protocol for low input gDNA capable of producing good sequencing data with high coverage, high uniformity, and a low error rate was also optimized. In conclusion, we demonstrate the feasibility of obtaining gDNA from FFPE samples suitable for NGS by performing appropriate quality controls. The optimized workflow, capable of screening low input gDNA, highlights NGS as a potential tool in the detection, disease monitoring, and treatment of NSCLC. PMID:26633390
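The sensitivity, specificity, and 5% detection limit quoted in the abstract above follow standard definitions, sketched below. The confusion-matrix counts and read depths are hypothetical values chosen only to reproduce the reported percentages; the abstract does not give the underlying counts.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real variants that were called: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-variant sites correctly left uncalled: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


def passes_detection_limit(alt_reads: int, depth: int, limit: float = 0.05) -> bool:
    """True if the variant allele frequency reaches the detection limit."""
    return depth > 0 and alt_reads / depth >= limit


# Hypothetical counts, not data from the study.
sens = sensitivity(true_pos=24, false_neg=1)    # 0.96
spec = specificity(true_neg=50, false_pos=0)    # 1.0
called = passes_detection_limit(alt_reads=60, depth=1000)   # 6% allele frequency
missed = passes_detection_limit(alt_reads=30, depth=1000)   # 3% allele frequency
print(sens, spec, called, missed)
```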

  12. A platform for leveraging next generation sequencing for routine microbiology and public health use.

    PubMed

    Rusu, Laura I; Wyres, Kelly L; Reumann, Matthias; Queiroz, Carlos; Bojovschi, Alexe; Conway, Tom; Garg, Saurabh; Edwards, David J; Hogg, Geoff; Holt, Kathryn E

    2015-01-01

    Even with the advent of next-generation sequencing (NGS) technologies, which have revolutionised the field of bacterial genomics in recent years, a major barrier still exists to the implementation of NGS for routine microbiological use (in public health and clinical microbiology laboratories). Such routine use would make a big difference to investigations of pathogen transmission and prevention/control of (sometimes lethal) infections. The inherent complexity and high frequency of data analyses on very large sets of bacterial DNA sequence data, the ability to ensure data provenance and automatically track and log all analyses for audit purposes, the need for quick and accurate results, together with an essential user-friendly interface for regular non-technical laboratory staff, are all critical requirements for routine use in a public health setting. No current system meets all these requirements in an integrated manner. In this paper, we describe a system for sequence analysis and interpretation that is highly automated and tackles the issues raised earlier, and that is designed for use in diagnostic laboratories by healthcare workers with no specialist bioinformatics knowledge.

  13. Full-length novel MHC class I allele discovery by next-generation sequencing: two platforms are better than one.

    PubMed

    Dudley, Dawn M; Karl, Julie A; Creager, Hannah M; Bohn, Patrick S; Wiseman, Roger W; O'Connor, David H

    2014-01-01

    Deep sequencing has revolutionized major histocompatibility complex (MHC) class I analysis of nonhuman primates by enabling high-throughput, economical, and comprehensive genotyping. Full-length MHC class I cDNA sequences, which are required to generate reagents such as MHC-peptide tetramers, cannot be directly obtained by short read deep sequencing. We combined data from two next-generation sequencing platforms to discover novel full-length MHC class I mRNA/cDNA transcripts in Chinese rhesus macaques. We first genotyped macaques by Roche/454 pyrosequencing using a 530-bp amplicon spanning the densely polymorphic exons 2 through 4 of the MHC class I loci that encode the peptide-binding region. We then mapped short paired-end 250 bp Illumina sequence reads spanning the full-length transcript to each 530-bp amplicon at high stringency and used paired-end information to reconstruct full-length allele sequences. We characterized 65 full-length sequences from six Chinese rhesus macaques. Overall, approximately 70 % of the alleles distinguished in these six animals contained new sequence information, including 29 novel transcripts. The flexibility of this approach should make full-length MHC class I allele genotyping accessible for any nonhuman primate population of interest. We are currently optimizing this method for full-length characterization of other highly polymorphic, duplicated loci such as the MHC class II DRB and killer immunoglobulin-like receptors. We anticipate that this method will facilitate rapid expansion and near completion of sequence libraries of polymorphic loci, such as MHC class I, within a few years.

  14. A two-dimensional pooling strategy for rare variant detection on next-generation sequencing platforms.

    PubMed

    Zuzarte, Philip C; Denroche, Robert E; Fehringer, Gordon; Katzov-Eckert, Hagit; Hung, Rayjean J; McPherson, John D

    2014-01-01

    We describe a method for pooling and sequencing DNA from a large number of individual samples while preserving information regarding sample identity. DNA from 576 individuals was arranged into four 12 row by 12 column matrices and then pooled by row and by column resulting in 96 total pools with 12 individuals in each pool. Pooling of DNA was carried out in a two-dimensional fashion, such that DNA from each individual is present in exactly one row pool and exactly one column pool. By considering the variants observed in the rows and columns of a matrix we are able to trace rare variants back to the specific individuals that carry them. The pooled DNA samples were enriched over a 250 kb region previously identified by GWAS to significantly predispose individuals to lung cancer. All 96 pools (12 row and 12 column pools from 4 matrices) were barcoded and sequenced on an Illumina HiSeq 2000 instrument with an average depth of coverage greater than 4,000×. Verification based on Ion PGM sequencing confirmed the presence of 91.4% of confidently classified SNVs assayed. In this way, each individual sample is sequenced in multiple pools providing more accurate variant calling than a single pool or a multiplexed approach. This provides a powerful method for rare variant detection in regions of interest at a reduced cost to the researcher.
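The pooling arithmetic and the row/column tracing described in the abstract above can be sketched as follows. This is an illustration of the general two-dimensional pooling idea, not the authors' pipeline; the function names and the example positive pools are assumptions.

```python
from itertools import product


def build_pools(n_rows: int = 12, n_cols: int = 12):
    """Assign every (row, col) well to exactly one row pool and one column pool."""
    row_pools = {r: [(r, c) for c in range(n_cols)] for r in range(n_rows)}
    col_pools = {c: [(r, c) for r in range(n_rows)] for c in range(n_cols)}
    return row_pools, col_pools


def trace_carriers(positive_rows, positive_cols):
    """Candidate carriers sit where a positive row pool meets a positive column pool."""
    return sorted(product(positive_rows, positive_cols))


row_pools, col_pools = build_pools()
pools_per_matrix = len(row_pools) + len(col_pools)  # 12 + 12
total_pools = 4 * pools_per_matrix                  # four matrices
individuals = 4 * 12 * 12

# A variant seen in one row pool and one column pool points to a single well.
carriers = trace_carriers({3}, {7})
# Two positive rows and two positive columns leave four candidate wells.
ambiguous = trace_carriers({1, 4}, {2, 9})
print(total_pools, individuals, carriers, ambiguous)
```

The second call shows the design's main limitation: when a variant appears in several row and column pools of the same matrix, the intersections over-count carriers, which is why the approach suits rare variants.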

  15. A comprehensive transcriptome assembly of pigeonpea (Cajanus cajan L.) using Sanger and second-generation sequencing platforms

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A comprehensive transcriptome assembly for pigeonpea has been developed by analyzing 128.9 million short Illumina GA IIx single end reads, 2.19 million single end FLX/454 reads, and 18,353 Sanger expressed sequence tags (ESTs) from more than 16 genotypes. The resultant transcriptome assembly, refer...

  16. Parallel tagged amplicon sequencing of transcriptome-based genetic markers for Triturus newts with the Ion Torrent next-generation sequencing platform

    PubMed Central

    Wielstra, B; Duijm, E; Lagler, P; Lammers, Y; Meilink, W R M; Ziermann, J M; Arntzen, J W

    2014-01-01

    Next-generation sequencing is a fast and cost-effective way to obtain sequence data for nonmodel organisms for many markers and for many individuals. We describe a protocol through which we obtain orthologous markers for the crested newts (Amphibia: Salamandridae: Triturus), suitable for analysis of interspecific hybridization. We use transcriptome data of a single Triturus species and design 96 primer pairs that amplify c. 180 bp fragments positioned in 3-prime untranslated regions. Next, these markers are tested with uniplex PCR for a set of species spanning the taxonomical width of the genus Triturus. The 52 markers that consistently show a single band of expected length at gel electrophoreses for all tested crested newt species are then amplified in five multiplex PCRs (with a plexity of ten or eleven) for 132 individual newts: a set of 84 representing the seven (candidate) species and a set of 48 from a presumed hybrid population. After pooling multiplexes per individual, unique tags are ligated to link amplicons to individuals. Subsequently, individuals are pooled equimolar and sequenced on the Ion Torrent next-generation sequencing platform. A bioinformatics pipeline identifies the alleles and recodes these to a genotypic format. Next, we test the utility of our markers. baps allocates the 84 crested newt individuals representing (candidate) species to their expected (candidate) species, confirming the markers are suitable for species delineation. newhybrids, a hybrid index and hiest confirm the 48 individuals from the presumed hybrid population to be genetically admixed, illustrating the potential of the markers to identify interspecific hybridization. We expect the set of markers we designed to provide a high resolving power for analysis of hybridization in Triturus. PMID:24571307

  17. Parallel tagged amplicon sequencing of transcriptome-based genetic markers for Triturus newts with the Ion Torrent next-generation sequencing platform.

    PubMed

    Wielstra, B; Duijm, E; Lagler, P; Lammers, Y; Meilink, W R M; Ziermann, J M; Arntzen, J W

    2014-09-01

    Next-generation sequencing is a fast and cost-effective way to obtain sequence data for nonmodel organisms for many markers and for many individuals. We describe a protocol through which we obtain orthologous markers for the crested newts (Amphibia: Salamandridae: Triturus), suitable for analysis of interspecific hybridization. We use transcriptome data of a single Triturus species and design 96 primer pairs that amplify c. 180 bp fragments positioned in 3-prime untranslated regions. Next, these markers are tested with uniplex PCR for a set of species spanning the taxonomical width of the genus Triturus. The 52 markers that consistently show a single band of expected length at gel electrophoreses for all tested crested newt species are then amplified in five multiplex PCRs (with a plexity of ten or eleven) for 132 individual newts: a set of 84 representing the seven (candidate) species and a set of 48 from a presumed hybrid population. After pooling multiplexes per individual, unique tags are ligated to link amplicons to individuals. Subsequently, individuals are pooled equimolar and sequenced on the Ion Torrent next-generation sequencing platform. A bioinformatics pipeline identifies the alleles and recodes these to a genotypic format. Next, we test the utility of our markers. baps allocates the 84 crested newt individuals representing (candidate) species to their expected (candidate) species, confirming the markers are suitable for species delineation. newhybrids, a hybrid index and hiest confirm the 48 individuals from the presumed hybrid population to be genetically admixed, illustrating the potential of the markers to identify interspecific hybridization. We expect the set of markers we designed to provide a high resolving power for analysis of hybridization in Triturus.

  18. Multi-platform and cross-methodological reproducibility of transcriptome profiling by RNA-seq in the ABRF Next-Generation Sequencing Study

    PubMed Central

    Nicolet, Charles M.; Grove, Deborah; Levy, Shawn; Farmerie, William; Viale, Agnes; Wright, Chris; Schweitzer, Peter A.; Gao, Yuan; Kim, Dewey; Boland, Joe; Hicks, Belynda; Kim, Ryan; Chhangawala, Sagar; Jafari, Nadereh; Raghavachari, Nalini; Gandara, Jorge; Garcia-Reyero, Natàlia; Hendrickson, Cynthia; Roberson, David; Rosenfeld, Jeffrey; Smith, Todd; Underwood, Jason G.; Wang, May; Zumbo, Paul; Baldwin, Don A.; Grills, George S.; Mason, Christopher E.

    2014-01-01

    High-throughput RNA sequencing (RNA-seq) dramatically expands the potential for novel genomics discoveries, but the wide variety of platforms, protocols and performance has created the need for comprehensive reference data. Here we describe the Association of Biomolecular Resource Facilities next-generation sequencing (ABRF-NGS) study on RNA-seq. We tested replicate experiments across 15 laboratory sites using reference RNA standards to test four protocols (polyA-selected, ribo-depleted, size-selected and degraded) on five sequencing platforms (Illumina HiSeq, Life Technologies’ PGM and Proton, Pacific Biosciences RS and Roche’s 454). The results show high intra-platform and inter-platform concordance for expression measures across the deep-count platforms, but highly variable efficiency and cost for splice junction and variant detection between all platforms. These data also demonstrate that ribosomal RNA depletion can both enable effective analysis of degraded RNA samples and be readily compared to polyA-enriched fractions. This study provides a broad foundation for cross-platform standardization, evaluation and improvement of RNA-seq. PMID:25150835

  19. Towards allele-level human leucocyte antigens genotyping - assessing two next-generation sequencing platforms: Ion Torrent Personal Genome Machine and Illumina MiSeq.

    PubMed

    Duke, J L; Lind, C; Mackiewicz, K; Ferriola, D; Papazoglou, A; Derbeneva, O; Wallace, D; Monos, D S

    2015-10-01

    Human leucocyte antigens (HLA) typing has been a challenge due to extreme polymorphism of the HLA genes and limitations of the current technologies and protocols used for their characterization. Recently, next-generation sequencing techniques have been shown to be a well-suited technology for the complete characterization of the HLA genes. However, a comprehensive assessment of the different platforms for HLA typing, describing the limitations and advantages of each of them, has not been presented. We have compared the Ion Torrent Personal Genome Machine (PGM) and Illumina MiSeq, currently the two most frequently used platforms for diagnostic applications, for a number of metrics including total output, quality score per position across the reads and error rates after alignment which can all affect the accuracy of HLA genotyping. For this purpose, we have used one homozygous and three heterozygous well-characterized samples, at HLA-A, HLA-B, HLA-C, HLA-DRB1 and HLA-DQB1. The total output of bases produced by the MiSeq was higher, and they have higher quality scores and a lower overall error rate than the PGM. The MiSeq also has a higher fidelity when sequencing through homopolymer regions up to 9 bp in length. The need to set phase between distant polymorphic sites was more readily achieved with MiSeq using paired-end sequencing of fragments that are longer than those obtained with PGM. Additionally, we have assessed the workflows of the different platforms for complexity of sample preparation, sequencer operation and turnaround time. The effects of data quality and quantity can impact the genotyping results; having an adequate amount of good quality data to analyse will be imperative for confident HLA genotyping. The overall turnaround time can be very comparable between the two platforms; however, the complexity of sample preparation is higher with PGM, while the actual sequencing time is longer with MiSeq.
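The abstract above compares platform fidelity through homopolymer regions up to 9 bp long. A small sketch of how such regions can be located in a reference sequence is shown below; it only finds the runs and does not model sequencing errors. The function name and the example sequence are assumptions for illustration.

```python
from itertools import groupby


def homopolymer_runs(seq: str, min_len: int = 2):
    """Return (start, base, length) for each run of identical bases of at least min_len."""
    runs = []
    start = 0
    for base, group in groupby(seq):
        length = len(list(group))
        if length >= min_len:
            runs.append((start, base, length))
        start += length
    return runs


# Illustrative sequence containing a 9 bp T homopolymer.
example = "ACG" + "T" * 9 + "AC"
long_runs = homopolymer_runs(example, min_len=5)
print(long_runs)
```

Coordinates from a scan like this could then be intersected with per-position error rates after alignment to compare platforms specifically inside homopolymers.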

  20. Graphical contig analyzer for all sequencing platforms (G4ALL): a new stand-alone tool for finishing and draft generation of bacterial genomes

    PubMed Central

    Ramos, Rommel Thiago Jucá; Carneiro, Adriana R; Caracciolo, Pablo H; Azevedo, Vasco; Schneider, Maria Paula C; Barh, Debmalya; Silva, Artur

    2013-01-01

    Genome assembly has always been complicated due to the inherent difficulties of sequencing technologies, as well as the computational methods used to process sequences. Although many of the problems for the generation of contigs from reads are well known, especially those involving short reads, the orientation and ordering of contigs in the finishing stages is still very challenging and time consuming, as it requires the manual curation of the contigs to guarantee their correct identification and prevent misassembly. Due to the large numbers of sequences that are produced, especially from the reads produced by next generation sequencers, this process demands considerable manual effort, and there are few software options available to facilitate the process. To address this problem, we have developed the Graphic Contig Analyzer for All Sequencing Platforms (G4ALL): a stand-alone multi-user tool that facilitates the editing of the contigs produced in the assembly process. Besides providing information on the gene products contained in each contig, obtained through a search of the available biological databases, G4ALL produces a scaffold of the genome, based on the overlap of the contigs after curation. Availability: The software is available at: http://www.genoma.ufpa.br/rramos/softwares/g4all.xhtml PMID:23888102

  1. AG-NGS: a powerful and user-friendly computing application for the semi-automated preparation of next-generation sequencing libraries using open liquid handling platforms.

    PubMed

    Callejas, Sergio; Álvarez, Rebeca; Benguria, Alberto; Dopazo, Ana

    2014-01-01

    Next-generation sequencing (NGS) is becoming one of the most widely used technologies in the field of genomics. Library preparation is one of the most critical, hands-on, and time-consuming steps in the NGS workflow. Each library must be prepared in an independent well, increasing the number of hours required for a sequencing run and the risk of human-introduced error. Automation of library preparation is the best option to avoid these problems. With this in mind, we have developed automatic genomics NGS (AG-NGS), a computing application that allows an open liquid handling platform to be transformed into a library preparation station without losing the potential of an open platform. Implementation of AG-NGS does not require programming experience, and the application has also been designed to minimize implementation costs. Automated library preparation with AG-NGS generated high-quality libraries from different samples, demonstrating its efficiency, and all quality control parameters fell within the range of optimal values.

  2. Robustness of Massively Parallel Sequencing Platforms

    PubMed Central

    Kavak, Pınar; Yüksel, Bayram; Aksu, Soner; Kulekci, M. Oguzhan; Güngör, Tunga; Hach, Faraz; Şahinalp, S. Cenk; Alkan, Can; Sağıroğlu, Mahmut Şamil

    2015-01-01

    The improvements in high throughput sequencing technologies (HTS) made clinical sequencing projects such as ClinSeq and Genomics England feasible. Although there are significant improvements in accuracy and reproducibility of HTS based analyses, the usability of these types of data for diagnostic and prognostic applications necessitates near-perfect data generation. To assess the usability of a widely used HTS platform for accurate and reproducible clinical applications in terms of robustness, we generated whole genome shotgun (WGS) sequence data from the genomes of two human individuals in two different genome sequencing centers. After analyzing the data to characterize SNPs and indels using the same tools (BWA, SAMtools, and GATK), we observed a significant number of discrepancies in the call sets. As expected, most of the disagreements between the call sets were found within genomic regions containing common repeats and segmental duplications, albeit only a small fraction of the discordant variants were within the exons and other functionally relevant regions such as promoters. We conclude that although HTS platforms are sufficiently powerful for providing data for first-pass clinical tests, the variant predictions still need to be confirmed using orthogonal methods before being used in clinical applications. PMID:26382624

  3. ORIO (Online Resource for Integrative Omics): a web-based platform for rapid integration of next generation sequencing data.

    PubMed

    Lavender, Christopher A; Shapiro, Andrew J; Burkholder, Adam B; Bennett, Brian D; Adelman, Karen; Fargo, David C

    2017-04-11

    Established and emerging next generation sequencing (NGS)-based technologies allow for genome-wide interrogation of diverse biological processes. However, accessibility of NGS data remains a problem, and few user-friendly resources exist for integrative analysis of NGS data from different sources and experimental techniques. Here, we present Online Resource for Integrative Omics (ORIO; https://orio.niehs.nih.gov/), a web-based resource with an intuitive user interface for rapid analysis and integration of NGS data. To use ORIO, the user specifies NGS data of interest along with a list of genomic coordinates. Genomic coordinates may be biologically relevant features from a variety of sources, such as ChIP-seq peaks for a given protein or transcription start sites from known gene models. ORIO first iteratively finds read coverage values at each genomic feature for each NGS dataset. Data are then integrated using clustering-based approaches, giving hierarchical relationships across NGS datasets and separating individual genomic features into groups. In focusing its analysis on read coverage, ORIO makes limited assumptions about the analyzed data; this allows the tool to be applied across data from a variety of experiments and techniques. Results from analysis are presented in dynamic displays alongside user-controlled statistical tests, supporting rapid statistical validation of observed results. We emphasize the versatility of ORIO through diverse examples, ranging from NGS data quality control to characterization of enhancer regions and integration of gene expression information. Easily accessible on a public web server, we anticipate wide use of ORIO in genome-wide investigations by life scientists.
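
    The two steps described above (read coverage per genomic feature, then clustering of datasets) can be illustrated with a minimal sketch. This is not ORIO's implementation: the function names, the half-open interval convention, the Euclidean distance and the single-linkage rule are all assumptions chosen for brevity, and the reads and features are toy values.

```python
from itertools import combinations

def coverage(reads, features):
    """Count reads overlapping each feature; intervals are half-open (start, end)."""
    return [
        sum(1 for r_start, r_end in reads if r_start < f_end and r_end > f_start)
        for f_start, f_end in features
    ]

def euclidean(u, v):
    """Euclidean distance between two coverage profiles (an assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def cluster(profiles):
    """Single-linkage agglomerative clustering; returns the list of merges in order."""
    groups = [frozenset([name]) for name in profiles]
    merges = []
    while len(groups) > 1:
        # Pick the pair of groups whose closest members are nearest to each other.
        a, b = min(
            combinations(groups, 2),
            key=lambda pair: min(
                euclidean(profiles[x], profiles[y]) for x in pair[0] for y in pair[1]
            ),
        )
        groups.remove(a)
        groups.remove(b)
        groups.append(a | b)
        merges.append((sorted(a), sorted(b)))
    return merges

# Toy genomic features and toy read intervals for three hypothetical datasets.
features = [(0, 100), (200, 300), (400, 500)]
datasets = {
    "chip_rep1": [(10, 60), (20, 70), (210, 260)],
    "chip_rep2": [(5, 55), (30, 80), (220, 270), (230, 280)],
    "rnaseq": [(410, 460), (420, 470), (450, 500)],
}
profiles = {name: coverage(reads, features) for name, reads in datasets.items()}
merges = cluster(profiles)
print(profiles)
print(merges)
```

    In this toy input the two ChIP replicates have similar coverage profiles and are merged first, which is the kind of hierarchical relationship across datasets that the abstract describes.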

  4. An effective screening strategy for deafness in combination with a next-generation sequencing platform: a consecutive analysis

    PubMed Central

    Sakuma, Naoko; Moteki, Hideaki; Takahashi, Masahiro; Nishio, Shin-ya; Arai, Yasuhiro; Yamashita, Yukiko; Oridate, Nobuhiko; Usami, Shin-ichi

    2016-01-01

    The diagnosis of the genetic etiology of deafness contributes to the clinical management of patients. We performed the following four genetic tests in three stages for 52 consecutive deafness subjects in one facility. We used the Invader assay for 46 mutations in 13 genes and Sanger sequencing for the GJB2 gene or SLC26A4 gene in the first-stage test, the TaqMan genotyping assay in the second-stage test and targeted exon sequencing using massively parallel DNA sequencing in the third-stage test. Overall, we identified the genetic cause in 40% (21/52) of patients. The diagnostic rates of autosomal dominant, autosomal recessive and sporadic cases were 50%, 60% and 34%, respectively. When the sporadic cases with congenital and severe hearing loss were selected, the diagnostic rate rose to 48%. The combination approach using these genetic tests appears to be useful as a diagnostic tool for deafness patients. We recommended that genetic testing for the screening of common mutations in deafness genes using the Invader assay or TaqMan genotyping assay be performed as the initial evaluation. For the remaining undiagnosed cases, targeted exon sequencing using massively parallel DNA sequencing is clinically and economically beneficial. PMID:26763877

  5. An effective screening strategy for deafness in combination with a next-generation sequencing platform: a consecutive analysis.

    PubMed

    Sakuma, Naoko; Moteki, Hideaki; Takahashi, Masahiro; Nishio, Shin-ya; Arai, Yasuhiro; Yamashita, Yukiko; Oridate, Nobuhiko; Usami, Shin-ichi

    2016-03-01

    The diagnosis of the genetic etiology of deafness contributes to the clinical management of patients. We performed the following four genetic tests in three stages for 52 consecutive deafness subjects in one facility. We used the Invader assay for 46 mutations in 13 genes and Sanger sequencing for the GJB2 gene or SLC26A4 gene in the first-stage test, the TaqMan genotyping assay in the second-stage test and targeted exon sequencing using massively parallel DNA sequencing in the third-stage test. Overall, we identified the genetic cause in 40% (21/52) of patients. The diagnostic rates of autosomal dominant, autosomal recessive and sporadic cases were 50%, 60% and 34%, respectively. When the sporadic cases with congenital and severe hearing loss were selected, the diagnostic rate rose to 48%. The combination approach using these genetic tests appears to be useful as a diagnostic tool for deafness patients. We recommended that genetic testing for the screening of common mutations in deafness genes using the Invader assay or TaqMan genotyping assay be performed as the initial evaluation. For the remaining undiagnosed cases, targeted exon sequencing using massively parallel DNA sequencing is clinically and economically beneficial.

  6. Comprehensive transcriptome assembly of chickpea (Cicer arietinum L.) using Sanger and next generation sequencing platforms: development and applications

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A high-quality transcriptome assembly for chickpea has been developed using ~135 million Illumina single-end reads, 7.12 million single-end FLX/454 reads, and 139 thousand Sanger expressed sequence tags (ESTs). This hybrid transcriptome assembly, which we refer to as the "Cicer arietinum Transcripto...

  7. Automatic Command Sequence Generation

    NASA Technical Reports Server (NTRS)

    Fisher, Forest; Gladden, Roy; Khanampompan, Teerapat

    2007-01-01

    Automatic Sequence Generator (Autogen) Version 3.0 software automatically generates command sequences for the Mars Reconnaissance Orbiter (MRO) and several other JPL spacecraft operated by the multi-mission support team. Autogen uses standard JPL sequencing tools like APGEN, ASP, SEQGEN, and the DOM database to automate the generation of uplink command products, Spacecraft Command Message Format (SCMF) files, and the corresponding ground command products, DSN Keywords Files (DKF). Autogen supports all the major multi-mission mission phases including the cruise, aerobraking, mapping/science, and relay mission phases. Autogen is a Perl script, which functions within the mission operations UNIX environment. It consists of two parts: a set of model files and the autogen Perl script. Autogen encodes the behaviors of the system into a model and encodes algorithms for context sensitive customizations of the modeled behaviors. The model includes knowledge of different mission phases and how the resultant command products must differ for these phases. The executable software portion of Autogen automates the setup and use of APGEN for constructing a spacecraft activity sequence file (SASF). The setup includes file retrieval through the DOM (Distributed Object Manager), an object database used to store project files. This step retrieves all the needed input files for generating the command products. Depending on the mission phase, Autogen also uses the ASP (Automated Sequence Processor) and SEQGEN to generate the command product sent to the spacecraft. Autogen also provides the means for customizing sequences through the use of configuration files. By automating the majority of the sequencing generation process, Autogen eliminates many sequence generation errors commonly introduced by manually constructing spacecraft command sequences.
Through the layering of commands into the sequence by a series of scheduling algorithms, users are able to rapidly and reliably construct the

  8. Comprehensive transcriptome assembly of Chickpea (Cicer arietinum L.) using sanger and next generation sequencing platforms: development and applications.

    PubMed

    Kudapa, Himabindu; Azam, Sarwar; Sharpe, Andrew G; Taran, Bunyamin; Li, Rong; Deonovic, Benjamin; Cameron, Connor; Farmer, Andrew D; Cannon, Steven B; Varshney, Rajeev K

    2014-01-01

    A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags (ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as Cicer arietinum Transcriptome Assembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptomes/cicar/lista_cicar-201201), comprising 46,369 transcript assembly contigs (TACs) has an N50 length of 1,726 bp and a maximum contig size of 15,644 bp. Putative functions were determined for 32,869 (70.8%) of the TACs and gene ontology assignments were determined for 21,471 (46.3%). The new transcriptome assembly was compared with the previously available chickpea transcriptome assemblies as well as to the chickpea genome. Comparative analysis of CaTA v2 against transcriptomes of three legumes - Medicago, soybean and common bean, resulted in 27,771 TACs common to all three legumes indicating strong conservation of genes across legumes. CaTA v2 was also used for identification of simple sequence repeats (SSRs) and intron spanning regions (ISRs) for developing molecular markers. ISRs were identified by aligning TACs to the Medicago genome, and their putative mapping positions at chromosomal level were identified using transcript map of chickpea. Primer pairs were designed for 4,990 ISRs, each representing a single contig for which predicted positions are inferred and distributed across eight linkage groups. A subset of randomly selected ISRs representing all eight chickpea linkage groups were validated on five chickpea genotypes and showed 20% polymorphism with average polymorphic information content (PIC) of 0.27. In summary, the hybrid transcriptome assembly developed and novel markers identified can be used for a variety of applications such as gene discovery, marker-trait association, diversity analysis etc., to advance genetics research and breeding applications in

  9. A Comprehensive Transcriptome Assembly of Pigeonpea (Cajanus cajan L.) using Sanger and Second-Generation Sequencing Platforms

    PubMed Central

    Kudapa, Himabindu; Bharti, Arvind K.; Cannon, Steven B.; Farmer, Andrew D.; Mulaosmanovic, Benjamin; Kramer, Robin; Bohra, Abhishek; Weeks, Nathan T.; Crow, John A.; Tuteja, Reetu; Shah, Trushar; Dutta, Sutapa; Gupta, Deepak K.; Singh, Archana; Gaikwad, Kishor; Sharma, Tilak R.; May, Gregory D.; Singh, Nagendra K.; Varshney, Rajeev K.

    2012-01-01

    A comprehensive transcriptome assembly for pigeonpea has been developed by analyzing 128.9 million short Illumina GA IIx single end reads, 2.19 million single end FLX/454 reads, and 18 353 Sanger expressed sequence tags from more than 16 genotypes. The resultant transcriptome assembly, referred to as CcTA v2, comprised 21 434 transcript assembly contigs (TACs) with an N50 of 1510 bp, the largest one being ∼8 kb. Of the 21 434 TACs, 16 622 (77.5%) could be mapped on to the soybean genome build 1.0.9 under fairly stringent alignment parameters. Based on knowledge of intron junctions, 10 009 primer pairs were designed from 5033 TACs for amplifying intron spanning regions (ISRs). By using in silico mapping of BAC-end-derived SSR loci of pigeonpea on the soybean genome as a reference, putative mapping positions at the chromosome level were predicted for 6284 ISR markers, covering all 11 pigeonpea chromosomes. A subset of 128 ISR markers were analyzed on a set of eight genotypes. While 116 markers were validated, 70 markers showed one to three alleles, with an average of 0.16 polymorphism information content (PIC) value. In summary, the CcTA v2 transcript assembly and ISR markers will serve as a useful resource to accelerate genetic research and breeding applications in pigeonpea. PMID:22241453
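
    The N50 statistic quoted for these assemblies is the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that standard definition follows; the function name and the toy contig lengths are illustrative and are not taken from the assemblies above.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (standard definition)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total is the N50 contig.
        if running >= half_total:
            return length

# Toy assembly: total length 15000, so the threshold is 7500.
toy_contigs = [5000, 4000, 3000, 2000, 1000]
print(n50(toy_contigs))  # 4000, since 5000 + 4000 = 9000 >= 7500
```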

  10. Next generation sequencing based approaches to epigenomics

    PubMed Central

    Marra, Marco A.

    2010-01-01

    Next generation sequencing has brought epigenomic studies to the forefront of current research. The power of massively parallel sequencing coupled to innovative molecular and computational techniques has allowed researchers to profile the epigenome at resolutions that were unimaginable only a few years ago. With early proof of concept studies published, the field is now moving into the next phase where the importance of method standardization and rigorous quality control are becoming paramount. In this review we will describe methodologies that have been developed to profile the epigenome using next generation sequencing platforms. We will discuss these in terms of library preparation, sequence platforms and analysis techniques. PMID:21266347

  11. Next-Generation Sequencing.

    PubMed

    Le Gallo, Matthieu; Lozy, Fred; Bell, Daphne W

    2017-01-01

    Endometrial cancers are the most frequently diagnosed gynecological malignancy and were expected to be the seventh leading cause of cancer death among American women in 2015. The majority of endometrial cancers are of serous or endometrioid histology. Most human tumors, including endometrial tumors, are driven by the acquisition of pathogenic mutations in cancer genes. Thus, the identification of somatic mutations within tumor genomes is an entry point toward cancer gene discovery. However, efforts to pinpoint somatic mutations in human cancers have, until recently, relied on high-throughput sequencing of single genes or gene families using Sanger sequencing. Although this approach has been fruitful, the cost and throughput of Sanger sequencing generally prohibits systematic sequencing of the ~22,000 genes that make up the exome. The recent development of next-generation sequencing technologies changed this paradigm by providing the capability to rapidly sequence exomes, transcriptomes, and genomes at relatively low cost. Remarkably, the application of this technology to catalog the mutational landscapes of endometrial tumor exomes, transcriptomes, and genomes has revealed, for the first time, that serous and endometrioid endometrial cancers can be classified into four distinct molecular subgroups. In this chapter, we overview the characteristic genomic features of each subgroup and discuss the known and putative cancer genes that have emerged from next-generation sequencing of endometrial carcinomas.

  12. Direct Chloroplast Sequencing: Comparison of Sequencing Platforms and Analysis Tools for Whole Chloroplast Barcoding

    PubMed Central

    Brozynska, Marta; Furtado, Agnelo; Henry, Robert James

    2014-01-01

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis. PMID:25329378

  13. Next-generation sequencing-based user-friendly platforms for drug-resistant tuberculosis diagnosis: A promise for the near future.

    PubMed

    Dolinger, David L; Colman, Rebecca E; Engelthaler, David M; Rodwell, Timothy C

    2016-12-01

    Since 2002, there has been a gradual worldwide 1.3% annual decrease in the incidence of tuberculosis (TB). This is an encouraging statistic; however, it will not achieve the World Health Organization's goal of eliminating TB by 2050, and it is being compounded by the persistent global incidence of drug-resistant tuberculosis (DR-TB) acquired by transmission and by treatment pressure. One key to effectively control tuberculosis and the spread of multiresistant strains is accurate information pertaining to drug resistance and susceptibility. Next-generation sequencing (NGS) has the potential to effectively change global health and the management of TB. Industry has focused primarily on using NGS for oncology diagnostics and human genomics, but the area in which NGS can rapidly impact health care is infectious disease diagnostics in low- and middle-income countries. To date, there has been a failure as a community to capitalize on the potential of NGS, especially at the reference laboratory level where it can provide actionable information pertaining to treatment options for patients. The rapid evolution of knowledge about the genetic foundations of tuberculosis drug resistance makes sequencing a versatile technology platform for providing rapid, accurate, and actionable results for treating this disease. No "plug-and-play" and "end-to-end" NGS solutions exist that provide clinically relevant sequence data from the Mycobacterium tuberculosis complex genome from primary clinical samples (e.g., sputum) in high-burden country reference laboratories, which is where they are most needed. However, such a system-based solution is under development by Foundation for Innovative Diagnostics (FIND), in collaboration with partners from academia, nongovernmental organizations, and industry. The solution is modular and is designed and developed to perform targeted amplicon sequencing directly from a patient's primary sputum sample. This solution will initially allow

  14. Relay Sequence Generation Software

    NASA Technical Reports Server (NTRS)

    Gladden, Roy E.; Khanampompan, Teerapat

    2009-01-01

    Due to thermal and electromagnetic interactivity between the UHF (ultrahigh frequency) radio onboard the Mars Reconnaissance Orbiter (MRO), which performs relay sessions with the Martian landers, and the remainder of the MRO payloads, it is required to integrate and de-conflict relay sessions with the MRO science plan. The MRO relay SASF/PTF (spacecraft activity sequence file/ payload target file) generation software facilitates this process by generating a PTF that is needed to integrate the periods of time during which MRO supports relay activities with the rest of the MRO science plans. The software also generates the needed command products that initiate the relay sessions, some features of which are provided by the lander team, some are managed by MRO internally, and some being derived.

  15. Targeted Exome Sequencing Outcome Variations of Colorectal Tumors within and across Two Sequencing Platforms

    PubMed Central

    Ashktorab, Hassan; Azimi, Hamed; Nickerson, Michael L.; Bass, Sara; Varma, Sudhir; Brim, Hassan

    2016-01-01

    Background and Aim Next generation sequencing (NGS) has quickly become the tool of choice for genome and exome data generation. The multitude of sequencing platforms as well as the variabilities within each platform need to be assessed. In this paper we used two platforms (Ion Torrent and Illumina) to assess single nucleotide variants (SNVs) in colorectal cancer (CRC) specimens. Methods CRC specimens (n = 13) collected from 6 CRC (cancer and matched normal) patients were used to establish the mutational profile using the Ion Torrent and Illumina sequencing platforms. We analyzed a set of Formalin Fixed Paraffin Embedded (FFPE) and Fresh Frozen (FF) samples on both platforms to assess the effect of sample nature (FFPE vs. FF) on sequencing outcome and to evaluate the similarities and differences of SNVs across the two platforms. In addition, duplicates of FF samples were sequenced on each platform to assess variability within platform. Results The comparison of FF replicates to each other gave a concordance of 77% (± 15.3%) in Ion Torrent and 70% (± 3.7%) in Illumina. FFPE vs. FF replicates gave a concordance of 40% (± 32%) in Ion Torrent and 49% (± 19%) in Illumina. Cross-platform concordance averaged 75% (± 9.8%) for FFPE samples and 67% (± 32%) for FF samples, with an overall average of 70% (± 26.8%). Conclusion Our data show a significant variability within and across platforms. Also, the number of detected variants depends on the nature of the specimen (FF vs. FFPE). Validation of NGS-discovered mutations is a must to rule out false positive mutants. This validation might be performed either through a second NGS platform or through Sanger sequencing. PMID:27547838
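
    The abstract does not state how concordance was computed. As a sketch under that caveat, one common choice is the share of variant calls found in both runs out of all calls found in either run, summarised across samples as a mean and standard deviation. The variant tuples, the function names and the intersection-over-union definition below are assumptions for illustration only.

```python
from statistics import mean, stdev

def concordance(calls_a, calls_b):
    """Percent of variants shared by two call sets, out of all variants seen in either."""
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        return 100.0  # two empty call sets agree trivially
    return 100.0 * len(a & b) / len(union)

def summarise(values):
    """Mean and sample standard deviation; needs at least two values."""
    return mean(values), stdev(values)

# Toy call sets: each variant is (chromosome, position, reference allele, alternate allele).
run_1 = {("chr1", 100, "A", "G"), ("chr2", 250, "C", "T"),
         ("chr3", 75, "G", "A"), ("chr5", 910, "T", "C")}
run_2 = {("chr2", 250, "C", "T"), ("chr3", 75, "G", "A"),
         ("chr5", 910, "T", "C"), ("chr7", 42, "A", "T")}

print(concordance(run_1, run_2))  # 60.0: three shared calls out of five distinct calls
print(summarise([60.0, 80.0]))
```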

  16. Clinical detection of human probiotics and human pathogenic bacteria by using a novel high-throughput platform based on next generation sequencing

    PubMed Central

    2014-01-01

    Background The human body plays host to a vast array of bacteria, found in oral cavities, skin, gastrointestinal tract and the vagina. Some bacteria are harmful while others are beneficial to the host. Despite the availability of many methods to identify bacteria, most of them are only applicable to specific and cultivable bacteria and are also tedious. Based on high throughput sequencing technology, this work derives 16S rRNA sequences of bacteria and analyzes probiotic and pathogen species. Results We constructed a database that recorded the species of probiotics and pathogens from literature, along with a modified Smith-Waterman algorithm for assigning the taxonomy of the sequenced 16S rRNA sequences. We also constructed a bacteria disease risk model for seven diseases based on 98 samples. Applicability of the proposed platform is demonstrated by collecting the microbiome in human gut of 13 samples. Conclusions The proposed platform provides a relatively easy means of identifying a certain amount of bacteria and their species (including uncultivable pathogens) for clinical microbiology applications. That is, detecting how probiotics and pathogens inhabit humans and how they affect human health can significantly contribute to developing diagnosis and treatment methods. PMID:24418497
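
    The abstract does not describe the authors' modification, so the sketch below shows only the textbook Smith-Waterman local alignment score with a linear gap penalty, followed by a naive best-hit taxonomy assignment. The scoring values, the function names and the toy reference sequences are assumptions, not details from the paper.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between two sequences (linear gap penalty)."""
    cols = len(target) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(query) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if query[i - 1] == target[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

def assign_taxon(read, references):
    """Return the name of the reference with the highest local alignment score."""
    return max(references, key=lambda name: smith_waterman_score(read, references[name]))

# Toy reference database; real 16S rRNA references are far longer.
references = {
    "taxon_A": "ACGTACGTGG",
    "taxon_B": "TTGACCTTGA",
}
print(smith_waterman_score("ACGT", "ACGT"))  # 8: four matches at +2 each
print(assign_taxon("ACGTACG", references))   # taxon_A, which contains the read exactly
```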

  17. Next generation sequencing of SNPs using the HID-Ion AmpliSeq™ Identity Panel on the Ion Torrent PGM™ platform.

    PubMed

    Guo, Fei; Zhou, Yishu; Song, He; Zhao, Jinling; Shen, Hongying; Zhao, Bin; Liu, Feng; Jiang, Xianhua

    2016-11-01

    The HID-Ion AmpliSeq™ Identity Panel (the HID Identity Panel) is designed to detect 124-plex single nucleotide polymorphisms (SNPs) with next generation sequencing (NGS) technology on the Ion Torrent PGM™ platform, including 90 individual identification SNPs (IISNPs) on autosomal chromosomes and 34 lineage informative SNPs (LISNPs) on Y chromosome. In this study, we evaluated performance for the HID Identity Panel to provide a reference for NGS-SNP application, focusing on locus strand balance, locus coverage balance, heterozygote balance, and background signals. Besides, several experiments were carried out to find out improvements and limitations of this panel, including studies of species specificity, repeatability and concordance, sensitivity, mixtures, case-type samples and degraded samples, population genetics and pedigrees following the Scientific Working Group on DNA Analysis Methods (SWGDAM) guidelines. In addition, Southern and Northern Chinese Han were investigated to assess applicability of this panel. Results showed this panel led to cross-reactivity with primates to some extent but rarely with non-primate animals. Repeatable and concordant genotypes could be obtained in triplicate with one exception at rs7520386. Full profiles could be obtained from 100 pg input DNA, but the optimal input DNA would be 1 ng-200 pg with 21 initial PCR cycles. A sample with ≥20% minor contributor could be considered as a mixture by the number of homozygotes, and full profiles belonging to minor contributors could be detected between 9:1 and 1:9 mixtures with known reference profiles. Also, this assay could be used for case-type samples and degraded samples. For autosomal SNPs (A-SNPs), FST across all 90 loci was not significantly different between Southern and Northern Chinese Han or between male and female samples. All A-SNP loci were independent in Chinese Han population. Except for 18 loci with He <0.4, most of the A-SNPs in the HID Identity Panel presented high

  18. "First generation" automated DNA sequencing technology.

    PubMed

    Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

    2011-10-01

    Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines.

  19. MIG-seq: an effective PCR-based method for genome-wide single-nucleotide polymorphism genotyping using the next-generation sequencing platform

    PubMed Central

    Suyama, Yoshihisa; Matsuki, Yu

    2015-01-01

    Restriction-enzyme (RE)-based next-generation sequencing methods have revolutionized marker-assisted genetic studies; however, the use of REs has limited their widespread adoption, especially in field samples with low-quality DNA and/or small quantities of DNA. Here, we developed a PCR-based procedure to construct reduced representation libraries without RE digestion steps, representing de novo single-nucleotide polymorphism discovery, and its genotyping using next-generation sequencing. Using multiplexed inter-simple sequence repeat (ISSR) primers, thousands of genome-wide regions were amplified effectively from a wide variety of genomes, without prior genetic information. We demonstrated: 1) Mendelian gametic segregation of the discovered variants; 2) reproducibility of genotyping by checking its applicability for individual identification; and 3) applicability in a wide variety of species by checking standard population genetic analysis. This approach, called multiplexed ISSR genotyping by sequencing, should be applicable to many marker-assisted genetic studies with a wide range of DNA qualities and quantities. PMID:26593239

  20. MIG-seq: an effective PCR-based method for genome-wide single-nucleotide polymorphism genotyping using the next-generation sequencing platform.

    PubMed

    Suyama, Yoshihisa; Matsuki, Yu

    2015-11-23

    Restriction-enzyme (RE)-based next-generation sequencing methods have revolutionized marker-assisted genetic studies; however, the use of REs has limited their widespread adoption, especially in field samples with low-quality DNA and/or small quantities of DNA. Here, we developed a PCR-based procedure to construct reduced representation libraries without RE digestion steps, representing de novo single-nucleotide polymorphism discovery, and its genotyping using next-generation sequencing. Using multiplexed inter-simple sequence repeat (ISSR) primers, thousands of genome-wide regions were amplified effectively from a wide variety of genomes, without prior genetic information. We demonstrated: 1) Mendelian gametic segregation of the discovered variants; 2) reproducibility of genotyping by checking its applicability for individual identification; and 3) applicability in a wide variety of species by checking standard population genetic analysis. This approach, called multiplexed ISSR genotyping by sequencing, should be applicable to many marker-assisted genetic studies with a wide range of DNA qualities and quantities.

  1. Explanatory chapter: next generation sequencing.

    PubMed

    Yegnasubramanian, Srinivasan

    2013-01-01

    Technological breakthroughs in sequencing technologies have driven the advancement of molecular biology and molecular genetics research. The advent of high-throughput Sanger sequencing (for information on the method, see Sanger Dideoxy Sequencing of DNA) in the mid- to late-1990s made possible the accelerated completion of the human genome project, which has since revolutionized the pace of discovery in biomedical research. Similarly, the advent of next generation sequencing is poised to revolutionize biomedical research and usher a new era of individualized, rational medicine. The term next generation sequencing refers to technologies that have enabled the massively parallel analysis of DNA sequence facilitated through the convergence of advancements in molecular biology, nucleic acid chemistry and biochemistry, computational biology, and electrical and mechanical engineering. The current next generation sequencing technologies are capable of sequencing tens to hundreds of millions of DNA templates simultaneously and generate >4 gigabases of sequence in a single day. These technologies have largely started to replace high-throughput Sanger sequencing for large-scale genomic projects, and have created significant enthusiasm for the advent of a new era of individualized medicine.

  2. Generating barcoded libraries for multiplex high-throughput sequencing.

    PubMed

    Knapp, Michael; Stiller, Mathias; Meyer, Matthias

    2012-01-01

    Molecular barcoding is an essential tool to use the high throughput of next generation sequencing platforms optimally in studies involving more than one sample. Various barcoding strategies allow for the incorporation of short recognition sequences (barcodes) into sequencing libraries, either by ligation or polymerase chain reaction (PCR). Here, we present two approaches optimized for generating barcoded sequencing libraries from low copy number extracts and amplification products typical of ancient DNA studies.
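
    To illustrate the barcoding idea in the abstract above, here is a minimal sketch of how pooled reads might be sorted back to their samples by a short barcode at the 5' end. It is not the authors' protocol: the function name `demultiplex`, the sample names, and the equal-length-barcode assumption are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def demultiplex(reads, barcodes, max_mismatches=0):
    """Assign each read to a sample using the barcode at its 5' end.

    reads: iterable of sequence strings.
    barcodes: dict mapping barcode sequence -> sample name (hypothetical names);
        all barcodes are assumed to have the same length.
    Returns a dict mapping sample name -> list of reads with the barcode trimmed.
    Reads matching no barcode, or more than one, are kept whole under "unassigned".
    """
    length = len(next(iter(barcodes)))
    bins = defaultdict(list)
    for read in reads:
        prefix = read[:length]
        hits = [
            sample
            for barcode, sample in barcodes.items()
            if len(prefix) == length
            and sum(a != b for a, b in zip(prefix, barcode)) <= max_mismatches
        ]
        if len(hits) == 1:
            bins[hits[0]].append(read[length:])
        else:
            bins["unassigned"].append(read)
    return dict(bins)


if __name__ == "__main__":
    barcodes = {"ACGT": "sample_A", "TGCA": "sample_B"}
    reads = ["ACGTTTTT", "TGCAGGGG", "AAAACCCC", "ACG"]
    print(demultiplex(reads, barcodes))
```

    Requiring exactly one matching barcode keeps ambiguous reads out of every sample, which is a conservative choice; real pipelines may also weigh base qualities.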

  3. Choice of next-generation sequencing pipelines.

    PubMed

    Del Chierico, F; Ancora, M; Marcacci, M; Cammà, C; Putignani, L; Conti, Salvatore

    2015-01-01

    The next-generation sequencing (NGS) technologies are revolutionary tools which have made possible achieving remarkable advances in genetics since the beginning of the twenty-first century. Thanks to the possibility to produce large amount of sequence data, these tools are going to completely substitute other high-throughput technologies. Moreover, the large applications of NGS protocols are increasing the genetic decoding of biological systems through studies of genome anatomy and gene mapping, coupled to the transcriptome pictures. The application of NGS pipelines such as (1) de-novo genomic sequencing by mate-paired and whole-genome shotgun strategies; (2) specific gene sequencing on large bacterial communities; and (3) RNA-seq methods including whole transcriptome sequencing and Serial Analysis of Gene Expression (Sage-analysis) are fundamental in the genome-wide fields like metagenomics. Recently, the availability of these advanced protocols has allowed to overcome the usual sequencing technical issues related to the mapping specificity over standard shotgun library sequencing, the detection of large structural genomes variations and bridging sequencing gaps, as well as more precise gene annotation. In this chapter we will discuss how to manage a successful NGS pipeline from the planning of sequencing projects through the choice of the platforms up to the data analysis management.

  4. Application of genotyping-by-sequencing on semiconductor sequencing platforms: A comparison of genetic and reference-based marker ordering in barley

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The rapid development of next generation sequencing platforms has enabled the use of sequencing for routine genotyping across a range of genetics studies and breeding applications. Genotyping-by-sequencing (GBS), a low-cost, reduced representation sequencing method, is becoming a common approach fo...

  5. Wolfcampian sequence stratigraphy of eastern Central Basin platform, Texas

    SciTech Connect

    Candelaria, M.P.; Entzminger, D.J.; Behnken, F.H.; Sarg, J.F.; Wilde, G.L.

    1992-04-01

    Integrated study of well logs, cores, high-resolution seismic data, and biostratigraphy has established the sequence framework of the Atokan (Early Pennsylvanian)-Wolfcampian (Early Permian) stratigraphic section along the eastern margin of the Central Basin platform in the Permian basin. Sequence interpretation of high-resolution, high-fold seismic data through this stratigraphic interval has revealed a complex progradational/retrogradational evolution of the platform margin that has demonstrated overall progradation of at least 12 km during early-middle Wolfcampian. Sequence stratigraphic study of the Wolfcamp interval has revealed details of the internal architecture and morphologic evolution of the contemporaneous platform margin. Two generalized seismic facies assemblages are recognized in the Wolfcampian. Platform interior facies are characterized by high-amplitude, laterally continuous parallel reflections; platform margin facies consist of progradational sigmoidal to oblique clinoforms and are characterized by discontinuous, low-amplitude reflections. Sequence interpretation of carbonate platform-to-basin strata geometries helps in predicting subtle stratigraphic trapping relationships and potential reservoir facies distribution. Moreover, this interpretive method assists in describing complex reservoir heterogeneities that can contribute to significant reserve additions from within existing fields.

  6. Generating Exome Enriched Sequencing Libraries from Formalin-Fixed, Paraffin-Embedded Tissue DNA for Next-Generation Sequencing.

    PubMed

    Marosy, Beth A; Craig, Brian D; Hetrick, Kurt N; Witmer, P Dane; Ling, Hua; Griffith, Sean M; Myers, Benjamin; Ostrander, Elaine A; Stanford, Janet L; Brody, Lawrence C; Doheny, Kimberly F

    2017-01-11

    This unit describes a technique for generating exome-enriched sequencing libraries using DNA extracted from formalin-fixed paraffin-embedded (FFPE) samples. Utilizing commercially available kits, we present a low-input FFPE workflow starting with 50 ng of DNA. This procedure includes a repair step to address damage caused by FFPE preservation that improves sequence quality. Subsequently, libraries undergo an in-solution-targeted selection for exons, followed by sequencing using the Illumina next-generation short-read sequencing platform. © 2017 by John Wiley & Sons, Inc.

  7. Replacement Sequence of Events Generator

    NASA Technical Reports Server (NTRS)

    Fisher, Forest; Gladden, Daniel Wenkert Roy; Khanampompan, Teerpat

    2008-01-01

    The soeWINDOW program automates the generation of an ITAR (International Traffic in Arms Regulations)-compliant sub-RSOE (Replacement Sequence of Events) by extracting a specified temporal window from an RSOE while maintaining page header information. RSOEs contain a significant amount of information that is not ITAR-compliant, yet that foreign partners need to see for command details to their instrument, as well as the surrounding commands that provide context for validation. soeWINDOW can serve as an example of how command support products can be made ITAR-compliant for future missions. This software is a Perl script intended for use in the mission operations UNIX environment. It is designed for use to support the MRO (Mars Reconnaissance Orbiter) instrument team. The tool also provides automated DOM (Distributed Object Manager) storage into the special ITAR-okay DOM collection, and can be used for creating focused RSOEs for product review by any of the MRO teams.

  8. Automated Sequence Generation Process and Software

    NASA Technical Reports Server (NTRS)

    Gladden, Roy

    2007-01-01

    "Automated sequence generation" (autogen) signifies both a process and software used to automatically generate sequences of commands to operate various spacecraft. The autogen software comprises the autogen script plus the Activity Plan Generator (APGEN) program. APGEN can be used for planning missions and command sequences.

  9. Assembly Algorithms for Next-Generation Sequencing Data

    PubMed Central

    Miller, Jason R.; Koren, Sergey; Sutton, Granger

    2010-01-01

    The emergence of next-generation sequencing platforms led to resurgence of research in whole-genome shotgun assembly algorithms and software. DNA sequencing data from the Roche 454, Illumina/Solexa, and ABI SOLiD platforms typically present shorter read lengths, higher coverage, and different error profiles compared with Sanger sequencing data. Since 2005, several assembly software packages have been created or revised specifically for de novo assembly of next-generation sequencing data. This review summarizes and compares the published descriptions of packages named SSAKE, SHARCGS, VCAKE, Newbler, Celera Assembler, Euler, Velvet, ABySS, AllPaths, and SOAPdenovo. More generally, it compares the two standard methods known as the de Bruijn graph approach and the overlap/layout/consensus approach to assembly. PMID:20211242
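
    The de Bruijn graph approach named in the abstract above can be sketched in a few lines. This is a toy illustration under stated assumptions (error-free reads, a single strand, no repeats longer than k-1), not the implementation of any of the listed assemblers; the function names are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph from reads.

    Nodes are (k-1)-mers; every k-mer found in a read adds an edge from its
    (k-1)-base prefix to its (k-1)-base suffix.
    """
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unambiguous_path(graph, start):
    """Spell a contig by following edges from `start` while each node has
    exactly one successor that has not been visited yet."""
    contig = start
    node = start
    seen = {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop rather than loop around a cycle
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


if __name__ == "__main__":
    reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]
    graph = de_bruijn_graph(reads, k=4)
    print(walk_unambiguous_path(graph, "ATG"))
```

    In contrast, an overlap/layout/consensus assembler would compare reads pairwise to find overlaps instead of breaking them into k-mers first.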

  10. Assembly algorithms for next-generation sequencing data.

    PubMed

    Miller, Jason R; Koren, Sergey; Sutton, Granger

    2010-06-01

    The emergence of next-generation sequencing platforms led to resurgence of research in whole-genome shotgun assembly algorithms and software. DNA sequencing data from the Roche 454, Illumina/Solexa, and ABI SOLiD platforms typically present shorter read lengths, higher coverage, and different error profiles compared with Sanger sequencing data. Since 2005, several assembly software packages have been created or revised specifically for de novo assembly of next-generation sequencing data. This review summarizes and compares the published descriptions of packages named SSAKE, SHARCGS, VCAKE, Newbler, Celera Assembler, Euler, Velvet, ABySS, AllPaths, and SOAPdenovo. More generally, it compares the two standard methods known as the de Bruijn graph approach and the overlap/layout/consensus approach to assembly.

  11. Sequence data for Clostridium autoethanogenum using three generations of sequencing technologies

    PubMed Central

    Utturkar, Sagar M; Klingeman, Dawn M; Bruno-Barcena, José M; Chinn, Mari S; Grunden, Amy M; Köpke, Michael; Brown, Steven D

    2015-01-01

    During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms which are characterized by low cost, high throughput and shorter read lengths for example, Illumina. The emergence and development of so called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequence datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data. PMID:25977818

  12. Sequence data for Clostridium autoethanogenum using three generations of sequencing technologies.

    PubMed

    Utturkar, Sagar M; Klingeman, Dawn M; Bruno-Barcena, José M; Chinn, Mari S; Grunden, Amy M; Köpke, Michael; Brown, Steven D

    2015-01-01

    During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms which are characterized by low cost, high throughput and shorter read lengths for example, Illumina. The emergence and development of so called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequence datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data.

  13. Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies

    SciTech Connect

    Utturkar, Sagar M.; Klingeman, Dawn Marie; Bruno-Barcena, José M.; Chinn, Mari S.; Grunden, Amy; Köpke, Michael; Brown, Steven D.

    2015-04-14

    During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms which are characterized by low cost, high throughput and shorter read lengths for example, Illumina. The emergence and development of so called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequence datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data.

  14. Underlying Data for Sequencing the Mitochondrial Genome with the Massively Parallel Sequencing Platform Ion Torrent™ PGM™

    PubMed Central

    2015-01-01

    Background: Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per nucleotide basis and indeed on a per sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, San Francisco, CA), the output data were assessed, and the results were compared with data previously generated on the MiSeq™ (Illumina, San Diego, CA). The objectives of this paper were to determine the feasibility, accuracy, and reliability of sequence data obtained from the PGM. Results: 24 samples were multiplexed (in groups of six) and sequenced on the at least 10 megabase throughput 314 chip. The depth of coverage pattern was similar among all 24 samples; however, the coverage across the genome varied. For strand bias, the average ratio of coverage between the forward and reverse strands at each nucleotide position indicated that two-thirds of the positions of the genome had ratios that were greater than 0.5. A few sites had more extreme strand bias. Another observation was that 156 positions had a false deletion rate greater than 0.15 in one or more individuals. There were 31-98 (SNP) mtGenome variants observed per sample for the 24 samples analyzed. The total 1237 (SNP) variants were concordant between the results from the PGM and MiSeq. The quality scores for haplogroup assignment for all 24 samples ranged from 88.8% to 100%. Conclusions: In this study, mtDNA sequence data generated from the PGM were analyzed and the output evaluated. Depth of coverage variation and strand bias were identified but generally were infrequent and did not impact reliability of variant calls. Multiplexing of samples was demonstrated, which can improve throughput and reduce cost per sample analyzed.

  15. Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies

    DOE PAGES

    Utturkar, Sagar M.; Klingeman, Dawn Marie; Bruno-Barcena, José M.; ...

    2015-04-14

    During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms which are characterized by low cost, high throughput and shorter read lengths for example, Illumina. The emergence and development of so called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequence datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data.

  16. Advances in clinical next-generation sequencing: target enrichment and sequencing technologies.

    PubMed

    Ballester, Leomar Y; Luthra, Rajyalakshmi; Kanagal-Shamanna, Rashmi; Singh, Rajesh R

    2016-01-01

    The huge parallel sequencing capabilities of next generation sequencing technologies have made them the tools of choice to characterize genomic aberrations for research and diagnostic purposes. For clinical applications, screening the whole genome or exome is challenging owing to the large genomic area to be sequenced, associated costs, complexity of data, and lack of known clinical significance of all genes. Consequently, routine screening involves limited markers with established clinical relevance. This process, referred to as targeted genome sequencing, requires selective enrichment of the genomic areas comprising these markers via one of several primer or probe-based enrichment strategies, followed by sequencing of the enriched genomic areas. Here, the authors review current target enrichment approaches and next generation sequencing platforms, focusing on the underlying principles, capabilities, and limitations of each technology along with validation and implementation for clinical testing.

  17. Improved pipeline for reducing erroneous identification by 16S rRNA sequences using the Illumina MiSeq platform.

    PubMed

    Jeon, Yoon-Seong; Park, Sang-Cheol; Lim, Jeongmin; Chun, Jongsik; Kim, Bong-Soo

    2015-01-01

    The cost of DNA sequencing has decreased due to advancements in Next Generation Sequencing. The number of sequences obtained from the Illumina platform is large, and use of this platform can reduce costs relative to the 454 pyrosequencer. However, the Illumina platform has other challenges, including bioinformatics analysis of large numbers of sequences and the need to reduce erroneous nucleotides generated at the 3'-ends of the sequences. These erroneous sequences can lead to errors in analysis of microbial communities. Therefore, correction of these erroneous sequences is necessary for accurate taxonomic identification. Several studies that have used the Illumina platform to perform metagenomic analyses proposed curating pipelines to increase accuracy. In this study, we evaluated the likelihood of obtaining an erroneous microbial composition using the MiSeq 250 bp paired sequence platform and improved the pipeline to reduce erroneous identifications. We compared different sequencing conditions by varying the percentage of control phiX added, the concentration of the sequencing library, and the 16S rRNA gene target region using a mock community sample composed of known sequences. Our recommended method corrected erroneous nucleotides and improved identification accuracy. Overall, 99.5% of the total reads shared 95% similarity with the corresponding template sequences and 93.6% of the total reads shared over 97% similarity. This indicated that the MiSeq platform can be used to analyze microbial communities at the genus level with high accuracy. The improved analysis method recommended in this study can be applied to amplicon studies in various environments using high-throughput reads generated on the MiSeq platform.

  18. Next-generation sequencing strategies for characterizing the turkey genome.

    PubMed

    Dalloul, Rami A; Zimin, Aleksey V; Settlage, Robert E; Kim, Sungwon; Reed, Kent M

    2014-02-01

    The turkey genome sequencing project was initiated in 2008 and has relied primarily on next-generation sequencing (NGS) technologies. Our first efforts used a synergistic combination of 2 NGS platforms (Roche/454 and Illumina GAII), detailed bacterial artificial chromosome (BAC) maps, and unique assembly tools to sequence and assemble the genome of the domesticated turkey, Meleagris gallopavo. Since the first release in 2010, efforts to improve the genome assembly, gene annotation, and genomic analyses continue. The initial assembly build (2.01) represented about 89% of the genome sequence with 17X coverage depth (931 Mb). Sequence contigs were assigned to 30 of the 40 chromosomes with approximately 10% of the assembled sequence corresponding to unassigned chromosomes (ChrUn). The sequence has been refined through both genome-wide and area-focused sequencing, including shotgun and paired-end sequencing, and targeted sequencing of chromosomal regions with low or incomplete coverage. These additional efforts have improved the sequence assembly resulting in 2 subsequent genome builds of higher genome coverage (25X/Build3.0 and 30X/Build4.0) with a current sequence totaling 1,010 Mb. Further, BAC with end sequences assigned to the Z/W and MG18 (MHC) chromosomes, ChrUn, or not placed in the previous build were isolated, deeply sequenced (Hi-Seq), and incorporated into the latest build (5.0). To aid in the annotation and to generate a gene expression atlas of major tissues, a comprehensive set of RNA samples was collected at various developmental stages of female and male turkeys. Transcriptome sequencing data (using Illumina Hi-Seq) will provide information to enhance the final assembly and ultimately improve sequence annotation. The most current sequence covers more than 95% of the turkey genome and should yield a much improved gene level of annotation, making it a valuable resource for studying genetic variations underlying economically important traits in poultry.

  19. Next-generation sequencing - feasibility and practicality in haematology.

    PubMed

    Kohlmann, Alexander; Grossmann, Vera; Nadarajah, Niroshan; Haferlach, Torsten

    2013-03-01

    Next-generation sequencing platforms have evolved to provide an accurate and comprehensive means for the detection of molecular mutations in heterogeneous tumour specimens. Here, we review the feasibility and practicality of this novel laboratory technology. In particular, we focus on the utility of next-generation sequencing technology in characterizing haematological neoplasms and the landmark findings in key haematological malignancies. We also discuss deep-sequencing strategies to analyse the constantly increasing number of molecular markers applied for disease classification, patient stratification and individualized monitoring of minimal residual disease. Although many facets of this assay need to be taken into account, amplicon deep-sequencing has already demonstrated a promising technical performance and is being continuously developed towards routine application in diagnostic laboratories so that an impact on clinical practice can be achieved.

  20. Next-Generation Sequencing for Cancer Diagnostics: a Practical Perspective

    PubMed Central

    Meldrum, Cliff; Doyle, Maria A; Tothill, Richard W

    2011-01-01

    Next-generation sequencing (NGS) is arguably one of the most significant technological advances in the biological sciences of the last 30 years. The second generation sequencing platforms have advanced rapidly to the point that several genomes can now be sequenced simultaneously in a single instrument run in under two weeks. Targeted DNA enrichment methods allow even higher genome throughput at a reduced cost per sample. Medical research has embraced the technology and the cancer field is at the forefront of these efforts given the genetic aspects of the disease. World-wide efforts to catalogue mutations in multiple cancer types are underway and this is likely to lead to new discoveries that will be translated to new diagnostic, prognostic and therapeutic targets. NGS is now maturing to the point where it is being considered by many laboratories for routine diagnostic use. The sensitivity, speed and reduced cost per sample make it a highly attractive platform compared to other sequencing modalities. Moreover, as we identify more genetic determinants of cancer there is a greater need to adopt multi-gene assays that can quickly and reliably sequence complete genes from individual patient samples. Whilst widespread and routine use of whole genome sequencing is likely to be a few years away, there are immediate opportunities to implement NGS for clinical use. Here we review the technology, methods and applications that can be immediately considered and some of the challenges that lie ahead. PMID:22147957

  1. ADS: The Next Generation Search Platform

    NASA Astrophysics Data System (ADS)

    Accomazzi, A.; Kurtz, M. J.; Henneken, E. A.; Chyla, R.; Luker, J.; Grant, C. S.; Thompson, D. M.; Holachek, A.; Dave, R.; Murray, S. S.

    2015-04-01

    Four years after the last LISA meeting, the NASA Astrophysics Data System (ADS) finds itself in the middle of major changes to the infrastructure and contents of its database. In this paper we highlight a number of features of great importance to librarians and discuss the additional functionality that we are currently developing. Our citation coverage has doubled since 2010 and now consists of over 10 million citations. We are normalizing the affiliation information in our records and we have started collecting and linking funding sources with papers in our system. At the same time, we are undergoing major technology changes in the ADS platform. We have rolled out and are now enhancing a new high-performance search engine capable of performing full-text as well as metadata searches using an intuitive query language. We are currently able to index acknowledgments, affiliations, citations, and funding sources. While this effort is still ongoing, some of its benefits are already available through the ADS Labs user interface and API at http://adslabs.org/adsabs/.

  2. Iterative method for generating correlated binary sequences

    NASA Astrophysics Data System (ADS)

    Usatenko, O. V.; Melnik, S. S.; Apostolov, S. S.; Makarov, N. M.; Krokhin, A. A.

    2014-11-01

    We propose an efficient iterative method for generating random correlated binary sequences with a prescribed correlation function. The method is based on consecutive linear modulations of an initially uncorrelated sequence into a correlated one. Each step of modulation increases the correlations until the desired level has been reached. The robustness and efficiency of the proposed algorithm are tested by generating sequences with inverse power-law correlations. The substantial increase in the strength of correlation in the iterative method with respect to single-step filtering generation is shown for all studied correlation functions. Our results can be used for design of disordered superlattices, waveguides, and surfaces with selective transport properties.

  3. Next-generation sequencing discoveries in lymphoma.

    PubMed

    Slack, Graham W; Gascoyne, Randy D

    2013-03-01

    Since the mapping of the human genome and the advent of next-generation sequencing technology thorough examination of the cancer genome has become a reality. Over the last few years several studies have used next-generation sequencing technology to investigate the genetic landscape of Hodgkin and non-Hodgkin lymphomas, identifying novel genetic mutations and gene rearrangements that have shed new light on the underlying tumor biology in these diseases as well as identifying possible targets for directed therapy. This review covers the major discoveries in lymphoma using next-generation sequencing technology.

  4. NG6: Integrated next generation sequencing storage and processing environment

    PubMed Central

    2012-01-01

    Background: Next generation sequencing platforms are now well implanted in sequencing centres and some laboratories. Upcoming smaller scale machines such as the 454 junior from Roche or the MiSeq from Illumina will increase the number of laboratories hosting a sequencer. In such a context, it is important to provide these teams with an easily manageable environment to store and process the produced reads. Results: We describe a user-friendly information system able to manage large sets of sequencing data. It includes, on one hand, a workflow environment already containing pipelines adapted to different input formats (sff, fasta, fastq and qseq), different sequencers (Roche 454, Illumina HiSeq) and various analyses (quality control, assembly, alignment, diversity studies, …) and, on the other hand, a secured web site giving access to the results. The connected user will be able to download raw and processed data and browse through the analysis result statistics. The provided workflows can easily be modified or extended and new ones can be added. Ergatis is used as a workflow building, running and monitoring system. The analyses can be run locally or in a cluster environment using Sun Grid Engine. Conclusions: NG6 is a complete information system designed to answer the needs of a sequencing platform. It provides a user-friendly interface to process, store and download high-throughput sequencing data. PMID:22958229

  5. Toward a new paradigm of DNA writing using a massively parallel sequencing platform and degenerate oligonucleotide

    PubMed Central

    Hwang, Byungjin; Bang, Duhee

    2016-01-01

    All synthetic DNA materials require prior programming of the building blocks of the oligonucleotide sequences. The development of a programmable microarray platform provides cost-effective and time-efficient solutions in the field of data storage using DNA. However, the scalability of the synthesis is not on par with the accelerating sequencing capacity. Here, we report on a new paradigm of generating genetic material (writing) using a degenerate oligonucleotide and optomechanical retrieval method that leverages sequencing (reading) throughput to generate the desired number of oligonucleotides. As a proof of concept, we demonstrate the feasibility of our concept in digital information storage in DNA. In simulation, the ability to store data is expected to exponentially increase with increase in degenerate space. The present study highlights the major framework change in conventional DNA writing paradigm as a sequencer itself can become a potential source of making genetic materials. PMID:27876825
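
    The statement above that storage capacity grows exponentially with the degenerate space can be made concrete with a small sketch. It uses the standard IUPAC nucleotide ambiguity codes; the function names are hypothetical and the code is only an illustration of the counting argument, not the authors' simulation.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degenerate_space_size(oligo):
    """Count the distinct sequences a degenerate oligonucleotide can represent."""
    size = 1
    for base in oligo:
        size *= len(IUPAC[base])
    return size


def expand(oligo):
    """List every concrete sequence encoded by a (short) degenerate oligo."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in oligo))]


if __name__ == "__main__":
    print(expand("AR"))
    for n in (1, 5, 10):
        # Each added fully degenerate position (N) multiplies the space by 4.
        print(n, degenerate_space_size("N" * n))
```

    Each fully degenerate position multiplies the number of possible molecules by four, so n such positions give 4^n candidates from a single synthesis design.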

  6. Next-generation sequencing and large genome assemblies

    PubMed Central

    Henson, Joseph; Tischler, German; Ning, Zemin

    2012-01-01

    The next-generation sequencing (NGS) revolution has drastically reduced time and cost requirements for sequencing of large genomes, and also qualitatively changed the problem of assembly. This article reviews the state of the art in de novo genome assembly, paying particular attention to mammalian-sized genomes. The strengths and weaknesses of the main sequencing platforms are highlighted, leading to a discussion of assembly and the new challenges associated with NGS data. Current approaches to assembly are outlined and the various software packages available are introduced and compared. The question of whether quality assemblies can be produced using short-read NGS data alone, or whether it must be combined with more expensive sequencing techniques, is considered. Prospects for future assemblers and tests of assembly performance are also discussed. PMID:22676195

  7. Variant Calling From Next Generation Sequence Data.

    PubMed

    Hansen, Nancy F

    2016-01-01

    The use of next generation nucleotide sequencing to discover and genotype small sequence variants has led to numerous insights into the molecular causes of various diseases. This chapter describes the use of freely available software to align next generation sequencing reads to a reference and then to use the resulting alignments to call, annotate, view, and filter small sequence variants. The suggested variant calling workflow includes read alignment with novoalign, the removal of polymerase chain reaction duplicate sequences with samtools or bamUtils, and the detection of variants with Freebayes or bam2mpg software. ANNOVAR is then used to annotate the predicted variants using gene models, population frequencies, and predicted mutation severity, producing variant files which can be viewed and filtered with the variant display tool VarSifter.

  8. Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform

    PubMed Central

    Schirmer, Melanie; Ijaz, Umer Z.; D'Amore, Rosalinda; Hall, Neil; Sloan, William T.; Quince, Christopher

    2015-01-01

    With read lengths of currently up to 2 × 300 bp, high throughput and low sequencing costs Illumina's MiSeq is becoming one of the most utilized sequencing platforms worldwide. The platform is manageable and affordable even for smaller labs. This enables quick turnaround on a broad range of applications such as targeted gene sequencing, metagenomics, small genome sequencing and clinical molecular diagnostics. However, Illumina error profiles are still poorly understood and programs are therefore not designed for the idiosyncrasies of Illumina data. A better knowledge of the error patterns is essential for sequence analysis and vital if we are to draw valid conclusions. Studying true genetic variation in a population sample is fundamental for understanding diseases, evolution and origin. We conducted a large study on the error patterns for the MiSeq based on 16S rRNA amplicon sequencing data. We tested state-of-the-art library preparation methods for amplicon sequencing and showed that the library preparation method and the choice of primers are the most significant sources of bias and cause distinct error patterns. Furthermore we tested the efficiency of various error correction strategies and identified quality trimming (Sickle) combined with error correction (BayesHammer) followed by read overlapping (PANDAseq) as the most successful approach, reducing substitution error rates on average by 93%. PMID:25586220
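
As a rough illustration of the quality-trimming step, the sketch below applies a simplified sliding-window rule to a list of Phred scores. It is written in the spirit of window-based trimmers and is not Sickle's actual implementation; the function name, the window size of 4 and the threshold of 20 are assumptions chosen for the example.

```python
def trim_length(quals, window=4, threshold=20):
    """Return how many leading bases to keep (simplified sliding-window trim).

    quals: list of integer Phred quality scores, 5' to 3'.
    The read is cut at the start of the first window whose mean quality
    falls below `threshold`. Window and threshold values are illustrative.
    """
    n = len(quals)
    if n == 0:
        return 0
    if n < window:
        # Read shorter than one window: keep it only if its mean quality passes.
        return n if sum(quals) / n >= threshold else 0
    for start in range(n - window + 1):
        mean_q = sum(quals[start:start + window]) / window
        if mean_q < threshold:
            return start
    return n


if __name__ == "__main__":
    scores = [30, 30, 30, 30, 30, 10, 10, 10, 10]
    keep = trim_length(scores)
    print("keep first", keep, "bases")
```

Real trimmers also handle 5' trimming, paired reads and minimum-length filters, which are omitted here.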

  9. Theory of Periodic-Binary-Sequence Generators

    NASA Technical Reports Server (NTRS)

    Perlman, M.

    1987-01-01

    Algorithms yield feedback shift registers with maximum regularity. Report provides extensive mathematical treatment of new and previous results related to generation of pseudo-noise binary sequences by feedback shift registers. Generator architectures amenable to efficient implementation in very-large-scale integrated (VLSI) circuits. Report includes literature references to applications of such sequences in random-number generation, radar, VLSI testing, data encryption and decryption, algebraic error-detection and error-correction encoding and decoding, and feedback-shift-register synthesis of sequential machines.
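
A minimal sketch of pseudo-noise generation with a linear feedback shift register is given below. It is a generic Fibonacci-style register and does not reproduce the report's algorithms; the function names and the 4-stage tap choice (stages 4 and 3, a well-known maximal-length configuration) are assumptions for illustration.

```python
def lfsr_step(reg, taps):
    """Advance the register one step; taps are 1-based stage positions."""
    feedback = 0
    for t in taps:
        feedback ^= reg[t - 1]          # XOR of the tapped stages
    return [feedback] + reg[:-1]        # shift right, feed back into stage 1


def lfsr_sequence(taps, state, length):
    """Return `length` output bits, taken from the last stage before each shift."""
    reg = list(state)
    out = []
    for _ in range(length):
        out.append(reg[-1])
        reg = lfsr_step(reg, taps)
    return out


def lfsr_period(taps, state):
    """Count the steps until the register returns to its initial (non-zero) state."""
    start = list(state)
    reg = lfsr_step(start, taps)
    steps = 1
    while reg != start:
        reg = lfsr_step(reg, taps)
        steps += 1
    return steps


if __name__ == "__main__":
    taps, seed = (4, 3), [1, 0, 0, 0]
    print(lfsr_period(taps, seed))
    print(lfsr_sequence(taps, seed, 15))
```

For an n-stage register with maximal-length taps, the period is 2^n - 1 and each period contains 2^(n-1) ones, which is the balance property that makes such sequences useful as pseudo-noise.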

  10. Comparison of Next-Generation Sequencing Systems

    PubMed Central

    Liu, Lin; Li, Yinhu; Li, Siliang; Hu, Ni; He, Yimin; Pong, Ray; Lin, Danni; Lu, Lihua; Law, Maggie

    2012-01-01

    With fast development and wide applications of next-generation sequencing (NGS) technologies, genomic sequence information is within reach to aid the achievement of goals to decode life mysteries, make better crops, detect pathogens, and improve life qualities. NGS systems are typically represented by SOLiD/Ion Torrent PGM from Life Sciences, Genome Analyzer/HiSeq 2000/MiSeq from Illumina, and GS FLX Titanium/GS Junior from Roche. Beijing Genomics Institute (BGI), which possesses the world's biggest sequencing capacity, has multiple NGS systems including 137 HiSeq 2000, 27 SOLiD, one Ion Torrent PGM, one MiSeq, and one 454 sequencer. We have accumulated extensive experience in sample handling, sequencing, and bioinformatics analysis. In this paper, technologies of these systems are reviewed, and first-hand data from extensive experience is summarized and analyzed to discuss the advantages and specifics associated with each sequencing system. At last, applications of NGS are summarized. PMID:22829749

  11. Next Generation DNA Sequencing and the Future of Genomic Medicine

    PubMed Central

    Anderson, Matthew W.; Schrijver, Iris

    2010-01-01

    In the years since the first complete human genome sequence was reported, there has been a rapid development of technologies to facilitate high-throughput sequence analysis of DNA (termed “next-generation” sequencing). These novel approaches to DNA sequencing offer the promise of complete genomic analysis at a cost feasible for routine clinical diagnostics. However, the ability to more thoroughly interrogate genomic sequence raises a number of important issues with regard to result interpretation, laboratory workflow, data storage, and ethical considerations. This review describes the current high-throughput sequencing platforms commercially available, and compares the inherent advantages and disadvantages of each. The potential applications for clinical diagnostics are considered, as well as the need for software and analysis tools to interpret the vast amount of data generated. Finally, we discuss the clinical and ethical implications of the wealth of genetic information generated by these methods. Despite the challenges, we anticipate that the evolution and refinement of high-throughput DNA sequencing technologies will catalyze a new era of personalized medicine based on individualized genomic analysis. PMID:24710010

  12. Sequence Depth, Not PCR Replication, Improves Ecological Inference from Next Generation DNA Sequencing

    PubMed Central

    Smith, Dylan P.; Peay, Kabir G.

    2014-01-01

    Recent advances in molecular approaches and DNA sequencing have greatly progressed the field of ecology and allowed for the study of complex communities in unprecedented detail. Next generation sequencing (NGS) can reveal powerful insights into the diversity, composition, and dynamics of cryptic organisms, but results may be sensitive to a number of technical factors, including molecular practices used to generate amplicons, sequencing technology, and data processing. Despite the popularity of some techniques over others, explicit tests of the relative benefits they convey in molecular ecology studies remain scarce. Here we tested the effects of PCR replication, sequencing depth, and sequencing platform on ecological inference drawn from environmental samples of soil fungi. We sequenced replicates of three soil samples taken from pine biomes in North America represented by pools of either one, two, four, eight, or sixteen PCR replicates with both 454 pyrosequencing and Illumina MiSeq. Increasing the number of pooled PCR replicates had no detectable effect on measures of α- and β-diversity. Pseudo-β-diversity – which we define as dissimilarity between re-sequenced replicates of the same sample – decreased markedly with increasing sampling depth. The total richness recovered with Illumina was significantly higher than with 454, but measures of α- and β-diversity between a larger set of fungal samples sequenced on both platforms were highly correlated. Our results suggest that molecular ecology studies will benefit more from investing in robust sequencing technologies than from replicating PCRs. This study also demonstrates the potential for continuous integration of older datasets with newer technology. PMID:24587293
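
To make the α- and β-diversity measures concrete, here is a small sketch using two common choices: the Shannon index for α-diversity and Bray-Curtis dissimilarity for β-diversity. The abstract does not say which metrics the authors used, so both choices, the function names and the toy counts are assumptions.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two {taxon: count} dictionaries."""
    taxa = set(sample_a) | set(sample_b)
    shared = sum(min(sample_a.get(t, 0), sample_b.get(t, 0)) for t in taxa)
    total = sum(sample_a.values()) + sum(sample_b.values())
    if total == 0:
        return 0.0
    return 1.0 - 2.0 * shared / total


if __name__ == "__main__":
    # Toy read counts per taxon for two re-sequenced replicates of one sample.
    rep1 = {"otu1": 50, "otu2": 30, "otu3": 20}
    rep2 = {"otu1": 45, "otu2": 35, "otu4": 20}
    print(shannon_index(list(rep1.values())))
    print(bray_curtis(rep1, rep2))
```

Applying `bray_curtis` to re-sequenced replicates of the same sample corresponds to the pseudo-β-diversity idea described above: with identical true communities, any dissimilarity comes from sampling and technical noise.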

  13. Extended blood group molecular typing and next-generation sequencing.

    PubMed

    Liu, Zhugong; Liu, Meihong; Mercado, Teresita; Illoh, Orieji; Davey, Richard

    2014-10-01

    Several high-throughput multiplex blood group molecular typing platforms have been developed to predict blood group antigen phenotypes. These molecular systems support extended donor/patient matching by detecting commonly encountered blood group polymorphisms as well as rare alleles that determine the expression of blood group antigens. Extended molecular typing of a large number of blood donors by high-throughput platforms can increase the likelihood of identifying donor red blood cells that match those of recipients. This is especially important in the management of multiply-transfused patients who may have developed several alloantibodies. Nevertheless, current molecular techniques have limitations. For example, they detect only predefined genetic variants. In contrast, target enrichment next-generation sequencing (NGS) is an emerging technology that provides comprehensive sequence information, focusing on specified genomic regions. Target enrichment NGS is able to assess genetic variations that cannot be achieved by traditional Sanger sequencing or other genotyping platforms. Target enrichment NGS has been used to detect both known and de novo genetic polymorphisms, including single-nucleotide polymorphisms, indels (insertions/deletions), and structural variations. This review discusses the methodology, advantages, and limitations of the current blood group genotyping techniques and describes various target enrichment NGS approaches that can be used to develop an extended blood group genotyping assay system.

  14. Reproducibility of Variant Calls in Replicate Next Generation Sequencing Experiments

    PubMed Central

    Qi, Yuan; Liu, Xiuping; Liu, Chang-gong; Wang, Bailing; Hess, Kenneth R.; Symmans, W. Fraser; Shi, Weiwei; Pusztai, Lajos

    2015-01-01

    Nucleotide alterations detected by next generation sequencing are not always true biological changes but could represent sequencing errors. Even highly accurate methods can yield substantial error rates when applied to millions of nucleotides. In this study, we examined the reproducibility of nucleotide variant calls in replicate sequencing experiments of the same genomic DNA. We performed targeted sequencing of all known human protein kinase genes (kinome) (~3.2 Mb) using the SOLiD v4 platform. Seventeen breast cancer samples were sequenced in duplicate (n=14) or triplicate (n=3) to assess concordance of all calls and single nucleotide variant (SNV) calls. The concordance rates over the entire sequenced region were >99.99%, while the concordance rates for SNVs were 54.3-75.5%. There was substantial variation in basic sequencing metrics from experiment to experiment. The type of nucleotide substitution and genomic location of the variant had little impact on concordance but concordance increased with coverage level, variant allele count (VAC), variant allele frequency (VAF), variant allele quality and p-value of SNV-call. The most important determinants of concordance were VAC and VAF. Even using the highest stringency of QC metrics the reproducibility of SNV calls was around 80% suggesting that erroneous variant calling can be as high as 20-40% in a single experiment. The sequence data have been deposited into the European Genome-phenome Archive (EGA) with accession number EGAS00001000826. PMID:26136146
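
One simple way to express concordance between replicate call sets is sketched below. It assumes concordance is the share of variants called in both replicates out of all variants called in either (a Jaccard-style ratio); the abstract does not give the authors' exact definition, and the function name and toy variants are hypothetical.

```python
def snv_concordance(calls_a, calls_b):
    """Fraction of SNV calls shared by two replicates (intersection / union).

    Each call is a hashable key such as (chrom, pos, ref, alt).
    Returns None when neither replicate has any call.
    """
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        return None
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Toy call sets, not real data.
    rep1 = [("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A")]
    rep2 = [("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr3", 10, "T", "C")]
    print(snv_concordance(rep1, rep2))
```

Computing the ratio over variant positions only, rather than over every sequenced base, is what lets overall concordance be very high while SNV concordance is much lower: the vast majority of positions agree trivially because neither replicate calls a variant there.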

  15. What can next generation sequencing do for you? Next generation sequencing as a valuable tool in plant research.

    PubMed

    Bräutigam, A; Gowik, U

    2010-11-01

    Next generation sequencing (NGS) technologies have opened fascinating opportunities for the analysis of plants with and without a sequenced genome on a genomic scale. During the last few years, NGS methods have become widely available and cost effective. They can be applied to a wide variety of biological questions, from the sequencing of complete eukaryotic genomes and transcriptomes, to the genome-scale analysis of DNA-protein interactions. In this review, we focus on the use of NGS for plant transcriptomics, including gene discovery, transcript quantification and marker discovery for non-model plants, as well as transcript annotation and quantification, small RNA discovery and antisense transcription analysis for model plants. We discuss the experimental design for analysis of plants with and without a sequenced genome, including considerations on sampling, RNA preparation, sequencing platforms and bioinformatics tools for data analysis. NGS technologies offer exciting new opportunities for the plant sciences, especially for work on plants without a sequenced genome, since large sequence resources can be generated at moderate cost.

  16. Therapeutic assessment of SEED: a new engineered antibody platform designed to generate mono- and bispecific antibodies.

    PubMed

    Muda, Marco; Gross, Alec W; Dawson, Jessica P; He, Chaomei; Kurosawa, Emmi; Schweickhardt, Rene; Dugas, Melanie; Soloviev, Maria; Bernhardt, Anna; Fischer, David; Wesolowski, John S; Kelton, Christie; Neuteboom, Berend; Hock, Bjoern

    2011-05-01

    The strand-exchange engineered domain (SEED) platform was designed to generate asymmetric and bispecific antibody-like molecules, a capability that expands therapeutic applications of natural antibodies. This new protein engineered platform is based on exchanging structurally related sequences of immunoglobulin within the conserved CH3 domains. Alternating sequences from human IgA and IgG in the SEED CH3 domains generate two asymmetric but complementary domains, designated AG and GA. The SEED design allows efficient generation of AG/GA heterodimers, while disfavoring homodimerization of AG and GA SEED CH3 domains. Using a clinically validated antibody (C225), we tested whether Fab derivatives constructed on the SEED platform retain desirable therapeutic antibody features such as in vitro and in vivo stability, favorable pharmacokinetics, ligand binding and effector functions including antibody-dependent cell-mediated cytotoxicity and complement-dependent cytotoxicity. In addition, we tested SEED with combinations of binder domains (scFv, VHH, Fab). Mono- and bivalent Fab-SEED fusions retain full binding affinity, have excellent biochemical and biophysical stability, and retain desirable antibody-like characteristics conferred by Fc domains. Furthermore, SEED is compatible with different combinations of Fab, scFv and VHH domains. Our assessment shows that the new SEED platform expands therapeutic applications of natural antibodies by generating heterodimeric Fc-analog proteins.

  17. Standardization and quality management in next-generation sequencing.

    PubMed

    Endrullat, Christoph; Glökler, Jörn; Franke, Philipp; Frohme, Marcus

    2016-09-01

    DNA sequencing continues to evolve quickly even after more than 30 years. Many new platforms have appeared suddenly, and formerly established systems have vanished in almost the same manner. Since the establishment of next-generation sequencing devices, this progress has gained momentum due to the continually growing demand for higher throughput, lower costs and better quality of data. As a consequence of this rapid development, standardized procedures and data formats as well as comprehensive quality management considerations are still scarce. Here, we list and summarize current standardization efforts and quality management initiatives from companies, organizations and societies in the form of published studies and ongoing projects. These comprise, on the one hand, quality documentation issues like technical notes, accreditation checklists and guidelines for validation of sequencing workflows. On the other hand, general standard proposals and quality metrics are developed and applied to the sequencing workflow steps with the main focus on upstream processes. Finally, certain standard developments for downstream pipeline data handling, processing and storage are discussed in brief. These standardization approaches represent a first basis for continuing work in order to prospectively implement next-generation sequencing in important areas such as clinical diagnostics, where reliable results and fast processing are crucial. Additionally, these efforts will exert a decisive influence on traceability and reproducibility of sequence data.

  18. PCR Techniques in Next-Generation Sequencing.

    PubMed

    Goswami, Rashmi S

    2016-01-01

    With the advent of next-generation sequencing (NGS) and its prolific use in the clinical realm, it would appear that techniques such as PCR would not be in high demand. This is not the case, however, as PCR techniques play an important role in the success of NGS technology. Although NGS has rapidly become an important part of clinical molecular diagnostics, whole genome sequencing is still difficult to implement in a clinical laboratory due to high costs of sequencing, as well as issues surrounding data processing, analysis, and data storage, which can reduce efficiency and increase turnaround times. As a result, targeted sequencing is often used in clinical diagnostics, due to its increased efficiency. PCR techniques play an integral role in targeted NGS, allowing for the generation of multiple NGS libraries and the sequencing of multiple targeted regions simultaneously. We will outline the methods we employ in PCR amplification of targeted genomic regions for cancer mutation hotspots using the Ampliseq Cancer Hotspot v2 panel (Life Technologies, Carlsbad, CA).

  19. Impact of Next Generation Sequencing Techniques in Food Microbiology

    PubMed Central

    Mayo, Baltasar; Rachid, Caio T. C. C; Alegría, Ángel; Leite, Analy M. O; Peixoto, Raquel S; Delgado, Susana

    2014-01-01

    Understanding the Maxam-Gilbert and Sanger sequencing as the first generation, in recent years there has been an explosion of newly-developed sequencing strategies, which are usually referred to as next generation sequencing (NGS) techniques. NGS techniques have high-throughputs and produce thousands or even millions of sequences at the same time. These sequences allow for the accurate identification of microbial taxa, including uncultivable organisms and those present in small numbers. In specific applications, NGS provides a complete inventory of all microbial operons and genes present or being expressed under different study conditions. NGS techniques are revolutionizing the field of microbial ecology and have recently been used to examine several food ecosystems. After a short introduction to the most common NGS systems and platforms, this review addresses how NGS techniques have been employed in the study of food microbiota and food fermentations, and discusses their limits and perspectives. The most important findings are reviewed, including those made in the study of the microbiota of milk, fermented dairy products, and plant-, meat- and fish-derived fermented foods. The knowledge that can be gained on microbial diversity, population structure and population dynamics via the use of these technologies could be vital in improving the monitoring and manipulation of foods and fermented food products. They should also improve their safety. PMID:25132799

  20. Impact of next generation sequencing techniques in food microbiology.

    PubMed

    Mayo, Baltasar; Rachid, Caio T C C; Alegría, Angel; Leite, Analy M O; Peixoto, Raquel S; Delgado, Susana

    2014-08-01

    Understanding the Maxam-Gilbert and Sanger sequencing as the first generation, in recent years there has been an explosion of newly-developed sequencing strategies, which are usually referred to as next generation sequencing (NGS) techniques. NGS techniques have high-throughputs and produce thousands or even millions of sequences at the same time. These sequences allow for the accurate identification of microbial taxa, including uncultivable organisms and those present in small numbers. In specific applications, NGS provides a complete inventory of all microbial operons and genes present or being expressed under different study conditions. NGS techniques are revolutionizing the field of microbial ecology and have recently been used to examine several food ecosystems. After a short introduction to the most common NGS systems and platforms, this review addresses how NGS techniques have been employed in the study of food microbiota and food fermentations, and discusses their limits and perspectives. The most important findings are reviewed, including those made in the study of the microbiota of milk, fermented dairy products, and plant-, meat- and fish-derived fermented foods. The knowledge that can be gained on microbial diversity, population structure and population dynamics via the use of these technologies could be vital in improving the monitoring and manipulation of foods and fermented food products. They should also improve their safety.

  1. Neural mechanisms of sequence generation in songbirds

    NASA Astrophysics Data System (ADS)

    Langford, Bruce

    Animal models in research are useful for studying more complex behavior. For example, motor sequence generation of actions requiring good muscle coordination, such as writing with a pen, playing an instrument, or speaking, may involve the interaction of many areas in the brain, each a complex system in itself; thus it can be difficult to determine causal relationships between neural behavior and the behavior being studied. Birdsong, however, provides an excellent model behavior for motor sequence learning, memory, and generation. The song consists of learned sequences of notes that are spectrographically stereotyped over multiple renditions of the song, similar to syllables in human speech. The main areas of the songbird brain involved in singing are known; however, the mechanisms by which these systems store and produce song are not well understood. We used a custom-built, head-mounted, miniature motorized microdrive to chronically record the neural firing patterns of identified neurons in HVC, a pre-motor cortical nucleus which has been shown to be important in song timing. These recordings were made in Bengalese finches, which generate a song made up of stereotyped notes but variable note sequences. We observed song-related bursting in neurons projecting to Area X, a homologue of the basal ganglia, and tonic firing in HVC interneurons. Interneurons had firing rate patterns that were consistent over multiple renditions of the same note sequence. We also designed and built a lightweight, low-power wireless programmable neural stimulator using the Bluetooth Low Energy protocol. It was able to generate perturbations in the song when current pulses were administered to RA, which projects to the brainstem nucleus responsible for syringeal muscle control.

  2. Next Generation Sequencing Reveals the Hidden Diversity of Zooplankton Assemblages

    PubMed Central

    Harmer, Rachel A.; Somerfield, Paul J.; Atkinson, Angus

    2013-01-01

    Background: Zooplankton play an important role in our oceans, in biogeochemical cycling and providing a food source for commercially important fish larvae. However, difficulties in correctly identifying zooplankton hinder our understanding of their roles in marine ecosystem functioning, and can prevent detection of long term changes in their community structure. The advent of massively parallel next generation sequencing technology allows DNA sequence data to be recovered directly from whole community samples. Here we assess the ability of such sequencing to quantify richness and diversity of a mixed zooplankton assemblage from a productive time series site in the Western English Channel. Methodology/Principal Findings: Plankton net hauls (200 µm) were taken at the Western Channel Observatory station L4 in September 2010 and January 2011. These samples were analysed by microscopy and metagenetic analysis of the 18S nuclear small subunit ribosomal RNA gene using the 454 pyrosequencing platform. Following quality control a total of 419,041 sequences were obtained for all samples. The sequences clustered into 205 operational taxonomic units (OTUs) using a 97% similarity cut-off. Allocation of taxonomy by comparison with the National Centre for Biotechnology Information database identified 135 OTUs to species level, 11 to genus level and 1 to order; <2.5% of sequences were classified as unknowns. By comparison a skilled microscopic analyst was able to routinely enumerate only 58 taxonomic groups. Conclusions: Metagenetics reveals a previously hidden taxonomic richness, especially for Copepoda and hard-to-identify meroplankton such as Bivalvia, Gastropoda and Polychaeta. It also reveals rare species and parasites. We conclude that Next Generation Sequencing of 18S amplicons is a powerful tool for elucidating the true diversity and species richness of zooplankton communities. While this approach allows for broad diversity assessments of plankton it may become increasingly

  3. Open-Phylo: a customizable crowd-computing platform for multiple sequence alignment

    PubMed Central

    2013-01-01

    Citizen science games such as Galaxy Zoo, Foldit, and Phylo aim to harness the intelligence and processing power generated by crowds of online gamers to solve scientific problems. However, the selection of the data to be analyzed through these games is under the exclusive control of the game designers, and so are the results produced by gamers. Here, we introduce Open-Phylo, a freely accessible crowd-computing platform that enables any scientist to enter our system and use crowds of gamers to assist computer programs in solving one of the most fundamental problems in genomics: the multiple sequence alignment problem. PMID:24148814

  4. Open-Phylo: a customizable crowd-computing platform for multiple sequence alignment.

    PubMed

    Kwak, Daniel; Kam, Alfred; Becerra, David; Zhou, Qikuan; Hops, Adam; Zarour, Eleyine; Kam, Arthur; Sarmenta, Luis; Blanchette, Mathieu; Waldispühl, Jérôme

    2013-01-01

    Citizen science games such as Galaxy Zoo, Foldit, and Phylo aim to harness the intelligence and processing power generated by crowds of online gamers to solve scientific problems. However, the selection of the data to be analyzed through these games is under the exclusive control of the game designers, and so are the results produced by gamers. Here, we introduce Open-Phylo, a freely accessible crowd-computing platform that enables any scientist to enter our system and use crowds of gamers to assist computer programs in solving one of the most fundamental problems in genomics: the multiple sequence alignment problem.

  5. CaPSID: A bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes

    PubMed Central

    2012-01-01

    Background: It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opportunity for pathogen detection and discovery in human tissues but requires development of new genome-wide bioinformatics tools. Results: Here we present CaPSID (Computational Pathogen Sequence IDentification), a comprehensive bioinformatics platform for identifying, querying and visualizing both exogenous and endogenous pathogen nucleotide sequences in tumor genomes and transcriptomes. CaPSID includes a scalable, high performance database for data storage and a web application that integrates the genome browser JBrowse. CaPSID also provides useful metrics for sequence analysis of pre-aligned BAM files, such as gene and genome coverage, and is optimized to run efficiently on multiprocessor computers with low memory usage. Conclusions: To demonstrate the usefulness and efficiency of CaPSID, we carried out a comprehensive analysis of both a simulated dataset and transcriptome samples from ovarian cancer. CaPSID correctly identified all of the human and pathogen sequences in the simulated dataset, while in the ovarian dataset CaPSID’s predictions were successfully validated in vitro. PMID:22901030

  6. Next-generation sequencing: advances and applications in cancer diagnosis

    PubMed Central

    Serratì, Simona; De Summa, Simona; Pilato, Brunella; Petriella, Daniela; Lacalamita, Rosanna; Tommasi, Stefania; Pinto, Rosamaria

    2016-01-01

    Technological advances have led to the introduction of next-generation sequencing (NGS) platforms in cancer investigation. NGS allows massive parallel sequencing that affords maximal tumor genomic assessment. NGS approaches are different, and concern DNA and RNA analysis. DNA sequencing includes whole-genome, whole-exome, and targeted sequencing, which focuses on a selection of genes of interest for a specific disease. RNA sequencing facilitates the detection of alternative gene-spliced transcripts, posttranscriptional modifications, gene fusion, mutations/single-nucleotide polymorphisms, small and long noncoding RNAs, and changes in gene expression. Most applications are in the cancer research field, but lately NGS technology has been revolutionizing cancer molecular diagnostics, due to the many advantages it offers compared to traditional methods. There is greater knowledge on solid cancer diagnostics, and recent interest has been shown also in the field of hematologic cancer. In this review, we report the latest data on NGS diagnostic/predictive clinical applications in solid and hematologic cancers. Moreover, since the amount of NGS data produced is very large and their interpretation is very complex, we briefly discuss two bioinformatic aspects, variant-calling accuracy and copy-number variation detection, which are gaining a lot of importance in cancer-diagnostic assessment. PMID:27980425

  7. Microfluidic Platform Generates Oxygen Landscapes for Localized Hypoxic Activation

    PubMed Central

    Rexius, Megan L.; Mauleon, Gerardo; Malik, Asrar B.; Rehman, Jalees; Eddington, David T.

    2014-01-01

    An open-well microfluidic platform generates an oxygen landscape using gas-perfused networks which diffuse across a membrane. The device enables real-time analysis of cellular and tissue responses to oxygen tension to define how cells adapt to heterogeneous oxygen conditions found in the physiological setting. We demonstrate that localized hypoxic activation of cells elicited specific metabolic and gene responses in human microvascular endothelial cells and bone marrow-derived mesenchymal stem cells. A robust demonstration of the compatibility of the device with standard laboratory techniques demonstrates the wide utility of the method. This platform is ideally suited to study real-time cell responses and cell-cell interactions within physiologically relevant oxygen landscapes. PMID:25315003

  8. Microfluidic platform generates oxygen landscapes for localized hypoxic activation.

    PubMed

    Rexius-Hall, Megan L; Mauleon, Gerardo; Malik, Asrar B; Rehman, Jalees; Eddington, David T

    2014-12-21

    An open-well microfluidic platform generates an oxygen landscape using gas-perfused networks which diffuse across a membrane. The device enables real-time analysis of cellular and tissue responses to oxygen tension to define how cells adapt to heterogeneous oxygen conditions found in the physiological setting. We demonstrate that localized hypoxic activation of cells elicited specific metabolic and gene responses in human microvascular endothelial cells and bone marrow-derived mesenchymal stem cells. A robust demonstration of the compatibility of the device with standard laboratory techniques demonstrates the wide utility of the method. This platform is ideally suited to study real-time cell responses and cell-cell interactions within physiologically relevant oxygen landscapes.

  9. Next Generation Sequencing in Endocrine Practice

    PubMed Central

    Forlenza, Gregory P.; Calhoun, Amy; Beckman, Kenneth B.; Halvorsen, Tanya; Hamdoun, Elwaseila; Zierhut, Heather; Sarafoglou, Kyriakie; Polgreen, Lynda E.; Miller, Bradley S.; Nathan, Brandon; Petryk, Anna

    2016-01-01

    With the completion of the Human Genome Project and advances in genomic sequencing technologies, the use of clinical molecular diagnostics has grown tremendously over the last decade. Next-generation sequencing (NGS) has overcome many of the practical roadblocks that had slowed the adoption of molecular testing for routine clinical diagnosis. In endocrinology, targeted NGS now complements biochemical testing and imaging studies. The goal of this review is to provide clinicians with a guide to the application of NGS to genetic testing for endocrine conditions, by compiling a list of established gene mutations detectable by NGS, and highlighting key phenotypic features of these disorders. As we outline in this review, the clinical utility of NGS-based molecular testing for endocrine disorders is very high. Identifying an exact genetic etiology improves understanding of the disease, provides clear explanation to families about the cause, and guides decisions about screening, prevention and/or treatment. PMID:25958132

  10. Next Generation Sequencing in Alzheimer's Disease.

    PubMed

    Bertram, Lars

    2016-01-01

    For the first time in the history of human genetics research, it is now both technically feasible and economically affordable to screen individual genomes for novel disease-causing mutations at base-pair resolution using "next-generation sequencing" (NGS). One popular aim in many of today's NGS studies is genome resequencing (in part or whole) to identify DNA variants potentially accounting for the "missing heritability" problem observed in many genetically complex traits. Thus far, only relatively few projects have applied these powerful new technologies to search for novel Alzheimer's disease (AD) related sequence variants. In this review, I summarize the findings from the first NGS-based resequencing studies in AD and discuss their potential implications and limitations. Notable recent discoveries using NGS include the identification of rare susceptibility modifying alleles in APP, TREM2, and PLD3. Several other large-scale NGS projects are currently underway so that additional discoveries can be expected over the coming years.

  11. Generating matrix and sums of Fibonacci and Pell sequences

    NASA Astrophysics Data System (ADS)

    Ho, C. K.; Woon, H. S.; Chong, Chin-Yoon

    2014-07-01

    In this paper, we study the Fibonacci and Pell sequences and develop generating matrices for them. We first prove two results on the even sums of the Fibonacci and Pell sequences using the generating matrix approach. We then deduce the odd sums, some identities, and recursive formulas for these two sequences.
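
    The generating matrix idea can be illustrated with a short Python sketch. It assumes the standard 2x2 matrices [[1, 1], [1, 0]] for Fibonacci and [[2, 1], [1, 0]] for Pell, and checks two well-known even-sum identities numerically. These are textbook forms chosen for illustration and are not necessarily the exact results proved in the paper; all helper names are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two 2x2 integer matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def mat_pow(m, n):
    """Raise a 2x2 matrix to the n-th power by repeated squaring."""
    result = ((1, 0), (0, 1))  # identity matrix
    while n > 0:
        if n & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        n >>= 1
    return result


# Assumed (textbook) generating matrices, not taken from the paper.
FIB = ((1, 1), (1, 0))   # a_n = a_{n-1} + a_{n-2}
PELL = ((2, 1), (1, 0))  # a_n = 2*a_{n-1} + a_{n-2}


def term(matrix, n):
    """Return a_n, using M^n = [[a_{n+1}, a_n], [a_n, a_{n-1}]]."""
    return mat_pow(matrix, n)[0][1]


def even_sum(matrix, n):
    """Sum of the even-indexed terms a_2 + a_4 + ... + a_{2n}."""
    return sum(term(matrix, 2 * k) for k in range(1, n + 1))


# Known identities: F_2 + ... + F_{2n} = F_{2n+1} - 1
# and 2 * (P_2 + ... + P_{2n}) = P_{2n+1} - 1.
for n in range(1, 10):
    assert even_sum(FIB, n) == term(FIB, 2 * n + 1) - 1
    assert 2 * even_sum(PELL, n) == term(PELL, 2 * n + 1) - 1
```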

  12. Next Generation Sequence Assembly with AMOS

    PubMed Central

    Treangen, Todd J; Sommer, Dan D; Angly, Florent E; Koren, Sergey; Pop, Mihai

    2011-01-01

    A Modular Open-Source Assembler (AMOS) was designed to offer a modular approach to genome assembly. AMOS includes a wide range of tools for assembly, including lightweight de novo assemblers Minimus and Minimo, and Bambus 2, a robust scaffolder able to handle metagenomic and polymorphic data. This protocol describes how to configure and use AMOS for the assembly of Next Generation sequence data. Additionally, we provide three tutorial examples that include bacterial, viral, and metagenomic datasets with specific tips for improving assembly quality. PMID:21400694

  13. Next generation sequencing and its applications in forensic genetics.

    PubMed

    Børsting, Claus; Morling, Niels

    2015-09-01

    It has been almost a decade since the first next generation sequencing (NGS) technologies emerged and quickly changed the way genetic research is conducted. Today, full genomes are mapped and published almost weekly and with ever increasing speed and decreasing costs. NGS methods and platforms have matured during the last 10 years, and the quality of the sequences has reached a level where NGS is used in clinical diagnostics of humans. Forensic genetic laboratories have also explored NGS technologies and especially in the last year, there has been a small explosion in the number of scientific articles and presentations at conferences with forensic aspects of NGS. These contributions have demonstrated that NGS offers new possibilities for forensic genetic case work. More information may be obtained from unique samples in a single experiment by analyzing combinations of markers (STRs, SNPs, insertion/deletions, mRNA) that cannot be analyzed simultaneously with the standard PCR-CE methods used today. The true variation in core forensic STR loci has been uncovered, and previously unknown STR alleles have been discovered. The detailed sequence information may aid mixture interpretation and will increase the statistical weight of the evidence. In this review, we will give an introduction to NGS and single-molecule sequencing, and we will discuss the possible applications of NGS in forensic genetics.

  14. Revealing the Complexity of Breast Cancer by Next Generation Sequencing

    PubMed Central

    Verigos, John; Magklara, Angeliki

    2015-01-01

    Over the last few years the increasing usage of “-omic” platforms, supported by next-generation sequencing, in the analysis of breast cancer samples has tremendously advanced our understanding of the disease. New driver and passenger mutations, rare chromosomal rearrangements and other genomic aberrations identified by whole genome and exome sequencing are providing missing pieces of the genomic architecture of breast cancer. High resolution maps of breast cancer methylomes and sequencing of the miRNA microworld are beginning to paint the epigenomic landscape of the disease. Transcriptomic profiling is giving us a glimpse into the gene regulatory networks that govern the fate of the breast cancer cell. At the same time, integrative analysis of sequencing data confirms an extensive intertumor and intratumor heterogeneity and plasticity in breast cancer arguing for a new approach to the problem. In this review, we report on the latest findings on the molecular characterization of breast cancer using NGS technologies, and we discuss their potential implications for the improvement of existing therapies. PMID:26561834

  15. Next-generation sequencing for mitochondrial disorders

    PubMed Central

    Carroll, C J; Brilhante, V; Suomalainen, A

    2014-01-01

    A great deal of our understanding of mitochondrial function has come from studies of inherited mitochondrial diseases, but the majority of patients still lack a molecular diagnosis. Furthermore, effective treatments for mitochondrial disorders do not exist. Development of therapies has been complicated by the fact that the diseases are extremely heterogeneous, and collecting large enough cohorts of similarly affected individuals to assess new therapies properly has been difficult. Next-generation sequencing technologies have in the last few years been shown to be an effective method for the genetic diagnosis of inherited mitochondrial diseases. Here we review the strategies and findings from studies applying next-generation sequencing methods for the genetic diagnosis of mitochondrial disorders. Detailed knowledge of molecular causes also enables collection of homogeneous cohorts of patients for therapy trials, and therefore boosts the development of interventions. Linked Articles: This article is part of a themed issue on Mitochondrial Pharmacology: Energy, Injury & Beyond. To view the other articles in this issue visit http://dx.doi.org/10.1111/bph.2014.171.issue-8 PMID:24138576

  16. deepTools: a flexible platform for exploring deep-sequencing data.

    PubMed

    Ramírez, Fidel; Dündar, Friederike; Diehl, Sarah; Grüning, Björn A; Manke, Thomas

    2014-07-01

    We present a Galaxy based web server for processing and visualizing deeply sequenced data. The web server's core functionality consists of a suite of newly developed tools, called deepTools, that enable users with little bioinformatic background to explore the results of their sequencing experiments in a standardized setting. Users can upload pre-processed files with continuous data in standard formats and generate heatmaps and summary plots in a straight-forward, yet highly customizable manner. In addition, we offer several tools for the analysis of files containing aligned reads and enable efficient and reproducible generation of normalized coverage files. As a modular and open-source platform, deepTools can easily be expanded and customized to future demands and developments. The deepTools webserver is freely available at http://deeptools.ie-freiburg.mpg.de and is accompanied by extensive documentation and tutorials aimed at conveying the principles of deep-sequencing data analysis. The web server can be used without registration. deepTools can be installed locally either stand-alone or as part of Galaxy.
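
    To make the notion of a normalized coverage file concrete, here is a minimal Python sketch of one common normalization, counts per million (CPM) per fixed-size bin. It is not the deepTools implementation and does not use its API; the function name, the half-open (start, end) read representation, and the default bin size are assumptions made for this example.

```python
def binned_cpm(reads, chrom_length, bin_size=50):
    """Return per-bin coverage scaled to counts per million mapped reads.

    reads: list of (start, end) half-open alignment intervals on one
    chromosome (hypothetical input format for this sketch).
    """
    n_bins = (chrom_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for start, end in reads:
        first = start // bin_size
        last = (end - 1) // bin_size
        # A read contributes one count to every bin it overlaps.
        for b in range(first, min(last, n_bins - 1) + 1):
            counts[b] += 1
    # Scale so that samples with different sequencing depth are comparable.
    scale = 1e6 / len(reads) if reads else 0.0
    return [c * scale for c in counts]


example_reads = [(0, 30), (40, 60), (90, 100)]
cpm = binned_cpm(example_reads, chrom_length=100, bin_size=50)
```

    In this toy input the second read spans the bin boundary at position 50, so it is counted in both bins.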

  17. Microfluidics for genome-wide studies involving next generation sequencing

    PubMed Central

    Murphy, Travis W.; Lu, Chang

    2017-01-01

    Next-generation sequencing (NGS) has revolutionized how molecular biology studies are conducted. Its decreasing cost and increasing throughput permit profiling of genomic, transcriptomic, and epigenomic features for a wide range of applications. Microfluidics has been proven to be highly complementary to NGS technology with its unique capabilities for handling small volumes of samples and providing platforms for automation, integration, and multiplexing. In this article, we review recent progress on applying microfluidics to facilitate genome-wide studies. We emphasize several technical aspects of NGS and how they benefit from coupling with microfluidic technology. We also summarize recent efforts on developing microfluidic technology for genomic, transcriptomic, and epigenomic studies, with emphasis on single cell analysis. We envision rapid growth in these directions, driven by the need to test scarce primary cell samples from patients in the context of precision medicine.

  18. Initial steps towards a production platform for DNA sequence analysis on the grid

    PubMed Central

    2010-01-01

    Background: Bioinformatics is confronted with a new data explosion due to the availability of high-throughput DNA sequencers. Data storage and analysis become a problem on local servers, so a switch to other IT infrastructures is needed. Grid and workflow technology can help to handle the data more efficiently, as well as facilitate collaborations. However, interfaces to grids are often unfriendly to novice users. Results: In this study we reused a platform that was developed in the VL-e project for the analysis of medical images. Data transfer, workflow execution and job monitoring are operated from one graphical interface. We developed workflows for two sequence alignment tools (BLAST and BLAT) as a proof of concept. The analysis time was significantly reduced. All workflows and executables are available for the members of the Dutch Life Science Grid and the VL-e Medical virtual organizations. All components are open source and can be transported to other grid infrastructures. Conclusions: The availability of in-house expertise and tools facilitates the usage of grid resources by new users. Our first results indicate that this is a practical, powerful and scalable solution to address the capacity and collaboration issues raised by the deployment of next generation sequencers. We currently adopt this methodology on a daily basis for DNA sequencing and other applications. More information and source code are available via http://www.bioinformaticslaboratory.nl/ PMID:21156038

  19. Periodic binary sequence generators: VLSI circuits considerations

    NASA Technical Reports Server (NTRS)

    Perlman, M.

    1984-01-01

    Feedback shift registers are efficient periodic binary sequence generators. Polynomials of degree r over the Galois field of characteristic 2, GF(2), characterize the behavior of shift registers with linear logic feedback. The algorithmic determination of the trinomial of lowest degree, when it exists, that contains a given irreducible polynomial over GF(2) as a factor is presented. This corresponds to embedding the behavior of an r-stage shift register with linear logic feedback into that of an n-stage shift register with a single two-input modulo 2 summer (i.e., Exclusive-OR gate) in its feedback. This leads to Very Large Scale Integrated (VLSI) circuit architecture of maximal regularity (i.e., identical cells) with intercell communications serialized to a maximal degree.
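
    As a rough illustration of the trinomial embedding described above, the following Python sketch searches by brute force for the lowest-degree trinomial x^n + x^k + 1 over GF(2) that a given polynomial divides. It is not the algorithm of the report, which the abstract does not spell out; polynomials are encoded as integer bit masks (bit i is the coefficient of x^i), and the function name and the max_degree search bound are assumptions made for this example.

```python
def lowest_trinomial_multiple(p, max_degree=64):
    """Find the smallest n (and a matching k) with p | x^n + x^k + 1 over GF(2).

    p is a polynomial encoded as an int bit mask, e.g. 0b1011 = x^3 + x + 1.
    Returns (n, k) with 0 < k < n, or None if nothing is found up to
    max_degree (an arbitrary bound chosen for this sketch).
    """
    deg_p = p.bit_length() - 1
    if deg_p < 1:
        raise ValueError("p must have degree at least 1")
    powers = [1]  # powers[i] holds x^i reduced modulo p
    for n in range(1, max_degree + 1):
        nxt = powers[-1] << 1          # multiply by x
        if (nxt >> deg_p) & 1:         # degree reached deg(p): reduce once
            nxt ^= p
        powers.append(nxt)
        # x^n + x^k + 1 = 0 (mod p)  <=>  x^k = x^n + 1 (mod p)
        target = nxt ^ 1
        for k in range(1, n):
            if powers[k] == target:
                return n, k
    return None


print(lowest_trinomial_multiple(0b111101))  # x^5 + x^4 + x^3 + x^2 + 1
print(lowest_trinomial_multiple(0b11111))   # x^4 + x^3 + x^2 + x + 1
```

    For x^5 + x^4 + x^3 + x^2 + 1 the search returns (8, 5), since x^8 + x^5 + 1 = (x^5 + x^4 + x^3 + x^2 + 1)(x^3 + x^2 + 1). For x^4 + x^3 + x^2 + x + 1 it returns None: the powers of x modulo this polynomial repeat with period 5 and no two of them sum to 1, which is an instance of the "when it exists" caveat.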

  20. Long period pseudo random number sequence generator

    NASA Technical Reports Server (NTRS)

    Wang, Charles C. (Inventor)

    1989-01-01

    A circuit for generating a sequence of pseudo random numbers, (A sub K). There is an exponentiator in GF(2 sup m) for the normal basis representation of elements in a finite field GF(2 sup m) each represented by m binary digits and having two inputs and an output from which the sequence (A sub K) of pseudo random numbers is taken. One of the two inputs is connected to receive the outputs (E sub K) of a maximal length shift register of n stages. There is a switch having a pair of inputs and an output. The switch output is connected to the other of the two inputs of the exponentiator. One of the switch inputs is connected for initially receiving a primitive element (A sub O) in GF(2 sup m). Finally, there is a delay circuit having an input and an output. The delay circuit output is connected to the other of the switch inputs and the delay circuit input is connected to the output of the exponentiator. Whereby after the exponentiator initially receives the primitive element (A sub O) in GF(2 sup m) through the switch, the switch can be switched to cause the exponentiator to receive as its input a delayed output A(K-1) from the exponentiator thereby generating (A sub K) continuously at the output of the exponentiator. The exponentiator in GF(2 sup m) is novel and comprises a cyclic-shift circuit; a Massey-Omura multiplier; and, a control logic circuit all operably connected together to perform the function U(sub i) = 92(sup i) (for n(sub i) = 1) or 1 (for n(sub i) = 0).

  1. Applicability of Next Generation Sequencing Technology in Microsatellite Instability Testing

    PubMed Central

    Gan, Chun; Love, Clare; Beshay, Victoria; Macrae, Finlay; Fox, Stephen; Waring, Paul; Taylor, Graham

    2015-01-01

    Microsatellite instability (MSI) is a useful marker for risk assessment, prediction of chemotherapy responsiveness and prognosis in patients with colorectal cancer. Here, we describe a next generation sequencing approach for MSI testing using the MiSeq platform. Different from other MSI capturing strategies that are based on targeted gene capture, we utilize “deep resequencing”, where we focus the sequencing on only the microsatellite regions of interest. We sequenced a series of 44 colorectal tumours with normal controls for five MSI loci (BAT25, BAT26, BAT34c4, D18S55, D5S346) and a second series of six colorectal tumours (no control) with two mononucleotide loci (BAT25, BAT26). In the first series, we were able to determine 17 MSI-High, 1 MSI-Low and 26 microsatellite stable (MSS) tumours. In the second series, there were three MSI-High and three MSS tumours. Although there was some variation within individual markers, this NGS method produced the same overall MSI status for each tumour, as obtained with the traditional multiplex PCR-based method. PMID:25685876

  2. Next-Generation Phylogeography: A Targeted Approach for Multilocus Sequencing of Non-Model Organisms

    PubMed Central

    Puritz, Jonathan B.; Addison, Jason A.; Toonen, Robert J.

    2012-01-01

    The field of phylogeography has long since realized the need and utility of incorporating nuclear DNA (nDNA) sequences into analyses. However, the use of nDNA sequence data, at the population level, has been hindered by technical laboratory difficulty, sequencing costs, and problematic analytical methods dealing with genotypic sequence data, especially in non-model organisms. Here, we present a method utilizing the 454 GS-FLX Titanium pyrosequencing platform with the capacity to simultaneously sequence two species of sea star (Meridiastra calcar and Parvulastra exigua) at five different nDNA loci across 16 different populations of 20 individuals each per species. We compare results from 3 populations with traditional Sanger sequencing based methods, and demonstrate that this next-generation sequencing platform is more time and cost effective and more sensitive to rare variants than Sanger based sequencing. A crucial advantage is that the high coverage of clonally amplified sequences simplifies haplotype determination, even in highly polymorphic species. This targeted next-generation approach can greatly increase the use of nDNA sequence loci in phylogeographic and population genetic studies by mitigating many of the time, cost, and analytical issues associated with highly polymorphic, diploid sequence markers. PMID:22470543

  3. Minimum information for reporting next generation sequence genotyping (MIRING): Guidelines for reporting HLA and KIR genotyping via next generation sequencing.

    PubMed

    Mack, Steven J; Milius, Robert P; Gifford, Benjamin D; Sauter, Jürgen; Hofmann, Jan; Osoegawa, Kazutoyo; Robinson, James; Groeneweg, Mathijs; Turenchalk, Gregory S; Adai, Alex; Holcomb, Cherie; Rozemuller, Erik H; Penning, Maarten T; Heuer, Michael L; Wang, Chunlin; Salit, Marc L; Schmidt, Alexander H; Parham, Peter R; Müller, Carlheinz; Hague, Tim; Fischer, Gottfried; Fernandez-Viňa, Marcelo; Hollenbach, Jill A; Norman, Paul J; Maiers, Martin

    2015-12-01

    The development of next-generation sequencing (NGS) technologies for HLA and KIR genotyping is rapidly advancing knowledge of genetic variation of these highly polymorphic loci. NGS genotyping is poised to replace older methods for clinical use, but standard methods for reporting and exchanging these new, high quality genotype data are needed. The Immunogenomic NGS Consortium, a broad collaboration of histocompatibility and immunogenetics clinicians, researchers, instrument manufacturers and software developers, has developed the Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) reporting guidelines. MIRING is a checklist that specifies the content of NGS genotyping results as well as a set of messaging guidelines for reporting the results. A MIRING message includes five categories of structured information - message annotation, reference context, full genotype, consensus sequence and novel polymorphism - and references to three categories of accessory information - NGS platform documentation, read processing documentation and primary data. These eight categories of information ensure the long-term portability and broad application of this NGS data for all current histocompatibility and immunogenetics use cases. In addition, MIRING can be extended to allow the reporting of genotype data generated using pre-NGS technologies. Because genotyping results reported using MIRING are easily updated in accordance with reference and nomenclature databases, MIRING represents a bold departure from previous methods of reporting HLA and KIR genotyping results, which have provided static and less-portable data. More information about MIRING can be found online at miring.immunogenomics.org.
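
    The eight categories listed above lend themselves to a simple completeness check. The Python sketch below is only an illustration: the key names are snake_case renderings of the category names in the abstract, the dictionary layout is hypothetical, and none of it reflects the actual MIRING message format or schema.

```python
# Five categories of structured information named in the abstract.
STRUCTURED = (
    "message_annotation",
    "reference_context",
    "full_genotype",
    "consensus_sequence",
    "novel_polymorphism",
)

# Three categories of accessory information that a message references.
ACCESSORY = (
    "platform_documentation",
    "read_processing_documentation",
    "primary_data",
)


def missing_categories(message):
    """Return the category keys that are absent or empty in a message dict."""
    return [key for key in STRUCTURED + ACCESSORY if not message.get(key)]


# Placeholder values only; no real genotype data is implied.
draft = {key: "placeholder" for key in STRUCTURED}
draft["platform_documentation"] = "placeholder"

print(missing_categories(draft))
```

    With this draft, the check reports the two accessory references that have not been filled in yet.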

  4. The Molecular Blueprint of a Fungus by Next-Generation Sequencing (NGS).

    PubMed

    Grumaz, Christian; Kirstahler, Philipp; Sohn, Kai

    2017-01-01

    Sequencing the whole genome of an organism is invaluable for its comprehensive molecular characterization and has been drastically facilitated by the advent of high-throughput sequencing techniques. Especially in clinical microbiology the impact of sequenced strains increases as resistance and virulence markers can easily be detected. Here, we describe a combined approach for sequencing a fungal genome and transcriptome from initial nucleic acid isolation through the generation of ready-to-load DNA libraries for the Illumina platform and the final step of genome assembly with subsequent gene annotation.

  5. Deep sequencing analysis of phage libraries using Illumina platform.

    PubMed

    Matochko, Wadim L; Chu, Kiki; Jin, Bingjie; Lee, Sam W; Whitesides, George M; Derda, Ratmir

    2012-09-01

    This paper presents an analysis of phage-displayed libraries of peptides using Illumina. We describe steps for the preparation of short DNA fragments for deep sequencing and MATLAB software for the analysis of the results. Screening of peptide libraries displayed on the surface of bacteriophage (phage display) can be used to discover peptides that bind to any target. The key step in this discovery is the analysis of peptide sequences present in the library. This analysis is usually performed by Sanger sequencing, which is labor intensive and limited to examination of a few hundred phage clones. On the other hand, Illumina deep-sequencing technology can characterize over 10^7 reads in a single run. We applied Illumina sequencing to analyze phage libraries. Using PCR, we isolated the variable regions from M13KE phage vectors from a phage display library. The PCR primers contained (i) sequences flanking the variable region, (ii) barcodes, and (iii) variable 5'-terminal region. We used this approach to examine how diversity of peptides in phage display libraries changes as a result of amplification of libraries in bacteria. Using HiSeq single-end Illumina sequencing of these fragments, we acquired over 2×10^7 reads, 57 base pairs (bp) in length. Each read contained information about the barcode (6 bp), one complementary region (12 bp) and a variable region (36 bp). We applied this sequencing to a model library of 10^6 unique clones and observed that amplification enriches ∼150 clones, which dominate ∼20% of the library. Deep sequencing, for the first time, characterized the collapse of diversity in phage libraries. The results suggest that screens based on repeated amplification and small-scale sequencing identify a few binding clones and miss thousands of useful clones. The deep sequencing approach described here could identify under-represented clones in phage screens. It could also be instrumental in developing new screening strategies, which can preserve
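
    The read structure described above suggests a simple way to tally clones. The following Python sketch is an illustration only, not the authors' MATLAB software: it assumes each read is laid out as barcode (6 bp), constant region (12 bp), then variable region (36 bp), with any remaining bases ignored. That ordering, the function names, and the synthetic reads are all assumptions made for this example.

```python
from collections import Counter

BARCODE_LEN, CONST_LEN, VAR_LEN = 6, 12, 36  # lengths quoted in the abstract


def split_read(read):
    """Split a read into (barcode, constant, variable); None if too short.

    The field order is an assumption made for this sketch.
    """
    if len(read) < BARCODE_LEN + CONST_LEN + VAR_LEN:
        return None
    barcode = read[:BARCODE_LEN]
    constant = read[BARCODE_LEN:BARCODE_LEN + CONST_LEN]
    variable = read[BARCODE_LEN + CONST_LEN:BARCODE_LEN + CONST_LEN + VAR_LEN]
    return barcode, constant, variable


def clone_counts(reads, barcode):
    """Count variable-region sequences among reads carrying one barcode."""
    counts = Counter()
    for read in reads:
        parts = split_read(read)
        if parts is not None and parts[0] == barcode:
            counts[parts[2]] += 1
    return counts


def top_fraction(counts, n_top):
    """Fraction of all counted reads taken up by the n_top most common clones."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for _, c in counts.most_common(n_top)) / total


# Synthetic 57-base reads: one over-represented clone and two rare ones.
bc, const = "ACGTAC", "G" * 12
reads = (
    [bc + const + "A" * 36 + "TTT"] * 3
    + [bc + const + "C" * 36 + "TTT"]
    + [bc + const + "T" * 36 + "TTT"]
    + ["TTTTTT" + const + "A" * 36 + "TTT"]  # different barcode, skipped
    + ["ACGT"]                                # too short, skipped
)
counts = clone_counts(reads, bc)
print(top_fraction(counts, 1))
```

    Applied to real data, a statistic like top_fraction would express the kind of observation reported in the abstract, where a small number of clones accounts for a large share of the amplified library.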

  6. Strategies for complete mitochondrial genome sequencing on Ion Torrent PGM™ platform in forensic sciences.

    PubMed

    Zhou, Yishu; Guo, Fei; Yu, Jiao; Liu, Feng; Zhao, Jinling; Shen, Hongying; Zhao, Bin; Jia, Fei; Sun, Zhu; Song, He; Jiang, Xianhua

    2016-05-01

    Next generation sequencing (NGS) is a time-saving and cost-efficient method to detect the complete mitochondrial genome (mtGenome) compared to Sanger sequencing. In this study we focused on developing strategies for mtGenome sequencing on the Ion Torrent PGM™ platform and NGS data analysis. In our experience, 4, 15 and 30 samples could be loaded onto Ion 314™, Ion 316™ and Ion 318™ chips respectively at a pooling concentration of 26 pM, achieving sufficient average coverage of ≥1500× and a good strand balance of 1.05. Data processing software is essential to the analysis of large NGS data sets. In-house Perl scripts were developed for primary data analysis to screen out uncertain positions and samples from variant call format (VCF) reports and for pedigree study to perform pairwise comparisons. The Integrative Genomics Viewer (IGV) and the NextGENe software were used for secondary data analysis. mthap and EMMA were employed for haplogroup assignment. The dataset was reviewed and approved by EMPOP as the final version, which showed a 2.66% error rate generated by the Torrent Variant Caller (TVC). Across the mtGenome, 4022 variants were found at 725 nucleotide positions, where the ratio of transitions to transversions was estimated at 20.89:1 and 22.18% of variants were concentrated at hypervariable segments I and II (HVS-I and HVS-II). In total, 107 complete mtGenome haplotypes were observed from 107 Northern Chinese Han and assigned to 88 haplogroups. The random match probability (RMP) of complete mtGenome was calculated as 0.009345794, decreasing 26.19% by comparison to that of HVS-I only, and the haplotype diversity (HD) was evaluated as 1, increasing 0.33% by comparison to that of HVS-I only. Principal component analysis (PCA) showed that our population clustered with East and Southeast Asians. The strategies in this study are suitable for complete mtGenome sequencing on Ion Torrent PGM™ platform and Northern Chinese Han (EMP00670) is the first
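
    The reported RMP and HD values can be reproduced with a short Python sketch, assuming the commonly used definitions RMP = sum of squared haplotype frequencies and HD = n/(n-1) * (1 - RMP). The abstract does not state which formulas were used, so both definitions are assumptions here, and the function name and sample labels are hypothetical.

```python
from collections import Counter


def rmp_and_hd(haplotypes):
    """Return (random match probability, haplotype diversity) for a sample.

    Assumed definitions: RMP = sum(p_i^2) and HD = n / (n - 1) * (1 - RMP),
    where p_i is the sample frequency of haplotype i and n the sample size.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two samples")
    counts = Counter(haplotypes)
    rmp = sum((c / n) ** 2 for c in counts.values())
    hd = n / (n - 1) * (1 - rmp)
    return rmp, hd


# 107 individuals, each carrying a distinct haplotype, as in the abstract.
sample = [f"hap{i}" for i in range(107)]
rmp, hd = rmp_and_hd(sample)
print(rmp, hd)
```

    With 107 distinct haplotypes in 107 individuals, these definitions give RMP = 1/107, which matches the reported 0.009345794, and HD = 1.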

  7. Preparation of SELEX Samples for Next-Generation Sequencing.

    PubMed

    Tolle, Fabian; Mayer, Günter

    2016-01-01

    Fuelled by massive whole genome sequencing projects such as the human genome project, enormous technological advancements and therefore tremendous price drops could be achieved, rendering next-generation sequencing very attractive for deep sequencing of SELEX libraries. Herein we describe the preparation of SELEX samples for Illumina sequencing, based on the already established whole genome sequencing workflow. We describe the addition of barcode sequences for multiplexing and the adapter ligation, avoiding associated pitfalls.

  8. The impact of next-generation sequencing on genomics

    PubMed Central

    Zhang, Jun; Chiodini, Rod; Badr, Ahmed; Zhang, Genfa

    2011-01-01

    This article reviews basic concepts, general applications, and the potential impact of next-generation sequencing (NGS) technologies on genomics, with particular reference to currently available and possible future platforms and bioinformatics. NGS technologies have demonstrated the capacity to sequence DNA at unprecedented speed, thereby enabling previously unimaginable scientific achievements and novel biological applications. However, the massive data produced by NGS also present a significant challenge for data storage, analysis, and management solutions. Advanced bioinformatic tools are essential for the successful application of NGS technology. As evidenced throughout this review, NGS technologies will have a striking impact on genomic research and the entire biological field. With its ability to tackle challenges left unsolved by previous genomic technologies, NGS is likely to unravel the complexity of the human genome in terms of genetic variations, some of which may be confined to susceptible loci for some common human conditions. The impact of NGS technologies on genomics will be far reaching and likely change the field for years to come. PMID:21477781

  9. Next-Generation Sequencing: A Review of Technologies and Tools for Wound Microbiome Research

    PubMed Central

    Hodkinson, Brendan P.; Grice, Elizabeth A.

    2015-01-01

    Significance: The colonization of wounds by specific microbes or communities of microbes may delay healing and/or lead to infection-related complications. Studies of wound-associated microbial communities (microbiomes) to date have primarily relied upon culture-based methods, which are known to have extreme biases and are not reliable for the characterization of microbiomes. Biofilms are very resistant to culture and are therefore especially difficult to study with techniques that remain standard in clinical settings. Recent Advances: Culture-independent approaches employing next-generation DNA sequencing have provided researchers and clinicians a window into wound-associated microbiomes that could not be achieved before and have begun to transform our view of wound-associated biodiversity. Within the past decade, many platforms have arisen for performing this type of sequencing, with various types of applications for microbiome research being possible on each. Critical Issues: Wound care incorporating knowledge of microbiomes gained from next-generation sequencing could guide clinical management and treatments. The purpose of this review is to outline the current platforms, their applications, and the steps necessary to undertake microbiome studies using next-generation sequencing. Future Directions: As DNA sequencing technology progresses, platforms will continue to produce longer reads and more reads per run at lower costs. A major future challenge is to implement these technologies in clinical settings for more precise and rapid identification of wound bioburden. PMID:25566414

  10. Generating Functions for the Powers of Fibonacci Sequences

    ERIC Educational Resources Information Center

    Terrana, D.; Chen, H.

    2007-01-01

    In this note, based on the Binet formulas and the power-reducing techniques, closed forms of generating functions for the powers of Fibonacci sequences are presented. The corresponding results are extended to some other famous sequences as well.
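
    As a concrete instance of such a closed form, the generating function of the squared Fibonacci numbers is commonly given as x(1 - x) / (1 - 2x - 2x^2 + x^3). The Python sketch below expands that rational function as a power series and compares the coefficients with F_n^2. The identity is a standard one quoted here for illustration and is not copied from the note; the helper names are hypothetical.

```python
def fib(n):
    """Return the n-th Fibonacci number with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def series(num, den, terms):
    """Power-series coefficients of num(x) / den(x).

    num and den are coefficient lists in ascending powers of x, and
    den[0] must equal 1 so that all coefficients stay integers.
    """
    if den[0] != 1:
        raise ValueError("den[0] must be 1")
    out = []
    for n in range(terms):
        c = num[n] if n < len(num) else 0
        # From A(x) * den(x) = num(x): a_n = num_n - sum_{j>=1} den_j * a_{n-j}
        for j in range(1, min(n, len(den) - 1) + 1):
            c -= den[j] * out[n - j]
        out.append(c)
    return out


# x - x^2 over 1 - 2x - 2x^2 + x^3
squares = series([0, 1, -1], [1, -2, -2, 1], 12)
assert squares == [fib(n) ** 2 for n in range(12)]
```

    The denominator encodes the recurrence a_n = 2a_{n-1} + 2a_{n-2} - a_{n-3} satisfied by the squares, which is where power-reducing arguments of the kind mentioned in the note typically lead.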

  11. Historical perspective, development and applications of next-generation sequencing in plant virology.

    PubMed

    Barba, Marina; Czosnek, Henryk; Hadidi, Ahmed

    2014-01-06

    Next-generation high throughput sequencing technologies became available at the onset of the 21st century. They provide a highly efficient, rapid, and low cost DNA sequencing platform beyond the reach of the standard and traditional DNA sequencing technologies developed in the late 1970s. They are continually improved to become faster, more efficient and cheaper. They have been used in many fields of biology since 2004. In 2009, next-generation sequencing (NGS) technologies began to be applied to several areas of plant virology including virus/viroid genome sequencing, discovery and detection, ecology and epidemiology, replication and transcription. Identification and characterization of known and unknown viruses and/or viroids in infected plants are currently among the most successful applications of these technologies. It is expected that NGS will play very significant roles in many research and non-research areas of plant virology.

  12. Historical Perspective, Development and Applications of Next-Generation Sequencing in Plant Virology

    PubMed Central

    Barba, Marina; Czosnek, Henryk; Hadidi, Ahmed

    2014-01-01

    Next-generation high throughput sequencing technologies became available at the onset of the 21st century. They provide a highly efficient, rapid, and low cost DNA sequencing platform beyond the reach of the standard and traditional DNA sequencing technologies developed in the late 1970s. They are continually improved to become faster, more efficient and cheaper. They have been used in many fields of biology since 2004. In 2009, next-generation sequencing (NGS) technologies began to be applied to several areas of plant virology including virus/viroid genome sequencing, discovery and detection, ecology and epidemiology, replication and transcription. Identification and characterization of known and unknown viruses and/or viroids in infected plants are currently among the most successful applications of these technologies. It is expected that NGS will play very significant roles in many research and non-research areas of plant virology. PMID:24399207

  13. Polynomials Generated by the Fibonacci Sequence

    NASA Astrophysics Data System (ADS)

    Garth, David; Mills, Donald; Mitchell, Patrick

    2007-06-01

    The Fibonacci sequence's initial terms are F_0=0 and F_1=1, with F_n=F_{n-1}+F_{n-2} for n>=2. We define the polynomial sequence p by setting p_0(x)=1 and p_n(x)=x*p_{n-1}(x)+F_{n+1} for n>=1, so that p_n(x)=sum_{k=0}^{n} F_{k+1}x^{n-k}. We call p_n(x) the Fibonacci-coefficient polynomial (FCP) of order n. The FCP sequence is distinct from the well-known Fibonacci polynomial sequence. We answer several questions regarding these polynomials. Specifically, we show that each even-degree FCP has no real zeros, while each odd-degree FCP has a unique, and (for degree at least 3) irrational, real zero. Further, we show that this sequence of unique real zeros converges monotonically to the negative of the golden ratio. Using Rouche's theorem, we prove that the zeros of the FCPs approach the golden ratio in modulus. We also prove a general result that gives the Mahler measures of an infinite subsequence of the FCP sequence whose coefficients are reduced modulo an integer m>=2. We then apply this to the case that m=L_n, the nth Lucas number, showing that the Mahler measure of the subsequence is phi^{n-1}, where phi=(1+sqrt(5))/2.
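
    The recurrence and the real-zero claim in this abstract can be explored numerically. The following is a minimal sketch, not the authors' method: the function names (fib, fcp_coeffs, fcp_eval, real_zero_odd) are hypothetical, and the bisection bracket [-2, 0] is an assumption that the code checks for a sign change before using it.

```python
def fib(n):
    """Return F_n, with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fcp_coeffs(n):
    """Coefficients of p_n(x), highest power first: F_1, F_2, ..., F_{n+1}."""
    return [fib(k + 1) for k in range(n + 1)]


def fcp_eval(n, x):
    """Evaluate p_n(x) with the recurrence p_m = x*p_{m-1} + F_{m+1}, p_0 = 1."""
    value = 1.0
    for m in range(1, n + 1):
        value = x * value + fib(m + 1)
    return value


def real_zero_odd(n, lo=-2.0, hi=0.0, iters=100):
    """Bisection for the real zero of an odd-degree FCP.

    The default bracket [lo, hi] is an assumption; it is verified below.
    """
    if n % 2 == 0:
        raise ValueError("even-degree FCPs have no real zeros")
    if not (fcp_eval(n, lo) < 0.0 < fcp_eval(n, hi)):
        raise ValueError("bracket does not contain a sign change")
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if fcp_eval(n, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Print the real zero for a few odd degrees; compare with -(1 + sqrt(5)) / 2.
    for n in (1, 3, 5, 7, 21):
        print(n, real_zero_odd(n))
```

    Running the script prints one zero per odd degree, which can be compared by eye against the negative of the golden ratio stated in the abstract.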

  14. A research roadmap for next-generation sequencing informatics.

    PubMed

    Altman, Russ B; Prabhu, Snehit; Sidow, Arend; Zook, Justin M; Goldfeder, Rachel; Litwack, David; Ashley, Euan; Asimenos, George; Bustamante, Carlos D; Donigan, Katherine; Giacomini, Kathleen M; Johansen, Elaine; Khuri, Natalia; Lee, Eunice; Liang, Xueying Sharon; Salit, Marc; Serang, Omar; Tezak, Zivana; Wall, Dennis P; Mansfield, Elizabeth; Kass-Hout, Taha

    2016-04-20

    Next-generation sequencing technologies are fueling a wave of new diagnostic tests. Progress on a key set of nine research challenge areas will help generate the knowledge required to advance effectively these diagnostics to the clinic.

  15. Simulations Using Random-Generated DNA and RNA Sequences

    ERIC Educational Resources Information Center

    Bryce, C. F. A.

    1977-01-01

    Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…
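
    The classroom exercise described here can be approximated in a few lines of modern code. The sketch below uses Python rather than the BASIC of the original program, and every name in it (random_sequence, complement, base_composition, codon_counts) is a hypothetical stand-in, not taken from the article.

```python
import random
from collections import Counter

# Watson-Crick pairing for DNA; an RNA table would use U in place of T.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def random_sequence(length, alphabet="ACGT", seed=None):
    """Draw `length` bases uniformly at random; pass alphabet="ACGU" for RNA."""
    rng = random.Random(seed)
    return "".join(rng.choice(alphabet) for _ in range(length))


def complement(dna):
    """Return the base-by-base complement of a DNA string."""
    return "".join(DNA_COMPLEMENT[base] for base in dna)


def base_composition(seq):
    """Fraction of each base present in the sequence."""
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in sorted(counts)}


def codon_counts(seq):
    """Count non-overlapping triplets in reading frame 0, ignoring a partial tail."""
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


if __name__ == "__main__":
    dna = random_sequence(30, seed=1977)
    print(dna)
    print(complement(dna))
    print(base_composition(dna))
    print(codon_counts(dna).most_common(3))
```

    Fixing the seed makes a given student worksheet reproducible; leaving it as None gives a fresh sequence on every run.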

  16. Next generation sequencing for neurological diseases: New hope or new hype?

    PubMed Central

    Keogh, M.J.; Chinnery, P.F.

    2013-01-01

    Over the past year huge advances have been made in our ability to determine the genetic aetiology of many neurological diseases through the utilisation of next generation sequencing platforms. This technology is, on a daily basis, providing new breakthroughs in neurological disease. The aim of this article is to clearly describe the technological platforms, methods of data analysis, established breakthroughs, and potential future clinical and research applications of this innovative and exciting technique which has relevance to all those working within clinical neuroscience. PMID:23200550

  17. Learning gene regulatory networks from next generation sequencing data.

    PubMed

    Jia, Bochao; Xu, Suwa; Xiao, Guanghua; Lamba, Vishal; Liang, Faming

    2017-03-10

    In recent years, next generation sequencing (NGS) has gradually replaced microarrays as the major platform for measuring gene expression. Compared to microarrays, NGS has many advantages, such as less noise and higher throughput. However, the discreteness of NGS data also challenges existing statistical methodology. In particular, the literature still lacks an appropriate statistical method for reconstructing gene regulatory networks from NGS data. The existing local Poisson graphical model method is not consistent and can only infer certain local structures of the network. In this article, we propose a random effect model-based transformation to continuize NGS data; we then transform the continuized data to Gaussian via a semiparametric transformation and apply an equivalent partial correlation selection method to reconstruct gene regulatory networks. The proposed method is consistent. The numerical results indicate that the proposed method can lead to much more accurate inference of gene regulatory networks than the local Poisson graphical model and other existing methods. The proposed data-continuized transformation fills the theoretical gap for how to transform discrete data to continuous data and facilitates NGS data analysis. The proposed data-continuized transformation also makes it feasible to integrate different types of data, such as microarray and RNA-seq data, in the reconstruction of gene regulatory networks.

  18. Image encryption using random sequence generated from generalized information domain

    NASA Astrophysics Data System (ADS)

    Xia-Yan, Zhang; Guo-Ji, Zhang; Xuan, Li; Ya-Zhou, Ren; Jie-Hua, Wu

    2016-05-01

    A novel image encryption method based on the random sequence generated from the generalized information domain and permutation-diffusion architecture is proposed. The random sequence is generated by reconstruction from the generalized information file and discrete trajectory extraction from the data stream. The trajectory address sequence is used to generate a P-box to shuffle the plain image while random sequences are treated as keystreams. A new factor called drift factor is employed to accelerate and enhance the performance of the random sequence generator. An initial value is introduced to make the encryption method an approximately one-time pad. Experimental results show that the random sequences pass the NIST statistical test with a high ratio and extensive analysis demonstrates that the new encryption scheme has superior security.

  19. Variable speed wind turbine generator with zero-sequence filter

    DOEpatents

    Muljadi, Eduard

    1998-01-01

    A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero sequence sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility.
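
    The distinction the abstract draws between positive sequence and zero sequence currents can be illustrated with the textbook symmetrical-component (Fortescue) decomposition of three phasors. This is a generic sketch and not the patent's excitation controller or filter; the function name sequence_components and the example phasor values are assumptions.

```python
import cmath
import math

# Rotation operator a = exp(j*120 degrees), standard in symmetrical components.
A = cmath.exp(2j * math.pi / 3)


def sequence_components(ia, ib, ic):
    """Return (zero, positive, negative) sequence phasors for phase currents a, b, c."""
    zero = (ia + ib + ic) / 3
    positive = (ia + A * ib + A ** 2 * ic) / 3
    negative = (ia + A ** 2 * ib + A * ic) / 3
    return zero, positive, negative


if __name__ == "__main__":
    # Balanced positive-sequence set: phase b lags a by 120 degrees, c leads a by 120.
    balanced = (1 + 0j, A ** 2, A)
    # Zero-sequence set: all three phasors identical (zero phase displacement).
    in_phase = (0.5 + 0j, 0.5 + 0j, 0.5 + 0j)
    # Superposition of the two, as in a commanded current with both components.
    combined = tuple(p + z for p, z in zip(balanced, in_phase))

    for name, currents in (("balanced", balanced),
                           ("in-phase", in_phase),
                           ("combined", combined)):
        zero, pos, neg = sequence_components(*currents)
        print(name, abs(zero), abs(pos), abs(neg))
```

    Because the three zero sequence phasors are identical, they sum to three times one phasor rather than to zero, which is why a star connection without a neutral path cannot carry them while balanced positive sequence currents pass freely.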

  20. Variable speed wind turbine generator with zero-sequence filter

    DOEpatents

    Muljadi, E.

    1998-08-25

    A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero sequence sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility. 14 figs.

  1. Variable Speed Wind Turbine Generator with Zero-sequence Filter

    DOEpatents

    Muljadi, Eduard

    1998-08-25

    A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero sequence sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility.

  2. Characterisation and Next-generation Sequencing Analysis of Unknown Arboviruses

    DTIC Science & Technology

    2012-09-01

    using techniques such as PCR-select subtraction and next-generation sequencing. Preliminary analysis of the four sequenced viruses has shown that they...HOJV) and Harrison Dam virus (HARDV), and two unknown bunyaviruses, Buffalo Creek Virus (BCV) and Maprik virus (MPKV). It describes the techniques such...unknown viruses with greater speed and at lower cost. The rapid advancement of new generation sequencing techniques allows for highly specific acquisition

  3. Depositional sequence evolution, Paleozoic and early Mesozoic of the central Saharan platform, North Africa

    SciTech Connect

    Sprague, A.R.G.

    1991-08-01

    Over 30 depositional sequences have been identified in the Paleozoic and lower Mesozoic of the Ghadames basin of eastern Algeria, southern Tunisia, and western Libya. Well logs and lithologic information from more than 500 wells were used to correlate the 30 sequences throughout the basin (total area more than 1 million km²). Based on systematic change in the log response of strata in successively younger sequences, five groups of sequences with distinctive characteristics have been identified: Cambro-Ordovician, Upper Silurian-Middle Devonian, Upper Devonian, Carboniferous, and Middle Triassic-Middle Jurassic. Each sequence group is terminated by a major, tectonically enhanced sequence boundary that is immediately overlain (except for the Carboniferous) by a shale-prone interval deposited in response to basin-wide flooding. The four Paleozoic sequence groups were deposited on the Saharan platform, a north-facing, clastic-dominated shelf that covered most of North Africa during the Paleozoic. The sequence boundary at the top of the Carboniferous sequence group is one of several Permian-Carboniferous angular unconformities in North Africa related to the Hercynian orogeny. The youngest sequence group (Middle Triassic to Middle Jurassic) is a clastic-evaporite package that onlaps southward onto the top-of-Paleozoic sequence boundary. The progressive change, from the Cambrian to the Jurassic, in the nature of the Ghadames basin sequences reflects the interplay between basin morphology and tectonics, vegetation, eustasy, climate, and sediment supply.

  4. Transcriptome Sequencing and Development of an Expression Microarray Platform for Liver Infection in Adenovirus Type 5-Infected Syrian Golden Hamsters

    PubMed Central

    Ying, Baoling; Toth, Karoly; Spencer, Jacqueline F.; Aurora, Rajeev; Wold, William S.M.

    2015-01-01

    The Syrian golden hamster is an attractive animal for research on infectious diseases and other diseases. We report here the sequencing, assembly, and annotation of the Syrian hamster transcriptome. We include transcripts from ten pooled tissues from a naïve hamster and one stimulated with lipopolysaccharide. Our data set identified 42,707 non-redundant transcripts, representing 34,191 unique genes. Based on the transcriptome data, we generated a custom microarray and used this new platform to investigate the transcriptional response in the Syrian hamster liver following intravenous adenovirus type 5 (Ad5) infection. We found that Ad5 infection caused a massive change in regulation of liver transcripts, with robust up-regulation of genes involved in the antiviral response, indicating that the innate immune response functions in the host defense against Ad5 infection of the liver. The data and novel platforms developed in this study will facilitate further development of this important animal model. PMID:26319212

  5. A high-throughput optomechanical retrieval method for sequence-verified clonal DNA from the NGS platform.

    PubMed

    Lee, Howon; Kim, Hyoki; Kim, Sungsik; Ryu, Taehoon; Kim, Hwangbeom; Bang, Duhee; Kwon, Sunghoon

    2015-02-02

    Writing DNA plays a significant role in the fields of synthetic biology, functional genomics and bioengineering. DNA clones on next-generation sequencing (NGS) platforms have the potential to be a rich and cost-effective source of sequence-verified DNAs as a precursor for DNA writing. However, it is still very challenging to retrieve target clonal DNA from high-density NGS platforms. Here we propose an enabling technology called 'Sniper Cloning' that enables the precise mapping of target clone features on NGS platforms and non-contact rapid retrieval of targets for the full utilization of DNA clones. By merging the three cutting-edge technologies of NGS, DNA microarray and our pulse laser retrieval system, Sniper Cloning is a week-long process that produces 5,188 error-free synthetic DNAs in a single run of NGS with a single microarray DNA pool. We believe that this technology has potential as a universal tool for DNA writing in biological sciences.

  6. Primer and platform effects on 16S rRNA tag sequencing

    DOE PAGES

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; ...

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  7. Primer and platform effects on 16S rRNA tag sequencing

    SciTech Connect

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  8. Effect of Next-Generation Exome Sequencing Depth for Discovery of Diagnostic Variants

    PubMed Central

    Kim, Kyung; Seong, Moon-Woo; Chung, Won-Hyong; Park, Sung Sup; Leem, Sangseob; Park, Won; Kim, Jihyun; Lee, KiYoung; Park, Rae Woong; Kim, Namshin

    2015-01-01

    Sequencing depth, which is directly related to the cost and time required for the generation, processing, and maintenance of next-generation sequencing data, is an important factor in the practical utilization of such data in clinical fields. Unfortunately, identifying an exome sequencing depth adequate for clinical use is a challenge that has not been addressed extensively. Here, we investigate the effect of exome sequencing depth on the discovery of sequence variants for clinical use. Toward this, we sequenced ten germ-line blood samples from breast cancer patients on the Illumina platform GAII(x) at a high depth of ~200×. We observed that most function-related diverse variants in the human exonic regions could be detected at a sequencing depth of 120×. Furthermore, investigation using a diagnostic gene set showed that the number of clinical variants identified using exome sequencing reached a plateau at an average sequencing depth of about 120×. Moreover, the phenomena were consistent across the breast cancer samples. PMID:26175660

  9. Next Generation Sequencing Technologies: The Doorway to the Unexplored Genomics of Non-Model Plants

    PubMed Central

    Unamba, Chibuikem I. N.; Nag, Akshay; Sharma, Ram K.

    2015-01-01

    Non-model plants, i.e., species with one or more traits such as a long life cycle, difficulty of cultivation in the laboratory, or poor fecundity, were previously left out of sequencing projects because of the high running cost of Sanger sequencing. Consequently, information about their genomics and key biological processes is inadequate. However, the recent advent of fast and cost-effective next generation sequencing (NGS) platforms has enabled the unearthing of certain characteristic gene structures unique to these species. It has also provided insight into the mechanisms underlying gene expression and secondary metabolism, and has facilitated the development of genomic resources for diversity characterization, evolutionary analysis and marker-assisted breeding even without prior availability of genomic sequence information. In this review we explore how different NGS platforms, as well as recent advances in NGS-based high-throughput genotyping technologies, are rewarding efforts on de novo whole genome/transcriptome sequencing and on the development of genome-wide, sequence-based marker resources, less costly than phenotyping, for the improvement of non-model crops. PMID:26734016

  10. Non-random DNA fragmentation in next-generation sequencing

    NASA Astrophysics Data System (ADS)

    Poptsova, Maria S.; Il'Icheva, Irina A.; Nechipurenko, Dmitry Yu.; Panchenko, Larisa A.; Khodikov, Mingian V.; Oparina, Nina Y.; Polozov, Robert V.; Nechipurenko, Yury D.; Grokhovsky, Sergei L.

    2014-03-01

    Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments and sequencing them in a massively parallel fashion. The multiple overlapping segments, termed "reads", are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, we previously showed that for sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of hydrodynamic forces on DNA produce a similar bias. Consideration of this non-random DNA fragmentation may allow one to unravel which factors influence the non-uniform coverage of various genomic regions, and to what extent.

  11. Exploring the potential of next-generation sequencing in detection of respiratory viruses.

    PubMed

    Prachayangprecha, Slinporn; Schapendonk, Claudia M E; Koopmans, Marion P; Osterhaus, Albert D M E; Schürch, Anita C; Pas, Suzan D; van der Eijk, Annemiek A; Poovorawan, Yong; Haagmans, Bart L; Smits, Saskia L

    2014-10-01

    Efficient detection of human respiratory viral pathogens is crucial in the management of patients with acute respiratory tract infection. Sequence-independent amplification of nucleic acids combined with next-generation sequencing technology and bioinformatics analyses is a promising strategy for identifying pathogens in clinical and public health settings. It allows the characterization of hundreds of different known pathogens simultaneously and of novel pathogens that elude conventional testing. However, major hurdles for its routine use exist, including cost, turnaround time, and especially sensitivity of the assay, as the detection limit is dependent on viral load, host genetic material, and sequencing depth. To obtain insights into these aspects, we analyzed nasopharyngeal aspirates from a cohort of 81 Thai children with respiratory disease for the presence of respiratory viruses using a sequence-independent next-generation sequencing approach and routinely used diagnostic real-time reverse transcriptase PCR (real-time RT-PCR) assays. With respect to the detection of rhinovirus and human metapneumovirus, the next-generation sequencing approach was at least as sensitive as diagnostic real-time RT-PCR in this small cohort, whereas for bocavirus and enterovirus, next-generation sequencing was less sensitive than real-time RT-PCR. The advantage of the sequencing approach over real-time RT-PCR was the immediate availability of virus-typing information. Considering the development of platforms capable of generating more output data at declining costs, next-generation sequencing remains of interest for future virus diagnosis in clinical and public health settings and certainly as an additional tool when screening results from real-time RT-PCR are negative.

  12. Multiple nuclear ortholog next generation sequencing phylogeny of Daucus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencing is helping to solve the data insufficiency problem hindering well-resolved dominant gene phylogenies. We used Roche 454 technology to obtain DNA sequences from 93 nuclear orthologs, dispersed throughout all linkage groups of Daucus. Of these 93 orthologs, ten were designed...

  13. Analyzing the safety of removal sequences for piles of an offshore jacket platform

    NASA Astrophysics Data System (ADS)

    Pan, Xin-Ying; Zhang, Zhao-De

    2009-12-01

    An inevitable consequence of the development of the offshore petroleum industry is the eventual obsolescence of large offshore structures. Proper methods for removal of decommissioned offshore platforms are becoming an important topic that the oil and gas industry must pay increasing attention to. While removing sections from a decommissioned jacket platform, the stability of the remaining parts is critical. The jacket danger indices D_σ and D_s defined in this paper are very useful for analyzing the safety of any procedure planned for disassembling a jacket platform. The safest pile-cutting sequence can be determined easily by comparing every column of D_σ and D_s or simply analyzing the figures of every row of D_σ and D_s.

  14. Bioelectrochemical system platform for sustainable environmental remediation and energy generation.

    PubMed

    Wang, Heming; Luo, Haiping; Fallgren, Paul H; Jin, Song; Ren, Zhiyong Jason

    2015-01-01

    The increasing awareness of the energy-environment nexus is compelling the development of technologies that reduce environmental impacts during energy production as well as energy consumption during environmental remediation. Countries spend billions on pollution cleanup projects, and new technologies with low energy and chemical consumption are needed for sustainable remediation practice. This perspective review provides a comprehensive summary of the mechanisms of the new bioelectrochemical system (BES) platform technology for efficient and low-cost remediation of contaminants including petroleum hydrocarbons, chlorinated solvents, perchlorate, azo dyes, and metals, and it also discusses potential new uses of the BES approach for remediation of some emerging contaminants, such as CO2 in air and nutrients and micropollutants in water. The unique feature of BES for environmental remediation is the use of electrodes as non-exhaustible electron acceptors, or even donors, for contaminant degradation, which requires minimal energy or chemicals and instead produces sustainable energy for monitoring and other onsite uses. BES provides both oxidation (anode) and reduction (cathode) reactions that integrate microbial-electro-chemical removal mechanisms, so complex contaminants with different characteristics can be removed. We believe the BES platform carries great potential for sustainable remediation and hope this perspective provides background and insights for future research and development.

  15. New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing

    PubMed Central

    Song, Kai; Ren, Jie; Reinert, Gesine; Deng, Minghua

    2014-01-01

    With the development of next-generation sequencing (NGS) technologies, a large amount of short read data has been generated. Assembly of these short reads can be challenging for genomes and metagenomes without template sequences, making alignment-based genome sequence comparison difficult. In addition, sequence reads from NGS can come from different regions of various genomes and they may not be alignable. Sequence signature-based methods for genome comparison based on the frequencies of word patterns in genomes and metagenomes can potentially be useful for the analysis of short reads data from NGS. Here we review the recent development of alignment-free genome and metagenome comparison based on the frequencies of word patterns with emphasis on the dissimilarity measures between sequences, the statistical power of these measures when two sequences are related and the applications of these measures to NGS data. PMID:24064230
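
    Word-frequency comparison of the kind reviewed here starts from k-mer counts. The sketch below computes the classic D2 statistic (the inner product of two k-mer count vectors) and, as a simple illustrative normalization that is an assumption rather than one of the review's measures, a cosine-style dissimilarity. All function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count every overlapping word of length k in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(x, y):
    """D2 statistic: sum over shared words of the product of their counts."""
    return sum(x[word] * y[word] for word in x.keys() & y.keys())


def cosine_dissimilarity(x, y):
    """1 - D2 / (|x| |y|); 0 for proportional count vectors, 1 for no shared words."""
    norm = sqrt(d2(x, x) * d2(y, y))
    return 1.0 - d2(x, y) / norm if norm else 1.0


if __name__ == "__main__":
    a = kmer_counts("ACGACGTTACG", 3)
    b = kmer_counts("ACGACGTAACG", 3)
    c = kmer_counts("TTTTTTTTTTT", 3)
    print(d2(a, b), cosine_dissimilarity(a, b))
    print(d2(a, c), cosine_dissimilarity(a, c))
```

    No alignment is computed at any point, so the same functions apply to unassembled short reads pooled per sample.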

  16. Comparative depositional geometries and facies within windward rimmed platform and carbonate ramp sequences

    SciTech Connect

    Boss, S.K.; Rasmussen, K.A.; Neumann, A.C.

    1992-01-01

    Northern Great Bahama Bank (NGBB) combines geomorphic aspects of rimmed platforms and carbonate ramps in a windward (high-energy) environment. Analysis of Holocene sediment cores, seismic reflection mapping of the Holocene-Pleistocene unconformity and transgressive Holocene deposits and petrographic study of excavated Holocene submarine-cemented horizons provides an integrated view of evolving depositional geometries within both rimmed platform and ramp settings. Cores display gross textural and compositional homogeneity; all sediments are medium to coarse sands comprised of composite peloids, Halimeda sp., benthic foraminifera and molluscs. Three-dimensional seismic mapping reveals that this basal unconformity exhibits variation in topographic relief related to both constructional and erosional processes; rimmed portions of the platform are associated with topographic "plateaus" with fringing eolianite ridges or (rarely) reefs. These "plateaus" are separated by a somewhat deeper (ca. 5 m deep) "trough" exhibiting little relief, but sloping seaward to form a ramp. Multiple intrasequence cemented horizons are a common feature of the thinner deposits of the NGBB ramp where tidal exchange is vigorous and sediment deposition is episodic or in dynamic balance with sediment export. Thus, rimmed carbonate platform facies are thick marine sands with relatively little submarine cementation while open, unsheltered ramp facies are characterized by thin sediment sequences containing numerous, discontinuous submarine-cemented horizons. In the absence of other obvious facies or geomorphic indicators (e.g. preserved reefal rims), the preservation of similar depositional features in ancient limestones may serve as a useful discriminant of rimmed platform versus carbonate ramp settings.

  17. Building a next generation platform for association studies in cacao

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The drastic reductions in cost and time associated with the collection of DNA sequence and genotype data have revolutionized genetic mapping in model systems (e.g. humans, Arabidopsis) and also promise to significantly enhance the power and resolution of genetic mapping in agricultural systems. Prog...

  18. Next generation barcode tagged sequencing for monitoring microbial community dynamics.

    PubMed

    Breakwell, Katy; Tetu, Sasha G; Elbourne, Liam D H

    2014-01-01

    Microbial identification using 16S rDNA variable regions has become increasingly popular over the past decade. The application of next-generation amplicon sequencing to these regions allows microbial communities to be sequenced in far greater depth than previous techniques, as well as allowing for the identification of unculturable or rare organisms within a sample. Multiplexing can be used to sequence multiple samples in tandem through the use of sample-specific identification sequences which are attached to each amplicon, making this a cost-effective method for large-scale microbial identification experiments.
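
    The multiplexing idea, sample-specific identification sequences attached to each amplicon, implies a demultiplexing step after sequencing. The sketch below assumes fixed-length barcodes at the start of each read and exact matching only; the barcodes, sample names, and the function name demultiplex are all hypothetical, and real pipelines typically tolerate sequencing errors in the barcode.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group reads by their leading barcode and strip the barcode from matches.

    Reads whose prefix matches no known barcode are kept whole under "unassigned".
    """
    lengths = {len(barcode) for barcode in barcode_to_sample}
    if len(lengths) != 1:
        raise ValueError("this sketch assumes all barcodes share one length")
    size = lengths.pop()

    bins = defaultdict(list)
    for read in reads:
        sample = barcode_to_sample.get(read[:size])
        if sample is None:
            bins["unassigned"].append(read)
        else:
            bins[sample].append(read[size:])
    return dict(bins)


if __name__ == "__main__":
    barcodes = {"ACGT": "soil", "TGCA": "water"}  # hypothetical sample tags
    reads = ["ACGTTTGACC", "TGCAGGGTTT", "ACGTAAAA", "NNNNCCCC"]
    for sample, members in demultiplex(reads, barcodes).items():
        print(sample, members)
```

    Keeping unmatched reads in their own bin, rather than discarding them, makes it easy to check how much of a run was lost to barcode errors.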

  19. QColors: an algorithm for conservative viral quasispecies reconstruction from short and non-contiguous next generation sequencing reads.

    PubMed

    Huang, Austin; Kantor, Rami; DeLong, Allison; Schreier, Leeann; Istrail, Sorin

    Next generation sequencing technologies have recently been applied to characterize mutational spectra of the heterogeneous population of viral genotypes (known as a quasispecies) within HIV-infected patients. Such information is clinically relevant because minority genetic subpopulations of HIV within patients enable viral escape from selection pressures such as the immune response and antiretroviral therapy. However, methods for quasispecies sequence reconstruction from next generation sequencing reads are not yet widely used and remain an emerging area of research. Furthermore, the majority of research methodology in HIV has focused on 454 sequencing, while many next-generation sequencing platforms used in practice are limited to shorter read lengths relative to 454 sequencing. Little work has been done in determining how best to address the read length limitations of other platforms. The approach described here incorporates graph representations of both read differences and read overlap to conservatively determine the regions of the sequence with sufficient variability to separate quasispecies sequences. Within these tractable regions of quasispecies inference, we use constraint programming to solve for an optimal quasispecies subsequence determination via vertex coloring of the conflict graph, a representation which also lends itself to data with non-contiguous reads such as paired-end sequencing. We demonstrate the utility of the method by applying it to simulations based on actual intra-patient clonal HIV-1 sequencing data.
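
    The conflict-graph idea can be sketched as follows. This is not the QColors implementation: the abstract describes constraint programming for an optimal coloring, whereas this sketch swaps in a simple greedy heuristic, which is not guaranteed to be optimal. It assumes that two reads conflict when they disagree at any position where they overlap, and the reads (start position, sequence) are hypothetical.

```python
def conflicts(a, b):
    """Return True if two reads disagree at any position where they overlap.

    Each read is a (start, sequence) pair; this conflict rule is an assumption
    made for illustration.
    """
    (start_a, seq_a), (start_b, seq_b) = a, b
    lo = max(start_a, start_b)
    hi = min(start_a + len(seq_a), start_b + len(seq_b))
    return any(seq_a[p - start_a] != seq_b[p - start_b] for p in range(lo, hi))


def greedy_coloring(reads):
    """Assign a color (candidate quasispecies index) to each read.

    Reads joined by a conflict edge never share a color. Greedy heuristic,
    highest-degree vertices first; not an optimal solver.
    """
    n = len(reads)
    adjacency = {
        i: {j for j in range(n) if j != i and conflicts(reads[i], reads[j])}
        for i in range(n)
    }
    colors = {}
    for i in sorted(range(n), key=lambda v: -len(adjacency[v])):
        used = {colors[j] for j in adjacency[i] if j in colors}
        colors[i] = next(c for c in range(n) if c not in used)
    return colors


# Hypothetical reads: two variants that differ at position 2.
reads = [(0, "ACGT"), (2, "GTAA"), (0, "ACTT"), (2, "TTAA")]
colors = greedy_coloring(reads)
```

    In this toy input, reads 0 and 1 are mutually consistent, as are reads 2 and 3, so two colors suffice.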

  20. Next generation sequencing technologies for insect virus discovery.

    PubMed

    Liu, Sijun; Vijayendran, Diveena; Bonning, Bryony C

    2011-10-01

    Insects are commonly infected with multiple viruses including those that cause sublethal, asymptomatic, and latent infections. Traditional methods for virus isolation typically lack the sensitivity required for detection of such viruses that are present at low abundance. In this respect, next generation sequencing technologies have revolutionized methods for the discovery and identification of new viruses from insects. Here we review both traditional and modern methods for virus discovery, and outline analysis of transcriptome and small RNA data for identification of viral sequences. We will introduce methods for de novo assembly of viral sequences, identification of potential viral sequences from BLAST data, and bioinformatics for generating full-length or near full-length viral genome sequences. We will also discuss implications of the ubiquity of viruses in insects and in insect cell lines. All of the methods described in this article can also apply to the discovery of viruses in other organisms.

  1. Next Generation Sequencing Technologies for Insect Virus Discovery

    PubMed Central

    Liu, Sijun; Vijayendran, Diveena; Bonning, Bryony C.

    2011-01-01

    Insects are commonly infected with multiple viruses including those that cause sublethal, asymptomatic, and latent infections. Traditional methods for virus isolation typically lack the sensitivity required for detection of such viruses that are present at low abundance. In this respect, next generation sequencing technologies have revolutionized methods for the discovery and identification of new viruses from insects. Here we review both traditional and modern methods for virus discovery, and outline analysis of transcriptome and small RNA data for identification of viral sequences. We will introduce methods for de novo assembly of viral sequences, identification of potential viral sequences from BLAST data, and bioinformatics for generating full-length or near full-length viral genome sequences. We will also discuss implications of the ubiquity of viruses in insects and in insect cell lines. All of the methods described in this article can also apply to the discovery of viruses in other organisms. PMID:22069519

  2. Strategy for microbiome analysis using 16S rRNA gene sequence analysis on the Illumina sequencing platform.

    PubMed

    Ram, Jeffrey L; Karim, Aos S; Sendler, Edward D; Kato, Ikuko

    2011-06-01

    Understanding the identity and changes of organisms in the urogenital and other microbiomes of the human body may be key to discovering causes and new treatments of many ailments, such as vaginosis. High-throughput sequencing technologies have recently enabled discovery of the great diversity of the human microbiome. The cost per base of many of these sequencing platforms remains high (thousands of dollars per sample); however, the Illumina Genome Analyzer (IGA) is estimated to have a cost per base less than one-fifth of its nearest competitor. The main disadvantage of the IGA for sequencing PCR-amplified 16S rRNA genes is that the maximum read-length of the IGA is only 100 bases; whereas, at least 300 bases are needed to obtain phylogenetically informative data down to the genus and species level. In this paper we describe and conduct a pilot test of a multiplex sequencing strategy suitable for achieving total reads of > 300 bases per extracted DNA molecule on the IGA. Results show that all proposed primers produce products of the expected size and that correct sequences can be obtained with all proposed forward primers. Various bioinformatic optimizations of the Illumina Bustard analysis pipeline proved necessary to extract the correct sequence from IGA image data, and these modifications of the data files indicate that further optimization of the analysis pipeline may improve the quality rankings of the data and enable more sequence to be correctly analyzed. The successful application of this method could result in an unprecedentedly deep description (800,000 taxonomic identifications per sample) of the urogenital and other microbiomes in a large number of samples at a reasonable cost per sample.

  3. Phylogenetic properties of 50 nuclear loci in Medicago (Leguminosae) generated using multiplexed sequence capture and next-generation sequencing.

    PubMed

    de Sousa, Filipe; Bertrand, Yann J K; Nylinder, Stephan; Oxelman, Bengt; Eriksson, Jonna S; Pfeil, Bernard E

    2014-01-01

    Next-generation sequencing technology has increased the capacity to generate molecular data for plant biological research, including phylogenetics, and can potentially contribute to resolving complex phylogenetic problems. The evolutionary history of Medicago L. (Leguminosae: Trifoliae) remains unresolved due to incongruence between published phylogenies. Identification of the processes causing this genealogical incongruence is essential for the inference of a correct species phylogeny of the genus and requires that more molecular data, preferably from low-copy nuclear genes, are obtained across different species. Here we report the development of 50 novel LCN markers in Medicago and assess the phylogenetic properties of each marker. We used the genomic resources available for Medicago truncatula Gaertn., hybridisation-based gene enrichment (sequence capture) techniques and Next-Generation Sequencing to generate sequences. This alternative proves to be a cost-effective approach to amplicon sequencing in phylogenetic studies at the genus or tribe level and allows for an increase in number and size of targeted loci. Substitution rate estimates for each of the 50 loci are provided, and an overview of the variation in substitution rates among a large number of low-copy nuclear genes in plants is presented for the first time. Aligned sequences of major species lineages of Medicago and its sister genus are made available and can be used in further probe development for sequence-capture of the same markers.

  4. Phylogenetic Properties of 50 Nuclear Loci in Medicago (Leguminosae) Generated Using Multiplexed Sequence Capture and Next-Generation Sequencing

    PubMed Central

    de Sousa, Filipe; Bertrand, Yann J. K.; Nylinder, Stephan; Oxelman, Bengt; Eriksson, Jonna S.; Pfeil, Bernard E.

    2014-01-01

    Next-generation sequencing technology has increased the capacity to generate molecular data for plant biological research, including phylogenetics, and can potentially contribute to resolving complex phylogenetic problems. The evolutionary history of Medicago L. (Leguminosae: Trifoliae) remains unresolved due to incongruence between published phylogenies. Identification of the processes causing this genealogical incongruence is essential for the inference of a correct species phylogeny of the genus and requires that more molecular data, preferably from low-copy nuclear genes, are obtained across different species. Here we report the development of 50 novel LCN markers in Medicago and assess the phylogenetic properties of each marker. We used the genomic resources available for Medicago truncatula Gaertn., hybridisation-based gene enrichment (sequence capture) techniques and Next-Generation Sequencing to generate sequences. This alternative proves to be a cost-effective approach to amplicon sequencing in phylogenetic studies at the genus or tribe level and allows for an increase in number and size of targeted loci. Substitution rate estimates for each of the 50 loci are provided, and an overview of the variation in substitution rates among a large number of low-copy nuclear genes in plants is presented for the first time. Aligned sequences of major species lineages of Medicago and its sister genus are made available and can be used in further probe development for sequence-capture of the same markers. PMID:25329401

  5. Third Generation Sequencing Techniques and Applications to Drug Discovery

    PubMed Central

    Ozsolak, Fatih

    2012-01-01

    Introduction: There is an immediate need for functional and molecular studies to decipher differences between disease and “normal” settings to identify large quantities of validated targets with the highest therapeutic utilities. Furthermore, drug mechanism of action and biomarkers to predict drug efficacy and safety need to be identified for effective design of clinical trials, decreasing attrition rates, regulatory agency approval process and drug repositioning. By expanding the power of genetics and pharmacogenetics studies, next generation nucleic acid sequencing technologies have started to play an important role in all stages of drug discovery. Areas covered: This article reviews the first and second generation sequencing technologies (SGSTs) and challenges they pose to biomedicine. The article then focuses on the emerging third generation sequencing technologies (TGSTs), their technological foundations and potential contributions to drug discovery. Expert opinion: Despite the scientific and commercial success of SGSTs, the goal of rapid, comprehensive and unbiased sequencing of nucleic acids has not been achieved. TGSTs promise to increase sequencing throughput and read lengths, decrease costs, run times and error rates, eliminate biases inherent in SGSTs, and offer capabilities beyond nucleic acid sequencing. Such changes will have positive impact in all sequencing applications to drug discovery. PMID:22468954

  6. Preparation of next-generation sequencing libraries using Nextera™ technology: simultaneous DNA fragmentation and adaptor tagging by in vitro transposition.

    PubMed

    Caruccio, Nicholas

    2011-01-01

    DNA library preparation is a common entry point and bottleneck for next-generation sequencing. Current methods generally consist of distinct steps that often involve significant sample loss and hands-on time: DNA fragmentation, end-polishing, and adaptor-ligation. In vitro transposition with Nextera™ Transposomes simultaneously fragments and covalently tags the target DNA, thereby combining these three distinct steps into a single reaction. Platform-specific sequencing adaptors can be added, and the sample can be enriched and bar-coded using limited-cycle PCR to prepare di-tagged DNA fragment libraries. Nextera technology offers a streamlined, efficient, and high-throughput method for generating bar-coded libraries compatible with multiple next-generation sequencing platforms.

  7. Neural Sequence Generation Using Spatiotemporal Patterns of Inhibition

    PubMed Central

    Cannon, Jonathan; Kopell, Nancy; Gardner, Timothy; Markowitz, Jeffrey

    2015-01-01

    Stereotyped sequences of neural activity are thought to underlie reproducible behaviors and cognitive processes ranging from memory recall to arm movement. One of the most prominent theoretical models of neural sequence generation is the synfire chain, in which pulses of synchronized spiking activity propagate robustly along a chain of cells connected by highly redundant feedforward excitation. But recent experimental observations in the avian song production pathway during song generation have shown excitatory activity interacting strongly with the firing patterns of inhibitory neurons, suggesting a process of sequence generation more complex than feedforward excitation. Here we propose a model of sequence generation inspired by these observations in which a pulse travels along a spatially recurrent excitatory chain, passing repeatedly through zones of local feedback inhibition. In this model, synchrony and robust timing are maintained not through redundant excitatory connections, but rather through the interaction between the pulse and the spatiotemporal pattern of inhibition that it creates as it circulates the network. These results suggest that spatially and temporally structured inhibition may play a key role in sequence generation. PMID:26536029

  8. Microbial Contamination in Next Generation Sequencing: Implications for Sequence-Based Analysis of Clinical Samples

    PubMed Central

    Strong, Michael J.; Xu, Guorong; Morici, Lisa; Splinter Bon-Durant, Sandra; Baddoo, Melody; Lin, Zhen; Fewell, Claire; Taylor, Christopher M.; Flemington, Erik K.

    2014-01-01

    The high level of accuracy and sensitivity of next generation sequencing for quantifying genetic material across organismal boundaries gives it tremendous potential for pathogen discovery and diagnosis in human disease. Despite this promise, substantial bacterial contamination is routinely found in existing human-derived RNA-seq datasets that likely arises from environmental sources. This raises the need for stringent sequencing and analysis protocols for studies investigating sequence-based microbial signatures in clinical samples. PMID:25412476

  9. The Generation Challenge Programme Platform: Semantic Standards and Workbench for Crop Science

    PubMed Central

    Bruskiewich, Richard; Senger, Martin; Davenport, Guy; Ruiz, Manuel; Rouard, Mathieu; Hazekamp, Tom; Takeya, Masaru; Doi, Koji; Satoh, Kouji; Costa, Marcos; Simon, Reinhard; Balaji, Jayashree; Akintunde, Akinnola; Mauleon, Ramil; Wanchana, Samart; Shah, Trushar; Anacleto, Mylah; Portugal, Arllet; Ulat, Victor Jun; Thongjuea, Supat; Braak, Kyle; Ritter, Sebastian; Dereeper, Alexis; Skofic, Milko; Rojas, Edwin; Martins, Natalia; Pappas, Georgios; Alamban, Ryan; Almodiel, Roque; Barboza, Lord Hendrix; Detras, Jeffrey; Manansala, Kevin; Mendoza, Michael Jonathan; Morales, Jeffrey; Peralta, Barry; Valerio, Rowena; Zhang, Yi; Gregorio, Sergio; Hermocilla, Joseph; Echavez, Michael; Yap, Jan Michael; Farmer, Andrew; Schiltz, Gary; Lee, Jennifer; Casstevens, Terry; Jaiswal, Pankaj; Meintjes, Ayton; Wilkinson, Mark; Good, Benjamin; Wagner, James; Morris, Jane; Marshall, David; Collins, Anthony; Kikuchi, Shoshi; Metz, Thomas; McLaren, Graham; van Hintum, Theo

    2008-01-01

    The Generation Challenge programme (GCP) is a global crop research consortium directed toward crop improvement through the application of comparative biology and genetic resources characterization to plant breeding. A key consortium research activity is the development of a GCP crop bioinformatics platform to support GCP research. This platform includes the following: (i) shared, public platform-independent domain models, ontology, and data formats to enable interoperability of data and analysis flows within the platform; (ii) web service and registry technologies to identify, share, and integrate information across diverse, globally dispersed data sources, as well as to access high-performance computational (HPC) facilities for computationally intensive, high-throughput analyses of project data; (iii) platform-specific middleware reference implementations of the domain model integrating a suite of public (largely open-access/-source) databases and software tools into a workbench to facilitate biodiversity analysis, comparative analysis of crop genomic data, and plant breeding decision making. PMID:18483570

  10. A Robust High Throughput Platform to Generate Functional Recombinant Monoclonal Antibodies Using Rabbit B Cells from Peripheral Blood

    PubMed Central

    Seeber, Stefan; Ros, Francesca; Thorey, Irmgard; Tiefenthaler, Georg; Kaluza, Klaus; Lifke, Valeria; Fischer, Jens André Alexander; Klostermann, Stefan; Endl, Josef; Kopetzki, Erhard; Pashine, Achal; Siewe, Basile; Kaluza, Brigitte; Platzer, Josef; Offner, Sonja

    2014-01-01

    We have developed a robust platform to generate and functionally characterize rabbit-derived antibodies using B cells from peripheral blood. The rapid high throughput procedure generates a diverse set of antibodies, yet requires only few animals to be immunized without the need to sacrifice them. The workflow includes (i) the identification and isolation of single B cells from rabbit blood expressing IgG antibodies, (ii) an elaborate short term B-cell cultivation to produce sufficient monoclonal antigen specific IgG for comprehensive phenotype screens, (iii) the isolation of VH and VL coding regions via PCR from B-cell clones producing antigen specific and functional antibodies followed by the sequence determination, and (iv) the recombinant expression and purification of IgG antibodies. The fully integrated and to a large degree automated platform (demonstrated in this paper using IL1RL1 immunized rabbits) yielded clonal and very diverse IL1RL1-specific and functional IL1RL1-inhibiting rabbit antibodies. These functional IgGs from individual animals were obtained at a short time range after immunization and could be identified already during primary screening, thus substantially lowering the workload for the subsequent B-cell PCR workflow. Early availability of sequence information permits one to select early-on function- and sequence-diverse antibodies for further characterization. In summary, this powerful technology platform has proven to be an efficient and robust method for the rapid generation of antigen specific and functional monoclonal rabbit antibodies without sacrificing the immunized animal. PMID:24503933

  11. Comparison of Two Massively Parallel Sequencing Platforms using 83 Single Nucleotide Polymorphisms for Human Identification.

    PubMed

    Apaga, Dame Loveliness T; Dennis, Sheila E; Salvador, Jazelyn M; Calacal, Gayvelline C; De Ungria, Maria Corazon A

    2017-03-24

    The potential of Massively Parallel Sequencing (MPS) technology to vastly expand the capabilities of human identification led to the emergence of different MPS platforms that use forensically relevant genetic markers. Two of the MPS platforms that are currently available are the MiSeq(®) FGx™ Forensic Genomics System (Illumina) and the HID-Ion Personal Genome Machine (PGM)™ (Thermo Fisher Scientific). These are coupled with the ForenSeq™ DNA Signature Prep kit (Illumina) and the HID-Ion AmpliSeq™ Identity Panel (Thermo Fisher Scientific), respectively. In this study, we compared the genotyping performance of the two MPS systems based on 83 SNP markers that are present in both MPS marker panels. Results show that MiSeq(®) FGx™ has greater sample-to-sample variation than the HID-Ion PGM™ in terms of read counts for all the 83 SNP markers. Allele coverage ratio (ACR) values show generally balanced heterozygous reads for both platforms. Two and four SNP markers from the MiSeq(®) FGx™ and HID-Ion PGM™, respectively, have average ACR values lower than the recommended value of 0.67. Comparison of genotype calls showed 99.7% concordance between the two platforms.
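
    The allele coverage ratio check mentioned in this abstract can be sketched as below. The abstract does not define ACR; this sketch assumes the common definition of the lower allele read count divided by the higher one at a heterozygous marker, and uses the 0.67 value quoted above as the threshold. Marker names and read counts are hypothetical.

```python
ACR_THRESHOLD = 0.67  # recommended value quoted in the abstract


def allele_coverage_ratio(reads_allele_1, reads_allele_2):
    """Return lower read count / higher read count for a heterozygous marker.

    Assumed definition; a value near 1.0 indicates balanced allele coverage.
    """
    low, high = sorted((reads_allele_1, reads_allele_2))
    if high == 0:
        raise ValueError("no reads observed for either allele")
    return low / high


# Hypothetical read counts per heterozygous SNP marker.
read_counts = {"snp_1": (480, 520), "snp_2": (300, 600)}
acr = {marker: allele_coverage_ratio(*counts) for marker, counts in read_counts.items()}
flagged = [marker for marker, value in acr.items() if value < ACR_THRESHOLD]
```

    Here `snp_2` would be flagged as imbalanced, while `snp_1` stays above the threshold.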

  12. Pittosporum cryptic virus 1: genome sequence completion using next-generation sequencing.

    PubMed

    Elbeaino, Toufic; Kubaa, Raied Abou; Tuzlali, Hasan Tuna; Digiaro, Michele

    2016-07-01

    Next-generation sequencing (NGS) was applied to dsRNAs extracted from an Italian pittosporum plant infected with pittosporum cryptic virus 1 (PiCV1). NGS allowed assembly of the full genome sequence of PiCV1, comprising dsRNA1 (1.9 kbp) and dsRNA2 (1.5 kbp), which encode the RNA-dependent RNA polymerase and capsid protein genes, respectively. Phylogenetic and sequence analyses confirmed that PiCV1 is a new member of the genus Deltapartitivirus, family Partitiviridae. From the same plant, NGS also permitted assembly of the complete genome sequence of eggplant mottled dwarf virus (EMDV), which shared 86 % to 98 % nucleotide sequence identity with complete and partial sequences (ca 6750 nt) of other known EMDV isolates with sequences available in the GenBank database.

  13. A Pulse Generator Based on an Arduino Platform for Ultrasonic Applications

    NASA Astrophysics Data System (ADS)

    Acevedo, Pedro; Vázquez, Mónica; Durán, Joel; Petrearce, Rodolfo

    The objective of this work is to use the Arduino platform as an ultrasonic pulse generator to excite PVDF ultrasonic arrays in transmission. An experimental setup was implemented using a through-transmission configuration to evaluate the performance of the generator.

  14. Next-Generation Sequencing: From Understanding Biology to Personalized Medicine

    PubMed Central

    Frese, Karen S.; Katus, Hugo A.; Meder, Benjamin

    2013-01-01

    Within just a few years, the new methods for high-throughput next-generation sequencing have generated completely novel insights into the heritability and pathophysiology of human disease. In this review, we wish to highlight the benefits of the current state-of-the-art sequencing technologies for genetic and epigenetic research. We illustrate how these technologies help to constantly improve our understanding of genetic mechanisms in biological systems and summarize the progress made so far. This can be exemplified by the case of heritable heart muscle diseases, so-called cardiomyopathies. Here, next-generation sequencing is able to identify novel disease genes, and first clinical applications demonstrate the successful translation of this technology into personalized patient care. PMID:24832667

  15. High-Throughput Next-Generation Sequencing of Polioviruses.

    PubMed

    Montmayeur, Anna M; Ng, Terry Fei Fan; Schmidt, Alexander; Zhao, Kun; Magaña, Laura; Iber, Jane; Castro, Christina J; Chen, Qi; Henderson, Elizabeth; Ramos, Edward; Shaw, Jing; Tatusov, Roman L; Dybdahl-Sissoko, Naomi; Endegue-Zanga, Marie Claire; Adeniji, Johnson A; Oberste, M Steven; Burns, Cara C

    2017-02-01

    The poliovirus (PV) is currently targeted for worldwide eradication and containment. Sanger-based sequencing of the viral protein 1 (VP1) capsid region is currently the standard method for PV surveillance. However, the whole-genome sequence is sometimes needed for higher resolution global surveillance. In this study, we optimized whole-genome sequencing protocols for poliovirus isolates and FTA cards using next-generation sequencing (NGS), aiming for high sequence coverage, efficiency, and throughput. We found that DNase treatment of poliovirus RNA followed by random reverse transcription (RT), amplification, and the use of the Nextera XT DNA library preparation kit produced significantly better results than other preparations. The average viral reads per total reads, a measurement of efficiency, was as high as 84.2% ± 15.6%. PV genomes covering >99 to 100% of the reference length were obtained and validated with Sanger sequencing. A total of 52 PV genomes were generated, multiplexing as many as 64 samples in a single Illumina MiSeq run. This high-throughput, sequence-independent NGS approach facilitated the detection of a diverse range of PVs, especially for those in vaccine-derived polioviruses (VDPV), circulating VDPV, or immunodeficiency-related VDPV. In contrast to results from previous studies on other viruses, our results showed that filtration and nuclease treatment did not discernibly increase the sequencing efficiency of PV isolates. However, DNase treatment after nucleic acid extraction to remove host DNA significantly improved the sequencing results. This NGS method has been successfully implemented to generate PV genomes for molecular epidemiology of the most recent PV isolates. Additionally, the ability to obtain full PV genomes from FTA cards will aid in facilitating global poliovirus surveillance.

  16. Nanopore-based fourth-generation DNA sequencing technology.

    PubMed

    Feng, Yanxiao; Zhang, Yuechuan; Ying, Cuifeng; Wang, Deqiang; Du, Chunlei

    2015-02-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications.

  17. Clinical Next Generation Sequencing for Precision Medicine in Cancer

    PubMed Central

    Dong, Ling; Wang, Wanheng; Li, Alvin; Kansal, Rina; Chen, Yuhan; Chen, Hong; Li, Xinmin

    2015-01-01

    Rapid adoption of next generation sequencing (NGS) in genomic medicine has been driven by low cost, high throughput sequencing and rapid advances in our understanding of the genetic bases of human diseases. Today, the NGS method has dominated sequencing space in genomic research, and quickly entered clinical practice. Because unique features of NGS perfectly meet the clinical reality (need to do more with less), the NGS technology is becoming a driving force to realize the dream of precision medicine. This article describes the strengths of NGS, NGS panels used in precision medicine, current applications of NGS in cytology, and its challenges and future directions for routine clinical use. PMID:27006629

  18. The Motif Tool Assessment Platform (MTAP) for sequence-based transcription factor binding site prediction tools.

    PubMed

    Quest, Daniel; Ali, Hesham

    2010-01-01

    Predicting transcription factor binding sites (TFBS) from sequence is one of the most challenging problems in computational biology. The development of (semi-)automated computer-assisted prediction methods is needed to find TFBS over an entire genome, which is a first step in reconstructing mechanisms that control gene activity. Bioinformatics journals continue to publish diverse methods for predicting TFBS on a monthly basis. To help practitioners in deciding which method to use to predict for a particular TFBS, we provide a platform to assess the quality and applicability of the available methods. Assessment tools allow researchers to determine how methods can be expected to perform on specific organisms or on specific transcription factor families. This chapter introduces the TFBS detection problem and reviews current strategies for evaluating algorithm effectiveness. In this chapter, a novel and robust assessment tool, the Motif Tool Assessment Platform (MTAP), is introduced and discussed.

  19. Collaborative Effort for a Centralized Worldwide Tuberculosis Relational Sequencing Data Platform

    PubMed Central

    Starks, Angela M.; Avilés, Enrique; Cirillo, Daniela M.; Denkinger, Claudia M.; Dolinger, David L.; Emerson, Claudia; Gallarda, Jim; Hanna, Debra; Kim, Peter S.; Liwski, Richard; Miotto, Paolo; Schito, Marco; Zignol, Matteo

    2015-01-01

    Continued progress in addressing challenges associated with detection and management of tuberculosis requires new diagnostic tools. These tools must be able to provide rapid and accurate information for detecting resistance to guide selection of the treatment regimen for each patient. To achieve this goal, globally representative genotypic, phenotypic, and clinical data are needed in a standardized and curated data platform. A global partnership of academic institutions, public health agencies, and nongovernmental organizations has been established to develop a tuberculosis relational sequencing data platform (ReSeqTB) that seeks to increase understanding of the genetic basis of resistance by correlating molecular data with results from drug susceptibility testing and, optimally, associated patient outcomes. These data will inform development of new diagnostics, facilitate clinical decision making, and improve surveillance for drug resistance. ReSeqTB offers an opportunity for collaboration to achieve improved patient outcomes and to advance efforts to prevent and control this devastating disease. PMID:26409275

  20. A resampling procedure for generating conditioned daily weather sequences

    USGS Publications Warehouse

    Clark, M.P.; Gangopadhyay, S.; Brandon, D.; Werner, K.; Hay, L.; Rajagopalan, B.; Yates, D.

    2004-01-01

    A method is introduced to generate conditioned daily precipitation and temperature time series at multiple stations. The method resamples data from the historical record "nens" times for the period of interest (nens = number of ensemble members) and reorders the ensemble members to reconstruct the observed spatial (intersite) and temporal correlation statistics. The weather generator model is applied to 2307 stations in the contiguous United States and is shown to reproduce the observed spatial correlation between neighboring stations, the observed correlation between variables (e.g., between precipitation and temperature), and the observed temporal correlation between subsequent days in the generated weather sequence. The weather generator model is extended to produce sequences of weather that are conditioned on climate indices (in this case the Niño 3.4 index). Example illustrations of conditioned weather sequences are provided for a station in Arizona (Petrified Forest, 34.8°N, 109.9°W), where El Niño and La Niña conditions have a strong effect on winter precipitation. The conditioned weather sequences generated using the methods described in this paper are appropriate for use as input to hydrologic models to produce multiseason forecasts of streamflow.

  1. Evaluating Variant Calling Tools for Non-Matched Next-Generation Sequencing Data

    NASA Astrophysics Data System (ADS)

    Sandmann, Sarah; de Graaf, Aniek O.; Karimi, Mohsen; van der Reijden, Bert A.; Hellström-Lindberg, Eva; Jansen, Joop H.; Dugas, Martin

    2017-02-01

    Valid variant calling results are crucial for the use of next-generation sequencing in clinical routine. However, there are numerous variant calling tools that usually differ in algorithms, filtering strategies, recommendations and thus, also in the output. We evaluated eight open-source tools regarding their ability to call single nucleotide variants and short indels with allelic frequencies as low as 1% in non-matched next-generation sequencing data: GATK HaplotypeCaller, Platypus, VarScan, LoFreq, FreeBayes, SNVer, SAMtools and VarDict. We analysed two real datasets from patients with myelodysplastic syndrome, covering 54 Illumina HiSeq samples and 111 Illumina NextSeq samples. Mutations were validated by re-sequencing on the same platform, on a different platform and expert-based review. In addition we considered two simulated datasets with varying coverage and error profiles, covering 50 samples each. In all cases an identical target region consisting of 19 genes (42,322 bp) was analysed. Altogether, no tool succeeded in calling all mutations. High sensitivity was always accompanied by low precision. Influence of varying coverage and background noise on variant calling was generally low. Taking everything into account, VarDict performed best. However, our results indicate that there is a need to improve reproducibility of the results in the context of multithreading.
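
    The sensitivity and precision trade-off reported in this abstract can be made concrete with a minimal sketch. It assumes variants are compared as exact (chromosome, position, reference allele, alternate allele) tuples against a validated truth set; the variants below are hypothetical and unrelated to the study's data.

```python
def sensitivity_and_precision(called, truth):
    """Compare a caller's variant set against a validated truth set.

    sensitivity = true positives / all true variants
    precision   = true positives / all called variants
    """
    true_positives = len(called & truth)
    sensitivity = true_positives / len(truth) if truth else 0.0
    precision = true_positives / len(called) if called else 0.0
    return sensitivity, precision


# Hypothetical variants as (chromosome, position, ref, alt) tuples.
truth = {("chr1", 100, "A", "T"), ("chr1", 250, "G", "C"), ("chr2", 75, "C", "CT")}
called = {
    ("chr1", 100, "A", "T"),
    ("chr1", 250, "G", "C"),
    ("chr2", 75, "C", "CT"),
    ("chr2", 900, "T", "G"),  # false positive
    ("chr3", 40, "G", "A"),   # false positive
}
sensitivity, precision = sensitivity_and_precision(called, truth)
```

    This toy caller finds every true variant (sensitivity 1.0) but only three of its five calls are correct (precision 0.6), the kind of pattern the abstract describes.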

  2. Evaluating Variant Calling Tools for Non-Matched Next-Generation Sequencing Data

    PubMed Central

    Sandmann, Sarah; de Graaf, Aniek O.; Karimi, Mohsen; van der Reijden, Bert A.; Hellström-Lindberg, Eva; Jansen, Joop H.; Dugas, Martin

    2017-01-01

    Valid variant calling results are crucial for the use of next-generation sequencing in clinical routine. However, there are numerous variant calling tools that usually differ in algorithms, filtering strategies, recommendations and thus, also in the output. We evaluated eight open-source tools regarding their ability to call single nucleotide variants and short indels with allelic frequencies as low as 1% in non-matched next-generation sequencing data: GATK HaplotypeCaller, Platypus, VarScan, LoFreq, FreeBayes, SNVer, SAMtools and VarDict. We analysed two real datasets from patients with myelodysplastic syndrome, covering 54 Illumina HiSeq samples and 111 Illumina NextSeq samples. Mutations were validated by re-sequencing on the same platform, on a different platform and expert-based review. In addition, we considered two simulated datasets with varying coverage and error profiles, covering 50 samples each. In all cases an identical target region consisting of 19 genes (42,322 bp) was analysed. Altogether, no tool succeeded in calling all mutations. High sensitivity was always accompanied by low precision. Influence of varying coverage and background noise on variant calling was generally low. Taking everything into account, VarDict performed best. However, our results indicate that there is a need to improve reproducibility of the results in the context of multithreading. PMID:28233799

  3. Detection of Bacillus anthracis DNA in Complex Soil and Air Samples Using Next-Generation Sequencing

    PubMed Central

    Be, Nicholas A.; Thissen, James B.; Gardner, Shea N.; McLoughlin, Kevin S.; Fofanov, Viacheslav Y.; Koshinsky, Heather; Ellingson, Sally R.; Brettin, Thomas S.; Jackson, Paul J.; Jaing, Crystal J.

    2013-01-01

    Bacillus anthracis is the potentially lethal etiologic agent of anthrax disease, and is a significant concern in the realm of biodefense. One of the cornerstones of an effective biodefense strategy is the ability to detect infectious agents with a high degree of sensitivity and specificity in the context of a complex sample background. The nature of the B. anthracis genome, however, renders specific detection difficult, due to close homology with B. cereus and B. thuringiensis. We therefore elected to determine the efficacy of next-generation sequencing analysis and microarrays for detection of B. anthracis in an environmental background. We applied next-generation sequencing to titrated genome copy numbers of B. anthracis in the presence of background nucleic acid extracted from aerosol and soil samples. We found next-generation sequencing to be capable of detecting as few as 10 genomic equivalents of B. anthracis DNA per nanogram of background nucleic acid. Detection was accomplished by mapping reads to either a defined subset of reference genomes or to the full GenBank database. Moreover, sequence data obtained from B. anthracis could be reliably distinguished from sequence data mapping to either B. cereus or B. thuringiensis. We also demonstrated the efficacy of a microbial census microarray in detecting B. anthracis in the same samples, representing a cost-effective and high-throughput approach, complementary to next-generation sequencing. Our results, in combination with the capacity of sequencing for providing insights into the genomic characteristics of complex and novel organisms, suggest that these platforms should be considered important components of a biosurveillance strategy. PMID:24039948

  4. Next-Generation Sequencing in the Understanding of Kaposi's Sarcoma-Associated Herpesvirus (KSHV) Biology.

    PubMed

    Strahan, Roxanne; Uppal, Timsy; Verma, Subhash C

    2016-03-31

    Non-Sanger-based novel nucleic acid sequencing techniques, referred to as Next-Generation Sequencing (NGS), provide a rapid, reliable, high-throughput, and massively parallel sequencing methodology that has improved our understanding of human cancers and cancer-related viruses. NGS has become a quintessential research tool for more effective characterization of complex viral and host genomes through its ever-expanding repertoire, which consists of whole-genome sequencing, whole-transcriptome sequencing, and whole-epigenome sequencing. These new NGS platforms provide a comprehensive and systematic genome-wide analysis of genomic sequences and a full transcriptional profile at a single nucleotide resolution. When combined, these techniques help unlock the function of novel genes and the related pathways that contribute to the overall viral pathogenesis. Ongoing research in the field of virology endeavors to identify the role of various underlying mechanisms that control the regulation of the herpesvirus biphasic lifecycle in order to discover potential therapeutic targets and treatment strategies. In this review, we have compiled the most recent findings about the application of NGS in Kaposi's sarcoma-associated herpesvirus (KSHV) biology, including identification of novel genomic features and whole-genome KSHV diversities, global gene regulatory network profiling for intricate transcriptome analyses, and surveying of epigenetic marks (DNA methylation, modified histones, and chromatin remodelers) during de novo, latent, and productive KSHV infections.

  5. Pattern Recognition on Read Positioning in Next Generation Sequencing

    PubMed Central

    Byeon, Boseon; Kovalchuk, Igor

    2016-01-01

    The usefulness and the utility of the next generation sequencing (NGS) technology are based on the assumption that the DNA or cDNA cleavage required to generate short sequence reads is random. Several previous reports suggest the existence of sequencing bias of NGS reads. To address this question in greater detail, we analyze NGS data from four organisms with different GC content, Plasmodium falciparum (19.39%), Arabidopsis thaliana (36.03%), Homo sapiens (40.91%) and Streptomyces coelicolor (72.00%). Using machine learning techniques, we recognize the pattern that the NGS read start is positioned in the local region where the nucleotide distribution is dissimilar from the global nucleotide distribution. We also demonstrate that the mono-nucleotide distribution underestimates sequencing bias, and the recognized pattern is explained largely by the distribution of multi-nucleotides (di-, tri-, and tetra- nucleotides) rather than mono-nucleotides. This implies that the correction of sequencing bias needs to be performed on the basis of the multi-nucleotide distribution. Providing companion software to quantify the effect of the recognized pattern on read positioning, we exemplify that the bias correction based on the mono-nucleotide distribution may not be sufficient to clean sequencing bias. PMID:27299343
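
    The contrast this abstract draws between mono-nucleotide and multi-nucleotide distributions can be illustrated with k-mer frequency tables. The following is a minimal sketch, not the authors' companion software: the function names are hypothetical, and total variation distance is an assumed stand-in for whatever dissimilarity measure the paper uses.

```python
from collections import Counter


def kmer_freqs(seq, k):
    """Relative frequency of every overlapping k-mer in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def total_variation(p, q):
    """Total variation distance between two frequency tables (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in keys)


# Two toy sequences with the same base composition but different ordering.
local_region = "ACACAC"
global_region = "AAACCC"
mono = total_variation(kmer_freqs(local_region, 1), kmer_freqs(global_region, 1))
di = total_variation(kmer_freqs(local_region, 2), kmer_freqs(global_region, 2))
```

    In this toy case the mono-nucleotide distance is zero while the di-nucleotide distance is large, which shows how a mono-nucleotide view can miss a difference that multi-nucleotide distributions expose.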

  6. Performance Evaluation Tools for Next Generation Scalable Computing Platforms

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Sarukkai, Sekhar; Craw, James (Technical Monitor)

    1995-01-01

    The Federal High Performance Computing and Communications (HPCC) Program continues to focus on R&D in a wide range of high performance computing and communications technologies. Using its accomplishments in the past four years as building blocks towards a Global Information Infrastructure (GII), an Implementation Plan that identifies six Strategic Focus Areas for R&D has been proposed. This white paper argues that a new generation of system software and programming tools must be developed to support these focus areas, so that the R&D we invest today can lead to technology pay-off a decade from now. The Global Computing Infrastructure (GCI) in the Year 2000 and Beyond would consist of thousands of powerful computing nodes connected via high-speed networks across the globe. Users will be able to obtain computing and information services from the GCI with the ease of plugging a toaster into an electrical outlet on the wall anywhere in the country. Developing and managing the GCI requires performance prediction and monitoring capabilities that do not exist. Various accomplishments in this field today must be integrated and expanded to support this vision.

  7. An Evolution Based Biosensor Receptor DNA Sequence Generation Algorithm

    PubMed Central

    Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M.; Lee, Jaewan; Zang, Yupeng

    2010-01-01

    A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements. PMID:22315543

  8. Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline

    PubMed Central

    2014-01-01

    Background Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. Results To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. Conclusions By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples. PMID:24475911

  9. Generation and analysis of expressed sequence tags from the ciliate protozoan parasite Ichthyophthirius multifiliis

    PubMed Central

    Abernathy, Jason W; Xu, Peng; Li, Ping; Xu, De-Hai; Kucuktas, Huseyin; Klesius, Phillip; Arias, Covadonga; Liu, Zhanjiang

    2007-01-01

    Background The ciliate protozoan Ichthyophthirius multifiliis (Ich) is an important parasite of freshwater fish that causes 'white spot disease' leading to significant losses. A genomic resource for large-scale studies of this parasite has been lacking. To study gene expression involved in Ich pathogenesis and virulence, our goal was to generate expressed sequence tags (ESTs) for the development of a powerful microarray platform for the analysis of global gene expression in this species. Here, we initiated a project to sequence and analyze over 10,000 ESTs. Results We sequenced 10,368 EST clones using a normalized cDNA library made from pooled samples of the trophont, tomont, and theront life-cycle stages, and generated 9,769 sequences (94.2% success rate). Post-sequencing processing led to 8,432 high quality sequences. Clustering analysis of these ESTs allowed identification of 4,706 unique sequences containing 976 contigs and 3,730 singletons. These unique sequences represent over two million base pairs (~10% of Plasmodium falciparum genome, a phylogenetically related protozoan). BLASTX searches produced 2,518 significant (E-value < 10^-5) hits and further Gene Ontology (GO) analysis annotated 1,008 of these genes. The ESTs were analyzed comparatively against the genomes of the related protozoa Tetrahymena thermophila and P. falciparum, allowing putative identification of additional genes. All the EST sequences were deposited by dbEST in GenBank (GenBank: EG957858–EG966289). Gene discovery and annotations are presented and discussed. Conclusion This set of ESTs represents a significant proportion of the Ich transcriptome, and provides a material basis for the development of microarrays useful for gene expression studies concerning Ich development, pathogenesis, and virulence. PMID:17577414

  10. Future Developments of the Next Generation Manned Space Platforms (European and Russian Space Students Perspectives)

    NASA Astrophysics Data System (ADS)

    Robinson, Douglas K. R.

    2002-01-01

    The opportunities for research made available by in-orbit manned space platforms are extensive. Research topics range from space life science and biotechnology to material science and structural mechanics, and from astrophysics to the Low Earth Orbit (LEO) environment, to name a few. The list is long and has been growing steadily since the launch of Salyut 1 in 1971 until the present-day ISS. With the construction of the ISS now in its final phase, what is the future of such research platforms? What will the "Next Generation" space station comprise? What of manned research platforms beyond LEO, and what constraints are foreseen after ISS? This paper presents current issues concerning the conceptual design of the "Next Generation" manned space platforms, the obstacles that are predicted concerning major subsystems of such platforms and also predictions of where the foci of research will concentrate. Future developments of the next generation manned space platforms presents research by the author in both his previous academic institutions, personal opinions and the opinions of other young space research students and space professionals including Super Aero (France), Leicester University and Space Research Centre (UK) and Moscow State University (Russia). Here the author will detail the areas in which the contributors (representing the next generation space professionals) believe manned space platform architectures will be evolved, new technological developments and barriers to be overcome. In addition, new methods of spacecraft design will also be presented, referring in the main to the Space Station Design Workshop 2002 (ESTEC Concurrent Design Facility), a week-long workshop where a group of 30 young space professionals were brought together to design a conceptual space station. Future developments of the next generation manned space platforms has been composed with two aims. Firstly, to convey to both young space enthusiasts and more mature space professionals the ideas

  11. Application of next generation sequencing technology in Mendelian movement disorders.

    PubMed

    Wang, Yumin; Pan, Xuya; Xue, Dan; Li, Yuwei; Zhang, Xueying; Kuang, Biao; Zheng, Jiabo; Deng, Hao; Li, Xiaoling; Xiong, Wei; Zeng, Zhaoyang; Li, Guiyuan

    2016-02-01

    Next generation sequencing (NGS) has developed very rapidly in the last decade. Compared with Sanger sequencing, NGS has the advantages of high sensitivity and high throughput. Movement disorders are a common type of neurological disease. Although traditional linkage analysis has become a standard method to identify the pathogenic genes in diseases, it is getting difficult to find new pathogenic genes in rare Mendelian disorders, such as movement disorders, due to a lack of appropriate families with high penetrance or enough affected individuals. Thus, NGS is an ideal approach to identify the causal alleles for inherited disorders. NGS is used to identify genes in several diseases and new mutant sites in Mendelian movement disorders. This article reviewed the recent progress in NGS and the use of NGS in Mendelian movement disorders from genome sequencing and transcriptome sequencing. A perspective on how NGS could be employed in rare Mendelian disorders is also provided.

  12. Using chaos to generate variations on movement sequences

    NASA Astrophysics Data System (ADS)

    Bradley, Elizabeth; Stuart, Joshua

    1998-12-01

    We describe a method for introducing variations into predefined motion sequences using a chaotic symbol-sequence reordering technique. A progression of symbols representing the body positions in a dance piece, martial arts form, or other motion sequence is mapped onto a chaotic trajectory, establishing a symbolic dynamics that links the movement sequence and the attractor structure. A variation on the original piece is created by generating a trajectory with slightly different initial conditions, inverting the mapping, and using special corpus-based graph-theoretic interpolation schemes to smooth any abrupt transitions. Sensitive dependence guarantees that the variation is different from the original; the attractor structure and the symbolic dynamics guarantee that the two resemble one another in both aesthetic and mathematical senses.

  13. A Real-Time de novo DNA Sequencing Assembly Platform Based on an FPGA Implementation.

    PubMed

    Hu, Yuanqi; Georgiou, Pantelis

    2016-01-01

    This paper presents an FPGA based DNA comparison platform which can be run concurrently with the sensing phase of DNA sequencing and shortens the overall time needed for de novo DNA assembly. A hybrid overlap searching algorithm is applied which is scalable and can deal with incremental detection of new bases. To handle the incomplete data set which gradually increases during sequencing time, all-against-all comparisons are broken down into successive window-against-window comparison phases and executed using a novel dynamic suffix comparison algorithm combined with a partitioned dynamic programming method. The complete system has been designed to facilitate parallel processing in hardware, which allows real-time comparison and full scalability as well as a decrease in the number of computations required. A base pair comparison rate of 51.2 G/s is achieved when implemented on an FPGA with successful DNA comparison when using data sets from real genomes.

  14. Development of microsatellite markers for the Korean Mussel, Mytilus coruscus (Mytilidae) using next-generation sequencing.

    PubMed

    An, Hye Suck; Lee, Jang Wook

    2012-01-01

    Mytilus coruscus (family Mytilidae) is one of the most important marine shellfish species in Korea. During the past few decades, this species has become endangered due to the loss of habitats and overfishing. Despite this species' importance, information on its genetic background is scarce. In this study, we developed microsatellite markers for M. coruscus using next-generation sequencing. A total of 263,900 raw reads were obtained from a quarter-plate run on the 454 GS-FLX titanium platform, and 176,327 unique sequences were generated with an average length of 381 bp; 2569 (1.45%) sequences contained a minimum of five di- to tetra-nucleotide repeat motifs. Of the 51 loci screened, 46 were amplified successfully, and 22 were polymorphic among 30 individuals, with seven of trinucleotide repeats and three of tetranucleotide repeats. All loci exhibited high genetic variability, with an average of 17.32 alleles per locus, and the mean observed and expected heterozygosities were 0.67 and 0.90, respectively. In addition, cross-amplification was tested for all 22 loci in another congener species, M. galloprovincialis. None of the primer pairs resulted in effective amplification, which might be due to their high mutation rates. Our work demonstrated the utility of next-generation 454 sequencing as a method for the rapid and cost-effective identification of microsatellites. The high degree of polymorphism exhibited by the 22 newly developed microsatellites will be useful in future conservation genetic studies of this species.
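
    The screening criterion in this abstract (sequences containing at least five di- to tetra-nucleotide repeat motifs) can be expressed as a regular-expression scan. The following is a minimal sketch under stated assumptions, not the pipeline the authors used: the function name `find_microsatellites` is hypothetical, only perfect (uninterrupted) repeats are detected, and homopolymer runs are excluded by an added filter.

```python
import re


def find_microsatellites(seq, min_repeats=5):
    """Find perfect di- to tetra-nucleotide tandem repeats in seq.

    Returns a list of (start, motif, repeat_count) tuples. The lazy
    quantifier makes the shortest motif win, so ACACACACAC is reported
    as (AC)5 rather than as a longer unit.
    """
    pattern = re.compile(r"([ACGT]{2,4}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


# Toy read with an (AC)6 and an (ATG)5 repeat.
read = "TTG" + "AC" * 6 + "GGT" + "ATG" * 5 + "C"
hits = find_microsatellites(read)
```

    Real microsatellite screening usually also tolerates imperfect repeats and checks for flanking sequence suitable for primer design, neither of which this sketch attempts.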

  15. Temporally consistent virtual camera generation from stereo image sequences

    NASA Astrophysics Data System (ADS)

    Fox, Simon R.; Flack, Julien; Shao, Juliang; Harman, Phil

    2004-05-01

    The recent emergence of auto-stereoscopic 3D viewing technologies has increased demand for the creation of 3D video content. A range of glasses-free multi-viewer screens have been developed that require as many as 9 views generated for each frame of video. This presents difficulties in both view generation and transmission bandwidth. This paper examines the use of stereo video capture as a means to generate multiple scene views via disparity analysis. A machine learning approach is applied to learn relationships between disparity generated depth information and source footage, and to generate depth information in a temporally smooth manner for both left and right eye image sequences. A view morphing approach to multiple view rendering is described which provides an excellent 3D effect on a range of glasses-free displays, while providing robustness to inaccurate stereo disparity calculations.

  16. Repetitive reef to ooid sequences near leeward margin of Caicos Platform, British West Indies

    SciTech Connect

    Waltz, M.; Rossinsky, V.; Wanless, H.R.

    1987-05-01

    Drill core transects and outcrops near the leeward margin of the Caicos Platform, BWI, reveal repetitive (one Holocene and two Pleistocene) shallowing-upward sequences of either (a) reefal boundstones overlain by layered oolitic grainstones or (b) burrowed oolitic grainstones overlain by layered oolitic grainstones. Each sediment sequence is separated from the other by a calcrete exposure surface. A transect, perpendicular to the trend of an exposed Pleistocene barrier reef/ooid sand complex, shows two separate sediment packages of reefal boundstones and reef-derived skeletal packstones overlain by layered oolitic grainstones. The well-exposed upper package consists of a shallowing-upward barrier reef, which is immediately overlain by burrowed and cross-bedded oolitic grainstones, beach rock blocks, and coral rubble, capped by layered oolitic grainstones. Separated by an exposure horizon, the lowermost package consists of coral and skeletal sands overlain by layered oolitic grainstones. Cores from a transect in a non-reefal setting north of the barrier reef complex reveal highly burrowed oolitic grainstones capped by layered oolitic grainstones. As a Holocene example, immediately offshore of this transect, modern reefs and bioturbated oolitic grainstones are presently being buried beneath coral rubble, beach rock blocks, and prograding oolitic beaches. Deposition of the capping layered oolitic grainstones appears to occur during stable and falling sea levels. This co-occurrence of reefal sediment and ooid sands suggests that the two are not mutually exclusive and that reef-ooid succession is a reoccurring part of leeward margin platform margin-building.

  17. Sequence variation of 22 autosomal STR loci detected by next generation sequencing.

    PubMed

    Gettings, Katherine Butler; Kiesler, Kevin M; Faith, Seth A; Montano, Elizabeth; Baker, Christine H; Young, Brian A; Guerrieri, Richard A; Vallone, Peter M

    2016-03-01

    Sequencing short tandem repeat (STR) loci allows for determination of repeat motif variations within the STR (or entire PCR amplicon) which cannot be ascertained by size-based PCR fragment analysis. Sanger sequencing has been used in research laboratories to further characterize STR loci, but is impractical for routine forensic use due to the laborious nature of the procedure in general and additional steps required to separate heterozygous alleles. Recent advances in library preparation methods enable high-throughput next generation sequencing (NGS) and technological improvements in sequencing chemistries now offer sufficient read lengths to encompass STR alleles. Herein, we present sequencing results from 183 DNA samples, including African American, Caucasian, and Hispanic individuals, at 22 autosomal forensic STR loci using an assay designed for NGS. The resulting dataset has been used to perform population genetic analyses of allelic diversity by length compared to sequence, and exemplifies which loci are likely to achieve the greatest gains in discrimination via sequencing. Within this data set, six loci demonstrate greater than double the number of alleles obtained by sequence compared to the number of alleles obtained by length: D12S391, D2S1338, D21S11, D8S1179, vWA, and D3S1358. As expected, repeat region sequences which had not previously been reported in forensic literature were identified.

  18. Sequence variation of 22 autosomal STR loci detected by next generation sequencing

    PubMed Central

    Gettings, Katherine Butler; Kiesler, Kevin M.; Faith, Seth A.; Montano, Elizabeth; Baker, Christine H.; Young, Brian A.; Guerrieri, Richard A.; Vallone, Peter M.

    2016-01-01

    Sequencing short tandem repeat (STR) loci allows for determination of repeat motif variations within the STR (or entire PCR amplicon) which cannot be ascertained by size-based PCR fragment analysis. Sanger sequencing has been used in research laboratories to further characterize STR loci, but is impractical for routine forensic use due to the laborious nature of the procedure in general and additional steps required to separate heterozygous alleles. Recent advances in library preparation methods enable high-throughput next generation sequencing (NGS) and technological improvements in sequencing chemistries now offer sufficient read lengths to encompass STR alleles. Herein, we present sequencing results from 183 DNA samples, including African American, Caucasian, and Hispanic individuals, at 22 autosomal forensic STR loci using an assay designed for NGS. The resulting dataset has been used to perform population genetic analyses of allelic diversity by length compared to sequence, and exemplifies which loci are likely to achieve the greatest gains in discrimination via sequencing. Within this data set, six loci demonstrate greater than double the number of alleles obtained by sequence compared to the number of alleles obtained by length: D12S391, D2S1338, D21S11, D8S1179, vWA, and D3S1358. As expected, repeat region sequences which had not previously been reported in forensic literature were identified. PMID:26701720

  19. Next-generation sequencing technologies: breaking the sound barrier of human genetics.

    PubMed

    Bahassi, El Mustapha; Stambrook, Peter J

    2014-09-01

    Demand for new technologies that deliver fast, inexpensive and accurate genome information has never been greater. This challenge has catalysed the rapid development of advances in next-generation sequencing (NGS). The generation of large volumes of sequence data and the speed of data acquisition are the primary advantages over previous, more standard methods. In 2013, the Food and Drug Administration granted marketing authorisation for the first high-throughput NG sequencer, Illumina's MiSeqDx, which allowed the development and use of a large number of new genome-based tests. Here, we present a review of template preparation, nucleic acid sequencing and imaging, genome assembly and alignment approaches as well as recent advances in current and near-term commercially available NGS instruments. We also outline the broad range of applications for NGS technologies and provide guidelines for platform selection to best address biological questions of interest. DNA sequencing has revolutionised biological and medical research, and is poised to have a similar impact on the practice of medicine. This tool is but one of an increasing arsenal of developing tools that enhance our capabilities to identify, quantify and functionally characterise the components of biological networks that keep us healthy or make us sick. Despite advances in other 'omic' technologies, DNA sequencing and analysis, in many respects, have played the leading role to date. The new technologies provide a bridge between genotype and phenotype, both in man and model organisms, and have revolutionised how risk of developing a complex human disease may be assessed. The generation of large DNA sequence data sets is producing a wealth of medically relevant information on a large number of individuals and populations that will potentially form the basis of truly individualised medical care in the future.

  20. Next generation sequencing in sporadic retinoblastoma patients reveals somatic mosaicism.

    PubMed

    Amitrano, Sara; Marozza, Annabella; Somma, Serena; Imperatore, Valentina; Hadjistilianou, Theodora; De Francesco, Sonia; Toti, Paolo; Galimberti, Daniela; Meloni, Ilaria; Cetta, Francesco; Piu, Pietro; Di Marco, Chiara; Dosa, Laura; Lo Rizzo, Caterina; Carignani, Giulia; Mencarelli, Maria Antonietta; Mari, Francesca; Renieri, Alessandra; Ariani, Francesca

    2015-11-01

    In about 50% of sporadic cases of retinoblastoma, no constitutive RB1 mutations are detected by conventional methods. However, recent research suggests that, at least in some of these cases, there is somatic mosaicism with respect to RB1 normal and mutant alleles. The increased availability of next generation sequencing improves our ability to detect the exact percentage of patients with mosaicism. Using this technology, we re-tested a series of 40 patients with sporadic retinoblastoma: 10 of them had been previously classified as constitutional heterozygotes, whereas in 30 no RB1 mutations had been found in lymphocytes. In 3 of these 30 patients, we have now identified low-level mosaic variants, varying in frequency between 8 and 24%. In 7 out of the 10 cases previously classified as heterozygous from testing blood cells, we were able to test additional tissues (ocular tissues, urine and/or oral mucosa): in three of them, next generation sequencing has revealed mosaicism. Present results thus confirm that a significant fraction (6/40; 15%) of sporadic retinoblastoma cases are due to postzygotic events and that deep sequencing is an efficient method to unambiguously distinguish mosaics. Re-testing of retinoblastoma patients through next generation sequencing can thus provide new information that may have important implications with respect to genetic counseling and family care.

  1. In Silico Proficiency Testing for Clinical Next-Generation Sequencing.

    PubMed

    Duncavage, Eric J; Abel, Haley J; Pfeifer, John D

    2017-01-01

    Quality assurance for clinical next-generation sequencing (NGS)-based assays is difficult given the complex methods and the range of sequence variants such assays can detect. As the number and range of mutations detected by clinical NGS assays has increased, it is difficult to apply standard analyte-specific proficiency testing (PT). Most current proficiency testing challenges for NGS are methods-based PT surveys that use DNA from reference samples engineered to harbor specific mutations that test both sequence generation and bioinformatics analysis. These methods-based PTs are limited by the number and types of mutations that can be physically introduced into a single DNA sample. In silico proficiency testing, which evaluates only the bioinformatics component of NGS assays, is a recently introduced PT method that allows for evaluation of numerous mutations spanning a range of variant classes. In silico PT data sets can be generated from simulated or actual sequencing data and are used to test alignment through variant detection and annotation steps. In silico PT has several advantages over the use of physical samples, including greater flexibility in tested variants, the ability to design laboratory-specific challenges, and lower costs. Herein, we review the use of in silico PT as an alternative to traditional methods-based PT as it is evolving in oncology applications and discuss how the approach is applicable more broadly.

  2. Alignment-free sequence comparison based on next-generation sequencing reads.

    PubMed

    Song, Kai; Ren, Jie; Zhai, Zhiyuan; Liu, Xuemei; Deng, Minghua; Sun, Fengzhu

    2013-02-01

    Next-generation sequencing (NGS) technologies have generated enormous amounts of shotgun read data, and assembly of the reads can be challenging, especially for organisms without template sequences. We study the power of genome comparison based on shotgun read data without assembly using three alignment-free sequence comparison statistics, D(2), D(*)(2) and D(s)(2), both theoretically and by simulations. Theoretical formulas for the power of detecting the relationship between two sequences related through a common motif model are derived. It is shown that both D(*)(2) and D(s)(2) outperform D(2) for detecting the relationship between two sequences based on NGS data. We then study the effects of length of the tuple, read length, coverage, and sequencing error on the power of D(*)(2) and D(s)(2). Finally, variations of these statistics, d(2), d(*)(2) and d(s)(2), respectively, are used to first cluster five mammalian species with known phylogenetic relationships, and then cluster 13 tree species whose complete genome sequences are not available using NGS shotgun reads. The clustering results using d(s)(2) are consistent with biological knowledge for the 5 mammalian and 13 tree species, respectively. Thus, the statistic d(s)(2) provides a powerful alignment-free comparison tool to study the relationships among different organisms based on NGS read data without assembly.
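
    As a rough illustration of the alignment-free statistics named in this abstract, the sketch below counts k-tuples in two read sets and computes D2 (the inner product of the counts) and a D2S-style statistic on centred counts. It is not the authors' implementation: the function names are hypothetical, and the uniform background model (every k-tuple equally likely) is a simplifying assumption made only for this example.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_counts(reads, k):
    """Count every k-tuple (k-mer) occurring in a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def d2(x, y):
    """D2: sum over shared k-tuples of the product of their counts."""
    return sum(x[w] * y[w] for w in x.keys() & y.keys())


def d2s(x, y, k, alphabet="ACGT"):
    """D2S-style statistic on counts centred by their expectation.

    Assumption for this sketch: a uniform background, so each k-tuple
    has probability len(alphabet) ** -k. k-tuples containing other
    symbols (e.g. N) are ignored in the sum.
    """
    p = len(alphabet) ** -k
    nx, ny = sum(x.values()), sum(y.values())
    total = 0.0
    for w in map("".join, product(alphabet, repeat=k)):
        xc = x[w] - nx * p  # centred count in sample 1
        yc = y[w] - ny * p  # centred count in sample 2
        denom = sqrt(xc * xc + yc * yc)
        if denom > 0:
            total += xc * yc / denom
    return total


reads_a = ["ACGTACGT"]
reads_b = ["ACGTTTTT"]
x = kmer_counts(reads_a, 2)
y = kmer_counts(reads_b, 2)
print(d2(x, y), d2s(x, y, 2))
```

    In practice the background probabilities would be estimated from the reads rather than assumed uniform.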

  3. Sequencing, De novo Assembly, Functional Annotation and Analysis of Phyllanthus amarus Leaf Transcriptome Using the Illumina Platform

    PubMed Central

    Bose Mazumdar, Aparupa; Chattopadhyay, Sharmila

    2016-01-01

    Phyllanthus amarus Schum. and Thonn., a widely distributed annual medicinal herb, has been used in the traditional system of medicine for over 2000 years. However, the lack of genomic data for P. amarus, a non-model organism, hinders research at the molecular level. In the present study, high-throughput sequencing technology has been employed to improve understanding of this herb and provide comprehensive genomic information for future work. Here the P. amarus leaf transcriptome was sequenced using the Illumina MiSeq platform. We assembled 85,927 non-redundant (nr) “unitranscript” sequences with an average length of 1548 bp, from 18,060,997 raw reads. Sequence similarity analyses and annotation of these unitranscripts were performed against databases such as the green plants nr protein database, Gene Ontology (GO), Clusters of Orthologous Groups (COG), PlnTFDB, and KEGG. As a result, 69,394 GO terms, 583 enzyme codes (EC), 134 KEGG maps, and 59 Transcription Factor (TF) families were generated. Functional and comparative analyses of assembled unitranscripts were also performed with the most closely related species, such as Populus trichocarpa and Ricinus communis, using TRAPID. KEGG analysis showed that a number of assembled unitranscripts were involved in secondary metabolite pathways, mainly the phenylpropanoid, flavonoid, terpenoid, alkaloid, and lignan biosynthetic pathways, that have significant medicinal attributes. Further, Fragments Per Kilobase of transcript per Million mapped reads (FPKM) values of the identified secondary metabolite pathway genes were determined, and Reverse Transcription PCR (RT-PCR) of a few of these genes was performed to validate the de novo assembled leaf transcriptome dataset. In addition, 65,273 simple sequence repeats (SSRs) were identified. To the best of our knowledge, this is the first transcriptomic dataset of P. amarus to date. Our study provides the largest genetic resource that will lead to drug development and pave
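
    The FPKM measure mentioned in this abstract normalises a fragment count by transcript length (in kilobases) and by sequencing depth (in millions of mapped fragments). A minimal sketch of that arithmetic follows; the function name and the example numbers are illustrative and are not taken from the study.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    fragments / (length_bp / 1e3) / (total_mapped / 1e6)
    simplifies to fragments * 1e9 / (length_bp * total_mapped).
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp transcript
# in a library of 10 million mapped fragments.
value = fpkm(500, 2000, 10_000_000)
print(value)
```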

  4. Biomarker discovery by CE-MS enables sequence analysis via MS/MS with platform-independent separation.

    PubMed

    Zürbig, Petra; Renfrow, Matthew B; Schiffer, Eric; Novak, Jan; Walden, Michael; Wittke, Stefan; Just, Ingo; Pelzing, Matthias; Neusüss, Christian; Theodorescu, Dan; Root, Karen E; Ross, Mark M; Mischak, Harald

    2006-06-01

    CE-MS is a successful proteomic platform for the definition of biomarkers in different body fluids. Besides the biomarker-defining experimental parameters (CE migration time and molecular weight), the biomarker's sequence identity is indispensable for deeper insights into the pathophysiological pathways of diseases and for made-to-measure therapeutic drug design. Therefore, this report presents a detailed discussion of different peptide sequencing platforms, each consisting of a high-performance separation method coupled either on-line or off-line to different MS/MS devices, such as MALDI-TOF-TOF, ESI-IT, ESI-QTOF and Fourier transform ion cyclotron resonance, for sequencing indicative peptides. This comparison demonstrates the unique ability of CE-MS technology to serve as a reliable basis for assigning peptide sequence data obtained using different separation-MS/MS methods to the biomarker-defining parameters, CE migration time and molecular weight. Discovery of potential biomarkers by CE-MS enables sequence analysis via MS/MS with platform-independent sample separation. This is because the number of basic and neutral polar amino acids in a biomarker's sequence distinctly correlates with its CE-MS migration time/molecular weight coordinates. This property allows different sequencing platforms to be used independently for peptide sequencing of CE-MS-defined biomarkers from highly complex mixtures.

  5. Fourth Generation of Next-Generation Sequencing Technologies: Promise and Consequences.

    PubMed

    Ke, Rongqin; Mignardi, Marco; Hauling, Thomas; Nilsson, Mats

    2016-12-01

    In this review, we discuss the emergence of the fourth-generation sequencing technologies that preserve the spatial coordinates of RNA and DNA sequences with up to subcellular resolution, thus enabling back mapping of sequencing reads to the original histological context. This information is used, for example, in two current large-scale projects that aim to unravel the function of the brain. Also in cancer research, fourth-generation sequencing has the potential to revolutionize the field. Cancer Research UK has named "Mapping the molecular and cellular tumor microenvironment in order to define new targets for therapy and prognosis" one of the grand challenges in tumor biology. We discuss the advantages of sequencing nucleic acids directly in fixed cells over traditional next-generation sequencing (NGS) methods, the limitations and challenges that these new methods have to face to become broadly applicable, and the impact that the information generated by the combination of in situ sequencing and NGS methods will have in research and diagnostics.

  6. Fourth Generation of Next‐Generation Sequencing Technologies: Promise and Consequences

    PubMed Central

    Ke, Rongqin; Mignardi, Marco; Hauling, Thomas

    2016-01-01

    In this review, we discuss the emergence of the fourth‐generation sequencing technologies that preserve the spatial coordinates of RNA and DNA sequences with up to subcellular resolution, thus enabling back mapping of sequencing reads to the original histological context. This information is used, for example, in two current large‐scale projects that aim to unravel the function of the brain. Also in cancer research, fourth‐generation sequencing has the potential to revolutionize the field. Cancer Research UK has named “Mapping the molecular and cellular tumor microenvironment in order to define new targets for therapy and prognosis” one of the grand challenges in tumor biology. We discuss the advantages of sequencing nucleic acids directly in fixed cells over traditional next‐generation sequencing (NGS) methods, the limitations and challenges that these new methods have to face to become broadly applicable, and the impact that the information generated by the combination of in situ sequencing and NGS methods will have in research and diagnostics. PMID:27406789

  7. DNA extraction from vegetative tissue for next-generation sequencing.

    PubMed

    Furtado, Agnelo

    2014-01-01

    The quality of extracted DNA is crucial for several applications in molecular biology. If the DNA is to be used for next-generation sequencing (NGS), then microgram quantities of good-quality DNA are required. In addition, the DNA must be substantially of high molecular weight so that it can be used for library preparation and NGS. Contaminating phenol or starch in the isolated DNA can be easily removed by filtration through kit-based cartridges. In this chapter we describe a simple two-reagent DNA extraction protocol that yields DNA of high quality and quantity, which can be used for different applications including NGS.

  8. Next-Generation Technologies for Multiomics Approaches Including Interactome Sequencing

    PubMed Central

    Ohashi, Hiroyuki; Miyamoto-Sato, Etsuko

    2015-01-01

    The development of high-speed analytical techniques such as next-generation sequencing and microarrays allows high-throughput analysis of biological information at a low cost. These techniques contribute to medical and bioscience advancements and provide new avenues for scientific research. Here, we outline a variety of new innovative techniques and discuss their use in omics research (e.g., genomics, transcriptomics, metabolomics, proteomics, and interactomics). We also discuss the possible applications of these methods, including an interactome sequencing technology that we developed, in future medical and life science research. PMID:25649523

  9. Next-generation sequencing in schizophrenia and other neuropsychiatric disorders.

    PubMed

    Schreiber, Matthew; Dorschner, Michael; Tsuang, Debby

    2013-10-01

    Schizophrenia is a debilitating lifelong illness that lacks a cure and poses a worldwide public health burden. The disease is characterized by a heterogeneous clinical and genetic presentation that complicates research efforts to identify causative genetic variations. This review examines the potential of current findings in schizophrenia and in other related neuropsychiatric disorders for application in next-generation technologies, particularly whole-exome sequencing (WES) and whole-genome sequencing (WGS). These approaches may lead to the discovery of underlying genetic factors for schizophrenia and may thereby identify and target novel therapeutic targets for this devastating disorder.

  10. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

    PubMed

    Bonatelli, Isabel A S; Carstens, Bryan C; Moraes, Evandro M

    2015-01-01

    Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms.

  11. Detection of false positive mutations in BRCA gene by next generation sequencing.

    PubMed

    Suryavanshi, Moushumi; Kumar, Dushyant; Panigrahi, Manoj Kumar; Chowdhary, Meenakshi; Mehta, Anurag

    2016-11-15

    BRCA1 and BRCA2 genes are implicated in 20-25% of hereditary breast and ovarian cancers. New age sequencing platforms have revolutionized massively parallel sequencing in clinical practice by providing cost-effective, rapid, and sensitive sequencing. This study critically evaluates the false positives in multiplex panels and suggests the need for careful analysis. We employed a multiplex PCR-based BRCA1 and BRCA2 community panel with the Ion Torrent PGM machine for evaluation of these mutations. Of the 41 samples analyzed for BRCA1 and BRCA2, five were found with 950_951 insA(Asn319fs) at position Chr13:32906565 and one sample with 1032_1033 insA(Asn346fs) at Chr13:32906647, both being frame-shift mutations in the BRCA2 gene. The 950_951 insA(Asn319fs) mutation is reported as a pathogenic allele in NCBI dbSNP. On examination in IGV, all these samples showed an 'A' nucleotide insertion at positions 950 and 1032 in exon 10 of the BRCA2 gene. Sanger sequencing did not confirm these insertions. Next-generation sequencing shows great promise by allowing rapid mutational analysis of multiple genes in human cancer, but our results indicate the need for careful sequence analysis to avoid false positive results.

  12. Histoimmunogenetics Markup Language 1.0: Reporting next generation sequencing-based HLA and KIR genotyping.

    PubMed

    Milius, Robert P; Heuer, Michael; Valiga, Daniel; Doroschak, Kathryn J; Kennedy, Caleb J; Bolon, Yung-Tsi; Schneider, Joel; Pollack, Jane; Kim, Hwa Ran; Cereb, Nezih; Hollenbach, Jill A; Mack, Steven J; Maiers, Martin

    2015-12-01

    We present an electronic format for exchanging data for HLA and KIR genotyping with extensions for next-generation sequencing (NGS). This format addresses NGS data exchange by refining the Histoimmunogenetics Markup Language (HML) to conform to the proposed Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) reporting guidelines (miring.immunogenomics.org). Our refinements of HML include two major additions. First, NGS is supported by new XML structures to capture additional NGS data and metadata required to produce a genotyping result, including analysis-dependent (dynamic) and method-dependent (static) components. A full genotype, consensus sequence, and the surrounding metadata are included directly, while the raw sequence reads and platform documentation are externally referenced. Second, genotype ambiguity is fully represented by integrating Genotype List Strings, which use a hierarchical set of delimiters to represent allele and genotype ambiguity in a complete and accurate fashion. HML also continues to enable the transmission of legacy methods (e.g. site-specific oligonucleotide, sequence-specific priming, and Sequence Based Typing (SBT)), adding features such as allowing multiple group-specific sequencing primers, and fully leveraging techniques that combine multiple methods to obtain a single result, such as SBT integrated with NGS.
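
    The hierarchical delimiters of Genotype List Strings mentioned in this abstract can be illustrated by a small recursive splitter. This is a sketch, not the HML tooling: the function name is hypothetical, and the delimiter precedence used here (`^` between loci, `|` between ambiguous genotypes, `+` between gene copies, `~` between phased alleles of a haplotype, `/` between ambiguous alleles) is stated from general knowledge of the GL String format rather than from the abstract.

```python
# Delimiters from lowest to highest binding precedence (assumed order).
GL_DELIMITERS = ["^", "|", "+", "~", "/"]


def parse_gl_string(text, level=0):
    """Split a GL String into nested lists, one nesting level per delimiter.

    The result is nested as
    loci -> genotypes -> gene copies -> haplotype members -> alleles.
    """
    if level == len(GL_DELIMITERS):
        return text  # a single allele name
    parts = text.split(GL_DELIMITERS[level])
    return [parse_gl_string(part, level + 1) for part in parts]


def flatten(tree):
    """Collect every allele name in a parsed GL String, left to right."""
    if isinstance(tree, str):
        return [tree]
    alleles = []
    for child in tree:
        alleles.extend(flatten(child))
    return alleles


# Illustrative genotype: one copy is ambiguous between two alleles.
gl = "HLA-A*01:01/HLA-A*01:02+HLA-A*24:02"
parsed = parse_gl_string(gl)
print(parsed)
print(flatten(parsed))
```

    Keeping every level in the nested result, even when a delimiter is absent, means the position in the tree always tells the reader which kind of ambiguity or phasing is being expressed.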

  13. Large-scale MHC class II genotyping of a wild lemur population by next generation sequencing.

    PubMed

    Huchard, Elise; Albrecht, Christina; Schliehe-Diecks, Susanne; Baniel, Alice; Roos, Christian; Kappeler, Peter M; Brameier, Markus

    2012-12-01

    The critical role of major histocompatibility complex (MHC) genes in disease resistance, along with their putative function in sexual selection, reproduction and chemical ecology, make them an important genetic system in evolutionary ecology. Studying selective pressures acting on MHC genes in the wild nevertheless requires population-wide genotyping, which has long been challenging because of their extensive polymorphism. Here, we report on large-scale genotyping of the MHC class II loci of the grey mouse lemur (Microcebus murinus) from a wild population in western Madagascar. The second exons from MHC-DRB and -DQB of 772 and 672 individuals were sequenced, respectively, using a 454 sequencing platform, generating more than 800,000 reads. Sequence analysis, through a stepwise variant validation procedure, allowed reliable typing of more than 600 individuals. The quality of our genotyping was evaluated through three independent methods, namely genotyping the same individuals by both cloning and 454 sequencing, running duplicates, and comparing parent-offspring dyads; each displaying very high accuracy. A total of 61 (including 20 new) and 60 (including 53 new) alleles were detected at DRB and DQB genes, respectively. Both loci were non-duplicated, in tight linkage disequilibrium and in Hardy-Weinberg equilibrium, despite the fact that sequence analysis revealed clear evidence of historical selection. Our results highlight the potential of 454 sequencing technology in attempts to investigate patterns of selection shaping MHC variation in contemporary populations. The power of this approach will nevertheless be conditional upon strict quality control of the genotyping data.

  14. SeqHound: biological sequence and structure database as a platform for bioinformatics research

    PubMed Central

    2002-01-01

    Background SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit. PMID:12401134

  15. The 2013 seismic sequence close to gas injection platform of the Castor project, offshore Spain

    NASA Astrophysics Data System (ADS)

    Cesca, Simone; Grigoli, Francesco; Heimann, Sebastian; Gonzalez, Alvaro; Buforn, Elisa; Maghsoudi, Samira; Blanch, Estefania; Dahm, Torsten

    2014-05-01

    A spatially localized seismic sequence originated a few tens of kilometres offshore the Mediterranean coast of Spain, starting on September 5, 2013, and lasting at least until October 2013. The sequence culminated in a maximal moment magnitude Mw 4.3 earthquake on October 1, 2013. The epicentral region is located near the offshore platform of the Castor project, where gas is conducted through a pipeline from the mainland and where it was recently injected into a depleted oil reservoir at about 2 km depth. We analyse the temporal evolution of the seismic sequence and use full waveform techniques to derive absolute and relative locations, estimate depths and focal mechanisms for the largest events in the sequence (with magnitude mbLg larger than 3), and compare them to a previous event (April 8, 2012, mbLg 3.3) that took place in the same region prior to the gas injection. Moment tensor inversion results show that the overall seismicity in this sequence is characterized by oblique mechanisms with a normal fault component, with a 30° low-dip angle plane oriented NNE-SSW and a sub-vertical plane oriented NW-SE. The combined analysis of hypocentral location and focal mechanisms could indicate that the seismic sequence corresponds to rupture processes along sub-horizontal shallow surfaces, which could have been triggered by the gas injection in the reservoir. An alternative scenario includes the iterated triggering of a system of steep faults oriented NW-SE, which were identified by prior marine seismic investigations. The most relevant seismogenic feature in the area is the Fosa de Amposta fault system, which includes different strands mapped at different distances from the coast, with a general NE-SW orientation, roughly parallel to the coastline. No significant known historical seismicity has involved this fault in the past. Both of our scenarios exclude its activation, as its known orientation is inconsistent with the focal mechanism results.

  16. Clinical Application of Targeted Next Generation Sequencing for Colorectal Cancers

    PubMed Central

    Fontanges, Quitterie; De Mendonca, Ricardo; Salmon, Isabelle; Le Mercier, Marie; D’Haene, Nicky

    2016-01-01

    Promising targeted therapy and personalized medicine are making molecular profiling of tumours a priority. For colorectal cancer (CRC) patients, international guidelines made RAS (KRAS and NRAS) status a prerequisite for the use of anti-epidermal growth factor receptor agents (anti-EGFR). Daily, new data emerge on the theranostic and prognostic role of molecular biomarkers, which is a strong incentive for a validated, sensitive and broadly available molecular screening test in order to implement and improve multi-modal therapy strategy and clinical trials. Next generation sequencing (NGS) has begun to supplant other technologies for genomic profiling. Targeted NGS is a method that allows parallel sequencing of thousands of short DNA sequences in a single test offering a cost-effective approach for detecting multiple genetic alterations with a minimum amount of DNA. In the present review, we collected data concerning the clinical application of NGS technology in the setting of colorectal cancer. PMID:27999270

  17. Automatic Generation of Randomized Trial Sequences for Priming Experiments

    PubMed Central

    Ihrke, Matthias; Behrendt, Jörg

    2011-01-01

    In most psychological experiments, a randomized presentation of successive displays is crucial for the validity of the results. For some paradigms, this is not a trivial issue because trials are interdependent, e.g., priming paradigms. We present software that automatically generates optimized trial sequences for (negative-) priming experiments. Our implementation is based on an optimization heuristic known as genetic algorithms that allows for an intuitive interpretation due to its similarity to natural evolution. The program features a graphical user interface that allows the user to generate trial sequences and to interactively improve them. The software is based on freely available software and is released under the GNU General Public License. PMID:22007178

  18. All-optical pseudorandom bit sequences generator based on TOADs

    NASA Astrophysics Data System (ADS)

    Sun, Zhenchao; Wang, Zhi; Wu, Chongqing; Wang, Fu; Li, Qiang

    2016-03-01

    A scheme for an all-optical pseudorandom bit sequence (PRBS) generator is demonstrated with an optical 'XNOR' logic gate and an all-optical wavelength converter based on cascaded Terahertz Optical Asymmetric Demultiplexers (TOADs). Its feasibility is verified by generation of a return-to-zero on-off keying (RZ-OOK) 2^63-1 PRBS at a speed of 1 Gb/s with a 10% duty ratio. The high randomness of the ultra-long-cycle PRBS is validated by successfully passing the standard benchmark test.
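
    The role of the XNOR gate in a PRBS generator can be shown in software with a linear feedback shift register whose feedback bit is the XNOR of two tapped stages. The sketch below is an electronic analogy only, not a model of the optical TOAD hardware; the function name is hypothetical, and the short 7-stage register with taps at stages 7 and 6 is chosen for the example so that the full period of 2^7 - 1 = 127 bits can be inspected.

```python
def prbs_xnor(n_stages, taps, length, state=0):
    """Generate `length` bits from an XNOR-feedback shift register.

    n_stages: number of register stages.
    taps:     two 1-indexed stage numbers fed to the XNOR gate.
    state:    initial register contents; with XNOR feedback the
              all-ones state is the lock-up state, so 0 is a valid seed.
    """
    if len(taps) != 2:
        raise ValueError("this sketch expects exactly two taps")
    mask = (1 << n_stages) - 1
    bits = []
    for _ in range(length):
        a = (state >> (taps[0] - 1)) & 1
        b = (state >> (taps[1] - 1)) & 1
        feedback = 1 ^ a ^ b  # XNOR of the two tapped stages
        bits.append((state >> (n_stages - 1)) & 1)  # output the last stage
        state = ((state << 1) | feedback) & mask
    return bits


# Two full periods of the 7-stage example sequence.
seq = prbs_xnor(7, (7, 6), 2 * 127)
first, second = seq[:127], seq[127:]
print(first == second, sum(first))
```

    The same function with 63 stages and taps at stages 63 and 62 would give a much longer sequence; those tap positions are quoted from commonly published shift-register tables and are not stated in the abstract.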

  19. Rapid evaluation and quality control of next generation sequencing data with FaQCs

    DOE PAGES

    Lo, Chien-Chi; Chain, Patrick S. G.

    2014-12-01

    Background: Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Results: Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. Conclusion: FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.

  20. Rapid evaluation and quality control of next generation sequencing data with FaQCs

    SciTech Connect

    Lo, Chien-Chi; Chain, Patrick S. G.

    2014-12-01

    Background: Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Results: Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. Conclusion: FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.
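
    Because read quality often declines toward the end of a read, a common quality-control step is to trim low-quality bases from the 3' end. The sketch below shows one simple way to do this on a single FASTQ record. It is not the FaQCs algorithm, whose details the abstract does not give; the function names, the Phred+33 encoding and the threshold of 20 are assumptions made for the example.

```python
def phred33_scores(quality_string):
    """Decode a FASTQ quality string, assuming Phred+33 (Sanger) encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def trim_3prime(sequence, quality_string, min_quality=20):
    """Remove trailing bases whose quality is below `min_quality`.

    Scans from the 3' end and stops at the first base that meets the
    threshold; bases upstream of it are kept regardless of quality.
    """
    if len(sequence) != len(quality_string):
        raise ValueError("sequence and quality string differ in length")
    scores = phred33_scores(quality_string)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


# 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGT", "IIIII###")
print(trimmed_seq, trimmed_qual)
```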

  1. Estimating individual admixture proportions from next generation sequencing data.

    PubMed

    Skotte, Line; Korneliussen, Thorfinn Sand; Albrechtsen, Anders

    2013-11-01

    Inference of population structure and individual ancestry is important both for population genetics and for association studies. With next generation sequencing technologies it is possible to obtain genetic data for all accessible genetic variations in the genome. Existing methods for admixture analysis rely on known genotypes. However, individual genotypes cannot be inferred from low-depth sequencing data without introducing errors. This article presents a new method for inferring an individual's ancestry that takes the uncertainty introduced in next generation sequencing data into account. This is achieved by working directly with genotype likelihoods that contain all relevant information of the unobserved genotypes. Using simulations as well as publicly available sequencing data, we demonstrate that the presented method has great accuracy even for very low-depth data. At the same time, we demonstrate that applying existing methods to genotypes called from the same data can introduce severe biases. The presented method is implemented in the NGSadmix software available at http://www.popgen.dk/software.
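
    Genotype likelihoods, which this abstract describes as the input that carries the uncertainty of low-depth data, can be sketched for a single biallelic site as follows. This is not the NGSadmix code: the function name is hypothetical, and the error model (a miscalled base is equally likely to be any of the three other bases, reads are independent) is a simplifying assumption for illustration.

```python
def genotype_likelihoods(bases, error_rates, ref, alt):
    """Return [L(g=0), L(g=1), L(g=2)], where g counts alt alleles.

    bases:       observed base calls covering the site.
    error_rates: per-base probability that the call is wrong.
    Assumed model: each read samples one of the two chromosomes with
    probability 1/2, and an erroneous call is uniform over the other
    three bases.
    """
    likelihoods = []
    for g in (0, 1, 2):
        lik = 1.0
        for base, err in zip(bases, error_rates):
            p_given_ref = 1 - err if base == ref else err / 3
            p_given_alt = 1 - err if base == alt else err / 3
            lik *= (1 - g / 2) * p_given_ref + (g / 2) * p_given_alt
        likelihoods.append(lik)
    return likelihoods


# Two reads at 1% error: one shows the reference base, one the alternate.
liks = genotype_likelihoods("AG", [0.01, 0.01], ref="A", alt="G")
print(liks)
```

    With only two reads the heterozygous genotype is the most likely, but the homozygous likelihoods are not zero, which is the uncertainty that is lost when a single genotype is called.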

  2. New Generations: Sequencing Machines and Their Computational Challenges

    PubMed Central

    Schwartz, David C.; Waterman, Michael S.

    2011-01-01

    New generation sequencing systems are changing how molecular biology is practiced. The widely promoted $1000 genome will be a reality with attendant changes for healthcare, including personalized medicine. More broadly the genomes of many new organisms with large samplings from populations will be commonplace. What is less appreciated is the explosive demands on computation, both for CPU cycles and storage as well as the need for new computational methods. In this article we will survey some of these developments and demands. PMID:22121326

  3. New Generations: Sequencing Machines and Their Computational Challenges.

    PubMed

    Schwartz, David C; Waterman, Michael S

    2010-01-01

    New generation sequencing systems are changing how molecular biology is practiced. The widely promoted $1000 genome will be a reality with attendant changes for healthcare, including personalized medicine. More broadly the genomes of many new organisms with large samplings from populations will be commonplace. What is less appreciated is the explosive demands on computation, both for CPU cycles and storage as well as the need for new computational methods. In this article we will survey some of these developments and demands.

  4. Mapping Sensorimotor Sequences to Word Sequences: A Connectionist Model of Language Acquisition and Sentence Generation

    ERIC Educational Resources Information Center

    Takac, Martin; Benuskova, Lubica; Knott, Alistair

    2012-01-01

    In this article we present a neural network model of sentence generation. The network has both technical and conceptual innovations. Its main technical novelty is in its semantic representations: the messages which form the input to the network are structured as sequences, so that message elements are delivered to the network one at a time. Rather…

  5. Vidjil: A Web Platform for Analysis of High-Throughput Repertoire Sequencing

    PubMed Central

    Duez, Marc; Herbert, Ryan; Rocher, Tatiana; Salson, Mikaël; Thonier, Florian

    2016-01-01

    Background: B and T lymphocytes are white blood cells that play a key role in adaptive immunity. A part of their DNA, called the V(D)J recombinations, is specific to each lymphocyte and enables recognition of specific antigens. Today, with new sequencing techniques, one can get billions of DNA sequences from these regions. With dedicated Repertoire Sequencing (RepSeq) methods, it is now possible to picture populations of lymphocytes and to monitor more accurately the immune response as well as pathologies such as leukemia. Methods and Results: Vidjil is an open-source platform for the interactive analysis of high-throughput sequencing data from lymphocyte recombinations. It contains an algorithm gathering reads into clonotypes according to their V(D)J junctions, a web application made of a sample, experiment and patient database, and a visualization for the analysis of clonotypes over time. Vidjil is implemented in C++, Python and JavaScript and licensed under the GPLv3 open-source license. Source code, binaries and a public web server are available at http://www.vidjil.org and at http://bioinfo.lille.inria.fr/vidjil. Using the Vidjil web application consists of four steps: 1. uploading a raw sequence file (typically a FASTQ); 2. running RepSeq analysis software; 3. visualizing the results; 4. annotating the results and saving them for future use. For the end-user, the Vidjil web application needs no specific installation and just requires a connection and a modern web browser. Vidjil is used by labs in hematology or immunology for research and clinical applications. PMID:27835690
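
    The idea of gathering reads into clonotypes by their junctions can be illustrated with a minimal sketch. It assumes that a junction sequence has already been extracted for each read and that reads are grouped by exact junction match; both are simplifications for illustration and not a description of Vidjil's actual algorithm. The junction strings and the function name `group_clonotypes` are hypothetical.

```python
from collections import Counter


def group_clonotypes(junctions):
    """Group reads by junction sequence.

    Returns a list of (junction, read_count, fraction_of_reads),
    ordered from the most to the least abundant clonotype.
    """
    counts = Counter(junctions)
    total = sum(counts.values())
    return [(junction, n, n / total) for junction, n in counts.most_common()]


# Hypothetical junction sequences, one per read (invented for illustration).
reads = [
    "TGTGCCAGC", "TGTGCCAGC", "TGTGCGAGA",
    "TGTGCCAGC", "TGTGCGAGA", "TGTGCTTGG",
]

clonotypes = group_clonotypes(reads)
for junction, n, fraction in clonotypes:
    print(f"{junction}\t{n}\t{fraction:.2f}")
```

    Tracking the fraction of reads per clonotype across several samples of one patient is what a visualization of clonotypes over time would build on.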

  6. Generation of control sequences for a pilot-disassembly system

    NASA Astrophysics Data System (ADS)

    Seliger, Guenther; Kim, Hyung-Ju; Keil, Thomas

    2002-02-01

    Closing the product and material cycles has emerged as a paradigm for industry in the 21st century. Disassembly plays a key role in a life cycle economy since it enables the recovery of resources. A partly automated disassembly system should adapt to a large variety of products and different degrees of devaluation. The number of products to be disassembled can also vary strongly. To cope with these demands, an approach to generate on-line disassembly control sequences will be presented. To respond to these demands, the technological feasibility is considered within a procedure for the generation of disassembly control sequences. Procedures are designed to find available and technologically feasible disassembly processes. The control system is formed by modularised and parameterised control units at the cell level within the entire control architecture. In the first development stage, product and process analyses of a sample product, a washing machine, were executed. Furthermore, a generalized disassembly process was defined. Afterwards these processes were structured in primary and secondary functions. In the second stage the disassembly control at the technological level was investigated. Factors were the availability of the disassembly tools and the technological feasibility of the disassembly processes within the disassembly system. Technical alternative disassembly processes are determined as a result of the availability of the tools and the technological feasibility of the processes. The fourth phase was the concept for the generation of the disassembly control sequences. The approach will be proved in a prototypical disassembly system.

  7. Evaluation of GS Junior and MiSeq next-generation sequencing technologies as an alternative to Trugene population sequencing in the clinical HIV laboratory.

    PubMed

    Ram, Daniela; Leshkowitz, Dena; Gonzalez, Dimitri; Forer, Relly; Levy, Itzchak; Chowers, Michal; Lorber, Margalit; Hindiyeh, Musa; Mendelson, Ella; Mor, Orna

    2015-02-01

    Population HIV-1 sequencing is currently the method of choice for the identification and follow-up of HIV-1 antiretroviral drug resistance. It has limited sensitivity and results in a consensus sequence showing the most prevalent nucleotide per position. Moreover, concomitant sequencing and interpretation of the results for several samples together is laborious and time consuming. In this study, the practical use of GS Junior and MiSeq bench-top next generation sequencing (NGS) platforms as an alternative to Trugene Sanger-based population sequencing in the clinical HIV laboratory was assessed. DeepChek®-HIV TherapyEdge software was used for processing all the protease and reverse transcriptase sequences and for resistance interpretation. Plasma samples from nine HIV-1 carriers, representing the major HIV-1 subtypes in Israel, were compared. The total number of amino acid substitutions identified in the nine samples by GS Junior (232 substitutions) and MiSeq (243 substitutions) was similar and higher than Trugene (181 substitutions), emphasizing the advantage of deep sequencing over population sequencing. More than 80% of the identified substitutions were identical between the GS Junior and MiSeq platforms, most of which (184 of 199) occurred at similar frequency. Low abundance substitutions accounted for 20.9% of the MiSeq and 21.9% of the GS Junior output, the majority of which were not detected by Trugene. More drug resistance mutations were identified by both the NGS platforms, primarily, but not only, at low abundance. In conclusion, in combination with DeepChek, both GS Junior and MiSeq were found to be more sensitive than Trugene and adequate for HIV-1 resistance analysis in the clinical HIV laboratory.
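
    The concordance figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the abstract and assumes that 199 is the number of substitutions found by both NGS platforms, which is a reading of the abstract rather than something it states explicitly. The variable names are illustrative.

```python
# Counts taken from the abstract.
gs_junior_total = 232      # substitutions identified by GS Junior
miseq_total = 243          # substitutions identified by MiSeq
trugene_total = 181        # substitutions identified by Trugene
shared = 199               # assumed: substitutions identical on both NGS platforms
shared_similar_freq = 184  # shared substitutions found at similar frequency

# Share of each platform's substitutions that the other platform also found.
share_gs_junior = shared / gs_junior_total
share_miseq = shared / miseq_total

# Share of the common substitutions that agree in frequency.
similar_fraction = shared_similar_freq / shared

# Additional substitutions relative to Trugene population sequencing.
extra_gs_junior = gs_junior_total - trugene_total
extra_miseq = miseq_total - trugene_total

print(f"Shared / GS Junior total: {share_gs_junior:.1%}")
print(f"Shared / MiSeq total:     {share_miseq:.1%}")
print(f"Similar frequency:        {similar_fraction:.1%}")
print(f"Extra vs Trugene:         {extra_gs_junior} (GS Junior), {extra_miseq} (MiSeq)")
```

    Under this reading, both shares exceed 80%, which is consistent with the abstract's "more than 80%" statement.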

  8. Rep-Seq: uncovering the immunological repertoire through next-generation sequencing

    PubMed Central

    Benichou, Jennifer; Ben-Hamo, Rotem; Louzoun, Yoram; Efroni, Sol

    2012-01-01

    Recent scientific discoveries fuelled by the application of next-generation DNA and RNA sequencing technologies highlight the striking impact of these platforms in characterizing multiple aspects in genomics research. This technology has been used in the study of the B-cell and T-cell receptor repertoire. The novelty of immunosequencing comes from the recent rapid development of techniques and the exponential reduction in cost of sequencing. Here, we describe some of the technologies, which we collectively refer to as Rep-Seq (repertoire sequencing), to portray achievements in the field and to present the essential and inseparable role of next-generation sequencing to the understanding of entities in immune response. The large Rep-Seq data sets that should be available in the near future call for new computational algorithms to segue the transition from ‘classic’ molecular-based analysis to system-wide analysis. The combination of new algorithms with high-throughput data will form the basis for possible new clinical implications in personalized medicine and deeper understanding of immune behaviour and immune response. PMID:22043864

  9. Generating Researcher Networks with Identified Persons on a Semantic Service Platform

    NASA Astrophysics Data System (ADS)

    Jung, Hanmin; Lee, Mikyoung; Kim, Pyung; Lee, Seungwoo

    This paper describes a Semantic Web-based method to acquire researcher networks by means of an identification scheme, an ontology, and reasoning. Three steps are required to realize it: resolving co-references, finding experts, and generating researcher networks. We adopt OntoFrame as an underlying semantic service platform and apply reasoning to make direct relations between far-off classes in the ontology schema. 453,124 Elsevier journal articles with metadata and full-text documents in information technology and biomedical domains have been loaded and served on the platform as a test set.

  10. Quantifying population genetic differentiation from next-generation sequencing data.

    PubMed

    Fumagalli, Matteo; Vieira, Filipe G; Korneliussen, Thorfinn Sand; Linderoth, Tyler; Huerta-Sánchez, Emilia; Albrechtsen, Anders; Nielsen, Rasmus

    2013-11-01

    Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. In particular, the use of naïve methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage.

  11. Next-generation sequencing techniques for eukaryotic microorganisms: sequencing-based solutions to biological problems.

    PubMed

    Nowrousian, Minou

    2010-09-01

    Over the past 5 years, large-scale sequencing has been revolutionized by the development of several so-called next-generation sequencing (NGS) technologies. These have drastically increased the number of bases obtained per sequencing run while at the same time decreasing the costs per base. Compared to Sanger sequencing, NGS technologies yield shorter read lengths; however, despite this drawback, they have greatly facilitated genome sequencing, first for prokaryotic genomes and within the last year also for eukaryotic ones. This advance was possible due to a concomitant development of software that allows the de novo assembly of draft genomes from large numbers of short reads. In addition, NGS can be used for metagenomics studies as well as for the detection of sequence variations within individual genomes, e.g., single-nucleotide polymorphisms (SNPs), insertions/deletions (indels), or structural variants. Furthermore, NGS technologies have quickly been adopted for other high-throughput studies that were previously performed mostly by hybridization-based methods like microarrays. This includes the use of NGS for transcriptomics (RNA-seq) or the genome-wide analysis of DNA/protein interactions (ChIP-seq). This review provides an overview of NGS technologies that are currently available and the bioinformatics analyses that are necessary to obtain information from the flood of sequencing data as well as applications of NGS to address biological questions in eukaryotic microorganisms.

  12. Efficient and sensitive identification and quantification of airborne pollen using next-generation DNA sequencing.

    PubMed

    Kraaijeveld, Ken; de Weger, Letty A; Ventayol García, Marina; Buermans, Henk; Frank, Jeroen; Hiemstra, Pieter S; den Dunnen, Johan T

    2015-01-01

    Pollen monitoring is an important and widely used tool in allergy research and creation of awareness in pollen-allergic patients. Current pollen monitoring methods are microscope-based, labour intensive and cannot identify pollen to the genus level in some relevant allergenic plant groups. Therefore, a more efficient, cost-effective and sensitive method is needed. Here, we present a method for identification and quantification of airborne pollen using DNA sequencing. Pollen is collected from ambient air using standard techniques. DNA is extracted from the collected pollen, and a fragment of the chloroplast gene trnL is amplified using PCR. The PCR product is subsequently sequenced on a next-generation sequencing platform (Ion Torrent). Amplicon molecules are sequenced individually, allowing identification of different sequences from a mixed sample. We show that this method provides an accurate qualitative and quantitative view of the species composition of samples of airborne pollen grains. We also show that it correctly identifies the individual grass genera present in a mixed sample of grass pollen, which cannot be achieved using microscopic pollen identification. We conclude that our method is more efficient and sensitive than current pollen monitoring techniques and therefore has the potential to increase the throughput of pollen monitoring.

  13. Genomic Selection in the Era of Next Generation Sequencing for Complex Traits in Plant Breeding

    PubMed Central

    Bhat, Javaid A.; Ali, Sajad; Salgotra, Romesh K.; Mir, Zahoor A.; Dutta, Sutapa; Jadon, Vasudha; Tyagi, Anshika; Mushtaq, Muntazir; Jain, Neelu; Singh, Pradeep K.; Singh, Gyanendra P.; Prabhu, K. V.

    2016-01-01

    Genomic selection (GS) is a promising approach that exploits molecular genetic markers to design novel breeding programs and to develop new marker-based models for genetic evaluation. In plant breeding, it provides opportunities to increase the genetic gain of complex traits per unit time and cost. The cost-benefit balance was an important consideration for GS to work in crop plants. The most important factor for its successful and effective implementation in crop species was the availability of genome-wide, high-throughput, cost-effective and flexible markers with low ascertainment bias, suitable for large population sizes and for both model and non-model crop species with or without a reference genome sequence. These requirements were the major limitations of earlier marker systems, viz. SSR and array-based markers, and GS was unimaginable before the availability of next-generation sequencing (NGS) technologies, which have provided novel SNP genotyping platforms, especially genotyping by sequencing. These marker technologies have changed the entire scenario of marker applications and made the use of GS routine for crop improvement in both model and non-model crop species. NGS-based genotyping has increased genomic-estimated breeding value prediction accuracies over other established marker platforms in cereals and other crop species, and has made the dream of GS come true in crop breeding. But to harness the true benefits of GS, these marker technologies will need to be combined with high-throughput phenotyping to achieve valuable genetic gain for complex traits. Moreover, the continuous decline in sequencing cost will make WGS feasible and cost-effective for GS in the near future. Until then, targeted sequencing seems to be the more cost-effective option for large-scale marker discovery and GS, particularly in the case of large and un-decoded genomes. PMID:28083016

  14. Applications of next-generation sequencing to phylogeography and phylogenetics.

    PubMed

    McCormack, John E; Hird, Sarah M; Zellmer, Amanda J; Carstens, Bryan C; Brumfield, Robb T

    2013-02-01

    This is a time of unprecedented transition in DNA sequencing technologies. Next-generation sequencing (NGS) clearly holds promise for fast and cost-effective generation of multilocus sequence data for phylogeography and phylogenetics. However, the focus on non-model organisms, in addition to uncertainty about which sample preparation methods and analyses are appropriate for different research questions and evolutionary timescales, has contributed to a lag in the application of NGS to these fields. Here, we outline some of the major obstacles specific to the application of NGS to phylogeography and phylogenetics, including the focus on non-model organisms, the necessity of obtaining orthologous loci in a cost-effective manner, and the predominant use of gene trees in these fields. We describe the most promising methods of sample preparation that address these challenges. Methods that reduce the genome by restriction digest and manual size selection are most appropriate for studies at the intraspecific level, whereas methods that target specific genomic regions (i.e., target enrichment or sequence capture) have wider applicability from the population level to deep-level phylogenomics. Additionally, we give an overview of how to analyze NGS data to arrive at data sets applicable to the standard toolkit of phylogeography and phylogenetics, including initial data processing to alignment and genotype calling (both SNPs and loci involving many SNPs). Even though whole-genome sequencing is likely to become affordable rather soon, because phylogeography and phylogenetics rely on analysis of hundreds of individuals in many cases, methods that reduce the genome to a subset of loci should remain more cost-effective for some time to come.

  15. Preparation of next-generation sequencing libraries from damaged DNA.

    PubMed

    Briggs, Adrian W; Heyn, Patricia

    2012-01-01

    Next-generation sequencing (NGS) has revolutionized ancient DNA research, especially when combined with high-throughput target enrichment methods. However, attaining high sequencing depth and accuracy from samples often remains problematic due to the damaged state of ancient DNA, in particular the extremely low copy number of ancient DNA and the abundance of uracil residues derived from cytosine deamination that lead to miscoding errors. It is therefore critical to use a highly efficient procedure for conversion of a raw DNA extract into an adaptor-ligated sequencing library, and equally important to reduce errors from uracil residues. We present a protocol for NGS library preparation that allows highly efficient conversion of DNA fragments into an adaptor-ligated form. The protocol incorporates an option to remove the vast majority of uracil miscoding lesions as part of the library preparation process. The procedure requires only two spin column purification steps and no gel purification or bead handling. Starting from an aliquot of DNA extract, a finished, highly amplified library can be generated in 5 h, or under 3 h if uracil removal is not required.

  16. Unraveling genomic variation from next generation sequencing data

    PubMed Central

    2013-01-01

    Elucidating the content of a DNA sequence is critical to understanding and decoding more deeply the genetic information of any biological system. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of these areas, which are being shaped by the new technological advances, are evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAs), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, the analysis and the interpretation of the vast amount of data that is produced is not an easy or trivial task and still remains a great challenge in the field of bioinformatics. Therefore, efficient tools that cope with information overload, tackle the high complexity and provide meaningful visualizations to make knowledge extraction easier are essential. In this article, we briefly refer to the sequencing methodologies and the available equipment to serve these analyses and we describe the data formats of the files which are produced by them. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data, with emphasis on structural variation analysis and comparative genomics. We finally comment on their functionality, strengths and weaknesses and we discuss how future applications could further develop in this field. PMID:23885890

  17. Analysis of Metagenomics Next Generation Sequence Data for Fungal ITS Barcoding: Do You Need Advance Bioinformatics Experience?

    PubMed

    Ahmed, Abdalla

    2016-01-01

    During the last few decades, most microbiology laboratories have become familiar with analyzing Sanger sequence data for ITS barcoding. However, with the availability of next-generation sequencing platforms in many centers, it has become important for medical mycologists to know how to make sense of the massive sequence data generated by these new sequencing technologies. In many reference laboratories, the analysis of such data is not a major problem, since suitable IT infrastructure and well-trained bioinformatics scientists are always available. However, in small research laboratories and clinical microbiology laboratories such resources are always lacking. In this report, a simple and user-friendly bioinformatics workflow is suggested for fast and reproducible ITS barcoding of fungi.

  18. Next-generation sequencing for diagnosis of rare diseases in the neonatal intensive care unit

    PubMed Central

    Daoud, Hussein; Luco, Stephanie M.; Li, Rui; Bareke, Eric; Beaulieu, Chandree; Jarinova, Olga; Carson, Nancy; Nikkel, Sarah M.; Graham, Gail E.; Richer, Julie; Armour, Christine; Bulman, Dennis E.; Chakraborty, Pranesh; Geraghty, Michael; Lines, Matthew A.; Lacaze-Masmonteil, Thierry; Majewski, Jacek; Boycott, Kym M.; Dyment, David A.

    2016-01-01

    Background: Rare diseases often present in the first days and weeks of life and may require complex management in the setting of a neonatal intensive care unit (NICU). Exhaustive consultations and traditional genetic or metabolic investigations are costly and often fail to arrive at a final diagnosis when no recognizable syndrome is suspected. For this pilot project, we assessed the feasibility of next-generation sequencing as a tool to improve the diagnosis of rare diseases in newborns in the NICU. Methods: We retrospectively identified and prospectively recruited newborns and infants admitted to the NICU of the Children’s Hospital of Eastern Ontario and the Ottawa Hospital, General Campus, who had been referred to the medical genetics or metabolics inpatient consult service and had features suggesting an underlying genetic or metabolic condition. DNA from the newborns and parents was enriched for a panel of clinically relevant genes and sequenced on a MiSeq sequencing platform (Illumina Inc.). The data were interpreted with a standard informatics pipeline and reported to care providers, who assessed the importance of genotype–phenotype correlations. Results: Of 20 newborns studied, 8 received a diagnosis on the basis of next-generation sequencing (diagnostic rate 40%). The diagnoses were renal tubular dysgenesis, SCN1A-related encephalopathy syndrome, myotubular myopathy, FTO deficiency syndrome, cranioectodermal dysplasia, congenital myasthenic syndrome, autosomal dominant intellectual disability syndrome type 7 and Denys–Drash syndrome. Interpretation: This pilot study highlighted the potential of next-generation sequencing to deliver molecular diagnoses rapidly with a high success rate. With broader use, this approach has the potential to alter health care delivery in the NICU. PMID:27241786

  19. Suppression Subtractive Hybridization Versus Next-Generation Sequencing in Plant Genetic Engineering: Challenges and Perspectives.

    PubMed

    Sahebi, Mahbod; Hanafi, Mohamed M; Azizi, Parisa; Hakim, Abdul; Ashkani, Sadegh; Abiri, Rambod

    2015-10-01

    Suppression subtractive hybridization (SSH) is an effective method for identifying genes with different expression levels that are involved in a variety of biological processes. This method has often been used to study the molecular mechanisms of plants in complex relationships with different pathogens and under a variety of biotic stresses. Compared to other techniques used in gene expression profiling, SSH needs relatively small amounts of starting material, costs less, and yields fewer false positives. Extraction of total RNA from plant species rich in phenolic compounds, carbohydrates, and polysaccharides, which easily bind to nucleic acids through cellular mechanisms, is difficult and needs to be considered. Remarkable advances have been achieved in the next-generation sequencing (NGS) field. As a result of progress in molecular chemistry and biology as well as specialized engineering, parallelization of the sequencing reaction has greatly increased the number of reads generated per run. Currently available sequencing platforms offer a previously unparalleled view into complex mixtures of RNA as well as DNA samples. NGS technology has demonstrated the ability to sequence DNA with remarkable speed, thereby enabling previously unthinkable scientific accomplishments and novel biological applications. However, the massive amounts of data generated by NGS pose a substantial challenge for data storage and analysis. This review examines some simple but vital points involved in preparing the initial material for SSH and introduces this method as well as its applications for detecting novel genes from different plant species. It also evaluates the general concepts, basic applications, and probable results of NGS technology in genomics, with particular attention to potential tools and bioinformatics.

  20. Trends in Next-Generation Sequencing and a New Era for Whole Genome Sequencing.

    PubMed

    Park, Sang Tae; Kim, Jayoung

    2016-11-01

    This article is a mini-review that provides a general overview for next-generation sequencing (NGS) and introduces one of the most popular NGS applications, whole genome sequencing (WGS), developed from the expansion of human genomics. NGS technology has brought massively high throughput sequencing data to bear on research questions, enabling a new era of genomic research. Development of bioinformatic software for NGS has provided more opportunities for researchers to use various applications in genomic fields. De novo genome assembly and large scale DNA resequencing to understand genomic variations are popular genomic research tools for processing a tremendous amount of data at low cost. Studies on transcriptomes are now available, from previous-hybridization based microarray methods. Epigenetic studies are also available with NGS applications such as whole genome methylation sequencing and chromatin immunoprecipitation followed by sequencing. Human genetics has faced a new paradigm of research and medical genomics by sequencing technologies since the Human Genome Project. The trend of NGS technologies in human genomics has brought a new era of WGS by enabling the building of human genomes databases and providing appropriate human reference genomes, which is a necessary component of personalized medicine and precision medicine.

  1. Trends in Next-Generation Sequencing and a New Era for Whole Genome Sequencing

    PubMed Central

    2016-01-01

    This article is a mini-review that provides a general overview for next-generation sequencing (NGS) and introduces one of the most popular NGS applications, whole genome sequencing (WGS), developed from the expansion of human genomics. NGS technology has brought massively high throughput sequencing data to bear on research questions, enabling a new era of genomic research. Development of bioinformatic software for NGS has provided more opportunities for researchers to use various applications in genomic fields. De novo genome assembly and large scale DNA resequencing to understand genomic variations are popular genomic research tools for processing a tremendous amount of data at low cost. Studies on transcriptomes are now available, from previous-hybridization based microarray methods. Epigenetic studies are also available with NGS applications such as whole genome methylation sequencing and chromatin immunoprecipitation followed by sequencing. Human genetics has faced a new paradigm of research and medical genomics by sequencing technologies since the Human Genome Project. The trend of NGS technologies in human genomics has brought a new era of WGS by enabling the building of human genomes databases and providing appropriate human reference genomes, which is a necessary component of personalized medicine and precision medicine. PMID:27915479

  2. Light-generated oligonucleotide arrays for rapid DNA sequence analysis.

    PubMed Central

    Pease, A C; Solas, D; Sullivan, E J; Cronin, M T; Holmes, C P; Fodor, S P

    1994-01-01

    In many areas of molecular biology there is a need to rapidly extract and analyze genetic information; however, current technologies for DNA sequence analysis are slow and labor intensive. We report here how modern photolithographic techniques can be used to facilitate sequence analysis by generating miniaturized arrays of densely packed oligonucleotide probes. These probe arrays, or DNA chips, can then be applied to parallel DNA hybridization analysis, directly yielding sequence information. In a preliminary experiment, a 1.28 x 1.28 cm array of 256 different octanucleotides was produced in 16 chemical reaction cycles, requiring 4 hr to complete. The hybridization pattern of fluorescently labeled oligonucleotide targets was then detected by epifluorescence microscopy. The fluorescence signals from complementary probes were 5-35 times stronger than those with single or double base-pair hybridization mismatches, demonstrating specificity in the identification of complementary sequences. This method should prove to be a powerful tool for rapid investigations in human genetics and diagnostics, pathogen detection, and DNA molecular recognition. PMID:8197176
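
    The relation between 16 reaction cycles and 256 distinct probes can be illustrated with a small simulation of mask-directed combinatorial synthesis. The sketch assumes four variable positions, each built up in four cycles (one per base), so that 4 x 4 = 16 cycles yield 4^4 = 256 sequences. This arrangement is an assumption chosen because it matches the numbers in the abstract; the abstract itself does not describe the probe layout, and any fixed bases completing the octanucleotides are left out here.

```python
from itertools import product

BASES = "ACGT"
VARIABLE_POSITIONS = 4  # assumption for illustration, not stated in the abstract


def synthesize(bases=BASES, positions=VARIABLE_POSITIONS):
    """Simulate mask-directed synthesis, one base coupling per cycle.

    Each array site is identified by the sequence it is meant to carry.
    In every cycle a mask exposes only the sites that need `base` at
    position `pos`, and that base is coupled to the exposed sites.
    """
    targets = ["".join(p) for p in product(bases, repeat=positions)]
    grown = {target: "" for target in targets}
    cycles = 0
    for pos in range(positions):
        for base in bases:
            for target in targets:
                if target[pos] == base:  # site is exposed by this mask
                    grown[target] += base
            cycles += 1
    return grown, cycles


probes, cycles = synthesize()
print(f"{cycles} cycles produced {len(set(probes.values()))} distinct probes")
```

    The point of the sketch is that the number of cycles grows linearly with probe length while the number of distinct probes grows exponentially.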

  3. BING: biomedical informatics pipeline for Next Generation Sequencing.

    PubMed

    Kriseman, Jeffrey; Busick, Christopher; Szelinger, Szabolcs; Dinu, Valentin

    2010-06-01

    High throughput parallel genomic sequencing (Next Generation Sequencing, NGS) shifts the bottleneck in sequencing processes from experimental data production to computationally intensive informatics-based data analysis. This manuscript introduces a biomedical informatics pipeline (BING) for the analysis of NGS data that offers several novel computational approaches to 1. image alignment, 2. signal correlation, compensation, separation, and pixel-based cluster registration, 3. signal measurement and base calling, 4. quality control and accuracy measurement. These approaches address many of the informatics challenges, including image processing, computational performance, and accuracy. These new algorithms are benchmarked against the Illumina Genome Analysis Pipeline. BING is one of the first software tools to perform pixel-based analysis of NGS data. When compared to the Illumina informatics tool, BING's pixel-based approach produces a significant increase in the number of sequence reads, while reducing the computational time per experiment and error rate (<2%). This approach has the potential of increasing the density and throughput of NGS technologies.

  4. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach

    PubMed Central

    Mammadov, Jafar; Ye, Liang; Soe, Khaing; Richey, Kimberly; Cruse, James; Zhuang, Meibao; Gao, Zhifang; Evans, Clive; Rounsley, Steve; Kumpatla, Siva P.

    2016-01-01

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterize transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive and cost- and labor-effective alternative for molecular characterization compared to traditional Southern blot analysis. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions. PMID:26908260

  5. Improved timing sequence generator on the DIII-D tokamak

    NASA Astrophysics Data System (ADS)

    Colio, R. A.; Finkenthal, D. F.; Deterly, T. M.

    2011-10-01

    The DIII-D tokamak uses a central clock source and trigger system to synchronize plant operations and diagnostics. The system uses a bi-phase encoding technique to send both clock and trigger signals to remote receivers, and supports both pre-programmed sequences of triggers as well as event-driven triggers. A 1 MHz timebase is used and triggers are encoded as eight-bit hexadecimal words. Currently, the system relies on a cascaded series of CAMAC-based delay generators to produce the trigger sequence. We present a modern and more versatile implementation based on a single FPGA (field programmable gate array) capable of providing clock rates upward of 100 MHz while maintaining compatibility with existing equipment. A proposal for system clock synchronization with GPS for improved precision is also presented. Work supported in part by US DOE under DE-FC02-04ER54698 and the National Undergraduate Fellowship in Fusion Science and Engineering.
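    As a rough illustration of how a bi-phase code can carry clock and data on a single line, the sketch below Manchester-encodes an eight-bit trigger word into half-bit levels and decodes it again. The bit convention (0 as high-to-low, 1 as low-to-high) and the function names are assumptions made for this example; the abstract does not specify the exact encoding used at DIII-D.

```python
def biphase_encode(word: int, bits: int = 8) -> list[int]:
    """Encode an integer as a bi-phase (Manchester) half-bit stream, MSB first.

    Assumed convention: 0 -> (1, 0) and 1 -> (0, 1). Every bit cell has a
    mid-cell transition, which is what lets a receiver recover the clock.
    """
    if not 0 <= word < (1 << bits):
        raise ValueError("word does not fit in the requested number of bits")
    halves: list[int] = []
    for i in range(bits - 1, -1, -1):
        bit = (word >> i) & 1
        halves.extend((0, 1) if bit else (1, 0))
    return halves


def biphase_decode(halves: list[int]) -> int:
    """Decode a half-bit stream produced by biphase_encode."""
    if len(halves) % 2:
        raise ValueError("stream must contain an even number of half-bits")
    word = 0
    for first, second in zip(halves[0::2], halves[1::2]):
        if first == second:
            raise ValueError("missing mid-bit transition")
        # With the assumed convention, the second half-bit equals the data bit.
        word = (word << 1) | second
    return word


if __name__ == "__main__":
    stream = biphase_encode(0xA5)
    print(stream, hex(biphase_decode(stream)))
```

    Because each bit cell always contains a transition, a decoder can flag a corrupted cell (two equal half-bits) instead of silently misreading it.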

  6. Actionable Diagnosis of Neuroleptospirosis by Next-Generation Sequencing

    PubMed Central

    Wilson, Michael R.; Naccache, Samia N.; Samayoa, Erik; Biagtan, Mark; Bashir, Hiba; Yu, Guixia; Salamat, Shahriar M.; Somasekar, Sneha; Federman, Scot; Miller, Steve; Sokolic, Robert; Garabedian, Elizabeth; Candotti, Fabio; Buckley, Rebecca H.; Reed, Kurt D.; Meyer, Teresa L.; Seroogy, Christine M.; Galloway, Renee; Henderson, Sheryl L.; Gern, James E.; DeRisi, Joseph L.; Chiu, Charles Y.

    2014-01-01

    SUMMARY A 14-year-old boy with severe combined immunodeficiency presented three times to a medical facility over a period of 4 months with fever and headache that progressed to hydrocephalus and status epilepticus necessitating a medically induced coma. Diagnostic workup including brain biopsy was unrevealing. Unbiased next-generation sequencing of the cerebrospinal fluid identified 475 of 3,063,784 sequence reads (0.016%) corresponding to leptospira infection. Clinical assays for leptospirosis were negative. Targeted antimicrobial agents were administered, and the patient was discharged home 32 days later with a status close to his premorbid condition. Polymerase-chain-reaction (PCR) and serologic testing at the Centers for Disease Control and Prevention (CDC) subsequently confirmed evidence of Leptospira santarosai infection. PMID:24896819
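    The reported read fraction can be checked with a short calculation. The sketch below only reproduces the arithmetic from the figures given in the abstract; the function name is illustrative.

```python
def percent_of_reads(matching_reads: int, total_reads: int) -> float:
    """Return the share of reads assigned to an organism, in percent."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * matching_reads / total_reads


if __name__ == "__main__":
    # Figures from the abstract: 475 leptospira reads of 3,063,784 total reads.
    print(f"{percent_of_reads(475, 3_063_784):.3f}%")
```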

  7. Next-generation sequencing: big data meets high performance computing.

    PubMed

    Schmidt, Bertil; Hildebrandt, Andreas

    2017-02-02

    The progress of next-generation sequencing has a major impact on medical and genomic research. This high-throughput technology can now produce billions of short DNA or RNA fragments in excess of a few terabytes of data in a single run. This leads to massive datasets used by a wide range of applications including personalized cancer treatment and precision medicine. In addition to the hugely increased throughput, the cost of using high-throughput technologies has been dramatically decreasing. A low sequencing cost of around US$1000 per genome has now rendered large population-scale projects feasible. However, to make effective use of the produced data, the design of big data algorithms and their efficient implementation on modern high performance computing systems is required.

  8. Second-generation environmental sequencing unmasks marine metazoan biodiversity

    PubMed Central

    Fonseca, Vera G.; Carvalho, Gary R.; Sung, Way; Johnson, Harriet F.; Power, Deborah M.; Neill, Simon P.; Packer, Margaret; Blaxter, Mark L.; Lambshead, P. John D.; Thomas, W. Kelley; Creer, Simon

    2010-01-01

    Biodiversity is of crucial importance for ecosystem functioning, sustainability and resilience, but the magnitude and organization of marine diversity at a range of spatial and taxonomic scales are undefined. In this paper, we use second-generation sequencing to unmask putatively diverse marine metazoan biodiversity in a Scottish temperate benthic ecosystem. We show that remarkable differences in diversity occurred at microgeographical scales and refute currently accepted ecological and taxonomic paradigms of meiofaunal identity, rank abundance and concomitant understanding of trophic dynamics. Richness estimates from the current benchmarked Operational Clustering of Taxonomic Units from Parallel UltraSequencing analyses are broadly aligned with those derived from morphological assessments. However, the slope of taxon rarefaction curves for many phyla remains incomplete, suggesting that the true alpha diversity is likely to exceed current perceptions. The approaches provide a rapid, objective and cost-effective taxonomic framework for exploring links between ecosystem structure and function of all hitherto intractable, but ecologically important, communities. PMID:20981026

  9. Using next generation transcriptome sequencing to predict an ectomycorrhizal metabolome

    PubMed Central

    2011-01-01

    Background Mycorrhizae, symbiotic interactions between soil fungi and tree roots, are ubiquitous in terrestrial ecosystems. The fungi contribute phosphorus, nitrogen and mobilized nutrients from organic matter in the soil and in return the fungus receives photosynthetically-derived carbohydrates. This union of plant and fungal metabolisms is the mycorrhizal metabolome. Understanding this symbiotic relationship at a molecular level provides important contributions to the understanding of forest ecosystems and global carbon cycling. Results We generated next generation short-read transcriptomic sequencing data from fully-formed ectomycorrhizae between Laccaria bicolor and aspen (Populus tremuloides) roots. The transcriptomic data was used to identify statistically significantly expressed gene models using a bootstrap-style approach, and these expressed genes were mapped to specific metabolic pathways. Integration of expressed genes that code for metabolic enzymes and the set of expressed membrane transporters generates a predictive model of the ectomycorrhizal metabolome. The generated model of the mycorrhizal metabolome predicts that the specific compounds glycine, glutamate, and allantoin are synthesized by L. bicolor and that these compounds or their metabolites may be used for the benefit of aspen in exchange for the photosynthetically-derived sugars fructose and glucose. Conclusions The analysis illustrates an approach to generate testable biological hypotheses to investigate the complex molecular interactions that drive ectomycorrhizal symbiosis. These models are consistent with experimental environmental data and provide insight into the molecular exchange processes for organisms in this complex ecosystem. The method used here for predicting metabolomic models of mycorrhizal systems from deep RNA sequencing data can be generalized and is broadly applicable to transcriptomic data derived from complex systems. PMID:21569493

  10. Next-generation sequencing technology in clinical virology.

    PubMed

    Capobianchi, M R; Giombini, E; Rozera, G

    2013-01-01

    Recent advances in nucleic acid sequencing technologies, referred to as 'next-generation' sequencing (NGS), have produced a true revolution and opened new perspectives for research and diagnostic applications, owing to the high speed and throughput of data generation. So far, NGS has been applied to metagenomics-based strategies for the discovery of novel viruses and the characterization of viral communities. Additional applications include whole viral genome sequencing, detection of viral genome variability, and the study of viral dynamics. These applications are particularly suitable for viruses such as human immunodeficiency virus, hepatitis B virus, and hepatitis C virus, whose error-prone replication machinery, combined with the high replication rate, results, in each infected individual, in the formation of many genetically related viral variants referred to as quasi-species. The viral quasi-species, in turn, represents the substrate for the selective pressure exerted by the immune system or by antiviral drugs. With traditional approaches, it is difficult to detect and quantify minority genomes present in viral quasi-species that, in fact, may have biological and clinical relevance. NGS provides, for each patient, a dataset of clonal sequences that is some orders of magnitude larger than those obtained with conventional approaches. Hence, NGS is an extremely powerful tool with which to investigate previously inaccessible aspects of viral dynamics, such as the contribution of different viral reservoirs to replicating virus in the course of the natural history of the infection, co-receptor usage in minority viral populations harboured by different cell lineages, the dynamics of development of drug resistance, and the re-emergence of hidden genomes after treatment interruptions. The diagnostic application of NGS is just around the corner.

  11. A prototypic microfluidic platform generating stepwise concentration gradients for real-time study of cell apoptosis.

    PubMed

    Dai, Wen; Zheng, Yizhe; Luo, Kathy Qian; Wu, Hongkai

    2010-04-16

    This work describes the development of a prototypic microfluidic platform for the generation of stepwise concentration gradients of drugs. A sensitive apoptotic analysis method is integrated into this microfluidic system for studying apoptosis of HeLa cells under the influence of the anticancer drug etoposide at various concentrations in parallel; it measures the yellow fluorescent protein-cyan fluorescent protein fluorescence resonance energy transfer (FRET) signal that responds to the activation of caspase-3, an indicator of cell apoptosis. Sets of microfluidic valves on the chip generate a stepwise concentration gradient of etoposide in various cell-culture microchambers. The FRET signals from multiple chambers are simultaneously monitored under a fluorescent microscope for long-time observation and the on-chip results are compared with those from a 96-well plate study and the methylthiazolyldiphenyl-tetrazolium bromide (MTT) assay. The microfluidic platform shows several advantages including high-throughput capacity, low drug consumption, and high sensitivity.

  12. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance

    PubMed Central

    Ahrenfeldt, Johanne; Cisneros, Jose Luis Bellod; Jurtz, Vanessa; Larsen, Mette Voldby; Hasman, Henrik; Aarestrup, Frank Møller; Lund, Ole

    2016-01-01

    Recent advances in whole genome sequencing have made the technology available for routine use in microbiological laboratories. However, a major obstacle for using this technology is the availability of simple and automatic bioinformatics tools. Based on previously published and already available web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the individual services. The reported results enable a rapid overview of the major results, and comparing that to the previously found results showed that the platform is reliable and able to correctly predict the species and find most of the expected genes automatically. In conclusion, a combined bioinformatics platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https://cge.cbs.dtu.dk/services/CGEpipeline-1.1 and it is the intention that it will continue to be expanded with new features as these become available. PMID:27327771

  13. The Antibody Genetics of Multiple Sclerosis: Comparing Next-Generation Sequencing to Sanger Sequencing

    PubMed Central

    Rounds, William H.; Ligocki, Ann J.; Levin, Mikhail K.; Greenberg, Benjamin M.; Bigwood, Douglas W.; Eastman, Eric M.; Cowell, Lindsay G.; Monson, Nancy L.

    2014-01-01

    We previously identified a distinct mutation pattern in the antibody genes of B cells isolated from cerebrospinal fluid (CSF) that can identify patients who have relapsing-remitting multiple sclerosis (RRMS) and patients with clinically isolated syndromes who will convert to RRMS. This antibody gene signature (AGS) was developed using Sanger sequencing of single B cells. While potentially helpful to patients, Sanger sequencing is not an assay that can be practically deployed in clinical settings. In order to provide AGS evaluations to patients as part of their diagnostic workup, we developed protocols to generate AGS scores using next-generation DNA sequencing (NGS) on CSF-derived cell pellets without the need to isolate single cells. This approach has the potential to increase the coverage of the B-cell population being analyzed, reduce the time needed to generate AGS scores, and may improve the overall performance of the AGS approach as a diagnostic test in the future. However, no investigations have focused on whether NGS-based repertoires will properly reflect antibody gene frequencies and somatic hypermutation patterns defined by Sanger sequencing. To address this issue, we isolated paired CSF samples from eight patients who either had MS or were at risk to develop MS. Here, we present data that antibody gene frequencies and somatic hypermutation patterns are similar in Sanger and NGS-based antibody repertoires from these paired CSF samples. In addition, AGS scores derived from the NGS database correctly identified the patients who initially had or subsequently converted to RRMS, with precision similar to that of the Sanger sequencing approach. Further investigation of the utility of the AGS in predicting conversion to MS using NGS-derived antibody repertoires in a larger cohort of patients is warranted. PMID:25278930

  14. An integrated approach for analyzing clinical genomic variant data from next-generation sequencing.

    PubMed

    Crowgey, Erin L; Stabley, Deborah L; Chen, Chuming; Huang, Hongzhan; Robbins, Katherine M; Polson, Shawn W; Sol-Church, Katia; Wu, Cathy H

    2015-04-01

    Next-generation sequencing (NGS) technologies provide the potential for developing high-throughput and low-cost platforms for clinical diagnostics. A limiting factor to clinical applications of genomic NGS is downstream bioinformatics analysis for data interpretation. We have developed an integrated approach for end-to-end clinical NGS data analysis from variant detection to functional profiling. Robust bioinformatics pipelines were implemented for genome alignment, single nucleotide polymorphism (SNP), small insertion/deletion (InDel), and copy number variation (CNV) detection of whole exome sequencing (WES) data from the Illumina platform. Quality-control metrics were analyzed at each step of the pipeline by use of a validated training dataset to ensure data integrity for clinical applications. We annotate the variants with data regarding the disease population and variant impact. Custom algorithms were developed to filter variants based on criteria, such as quality of variant, inheritance pattern, and impact of variant on protein function. The developed clinical variant pipeline links the identified rare variants to Integrated Genome Viewer for visualization in a genomic context and to the Protein Information Resource's iProXpress for rich protein and disease information. With the application of our system of annotations, prioritizations, inheritance filters, and functional profiling and analysis, we have created a unique methodology for downstream variant filtering that empowers clinicians and researchers to interpret more effectively the relevance of genomic alterations within a rare genetic disease.
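    The filtering criteria named in the abstract (quality of variant, inheritance pattern, impact on protein function) can be pictured as a simple predicate over annotated variants. The sketch below is hypothetical: the field names, category labels, and thresholds are assumptions for illustration and do not come from the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    quality: float          # assumed Phred-like call quality
    inheritance: str        # hypothetical labels, e.g. "de_novo", "recessive"
    impact: str             # hypothetical labels: "high", "moderate", "low"
    population_freq: float  # frequency in a reference population (0..1)


def filter_variants(
    variants: list[Variant],
    min_quality: float = 30.0,
    inheritance: tuple[str, ...] = ("de_novo", "recessive"),
    impacts: tuple[str, ...] = ("high", "moderate"),
    max_freq: float = 0.01,
) -> list[Variant]:
    """Keep variants that pass every criterion (all thresholds are assumptions)."""
    return [
        v
        for v in variants
        if v.quality >= min_quality
        and v.inheritance in inheritance
        and v.impact in impacts
        and v.population_freq <= max_freq
    ]


if __name__ == "__main__":
    calls = [
        Variant("GENE_A", 55.0, "de_novo", "high", 0.0001),
        Variant("GENE_B", 12.0, "de_novo", "high", 0.0001),   # fails quality
        Variant("GENE_C", 60.0, "dominant", "low", 0.2),      # fails the rest
    ]
    print([v.gene for v in filter_variants(calls)])
```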

  15. An Integrated Approach for Analyzing Clinical Genomic Variant Data from Next-Generation Sequencing

    PubMed Central

    Stabley, Deborah L.; Chen, Chuming; Huang, Hongzhan; Robbins, Katherine M.; Polson, Shawn W.; Sol-Church, Katia; Wu, Cathy H.

    2015-01-01

    Next-generation sequencing (NGS) technologies provide the potential for developing high-throughput and low-cost platforms for clinical diagnostics. A limiting factor to clinical applications of genomic NGS is downstream bioinformatics analysis for data interpretation. We have developed an integrated approach for end-to-end clinical NGS data analysis from variant detection to functional profiling. Robust bioinformatics pipelines were implemented for genome alignment, single nucleotide polymorphism (SNP), small insertion/deletion (InDel), and copy number variation (CNV) detection of whole exome sequencing (WES) data from the Illumina platform. Quality-control metrics were analyzed at each step of the pipeline by use of a validated training dataset to ensure data integrity for clinical applications. We annotate the variants with data regarding the disease population and variant impact. Custom algorithms were developed to filter variants based on criteria, such as quality of variant, inheritance pattern, and impact of variant on protein function. The developed clinical variant pipeline links the identified rare variants to Integrated Genome Viewer for visualization in a genomic context and to the Protein Information Resource’s iProXpress for rich protein and disease information. With the application of our system of annotations, prioritizations, inheritance filters, and functional profiling and analysis, we have created a unique methodology for downstream variant filtering that empowers clinicians and researchers to interpret more effectively the relevance of genomic alterations within a rare genetic disease. PMID:25649353

  16. A Next-Generation Sequencing Method for Genotyping-by-Sequencing of Highly Heterozygous Autotetraploid Potato

    PubMed Central

    Uitdewilligen, Jan G. A. M. L.; Wolters, Anne-Marie A.; D’hoop, Bjorn B.; Borm, Theo J. A.; Visser, Richard G. F.; van Eck, Herman J.

    2013-01-01

    Assessment of genomic DNA sequence variation and genotype calling in autotetraploids implies the ability to distinguish among five possible alternative allele copy number states. This study demonstrates the accuracy of genotyping-by-sequencing (GBS) of a large collection of autotetraploid potato cultivars using next-generation sequencing. It is still costly to reach sufficient read depths on a genome wide scale, across the cultivated gene pool. Therefore, we enriched cultivar-specific DNA sequencing libraries using an in-solution hybridisation method (SureSelect). This complexity reduction allowed us to confine our study to 807 target genes distributed across the genomes of 83 tetraploid cultivars and one reference (DM 1–3 511). Indexed sequencing libraries were paired-end sequenced in 7 pools of 12 samples using Illumina HiSeq2000. After filtering and processing the raw sequence data, 12.4 Gigabases of high-quality sequence data was obtained, which mapped to 2.1 Mb of the potato reference genome, with a median average read depth of 63× per cultivar. We detected 129,156 sequence variants and genotyped the allele copy number of each variant for every cultivar. In this cultivar panel a variant density of 1 SNP/24 bp in exons and 1 SNP/15 bp in introns was obtained. The average minor allele frequency (MAF) of a variant was 0.14. Potato germplasm displayed a large number of relatively rare variants and/or haplotypes, with 61% of the variants having a MAF below 0.05. A very high average nucleotide diversity (π = 0.0107) was observed. Nucleotide diversity varied among potato chromosomes. Several genes under selection were identified. Genotyping-by-sequencing results, with allele copy number estimates, were validated with a KASP genotyping assay. This validation showed that read depths of ∼60–80× can be used as a lower boundary for reliable assessment of allele copy number of sequence variants in autotetraploids. Genotypic data were associated with traits, and
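    To make the "five possible allele copy number states" concrete: in an autotetraploid, the alternative allele can be present in 0, 1, 2, 3, or 4 copies, so the expected fraction of reads carrying it is 0, 0.25, 0.5, 0.75, or 1. The sketch below picks the most likely state from read counts with a simple binomial model. This is an illustrative assumption, not the genotype caller used in the study, and the symmetric error rate of 1% is likewise assumed.

```python
from math import log


def call_dosage(alt_reads: int, depth: int, ploidy: int = 4, error: float = 0.01) -> int:
    """Return the most likely alternative-allele copy number (0..ploidy).

    Assumes reads are independent draws and a symmetric per-read error rate,
    so the binomial coefficient is constant across states and can be dropped.
    """
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must lie between 0 and depth")
    if not 0.0 < error < 0.5:
        raise ValueError("error must be in (0, 0.5)")
    best_dosage, best_loglik = 0, float("-inf")
    for dosage in range(ploidy + 1):
        frac = dosage / ploidy
        # Probability that a single read shows the alternative allele.
        p_alt = frac * (1.0 - error) + (1.0 - frac) * error
        loglik = alt_reads * log(p_alt) + (depth - alt_reads) * log(1.0 - p_alt)
        if loglik > best_loglik:
            best_dosage, best_loglik = dosage, loglik
    return best_dosage


if __name__ == "__main__":
    # Depth of 63 matches the median read depth reported in the abstract.
    for alt in (0, 16, 32, 47, 63):
        print(alt, call_dosage(alt, 63))
```

    At low depth the expected read fractions of neighbouring states overlap heavily, which is in line with the abstract's point that reliable copy number assessment needs substantial read depth.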

  17. Computational characterisation of cancer molecular profiles derived using next generation sequencing

    PubMed Central

    Oleksiewicz, Urszula; Tomczak, Katarzyna; Woropaj, Jakub; Markowska, Monika; Stępniak, Piotr

    2015-01-01

    Our current understanding of cancer genetics is grounded on the principle that cancer arises from a clone that has accumulated the requisite somatically acquired genetic aberrations, leading to the malignant transformation. It also results in aberrant gene and protein expression. Next generation sequencing (NGS) or deep sequencing platforms are being used to create large catalogues of changes in copy numbers, mutations, structural variations, gene fusions, gene expression, and other types of information for cancer patients. However, inferring different types of biological changes from raw reads generated using the sequencing experiments is algorithmically and computationally challenging. In this article, we outline common steps for the quality control and processing of NGS data. We highlight the importance of accurate and application-specific alignment of these reads and the methodological steps and challenges in obtaining different types of information. We comment on the importance of integrating these data and building infrastructure to analyse it. We also provide exhaustive lists of available software to obtain information and point the readers to articles comparing software for deeper insight in specialised areas. We hope that the article will guide readers in choosing the right tools for analysing oncogenomic datasets. PMID:25691827

  18. Investigation of a steam generator tube rupture sequence using VICTORIA

    SciTech Connect

    Bixler, N.E.; Erickson, C.M.; Schaperow, J.H.

    1995-12-31

    VICTORIA-92 is a mechanistic computer code for analyzing fission product behavior within the reactor coolant system (RCS) during a severe reactor accident. It provides detailed predictions of the release of radionuclides and nonradioactive materials from the core and transport of these materials within the RCS. The modeling accounts for the chemical and aerosol processes that affect radionuclide behavior. Coupling of detailed chemistry and aerosol packages is a unique feature of VICTORIA; it allows exploration of phenomena involving deposition, revaporization, and re-entrainment that cannot be resolved with other codes. The purpose of this work is to determine the attenuation of fission products in the RCS and on the secondary side of the steam generator in an accident initiated by a steam generator tube rupture (SGTR). As a class, bypass sequences have been identified in NUREG-1150 as being risk dominant for the Surry and Sequoyah pressurized water reactor (PWR) plants.

  19. Detection of Inter-Lineage Natural Recombination in Avian Paramyxovirus Serotype 1 Using Simplified Deep Sequencing Platform

    PubMed Central

    Satharasinghe, Dilan A.; Murulitharan, Kavitha; Tan, Sheau W.; Yeap, Swee K.; Munir, Muhammad; Ideris, Aini; Omar, Abdul R.

    2016-01-01

    Newcastle disease virus (NDV) is a prototype member of avian paramyxovirus serotype 1 (APMV-1), which causes severe and contagious disease in commercial poultry and wild birds. Despite extensive vaccination programs and other control measures, the disease remains endemic around the globe, especially in Asia, Africa, and the Middle East. Because APMV-1 is a single serotype, genotype II-based vaccines have remained the most acceptable means of immunization. However, evidence is emerging of vaccine failures, mainly due to the evolving nature of the virus and wider genetic gaps between vaccine and field strains of APMV-1. Most of the epidemiological and genetic characterizations of APMVs are based on conventional methods, which are prone to mask the diverse population of viruses in complex samples. In this study, we report the application of a simple, robust, and less resource-demanding methodology for the whole genome sequencing of NDV, using next-generation sequencing (NGS) on the Illumina MiSeq platform. Using this platform, we sequenced full genomes of five virulent Malaysian NDV strains collected during 2004–2013. All isolates clustered within highly prevalent lineage 5 (specifically in lineage 5a); however, a significantly greater genetic divergence was observed in isolates collected from 2004 to 2011. Interestingly, genetic characterization of one isolate collected in 2013 (IBS025/13) showed natural recombination between lineage 2 and lineage 5. In the recombination event, the isolate (IBS025/13) carried the nucleocapsid protein (nucleotides [nts] 55–1801) and near-complete phosphoprotein (nts 1804–3254) genes of lineage 2, whereas the surface glycoprotein (fusion, hemagglutinin-neuraminidase) and large polymerase genes were of lineage 5. Additionally, the recombinant virus has a genome size of 15,186 nts, which is characteristic of the old genotypes I–IV isolated from 1930 to 1960. Taken together, we report the occurrence of a natural recombination in circulating strains of

  20. A Comprehensive Platform for NGS Data Analysis

    SciTech Connect

    Kravitz, Saul

    2010-06-03

    Saul Kravitz of CLC Bio discusses the company's Genomic Workbench and how it can be used with data from next generation sequencing platforms on June 3, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  1. Functional genomics of a living fossil tree, Ginkgo, based on next-generation sequencing technology.

    PubMed

    Lin, Xiaohan; Zhang, Jin; Li, Ying; Luo, Hongmei; Wu, Qiong; Sun, Chao; Song, Jingyuan; Li, Xiwen; Wei, Jianhe; Lu, Aiping; Qian, Zhongzhi; Khan, Ikhlas A; Chen, Shilin

    2011-11-01

    Ginkgo biloba is a monotypic species native to China and has old, dioecious, medicinally important characteristics. The functional genes related to these characteristics have not been effectively explored due to a limited number of expressed sequence tags (ESTs) from Ginkgo. To discover novel functional genes efficiently and to understand the development of a living fossil tree, Ginkgo, we used massive parallel pyrosequencing on the Roche 454 GS FLX Titanium platform to generate 64 057 ESTs. The ESTs combined with the 21 590 Ginkgo ESTs in GenBank were assembled into 22 304 unique putative transcripts, in which 13 922 novel unique putative transcripts were identified by 454 sequencing. After being assigned to putative functions with Gene Ontology terms, a detailed view of the Ginkgo biological systems was displayed, including characterization of unique putative transcripts with homology to known key enzymes and transcription factors involved in ginkgolide/bilobalide and flavonoid biosynthetic pathways, as well as unique putative transcripts related to development, response to disease and defence. The fact that three full-length Ginkgo genes encoding key enzymes were found and cloned suggests that high-throughput sequencing technology is superior to the traditional gene-by-gene approach in the discovery of genes. Additionally, a total of 204 simple sequence repeat motifs were detected. Our study not only lays the foundations for transcriptome-led studies in biosynthetic mechanisms, but also contributes significantly to the understanding of functional genomics and development in non-model plants.

  2. Evaluation of next generation sequencing for the analysis of Eimeria communities in wildlife.

    PubMed

    Vermeulen, Elke T; Lott, Matthew J; Eldridge, Mark D B; Power, Michelle L

    2016-05-01

    Next-generation sequencing (NGS) techniques are well-established for studying bacterial communities but not yet for microbial eukaryotes. Parasite communities remain poorly studied, due in part to the lack of reliable and accessible molecular methods to analyse eukaryotic communities. We aimed to develop and evaluate a methodology to analyse communities of the protozoan parasite Eimeria from populations of the Australian marsupial Petrogale penicillata (brush-tailed rock-wallaby) using NGS. An oocyst purification method for small sample sizes and polymerase chain reaction (PCR) protocol for the 18S rRNA locus targeting Eimeria was developed and optimised prior to sequencing on the Illumina MiSeq platform. A data analysis approach was developed by modifying methods from bacterial metagenomics and utilising existing Eimeria sequences in GenBank. Operational taxonomic unit (OTU) assignment at a high similarity threshold (97%) was more accurate at assigning Eimeria contigs into Eimeria OTUs but at a lower threshold (95%) there was greater resolution between OTU consensus sequences. The assessment of two amplification PCR methods prior to Illumina MiSeq, single and nested PCR, determined that single PCR was more sensitive to Eimeria as more Eimeria OTUs were detected in single amplicons. We have developed a simple and cost-effective approach to a data analysis pipeline for community analysis of eukaryotic organisms using Eimeria communities as a model. The pipeline provides a basis for evaluation using other eukaryotic organisms and potential for diverse community analysis studies.
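    The effect of the similarity threshold on OTU assignment can be seen with a toy greedy clustering. The sketch below is an assumption-laden illustration, not the pipeline from the study: it treats sequences as pre-aligned and equal in length, measures identity as the fraction of matching positions, and assigns each sequence to the first OTU whose seed it matches at or above the threshold.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs: list[str], threshold: float = 0.97) -> list[list[str]]:
    """Greedy clustering: the first member of each OTU acts as its seed."""
    otus: list[list[str]] = []
    for seq in seqs:
        for otu in otus:
            if identity(seq, otu[0]) >= threshold:
                otu.append(seq)
                break
        else:
            # No existing seed is similar enough, so this sequence starts a new OTU.
            otus.append([seq])
    return otus


if __name__ == "__main__":
    base = "A" * 100
    s96 = "C" * 4 + "A" * 96   # 96% identical to base
    s98 = "C" * 2 + "A" * 98   # 98% identical to base
    print(len(greedy_otus([base, s96, s98], 0.97)))  # stricter threshold
    print(len(greedy_otus([base, s96, s98], 0.95)))  # looser threshold
```

    With these toy sequences the stricter threshold splits the 96%-identical sequence into its own OTU, while the looser threshold merges all three.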

  3. Application of next-generation sequencing technologies in Neurology.

    PubMed

    Jiang, Teng; Tan, Meng-Shan; Tan, Lan; Yu, Jin-Tai

    2014-12-01

    Genetic risk factors that underlie many rare and common neurological diseases remain poorly understood because of the multi-factorial and heterogeneous nature of these disorders. Although genome-wide association studies (GWAS) have successfully uncovered numerous susceptibility genes for these diseases, odds ratios associated with risk alleles are generally low and account for only a small proportion of estimated heritability. These results imply that rare (present in <5% of the population) but not causative variants, which usually have large effect sizes and cannot be captured by GWAS, are involved in the pathogenesis of these diseases. With the decreasing cost of next-generation sequencing (NGS) technologies, whole-genome sequencing (WGS) and whole-exome sequencing (WES) have enabled the rapid identification of rare variants with large effect sizes, which has brought major progress in understanding the basis of many Mendelian neurological conditions as well as complex neurological diseases. In this article, recent NGS-based studies that aimed to investigate genetic causes for neurological diseases, including Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, stroke, amyotrophic lateral sclerosis and spinocerebellar ataxias, are reviewed. In addition, we discuss the future directions of NGS applications.

  4. Generation of animation sequences of three dimensional models

    NASA Technical Reports Server (NTRS)

    Poi, Sharon (Inventor); Bell, Brad N. (Inventor)

    1990-01-01

    The invention is directed toward a method and apparatus for generating an animated sequence through the movement of three-dimensional graphical models. A plurality of pre-defined graphical models are stored and manipulated in response to interactive commands or by means of a pre-defined command file. The models may be combined as part of a hierarchical structure to represent physical systems without need to create a separate model which represents the combined system. System motion is simulated through the introduction of translation, rotation and scaling parameters upon a model within the system. The motion is then transmitted down through the system hierarchy of models in accordance with hierarchical definitions and joint movement limitations. The present invention also calls for a method of editing hierarchical structure in response to interactive commands or a command file such that a model may be included, deleted, copied or moved within multiple system model hierarchies. The present invention also calls for the definition of multiple viewpoints or cameras which may exist as part of a system hierarchy or as an independent camera. The simulated movement of the models and systems is graphically displayed on a monitor and a frame is recorded by means of a video controller. Multiple movement and hierarchy manipulations are then recorded as a sequence of frames which may be played back as an animation sequence on a video cassette recorder.

  5. Applications for next-generation sequencing in fish ecotoxicogenomics

    PubMed Central

    Mehinto, Alvine C.; Martyniuk, Christopher J.; Spade, Daniel J.; Denslow, Nancy D.

    2012-01-01

    The new technologies for next-generation sequencing (NGS) and global gene expression analyses that are widely used in molecular medicine are increasingly applied to the field of fish biology. This has facilitated new directions to address research areas that could not be previously considered due to the lack of molecular information for ecologically relevant species. Over the past decade, the cost of NGS has decreased significantly, making it possible to use non-model fish species to investigate emerging environmental issues. NGS technologies have permitted researchers to obtain large amounts of raw data in short periods of time. There have also been significant improvements in bioinformatics to assemble the sequences and annotate the genes, thus facilitating the management of these large datasets. The combination of DNA sequencing and bioinformatics has improved our abilities to design custom microarrays and study the genome and transcriptome of a wide variety of organisms. Despite the promising results obtained using these techniques in fish studies, NGS technologies are currently underused in ecotoxicogenomics and few studies have employed these methods. These issues should be addressed in order to exploit the full potential of NGS in ecotoxicological studies and expand our understanding of the biology of non-model organisms. PMID:22539934

  6. Next generation sequencing technologies: tool to study avian virus diversity.

    PubMed

    Kapgate, S S; Barbuddhe, S B; Kumanan, K

    2015-03-01

    Increased globalisation, climatic changes and the wildlife-livestock interface have led to the emergence of novel viral pathogens and zoonoses that have become a serious concern for avian, animal and human health. High biodiversity and bird migration facilitate the spread of pathogens and provide reservoirs for emerging infectious diseases. Classical diagnostic methods, which are designed to be virus-specific or limited to a group of viral agents, hinder the identification of novel viruses or viral variants. Recently developed next-generation sequencing (NGS) approaches provide culture-independent methods that are useful for understanding viral diversity and discovering novel viruses, thereby enabling better diagnosis and disease control. This review discusses the possible steps of an NGS study that uses sequence-independent amplification, high-throughput sequencing and bioinformatics approaches to identify novel avian viruses and their diversity. NGS has led to the identification of a wide range of new viruses, such as picobirnavirus, picornavirus, orthoreovirus and an avian gamma coronavirus associated with fulminating disease in guinea fowl, and is also used to describe viral diversity among avian species. The review also briefly discusses virus-host interactions and disease causality associated with newly identified avian viruses.

  7. Next Generation Sequencing in Predicting Gene Function in Podophyllotoxin Biosynthesis

    PubMed Central

    Marques, Joaquim V.; Kim, Kye-Won; Lee, Choonseok; Costa, Michael A.; May, Gregory D.; Crow, John A.; Davin, Laurence B.; Lewis, Norman G.

    2013-01-01

    Podophyllum species are sources of (−)-podophyllotoxin, an aryltetralin lignan used for semi-synthesis of various powerful and extensively employed cancer-treating drugs. Its biosynthetic pathway, however, remains largely unknown, with the last unequivocally demonstrated intermediate being (−)-matairesinol. Herein, massively parallel sequencing of Podophyllum hexandrum and Podophyllum peltatum transcriptomes and subsequent bioinformatics analyses of the corresponding assemblies were carried out. Validation of the assembly process was first achieved through confirmation of assembled sequences with those of various genes previously established as involved in podophyllotoxin biosynthesis as well as other candidate biosynthetic pathway genes. This contribution describes characterization of two of the latter, namely the cytochrome P450s, CYP719A23 from P. hexandrum and CYP719A24 from P. peltatum. Both enzymes were capable of converting (−)-matairesinol into (−)-pluviatolide by catalyzing methylenedioxy bridge formation and did not act on other possible substrates tested. Interestingly, the enzymes described herein were highly similar to methylenedioxy bridge-forming enzymes from alkaloid biosynthesis, whereas candidates more similar to lignan biosynthetic enzymes were catalytically inactive with the substrates employed. This overall strategy has thus enabled facile further identification of enzymes putatively involved in (−)-podophyllotoxin biosynthesis and underscores the deductive power of next generation sequencing and bioinformatics to probe and deduce medicinal plant biosynthetic pathways. PMID:23161544

  8. Application of next-generation sequencing technologies in Neurology

    PubMed Central

    Jiang, Teng; Tan, Meng-Shan

    2014-01-01

    Genetic risk factors that underlie many rare and common neurological diseases remain poorly understood because of the multi-factorial and heterogeneous nature of these disorders. Although genome-wide association studies (GWAS) have successfully uncovered numerous susceptibility genes for these diseases, odds ratios associated with risk alleles are generally low and account for only a small proportion of estimated heritability. These results imply that rare variants (present in <5% of the population) that are not by themselves causative are involved in the pathogenesis of these diseases; such variants usually have large effect sizes and cannot be captured by GWAS. With the decreasing cost of next-generation sequencing (NGS) technologies, whole-genome sequencing (WGS) and whole-exome sequencing (WES) have enabled the rapid identification of rare variants with large effect sizes, leading to major progress in understanding the basis of many Mendelian neurological conditions as well as complex neurological diseases. This article reviews recent NGS-based studies that investigated genetic causes of neurological diseases, including Alzheimer’s disease, Parkinson’s disease, epilepsy, multiple sclerosis, stroke, amyotrophic lateral sclerosis and spinocerebellar ataxias. We also discuss future directions for NGS applications. PMID:25568878

  9. High Resolution Near Surface 3D Seismic Experiments: A Carbonate Platform vs. a Siliciclastic Sequence

    NASA Astrophysics Data System (ADS)

    Filippidou, N.; Drijkoningen, G.; Braaksma, H.; Verwer, K.; Kenter, J.

    2005-05-01

    Interest in high-resolution 3D seismic experiments for imaging shallow targets has increased over the past years. Many published case studies show that producing clear seismic images with this non-invasive method is still a challenge. We use two test sites where nearby outcrops are present, so that an accurate geological model can be built and the seismic result validated. The first so-called natural field laboratory is located in Boulonnais (N. France). It is an Upper Jurassic siliciclastic sequence, the age equivalent of the source rock of the North Sea. The second is located at Cap Blanc, to the southwest of the island of Mallorca (Spain), and is an excellent example of a Miocene prograding reef platform (Llucmajor Platform); it is a textbook analog for carbonate reservoirs. In both cases, the multidisciplinary experiment included multicomponent and quasi-3D or 3D seismic recordings. The target depth does not exceed 120 m. Vertical and shear portable vibrators were used as sources. In the center of the setups, boreholes were drilled and Vertical Seismic Profiles (VSP) were shot, along with core and borehole measurements both in situ and in the laboratory. These two geologically different sites, with different seismic stratigraphy, have provided us with exceptionally high-resolution seismic images. In general, the seismic data were processed following more or less standard procedures; on the Mallorca data, a few innovative techniques, such as rotation of horizontal components, a 3D F-K filter and the addition of parallel profiles, improved the seismic image. In this paper we discuss the basic differences as seen on the seismic sections. The Boulonnais data present highly continuous reflection patterns of extremely high resolution. This facilitated a high-resolution stratigraphic description. Results from the VSP showed substantial wave energy attenuation. However, the high-fold (330 traces) Mallorca seismic experiment returned a rather discontinuous pattern of possible reflectors

  10. Application of next-generation sequencing technologies in virology.

    PubMed

    Radford, Alan D; Chapman, David; Dixon, Linda; Chantrey, Julian; Darby, Alistair C; Hall, Neil

    2012-09-01

    The progress of science is punctuated by the advent of revolutionary technologies that provide new ways and scales to formulate scientific questions and advance knowledge. Following on from electron microscopy, cell culture and PCR, next-generation sequencing is one of these methodologies that is now changing the way that we understand viruses, particularly in the areas of genome sequencing, evolution, ecology, discovery and transcriptomics. Possibilities for these methodologies are only limited by our scientific imagination and, to some extent, by their cost, which has restricted their use to relatively small numbers of samples. Challenges remain, including the storage and analysis of the large amounts of data generated. As the chemistries employed mature, costs will decrease. In addition, improved methods for analysis will become available, opening yet further applications in virology including routine diagnostic work on individuals, and new understanding of the interaction between viral and host transcriptomes. An exciting era of viral exploration has begun, and will set us new challenges to understand the role of newly discovered viral diversity in both disease and health.

  11. [Next generation sequencing for the diagnostics and epidemiology of tuberculosis].

    PubMed

    Comas, Iñaki; Gil, Ana

    2016-07-01

    Tuberculosis (TB) has overtaken HIV (human immunodeficiency virus) and malaria as the leading cause of death by an infectious disease worldwide. TB incidence is declining by a modest 2% per year; at this rate, it would take 200 years to eradicate the disease. Part of the problem is that TB control tools are decades old and can no longer help accelerate the eradication of TB. New diagnostics, treatments and vaccines are urgently needed. Next generation sequencing has the potential to become one of these new tools. Genomic characterization of TB isolates is already showing its potential for epidemiology and diagnostics, particularly for identifying drug resistance mutations. However, the experimental and bioinformatics skills needed are still far from being standardized and are not easy to incorporate into the routine of clinical laboratories. In this review we describe current next generation sequencing approaches applied to the Mycobacterium tuberculosis complex, their contribution to the diagnostics and epidemiology of the disease and the efforts being undertaken to make the technology accessible to public health and clinical microbiology laboratories.

  12. Application of next-generation sequencing technologies in virology

    PubMed Central

    Radford, Alan D.; Chapman, David; Dixon, Linda; Chantrey, Julian; Darby, Alistair C.; Hall, Neil

    2012-01-01

    The progress of science is punctuated by the advent of revolutionary technologies that provide new ways and scales to formulate scientific questions and advance knowledge. Following on from electron microscopy, cell culture and PCR, next-generation sequencing is one of these methodologies that is now changing the way that we understand viruses, particularly in the areas of genome sequencing, evolution, ecology, discovery and transcriptomics. Possibilities for these methodologies are only limited by our scientific imagination and, to some extent, by their cost, which has restricted their use to relatively small numbers of samples. Challenges remain, including the storage and analysis of the large amounts of data generated. As the chemistries employed mature, costs will decrease. In addition, improved methods for analysis will become available, opening yet further applications in virology including routine diagnostic work on individuals, and new understanding of the interaction between viral and host transcriptomes. An exciting era of viral exploration has begun, and will set us new challenges to understand the role of newly discovered viral diversity in both disease and health. PMID:22647373

  13. Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing

    PubMed Central

    2011-01-01

    Background The garden pea, Pisum sativum, is among the best-investigated legume plants and of significant agro-commercial relevance. Pisum sativum has a large and complex genome and accordingly few comprehensive genomic resources exist. Results We analyzed the pea transcriptome with the highest accuracy achievable with current technology. We used next generation sequencing with the Roche/454 platform and evaluated and compared a variety of approaches, including diverse tissue libraries, normalization, alternative sequencing technologies, saturation estimation and diverse assembly strategies. We generated libraries from flowers, leaves, cotyledons, epi- and hypocotyl, and etiolated and light treated etiolated seedlings, comprising a total of 450 megabases. Libraries were assembled into 324,428 unigenes in a first-pass assembly. A second-pass assembly reduced the number to 81,449 unigenes but caused a significant number of chimeras. Analyses of the assemblies identified the assembly step as a major opportunity for improvement. By recording frequencies of Arabidopsis orthologs hit by randomly drawn reads and fitting parameters of the saturation curve, we concluded that sequencing was exhaustive. For leaf libraries we found that normalization allows partial recovery of expression strength alongside the desired effect of increased coverage. Based on theoretical and biological considerations we concluded that the sequence reads in the database tagged the vast majority of transcripts in the aerial tissues. A pathway representation analysis showed the merits of sampling multiple aerial tissues to increase the number of tagged genes. All results have been made available as a fully annotated database in fasta format. Conclusions We conclude that the approach taken resulted in a high-quality dataset that serves well as a first comprehensive reference set for the model legume pea. We suggest future deep sequencing transcriptome projects of species lacking a genomics backbone will

  14. SNP discovery in the transcriptome of white Pacific shrimp Litopenaeus vannamei by next generation sequencing.

    PubMed

    Yu, Yang; Wei, Jiankai; Zhang, Xiaojun; Liu, Jingwen; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai

    2014-01-01

    The application of next generation sequencing technology has greatly facilitated high-throughput single nucleotide polymorphism (SNP) discovery and genotyping in genetic research. In the present study, SNPs were discovered based on two transcriptomes of Litopenaeus vannamei (L. vannamei) generated on the Illumina HiSeq 2000 sequencing platform. One transcriptome of L. vannamei was obtained by sequencing RNA from larvae at the mysis stage, and its reference sequence was assembled de novo. The data for the other transcriptome were downloaded from NCBI, and the reads of the two transcriptomes were mapped separately to the assembled reference with BWA. SNP calling was performed using SAMtools. A total of 58,717 and 36,277 high-quality SNPs were predicted from the two transcriptomes, respectively. SNP calling was also performed using the reads of the two transcriptomes together, and a total of 96,040 high-quality SNPs were predicted. Among these 96,040 SNPs, 5,242 and 29,129 were predicted to be non-synonymous and synonymous SNPs, respectively. Characterization of the predicted SNPs in L. vannamei showed that the estimated SNP frequency was 0.21% (one SNP per 476 bp) and the estimated ratio of transitions to transversions was 2.0. Fifty SNPs were randomly selected for validation by Sanger sequencing after PCR amplification, and 76% of them were confirmed, which indicated that the SNPs predicted in this study were reliable. These SNPs will be very useful for genetic studies in L. vannamei, especially for the construction of high-density linkage maps and for genome-wide association studies.
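
    The summary statistics quoted above (SNP frequency, transition/transversion ratio, validation rate) follow from simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the reference length is back-calculated from the reported spacing rather than taken from the abstract, and the SNP list is toy data.

```python
# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T) changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def snp_density(n_snps, n_bases):
    """Return (SNP frequency in percent, mean spacing in bp per SNP)."""
    return 100.0 * n_snps / n_bases, n_bases / n_snps

def ts_tv_ratio(snps):
    """Ratio of transitions to transversions for (ref, alt) pairs."""
    ts = sum(1 for pair in snps if pair in TRANSITIONS)
    tv = len(snps) - ts
    return ts / tv

def validation_rate(confirmed, tested):
    """Percentage of tested SNPs confirmed by an independent method."""
    return 100.0 * confirmed / tested

# Assumption: a reference length implied by 96,040 SNPs at one SNP per 476 bp.
implied_reference_bp = 96_040 * 476
frequency_pct, bp_per_snp = snp_density(96_040, implied_reference_bp)

# Toy SNP calls (not data from the study): four transitions, two transversions.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"), ("A", "C"), ("G", "T")]
toy_ratio = ts_tv_ratio(toy_snps)

# 76% of 50 tested SNPs corresponds to 38 confirmed SNPs.
confirmed_pct = validation_rate(38, 50)
```

    With these inputs, one SNP per 476 bp rounds to the 0.21% frequency reported in the abstract.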

  15. Integrated platform for optimized solar PV system design and engineering plan set generation

    SciTech Connect

    Adeyemo, Samuel

    2015-12-30

    The Aurora team has developed software that allows users to quickly generate a three-dimensional model for a building, with a corresponding irradiance map, from any two-dimensional image with associated geo-coordinates. The purpose of this project is to build upon that technology by developing and distributing to solar installers a software platform that automatically retrieves engineering, financial and geographic data for a specific site, and quickly generates an optimal customer proposal and corresponding engineering plans for that site. At the end of the project, Aurora’s optimization platform would have been used to make at least one thousand proposals from at least ten unique solar installation companies, two of which would sign economically viable contracts to use the software. Furthermore, Aurora’s algorithms would be tested to show that in at least seventy percent of cases, Aurora automatically generated a design equivalent to or better than what a human could have done manually. A ‘better’ design is one that generates more energy for the same cost, or that generates a higher return on investment, while complying with all site-specific aesthetic, electrical and spatial requirements.

  16. Next-Generation Sequencing and Genome Editing in Plant Virology

    PubMed Central

    Hadidi, Ahmed; Flores, Ricardo; Candresse, Thierry; Barba, Marina

    2016-01-01

    Next-generation sequencing (NGS) has been applied to plant virology since 2009. NGS provides highly efficient, rapid, low-cost high-throughput DNA or RNA sequencing of the genomes of plant viruses and viroids and of the specific small RNAs generated during the infection process. These small RNAs, which frequently cover the whole genome of the infectious agent, are 21–24 nt long and are known as vsRNAs for viruses and vd-sRNAs for viroids. NGS has been used in a number of studies in plant virology including, but not limited to, discovery of novel viruses and viroids as well as detection and identification of those pathogens already known, analysis of genome diversity and evolution, and study of pathogen epidemiology. The genome editing method based on the clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system has recently been used successfully to engineer resistance to DNA geminiviruses (family Geminiviridae) by targeting different viral genome sequences in infected Nicotiana benthamiana or Arabidopsis plants. The DNA viruses targeted include tomato yellow leaf curl virus and merremia mosaic virus (begomovirus); beet curly top virus and beet severe curly top virus (curtovirus); and bean yellow dwarf virus (mastrevirus). The technique has also been used against the RNA viruses zucchini yellow mosaic virus, papaya ringspot virus and turnip mosaic virus (potyvirus) and cucumber vein yellowing virus (ipomovirus, family Potyviridae) by targeting the translation initiation factor gene eIF4E in cucumber or Arabidopsis plants. Given these major recent advances, NGS and CRISPR-Cas technologies are expected to play a significant role in the very near future in advancing the field of plant virology and connecting it with other related fields of biology. PMID:27617007

  17. Next-Generation Sequencing and Genome Editing in Plant Virology.

    PubMed

    Hadidi, Ahmed; Flores, Ricardo; Candresse, Thierry; Barba, Marina

    2016-01-01

    Next-generation sequencing (NGS) has been applied to plant virology since 2009. NGS provides highly efficient, rapid, low-cost high-throughput DNA or RNA sequencing of the genomes of plant viruses and viroids and of the specific small RNAs generated during the infection process. These small RNAs, which frequently cover the whole genome of the infectious agent, are 21-24 nt long and are known as vsRNAs for viruses and vd-sRNAs for viroids. NGS has been used in a number of studies in plant virology including, but not limited to, discovery of novel viruses and viroids as well as detection and identification of those pathogens already known, analysis of genome diversity and evolution, and study of pathogen epidemiology. The genome editing method based on the clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system has recently been used successfully to engineer resistance to DNA geminiviruses (family Geminiviridae) by targeting different viral genome sequences in infected Nicotiana benthamiana or Arabidopsis plants. The DNA viruses targeted include tomato yellow leaf curl virus and merremia mosaic virus (begomovirus); beet curly top virus and beet severe curly top virus (curtovirus); and bean yellow dwarf virus (mastrevirus). The technique has also been used against the RNA viruses zucchini yellow mosaic virus, papaya ringspot virus and turnip mosaic virus (potyvirus) and cucumber vein yellowing virus (ipomovirus, family Potyviridae) by targeting the translation initiation factor gene eIF4E in cucumber or Arabidopsis plants. Given these major recent advances, NGS and CRISPR-Cas technologies are expected to play a significant role in the very near future in advancing the field of plant virology and connecting it with other related fields of biology.

  18. Addressing Benefits, Risks and Consent in Next Generation Sequencing Studies

    PubMed Central

    Meller, R

    2016-01-01

    The sequencing of the human genome and technological advances in DNA sequencing have led to a revolution with respect to DNA sequencing and its potential to diagnose genetic disorders. However, requests for open access to genomic data must be balanced against the guiding principles of the Common Rule for human subject research. Unfortunately, the risks to patients involved in genomic studies are still evolving and as such may not be clear to learned and well-intentioned scientists. Central to this issue are the strategies that enable human participants in such studies to remain anonymous, or de-identified. The wealth of genomic data on the Internet in genomic data repositories and other databases has enabled de-identification to be broken and research subjects to be identified. Relying on de-identification for security neglects the fact that DNA itself is an identifying element. Therefore, it is questionable whether data security standards can ever truly protect the identity of a patient, under the current conditions or in the future. As Big Data methodologies advance, additional sources of data may enable the re-identification of patients enrolled in next-generation sequencing (NGS) studies. As such, it is time to re-evaluate the risks of sharing genomic data and establish new guidelines for good practices. In this commentary, I address the challenges facing federally funded investigators who need to strike a balance between compliance with federal (US) rules for human subjects and the recent requirement for open access/sharing of data from National Institutes of Health (NIH)-funded studies involving human subjects. PMID:27375922

  19. SMITH: a LIMS for handling next-generation sequencing workflows

    PubMed Central

    2014-01-01

    Background Life-science laboratories make increasing use of Next Generation Sequencing (NGS) for studying bio-macromolecules and their interactions. Array-based methods for measuring gene expression or protein-DNA interactions are being replaced by RNA-Seq and ChIP-Seq. Sequencing is generally performed by specialized facilities that have to keep track of sequencing requests, trace samples, ensure quality and make data available according to predefined privileges. An integrated tool helps to troubleshoot problems, maintain a high quality standard, and reduce time and costs. Commercial and non-commercial tools called LIMS (Laboratory Information Management Systems) are available for this purpose. However, they often come at prohibitive cost and/or lack the flexibility and scalability needed to adjust seamlessly to the frequently changing protocols employed. In order to manage the flow of sequencing data produced at the Genomic Unit of the Italian Institute of Technology (IIT), we developed SMITH (Sequencing Machine Information Tracking and Handling). Methods SMITH is a web application with a MySQL server at the backend. SMITH was developed by wet-lab scientists of the Centre for Genomic Science and database experts from the Politecnico of Milan in the context of a Genomic Data Model Project. The database schema stores all the information of an NGS experiment, including the descriptions of all protocols and algorithms used in the process. Notably, an attribute-value table allows an unconstrained textual description to be associated with each sample and with all the data produced afterwards. This method permits the creation of metadata that can be used to search the database for specific files as well as for statistical analyses. Results SMITH runs automatically and limits direct human interaction mainly to administrative tasks. SMITH data-delivery procedures were standardized, making it easier for biologists and analysts to navigate the data. Automation also helps save time. The
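
    The attribute-value table mentioned in the Methods can be sketched as follows. This is not SMITH's schema: the table and column names are hypothetical, and SQLite (from the Python standard library) stands in for the MySQL backend so that the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- One row per (sample, attribute) pair; any key may be attached to any sample.
CREATE TABLE sample_attribute (
    sample_id  INTEGER NOT NULL REFERENCES sample(id),
    attr_key   TEXT NOT NULL,
    attr_value TEXT NOT NULL
);
""")

def add_sample(name, **attributes):
    """Insert a sample together with free-form key/value metadata."""
    cur = conn.execute("INSERT INTO sample (name) VALUES (?)", (name,))
    conn.executemany(
        "INSERT INTO sample_attribute VALUES (?, ?, ?)",
        [(cur.lastrowid, key, str(value)) for key, value in attributes.items()],
    )
    return cur.lastrowid

def find_samples(key, value):
    """Return the names of samples carrying the given attribute value."""
    rows = conn.execute(
        "SELECT s.name FROM sample s "
        "JOIN sample_attribute a ON a.sample_id = s.id "
        "WHERE a.attr_key = ? AND a.attr_value = ? ORDER BY s.name",
        (key, value),
    )
    return [row[0] for row in rows]

# Toy records: each sample carries a different set of attributes.
add_sample("S1", assay="RNA-Seq", tissue="liver")
add_sample("S2", assay="ChIP-Seq", antibody="H3K4me3")
add_sample("S3", assay="RNA-Seq", tissue="brain")
rna_seq_samples = find_samples("assay", "RNA-Seq")
```

    The design trades schema enforcement for flexibility: new kinds of metadata need no schema change, which fits the frequently changing protocols the abstract mentions, at the cost of weaker type checking on values.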

  20. Next-generation polyploid phylogenetics: rapid resolution of hybrid polyploid complexes using PacBio single-molecule sequencing.

    PubMed

    Rothfels, Carl J; Pryer, Kathleen M; Li, Fay-Wei

    2017-01-01

    Difficulties in generating nuclear data for polyploids have impeded phylogenetic study of these groups. We describe a high-throughput protocol and an associated bioinformatics pipeline (Pipeline for Untangling Reticulate Complexes (Purc)) that is able to generate these data quickly and conveniently, and demonstrate its efficacy on accessions from the fern family Cystopteridaceae. We conclude with a demonstration of the downstream utility of these data by inferring a multi-labeled species tree for a subset of our accessions. We amplified four c. 1-kb-long nuclear loci and sequenced them in a parallel-tagged amplicon sequencing approach using the PacBio platform. Purc infers the final sequences from the raw reads via an iterative approach that corrects PCR and sequencing errors and removes PCR-mediated recombinant sequences (chimeras). We generated data for all gene copies (homeologs, paralogs, and segregating alleles) present in each of three sets of 50 mostly polyploid accessions, for four loci, in three PacBio runs (one run per set). From the raw sequencing reads, Purc was able to accurately infer the underlying sequences. This approach makes it easy and economical to study the phylogenetics of polyploids, and, in conjunction with recent analytical advances, facilitates investigation of broad patterns of polyploid evolution.

  1. Microfluidic platform for on-demand generation of spatially indexed combinatorial droplets.

    PubMed

    Zec, Helena; Rane, Tushar D; Wang, Tza-Huei

    2012-09-07

    We propose a highly versatile and programmable nanolitre droplet-based platform that accepts an unlimited number of sample plugs from a multi-well plate, performs digitization of these sample plugs into smaller daughter droplets and subsequent synchronization-free, robust injection of multiple reagents into the sample daughter droplets on-demand. This platform combines excellent control of valve-based microfluidics with the high-throughput capability of droplet microfluidics. We demonstrate the functioning of a proof-of-concept device which generates combinatorial mixture droplets from a linear array of sample plugs and four different reagents, using food dyes to mimic samples and reagents. Generation of a one dimensional array of the combinatorial mixture droplets on the device leads to automatic spatial indexing of these droplets, precluding the need to include a barcode in each droplet to identify its contents. We expect this platform to further expand the range of applications of droplet microfluidics to include applications requiring a high degree of multiplexing as well as high throughput analysis of multiple samples.

  2. ngs_backbone: a pipeline for read cleaning, mapping and SNP calling using Next Generation Sequence

    PubMed Central

    2011-01-01

    Background The possibilities offered by next generation sequencing (NGS) platforms are revolutionizing biotechnological laboratories. Moreover, the combination of NGS sequencing and affordable high-throughput genotyping technologies is facilitating the rapid discovery and use of SNPs in non-model species. However, this abundance of sequences and polymorphisms creates new software needs. To fulfill these needs, we have developed a powerful, yet easy-to-use application. Results The ngs_backbone software is a parallel pipeline capable of analyzing Sanger, 454, Illumina and SOLiD (Sequencing by Oligonucleotide Ligation and Detection) sequence reads. Its main supported analyses are: read cleaning, transcriptome assembly and annotation, read mapping and single nucleotide polymorphism (SNP) calling and selection. In order to build a truly useful tool, the software development was paired with a laboratory experiment. All public tomato Sanger EST reads plus 14.2 million Illumina reads were employed to test the tool and predict polymorphism in tomato. The cleaned reads were mapped to the SGN tomato transcriptome obtaining a coverage of 4.2 for Sanger and 8.5 for Illumina. 23,360 single nucleotide variations (SNVs) were predicted. A total of 76 SNVs were experimentally validated, and 85% were found to be real. Conclusions ngs_backbone is a new software package capable of analyzing sequences produced by NGS technologies and predicting SNVs with great accuracy. In our tomato example, we created a highly polymorphic collection of SNVs that will be a useful resource for tomato researchers and breeders. The software developed along with its documentation is freely available under the AGPL license and can be downloaded from http://bioinf.comav.upv.es/ngs_backbone/ or http://github.com/JoseBlanca/franklin. PMID:21635747

  3. Transcriptome sequencing and analysis of leaf tissue of Avicennia marina using the Illumina platform.

    PubMed

    Huang, Jianzi; Lu, Xiang; Zhang, Wanke; Huang, Rongfeng; Chen, Shouyi; Zheng, Yizhi

    2014-01-01

    Avicennia marina is a widely distributed mangrove species that thrives in high-salinity habitats. It plays a significant role in supporting coastal ecosystems and holds unique potential for studying molecular mechanisms underlying ecological adaptation. Despite and sometimes because of its numerous merits, this species is facing increasing pressure of exploitation and deforestation. Both the study of adaptation mechanisms and conservation efforts necessitate more genomic resources for A. marina. In this study, we used Illumina sequencing of an A. marina foliar cDNA library to generate a transcriptome dataset for gene and marker discovery. We obtained 40 million high-quality reads and assembled them into 91,125 unigenes with a mean length of 463 bp. These unigenes covered most of the publicly available A. marina Sanger ESTs and greatly extended the repertoire of transcripts for this species. A total of 54,497 and 32,637 unigenes were annotated based on homology to sequences in the NCBI non-redundant and the Swiss-prot protein databases, respectively. Both Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed some transcriptomic signatures of stress adaptation for this halophytic species. We also detected an extraordinary number of transcripts derived from fungal endophytes and demonstrated the utility of transcriptome sequencing in surveying endophyte diversity without isolating them from plant tissues. Additionally, we identified 3,423 candidate simple sequence repeats (SSRs) from 3,141 unigenes with a density of one SSR locus every 8.25 kb sequence. Our transcriptomic data will provide valuable resources for ecological, genetic and evolutionary studies in A. marina.
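
    Candidate SSRs of the kind counted above can be located with a simple pattern search. The sketch below is only an illustration: the abstract does not name the SSR tool or its thresholds, so the minimum repeat counts, the function names and the test sequence are all assumptions.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem repeats.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def is_periodic(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. AA, ACAC)."""
    return motif in (motif + motif)[1:-1]

def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for candidate SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least min_rep - 1 copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_periodic(motif):
                continue  # already reported under the shorter unit, or a homopolymer
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)

def kb_per_ssr(total_bp, n_ssrs):
    """SSR density expressed as kilobases of sequence per SSR locus."""
    return total_bp / 1000.0 / n_ssrs

# Toy sequence (not from the study) with one dinucleotide and one trinucleotide SSR.
toy_seq = "GGTT" + "AC" * 7 + "TTGCA" + "GAT" * 5 + "CC"
toy_hits = find_ssrs(toy_seq)
```

    Applying kb_per_ssr to the total examined sequence length and the number of loci gives a density in the same units as the one reported in the abstract.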

  4. High-Throughput, Amplicon-Based Sequencing of the CREBBP Gene as a Tool to Develop a Universal Platform-Independent Assay

    PubMed Central

    Fuellgrabe, Marc W.; Herrmann, Dietrich; Knecht, Henrik; Kuenzel, Sven; Kneba, Michael; Pott, Christiane; Brüggemann, Monika

    2015-01-01

    High-throughput sequencing technologies are widely used to analyse genomic variants or rare mutational events in different fields of genomic research, with a fast development of new or adapted platforms and technologies, enabling amplicon-based analysis of single target genes or even whole genome sequencing within a short period of time. Each sequencing platform is characterized by well-defined types of errors, resulting from different steps in the sequencing workflow. Here we describe a universal method to prepare amplicon libraries that can be used for sequencing on different high-throughput sequencing platforms. We have sequenced distinct exons of the CREB binding protein (CREBBP) gene and analysed the output resulting from three major deep-sequencing platforms. Platform-specific errors were adjusted according to the result of sequence analysis from the remaining platforms. Additionally, bioinformatic methods are described to determine platform dependent errors. Summarizing the results we present a platform-independent cost-efficient and timesaving method that can be used as an alternative to commercially available sample-preparation kits. PMID:26057250

  5. Genomics of medulloblastoma: from Giemsa-banding to next-generation sequencing in 20 years.

    PubMed

    Northcott, Paul A; Rutka, James T; Taylor, Michael D

    2010-01-01

    Advances in the field of genomics have recently enabled the unprecedented characterization of the cancer genome, providing novel insight into the molecular mechanisms underlying malignancies in humans. The application of high-resolution microarray platforms to the study of medulloblastoma has revealed new oncogenes and tumor suppressors and has implicated changes in DNA copy number, gene expression, and methylation state in its etiology. Additionally, the integration of medulloblastoma genomics with patient clinical data has confirmed molecular markers of prognostic significance and highlighted the potential utility of molecular disease stratification. The advent of next-generation sequencing technologies promises to greatly transform our understanding of medulloblastoma pathogenesis in the next few years, permitting comprehensive analyses of all aspects of the genome and increasing the likelihood that genomic medicine will become part of the routine diagnosis and treatment of medulloblastoma.

  6. Identification of conserved genomic regions and variation therein amongst Cetartiodactyla species using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background Next Generation Sequencing has created an opportunity to genetically characterize an individual both inexpensively and comprehensively. In earlier work produced in our collaboration [1], it was demonstrated that, for animals without a reference genome, their Next Generation Sequence data ...

  7. Sequence capture and next-generation sequencing of ultraconserved elements in a large-genome salamander.

    PubMed

    Newman, Catherine E; Austin, Christopher C

    2016-12-01

    Amidst the rapid advancement in next-generation sequencing (NGS) technology over the last few years, salamanders have been left behind. Salamanders have enormous genomes (up to 40 times the size of the human genome), and this poses challenges to generating NGS data sets of quality and quantity similar to those of other vertebrates. However, optimization of laboratory protocols is time-consuming and often cost prohibitive, and continued omission of salamanders from novel phylogeographic research is detrimental to species facing decline. Here, we use a salamander endemic to the southeastern United States, Plethodon serratus, to test the utility of an established protocol for sequence capture of ultraconserved elements (UCEs) in resolving intraspecific phylogeographic relationships and delimiting cryptic species. Without modifying the standard laboratory protocol, we generated a data set consisting of over 600 million reads for 85 P. serratus samples. Species delimitation analyses support recognition of seven species within P. serratus sensu lato, and all phylogenetic relationships among the seven species are fully resolved under a coalescent model. Results also corroborate previous data suggesting nonmonophyly of the Ouachita and Louisiana regions. Our results demonstrate that established UCE protocols can successfully be used in phylogeographic studies of salamander species, providing a powerful tool for future research on evolutionary history of amphibians and other organisms with large genomes.

  8. Transcriptome sequencing as a platform to elucidate molecular components of the diapause response in the Asian tiger mosquito, Aedes albopictus.

    PubMed

    Poelchau, Monica F; Reynolds, Julie A; Denlinger, David L; Elsik, Christine G; Armbruster, Peter A

    2013-06-01

    Diapause has long been recognized as a crucial ecological adaptation to spatio-temporal environmental variation. More recently, rapid evolution of the diapause response has been implicated in response to contemporary global warming and during the range expansion of invasive species. Although the molecular regulation of diapause remains largely unresolved, rapidly emerging next-generation sequencing (NGS) technologies provide exciting opportunities to address this longstanding question. Herein, a new assembly from life-history stages relevant to diapause in the Asian tiger mosquito, Aedes albopictus (Skuse) is presented, along with unique methods for the analysis of NGS data and transcriptome assembly. A digital normalization procedure that significantly reduces computational resources required for transcriptome assembly is evaluated. Additionally, a method for protein reference-based and genomic reference-based merged assembly of 454 and Illumina reads is described. Finally, a gene ontology analysis is presented, which creates a platform to identify physiological processes associated with diapause. Taken together, these methods provide valuable tools for analyzing the transcriptional underpinnings of many complex phenotypes, including diapause, and provide a basis for determining the molecular regulation of diapause in Ae. albopictus.

  9. Low diversity in the mitogenome of sperm whales revealed by next-generation sequencing.

    PubMed

    Alexander, Alana; Steel, Debbie; Slikas, Beth; Hoekzema, Kendra; Carraher, Colm; Parks, Matthew; Cronn, Richard; Baker, C Scott

    2013-01-01

    Large population sizes and global distributions generally associate with high mitochondrial DNA control region (CR) diversity. The sperm whale (Physeter macrocephalus) is an exception, showing low CR diversity relative to other cetaceans; however, diversity levels throughout the remainder of the sperm whale mitogenome are unknown. We sequenced 20 mitogenomes from 17 sperm whales representative of worldwide diversity using Next Generation Sequencing (NGS) technologies (Illumina GAIIx, Roche 454 GS Junior). Resequencing of three individuals with both NGS platforms and partial Sanger sequencing showed low discrepancy rates (454-Illumina: 0.0071%; Sanger-Illumina: 0.0034%; and Sanger-454: 0.0023%) confirming suitability of both NGS platforms for investigating low mitogenomic diversity. Using the 17 sperm whale mitogenomes in a phylogenetic reconstruction with 41 other species, including 11 new dolphin mitogenomes, we tested two hypotheses for the low CR diversity. First, the hypothesis that CR-specific constraints have reduced diversity solely in the CR was rejected as diversity was low throughout the mitogenome, not just in the CR (overall diversity π = 0.096%; protein-coding 3rd codon = 0.22%; CR = 0.35%), and CR phylogenetic signal was congruent with protein-coding regions. Second, the hypothesis that slow substitution rates reduced diversity throughout the sperm whale mitogenome was rejected as sperm whales had significantly higher rates of CR evolution and no evidence of slow coding region evolution relative to other cetaceans. The estimated time to most recent common ancestor for sperm whale mitogenomes was 72,800 to 137,400 years ago (95% highest probability density interval), consistent with previous hypotheses of a bottleneck or selective sweep as likely causes of low mitogenome diversity.

  10. Low Diversity in the Mitogenome of Sperm Whales Revealed by Next-Generation Sequencing

    PubMed Central

    Alexander, Alana; Steel, Debbie; Slikas, Beth; Hoekzema, Kendra; Carraher, Colm; Parks, Matthew; Cronn, Richard; Baker, C. Scott

    2013-01-01

    Large population sizes and global distributions generally associate with high mitochondrial DNA control region (CR) diversity. The sperm whale (Physeter macrocephalus) is an exception, showing low CR diversity relative to other cetaceans; however, diversity levels throughout the remainder of the sperm whale mitogenome are unknown. We sequenced 20 mitogenomes from 17 sperm whales representative of worldwide diversity using Next Generation Sequencing (NGS) technologies (Illumina GAIIx, Roche 454 GS Junior). Resequencing of three individuals with both NGS platforms and partial Sanger sequencing showed low discrepancy rates (454-Illumina: 0.0071%; Sanger-Illumina: 0.0034%; and Sanger-454: 0.0023%) confirming suitability of both NGS platforms for investigating low mitogenomic diversity. Using the 17 sperm whale mitogenomes in a phylogenetic reconstruction with 41 other species, including 11 new dolphin mitogenomes, we tested two hypotheses for the low CR diversity. First, the hypothesis that CR-specific constraints have reduced diversity solely in the CR was rejected as diversity was low throughout the mitogenome, not just in the CR (overall diversity π = 0.096%; protein-coding 3rd codon = 0.22%; CR = 0.35%), and CR phylogenetic signal was congruent with protein-coding regions. Second, the hypothesis that slow substitution rates reduced diversity throughout the sperm whale mitogenome was rejected as sperm whales had significantly higher rates of CR evolution and no evidence of slow coding region evolution relative to other cetaceans. The estimated time to most recent common ancestor for sperm whale mitogenomes was 72,800 to 137,400 years ago (95% highest probability density interval), consistent with previous hypotheses of a bottleneck or selective sweep as likely causes of low mitogenome diversity. PMID:23254394

  11. Targeted Next Generation Sequencing Identifies Markers of Response to PD-1 Blockade.

    PubMed

    Johnson, Douglas B; Frampton, Garrett M; Rioth, Matthew J; Yusko, Erik; Xu, Yaomin; Guo, Xingyi; Ennis, Riley C; Fabrizio, David; Chalmers, Zachary R; Greenbowe, Joel; Ali, Siraj M; Balasubramanian, Sohail; Sun, James X; He, Yuting; Frederick, Dennie T; Puzanov, Igor; Balko, Justin M; Cates, Justin M; Ross, Jeffrey S; Sanders, Catherine; Robins, Harlan; Shyr, Yu; Miller, Vincent A; Stephens, Philip J; Sullivan, Ryan J; Sosman, Jeffrey A; Lovly, Christine M

    2016-11-01

    Therapeutic antibodies blocking programmed death-1 and its ligand (PD-1/PD-L1) induce durable responses in a substantial fraction of melanoma patients. We sought to determine whether the number and/or type of mutations identified using a next-generation sequencing (NGS) panel available in the clinic was correlated with response to anti-PD-1 in melanoma. Using archival melanoma samples from anti-PD-1/PD-L1-treated patients, we performed hybrid capture-based NGS on 236-315 genes and T-cell receptor (TCR) sequencing on initial and validation cohorts from two centers. Patients who responded to anti-PD-1/PD-L1 had higher mutational loads in an initial cohort (median, 45.6 vs. 3.9 mutations/MB; P = 0.003) and a validation cohort (37.1 vs. 12.8 mutations/MB; P = 0.002) compared with nonresponders. Response rate, progression-free survival, and overall survival were superior in the high, compared with intermediate and low, mutation load groups. Melanomas with NF1 mutations harbored high mutational loads (median, 62.7 mutations/MB) and high response rates (74%), whereas BRAF/NRAS/NF1 wild-type melanomas had a lower mutational load. In these archival samples, TCR clonality did not predict response. Mutation numbers in the 315 genes in the NGS platform strongly correlated with those detected by whole-exome sequencing in The Cancer Genome Atlas samples, but were not associated with survival. In conclusion, mutational load, as determined by an NGS platform available in the clinic, effectively stratified patients by likelihood of response. This approach may provide a clinically feasible predictor of response to anti-PD-1/PD-L1. Cancer Immunol Res; 4(11); 959-67. ©2016 AACR.

  12. Metre-scale cyclicity in Middle Eocene platform carbonates in northern Egypt: Implications for facies development and sequence stratigraphy

    NASA Astrophysics Data System (ADS)

    Tawfik, Mohamed; El-Sorogy, Abdelbaset; Moussa, Mahmoud

    2016-07-01

    The shallow-water carbonates of the Middle Eocene in northern Egypt represent a Tethyan reef-rimmed carbonate platform with bedded inner-platform facies. Based on extensive micro- and biofacies documentation, five lithofacies associations were defined and their respective depositional environments were interpreted. Investigated sections were subdivided into three third-order sequences, named S1, S2 and S3. Sequence S1 is interpreted to correspond to the Lutetian, S2 corresponds to the Late Lutetian and Early Bartonian, and S3 represents the Late Bartonian. Each of the three sequences was further subdivided into fourth-order cycle sets and fifth-order cycles. The complete hierarchy of cycles can be correlated along 190 km across the study area, highlighting a general "layer-cake" stratigraphic architecture. The documentation of the studied outcrops may contribute to the better regional understanding of the Middle Eocene formations in northern Egypt and to Tethyan pericratonic carbonate models in general.

  13. Molecular diagnostics of a single drug-resistant multiple myeloma case using targeted next-generation sequencing

    PubMed Central

    Ikeda, Hiroshi; Ishiguro, Kazuya; Igarashi, Tetsuyuki; Aoki, Yuka; Hayashi, Toshiaki; Ishida, Tadao; Sasaki, Yasushi; Tokino, Takashi; Shinomura, Yasuhisa

    2015-01-01

    A 69-year-old man was diagnosed with IgG λ-type multiple myeloma (MM), Stage II in October 2010. He was treated with one cycle of high-dose dexamethasone. After three cycles of bortezomib, the patient exhibited slow elevations in the free light-chain levels and developed a significant new increase of serum M protein. Bone marrow cytogenetic analysis revealed a complex karyotype characteristic of malignant plasma cells. To better understand the molecular pathogenesis of this patient, we sequenced for mutations in the entire coding regions of 409 cancer-related genes using a semiconductor-based sequencing platform. Sequencing analysis revealed eight nonsynonymous somatic mutations in addition to several copy number variants, including CCND1 and RB1. These alterations may play roles in the pathobiology of this disease. This targeted next-generation sequencing can allow for the prediction of drug resistance and facilitate improvements in the treatment of MM patients. PMID:26491355

  14. Identification of Disease-Causing Mutations in Autosomal Dominant Retinitis Pigmentosa (adRP) Using Next-Generation DNA Sequencing

    PubMed Central

    Bowne, Sara J.; Sullivan, Lori S.; Koboldt, Daniel C.; Ding, Li; Fulton, Robert; Abbott, Rachel M.; Sodergren, Erica J.; Birch, David G.; Wheaton, Dianna H.; Heckenlively, John R.; Liu, Qin; Pierce, Eric A.; Weinstock, George M.

    2011-01-01

    Purpose. To determine whether massively parallel next-generation DNA sequencing offers rapid and efficient detection of disease-causing mutations in patients with monogenic inherited diseases. Retinitis pigmentosa (RP) is a challenging application for this technology because it is a monogenic disease in individuals and families but is highly heterogeneous in patient populations. RP has multiple patterns of inheritance, with mutations in many genes for each inheritance pattern and numerous, distinct, disease-causing mutations at each locus; further, many RP genes have not been identified yet. Methods. Next-generation sequencing was used to identify mutations in pairs of affected individuals from 21 families with autosomal dominant RP, selected from a cohort of families without mutations in “common” RP genes. One thousand amplicons targeting 249,267 unique bases of 46 candidate genes were sequenced with the 454GS FLX Titanium (Roche Diagnostics, Indianapolis, IN) and GAIIx (Illumina/Solexa, San Diego, CA) platforms. Results. An average sequence depth of 70× and 125× was obtained for the 454GS FLX and GAIIx platforms, respectively. More than 9000 sequence variants were identified and analyzed, to assess the likelihood of pathogenicity. One hundred twelve of these were selected as likely candidates and tested for segregation with traditional di-deoxy capillary electrophoresis sequencing of additional family members and control subjects. Five disease-causing mutations (24%) were identified in the 21 families. Conclusion. This project demonstrates that next-generation sequencing is an effective approach for detecting novel, rare mutations causing heterogeneous monogenic disorders such as RP. With the addition of this technology, disease-causing mutations can now be identified in 65% of autosomal dominant RP cases. PMID:20861475

  15. Metagenome of microorganisms associated with the toxic Cyanobacteria Microcystis aeruginosa analyzed using the 454 sequencing platform

    NASA Astrophysics Data System (ADS)

    Li, Nan; Zhang, Lei; Li, Fuchao; Wang, Yuezhu; Zhu, Yongqiang; Kang, Hui; Wang, Shengyue; Qin, Song

    2011-05-01

    In this study, the 454 pyrosequencing technology was used to analyze the DNA of the Microcystis aeruginosa symbiosis system from cyanobacterial algal blooms in Taihu Lake, China. We generated 183 228 reads with an average length of 248 bp. Running the 454 assembly algorithm over our sequences yielded 22 239 significant contigs. After excluding the M. aeruginosa sequences, we obtained 1 322 assembled contigs longer than 1 000 bp. Taxonomic analysis indicated that four kingdoms were represented in the community: Archaea (n = 9; 0.01%), Bacteria (n = 98 921; 99.6%), Eukaryota (n = 373; 3.7%), and Viruses (n = 18; 0.02%). The bacterial sequences were predominantly Alphaproteobacteria (n = 41 805; 83.3%), Betaproteobacteria (n = 5 254; 10.5%) and Gammaproteobacteria (n = 1 180; 2.4%). Gene annotations and assignment of COG (clusters of orthologous groups) functional categories indicate that a large number of the predicted genes are involved in metabolic, genetic, and environmental information processes. Our results demonstrate the extraordinary diversity of a microbial community in an ectosymbiotic system and further establish the tremendous utility of pyrosequencing.
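
    The kingdom-level percentages in the record above can be recomputed from the read counts it reports. The sketch below is illustrative only: it assumes each percentage is the count divided by the sum of the four kingdom counts (the abstract does not state the denominator), and the function name `percent_shares` is hypothetical. Under that assumption the Eukaryota count of 373 works out to roughly 0.38% rather than the 3.7% printed, while the other three values agree with the abstract.

```python
# Read counts per kingdom, copied from the abstract above.
counts = {"Archaea": 9, "Bacteria": 98_921, "Eukaryota": 373, "Viruses": 18}


def percent_shares(counts):
    """Return each category's share of the summed counts, in percent.

    Assumption: the denominator is the sum of the listed categories.
    """
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


shares = percent_shares(counts)
for name, pct in shares.items():
    # Two decimals, matching the precision used for the smallest groups.
    print(f"{name}: {pct:.2f}%")
```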

  16. Application of next-generation sequencing technology in forensic science.

    PubMed

    Yang, Yaran; Xie, Bingbing; Yan, Jiangwei

    2014-10-01

    Next-generation sequencing (NGS) technology, with its high-throughput capacity and low cost, has developed rapidly in recent years and become an important analytical tool for many genomics researchers. New opportunities in the research domain of the forensic studies emerge by harnessing the power of NGS technology, which can be applied to simultaneously analyzing multiple loci of forensic interest in different genetic contexts, such as autosomes, mitochondrial and sex chromosomes. Furthermore, NGS technology can also have potential applications in many other aspects of research. These include DNA database construction, ancestry and phenotypic inference, monozygotic twin studies, body fluid and species identification, and forensic animal, plant and microbiological analyses. Here we review the application of NGS technology in the field of forensic science with the aim of providing a reference for future forensics studies and practice.

  17. Next generation sequencing in epigenetics: insights and challenges.

    PubMed

    Meaburn, Emma; Schulz, Reiner

    2012-04-01

    The epigenetics community was an early adopter of next generation sequencing (NGS). NGS-based studies have provided detailed and comprehensive views of epigenetic modifications for the genomes of many species and cell types. Recently, DNA methylation has attracted much attention due to the discovery of 5-hydroxymethyl-cytosine and its role in epigenetic reprogramming and pluripotency. This renewed interest has been concomitant with methodological progress enabling, for example, high coverage and single base resolution profiling of the mammalian methylome in small numbers of cells. We summarise this progress and highlight resulting key findings about the complexity of eukaryotic DNA methylation, its role in metazoan genome evolution, epigenetic reprogramming, and its close ties with histone modifications in the context of transcription. Finally, we discuss how fundamental insights gained by NGS, particularly the discovery of widespread allele-specific epigenetic variation in the human genome, have the potential to significantly contribute to the understanding of human common complex diseases.

  18. Current next generation sequencing technology may not meet forensic standards.

    PubMed

    Bandelt, Hans-Jürgen; Salas, Antonio

    2012-01-01

    In a Nature paper of 2010, the concern was raised that intra-individual mtDNA variation may be more pronounced than previously believed, in that heteroplasmies are common and vary markedly from tissue to tissue. This claim taken at face value would have considerable impact on forensic casework. It turns out however that the employed technology detected the germ-line variation relative to the reference sequence only incompletely: on average at least five mutations were missed per sample, as an in silico reassessment of the data reveals. Before one can really set out to access entire mtDNA genome data with relative ease for forensic purposes, one needs careful calibration studies under strict forensic conditions, or might have to wait for another generation.

  19. Utility of Next Generation Sequencing in Clinical Primary Immunodeficiencies

    PubMed Central

    Raje, Nikita; Soden, Sarah; Swanson, Douglas; Ciaccio, Christina E.; Kingsmore, Stephen F.; Dinwiddie, Darrell L.

    2015-01-01

    Primary immunodeficiencies (PIDs) are a group of genetically heterogeneous disorders that present with very similar symptoms, complicating definitive diagnosis. More than 240 genes have hitherto been associated with PIDs, of which more than 30 have been identified in the last 3 years. Next generation sequencing (NGS) of genomes or exomes of informative families has played a central role in the discovery of novel PID genes. Furthermore, NGS has the potential to transform clinical molecular testing for established PIDs, allowing all PID differential diagnoses to be tested at once, leading to increased diagnostic yield, while decreasing both the time and cost of obtaining a molecular diagnosis. Given that treatment of PID varies by disease gene, early achievement of a molecular diagnosis is likely to enhance treatment decisions and improve patient outcomes. PMID:25149170

  20. Next-Generation Sequencing in Genetic Hearing Loss

    PubMed Central

    Yan, Denise; Tekin, Mustafa; Blanton, Susan H.

    2013-01-01

    The advent of the $1000 genome has the potential to revolutionize the identification of genes and their mutations underlying genetic disorders. This is especially true for extremely heterogeneous Mendelian conditions such as deafness, where the mutation, and indeed the gene, may be private. The recent technological advances in target-enrichment methods and next generation sequencing offer a unique opportunity to break through the barriers of limitations imposed by gene arrays. These approaches now allow for the complete analysis of all known deafness-causing genes and will result in a new wave of discoveries of the remaining genes for Mendelian disorders. In this review, we describe commonly used genomic technologies as well as the application of these technologies to the genetic diagnosis of hearing loss (HL) and to the discovery of novel genes for syndromic and nonsyndromic HL. PMID:23738631

  1. Application of Next-generation Sequencing Technology in Forensic Science

    PubMed Central

    Yang, Yaran; Xie, Bingbing; Yan, Jiangwei

    2014-01-01

    Next-generation sequencing (NGS) technology, with its high-throughput capacity and low cost, has developed rapidly in recent years and become an important analytical tool for many genomics researchers. New opportunities in the research domain of the forensic studies emerge by harnessing the power of NGS technology, which can be applied to simultaneously analyzing multiple loci of forensic interest in different genetic contexts, such as autosomes, mitochondrial and sex chromosomes. Furthermore, NGS technology can also have potential applications in many other aspects of research. These include DNA database construction, ancestry and phenotypic inference, monozygotic twin studies, body fluid and species identification, and forensic animal, plant and microbiological analyses. Here we review the application of NGS technology in the field of forensic science with the aim of providing a reference for future forensics studies and practice. PMID:25462152

  2. Exome sequencing covers >98% of mutations identified on targeted next generation sequencing panels

    PubMed Central

    LaDuca, Holly; Farwell, Kelly D.; Vuong, Huy; Lu, Hsiao-Mei; Mu, Wenbo; Shahmirzadi, Layla; Tang, Sha; Chen, Jefferey; Bhide, Shruti; Chao, Elizabeth C.

    2017-01-01

    Background With the expanded availability of next generation sequencing (NGS)-based clinical genetic tests, clinicians seeking to test patients with Mendelian diseases must weigh the superior coverage of targeted gene panels with the greater number of genes included in whole exome sequencing (WES) when considering their first-tier testing approach. Here, we use an in silico analysis to predict the analytic sensitivity of WES using pathogenic variants identified on targeted NGS panels as a reference. Methods Corresponding nucleotide positions for 1533 different alterations classified as pathogenic or likely pathogenic identified on targeted NGS multi-gene panel tests in our laboratory were interrogated in data from 100 randomly-selected clinical WES samples to quantify the sequence coverage at each position. Pathogenic variants represented 91 genes implicated in hereditary cancer, X-linked intellectual disability, primary ciliary dyskinesia, Marfan syndrome/aortic aneurysms, cardiomyopathies and arrhythmias. Results When assessing coverage among 100 individual WES samples for each pathogenic variant (153,300 individual assessments), 99.7% (n = 152,798) would likely have been detected on WES. All pathogenic variants had at least some coverage on exome sequencing, with a total of 97.3% (n = 1491) detectable across all 100 individuals. For the remaining 42 pathogenic variants, the number of WES samples with adequate coverage ranged from 35 to 99. Factors such as location in GC-rich, repetitive, or homologous regions likely explain why some of these alterations were not detected across all samples. To validate study findings, a similar analysis was performed against coverage data from 60,706 exomes available through the Exome Aggregation Consortium (ExAC). Results from this validation confirmed that 98.6% (91,743,296/93,062,298) of pathogenic variants demonstrated adequate depth for detection. Conclusions Results from this in silico analysis suggest that exome
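
    The coverage figures quoted in the record above follow from simple ratios of the counts it reports. The sketch below only re-derives those percentages from the abstract's own numbers; the helper name `pct` and the variable names are hypothetical, and nothing here reproduces the authors' actual coverage pipeline.

```python
# Counts copied from the abstract above.
variants = 1533          # pathogenic / likely pathogenic alterations
wes_samples = 100        # randomly selected clinical WES samples
exac_exomes = 60_706     # exomes in the ExAC validation set


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Each variant is assessed once per sample.
assessments = variants * wes_samples
per_assessment_detected = pct(152_798, assessments)

# Variants with adequate coverage in all 100 samples, and the remainder.
detected_in_all = pct(1491, variants)
remaining = variants - 1491

# Same logic for the ExAC validation: one assessment per variant per exome.
exac_assessments = variants * exac_exomes
exac_detected = pct(91_743_296, exac_assessments)

print(assessments, per_assessment_detected)
print(detected_in_all, remaining)
print(exac_assessments, exac_detected)
```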

  3. Next generation sequencing for disorders of sex development.

    PubMed

    Tobias, Edward S; McElreavey, Ken

    2014-01-01

    Advances in sequencing technologies are having a major impact on our understanding of the genetic causes of many human congenital disorders. Next generation sequencing (NGS) approaches are particularly important for determining the inherited genetic changes leading to disorders of sex development (DSD). Knowledge of the genetic pathways involved in ovary or testis development is incomplete and, currently, a molecular diagnosis is made in a minority of DSD cases. Here, we review the different NGS strategies applied to the analysis of rare diseases and highlight the potential pitfalls and advantages that are associated with each approach. We also discuss the problems of variant calling as well as the challenges involved in the identification and interpretation of pathogenic mutations from NGS datasets. As clinics start to use NGS on a routine basis, a close collaboration between the molecular and clinical geneticists is essential. This is particularly relevant in the context of unsolicited genetic findings, where clear guidelines regarding counseling, truly informed consent and precise data interpretation will be invaluable.

  4. Next generation sequencing: Coping with rare genetic diseases in China

    PubMed Central

    Cram, David S; Zhou, Daixing

    2016-01-01

    Summary With a population of 1.4 billion, China shares the largest burden of rare genetic diseases worldwide. Current estimates suggest that there are over ten million individuals afflicted with chromosome disease syndromes and well over one million individuals with monogenic disease. Care of patients with rare genetic diseases remains a largely unmet need due to the paucity of available and affordable treatments. Over recent years, there is increasing recognition of the need for affirmative action by government, health providers, clinicians and patients. The advent of new next generation sequencing (NGS) technologies such as whole genome/exome sequencing, offers an unprecedented opportunity to provide large-scale population screening of the Chinese population to identify the molecular causes of rare genetic diseases. As a surrogate for lack of effective treatments, recent development and implementation of noninvasive prenatal testing (NIPT) in China has the greatest potential, as a single technology, for reducing the number of children born with rare genetic diseases. PMID:27672536

  5. Recommendations on e-infrastructures for next-generation sequencing.

    PubMed

    Spjuth, Ola; Bongcam-Rudloff, Erik; Dahlberg, Johan; Dahlö, Martin; Kallio, Aleksi; Pireddu, Luca; Vezzi, Francesco; Korpelainen, Eija

    2016-06-07

    With ever-increasing amounts of data being produced by next-generation sequencing (NGS) experiments, the requirements placed on supporting e-infrastructures have grown. In this work, we provide recommendations based on the collective experiences from participants in the EU COST Action SeqAhead for the tasks of data preprocessing, upstream processing, data delivery, and downstream analysis, as well as long-term storage and archiving. We cover demands on computational and storage resources, networks, software stacks, automation of analysis, education, and also discuss emerging trends in the field. E-infrastructures for NGS require substantial effort to set up and maintain over time, and with sequencing technologies and best practices for data analysis evolving rapidly it is important to prioritize both processing capacity and e-infrastructure flexibility when making strategic decisions to support the data analysis demands of tomorrow. Due to increasingly demanding technical requirements we recommend that e-infrastructure development and maintenance be handled by a professional service unit, be it internal or external to the organization, and emphasis should be placed on collaboration between researchers and IT professionals.

  6. A distributed system for fast alignment of next-generation sequencing data.

    PubMed

    Srimani, Jaydeep K; Wu, Po-Yen; Phan, John H; Wang, May D

    2010-12-01

    We developed a scalable distributed computing system using the Berkeley Open Interface for Network Computing (BOINC) to align next-generation sequencing (NGS) data quickly and accurately. NGS technology is emerging as a promising platform for gene expression analysis due to its high sensitivity compared to traditional genomic microarray technology. However, despite the benefits, NGS datasets can be prohibitively large, requiring significant computing resources to obtain sequence alignment results. Moreover, as the data and alignment algorithms become more prevalent, it will become necessary to examine the effect of the multitude of alignment parameters on various NGS systems. We validate the distributed software system by (1) computing simple timing results to show the speed-up gained by using multiple computers, (2) optimizing alignment parameters using simulated NGS data, and (3) computing NGS expression levels for a single biological sample using optimal parameters and comparing these expression levels to those of a microarray sample. Results indicate that the distributed alignment system achieves an approximately linear speed-up and correctly distributes sequence data to and gathers alignment results from multiple compute clients.
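
    The timing validation described in this abstract can be illustrated with a minimal sketch. It is not the authors' BOINC implementation: the function names (split_reads, speedup, efficiency) and the timing figures in the usage lines are hypothetical and serve only to show how reads might be partitioned across compute clients and how a speed-up figure is derived from wall-clock times.

```python
def split_reads(reads, n_clients):
    """Divide a list of reads into n_clients contiguous, nearly equal chunks."""
    if n_clients < 1:
        raise ValueError("n_clients must be at least 1")
    base, extra = divmod(len(reads), n_clients)
    chunks, start = [], 0
    for i in range(n_clients):
        # The first `extra` clients receive one additional read each.
        size = base + (1 if i < extra else 0)
        chunks.append(reads[start:start + size])
        start += size
    return chunks


def speedup(t_single, t_parallel):
    """Speed-up: single-machine wall time divided by distributed wall time."""
    return t_single / t_parallel


def efficiency(t_single, t_parallel, n_clients):
    """Parallel efficiency; 1.0 corresponds to an ideal linear speed-up."""
    return speedup(t_single, t_parallel) / n_clients


# Hypothetical timings (minutes), not taken from the paper.
chunks = split_reads(list(range(10)), 3)
print(chunks)
print(speedup(120.0, 30.0), efficiency(120.0, 30.0, 4))
```

    An efficiency close to 1.0 across an increasing number of clients is what "approximately linear speed-up" means in practice.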

  7. Mutation Detection in Patients with Retinal Dystrophies Using Targeted Next Generation Sequencing

    PubMed Central

    Weisschuh, Nicole; Mayer, Anja K.; Strom, Tim M.; Kohl, Susanne; Glöckle, Nicola; Schubach, Max; Andreasson, Sten; Bernd, Antje; Birch, David G.; Hamel, Christian P.; Heckenlively, John R.; Jacobson, Samuel G.; Kamme, Christina; Kellner, Ulrich; Kunstmann, Erdmute; Maffei, Pietro; Reiff, Charlotte M.; Rohrschneider, Klaus; Rosenberg, Thomas; Rudolph, Günther; Vámos, Rita; Varsányi, Balázs; Weleber, Richard G.; Wissinger, Bernd

    2016-01-01

    Retinal dystrophies (RD) constitute a group of blinding diseases that are characterized by clinical variability and pronounced genetic heterogeneity. The different nonsyndromic and syndromic forms of RD can be attributed to mutations in more than 200 genes. Consequently, next generation sequencing (NGS) technologies are among the most promising approaches to identify mutations in RD. We screened a large cohort of patients comprising 89 independent cases and families with various subforms of RD applying different NGS platforms. While mutation screening in 50 cases was performed using a RD gene capture panel, 47 cases were analyzed using whole exome sequencing. One family was analyzed using whole genome sequencing. A detection rate of 61% was achieved including mutations in 34 known and two novel RD genes. A total of 69 distinct mutations were identified, including 39 novel mutations. Notably, genetic findings in several families were not consistent with the initial clinical diagnosis. Clinical reassessment resulted in refinement of the clinical diagnosis in some of these families and confirmed the broad clinical spectrum associated with mutations in RD genes. PMID:26766544

  8. Application of next generation sequencing in clinical microbiology and infection prevention.

    PubMed

    Deurenberg, Ruud H; Bathoorn, Erik; Chlebowicz, Monika A; Couto, Natacha; Ferdous, Mithila; García-Cobos, Silvia; Kooistra-Smid, Anna M D; Raangs, Erwin C; Rosema, Sigrid; Veloo, Alida C M; Zhou, Kai; Friedrich, Alexander W; Rossen, John W A

    2017-02-10

    Current molecular diagnostics of human pathogens provide limited information that is often not sufficient for outbreak and transmission investigation. Next generation sequencing (NGS) determines the DNA sequence of a complete bacterial genome in a single sequence run, and from these data, information on resistance and virulence, as well as information for typing, is obtained, which is useful for outbreak investigation. The obtained genome data can be further used for the development of an outbreak-specific screening test. In this review, a general introduction to NGS is presented, including the library preparation and the major characteristics of the most common NGS platforms, such as the MiSeq (Illumina) and the Ion PGM™ (ThermoFisher). An overview is given of the software used for NGS data analyses at the medical microbiology diagnostic laboratory of the University Medical Center Groningen in The Netherlands. Furthermore, applications of NGS in the clinical setting are described, such as outbreak management, molecular case finding, characterization and surveillance of pathogens, rapid identification of bacteria using the 16S-23S rRNA region, taxonomy, metagenomics approaches on clinical samples, and the determination of the transmission of zoonotic micro-organisms from animals to humans. Finally, we share our vision on the use of NGS in personalised microbiology in the near future, pointing out specific requirements.

  9. Challenges and opportunities in estimating viral genetic diversity from next-generation sequencing data

    PubMed Central

    Beerenwinkel, Niko; Günthard, Huldrych F.; Roth, Volker; Metzner, Karin J.

    2012-01-01

    Many viruses, including the clinically relevant RNA viruses HIV (human immunodeficiency virus) and HCV (hepatitis C virus), exist in large populations and display high genetic heterogeneity within and between infected hosts. Assessing intra-patient viral genetic diversity is essential for understanding the evolutionary dynamics of viruses, for designing effective vaccines, and for the success of antiviral therapy. Next-generation sequencing (NGS) technologies allow the rapid and cost-effective acquisition of thousands to millions of short DNA sequences from a single sample. However, this approach entails several challenges in experimental design and computational data analysis. Here, we review the entire process of inferring viral diversity from sample collection to computing measures of genetic diversity. We discuss sample preparation, including reverse transcription and amplification, and the effect of experimental conditions on diversity estimates due to in vitro base substitutions, insertions, deletions, and recombination. The use of different NGS platforms and their sequencing error profiles are compared in the context of various applications of diversity estimation, ranging from the detection of single nucleotide variants (SNVs) to the reconstruction of whole-genome haplotypes. We describe the statistical and computational challenges arising from these technical artifacts, and we review existing approaches, including available software, for their solution. Finally, we discuss open problems, and highlight successful biomedical applications and potential future clinical use of NGS to estimate viral diversity. PMID:22973268
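
    As a small illustration of "computing measures of genetic diversity", the sketch below computes the Shannon entropy of the bases observed at one alignment position. This is one commonly used per-site measure and is shown here as an assumption; the review does not prescribe it, and the function name shannon_entropy and the example columns are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(bases):
    """Shannon entropy (in bits) of the bases observed at one alignment column."""
    counts = Counter(bases)
    n = sum(counts.values())
    # Each term is p * log2(p); only observed bases contribute.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# A column with a single base has zero entropy; an even two-way split has 1 bit.
print(shannon_entropy("AAAA"))
print(shannon_entropy("AACC"))
```

    In real data, sequencing errors inflate such estimates, which is why the error profiles discussed in the abstract matter before any diversity measure is interpreted.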

  10. Management of Incidental Findings in the Era of Next-generation Sequencing

    PubMed Central

    Blackburn, Heather L.; Schroeder, Bradley; Turner, Clesson; Shriver, Craig D.; Ellsworth, Darrell L.; Ellsworth, Rachel E.

    2015-01-01

    Next-generation sequencing (NGS) technologies allow for the generation of whole exome or whole genome sequencing data, which can be used to identify novel genetic alterations associated with defined phenotypes or to expedite discovery of functional variants for improved patient care. Because this robust technology has the ability to identify all mutations within a genome, incidental findings (IF), that is, genetic alterations associated with conditions or diseases unrelated to the patient’s present condition for which current tests are being performed, may have important clinical ramifications. The current debate among genetic scientists and clinicians focuses on the following questions: 1) should any IF be disclosed to patients, and 2) which IF should be disclosed: actionable mutations, variants of unknown significance, or all IF? Policies for disclosure of IF are being developed for when and how to convey these findings and whether adults, minors, or individuals unable to provide consent have the right to refuse receipt of IF. In this review, we detail current NGS technology platforms, discuss pressing issues regarding disclosure of IF, and describe how IF are currently being handled in prenatal, pediatric, and adult patients. PMID:26069456

  11. Probabilistic model based error correction in a set of various mutant sequences analyzed by next-generation sequencing.

    PubMed

    Aita, Takuyo; Ichihashi, Norikazu; Yomo, Tetsuya

    2013-12-01

    To analyze the evolutionary dynamics of a mutant population in an evolutionary experiment, it is necessary to sequence a vast number of mutants by high-throughput (next-generation) sequencing technologies, which enable rapid and parallel analysis of multikilobase sequences. However, the observed sequences include many errors of base call. Therefore, if next-generation sequencing is applied to analysis of a heterogeneous population of various mutant sequences, it is necessary to discriminate between true bases as point mutations and errors of base call in the observed sequences, and to subject the sequences to error-correction processes. To address this issue, we have developed a novel method of error correction based on the Potts model and a maximum a posteriori probability (MAP) estimate of its parameters corresponding to the "true sequences". Our method of error correction utilizes (1) the "quality scores" which are assigned to individual bases in the observed sequences and (2) the neighborhood relationship among the observed sequences mapped in sequence space. The computer experiments of error correction of artificially generated sequences supported the effectiveness of our method, showing that 50-90% of errors were removed. Interestingly, this method is analogous to a probabilistic model based method of image restoration developed in the field of information engineering.
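
    The "quality scores" mentioned in this abstract are conventionally Phred scores, where a score Q corresponds to a base-call error probability of 10^(-Q/10). The sketch below shows only that conversion, not the authors' Potts-model MAP method. The Phred+33 ASCII encoding is an assumption about the input format, and the function names are hypothetical.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score Q to a base-call error probability."""
    return 10 ** (-q / 10)


def decode_phred33(quality_string):
    """Decode a FASTQ quality string, assuming the Phred+33 ASCII offset."""
    return [ord(ch) - 33 for ch in quality_string]


def expected_errors(quality_string):
    """Expected number of miscalled bases in a read (sum of per-base error probabilities)."""
    return sum(phred_to_error_prob(q) for q in decode_phred33(quality_string))


# 'I' encodes Q40, '5' encodes Q20, '!' encodes Q0 under Phred+33.
print(decode_phred33("I5!"))
print(phred_to_error_prob(20))
```

    A per-base probability of this kind is the sort of evidence an error-correction model can weigh against the neighborhood of similar sequences when deciding whether a mismatch is a true mutation.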

  12. Human identification by lice: A Next Generation Sequencing challenge.

    PubMed

    Pilli, Elena; Agostino, Alessandro; Vergani, Debora; Salata, Elena; Ciuna, Ignazio; Berti, Andrea; Caramelli, David; Lambiase, Simonetta

    2016-09-01

    Rapid and progressive advances in molecular biology techniques and the advent of Next Generation Sequencing (NGS) have opened new possibilities for analyses, including the identification of entomological matrices. Insects and other arthropods are widespread in nature and those found at a crime scene can provide a useful contribution to forensic investigations. Entomological evidence is used by experts to define the postmortem interval (PMI), which is essentially based on morphological recognition of the insect and an estimation of its life cycle stage. However, molecular genotyping methods can also provide important support for forensic entomological investigations when the identification of species or human genetic material is required. This case study concerns a collection of insects found in the house of a woman who died from unknown causes. Initially the insects were identified morphologically as belonging to the Pediculidae family, and then human DNA was extracted from their gastrointestinal tract and analyzed. The application of the latest generation forensic DNA assays, such as the Quantifiler® Trio DNA Quantification Kit and the HID-Ion AmpliSeq™ Identity Panel (Applied Biosystems®), detected the presence of human DNA in the samples and determined the genetic profile.

  13. Next-generation sequencing for high-throughput molecular ecology: a step-by-step protocol for targeted multilocus genotyping by pyrosequencing.

    PubMed

    Puritz, Jonathan B; Toonen, Robert J

    2013-01-01

    Next-generation sequencing technology can now provide population biologists and phylogeographers with information at the genomic scale; however, many pertinent questions in population genetics and phylogeography can be answered effectively with modest levels of genomic information. For the past two decades, most population-level studies have lacked nuclear DNA (nDNA) sequence data due to the complications and cost of amplifying and sequencing diploid loci. However, pyrosequencing of emulsion PCR reactions, amplifying from only one molecule at a time, can generate megabases of clonally amplified loci at high coverage, thereby greatly simplifying allelic sequence determination. Here, we present a step-by-step methodology for utilizing the 454 GS FLX Titanium pyrosequencing platform to simultaneously sequence 16 populations (at 20 individuals per population) at 10 different nDNA loci (3,200 loci in total) in one plate of sequencing for less than the cost of traditional Sanger sequencing.
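
    The sample arithmetic in this abstract can be checked with a short sketch. The figures (16 populations, 20 individuals each, 10 loci) come from the abstract; the function name amplicon_count is hypothetical.

```python
def amplicon_count(populations, individuals_per_population, loci):
    """Number of individual-by-locus amplicons pooled into one sequencing plate."""
    individuals = populations * individuals_per_population
    return individuals * loci


# Figures from the abstract: 16 populations x 20 individuals x 10 nDNA loci.
total = amplicon_count(16, 20, 10)
print(total)
```

    The product, 3,200, matches the total stated in the abstract; each of these amplicons has to be traceable back to its individual after pooling.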

  14. Controls on facies and sequence stratigraphy of an upper Miocene carbonate ramp and platform, Melilla basin, NE Morocco

    USGS Publications Warehouse

    Cunningham, K.J.; Collins, Luke S.

    2002-01-01

    Upwelling of cool seawater, paleoceanographic circulation, paleoclimate, local tectonics and relative sea-level change controlled the lithofacies and sequence stratigraphy of a carbonate ramp and overlying platform that are part of a temporally well constrained carbonate complex in the Melilla basin, northeastern Morocco. At Melilla, from oldest to youngest, a third-order depositional sequence within the carbonate complex contains (1) a retrogradational, transgressive, warm temperate-type rhodalgal ramp; (2) an early highstand, progradational, bioclastic platform composed mainly of a temperate-type, bivalve-rich molechfor facies; and (3) late highstand, progradational to downstepping, subtropical/tropical-type chlorozoan fringing Porites reefs. The change from rhodalgal ramp to molechfor platform occurred at 7.0±0.14 Ma near the Tortonian/Messinian boundary. During a late stage in the development of the bioclastic platform a transition from temperate-type molechfor facies to subtropical/tropical-type chlorozoan facies occurred and is bracketed by chron 3An.2n (~6.3-6.6 Ma). Comparison to a well-dated carbonate complex in southeastern Spain at Cabo de Gata suggests that upwelling of cool seawater influenced production of temperate-type limestone within the ramp and platform at Melilla during postulated late Tortonian-early Messinian subtropical/tropical paleoclimatic conditions in the western Paleo-Mediterranean region. The upwelling of cool seawater across the bioclastic platform at Melilla could be related to the beginning of 'siphoning' of deep, cold Atlantic waters into the Paleo-Mediterranean Sea at 7.17 Ma. The facies change within the bioclastic platform from molechfor to chlorozoan facies may be coincident with a reduction of the siphoning of Atlantic waters and the end of upwelling at Melilla during chron 3An.2n. The ramp contains one retrogradational parasequence and the bioclastic platform three progradational parasequences.
Minor erosional surfaces

  15. A novel method for the multiplexed target enrichment of MinION next generation sequencing libraries using PCR-generated baits.

    PubMed

    Karamitros, Timokratis; Magiorkinis, Gkikas

    2015-12-15

    The enrichment of targeted regions within complex next generation sequencing libraries commonly uses biotinylated baits to capture the desired sequences. This method results in high read coverage over the targets and their flanking regions. Oxford Nanopore Technologies recently released a USB 3.0-interfaced sequencer, the MinION. To date no particular method for enriching MinION libraries has been standardized. Here, using biotinylated PCR-generated baits in a novel approach, we describe a simple and efficient way for multiplexed enrichment of MinION libraries, overcoming technical limitations related to the chemistry of the sequencing adapters and the length of the DNA fragments. Using Phage Lambda and Escherichia coli as models, we selectively enrich for specific targets, significantly increasing the corresponding read coverage and eliminating unwanted regions. We show that by capturing genomic fragments that contain the target sequences, we recover reads extending beyond the targeted regions, which can thus be used for the determination of potentially unknown flanking sequences. By pooling enriched libraries derived from two distinct E. coli strains and analyzing them in parallel, we demonstrate the efficiency of this method in multiplexed format. Crucially, we evaluated the optimal bait size for large fragment libraries, and we describe for the first time a standardized method for target enrichment on the MinION platform.

  16. FaceTOON: a unified platform for feature-based cartoon expression generation

    NASA Astrophysics Data System (ADS)

    Zaharia, Titus; Marre, Olivier; Prêteux, Françoise; Monjaux, Perrine

    2008-02-01

    This paper presents the FaceTOON system, a semi-automatic platform dedicated to the creation of verbal and emotional facial expressions, within the applicative framework of 2D cartoon production. The proposed FaceTOON platform makes it possible to rapidly create 3D facial animations with a minimum amount of user interaction. In contrast with existing commercial 3D modeling software, which usually requires advanced 3D graphics skills from its users, the FaceTOON system is based exclusively on 2D interaction mechanisms, the 3D modeling stage being completely transparent for the user. The system takes as input a neutral 3D face model, free of any facial feature, and a set of 2D drawings representing the desired facial features. A 2D/3D virtual mapping procedure makes it possible to obtain a ready-for-animation model which can be directly manipulated and deformed for generating expressions. The platform includes a complete set of dedicated tools for 2D/3D interactive deformation, pose management, key-frame interpolation and MPEG-4 compliant animation and rendering. The proposed FaceTOON system is currently being considered for industrial evaluation and commercialization by the Quadraxis company.

  17. Genomic resources for a commercial flatfish, the Senegalese sole (Solea senegalensis): EST sequencing, oligo microarray design, and development of the Soleamold bioinformatic platform

    PubMed Central

    Cerdà, Joan; Mercadé, Jaume; Lozano, Juan José; Manchado, Manuel; Tingaud-Sequeira, Angèle; Astola, Antonio; Infante, Carlos; Halm, Silke; Viñas, Jordi; Castellana, Barbara; Asensio, Esther; Cañavate, Pedro; Martínez-Rodríguez, Gonzalo; Piferrer, Francesc; Planas, Josep V; Prat, Francesc; Yúfera, Manuel; Durany, Olga; Subirada, Francesc; Rosell, Elisabet; Maes, Tamara

    2008-01-01

    Background The Senegalese sole, Solea senegalensis, is a highly prized flatfish of growing commercial interest for aquaculture in Southern Europe. However, despite the industrial production of Senegalese sole being hampered primarily by lack of information on the physiological mechanisms involved in reproduction, growth and immunity, very limited genomic information is available on this species. Results Sequencing of a S. senegalensis multi-tissue normalized cDNA library, from adult tissues (brain, stomach, intestine, liver, ovary, and testis), larval stages (pre-metamorphosis, metamorphosis), juvenile stages (post-metamorphosis, abnormal fish), and undifferentiated gonads, generated 10,185 expressed sequence tags (ESTs). Clones were sequenced from the 3'-end to identify isoform specific sequences. Assembly of the entire EST collection into contigs gave 5,208 unique sequences of which 1,769 (34%) had matches in GenBank, thus showing a low level of redundancy. The sequence of the 5,208 unigenes was used to design and validate an oligonucleotide microarray representing 5,087 unique Senegalese sole transcripts. Finally, a novel interactive bioinformatic platform, Soleamold, was developed for the Senegalese sole EST collection as well as microarray and ISH data. Conclusion New genomic resources have been developed for S. senegalensis, an economically important fish in aquaculture, which include a collection of expressed genes, an oligonucleotide microarray, and a publicly available bioinformatic platform that can be used to study gene expression in this species. These resources will help elucidate transcriptional regulation in wild and captive Senegalese sole for optimization of its production under intensive culture conditions. PMID:18973667

  18. Genetic sequence relationships of Winnipegosis platform carbonates, Southern Elk Point basin, North Dakota

    SciTech Connect

    Shanley, K.W.; Cross, T.A.

    1988-07-01

    Examination of cores and well-log data from the Winnipegosis Formation (Givetian) within a study area of approximately 11,500 mi² (30,000 km²) in northern North Dakota allows recognition of seven time-stratigraphic progradational units within the Winnipegosis Formation. Together with the underlying Ashern Formation, these units are arranged in landward-stepping, vertical stacking, and seaward-stepping geometric patterns, which reflect changes in relative sea level. Abrupt juxtaposition of shallow over deeper water lithologies, evidence for subaerial exposure, and onlap geometries further suggest that these progradational units form two larger Vail-type sequences separated by regionally persistent unconformities or their correlative conformities. Sea level rise during the early Eifelian caused southeastward onlap of the Ashern Formation onto Middle Silurian carbonates of the Interlake Formation. Maximum flooding, expressed by deepest marine facies and a hardground surface, suggests the existence of a condensed section at the top of the Ashern Formation. This section was developed during the maximum rate of sea level rise. A decrease in the rate of sea level rise resulted in aggradation of lower Winnipegosis units on a gently dipping ramp. These units are represented by nodular and burrowed open-marine limestones with scattered stromatoporoid patch reefs and grainstone shoals. During the subsequent sea level fall, represented by Temple units, a shelf margin with pronounced depositional topography and adjacent starved basin were developed. Temple strata include coral-brachiopod-stromatoporoid reefs and productive fore-reef talus deposits along the shelf-margin rim. With increased rates of sea level fall, the platform interior and shelf margin were subaerially exposed, slope carbonates were dolomitized, and the E-shale was deposited as a lowstand wedge.

  19. Next-generation sequencing (NGS) in the microbiological world: How to make the most of your money.

    PubMed

    Vincent, Antony T; Derome, Nicolas; Boyle, Brian; Culley, Alexander I; Charette, Steve J

    2016-03-16

    The Sanger sequencing method produces relatively long DNA sequences of unmatched quality and has long been considered the gold standard for sequencing DNA. Many improvements of the Sanger method, culminating in fluorescent dyes coupled with automated capillary electrophoresis, enabled the sequencing of the first genomes. Nevertheless, using this technology to sequence whole genomes was costly, laborious and time consuming even for genomes that are relatively small in size. A major technological advance was the introduction of next-generation sequencing (NGS), pioneered by 454 Life Sciences in the early part of the 21st century. NGS allowed scientists to sequence thousands to millions of DNA molecules in a single machine run. Since then, new NGS technologies have emerged and existing NGS platforms have been improved, enabling the production of genome sequences at an unprecedented rate as well as broadening the spectrum of NGS applications. The current affordability of generating genomic information, especially with microbial samples, has resulted in a false sense of simplicity that belies the fact that many researchers still consider these technologies a black box. In this review, our objective is to identify and discuss four steps that we consider crucial to the success of any NGS-related project. These steps are: (1) the definition of the research objectives beyond sequencing and appropriate experimental planning, (2) library preparation, (3) sequencing and (4) data analysis. The goal of this review is to give an overview of the process, from sample to analysis, and discuss how to optimize your resources to achieve the most from your NGS-based research. Regardless of the evolution and improvement of the sequencing technologies, these four steps will remain relevant.

  20. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms

    PubMed Central

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared the obtained sequence information with the available donkey draft genome (and the Illumina reads from which it originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions, we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated by combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450
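
    The N50 statistic cited in this abstract as the reason for the difference in aligned reads can be computed as in the minimal sketch below. It uses the standard definition (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly); the function name n50 and the example lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (bp).
print(n50([80, 70, 50, 40, 30, 20]))
```

    A larger N50 indicates a more contiguous assembly, so reads are less likely to fall across contig breaks and fail to align.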

  1. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    PubMed

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared the obtained sequence information with the available donkey draft genome (and the Illumina reads from which it originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions, we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated by combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  2. Next-generation sequencing in X-linked intellectual disability

    PubMed Central

    Tzschach, Andreas; Grasshoff, Ute; Beck-Woedl, Stefanie; Dufke, Claudia; Bauer, Claudia; Kehrer, Martin; Evers, Christina; Moog, Ute; Oehl-Jaschkowitz, Barbara; Di Donato, Nataliya; Maiwald, Robert; Jung, Christine; Kuechler, Alma; Schulz, Solveig; Meinecke, Peter; Spranger, Stephanie; Kohlhase, Jürgen; Seidel, Jörg; Reif, Silke; Rieger, Manuela; Riess, Angelika; Sturm, Marc; Bickmann, Julia; Schroeder, Christopher; Dufke, Andreas; Riess, Olaf; Bauer, Peter

    2015-01-01

    X-linked intellectual disability (XLID) is a genetically heterogeneous disorder with more than 100 genes known to date. Most genes are responsible for a small proportion of patients only, which has hitherto hampered the systematic screening of large patient cohorts. We performed targeted enrichment and next-generation sequencing of 107 XLID genes in a cohort of 150 male patients. One hundred patients had sporadic intellectual disability, and 50 patients had a family history suggestive of XLID. We also analysed a sporadic female patient with severe ID and epilepsy because she had strongly skewed X-inactivation. Target enrichment and highly parallel sequencing allowed a diagnostic coverage of >10 reads for ~96% of all coding bases of the XLID genes at a mean coverage of 124 reads. We found 18 pathogenic variants in 13 XLID genes (AP1S2, ATRX, CUL4B, DLG3, IQSEC2, KDM5C, MED12, OPHN1, SLC9A6, SMC1A, UBE2A, UPF3B and ZDHHC9) among the 150 male patients. Thirteen pathogenic variants were present in the group of 50 familial patients (26%), and 5 pathogenic variants among the 100 sporadic patients (5%). Systematic gene dosage analysis for low coverage exons detected one pathogenic hemizygous deletion. An IQSEC2 nonsense variant was detected in the female ID patient, providing further evidence for a role of this gene in encephalopathy in females. Skewed X-inactivation was more frequently observed in mothers with pathogenic variants compared with those without known X-linked defects. The mutation rate in the cohort of sporadic patients corroborates previous estimates of 5–10% for X-chromosomal defects in male ID patients. PMID:25649377
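
    The detection rates reported in this abstract follow directly from the stated counts, as the short sketch below shows. The counts (13 of 50 familial, 5 of 100 sporadic, 18 of 150 overall) are taken from the abstract; the overall percentage is derived here rather than quoted, and the function name detection_rate is hypothetical.

```python
def detection_rate(n_with_pathogenic_variant, n_patients):
    """Percentage of patients in whom a pathogenic variant was found."""
    return 100 * n_with_pathogenic_variant / n_patients


familial = detection_rate(13, 50)    # familial group from the abstract
sporadic = detection_rate(5, 100)    # sporadic group from the abstract
overall = detection_rate(18, 150)    # all male patients (derived, not quoted)
print(familial, sporadic, overall)
```

    The roughly fivefold difference between the familial and sporadic groups is the comparison the abstract draws against earlier estimates of 5–10% for sporadic male patients.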

  3. Authentication of Herbal Supplements Using Next-Generation Sequencing

    PubMed Central

    Braukmann, Thomas W. A.; Borisenko, Alex V.; Zakharov, Evgeny V.

    2016-01-01

    Background DNA-based testing has been gaining acceptance as a tool for authentication of a wide range of food products; however, its applicability for testing of herbal supplements remains contentious. Methods We utilized Sanger and Next-Generation Sequencing (NGS) for taxonomic authentication of fifteen herbal supplements representing three different producers from five medicinal plants: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-graecum. Experimental design included three modifications of DNA extraction, two lysate dilutions, Internal Amplification Control, and multiple negative controls to exclude background contamination. Ginkgo supplements were also analyzed using HPLC-MS for the presence of active medicinal components. Results All supplements yielded DNA from multiple species, rendering Sanger sequencing results for rbcL and ITS2 regions either uninterpretable or non-reproducible between the experimental replicates. Overall, DNA from the manufacturer-listed medicinal plants was successfully detected in seven out of eight dry herb form supplements; however, low or poor DNA recovery due to degradation was observed in most plant extracts (none detected by Sanger; three out of seven by NGS). NGS also revealed a diverse community of fungi, known to be associated with live plant material and/or the fermentation process used in the production of plant extracts. HPLC-MS testing demonstrated that Ginkgo supplements with degraded DNA contained ten key medicinal components. Conclusion Quality control of herbal supplements should utilize a synergetic approach targeting both DNA and bioactive components, especially for standardized extracts with degraded DNA. The NGS workflow developed in this study enables reliable detection of plant and fungal DNA and can be utilized by manufacturers for quality assurance of raw plant materials, contamination control during the production process, and the final product

  4. Ocean colour opportunities from Meteosat Second and Third Generation geostationary platforms

    NASA Astrophysics Data System (ADS)

    Kwiatkowska, Ewa J.; Ruddick, Kevin; Ramon, Didier; Vanhellemont, Quinten; Brockmann, Carsten; Lebreton, Carole; Bonekamp, Hans G.

    2016-05-01

    Ocean colour applications from medium-resolution polar-orbiting satellite sensors have now matured and evolved into operational services. These applications are enabled by the Sentinel-3 OLCI space sensors of the European Earth Observation Copernicus programme and the VIIRS sensors of the US Joint Polar Satellite System programme. Key drivers for the Copernicus ocean colour services are the national obligations of the EU member states to report on the quality of marine, coastal and inland waters for the EU Water Framework Directive and Marine Strategy Framework Directive. Further applications include CO2 sequestration, carbon cycle and climate, fisheries and aquaculture management, near-real-time alerting to harmful algae blooms, environmental monitoring and forecasting, and assessment of sediment transport in coastal waters. Ocean colour data from polar-orbiting satellite platforms, however, suffer from fractional coverage, primarily due to clouds, and inadequate resolution of quickly varying processes. Ocean colour remote sensing from geostationary platforms can provide significant improvements in coverage and sampling frequency and support new applications and services. EUMETSAT's SEVIRI instrument on the geostationary Meteosat Second Generation platforms (MSG) is not designed to meet ocean colour mission requirements; however, it has been demonstrated to provide a valuable contribution, particularly in combination with dedicated ocean colour polar observations. This paper describes the ongoing effort to develop operational ocean colour water turbidity and related products and user services from SEVIRI. SEVIRI's multi-temporal capabilities can benefit users requiring improved local-area coverage and frequent diurnal observations. A survey of user requirements and a study of technical capabilities and limitations of the SEVIRI instruments are the basis for this development and are described in this paper. The products will support monitoring of sediment transport

  5. Ocean colour products from geostationary platforms, opportunities with Meteosat Second and Third Generation

    NASA Astrophysics Data System (ADS)

    Kwiatkowska, E. J.; Ruddick, K.; Ramon, D.; Vanhellemont, Q.; Brockmann, C.; Lebreton, C.; Bonekamp, H. G.

    2015-12-01

    Ocean colour applications from medium-resolution polar-orbiting satellite sensors have now matured and evolved into operational services. The examples include the Sentinel-3 OLCI missions of the European Earth Observation Copernicus programme and the VIIRS missions of the US Joint Polar Satellite System programme. Key drivers for Copernicus ocean colour services are the national obligations of the EU member states to report on the quality of marine, coastal and inland waters for the EU Water Framework Directive and Marine Strategy Framework Directive. Further applications include CO2 sequestration, carbon cycle and climate, fisheries and aquaculture management, near-real-time alerting to harmful algae blooms, environmental monitoring and forecasting, and assessment of sediment transport in coastal waters. Ocean colour data from polar-orbiting satellite platforms, however, suffer from fractional coverage, primarily due to clouds, and inadequate resolution of quickly varying processes. Ocean colour remote sensing from geostationary platforms can provide significant improvements in coverage and sampling frequency and support new applications and services. EUMETSAT's SEVIRI instrument on the geostationary Meteosat Second Generation platforms (MSG) is not designed to meet ocean colour mission requirements; however, it has been demonstrated to provide a valuable contribution, particularly in combination with dedicated ocean colour polar observations. This paper describes the ongoing effort to develop operational ocean colour water turbidity and related products and user services from SEVIRI. A survey of user requirements and a study of technical capabilities and limitations of the SEVIRI instruments are the basis for this development and are described in this paper. The products will support monitoring of sediment transport, water clarity, and tidal dynamics. Further products and services are anticipated from EUMETSAT's FCI instruments on Meteosat Third Generation

  6. High-throughput sequencing of core STR loci for forensic genetic investigations using the Roche Genome Sequencer FLX platform.

    PubMed

    Fordyce, Sarah L; Ávila-Arcos, Maria C; Rockenbauer, Eszter; Børsting, Claus; Frank-Hansen, Rune; Petersen, Frederik Torp; Willerslev, Eske; Hansen, Anders J; Morling, Niels; Gilbert, M Thomas P

    2011-08-01

    The analysis and profiling of short tandem repeat (STR) loci is routinely used in forensic genetics. Current methods to investigate STR loci, including PCR-based standard fragment analyses and capillary electrophoresis, only provide amplicon lengths that are used to estimate the number of STR repeat units. These methods do not allow for the full resolution of STR base composition that sequencing approaches could provide. Here we present an STR profiling method based on the use of the Roche Genome Sequencer (GS) FLX to simultaneously sequence multiple core STR loci. Using this method in combination with a bioinformatic tool designed specifically to analyze sequence lengths and frequencies, we found that GS FLX STR sequence data are comparable to conventional capillary electrophoresis-based STR typing. Furthermore, we found DNA base substitutions and repeat sequence variations that would not have been identified using conventional STR typing.
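
    The contrast drawn above between length-only typing and sequence-resolved typing can be illustrated with a minimal Python sketch. The two reads, the AGAT motif, and the function name are hypothetical; they are not taken from the study or from its bioinformatic tool.

```python
import re

def longest_repeat_run(read: str, motif: str) -> int:
    """Return the largest number of consecutive copies of `motif` found in `read`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(match.group(0)) // len(motif) for match in pattern.finditer(read)]
    return max(runs, default=0)

# Two hypothetical alleles of identical length (40 bases each).
uniform_allele = "AGAT" * 10
variant_allele = "AGAT" * 5 + "AGAC" + "AGAT" * 4  # one internal base substitution

# A length-based method sees no difference between the two alleles.
same_length = len(uniform_allele) == len(variant_allele)

# Sequence-based inspection distinguishes them by their uninterrupted repeat runs.
uniform_run = longest_repeat_run(uniform_allele, "AGAT")
variant_run = longest_repeat_run(variant_allele, "AGAT")
```

    Fragment-length analysis would assign both alleles the same size, whereas the sequence shows that the second allele carries an interrupted repeat.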

  7. Sequencing technologies and genome sequencing.

    PubMed

    Pareek, Chandra Shekhar; Smoczynski, Rafal; Tretyn, Andrzej

    2011-11-01

    High-throughput next-generation sequencing (HT-NGS) technologies are currently the hottest topic in human and animal genomics research; they can produce over 100 times more data than the most sophisticated capillary sequencers based on the Sanger method. With the ongoing development of high-throughput sequencing machines and the advancement of modern bioinformatics tools at an unprecedented pace, the goal of sequencing individual genomes of living organisms at a cost of $1,000 each seems realistically feasible in the near future. In the relatively short time frame since 2005, HT-NGS technologies have been revolutionizing human and animal genome research through analysis of chromatin immunoprecipitation coupled to DNA microarray (ChIP-chip) or sequencing (ChIP-seq), RNA sequencing (RNA-seq), whole genome genotyping, genome-wide structural variation, de novo assembly and re-assembly of genomes, mutation detection and carrier screening, detection of inherited disorders and complex human diseases, DNA library preparation, paired ends and genomic captures, sequencing of mitochondrial genomes, and personal genomics. In this review, we address the important features of HT-NGS: first-generation DNA sequencers; the birth of HT-NGS; second-generation HT-NGS platforms; third-generation HT-NGS platforms, including the single-molecule Heliscope™, SMRT™ and RNAP sequencers and Nanopore; the Archon Genomics X PRIZE foundation; a comparison of second- and third-generation HT-NGS platforms; and the applications, advances and future perspectives of sequencing technologies in human and animal genome research.

  8. Scanning the effects of ethyl methanesulfonate on the whole genome of Lotus japonicus using second-generation sequencing analysis.

    PubMed

    Mohd-Yusoff, Nur Fatihah; Ruperao, Pradeep; Tomoyoshi, Nurain Emylia; Edwards, David; Gresshoff, Peter M; Biswas, Bandana; Batley, Jacqueline

    2015-02-06

    Genetic structure can be altered by chemical mutagenesis, which is a common method applied in molecular biology and genetics. Second-generation sequencing provides a platform to reveal base alterations occurring in the whole genome due to mutagenesis. A model legume, Lotus japonicus ecotype Miyakojima, was chemically mutated with alkylating ethyl methanesulfonate (EMS) for the scanning of DNA lesions throughout the genome. Using second-generation sequencing, two individually mutated third-generation progeny (M3, named AM and AS) were sequenced and analyzed to identify single nucleotide polymorphisms and reveal the effects of EMS on nucleotide sequences in these mutant genomes. Single-nucleotide polymorphisms were found on average once every 208 kb (AS) and 202 kb (AM), with a mutation bias toward G/C-to-A/T changes at a low percentage. Most mutations were intergenic. The mutation spectrum of the genomes was comparable in their individual chromosomes; however, each mutated genome has unique alterations, which are useful to identify causal mutations for their phenotypic changes. The data obtained demonstrate that whole genomic sequencing is applicable as a high-throughput tool to investigate genomic changes due to mutagenesis. The identification of these single-point mutations will facilitate the identification of phenotypically causative mutations in EMS-mutated germplasm.
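
    The two summary quantities mentioned above, SNP spacing and the share of G/C-to-A/T changes, can be computed along the following lines. This is a minimal sketch: the SNP list and region length are made-up inputs, and counting only G>A and C>T substitutions as "G/C-to-A/T" is an assumption about how the class is defined, not the authors' stated procedure.

```python
from collections import Counter

# Assumption: "G/C-to-A/T" is taken to mean the transitions G>A and C>T.
GC_TO_AT = {("G", "A"), ("C", "T")}

def mutation_spectrum(snps):
    """Count each (reference base, alternate base) class in a list of SNPs."""
    return Counter((ref.upper(), alt.upper()) for ref, alt in snps)

def gc_to_at_fraction(snps):
    """Fraction of SNPs that fall in the assumed G/C-to-A/T class."""
    snps = list(snps)
    if not snps:
        return 0.0
    hits = sum(1 for ref, alt in snps if (ref.upper(), alt.upper()) in GC_TO_AT)
    return hits / len(snps)

def mean_spacing_kb(region_length_bp, snp_count):
    """Average distance between SNPs, in kilobases."""
    return region_length_bp / snp_count / 1000

# Hypothetical inputs for illustration only.
example_snps = [("G", "A"), ("C", "T"), ("A", "G"), ("G", "A")]
spectrum = mutation_spectrum(example_snps)
fraction = gc_to_at_fraction(example_snps)
spacing = mean_spacing_kb(2_080_000, 10)
```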

  9. Evolution of a Reconfigurable Processing Platform for a Next Generation Space Software Defined Radio

    NASA Technical Reports Server (NTRS)

    Kacpura, Thomas J.; Downey, Joseph A.; Anderson, Keffery R.; Baldwin, Keith

    2014-01-01

    The National Aeronautics and Space Administration (NASA) Harris Ka-Band Software Defined Radio (SDR) is the first fully reprogrammable space-qualified SDR operating in the Ka-Band frequency range. Providing exceptionally higher data communication rates than previously possible, this SDR offers in-orbit reconfiguration, multi-waveform operation, and fast deployment due to its highly modular hardware and software architecture. Currently in operation on the International Space Station (ISS), this new paradigm of reconfigurable technology is enabling experimenters to investigate navigation and networking in the space environment. The modular SDR and the NASA-developed Space Telecommunications Radio System (STRS) architecture standard are the basis for Harris's reusable digital signal processing space platform, trademarked as AppSTAR. As a result, two new space radio products are a synthetic aperture radar payload and an Automatic Dependent Surveillance-Broadcast (ADS-B) receiver. In addition, Harris is currently developing many new products similar to the Ka-Band software defined radio for other applications. For NASA's next generation flight Ka-Band radio development, leveraging these advancements could lead to a more robust and more capable software defined radio. The space environment has special considerations different from terrestrial applications that must be considered for any system operated in space. Each space mission has unique requirements that can make these systems unique. These unique requirements can make products that are expensive and limited in reuse. Space systems put a premium on size, weight and power. A key trade is the amount of reconfigurability in a space system. The more reconfigurable the hardware platform, the easier it is to adapt the platform to the next mission, and this reduces the amount of non-recurring engineering costs. However, the more reconfigurable platforms often use more spacecraft resources. Software has similar considerations

  10. High-Throughput Microdissection for Next-Generation Sequencing

    PubMed Central

    Rosenberg, Avi Z.; Armani, Michael D.; Fetsch, Patricia A.; Xi, Liqiang; Pham, Tina Thu; Raffeld, Mark; Chen, Yun; O’Flaherty, Neil; Stussman, Rebecca; Blackler, Adele R.; Du, Qiang; Hanson, Jeffrey C.; Roth, Mark J.; Filie, Armando C.; Roh, Michael H.; Emmert-Buck, Michael R.; Hipp, Jason D.; Tangrea, Michael A.

    2016-01-01

    Precision medicine promises to enhance patient treatment through the use of emerging molecular technologies, including genomics, transcriptomics, and proteomics. However, current tools in surgical pathology lack the capability to efficiently isolate specific cell populations in complex tissues/tumors, which can confound molecular results. Expression microdissection (xMD) is an immuno-based cell/subcellular isolation tool that procures targets of interest from a cytological or histological specimen. In this study, we demonstrate the accuracy and precision of xMD by rapidly isolating immunostained targets, including cytokeratin AE1/AE3, p53, and estrogen receptor (ER) positive cells and nuclei from tissue sections. Other targets procured included green fluorescent protein (GFP) expressing fibroblasts, in situ hybridization positive Epstein-Barr virus nuclei, and silver stained fungi. In order to assess the effect on molecular data, xMD was utilized to isolate specific targets from a mixed population of cells where the targets constituted only 5% of the sample. Target enrichment from this admixed cell population prior to next-generation sequencing (NGS) produced a minimum 13-fold increase in mutation allele frequency detection. These data suggest a role for xMD in a wide range of molecular pathology studies, as well as in the clinical workflow for samples where tumor cell enrichment is needed, or for those with a relative paucity of target cells. PMID:26999048

  11. Next generation sequencing and comparative analyses of Xenopus mitogenomes

    PubMed Central

    2012-01-01

    under strong negative (purifying) selection, with genes under the strongest pressure (Complex 4) also being the most highly expressed, highlighting their potentially crucial functions in the mitochondrial respiratory chain. Conclusions: Next generation sequencing of long-PCR amplicons using single taxon or multi-taxon approaches enabled two new species of Xenopus mtDNA to be fully characterized. We anticipate our complete mitochondrial genome amplification methods to be applicable to other amphibians, helpful for identifying the most appropriate markers for differentiating species, populations and resolving phylogenies, a pressing need since amphibians are undergoing drastic global decline. Our mtDNAs also provide templates for conserved primer design and the assembly of RNA and DNA reads following high throughput “omic” techniques such as RNA- and ChIP-seq. These could help us better understand how processes such as mitochondrial replication and gene expression influence Xenopus growth and development, as well as how they evolved and are regulated. PMID:22992290

  12. Simulating Next-Generation Sequencing Datasets from Empirical Mutation and Sequencing Models

    PubMed Central

    Stephens, Zachary D.; Hudson, Matthew E.; Mainzer, Liudmila S.; Taschuk, Morgan; Weber, Matthew R.; Iyer, Ravishankar K.

    2016-01-01

    An obstacle to validating and benchmarking methods for genome analysis is that there are few reference datasets available for which the “ground truth” about the mutational landscape of the sample genome is known and fully validated. Additionally, the free and public availability of real human genome datasets is incompatible with the preservation of donor privacy. In order to better analyze and understand genomic data, we need test datasets that model all variants, reflecting known biology as well as sequencing artifacts. Read simulators can fulfill this requirement, but are often criticized for limited resemblance to true data and overall inflexibility. We present NEAT (NExt-generation sequencing Analysis Toolkit), a set of tools that not only includes an easy-to-use read simulator, but also scripts to facilitate variant comparison and tool evaluation. NEAT has a wide variety of tunable parameters which can be set manually on the default model or parameterized using real datasets. The software is freely available at github.com/zstephens/neat-genreads. PMID:27893777
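
    To make the idea of a read simulator with known ground truth concrete, here is a toy Python sketch. It is not NEAT and does not use NEAT's code, parameters, or models; the function name, the uniform substitution-error model, and the reference string are all assumptions chosen for brevity.

```python
import random

def simulate_reads(reference, read_length, n_reads, error_rate, seed=0):
    """Sample reads uniformly from `reference` and add random substitution errors.

    Each read records where it came from and which reference positions were
    altered, so a downstream tool can be scored against the known truth.
    """
    rng = random.Random(seed)  # fixed seed keeps the simulation reproducible
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(reference) - read_length + 1)
        observed = []
        error_positions = []
        for offset, base in enumerate(reference[start:start + read_length]):
            if rng.random() < error_rate:
                # Replace the true base with one of the other bases.
                base = rng.choice([b for b in "ACGT" if b != base])
                error_positions.append(start + offset)
            observed.append(base)
        reads.append({
            "start": start,
            "sequence": "".join(observed),
            "error_positions": error_positions,
        })
    return reads

# Hypothetical 20-base reference, for illustration only.
reference = "ACGTACGTACGTACGTACGT"
clean_reads = simulate_reads(reference, read_length=8, n_reads=5, error_rate=0.0)
noisy_reads = simulate_reads(reference, read_length=8, n_reads=5, error_rate=1.0)
```

    Because every simulated error is logged with its reference coordinate, a variant caller's output can be compared directly with the recorded truth, which is the benchmarking use case the abstract describes.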

  13. Efficient DNA fingerprinting based on the targeted sequencing of active retrotransposon insertion sites using a bench-top high-throughput sequencing platform.

    PubMed

    Monden, Yuki; Yamamoto, Ayaka; Shindo, Akiko; Tahara, Makoto

    2014-10-01

    In many crop species, DNA fingerprinting is required for the precise identification of cultivars to protect the rights of breeders. Many families of retrotransposons have multiple copies throughout the eukaryotic genome and their integrated copies are inherited genetically. Thus, their insertion polymorphisms among cultivars are useful for DNA fingerprinting. In this study, we conducted DNA fingerprinting based on the insertion polymorphisms of active retrotransposon families (Rtsp-1 and LIb) in sweet potato. Using 38 cultivars, we identified 2,024 insertion sites in the two families with an Illumina MiSeq sequencing platform. Of these insertion sites, 91.4% appeared to be polymorphic among the cultivars and 376 cultivar-specific insertion sites were identified, which were converted directly into cultivar-specific sequence-characterized amplified region (SCAR) markers. A phylogenetic tree was constructed using these insertion sites, which corresponded well with known pedigree information, thereby indicating their suitability for genetic diversity studies. Thus, the genome-wide comparative analysis of active retrotransposon insertion sites using the bench-top MiSeq sequencing platform is highly effective for DNA fingerprinting without any requirement for whole genome sequence information. This approach may facilitate the development of a practical polymerase chain reaction-based cultivar diagnostic system and could also be applied to the determination of genetic relationships.
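
    The classification of insertion sites into polymorphic and cultivar-specific sets can be sketched in Python as follows. The site and cultivar names are invented, and the definitions used here (polymorphic: present in some but not all cultivars; cultivar-specific: present in exactly one) are assumptions, since the abstract does not spell them out.

```python
def summarize_insertions(site_to_cultivars, all_cultivars):
    """Split insertion sites into polymorphic and cultivar-specific groups.

    site_to_cultivars maps a site identifier to the set of cultivars in which
    the insertion was detected.
    """
    n_cultivars = len(all_cultivars)
    polymorphic = set()
    cultivar_specific = {}
    for site, carriers in site_to_cultivars.items():
        # Assumed definition: present in at least one but not every cultivar.
        if 0 < len(carriers) < n_cultivars:
            polymorphic.add(site)
        # Assumed definition: present in exactly one cultivar.
        if len(carriers) == 1:
            cultivar_specific[site] = next(iter(carriers))
    return polymorphic, cultivar_specific

# Hypothetical presence/absence data for three cultivars.
cultivars = {"cv_A", "cv_B", "cv_C"}
sites = {
    "site_1": {"cv_A", "cv_B", "cv_C"},  # shared by all: not informative
    "site_2": {"cv_A", "cv_B"},          # polymorphic
    "site_3": {"cv_C"},                  # polymorphic and cultivar-specific
}
polymorphic, cultivar_specific = summarize_insertions(sites, cultivars)
```

    Sites in the cultivar-specific group are the natural candidates for conversion into single-cultivar diagnostic markers such as the SCAR markers mentioned above.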

  14. Efficient DNA Fingerprinting Based on the Targeted Sequencing of Active Retrotransposon Insertion Sites Using a Bench-Top High-Throughput Sequencing Platform

    PubMed Central

    Monden, Yuki; Yamamoto, Ayaka; Shindo, Akiko; Tahara, Makoto

    2014-01-01

    In many crop species, DNA fingerprinting is required for the precise identification of cultivars to protect the rights of breeders. Many families of retrotransposons have multiple copies throughout the eukaryotic genome and their integrated copies are inherited genetically. Thus, their insertion polymorphisms among cultivars are useful for DNA fingerprinting. In this study, we conducted DNA fingerprinting based on the insertion polymorphisms of active retrotransposon families (Rtsp-1 and LIb) in sweet potato. Using 38 cultivars, we identified 2,024 insertion sites in the two families with an Illumina MiSeq sequencing platform. Of these insertion sites, 91.4% appeared to be polymorphic among the cultivars and 376 cultivar-specific insertion sites were identified, which were converted directly into cultivar-specific sequence-characterized amplified region (SCAR) markers. A phylogenetic tree was constructed using these insertion sites, which corresponded well with known pedigree information, thereby indicating their suitability for genetic diversity studies. Thus, the genome-wide comparative analysis of active retrotransposon insertion sites using the bench-top MiSeq sequencing platform is highly effective for DNA fingerprinting without any requirement for whole genome sequence information. This approach may facilitate the development of a practical polymerase chain reaction-based cultivar diagnostic system and could also be applied to the determination of genetic relationships. PMID:24935865

  15. Illumina next generation sequencing data and expression microarrays data from retinoblastoma and medulloblastoma tissues

    PubMed Central

    García-Chequer, A.J.; Méndez-Tenorio, A.; Olguín-López, G.; Sánchez-Vallejo, C.; Isa, P.; Arias, C.F.; Torres, J.; Hernández-Angeles, A.; Ramírez-Ortiz, M.A.; Lara, C.; Cabrera-Muñoz, Ma.de.L.; Sadowinski-Pine, S.; Bravo-Ortiz, J.C.; Ramón-García, G.; Diegopérez-Ramírez, J.; Ramírez-Reyes, G.; Casarrubias-Islas, R.; Ramírez, J.; Orjuela, M.; Ponce-Castañeda, M.V.

    2016-01-01

    Retinoblastoma (Rb) is a pediatric intraocular malignancy and probably the most robust clinical model on which genetic predisposition to develop cancer has been demonstrated. Since deletions in chromosome 13 have been described in this tumor, we performed next generation sequencing to test whether recurrent losses could be detected in low coverage data. We used the Illumina platform for 13 tumor tissue samples: two pools of 4 retinoblastoma cases each and one pool of 5 medulloblastoma cases (raw data can be found at http://www.ebi.ac.uk/ena/data/view/PRJEB6630). We first created an in silico reference profile generated from a human sequenced genome (GRCh37p5). From these data we calculated an integrity score to get an overview of gains and losses in all chromosomes; we next analyzed each chromosome in windows of 40 kb length, calculating for each window the log2 ratio between reads from the tumor pool and the in silico reference. Finally, we generated panoramic maps with all the windows, whether lost or gained, along each chromosome, associated with its cytogenetic bands to facilitate interpretation. Expression microarray analysis was performed for the same samples, and a list of over- and under-expressed genes is presented here. For this detection a significance analysis was done and a log2 fold change was chosen as significant (raw data can be found at http://www.ncbi.nlm.nih.gov/geo/, accession number GSE11488). The complete research article can be found in the journal Cancer Genetics (Garcia-Chequer et al., in press) [1]. In summary, here we provide an overview with visual graphics of gains and losses chromosome by chromosome in retinoblastoma and medulloblastoma, as well as the integrity score analysis and a list of genes with relevant associated expression. This material can be useful to researchers who may want to explore gains and losses in other malignant tumors with this approach or compare their data with retinoblastoma. PMID:26937470
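
    The windowed log2-ratio step described above can be sketched in Python as follows. Only the 40 kb window size comes from the abstract; the function names, the library-size normalization, and the pseudocount are assumptions added so that the example is runnable, and they may differ from what the authors did.

```python
import math

WINDOW_SIZE = 40_000  # 40 kb windows, as stated in the abstract

def count_reads_per_window(read_starts, chrom_length, window_size=WINDOW_SIZE):
    """Count read start positions (0-based) falling in each fixed-size window."""
    n_windows = -(-chrom_length // window_size)  # ceiling division
    counts = [0] * n_windows
    for position in read_starts:
        counts[position // window_size] += 1
    return counts

def window_log2_ratios(tumor_counts, reference_counts, pseudocount=1.0):
    """Per-window log2(tumor / reference) after scaling both to the same total.

    The pseudocount (an assumption) avoids division by zero in empty windows.
    """
    if len(tumor_counts) != len(reference_counts):
        raise ValueError("profiles must cover the same windows")
    tumor_total = sum(tumor_counts) + pseudocount * len(tumor_counts)
    ref_total = sum(reference_counts) + pseudocount * len(reference_counts)
    return [
        math.log2(((t + pseudocount) / tumor_total) / ((r + pseudocount) / ref_total))
        for t, r in zip(tumor_counts, reference_counts)
    ]

# Hypothetical counts: the middle window has lost roughly half of its reads.
tumor = [100, 50, 100]
reference = [100, 100, 100]
ratios = window_log2_ratios(tumor, reference)
```

    Windows with a clearly negative ratio are candidate losses and windows with a clearly positive ratio are candidate gains; any calling threshold would have to be chosen separately.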

  16. Illumina next generation sequencing data and expression microarrays data from retinoblastoma and medulloblastoma tissues.

    PubMed

    García-Chequer, A J; Méndez-Tenorio, A; Olguín-López, G; Sánchez-Vallejo, C; Isa, P; Arias, C F; Torres, J; Hernández-Angeles, A; Ramírez-Ortiz, M A; Lara, C; Cabrera-Muñoz, Ma de L; Sadowinski-Pine, S; Bravo-Ortiz, J C; Ramón-García, G; Diegopérez-Ramírez, J; Ramírez-Reyes, G; Casarrubias-Islas, R; Ramírez, J; Orjuela, M; Ponce-Castañeda, M V

    2016-03-01

    Retinoblastoma (Rb) is a pediatric intraocular malignancy and probably the most robust clinical model on which genetic predisposition to develop cancer has been demonstrated. Since deletions in chromosome 13 have been described in this tumor, we performed next generation sequencing to test whether recurrent losses could be detected in low coverage data. We used the Illumina platform for 13 tumor tissue samples: two pools of 4 retinoblastoma cases each and one pool of 5 medulloblastoma cases (raw data can be found at http://www.ebi.ac.uk/ena/data/view/PRJEB6630). We first created an in silico reference profile generated from a human sequenced genome (GRCh37p5). From these data we calculated an integrity score to get an overview of gains and losses in all chromosomes; we next analyzed each chromosome in windows of 40 kb length, calculating for each window the log2 ratio between reads from the tumor pool and the in silico reference. Finally, we generated panoramic maps with all the windows, whether lost or gained, along each chromosome, associated with its cytogenetic bands to facilitate interpretation. Expression microarray analysis was performed for the same samples, and a list of over- and under-expressed genes is presented here. For this detection a significance analysis was done and a log2 fold change was chosen as significant (raw data can be found at http://www.ncbi.nlm.nih.gov/geo/, accession number GSE11488). The complete research article can be found in the journal Cancer Genetics (Garcia-Chequer et al., in press) [1]. In summary, here we provide an overview with visual graphics of gains and losses chromosome by chromosome in retinoblastoma and medulloblastoma, as well as the integrity score analysis and a list of genes with relevant associated expression. This material can be useful to researchers who may want to explore gains and losses in other malignant tumors with this approach or compare their data with retinoblastoma.

  17. Detecting Positive Selection of Korean Native Goat Populations Using Next-Generation Sequencing

    PubMed Central

    Lee, Wonseok; Ahn, Sojin; Taye, Mengistie; Sung, Samsun; Lee, Hyun-Jeong; Cho, Seoae; Kim, Heebal

    2016-01-01

    Goats (Capra hircus) are one of the oldest species of domesticated animals. Native Korean goats are a particularly interesting group, as they are indigenous to the area and were raised in the Korean peninsula almost 2,000 years ago. Although they have a small body size and produce low volumes of milk and meat, they are quite resistant to lumbar paralysis. Our study aimed to reveal the distinct genetic features and patterns of selection in native Korean goats by comparing the genomes of native Korean goat and crossbred goat populations. We sequenced the whole genome of 15 native Korean goats and 11 crossbred goats using next-generation sequencing (Illumina platform) to compare the genomes of the two populations. We found decreased nucleotide diversity in the native Korean goats compared to the crossbred goats. Genetic structural analysis demonstrated that the native Korean goat and crossbred goat populations shared a common ancestry, but were clearly distinct. Finally, to reveal the native Korean goat’s selective sweep region, selective sweep signals were identified in the native Korean goat genome using cross-population extended haplotype homozygosity (XP-EHH) and a cross-population composite likelihood ratio test (XP-CLR). As a result, we were able to identify candidate genes for recent selection, such as the CCR3 gene, which is related to lumbar paralysis resistance. Combined with future studies and recent goat genome information, this study will contribute to a thorough understanding of the native Korean goat genome. PMID:27989103
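
    Nucleotide diversity, the statistic compared between the two populations above, is commonly defined as the average number of pairwise differences per site. The following Python sketch computes it for a set of aligned sequences; the toy sequences are invented, and the abstract does not state which software or estimator the authors used.

```python
from itertools import combinations

def nucleotide_diversity(sequences):
    """Average proportion of differing sites over all pairs of aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    total_differences = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return total_differences / (len(pairs) * length)

# Hypothetical aligned haplotypes for two populations.
low_diversity_population = ["AAAA", "AAAA", "AAAT"]
high_diversity_population = ["AAAA", "AAAT", "AATT"]
pi_low = nucleotide_diversity(low_diversity_population)
pi_high = nucleotide_diversity(high_diversity_population)
```

    A lower value in one population, as reported for the native Korean goats, indicates that its sampled genomes differ from each other at fewer sites.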

  18. Validation and Utilization of a Clinical Next-Generation Sequencing Panel for Selected Cardiovascular Disorders

    PubMed Central

    Celestino-Soper, Patrícia B. S.; Gao, Hongyu; Lynnes, Ty C.; Lin, Hai; Liu, Yunlong; Spoonamore, Katherine G.; Chen, Peng-Sheng; Vatta, Matteo

    2017-01-01

    The development of high-throughput technologies such as next-generation sequencing (NGS) has allowed for thousands of DNA loci to be interrogated simultaneously in a fast and economical method for the detection of clinically deleterious variants. Whenever a clinical diagnosis is known, a targeted NGS approach involving the use of disease-specific gene panels can be employed. This approach is often valuable as it allows for a more specific and clinically relevant interpretation of results. Here, we describe the customization, validation, and utilization of a commercially available targeted enrichment platform for the scalability of clinical diagnostic cardiovascular genetic tests, including the design of the gene panels, the technical parameters for the quality assurance and quality control, the customization of the bioinformatics pipeline, and the post-bioinformatics analysis procedures. Regions of poor base coverage were detected and targeted by Sanger sequencing as needed. All panels were successfully validated using genotype-known DNA samples either commercially available or from research subjects previously tested in outside clinical laboratories. In our experience, utilizing several of the sub-panels in a clinical setting with 33 real-life cardiovascular patients, we found that 20% of tests requested were reported to have at least one pathogenic or likely pathogenic variant that could explain the patient phenotype. For each of these patients, the positive results may aid the clinical team and the patients in best developing a disease management plan and in identifying relatives at risk. PMID:28361054
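
    Flagging regions of poor base coverage for follow-up, as described above, amounts to finding contiguous runs of positions below a depth cutoff. The Python sketch below illustrates this; the cutoff of 20 reads and the depth values are hypothetical, since the abstract does not give the laboratory's actual threshold.

```python
def low_coverage_regions(depths, min_depth=20):
    """Return half-open (start, end) index intervals where depth < min_depth.

    `depths` is a per-base list of read depths across a target region.
    """
    regions = []
    start = None
    for index, depth in enumerate(depths):
        if depth < min_depth:
            if start is None:
                start = index  # a low-coverage run begins here
        elif start is not None:
            regions.append((start, index))  # the run ended at the previous base
            start = None
    if start is not None:
        regions.append((start, len(depths)))  # run extends to the end of the target
    return regions

# Hypothetical per-base depths for a short target.
example_depths = [30, 25, 10, 5, 40, 50, 12]
gaps = low_coverage_regions(example_depths)
```

    Each returned interval marks a stretch that would be a candidate for fill-in by Sanger sequencing.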

  19. Detecting Positive Selection of Korean Native Goat Populations Using Next-Generation Sequencing.

    PubMed

    Lee, Wonseok; Ahn, Sojin; Taye, Mengistie; Sung, Samsun; Lee, Hyun-Jeong; Cho, Seoae; Kim, Heebal

    2016-12-01

    Goats (Capra hircus) are one of the oldest species of domesticated animals. Native Korean goats are a particularly interesting group, as they are indigenous to the area and were raised in the Korean peninsula almost 2,000 years ago. Although they have a small body size and produce low volumes of milk and meat, they are quite resistant to lumbar paralysis. Our study aimed to reveal the distinct genetic features and patterns of selection in native Korean goats by comparing the genomes of native Korean goat and crossbred goat populations. We sequenced the whole genome of 15 native Korean goats and 11 crossbred goats using next-generation sequencing (Illumina platform) to compare the genomes of the two populations. We found decreased nucleotide diversity in the native Korean goats compared to the crossbred goats. Genetic structural analysis demonstrated that the native Korean goat and crossbred goat populations shared a common ancestry, but were clearly distinct. Finally, to reveal the native Korean goat's selective sweep region, selective sweep signals were identified in the native Korean goat genome using cross-population extended haplotype homozygosity (XP-EHH) and a cross-population composite likelihood ratio test (XP-CLR). As a result, we were able to identify candidate genes for recent selection, such as the CCR3 gene, which is related to lumbar paralysis resistance. Combined with future studies and recent goat genome information, this study will contribute to a thorough understanding of the native Korean goat genome.

  20. Next generation sequencing technologies in cancer diagnostics and therapeutics: A mini review.

    PubMed

    Li, W; Zhao, K; Kirberger, M; Liao, W; Yan, Y

    2015-10-30

    The development of advanced molecular technologies has ushered in the era of 'omics' science, including transcriptomics, proteomics, and genomics. Genomics, or the whole-genome approach, has become the most comprehensive investigative method to identify new gene mutations, signal pathways and drug targets for cancers. The purpose of this review is to summarize current second generation sequencing techniques in applied genomics, and to analyze the advantages and/or problems associated with each of the various sequencing platforms. Our understanding of molecular factors associated with tumorigenesis is no longer limited to the mutation of well-known cancer-related genes, but may involve a broader range of factors involved in tumor development, including novel somatic mutations, gene fusions, long non-coding RNAs, microRNAs, copy number variations, methylation, and genomic structural variations. Furthermore, these new methods are not limited to analyses of a single genetic or epigenetic factor, but offer comprehensive molecular profiling as a more critical and powerful approach to decoding the mystery of tumor development and identifying more reliable cancer biomarkers.

  1. Coupled high-throughput functional screening and next generation sequencing for identification of plant polymer decomposing enzymes in metagenomic libraries.

    PubMed

    Nyyssönen, Mari; Tran, Huu M; Karaoz, Ulas; Weihe, Claudia; Hadi, Masood Z; Martiny, Jennifer B H; Martiny, Adam C; Brodie, Eoin L

    2013-01-01

    Recent advances in sequencing technologies generate new predictions and hypotheses about the functional roles of environmental microorganisms. Yet, until we can test these predictions at a scale that matches our ability to generate them, most of them will remain as hypotheses. Function-based mining of metagenomic libraries can provide direct linkages between genes, metabolic traits and microbial taxa and thus bridge this gap between sequence data generation and functional predictions. Here we developed high-throughput screening assays for function-based characterization of activities involved in plant polymer decomposition from environmental metagenomic libraries. The multiplexed assays use fluorogenic and chromogenic substrates, combine automated liquid handling and use a genetically modified expression host to enable simultaneous screening of 12,160 clones for 14 activities in a total of 170,240 reactions. Using this platform we identified 374 (0.26%) cellulose, hemicellulose, chitin, starch, phosphate and protein hydrolyzing clones from fosmid libraries prepared from decomposing leaf litter. Sequencing on the Illumina MiSeq platform, followed by assembly and gene prediction of a subset of 95 fosmid clones, identified a broad range of bacterial phyla, including Actinobacteria, Bacteroidetes, multiple Proteobacteria sub-phyla in addition to some Fungi. Carbohydrate-active enzyme genes from 20 different glycoside hydrolase (GH) families were detected. Using tetranucleotide frequency (TNF) binning of fosmid sequences, multiple enzyme activities from distinct fosmids were linked, demonstrating how biochemically-confirmed functional traits in environmental metagenomes may be attributed to groups of specific organisms. Overall, our results demonstrate how functional screening of metagenomic libraries can be used to connect microbial functionality to community composition and, as a result, complement large-scale metagenomic sequencing efforts.
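
    Tetranucleotide frequency (TNF) binning rests on representing each sequence as a vector of 4-mer frequencies and grouping sequences whose vectors are close. The Python sketch below shows only the vector and distance steps. It is a simplified illustration under stated assumptions: reverse complements are not merged, no clustering is performed, and the function names are hypothetical rather than taken from the authors' pipeline.

```python
from itertools import product
import math

# All 256 possible tetranucleotides, in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]
TETRAMER_INDEX = {kmer: i for i, kmer in enumerate(TETRAMERS)}

def tnf_vector(sequence):
    """Return the normalized frequency of each tetranucleotide in `sequence`.

    Windows containing characters other than A, C, G, T are skipped.
    """
    counts = [0.0] * len(TETRAMERS)
    sequence = sequence.upper()
    for start in range(len(sequence) - 3):
        index = TETRAMER_INDEX.get(sequence[start:start + 4])
        if index is not None:
            counts[index] += 1.0
    total = sum(counts)
    if total == 0:
        return counts
    return [count / total for count in counts]

def euclidean_distance(u, v):
    """Distance between two TNF vectors; smaller means more similar composition."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical short sequence, for illustration only.
vector = tnf_vector("ACGTACGT")
```

    In practice, sequences as long as fosmid inserts give far more stable frequency estimates than the eight-base toy input used here.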

  2. Coupled high-throughput functional screening and next generation sequencing for identification of plant polymer decomposing enzymes in metagenomic libraries

    PubMed Central

    Nyyssönen, Mari; Tran, Huu M.; Karaoz, Ulas; Weihe, Claudia; Hadi, Masood Z.; Martiny, Jennifer B. H.; Martiny, Adam C.; Brodie, Eoin L.

    2013-01-01

    Recent advances in sequencing technologies generate new predictions and hypotheses about the functional roles of environmental microorganisms. Yet, until we can test these predictions at a scale that matches our ability to generate them, most of them will remain as hypotheses. Function-based mining of metagenomic libraries can provide direct linkages between genes, metabolic traits and microbial taxa and thus bridge this gap between sequence data generation and functional predictions. Here we developed high-throughput screening assays for function-based characterization of activities involved in plant polymer decomposition from environmental metagenomic libraries. The multiplexed assays use fluorogenic and chromogenic substrates, combine automated liquid handling and use a genetically modified expression host to enable simultaneous screening of 12,160 clones for 14 activities in a total of 170,240 reactions. Using this platform we identified 374 (0.26%) cellulose, hemicellulose, chitin, starch, phosphate and protein hydrolyzing clones from fosmid libraries prepared from decomposing leaf litter. Sequencing on the Illumina MiSeq platform, followed by assembly and gene prediction of a subset of 95 fosmid clones, identified a broad range of bacterial phyla, including Actinobacteria, Bacteroidetes, multiple Proteobacteria sub-phyla in addition to some Fungi. Carbohydrate-active enzyme genes from 20 different glycoside hydrolase (GH) families were detected. Using tetranucleotide frequency (TNF) binning of fosmid sequences, multiple enzyme activities from distinct fosmids were linked, demonstrating how biochemically-confirmed functional traits in environmental metagenomes may be attributed to groups of specific organisms. Overall, our results demonstrate how functional screening of metagenomic libraries can be used to connect microbial functionality to community composition and, as a result, complement large-scale metagenomic sequencing efforts. PMID:24069019
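
    The tetranucleotide frequency (TNF) comparison mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' binning pipeline: the function names `tnf_profile` and `cosine_similarity` are hypothetical, and the sketch skips refinements that practical binning tools often apply (such as merging reverse complements or normalising against expected frequencies). It only shows how a sequence is reduced to a 256-element 4-mer profile and how two profiles might be compared.

```python
from collections import Counter
from itertools import product
import math

# All 256 possible tetranucleotides, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tnf_profile(seq):
    """Return the relative frequency of each 4-mer in seq (hypothetical helper).

    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def cosine_similarity(a, b):
    """Cosine similarity between two profiles; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

    Under these assumptions, fosmid sequences whose profiles have a high similarity would be candidates for the same bin; the similarity cut-off would need to be chosen empirically.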

  3. A systems approach to designing next generation vaccines: combining α-galactose modified antigens with nanoparticle platforms

    NASA Astrophysics Data System (ADS)

    Phanse, Yashdeep; Carrillo-Conde, Brenda R.; Ramer-Tait, Amanda E.; Broderick, Scott; Kong, Chang Sun; Rajan, Krishna; Flick, Ramon; Mandell, Robert B.; Narasimhan, Balaji; Wannemuehler, Michael J.

    2014-01-01

    Innovative vaccine platforms are needed to develop effective countermeasures against emerging and re-emerging diseases. These platforms should direct antigen internalization by antigen presenting cells and promote immunogenic responses. This work describes an innovative systems approach combining two novel platforms, αGalactose (αGal)-modification of antigens and amphiphilic polyanhydride nanoparticles as vaccine delivery vehicles, to rationally design vaccine formulations. Regimens comprising soluble αGal-modified antigen and nanoparticle-encapsulated unmodified antigen induced a high titer, high avidity antibody response with broader epitope recognition of antigenic peptides than other regimens. Proliferation of antigen-specific CD4+ T cells was also enhanced compared to a traditional adjuvant. Combining the technology platforms and augmenting immune response studies with peptide arrays and informatics analysis provides a new paradigm for rational, systems-based design of next generation vaccine platforms against emerging and re-emerging pathogens.

  4. Generation of BAC-end sequences for rainbow trout genome analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    For non-sequenced genomes, BAC end sequences (BES) provide a valuable sample of repetitive elements and gene content. Here we report the results of BAC end sequencing of just over half of the rainbow trout (Oncorhynchus mykiss) Swanson HindIII library. We sequenced 177,860 BAC ends that generated 17...

  5. Next-Generation Sequencing in the Understanding of Kaposi’s Sarcoma-Associated Herpesvirus (KSHV) Biology

    PubMed Central

    Strahan, Roxanne; Uppal, Timsy; Verma, Subhash C.

    2016-01-01

    Non-Sanger-based novel nucleic acid sequencing techniques, referred to as Next-Generation Sequencing (NGS), provide a rapid, reliable, high-throughput, and massively parallel sequencing methodology that has improved our understanding of human cancers and cancer-related viruses. NGS has become a quintessential research tool for more effective characterization of complex viral and host genomes through its ever-expanding repertoire, which consists of whole-genome sequencing, whole-transcriptome sequencing, and whole-epigenome sequencing. These new NGS platforms provide a comprehensive and systematic genome-wide analysis of genomic sequences and a full transcriptional profile at a single nucleotide resolution. When combined, these techniques help unlock the function of novel genes and the related pathways that contribute to the overall viral pathogenesis. Ongoing research in the field of virology endeavors to identify the role of various underlying mechanisms that control the regulation of the herpesvirus biphasic lifecycle in order to discover potential therapeutic targets and treatment strategies. In this review, we have compiled the most recent findings about the application of NGS in Kaposi’s sarcoma-associated herpesvirus (KSHV) biology, including identification of novel genomic features and whole-genome KSHV diversities, global gene regulatory network profiling for intricate transcriptome analyses, and surveying of epigenetic marks (DNA methylation, modified histones, and chromatin remodelers) during de novo, latent, and productive KSHV infections. PMID:27043613

  6. A practical, bioinformatic workflow system for large data sets generated by next-generation sequencing.

    PubMed

    Cantacessi, Cinzia; Jex, Aaron R; Hall, Ross S; Young, Neil D; Campbell, Bronwyn E; Joachim, Anja; Nolan, Matthew J; Abubucker, Sahar; Sternberg, Paul W; Ranganathan, Shoba; Mitreva, Makedonka; Gasser, Robin B

    2010-09-01

    Transcriptomics (at the level of single cells, tissues and/or whole organisms) underpins many fields of biomedical science, from understanding the basic cellular function in model organisms, to the elucidation of the biological events that govern the development and progression of human diseases, and the exploration of the mechanisms of survival, drug-resistance and virulence of pathogens. Next-generation sequencing (NGS) technologies are contributing to a massive expansion of transcriptomics in all fields and are reducing the cost, time and performance barriers presented by conventional approaches. However, bioinformatic tools for the analysis of the sequence data sets produced by these technologies can be daunting to researchers with limited or no expertise in bioinformatics. Here, we constructed a semi-automated, bioinformatic workflow system, and critically evaluated it for the analysis and annotation of large-scale sequence data sets generated by NGS. We demonstrated its utility for the exploration of differences in the transcriptomes among various stages and both sexes of an economically important parasitic worm (Oesophagostomum dentatum) as well as the prediction and prioritization of essential molecules (including GTPases, protein kinases and phosphatases) as novel drug target candidates. This workflow system provides a practical tool for the assembly, annotation and analysis of NGS data sets, also to researchers with a limited bioinformatic expertise. The custom-written Perl, Python and Unix shell computer scripts used can be readily modified or adapted to suit many different applications. This system is now utilized routinely for the analysis of data sets from pathogens of major socio-economic importance and can, in principle, be applied to transcriptomics data sets from any organism.

  7. A platform for rapid generation of single and multiplexed reporters in human iPSC lines.

    PubMed

    Pei, Ying; Sierra, Guadalupe; Sivapatham, Renuka; Swistowski, Andrzej; Rao, Mahendra S; Zeng, Xianmin

    2015-03-17

    Induced pluripotent stem cells (iPSC) are important tools for drug discovery assays and toxicology screens. In this manuscript, we design high efficiency TALEN and ZFN to target two safe harbor sites on chromosomes 13 and 19 in a widely available and well-characterized integration-free iPSC line. We show that these sites can be targeted in multiple iPSC lines to generate reporter systems while retaining pluripotent characteristics. We extend this concept to making lineage reporters using a C-terminal targeting strategy to endogenous genes that express in a lineage-specific fashion. Furthermore, we demonstrate that we can develop a master cell line strategy and then use a Cre-recombinase induced cassette exchange strategy to rapidly exchange reporter cassettes to develop new reporter lines in the same isogenic background at high efficiency. Equally important, we show that this recombination strategy allows targeting at progenitor cell stages, further increasing the utility of the platform system. The results in concert provide a novel platform for rapidly developing custom single or dual reporter systems for screening assays.

  8. WiseEye: Next Generation Expandable and Programmable Camera Trap Platform for Wildlife Research

    PubMed Central

    Nazir, Sajid; Newey, Scott; Irvine, R. Justin; Verdicchio, Fabio; Davidson, Paul; Fairhurst, Gorry; van der Wal, René

    2017-01-01

    The widespread availability of relatively cheap, reliable and easy to use digital camera traps has led to their extensive use for wildlife research, monitoring and public outreach. Users of these units are, however, often frustrated by the limited options for controlling camera functions, the generation of large numbers of images, and the lack of flexibility to suit different research environments and questions. We describe the development of a user-customisable open source camera trap platform named ‘WiseEye’, designed to provide flexible camera trap technology for wildlife researchers. The novel platform is based on a Raspberry Pi single-board computer and compatible peripherals that allow the user to control its functions and performance. We introduce the concept of confirmatory sensing, in which the Passive Infrared triggering is confirmed through other modalities (i.e. radar, pixel change) to reduce the occurrence of false positive images. This concept, together with user-definable metadata, aided identification of spurious images and greatly reduced post-collection processing time. When tested against a commercial camera trap, WiseEye was found to reduce the incidence of false positive images and false negatives across a range of test conditions. WiseEye represents a step-change in camera trap functionality, greatly increasing the value of this technology for wildlife research and conservation management. PMID:28076444
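
    The confirmatory sensing idea described in this abstract can be sketched as a small decision rule. This is an illustrative assumption, not WiseEye's actual code: the `SensorReading` type, the `is_confirmed` function, the 2% pixel-change threshold, and the choice to accept either radar or pixel change as confirmation are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SensorReading:
    """One trigger event as seen by each modality (hypothetical structure)."""
    pir_triggered: bool           # Passive Infrared sensor fired
    radar_detected: bool          # radar module reports motion
    pixel_change_fraction: float  # fraction of pixels that changed between frames


def is_confirmed(reading, pixel_threshold=0.02):
    """Accept a PIR trigger only if a second modality agrees.

    The threshold and the OR-combination of radar and pixel change
    are assumptions made for this sketch.
    """
    if not reading.pir_triggered:
        return False
    return reading.radar_detected or reading.pixel_change_fraction >= pixel_threshold
```

    A PIR trigger with no radar return and almost no pixel change would be rejected under this rule, which is the kind of event the abstract describes as a false positive image.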

  9. WiseEye: Next Generation Expandable and Programmable Camera Trap Platform for Wildlife Research.

    PubMed

    Nazir, Sajid; Newey, Scott; Irvine, R Justin; Verdicchio, Fabio; Davidson, Paul; Fairhurst, Gorry; Wal, René van der

    2017-01-01

    The widespread availability of relatively cheap, reliable and easy to use digital camera traps has led to their extensive use for wildlife research, monitoring and public outreach. Users of these units are, however, often frustrated by the limited options for controlling camera functions, the generation of large numbers of images, and the lack of flexibility to suit different research environments and questions. We describe the development of a user-customisable open source camera trap platform named 'WiseEye', designed to provide flexible camera trap technology for wildlife researchers. The novel platform is based on a Raspberry Pi single-board computer and compatible peripherals that allow the user to control its functions and performance. We introduce the concept of confirmatory sensing, in which the Passive Infrared triggering is confirmed through other modalities (i.e. radar, pixel change) to reduce the occurrence of false positive images. This concept, together with user-definable metadata, aided identification of spurious images and greatly reduced post-collection processing time. When tested against a commercial camera trap, WiseEye was found to reduce the incidence of false positive images and false negatives across a range of test conditions. WiseEye represents a step-change in camera trap functionality, greatly increasing the value of this technology for wildlife research and conservation management.

  10. On-chip generation and demultiplexing of quantum correlated photons using a silicon-silica monolithic photonic integration platform.

    PubMed

    Matsuda, Nobuyuki; Karkus, Peter; Nishi, Hidetaka; Tsuchizawa, Tai; Munro, William J; Takesue, Hiroki; Yamada, Koji

    2014-09-22

    We demonstrate the generation and demultiplexing of quantum correlated photons on a monolithic photonic chip composed of silicon and silica-based waveguides. Photon pairs generated in a nonlinear silicon waveguide are successfully separated into two optical channels of an arrayed-waveguide grating fabricated on a silica-based waveguide platform.

  11. Next generation sequencing (NGS) technologies and applications

    SciTech Connect

    Vuyisich, Momchilo

    2012-09-11

    NGS technology overview: (1) NGS library preparation - Nucleic acids extraction, Sample quality control, RNA conversion to cDNA, Addition of sequencing adapters, Quality control of library; (2) Sequencing - Clonal amplification of library fragments (except PacBio), Sequencing by synthesis, Data output (reads and quality); and (3) Data analysis - Read mapping, Genome assembly, Gene expression, Operon structure, sRNA discovery, and Epigenetic analyses.

  12. Next-Generation Sequencing Tech Panel (7th Annual SFAF Meeting, 2012)

    SciTech Connect

    Rhodes, Michael; Fiske, Haley; Knight, Jim; Turner, Steve (Pacific Biosciences

    2012-06-01

    Representatives from several next-generation sequencer manufacturers take part in a panel discussion at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  13. Generation of droplets to serpentine threads on a rotating compact-disk platform

    NASA Astrophysics Data System (ADS)

    Kar, Shantimoy; Joshi, Sumit; Chaudhary, Kaustav; Maiti, Tapas Kumar; Chakraborty, Suman

    2015-12-01

    We generate stable monodisperse droplets of nano-liter volumes and long serpentine liquid threads in a single, simple "Y"-shaped microchannel mounted on a rotationally actuated lab-on-a-compact-disk platform. Exploitation of Coriolis force offers versatile modus operandi of the present setup, without involving any design complications. Based on the fundamental understanding and subsequent analysis, we present scaling theories consistent with the experimental observations. We also outline specific applications of this technique, in the biological as well as in the physical domain, including digital polymerase chain reaction (PCR), controlled release of medical components, digital counting of colony forming units, hydrogel engineering, optical sensors and scaffolds for living tissues, to name a few.

  14. Viral population analysis and minority-variant detection using short read next-generation sequencing

    PubMed Central

    Watson, Simon J.; Welkers, Matthijs R. A.; Depledge, Daniel P.; Coulter, Eve; Breuer, Judith M.; de Jong, Menno D.; Kellam, Paul

    2013-01-01

    RNA viruses within infected individuals exist as a population of evolutionary-related variants. Owing to evolutionary change affecting the constitution of this population, the frequency and/or occurrence of individual viral variants can show marked or subtle fluctuations. Since the development of massively parallel sequencing platforms, such viral populations can now be investigated to unprecedented resolution. A critical problem with such analyses is the presence of sequencing-related errors that obscure the identification of true biological variants present at low frequency. Here, we report the development and assessment of the Quality Assessment of Short Read (QUASR) Pipeline (http://sourceforge.net/projects/quasr) specific for virus genome short read analysis that minimizes sequencing errors from multiple deep-sequencing platforms, and enables post-mapping analysis of the minority variants within the viral population. QUASR significantly reduces the error-related noise in deep-sequencing datasets, resulting in increased mapping accuracy and reduction of erroneous mutations. Using QUASR, we have determined influenza virus genome dynamics in sequential samples from an in vitro evolution of 2009 pandemic H1N1 (A/H1N1/09) influenza from samples sequenced on both the Roche 454 GSFLX and Illumina GAIIx platforms. Importantly, concordance between the 454 and Illumina sequencing allowed unambiguous minority-variant detection and accurate determination of virus population turnover in vitro. PMID:23382427
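
    To make the idea of minority-variant detection concrete, the sketch below filters base calls by Phred quality and reports non-consensus bases above a frequency cut-off. It is not the QUASR algorithm: the function name `minority_variants`, the pileup input structure, and every threshold are assumptions chosen only for illustration.

```python
from collections import Counter

VALID_BASES = {"A", "C", "G", "T"}


def minority_variants(pileup, min_quality=20, min_frequency=0.01, min_depth=100):
    """Report non-consensus bases per position (hypothetical helper).

    pileup maps a position to a list of (base, phred_quality) tuples.
    Low-quality calls are discarded before frequencies are computed,
    so sequencing errors contribute less to the reported variants.
    """
    calls = {}
    for pos, observations in pileup.items():
        bases = [
            base.upper()
            for base, quality in observations
            if quality >= min_quality and base.upper() in VALID_BASES
        ]
        depth = len(bases)
        if depth < min_depth:
            continue  # too few reliable calls to estimate frequencies
        counts = Counter(bases)
        consensus = counts.most_common(1)[0][0]
        minor = {
            base: count / depth
            for base, count in counts.items()
            if base != consensus and count / depth >= min_frequency
        }
        if minor:
            calls[pos] = {"consensus": consensus, "depth": depth, "minor": minor}
    return calls
```

    In this sketch a variant seen on two platforms could be cross-checked simply by running the function on each platform's pileup and intersecting the reported positions, which mirrors the concordance argument made in the abstract.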

  15. Viral population analysis and minority-variant detection using short read next-generation sequencing.

    PubMed

    Watson, Simon J; Welkers, Matthijs R A; Depledge, Daniel P; Coulter, Eve; Breuer, Judith M; de Jong, Menno D; Kellam, Paul

    2013-03-19

    RNA viruses within infected individuals exist as a population of evolutionary-related variants. Owing to evolutionary change affecting the constitution of this population, the frequency and/or occurrence of individual viral variants can show marked or subtle fluctuations. Since the development of massively parallel sequencing platforms, such viral populations can now be investigated to unprecedented resolution. A critical problem with such analyses is the presence of sequencing-related errors that obscure the identification of true biological variants present at low frequency. Here, we report the development and assessment of the Quality Assessment of Short Read (QUASR) Pipeline (http://sourceforge.net/projects/quasr) specific for virus genome short read analysis that minimizes sequencing errors from multiple deep-sequencing platforms, and enables post-mapping analysis of the minority variants within the viral population. QUASR significantly reduces the error-related noise in deep-sequencing datasets, resulting in increased mapping accuracy and reduction of erroneous mutations. Using QUASR, we have determined influenza virus genome dynamics in sequential samples from an in vitro evolution of 2009 pandemic H1N1 (A/H1N1/09) influenza from samples sequenced on both the Roche 454 GSFLX and Illumina GAIIx platforms. Importantly, concordance between the 454 and Illumina sequencing allowed unambiguous minority-variant detection and accurate determination of virus population turnover in vitro.

  16. A microfluidic platform for size-dependent generation of droplet interface bilayer networks on rails

    PubMed Central

    Carreras, P.; Elani, Y.; Law, R. V.; Brooks, N. J.; Seddon, J. M.; Ces, O.

    2015-01-01

    Droplet interface bilayer (DIB) networks are emerging as a cornerstone technology for the bottom up construction of cell-like and tissue-like structures and bio-devices. They are an exciting and versatile model-membrane platform, seeing increasing use in the disciplines of synthetic biology, chemical biology, and membrane biophysics. DIBs are formed when lipid-coated water-in-oil droplets are brought together—oil is excluded from the interface, resulting in a bilayer. Perhaps the greatest feature of the DIB platform is the ability to generate bilayer networks by connecting multiple droplets together, which can in turn be used in applications ranging from tissue mimics, multicellular models, and bio-devices. For such applications, the construction and release of DIB networks of defined size and composition on-demand is crucial. We have developed a droplet-based microfluidic method for the generation of different sized DIB networks (300–1500 pl droplets) on-chip. We do this by employing a droplet-on-rails strategy where droplets are guided down designated paths of a chip with the aid of microfabricated grooves or “rails,” and droplets of set sizes are selectively directed to specific rails using auxiliary flows. In this way we can uniquely produce parallel bilayer networks of defined sizes. By trapping several droplets in a rail, extended DIB networks containing up to 20 sequential bilayers could be constructed. The trapped DIB arrays can be composed of different lipid types and can be released on-demand and regenerated within seconds. We show that chemical signals can be propagated across the bio-network by transplanting enzymatic reaction cascades for inter-droplet communication. PMID:26759638

  17. Strategies for achieving high sequencing accuracy for low diversity samples and avoiding sample bleeding using illumina platform.

    PubMed

    Mitra, Abhishek; Skrzypczak, Magdalena; Ginalski, Krzysztof; Rowicka, Maga

    2015-01-01

    Sequencing microRNA, reduced representation sequencing, Hi-C technology and any method requiring the use of in-house barcodes result in sequencing libraries with low initial sequence diversity. Sequencing such data on the Illumina platform typically produces low quality data due to the limitations of the Illumina cluster calling algorithm. Moreover, even in the case of diverse samples, these limitations are causing substantial inaccuracies in multiplexed sample assignment (sample bleeding). Such inaccuracies are unacceptable in clinical applications, and in some other fields (e.g. detection of rare variants). Here, we discuss how both problems with quality of low-diversity samples and sample bleeding are caused by incorrect detection of clusters on the flowcell during initial sequencing cycles. We propose simple software modifications (Long Template Protocol) that overcome this problem. We present experimental results showing that our Long Template Protocol remarkably increases data quality for low diversity samples, as compared with the standard analysis protocol; it also substantially reduces sample bleeding for all samples. For comprehensiveness, we also discuss and compare experimental results from alternative approaches to sequencing low diversity samples. First, we discuss how the low diversity problem, if caused by barcodes, can be avoided altogether at the barcode design stage. Second and third, we present modified guidelines, which are more stringent than the manufacturer's, for mixing low diversity samples with diverse samples and lowering cluster density, which in our experience consistently produces high quality data from low diversity samples. Fourth and fifth, we present rescue strategies that can be applied when sequencing results in low quality data and when there is no more biological material available. In such cases, we propose that the flowcell be re-hybridized and sequenced again using our Long Template Protocol. 
Alternatively, we discuss how

  18. Strategies for Achieving High Sequencing Accuracy for Low Diversity Samples and Avoiding Sample Bleeding Using Illumina Platform

    PubMed Central

    Mitra, Abhishek; Skrzypczak, Magdalena; Ginalski, Krzysztof; Rowicka, Maga

    2015-01-01

    Sequencing microRNA, reduced representation sequencing, Hi-C technology and any method requiring the use of in-house barcodes result in sequencing libraries with low initial sequence diversity. Sequencing such data on the Illumina platform typically produces low quality data due to the limitations of the Illumina cluster calling algorithm. Moreover, even in the case of diverse samples, these limitations are causing substantial inaccuracies in multiplexed sample assignment (sample bleeding). Such inaccuracies are unacceptable in clinical applications, and in some other fields (e.g. detection of rare variants). Here, we discuss how both problems with quality of low-diversity samples and sample bleeding are caused by incorrect detection of clusters on the flowcell during initial sequencing cycles. We propose simple software modifications (Long Template Protocol) that overcome this problem. We present experimental results showing that our Long Template Protocol remarkably increases data quality for low diversity samples, as compared with the standard analysis protocol; it also substantially reduces sample bleeding for all samples. For comprehensiveness, we also discuss and compare experimental results from alternative approaches to sequencing low diversity samples. First, we discuss how the low diversity problem, if caused by barcodes, can be avoided altogether at the barcode design stage. Second and third, we present modified guidelines, which are more stringent than the manufacturer’s, for mixing low diversity samples with diverse samples and lowering cluster density, which in our experience consistently produces high quality data from low diversity samples. Fourth and fifth, we present rescue strategies that can be applied when sequencing results in low quality data and when there is no more biological material available. In such cases, we propose that the flowcell be re-hybridized and sequenced again using our Long Template Protocol. 
Alternatively, we discuss how

  19. Generation and Characterization of an IgG4 Monomeric Fc Platform

    PubMed Central

    Shan, Lu; Colazet, Magali; Rosenthal, Kim L.; Yu, Xiang-Qing; Bee, Jared S.; Ferguson, Andrew; Damschroder, Melissa M.; Wu, Herren; Dall’Acqua, William F.; Tsui, Ping

    2016-01-01

    The immunoglobulin Fc region is a homodimer consisting of two sets of CH2 and CH3 domains and has been exploited to generate two-arm protein fusions with high expression yields, simplified purification processes and extended serum half-life. However, attempts to generate one-arm fusion proteins with monomeric Fc, with one set of CH2 and CH3 domains, are often plagued with challenges such as weakened binding to FcRn or partial monomer formation. Here, we demonstrate the generation of a stable IgG4 Fc monomer with a unique combination of mutations at the CH3-CH3 interface using rational design combined with in vitro evolution methodologies. In addition to size-exclusion chromatography and analytical ultracentrifugation, we used multi-angle light scattering (MALS) to show that the engineered Fc monomer exhibits excellent monodispersity. Furthermore, crystal structure analysis (PDB ID: 5HVW) reveals monomeric properties supported by disrupted interactions at the CH3-CH3 interface. Monomeric Fc fusions with Fab or scFv achieved FcRn binding and serum half-life comparable to wildtype IgG. These results demonstrate that this monomeric IgG4 Fc is a promising therapeutic platform to extend the serum half-life of proteins in a monovalent format. PMID:27479095

  20. Next-generation sequencing technologies and their impact on microbial genomics.

    PubMed

    Forde, Brian M; O'Toole, Paul W

    2013-09-01

    Next-generation sequencing technologies have had a dramatic impact in the field of genomic research through the provision of a low cost, high-throughput alternative to traditional capillary sequencers. These new sequencing methods have surpassed their original scope and now provide a range of utility-based applications, which allow for a more comprehensive analysis of the structure and content of microbial genomes than was previously possible. With the commercialization of a third generation of sequencing technologies imminent, we discuss the applications of current next-generation sequencing methods and explore their impact on and contribution to microbial genome research.

  1. DNApod: DNA polymorphism annotation database from next-generation sequence read archives.

    PubMed

    Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

    2017-01-01

    With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information.

  2. DNApod: DNA polymorphism annotation database from next-generation sequence read archives

    PubMed Central

    Mochizuki, Takako; Tanizawa, Yasuhiro; Fujisawa, Takatomo; Ohta, Tazro; Nikoh, Naruo; Shimizu, Tokurou; Toyoda, Atsushi; Fujiyama, Asao; Kurata, Nori; Nagasaki, Hideki; Kaminuma, Eli; Nakamura, Yasukazu

    2017-01-01

    With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information. PMID:28234924

  3. Recent Patents and Advances in the Next-Generation Sequencing Technologies

    PubMed Central

    Lin, Biaoyang; Wang, Jun; Cheng, Yin

    2010-01-01

    We are now witnessing a new genomic revolution due to the arrival and continued advancements in the next-generation high-throughput sequencing technologies, which encompass sequencing by synthesis including fluorescent in situ sequencing (FISSEQ) and pyrosequencing, sequencing by ligation including using polony amplification and supported oligonucleotide detection (SOLiD), sequencing by hybridization in combination with sequencing-by-ligation and nanopore technology, nanopore sequencing and other novel sequencing technologies using nano-transistor array, scanning tunneling microscopy and nanowire molecule sensors etc. We review here major technologies and recent patents for achieving high-throughput, ultra-fast, extremely cheap, and highly accurate sequencing. We will see enormous impacts of these next-generation sequencing methods for solving complex biological problems and for ushering in the practice of personalized medicine. PMID:21709726

  4. The impact of RNA secondary structure on read start locations on the Illumina sequencing platform.

    PubMed

    Price, Adam; Garhyan, Jaishree; Gibas, Cynthia

    2017-01-01

    High-throughput sequencing is subject to sequence-dependent bias, which must be accounted for if researchers are to make precise measurements and draw accurate conclusions from their data. A widely studied source of bias in sequencing is the GC content bias, in which levels of GC content in a genomic region affect the number of reads produced during sequencing. Although some research has been performed on methods to correct for GC bias, there has been little effort to understand the underlying mechanism. The availability of sequencing protocols that target the specific location of structure in nucleic acid molecules enables us to investigate the underlying molecular origin of observed GC bias in sequencing. By applying a parallel analysis of RNA structure (PARS) protocol to bacterial genomes of varying GC content, we are able to observe the relationship between local RNA secondary structure and sequencing outcome, and to establish RNA secondary structure as the significant contributing factor to observed GC bias.
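
    The GC content referred to in this abstract can be computed per window along a sequence. The following is a minimal sketch, not the authors' method; the function names, window size, and step are illustrative assumptions.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T/U)."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "ATU")
    total = gc + at
    return gc / total if total else 0.0


def windowed_gc(seq: str, window: int, step: int):
    """Yield (start, GC fraction) for each full window along the sequence."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_fraction(seq[start:start + window])


if __name__ == "__main__":
    # Hypothetical toy sequence; real analyses would use genome-scale windows.
    for start, gc in windowed_gc("AAGGCCTT", window=4, step=2):
        print(start, gc)
```

    Per-window GC values of this kind are what one would typically compare against per-window read counts when looking for GC bias.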

  5. The impact of RNA secondary structure on read start locations on the Illumina sequencing platform

    PubMed Central

    Price, Adam; Garhyan, Jaishree

    2017-01-01

    High-throughput sequencing is subject to sequence-dependent bias, which must be accounted for if researchers are to make precise measurements and draw accurate conclusions from their data. A widely studied source of bias in sequencing is the GC content bias, in which levels of GC content in a genomic region affect the number of reads produced during sequencing. Although some research has been performed on methods to correct for GC bias, there has been little effort to understand the underlying mechanism. The availability of sequencing protocols that target the specific location of structure in nucleic acid molecules enables us to investigate the underlying molecular origin of observed GC bias in sequencing. By applying a parallel analysis of RNA structure (PARS) protocol to bacterial genomes of varying GC content, we are able to observe the relationship between local RNA secondary structure and sequencing outcome, and to establish RNA secondary structure as the significant contributing factor to observed GC bias. PMID:28245230

  6. Generation of a Transcriptome in a Model Lepidopteran Pest, Heliothis virescens, Using Multiple Sequencing Strategies for Profiling Midgut Gene Expression

    PubMed Central

    Popham, Holly J. R.; Gould, Fred; Adang, Michael J.; Jurat-Fuentes, Juan Luis

    2015-01-01

    Heliothine pests such as the tobacco budworm, Heliothis virescens (F.), pose a significant threat to production of a variety of crops and ornamental plants and are models for developmental and physiological studies. The efforts to develop new control measures for H. virescens, as well as its use as a relevant biological model, are hampered by a lack of molecular resources. The present work demonstrates the utility of next-generation sequencing technologies for rapid molecular resource generation from this species, which lacks a sequenced genome. In order to amass a de novo transcriptome for this moth, transcript sequences generated from Illumina, Roche 454, and Sanger sequencing platforms were merged into a single de novo transcriptome assembly. This pooling strategy allowed a thorough sampling of transcripts produced under diverse environmental conditions, developmental stages, tissues, and infections with entomopathogens used for biological control, to provide the most complete transcriptome to date for this species. Over 138 million reads from the three platforms were assembled into the final set of 63,648 contigs. Of these, 29,978 had significant BLAST scores indicating orthologous relationships to transcripts of other insect species, with the top-hit species being the monarch butterfly (Danaus plexippus) and silkworm (Bombyx mori). Among identified H. virescens orthologs were immune effectors, signal transduction pathways, olfactory receptors, hormone biosynthetic pathways, peptide hormones and their receptors, digestive enzymes, and insecticide resistance enzymes. As an example, we demonstrate the utility of this transcriptomic resource to study gene expression profiling of larval midguts and detect transcripts of putative Bacillus thuringiensis (Bt) Cry toxin receptors. The substantial molecular resources described in this study will facilitate development of H. virescens as a relevant biological model for functional genomics and for new biological

  7. Burrow-generated false facies and phantom sequences

    SciTech Connect

    Wanless, H.R.; Tagett, M.

    1986-05-01

    Callianassa (=Ophiomorpha) and other burrowers deeply rework shallow marine sequences. Through in-situ reworking, they create false sedimentary facies and stratigraphic sequences. Callianassa's key to effectiveness is that it expels sand and mud from burrow excavations but concentrates coarse material at the base of the burrow complex. Coarse material can be derived by falling into the burrow entrance, by reworking the existing sediment sequence, or by a combination of both. Examples come from shallow marine carbonate environments of south Florida and the Turks and Caicos Islands, British West Indies. Many mudbanks in south Florida are formed as stacks of layered mudstone units 20-100 cm thick. Between events, seagrasses may recolonize, and a burrowing benthic community may repopulate the substrate. The layered mudstone beneath older areas of mudbank flats can gradually be converted to a bioturbated skeletal wackestone by the deep burrowing community. Burrowing also causes mixing of faunal assemblages. On Caicos Bank, an extensive carbonate tidal flat (3-4 m thick) is slowly being transgressed. About 1 m of tidal-flat sequence is eroded at the shoreline. The remaining 2-3 m could be preserved as part of the transgressive sequence. Callianassa burrowing, however, quickly reworks the sequence, replacing tidal-flat sands and muds with marine peloidal and skeletal sediment. Within 100 m of the shoreline, the only evidence of the tidal-flat sequence is a concentration of high-spired gastropods in Callianassa burrows at the base of the Holocene sequence and a few patches of tidal-flat sediment that burrowers missed. What looks like a basal transgressive lag is in fact a biogenic concentrate from in-situ reworking of a now phantom sequence.

  8. CAPRG: sequence assembling pipeline for next generation sequencing of non-model organisms.

    PubMed

    Rawat, Arun; Elasri, Mohamed O; Gust, Kurt A; George, Glover; Pham, Don; Scanlan, Leona D; Vulpe, Chris; Perkins, Edward J

    2012-01-01

    Our goal is to introduce and describe the utility of a new pipeline, "Contigs Assembly Pipeline using Reference Genome" (CAPRG), which has been developed to assemble "long sequence reads" for non-model organisms by leveraging a reference genome of a closely related phylogenetic relative. To facilitate this effort, we utilized two avian transcriptomic datasets generated using ROCHE/454 technology as test cases for CAPRG assembly. We compared the results of CAPRG assembly using a reference genome with the results of existing methods that utilize de novo strategies such as VELVET, PAVE, and MIRA by employing parameter space comparisons (intra-assembling comparison). CAPRG performed as well as or better than the existing assembly methods based on various benchmarks for "gene-hunting." Further, CAPRG completed the assemblies in a fraction of the time required by the existing assembly algorithms. Additional advantages of CAPRG included reduced contig inflation resulting in lower computational resources for annotation, and functional identification for contigs that may be categorized as "unknowns" by de novo methods. In addition to providing evaluation of CAPRG performance, we observed that the different assembly (inter-assembly) results could be integrated to enhance the putative gene coverage for any transcriptomics study.

  9. Building a geological reference platform using sequence stratigraphy combined with geostatistical tools

    NASA Astrophysics Data System (ADS)

    Bourgine, Bernard; Lasseur, Éric; Leynet, Aurélien; Badinier, Guillaume; Ortega, Carole; Issautier, Benoit; Bouchet, Valentin

    2015-04-01

    In 2012 BRGM launched an extensive program to build the new French Geological Reference platform (RGF). Among the objectives of this program is to provide the public with validated, reliable and 3D-consistent geological data, with estimation of uncertainty. Approx. 100,000 boreholes over the whole French national territory provide a preliminary interpretation in terms of depths of main geological interfaces, but with an unchecked, unknown and often low reliability. The aim of this paper is to present the procedure that has been tested on two areas in France, in order to validate (or not) these boreholes, with the aim of being generalized as much as possible to the nearly 100,000 boreholes waiting for validation. The approach is based on the following steps, and includes the management of uncertainty at different steps: (a) Selection of a loose network of boreholes with logging or coring information that enables a reliable interpretation. This first interpretation is based on the correlation of well log data and allows a 3D sequence stratigraphic framework identifying isochronous surfaces to be defined. A litho-stratigraphic interpretation is also performed. Let "A" be the collection of all boreholes used for this step (typically 3 % of the total number of holes to be validated) and "B" the other boreholes to validate. (b) Geostatistical analysis of characteristic geological interfaces. The analysis is carried out firstly on the "A" type data (to validate the variogram model), then on the "B" type data and at last on "B" knowing "A". It is based on cross-validation tests and evaluation of the uncertainty associated with each geological interface. In this step, we take into account inequality constraints provided by boreholes that do not intersect all interfaces, as well as the "litho-stratigraphic pile" defining the formations and their relationships (depositing surfaces or erosion). The goal is to identify quickly and semi-automatically potential errors among the data, up to

  10. Complete genome sequence of southern tomato virus identified from China using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Complete genome sequence of a double-stranded RNA (dsRNA) virus, southern tomato virus (STV), on tomatoes in China, was elucidated using small RNAs deep sequencing. The identified STV_CN12 shares 99% sequence identity to other isolates from Mexico, France, Spain, and U.S. This is the first report ...

  11. Identification and Characterization of Microsatellite Loci in Maqui (Aristotelia chilensis [Molina] Stunz) Using Next-Generation Sequencing (NGS)

    PubMed Central

    Bastías, Adriana; Correa, Francisco; Rojas, Pamela; Almada, Rubén; Muñoz, Carlos; Sagredo, Boris

    2016-01-01

    Maqui (Aristotelia chilensis [Molina] Stunz) is a small dioecious tree native to South America with edible fruit characterized by very high antioxidant capacity and anthocyanin content. To preserve maqui as a genetic resource it is essential to study its genetic diversity. However, the complete genome is unknown and only a few gene sequences are available in databases. Simple sequence repeat (SSR) markers, which are neutral, co-dominant, reproducible and highly variable, are desirable to support genetic studies in maqui populations. By means of identification and characterization of microsatellite loci from a maqui genotype, using 454 sequencing technology, we developed a set of SSRs for this species. Obtaining a total of 165,043 shotgun genome sequences, with an average read length of 387 bases, we covered 64 Mb of the maqui genome. Reads were assembled into 4,832 contigs, while 98,546 reads remained as singletons, generating a total of 103,378 consensus genomic sequences. A total of 24,494 SSR maqui markers were identified. Of them, 15,950 SSR maqui markers were classified as perfect. The most common SSR motifs were dinucleotide (31%), followed by tetranucleotide (26%) and trinucleotide motifs (24%). The motif AG/CT (28.4%) was the most abundant, while the motif AC (89 bp) was the largest. Eleven polymorphic SSRs were selected and used to analyze a population of 40 maqui genotypes. Polymorphism information content (PIC) ranged from 0.117 to 0.82, with an average of 0.58. No significant groups were observed in the maqui population, showing a panmictic genetic structure. In addition, we also predicted 11,150 putative genes and 3 microRNAs (miRNAs) in maqui sequences. These results, including partial sequences of genes, some miRNAs and SSR markers from high throughput next generation sequencing (NGS) of maqui genomic DNA, constitute the first platform to undertake genetic and molecular studies of this important species. PMID:27459734
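
    A perfect SSR, in the sense used in this abstract, is an uninterrupted tandem repeat of a short motif. The sketch below shows one way such loci could be located with regular expressions. It is not the tool used in the study; the function name and the minimum repeat counts per motif length are assumptions chosen for illustration.

```python
import re


def find_perfect_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    min_repeats maps motif length to the minimum number of tandem copies.
    The defaults are illustrative assumptions, not values from the abstract.
    """
    min_repeats = min_repeats or {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for unit_len, n in min_repeats.items():
        # Group 2 is the motif; \2{n-1,} requires at least n copies in total.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # Skip motifs that are repeats of a shorter unit (AAAA, AGAG).
            if motif in (motif + motif)[1:-1]:
                continue
            hits.append((match.start(), motif, len(match.group(1)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    toy = "TTT" + "AG" * 7 + "CCGT" + "ATC" * 5 + "G"  # hypothetical sequence
    print(find_perfect_ssrs(toy))
```

    On the toy sequence this reports one dinucleotide and one trinucleotide locus; tallying motifs over all hits would give motif-class proportions of the kind quoted in the abstract.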

  12. A technical platform for generating reproducible expression data from Streptomyces coelicolor batch cultivations.

    PubMed

    Battke, F; Herbig, A; Wentzel, A; Jakobsen, O M; Bonin, M; Hodgson, D A; Wohlleben, W; Ellingsen, T E; Nieselt, K

    2011-01-01

    Streptomyces coelicolor, the model species of the genus Streptomyces, presents a complex life cycle of successive morphological and biochemical changes involving the formation of substrate and aerial mycelium, sporulation and the production of antibiotics. The switch from primary to secondary metabolism can be triggered by nutrient starvation and is of particular interest as some of the secondary metabolites produced by related Streptomycetes are commercially relevant. To understand these events on a molecular basis, a reliable technical platform encompassing reproducible fermentation as well as generation of coherent transcriptomic data is required. Here, we investigate the technical basis of a previous study as reported by Nieselt et al. (BMC Genomics 11:10, 2010) in more detail, based on the same samples and focusing on the validation of the custom-designed microarray as well as on the reproducibility of the data generated from biological replicates. We show that the protocols developed result in highly coherent transcriptomic measurements. Furthermore, we use the data to predict chromosomal gene clusters, extending previously known clusters as well as predicting interesting new clusters with consistent functional annotations.

  13. Beating heart on a chip: a novel microfluidic platform to generate functional 3D cardiac microtissues.

    PubMed

    Marsano, Anna; Conficconi, Chiara; Lemme, Marta; Occhetta, Paola; Gaudiello, Emanuele; Votta, Emiliano; Cerino, Giulia; Redaelli, Alberto; Rasponi, Marco

    2016-02-07

    In the past few years, microfluidic-based technology has developed microscale models recapitulating key physical and biological cues typical of the native myocardium. However, the application of controlled physiological uniaxial cyclic strains on a defined three-dimensional cellular environment is not yet possible. Two-dimensional mechanical stimulation was particularly investigated, neglecting the complex three-dimensional cell-cell and cell-matrix interactions. For this purpose, we developed a heart-on-a-chip platform, which recapitulates the physiologic mechanical environment experienced by cells in the native myocardium. The device includes an array of hanging posts to confine cell-laden gels, and a pneumatic actuation system to induce homogeneous uniaxial cyclic strains to the 3D cell constructs during culture. The device was used to generate mature and highly functional micro-engineered cardiac tissues (μECTs), from both neonatal rat and human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CM), strongly suggesting the robustness of our engineered cardiac micro-niche. Our results demonstrated that the cyclic strain was effectively highly uniaxial and uniformly transferred to cells in culture. As compared to control, stimulated μECTs showed superior cardiac differentiation, as well as electrical and mechanical coupling, owing to a remarkable increase in junction complexes. Mechanical stimulation also promoted early spontaneous synchronous beating and better contractile capability in response to electric pacing. Pacing analyses of hiPSC-CM constructs upon controlled administration of isoprenaline showed further promising applications of our platform in drug discovery, delivery and toxicology fields. The proposed heart-on-a-chip device represents a relevant step forward in the field, providing a standard functional three-dimensional cardiac model to possibly predict signs of hypertrophic changes in cardiac phenotype by mechanical and biochemical co-stimulation.

  14. A Framework for the Generation and Dissemination of Drop Size Distribution (DSD) Characteristics Using Multiple Platforms

    NASA Technical Reports Server (NTRS)

    Wolf, David B.; Tokay, Ali; Petersen, Walt; Williams, Christopher; Gatlin, Patrick; Wingo, Mathew

    2010-01-01

    Proper characterization of the precipitation drop size distribution (DSD) is integral to providing realistic and accurate space- and ground-based precipitation retrievals. Current technology allows for the development of DSD products from a variety of platforms, including disdrometers, vertical profilers and dual-polarization radars. Up to now, however, the dissemination or availability of such products has been limited to individual sites and/or field campaigns, in a variety of formats, often using inconsistent algorithms for computing the integral DSD parameters, such as the median- and mass-weighted drop diameter, total number concentration, liquid water content, rain rate, etc. We propose to develop a framework for the generation and dissemination of DSD characteristic products using a unified structure, capable of handling the myriad collection of disdrometers, profilers, and dual-polarization radar data currently available and to be collected during several upcoming GPM Ground Validation field campaigns. This DSD super-structure paradigm is an adaptation of the radar super-structure developed for NASA's Radar Software Library (RSL) and RSL_in_IDL. The goal is to provide the DSD products in a well-documented format, most likely NetCDF, along with tools to ingest and analyze the products. In so doing, we can develop a robust archive of DSD products from multiple sites and platforms, which should greatly benefit the development and validation of precipitation retrieval algorithms for GPM and other precipitation missions. An outline of this proposed framework will be provided as well as a discussion of the algorithms used to calculate the DSD parameters.

  15. Whole genome sequencing of enriched chloroplast DNA using the Illumina GAII platform

    PubMed Central

    2010-01-01

    Background: Complete chloroplast genome sequences provide a valuable source of molecular markers for studies in molecular ecology and evolution of plants. To obtain complete genome sequences, recent studies have made use of the polymerase chain reaction to amplify overlapping fragments from conserved gene loci. However, this approach is time consuming and can be more difficult to implement where gene organisation differs among plants. An alternative approach is to first isolate chloroplasts and then use the capacity of high-throughput sequencing to obtain complete genome sequences. We report our findings from studies of the latter approach, which used a simple chloroplast isolation procedure, multiply-primed rolling circle amplification of chloroplast DNA, Illumina Genome Analyzer II sequencing, and de novo assembly of paired-end sequence reads. Results: A modified rapid chloroplast isolation protocol was used to obtain plant DNA that was enriched for chloroplast DNA, but nevertheless contained nuclear and mitochondrial DNA. Multiply-primed rolling circle amplification of this mixed template produced sufficient quantities of chloroplast DNA, even when the amount of starting material was small, and improved the template quality for Illumina Genome Analyzer II (hereafter Illumina GAII) sequencing. We demonstrate, using independent samples of karaka (Corynocarpus laevigatus), that there is high fidelity in the sequence obtained from this template. Although less than 20% of our sequenced reads could be mapped to the chloroplast genome, it was relatively easy to assemble complete chloroplast genome sequences from the mixture of nuclear, mitochondrial and chloroplast reads. Conclusions: We report successful whole genome sequencing of chloroplast DNA from karaka, obtained efficiently and with high fidelity. PMID:20920211

  16. Using cellular automata to generate image representation for biological sequences.

    PubMed

    Xiao, X; Shao, S; Ding, Y; Huang, Z; Chen, X; Chou, K-C

    2005-02-01

    A novel approach to visualize biological sequences is developed based on cellular automata (Wolfram, S. Nature 1984, 311, 419-424), a set of discrete dynamical systems in which space and time are discrete. By transforming the symbolic sequence codes into the digital codes, and using some optimal space-time evolvement rules of cellular automata, a biological sequence can be represented by a unique image, the so-called cellular automata image. Many important features, which are originally hidden in a long and complicated biological sequence, can be clearly revealed through its cellular automata image. With biological sequences entering into databanks rapidly increasing in the post-genomic era, it is anticipated that the cellular automata image will become a very useful vehicle for investigation into their key features, identification of their function, as well as revelation of their "fingerprint". It is anticipated that by using the concept of the pseudo amino acid composition (Chou, K.C. Proteins: Structure, Function, and Genetics, 2001, 43, 246-255), the cellular automata image approach can also be used to improve the quality of predicting protein attributes, such as structural class and subcellular location.
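
    The two steps named in this abstract (digital encoding, then evolution under a cellular automaton rule) can be illustrated roughly as follows. This is a sketch under stated assumptions: the 2-bit nucleotide encoding, the periodic boundary, and the default rule number are placeholders, since the abstract specifies none of them.

```python
def encode(seq, codes=None):
    """Turn a nucleotide string into a list of bits (hypothetical 2-bit code)."""
    codes = codes or {"A": "00", "C": "01", "G": "10", "T": "11"}
    return [int(bit) for base in seq.upper() for bit in codes[base]]


def step(cells, rule):
    """Apply one update of an elementary (radius-1, binary) cellular automaton.

    Uses a periodic boundary: the first and last cells are neighbours.
    """
    n = len(cells)
    out = []
    for i in range(n):
        neighbourhood = (cells[i - 1] << 2) | (cells[i] << 1) | cells[(i + 1) % n]
        out.append((rule >> neighbourhood) & 1)
    return out


def ca_image(seq, rule=84, steps=8):
    """Return a list of rows; row 0 is the encoded sequence, later rows its evolution."""
    rows = [encode(seq)]
    for _ in range(steps):
        rows.append(step(rows[-1], rule))
    return rows


if __name__ == "__main__":
    for row in ca_image("ACGTTGCA", steps=6):
        print("".join("#" if cell else "." for cell in row))
```

    Each row can be drawn as a line of black and white pixels, so the stacked rows form an image whose texture depends on the input sequence.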

  17. GOblet: a platform for Gene Ontology annotation of anonymous sequence data

    PubMed Central

    Groth, Detlef; Lehrach, Hans; Hennig, Steffen

    2004-01-01

    GOblet is a comprehensive web server application providing the annotation of anonymous sequence data with Gene Ontology (GO) terms. It uses a variety of different protein databases (human, murines, invertebrates, plants, sp-trembl) and their respective GO mappings. The user selects the appropriate database and alignment threshold and thereafter submits single or multiple nucleotide or protein sequences. Results are shown in different ways, e.g. as survey statistics for the main GO categories for all sequences or as detailed results for each single sequence that has been submitted. In its newest version, GOblet allows the batch submission of sequences and provides an improved display of results with the aid of Java applets. All output data, together with the Java applet, are packed to a downloadable archive for local installation and analysis. GOblet can be accessed freely at http://goblet.molgen.mpg.de. PMID:15215401

  18. Certified DNA Reference Materials to Compare HER2 Gene Amplification Measurements Using Next-Generation Sequencing Methods.

    PubMed

    Lih, Chih-Jian; Si, Han; Das, Biswajit; Harrington, Robin D; Harper, Kneshay N; Sims, David J; McGregor, Paul M; Camalier, Corinne E; Kayserian, Andrew Y; Williams, P Mickey; He, Hua-Jun; Almeida, Jamie L; Lund, Steve P; Choquette, Steve; Cole, Kenneth D

    2016-09-01

    The National Institute of Standards and Technology (NIST) Standard Reference Materials 2373 is a set of genomic DNA samples prepared from five breast cancer cell lines with certified values for the ratio of the HER2 gene copy number to the copy numbers of reference genes determined by real-time quantitative PCR and digital PCR. Targeted-amplicon, whole-exome, and whole-genome sequencing measurements were used with the reference material to compare the performance of both the laboratory steps and the bioinformatic approaches of the different methods using a range of amplification ratios. Although good reproducibility was observed in each next-generation sequencing method, slightly different HER2 copy numbers associated with platform-specific biases were obtained. This study clearly demonstrates the value of Standard Reference Materials 2373 as reference material and as a calibrator for evaluating assay performance as well as for increasing confidence in reporting HER2 amplification for clinical applications.

  19. Applications and Case Studies of the Next-Generation Sequencing Technologies in Food, Nutrition and Agriculture.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next-generation sequencing technologies are able to produce high-throughput short sequence reads in a cost-effective fashion. The emergence of these technologies has not only facilitated genome sequencing but also changed the landscape of life sciences. Here I survey their major applications ranging...

  20. Analysis of Pre-Analytic Factors Affecting the Success of Clinical Next-Generation Sequencing of Solid Organ Malignancies.

    PubMed

    Chen, Hui; Luthra, Rajyalakshmi; Goswami, Rashmi S; Singh, Rajesh R; Roy-Chowdhuri, Sinchita

    2015-08-28

    Application of next-generation sequencing (NGS) technology to routine clinical practice has enabled characterization of personalized cancer genomes to identify patients likely to have a response to targeted therapy. The proper selection of tumor sample for downstream NGS-based mutational analysis is critical to generate accurate results and to guide therapeutic intervention. However, multiple pre-analytic factors come into play in determining the success of NGS testing. In this review, we discuss pre-analytic requirements for AmpliSeq PCR-based sequencing using Ion Torrent Personal Genome Machine (PGM) (Life Technologies), an NGS platform that is often used by clinical laboratories for sequencing solid tumors because of its low input DNA requirement from formalin-fixed and paraffin-embedded tissue. The success of NGS mutational analysis is affected not only by the input DNA quantity but also by several other factors, including the specimen type, the DNA quality, and the tumor cellularity. Here, we review tissue requirements for solid tumor NGS-based mutational analysis, including procedure types, tissue types, tumor volume and fraction, decalcification, and treatment effects.

  1. Big data challenges in bone research: genome-wide association studies and next-generation sequencing

    PubMed Central

    Alonso, Nerea; Lucas, Gavin; Hysi, Pirro

    2015-01-01

    Genome-wide association studies (GWAS) have been developed as a practical method to identify genetic loci associated with disease by scanning multiple markers across the genome. Significant advances in the genetics of complex diseases have been made owing to advances in genotyping technologies, the progress of projects such as HapMap and 1000G and the emergence of genetics as a collaborative discipline. Because of its great potential to be used in parallel by multiple collaborators, it is important to adhere to strict protocols assuring data quality and analyses. Quality control analyses must be applied to each sample and each single-nucleotide polymorphism (SNP). The software package PLINK is capable of performing the whole range of necessary quality control tests. Genotype imputation has also been developed to substantially increase the power of GWAS methodology. Imputation permits the investigation of associations at genetic markers that are not directly genotyped. Results of individual GWAS reports can be combined through meta-analysis. Finally, next-generation sequencing (NGS) has gained popularity in recent years through its capacity to analyse a much greater number of markers across the genome. Although NGS platforms are capable of examining a higher number of SNPs compared with GWA studies, the results obtained by NGS require careful interpretation, as their biological correlation is incompletely understood. In this article, we will discuss the basic features of such protocols. PMID:25709812
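
    The per-SNP quality control mentioned in this abstract commonly includes a call-rate filter and a minor allele frequency (MAF) filter. The sketch below illustrates both on a single SNP; it does not use or reproduce PLINK, and the genotype coding (0, 1, 2 copies of the alternate allele, None for missing) and the thresholds are assumptions for illustration.

```python
def snp_qc(genotypes):
    """Return (call_rate, minor_allele_frequency) for one SNP.

    genotypes: list of 0/1/2 alternate-allele counts, with None for missing calls.
    """
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes) if genotypes else 0.0
    if not called:
        return call_rate, 0.0
    alt_freq = sum(called) / (2 * len(called))  # two alleles per diploid sample
    return call_rate, min(alt_freq, 1 - alt_freq)


def passes_qc(genotypes, min_call_rate=0.95, min_maf=0.01):
    """Apply illustrative thresholds; real studies choose their own cut-offs."""
    call_rate, maf = snp_qc(genotypes)
    return call_rate >= min_call_rate and maf >= min_maf


if __name__ == "__main__":
    print(snp_qc([0, 1, 2, None]))   # one missing call out of four
    print(passes_qc([0, 1, 1, 2]))   # fully called, common variant
```

    The same two statistics computed per sample rather than per SNP give the sample-level checks the abstract also calls for.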

  2. Vy-PER: eliminating false positive detection of virus integration events in next generation sequencing data.

    PubMed

    Forster, Michael; Szymczak, Silke; Ellinghaus, David; Hemmrich, Georg; Rühlemann, Malte; Kraemer, Lars; Mucha, Sören; Wienbrandt, Lars; Stanulla, Martin; Franke, Andre

    2015-07-13

    Several pathogenic viruses such as hepatitis B and human immunodeficiency viruses may integrate into the host genome. These virus/host integrations are detectable using paired-end next generation sequencing. However, the low number of expected true virus integrations may be difficult to distinguish from the noise of many false positive candidates. Here, we propose a novel filtering approach that increases specificity without compromising sensitivity for virus/host chimera detection. Our detection pipeline termed Vy-PER (Virus integration detection bY Paired End Reads) outperforms existing similar tools in speed and accuracy. We analysed whole genome data from childhood acute lymphoblastic leukemia (ALL), which is characterised by genomic rearrangements and usually associated with radiation exposure. This analysis was motivated by the recently reported virus integrations at genomic rearrangement sites and association with chromosomal instability in liver cancer. However, as expected, our analysis of 20 tumour and matched germline genomes from ALL patients finds no significant evidence for integrations by known viruses. Nevertheless, our method eliminates 12,800 false positives per genome (80× coverage) and only our method detects singleton human-phiX174-chimeras caused by optical errors of the Illumina HiSeq platform. This high accuracy is useful for detecting low virus integration levels as well as non-integrated viruses.
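
    The basic paired-end signal behind virus/host chimera detection is a read pair in which one mate aligns to the host genome and the other to a viral genome. The sketch below counts such pairs per host/virus combination. It is not the Vy-PER pipeline or its filtering approach, which the abstract does not detail; the input format, the "chr" prefix convention for host references, and the support threshold are all assumptions.

```python
from collections import Counter


def chimera_candidates(pairs, human_prefix="chr", min_support=2):
    """Count read pairs whose mates map to one human and one non-human reference.

    pairs: iterable of (mate1_ref, mate2_ref); None marks an unmapped mate.
    Returns {(human_ref, viral_ref): pair_count} for combinations seen at
    least min_support times.
    """
    support = Counter()
    for refs in pairs:
        if None in refs:
            continue  # both mates must be mapped
        human = [r for r in refs if r.startswith(human_prefix)]
        viral = [r for r in refs if not r.startswith(human_prefix)]
        if len(human) == 1 and len(viral) == 1:
            support[(human[0], viral[0])] += 1
    return {key: n for key, n in support.items() if n >= min_support}


if __name__ == "__main__":
    toy_pairs = [  # hypothetical alignments
        ("chr1", "HBV"), ("HBV", "chr1"), ("chr2", "chr2"),
        ("chr3", "phiX174"), (None, "HBV"),
    ]
    print(chimera_candidates(toy_pairs))
```

    Lowering min_support to 1 would also report singleton pairs such as the human-phiX174 chimeras the abstract mentions, at the cost of many more false positive candidates.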

  3. CDH1 mutations in gastric cancer patients from northern Brazil identified by Next- Generation Sequencing (NGS).

    PubMed

    El-Husny, Antonette; Raiol-Moraes, Milene; Amador, Marcos; Ribeiro-Dos-Santos, André M; Montagnini, André; Barbosa, Silvanira; Silva, Artur; Assumpção, Paulo; Ishak, Geraldo; Santos, Sidney; Pinto, Pablo; Cruz, Aline; Ribeiro-Dos-Santos, Ândrea

    2016-05-13

    Gastric cancer is considered to be the fifth highest incident tumor worldwide and the third leading cause of cancer deaths. Developing regions report a higher number of sporadic cases, but there are only a few local studies related to hereditary cases of gastric cancer in Brazil to confirm this fact. CDH1 germline mutations have been described both in familial and sporadic cases, but there is only one recent molecular description of individuals from Brazil. In this study we performed Next Generation Sequencing (NGS) to assess CDH1 germline mutations in individuals who match the clinical criteria for Hereditary Diffuse Gastric Cancer (HDGC), or who exhibit very early diagnosis of gastric cancer. Among five probands we detected CDH1 germline mutations in two cases (40%). The mutation c.1023T > G was found in a HDGC family and the mutation c.1849G > A, which is nearly exclusive to African populations, was found in an early-onset case of gastric adenocarcinoma. The mutations described highlight the existence of gastric cancer cases caused by CDH1 germline mutations in northern Brazil, although such information is frequently ignored due to the existence of a large number of environmental factors locally. Our report represents the first CDH1 mutations in HDGC described from Brazil by an NGS platform.

  4. CDH1 mutations in gastric cancer patients from northern Brazil identified by Next-Generation Sequencing (NGS)

    PubMed Central

    El-Husny, Antonette; Raiol-Moraes, Milene; Amador, Marcos; Ribeiro-dos-Santos, André M.; Montagnini, André; Barbosa, Silvanira; Silva, Artur; Assumpção, Paulo; Ishak, Geraldo; Santos, Sidney; Pinto, Pablo; Cruz, Aline; Ribeiro-dos-Santos, Ândrea

    2016-01-01

    Gastric cancer is considered to be the fifth highest incident tumor worldwide and the third leading cause of cancer deaths. Developing regions report a higher number of sporadic cases, but there are only a few local studies related to hereditary cases of gastric cancer in Brazil to confirm this fact. CDH1 germline mutations have been described both in familial and sporadic cases, but there is only one recent molecular description of individuals from Brazil. In this study we performed Next Generation Sequencing (NGS) to assess CDH1 germline mutations in individuals who match the clinical criteria for Hereditary Diffuse Gastric Cancer (HDGC), or who exhibit very early diagnosis of gastric cancer. Among five probands we detected CDH1 germline mutations in two cases (40%). The mutation c.1023T > G was found in an HDGC family and the mutation c.1849G > A, which is nearly exclusive to African populations, was found in an early-onset case of gastric adenocarcinoma. The mutations described highlight the existence of gastric cancer cases caused by CDH1 germline mutations in northern Brazil, although such information is frequently ignored due to the existence of a large number of environmental factors locally. Our report represents the first CDH1 mutations in HDGC described from Brazil by an NGS platform. PMID:27192129

  5. Vy-PER: eliminating false positive detection of virus integration events in next generation sequencing data

    PubMed Central

    Forster, Michael; Szymczak, Silke; Ellinghaus, David; Hemmrich, Georg; Rühlemann, Malte; Kraemer, Lars; Mucha, Sören; Wienbrandt, Lars; Stanulla, Martin; Franke, Andre

    2015-01-01

    Several pathogenic viruses such as hepatitis B and human immunodeficiency viruses may integrate into the host genome. These virus/host integrations are detectable using paired-end next generation sequencing. However, the low number of expected true virus integrations may be difficult to distinguish from the noise of many false positive candidates. Here, we propose a novel filtering approach that increases specificity without compromising sensitivity for virus/host chimera detection. Our detection pipeline termed Vy-PER (Virus integration detection bY Paired End Reads) outperforms existing similar tools in speed and accuracy. We analysed whole genome data from childhood acute lymphoblastic leukemia (ALL), which is characterised by genomic rearrangements and usually associated with radiation exposure. This analysis was motivated by the recently reported virus integrations at genomic rearrangement sites and association with chromosomal instability in liver cancer. However, as expected, our analysis of 20 tumour and matched germline genomes from ALL patients finds no significant evidence for integrations by known viruses. Nevertheless, our method eliminates 12,800 false positives per genome (80× coverage) and only our method detects singleton human-phiX174-chimeras caused by optical errors of the Illumina HiSeq platform. This high accuracy is useful for detecting low virus integration levels as well as non-integrated viruses. PMID:26166306
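
The paired-end idea behind virus/host chimera detection can be illustrated with a minimal Python sketch. This is not the Vy-PER pipeline or its filtering step; it only shows the first, naive selection of candidate pairs in which one mate aligns to a host contig and the other to a virus reference. The function name, the tuple layout, and the reference names (`chr1`, `HBV`, `phiX174`) are hypothetical.

```python
def candidate_chimeric_pairs(pairs, virus_refs):
    """Return read pairs in which exactly one mate aligns to a virus reference.

    pairs      -- iterable of (mate1_reference, mate2_reference) names (hypothetical layout)
    virus_refs -- set of reference names treated as viral
    """
    return [
        (mate1, mate2)
        for mate1, mate2 in pairs
        # XOR on membership: one mate viral, the other not
        if (mate1 in virus_refs) != (mate2 in virus_refs)
    ]


# Hypothetical alignments: only the second and fourth pairs span host and virus.
example_pairs = [
    ("chr1", "chr1"),
    ("chr5", "HBV"),
    ("HBV", "HBV"),
    ("phiX174", "chr2"),
]
example_virus_refs = {"HBV", "phiX174"}
candidates = candidate_chimeric_pairs(example_pairs, example_virus_refs)
```

A real pipeline would then have to separate true integrations from artefacts such as the human-phiX174 chimeras mentioned in the abstract, which is the part this sketch deliberately leaves out.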

  6. Autonomously generating operations sequences for a Mars Rover using AI-based planning

    NASA Technical Reports Server (NTRS)

    Sherwood, Rob; Mishkin, Andrew; Estlin, Tara; Chien, Steve; Backes, Paul; Cooper, Brian; Maxwell, Scott; Rabideau, Gregg

    2001-01-01

    This paper discusses a proof-of-concept prototype for ground-based automatic generation of validated rover command sequences from high-level science and engineering activities. This prototype is based on ASPEN, the Automated Scheduling and Planning Environment. This Artificial Intelligence (AI) based planning and scheduling system will automatically generate a command sequence that will execute within resource constraints and satisfy flight rules.

  7. New Generation Sequencing Technology Panel at SFAF-Part I

    SciTech Connect

    Fiske, Haley; Turner, Steve; Rhodes, Michael; Milos, Patrice; Harkins, Tim

    2009-05-27

    From left to right: Haley Fiske of Illumina Inc., Steve Turner of Pacific Biosciences, Michael Rhodes of Applied Biosystems, Patrice Milos of Helicos Biosciences and Tim Harkins of Roche Diagnostics answer questions in a forum moderated by Bob Fulton at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  8. New Generation Sequencing Technology Panel at SFAF-Part II

    SciTech Connect

    Fiske, Haley; Turner, Steve; Rhodes, Michael; Milos, Patrice; Harkins, Tim

    2009-05-27

    From left to right: Haley Fiske of Illumina Inc., Steve Turner of Pacific Biosciences, Michael Rhodes of Applied Biosystems, Patrice Milos of Helicos Biosciences and Tim Harkins of Roche Diagnostics answer questions in a forum moderated by Bob Fulton at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  9. Sequence stratigraphy and systems tract development of the Latemar platform, Middle Triassic of the dolomites: Outcrop calibration keyed by cycle stacking patterns

    SciTech Connect

    Goldhammer, R.K.; Dunn, P.A.; Harris, M.T.; Hardie, L.A.

    1991-03-01

    The Middle Triassic Latemar platform provides a seismic-scale outcrop example of an intact carbonate shelf-to-basin transition, ideal for integrating sequence stratigraphy with facies and cyclic stratigraphy. This subcircular, high-relief buildup records two third-order accommodation sequences within the platform interior: the lower Ladinian sequence and the upper Ladinian sequence. Sequence L1 developed atop a widespread, low-relief Middle Anisian carbonate bank (60 m thick). Underlying subtidal bank cycles thin upward into the basal, subaerial sequence boundary (type 1) reflecting decreasing third-order accommodation; above it, platform-interior facies of sequence L1 retrograde. This results in superimposition of Ladinian basinal and foreslope facies atop the underlying, horizontal, shallow-water bank along its periphery. The transgressive (TST) and highstand systems tract (HST) of sequence L1 (as well as L2) are marked by long-term, systematic vertical facies changes and variation in stacking patterns of aggradational high-frequency, 20 kyr cycles within the platform interior. The maximum flooding surface (MFS) is a marine hardground surface displaying evidence of very slow sedimentation and is the platform expression of the condensed section. A type 2 SB caps sequence L1, marked by an interval of vertically superimposed thin subaerial tepees; beneath this, high-frequency cycles are thinning-upward, and above they are thickening-upward. Only the transgressive systems tract of sequence L2 is preserved at the Latemar owing to late Ladinian-Early Carnian volcanism and tectonism which terminated carbonate platform deposition.

  10. Concatenated shift registers generating maximally spaced phase shifts of PN-sequences

    NASA Technical Reports Server (NTRS)

    Hurd, W. J.; Welch, L. R.

    1977-01-01

    A large class of linearly concatenated shift registers is shown to generate approximately maximally spaced phase shifts of pn-sequences, for use in pseudorandom number generation. A constructive method is presented for finding members of this class, for almost all degrees for which primitive trinomials exist. The sequences which result are not normally characterized by trinomial recursions, which is desirable since trinomial sequences can have some undesirable randomness properties.
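
As background for the abstract above, a PN-sequence is produced by a linear shift-register recursion over bits. The following Python sketch shows only that basic single-register recursion for a trinomial x^n + x^k + 1; it does not reproduce the concatenated-register construction of the paper. The function names and the choice n = 4, k = 1 (the primitive trinomial x^4 + x + 1) are illustrative assumptions.

```python
def trinomial_pn_sequence(n, k, seed, length):
    """Generate bits from the recursion s[t+n] = s[t+k] XOR s[t].

    This is the linear recursion associated with the trinomial x^n + x^k + 1.
    seed holds the first n bits and must not be all zeros.
    """
    if len(seed) != n or not any(seed):
        raise ValueError("seed must contain n bits, not all zero")
    state = list(seed)
    bits = []
    for _ in range(length):
        bits.append(state[0])              # emit the oldest bit
        new_bit = state[0] ^ state[k]      # s[t+n] = s[t] XOR s[t+k]
        state = state[1:] + [new_bit]      # shift the register by one position
    return bits


def register_period(n, k, seed):
    """Count shifts until the register returns to its starting state."""
    start = tuple(seed)
    state = start
    steps = 0
    while True:
        state = state[1:] + (state[0] ^ state[k],)
        steps += 1
        if state == start:
            return steps


# x^4 + x + 1 is primitive, so any nonzero seed gives the maximal period 2**4 - 1.
one_period = trinomial_pn_sequence(4, 1, [1, 0, 0, 0], 15)
```

Starting the same register from a different nonzero state yields the same sequence at a different phase, which is the sense in which the abstract speaks of phase shifts.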

  11. Performance evaluation of Sanger sequencing for the diagnosis of primary hyperoxaluria and comparison with targeted next generation sequencing

    PubMed Central

    Williams, Emma L; Bagg, Eleanor A L; Mueller, Michael; Vandrovcova, Jana; Aitman, Timothy J; Rumsby, Gill

    2015-01-01

    Definitive diagnosis of primary hyperoxaluria (PH) currently utilizes sequential Sanger sequencing of the AGXT, GRHPR, and HOGA1 genes but efficacy is unproven. This analysis is time-consuming, relatively expensive, and delays in diagnosis and inappropriate treatment can occur if not pursued early in the diagnostic work-up. We reviewed testing outcomes of Sanger sequencing in 200 consecutive patient samples referred for analysis. In addition, the Illumina Truseq custom amplicon system was evaluated for paralleled next-generation sequencing (NGS) of AGXT, GRHPR, and HOGA1 in 90 known PH patients. AGXT sequencing was requested in all patients, permitting a diagnosis of PH1 in 50%. All remaining patients underwent targeted exon sequencing of GRHPR and HOGA1 with 8% diagnosed with PH2 and 8% with PH3. Complete sequencing of both GRHPR and HOGA1 was not requested in 25% of patients referred leaving their diagnosis in doubt. NGS analysis showed 98% agreement with Sanger sequencing and both approaches had 100% diagnostic specificity. Diagnostic sensitivity of Sanger sequencing was 98% and for NGS it was 97%. NGS has comparable diagnostic performance to Sanger sequencing for the diagnosis of PH and, if implemented, would screen for all forms of PH simultaneously ensuring prompt diagnosis at decreased cost. PMID:25629080
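
For readers unfamiliar with the two metrics quoted above, the standard definitions can be written as a short Python sketch. The counts used in the example are hypothetical and are not taken from the study; they are chosen only so that the arithmetic is easy to follow.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly affected cases that the test identifies."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of truly unaffected cases that the test clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts: 97 of 100 affected cases detected, no false positives in 50 controls.
example_sensitivity = sensitivity(true_pos=97, false_neg=3)
example_specificity = specificity(true_neg=50, false_pos=0)
```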

  12. Combining microarray-based genomic selection (MGS) with the Illumina Genome Analyzer platform to sequence diploid target regions.

    PubMed

    Okou, David T; Locke, Adam E; Steinberg, Karyn M; Hagen, Katie; Athri, Prashanth; Shetty, Amol C; Patel, Viren; Zwick, Michael E

    2009-09-01

    Novel methods of targeted sequencing of unique regions from complex eukaryotic genomes have generated a great deal of excitement, but critical demonstrations of these methods' efficacy with respect to diploid genotype calling and experimental variation are lacking. To address this issue, we optimized microarray-based genomic selection (MGS) for use with the Illumina Genome Analyzer (IGA). A set of 202 fragments (304 kb total) contained within a 1.7 Mb genomic region on human chromosome X were MGS/IGA sequenced in ten female HapMap samples generating a total of 2.4 GB of DNA sequence. At a minimum coverage threshold of 5X, 93.9% of all bases and 94.9% of segregating sites were called, while 57.7% of bases (57.4% of segregating sites) were called at a 50X threshold. Data accuracy at known segregating sites was 98.9% at 5X coverage, rising to 99.6% at 50X coverage. Accuracy at homozygous sites was 98.7% at 5X sequence coverage and 99.5% at 50X coverage. Although accuracy at heterozygous sites was modestly lower, it was still over 92% at 5X coverage and increased to nearly 97% at 50X coverage. These data provide the first demonstration that MGS/IGA sequencing can generate the very high quality sequence data necessary for human genetics research. All sequences generated in this study have been deposited in NCBI Short Read Archive (http://www.ncbi.nlm.nih.gov/Traces/sra, Accession # SRA007913).
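
The "percentage of bases called at a minimum coverage threshold" reported above can be pictured with a minimal Python sketch, assuming a base counts as called when its read depth meets the threshold. The per-base depths below are hypothetical, and the study's actual genotype caller is not reproduced here.

```python
def fraction_called(depths, min_coverage):
    """Fraction of positions whose read depth is at least min_coverage."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(depth >= min_coverage for depth in depths) / len(depths)


# Hypothetical per-base read depths for ten target positions.
example_depths = [0, 3, 5, 12, 48, 50, 75, 120, 6, 51]
called_at_5x = fraction_called(example_depths, 5)
called_at_50x = fraction_called(example_depths, 50)
```

Raising the threshold trades call rate for accuracy, which matches the pattern in the abstract: fewer bases are called at 50X than at 5X, but the calls that remain are more accurate.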

  13. Facies architecture and high-resolution sequence stratigraphy of an Upper Cretaceous platform margin succession, southern central Pyrenees, Spain

    NASA Astrophysics Data System (ADS)

    Pomar, Luis; Gili, Eulalia; Obrador, Antonio; Ward, William C.

    2005-04-01

    Excellent exposures of Upper Cretaceous (Santonian) carbonate platforms on the northern flank of Sant Corneli anticline (southern central Pyrenees) provide detailed information of facies architecture in both depositional strike and dip directions. Basic accretional units are differentiated by facies contrast across mappable surfaces. These surfaces do not show clear evidence of subaerial erosion and are correlated basinward with bedding planes across which there are subtle changes in skeletal composition. Two types of basic accretional units have been identified based on bedding patterns, internal facies architecture and skeletal composition: (1) Rudist buildups consist of a rudist and coral belt at the platform margin, passing landward into a slender-hippuritid lithosome, locally overlain by a bioclastic blanket that passes basinward, into bioclastic "apron-like" clinobeds and into fine-grained packstone/wackestone. (2) Calcarenite wedges consist of yellow-brown, benthic-foraminifer-rich grainstones to grain-dominated packstones, with scattered rudist shells and small coral colonies, passing basinward into mud-dominated packstones to wackestones, with variable siliciclastic content (quartz sand to silt and clay). Rudist buildups and calcarenite wedges alternate, although not rhythmically. These changes in platform skeletal composition reflect changes in the dominant type of carbonate-producing biota independently of the changes in accommodation. Both types of basic accretional units: rudist buildups and calcarenite wedges, form simple sequences and parasequences according to internal lithofacies arrangement and inferred sea-level cyclicity (cycles or paracycles). High-frequency sea-level cyclicity fits in the Milankovitch frequency band. Long-term changes in accommodation governing aggradation, progradation and backstepping of basic sequences and parasequences reflect tectonic influence rather than long-term changes in eustatic sea level.

  14. A next-generation marker genotyping platform (AmpSeq) in heterozygous crops: a case study for marker-assisted selection in grapevine.

    PubMed

    Yang, Shanshan; Fresnedo-Ramírez, Jonathan; Wang, Minghui; Cote, Linda; Schweitzer, Peter; Barba, Paola; Takacs, Elizabeth M; Clark, Matthew; Luby, James; Manns, David C; Sacks, Gavin; Mansfield, Anna Katharine; Londo, Jason; Fennell, Anne; Gadoury, David; Reisch, Bruce; Cadle-Davidson, Lance; Sun, Qi

    2016-01-01

    Marker-assisted selection (MAS) is often employed in crop breeding programs to accelerate and enhance cultivar development, via selection during the juvenile phase and parental selection prior to crossing. Next-generation sequencing and its derivative technologies have been used for genome-wide molecular marker discovery. To bridge the gap between marker development and MAS implementation, this study developed a novel practical strategy with a semi-automated pipeline that incorporates trait-associated single nucleotide polymorphism marker discovery, low-cost genotyping through amplicon sequencing (AmpSeq) and decision making. The results document the development of a MAS package derived from genotyping-by-sequencing using three traits (flower sex, disease resistance and acylated anthocyanins) in grapevine breeding. The vast majority of sequence reads (⩾99%) were from the targeted regions. Across 380 individuals and up to 31 amplicons sequenced in each lane of MiSeq data, most amplicons (83 to 87%) had <10% missing data, and read depth had a median of 220-244×. Several strengths of the AmpSeq platform that make this approach of broad interest in diverse crop species include accuracy, flexibility, speed, high-throughput, low-cost and easily automated analysis.

  15. A next-generation marker genotyping platform (AmpSeq) in heterozygous crops: a case study for marker-assisted selection in grapevine

    PubMed Central

    Yang, Shanshan; Fresnedo-Ramírez, Jonathan; Wang, Minghui; Cote, Linda; Schweitzer, Peter; Barba, Paola; Takacs, Elizabeth M; Clark, Matthew; Luby, James; Manns, David C; Sacks, Gavin; Mansfield, Anna Katharine; Londo, Jason; Fennell, Anne; Gadoury, David; Reisch, Bruce; Cadle-Davidson, Lance; Sun, Qi

    2016-01-01

    Marker-assisted selection (MAS) is often employed in crop breeding programs to accelerate and enhance cultivar development, via selection during the juvenile phase and parental selection prior to crossing. Next-generation sequencing and its derivative technologies have been used for genome-wide molecular marker discovery. To bridge the gap between marker development and MAS implementation, this study developed a novel practical strategy with a semi-automated pipeline that incorporates trait-associated single nucleotide polymorphism marker discovery, low-cost genotyping through amplicon sequencing (AmpSeq) and decision making. The results document the development of a MAS package derived from genotyping-by-sequencing using three traits (flower sex, disease resistance and acylated anthocyanins) in grapevine breeding. The vast majority of sequence reads (⩾99%) were from the targeted regions. Across 380 individuals and up to 31 amplicons sequenced in each lane of MiSeq data, most amplicons (83 to 87%) had <10% missing data, and read depth had a median of 220–244×. Several strengths of the AmpSeq platform that make this approach of broad interest in diverse crop species include accuracy, flexibility, speed, high-throughput, low-cost and easily automated analysis. PMID:27257505

  16. Generating Correlated Gamma Sequences for Sea-Clutter Simulation

    DTIC Science & Technology

    2012-03-01

    2003]. The C code has been compiled as a Mex file for MATLAB by Davidson [2011] as part of his Radar Toolbox. Two special cases are implemented with...in IEEE Geoscience and Remote Sensing Symposium, pp. 2397–2400. Davidson, G. (2011) Matlab and c radar toolbox. URL – http://www.radarfactory.com... Radar Division Defence Science and Technology Organisation DSTO–TR–2688 ABSTRACT This report presents a hybrid method for simulating sequences of

  17. Identification of optimum sequencing depth especially for de novo genome assembly of small genomes using next generation sequencing data.

    PubMed

    Desai, Aarti; Marwah, Veer Singh; Yadav, Akshay; Jha, Vineet; Dhaygude, Kishor; Bangar, Ujwala; Kulkarni, Vivek; Jere, Abhay

    2013-01-01

    Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing has encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads and de novo genome assembly using these short reads is computationally very intensive. Due to lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, currently no report is available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage which are known to impact genome assembly are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly for different sized genomes using graph based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 MB), S. kudriavzevii (11.18 MB) and C. elegans (100 MB) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6-40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing which will enable optimum utilization of sequencing as well as analysis resources.
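
Read depth in this sense is the usual ratio of sequenced bases to genome size, so the number of reads needed for a target depth follows from simple arithmetic. The Python sketch below assumes a read length of 100 bases, which the abstract does not state, and uses the 4.6-megabase E. coli genome size quoted above.

```python
def mean_depth(num_reads, read_length, genome_size):
    """Average depth = total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def reads_needed(genome_size, target_depth, read_length):
    """Smallest whole number of reads that reaches target_depth."""
    total_bases = genome_size * target_depth
    return -(-total_bases // read_length)   # integer ceiling division


# E. coli at 50X, assuming (hypothetically) 100-base reads.
ecoli_reads = reads_needed(genome_size=4_600_000, target_depth=50, read_length=100)
ecoli_depth = mean_depth(ecoli_reads, 100, 4_600_000)
```

The same calculation helps when planning multiplexing: dividing a lane's expected read yield by the reads needed per sample gives a rough upper bound on how many samples can share the lane.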

  18. Challenges and opportunities for next-generation sequencing in companion diagnostics.

    PubMed

    Lin, Erick; Chien, Jeremy; Ong, Frank S; Fan, Jian-Bing

    2015-02-01

    The rapid decline in sequencing costs has allowed next-generation sequencing (NGS) assays, previously ubiquitous only in research laboratories, to begin making inroads into molecular diagnostics. Genotypic assays - DNA sequencing - include whole genome sequencing, whole exome sequencing, and focused assays that target only a handful of genes. Phenotypic assays comprise a broader spectrum of options and can query a variety of epigenetic modifications of DNA (such as ChIP-seq, bisulfite sequencing, DNase-I hypersensitivity site-sequencing, Formaldehyde-Assisted Isolation of Regulatory Elements-sequencing, etc.) that regulate gene expression-related processes or gene expression (RNA-sequencing) itself. To date, the US FDA has only cleared 12 DNA-based companion diagnostic tests, all in cancer. Although challenges exist for NGS in companion diagnostics, the wide-ranging capabilities of NGS offer extraordinary opportunities for the development and implementation of NGS-based companion diagnostics to probe oncogenes, tumor suppressor genes and cancer-enabling genes.

  19. Quantifying Next Generation Sequencing Sample Pre-Processing Bias in HIV-1 Complete Genome Sequencing.

    PubMed

    Vrancken, Bram; Trovão, Nídia Sequeira; Baele, Guy; van Wijngaerden, Eric; Vandamme, Anne-Mieke; van Laethem, Kristel; Lemey, Philippe

    2016-01-07

    Genetic analyses play a central role in infectious disease research. Massively parallelized "mechanical cloning" and sequencing technologies were quickly adopted by HIV researchers in order to broaden the understanding of the clinical importance of minor drug-resistant variants. These efforts have, however, remained largely limited to small genomic regions. The growing need to monitor multiple genome regions for drug resistance testing, as well as the obvious benefit for studying evolutionary and epidemic processes makes complete genome sequencing an important goal in viral research. In addition, a major drawback for NGS applications to RNA viruses is the need for large quantities of input DNA. Here, we use a generic overlapping amplicon-based near full-genome amplification protocol to compare low-input enzymatic fragmentation (Nextera™) with conventional mechanical shearing for Roche 454 sequencing. We find that the fragmentation method has only a modest impact on the characterization of the population composition and that for reliable results, the variation introduced at all steps of the procedure, from nucleic acid extraction to sequencing, should be taken into account, a finding that is also relevant for NGS technologies that are now more commonly used. Furthermore, by applying our protocol to deep sequence a number of pre-therapy plasma and PBMC samples, we illustrate the potential benefits of a near complete genome sequencing approach in routine genotyping.

  20. Assessing kinetic and epitopic diversity across orthogonal monoclonal antibody generation platforms

    PubMed Central

    Abdiche, Yasmina Noubia; Harriman, Rian; Deng, Xiaodi; Yeung, Yik Andy; Miles, Adam; Morishige, Winse; Boustany, Leila; Zhu, Lei; Izquierdo, Shelley Mettler; Harriman, William

    2016-01-01

    The ability of monoclonal antibodies (mAbs) to target specific antigens with high precision has led to an increasing demand to generate them for therapeutic use in many disease areas. Historically, the discovery of therapeutic mAbs has relied upon the immunization of mammals and various in vitro display technologies. While the routine immunization of rodents yields clones that are stable in serum and have been selected against vast arrays of endogenous, non-target self-antigens, it is often difficult to obtain species cross-reactive mAbs owing to the generally high sequence similarity shared across human antigens and their mammalian orthologs. In vitro display technologies bypass this limitation, but lack an in vivo screening mechanism, and thus may potentially generate mAbs with undesirable binding specificity and stability issues. Chicken immunization is emerging as an attractive mAb discovery method because it combines the benefits of both in vivo and in vitro display methods. Since chickens are phylogenetically separated from mammals, their proteins share less sequence homology with those of humans, so human proteins are often immunogenic and can readily elicit rodent cross-reactive clones, which are necessary for in vivo proof of mechanism studies. Here, we compare the binding characteristics of mAbs isolated from chicken immunization, mouse immunization, and phage display of human antibody libraries. Our results show that chicken-derived mAbs not only recapitulate the kinetic diversity of mAbs sourced from other methods, but appear to offer an expanded repertoire of epitopes. Further, chicken-derived mAbs can bind their native serum antigen with very high affinity, highlighting their therapeutic potential. PMID:26652308

  1. Next-generation tag sequencing for cancer gene expression profiling.

    PubMed

    Morrissy, A Sorana; Morin, Ryan D; Delaney, Allen; Zeng, Thomas; McDonald, Helen; Jones, Steven; Zhao, Yongjun; Hirst, Martin; Marra, Marco A

    2009-10-01

    We describe a new method, Tag-seq, which employs ultra high-throughput sequencing of 21 base pair cDNA tags for sensitive and cost-effective gene expression profiling. We compared Tag-seq data to LongSAGE data and observed improved representation of several classes of rare transcripts, including transcription factors, antisense transcripts, and intronic sequences, the latter possibly representing novel exons or genes. We observed increases in the diversity, abundance, and dynamic range of such rare transcripts and took advantage of the greater dynamic range of expression to identify, in cancers and normal libraries, altered expression ratios of alternative transcript isoforms. The strand-specific information of Tag-seq reads further allowed us to detect altered expression ratios of sense and antisense (S-AS) transcripts between cancer and normal libraries. S-AS transcripts were enriched in known cancer genes, while transcript isoforms were enriched in miRNA targeting sites. We found that transcript abundance had a stronger GC-bias in LongSAGE than Tag-seq, such that AT-rich tags were less abundant than GC-rich tags in LongSAGE. Tag-seq also performed better in gene discovery, identifying >98% of genes detected by LongSAGE and profiling a distinct subset of the transcriptome characterized by AT-rich genes, which was expressed at levels below those detectable by LongSAGE. Overall, Tag-seq is sensitive to rare transcripts, has less sequence composition bias relative to LongSAGE, and allows differential expression analysis for a greater range of transcripts, including transcripts encoding important regulatory molecules.

  2. SeqReporter: automating next-generation sequencing result interpretation and reporting workflow in a clinical laboratory.

    PubMed

    Roy, Somak; Durso, Mary Beth; Wald, Abigail; Nikiforov, Yuri E; Nikiforova, Marina N

    2014-01-01

    A wide repertoire of bioinformatics applications exists for next-generation sequencing data analysis; however, certain requirements of the clinical molecular laboratory limit their use: i) comprehensive report generation, ii) compatibility with existing laboratory information systems and computer operating system, iii) knowledgebase development, iv) quality management, and v) data security. SeqReporter is a web-based application developed using ASP.NET framework version 4.0. The client-side was designed using HTML5, CSS3, and Javascript. The server-side processing (VB.NET) relied on interaction with a customized SQL server 2008 R2 database. Overall, 104 cases (1062 variant calls) were analyzed by SeqReporter. Each variant call was classified into one of five report levels: i) known clinical significance, ii) uncertain clinical significance, iii) pending pathologists' review, iv) synonymous and deep intronic, and v) platform and panel-specific sequence errors. SeqReporter correctly annotated and classified 99.9% (859 of 860) of sequence variants, including 68.7% synonymous single-nucleotide variants, 28.3% nonsynonymous single-nucleotide variants, 1.7% insertions, and 1.3% deletions. One variant of potential clinical significance was re-classified after pathologist review. Laboratory information system-compatible clinical reports were generated automatically. SeqReporter also facilitated quality management activities. SeqReporter is an example of a customized and well-designed informatics solution to optimize and automate the downstream analysis of clinical next-generation sequencing data. We propose it as a model that may envisage the development of a comprehensive clinical informatics solution.

  3. Wasabi: An Integrated Platform for Evolutionary Sequence Analysis and Data Visualization.

    PubMed

    Veidenberg, Andres; Medlar, Alan; Löytynoja, Ari

    2016-04-01

    Wasabi is an open source, web-based environment for evolutionary sequence analysis. Wasabi visualizes sequence data together with a phylogenetic tree within a modern, user-friendly interface: The interface hides extraneous options, supports context sensitive menus, drag-and-drop editing, and displays additional information, such as ancestral sequences, associated with specific tree nodes. The Wasabi environment supports reproducibility by automatically storing intermediate analysis steps and includes built-in functions to share data between users and publish analysis results. For computational analysis, Wasabi supports PRANK and PAGAN for phylogeny-aware alignment and alignment extension, and it can be easily extended with other tools. Along with drag-and-drop import of local files, Wasabi can access remote data through URL and import sequence data, GeneTrees and EPO alignments directly from Ensembl. To demonstrate a typical workflow using Wasabi, we reproduce key findings from recent comparative genomics studies, including a reanalysis of the EGLN1 gene from the tiger genome study: These case studies can be browsed within Wasabi at http://wasabiapp.org:8000?id=usecases. Wasabi runs inside a web browser and does not require any installation. One can start using it at http://wasabiapp.org. All source code is licensed under the AGPLv3.

  4. Next-generation sequencing-based method shows increased mutation detection sensitivity in an Indian retinoblastoma cohort

    PubMed Central

    Singh, Jaya; Mishra, Avshesh; Pandian, Arunachalam Jayamuruga; Mallipatna, Ashwin C.; Khetan, Vikas; Sripriya, S.; Kapoor, Suman; Agarwal, Smita; Sankaran, Satish; Katragadda, Shanmukh; Veeramachaneni, Vamsi; Hariharan, Ramesh; Subramanian, Kalyanasundaram

    2016-01-01

    Purpose: Retinoblastoma (Rb) is the most common primary intraocular cancer of childhood and one of the major causes of blindness in children. India has the highest number of patients with Rb in the world. Mutations in the RB1 gene are the primary cause of Rb, and heterogeneous mutations are distributed throughout the entire length of the gene. Therefore, genetic testing requires screening of the entire gene, which by conventional sequencing is time consuming and expensive. Methods: In this study, we screened the RB1 gene in the DNA isolated from blood or saliva samples of 50 unrelated patients with Rb using the TruSight Cancer panel. Next-generation sequencing (NGS) was done on the Illumina MiSeq platform. Genetic variations were identified using the Strand NGS software and interpreted using the StrandOmics platform. Results: We were able to detect germline pathogenic mutations in 66% (33/50) of the cases, 12 of which were novel. We were able to detect all types of mutations, including missense, nonsense, splice site, indel, and structural variants. When we considered bilateral Rb cases only, the mutation detection rate increased to 100% (22/22). In unilateral Rb cases, the mutation detection rate was 30% (6/20). Conclusions: Our study suggests that NGS-based approaches increase the sensitivity of mutation detection in the RB1 gene, making it fast and cost-effective compared to the conventional tests performed in a reflex-testing mode. PMID:27582626

  5. Next-generation sequencing (NGS) for assessment of microbial water quality: current progress, challenges, and future opportunities.

    PubMed

    Tan, BoonFei; Ng, Charmaine; Nshimyimana, Jean Pierre; Loh, Lay Leng; Gin, Karina Y-H; Thompson, Janelle R

    2015-01-01

    Water quality is an emergent property of a complex system comprised of interacting microbial populations and introduced microbial and chemical contaminants. Studies leveraging next-generation sequencing (NGS) technologies are providing new insights into the ecology of microbially mediated processes that influence fresh water quality such as algal blooms, contaminant biodegradation, and pathogen dissemination. In addition, sequencing methods targeting small subunit (SSU) rRNA hypervariable regions have allowed identification of signature microbial species that serve as bioindicators for sewage contamination in these environments. Beyond amplicon sequencing, metagenomic and metatranscriptomic analyses of microbial communities in fresh water environments reveal the genetic capabilities and interplay of waterborne microorganisms, shedding light on the mechanisms for production and biodegradation of toxins and other contaminants. This review discusses the challenges and benefits of applying NGS-based methods to water quality research and assessment. We will consider the suitability and biases inherent in the application of NGS as a screening tool for assessment of biological risks and discuss the potential and limitations for direct quantitative interpretation of NGS data. Secondly, we will examine case studies from recent literature where NGS based methods have been applied to topics in water quality assessment, including development of bioindicators for sewage pollution and microbial source tracking, characterizing the distribution of toxin and antibiotic resistance genes in water samples, and investigating mechanisms of biodegradation of harmful pollutants that threaten water quality. Finally, we provide a short review of emerging NGS platforms and their potential applications to the next generation of water quality assessment tools.

  6. Next-Generation Sequencing Approaches in Cancer: Where Have They Brought Us and Where Will They Take Us?

    PubMed Central

    LeBlanc, Veronique G.; Marra, Marco A.

    2015-01-01

    Next-generation sequencing (NGS) technologies and data have revolutionized cancer research and are increasingly being deployed to guide clinicians in treatment decision-making. NGS technologies have allowed us to take an “omics” approach to cancer in order to reveal genomic, transcriptomic, and epigenomic landscapes of individual malignancies. Integrative multi-platform analyses are increasingly used in large-scale projects that aim to fully characterize individual tumours as well as general cancer types and subtypes. In this review, we examine how NGS technologies in particular have contributed to “omics” approaches in cancer research, allowing for large-scale integrative analyses that consider hundreds of tumour samples. These types of studies have provided us with an unprecedented wealth of information, providing the background knowledge needed to make small-scale (including “N of 1”) studies informative and relevant. We also take a look at emerging opportunities provided by NGS and state-of-the-art third-generation sequencing technologies, particularly in the context of translational research. Cancer research and care are currently poised to experience significant progress catalyzed by accessible sequencing technologies that will benefit both clinical- and research-based efforts. PMID:26404381

  7. Next generation sequencing of DNA-launched Chikungunya vaccine virus

    SciTech Connect

    Hidajat, Rachmat; Nickols, Brian; Forrester, Naomi; Tretyakova, Irina; Weaver, Scott; Pushko, Peter

    2016-03-15

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3′ untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. - Highlights: • Chikungunya virus (CHIKV) is an emerging pandemic threat. • In vivo DNA-launched attenuated CHIKV is a novel vaccine technology. • DNA-launched virus was sequenced using HiSeq2000 and compared to the 181/25 virus. • DNA-launched virus has lower frequency of SNPs at E2-12 and E2-82 attenuation loci.

  8. Architecting Prodiguer: the next generation French climate modelling data management platform

    NASA Astrophysics Data System (ADS)

    Morgan, Mark; Denvil, Sebastien; Bhardwaj, Ashish

    2010-05-01

    The Pierre Simon Laplace Institute (IPSL), like many other climate modeling groups, is involved in the international development of a comprehensive Earth System Model (ESM) to study the interactions between chemical, physical, and biological processes. This work entails the coupling of different components (land, ocean, atmosphere, chemistry, etc.) and requires an execution environment platform that can tackle the entire range of interdependent model configurations. Furthermore, the ever-increasing number of simulations, executed against model configurations within scientific computing centres, is generating a huge volume of data and meta-data that must be made available to the international community of researchers, modelers, students and general users. IPSL is in the process of implementing a French national project called Prodiguer whose objective is to ensure that the data and meta-data can be delivered to the French & international communities in a timely and appropriate fashion, hence achieving the strategic goals outlined above. Prodiguer aims to leverage, extend and build upon the work of international projects such as Earth System Grid, METAFOR and IS-ENES. Thus Prodiguer is to be seen as one actor amongst many attempting the difficult task of information integration within a complex enterprise space. We will present the technical architecture being put in place to achieve the goals of Prodiguer. Such an architecture necessarily encompasses many aspects of Service / Resource Orientated Architectural practice. From security to messaging patterns, from message queues to failover strategies, we will illustrate how pragmatism is inevitably the main driver behind such an architecture. We will also illustrate that as the number of actors increases so does workflow complexity, and as a consequence simplicity becomes an important guiding factor in itself.

  9. Synchronized excitability in a network enables generation of internal neuronal sequences

    PubMed Central

    Wang, Yingxue; Roth, Zachary; Pastalkova, Eva

    2016-01-01

    Hippocampal place field sequences are supported by sensory cues and network internal mechanisms. In contrast, sharp-wave (SPW) sequences, theta sequences, and episode field sequences are internally generated. The relationship of these sequences to memory is unclear. SPW sequences have been shown to support learning and have been assumed to also support episodic memory. Conversely, we demonstrate these SPW sequences were present in trained rats even after episodic memory was impaired and after other internal sequences – episode field and theta sequences – were eliminated. SPW sequences did not support memory despite continuing to ‘replay’ all task-related sequences – place field and episode field sequences. Sequence replay occurred selectively during synchronous increases of population excitability – SPWs. Similarly, theta sequences depended on the presence of repeated synchronized waves of excitability – theta oscillations. Thus, we suggest that either intermittent or rhythmic synchronized changes of excitability trigger sequential firing of neurons, which in turn supports learning and/or memory. DOI: http://dx.doi.org/10.7554/eLife.20697.001 PMID:27677848

  10. Identification and characterization of Highlands J virus from a Mississippi sandhill crane using unbiased next-generation sequencing

    USGS Publications Warehouse

    Ip, Hon S.; Wiley, Michael R.; Long, Renee; Gustavo, Palacios; Shearn-Bochsler, Valerie; Whitehouse, Chris A.

    2014-01-01

    Advances in massively parallel DNA sequencing platforms, commonly termed next-generation sequencing (NGS) technologies, have greatly reduced time, labor, and cost associated with DNA sequencing. Thus, NGS has become a routine tool for new viral pathogen discovery and will likely become the standard for routine laboratory diagnostics of infectious diseases in the near future. This study demonstrated the application of NGS for the rapid identification and characterization of a virus isolated from the brain of an endangered Mississippi sandhill crane. This bird was part of a population restoration effort and was found in an emaciated state several days after Hurricane Isaac passed over the refuge in Mississippi in 2012. Post-mortem examination had identified trichostrongyliasis as the possible cause of death, but because a virus with morphology consistent with a togavirus was isolated from the brain of the bird, an arboviral etiology was strongly suspected. Because individual molecular assays for several known arboviruses were negative, unbiased NGS by Illumina MiSeq was used to definitively identify and characterize the causative viral agent. Whole genome sequencing and phylogenetic analysis revealed the viral isolate to be the Highlands J virus, a known avian pathogen. This study demonstrates the use of unbiased NGS for the rapid detection and characterization of an unidentified viral pathogen and the application of this technology to wildlife disease diagnostics and conservation medicine.

  11. Association analysis using next-generation sequence data from publicly available control groups: the robust variance score statistic

    PubMed Central

    Derkach, Andriy; Chiang, Theodore; Gong, Jiafen; Addis, Laura; Dobbins, Sara; Tomlinson, Ian; Houlston, Richard; Pal, Deb K.; Strug, Lisa J.

    2014-01-01

    Motivation: Sufficiently powered case–control studies with next-generation sequence (NGS) data remain prohibitively expensive for many investigators. If feasible, a more efficient strategy would be to include publicly available sequenced controls. However, these studies can be confounded by differences in sequencing platform; alignment, single nucleotide polymorphism and variant calling algorithms; read depth; and selection thresholds. Assuming one can match cases and controls on the basis of ethnicity and other potential confounding factors, and one has access to the aligned reads in both groups, we investigate the effect of systematic differences in read depth and selection threshold when comparing allele frequencies between cases and controls. We propose a novel likelihood-based method, the robust variance score (RVS), that substitutes genotype calls by their expected values given observed sequence data. Results: We show theoretically that the RVS eliminates read depth bias in the estimation of minor allele frequency. We also demonstrate that, using simulated and real NGS data, the RVS method controls Type I error and has comparable power to the ‘gold standard’ analysis with the true underlying genotypes for both common and rare variants. Availability and implementation: An RVS R script and instructions can be found at strug.research.sickkids.ca, and at https://github.com/strug-lab/RVS. Contact: lisa.strug@utoronto.ca Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24733292
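
    The substitution step described in the abstract above (replacing a hard genotype call with its expected value given the observed sequence data) can be illustrated with a minimal sketch. This is not the authors' RVS implementation, which is distributed as an R script; the function name `expected_genotype`, the Hardy-Weinberg prior, and the example likelihood values are assumptions made purely for illustration.

```python
def expected_genotype(likelihoods, maf):
    """Return E[G | D] for one variant in one sample.

    likelihoods: (P(D | G=0), P(D | G=1), P(D | G=2)), where G counts
                 copies of the minor allele (hypothetical input format).
    maf:         assumed minor allele frequency, used to build a
                 Hardy-Weinberg prior over genotypes (an assumption).
    """
    if not 0.0 <= maf <= 1.0:
        raise ValueError("maf must lie in [0, 1]")
    # Hardy-Weinberg genotype prior: P(G=0), P(G=1), P(G=2).
    prior = ((1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2)
    # Bayes' rule: posterior is proportional to likelihood times prior.
    joint = [lik * p for lik, p in zip(likelihoods, prior)]
    total = sum(joint)
    if total == 0:
        raise ValueError("likelihoods and prior give zero total probability")
    posterior = [j / total for j in joint]
    # Expected minor-allele count under the posterior.
    return sum(g * p for g, p in enumerate(posterior))


# With uninformative reads the result falls back to the prior mean, 2 * maf.
print(expected_genotype((1.0, 1.0, 1.0), maf=0.5))
```

    When read depth is low the likelihoods are flat and the value stays close to the prior mean rather than being forced to 0, 1, or 2, which is the intuition behind using expected values instead of hard calls.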

  12. GPU technology as a platform for accelerating local complexity analysis of protein sequences.

    PubMed

    Papadopoulos, Agathoklis; Kirmitzoglou, Ioannis; Promponas, Vasilis J; Theocharides, Theocharis

    2013-01-01

    The use of the GPGPU programming paradigm (running CUDA-enabled algorithms on GPU cards) in Bioinformatics has shown promising results [1]. As such, a similar approach can be used to speed up other algorithms such as CAST, a popular tool used for masking low-complexity regions (LCRs) in protein sequences [2] with increased sensitivity. We developed and implemented a CUDA-enabled version (GPU_CAST) of the multi-threaded version of CAST software first presented in [3] and optimized in [4]. The proposed software implementation uses the nVIDIA CUDA libraries and the GPGPU programming paradigm to take advantage of the inherent parallel characteristics of the CAST algorithm to execute the calculations on the GPU card of the host computer system. The GPU-based implementation presented in this work was compared against the multi-threaded, multi-core optimized version of CAST [4] and yielded speedups of 5x-10x for large protein sequence datasets.

  13. Next-generation sequencing technologies and applications for human genetic history and forensics.

    PubMed

    Berglund, Eva C; Kiialainen, Anna; Syvänen, Ann-Christine

    2011-11-24

    Rapid advances in the development of sequencing technologies in recent years have enabled an increasing number of applications in biology and medicine. Here, we review key technical aspects of the preparation of DNA templates for sequencing, the biochemical reaction principles and assay formats underlying next-generation sequencing systems, methods for imaging and base calling, quality control, and bioinformatic approaches for sequence alignment, variant calling and assembly. We also discuss some of the most important advances that the new sequencing technologies have brought to the fields of human population genetics, human genetic history and forensic genetics.

  14. An improved protocol for sequencing of repetitive genomic regions and structural variations using mutagenesis and next generation sequencing.

    PubMed

    Sipos, Botond; Massingham, Tim; Stütz, Adrian M; Goldman, Nick

    2012-01-01

    The rise of Next Generation Sequencing (NGS) technologies has transformed de novo genome sequencing into an accessible research tool, but obtaining high quality eukaryotic genome assemblies remains a challenge, mostly due to the abundance of repetitive elements. These also make it difficult to study nucleotide polymorphism in repetitive regions, including certain types of structural variations. One solution proposed for resolving such regions is Sequence Assembly aided by Mutagenesis (SAM), which relies on the fact that introducing enough random mutations breaks the repetitive structure, making assembly possible. Sequencing many different mutated copies permits the sequence of the repetitive region to be inferred by consensus methods. However, this approach relies on molecular cloning in order to isolate and amplify individual mutant copies, making it hard to scale-up the approach for use in conjunction with high-throughput sequencing technologies. To address this problem, we propose NG-SAM, a modified version of the SAM protocol that relies on PCR and dilution steps only, coupled to a NGS workflow. NG-SAM therefore has the potential to be scaled-up, e.g. using emerging microfluidics technologies. We built a realistic simulation pipeline to study the feasibility of NG-SAM, and our results suggest that under appropriate experimental conditions the approach might be successfully put into practice. Moreover, our simulations suggest that NG-SAM is capable of reconstructing robustly a wide range of potential target sequences of varying lengths and repetitive structures.
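
    The consensus idea behind SAM and NG-SAM, as described above, can be sketched in a few lines. The sketch below is not the authors' simulation pipeline: it skips the assembly step entirely and assumes the mutated copies are already aligned and of equal length. The function names `mutate` and `consensus` and the uniform substitution model are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def mutate(seq, rate, rng):
    """Return a copy of seq in which each base is substituted with probability rate."""
    bases = "ACGT"
    out = []
    for base in seq:
        if rng.random() < rate:
            # Substitute with one of the three other bases, chosen uniformly.
            out.append(rng.choice([b for b in bases if b != base]))
        else:
            out.append(base)
    return "".join(out)


def consensus(copies):
    """Return the majority base at each position across equal-length copies."""
    if not copies:
        raise ValueError("need at least one copy")
    if len({len(c) for c in copies}) != 1:
        raise ValueError("copies must all have the same length")
    # zip(*copies) walks the copies column by column.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*copies))


rng = random.Random(0)
target = "ACGT" * 10  # a short repetitive target sequence
copies = [mutate(target, 0.1, rng) for _ in range(25)]
recovered = consensus(copies)
print(sum(a != b for a, b in zip(target, recovered)), "mismatches vs. target")
```

    Each mutated copy differs from the others at random positions, which is what breaks the repeat structure for assembly, while the per-position majority vote across many copies tends to cancel the introduced mutations.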

  15. From Conventional to Next Generation Sequencing of Epstein-Barr Virus Genomes.

    PubMed

    Kwok, Hin; Chiang, Alan Kwok Shing

    2016-02-24

    Genomic sequences of Epstein-Barr virus (EBV) have been of interest because the virus is associated with cancers, such as nasopharyngeal carcinoma, and conditions such as infectious mononucleosis. The progress of whole-genome EBV sequencing has been limited by the inefficiency and cost of the first-generation sequencing technology. With the advancement of next-generation sequencing (NGS) and target enrichment strategies, an increasing number of EBV genomes have been published. These genomes were sequenced using different approaches, either with or without EBV DNA enrichment. This review provides an overview of the EBV genomes published to date, and a description of the sequencing technology and bioinformatic analyses employed in generating these sequences. We further explored ways through which the quality of sequencing data can be improved, such as using DNA oligos for capture hybridization, and longer insert size and read length in the sequencing runs. These advances will enable large-scale genomic sequencing of EBV which will facilitate a better understanding of the genetic variations of EBV in different geographic regions and discovery of potentially pathogenic variants in specific diseases.

  16. Designing next-generation platforms for evaluating scientific output: what scientists can learn from the social web.

    PubMed

    Yarkoni, Tal

    2012-01-01

    Traditional pre-publication peer review of scientific output is a slow, inefficient, and unreliable process. Efforts to replace or supplement traditional evaluation models with open evaluation platforms that leverage advances in information technology are slowly gaining traction, but remain in the early stages of design and implementation. Here I discuss a number of considerations relevant to the development of such platforms. I focus particular attention on three core elements that next-generation evaluation platforms should strive to emphasize, including (1) open and transparent access to accumulated evaluation data, (2) personalized and highly customizable performance metrics, and (3) appropriate short-term incentivization of the userbase. Because all of these elements have already been successfully implemented on a large scale in hundreds of existing social web applications, I argue that development of new scientific evaluation platforms should proceed largely by adapting existing techniques rather than engineering entirely new evaluation mechanisms. Successful implementation of open evaluation platforms has the potential to substantially advance both the pace and the quality of scientific publication and evaluation, and the scientific community has a vested interest in shifting toward such models as soon as possible.

  17. Designing next-generation platforms for evaluating scientific output: what scientists can learn from the social web

    PubMed Central

    Yarkoni, Tal

    2012-01-01

    Traditional pre-publication peer review of scientific output is a slow, inefficient, and unreliable process. Efforts to replace or supplement traditional evaluation models with open evaluation platforms that leverage advances in information technology are slowly gaining traction, but remain in the early stages of design and implementation. Here I discuss a number of considerations relevant to the development of such platforms. I focus particular attention on three core elements that next-generation evaluation platforms should strive to emphasize, including (1) open and transparent access to accumulated evaluation data, (2) personalized and highly customizable performance metrics, and (3) appropriate short-term incentivization of the userbase. Because all of these elements have already been successfully implemented on a large scale in hundreds of existing social web applications, I argue that development of new scientific evaluation platforms should proceed largely by adapting existing techniques rather than engineering entirely new evaluation mechanisms. Successful implementation of open evaluation platforms has the potential to substantially advance both the pace and the quality of scientific publication and evaluation, and the scientific community has a vested interest in shifting toward such models as soon as possible. PMID:23060783

  18. Generating Multiple Base-Resolution DNA Methylomes Using Reduced Representation Bisulfite Sequencing.

    PubMed

    Chatterjee, Aniruddha; Rodger, Euan J; Stockwell, Peter A; Le Mée, Gwenn; Morison, Ian M

    2017-01-01

    Reduced representation bisulfite sequencing (RRBS) is an effective technique for profiling genome-wide DNA methylation patterns in eukaryotes. RRBS couples size selection, bisulfite conversion, and second-generation sequencing to enrich for CpG-dense regions of the genome. The progressive improvement of second-generation sequencing technologies and reduction in cost provided an opportunity to examine the DNA methylation patterns of multiple genomes. Here, we describe a protocol for sequencing multiple RRBS libraries in a single sequencing reaction to generate base-resolution methylomes. Furthermore, we provide a brief guideline for base-calling and data analysis of multiplexed RRBS libraries. These strategies will be useful to perform large-scale, genome-wide DNA methylation analysis.
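
    As a rough illustration of what a base-resolution methylome looks like downstream of such a protocol, the sketch below turns per-site read counts into methylation levels. It relies only on the general bisulfite principle that unmethylated cytosines read as T while methylated cytosines remain C. The input layout, the function names, and the minimum coverage of 10 reads are assumptions for illustration and are not taken from the protocol described above.

```python
def methylation_level(c_count, t_count):
    """Fraction of reads supporting methylation at one cytosine.

    c_count: reads showing C (cytosine protected from conversion, i.e. methylated).
    t_count: reads showing T (cytosine converted, i.e. unmethylated).
    Returns None when the site has no coverage.
    """
    total = c_count + t_count
    if total == 0:
        return None
    return c_count / total


def summarize(counts, min_coverage=10):
    """Map (sample, chrom, pos) -> methylation level for sufficiently covered sites.

    counts: dict mapping (sample, chrom, pos) to a (c_count, t_count) tuple;
            this layout is hypothetical.
    min_coverage: sites with fewer reads are dropped (threshold is an assumption).
    """
    levels = {}
    for site, (c_count, t_count) in counts.items():
        if c_count + t_count >= min_coverage:
            levels[site] = methylation_level(c_count, t_count)
    return levels


example = {
    ("lib1", "chr1", 100): (8, 2),   # well covered, mostly methylated
    ("lib1", "chr1", 200): (1, 1),   # too few reads, dropped
    ("lib2", "chr1", 100): (0, 12),  # well covered, unmethylated
}
print(summarize(example))
```

    Keying the counts by sample keeps the libraries from a multiplexed run separate, so the same genomic position can be compared across methylomes.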

  19. Applications of nanotechnology, next generation sequencing and microarrays in biomedical research.

    PubMed

    Elingaramil, Sauli; Li, Xiaolong; He, Nongyue

    2013-07-01

    Next-generation sequencing technologies, microarrays and advances in bionanotechnology have had an enormous impact on research within a short time frame. This impact appears certain to increase further as many biomedical institutions are now acquiring these prevailing new technologies. Beyond conventional sampling of genome content, wide-ranging applications are rapidly evolving for next-generation sequencing, microarrays and nanotechnology. To date, these technologies have been applied in a variety of contexts, including whole-genome sequencing, targeted resequencing and discovery of transcription factor binding sites, noncoding RNA expression profiling and molecular diagnostics. This paper thus discusses current applications of nanotechnology, next-generation sequencing technologies and microarrays in biomedical research and highlights the transforming potential these technologies offer.

  20. Combining next-generation sequencing and online databases for microsatellite development in non-model organisms

    PubMed Central

    Rico, Ciro; Normandeau, Eric; Dion-Côté, Anne-Marie; Rico, María Inés; Côté, Guillaume; Bernatchez, Louis

    2013-01-01

    Next-generation sequencing (NGS) is revolutionising marker development and the rapidly increasing amount of transcriptomes published across a wide variety of taxa is providing valuable sequence databases for the identification of genetic markers without the need to generate new sequences. Microsatellites are still the most important source of polymorphic markers in ecology and evolution. Motivated by our long-term interest in the adaptive radiation of a non-model species complex of whitefishes (Coregonus spp.), in this study, we focus on microsatellite characterisation and multiplex optimisation using transcriptome sequences generated by Illumina® and Roche-454, as well as online databases of Expressed Sequence Tags (EST) for the study of whitefish evolution and demographic history. We identified and optimised 40 polymorphic loci in multiplex PCR reactions and validated the robustness of our analyses by testing several population genetics and phylogeographic predictions using 494 fish from five lakes and 2 distinct ecotypes. PMID:24296905

  1. Next Generation Sequencing of DNA-Launched Chikungunya Vaccine Virus

    PubMed Central

    Hidajat, Rachmat; Nickols, Brian; Forrester, Naomi; Tretyakova, Irina; Weaver, Scott; Pushko, Peter

    2016-01-01

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3’ untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. PMID:26855330

  2. Short Communication: Investigating a Chain of HIV Transmission Events Due to Homosexual Exposure and Blood Transfusion Based on a Next Generation Sequencing Method.

    PubMed

    Zhao, Qi; Zhang, Chen; Jiang, Yan; Wen, Yujie; Pan, Pinliang; Li, Yang; Zhang, Guiyun; Zhang, Lei; Qiu, Maofeng

    2015-12-01

    This study investigates a chain of HIV transmission events due to homosexual exposure and blood transfusion in China. The MiSeq platform, a next generation sequencing (NGS) system, was used to obtain genetic details of the HIV-1 env region (336 base pairs). Evolutionary analysis combined with epidemiologic evidence suggests a transmission chain from patient T3 to T2 through homosexual exposure and subsequently to T1 through blood transfusion. More importantly, a phylogenetic study suggested a likely genetic bottleneck for HIV in homosexual transmission from T3 to T2, while T1 inherited the majority of variants from T2. The result from the MiSeq platform is consistent with findings from the epidemiologic survey. The MiSeq platform is a powerful tool for tracing HIV transmissions and intrapersonal evolution.

  3. Microsatellites from Fosterella christophii (Bromeliaceae) by de novo transcriptome sequencing on the Pacific Biosciences RS platform

    PubMed Central

    Wöhrmann, Tina; Huettel, Bruno; Wagner, Natascha; Weising, Kurt

    2016-01-01

    Premise of the study: Microsatellite markers were developed in Fosterella christophii (Bromeliaceae) to investigate the genetic diversity and population structure within the F. micrantha group, comprising F. christophii, F. micrantha, and F. villosula. Methods and Results: Full-length cDNAs were isolated from F. christophii and sequenced on a Pacific Biosciences RS platform. A total of 1590 high-quality consensus isoforms were assembled into 971 unigenes containing 421 perfect microsatellites. Thirty primer sets were designed, of which 13 revealed a high level of polymorphism in three populations of F. christophii, with four to nine alleles per locus. Each of these 13 loci cross-amplified in the closely related species F. micrantha and F. villosula, with one to six and one to 11 alleles per locus, respectively. Conclusions: The new markers are promising tools to study the population genetics of F. christophii and to discover species boundaries within the F. micrantha group. PMID:26819858
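
    Screening assembled unigenes for perfect microsatellites, as mentioned above, amounts to finding uninterrupted tandem repeats of a short motif. The sketch below shows one way to do this with a regular expression. It is not the tool used in the study; the function name and the thresholds (motifs of 2 to 6 bp, at least 5 repeats) are assumptions chosen for illustration.

```python
import re


def find_perfect_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, unit, repeat_count) for perfect tandem repeats in seq.

    The unit-length range and minimum repeat count are illustrative
    assumptions, not thresholds from the study.
    """
    seq = seq.upper()
    # A lazily matched unit followed by at least (min_repeats - 1) exact copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        # Skip units that are themselves a shorter motif repeated (e.g. "AA").
        if any(
            unit == unit[:k] * (len(unit) // k)
            for k in range(1, len(unit))
            if len(unit) % k == 0
        ):
            continue
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


print(find_perfect_microsatellites("TTG" + "AC" * 6 + "GGT"))
# prints [(3, 'AC', 6)]
```

    Flanking sequence around each hit would then be passed to a primer design step, which is outside the scope of this sketch.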

  4. Geostatistical reservoir characterization of complex lateral and vertical sequences in a mixed carbonate platform

    SciTech Connect

    Norris, R.J.; Alabert, F.G.; Massonnat, G.J.

    1994-07-01

    In recent years reservoir characterization through the use of geostatistics has become an almost routine part of production geology. Many techniques are available within the broad title of geostatistics, having been developed in response to many types of problem. One characteristic feature of almost all techniques (Stochastic Indicator Simulation, Boolean “object” Modeling, Gaussian [and Truncated Gaussian] methods and Optimized Markov-fields) is their reliance on the concept of quantifiable correlations, which reflect some aspect of the shape of “objects.” For example, almost any of the above noted techniques, and their variants, could be used to model fluvial, deltaic, or turbiditic reservoirs because in each case facies can be described in terms of geometries (channels, lobes, etc.). This study considers the complex lateral and vertical variations of a mixed carbonate platform environment, where facies cannot be easily characterized by simple geometries. The complex heterogeneities are a function of changes in sea level, representing fluctuations over several orders of cyclicity. Given facies have no characteristic form, being the product of the interplay between sediment supply and sea level change. This type of environment is, therefore, characterized by a good deal of information concerning trends in the data, while correlations and geometries are almost meaningless. Associated with the concepts of cyclicity, rules concerning the reappearance of facies, or otherwise, were developed. For example, minor recurrences of maximum flooding surfaces could be tolerated within individual units but other specified recurrences need to be excluded.

  5. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis

    PubMed Central

    Alva, Vikram; Nam, Seung-Zin; Söding, Johannes; Lupas, Andrei N.

    2016-01-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment. PMID:27131380

  6. Using NS5B Sequencing for Hepatitis C Virus Genotyping Reveals Discordances with Commercial Platforms.

    PubMed

    Chueca, Natalia; Rivadulla, Isidro; Lovatti, Rubén; Reina, Gabriel; Blanco, Ana; Fernandez-Caballero, Jose Angel; Cardeñoso, Laura; Rodriguez-Granjer, Javier; Fernandez-Alonso, Miriam; Aguilera, Antonio; Alvarez, Marta; Galán, Juan Carlos; García, Federico

    2016-01-01

    We aimed to evaluate the correct assignment of HCV genotypes by three commercial methods, the Trugene HCV genotyping kit (Siemens), the VERSANT HCV Genotype 2.0 assay (Siemens), and Real-Time HCV genotype II (Abbott), compared to NS5B sequencing. We studied 327 clinical samples that carried representative HCV genotypes of the most frequent geno/subtypes in Spain. After commercial genotyping, the sequencing of a 367 bp fragment in the NS5B gene was used to assign genotypes. Major discrepancies were defined, e.g. differences in the assigned genotype by one of the three methods and NS5B sequencing, including misclassification of subtypes 1a and 1b. Minor discrepancies were considered when differences at subtype levels, other than 1a and 1b, were observed. The overall discordance with the reference method was 34% for Trugene and 15% for VERSANT HCV2.0. The Abbott assay correctly identified all 1a and 1b subtypes, but did not subtype all the 2, 3, 4 and 5 (34%) genotypes. Major discordances were found in 16% of cases for Trugene HCV, and the majority were 1b- to 1a-related discordances; major discordances were found for VERSANT HCV 2.0 in 6% of cases, which were all but one 1b to 1a cases. These results indicated that the Trugene assay especially, and to a lesser extent, Versant HCV 2.0, can fail to differentiate HCV subtypes 1a and 1b, and lead to critical errors in clinical practice for correctly using directly acting antiviral agents.

  7. Using NS5B Sequencing for Hepatitis C Virus Genotyping Reveals Discordances with Commercial Platforms

    PubMed Central

    Chueca, Natalia; Rivadulla, Isidro; Lovatti, Rubén; Reina, Gabriel; Blanco, Ana; Fernandez-Caballero, Jose Angel; Cardeñoso, Laura; Rodriguez-Granjer, Javier; Fernandez-Alonso, Miriam; Aguilera, Antonio; Alvarez, Marta

    2016-01-01

    We aimed to evaluate the correct assignment of HCV genotypes by three commercial methods, namely the Trugene HCV genotyping kit (Siemens), VERSANT HCV Genotype 2.0 assay (Siemens), and Real-Time HCV genotype II (Abbott), compared to NS5B sequencing. We studied 327 clinical samples that carried representative HCV genotypes of the most frequent geno/subtypes in Spain. After commercial genotyping, the sequencing of a 367 bp fragment in the NS5B gene was used to assign genotypes. Major discrepancies were defined as differences in the assigned genotype by one of the three methods and NS5B sequencing, including misclassification of subtypes 1a and 1b. Minor discrepancies were considered when differences at subtype levels, other than 1a and 1b, were observed. The overall discordance with the reference method was 34% for Trugene and 15% for VERSANT HCV2.0. The Abbott assay correctly identified all 1a and 1b subtypes, but did not subtype all the 2, 3, 4 and 5 (34%) genotypes. Major discordances were found in 16% of cases for Trugene HCV, and the majority were 1b- to 1a-related discordances; major discordances were found for VERSANT HCV 2.0 in 6% of cases, which were all but one 1b to 1a cases. These results indicated that the Trugene assay especially, and to a lesser extent, Versant HCV 2.0, can fail to differentiate HCV subtypes 1a and 1b, and lead to critical errors in clinical practice for correctly using directly acting antiviral agents. PMID:27097040

  8. RNA editing generates cellular subsets with diverse sequence within populations

    PubMed Central

    Harjanto, Dewi; Papamarkou, Theodore; Oates, Chris J.; Rayon-Estrada, Violeta; Papavasiliou, F. Nina; Papavasiliou, Anastasia

    2016-01-01

    RNA editing is a mutational mechanism that specifically alters the nucleotide content in transcribed RNA. However, editing rates vary widely, and could result from equivalent editing amongst individual cells, or represent an average of variable editing within a population. Here we present a hierarchical Bayesian model that quantifies the variance of editing rates at specific sites using RNA-seq data from both single cells, and a cognate bulk sample to distinguish between these two possibilities. The model predicts high variance for specific edited sites in murine macrophages and dendritic cells, findings that we validated experimentally by using targeted amplification of specific editable transcripts from single cells. The model also predicts changes in variance in editing rates for specific sites in dendritic cells during the course of LPS stimulation. Our data demonstrate substantial variance in editing signatures amongst single cells, supporting the notion that RNA editing generates diversity within cellular populations. PMID:27418407

  9. Storm-generated bedforms and relict dissolution pits and channels on the Yucatan carbonate platform

    NASA Astrophysics Data System (ADS)

    Gulick, S. P.; Goff, J. A.; Stewart, H. A.; Perez-Cruz, L. L.; Davis, M. B.; Duncan, D.; Saustrup, S.; Sanford, J. C.; Fucugauchi, J. U.

    2013-12-01

    survey area. Therefore, none of these dissolution pits appear to be underlain by a cenote or sink hole. The NW sector of the survey area exhibits a more complex morphology than the alternating ribbon/bare rock morphology elsewhere, including linear scarps (up to ~1 m relief), deeper pitting (up to ~1 m relief), and sinuous, dendritic channeling (up to ~2 m relief). The geologic origin of these features will require further investigation. Sand drifts are present in this region, but are thinner and cover less area. These observations show the dominant modern sediment formation and transport processes on this starved platform are from large storms and hurricanes that place large regions of the platform at wave base. Remaining observed features were generated during times of lower sea level.

  10. Now and Next-Generation Sequencing Techniques: Future of Sequence Analysis Using Cloud Computing

    PubMed Central

    Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav

    2012-01-01

    Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed “cloud computing”) has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows. PMID:23248640

  11. Now and next-generation sequencing techniques: future of sequence analysis using cloud computing.

    PubMed

    Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav

    2012-01-01

    Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed "cloud computing") has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows.

  12. Next generation sequencing identifies ‘interactome’ signatures in relapsed and refractory metastatic colorectal cancer

    PubMed Central

    Cooke, Laurence; Mahadevan, Daruka

    2017-01-01

    Background In the management of metastatic colorectal cancer (mCRC), KRAS, NRAS and BRAF mutational status individualizes therapeutic options and identifies a cohort of patients (pts) with an aggressive clinical course. We hypothesized that relapsed and refractory mCRC pts develop unique mutational signatures that may guide therapy, predict for a response and highlight key signaling pathways important for clinical decision making. Methods Relapsed and refractory mCRC pts (N=32) were molecularly profiled utilizing commercially available next generation sequencing (NGS) platforms. Web-based bioinformatics tools (Reactome/Enrichr) were utilized to elucidate mutational profile linked pathways-networks that have the potential to guide therapy. Results Pts had progressed on fluoropyrimidines, oxaliplatin, irinotecan, bevacizumab, cetuximab and/or panitumumab. Most common histology was adenocarcinoma (colon N=29; rectal N=3). Of the mutations TP53 was the most common, followed by APC, KRAS, PIK3CA, BRAF, SMAD4, SPTA1, FAT1, PDGFRA, ATM, ROS1, ALK, CDKN2A, FBXW7, TGFBR2, NOTCH1 and HER3. Pts on average had ≥5 unique mutations. The most frequent activated signaling pathways were: HER2, fibroblast growth factor receptor (FGFR), p38 through BRAF-MEK cascade via RIT and RIN, ARMS-mediated activation of MAPK cascade, and VEGFR2. Conclusions Dominant driver oncogene mutations do not always equate to oncogenic dependence, hence understanding pathogenic ‘interactome(s)’ in individual pts is key to both clinically relevant targets and in choosing the next best therapy. Mutational signatures derived from corresponding ‘pathway-networks’ represent a meaningful tool to (I) evaluate functional investigation in the laboratory; (II) predict response to drug therapy; and (III) guide rational drug combinations in relapsed and refractory mCRC pts. PMID:28280605

  13. Next-generation sequencing for targeted discovery of rare mutations in rice

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Advances in DNA sequencing (i.e., next-generation sequencing, NGS) have greatly increased the power and efficiency of detecting rare mutations in large mutant populations. Targeting Induced Local Lesions in Genomes (TILLING) is a reverse genetics approach for identifying gene mutations resulting fro...

  14. Next generation sequencing provides rapid access to the genome of wheat stripe rust

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: The wheat stripe rust fungus (Puccinia striiformis f. sp. tritici, PST) is responsible for significant yield losses in wheat production worldwide. In spite of its economic importance, the PST genomic sequence is not currently available. Fortunately Next Generation Sequencing (NGS) has ra...

  15. Finding sRNA generative locales from high-throughput sequencing data with NiBLS

    PubMed Central

    2010-01-01

    Background Next-generation sequencing technologies allow researchers to obtain millions of sequence reads in a single experiment. One important use of the technology is the sequencing of small non-coding regulatory RNAs and the identification of the genomic locales from which they originate. Currently, there is a paucity of methods for finding small RNA generative locales. Results We describe and implement an algorithm that can determine small RNA generative locales from high-throughput sequencing data. The algorithm creates a network, or graph, of the small RNAs by creating links between them depending on their proximity on the target genome. For each of the sub-networks in the resulting graph the clustering coefficient, a measure of the interconnectedness of the subnetwork, is used to identify the generative locales. We test the algorithm over a wide range of parameters using RFAM sequences as positive controls and demonstrate that the algorithm has good sensitivity and specificity in a range of Arabidopsis and mouse small RNA sequence sets and that the locales it generates are robust to differences in the choice of parameters. Conclusions NiBLS is a fast, reliable and sensitive method for determining small RNA locales in high-throughput sequence data that is generally applicable to all classes of small RNA. PMID:20167070
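
    The graph-based idea summarized in this abstract can be illustrated with a minimal sketch. It is not the NiBLS implementation: the read format (chromosome, start, end), the function names, and the parameters max_gap, min_reads and min_cc are all hypothetical choices made for illustration. Reads are linked when they lie close together on the genome, the graph is split into connected sub-networks, and a sub-network is reported as a candidate locale when its mean clustering coefficient is high.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(reads, max_gap):
    """Link two reads on the same chromosome when the gap between them
    is at most max_gap bases (overlapping reads have a negative gap)."""
    adj = defaultdict(set)
    for i in range(len(reads)):
        adj[i]  # register every read as a node, even if isolated
    for i, j in combinations(range(len(reads)), 2):
        chrom_i, start_i, end_i = reads[i]
        chrom_j, start_j, end_j = reads[j]
        if chrom_i != chrom_j:
            continue
        gap = max(start_i, start_j) - min(end_i, end_j)
        if gap <= max_gap:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def components(adj):
    """Return the connected sub-networks of the graph as sets of nodes."""
    seen, comps = set(), []
    for node in adj:
        if node in seen:
            continue
        stack, comp = [node], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def find_locales(reads, max_gap=50, min_reads=3, min_cc=0.6):
    """Report (chrom, start, end, mean clustering coefficient) for each
    sub-network that is large enough and densely interconnected."""
    adj = build_graph(reads, max_gap)
    locales = []
    for comp in components(adj):
        if len(comp) < min_reads:
            continue
        cc = sum(local_clustering(adj, n) for n in comp) / len(comp)
        if cc >= min_cc:
            chrom = reads[next(iter(comp))][0]
            start = min(reads[n][1] for n in comp)
            end = max(reads[n][2] for n in comp)
            locales.append((chrom, start, end, round(cc, 3)))
    return sorted(locales)


if __name__ == "__main__":
    example = [("chr1", 100, 121), ("chr1", 105, 126), ("chr1", 110, 131),
               ("chr1", 5000, 5021), ("chr2", 100, 121)]
    print(find_locales(example))
```

    In this toy input the three overlapping chr1 reads form a fully connected sub-network and are reported as one locale, while the two isolated reads are ignored. The all-pairs comparison is quadratic in the number of reads, so a real data set would need sorted positions or an interval index instead.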

  16. Efficient generation of transgenic cattle using the DNA transposon and their analysis by next-generation sequencing

    PubMed Central

    Yum, Soo-Young; Lee, Song-Jeon; Kim, Hyun-Min; Choi, Woo-Jae; Park, Ji-Hyun; Lee, Won-Wu; Kim, Hee-Soo; Kim, Hyeong-Jong; Bae, Seong-Hun; Lee, Je-Hyeong; Moon, Joo-Yeong; Lee, Ji-Hyun; Lee, Choong-Il; Son, Bong-Jun; Song, Sang-Hoon; Ji, Su-Min; Kim, Seong-Jin; Jang, Goo

    2016-01-01

    Here, we efficiently generated transgenic cattle using two transposon systems (Sleeping Beauty and Piggybac) and their genomes were analyzed by next-generation sequencing (NGS). Blastocysts derived from microinjection of DNA transposons were selected and transferred into recipient cows. Nine transgenic cattle have been generated and, except for two, have grown up to date without any health issues. Some of them expressed strong fluorescence, and the transgene in the oocytes from one superovulating animal was detected by PCR and sequencing. To investigate genomic variants caused by transgene transposition, whole genomic DNA was analyzed by NGS. We identified the preferred transposon integration sites (TA or TTAA) in their genomes. Even though multiple copies (i.e. fifteen) were confirmed, there was no significant difference in genome instability. In conclusion, we demonstrated that transgenic cattle could be efficiently generated using the DNA transposon system, and all those animals could be a valuable resource for agriculture and veterinary science. PMID:27324781

  17. Whole genome sequence analysis of the arctic-lineage strain responsible for distemper in Italian wolves and dogs through a fast and robust next generation sequencing protocol.

    PubMed

    Marcacci, Maurilia; Ancora, Massimo; Mangone, Iolanda; Teodori, Liana; Di Sabatino, Daria; De Massis, Fabrizio; Camma', Cesare; Savini, Giovanni; Lorusso, Alessio

    2014-06-01

    Dynamic surveillance and characterization of circulating canine distemper virus (CDV) strains are essential to guard against possible vaccine breakthrough events. This study describes the setup of a fast and robust next-generation sequencing (NGS) Ion PGM™ protocol that was used to obtain the complete genome sequence of a CDV isolate (CDV2784/2013). CDV2784/2013 is the prototype of CDV strains responsible for severe clinical distemper in dogs and wolves in Italy during 2013. CDV2784/2013 was isolated on cell culture and total RNA was used for NGS sample preparation. A total of 112.3 Mb of reads were assembled de novo using MIRA version 4.0rc4, which yielded a total number of 403 contigs with 12.1% coverage. The whole genome (15,690 bp) was recovered successfully and compared to those of existing CDV whole genomes. CDV2784/2013 was shown to have 92% nt identity with the Onderstepoort vaccine strain. This study describes for the first time a fast and robust Ion PGM™ platform-based whole genome amplification protocol for non-segmented negative stranded RNA viruses starting from total cell-purified RNA. Additionally, this is the first study reporting the whole genome analysis of an Arctic lineage strain that is known to circulate widely in Europe, Asia and USA.

  18. Assessment of Epstein-Barr virus nucleic acids in gastric but not in breast cancer by next-generation sequencing of pooled Mexican samples.

    PubMed

    Fuentes-Pananá, Ezequiel M; Larios-Serrato, Violeta; Méndez-Tenorio, Alfonso; Morales-Sánchez, Abigail; Arias, Carlos F; Torres, Javier

    2016-03-01

    Gastric (GC) and breast (BrC) cancer are two of the most common and deadly tumours. Different lines of evidence suggest a possible causative role of viral infections for both GC and BrC. Whole genome sequencing (WGS) technologies allow searching for viral agents in tissues of patients with cancer. These technologies have already contributed to establishing virus-cancer associations as well as to the discovery of new tumour viruses. The objective of this study was to document possible associations of viral infection with GC and BrC in Mexican patients. To gain an idea of cost-effective conditions for experimental sequencing, we first carried out an in silico simulation of WGS. The next-generation platform Illumina GAIIx was then used to sequence GC and BrC tumour samples. While we did not find viral sequences in tissues from BrC patients, multiple reads matching Epstein-Barr virus (EBV) sequences were found in GC tissues. An end-point polymerase chain reaction confirmed an enrichment of EBV sequences in one of the GC samples sequenced, validating the next-generation sequencing-bioinformatics pipeline.

  19. Assessment of Epstein-Barr virus nucleic acids in gastric but not in breast cancer by next-generation sequencing of pooled Mexican samples

    PubMed Central

    Fuentes-Pananá, Ezequiel M; Larios-Serrato, Violeta; Méndez-Tenorio, Alfonso; Morales-Sánchez, Abigail; Arias, Carlos F; Torres, Javier

    2016-01-01

    Gastric (GC) and breast (BrC) cancer are two of the most common and deadly tumours. Different lines of evidence suggest a possible causative role of viral infections for both GC and BrC. Whole genome sequencing (WGS) technologies allow searching for viral agents in tissues of patients with cancer. These technologies have already contributed to establishing virus-cancer associations as well as to the discovery of new tumour viruses. The objective of this study was to document possible associations of viral infection with GC and BrC in Mexican patients. To gain an idea of cost-effective conditions for experimental sequencing, we first carried out an in silico simulation of WGS. The next-generation platform Illumina GAIIx was then used to sequence GC and BrC tumour samples. While we did not find viral sequences in tissues from BrC patients, multiple reads matching Epstein-Barr virus (EBV) sequences were found in GC tissues. An end-point polymerase chain reaction confirmed an enrichment of EBV sequences in one of the GC samples sequenced, validating the next-generation sequencing-bioinformatics pipeline. PMID:26910355

  20. Multi-platform next generation sequencing of a high quality genome assembly for the goat

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The goat is adapted to most of the environmental conditions found around the world. The species has evolved to be tolerant to diseases, productive in tropical or arid regions, and culturally and economically important in developing countries. Characterization of these genetic adaptations and devel...

  1. Thermal Test of an Improved Platform for Silicon Nanowire-Based Thermoelectric Micro-generators

    NASA Astrophysics Data System (ADS)

    Calaza, C.; Fonseca, L.; Salleras, M.; Donmez, I.; Tarancón, A.; Morata, A.; Santos, J. D.; Gadea, G.

    2016-03-01

    This work reports on an improved design intended to enhance the thermal isolation between the hot and cold parts of a silicon-based thermoelectric microgenerator. Micromachining techniques and silicon on insulator substrates are used to obtain a suspended silicon platform surrounded by a bulk silicon rim, in which arrays of bottom-up silicon nanowires are integrated later on to join both parts with a thermoelectric active material. In previous designs the platform was linked to the rim by means of bulk silicon bridges, used as mechanical support and holder for the electrical connections. Such supports severely reduce platform thermal isolation and penalise the functional area due to the need for longer supports. A new technological route is planned to obtain low thermal conductance supports, making use of a particular geometrical design and a wet bulk micromachining process to selectively remove silicon, shaping a thin dielectric membrane. Thermal conductance measurements have been performed to analyse the influence of the different design parameters of the suspended platform (support type, bridge/membrane length, separation between platform and silicon rim) on overall thermal isolation. A thermal conductance reduction from 1.82 mW/K to 1.03 mW/K has been obtained on tested devices by changing the support type, even though its length has been halved.

  2. Genetic sequence relationships of Winnipegosis platform carbonates, southern Elk Point basin, North Dakota

    SciTech Connect

    Shanley, K.W.; Cross, T.A.

    1988-02-01

    Examination of cores and well log data from the Winnipegosis Formation (Givetian) within a study area of approximately 11,500 mi² (30,000 km²) in northern North Dakota allows recognition of seven time-stratigraphic progradational units within the Winnipegosis Formation. Together with the underlying Ashern Formation, these units are arranged in landward-stepping, vertical stacking, and seaward-stepping geometric patterns, which reflect changes in relative sea level. Abrupt juxtaposition of shallow over deeper water lithologies, evidence for subaerial exposure, and onlap geometries further suggest that these progradational units form two larger, Vail-type sequences separated by regionally persistent unconformities or their correlative conformities.

  3. Using Next-Generation Sequencing for DNA Barcoding: Capturing Allelic Variation in ITS2

    PubMed Central

    Batovska, Jana; Cogan, Noel O. I.; Lynch, Stacey E.; Blacket, Mark J.

    2016-01-01

    Internal Transcribed Spacer 2 (ITS2) is a popular DNA barcoding marker; however, in some animal species it is hypervariable and therefore difficult to sequence with traditional methods. With next-generation sequencing (NGS) it is possible to sequence all gene variants despite the presence of single nucleotide polymorphisms (SNPs), insertions/deletions (indels), homopolymeric regions, and microsatellites. Our aim was to compare the performance of Sanger sequencing and NGS amplicon sequencing in characterizing ITS2 in 26 mosquito species represented by 88 samples. The suitability of ITS2 as a DNA barcoding marker for mosquitoes, and its allelic diversity in individuals and species, was also assessed. Compared to Sanger sequencing, NGS was able to characterize the ITS2 region to a greater extent, with resolution within and between individuals and species that was previously not possible. A total of 382 unique sequences (alleles) were generated from the 88 mosquito specimens, demonstrating the diversity present that has been overlooked by traditional sequencing methods. Multiple indels and microsatellites were present in the ITS2 alleles, which were often specific to species or genera, causing variation in sequence length. As a barcoding marker, ITS2 was able to separate all of the species, apart from members of the Culex pipiens complex, providing the same resolution as the commonly used Cytochrome Oxidase I (COI). The ability to cost-effectively sequence hypervariable markers makes NGS an invaluable tool with many applications in the DNA barcoding field, and provides insights into the limitations of previous studies and techniques. PMID:27799340

  4. Sequence-Independent, Single-Primer Amplification Next-Generation Sequencing of Hantaan Virus Cell Culture-Based Isolates.

    PubMed

    Song, Dong Hyun; Kim, Won-Keun; Gu, Se Hun; Lee, Daesang; Kim, Jeong-Ah; No, Jin Sun; Lee, Seung-Ho; Wiley, Michael R; Palacios, Gustavo; Song, Jin-Won; Jeong, Seong Tae

    2017-02-08

    Hantaan virus (HTNV), identified in the striped field mouse (Apodemus agrarius), belongs to the genus Hantavirus of the family Bunyaviridae and contains tripartite RNA genomes, small (S), medium (M), and large (L) segments. HTNV is a major causative agent of hemorrhagic fever with renal syndrome (HFRS), with fatality rates ranging from 1% to 15% in the Republic of Korea (ROK) and China. Defining HTNV whole-genome sequences and isolating the infectious particle play a critical role in the characterization of hantavirus outbreaks and in preventive and therapeutic strategies against them. Next-generation sequencing (NGS) provides an advanced tool for massive genomic sequencing of viruses. However, the isolation of viral infectious particles is a major obstacle to investigating and developing antivirals for hantaviruses. Here, we report 12 HTNV isolates from lung tissues of the striped field mouse in the highly HFRS-endemic areas. Sequence-independent, single-primer amplification (SISPA) NGS was attempted to recover the genomic sequences of HTNV isolates. The nucleotide sequences of the HTNV S, M, and L segments were covered at 99.4-100%, 97.5-100%, and 95.6-99.8%, respectively, based on the full length of the prototype HTNV 76-118. The whole-genome sequencing of HTNV isolates was accomplished by additional reverse transcription polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends (RACE) PCR. In conclusion, this study will encourage the use of SISPA NGS technologies to delineate the whole-genome sequences of hantaviruses, opening a new era of viral genomics for the surveillance, tracing, and disease risk management of HFRS incidents.

  5. Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplificatio...

  6. Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplificatio...

  7. Co-detection and sequencing of genes and transcripts from the same single cells facilitated by a microfluidics platform.

    PubMed

    Han, Lin; Zi, Xiaoyuan; Garmire, Lana X; Wu, Yu; Weissman, Sherman M; Pan, Xinghua; Fan, Rong

    2014-09-26

    Despite the recent advance of single-cell gene expression analyses, co-measurement of both genomic and transcriptional signatures at the single-cell level has not been realized. However such analysis is necessary in order to accurately delineate how genetic information is transcribed, expressed, and regulated to give rise to an enormously diverse range of cell phenotypes. Here we report on a microfluidics-facilitated approach that allows for controlled separation of cytoplasmic and nuclear contents of a single cell followed by on-chip amplification of genomic DNA and cytoplasmic mRNA. When coupled with off-chip polymerase chain reaction, gel electrophoresis and Sanger sequencing, a panel of genes and transcripts from the same single cell can be co-detected and sequenced. This platform is potentially an enabling tool to permit multiple genomic measurements performed on the same single cells and opens new opportunities to tackle a range of fundamental biology questions including non-genetic cell-to-cell variability, epigenetic regulation, and stem cell fate control. It also helps address clinical challenges such as diagnosing intra-tumor heterogeneity and dissecting complex cellular immune responses.

  8. Co-detection and sequencing of genes and transcripts from the same single cells facilitated by a microfluidics platform

    NASA Astrophysics Data System (ADS)

    Han, Lin; Zi, Xiaoyuan; Garmire, Lana X.; Wu, Yu; Weissman, Sherman M.; Pan, Xinghua; Fan, Rong

    2014-09-01

    Despite the recent advance of single-cell gene expression analyses, co-measurement of both genomic and transcriptional signatures at the single-cell level has not been realized. However such analysis is necessary in order to accurately delineate how genetic information is transcribed, expressed, and regulated to give rise to an enormously diverse range of cell phenotypes. Here we report on a microfluidics-facilitated approach that allows for controlled separation of cytoplasmic and nuclear contents of a single cell followed by on-chip amplification of genomic DNA and cytoplasmic mRNA. When coupled with off-chip polymerase chain reaction, gel electrophoresis and Sanger sequencing, a panel of genes and transcripts from the same single cell can be co-detected and sequenced. This platform is potentially an enabling tool to permit multiple genomic measurements performed on the same single cells and opens new opportunities to tackle a range of fundamental biology questions including non-genetic cell-to-cell variability, epigenetic regulation, and stem cell fate control. It also helps address clinical challenges such as diagnosing intra-tumor heterogeneity and dissecting complex cellular immune responses.

  9. Next generation sequencing in clinical medicine: Challenges and lessons for pathology and biomedical informatics

    PubMed Central

    Gullapalli, Rama R.; Desai, Ketaki V.; Santana-Santos, Lucas; Kant, Jeffrey A.; Becich, Michael J.

    2012-01-01

    The Human Genome Project (HGP) provided the initial draft of mankind's DNA sequence in 2001. The HGP was produced by 23 collaborating laboratories using Sanger sequencing of mapped regions as well as shotgun sequencing techniques in a process that occupied 13 years at a cost of ~$3 billion. Today, Next Generation Sequencing (NGS) techniques represent the next phase in the evolution of DNA sequencing technology at dramatically reduced cost compared to traditional Sanger sequencing. A single laboratory today can sequence the entire human genome in a few days for a few thousand dollars in reagents and staff time. Routine whole exome or even whole genome sequencing of clinical patients is well within the realm of affordability for many academic institutions across the country. This paper reviews current sequencing technology methods and upcoming advancements in sequencing technology as well as challenges associated with data generation, data manipulation and data storage. Implementation of routine NGS data in cancer genomics is discussed along with potential pitfalls in the interpretation of the NGS data. The overarching importance of bioinformatics in the clinical implementation of NGS is emphasized.[7] We also review the issue of physician education which also is an important consideration for the successful implementation of NGS in the clinical workplace. NGS technologies represent a golden opportunity for the next generation of pathologists to be at the leading edge of the personalized medicine approaches coming our way. Often under-emphasized issues of data access and control as well as potential ethical implications of whole genome NGS sequencing are also discussed. Despite some challenges, it's hard not to be optimistic about the future of personalized genome sequencing and its potential impact on patient care and the advancement of knowledge of human biology and disease in the near future. PMID:23248761

  10. Navigating the Rapids: The Development of Regulated Next-Generation Sequencing-Based Clinical Trial Assays and Companion Diagnostics

    PubMed Central

    Pant, Saumya; Weiner, Russell; Marton, Matthew J.

    2014-01-01

    Over the past decade, next-generation sequencing (NGS) technology has experienced meteoric growth in the aspects of platform, technology, and supporting bioinformatics development allowing its widespread and rapid uptake in research settings. More recently, NGS-based genomic data have been exploited to better understand disease development and patient characteristics that influence response to a given therapeutic intervention. Cancer, as a disease characterized by and driven by the tumor genetic landscape, is particularly amenable to NGS-based diagnostic (Dx) approaches. NGS-based technologies are particularly well suited to studying cancer disease development, progression and emergence of resistance, all key factors in the development of next-generation cancer Dxs. Yet, to achieve the promise of NGS-based patient treatment, drug developers will need to overcome a number of operational, technical, regulatory, and strategic challenges. Here, we provide a succinct overview of the state of the clinical NGS field in terms of the available clinically targeted platforms and sequencing technologies. We discuss the various operational and practical aspects of clinical NGS testing that will facilitate or limit the uptake of such assays in routine clinical care. We examine the current strategies for analytical validation and Food and Drug Administration (FDA)-approval of NGS-based assays and ongoing efforts to standardize clinical NGS and build quality control standards for the same. The rapidly evolving companion diagnostic (CDx) landscape for NGS-based assays will be reviewed, highlighting the key areas of concern and suggesting strategies to mitigate risk. The review will conclude with a series of strategic questions that face drug developers and a discussion of the likely future course of NGS-based CDx development efforts. PMID:24860780

  11. An ultra-sparse code underlies the generation of neural sequences in a songbird

    NASA Astrophysics Data System (ADS)

    Hahnloser, Richard H. R.; Kozhevnikov, Alexay A.; Fee, Michale S.

    2002-09-01

    Sequences of motor activity are encoded in many vertebrate brains by complex spatio-temporal patterns of neural activity; however, the neural circuit mechanisms underlying the generation of these pre-motor patterns are poorly understood. In songbirds, one prominent site of pre-motor activity is the forebrain robust nucleus of the archistriatum (RA), which generates stereotyped sequences of spike bursts during song and recapitulates these sequences during sleep. We show that the stereotyped sequences in RA are driven from nucleus HVC (high vocal centre), the principal pre-motor input to RA. Recordings of identified HVC neurons in sleeping and singing birds show that individual HVC neurons projecting onto RA neurons produce bursts sparsely, at a single, precise time during the RA sequence. These HVC neurons burst sequentially with respect to one another. We suggest that at each time in the RA sequence, the ensemble of active RA neurons is driven by a subpopulation of RA-projecting HVC neurons that is active only at that time. As a population, these HVC neurons may form an explicit representation of time in the sequence. Such a sparse representation, a temporal analogue of the `grandmother cell' concept for object recognition, eliminates the problem of temporal interference during sequence generation and learning attributed to more distributed representations.

  12. Benchmarking next-generation transcriptome sequencing for functional and evolutionary genomics.

    PubMed

    Gibbons, John G; Janson, Eric M; Hittinger, Chris Todd; Johnston, Mark; Abbot, Patrick; Rokas, Antonis

    2009-12-01

    Next-generation sequencing has opened the door to genomic analysis of nonmodel organisms. Technologies generating long-sequence reads (200-400 bp) are increasingly used in evolutionary studies of nonmodel organisms, but the short-sequence reads (30-50 bp) that can be produced at lower cost are thought to be of limited utility for de novo sequencing applications. Here, we tested this assumption by short-read sequencing the transcriptomes of the tropical disease vectors Aedes aegypti and Anopheles gambiae, for which complete genome sequences are available. Comparison of our results to the reference genomes allowed us to accurately evaluate the quantity, quality, and functional and evolutionary information content of our "test" data. We produced more than 0.7 billion nucleotides of sequenced data per species that assembled into more than 21,000 test contigs larger than 100 bp per species and covered approximately 27% of the Aedes reference transcriptome. Remarkably, the substitution error rate in the test contigs was approximately 0.25% per site, with very few indels or assembly errors. Test contigs of both species were enriched for genes involved in energy production and protein synthesis and underrepresented in genes involved in transcription and differentiation. Ortholog prediction using the test contigs was accurate across hundreds of millions of years of evolution. Our results demonstrate the considerable utility of short-read transcriptome sequencing for genomic studies of nonmodel organisms and suggest an approach for assessing the information content of next-generation data for evolutionary studies.

  13. Genetic sequence relationships of Winnipegosis platform carbonates, southern Elk Point basin, North Dakota

    SciTech Connect

    Shanley, K.W.; Cross, T.A.

    1988-01-01

    Examination of cores and well log data from the Winnipegosis Formation (Givetian) within a study area of approximately 11,500 mi² (30,000 km²) in northern North Dakota allows recognition of seven time-stratigraphic progradational units within the Winnipegosis Formation. Together with the underlying Ashern Formation, these units are arranged in landward-stepping, vertical stacking, and seaward-stepping geometric patterns, which reflect changes in relative sea level. Abrupt juxtaposition of shallow over deeper water lithologies, evidence for subaerial exposure, and onlap geometries further suggest that these progradational units form two larger, Vail-type sequences separated by regionally persistent unconformities or their correlative conformities. Sea level rise during the early Eifelian caused southeastward onlap of the Ashern Formation onto Middle Silurian carbonates of the Interlake Formation. Maximum flooding, expressed by deepest marine facies and a hardground surface, suggests the existence of a condensed section at the top of the Ashern Formation. This was developed during the maximum rate of sea level rise. A decrease in the rate of sea level rise resulted in aggradation of lower Winnipegosis units on a gently dipping ramp. These are represented by nodular and burrowed open marine limestones with scattered stromatoporoid patch reefs and grainstone shoals. During the subsequent sea level fall, represented by Temple units, a shelf margin with pronounced depositional topography and adjacent starved basin were developed. Temple strata include coral-brachiopod-stromatoporoid reefs and productive fore-reef talus deposits along the shelf margin rim.

  14. Efficient generation of hPSC-derived midbrain dopaminergic neurons in a fully defined, scalable, 3D biomaterial platform

    PubMed Central

    Adil, Maroof M.; Rodrigues, Gonçalo M. C.; Kulkarni, Rishikesh U.; Rao, Antara T.; Chernavsky, Nicole E.; Miller, Evan W.; Schaffer, David V.

    2017-01-01

    Pluripotent stem cells (PSCs) have major potential as an unlimited source of functional cells for many biomedical applications; however, the development of cell manufacturing systems to enable this promise faces many challenges. For example, there have been major recent advances in the generation of midbrain dopaminergic (mDA) neurons from stem cells for Parkinson’s Disease (PD) therapy; however, production of these cells typically involves undefined components and difficult to scale 2D culture formats. Here, we used a fully defined, 3D, thermoresponsive biomaterial platform to rapidly generate large numbers of action-potential firing mDA neurons after 25 days of differentiation (~40% tyrosine hydroxylase (TH) positive, maturing into 25% cells exhibiting mDA neuron-like spiking behavior). Importantly, mDA neurons generated in 3D exhibited a 30-fold increase in viability upon implantation into rat striatum compared to neurons generated on 2D, consistent with the elevated expression of survival markers FOXA2 and EN1 in 3D. A defined, scalable, and resource-efficient cell culture platform can thus rapidly generate high quality differentiated cells, both neurons and potentially other cell types, with strong potential to accelerate both basic and translational research. PMID:28091566

  15. De novo genome assembly of the economically important weed horseweed using integrated data from multiple sequencing platforms.

    PubMed

    Peng, Yanhui; Lai, Zhao; Lane, Thomas; Nageswara-Rao, Madhugiri; Okada, Miki; Jasieniuk, Marie; O'Geen, Henriette; Kim, Ryan W; Sammons, R Douglas; Rieseberg, Loren H; Stewart, C Neal

    2014-11-01

    Horseweed (Conyza canadensis), a member of the Compositae (Asteraceae) family, was the first broadleaf weed to evolve resistance to glyphosate. Horseweed, one of the most problematic weeds in the world, is a true diploid (2n = 2x = 18), with the smallest genome of any known agricultural weed (335 Mb). Thus, it is an appropriate candidate to help us understand the genetic and genomic bases of weediness. We undertook a draft de novo genome assembly of horseweed by combining data from multiple sequencing platforms (454 GS-FLX, Illumina HiSeq 2000, and PacBio RS) using various libraries with different insertion sizes (approximately 350 bp, 600 bp, 3 kb, and 10 kb) of a Tennessee-accessed, glyphosate-resistant horseweed biotype. From 116.3 Gb (approximately 350× coverage) of data, the genome was assembled into 13,966 scaffolds with 50% of the assembly = 33,561 bp. The assembly covered 92.3% of the genome, including the complete chloroplast genome (approximately 153 kb) and a nearly complete mitochondrial genome (approximately 450 kb in 120 scaffolds). The nuclear genome is composed of 44,592 protein-coding genes. Genome resequencing of seven additional horseweed biotypes was performed. These sequence data were assembled and used to analyze genome variation. Simple sequence repeat and single-nucleotide polymorphisms were surveyed. Genomic patterns were detected that associated with glyphosate-resistant or -susceptible biotypes. The draft genome will be useful to better understand weediness and the evolution of herbicide resistance and to devise new management strategies. The genome will also be useful as another reference genome in the Compositae. To our knowledge, this article represents the first published draft genome of an agricultural weed.
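
    The phrase "50% of the assembly = 33,561 bp" reads like an N50-style statistic, and the stated coverage can be checked from the reported totals. The Python sketch below assumes that N50 interpretation, which the abstract does not confirm. The scaffold lengths in the usage lines are made up for illustration; only the 116.3 Gb and 335 Mb figures come from the abstract.

```python
def n50(lengths):
    """Return the length L such that scaffolds of length >= L
    together hold at least half of all assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no scaffold lengths given")


def fold_coverage(sequenced_bp, genome_bp):
    """Average depth: sequenced bases divided by genome size."""
    return sequenced_bp / genome_bp


# Hypothetical scaffold lengths (not from the horseweed assembly).
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))                      # 80

# Figures reported in the abstract: 116.3 Gb of data, 335 Mb genome.
print(round(fold_coverage(116.3e9, 335e6)))    # 347, i.e. roughly 350x
```

    The coverage check only divides the two reported totals; it says nothing about how evenly the reads are spread across the genome.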

  16. The promise and challenges of next-generation genome sequencing for clinical care.

    PubMed

    Johansen Taber, Katherine A; Dickinson, Barry D; Wilson, Modena

    2014-02-01

    With increased speed and decreased costs, next-generation gene sequencing has the potential to improve medical care by making possible widespread evaluation of patients' genomes in clinical settings. The entire genome of an individual can now be sequenced in less than 1 week at a cost of $5000 to $10,000; the cost will continue to decline. Analyses based on next-generation sequencing include whole-genome sequencing and whole-exome sequencing; DNA sequences that encode proteins are collectively known as the exome. In some instances, whole-genome and whole-exome sequencing have already helped to accurately diagnose diseases with atypical manifestations, that are difficult to diagnose using clinical or laboratory criteria alone, or that otherwise require extensive or costly evaluation. For some patients with malignant neoplasms, next-generation sequencing can improve tumor classification, diagnosis, and management. Many challenges remain, however, such as the storage and interpretation of vast amounts of sequence data, training physicians and other health care professionals whose knowledge of genetics may be insufficient, effective genetic counseling and communication of results to patients, and establishing standards for the appropriate use of the technology. Rigorous studies are needed to assess the utility of whole-genome and whole-exome sequencing in large groups of patients, including comparative studies with other approaches to screening and diagnosis, and the evaluation of clinical end points and health care costs. The successes to date have been in single cases or in very small groups of patients. At present, although whole-genome or whole-exome sequencing show great promise, they should be incorporated into patient care only in limited clinical situations.

  17. Preparation of high-quality next-generation sequencing libraries from picogram quantities of target DNA.

    PubMed

    Parkinson, Nicholas J; Maslau, Siarhei; Ferneyhough, Ben; Zhang, Gang; Gregory, Lorna; Buck, David; Ragoussis, Jiannis; Ponting, Chris P; Fischer, Michael D

    2012-01-01

    New sequencing technologies can address diverse biomedical questions but are limited by a minimum required DNA input of typically 1 μg. We describe how sequencing libraries can be reproducibly created from 20 pg of input DNA using a modified transpososome-mediated fragmentation technique. Resulting libraries incorporate in-line bar-coding, which facilitates sample multiplexes that can be sequenced using Illumina platforms with the manufacturer's sequencing primer. We demonstrate this technique by providing deep coverage sequence of the Escherichia coli K-12 genome that shows equivalent target coverage to a 1-μg input library prepared using standard Illumina methods. Reducing template quantity does, however, increase the proportion of duplicate reads and enriches coverage in low-GC regions. This finding was confirmed with exhaustive resequencing of a mouse library constructed from 20 pg of gDNA input (about seven haploid genomes) resulting in ∼0.4-fold statistical coverage of uniquely mapped fragments. This implies that a near-complete coverage of the mouse genome is obtainable with this approach using 20 genomes as input. Application of this new method now allows genomic studies from low mass samples and routine preparation of sequencing libraries from enrichment procedures.
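
    The statement that 20 pg of mouse gDNA is "about seven haploid genomes" can be checked with a short calculation. The sketch below rests on two assumptions that are not in the abstract: a haploid mouse genome of about 2.7 Gb and an average mass of about 650 g/mol per base pair.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
BP_MASS_G_PER_MOL = 650.0     # assumed average molar mass of one base pair
MOUSE_HAPLOID_BP = 2.7e9      # assumed haploid mouse genome size in bp


def haploid_genome_copies(mass_g,
                          genome_bp=MOUSE_HAPLOID_BP,
                          bp_mass=BP_MASS_G_PER_MOL):
    """Number of haploid genome copies contained in a given DNA mass."""
    genome_mass_g = genome_bp * bp_mass / AVOGADRO   # grams per genome copy
    return mass_g / genome_mass_g


copies = haploid_genome_copies(20e-12)   # 20 pg of input DNA
print(f"{copies:.1f}")                   # 6.9, i.e. about seven copies
```

    Under these assumptions one haploid copy weighs a little under 3 pg, which is consistent with the abstract's figure.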

  18. Preparation of high-quality next-generation sequencing libraries from picogram quantities of target DNA

    PubMed Central

    Parkinson, Nicholas J.; Maslau, Siarhei; Ferneyhough, Ben; Zhang, Gang; Gregory, Lorna; Buck, David; Ragoussis, Jiannis; Ponting, Chris P.; Fischer, Michael D.

    2012-01-01

    New sequencing technologies can address diverse biomedical questions but are limited by a minimum required DNA input of typically 1 μg. We describe how sequencing libraries can be reproducibly created from 20 pg of input DNA using a modified transpososome-mediated fragmentation technique. Resulting libraries incorporate in-line bar-coding, which facilitates sample multiplexes that can be sequenced using Illumina platforms with the manufacturer's sequencing primer. We demonstrate this technique by providing deep coverage sequence of the Escherichia coli K-12 genome that shows equivalent target coverage to a 1-μg input library prepared using standard Illumina methods. Reducing template quantity does, however, increase the proportion of duplicate reads and enriches coverage in low-GC regions. This finding was confirmed with exhaustive resequencing of a mouse library constructed from 20 pg of gDNA input (about seven haploid genomes) resulting in ∼0.4-fold statistical coverage of uniquely mapped fragments. This implies that a near-complete coverage of the mouse genome is obtainable with this approach using 20 genomes as input. Application of this new method now allows genomic studies from low mass samples and routine preparation of sequencing libraries from enrichment procedures. PMID:22090378

  19. De novo transcript sequence reconstruction from RNA-Seq: reference generation and analysis with Trinity

    PubMed Central

    Yassour, Moran; Grabherr, Manfred; Blood, Philip D.; Bowden, Joshua; Couger, Matthew Brian; Eccles, David; Li, Bo; Lieber, Matthias; MacManes, Matthew D.; Ott, Michael; Orvis, Joshua; Pochet, Nathalie; Strozzi, Francesco; Weeks, Nathan; Westerman, Rick; William, Thomas; Dewey, Colin N.; Henschel, Robert; LeDuc, Richard D.; Friedman, Nir; Regev, Aviv

    2013-01-01

    De novo assembly of RNA-Seq data allows us to study transcriptomes without the need for a genome sequence, such as in non-model organisms of ecological and evolutionary importance, cancer samples, or the microbiome. In this protocol, we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-Seq data in non-model organisms. We also present Trinity’s supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples, and approaches to identify protein coding genes. In an included tutorial we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sf.net. PMID:23845962

  20. Short reads and nonmodel species: exploring the complexities of next-generation sequence assembly and SNP discovery in the absence of a reference genome.

    PubMed

    Everett, M V; Grau, E D; Seeb, J E

    2011-03-01

    How practical is gene and SNP discovery in a nonmodel species using short read sequences? Next-generation sequencing technologies are being applied to an increasing number of species with no reference genome. For nonmodel species, the cost, availability of existing genetic resources, genome complexity and the planned method of assembly must all be considered when selecting a sequencing platform. Our goal was to examine the feasibility and optimal methodology for SNP and gene discovery in the sockeye salmon (Oncorhynchus nerka) using short read sequences. SOLiD short reads (up to 50 bp) were generated from single- and pooled-tissue transcriptome libraries from ten sockeye salmon. The individuals were from five distinct populations from the Wood River Lakes and Mendeltna Creek, Alaska. As no reference genome was available for sockeye salmon, the SOLiD sequence reads were assembled to publicly available EST reference sequences from sockeye salmon and two closely related species, rainbow trout (Oncorhynchus mykiss) and Atlantic salmon (Salmo salar). Additionally, de novo assembly of the SOLiD data was carried out, and the SOLiD reads were remapped to the de novo contigs. The results from each reference assembly were compared across all references. The number and size of contigs assembled varied with the size reference sequences. In silico SNP discovery was carried out on contigs from all four EST references; however, discovery of valid SNPs was most successful using one of the two conspecific references.

  1. DNA methylation-based forensic age prediction using artificial neural networks and next generation sequencing.

    PubMed

    Vidaki, Athina; Ballard, David; Aliferi, Anastasia; Miller, Thomas H; Barron, Leon P; Syndercombe Court, Denise

    2017-05-01

    generation sequencing (NGS)-based method able to quantify the methylation status of the selected 16 CpG sites was developed using the Illumina MiSeq® platform. The method was validated using DNA standards of known methylation levels and the age prediction accuracy has been initially assessed in a set of 46 whole blood samples. Although the resulting prediction accuracy using the NGS data was lower compared to the original model (MAE = 7.5 years), it is expected that future optimization of our strategy to account for technical variation as well as increasing the sample size will improve both the prediction accuracy and reproducibility.

  2. Building a Robust Tumor Profiling Program: Synergy between Next-Generation Sequencing and Targeted Single-Gene Testing

    PubMed Central

    Lieberman, David B.; Roth, David B.; Zhao, Jianhua; Watt, Christopher D.; Daber, Robert D.; Morrissette, Jennifer J. D.

    2016-01-01

    Next-generation sequencing (NGS) is a powerful platform for identifying cancer mutations. Routine clinical adoption of NGS requires optimized quality control metrics to ensure accurate results. To assess the robustness of our clinical NGS pipeline, we analyzed the results of 304 solid tumor and hematologic malignancy specimens tested simultaneously by NGS and one or more targeted single-gene tests (EGFR, KRAS, BRAF, NPM1, FLT3, and JAK2). For samples that passed our validated tumor percentage and DNA quality and quantity thresholds, there was perfect concordance between NGS and targeted single-gene tests with the exception of two FLT3 internal tandem duplications that fell below the stringent pre-established reporting threshold but were readily detected by manual inspection. In addition, NGS identified clinically significant mutations not covered by single-gene tests. These findings confirm NGS as a reliable platform for routine clinical use when appropriate quality control metrics, such as tumor percentage and DNA quality cutoffs, are in place. Based on our findings, we suggest a simple workflow that should facilitate adoption of clinical oncologic NGS services at other institutions. PMID:27043212

  3. Building a Robust Tumor Profiling Program: Synergy between Next-Generation Sequencing and Targeted Single-Gene Testing.

    PubMed

    Hiemenz, Matthew C; Kadauke, Stephan; Lieberman, David B; Roth, David B; Zhao, Jianhua; Watt, Christopher D; Daber, Robert D; Morrissette, Jennifer J D

    2016-01-01

    Next-generation sequencing (NGS) is a powerful platform for identifying cancer mutations. Routine clinical adoption of NGS requires optimized quality control metrics to ensure accurate results. To assess the robustness of our clinical NGS pipeline, we analyzed the results of 304 solid tumor and hematologic malignancy specimens tested simultaneously by NGS and one or more targeted single-gene tests (EGFR, KRAS, BRAF, NPM1, FLT3, and JAK2). For samples that passed our validated tumor percentage and DNA quality and quantity thresholds, there was perfect concordance between NGS and targeted single-gene tests with the exception of two FLT3 internal tandem duplications that fell below the stringent pre-established reporting threshold but were readily detected by manual inspection. In addition, NGS identified clinically significant mutations not covered by single-gene tests. These findings confirm NGS as a reliable platform for routine clinical use when appropriate quality control metrics, such as tumor percentage and DNA quality cutoffs, are in place. Based on our findings, we suggest a simple workflow that should facilitate adoption of clinical oncologic NGS services at other institutions.

  4. Bit error rate tester using fast parallel generation of linear recurring sequences

    DOEpatents

    Pierson, Lyndon G.; Witzke, Edward L.; Maestas, Joseph H.

    2003-05-06

    A fast method for generating linear recurring sequences by parallel linear recurring sequence generators (LRSGs) with a feedback circuit optimized to balance minimum propagation delay against maximal sequence period. Parallel generation of linear recurring sequences requires decimating the sequence (creating small contiguous sections of the sequence in each LRSG). A companion matrix form is selected depending on whether the LFSR is right-shifting or left-shifting. The companion matrix is completed by selecting a primitive irreducible polynomial with 1's most closely grouped in a corner of the companion matrix. A decimation matrix is created by raising the companion matrix to the (n*k)th power, where k is the number of parallel LRSGs and n is the number of bits to be generated at a time by each LRSG. Companion matrices with 1's closely grouped in a corner will yield sparse decimation matrices. A feedback circuit comprised of XOR logic gates implements the decimation matrix in hardware. Sparse decimation matrices can be implemented with minimum number of XOR gates, and therefore a minimum propagation delay through the feedback circuit. The LRSG of the invention is particularly well suited to use as a bit error rate tester on high speed communication lines because it permits the receiver to synchronize to the transmitted pattern within 2n bits.
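
    The relationship between the companion matrix and the decimation matrix can be illustrated in software. The sketch below is not the patented circuit: it assumes a small degree-4 generator with the primitive polynomial x^4 + x + 1 and hypothetical values n = 2 and k = 3. It checks that one multiplication by the decimation matrix advances the state by n*k single steps.

```python
def mat_mul(a, b):
    """Multiply two square 0/1 matrices over GF(2): AND for products, XOR for sums."""
    n = len(a)
    return [[sum(a[i][t] & b[t][j] for t in range(n)) % 2 for j in range(n)]
            for i in range(n)]


def mat_pow(m, exponent):
    """Raise a square GF(2) matrix to a non-negative power by repeated squaring."""
    n = len(m)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    base = m
    while exponent:
        if exponent & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        exponent >>= 1
    return result


def mat_vec(m, v):
    """Apply a GF(2) matrix to a state vector."""
    return [sum(x & y for x, y in zip(row, v)) % 2 for row in m]


def companion(degree, taps):
    """Companion matrix for the recurrence s[t+degree] = XOR of s[t+i], i in taps."""
    rows = [[int(j == i + 1) for j in range(degree)] for i in range(degree - 1)]
    rows.append([int(j in taps) for j in range(degree)])
    return rows


# Assumed example: x^4 + x + 1, i.e. s[t+4] = s[t+1] XOR s[t].
C = companion(4, {0, 1})
n_bits, k_generators = 2, 3              # hypothetical n and k
D = mat_pow(C, n_bits * k_generators)    # decimation matrix C^(n*k)

seed = [1, 0, 0, 0]
stepped = seed
for _ in range(n_bits * k_generators):   # advance one step at a time
    stepped = mat_vec(C, stepped)

print(mat_vec(D, seed) == stepped)       # True: D jumps n*k steps at once
print(mat_pow(C, 15) == mat_pow(C, 0))   # True: maximal period 2^4 - 1 = 15
```

    In hardware, each 1 in a row of the decimation matrix corresponds to an XOR input, which is why a sparser matrix implies fewer gates in the feedback path.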

  5. High Throughput Sequencing: An Overview of Sequencing Chemistry.

    PubMed

    Ambardar, Sheetal; Gupta, Rikita; Trakroo, Deepika; Lal, Rup; Vakhlu, Jyoti

    2016-12-01

    In the present century, sequencing is to DNA science what gel electrophoresis was to it in the last century. From 1977 to 2016, three generations of sequencing technologies of various types have been developed. Second- and third-generation sequencing technologies, commonly referred to as next-generation sequencing technology, have evolved significantly since their inception in 2004, with increases in sequencing speed and decreases in sequencing cost. GS FLX by 454 Life Sciences/Roche Diagnostics; Genome Analyzer, HiSeq, MiSeq and NextSeq by Illumina, Inc.; SOLiD by ABI; and Ion Torrent by Life Technologies are among the platforms available for second-generation sequencing. The platforms available for third-generation sequencing include the Helicos™ Genetic Analysis System by SeqLL, LLC, SMRT Sequencing by Pacific Biosciences, nanopore sequencing by Oxford Nanopore, Complete Genomics by Beijing Genomics Institute and GnuBIO by BioRad, to name a few. The present article is an overview of the principles and sequencing chemistry of these high-throughput sequencing technologies, along with a brief comparison of the various types of sequencing platforms available.

  6. Heterogeneous Suppression of Sequential Effects in Random Sequence Generation, but Not in Operant Learning

    PubMed Central

    Shteingart, Hanan; Loewenstein, Yonatan

    2016-01-01

    There is a long history of experiments in which participants are instructed to generate a long sequence of binary random numbers. The scope of this line of research has shifted over the years from identifying the basic psychological principles and/or the heuristics that lead to deviations from randomness, to one of predicting future choices. In this paper, we used generalized linear regression and the framework of Reinforcement Learning in order to address both points. In particular, we used logistic regression analysis in order to characterize the temporal sequence of participants’ choices. Surprisingly, a population analysis indicated that the contribution of the most recent trial has only a weak effect on behavior, compared to more preceding trials, a result that seems irreconcilable with standard sequential effects that decay monotonously with the delay. However, when considering each participant separately, we found that the magnitudes of the sequential effect are a monotonous decreasing function of the delay, yet these individual sequential effects are largely averaged out in a population analysis because of heterogeneity. The substantial behavioral heterogeneity in this task is further demonstrated quantitatively by considering the predictive power of the model. We show that a heterogeneous model of sequential dependencies captures the structure available in random sequence generation. Finally, we show that the results of the logistic regression analysis can be interpreted in the framework of reinforcement learning, allowing us to compare the sequential effects in the random sequence generation task to those in an operant learning task. We show that in contrast to the random sequence generation task, sequential effects in operant learning are far more homogenous across the population. These results suggest that in the random sequence generation task, different participants adopt different cognitive strategies to suppress sequential dependencies when

  7. A water-stable metal-organic framework of a zwitterionic carboxylate with dysprosium: a sensing platform for Ebolavirus RNA sequences.

    PubMed

    Qin, Liang; Lin, Li-Xian; Fang, Zhi-Ping; Yang, Shui-Ping; Qiu, Gui-Hua; Chen, Jin-Xiang; Chen, Wen-Hua

    2016-01-04

    We herein report a water-stable 3D dysprosium-based metal-organic framework (MOF) that can non-covalently interact with probe ss-DNA. The formed system can serve as an effective fluorescence sensing platform for the detection of complementary Ebolavirus RNA sequences with the detection limit of 160 pM.

  8. The Effect of Nucleic Acid Extraction Platforms and Sample Storage on the Integrity of Viral RNA for Use in Whole Genome Sequencing.

    PubMed

    Lewandowski, Kuiama; Bell, Andrew; Miles, Rory; Carne, Simon; Wooldridge, David; Manso, Carmen; Hennessy, Nicola; Bailey, Daniel; Pullan, Steven T; Gharbia, Saheer; Vipond, Richard

    2017-03-01

    Extraction of viral RNA and the storage of sample material are extremely important factors in the detection and whole genome sequencing (WGS) of viral pathogens. Although PCR-based detection methods focus on small amplicons, viral WGS applications require RNA of high quality and integrity for adequate sequence coverage and depth. This study examined the fitness of one manual and four automated RNA extraction platforms commonly used in diagnostic laboratories for use in metagenomic sequencing, how the practice of storing sample material in Qiagen buffer AVL before extraction affected the integrity of viral RNA and its suitability for use in amplicon-based WGS methods, and how the addition of Triton X-100 to buffer AVL affected the capability of the extraction platforms and the integrity of viral RNA in stored samples. This study found that the EZ1 platform gave the best performance of the automated platforms and gave comparable results to the frequently used manual Qiagen extraction protocol when extracted viral RNA was used in metagenomics sequencing. To maintain high levels of viral RNA integrity suitable for amplicon-based WGS, nucleic acid should be extracted from samples immediately, because even short storage periods in buffer AVL have a severe effect on integrity, and the addition of Triton X-100 had little effect on the quality of viral material for WGS.

  9. HAPCAD: An open-source tool to detect PCR crossovers in next-generation sequencing generated HLA data

    PubMed Central

    McDevitt, Shana L.; Bredeson, Jessen V.; Roy, Scott W.; Lane, Julie A.; Noble, Janelle A.

    2016-01-01

    Next-generation sequencing (NGS) based HLA genotyping can generate PCR artifacts corresponding to IMGT/HLA Database alleles, for which multiple examples have been observed, including sequence corresponding to the HLA-DRB1*03:42 allele. Repeat genotyping of 131 samples, previously genotyped as DRB1*03:01 homozygotes using probe-based methods, resulted in the heterozygous call DRB1*03:01+DRB1*03:42. The apparent rare DRB1*03:42 allele is hypothesized to be a “hybrid amplicon” generated by PCR crossover, a process in which a partial PCR product denatures from its template, anneals to a different allele template, and extends to completion. Unlike most PCR crossover products, “hybrid amplicons” always correspond to an IMGT/HLA Database allele, necessitating a case-by-case analysis of whether their occurrence reflects the actual allele or is simply the result of PCR crossover. The Hybrid Amplicon/PCR Crossover Artifact Detector (HAPCAD) program mimics jumping PCR in silico and flags allele sequences that may also be generated as hybrid amplicons. PMID:26802209
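
    The idea of mimicking a PCR crossover in silico can be sketched in a few lines. This is not HAPCAD's implementation: the function names, the toy allele names and the 10-base sequences are hypothetical, and the sketch assumes pre-aligned templates of equal length with a single crossover point.

```python
def hybrid_amplicons(template_a, template_b):
    """Yield (breakpoint, sequence) for each single-crossover product
    made of a prefix of template_a joined to the suffix of template_b."""
    if len(template_a) != len(template_b):
        raise ValueError("templates must be aligned to the same length")
    for i in range(1, len(template_a)):
        hybrid = template_a[:i] + template_b[i:]
        if hybrid not in (template_a, template_b):
            yield i, hybrid


def flag_possible_artifacts(templates, database):
    """Return database allele names that match a crossover product of two
    templates present in the reaction but are not themselves templates."""
    flagged = set()
    for name_a, seq_a in templates.items():
        for name_b, seq_b in templates.items():
            if name_a == name_b:
                continue
            for _, hybrid in hybrid_amplicons(seq_a, seq_b):
                for db_name, db_seq in database.items():
                    if db_seq == hybrid and db_name not in templates:
                        flagged.add(db_name)
    return sorted(flagged)


# Hypothetical toy data, not real HLA sequences.
database = {
    "toy*01": "ACGTACGTAC",
    "toy*02": "ACGAACGTTC",
    "toy*03": "ACGTACGTTC",   # equals toy*01[:5] + toy*02[5:]
    "toy*04": "TTTTTTTTTT",
}
templates = {name: database[name] for name in ("toy*01", "toy*02")}
print(flag_possible_artifacts(templates, database))   # ['toy*03']
```

    A flagged allele is only a candidate artifact; as the abstract notes, each case still needs review to decide whether the allele is genuinely present.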

  10. Next Generation Proton Beam Writing: A Platform Technology for Nanowire Integration

    DTIC Science & Technology

    2010-06-01

    silsesquioxane (HSQ) nanostructures for Nickel electroplating, S. Gorelick, F. Zhang, P.G. Shao, J.A. van Kan, Harry J. Whitlow, F. Watt, Nuclear...Yaping Ren, Jeroen Anton van Kan, Sher-Yi Chiam, Linke Jian, Herbert O. Moser, Thomas Osipowicz, Frank Watt, Nuclear Instruments & Methods in Physics...Research Section B Volume 267 (2009) 2376-2380 2 Proton beam writing: a platform technology for nanowire production, J. A. van Kan F. Zhang S. Y

  11. Concordance between genomic alterations assessed by next-generation sequencing in tumor tissue or circulating cell-free DNA

    PubMed Central

    Carneiro, Benedito A.; Chandra, Sunandana; Mohindra, Nisha; Kalyan, Aparna; Kaplan, Jason; Matsangou, Maria; Pai, Sachin; Costa, Ricardo; Jovanovic, Borko; Cristofanilli, Massimo; Platanias, Leonidas C.; Giles, Francis J.

    2016-01-01

    Genomic analysis of tumor tissue is the standard technique for identifying DNA alterations in malignancies. Genomic analysis of circulating tumor cell-free DNA (cfDNA) represents a relatively non-invasive method of assessing genomic alterations using peripheral blood. We compared the concordance of genomic alterations between cfDNA and tissue biopsies in this retrospective study. Twenty-eight patients with advanced solid tumors with paired next-generation sequencing tissue and cfDNA biopsies were identified. Sixty-five genes were common to both assays. Concordance was defined as the presence or absence of the identical genomic alteration(s) in a single gene on both molecular platforms. Including all aberrations, the average number of alterations per patient for tissue and cfDNA analysis was 4.82 and 2.96, respectively. When eliminating alterations not detectable in the cfDNA assay, mean number of alterations for tissue and cfDNA was 3.21 and 2.96, respectively. Overall, concordance was 91.9–93.9%. However, the concordance rate decreased to 11.8–17.1% when considering only genes with reported genomic alterations in either assay. Over 50% of mutations detected in either technique were not detected using the other biopsy technique, indicating a potential complementary role of each assay. Across 5 genes (TP53, EGFR, KRAS, APC, CDKN2A), sensitivity and specificity were 59.1% and 94.8%, respectively. Potential explanations for the lack of concordance include differences in assay platform, spatial and temporal factors, tumor heterogeneity, interval treatment, subclones, and potential germline DNA contamination. These results highlight the importance of prospective studies to evaluate concordance of genomic findings between distinct platforms that ultimately may inform treatment decisions. PMID:27588476
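
    The gap between the overall concordance (above 90%) and the concordance restricted to altered genes (below 20%) follows from how shared negative calls are counted. The sketch below uses made-up counts, not the study's data, and assumes tissue is the reference assay for sensitivity and specificity, which the abstract does not state.

```python
def agreement_metrics(both_pos, tissue_only, cfdna_only, both_neg):
    """Agreement statistics for paired calls, with tissue as the assumed reference."""
    total = both_pos + tissue_only + cfdna_only + both_neg
    altered = both_pos + tissue_only + cfdna_only   # altered in either assay
    return {
        # Counts agreement on absence as well as presence.
        "overall_concordance": (both_pos + both_neg) / total,
        # Ignores gene/patient pairs that are negative in both assays.
        "altered_only_concordance": both_pos / altered,
        "sensitivity": both_pos / (both_pos + tissue_only),
        "specificity": both_neg / (both_neg + cfdna_only),
    }


# Hypothetical counts over 1000 gene/patient pairs (illustration only).
m = agreement_metrics(both_pos=10, tissue_only=40, cfdna_only=30, both_neg=920)
print(f"{m['overall_concordance']:.1%}")        # 93.0%
print(f"{m['altered_only_concordance']:.1%}")   # 12.5%
print(f"{m['sensitivity']:.1%}")                # 20.0%
print(f"{m['specificity']:.1%}")                # 96.8%
```

    Because most genes are unaltered in both assays, the shared negatives dominate the overall figure, so the two concordance values can differ widely on the same data.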

  12. Second-generation sequencing of forensic STRs using the Ion Torrent™ HID STR 10-plex and the Ion PGM™.

    PubMed

    Fordyce, Sarah L; Mogensen, Helle Smidt; Børsting, Claus; Lagacé, Robert E; Chang, Chien-Wei; Rajagopalan, Narasimhan; Morling, Niels

    2015-01-01

    Second-generation sequencing (SGS) using Roche/454 and Illumina platforms has proved capable of sequencing the majority of the key forensic genetic STR systems. Given that Roche has announced that the 454 platforms will no longer be supported from 2015, focus should now be shifted to competing SGS platforms, such as the MiSeq (Illumina) and the Ion Personal Genome Machine (Ion PGM™; Thermo Fisher). There are currently several challenges faced with amplicon-based SGS STR typing in forensic genetics, including current lengths of amplicons for CE-typing and lack of uniform data analysis between laboratories. Thermo Fisher has designed a human identification (HID) short tandem repeat (STR) 10-plex panel including amelogenin, CSF1PO, D16S539, D3S1358, D5S818, D7S820, D8S1179, TH01, TPOX and vWA, where the primers have been designed specifically for the purpose of SGS and the data analysis is supported by Ion Torrent™ software. Hence, the combination of the STR 10-plex and the Ion PGM™ represents the first fully integrated SGS STR typing solution from PCR to data analysis. In this study, four experiments were performed to evaluate the alpha-version of the STR 10-plex: (1) typing of control samples; (2) analysis of sensitivity; (3) typing of mixtures; and (4) typing of biological crime case samples. Full profiles and concordant results between replicate SGS runs and CE-typing were observed for all control samples. Full profiles were seen with DNA input down to 50 pg, with the exception of a single locus drop-out in one of the 100 pg dilutions. Mixtures were easily deconvoluted down to 20:1, although alleles from the minor contributor had to be identified manually as some signals were not called by the Ion Torrent™ software. Interestingly, full profiles were obtained for all biological samples from real crime and identification cases, in which only partial profiles were obtained with PCR-CE assays. In conclusion, the Ion Torrent™ HID STR 10-plex panel offers an

  13. Whole Genome Sequencing and a New Bioinformatics Platform Allow for Rapid Gene Identification in D. melanogaster EMS Screens

    PubMed Central

    Gonzalez, Michael A.; Van Booven, Derek; Hulme, William; Ulloa, Rick H.; Lebrigio, Rafael F. Acosta; Osterloh, Jeannette; Logan, Mary; Freeman, Marc; Zuchner, Stephan

    2012-01-01

    Forward genetic screens in Drosophila melanogaster using ethyl methanesulfonate (EMS) mutagenesis are a powerful approach for identifying genes that modulate specific biological processes in an in vivo setting. The mapping of genes that contain randomly-induced point mutations has become more efficient in Drosophila thanks to the maturation and availability of many types of genetic tools. However, classic approaches to gene mapping are relatively slow and ultimately require extensive Sanger sequencing of candidate chromosomal loci. With the advent of new high-throughput sequencing techniques, it is increasingly efficient to directly re-sequence the whole genome of model organisms. This approach, in combination with traditional chromosomal mapping, has the potential to greatly simplify and accelerate mutation identification in mutants generated in EMS screens. Here we show that next-generation sequencing (NGS) is an accurate and efficient tool for high-throughput sequencing and mutation discovery in Drosophila melanogaster. As a test case, mutant strains of Drosophila that exhibited long-term survival of severed peripheral axons were identified in a forward EMS mutagenesis. All mutants were recessive and fell into a single lethal complementation group, which suggested that a single gene was responsible for the protective axon degenerative phenotype. Whole genome sequencing of these genomes identified the underlying gene ect4. To improve the process of genome wide mutation identification, we developed Genomes Management Application (GEM.app, https://genomics.med.miami.edu), a graphical online user interface to a custom query framework. Using a custom GEM.app query, we were able to identify that each mutant carried a unique nonsense mutation in the gene ect4 (dSarm), which was recently shown by Osterloh et al. to be essential for the activation of axonal degeneration. Our results demonstrate the current advantages and limitations of NGS in Drosophila and we introduce

  14. Efficiency to Discovery Transgenic Loci in GM Rice Using Next Generation Sequencing Whole Genome Re-sequencing

    PubMed Central

    Park, Doori; Kim, Dongin; Jang, Green; Lim, Jongsung; Shin, Yun-Ji; Kim, Jina; Seo, Mi-Seong; Park, Su-Hyun; Kim, Ju-Kon

    2015-01-01

    Molecular characterization of genetically modified organisms, together with the way transgenic biotechnologies are developed, now requires full transparency to assess the risk to living modified and non-modified organisms. Next generation sequencing (NGS) methodology is suggested as an effective means of genome characterization and detection of transgenic insertion locations. In the present study, we applied NGS to locate transgenic insertion loci, specifically of the epidermal growth factor (EGF) transgene, in genetically modified rice cells. A total of 29.3 Gb (~72× coverage) was sequenced with a 2 × 150 bp paired-end method on an Illumina HiSeq2500 and subsequently mapped to the rice genome and the T-vector sequence. The compatible pairs of reads were successfully mapped to 10 loci on the rice chromosomes, and the vector sequences were validated at the insertion locations by polymerase chain reaction (PCR) amplification. The EGF transgenic site was confirmed only on chromosome 4 by PCR. The results of this study demonstrate the success of NGS data in characterizing the rice genome. Bioinformatics analyses must be developed in association with NGS data to identify highly accurate transgenic sites. PMID:26523132
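
    The sequencing figures quoted in this abstract (29.3 Gb, ~72× coverage, 2 × 150 bp reads) are related by simple arithmetic, sketched below. The function names are illustrative only, and the genome size is derived from the two quoted numbers rather than taken from the paper.

```python
def mean_coverage(total_bases: float, genome_size: float) -> float:
    """Average depth: sequenced bases divided by genome length."""
    return total_bases / genome_size


def implied_genome_size(total_bases: float, coverage: float) -> float:
    """Genome length implied by a stated yield and a stated coverage."""
    return total_bases / coverage


def read_pairs(total_bases: float, read_length: int, reads_per_pair: int = 2) -> float:
    """Approximate number of read pairs needed to produce the stated yield."""
    return total_bases / (read_length * reads_per_pair)


TOTAL_BASES = 29.3e9  # 29.3 Gb, from the abstract
COVERAGE = 72         # ~72x, from the abstract
READ_LENGTH = 150     # 2 x 150 bp paired-end, from the abstract

if __name__ == "__main__":
    size = implied_genome_size(TOTAL_BASES, COVERAGE)
    print(f"implied genome size: {size / 1e6:.1f} Mb")
    print(f"read pairs: {read_pairs(TOTAL_BASES, READ_LENGTH) / 1e6:.1f} million")
```

    The implied genome length of roughly 407 Mb follows only from dividing the quoted yield by the quoted coverage; the abstract itself does not state the reference size used.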

  15. A Microbiome DNA Enrichment Method for Next-Generation Sequencing Sample Preparation.

    PubMed

    Yigit, Erbay; Feehery, George R; Langhorst, Bradley W; Stewart, Fiona J; Dimalanta, Eileen T; Pradhan, Sriharsa; Slatko, Barton; Gardner, Andrew F; McFarland, James; Sumner, Christine; Davis, Theodore B

    2016-07-01

    "Microbiome" is used to describe the communities of microorganisms and their genes in a particular environment, including communities in association with a eukaryotic host or part of a host. One challenge in microbiome analysis concerns the presence of host DNA in samples. Removal of host DNA before sequencing results in greater sequence depth of the intended microbiome target population. This unit describes a novel method of microbial DNA enrichment in which methylated host DNA such as human genomic DNA is selectively bound and separated from microbial DNA before next-generation sequencing (NGS) library construction. This microbiome enrichment technique yields a higher fraction of microbial sequencing reads and improved read quality resulting in a reduced cost of downstream data generation and analysis. © 2016 by John Wiley & Sons, Inc.

  16. Quantitative evaluation of bias in PCR amplification and next-generation sequencing derived from metabarcoding samples.

    PubMed

    Pawluczyk, Marta; Weiss, Julia; Links, Matthew G; Egaña Aranguren, Mikel; Wilkinson, Mark D; Egea-Cortines, Marcos

    2015-03-01

    Unbiased identification of organisms by PCR reactions using universal primers followed by DNA sequencing assumes positive amplification. We used six universal loci spanning 48 plant species and quantified the bias at each step of the identification process from end point PCR to next-generation sequencing. End point amplification was significantly different for single loci and between species. Quantitative PCR revealed that Cq threshold for various loci, even within a single DNA extraction, showed 2,000-fold differences in DNA quantity after amplification. Next-generation sequencing (NGS) experiments in nine species showed significant biases towards species and specific loci using adaptor-specific primers. NGS sequencing bias may be predicted to some extent by the Cq values of qPCR amplification.
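
    The link between Cq values and the 2,000-fold difference mentioned in this abstract can be sketched as follows. The sketch assumes ideal amplification, where product doubles every cycle (efficiency factor 2.0); the abstract does not report amplification efficiencies, so that value and the function names are assumptions.

```python
import math


def fold_difference(delta_cq: float, efficiency: float = 2.0) -> float:
    """Fold difference in template quantity implied by a Cq gap.

    efficiency is the per-cycle amplification factor (2.0 = perfect doubling).
    """
    return efficiency ** delta_cq


def delta_cq_for_fold(fold: float, efficiency: float = 2.0) -> float:
    """Cq gap that corresponds to a given fold difference."""
    return math.log(fold) / math.log(efficiency)


if __name__ == "__main__":
    # Under perfect doubling, a 2,000-fold difference is a gap of about 11 cycles.
    print(f"{delta_cq_for_fold(2000):.2f} cycles")
    print(f"{fold_difference(10):.0f}-fold for a 10-cycle gap")
```

    With a lower per-cycle efficiency the same fold difference corresponds to a larger Cq gap, which is one reason per-locus efficiencies matter when Cq values are used to anticipate sequencing bias.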

  17. “Shovel-ready” Sequences as a Stimulus for the Next Generation of Life Scientists

    PubMed Central

    Boyle, Michael D.

    2010-01-01

    Genomics and bioinformatics are dynamic fields well-suited for capturing the imagination of undergraduates in both research laboratories and classrooms. Currently, raw nucleotide sequence is being provided, as part of several genomics research initiatives, for undergraduate research and teaching. These initiatives could be easily extended and much more effective if the source of the sequenced material and the subsequent focus of the data analysis were aligned with the research interests of individual faculty at undergraduate institutions. By judicious use of surplus capacity in existing nucleotide sequencing cores, raw sequence data could be generated to support ongoing research efforts involving undergraduates. This would allow these students to participate actively in discovery research, with a goal of making novel contributions to their field through original research while nurturing the next generation of talented research scientists. PMID:23653696

  18. A model for the sequence stratigraphy of carbonate ramp to rimmed-platform transitions developed from study of the Middle Cambrian of the S. Appalachians

    SciTech Connect

    Walker, K.R.; Srinivasan, K. (Dept. of Geological Sciences)

    1992-01-01

    Part of the Cambrian System in the S. Appalachians consists of six alternating limestone and shale formations (Conasauga Group). The shelf margin discussed here faced a shallower intracratonic shale basin to the west and northwest. Analysis of the Maryville Limestone along a depositional transect reveals that the shelf evolved from a carbonate ramp that sloped gently basinward to a flat-topped rimmed-platform fringed with steep slopes. The authors describe here the sequence stratigraphy of the ramp to platform transition. A process-oriented approach has allowed them to define the sequences, sequence boundaries, and the stacking pattern of the Maryville Limestone, and also to develop a general model for ramp to rimmed-platform development. The Maryville consists of a combination of aggradational, retrogradational, and progradational units. The stacking pattern is the result of variations in sedimentation rate, subsidence, and absolute sea-level change. Each of the dominantly carbonate units of the Conasauga represents a gradual transition from a ramp-like, shallow-water-to-basin transition into a rimmed-platform. The transition between the Maryville Limestone (M. Cambrian) and the overlying Nolichucky Shale (U. Cambrian) is a sequence boundary. This boundary is both an exposure surface and a drowning unconformity and represents a distinct shift in the pattern of sedimentation. It marks the termination of shallow-water carbonate deposition because of exposure, followed by drowning with continued subsidence during lag time slow sedimentation, and, finally, onlap of basinal siliciclastics onto the old rimmed-platform edge as the sedimentation surface gradually deepened. The model developed here can serve as a useful process analog to other lower Paleozoic and possibly younger passive-margin sequences.

  19. Discovery of a divergent HPIV4 from respiratory secretions using second and third generation metagenomic sequencing.

    PubMed

    Alquezar-Planas, David E; Mourier, Tobias; Bruhn, Christian A W; Hansen, Anders J; Vitcetz, Sarah Nathalie; Mørk, Søren; Gorodkin, Jan; Nielsen, Hanne Abel; Guo, Yan; Sethuraman, Anand; Paxinos, Ellen E; Shan, Tongling; Delwart, Eric L; Nielsen, Lars P

    2013-01-01

    Molecular detection of viruses has been aided by high-throughput sequencing, permitting the genomic characterization of emerging strains. In this study, we comprehensively screened 500 respiratory secretions from children with upper and/or lower respiratory tract infections for viral pathogens. The viruses detected are described, including a divergent human parainfluenza virus type 4 from GS FLX pyrosequencing of 92 specimens. Complete full-genome characterization of the virus followed, using Single Molecule, Real-Time (SMRT) sequencing. Subsequent "primer walking" combined with Sanger sequencing validated the RS platform's utility in viral sequencing from complex clinical samples. Comparative genomics reveals the divergent strain clusters with the only completely sequenced HPIV4a subtype. However, it also exhibits various structural features present in one of the HPIV4b reference strains, opening questions regarding their lifecycle and evolutionary relationships among these viruses. Clinical data from patients infected with the strain, as well as viral prevalence estimates using real-time PCR, is also described.

  20. An integrated SNP mining and utilization (ISMU) pipeline for next generation sequencing data.

    PubMed

    Azam, Sarwar; Rathore, Abhishek; Shah, Trushar M; Telluri, Mohan; Amindala, BhanuPrakash; Ruperao, Pradeep; Katta, Mohan A V S K; Varshney, Rajeev K

    2014-01-01

    Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of the command line interface, massive computational resources and expertise, which is daunting for biologists. Further, the SNP information generated may not be readily used for downstream processes such as genotyping. Hence, a comprehensive pipeline has been developed by integrating several open source next generation sequencing (NGS) tools along with a graphical user interface called Integrated SNP Mining and Utilization (ISMU) for SNP discovery and their utilization by developing genotyping assays. The pipeline features functionalities such as pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction (SAMtools/SOAPsnp/CNS2snp and CbCC) methods and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of genotypes analyzed, in addition to the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and errors, if any. The pipeline also provides a confidence score or polymorphism information content value with flanking sequences for identified SNPs in standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data at a fast speed. The pipeline is very useful for the plant genetics and breeding community with no computational expertise in order to discover SNPs and utilize them in genomics, genetics and breeding studies. The pipeline has been parallelized to process huge datasets of next generation sequencing. It has been developed in Java and is available at http://hpc.icrisat.cgiar.org/ISMU as a standalone

  1. An Integrated SNP Mining and Utilization (ISMU) Pipeline for Next Generation Sequencing Data

    PubMed Central

    Azam, Sarwar; Rathore, Abhishek; Shah, Trushar M.; Telluri, Mohan; Amindala, BhanuPrakash; Ruperao, Pradeep; Katta, Mohan A. V. S. K.; Varshney, Rajeev K.

    2014-01-01

    Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of the command line interface, massive computational resources and expertise, which is daunting for biologists. Further, the SNP information generated may not be readily used for downstream processes such as genotyping. Hence, a comprehensive pipeline has been developed by integrating several open source next generation sequencing (NGS) tools along with a graphical user interface called Integrated SNP Mining and Utilization (ISMU) for SNP discovery and their utilization by developing genotyping assays. The pipeline features functionalities such as pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction (SAMtools/SOAPsnp/CNS2snp and CbCC) methods and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of genotypes analyzed, in addition to the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and errors, if any. The pipeline also provides a confidence score or polymorphism information content value with flanking sequences for identified SNPs in standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data at a fast speed. The pipeline is very useful for the plant genetics and breeding community with no computational expertise in order to discover SNPs and utilize them in genomics, genetics and breeding studies. The pipeline has been parallelized to process huge datasets of next generation sequencing. It has been developed in Java and is available at http://hpc.icrisat.cgiar.org/ISMU as a standalone

  2. Internally generated sequences in learning and executing goal-directed behavior.

    PubMed

    Pezzulo, Giovanni; van der Meer, Matthijs A A; Lansink, Carien S; Pennartz, Cyriel M A

    2014-12-01

    A network of brain structures including hippocampus (HC), prefrontal cortex, and striatum controls goal-directed behavior and decision making. However, the neural mechanisms underlying these functions are unknown. Here, we review the role of 'internally generated sequences': structured, multi-neuron firing patterns in the network that are not confined to signaling the current state or location of an agent, but are generated on the basis of internal brain dynamics. Neurophysiological studies suggest that such sequences fulfill functions in memory consolidation, augmentation of representations, internal simulation, and recombination of acquired information. Using computational modeling, we propose that internally generated sequences may be productively considered a component of goal-directed decision systems, implementing a sampling-based inference engine that optimizes goal acquisition at multiple timescales of on-line choice, action control, and learning.

  3. Generation of novel motor sequences: the neural correlates of musical improvisation.

    PubMed

    Berkowitz, Aaron L; Ansari, Daniel

    2008-06-01

    While some motor behavior is instinctive and stereotyped or learned and re-executed, much action is a spontaneous response to a novel set of environmental conditions. The neural correlates of both pre-learned and cued motor sequences have been previously studied, but novel motor behavior has thus far not been examined through brain imaging. In this paper, we report a study of musical improvisation in trained pianists with functional magnetic resonance imaging (fMRI), using improvisation as a case study of novel action generation. We demonstrate that both rhythmic (temporal) and melodic (ordinal) motor sequence creation modulate activity in a network of brain regions comprised of the dorsal premotor cortex, the rostral cingulate zone of the anterior cingulate cortex, and the inferior frontal gyrus. These findings are consistent with a role for the dorsal premotor cortex in movement coordination, the rostral cingulate zone in voluntary selection, and the inferior frontal gyrus in sequence generation. Thus, the invention of novel motor sequences in musical improvisation recruits a network of brain regions coordinated to generate possible sequences, select among them, and execute the decided-upon sequence.

  4. EagleView: a genome assembly viewer for next-generation sequencing technologies.

    PubMed

    Huang, Weichun; Marth, Gabor

    2008-09-01

    The emergence of high-throughput next-generation sequencing technologies (e.g., 454 Life Sciences [Roche], Illumina sequencing [formerly Solexa sequencing]) has dramatically sped up whole-genome de novo sequencing and resequencing. While the low cost of these sequencing technologies provides an unparalleled opportunity for genome-wide polymorphism discovery, the analysis of the new data types and huge data volume poses formidable informatics challenges for base calling, read alignment and genome assembly, polymorphism detection, as well as data visualization. We introduce a new data integration and visualization tool EagleView to facilitate data analyses, visual validation, and hypothesis generation. EagleView can handle a large genome assembly of millions of reads. It supports a compact assembly view, multiple navigation modes, and a pinpoint view of technology-specific trace information. Moreover, EagleView supports viewing coassembly of mixed-type reads from different technologies and supports integrating genome feature annotations into genome assemblies. EagleView has been used in our own lab and by over 100 research labs worldwide for next-generation sequence analyses. The EagleView software is freely available for not-for-profit use at http://bioinformatics.bc.edu/marthlab/EagleView.

  5. Next-generation sequencing (NGS) for assessment of microbial water quality: current progress, challenges, and future opportunities

    PubMed Central

    Tan, BoonFei; Ng, Charmaine; Nshimyimana, Jean Pierre; Loh, Lay Leng; Gin, Karina Y.-H.; Thompson, Janelle R.

    2015-01-01

    Water quality is an emergent property of a complex system comprised of interacting microbial populations and introduced microbial and chemical contaminants. Studies leveraging next-generation sequencing (NGS) technologies are providing new insights into the ecology of microbially mediated processes that influence fresh water quality such as algal blooms, contaminant biodegradation, and pathogen dissemination. In addition, sequencing methods targeting small subunit (SSU) rRNA hypervariable regions have allowed identification of signature microbial species that serve as bioindicators for sewage contamination in these environments. Beyond amplicon sequencing, metagenomic and metatranscriptomic analyses of microbial communities in fresh water environments reveal the genetic capabilities and interplay of waterborne microorganisms, shedding light on the mechanisms for production and biodegradation of toxins and other contaminants. This review discusses the challenges and benefits of applying NGS-based methods to water quality research and assessment. We will consider the suitability and biases inherent in the application of NGS as a screening tool for assessment of biological risks and discuss the potential and limitations for direct quantitative interpretation of NGS data. Secondly, we will examine case studies from recent literature where NGS based methods have been applied to topics in water quality assessment, including development of bioindicators for sewage pollution and microbial source tracking, characterizing the distribution of toxin and antibiotic resistance genes in water samples, and investigating mechanisms of biodegradation of harmful pollutants that threaten water quality. Finally, we provide a short review of emerging NGS platforms and their potential applications to the next generation of water quality assessment tools. PMID:26441948

  6. Next-Generation Sequencing in Oncology: Genetic Diagnosis, Risk Prediction and Cancer Classification.

    PubMed

    Kamps, Rick; Brandão, Rita D; Bosch, Bianca J van den; Paulussen, Aimee D C; Xanthoulea, Sofia; Blok, Marinus J; Romano, Andrea

    2017-01-31

    Next-generation sequencing (NGS) technology has expanded in the last decades with significant improvements in the reliability, sequencing chemistry, pipeline analyses, data interpretation and costs. Such advances make the use of NGS feasible in clinical practice today. This review describes the recent technological developments in NGS applied to the field of oncology. A number of clinical applications are reviewed, i.e., mutation detection in inherited cancer syndromes based on DNA-sequencing, detection of spliceogenic variants based on RNA-sequencing, DNA-sequencing to identify risk modifiers and application for pre-implantation genetic diagnosis, cancer somatic mutation analysis, pharmacogenetics and liquid biopsy. Conclusive remarks, clinical limitations, implications and ethical considerations that relate to the different applications are provided.

  7. Integrated next-generation sequencing analysis of whole exome and 409 cancer-related genes.

    PubMed

    Shimoda, Yuji; Nagashima, Takeshi; Urakami, Kenichi; Tanabe, Tomoe; Saito, Junko; Naruoka, Akane; Serizawa, Masakuni; Mochizuki, Tohru; Ohshima, Keiichi; Ohnami, Sumiko; Ohnami, Shumpei; Kusuhara, Masatoshi; Yamaguchi, Ken

    2016-01-01

    The use of next-generation sequencing (NGS) techniques to analyze the genomes of cancer cells has identified numerous genomic alterations, including single-base substitutions, small insertions and deletions, amplification, recombination, and epigenetic modifications. NGS contributes to the clinical management of patients as well as new discoveries that identify the mechanisms of tumorigenesis. Moreover, analysis of gene panels targeting actionable mutations enhances efforts to optimize the selection of chemotherapeutic regimens. However, whole genome sequencing takes several days and costs at least $10,000, depending on sequence coverage. Therefore, laboratories with relatively limited resources must employ a more economical approach. For this purpose, we conducted an integrated nucleotide sequence analysis of a panel of 409-cancer related genes (409-CRG) combined with whole exome sequencing (WES). Analysis of the 409-CRG panel detected low-frequency variants with high sensitivity, and WES identified moderate and high frequency somatic variants as well as germline variants.

  8. Next-Generation Sequencing in Oncology: Genetic Diagnosis, Risk Prediction and Cancer Classification

    PubMed Central

    Kamps, Rick; Brandão, Rita D.; van den Bosch, Bianca J.; Paulussen, Aimee D. C.; Xanthoulea, Sofia; Blok, Marinu