Science.gov

Sample records for generation sequencing platforms

  1. Next-Generation Sequencing Platforms

    NASA Astrophysics Data System (ADS)

    Mardis, Elaine R.

    2013-06-01

    Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.

  2. Performance comparison of Next Generation sequencing platforms.

    PubMed

    Erguner, Bekir; Ustek, Duran; Sagiroglu, Mahmut S

    2015-01-01

    Next Generation DNA Sequencing technologies offer ultra-high sequencing throughput at very low prices. The increase in throughput and diminished costs open up new research areas. Moreover, the number of clinicians utilizing DNA sequencing keeps growing. One of the main concerns for researchers and clinicians who are adopting these platforms is their sequencing accuracy. We compared three of the most commonly used Next Generation Sequencing platforms: Ion Torrent from Life Technologies, GS FLX+ from Roche and HiSeq 2000 from Illumina.

  3. The Accuracy, Feasibility and Challenges of Sequencing Short Tandem Repeats Using Next-Generation Sequencing Platforms

    PubMed Central

    Zavodna, Monika; Bagshaw, Andrew; Brauning, Rudiger; Gemmell, Neil J.

    2014-01-01

    To date we have little knowledge of how accurate next-generation sequencing (NGS) technologies are in sequencing repetitive sequences beyond known limitations to accurately sequence homopolymers. Only a handful of previous reports have evaluated the potential of NGS for sequencing short tandem repeats (microsatellites) and no empirical study has compared and evaluated the performance of more than one NGS platform with the same dataset. Here we examined yeast microsatellite variants from both long-read (454-sequencing) and short-read (Illumina) NGS platforms and compared these to data derived through Sanger sequencing. In addition, we investigated any locus-specific biases and differences that might have resulted from variability in microsatellite repeat number, repeat motif or type of mutation. Out of 112 insertion/deletion variants identified among 45 microsatellite amplicons in our study, we found 87.5% agreement between the 454-platform and Sanger sequencing in frequency of variant detection after Benjamini-Hochberg correction for multiple tests. For a subset of 21 microsatellite amplicons derived from Illumina sequencing, the results of short-read platform were highly consistent with the other two platforms, with 100% agreement with 454-sequencing and 93.6% agreement with the Sanger method after Benjamini-Hochberg correction. We found that the microsatellite attributes copy number, repeat motif and type of mutation did not have a significant effect on differences seen between the sequencing platforms. We show that both long-read and short-read NGS platforms can be used to sequence short tandem repeats accurately, which makes it feasible to consider the use of these platforms in high-throughput genotyping. It appears the major requirement for achieving both high accuracy and rare variant detection in microsatellite genotyping is sufficient read depth coverage. 
This might be a challenge because each platform generates a consistent pattern of non-uniform sequence
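
    The Benjamini-Hochberg correction named in the abstract above can be sketched as follows. This is a generic illustration of the step-up procedure for controlling the false discovery rate, not the authors' analysis code; the function name `benjamini_hochberg`, the `alpha` default and the sample p-values are assumptions made for the example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Step-up rule: reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, one per variant comparison.
example_p = [0.04, 0.001, 0.03, 0.5]
print(benjamini_hochberg(example_p))
```

    Because the rule is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value further down the ranking meets its threshold.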

  4. Toward Complete Bacterial Genome Sequencing Through the Combined Use of Multiple Next-Generation Sequencing Platforms.

    PubMed

    Jeong, Haeyoung; Lee, Dae-Hee; Ryu, Choong-Min; Park, Seung-Hwan

    2016-01-01

    PacBio's long-read sequencing technologies can be successfully used for a complete bacterial genome assembly using recently developed non-hybrid assemblers in the absence of second-generation, high-quality short reads. However, standardized procedures that take into account multiple pre-existing second-generation sequencing platforms are scarce. In addition to Illumina HiSeq and Ion Torrent PGM-based genome sequencing results derived from previous studies, we generated further sequencing data, including from the PacBio RS II platform, and applied various bioinformatics tools to obtain complete genome assemblies for five bacterial strains. Our approach revealed that the hierarchical genome assembly process (HGAP) non-hybrid assembler resulted in nearly complete assemblies at a moderate coverage of ~75x, but that different versions produced non-compatible results requiring post-processing. The other two platforms further improved the PacBio assembly through scaffolding and a final error correction.
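
    The "~75x" coverage figure above is a mean depth, which is conventionally the total number of sequenced bases divided by the genome length. A minimal sketch of that arithmetic follows; the 5 Mb genome size and the function names are assumptions for illustration and are not taken from the study.

```python
def mean_coverage(total_bases, genome_length):
    """Mean sequencing depth: total sequenced bases divided by genome length."""
    return total_bases / genome_length


def bases_needed(target_coverage, genome_length):
    """Total bases required to reach a target mean depth."""
    return target_coverage * genome_length


# Hypothetical 5 Mb bacterial genome (assumed value, not from the abstract).
genome_length = 5_000_000
print(bases_needed(75, genome_length))
print(mean_coverage(375_000_000, genome_length))
```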

  5. Peptide Synthesis on a Next-Generation DNA Sequencing Platform.

    PubMed

    Svensen, Nina; Peersen, Olve B; Jaffrey, Samie R

    2016-09-01

    Methods for displaying large numbers of peptides on solid surfaces are essential for high-throughput characterization of peptide function and binding properties. Here we describe a method for converting the >10^7 flow cell-bound clusters of identical DNA strands generated by the Illumina DNA sequencing technology into clusters of complementary RNA, and subsequently peptide clusters. We modified the flow-cell-bound primers with ribonucleotides thus enabling them to be used by poliovirus polymerase 3D(pol). The primers hybridize to the clustered DNA thus leading to RNA clusters. The RNAs fold into functional protein- or small molecule-binding aptamers. We used the mRNA-display approach to synthesize flow-cell-tethered peptides from these RNA clusters. The peptides showed selective binding to cognate antibodies. The methods described here provide an approach for using DNA clusters to template peptide synthesis on an Illumina flow cell, thus providing new opportunities for massively parallel peptide-based assays.

  6. Comparison of three next-generation sequencing platforms for metagenomic sequencing and identification of pathogens in blood

    PubMed Central

    2014-01-01

    Background: The introduction of benchtop sequencers has made adoption of whole genome sequencing possible for a broader community of researchers than ever before. Concurrently, metagenomic sequencing (MGS) is rapidly emerging as a tool for interrogating complex samples that defy conventional analyses. In addition, next-generation sequencers are increasingly being used in clinical or related settings, for instance to track outbreaks. However, information regarding the analytical sensitivity or limit of detection (LoD) of benchtop sequencers is currently lacking. Furthermore, the specificity of sequence information at or near the LoD is unknown. Results: In the present study, we assess the ability of three next-generation sequencing platforms to identify a pathogen (viral or bacterial) present in low titers in a clinically relevant sample (blood). Our results indicate that the Roche-454 Titanium platform is capable of detecting Dengue virus at titers as low as 1 × 10^2.5 pfu/mL, corresponding to an estimated 5.4 × 10^4 genome copies/mL maximum. The increased throughput of the benchtop sequencers, the Ion Torrent PGM and Illumina MiSeq platforms, enabled detection of viral genomes at concentrations as low as 1 × 10^4 genome copies/mL. Platform-specific biases were evident in sequence read distributions as well as viral genome coverage. For bacterial samples, only the MiSeq platform was able to provide sequencing reads that could be unambiguously classified as originating from Bacillus anthracis. Conclusion: The analytical sensitivity of all three platforms approaches that of standard qPCR assays. Although all platforms were able to detect pathogens at the levels tested, there were several noteworthy differences. The Roche-454 Titanium platform produced consistently longer reads, even when compared with the latest chemistry updates for the PGM platform.
The MiSeq platform produced consistently greater depth and breadth of coverage, while the Ion Torrent was unequaled for speed of

  7. Use of four next-generation sequencing platforms to determine HIV-1 coreceptor tropism.

    PubMed

    Archer, John; Weber, Jan; Henry, Kenneth; Winner, Dane; Gibson, Richard; Lee, Lawrence; Paxinos, Ellen; Arts, Eric J; Robertson, David L; Mimms, Larry; Quiñones-Mateu, Miguel E

    2012-01-01

    HIV-1 coreceptor tropism assays are required to rule out the presence of CXCR4-tropic (non-R5) viruses prior to treatment with CCR5 antagonists. Phenotypic (e.g., Trofile™, Monogram Biosciences) and genotypic (e.g., population sequencing linked to bioinformatic algorithms) assays are the most widely used. Although several next-generation sequencing (NGS) platforms are available, to date all published deep sequencing HIV-1 tropism studies have used the 454™ Life Sciences/Roche platform. In this study, HIV-1 co-receptor usage was predicted for twelve patients scheduled to start a maraviroc-based antiretroviral regimen. The V3 region of the HIV-1 env gene was sequenced using four NGS platforms: 454™, PacBio® RS (Pacific Biosciences), Illumina®, and Ion Torrent™ (Life Technologies). Cross-platform variation was evaluated, including number of reads, read length and error rates. HIV-1 tropism was inferred using Geno2Pheno, Web PSSM, and the 11/24/25 rule and compared with Trofile™ and virologic response to antiretroviral therapy. Error rates related to insertions/deletions (indels) and nucleotide substitutions introduced by the four NGS platforms were low compared to the actual HIV-1 sequence variation. Each platform detected all major virus variants within the HIV-1 population with similar frequencies. Identification of non-R5 viruses was comparable among the four platforms, with minor differences attributable to the algorithms used to infer HIV-1 tropism. All NGS platforms showed similar concordance with virologic response to the maraviroc-based regimen (75% to 80% range depending on the algorithm used), compared to Trofile (80%) and population sequencing (70%). In conclusion, all four NGS platforms were able to detect minority non-R5 variants at comparable levels, suggesting that any NGS-based method can be used to predict HIV-1 coreceptor usage.

  8. Quality control of next-generation sequencing library through an integrative digital microfluidic platform.

    PubMed

    Thaitrong, Numrin; Kim, Hanyoup; Renzi, Ronald F; Bartsch, Michael S; Meagher, Robert J; Patel, Kamlesh D

    2012-12-01

    We have developed an automated quality control (QC) platform for next-generation sequencing (NGS) library characterization by integrating a droplet-based digital microfluidic (DMF) system with a capillary-based reagent delivery unit and a quantitative CE module. Using an in-plane capillary-DMF interface, a prepared sample droplet was actuated into position between the ground electrode and the inlet of the separation capillary to complete the circuit for an electrokinetic injection. Using a DNA ladder as an internal standard, the CE module with a compact LIF detector was capable of detecting dsDNA in the range of 5-100 pg/μL, suitable for the amount of DNA required by the Illumina Genome Analyzer sequencing platform. This DMF-CE platform consumes tenfold less sample volume than the current Agilent BioAnalyzer QC technique, preserving precious sample while providing necessary sensitivity and accuracy for optimal sequencing performance. The ability of this microfluidic system to validate NGS library preparation was demonstrated by examining the effects of limited-cycle PCR amplification on the size distribution and the yield of Illumina-compatible libraries, demonstrating that as few as ten cycles of PCR bias the size distribution of the library toward undesirable larger fragments.

  9. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses

    PubMed Central

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-01-01

    Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. PMID:24462600

  10. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

    PubMed

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-06-01

    Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach.

  11. A Microfluidic DNA Library Preparation Platform for Next-Generation Sequencing

    PubMed Central

    Sinha, Anupama; Bent, Zachary W.; Solberg, Owen D.; Williams, Kelly P.; Langevin, Stanley A.; Renzi, Ronald F.; Van De Vreugde, James L.; Meagher, Robert J.; Schoeniger, Joseph S.; Lane, Todd W.; Branda, Steven S.; Bartsch, Michael S.; Patel, Kamlesh D.

    2013-01-01

    Next-generation sequencing (NGS) is emerging as a powerful tool for elucidating genetic information for a wide range of applications. Unfortunately, the surging popularity of NGS has not yet been accompanied by an improvement in automated techniques for preparing formatted sequencing libraries. To address this challenge, we have developed a prototype microfluidic system for preparing sequencer-ready DNA libraries for analysis by Illumina sequencing. Our system combines droplet-based digital microfluidic (DMF) sample handling with peripheral modules to create a fully-integrated, sample-in library-out platform. In this report, we use our automated system to prepare NGS libraries from samples of human and bacterial genomic DNA. E. coli libraries prepared on-device from 5 ng of total DNA yielded excellent sequence coverage over the entire bacterial genome, with >99% alignment to the reference genome, even genome coverage, and good quality scores. Furthermore, we produced a de novo assembly on a previously unsequenced multi-drug resistant Klebsiella pneumoniae strain BAA-2146 (KpnNDM). The new method described here is fast, robust, scalable, and automated. Our device for library preparation will assist in the integration of NGS technology into a wide variety of laboratories, including small research laboratories and clinical laboratories. PMID:23894387

  12. A microfluidic DNA library preparation platform for next-generation sequencing.

    PubMed

    Kim, Hanyoup; Jebrail, Mais J; Sinha, Anupama; Bent, Zachary W; Solberg, Owen D; Williams, Kelly P; Langevin, Stanley A; Renzi, Ronald F; Van De Vreugde, James L; Meagher, Robert J; Schoeniger, Joseph S; Lane, Todd W; Branda, Steven S; Bartsch, Michael S; Patel, Kamlesh D

    2013-01-01

    Next-generation sequencing (NGS) is emerging as a powerful tool for elucidating genetic information for a wide range of applications. Unfortunately, the surging popularity of NGS has not yet been accompanied by an improvement in automated techniques for preparing formatted sequencing libraries. To address this challenge, we have developed a prototype microfluidic system for preparing sequencer-ready DNA libraries for analysis by Illumina sequencing. Our system combines droplet-based digital microfluidic (DMF) sample handling with peripheral modules to create a fully-integrated, sample-in library-out platform. In this report, we use our automated system to prepare NGS libraries from samples of human and bacterial genomic DNA. E. coli libraries prepared on-device from 5 ng of total DNA yielded excellent sequence coverage over the entire bacterial genome, with >99% alignment to the reference genome, even genome coverage, and good quality scores. Furthermore, we produced a de novo assembly on a previously unsequenced multi-drug resistant Klebsiella pneumoniae strain BAA-2146 (KpnNDM). The new method described here is fast, robust, scalable, and automated. Our device for library preparation will assist in the integration of NGS technology into a wide variety of laboratories, including small research laboratories and clinical laboratories.

  13. Multi-platform assessment of transcriptome profiling using RNA-seq in the ABRF next-generation sequencing study.

    PubMed

    Li, Sheng; Tighe, Scott W; Nicolet, Charles M; Grove, Deborah; Levy, Shawn; Farmerie, William; Viale, Agnes; Wright, Chris; Schweitzer, Peter A; Gao, Yuan; Kim, Dewey; Boland, Joe; Hicks, Belynda; Kim, Ryan; Chhangawala, Sagar; Jafari, Nadereh; Raghavachari, Nalini; Gandara, Jorge; Garcia-Reyero, Natàlia; Hendrickson, Cynthia; Roberson, David; Rosenfeld, Jeffrey A; Rosenfeld, Jeffrey; Smith, Todd; Underwood, Jason G; Wang, May; Zumbo, Paul; Baldwin, Don A; Grills, George S; Mason, Christopher E

    2014-09-01

    High-throughput RNA sequencing (RNA-seq) greatly expands the potential for genomics discoveries, but the wide variety of platforms, protocols and performance capabilities has created the need for comprehensive reference data. Here we describe the Association of Biomolecular Resource Facilities next-generation sequencing (ABRF-NGS) study on RNA-seq. We carried out replicate experiments across 15 laboratory sites using reference RNA standards to test four protocols (poly-A-selected, ribo-depleted, size-selected and degraded) on five sequencing platforms (Illumina HiSeq, Life Technologies PGM and Proton, Pacific Biosciences RS and Roche 454). The results show high intraplatform (Spearman rank R > 0.86) and inter-platform (R > 0.83) concordance for expression measures across the deep-count platforms, but highly variable efficiency and cost for splice junction and variant detection between all platforms. For intact RNA, gene expression profiles from rRNA-depletion and poly-A enrichment are similar. In addition, rRNA depletion enables effective analysis of degraded RNA samples. This study provides a broad foundation for cross-platform standardization, evaluation and improvement of RNA-seq.
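
    The Spearman rank concordance reported above is the Pearson correlation computed on ranks rather than raw values. The sketch below shows the calculation in plain Python, assuming tied values receive their average rank; the function names and the per-gene expression values are hypothetical and do not come from the ABRF-NGS data.

```python
def average_ranks(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical expression measures for the same five genes on two platforms.
platform_a = [5.0, 120.0, 33.0, 0.5, 48.0]
platform_b = [7.0, 300.0, 20.0, 1.0, 90.0]
print(spearman(platform_a, platform_b))
```

    Because only ranks enter the calculation, the statistic is insensitive to platform-specific scaling of the expression values as long as the ordering of genes is preserved.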

  14. Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo): genome assembly and analysis.

    PubMed

    Dalloul, Rami A; Long, Julie A; Zimin, Aleksey V; Aslam, Luqman; Beal, Kathryn; Blomberg, Le Ann; Bouffard, Pascal; Burt, David W; Crasta, Oswald; Crooijmans, Richard P M A; Cooper, Kristal; Coulombe, Roger A; De, Supriyo; Delany, Mary E; Dodgson, Jerry B; Dong, Jennifer J; Evans, Clive; Frederickson, Karin M; Flicek, Paul; Florea, Liliana; Folkerts, Otto; Groenen, Martien A M; Harkins, Tim T; Herrero, Javier; Hoffmann, Steve; Megens, Hendrik-Jan; Jiang, Andrew; de Jong, Pieter; Kaiser, Pete; Kim, Heebal; Kim, Kyu-Won; Kim, Sungwon; Langenberger, David; Lee, Mi-Kyung; Lee, Taeheon; Mane, Shrinivasrao; Marcais, Guillaume; Marz, Manja; McElroy, Audrey P; Modise, Thero; Nefedov, Mikhail; Notredame, Cédric; Paton, Ian R; Payne, William S; Pertea, Geo; Prickett, Dennis; Puiu, Daniela; Qioa, Dan; Raineri, Emanuele; Ruffier, Magali; Salzberg, Steven L; Schatz, Michael C; Scheuring, Chantel; Schmidt, Carl J; Schroeder, Steven; Searle, Stephen M J; Smith, Edward J; Smith, Jacqueline; Sonstegard, Tad S; Stadler, Peter F; Tafer, Hakim; Tu, Zhijian Jake; Van Tassell, Curtis P; Vilella, Albert J; Williams, Kelly P; Yorke, James A; Zhang, Liqing; Zhang, Hong-Bin; Zhang, Xiaojun; Zhang, Yang; Reed, Kent M

    2010-09-07

    A synergistic combination of two next-generation sequencing platforms with a detailed comparative BAC physical contig map provided a cost-effective assembly of the genome sequence of the domestic turkey (Meleagris gallopavo). Heterozygosity of the sequenced source genome allowed discovery of more than 600,000 high quality single nucleotide variants. Despite this heterozygosity, the current genome assembly (∼1.1 Gb) includes 917 Mb of sequence assigned to specific turkey chromosomes. Annotation identified nearly 16,000 genes, with 15,093 recognized as protein coding and 611 as non-coding RNA genes. Comparative analysis of the turkey, chicken, and zebra finch genomes, and comparing avian to mammalian species, supports the characteristic stability of avian genomes and identifies genes unique to the avian lineage. Clear differences are seen in number and variety of genes of the avian immune system where expansions and novel genes are less frequent than examples of gene loss. The turkey genome sequence provides resources to further understand the evolution of vertebrate genomes and genetic variation underlying economically important quantitative traits in poultry. This integrated approach may be a model for providing both gene and chromosome level assemblies of other species with agricultural, ecological, and evolutionary interest.

  15. Multi-Platform Next-Generation Sequencing of the Domestic Turkey (Meleagris gallopavo): Genome Assembly and Analysis

    PubMed Central

    Aslam, Luqman; Beal, Kathryn; Ann Blomberg, Le; Bouffard, Pascal; Burt, David W.; Crasta, Oswald; Crooijmans, Richard P. M. A.; Cooper, Kristal; Coulombe, Roger A.; De, Supriyo; Delany, Mary E.; Dodgson, Jerry B.; Dong, Jennifer J.; Evans, Clive; Frederickson, Karin M.; Flicek, Paul; Florea, Liliana; Folkerts, Otto; Groenen, Martien A. M.; Harkins, Tim T.; Herrero, Javier; Hoffmann, Steve; Megens, Hendrik-Jan; Jiang, Andrew; de Jong, Pieter; Kaiser, Pete; Kim, Heebal; Kim, Kyu-Won; Kim, Sungwon; Langenberger, David; Lee, Mi-Kyung; Lee, Taeheon; Mane, Shrinivasrao; Marcais, Guillaume; Marz, Manja; McElroy, Audrey P.; Modise, Thero; Nefedov, Mikhail; Notredame, Cédric; Paton, Ian R.; Payne, William S.; Pertea, Geo; Prickett, Dennis; Puiu, Daniela; Qioa, Dan; Raineri, Emanuele; Ruffier, Magali; Salzberg, Steven L.; Schatz, Michael C.; Scheuring, Chantel; Schmidt, Carl J.; Schroeder, Steven; Searle, Stephen M. J.; Smith, Edward J.; Smith, Jacqueline; Sonstegard, Tad S.; Stadler, Peter F.; Tafer, Hakim; Tu, Zhijian (Jake); Van Tassell, Curtis P.; Vilella, Albert J.; Williams, Kelly P.; Yorke, James A.; Zhang, Liqing; Zhang, Hong-Bin; Zhang, Xiaojun; Zhang, Yang; Reed, Kent M.

    2010-01-01

    A synergistic combination of two next-generation sequencing platforms with a detailed comparative BAC physical contig map provided a cost-effective assembly of the genome sequence of the domestic turkey (Meleagris gallopavo). Heterozygosity of the sequenced source genome allowed discovery of more than 600,000 high quality single nucleotide variants. Despite this heterozygosity, the current genome assembly (∼1.1 Gb) includes 917 Mb of sequence assigned to specific turkey chromosomes. Annotation identified nearly 16,000 genes, with 15,093 recognized as protein coding and 611 as non-coding RNA genes. Comparative analysis of the turkey, chicken, and zebra finch genomes, and comparing avian to mammalian species, supports the characteristic stability of avian genomes and identifies genes unique to the avian lineage. Clear differences are seen in number and variety of genes of the avian immune system where expansions and novel genes are less frequent than examples of gene loss. The turkey genome sequence provides resources to further understand the evolution of vertebrate genomes and genetic variation underlying economically important quantitative traits in poultry. This integrated approach may be a model for providing both gene and chromosome level assemblies of other species with agricultural, ecological, and evolutionary interest. PMID:20838655

  16. Performance Comparison of Illumina and Ion Torrent Next-Generation Sequencing Platforms for 16S rRNA-Based Bacterial Community Profiling

    PubMed Central

    Kawashima, Toana; Rosenthal, Christopher; Hoogestraat, Daniel R.; Cummings, Lisa A.; Sengupta, Dhruba J.; Harkins, Timothy T.; Cookson, Brad T.

    2014-01-01

    High-throughput sequencing of the taxonomically informative 16S rRNA gene provides a powerful approach for exploring microbial diversity. Here we compare the performances of two common “benchtop” sequencing platforms, Illumina MiSeq and Ion Torrent Personal Genome Machine (PGM), for bacterial community profiling by 16S rRNA (V1-V2) amplicon sequencing. We benchmarked performance by using a 20-organism mock bacterial community and a collection of primary human specimens. We observed comparatively higher error rates with the Ion Torrent platform and report a pattern of premature sequence truncation specific to semiconductor sequencing. Read truncation was dependent on both the directionality of sequencing and the target species, resulting in organism-specific biases in community profiles. We found that these sequencing artifacts could be minimized by using bidirectional amplicon sequencing and an optimized flow order on the Ion Torrent platform. Results of bacterial community profiling performed on the mock community and a collection of 18 human-derived microbiological specimens were generally in good agreement for both platforms; however, in some cases, results differed significantly. Disparities could be attributed to the failure to generate full-length reads for particular organisms on the Ion Torrent platform, organism-dependent differences in sequence error rates affecting classification of certain species, or some combination of these factors. This study demonstrates the potential for differential bias in bacterial community profiles resulting from the choice of sequencing platform alone. PMID:25261520

  17. A two-dimensional pooling strategy for rare variant detection on next-generation sequencing platforms.

    PubMed

    Zuzarte, Philip C; Denroche, Robert E; Fehringer, Gordon; Katzov-Eckert, Hagit; Hung, Rayjean J; McPherson, John D

    2014-01-01

    We describe a method for pooling and sequencing DNA from a large number of individual samples while preserving information regarding sample identity. DNA from 576 individuals was arranged into four 12 row by 12 column matrices and then pooled by row and by column resulting in 96 total pools with 12 individuals in each pool. Pooling of DNA was carried out in a two-dimensional fashion, such that DNA from each individual is present in exactly one row pool and exactly one column pool. By considering the variants observed in the rows and columns of a matrix we are able to trace rare variants back to the specific individuals that carry them. The pooled DNA samples were enriched over a 250 kb region previously identified by GWAS to significantly predispose individuals to lung cancer. All 96 pools (12 row and 12 column pools from 4 matrices) were barcoded and sequenced on an Illumina HiSeq 2000 instrument with an average depth of coverage greater than 4,000×. Verification based on Ion PGM sequencing confirmed the presence of 91.4% of confidently classified SNVs assayed. In this way, each individual sample is sequenced in multiple pools providing more accurate variant calling than a single pool or a multiplexed approach. This provides a powerful method for rare variant detection in regions of interest at a reduced cost to the researcher.
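
    The row-and-column pooling layout described above can be sketched as follows. The dimensions (four 12 × 12 matrices, 576 individuals, 96 pools) come from the abstract; the pool labels, function names and the intersection-based tracing step are assumptions made for illustration, not the authors' pipeline.

```python
ROWS, COLS, MATRICES = 12, 12, 4


def pools_for(individual):
    """Map an individual index (0..575) to its (row pool, column pool) labels."""
    matrix, offset = divmod(individual, ROWS * COLS)
    row, col = divmod(offset, COLS)
    return (matrix, "row", row), (matrix, "col", col)


def build_pools():
    """Return a dict mapping each pool label to the set of individuals it contains."""
    pools = {}
    for individual in range(ROWS * COLS * MATRICES):
        for pool in pools_for(individual):
            pools.setdefault(pool, set()).add(individual)
    return pools


def trace_carriers(positive_pools, pools):
    """Candidate carriers: individuals whose row pool and column pool both show the variant."""
    candidates = []
    for matrix in range(MATRICES):
        rows = [p for p in positive_pools if p[0] == matrix and p[1] == "row"]
        cols = [p for p in positive_pools if p[0] == matrix and p[1] == "col"]
        for r in rows:
            for c in cols:
                candidates.extend(pools[r] & pools[c])
    return sorted(candidates)


pools = build_pools()
# A variant seen only in the two pools that contain hypothetical individual 200.
print(trace_carriers(set(pools_for(200)), pools))
```

    A single carrier per matrix resolves to exactly one individual. When two carriers in the same matrix sit in different rows and columns, the intersections yield four candidates, so a design like this suits rare variants, where that case is uncommon.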

  18. A comprehensive transcriptome assembly of pigeonpea (Cajanus cajan L.) using Sanger and second-generation sequencing platforms

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A comprehensive transcriptome assembly for pigeonpea has been developed by analyzing 128.9 million short Illumina GA IIx single end reads, 2.19 million single end FLX/454 reads, and 18,353 Sanger expressed sequence tags (ESTs) from more than 16 genotypes. The resultant transcriptome assembly, refer...

  19. Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo) genome assembly and analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next-generation sequencing technologies were used to rapidly and efficiently sequence the genome of the domestic turkey (Meleagris gallopavo). The current genome assembly (~1.1 Gb) includes 917 Mb of sequence assigned to chromosomes. Innate heterozygosity of the sequenced bird allowed discovery of...

  20. Parallel tagged amplicon sequencing of transcriptome-based genetic markers for Triturus newts with the Ion Torrent next-generation sequencing platform

    PubMed Central

    Wielstra, B; Duijm, E; Lagler, P; Lammers, Y; Meilink, W R M; Ziermann, J M; Arntzen, J W

    2014-01-01

    Next-generation sequencing is a fast and cost-effective way to obtain sequence data for nonmodel organisms for many markers and for many individuals. We describe a protocol through which we obtain orthologous markers for the crested newts (Amphibia: Salamandridae: Triturus), suitable for analysis of interspecific hybridization. We use transcriptome data of a single Triturus species and design 96 primer pairs that amplify c. 180 bp fragments positioned in 3-prime untranslated regions. Next, these markers are tested with uniplex PCR for a set of species spanning the taxonomical width of the genus Triturus. The 52 markers that consistently show a single band of expected length at gel electrophoreses for all tested crested newt species are then amplified in five multiplex PCRs (with a plexity of ten or eleven) for 132 individual newts: a set of 84 representing the seven (candidate) species and a set of 48 from a presumed hybrid population. After pooling multiplexes per individual, unique tags are ligated to link amplicons to individuals. Subsequently, individuals are pooled equimolar and sequenced on the Ion Torrent next-generation sequencing platform. A bioinformatics pipeline identifies the alleles and recodes these to a genotypic format. Next, we test the utility of our markers. baps allocates the 84 crested newt individuals representing (candidate) species to their expected (candidate) species, confirming the markers are suitable for species delineation. newhybrids, a hybrid index and hiest confirm the 48 individuals from the presumed hybrid population to be genetically admixed, illustrating the potential of the markers to identify interspecific hybridization. We expect the set of markers we designed to provide a high resolving power for analysis of hybridization in Triturus. PMID:24571307

  1. Parallel tagged amplicon sequencing of transcriptome-based genetic markers for Triturus newts with the Ion Torrent next-generation sequencing platform.

    PubMed

    Wielstra, B; Duijm, E; Lagler, P; Lammers, Y; Meilink, W R M; Ziermann, J M; Arntzen, J W

    2014-09-01

    Next-generation sequencing is a fast and cost-effective way to obtain sequence data for nonmodel organisms for many markers and for many individuals. We describe a protocol through which we obtain orthologous markers for the crested newts (Amphibia: Salamandridae: Triturus), suitable for analysis of interspecific hybridization. We use transcriptome data of a single Triturus species and design 96 primer pairs that amplify c. 180 bp fragments positioned in 3-prime untranslated regions. Next, these markers are tested with uniplex PCR for a set of species spanning the taxonomical width of the genus Triturus. The 52 markers that consistently show a single band of expected length at gel electrophoreses for all tested crested newt species are then amplified in five multiplex PCRs (with a plexity of ten or eleven) for 132 individual newts: a set of 84 representing the seven (candidate) species and a set of 48 from a presumed hybrid population. After pooling multiplexes per individual, unique tags are ligated to link amplicons to individuals. Subsequently, individuals are pooled equimolar and sequenced on the Ion Torrent next-generation sequencing platform. A bioinformatics pipeline identifies the alleles and recodes these to a genotypic format. Next, we test the utility of our markers. baps allocates the 84 crested newt individuals representing (candidate) species to their expected (candidate) species, confirming the markers are suitable for species delineation. newhybrids, a hybrid index and hiest confirm the 48 individuals from the presumed hybrid population to be genetically admixed, illustrating the potential of the markers to identify interspecific hybridization. We expect the set of markers we designed to provide a high resolving power for analysis of hybridization in Triturus.

  2. Robustness of Massively Parallel Sequencing Platforms.

    PubMed

    Kavak, Pınar; Yüksel, Bayram; Aksu, Soner; Kulekci, M Oguzhan; Güngör, Tunga; Hach, Faraz; Şahinalp, S Cenk; Alkan, Can; Sağıroğlu, Mahmut Şamil

    2015-01-01

    The improvements in high throughput sequencing technologies (HTS) made clinical sequencing projects such as ClinSeq and Genomics England feasible. Although there are significant improvements in accuracy and reproducibility of HTS based analyses, the usability of these types of data for diagnostic and prognostic applications necessitates a near perfect data generation. To assess the usability of a widely used HTS platform for accurate and reproducible clinical applications in terms of robustness, we generated whole genome shotgun (WGS) sequence data from the genomes of two human individuals in two different genome sequencing centers. After analyzing the data to characterize SNPs and indels using the same tools (BWA, SAMtools, and GATK), we observed a significant number of discrepancies in the call sets. As expected, most of the disagreements between the call sets were found within genomic regions containing common repeats and segmental duplications, albeit only a small fraction of the discordant variants were within the exons and other functionally relevant regions such as promoters. We conclude that although HTS platforms are sufficiently powerful for providing data for first-pass clinical tests, the variant predictions still need to be confirmed using orthogonal methods before use in clinical applications.

  3. AG-NGS: a powerful and user-friendly computing application for the semi-automated preparation of next-generation sequencing libraries using open liquid handling platforms.

    PubMed

    Callejas, Sergio; Álvarez, Rebeca; Benguria, Alberto; Dopazo, Ana

    2014-01-01

    Next-generation sequencing (NGS) is becoming one of the most widely used technologies in the field of genomics. Library preparation is one of the most critical, hands-on, and time-consuming steps in the NGS workflow. Each library must be prepared in an independent well, increasing the number of hours required for a sequencing run and the risk of human-introduced error. Automation of library preparation is the best option to avoid these problems. With this in mind, we have developed automatic genomics NGS (AG-NGS), a computing application that allows an open liquid handling platform to be transformed into a library preparation station without losing the potential of an open platform. Implementation of AG-NGS does not require programming experience, and the application has also been designed to minimize implementation costs. Automated library preparation with AG-NGS generated high-quality libraries from different samples, demonstrating its efficiency, and all quality control parameters fell within the range of optimal values.

  4. Generations of sequencing technologies.

    PubMed

    Pettersson, Erik; Lundeberg, Joakim; Ahmadian, Afshin

    2009-02-01

    Advancements in the field of DNA sequencing are changing the scientific horizon and promising an era of personalized medicine for elevated human health. Although platforms are improving at the rate of Moore's Law, thereby reducing the sequencing costs by a factor of two or three each year, we find ourselves at a point in history where individual genomes are starting to appear but where the cost is still too high for routine sequencing of whole genomes. These needs will be met by miniaturized and parallelized platforms that allow a lower sample and template consumption thereby increasing speed and reducing costs. Current massively parallel, state-of-the-art systems are providing significantly improved throughput over Sanger systems and future single-molecule approaches will continue the exponential improvements in the field.

  5. Next-generation sequencing and microarray-based interrogation of microRNAs from formalin-fixed, paraffin-embedded tissue: Preliminary assessment of cross-platform concordance

    PubMed Central

    Kelly, Andrew D.; Hill, Katherine E.; Correll, Mick; Hu, Lan; Wang, Yaoyu; Rubio, Renee; Duan, Shenghua; Quackenbush, John; Spentzos, Dimitrios

    2014-01-01

    Next-generation sequencing is increasingly employed in biomedical investigations. Strong concordance between microarray and mRNA-seq levels has been reported in high quality specimens but information is lacking on formalin-fixed, paraffin-embedded (FFPE) tissues, and particularly for microRNA (miRNA) analysis. We conducted a preliminary examination of the concordance between miRNA-seq and cDNA-mediated annealing, selection, extension, and ligation (DASL) miRNA assays. Quantitative agreement between platforms is moderate (Spearman correlation 0.514–0.596) and there is discordance of detection calls on a subset of miRNAs. Quantitative PCR (q-RT-PCR) performed for several discordant miRNAs confirmed the presence of most sequences detected by miRNA-seq but not by DASL but also that miRNA-seq did not detect some sequences, which DASL confidently detected. Our results suggest that miRNA-seq is specific, with few false positive calls, but it may not detect certain abundant miRNAs in FFPE tissue. Further work is necessary to fully address these issues that are pertinent for translational research. PMID:23562991

  6. Comprehensive transcriptome assembly of chickpea (Cicer arietinum L.) using Sanger and next generation sequencing platforms: development and applications

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A high-quality transcriptome assembly for chickpea has been developed using ~135 million Illumina single-end reads, 7.12 million single-end FLX/454 reads, and 139 thousand Sanger expressed sequence tags (ESTs). This hybrid transcriptome assembly, which we refer to as the "Cicer arietinum Transcripto...

  7. Profile of bacterial communities in South African mine-water samples using Illumina next-generation sequencing platform.

    PubMed

    Keshri, Jitendra; Mankazana, Boitumelo B J; Momba, Maggy N B

    2015-04-01

    Mine water is an example of an extreme environment that contains a large number of diverse and specific bacteria. It is imperative to gain an understanding of these bacterial communities in order to develop effective strategies for the bioremediation of polluted aquatic systems. In this study, the high-throughput sequencing approach was used to characterize the bacterial communities in two different mine waters of South Africa: vanadium and gold mine water. Over 2629 operational taxonomic units (OTUs) were recovered from 15,802 reads of the 16S ribosomal RNA (rRNA) gene. They represented 8 phyla, 43 orders, 84 families and 105 genera. Proteobacteria and unclassified bacterial sequences were the most dominant. Apart from these, Firmicutes, Bacteroidetes, Actinobacteria, Candidate phylum OD1, Cyanobacteria, Verrucomicrobia and Deinococcus-Thermus were the recovered phyla, although their relative abundance differed between both the mine-water samples. Yet, diversity indices suggested that the bacterial communities inhabiting the vanadium mine water were more diverse than those in gold mine water. Interestingly, substantial percentages of the reads from either sample (58 % in vanadium and 17 % in gold mine water) could not be assigned to any phylum and remained unclassified, suggesting hitherto unidentified populations, and vast untapped microbial diversity. Overall, the results of this study exhibited bacterial community structures with high diversity in mine water, which can be explored further for their role in bioremediation and environmental management.

  8. Comprehensive transcriptome assembly of Chickpea (Cicer arietinum L.) using sanger and next generation sequencing platforms: development and applications.

    PubMed

    Kudapa, Himabindu; Azam, Sarwar; Sharpe, Andrew G; Taran, Bunyamin; Li, Rong; Deonovic, Benjamin; Cameron, Connor; Farmer, Andrew D; Cannon, Steven B; Varshney, Rajeev K

    2014-01-01

    A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags (ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as Cicer arietinum Transcriptome Assembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptomes/cicar/lista_cicar-201201), comprising 46,369 transcript assembly contigs (TACs) has an N50 length of 1,726 bp and a maximum contig size of 15,644 bp. Putative functions were determined for 32,869 (70.8%) of the TACs and gene ontology assignments were determined for 21,471 (46.3%). The new transcriptome assembly was compared with the previously available chickpea transcriptome assemblies as well as to the chickpea genome. Comparative analysis of CaTA v2 against transcriptomes of three legumes - Medicago, soybean and common bean, resulted in 27,771 TACs common to all three legumes indicating strong conservation of genes across legumes. CaTA v2 was also used for identification of simple sequence repeats (SSRs) and intron spanning regions (ISRs) for developing molecular markers. ISRs were identified by aligning TACs to the Medicago genome, and their putative mapping positions at chromosomal level were identified using transcript map of chickpea. Primer pairs were designed for 4,990 ISRs, each representing a single contig for which predicted positions are inferred and distributed across eight linkage groups. A subset of randomly selected ISRs representing all eight chickpea linkage groups were validated on five chickpea genotypes and showed 20% polymorphism with average polymorphic information content (PIC) of 0.27. In summary, the hybrid transcriptome assembly developed and novel markers identified can be used for a variety of applications such as gene discovery, marker-trait association, diversity analysis etc., to advance genetics research and breeding applications in

  9. Automatic Command Sequence Generation

    NASA Technical Reports Server (NTRS)

    Fisher, Forest; Gladden, Roy; Khanampompan, Teerapat

    2007-01-01

    Automatic Sequence Generator (Autogen) Version 3.0 software automatically generates command sequences for the Mars Reconnaissance Orbiter (MRO) and several other JPL spacecraft operated by the multi-mission support team. Autogen uses standard JPL sequencing tools like APGEN, ASP, SEQGEN, and the DOM database to automate the generation of uplink command products, Spacecraft Command Message Format (SCMF) files, and the corresponding ground command products, DSN Keywords Files (DKF). Autogen supports all the major multi-mission mission phases including the cruise, aerobraking, mapping/science, and relay mission phases. Autogen is a Perl script, which functions within the mission operations UNIX environment. It consists of two parts: a set of model files and the autogen Perl script. Autogen encodes the behaviors of the system into a model and encodes algorithms for context sensitive customizations of the modeled behaviors. The model includes knowledge of different mission phases and how the resultant command products must differ for these phases. The executable software portion of Autogen automates the setup and use of APGEN for constructing a spacecraft activity sequence file (SASF). The setup includes file retrieval through the DOM (Distributed Object Manager), an object database used to store project files. This step retrieves all the needed input files for generating the command products. Depending on the mission phase, Autogen also uses the ASP (Automated Sequence Processor) and SEQGEN to generate the command product sent to the spacecraft. Autogen also provides the means for customizing sequences through the use of configuration files. By automating the majority of the sequencing generation process, Autogen eliminates many sequence generation errors commonly introduced by manually constructing spacecraft command sequences. Through the layering of commands into the sequence by a series of scheduling algorithms, users are able to rapidly and reliably construct the

  10. Direct Chloroplast Sequencing: Comparison of Sequencing Platforms and Analysis Tools for Whole Chloroplast Barcoding

    PubMed Central

    Brozynska, Marta; Furtado, Agnelo; Henry, Robert James

    2014-01-01

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis. PMID:25329378

  11. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    PubMed

    Brozynska, Marta; Furtado, Agnelo; Henry, Robert James

    2014-01-01

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  12. Targeted Exome Sequencing Outcome Variations of Colorectal Tumors within and across Two Sequencing Platforms

    PubMed Central

    Ashktorab, Hassan; Azimi, Hamed; Nickerson, Michael L.; Bass, Sara; Varma, Sudhir; Brim, Hassan

    2016-01-01

    Background and Aim: Next generation sequencing (NGS) has quickly become the tool of choice for genome and exome data generation. The multitude of sequencing platforms as well as the variabilities within each platform need to be assessed. In this paper we used two platforms (Ion Torrent and Illumina) to assess single nucleotide variants in colorectal cancer (CRC) specimens. Methods: CRC specimens (n = 13) collected from 6 CRC (cancer and matched normal) patients were used to establish the mutational profile using Ion Torrent and Illumina sequencing platforms. We analyzed a set of samples from formalin-fixed paraffin-embedded (FFPE) and fresh frozen (FF) samples on both platforms to assess the effect of sample nature (FFPE vs. FF) on sequencing outcome and to evaluate the similarity/differences of SNVs across the two platforms. In addition, duplicates of FF samples were sequenced on each platform to assess variability within platform. Results: The comparison of FF replicates to each other gave a concordance of 77% (± 15.3%) in Ion Torrent and 70% (± 3.7%) in Illumina. FFPE vs. FF replicates gave a concordance of 40% (± 32%) in Ion Torrent and 49% (± 19%) in Illumina. Cross-platform concordance averaged 75% (± 9.8%) for FFPE samples and 67% (± 32%) for FF samples, with an overall average of 70% (± 26.8%). Conclusion: Our data show a significant variability within and across platforms. Also, the number of detected variants depends on the nature of the specimen (FF vs. FFPE). Validation of NGS-discovered mutations is a must to rule out false positive mutants. This validation might either be performed through a second NGS platform or through Sanger sequencing. PMID:27547838

  13. Next generation sequencing technology: Advances and applications.

    PubMed

    Buermans, H P J; den Dunnen, J T

    2014-10-01

    Impressive progress has been made in the field of Next Generation Sequencing (NGS). Through advancements in the fields of molecular biology and technical engineering, parallelization of the sequencing reaction has profoundly increased the total number of produced sequence reads per run. Current sequencing platforms allow for a previously unprecedented view into complex mixtures of RNA and DNA samples. NGS is currently evolving into a molecular microscope finding its way into virtually every field of biomedical research. In this chapter we review the technical background of the different commercially available NGS platforms with respect to template generation and the sequencing reaction and take a small step towards what the upcoming NGS technologies will bring. We close with an overview of different implementations of NGS into biomedical research. This article is part of a Special Issue entitled: From Genome to Function.

  14. Quasi-Random Sequence Generators.

    1994-03-01

    Version 00 LPTAU generates quasi-random sequences. The sequences are uniformly distributed sets of L=2**30 points in the N-dimensional unit cube: I**N=[0,1]. The sequences are used as nodes for multidimensional integration, as searching points in global optimization, as trial points in multicriteria decision making, as quasi-random points for quasi Monte Carlo algorithms.
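
    The use of quasi-random points as integration nodes can be illustrated with a minimal sketch. It is not the LPTAU code: it substitutes the simpler Halton construction (one van der Corput sequence per dimension), and the function names van_der_corput, halton and qmc_integrate are hypothetical names chosen for this illustration.

```python
from typing import Callable, List, Sequence, Tuple


def van_der_corput(index: int, base: int) -> float:
    """Radical inverse of `index` in `base`: mirror its digits about the radix point."""
    result, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        result += digit / denom
    return result


def halton(n_points: int, bases: Sequence[int] = (2, 3)) -> List[Tuple[float, ...]]:
    """First `n_points` Halton points in the unit cube; one coprime base per dimension."""
    return [
        tuple(van_der_corput(i, b) for b in bases)
        for i in range(1, n_points + 1)
    ]


def qmc_integrate(f: Callable[..., float], n_points: int,
                  bases: Sequence[int] = (2, 3)) -> float:
    """Quasi-Monte Carlo estimate of the integral of f over the unit cube."""
    points = halton(n_points, bases)
    return sum(f(*p) for p in points) / n_points


if __name__ == "__main__":
    # Integral of x*y over the unit square is exactly 0.25.
    print(qmc_integrate(lambda x, y: x * y, 1000))
```

    The same point set could serve as search points in global optimization; only the averaging step in qmc_integrate is specific to integration.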

  15. Comparison of Sequencing Platforms for Single Nucleotide Variant Calls in a Human Sample

    PubMed Central

    Miller, Webb; Guillory, Joseph; Stinson, Jeremy; Seshagiri, Somasekar

    2013-01-01

    Next-generation sequencing platforms coupled with advanced bioinformatic tools enable re-sequencing of the human genome at high speed and with large cost savings. We compare sequencing platforms from Roche/454 (GS FLX), Illumina/HiSeq (HiSeq 2000), and Life Technologies/SOLiD (SOLiD 3 ECC) for their ability to identify single nucleotide substitutions in whole genome sequences from the same human sample. We report on significant GC-related bias observed in the data sequenced on Illumina and SOLiD platforms. The differences in the variant calls were investigated with regard to coverage and sequencing error. Some of the variants called by only one or two of the platforms were experimentally tested using mass spectrometry, a method that is independent of DNA sequencing. We establish several causes why variants remained unreported, specific to each platform. We report the indels called using the three sequencing technologies and from the obtained results we conclude that sequencing human genomes with more than a single platform and multiple libraries is beneficial when a high level of accuracy is required. PMID:23405114

  16. Comparison of sequencing platforms for single nucleotide variant calls in a human sample.

    PubMed

    Ratan, Aakrosh; Miller, Webb; Guillory, Joseph; Stinson, Jeremy; Seshagiri, Somasekar; Schuster, Stephan C

    2013-01-01

    Next-generation sequencing platforms coupled with advanced bioinformatic tools enable re-sequencing of the human genome at high speed and with large cost savings. We compare sequencing platforms from Roche/454 (GS FLX), Illumina/HiSeq (HiSeq 2000), and Life Technologies/SOLiD (SOLiD 3 ECC) for their ability to identify single nucleotide substitutions in whole genome sequences from the same human sample. We report on significant GC-related bias observed in the data sequenced on Illumina and SOLiD platforms. The differences in the variant calls were investigated with regard to coverage and sequencing error. Some of the variants called by only one or two of the platforms were experimentally tested using mass spectrometry, a method that is independent of DNA sequencing. We establish several causes why variants remained unreported, specific to each platform. We report the indels called using the three sequencing technologies and from the obtained results we conclude that sequencing human genomes with more than a single platform and multiple libraries is beneficial when a high level of accuracy is required.

  17. Application of genotyping-by-sequencing on semiconductor sequencing platforms: A comparison of genetic and reference-based marker ordering in barley

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The rapid development of next generation sequencing platforms has enabled the use of sequencing for routine genotyping across a range of genetics studies and breeding applications. Genotyping-by-sequencing (GBS), a low-cost, reduced representation sequencing method, is becoming a common approach fo...

  18. Wolfcampian sequence stratigraphy of eastern Central Basin platform, Texas

    SciTech Connect

    Candelaria, M.P.; Entzminger, D.J.; Behnken, F.H.; Sarg, J.F.; Wilde, G.L.

    1992-04-01

    Integrated study of well logs, cores, high-resolution seismic data, and biostratigraphy has established the sequence framework of the Atokan (Early Pennsylvanian)-Wolfcampian (Early Permian) stratigraphic section along the eastern margin of the Central Basin platform in the Permian basin. Sequence interpretation of high-resolution, high-fold seismic data through this stratigraphic interval has revealed a complex progradational/retrogradational evolution of the platform margin that has demonstrated overall progradation of at least 12 km during early-middle Wolfcampian. Sequence stratigraphic study of the Wolfcamp interval has revealed details of the internal architecture and morphologic evolution of the contemporaneous platform margin. Two generalized seismic facies assemblages are recognized in the Wolfcampian. Platform interior facies are characterized by high-amplitude, laterally continuous parallel reflections; platform margin facies consist of progradational sigmoidal to oblique clinoforms and are characterized by discontinuous, low-amplitude reflections. Sequence interpretation of carbonate platform-to-basin strata geometries helps in predicting subtle stratigraphic trapping relationships and potential reservoir facies distribution. Moreover, this interpretive method assists in describing complex reservoir heterogeneities that can contribute to significant reserve additions from within existing fields.

  19. Next-generation sequencing technologies for environmental DNA research.

    PubMed

    Shokralla, Shadi; Spall, Jennifer L; Gibson, Joel F; Hajibabaei, Mehrdad

    2012-04-01

    Since 2005, advances in next-generation sequencing technologies have revolutionized biological science. The analysis of environmental DNA through the use of specific gene markers such as species-specific DNA barcodes has been a key application of next-generation sequencing technologies in ecological and environmental research. Access to parallel, massive amounts of sequencing data, as well as subsequent improvements in read length and throughput of different sequencing platforms, is leading to a better representation of sample diversity at a reasonable cost. New technologies are being developed rapidly and have the potential to dramatically accelerate ecological and environmental research. The fast pace of development and improvements in next-generation sequencing technologies can reflect on broader and more robust applications in environmental DNA research. Here, we review the advantages and limitations of current next-generation sequencing technologies in regard to their application for environmental DNA analysis.

  20. Solving the problem of comparing whole bacterial genomes across different sequencing platforms.

    PubMed

    Kaas, Rolf S; Leekitcharoenphon, Pimlapas; Aarestrup, Frank M; Lund, Ole

    2014-01-01

    Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one technology because each technology has a systematic bias making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets and sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. It is concluded that the cause of the success of these new procedures is due to a validation of all informative sites that are included in the analysis. The procedures are available as web tools. PMID:25110940

  1. Solving the problem of comparing whole bacterial genomes across different sequencing platforms.

    PubMed

    Kaas, Rolf S; Leekitcharoenphon, Pimlapas; Aarestrup, Frank M; Lund, Ole

    2014-01-01

    Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far studies have focused on using one technology because each technology has a systematic bias making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets and sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. It is concluded that the cause of the success of these new procedures is due to a validation of all informative sites that are included in the analysis. The procedures are available as web tools.

  2. High-speed multiple sequence alignment on a reconfigurable platform.

    PubMed

    Oliver, Tim; Schmidt, Bertil; Maskell, Douglas; Nathan, Darran; Clemens, Ralf

    2006-01-01

    Progressive alignment is a widely used approach to compute multiple sequence alignments (MSAs). However, aligning several hundred sequences by popular progressive alignment tools requires hours on sequential computers. Due to the rapid growth of sequence databases biologists have to compute MSAs in a far shorter time. In this paper we present a new approach to MSA on reconfigurable hardware platforms to gain high performance at low cost. We have constructed a linear systolic array to perform pairwise sequence distance computations using dynamic programming. This results in an implementation with significant runtime savings on a standard FPGA.
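
    The pairwise distance stage can be sketched in software as follows. This is an illustration under stated assumptions, not the authors' FPGA design: unit-cost edit distance stands in for whatever scoring scheme the paper uses, and the function names edit_distance and distance_matrix are hypothetical.

```python
from typing import List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance by dynamic programming, keeping one row at a time."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def distance_matrix(seqs: Sequence[str]) -> List[List[int]]:
    """Symmetric matrix of all pairwise distances, the input to a guide tree."""
    n = len(seqs)
    dist = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = edit_distance(seqs[i], seqs[j])
    return dist


if __name__ == "__main__":
    for row in distance_matrix(["ACGT", "ACGA", "TCGA"]):
        print(row)
```

    Each cell depends only on its left, upper and upper-left neighbours, so all cells on one anti-diagonal are independent of each other; that independence is what a linear systolic array can exploit in hardware.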

  3. Sedimentology and sequence stratigraphy of reefs and carbonate platforms

    SciTech Connect

    Schlager, W.

    1992-01-01

    Classical sequence stratigraphy has been developed primarily from siliciclastic systems. Application of the concept to carbonates has not been as straightforward as was originally expected even though the basic tenets of sequence stratigraphy are supposed to be applicable to all depositional systems. Rather than force carbonate platforms into the straitjacket of a concept derived from another sediment family, this course takes a different tack. It starts out from the premise that sequence stratigraphy is a modern and sophisticated version of lithostratigraphy and as such is a sedimentologic concept. More sedimentology into sequence stratigraphy is the motto of the course and the red line that runs through the chapters of this book. The book sets out with a review, in reference to petroleum deposits, of sedimentologic principles governing the large-scale anatomy of reefs and platforms. It then looks at sequences and systems tracts from a sedimentologic point of view, assesses the differences between siliciclastics and carbonates in their response to sea level, evaluates processes that compete with sea level for control on carbonate sequences, and finally presents a set of guidelines for application of sequence stratigraphy to reefs and carbonate platforms.

  4. Replacement Sequence of Events Generator

    NASA Technical Reports Server (NTRS)

    Fisher, Forest; Gladden, Daniel Wenkert Roy; Khanampompan, Teerpat

    2008-01-01

    The soeWINDOW program automates the generation of an ITAR (International Traffic in Arms Regulations)-compliant sub-RSOE (Replacement Sequence of Events) by extracting a specified temporal window from an RSOE while maintaining page header information. RSOEs contain a significant amount of information that is not ITAR-compliant, yet that foreign partners need to see for command details to their instrument, as well as the surrounding commands that provide context for validation. soeWINDOW can serve as an example of how command support products can be made ITAR-compliant for future missions. This software is a Perl script intended for use in the mission operations UNIX environment. It is designed for use to support the MRO (Mars Reconnaissance Orbiter) instrument team. The tool also provides automated DOM (Distributed Object Manager) storage into the special ITAR-okay DOM collection, and can be used for creating focused RSOEs for product review by any of the MRO teams.

  5. Automated Sequence Generation Process and Software

    NASA Technical Reports Server (NTRS)

    Gladden, Roy

    2007-01-01

    "Automated sequence generation" (autogen) signifies both a process and software used to automatically generate sequences of commands to operate various spacecraft. The autogen software comprises the autogen script plus the Activity Plan Generator (APGEN) program. APGEN can be used for planning missions and command sequences.

  6. Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies

    SciTech Connect

    Utturkar, Sagar M.; Klingeman, Dawn Marie; Bruno-Barcena, José M.; Chinn, Mari S.; Grunden, Amy; Köpke, Michael; Brown, Steven D.

    2015-04-14

    During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms (for example, Illumina), which are characterized by low cost, high throughput and shorter read lengths. The emergence and development of so-called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequence datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data.

  7. Underlying Data for Sequencing the Mitochondrial Genome with the Massively Parallel Sequencing Platform Ion Torrent™ PGM™

    PubMed Central

    2015-01-01

    Background Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per nucleotide basis and indeed on a per sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, San Francisco, CA), the output data were assessed, and the results were compared with data previously generated on the MiSeq™ (Illumina, San Diego, CA). The objectives of this paper were to determine the feasibility, accuracy, and reliability of sequence data obtained from the PGM. Results 24 samples were multiplexed (in groups of six) and sequenced on the 314 chip, which has a throughput of at least 10 megabases. The depth of coverage pattern was similar among all 24 samples; however, the coverage across the genome varied. For strand bias, the average ratio of coverage between the forward and reverse strands at each nucleotide position indicated that two-thirds of the positions of the genome had ratios that were greater than 0.5. A few sites had more extreme strand bias. Another observation was that 156 positions had a false deletion rate greater than 0.15 in one or more individuals. There were 31-98 (SNP) mtGenome variants observed per sample for the 24 samples analyzed. The total 1237 (SNP) variants were concordant between the results from the PGM and MiSeq. The quality scores for haplogroup assignment for all 24 samples ranged between 88.8%-100%. Conclusions In this study, mtDNA sequence data generated from the PGM were analyzed and the output evaluated. Depth of coverage variation and strand bias were identified but generally were infrequent and did not impact reliability of variant calls. Multiplexing of samples was demonstrated, which can improve throughput and reduce cost per sample analyzed.

  8. Sequence stratigraphy and the demise of carbonate platforms

    SciTech Connect

    Schlager, W.

    1991-03-01

    The termination of carbonate platforms and the change to siliciclastic deposition produce pronounced unconformities because carbonates and siliciclastics develop different sea-floor morphology and follow different patterns of sediment input and dispersal. Geometrically, these unconformities resemble lowstand unconformities, but platform termination by exposure is rare in settings with continued sedimentation. With a lag time of only a few thousand years, the platform will resume growth when reflooded under the same environmental conditions. Most platforms are terminated by drowning: they become submerged to below the photic zone or they drown in a flood of siliciclastic sediment. The high growth potential of healthy platforms severely limits the possibility of drowning by sea-level pulse. During the rapid Holocene transgression, many reefs and platforms grew at rates of 10^3 Bubnoffs (= microns per year), some in excess of 10^4 Bubnoffs. Scleractinian corals, the key element in the growth of platform rims, can grow at over 10^5 Bubnoffs, and similar growth rates have been observed on fossil frame builders. The rates of long-term subsidence as well as third-order sea-level cycles are 10^1-10^2 Bubnoffs. Platform drowning is, therefore, more often controlled by changes in the marine environment and reduction in growth potential than by sea-level pulses. The coincidence of oceanic anoxia and mass drowning of platforms in the mid-Cretaceous, the Early Jurassic, and the Late Devonian illustrates the environmental dimension of drowning. Drowning unconformities demonstrate the importance of environmental change as an autonomous control on sequences, independent of sea-level fluctuations.
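The rate comparison in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the definition given in the abstract (1 Bubnoff = 1 micron per year) and the order-of-magnitude rates quoted there; the function and variable names are hypothetical.

```python
MICRONS_PER_METRE = 1_000_000
YEARS_PER_KYR = 1_000


def bubnoffs_to_metres_per_kyr(rate_bubnoffs):
    """Convert a rate in Bubnoffs (microns per year) to metres per 1000 years."""
    return rate_bubnoffs * YEARS_PER_KYR / MICRONS_PER_METRE


# Order-of-magnitude rates quoted in the abstract (in Bubnoffs).
holocene_platform_growth = 10**3   # reefs and platforms during the Holocene transgression
long_term_subsidence_max = 10**2   # upper end of subsidence / third-order sea-level cycles

growth_m_per_kyr = bubnoffs_to_metres_per_kyr(holocene_platform_growth)
subsidence_m_per_kyr = bubnoffs_to_metres_per_kyr(long_term_subsidence_max)

print(f"Platform growth: {growth_m_per_kyr} m per 1000 years")
print(f"Subsidence (upper bound): {subsidence_m_per_kyr} m per 1000 years")
print(f"Growth exceeds subsidence by a factor of {holocene_platform_growth // long_term_subsidence_max}")
```

Even at the upper end of the quoted subsidence range, growth is faster by an order of magnitude, which is the basis for the abstract's argument that sea-level pulses alone rarely drown a healthy platform.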

  9. Sequence Data for Clostridium autoethanogenum using Three Generations of Sequencing Technologies

    DOE PAGESBeta

    Utturkar, Sagar M.; Klingeman, Dawn Marie; Bruno-Barcena, José M.; Chinn, Mari S.; Grunden, Amy; Köpke, Michael; Brown, Steven D.

    2015-04-14

    During the past decade, DNA sequencing output has been mostly dominated by the second generation sequencing platforms (for example, Illumina), which are characterized by low cost, high throughput and shorter read lengths. The emergence and development of so-called third generation sequencing platforms such as PacBio has permitted exceptionally long reads (over 20 kb) to be generated. Due to read length increases, algorithm improvements and hybrid assembly approaches, the concept of one chromosome, one contig and automated finishing of microbial genomes is now a realistic and achievable task for many microbial laboratories. In this paper, we describe high quality sequence datasets which span three generations of sequencing technologies, containing six types of data from four NGS platforms and originating from a single microorganism, Clostridium autoethanogenum. The dataset reported here will be useful for the scientific community to evaluate upcoming NGS platforms, enabling comparison of existing and novel bioinformatics approaches and will encourage interest in the development of innovative experimental and computational methods for NGS data.

  10. Next generation sequencing methodologies--an overview.

    PubMed

    Pickrell, William O; Rees, Mark I; Chung, Seo-Kyung

    2012-01-01

    Gene discovery has been one of the most important advances in our understanding of human disorders. Early linkage and positional cloning strategies have now given way to next generation sequencing (NGS) with age-old help from biostatistical and bioinformatical input. In this chapter, we present the importance of getting the basics right, namely, how the best phenotyping in the clinical domain will provide a higher chance of a successful NGS experiment. In addition, we show getting the correct submission of DNA samples to NGS providers is dependent on the type of inheritance pattern that may or may not be apparent. We discuss one of the most crucial decisions for investigators when designing a study, namely choosing a trio, quad or cohort for analysis. Following on from this, we compare and contrast the underlying technology adopted by provider companies as they vie for customers and submissions. Each platform has advantages and disadvantages based on false calls, coverage, and read depth; however, some of these issues may be solved with the third wave of sequencing technology development in early commercial roll-out. Lastly, we provide a bioinformatic filtering overview of a "quad"-based submission and show how 3 million SNPs and indels can be reduced to a biologically plausible and experimentally manageable n≤50 gene variants. PMID:23046880

  11. Field guide to next-generation DNA sequencers.

    PubMed

    Glenn, Travis C

    2011-09-01

    The diversity of available 2nd and 3rd generation DNA sequencing platforms is increasing rapidly. Costs for these systems range from < $100,000 to more than $1,000,000, with instrument run times ranging from minutes to weeks. Extensive trade-offs exist among these platforms. I summarize the major characteristics of each commercially available platform to enable direct comparisons. In terms of cost per megabase (Mb) of sequence, the Illumina and SOLiD platforms are clearly superior (≤ $0.10/Mb vs. > $10/Mb for 454 and some Ion Torrent chips). In terms of cost per nonmultiplexed sample and instrument run time, the Pacific Biosciences and Ion Torrent platforms excel, with the 454 GS Junior and Illumina MiSeq also notable in this regard. All platforms allow multiplexing of samples, but details of library preparation, experimental design and data analysis can constrain the options. The wide range of characteristics among available platforms provides opportunities both to conduct groundbreaking studies and to waste money on scales that were previously infeasible. Thus, careful thought about the desired characteristics of these systems is warranted before purchasing or using any of them. Updated information from this guide will be maintained at: http://dna.uga.edu/ and http://tomato.biol.trinity.edu/blog/.

  12. Application of genotyping-by-sequencing on semiconductor sequencing platforms: a comparison of genetic and reference-based marker ordering in barley.

    PubMed

    Mascher, Martin; Wu, Shuangye; Amand, Paul St; Stein, Nils; Poland, Jesse

    2013-01-01

    The rapid development of next-generation sequencing platforms has enabled the use of sequencing for routine genotyping across a range of genetics studies and breeding applications. Genotyping-by-sequencing (GBS), a low-cost, reduced representation sequencing method, is becoming a common approach for whole-genome marker profiling in many species. With quickly developing sequencing technologies, adapting current GBS methodologies to new platforms will leverage these advancements for future studies. To test new semiconductor sequencing platforms for GBS, we genotyped a barley recombinant inbred line (RIL) population. Based on a previous GBS approach, we designed bar code and adapter sets for the Ion Torrent platforms. Four sets of 24-plex libraries were constructed consisting of 94 RILs and the two parents and sequenced on two Ion platforms. In parallel, a 96-plex library of the same RILs was sequenced on the Illumina HiSeq 2000. We applied two different computational pipelines to analyze sequencing data; the reference-independent TASSEL pipeline and a reference-based pipeline using SAMtools. Sequence contigs positioned on the integrated physical and genetic map were used for read mapping and variant calling. We found high agreement in genotype calls between the different platforms and high concordance between genetic and reference-based marker order. There was, however, paucity in the number of SNP that were jointly discovered by the different pipelines indicating a strong effect of alignment and filtering parameters on SNP discovery. We show the utility of the current barley genome assembly as a framework for developing very low-cost genetic maps, facilitating high resolution genetic mapping and negating the need for developing de novo genetic maps for future studies in barley. Through demonstration of GBS on semiconductor sequencing platforms, we conclude that the GBS approach is amenable to a range of platforms and can easily be modified as new sequencing

  13. Next-generation sequencing strategies for characterizing the turkey genome.

    PubMed

    Dalloul, Rami A; Zimin, Aleksey V; Settlage, Robert E; Kim, Sungwon; Reed, Kent M

    2014-02-01

    The turkey genome sequencing project was initiated in 2008 and has relied primarily on next-generation sequencing (NGS) technologies. Our first efforts used a synergistic combination of 2 NGS platforms (Roche/454 and Illumina GAII), detailed bacterial artificial chromosome (BAC) maps, and unique assembly tools to sequence and assemble the genome of the domesticated turkey, Meleagris gallopavo. Since the first release in 2010, efforts to improve the genome assembly, gene annotation, and genomic analyses continue. The initial assembly build (2.01) represented about 89% of the genome sequence with 17X coverage depth (931 Mb). Sequence contigs were assigned to 30 of the 40 chromosomes with approximately 10% of the assembled sequence corresponding to unassigned chromosomes (ChrUn). The sequence has been refined through both genome-wide and area-focused sequencing, including shotgun and paired-end sequencing, and targeted sequencing of chromosomal regions with low or incomplete coverage. These additional efforts have improved the sequence assembly resulting in 2 subsequent genome builds of higher genome coverage (25X/Build3.0 and 30X/Build4.0) with a current sequence totaling 1,010 Mb. Further, BAC with end sequences assigned to the Z/W and MG18 (MHC) chromosomes, ChrUn, or not placed in the previous build were isolated, deeply sequenced (Hi-Seq), and incorporated into the latest build (5.0). To aid in the annotation and to generate a gene expression atlas of major tissues, a comprehensive set of RNA samples was collected at various developmental stages of female and male turkeys. Transcriptome sequencing data (using Illumina Hi-Seq) will provide information to enhance the final assembly and ultimately improve sequence annotation. The most current sequence covers more than 95% of the turkey genome and should yield a much improved gene level of annotation, making it a valuable resource for studying genetic variations underlying economically important traits in poultry.

  14. Construction of a rationally designed antibody platform for sequencing-assisted selection.

    PubMed

    Larman, H Benjamin; Xu, George Jing; Pavlova, Natalya N; Elledge, Stephen J

    2012-11-01

    Antibody discovery platforms have become an important source of both therapeutic biomolecules and research reagents. Massively parallel DNA sequencing can be used to assist antibody selection by comprehensively monitoring libraries during selection, thus greatly expanding the power of these systems. We have therefore constructed a rationally designed, fully defined single-chain variable fragment (scFv) library and analysis platform optimized for analysis with short-read deep sequencing. Sequence-defined oligonucleotide libraries encoding three complementarity-determining regions (L3 from the light chain, H2 and H3 from the heavy chain) were synthesized on a programmable microarray and combinatorially cloned into a single scFv framework for molecular display. Our unique complementarity-determining region sequence design optimizes for protein binding by utilizing a hidden Markov model that was trained on all antibody-antigen cocrystal structures in the Protein Data Bank. The resultant ~10^12-member library was produced in ribosome-display format, and comprehensively analyzed over four rounds of antigen selections by multiplex paired-end Illumina sequencing. The hidden Markov model scFv library generated multiple binders against an emerging cancer antigen and is the basis for a next-generation antibody production platform. PMID:23064642

  15. Concept For Generation Of Long Pseudorandom Sequences

    NASA Technical Reports Server (NTRS)

    Wang, C. C.

    1990-01-01

    Conceptual very-large-scale integrated (VLSI) digital circuit performs exponentiation in finite field. Algorithm that generates unusually long sequences of pseudorandom numbers executed by digital processor that includes such circuits. Concepts particularly advantageous for such applications as spread-spectrum communications, cryptography, and generation of ranging codes, synthetic noise, and test data, where usually desirable to make pseudorandom sequences as long as possible.

  16. Metagenomics using next-generation sequencing.

    PubMed

    Bragg, Lauren; Tyson, Gene W

    2014-01-01

    Traditionally, microbial genome sequencing has been restricted to the small number of species that can be grown in pure culture. The progressive development of culture-independent methods over the last 15 years now allows researchers to sequence microbial communities directly from environmental samples. This approach is commonly referred to as "metagenomics" or "community genomics". However, the term metagenomics is applied liberally in the literature to describe any culture-independent analysis of microbial communities. Here, we define metagenomics as shotgun ("random") sequencing of the genomic DNA of a sample taken directly from the environment. The metagenome can be thought of as a sampling of the collective genome of the microbial community. We outline the considerations and analyses that should be undertaken to ensure the success of a metagenomic sequencing project, including the choice of sequencing platform and methods for assembly, binning, annotation, and comparative analysis. PMID:24515370

  17. Metagenomics using next-generation sequencing.

    PubMed

    Bragg, Lauren; Tyson, Gene W

    2014-01-01

    Traditionally, microbial genome sequencing has been restricted to the small number of species that can be grown in pure culture. The progressive development of culture-independent methods over the last 15 years now allows researchers to sequence microbial communities directly from environmental samples. This approach is commonly referred to as "metagenomics" or "community genomics". However, the term metagenomics is applied liberally in the literature to describe any culture-independent analysis of microbial communities. Here, we define metagenomics as shotgun ("random") sequencing of the genomic DNA of a sample taken directly from the environment. The metagenome can be thought of as a sampling of the collective genome of the microbial community. We outline the considerations and analyses that should be undertaken to ensure the success of a metagenomic sequencing project, including the choice of sequencing platform and methods for assembly, binning, annotation, and comparative analysis.

  18. Next-generation sequencing - feasibility and practicality in haematology.

    PubMed

    Kohlmann, Alexander; Grossmann, Vera; Nadarajah, Niroshan; Haferlach, Torsten

    2013-03-01

    Next-generation sequencing platforms have evolved to provide an accurate and comprehensive means for the detection of molecular mutations in heterogeneous tumour specimens. Here, we review the feasibility and practicality of this novel laboratory technology. In particular, we focus on the utility of next-generation sequencing technology in characterizing haematological neoplasms and the landmark findings in key haematological malignancies. We also discuss deep-sequencing strategies to analyse the constantly increasing number of molecular markers applied for disease classification, patient stratification and individualized monitoring of minimal residual disease. Although many facets of this assay need to be taken into account, amplicon deep-sequencing has already demonstrated a promising technical performance and is being continuously developed towards routine application in diagnostic laboratories so that an impact on clinical practice can be achieved.

  19. Utilization of Benchtop Next Generation Sequencing Platforms Ion Torrent PGM and MiSeq in Noninvasive Prenatal Testing for Chromosome 21 Trisomy and Testing of Impact of In Silico and Physical Size Selection on Its Analytical Performance

    PubMed Central

    Minarik, Gabriel; Repiska, Gabriela; Hyblova, Michaela; Nagyova, Emilia; Soltys, Katarina; Budis, Jaroslav; Duris, Frantisek; Sysak, Rastislav; Gerykova Bujalkova, Maria; Vlkova-Izrael, Barbora; Biro, Orsolya; Nagy, Balint; Szemes, Tomas

    2015-01-01

    Objectives The aims of this study were to test the utility of benchtop NGS platforms for NIPT for trisomy 21 using previously published z score calculation methods and to optimize the sample preparation and data analysis with use of in silico and physical size selection methods. Methods Samples from 130 pregnant women were analyzed by whole genome sequencing on benchtop NGS systems Ion Torrent PGM and MiSeq. The targeted yield of 3 million raw reads on each platform was used for z score calculation. The impact of in silico and physical size selection on analytical performance of the test was studied. Results Using a z score value of 3 as the cut-off, 98.11% - 100% (104-106/106) specificity and 100% (24/24) sensitivity and 99.06% - 100% (105-106/106) specificity and 100% (24/24) sensitivity were observed for Ion Torrent PGM and MiSeq, respectively. After in silico based size selection both platforms reached 100% specificity and sensitivity. Following the physical size selection z scores of tested trisomic samples increased significantly—p = 0.0141 and p = 0.025 for Ion Torrent PGM and MiSeq, respectively. Conclusions Noninvasive prenatal testing for chromosome 21 trisomy with the utilization of benchtop NGS systems led to results equivalent to previously published studies performed on high-to-ultrahigh throughput NGS systems. The in silico size selection led to higher specificity of the test. Physical size selection performed on isolated DNA led to significant increase in z scores. The observed results could represent a basis for increasing of cost effectiveness of the test and thus help with its penetration worldwide. PMID:26669558
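The sensitivity and specificity figures reported in this abstract follow directly from the quoted counts. The sketch below is a minimal illustration, not the authors' pipeline: the function names are hypothetical, and treating a sample as positive when its z score is strictly greater than the cut-off of 3 is an assumption, since the abstract does not say how ties are handled.

```python
def classify_by_z_score(z_scores, cutoff=3.0):
    """Flag each sample as trisomy-positive when its z score exceeds the cut-off.

    Assumption: strict inequality (z > cutoff); the abstract only names the cut-off value.
    """
    return [z > cutoff for z in z_scores]


def sensitivity(true_positives, false_negatives):
    """Fraction of truly trisomic samples that were called positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of truly euploid samples that were called negative."""
    return true_negatives / (true_negatives + false_positives)


# Counts quoted in the abstract: 24 trisomic and 106 euploid samples.
print(f"Sensitivity, 24/24:   {sensitivity(24, 0):.2%}")
print(f"Specificity, 104/106: {specificity(104, 2):.2%}")
print(f"Specificity, 105/106: {specificity(105, 1):.2%}")
print(f"Specificity, 106/106: {specificity(106, 0):.2%}")
```

With 106 euploid samples, each false positive lowers specificity by just under one percentage point, which is why the reported ranges are 98.11%-100% (104-106 of 106) and 99.06%-100% (105-106 of 106).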

  20. ADS: The Next Generation Search Platform

    NASA Astrophysics Data System (ADS)

    Accomazzi, A.; Kurtz, M. J.; Henneken, E. A.; Chyla, R.; Luker, J.; Grant, C. S.; Thompson, D. M.; Holachek, A.; Dave, R.; Murray, S. S.

    2015-04-01

    Four years after the last LISA meeting, the NASA Astrophysics Data System (ADS) finds itself in the middle of major changes to the infrastructure and contents of its database. In this paper we highlight a number of features of great importance to librarians and discuss the additional functionality that we are currently developing. Our citation coverage has doubled since 2010 and now consists of over 10 million citations. We are normalizing the affiliation information in our records and we have started collecting and linking funding sources with papers in our system. At the same time, we are undergoing major technology changes in the ADS platform. We have rolled out and are now enhancing a new high-performance search engine capable of performing full-text as well as metadata searches using an intuitive query language. We are currently able to index acknowledgments, affiliations, citations, and funding sources. While this effort is still ongoing, some of its benefits are already available through the ADS Labs user interface and API at http://adslabs.org/adsabs/.

  1. NG6: Integrated next generation sequencing storage and processing environment

    PubMed Central

    2012-01-01

    Background Next generation sequencing platforms are now well implanted in sequencing centres and some laboratories. Upcoming smaller scale machines such as the 454 junior from Roche or the MiSeq from Illumina will increase the number of laboratories hosting a sequencer. In such a context, it is important to provide these teams with an easily manageable environment to store and process the produced reads. Results We describe a user-friendly information system able to manage large sets of sequencing data. It includes, on one hand, a workflow environment already containing pipelines adapted to different input formats (sff, fasta, fastq and qseq), different sequencers (Roche 454, Illumina HiSeq) and various analyses (quality control, assembly, alignment, diversity studies,…) and, on the other hand, a secured web site giving access to the results. The connected user will be able to download raw and processed data and browse through the analysis result statistics. The provided workflows can easily be modified or extended and new ones can be added. Ergatis is used as a workflow building, running and monitoring system. The analyses can be run locally or in a cluster environment using Sun Grid Engine. Conclusions NG6 is a complete information system designed to answer the needs of a sequencing platform. It provides a user-friendly interface to process, store and download high-throughput sequencing data. PMID:22958229

  2. Next-generation sequencing discoveries in lymphoma.

    PubMed

    Slack, Graham W; Gascoyne, Randy D

    2013-03-01

    Since the mapping of the human genome and the advent of next-generation sequencing technology thorough examination of the cancer genome has become a reality. Over the last few years several studies have used next-generation sequencing technology to investigate the genetic landscape of Hodgkin and non-Hodgkin lymphomas, identifying novel genetic mutations and gene rearrangements that have shed new light on the underlying tumor biology in these diseases as well as identifying possible targets for directed therapy. This review covers the major discoveries in lymphoma using next-generation sequencing technology.

  3. Iterative method for generating correlated binary sequences

    NASA Astrophysics Data System (ADS)

    Usatenko, O. V.; Melnik, S. S.; Apostolov, S. S.; Makarov, N. M.; Krokhin, A. A.

    2014-11-01

    We propose an efficient iterative method for generating random correlated binary sequences with a prescribed correlation function. The method is based on consecutive linear modulations of an initially uncorrelated sequence into a correlated one. Each step of modulation increases the correlations until the desired level has been reached. The robustness and efficiency of the proposed algorithm are tested by generating sequences with inverse power-law correlations. The substantial increase in the strength of correlation in the iterative method with respect to single-step filtering generation is shown for all studied correlation functions. Our results can be used for design of disordered superlattices, waveguides, and surfaces with selective transport properties.

  4. Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform

    PubMed Central

    Schirmer, Melanie; Ijaz, Umer Z.; D'Amore, Rosalinda; Hall, Neil; Sloan, William T.; Quince, Christopher

    2015-01-01

    With read lengths of currently up to 2 × 300 bp, high throughput and low sequencing costs Illumina's MiSeq is becoming one of the most utilized sequencing platforms worldwide. The platform is manageable and affordable even for smaller labs. This enables quick turnaround on a broad range of applications such as targeted gene sequencing, metagenomics, small genome sequencing and clinical molecular diagnostics. However, Illumina error profiles are still poorly understood and programs are therefore not designed for the idiosyncrasies of Illumina data. A better knowledge of the error patterns is essential for sequence analysis and vital if we are to draw valid conclusions. Studying true genetic variation in a population sample is fundamental for understanding diseases, evolution and origin. We conducted a large study on the error patterns for the MiSeq based on 16S rRNA amplicon sequencing data. We tested state-of-the-art library preparation methods for amplicon sequencing and showed that the library preparation method and the choice of primers are the most significant sources of bias and cause distinct error patterns. Furthermore we tested the efficiency of various error correction strategies and identified quality trimming (Sickle) combined with error correction (BayesHammer) followed by read overlapping (PANDAseq) as the most successful approach, reducing substitution error rates on average by 93%. PMID:25586220

  5. Software for pre-processing Illumina next-generation sequencing short read sequences

    PubMed Central

    2014-01-01

    Background When compared to Sanger sequencing technology, next-generation sequencing (NGS) technologies are hindered by shorter sequence read length, higher base-call error rate, non-uniform coverage, and platform-specific sequencing artifacts. These characteristics lower the quality of their downstream analyses, e.g. de novo and reference-based assembly, by introducing sequencing artifacts and errors that may contribute to incorrect interpretation of data. Although many tools have been developed for quality control and pre-processing of NGS data, none of them provide flexible and comprehensive trimming options in conjunction with parallel processing to expedite pre-processing of large NGS datasets. Methods We developed ngsShoRT (next-generation sequencing Short Reads Trimmer), a flexible and comprehensive open-source software package written in Perl that provides a set of algorithms commonly used for pre-processing NGS short read sequences. We compared the features and performance of ngsShoRT with existing tools: CutAdapt, NGS QC Toolkit and Trimmomatic. We also compared the effects of using pre-processed short read sequences generated by different algorithms on de novo and reference-based assembly for three different genomes: Caenorhabditis elegans, Saccharomyces cerevisiae S288c, and Escherichia coli O157 H7. Results Several combinations of ngsShoRT algorithms were tested on publicly available Illumina GA II, HiSeq 2000, and MiSeq eukaryotic and bacteria genomic short read sequences with the focus on removing sequencing artifacts and low-quality reads and/or bases. Our results show that across three organisms and three sequencing platforms, trimming improved the mean quality scores of trimmed sequences. Using trimmed sequences for de novo and reference-based assembly improved assembly quality as well as assembler performance. In general, ngsShoRT outperformed comparable trimming tools in terms of trimming speed and improvement of de novo and reference

  6. Next generation sequencing data of a defined microbial mock community

    PubMed Central

    Singer, Esther; Andreopoulos, Bill; Bowers, Robert M.; Lee, Janey; Deshpande, Shweta; Chiniquy, Jennifer; Ciobanu, Doina; Klenk, Hans-Peter; Zane, Matthew; Daum, Christopher; Clum, Alicia; Cheng, Jan-Fang; Copeland, Alex; Woyke, Tanja

    2016-01-01

    Generating sequence data of a defined community composed of organisms with complete reference genomes is indispensable for the benchmarking of new genome sequence analysis methods, including assembly and binning tools. Moreover the validation of new sequencing library protocols and platforms to assess critical components such as sequencing errors and biases relies on such datasets. We here report the next generation metagenomic sequence data of a defined mock community (Mock Bacteria ARchaea Community; MBARC-26), composed of 23 bacterial and 3 archaeal strains with finished genomes. These strains span 10 phyla and 14 classes, a range of GC contents, genome sizes, repeat content and encompass a diverse abundance profile. Short read Illumina and long-read PacBio SMRT sequences of this mock community are described. These data represent a valuable resource for the scientific community, enabling extensive benchmarking and comparative evaluation of bioinformatics tools without the need to simulate data. As such, these data can aid in improving our current sequence data analysis toolkit and spur interest in the development of new tools. PMID:27673566

  7. Comparison of next-generation sequencing systems.

    PubMed

    Liu, Lin; Li, Yinhu; Li, Siliang; Hu, Ni; He, Yimin; Pong, Ray; Lin, Danni; Lu, Lihua; Law, Maggie

    2012-01-01

    With fast development and wide applications of next-generation sequencing (NGS) technologies, genomic sequence information is within reach to aid the achievement of goals to decode life mysteries, make better crops, detect pathogens, and improve life qualities. NGS systems are typically represented by SOLiD/Ion Torrent PGM from Life Sciences, Genome Analyzer/HiSeq 2000/MiSeq from Illumina, and GS FLX Titanium/GS Junior from Roche. Beijing Genomics Institute (BGI), which possesses the world's biggest sequencing capacity, has multiple NGS systems including 137 HiSeq 2000, 27 SOLiD, one Ion Torrent PGM, one MiSeq, and one 454 sequencer. We have accumulated extensive experience in sample handling, sequencing, and bioinformatics analysis. In this paper, technologies of these systems are reviewed, and first-hand data from extensive experience is summarized and analyzed to discuss the advantages and specifics associated with each sequencing system. Finally, applications of NGS are summarized.

  8. SNP Discovery through Next-Generation Sequencing and Its Applications

    PubMed Central

    Kumar, Santosh; Banks, Travis W.; Cloutier, Sylvie

    2012-01-01

    The decreasing cost along with rapid progress in next-generation sequencing and related bioinformatics computing resources has facilitated large-scale discovery of SNPs in various model and nonmodel plant species. Large numbers and genome-wide availability of SNPs make them the marker of choice in partially or completely sequenced genomes. Although excellent reviews have been published on next-generation sequencing, its associated bioinformatics challenges, and the applications of SNPs in genetic studies, a comprehensive review connecting these three intertwined research areas is needed. This paper touches upon various aspects of SNP discovery, highlighting key points in availability and selection of appropriate sequencing platforms, bioinformatics pipelines, SNP filtering criteria, and applications of SNPs in genetic analyses. The use of next-generation sequencing methodologies in many non-model crops leading to discovery and implementation of SNPs in various genetic studies is discussed. Development and improvement of bioinformatics software that are open source and freely available have accelerated the SNP discovery while reducing the associated cost. Key considerations for SNP filtering and associated pipelines are discussed in specific topics. A list of commonly used software and their sources is compiled for easy access and reference. PMID:23227038

  9. Standardization and quality management in next-generation sequencing.

    PubMed

    Endrullat, Christoph; Glökler, Jörn; Franke, Philipp; Frohme, Marcus

    2016-09-01

    DNA sequencing continues to evolve quickly even after > 30 years. Many new platforms suddenly appeared and former established systems have vanished in almost the same manner. Since establishment of next-generation sequencing devices, this progress gains momentum due to the continually growing demand for higher throughput, lower costs and better quality of data. In consequence of this rapid development, standardized procedures and data formats as well as comprehensive quality management considerations are still scarce. Here, we listed and summarized current standardization efforts and quality management initiatives from companies, organizations and societies in form of published studies and ongoing projects. These comprise on the one hand quality documentation issues like technical notes, accreditation checklists and guidelines for validation of sequencing workflows. On the other hand, general standard proposals and quality metrics are developed and applied to the sequencing workflow steps with the main focus on upstream processes. Finally, certain standard developments for downstream pipeline data handling, processing and storage are discussed in brief. These standardization approaches represent a first basis for continuing work in order to prospectively implement next-generation sequencing in important areas such as clinical diagnostics, where reliable results and fast processing is crucial. Additionally, these efforts will exert a decisive influence on traceability and reproducibility of sequence data. PMID:27668169

  11. Double-digest RAD sequencing using Ion Proton semiconductor platform (ddRADseq-ion) with nonmodel organisms.

    PubMed

    Recknagel, Hans; Jacobs, Arne; Herzyk, Pawel; Elmer, Kathryn R

    2015-11-01

    Research in evolutionary biology involving nonmodel organisms is rapidly shifting from using traditional molecular markers such as mtDNA and microsatellites to higher throughput SNP genotyping methodologies to address questions in population genetics, phylogenetics and genetic mapping. Restriction site associated DNA sequencing (RAD sequencing or RADseq) has become an established method for SNP genotyping on Illumina sequencing platforms. Here, we developed a protocol and adapters for double-digest RAD sequencing for Ion Torrent (Life Technologies; Ion Proton, Ion PGM) semiconductor sequencing. We sequenced thirteen genomic libraries of three different nonmodel vertebrate species on Ion Proton with PI chips: Arctic charr Salvelinus alpinus, European whitefish Coregonus lavaretus and common lizard Zootoca vivipara. This resulted in ~962 million single-end reads overall and a mean of ~74 million reads per library. We filtered the genomic data using Stacks, a bioinformatic tool to process RAD sequencing data. On average, we obtained ~11,000 polymorphic loci per library of 6-30 individuals. We validate our new method by technical and biological replication, by reconstructing phylogenetic relationships, and using a hybrid genetic cross to track genomic variants. Finally, we discuss the differences between using the different sequencing platforms in the context of RAD sequencing, assessing possible advantages and disadvantages. We show that our protocol can be used for Ion semiconductor sequencing platforms for the rapid and cost-effective generation of variable and reproducible genetic markers. PMID:25808755

  12. Massively parallel multiplex DNA sequencing for specimen identification using an Illumina MiSeq platform.

    PubMed

    Shokralla, Shadi; Porter, Teresita M; Gibson, Joel F; Dobosz, Rafal; Janzen, Daniel H; Hallwachs, Winnie; Golding, G Brian; Hajibabaei, Mehrdad

    2015-04-17

    Genetic information is a valuable component of biosystematics, especially specimen identification through the use of species-specific DNA barcodes. Although many genomics applications have shifted to High-Throughput Sequencing (HTS) or Next-Generation Sequencing (NGS) technologies, sample identification (e.g., via DNA barcoding) is still most often done with Sanger sequencing. Here, we present a scalable double dual-indexing approach using an Illumina MiSeq platform to sequence DNA barcode markers. We achieved 97.3% success by using half of an Illumina MiSeq flowcell to obtain 658 base pairs of the cytochrome c oxidase I DNA barcode in 1,010 specimens from eleven orders of arthropods. Our approach recovers a greater proportion of DNA barcode sequences from individuals than does conventional Sanger sequencing, while at the same time reducing both per specimen costs and labor time by nearly 80%. In addition, the use of HTS allows the recovery of multiple sequences per specimen, for deeper analysis of genetic variation in target gene regions.

  13. Massively parallel multiplex DNA sequencing for specimen identification using an Illumina MiSeq platform

    PubMed Central

    Shokralla, Shadi; Porter, Teresita M.; Gibson, Joel F.; Dobosz, Rafal; Janzen, Daniel H.; Hallwachs, Winnie; Golding, G. Brian; Hajibabaei, Mehrdad

    2015-01-01

    Genetic information is a valuable component of biosystematics, especially specimen identification through the use of species-specific DNA barcodes. Although many genomics applications have shifted to High-Throughput Sequencing (HTS) or Next-Generation Sequencing (NGS) technologies, sample identification (e.g., via DNA barcoding) is still most often done with Sanger sequencing. Here, we present a scalable double dual-indexing approach using an Illumina MiSeq platform to sequence DNA barcode markers. We achieved 97.3% success by using half of an Illumina MiSeq flowcell to obtain 658 base pairs of the cytochrome c oxidase I DNA barcode in 1,010 specimens from eleven orders of arthropods. Our approach recovers a greater proportion of DNA barcode sequences from individuals than does conventional Sanger sequencing, while at the same time reducing both per specimen costs and labor time by nearly 80%. In addition, the use of HTS allows the recovery of multiple sequences per specimen, for deeper analysis of genetic variation in target gene regions. PMID:25884109
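
    The abstract above does not spell out its indexing scheme, so the following is only a minimal sketch of generic dual-index demultiplexing: each read carries two index sequences, and the pair is looked up in a sample sheet to assign the read to a specimen. The function name, the (i7, i5, sequence) tuple layout, the sample sheet and the specimen labels are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical sample sheet: (i7 index, i5 index) -> specimen label.
# With n i7 indexes and m i5 indexes, up to n * m specimens can be told apart.
SAMPLE_SHEET = {
    ("ACGTAC", "TTGCAA"): "specimen_001",
    ("ACGTAC", "GGATCC"): "specimen_002",
    ("CATGCA", "TTGCAA"): "specimen_003",
}


def demultiplex(reads, sample_sheet):
    """Group read sequences by specimen using exact index-pair matching.

    `reads` is an iterable of (i7_index, i5_index, sequence) tuples.
    Reads whose index pair is not in the sheet go to "undetermined".
    """
    bins = defaultdict(list)
    for i7, i5, sequence in reads:
        specimen = sample_sheet.get((i7, i5), "undetermined")
        bins[specimen].append(sequence)
    return dict(bins)


# Toy reads (made-up sequences, far shorter than a 658 bp barcode).
READS = [
    ("ACGTAC", "TTGCAA", "AACATTATATTTTATTTTTGG"),
    ("ACGTAC", "GGATCC", "AACTTTATACTTCATCTTTGG"),
    ("ACGTAC", "TTGCAA", "AACATTATATTTTATTTTCGG"),
    ("GGGGGG", "TTGCAA", "AACATTATATTTTATTTTTGG"),
]

BINS = demultiplex(READS, SAMPLE_SHEET)
for specimen, sequences in sorted(BINS.items()):
    print(specimen, len(sequences))
```

    Keeping every read for a specimen, rather than a single consensus, is what would allow the recovery of multiple sequences per specimen that the abstract mentions. Exact matching is the simplest choice; a real pipeline would usually tolerate index sequencing errors as well.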

  14. A comparison of rumen microbial profiles in dairy cows as retrieved by 454 Roche and Ion Torrent (PGM) sequencing platforms

    PubMed Central

    Indugu, Nagaraju; Bittinger, Kyle; Kumar, Sanjay; Vecchiarelli, Bonnie; Pitta, Dipti

    2016-01-01

    Next generation sequencing (NGS) technology is a widely accepted tool used by microbial ecologists to explore complex microbial communities in different ecosystems. As new NGS platforms continue to become available, it becomes imperative to compare data obtained from different platforms and analyze their effect on microbial community structure. In the present study, we compared sequencing data from both the 454 and Ion Torrent (PGM) platforms on the same DNA samples obtained from the rumen of dairy cows during their transition period. Despite the substantial difference in the number of reads, error rate and length of reads between the two platforms, we identified similar community composition between the two data sets. Procrustes analysis revealed similar correlations (M² = 0.319; P = 0.001) in the microbial community composition between the two platforms. Both platforms revealed the abundance of the same bacterial phyla which were Bacteroidetes and Firmicutes; however, PGM recovered an additional four phyla. Comparisons made at the genus level by each platform revealed differences in only a few genera such as Prevotella, Ruminococcus, Succiniclasticum and Treponema (p < 0.05; chi square test). Collectively, we conclude that the output generated from PGM and 454 yielded concurrent results, provided stringent bioinformatics pipelines are employed. PMID:26870608

  15. A comparison of rumen microbial profiles in dairy cows as retrieved by 454 Roche and Ion Torrent (PGM) sequencing platforms.

    PubMed

    Indugu, Nagaraju; Bittinger, Kyle; Kumar, Sanjay; Vecchiarelli, Bonnie; Pitta, Dipti

    2016-01-01

    Next generation sequencing (NGS) technology is a widely accepted tool used by microbial ecologists to explore complex microbial communities in different ecosystems. As new NGS platforms continue to become available, it becomes imperative to compare data obtained from different platforms and analyze their effect on microbial community structure. In the present study, we compared sequencing data from both the 454 and Ion Torrent (PGM) platforms on the same DNA samples obtained from the rumen of dairy cows during their transition period. Despite the substantial difference in the number of reads, error rate and length of reads between the two platforms, we identified similar community composition between the two data sets. Procrustes analysis revealed similar correlations (M² = 0.319; P = 0.001) in the microbial community composition between the two platforms. Both platforms revealed the abundance of the same bacterial phyla which were Bacteroidetes and Firmicutes; however, PGM recovered an additional four phyla. Comparisons made at the genus level by each platform revealed differences in only a few genera such as Prevotella, Ruminococcus, Succiniclasticum and Treponema (p < 0.05; chi square test). Collectively, we conclude that the output generated from PGM and 454 yielded concurrent results, provided stringent bioinformatics pipelines are employed. PMID:26870608

  16. Improving molecular diagnosis in epilepsy by a dedicated high-throughput sequencing platform.

    PubMed

    Della Mina, Erika; Ciccone, Roberto; Brustia, Francesca; Bayindir, Baran; Limongelli, Ivan; Vetro, Annalisa; Iascone, Maria; Pezzoli, Laura; Bellazzi, Riccardo; Perotti, Gianfranco; De Giorgis, Valentina; Lunghi, Simona; Coppola, Giangennaro; Orcesi, Simona; Merli, Pietro; Savasta, Salvatore; Veggiotti, Pierangelo; Zuffardi, Orsetta

    2015-03-01

    We analyzed by next-generation sequencing (NGS) 67 epilepsy genes in 19 patients with different types of either isolated or syndromic epileptic disorders and in 15 controls to investigate whether a quick and cheap molecular diagnosis could be provided. The average number of nonsynonymous and splice site mutations per subject was similar in the two cohorts indicating that, even with relatively small targeted platforms, finding the disease gene is not a univocal process. Our diagnostic yield was 47% with nine cases in which we identified a very likely causative mutation. In most of them no interpretation would have been possible in absence of detailed phenotype and familial information. Seven out of 19 patients had a phenotype suggesting the involvement of a specific gene. Disease-causing mutations were found in six of these cases. Among the remaining patients, we could find a probably causative mutation only in three. None of the genes affected in the latter cases had been suspected a priori. Our protocol requires 8-10 weeks including the investigation of the parents with a cost per patient comparable to sequencing of 1-2 medium-to-large-sized genes by conventional techniques. The platform we used, although providing much less information than whole-exome or whole-genome sequencing, has the advantage that it can also be run on 'benchtop' sequencers combining rapid turnaround times with higher manageability.

  17. Improving molecular diagnosis in epilepsy by a dedicated high-throughput sequencing platform

    PubMed Central

    Della Mina, Erika; Ciccone, Roberto; Brustia, Francesca; Bayindir, Baran; Limongelli, Ivan; Vetro, Annalisa; Iascone, Maria; Pezzoli, Laura; Bellazzi, Riccardo; Perotti, Gianfranco; De Giorgis, Valentina; Lunghi, Simona; Coppola, Giangennaro; Orcesi, Simona; Merli, Pietro; Savasta, Salvatore; Veggiotti, Pierangelo; Zuffardi, Orsetta

    2015-01-01

    We analyzed by next-generation sequencing (NGS) 67 epilepsy genes in 19 patients with different types of either isolated or syndromic epileptic disorders and in 15 controls to investigate whether a quick and cheap molecular diagnosis could be provided. The average number of nonsynonymous and splice site mutations per subject was similar in the two cohorts indicating that, even with relatively small targeted platforms, finding the disease gene is not a univocal process. Our diagnostic yield was 47% with nine cases in which we identified a very likely causative mutation. In most of them no interpretation would have been possible in absence of detailed phenotype and familial information. Seven out of 19 patients had a phenotype suggesting the involvement of a specific gene. Disease-causing mutations were found in six of these cases. Among the remaining patients, we could find a probably causative mutation only in three. None of the genes affected in the latter cases had been suspected a priori. Our protocol requires 8–10 weeks including the investigation of the parents with a cost per patient comparable to sequencing of 1–2 medium-to-large-sized genes by conventional techniques. The platform we used, although providing much less information than whole-exome or whole-genome sequencing, has the advantage that it can also be run on ‘benchtop’ sequencers combining rapid turnaround times with higher manageability. PMID:24848745

  18. Open-Phylo: a customizable crowd-computing platform for multiple sequence alignment.

    PubMed

    Kwak, Daniel; Kam, Alfred; Becerra, David; Zhou, Qikuan; Hops, Adam; Zarour, Eleyine; Kam, Arthur; Sarmenta, Luis; Blanchette, Mathieu; Waldispühl, Jérôme

    2013-01-01

    Citizen science games such as Galaxy Zoo, Foldit, and Phylo aim to harness the intelligence and processing power generated by crowds of online gamers to solve scientific problems. However, the selection of the data to be analyzed through these games is under the exclusive control of the game designers, and so are the results produced by gamers. Here, we introduce Open-Phylo, a freely accessible crowd-computing platform that enables any scientist to enter our system and use crowds of gamers to assist computer programs in solving one of the most fundamental problems in genomics: the multiple sequence alignment problem. PMID:24148814

  19. Impact of Next Generation Sequencing Techniques in Food Microbiology

    PubMed Central

    Mayo, Baltasar; Rachid, Caio T. C. C; Alegría, Ángel; Leite, Analy M. O; Peixoto, Raquel S; Delgado, Susana

    2014-01-01

    If Maxam-Gilbert and Sanger sequencing are taken as the first generation, recent years have seen an explosion of newly developed sequencing strategies, which are usually referred to as next generation sequencing (NGS) techniques. NGS techniques have high throughput and produce thousands or even millions of sequences at the same time. These sequences allow for the accurate identification of microbial taxa, including uncultivable organisms and those present in small numbers. In specific applications, NGS provides a complete inventory of all microbial operons and genes present or being expressed under different study conditions. NGS techniques are revolutionizing the field of microbial ecology and have recently been used to examine several food ecosystems. After a short introduction to the most common NGS systems and platforms, this review addresses how NGS techniques have been employed in the study of food microbiota and food fermentations, and discusses their limits and perspectives. The most important findings are reviewed, including those made in the study of the microbiota of milk, fermented dairy products, and plant-, meat- and fish-derived fermented foods. The knowledge that can be gained on microbial diversity, population structure and population dynamics via the use of these technologies could be vital in improving the monitoring and manipulation of foods and fermented food products. They should also improve their safety. PMID:25132799

  20. Next-Generation Sequencing for Binary Protein–Protein Interactions

    PubMed Central

    Suter, Bernhard; Zhang, Xinmin; Pesce, C. Gustavo; Mendelsohn, Andrew R.; Dinesh-Kumar, Savithramma P.; Mao, Jian-Hua

    2015-01-01

    The yeast two-hybrid (Y2H) system exploits host cell genetics in order to display binary protein–protein interactions (PPIs) via defined and selectable phenotypes. Numerous improvements have been made to this method, adapting the screening principle for diverse applications, including drug discovery and the scale-up for proteome wide interaction screens in human and other organisms. Here we discuss a systematic workflow and analysis scheme for screening data generated by Y2H and related assays that includes high-throughput selection procedures, readout of comprehensive results via next-generation sequencing (NGS), and the interpretation of interaction data via quantitative statistics. The novel assays and tools will serve the broader scientific community to harness the power of NGS technology to address PPI networks in health and disease. We discuss examples of how this next-generation platform can be applied to address specific questions in diverse fields of biology and medicine. PMID:26734059

  1. Next Generation Sequencing Reveals the Hidden Diversity of Zooplankton Assemblages

    PubMed Central

    Harmer, Rachel A.; Somerfield, Paul J.; Atkinson, Angus

    2013-01-01

    Background Zooplankton play an important role in our oceans, in biogeochemical cycling and providing a food source for commercially important fish larvae. However, difficulties in correctly identifying zooplankton hinder our understanding of their roles in marine ecosystem functioning, and can prevent detection of long term changes in their community structure. The advent of massively parallel next generation sequencing technology allows DNA sequence data to be recovered directly from whole community samples. Here we assess the ability of such sequencing to quantify richness and diversity of a mixed zooplankton assemblage from a productive time series site in the Western English Channel. Methodology/Principal Findings Plankton net hauls (200 µm) were taken at the Western Channel Observatory station L4 in September 2010 and January 2011. These samples were analysed by microscopy and metagenetic analysis of the 18S nuclear small subunit ribosomal RNA gene using the 454 pyrosequencing platform. Following quality control a total of 419,041 sequences were obtained for all samples. The sequences clustered into 205 operational taxonomic units using a 97% similarity cut-off. Allocation of taxonomy by comparison with the National Centre for Biotechnology Information database identified 135 OTUs to species level, 11 to genus level and 1 to order; <2.5% of sequences were classified as unknowns. By comparison, a skilled microscopic analyst was able to routinely enumerate only 58 taxonomic groups. Conclusions Metagenetics reveals a previously hidden taxonomic richness, especially for Copepoda and hard-to-identify meroplankton such as Bivalvia, Gastropoda and Polychaeta. It also reveals rare species and parasites. We conclude that Next Generation Sequencing of 18S amplicons is a powerful tool for elucidating the true diversity and species richness of zooplankton communities. While this approach allows for broad diversity assessments of plankton it may become increasingly

  2. Next-Generation Sequencing in Intellectual Disability.

    PubMed

    Carvill, Gemma L; Mefford, Heather C

    2015-09-01

    Next-generation sequencing technologies have revolutionized gene discovery in patients with intellectual disability (ID) and led to an unprecedented expansion in the number of genes implicated in this disorder. We discuss the strategies that have been used to identify these novel genes for both syndromic and nonsyndromic ID and highlight the phenotypic and genetic heterogeneity that underpin this condition. Finally, we discuss the future of defining the genetic etiology of ID, including the role of whole-genome sequencing, mosaicism, and the importance of diagnostic testing in ID. PMID:27617123

  3. Next-Generation Sequencing in Intellectual Disability

    PubMed Central

    Carvill, Gemma L.; Mefford, Heather C.

    2015-01-01

    Next-generation sequencing technologies have revolutionized gene discovery in patients with intellectual disability (ID) and led to an unprecedented expansion in the number of genes implicated in this disorder. We discuss the strategies that have been used to identify these novel genes for both syndromic and nonsyndromic ID and highlight the phenotypic and genetic heterogeneity that underpin this condition. Finally, we discuss the future of defining the genetic etiology of ID, including the role of whole-genome sequencing, mosaicism, and the importance of diagnostic testing in ID. PMID:27617123

  4. A repetitive sequence assembler based on next-generation sequencing.

    PubMed

    Lian, S; Tu, Y; Wang, Y; Chen, X; Wang, L

    2016-01-01

    Repetitive sequences of variable length are common in almost all eukaryotic genomes, and most of them are presumed to have important biomedical functions and can cause genomic instability. Next-generation sequencing (NGS) technologies provide the possibility of identifying and capturing these repetitive sequences directly from the NGS data. In this study, we assessed the performance of leading assemblers, such as Velvet, SOAPdenovo, SGA, MSR-CA, Bambus2, ALLPATHS-LG, and ABySS, in identifying and capturing repeats using three real NGS datasets. Our results indicated that most of them performed poorly in capturing the repeats. Consequently, we proposed a repetitive sequence assembler, named NGSReper, for capturing repeats from NGS data. Simulated datasets were used to validate the feasibility of NGSReper. The results indicate that the completeness of repeat capture is up to 99%. Cross validation was performed on three real NGS datasets, and extensive comparisons indicate that NGSReper performed best in terms of completeness and accuracy in capturing repeats. In conclusion, NGSReper is an appropriate and suitable tool for capturing repeats directly from NGS data. PMID:27525861

  5. Short barcodes for next generation sequencing.

    PubMed

    Mir, Katharina; Neuhaus, Klaus; Bossert, Martin; Schober, Steffen

    2013-01-01

    We consider the design and evaluation of short barcodes, with a length between six and eight nucleotides, used for parallel sequencing on platforms where substitution errors dominate. Such codes should not only have good error correction properties; the code words should also fulfil certain biological constraints (experimental parameters). We compare published barcodes with codes obtained by two new construction methods, one based on the currently best known linear codes and the other a simple randomized construction. The evaluation is done with respect to error correction capabilities, barcode size, experimental parameters, and fundamental bounds on the code size and distance properties. We provide a list of codes for lengths between six and eight nucleotides, where for length eight, two substitution errors can be corrected. In fact, no code with larger minimum distance can exist.
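
    The link between minimum distance and error correction in the abstract above follows a standard coding-theory fact: a code with minimum Hamming distance d can correct up to floor((d - 1) / 2) substitution errors, so correcting two substitutions requires d of at least 5. The sketch below illustrates this on a toy set of length-eight barcodes. The barcode set and all function names are hypothetical, and the toy codewords are not from the paper (their long homopolymer runs would likely violate the biological constraints it mentions).

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(barcodes) -> int:
    """Smallest pairwise Hamming distance within the barcode set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def correctable_substitutions(barcodes) -> int:
    """Guaranteed number of correctable substitutions: floor((d - 1) / 2)."""
    return (min_distance(barcodes) - 1) // 2


def decode(observed: str, barcodes):
    """Return the unique barcode within the correction radius, else None."""
    radius = correctable_substitutions(barcodes)
    hits = [b for b in barcodes if hamming(observed, b) <= radius]
    return hits[0] if len(hits) == 1 else None


# Toy length-8 barcode set, chosen only so that the minimum distance is 5.
TOY_BARCODES = ["AAAAAAAA", "CCCCCAAA", "AAACCCCC"]

print(min_distance(TOY_BARCODES))
print(correctable_substitutions(TOY_BARCODES))
print(decode("ACAAAAAT", TOY_BARCODES))  # two substitutions from AAAAAAAA
```

    Because the correction radius is at most half the minimum distance, no observed barcode can fall within the radius of two different codewords, which is why `decode` can safely return the single hit.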

  6. CaPSID: A bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes

    PubMed Central

    2012-01-01

    Background It is now well established that nearly 20% of human cancers are caused by infectious agents, and the list of human oncogenic pathogens will grow in the future for a variety of cancer types. Whole tumor transcriptome and genome sequencing by next-generation sequencing technologies presents an unparalleled opportunity for pathogen detection and discovery in human tissues but requires development of new genome-wide bioinformatics tools. Results Here we present CaPSID (Computational Pathogen Sequence IDentification), a comprehensive bioinformatics platform for identifying, querying and visualizing both exogenous and endogenous pathogen nucleotide sequences in tumor genomes and transcriptomes. CaPSID includes a scalable, high performance database for data storage and a web application that integrates the genome browser JBrowse. CaPSID also provides useful metrics for sequence analysis of pre-aligned BAM files, such as gene and genome coverage, and is optimized to run efficiently on multiprocessor computers with low memory usage. Conclusions To demonstrate the usefulness and efficiency of CaPSID, we carried out a comprehensive analysis of both a simulated dataset and transcriptome samples from ovarian cancer. CaPSID correctly identified all of the human and pathogen sequences in the simulated dataset, while in the ovarian dataset CaPSID’s predictions were successfully validated in vitro. PMID:22901030

  7. Genome Walking by Next Generation Sequencing Approaches

    PubMed Central

    Volpicella, Mariateresa; Leoni, Claudia; Costanza, Alessandra; Fanizza, Immacolata; Placido, Antonio; Ceci, Luigi R.

    2012-01-01

    Genome Walking (GW) comprises a number of PCR-based methods for the identification of nucleotide sequences flanking known regions. The different methods have been used for several purposes: from de novo sequencing, useful for the identification of unknown regions, to the characterization of insertion sites for viruses and transposons. In the latter cases Genome Walking methods have been recently boosted by coupling to Next Generation Sequencing technologies. This review will focus on the development of several protocols for the application of Next Generation Sequencing (NGS) technologies to GW, which have been developed in the course of analysis of insertional libraries. These analyses find broad application in protocols for functional genomics and gene therapy. Thanks to the application of NGS technologies, the original vision of GW as a procedure for walking along an unknown genome is now changing into the possibility of observing the parallel marching of hundreds of thousands of primers across the borders of inserted DNA molecules in host genomes. PMID:24832505

  8. Microfluidic Platform Generates Oxygen Landscapes for Localized Hypoxic Activation

    PubMed Central

    Rexius, Megan L.; Mauleon, Gerardo; Malik, Asrar B.; Rehman, Jalees; Eddington, David T.

    2014-01-01

    An open-well microfluidic platform generates an oxygen landscape using gas-perfused networks which diffuse across a membrane. The device enables real-time analysis of cellular and tissue responses to oxygen tension to define how cells adapt to heterogeneous oxygen conditions found in the physiological setting. We demonstrate that localized hypoxic activation of cells elicited specific metabolic and gene responses in human microvascular endothelial cells and bone marrow-derived mesenchymal stem cells. A robust demonstration of the compatibility of the device with standard laboratory techniques demonstrates the wide utility of the method. This platform is ideally suited to study real-time cell responses and cell-cell interactions within physiologically relevant oxygen landscapes. PMID:25315003

  9. Clinical Integration of Next Generation Sequencing Technology

    PubMed Central

    Gullapalli, R.R.; Lyons-Weiler, M.; Petrosko, P.; Dhir, R.; Becich, M.J.; LaFramboise, W.A.

    2012-01-01

    Recent technological advances in Next Generation Sequencing (NGS) methods have substantially reduced cost and operational complexity leading to the production of bench top sequencers and commercial software solutions for implementation in small research and clinical laboratories. This chapter summarizes requirements and hurdles to the successful implementation of these systems including 1) calibration, validation and optimization of the instrumentation, experimental paradigm and primary readout, 2) secure transfer, storage and secondary processing of the data, 3) implementation of software tools for targeted analysis, and 4) training of research and clinical personnel to evaluate data fidelity and interpret the molecular significance of the genomic output. In light of the commercial and technological impetus to bring NGS technology into the clinical domain, it is critical that novel tests incorporate rigid protocols with built-in calibration standards and that data transfer and processing occur under exacting security measures for interpretation by clinicians with specialized training in molecular diagnostics. PMID:23078661

  10. Next Generation Sequencing in Endocrine Practice

    PubMed Central

    Forlenza, Gregory P.; Calhoun, Amy; Beckman, Kenneth B.; Halvorsen, Tanya; Hamdoun, Elwaseila; Zierhut, Heather; Sarafoglou, Kyriakie; Polgreen, Lynda E.; Miller, Bradley S.; Nathan, Brandon; Petryk, Anna

    2016-01-01

    With the completion of the Human Genome Project and advances in genomic sequencing technologies, the use of clinical molecular diagnostics has grown tremendously over the last decade. Next-generation sequencing (NGS) has overcome many of the practical roadblocks that had slowed the adoption of molecular testing for routine clinical diagnosis. In endocrinology, targeted NGS now complements biochemical testing and imaging studies. The goal of this review is to provide clinicians with a guide to the application of NGS to genetic testing for endocrine conditions, by compiling a list of established gene mutations detectable by NGS, and highlighting key phenotypic features of these disorders. As we outline in this review, the clinical utility of NGS-based molecular testing for endocrine disorders is very high. Identifying an exact genetic etiology improves understanding of the disease, provides clear explanation to families about the cause, and guides decisions about screening, prevention and/or treatment. PMID:25958132

  11. Discovery of posttranscriptional regulatory RNAs using next generation sequencing technologies.

    PubMed

    Gelderman, Grant; Contreras, Lydia M

    2013-01-01

    Next generation sequencing (NGS) has revolutionized the way by which we engineer metabolism by radically altering the path to genome-wide inquiries. This is due to the fact that NGS approaches offer several powerful advantages over traditional methods that include the ability to fully sequence hundreds to thousands of genes in a single experiment and simultaneously detect homozygous and heterozygous deletions, alterations in gene copy number, insertions, translocations, and exome-wide substitutions that include "hot-spot mutations." This chapter describes the use of these technologies as a sequencing technique for transcriptome analysis and discovery of regulatory RNA elements in the context of three main platforms: Illumina HiSeq, 454 pyrosequencing, and SOLiD sequencing. Specifically, this chapter focuses on the use of Illumina HiSeq, since it is the most widely used platform for RNA discovery and transcriptome analysis. Regulatory RNAs have now been found in all branches of life. In bacteria, noncoding small RNAs (sRNAs) are involved in highly sophisticated regulatory circuits that include quorum sensing, carbon metabolism, stress responses, and virulence (Gorke and Vogel, Gene Dev 22:2914-2925, 2008; Gottesman, Trends Genet 21:399-404, 2005; Romby et al., Curr Opin Microbiol 9:229-236, 2006). Further characterization of the underlying regulation of gene expression remains poorly understood given that it is estimated that over 60% of all predicted genes remain hypothetical and the 5' and 3' untranslated regions are unknown for more than 90% of the genes (Siegel et al., Trends Parasitol 27:434-441, 2011). Importantly, manipulation of the posttranscriptional regulation that occurs at the level of RNA stability and export, trans-splicing, polyadenylation, protein translation, and protein stability via untranslated regions (Clayton, EMBO J 21:1881-1888, 2002; Haile and Papadopoulou, Curr Opin Microbiol 10:569-577, 2007) could be highly beneficial to metabolic

  12. Advances in Alport syndrome diagnosis using next-generation sequencing

    PubMed Central

    Artuso, Rosangela; Fallerini, Chiara; Dosa, Laura; Scionti, Francesca; Clementi, Maurizio; Garosi, Guido; Massella, Laura; Epistolato, Maria Carmela; Mancini, Roberta; Mari, Francesca; Longo, Ilaria; Ariani, Francesca; Renieri, Alessandra; Bruttini, Mirella

    2012-01-01

    Alport syndrome (ATS) is a hereditary nephropathy often associated with sensorineural hypoacusis and ocular abnormalities. Mutations in the COL4A5 gene cause X-linked ATS. Mutations in COL4A4 and COL4A3 genes have been reported in both autosomal recessive and autosomal dominant ATS. The conventional mutation screening, performed by DHPLC and/or Sanger sequencing, is time-consuming and has relatively high costs because of the absence of hot spots and the high number of exons per gene: 51 (COL4A5), 48 (COL4A4) and 52 (COL4A3). Several months are usually necessary to complete the diagnosis, especially in cases with less informative pedigrees. To overcome these limitations, we designed a next-generation sequencing (NGS) protocol enabling simultaneous detection of all possible variants in the three genes. We used a method coupling selective amplification to the 454 Roche DNA sequencing platform (Genome Sequencer junior). The application of this technology allowed us to identify the second mutation in two ATS patients (p.Ser1147Phe in COL4A3 and p.Arg1682Trp in COL4A4) and to reconsider the diagnosis of ATS in a third patient. This study, therefore, illustrates the successful application of NGS to mutation screening of Mendelian disorders with locus heterogeneity. PMID:21897443

  13. Advances in Alport syndrome diagnosis using next-generation sequencing.

    PubMed

    Artuso, Rosangela; Fallerini, Chiara; Dosa, Laura; Scionti, Francesca; Clementi, Maurizio; Garosi, Guido; Massella, Laura; Epistolato, Maria Carmela; Mancini, Roberta; Mari, Francesca; Longo, Ilaria; Ariani, Francesca; Renieri, Alessandra; Bruttini, Mirella

    2012-01-01

    Alport syndrome (ATS) is a hereditary nephropathy often associated with sensorineural hypoacusis and ocular abnormalities. Mutations in the COL4A5 gene cause X-linked ATS. Mutations in COL4A4 and COL4A3 genes have been reported in both autosomal recessive and autosomal dominant ATS. The conventional mutation screening, performed by DHPLC and/or Sanger sequencing, is time-consuming and has relatively high costs because of the absence of hot spots and the high number of exons per gene: 51 (COL4A5), 48 (COL4A4) and 52 (COL4A3). Several months are usually necessary to complete the diagnosis, especially in cases with less informative pedigrees. To overcome these limitations, we designed a next-generation sequencing (NGS) protocol enabling simultaneous detection of all possible variants in the three genes. We used a method coupling selective amplification to the 454 Roche DNA sequencing platform (Genome Sequencer junior). The application of this technology allowed us to identify the second mutation in two ATS patients (p.Ser1147Phe in COL4A3 and p.Arg1682Trp in COL4A4) and to reconsider the diagnosis of ATS in a third patient. This study, therefore, illustrates the successful application of NGS to mutation screening of Mendelian disorders with locus heterogeneity.

  14. Next generation sequencing and its applications in forensic genetics.

    PubMed

    Børsting, Claus; Morling, Niels

    2015-09-01

    It has been almost a decade since the first next generation sequencing (NGS) technologies emerged and quickly changed the way genetic research is conducted. Today, full genomes are mapped and published almost weekly and with ever increasing speed and decreasing costs. NGS methods and platforms have matured during the last 10 years, and the quality of the sequences has reached a level where NGS is used in clinical diagnostics of humans. Forensic genetic laboratories have also explored NGS technologies and especially in the last year, there has been a small explosion in the number of scientific articles and presentations at conferences with forensic aspects of NGS. These contributions have demonstrated that NGS offers new possibilities for forensic genetic case work. More information may be obtained from unique samples in a single experiment by analyzing combinations of markers (STRs, SNPs, insertion/deletions, mRNA) that cannot be analyzed simultaneously with the standard PCR-CE methods used today. The true variation in core forensic STR loci has been uncovered, and previously unknown STR alleles have been discovered. The detailed sequence information may aid mixture interpretation and will increase the statistical weight of the evidence. In this review, we will give an introduction to NGS and single-molecule sequencing, and we will discuss the possible applications of NGS in forensic genetics.

  15. Revealing the Complexity of Breast Cancer by Next Generation Sequencing

    PubMed Central

    Verigos, John; Magklara, Angeliki

    2015-01-01

    Over the last few years the increasing usage of “-omic” platforms, supported by next-generation sequencing, in the analysis of breast cancer samples has tremendously advanced our understanding of the disease. New driver and passenger mutations, rare chromosomal rearrangements and other genomic aberrations identified by whole genome and exome sequencing are providing missing pieces of the genomic architecture of breast cancer. High resolution maps of breast cancer methylomes and sequencing of the miRNA microworld are beginning to paint the epigenomic landscape of the disease. Transcriptomic profiling is giving us a glimpse into the gene regulatory networks that govern the fate of the breast cancer cell. At the same time, integrative analysis of sequencing data confirms an extensive intertumor and intratumor heterogeneity and plasticity in breast cancer arguing for a new approach to the problem. In this review, we report on the latest findings on the molecular characterization of breast cancer using NGS technologies, and we discuss their potential implications for the improvement of existing therapies. PMID:26561834

  16. On the study of microbial transcriptomes using second- and third-generation sequencing technologies.

    PubMed

    Choi, Sang Chul

    2016-08-01

    Second-generation sequencing technologies transformed the study of microbial transcriptomes. They helped reveal the transcription start sites and antisense transcripts of microbial species, improving the microbial genome annotation. Quantification of genome-wide gene expression levels allowed for functional studies of microbial research. Ever-evolving sequencing technologies are reshaping approaches to studying microbial transcriptomes. Recently, Oxford Nanopore Technologies delivered a sequencing platform called MinION, a third-generation sequencing technology, to the research community. We expect it to be the next sequencing technology that enables breakthroughs in life science fields. The studies of microbial transcriptomes will be no exception. In this paper, we review microbial transcriptomics studies using second-generation sequencing technology. We also discuss the prospect of microbial transcriptomics studies with third-generation sequencing. PMID:27480632

  17. deepTools: a flexible platform for exploring deep-sequencing data

    PubMed Central

    Ramírez, Fidel; Dündar, Friederike; Diehl, Sarah; Grüning, Björn A.; Manke, Thomas

    2014-01-01

    We present a Galaxy based web server for processing and visualizing deeply sequenced data. The web server's core functionality consists of a suite of newly developed tools, called deepTools, that enable users with little bioinformatic background to explore the results of their sequencing experiments in a standardized setting. Users can upload pre-processed files with continuous data in standard formats and generate heatmaps and summary plots in a straight-forward, yet highly customizable manner. In addition, we offer several tools for the analysis of files containing aligned reads and enable efficient and reproducible generation of normalized coverage files. As a modular and open-source platform, deepTools can easily be expanded and customized to future demands and developments. The deepTools webserver is freely available at http://deeptools.ie-freiburg.mpg.de and is accompanied by extensive documentation and tutorials aimed at conveying the principles of deep-sequencing data analysis. The web server can be used without registration. deepTools can be installed locally either stand-alone or as part of Galaxy. PMID:24799436

  18. deepTools: a flexible platform for exploring deep-sequencing data.

    PubMed

    Ramírez, Fidel; Dündar, Friederike; Diehl, Sarah; Grüning, Björn A; Manke, Thomas

    2014-07-01

    We present a Galaxy based web server for processing and visualizing deeply sequenced data. The web server's core functionality consists of a suite of newly developed tools, called deepTools, that enable users with little bioinformatic background to explore the results of their sequencing experiments in a standardized setting. Users can upload pre-processed files with continuous data in standard formats and generate heatmaps and summary plots in a straight-forward, yet highly customizable manner. In addition, we offer several tools for the analysis of files containing aligned reads and enable efficient and reproducible generation of normalized coverage files. As a modular and open-source platform, deepTools can easily be expanded and customized to future demands and developments. The deepTools webserver is freely available at http://deeptools.ie-freiburg.mpg.de and is accompanied by extensive documentation and tutorials aimed at conveying the principles of deep-sequencing data analysis. The web server can be used without registration. deepTools can be installed locally either stand-alone or as part of Galaxy.

  19. deepTools: a flexible platform for exploring deep-sequencing data.

    PubMed

    Ramírez, Fidel; Dündar, Friederike; Diehl, Sarah; Grüning, Björn A; Manke, Thomas

    2014-07-01

    We present a Galaxy based web server for processing and visualizing deeply sequenced data. The web server's core functionality consists of a suite of newly developed tools, called deepTools, that enable users with little bioinformatic background to explore the results of their sequencing experiments in a standardized setting. Users can upload pre-processed files with continuous data in standard formats and generate heatmaps and summary plots in a straight-forward, yet highly customizable manner. In addition, we offer several tools for the analysis of files containing aligned reads and enable efficient and reproducible generation of normalized coverage files. As a modular and open-source platform, deepTools can easily be expanded and customized to future demands and developments. The deepTools webserver is freely available at http://deeptools.ie-freiburg.mpg.de and is accompanied by extensive documentation and tutorials aimed at conveying the principles of deep-sequencing data analysis. The web server can be used without registration. deepTools can be installed locally either stand-alone or as part of Galaxy. PMID:24799436

  20. Capturing genomic signatures of DNA sequence variation using a standard anonymous microarray platform

    PubMed Central

    Cannon, C. H.; Kua, C. S.; Lobenhofer, E. K.; Hurban, P.

    2006-01-01

    Comparative genomics, using the model organism approach, has provided powerful insights into the structure and evolution of whole genomes. Unfortunately, only a small fraction of Earth's biodiversity will have its genome sequenced in the foreseeable future. Most wild organisms have radically different life histories and evolutionary genomics than current model systems. A novel technique is needed to expand comparative genomics to a wider range of organisms. Here, we describe a novel approach using an anonymous DNA microarray platform that gathers genomic samples of sequence variation from any organism. Oligonucleotide probe sequences placed on a custom 44 K array were 25 bp long and designed using a simple set of criteria to maximize their complexity and dispersion in sequence probability space. Using whole genomic samples from three known genomes (mouse, rat and human) and one unknown (Gonystylus bancanus), we demonstrate and validate its power, reliability, transitivity and sensitivity. Using two separate statistical analyses, a large number of genomic ‘indicator’ probes were discovered. The construction of a genomic signature database based upon this technique would allow virtual comparisons, and simple queries could generate optimal subsets of markers to be used in large-scale assays, using simple downstream techniques. Biologists from a wide range of fields, studying almost any organism, could efficiently perform genomic comparisons, at potentially any phylogenetic level after performing a small number of standardized DNA microarray hybridizations. Possibilities for refining and expanding the approach are discussed. PMID:17000641

  1. Initial steps towards a production platform for DNA sequence analysis on the grid

    PubMed Central

    2010-01-01

    Background: Bioinformatics is confronted with a new data explosion due to the availability of high throughput DNA sequencers. Data storage and analysis become a problem on local servers, so it is necessary to switch to other IT infrastructures. Grid and workflow technology can help to handle the data more efficiently, as well as facilitate collaborations. However, interfaces to grids are often unfriendly to novice users. Results: In this study we reused a platform that was developed in the VL-e project for the analysis of medical images. Data transfer, workflow execution and job monitoring are operated from one graphical interface. We developed workflows for two sequence alignment tools (BLAST and BLAT) as a proof of concept. The analysis time was significantly reduced. All workflows and executables are available for the members of the Dutch Life Science Grid and the VL-e Medical virtual organizations. All components are open source and can be transported to other grid infrastructures. Conclusions: The availability of in-house expertise and tools facilitates the usage of grid resources by new users. Our first results indicate that this is a practical, powerful and scalable solution to address the capacity and collaboration issues raised by the deployment of next generation sequencers. We currently adopt this methodology on a daily basis for DNA sequencing and other applications. More information and source code are available via http://www.bioinformaticslaboratory.nl/ PMID:21156038

  2. Impact of next-generation sequencing error on analysis of barcoded plasmid libraries of known complexity and sequence

    PubMed Central

    Deakin, Claire T.; Deakin, Jeffrey J.; Ginn, Samantha L.; Young, Paul; Humphreys, David; Suter, Catherine M.; Alexander, Ian E.; Hallwirth, Claus V.

    2014-01-01

    Barcoded vectors are promising tools for investigating clonal diversity and dynamics in hematopoietic gene therapy. Analysis of clones marked with barcoded vectors requires accurate identification of potentially large numbers of individually rare barcodes, when the exact number, sequence identity and abundance are unknown. This is an inherently challenging application, and the feasibility of using contemporary next-generation sequencing technologies is unresolved. To explore this potential application empirically, without prior assumptions, we sequenced barcode libraries of known complexity. Libraries containing 1, 10 and 100 Sanger-sequenced barcodes were sequenced using an Illumina platform, with a 100-barcode library also sequenced using a SOLiD platform. Libraries containing 1 and 10 barcodes were distinguished from false barcodes generated by sequencing error by a several log-fold difference in abundance. In 100-barcode libraries, however, expected and false barcodes overlapped and could not be resolved by bioinformatic filtering and clustering strategies. In independent sequencing runs multiple false-positive barcodes appeared to be represented at higher abundance than known barcodes, despite their confirmed absence from the original library. Such errors, which potentially impact barcoding studies in an application-dependent manner, are consistent with the existence of both stochastic and systematic error, the mechanism of which is yet to be fully resolved. PMID:25013183

  3. Supersequence and composite sequence carbonate platform growth: Permian and Triassic outcrop data of the Arabian platform and Neo-Tethys

    NASA Astrophysics Data System (ADS)

    Weidlich, O.; Bernecker, M.

    2003-05-01

    Permian and Triassic carbonate platforms of the Arabian Peninsula (Gondwana) and seamounts of the Neo-Tethys (Hawasina and Batain basins) are characterized by distinctive supersequences (second order, duration 5-20 million years, my) and composite sequences (third order, duration 0.5-5 my). The presented sequence stratigraphic framework will be compared with existing sea level curves to discuss the validity of different regional oscillations during the dispersal of Pangea. The carbonate succession of the Haushi and Akhdar Groups of the Arabian platform is composed of four Permian (P1-P4) and four Triassic supersequences (Tr1-Tr4). Isolated platforms of the Hawasina and Batain basins comprise two Permian supersequences and one Triassic supersequence. In contrast to the continuous development of the Arabian shield, carbonate platform growth of the seamounts was restricted to the Guadalupian-Lopingian and to the Middle-Upper Triassic, and ceased after drowning events. Composite sequences exhibit a well-developed stacking pattern during the Guadalupian-Lopingian (Saiq Formation). Lowstand systems tracts (LSTs) occur during the Cisuralian (Gharif Formation, Haushi Group) and Triassic (Mahil Formation, Akhdar Group). Open-marine depositional environments prevail during transgressive systems tracts (TSTs) with diverse biota including rugose and scleractinian corals, chaetetids, bryozoans, and crinoids. Highstand system tracts (HSTs) exhibit a twofold pattern: During the transgressive phase of supersequences, composite sequence highstands are dominated by reef or level-bottom communities with corals. Cyclic platform deposits or monotonous mud- and wackestone accumulated during the turnaround or late second-order highstand of a supersequence. Correlation of maximum flooding surfaces with published data suggests that supersequences P1, P2, and Tr4 can be traced across the Arabian platform into the Neo-Tethys basins, while supersequences P3, P4, and Tr1-Tr3 resulted from

  4. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens.

    PubMed

    Shokralla, Shadi; Gibson, Joel F; Nikbakht, Hamid; Janzen, Daniel H; Hallwachs, Winnie; Hajibabaei, Mehrdad

    2014-09-01

    DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large-scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next-generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high-target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next-generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10-mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full-length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16-lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full-length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next-generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content.
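
    The tagging step described in this abstract can be illustrated with a minimal Python sketch. It assumes each read begins with its 10-mer tag and that tags must match exactly; the tag sequences, specimen names and function names below are hypothetical and are not taken from the paper, and a real pipeline would likely also tolerate sequencing errors in the tag.

```python
from collections import defaultdict

TAG_LENGTH = 10  # the abstract states that unique 10-mer tags were used

# Hypothetical tag-to-specimen table; these sequences and names are made up.
TAGS = {
    "ACGTACGTAC": "specimen_001",
    "TGCATGCATG": "specimen_002",
}


def demultiplex(reads, tags=TAGS, tag_length=TAG_LENGTH):
    """Group reads by their leading tag (exact match only).

    Returns a dict mapping specimen name to a list of reads with the tag
    removed, plus a list of reads whose tag was not recognised.
    """
    by_specimen = defaultdict(list)
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        specimen = tags.get(tag)
        if specimen is None:
            unassigned.append(read)
        else:
            by_specimen[specimen].append(insert)
    return dict(by_specimen), unassigned


def run_fraction(lanes_used, lanes_total):
    """Fraction of a sequencing run that the used lanes represent."""
    return lanes_used / lanes_total


if __name__ == "__main__":
    reads = ["ACGTACGTACAAAA", "TGCATGCATGCCCC", "NNNNNNNNNNTTTT"]
    groups, unassigned = demultiplex(reads)
    print(groups)
    print(unassigned)
    print(run_fraction(2, 16))  # two lanes of a 16-lane run -> 0.125
```

    The last call reproduces the arithmetic in the abstract: two lanes out of sixteen is 12.5% of a run.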

  5. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens.

    PubMed

    Shokralla, Shadi; Gibson, Joel F; Nikbakht, Hamid; Janzen, Daniel H; Hallwachs, Winnie; Hajibabaei, Mehrdad

    2014-09-01

    DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large-scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next-generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high-target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next-generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10-mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full-length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16-lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full-length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next-generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content. PMID:24641208

  6. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens

    PubMed Central

    Shokralla, Shadi; Gibson, Joel F; Nikbakht, Hamid; Janzen, Daniel H; Hallwachs, Winnie; Hajibabaei, Mehrdad

    2014-01-01

    DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large-scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next-generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high-target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next-generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10-mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full-length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16-lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full-length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next-generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content. PMID:24641208

  7. Periodic binary sequence generators: VLSI circuits considerations

    NASA Technical Reports Server (NTRS)

    Perlman, M.

    1984-01-01

    Feedback shift registers are efficient periodic binary sequence generators. Polynomials of degree r over a Galois field of characteristic 2 (GF(2)) characterize the behavior of shift registers with linear logic feedback. The algorithmic determination of the trinomial of lowest degree, when it exists, that contains a given irreducible polynomial over GF(2) as a factor is presented. This corresponds to embedding the behavior of an r-stage shift register with linear logic feedback into that of an n-stage shift register with a single two-input modulo 2 summer (i.e., Exclusive-OR gate) in its feedback. This leads to Very Large Scale Integrated (VLSI) circuit architecture of maximal regularity (i.e., identical cells) with intercell communications serialized to a maximal degree.
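
    As a rough illustration of the register structure named in this abstract, the Python sketch below simulates an n-stage shift register whose feedback is a single two-input Exclusive-OR, which corresponds to a characteristic trinomial x^n + x^k + 1 over GF(2). It does not implement the report's algorithm for finding the lowest-degree trinomial multiple of a given polynomial; the function names and the default seed are assumptions made for the example.

```python
def trinomial_lfsr_step(state, k):
    """Advance an n-stage register by one clock.

    state holds [s_t, ..., s_(t+n-1)]. The new bit is s_t XOR s_(t+k),
    i.e. the recurrence s_(t+n) = s_t + s_(t+k) over GF(2), whose
    characteristic polynomial is the trinomial x^n + x^k + 1.
    """
    new_bit = state[0] ^ state[k]  # the single two-input modulo-2 summer
    return state[1:] + [new_bit]


def _start_state(n, k, seed):
    """Validate the tap position and build the starting state."""
    if not 0 < k < n:
        raise ValueError("tap k must satisfy 0 < k < n")
    return list(seed) if seed is not None else [1] + [0] * (n - 1)


def period(n, k, seed=None):
    """Count clocks until the register first returns to its start state."""
    start = _start_state(n, k, seed)
    state, clocks = trinomial_lfsr_step(start, k), 1
    while state != start:
        state = trinomial_lfsr_step(state, k)
        clocks += 1
    return clocks


def output_bits(n, k, count, seed=None):
    """Return the first `count` output bits (the oldest stage each clock)."""
    state = _start_state(n, k, seed)
    bits = []
    for _ in range(count):
        bits.append(state[0])
        state = trinomial_lfsr_step(state, k)
    return bits


if __name__ == "__main__":
    print(period(3, 1))   # x^3 + x + 1
    print(period(4, 1))   # x^4 + x + 1
    print(period(5, 2))   # x^5 + x^2 + 1
    print(output_bits(3, 1, 7))
```

    For the three primitive trinomials used in the demo, the nonzero start state cycles through all 2^n - 1 nonzero states, so the printed periods are 7, 15 and 31.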

  8. Next generation sequencing in endocrine practice.

    PubMed

    Forlenza, Gregory P; Calhoun, Amy; Beckman, Kenneth B; Halvorsen, Tanya; Hamdoun, Elwaseila; Zierhut, Heather; Sarafoglou, Kyriakie; Polgreen, Lynda E; Miller, Bradley S; Nathan, Brandon; Petryk, Anna

    2015-01-01

    With the completion of the Human Genome Project and advances in genomic sequencing technologies, the use of clinical molecular diagnostics has grown tremendously over the last decade. Next-generation sequencing (NGS) has overcome many of the practical roadblocks that had slowed the adoption of molecular testing for routine clinical diagnosis. In endocrinology, targeted NGS now complements biochemical testing and imaging studies. The goal of this review is to provide clinicians with a guide to the application of NGS to genetic testing for endocrine conditions, by compiling a list of established gene mutations detectable by NGS, and highlighting key phenotypic features of these disorders. As we outline in this review, the clinical utility of NGS-based molecular testing for endocrine disorders is very high. Identifying an exact genetic etiology improves understanding of the disease, provides clear explanation to families about the cause, and guides decisions about screening, prevention and/or treatment. To illustrate this approach, a case of hypophosphatasia with a pathogenic mutation in the ALPL gene detected by NGS is presented.

  9. Long period pseudo random number sequence generator

    NASA Technical Reports Server (NTRS)

    Wang, Charles C. (Inventor)

    1989-01-01

    A circuit for generating a sequence of pseudo random numbers, (A sub K). There is an exponentiator in GF(2 sup m) for the normal basis representation of elements in a finite field GF(2 sup m) each represented by m binary digits and having two inputs and an output from which the sequence (A sub K) of pseudo random numbers is taken. One of the two inputs is connected to receive the outputs (E sub K) of a maximal length shift register of n stages. There is a switch having a pair of inputs and an output. The switch output is connected to the other of the two inputs of the exponentiator. One of the switch inputs is connected for initially receiving a primitive element (A sub O) in GF(2 sup m). Finally, there is a delay circuit having an input and an output. The delay circuit output is connected to the other of the switch inputs and the delay circuit input is connected to the output of the exponentiator, whereby after the exponentiator initially receives the primitive element (A sub O) in GF(2 sup m) through the switch, the switch can be switched to cause the exponentiator to receive as its input a delayed output A(K-1) from the exponentiator, thereby generating (A sub K) continuously at the output of the exponentiator. The exponentiator in GF(2 sup m) is novel and comprises a cyclic-shift circuit; a Massey-Omura multiplier; and a control logic circuit, all operably connected together to perform the function U(sub i) = 92(sup i) (for n(sub i) = 1) or 1 (for n(sub i) = 0).
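
    The Python sketch below illustrates exponentiation in GF(2^m) computed as a product of per-bit factors, where the factor for exponent bit i is a repeated square of the base when that bit is 1 and is 1 when the bit is 0. Reading the formula at the end of this abstract in that way is an assumption. The sketch also swaps in a polynomial basis with the field polynomial x^4 + x + 1 for GF(2^4), not the normal basis and Massey-Omura multiplier of the patent, and all names are hypothetical.

```python
M = 4                 # field GF(2^4), chosen only to keep the example small
FIELD_POLY = 0b10011  # x^4 + x + 1, an assumed field polynomial


def gf_mul(a, b, m=M, poly=FIELD_POLY):
    """Multiply two field elements held as bit masks (polynomial basis)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # add the current multiple of a
        b >>= 1
        a <<= 1              # multiply a by x
        if a & (1 << m):
            a ^= poly        # reduce modulo the field polynomial
    return result


def gf_pow(a, n):
    """Raise a to the power n as a product of per-bit factors.

    For exponent bit i the factor is a^(2^i) when the bit is 1, else 1.
    """
    result = 1
    square = a               # holds a^(2^i) for the current bit i
    while n:
        factor = square if n & 1 else 1
        result = gf_mul(result, factor)
        square = gf_mul(square, square)
        n >>= 1
    return result


if __name__ == "__main__":
    x = 0b0010               # the element x
    print(gf_pow(x, 4))      # x^4 reduces to x + 1
    print(gf_pow(x, 15))     # x has order 15 in this field
```

    In a normal basis, squaring is a cyclic shift of the coordinate vector, which is presumably why the patent's exponentiator pairs a cyclic-shift circuit with a multiplier; the polynomial-basis version above performs the squaring with an ordinary multiplication.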

  10. AIV Platform for the Galileo Message Generation Facility

    NASA Astrophysics Data System (ADS)

    Oving, B. A.; Zwartbol, T.; Denham, S.; Rennie, M.

    2007-08-01

    The Message Generation Facility (MGF) is an element of the Galileo Mission Segment (GMS) and is responsible for real-time distribution of the navigation, integrity and SAR messages from the processing facilities (OSPF, IPF, ERIS, RLSP) to the Up-Link Stations (ULS). The main objective is to route a message to the correct ULS in time for on-board update of navigation data and integrity data for dissemination to users. The MGF element is being developed by Deimos Space S.L. (Spain). To perform the Assembly, Integration and Verification (AIV) activities of the MGF, a dedicated test platform, MGF-AIVP, is developed by the National Aerospace Laboratory, NLR (the Netherlands). The MGF-AIVP simulates other Elements in the GMS that are connected to the MGF, in real-time. Its focus is to verify the main objective of the MGF.

  11. A Modular Assembly Platform for Rapid Generation of DNA Constructs

    PubMed Central

    Akama-Garren, Elliot H.; Joshi, Nikhil S.; Tammela, Tuomas; Chang, Gregory P.; Wagner, Bethany L.; Lee, Da-Yae; Rideout III, William M.; Papagiannakopoulos, Thales; Xue, Wen; Jacks, Tyler

    2016-01-01

    Traditional cloning methods have limitations on the number of DNA fragments that can be simultaneously manipulated, which dramatically slows the pace of molecular assembly. Here we describe GMAP, a Gibson assembly-based modular assembly platform consisting of a collection of promoters and genes, which allows for one-step production of DNA constructs. GMAP facilitates rapid assembly of expression and viral constructs using modular genetic components, as well as increasingly complicated genetic tools using contextually relevant genomic elements. Our data demonstrate the applicability of GMAP toward the validation of synthetic promoters, identification of potent RNAi constructs, establishment of inducible lentiviral systems, tumor initiation in genetically engineered mouse models, and gene-targeting for the generation of knock-in mice. GMAP represents a recombinant DNA technology designed for widespread circulation and easy adaptation for other uses, such as synthetic biology, genetic screens, and CRISPR-Cas9. PMID:26887506

  12. Next-Generation Phylogeography: A Targeted Approach for Multilocus Sequencing of Non-Model Organisms

    PubMed Central

    Puritz, Jonathan B.; Addison, Jason A.; Toonen, Robert J.

    2012-01-01

    The field of phylogeography has long since realized the need and utility of incorporating nuclear DNA (nDNA) sequences into analyses. However, the use of nDNA sequence data, at the population level, has been hindered by technical laboratory difficulty, sequencing costs, and problematic analytical methods dealing with genotypic sequence data, especially in non-model organisms. Here, we present a method utilizing the 454 GS-FLX Titanium pyrosequencing platform with the capacity to simultaneously sequence two species of sea star (Meridiastra calcar and Parvulastra exigua) at five different nDNA loci across 16 different populations of 20 individuals each per species. We compare results from 3 populations with traditional Sanger sequencing based methods, and demonstrate that this next-generation sequencing platform is more time and cost effective and more sensitive to rare variants than Sanger based sequencing. A crucial advantage is that the high coverage of clonally amplified sequences simplifies haplotype determination, even in highly polymorphic species. This targeted next-generation approach can greatly increase the use of nDNA sequence loci in phylogeographic and population genetic studies by mitigating many of the time, cost, and analytical issues associated with highly polymorphic, diploid sequence markers. PMID:22470543

  13. Deep sequencing analysis of phage libraries using Illumina platform.

    PubMed

    Matochko, Wadim L; Chu, Kiki; Jin, Bingjie; Lee, Sam W; Whitesides, George M; Derda, Ratmir

    2012-09-01

    This paper presents an analysis of phage-displayed libraries of peptides using Illumina. We describe steps for the preparation of short DNA fragments for deep sequencing and MATLAB software for the analysis of the results. Screening of peptide libraries displayed on the surface of bacteriophage (phage display) can be used to discover peptides that bind to any target. The key step in this discovery is the analysis of peptide sequences present in the library. This analysis is usually performed by Sanger sequencing, which is labor intensive and limited to examination of a few hundred phage clones. On the other hand, Illumina deep-sequencing technology can characterize over 10^7 reads in a single run. We applied Illumina sequencing to analyze phage libraries. Using PCR, we isolated the variable regions from M13KE phage vectors from a phage display library. The PCR primers contained (i) sequences flanking the variable region, (ii) barcodes, and (iii) variable 5'-terminal region. We used this approach to examine how diversity of peptides in phage display libraries changes as a result of amplification of libraries in bacteria. Using HiSeq single-end Illumina sequencing of these fragments, we acquired over 2×10^7 reads, 57 base pairs (bp) in length. Each read contained information about the barcode (6 bp), one complementary region (12 bp) and a variable region (36 bp). We applied this sequencing to a model library of 10^6 unique clones and observed that amplification enriches ∼150 clones, which dominate ∼20% of the library. Deep sequencing, for the first time, characterized the collapse of diversity in phage libraries. The results suggest that screens based on repeated amplification and small-scale sequencing identify a few binding clones and miss thousands of useful clones. The deep sequencing approach described here could identify under-represented clones in phage screens. It could also be instrumental in developing new screening strategies, which can preserve
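    The read layout in this abstract (6 bp barcode, 12 bp complementary region, 36 bp variable region) lends itself to a short parsing sketch. The authors used MATLAB; the Python below is only an illustration. It assumes the three segments appear in that order from the start of the read and ignores any remaining bases. The function names and the filtering rule are hypothetical.

```python
from collections import Counter

BARCODE_LEN = 6     # lengths taken from the abstract
CONSTANT_LEN = 12
VARIABLE_LEN = 36


def split_read(read):
    """Split a read into (barcode, constant region, variable region).

    Assumes the segments are contiguous and in this order; returns None
    for reads that are too short.
    """
    needed = BARCODE_LEN + CONSTANT_LEN + VARIABLE_LEN
    if len(read) < needed:
        return None
    barcode = read[:BARCODE_LEN]
    constant = read[BARCODE_LEN:BARCODE_LEN + CONSTANT_LEN]
    variable = read[BARCODE_LEN + CONSTANT_LEN:needed]
    return barcode, constant, variable


def clone_counts(reads, expected_constant):
    """Count variable regions per barcode, keeping reads whose constant
    region matches exactly (a deliberately strict, assumed filter)."""
    counts = {}
    for read in reads:
        parts = split_read(read)
        if parts is None:
            continue
        barcode, constant, variable = parts
        if constant != expected_constant:
            continue
        counts.setdefault(barcode, Counter())[variable] += 1
    return counts


def dominant_fraction(counter, top_n):
    """Fraction of reads taken up by the top_n most abundant clones."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counter.most_common(top_n))
    return top / total
```

    Calling `dominant_fraction(counts[barcode], 150)` on real data would give the kind of figure the abstract reports for the enriched clones, provided the layout assumption holds for the library in question.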

  15. Whole-transcriptome sequencing of Pinellia ternata using the Illumina platform.

    PubMed

    Huang, X; Jing, Y; Liu, D J; Yang, B Y; Chen, H; Li, M

    2016-01-01

    Pinelliae rhizoma is the dried tuber of Pinellia ternata (Thunb.) Breit., and has been used for thousands of years as a traditional Chinese medicine. However, its genomic background is little known. With the development of high-throughput genomic sequencing, it is now easy and cheap to obtain genomic information. In this study, 193,032,910 high-quality clean reads were generated using the Illumina Hiseq 2000 platform. A total of 53,544 unigenes were identified from the contigs assembled. Functional annotation analysis annotated 37,318, 27,697, 23,043, 22,869, 23,328, and 27,415 unigenes. KEGG analysis revealed that five pathways (169 genes) were associated with alkaloid synthesis, 201 unigenes were related to fatty acid biosynthesis (ko00061), and 133 unigenes were involved in the biosynthesis of unsaturated fatty acids (ko01040). In addition, 6703 simple sequence repeats were designed based on the unigene sequences for screening germplasm resources in the future. These data are a valuable resource for genomic studies on Pinellia plants. PMID:27420994

  16. Strategies for complete mitochondrial genome sequencing on Ion Torrent PGM™ platform in forensic sciences.

    PubMed

    Zhou, Yishu; Guo, Fei; Yu, Jiao; Liu, Feng; Zhao, Jinling; Shen, Hongying; Zhao, Bin; Jia, Fei; Sun, Zhu; Song, He; Jiang, Xianhua

    2016-05-01

    Next generation sequencing (NGS) is a time-saving and cost-efficient method to detect the complete mitochondrial genome (mtGenome) compared to Sanger sequencing. In this study we focused on developing strategies for mtGenome sequencing on the Ion Torrent PGM™ platform and NGS data analysis. In our experience, 4, 15 and 30 samples could be loaded onto Ion 314™, Ion 316™ and Ion 318™ chips respectively at a pooling concentration of 26 pM, achieving a sufficient average coverage of ≥1500× and a good strand balance of 1.05. Data processing software is essential to the analysis of large NGS data sets. In-house Perl scripts were developed for primary data analysis, to screen out uncertain positions and samples from variant call format (VCF) reports, and for pedigree studies, to perform pairwise comparisons. The Integrative Genomics Viewer (IGV) and the NextGENe software were used for secondary data analysis. mthap and EMMA were employed for haplogroup assignment. The dataset was reviewed and approved by EMPOP as the final version, which showed a 2.66% error rate generated by the Torrent Variant Caller (TVC). Across the mtGenome, 4022 variants were found at 725 nucleotide positions, where the ratio of transitions to transversions was estimated at 20.89:1 and 22.18% of variants were concentrated in hypervariable segments I and II (HVS-I and HVS-II). In total, 107 complete mtGenome haplotypes were observed from 107 Northern Chinese Han and assigned to 88 haplogroups. The random match probability (RMP) of the complete mtGenome was calculated as 0.009345794, a decrease of 26.19% compared with HVS-I only, and the haplotype diversity (HD) was evaluated as 1, an increase of 0.33% compared with HVS-I only. Principal component analysis (PCA) showed that our population clustered with East and Southeast Asians. The strategies in this study are suitable for complete mtGenome sequencing on Ion Torrent PGM™ platform and Northern Chinese Han (EMP00670) is the first
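    The reported RMP and HD values are consistent with the standard population-genetic definitions, which the abstract does not state explicitly. Assuming RMP is the sum of squared haplotype frequencies and HD is Nei's unbiased estimator n/(n-1) × (1 - sum of squared frequencies), 107 distinct haplotypes in 107 individuals give RMP = 1/107 and HD = 1. The function name below is hypothetical.

```python
from collections import Counter


def match_statistics(haplotypes):
    """Return (random match probability, haplotype diversity).

    Assumed definitions: RMP = sum(p_i^2) and
    HD = n / (n - 1) * (1 - sum(p_i^2)), with p_i the sample frequency
    of haplotype i and n the number of individuals.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    rmp = sum_sq
    hd = n / (n - 1) * (1 - sum_sq)
    return rmp, hd


if __name__ == "__main__":
    # 107 individuals, each carrying a different haplotype, as in the abstract.
    sample = [f"hap{i}" for i in range(107)]
    rmp, hd = match_statistics(sample)
    print(round(rmp, 9), round(hd, 6))
```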

  17. Preparation of SELEX Samples for Next-Generation Sequencing.

    PubMed

    Tolle, Fabian; Mayer, Günter

    2016-01-01

    Fuelled by massive whole genome sequencing projects such as the human genome project, enormous technological advancements and therefore tremendous price drops could be achieved, rendering next-generation sequencing very attractive for deep sequencing of SELEX libraries. Herein we describe the preparation of SELEX samples for Illumina sequencing, based on the already established whole genome sequencing workflow. We describe the addition of barcode sequences for multiplexing and the adapter ligation, avoiding associated pitfalls. PMID:26552817

  18. Applications of next-generation sequencing techniques in plant biology

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The last several years have seen revolutionary advances in DNA sequencing technologies with the advent of next generation sequencing (NGS) techniques. NGS methods now allow millions of bases to be sequenced in one round, at a fraction of the cost relative to traditional Sanger sequencing, allowing u...

  19. Next-Generation Sequencing: A Review of Technologies and Tools for Wound Microbiome Research

    PubMed Central

    Hodkinson, Brendan P.; Grice, Elizabeth A.

    2015-01-01

    Significance: The colonization of wounds by specific microbes or communities of microbes may delay healing and/or lead to infection-related complications. Studies of wound-associated microbial communities (microbiomes) to date have primarily relied upon culture-based methods, which are known to have extreme biases and are not reliable for the characterization of microbiomes. Biofilms are very resistant to culture and are therefore especially difficult to study with techniques that remain standard in clinical settings. Recent Advances: Culture-independent approaches employing next-generation DNA sequencing have provided researchers and clinicians a window into wound-associated microbiomes that could not be achieved before and have begun to transform our view of wound-associated biodiversity. Within the past decade, many platforms have arisen for performing this type of sequencing, with various types of applications for microbiome research being possible on each. Critical Issues: Wound care incorporating knowledge of microbiomes gained from next-generation sequencing could guide clinical management and treatments. The purpose of this review is to outline the current platforms, their applications, and the steps necessary to undertake microbiome studies using next-generation sequencing. Future Directions: As DNA sequencing technology progresses, platforms will continue to produce longer reads and more reads per run at lower costs. A major future challenge is to implement these technologies in clinical settings for more precise and rapid identification of wound bioburden. PMID:25566414

  20. Generating Functions for the Powers of Fibonacci Sequences

    ERIC Educational Resources Information Center

    Terrana, D.; Chen, H.

    2007-01-01

    In this note, based on the Binet formulas and the power-reducing techniques, closed forms of generating functions for the powers of Fibonacci sequences are presented. The corresponding results are extended to some other famous sequences as well.
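    As an illustration of the approach named in this abstract, the second-power case can be worked out directly from the Binet formula. This is the standard textbook derivation, not a reproduction of the note's own results. Here φ = (1+√5)/2 and ψ = (1-√5)/2, so that φψ = -1, φ² + ψ² = 3 and φ²ψ² = 1.

```latex
\begin{align*}
F_n &= \frac{\varphi^n - \psi^n}{\sqrt{5}},
\qquad
\sum_{n \ge 0} F_n x^n = \frac{x}{1 - x - x^2}, \\[4pt]
F_n^2 &= \frac{\varphi^{2n} + \psi^{2n} - 2(-1)^n}{5}, \\[4pt]
\sum_{n \ge 0} F_n^2 x^n
  &= \frac{1}{5}\left[\frac{1}{1 - \varphi^2 x} + \frac{1}{1 - \psi^2 x}
     - \frac{2}{1 + x}\right] \\[4pt]
  &= \frac{1}{5}\left[\frac{2 - 3x}{1 - 3x + x^2} - \frac{2}{1 + x}\right]
   = \frac{x(1 - x)}{(1 + x)(1 - 3x + x^2)}.
\end{align*}
```

    Expanding the last expression gives 0, 1, 1, 4, 9, 25, 64, ..., the squares of the Fibonacci numbers, and the denominator encodes the recurrence a_n = 2a_{n-1} + 2a_{n-2} - a_{n-3}.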

  1. Economic regulation of next-generation sequencing.

    PubMed

    Evans, Barbara J

    2014-01-01

    Next-generation sequencing broadens the debate about appropriate regulatory oversight of genetic testing and may force scholars to move beyond familiar privacy and health and safety regulatory issues to address new problems with industry structure and economic regulation. The genetic testing industry is passing through a period of profound structural change in response to shifts in technology and in the legal environment. Making genetic testing safe and effective for consumers increasingly requires access to comprehensive genomic data infrastructures that can support accurate, state-of-the-art interpretation of genetic test results. At present, there are significant barriers to access and there is no sector-specific regulator with power to ensure appropriate data access. Without it, genetic testing will not be safe for consumers even when it is performed at CLIA-certified laboratories using tests that have been FDA-cleared or approved. This article explores the emerging structure of the genetic testing industry and describes its present economic regulatory vacuum. In view of this gap in regulation, the article explores whether generally applicable law, particularly antitrust law, may offer solutions to the industry's data access problems. It concludes that courts may have a useful role to play, particularly in Europe and other jurisdictions where the essential facilities doctrine enjoys continued vitality. After Verizon Communications v. Law Offices of Curtis V. Trinko, the role of U.S. federal courts is less certain. Congress has demonstrated willingness to address access issues as they emerged in other infrastructure industries in recent decades. This article expresses no preference between legislative and judicial solutions. Its aim is simply to highlight an emerging economic regulatory issue which, if left unresolved, presents real health and safety concerns for consumers who receive genetic tests.

  2. Historical Perspective, Development and Applications of Next-Generation Sequencing in Plant Virology

    PubMed Central

    Barba, Marina; Czosnek, Henryk; Hadidi, Ahmed

    2014-01-01

    Next-generation high throughput sequencing technologies became available at the onset of the 21st century. They provide a highly efficient, rapid, and low cost DNA sequencing platform beyond the reach of the standard and traditional DNA sequencing technologies developed in the late 1970s. They are continually improved to become faster, more efficient and cheaper. They have been used in many fields of biology since 2004. In 2009, next-generation sequencing (NGS) technologies began to be applied to several areas of plant virology including virus/viroid genome sequencing, discovery and detection, ecology and epidemiology, replication and transcription. Identification and characterization of known and unknown viruses and/or viroids in infected plants are currently among the most successful applications of these technologies. It is expected that NGS will play very significant roles in many research and non-research areas of plant virology. PMID:24399207

  3. Historical perspective, development and applications of next-generation sequencing in plant virology.

    PubMed

    Barba, Marina; Czosnek, Henryk; Hadidi, Ahmed

    2014-01-06

    Next-generation high throughput sequencing technologies became available at the onset of the 21st century. They provide a highly efficient, rapid, and low cost DNA sequencing platform beyond the reach of the standard and traditional DNA sequencing technologies developed in the late 1970s. They are continually improved to become faster, more efficient and cheaper. They have been used in many fields of biology since 2004. In 2009, next-generation sequencing (NGS) technologies began to be applied to several areas of plant virology including virus/viroid genome sequencing, discovery and detection, ecology and epidemiology, replication and transcription. Identification and characterization of known and unknown viruses and/or viroids in infected plants are currently among the most successful applications of these technologies. It is expected that NGS will play very significant roles in many research and non-research areas of plant virology.

  4. Polynomials Generated by the Fibonacci Sequence

    NASA Astrophysics Data System (ADS)

    Garth, David; Mills, Donald; Mitchell, Patrick

    2007-06-01

    The Fibonacci sequence's initial terms are F_0=0 and F_1=1, with F_n=F_{n-1}+F_{n-2} for n>=2. We define the polynomial sequence p by setting p_0(x)=1 and p_{n}(x)=x*p_{n-1}(x)+F_{n+1} for n>=1, with p_{n}(x)= sum_{k=0}^{n} F_{k+1}x^{n-k}. We call p_n(x) the Fibonacci-coefficient polynomial (FCP) of order n. The FCP sequence is distinct from the well-known Fibonacci polynomial sequence. We answer several questions regarding these polynomials. Specifically, we show that each even-degree FCP has no real zeros, while each odd-degree FCP has a unique, and (for degree at least 3) irrational, real zero. Further, we show that this sequence of unique real zeros converges monotonically to the negative of the golden ratio. Using Rouche's theorem, we prove that the zeros of the FCP's approach the golden ratio in modulus. We also prove a general result that gives the Mahler measures of an infinite subsequence of the FCP sequence whose coefficients are reduced modulo an integer m>=2. We then apply this to the case that m=L_n, the nth Lucas number, showing that the Mahler measure of the subsequence is phi^{n-1}, where phi=(1+sqrt 5)/2.
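    The recurrence and the claim about real zeros can be checked numerically. The sketch below builds the coefficients of p_n(x) from the recurrence given in the abstract and locates the real zero of an odd-degree FCP by bisection on [-2, 0]. The bracket relies on p_n(0) = F_{n+1} > 0 and p_n(-2) < 0 for odd n; the function names are hypothetical.

```python
def fibonacci(count):
    """Return [F_0, F_1, ..., F_(count-1)]."""
    fibs = [0, 1]
    while len(fibs) < count:
        fibs.append(fibs[-1] + fibs[-2])
    return fibs[:count]


def fcp_coefficients(n):
    """Coefficients of p_n(x), highest degree first.

    Uses p_0(x) = 1 and p_n(x) = x * p_(n-1)(x) + F_(n+1): multiplying by x
    shifts the list and the new constant term F_(n+1) is appended.
    """
    fibs = fibonacci(n + 2)
    coeffs = [1]
    for k in range(1, n + 1):
        coeffs = coeffs + [fibs[k + 1]]
    return coeffs


def evaluate(coeffs, x):
    """Horner evaluation of a polynomial given highest-degree-first."""
    value = 0.0
    for c in coeffs:
        value = value * x + c
    return value


def real_zero(n, iterations=200):
    """Bisection for the real zero of an odd-degree FCP on [-2, 0]."""
    if n % 2 == 0:
        raise ValueError("even-degree FCPs have no real zeros")
    coeffs = fcp_coefficients(n)
    lo, hi = -2.0, 0.0          # p_n(lo) < 0 < p_n(hi) for odd n
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if evaluate(coeffs, mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for degree in (1, 3, 5, 7, 9):
        print(degree, real_zero(degree))
```

    The printed zeros decrease from -1 toward -(1+√5)/2, in line with the monotone convergence stated in the abstract, though the approach is slow.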

  5. A research roadmap for next-generation sequencing informatics.

    PubMed

    Altman, Russ B; Prabhu, Snehit; Sidow, Arend; Zook, Justin M; Goldfeder, Rachel; Litwack, David; Ashley, Euan; Asimenos, George; Bustamante, Carlos D; Donigan, Katherine; Giacomini, Kathleen M; Johansen, Elaine; Khuri, Natalia; Lee, Eunice; Liang, Xueying Sharon; Salit, Marc; Serang, Omar; Tezak, Zivana; Wall, Dennis P; Mansfield, Elizabeth; Kass-Hout, Taha

    2016-04-20

    Next-generation sequencing technologies are fueling a wave of new diagnostic tests. Progress on a key set of nine research challenge areas will help generate the knowledge required to advance effectively these diagnostics to the clinic. PMID:27099173

  7. SNP Discovery Using Next Generation Transcriptomic Sequencing.

    PubMed

    De Wit, Pierre

    2016-01-01

    In this chapter, I will guide the user through methods to find new SNP markers from expressed sequence (RNA-Seq) data, focusing on the sample preparation and also on the bioinformatic analyses needed to sort through the immense flood of data from high-throughput sequencing machines. The general steps included are as follows: sample preparation, sequencing, quality control of data, assembly, mapping, SNP discovery, filtering, validation. The first few steps are traditional laboratory protocols, whereas steps following the sequencing are bioinformatic in nature. The bioinformatics described herein are by no means exhaustive; rather, they serve as one example of a simple way of analyzing high-throughput sequence data to find SNP markers. Ideally, one would like to run through this protocol several times with a new dataset, while varying software parameters slightly, in order to determine the robustness of the results. The final validation step, although not described in much detail here, is also quite critical as that will be the final test of the accuracy of the assumptions made in silico. There is a plethora of downstream applications of a SNP dataset, not covered in this chapter. For an example of a more thorough protocol also including differential gene expression and functional enrichment analyses, BLAST annotation and downstream applications of SNP markers, a good starting point could be the "Simple Fool's Guide to population genomics via RNA-Seq," which is available at http://sfg.stanford.edu. PMID:27460371

  8. Next Generation Sequencing at the University of Chicago Genomics Core

    SciTech Connect

    Faber, Pieter

    2013-04-24

    The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) access to State-of-the-Art genomics capabilities: next generation sequencing, Sanger sequencing / genotyping and micro-arrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.

  9. Depositional sequence evolution, Paleozoic and early Mesozoic of the central Saharan platform, North Africa

    SciTech Connect

    Sprague, A.R.G.

    1991-08-01

    Over 30 depositional sequences have been identified in the Paleozoic and lower Mesozoic of the Ghadames basin of eastern Algeria, southern Tunisia, and western Libya. Well logs and lithologic information from more than 500 wells were used to correlate the 30 sequences throughout the basin (total area more than 1 million km^2). Based on systematic change in the log response of strata in successively younger sequences, five groups of sequences with distinctive characteristics have been identified: Cambro-Ordovician, Upper Silurian-Middle Devonian, Upper Devonian, Carboniferous, and Middle Triassic-Middle Jurassic. Each sequence group is terminated by a major, tectonically enhanced sequence boundary that is immediately overlain (except for the Carboniferous) by a shale-prone interval deposited in response to basin-wide flooding. The four Paleozoic sequence groups were deposited on the Saharan platform, a north-facing, clastic-dominated shelf that covered most of North Africa during the Paleozoic. The sequence boundary at the top of the Carboniferous sequence group is one of several Permian-Carboniferous angular unconformities in North Africa related to the Hercynian orogeny. The youngest sequence group (Middle Triassic to Middle Jurassic) is a clastic-evaporite package that onlaps southward onto the top-of-Paleozoic sequence boundary. The progressive change, from the Cambrian to the Jurassic, in the nature of the Ghadames basin sequences reflects the interplay between basin morphology and tectonics, vegetation, eustasy, climate, and sediment supply.

  10. JVM: Java Visual Mapping tool for next generation sequencing read.

    PubMed

    Yang, Ye; Liu, Juan

    2015-01-01

    We developed a program, JVM (Java Visual Mapping), for mapping next generation sequencing reads to a reference sequence. The program is implemented in Java and is designed to align millions of short reads generated by Illumina sequencing technology. It employs a seed index strategy and octal encoding operations for sequence alignment. JVM is useful for DNA-Seq and RNA-Seq when dealing with single-end resequencing. JVM is a desktop application that supports read data sets from 1 MB to 10 GB.
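    A seed index strategy of the kind mentioned here can be illustrated in a few lines. The sketch below is a generic seed-and-verify mapper, not JVM's implementation: it indexes every k-mer of the reference, lets each seed of a read vote for a candidate start position, and verifies candidates by counting mismatches. The octal encoding is not reproduced, and all names and parameters are hypothetical.

```python
from collections import Counter, defaultdict


def build_seed_index(reference, k):
    """Map every k-mer of the reference to the positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read, index, reference, k, max_mismatches=2):
    """Return (start, mismatches) of the best ungapped placement, or None.

    Each seed hit votes for the read start it implies; candidates are then
    verified base by base against the reference.
    """
    votes = Counter()
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset
            if 0 <= start <= len(reference) - len(read):
                votes[start] += 1
    best = None
    for start, _ in votes.most_common():
        window = reference[start:start + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            if best is None or mismatches < best[1]:
                best = (start, mismatches)
    return best


if __name__ == "__main__":
    ref = "ACGTTAGCCGATAGGCTTAACGGATCCA"
    idx = build_seed_index(ref, 4)
    print(map_read(ref[5:17], idx, ref, 4))
```

    This version handles substitutions only. Gapped alignment and compact sequence encodings would be needed for anything beyond a toy reference.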

  11. Ad 2.0: a novel recombineering platform for high-throughput generation of tailored adenoviruses

    PubMed Central

    Mück-Häusl, Martin; Solanki, Manish; Zhang, Wenli; Ruzsics, Zsolt; Ehrhardt, Anja

    2015-01-01

    Recombinant adenoviruses containing a double-stranded DNA genome of 26–45 kb were broadly explored in basic virology, for vaccination purposes, for treatment of tumors based on oncolytic virotherapy, or simply as a tool for efficient gene transfer. However, the majority of recombinant adenoviral vectors (AdVs) is based on a small fraction of adenovirus types and their genetic modification. Recombineering techniques provide powerful tools for arbitrary engineering of recombinant DNA. Here, we adopted a seamless recombineering technology for high-throughput and arbitrary genetic engineering of recombinant adenoviral DNA molecules. Our cloning platform which also includes a novel recombination pipeline is based on bacterial artificial chromosomes (BACs). It enables generation of novel recombinant adenoviruses from different sources and switching between commonly used early generation AdVs and the last generation high-capacity AdVs lacking all viral coding sequences making them attractive candidates for clinical use. In combination with a novel recombination pipeline allowing cloning of AdVs containing large and complex transgenes and the possibility to generate arbitrary chimeric capsid-modified adenoviruses, these techniques allow generation of tailored AdVs with distinct features. Our technologies will pave the way toward broader applications of AdVs in molecular medicine including gene therapy and vaccination studies. PMID:25609697

  12. Image encryption using random sequence generated from generalized information domain

    NASA Astrophysics Data System (ADS)

    Xia-Yan, Zhang; Guo-Ji, Zhang; Xuan, Li; Ya-Zhou, Ren; Jie-Hua, Wu

    2016-05-01

    A novel image encryption method based on the random sequence generated from the generalized information domain and permutation–diffusion architecture is proposed. The random sequence is generated by reconstruction from the generalized information file and discrete trajectory extraction from the data stream. The trajectory address sequence is used to generate a P-box to shuffle the plain image while random sequences are treated as keystreams. A new factor called drift factor is employed to accelerate and enhance the performance of the random sequence generator. An initial value is introduced to make the encryption method an approximately one-time pad. Experimental results show that the random sequences pass the NIST statistical test with a high ratio and extensive analysis demonstrates that the new encryption scheme has superior security.
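    The permutation-diffusion architecture named in this abstract can be sketched generically. In the example below Python's `random.Random` stands in for the paper's generalized-information-domain generator, which is not reproduced and is not cryptographically secure here. A sorted address sequence yields the P-box, and a chained XOR with a keystream and an initial value provides diffusion. The function names and the exact diffusion rule are assumptions.

```python
import random


def keystreams(seed, length):
    """Stand-in generator (NOT secure): address sequence plus key bytes."""
    rng = random.Random(seed)
    addresses = [rng.random() for _ in range(length)]
    key_bytes = [rng.randrange(256) for _ in range(length)]
    return addresses, key_bytes


def make_pbox(addresses):
    """P-box: the index order that sorts the address sequence."""
    return sorted(range(len(addresses)), key=addresses.__getitem__)


def encrypt(pixels, seed, initial_value=0):
    """Permute pixels with the P-box, then diffuse with a chained XOR."""
    addresses, key_bytes = keystreams(seed, len(pixels))
    pbox = make_pbox(addresses)
    shuffled = [pixels[i] for i in pbox]
    cipher = []
    previous = initial_value
    for value, key in zip(shuffled, key_bytes):
        previous = value ^ key ^ previous   # each byte depends on the last
        cipher.append(previous)
    return cipher


def decrypt(cipher, seed, initial_value=0):
    """Undo the diffusion, then invert the permutation."""
    addresses, key_bytes = keystreams(seed, len(cipher))
    pbox = make_pbox(addresses)
    shuffled = []
    previous = initial_value
    for value, key in zip(cipher, key_bytes):
        shuffled.append(value ^ key ^ previous)
        previous = value
    pixels = [0] * len(cipher)
    for dst, src in enumerate(pbox):
        pixels[src] = shuffled[dst]
    return pixels


if __name__ == "__main__":
    image = list(range(16))
    secret = encrypt(image, seed=2016, initial_value=37)
    print(decrypt(secret, seed=2016, initial_value=37) == image)
```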

  13. Variable speed wind turbine generator with zero-sequence filter

    DOEpatents

    Muljadi, Eduard

    1998-01-01

    A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero frequency sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility.
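    The separation of positive-sequence and zero-sequence components that this abstract relies on follows the standard symmetrical-components (Fortescue) transform. The sketch below is a generic illustration with phasors as Python complex numbers, not a model of the patented controller. The sample magnitudes and the function name are assumptions.

```python
import cmath
import math

A = cmath.exp(2j * math.pi / 3)   # 120 degree rotation operator


def symmetrical_components(ia, ib, ic):
    """Return (zero, positive, negative) sequence phasors of a
    three-phase set using the standard Fortescue transform."""
    zero = (ia + ib + ic) / 3
    positive = (ia + A * ib + A * A * ic) / 3
    negative = (ia + A * A * ib + A * ic) / 3
    return zero, positive, negative


if __name__ == "__main__":
    pos = 10 + 0j      # balanced excitation component (assumed magnitude)
    zer = 4 + 3j       # component in phase on all three windings (assumed)
    # Each phase carries a balanced positive-sequence part plus an
    # identical zero-sequence part, as in the commanded currents above.
    ia = pos + zer
    ib = A * A * pos + zer
    ic = A * pos + zer
    print(symmetrical_components(ia, ib, ic))
```

    Because the zero-sequence parts are identical in all three phases, they cancel in the positive-sequence sum and survive only in the average of the three currents, which is what allows a filter to route the two components to different destinations.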

  15. Variable speed wind turbine generator with zero-sequence filter

    DOEpatents

    Muljadi, E.

    1998-08-25

    A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero frequency sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility. 14 figs.

  16. Transcriptome Sequencing and Development of an Expression Microarray Platform for Liver Infection in Adenovirus Type 5-Infected Syrian Golden Hamsters

    PubMed Central

    Ying, Baoling; Toth, Karoly; Spencer, Jacqueline F.; Aurora, Rajeev; Wold, William S.M.

    2015-01-01

    The Syrian golden hamster is an attractive animal for research on infectious diseases and other diseases. We report here the sequencing, assembly, and annotation of the Syrian hamster transcriptome. We include transcripts from ten pooled tissues from a naïve hamster and one stimulated with lipopolysaccharide. Our data set identified 42,707 non-redundant transcripts, representing 34,191 unique genes. Based on the transcriptome data, we generated a custom microarray and used this new platform to investigate the transcriptional response in the Syrian hamster liver following intravenous adenovirus type 5 (Ad5) infection. We found that Ad5 infection caused a massive change in regulation of liver transcripts, with robust up-regulation of genes involved in the antiviral response, indicating that the innate immune response functions in the host defense against Ad5 infection of the liver. The data and novel platforms developed in this study will facilitate further development of this important animal model. PMID:26319212

  17. Using next-generation sequencing as a genetic diagnostic tool in rare autosomal recessive neurologic Mendelian disorders.

    PubMed

    Chen, Zhao; Wang, Jun-Ling; Tang, Bei-Sha; Sun, Zhan-Fang; Shi, Yu-Ting; Shen, Lu; Lei, Li-Fang; Wei, Xiao-Ming; Xiao, Jing-Jing; Hu, Zheng-Mao; Pan, Qian; Xia, Kun; Zhang, Qing-Yan; Dai, Mei-Zhi; Liu, Yu; Ashizawa, Tetsuo; Jiang, Hong

    2013-10-01

    Next-generation sequencing was used to investigate 9 Chinese pedigrees with rare autosomal recessive neurologic Mendelian disorders. Five probands with ataxia-telangiectasia and 1 proband with chorea-acanthocytosis were analyzed by targeted gene sequencing. Whole-exome sequencing was used to investigate 3 affected individuals with Joubert syndrome, nemaline myopathy, or spastic ataxia Charlevoix-Saguenay type. A list of known and novel candidate variants was identified for each causative gene. All variants were genetically verified by Sanger sequencing or quantitative polymerase chain reaction with the strategy of disease segregation in related pedigrees and healthy controls. The advantages of using next-generation sequencing to diagnose rare autosomal recessive neurologic Mendelian disorders characterized by genetic and phenotypic heterogeneity are demonstrated. A genetic diagnostic strategy combining the use of targeted gene sequencing and whole-exome sequencing with the aid of next-generation sequencing platforms has shown great promise for improving the diagnosis of neurologic Mendelian disorders. PMID:23726790

  18. Primer and platform effects on 16S rRNA tag sequencing

    DOE PAGES Beta

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  19. Primer and platform effects on 16S rRNA tag sequencing

    SciTech Connect

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.
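
    The abstract does not say which beta diversity metrics were used. As one common example, the Bray-Curtis dissimilarity between two community profiles can be sketched in Python as follows; the function name and the taxon counts are hypothetical and serve only to show how such a metric compares two samples.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two abundance profiles.

    sample_a, sample_b: dicts mapping a taxon label to a non-negative count.
    Returns 0.0 for identical profiles and 1.0 for profiles sharing no taxa.
    """
    taxa = set(sample_a) | set(sample_b)
    # Sum of absolute count differences over every taxon seen in either sample.
    difference = sum(abs(sample_a.get(t, 0) - sample_b.get(t, 0)) for t in taxa)
    total = sum(sample_a.values()) + sum(sample_b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    return difference / total


if __name__ == "__main__":
    # Hypothetical counts for the same community profiled with two primer sets.
    primer_set_1 = {"taxon_x": 3, "taxon_y": 1}
    primer_set_2 = {"taxon_x": 1, "taxon_y": 3}
    print(bray_curtis(primer_set_1, primer_set_2))
```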

  20. A high-throughput optomechanical retrieval method for sequence-verified clonal DNA from the NGS platform.

    PubMed

    Lee, Howon; Kim, Hyoki; Kim, Sungsik; Ryu, Taehoon; Kim, Hwangbeom; Bang, Duhee; Kwon, Sunghoon

    2015-01-01

    Writing DNA plays a significant role in the fields of synthetic biology, functional genomics and bioengineering. DNA clones on next-generation sequencing (NGS) platforms have the potential to be a rich and cost-effective source of sequence-verified DNAs as a precursor for DNA writing. However, it is still very challenging to retrieve target clonal DNA from high-density NGS platforms. Here we propose an enabling technology called 'Sniper Cloning' that enables the precise mapping of target clone features on NGS platforms and non-contact rapid retrieval of targets for the full utilization of DNA clones. By merging the three cutting-edge technologies of NGS, DNA microarray and our pulse laser retrieval system, Sniper Cloning is a week-long process that produces 5,188 error-free synthetic DNAs in a single run of NGS with a single microarray DNA pool. We believe that this technology has potential as a universal tool for DNA writing in biological sciences. PMID:25641679

  1. A high-throughput optomechanical retrieval method for sequence-verified clonal DNA from the NGS platform.

    PubMed

    Lee, Howon; Kim, Hyoki; Kim, Sungsik; Ryu, Taehoon; Kim, Hwangbeom; Bang, Duhee; Kwon, Sunghoon

    2015-02-02

    Writing DNA plays a significant role in the fields of synthetic biology, functional genomics and bioengineering. DNA clones on next-generation sequencing (NGS) platforms have the potential to be a rich and cost-effective source of sequence-verified DNAs as a precursor for DNA writing. However, it is still very challenging to retrieve target clonal DNA from high-density NGS platforms. Here we propose an enabling technology called 'Sniper Cloning' that enables the precise mapping of target clone features on NGS platforms and non-contact rapid retrieval of targets for the full utilization of DNA clones. By merging the three cutting-edge technologies of NGS, DNA microarray and our pulse laser retrieval system, Sniper Cloning is a week-long process that produces 5,188 error-free synthetic DNAs in a single run of NGS with a single microarray DNA pool. We believe that this technology has potential as a universal tool for DNA writing in biological sciences.

  2. WITHDRAWN: Evaluation of next-generation sequencing software in mapping and assembly.

    PubMed

    Bao, Suying; Jiang, Rui; Kwan, Wingkeung; Wang, Binbin; Ma, Xu; Song, You-Qiang

    2011-06-16

    Next-generation high-throughput DNA sequencing technologies have advanced progressively in sequence-based genomic research and novel biological applications with the promise of sequencing DNA at unprecedented speed. Compared with traditional sequencing methods, these new non-Sanger-based technologies offer several advantages: higher sequencing speed, lower per-run cost and higher accuracy. However, reads from next-generation sequencing (NGS) platforms, such as 454/Roche, ABI/SOLiD and Illumina/Solexa, are usually short, thereby restricting the applications of NGS platforms in genome assembly and annotation. We presented an overview of the challenges that these novel technologies meet and particularly illustrated various bioinformatics approaches to mapping and assembly. We then compared the performance of several programs in these two fields and further provided advice on selecting suitable tools for specific biological applications. Journal of Human Genetics advance online publication, 16 June 2011; doi:10.1038/jhg.2011.62.

  3. [Automatic analysis pipeline of next-generation sequencing data].

    PubMed

    Wenke, Li; Fengyu, Li; Siyao, Zhang; Bin, Cai; Na, Zheng; Yu, Nie; Dao, Zhou; Qian, Zhao

    2014-06-01

    The development of next-generation sequencing has generated high demand for data processing and analysis. Although many software tools exist for analyzing next-generation sequencing data, most are designed for one specific function (e.g., alignment, variant calling or annotation). Therefore, it is necessary to combine them for data analysis and to generate interpretable results for biologists. This study designed a pipeline to process Illumina sequencing data based on the Perl programming language and the SGE system. The pipeline takes original sequence data (fastq format) as input, calls the standard data processing software (e.g., BWA, Samtools, GATK, and Annovar), and finally outputs a list of annotated variants that researchers can further analyze. The pipeline simplifies manual operation and improves efficiency through automation and parallel computation. Users can easily run the pipeline by editing the configuration file or using the graphical interface. Our work will facilitate research projects that use the sequencing technology.
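
    The chaining of single-purpose tools described in this abstract can be sketched as an ordered list of stages, each consuming the previous stage's output. The Python sketch below is an assumption-laden illustration, not the authors' Perl/SGE pipeline: the `Stage` class, the `plan` function, and the role assigned to each tool are hypothetical, and no real command lines are generated or executed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str      # tool named in the abstract
    purpose: str   # role assumed here for illustration
    consumes: str  # kind of data the stage reads
    produces: str  # kind of data the stage writes


# Assumed ordering: fastq input through to annotated variants.
STAGES = [
    Stage("BWA", "alignment", "fastq", "alignments"),
    Stage("Samtools", "sorting and indexing", "alignments", "sorted alignments"),
    Stage("GATK", "variant calling", "sorted alignments", "variants"),
    Stage("Annovar", "annotation", "variants", "annotated variants"),
]


def plan(stages, start="fastq"):
    """Check that the stages chain together and return a dry-run plan."""
    current = start
    steps = []
    for stage in stages:
        if stage.consumes != current:
            raise ValueError(
                f"{stage.name} expects {stage.consumes!r} but got {current!r}"
            )
        steps.append(f"{stage.name} ({stage.purpose}): {stage.consumes} -> {stage.produces}")
        current = stage.produces
    return steps, current


if __name__ == "__main__":
    steps, final_output = plan(STAGES)
    for step in steps:
        print(step)
    print("final output:", final_output)
```

    Keeping the plan as data rather than hard-coded calls mirrors the abstract's point that users drive the pipeline through a configuration file; a real implementation would attach the actual command for each tool and submit it to the scheduler.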

  4. Next Generation Sequencing for the Diagnosis of Cardiac Arrhythmia Syndromes

    PubMed Central

    Lubitz, Steven A.; Ellinor, Patrick T.

    2015-01-01

    Inherited arrhythmia syndromes are collectively associated with substantial morbidity, yet our understanding of the genetic architecture of these conditions remains limited. Recent technological advances in DNA sequencing have led to the commercialization of genetic testing now widely available in clinical practice. In particular, next generation sequencing allows the large-scale and rapid assessment of entire genomes. Although next generation sequencing represents a major technological advance, it has introduced numerous challenges with respect to the interpretation of genetic variation, and has opened a veritable floodgate of biological data of unknown clinical significance to practitioners. In this review, we discuss current genetic testing indications for inherited arrhythmia syndromes, broadly outline characteristics of next generation sequencing techniques, and highlight challenges associated with such testing. We further summarize future directions that will be necessary to address to enable the widespread adoption of next generation sequencing in the routine management of patients with inherited arrhythmia syndromes. PMID:25625719

  5. Methods in virus diagnostics: from ELISA to next generation sequencing.

    PubMed

    Boonham, Neil; Kreuze, Jan; Winter, Stephan; van der Vlugt, René; Bergervoet, Jan; Tomlinson, Jenny; Mumford, Rick

    2014-06-24

    Despite the seemingly continuous development of newer and ever more elaborate methods for detecting and identifying viruses, very few of these new methods get adopted for routine use in testing laboratories, often despite the many and varied claimed advantages they possess. Understanding why the rate of uptake of new technologies is so low first requires a strong understanding of what makes a good routine diagnostic tool. This can be done by looking at the two most successfully established plant virus detection methods: enzyme-linked immunosorbent assay (ELISA) and the more recently introduced real-time polymerase chain reaction (PCR). By examining the characteristics of this pair of technologies, it becomes clear that they share many benefits, such as an industry standard format and high levels of repeatability and reproducibility. These combine to make methods that are accessible to testing labs, easy to establish and robust in their use, even with new and inexperienced users. Hence, to ensure the establishment of new techniques it is necessary not only to provide benefits not found with ELISA or real-time PCR, but also to provide a platform that is easy to establish and use. In plant virus diagnostics, recent developments can be clustered into three core areas: (1) techniques that can be performed in the field or resource poor locations (e.g., loop-mediated isothermal amplification LAMP); (2) multiplex methods that are able to detect many viruses in a single test (e.g., Luminex bead arrays); and (3) methods suited to virus discovery (e.g., next generation sequencing, NGS). Field based methods are not new, with Lateral Flow Devices (LFDs) for detection having been available for a number of years now. However, the widespread uptake of this technology remains poor. LAMP does offer significant advantages over LFDs, in terms of sensitivity and generic application, but still faces challenges in terms of establishment. It is likely that the main barrier to the

  6. Next Generation Sequencing Technologies: The Doorway to the Unexplored Genomics of Non-Model Plants

    PubMed Central

    Unamba, Chibuikem I. N.; Nag, Akshay; Sharma, Ram K.

    2015-01-01

    Non-model plants, i.e., species with one or all of such characteristics as a long life cycle, difficulty of growth in the laboratory or poor fecundity, were previously left out of sequencing projects because of the high running cost of Sanger sequencing. Consequently, information about their genomics and key biological processes is inadequate. However, the advent of fast and cost effective next generation sequencing (NGS) platforms in the recent past has enabled the unearthing of certain characteristic gene structures unique to these species. It has also aided in gaining insight into mechanisms underlying processes of gene expression and secondary metabolism, and has facilitated development of genomic resources for diversity characterization, evolutionary analysis and marker assisted breeding even without prior availability of genomic sequence information. In this review we explore how different Next Gen Sequencing platforms, as well as recent advances in NGS based high throughput genotyping technologies, are rewarding efforts on de-novo whole genome/transcriptome sequencing and the development of genome wide sequence based marker resources for improvement of non-model crops that are less costly than phenotyping. PMID:26734016

  7. Estimating genotype error rates from high-coverage next-generation sequence data.

    PubMed

    Wall, Jeffrey D; Tang, Ling Fung; Zerbe, Brandon; Kvale, Mark N; Kwok, Pui-Yan; Schaefer, Catherine; Risch, Neil

    2014-11-01

    Exome and whole-genome sequencing studies are becoming increasingly common, but little is known about the accuracy of the genotype calls made by the commonly used platforms. Here we use replicate high-coverage sequencing of blood and saliva DNA samples from four European-American individuals to estimate lower bounds on the error rates of Complete Genomics and Illumina HiSeq whole-genome and whole-exome sequencing. Error rates for nonreference genotype calls range from 0.1% to 0.6%, depending on the platform and the depth of coverage. Additionally, we found (1) no difference in the error profiles or rates between blood and saliva samples; (2) Complete Genomics sequences had substantially higher error rates than Illumina sequences had; (3) error rates were higher (up to 6%) for rare or unique variants; (4) error rates generally declined with genotype quality (GQ) score, but in a nonlinear fashion for the Illumina data, likely due to loss of specificity of GQ scores greater than 60; and (5) error rates increased with increasing depth of coverage for the Illumina data. These findings, especially (3)-(5), suggest that caution should be taken in interpreting the results of next-generation sequencing-based association studies, and even more so in clinical application of this technology in the absence of validation by other more robust sequencing or genotyping methods.
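
    The replicate-based idea in this abstract can be illustrated with a simple discordance calculation. The Python sketch below is not the authors' estimator: the function name, the genotype encoding ("0/0" for homozygous reference), and the example calls are assumptions. It counts how often two replicate call sets for the same individual disagree at sites where at least one replicate reports a nonreference genotype. Errors that occur identically in both replicates are invisible to this comparison, which is why replicate comparisons of this kind bound the error rate from below.

```python
def nonref_discordance(replicate_1, replicate_2):
    """Fraction of nonreference sites where two replicates disagree.

    replicate_1, replicate_2: dicts mapping a site identifier to a genotype
    string such as "0/0", "0/1" or "1/1" (hypothetical encoding).
    Only sites called in both replicates are compared.
    """
    shared_sites = replicate_1.keys() & replicate_2.keys()
    # Keep sites where at least one replicate reports a nonreference genotype.
    nonref_sites = [
        site for site in shared_sites
        if replicate_1[site] != "0/0" or replicate_2[site] != "0/0"
    ]
    if not nonref_sites:
        return 0.0
    mismatches = sum(replicate_1[s] != replicate_2[s] for s in nonref_sites)
    return mismatches / len(nonref_sites)


if __name__ == "__main__":
    # Hypothetical calls from blood and saliva DNA of one individual.
    blood = {"site1": "0/1", "site2": "1/1", "site3": "0/0", "site4": "0/1"}
    saliva = {"site1": "0/1", "site2": "0/1", "site3": "0/0", "site4": "0/1"}
    print(nonref_discordance(blood, saliva))
```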

  8. Analyzing the safety of removal sequences for piles of an offshore jacket platform

    NASA Astrophysics Data System (ADS)

    Pan, Xin-Ying; Zhang, Zhao-De

    2009-12-01

    An inevitable consequence of the development of the offshore petroleum industry is the eventual obsolescence of large offshore structures. Proper methods for removal of decommissioned offshore platforms are becoming an important topic that the oil and gas industry must pay increasing attention to. While removing sections from a decommissioned jacket platform, the stability of the remaining parts is critical. The jacket danger indices D_σ and D_s defined in this paper are very useful for analyzing the safety of any procedure planned for disassembling a jacket platform. The safest pile cutting sequence can be determined easily by comparing every column of D_σ and D_s or simply analyzing the figures of every row of D_σ and D_s.

  9. Non-random DNA fragmentation in next-generation sequencing

    NASA Astrophysics Data System (ADS)

    Poptsova, Maria S.; Il'icheva, Irina A.; Nechipurenko, Dmitry Yu.; Panchenko, Larisa A.; Khodikov, Mingian V.; Oparina, Nina Y.; Polozov, Robert V.; Nechipurenko, Yury D.; Grokhovsky, Sergei L.

    2014-03-01

    Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments and sequencing them in a massively parallel fashion. The multiple overlapping segments, termed “reads”, are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, we previously showed that for sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of hydrodynamic forces on DNA produce a similar bias. Consideration of this non-random DNA fragmentation may allow one to unravel which factors influence the non-uniform coverage of various genomic regions, and to what extent.
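
    The requirement that every region be sequenced several dozen times corresponds to the mean depth of coverage, which is the number of reads times the read length divided by the genome size. The Python sketch below works through that arithmetic; the function names and the example numbers are illustrative assumptions, not values from the study. It also evaluates the fraction of bases expected to remain uncovered under the idealized assumption that read positions are random and sequence-independent, which is the very assumption the abstract calls into question.

```python
import math


def mean_coverage(n_reads, read_length, genome_size):
    """Average number of times each base is sequenced."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


def uncovered_fraction(coverage):
    """Expected fraction of bases with zero reads under uniformly random
    read placement (Poisson approximation)."""
    return math.exp(-coverage)


if __name__ == "__main__":
    # Hypothetical example: a 5 Mb genome sequenced with 100-base reads.
    genome_size = 5_000_000
    read_length = 100
    print(mean_coverage(1_000_000, read_length, genome_size))
    print(reads_needed(30, read_length, genome_size))
    print(uncovered_fraction(20.0))
```

    Sequence-dependent fragmentation of the kind reported here would make real coverage less uniform than this idealized model predicts.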

  10. Non-random DNA fragmentation in next-generation sequencing

    PubMed Central

    Poptsova, Maria S.; Il'icheva, Irina A.; Nechipurenko, Dmitry Yu.; Panchenko, Larisa A.; Khodikov, Mingian V.; Oparina, Nina Y.; Polozov, Robert V.; Nechipurenko, Yury D.; Grokhovsky, Sergei L.

    2014-01-01

    Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments and sequencing them in a massively parallel fashion. The multiple overlapping segments, termed “reads”, are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, we previously showed that for sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of hydrodynamic forces on DNA produce a similar bias. Consideration of this non-random DNA fragmentation may allow one to unravel which factors influence the non-uniform coverage of various genomic regions, and to what extent. PMID:24681819

  11. Bioelectrochemical system platform for sustainable environmental remediation and energy generation.

    PubMed

    Wang, Heming; Luo, Haiping; Fallgren, Paul H; Jin, Song; Ren, Zhiyong Jason

    2015-01-01

    The increasing awareness of the energy-environment nexus is compelling the development of technologies that reduce environmental impacts during energy production as well as energy consumption during environmental remediation. Countries spend billions in pollution cleanup projects, and new technologies with low energy and chemical consumption are needed for sustainable remediation practice. This perspective review provides a comprehensive summary on the mechanisms of the new bioelectrochemical system (BES) platform technology for efficient and low cost remediation, including petroleum hydrocarbons, chlorinated solvents, perchlorate, azo dyes, and metals, and it also discusses the potential new uses of BES approach for some emerging contaminants remediation, such as CO2 in air and nutrients and micropollutants in water. The unique feature of BES for environmental remediation is the use of electrodes as non-exhaustible electron acceptors, or even donors, for contaminant degradation, which requires minimum energy or chemicals but instead produces sustainable energy for monitoring and other onsite uses. BES provides both oxidation (anode) and reduction (cathode) reactions that integrate microbial-electro-chemical removal mechanisms, so complex contaminants with different characteristics can be removed. We believe the BES platform carries great potential for sustainable remediation and hope this perspective provides background and insights for future research and development.

  12. Bioelectrochemical system platform for sustainable environmental remediation and energy generation.

    PubMed

    Wang, Heming; Luo, Haiping; Fallgren, Paul H; Jin, Song; Ren, Zhiyong Jason

    2015-01-01

    The increasing awareness of the energy-environment nexus is compelling the development of technologies that reduce environmental impacts during energy production as well as energy consumption during environmental remediation. Countries spend billions in pollution cleanup projects, and new technologies with low energy and chemical consumption are needed for sustainable remediation practice. This perspective review provides a comprehensive summary on the mechanisms of the new bioelectrochemical system (BES) platform technology for efficient and low cost remediation, including petroleum hydrocarbons, chlorinated solvents, perchlorate, azo dyes, and metals, and it also discusses the potential new uses of BES approach for some emerging contaminants remediation, such as CO2 in air and nutrients and micropollutants in water. The unique feature of BES for environmental remediation is the use of electrodes as non-exhaustible electron acceptors, or even donors, for contaminant degradation, which requires minimum energy or chemicals but instead produces sustainable energy for monitoring and other onsite uses. BES provides both oxidation (anode) and reduction (cathode) reactions that integrate microbial-electro-chemical removal mechanisms, so complex contaminants with different characteristics can be removed. We believe the BES platform carries great potential for sustainable remediation and hope this perspective provides background and insights for future research and development. PMID:25886880

  13. Parallel tagged amplicon sequencing of relatively long PCR products using the Illumina HiSeq platform and transcriptome assembly.

    PubMed

    Feng, Yan-Jie; Liu, Qing-Feng; Chen, Meng-Yun; Liang, Dan; Zhang, Peng

    2016-01-01

    In phylogenetics and population genetics, a large number of loci are often needed to accurately resolve species relationships. Normally, loci are enriched by PCR and sequenced by Sanger sequencing, which is expensive when the number of amplicons is large. Next-generation sequencing (NGS) techniques are increasingly used for parallel amplicon sequencing, which reduces sequencing costs tremendously, but has not reduced preparation costs very much. Moreover, for most current NGS methods, amplicons need to be purified and quantified before sequencing and their lengths are also restricted (normally <700 bp). Here, we describe an approach to sequence pooled amplicons of any length using the Illumina platform. Using this method, amplicons are pooled at equal volume rather than at equal concentration, thus eliminating the laborious purification and quantification steps. We then shear the pooled amplicons, repair the ends, add sample identifying linkers and pool multiple samples prior to Illumina library preparation. Data are then assembled using the transcriptome assembly program Trinity, which is optimized to deal with templates of highly varying quantities. We demonstrated the utility of our approach by recovering 93.5% of the target amplicons (size up to 1650 bp) in full length for a 16 taxa × 101 loci project, using ~2.0 GB of Illumina HiSeq paired-end 90-bp data. Overall, we validate a rapid, cost-effective and scalable approach to sequence a large number of targeted loci from a large number of samples that is particularly suitable for both phylogenetics and population genetics studies that require a modest scale of data. PMID:25959587

  14. Parallel tagged amplicon sequencing of relatively long PCR products using the Illumina HiSeq platform and transcriptome assembly.

    PubMed

    Feng, Yan-Jie; Liu, Qing-Feng; Chen, Meng-Yun; Liang, Dan; Zhang, Peng

    2016-01-01

    In phylogenetics and population genetics, a large number of loci are often needed to accurately resolve species relationships. Normally, loci are enriched by PCR and sequenced by Sanger sequencing, which is expensive when the number of amplicons is large. Next-generation sequencing (NGS) techniques are increasingly used for parallel amplicon sequencing, which reduces sequencing costs tremendously, but has not reduced preparation costs very much. Moreover, for most current NGS methods, amplicons need to be purified and quantified before sequencing and their lengths are also restricted (normally <700 bp). Here, we describe an approach to sequence pooled amplicons of any length using the Illumina platform. Using this method, amplicons are pooled at equal volume rather than at equal concentration, thus eliminating the laborious purification and quantification steps. We then shear the pooled amplicons, repair the ends, add sample identifying linkers and pool multiple samples prior to Illumina library preparation. Data are then assembled using the transcriptome assembly program Trinity, which is optimized to deal with templates of highly varying quantities. We demonstrated the utility of our approach by recovering 93.5% of the target amplicons (size up to 1650 bp) in full length for a 16 taxa × 101 loci project, using ~2.0 GB of Illumina HiSeq paired-end 90-bp data. Overall, we validate a rapid, cost-effective and scalable approach to sequence a large number of targeted loci from a large number of samples that is particularly suitable for both phylogenetics and population genetics studies that require a modest scale of data.

  15. Next-generation sequencing: the future of molecular genetics in poultry production and food safety.

    PubMed

    Diaz-Sanchez, S; Hanning, I; Pendleton, Sean; D'Souza, Doris

    2013-02-01

    The era of molecular biology and automation of the Sanger chain-terminator sequencing method has led to discovery and advances in diagnostics and biotechnology. The Sanger methodology dominated research for over 2 decades, leading to significant accomplishments and technological improvements in DNA sequencing. Next-generation high-throughput sequencing (HT-NGS) technologies were developed subsequently to overcome the limitations of this first-generation technology, offering higher speed, less labor, and lower cost. The platforms developed include sequencing-by-synthesis 454 Life Sciences, Illumina (Solexa) sequencing, SOLiD sequencing (among others), and the Ion Torrent semiconductor sequencing technologies, which use different detection principles. As technology advances, progress toward third-generation sequencing technologies is being reported, including nanopore sequencing and real-time monitoring of PCR activity through fluorescence resonance energy transfer. The advantages of these technologies include scalability, simplicity, increasing DNA polymerase performance and yields, lower error rates, and greater economic feasibility, with the eventual goal of obtaining real-time results. These technologies can be directly applied to improve poultry production and enhance food safety. For example, sequence-based (determination of the gut microbial community, genes for metabolic pathways, or presence of plasmids) and function-based (screening for function such as antibiotic resistance, or vitamin production) metagenomic analysis can be carried out. Gut microbial flora/communities of poultry can be sequenced to determine the changes that affect health and disease along with the efficacy of methods to control pathogenic growth. Thus, the purpose of this review is to provide an overview of the principles of these current technologies and their potential application to improve poultry production and food safety as well as public health.

  16. Next-generation sequencing in clinical virology: Discovery of new viruses

    PubMed Central

    Datta, Sibnarayan; Budhauliya, Raghvendra; Das, Bidisha; Chatterjee, Soumya; Vanlalhmuaka; Veer, Vijay

    2015-01-01

    Viruses are a cause of significant health problems worldwide, especially in developing nations. Due to different anthropological activities, human populations are exposed to different viral pathogens, many of which emerge as outbreaks. In such situations, discovery of novel viruses is of utmost importance for deciding prevention and treatment strategies. Since the last century, a number of different virus discovery methods, based on cell culture inoculation and sequence-independent PCR, have been used for identification of a variety of viruses. However, the recent emergence and commercial availability of next-generation sequencers (NGS) has entirely changed the field of virus discovery. These massively parallel sequencing platforms can sequence a mixture of genetic materials from a very heterogeneous mix, with high sensitivity. Moreover, these platforms work in a sequence-independent manner, making them ideal tools for virus discovery. However, for their application in clinics, sample preparation or enrichment is necessary to detect low abundance virus populations. A number of techniques have also been developed for enrichment of viral nucleic acids. In this manuscript, we review the evolution of sequencing, the NGS technologies available today, and widely used virus enrichment technologies. We also discuss the challenges associated with their application in clinical virus discovery. PMID:26279987

  17. Strategy for microbiome analysis using 16S rRNA gene sequence analysis on the Illumina sequencing platform.

    PubMed

    Ram, Jeffrey L; Karim, Aos S; Sendler, Edward D; Kato, Ikuko

    2011-06-01

    Understanding the identity and changes of organisms in the urogenital and other microbiomes of the human body may be key to discovering causes and new treatments of many ailments, such as vaginosis. High-throughput sequencing technologies have recently enabled discovery of the great diversity of the human microbiome. The cost per base of many of these sequencing platforms remains high (thousands of dollars per sample); however, the Illumina Genome Analyzer (IGA) is estimated to have a cost per base less than one-fifth of its nearest competitor. The main disadvantage of the IGA for sequencing PCR-amplified 16S rRNA genes is that the maximum read length of the IGA is only 100 bases, whereas at least 300 bases are needed to obtain phylogenetically informative data down to the genus and species level. In this paper we describe and conduct a pilot test of a multiplex sequencing strategy suitable for achieving total reads of > 300 bases per extracted DNA molecule on the IGA. Results show that all proposed primers produce products of the expected size and that correct sequences can be obtained with all proposed forward primers. Various bioinformatic optimizations of the Illumina Bustard analysis pipeline proved necessary to extract the correct sequence from IGA image data, and these modifications of the data files indicate that further optimization of the analysis pipeline may improve the quality rankings of the data and enable more sequence to be correctly analyzed. The successful application of this method could result in an unprecedentedly deep description (800,000 taxonomic identifications per sample) of the urogenital and other microbiomes in a large number of samples at a reasonable cost per sample. PMID:21361774

  18. Lessons from next-generation sequencing analysis in hematological malignancies

    PubMed Central

    Braggio, E; Egan, J B; Fonseca, R; Stewart, A K

    2013-01-01

    Next-generation sequencing has led to a revolution in the study of hematological malignancies with a substantial number of publications and discoveries in the last few years. Significant discoveries associated with disease diagnosis, risk stratification, clonal evolution and therapeutic intervention have been generated by this powerful technology. As part of the post-genomic era, sequencing analysis will likely become part of routine clinical testing and the challenge will ultimately be successfully transitioning from gene discovery to preventive and therapeutic intervention as part of individualized medicine strategies. In this report, we review recent advances in the understanding of hematological malignancies derived through genome-wide sequence analysis. PMID:23872706

  19. Comparison of CAGE and RNA-seq transcriptome profiling using clonally amplified and single-molecule next-generation sequencing.

    PubMed

    Kawaji, Hideya; Lizio, Marina; Itoh, Masayoshi; Kanamori-Katayama, Mutsumi; Kaiho, Ai; Nishiyori-Sueki, Hiromi; Shin, Jay W; Kojima-Ishiyama, Miki; Kawano, Mitsuoki; Murata, Mitsuyoshi; Ninomiya-Fukuda, Noriko; Ishikawa-Kato, Sachi; Nagao-Sato, Sayaka; Noma, Shohei; Hayashizaki, Yoshihide; Forrest, Alistair R R; Carninci, Piero

    2014-04-01

    CAGE (cap analysis gene expression) and RNA-seq are two major technologies used to identify transcript abundances as well as structures. They measure expression by sequencing from either the 5' end of capped molecules (CAGE) or tags randomly distributed along the length of a transcript (RNA-seq). Library protocols for clonally amplified (Illumina, SOLiD, 454 Life Sciences [Roche], Ion Torrent), second-generation sequencing platforms typically employ PCR preamplification prior to clonal amplification, while third-generation, single-molecule sequencers can sequence unamplified libraries. Although these transcriptome profiling platforms have been demonstrated to be individually reproducible, no systematic comparison has been carried out between them. Here we compare CAGE, using both second- and third-generation sequencers, and RNA-seq, using a second-generation sequencer based on a panel of RNA mixtures from two human cell lines to examine power in the discrimination of biological states, detection of differentially expressed genes, linearity of measurements, and quantification reproducibility. We found that the quantified levels of gene expression are largely comparable across platforms and conclude that CAGE and RNA-seq are complementary technologies that can be used to improve incomplete gene models. We also found systematic bias in the second- and third-generation platforms, which is likely due to steps such as linker ligation, cleavage by restriction enzymes, and PCR amplification. This study provides a perspective on the performance of these platforms, which will be a baseline in the design of further experiments to tackle complex transcriptomes uncovered in a wide range of cell types.

  20. Next Generation Sequencing Technologies for Insect Virus Discovery

    PubMed Central

    Liu, Sijun; Vijayendran, Diveena; Bonning, Bryony C.

    2011-01-01

    Insects are commonly infected with multiple viruses including those that cause sublethal, asymptomatic, and latent infections. Traditional methods for virus isolation typically lack the sensitivity required for detection of such viruses that are present at low abundance. In this respect, next generation sequencing technologies have revolutionized methods for the discovery and identification of new viruses from insects. Here we review both traditional and modern methods for virus discovery, and outline analysis of transcriptome and small RNA data for identification of viral sequences. We will introduce methods for de novo assembly of viral sequences, identification of potential viral sequences from BLAST data, and bioinformatics for generating full-length or near full-length viral genome sequences. We will also discuss implications of the ubiquity of viruses in insects and in insect cell lines. All of the methods described in this article can also apply to the discovery of viruses in other organisms. PMID:22069519

  1. Third Generation Sequencing Techniques and Applications to Drug Discovery

    PubMed Central

    Ozsolak, Fatih

    2012-01-01

    Introduction: There is an immediate need for functional and molecular studies to decipher differences between disease and “normal” settings to identify large quantities of validated targets with the highest therapeutic utilities. Furthermore, drug mechanism of action and biomarkers to predict drug efficacy and safety need to be identified for effective design of clinical trials, decreasing attrition rates, regulatory agency approval process and drug repositioning. By expanding the power of genetics and pharmacogenetics studies, next generation nucleic acid sequencing technologies have started to play an important role in all stages of drug discovery. Areas covered: This article reviews the first and second generation sequencing technologies (SGSTs) and challenges they pose to biomedicine. The article then focuses on the emerging third generation sequencing technologies (TGSTs), their technological foundations and potential contributions to drug discovery. Expert Opinion: Despite the scientific and commercial success of SGSTs, the goal of rapid, comprehensive and unbiased sequencing of nucleic acids has not been achieved. TGSTs promise to increase sequencing throughput and read lengths, decrease costs, run times and error rates, eliminate biases inherent in SGSTs, and offer capabilities beyond nucleic acid sequencing. Such changes will have positive impact in all sequencing applications to drug discovery. PMID:22468954

  2. Next-Generation Sequencing in Clinical Molecular Diagnostics of Cancer: Advantages and Challenges.

    PubMed

    Luthra, Rajyalakshmi; Chen, Hui; Roy-Chowdhuri, Sinchita; Singh, R Rajesh

    2015-01-01

    The application of next-generation sequencing (NGS) to characterize cancer genomes has resulted in the discovery of numerous genetic markers. Consequently, the number of markers that warrant routine screening in molecular diagnostic laboratories, often from limited tumor material, has increased. This increased demand has been difficult to manage by traditional low- and/or medium-throughput sequencing platforms. Massively parallel sequencing capabilities of NGS provide a much-needed alternative for mutation screening in multiple genes with a single low investment of DNA. However, implementation of NGS technologies, most of which are for research use only (RUO), in a diagnostic laboratory, needs extensive validation in order to establish Clinical Laboratory Improvement Amendments (CLIA) and College of American Pathologists (CAP)-compliant performance characteristics. Here, we have reviewed approaches for validation of NGS technology for routine screening of tumors. We discuss the criteria for selecting gene markers to include in the NGS panel and the deciding factors for selecting target capture approaches and sequencing platforms. We also discuss challenges in result reporting, storage and retrieval of the voluminous sequencing data and the future potential of clinical NGS. PMID:26473927

  3. Next-Generation Sequencing in Clinical Molecular Diagnostics of Cancer: Advantages and Challenges.

    PubMed

    Luthra, Rajyalakshmi; Chen, Hui; Roy-Chowdhuri, Sinchita; Singh, R Rajesh

    2015-01-01

    The application of next-generation sequencing (NGS) to characterize cancer genomes has resulted in the discovery of numerous genetic markers. Consequently, the number of markers that warrant routine screening in molecular diagnostic laboratories, often from limited tumor material, has increased. This increased demand has been difficult to manage by traditional low- and/or medium-throughput sequencing platforms. Massively parallel sequencing capabilities of NGS provide a much-needed alternative for mutation screening in multiple genes with a single low investment of DNA. However, implementation of NGS technologies, most of which are for research use only (RUO), in a diagnostic laboratory, needs extensive validation in order to establish Clinical Laboratory Improvement Amendments (CLIA) and College of American Pathologists (CAP)-compliant performance characteristics. Here, we have reviewed approaches for validation of NGS technology for routine screening of tumors. We discuss the criteria for selecting gene markers to include in the NGS panel and the deciding factors for selecting target capture approaches and sequencing platforms. We also discuss challenges in result reporting, storage and retrieval of the voluminous sequencing data and the future potential of clinical NGS.

  4. Next-Generation Sequencing in Clinical Molecular Diagnostics of Cancer: Advantages and Challenges

    PubMed Central

    Luthra, Rajyalakshmi; Chen, Hui; Roy-Chowdhuri, Sinchita; Singh, R. Rajesh

    2015-01-01

    The application of next-generation sequencing (NGS) to characterize cancer genomes has resulted in the discovery of numerous genetic markers. Consequently, the number of markers that warrant routine screening in molecular diagnostic laboratories, often from limited tumor material, has increased. This increased demand has been difficult to manage by traditional low- and/or medium-throughput sequencing platforms. Massively parallel sequencing capabilities of NGS provide a much-needed alternative for mutation screening in multiple genes with a single low investment of DNA. However, implementation of NGS technologies, most of which are for research use only (RUO), in a diagnostic laboratory, needs extensive validation in order to establish Clinical Laboratory Improvement Amendments (CLIA) and College of American Pathologists (CAP)-compliant performance characteristics. Here, we have reviewed approaches for validation of NGS technology for routine screening of tumors. We discuss the criteria for selecting gene markers to include in the NGS panel and the deciding factors for selecting target capture approaches and sequencing platforms. We also discuss challenges in result reporting, storage and retrieval of the voluminous sequencing data and the future potential of clinical NGS. PMID:26473927

  5. Collaborative Effort for a Centralized Worldwide Tuberculosis Relational Sequencing Data Platform

    PubMed Central

    Starks, Angela M.; Avilés, Enrique; Cirillo, Daniela M.; Denkinger, Claudia M.; Dolinger, David L.; Emerson, Claudia; Gallarda, Jim; Hanna, Debra; Kim, Peter S.; Liwski, Richard; Miotto, Paolo; Schito, Marco; Zignol, Matteo

    2015-01-01

    Continued progress in addressing challenges associated with detection and management of tuberculosis requires new diagnostic tools. These tools must be able to provide rapid and accurate information for detecting resistance to guide selection of the treatment regimen for each patient. To achieve this goal, globally representative genotypic, phenotypic, and clinical data are needed in a standardized and curated data platform. A global partnership of academic institutions, public health agencies, and nongovernmental organizations has been established to develop a tuberculosis relational sequencing data platform (ReSeqTB) that seeks to increase understanding of the genetic basis of resistance by correlating molecular data with results from drug susceptibility testing and, optimally, associated patient outcomes. These data will inform development of new diagnostics, facilitate clinical decision making, and improve surveillance for drug resistance. ReSeqTB offers an opportunity for collaboration to achieve improved patient outcomes and to advance efforts to prevent and control this devastating disease. PMID:26409275

  6. Collaborative Effort for a Centralized Worldwide Tuberculosis Relational Sequencing Data Platform.

    PubMed

    Starks, Angela M; Avilés, Enrique; Cirillo, Daniela M; Denkinger, Claudia M; Dolinger, David L; Emerson, Claudia; Gallarda, Jim; Hanna, Debra; Kim, Peter S; Liwski, Richard; Miotto, Paolo; Schito, Marco; Zignol, Matteo

    2015-10-15

    Continued progress in addressing challenges associated with detection and management of tuberculosis requires new diagnostic tools. These tools must be able to provide rapid and accurate information for detecting resistance to guide selection of the treatment regimen for each patient. To achieve this goal, globally representative genotypic, phenotypic, and clinical data are needed in a standardized and curated data platform. A global partnership of academic institutions, public health agencies, and nongovernmental organizations has been established to develop a tuberculosis relational sequencing data platform (ReSeqTB) that seeks to increase understanding of the genetic basis of resistance by correlating molecular data with results from drug susceptibility testing and, optimally, associated patient outcomes. These data will inform development of new diagnostics, facilitate clinical decision making, and improve surveillance for drug resistance. ReSeqTB offers an opportunity for collaboration to achieve improved patient outcomes and to advance efforts to prevent and control this devastating disease. PMID:26409275

  7. The generation challenge programme platform: semantic standards and workbench for crop science.

    PubMed

    Bruskiewich, Richard; Senger, Martin; Davenport, Guy; Ruiz, Manuel; Rouard, Mathieu; Hazekamp, Tom; Takeya, Masaru; Doi, Koji; Satoh, Kouji; Costa, Marcos; Simon, Reinhard; Balaji, Jayashree; Akintunde, Akinnola; Mauleon, Ramil; Wanchana, Samart; Shah, Trushar; Anacleto, Mylah; Portugal, Arllet; Ulat, Victor Jun; Thongjuea, Supat; Braak, Kyle; Ritter, Sebastian; Dereeper, Alexis; Skofic, Milko; Rojas, Edwin; Martins, Natalia; Pappas, Georgios; Alamban, Ryan; Almodiel, Roque; Barboza, Lord Hendrix; Detras, Jeffrey; Manansala, Kevin; Mendoza, Michael Jonathan; Morales, Jeffrey; Peralta, Barry; Valerio, Rowena; Zhang, Yi; Gregorio, Sergio; Hermocilla, Joseph; Echavez, Michael; Yap, Jan Michael; Farmer, Andrew; Schiltz, Gary; Lee, Jennifer; Casstevens, Terry; Jaiswal, Pankaj; Meintjes, Ayton; Wilkinson, Mark; Good, Benjamin; Wagner, James; Morris, Jane; Marshall, David; Collins, Anthony; Kikuchi, Shoshi; Metz, Thomas; McLaren, Graham; van Hintum, Theo

    2008-01-01

    The Generation Challenge programme (GCP) is a global crop research consortium directed toward crop improvement through the application of comparative biology and genetic resources characterization to plant breeding. A key consortium research activity is the development of a GCP crop bioinformatics platform to support GCP research. This platform includes the following: (i) shared, public platform-independent domain models, ontology, and data formats to enable interoperability of data and analysis flows within the platform; (ii) web service and registry technologies to identify, share, and integrate information across diverse, globally dispersed data sources, as well as to access high-performance computational (HPC) facilities for computationally intensive, high-throughput analyses of project data; (iii) platform-specific middleware reference implementations of the domain model integrating a suite of public (largely open-access/-source) databases and software tools into a workbench to facilitate biodiversity analysis, comparative analysis of crop genomic data, and plant breeding decision making.

  8. A Robust High Throughput Platform to Generate Functional Recombinant Monoclonal Antibodies Using Rabbit B Cells from Peripheral Blood

    PubMed Central

    Seeber, Stefan; Ros, Francesca; Thorey, Irmgard; Tiefenthaler, Georg; Kaluza, Klaus; Lifke, Valeria; Fischer, Jens André Alexander; Klostermann, Stefan; Endl, Josef; Kopetzki, Erhard; Pashine, Achal; Siewe, Basile; Kaluza, Brigitte; Platzer, Josef; Offner, Sonja

    2014-01-01

    We have developed a robust platform to generate and functionally characterize rabbit-derived antibodies using B cells from peripheral blood. The rapid high throughput procedure generates a diverse set of antibodies, yet requires only few animals to be immunized without the need to sacrifice them. The workflow includes (i) the identification and isolation of single B cells from rabbit blood expressing IgG antibodies, (ii) an elaborate short term B-cell cultivation to produce sufficient monoclonal antigen specific IgG for comprehensive phenotype screens, (iii) the isolation of VH and VL coding regions via PCR from B-cell clones producing antigen specific and functional antibodies followed by the sequence determination, and (iv) the recombinant expression and purification of IgG antibodies. The fully integrated and to a large degree automated platform (demonstrated in this paper using IL1RL1 immunized rabbits) yielded clonal and very diverse IL1RL1-specific and functional IL1RL1-inhibiting rabbit antibodies. These functional IgGs from individual animals were obtained at a short time range after immunization and could be identified already during primary screening, thus substantially lowering the workload for the subsequent B-cell PCR workflow. Early availability of sequence information permits one to select early-on function- and sequence-diverse antibodies for further characterization. In summary, this powerful technology platform has proven to be an efficient and robust method for the rapid generation of antigen specific and functional monoclonal rabbit antibodies without sacrificing the immunized animal. PMID:24503933

  9. A robust high throughput platform to generate functional recombinant monoclonal antibodies using rabbit B cells from peripheral blood.

    PubMed

    Seeber, Stefan; Ros, Francesca; Thorey, Irmgard; Tiefenthaler, Georg; Kaluza, Klaus; Lifke, Valeria; Fischer, Jens André Alexander; Klostermann, Stefan; Endl, Josef; Kopetzki, Erhard; Pashine, Achal; Siewe, Basile; Kaluza, Brigitte; Platzer, Josef; Offner, Sonja

    2014-01-01

    We have developed a robust platform to generate and functionally characterize rabbit-derived antibodies using B cells from peripheral blood. The rapid high throughput procedure generates a diverse set of antibodies, yet requires only few animals to be immunized without the need to sacrifice them. The workflow includes (i) the identification and isolation of single B cells from rabbit blood expressing IgG antibodies, (ii) an elaborate short term B-cell cultivation to produce sufficient monoclonal antigen specific IgG for comprehensive phenotype screens, (iii) the isolation of VH and VL coding regions via PCR from B-cell clones producing antigen specific and functional antibodies followed by the sequence determination, and (iv) the recombinant expression and purification of IgG antibodies. The fully integrated and to a large degree automated platform (demonstrated in this paper using IL1RL1 immunized rabbits) yielded clonal and very diverse IL1RL1-specific and functional IL1RL1-inhibiting rabbit antibodies. These functional IgGs from individual animals were obtained at a short time range after immunization and could be identified already during primary screening, thus substantially lowering the workload for the subsequent B-cell PCR workflow. Early availability of sequence information permits one to select early-on function- and sequence-diverse antibodies for further characterization. In summary, this powerful technology platform has proven to be an efficient and robust method for the rapid generation of antigen specific and functional monoclonal rabbit antibodies without sacrificing the immunized animal.

  10. Neural Sequence Generation Using Spatiotemporal Patterns of Inhibition

    PubMed Central

    Cannon, Jonathan; Kopell, Nancy; Gardner, Timothy; Markowitz, Jeffrey

    2015-01-01

    Stereotyped sequences of neural activity are thought to underlie reproducible behaviors and cognitive processes ranging from memory recall to arm movement. One of the most prominent theoretical models of neural sequence generation is the synfire chain, in which pulses of synchronized spiking activity propagate robustly along a chain of cells connected by highly redundant feedforward excitation. But recent experimental observations in the avian song production pathway during song generation have shown excitatory activity interacting strongly with the firing patterns of inhibitory neurons, suggesting a process of sequence generation more complex than feedforward excitation. Here we propose a model of sequence generation inspired by these observations in which a pulse travels along a spatially recurrent excitatory chain, passing repeatedly through zones of local feedback inhibition. In this model, synchrony and robust timing are maintained not through redundant excitatory connections, but rather through the interaction between the pulse and the spatiotemporal pattern of inhibition that it creates as it circulates the network. These results suggest that spatially and temporally structured inhibition may play a key role in sequence generation. PMID:26536029

  11. Pittosporum cryptic virus 1: genome sequence completion using next-generation sequencing.

    PubMed

    Elbeaino, Toufic; Kubaa, Raied Abou; Tuzlali, Hasan Tuna; Digiaro, Michele

    2016-07-01

    Next-generation sequencing (NGS) was applied to dsRNAs extracted from an Italian pittosporum plant infected with pittosporum cryptic virus 1 (PiCV1). NGS allowed assembly of the full genome sequence of PiCV1, comprising dsRNA1 (1.9 kbp) and dsRNA2 (1.5 kbp), which encode the RNA-dependent RNA polymerase and capsid protein genes, respectively. Phylogenetic and sequence analyses confirmed that PiCV1 is a new member of the genus Deltapartitivirus, family Partitiviridae. From the same plant, NGS also permitted assembly of the complete genome sequence of eggplant mottled dwarf virus (EMDV), which shared 86 % to 98 % nucleotide sequence identity with complete and partial sequences (ca 6750 nt) of other known EMDV isolates with sequences available in the GenBank database. PMID:27087112

  12. A Pulse Generator Based on an Arduino Platform for Ultrasonic Applications

    NASA Astrophysics Data System (ADS)

    Acevedo, Pedro; Vázquez, Mónica; Durán, Joel; Petrearce, Rodolfo

    The objective of this work is to use the Arduino platform as an ultrasonic pulse generator to excite PVDF ultrasonic arrays in transmission. An experimental setup was implemented using a through-transmission configuration to evaluate the performance of the generator.

  13. Exploring the Switchgrass Transcriptome Using Second-Generation Sequencing Technology

    PubMed Central

    Iyer, Niranjani J.; Bryant, Douglas W.; Mockler, Todd C.; Mahalingam, Ramamurthy

    2012-01-01

    Background: Switchgrass (Panicum virgatum L.) is a C4 perennial grass and widely popular as an important bioenergy crop. To accelerate the pace of developing high yielding switchgrass cultivars adapted to diverse environmental niches, the generation of genomic resources for this plant is necessary. The large genome size and polyploid nature of switchgrass makes whole genome sequencing a daunting task even with current technologies. Exploring the transcriptional landscape using next generation sequencing technologies provides a viable alternative to whole genome sequencing in switchgrass. Principal Findings: Switchgrass cDNA libraries from germinating seedlings, emerging tillers, flowers, and dormant seeds were sequenced using Roche 454 GS-FLX Titanium technology, generating 980,000 reads with an average read length of 367 bp. De novo assembly generated 243,600 contigs with an average length of 535 bp. Using the foxtail millet genome as a reference greatly improved the assembly and annotation of switchgrass ESTs. Comparative analysis of the 454-derived switchgrass EST reads with other sequenced monocots including Brachypodium, sorghum, rice and maize indicated a 70–80% overlap. RPKM analysis demonstrated unique transcriptional signatures of the four tissues analyzed in this study. More than 24,000 ESTs were identified in the dormant seed library. In silico analysis indicated that there are more than 2000 EST-SSRs in this collection. Expression of several orphan ESTs was confirmed by RT-PCR. Significance: We estimate that about 90% of the switchgrass gene space has been covered in this analysis. This study nearly doubles the amount of EST information for switchgrass currently in the public domain. The celerity and economical nature of second-generation sequencing technologies provide an in-depth view of the gene space of complex genomes like switchgrass. Sequence analysis of closely related members of the NAD+-malic enzyme type C4 grasses such as the model system
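
    The RPKM analysis mentioned in the record above rests on a standard normalization: reads per kilobase of transcript per million mapped reads. The following is a minimal sketch of that arithmetic only. The contig names, read counts, lengths, and library size are hypothetical values chosen for illustration and do not come from the study.

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = read_count * 1e9 / (length_bp * total_mapped_reads)
    """
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (length_bp * total_mapped_reads)


# Hypothetical example data (not taken from the switchgrass study).
counts = {"contig_A": 500, "contig_B": 50}        # reads mapped to each contig
lengths = {"contig_A": 1000, "contig_B": 500}     # contig lengths in bp
total_reads = 1_000_000                           # mapped reads in the library

values = {name: rpkm(counts[name], lengths[name], total_reads) for name in counts}
for name, value in values.items():
    print(f"{name}: RPKM = {value:.1f}")
```

    Dividing by both length and library size is what makes values comparable across contigs of different sizes and across libraries of different depths, which is the property a tissue-by-tissue comparison depends on.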

  14. Clinical Next Generation Sequencing for Precision Medicine in Cancer

    PubMed Central

    Dong, Ling; Wang, Wanheng; Li, Alvin; Kansal, Rina; Chen, Yuhan; Chen, Hong; Li, Xinmin

    2015-01-01

    Rapid adoption of next generation sequencing (NGS) in genomic medicine has been driven by low cost, high throughput sequencing and rapid advances in our understanding of the genetic bases of human diseases. Today, the NGS method has dominated sequencing space in genomic research, and quickly entered clinical practice. Because unique features of NGS perfectly meet the clinical reality (need to do more with less), the NGS technology is becoming a driving force to realize the dream of precision medicine. This article describes the strengths of NGS, NGS panels used in precision medicine, current applications of NGS in cytology, and its challenges and future directions for routine clinical use. PMID:27006629

  15. Phase-defined complete sequencing of the HLA genes by next-generation sequencing

    PubMed Central

    2013-01-01

    Background: The human leukocyte antigen (HLA) region, the 3.8-Mb segment of the human genome at 6p21, has been associated with more than 100 different diseases, mostly autoimmune diseases. Due to the complex nature of HLA genes, there are difficulties in elucidating complete HLA gene sequences, especially HLA gene haplotype structures, by the conventional sequencing method. We propose a novel, accurate, and cost-effective method for generating phase-defined complete sequencing of HLA genes by using indexed multiplex next generation sequencing. Results: A total of 33 HLA homozygous samples, 11 HLA heterozygous samples, and 3 parents-child families were subjected to phase-defined HLA gene sequencing. We applied long-range PCR to amplify six HLA genes (HLA-A, -C, -B, -DRB1, -DQB1, and -DPB1) followed by transposase-based library construction and multiplex sequencing with the MiSeq sequencer. Paired-end reads (2 × 250 bp) derived from the sequencer were aligned to the six HLA gene segments of UCSC hg19 allowing at most 80 bases mismatch. For HLA homozygous samples, the six amplicons of an individual were pooled and simultaneously sequenced and mapped as an individual-tagging method. The paired-end reads were aligned to corresponding genes of UCSC hg19 and unambiguous, continuous sequences were obtained. For HLA heterozygous samples, each amplicon was separately sequenced and mapped as a gene-tagging method. After alignments, we detected informative paired-end reads harboring SNVs on both forward and reverse reads that are used to separate two chromosomes and to generate two phase-defined sequences in an individual. Consequently, we were able to determine the phase-defined HLA gene sequences from promoter to 3′-UTR and assign up to 8-digit HLA allele numbers, regardless of whether the alleles are rare or novel. Parent–child trio-based sequencing validated our sequencing and phasing methods. Conclusions: Our protocol generated phase-defined sequences of the entire

  16. Next-generation sequencing data interpretation: enhancing reproducibility and accessibility.

    PubMed

    Nekrutenko, Anton; Taylor, James

    2012-09-01

    Areas of life sciences research that were previously distant from each other in ideology, analysis practices and toolkits, such as microbial ecology and personalized medicine, have all embraced techniques that rely on next-generation sequencing instruments. Yet the capacity to generate the data greatly outpaces our ability to analyse it. Existing sequencing technologies are more mature and accessible than the methodologies that are available for individual researchers to move, store, analyse and present data in a fashion that is transparent and reproducible. Here we discuss currently pressing issues with analysis, interpretation, reproducibility and accessibility of these data, and we present promising solutions and venture into potential future developments.

  17. Using Illumina next generation sequencing technologies to sequence multigene families in de novo species.

    PubMed

    Hughes, Graham M; Gang, Li; Murphy, William J; Higgins, Desmond G; Teeling, Emma C

    2013-05-01

    The advent of Next Generation Sequencing Technology (NGST) has revolutionized molecular biology research, allowing for rapid gene/genome sequencing from a multitude of diverse species. As high throughput sequencing becomes more accessible, more efficient workflows must be developed to deal with the amounts of data produced and better assemble the genomes of de novo lineages. We combine traditional laboratory methods with Illumina NGST to amplify and sequence the largest mammalian multigene family, the Olfactory Receptor gene family, for species with and without a reference genome. We develop novel assembly methods to annotate and filter these data, which can be utilized for any gene family or any species. We find no significant difference between the ratio of genes within their respective gene families of our data compared with available genomic data. Using simulated data we explore the limitations of short-read sequence data and our assembly in recovering this gene family. We highlight the benefits and shortcomings of these methods. Compared with data generated from traditional polymerase chain reaction, cloning and Sanger sequencing methodologies, sequence data generated using our pipeline increases yield and sequencing efficiency without reducing the number of unique genes amplified. A cloning step is not required, therefore shortening data generation time. The novel downstream methodologies and workflows described provide a tool to be utilized by many fields of biology, to access and analyze the vast quantities of data generated. By combining laboratory and in silico methods, we provide a means of extracting genomic information for multigene families without complete genome sequencing. PMID:23480365

  18. A resampling procedure for generating conditioned daily weather sequences

    USGS Publications Warehouse

    Clark, M.P.; Gangopadhyay, S.; Brandon, D.; Werner, K.; Hay, L.; Rajagopalan, B.; Yates, D.

    2004-01-01

    A method is introduced to generate conditioned daily precipitation and temperature time series at multiple stations. The method resamples data from the historical record "nens" times for the period of interest (nens = number of ensemble members) and reorders the ensemble members to reconstruct the observed spatial (intersite) and temporal correlation statistics. The weather generator model is applied to 2307 stations in the contiguous United States and is shown to reproduce the observed spatial correlation between neighboring stations, the observed correlation between variables (e.g., between precipitation and temperature), and the observed temporal correlation between subsequent days in the generated weather sequence. The weather generator model is extended to produce sequences of weather that are conditioned on climate indices (in this case the Niño 3.4 index). Example illustrations of conditioned weather sequences are provided for a station in Arizona (Petrified Forest, 34.8°N, 109.9°W), where El Niño and La Niña conditions have a strong effect on winter precipitation. The conditioned weather sequences generated using the methods described in this paper are appropriate for use as input to hydrologic models to produce multiseason forecasts of streamflow.
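
    The two steps named in this abstract, resampling the historical record nens times and then reordering the ensemble members, can be illustrated with a rank-matching reorder. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy station values, and the choice of one shared set of historical dates as the rank template are all hypothetical.

```python
import random


def ranks(values):
    """Rank of each element (0 = smallest); ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def reorder_like(members, template):
    """Reorder `members` so their rank order follows that of `template`."""
    ordered = sorted(members)
    return [ordered[r] for r in ranks(template)]


def generate_ensemble(history, nens, seed=0):
    """history maps station name -> historical values on the same dates.

    Assumes nens <= number of historical dates (hypothetical constraint
    of this sketch, needed for sampling dates without replacement).
    """
    rng = random.Random(seed)
    n_dates = len(next(iter(history.values())))
    # One shared set of historical dates supplies the intersite rank structure.
    template_dates = rng.sample(range(n_dates), nens)
    ensemble = {}
    for station, record in history.items():
        # Step 1: resample the station's record independently, nens times.
        resampled = [rng.choice(record) for _ in range(nens)]
        # Step 2: reorder so ranks follow the station's values on the shared dates.
        template = [record[d] for d in template_dates]
        ensemble[station] = reorder_like(resampled, template)
    return ensemble


if __name__ == "__main__":
    toy = {"station_a": [0.0, 2.5, 1.0, 7.2, 0.3],
           "station_b": [0.1, 3.0, 0.8, 6.5, 0.0]}
    print(generate_ensemble(toy, nens=4))
```

    Because every station is reordered against the same historical dates, members that are wet at one station tend to be wet at its neighbours, which is the kind of intersite correlation the abstract says the reordering reconstructs.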

  20. A Real-Time de novo DNA Sequencing Assembly Platform Based on an FPGA Implementation.

    PubMed

    Hu, Yuanqi; Georgiou, Pantelis

    2016-01-01

    This paper presents an FPGA based DNA comparison platform which can be run concurrently with the sensing phase of DNA sequencing and shortens the overall time needed for de novo DNA assembly. A hybrid overlap searching algorithm is applied which is scalable and can deal with incremental detection of new bases. To handle the incomplete data set which gradually increases during sequencing time, all-against-all comparisons are broken down into successive window-against-window comparison phases and executed using a novel dynamic suffix comparison algorithm combined with a partitioned dynamic programming method. The complete system has been designed to facilitate parallel processing in hardware, which allows real-time comparison and full scalability as well as a decrease in the number of computations required. A base pair comparison rate of 51.2 G/s is achieved when implemented on an FPGA with successful DNA comparison when using data sets from real genomes. PMID:27045828

  1. DNA immunoprecipitation semiconductor sequencing (DIP-SC-seq) as a rapid method to generate genome wide epigenetic signatures.

    PubMed

    Thomson, John P; Fawkes, Angie; Ottaviano, Raffaele; Hunter, Jennifer M; Shukla, Ruchi; Mjoseng, Heidi K; Clark, Richard; Coutts, Audrey; Murphy, Lee; Meehan, Richard R

    2015-05-14

    Modification of DNA resulting in 5-methylcytosine (5mC) or 5-hydroxymethylcytosine (5hmC) has been shown to influence the local chromatin environment and affect transcription. Although recent advances in next generation sequencing technology allow researchers to map epigenetic modifications across the genome, such experiments are often time-consuming and cost-prohibitive. Here we present a rapid and cost-effective method of generating genome-wide DNA modification maps utilising commercially available semiconductor-based technology (DNA immunoprecipitation semiconductor sequencing; "DIP-SC-seq") on the Ion Proton sequencer. Focussing on the 5hmC mark we demonstrate, by directly comparing with alternative sequencing strategies, that this platform can successfully generate genome-wide 5hmC patterns from as little as 500 ng of genomic DNA in less than 4 days. Such a method can therefore facilitate the rapid generation of multiple genome-wide epigenetic datasets.

  2. Transcriptome sequencing of a highly salt tolerant mangrove species Sonneratia alba using Illumina platform.

    PubMed

    Chen, Sufang; Zhou, Renchao; Huang, Yelin; Zhang, Meng; Yang, Guili; Zhong, Cairong; Shi, Suhua

    2011-06-01

    Mangroves are critical and threatened marine resources, yet few transcriptomic and genomic data are available in public databases. The transcriptome of a highly salt tolerant mangrove species, Sonneratia alba, was sequenced using the Illumina Genome Analyzer in this study. Over 15 million 75-bp paired-end reads were assembled into 30,628 unique sequences with an average length of 581 bp. Of them, 2358 SSRs were detected, with di-nucleotide repeats (59.2%) and tri-nucleotide repeats (37.7%) being the most common. Analysis of codon usage bias based on 20,945 coding sequences indicated that genes of S. alba were less biased than those of some microorganisms and Drosophila and that codon usage variation in S. alba was due primarily to compositional mutation bias, while translational selection has a relatively weak effect. Genome-wide gene ontology (GO) assignments showed that S. alba shared a similar GO slim classification with Arabidopsis thaliana. High percentages of sequences assigned to GO slim category 'mitochondrion' and four KEGG pathways, such as carbohydrates and secondary metabolites metabolism, may contribute to salt adaptation of S. alba. In addition, 1266 unique sequences matched to 273 known salt responsive genes (gene families) in other species were screened as candidates for salt tolerance of S. alba, and some of these genes showed fairly high coverage depth. Finally, we identified four genes with signals of strong diversifying selection (Ka/Ks > 1) by comparing the transcriptome sequences of S. alba with 249 known ESTs from its congener S. caseolaris. This study demonstrated a successful application of the Illumina platform to de novo assembly of the transcriptome of a non-model organism. Abundant SSR markers, salt responsive genes and four genes with signature of natural selection obtained from S. alba provide abundant sequence sources for future genetic diversity, salt adaptation and speciation studies. PMID:21620334

  3. Detection of Bacillus anthracis DNA in complex soil and air samples using next-generation sequencing.

    PubMed

    Be, Nicholas A; Thissen, James B; Gardner, Shea N; McLoughlin, Kevin S; Fofanov, Viacheslav Y; Koshinsky, Heather; Ellingson, Sally R; Brettin, Thomas S; Jackson, Paul J; Jaing, Crystal J

    2013-01-01

    Bacillus anthracis is the potentially lethal etiologic agent of anthrax disease, and is a significant concern in the realm of biodefense. One of the cornerstones of an effective biodefense strategy is the ability to detect infectious agents with a high degree of sensitivity and specificity in the context of a complex sample background. The nature of the B. anthracis genome, however, renders specific detection difficult, due to close homology with B. cereus and B. thuringiensis. We therefore elected to determine the efficacy of next-generation sequencing analysis and microarrays for detection of B. anthracis in an environmental background. We applied next-generation sequencing to titrated genome copy numbers of B. anthracis in the presence of background nucleic acid extracted from aerosol and soil samples. We found next-generation sequencing to be capable of detecting as few as 10 genomic equivalents of B. anthracis DNA per nanogram of background nucleic acid. Detection was accomplished by mapping reads to either a defined subset of reference genomes or to the full GenBank database. Moreover, sequence data obtained from B. anthracis could be reliably distinguished from sequence data mapping to either B. cereus or B. thuringiensis. We also demonstrated the efficacy of a microbial census microarray in detecting B. anthracis in the same samples, representing a cost-effective and high-throughput approach, complementary to next-generation sequencing. Our results, in combination with the capacity of sequencing for providing insights into the genomic characteristics of complex and novel organisms, suggest that these platforms should be considered important components of a biosurveillance strategy.

  4. Performance Evaluation Tools for Next Generation Scalable Computing Platforms

    NASA Technical Reports Server (NTRS)

    Yan, Jerry C.; Sarukkai, Sekhar; Craw, James (Technical Monitor)

    1995-01-01

    The Federal High Performance Computing and Communications (HPCC) Program continues to focus on R&D in a wide range of high performance computing and communications technologies. Using its accomplishments in the past four years as building blocks towards a Global Information Infrastructure (GII), an Implementation Plan that identifies six Strategic Focus Areas for R&D has been proposed. This white paper argues that a new generation of system software and programming tools must be developed to support these focus areas, so that the R&D we invest in today can lead to technology pay-off a decade from now. The Global Computing Infrastructure (GCI) in the Year 2000 and Beyond would consist of thousands of powerful computing nodes connected via high-speed networks across the globe. Users will be able to obtain computing and information services from the GCI with the ease of plugging a toaster into an electrical outlet on the wall anywhere in the country. Developing and managing the GCI requires performance prediction and monitoring capabilities that do not exist. Various accomplishments in this field today must be integrated and expanded to support this vision.

  5. Repetitive reef to ooid sequences near leeward margin of Caicos Platform, British West Indies

    SciTech Connect

    Waltz, M.; Rossinsky, V.; Wanless, H.R.

    1987-05-01

    Drill core transects and outcrops near the leeward margin of the Caicos Platform, BWI, reveal repetitive (one Holocene and two Pleistocene) shallowing-upward sequences of either (a) reefal boundstones overlain by layered oolitic grainstones or (b) burrowed oolitic grainstones overlain by layered oolitic grainstones. Each sediment sequence is separated from the other by a calcrete exposure surface. A transect, perpendicular to the trend of an exposed Pleistocene barrier reef/ooid sand complex, shows two separate sediment packages of reefal boundstones and reef-derived skeletal packstones overlain by layered oolitic grainstones. The well-exposed upper package consists of a shallowing-upward barrier reef, which is immediately overlain by burrowed and cross-bedded oolitic grainstones, beach rock blocks, and coral rubble, capped by layered oolitic grainstones. Separated by an exposure horizon, the lowermost package consists of coral and skeletal sands overlain by layered oolitic grainstones. Cores from a transect in a non-reefal setting north of the barrier reef complex reveal highly burrowed oolitic grainstones capped by layered oolitic grainstones. As a Holocene example, immediately offshore of this transect, modern reefs and bioturbated oolitic grainstones are presently being buried beneath coral rubble, beach rock blocks, and prograding oolitic beaches. Deposition of the capping layered oolitic grainstones appears to occur during stable and falling sea levels. This co-occurrence of reefal sediment and ooid sands suggests that the two are not mutually exclusive and that reef-ooid succession is a recurring part of leeward platform-margin building.

  6. Next generation sequencing (NGS): a golden tool in forensic toolkit.

    PubMed

    Aly, S M; Sabri, D M

    2015-01-01

    DNA analysis is a cornerstone of contemporary forensic science. DNA sequencing technologies are powerful tools that enriched the molecular sciences in the past through Sanger sequencing and continue to advance them through next generation sequencing (NGS). NGS has excellent potential to expand molecular applications in the forensic sciences by overcoming the pitfalls of the conventional sequencing method. The main advantage of NGS over the conventional method is that it analyzes a large number of genetic markers simultaneously and yields high-resolution genetic data. These advantages will help in solving several challenges, such as mixture analysis and dealing with minute degraded samples. With these new technologies, many markers can be examined to obtain important biological data such as age, geographical origin, tissue type, externally visible traits and monozygotic twin identification. They can also provide data on microbes, insects, plants and soil, which are of great medico-legal importance. Despite dozens of forensic research studies involving NGS, requirements remain to be met before this technology can be used routinely in forensic cases. Thus, there is a great need for more studies that address the robustness of these techniques. This work therefore highlights forensic science applications in the era of massively parallel sequencing.

  7. Pattern Recognition on Read Positioning in Next Generation Sequencing

    PubMed Central

    Byeon, Boseon; Kovalchuk, Igor

    2016-01-01

    The usefulness and the utility of the next generation sequencing (NGS) technology are based on the assumption that the DNA or cDNA cleavage required to generate short sequence reads is random. Several previous reports suggest the existence of sequencing bias of NGS reads. To address this question in greater detail, we analyze NGS data from four organisms with different GC content, Plasmodium falciparum (19.39%), Arabidopsis thaliana (36.03%), Homo sapiens (40.91%) and Streptomyces coelicolor (72.00%). Using machine learning techniques, we recognize the pattern that the NGS read start is positioned in the local region where the nucleotide distribution is dissimilar from the global nucleotide distribution. We also demonstrate that the mono-nucleotide distribution underestimates sequencing bias, and the recognized pattern is explained largely by the distribution of multi-nucleotides (di-, tri-, and tetra- nucleotides) rather than mono-nucleotides. This implies that the correction of sequencing bias needs to be performed on the basis of the multi-nucleotide distribution. Providing companion software to quantify the effect of the recognized pattern on read positioning, we exemplify that the bias correction based on the mono-nucleotide distribution may not be sufficient to clean sequencing bias. PMID:27299343
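
    The contrast drawn in this abstract between mono-nucleotide and multi-nucleotide distributions can be made concrete with a small k-mer comparison. The sketch below is illustrative only: it is not the authors' companion software, the function names are hypothetical, and total variation distance is an assumed stand-in for whatever dissimilarity measure the study used.

```python
from collections import Counter


def kmer_frequencies(seq, k):
    """Relative frequency of every overlapping k-mer in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two frequency tables (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(key, 0.0) - q.get(key, 0.0)) for key in keys)


def dissimilarity(local_seq, global_seq, k):
    """How far the local k-mer distribution is from the global one."""
    return total_variation(kmer_frequencies(local_seq, k),
                           kmer_frequencies(global_seq, k))


if __name__ == "__main__":
    global_seq = "ACGT" * 4      # toy "genome"
    local_seq = "AACCGGTT"       # toy window around a read start
    print("mono:", dissimilarity(local_seq, global_seq, 1))
    print("di:  ", dissimilarity(local_seq, global_seq, 2))
```

    In this toy case both sequences contain each base at 25%, so the mono-nucleotide distance is zero, while the di-nucleotide distance is positive. A correction based only on mono-nucleotide composition would therefore see nothing to correct here.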

  8. Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline

    PubMed Central

    2014-01-01

    Background: Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. Results: To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. Conclusions: By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples. PMID:24475911

  9. Nanomicroarray and Multiplex Next-Generation Sequencing for Simultaneous Identification and Characterization of Influenza Viruses

    PubMed Central

    Ragupathy, Viswanath; Liu, Jikun; Wang, Xue; Vemula, Sai Vikram; El Mubarak, Haja Sittana; Ye, Zhiping; Landry, Marie L.

    2015-01-01

    Conventional methods for detection and discrimination of influenza viruses are time consuming and labor intensive. We developed a diagnostic platform for simultaneous identification and characterization of influenza viruses that uses a combination of nanomicroarray for screening and multiplex next-generation sequencing (NGS) assays for laboratory confirmation. The nanomicroarray was developed to target hemagglutinin, neuraminidase, and matrix genes to identify influenza A and B viruses. PCR amplicons synthesized by using an adapted universal primer for all 8 gene segments of 9 influenza A subtypes were detected in the nanomicroarray and confirmed by the NGS assays. This platform can simultaneously detect and differentiate multiple influenza A subtypes in a single sample. Use of these methods as part of a new diagnostic algorithm for detection and confirmation of influenza infections may provide ongoing public health benefits by assisting with future epidemiologic studies and improving preparedness for potential influenza pandemics. PMID:25694248

  11. Insights into cancer biology through next-generation sequencing.

    PubMed

    Nik-Zainal, Serena

    2014-12-01

    Cancer is the ultimate disorder of the genome, characterised not by just one or two mutations, but by hundreds to thousands of acquired mutations that have been accrued through the development of a tumour. Thanks to the recent increase in the speed of sequencing offered by modern sequencing technologies, we are no longer restricted to exploring tiny fragments of protein-coding portions of the human genome. We can now read all the genetic material in human cells. Here, the framework of a next-generation sequencing experiment is explained, giving insight into the advances and difficulties posed by processing the enormous datasets generated through these methods. Some of the recent insights into tumour biology, that exploit the extraordinary surge in scale and the digital nature of next-generation sequencing, are highlighted, including cancer gene discovery, the detection of mutation signatures and cancer evolution. Technological and intellectual developments are starting to shape the personalized cancer genomic profiles of tomorrow. Let's train the next-generation of clinicians to be able to read them from today. PMID:25468925

  12. Genome wide SNP discovery in flax through next generation sequencing of reduced representation libraries

    PubMed Central

    2012-01-01

    Background: Flax (Linum usitatissimum L.) is a significant fibre and oilseed crop. Current flax molecular markers, including isozymes, RAPDs, AFLPs and SSRs are of limited use in the construction of high density linkage maps and for association mapping applications due to factors such as low reproducibility, intense labour requirements and/or limited numbers. We report here on the use of a reduced representation library strategy combined with next generation Illumina sequencing for rapid and large scale discovery of SNPs in eight flax genotypes. SNP discovery was performed through in silico analysis of the sequencing data against the whole genome shotgun sequence assembly of flax genotype CDC Bethune. Genotyping-by-sequencing of an F6-derived recombinant inbred line population provided validation of the SNPs. Results: Reduced representation libraries of eight flax genotypes were sequenced on the Illumina sequencing platform resulting in sequence coverage ranging from 4.33 to 15.64X (genome equivalents). Depending on the relatedness of the genotypes and the number and length of the reads, between 78% and 93% of the reads mapped onto the CDC Bethune whole genome shotgun sequence assembly. A total of 55,465 SNPs were discovered with the largest number of SNPs belonging to the genotypes with the highest mapping coverage percentage. Approximately 84% of the SNPs discovered were identified in a single genotype, 13% were shared between any two genotypes and the remaining 3% in three or more. Nearly a quarter of the SNPs were found in genic regions. A total of 4,706 out of 4,863 SNPs discovered in Macbeth were validated using genotyping-by-sequencing of 96 F6 individuals from a recombinant inbred line population derived from a cross between CDC Bethune and Macbeth, corresponding to a validation rate of 96.8%. Conclusions: Next generation sequencing of reduced representation libraries was successfully implemented for genome-wide SNP discovery from flax. The genotyping-by-sequencing

  13. Sequencing, De novo Assembly, Functional Annotation and Analysis of Phyllanthus amarus Leaf Transcriptome Using the Illumina Platform

    PubMed Central

    Bose Mazumdar, Aparupa; Chattopadhyay, Sharmila

    2016-01-01

    Phyllanthus amarus Schum. and Thonn., a widely distributed annual medicinal herb, has been used in traditional systems of medicine for over 2000 years. However, the lack of genomic data for P. amarus, a non-model organism, hinders research at the molecular level. In the present study, high-throughput sequencing technology was employed to improve understanding of this herb and provide comprehensive genomic information for future work. Here, the P. amarus leaf transcriptome was sequenced using the Illumina MiSeq platform. We assembled 85,927 non-redundant (nr) “unitranscript” sequences with an average length of 1548 bp, from 18,060,997 raw reads. Sequence similarity analyses and annotation of these unitranscripts were performed against databases such as the green plants nr protein database, Gene Ontology (GO), Clusters of Orthologous Groups (COG), PlnTFDB, and KEGG. As a result, 69,394 GO terms, 583 enzyme codes (EC), 134 KEGG maps, and 59 Transcription Factor (TF) families were generated. Functional and comparative analyses of assembled unitranscripts were also performed with the most closely related species, such as Populus trichocarpa and Ricinus communis, using TRAPID. KEGG analysis showed that a number of assembled unitranscripts were involved in secondary metabolite pathways, mainly the phenylpropanoid, flavonoid, terpenoid, alkaloid, and lignan biosynthetic pathways, that have significant medicinal attributes. Further, Fragments Per Kilobase of transcript per Million mapped reads (FPKM) values of the identified secondary metabolite pathway genes were determined, and Reverse Transcription PCR (RT-PCR) of a few of these genes was performed to validate the de novo assembled leaf transcriptome dataset. In addition, 65,273 simple sequence repeats (SSRs) were also identified. To the best of our knowledge, this is the first transcriptomic dataset of P. amarus to date. Our study provides the largest genetic resource that will lead to drug development and pave
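
    The FPKM measure mentioned in this abstract normalises a transcript's fragment count by both transcript length (in kilobases) and sequencing depth (in millions of mapped fragments). The sketch below shows the standard formula only; the function name and the example counts are hypothetical and are not taken from the study.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)
    The factor 1e9 combines "per kilobase" (1e3) and "per million" (1e6).
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


if __name__ == "__main__":
    # Hypothetical numbers: 100 fragments on a 1 kb transcript,
    # out of 1 million mapped fragments in the library.
    print(fpkm(100, 1000, 1_000_000))
```

    With these made-up inputs the transcript is exactly one kilobase long and the library holds exactly one million mapped fragments, so the FPKM equals the raw fragment count.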

  14. Using chaos to generate variations on movement sequences

    NASA Astrophysics Data System (ADS)

    Bradley, Elizabeth; Stuart, Joshua

    1998-12-01

    We describe a method for introducing variations into predefined motion sequences using a chaotic symbol-sequence reordering technique. A progression of symbols representing the body positions in a dance piece, martial arts form, or other motion sequence is mapped onto a chaotic trajectory, establishing a symbolic dynamics that links the movement sequence and the attractor structure. A variation on the original piece is created by generating a trajectory with slightly different initial conditions, inverting the mapping, and using special corpus-based graph-theoretic interpolation schemes to smooth any abrupt transitions. Sensitive dependence guarantees that the variation is different from the original; the attractor structure and the symbolic dynamics guarantee that the two resemble one another in both aesthetic and mathematical senses.

  15. Next Generation Sequencing to Characterize Mitochondrial Genomic DNA Heteroplasmy

    PubMed Central

    Huang, Taosheng

    2015-01-01

    This protocol describes a methodology for characterizing mitochondrial DNA (mtDNA) heteroplasmy with parallel sequencing. Mitochondria play a very important role in cellular functions. Each eukaryotic cell contains hundreds of mitochondria with hundreds of mitochondrial genomes. Mutant and wild-type mtDNA may co-exist as heteroplasmy and cause human disease. The purpose of this methodology is to simultaneously determine the mtDNA sequence and quantify the heteroplasmy level. The protocol includes two-fragment PCR amplification of the mitochondrial genome. The PCR products are then mixed at an equimolar ratio. The samples are barcoded and sequenced with high-throughput next-generation sequencing technology. We found that this technology is highly sensitive, specific, and accurate in determining mtDNA mutations and the degree of heteroplasmy. PMID:21975941
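
    Quantifying the heteroplasmy level from sequencing data comes down to the fraction of reads at a position that carry a non-reference base. The sketch below is a minimal illustration under assumptions: the function name, the per-position base-count input, and the `min_depth` threshold are hypothetical and are not part of the published protocol.

```python
def heteroplasmy_level(base_counts, reference_base, min_depth=100):
    """Fraction of reads at one mtDNA position that differ from the reference.

    base_counts: mapping of base -> number of reads supporting it.
    min_depth:   hypothetical minimum coverage below which no estimate is made.
    """
    depth = sum(base_counts.values())
    if depth < min_depth:
        raise ValueError(f"coverage {depth} is below min_depth {min_depth}")
    non_reference = depth - base_counts.get(reference_base, 0)
    return non_reference / depth


if __name__ == "__main__":
    # Hypothetical pileup: 900 reads show the reference A, 100 show G.
    counts = {"A": 900, "G": 100}
    print(f"{heteroplasmy_level(counts, 'A'):.1%}")
```

    In practice the estimate is only as good as the coverage and base-call quality behind it, which is why the sketch refuses to report a level for thinly covered positions.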

  16. The 2013 seismic sequence close to gas injection platform of the Castor project, offshore Spain

    NASA Astrophysics Data System (ADS)

    Cesca, Simone; Grigoli, Francesco; Heimann, Sebastian; Gonzalez, Alvaro; Buforn, Elisa; Maghsoudi, Samira; Blanch, Estefania; Dahm, Torsten

    2014-05-01

    A spatially localized seismic sequence originated a few tens of kilometres offshore the Mediterranean coast of Spain, starting on September 5, 2013, and lasting at least until October 2013. The sequence culminated in a maximal moment magnitude Mw 4.3 earthquake, on October 1, 2013. The epicentral region is located near the offshore platform of the Castor project, where gas is conducted through a pipeline from the mainland and where it was recently injected into a depleted oil reservoir, at about 2 km depth. We analyse the temporal evolution of the seismic sequence and use full waveform techniques to derive absolute and relative locations, estimate depths and focal mechanisms for the largest events in the sequence (with magnitude mbLg larger than 3), and compare them to a previous event (April 8, 2012, mbLg 3.3) taking place in the same region prior to the gas injection. Moment tensor inversion results show that the overall seismicity in this sequence is characterized by oblique mechanisms with a normal fault component, with a 30° low-dip angle plane oriented NNE-SSW and a sub-vertical plane oriented NW-SE. The combined analysis of hypocentral location and focal mechanisms could indicate that the seismic sequence corresponds to rupture processes along sub-horizontal shallow surfaces, which could have been triggered by the gas injection in the reservoir. An alternative scenario includes the iterated triggering of a system of steep faults oriented NW-SE, which were identified by prior marine seismic investigations. The most relevant seismogenic feature in the area is the Fosa de Amposta fault system, which includes different strands mapped at different distances to the coast, with a general NE-SW orientation, roughly parallel to the coastline. No significant known historical seismicity has involved this fault in the past. Both of our scenarios exclude its activation, as its known orientation is inconsistent with focal mechanism results.

  17. Next Generation Sequencing of Pooled Samples: Guideline for Variants’ Filtering

    PubMed Central

    Anand, Santosh; Mangano, Eleonora; Barizzone, Nadia; Bordoni, Roberta; Sorosina, Melissa; Clarelli, Ferdinando; Corrado, Lucia; Martinelli Boneschi, Filippo; D’Alfonso, Sandra; De Bellis, Gianluca

    2016-01-01

    Sequencing a large number of individuals, which is often needed for population genetics studies, is still economically challenging despite the falling costs of Next Generation Sequencing (NGS). Pool-seq is an alternative cost- and time-effective option in which DNA from several individuals is pooled for sequencing. However, pooling of DNA creates new problems and challenges for accurate variant calling and allele frequency (AF) estimation. In particular, sequencing errors are confounded with alleles present at low frequency in the pools, possibly giving rise to false positive variants. We sequenced 996 individuals in 83 pools (12 individuals/pool) in a targeted re-sequencing experiment. We show that Pool-seq AFs are robust and reliable by comparing them with public variant databases and in-house SNP-genotyping data of individual subjects of pools. Furthermore, we propose a simple filtering guideline for the removal of spurious variants based on the Kolmogorov-Smirnov statistical test. We experimentally validated our filters by comparing Pool-seq to individual sequencing data, showing that the filters remove most of the false variants while retaining the majority of true variants. The proposed guideline is fairly generic in nature and could be easily applied in other Pool-seq experiments. PMID:27670852

  18. Development of microsatellite markers for the Korean Mussel, Mytilus coruscus (Mytilidae) using next-generation sequencing.

    PubMed

    An, Hye Suck; Lee, Jang Wook

    2012-01-01

    Mytilus coruscus (family Mytilidae) is one of the most important marine shellfish species in Korea. During the past few decades, this species has become endangered due to the loss of habitats and overfishing. Despite this species' importance, information on its genetic background is scarce. In this study, we developed microsatellite markers for M. coruscus using next-generation sequencing. A total of 263,900 raw reads were obtained from a quarter-plate run on the 454 GS-FLX Titanium platform, and 176,327 unique sequences were generated with an average length of 381 bp; 2569 (1.45%) sequences contained a minimum of five di- to tetra-nucleotide repeat motifs. Of the 51 loci screened, 46 were amplified successfully, and 22 were polymorphic among 30 individuals, including seven trinucleotide repeats and three tetranucleotide repeats. All loci exhibited high genetic variability, with an average of 17.32 alleles per locus, and the mean observed and expected heterozygosities were 0.67 and 0.90, respectively. In addition, cross-amplification was tested for all 22 loci in a congener, M. galloprovincialis. None of the primer pairs resulted in effective amplification, which might be due to their high mutation rates. Our work demonstrated the utility of next-generation 454 sequencing as a method for the rapid and cost-effective identification of microsatellites. The high degree of polymorphism exhibited by the 22 newly developed microsatellites will be useful in future conservation genetic studies of this species.
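
    Screening reads for microsatellites of the kind described here (di- to tetra-nucleotide motifs repeated at least five times) can be sketched with a backreference regular expression. This is an illustrative example only: the function name and the test sequence are hypothetical, the "at least five tandem copies" reading of the abstract is an interpretation, and the study's actual screening software is not named in the abstract.

```python
import re

# A motif of 2-4 bases (shortest tried first) followed by at least
# four more copies of itself, i.e. five or more tandem repeats.
SSR_PATTERN = re.compile(r"([ACGT]{2,4}?)\1{4,}")


def find_ssrs(sequence):
    """Return (start, motif, repeat_count) for each microsatellite found."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


if __name__ == "__main__":
    read = "TTG" + "AC" * 6 + "GGT" + "ATG" * 5 + "C"
    for start, motif, repeats in find_ssrs(read):
        print(start, motif, repeats)
```

    Trying the shortest motif length first means a run such as ACACACACAC is reported as the di-nucleotide AC rather than as the tetra-nucleotide ACAC.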

  19. HIVE-Hexagon: High-Performance, Parallelized Sequence Alignment for Next-Generation Sequencing Data Analysis

    PubMed Central

    Santana-Quintero, Luis; Dingerdissen, Hayley; Thierry-Mieg, Jean; Mazumder, Raja; Simonyan, Vahan

    2014-01-01

    Due to the size of Next-Generation Sequencing data, the computational challenge of sequence alignment has been vast. Inexact alignments can take up to 90% of total CPU time in bioinformatics pipelines. High-performance Integrated Virtual Environment (HIVE), a cloud-based environment optimized for storage and analysis of extra-large data, presents an algorithmic solution: the HIVE-hexagon DNA sequence aligner. HIVE-hexagon implements novel approaches to exploit both characteristics of sequence space and CPU, RAM and Input/Output (I/O) architecture to quickly compute accurate alignments. Key components of HIVE-hexagon include non-redundification and sorting of sequences; floating diagonals of linearized dynamic programming matrices; and consideration of cross-similarity to minimize computations. Availability: https://hive.biochemistry.gwu.edu/hive/ PMID:24918764
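
    One plausible reading of "non-redundification and sorting of sequences" is that identical reads are collapsed to a single entry with a count and then sorted, so each distinct sequence is aligned once. The sketch below illustrates that idea under this assumption; it is not HIVE-hexagon's code, and `collapse_reads` is a hypothetical name.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse identical reads into (sequence, count) pairs sorted by sequence."""
    counts = Counter(reads)
    return sorted(counts.items())


if __name__ == "__main__":
    reads = ["ACGT", "TTGA", "ACGT", "ACGA"]
    for sequence, count in collapse_reads(reads):
        print(sequence, count)
```

    Sorting places similar sequences next to each other, which is what would allow an aligner to reuse work between neighbors.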

  20. Assessment of Metagenomic Assembly Using Simulated Next Generation Sequencing Data

    PubMed Central

    Sunagawa, Shinichi; Järvelin, Aino I.; Chan, Michelle M.; Arumugam, Manimozhiyan; Raes, Jeroen; Bork, Peer

    2012-01-01

    Due to the complexity of the protocols and a limited knowledge of the nature of microbial communities, simulating metagenomic sequences plays an important role in testing the performance of existing tools and data analysis methods with metagenomic data. We developed metagenomic read simulators with platform-specific (Sanger, pyrosequencing, Illumina) base-error models, and simulated metagenomes of differing community complexities. We first evaluated the effect of rigorous quality control on Illumina data. Although quality filtering removed a large proportion of the data, it greatly improved the accuracy and contig lengths of resulting assemblies. We then compared the quality-trimmed Illumina assemblies to those from Sanger and pyrosequencing. For the simple community (10 genomes) all sequencing technologies assembled a similar amount and accurately represented the expected functional composition. For the more complex community (100 genomes) Illumina produced the best assemblies and more correctly resembled the expected functional composition. For the most complex community (400 genomes) there was very little assembly of reads from any sequencing technology. However, due to the longer read length the Sanger reads still represented the overall functional composition reasonably well. We further examined the effect of scaffolding of contigs using paired-end Illumina reads. It dramatically increased contig lengths of the simple community and yielded minor improvements to the more complex communities. Although the increase in contig length was accompanied by increased chimericity, it resulted in more complete genes and a better characterization of the functional repertoire. The metagenomic simulators developed for this research are freely available. PMID:22384016

  1. Next generation sequencing in sporadic retinoblastoma patients reveals somatic mosaicism

    PubMed Central

    Amitrano, Sara; Marozza, Annabella; Somma, Serena; Imperatore, Valentina; Hadjistilianou, Theodora; De Francesco, Sonia; Toti, Paolo; Galimberti, Daniela; Meloni, Ilaria; Cetta, Francesco; Piu, Pietro; Di Marco, Chiara; Dosa, Laura; Lo Rizzo, Caterina; Carignani, Giulia; Mencarelli, Maria Antonietta; Mari, Francesca; Renieri, Alessandra; Ariani, Francesca

    2015-01-01

    In about 50% of sporadic cases of retinoblastoma, no constitutive RB1 mutations are detected by conventional methods. However, recent research suggests that, at least in some of these cases, there is somatic mosaicism with respect to RB1 normal and mutant alleles. The increased availability of next generation sequencing improves our ability to detect the exact percentage of patients with mosaicism. Using this technology, we re-tested a series of 40 patients with sporadic retinoblastoma: 10 of them had been previously classified as constitutional heterozygotes, whereas in 30 no RB1 mutations had been found in lymphocytes. In 3 of these 30 patients, we have now identified low-level mosaic variants, varying in frequency between 8 and 24%. In 7 out of the 10 cases previously classified as heterozygous from testing blood cells, we were able to test additional tissues (ocular tissues, urine and/or oral mucosa): in three of them, next generation sequencing has revealed mosaicism. Present results thus confirm that a significant fraction (6/40; 15%) of sporadic retinoblastoma cases are due to postzygotic events and that deep sequencing is an efficient method to unambiguously distinguish mosaics. Re-testing of retinoblastoma patients through next generation sequencing can thus provide new information that may have important implications with respect to genetic counseling and family care. PMID:25712084

  2. Large-scale MHC class II genotyping of a wild lemur population by next generation sequencing.

    PubMed

    Huchard, Elise; Albrecht, Christina; Schliehe-Diecks, Susanne; Baniel, Alice; Roos, Christian; Kappeler, Peter M; Brameier, Markus

    2012-12-01

    The critical role of major histocompatibility complex (MHC) genes in disease resistance, along with their putative function in sexual selection, reproduction and chemical ecology, make them an important genetic system in evolutionary ecology. Studying selective pressures acting on MHC genes in the wild nevertheless requires population-wide genotyping, which has long been challenging because of their extensive polymorphism. Here, we report on large-scale genotyping of the MHC class II loci of the grey mouse lemur (Microcebus murinus) from a wild population in western Madagascar. The second exons from MHC-DRB and -DQB of 772 and 672 individuals were sequenced, respectively, using a 454 sequencing platform, generating more than 800,000 reads. Sequence analysis, through a stepwise variant validation procedure, allowed reliable typing of more than 600 individuals. The quality of our genotyping was evaluated through three independent methods, namely genotyping the same individuals by both cloning and 454 sequencing, running duplicates, and comparing parent-offspring dyads; each displaying very high accuracy. A total of 61 (including 20 new) and 60 (including 53 new) alleles were detected at DRB and DQB genes, respectively. Both loci were non-duplicated, in tight linkage disequilibrium and in Hardy-Weinberg equilibrium, despite the fact that sequence analysis revealed clear evidence of historical selection. Our results highlight the potential of 454 sequencing technology in attempts to investigate patterns of selection shaping MHC variation in contemporary populations. The power of this approach will nevertheless be conditional upon strict quality control of the genotyping data.

  3. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae)

    PubMed Central

    Bonatelli, Isabel A. S.; Carstens, Bryan C.; Moraes, Evandro M.

    2015-01-01

    Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms. PMID:26561396

  4. Genomic characterization of a turkey reovirus field strain by Next-Generation Sequencing

    PubMed Central

    Tang, Yi; Lu, Huaguang; Sebastian, Aswathy; Yeh, Yin-Ting; Praul, Craig A.; Albert, Istvan U.; Zheng, Si-Yang

    2015-01-01

    The genome of a turkey arthritis reovirus (TARV) field strain (Reo/PA/Turkey/22342/13), isolated from a turkey flock in Pennsylvania (PA) in 2013, has been sequenced using Next-Generation Sequencing (NGS) on the Illumina MiSeq platform. The genome of the PA TARV field strain was 23,496 bp in length with 10 dsRNA segments encoding 12 viral proteins. The lengths of the genomic segments ranged from 1,192 bp (S4) to 3,959 bp (L1). The 5’ and 3’ conserved terminal sequences of the PA TARV field strain were similar to the two Minnesota (MN) TARVs (MN9 and MN10) published recently and avian orthoreovirus (ARV) reference strains. Phylogenetic analysis of the nucleotide sequences of all 10 genome segments revealed that there was a low to significant nucleotide sequence divergence between the PA TARV field strain and reference TARV and ARV strains. Analysis of the PA TARV sequence indicates that this PA TARV field strain is a unique strain and is different from the TARV MN9 or MN10 in M2 segment genes and ARV S1133 vaccine strain. PMID:25841748

  5. Evaluation and application of the strand-specific protocol for next-generation sequencing.

    PubMed

    Tsai, Kuo-Wang; Chang, Bill; Pan, Cheng-Tsung; Lin, Wei-Chen; Chen, Ting-Wen; Li, Sung-Chou

    2015-01-01

    Next-generation sequencing (NGS) has become a powerful sequencing tool, applied in a wide range of biological studies. However, the traditional sample preparation protocol for NGS is non-strand-specific (NSS), leading to biased estimates of expression for transcripts overlapped at the antisense strand. Strand-specific (SS) protocols have recently been developed. In this study, we prepared the same RNA sample by using the SS and NSS protocols, followed by sequencing with Illumina HiSeq platform. Using real-time quantitative PCR as a standard, we first proved that the SS protocol more precisely estimates gene expressions compared with the NSS protocol, particularly for those overlapped at the antisense strand. In addition, we also showed that the sequence reads from the SS protocol are comparable with those from conventional NSS protocols in many aspects. Finally, we also mapped a fraction of sequence reads back to the antisense strand of the known genes, originally without annotated genes located. Using sequence assembly and PCR validation, we succeeded in identifying and characterizing the novel antisense genes. Our results show that the SS protocol performs more accurately than the traditional NSS protocol and can be applied in future studies.

  6. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

    PubMed

    Bonatelli, Isabel A S; Carstens, Bryan C; Moraes, Evandro M

    2015-01-01

    Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms.

  7. Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

    PubMed

    Bonatelli, Isabel A S; Carstens, Bryan C; Moraes, Evandro M

    2015-01-01

    Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms. PMID:26561396

  8. Next-generation sequencing in schizophrenia and other neuropsychiatric disorders.

    PubMed

    Schreiber, Matthew; Dorschner, Michael; Tsuang, Debby

    2013-10-01

    Schizophrenia is a debilitating lifelong illness that lacks a cure and poses a worldwide public health burden. The disease is characterized by a heterogeneous clinical and genetic presentation that complicates research efforts to identify causative genetic variations. This review examines the potential of current findings in schizophrenia and in other related neuropsychiatric disorders for application in next-generation technologies, particularly whole-exome sequencing (WES) and whole-genome sequencing (WGS). These approaches may lead to the discovery of underlying genetic factors for schizophrenia and may thereby identify and target novel therapeutic targets for this devastating disorder. PMID:24132899

  9. Use of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells

    PubMed Central

    Xin, Yurong; Kim, Jinrang; Ni, Min; Wei, Yi; Okamoto, Haruka; Lee, Joseph; Adler, Christina; Cavino, Katie; Murphy, Andrew J.; Yancopoulos, George D.; Lin, Hsin Chieh; Gromada, Jesper

    2016-01-01

    This study provides an assessment of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells. The system combines microfluidic technology and nanoliter-scale reactions. We sequenced 622 cells, allowing identification of 341 islet cells with high-quality gene expression profiles. The cells clustered into populations of α-cells (5%), β-cells (92%), δ-cells (1%), and pancreatic polypeptide cells (2%). We identified cell-type–specific transcription factors and pathways primarily involved in nutrient sensing and oxidation and cell signaling. Unexpectedly, 281 cells had to be removed from the analysis due to low viability, low sequencing quality, or contamination resulting in the detection of more than one islet hormone. Collectively, we provide a resource for identification of high-quality gene expression datasets to help expand insights into genes and pathways characterizing islet cell types. We reveal limitations in the C1 Fluidigm cell capture process resulting in contaminated cells with altered gene expression patterns. This calls for caution when interpreting single-cell transcriptomics data using the C1 Fluidigm system. PMID:26951663

  10. Rapid evaluation and quality control of next generation sequencing data with FaQCs

    SciTech Connect

    Lo, Chien-Chi; Chain, Patrick S. G.

    2014-12-01

    Background: Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Results: Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. Conclusion: FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.
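
    Because read quality often declines toward the end of a read, a common first step in quality control is trimming low-quality bases from the 3' end. The sketch below decodes Phred+33 quality characters and trims trailing bases below a threshold. It is a generic illustration, not FaQCs's algorithm; the function names and the default threshold of 20 are assumptions.

```python
def phred33_scores(quality_string: str):
    """Decode a FASTQ quality string in Phred+33 encoding into integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def trim_3prime(seq: str, quality_string: str, min_q: int = 20):
    """Drop bases from the 3' end until the last base has quality >= min_q."""
    scores = phred33_scores(quality_string)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], quality_string[:end]


if __name__ == "__main__":
    # 'I' encodes Q40 and '#' encodes Q2 in Phred+33.
    print(trim_3prime("ACGTAC", "IIII##"))
```

    Real tools typically add sliding windows, minimum-length filters and adapter or control-sequence removal on top of a step like this.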

  11. Rapid evaluation and quality control of next generation sequencing data with FaQCs

    DOE PAGESBeta

    Lo, Chien-Chi; Chain, Patrick S. G.

    2014-12-01

    Background: Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Results: Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. Conclusion: FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.

  12. Bioinformatics: identification of markers from next-generation sequence data.

    PubMed

    Ruperao, Pradeep; Edwards, David

    2015-01-01

    With the advent of sequencing technology, next-generation sequencing (NGS) technology has dramatically revolutionized plant genomics. NGS technology combined with new software tools enables the discovery, validation, and assessment of genetic markers on a large scale. Among different marker systems, simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) are the markers of choice for genetics and plant breeding. SSR markers have been a choice for large-scale characterization of germplasm collections, construction of genetic maps, and QTL identification. Similarly, SNPs are the most abundant genetic variations, occurring at high frequency throughout the genomes of plant species. This chapter discusses various tools available for genome assembly and focuses broadly on SSR and SNP marker discovery.

  13. Next-Generation Sequencing RNA-Seq Library Construction.

    PubMed

    Podnar, Jessica; Deiderick, Heather; Huerta, Gabriella; Hunicke-Smith, Scott

    2014-01-01

    This unit presents protocols for construction of next-generation sequencing (NGS) directional RNA sequencing libraries for the Illumina HiSeq and MiSeq from a wide variety of input RNA sources. The protocols are based on the New England Biolabs (NEB) small RNA library preparation set for Illumina, although similar kits exist from different vendors. The protocol preserves the orientation of the original RNA in the final sequencing library, enabling strand-specific analysis of the resulting data. These libraries have been used for differential gene expression analysis and small RNA discovery and are currently being tested for de novo transcriptome assembly. The protocol is robust and applicable to a broad range of RNA input types and RNA quality, making it ideal for high-throughput laboratories.

  14. Nanopore-based Fourth-generation DNA Sequencing Technology

    PubMed Central

    Feng, Yanxiao; Zhang, Yuechuan; Ying, Cuifeng; Wang, Deqiang; Du, Chunlei

    2015-01-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications. PMID:25743089

  15. Evaluation of GS Junior and MiSeq next-generation sequencing technologies as an alternative to Trugene population sequencing in the clinical HIV laboratory.

    PubMed

    Ram, Daniela; Leshkowitz, Dena; Gonzalez, Dimitri; Forer, Relly; Levy, Itzchak; Chowers, Michal; Lorber, Margalit; Hindiyeh, Musa; Mendelson, Ella; Mor, Orna

    2015-02-01

    Population HIV-1 sequencing is currently the method of choice for the identification and follow-up of HIV-1 antiretroviral drug resistance. It has limited sensitivity and results in a consensus sequence showing the most prevalent nucleotide per position. Moreover, concomitant sequencing and interpretation of the results for several samples together is laborious and time consuming. In this study, the practical use of GS Junior and MiSeq bench-top next generation sequencing (NGS) platforms as an alternative to Trugene Sanger-based population sequencing in the clinical HIV laboratory was assessed. DeepChek®-HIV TherapyEdge software was used for processing all the protease and reverse transcriptase sequences and for resistance interpretation. Plasma samples from nine HIV-1 carriers, representing the major HIV-1 subtypes in Israel, were compared. The total number of amino acid substitutions identified in the nine samples by GS Junior (232 substitutions) and MiSeq (243 substitutions) was similar and higher than Trugene (181 substitutions), emphasizing the advantage of deep sequencing over population sequencing. More than 80% of the identified substitutions were identical between the GS Junior and MiSeq platforms, most of them (184 of 199) at similar frequencies. Low abundance substitutions accounted for 20.9% of the MiSeq and 21.9% of the GS Junior output, the majority of which were not detected by Trugene. More drug resistance mutations were identified by both the NGS platforms, primarily, but not only, at low abundance. In conclusion, in combination with DeepChek, both GS Junior and MiSeq were found to be more sensitive than Trugene and adequate for HIV-1 resistance analysis in the clinical HIV laboratory.
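
    The contrast between a consensus sequence and deep sequencing can be sketched as follows: a consensus keeps only the most prevalent residue at each position, whereas per-position frequencies from many reads also expose minority variants. This is a simplified illustration on pre-aligned, equal-length reads; the function names and the 1% to 20% "low abundance" window are assumptions, since the abstract does not state the thresholds used.

```python
from collections import Counter


def column_frequencies(aligned_reads):
    """Per-position residue frequencies for equal-length, pre-aligned reads."""
    length = len(aligned_reads[0])
    freqs = []
    for i in range(length):
        counts = Counter(read[i] for read in aligned_reads)
        total = sum(counts.values())
        freqs.append({residue: n / total for residue, n in counts.items()})
    return freqs


def consensus(freqs):
    """Most prevalent residue per position, as population sequencing reports."""
    return "".join(max(column, key=column.get) for column in freqs)


def minority_variants(freqs, min_freq=0.01, max_freq=0.20):
    """Residues whose frequency falls in the assumed low-abundance window."""
    found = []
    for position, column in enumerate(freqs):
        for residue, frequency in column.items():
            if min_freq <= frequency < max_freq:
                found.append((position, residue, frequency))
    return found


if __name__ == "__main__":
    reads = ["ACGT"] * 9 + ["ACTT"]
    freqs = column_frequencies(reads)
    print(consensus(freqs), minority_variants(freqs))
```

    In this toy input the consensus hides the T at the third position, while the frequency table reports it at 10%.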

  16. Evaluation of GS Junior and MiSeq next-generation sequencing technologies as an alternative to Trugene population sequencing in the clinical HIV laboratory.

    PubMed

    Ram, Daniela; Leshkowitz, Dena; Gonzalez, Dimitri; Forer, Relly; Levy, Itzchak; Chowers, Michal; Lorber, Margalit; Hindiyeh, Musa; Mendelson, Ella; Mor, Orna

    2015-02-01

    Population HIV-1 sequencing is currently the method of choice for the identification and follow-up of HIV-1 antiretroviral drug resistance. It has limited sensitivity and results in a consensus sequence showing the most prevalent nucleotide per position. Moreover, concomitant sequencing and interpretation of the results for several samples together is laborious and time consuming. In this study, the practical use of GS Junior and MiSeq bench-top next generation sequencing (NGS) platforms as an alternative to Trugene Sanger-based population sequencing in the clinical HIV laboratory was assessed. DeepChek®-HIV TherapyEdge software was used for processing all the protease and reverse transcriptase sequences and for resistance interpretation. Plasma samples from nine HIV-1 carriers, representing the major HIV-1 subtypes in Israel, were compared. The total number of amino acid substitutions identified in the nine samples by GS Junior (232 substitutions) and MiSeq (243 substitutions) was similar and higher than Trugene (181 substitutions), emphasizing the advantage of deep sequencing over population sequencing. More than 80% of the identified substitutions were identical between the GS Junior and MiSeq platforms, most of them (184 of 199) at similar frequencies. Low abundance substitutions accounted for 20.9% of the MiSeq and 21.9% of the GS Junior output, the majority of which were not detected by Trugene. More drug resistance mutations were identified by both the NGS platforms, primarily, but not only, at low abundance. In conclusion, in combination with DeepChek, both GS Junior and MiSeq were found to be more sensitive than Trugene and adequate for HIV-1 resistance analysis in the clinical HIV laboratory. PMID:25445792

  17. Designing a transcriptome next-generation sequencing project for a nonmodel plant species.

    PubMed

    Strickler, Susan R; Bombarely, Aureliano; Mueller, Lukas A

    2012-02-01

    The application of next-generation sequencing (NGS) to transcriptomics, commonly called RNA-seq, allows the nearly complete characterization of transcriptomic events occurring in a specific tissue. It has proven particularly useful in nonmodel species, which often lack the resources available for sequenced organisms. Mainly, RNA-seq does not require a reference genome to gain useful transcriptomic information. In this review, the application of RNA-seq to nonmodel plant species will be addressed. Important experimental considerations from presequencing issues to postsequencing analysis, including sample and platform selection, and useful bioinformatics tools for assembly and data analysis, are covered. Methods of assembling RNA-seq data and analyses commonly performed with RNA-seq data, including single nucleotide polymorphism detection and analysis of differential expression, are explored. In addition, studies that have used RNA-seq to elucidate nonmodel plant transcriptomics are highlighted.

  18. Next generation deep sequencing and vaccine design: today and tomorrow.

    PubMed

    Luciani, Fabio; Bull, Rowena A; Lloyd, Andrew R

    2012-09-01

    Next generation sequencing (NGS) technologies have redefined the modus operandi in both human and microbial genetics research, allowing the unprecedented generation of very large sequencing datasets on a short time scale and at affordable costs. Vaccine development research is rapidly taking full advantage of the advent of NGS. This review provides a concise summary of the current applications of NGS in relation to research seeking to develop vaccines for human infectious diseases, incorporating studies of both the pathogen and the host. We focus on rapidly mutating viral pathogens, which are major targets in current vaccine research. NGS is unraveling the complex dynamics of viral evolution and host responses against these viruses, thus contributing substantially to the likelihood of successful vaccine development.

  19. Mapping Sensorimotor Sequences to Word Sequences: A Connectionist Model of Language Acquisition and Sentence Generation

    ERIC Educational Resources Information Center

    Takac, Martin; Benuskova, Lubica; Knott, Alistair

    2012-01-01

    In this article we present a neural network model of sentence generation. The network has both technical and conceptual innovations. Its main technical novelty is in its semantic representations: the messages which form the input to the network are structured as sequences, so that message elements are delivered to the network one at a time. Rather…

  20. Microfluidic platform for isolating nucleic acid targets using sequence specific hybridization

    PubMed Central

    Wang, Jingjing; Morabito, Kenneth; Tang, Jay X.; Tripathi, Anubhav

    2013-01-01

    The separation of target nucleic acid sequences from biological samples has emerged as a significant process in today's diagnostics and detection strategies. In addition to the possible clinical applications, the fundamental understanding of target and sequence specific hybridization on surface modified magnetic beads is of high value. In this paper, we describe a novel microfluidic platform that utilizes a mobile magnetic field in static microfluidic channels, where single stranded DNA (ssDNA) molecules are isolated via nucleic acid hybridization. We first established efficient isolation of biotinylated capture probe (BP) using streptavidin-coated magnetic beads. Subsequently, we investigated the hybridization of target ssDNA with BP bound to beads and explained these hybridization kinetics using a dual-species kinetic model. The number of hybridized target ssDNA molecules was determined to be about 6.5 times less than that of BP on the bead surface, due to steric hindrance effects. The hybridization of target ssDNA with non-complementary BP bound to beads was also examined, and non-specific hybridization was found to be insignificant. Finally, we demonstrated highly efficient capture and isolation of target ssDNA in the presence of non-target ssDNA, where as little as 1% target ssDNA can be detected in a mixture. The microfluidic method described in this paper is broadly applicable, especially to point-of-care biological diagnostic platforms that require binding and separation of known target biomolecules, such as RNA, ssDNA, or protein. PMID:24404041

  1. All-optical pseudorandom bit sequences generator based on TOADs

    NASA Astrophysics Data System (ADS)

    Sun, Zhenchao; Wang, Zhi; Wu, Chongqing; Wang, Fu; Li, Qiang

    2016-03-01

    A scheme for an all-optical pseudorandom bit sequence (PRBS) generator is demonstrated using an optical 'XNOR' logic gate and an all-optical wavelength converter based on cascaded Terahertz Optical Asymmetric Demultiplexers (TOADs). Its feasibility is verified by generating a return-to-zero on-off keying (RZ-OOK) 2^63-1 PRBS at a speed of 1 Gb/s with a 10% duty ratio. The high randomness of the ultra-long-cycle PRBS is validated by successfully passing the standard benchmark test.
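
    As a software analogy only (the record does not describe the optical circuit in enough detail to reproduce it), a PRBS of length 2^n-1 can be sketched as a linear-feedback shift register whose feedback bit is the XNOR of two register stages. The function name `prbs_xnor`, the tap positions (7, 6) and (63, 62), and the all-zeros seed are assumptions chosen for illustration, not values taken from the paper.

```python
def prbs_xnor(n, taps, count, state=0):
    """Return `count` output bits of an n-bit shift register with XNOR feedback.

    `taps` are 1-based stage positions (stage n is the output stage).
    Hypothetical helper for illustration; the tap choice is an assumption.
    With XNOR feedback the all-ones state is the lock-up state, so an
    all-zeros seed is a valid starting point.
    """
    mask = (1 << n) - 1
    bits = []
    for _ in range(count):
        bits.append((state >> (n - 1)) & 1)   # emit the last stage
        fb = 1                                # complement of the XOR of the taps
        for t in taps:
            fb ^= (state >> (t - 1)) & 1
        state = ((state << 1) | fb) & mask    # shift the feedback bit in
    return bits


# Short register so the full period can be inspected: 2^7 - 1 = 127 bits.
period = 2**7 - 1
seq = prbs_xnor(7, (7, 6), 2 * period)

# The same function with n = 63 steps through a 63-bit register; its full
# period is far too long to enumerate, so only a few bits are drawn here.
first_bits_63 = prbs_xnor(63, (63, 62), 16)
```

    The 7-bit case repeats after 127 bits, which makes the periodicity easy to check by eye; the 63-bit call only shows that the same update rule scales to the register length named in the abstract.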

  2. Generating Researcher Networks with Identified Persons on a Semantic Service Platform

    NASA Astrophysics Data System (ADS)

    Jung, Hanmin; Lee, Mikyoung; Kim, Pyung; Lee, Seungwoo

    This paper describes a Semantic Web-based method to acquire researcher networks by means of an identification scheme, an ontology, and reasoning. Three steps are required to realize it: resolving co-references, finding experts, and generating researcher networks. We adopt OntoFrame as the underlying semantic service platform and apply reasoning to make direct relations between far-off classes in the ontology schema. 453,124 Elsevier journal articles with metadata and full-text documents in the information technology and biomedical domains have been loaded and served on the platform as a test set.

  3. Quantifying Population Genetic Differentiation from Next-Generation Sequencing Data

    PubMed Central

    Fumagalli, Matteo; Vieira, Filipe G.; Korneliussen, Thorfinn Sand; Linderoth, Tyler; Huerta-Sánchez, Emilia; Albrechtsen, Anders; Nielsen, Rasmus

    2013-01-01

    Over the past few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data. In particular, the use of naïve methods to identify polymorphic sites and infer genotypes can inflate downstream analyses. Recently, explicit modeling of genotype probability distributions has been proposed as a method for taking genotype call uncertainty into account. Based on this idea, we propose a novel method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy for investigating population structure via principal components analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled individuals, suggesting that employing this new method is useful for investigating the genetic relationships of populations sampled at low coverage. PMID:23979584
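
    The idea of carrying genotype uncertainty forward, rather than committing to a hard genotype call, can be illustrated with a minimal sketch. This is not the estimator proposed in the paper: it assumes a biallelic site, a known alternate-allele frequency, and a Hardy-Weinberg prior, and the function names `genotype_posteriors` and `expected_dosage` are hypothetical.

```python
def genotype_posteriors(likelihoods, alt_freq):
    """Posterior P(G = g | reads) for g = 0, 1, 2 copies of the alternate allele.

    `likelihoods` holds P(reads | G = g); the prior assumes Hardy-Weinberg
    proportions at allele frequency `alt_freq` (an illustrative assumption).
    """
    q = alt_freq
    prior = ((1 - q) ** 2, 2 * q * (1 - q), q ** 2)
    joint = [lik * p for lik, p in zip(likelihoods, prior)]
    total = sum(joint)
    return [j / total for j in joint]


def expected_dosage(posteriors):
    """Expected number of alternate alleles under the posterior."""
    return posteriors[1] + 2 * posteriors[2]


# One error-free read carrying the reference allele: compatible with
# genotype 0 (likelihood 1.0) and genotype 1 (0.5), not with genotype 2.
post = genotype_posteriors((1.0, 0.5, 0.0), alt_freq=0.2)
dosage = expected_dosage(post)
```

    With a single read, a hard call would report genotype 0, whereas the posterior still leaves a probability of about 0.2 for the heterozygote; downstream statistics computed from `post` or `dosage` keep that uncertainty.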

  4. Next-generation sequencing for understanding and accelerating crop domestication.

    PubMed

    Henry, Robert J

    2012-01-01

    Next-generation sequencing (NGS) provides a powerful tool for discovery of domestication genes in crop plants and their wild relatives. The accelerated domestication of new plant species as crops may be facilitated by this knowledge. Re-sequencing of domesticated genotypes can identify regions of low diversity associated with domestication. Species-specific data can be obtained from related wild species by whole-genome shotgun sequencing. These sequence data can be used to design species-specific polymerase chain reaction (PCR) primers. Sequencing of the products of PCR amplification of target genes can be used to explore genetic variation in large numbers of genes and gene families. Novel allelic variation in close or distant relatives can be characterized by NGS. Examples of recent applications of NGS to the capture of genetic diversity for crop improvement include rice, sugarcane and eucalypts. Populations of large numbers of individuals can be screened rapidly. NGS supports the rapid domestication of new plant species and the efficient identification and capture of novel genetic variation from related species.

  5. Next generation sequencing in synovial sarcoma reveals novel gene mutations

    PubMed Central

    Vlenterie, Myrella; Hillebrandt-Roeffen, Melissa H.S.; Flucke, Uta E.; Groenen, Patricia J.T.A.; Tops, Bastiaan B.J.; Kamping, Eveline J.; Pfundt, Rolph; de Bruijn, Diederik R.H.; van Kessel, Ad H.M. Geurts; van Krieken, Han J.H.J.M.; van der Graaf, Winette T.A.; Versleijen-Jonkers, Yvonne M.H.

    2015-01-01

    Over 95% of all synovial sarcomas (SS) share a unique translocation, t(X;18); however, they show heterogeneous clinical behavior. We analyzed multiple SS to reveal additional genetic alterations besides the translocation. Twenty-six SS from 22 patients were sequenced for 409 cancer-related genes using the Comprehensive Cancer Panel (Life Technologies, USA) on an Ion Torrent platform. The detected variants were verified by Sanger sequencing and compared to matched normal DNAs. Copy number variation was assessed in six tumors using the Oncoscan array (Affymetrix, USA). In total, eight somatic mutations were detected in eight samples. These mutations have not been reported previously in SS. Two of these, in KRAS and CCND1, represent known oncogenic mutations in other malignancies. Additional mutations were detected in RNF213, SEPT9, KDR, CSMD3, MLH1 and ERBB4. DNA alterations occurred more often in adult tumors. A distinctive loss of 6q was found in a metastatic lesion progressing under pazopanib, but not in the responding lesion. Our results emphasize t(X;18) as a single initiating event in SS and as the main oncogenic driver. Our results also show the occurrence of additional genetic events, mutations or chromosomal aberrations, occurring more frequently in SS with an onset in adults. PMID:26415226

  6. Efficient and sensitive identification and quantification of airborne pollen using next-generation DNA sequencing.

    PubMed

    Kraaijeveld, Ken; de Weger, Letty A; Ventayol García, Marina; Buermans, Henk; Frank, Jeroen; Hiemstra, Pieter S; den Dunnen, Johan T

    2015-01-01

    Pollen monitoring is an important and widely used tool in allergy research and creation of awareness in pollen-allergic patients. Current pollen monitoring methods are microscope-based, labour intensive and cannot identify pollen to the genus level in some relevant allergenic plant groups. Therefore, a more efficient, cost-effective and sensitive method is needed. Here, we present a method for identification and quantification of airborne pollen using DNA sequencing. Pollen is collected from ambient air using standard techniques. DNA is extracted from the collected pollen, and a fragment of the chloroplast gene trnL is amplified using PCR. The PCR product is subsequently sequenced on a next-generation sequencing platform (Ion Torrent). Amplicon molecules are sequenced individually, allowing identification of different sequences from a mixed sample. We show that this method provides an accurate qualitative and quantitative view of the species composition of samples of airborne pollen grains. We also show that it correctly identifies the individual grass genera present in a mixed sample of grass pollen, which cannot be achieved using microscopic pollen identification. We conclude that our method is more efficient and sensitive than current pollen monitoring techniques and therefore has the potential to increase the throughput of pollen monitoring.

  7. Dynamic linear model for the identification of miRNAs in next-generation sequencing data

    PubMed Central

    Johnson, W. Evan; Welker, Noah C.; Bass, Brenda L.

    2011-01-01

    Summary: Next-generation sequencing technologies are poised to revolutionize the field of biomedical research. The increased resolution of these data promises to provide a greater understanding of the molecular processes that control the morphology and behavior of a cell. However, the increased amounts of data require innovative statistical procedures that are powerful while still being computationally feasible. In this research, we present a method for identifying small RNA molecules, called miRNAs, which regulate genes by targeting their mRNAs for degradation or translational repression. In the first step of our modeling procedure, we apply an innovative dynamic linear model that identifies candidate miRNA genes in high-throughput sequencing data. The model is flexible and can accurately identify interesting biological features while accounting for read count, read spacing, and sequencing depth. Additionally, miRNA candidates are processed using a modified Smith-Waterman sequence alignment that scores the regions for potential RNA hairpins, one of the defining features of miRNAs. We illustrate our method on simulated data sets as well as on a small RNA Caenorhabditis elegans data set from the Illumina sequencing platform. These examples show that our method is highly sensitive for identifying known and novel miRNA genes. PMID:21385162
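
    For readers unfamiliar with Smith-Waterman alignment, the standard local-alignment score can be sketched as below. This is the textbook recurrence with a linear gap penalty, not the modified hairpin-scoring variant used in the paper, whose details the abstract does not give; the function name `smith_waterman_score` and the scoring values (match 2, mismatch -1, gap -2) are assumptions for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local-alignment score between sequences a and b (linear gap cost)."""
    prev = [0] * (len(b) + 1)          # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the
            # alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


identical = smith_waterman_score("ACGT", "ACGT")
unrelated = smith_waterman_score("AAAA", "TTTT")
shared_core = smith_waterman_score("TTACGTTT", "GGACGGG")
```

    Only two rows of the matrix are kept because the sketch returns a score rather than the alignment itself; recovering the aligned region would require storing the full matrix and tracing back from the best cell.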

  8. Next generation sequencing improves detection of drug resistance mutations in infants after PMTCT failure

    PubMed Central

    FISHER, Randall G.; SMITH, Davey M.; MURRELL, Ben; SLABBERT, Ruhan; KIRBY, Bronwyn M.; EDSON, Clair; COTTON, Mark F.; HAUBRICH, Richard H.; KOSAKOVSKY POND, Sergei L.; VAN ZYL, Gert U.

    2014-01-01

    Background: Next generation sequencing (NGS) allows the detection of minor variant HIV drug resistance mutations (DRMs). However, data from new NGS platforms after Prevention-of-Mother-to-Child-Transmission (PMTCT) regimen failure are limited. Objective: To compare major and minor variant HIV DRMs with Illumina MiSeq and Life Technologies Ion Personal Genome Machine (PGM) in infants infected despite a PMTCT regimen. Study Design: We conducted a cross-sectional study of NGS for detecting DRMs in infants infected despite a zidovudine (AZT) and nevirapine (NVP) regimen, before initiation of combination antiretroviral therapy. Sequencing was performed on PCR products from plasma samples on PGM and MiSeq platforms. Bioinformatic analyses were undertaken using a codon-aware version of the Smith-Waterman mapping algorithm and a mixture multinomial error filtering statistical model. Results: Of 15 infants, tested at a median age of 3.4 months after birth, 2 (13%) had non-nucleoside reverse transcriptase inhibitor (NNRTI) DRMs (K103N and Y181C) by bulk sequencing, whereas PGM detected 4 (26%) and MiSeq 5 (30%). NGS enabled the detection of additional minor variant DRMs in the infant with K103N. Coverage and instrument quality scores were higher with MiSeq, increasing the confidence of minor variant calls. Conclusions: NGS followed by bioinformatic analyses detected multiple minor variant DRMs in HIV-1 RT among infants where PMTCT failed. The high coverage of MiSeq and high read quality improved the confidence of identified DRMs and may make this platform ideal for minor variant detection. PMID:25542470

  9. A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance

    PubMed Central

    Ahrenfeldt, Johanne; Cisneros, Jose Luis Bellod; Jurtz, Vanessa; Larsen, Mette Voldby; Hasman, Henrik; Aarestrup, Frank Møller; Lund, Ole

    2016-01-01

    Recent advances in whole genome sequencing have made the technology available for routine use in microbiological laboratories. However, a major obstacle for using this technology is the availability of simple and automatic bioinformatics tools. Based on previously published and already available web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the individual services. The reported results enable a rapid overview of the major results, and comparing that to the previously found results showed that the platform is reliable and able to correctly predict the species and find most of the expected genes automatically. In conclusion, a combined bioinformatics platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https://cge.cbs.dtu.dk/services/CGEpipeline-1.1 and it is the intention that it will continue to be expanded with new features as these become available. PMID:27327771

  10. Preliminary Sequence stratigraphy framework of the SW part of the Actopan Platform, Lower Cretaceous, Hidalgo, Mexico

    NASA Astrophysics Data System (ADS)

    Abascal, G.; Murillo-Muñeton, G.

    2013-05-01

    The oldest sedimentary rocks in what is known as the Actopan Platform, in the State of Hidalgo, Mexico, are superbly exposed toward the southwestern part of the platform. A detailed stratigraphic/sedimentologic study was carried out on a 623 m-thick section; this study was focused on establishing a sequence stratigraphic framework. The base of the section consists of a Lower Cretaceous 6223-m thick, mixed siliciclastic-carbonate sedimentary succession that has been named the Santuario Formation. The terrigenous facies of this unit correspond to red beds that consist of shales, sandstones and a few conglomerates deposited under continental (fluvial) conditions. White and yellowish sandstones, possibly deposited by deltaic systems, occur in minor amounts. A tuff layer is found in its lower part. The carbonate facies of the Santuario Formation consist mainly of bioclastic-peloidal skeletal mudstones/wackestones and subordinate quantities of sandy dolostones, skeletal packstones/grainstones and rudist (requeniid) boundstones. The middle and upper parts of the studied stratigraphic section correspond to an essentially carbonate succession that is known as the El Abra Formation. This unit is comprised of the following facies: skeletal mudstones/wackestones, skeletal packstones/grainstones, and minor rudist (requeniid and Chondrodonta) boundstones and cryptalgal laminites deposited in shallow subtidal lagoon to tidal flat conditions. At this location, a "Middle" Cretaceous age (Albian-Cenomanian) has been assigned to the El Abra Formation. However, the common presence of the benthic foraminifer Choffatella decipiens Schlumberger in these facies indicates that their age extends, at least, to the Lower Cretaceous (Barremian). This age was confirmed by dating zircons in the tuff deposited at the base of the section. The carbonate facies of the Santuario Formation stack forming fifth-order subtidal cycles or parasequences. While the carbonate facies of the El Abra Formation also stack

  11. Humans cannot consciously generate random numbers sequences: Polemic study.

    PubMed

    Figurska, Małgorzata; Stańczyk, Maciej; Kulesza, Kamil

    2008-01-01

    It is widely believed that randomness exists in Nature. In fact, such an assumption underlies many scientific theories and is embedded in the foundations of quantum mechanics. Assuming that this hypothesis is valid, one can use natural phenomena, like radioactive decay, to generate random numbers. Today, computers are capable of generating so-called pseudorandom numbers. Such series of numbers are only seemingly random (bias in the randomness quality can be observed). The question of whether people can produce random numbers has been investigated by many scientists in recent years. The paper "Humans can consciously generate random numbers sequences..." published recently in Medical Hypotheses made claims that were in many ways contrary to the state of the art; it also stated far-reaching hypotheses. So, we decided to repeat the experiments reported, with special care being taken over proper laboratory procedures. Here, we present the results and discuss possible implications in computer and other sciences. PMID:17888582

  12. Analysis of Metagenomics Next Generation Sequence Data for Fungal ITS Barcoding: Do You Need Advance Bioinformatics Experience?

    PubMed Central

    Ahmed, Abdalla

    2016-01-01

    During the last few decades, most microbiology laboratories have become familiar with analyzing Sanger sequence data for ITS barcoding. However, with the availability of next-generation sequencing platforms in many centers, it has become important for medical mycologists to know how to make sense of the massive sequence data generated by these new sequencing technologies. In many reference laboratories, the analysis of such data is not a big deal, since suitable IT infrastructure and well-trained bioinformatics scientists are always available. However, in small research laboratories and clinical microbiology laboratories such resources are always lacking. In this report, a simple and user-friendly bioinformatics workflow is suggested for fast and reproducible ITS barcoding of fungi. PMID:27507959

  13. Seismic stratigraphy of the western Florida carbonate platform above the Mid-Cretaceous sequence boundary (MCSB)

    SciTech Connect

    Jee, J.L. (Dept. of Geology)

    1993-03-01

    From the Apalachicola Basin (AB) to the Sarasota Arch, a web of multifold seismic and 29 wells were analyzed to determine Upper Cretaceous-Cenozoic stratigraphy. Concordant reflection geometries above and below the MCSB throughout most of the study area do not suggest prolonged subaerial exposure of the platform as some have considered. The configuration of the MCSB surface influenced the distribution of overlying sediment such that the section is thick in the basins and thin on the highs. The three main units recognized are Upper Cretaceous, Paleocene-Eocene, and post-Eocene. The Upper Cretaceous has two subunits, KU1 and KU2. KU1 corresponds in age to the Tuscaloosa-Eutaw lithostratigraphic units, has continuous, parallel seismic facies, and tends to thicken in depressions on the MCSB. KU2 is age-equivalent to part of the Selma Gp. Maastrichtian strata are locally thin to partly absent. In the AB, KU2 appears intensely faulted. Sonic velocities in KU2 show southeastward change to more carbonate rock across the Middle Ground Arch, where hummocky-to-contorted seismic facies and thickening on the structural high suggest constructional accumulation. In wells, Paleocene strata lie unconformably on the Upper Cretaceous. The Paleocene section is thin and not easy to resolve on seismic sections. In the AB, the lowermost Eocene sequence is a wedge that thickens dramatically to the west. In the eastern AB, younger Eocene sequences are stacked to form broad en echelon mounds. Post-Eocene strata in the AB are continuous, parallel and drape the upper Eocene surface. Along the southeastern, up-dip margin of the Tampa Embayment (TE), a belt of west-prograding clinoforms marks the Eocene shelf edge. Landward of this, a seismic marbled zone suggests dolomitic facies. In the post-Eocene section of the TE, Oligocene-Lower Miocene strata form successive sequences of progradational clinoforms that steepen as they impinge on the FL Escarpment.

  14. Next-generation sequencing for diagnosis of rare diseases in the neonatal intensive care unit

    PubMed Central

    Daoud, Hussein; Luco, Stephanie M.; Li, Rui; Bareke, Eric; Beaulieu, Chandree; Jarinova, Olga; Carson, Nancy; Nikkel, Sarah M.; Graham, Gail E.; Richer, Julie; Armour, Christine; Bulman, Dennis E.; Chakraborty, Pranesh; Geraghty, Michael; Lines, Matthew A.; Lacaze-Masmonteil, Thierry; Majewski, Jacek; Boycott, Kym M.; Dyment, David A.

    2016-01-01

    Background: Rare diseases often present in the first days and weeks of life and may require complex management in the setting of a neonatal intensive care unit (NICU). Exhaustive consultations and traditional genetic or metabolic investigations are costly and often fail to arrive at a final diagnosis when no recognizable syndrome is suspected. For this pilot project, we assessed the feasibility of next-generation sequencing as a tool to improve the diagnosis of rare diseases in newborns in the NICU. Methods: We retrospectively identified and prospectively recruited newborns and infants admitted to the NICU of the Children’s Hospital of Eastern Ontario and the Ottawa Hospital, General Campus, who had been referred to the medical genetics or metabolics inpatient consult service and had features suggesting an underlying genetic or metabolic condition. DNA from the newborns and parents was enriched for a panel of clinically relevant genes and sequenced on a MiSeq sequencing platform (Illumina Inc.). The data were interpreted with a standard informatics pipeline and reported to care providers, who assessed the importance of genotype–phenotype correlations. Results: Of 20 newborns studied, 8 received a diagnosis on the basis of next-generation sequencing (diagnostic rate 40%). The diagnoses were renal tubular dysgenesis, SCN1A-related encephalopathy syndrome, myotubular myopathy, FTO deficiency syndrome, cranioectodermal dysplasia, congenital myasthenic syndrome, autosomal dominant intellectual disability syndrome type 7 and Denys–Drash syndrome. Interpretation: This pilot study highlighted the potential of next-generation sequencing to deliver molecular diagnoses rapidly with a high success rate. With broader use, this approach has the potential to alter health care delivery in the NICU. PMID:27241786

  15. Next generation sequencing applications for breast cancer research

    PubMed Central

    PETRIC, ROXANA COJOCNEANU; POP, LAURA-ANCUTA; JURJ, ANCUTA; RADULY, LAJOS; DUMITRASCU, DAN; DRAGOS, NICOLAE; NEAGOE, IOANA BERINDAN

    2015-01-01

    For some time, cancer has not been thought of as a disease, but as a multifaceted, heterogeneous complex of genotypic and phenotypic manifestations leading to tumorigenesis. Due to recent technological progress, the outcome of cancer patients can be greatly improved by introducing into clinical practice the advantages brought about by the development of next generation sequencing techniques. Biomedical suppliers have come up with various applications which medical researchers can use to characterize a patient’s disease from a molecular and genetic point of view in order to provide caregivers with rapid and relevant information to guide them in choosing the most appropriate course of treatment, with maximum efficiency and minimal side effects. Breast cancer, whose incidence has risen dramatically, is a good candidate for these novel diagnostic and therapeutic approaches, particularly when referring to specific sequencing panels which are designed to detect germline or somatic mutations in genes that are involved in breast cancer tumorigenesis and progression. Benchtop next generation sequencing machines are becoming a more common presence in the clinical setting, empowering physicians to better treat their patients by offering early diagnosis alternatives and targeted remedies, and bringing medicine a step closer to achieving its ultimate goal, personalized therapy. PMID:26609257

  16. Generating animated sequences from 3D whole-body scans

    NASA Astrophysics Data System (ADS)

    Pargas, Roy P.; Chhatriwala, Murtuza; Mulfinger, Daniel; Deshmukh, Pushkar; Vadhiyar, Sathish

    1999-03-01

    3D images of human subjects are, today, easily obtained using 3D whole-body scanners. 3D human images can provide static information about the physical characteristics of a person, information valuable to professionals such as clothing designers, anthropometrists, medical doctors, physical therapists, athletic trainers, and sculptors. Can 3D human images be used to provide more than static physical information? The research described in this paper attempts to answer the question by explaining a way that animated sequences may be generated from a single 3D scan. The process starts by subdividing the human image into segments and mapping the segments to those of a human model defined in a human-motion simulation package. The simulation software provides information used to display movement of the human image. Snapshots of the movement are captured and assembled to create an animated sequence. All of the postures and motion of the human images come from a single 3D scan. This paper describes the process involved in animating human figures from static 3D whole-body scans, presents an example of a generated animated sequence, and discusses possible applications of this approach.

  17. Next generation sequencing: new tools in immunology and hematology

    PubMed Central

    Mori, Antonio; Deola, Sara; Xumerle, Luciano; Mijatovic, Vladan; Malerba, Giovanni

    2013-01-01

    One of the hallmarks of the adaptive immune system is the specificity of B and T cell receptors. Thanks to somatic recombination, a large repertoire of receptors can be generated within an individual that guarantee the recognition of a vast number of antigens. Monoclonal antibodies have limited applicability, given the high degree of diversity among these receptors, in BCR and TCR monitoring. Furthermore, with regard to cancer, better characterization of complex genomes and the ability to monitor tumor-specific cryptic mutations or translocations are needed to develop better tailored therapies. Novel technologies, by enhancing the ability of BCR and TCR monitoring, can help in the search for minimal residual disease during hematological malignancy diagnosis and follow-up, and can aid in improving bone marrow transplantation techniques. Recently, a novel technology known as next generation sequencing has been developed; this allows the recognition of unique sequences and provides depth of coverage, heterogeneity, and accuracy of sequencing. This provides a powerful tool that, along with microarray analysis for gene expression, may become integral in resolving the remaining key problems in hematology. This review describes the state of the art of this novel technology, its application in the immunological and hematological fields, and the possible benefits it will provide for the hematology and immunology community. PMID:24466547

  18. Unraveling genomic variation from next generation sequencing data.

    PubMed

    Pavlopoulos, Georgios A; Oulas, Anastasis; Iacucci, Ernesto; Sifrim, Alejandro; Moreau, Yves; Schneider, Reinhard; Aerts, Jan; Iliopoulos, Ioannis

    2013-01-01

    Elucidating the content of a DNA sequence is critical to more deeply understand and decode the genetic information for any biological system. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of these areas, which are being shaped by the new technological advances, involve evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAS), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, the analysis and interpretation of the vast amount of data produced is not an easy or trivial task and still remains a great challenge in the field of bioinformatics. Therefore, efficient tools to cope with information overload, tackle the high complexity and provide meaningful visualizations to make knowledge extraction easier are essential. In this article, we briefly refer to the sequencing methodologies and the available equipment to serve these analyses and we describe the data formats of the files produced by them. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data with emphasis on structural variation analysis and comparative genomics. We finally comment on their functionality, strengths and weaknesses and we discuss how future applications could further develop in this field.

  19. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach.

    PubMed

    Guttikonda, Satish K; Marri, Pradeep; Mammadov, Jafar; Ye, Liang; Soe, Khaing; Richey, Kimberly; Cruse, James; Zhuang, Meibao; Gao, Zhifang; Evans, Clive; Rounsley, Steve; Kumpatla, Siva P

    2016-01-01

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterize transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive and cost- and labor-effective alternative for molecular characterization compared to traditional Southern blot analysis. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions. PMID:26908260

  20. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach

    PubMed Central

    Guttikonda, Satish K.; Marri, Pradeep; Mammadov, Jafar; Ye, Liang; Soe, Khaing; Richey, Kimberly; Cruse, James; Zhuang, Meibao; Gao, Zhifang; Evans, Clive; Rounsley, Steve; Kumpatla, Siva P.

    2016-01-01

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterize transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive and cost- and labor-effective alternative for molecular characterization compared to traditional Southern blot analysis. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with the traditional method with respect to key criteria required for regulatory submissions. PMID:26908260

  1. Genetic markers, genotyping methods & next generation sequencing in Mycobacterium tuberculosis

    PubMed Central

    Desikan, Srinidhi; Narayanan, Sujatha

    2015-01-01

    Molecular epidemiology (ME) is one of the main areas of tuberculosis research and is widely used to study the transmission epidemics and outbreaks of tubercle bacilli. It exploits the presence of various polymorphisms in the genome of the bacteria that can be widely used as genetic markers. Many DNA typing methods apply these genetic markers to differentiate various strains and to study the evolutionary relationships between them. The three widely used genotyping tools to differentiate Mycobacterium tuberculosis strains are IS6110 restriction fragment length polymorphism (RFLP), spacer oligotyping (Spoligotyping), and mycobacterial interspersed repeat units - variable number of tandem repeats (MIRU-VNTR). A new prospect for ME was introduced with the development of whole genome sequencing (WGS) and next generation sequencing (NGS) methods, in which the entire genome is sequenced; this not only helps to pinpoint minute differences between sequences but also saves time and cost. NGS is also useful for identifying single nucleotide polymorphisms (SNPs), for comparative genomics and for studying various aspects of transmission dynamics. These techniques enable the identification of mycobacterial strains and also facilitate the study of their phylogenetic and evolutionary traits. PMID:26205019

  2. Molecular Characterization of Transgenic Events Using Next Generation Sequencing Approach.

    PubMed

    Guttikonda, Satish K; Marri, Pradeep; Mammadov, Jafar; Ye, Liang; Soe, Khaing; Richey, Kimberly; Cruse, James; Zhuang, Meibao; Gao, Zhifang; Evans, Clive; Rounsley, Steve; Kumpatla, Siva P

    2016-01-01

    Demand for the commercial use of genetically modified (GM) crops has been increasing in light of the projected growth of world population to nine billion by 2050. A prerequisite of paramount importance for regulatory submissions is the rigorous safety assessment of GM crops. One of the components of safety assessment is molecular characterization at the DNA level, which helps to determine the copy number, integrity and stability of a transgene; characterize the integration site within a host genome; and confirm the absence of vector DNA. Historically, molecular characterization has been carried out using Southern blot analysis coupled with Sanger sequencing. While this is a robust approach to characterize transgenic crops, it is both time- and resource-consuming. The emergence of next-generation sequencing (NGS) technologies has provided a highly sensitive, cost- and labor-effective alternative to traditional Southern blot analysis for molecular characterization. Herein, we have demonstrated the successful application of both whole genome sequencing and target capture sequencing approaches for the characterization of single and stacked transgenic events and compared the results and inferences with those of the traditional method with respect to key criteria required for regulatory submissions.

  3. A Comprehensive Platform for NGS Data Analysis

    SciTech Connect

    Kravitz, Saul

    2010-06-03

    Saul Kravitz of CLC Bio discusses the company's Genomic Workbench and how it can be used with data from next generation sequencing platforms on June 3, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  4. Next generation sequencing: Coping with rare genetic diseases in China

    PubMed Central

    Cram, David S; Zhou, Daixing

    2016-01-01

    With a population of 1.4 billion, China shares the largest burden of rare genetic diseases worldwide. Current estimates suggest that there are over ten million individuals afflicted with chromosome disease syndromes and well over one million individuals with monogenic disease. Care of patients with rare genetic diseases remains a largely unmet need due to the paucity of available and affordable treatments. Over recent years, there has been increasing recognition of the need for affirmative action by government, health providers, clinicians and patients. The advent of new next generation sequencing (NGS) technologies, such as whole genome/exome sequencing, offers an unprecedented opportunity to provide large-scale population screening of the Chinese population to identify the molecular causes of rare genetic diseases. As a surrogate for lack of effective treatments, recent development and implementation of noninvasive prenatal testing (NIPT) in China has the greatest potential, as a single technology, for reducing the number of children born with rare genetic diseases.

  5. Next-Generation Sequencing: Role in Gynecologic Cancers.

    PubMed

    Evans, Tarra; Matulonis, Ursula

    2016-09-01

    Next-generation sequencing (NGS) has risen to the forefront of tumor analysis and has enabled unprecedented advances in the molecular profiling of solid tumors. Through massively parallel sequencing, previously unrecognized genomic alterations have been unveiled in many malignancies, including gynecologic cancers, thus expanding the potential repertoire for the use of targeted therapies. NGS has expanded the understanding of the genomic foundation of gynecologic malignancies and has allowed identification of germline and somatic mutations associated with cancer development, enabled tumor reclassification, and helped determine mechanisms of treatment resistance. NGS has also facilitated rational therapeutic strategies based on actionable molecular aberrations. However, issues remain regarding cost and clinical utility. This review covers NGS analysis of gynecologic cancers, specifically ovarian, endometrial, cervical, and vulvar cancers, and its impact on them thus far. PMID:27587626

  6. Next generation sequencing: Coping with rare genetic diseases in China.

    PubMed

    Cram, David S; Zhou, Daixing

    2016-08-01

    With a population of 1.4 billion, China shares the largest burden of rare genetic diseases worldwide. Current estimates suggest that there are over ten million individuals afflicted with chromosome disease syndromes and well over one million individuals with monogenic disease. Care of patients with rare genetic diseases remains a largely unmet need due to the paucity of available and affordable treatments. Over recent years, there has been increasing recognition of the need for affirmative action by government, health providers, clinicians and patients. The advent of new next generation sequencing (NGS) technologies, such as whole genome/exome sequencing, offers an unprecedented opportunity to provide large-scale population screening of the Chinese population to identify the molecular causes of rare genetic diseases. As a surrogate for lack of effective treatments, recent development and implementation of noninvasive prenatal testing (NIPT) in China has the greatest potential, as a single technology, for reducing the number of children born with rare genetic diseases. PMID:27672536

  7. Second-generation environmental sequencing unmasks marine metazoan biodiversity.

    PubMed

    Fonseca, Vera G; Carvalho, Gary R; Sung, Way; Johnson, Harriet F; Power, Deborah M; Neill, Simon P; Packer, Margaret; Blaxter, Mark L; Lambshead, P John D; Thomas, W Kelley; Creer, Simon

    2010-10-19

    Biodiversity is of crucial importance for ecosystem functioning, sustainability and resilience, but the magnitude and organization of marine diversity at a range of spatial and taxonomic scales are undefined. In this paper, we use second-generation sequencing to unmask putatively diverse marine metazoan biodiversity in a Scottish temperate benthic ecosystem. We show that remarkable differences in diversity occurred at microgeographical scales and refute currently accepted ecological and taxonomic paradigms of meiofaunal identity, rank abundance and concomitant understanding of trophic dynamics. Richness estimates from the current benchmarked Operational Clustering of Taxonomic Units from Parallel UltraSequencing analyses are broadly aligned with those derived from morphological assessments. However, the slope of taxon rarefaction curves for many phyla remains incomplete, suggesting that the true alpha diversity is likely to exceed current perceptions. The approaches provide a rapid, objective and cost-effective taxonomic framework for exploring links between ecosystem structure and function of all hitherto intractable, but ecologically important, communities.

  8. Drug resistance analysis by next generation sequencing in Leishmania

    PubMed Central

    Leprohon, Philippe; Fernandez-Prada, Christopher; Gazanion, Élodie; Monte-Neto, Rubens; Ouellette, Marc

    2014-01-01

    The use of next generation sequencing has the power to expedite the identification of drug resistance determinants and biomarkers and was applied successfully to drug resistance studies in Leishmania. This allowed the identification of modulation in gene expression, gene dosage alterations, changes in chromosome copy numbers and single nucleotide polymorphisms that correlated with resistance in Leishmania strains derived from the laboratory and from the field. An impressive heterogeneity at the population level was also observed, individual clones within populations often differing in both genotypes and phenotypes, hence complicating the elucidation of resistance mechanisms. This review summarizes the most recent highlights that whole genome sequencing brought to our understanding of Leishmania drug resistance and likely new directions. PMID:25941624

  9. Second-generation environmental sequencing unmasks marine metazoan biodiversity

    PubMed Central

    Fonseca, Vera G.; Carvalho, Gary R.; Sung, Way; Johnson, Harriet F.; Power, Deborah M.; Neill, Simon P.; Packer, Margaret; Blaxter, Mark L.; Lambshead, P. John D.; Thomas, W. Kelley; Creer, Simon

    2010-01-01

    Biodiversity is of crucial importance for ecosystem functioning, sustainability and resilience, but the magnitude and organization of marine diversity at a range of spatial and taxonomic scales are undefined. In this paper, we use second-generation sequencing to unmask putatively diverse marine metazoan biodiversity in a Scottish temperate benthic ecosystem. We show that remarkable differences in diversity occurred at microgeographical scales and refute currently accepted ecological and taxonomic paradigms of meiofaunal identity, rank abundance and concomitant understanding of trophic dynamics. Richness estimates from the current benchmarked Operational Clustering of Taxonomic Units from Parallel UltraSequencing analyses are broadly aligned with those derived from morphological assessments. However, the slope of taxon rarefaction curves for many phyla remains incomplete, suggesting that the true alpha diversity is likely to exceed current perceptions. The approaches provide a rapid, objective and cost-effective taxonomic framework for exploring links between ecosystem structure and function of all hitherto intractable, but ecologically important, communities. PMID:20981026

  10. Next-generation sequencing technology in clinical virology.

    PubMed

    Capobianchi, M R; Giombini, E; Rozera, G

    2013-01-01

    Recent advances in nucleic acid sequencing technologies, referred to as 'next-generation' sequencing (NGS), have produced a true revolution and opened new perspectives for research and diagnostic applications, owing to the high speed and throughput of data generation. So far, NGS has been applied to metagenomics-based strategies for the discovery of novel viruses and the characterization of viral communities. Additional applications include whole viral genome sequencing, detection of viral genome variability, and the study of viral dynamics. These applications are particularly suitable for viruses such as human immunodeficiency virus, hepatitis B virus, and hepatitis C virus, whose error-prone replication machinery, combined with the high replication rate, results, in each infected individual, in the formation of many genetically related viral variants referred to as quasi-species. The viral quasi-species, in turn, represents the substrate for the selective pressure exerted by the immune system or by antiviral drugs. With traditional approaches, it is difficult to detect and quantify minority genomes present in viral quasi-species that, in fact, may have biological and clinical relevance. NGS provides, for each patient, a dataset of clonal sequences that is some orders of magnitude larger than those obtained with conventional approaches. Hence, NGS is an extremely powerful tool with which to investigate previously inaccessible aspects of viral dynamics, such as the contribution of different viral reservoirs to replicating virus in the course of the natural history of the infection, co-receptor usage in minority viral populations harboured by different cell lineages, the dynamics of development of drug resistance, and the re-emergence of hidden genomes after treatment interruptions. The diagnostic application of NGS is just around the corner.

  11. Using next generation transcriptome sequencing to predict an ectomycorrhizal metabolome.

    SciTech Connect

    Larsen, P. E.; Sreedasyam, A.; Trivedi, G; Podila, G. K.; Cseke, L. J.; Collart, F. R.

    2011-05-13

    Mycorrhizae, symbiotic interactions between soil fungi and tree roots, are ubiquitous in terrestrial ecosystems. The fungi contribute phosphorus, nitrogen and mobilized nutrients from organic matter in the soil and in return receive photosynthetically-derived carbohydrates. This union of plant and fungal metabolisms is the mycorrhizal metabolome. Understanding this symbiotic relationship at a molecular level provides important contributions to the understanding of forest ecosystems and global carbon cycling. We generated next generation short-read transcriptomic sequencing data from fully-formed ectomycorrhizae between Laccaria bicolor and aspen (Populus tremuloides) roots. The transcriptomic data were used to identify statistically significantly expressed gene models using a bootstrap-style approach, and these expressed genes were mapped to specific metabolic pathways. Integration of expressed genes that code for metabolic enzymes and the set of expressed membrane transporters generates a predictive model of the ectomycorrhizal metabolome. The generated model of the mycorrhizal metabolome predicts that the specific compounds glycine, glutamate, and allantoin are synthesized by L. bicolor and that these compounds or their metabolites may be used for the benefit of aspen in exchange for the photosynthetically-derived sugars fructose and glucose. The analysis illustrates an approach to generate testable biological hypotheses to investigate the complex molecular interactions that drive ectomycorrhizal symbiosis. These models are consistent with experimental environmental data and provide insight into the molecular exchange processes for organisms in this complex ecosystem. The method used here for predicting metabolomic models of mycorrhizal systems from deep RNA sequencing data can be generalized and is broadly applicable to transcriptomic data derived from complex systems.

  12. Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers.

    PubMed

    Myer, Phillip R; Kim, MinSeok; Freetly, Harvey C; Smith, Timothy P L

    2016-08-01

    Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplification primer selection, and read length, which can affect the apparent microbial community. In this study, we compared short read 16S rRNA variable regions, V1-V3, with that of near-full length 16S regions, V1-V8, using highly diverse steer rumen microbial communities, in order to examine the impact of technology selection on phylogenetic profiles. Short paired-end reads from the Illumina MiSeq platform were used to generate V1-V3 sequence, while long "circular consensus" reads from the Pacific Biosciences RSII instrument were used to generate V1-V8 data. The two platforms revealed similar microbial operational taxonomic units (OTUs), as well as similar species richness, Good's coverage, and Shannon diversity metrics. However, the V1-V8 amplified ruminal community resulted in significant increases in several orders of taxa, such as phyla Proteobacteria and Verrucomicrobia (P < 0.05). Taxonomic classification accuracy was also greater in the near full-length read. UniFrac distance matrices using jackknifed UPGMA clustering also noted differences between the communities. These data support the consensus that longer reads result in a finer phylogenetic resolution that may not be achieved by shorter 16S rRNA gene fragments. Our work on the cattle rumen bacterial community demonstrates that utilizing near full-length 16S reads may be useful in conducting a more thorough study, or for developing a niche-specific database to use in analyzing data from shorter read technologies when budgetary constraints preclude use of near-full length 16S sequencing. PMID:27282101
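
    The richness, Good's coverage and Shannon diversity metrics named in this abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions; the OTU read counts are hypothetical and are not data from the study, and the function names are chosen here for illustration only.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over OTUs with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def goods_coverage(counts):
    """Good's coverage = 1 - (number of singleton OTUs / total number of reads)."""
    total = sum(counts)
    singletons = sum(1 for c in counts if c == 1)
    return 1.0 - singletons / total

# Hypothetical read counts per OTU for one sample (illustrative only).
otu_counts = [50, 30, 15, 4, 1]

# Observed richness is the number of OTUs with at least one read.
richness = sum(1 for c in otu_counts if c > 0)

print("richness:", richness)
print("Good's coverage:", goods_coverage(otu_counts))
print("Shannon H:", round(shannon_index(otu_counts), 3))
```

    Comparing two sequencing platforms then amounts to computing these values from each platform's OTU table for the same samples.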

  13. Computational characterisation of cancer molecular profiles derived using next generation sequencing

    PubMed Central

    Oleksiewicz, Urszula; Tomczak, Katarzyna; Woropaj, Jakub; Markowska, Monika; Stępniak, Piotr

    2015-01-01

    Our current understanding of cancer genetics is grounded on the principle that cancer arises from a clone that has accumulated the requisite somatically acquired genetic aberrations, leading to the malignant transformation. It also results in aberrant gene and protein expression. Next generation sequencing (NGS) or deep sequencing platforms are being used to create large catalogues of changes in copy numbers, mutations, structural variations, gene fusions, gene expression, and other types of information for cancer patients. However, inferring different types of biological changes from raw reads generated using the sequencing experiments is algorithmically and computationally challenging. In this article, we outline common steps for the quality control and processing of NGS data. We highlight the importance of accurate and application-specific alignment of these reads and the methodological steps and challenges in obtaining different types of information. We comment on the importance of integrating these data and building infrastructure to analyse them. We also provide exhaustive lists of available software to obtain information and point the readers to articles comparing software for deeper insight in specialised areas. We hope that the article will guide readers in choosing the right tools for analysing oncogenomic datasets. PMID:25691827

  14. NGS-Trex: Next Generation Sequencing Transcriptome profile explorer

    PubMed Central

    2013-01-01

    Background: Next-Generation Sequencing (NGS) technology has exceptionally increased the ability to sequence DNA in a massively parallel and cost-effective manner. Nevertheless, NGS data analysis requires bioinformatics skills and computational resources well beyond the possibilities of many "wet biology" laboratories. Moreover, most projects require only a few sequencing cycles and standard tools or workflows to carry out suitable analyses for the identification and annotation of genes, transcripts and splice variants found in the biological samples under investigation. These projects can benefit from the availability of easy-to-use systems that automatically analyse sequences and mine data without first requiring a strong bioinformatics background and hardware infrastructure. Results: To address this issue we developed an automatic system targeted to the analysis of NGS data obtained from large-scale transcriptome studies. This system, which we named NGS-Trex (NGS Transcriptome profile explorer), is available through a simple web interface (http://www.ngs-trex.org) and allows the user to upload raw sequences and easily obtain an accurate characterization of the transcriptome profile after setting the few parameters required to tune the analysis procedure. The system is also able to assess differential expression at both gene and transcript level (i.e. splicing isoforms) by comparing the expression profiles of different samples. By using simple query forms the user can obtain lists of genes, transcripts and splice sites ranked and filtered according to several criteria. Data can be viewed as tables, text files or through a simple genome browser which helps the visual inspection of the data. Conclusions: NGS-Trex is a simple tool for RNA-Seq data analysis mainly targeted to "wet biology" researchers with limited bioinformatics skills. It offers simple data mining tools to explore the transcriptome profiles of samples investigated taking advantage of NGS technologies.

  15. Targeted DNA methylation analysis by next-generation sequencing.

    PubMed

    Masser, Dustin R; Stanford, David R; Freeman, Willard M

    2015-02-24

    The role of epigenetic processes in the control of gene expression has been known for a number of years. DNA methylation at cytosine residues is of particular interest for epigenetic studies as it has been demonstrated to be both a long lasting and a dynamic regulator of gene expression. Efforts to examine epigenetic changes in health and disease have been hindered by the lack of high-throughput, quantitatively accurate methods. With the advent and popularization of next-generation sequencing (NGS) technologies, these tools are now being applied to epigenomics in addition to existing genomic and transcriptomic methodologies. For epigenetic investigations of cytosine methylation where regions of interest, such as specific gene promoters or CpG islands, have been identified and there is a need to examine significant numbers of samples with high quantitative accuracy, we have developed a method called Bisulfite Amplicon Sequencing (BSAS). This method combines bisulfite conversion with targeted amplification of regions of interest, transposome-mediated library construction and benchtop NGS. BSAS offers a rapid and efficient method for analysis of up to 10 kb of targeted regions in up to 96 samples at a time that can be performed by most research groups with basic molecular biology skills. The results provide absolute quantitation of cytosine methylation with base specificity. BSAS can be applied to any genomic region from any DNA source. This method is useful for hypothesis testing studies of target regions of interest as well as confirmation of regions identified in genome-wide methylation analyses such as whole genome bisulfite sequencing, reduced representation bisulfite sequencing, and methylated DNA immunoprecipitation sequencing.
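
    The quantitation step behind bisulfite sequencing can be sketched briefly. Bisulfite conversion leaves methylated cytosines as C and converts unmethylated cytosines so that they are read as T, so the methylated fraction at a CpG site is commonly estimated as C calls divided by C plus T calls. The sketch below is not the BSAS authors' pipeline; it assumes ungapped, forward-strand reads already aligned to the full amplicon, and the reference and reads are hypothetical.

```python
def cpg_positions(reference):
    """Return 0-based indices of cytosines that are immediately followed by a guanine."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]

def methylation_fraction(reference, reads):
    """Per-CpG methylation estimate: C calls / (C + T calls) at each CpG cytosine."""
    result = {}
    for pos in cpg_positions(reference):
        c_calls = sum(1 for r in reads if r[pos] == "C")
        t_calls = sum(1 for r in reads if r[pos] == "T")
        covered = c_calls + t_calls
        result[pos] = c_calls / covered if covered else None
    return result

# Hypothetical 10-nt amplicon with CpG cytosines at positions 2 and 7.
reference = "ATCGTTACGA"
# Hypothetical bisulfite-converted reads, each spanning the whole amplicon.
reads = ["ATCGTTATGA", "ATCGTTATGA", "ATTGTTATGA", "ATCGTTACGA"]

for pos, frac in methylation_fraction(reference, reads).items():
    print(f"CpG at position {pos}: {frac:.0%} methylated")
```

    A real analysis would take its base calls from an aligner's output and would also handle reverse-strand reads, which this sketch ignores.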

  16. Evaluation of next generation sequencing for the analysis of Eimeria communities in wildlife.

    PubMed

    Vermeulen, Elke T; Lott, Matthew J; Eldridge, Mark D B; Power, Michelle L

    2016-05-01

    Next-generation sequencing (NGS) techniques are well-established for studying bacterial communities but not yet for microbial eukaryotes. Parasite communities remain poorly studied, due in part to the lack of reliable and accessible molecular methods to analyse eukaryotic communities. We aimed to develop and evaluate a methodology to analyse communities of the protozoan parasite Eimeria from populations of the Australian marsupial Petrogale penicillata (brush-tailed rock-wallaby) using NGS. An oocyst purification method for small sample sizes and polymerase chain reaction (PCR) protocol for the 18S rRNA locus targeting Eimeria was developed and optimised prior to sequencing on the Illumina MiSeq platform. A data analysis approach was developed by modifying methods from bacterial metagenomics and utilising existing Eimeria sequences in GenBank. Operational taxonomic unit (OTU) assignment at a high similarity threshold (97%) was more accurate at assigning Eimeria contigs into Eimeria OTUs but at a lower threshold (95%) there was greater resolution between OTU consensus sequences. The assessment of two amplification PCR methods prior to Illumina MiSeq, single and nested PCR, determined that single PCR was more sensitive to Eimeria as more Eimeria OTUs were detected in single amplicons. We have developed a simple and cost-effective approach to a data analysis pipeline for community analysis of eukaryotic organisms using Eimeria communities as a model. The pipeline provides a basis for evaluation using other eukaryotic organisms and potential for diverse community analysis studies. PMID:26944624
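
    The effect of the OTU similarity threshold can be seen in a small sketch. It uses a simple greedy centroid clustering on pre-aligned, equal-length sequences, which is an assumption made here for illustration and not the pipeline used in the study; the sequences and the helper names (identity, greedy_otus, mutate) are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_otus(seqs, threshold):
    """Put each sequence in the first OTU whose centroid meets the threshold, else start a new OTU."""
    centroids, clusters = [], []
    for s in seqs:
        for i, c in enumerate(centroids):
            if identity(s, c) >= threshold:
                clusters[i].append(s)
                break
        else:
            centroids.append(s)
            clusters.append([s])
    return clusters

SWAP = {"A": "C", "C": "G", "G": "T", "T": "A"}

def mutate(seq, positions):
    """Return a copy of seq with a substitution at each listed position."""
    chars = list(seq)
    for p in positions:
        chars[p] = SWAP[chars[p]]
    return "".join(chars)

# Hypothetical 100-nt sequences: the base, a 98%-identical and a 96%-identical variant.
base = "ACGT" * 25
seqs = [base, mutate(base, [10, 60]), mutate(base, [5, 25, 45, 85])]

for threshold in (0.97, 0.95):
    print(threshold, "->", len(greedy_otus(seqs, threshold)), "OTU(s)")
```

    With these toy sequences the 97% threshold keeps the 96%-identical variant as its own OTU, while the 95% threshold merges all three into one.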

  17. Small RNAs in angiosperms: sequence characteristics, distribution and generation.

    PubMed

    Chen, Dijun; Meng, Yijun; Ma, Xiaoxia; Mao, Chuanzao; Bai, Youhuang; Cao, Junjie; Gu, Haibin; Wu, Ping; Chen, Ming

    2010-06-01

    High-throughput sequencing (HTS) has opened up a new era for small RNA (sRNA) exploration. Using HTS data for a global survey of sRNAs in 26 angiosperms, elevated GC contents were detected in the monocots, whereas the 5′-terminal compositions were quite uniform among the angiosperms. Chromosome-wide distribution patterns of sRNAs were investigated by using scrolling-window analysis. We performed de novo natural antisense transcript (NAT) prediction, and found that the overlapping regions of trans-NATs, but not cis-NATs, were hotspots for sRNA generation. One cis-NAT generates phased natural antisense short interfering RNAs (nat-siRNAs) specifically from flowers in Arabidopsis, while one in rice produces phased nat-siRNAs from grains, suggesting their organ-specific regulatory roles. PMID:20378553
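
    Two of the quantities mentioned in this abstract, GC content and a windowed distribution along a chromosome, are simple to compute. The sketch below interprets "scrolling-window analysis" as a sliding-window count of sRNA start positions, which is an assumption; the window size, step and positions are hypothetical values chosen for illustration.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def window_counts(positions, chrom_length, window, step):
    """Count positions falling in each window [start, start + window) along a chromosome."""
    counts = []
    for start in range(0, chrom_length - window + 1, step):
        end = start + window
        counts.append((start, sum(1 for p in positions if start <= p < end)))
    return counts

# Hypothetical sRNA start coordinates on a 30-nt toy chromosome.
srna_starts = [5, 12, 18, 27]

print(window_counts(srna_starts, chrom_length=30, window=10, step=10))
print(round(gc_content("GGCCAT"), 3))
```

    Setting the step smaller than the window gives overlapping windows and a smoother profile.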

  18. Small RNAs in angiosperms: sequence characteristics, distribution and generation.

    PubMed

    Chen, Dijun; Meng, Yijun; Ma, Xiaoxia; Mao, Chuanzao; Bai, Youhuang; Cao, Junjie; Gu, Haibin; Wu, Ping; Chen, Ming

    2010-06-01

    High-throughput sequencing (HTS) has opened up a new era for small RNA (sRNA) exploration. Using HTS data for a global survey of sRNAs in 26 angiosperms, elevated GC contents were detected in the monocots, whereas the 5′-terminal compositions were quite uniform among the angiosperms. Chromosome-wide distribution patterns of sRNAs were investigated by using scrolling-window analysis. We performed de novo natural antisense transcript (NAT) prediction, and found that the overlapping regions of trans-NATs, but not cis-NATs, were hotspots for sRNA generation. One cis-NAT generates phased natural antisense short interfering RNAs (nat-siRNAs) specifically from flowers in Arabidopsis, while one in rice produces phased nat-siRNAs from grains, suggesting their organ-specific regulatory roles.

  19. Transcriptome Sequencing and Analysis of Leaf Tissue of Avicennia marina Using the Illumina Platform

    PubMed Central

    Zhang, Wanke; Huang, Rongfeng; Chen, Shouyi; Zheng, Yizhi

    2014-01-01

    Avicennia marina is a widely distributed mangrove species that thrives in high-salinity habitats. It plays a significant role in supporting coastal ecosystems and holds unique potential for studying molecular mechanisms underlying ecological adaptation. Despite and sometimes because of its numerous merits, this species is facing increasing pressure of exploitation and deforestation. Both the study of adaptation mechanisms and conservation efforts necessitate more genomic resources for A. marina. In this study, we used Illumina sequencing of an A. marina foliar cDNA library to generate a transcriptome dataset for gene and marker discovery. We obtained 40 million high-quality reads and assembled them into 91,125 unigenes with a mean length of 463 bp. These unigenes covered most of the publicly available A. marina Sanger ESTs and greatly extended the repertoire of transcripts for this species. A total of 54,497 and 32,637 unigenes were annotated based on homology to sequences in the NCBI non-redundant and the Swiss-prot protein databases, respectively. Both Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed some transcriptomic signatures of stress adaptation for this halophytic species. We also detected an extraordinary number of transcripts derived from fungal endophytes and demonstrated the utility of transcriptome sequencing in surveying endophyte diversity without isolating them from plant tissues. Additionally, we identified 3,423 candidate simple sequence repeats (SSRs) from 3,141 unigenes with a density of one SSR locus per 8.25 kb of sequence. Our transcriptomic data will provide valuable resources for ecological, genetic and evolutionary studies in A. marina. PMID:25265387

  20. SNP discovery in the transcriptome of white Pacific shrimp Litopenaeus vannamei by next generation sequencing.

    PubMed

    Yu, Yang; Wei, Jiankai; Zhang, Xiaojun; Liu, Jingwen; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai

    2014-01-01

    The application of next generation sequencing technology has greatly facilitated high throughput single nucleotide polymorphism (SNP) discovery and genotyping in genetic research. In the present study, SNPs were discovered based on two transcriptomes of Litopenaeus vannamei (L. vannamei) generated from Illumina sequencing platform HiSeq 2000. One transcriptome of L. vannamei was obtained through sequencing on the RNA from larvae at mysis stage and its reference sequence was de novo assembled. The data from another transcriptome were downloaded from NCBI and the reads of the two transcriptomes were mapped separately to the assembled reference by BWA. SNP calling was performed using SAMtools. A total of 58,717 and 36,277 SNPs with high quality were predicted from the two transcriptomes, respectively. SNP calling was also performed using the reads of two transcriptomes together, and a total of 96,040 SNPs with high quality were predicted. Among these 96,040 SNPs, 5,242 and 29,129 were predicted as non-synonymous and synonymous SNPs respectively. Characterization analysis of the predicted SNPs in L. vannamei showed that the estimated SNP frequency was 0.21% (one SNP per 476 bp) and the estimated ratio for transition to transversion was 2.0. Fifty SNPs were randomly selected for validation by Sanger sequencing after PCR amplification and 76% of SNPs were confirmed, which indicated that the SNPs predicted in this study were reliable. These SNPs will be very useful for genetic study in L. vannamei, especially for the high density linkage map construction and genome-wide association studies.
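
    The summary statistics quoted in this abstract follow from simple arithmetic, sketched below. A transition is a purine-to-purine (A/G) or pyrimidine-to-pyrimidine (C/T) substitution and any other substitution is a transversion. The figures of one SNP per 476 bp and 76% of 50 validated SNPs come from the abstract; the list of example SNPs is hypothetical and only demonstrates how a transition/transversion ratio of 2.0 arises.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def is_transition(ref, alt):
    """A<->G and C<->T substitutions are transitions; all others are transversions."""
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES

def ts_tv_ratio(snps):
    """Ratio of transitions to transversions for (ref, alt) pairs with ref != alt."""
    ts = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ts
    return ts / tv

# Hypothetical SNP calls: four transitions and two transversions.
example_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"), ("A", "C"), ("G", "T")]
print("Ts/Tv:", ts_tv_ratio(example_snps))

# One SNP per 476 bp, expressed as a percentage of bases (figure from the abstract).
snp_frequency_percent = 100 / 476
print("SNP frequency (%):", round(snp_frequency_percent, 2))

# 76% of the 50 SNPs selected for Sanger validation (figures from the abstract).
validated = round(0.76 * 50)
print("validated SNPs:", validated)
```

    The last two lines confirm that the reported 0.21% frequency matches one SNP per 476 bp and that 76% of 50 corresponds to 38 confirmed SNPs.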

  1. Next generation sequencing in predicting gene function in podophyllotoxin biosynthesis.

    PubMed

    Marques, Joaquim V; Kim, Kye-Won; Lee, Choonseok; Costa, Michael A; May, Gregory D; Crow, John A; Davin, Laurence B; Lewis, Norman G

    2013-01-01

    Podophyllum species are sources of (-)-podophyllotoxin, an aryltetralin lignan used for semi-synthesis of various powerful and extensively employed cancer-treating drugs. Its biosynthetic pathway, however, remains largely unknown, with the last unequivocally demonstrated intermediate being (-)-matairesinol. Herein, massively parallel sequencing of Podophyllum hexandrum and Podophyllum peltatum transcriptomes and subsequent bioinformatics analyses of the corresponding assemblies were carried out. Validation of the assembly process was first achieved through confirmation of assembled sequences with those of various genes previously established as involved in podophyllotoxin biosynthesis as well as other candidate biosynthetic pathway genes. This contribution describes characterization of two of the latter, namely the cytochrome P450s, CYP719A23 from P. hexandrum and CYP719A24 from P. peltatum. Both enzymes were capable of converting (-)-matairesinol into (-)-pluviatolide by catalyzing methylenedioxy bridge formation and did not act on other possible substrates tested. Interestingly, the enzymes described herein were highly similar to methylenedioxy bridge-forming enzymes from alkaloid biosynthesis, whereas candidates more similar to lignan biosynthetic enzymes were catalytically inactive with the substrates employed. This overall strategy has thus enabled facile further identification of enzymes putatively involved in (-)-podophyllotoxin biosynthesis and underscores the deductive power of next generation sequencing and bioinformatics to probe and deduce medicinal plant biosynthetic pathways.

  2. Application of next-generation sequencing technologies in Neurology

    PubMed Central

    Jiang, Teng; Tan, Meng-Shan

    2014-01-01

    Genetic risk factors that underlie many rare and common neurological diseases remain poorly understood because of the multi-factorial and heterogeneous nature of these disorders. Although genome-wide association studies (GWAS) have successfully uncovered numerous susceptibility genes for these diseases, odds ratios associated with risk alleles are generally low and account for only a small proportion of estimated heritability. These results imply that rare (present in <5% of the population) but not causative variants, which usually have large effect sizes and cannot be captured by GWAS, are involved in the pathogenesis of these diseases. With the decreasing cost of next-generation sequencing (NGS) technologies, whole-genome sequencing (WGS) and whole-exome sequencing (WES) have enabled the rapid identification of rare variants with large effect sizes, which has led to major progress in understanding the basis of many Mendelian neurological conditions as well as complex neurological diseases. In this article, we review recent NGS-based studies that aimed to investigate genetic causes of neurological diseases, including Alzheimer’s disease, Parkinson’s disease, epilepsy, multiple sclerosis, stroke, amyotrophic lateral sclerosis and spinocerebellar ataxias. We also discuss future directions of NGS applications. PMID:25568878

  3. Next Generation Sequencing in Predicting Gene Function in Podophyllotoxin Biosynthesis*

    PubMed Central

    Marques, Joaquim V.; Kim, Kye-Won; Lee, Choonseok; Costa, Michael A.; May, Gregory D.; Crow, John A.; Davin, Laurence B.; Lewis, Norman G.

    2013-01-01

    Podophyllum species are sources of (−)-podophyllotoxin, an aryltetralin lignan used for semi-synthesis of various powerful and extensively employed cancer-treating drugs. Its biosynthetic pathway, however, remains largely unknown, with the last unequivocally demonstrated intermediate being (−)-matairesinol. Herein, massively parallel sequencing of Podophyllum hexandrum and Podophyllum peltatum transcriptomes and subsequent bioinformatics analyses of the corresponding assemblies were carried out. Validation of the assembly process was first achieved through confirmation of assembled sequences with those of various genes previously established as involved in podophyllotoxin biosynthesis as well as other candidate biosynthetic pathway genes. This contribution describes characterization of two of the latter, namely the cytochrome P450s, CYP719A23 from P. hexandrum and CYP719A24 from P. peltatum. Both enzymes were capable of converting (−)-matairesinol into (−)-pluviatolide by catalyzing methylenedioxy bridge formation and did not act on other possible substrates tested. Interestingly, the enzymes described herein were highly similar to methylenedioxy bridge-forming enzymes from alkaloid biosynthesis, whereas candidates more similar to lignan biosynthetic enzymes were catalytically inactive with the substrates employed. This overall strategy has thus enabled facile further identification of enzymes putatively involved in (−)-podophyllotoxin biosynthesis and underscores the deductive power of next generation sequencing and bioinformatics to probe and deduce medicinal plant biosynthetic pathways. PMID:23161544

  4. Comparison of DNA Quantification Methods for Next Generation Sequencing

    PubMed Central

    Robin, Jérôme D.; Ludlow, Andrew T.; LaRanger, Ryan; Wright, Woodring E.; Shay, Jerry W.

    2016-01-01

    Next Generation Sequencing (NGS) is a powerful tool that depends on loading a precise amount of DNA onto a flowcell. NGS strategies have expanded our ability to investigate genomic phenomena by referencing mutations in cancer and diseases through large-scale genotyping, developing methods to map rare chromatin interactions (4C; 5C and Hi-C) and identifying chromatin features associated with regulatory elements (ChIP-seq, Bis-Seq, ChiA-PET). While many methods are available for DNA library quantification, there is no unambiguous gold standard. Most techniques use PCR to amplify DNA libraries to obtain sufficient quantities for optical density measurement. However, increased PCR cycles can distort the library’s heterogeneity and prevent the detection of rare variants. In this analysis, we compared new digital PCR technologies (droplet digital PCR; ddPCR, ddPCR-Tail) with standard methods for the titration of NGS libraries. DdPCR-Tail is comparable to qPCR and fluorometry (QuBit) and allows sensitive quantification by analysis of barcode repartition after sequencing of multiplexed samples. This study provides a direct comparison between quantification methods throughout a complete sequencing experiment and provides the impetus to use ddPCR-based quantification for improvement of NGS quality. PMID:27048884

  5. Next generation sequencing technologies: tool to study avian virus diversity.

    PubMed

    Kapgate, S S; Barbuddhe, S B; Kumanan, K

    2015-03-01

    Increased globalisation, climatic changes and the wildlife-livestock interface have led to the emergence of novel viral pathogens or zoonoses that have become a serious concern for avian, animal and human health. High biodiversity and bird migration facilitate the spread of pathogens and provide reservoirs for emerging infectious diseases. Current classical diagnostic methods, which are designed to be virus-specific or limited to a group of viral agents, hinder the identification of novel viruses or viral variants. Recently developed next-generation sequencing (NGS) approaches provide culture-independent methods that are useful for understanding viral diversity and discovering novel viruses, thereby enabling better diagnosis and disease control. This review discusses the possible steps of an NGS study utilizing sequence-independent amplification, high-throughput sequencing and bioinformatics approaches to identify novel avian viruses and their diversity. NGS has led to the identification of a wide range of new viruses, such as picobirnavirus, picornavirus, orthoreovirus and an avian gamma coronavirus associated with fulminating disease in guinea fowl, and is also used in describing viral diversity among avian species. The review also briefly discusses areas of viral-host interaction and disease-associated causalities with newly identified avian viruses. PMID:25790045

  7. Generation of animation sequences of three dimensional models

    NASA Technical Reports Server (NTRS)

    Poi, Sharon (Inventor); Bell, Brad N. (Inventor)

    1990-01-01

    The invention is directed toward a method and apparatus for generating an animated sequence through the movement of three-dimensional graphical models. A plurality of pre-defined graphical models are stored and manipulated in response to interactive commands or by means of a pre-defined command file. The models may be combined as part of a hierarchical structure to represent physical systems without need to create a separate model which represents the combined system. System motion is simulated through the introduction of translation, rotation and scaling parameters upon a model within the system. The motion is then transmitted down through the system hierarchy of models in accordance with hierarchical definitions and joint movement limitations. The present invention also calls for a method of editing hierarchical structure in response to interactive commands or a command file such that a model may be included, deleted, copied or moved within multiple system model hierarchies. The present invention also calls for the definition of multiple viewpoints or cameras which may exist as part of a system hierarchy or as an independent camera. The simulated movement of the models and systems is graphically displayed on a monitor and a frame is recorded by means of a video controller. Multiple movement and hierarchy manipulations are then recorded as a sequence of frames which may be played back as an animation sequence on a video cassette recorder.
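
    The idea of motion being "transmitted down through the system hierarchy" can be sketched as composing each model's local transform with its parent's. The sketch below is not the patented method: it is an assumption-laden illustration that uses 2D homogeneous matrices for brevity (the invention concerns 3D models), omits joint movement limitations, and uses hypothetical names (`Model`, `local_matrix`, `world_matrices`).

```python
import math


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def local_matrix(tx=0.0, ty=0.0, angle_deg=0.0, scale=1.0):
    """Build a 2D homogeneous matrix: translate * rotate * uniform scale."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [[scale * c, -scale * s, tx],
            [scale * s, scale * c, ty],
            [0.0, 0.0, 1.0]]


class Model:
    """A node in a model hierarchy with a transform relative to its parent."""

    def __init__(self, name, local=None):
        self.name = name
        self.local = local or local_matrix()
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

    def world_matrices(self, parent=None, out=None):
        """Propagate transforms from this node down to all descendants."""
        out = {} if out is None else out
        world = self.local if parent is None else mat_mul(parent, self.local)
        out[self.name] = world
        for child in self.children:
            child.world_matrices(world, out)
        return out


def origin(matrix):
    """World-space position of a model's local origin."""
    return (matrix[0][2], matrix[1][2])


# A two-level system: the hand is defined relative to the arm.
arm = Model("arm", local_matrix(tx=1.0))
hand = arm.add(Model("hand", local_matrix(tx=2.0)))
before = origin(arm.world_matrices()["hand"])

# Rotating only the parent moves the child as well.
arm.local = local_matrix(tx=1.0, angle_deg=90.0)
after = origin(arm.world_matrices()["hand"])
print(before, after)
```

    Because the child stores only its transform relative to the parent, no separate model of the combined system is needed; recomputing the world matrices for each frame yields the sequence of frames to record.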

  8. Generation and functional assessment of 3D multicellular spheroids in droplet based microfluidics platform.

    PubMed

    Sabhachandani, P; Motwani, V; Cohen, N; Sarkar, S; Torchilin, V; Konry, T

    2016-02-01

    Here we describe a robust microfluidic technique to generate and analyze 3D tumor spheroids, which resemble the tumor microenvironment and can be used as a more effective preclinical drug testing and screening model. Monodisperse cell-laden alginate droplets were generated in polydimethylsiloxane (PDMS) microfluidic devices that combine T-junction droplet generation and external gelation for spheroid formation. The proposed approach has the capability to incorporate multiple cell types. For the purposes of our study, we generated spheroids with breast cancer cell lines (MCF-7 drug sensitive and resistant) and co-culture spheroids of MCF-7 together with a fibroblast cell line (HS-5). The device has the capability to house 1000 spheroids on chip for drug screening and other functional analysis. Cellular viability of spheroids in the array part of the device was maintained for two weeks by continuous perfusion of complete media into the device. The functional performance of our 3D tumor models and the dose-dependent response to the standard chemotherapeutic drug doxorubicin (Dox) and to the standard drug combination of Dox and paclitaxel (PCT) were analyzed on our chip-based platform. Altogether, our work provides a simple and novel in vitro platform to generate, image and analyze uniform, 3D monodisperse alginate hydrogel tumors for various omic studies and therapeutic efficiency screening, an important translational step before in vivo studies. PMID:26686985

  9. Generation and application of a standardized load-time history to tubular T-joints in offshore platforms

    NASA Astrophysics Data System (ADS)

    Li, Shan-shan; Cui, Wei-cheng

    2015-10-01

    Marine structures are mostly made of metals and always experience complex random loading during their service periods. The fatigue crack growth behaviors of metal materials have been proved from laboratory tests to be sensitive to the loading sequence encountered. In order to take account of the loading sequence effect, fatigue life prediction should be based on fatigue crack propagation (FCP) theory rather than the currently used cumulative fatigue damage (CFD) theory. A unified fatigue life prediction (UFLP) method for marine structures has been proposed by the authors' group. In order to apply the UFLP method for newly designed structures, authorities such as the classification societies should provide a standardized load-time history (SLH) such as the TWIST and FALSTAFF sequences for transport and fighter aircraft. This paper mainly aims at proposing a procedure to generate the SLHs for marine structures based on a short-term loading sample and to provide an illustration on how to use the presented SLH to a typical tubular T-joint in an offshore platform based on the UFLP method.

  10. Application of next-generation sequencing technologies in virology.

    PubMed

    Radford, Alan D; Chapman, David; Dixon, Linda; Chantrey, Julian; Darby, Alistair C; Hall, Neil

    2012-09-01

    The progress of science is punctuated by the advent of revolutionary technologies that provide new ways and scales to formulate scientific questions and advance knowledge. Following on from electron microscopy, cell culture and PCR, next-generation sequencing is one of these methodologies that is now changing the way that we understand viruses, particularly in the areas of genome sequencing, evolution, ecology, discovery and transcriptomics. Possibilities for these methodologies are only limited by our scientific imagination and, to some extent, by their cost, which has restricted their use to relatively small numbers of samples. Challenges remain, including the storage and analysis of the large amounts of data generated. As the chemistries employed mature, costs will decrease. In addition, improved methods for analysis will become available, opening yet further applications in virology including routine diagnostic work on individuals, and new understanding of the interaction between viral and host transcriptomes. An exciting era of viral exploration has begun, and will set us new challenges to understand the role of newly discovered viral diversity in both disease and health.

  11. [Next generation sequencing for the diagnostics and epidemiology of tuberculosis].

    PubMed

    Comas, Iñaki; Gil, Ana

    2016-07-01

    Tuberculosis (TB) has overtaken HIV (human immunodeficiency virus) and malaria as the leading cause of death by an infectious disease worldwide. TB incidence is falling by a modest 2% of cases per year; at that rate, it would take 200 years to eradicate the disease. Part of the problem is that TB control tools are decades old and can no longer help accelerate the eradication of TB. New diagnostics, treatments and vaccines are urgently needed. Next generation sequencing has the potential to become one of these new tools. Genomic characterization of TB isolates is already showing its potential for epidemiology and diagnostics, particularly for identifying drug resistance mutations. However, the experimental and bioinformatics skills needed are still far from being standardized and are not easy to incorporate into the routine of clinical laboratories. In this review we describe current next generation sequencing approaches applied to the Mycobacterium tuberculosis complex, their contribution to the diagnostics and epidemiology of the disease, and the efforts being undertaken to make the technology accessible to public health and clinical microbiology laboratories.

  12. Next-Generation Sequencing and Genome Editing in Plant Virology.

    PubMed

    Hadidi, Ahmed; Flores, Ricardo; Candresse, Thierry; Barba, Marina

    2016-01-01

    Next-generation sequencing (NGS) has been applied to plant virology since 2009. NGS provides highly efficient, rapid, low-cost, high-throughput DNA or RNA sequencing of the genomes of plant viruses and viroids and of the specific small RNAs generated during the infection process. These small RNAs, which frequently cover the whole genome of the infectious agent, are 21-24 nt long and are known as vsRNAs for viruses and vd-sRNAs for viroids. NGS has been used in a number of studies in plant virology including, but not limited to, discovery of novel viruses and viroids as well as detection and identification of those pathogens already known, analysis of genome diversity and evolution, and study of pathogen epidemiology. The genome-editing method based on the clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system has recently been used successfully to engineer resistance to DNA geminiviruses (family Geminiviridae) by targeting different viral genome sequences in infected Nicotiana benthamiana or Arabidopsis plants. The DNA viruses targeted include tomato yellow leaf curl virus and merremia mosaic virus (begomovirus); beet curly top virus and beet severe curly top virus (curtovirus); and bean yellow dwarf virus (mastrevirus). The technique has also been used against the RNA viruses zucchini yellow mosaic virus, papaya ringspot virus and turnip mosaic virus (potyvirus) and cucumber vein yellowing virus (ipomovirus, family Potyviridae) by targeting the translation initiation genes eIF4E in cucumber or Arabidopsis plants. From these recent advances of major importance, it is expected that NGS and CRISPR-Cas technologies will play a significant role in the very near future in advancing the field of plant virology and connecting it with other related fields of biology. PMID:27617007

  13. Next-Generation Sequencing and Genome Editing in Plant Virology

    PubMed Central

    Hadidi, Ahmed; Flores, Ricardo; Candresse, Thierry; Barba, Marina

    2016-01-01

    Next-generation sequencing (NGS) has been applied to plant virology since 2009. NGS provides highly efficient, rapid, low-cost, high-throughput DNA or RNA sequencing of the genomes of plant viruses and viroids and of the specific small RNAs generated during the infection process. These small RNAs, which frequently cover the whole genome of the infectious agent, are 21–24 nt long and are known as vsRNAs for viruses and vd-sRNAs for viroids. NGS has been used in a number of studies in plant virology including, but not limited to, discovery of novel viruses and viroids as well as detection and identification of those pathogens already known, analysis of genome diversity and evolution, and study of pathogen epidemiology. The genome-editing method based on the clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system has recently been used successfully to engineer resistance to DNA geminiviruses (family Geminiviridae) by targeting different viral genome sequences in infected Nicotiana benthamiana or Arabidopsis plants. The DNA viruses targeted include tomato yellow leaf curl virus and merremia mosaic virus (begomovirus); beet curly top virus and beet severe curly top virus (curtovirus); and bean yellow dwarf virus (mastrevirus). The technique has also been used against the RNA viruses zucchini yellow mosaic virus, papaya ringspot virus and turnip mosaic virus (potyvirus) and cucumber vein yellowing virus (ipomovirus, family Potyviridae) by targeting the translation initiation genes eIF4E in cucumber or Arabidopsis plants. From these recent advances of major importance, it is expected that NGS and CRISPR-Cas technologies will play a significant role in the very near future in advancing the field of plant virology and connecting it with other related fields of biology. PMID:27617007

  16. Metre-scale cyclicity in Middle Eocene platform carbonates in northern Egypt: Implications for facies development and sequence stratigraphy

    NASA Astrophysics Data System (ADS)

    Tawfik, Mohamed; El-Sorogy, Abdelbaset; Moussa, Mahmoud

    2016-07-01

    The shallow-water carbonates of the Middle Eocene in northern Egypt represent a Tethyan reef-rimmed carbonate platform with bedded inner-platform facies. Based on extensive micro- and biofacies documentation, five lithofacies associations were defined and their respective depositional environments were interpreted. Investigated sections were subdivided into three third-order sequences, named S1, S2 and S3. Sequence S1 is interpreted to correspond to the Lutetian, S2 corresponds to the Late Lutetian and Early Bartonian, and S3 represents the Late Bartonian. Each of the three sequences was further subdivided into fourth-order cycle sets and fifth-order cycles. The complete hierarchy of cycles can be correlated along 190 km across the study area, highlighting a general "layer-cake" stratigraphic architecture. The documentation of the studied outcrops may contribute to a better regional understanding of the Middle Eocene formations in northern Egypt and to Tethyan pericratonic carbonate models in general.

  17. Addressing Benefits, Risks and Consent in Next Generation Sequencing Studies

    PubMed Central

    Meller, R

    2016-01-01

    The sequencing of the human genome and technological advances in DNA sequencing have led to a revolution with respect to DNA sequencing and its potential to diagnose genetic disorders. However, requests for open access to genomic data must be balanced against the guiding principles of the Common Rule for human subject research. Unfortunately, the risks to patients involved in genomic studies are still evolving and as such may not be clear to learned and well-intentioned scientists. Central to this issue are the strategies that enable human participants in such studies to remain anonymous, or de-identified. The wealth of genomic data on the Internet in genomic data repositories and other databases has enabled de-identification to be broken and research subjects to be identified. The security of de-identification neglects the fact that DNA itself is an identifying element. Therefore, it is questionable whether data security standards can ever truly protect the identity of a patient, under the current conditions or in the future. As Big Data methodologies advance, additional sources of data may enable the re-identification of patients enrolled in next-generation sequencing (NGS) studies. As such, it is time to re-evaluate the risks of sharing genomic data and establish new guidelines for good practices. In this commentary, I address the challenges facing federally funded investigators who need to strike a balance between compliance with federal (US) rules for human subjects and the recent requirement for open access/sharing of data from National Institutes of Health (NIH)-funded studies involving human subjects. PMID:27375922

  18. Accessing complex crop genomes with next-generation sequencing.

    PubMed

    Edwards, David; Batley, Jacqueline; Snowdon, Rod J

    2013-01-01

    Many important crop species have genomes originating from ancestral or recent polyploidisation events. Multiple homoeologous gene copies, chromosomal rearrangements and amplification of repetitive DNA within large and complex crop genomes can considerably complicate genome analysis and gene discovery by conventional, forward genetics approaches. On the other hand, ongoing technological advances in molecular genetics and genomics today offer unprecedented opportunities to analyse and access even more recalcitrant genomes. In this review, we describe next-generation sequencing and data analysis techniques that vastly improve our ability to dissect and mine genomes for causal genes underlying key traits and allelic variation of interest to breeders. We focus primarily on wheat and oilseed rape, two leading examples of major polyploid crop genomes whose size or complexity present different, significant challenges. In both cases, the latest DNA sequencing technologies, applied using quite different approaches, have enabled considerable progress towards unravelling the respective genomes. Our ability to discover the extent and distribution of genetic diversity in crop gene pools, and its relationship to yield and quality-related traits, is swiftly gathering momentum as DNA sequencing and the bioinformatic tools to deal with growing quantities of genomic data continue to develop. In the coming decade, genomic and transcriptomic sequencing, discovery and high-throughput screening of single nucleotide polymorphisms, presence-absence variations and other structural chromosomal variants in diverse germplasm collections will give detailed insight into the origins, domestication and available trait-relevant variation of polyploid crops, in the process facilitating novel approaches and possibilities for genomics-assisted breeding.

  19. SMITH: a LIMS for handling next-generation sequencing workflows

    PubMed Central

    2014-01-01

    Background Life-science laboratories make increasing use of Next Generation Sequencing (NGS) for studying bio-macromolecules and their interactions. Array-based methods for measuring gene expression or protein-DNA interactions are being replaced by RNA-Seq and ChIP-Seq. Sequencing is generally performed by specialized facilities that have to keep track of sequencing requests, trace samples, ensure quality and make data available according to predefined privileges. An integrated tool helps to troubleshoot problems, to maintain a high quality standard, to reduce time and costs. Commercial and non-commercial tools called LIMS (Laboratory Information Management Systems) are available for this purpose. However, they often come at prohibitive cost and/or lack the flexibility and scalability needed to adjust seamlessly to the frequently changing protocols employed. In order to manage the flow of sequencing data produced at the Genomic Unit of the Italian Institute of Technology (IIT), we developed SMITH (Sequencing Machine Information Tracking and Handling). Methods SMITH is a web application with a MySQL server at the backend. Wet-lab scientists of the Centre for Genomic Science and database experts from the Politecnico of Milan in the context of a Genomic Data Model Project developed SMITH. The data base schema stores all the information of an NGS experiment, including the descriptions of all protocols and algorithms used in the process. Notably, an attribute-value table allows associating an unconstrained textual description to each sample and all the data produced afterwards. This method permits the creation of metadata that can be used to search the database for specific files as well as for statistical analyses. Results SMITH runs automatically and limits direct human interaction mainly to administrative tasks. SMITH data-delivery procedures were standardized making it easier for biologists and analysts to navigate the data. Automation also helps saving time. The
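
    The attribute-value table mentioned in the Methods can be illustrated with a small sketch. This is not SMITH's actual schema: the table and column names below are hypothetical, and Python's built-in sqlite3 module stands in for the MySQL backend the abstract describes, purely so the example is self-contained.

```python
import sqlite3

# In-memory database as a stand-in for a MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- One row per (sample, attribute) pair: free-text metadata
-- can be attached without changing the schema.
CREATE TABLE sample_attribute (
    sample_id INTEGER NOT NULL REFERENCES sample(id),
    attribute TEXT NOT NULL,
    value     TEXT NOT NULL
);
""")

conn.executemany("INSERT INTO sample (id, name) VALUES (?, ?)",
                 [(1, "S1"), (2, "S2")])
conn.executemany("INSERT INTO sample_attribute VALUES (?, ?, ?)", [
    (1, "application", "ChIP-Seq"),
    (1, "antibody", "H3K4me3"),
    (2, "application", "RNA-Seq"),
])


def find_samples(connection, attribute, value):
    """Return names of samples carrying the given attribute/value pair."""
    rows = connection.execute(
        "SELECT s.name FROM sample s "
        "JOIN sample_attribute a ON a.sample_id = s.id "
        "WHERE a.attribute = ? AND a.value = ? "
        "ORDER BY s.name",
        (attribute, value),
    )
    return [name for (name,) in rows]


chip_samples = find_samples(conn, "application", "ChIP-Seq")
print(chip_samples)
```

    The trade-off of this design is flexibility over strictness: new kinds of metadata need no schema migration, but values are untyped text, so consistency has to be enforced by the application.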

  20. Metagenome of microorganisms associated with the toxic Cyanobacteria Microcystis aeruginosa analyzed using the 454 sequencing platform

    NASA Astrophysics Data System (ADS)

    Li, Nan; Zhang, Lei; Li, Fuchao; Wang, Yuezhu; Zhu, Yongqiang; Kang, Hui; Wang, Shengyue; Qin, Song

    2011-05-01

    In this study, the 454 pyrosequencing technology was used to analyze the DNA of the Microcystis aeruginosa symbiosis system from cyanobacterial algal blooms in Taihu Lake, China. We generated 183 228 reads with an average length of 248 bp. Running the 454 assembly algorithm over our sequences yielded 22 239 significant contigs. After excluding the M. aeruginosa sequences, we obtained 1 322 assembled contigs longer than 1 000 bp. Taxonomic analysis indicated that four kingdoms were represented in the community: Archaea (n = 9; 0.01%), Bacteria (n = 98 921; 99.6%), Eukaryota (n = 373; 0.37%), and Viruses (n = 18; 0.02%). The bacterial sequences were predominantly Alphaproteobacteria (n = 41 805; 83.3%), Betaproteobacteria (n = 5 254; 10.5%) and Gammaproteobacteria (n = 1 180; 2.4%). Gene annotations and assignment of COG (clusters of orthologous groups) functional categories indicate that a large number of the predicted genes are involved in metabolic, genetic, and environmental information processes. Our results demonstrate the extraordinary diversity of a microbial community in an ectosymbiotic system and further establish the tremendous utility of pyrosequencing.
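
    The kingdom-level shares can be sanity-checked by recomputing them from the read counts quoted in the abstract. The sketch below assumes the four counts are the complete set of classified reads; the variable names are illustrative.

```python
# Read counts per kingdom, as quoted in the abstract.
counts = {
    "Archaea": 9,
    "Bacteria": 98921,
    "Eukaryota": 373,
    "Viruses": 18,
}

# Assumption: these four groups account for all classified reads.
total = sum(counts.values())
shares = {kingdom: 100.0 * n / total for kingdom, n in counts.items()}

for kingdom, share in shares.items():
    print(f"{kingdom}: n = {counts[kingdom]}, share = {share:.2f}%")
```

    Under that assumption the shares must sum to 100%, which requires the Eukaryota share to be well below 1%.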

  1. ngs_backbone: a pipeline for read cleaning, mapping and SNP calling using Next Generation Sequence

    PubMed Central

    2011-01-01

    Background The possibilities offered by next generation sequencing (NGS) platforms are revolutionizing biotechnological laboratories. Moreover, the combination of NGS sequencing and affordable high-throughput genotyping technologies is facilitating the rapid discovery and use of SNPs in non-model species. However, this abundance of sequences and polymorphisms creates new software needs. To fulfill these needs, we have developed a powerful, yet easy-to-use application. Results The ngs_backbone software is a parallel pipeline capable of analyzing Sanger, 454, Illumina and SOLiD (Sequencing by Oligonucleotide Ligation and Detection) sequence reads. Its main supported analyses are: read cleaning, transcriptome assembly and annotation, read mapping and single nucleotide polymorphism (SNP) calling and selection. In order to build a truly useful tool, the software development was paired with a laboratory experiment. All public tomato Sanger EST reads plus 14.2 million Illumina reads were employed to test the tool and predict polymorphism in tomato. The cleaned reads were mapped to the SGN tomato transcriptome obtaining a coverage of 4.2 for Sanger and 8.5 for Illumina. 23,360 single nucleotide variations (SNVs) were predicted. A total of 76 SNVs were experimentally validated, and 85% were found to be real. Conclusions ngs_backbone is a new software package capable of analyzing sequences produced by NGS technologies and predicting SNVs with great accuracy. In our tomato example, we created a highly polymorphic collection of SNVs that will be a useful resource for tomato researchers and breeders. The software developed along with its documentation is freely available under the AGPL license and can be downloaded from http://bioinf.comav.upv.es/ngs_backbone/ or http://github.com/JoseBlanca/franklin. PMID:21635747

  2. Estimates of acoustic noise generated by supply vessels working with oil-drilling platforms

    NASA Astrophysics Data System (ADS)

    Rutenko, A. N.; Ushchipovskii, V. G.

    2015-09-01

    The paper presents results on spatial measurements of acoustic noise generated by two types of tugs during their movement near the Molikpaq platform and in a dynamic positioning mode during operation with the PA-B platform. Based on the results of these measurements with the aid of simulation and preliminary research of the loss function conducted on acoustic profiles spanning from the platforms to the nearshore Piltun gray whale summer—fall feeding area, the spectra of equivalent point sources are constructed, which make it possible to construct the 1/3-octave spectra of anthropogenic noise at any point of the western profile and estimate the value of their level in a given frequency band with an accuracy of up to 2 dB. Field measurements have shown that in the dynamic positioning mode, the tugs generate 10 dB more noise than during movement; in fact, a diesel electric tug in both modes produced approximately 5 dB less noise than a diesel tug.

  3. Whole dystrophin gene analysis by next-generation sequencing: a comprehensive genetic diagnosis of Duchenne and Becker muscular dystrophy.

    PubMed

    Wang, Yan; Yang, Yao; Liu, Jing; Chen, Xiao-Chun; Liu, Xin; Wang, Chun-Zhi; He, Xi-Yu

    2014-10-01

    Duchenne/Becker muscular dystrophies are the most frequent inherited neuromuscular diseases caused by mutations of the dystrophin gene. However, approximately 30% of patients with the disease do not receive a molecular diagnosis because of the complex mutational spectrum and the large size of the gene. The introduction and use of next-generation sequencing have advanced clinical genetic research and might be a suitable method for the detection of various types of mutations in the dystrophin gene. To identify the mutational spectrum using a single platform, whole dystrophin gene sequencing was performed using next-generation sequencing. The entire dystrophin gene, including all exons, introns and promoter regions, was target enriched using a DMD whole gene enrichment kit. The enrichment libraries were sequenced on an Illumina HiSeq 2000 sequencer using paired read 100 bp sequencing. We studied 26 patients: 21 had known large deletion/duplications and 5 did not have detectable large deletion/duplications by multiplex ligation-dependent probe amplification technology (MLPA). We applied whole dystrophin gene analysis by next-generation sequencing to the five patients who did not have detectable large deletion/duplications and to five randomly chosen patients from the 21 who did have large deletion/duplications. The sequencing data covered almost 100% of the exonic region of the dystrophin gene by ≥10 reads with a mean read depth of 147. Five small mutations were identified in the first five patients, of which four variants were unreported in the dmd.nl database. The deleted or duplicated exons and the breakpoints in the five large deletion/duplication patients were precisely identified. Whole dystrophin gene sequencing by next-generation sequencing may be a useful tool for the genetic diagnosis of Duchenne and Becker muscular dystrophies.
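
    The two coverage figures quoted in this record (fraction of positions covered by at least 10 reads, and mean read depth) can be computed as in the sketch below. The function name and the toy depth values are assumptions for illustration; the record does not say which software the authors used.

```python
def depth_summary(depths, min_depth=10):
    """Return (mean depth, fraction of positions with depth >= min_depth).

    `depths` holds one read-depth value per reference position.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    covered = sum(1 for d in depths if d >= min_depth)
    return mean_depth, covered / len(depths)

# Toy per-position depths, not data from the study.
mean_depth, fraction_covered = depth_summary([0, 10, 20, 150])
print(mean_depth, fraction_covered)
```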

  4. Characterization of sequence-specific errors in various next-generation sequencing systems.

    PubMed

    Shin, Sunguk; Park, Joonhong

    2016-03-01

    Next-generation sequencing (NGS) is a popular method for assessing the molecular diversity of microbial communities without cultivation, for identifying polymorphisms in populations, and for comparing genomes and transcriptomes. However, sequence-specific errors (SSEs) by NGS systems can result in genome mis-assembly, overestimation of diversity in microbial community analyses, and false polymorphism discovery. SSEs can be particularly problematic due to rich microbial biodiversity and genomes containing frequent repeats. In this study, SSEs in public data from all popular NGS systems were discovered using a Markov chain model and hotspots for sequence errors were identified. Deletion errors were frequently preceded by homopolymers in non-Illumina NGS systems, such as GS FLX+. Substitution errors were often related to high GC contents and long G/C homopolymers in Illumina sequencing systems such as HiSeq. After removal of long G/C homopolymers in HiSeq, the average lengths of contigs and average SNP quality increased. SSEs were selectively removed from our mock community data by quality filtering, and a bias against specific microbes was identified. Our findings provide a scientific basis for filtering poor-quality reads, correcting deletion errors, preventing genome mis-assembly, and accurately assessing microbial community compositions and polymorphisms.
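
    The filtering step described in this record, removing reads with long G/C homopolymers, might look like the following sketch. The threshold of six bases and the function names are assumptions; the abstract does not state which cut-off or tool was used, and this is not the Markov chain model the authors applied to discover the errors.

```python
import re

def long_gc_homopolymers(seq, min_len=6):
    """Return (start, base, length) for each run of G or C of at least min_len."""
    pattern = re.compile(r"G{%d,}|C{%d,}" % (min_len, min_len))
    return [
        (m.start(), m.group()[0], len(m.group()))
        for m in pattern.finditer(seq.upper())
    ]

def drop_reads_with_long_gc_runs(reads, min_len=6):
    """Keep only reads that contain no long G/C homopolymer."""
    return [r for r in reads if not long_gc_homopolymers(r, min_len)]

# Toy reads, not data from the study.
read_with_runs = "AT" + "G" * 7 + "TA" + "C" * 6 + "A"
print(long_gc_homopolymers(read_with_runs))
print(drop_reads_with_long_gc_runs([read_with_runs, "ATGCATGC"]))
```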

  5. Identification of conserved genomic regions and variation therein amongst Cetartiodactyla species using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background Next Generation Sequencing has created an opportunity to genetically characterize an individual both inexpensively and comprehensively. In earlier work produced in our collaboration [1], it was demonstrated that, for animals without a reference genome, their Next Generation Sequence data ...

  6. Deletion of the Pichia pastoris KU70 homologue facilitates platform strain generation for gene expression and synthetic biology.

    PubMed

    Näätsaari, Laura; Mistlberger, Beate; Ruth, Claudia; Hajek, Tanja; Hartner, Franz S; Glieder, Anton

    2012-01-01

    Targeted gene replacement to generate knock-outs and knock-ins is a commonly used method to study the function of unknown genes. In the methylotrophic yeast Pichia pastoris, the importance of specific gene targeting has increased since the genome sequencing projects of the most commonly used strains have been accomplished, but rapid progress in the field has been impeded by inefficient mechanisms for accurate integration. To improve gene targeting efficiency in P. pastoris, we identified and deleted the P. pastoris KU70 homologue. We observed a substantial increase in the targeting efficiency using the two commonly known and used integration loci HIS4 and ADE1, reaching over 90% targeting efficiencies with only 250-bp flanking homologous DNA. Although the ku70 deletion strain was noted to be more sensitive to UV rays than the corresponding wild-type strain, no lethality, severe growth retardation or loss of gene copy numbers could be detected during repetitive rounds of cultivation and induction of heterologous protein production. Furthermore, we demonstrated the use of the ku70 deletion strain for fast and simple screening of genes in the search of new auxotrophic markers by targeting dihydroxyacetone synthase and glycerol kinase genes. Precise knock-out strains for the well-known P. pastoris AOX1, ARG4 and HIS4 genes and a whole series of expression vectors were generated based on the wild-type platform strain, providing a broad spectrum of precise tools for both intracellular and secreted production of heterologous proteins utilizing various selection markers and integration strategies for targeted or random integration of single and multiple genes. The simplicity of targeted integration in the ku70 deletion strain will further support protein production strain generation and synthetic biology using P. pastoris strains as platform hosts. PMID:22768112

  7. Molecular diagnostics of a single drug-resistant multiple myeloma case using targeted next-generation sequencing

    PubMed Central

    Ikeda, Hiroshi; Ishiguro, Kazuya; Igarashi, Tetsuyuki; Aoki, Yuka; Hayashi, Toshiaki; Ishida, Tadao; Sasaki, Yasushi; Tokino, Takashi; Shinomura, Yasuhisa

    2015-01-01

    A 69-year-old man was diagnosed with IgG λ-type multiple myeloma (MM), Stage II in October 2010. He was treated with one cycle of high-dose dexamethasone. After three cycles of bortezomib, the patient exhibited slow elevations in the free light-chain levels and developed a significant new increase of serum M protein. Bone marrow cytogenetic analysis revealed a complex karyotype characteristic of malignant plasma cells. To better understand the molecular pathogenesis of this patient, we sequenced for mutations in the entire coding regions of 409 cancer-related genes using a semiconductor-based sequencing platform. Sequencing analysis revealed eight nonsynonymous somatic mutations in addition to several copy number variants, including CCND1 and RB1. These alterations may play roles in the pathobiology of this disease. This targeted next-generation sequencing can allow for the prediction of drug resistance and facilitate improvements in the treatment of MM patients. PMID:26491355

  8. Protocol: a simple method for extracting next-generation sequencing quality genomic DNA from recalcitrant plant species.

    PubMed

    Healey, Adam; Furtado, Agnelo; Cooper, Tal; Henry, Robert J

    2014-01-01

    Next-generation sequencing technologies rely on high quality DNA that is suitable for library preparation followed by sequencing. Some plant species store large amounts of phenolics and polysaccharides within their leaf tissue, making genomic DNA extraction difficult. While many DNA extraction methods exist that contend with the presence of phenolics and polysaccharides, these methods rely on long incubations, multiple precipitations or commercially available kits to produce high molecular weight and contaminant-free DNA. In this protocol, we describe simple modifications to the established CTAB-based extraction method that allow for reliable isolation of high molecular weight genomic DNA from the difficult-to-extract plant species Corymbia (a eucalypt) and Coffea (coffee). The simplified protocol does not require multiple clean-up steps or commercial kits, and the isolated DNA passed stringent quality control standards for whole genome sequencing on Illumina HiSeq and TruSeq sequencing platforms.

  9. Low diversity in the mitogenome of sperm whales revealed by next-generation sequencing.

    PubMed

    Alexander, Alana; Steel, Debbie; Slikas, Beth; Hoekzema, Kendra; Carraher, Colm; Parks, Matthew; Cronn, Richard; Baker, C Scott

    2013-01-01

    Large population sizes and global distributions generally associate with high mitochondrial DNA control region (CR) diversity. The sperm whale (Physeter macrocephalus) is an exception, showing low CR diversity relative to other cetaceans; however, diversity levels throughout the remainder of the sperm whale mitogenome are unknown. We sequenced 20 mitogenomes from 17 sperm whales representative of worldwide diversity using Next Generation Sequencing (NGS) technologies (Illumina GAIIx, Roche 454 GS Junior). Resequencing of three individuals with both NGS platforms and partial Sanger sequencing showed low discrepancy rates (454-Illumina: 0.0071%; Sanger-Illumina: 0.0034%; and Sanger-454: 0.0023%) confirming suitability of both NGS platforms for investigating low mitogenomic diversity. Using the 17 sperm whale mitogenomes in a phylogenetic reconstruction with 41 other species, including 11 new dolphin mitogenomes, we tested two hypotheses for the low CR diversity. First, the hypothesis that CR-specific constraints have reduced diversity solely in the CR was rejected as diversity was low throughout the mitogenome, not just in the CR (overall diversity π = 0.096%; protein-coding 3rd codon = 0.22%; CR = 0.35%), and CR phylogenetic signal was congruent with protein-coding regions. Second, the hypothesis that slow substitution rates reduced diversity throughout the sperm whale mitogenome was rejected as sperm whales had significantly higher rates of CR evolution and no evidence of slow coding region evolution relative to other cetaceans. The estimated time to most recent common ancestor for sperm whale mitogenomes was 72,800 to 137,400 years ago (95% highest probability density interval), consistent with previous hypotheses of a bottleneck or selective sweep as likely causes of low mitogenome diversity.
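
    The diversity values quoted in this record are nucleotide diversities (π), which can be read as the average proportion of sites that differ between two sequences in the sample. A minimal sketch follows, assuming pre-aligned sequences of equal length and counting every mismatching character, including gaps, as a difference. The abstract does not specify how gaps or ambiguous bases were handled, and the function name is illustrative.

```python
from itertools import combinations

def nucleotide_diversity(aligned):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(aligned, 2))
    # Count mismatching positions in every pair of sequences.
    differences = sum(
        sum(1 for a, b in zip(x, y) if a != b) for x, y in pairs
    )
    return differences / (len(pairs) * length)

# Toy alignment, not data from the study.
print(nucleotide_diversity(["AAAA", "AAAT", "AATT"]))
```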

  10. Low Diversity in the Mitogenome of Sperm Whales Revealed by Next-Generation Sequencing

    PubMed Central

    Alexander, Alana; Steel, Debbie; Slikas, Beth; Hoekzema, Kendra; Carraher, Colm; Parks, Matthew; Cronn, Richard; Baker, C. Scott

    2013-01-01

    Large population sizes and global distributions generally associate with high mitochondrial DNA control region (CR) diversity. The sperm whale (Physeter macrocephalus) is an exception, showing low CR diversity relative to other cetaceans; however, diversity levels throughout the remainder of the sperm whale mitogenome are unknown. We sequenced 20 mitogenomes from 17 sperm whales representative of worldwide diversity using Next Generation Sequencing (NGS) technologies (Illumina GAIIx, Roche 454 GS Junior). Resequencing of three individuals with both NGS platforms and partial Sanger sequencing showed low discrepancy rates (454-Illumina: 0.0071%; Sanger-Illumina: 0.0034%; and Sanger-454: 0.0023%) confirming suitability of both NGS platforms for investigating low mitogenomic diversity. Using the 17 sperm whale mitogenomes in a phylogenetic reconstruction with 41 other species, including 11 new dolphin mitogenomes, we tested two hypotheses for the low CR diversity. First, the hypothesis that CR-specific constraints have reduced diversity solely in the CR was rejected as diversity was low throughout the mitogenome, not just in the CR (overall diversity π = 0.096%; protein-coding 3rd codon = 0.22%; CR = 0.35%), and CR phylogenetic signal was congruent with protein-coding regions. Second, the hypothesis that slow substitution rates reduced diversity throughout the sperm whale mitogenome was rejected as sperm whales had significantly higher rates of CR evolution and no evidence of slow coding region evolution relative to other cetaceans. The estimated time to most recent common ancestor for sperm whale mitogenomes was 72,800 to 137,400 years ago (95% highest probability density interval), consistent with previous hypotheses of a bottleneck or selective sweep as likely causes of low mitogenome diversity. PMID:23254394

  11. Controls on facies and sequence stratigraphy of an upper Miocene carbonate ramp and platform, Melilla basin, NE Morocco

    USGS Publications Warehouse

    Cunningham, K.J.; Collins, Luke S.

    2002-01-01

    Upwelling of cool seawater, paleoceanographic circulation, paleoclimate, local tectonics and relative sea-level change controlled the lithofacies and sequence stratigraphy of a carbonate ramp and overlying platform that are part of a temporally well constrained carbonate complex in the Melilla basin, northeastern Morocco. At Melilla, from oldest to youngest, a third-order depositional sequence within the carbonate complex contains (1) a retrogradational, transgressive, warm temperate-type rhodalgal ramp; (2) an early highstand, progradational, bioclastic platform composed mainly of a temperate-type, bivalve-rich molechfor facies; and (3) late highstand, progradational to downstepping, subtropical/tropical-type chlorozoan fringing Porites reefs. The change from rhodalgal ramp to molechfor platform occurred at 7.0±0.14 Ma near the Tortonian/Messinian boundary. During a late stage in the development of the bioclastic platform a transition from temperate-type molechfor facies to subtropical/tropical-type chlorozoan facies occurred and is bracketed by chron 3An.2n (~6.3-6.6 Ma). Comparison to a well-dated carbonate complex in southeastern Spain at Cabo de Gata suggests that upwelling of cool seawater influenced production of temperate-type limestone within the ramp and platform at Melilla during postulated late Tortonian-early Messinian subtropical/tropical paleoclimatic conditions in the western Paleo-Mediterranean region. The upwelling of cool seawater across the bioclastic platform at Melilla could be related to the beginning of 'siphoning' of deep, cold Atlantic waters into the Paleo-Mediterranean Sea at 7.17 Ma. The facies change within the bioclastic platform from molechfor to chlorozoan facies may be coincident with a reduction of the siphoning of Atlantic waters and the end of upwelling at Melilla during chron 3An.2n. The ramp contains one retrogradational parasequence and the bioclastic platform three progradational parasequences. Minor erosional surfaces

  12. Connectivity Mapping for Candidate Therapeutics Identification Using Next Generation Sequencing RNA-Seq Data

    PubMed Central

    McArt, Darragh G.; Dunne, Philip D.; Blayney, Jaine K.; Salto-Tellez, Manuel; Van Schaeybroeck, Sandra; Hamilton, Peter W.; Zhang, Shu-Dong

    2013-01-01

    The advent of next generation sequencing technologies (NGS) has expanded the area of genomic research, offering high coverage and increased sensitivity over older microarray platforms. Although the current cost of next generation sequencing still exceeds that of microarray approaches, the rapid advances in NGS will likely make it the platform of choice for future research in differential gene expression. Connectivity mapping is a procedure for examining the connections among diseases, genes and drugs by differential gene expression. It was initially based on microarray technology, with which a large collection of compound-induced reference gene expression profiles has been accumulated. In this work, we aim to test the feasibility of incorporating NGS RNA-Seq data into the current connectivity mapping framework by utilizing the microarray-based reference profiles and the construction of a differentially expressed gene signature from an NGS dataset. This would allow for the establishment of connections between the NGS gene signature and those microarray reference profiles, avoiding the cost of re-creating drug profiles with NGS technology. We examined the connectivity mapping approach on a publicly available NGS dataset with androgen stimulation of LNCaP cells in order to extract candidate compounds that could inhibit the proliferative phenotype of LNCaP cells and to elucidate their potential in a laboratory setting. In addition, we also analyzed an independent microarray dataset of similar experimental settings. We found a high level of concordance between the top compounds identified using the gene signatures from the two datasets. The nicotine derivative cotinine was returned as the top candidate among the overlapping compounds with potential to suppress this proliferative phenotype. Subsequent lab experiments validated this connectivity mapping hit, showing that cotinine inhibits cell proliferation in an androgen dependent manner. Thus the
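
    The general idea of scoring a gene signature against a reference profile can be sketched as below. This is a simplified rank-based score written for illustration, not necessarily the statistic used by the authors. The function name, the signed-rank representation of the reference profile and the toy gene names are all assumptions.

```python
def connection_score(signature, reference_ranks):
    """Score a gene signature against one compound's reference profile.

    signature: gene -> +1 (up-regulated) or -1 (down-regulated).
    reference_ranks: gene -> signed rank in the reference profile, where a
    larger magnitude means stronger regulation and the sign gives direction.
    The result is scaled to lie between -1 and 1.
    """
    raw = sum(
        sign * reference_ranks.get(gene, 0) for gene, sign in signature.items()
    )
    # Largest score any signature of this size could reach on this profile.
    top = sorted((abs(r) for r in reference_ranks.values()), reverse=True)
    max_score = sum(top[: len(signature)])
    return raw / max_score if max_score else 0.0

# Toy profile with hypothetical gene names.
reference = {"GENE_A": 4, "GENE_B": -3, "GENE_C": 2, "GENE_D": -1}
print(connection_score({"GENE_A": +1, "GENE_B": -1}, reference))
print(connection_score({"GENE_A": -1, "GENE_B": +1}, reference))
```

    Under this convention a strongly negative score marks a compound whose expression profile opposes the signature, which is the kind of candidate sought when the aim is to suppress a phenotype.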

  13. Genomic resources for a commercial flatfish, the Senegalese sole (Solea senegalensis): EST sequencing, oligo microarray design, and development of the Soleamold bioinformatic platform

    PubMed Central

    Cerdà, Joan; Mercadé, Jaume; Lozano, Juan José; Manchado, Manuel; Tingaud-Sequeira, Angèle; Astola, Antonio; Infante, Carlos; Halm, Silke; Viñas, Jordi; Castellana, Barbara; Asensio, Esther; Cañavate, Pedro; Martínez-Rodríguez, Gonzalo; Piferrer, Francesc; Planas, Josep V; Prat, Francesc; Yúfera, Manuel; Durany, Olga; Subirada, Francesc; Rosell, Elisabet; Maes, Tamara

    2008-01-01

    Background The Senegalese sole, Solea senegalensis, is a highly prized flatfish of growing commercial interest for aquaculture in Southern Europe. However, despite the industrial production of Senegalese sole being hampered primarily by lack of information on the physiological mechanisms involved in reproduction, growth and immunity, very limited genomic information is available on this species. Results Sequencing of a S. senegalensis multi-tissue normalized cDNA library, from adult tissues (brain, stomach, intestine, liver, ovary, and testis), larval stages (pre-metamorphosis, metamorphosis), juvenile stages (post-metamorphosis, abnormal fish), and undifferentiated gonads, generated 10,185 expressed sequence tags (ESTs). Clones were sequenced from the 3'-end to identify isoform specific sequences. Assembly of the entire EST collection into contigs gave 5,208 unique sequences of which 1,769 (34%) had matches in GenBank, thus showing a low level of redundancy. The sequence of the 5,208 unigenes was used to design and validate an oligonucleotide microarray representing 5,087 unique Senegalese sole transcripts. Finally, a novel interactive bioinformatic platform, Soleamold, was developed for the Senegalese sole EST collection as well as microarray and ISH data. Conclusion New genomic resources have been developed for S. senegalensis, an economically important fish in aquaculture, which include a collection of expressed genes, an oligonucleotide microarray, and a publicly available bioinformatic platform that can be used to study gene expression in this species. These resources will help elucidate transcriptional regulation in wild and captive Senegalese sole for optimization of its production under intensive culture conditions. PMID:18973667

  14. Challenges and opportunities in estimating viral genetic diversity from next-generation sequencing data

    PubMed Central

    Beerenwinkel, Niko; Günthard, Huldrych F.; Roth, Volker; Metzner, Karin J.

    2012-01-01

    Many viruses, including the clinically relevant RNA viruses HIV (human immunodeficiency virus) and HCV (hepatitis C virus), exist in large populations and display high genetic heterogeneity within and between infected hosts. Assessing intra-patient viral genetic diversity is essential for understanding the evolutionary dynamics of viruses, for designing effective vaccines, and for the success of antiviral therapy. Next-generation sequencing (NGS) technologies allow the rapid and cost-effective acquisition of thousands to millions of short DNA sequences from a single sample. However, this approach entails several challenges in experimental design and computational data analysis. Here, we review the entire process of inferring viral diversity from sample collection to computing measures of genetic diversity. We discuss sample preparation, including reverse transcription and amplification, and the effect of experimental conditions on diversity estimates due to in vitro base substitutions, insertions, deletions, and recombination. The use of different NGS platforms and their sequencing error profiles are compared in the context of various applications of diversity estimation, ranging from the detection of single nucleotide variants (SNVs) to the reconstruction of whole-genome haplotypes. We describe the statistical and computational challenges arising from these technical artifacts, and we review existing approaches, including available software, for their solution. Finally, we discuss open problems, and highlight successful biomedical applications and potential future clinical use of NGS to estimate viral diversity. PMID:22973268

  15. Validation of next-generation sequencing for comprehensive chromosome screening of embryos.

    PubMed

    Kung, Allen; Munné, Santiago; Bankowski, Brandon; Coates, Alison; Wells, Dagan

    2015-12-01

    Massively parallel genome sequencing, also known as next-generation sequencing (NGS), is the latest approach for preimplantation genetic diagnosis. The purpose of this study was to determine whether NGS can accurately detect aneuploidy in human embryos. Low coverage genome sequencing was applied to trophectoderm biopsies of embryos at the blastocyst stage of development. Sensitivity and specificity of NGS were determined by comparison of results with a previously validated platform, array-comparative genomic hybridization (aCGH). In total, 156 samples (116 were blindly assessed) were tested: 40 samples were re-biopsies of blastocysts where the original biopsy specimen was previously tested using aCGH; four samples were re-biopsies of single blastomeres from embryos previously biopsied at the cleavage stage and tested using aCGH; 18 samples were single cells derived from well-characterized cell lines; 94 samples were whole-genome amplification products from embryo biopsies taken from previous preimplantation genetic screening cycles analysed using aCGH. Per embryo, NGS sensitivity was 100% (no false negatives) and specificity was 100% (no false positives). Per chromosome, NGS concordance was 99.20%. With more improvement, NGS will allow the simultaneous diagnosis of single gene disorders and aneuploidy, and may have the potential to provide more detailed insight into other aspects of embryo viability. PMID:26520420
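
    Sensitivity and specificity against a reference platform, as reported in this record, follow from the counts of true and false calls. The sketch below uses made-up calls and an illustrative function name. It treats the reference calls as truth, which is how the abstract uses aCGH.

```python
def sensitivity_specificity(test_calls, reference_calls):
    """Return (sensitivity, specificity), treating reference_calls as truth.

    Each call is True for aneuploid and False for euploid.
    """
    if len(test_calls) != len(reference_calls):
        raise ValueError("call lists must have the same length")
    pairs = list(zip(test_calls, reference_calls))
    tp = sum(1 for t, r in pairs if t and r)          # true positives
    fn = sum(1 for t, r in pairs if not t and r)      # false negatives
    tn = sum(1 for t, r in pairs if not t and not r)  # true negatives
    fp = sum(1 for t, r in pairs if t and not r)      # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference call")
    return tp / (tp + fn), tn / (tn + fp)

# Toy calls, not data from the study.
ngs = [True, True, False, False, True]
acgh = [True, True, False, False, False]
print(sensitivity_specificity(ngs, acgh))
```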

  16. Challenges and opportunities in estimating viral genetic diversity from next-generation sequencing data.

    PubMed

    Beerenwinkel, Niko; Günthard, Huldrych F; Roth, Volker; Metzner, Karin J

    2012-01-01

    Many viruses, including the clinically relevant RNA viruses HIV (human immunodeficiency virus) and HCV (hepatitis C virus), exist in large populations and display high genetic heterogeneity within and between infected hosts. Assessing intra-patient viral genetic diversity is essential for understanding the evolutionary dynamics of viruses, for designing effective vaccines, and for the success of antiviral therapy. Next-generation sequencing (NGS) technologies allow the rapid and cost-effective acquisition of thousands to millions of short DNA sequences from a single sample. However, this approach entails several challenges in experimental design and computational data analysis. Here, we review the entire process of inferring viral diversity from sample collection to computing measures of genetic diversity. We discuss sample preparation, including reverse transcription and amplification, and the effect of experimental conditions on diversity estimates due to in vitro base substitutions, insertions, deletions, and recombination. The use of different NGS platforms and their sequencing error profiles are compared in the context of various applications of diversity estimation, ranging from the detection of single nucleotide variants (SNVs) to the reconstruction of whole-genome haplotypes. We describe the statistical and computational challenges arising from these technical artifacts, and we review existing approaches, including available software, for their solution. Finally, we discuss open problems, and highlight successful biomedical applications and potential future clinical use of NGS to estimate viral diversity.

  17. Mutation Detection in Patients with Retinal Dystrophies Using Targeted Next Generation Sequencing

    PubMed Central

    Weisschuh, Nicole; Mayer, Anja K.; Strom, Tim M.; Kohl, Susanne; Glöckle, Nicola; Schubach, Max; Andreasson, Sten; Bernd, Antje; Birch, David G.; Hamel, Christian P.; Heckenlively, John R.; Jacobson, Samuel G.; Kamme, Christina; Kellner, Ulrich; Kunstmann, Erdmute; Maffei, Pietro; Reiff, Charlotte M.; Rohrschneider, Klaus; Rosenberg, Thomas; Rudolph, Günther; Vámos, Rita; Varsányi, Balázs; Weleber, Richard G.; Wissinger, Bernd

    2016-01-01

    Retinal dystrophies (RD) constitute a group of blinding diseases that are characterized by clinical variability and pronounced genetic heterogeneity. The different nonsyndromic and syndromic forms of RD can be attributed to mutations in more than 200 genes. Consequently, next generation sequencing (NGS) technologies are among the most promising approaches to identify mutations in RD. We screened a large cohort of patients comprising 89 independent cases and families with various subforms of RD applying different NGS platforms. While mutation screening in 50 cases was performed using a RD gene capture panel, 47 cases were analyzed using whole exome sequencing. One family was analyzed using whole genome sequencing. A detection rate of 61% was achieved including mutations in 34 known and two novel RD genes. A total of 69 distinct mutations were identified, including 39 novel mutations. Notably, genetic findings in several families were not consistent with the initial clinical diagnosis. Clinical reassessment resulted in refinement of the clinical diagnosis in some of these families and confirmed the broad clinical spectrum associated with mutations in RD genes. PMID:26766544

  18. Next generation sequencing for characterizing biodiversity: promises and challenges.

    PubMed

    Pompanon, François; Samadi, Sarah

    2015-04-01

    DNA barcoding approaches are used to describe biodiversity by analysing specimens or environmental samples in taxonomic, phylogenetic and ecological studies. While sharing data among these disciplines would be highly valuable, this remains difficult because of contradictory requirements. The properties making a DNA barcode efficient for specimen identification or species delimitation are hardly reconcilable with those required for a powerful analysis of degraded DNA from environmental samples. The use of next generation sequencing methods opens up the way towards the development of new markers (e.g., multilocus barcodes) that would overcome such limitations. However, several challenges should be taken up for coordinating actions at the interface between taxonomy, ecology, molecular biology and bioinformatics in order to develop methods and protocols compatible with both taxonomic and ecological studies.

  20. Application of next-generation sequencing technology in forensic science.

    PubMed

    Yang, Yaran; Xie, Bingbing; Yan, Jiangwei

    2014-10-01

    Next-generation sequencing (NGS) technology, with its high-throughput capacity and low cost, has developed rapidly in recent years and become an important analytical tool for many genomics researchers. New opportunities in forensic research emerge by harnessing the power of NGS technology, which can be applied to simultaneously analyze multiple loci of forensic interest in different genetic contexts, such as autosomes, mitochondrial and sex chromosomes. Furthermore, NGS technology also has potential applications in many other aspects of research. These include DNA database construction, ancestry and phenotypic inference, monozygotic twin studies, body fluid and species identification, and forensic animal, plant and microbiological analyses. Here we review the application of NGS technology in the field of forensic science with the aim of providing a reference for future forensics studies and practice.

  1. Unrevealed mosaicism in the next-generation sequencing era.

    PubMed

    Gajecka, Marzena

    2016-04-01

    Mosaicism refers to the presence in an individual of normal and abnormal cells that are genotypically distinct and are derived from a single zygote. The incidence of mosaicism events in the human body is underestimated as the genotypes in the mosaic ratio, especially in the low-grade mosaicism, stay unrevealed. This review summarizes various research outcomes and diagnostic questions in relation to different types of mosaicism. The impact of both tested biological material and applied method on the mosaicism detection rate is especially highlighted. As next-generation sequencing technologies constitute a promising methodological solution in mosaicism detection in the coming years, revisions in current diagnostic protocols are necessary to increase the detection rate of the unrevealed mosaicism events. Since mosaicism identification is a complex process, numerous examples of multistep mosaicism investigations are presented and discussed. PMID:26481646

  2. Prenatal diagnosis of Gaucher disease using next-generation sequencing.

    PubMed

    Yoshida, Shinichiro; Kido, Jun; Matsumoto, Shirou; Momosaki, Ken; Mitsubuchi, Hiroshi; Shimazu, Tomoyuki; Sugawara, Keishin; Endo, Fumio; Nakamura, Kimitoshi

    2016-09-01

    In the prenatal diagnosis of Gaucher disease (GD), glucocerebrosidase (GBA) activity is measured with fetal cells, and gene analysis is performed when pathogenic mutations in GBA are identified in advance. Herein is described prenatal diagnosis in a family in which two children had GD. Although prior genetic information for this GD family was not obtained, next-generation sequencing (NGS) was carried out for this family because immediate prenatal diagnosis was necessary. Three mutations were identified in this GD family. The father had one mutation in intron 3 (IVS2 + 1), the mother had two mutations in exons 3 (I[-20]V) and 5 (M85T), and child 1 had all three of these mutations; child 3 had none of these mutations. On NGS the present fetus (child 3) was not a carrier of GD-related mutations. NGS may facilitate early detection and treatment before disease onset. PMID:27682613

  3. Application of Next-generation Sequencing Technology in Forensic Science

    PubMed Central

    Yang, Yaran; Xie, Bingbing; Yan, Jiangwei

    2014-01-01

    Next-generation sequencing (NGS) technology, with its high-throughput capacity and low cost, has developed rapidly in recent years and become an important analytical tool for many genomics researchers. New opportunities in forensic research emerge by harnessing the power of NGS technology, which can be applied to simultaneously analyze multiple loci of forensic interest in different genetic contexts, such as autosomes, mitochondrial and sex chromosomes. Furthermore, NGS technology also has potential applications in many other aspects of research. These include DNA database construction, ancestry and phenotypic inference, monozygotic twin studies, body fluid and species identification, and forensic animal, plant and microbiological analyses. Here we review the application of NGS technology in the field of forensic science with the aim of providing a reference for future forensics studies and practice. PMID:25462152

  5. Generative Technologies for Model Animation in the TopCased Platform

    NASA Astrophysics Data System (ADS)

    Crégut, Xavier; Combemale, Benoit; Pantel, Marc; Faudoux, Raphaël; Pavei, Jonatas

    Domain Specific Modeling Languages (DSML) are increasingly used to handle high level concepts, and thus bring complex software development under control. The increasingly frequent definition of new languages raises the problem of defining support tools such as editors, simulators, compilers, etc. In this paper we propose generative technologies that have been designed to ease the development of model animation tools inside the TopCased platform. These tools rely on the automatically generated graphical editors of TopCased and provide additional generators for building the graphical interface of a model animator. We also rely on an architecture for executable metamodels (i.e., the TopCased model execution metamodeling pattern) to bind the behavioral semantics of the modeling language. These tools were designed in a pragmatic manner by abstracting the various model animators that had been hand-coded in the TopCased project, and then validated by refactoring these animators.

  7. Applications of Next Generation Sequencing to Blood and Marrow Transplantation

    PubMed Central

    Chapman, Michael; Warren, Edus H.; Wu, Catherine J.

    2011-01-01

    Since the advent of next-generation sequencing (NGS) in 2005, there has been an explosion of published studies employing the technology to tackle previously intractable questions in many disparate biological fields. This has been coupled with technology development that has occurred at a remarkable pace. This review discusses the potential impact of this new technology on the field of blood and marrow stem cell transplantation. Hematologic malignancies have been at the forefront of those cancers whose genomes have been the subject of NGS. Hence, these studies have opened novel areas of biology that can be exploited for prognostic, diagnostic, and therapeutic means. Because of the unprecedented depth, resolution and accuracy achievable by NGS, this technology is well-suited for providing detailed information on the diversity of receptors that govern antigen recognition; this approach has the potential to contribute important insights into understanding the biologic effects of transplantation. Finally, the ability to perform comprehensive tumor sequencing provides a systematic approach to the discovery of genetic alterations that can encode peptides with restricted tumor expression, and hence serve as potential target antigens of GvL responses. Altogether, this increasingly affordable technology will undoubtedly impact the future practice and care of patients with hematologic malignancies. PMID:22226099

  8. Recommendations on e-infrastructures for next-generation sequencing.

    PubMed

    Spjuth, Ola; Bongcam-Rudloff, Erik; Dahlberg, Johan; Dahlö, Martin; Kallio, Aleksi; Pireddu, Luca; Vezzi, Francesco; Korpelainen, Eija

    2016-01-01

    With ever-increasing amounts of data being produced by next-generation sequencing (NGS) experiments, the requirements placed on supporting e-infrastructures have grown. In this work, we provide recommendations based on the collective experiences from participants in the EU COST Action SeqAhead for the tasks of data preprocessing, upstream processing, data delivery, and downstream analysis, as well as long-term storage and archiving. We cover demands on computational and storage resources, networks, software stacks, automation of analysis, education, and also discuss emerging trends in the field. E-infrastructures for NGS require substantial effort to set up and maintain over time, and with sequencing technologies and best practices for data analysis evolving rapidly it is important to prioritize both processing capacity and e-infrastructure flexibility when making strategic decisions to support the data analysis demands of tomorrow. Due to increasingly demanding technical requirements we recommend that e-infrastructure development and maintenance be handled by a professional service unit, be it internal or external to the organization, and emphasis should be placed on collaboration between researchers and IT professionals. PMID:27267963

  9. Next generation sequencing: Coping with rare genetic diseases in China

    PubMed Central

    Cram, David S; Zhou, Daixing

    2016-01-01

    With a population of 1.4 billion, China shares the largest burden of rare genetic diseases worldwide. Current estimates suggest that there are over ten million individuals afflicted with chromosome disease syndromes and well over one million individuals with monogenic disease. Care of patients with rare genetic diseases remains a largely unmet need due to the paucity of available and affordable treatments. Over recent years, there has been increasing recognition of the need for affirmative action by government, health providers, clinicians and patients. The advent of new next generation sequencing (NGS) technologies, such as whole genome/exome sequencing, offers an unprecedented opportunity to provide large-scale population screening of the Chinese population to identify the molecular causes of rare genetic diseases. As a surrogate for the lack of effective treatments, the recent development and implementation of noninvasive prenatal testing (NIPT) in China has the greatest potential, as a single technology, for reducing the number of children born with rare genetic diseases. PMID:27672536

  10. Next generation sequencing for disorders of sex development.

    PubMed

    Tobias, Edward S; McElreavey, Ken

    2014-01-01

    Advances in sequencing technologies are having a major impact on our understanding of the genetic causes of many human congenital disorders. Next generation sequencing (NGS) approaches are particularly important for determining the inherited genetic changes leading to disorders of sex development (DSD). Knowledge of the genetic pathways involved in ovary or testis development is incomplete and, currently, a molecular diagnosis is made in a minority of DSD cases. Here, we review the different NGS strategies applied to the analysis of rare diseases and highlight the potential pitfalls and advantages that are associated with each approach. We also discuss the problems of variant calling as well as the challenges involved in the identification and interpretation of pathogenic mutations from NGS datasets. As clinics start to use NGS on a routine basis, a close collaboration between the molecular and clinical geneticists is essential. This is particularly relevant in the context of unsolicited genetic findings, where clear guidelines regarding counseling, truly informed consent and precise data interpretation will be invaluable.

  11. Second-generation sequencing for gene discovery in the Brassicaceae.

    PubMed

    Hayward, Alice; Vighnesh, Guru; Delay, Christina; Samian, Mohd Rafizan; Manoli, Sahana; Stiller, Jiri; McKenzie, Megan; Edwards, David; Batley, Jacqueline

    2012-08-01

    The Brassicaceae contains the most diverse collection of agriculturally important crop species of all plant families. Yet, this is one of the few families that do not form functional symbiotic associations with mycorrhizal fungi in the soil for improved nutrient acquisition. The genes involved in this symbiosis were more recently recruited by legumes for symbiotic association with nitrogen-fixing rhizobia bacteria. This study applied second-generation sequencing (SGS) and analysis tools to discover that two such genes, NSP1 (Nodulation Signalling Pathway 1) and NSP2, remain conserved in diverse members of the Brassicaceae despite the absence of these symbioses. We demonstrate the utility of SGS data for the discovery of putative gene homologs and their analysis in complex polyploid crop genomes with little prior sequence information. Furthermore, we show how this data can be applied to enhance downstream reverse genetics analyses. We hypothesize that Brassica NSP genes may function in the root in other plant-microbe interaction pathways that were recruited for mycorrhizal and rhizobial symbioses during evolution. PMID:22765874

  12. Management of Incidental Findings in the Era of Next-generation Sequencing

    PubMed Central

    Blackburn, Heather L.; Schroeder, Bradley; Turner, Clesson; Shriver, Craig D.; Ellsworth, Darrell L.; Ellsworth, Rachel E.

    2015-01-01

    Next-generation sequencing (NGS) technologies allow for the generation of whole exome or whole genome sequencing data, which can be used to identify novel genetic alterations associated with defined phenotypes or to expedite discovery of functional variants for improved patient care. Because this robust technology has the ability to identify all mutations within a genome, incidental findings (IF), genetic alterations associated with conditions or diseases unrelated to the patient’s present condition for which current tests are being performed, may have important clinical ramifications. The current debate among genetic scientists and clinicians focuses on the following questions: 1) should any IF be disclosed to patients, and 2) which IF should be disclosed: actionable mutations, variants of unknown significance, or all IF? Policies for disclosure of IF are being developed for when and how to convey these findings and whether adults, minors, or individuals unable to provide consent have the right to refuse receipt of IF. In this review, we detail current NGS technology platforms, discuss pressing issues regarding disclosure of IF, and describe how IF are currently being handled in prenatal, pediatric, and adult patients. PMID:26069456

  13. Genetic sequence relationships of Winnipegosis platform carbonates, Southern Elk Point basin, North Dakota

    SciTech Connect

    Shanley, K.W.; Cross, T.A.

    1988-07-01

    Examination of cores and well-log data from the Winnipegosis Formation (Givetian) within a study area of approximately 11,500 mi² (30,000 km²) in northern North Dakota allows recognition of seven time-stratigraphic progradational units within the Winnipegosis Formation. Together with the underlying Ashern Formation, these units are arranged in landward-stepping, vertical stacking, and seaward-stepping geometric patterns, which reflect changes in relative sea level. Abrupt juxtaposition of shallow over deeper water lithologies, evidence for subaerial exposure, and onlap geometries further suggest that these progradational units form two larger Vail-type sequences separated by regionally persistent unconformities or their correlative conformities. Sea level rise during the early Eifelian caused southeastward onlap of the Ashern Formation onto Middle Silurian carbonates of the Interlake Formation. Maximum flooding, expressed by deepest marine facies and a hardground surface, suggests the existence of a condensed section at the top of the Ashern Formation. This section was developed during the maximum rate of sea level rise. A decrease in the rate of sea level rise resulted in aggradation of lower Winnipegosis units on a gently dipping ramp. These units are represented by nodular and burrowed open-marine limestones with scattered stromatoporoid patch reefs and grainstone shoals. During the subsequent sea level fall, represented by Temple units, a shelf margin with pronounced depositional topography and an adjacent starved basin were developed. Temple strata include coral-brachiopod-stromatoporoid reefs and productive fore-reef talus deposits along the shelf-margin rim. With increased rates of sea level fall, the platform interior and shelf margin were subaerially exposed, slope carbonates were dolomitized, and the E-shale was deposited as a lowstand wedge.

  14. Defining a sample preparation workflow for advanced virus detection and understanding sensitivity by next-generation sequencing.

    PubMed

    Wang, Christopher J; Feng, Szi Fei; Duncan, Paul

    2014-01-01

    The application of next-generation sequencing (also known as deep sequencing or massively parallel sequencing) for adventitious agent detection is an evolving field that is steadily gaining acceptance in the biopharmaceutical industry. In order for this technology to be successfully applied, a robust method that can isolate viral nucleic acids from a variety of biological samples (such as host cell substrates, cell-free culture fluids, viral vaccine harvests, and animal-derived raw materials) must be established by demonstrating recovery of model virus spikes. In this report, we implement the sample preparation workflow developed by Feng et al. and assess the sensitivity of virus detection in a next-generation sequencing readout using the Illumina MiSeq platform. We describe a theoretical model to estimate the detection of a target virus in a cell lysate or viral vaccine harvest sample. We show that nuclease treatment can be used for samples that contain a high background of non-relevant nucleic acids (e.g., host cell DNA) in order to effectively increase the sensitivity of sequencing target viruses and reduce the complexity of data analysis. Finally, we demonstrate that at defined spike levels, nucleic acids from a panel of model viruses spiked into representative cell lysate and viral vaccine harvest samples can be confidently recovered by next-generation sequencing.
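
    The abstract above does not spell out its theoretical model, so the following is only an illustrative sketch of one common way to reason about sequencing sensitivity. It assumes reads are sampled independently and that a fraction of the library is viral, so the number of viral reads is approximately Poisson distributed. The function name, parameters, and the Poisson assumption are hypothetical and are not taken from the report.

```python
import math


def detection_probability(total_reads: int, viral_fraction: float, min_reads: int = 1) -> float:
    """Probability of observing at least `min_reads` viral reads.

    Assumption (not from the source): the viral read count is Poisson
    with mean total_reads * viral_fraction.
    """
    if total_reads < 0 or not 0.0 <= viral_fraction <= 1.0 or min_reads < 1:
        raise ValueError("invalid arguments")
    mean = total_reads * viral_fraction
    # Accumulate P(X = 0) + ... + P(X = min_reads - 1) term by term.
    term = math.exp(-mean)
    below_threshold = 0.0
    for i in range(min_reads):
        below_threshold += term
        term *= mean / (i + 1)
    return 1.0 - below_threshold


if __name__ == "__main__":
    # Removing background nucleic acid raises the viral fraction,
    # which raises the chance of detection for the same read budget.
    for fraction in (1e-7, 1e-6, 1e-5):
        p = detection_probability(1_000_000, fraction, min_reads=3)
        print(f"viral fraction {fraction:.0e}: P(detect) = {p:.4f}")
```

    Under these assumptions, reducing host background (for example by nuclease treatment) acts only through the viral fraction, which is why it improves sensitivity without requiring more reads.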

  15. Plasmid-Based Materials as Multiplex Quality Controls and Calibrators for Clinical Next-Generation Sequencing Assays.

    PubMed

    Sims, David J; Harrington, Robin D; Polley, Eric C; Forbes, Thomas D; Mehaffey, Michele G; McGregor, Paul M; Camalier, Corinne E; Harper, Kneshay N; Bouk, Courtney H; Das, Biswajit; Conley, Barbara A; Doroshow, James H; Williams, P Mickey; Lih, Chih-Jian

    2016-05-01

    Although next-generation sequencing technologies have been widely adapted for clinical diagnostic applications, an urgent need exists for multianalyte calibrator materials and controls to evaluate the performance of these assays. Control materials will also play a major role in the assessment, development, and selection of appropriate alignment and variant calling pipelines. We report an approach to provide effective multianalyte controls for next-generation sequencing assays, referred to as the control plasmid spiked-in genome (CPSG). Control plasmids that contain approximately 1000 bases of human genomic sequence with a specific mutation of interest positioned near the middle of the insert and a nearby 6-bp molecular barcode were synthesized, linearized, quantitated, and spiked into genomic DNA derived from formalin-fixed, paraffin-embedded-prepared HapMap cell lines at defined copy number ratios. Serial titration experiments demonstrated that the CPSGs performed with similar efficiency of variant detection as formalin-fixed, paraffin-embedded cell line genomic DNA. Repetitive analyses of one lot of CPSGs 90 times during 18 months revealed that the reagents were stable with consistent detection of each of the plasmids at similar variant allele frequencies. CPSGs are designed to work across most next-generation sequencing methods, platforms, and data analysis pipelines. CPSGs are robust controls and can be used to evaluate the performance of different next-generation sequencing diagnostic assays, assess data analysis pipelines, and ensure robust assay performance metrics. PMID:27105923
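
    As a rough illustration of what spiking at "defined copy number ratios" implies, the sketch below computes the variant allele frequency one would expect if the background genomic DNA carries only the reference allele at the locus and plasmid and genomic copies are captured with equal efficiency. Both conditions are simplifying assumptions, and the function name is hypothetical; the abstract does not give this formula.

```python
def expected_vaf(plasmid_copies: float, genome_copies: float) -> float:
    """Expected variant allele frequency for a mutant plasmid spike-in.

    Assumptions (not from the source): every plasmid copy carries the
    variant, every genomic copy of the locus carries the reference
    allele, and both are sequenced with equal efficiency.
    """
    total = plasmid_copies + genome_copies
    if plasmid_copies < 0 or genome_copies < 0 or total == 0:
        raise ValueError("copy numbers must be non-negative and not both zero")
    return plasmid_copies / total


if __name__ == "__main__":
    # A serial titration: halve the plasmid input against a fixed background.
    background = 10_000.0
    plasmid = 1_000.0
    for _ in range(5):
        print(f"{plasmid:8.1f} plasmid copies -> expected VAF {expected_vaf(plasmid, background):.4f}")
        plasmid /= 2
```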

  16. Microfluidic platforms for generating dynamic environmental perturbations to study the responses of single yeast cells.

    PubMed

    Bisaria, Anjali; Hersen, Pascal; McClean, Megan N

    2014-01-01

    Microfluidic platforms are ideal for generating dynamic temporal and spatial perturbations in extracellular environments. Single cells and organisms can be trapped and maintained in microfluidic platforms for long periods of time while their responses to stimuli are measured using appropriate fluorescence reporters and time-lapse microscopy. Such platforms have been used to study problems as diverse as C. elegans olfaction (Chronis et al. Nature Methods 4:727-731, 2007), cancer cell migration (Huang et al. Biomicrofluidics 5:13412, 2011), and E. coli chemotaxis (Ahmed et al. Integr Biol 2:604-629, 2010). In this paper we describe how to construct and use a microfluidic chip to study the response of single yeast cells to dynamic perturbations of their fluid environment. The method involves creation of a photoresist master mold followed by subsequent creation of a polydimethylsiloxane (PDMS) microfluidic chip for maintaining live yeast cells in a channel with two inputs for stimulating the cells. We emphasize simplicity and the methods discussed here are accessible to the average biological laboratory. We cover the basic toolbox for making microfluidic lab-on-a-chip devices, and the techniques discussed serve as a starting point for creating sophisticated microfluidic devices capable of implementing more complicated experimental protocols.

  17. A novel method for the multiplexed target enrichment of MinION next generation sequencing libraries using PCR-generated baits.

    PubMed

    Karamitros, Timokratis; Magiorkinis, Gkikas

    2015-12-15

    The enrichment of targeted regions within complex next generation sequencing libraries commonly uses biotinylated baits to capture the desired sequences. This method results in high read coverage over the targets and their flanking regions. Oxford Nanopore Technologies recently released a USB 3.0-interfaced sequencer, the MinION. To date no particular method for enriching MinION libraries has been standardized. Here, using biotinylated PCR-generated baits in a novel approach, we describe a simple and efficient way for multiplexed enrichment of MinION libraries, overcoming technical limitations related to the chemistry of the sequencing adapters and the length of the DNA fragments. Using Phage Lambda and Escherichia coli as models we selectively enrich for specific targets, significantly increasing the corresponding read coverage and eliminating unwanted regions. We show that by capturing genomic fragments that contain the target sequences, we recover reads that extend beyond the targeted regions and can thus be used for the determination of potentially unknown flanking sequences. By pooling enriched libraries derived from two distinct E. coli strains and analyzing them in parallel, we demonstrate the efficiency of this method in multiplexed format. Crucially, we evaluated the optimal bait size for large fragment libraries and we describe for the first time a standardized method for target enrichment on the MinION platform.

  18. Human identification by lice: A Next Generation Sequencing challenge.

    PubMed

    Pilli, Elena; Agostino, Alessandro; Vergani, Debora; Salata, Elena; Ciuna, Ignazio; Berti, Andrea; Caramelli, David; Lambiase, Simonetta

    2016-09-01

    Rapid and progressive advances in molecular biology techniques and the advent of Next Generation Sequencing (NGS) have opened new possibilities for analyses, including the identification of entomological matrices. Insects and other arthropods are widespread in nature and those found at a crime scene can provide a useful contribution to forensic investigations. Entomological evidence is used by experts to define the postmortem interval (PMI), which is essentially based on morphological recognition of the insect and an estimation of its life cycle stage. However, molecular genotyping methods can also provide important support for forensic entomological investigations when the identification of species or human genetic material is required. This case study concerns a collection of insects found in the house of a woman who died from unknown causes. Initially the insects were identified morphologically as belonging to the Pediculidae family, and then human DNA was extracted from their gastrointestinal tract and analyzed. The application of the latest generation forensic DNA assays, such as the Quantifiler® Trio DNA Quantification Kit and the HID-Ion AmpliSeq™ Identity Panel (Applied Biosystems®), identified the presence of human DNA in the samples and determined the genetic profile. PMID:27289564

  19. A targeted next-generation sequencing method for identifying clinically relevant mutation profiles in lung adenocarcinoma

    PubMed Central

    Shao, Di; Lin, Yongping; Liu, Jilong; Wan, Liang; Liu, Zu; Cheng, Shaomin; Fei, Lingna; Deng, Rongqing; Wang, Jian; Chen, Xi; Liu, Liping; Gu, Xia; Liang, Wenhua; He, Ping; Wang, Jun; Ye, Mingzhi; He, Jianxing

    2016-01-01

    Molecular profiling of lung cancer has become essential for prediction of an individual’s response to targeted therapies. Next-generation sequencing (NGS) is a promising technique for routine diagnostics, but has not been sufficiently evaluated in terms of feasibility, reliability, cost and capacity with routine diagnostic formalin-fixed, paraffin-embedded (FFPE) materials. Here, we report the validation and application of a test based on Ion Proton technology for the rapid characterisation of single nucleotide variations (SNVs), short insertions and deletions (InDels), copy number variations (CNVs), and gene rearrangements in 145 genes with FFPE clinical specimens. The validation study, using 61 previously profiled clinical tumour samples, showed a concordance rate of 100% between results obtained by NGS and conventional test platforms. Analysis of tumour cell lines indicated reliable mutation detection in samples with 5% tumour content. Furthermore, application of the panel to 58 clinical cases identified at least one actionable mutation in 43 cases, 1.4 times the number of actionable alterations detected by current diagnostic tests. We demonstrated that targeted NGS is a cost-effective and rapid platform to detect multiple mutations simultaneously in various genes with high reproducibility and sensitivity. PMID:26936516

  1. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    PubMed

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared the obtained sequence information with the available donkey draft genome (and the Illumina reads from which it originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450
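
    The abstract above attributes the alignment difference to the larger N50 of the horse assembly. For readers unfamiliar with the statistic, here is a minimal sketch of the usual definition: N50 is the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembly size. The function name and the toy lengths are illustrative only and do not come from the study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running total to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-negative lengths")


if __name__ == "__main__":
    fragmented = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # toy lengths, arbitrary units
    contiguous = [30, 24]                      # same total length, fewer pieces
    print(n50(fragmented), n50(contiguous))
```

    Two assemblies of the same total size can have very different N50 values, so a higher N50 indicates a more contiguous reference, which tends to give reads more places to align unambiguously.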

  2. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms

    PubMed Central

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared the obtained sequence information with the available donkey draft genome (and the Illumina reads from which it originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions, we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated by combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450

  3. Ocean colour opportunities from Meteosat Second and Third Generation geostationary platforms

    NASA Astrophysics Data System (ADS)

    Kwiatkowska, Ewa J.; Ruddick, Kevin; Ramon, Didier; Vanhellemont, Quinten; Brockmann, Carsten; Lebreton, Carole; Bonekamp, Hans G.

    2016-05-01

    Ocean colour applications from medium-resolution polar-orbiting satellite sensors have now matured and evolved into operational services. These applications are enabled by the Sentinel-3 OLCI space sensors of the European Earth Observation Copernicus programme and the VIIRS sensors of the US Joint Polar Satellite System programme. Key drivers for the Copernicus ocean colour services are the national obligations of the EU member states to report on the quality of marine, coastal and inland waters for the EU Water Framework Directive and Marine Strategy Framework Directive. Further applications include CO2 sequestration, carbon cycle and climate, fisheries and aquaculture management, near-real-time alerting to harmful algae blooms, environmental monitoring and forecasting, and assessment of sediment transport in coastal waters. Ocean colour data from polar-orbiting satellite platforms, however, suffer from fractional coverage, primarily due to clouds, and inadequate resolution of quickly varying processes. Ocean colour remote sensing from geostationary platforms can provide significant improvements in coverage and sampling frequency and support new applications and services. EUMETSAT's SEVIRI instrument on the geostationary Meteosat Second Generation platforms (MSG) is not designed to meet ocean colour mission requirements; however, it has been demonstrated to provide a valuable contribution, particularly in combination with dedicated ocean colour polar observations. This paper describes the ongoing effort to develop operational ocean colour water turbidity and related products and user services from SEVIRI. SEVIRI's multi-temporal capabilities can benefit users requiring improved local-area coverage and frequent diurnal observations. A survey of user requirements and a study of technical capabilities and limitations of the SEVIRI instruments are the basis for this development and are described in this paper. The products will support monitoring of sediment transport

  4. Ocean colour products from geostationary platforms, opportunities with Meteosat Second and Third Generation

    NASA Astrophysics Data System (ADS)

    Kwiatkowska, E. J.; Ruddick, K.; Ramon, D.; Vanhellemont, Q.; Brockmann, C.; Lebreton, C.; Bonekamp, H. G.

    2015-12-01

    Ocean colour applications from medium-resolution polar-orbiting satellite sensors have now matured and evolved into operational services. The examples include the Sentinel-3 OLCI missions of the European Earth Observation Copernicus programme and the VIIRS missions of the US Joint Polar Satellite System programme. Key drivers for Copernicus ocean colour services are the national obligations of the EU member states to report on the quality of marine, coastal and inland waters for the EU Water Framework Directive and Marine Strategy Framework Directive. Further applications include CO2 sequestration, carbon cycle and climate, fisheries and aquaculture management, near-real-time alerting to harmful algae blooms, environmental monitoring and forecasting, and assessment of sediment transport in coastal waters. Ocean colour data from polar-orbiting satellite platforms, however, suffer from fractional coverage, primarily due to clouds, and inadequate resolution of quickly varying processes. Ocean colour remote sensing from geostationary platforms can provide significant improvements in coverage and sampling frequency and support new applications and services. EUMETSAT's SEVIRI instrument on the geostationary Meteosat Second Generation platforms (MSG) is not designed to meet ocean colour mission requirements; however, it has been demonstrated to provide a valuable contribution, particularly in combination with dedicated ocean colour polar observations. This paper describes the ongoing effort to develop operational ocean colour water turbidity and related products and user services from SEVIRI. A survey of user requirements and a study of technical capabilities and limitations of the SEVIRI instruments are the basis for this development and are described in this paper. The products will support monitoring of sediment transport, water clarity, and tidal dynamics. Further products and services are anticipated from EUMETSAT's FCI instruments on Meteosat Third Generation

  5. Efficient DNA fingerprinting based on the targeted sequencing of active retrotransposon insertion sites using a bench-top high-throughput sequencing platform.

    PubMed

    Monden, Yuki; Yamamoto, Ayaka; Shindo, Akiko; Tahara, Makoto

    2014-10-01

    In many crop species, DNA fingerprinting is required for the precise identification of cultivars to protect the rights of breeders. Many families of retrotransposons have multiple copies throughout the eukaryotic genome and their integrated copies are inherited genetically. Thus, their insertion polymorphisms among cultivars are useful for DNA fingerprinting. In this study, we conducted a DNA fingerprinting based on the insertion polymorphisms of active retrotransposon families (Rtsp-1 and LIb) in sweet potato. Using 38 cultivars, we identified 2,024 insertion sites in the two families with an Illumina MiSeq sequencing platform. Of these insertion sites, 91.4% appeared to be polymorphic among the cultivars and 376 cultivar-specific insertion sites were identified, which were converted directly into cultivar-specific sequence-characterized amplified region (SCAR) markers. A phylogenetic tree was constructed using these insertion sites, which corresponded well with known pedigree information, thereby indicating their suitability for genetic diversity studies. Thus, the genome-wide comparative analysis of active retrotransposon insertion sites using the bench-top MiSeq sequencing platform is highly effective for DNA fingerprinting without any requirement for whole genome sequence information. This approach may facilitate the development of practical polymerase chain reaction-based cultivar diagnostic system and could also be applied to the determination of genetic relationships.

  6. Authentication of Herbal Supplements Using Next-Generation Sequencing

    PubMed Central

    Braukmann, Thomas W. A.; Borisenko, Alex V.; Zakharov, Evgeny V.

    2016-01-01

    Background: DNA-based testing has been gaining acceptance as a tool for authentication of a wide range of food products; however, its applicability for testing of herbal supplements remains contentious. Methods: We utilized Sanger and Next-Generation Sequencing (NGS) for taxonomic authentication of fifteen herbal supplements representing three different producers from five medicinal plants: Echinacea purpurea, Valeriana officinalis, Ginkgo biloba, Hypericum perforatum and Trigonella foenum-graecum. Experimental design included three modifications of DNA extraction, two lysate dilutions, Internal Amplification Control, and multiple negative controls to exclude background contamination. Ginkgo supplements were also analyzed using HPLC-MS for the presence of active medicinal components. Results: All supplements yielded DNA from multiple species, rendering Sanger sequencing results for rbcL and ITS2 regions either uninterpretable or non-reproducible between the experimental replicates. Overall, DNA from the manufacturer-listed medicinal plants was successfully detected in seven out of eight dry herb form supplements; however, low or poor DNA recovery due to degradation was observed in most plant extracts (none detected by Sanger; three out of seven by NGS). NGS also revealed a diverse community of fungi, known to be associated with live plant material and/or the fermentation process used in the production of plant extracts. HPLC-MS testing demonstrated that Ginkgo supplements with degraded DNA contained ten key medicinal components. Conclusion: Quality control of herbal supplements should utilize a synergetic approach targeting both DNA and bioactive components, especially for standardized extracts with degraded DNA. The NGS workflow developed in this study enables reliable detection of plant and fungal DNA and can be utilized by manufacturers for quality assurance of raw plant materials, contamination control during the production process, and the final product

  7. Pathway analysis with next-generation sequencing data.

    PubMed

    Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric; Xiong, Momiao

    2015-04-01

    Although pathway analysis methods have been developed and successfully applied to association studies of common variants, the statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their lack of ability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic that is based on the smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type 1 error rates. Also the power of the SFPCA-based statistic and 22 additional existing statistics are evaluated. We found that the SFPCA-based statistic has a much higher power than other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic has much smaller P-values to identify pathway association than other existing methods. PMID:24986826

  8. Statistical signatures of aftershock sequences generated by supershear mainshocks

    NASA Astrophysics Data System (ADS)

    Bhattacharya, P.; Shcherbakov, R.; Tiampo, K. F.; Mansinha, L.

    2010-12-01

    The rupture process during supershear earthquakes generates a seismic shock wave redistributing stress away from the fault resembling a sonic boom produced by a supersonic aircraft. This leads to a relative quiescence in aftershock activity along the supershear segment of the rupture. The occurrence of supershear ruptures is also generally associated with a region of local high pre-stress and an unusually smooth friction profile over the supershear segment, leading to a conspicuous absence of high frequency ground motions. We have considered the aftershock sequences of five well-known supershear earthquakes from around the world (1979 Imperial Valley, 1992 Landers, 1999 Izmit and Duzce and 2002 Denali earthquakes) to test whether the aftershock statistics around the supershear rupture are different from the statistics in the rest of the region due to the aforementioned stress conditions and redistributions. Specifically, we have looked at the frequency-magnitude distribution in order to study the variation of the b value for each of the sequences and observe statistically significant variations. In particular, we have determined that the b value is always higher in the zone surrounding a supershear segment than in the rest of the aftershock region. The Omori Law, however, does not show such clear trends. We also looked at the average difference in magnitude between the mainshock and the largest aftershock and found it is larger than that predicted by Bath's law. The results certainly point towards a relationship between aftershock statistics and the mainshock rupture process and might facilitate a physical process based understanding of the empirical laws of earthquake statistics.
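
    The abstract compares b values between zones but does not say which estimator was used. As one way to make the quantity concrete, the sketch below uses the widely used Aki-Utsu maximum-likelihood estimate, b = log10(e) / (mean(M) - (Mc - dM/2)), applied to events at or above a completeness magnitude Mc. The choice of estimator, the function name `b_value`, the bin width and the toy magnitudes are all assumptions for illustration, not the authors' method or data.

```python
import math


def b_value(magnitudes, completeness_mag, bin_width=0.1):
    """Aki-Utsu maximum-likelihood estimate of the Gutenberg-Richter b value.

    Only events with magnitude >= completeness_mag are used. The
    bin_width / 2 term is the usual correction for binned magnitudes.
    """
    above = [m for m in magnitudes if m >= completeness_mag]
    if not above:
        raise ValueError("no events at or above the completeness magnitude")
    mean_mag = sum(above) / len(above)
    denominator = mean_mag - (completeness_mag - bin_width / 2)
    if denominator <= 0:
        raise ValueError("mean magnitude too close to the completeness magnitude")
    return math.log10(math.e) / denominator


# Toy catalogue (made-up magnitudes, not from the study).
toy_b = b_value([3.0, 3.2, 3.4, 3.6, 4.0], completeness_mag=3.0)
```

    Under this estimator, a higher b value in one zone means that zone's catalogue is relatively richer in small events, which is the kind of contrast the abstract reports around supershear segments.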

  9. DNA qualification workflow for next generation sequencing of histopathological samples.

    PubMed

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for

  10. DNA qualification workflow for next generation sequencing of histopathological samples.

    PubMed

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for

  11. Splicing Express: a software suite for alternative splicing analysis using next-generation sequencing data.

    PubMed

    Kroll, Jose E; Kim, Jihoon; Ohno-Machado, Lucila; de Souza, Sandro J

    2015-01-01

    Motivation. Alternative splicing events (ASEs) are prevalent in the transcriptome of eukaryotic species and are known to influence many biological phenomena. The identification and quantification of these events are crucial for a better understanding of biological processes. Next-generation DNA sequencing technologies have allowed deep characterization of transcriptomes and made it possible to address these issues. ASEs analysis, however, represents a challenging task especially when many different samples need to be compared. Some popular tools for the analysis of ASEs are known to report thousands of events without annotations and/or graphical representations. A new tool for the identification and visualization of ASEs is here described, which can be used by biologists without a solid bioinformatics background. Results. A software suite named Splicing Express was created to perform ASEs analysis from transcriptome sequencing data derived from next-generation DNA sequencing platforms. Its major goal is to serve the needs of biomedical researchers who do not have bioinformatics skills. Splicing Express performs automatic annotation of transcriptome data (GTF files) using gene coordinates available from the UCSC genome browser and allows the analysis of data from all available species. The identification of ASEs is done by a known algorithm previously implemented in another tool named Splooce. As a final result, Splicing Express creates a set of HTML files composed of graphics and tables designed to describe the expression profile of ASEs among all analyzed samples. By using RNA-Seq data from the Illumina Human Body Map and the Rat Body Map, we show that Splicing Express is able to perform all tasks in a straightforward way, identifying well-known specific events. Availability and Implementation. Splicing Express is written in Perl and is suitable to run only in UNIX-like systems. More details can be found at: http://www.bioinformatics-brazil.org/splicingexpress.

  12. Scanning the Effects of Ethyl Methanesulfonate on the Whole Genome of Lotus japonicus Using Second-Generation Sequencing Analysis

    PubMed Central

    Mohd-Yusoff, Nur Fatihah; Ruperao, Pradeep; Tomoyoshi, Nurain Emylia; Edwards, David; Gresshoff, Peter M.; Biswas, Bandana; Batley, Jacqueline

    2015-01-01

    Genetic structure can be altered by chemical mutagenesis, which is a common method applied in molecular biology and genetics. Second-generation sequencing provides a platform to reveal base alterations occurring in the whole genome due to mutagenesis. A model legume, Lotus japonicus ecotype Miyakojima, was chemically mutated with alkylating ethyl methanesulfonate (EMS) for the scanning of DNA lesions throughout the genome. Using second-generation sequencing, two individually mutated third-generation progeny (M3, named AM and AS) were sequenced and analyzed to identify single nucleotide polymorphisms and reveal the effects of EMS on nucleotide sequences in these mutant genomes. Single-nucleotide polymorphisms were found in every 208 kb (AS) and 202 kb (AM) with a bias mutation of G/C-to-A/T changes at low percentage. Most mutations were intergenic. The mutation spectrum of the genomes was comparable in their individual chromosomes; however, each mutated genome has unique alterations, which are useful to identify causal mutations for their phenotypic changes. The data obtained demonstrate that whole genomic sequencing is applicable as a high-throughput tool to investigate genomic changes due to mutagenesis. The identification of these single-point mutations will facilitate the identification of phenotypically causative mutations in EMS-mutated germplasm. PMID:25660167

  13. Evolution of a Reconfigurable Processing Platform for a Next Generation Space Software Defined Radio

    NASA Technical Reports Server (NTRS)

    Kacpura, Thomas J.; Downey, Joseph A.; Anderson, Keffery R.; Baldwin, Keith

    2014-01-01

    The National Aeronautics and Space Administration (NASA)/Harris Ka-Band Software Defined Radio (SDR) is the first fully reprogrammable, space-qualified SDR operating in the Ka-Band frequency range. Providing substantially higher data communication rates than previously possible, this SDR offers in-orbit reconfiguration, multi-waveform operation, and fast deployment due to its highly modular hardware and software architecture. Currently in operation on the International Space Station (ISS), this new paradigm of reconfigurable technology is enabling experimenters to investigate navigation and networking in the space environment. The modular SDR and the NASA-developed Space Telecommunications Radio System (STRS) architecture standard are the basis for Harris's reusable digital signal processing space platform, trademarked as AppSTAR. As a result, two new space radio products are a synthetic aperture radar payload and an Automatic Dependent Surveillance-Broadcast (ADS-B) receiver. In addition, Harris is currently developing many new products similar to the Ka-Band software defined radio for other applications. For NASA's next generation flight Ka-Band radio development, leveraging these advancements could lead to a more robust and more capable software defined radio. The space environment has special considerations, different from terrestrial applications, that must be taken into account for any system operated in space. Each space mission has unique requirements that can make these systems unique. These requirements can result in products that are expensive and limited in reuse. Space systems put a premium on size, weight and power. A key trade is the amount of reconfigurability in a space system. The more reconfigurable the hardware platform, the easier it is to adapt the platform to the next mission, and this reduces the amount of non-recurring engineering costs. However, the more reconfigurable platforms often use more spacecraft resources. Software has similar considerations

  14. IDBA-MT: de novo assembler for metatranscriptomic data generated from next-generation sequencing technology.

    PubMed

    Leung, Henry C M; Yiu, Siu-Ming; Parkinson, John; Chin, Francis Y L

    2013-07-01

    High-throughput next-generation sequencing technology provides a great opportunity for analyzing metatranscriptomic data. However, the reads produced by these technologies are short, and an assembly step is required to combine the short reads into longer contigs. As there are many repeat patterns in mRNAs from different genomes and the abundance ratio of mRNAs in a sample varies widely, existing assemblers for genomic, transcriptomic, and metagenomic data do not work on metatranscriptomic data and produce chimeric contigs, that is, incorrect contigs formed by merging multiple mRNA sequences. To the best of our knowledge, there is no assembler designed for metatranscriptomic data. In this article, we introduce an assembler called IDBA-MT, which is designed for assembling reads from metatranscriptomic data. IDBA-MT produces far fewer chimeric contigs (a reduction of 50% or more) when compared with existing assemblers such as Oases, IDBA-UD, and Trinity. PMID:23829653

  15. SNP discovery using Next Generation Transcriptomic Sequencing in Atlantic herring (Clupea harengus).

    PubMed

    Helyar, Sarah J; Limborg, Morten T; Bekkevold, Dorte; Babbucci, Massimiliano; van Houdt, Jeroen; Maes, Gregory E; Bargelloni, Luca; Nielsen, Rasmus O; Taylor, Martin I; Ogden, Rob; Cariani, Alessia; Carvalho, Gary R; Panitz, Frank

    2012-01-01

    The introduction of Next Generation Sequencing (NGS) has revolutionised population genetics, providing studies of non-model species with unprecedented genomic coverage, allowing evolutionary biologists to address questions previously far beyond the reach of available resources. Furthermore, the simple mutation model of Single Nucleotide Polymorphisms (SNPs) permits cost-effective high-throughput genotyping in thousands of individuals simultaneously. Genomic resources are scarce for the Atlantic herring (Clupea harengus), a small pelagic species that sustains high revenue fisheries. This paper details the development of 578 SNPs using a combined NGS and high-throughput genotyping approach. Eight individuals covering the species distribution in the eastern Atlantic were bar-coded and multiplexed into a single cDNA library and sequenced using the 454 GS FLX platform. SNP discovery was performed by de novo sequence clustering and contig assembly, followed by the mapping of reads against consensus contig sequences. Selection of candidate SNPs for genotyping was conducted using an in silico approach. SNP validation and genotyping were performed simultaneously using an Illumina 1,536 GoldenGate assay. Although the conversion rate of candidate SNPs in the genotyping assay cannot be predicted in advance, this approach has the potential to maximise cost and time efficiencies by avoiding expensive and time-consuming laboratory stages of SNP validation. Additionally, the in silico approach leads to lower ascertainment bias in the resulting SNP panel as marker selection is based only on the ability to design primers and the predicted presence of intron-exon boundaries. Consequently SNPs with a wider spectrum of minor allele frequencies (MAFs) will be genotyped in the final panel. The genomic resources presented here represent a valuable multi-purpose resource for developing informative marker panels for population discrimination, microarray development and for population

  16. SNP Discovery Using Next Generation Transcriptomic Sequencing in Atlantic Herring (Clupea harengus)

    PubMed Central

    Bekkevold, Dorte; Babbucci, Massimiliano; van Houdt, Jeroen; Maes, Gregory E.; Bargelloni, Luca; Nielsen, Rasmus O.; Taylor, Martin I.; Ogden, Rob; Cariani, Alessia; Carvalho, Gary R.; Consortium, FishPopTrace; Panitz, Frank

    2012-01-01

    The introduction of Next Generation Sequencing (NGS) has revolutionised population genetics, providing studies of non-model species with unprecedented genomic coverage, allowing evolutionary biologists to address questions previously far beyond the reach of available resources. Furthermore, the simple mutation model of Single Nucleotide Polymorphisms (SNPs) permits cost-effective high-throughput genotyping in thousands of individuals simultaneously. Genomic resources are scarce for the Atlantic herring (Clupea harengus), a small pelagic species that sustains high revenue fisheries. This paper details the development of 578 SNPs using a combined NGS and high-throughput genotyping approach. Eight individuals covering the species distribution in the eastern Atlantic were bar-coded and multiplexed into a single cDNA library and sequenced using the 454 GS FLX platform. SNP discovery was performed by de novo sequence clustering and contig assembly, followed by the mapping of reads against consensus contig sequences. Selection of candidate SNPs for genotyping was conducted using an in silico approach. SNP validation and genotyping were performed simultaneously using an Illumina 1,536 GoldenGate assay. Although the conversion rate of candidate SNPs in the genotyping assay cannot be predicted in advance, this approach has the potential to maximise cost and time efficiencies by avoiding expensive and time-consuming laboratory stages of SNP validation. Additionally, the in silico approach leads to lower ascertainment bias in the resulting SNP panel as marker selection is based only on the ability to design primers and the predicted presence of intron-exon boundaries. Consequently SNPs with a wider spectrum of minor allele frequencies (MAFs) will be genotyped in the final panel. The genomic resources presented here represent a valuable multi-purpose resource for developing informative marker panels for population discrimination, microarray development and for population

  17. High-Throughput Microdissection for Next-Generation Sequencing

    PubMed Central

    Rosenberg, Avi Z.; Armani, Michael D.; Fetsch, Patricia A.; Xi, Liqiang; Pham, Tina Thu; Raffeld, Mark; Chen, Yun; O’Flaherty, Neil; Stussman, Rebecca; Blackler, Adele R.; Du, Qiang; Hanson, Jeffrey C.; Roth, Mark J.; Filie, Armando C.; Roh, Michael H.; Emmert-Buck, Michael R.; Hipp, Jason D.; Tangrea, Michael A.

    2016-01-01

    Precision medicine promises to enhance patient treatment through the use of emerging molecular technologies, including genomics, transcriptomics, and proteomics. However, current tools in surgical pathology lack the capability to efficiently isolate specific cell populations in complex tissues/tumors, which can confound molecular results. Expression microdissection (xMD) is an immuno-based cell/subcellular isolation tool that procures targets of interest from a cytological or histological specimen. In this study, we demonstrate the accuracy and precision of xMD by rapidly isolating immunostained targets, including cytokeratin AE1/AE3, p53, and estrogen receptor (ER) positive cells and nuclei from tissue sections. Other targets procured included green fluorescent protein (GFP) expressing fibroblasts, in situ hybridization positive Epstein-Barr virus nuclei, and silver stained fungi. In order to assess the effect on molecular data, xMD was utilized to isolate specific targets from a mixed population of cells where the targets constituted only 5% of the sample. Target enrichment from this admixed cell population prior to next-generation sequencing (NGS) produced a minimum 13-fold increase in mutation allele frequency detection. These data suggest a role for xMD in a wide range of molecular pathology studies, as well as in the clinical workflow for samples where tumor cell enrichment is needed, or for those with a relative paucity of target cells. PMID:26999048

  18. A Computer Program for Generating Sequences of Primary Arithmetic Facts in Random Order.

    ERIC Educational Resources Information Center

    Burns, Edward

    A computer program which generates randomly sequenced problems for testing the abilities of students to add, subtract, and multiply one-digit numbers is described. Appendices provide tables of random sequences with directions for using the tables. The 54-statement FORTRAN program which can be used in generating additional sequences is also…

  19. Illumina next generation sequencing data and expression microarrays data from retinoblastoma and medulloblastoma tissues.

    PubMed

    García-Chequer, A J; Méndez-Tenorio, A; Olguín-López, G; Sánchez-Vallejo, C; Isa, P; Arias, C F; Torres, J; Hernández-Angeles, A; Ramírez-Ortiz, M A; Lara, C; Cabrera-Muñoz, Ma de L; Sadowinski-Pine, S; Bravo-Ortiz, J C; Ramón-García, G; Diegopérez-Ramírez, J; Ramírez-Reyes, G; Casarrubias-Islas, R; Ramírez, J; Orjuela, M; Ponce-Castañeda, M V

    2016-03-01

    Retinoblastoma (Rb) is a pediatric intraocular malignancy and probably the most robust clinical model in which genetic predisposition to develop cancer has been demonstrated. Since deletions in chromosome 13 have been described in this tumor, we performed next generation sequencing to test whether recurrent losses could be detected in low coverage data. We used the Illumina platform for 13 tumor tissue samples: two pools of 4 retinoblastoma cases each and one pool of 5 medulloblastoma cases (raw data can be found at http://www.ebi.ac.uk/ena/data/view/PRJEB6630). We first created an in silico reference profile generated from a sequenced human genome (GRCh37p5). From these data we calculated an integrity score to get an overview of gains and losses in all chromosomes; we next analyzed each chromosome in windows of 40 kb length, calculating for each window the log2 ratio between reads from the tumor pool and the in silico reference. Finally, we generated panoramic maps showing all windows, whether lost or gained, along each chromosome together with its cytogenetic bands to facilitate interpretation. Expression microarray analysis was performed for the same samples, and a list of over- and under-expressed genes is presented here. For this detection a significance analysis was done and a log2 fold change was chosen as the significance criterion (raw data can be found at http://www.ncbi.nlm.nih.gov/geo/ under accession number GSE11488). The complete research article can be found in the journal Cancer Genetics (Garcia-Chequer et al., in press) [1]. In summary, we provide here a visual overview of gains and losses chromosome by chromosome in retinoblastoma and medulloblastoma, together with the integrity score analysis and a list of genes with relevant associated expression. This material can be useful to researchers who may want to explore gains and losses in other malignant tumors with this approach or compare their data with retinoblastoma.

  20. Illumina next generation sequencing data and expression microarrays data from retinoblastoma and medulloblastoma tissues

    PubMed Central

    García-Chequer, A.J.; Méndez-Tenorio, A.; Olguín-López, G.; Sánchez-Vallejo, C.; Isa, P.; Arias, C.F.; Torres, J.; Hernández-Angeles, A.; Ramírez-Ortiz, M.A.; Lara, C.; Cabrera-Muñoz, Ma.de.L.; Sadowinski-Pine, S.; Bravo-Ortiz, J.C.; Ramón-García, G.; Diegopérez-Ramírez, J.; Ramírez-Reyes, G.; Casarrubias-Islas, R.; Ramírez, J.; Orjuela, M.; Ponce-Castañeda, M.V.

    2016-01-01

    Retinoblastoma (Rb) is a pediatric intraocular malignancy and probably the most robust clinical model in which genetic predisposition to develop cancer has been demonstrated. Since deletions in chromosome 13 have been described in this tumor, we performed next generation sequencing to test whether recurrent losses could be detected in low coverage data. We used the Illumina platform for 13 tumor tissue samples: two pools of 4 retinoblastoma cases each and one pool of 5 medulloblastoma cases (raw data can be found at http://www.ebi.ac.uk/ena/data/view/PRJEB6630). We first created an in silico reference profile generated from a sequenced human genome (GRCh37p5). From these data we calculated an integrity score to get an overview of gains and losses in all chromosomes; we next analyzed each chromosome in windows of 40 kb length, calculating for each window the log2 ratio between reads from the tumor pool and the in silico reference. Finally, we generated panoramic maps showing all windows, whether lost or gained, along each chromosome together with its cytogenetic bands to facilitate interpretation. Expression microarray analysis was performed for the same samples, and a list of over- and under-expressed genes is presented here. For this detection a significance analysis was done and a log2 fold change was chosen as the significance criterion (raw data can be found at http://www.ncbi.nlm.nih.gov/geo/ under accession number GSE11488). The complete research article can be found in the journal Cancer Genetics (Garcia-Chequer et al., in press) [1]. In summary, we provide here a visual overview of gains and losses chromosome by chromosome in retinoblastoma and medulloblastoma, together with the integrity score analysis and a list of genes with relevant associated expression. This material can be useful to researchers who may want to explore gains and losses in other malignant tumors with this approach or compare their data with retinoblastoma. PMID:26937470
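
    The windowed comparison described in this record can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the library-size normalization, and the pseudocount are assumptions introduced here for illustration, and read positions are assumed to be 0-based and smaller than the chromosome length.

```python
import math


def window_log2_ratios(tumor_positions, reference_positions,
                       chrom_length, window=40_000, pseudocount=0.5):
    """Per-window log2(tumor / reference) read-count ratio for one chromosome.

    Hypothetical helper: counts are normalized by the total number of reads
    in each sample, and a pseudocount (an assumption) avoids log2(0).
    """
    n_windows = math.ceil(chrom_length / window)
    tumor = [0] * n_windows
    reference = [0] * n_windows

    # Bin each read start position into its 40 kb window.
    for pos in tumor_positions:
        tumor[pos // window] += 1
    for pos in reference_positions:
        reference[pos // window] += 1

    tumor_total = sum(tumor) or 1
    reference_total = sum(reference) or 1

    ratios = []
    for t, r in zip(tumor, reference):
        t_norm = (t + pseudocount) / tumor_total
        r_norm = (r + pseudocount) / reference_total
        ratios.append(math.log2(t_norm / r_norm))
    return ratios
```

    Under these assumptions, negative values suggest a relative loss in the tumor pool for that window and positive values a relative gain; the threshold at which a window counts as lost or gained is not given in the abstract.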

  2. Detection of inherited mutations for hereditary cancer using target enrichment and next generation sequencing.

    PubMed

    Guan, Yanfang; Hu, Hong; Peng, Yin; Gong, Yuhua; Yi, Yuting; Shao, Libin; Liu, Tengfei; Li, Gairui; Wang, Rongjiao; Dai, Pingping; Bignon, Yves-Jean; Xiao, Zhe; Yang, Ling; Mu, Feng; Xiao, Liang; Xie, Zeming; Yan, Wenhui; Xu, Nan; Zhou, Dongxian; Yi, Xin

    2015-03-01

    Hereditary cancers occur because of inherited gene mutations. Genetic testing has been approved to provide information for risk assessment and a rationale for appropriate intervention. Testing methods currently available for clinical use have limitations, including sensitivity and testing throughput. Next generation sequencing (NGS) has been rapidly evolving to increase testing sensitivity and throughput, and can potentially be used to identify inherited mutations in a clinical diagnostic setting. Here we develop an effective method employing target enrichment and an NGS platform to detect common as well as rare mutations for all common hereditary cancers in a single assay. Single base substitutions across 115 hereditary cancer related genes in YH (the first Asian genome) were characterized to validate our method. Sensitivity, specificity and accuracy of 93.66%, 99.98% and 99.97%, respectively, were achieved. In addition, we correctly identified 53 SNVs and indels of BRCA1 and BRCA2 in two breast cancer specimens, all confirmed by Sanger sequencing. Accuracy in detecting copy number variation (CNV) was corroborated in 4 breast cancer specimens with known CNVs in BRCA1. Application of the method to 85 clinical cases revealed 22 deleterious mutations, 11 of which were novel. In summary, our studies demonstrate that target enrichment combined with NGS provides the accuracy, sensitivity, and high throughput needed for genetic testing of patients at high risk of hereditary or familial cancer.
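
    For readers unfamiliar with the three validation figures quoted in this record, the standard definitions can be sketched as follows. The function name and the counts in the usage note are illustrative assumptions; the abstract does not report the underlying true/false positive and negative counts.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: variant sites called / missed; tn/fp: non-variant sites
    correctly left uncalled / wrongly called. Hypothetical helper.
    """
    sensitivity = tp / (tp + fn)            # fraction of true variants detected
    specificity = tn / (tn + fp)            # fraction of non-variants left uncalled
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy
```

    With made-up counts such as `diagnostic_metrics(tp=90, fp=20, tn=980, fn=10)`, sensitivity is 0.90 and specificity is 0.98. Because non-variant positions vastly outnumber variant ones in a gene panel, accuracy tends to track specificity, which is consistent with the near-identical specificity and accuracy values reported above.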

  3. Next generation sequencing technologies in cancer diagnostics and therapeutics: A mini review.

    PubMed

    Li, W; Zhao, K; Kirberger, M; Liao, W; Yan, Y

    2015-10-30

    The development of advanced molecular technologies has ushered in the era of 'omics' science, including transcriptomics, proteomics, and genomics. Genomics, or the whole genome approach, has become the most comprehensive investigative method to identify new gene mutations, signal pathways and drug targets for cancers. The purpose of this review is to summarize current second generation sequencing techniques in applied genomics, and to analyze the advantages and/or problems associated with each of the various sequencing platforms. Our understanding of molecular factors associated with tumorigenesis is no longer limited to the mutation of well-known cancer related genes, but may involve a broader range of factors involved in tumor development, including novel somatic mutations, gene fusions, long non-coding RNAs, microRNAs, copy number variations, methylation, and genomic structural variations. Furthermore, these new methods are not limited to analyses of a single genetic or epigenetic factor, but offer comprehensive molecular profiling as a more critical and powerful approach to decoding the mystery of tumor development and identifying more reliable cancer biomarkers.

  4. Strategies for Achieving High Sequencing Accuracy for Low Diversity Samples and Avoiding Sample Bleeding Using Illumina Platform

    PubMed Central

    Mitra, Abhishek; Skrzypczak, Magdalena; Ginalski, Krzysztof; Rowicka, Maga

    2015-01-01

    Sequencing microRNA, reduced representation sequencing, Hi-C technology and any method requiring the use of in-house barcodes result in sequencing libraries with low initial sequence diversity. Sequencing such libraries on the Illumina platform typically produces low quality data due to the limitations of the Illumina cluster calling algorithm. Moreover, even in the case of diverse samples, these limitations cause substantial inaccuracies in multiplexed sample assignment (sample bleeding). Such inaccuracies are unacceptable in clinical applications, and in some other fields (e.g. detection of rare variants). Here, we discuss how both the quality problems of low-diversity samples and sample bleeding are caused by incorrect detection of clusters on the flowcell during initial sequencing cycles. We propose simple software modifications (Long Template Protocol) that overcome this problem. We present experimental results showing that our Long Template Protocol remarkably increases data quality for low diversity samples, as compared with the standard analysis protocol; it also substantially reduces sample bleeding for all samples. For comprehensiveness, we also discuss and compare experimental results from alternative approaches to sequencing low diversity samples. First, we discuss how the low diversity problem, if caused by barcodes, can be avoided altogether at the barcode design stage. Second and third, we present modified guidelines, which are more stringent than the manufacturer’s, for mixing low diversity samples with diverse samples and lowering cluster density, which in our experience consistently produces high quality data from low diversity samples. Fourth and fifth, we present rescue strategies that can be applied when sequencing results in low quality data and when there is no more biological material available. In such cases, we propose that the flowcell be re-hybridized and sequenced again using our Long Template Protocol. Alternatively, we discuss how

  5. A systems approach to designing next generation vaccines: combining α-galactose modified antigens with nanoparticle platforms.

    PubMed

    Phanse, Yashdeep; Carrillo-Conde, Brenda R; Ramer-Tait, Amanda E; Broderick, Scott; Kong, Chang Sun; Rajan, Krishna; Flick, Ramon; Mandell, Robert B; Narasimhan, Balaji; Wannemuehler, Michael J

    2014-01-20

    Innovative vaccine platforms are needed to develop effective countermeasures against emerging and re-emerging diseases. These platforms should direct antigen internalization by antigen presenting cells and promote immunogenic responses. This work describes an innovative systems approach combining two novel platforms, αGalactose (αGal)-modification of antigens and amphiphilic polyanhydride nanoparticles as vaccine delivery vehicles, to rationally design vaccine formulations. Regimens comprising soluble αGal-modified antigen and nanoparticle-encapsulated unmodified antigen induced a high titer, high avidity antibody response with broader epitope recognition of antigenic peptides than other regimens. Proliferation of antigen-specific CD4+ T cells was also enhanced compared to a traditional adjuvant. Combining the technology platforms and augmenting immune response studies with peptide arrays and informatics analysis provides a new paradigm for rational, systems-based design of next generation vaccine platforms against emerging and re-emerging pathogens.

  6. A systems approach to designing next generation vaccines: combining α-galactose modified antigens with nanoparticle platforms

    NASA Astrophysics Data System (ADS)

    Phanse, Yashdeep; Carrillo-Conde, Brenda R.; Ramer-Tait, Amanda E.; Broderick, Scott; Kong, Chang Sun; Rajan, Krishna; Flick, Ramon; Mandell, Robert B.; Narasimhan, Balaji; Wannemuehler, Michael J.

    2014-01-01

    Innovative vaccine platforms are needed to develop effective countermeasures against emerging and re-emerging diseases. These platforms should direct antigen internalization by antigen presenting cells and promote immunogenic responses. This work describes an innovative systems approach combining two novel platforms, αGalactose (αGal)-modification of antigens and amphiphilic polyanhydride nanoparticles as vaccine delivery vehicles, to rationally design vaccine formulations. Regimens comprising soluble αGal-modified antigen and nanoparticle-encapsulated unmodified antigen induced a high titer, high avidity antibody response with broader epitope recognition of antigenic peptides than other regimens. Proliferation of antigen-specific CD4+ T cells was also enhanced compared to a traditional adjuvant. Combining the technology platforms and augmenting immune response studies with peptide arrays and informatics analysis provides a new paradigm for rational, systems-based design of next generation vaccine platforms against emerging and re-emerging pathogens.

  7. Next-Generation Sequencing in the Understanding of Kaposi’s Sarcoma-Associated Herpesvirus (KSHV) Biology

    PubMed Central

    Strahan, Roxanne; Uppal, Timsy; Verma, Subhash C.

    2016-01-01

    Non-Sanger-based novel nucleic acid sequencing techniques, referred to as Next-Generation Sequencing (NGS), provide a rapid, reliable, high-throughput, and massively parallel sequencing methodology that has improved our understanding of human cancers and cancer-related viruses. NGS has become a quintessential research tool for more effective characterization of complex viral and host genomes through its ever-expanding repertoire, which consists of whole-genome sequencing, whole-transcriptome sequencing, and whole-epigenome sequencing. These new NGS platforms provide a comprehensive and systematic genome-wide analysis of genomic sequences and a full transcriptional profile at single nucleotide resolution. When combined, these techniques help unlock the function of novel genes and the related pathways that contribute to the overall viral pathogenesis. Ongoing research in the field of virology endeavors to identify the role of various underlying mechanisms that control the regulation of the herpesvirus biphasic lifecycle in order to discover potential therapeutic targets and treatment strategies. In this review, we have compiled the most recent findings about the application of NGS in Kaposi’s sarcoma-associated herpesvirus (KSHV) biology, including identification of novel genomic features and whole-genome KSHV diversities, global gene regulatory network profiling for intricate transcriptome analyses, and surveying of epigenetic marks (DNA methylation, modified histones, and chromatin remodelers) during de novo, latent, and productive KSHV infections. PMID:27043613

  8. On-chip generation and demultiplexing of quantum correlated photons using a silicon-silica monolithic photonic integration platform.

    PubMed

    Matsuda, Nobuyuki; Karkus, Peter; Nishi, Hidetaka; Tsuchizawa, Tai; Munro, William J; Takesue, Hiroki; Yamada, Koji

    2014-09-22

    We demonstrate the generation and demultiplexing of quantum correlated photons on a monolithic photonic chip composed of silicon and silica-based waveguides. Photon pairs generated in a nonlinear silicon waveguide are successfully separated into two optical channels of an arrayed-waveguide grating fabricated on a silica-based waveguide platform.

  9. Using next generation sequencing approaches for the isolation of simple sequence repeats (SSR) in the plant sciences

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The application of next-generation sequencing (NGS) technologies for the development of simple sequence repeat (SSR) or microsatellite loci for genetic research in the botanical sciences is described. The major advantage of using NGS methods to isolate SSR loci is their ability to quickly and cost-e...

  10. Next generation sequencing (NGS) technologies and applications

    SciTech Connect

    Vuyisich, Momchilo

    2012-09-11

    NGS technology overview: (1) NGS library preparation - Nucleic acids extraction, Sample quality control, RNA conversion to cDNA, Addition of sequencing adapters, Quality control of library; (2) Sequencing - Clonal amplification of library fragments (except PacBio), Sequencing by synthesis, Data output (reads and quality); and (3) Data analysis - Read mapping, Genome assembly, Gene expression, Operon structure, sRNA discovery, and Epigenetic analyses.

  11. Sample Prep, Workflow Automation and Nucleic Acid Fractionation for Next Generation Sequencing

    SciTech Connect

    Roskey, Mark

    2010-06-03

    Mark Roskey of Caliper LifeSciences discusses how the company's technologies fit into the next generation sequencing workflow on June 3, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM

  12. Next-Generation Sequencing Tech Panel (7th Annual SFAF Meeting, 2012)

    SciTech Connect

    Rhodes, Michael; Fiske, Haley; Knight, Jim; Turner, Steve (Pacific Biosciences)

    2012-06-01

    Representatives from several next-generation sequencer manufacturers take part in a panel discussion at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  13. Data-driven sequence learning or search: What are the prerequisites for the generation of explicit sequence knowledge?

    PubMed Central

    Schwager, Sabine; Rünger, Dennis; Gaschler, Robert; Frensch, Peter A.

    2012-01-01

    In incidental sequence learning situations, there are often a number of participants who can report the task-inherent sequential regularity after training. Two kinds of mechanisms for the generation of this explicit knowledge have been proposed in the literature. First, a sequence representation may become explicit when its strength reaches a certain level (Cleeremans, 2006), and second, explicit knowledge may emerge as the result of a search process that is triggered by unexpected events that occur during task processing and require an explanation (the unexpected-event hypothesis; Haider & Frensch, 2009). Our study aimed at systematically exploring the contribution of both mechanisms to the generation of explicit sequence knowledge in an incidental learning situation. We varied the amount of specific sequence training and inserted unexpected events into a 6-choice serial reaction time task. Results support the unexpected-event view, as the generation of explicit sequence knowledge could not be predicted by the representation strength acquired through implicit sequence learning. Rather, sequence detection turned out to be more likely when participants were shifted to the fixed repeating sequence after training than when practicing one and the same fixed sequence without interruption. The behavioral effects of representation strength appear to be related to the effectiveness of unexpected changes in performance as triggers of a controlled search. PMID:22723812

  14. Generation of droplets to serpentine threads on a rotating compact-disk platform

    NASA Astrophysics Data System (ADS)

    Kar, Shantimoy; Joshi, Sumit; Chaudhary, Kaustav; Maiti, Tapas Kumar; Chakraborty, Suman

    2015-12-01

    We generate stable monodisperse droplets of nano-liter volumes and long serpentine liquid threads in a single, simple "Y"-shaped microchannel mounted on a rotationally actuated lab-on-a-compact-disk platform. Exploitation of Coriolis force offers versatile modus operandi of the present setup, without involving any design complications. Based on the fundamental understanding and subsequent analysis, we present scaling theories consistent with the experimental observations. We also outline specific applications of this technique, in the biological as well as in the physical domain, including digital polymerase chain reaction (PCR), controlled release of medical components, digital counting of colony forming units, hydrogel engineering, optical sensors and scaffolds for living tissues, to name a few.

  15. Integrating chemical mutagenesis and whole genome sequencing as a platform for forward and reverse genetic analysis of Chlamydia

    PubMed Central

    Kokes, Marcela; Dunn, Joe Dan; Granek, Joshua A.; Nguyen, Bidong D.; Barker, Jeffrey R.; Valdivia, Raphael H.; Bastidas, Robert J.

    2015-01-01

    Gene inactivation by transposon insertion or allelic exchange is a powerful approach to probe gene function. Unfortunately, many microbes, including Chlamydia, are not amenable to routine molecular genetic manipulations. Here we describe an arrayed library of chemically-induced mutants of the genetically-intransigent pathogen Chlamydia trachomatis, in which all mutations have been identified by whole genome sequencing, providing a platform for reverse genetic applications. An analysis of possible loss-of-function mutations in the collection uncovered plasticity in the central metabolic properties of this obligate intracellular pathogen. We also describe the use of the library in a forward genetic screen that identified InaC as a bacterial factor that binds host ARF and 14-3-3 proteins to modulate F-actin assembly and Golgi redistribution around the pathogenic vacuole. This work provides a robust platform for reverse and forward genetic approaches in Chlamydia and should serve as a valuable resource to the community. PMID:25920978

  16. A microfluidic platform for size-dependent generation of droplet interface bilayer networks on rails

    PubMed Central

    Carreras, P.; Elani, Y.; Law, R. V.; Brooks, N. J.; Seddon, J. M.; Ces, O.

    2015-01-01

    Droplet interface bilayer (DIB) networks are emerging as a cornerstone technology for the bottom-up construction of cell-like and tissue-like structures and bio-devices. They are an exciting and versatile model-membrane platform, seeing increasing use in the disciplines of synthetic biology, chemical biology, and membrane biophysics. DIBs are formed when lipid-coated water-in-oil droplets are brought together; oil is excluded from the interface, resulting in a bilayer. Perhaps the greatest feature of the DIB platform is the ability to generate bilayer networks by connecting multiple droplets together, which can in turn be used in applications ranging from tissue mimics and multicellular models to bio-devices. For such applications, the construction and release of DIB networks of defined size and composition on-demand is crucial. We have developed a droplet-based microfluidic method for the generation of different sized DIB networks (300–1500 pl droplets) on-chip. We do this by employing a droplet-on-rails strategy where droplets are guided down designated paths of a chip with the aid of microfabricated grooves or “rails,” and droplets of set sizes are selectively directed to specific rails using auxiliary flows. In this way we can uniquely produce parallel bilayer networks of defined sizes. By trapping several droplets in a rail, extended DIB networks containing up to 20 sequential bilayers could be constructed. The trapped DIB arrays can be composed of different lipid types and can be released on-demand and regenerated within seconds. We show that chemical signals can be propagated across the bio-network by transplanting enzymatic reaction cascades for inter-droplet communication. PMID:26759638

  17. A microfluidic platform for size-dependent generation of droplet interface bilayer networks on rails.

    PubMed

    Carreras, P; Elani, Y; Law, R V; Brooks, N J; Seddon, J M; Ces, O

    2015-11-01

    Droplet interface bilayer (DIB) networks are emerging as a cornerstone technology for the bottom-up construction of cell-like and tissue-like structures and bio-devices. They are an exciting and versatile model-membrane platform, seeing increasing use in the disciplines of synthetic biology, chemical biology, and membrane biophysics. DIBs are formed when lipid-coated water-in-oil droplets are brought together; oil is excluded from the interface, resulting in a bilayer. Perhaps the greatest feature of the DIB platform is the ability to generate bilayer networks by connecting multiple droplets together, which can in turn be used in applications ranging from tissue mimics and multicellular models to bio-devices. For such applications, the construction and release of DIB networks of defined size and composition on-demand is crucial. We have developed a droplet-based microfluidic method for the generation of different sized DIB networks (300-1500 pl droplets) on-chip. We do this by employing a droplet-on-rails strategy where droplets are guided down designated paths of a chip with the aid of microfabricated grooves or "rails," and droplets of set sizes are selectively directed to specific rails using auxiliary flows. In this way we can uniquely produce parallel bilayer networks of defined sizes. By trapping several droplets in a rail, extended DIB networks containing up to 20 sequential bilayers could be constructed. The trapped DIB arrays can be composed of different lipid types and can be released on-demand and regenerated within seconds. We show that chemical signals can be propagated across the bio-network by transplanting enzymatic reaction cascades for inter-droplet communication. PMID:26759638

  18. Generation and Characterization of an IgG4 Monomeric Fc Platform

    PubMed Central

    Shan, Lu; Colazet, Magali; Rosenthal, Kim L.; Yu, Xiang-Qing; Bee, Jared S.; Ferguson, Andrew; Damschroder, Melissa M.; Wu, Herren; Dall’Acqua, William F.; Tsui, Ping

    2016-01-01

    The immunoglobulin Fc region is a homodimer consisting of two sets of CH2 and CH3 domains and has been exploited to generate two-arm protein fusions with high expression yields, simplified purification processes and extended serum half-life. However, attempts to generate one-arm fusion proteins with monomeric Fc, with one set of CH2 and CH3 domains, are often plagued with challenges such as weakened binding to FcRn or partial monomer formation. Here, we demonstrate the generation of a stable IgG4 Fc monomer with a unique combination of mutations at the CH3-CH3 interface using rational design combined with in vitro evolution methodologies. In addition to size-exclusion chromatography and analytical ultracentrifugation, we used multi-angle light scattering (MALS) to show that the engineered Fc monomer exhibits excellent monodispersity. Furthermore, crystal structure analysis (PDB ID: 5HVW) reveals monomeric properties supported by disrupted interactions at the CH3-CH3 interface. Monomeric Fc fusions with Fab or scFv achieved FcRn binding and serum half-life comparable to wildtype IgG. These results demonstrate that this monomeric IgG4 Fc is a promising therapeutic platform to extend the serum half-life of proteins in a monovalent format. PMID:27479095

  19. Generation of a Transcriptome in a Model Lepidopteran Pest, Heliothis virescens, Using Multiple Sequencing Strategies for Profiling Midgut Gene Expression

    PubMed Central

    Perera, Omaththage P.; Shelby, Kent S.; Popham, Holly J. R.; Gould, Fred; Adang, Michael J.; Jurat-Fuentes, Juan Luis

    2015-01-01

    Heliothine pests such as the tobacco budworm, Heliothis virescens (F.), pose a significant threat to production of a variety of crops and ornamental plants and are models for developmental and physiological studies. The efforts to develop new control measures for H. virescens, as well as its use as a relevant biological model, are hampered by a lack of molecular resources. The present work demonstrates the utility of next-generation sequencing technologies for rapid molecular resource generation for this species, which lacks a sequenced genome. In order to amass a de novo transcriptome for this moth, transcript sequences generated from Illumina, Roche 454, and Sanger sequencing platforms were merged into a single de novo transcriptome assembly. This pooling strategy allowed a thorough sampling of transcripts produced under diverse environmental conditions, developmental stages, tissues, and infections with entomopathogens used for biological control, to provide the most complete transcriptome to date for this species. Over 138 million reads from the three platforms were assembled into the final set of 63,648 contigs. Of these, 29,978 had significant BLAST scores indicating orthologous relationships to transcripts of other insect species, with the top-hit species being the monarch butterfly (Danaus plexippus) and silkworm (Bombyx mori). Among identified H. virescens orthologs were immune effectors, signal transduction pathways, olfactory receptors, hormone biosynthetic pathways, peptide hormones and their receptors, digestive enzymes, and insecticide resistance enzymes. As an example, we demonstrate the utility of this transcriptomic resource to study gene expression profiling of larval midguts and detect transcripts of putative Bacillus thuringiensis (Bt) Cry toxin receptors. The substantial molecular resources described in this study will facilitate development of H. virescens as a relevant biological model for functional genomics and for new biological

  20. Generation of a Transcriptome in a Model Lepidopteran Pest, Heliothis virescens, Using Multiple Sequencing Strategies for Profiling Midgut Gene Expression.

    PubMed

    Perera, Omaththage P; Shelby, Kent S; Popham, Holly J R; Gould, Fred; Adang, Michael J; Jurat-Fuentes, Juan Luis

    2015-01-01

    Heliothine pests such as the tobacco budworm, Heliothis virescens (F.), pose a significant threat to production of a variety of crops and ornamental plants and are models for developmental and physiological studies. The efforts to develop new control measures for H. virescens, as well as its use as a relevant biological model, are hampered by a lack of molecular resources. The present work demonstrates the utility of next-generation sequencing technologies for rapid molecular resource generation from this species, which lacks a sequenced genome. In order to amass a de novo transcriptome for this moth, transcript sequences generated from Illumina, Roche 454, and Sanger sequencing platforms were merged into a single de novo transcriptome assembly. This pooling strategy allowed a thorough sampling of transcripts produced under diverse environmental conditions, developmental stages, tissues, and infections with entomopathogens used for biological control, to provide the most complete transcriptome to date for this species. Over 138 million reads from the three platforms were assembled into the final set of 63,648 contigs. Of these, 29,978 had significant BLAST scores indicating orthologous relationships to transcripts of other insect species, with the top-hit species being the monarch butterfly (Danaus plexippus) and silkworm (Bombyx mori). Among identified H. virescens orthologs were immune effectors, signal transduction pathways, olfactory receptors, hormone biosynthetic pathways, peptide hormones and their receptors, digestive enzymes, and insecticide resistance enzymes. As an example, we demonstrate the utility of this transcriptomic resource to study gene expression profiling of larval midguts and detect transcripts of putative Bacillus thuringiensis (Bt) Cry toxin receptors. The substantial molecular resources described in this study will facilitate development of H. virescens as a relevant biological model for functional genomics and for new biological

  1. Whole genome sequencing of enriched chloroplast DNA using the Illumina GAII platform

    PubMed Central

    2010-01-01

    Background: Complete chloroplast genome sequences provide a valuable source of molecular markers for studies in molecular ecology and evolution of plants. To obtain complete genome sequences, recent studies have made use of the polymerase chain reaction to amplify overlapping fragments from conserved gene loci. However, this approach is time consuming and can be more difficult to implement where gene organisation differs among plants. An alternative approach is to first isolate chloroplasts and then use the capacity of high-throughput sequencing to obtain complete genome sequences. We report our findings from studies of the latter approach, which used a simple chloroplast isolation procedure, multiply-primed rolling circle amplification of chloroplast DNA, Illumina Genome Analyzer II sequencing, and de novo assembly of paired-end sequence reads. Results: A modified rapid chloroplast isolation protocol was used to obtain plant DNA that was enriched for chloroplast DNA, but nevertheless contained nuclear and mitochondrial DNA. Multiply-primed rolling circle amplification of this mixed template produced sufficient quantities of chloroplast DNA, even when the amount of starting material was small, and improved the template quality for Illumina Genome Analyzer II (hereafter Illumina GAII) sequencing. We demonstrate, using independent samples of karaka (Corynocarpus laevigatus), that there is high fidelity in the sequence obtained from this template. Although less than 20% of our sequenced reads could be mapped to the chloroplast genome, it was relatively easy to assemble complete chloroplast genome sequences from the mixture of nuclear, mitochondrial and chloroplast reads. Conclusions: We report successful whole genome sequencing of chloroplast DNA from karaka, obtained efficiently and with high fidelity. PMID:20920211

  2. Complete genome sequence of southern tomato virus identified from China using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of a double-stranded RNA (dsRNA) virus, southern tomato virus (STV), infecting tomatoes in China was elucidated using small RNA deep sequencing. The identified STV_CN12 shares 99% sequence identity with other isolates from Mexico, France, Spain, and the U.S. This is the first report ...

  3. Complete Genome Sequence of Southern tomato virus Identified in China Using Next-Generation Sequencing

    PubMed Central

    Padmanabhan, Chellappan; Zheng, Yi; Li, Rugang; Sun, Shu-E; Zhang, Deyong; Liu, Yong; Fei, Zhangjun

    2015-01-01

    The complete genome sequence of Southern tomato virus (STV), a double-stranded RNA virus that affects tomato in China, was determined using small RNA deep sequencing. This Chinese isolate shares 99% sequence identity to other isolates from Mexico, France, Spain, and the United States. This is the first report of STV infecting tomatoes in Asia. PMID:26494671

  4. Identification and Characterization of Microsatellite Loci in Maqui (Aristotelia chilensis [Molina] Stunz) Using Next-Generation Sequencing (NGS).

    PubMed

    Bastías, Adriana; Correa, Francisco; Rojas, Pamela; Almada, Rubén; Muñoz, Carlos; Sagredo, Boris

    2016-01-01

    Maqui (Aristotelia chilensis [Molina] Stunz) is a small dioecious tree native to South America with edible fruit characterized by very high antioxidant capacity and anthocyanin content. To preserve maqui as a genetic resource it is essential to study its genetic diversity. However, the complete genome is unknown and only a few gene sequences are available in databases. Simple sequence repeat (SSR) markers, which are neutral, co-dominant, reproducible and highly variable, are desirable to support genetic studies in maqui populations. By identifying and characterizing microsatellite loci from a maqui genotype using 454 sequencing technology, we developed a set of SSRs for this species. Obtaining a total of 165,043 shotgun genome sequences, with an average read length of 387 bases, we covered 64 Mb of the maqui genome. Reads were assembled into 4,832 contigs, while 98,546 reads remained as singletons, generating a total of 103,378 consensus genomic sequences. A total of 24,494 SSR maqui markers were identified. Of them, 15,950 SSR maqui markers were classified as perfect. The most common SSR motifs were dinucleotide (31%), followed by tetranucleotide (26%) and trinucleotide motifs (24%). The motif AG/CT (28.4%) was the most abundant, while the motif AC (89 bp) was the largest. Eleven polymorphic SSRs were selected and used to analyze a population of 40 maqui genotypes. Polymorphism information content (PIC) ranged from 0.117 to 0.82, with an average of 0.58. No significant groups were observed in the maqui population, showing a panmictic genetic structure. In addition, we also predicted 11,150 putative genes and 3 microRNAs (miRNAs) in maqui sequences. These results, including partial sequences of genes, some miRNAs and SSR markers from high-throughput next-generation sequencing (NGS) of maqui genomic DNA, constitute the first platform to undertake genetic and molecular studies of this important species. PMID:27459734
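
    The two quantities this abstract relies on, perfect SSR detection and polymorphism information content (PIC), can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the regular-expression approach, and the thresholds (`min_repeats`, motif lengths of 2 to 4 bases) are assumptions chosen for illustration. PIC is computed with the standard Botstein formula, PIC = 1 - sum(p_i^2) - sum over pairs of 2 p_i^2 p_j^2.

```python
import re
from itertools import combinations


def find_perfect_ssrs(seq, min_repeats=5, min_motif=2, max_motif=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Hypothetical helper: thresholds are illustrative, not taken from the paper.
    """
    # Lazy motif match so the shortest repeating unit is reported.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAA...
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


def pic(allele_freqs):
    """Polymorphism information content from allele frequencies summing to 1."""
    homozygosity = sum(p * p for p in allele_freqs)
    cross = sum(2 * p * p * q * q for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross


# Raw sequence yield implied by the abstract: reads x average read length.
total_bases = 165_043 * 387

ssrs = find_perfect_ssrs("TTC" + "AG" * 6 + "CCGT")
pic_two_equal_alleles = pic([0.5, 0.5])
```

    The read count multiplied by the average read length gives just under 64 million bases, which matches the 64 Mb figure quoted in the abstract.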

  5. Identification and Characterization of Microsatellite Loci in Maqui (Aristotelia chilensis [Molina] Stunz) Using Next-Generation Sequencing (NGS)

    PubMed Central

    Bastías, Adriana; Correa, Francisco; Rojas, Pamela; Almada, Rubén; Muñoz, Carlos; Sagredo, Boris

    2016-01-01

    Maqui (Aristotelia chilensis [Molina] Stunz) is a small dioecious tree native to South America with edible fruit characterized by very high antioxidant capacity and anthocyanin content. To preserve maqui as a genetic resource it is essential to study its genetic diversity. However, the complete genome is unknown and only a few gene sequences are available in databases. Simple sequence repeat (SSR) markers, which are neutral, co-dominant, reproducible and highly variable, are desirable to support genetic studies in maqui populations. By identifying and characterizing microsatellite loci from a maqui genotype using 454 sequencing technology, we developed a set of SSRs for this species. Obtaining a total of 165,043 shotgun genome sequences, with an average read length of 387 bases, we covered 64 Mb of the maqui genome. Reads were assembled into 4,832 contigs, while 98,546 reads remained as singletons, generating a total of 103,378 consensus genomic sequences. A total of 24,494 SSR maqui markers were identified. Of them, 15,950 SSR maqui markers were classified as perfect. The most common SSR motifs were dinucleotide (31%), followed by tetranucleotide (26%) and trinucleotide motifs (24%). The motif AG/CT (28.4%) was the most abundant, while the motif AC (89 bp) was the largest. Eleven polymorphic SSRs were selected and used to analyze a population of 40 maqui genotypes. Polymorphism information content (PIC) ranged from 0.117 to 0.82, with an average of 0.58. No significant groups were observed in the maqui population, showing a panmictic genetic structure. In addition, we also predicted 11,150 putative genes and 3 microRNAs (miRNAs) in maqui sequences. These results, including partial sequences of genes, some miRNAs and SSR markers from high-throughput next-generation sequencing (NGS) of maqui genomic DNA, constitute the first platform to undertake genetic and molecular studies of this important species. PMID:27459734

  7. Capturing chloroplast variation for molecular ecology studies: a simple next generation sequencing approach applied to a rainforest tree

    PubMed Central

    2013-01-01

    Background: With high quantity and quality data production and low cost, next generation sequencing has the potential to provide new opportunities for plant phylogeographic studies on single and multiple species. Here we present an approach for in silico chloroplast DNA assembly and single nucleotide polymorphism detection from short-read shotgun sequencing. The approach is simple and effective and can be implemented using standard bioinformatic tools. Results: The chloroplast genome of Toona ciliata (Meliaceae), 159,514 base pairs long, was assembled from shotgun sequencing on the Illumina platform using de novo assembly of contigs. To evaluate its practicality, value and quality, we compared the short read assembly with an assembly completed using 454 data obtained after chloroplast DNA isolation. Sanger sequence verifications indicated that the Illumina dataset outperformed the longer read 454 data. Pooling of several individuals during preparation of the shotgun library enabled detection of informative chloroplast SNP markers. Following validation, we used the identified SNPs for a preliminary phylogeographic study of T. ciliata in Australia and to confirm low diversity across the distribution. Conclusions: Our approach provides a simple method for construction of whole chloroplast genomes from shotgun sequencing of whole genomic DNA using short-read data and no available closely related reference genome (e.g. from the same species or genus). The high coverage of Illumina sequence data also renders this method appropriate for multiplexing and SNP discovery and therefore a useful approach for landscape level studies of evolutionary ecology. PMID:23497206

  8. Spread-spectrum communications using sequences generated by phase filters

    NASA Astrophysics Data System (ADS)

    Bouvet, M.

    The principal characteristic of spread-spectrum communications is that the signal spectrum is extended in order to combat jammers and other interference. The 'noise-like' emitted signal must have a power spectral density that is as flat as possible. It is shown that the impulse response of an ARMA phase filter can be considered an infinite sequence with this desirable spectral property. Such sequences are studied as alternatives for spread-spectrum communications signal design. Characteristics of these signals, such as their autocorrelation, spectrum, and cross-correlation, are investigated. Some comparisons with other pseudorandom sequences are given.
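
    The flat-spectrum property can be checked numerically with a short sketch. It assumes that "phase filter" here means an all-pass filter, and it uses the simplest first-order case, H(z) = (-a + z^-1) / (1 - a z^-1) with |a| < 1; the coefficient value, the truncation length, and the function names are illustrative choices, not details from the paper.

```python
import cmath
import math


def allpass_impulse_response(a, n_samples):
    """Impulse response of H(z) = (-a + z^-1) / (1 - a z^-1), |a| < 1."""
    h = []
    x_prev = 0.0
    y_prev = 0.0
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0
        # Difference equation: y[n] = -a x[n] + x[n-1] + a y[n-1]
        y = -a * x + x_prev + a * y_prev
        h.append(y)
        x_prev, y_prev = x, y
    return h


def magnitude_spectrum(h, n_freqs=16):
    """|H(e^{jw})| sampled at n_freqs frequencies in [0, pi)."""
    mags = []
    for k in range(n_freqs):
        w = math.pi * k / n_freqs
        value = sum(hn * cmath.exp(-1j * w * n) for n, hn in enumerate(h))
        mags.append(abs(value))
    return mags


def autocorrelation(h, lag):
    """Deterministic autocorrelation of the sequence at a non-negative lag."""
    return sum(h[n] * h[n + lag] for n in range(len(h) - lag))


# The infinite response decays geometrically, so 64 samples are ample for a = 0.5.
h = allpass_impulse_response(0.5, 64)
mags = magnitude_spectrum(h)
```

    For this filter the magnitude response equals 1 at every frequency, so the sequence has unit energy and its autocorrelation vanishes at all nonzero lags, which is the behaviour a spreading sequence is meant to approximate.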

  9. Comparative analyses of two Geraniaceae transcriptomes using next-generation sequencing

    PubMed Central

    2013-01-01

    Background: Organelle genomes of Geraniaceae exhibit several unusual evolutionary phenomena compared to other angiosperm families including accelerated nucleotide substitution rates, widespread gene loss, reduced RNA editing, and extensive genomic rearrangements. Since most organelle-encoded proteins function in multi-subunit complexes that also contain nuclear-encoded proteins, it is likely that the atypical organellar phenomena affect the evolution of nuclear genes encoding organellar proteins. To begin to unravel the complex co-evolutionary interplay between organellar and nuclear genomes in this family, we sequenced nuclear transcriptomes of two species, Geranium maderense and Pelargonium x hortorum. Results: Normalized cDNA libraries of G. maderense and P. x hortorum were used for transcriptome sequencing. Five assemblers (MIRA, Newbler, SOAPdenovo, SOAPdenovo-trans [SOAPtrans], Trinity) and two next-generation technologies (454 and Illumina) were compared to determine the optimal transcriptome sequencing approach. Trinity provided the highest quality assembly of Illumina data with the deepest transcriptome coverage. An analysis to determine the amount of sequencing needed for de novo assembly revealed diminishing returns of coverage and quality with data sets larger than sixty million Illumina paired end reads for both species. The G. maderense and P. x hortorum transcriptomes contained fewer transcripts encoding the PLS subclass of PPR proteins relative to other angiosperms, consistent with reduced mitochondrial RNA editing activity in Geraniaceae. In addition, transcripts for all six plastid targeted sigma factors were identified in both transcriptomes, suggesting that one of the highly divergent rpoA-like ORFs in the P. x hortorum plastid genome is functional. Conclusions: The findings support the use of the Illumina platform and assemblers optimized for transcriptome assembly, such as Trinity or SOAPtrans, to generate high-quality de novo transcriptomes with

  10. Robust global microRNA expression profiling using next-generation sequencing technologies.

    PubMed

    Tam, Shirley; de Borja, Richard; Tsao, Ming-Sound; McPherson, John D

    2014-03-01

    miRNAs are a class of regulatory molecules involved in a wide range of cellular functions, including growth, development and apoptosis. Given their widespread roles in biological processes, understanding their patterns of expression in normal and diseased states will provide insights into the consequences of aberrant expression. As such, global miRNA expression profiling of human malignancies is gaining popularity in both basic and clinically driven research. However, to date, the majority of such analyses have used microarrays and quantitative real-time PCR. With the introduction of digital count technologies, such as next-generation sequencing (NGS) and the NanoString nCounter System, we have at our disposal many more options. To make effective use of these different platforms, the strengths and pitfalls of several miRNA profiling technologies were assessed, including a microarray platform, NGS technologies and the NanoString nCounter System. Overall, NGS had the greatest detection sensitivity, largest dynamic range of detection and highest accuracy in differential expression analysis when compared with gold-standard quantitative real-time PCR. Its technical reproducibility was high, with intrasample correlations of at least 0.95 in all cases. Furthermore, miRNA analysis of formalin-fixed, paraffin-embedded (FFPE) tissue was also evaluated. Expression profiles between paired frozen and FFPE samples were similar, with Spearman's ρ>0.93. These results show the superior sensitivity, accuracy and robustness of NGS for the comprehensive profiling of miRNAs in both frozen and FFPE tissues.

  11. Recurrent Network Models of Sequence Generation and Memory.

    PubMed

    Rajan, Kanaka; Harvey, Christopher D; Tank, David W

    2016-04-01

    Sequential activation of neurons is a common feature of network activity during a variety of behaviors, including working memory and decision making. Previous network models for sequences and memory emphasized specialized architectures in which a principled mechanism is pre-wired into their connectivity. Here we demonstrate that, starting from random connectivity and modifying a small fraction of connections, a largely disordered recurrent network can produce sequences and implement working memory efficiently. We use this process, called Partial In-Network Training (PINning), to model and match cellular resolution imaging data from the posterior parietal cortex during a virtual memory-guided two-alternative forced-choice task. Analysis of the connectivity reveals that sequences propagate by the cooperation between recurrent synaptic interactions and external inputs, rather than through feedforward or asymmetric connections. Together our results suggest that neural sequences may emerge through learning from largely unstructured network architectures.

  12. Beating heart on a chip: a novel microfluidic platform to generate functional 3D cardiac microtissues.

    PubMed

    Marsano, Anna; Conficconi, Chiara; Lemme, Marta; Occhetta, Paola; Gaudiello, Emanuele; Votta, Emiliano; Cerino, Giulia; Redaelli, Alberto; Rasponi, Marco

    2016-02-01

    In the past few years, microfluidic-based technology has developed microscale models recapitulating key physical and biological cues typical of the native myocardium. However, the application of controlled physiological uniaxial cyclic strains on a defined three-dimensional cellular environment is not yet possible. Two-dimensional mechanical stimulation was particularly investigated, neglecting the complex three-dimensional cell-cell and cell-matrix interactions. For this purpose, we developed a heart-on-a-chip platform, which recapitulates the physiologic mechanical environment experienced by cells in the native myocardium. The device includes an array of hanging posts to confine cell-laden gels, and a pneumatic actuation system to induce homogeneous uniaxial cyclic strains to the 3D cell constructs during culture. The device was used to generate mature and highly functional micro-engineered cardiac tissues (μECTs), from both neonatal rat and human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CM), strongly suggesting the robustness of our engineered cardiac micro-niche. Our results demonstrated that the cyclic strain was effectively highly uniaxial and uniformly transferred to cells in culture. As compared to control, stimulated μECTs showed superior cardiac differentiation, as well as electrical and mechanical coupling, owing to a remarkable increase in junction complexes. Mechanical stimulation also promoted early spontaneous synchronous beating and better contractile capability in response to electric pacing. Pacing analyses of hiPSC-CM constructs upon controlled administration of isoprenaline showed further promising applications of our platform in drug discovery, delivery and toxicology fields. The proposed heart-on-a-chip device represents a relevant step forward in the field, providing a standard functional three-dimensional cardiac model to possibly predict signs of hypertrophic changes in cardiac phenotype by mechanical and biochemical co-stimulation.

  14. A Framework for the Generation and Dissemination of Drop Size Distribution (DSD) Characteristics Using Multiple Platforms

    NASA Technical Reports Server (NTRS)

    Wolf, David B.; Tokay, Ali; Petersen, Walt; Williams, Christopher; Gatlin, Patrick; Wingo, Mathew

    2010-01-01

    Proper characterization of the precipitation drop size distribution (DSD) is integral to providing realistic and accurate space- and ground-based precipitation retrievals. Current technology allows for the development of DSD products from a variety of platforms, including disdrometers, vertical profilers and dual-polarization radars. Up to now, however, the dissemination or availability of such products has been limited to individual sites and/or field campaigns, in a variety of formats, often using inconsistent algorithms for computing the integral DSD parameters, such as the median- and mass-weighted drop diameter, total number concentration, liquid water content, rain rate, etc. We propose to develop a framework for the generation and dissemination of DSD characteristic products using a unified structure, capable of handling the myriad collection of disdrometers, profilers, and dual-polarization radar data currently available and to be collected during several upcoming GPM Ground Validation field campaigns. This DSD super-structure paradigm is an adaptation of the radar super-structure developed for NASA's Radar Software Library (RSL) and RSL_in_IDL. The goal is to provide the DSD products in a well-documented format, most likely NetCDF, along with tools to ingest and analyze the products. In so doing, we can develop a robust archive of DSD products from multiple sites and platforms, which should greatly benefit the development and validation of precipitation retrieval algorithms for GPM and other precipitation missions. An outline of this proposed framework will be provided as well as a discussion of the algorithms used to calculate the DSD parameters.
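
    As an illustration of the integral DSD parameters named above, the following sketch computes total number concentration, liquid water content, mass-weighted mean diameter, and rain rate from a binned distribution. It is not the proposed framework's algorithm: the function names and the input layout are hypothetical, and the fall-speed relation (the Atlas et al. 1973 fit, v = 9.65 - 10.3 exp(-0.6 D) in m/s with D in mm) is an assumed choice that real products may replace.

```python
import math


def fall_speed(d_mm):
    """Terminal fall speed in m/s for drop diameter in mm (assumed Atlas-type fit)."""
    return max(0.0, 9.65 - 10.3 * math.exp(-0.6 * d_mm))


def dsd_parameters(diam_mm, conc, width_mm):
    """Integral parameters of a binned DSD.

    diam_mm:  bin-centre diameters (mm)
    conc:     N(D) per bin (m^-3 mm^-1)
    width_mm: bin widths (mm)
    """
    bins = list(zip(diam_mm, conc, width_mm))
    m0 = sum(n * w for d, n, w in bins)            # zeroth moment
    m3 = sum(n * d ** 3 * w for d, n, w in bins)   # third moment
    m4 = sum(n * d ** 4 * w for d, n, w in bins)   # fourth moment
    flux = sum(n * d ** 3 * fall_speed(d) * w for d, n, w in bins)
    return {
        "total_concentration": m0,                           # m^-3
        "liquid_water_content": math.pi / 6 * 1e-3 * m3,     # g m^-3, water at 1 g cm^-3
        "mass_weighted_diameter": m4 / m3 if m3 > 0 else float("nan"),  # mm
        "rain_rate": 6 * math.pi * 1e-4 * flux,              # mm h^-1
    }


# Synthetic exponential DSD, N(D) = N0 exp(-lam D), on 0.01 mm bins up to 20 mm.
n0, lam, width = 8000.0, 2.0, 0.01
diameters = [(i + 0.5) * width for i in range(2000)]
concentrations = [n0 * math.exp(-lam * d) for d in diameters]
params = dsd_parameters(diameters, concentrations, [width] * len(diameters))
```

    For an exponential distribution the mass-weighted mean diameter is 4/lam and the total concentration is N0/lam, which gives a quick sanity check on any implementation before it is applied to disdrometer or radar-derived bins.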

  15. Sequence stratigraphy and systems tract development of the Latemar platform, Middle Triassic of the dolomites: Outcrop calibration keyed by cycle stacking patterns

    SciTech Connect

    Goldhammer, R.K.; Dunn, P.A.; Harris, M.T.; Hardie, L.A.

    1991-03-01

    The Middle Triassic Latemar platform provides a seismic-scale outcrop example of an intact carbonate shelf-to-basin transition, ideal for integrating sequence stratigraphy with facies and cyclic stratigraphy. This subcircular, high-relief buildup records two third-order accommodation sequences within the platform interior: the lower Ladinian sequence and the upper Ladinian sequence. Sequence L1 developed atop a widespread, low-relief Middle Anisian carbonate bank (60 m thick). Underlying subtidal bank cycles thin upward into the basal, subaerial sequence boundary (type 1) reflecting decreasing third-order accommodation; above it, platform-interior facies of sequence L1 retrograde. This results in superimposition of Ladinian basinal and foreslope facies atop the underlying, horizontal, shallow-water bank along its periphery. The transgressive (TST) and highstand systems tract (HST) of sequence L1 (as well as L2) are marked by long-term, systematic vertical facies changes and variation in stacking patterns of aggradational high-frequency, 20 kyr cycles within the platform interior. The maximum flooding surface (MFS) is a marine hardground surface displaying evidence of very slow sedimentation and is the platform expression of the condensed section. A type 2 SB caps sequence L1, marked by an interval of vertically superimposed thin subaerial tepees; beneath this, high-frequency cycles are thinning-upward, and above they are thickening-upward. Only the transgressive systems tract of sequence L2 is preserved at the Latemar owing to late Ladinian-Early Carnian volcanism and tectonism which terminated carbonate platform deposition.

  16. FadE: whole genome methylation analysis for multiple sequencing platforms.

    PubMed

    Souaiaia, Tade; Zhang, Zheng; Chen, Ting

    2013-01-01

    DNA methylation plays a central role in genomic regulation and disease. Sodium bisulfite treatment (SBT) causes unmethylated cytosines to be sequenced as thymine, which allows methylation levels to be reflected in the number of 'C'-'C' alignments covering reference cytosines. Di-base color reads produced by Life Technologies' SOLiD sequencer provide unreliable results when translated to bases because single sequencing errors affect the downstream sequence. We describe FadE, an algorithm to accurately determine genome-wide methylation rates directly in color or nucleotide space. FadE uses SBT unmethylated and untreated data to determine background error rates and incorporate them into a model which uses Newton-Raphson optimization to estimate the methylation rate and provide a credible interval describing its distribution at every reference cytosine. We sequenced two slides of human fibroblast cell-line bisulfite-converted fragment library with the SOLiD sequencer to investigate genome-wide methylation levels. FadE reported widespread differences in methylation levels across CpG islands and a large number of differentially methylated regions adjacent to genes, which compares favorably to the results of an investigation on the same cell-line using nucleotide-space reads at higher coverage levels, suggesting that FadE is an accurate method to estimate genome-wide methylation with color or nucleotide reads. http://code.google.com/p/fade/.
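
    The general idea of estimating a methylation rate with Newton-Raphson while accounting for background errors can be sketched as follows. This is a deliberately simplified binomial model and not FadE's actual model: the function name, the two error parameters (`nonconversion`, `seq_error`), and their default values are assumptions for illustration, and no credible interval is computed. A read covering a reference cytosine is taken to show 'C' with probability p = a + b*m, where m is the methylation rate, a is the chance that an unmethylated cytosine still reads as 'C', and b = (1 - seq_error) - a.

```python
def estimate_methylation(c_reads, total_reads, nonconversion=0.01,
                         seq_error=0.01, max_iter=50, tol=1e-12):
    """Maximum-likelihood methylation rate at one cytosine via Newton-Raphson.

    Hypothetical simplified model: c_reads ~ Binomial(total_reads, a + b * m).
    """
    if total_reads <= 0 or not 0 <= c_reads <= total_reads:
        raise ValueError("need 0 <= c_reads <= total_reads and total_reads > 0")
    a = nonconversion               # P(read shows C | unmethylated)
    b = (1.0 - seq_error) - a       # change in P(C) between m = 0 and m = 1
    k, n = c_reads, total_reads
    eps = 1e-9                      # keep m strictly inside (0, 1)
    m = 0.5
    for _ in range(max_iter):
        p = a + b * m
        # First and second derivatives of the binomial log-likelihood in m.
        grad = b * (k / p - (n - k) / (1.0 - p))
        hess = -b * b * (k / p ** 2 + (n - k) / (1.0 - p) ** 2)
        m_new = min(1.0 - eps, max(eps, m - grad / hess))
        converged = abs(m_new - m) < tol
        m = m_new
        if converged:
            break
    return m


rate = estimate_methylation(30, 40)
rate_asymmetric = estimate_methylation(30, 40, nonconversion=0.02, seq_error=0.005)
rate_unmethylated = estimate_methylation(0, 40)
```

    In this toy model the estimate also has a closed form, (k/n - a) / b clipped to the unit interval, so the iteration mainly shows where background error rates enter the likelihood; a fuller model would need the iterative solution.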

  17. Enrichment of target sequences for next-generation sequencing applications in research and diagnostics.

    PubMed

    Altmüller, Janine; Budde, Birgit S; Nürnberg, Peter

    2014-02-01

    Targeted re-sequencing such as gene panel sequencing (GPS) has become very popular in medical genetics, both for research projects and in diagnostic settings. The technical principles of the different enrichment methods have been reviewed several times before; however, new enrichment products are constantly entering the market, and researchers are often puzzled about the requirement to take decisions about long-term commitments, both for the enrichment product and the sequencing technology. This review summarizes important considerations for the experimental design and provides helpful recommendations in choosing the best sequencing strategy for various research projects and diagnostic applications.

  18. Next generation sequencers: methods and applications in food-borne pathogens

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencers are able to produce millions of short sequence reads in a high-throughput, low-cost way. The emergence of these technologies has not only facilitated genome sequencing but also started to change the landscape of life sciences. This chapter will survey their methods and app...

  19. Applications and Case Studies of the Next-Generation Sequencing Technologies in Food, Nutrition and Agriculture.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next-generation sequencing technologies are able to produce high-throughput short sequence reads in a cost-effective fashion. The emergence of these technologies has not only facilitated genome sequencing but also changed the landscape of life sciences. Here I survey their major applications ranging...

  20. Analysis of Pre-Analytic Factors Affecting the Success of Clinical Next-Generation Sequencing of Solid Organ Malignancies

    PubMed Central

    Chen, Hui; Luthra, Rajyalakshmi; Goswami, Rashmi S.; Singh, Rajesh R.; Roy-Chowdhuri, Sinchita

    2015-01-01

    Application of next-generation sequencing (NGS) technology to routine clinical practice has enabled characterization of personalized cancer genomes to identify patients likely to have a response to targeted therapy. The proper selection of tumor sample for downstream NGS-based mutational analysis is critical to generate accurate results and to guide therapeutic intervention. However, multiple pre-analytic factors come into play in determining the success of NGS testing. In this review, we discuss pre-analytic requirements for AmpliSeq PCR-based sequencing using Ion Torrent Personal Genome Machine (PGM) (Life Technologies), an NGS platform that is often used by clinical laboratories for sequencing solid tumors because of its low input DNA requirement from formalin-fixed, paraffin-embedded tissue. The success of NGS mutational analysis is affected not only by the input DNA quantity but also by several other factors, including the specimen type, the DNA quality, and the tumor cellularity. Here, we review tissue requirements for solid tumor NGS-based mutational analysis, including procedure types, tissue types, tumor volume and fraction, decalcification, and treatment effects. PMID:26343728

  1. CDH1 mutations in gastric cancer patients from northern Brazil identified by Next-Generation Sequencing (NGS)

    PubMed Central

    El-Husny, Antonette; Raiol-Moraes, Milene; Amador, Marcos; Ribeiro-dos-Santos, André M.; Montagnini, André; Barbosa, Silvanira; Silva, Artur; Assumpção, Paulo; Ishak, Geraldo; Santos, Sidney; Pinto, Pablo; Cruz, Aline; Ribeiro-dos-Santos, Ândrea

    2016-01-01

    Gastric cancer is considered to be the fifth highest incident tumor worldwide and the third leading cause of cancer deaths. Developing regions report a higher number of sporadic cases, but there are only a few local studies related to hereditary cases of gastric cancer in Brazil to confirm this fact. CDH1 germline mutations have been described both in familial and sporadic cases, but there is only one recent molecular description of individuals from Brazil. In this study we performed Next Generation Sequencing (NGS) to assess CDH1 germline mutations in individuals who match the clinical criteria for Hereditary Diffuse Gastric Cancer (HDGC), or who exhibit very early diagnosis of gastric cancer. Among five probands we detected CDH1 germline mutations in two cases (40%). The mutation c.1023T > G was found in an HDGC family and the mutation c.1849G > A, which is nearly exclusive to African populations, was found in an early-onset case of gastric adenocarcinoma. The mutations described highlight the existence of gastric cancer cases caused by CDH1 germline mutations in northern Brazil, although such information is frequently ignored due to the existence of a large number of environmental factors locally. Our report represents the first CDH1 mutations in HDGC described from Brazil by an NGS platform. PMID:27192129

  2. CDH1 mutations in gastric cancer patients from northern Brazil identified by Next-Generation Sequencing (NGS).

    PubMed

    El-Husny, Antonette; Raiol-Moraes, Milene; Amador, Marcos; Ribeiro-Dos-Santos, André M; Montagnini, André; Barbosa, Silvanira; Silva, Artur; Assumpção, Paulo; Ishak, Geraldo; Santos, Sidney; Pinto, Pablo; Cruz, Aline; Ribeiro-Dos-Santos, Ândrea

    2016-05-13

    Gastric cancer is considered to be the fifth highest incident tumor worldwide and the third leading cause of cancer deaths. Developing regions report a higher number of sporadic cases, but there are only a few local studies related to hereditary cases of gastric cancer in Brazil to confirm this fact. CDH1 germline mutations have been described both in familial and sporadic cases, but there is only one recent molecular description of individuals from Brazil. In this study we performed Next Generation Sequencing (NGS) to assess CDH1 germline mutations in individuals who match the clinical criteria for Hereditary Diffuse Gastric Cancer (HDGC), or who exhibit very early diagnosis of gastric cancer. Among five probands we detected CDH1 germline mutations in two cases (40%). The mutation c.1023T > G was found in an HDGC family and the mutation c.1849G > A, which is nearly exclusive to African populations, was found in an early-onset case of gastric adenocarcinoma. The mutations described highlight the existence of gastric cancer cases caused by CDH1 germline mutations in northern Brazil, although such information is frequently ignored due to the existence of a large number of environmental factors locally. Our report represents the first CDH1 mutations in HDGC described from Brazil by an NGS platform. PMID:27192129

  3. CDH1 mutations in gastric cancer patients from northern Brazil identified by Next-Generation Sequencing (NGS).

    PubMed

    El-Husny, Antonette; Raiol-Moraes, Milene; Amador, Marcos; Ribeiro-Dos-Santos, André M; Montagnini, André; Barbosa, Silvanira; Silva, Artur; Assumpção, Paulo; Ishak, Geraldo; Santos, Sidney; Pinto, Pablo; Cruz, Aline; Ribeiro-Dos-Santos, Ândrea

    2016-05-13

    Gastric cancer is considered to be the fifth highest incident tumor worldwide and the third leading cause of cancer deaths. Developing regions report a higher number of sporadic cases, but there are only a few local studies related to hereditary cases of gastric cancer in Brazil to confirm this fact. CDH1 germline mutations have been described both in familial and sporadic cases, but there is only one recent molecular description of individuals from Brazil. In this study we performed Next Generation Sequencing (NGS) to assess CDH1 germline mutations in individuals who match the clinical criteria for Hereditary Diffuse Gastric Cancer (HDGC), or who exhibit very early diagnosis of gastric cancer. Among five probands we detected CDH1 germline mutations in two cases (40%). The mutation c.1023T > G was found in an HDGC family and the mutation c.1849G > A, which is nearly exclusive to African populations, was found in an early-onset case of gastric adenocarcinoma. The mutations described highlight the existence of gastric cancer cases caused by CDH1 germline mutations in northern Brazil, although such information is frequently ignored due to the existence of a large number of environmental factors locally. Our report represents the first CDH1 mutations in HDGC described from Brazil by an NGS platform.

  4. Big data challenges in bone research: genome-wide association studies and next-generation sequencing

    PubMed Central

    Alonso, Nerea; Lucas, Gavin; Hysi, Pirro

    2015-01-01

    Genome-wide association studies (GWAS) have been developed as a practical method to identify genetic loci associated with disease by scanning multiple markers across the genome. Significant advances in the genetics of complex diseases have been made owing to advances in genotyping technologies, the progress of projects such as HapMap and 1000G and the emergence of genetics as a collaborative discipline. Because of its great potential to be used in parallel by multiple collaborators, it is important to adhere to strict protocols assuring data quality and analyses. Quality control analyses must be applied to each sample and each single-nucleotide polymorphism (SNP). The software package PLINK is capable of performing the whole range of necessary quality control tests. Genotype imputation has also been developed to substantially increase the power of GWAS methodology. Imputation permits the investigation of associations at genetic markers that are not directly genotyped. Results of individual GWAS reports can be combined through meta-analysis. Finally, next-generation sequencing (NGS) has gained popularity in recent years through its capacity to analyse a much greater number of markers across the genome. Although NGS platforms are capable of examining a higher number of SNPs compared with GWA studies, the results obtained by NGS require careful interpretation, as their biological correlation is incompletely understood. In this article, we will discuss the basic features of such protocols. PMID:25709812

  5. Vy-PER: eliminating false positive detection of virus integration events in next generation sequencing data.

    PubMed

    Forster, Michael; Szymczak, Silke; Ellinghaus, David; Hemmrich, Georg; Rühlemann, Malte; Kraemer, Lars; Mucha, Sören; Wienbrandt, Lars; Stanulla, Martin; Franke, Andre

    2015-01-01

    Several pathogenic viruses such as hepatitis B and human immunodeficiency viruses may integrate into the host genome. These virus/host integrations are detectable using paired-end next generation sequencing. However, the low number of expected true virus integrations may be difficult to distinguish from the noise of many false positive candidates. Here, we propose a novel filtering approach that increases specificity without compromising sensitivity for virus/host chimera detection. Our detection pipeline termed Vy-PER (Virus integration detection bY Paired End Reads) outperforms existing similar tools in speed and accuracy. We analysed whole genome data from childhood acute lymphoblastic leukemia (ALL), which is characterised by genomic rearrangements and usually associated with radiation exposure. This analysis was motivated by the recently reported virus integrations at genomic rearrangement sites and association with chromosomal instability in liver cancer. However, as expected, our analysis of 20 tumour and matched germline genomes from ALL patients finds no significant evidence for integrations by known viruses. Nevertheless, our method eliminates 12,800 false positives per genome (80× coverage) and only our method detects singleton human-phiX174-chimeras caused by optical errors of the Illumina HiSeq platform. This high accuracy is useful for detecting low virus integration levels as well as non-integrated viruses. PMID:26166306

  6. Mutation based treatment recommendations from next generation sequencing data: a comparison of web tools

    PubMed Central

    Patel, Jaymin M.; Knopf, Joshua; Reiner, Eric; Bossuyt, Veerle; Epstein, Lianne; DiGiovanna, Michael; Chung, Gina; Silber, Andrea; Sanft, Tara; Hofstatter, Erin; Mougalian, Sarah; Abu-Khalaf, Maysa; Platt, James; Shi, Weiwei; Gershkovich, Peter; Hatzis, Christos; Pusztai, Lajos

    2016-01-01

    Interpretation of complex cancer genome data, generated by tumor target profiling platforms, is key for the success of personalized cancer therapy. How to draw therapeutic conclusions from tumor profiling results is not standardized and may vary among commercial and academically-affiliated recommendation tools. We performed targeted sequencing of 315 genes from 75 metastatic breast cancer biopsies using the FoundationOne assay. Results were run through 4 different web tools including the Drug-Gene Interaction Database (DGIdb), My Cancer Genome (MCG), Personalized Cancer Therapy (PCT), and cBioPortal, for drug and clinical trial recommendations. These recommendations were compared amongst each other and to those provided by FoundationOne. The identification of a gene as targetable varied across the different recommendation sources. Only 33% of cases had 4 or more sources recommend the same drug for at least one of the usually several altered genes found in tumor biopsies. These results indicate that further development and standardization of broadly applicable software tools that assist in our therapeutic interpretation of genomic data are needed. Existing algorithms for data acquisition, integration and interpretation will likely need to incorporate artificial intelligence tools to improve both content and real-time status. PMID:26980737

  7. Vy-PER: eliminating false positive detection of virus integration events in next generation sequencing data.

    PubMed

    Forster, Michael; Szymczak, Silke; Ellinghaus, David; Hemmrich, Georg; Rühlemann, Malte; Kraemer, Lars; Mucha, Sören; Wienbrandt, Lars; Stanulla, Martin; Franke, Andre

    2015-07-13

    Several pathogenic viruses such as hepatitis B and human immunodeficiency viruses may integrate into the host genome. These virus/host integrations are detectable using paired-end next generation sequencing. However, the low number of expected true virus integrations may be difficult to distinguish from the noise of many false positive candidates. Here, we propose a novel filtering approach that increases specificity without compromising sensitivity for virus/host chimera detection. Our detection pipeline termed Vy-PER (Virus integration detection bY Paired End Reads) outperforms existing similar tools in speed and accuracy. We analysed whole genome data from childhood acute lymphoblastic leukemia (ALL), which is characterised by genomic rearrangements and usually associated with radiation exposure. This analysis was motivated by the recently reported virus integrations at genomic rearrangement sites and association with chromosomal instability in liver cancer. However, as expected, our analysis of 20 tumour and matched germline genomes from ALL patients finds no significant evidence for integrations by known viruses. Nevertheless, our method eliminates 12,800 false positives per genome (80× coverage) and only our method detects singleton human-phiX174-chimeras caused by optical errors of the Illumina HiSeq platform. This high accuracy is useful for detecting low virus integration levels as well as non-integrated viruses.

  8. Vy-PER: eliminating false positive detection of virus integration events in next generation sequencing data

    PubMed Central

    Forster, Michael; Szymczak, Silke; Ellinghaus, David; Hemmrich, Georg; Rühlemann, Malte; Kraemer, Lars; Mucha, Sören; Wienbrandt, Lars; Stanulla, Martin; Franke, Andre

    2015-01-01

    Several pathogenic viruses such as hepatitis B and human immunodeficiency viruses may integrate into the host genome. These virus/host integrations are detectable using paired-end next generation sequencing. However, the low number of expected true virus integrations may be difficult to distinguish from the noise of many false positive candidates. Here, we propose a novel filtering approach that increases specificity without compromising sensitivity for virus/host chimera detection. Our detection pipeline termed Vy-PER (Virus integration detection bY Paired End Reads) outperforms existing similar tools in speed and accuracy. We analysed whole genome data from childhood acute lymphoblastic leukemia (ALL), which is characterised by genomic rearrangements and usually associated with radiation exposure. This analysis was motivated by the recently reported virus integrations at genomic rearrangement sites and association with chromosomal instability in liver cancer. However, as expected, our analysis of 20 tumour and matched germline genomes from ALL patients finds no significant evidence for integrations by known viruses. Nevertheless, our method eliminates 12,800 false positives per genome (80× coverage) and only our method detects singleton human-phiX174-chimeras caused by optical errors of the Illumina HiSeq platform. This high accuracy is useful for detecting low virus integration levels as well as non-integrated viruses. PMID:26166306

  9. PileLine: a toolbox to handle genome position information in next-generation sequencing studies

    PubMed Central

    2011-01-01

    Background: Genomic position (GP) files currently used in next-generation sequencing (NGS) studies are always difficult to manipulate due to their huge size and the lack of appropriate tools to properly manage them. The structure of these flat files is based on representing one line per position that has been covered by at least one aligned read, imposing significant restrictions from a computational performance perspective. Results: PileLine implements a flexible command-line toolkit providing specific support to the management, filtering, comparison and annotation of GP files produced by NGS experiments. PileLine tools are coded in Java and run on both UNIX (Linux, Mac OS) and Windows platforms. The set of tools comprising PileLine are designed to be memory efficient by performing fast seek on-disk operations over sorted GP files. Conclusions: Our novel toolbox has been extensively tested taking into consideration performance issues. It is publicly available at http://sourceforge.net/projects/pilelinetools under the GNU LGPL license. Full documentation including common use cases and guided analysis workflows is available at http://sing.ei.uvigo.es/pileline. PMID:21261974
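
    The idea of seeking on disk over a sorted position file, rather than loading it into memory, can be sketched as a binary search over byte offsets. This is not PileLine's code (PileLine is written in Java, and its file handling is not described in the abstract). The sketch assumes a hypothetical tab-separated file for a single sequence, sorted by an integer position in the first column, with no blank lines. The names find_position and _line_at_or_after are illustrative.

```python
import os


def _line_at_or_after(fh, offset):
    """Return the first complete line that starts at byte `offset` or later."""
    if offset > 0:
        # Step back one byte and discard through the next newline, so the
        # following read begins exactly on a line boundary >= offset.
        fh.seek(offset - 1)
        fh.readline()
    else:
        fh.seek(0)
    return fh.readline()


def find_position(path, target):
    """Look up `target` in a file sorted by integer position (first column).

    Only O(log file_size) short reads are issued; the file is never loaded
    into memory. Returns the matching line as text, or None if absent.
    """
    with open(path, "rb") as fh:
        lo, hi = 0, os.path.getsize(path)
        # Find the smallest offset whose next line has position >= target
        # (or is past the end of the file).
        while lo < hi:
            mid = (lo + hi) // 2
            line = _line_at_or_after(fh, mid)
            if not line or int(line.split(b"\t", 1)[0]) >= target:
                hi = mid
            else:
                lo = mid + 1
        line = _line_at_or_after(fh, lo)
    if line and int(line.split(b"\t", 1)[0]) == target:
        return line.decode().rstrip("\r\n")
    return None
```

    A multi-sequence file would need a composite sort key (sequence name, then position), which this sketch leaves out.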

  10. HLA genotyping in the clinical laboratory: comparison of next-generation sequencing methods.

    PubMed

    Profaizer, T; Lázár-Molnár, E; Close, D W; Delgado, J C; Kumánovics, A

    2016-07-01

    Implementation of human leukocyte antigen (HLA) genotyping by next-generation sequencing (NGS) in the clinical lab brings new challenges to the laboratories performing this testing. With the advent of commercially available HLA-NGS typing kits, labs must make numerous decisions concerning capital equipment and address labor considerations. Therefore, careful and unbiased evaluation of available methods is imperative. In this report, we compared our in-house developed HLA NGS typing with two commercially available kits from Illumina and Omixon using 10 International Histocompatibility Working Group (IHWG) and 36 clinical samples. Although all three methods employ long range polymerase chain reaction (PCR) and have been developed on the Illumina MiSeq platform, the methodologies for library preparation show significant variations. There was 100% typing concordance between all three methods at the first field when a HLA type could be assigned. Overall, HLA typing by NGS using in-house or commercially available methods is now feasible in clinical laboratories. However, technical variables such as hands-on time and indexing strategies are sufficiently different among these approaches to impact the workflow of the clinical laboratory. PMID:27524804

  11. Coinfection of Fusobacterium nucleatum and Actinomyces israelii in mastoiditis diagnosed by next-generation DNA sequencing.

    PubMed

    Salipante, Stephen J; Hoogestraat, Daniel R; Abbott, April N; SenGupta, Dhruba J; Cummings, Lisa A; Butler-Wu, Susan M; Stephens, Karen; Cookson, Brad T; Hoffman, Noah G

    2014-05-01

    Some bacterial infections involve potentially complex mixtures of species that can now be distinguished using next-generation DNA sequencing. We present a case of mastoiditis where Gram stain, culture, and molecular diagnosis were nondiagnostic or discrepant. Next-generation sequencing implicated coinfection of Fusobacterium nucleatum and Actinomyces israelii, resolving these diagnostic discrepancies.

  12. Autonomously generating operations sequences for a Mars Rover using AI-based planning

    NASA Technical Reports Server (NTRS)

    Sherwood, Rob; Mishkin, Andrew; Estlin, Tara; Chien, Steve; Backes, Paul; Cooper, Brian; Maxwell, Scott; Rabideau, Gregg

    2001-01-01

    This paper discusses a proof-of-concept prototype for ground-based automatic generation of validated rover command sequences from high-level science and engineering activities. This prototype is based on ASPEN, the Automated Scheduling and Planning Environment. This Artificial Intelligence (AI) based planning and scheduling system will automatically generate a command sequence that will execute within resource constraints and satisfy flight rules.

  13. Enhanced microbial diversity in the saliva microbiome induced by short-term probiotic intake revealed by 16S rRNA sequencing on the IonTorrent PGM platform.

    PubMed

    Dassi, Erik; Ballarini, Annalisa; Covello, Giuseppina; Quattrone, Alessandro; Jousson, Olivier; De Sanctis, Veronica; Bertorelli, Roberto; Denti, Michela Alessandra; Segata, Nicola

    2014-11-20

    Microbial communities populating several human body habitats are important determinants of human health. Cultivation-free community-wide approaches like bacterial 16S rRNA sequencing recently revolutionized the study of such human-associated microbial diversity, and the continuously decreasing cost/throughput ratio of current sequencing platforms is further enhancing the availability and effectiveness of microbiome research. The IonTorrent PGM platform is among the latest available commercial high-throughput sequencing tools, but it is just starting to be used for 16S rRNA surveys with only episodic assessments of its performance. We present here the first IonTorrent profiling of the human saliva microbiome collected from 12 healthy individuals. In this cohort, a subset of the volunteers was asked to take a probiotic product, in order to investigate its impact on the composition and the structure of the saliva microbiome. Analysis of the generated dataset suggests the suitability of the IonTorrent platform for 16S rRNA surveys, even though some platform-specific choices are required to optimize the consistency of the obtained bacterial profiles. Interestingly, we found a marked and statistically significant increase of the overall bacterial diversity in the saliva of individuals who received the probiotic product compared to the control group, suggesting a short-term effect of probiotic product administration on oral microbiome composition.
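
    The abstract does not say which measure of "overall bacterial diversity" was used. As one common possibility, the Shannon index can be computed from per-taxon read counts, as in the minimal sketch below. The function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("need at least one read")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    # Made-up 16S read counts per taxon for two saliva samples.
    even_sample = [25, 25, 25, 25]
    skewed_sample = [85, 5, 5, 5]
    print(shannon_diversity(even_sample), shannon_diversity(skewed_sample))
```

    A sample dominated by one taxon scores lower than an evenly spread one with the same number of taxa, which is the sense in which such an index captures diversity.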

  14. The development of next-generation sequencing assays for the mitochondrial genome and 108 nuclear genes associated with mitochondrial disorders.

    PubMed

    Dames, Shale; Chou, Lan-Szu; Xiao, Ye; Wayman, Tyler; Stocks, Jennifer; Singleton, Marc; Eilbeck, Karen; Mao, Rong

    2013-07-01

    Sanger sequencing of multigenic disorders can be technically challenging, time consuming, and prohibitively expensive. High-throughput next-generation sequencing (NGS) can provide a cost-effective method for sequencing targeted genes associated with multigenic disorders. We have developed a NGS clinical targeted gene assay for the mitochondrial genome and for 108 selected nuclear genes associated with mitochondrial disorders. Mitochondrial disorders have a reported incidence of 1 in 5000 live births, encompass a broad range of phenotypes, and are attributed to mutations in the mitochondrial and nuclear genomes. Approximately 20% of mitochondrial disorders result from mutations in mtDNA, with the remaining 80% found in nuclear genes that affect mtDNA levels or mitochondrion protein assembly. In our NGS approach, the 16,569-bp mtDNA is enriched by long-range PCR and the 108 nuclear genes (which represent 1301 amplicons and 680 kb) are enriched by RainDance emulsion PCR. Sequencing is performed on Illumina HiSeq 2000 or MiSeq platforms, and bioinformatics analysis is performed using commercial and in-house developed bioinformatics pipelines. A total of 16 validation and 13 clinical samples were examined. All previously reported variants associated with mitochondrial disorders were found in validation samples, and 5 of the 13 clinical samples were found to have mutations associated with mitochondrial disorders in either the mitochondrial genome or the 108 nuclear genes. All variants were confirmed by Sanger sequencing.

  15. Concatenated shift registers generating maximally spaced phase shifts of PN-sequences

    NASA Technical Reports Server (NTRS)

    Hurd, W. J.; Welch, L. R.

    1977-01-01

    A large class of linearly concatenated shift registers is shown to generate approximately maximally spaced phase shifts of pn-sequences, for use in pseudorandom number generation. A constructive method is presented for finding members of this class, for almost all degrees for which primitive trinomials exist. The sequences which result are not normally characterized by trinomial recursions, which is desirable since trinomial sequences can have some undesirable randomness properties.
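
    For orientation, the sketch below generates a pn-sequence from a plain trinomial recursion, the baseline case that the abstract contrasts with. It does not implement the concatenated-register construction, which the abstract does not detail. The example uses the primitive trinomial x^4 + x + 1, so the sequence has period 2^4 - 1 = 15. The function name and arguments are illustrative assumptions.

```python
def pn_sequence(degree, tap, length, seed=1):
    """Bits of the recursion a[n + degree] = a[n + tap] XOR a[n].

    This corresponds to the trinomial x^degree + x^tap + 1. The period is
    2^degree - 1 only when that trinomial is primitive; the caller is
    responsible for choosing such a pair.
    """
    state = [(seed >> i) & 1 for i in range(degree)]
    if not any(state):
        raise ValueError("seed must give a non-zero starting state")
    out = []
    for _ in range(length):
        out.append(state[0])
        new_bit = state[0] ^ state[tap]
        state = state[1:] + [new_bit]   # shift the register by one place
    return out


if __name__ == "__main__":
    seq = pn_sequence(4, 1, 30)
    print(seq[:15])
    # A phase shift is the same sequence started d steps later.
    d = 5
    print(seq[d:d + 15])
```

    Reading the same stream from several well-separated starting offsets is one simple way to picture the phase shifts mentioned in the abstract.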

  16. [Next-generation sequencing technologies and the application in microbiology--a review].

    PubMed

    Qin, Nan; Li, Dongfang; Yang, Ruifu

    2011-04-01

    Since its invention in the 1970s, nucleic acid sequencing technology has contributed tremendously to advances in genomics. The next-generation sequencing technologies, represented by HiSeq 2000 from Illumina, SOLiD from Applied Biosystems and 454 from Roche, re-energized the application of genomics. In this review, we first introduce the next-generation sequencing technologies and then describe their potential applications in the field of microbiology.

  17. NexGen Production – Sequencing and Analysis

    SciTech Connect

    Muzny, Donna

    2010-06-02

    Donna Muzny of the Baylor College of Medicine Human Genome Sequencing Center discusses next generation sequencing platforms and evaluating pipeline performance on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM.

  18. Structural variation detection using next-generation sequencing data: A comparative technical review.

    PubMed

    Guan, Peiyong; Sung, Wing-Kin

    2016-06-01

    Structural variations (SVs) are mutations in the genome at least fifty nucleotides in size. They contribute to the phenotypic differences among healthy individuals and can cause severe diseases, including cancers, by breaking or linking genes. Thus, it is crucial to systematically profile SVs in the genome. In the past decade, many next-generation sequencing (NGS)-based SV detection methods have been proposed due to the significant cost reduction of NGS experiments and their ability to detect SVs without bias at base-pair resolution. These SV detection methods vary in both sensitivity and specificity, since they use different SV-property-dependent and library-property-dependent features. As a result, predictions from different SV callers are often inconsistent. In addition, noise in the data (both platform-specific sequencing errors and artificial chimeric reads) impedes the specificity of SV detection. Poorly characterized regions in the human genome (e.g., repeat regions) greatly impact read mapping and in turn affect SV calling accuracy. Calling of complex SVs requires specialized SV callers. Apart from accuracy, the processing speed of an SV caller is another factor deciding its usability. Knowing the pros and cons of different SV calling techniques and the objectives of the biological study is essential for biologists and bioinformaticians to make informed decisions. This paper describes different components in the SV calling pipeline and reviews the techniques used by existing SV callers. Through a simulation study, we also demonstrate that library properties, especially insert size, greatly impact the sensitivity of different SV callers. We hope the community can benefit from this work both in designing new SV calling methods and in selecting the appropriate SV caller for specific biological studies. PMID:26845461

  19. Effort required to finish shotgun-generated genome sequences differs significantly among vertebrates

    PubMed Central

    2010-01-01

    Background The approaches for shotgun-based sequencing of vertebrate genomes are now well-established, and have resulted in the generation of numerous draft whole-genome sequence assemblies. In contrast, the process of refining those assemblies to improve contiguity and increase accuracy (known as 'sequence finishing') remains tedious, labor-intensive, and expensive. As a result, the vast majority of vertebrate genome sequences generated to date remain at a draft stage. Results To date, our genome sequencing efforts have focused on comparative studies of targeted genomic regions, requiring sequence finishing of large blocks of orthologous sequence (average size 0.5-2 Mb) from various subsets of 75 vertebrates. This experience has provided a unique opportunity to compare the relative effort required to finish shotgun-generated genome sequence assemblies from different species, which we report here. Importantly, we found that the sequence assemblies generated for the same orthologous regions from various vertebrates show substantial variation with respect to misassemblies and, in particular, the frequency and characteristics of sequence gaps. As a consequence, the work required to finish different species' sequences varied greatly. Application of the same standardized methods for finishing provided a novel opportunity to "assay" characteristics of genome sequences among many vertebrate species. It is important to note that many of the problems we have encountered during sequence finishing reflect unique architectural features of a particular vertebrate's genome, which in some cases may have important functional and/or evolutionary implications. Finally, based on our analyses, we have been able to improve our procedures to overcome some of these problems and to increase the overall efficiency of the sequence-finishing process, although significant challenges still remain. 
Conclusion Our findings have important implications for the eventual finishing of the draft whole

  20. A next-generation marker genotyping platform (AmpSeq) in heterozygous crops: a case study for marker-assisted selection in grapevine.

    PubMed

    Yang, Shanshan; Fresnedo-Ramírez, Jonathan; Wang, Minghui; Cote, Linda; Schweitzer, Peter; Barba, Paola; Takacs, Elizabeth M; Clark, Matthew; Luby, James; Manns, David C; Sacks, Gavin; Mansfield, Anna Katharine; Londo, Jason; Fennell, Anne; Gadoury, David; Reisch, Bruce; Cadle-Davidson, Lance; Sun, Qi

    2016-01-01

    Marker-assisted selection (MAS) is often employed in crop breeding programs to accelerate and enhance cultivar development, via selection during the juvenile phase and parental selection prior to crossing. Next-generation sequencing and its derivative technologies have been used for genome-wide molecular marker discovery. To bridge the gap between marker development and MAS implementation, this study developed a novel practical strategy with a semi-automated pipeline that incorporates trait-associated single nucleotide polymorphism marker discovery, low-cost genotyping through amplicon sequencing (AmpSeq) and decision making. The results document the development of a MAS package derived from genotyping-by-sequencing using three traits (flower sex, disease resistance and acylated anthocyanins) in grapevine breeding. The vast majority of sequence reads (⩾99%) were from the targeted regions. Across 380 individuals and up to 31 amplicons sequenced in each lane of MiSeq data, most amplicons (83 to 87%) had <10% missing data, and read depth had a median of 220-244×. Several strengths of the AmpSeq platform that make this approach of broad interest in diverse crop species include accuracy, flexibility, speed, high-throughput, low-cost and easily automated analysis. PMID:27257505

  1. A next-generation marker genotyping platform (AmpSeq) in heterozygous crops: a case study for marker-assisted selection in grapevine.

    PubMed

    Yang, Shanshan; Fresnedo-Ramírez, Jonathan; Wang, Minghui; Cote, Linda; Schweitzer, Peter; Barba, Paola; Takacs, Elizabeth M; Clark, Matthew; Luby, James; Manns, David C; Sacks, Gavin; Mansfield, Anna Katharine; Londo, Jason; Fennell, Anne; Gadoury, David; Reisch, Bruce; Cadle-Davidson, Lance; Sun, Qi

    2016-01-01

    Marker-assisted selection (MAS) is often employed in crop breeding programs to accelerate and enhance cultivar development, via selection during the juvenile phase and parental selection prior to crossing. Next-generation sequencing and its derivative technologies have been used for genome-wide molecular marker discovery. To bridge the gap between marker development and MAS implementation, this study developed a novel practical strategy with a semi-automated pipeline that incorporates trait-associated single nucleotide polymorphism marker discovery, low-cost genotyping through amplicon sequencing (AmpSeq) and decision making. The results document the development of a MAS package derived from genotyping-by-sequencing using three traits (flower sex, disease resistance and acylated anthocyanins) in grapevine breeding. The vast majority of sequence reads (⩾99%) were from the targeted regions. Across 380 individuals and up to 31 amplicons sequenced in each lane of MiSeq data, most amplicons (83 to 87%) had <10% missing data, and read depth had a median of 220-244×. Several strengths of the AmpSeq platform that make this approach of broad interest in diverse crop species include accuracy, flexibility, speed, high-throughput, low-cost and easily automated analysis.

  2. A next-generation marker genotyping platform (AmpSeq) in heterozygous crops: a case study for marker-assisted selection in grapevine

    PubMed Central

    Yang, Shanshan; Fresnedo-Ramírez, Jonathan; Wang, Minghui; Cote, Linda; Schweitzer, Peter; Barba, Paola; Takacs, Elizabeth M; Clark, Matthew; Luby, James; Manns, David C; Sacks, Gavin; Mansfield, Anna Katharine; Londo, Jason; Fennell, Anne; Gadoury, David; Reisch, Bruce; Cadle-Davidson, Lance; Sun, Qi

    2016-01-01

    Marker-assisted selection (MAS) is often employed in crop breeding programs to accelerate and enhance cultivar development, via selection during the juvenile phase and parental selection prior to crossing. Next-generation sequencing and its derivative technologies have been used for genome-wide molecular marker discovery. To bridge the gap between marker development and MAS implementation, this study developed a novel practical strategy with a semi-automated pipeline that incorporates trait-associated single nucleotide polymorphism marker discovery, low-cost genotyping through amplicon sequencing (AmpSeq) and decision making. The results document the development of a MAS package derived from genotyping-by-sequencing using three traits (flower sex, disease resistance and acylated anthocyanins) in grapevine breeding. The vast majority of sequence reads (⩾99%) were from the targeted regions. Across 380 individuals and up to 31 amplicons sequenced in each lane of MiSeq data, most amplicons (83 to 87%) had <10% missing data, and read depth had a median of 220–244×. Several strengths of the AmpSeq platform that make this approach of broad interest in diverse crop species include accuracy, flexibility, speed, high-throughput, low-cost and easily automated analysis. PMID:27257505
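
    The per-amplicon quality figures quoted above (share of amplicons with <10% missing data, median read depth) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the made-up depth table, and the rule that a genotype counts as missing when its depth falls below `min_depth` are all assumptions.

```python
from statistics import median


def amplicon_summary(depths, min_depth=1):
    """Summarize read depth per amplicon.

    depths: {amplicon_name: [read depth for each individual]}
    A call is treated as missing when depth < min_depth (an assumption).
    """
    summary = {}
    for amplicon, values in depths.items():
        missing = sum(1 for d in values if d < min_depth)
        summary[amplicon] = {
            "missing_fraction": missing / len(values),
            "median_depth": median(values),
        }
    return summary


# Hypothetical depths for two amplicons across four individuals.
depths = {
    "amp1": [230, 215, 250, 210],
    "amp2": [0, 0, 180, 300],
}
summary = amplicon_summary(depths)

# Amplicons meeting the "<10% missing data" criterion quoted in the abstract.
passing = [name for name, s in summary.items() if s["missing_fraction"] < 0.10]
```

    With real data, `depths` would hold one entry per amplicon (up to 31 per lane in the study) and one depth value per individual.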

  3. A Systems Approach towards an Intelligent and Self-Controlling Platform for Integrated Continuous Reaction Sequences

    PubMed Central

    Ingham, Richard J; Battilocchio, Claudio; Fitzpatrick, Daniel E; Sliwinski, Eric; Hawkins, Joel M; Ley, Steven V

    2015-01-01

    Performing reactions in flow can offer major advantages over batch methods. However, laboratory flow chemistry processes are currently often limited to single steps or short sequences due to the complexity involved with operating a multi-step process. Using new modular components for downstream processing, coupled with control technologies, more advanced multi-step flow sequences can be realized. These tools are applied to the synthesis of 2-aminoadamantane-2-carboxylic acid. A system comprising three chemistry steps and three workup steps was developed, having sufficient autonomy and self-regulation to be managed by a single operator. PMID:25377747

  4. Genome sequence of Geobacillus thermoglucosidasius DSM2542, a platform hosts for biotechnological applications with industrial potential.

    PubMed

    Chen, Jingyu; Zhang, Zhengzhi; Zhang, Caili; Yu, Bo

    2015-12-20

    Thermophilic Geobacillus thermoglucosidasius can ferment a wide range of substrates with low nutrient requirements for growth. The first complete genome sequence of G. thermoglucosidasius DSM2542, released here, may facilitate the design of rational strategies for further strain improvement and provide information for exploring industrially interesting enzymes with thermotolerant properties.

  5. Nematode.net update 2011: addition of data sets and tools featuring next-generation sequencing data.

    PubMed

    Martin, John; Abubucker, Sahar; Heizer, Esley; Taylor, Christina M; Mitreva, Makedonka

    2012-01-01

    Nematode.net (http://nematode.net) has been a publicly available resource for studying nematodes for over a decade. In the past 3 years, we reorganized Nematode.net to provide more user-friendly navigation through the site, a necessity due to the explosion of data from next-generation sequencing platforms. Organism-centric portals containing dynamically generated data are available for over 56 different nematode species. Next-generation data has been added to the various data-mining portals hosted, including NemaBLAST and NemaBrowse. The NemaPath metabolic pathway viewer builds associations using KOs, rather than ECs to provide more accurate and fine-grained descriptions of proteins. Two new features for data analysis and comparative genomics have been added to the site. NemaSNP enables the user to perform population genetics studies in various nematode populations using next-generation sequencing data. HelmCoP (Helminth Control and Prevention) as an independent component of Nematode.net provides an integrated resource for storage, annotation and comparative genomics of helminth genomes to aid in learning more about nematode genomes, as well as drug, pesticide, vaccine and drug target discovery. With this update, Nematode.net will continue to realize its original goal to disseminate diverse bioinformatic data sets and provide analysis tools to the broad scientific community in a useful and user-friendly manner.

  6. Quantifying Next Generation Sequencing Sample Pre-Processing Bias in HIV-1 Complete Genome Sequencing.

    PubMed

    Vrancken, Bram; Trovão, Nídia Sequeira; Baele, Guy; van Wijngaerden, Eric; Vandamme, Anne-Mieke; van Laethem, Kristel; Lemey, Philippe

    2016-01-01

    Genetic analyses play a central role in infectious disease research. Massively parallelized "mechanical cloning" and sequencing technologies were quickly adopted by HIV researchers in order to broaden the understanding of the clinical importance of minor drug-resistant variants. These efforts have, however, remained largely limited to small genomic regions. The growing need to monitor multiple genome regions for drug resistance testing, as well as the obvious benefit for studying evolutionary and epidemic processes makes complete genome sequencing an important goal in viral research. In addition, a major drawback for NGS applications to RNA viruses is the need for large quantities of input DNA. Here, we use a generic overlapping amplicon-based near full-genome amplification protocol to compare low-input enzymatic fragmentation (Nextera™) with conventional mechanical shearing for Roche 454 sequencing. We find that the fragmentation method has only a modest impact on the characterization of the population composition and that for reliable results, the variation introduced at all steps of the procedure--from nucleic acid extraction to sequencing--should be taken into account, a finding that is also relevant for NGS technologies that are now more commonly used. Furthermore, by applying our protocol to deep sequence a number of pre-therapy plasma and PBMC samples, we illustrate the potential benefits of a near complete genome sequencing approach in routine genotyping. PMID:26751471

  7. Assessing kinetic and epitopic diversity across orthogonal monoclonal antibody generation platforms.

    PubMed

    Abdiche, Yasmina Noubia; Harriman, Rian; Deng, Xiaodi; Yeung, Yik Andy; Miles, Adam; Morishige, Winse; Boustany, Leila; Zhu, Lei; Izquierdo, Shelley Mettler; Harriman, William

    2016-01-01

    The ability of monoclonal antibodies (mAbs) to target specific antigens with high precision has led to an increasing demand to generate them for therapeutic use in many disease areas. Historically, the discovery of therapeutic mAbs has relied upon the immunization of mammals and various in vitro display technologies. While the routine immunization of rodents yields clones that are stable in serum and have been selected against vast arrays of endogenous, non-target self-antigens, it is often difficult to obtain species cross-reactive mAbs owing to the generally high sequence similarity shared across human antigens and their mammalian orthologs. In vitro display technologies bypass this limitation, but lack an in vivo screening mechanism, and thus may potentially generate mAbs with undesirable binding specificity and stability issues. Chicken immunization is emerging as an attractive mAb discovery method because it combines the benefits of both in vivo and in vitro display methods. Since chickens are phylogenetically separated from mammals, their proteins share less sequence homology with those of humans, so human proteins are often immunogenic and can readily elicit rodent cross-reactive clones, which are necessary for in vivo proof of mechanism studies. Here, we compare the binding characteristics of mAbs isolated from chicken immunization, mouse immunization, and phage display of human antibody libraries. Our results show that chicken-derived mAbs not only recapitulate the kinetic diversity of mAbs sourced from other methods, but appear to offer an expanded repertoire of epitopes. Further, chicken-derived mAbs can bind their native serum antigen with very high affinity, highlighting their therapeutic potential. PMID:26652308

  8. Assessing kinetic and epitopic diversity across orthogonal monoclonal antibody generation platforms

    PubMed Central

    Abdiche, Yasmina Noubia; Harriman, Rian; Deng, Xiaodi; Yeung, Yik Andy; Miles, Adam; Morishige, Winse; Boustany, Leila; Zhu, Lei; Izquierdo, Shelley Mettler; Harriman, William

    2016-01-01

    The ability of monoclonal antibodies (mAbs) to target specific antigens with high precision has led to an increasing demand to generate them for therapeutic use in many disease areas. Historically, the discovery of therapeutic mAbs has relied upon the immunization of mammals and various in vitro display technologies. While the routine immunization of rodents yields clones that are stable in serum and have been selected against vast arrays of endogenous, non-target self-antigens, it is often difficult to obtain species cross-reactive mAbs owing to the generally high sequence similarity shared across human antigens and their mammalian orthologs. In vitro display technologies bypass this limitation, but lack an in vivo screening mechanism, and thus may potentially generate mAbs with undesirable binding specificity and stability issues. Chicken immunization is emerging as an attractive mAb discovery method because it combines the benefits of both in vivo and in vitro display methods. Since chickens are phylogenetically separated from mammals, their proteins share less sequence homology with those of humans, so human proteins are often immunogenic and can readily elicit rodent cross-reactive clones, which are necessary for in vivo proof of mechanism studies. Here, we compare the binding characteristics of mAbs isolated from chicken immunization, mouse immunization, and phage display of human antibody libraries. Our results show that chicken-derived mAbs not only recapitulate the kinetic diversity of mAbs sourced from other methods, but appear to offer an expanded repertoire of epitopes. Further, chicken-derived mAbs can bind their native serum antigen with very high affinity, highlighting their therapeutic potential. PMID:26652308

  9. Long-range PCR in next-generation sequencing: comparison of six enzymes and evaluation on the MiSeq sequencer.

    PubMed

    Jia, Haiying; Guo, Yunfei; Zhao, Weiwei; Wang, Kai

    2014-01-01

    Long-range PCR remains a flexible, fast, efficient and cost-effective choice for sequencing candidate genomic regions in a small number of samples, especially when combined with next-generation sequencing (NGS) platforms. Several long-range DNA polymerases are advertised as being able to amplify up to 15 kb or longer genomic DNA. However, their real-world performance characteristics and their suitability for NGS remain unclear. We evaluated six long-range DNA polymerases (Invitrogen SequalPrep, Invitrogen AccuPrime, TaKaRa PrimeSTAR GXL, TaKaRa LA Taq Hot Start, KAPA Long Range HotStart and QIAGEN LongRange PCR Polymerase) to amplify three amplicons, with sizes of 12.9 kb, 9.7 kb, and 5.8 kb, respectively. Subsequently, we used the PrimeSTAR enzyme to amplify entire BRCA1 (83.2 kb) and BRCA2 (84.2 kb) genes from nine subjects and sequenced them on an Illumina MiSeq sequencer. We found that the TaKaRa PrimeSTAR GXL DNA polymerase can amplify almost all amplicons with different sizes and Tm values under identical PCR conditions. Other enzymes require alteration of PCR conditions to obtain optimal performance. From the MiSeq run, we identified multiple intronic and exonic single-nucleotide variations (SNVs), including one mutation (c.5946delT in BRCA2) in a positive control. Our study provided useful results for sequencing research focused on large genomic regions. PMID:25034901

  10. Next generation sequencing technologies and the changing landscape of phage genomics

    PubMed Central

    Klumpp, Jochen; Fouts, Derrick E.; Sozhamannan, Shanmuga

    2012-01-01

    The dawn of next generation sequencing technologies has opened up exciting possibilities for whole genome sequencing of a plethora of organisms. The 2nd and 3rd generation sequencing technologies, based on cloning-free, massively parallel sequencing, have enabled the generation of a deluge of genomic sequences of both prokaryotic and eukaryotic origin in the last seven years. However, whole genome sequencing of bacterial viruses has not kept pace with this revolution, despite the fact that their genomes are orders of magnitude smaller in size compared with bacteria and other organisms. Sequencing phage genomes poses several challenges: (1) obtaining pure phage genomic material, (2) PCR amplification biases and (3) the complex nature of their genetic material due to features such as methylated bases and repeats that are inherently difficult to sequence and assemble. Here we describe conclusions drawn from our efforts in sequencing hundreds of bacteriophage genomes from a variety of Gram-positive and Gram-negative bacteria using Sanger, 454, Illumina and PacBio technologies. Based on our experience we propose several general considerations regarding sample quality, the choice of technology and a “blended approach” for generating reliable whole genome sequences of phages. PMID:23275870

  12. SeqReporter: automating next-generation sequencing result interpretation and reporting workflow in a clinical laboratory.

    PubMed

    Roy, Somak; Durso, Mary Beth; Wald, Abigail; Nikiforov, Yuri E; Nikiforova, Marina N

    2014-01-01

    A wide repertoire of bioinformatics applications exists for next-generation sequencing data analysis; however, certain requirements of the clinical molecular laboratory limit their use: i) comprehensive report generation, ii) compatibility with existing laboratory information systems and computer operating system, iii) knowledgebase development, iv) quality management, and v) data security. SeqReporter is a web-based application developed using ASP.NET framework version 4.0. The client-side was designed using HTML5, CSS3, and JavaScript. The server-side processing (VB.NET) relied on interaction with a customized SQL Server 2008 R2 database. Overall, 104 cases (1062 variant calls) were analyzed by SeqReporter. Each variant call was classified into one of five report levels: i) known clinical significance, ii) uncertain clinical significance, iii) pending pathologists' review, iv) synonymous and deep intronic, and v) platform and panel-specific sequence errors. SeqReporter correctly annotated and classified 99.9% (859 of 860) of sequence variants, including 68.7% synonymous single-nucleotide variants, 28.3% nonsynonymous single-nucleotide variants, 1.7% insertions, and 1.3% deletions. One variant of potential clinical significance was re-classified after pathologist review. Laboratory information system-compatible clinical reports were generated automatically. SeqReporter also facilitated quality management activities. SeqReporter is an example of a customized and well-designed informatics solution to optimize and automate the downstream analysis of clinical next-generation sequencing data. We propose it as a model that may envisage the development of a comprehensive clinical informatics solution. PMID:24220144

  13. Next-generation sequencing for the diagnosis of cardiac arrhythmia syndromes.

    PubMed

    Lubitz, Steven A; Ellinor, Patrick T

    2015-05-01

    Inherited arrhythmia syndromes are collectively associated with substantial morbidity, yet our understanding of the genetic architecture of these conditions remains limited. Recent technological advances in DNA sequencing have led to the commercialization of genetic testing now widely available in clinical practice. In particular, next-generation sequencing allows the large-scale and rapid assessment of entire genomes. Although next-generation sequencing represents a major technological advance, it has introduced numerous challenges with respect to the interpretation of genetic variation and has opened a veritable floodgate of biological data of unknown clinical significance to practitioners. In this review, we discuss current genetic testing indications for inherited arrhythmia syndromes, broadly outline characteristics of next-generation sequencing techniques, and highlight challenges associated with such testing. We further summarize future directions that will be necessary to address to enable the widespread adoption of next-generation sequencing in the routine management of patients with inherited arrhythmia syndromes. PMID:25625719

  14. Diagnostic validation of a familial hypercholesterolaemia cohort provides a model for using targeted next generation DNA sequencing in the clinical setting.

    PubMed

    Hinchcliffe, Marcus; Le, Huong; Fimmel, Anthony; Molloy, Laura; Freeman, Lucinda; Sullivan, David; Trent, Ronald J

    2014-01-01

    Our aim was to assess the sensitivity and specificity of a next generation DNA sequencing (NGS) platform using a capture based DNA library preparation method. Data and experience gained from this diagnostic validation can be used to progress the applications of NGS in the wider molecular diagnostic setting. A technical cross-validation comparing the current molecular diagnostic gold standard methods of Sanger DNA sequencing and multiplex ligation-dependent probe amplification (MLPA) versus a customised capture based targeted re-sequencing method on a SOLiD 5500 sequencing platform was carried out using a cohort of 96 familial hypercholesterolaemia (FH) samples. We compared a total of 595 DNA variations (488 common single nucleotide polymorphisms, 73 missense mutations, 9 nonsense mutations, 3 splice site point mutations, 13 small indels, 2 multi-exonic duplications and 7 multi-exonic deletions) found previously in the 96 FH samples. DNA variation detection sensitivity and specificity were both 100% for the SOLiD 5500 NGS platform compared with Sanger sequencing and MLPA only when both the LifeScope and Integrative Genomics Viewer software packages were used. The methods described here offer a high-quality strategy for the detection of a wide range of DNA mutations in diseases with a moderate number of well described causative genes. However, there are important issues related to the bioinformatic algorithms employed to detect small indels.
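
    As a rough illustration of the figures in this abstract, the sketch below checks that the listed variant classes add up to the stated total of 595 and shows how sensitivity and specificity are conventionally computed against a gold standard. The variant counts come from the abstract; the number of true-negative positions is not reported there, so the value used here is purely hypothetical.

```python
# Variant classes and counts as listed in the abstract.
variant_counts = {
    "common SNPs": 488,
    "missense mutations": 73,
    "nonsense mutations": 9,
    "splice site point mutations": 3,
    "small indels": 13,
    "multi-exonic duplications": 2,
    "multi-exonic deletions": 7,
}
total_variants = sum(variant_counts.values())


def sensitivity(true_pos, false_neg):
    """Fraction of gold-standard variants that the test also detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of gold-standard negatives that the test also calls negative."""
    return true_neg / (true_neg + false_pos)


# 100% sensitivity means every one of the known variants was recovered.
sens = sensitivity(true_pos=total_variants, false_neg=0)

# HYPOTHETICAL: the abstract does not state how many negative sites were compared.
assumed_true_negatives = 1000
spec = specificity(true_neg=assumed_true_negatives, false_pos=0)
```

    A single missed indel would lower the sensitivity to 594/595, which is why the abstract's caveat about indel-calling algorithms matters for this kind of validation.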

  15. Spatially localized generation of nucleotide sequence-specific DNA damage

    PubMed Central

    Oh, Dennis H.; King, Brett A.; Boxer, Steven G.; Hanawalt, Philip C.

    2001-01-01

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen–DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320–400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA–psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen–TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980
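
    The statement that 765-nm light carries half the quantum energy needed for one-photon excitation can be checked with E = hc/λ. The short sketch below uses the exact SI values of the physical constants; the function name is illustrative only.

```python
PLANCK_J_S = 6.62607015e-34        # Planck constant, J*s (exact SI value)
LIGHT_SPEED_M_S = 299_792_458.0    # speed of light, m/s (exact SI value)
JOULES_PER_EV = 1.602176634e-19    # elementary charge in coulombs (exact SI value)


def photon_energy_ev(wavelength_nm):
    """Energy of one photon of the given wavelength, in electronvolts."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / JOULES_PER_EV


# Two 765-nm photons absorbed together deliver the energy of one photon
# at half the wavelength.
two_photon_energy_ev = 2 * photon_energy_ev(765.0)
equivalent_wavelength_nm = 765.0 / 2

# The equivalent wavelength falls inside the UVA band quoted in the abstract.
in_uva_band = 320.0 <= equivalent_wavelength_nm <= 400.0
```

    Two 765-nm photons therefore match a single photon at 382.5 nm, within the 320 to 400 nm range the abstract gives for conventional psoralen + UVA therapy.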

  16. Spatially localized generation of nucleotide sequence-specific DNA damage.

    PubMed

    Oh, D H; King, B A; Boxer, S G; Hanawalt, P C

    2001-09-25

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen-DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320-400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA-psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen-TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  17. Quantifying Next Generation Sequencing Sample Pre-Processing Bias in HIV-1 Complete Genome Sequencing

    PubMed Central

    Vrancken, Bram; Trovão, Nídia Sequeira; Baele, Guy; van Wijngaerden, Eric; Vandamme, Anne-Mieke; van Laethem, Kristel; Lemey, Philippe

    2016-01-01

    Genetic analyses play a central role in infectious disease research. Massively parallelized “mechanical cloning” and sequencing technologies were quickly adopted by HIV researchers in order to broaden the understanding of the clinical importance of minor drug-resistant variants. These efforts have, however, remained largely limited to small genomic regions. The growing need to monitor multiple genome regions for drug resistance testing, as well as the obvious benefit for studying evolutionary and epidemic processes makes complete genome sequencing an important goal in viral research. In addition, a major drawback for NGS applications to RNA viruses is the need for large quantities of input DNA. Here, we use a generic overlapping amplicon-based near full-genome amplification protocol to compare low-input enzymatic fragmentation (Nextera™) with conventional mechanical shearing for Roche 454 sequencing. We find that the fragmentation method has only a modest impact on the characterization of the population composition and that for reliable results, the variation introduced at all steps of the procedure—from nucleic acid extraction to sequencing—should be taken into account, a finding that is also relevant for NGS technologies that are now more commonly used. Furthermore, by applying our protocol to deep sequence a number of pre-therapy plasma and PBMC samples, we illustrate the potential benefits of a near complete genome sequencing approach in routine genotyping. PMID:26751471

  18. Next-generation sequencing-based method shows increased mutation detection sensitivity in an Indian retinoblastoma cohort

    PubMed Central

    Singh, Jaya; Mishra, Avshesh; Pandian, Arunachalam Jayamuruga; Mallipatna, Ashwin C.; Khetan, Vikas; Sripriya, S.; Kapoor, Suman; Agarwal, Smita; Sankaran, Satish; Katragadda, Shanmukh; Veeramachaneni, Vamsi; Hariharan, Ramesh; Subramanian, Kalyanasundaram

    2016-01-01

    Purpose: Retinoblastoma (Rb) is the most common primary intraocular cancer of childhood and one of the major causes of blindness in children. India has the highest number of patients with Rb in the world. Mutations in the RB1 gene are the primary cause of Rb, and heterogeneous mutations are distributed throughout the entire length of the gene. Therefore, genetic testing requires screening of the entire gene, which by conventional sequencing is time consuming and expensive. Methods: In this study, we screened the RB1 gene in the DNA isolated from blood or saliva samples of 50 unrelated patients with Rb using the TruSight Cancer panel. Next-generation sequencing (NGS) was done on the Illumina MiSeq platform. Genetic variations were identified using the Strand NGS software and interpreted using the StrandOmics platform. Results: We were able to detect germline pathogenic mutations in 66% (33/50) of the cases, 12 of which were novel. We were able to detect all types of mutations, including missense, nonsense, splice site, indel, and structural variants. When we considered bilateral Rb cases only, the mutation detection rate increased to 100% (22/22). In unilateral Rb cases, the mutation detection rate was 30% (6/20). Conclusions: Our study suggests that NGS-based approaches increase the sensitivity of mutation detection in the RB1 gene, making it fast and cost-effective compared to the conventional tests performed in a reflex-testing mode. PMID:27582626
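
    The detection rates quoted in this abstract follow directly from the reported counts. The sketch below recomputes them and, as an addition that is not part of the study, attaches a 95% Wilson score interval to show how much uncertainty cohorts of this size carry. The choice of interval is an assumption made for illustration.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at p = 0 or p = 1.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# (cases with a detected mutation, cases tested), as reported in the abstract.
cohorts = {
    "all cases": (33, 50),
    "bilateral": (22, 22),
    "unilateral": (6, 20),
}

rates = {name: k / n for name, (k, n) in cohorts.items()}
intervals = {name: wilson_interval(k, n) for name, (k, n) in cohorts.items()}
```

    Even a 22/22 result leaves a non-trivial lower bound below 100% at this sample size, which is worth keeping in mind when reading the bilateral detection rate.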

  19. Next-generation sequencing (NGS) for assessment of microbial water quality: current progress, challenges, and future opportunities.

    PubMed

    Tan, BoonFei; Ng, Charmaine; Nshimyimana, Jean Pierre; Loh, Lay Leng; Gin, Karina Y-H; Thompson, Janelle R

    2015-01-01

    Water quality is an emergent property of a complex system comprised of interacting microbial populations and introduced microbial and chemical contaminants. Studies leveraging next-generation sequencing (NGS) technologies are providing new insights into the ecology of microbially mediated processes that influence fresh water quality such as algal blooms, contaminant biodegradation, and pathogen dissemination. In addition, sequencing methods targeting small subunit (SSU) rRNA hypervariable regions have allowed identification of signature microbial species that serve as bioindicators for sewage contamination in these environments. Beyond amplicon sequencing, metagenomic and metatranscriptomic analyses of microbial communities in fresh water environments reveal the genetic capabilities and interplay of waterborne microorganisms, shedding light on the mechanisms for production and biodegradation of toxins and other contaminants. This review discusses the challenges and benefits of applying NGS-based methods to water quality research and assessment. We will consider the suitability and biases inherent in the application of NGS as a screening tool for assessment of biological risks and discuss the potential and limitations for direct quantitative interpretation of NGS data. Secondly, we will examine case studies from recent literature where NGS based methods have been applied to topics in water quality assessment, including development of bioindicators for sewage pollution and microbial source tracking, characterizing the distribution of toxin and antibiotic resistance genes in water samples, and investigating mechanisms of biodegradation of harmful pollutants that threaten water quality. Finally, we provide a short review of emerging NGS platforms and their potential applications to the next generation of water quality assessment tools.

  20. Next-Generation Sequencing Approaches in Cancer: Where Have They Brought Us and Where Will They Take Us?

    PubMed Central

    LeBlanc, Veronique G.; Marra, Marco A.

    2015-01-01

    Next-generation sequencing (NGS) technologies and data have revolutionized cancer research and are increasingly being deployed to guide clinicians in treatment decision-making. NGS technologies have allowed us to take an “omics” approach to cancer in order to reveal genomic, transcriptomic, and epigenomic landscapes of individual malignancies. Integrative multi-platform analyses are increasingly used in large-scale projects that aim to fully characterize individual tumours as well as general cancer types and subtypes. In this review, we examine how NGS technologies in particular have contributed to “omics” approaches in cancer research, allowing for large-scale integrative analyses that consider hundreds of tumour samples. These types of studies have provided us with an unprecedented wealth of information, providing the background knowledge needed to make small-scale (including “N of 1”) studies informative and relevant. We also take a look at emerging opportunities provided by NGS and state-of-the-art third-generation sequencing technologies, particularly in the context of translational research. Cancer research and care are currently poised to experience significant progress catalyzed by accessible sequencing technologies that will benefit both clinical- and research-based efforts. PMID:26404381

  1. An efficient quantitation method of next-generation sequencing libraries by using MiSeq sequencer.

    PubMed

    Katsuoka, Fumiki; Yokozawa, Junji; Tsuda, Kaoru; Ito, Shin; Pan, Xiaoqing; Nagasaki, Masao; Yasuda, Jun; Yamamoto, Masayuki

    2014-12-01

    Library quantitation is a critical step for obtaining high data output from Illumina HiSeq sequencers. Here, we introduce a library quantitation method, designated quantitative MiSeq (qMiSeq), that uses the Illumina MiSeq sequencer. In this procedure, 96 dual-index libraries, including control samples, are denatured, pooled in equal volume, and sequenced by MiSeq. We found that the relative concentration of each library can be determined from the observed index ratio and can be used to determine the HiSeq run conditions for each library. Thus, qMiSeq provides an efficient way to quantitate a large number of libraries at a time.
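
    The index-ratio idea can be illustrated with a minimal sketch. The abstract does not give the actual calculation, so the code below assumes that, after equal-volume pooling, each library's share of index reads is proportional to its concentration. The function names, library names, read counts, and the 5 µl base volume are all hypothetical.

```python
def relative_concentrations(index_reads):
    """Return each library's read share relative to the pool average.

    Assumption (not from the abstract): with equal-volume pooling, the
    fraction of reads carrying a library's index is proportional to that
    library's concentration. A value of 1.0 means "average library".
    """
    total = sum(index_reads.values())
    if total == 0:
        raise ValueError("no reads were assigned to any index")
    n_libraries = len(index_reads)
    return {name: (reads / total) * n_libraries
            for name, reads in index_reads.items()}


def equalizing_volumes(rel_conc, base_volume_ul=5.0):
    """Volume of each library that would contribute equally to a new pool."""
    # A library at half the average concentration needs twice the volume.
    return {name: base_volume_ul / rc for name, rc in rel_conc.items() if rc > 0}


# Hypothetical read counts per index from a small test run.
reads = {"lib_A": 120_000, "lib_B": 60_000, "lib_C": 180_000}
rel = relative_concentrations(reads)
vols = equalizing_volumes(rel)
for name in reads:
    print(f"{name}: relative concentration {rel[name]:.2f}, pool {vols[name]:.2f} ul")
```

    Normalizing to the pool average rather than to an absolute concentration keeps the sketch independent of any calibration standard; a real workflow would presumably anchor the values to the control samples mentioned in the abstract.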

  2. The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis.

    PubMed

    Alva, Vikram; Nam, Seung-Zin; Söding, Johannes; Lupas, Andrei N

    2016-07-01

    The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment. PMID:27131380

  4. Using NS5B Sequencing for Hepatitis C Virus Genotyping Reveals Discordances with Commercial Platforms.

    PubMed

    Chueca, Natalia; Rivadulla, Isidro; Lovatti, Rubén; Reina, Gabriel; Blanco, Ana; Fernandez-Caballero, Jose Angel; Cardeñoso, Laura; Rodriguez-Granjer, Javier; Fernandez-Alonso, Miriam; Aguilera, Antonio; Alvarez, Marta; Galán, Juan Carlos; García, Federico

    2016-01-01

    We aimed to evaluate how correctly three commercial methods assign HCV genotypes compared to NS5B sequencing: the Trugene HCV genotyping kit (Siemens), the VERSANT HCV Genotype 2.0 assay (Siemens), and Real-Time HCV genotype II (Abbott). We studied 327 clinical samples that carried representative HCV genotypes of the most frequent geno/subtypes in Spain. After commercial genotyping, the sequencing of a 367 bp fragment in the NS5B gene was used to assign genotypes. Major discrepancies were defined as differences between the genotype assigned by one of the three methods and that assigned by NS5B sequencing, including misclassification of subtypes 1a and 1b. Minor discrepancies were differences at the subtype level other than 1a and 1b. The overall discordance with the reference method was 34% for Trugene and 15% for VERSANT HCV 2.0. The Abbott assay correctly identified all 1a and 1b subtypes, but did not subtype all the 2, 3, 4 and 5 (34%) genotypes. Major discordances were found in 16% of cases for Trugene HCV, and the majority were 1b- to 1a-related discordances; major discordances were found for VERSANT HCV 2.0 in 6% of cases, all but one of which were 1b-to-1a cases. These results indicated that the Trugene assay especially, and to a lesser extent VERSANT HCV 2.0, can fail to differentiate HCV subtypes 1a and 1b, leading to critical errors in clinical practice regarding the correct use of direct-acting antiviral agents. PMID:27097040

  5. Using NS5B Sequencing for Hepatitis C Virus Genotyping Reveals Discordances with Commercial Platforms

    PubMed Central

    Chueca, Natalia; Rivadulla, Isidro; Lovatti, Rubén; Reina, Gabriel; Blanco, Ana; Fernandez-Caballero, Jose Angel; Cardeñoso, Laura; Rodriguez-Granjer, Javier; Fernandez-Alonso, Miriam; Aguilera, Antonio; Alvarez, Marta

    2016-01-01

    We aimed to evaluate how correctly three commercial methods assign HCV genotypes compared to NS5B sequencing: the Trugene HCV genotyping kit (Siemens), the VERSANT HCV Genotype 2.0 assay (Siemens), and Real-Time HCV genotype II (Abbott). We studied 327 clinical samples that carried representative HCV genotypes of the most frequent geno/subtypes in Spain. After commercial genotyping, the sequencing of a 367 bp fragment in the NS5B gene was used to assign genotypes. Major discrepancies were defined as differences between the genotype assigned by one of the three methods and that assigned by NS5B sequencing, including misclassification of subtypes 1a and 1b. Minor discrepancies were differences at the subtype level other than 1a and 1b. The overall discordance with the reference method was 34% for Trugene and 15% for VERSANT HCV 2.0. The Abbott assay correctly identified all 1a and 1b subtypes, but did not subtype all the 2, 3, 4 and 5 (34%) genotypes. Major discordances were found in 16% of cases for Trugene HCV, and the majority were 1b- to 1a-related discordances; major discordances were found for VERSANT HCV 2.0 in 6% of cases, all but one of which were 1b-to-1a cases. These results indicated that the Trugene assay especially, and to a lesser extent VERSANT HCV 2.0, can fail to differentiate HCV subtypes 1a and 1b, leading to critical errors in clinical practice regarding the correct use of direct-acting antiviral agents. PMID:27097040

  6. Synchronized excitability in a network enables generation of internal neuronal sequences

    PubMed Central

    Wang, Yingxue; Roth, Zachary; Pastakova, Eva

    2016-01-01

    Hippocampal place field sequences are supported by sensory cues and network internal mechanisms. In contrast, sharp-wave (SPW) sequences, theta sequences, and episode field sequences are internally generated. The relationship of these sequences to memory is unclear. SPW sequences have been shown to support learning and have been assumed to also support episodic memory. Conversely, we demonstrate that these SPW sequences were present in trained rats even after episodic memory was impaired and after other internal sequences – episode field and theta sequences – were eliminated. SPW sequences did not support memory despite continuing to ‘replay’ all task-related sequences – place field and episode field sequences. Sequence replay occurred selectively during synchronous increases of population excitability – SPWs. Similarly, theta sequences depended on the presence of repeated synchronized waves of excitability – theta oscillations. Thus, we suggest that either intermittent or rhythmic synchronized changes of excitability trigger sequential firing of neurons, which in turn supports learning and/or memory. DOI: http://dx.doi.org/10.7554/eLife.20697.001 PMID:27677848

  7. The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics.

    PubMed

    Stein, Lincoln D; Bao, Zhirong; Blasiar, Darin; Blumenthal, Thomas; Brent, Michael R; Chen, Nansheng; Chinwalla, Asif; Clarke, Laura; Clee, Chris; Coghlan, Avril; Coulson, Alan; D'Eustachio, Peter; Fitch, David H A; Fulton, Lucinda A; Fulton, Robert E; Griffiths-Jones, Sam; Harris, Todd W; Hillier, LaDeana W; Kamath, Ravi; Kuwabara, Patricia E; Mardis, Elaine R; Marra, Marco A; Miner, Tracie L; Minx, Patrick; Mullikin, James C; Plumb, Robert W; Rogers, Jane; Schein, Jacqueline E; Sohrmann, Marc; Spieth, John; Stajich, Jason E; Wei, C; Willey, David; Wilson, Richard K; Durbin, Richard; Waterston, Robert H

    2003-11-01

    The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C. briggsae, we found

  8. Architecting Prodiguer: the next generation French climate modelling data management platform

    NASA Astrophysics Data System (ADS)

    Morgan, Mark; Denvil, Sebastien; Bhardwaj, Ashish

    2010-05-01

    The Pierre Simon Laplace Institute (IPSL), like many other climate modeling groups, is involved in the international development of a comprehensive Earth System Model (ESM) to study the interactions between chemical, physical, and biological processes. This work entails the coupling of different components (land, ocean, atmosphere, chemistry, etc.) and requires an execution environment platform that can tackle the entire range of interdependent model configurations. Furthermore, the ever-increasing number of simulations, executed against model configurations within scientific computing centres, is generating a huge volume of data and meta-data that must be made available to the international community of researchers, modelers, students and general users. IPSL is in the process of implementing a French national project called Prodiguer whose objective is to ensure that the data and meta-data can be delivered to the French and international communities in a timely and appropriate fashion, hence achieving the strategic goals outlined above. Prodiguer aims to leverage, extend and build upon the work of international projects such as Earth System Grid, METAFOR and IS-ENES. Thus Prodiguer is to be seen as one actor amongst many attempting the difficult task of information integration within a complex enterprise space. We will present the technical architecture being put in place to achieve the goals of Prodiguer. Such an architecture necessarily encompasses many aspects of Service/Resource Oriented Architectural practice. From security to messaging patterns, from message queues to failover strategies, we will illustrate how pragmatism is inevitably the main driver behind such an architecture. We will also illustrate that as the number of actors increases so does workflow complexity, and as a consequence simplicity becomes an important guiding factor in itself.

  9. Identification and characterization of Highlands J virus from a Mississippi sandhill crane using unbiased next-generation sequencing

    USGS Publications Warehouse

    Ip, Hon S.; Wiley, Michael R.; Long, Renee; Gustavo, Palacios; Shearn-Bochsler, Valerie; Whitehouse, Chris A.

    2014-01-01

    Advances in massively parallel DNA sequencing platforms, commonly termed next-generation sequencing (NGS) technologies, have greatly reduced time, labor, and cost associated with DNA sequencing. Thus, NGS has become a routine tool for new viral pathogen discovery and will likely become the standard for routine laboratory diagnostics of infectious diseases in the near future. This study demonstrated the application of NGS for the rapid identification and characterization of a virus isolated from the brain of an endangered Mississippi sandhill crane. This bird was part of a population restoration effort and was found in an emaciated state several days after Hurricane Isaac passed over the refuge in Mississippi in 2012. Post-mortem examination had identified trichostrongyliasis as the possible cause of death, but because a virus with morphology consistent with a togavirus was isolated from the brain of the bird, an arboviral etiology was strongly suspected. Because individual molecular assays for several known arboviruses were negative, unbiased NGS by Illumina MiSeq was used to definitively identify and characterize the causative viral agent. Whole genome sequencing and phylogenetic analysis revealed the viral isolate to be the Highlands J virus, a known avian pathogen. This study demonstrates the use of unbiased NGS for the rapid detection and characterization of an unidentified viral pathogen and the application of this technology to wildlife disease diagnostics and conservation medicine.

  10. Targeted, high-depth, next-generation sequencing of cancer genes in formalin-fixed, paraffin-embedded and fine-needle aspiration tumor specimens.

    PubMed

    Hadd, Andrew G; Houghton, Jeff; Choudhary, Ashish; Sah, Sachin; Chen, Liangjing; Marko, Adam C; Sanford, Tiffany; Buddavarapu, Kalyan; Krosting, Julie; Garmire, Lana; Wylie, Dennis; Shinde, Rupali; Beaudenon, Sylvie; Alexander, Erik K; Mambo, Elizabeth; Adai, Alex T; Latham, Gary J

    2013-03-01

    Implementation of highly sophisticated technologies, such as next-generation sequencing (NGS), into routine clinical practice requires compatibility with common tumor biopsy types, such as formalin-fixed, paraffin-embedded (FFPE) and fine-needle aspiration specimens, and validation metrics for platforms, controls, and data analysis pipelines. In this study, a two-step PCR enrichment workflow was used to assess 540 known cancer-relevant variants in 16 oncogenes for high-depth sequencing in tumor samples on either mature (Illumina GAIIx) or emerging (Ion Torrent PGM) NGS platforms. The results revealed that the background noise of variant detection was elevated approximately twofold in FFPE compared with cell line DNA. Bioinformatic algorithms were optimized to accommodate this background. Variant calls from 38 residual clinical colorectal cancer FFPE specimens and 10 thyroid fine-needle aspiration specimens were compared across multiple cancer genes, resulting in an accuracy of 96.1% (95% CI, 96.1% to 99.3%) compared with Sanger sequencing, and 99.6% (95% CI, 97.9% to 99.9%) compared with an alternative method with an analytical sensitivity of 1% mutation detection. A total of 45 of 48 samples were concordant between NGS platforms across all matched regions, with the three discordant calls each represented at <10% of reads. Consequently, NGS of targeted oncogenes in real-life tumor specimens using distinct platforms addresses unmet needs for unbiased and highly sensitive mutation detection and can accelerate both basic and clinical cancer research.
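
    As a purely illustrative aside, the cross-platform concordance quoted above (45 of 48 samples) can be turned into a proportion with an approximate interval. The abstract reports no interval for this figure and does not say which interval method was used for its other estimates, so the Wilson score interval below is an assumption chosen for the sketch, not the authors' method.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


concordant, total = 45, 48  # counts taken from the abstract
rate = concordant / total
low, high = wilson_interval(concordant, total)
print(f"concordance = {rate:.3f}, approx. 95% interval = ({low:.3f}, {high:.3f})")
```

    The Wilson form is used here only because it behaves sensibly for small samples with proportions near 1; other interval methods would give somewhat different bounds.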

  11. Functional characterization of a monoclonal antibody epitope using a lambda phage display-deep sequencing platform

    PubMed Central

    Domina, Maria; Lanza Cariccio, Veronica; Benfatto, Salvatore; Venza, Mario; Venza, Isabella; Borgogni, Erica; Castellino, Flora; Midiri, Angelina; Galbo, Roberta; Romeo, Letizia; Biondo, Carmelo; Masignani, Vega; Teti, Giuseppe; Felici, Franco; Beninati, Concetta

    2016-01-01

    We have recently described a method, named PROFILER, for the identification of antigenic regions preferentially targeted by polyclonal antibody responses after vaccination. To test the ability of the technique to provide insights into the functional properties of monoclonal antibody (mAb) epitopes, we used here a well-characterized epitope of meningococcal factor H binding protein (fHbp), which is recognized by mAb 12C1. An fHbp library, engineered on a lambda phage vector enabling surface expression of polypeptides of widely different length, was subjected to massive parallel sequencing of the phage inserts after affinity selection with the 12C1 mAb. We detected dozens of unique antibody-selected sequences, the most enriched of which (designated as FrC) could largely recapitulate the ability of fHbp to bind mAb 12C1. Computational analysis of the cumulative enrichment of single amino acids in the antibody-selected fragments identified two overrepresented stretches of residues (H248-K254 and S140-G154), whose presence was subsequently found to be required for binding of FrC to mAb 12C1. Collectively, these results suggest that the PROFILER technology can rapidly and reliably identify, in the context of complex conformational epitopes, discrete “hot spots” with a crucial role in antigen-antibody interactions, thereby providing useful clues for the functional characterization of the epitope. PMID:27530334

  12. RANK-ORDER-SELECTIVE NEURONS FORM A TEMPORAL BASIS SET FOR THE GENERATION OF MOTOR SEQUENCES

    PubMed Central

    Salinas, Emilio

    2009-01-01

    Many behaviors are composed of a series of elementary motor actions that must occur in a specific order, but the neuronal mechanisms by which such motor sequences are generated are poorly understood. In particular, if a sequence consists of a few motor actions, a primate can learn to replicate it from memory after practicing it for just a few trials. How do the motor and premotor areas of the brain assemble motor sequences so fast? The network model presented here reveals part of the solution to this problem. The model is based on experiments showing that, during the performance of motor sequences, some cortical neurons are always activated at specific times, regardless of which motor action is being executed. In the model, a population of such rank-order-selective (ROS) cells drives a layer of downstream motor neurons so that these generate specific movements at different times in different sequences. A key ingredient of the model is that the amplitude of the ROS responses must be modulated by sequence identity. Because of this modulation, which is consistent with experimental reports, the network is able not only to produce multiple sequences accurately but also to learn a new sequence with minimal changes in connectivity. The ROS neurons modulated by sequence identity thus serve as a basis set for constructing arbitrary sequences of motor responses downstream. The underlying mechanism is analogous to the mechanism described in parietal areas for generating coordinate transformations in the spatial domain. PMID:19357265

  13. Designing next-generation platforms for evaluating scientific output: what scientists can learn from the social web

    PubMed Central

    Yarkoni, Tal

    2012-01-01

    Traditional pre-publication peer review of scientific output is a slow, inefficient, and unreliable process. Efforts to replace or supplement traditional evaluation models with open evaluation platforms that leverage advances in information technology are slowly gaining traction, but remain in the early stages of design and implementation. Here I discuss a number of considerations relevant to the development of such platforms. I focus particular attention on three core elements that next-generation evaluation platforms should strive to emphasize, including (1) open and transparent access to accumulated evaluation data, (2) personalized and highly customizable performance metrics, and (3) appropriate short-term incentivization of the userbase. Because all of these elements have already been successfully implemented on a large scale in hundreds of existing social web applications, I argue that development of new scientific evaluation platforms should proceed largely by adapting existing techniques rather than engineering entirely new evaluation mechanisms. Successful implementation of open evaluation platforms has the potential to substantially advance both the pace and the quality of scientific publication and evaluation, and the scientific community has a vested interest in shifting toward such models as soon as possible. PMID:23060783

  14. Generation of Triple-Transgenic Forsythia Cell Cultures as a Platform for the Efficient, Stable, and Sustainable Production of Lignans

    PubMed Central

    Murata, Jun; Matsumoto, Erika; Morimoto, Kinuyo; Koyama, Tomotsugu; Satake, Honoo

    2015-01-01

    Sesamin is a furofuran lignan biosynthesized from the precursor lignan pinoresinol specifically in sesame seeds. This lignan is shown to exhibit anti-hypertensive activity, protect the liver from damage caused by ethanol and lipid oxidation, and reduce lung tumor growth. Despite rapidly rising demand, plant sources of lignans are frequently limited because of the high cost of locating and collecting plants. Indeed, the acquisition of sesamin depends exclusively on the conventional extraction of particular Sesamum seeds. In this study, we have created an efficient, stable and sustainable sesamin production system using triple-transgenic Forsythia koreana cell suspension cultures, U18i-CPi-Fk. These transgenic cell cultures were generated by stably introducing an RNA interference (RNAi) sequence against the pinoresinol-glucosylating enzyme, UGT71A18, into existing CPi-Fk cells, which had been created by introducing Sesamum indicum sesamin synthase (CYP81Q1) and an RNAi sequence against pinoresinol/lariciresinol reductase (PLR) into F. koreana cells. Compared to its transgenic prototype, U18i-CPi-Fk displayed 5-fold higher production of pinoresinol aglycone and 1.4-fold higher production of sesamin, while the wildtype cannot produce sesamin due to a lack of any intrinsic sesamin synthase. Moreover, red LED irradiation of U18i-CPi-Fk specifically resulted in 3.0-fold greater production of both pinoresinol aglycone and sesamin than production of these lignans in the dark, whereas pinoresinol production was decreased in the wildtype under red LED. In addition, we developed a procedure for sodium alginate-based long-term storage of U18i-CPi-Fk in liquid nitrogen. Production of sesamin in U18i-CPi-Fk re-thawed after six-month cryopreservation was equivalent to that of non-cryopreserved U18i-CPi-Fk. These data warrant on-demand production of sesamin anytime and anywhere. Collectively, the present study provides evidence that U18i-CPi-Fk is an

  15. Generation of Triple-Transgenic Forsythia Cell Cultures as a Platform for the Efficient, Stable, and Sustainable Production of Lignans.

    PubMed

    Murata, Jun; Matsumoto, Erika; Morimoto, Kinuyo; Koyama, Tomotsugu; Satake, Honoo

    2015-01-01

    Sesamin is a furofuran lignan biosynthesized from the precursor lignan pinoresinol specifically in sesame seeds. This lignan is shown to exhibit anti-hypertensive activity, protect the liver from damage caused by ethanol and lipid oxidation, and reduce lung tumor growth. Despite rapidly rising demand, plant sources of lignans are frequently limited because of the high cost of locating and collecting plants. Indeed, the acquisition of sesamin depends exclusively on the conventional extraction of particular Sesamum seeds. In this study, we have created an efficient, stable and sustainable sesamin production system using triple-transgenic Forsythia koreana cell suspension cultures, U18i-CPi-Fk. These transgenic cell cultures were generated by stably introducing an RNA interference (RNAi) sequence against the pinoresinol-glucosylating enzyme, UGT71A18, into existing CPi-Fk cells, which had been created by introducing Sesamum indicum sesamin synthase (CYP81Q1) and an RNAi sequence against pinoresinol/lariciresinol reductase (PLR) into F. koreana cells. Compared to its transgenic prototype, U18i-CPi-Fk displayed 5-fold higher production of pinoresinol aglycone and 1.4-fold higher production of sesamin, while the wildtype cannot produce sesamin due to a lack of any intrinsic sesamin synthase. Moreover, red LED irradiation of U18i-CPi-Fk specifically resulted in 3.0-fold greater production of both pinoresinol aglycone and sesamin than production of these lignans in the dark, whereas pinoresinol production was decreased in the wildtype under red LED. In addition, we developed a procedure for sodium alginate-based long-term storage of U18i-CPi-Fk in liquid nitrogen. Production of sesamin in U18i-CPi-Fk re-thawed after six-month cryopreservation was equivalent to that of non-cryopreserved U18i-CPi-Fk. These data warrant on-demand production of sesamin anytime and anywhere. Collectively, the present study provides evidence that U18i-CPi-Fk is an

  17. Short Communication: Investigating a Chain of HIV Transmission Events Due to Homosexual Exposure and Blood Transfusion Based on a Next Generation Sequencing Method.

    PubMed

    Zhao, Qi; Zhang, Chen; Jiang, Yan; Wen, Yujie; Pan, Pinliang; Li, Yang; Zhang, Guiyun; Zhang, Lei; Qiu, Maofeng

    2015-12-01

    This study investigates a chain of HIV transmission events due to homosexual exposure and blood transfusion in China. The MiSeq platform, a next generation sequencing (NGS) system, was used to obtain genetic details of the HIV-1 env region (336 base pairs). Evolutionary analysis combined with epidemiologic evidence suggests a transmission chain from patient T3 to T2 through homosexual exposure and subsequently to T1 through blood transfusion. More importantly, a phylogenetic study suggested a likely genetic bottleneck for HIV in homosexual transmission from T3 to T2, while T1 inherited the majority of variants from T2. The result from the MiSeq platform is consistent with findings from the epidemiologic survey. The MiSeq platform is a powerful tool for tracing HIV transmissions and intrapersonal evolution.

  18. Combining next-generation sequencing and online databases for microsatellite development in non-model organisms

    PubMed Central

    Rico, Ciro; Normandeau, Eric; Dion-Côté, Anne-Marie; Rico, María Inés; Côté, Guillaume; Bernatchez, Louis

    2013-01-01

    Next-generation sequencing (NGS) is revolutionising marker development, and the rapidly increasing number of transcriptomes published across a wide variety of taxa is providing valuable sequence databases for the identification of genetic markers without the need to generate new sequences. Microsatellites are still the most important source of polymorphic markers in ecology and evolution. Motivated by our long-term interest in the adaptive radiation of a non-model species complex of whitefishes (Coregonus spp.), in this study, we focus on microsatellite characterisation and multiplex optimisation using transcriptome sequences generated by Illumina® and Roche-454, as well as online databases of Expressed Sequence Tags (EST) for the study of whitefish evolution and demographic history. We identified and optimised 40 polymorphic loci in multiplex PCR reactions and validated the robustness of our analyses by testing several population genetics and phylogeographic predictions using 494 fish from five lakes and two distinct ecotypes. PMID:24296905
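
    To make the idea of mining existing sequences for microsatellites concrete, here is a minimal sketch that scans a DNA string for perfect tandem repeats with a regular expression. It is not the pipeline used in the study, which the abstract does not describe. The function name, the thresholds (2 to 6 bp units, at least 5 copies), and the example sequence are hypothetical choices.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, unit, repeat_count) for perfect tandem repeats in seq."""
    # Group 1 is the repeat unit (shortest candidate tried first); \1{k,}
    # then requires at least k further identical copies right after it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip runs of a single base such as "AAAAAAAAAA"
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Hypothetical fragment: an (AC)8 dinucleotide and an (AGC)6 trinucleotide repeat.
example = "TTG" + "AC" * 8 + "GTT" + "AGC" * 6 + "TT"
for start, unit, count in find_microsatellites(example):
    print(f"position {start}: ({unit}){count}")
```

    A regular expression only finds perfect repeats; imperfect or compound microsatellites, primer design, and polymorphism screening would all need additional steps.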

  19. De novo assembly of the complete organelle genome sequences of azuki bean (Vigna angularis) using next-generation sequencers.

    PubMed

    Naito, Ken; Kaga, Akito; Tomooka, Norihiko; Kawase, Makoto

    2013-06-01

    Since chloroplasts and mitochondria are maternally inherited and have unique features in evolution, DNA sequences of those organelle genomes have been broadly used in phylogenetic studies. Thanks to recent progress in next-generation sequencer (NGS) technology, whole-genome sequencing can be easily performed. Here, using NGS data generated by Roche GS Titanium and Illumina HiSeq 2000, we performed a hybrid assembly of organelle genome sequences of Vigna angularis (azuki bean). Both the mitochondrial genome (mtDNA) and the chloroplast genome (cpDNA) of V. angularis have a very similar size and gene content to those of V. radiata (mungbean). However, in structure, mtDNA sequences have undergone many recombination events after divergence from the common ancestor of V. angularis and V. radiata, whereas cpDNAs are almost identical between the two. The stability of cpDNAs and the variability of mtDNAs were further confirmed by comparative analysis of Vigna organelles with model plants Lotus japonicus and Arabidopsis thaliana.

  20. Next-Generation Sequencing to Guide Clinical Trials

    PubMed Central

    Siu, Lillian L.; Conley, Barbara A.; Boerner, Scott; LoRusso, Patricia M.

    2015-01-01

    Rapidly accruing knowledge of the mutational landscape of malignant neoplasms, the increasing facility of massively parallel genomic sequencing, and the availability of drugs targeting many “driver” molecular abnormalities have spurred the oncologic community to consider how to use these new tools to improve cancer treatment. In order to assure that assignment of patients to a particular targeted treatment is likely to be beneficial to the patient, it will be necessary to conduct appropriate clinical research. It is clear that clinical (histology, stage) eligibility criteria are not sufficient for most clinical trials using agents that target mutations that are present in only a minority of patients. Recently, several clinical trial designs have been suggested to test the benefit of targeted treatment in molecular and/or clinical subgroups of patients. However, challenges remain in the implementation of such trials, including choice of assay, levels of evidence regarding gene variants, tumor heterogeneity, identifying resistance mechanisms, the necessity of screening large numbers of patients, infrastructure needs, and collaboration of investigators and industry. This article reviews current trial designs and discusses some of the considerations, advantages and drawbacks of designing clinical trials that depend on particular molecular variants as eligibility criteria. PMID:26473189

  1. Storm-generated bedforms and relict dissolution pits and channels on the Yucatan carbonate platform

    NASA Astrophysics Data System (ADS)

    Gulick, S. P.; Goff, J. A.; Stewart, H. A.; Perez-Cruz, L. L.; Davis, M. B.; Duncan, D.; Saustrup, S.; Sanford, J. C.; Fucugauchi, J. U.

    2013-12-01

    survey area. Therefore, none of these dissolution pits appear to be underlain by a cenote or sink hole. The NW sector of the survey area exhibits a more complex morphology than the alternating ribbon/bare rock morphology elsewhere, including linear scarps (up to ~1 m relief), deeper pitting (up to ~1 m relief), and sinuous, dendritic channeling (up to ~2 m relief). The geologic origin of these features will require further investigation. Sand drifts are present in this region, but are thinner and cover less area. These observations show the dominant modern sediment formation and transport processes on this starved platform are from large storms and hurricanes that place large regions of the platform at wave base. Remaining observed features were generated during times of lower sea level.

  2. De novo Sequence Assembly and Characterization of Lycoris aurea Transcriptome Using GS FLX Titanium Platform of 454 Pyrosequencing

    PubMed Central

    Wang, Ren; Xu, Sheng; Jiang, Yumei; Jiang, Jingwei; Li, Xiaodan; Liang, Lijian; He, Jia; Peng, Feng; Xia, Bing

    2013-01-01

    Background: Lycoris aurea, also called Golden Magic Lily, is an ornamentally and medicinally important species of the Amaryllidaceae family. To date, no whole-genome sequence is available for this non-model organism. Transcriptomic information is also scarce for this species. In this study, we performed de novo transcriptome sequencing to produce the first comprehensive expressed sequence tag (EST) dataset for L. aurea using high-throughput sequencing technology. Methodology and Principal Findings: Total RNA was isolated from leaves with sodium nitroprusside (SNP), salicylic acid (SA), or methyl jasmonate (MeJA) treatment, stems, and flowers at the bud, blooming, and wilting stages. Equal quantities of RNA from each tissue and stage were pooled to construct a cDNA library. Using 454 pyrosequencing technology, a total of 937,990 high quality reads (308.63 Mb) with an average read length of 329 bp were generated. Clustering and assembly of these reads produced a non-redundant set of 141,111 unique sequences, comprising 24,604 contigs and 116,507 singletons. All of the unique sequences were classified into the biological process, cellular component and molecular function categories by GO analysis. Potential genes and their functions were predicted by KEGG pathway mapping and COG analysis. Based on our sequence analysis and published literature, many putative genes involved in Amaryllidaceae alkaloids synthesis, including PAL, TYDC, OMT, NMT, P450, and other potentially important candidate genes, were identified for the first time in this Lycoris species. Furthermore, 6,386 SSRs and 18,107 high-confidence SNPs were identified in this EST dataset. Conclusions: The transcriptome provides invaluable new data as a functional genomics resource for future biological research in L. aurea. The molecular markers identified in this study will provide a material basis for future genetic linkage and quantitative trait loci analyses, and will provide useful information for functional

  3. Co-detection and sequencing of genes and transcripts from the same single cells facilitated by a microfluidics platform

    PubMed Central

    Han, Lin; Zi, Xiaoyuan; Garmire, Lana X.; Wu, Yu; Weissman, Sherman M.; Pan, Xinghua; Fan, Rong

    2014-01-01

    Despite recent advances in single-cell gene expression analyses, co-measurement of both genomic and transcriptional signatures at the single-cell level has not been realized. However, such analysis is necessary in order to accurately delineate how genetic information is transcribed, expressed, and regulated to give rise to an enormously diverse range of cell phenotypes. Here we report on a microfluidics-facilitated approach that allows for controlled separation of cytoplasmic and nuclear contents of a single cell followed by on-chip amplification of genomic DNA and cytoplasmic mRNA. When coupled with off-chip polymerase chain reaction, gel electrophoresis and Sanger sequencing, a panel of genes and transcripts from the same single cell can be co-detected and sequenced. This platform is potentially an enabling tool to permit multiple genomic measurements performed on the same single cells and opens new opportunities to tackle a range of fundamental biology questions including non-genetic cell-to-cell variability, epigenetic regulation, and stem cell fate control. It also helps address clinical challenges such as diagnosing intra-tumor heterogeneity and dissecting complex cellular immune responses. PMID:25255798

  4. Co-detection and sequencing of genes and transcripts from the same single cells facilitated by a microfluidics platform

    NASA Astrophysics Data System (ADS)

    Han, Lin; Zi, Xiaoyuan; Garmire, Lana X.; Wu, Yu; Weissman, Sherman M.; Pan, Xinghua; Fan, Rong

    2014-09-01

    Despite recent advances in single-cell gene expression analyses, co-measurement of both genomic and transcriptional signatures at the single-cell level has not been realized. However, such analysis is necessary in order to accurately delineate how genetic information is transcribed, expressed, and regulated to give rise to an enormously diverse range of cell phenotypes. Here we report on a microfluidics-facilitated approach that allows for controlled separation of cytoplasmic and nuclear contents of a single cell followed by on-chip amplification of genomic DNA and cytoplasmic mRNA. When coupled with off-chip polymerase chain reaction, gel electrophoresis and Sanger sequencing, a panel of genes and transcripts from the same single cell can be co-detected and sequenced. This platform is potentially an enabling tool to permit multiple genomic measurements performed on the same single cells and opens new opportunities to tackle a range of fundamental biology questions including non-genetic cell-to-cell variability, epigenetic regulation, and stem cell fate control. It also helps address clinical challenges such as diagnosing intra-tumor heterogeneity and dissecting complex cellular immune responses.

  5. Now and Next-Generation Sequencing Techniques: Future of Sequence Analysis Using Cloud Computing

    PubMed Central

    Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav

    2012-01-01

    Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed “cloud computing”) has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows. PMID:23248640

  6. Now and next-generation sequencing techniques: future of sequence analysis using cloud computing.

    PubMed

    Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav

    2012-01-01

    Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed "cloud computing") has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows.

  7. Next-generation sequencing for targeted discovery of rare mutations in rice

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Advances in DNA sequencing (i.e., next-generation sequencing, NGS) have greatly increased the power and efficiency of detecting rare mutations in large mutant populations. Targeting Induced Local Lesions in Genomes (TILLING) is a reverse genetics approach for identifying gene mutations resulting fro...

  8. From FASTQ to Function: In Silico Methods for Processing Next-Generation Sequencing Data.

    PubMed

    Preston, Mark D; Stabler, Richard A

    2016-01-01

    This chapter presents a method to process C. difficile whole-genome next-generation sequencing data straight from the sequencer. Quality control processing and de novo assembly of these data enable downstream analyses such as gene annotation and in silico multi-locus strain-type identification. PMID:27507331

  10. RNA editing generates cellular subsets with diverse sequence within populations

    PubMed Central

    Harjanto, Dewi; Papamarkou, Theodore; Oates, Chris J.; Rayon-Estrada, Violeta; Papavasiliou, F. Nina; Papavasiliou, Anastasia

    2016-01-01

    RNA editing is a mutational mechanism that specifically alters the nucleotide content in transcribed RNA. However, editing rates vary widely, and could result from equivalent editing amongst individual cells, or represent an average of variable editing within a population. To distinguish between these two possibilities, we present a hierarchical Bayesian model that quantifies the variance of editing rates at specific sites using RNA-seq data from both single cells and a cognate bulk sample. The model predicts high variance for specific edited sites in murine macrophages and dendritic cells, findings that we validated experimentally by using targeted amplification of specific editable transcripts from single cells. The model also predicts changes in variance in editing rates for specific sites in dendritic cells during the course of LPS stimulation. Our data demonstrate substantial variance in editing signatures amongst single cells, supporting the notion that RNA editing generates diversity within cellular populations. PMID:27418407

  11. Next-generation Sequencing of 16S Ribosomal RNA Gene Amplicons

    PubMed Central

    Sanschagrin, Sylvie; Yergeau, Etienne

    2014-01-01

    One of the major questions in microbial ecology is “who is there?” This question can be answered using various tools, but one of the long-lasting gold standards is to sequence 16S ribosomal RNA (rRNA) gene amplicons generated by domain-level PCR reactions amplifying from genomic DNA. Traditionally, this was performed by cloning and Sanger (capillary electrophoresis) sequencing of PCR amplicons. The advent of next-generation sequencing has tremendously simplified 16S rRNA gene sequencing and increased its sequencing depth. The introduction of benchtop sequencers now allows small labs to perform their 16S rRNA sequencing in-house in a matter of days. Here, an approach for 16S rRNA gene amplicon sequencing using a benchtop next-generation sequencer is detailed. The environmental DNA is first amplified by PCR using primers that contain sequencing adapters and barcodes. The amplicons are then coupled to spherical particles via emulsion PCR. The particles are loaded on a disposable chip and the chip is inserted in the sequencing machine, after which the sequencing is performed. The sequences are retrieved in fastq format and filtered, and the barcodes are used to establish the sample membership of the reads. The filtered and binned reads are then further analyzed using publicly available tools. An example analysis where the reads were classified with a taxonomy-finding algorithm within the software package Mothur is given. The method outlined here is simple, inexpensive and straightforward and should help smaller labs to take advantage of the ongoing genomic revolution. PMID:25226019
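
    The filtering and barcode-binning step described in this record can be sketched roughly as follows. This is only an illustrative sketch, not the authors' pipeline: the function names, the Phred+33 quality encoding, the mean-quality threshold, and the placement of the barcode at the start of each read are all assumptions made for the example.

```python
from collections import defaultdict


def parse_fastq(lines):
    """Yield (header, sequence, quality) from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = next(it).strip()
        next(it)  # skip the '+' separator line
        qual = next(it).strip()
        yield header.strip(), seq, qual


def demultiplex(records, barcodes, min_mean_q=20):
    """Drop low-quality reads, then bin the rest by leading barcode.

    barcodes maps barcode sequence -> sample name (hypothetical values).
    Quality is assumed to be Phred+33 encoded.
    """
    bins = defaultdict(list)
    for _header, seq, qual in records:
        if not qual:
            continue
        mean_q = sum(ord(c) - 33 for c in qual) / len(qual)
        if mean_q < min_mean_q:
            continue  # filtered out
        for barcode, sample in barcodes.items():
            if seq.startswith(barcode):
                # Store the read with its barcode trimmed off.
                bins[sample].append(seq[len(barcode):])
                break
    return dict(bins)


example_lines = [
    "@r1", "ACGTTTGACC", "+", "IIIIIIIIII",
    "@r2", "TGCAGGGTTA", "+", "IIIIIIIIII",
    "@r3", "ACGTAAAA", "+", "!!!!!!!!",  # low quality, filtered
]
example_barcodes = {"ACGT": "soil", "TGCA": "water"}
print(demultiplex(parse_fastq(example_lines), example_barcodes))
```

    Real barcode handling usually tolerates mismatches and trims primers as well; exact prefix matching is used here only to keep the sketch short.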

  12. Assessment of Epstein-Barr virus nucleic acids in gastric but not in breast cancer by next-generation sequencing of pooled Mexican samples.

    PubMed

    Fuentes-Pananá, Ezequiel M; Larios-Serrato, Violeta; Méndez-Tenorio, Alfonso; Morales-Sánchez, Abigail; Arias, Carlos F; Torres, Javier

    2016-03-01

    Gastric (GC) and breast (BrC) cancer are two of the most common and deadly tumours. Different lines of evidence suggest a possible causative role of viral infections for both GC and BrC. Wide genome sequencing (WGS) technologies allow searching for viral agents in tissues of patients with cancer. These technologies have already contributed to establishing virus-cancer associations as well as to the discovery of new tumour viruses. The objective of this study was to document possible associations of viral infection with GC and BrC in Mexican patients. To gain an idea of cost-effective conditions for experimental sequencing, we first carried out an in silico simulation of WGS. The next-generation platform Illumina GAIIx was then used to sequence GC and BrC tumour samples. While we did not find viral sequences in tissues from BrC patients, multiple reads matching Epstein-Barr virus (EBV) sequences were found in GC tissues. An end-point polymerase chain reaction confirmed an enrichment of EBV sequences in one of the GC samples sequenced, validating the next-generation sequencing-bioinformatics pipeline. PMID:26910355

  13. Assessment of Epstein-Barr virus nucleic acids in gastric but not in breast cancer by next-generation sequencing of pooled Mexican samples

    PubMed Central

    Fuentes-Pananá, Ezequiel M; Larios-Serrato, Violeta; Méndez-Tenorio, Alfonso; Morales-Sánchez, Abigail; Arias, Carlos F; Torres, Javier

    2016-01-01

    Gastric (GC) and breast (BrC) cancer are two of the most common and deadly tumours. Different lines of evidence suggest a possible causative role of viral infections for both GC and BrC. Wide genome sequencing (WGS) technologies allow searching for viral agents in tissues of patients with cancer. These technologies have already contributed to establishing virus-cancer associations as well as to the discovery of new tumour viruses. The objective of this study was to document possible associations of viral infection with GC and BrC in Mexican patients. To gain an idea of cost-effective conditions for experimental sequencing, we first carried out an in silico simulation of WGS. The next-generation platform Illumina GAIIx was then used to sequence GC and BrC tumour samples. While we did not find viral sequences in tissues from BrC patients, multiple reads matching Epstein-Barr virus (EBV) sequences were found in GC tissues. An end-point polymerase chain reaction confirmed an enrichment of EBV sequences in one of the GC samples sequenced, validating the next-generation sequencing-bioinformatics pipeline. PMID:26910355

  15. Microbial profiling of South African acid mine water samples using next generation sequencing platform.

    PubMed

    Kamika, I; Azizi, S; Tekere, M

    2016-07-01

    This study monitored changes in bacterial and fungal structure in a mine water on a monthly basis over 4 months. Over the 4-month study period, mine water samples contained more bacteria (91.06 %) compared to fungi (8.94 %). For bacteria, mine water samples were dominated by Proteobacteria (39.14 to 65.06 %) followed by Firmicutes (26.34 to 28.9 %) in summer, and Cyanobacteria (27.05 %) in winter. In the collected samples, 18 % of bacteria could not be assigned to a phylum and remained unclassified, suggesting hitherto vast untapped microbial diversity, especially during winter. Fungi were the sole eukaryotic microorganisms found in the mine water samples, with unclassified fungi (68.2 to 91 %) as the predominant group, followed by Basidiomycota (6.9 to 27.8 %). The time of collection, which was linked to the weather, had a higher impact on the bacterial community than on the fungal community. The bacterial operational taxonomic units (OTUs) ranged from 865 to 4052 over the 4-month sampling period, while fungal OTUs varied from 73 to 249. The diversity indices suggested that the bacterial community inhabiting the mine water samples was more diverse than the fungal community. The canonical correspondence analysis (CCA) results highlighted that the bacterial community variance had the strongest relationship with water temperature, conductivity, pH, and dissolved oxygen (DO) content, as compared to fungi, and that water characteristics had the greatest contribution to both bacterial and fungal community variance. The results provided the relationships between microbial community and environmental variables in the studied mining sites. PMID:26980100

  16. Genetic sequence relationships of Winnipegosis platform carbonates, southern Elk Point basin, North Dakota

    SciTech Connect

    Shanley, K.W.; Cross, T.A.

    1988-01-01

    Examination of cores and well log data from the Winnipegosis Formation (Givetian) within a study area of approximately 11,500 mi² (30,000 km²) in northern North Dakota allows recognition of seven time-stratigraphic progradational units within the Winnipegosis Formation. Together with the underlying Ashern Formation, these units are arranged in landward-stepping, vertical stacking, and seaward-stepping geometric patterns, which reflect changes in relative sea level. Abrupt juxtaposition of shallow over deeper water lithologies, evidence for subaerial exposure, and onlap geometries further suggest that these progradational units form two larger, Vail-type sequences separated by regionally persistent unconformities or their correlative conformities. Sea level rise during the early Eifelian caused southeastward onlap of the Ashern Formation onto Middle Silurian carbonates of the Interlake Formation. Maximum flooding, expressed by deepest marine facies and a hardground surface, suggests the existence of a condensed section at the top of the Ashern Formation. This was developed during the maximum rate of sea level rise. A decrease in the rate of sea level rise resulted in aggradation of lower Winnipegosis units on a gently dipping ramp. These are represented by nodular and burrowed open marine limestones with scattered stromatoporoid patch reefs and grainstone shoals. During the subsequent sea level fall, represented by Temple units, a shelf margin with pronounced depositional topography and adjacent starved basin were developed. Temple strata include coral-brachiopod-stromatoporoid reefs and productive fore-reef talus deposits along the shelf margin rim.

  17. Thermal Test of an Improved Platform for Silicon Nanowire-Based Thermoelectric Micro-generators

    NASA Astrophysics Data System (ADS)

    Calaza, C.; Fonseca, L.; Salleras, M.; Donmez, I.; Tarancón, A.; Morata, A.; Santos, J. D.; Gadea, G.

    2016-03-01

    This work reports on an improved design intended to enhance the thermal isolation between the hot and cold parts of a silicon-based thermoelectric microgenerator. Micromachining techniques and silicon on insulator substrates are used to obtain a suspended silicon platform surrounded by a bulk silicon rim, in which arrays of bottom-up silicon nanowires are integrated later on to join both parts with a thermoelectric active material. In previous designs the platform was linked to the rim by means of bulk silicon bridges, used as mechanical support and holder for the electrical connections. Such supports severely reduce platform thermal isolation and penalise the functional area due to the need for longer supports. A new technological route is planned to obtain low thermal conductance supports, making use of a particular geometrical design and a wet bulk micromachining process to selectively remove silicon, shaping a thin dielectric membrane. Thermal conductance measurements have been performed to analyse the influence of the different design parameters of the suspended platform (support type, bridge/membrane length, separation between platform and silicon rim) on overall thermal isolation. A thermal conductance reduction from 1.82 mW/K to 1.03 mW/K has been obtained on tested devices by changing the support type, even though its length has been halved.

  18. De novo genome assembly of the economically important weed horseweed using integrated data from multiple sequencing platforms.

    PubMed

    Peng, Yanhui; Lai, Zhao; Lane, Thomas; Nageswara-Rao, Madhugiri; Okada, Miki; Jasieniuk, Marie; O'Geen, Henriette; Kim, Ryan W; Sammons, R Douglas; Rieseberg, Loren H; Stewart, C Neal

    2014-11-01

    Horseweed (Conyza canadensis), a member of the Compositae (Asteraceae) family, was the first broadleaf weed to evolve resistance to glyphosate. Horseweed, one of the most problematic weeds in the world, is a true diploid (2n = 2x = 18), with the smallest genome of any known agricultural weed (335 Mb). Thus, it is an appropriate candidate to help us understand the genetic and genomic bases of weediness. We undertook a draft de novo genome assembly of horseweed by combining data from multiple sequencing platforms (454 GS-FLX, Illumina HiSeq 2000, and PacBio RS) using various libraries with different insertion sizes (approximately 350 bp, 600 bp, 3 kb, and 10 kb) of a Tennessee-accessed, glyphosate-resistant horseweed biotype. From 116.3 Gb (approximately 350× coverage) of data, the genome was assembled into 13,966 scaffolds with 50% of the assembly = 33,561 bp. The assembly covered 92.3% of the genome, including the complete chloroplast genome (approximately 153 kb) and a nearly complete mitochondrial genome (approximately 450 kb in 120 scaffolds). The nuclear genome is composed of 44,592 protein-coding genes. Genome resequencing of seven additional horseweed biotypes was performed. These sequence data were assembled and used to analyze genome variation. Simple sequence repeat and single-nucleotide polymorphisms were surveyed. Genomic patterns were detected that associated with glyphosate-resistant or -susceptible biotypes. The draft genome will be useful to better understand weediness and the evolution of herbicide resistance and to devise new management strategies. The genome will also be useful as another reference genome in the Compositae. To our knowledge, this article represents the first published draft genome of an agricultural weed.
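
    The coverage figure quoted in this record can be checked with simple arithmetic, and the phrase "50% of the assembly = 33,561 bp" reads like an N50-style statistic. The sketch below works both out under stated assumptions: the function names are hypothetical, the N50 definition used is the common one (the scaffold length at which the running total of sorted lengths first reaches half the assembly), and the toy scaffold lengths are invented for illustration rather than taken from the horseweed assembly.

```python
def fold_coverage(total_bases, genome_size):
    """Sequencing depth: total sequenced bases divided by genome size."""
    return total_bases / genome_size


def n50(lengths):
    """Length L such that scaffolds of length >= L hold at least half the assembly."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no scaffold lengths given")


# 116.3 Gb of data over a 335 Mb genome, as stated in the abstract.
print(round(fold_coverage(116.3e9, 335e6)))  # 347, i.e. roughly 350x

# Invented toy lengths, only to show how the statistic behaves.
print(n50([80, 70, 50, 40, 30, 20, 10]))  # 70
```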

  19. Efficient generation of transgenic cattle using the DNA transposon and their analysis by next-generation sequencing.

    PubMed

    Yum, Soo-Young; Lee, Song-Jeon; Kim, Hyun-Min; Choi, Woo-Jae; Park, Ji-Hyun; Lee, Won-Wu; Kim, Hee-Soo; Kim, Hyeong-Jong; Bae, Seong-Hun; Lee, Je-Hyeong; Moon, Joo-Yeong; Lee, Ji-Hyun; Lee, Choong-Il; Son, Bong-Jun; Song, Sang-Hoon; Ji, Su-Min; Kim, Seong-Jin; Jang, Goo

    2016-01-01

    Here, we efficiently generated transgenic cattle using two transposon systems (Sleeping Beauty and piggyBac) and analyzed their genomes by next-generation sequencing (NGS). Blastocysts derived from microinjection of DNA transposons were selected and transferred into recipient cows. Nine transgenic cattle have been generated and have grown up to date without any health issues, except for two. Some of them expressed strong fluorescence, and the transgene in the oocytes from a superovulating one was detected by PCR and sequencing. To investigate genomic variants caused by the transgene transposition, whole genomic DNA was analyzed by NGS. We identified the preferred transposon integration sites (TA or TTAA) in their genomes. Even though multiple copies (i.e. fifteen) were confirmed, there was no significant difference in genome instabilities. In conclusion, we demonstrated that transgenic cattle could be efficiently generated using the DNA transposon system, and all those animals could be a valuable resource for agriculture and veterinary science. PMID:27324781

  20. Efficient generation of transgenic cattle using the DNA transposon and their analysis by next-generation sequencing

    PubMed Central

    Yum, Soo-Young; Lee, Song-Jeon; Kim, Hyun-Min; Choi, Woo-Jae; Park, Ji-Hyun; Lee, Won-Wu; Kim, Hee-Soo; Kim, Hyeong-Jong; Bae, Seong-Hun; Lee, Je-Hyeong; Moon, Joo-Yeong; Lee, Ji-Hyun; Lee, Choong-Il; Son, Bong-Jun; Song, Sang-Hoon; Ji, Su-Min; Kim, Seong-Jin; Jang, Goo

    2016-01-01

    Here, we efficiently generated transgenic cattle using two transposon systems (Sleeping Beauty and piggyBac) and analyzed their genomes by next-generation sequencing (NGS). Blastocysts derived from microinjection of DNA transposons were selected and transferred into recipient cows. Nine transgenic cattle have been generated and have grown up to date without any health issues, except for two. Some of them expressed strong fluorescence, and the transgene in the oocytes from a superovulating one was detected by PCR and sequencing. To investigate genomic variants caused by the transgene transposition, whole genomic DNA was analyzed by NGS. We identified the preferred transposon integration sites (TA or TTAA) in their genomes. Even though multiple copies (i.e. fifteen) were confirmed, there was no significant difference in genome instabilities. In conclusion, we demonstrated that transgenic cattle could be efficiently generated using the DNA transposon system, and all those animals could be a valuable resource for agriculture and veterinary science. PMID:27324781

  1. Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplificatio...

  2. Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplificatio...

  3. Comparison of normalization methods for construction of large, multiplex amplicon pools for next-generation sequencing.

    PubMed

    Harris, J Kirk; Sahl, Jason W; Castoe, Todd A; Wagner, Brandie D; Pollock, David D; Spear, John R

    2010-06-01

    Constructing mixtures of tagged or bar-coded DNAs for sequencing is an important requirement for the efficient use of next-generation sequencers in applications where limited sequence data are required per sample. There are many applications in which next-generation sequencing can be used effectively to sequence large mixed samples; an example is the characterization of microbial communities, where relatively few sequences per sample are adequate to address research questions. Thus, it is possible to examine hundreds to thousands of samples per run on massively parallel next-generation sequencers. However, the cost savings from efficient utilization of sequence capacity are realized only if the production and management costs associated with construction of multiplex pools are also scalable. One critical step in multiplex pool construction is the normalization process, whereby equimolar amounts of each amplicon are mixed. Here we compare three approaches (spectroscopy, size-restricted spectroscopy, and quantitative binding) for normalization of large, multiplex amplicon pools for performance and efficiency. We found that the quantitative binding approach was superior and represents an efficient, scalable process for construction of very large multiplex pools with hundreds and perhaps thousands of individual amplicons included. We demonstrate the increased sequence diversity identified with higher throughput. Massively parallel sequencing can dramatically accelerate microbial ecology studies by allowing appropriate replication of sequence acquisition to account for temporal and spatial variations. Further, population studies to examine genetic variation, which require even lower levels of sequencing, should be possible where thousands of individual bar-coded amplicons are examined in parallel. PMID:20418443
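
    The equimolar mixing step mentioned in this abstract can be made concrete with a small calculation. This is a sketch under stated assumptions, not the authors' protocol: it assumes each amplicon's mass concentration (ng/µL) and length (bp) have been measured, and it uses the common approximation of 660 g/mol per base pair of double-stranded DNA. The function names and the sample values are hypothetical.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (common approximation)

def molarity_nM(conc_ng_per_ul, length_bp):
    """Convert a dsDNA mass concentration (ng/µL) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)

def equimolar_volumes(samples, fmol_per_sample):
    """Return the volume (µL) of each amplicon that contributes the same molar amount.

    `samples` maps a sample name to (concentration in ng/µL, length in bp).
    1 nM equals 1 fmol/µL, so volume = target fmol / molarity in nM.
    """
    return {
        name: fmol_per_sample / molarity_nM(conc, length)
        for name, (conc, length) in samples.items()
    }

if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    samples = {"sample_a": (6.6, 1000), "sample_b": (13.2, 500)}
    for name, vol in equimolar_volumes(samples, 50.0).items():
        print(f"{name}: {vol:.2f} uL")
```

    Shorter or more concentrated amplicons need a smaller volume to contribute the same number of molecules, which is why pooling equal volumes or equal masses does not give an equimolar pool.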

  4. Next generation sequencing in clinical medicine: Challenges and lessons for pathology and biomedical informatics

    PubMed Central

    Gullapalli, Rama R.; Desai, Ketaki V.; Santana-Santos, Lucas; Kant, Jeffrey A.; Becich, Michael J.

    2012-01-01

    The Human Genome Project (HGP) provided the initial draft of mankind's DNA sequence in 2001. The HGP was carried out by 23 collaborating laboratories using Sanger sequencing of mapped regions as well as shotgun sequencing techniques, in a process that took 13 years at a cost of ~$3 billion. Today, Next Generation Sequencing (NGS) techniques represent the next phase in the evolution of DNA sequencing technology, at dramatically reduced cost compared to traditional Sanger sequencing. A single laboratory today can sequence the entire human genome in a few days for a few thousand dollars in reagents and staff time. Routine whole exome or even whole genome sequencing of clinical patients is well within the realm of affordability for many academic institutions across the country. This paper reviews current sequencing technology methods and upcoming advancements in sequencing technology, as well as challenges associated with data generation, data manipulation and data storage. Implementation of routine NGS data in cancer genomics is discussed along with potential pitfalls in the interpretation of the NGS data. The overarching importance of bioinformatics in the clinical implementation of NGS is emphasized.[7] We also review the issue of physician education, which is also an important consideration for the successful implementation