sequencing technology combined: Topics by Science.gov

Sample records for sequencing technology combined

A rapid and cost-effective method for sequencing pooled cDNA clones by using a combination of transposon insertion and Gateway technology.

PubMed

Morozumi, Takeya; Toki, Daisuke; Eguchi-Ogawa, Tomoko; Uenishi, Hirohide

2011-09-01

Large-scale cDNA-sequencing projects require an efficient strategy for mass sequencing. Here we describe a method for sequencing pooled cDNA clones using a combination of transposon insertion and Gateway technology. Our method reduces the number of shotgun clones that are unsuitable for reconstruction of cDNA sequences, and has the advantage of reducing the total costs of the sequencing project.
[Current applications of high-throughput DNA sequencing technology in antibody drug research].

PubMed

Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong

2012-03-01

Since the 2005 publication of a high-throughput DNA sequencing technology based on PCR carried out in oil emulsions, high-throughput DNA sequencing platforms have evolved into a robust technology for sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation for discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we present a review of current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach to the discovery and development of antibody drugs.
SNMR pulse sequence phase cycling

DOEpatents

Walsh, David O; Grunewald, Elliot D

2013-11-12

Technologies applicable to SNMR pulse sequence phase cycling are disclosed, including SNMR acquisition apparatus and methods, SNMR processing apparatus and methods, and combinations thereof. SNMR acquisition may include transmitting two or more SNMR pulse sequences and applying a phase shift to a pulse in at least one of the pulse sequences, according to any of a variety of cycling techniques. SNMR processing may include combining SNMR from a plurality of pulse sequences comprising pulses of different phases, so that desired signals are preserved and undesired signals are canceled.
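The cancellation idea described above can be illustrated with a minimal two-step phase cycle. This is only a sketch under stated assumptions, not the method claimed in the patent: the signal model (a decaying cosine whose sign follows the pulse phase), the constant artifact, and the names `acquire` and `phase_cycle` are all hypothetical.

```python
import math

ARTIFACT = 0.3  # hypothetical phase-independent artifact (e.g. a constant offset)

def acquire(pulse_phase, n=200, freq=5.0):
    """Simulate one record: a decaying signal whose phase follows the pulse
    phase, plus an artifact that does not depend on the pulse phase."""
    record = []
    for k in range(n):
        t = k / n
        signal = math.cos(2 * math.pi * freq * t + pulse_phase) * math.exp(-3 * t)
        record.append(signal + ARTIFACT)
    return record

def phase_cycle(rec_0, rec_pi):
    """Two-step phase cycle: subtract the pi-shifted record and halve.
    The signal (which flipped sign) adds; the artifact (which did not) cancels."""
    return [(a - b) / 2 for a, b in zip(rec_0, rec_pi)]

rec_0 = acquire(0.0)          # pulse applied with phase 0
rec_pi = acquire(math.pi)     # same pulse with a 180-degree phase shift
combined = phase_cycle(rec_0, rec_pi)
clean = [x - ARTIFACT for x in rec_0]  # the artifact-free signal, for comparison
```

Because shifting the pulse phase by 180 degrees inverts the simulated signal but leaves the artifact unchanged, the difference of the two records retains only the signal.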
A combination of LongSAGE with Solexa sequencing is well suited to explore the depth and the complexity of transcriptome

PubMed Central

Hanriot, Lucie; Keime, Céline; Gay, Nadine; Faure, Claudine; Dossat, Carole; Wincker, Patrick; Scoté-Blachon, Céline; Peyron, Christelle; Gandrillon, Olivier

2008-01-01

Background: "Open" transcriptome analysis methods allow gene expression to be studied without a priori knowledge of the transcript sequences. At present, SAGE (Serial Analysis of Gene Expression), LongSAGE and MPSS (Massively Parallel Signature Sequencing) are the most widely used methods for "open" transcriptome analysis. Both LongSAGE and MPSS rely on the isolation of 21 bp tag sequences from each transcript. In contrast to LongSAGE, the high-throughput sequencing method used in MPSS enables the rapid sequencing of very large libraries containing several million tags, allowing deep transcriptome analysis. However, a bias in the complexity of the transcriptome representation obtained by MPSS was recently uncovered. Results: To analyze the mouse hypothalamus transcriptome in depth while avoiding the limitation introduced by MPSS, we combined LongSAGE with the Solexa sequencing technology and obtained a library of more than 11 million tags. We then compared it to a LongSAGE library of mouse hypothalamus sequenced with the Sanger method. Conclusion: We found that Solexa sequencing technology combined with LongSAGE is perfectly suited for deep transcriptome analysis. In contrast to MPSS, it gives a complex representation of the transcriptome that is as reliable as a LongSAGE library sequenced by the Sanger method. PMID:18796152
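As a rough illustration of tag-based expression profiling, the sketch below extracts one 21 bp tag per transcript and counts tag occurrences. It assumes the tag starts at the 3'-most `CATG` site (the recognition site of the anchoring enzyme commonly used in SAGE protocols); that detail, the function names, and the toy sequences are assumptions, not taken from the abstract.

```python
from collections import Counter

ANCHOR = "CATG"   # assumed anchoring-enzyme site; not stated in the abstract
TAG_LENGTH = 21   # tag length given in the abstract

def extract_tag(transcript):
    """Return the 21 bp tag starting at the 3'-most anchor site, or None
    if there is no anchor site or too little sequence after it."""
    pos = transcript.rfind(ANCHOR)
    if pos == -1 or pos + TAG_LENGTH > len(transcript):
        return None
    return transcript[pos:pos + TAG_LENGTH]

def count_tags(transcripts):
    """Count how often each tag is observed across a set of transcripts."""
    tags = (extract_tag(t) for t in transcripts)
    return Counter(tag for tag in tags if tag is not None)

# Toy data: two copies of one transcript, one copy of another, one without a site.
t1 = "GGG" + "CATG" + "A" * 17 + "TTT"
t2 = "CCC" + "CATG" + "C" * 17 + "GGG"
t3 = "ACGTACGTACGT"
tag_counts = count_tags([t1, t1, t2, t3])
```

Tag counts of this kind are what a deeper library resolves more finely: rare transcripts only appear once enough tags have been sequenced.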
Swine transcriptome characterization by combined Iso-Seq and RNA-seq for annotating the emerging long read-based reference genome

USDA-ARS's Scientific Manuscript database

PacBio long-read sequencing technology is increasingly popular in genome sequence assembly and transcriptome cataloguing. Recently, a new-generation pig reference genome was assembled based on long reads from this technology. To finely annotate this genome assembly, transcriptomes of nine tissues fr...
DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

PubMed

Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

2012-01-01

DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.
Single molecule targeted sequencing for cancer gene mutation detection.

PubMed

Gao, Yan; Deng, Liwei; Yan, Qin; Gao, Yongqian; Wu, Zengding; Cai, Jinsen; Ji, Daorui; Li, Gailing; Wu, Ping; Jin, Huan; Zhao, Luyang; Liu, Song; Ge, Liangjin; Deem, Michael W; He, Jiankui

2016-05-19

With the rapid decline in the cost of sequencing, it is now affordable to examine multiple genes in a single disease-targeted clinical test using next generation sequencing. Current targeted sequencing methods require a separate step of targeted capture enrichment during sample preparation before sequencing. Although fast sample preparation methods are available on the market, the library preparation process is still relatively complicated for physicians to use routinely. Here, we introduced an amplification-free Single Molecule Targeted Sequencing (SMTS) technology, which combined targeted capture and sequencing in one step. We demonstrated that this technology can detect low-frequency mutations using an artificially synthesized DNA sample. SMTS has several potential advantages, including simple sample preparation and the absence of biases and errors introduced by PCR. SMTS has the potential to be an easy and quick sequencing technology for clinical diagnosis such as cancer gene mutation detection, infectious disease detection, inherited condition screening and noninvasive prenatal diagnosis.
Strategies for optimizing BioNano and Dovetail explored through a second reference quality assembly for the legume model, Medicago truncatula.

PubMed

Moll, Karen M; Zhou, Peng; Ramaraj, Thiruvarangan; Fajardo, Diego; Devitt, Nicholas P; Sadowsky, Michael J; Stupar, Robert M; Tiffin, Peter; Miller, Jason R; Young, Nevin D; Silverstein, Kevin A T; Mudge, Joann

2017-08-04

Third generation sequencing technologies, with sequencing reads in the tens of kilobases, facilitate genome assembly by spanning ambiguous regions and improving continuity. This has been critical for plant genomes, which are difficult to assemble due to high repeat content, gene family expansions, segmental and tandem duplications, and polyploidy. Recently, high-throughput mapping and scaffolding strategies have further improved continuity. Together, these long-range technologies enable quality draft assemblies of complex genomes in a cost-effective and timely manner. Here, we present high quality genome assemblies of the model legume plant, Medicago truncatula (R108) using PacBio, Dovetail Chicago (hereafter, Dovetail) and BioNano technologies. To test these technologies for plant genome assembly, we generated five assemblies using all possible combinations and ordering of these three technologies in the R108 assembly. While the BioNano and Dovetail joins overlapped, they also showed complementary gains in continuity and join numbers. Both technologies spanned repetitive regions that PacBio alone was unable to bridge. Combining technologies, particularly Dovetail followed by BioNano, resulted in notable improvements compared to Dovetail or BioNano alone. A combination of PacBio, Dovetail, and BioNano was used to generate a high quality draft assembly of R108, a M. truncatula accession widely used in studies of functional genomics. As a test for the usefulness of the resulting genome sequence, the new R108 assembly was used to pinpoint breakpoints and characterize flanking sequence of a previously identified translocation between chromosomes 4 and 8, identifying more than 22.7 Mb of novel sequence not present in the earlier A17 reference assembly. Adding Dovetail followed by BioNano data yielded complementary improvements in continuity over the original PacBio assembly. This strategy proved efficient and cost-effective for developing a quality draft assembly compared to traditional reference assemblies.
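Assembly continuity is often summarized with the N50 statistic. The abstract does not say which continuity metric was used, so the sketch below is only an assumption-laden illustration: the contig lengths are made up, and `n50` is a hypothetical helper showing how joining contigs into scaffolds raises the value.

```python
def n50(lengths):
    """Return the N50: the largest length L such that sequences of length
    >= L together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly

# Hypothetical contig lengths (arbitrary units) before scaffolding.
contigs = [40, 30, 20, 10]
# The same sequence after two joins: 40+20 and 30+10.
scaffolds = [60, 40]

contig_n50 = n50(contigs)
scaffold_n50 = n50(scaffolds)
```

The total length is unchanged by the joins, but the N50 rises because fewer, longer pieces are needed to cover half of the assembly.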
Noninvasive Prenatal Paternity Testing (NIPAT) through Maternal Plasma DNA Sequencing: A Pilot Study.

PubMed

Jiang, Haojun; Xie, Yifan; Li, Xuchao; Ge, Huijuan; Deng, Yongqiang; Mu, Haofang; Feng, Xiaoli; Yin, Lu; Du, Zhou; Chen, Fang; He, Nongyue

2016-01-01

Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have already been used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The technologies most frequently used were PCR followed by capillary electrophoresis and SNP typing arrays, respectively. Here, we developed a noninvasive prenatal paternity testing (NIPAT) method based on SNP typing with maternal plasma DNA sequencing. We evaluated the influencing factors (minor allele frequency (MAF), the total number of SNPs, fetal fraction and effective sequencing depth) and designed three different selective SNP panels in order to verify the performance in clinical cases. Combining targeted deep sequencing of selective SNPs with an informative bioinformatics pipeline, we calculated the combined paternity index (CPI) of 17 cases to determine paternity. Sequencing-based NIPAT results fully agreed with invasive prenatal paternity tests using an STR multiplex system. Our study showed that maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative for forensic applications in the future.
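The combined paternity index is conventionally the product of the per-locus paternity indices, and it can be converted to a probability of paternity given a prior. The sketch below shows only that final arithmetic; the per-locus values are invented, and how the study derives each locus value from plasma sequencing data is not described in the abstract and is not modeled here.

```python
import math

def combined_paternity_index(locus_pis):
    """Multiply per-locus paternity indices into a combined index (CPI)."""
    return math.prod(locus_pis)

def probability_of_paternity(cpi, prior=0.5):
    """Convert a CPI into a posterior probability of paternity.
    With the conventional prior of 0.5 this reduces to CPI / (CPI + 1)."""
    return cpi * prior / (cpi * prior + (1 - prior))

# Hypothetical per-locus paternity indices for three informative SNPs.
locus_pis = [2.0, 1.5, 4.0]
cpi = combined_paternity_index(locus_pis)
posterior = probability_of_paternity(cpi)
```

Because the index is multiplicative, many weakly informative SNPs can together yield a decisive CPI, which is one reason the number of SNPs matters in panel design.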
A computational genomics pipeline for prokaryotic sequencing projects.

PubMed

Kislyuk, Andrey O; Katz, Lee S; Agrawal, Sonia; Hagen, Matthew S; Conley, Andrew B; Jayaraman, Pushkala; Nelakuditi, Viswateja; Humphrey, Jay C; Sammons, Scott A; Govil, Dhwani; Mair, Raydel D; Tatti, Kathleen M; Tondella, Maria L; Harcourt, Brian H; Mayer, Leonard W; Jordan, I King

2010-08-01

New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems.
Biosensors for DNA sequence detection

NASA Technical Reports Server (NTRS)

Vercoutere, Wenonah; Akeson, Mark

2002-01-01

DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.
The fast changing landscape of sequencing technologies and their impact on microbial genome assemblies and annotation.

PubMed

Mavromatis, Konstantinos; Land, Miriam L; Brettin, Thomas S; Quest, Daniel J; Copeland, Alex; Clum, Alicia; Goodwin, Lynne; Woyke, Tanja; Lapidus, Alla; Klenk, Hans Peter; Cottingham, Robert W; Kyrpides, Nikos C

2012-01-01

The emergence of next generation sequencing (NGS) has provided the means for rapid and high throughput sequencing and data generation at low cost, while concomitantly creating a new set of challenges. The number of available assembled microbial genomes continues to grow rapidly, and their quality reflects the quality not only of the sequencing technology used but also of the analysis software employed for assembly and annotation. In this work, we have explored the quality of the microbial draft genomes across various sequencing technologies. We have compared the draft and finished assemblies of 133 microbial genomes sequenced at the Department of Energy-Joint Genome Institute and finished at the Los Alamos National Laboratory using a variety of combinations of sequencing technologies, reflecting the transition of the institute from Sanger-based sequencing platforms to NGS platforms. The quality of the public assemblies and of the associated gene annotations was evaluated using various metrics. Results obtained with the different sequencing technologies, as well as their effects on downstream processes, were analyzed. Our results demonstrate that the Illumina HiSeq 2000 sequencing system, the primary sequencing technology currently used for de novo genome sequencing and assembly at JGI, has various advantages in terms of total sequence throughput and cost, but it also introduces challenges for the downstream analyses. In all cases, assembly results, although of high quality on average, need to be viewed critically, and sources of error in them should be considered prior to analysis. These data follow the evolution of microbial sequencing and downstream processing at the JGI from draft genome sequences with large gaps corresponding to missing genes of significant biological role to assemblies with multiple small gaps (Illumina) and finally to assemblies that generate almost complete genomes (Illumina+PacBio).
Enabling next-gen sequencing and analysis at the USDA-ARS U.S. Meat Animal Research Center with MiniLIMS

USDA-ARS's Scientific Manuscript database

There is a growing need to combine DNA sequencing technologies to address complex problems in genome biology. These genomic studies routinely generate voluminous image, sequence, and mapping files that should be associated with quality control information (gels, spectra, etc.), and other important ...
Structured oligonucleotides for target indexing to allow single-vessel PCR amplification and solid support microarray hybridization.

PubMed

Girard, Laurie D; Boissinot, Karel; Peytavi, Régis; Boissinot, Maurice; Bergeron, Michel G

2015-02-07

The combination of molecular diagnostic technologies is increasingly used to overcome limitations on sensitivity, specificity or multiplexing capabilities, and provide efficient lab-on-chip devices. Two such techniques, PCR amplification and microarray hybridization, are used serially to take advantage of the high sensitivity and specificity of the former combined with the high multiplexing capacities of the latter. These methods are usually performed in different buffers and reaction chambers. However, these elaborate methods have high complexity and cost related to reagent requirements, liquid storage and the number of reaction chambers to integrate into automated devices. Furthermore, microarray hybridizations have a sequence-dependent efficiency that is not always predictable. In this work, we have developed the concept of a structured oligonucleotide probe which is activated by cleavage from polymerase exonuclease activity. This technology is called SCISSOHR for Structured Cleavage Induced Single-Stranded Oligonucleotide Hybridization Reaction. The SCISSOHR probes enable indexing the target sequence to a tag sequence. The SCISSOHR technology also allows the combination of nucleic acid amplification and microarray hybridization in a single vessel in the presence of the PCR buffer only. The SCISSOHR technology uses an amplification probe that is irreversibly modified in the presence of the target, releasing a single-stranded DNA tag for microarray hybridization. Each tag is composed of a 3-nucleotide sequence-dependent segment and a unique "target sequence-independent" 14-nucleotide segment allowing for optimal hybridization with minimal cross-hybridization. We evaluated the performance of five (5) PCR buffers to support microarray hybridization, compared to a conventional hybridization buffer. Finally, as a proof of concept, we developed a multiplexed assay for the amplification, detection, and identification of three (3) DNA targets. This new technology will facilitate the design of lab-on-chip microfluidic devices, while also reducing consumable costs. In the long term, it will allow the cost-effective automation of highly multiplexed assays for detection and identification of genetic targets.
A computational genomics pipeline for prokaryotic sequencing projects

PubMed Central

Kislyuk, Andrey O.; Katz, Lee S.; Agrawal, Sonia; Hagen, Matthew S.; Conley, Andrew B.; Jayaraman, Pushkala; Nelakuditi, Viswateja; Humphrey, Jay C.; Sammons, Scott A.; Govil, Dhwani; Mair, Raydel D.; Tatti, Kathleen M.; Tondella, Maria L.; Harcourt, Brian H.; Mayer, Leonard W.; Jordan, I. King

2010-01-01

Motivation: New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. Results: We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. Availability and implementation: The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems. Contact: king.jordan@biology.gatech.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20519285
VARiD: a variation detection framework for color-space and letter-space platforms.

PubMed

Dalca, Adrian V; Rumble, Stephen M; Levy, Samuel; Brudno, Michael

2010-06-15

High-throughput sequencing (HTS) technologies are transforming the study of genomic variation. The various HTS technologies have different sequencing biases and error rates, and while most HTS technologies sequence the residues of the genome directly, generating base calls for each position, the Applied Biosystems SOLiD platform generates dibase-coded (color space) sequences. While combining data from the various platforms should increase the accuracy of variation detection, to date there are only a few tools that can identify variants from color space data, and none that can analyze color space and regular (letter space) data together. We present VARiD, a probabilistic method for variation detection from both letter- and color-space reads simultaneously. VARiD is based on a hidden Markov model and uses the forward-backward algorithm to accurately identify heterozygous, homozygous and tri-allelic SNPs, as well as micro-indels. Our analysis shows that VARiD performs better than the AB SOLiD toolset at detecting variants from color-space data alone, and improves the calls dramatically when letter- and color-space reads are combined. The toolset is freely available at http://compbio.cs.utoronto.ca/varid.
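To make the dibase (color space) coding concrete, the sketch below converts between letter space and color space using the standard two-bit XOR relationship between adjacent bases. It illustrates only the encoding, not VARiD's hidden Markov model; the function names and example sequences are hypothetical.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"

def to_color_space(seq):
    """Encode each adjacent pair of bases as one color (0-3)."""
    return [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(seq, seq[1:])]

def from_color_space(first_base, colors):
    """Decode a color sequence given the known first base."""
    bases = [first_base]
    for color in colors:
        bases.append(BITS_TO_BASE[BASE_TO_BITS[bases[-1]] ^ color])
    return "".join(bases)

reference = "ATGGCA"
variant = "ATGACA"   # single-base substitution at position 3 (0-based)

ref_colors = to_color_space(reference)
var_colors = to_color_space(variant)
changed = [i for i, (r, v) in enumerate(zip(ref_colors, var_colors)) if r != v]
```

A true single-base substitution changes two adjacent colors, whereas a single color miscall corrupts every decoded base downstream of it. This is why color-space and letter-space reads need different error models when they are analyzed together.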
Epigenetic mechanisms of nutrient-induced modulation of gene expression and cellular functions

USDA-ARS's Scientific Manuscript database

Utilizing next-generation sequencing technology in combination with chromatin immunoprecipitation (ChIP) technology, our study provides systematic and novel insights into the relationships between nutrition and epigenetics. One paradigmatic example of nutrient-epigenetic-phenotype relationship is th...
AMPLISAS: a web server for multilocus genotyping using next-generation amplicon sequencing data.

PubMed

Sebastian, Alvaro; Herdegen, Magdalena; Migalska, Magdalena; Radwan, Jacek

2016-03-01

Next-generation sequencing (NGS) technologies are revolutionizing the fields of biology and medicine as powerful tools for amplicon sequencing (AS). Using combinations of primers and barcodes, it is possible to sequence targeted genomic regions with deep coverage for hundreds, even thousands, of individuals in a single experiment. This is extremely valuable for the genotyping of gene families in which locus-specific primers are often difficult to design, such as the major histocompatibility complex (MHC). The utility of AS is, however, limited by the high intrinsic sequencing error rates of NGS technologies and other sources of error such as polymerase amplification or chimera formation. Correcting these errors requires extensive bioinformatic post-processing of NGS data. Amplicon Sequence Assignment (AMPLISAS) is a tool that performs analysis of AS results in a simple and efficient way, while offering customization options for advanced users. AMPLISAS is designed as a three-step pipeline consisting of (i) read demultiplexing, (ii) unique sequence clustering and (iii) erroneous sequence filtering. Allele sequences and frequencies are retrieved in Excel spreadsheet format, making them easy to interpret. AMPLISAS performance has been successfully benchmarked against previously published genotyped MHC data sets obtained with various NGS technologies.
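The three-step structure described above can be sketched in a few lines. This is a deliberately simplified illustration, not AMPLISAS itself: clustering here collapses exact duplicates only, filtering uses a single frequency cutoff, and the barcodes, reads, function names and the `min_fraction` threshold are all hypothetical.

```python
from collections import Counter, defaultdict

def demultiplex(reads, barcodes):
    """Step (i): assign each read to a sample by its leading barcode
    and trim the barcode off."""
    samples = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                samples[sample].append(read[len(barcode):])
                break
    return samples

def cluster_unique(seqs):
    """Step (ii), simplified: collapse identical sequences and count them."""
    return Counter(seqs)

def filter_errors(counts, min_fraction=0.2):
    """Step (iii), simplified: keep variants at or above a per-sample
    frequency cutoff and report their frequencies."""
    total = sum(counts.values())
    return {s: n / total for s, n in counts.items() if n / total >= min_fraction}

barcodes = {"AA": "sample1", "CC": "sample2"}
reads = (["AAGATTACA"] * 6 + ["AAGATCACA"] * 3 + ["AAGATTACT"] * 1
         + ["CCGATTACA"] * 4)

genotypes = {
    sample: filter_errors(cluster_unique(seqs))
    for sample, seqs in demultiplex(reads, barcodes).items()
}
```

In this toy data, sample1 keeps two alleles at frequencies 0.6 and 0.3, while the singleton variant is discarded as a likely sequencing error.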
Advances in technologies and study design

USDA-ARS's Scientific Manuscript database

Completion of the initial draft sequence of the human genome was the proving ground for and has ushered in significant advancements in technology of increasing sophistication and ever increasing amounts of data. Often, this combination has a multiplicative effect of stimulating research groups to co...
A Hybrid Approach for the Automated Finishing of Bacterial Genomes

PubMed Central

Robins, William P.; Chin, Chen-Shan; Webster, Dale; Paxinos, Ellen; Hsu, David; Ashby, Meredith; Wang, Susana; Peluso, Paul; Sebra, Robert; Sorenson, Jon; Bullard, James; Yen, Jackie; Valdovino, Marie; Mollova, Emilia; Luong, Khai; Lin, Steven; LaMay, Brianna; Joshi, Amruta; Rowe, Lori; Frace, Michael; Tarr, Cheryl L.; Turnsek, Maryann; Davis, Brigid M; Kasarskis, Andrew; Mekalanos, John J.; Waldor, Matthew K.; Schadt, Eric E.

2013-01-01

Dramatic improvements in DNA sequencing technology have revolutionized our ability to characterize most genomic diversity. However, accurate resolution of large structural events has remained challenging due to the comparatively shorter read lengths of second-generation technologies. Emerging third-generation sequencing technologies, which yield markedly increased read length on rapid time scales and for low cost, have the potential to address assembly limitations. Here we combine sequencing data from second- and third-generation DNA sequencing technologies to assemble the two-chromosome genome of a recent Haitian cholera outbreak strain into two nearly finished contigs at > 99.9% accuracy. Complex regions with clinically significant structure were completely resolved. In separate control assemblies on experimental and simulated data for the canonical N16961 reference we obtain 14 and 8 scaffolds greater than 1kb, respectively, correcting several errors in the underlying source data. This work provides a blueprint for the next generation of rapid microbial identification and full-genome assembly. PMID:22750883

Ultrafast DNA sequencing on a microchip by a hybrid separation mechanism that gives 600 bases in 6.5 minutes.

PubMed

Fredlake, Christopher P; Hert, Daniel G; Kan, Cheuk-Wai; Chiesl, Thomas N; Root, Brian E; Forster, Ryan E; Barron, Annelise E

2008-01-15

To realize the immense potential of large-scale genomic sequencing after the completion of the second human genome (Venter's), the costs for the complete sequencing of additional genomes must be dramatically reduced. Among the technologies being developed to reduce sequencing costs, microchip electrophoresis is the only new technology ready to produce the long reads most suitable for the de novo sequencing and assembly of large and complex genomes. Compared with the current paradigm of capillary electrophoresis, microchip systems promise to reduce sequencing costs dramatically by increasing throughput, reducing reagent consumption, and integrating the many steps of the sequencing pipeline onto a single platform. Although capillary-based systems require approximately 70 min to deliver approximately 650 bases of contiguous sequence, we report sequencing up to 600 bases in just 6.5 min by microchip electrophoresis with a unique polymer matrix/adsorbed polymer wall coating combination. This represents a two-thirds reduction in sequencing time over any previously published chip sequencing result, with comparable read length and sequence quality. We hypothesize that these ultrafast long reads on chips can be achieved because the combined polymer system engenders a recently discovered "hybrid" mechanism of DNA electromigration, in which DNA molecules alternate rapidly between reptating through the intact polymer network and disrupting network entanglements to drag polymers through the solution, similar to dsDNA dynamics we observe in single-molecule DNA imaging studies. Most importantly, these results reveal the surprisingly powerful ability of microchip electrophoresis to provide ultrafast Sanger sequencing, which will translate to increased system throughput and reduced costs.
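The per-run figures quoted above can be turned into a rough per-lane rate comparison. The short calculation below uses only the approximate numbers given in the abstract; it ignores run setup, parallel lanes and sequence quality, so it is an order-of-magnitude illustration rather than a throughput claim.

```python
# Figures quoted in the abstract (both approximate).
chip_bases, chip_minutes = 600, 6.5   # microchip electrophoresis
cap_bases, cap_minutes = 650, 70      # capillary electrophoresis

chip_rate = chip_bases / chip_minutes   # bases per minute on the microchip
cap_rate = cap_bases / cap_minutes      # bases per minute in a capillary
speedup = chip_rate / cap_rate          # ratio of the two rates

print(f"microchip: {chip_rate:.1f} bases/min")
print(f"capillary: {cap_rate:.1f} bases/min")
print(f"ratio:     {speedup:.1f}x")
```

On these numbers the microchip delivers bases roughly ten times faster per separation channel than the capillary system.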
Ultrafast DNA sequencing on a microchip by a hybrid separation mechanism that gives 600 bases in 6.5 minutes

PubMed Central

Fredlake, Christopher P.; Hert, Daniel G.; Kan, Cheuk-Wai; Chiesl, Thomas N.; Root, Brian E.; Forster, Ryan E.; Barron, Annelise E.

2008-01-01

To realize the immense potential of large-scale genomic sequencing after the completion of the second human genome (Venter's), the costs for the complete sequencing of additional genomes must be dramatically reduced. Among the technologies being developed to reduce sequencing costs, microchip electrophoresis is the only new technology ready to produce the long reads most suitable for the de novo sequencing and assembly of large and complex genomes. Compared with the current paradigm of capillary electrophoresis, microchip systems promise to reduce sequencing costs dramatically by increasing throughput, reducing reagent consumption, and integrating the many steps of the sequencing pipeline onto a single platform. Although capillary-based systems require ≈70 min to deliver ≈650 bases of contiguous sequence, we report sequencing up to 600 bases in just 6.5 min by microchip electrophoresis with a unique polymer matrix/adsorbed polymer wall coating combination. This represents a two-thirds reduction in sequencing time over any previously published chip sequencing result, with comparable read length and sequence quality. We hypothesize that these ultrafast long reads on chips can be achieved because the combined polymer system engenders a recently discovered “hybrid” mechanism of DNA electromigration, in which DNA molecules alternate rapidly between reptating through the intact polymer network and disrupting network entanglements to drag polymers through the solution, similar to dsDNA dynamics we observe in single-molecule DNA imaging studies. Most importantly, these results reveal the surprisingly powerful ability of microchip electrophoresis to provide ultrafast Sanger sequencing, which will translate to increased system throughput and reduced costs. PMID:18184818
SMARTIV: combined sequence and structure de-novo motif discovery for in-vivo RNA binding data.

PubMed

Polishchuk, Maya; Paz, Inbal; Yakhini, Zohar; Mandel-Gutfreund, Yael

2018-05-25

Gene expression regulation is highly dependent on binding of RNA-binding proteins (RBPs) to their RNA targets. Growing evidence supports the notion that both RNA primary sequence and its local secondary structure play a role in specific protein-RNA recognition and binding. Despite great advances in high-throughput experimental methods for identifying sequence targets of RBPs, predicting the specific sequence and structure binding preferences of RBPs remains a major challenge. We present a novel webserver, SMARTIV, designed for discovering and visualizing combined RNA sequence and structure motifs from high-throughput RNA-binding data, generated from in-vivo experiments. The uniqueness of SMARTIV is that it predicts motifs from enriched k-mers that combine information from ranked RNA sequences and their predicted secondary structure, obtained using various folding methods. Consequently, SMARTIV generates Position Weight Matrices (PWMs) in a combined sequence and structure alphabet with assigned P-values. SMARTIV concisely represents the sequence and structure motif content as a single graphical logo, which is informative and easy to interpret visually. SMARTIV was examined extensively on a variety of high-throughput binding experiments for RBPs from different families, generated from different technologies, showing consistent and accurate results. Finally, SMARTIV is a user-friendly webserver, highly efficient in run-time and freely accessible via http://smartiv.technion.ac.il/.
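A position weight matrix over a combined sequence and structure alphabet can be sketched as follows. The encoding used here (uppercase for unpaired nucleotides, lowercase for paired ones), the unweighted counting, the pseudocount and the example k-mers are all assumptions made for illustration; the abstract does not specify how SMARTIV encodes or weights its k-mers.

```python
from collections import Counter

# Hypothetical combined alphabet: uppercase = unpaired, lowercase = paired.
ALPHABET = "ACGUacgu"

def build_pwm(kmers, alphabet=ALPHABET, pseudocount=1.0):
    """Build a position weight matrix (one probability dict per position)
    from equal-length k-mers, with additive smoothing."""
    k = len(kmers[0])
    total = len(kmers) + pseudocount * len(alphabet)
    pwm = []
    for pos in range(k):
        counts = Counter(kmer[pos] for kmer in kmers)
        pwm.append({s: (counts[s] + pseudocount) / total for s in alphabet})
    return pwm

# Hypothetical enriched k-mers: G and U unpaired, second position mostly paired.
enriched = ["GcAU", "GcAU", "GcaU", "GgAU"]
pwm = build_pwm(enriched)
consensus = "".join(max(column, key=column.get) for column in pwm)
```

Each column is a probability distribution over eight symbols rather than four, so a single matrix records both which nucleotide is preferred and whether it tends to be paired.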
Using Digital Technology to See Angles from Different Angles. Part 1: Corners

ERIC Educational Resources Information Center

Host, Erin; Baynham, Emily; McMaster, Heather

2014-01-01

In Part 1 of their article, Erin Host, Emily Baynham and Heather McMaster use a combination of digital technology and concrete materials to explore the concept of "corners". They provide a practical, easy to follow sequence of activities that builds on students' understandings. [For "Using Digital Technology to See Angles from…
Preparation of next-generation sequencing libraries using Nextera™ technology: simultaneous DNA fragmentation and adaptor tagging by in vitro transposition.

PubMed

Caruccio, Nicholas

2011-01-01

DNA library preparation is a common entry point and bottleneck for next-generation sequencing. Current methods generally consist of distinct steps that often involve significant sample loss and hands-on time: DNA fragmentation, end-polishing, and adaptor-ligation. In vitro transposition with Nextera™ Transposomes simultaneously fragments and covalently tags the target DNA, thereby combining these three distinct steps into a single reaction. Platform-specific sequencing adaptors can be added, and the sample can be enriched and bar-coded using limited-cycle PCR to prepare di-tagged DNA fragment libraries. Nextera technology offers a streamlined, efficient, and high-throughput method for generating bar-coded libraries compatible with multiple next-generation sequencing platforms.
Comparison of next generation sequencing technologies for transcriptome characterization

PubMed Central

2009-01-01

Background: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for the most complete and cost-effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. Results: The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online webtool that allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. Conclusion: NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, NG sequencing is a dramatic advance over capillary-based sequencing, but NG sequencing also presents significant challenges in assembly and sequence accuracy due to short read lengths, method-specific sequencing errors, and the absence of physical clones. These problems may be overcome by hybrid sequencing strategies using a mixture of sequencing methodologies, by new assemblers, and by sequencing more deeply. Sequencing and microarray outcomes from multiple experiments suggest that our simulator will be useful for guiding NG transcriptome sequencing projects in a wide range of organisms. PMID:19646272
Structured oligonucleotides for target indexing to allow single-vessel PCR amplification and solid support microarray hybridization

PubMed Central

Girard, Laurie D.; Boissinot, Karel; Peytavi, Régis; Boissinot, Maurice; Bergeron, Michel G.

2014-01-01

The combination of molecular diagnostic technologies is increasingly used to overcome limitations on sensitivity, specificity or multiplexing capabilities, and provide efficient lab-on-chip devices. Two such techniques, PCR amplification and microarray hybridization, are used serially to take advantage of the high sensitivity and specificity of the former combined with the high multiplexing capacity of the latter. These methods are usually performed in different buffers and reaction chambers. However, these elaborate methods have a high complexity cost related to reagent requirements, liquid storage and the number of reaction chambers to integrate into automated devices. Furthermore, the efficiency of microarray hybridization is sequence-dependent and not always predictable. In this work, we have developed the concept of a structured oligonucleotide probe which is activated by cleavage from polymerase exonuclease activity. This technology is called SCISSOHR for Structured Cleavage Induced Single-Stranded Oligonucleotide Hybridization Reaction. The SCISSOHR probes enable indexing the target sequence to a tag sequence. The SCISSOHR technology also allows the combination of nucleic acid amplification and microarray hybridization in a single vessel in the presence of the PCR buffer only. The SCISSOHR technology uses an amplification probe that is irreversibly modified in the presence of the target, releasing a single-stranded DNA tag for microarray hybridization. Each tag is composed of a 3-nucleotide sequence-dependent segment and a unique “target sequence-independent” 14-nucleotide segment allowing for optimal hybridization with minimal cross-hybridization. We evaluated the performance of five (5) PCR buffers to support microarray hybridization, compared to a conventional hybridization buffer. Finally, as a proof of concept, we developed a multiplexed assay for the amplification, detection, and identification of three (3) DNA targets. This new technology will facilitate the design of lab-on-chip microfluidic devices, while also reducing consumable costs. In the long term, it will allow the cost-effective automation of highly multiplexed assays for detection and identification of genetic targets. PMID:25489607
The tomato genome sequence provides insight into fleshy fruit evolution

USDA-ARS's Scientific Manuscript database

The genome of the inbred tomato cultivar ‘Heinz 1706’ was sequenced and assembled using a combination of Sanger and “next generation” technologies. The predicted genome size is ~900 Mb, consistent with prior estimates, of which 760 Mb were assembled in 91 scaffolds aligned to the 12 tomato chromosom...
de novo assembly and population genomic survey of natural yeast isolates with the Oxford Nanopore MinION sequencer.

PubMed

Istace, Benjamin; Friedrich, Anne; d'Agata, Léo; Faye, Sébastien; Payen, Emilie; Beluche, Odette; Caradec, Claudia; Davidas, Sabrina; Cruaud, Corinne; Liti, Gianni; Lemainque, Arnaud; Engelen, Stefan; Wincker, Patrick; Schacherer, Joseph; Aury, Jean-Marc

2017-02-01

Oxford Nanopore Technologies Ltd (Oxford, UK) have recently commercialized MinION, a small single-molecule nanopore sequencer, that offers the possibility of sequencing long DNA fragments from small genomes in a matter of seconds. The Oxford Nanopore technology is truly disruptive; it has the potential to revolutionize genomic applications due to its portability, low cost, and ease of use compared with existing long reads sequencing technologies. The MinION sequencer enables the rapid sequencing of small eukaryotic genomes, such as the yeast genome. Combined with existing assembler algorithms, near complete genome assemblies can be generated and comprehensive population genomic analyses can be performed. Here, we resequenced the genome of the Saccharomyces cerevisiae S288C strain to evaluate the performance of nanopore-only assemblers. Then we de novo sequenced and assembled the genomes of 21 isolates representative of the S. cerevisiae genetic diversity using the MinION platform. The contiguity of our assemblies was 14 times higher than the Illumina-only assemblies and we obtained one or two long contigs for 65 % of the chromosomes. This high contiguity allowed us to accurately detect large structural variations across the 21 studied genomes. Because of the high completeness of the nanopore assemblies, we were able to produce a complete cartography of transposable elements insertions and inspect structural variants that are generally missed using a short-read sequencing strategy. Our analyses show that the Oxford Nanopore technology is already usable for de novo sequencing and assembly; however, non-random errors in homopolymers require polishing the consensus using an alternate sequencing technology. © The Author 2017. Published by Oxford University Press.
de novo assembly and population genomic survey of natural yeast isolates with the Oxford Nanopore MinION sequencer

PubMed Central

Istace, Benjamin; Friedrich, Anne; d'Agata, Léo; Faye, Sébastien; Payen, Emilie; Beluche, Odette; Caradec, Claudia; Davidas, Sabrina; Cruaud, Corinne; Liti, Gianni; Lemainque, Arnaud; Engelen, Stefan; Wincker, Patrick; Schacherer, Joseph

2017-01-01

Abstract Background: Oxford Nanopore Technologies Ltd (Oxford, UK) have recently commercialized MinION, a small single-molecule nanopore sequencer, that offers the possibility of sequencing long DNA fragments from small genomes in a matter of seconds. The Oxford Nanopore technology is truly disruptive; it has the potential to revolutionize genomic applications due to its portability, low cost, and ease of use compared with existing long reads sequencing technologies. The MinION sequencer enables the rapid sequencing of small eukaryotic genomes, such as the yeast genome. Combined with existing assembler algorithms, near complete genome assemblies can be generated and comprehensive population genomic analyses can be performed. Results: Here, we resequenced the genome of the Saccharomyces cerevisiae S288C strain to evaluate the performance of nanopore-only assemblers. Then we de novo sequenced and assembled the genomes of 21 isolates representative of the S. cerevisiae genetic diversity using the MinION platform. The contiguity of our assemblies was 14 times higher than the Illumina-only assemblies and we obtained one or two long contigs for 65 % of the chromosomes. This high contiguity allowed us to accurately detect large structural variations across the 21 studied genomes. Conclusion: Because of the high completeness of the nanopore assemblies, we were able to produce a complete cartography of transposable elements insertions and inspect structural variants that are generally missed using a short-read sequencing strategy. Our analyses show that the Oxford Nanopore technology is already usable for de novo sequencing and assembly; however, non-random errors in homopolymers require polishing the consensus using an alternate sequencing technology. PMID:28369459
Fluorescent signatures for variable DNA sequences

PubMed Central

Rice, John E.; Reis, Arthur H.; Rice, Lisa M.; Carver-Brown, Rachel K.; Wangh, Lawrence J.

2012-01-01

Life abounds with genetic variations writ in sequences that are often only a few hundred nucleotides long. Rapid detection of these variations for identification of genetic diseases, pathogens and organisms has become the mainstay of molecular science and medicine. This report describes a new, highly informative closed-tube polymerase chain reaction (PCR) strategy for analysis of both known and unknown sequence variations. It combines efficient quantitative amplification of single-stranded DNA targets through LATE-PCR with sets of Lights-On/Lights-Off probes that hybridize to their target sequences over a broad temperature range. Contiguous pairs of Lights-On/Lights-Off probes of the same fluorescent color are used to scan hundreds of nucleotides for the presence of mutations. Sets of probes in different colors can be combined in the same tube to analyze even longer single-stranded targets. Each set of hybridized Lights-On/Lights-Off probes generates a composite fluorescent contour, which is mathematically converted to a sequence-specific fluorescent signature. The versatility and broad utility of this new technology are illustrated in this report by characterization of variant sequences in three different DNA targets: the rpoB gene of Mycobacterium tuberculosis, a sequence in the mitochondrial cytochrome C oxidase subunit 1 gene of nematodes and the V3 hypervariable region of the bacterial 16S ribosomal RNA gene. We anticipate widespread use of these technologies for diagnostics, species identification and basic research. PMID:22879378
Genomic approaches for the elucidation of genes and gene networks underlying cardiovascular traits.

PubMed

Adriaens, M E; Bezzina, C R

2018-06-22

Genome-wide association studies have shed light on the association between natural genetic variation and cardiovascular traits. However, linking a locus associated with a cardiovascular trait to a candidate gene or set of candidate genes, so that these can be prioritized for follow-up mechanistic studies, is anything but straightforward. Genomic technologies based on next-generation sequencing now offer multiple opportunities to dissect the gene regulatory networks underlying genetic cardiovascular trait associations, thereby aiding in the identification of candidate genes at unprecedented scale. RNA sequencing in particular becomes a powerful tool when combined with genotyping to identify loci that modulate transcript abundance, known as expression quantitative trait loci (eQTL), or loci modulating transcript splicing, known as splicing quantitative trait loci (sQTL). Additionally, the allele-specific resolution of RNA-sequencing technology enables estimation of allelic imbalance, a state where the two alleles of a gene are expressed at a ratio differing from the expected 1:1 ratio. When multiple high-throughput approaches are combined with deep phenotyping in a single study, a comprehensive elucidation of the relationship between genotype and phenotype comes into view, an approach known as systems genetics. In this review, we cover key applications of systems genetics in the broad cardiovascular field.
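Allelic imbalance, as described in the abstract above, amounts to asking whether allele-specific read counts at a heterozygous site depart from the expected 1:1 ratio. One common way to test this is an exact binomial test. The sketch below is a generic illustration under that assumption, not a method taken from the review, and the read counts are invented. It is intended for modest read depths, since the exact sum is computed directly.

```python
from math import comb


def allelic_ratio(ref_reads: int, alt_reads: int) -> float:
    """Fraction of allele-specific reads carrying the reference allele."""
    return ref_reads / (ref_reads + alt_reads)


def binomial_two_sided_p(ref_reads: int, alt_reads: int,
                         expected_ref_fraction: float = 0.5) -> float:
    """Exact two-sided binomial P-value against the expected allele fraction.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one (suitable for modest depths only).
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no allele-specific reads at this site")
    p = expected_ref_fraction
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


# Invented counts: one balanced site and one strongly skewed site.
balanced_p = binomial_two_sided_p(50, 50)
skewed_p = binomial_two_sided_p(80, 20)
print(allelic_ratio(80, 20), balanced_p, skewed_p)
```

In practice such a test would be applied per heterozygous site and corrected for multiple testing; mapping bias toward the reference allele is a known complication that this sketch ignores.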
International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

PubMed Central

Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.

2015-01-01

This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved positions, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030
Studies of a biochemical factory: tomato trichome deep expressed sequence tag sequencing and proteomics.

PubMed

Schilmiller, Anthony L; Miner, Dennis P; Larson, Matthew; McDowell, Eric; Gang, David R; Wilkerson, Curtis; Last, Robert L

2010-07-01

Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces beta-caryophyllene and alpha-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells.
Studies of a Biochemical Factory: Tomato Trichome Deep Expressed Sequence Tag Sequencing and Proteomics

PubMed Central

Schilmiller, Anthony L.; Miner, Dennis P.; Larson, Matthew; McDowell, Eric; Gang, David R.; Wilkerson, Curtis; Last, Robert L.

2010-01-01

Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces β-caryophyllene and α-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells. PMID:20431087
Accurate and exact CNV identification from targeted high-throughput sequence data.

PubMed

Nord, Alex S; Lee, Ming; King, Mary-Claire; Walsh, Tom

2011-04-12

Massively parallel sequencing of barcoded DNA samples significantly increases screening efficiency for clinically important genes. Short read aligners are well suited to single nucleotide and indel detection. However, methods for CNV detection from targeted enrichment are lacking. We present a method combining coverage with map information for the identification of deletions and duplications in targeted sequence data. Sequencing data is first scanned for gains and losses using a comparison of normalized coverage data between samples. CNV calls are confirmed by testing for a signature of sequences that span the CNV breakpoint. With our method, CNVs can be identified regardless of whether breakpoints are within regions targeted for sequencing. For CNVs where at least one breakpoint is within targeted sequence, exact CNV breakpoints can be identified. In a test data set of 96 subjects sequenced across ~1 Mb genomic sequence using multiplexing technology, our method detected mutations as small as 31 bp, predicted quantitative copy count, and had a low false-positive rate. Application of this method allows for identification of gains and losses in targeted sequence data, providing comprehensive mutation screening when combined with a short read aligner.
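The first step described in the abstract above, scanning for gains and losses by comparing normalized coverage between samples, can be sketched as follows. This is a simplified illustration, not the authors' implementation: the normalization (each target's share of the sample's total depth), the median baseline, and the 0.75 and 1.25 ratio thresholds are all assumptions, and the confirmation step using reads that span the breakpoint is not shown.

```python
from statistics import median


def normalize(depths):
    """Express each target's depth as a share of the sample's total depth."""
    total = sum(depths)
    return [d / total for d in depths]


def scan_for_cnvs(sample, references, loss=0.75, gain=1.25):
    """Flag targets whose normalized coverage departs from the reference median.

    `loss` and `gain` are illustrative ratio thresholds (assumptions).
    Returns a list of (target_index, call, ratio) tuples.
    """
    norm_sample = normalize(sample)
    norm_refs = [normalize(ref) for ref in references]
    calls = []
    for i, value in enumerate(norm_sample):
        baseline = median(ref[i] for ref in norm_refs)
        if baseline == 0:
            continue  # no usable reference coverage for this target
        ratio = value / baseline
        if ratio <= loss:
            calls.append((i, "loss", round(ratio, 2)))
        elif ratio >= gain:
            calls.append((i, "gain", round(ratio, 2)))
    return calls


# Invented depths for five targets; the test sample has half depth at target 2.
reference_samples = [[100] * 5, [120] * 5, [80] * 5]
test_sample = [90, 90, 45, 90, 90]
cnv_calls = scan_for_cnvs(test_sample, reference_samples)
print(cnv_calls)  # [(2, 'loss', 0.56)]
```

A ratio near 0.5 is what a heterozygous deletion would be expected to produce; with only a few targets, normalizing by total depth slightly inflates the ratios of the unaffected targets, which is why real pipelines normalize over many regions.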
Dissecting enzyme function with microfluidic-based deep mutational scanning.

PubMed

Romero, Philip A; Tran, Tuan M; Abate, Adam R

2015-06-09

Natural enzymes are incredibly proficient catalysts, but engineering them to have new or improved functions is challenging due to the complexity of how an enzyme's sequence relates to its biochemical properties. Here, we present an ultrahigh-throughput method for mapping enzyme sequence-function relationships that combines droplet microfluidic screening with next-generation DNA sequencing. We apply our method to map the activity of millions of glycosidase sequence variants. Microfluidic-based deep mutational scanning provides a comprehensive and unbiased view of the enzyme function landscape. The mapping displays expected patterns of mutational tolerance and a strong correspondence to sequence variation within the enzyme family, but also reveals previously unreported sites that are crucial for glycosidase function. We modified the screening protocol to include a high-temperature incubation step, and the resulting thermotolerance landscape allowed the discovery of mutations that enhance enzyme thermostability. Droplet microfluidics provides a general platform for enzyme screening that, when combined with DNA-sequencing technologies, enables high-throughput mapping of enzyme sequence space.
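Combining a screen with sequencing, as in the abstract above, typically yields read counts per variant before and after selection. A common way to turn such counts into a functional score is a log2 enrichment ratio. The sketch below shows that generic calculation; the abstract does not specify the scoring used, so the formula, the pseudocount, and the variant names and counts are assumptions for illustration only.

```python
from math import log2


def enrichment_scores(pre_counts, post_counts, pseudocount=1.0):
    """log2 of (frequency after sorting / frequency before sorting) per variant."""
    variants = set(pre_counts) | set(post_counts)
    pre_total = sum(pre_counts.values()) + pseudocount * len(variants)
    post_total = sum(post_counts.values()) + pseudocount * len(variants)
    scores = {}
    for variant in variants:
        # Pseudocounts avoid division by zero for variants absent from a pool.
        pre_freq = (pre_counts.get(variant, 0) + pseudocount) / pre_total
        post_freq = (post_counts.get(variant, 0) + pseudocount) / post_total
        scores[variant] = log2(post_freq / pre_freq)
    return scores


# Hypothetical variants and read counts (not from the study).
unsorted_library = {"WT": 1000, "A45G": 1000, "D120N": 1000}
sorted_active = {"WT": 1500, "A45G": 1400, "D120N": 100}
scores = enrichment_scores(unsorted_library, sorted_active)
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name}: {scores[name]:+.2f}")
```

Positive scores indicate variants enriched among active droplets and negative scores indicate depleted variants, so a strongly negative score at a position suggests that the residue matters for function.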
Assembly and diploid architecture of an individual human genome via single-molecule technologies

PubMed Central

Pendleton, Matthew; Sebra, Robert; Pang, Andy Wing Chun; Ummat, Ajay; Franzen, Oscar; Rausch, Tobias; Stütz, Adrian M; Stedman, William; Anantharaman, Thomas; Hastie, Alex; Dai, Heng; Fritz, Markus Hsi-Yang; Cao, Han; Cohain, Ariella; Deikus, Gintaras; Durrett, Russell E; Blanchard, Scott C; Altman, Roger; Chin, Chen-Shan; Guo, Yan; Paxinos, Ellen E; Korbel, Jan O; Darnell, Robert B; McCombie, W Richard; Kwok, Pui-Yan; Mason, Christopher E; Schadt, Eric E; Bashir, Ali

2015-01-01

We present the first comprehensive analysis of a diploid human genome that combines single-molecule sequencing with single-molecule genome maps. Our hybrid assembly markedly improves upon the contiguity observed from traditional shotgun sequencing approaches, with scaffold N50 values approaching 30 Mb, and we identified complex structural variants (SVs) missed by other high-throughput approaches. Furthermore, by combining Illumina short-read data with long reads, we phased both single-nucleotide variants and SVs, generating haplotypes with over 99% consistency with previous trio-based studies. Our work shows that it is now possible to integrate single-molecule and high-throughput sequence data to generate de novo assembled genomes that approach reference quality. PMID:26121404
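The scaffold N50 quoted in the abstract above is a standard contiguity statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal sketch of the calculation, using made-up scaffold lengths, is:

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half is the N50.
        if running >= half_total:
            return length


# Made-up scaffold lengths (arbitrary units): total 290, half 145.
example_lengths = [80, 70, 50, 40, 30, 20]
print(n50(example_lengths))  # 70
```

Because N50 depends only on the length distribution, it says nothing about correctness; a misjoined assembly can have a high N50, which is why it is usually reported alongside validation such as the haplotype consistency mentioned in the abstract.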
Assembly and diploid architecture of an individual human genome via single-molecule technologies.

PubMed

Pendleton, Matthew; Sebra, Robert; Pang, Andy Wing Chun; Ummat, Ajay; Franzen, Oscar; Rausch, Tobias; Stütz, Adrian M; Stedman, William; Anantharaman, Thomas; Hastie, Alex; Dai, Heng; Fritz, Markus Hsi-Yang; Cao, Han; Cohain, Ariella; Deikus, Gintaras; Durrett, Russell E; Blanchard, Scott C; Altman, Roger; Chin, Chen-Shan; Guo, Yan; Paxinos, Ellen E; Korbel, Jan O; Darnell, Robert B; McCombie, W Richard; Kwok, Pui-Yan; Mason, Christopher E; Schadt, Eric E; Bashir, Ali

2015-08-01

We present the first comprehensive analysis of a diploid human genome that combines single-molecule sequencing with single-molecule genome maps. Our hybrid assembly markedly improves upon the contiguity observed from traditional shotgun sequencing approaches, with scaffold N50 values approaching 30 Mb, and we identified complex structural variants (SVs) missed by other high-throughput approaches. Furthermore, by combining Illumina short-read data with long reads, we phased both single-nucleotide variants and SVs, generating haplotypes with over 99% consistency with previous trio-based studies. Our work shows that it is now possible to integrate single-molecule and high-throughput sequence data to generate de novo assembled genomes that approach reference quality.
Finding Sequences for over 270 Orphan Enzymes

PubMed Central

Shearer, Alexander G.; Altman, Tomer; Rhee, Christine D.

2014-01-01

Despite advances in sequencing technology, there are still significant numbers of well-characterized enzymatic activities for which there are no known associated sequences. These ‘orphan enzymes’ represent glaring holes in our biological understanding, and it is a top priority to reunite them with their coding sequences. Here we report a methodology for resolving orphan enzymes through a combination of database search and literature review. Using this method we were able to reconnect over 270 orphan enzymes with their corresponding sequence. This success points toward how we can systematically eliminate the remaining orphan enzymes and prevent the introduction of future orphan enzymes. PMID:24826896

OPTIMA: sensitive and accurate whole-genome alignment of error-prone genomic maps by combinatorial indexing and technology-agnostic statistical analysis.

PubMed

Verzotto, Davide; M Teo, Audrey S; Hillmer, Axel M; Nagarajan, Niranjan

2016-01-01

Resolution of complex repeat structures and rearrangements in the assembly and analysis of large eukaryotic genomes is often aided by a combination of high-throughput sequencing and genome-mapping technologies (for example, optical restriction mapping). In particular, mapping technologies can generate sparse maps of large DNA fragments (150 kilo base pairs (kbp) to 2 Mbp) and thus provide a unique source of information for disambiguating complex rearrangements in cancer genomes. Despite their utility, combining high-throughput sequencing and mapping technologies has been challenging because of the lack of efficient and sensitive map-alignment algorithms for robustly aligning error-prone maps to sequences. We introduce a novel seed-and-extend glocal (short for global-local) alignment method, OPTIMA (and a sliding-window extension for overlap alignment, OPTIMA-Overlap), which is the first to create indexes for continuous-valued mapping data while accounting for mapping errors. We also present a novel statistical model, agnostic with respect to technology-dependent error rates, for conservatively evaluating the significance of alignments without relying on expensive permutation-based tests. We show that OPTIMA and OPTIMA-Overlap outperform other state-of-the-art approaches (1.6-2 times more sensitive) and are more efficient (170-200 %) and precise in their alignments (nearly 99 % precision). These advantages are independent of the quality of the data, suggesting that our indexing approach and statistical evaluation are robust, provide improved sensitivity and guarantee high precision.
Draft Genome Sequence, and a Sequence-Defined Genetic Linkage Map of the Legume Crop Species Lupinus angustifolius L

PubMed Central

Zheng, Zequn; Zhang, Qisen; Zhou, Gaofeng; Sweetingham, Mark W.; Howieson, John G.; Li, Chengdao

2013-01-01

Lupin (Lupinus angustifolius L.) is the most recently domesticated crop in major agricultural cultivation. Its seeds are high in protein and dietary fibre, but low in oil and starch. Medical and dietetic studies have shown that consuming lupin-enriched food has significant health benefits. We report the draft assembly from a whole genome shotgun sequencing dataset for this legume species with 26.9x coverage of the genome, which is predicted to contain 57,807 genes. Analysis of the annotated genes with metabolic pathways provided a partial understanding of some key features of lupin, such as the amino acid profile of storage proteins in seeds. Furthermore, we applied the NGS-based RAD-sequencing technology to obtain 8,244 sequence-defined markers for anchoring the genomic sequences. A total of 4,214 scaffolds from the genome sequence assembly were aligned into the genetic map. The combination of the draft assembly and a sequence-defined genetic map made it possible to locate and study functional genes of agronomic interest. The identification of co-segregating SNP markers, scaffold sequences and gene annotation facilitated the identification of a candidate R gene associated with resistance to the major lupin disease anthracnose. We demonstrated that the combination of medium-depth genome sequencing and a high-density genetic linkage map by application of NGS technology is a cost-effective approach to generating genome sequence data and a large number of molecular markers to study the genomics, genetics and functional genes of lupin, and to apply them to molecular plant breeding. This strategy does not require prior genome knowledge, which potentiates its application to a wide range of non-model species. PMID:23734219
Draft genome sequence, and a sequence-defined genetic linkage map of the legume crop species Lupinus angustifolius L.

PubMed

Yang, Huaan; Tao, Ye; Zheng, Zequn; Zhang, Qisen; Zhou, Gaofeng; Sweetingham, Mark W; Howieson, John G; Li, Chengdao

2013-01-01

Lupin (Lupinus angustifolius L.) is the most recently domesticated crop in major agricultural cultivation. Its seeds are high in protein and dietary fibre, but low in oil and starch. Medical and dietetic studies have shown that consuming lupin-enriched food has significant health benefits. We report the draft assembly from a whole genome shotgun sequencing dataset for this legume species with 26.9x coverage of the genome, which is predicted to contain 57,807 genes. Analysis of the annotated genes with metabolic pathways provided a partial understanding of some key features of lupin, such as the amino acid profile of storage proteins in seeds. Furthermore, we applied the NGS-based RAD-sequencing technology to obtain 8,244 sequence-defined markers for anchoring the genomic sequences. A total of 4,214 scaffolds from the genome sequence assembly were aligned into the genetic map. The combination of the draft assembly and a sequence-defined genetic map made it possible to locate and study functional genes of agronomic interest. The identification of co-segregating SNP markers, scaffold sequences and gene annotation facilitated the identification of a candidate R gene associated with resistance to the major lupin disease anthracnose. We demonstrated that the combination of medium-depth genome sequencing and a high-density genetic linkage map by application of NGS technology is a cost-effective approach to generating genome sequence data and a large number of molecular markers to study the genomics, genetics and functional genes of lupin, and to apply them to molecular plant breeding. This strategy does not require prior genome knowledge, which potentiates its application to a wide range of non-model species.
Genome assembly from synthetic long read clouds

PubMed Central

Kuleshov, Volodymyr; Snyder, Michael P.; Batzoglou, Serafim

2016-01-01

Motivation: Despite rapid progress in sequencing technology, assembling de novo the genomes of new species as well as reconstructing complex metagenomes remain major technological challenges. New synthetic long read (SLR) technologies promise significant advances towards these goals; however, their applicability is limited by high sequencing requirements and the inability of current assembly paradigms to cope with combinations of short and long reads. Results: Here, we introduce Architect, a new de novo scaffolder aimed at SLR technologies. Unlike previous assembly strategies, Architect does not require a costly subassembly step; instead it assembles genomes directly from the SLR’s underlying short reads, which we refer to as read clouds. This enables a 4- to 20-fold reduction in sequencing requirements and a 5-fold increase in assembly contiguity on both genomic and metagenomic datasets relative to state-of-the-art assembly strategies aimed directly at fully subassembled long reads. Availability and Implementation: Our source code is freely available at https://github.com/kuleshov/architect. Contact: kuleshov@stanford.edu PMID:27307620
Microbial sequencing methods for monitoring of anaerobic treatment of antibiotics to optimize performance and prevent system failure.

PubMed

Aydin, Sevcan

2016-06-01

As a result of developments in molecular and sequencing technologies, analysis of the anaerobic microbial community in biological treatment processes has become increasingly prevalent. This review examines how microbial sequencing methods can be applied to achieve an extensive understanding of the phylogenetic and functional characteristics of microbial assemblages in anaerobic reactors when the substrate is contaminated by antibiotics, which are among the most important toxic compounds. It discusses some of the advantages and disadvantages of the more commonly employed microbial sequencing techniques and assesses how a combination of existing methods may be applied to develop a more comprehensive understanding of microbial communities and to improve the validity and depth of the results, with the aim of enhancing the stability of anaerobic reactors.
Rapid detection of IHNV by molecular padlock recognition and surface-associated isothermal amplification

NASA Astrophysics Data System (ADS)

McCarthy, Erik L.; Egeler, Teressa J.; Bickerstaff, Lee E.; Pereira da Cunha, Mauricio; Millard, Paul J.

2005-11-01

RNA sequences derived from infectious hematopoietic necrosis virus (IHNV) could be detected using a combination of surface-associated molecular padlock DNA probes (MPP) and rolling circle amplification (RCA) in microcapillary tubes. DNA oligonucleotides with base sequences identical to RNA obtained from IHNV were recognized by MPP. Circularized MPP were then captured on the inner surface of glass microcapillary tubes by immobilized DNA oligonucleotide primers. Extension of the immobilized primers by isothermal RCA gave rise to DNA concatamers, which were in turn bound by the fluorescent reporter SYBR Green II nucleic acid stain, and measured by microfluorimetry. Surface-associated molecular padlock technology, combined with isothermal RCA, exhibited high selectivity and sensitivity without thermal cycling. This technology is applicable to direct RNA and DNA detection, permitting detection of a variety of viral or bacterial pathogens.
Diagnostics based on nucleic acid sequence variant profiling: PCR, hybridization, and NGS approaches.

PubMed

Khodakov, Dmitriy; Wang, Chunyan; Zhang, David Yu

2016-10-01

Nucleic acid sequence variations have been implicated in many diseases, and reliable detection and quantitation of DNA/RNA biomarkers can inform effective therapeutic action, enabling precision medicine. Nucleic acid analysis technologies being translated into the clinic can broadly be classified into hybridization, PCR, and sequencing, as well as their combinations. Here we review the molecular mechanisms of popular commercial assays, and their progress in translation into in vitro diagnostics. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Chronodes: Interactive Multifocus Exploration of Event Sequences

PubMed Central

POLACK, PETER J.; CHEN, SHANG-TSE; KAHNG, MINSUK; DE BARBARO, KAYA; BASOLE, RAHUL; SHARMIN, MOUSHUMI; CHAU, DUEN HORNG

2018-01-01

The advent of mobile health (mHealth) technologies challenges the capabilities of current visualizations, interactive tools, and algorithms. We present Chronodes, an interactive system that unifies data mining and human-centric visualization techniques to support explorative analysis of longitudinal mHealth data. Chronodes extracts and visualizes frequent event sequences that reveal chronological patterns across multiple participant timelines of mHealth data. It then combines novel interaction and visualization techniques to enable multifocus event sequence analysis, which allows health researchers to interactively define, explore, and compare groups of participant behaviors using event sequence combinations. Through summarizing insights gained from a pilot study with 20 behavioral and biomedical health experts, we discuss Chronodes’s efficacy and potential impact in the mHealth domain. Ultimately, we outline important open challenges in mHealth, and offer recommendations and design guidelines for future research. PMID:29515937
Short-read, high-throughput sequencing technology for STR genotyping

PubMed Central

Bornman, Daniel M.; Hester, Mark E.; Schuetter, Jared M.; Kasoji, Manjula D.; Minard-Smith, Angela; Barden, Curt A.; Nelson, Scott C.; Godbold, Gene D.; Baker, Christine H.; Yang, Boyu; Walther, Jacquelyn E.; Tornes, Ivan E.; Yan, Pearlly S.; Rodriguez, Benjamin; Bundschuh, Ralf; Dickens, Michael L.; Young, Brian A.; Faith, Seth A.

2013-01-01

DNA-based methods for human identification principally rely upon genotyping of short tandem repeat (STR) loci. Electrophoretic-based techniques for variable-length classification of STRs are universally utilized, but are limited in that they have relatively low throughput and do not yield nucleotide sequence information. High-throughput sequencing technology may provide a more powerful instrument for human identification, but is not currently validated for forensic casework. Here, we present a systematic method to perform high-throughput genotyping analysis of the Combined DNA Index System (CODIS) STR loci using short-read (150 bp) massively parallel sequencing technology. Open source reference alignment tools were optimized to evaluate PCR-amplified STR loci using a custom designed STR genome reference. Evaluation of this approach demonstrated that the 13 CODIS STR loci and amelogenin (AMEL) locus could be accurately called from individual and mixture samples. Sensitivity analysis showed that as few as 18,500 reads, aligned to an in silico referenced genome, were required to genotype an individual (>99% confidence) for the CODIS loci. The power of this technology was further demonstrated by identification of variant alleles containing single nucleotide polymorphisms (SNPs) and the development of quantitative measurements (reads) for resolving mixed samples. PMID:25621315
Effectiveness of the standard and an alternative set of Streptococcus pneumoniae multi locus sequence typing primers.

PubMed

Adamiak, Paul; Vanderkooi, Otto G; Kellner, James D; Schryvers, Anthony B; Bettinger, Julie A; Alcantara, Joenel

2014-06-03

Multi-locus sequence typing (MLST) is a portable, broadly applicable method for classifying bacterial isolates at an intra-species level. This methodology provides clinical and scientific investigators with a standardized means of monitoring evolution within bacterial populations. MLST uses the DNA sequences from a set of genes such that each unique combination of sequences defines an isolate's sequence type. In order to reliably determine the sequence of a typing gene, matching sequence reads for both strands of the gene must be obtained. This study assesses the ability of both the standard, and an alternative set of, Streptococcus pneumoniae MLST primers to completely sequence, in both directions, the required typing alleles. The results demonstrated that for five (aroE, recP, spi, xpt, ddl) of the seven S. pneumoniae typing alleles, the standard primers were unable to obtain the complete forward and reverse sequences. This is due to the standard primers annealing too closely to the target regions, and current sequencing technology failing to sequence the bases that are too close to the primer. The alternative primer set described here, which includes a combination of primers proposed by the CDC and several designed as part of this study, addresses this limitation by annealing to highly conserved segments further from the target region. This primer set was subsequently employed to sequence type 105 S. pneumoniae isolates collected by the Canadian Immunization Monitoring Program ACTive (IMPACT) over a period of 18 years. The inability of several of the standard S. pneumoniae MLST primers to fully sequence the required region was consistently observed and is the result of a shift in sequencing technology occurring after the original primers were designed. 
The results presented here introduce clear documentation describing this phenomenon into the literature, and provide additional guidance, through the introduction of a widely validated set of alternative primers, to research groups seeking to undertake S. pneumoniae MLST based studies.
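The two MLST rules stated in the abstract above (a typing allele counts only when reads from both strands match, and each unique combination of allele sequences defines a sequence type) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the allele numbers and the sequence type 9001 are hypothetical, and the loci gdh and gki are not named in the abstract but are assumed from the standard seven-locus S. pneumoniae scheme.

```python
from typing import Dict, Optional, Tuple

# Seven S. pneumoniae MLST loci. The abstract names five of them;
# gdh and gki are an assumption taken from the standard scheme.
LOCI = ("aroE", "gdh", "gki", "recP", "spi", "xpt", "ddl")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def confirmed_allele(forward_read: str, reverse_read: str) -> Optional[str]:
    """Accept an allele sequence only if both strands agree over the full region."""
    if forward_read == reverse_complement(reverse_read):
        return forward_read
    return None


def sequence_type(profile: Dict[str, int],
                  st_table: Dict[Tuple[int, ...], int]) -> Optional[int]:
    """Look up the sequence type for one allele number per locus."""
    key = tuple(profile[locus] for locus in LOCI)
    return st_table.get(key)


# Hypothetical allele numbers and ST, for illustration only.
ST_TABLE = {(1, 2, 3, 4, 5, 6, 7): 9001}
profile = dict(zip(LOCI, (1, 2, 3, 4, 5, 6, 7)))
print(sequence_type(profile, ST_TABLE))
```

In this sketch a read pair that disagrees on even one base yields no allele call, which mirrors the requirement for complete forward and reverse sequences; a real workflow would more likely trim and align the reads before comparing them.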
Target enrichment and high-throughput sequencing of 80 ribosomal protein genes to identify mutations associated with Diamond-Blackfan anaemia.

PubMed

Gerrard, Gareth; Valgañón, Mikel; Foong, Hui En; Kasperaviciute, Dalia; Iskander, Deena; Game, Laurence; Müller, Michael; Aitman, Timothy J; Roberts, Irene; de la Fuente, Josu; Foroni, Letizia; Karadimitris, Anastasios

2013-08-01

Diamond-Blackfan anaemia (DBA) is caused by inactivating mutations in ribosomal protein (RP) genes, with mutations in 13 of the 80 RP genes accounting for 50-60% of cases. The remaining 40-50% of cases may harbour mutations in one of the remaining RP genes, but the very low frequencies render conventional genetic screening challenging. We, therefore, applied custom enrichment technology combined with high-throughput sequencing to screen all 80 RP genes. Using this approach, we identified and validated inactivating mutations in 15/17 (88%) DBA patients. Target enrichment combined with high-throughput sequencing is a robust and improved methodology for the genetic diagnosis of DBA. © 2013 John Wiley & Sons Ltd.
An improved genome assembly uncovers prolific tandem repeats in Atlantic cod.

PubMed

Tørresen, Ole K; Star, Bastiaan; Jentoft, Sissel; Reinar, William B; Grove, Harald; Miller, Jason R; Walenz, Brian P; Knight, James; Ekholm, Jenny M; Peluso, Paul; Edvardsen, Rolf B; Tooming-Klunderud, Ave; Skage, Morten; Lien, Sigbjørn; Jakobsen, Kjetill S; Nederbragt, Alexander J

2017-01-18

The first Atlantic cod (Gadus morhua) genome assembly published in 2011 was one of the early genome assemblies exclusively based on high-throughput 454 pyrosequencing. Since then, rapid advances in sequencing technologies have led to a multitude of assemblies generated for complex genomes, although many of these are of a fragmented nature with a significant fraction of bases in gaps. The development of long-read sequencing and improved software now enables the generation of more contiguous genome assemblies. By combining data from Illumina, 454 and the longer PacBio sequencing technologies, as well as integrating the results of multiple assembly programs, we have created a substantially improved version of the Atlantic cod genome assembly. The sequence contiguity of this assembly is increased fifty-fold and the proportion of gap-bases has been reduced fifteen-fold. Compared to other vertebrates, the assembly contains an unusually high density of tandem repeats (TRs). Indeed, retrospective analyses reveal that gaps in the first genome assembly were largely associated with these TRs. We show that 21% of the TRs across the assembly, 19% in the promoter regions and 12% in the coding sequences are heterozygous in the sequenced individual. The inclusion of PacBio reads combined with the use of multiple assembly programs drastically improved the Atlantic cod genome assembly by successfully resolving long TRs. The high frequency of heterozygous TRs within or in the vicinity of genes in the genome indicates a considerable standing genomic variation in Atlantic cod populations, which is likely of evolutionary importance.
Nanopore sequencing technology and tools for genome assembly: computational analysis of the current state, bottlenecks and future directions.

PubMed

Senol Cali, Damla; Kim, Jeremie S; Ghose, Saugata; Alkan, Can; Mutlu, Onur

2018-04-02

Nanopore sequencing technology has the potential to render other sequencing technologies obsolete with its ability to generate long reads and provide portability. However, high error rates of the technology pose a challenge while generating accurate genome assemblies. The tools used for nanopore sequence analysis are of critical importance, as they should overcome the high error rates of the technology. Our goal in this work is to comprehensively analyze current publicly available tools for nanopore sequence analysis to understand their advantages, disadvantages and performance bottlenecks. It is important to understand where the current tools do not perform well to develop better tools. To this end, we (1) analyze the multiple steps and the associated tools in the genome assembly pipeline using nanopore sequence data, and (2) provide guidelines for determining the appropriate tools for each step. Based on our analyses, we make four key observations: (1) the choice of the tool for basecalling plays a critical role in overcoming the high error rates of nanopore sequencing technology. (2) Read-to-read overlap finding tools, GraphMap and Minimap, perform similarly in terms of accuracy. However, Minimap has a lower memory usage, and it is faster than GraphMap. (3) There is a trade-off between accuracy and performance when deciding on the appropriate tool for the assembly step. The fast but less accurate assembler Miniasm can be used for quick initial assembly, and further polishing can be applied on top of it to increase the accuracy, which leads to faster overall assembly. (4) The state-of-the-art polishing tool, Racon, generates high-quality consensus sequences while providing a significant speedup over another polishing tool, Nanopolish. We analyze various combinations of different tools and expose the trade-offs between accuracy, performance, memory usage and scalability. 
We conclude that our observations can guide researchers and practitioners in making conscious and effective choices for each step of the genome assembly pipeline using nanopore sequence data. Also, with the help of bottlenecks we have found, developers can improve the current tools or build new ones that are both accurate and fast, to overcome the high error rates of the nanopore sequencing technology.
T85C polymorphisms of the dihydropyrimidine dehydrogenase gene detected in gastric cancer tissues by high-resolution melting curve analysis.

PubMed

Fang, Weijia; Xu, Nong; Jin, Dazhi; Chen, Yu; Chen, Xiaogang; Zheng, Yi; Shen, Hong; Yuan, Ying; Zheng, Shusen

2012-01-01

Dihydropyrimidine dehydrogenase is a key enzyme acting on the metabolic pathway of medications for gastric cancer. High-resolution melting curve technology, which was developed recently, can distinguish the wild-type dihydropyrimidine dehydrogenase gene from multiple polymorphisms by fluorescent quantitative polymerase chain reaction products in a direct and effective manner. T85C polymorphisms of dihydropyrimidine dehydrogenase in the peripheral blood of 112 Chinese gastric cancer patients were detected by real-time polymerase chain reaction combined with high-resolution melting curve technology. Primer design, along with the reaction system and conditions, was optimized based on the GenBank sequence. Seventy nine cases of wild-type (TT, [70.5%]), 29 cases of heterozygous (TC, [25.9%]), and 4 cases of homozygous mutant (CC, [3.6%]) were observed. The result was completely consistent with the results of the sequencing. Real-time polymerase chain reaction combined with high-resolution melting curve technology is a rapid, simple, reliable, direct-viewing, and convenient method for the detection and screening of polymorphisms.
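The genotype percentages reported above follow directly from the counts for the 112 patients, and the same counts give the frequency of the C allele. The short sketch below reproduces that arithmetic; the allele frequency is a derived value that the abstract does not report, and the function names are illustrative only.

```python
from typing import Dict

# Genotype counts as reported in the abstract (112 patients in total).
COUNTS = {"TT": 79, "TC": 29, "CC": 4}


def genotype_percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Percentage of each genotype, rounded to one decimal place."""
    total = sum(counts.values())
    return {g: round(100 * n / total, 1) for g, n in counts.items()}


def c_allele_frequency(counts: Dict[str, int]) -> float:
    """Frequency of the C allele: heterozygotes carry one copy, CC carries two."""
    total_alleles = 2 * sum(counts.values())
    return (counts["TC"] + 2 * counts["CC"]) / total_alleles


print(genotype_percentages(COUNTS))
print(round(c_allele_frequency(COUNTS), 3))
```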
High throughput SNP discovery and genotyping in grapevine (Vitis vinifera L.) by combining a re-sequencing approach and SNPlex technology

PubMed Central

Lijavetzky, Diego; Cabezas, José Antonio; Ibáñez, Ana; Rodríguez, Virginia; Martínez-Zapater, José M

2007-01-01

Background: Single-nucleotide polymorphisms (SNPs) are the most abundant type of DNA sequence polymorphisms. Their higher availability and stability when compared to simple sequence repeats (SSRs) provide enhanced possibilities for genetic and breeding applications such as cultivar identification, construction of genetic maps, the assessment of genetic diversity, the detection of genotype/phenotype associations, or marker-assisted breeding. In addition, the efficiency of these activities can be improved thanks to the ease with which SNP genotyping can be automated. Expressed sequence tags (EST) sequencing projects in grapevine are allowing for the in silico detection of multiple putative sequence polymorphisms within and among a reduced number of cultivars. In parallel, the sequence of the grapevine cultivar Pinot Noir is also providing thousands of polymorphisms present in this highly heterozygous genome. Still, the general application of those SNPs requires further validation since their use could be restricted to those specific genotypes. Results: In order to develop a large SNP set of wide application in grapevine we followed a systematic re-sequencing approach in a group of 11 grape genotypes corresponding to ancient unrelated cultivars as well as wild plants. Using this approach, we have sequenced 230 gene fragments, which represents the analysis of over 1 Mb of grape DNA sequence. This analysis has allowed the discovery of 1573 SNPs with an average of one SNP every 64 bp (one SNP every 47 bp in non-coding regions and every 69 bp in coding regions). Nucleotide diversity in grape (π = 0.0051) was found to be similar to values observed in highly polymorphic plant species such as maize. The average number of haplotypes per gene sequence was estimated as six, with three haplotypes representing over 83% of the analyzed sequences. Short-range linkage disequilibrium (LD) studies within the analyzed sequences indicate the existence of a rapid decay of LD within the selected grapevine genotypes. To validate the use of the detected polymorphisms in genetic mapping, cultivar identification and genetic diversity studies we have used the SNPlex™ genotyping technology in a sample of grapevine genotypes and segregating progenies. Conclusion: These results provide accurate values for nucleotide diversity in coding sequences and a first estimate of short-range LD in grapevine. Using SNPlex™ genotyping we have shown the application of a set of discovered SNPs as molecular markers for cultivar identification, linkage mapping and genetic diversity studies. Thus, the combination of a highly efficient re-sequencing approach and the SNPlex™ high throughput genotyping technology provides a powerful tool for grapevine genetic analysis. PMID:18021442
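The two summary statistics quoted above, SNP density (bp per SNP) and nucleotide diversity π, can be illustrated on a toy alignment. The sketch below uses the textbook definition of π as the mean number of pairwise differences per site; it is not the authors' analysis, the three eight-base sequences are invented for illustration, and gaps or missing data are not handled.

```python
from itertools import combinations
from typing import List


def segregating_sites(aligned: List[str]) -> int:
    """Count alignment columns where more than one base is observed."""
    return sum(1 for column in zip(*aligned) if len(set(column)) > 1)


def nucleotide_diversity(aligned: List[str]) -> float:
    """Mean pairwise differences per site across equal-length sequences."""
    length = len(aligned[0])
    pairs = list(combinations(aligned, 2))
    total_differences = sum(
        sum(1 for a, b in zip(s1, s2) if a != b) for s1, s2 in pairs
    )
    return total_differences / (len(pairs) * length)


# Invented toy alignment: three haplotypes, eight aligned sites.
TOY = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]

snps = segregating_sites(TOY)
print(snps, len(TOY[0]) / snps)      # number of SNPs and bp per SNP
print(nucleotide_diversity(TOY))     # pi for the toy alignment
```

Applied to real data, the same two functions would be run per gene fragment and then averaged, which is one plausible way to arrive at figures such as one SNP every 64 bp.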
Image Encryption Algorithm Based on Hyperchaotic Maps and Nucleotide Sequences Database

PubMed Central

2017-01-01

Image encryption technology is one of the main means to ensure the safety of image information. Using the characteristics of chaos, such as randomness, regularity, ergodicity, and sensitivity to initial values, combined with the unique space conformation of DNA molecules and their unique information storage and processing ability, an efficient method for image encryption based on the chaos theory and a DNA sequence database is proposed. In this paper, digital image encryption employs a process of transforming the image pixel gray value by using chaotic sequence scrambling image pixel location and establishing superchaotic mapping, which maps quaternary sequences and DNA sequences, and by combining with the logic of the transformation between DNA sequences. The bases are replaced under the displaced rules by using DNA coding in a certain number of iterations that are based on the enhanced quaternary hyperchaotic sequence; the sequence is generated by Chen chaos. The cipher feedback mode and chaos iteration are employed in the encryption process to enhance the confusion and diffusion properties of the algorithm. Theoretical analysis and experimental results show that the proposed scheme not only demonstrates excellent encryption but also effectively resists chosen-plaintext attack, statistical attack, and differential attack. PMID:28392799
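The general idea described above (chaotic scrambling of pixel positions, then substitution of DNA-coded pixel values using a chaotic key sequence) can be sketched in a much simplified form. The following is not the published scheme and is not secure: it substitutes a one-dimensional logistic map for the Chen hyperchaotic system, uses a single fixed coding rule (A=00, C=01, G=10, T=11), omits the cipher feedback mode and the DNA sequence database, and all names and parameter values are assumptions for illustration.

```python
from typing import List, Tuple

BASES = "ACGT"  # assumed coding rule: A=00, C=01, G=10, T=11


def logistic_stream(x0: float, r: float, n: int, burn_in: int = 100) -> List[float]:
    """Iterate the logistic map x -> r*x*(1-x), discarding a burn-in prefix."""
    x, out = x0, []
    for i in range(burn_in + n):
        x = r * x * (1.0 - x)
        if i >= burn_in:
            out.append(x)
    return out


def byte_to_dna(value: int) -> str:
    """Encode one 8-bit gray value as four bases, two bits per base."""
    return "".join(BASES[(value >> shift) & 3] for shift in (6, 4, 2, 0))


def dna_to_byte(dna: str) -> int:
    value = 0
    for base in dna:
        value = (value << 2) | BASES.index(base)
    return value


def dna_xor(a: str, b: str) -> str:
    """Base-wise XOR under the coding rule above."""
    return "".join(BASES[BASES.index(x) ^ BASES.index(y)] for x, y in zip(a, b))


def _schedule(n: int, x0: float, r: float) -> Tuple[List[int], List[int]]:
    """Derive a position permutation and a key byte stream from the chaotic orbit."""
    chaos = logistic_stream(x0, r, 2 * n)
    perm = sorted(range(n), key=lambda i: chaos[i])
    key = [int(v * 256) % 256 for v in chaos[n:]]
    return perm, key


def encrypt(pixels: List[int], x0: float = 0.3141, r: float = 3.99) -> List[str]:
    perm, key = _schedule(len(pixels), x0, r)
    return [dna_xor(byte_to_dna(pixels[p]), byte_to_dna(k)) for p, k in zip(perm, key)]


def decrypt(cipher: List[str], x0: float = 0.3141, r: float = 3.99) -> List[int]:
    perm, key = _schedule(len(cipher), x0, r)
    pixels = [0] * len(cipher)
    for dna, p, k in zip(cipher, perm, key):
        pixels[p] = dna_to_byte(dna_xor(dna, byte_to_dna(k)))
    return pixels


image = [0, 27, 128, 200, 255, 64, 13, 99]  # toy flattened gray-scale image
assert decrypt(encrypt(image)) == image
```

Here the pair (x0, r) plays the role of the secret key; decryption regenerates the same orbit, undoes the substitution and then puts each pixel back in its original position.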
Agility in adversity: Vaccines on Demand.

PubMed

De Groot, Anne S; Moise, Leonard; Olive, David; Einck, Leo; Martin, William

2016-09-01

Is the US ready for a biological attack using Ebola virus or Anthrax? Will vaccine developers be able to produce a Zika virus vaccine before the epidemic spreads around the world? A recent report by The Blue Ribbon Study Panel on Biodefense argues that the US is not ready for these challenges; however, technologies and capabilities that could address these deficiencies are within reach. Vaccine technologies have advanced and readiness has improved in recent years, due to advances in sequencing technology and computational power, making the 'vaccines on demand' concept a reality. Building a robust strategy to design effective biodefense vaccines from genome sequences harvested by real-time biosurveillance will benefit from technologies that are being brought to bear on the cancer cure 'moonshot'. When combined with flexible vaccine production platforms, vaccines on demand will relegate expensive and, in some cases, insufficiently effective vaccine stockpiles to the dust heap of history.
Coverage Bias and Sensitivity of Variant Calling for Four Whole-genome Sequencing Technologies

PubMed Central

Lasitschka, Bärbel; Jones, David; Northcott, Paul; Hutter, Barbara; Jäger, Natalie; Kool, Marcel; Taylor, Michael; Lichter, Peter; Pfister, Stefan; Wolf, Stephan; Brors, Benedikt; Eils, Roland

2013-01-01

The emergence of high-throughput, next-generation sequencing technologies has dramatically altered the way we assess genomes in population genetics and in cancer genomics. Currently, there are four commonly used whole-genome sequencing platforms on the market: Illumina’s HiSeq2000, Life Technologies’ SOLiD 4 and its completely redesigned 5500xl SOLiD, and Complete Genomics’ technology. A number of earlier studies have compared a subset of those sequencing platforms or compared those platforms with Sanger sequencing, which is prohibitively expensive for whole genome studies. Here we present a detailed comparison of the performance of all currently available whole genome sequencing platforms, especially regarding their ability to call SNVs and to evenly cover the genome and specific genomic regions. Unlike earlier studies, we base our comparison on four different samples, allowing us to assess the between-sample variation of the platforms. We find a pronounced GC bias in GC-rich regions for Life Technologies’ platforms, with Complete Genomics performing best here, while we see the least bias in GC-poor regions for HiSeq2000 and 5500xl. HiSeq2000 gives the most uniform coverage and displays the least sample-to-sample variation. In contrast, Complete Genomics exhibits by far the smallest fraction of bases not covered, while the SOLiD platforms reveal remarkable shortcomings, especially in covering CpG islands. When comparing the performance of the four platforms for calling SNPs, HiSeq2000 and Complete Genomics achieve the highest sensitivity, while the SOLiD platforms show the lowest false positive rate. Finally, we find that integrating sequencing data from different platforms offers the potential to combine the strengths of different technologies. In summary, our results detail the strengths and weaknesses of all four whole-genome sequencing platforms. It indicates application areas that call for a specific sequencing platform and disallow other platforms. 
This helps to identify the proper sequencing platform for whole genome studies with different application scopes. PMID:23776689
Label-free screening of single biomolecules through resistive pulse sensing technology for precision medicine applications

NASA Astrophysics Data System (ADS)

Harrer, S.; Kim, S. C.; Schieber, C.; Kannam, S.; Gunn, N.; Moore, S.; Scott, D.; Bathgate, R.; Skafidas, S.; Wagner, J. M.

2015-05-01

Employing integrated nano- and microfluidic circuits for detecting and characterizing biological compounds through resistive pulse sensing technology is a vibrant area of research at the interface of biotechnology and nanotechnology. Resistive pulse sensing platforms can be customized to study virtually any particle of choice which can be threaded through a fluidic channel and enable label-free single-particle interrogation with the primary read-out signal being an electric current fingerprint. The ability to perform label-free molecular screening with single-molecule and even single binding site resolution makes resistive pulse sensing technology a powerful tool for analyzing the smallest units of biological systems and how they interact with each other on a molecular level. This task is at the core of experimental systems biology and in particular ‘omics research which in combination with next-generation DNA-sequencing and next-generation drug discovery and design forms the foundation of a novel disruptive medical paradigm commonly referred to as personalized medicine or precision medicine. DNA-sequencing has approached the 1000-Dollar-Genome milestone allowing for decoding a complete human genome with unmatched speed and at low cost. Increased sequencing efficiency yields massive amounts of genomic data. Analyzing this data in combination with medical and biometric health data eventually enables understanding the pathways from individual genes to physiological functions. Access to this information triggers fundamental questions for doctors and patients alike: what are the chances of an outbreak for a specific disease? Can individual risks be managed and if so how? Which drugs are available and how should they be applied? Could a new drug be tailored to an individual’s genetic predisposition fast and in an affordable way? 
In order to provide answers and real-life value to patients, the rapid evolvement of novel computing approaches for analyzing big data in systems genomics has to be accompanied by an equally strong effort to develop next-generation DNA-sequencing and next-generation drug screening and design platforms. In that context lab-on-a-chip devices utilizing nanopore- and nanochannel based resistive pulse-sensing technology for DNA-sequencing and protein screening applications occupy a key role. This paper describes the status quo of resistive pulse sensing technology for these two application areas with a special focus on current technology trends and challenges ahead.
Label-free screening of single biomolecules through resistive pulse sensing technology for precision medicine applications.

PubMed

Harrer, S; Kim, S C; Schieber, C; Kannam, S; Gunn, N; Moore, S; Scott, D; Bathgate, R; Skafidas, S; Wagner, J M

2015-05-08

Employing integrated nano- and microfluidic circuits for detecting and characterizing biological compounds through resistive pulse sensing technology is a vibrant area of research at the interface of biotechnology and nanotechnology. Resistive pulse sensing platforms can be customized to study virtually any particle of choice which can be threaded through a fluidic channel and enable label-free single-particle interrogation with the primary read-out signal being an electric current fingerprint. The ability to perform label-free molecular screening with single-molecule and even single binding site resolution makes resistive pulse sensing technology a powerful tool for analyzing the smallest units of biological systems and how they interact with each other on a molecular level. This task is at the core of experimental systems biology and in particular 'omics research which in combination with next-generation DNA-sequencing and next-generation drug discovery and design forms the foundation of a novel disruptive medical paradigm commonly referred to as personalized medicine or precision medicine. DNA-sequencing has approached the 1000-Dollar-Genome milestone allowing for decoding a complete human genome with unmatched speed and at low cost. Increased sequencing efficiency yields massive amounts of genomic data. Analyzing this data in combination with medical and biometric health data eventually enables understanding the pathways from individual genes to physiological functions. Access to this information triggers fundamental questions for doctors and patients alike: what are the chances of an outbreak for a specific disease? Can individual risks be managed and if so how? Which drugs are available and how should they be applied? Could a new drug be tailored to an individual's genetic predisposition fast and in an affordable way? 
In order to provide answers and real-life value to patients, the rapid evolvement of novel computing approaches for analyzing big data in systems genomics has to be accompanied by an equally strong effort to develop next-generation DNA-sequencing and next-generation drug screening and design platforms. In that context lab-on-a-chip devices utilizing nanopore- and nanochannel based resistive pulse-sensing technology for DNA-sequencing and protein screening applications occupy a key role. This paper describes the status quo of resistive pulse sensing technology for these two application areas with a special focus on current technology trends and challenges ahead.

Adaptive and perceptual learning technologies in medical education and training.

PubMed

Kellman, Philip J

2013-10-01

Recent advances in the learning sciences offer remarkable potential to improve medical education and maximize the benefits of emerging medical technologies. This article describes 2 major innovation areas in the learning sciences that apply to simulation and other aspects of medical learning: Perceptual learning (PL) and adaptive learning technologies. PL technology offers, for the first time, systematic, computer-based methods for teaching pattern recognition, structural intuition, transfer, and fluency. Synergistic with PL are new adaptive learning technologies that optimize learning for each individual, embed objective assessment, and implement mastery criteria. The author describes the Adaptive Response-Time-based Sequencing (ARTS) system, which uses each learner's accuracy and speed in interactive learning to guide spacing, sequencing, and mastery. In recent efforts, these new technologies have been applied in medical learning contexts, including adaptive learning modules for initial medical diagnosis and perceptual/adaptive learning modules (PALMs) in dermatology, histology, and radiology. Results of all these efforts indicate the remarkable potential of perceptual and adaptive learning technologies, individually and in combination, to improve learning in a variety of medical domains. Reprint & Copyright © 2013 Association of Military Surgeons of the U.S.
Genotypic tropism testing by massively parallel sequencing: qualitative and quantitative analysis.

PubMed

Däumer, Martin; Kaiser, Rolf; Klein, Rolf; Lengauer, Thomas; Thiele, Bernhard; Thielen, Alexander

2011-05-13

Inferring viral tropism from genotype is a fast and inexpensive alternative to phenotypic testing. Although highly predictive on clonal samples, the sensitivity of predicting CXCR4-using (X4) variants drops substantially in clinical isolates. This is mainly attributed to minor variants not detected by standard bulk-sequencing. Massively parallel sequencing (MPS) detects single clones and is therefore much more sensitive. Using this technology we wanted to improve genotypic prediction of coreceptor usage. Plasma samples from 55 antiretroviral-treated patients tested for coreceptor usage with the Monogram Trofile Assay were sequenced with standard population-based approaches. Fourteen of these samples were selected for further analysis with MPS. Tropism was predicted from each sequence with geno2pheno[coreceptor]. Prediction based on bulk-sequencing yielded 59.1% sensitivity and 90.9% specificity compared to the Trofile assay. With MPS, 7600 reads were generated on average per isolate. Minorities of sequences with high confidence in CXCR4-usage were found in all samples, irrespective of phenotype. When using the default false-positive-rate of geno2pheno[coreceptor] (10%), and defining a minority cutoff of 5%, the results were concordant in all but one isolate. The combination of MPS and coreceptor usage prediction results in a fast and accurate alternative to phenotypic assays. The detection of X4-viruses in all isolates suggests that coreceptor usage as well as fitness of minorities is important for therapy outcome. The high sensitivity of this technology in combination with a quantitative description of the viral population may allow implementing meaningful cutoffs for predicting response to CCR5-antagonists in the presence of X4-minorities.
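The two-threshold rule described in this abstract (a per-read false-positive-rate cutoff of 10% and a minority cutoff of 5%) can be illustrated with a minimal Python sketch. It assumes each read already carries a false-positive rate in percent, that a read at or below the cutoff counts as X4, and that a sample is called X4 when the X4 fraction reaches the minority cutoff. The function names and the inclusive comparisons are hypothetical choices made for illustration, not the authors' implementation and not the geno2pheno[coreceptor] API.

```python
from typing import Sequence


def x4_fraction(read_fprs: Sequence[float], fpr_cutoff: float = 10.0) -> float:
    """Fraction of reads whose false-positive rate (in %) is at or below the cutoff.

    Assumption: a low false-positive rate means the read is predicted X4.
    """
    if not read_fprs:
        raise ValueError("at least one read is required")
    x4_reads = sum(1 for fpr in read_fprs if fpr <= fpr_cutoff)
    return x4_reads / len(read_fprs)


def call_tropism(read_fprs: Sequence[float],
                 fpr_cutoff: float = 10.0,
                 minority_cutoff: float = 0.05) -> str:
    """Call the sample X4 when the X4 minority reaches the cutoff, else R5."""
    if x4_fraction(read_fprs, fpr_cutoff) >= minority_cutoff:
        return "X4"
    return "R5"


# Hypothetical samples: 3% and 6% of reads fall below the 10% FPR cutoff.
sample_low = [2.0] * 3 + [50.0] * 97
sample_high = [2.0] * 6 + [50.0] * 94
print(call_tropism(sample_low), call_tropism(sample_high))
```

Keeping the two thresholds as separate parameters makes it easy to explore how the call changes as either cutoff is varied.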
A bioinformatics approach for identifying transgene insertion sites using whole genome sequencing data.

PubMed

Park, Doori; Park, Su-Hyun; Ban, Yong Wook; Kim, Youn Shic; Park, Kyoung-Cheul; Kim, Nam-Soo; Kim, Ju-Kon; Choi, Ik-Young

2017-08-15

Genetically modified crops (GM crops) have been developed to improve the agricultural traits of modern crop cultivars. Safety assessments of GM crops are of paramount importance in research at developmental stages and before releasing transgenic plants into the marketplace. Sequencing technology is developing rapidly, with higher output and labor efficiencies, and will eventually replace existing methods for the molecular characterization of genetically modified organisms. To detect the transgenic insertion locations in the three GM rice genomes, Illumina sequencing reads are mapped to the rice genome and the plasmid sequence and classified accordingly. Reads that map to both are then classified by sequence alignment to characterize the junction site between plant and transgene sequence. Herein, we present a next generation sequencing (NGS)-based molecular characterization method, using transgenic rice plants SNU-Bt9-5, SNU-Bt9-30, and SNU-Bt9-109. Specifically, using bioinformatics tools, we detected the precise insertion locations and copy numbers of transfer DNA, genetic rearrangements, and the absence of backbone sequences, which were equivalent to results obtained from Southern blot analyses. NGS methods have been suggested as an effective means of characterizing and detecting transgenic insertion locations in genomes. Our results demonstrate the use of a combination of NGS technology and bioinformatics approaches that offers cost- and time-effective methods for assessing the safety of transgenic plants.
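The read classification step can be sketched in Python under simplifying assumptions: alignments are given as (read identifier, reference name, position) tuples, the plasmid reference is literally named "plasmid", and any read with hits on both the plasmid and a genomic sequence is treated as a junction candidate. These names and the input layout are hypothetical; the abstract does not specify the tools or file formats the authors used.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Alignment = Tuple[str, str, int]  # (read_id, reference_name, position)


def junction_candidates(alignments: Iterable[Alignment],
                        plasmid_name: str = "plasmid") -> Dict[str, List[Tuple[str, int]]]:
    """Return the genomic hits of every read that also aligns to the plasmid."""
    hits_by_read = defaultdict(list)
    for read_id, reference, position in alignments:
        hits_by_read[read_id].append((reference, position))

    candidates = {}
    for read_id, hits in hits_by_read.items():
        references = {reference for reference, _ in hits}
        # A junction read must touch the plasmid and at least one other sequence.
        if plasmid_name in references and len(references) > 1:
            candidates[read_id] = sorted(h for h in hits if h[0] != plasmid_name)
    return candidates


# Hypothetical alignments: only r1 spans the genome and the plasmid.
example = [
    ("r1", "chr3", 1200),
    ("r1", "plasmid", 50),
    ("r2", "chr3", 900),
    ("r3", "plasmid", 10),
]
print(junction_candidates(example))
```

Clustering the returned genomic positions would then point to candidate insertion sites, which would still need confirmation by alignment of the junction sequence itself.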
Genome-wide ChIP-seq mapping and analysis of butyrate-induced H3K9 and H3K27 acetylation and epigenomic landscape alteration in bovine cells

USDA-ARS's Scientific Manuscript database

Utilizing next-generation sequencing technology, combined with ChIP (Chromatin Immunoprecipitation) technology, we analyzed histone modification (acetylation) induced by butyrate and the large-scale mapping of the epigenomic landscape of normal histone H3 and acetylated histone H3K9 and H3K27. To d...
Next-generation sequencing coupled with a cell-free display technology for high-throughput production of reliable interactome data

PubMed Central

Fujimori, Shigeo; Hirai, Naoya; Ohashi, Hiroyuki; Masuoka, Kazuyo; Nishikimi, Akihiko; Fukui, Yoshinori; Washio, Takanori; Oshikubo, Tomohiro; Yamashita, Tatsuhiro; Miyamoto-Sato, Etsuko

2012-01-01

Next-generation sequencing (NGS) has been applied to various kinds of omics studies, resulting in many biological and medical discoveries. However, high-throughput protein-protein interactome datasets derived from detection by sequencing are scarce, because protein-protein interaction analysis requires many cell manipulations to examine the interactions. The low reliability of the high-throughput data is also a problem. Here, we describe a cell-free display technology combined with NGS that can improve both the coverage and reliability of interactome datasets. The completely cell-free method gives a high-throughput and a large detection space, testing the interactions without using clones. The quantitative information provided by NGS reduces the number of false positives. The method is suitable for the in vitro detection of proteins that interact not only with the bait protein, but also with DNA, RNA and chemical compounds. Thus, it could become a universal approach for exploring the large space of protein sequences and interactome networks. PMID:23056904
Designing deep sequencing experiments: detecting structural variation and estimating transcript abundance.

PubMed

Bashir, Ali; Bansal, Vikas; Bafna, Vineet

2010-06-18

Massively parallel DNA sequencing technologies have enabled the sequencing of several individual human genomes. These technologies are also being used in novel ways for mRNA expression profiling, genome-wide discovery of transcription-factor binding sites, small RNA discovery, etc. The multitude of sequencing platforms, each with their unique characteristics, pose a number of design challenges, regarding the technology to be used and the depth of sequencing required for a particular sequencing application. Here we describe a number of analytical and empirical results to address design questions for two applications: detection of structural variations from paired-end sequencing and estimating mRNA transcript abundance. For structural variation, our results provide explicit trade-offs between the detection and resolution of rearrangement breakpoints, and the optimal mix of paired-read insert lengths. Specifically, we prove that optimal detection and resolution of breakpoints is achieved using a mix of exactly two insert library lengths. Furthermore, we derive explicit formulae to determine these insert length combinations, enabling a 15% improvement in breakpoint detection at the same experimental cost. On empirical short read data, these predictions show good concordance with Illumina 200 bp and 2 Kbp insert length libraries. For transcriptome sequencing, we determine the sequencing depth needed to detect rare transcripts from a small pilot study. With only 1 Million reads, we derive corrections that enable almost perfect prediction of the underlying expression probability distribution, and use this to predict the sequencing depth required to detect low expressed genes with greater than 95% probability. Together, our results form a generic framework for many design considerations related to high-throughput sequencing. 
We provide software tools (http://bix.ucsd.edu/projects/NGS-DesignTools) to derive platform-independent guidelines for designing sequencing experiments (amount of sequencing, choice of insert length, mix of libraries) for novel applications of next generation sequencing.
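The depth question for transcript detection can be illustrated with a deliberately simple model, which is an assumption made here and not the authors' corrected estimator: if each read independently comes from a given transcript with probability p, the chance of seeing it at least once in N reads is 1 - (1 - p)^N, so detection with probability q requires N >= ln(1 - q) / ln(1 - p). The function name below is hypothetical.

```python
import math


def reads_needed(p: float, detect_prob: float = 0.95) -> int:
    """Smallest read count N with 1 - (1 - p)**N >= detect_prob.

    Assumes reads are sampled independently and that p is the fraction of
    reads originating from the transcript of interest.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0.0 < detect_prob < 1.0:
        raise ValueError("detect_prob must lie strictly between 0 and 1")
    # log1p(-p) evaluates ln(1 - p) accurately for very small p.
    return math.ceil(math.log(1.0 - detect_prob) / math.log1p(-p))


# A transcript contributing one read in a million needs roughly 3 million reads.
print(reads_needed(1e-6))
```

Under this model the required depth scales roughly as 3/p for 95% detection, which is why rare transcripts dominate the sequencing budget.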
SSPACE-LongRead: scaffolding bacterial draft genomes using long read sequence information

PubMed Central

2014-01-01

Background The recent introduction of the Pacific Biosciences RS single molecule sequencing technology has opened new doors to scaffolding genome assemblies in a cost-effective manner. The long read sequence information promises to enhance the quality of incomplete and inaccurate draft assemblies constructed from Next Generation Sequencing (NGS) data. Results Here we propose a novel hybrid assembly methodology that aims to scaffold pre-assembled contigs in an iterative manner using PacBio RS long read information as a backbone. On a test set comprising six bacterial draft genomes, assembled using either a single Illumina MiSeq or Roche 454 library, we show that even a 50× coverage of uncorrected PacBio RS long reads is sufficient to drastically reduce the number of contigs. Comparisons to the AHA scaffolder indicate our strategy is better able to produce (nearly) complete bacterial genomes. Conclusions The current work describes our SSPACE-LongRead software, which is designed to upgrade incomplete draft genomes using single molecule sequences. We conclude that the recent advances of the PacBio sequencing technology and chemistry, in combination with the limited computational resources required to run our program, allow genomes to be scaffolded in a fast and reliable manner. PMID:24950923
Identification of an EMS-induced causal mutation in a gene required for boron-mediated root development by low-coverage genome re-sequencing in Arabidopsis

PubMed Central

Tabata, Ryo; Kamiya, Takehiro; Shigenobu, Shuji; Yamaguchi, Katsushi; Yamada, Masashi; Hasebe, Mitsuyasu; Fujiwara, Toru; Sawa, Shinichiro

2013-01-01

Next-generation sequencing (NGS) technologies enable the rapid production of an enormous quantity of sequence data. These powerful new technologies allow the identification of mutations by whole-genome sequencing. However, most reported NGS-based mapping methods, which are based on bulked segregant analysis, are costly and laborious. To address these limitations, we designed a versatile NGS-based mapping method that consists of a combination of low- to medium-coverage multiplex SOLiD (Sequencing by Oligonucleotide Ligation and Detection) and classical genetic rough mapping. Using only low to medium coverage reduces the SOLiD sequencing costs and, since just 10 to 20 mutant F2 plants are required for rough mapping, the operation is simple enough to handle in a laboratory with limited space and funding. As a proof of principle, we successfully applied this method to identify the CTR1, which is involved in boron-mediated root development, from among a population of high boron requiring Arabidopsis thaliana mutants. Our work demonstrates that this NGS-based mapping method is a moderately priced and versatile method that can readily be applied to other model organisms. PMID:23104114
Modeling Complex Phenomena Using Multiscale Time Sequences

DTIC Science & Technology

2009-08-24

...different scales and how these scales relate to each other. This can be done by combining a set of statistical fractal measures based on Hurst and Hölder exponents, auto-regressive methods and Fourier and wavelet decomposition methods. The applications for this technology...
A Review on the Applications of Next Generation Sequencing Technologies as Applied to Food-Related Microbiome Studies

PubMed Central

Cao, Yu; Fanning, Séamus; Proos, Sinéad; Jordan, Kieran; Srikumar, Shabarinath

2017-01-01

The development of next generation sequencing (NGS) techniques has enabled researchers to study and understand the world of microorganisms from broader and deeper perspectives. The contemporary advances in DNA sequencing technologies have not only enabled finer characterization of bacterial genomes but also provided deeper taxonomic identification of complex microbiomes. A microbiome, in its genomic essence, is the combined genetic material of the microorganisms inhabiting an environment, whether the environment be a particular body econiche (e.g., human intestinal contents) or a food manufacturing facility econiche (e.g., floor drain). To date, 16S rDNA sequencing, metagenomics and metatranscriptomics are the three basic sequencing strategies used in the taxonomic identification and characterization of food-related microbiomes. These sequencing strategies have used different NGS platforms for DNA and RNA sequence identification. Traditionally, 16S rDNA sequencing has played a key role in understanding the taxonomic composition of a food-related microbiome. Recently, metagenomic approaches have resulted in improved understanding of a microbiome by providing a species-level/strain-level characterization. Further, metatranscriptomic approaches have contributed to the functional characterization of the complex interactions between different microbial communities within a single microbiome. Many studies have highlighted the use of NGS techniques in investigating the microbiome of fermented foods. However, the utilization of NGS techniques in studying the microbiome of non-fermented foods is limited. This review provides a brief overview of the advances in DNA sequencing chemistries as the technology progressed through the first, next and third generations and highlights how NGS provided a deeper understanding of food-related microbiomes, with special focus on non-fermented foods. PMID:29033905
Evaluation and validation of de novo and hybrid assembly techniques to derive high quality genome sequences

DOE PAGES

Utturkar, Sagar M.; Klingeman, Dawn Marie; Land, Miriam L.; ...

2014-06-14

Our motivation with this work was to assess the potential of different types of sequence data combined with de novo and hybrid assembly approaches to improve existing draft genome sequences. Our results show Illumina, 454 and PacBio sequencing technologies were used to generate de novo and hybrid genome assemblies for four different bacteria, which were assessed for quality using summary statistics (e.g. number of contigs, N50) and in silico evaluation tools. Differences in predictions of multiple copies of rDNA operons for each respective bacterium were evaluated by PCR and Sanger sequencing, and then the validated results were applied as an additional criterion to rank assemblies. In general, assemblies using longer PacBio reads were better able to resolve repetitive regions. In this study, the combination of Illumina and PacBio sequence data assembled through the ALLPATHS-LG algorithm gave the best summary statistics and most accurate rDNA operon number predictions. This study will aid others looking to improve existing draft genome assemblies. As to availability and implementation–all assembly tools except CLC Genomics Workbench are freely available under GNU General Public License.
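Since N50 is one of the summary statistics named above, a short Python sketch of its usual definition may help: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the assembly size. The function name is hypothetical, and assembly tools may differ in details such as minimum contig length filters.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Contig length at which the cumulative length reaches half the assembly."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: the running total always reaches half")


# Hypothetical assembly of six contigs totalling 290 kb.
print(n50([80, 70, 50, 40, 30, 20]))
```

A higher N50 with fewer contigs generally indicates a more contiguous assembly, although it says nothing about correctness, which is why the study pairs it with PCR and Sanger validation.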
Combining multiple ChIP-seq peak detection systems using combinatorial fusion.

PubMed

Schweikert, Christina; Brown, Stuart; Tang, Zuojian; Smith, Phillip R; Hsu, D Frank

2012-01-01

Due to the recent rapid development in ChIP-seq technologies, which use high-throughput next-generation DNA sequencing to identify the targets of Chromatin Immunoprecipitation, there is an increasing amount of sequencing data being generated that provides us with greater opportunity to analyze genome-wide protein-DNA interactions. In particular, we are interested in evaluating and enhancing computational and statistical techniques for locating protein binding sites. Many peak detection systems have been developed; in this study, we utilize the following six: CisGenome, MACS, PeakSeq, QuEST, SISSRs, and TRLocator. We define two methods to merge and rescore the regions of two peak detection systems and analyze the performance based on average precision and coverage of transcription start sites. The results indicate that ChIP-seq peak detection can be improved by fusion using score or rank combination. Our method of combination and fusion analysis would provide a means for generic assessment of available technologies and systems and assist researchers in choosing an appropriate system (or fusion method) for analyzing ChIP-seq data. This analysis offers an alternate approach for increasing true positive rates, while decreasing false positive rates and hence improving the ChIP-seq peak identification process.
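Score and rank combination can be sketched in Python as follows. The sketch assumes the regions from two peak callers have already been merged into shared identifiers, that scores are min-max normalized before averaging, and that ranks are averaged with rank 1 as the best. All names are hypothetical, and the abstract does not state the exact merging or rescoring rules the authors defined.

```python
from typing import Dict


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Min-max normalize scores to the interval [0, 1]."""
    low, high = min(scores.values()), max(scores.values())
    span = high - low
    return {region: (value - low) / span if span else 1.0
            for region, value in scores.items()}


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank regions by descending score; rank 1 is the strongest peak."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {region: position + 1 for position, region in enumerate(ordered)}


def score_combination(a: Dict[str, float], b: Dict[str, float]) -> Dict[str, float]:
    """Average of normalized scores for regions reported by both systems."""
    norm_a, norm_b = normalize(a), normalize(b)
    return {region: (norm_a[region] + norm_b[region]) / 2 for region in a.keys() & b.keys()}


def rank_combination(a: Dict[str, float], b: Dict[str, float]) -> Dict[str, float]:
    """Average of ranks for regions reported by both systems (lower is better)."""
    rank_a, rank_b = ranks(a), ranks(b)
    return {region: (rank_a[region] + rank_b[region]) / 2 for region in a.keys() & b.keys()}


# Hypothetical scores from two peak callers over three merged regions.
system_a = {"peak1": 10.0, "peak2": 5.0, "peak3": 0.0}
system_b = {"peak1": 1.0, "peak2": 3.0, "peak3": 2.0}
print(score_combination(system_a, system_b))
print(rank_combination(system_a, system_b))
```

In this toy input both fusion methods favor peak2, which neither system alone ranks consistently first; this is the kind of effect that fusion is meant to exploit.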
Transcriptome characterization and polymorphism detection between subspecies of big sagebrush (Artemisia tridentata)

PubMed Central

2011-01-01

Background Big sagebrush (Artemisia tridentata) is one of the most widely distributed and ecologically important shrub species in western North America. This species serves as a critical habitat and food resource for many animals and invertebrates. Habitat loss due to a combination of disturbances followed by establishment of invasive plant species is a serious threat to big sagebrush ecosystem sustainability. Lack of genomic data has limited our understanding of the evolutionary history and ecological adaptation in this species. Here, we report on the sequencing of expressed sequence tags (ESTs) and detection of single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers in subspecies of big sagebrush. Results cDNA of A. tridentata sspp. tridentata and vaseyana were normalized and sequenced using the 454 GS FLX Titanium pyrosequencing technology. Assembly of the reads resulted in 20,357 contig consensus sequences in ssp. tridentata and 20,250 contigs in ssp. vaseyana. A BLASTx search against the non-redundant (NR) protein database using 29,541 consensus sequences obtained from a combined assembly resulted in 21,436 sequences with significant blast alignments (≤ 1e-15). A total of 20,952 SNPs and 119 polymorphic SSRs were detected between the two subspecies. SNPs were validated through various methods including sequence capture. Validation of SNPs in different individuals uncovered a high level of nucleotide variation in EST sequences. EST sequences of a third, tetraploid subspecies (ssp. wyomingensis) obtained by Illumina sequencing were mapped to the consensus sequences of the combined 454 EST assembly. Approximately one-third of the SNPs between sspp. tridentata and vaseyana identified in the combined assembly were also polymorphic within the two geographically distant ssp. wyomingensis samples. Conclusion We have produced a large EST dataset for Artemisia tridentata, which contains a large sample of the big sagebrush leaf transcriptome. 
SNP mapping among the three subspecies suggests the origin of ssp. wyomingensis via mixed ancestry. A large number of SNP and SSR markers provide the foundation for future research to address questions in big sagebrush evolution, ecological genetics, and conservation using genomic approaches. PMID:21767398
Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing

NASA Astrophysics Data System (ADS)

Ferreira, Pedro G.; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R.; Rivas, Manuel A.; Esteve-Codina, Anna; Estivill, Xavier; Guigó, Roderic; Dermitzakis, Emmanouil; Antonarakis, Stylianos; Meitinger, Thomas; Strom, Tim M.; Palotie, Aarno; François Deleuze, Jean; Sudbrak, Ralf; Lerach, Hans; Gut, Ivo; Syvänen, Ann-Christine; Gyllensten, Ulf; Schreiber, Stefan; Rosenstiel, Philip; Brunner, Han; Veltman, Joris; Hoen, Peter A. C. T.; Jan van Ommen, Gert; Carracedo, Angel; Brazma, Alvis; Flicek, Paul; Cambon-Thomsen, Anne; Mangion, Jonathan; Bentley, David; Hamosh, Ada; Rosenstiel, Philip; Strom, Tim M.; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

2016-09-01

Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing—alternative splice sites, introns, and cleavage sites—which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts.
Transcriptome characterization for genome annotation and functional genomics in Theobroma cacao

USDA-ARS's Scientific Manuscript database

Evidence from leaf transcriptome sequencing using two technology platforms, in combination with protein homology and trained ab initio predictions, previously enabled us to build 35,000 gene models in T. cacao (www.cacaogenomedb.org). Here we review the contribution of each data type to cacao gene a...
Characterization of microRNAs from goat (Capra hircus) by Solexa deep-sequencing technology.

PubMed

Ling, Y H; Ding, J P; Zhang, X D; Wang, L J; Zhang, Y H; Li, Y S; Zhang, Z J; Zhang, X R

2013-06-13

MicroRNAs (miRNAs) are an important class of small noncoding RNAs that are highly conserved in plants and animals. Many miRNAs are known to mediate a myriad of cell processes, including proliferation and differentiation, via the regulation of some transcription and signaling factors, which are closely related to muscle development and disease. In this study, small RNA cDNA libraries of Boer goats were constructed. In addition, we obtained the goat muscle miRNAs by using Solexa deep-sequencing technology and analyzed their characteristics using bioinformatics. Based on Solexa sequencing and bioinformatics analysis, 562 species-conserved and 5 goat genome-specific miRNAs were identified, 322 of which exceeded 100 in expression level. The results of real-time quantitative polymerase chain reaction for 8 randomly selected miRNAs showed that the 8 miRNAs were expressed in goat muscle, and the expression patterns were consistent with the Solexa sequencing results. The identification and characterization of miRNAs in goat muscle provide important information on the role of miRNA regulation in muscle growth and development. These data will help to facilitate studies on the regulatory roles played by miRNAs during goat growth and development.
Comparative Transcriptomes and EVO-DEVO Studies Depending on Next Generation Sequencing.

PubMed

Liu, Tiancheng; Yu, Lin; Liu, Lei; Li, Hong; Li, Yixue

2015-01-01

High-throughput technology has driven progress in omics studies, including genomics and transcriptomics. We review the improvements in comparative omics studies that are attributable to high-throughput measurement by next generation sequencing technology. Comparative genomics has been successfully applied to evolutionary analysis, while comparative transcriptomics is used to compare the expression profiles of two subjects by differential expression or differential coexpression, which enables its application in evolutionary developmental biology (EVO-DEVO) studies. EVO-DEVO studies focus on the evolutionary pressure affecting morphogenesis during development, and previous work has sought to identify the most conserved stages of embryonic development. Earlier measurements in these studies were based on morphological similarity at the macro level, whereas new technology enables the detection of similarity in molecular mechanisms at the micro level. Evolutionary models of embryonic development, including the "funnel-like" model and the "hourglass" model, have been evaluated by combining these new comparative transcriptomic methods with prior comparative genomic information. Although the technology has brought EVO-DEVO studies into a new era, technological and material limitations still exist, and further investigations require more refined study designs and procedures.
Identification of forensic samples by using an infrared-based automatic DNA sequencer.

PubMed

Ricci, Ugo; Sani, Ilaria; Klintschar, Michael; Cerri, Nicoletta; De Ferrari, Francesco; Giovannucci Uzielli, Maria Luisa

2003-06-01

We have recently introduced a new protocol for analyzing all core loci of the Federal Bureau of Investigation's (FBI) Combined DNA Index System (CODIS) with an infrared (IR) automatic DNA sequencer (LI-COR 4200). The amplicons were labeled with forward oligonucleotide primers, covalently linked to a new infrared fluorescent molecule (IRDye 800). The alleles were displayed as familiar autoradiogram-like images with real-time detection. This protocol was employed for paternity testing, population studies, and identification of degraded forensic samples. We extensively analyzed some simulated forensic samples and mixed stains (blood, semen, saliva, bones, and fixed archival embedded tissues), comparing the results with donor samples. Sensitivity studies were also performed for the four multiplex systems. Our results show the efficiency, reliability, and accuracy of the IR system for the analysis of forensic samples. We also compared the efficiency of the multiplex protocol with ultraviolet (UV) technology. Paternity tests, undegraded DNA samples, and real forensic samples were analyzed with this approach based on IR technology and with UV-based automatic sequencers in combination with commercially-available kits. The comparability of the results with the widespread UV methods suggests that it is possible to exchange data between laboratories using the same core group of markers but different primer sets and detection methods.
An automatic and efficient pipeline for disease gene identification through utilizing family-based sequencing data.

PubMed

Song, Dandan; Li, Ning; Liao, Lejian

2015-01-01

Because they generate enormous amounts of data at lower cost and in less time, whole-exome sequencing technologies provide dramatic opportunities for identifying disease genes implicated in Mendelian disorders. Since thousands of genomic variants can be sequenced in each exome, it is challenging to filter pathogenic variants in protein coding regions and reduce the number of missing true variants. Therefore, an automatic and efficient pipeline for finding disease variants in Mendelian disorders is designed by exploiting a combination of variant filtering steps to analyze family-based exome sequencing data. Recent studies on the Freeman-Sheldon disease are revisited and show that the proposed method outperforms other existing candidate gene identification methods.
Human Y chromosome copy number variation in the next generation sequencing era and beyond.

PubMed

Massaia, Andrea; Xue, Yali

2017-05-01

The human Y chromosome provides a fertile ground for structural rearrangements owing to its haploidy and high content of repeated sequences. The methodologies used for copy number variation (CNV) studies have developed over the years. Low-throughput techniques based on direct observation of rearrangements were developed early on, and are still used, often to complement array-based or sequencing approaches which have limited power in regions with high repeat content and specifically in the presence of long, identical repeats, such as those found in human sex chromosomes. Some specific rearrangements have been investigated for decades; because of their effects on fertility, or their outstanding evolutionary features, the interest in these has not diminished. However, following the flourishing of large-scale genomics, several studies have investigated CNVs across the whole chromosome. These studies sometimes employ data generated within large genomic projects such as the DDD study or the 1000 Genomes Project, and often survey large samples of healthy individuals without any prior selection. Novel technologies based on sequencing long molecules, and combinations of technologies, promise to stimulate the study of Y-CNVs in the immediate future.

From days to hours: reporting clinically actionable variants from whole genome sequencing.

PubMed

Middha, Sumit; Baheti, Saurabh; Hart, Steven N; Kocher, Jean-Pierre A

2014-01-01

As the cost of whole genome sequencing (WGS) decreases, clinical laboratories will be looking at broadly adopting this technology to screen for variants of clinical significance. To fully leverage this technology in a clinical setting, results need to be reported quickly, as the turnaround rate could potentially impact patient care. The latest sequencers can sequence a whole human genome in about 24 hours. However, depending on the computing infrastructure available, the processing of data can take several days, with the majority of computing time devoted to aligning reads to genomic regions that are to date not clinically interpretable. In an attempt to accelerate the reporting of clinically actionable variants, we have investigated the utility of a multi-step alignment algorithm focused on aligning reads and calling variants in genomic regions of clinical relevance prior to processing the remaining reads on the whole genome. This iterative workflow significantly accelerates the reporting of clinically actionable variants with no loss of accuracy when compared to genotypes obtained with the OMNI SNP platform or to variants detected with a standard workflow that combines Novoalign and GATK.
XPAT: a toolkit to conduct cross-platform association studies with heterogeneous sequencing datasets.

PubMed

Yu, Yao; Hu, Hao; Bohlender, Ryan J; Hu, Fulan; Chen, Jiun-Sheng; Holt, Carson; Fowler, Jerry; Guthery, Stephen L; Scheet, Paul; Hildebrandt, Michelle A T; Yandell, Mark; Huff, Chad D

2018-04-06

High-throughput sequencing data are increasingly being made available to the research community for secondary analyses, providing new opportunities for large-scale association studies. However, heterogeneity in target capture and sequencing technologies often introduces strong technological stratification biases that overwhelm subtle signals of association in studies of complex traits. Here, we introduce the Cross-Platform Association Toolkit, XPAT, which provides a suite of tools designed to support and conduct large-scale association studies with heterogeneous sequencing datasets. XPAT includes tools to support cross-platform aware variant calling, quality control filtering, gene-based association testing and rare variant effect size estimation. To evaluate the performance of XPAT, we conducted case-control association studies for three diseases, including 783 breast cancer cases, 272 ovarian cancer cases, 205 Crohn disease cases and 3507 shared controls (including 1722 females) using sequencing data from multiple sources. XPAT greatly reduced Type I error inflation in the case-control analyses, while replicating many previously identified disease-gene associations. We also show that association tests conducted with XPAT using cross-platform data have comparable performance to tests using matched platform data. XPAT enables new association studies that combine existing sequencing datasets to identify genetic loci associated with common diseases and other complex traits.
Gene Prioritization of Resistant Rice Gene against Xanthomas oryzae pv. oryzae by Using Text Mining Technologies

PubMed Central

Xia, Jingbo; Zhang, Xing; Yuan, Daojun; Chen, Lingling; Webster, Jonathan; Fang, Alex Chengyu

2013-01-01

To effectively assess the possibility of the unknown rice protein resistant to Xanthomonas oryzae pv. oryzae, a hybrid strategy is proposed to enhance gene prioritization by combining text mining technologies with a sequence-based approach. The text mining technique of term frequency inverse document frequency is used to measure the importance of distinguished terms which reflect biomedical activity in rice before candidate genes are screened and vital terms are produced. Afterwards, a built-in classifier under the chaos games representation algorithm is used to sieve the best possible candidate gene. Our experiment results show that the combination of these two methods achieves enhanced gene prioritization. PMID:24371834
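The term frequency inverse document frequency (TF-IDF) weighting named in this abstract can be illustrated with a minimal Python sketch. It assumes the common formulation tf = count / document length and idf = ln(N / document frequency); the abstract does not state which variant the authors used, and the toy corpus below is hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed formulation (not confirmed by the abstract):
    tf = term count / document length, idf = ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, for illustration only.
docs = [
    "rice gene resistance blight".split(),
    "rice gene expression leaf".split(),
    "rice protein kinase resistance".split(),
]
weights = tf_idf(docs)
```

A term that occurs in every document (here "rice") receives a weight of zero, while a term confined to one document (here "blight") receives the highest weight in that document, which is how the measure singles out distinguishing terms.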
Gene prioritization of resistant rice gene against Xanthomas oryzae pv. oryzae by using text mining technologies.

PubMed

Xia, Jingbo; Zhang, Xing; Yuan, Daojun; Chen, Lingling; Webster, Jonathan; Fang, Alex Chengyu

2013-01-01

To effectively assess the possibility of the unknown rice protein resistant to Xanthomonas oryzae pv. oryzae, a hybrid strategy is proposed to enhance gene prioritization by combining text mining technologies with a sequence-based approach. The text mining technique of term frequency inverse document frequency is used to measure the importance of distinguished terms which reflect biomedical activity in rice before candidate genes are screened and vital terms are produced. Afterwards, a built-in classifier under the chaos games representation algorithm is used to sieve the best possible candidate gene. Our experiment results show that the combination of these two methods achieves enhanced gene prioritization.
Genomic paradigms for food-borne enteric pathogen analysis at the USFDA: case studies highlighting method utility, integration and resolution.

PubMed

Elkins, C A; Kotewicz, M L; Jackson, S A; Lacher, D W; Abu-Ali, G S; Patel, I R

2013-01-01

Modern risk control and food safety practices involving food-borne bacterial pathogens are benefiting from new genomic technologies for rapid, yet highly specific, strain characterisations. Within the United States Food and Drug Administration (USFDA) Center for Food Safety and Applied Nutrition (CFSAN), optical genome mapping and DNA microarray genotyping have been used for several years to quickly assess genomic architecture and gene content, respectively, for outbreak strain subtyping and to enhance retrospective trace-back analyses. The application and relative utility of each method varies with outbreak scenario and the suspect pathogen, with comparative analytical power enhanced by database scale and depth. Integration of these two technologies allows high-resolution scrutiny of the genomic landscapes of enteric food-borne pathogens with notable examples including Shiga toxin-producing Escherichia coli (STEC) and Salmonella enterica serovars from a variety of food commodities. Moreover, the recent application of whole genome sequencing technologies to food-borne pathogen outbreaks and surveillance has enhanced resolution to the single nucleotide scale. This new wealth of sequence data will support more refined next-generation custom microarray designs, targeted re-sequencing and "genomic signature recognition" approaches involving a combination of genes and single nucleotide polymorphism detection to distil strain-specific fingerprinting to a minimised scale. This paper examines the utility of microarrays and optical mapping in analysing outbreaks, reviews best practices and the limits of these technologies for pathogen differentiation, and it considers future integration with whole genome sequencing efforts.
Diversity arrays technology: a generic genome profiling technology on open platforms.

PubMed

Kilian, Andrzej; Wenzl, Peter; Huttner, Eric; Carling, Jason; Xia, Ling; Blois, Hélène; Caig, Vanessa; Heller-Uszynska, Katarzyna; Jaccoud, Damian; Hopper, Colleen; Aschenbrenner-Kilian, Malgorzata; Evers, Margaret; Peng, Kaiman; Cayla, Cyril; Hok, Puthick; Uszynski, Grzegorz

2012-01-01

In the last 20 years, we have observed an exponential growth of the DNA sequence data and a similar increase in the volume of DNA polymorphism data generated by numerous molecular marker technologies. Most of the investment, and therefore progress, concentrated on the human genome and the genomes of selected model species. Diversity Arrays Technology (DArT), developed over a decade ago, was among the first "democratizing" genotyping technologies, as its performance was primarily driven by the level of DNA sequence variation in the species rather than by the level of financial investment. DArT also proved more robust to genome size and ploidy-level differences among approximately 60 organisms for which DArT was developed to date compared to other high-throughput genotyping technologies. The success of DArT in a number of organisms, including a wide range of "orphan crops," can be attributed to the simplicity of underlying concepts: DArT combines genome complexity reduction methods enriching for genic regions with a highly parallel assay readout on a number of "open-access" microarray platforms. The quantitative nature of the assay enabled a number of applications in which allelic frequencies can be estimated from DArT arrays. A typical DArT assay tests tens of thousands of genomic loci for polymorphism, with the final number of markers reported (hundreds to thousands) reflecting the level of DNA sequence variation in the tested loci. Detailed DArT methods, protocols, and a range of their application examples as well as DArT's evolution path are presented.
Software for pre-processing Illumina next-generation sequencing short read sequences

PubMed Central

2014-01-01

Background When compared to Sanger sequencing technology, next-generation sequencing (NGS) technologies are hindered by shorter sequence read length, higher base-call error rate, non-uniform coverage, and platform-specific sequencing artifacts. These characteristics lower the quality of their downstream analyses, e.g. de novo and reference-based assembly, by introducing sequencing artifacts and errors that may contribute to incorrect interpretation of data. Although many tools have been developed for quality control and pre-processing of NGS data, none of them provide flexible and comprehensive trimming options in conjunction with parallel processing to expedite pre-processing of large NGS datasets. Methods We developed ngsShoRT (next-generation sequencing Short Reads Trimmer), a flexible and comprehensive open-source software package written in Perl that provides a set of algorithms commonly used for pre-processing NGS short read sequences. We compared the features and performance of ngsShoRT with existing tools: CutAdapt, NGS QC Toolkit and Trimmomatic. We also compared the effects of using pre-processed short read sequences generated by different algorithms on de novo and reference-based assembly for three different genomes: Caenorhabditis elegans, Saccharomyces cerevisiae S288c, and Escherichia coli O157 H7. Results Several combinations of ngsShoRT algorithms were tested on publicly available Illumina GA II, HiSeq 2000, and MiSeq eukaryotic and bacteria genomic short read sequences with the focus on removing sequencing artifacts and low-quality reads and/or bases. Our results show that across three organisms and three sequencing platforms, trimming improved the mean quality scores of trimmed sequences. Using trimmed sequences for de novo and reference-based assembly improved assembly quality as well as assembler performance. 
In general, ngsShoRT outperformed comparable trimming tools in terms of trimming speed and improvement of de novo and reference-based assembly as measured by assembly contiguity and correctness. Conclusions Trimming of short read sequences can improve the quality of de novo and reference-based assembly and assembler performance. The parallel processing capability of ngsShoRT reduces trimming time and improves the memory efficiency when dealing with large datasets. We recommend combining sequencing artifacts removal, and quality score based read filtering and base trimming as the most consistent method for improving sequence quality and downstream assemblies. ngsShoRT source code, user guide and tutorial are available at http://research.bioinformatics.udel.edu/genomics/ngsShoRT/. ngsShoRT can be incorporated as a pre-processing step in genome and transcriptome assembly projects. PMID:24955109
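The quality-score-based base trimming and read filtering recommended in this abstract can be sketched in Python as follows. This is a generic illustration, not ngsShoRT's own algorithm (ngsShoRT is written in Perl and its exact rules are not given here). The function names, the Phred+33 encoding, and the thresholds are assumptions.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Trim low-quality bases from the 3' end of one read.

    Assumes `qual` is a FASTQ quality string in Phred+33 encoding
    (older Illumina pipelines used Phred+64; adjust `offset` if so).
    """
    end = len(seq)
    # Walk back from the 3' end while the base quality is below threshold.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def keep_read(seq, min_len=30):
    """Filter step: keep a read only if it is still long enough after trimming."""
    return len(seq) >= min_len


# Hypothetical read: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGT", "IIIII###")
```

In this sketch trimming runs before filtering, so a read whose 3' tail is removed is judged on its remaining length.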
Metagenome assembly through clustering of next-generation sequencing data using protein sequences.

PubMed

Sim, Mikang; Kim, Jaebum

2015-02-01

The study of environmental microbial communities, called metagenomics, has gained a lot of attention because of the recent advances in next-generation sequencing (NGS) technologies. Microbes play a critical role in changing their environments, and the mode of their effect can be solved by investigating metagenomes. However, the difficulty of metagenomes, such as the combination of multiple microbes and different species abundance, makes metagenome assembly tasks more challenging. In this paper, we developed a new metagenome assembly method by utilizing protein sequences, in addition to the NGS read sequences. Our method (i) builds read clusters by using mapping information against available protein sequences, and (ii) creates contig sequences by finding consensus sequences through probabilistic choices from the read clusters. By using simulated NGS read sequences from real microbial genome sequences, we evaluated our method in comparison with four existing assembly programs. We found that our method could generate relatively long and accurate metagenome assemblies, indicating that the idea of using protein sequences, as a guide for the assembly, is promising. Copyright © 2015 Elsevier B.V. All rights reserved.
An efficient approach to BAC based assembly of complex genomes.

PubMed

Visendi, Paul; Berkman, Paul J; Hayashi, Satomi; Golicz, Agnieszka A; Bayer, Philipp E; Ruperao, Pradeep; Hurgobin, Bhavna; Montenegro, Juan; Chan, Chon-Kit Kenneth; Staňková, Helena; Batley, Jacqueline; Šimková, Hana; Doležel, Jaroslav; Edwards, David

2016-01-01

There has been an exponential growth in the number of genome sequencing projects since the introduction of next generation DNA sequencing technologies. Genome projects have increasingly involved assembly of whole genome data which produces inferior assemblies compared to traditional Sanger sequencing of genomic fragments cloned into bacterial artificial chromosomes (BACs). While whole genome shotgun sequencing using next generation sequencing (NGS) is relatively fast and inexpensive, this method is extremely challenging for highly complex genomes, where polyploidy or high repeat content confounds accurate assembly, or where a highly accurate 'gold' reference is required. Several attempts have been made to improve genome sequencing approaches by incorporating NGS methods, to variable success. We present the application of a novel BAC sequencing approach which combines indexed pools of BACs, Illumina paired read sequencing, a sequence assembler specifically designed for complex BAC assembly, and a custom bioinformatics pipeline. We demonstrate this method by sequencing and assembling BAC cloned fragments from bread wheat and sugarcane genomes. We demonstrate that our assembly approach is accurate, robust, cost effective and scalable, with applications for complete genome sequencing in large and complex genomes.
Group-based variant calling leveraging next-generation supercomputing for large-scale whole-genome sequencing studies.

PubMed

Standish, Kristopher A; Carland, Tristan M; Lockwood, Glenn K; Pfeiffer, Wayne; Tatineni, Mahidhar; Huang, C Chris; Lamberth, Sarah; Cherkas, Yauheniya; Brodmerkel, Carrie; Jaeger, Ed; Smith, Lance; Rajagopal, Gunaretnam; Curran, Mark E; Schork, Nicholas J

2015-09-22

Next-generation sequencing (NGS) technologies have become much more efficient, allowing whole human genomes to be sequenced faster and cheaper than ever before. However, processing the raw sequence reads associated with NGS technologies requires care and sophistication in order to draw compelling inferences about phenotypic consequences of variation in human genomes. It has been shown that different approaches to variant calling from NGS data can lead to different conclusions. Ensuring appropriate accuracy and quality in variant calling can come at a computational cost. We describe our experience implementing and evaluating a group-based approach to calling variants on large numbers of whole human genomes. We explore the influence of many factors that may impact the accuracy and efficiency of group-based variant calling, including group size, the biogeographical backgrounds of the individuals who have been sequenced, and the computing environment used. We make efficient use of the Gordon supercomputer cluster at the San Diego Supercomputer Center by incorporating job-packing and parallelization considerations into our workflow while calling variants on 437 whole human genomes generated as part of a large association study. We ultimately find that our workflow resulted in high-quality variant calls in a computationally efficient manner. We argue that studies like ours should motivate further investigations combining hardware-oriented advances in computing systems with algorithmic developments to tackle emerging 'big data' problems in biomedical research brought on by the expansion of NGS technologies.
Constructing Digital Stories

ERIC Educational Resources Information Center

Kajder, Sara; Bull, Glen; Albaugh, Susan

2005-01-01

A digital story consists of a series of still images combined with a narrated soundtrack to tell a story. This document contains a sequence of seven steps for digital storytelling based on a two-year project in Curry School's Center for Technology and Teacher Education at the University of Virginia. The strategies outlined offer a starting point…
Etude des sequences de type consonne constrictive plus voyelle en francais, a l'aide de la radiocinematographie et de l'oscillographie (A Study of the Constrictive Consonant Plus Vowel Sequences in French, Using X-Ray Filming and Oscillography). Publication B-148.

ERIC Educational Resources Information Center

Rochette, Claude; Simard, Claude

A study of the phonetic combination of a constrictive consonant (specifically, [f], [v], and [r]) and a vowel in French using x-ray and oscillograph technology focused on the speed and process of articulation between the consonant and the vowel. The study considered aperture size, nasality, labiality, and accent. Articulation of a total of 407…
Combining phage display with de novo protein sequencing for reverse engineering of monoclonal antibodies.

PubMed

Rickert, Keith W; Grinberg, Luba; Woods, Robert M; Wilson, Susan; Bowen, Michael A; Baca, Manuel

2016-01-01

The enormous diversity created by gene recombination and somatic hypermutation makes de novo protein sequencing of monoclonal antibodies a uniquely challenging problem. Modern mass spectrometry-based sequencing will rarely, if ever, provide a single unambiguous sequence for the variable domains. A more likely outcome is computation of an ensemble of highly similar sequences that can satisfy the experimental data. This outcome can result in the need for empirical testing of many candidate sequences, sometimes iteratively, to identify one which can replicate the activity of the parental antibody. Here we describe an improved approach to antibody protein sequencing by using phage display technology to generate a combinatorial library of sequences that satisfy the mass spectrometry data, and selecting for functional candidates that bind antigen. This approach was used to reverse engineer 2 commercially-obtained monoclonal antibodies against murine CD137. Proteomic data enabled us to assign the majority of the variable domain sequences, with the exception of 3-5% of the sequence located within or adjacent to complementarity-determining regions. To efficiently resolve the sequence in these regions, small phage-displayed libraries were generated and subjected to antigen binding selection. Following enrichment of antigen-binding clones, 2 clones were selected for each antibody and recombinantly expressed as antigen-binding fragments (Fabs). In both cases, the reverse-engineered Fabs exhibited identical antigen binding affinity, within error, as Fabs produced from the commercial IgGs. This combination of proteomic and protein engineering techniques provides a useful approach to simplifying the technically challenging process of reverse engineering monoclonal antibodies from protein material.
Combining phage display with de novo protein sequencing for reverse engineering of monoclonal antibodies

PubMed Central

Rickert, Keith W.; Grinberg, Luba; Woods, Robert M.; Wilson, Susan; Bowen, Michael A.; Baca, Manuel

2016-01-01

ABSTRACT The enormous diversity created by gene recombination and somatic hypermutation makes de novo protein sequencing of monoclonal antibodies a uniquely challenging problem. Modern mass spectrometry-based sequencing will rarely, if ever, provide a single unambiguous sequence for the variable domains. A more likely outcome is computation of an ensemble of highly similar sequences that can satisfy the experimental data. This outcome can result in the need for empirical testing of many candidate sequences, sometimes iteratively, to identify one which can replicate the activity of the parental antibody. Here we describe an improved approach to antibody protein sequencing by using phage display technology to generate a combinatorial library of sequences that satisfy the mass spectrometry data, and selecting for functional candidates that bind antigen. This approach was used to reverse engineer 2 commercially-obtained monoclonal antibodies against murine CD137. Proteomic data enabled us to assign the majority of the variable domain sequences, with the exception of 3–5% of the sequence located within or adjacent to complementarity-determining regions. To efficiently resolve the sequence in these regions, small phage-displayed libraries were generated and subjected to antigen binding selection. Following enrichment of antigen-binding clones, 2 clones were selected for each antibody and recombinantly expressed as antigen-binding fragments (Fabs). In both cases, the reverse-engineered Fabs exhibited identical antigen binding affinity, within error, as Fabs produced from the commercial IgGs. This combination of proteomic and protein engineering techniques provides a useful approach to simplifying the technically challenging process of reverse engineering monoclonal antibodies from protein material. PMID:26852694
The Widening Gulf between Genomics Data Generation and Consumption: A Practical Guide to Big Data Transfer Technology.

PubMed

Feltus, Frank A; Breen, Joseph R; Deng, Juan; Izard, Ryan S; Konger, Christopher A; Ligon, Walter B; Preuss, Don; Wang, Kuang-Ching

2015-01-01

In the last decade, high-throughput DNA sequencing has become a disruptive technology and pushed the life sciences into a distributed ecosystem of sequence data producers and consumers. Given the power of genomics and declining sequencing costs, biology is an emerging "Big Data" discipline that will soon enter the exabyte data range when all subdisciplines are combined. These datasets must be transferred across commercial and research networks in creative ways since sending data without thought can have serious consequences on data processing time frames. Thus, it is imperative that biologists, bioinformaticians, and information technology engineers recalibrate data processing paradigms to fit this emerging reality. This review attempts to provide a snapshot of Big Data transfer across networks, which is often overlooked by many biologists. Specifically, we discuss four key areas: 1) data transfer networks, protocols, and applications; 2) data transfer security including encryption, access, firewalls, and the Science DMZ; 3) data flow control with software-defined networking; and 4) data storage, staging, archiving and access. A primary intention of this article is to orient the biologist in key aspects of the data transfer process in order to frame their genomics-oriented needs to enterprise IT professionals.
Sequencing thousands of single-cell genomes with combinatorial indexing.

PubMed

Vitak, Sarah A; Torkenczy, Kristof A; Rosenkrantz, Jimi L; Fields, Andrew J; Christiansen, Lena; Wong, Melissa H; Carbone, Lucia; Steemers, Frank J; Adey, Andrew

2017-03-01

Single-cell genome sequencing has proven valuable for the detection of somatic variation, particularly in the context of tumor evolution. Current technologies suffer from high library construction costs, which restrict the number of cells that can be assessed and thus impose limitations on the ability to measure heterogeneity within a tissue. Here, we present single-cell combinatorial indexed sequencing (SCI-seq) as a means of simultaneously generating thousands of low-pass single-cell libraries for detection of somatic copy-number variants. We constructed libraries for 16,698 single cells from a combination of cultured cell lines, primate frontal cortex tissue and two human adenocarcinomas, and obtained a detailed assessment of subclonal variation within a pancreatic tumor.
SNP discovery by high-throughput sequencing in soybean

PubMed Central

2010-01-01

Background With the advance of new massively parallel genotyping technologies, quantitative trait loci (QTL) fine mapping and map-based cloning become more achievable in identifying genes for important and complex traits. Development of high-density genetic markers in the QTL regions of specific mapping populations is essential for fine-mapping and map-based cloning of economically important genes. Single nucleotide polymorphisms (SNPs) are the most abundant form of genetic variation existing between any diverse genotypes that are usually used for QTL mapping studies. The massively parallel sequencing technologies (Roche GS/454, Illumina GA/Solexa, and ABI/SOLiD) have been widely applied to identify genome-wide sequence variations. However, it still remains unclear whether sequence data at a low sequencing depth are enough to detect the variations existing in any QTL regions of interest in a crop genome, and how to prepare sequencing samples for a complex genome such as soybean. Therefore, with the aims of identifying SNP markers in a cost effective way for fine-mapping several QTL regions, and testing the validation rate of the putative SNPs predicted with Solexa short sequence reads at a low sequencing depth, we evaluated a pooled DNA fragment reduced representation library and SNP detection methods applied to short read sequences generated by Solexa high-throughput sequencing technology. Results A total of 39,022 putative SNPs were identified by the Illumina/Solexa sequencing system using a reduced representation DNA library of two parental lines of a mapping population. The validation rates of these putative SNPs predicted with low and high stringency were 72% and 85%, respectively. One hundred sixty four SNP markers resulted from the validation of putative SNPs and have been selectively chosen to target a known QTL, thereby increasing the marker density of the targeted region to one marker per 42 kbp. Conclusions We have demonstrated how to quickly identify large numbers of SNPs for fine mapping of QTL regions by applying massively parallel sequencing combined with genome complexity reduction techniques. This SNP discovery approach is more efficient for targeting multiple QTL regions in the same genetic population, and it can be applied to other crops. PMID:20701770
Is the extraction by Whatman FTA filter matrix technology and sequencing of large ribosomal subunit D1-D2 region sufficient for identification of clinical fungi?

PubMed

Kiraz, Nuri; Oz, Yasemin; Aslan, Huseyin; Erturan, Zayre; Ener, Beyza; Akdagli, Sevtap Arikan; Muslumanoglu, Hamza; Cetinkaya, Zafer

2015-10-01

Although conventional identification of pathogenic fungi is based on a combination of tests evaluating their morphological and biochemical characteristics, these tests can fail to identify less common species or to differentiate closely related species. In addition, these tests are time consuming, labour-intensive and require experienced personnel. We evaluated the feasibility and sufficiency of DNA extraction by Whatman FTA filter matrix technology and DNA sequencing of the D1-D2 region of the large ribosomal subunit gene for identification of clinical isolates of 21 yeasts and 160 moulds in our clinical mycology laboratory. While the yeast isolates were identified at species level with 100% homology, 102 (63.75%) clinically important mould isolates were identified at species level, 56 (35%) isolates at genus level against fungal sequences existing in DNA databases and two (1.25%) isolates could not be identified. Consequently, Whatman FTA filter matrix technology was a useful method for extraction of fungal DNA; extremely rapid, practical and successful. Sequence analysis of the D1-D2 region of the large ribosomal subunit gene was found to be largely sufficient for identification to genus level for most clinical fungi. However, identification to species level, and especially discrimination of closely related species, may require additional analysis. © 2015 Blackwell Verlag GmbH.
Label-free voltammetric detection of MicroRNAs at multi-channel screen printed array of electrodes comparison to graphite sensors.

PubMed

Erdem, Arzum; Congur, Gulsah

2014-01-01

The multi-channel screen-printed array of electrodes (MUX-SPE16) was used in our study for the first time for electrochemical monitoring of nucleic acid hybridization related to different miRNA sequences (miRNA-16, miRNA-15a and miRNA-660, i.e., the biomarkers for Alzheimer disease). The MUX-SPE16 was also used for the first time herein for the label-free electrochemical detection of nucleic acid hybridization combined with a magnetic beads (MB) assay, in comparison to the disposable pencil graphite electrode (PGE). Under the principle of the magnetic beads assay, the biotinylated inosine-substituted DNA probe was first immobilized onto streptavidin-coated MB, and then the hybridization process between the probe and its complementary miRNA sequence was performed at the MB surface. The voltammetric transduction was performed using the differential pulse voltammetry (DPV) technique in combination with the single-use graphite sensor technologies, PGE and MUX-SPE16, for miRNA detection by measuring the guanine oxidation signal without using any external indicator. The features of the single-use sensor technologies, PGE and MUX-SPE16, were discussed concerning their reproducibility, detection limit, and selectivity compared to the results in earlier studies presenting electrochemical miRNA detection related to different miRNA sequences. © 2013 Elsevier B.V. All rights reserved.
Development of a Rapid Identification Method for a Variety of Antibody Candidates Using High-throughput Sequencing.

PubMed

Ito, Yuji

2017-01-01

As an alternative to hybridoma technology, the antibody phage library system can also be used for antibody selection. This method enables the isolation of antigen-specific binders through an in vitro selection process known as biopanning. While it has several advantages, such as an avoidance of animal immunization, the phage cloning and screening steps of biopanning are time-consuming and problematic. Here, we introduce a novel biopanning method combined with high-throughput sequencing (HTS) using a next-generation sequencer (NGS) to save time and effort in antibody selection, and to increase the diversity of acquired antibody sequences. Biopannings against a target antigen were performed using a human single chain Fv (scFv) antibody phage library. VH genes in pooled phages at each round of biopanning were analyzed by HTS on a NGS. The obtained data were trimmed, merged, and translated into amino acid sequences. The frequencies (%) of the respective VH sequences at each biopanning step were calculated, and the amplification factor (change of frequency through biopanning) was obtained to estimate the potential for antigen binding. A phylogenetic tree was drawn using the top 50 VH sequences with high amplification factors. Representative VH sequences forming the cluster were then picked up and used to reconstruct scFv genes harboring these VHs. Their derived scFv-Fc fusion proteins showed clear antigen binding activity. These results indicate that a combination of biopanning and HTS enables the rapid and comprehensive identification of specific binders from antibody phage libraries.
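The frequency and amplification-factor calculation described in this abstract can be sketched in Python. The abstract defines the amplification factor only as the "change of frequency through biopanning". The sketch below assumes it is the ratio of a sequence's frequency in a later round to its frequency in an earlier round. The function names and read labels are hypothetical.

```python
from collections import Counter


def frequencies(reads):
    """Percentage frequency of each distinct VH sequence in one round."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {seq: 100.0 * n / total for seq, n in counts.items()}


def amplification_factors(before, after):
    """Assumed definition: frequency after / frequency before.

    Sequences missing from either round are skipped, because the ratio
    is undefined without a pseudocount.
    """
    f0, f1 = frequencies(before), frequencies(after)
    return {seq: f1[seq] / f0[seq] for seq in f1 if seq in f0}


# Hypothetical read labels standing in for translated VH sequences.
round1 = ["VH_A"] * 2 + ["VH_B"] * 6 + ["VH_C"] * 2
round2 = ["VH_A"] * 8 + ["VH_B"] * 2
factors = amplification_factors(round1, round2)
ranked = sorted(factors, key=factors.get, reverse=True)
```

Ranking by this factor rather than by raw frequency favours sequences that are enriched across rounds, even when they start out rare in the library.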

Genotype-specific signal generation based on digestion of 3-way DNA junctions: application to KRAS variation detection.

PubMed

Amicarelli, Giulia; Adlerstein, Daniel; Shehi, Erlet; Wang, Fengfei; Makrigiorgos, G Mike

2006-10-01

Genotyping methods that reveal single-nucleotide differences are useful for a wide range of applications. We used digestion of 3-way DNA junctions in a novel technology, OneCutEventAmplificatioN (OCEAN) that allows sequence-specific signal generation and amplification. We combined OCEAN with peptide-nucleic-acid (PNA)-based variant enrichment to detect and simultaneously genotype v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) codon 12 sequence variants in human tissue specimens. We analyzed KRAS codon 12 sequence variants in 106 lung cancer surgical specimens. We conducted a PNA-PCR reaction that suppresses wild-type KRAS amplification and genotyped the product with a set of OCEAN reactions carried out in fluorescence microplate format. The isothermal OCEAN assay enabled a 3-way DNA junction to form between the specific target nucleic acid, a fluorescently labeled "amplifier", and an "anchor". The amplifier-anchor contact contains the recognition site for a restriction enzyme. Digestion produces a cleaved amplifier and generation of a fluorescent signal. The cleaved amplifier dissociates from the 3-way DNA junction, allowing a new amplifier to bind and propagate the reaction. The system detected and genotyped KRAS sequence variants down to approximately 0.3% variant-to-wild-type alleles. PNA-PCR/OCEAN had a concordance rate with PNA-PCR/sequencing of 93% to 98%, depending on the exact implementation. Concordance rate with restriction endonuclease-mediated selective-PCR/sequencing was 89%. OCEAN is a practical and low-cost novel technology for sequence-specific signal generation. Reliable analysis of KRAS sequence alterations in human specimens circumvents the requirement for sequencing. Application is expected in genotyping KRAS codon 12 sequence variants in surgical specimens or in bodily fluids, as well as single-base variations and sequence alterations in other genes.
Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.

PubMed

Gupta, P D

2016-10-01

In health care, the importance of DNA sequencing has been fully established. Sanger's capillary electrophoresis DNA sequencing methodology is time consuming and cumbersome, and hence expensive. Lately, because of its versatility, DNA sequencing has become a household name, and there is therefore an urgent need for a simple, fast, inexpensive DNA sequencing technology. At the beginning of this century, efforts were made and nanopore DNA sequencing technology was developed; it is still in its infancy, but it is nevertheless the futuristic technology.
Biorecognition by DNA oligonucleotides after Exposure to Photoresists and Resist Removers

PubMed Central

Dean, Stacey L.; Morrow, Thomas J.; Patrick, Sue; Li, Mingwei; Clawson, Gary; Mayer, Theresa S.; Keating, Christine D.

2013-01-01

Combining biological molecules with integrated circuit technology is of considerable interest for next generation sensors and biomedical devices. Current lithographic microfabrication methods, however, were developed for compatibility with silicon technology rather than bioorganic molecules and consequently it cannot be assumed that biomolecules will remain attached and intact during on-chip processing. Here, we evaluate the effects of three common photoresists (Microposit S1800 series, PMGI SF6, and Megaposit SPR 3012) and two photoresist removers (acetone and 1165 remover) on the ability of surface-immobilized DNA oligonucleotides to selectively recognize their reverse-complementary sequence. Two common DNA immobilization methods were compared: adsorption of 5′-thiolated sequences directly to gold nanowires and covalent attachment of 5′-thiolated sequences to surface amines on silica coated nanowires. We found that acetone had deleterious effects on selective hybridization as compared to 1165 remover, presumably due to incomplete resist removal. Use of the PMGI photoresist, which involves a high temperature bake step, was detrimental to the later performance of nanowire-bound DNA in hybridization assays, especially for DNA attached via thiol adsorption. The other three photoresists did not substantially degrade DNA binding capacity or selectivity for complementary DNA sequences. To determine if the lithographic steps caused more subtle damage, we also tested oligonucleotides containing a single base mismatch. Finally, a two-step photolithographic process was developed and used in combination with dielectrophoretic nanowire assembly to produce an array of doubly-contacted, electrically isolated individual nanowire components on a chip. Post-fabrication fluorescence imaging indicated that nanowire-bound DNA was present and able to selectively bind complementary strands. PMID:23952639
Effect of hot acid hydrolysis and hot chlorine dioxide stage on bleaching effluent biodegradability.

PubMed

Gomes, C M; Colodette, J L; Delantonio, N R N; Mounteer, A H; Silva, C M

2007-01-01

The hot acid hydrolysis followed by chlorine dioxide (A/D*) and hot chlorine dioxide (D*) technologies have proven very useful for bleaching of eucalyptus kraft pulp. Although the characteristics and biodegradability of effluents from conventional chlorine dioxide bleaching are well known, such information is not yet available for effluents derived from hot acid hydrolysis and hot chlorine dioxide bleaching. This study discusses the characteristics and biodegradability of such effluents. Combined whole effluents from the complete sequences DEpD, D*EpD, A/D*EpD and ADEpD, and from the pre-bleaching sequences DEp, D*Ep, A/D*Ep and ADEp were characterized by quantifying their colour, AOX and organic load (BOD, COD, TOC). These effluents were also evaluated for their treatability by simulation of an activated sludge system. It was concluded that treatment in the laboratory sequencing batch reactor was efficient for removal of COD, BOD and TOC of all effluents. However, colour increased after biological treatment, with the greatest increase found for the effluent produced using the AD technology. Biological treatment was less efficient at removing AOX of effluents from the sequences with D*, A/D* and AD as the first stages, when compared to the reference D stage; there was evidence of the lower treatability of these organochlorine compounds from these sequences.
Computational approaches to define a human milk metaglycome

PubMed Central

Agravat, Sanjay B.; Song, Xuezheng; Rojsajjakul, Teerapat; Cummings, Richard D.; Smith, David F.

2016-01-01

Motivation: The goal of deciphering the human glycome has been hindered by the lack of high-throughput sequencing methods for glycans. Although mass spectrometry (MS) is a key technology in glycan sequencing, MS alone provides limited information about the identification of monosaccharide constituents, their anomericity and their linkages. These features of individual, purified glycans can be partly identified using well-defined glycan-binding proteins, such as lectins and antibodies that recognize specific determinants within glycan structures. Results: We present a novel computational approach to automate the sequencing of glycans using metadata-assisted glycan sequencing, which combines MS analyses with glycan structural information from glycan microarray technology. Success in this approach was aided by the generation of a ‘virtual glycome’ to represent all potential glycan structures that might exist within a metaglycome based on a set of biosynthetic assumptions using known structural information. We exploited this approach to deduce the structures of soluble glycans within the human milk glycome by matching predicted structures based on experimental data against the virtual glycome. This represents the first meta-glycome to be defined using this method and we provide a publicly available web-based application to aid in sequencing milk glycans. Availability and implementation: http://glycomeseq.emory.edu Contact: sagravat@bidmc.harvard.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26803164
Molecular Biology at the Cutting Edge: A Review on CRISPR/CAS9 Gene Editing for Undergraduates

ERIC Educational Resources Information Center

Thurtle-Schmidt, Deborah M.; Lo, Te-Wen

2018-01-01

Disrupting a gene to determine its effect on an organism's phenotype is an indispensable tool in molecular biology. Such techniques are critical for understanding how a gene product contributes to the development and cellular identity of organisms. The explosion of genomic sequencing technologies combined with recent advances in genome-editing…
[Application of single nucleotide polymorphism-microarray and target gene sequencing in the study of genetic etiology of children with unexplained intellectual disability or developmental delay].

PubMed

Gao, Z J; Jiang, Q; Cheng, D Z; Yan, X X; Chen, Q; Xu, K M

2016-10-02

Objective: To evaluate the application of single nucleotide polymorphism (SNP)-microarray and target gene sequencing technology in the clinical molecular genetic diagnosis of unexplained intellectual disability (ID) or developmental delay (DD). Method: Patients with ID or DD were recruited in the Department of Neurology, Affiliated Children's Hospital of Capital Institute of Pediatrics between September 2015 and February 2016. Intellectual assessment was performed using the 0-6-year-old pediatric examination table of neuropsychological development or the Wechsler intelligence scale (>6 years). Patients with a DQ less than 49 or an IQ less than 51 were included in this study. The patients were scanned by SNP-array for detection of genomic copy number variations (CNV), and any genomic imbalance revealed was confirmed by quantitative real-time PCR. Candidate gene mutation screening was carried out by target gene sequencing technology. Causal mutations or likely pathogenic variants were verified by polymerase chain reaction and direct sequencing. Result: Fifteen children with ID or DD were enrolled, 9 males and 6 females, aged 7 months to 16 years and 9 months. SNP-array revealed that two of the 15 patients had genomic CNV. Both CNV were de novo microdeletions, one involving 11q24.1q25 and the other located on 21q22.2q22.3. A query of the Decipher database showed that both microdeletions are clinically significant because of their association with ID, brain DD, unusual faces, etc. The thirteen patients with negative SNP-array findings were subsequently examined with target gene sequencing technology, genotype-phenotype correlation analysis and genetic analysis. Five patients were diagnosed with a monogenic disorder, two were diagnosed with a suspected genetic disorder and six remained negative. Conclusion: Sequential use of SNP-array and target gene sequencing technology can significantly increase the molecular genetic etiologic diagnosis rate of patients with unexplained ID or DD. Combined use of these technologies can serve as a useful examination method to assist the differential diagnosis of children with unexplained ID or DD.
Local alignment of two-base encoded DNA sequence

PubMed Central

Homer, Nils; Merriman, Barry; Nelson, Stanley F

2009-01-01

Background DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. Results We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. Conclusion The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, and facilitating genome re-sequencing efforts based on this form of sequence data. PMID:19508732
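The decoding problem described above can be illustrated with a small sketch. It is not the authors' alignment algorithm; it only shows why a single measurement error matters in two-base encoded data. The abstract does not spell out the encoding, so the sketch assumes the commonly used scheme in which each "color" is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3) and the first base is known. The function names `encode` and `decode` are hypothetical.

```python
# Sketch of two-base (color-space) encoding, assuming the XOR-based scheme
# with A=0, C=1, G=2, T=3. This is an illustrative assumption, not code
# taken from the paper.
BASES = "ACGT"
CODE = {base: index for index, base in enumerate(BASES)}


def encode(seq):
    """Return (first_base, colors); each color encodes one adjacent base pair."""
    colors = [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]
    return seq[0], colors


def decode(first_base, colors):
    """Deterministically rebuild the base sequence from a known first base."""
    bases = [first_base]
    for color in colors:
        bases.append(BASES[CODE[bases[-1]] ^ color])
    return "".join(bases)


seq = "ACGTTGCA"
first, colors = encode(seq)
decoded = decode(first, colors)

# Introduce a single color measurement error at index 2 and decode naively.
bad_colors = list(colors)
bad_colors[2] ^= 1
corrupted = decode(first, bad_colors)
mismatches = sum(a != b for a, b in zip(seq, corrupted))

print(colors)      # color string for the original sequence
print(decoded)     # identical to seq when no errors are present
print(corrupted)   # every base after the erroneous color is changed
print(mismatches)
```

Under this assumed encoding, one wrong color changes every downstream base when decoded naively, which is consistent with the abstract's point that decoding and alignment are better performed together than one after the other.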
"First generation" automated DNA sequencing technology.

PubMed

Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

2011-10-01

Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.
Establishing gene models from the Pinus pinaster genome using gene capture and BAC sequencing.

PubMed

Seoane-Zonjic, Pedro; Cañas, Rafael A; Bautista, Rocío; Gómez-Maldonado, Josefa; Arrillaga, Isabel; Fernández-Pozo, Noé; Claros, M Gonzalo; Cánovas, Francisco M; Ávila, Concepción

2016-02-27

In the era of high-throughput DNA sequencing, assembling and understanding gymnosperm mega-genomes remains a challenge. Although drafts of three conifer genomes have recently been published, this number is too low to understand the full complexity of conifer genomes. Using techniques focused on specific genes, gene models can be established that can aid in the assembly of gene-rich regions, and this information can be used to compare genomes and understand functional evolution. In this study, gene capture technology combined with BAC isolation and sequencing was used as an experimental approach to establish de novo gene structures without a reference genome. Probes were designed for 866 maritime pine transcripts to sequence genes captured from genomic DNA. The gene models were constructed using GeneAssembler, a new bioinformatic pipeline, which reconstructed over 82% of the gene structures, and a high proportion (85%) of the captured gene models contained sequences from the promoter regulatory region. In a parallel experiment, the P. pinaster BAC library was screened to isolate clones containing genes whose cDNA sequences were already available. BAC clones containing the asparagine synthetase, sucrose synthase and xyloglucan endotransglycosylase gene sequences were isolated and used in this study. The gene models derived from the gene capture approach were compared with the genomic sequences derived from the BAC clones. This combined approach is a particularly efficient way to capture the genomic structures of gene families with a small number of members. The experimental approach used in this study is a valuable combined technique to study genomic gene structures in species for which a reference genome is unavailable. It can be used to establish exon/intron boundaries in unknown gene structures, to reconstruct incomplete genes and to obtain promoter sequences that can be used for transcriptional studies. A bioinformatics algorithm (GeneAssembler) is also provided as a Ruby gem for this class of analyses.
Trade Spaces in Crewed Spacecraft Atmosphere Revitalization System Development

NASA Technical Reports Server (NTRS)

Perry, Jay L.; Bagdigian, Robert M.; Carrasquillo, Robyn L.

2010-01-01

Developing the technological response to realizing an efficient atmosphere revitalization system for future crewed spacecraft and space habitats requires identifying and describing functional trade spaces. Mission concepts and requirements dictate the necessary functions; however, the combination and sequence of those functions possess significant flexibility. Using a closed loop environmental control and life support (ECLS) system architecture as a starting basis, a functional unit operations approach is developed to identify trade spaces. Generalized technological responses to each trade space are discussed. Key performance parameters that apply to functional areas are described.
Construction of a robust microarray from a non-model species (largemouth bass) using pyrosequencing technology

PubMed Central

Garcia-Reyero, Natàlia; Griffitt, Robert J.; Liu, Li; Kroll, Kevin J.; Farmerie, William G.; Barber, David S.; Denslow, Nancy D.

2009-01-01

A novel custom microarray for largemouth bass (Micropterus salmoides) was designed with sequences obtained from a normalized cDNA library using the 454 Life Sciences GS-20 pyrosequencer. This approach yielded in excess of 58 million bases of high-quality sequence. The sequence information was combined with 2,616 reads obtained by traditional suppressive subtractive hybridizations to derive a total of 31,391 unique sequences. Annotation and coding sequences were predicted for these transcripts where possible. 16,350 annotated transcripts were selected as target sequences for the design of the custom largemouth bass oligonucleotide microarray. The microarray was validated by examining the transcriptomic response in male largemouth bass exposed to 17β-œstradiol. Transcriptomic responses were assessed in liver and gonad, and indicated gene expression profiles typical of exposure to œstradiol. The results demonstrate the potential to rapidly create the tools necessary to assess large scale transcriptional responses in non-model species, paving the way for expanded impact of toxicogenomics in ecotoxicology. PMID:19936325
High quality de novo sequencing and assembly of the Saccharomyces arboricolus genome

PubMed Central

2013-01-01

Background Comparative genomics is a formidable tool to identify functional elements throughout a genome. In the past ten years, studies in the budding yeast Saccharomyces cerevisiae and a set of closely related species have been instrumental in showing the benefit of analyzing patterns of sequence conservation. Increasing the number of closely related genome sequences makes the comparative genomics approach more powerful and accurate. Results Here, we report the genome sequence and analysis of Saccharomyces arboricolus, a yeast species recently isolated in China, that is closely related to S. cerevisiae. We obtained high quality de novo sequence and assemblies using a combination of next generation sequencing technologies, established the phylogenetic position of this species and considered its phenotypic profile under multiple environmental conditions in the light of its gene content and phylogeny. Conclusions We suggest that the genome of S. arboricolus will be useful in future comparative genomics analysis of the Saccharomyces sensu stricto yeasts. PMID:23368932
Implementing Genome-Driven Oncology

PubMed Central

Hyman, David M.; Taylor, Barry S.; Baselga, José

2017-01-01

Early successes in identifying and targeting individual oncogenic drivers, together with the increasing feasibility of sequencing tumor genomes, have brought forth the promise of genome-driven oncology care. As we expand the breadth and depth of genomic analyses, the biological and clinical complexity of its implementation will be unparalleled. Challenges include target credentialing and validation, implementing drug combinations, clinical trial designs, targeting tumor heterogeneity, and deploying technologies beyond DNA sequencing, among others. We review how contemporary approaches are tackling these challenges and will ultimately serve as an engine for biological discovery and increase our insight into cancer and its treatment. PMID:28187282
The mitogenome of Onchocerca volvulus from the Brazilian Amazonia focus.

PubMed

Crainey, James L; Silva, Túllio R R da; Encinas, Fernando; Marín, Michel A; Vicente, Ana Carolina P; Luz, Sérgio L B

2016-01-01

We report here the first complete mitochondrial genome of Onchocerca volvulus from a focus outside of Africa. An O. volvulus mitogenome from the Brazilian Amazonia focus was obtained using a combination of high-throughput and Sanger sequencing technologies. Comparisons made between this mitochondrial genome and publicly available mitochondrial sequences identified 46 variant nucleotide positions and suggested that our Brazilian mitogenome is more closely related to Cameroon-origin mitochondria than West African-origin mitochondria. As well as providing insights into the origins of Latin American onchocerciasis, the Brazilian Amazonia focus mitogenome may also have value as an epidemiological resource.
[Using exon combined target region capture sequencing chip to detect the disease-causing genes of retinitis pigmentosa].

PubMed

Rong, Weining; Chen, Xuejuan; Li, Huiping; Liu, Yani; Sheng, Xunlun

2014-06-01

To detect the disease-causing genes of 10 retinitis pigmentosa (RP) pedigrees by using an exon combined target region capture sequencing chip. This was a pedigree investigation study. From October 2010 to December 2013, 10 RP pedigrees were recruited for this study at Ningxia Eye Hospital. All patients and family members received complete ophthalmic examinations. DNA was extracted from patients, family members and controls. An exon combined target region capture sequencing chip was used to screen for candidate disease-causing mutations. Polymerase chain reaction (PCR) and direct sequencing were used to confirm the disease-causing mutations. Seventy patients and 23 normal family members were recruited from the 10 pedigrees. Of the 10 RP pedigrees, 1 was autosomal dominant and 9 were autosomal recessive. Seven mutations in 5 genes were detected in 5 pedigrees. A frameshift mutation in the BBS7 gene was detected in pedigree No. 2; the patients in this pedigree also had central obesity, polydactyly and mental handicap, and the pedigree was finally diagnosed with Bardet-Biedl syndrome. A missense mutation was detected in each of pedigrees No. 7 and No. 10. Because the patients also suffered from deafness, the final diagnosis was Usher syndrome. A missense mutation in the C3 gene, which is related to age-related macular degeneration, was also detected in pedigree No. 7. A nonsense mutation and a missense mutation in the CRB1 gene were detected in pedigree No. 1, and a splice-site mutation in the PROM1 gene was detected in pedigree No. 5. Retinitis pigmentosa is a genetic eye disease with diverse clinical phenotypes. Rapid and effective genetic diagnosis technology combined with analysis of clinical characteristics helps to improve the clinical diagnosis of RP.
A whole-genome shotgun approach for assembling and anchoring the hexaploid bread wheat genome

DOE PAGES

Chapman, Jarrod A.; Mascher, Martin; Buluc, Aydin; ...

2015-01-31

We report that polyploid species have long been thought to be recalcitrant to whole-genome assembly. By combining high-throughput sequencing, recent developments in parallel computing, and genetic mapping, we derive, de novo, a sequence assembly representing 9.1 Gbp of the highly repetitive 16 Gbp genome of hexaploid wheat, Triticum aestivum, and assign 7.1 Gb of this assembly to chromosomal locations. The genome representation and accuracy of our assembly is comparable to or even exceeds that of a chromosome-by-chromosome shotgun assembly. Our assembly and mapping strategy uses only short read sequencing technology and is applicable to any species where it is possible to construct a mapping population.
A whole-genome shotgun approach for assembling and anchoring the hexaploid bread wheat genome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chapman, Jarrod A.; Mascher, Martin; Buluc, Aydin

We report that polyploid species have long been thought to be recalcitrant to whole-genome assembly. By combining high-throughput sequencing, recent developments in parallel computing, and genetic mapping, we derive, de novo, a sequence assembly representing 9.1 Gbp of the highly repetitive 16 Gbp genome of hexaploid wheat, Triticum aestivum, and assign 7.1 Gb of this assembly to chromosomal locations. The genome representation and accuracy of our assembly is comparable to or even exceeds that of a chromosome-by-chromosome shotgun assembly. Our assembly and mapping strategy uses only short read sequencing technology and is applicable to any species where it is possible to construct a mapping population.
Assembly of highly repetitive genomes using short reads: the genome of discrete typing unit III Trypanosoma cruzi strain 231.

PubMed

Baptista, Rodrigo P; Reis-Cunha, Joao Luis; DeBarry, Jeremy D; Chiari, Egler; Kissinger, Jessica C; Bartholomeu, Daniella C; Macedo, Andrea M

2018-02-14

Next-generation sequencing (NGS) methods are low-cost high-throughput technologies that produce thousands to millions of sequence reads. Despite the high number of raw sequence reads, their short length, relative to Sanger, PacBio or Nanopore reads, complicates the assembly of genomic repeats. Many genome tools are available, but the assembly of highly repetitive genome sequences using only NGS short reads remains challenging. Genome assembly of organisms responsible for important neglected diseases such as Trypanosoma cruzi, the aetiological agent of Chagas disease, is known to be challenging because of their repetitive nature. Only three of six recognized discrete typing units (DTUs) of the parasite have their draft genomes published and therefore genome evolution analyses in the taxon are limited. In this study, we developed a computational workflow to assemble highly repetitive genomes via a combination of de novo and reference-based assembly strategies to better overcome the intrinsic limitations of each, based on Illumina reads. The highly repetitive genome of the human-infecting parasite T. cruzi 231 strain was used as a test subject. The combined-assembly approach shown in this study benefits from the reference-based assembly ability to resolve highly repetitive sequences and from the de novo capacity to assemble genome-specific regions, improving the quality of the assembly. The acceptable confidence obtained by analyzing our results showed that our combined approach is an attractive option to assemble highly repetitive genomes with NGS short reads. Phylogenomic analysis including the 231 strain, the first representative of DTU III whose genome was sequenced, was also performed and provides new insights into T. cruzi genome evolution.
Improved Annotation of 3′ Untranslated Regions and Complex Loci by Combination of Strand-Specific Direct RNA Sequencing, RNA-Seq and ESTs

PubMed Central

Song, Junfang; Duc, Céline; Storey, Kate G.; McLean, W. H. Irwin; Brown, Sara J.; Simpson, Gordon G.; Barton, Geoffrey J.

2014-01-01

The reference annotations made for a genome sequence provide the framework for all subsequent analyses of the genome. Correct and complete annotation in addition to the underlying genomic sequence is particularly important when interpreting the results of RNA-seq experiments where short sequence reads are mapped against the genome and assigned to genes according to the annotation. Inconsistencies in annotations between the reference and the experimental system can lead to incorrect interpretation of the effect on RNA expression of an experimental treatment or mutation in the system under study. Until recently, the genome-wide annotation of 3′ untranslated regions received less attention than coding regions and the delineation of intron/exon boundaries. In this paper, data produced for samples in Human, Chicken and A. thaliana by the novel single-molecule, strand-specific, Direct RNA Sequencing technology from Helicos Biosciences which locates 3′ polyadenylation sites to within +/− 2 nt, were combined with archival EST and RNA-Seq data. Nine examples are illustrated where this combination of data allowed: (1) gene and 3′ UTR re-annotation (including extension of one 3′ UTR by 5.9 kb); (2) disentangling of gene expression in complex regions; (3) clearer interpretation of small RNA expression and (4) identification of novel genes. While the specific examples displayed here may become obsolete as genome sequences and their annotations are refined, the principles laid out in this paper will be of general use both to those annotating genomes and those seeking to interpret existing publicly available annotations in the context of their own experimental data. PMID:24722185

The Widening Gulf between Genomics Data Generation and Consumption: A Practical Guide to Big Data Transfer Technology

PubMed Central

Feltus, Frank A.; Breen, Joseph R.; Deng, Juan; Izard, Ryan S.; Konger, Christopher A.; Ligon, Walter B.; Preuss, Don; Wang, Kuang-Ching

2015-01-01

In the last decade, high-throughput DNA sequencing has become a disruptive technology and pushed the life sciences into a distributed ecosystem of sequence data producers and consumers. Given the power of genomics and declining sequencing costs, biology is an emerging “Big Data” discipline that will soon enter the exabyte data range when all subdisciplines are combined. These datasets must be transferred across commercial and research networks in creative ways since sending data without thought can have serious consequences on data processing time frames. Thus, it is imperative that biologists, bioinformaticians, and information technology engineers recalibrate data processing paradigms to fit this emerging reality. This review attempts to provide a snapshot of Big Data transfer across networks, which is often overlooked by many biologists. Specifically, we discuss four key areas: 1) data transfer networks, protocols, and applications; 2) data transfer security including encryption, access, firewalls, and the Science DMZ; 3) data flow control with software-defined networking; and 4) data storage, staging, archiving and access. A primary intention of this article is to orient the biologist in key aspects of the data transfer process in order to frame their genomics-oriented needs to enterprise IT professionals. PMID:26568680
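As a rough illustration of why transfer planning matters, the following sketch estimates how long a dataset takes to move over a network link. The dataset size, link speed, and efficiency factor are hypothetical values chosen for the arithmetic, not figures from the review, and the function name `transfer_seconds` is likewise an assumption.

```python
# Back-of-the-envelope transfer time: time = size in bits / effective bandwidth.
# All numbers below are hypothetical examples, not values from the review.
def transfer_seconds(size_bytes, link_bits_per_s, efficiency=1.0):
    """Seconds needed to move size_bytes over a link running at the given
    fraction (efficiency) of its nominal bit rate."""
    return size_bytes * 8 / (link_bits_per_s * efficiency)


ONE_TERABYTE = 10**12        # bytes (decimal terabyte)
ONE_GIGABIT_LINK = 10**9     # bits per second

ideal = transfer_seconds(ONE_TERABYTE, ONE_GIGABIT_LINK)
half_rate = transfer_seconds(ONE_TERABYTE, ONE_GIGABIT_LINK, efficiency=0.5)

print(f"ideal: {ideal / 3600:.2f} h")
print(f"at 50% efficiency: {half_rate / 3600:.2f} h")
```

Even under ideal conditions a single terabyte occupies a 1 Gbit/s link for a couple of hours, so protocol overhead or congestion that halves throughput doubles that time.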
Efficient sequence-specific isolation of DNA fragments and chromatin by in vitro enChIP technology using recombinant CRISPR ribonucleoproteins.

PubMed

Fujita, Toshitsugu; Yuno, Miyuki; Fujii, Hodaka

2016-04-01

The clustered regularly interspaced short palindromic repeats (CRISPR) system is widely used for various biological applications, including genome editing. We developed engineered DNA-binding molecule-mediated chromatin immunoprecipitation (enChIP) using CRISPR to isolate target genomic regions from cells for their biochemical characterization. In this study, we developed 'in vitro enChIP' using recombinant CRISPR ribonucleoproteins (RNPs) to isolate target genomic regions. in vitro enChIP has the great advantage over conventional enChIP of not requiring expression of CRISPR complexes in cells. We first showed that in vitro enChIP using recombinant CRISPR RNPs can be used to isolate target DNA from mixtures of purified DNA in a sequence-specific manner. In addition, we showed that this technology can be used to efficiently isolate target genomic regions, while retaining their intracellular molecular interactions, with negligible contamination from irrelevant genomic regions. Thus, in vitro enChIP technology is of potential use for sequence-specific isolation of DNA, as well as for identification of molecules interacting with genomic regions of interest in vivo in combination with downstream analysis. © 2016 The Authors. Genes to Cells published by Molecular Biology Society of Japan and John Wiley & Sons Australia, Ltd.
Use of low-coverage, large-insert, short-read data for rapid and accurate generation of enhanced-quality draft Pseudomonas genome sequences.

PubMed

O'Brien, Heath E; Gong, Yunchen; Fung, Pauline; Wang, Pauline W; Guttman, David S

2011-01-01

Next-generation genomic technology has both greatly accelerated the pace of genome research as well as increased our reliance on draft genome sequences. While groups such as the Genomics Standards Consortium have made strong efforts to promote genome standards there is still a general lack of uniformity among published draft genomes, leading to challenges for downstream comparative analyses. This lack of uniformity is a particular problem when using standard draft genomes that frequently have large numbers of low-quality sequencing tracts. Here we present a proposal for an "enhanced-quality draft" genome that identifies at least 95% of the coding sequences, thereby effectively providing a full accounting of the genic component of the genome. Enhanced-quality draft genomes are easily attainable through a combination of small- and large-insert next-generation, paired-end sequencing. We illustrate the generation of an enhanced-quality draft genome by re-sequencing the plant pathogenic bacterium Pseudomonas syringae pv. phaseolicola 1448A (Pph 1448A), which has a published, closed genome sequence of 5.93 Mbp. We use a combination of Illumina paired-end and mate-pair sequencing, and surprisingly find that de novo assemblies with 100x paired-end coverage and mate-pair sequencing with as low as 2-5x coverage are substantially better than assemblies based on higher coverage. The rapid and low-cost generation of large numbers of enhanced-quality draft genome sequences will be of particular value for microbial diagnostics and biosecurity, which rely on precise discrimination of potentially dangerous clones from closely related benign strains.
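The coverage figures quoted above can be turned into approximate read counts with the standard relation coverage = (number of reads × read length) / genome size. The sketch below uses the 5.93 Mbp genome size and the 100x and 2-5x coverage levels from the abstract; the 100 bp read length is an assumption made only for illustration, as is the helper name `reads_needed`.

```python
# Reads required for a target coverage: N = ceil(G * C / L).
# Genome size and coverage levels come from the abstract; the read length
# is an assumed value, not one reported in the abstract.
def reads_needed(genome_bp, coverage, read_len_bp):
    """Smallest whole number of reads giving at least the requested coverage."""
    return -(-genome_bp * coverage // read_len_bp)  # integer ceiling division


GENOME_BP = 5_930_000   # Pph 1448A closed genome, 5.93 Mbp
READ_LEN_BP = 100       # assumed read length

paired_end_reads = reads_needed(GENOME_BP, 100, READ_LEN_BP)
mate_pair_low = reads_needed(GENOME_BP, 2, READ_LEN_BP)
mate_pair_high = reads_needed(GENOME_BP, 5, READ_LEN_BP)

print(paired_end_reads, mate_pair_low, mate_pair_high)
```

With these assumptions, the mate-pair library needs only a small fraction of the reads used for the paired-end library, which fits the abstract's emphasis on low-coverage large-insert data.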
Aircraft Loss-of-Control: Analysis and Requirements for Future Safety-Critical Systems and Their Validation

NASA Technical Reports Server (NTRS)

Belcastro, Christine M.

2011-01-01

Loss of control remains one of the largest contributors to fatal aircraft accidents worldwide. Aircraft loss-of-control accidents are complex, resulting from numerous causal and contributing factors acting alone or more often in combination. Hence, there is no single intervention strategy to prevent these accidents. This paper summarizes recent analysis results in identifying worst-case combinations of loss-of-control accident precursors and their time sequences, a holistic approach to preventing loss-of-control accidents in the future, and key requirements for validating the associated technologies.
SNiPlay: a web-based tool for detection, management and analysis of SNPs. Application to grapevine diversity projects.

PubMed

Dereeper, Alexis; Nicolas, Stéphane; Le Cunff, Loïc; Bacilieri, Roberto; Doligez, Agnès; Peros, Jean-Pierre; Ruiz, Manuel; This, Patrice

2011-05-05

High-throughput re-sequencing, new genotyping technologies and the availability of reference genomes allow the extensive characterization of Single Nucleotide Polymorphisms (SNPs) and insertion/deletion events (indels) in many plant species. The rapidly increasing amount of re-sequencing and genotyping data generated by large-scale genetic diversity projects requires the development of integrated bioinformatics tools able to efficiently manage, analyze, and combine these genetic data with genome structure and external data. In this context, we developed SNiPlay, a flexible, user-friendly and integrative web-based tool dedicated to polymorphism discovery and analysis. It integrates: 1) a pipeline, freely accessible through the internet, combining existing software with new tools to detect SNPs and to compute different types of statistical indices and graphical layouts for SNP data. From standard sequence alignments, genotyping data or Sanger sequencing traces given as input, SNiPlay detects SNPs and indel events and outputs submission files for the design of Illumina's SNP chips. Subsequently, it sends sequences and genotyping data into a series of modules in charge of various processes: physical mapping to a reference genome, annotation (genomic position, intron/exon location, synonymous/non-synonymous substitutions), SNP frequency determination in user-defined groups, haplotype reconstruction and network, linkage disequilibrium evaluation, and diversity analysis (Pi, Watterson's Theta, Tajima's D). Furthermore, the pipeline allows the use of external data (such as phenotype, geographic origin, taxa, stratification) to define groups and compare statistical indices. 2) a database storing polymorphisms, genotyping data and grapevine sequences released by public and private projects. It allows the user to retrieve SNPs using various filters (such as genomic position, missing data, polymorphism type, allele frequency), to compare SNP patterns between populations, and to export genotyping data or sequences in various formats. Our experiments on grapevine genetic projects showed that SNiPlay allows geneticists to rapidly obtain advanced results in several key research areas of plant genetic diversity. Both the management and treatment of large amounts of SNP data are rendered considerably easier for end-users through automation and integration. Current developments are taking into account new advances in high-throughput technologies. SNiPlay is available at: http://sniplay.cirad.fr/.
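Two of the diversity indices named in this abstract, Pi and Watterson's Theta, can be sketched with their textbook per-site definitions. This is a minimal illustration, not SNiPlay's implementation: the function names and the toy alignment are hypothetical, the sequences are assumed to be aligned, gap-free and of equal length, and Tajima's D is left out.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Count alignment columns that contain more than one distinct base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """Pi: mean number of pairwise differences, per site."""
    pairs = list(combinations(seqs, 2))
    differences = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return differences / (len(pairs) * len(seqs[0]))


def watterson_theta(seqs):
    """Watterson's Theta per site: S / a1, with a1 = sum of 1/i for i in 1..n-1."""
    n = len(seqs)
    a1 = sum(1 / i for i in range(1, n))
    return segregating_sites(seqs) / a1 / len(seqs[0])


# Hypothetical toy alignment of four sequences, eight sites each.
alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]

s = segregating_sites(alignment)
pi = nucleotide_diversity(alignment)
theta = watterson_theta(alignment)
print(s, pi, theta)
```

Tajima's D compares these two estimates after scaling by a variance term, so a working Pi and Theta are the natural starting point for it.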
What are Whole Exome Sequencing and Whole Genome Sequencing?

MedlinePlus

... the future. For more information about DNA sequencing technologies and their use: Genetics Home Reference discusses whether ... University in St. Louis describes the different sequencing technologies and what the new technologies have meant for ...
Computer-Aided Detection of Prostate Cancer with MRI: Technology and Applications

PubMed Central

Liu, Lizhi; Tian, Zhiqiang; Zhang, Zhenfeng; Fei, Baowei

2016-01-01

One in six men will develop prostate cancer in his lifetime. Early detection and accurate diagnosis of the disease can improve cancer survival and reduce treatment costs. Recently, imaging of prostate cancer has greatly advanced since the introduction of multi-parametric magnetic resonance imaging (mp-MRI). Mp-MRI consists of T2-weighted sequences combined with functional sequences including dynamic contrast-enhanced MRI, diffusion-weighted MRI, and MR spectroscopy imaging. Due to the big data and variations in imaging sequences, detection can be affected by multiple factors such as observer variability and visibility and complexity of the lesions. In order to improve quantitative assessment of the disease, various computer-aided detection systems have been designed to help radiologists in their clinical practice. This review paper presents an overview of the literature on computer-aided detection of prostate cancer with mp-MRI, including the technology and its applications. The aim of the survey is threefold: an introduction for those new to the field, an overview for those working in the field, and a reference for those searching for literature on a specific application. PMID:27133005
OTG-snpcaller: an optimized pipeline based on TMAP and GATK for SNP calling from ion torrent data.

PubMed

Zhu, Pengyuan; He, Lingyu; Li, Yaqiao; Huang, Wenpan; Xi, Feng; Lin, Lin; Zhi, Qihuan; Zhang, Wenwei; Tang, Y Tom; Geng, Chunyu; Lu, Zhiyuan; Xu, Xun

2014-01-01

Because the new Proton platform from Life Technologies produced markedly different data from those of the Illumina platform, the conventional Illumina data analysis pipeline could not be used directly. We developed an optimized SNP calling method using TMAP and GATK (OTG-snpcaller). This method combined our own optimized processes, Remove Duplicates According to AS Tag (RDAST) and Alignment Optimize Structure (AOS), together with TMAP and GATK, to call SNPs from Proton data. We sequenced four sets of exomes captured by Agilent SureSelect and NimbleGen SeqCap EZ Kit, using Life Technologies' Ion Proton sequencer. Then we applied OTG-snpcaller and compared our results with the results from Torrent Variants Caller. The results indicated that OTG-snpcaller can reduce both false positive and false negative rates. Moreover, we compared our results with Illumina results generated by GATK best practices, and we found that the results of these two platforms were comparable. The good performance in variant calling using GATK best practices can be primarily attributed to the high quality of the Illumina sequences.
HLA genotyping by next-generation sequencing of complementary DNA.

PubMed

Segawa, Hidenobu; Kukita, Yoji; Kato, Kikuya

2017-11-28

Genotyping of the human leucocyte antigen (HLA) is indispensable for various medical treatments. However, unambiguous genotyping is technically challenging due to high polymorphism of the corresponding genomic region. Next-generation sequencing is changing the landscape of genotyping. In addition to high throughput of data, its additional advantage is that DNA templates are derived from single molecules, which is a strong merit for the phasing problem. Although most currently developed technologies use genomic DNA, use of cDNA could enable genotyping with reduced costs in data production and analysis. We thus developed an HLA genotyping system based on next-generation sequencing of cDNA. Each HLA gene was divided into 3 or 4 target regions subjected to PCR amplification and subsequent sequencing with Ion Torrent PGM. The sequence data were then subjected to an automated analysis. The principle of the analysis was to construct candidate sequences generated from all possible combinations of variable bases and arrange them in decreasing order of the number of reads. Upon collecting candidate sequences from all target regions, 2 haplotypes were usually assigned. Cases not assigned 2 haplotypes were forwarded to 4 additional processes: selection of candidate sequences applying more stringent criteria, removal of artificial haplotypes, selection of candidate sequences with a relaxed threshold for sequence matching, and countermeasure for incomplete sequences in the HLA database. The genotyping system was evaluated using 30 samples; the overall accuracy was 97.0% at the field 3 level and 98.3% at the G group level. With one sample, genotyping of DPB1 was not completed due to short read size. We then developed a method for complete sequencing of individual molecules of the DPB1 gene, using the molecular barcode technology. The performance of the automatic genotyping system was comparable to that of systems developed in previous studies. 
Thus, next-generation sequencing of cDNA is a viable option for HLA genotyping.
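The analysis principle stated in this abstract (build candidate sequences from all combinations of variable bases, then order them by read count) can be sketched as follows. This is a simplified illustration under strong assumptions, not the authors' system: reads are taken to be already aligned, equal in length and free of indels, and the function name and toy reads are hypothetical.

```python
from collections import Counter
from itertools import product


def rank_candidates(reads):
    """Return variable positions and candidate base combinations ranked by read support."""
    columns = list(zip(*reads))
    # Positions where the reads disagree.
    variable = [i for i, col in enumerate(columns) if len(set(col)) > 1]
    # Bases observed at each variable position.
    observed = [sorted(set(columns[i])) for i in variable]
    # Number of reads carrying each combination of bases at those positions.
    support = Counter(tuple(read[i] for i in variable) for read in reads)
    # Every possible combination is a candidate; unsupported ones get a count of 0.
    candidates = [(combo, support.get(combo, 0)) for combo in product(*observed)]
    candidates.sort(key=lambda item: -item[1])  # decreasing read count
    return variable, candidates


# Hypothetical reads covering one target region.
reads = ["ACGTA", "ACGTA", "ACGTA", "ATGCA", "ATGCA", "ATGTA"]
variable, candidates = rank_candidates(reads)
for combo, count in candidates:
    print("".join(combo), count)
```

In this toy case the two best-supported combinations would be the natural picks for the two haplotypes, while the low-count combination resembles the kind of artificial haplotype that the abstract's additional processing steps are meant to remove.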
Rapid sequencing of the bamboo mitochondrial genome using Illumina technology and parallel episodic evolution of organelle genomes in grasses.

PubMed

Ma, Peng-Fei; Guo, Zhen-Hua; Li, De-Zhu

2012-01-01

Compared to their counterparts in animals, the mitochondrial (mt) genomes of angiosperms exhibit a number of unique features. However, unravelling their evolution is hindered by the few completed genomes, of which are essentially Sanger sequenced. While next-generation sequencing technologies have revolutionized chloroplast genome sequencing, they are just beginning to be applied to angiosperm mt genomes. Chloroplast genomes of grasses (Poaceae) have undergone episodic evolution and the evolutionary rate was suggested to be correlated between chloroplast and mt genomes in Poaceae. It is interesting to investigate whether correlated rate change also occurred in grass mt genomes as expected under lineage effects. A time-calibrated phylogenetic tree is needed to examine rate change. We determined a largely completed mt genome from a bamboo, Ferrocalamus rimosivaginus (Poaceae), through Illumina sequencing of total DNA. With combination of de novo and reference-guided assembly, 39.5-fold coverage Illumina reads were finally assembled into scaffolds totalling 432,839 bp. The assembled genome contains nearly the same genes as the completed mt genomes in Poaceae. For examining evolutionary rate in grass mt genomes, we reconstructed a phylogenetic tree including 22 taxa based on 31 mt genes. The topology of the well-resolved tree was almost identical to that inferred from chloroplast genome with only minor difference. The inconsistency possibly derived from long branch attraction in mtDNA tree. By calculating absolute substitution rates, we found significant rate change (∼4-fold) in mt genome before and after the diversification of Poaceae both in synonymous and nonsynonymous terms. Furthermore, the rate change was correlated with that of chloroplast genomes in grasses. Our result demonstrates that it is a rapid and efficient approach to obtain angiosperm mt genome sequences using Illumina sequencing technology. 
The parallel episodic evolution of mt and chloroplast genomes in grasses is consistent with lineage effects.
Rapid Sequencing of the Bamboo Mitochondrial Genome Using Illumina Technology and Parallel Episodic Evolution of Organelle Genomes in Grasses

PubMed Central

Ma, Peng-Fei; Guo, Zhen-Hua; Li, De-Zhu

2012-01-01

Background Compared to their counterparts in animals, the mitochondrial (mt) genomes of angiosperms exhibit a number of unique features. However, unravelling their evolution is hindered by the few completed genomes, of which are essentially Sanger sequenced. While next-generation sequencing technologies have revolutionized chloroplast genome sequencing, they are just beginning to be applied to angiosperm mt genomes. Chloroplast genomes of grasses (Poaceae) have undergone episodic evolution and the evolutionary rate was suggested to be correlated between chloroplast and mt genomes in Poaceae. It is interesting to investigate whether correlated rate change also occurred in grass mt genomes as expected under lineage effects. A time-calibrated phylogenetic tree is needed to examine rate change. Methodology/Principal Findings We determined a largely completed mt genome from a bamboo, Ferrocalamus rimosivaginus (Poaceae), through Illumina sequencing of total DNA. With combination of de novo and reference-guided assembly, 39.5-fold coverage Illumina reads were finally assembled into scaffolds totalling 432,839 bp. The assembled genome contains nearly the same genes as the completed mt genomes in Poaceae. For examining evolutionary rate in grass mt genomes, we reconstructed a phylogenetic tree including 22 taxa based on 31 mt genes. The topology of the well-resolved tree was almost identical to that inferred from chloroplast genome with only minor difference. The inconsistency possibly derived from long branch attraction in mtDNA tree. By calculating absolute substitution rates, we found significant rate change (∼4-fold) in mt genome before and after the diversification of Poaceae both in synonymous and nonsynonymous terms. Furthermore, the rate change was correlated with that of chloroplast genomes in grasses. 
Conclusions/Significance Our result demonstrates that it is a rapid and efficient approach to obtain angiosperm mt genome sequences using Illumina sequencing technology. The parallel episodic evolution of mt and chloroplast genomes in grasses is consistent with lineage effects. PMID:22272330
Sequencing Technologies Panel at SFAF

DOE Office of Scientific and Technical Information (OSTI.GOV)

Turner, Steve; Fiske, Haley; Knight, Jim

2010-06-02

From left to right: Steve Turner of Pacific Biosciences, Haley Fiske of Illumina, Jim Knight of Roche, Michael Rhodes of Life Technologies and Peter Vander Horn of Life Technologies' Single Molecule Sequencing group discuss new sequencing technologies and applications on June 2, 2010 at the "Sequencing, Finishing, Analysis in the Future" meeting in Santa Fe, NM
Absolute quantification of microbial taxon abundances.

PubMed

Props, Ruben; Kerckhof, Frederiek-Maarten; Rubbens, Peter; De Vrieze, Jo; Hernandez Sanabria, Emma; Waegeman, Willem; Monsieurs, Pieter; Hammes, Frederik; Boon, Nico

2017-02-01

High-throughput amplicon sequencing has become a well-established approach for microbial community profiling. Correlating shifts in the relative abundances of bacterial taxa with environmental gradients is the goal of many microbiome surveys. As the abundances generated by this technology are semi-quantitative by definition, the observed dynamics may not accurately reflect those of the actual taxon densities. We combined the sequencing approach (16S rRNA gene) with robust single-cell enumeration technologies (flow cytometry) to quantify the absolute taxon abundances. A detailed longitudinal analysis of the absolute abundances resulted in distinct abundance profiles that were less ambiguous and expressed in units that can be directly compared across studies. We further provide evidence that the enrichment of taxa (increase in relative abundance) does not necessarily relate to the outgrowth of taxa (increase in absolute abundance). Our results highlight that both relative and absolute abundances should be considered for a comprehensive biological interpretation of microbiome surveys.
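The distinction drawn in this abstract between enrichment and outgrowth follows from simple arithmetic: an absolute abundance is the relative abundance from sequencing multiplied by the total cell count from flow cytometry. The sketch below uses invented numbers and hypothetical taxon names purely to show the effect; it is not the authors' data or pipeline.

```python
def absolute_abundances(relative, total_cells):
    """Scale relative abundances (fractions summing to 1) by a total cell count."""
    return {taxon: fraction * total_cells for taxon, fraction in relative.items()}


# Hypothetical community sampled at two time points (cells per mL).
before = absolute_abundances({"taxon_A": 0.2, "taxon_B": 0.8}, 1_000_000)
after = absolute_abundances({"taxon_A": 0.4, "taxon_B": 0.6}, 400_000)

for taxon in before:
    print(taxon, before[taxon], after[taxon])
```

Here taxon_A doubles in relative abundance (0.2 to 0.4) while its absolute abundance falls, because the total cell count dropped. That is enrichment without outgrowth.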
Evaluation of 16S rRNA amplicon sequencing using two next-generation sequencing technologies for phylogenetic analysis of the rumen bacterial community in steers

USDA-ARS's Scientific Manuscript database

Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplificatio...
Food Safety in the Age of Next Generation Sequencing, Bioinformatics, and Open Data Access.

PubMed

Taboada, Eduardo N; Graham, Morag R; Carriço, João A; Van Domselaar, Gary

2017-01-01

Public health labs and food regulatory agencies globally are embracing whole genome sequencing (WGS) as a revolutionary new method that is positioned to replace numerous existing diagnostic and microbial typing technologies with a single new target: the microbial draft genome. The ability to cheaply generate large amounts of microbial genome sequence data, combined with emerging policies of food regulatory and public health institutions making their microbial sequences increasingly available and public, has served to open up the field to the general scientific community. This open data access policy shift has resulted in a proliferation of data being deposited into sequence repositories and of novel bioinformatics software designed to analyze these vast datasets. There also has been a more recent drive for improved data sharing to achieve more effective global surveillance, public health and food safety. Such developments have heightened the need for enhanced analytical systems in order to process and interpret this new type of data in a timely fashion. In this review we outline the emergence of genomics, bioinformatics and open data in the context of food safety. We also survey major efforts to translate genomics and bioinformatics technologies out of the research lab and into routine use in modern food safety labs. We conclude by discussing the challenges and opportunities that remain, including those expected to play a major role in the future of food safety science.
Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome through long-read (>11 kb), single molecule, real-time sequencing

PubMed Central

Vembar, Shruthi Sridhar; Seetin, Matthew; Lambert, Christine; Nattestad, Maria; Schatz, Michael C.; Baybayan, Primo; Scherf, Artur; Smith, Melissa Laird

2016-01-01

The application of next-generation sequencing to estimate genetic diversity of Plasmodium falciparum, the most lethal malaria parasite, has proved challenging due to the skewed AT-richness [∼80.6% (A + T)] of its genome and the lack of technology to assemble highly polymorphic subtelomeric regions that contain clonally variant, multigene virulence families (Ex: var and rifin). To address this, we performed amplification-free, single molecule, real-time sequencing of P. falciparum genomic DNA and generated reads of average length 12 kb, with 50% of the reads between 15.5 and 50 kb in length. Next, using the Hierarchical Genome Assembly Process, we assembled the P. falciparum genome de novo and successfully compiled all 14 nuclear chromosomes telomere-to-telomere. We also accurately resolved centromeres [∼90–99% (A + T)] and subtelomeric regions and identified large insertions and duplications that add extra var and rifin genes to the genome, along with smaller structural variants such as homopolymer tract expansions. Overall, we show that amplification-free, long-read sequencing combined with de novo assembly overcomes major challenges inherent to studying the P. falciparum genome. Indeed, this technology may not only identify the polymorphic and repetitive subtelomeric sequences of parasite populations from endemic areas but may also evaluate structural variation linked to virulence, drug resistance and disease transmission. PMID:27345719
Next generation sequencing and its applications in forensic genetics.

PubMed

Børsting, Claus; Morling, Niels

2015-09-01

It has been almost a decade since the first next generation sequencing (NGS) technologies emerged and quickly changed the way genetic research is conducted. Today, full genomes are mapped and published almost weekly and with ever increasing speed and decreasing costs. NGS methods and platforms have matured during the last 10 years, and the quality of the sequences has reached a level where NGS is used in clinical diagnostics of humans. Forensic genetic laboratories have also explored NGS technologies and especially in the last year, there has been a small explosion in the number of scientific articles and presentations at conferences with forensic aspects of NGS. These contributions have demonstrated that NGS offers new possibilities for forensic genetic case work. More information may be obtained from unique samples in a single experiment by analyzing combinations of markers (STRs, SNPs, insertion/deletions, mRNA) that cannot be analyzed simultaneously with the standard PCR-CE methods used today. The true variation in core forensic STR loci has been uncovered, and previously unknown STR alleles have been discovered. The detailed sequence information may aid mixture interpretation and will increase the statistical weight of the evidence. In this review, we will give an introduction to NGS and single-molecule sequencing, and we will discuss the possible applications of NGS in forensic genetics. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Top-down analysis of protein samples by de novo sequencing techniques

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vyatkina, Kira; Wu, Si; Dekker, Lennard J. M.

MOTIVATION: Recent technological advances have made high-resolution mass spectrometers affordable to many laboratories, thus boosting rapid development of top-down mass spectrometry, and implying a need for efficient methods for analyzing this kind of data. RESULTS: We describe a method for analysis of protein samples from top-down tandem mass spectrometry data, which capitalizes on de novo sequencing of fragments of the proteins present in the sample. Our algorithm takes as input a set of de novo amino acid strings derived from the given mass spectra using the recently proposed Twister approach, and combines them into aggregated strings endowed with offsets. The former typically constitute accurate sequence fragments of sufficiently well-represented proteins from the sample being analyzed, while the latter indicate their location in the protein sequence, and also bear information on post-translational modifications and fragmentation patterns.
Rapid, simple and direct detection of Meloidogyne hapla from infected root galls using loop-mediated isothermal amplification combined with FTA technology

PubMed Central

Peng, Huan; Long, Haibo; Huang, Wenkun; Liu, Jing; Cui, Jiangkuan; Kong, Lingan; Hu, Xianqi; Gu, Jianfeng; Peng, Deliang

2017-01-01

The northern root-knot nematode (Meloidogyne hapla) is a damaging nematode that has caused serious economic losses worldwide. In the present study, a sensitive, simple and rapid method was developed for detection of M. hapla in infested plant roots by combining a Flinders Technology Associates (FTA) card with loop-mediated isothermal amplification (LAMP). The specific primers of LAMP were designed based on the distinction of internal transcribed spacer (ITS) sequences between M. hapla and other Meloidogyne spp. The LAMP assay can detect nematode genomic DNA at concentrations as low as 1/200 000, which is 100 times more sensitive than conventional PCR. The LAMP assay was able to distinguish M. hapla from other closely related nematode species with high specificity. Furthermore, the advantages of the FTA-LAMP assay to detect M. hapla were demonstrated by assaying infected root galls that were artificially inoculated. In addition, M. hapla was successfully detected from six of forty-two field samples using FTA-LAMP technology. This study was the first to provide a simple diagnostic assay for M. hapla using the LAMP assay combined with FTA technology. In conclusion, the new FTA-LAMP assay has the potential for diagnosing infestation in the field and managing the pathogen M. hapla. PMID:28368036

Rapid, simple and direct detection of Meloidogyne hapla from infected root galls using loop-mediated isothermal amplification combined with FTA technology.

PubMed

Peng, Huan; Long, Haibo; Huang, Wenkun; Liu, Jing; Cui, Jiangkuan; Kong, Lingan; Hu, Xianqi; Gu, Jianfeng; Peng, Deliang

2017-04-03

The northern root-knot nematode (Meloidogyne hapla) is a damaging nematode that has caused serious economic losses worldwide. In the present study, a sensitive, simple and rapid method was developed for detection of M. hapla in infested plant roots by combining a Flinders Technology Associates (FTA) card with loop-mediated isothermal amplification (LAMP). The specific primers of LAMP were designed based on the distinction of internal transcribed spacer (ITS) sequences between M. hapla and other Meloidogyne spp. The LAMP assay can detect nematode genomic DNA at concentrations as low as 1/200 000, which is 100 times more sensitive than conventional PCR. The LAMP assay was able to distinguish M. hapla from other closely related nematode species with high specificity. Furthermore, the advantages of the FTA-LAMP assay to detect M. hapla were demonstrated by assaying infected root galls that were artificially inoculated. In addition, M. hapla was successfully detected from six of forty-two field samples using FTA-LAMP technology. This study was the first to provide a simple diagnostic assay for M. hapla using the LAMP assay combined with FTA technology. In conclusion, the new FTA-LAMP assay has the potential for diagnosing infestation in the field and managing the pathogen M. hapla.
Genome Engineering and Agriculture: Opportunities and Challenges.

PubMed

Baltes, Nicholas J; Gil-Humanes, Javier; Voytas, Daniel F

2017-01-01

In recent years, plant biotechnology has witnessed unprecedented technological change. Advances in high-throughput sequencing technologies have provided insight into the location and structure of functional elements within plant DNA. At the same time, improvements in genome engineering tools have enabled unprecedented control over genetic material. These technologies, combined with a growing understanding of plant systems biology, will irrevocably alter the way we create new crop varieties. As the first wave of genome-edited products emerge, we are just getting a glimpse of the immense opportunities the technology provides. We are also seeing its challenges and limitations. It is clear that genome editing will play an increased role in crop improvement and will help us to achieve food security in the coming decades; however, certain challenges and limitations must be overcome to realize the technology's full potential. © 2017 Elsevier Inc. All rights reserved.
Identification of the Bovine Arachnomelia Mutation by Massively Parallel Sequencing Implicates Sulfite Oxidase (SUOX) in Bone Development

PubMed Central

Drögemüller, Cord; Tetens, Jens; Sigurdsson, Snaevar; Gentile, Arcangelo; Testoni, Stefania; Lindblad-Toh, Kerstin; Leeb, Tosso

2010-01-01

Arachnomelia is a monogenic recessive defect of skeletal development in cattle. The causative mutation was previously mapped to a ∼7 Mb interval on chromosome 5. Here we show that array-based sequence capture and massively parallel sequencing technology, combined with the typical family structure in livestock populations, facilitates the identification of the causative mutation. We re-sequenced the entire critical interval in a healthy partially inbred cow carrying one copy of the critical chromosome segment in its ancestral state and one copy of the same segment with the arachnomelia mutation, and we detected a single heterozygous position. The genetic makeup of several partially inbred cattle provides extremely strong support for the causality of this mutation. The mutation represents a single base insertion leading to a premature stop codon in the coding sequence of the SUOX gene and is perfectly associated with the arachnomelia phenotype. Our findings suggest an important role for sulfite oxidase in bone development. PMID:20865119
In vitro manipulation of gene expression in larval Schistosoma: a model for postgenomic approaches in Trematoda

PubMed Central

YOSHINO, TIMOTHY P.; DINGUIRARD, NATHALIE; DE MORAES MOURÃO, MARINA

2013-01-01

SUMMARY With rapid developments in DNA and protein sequencing technologies, combined with powerful bioinformatics tools, a continued acceleration of gene identification in parasitic helminths is predicted, potentially leading to discovery of new drug and vaccine targets, enhanced diagnostics and insights into the complex biology underlying host-parasite interactions. For the schistosome blood flukes, with the recent completion of genome sequencing and comprehensive transcriptomic datasets, there has accumulated massive amounts of gene sequence data, for which, in the vast majority of cases, little is known about actual functions within the intact organism. In this review we attempt to bring together traditional in vitro cultivation approaches and recent emergent technologies of molecular genomics, transcriptomics and genetic manipulation to illustrate the considerable progress made in our understanding of trematode gene expression and function during development of the intramolluscan larval stages. Using several prominent trematode families (Schistosomatidae, Fasciolidae, Echinostomatidae), we have focused on the current status of in vitro larval isolation/cultivation as a source of valuable raw material supporting gene discovery efforts in model digeneans that include whole genome sequencing, transcript and protein expression profiling during larval development, and progress made in the in vitro manipulation of genes and their expression in larval trematodes using transgenic and RNA interference (RNAi) approaches. PMID:19961646
The missing indels: an estimate of indel variation in a human genome and analysis of factors that impede detection

PubMed Central

Jiang, Yue; Turinsky, Andrei L.; Brudno, Michael

2015-01-01

With the development of High-Throughput Sequencing (HTS) thousands of human genomes have now been sequenced. Whenever different studies analyze the same genome they usually agree on the amount of single-nucleotide polymorphisms, but differ dramatically on the number of insertion and deletion variants (indels). Furthermore, there is evidence that indels are often severely under-reported. In this manuscript we derive the total number of indel variants in a human genome by combining data from different sequencing technologies, while assessing the indel detection accuracy. Our estimate of approximately 1 million indels in a Yoruban genome is much higher than the results reported in several recent HTS studies. We identify two key sources of difficulties in indel detection: the insufficient coverage, read length or alignment quality; and the presence of repeats, including short interspersed elements and homopolymers/dimers. We quantify the effect of these factors on indel detection. The quality of sequencing data plays a major role in improving indel detection by HTS methods. However, many indels exist in long homopolymers and repeats, where their detection is severely impeded. The true number of indel events is likely even higher than our current estimates, and new techniques and technologies will be required to detect them. PMID:26130710
Identification of sequence-structure RNA binding motifs for SELEX-derived aptamers.

PubMed

Hoinka, Jan; Zotenko, Elena; Friedman, Adam; Sauna, Zuben E; Przytycka, Teresa M

2012-06-15

Systematic Evolution of Ligands by EXponential Enrichment (SELEX) represents a state-of-the-art technology to isolate single-stranded (ribo)nucleic acid fragments, named aptamers, which bind to a molecule (or molecules) of interest via specific structural regions induced by their sequence-dependent fold. This powerful method has applications in designing protein inhibitors, molecular detection systems, therapeutic drugs and antibody replacement among others. However, full understanding and consequently optimal utilization of the process has lagged behind its wide application due to the lack of dedicated computational approaches. At the same time, the combination of SELEX with novel sequencing technologies is beginning to provide the data that will allow the examination of a variety of properties of the selection process. To close this gap we developed Aptamotif, a computational method for the identification of sequence-structure motifs in SELEX-derived aptamers. To increase the chances of identifying functional motifs, Aptamotif uses an ensemble-based approach. We validated the method using two published aptamer datasets containing experimentally determined motifs of increasing complexity. We were able to recreate the authors' findings to a high degree, thus proving the capability of our approach to identify binding motifs in SELEX data. Additionally, using our new experimental dataset, we illustrate the application of Aptamotif to elucidate several properties of the selection process.
Next-generation sequencing: the future of molecular genetics in poultry production and food safety.

PubMed

Diaz-Sanchez, S; Hanning, I; Pendleton, Sean; D'Souza, Doris

2013-02-01

The era of molecular biology and automation of the Sanger chain-terminator sequencing method has led to discovery and advances in diagnostics and biotechnology. The Sanger methodology dominated research for over 2 decades, leading to significant accomplishments and technological improvements in DNA sequencing. Next-generation high-throughput sequencing (HT-NGS) technologies were developed subsequently to overcome the limitations of this first-generation technology, offering higher speed, less labor, and lower cost. Various platforms developed include sequencing-by-synthesis 454 Life Sciences, Illumina (Solexa) sequencing, SOLiD sequencing (among others), and the Ion Torrent semiconductor sequencing technologies that use different detection principles. As technology advances, progress toward third-generation sequencing technologies is being reported, including Nanopore Sequencing and real-time monitoring of PCR activity through fluorescent resonant energy transfer. The advantages of these technologies include scalability, simplicity, increasing DNA polymerase performance and yields, lower error rates, and greater economic feasibility, with the eventual goal of obtaining real-time results. These technologies can be directly applied to improve poultry production and enhance food safety. For example, sequence-based (determination of the gut microbial community, genes for metabolic pathways, or presence of plasmids) and function-based (screening for function such as antibiotic resistance, or vitamin production) metagenomic analysis can be carried out. Gut microbial flora/communities of poultry can be sequenced to determine the changes that affect health and disease along with efficacy of methods to control pathogenic growth. Thus, the purpose of this review is to provide an overview of the principles of these current technologies and their potential application to improve poultry production and food safety as well as public health.
The Complete Chloroplast Genome Sequences of Five Epimedium Species: Lights into Phylogenetic and Taxonomic Analyses

PubMed Central

Zhang, Yanjun; Du, Liuwen; Liu, Ao; Chen, Jianjun; Wu, Li; Hu, Weiming; Zhang, Wei; Kim, Kyunghee; Lee, Sang-Choon; Yang, Tae-Jin; Wang, Ying

2016-01-01

Epimedium L. is a phylogenetically and economically important genus in the family Berberidaceae. We here sequenced the complete chloroplast (cp) genomes of four Epimedium species using Illumina sequencing technology via a combination of de novo and reference-guided assembly, which was also the first comprehensive cp genome analysis on Epimedium combining the cp genome sequence of E. koreanum previously reported. The five Epimedium cp genomes exhibited typical quadripartite and circular structure that was rather conserved in genomic structure and the synteny of gene order. However, these cp genomes presented obvious variations at the boundaries of the four regions because of the expansion and contraction of the inverted repeat (IR) region and the single-copy (SC) boundary regions. The trnQ-UUG duplication occurred in the five Epimedium cp genomes, which was not found in the other basal eudicotyledons. The rapidly evolving cp genome regions were detected among the five cp genomes, as well as the difference of simple sequence repeats (SSR) and repeat sequence were identified. Phylogenetic relationships among the five Epimedium species based on their cp genomes showed accordance with the updated system of the genus on the whole, but reminded that the evolutionary relationships and the divisions of the genus need further investigation applying more evidences. The availability of these cp genomes provided valuable genetic information for accurately identifying species, taxonomy and phylogenetic resolution and evolution of Epimedium, and assist in exploration and utilization of Epimedium plants. PMID:27014326
The application of the high throughput sequencing technology in the transposable elements.

PubMed

Liu, Zhen; Xu, Jian-hong

2015-09-01

High throughput sequencing technology has dramatically improved the efficiency of DNA sequencing, and decreased the costs to a great extent. Meanwhile, this technology usually has advantages of better specificity, higher sensitivity and accuracy. Therefore, it has been applied to the research on genetic variations, transcriptomics and epigenomics. Recently, this technology has been widely employed in the studies of transposable elements and has achieved fruitful results. In this review, we summarize the application of high throughput sequencing technology in the fields of transposable elements, including the estimation of transposon content, preference of target sites and distribution, insertion polymorphism and population frequency, identification of rare copies, transposon horizontal transfers as well as transposon tagging. We also briefly introduce the major common sequencing strategies and algorithms, their advantages and disadvantages, and the corresponding solutions. Finally, we envision the developing trends of high throughput sequencing technology, especially the third generation sequencing technology, and its application in transposon studies in the future, hopefully providing a comprehensive understanding and reference for related scientific researchers.
Single-cell sequencing technologies: current and future.

PubMed

Liang, Jialong; Cai, Wanshi; Sun, Zhongsheng

2014-10-20

Intensively developed in the last few years, single-cell sequencing technologies now present numerous advantages over traditional sequencing methods for solving the problems of biological heterogeneity and low quantities of available biological materials. The application of single-cell sequencing technologies has profoundly changed our understanding of a series of biological phenomena, including gene transcription, embryo development, and carcinogenesis. However, before single-cell sequencing technologies can be used extensively, researchers face the serious challenge of overcoming inherent issues of high amplification bias, low accuracy and reproducibility. Here, we simply summarize the techniques used for single-cell isolation, and review the current technologies used in single-cell genomic, transcriptomic, and epigenomic sequencing. We discuss the merits, defects, and scope of application of single-cell sequencing technologies and then speculate on the direction of future developments. Copyright © 2014 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Ltd. All rights reserved.
Aligner optimization increases accuracy and decreases compute times in multi-species sequence data.

PubMed

Robinson, Kelly M; Hawkins, Aziah S; Santana-Cruz, Ivette; Adkins, Ricky S; Shetty, Amol C; Nagaraj, Sushma; Sadzewicz, Lisa; Tallon, Luke J; Rasko, David A; Fraser, Claire M; Mahurkar, Anup; Silva, Joana C; Dunning Hotopp, Julie C

2017-09-01

As sequencing technologies have evolved, the tools to analyze these sequences have made similar advances. However, for multi-species samples, we observed important and adverse differences in alignment specificity and computation time for bwa-mem (Burrows-Wheeler aligner-maximum exact matches) relative to bwa-aln. Therefore, we sought to optimize bwa-mem for alignment of data from multi-species samples in order to reduce alignment time and increase the specificity of alignments. In the multi-species cases examined, there was one majority member (i.e. Plasmodium falciparum or Brugia malayi) and one minority member (i.e. human or the Wolbachia endosymbiont wBm) of the sequence data. Increasing bwa-mem seed length from the default value reduced the number of read pairs from the majority sequence member that incorrectly aligned to the reference genome of the minority sequence member. Combining both source genomes into a single reference genome increased the specificity of mapping, while also reducing the central processing unit (CPU) time. In Plasmodium, at a seed length of 18 nt, 24.1% of reads mapped to the human genome using 1.7±0.1 CPU hours, while 83.6% of reads mapped to the Plasmodium genome using 0.2±0.0 CPU hours (total: 107.7% reads mapping; in 1.9±0.1 CPU hours). In contrast, 97.1% of the reads mapped to a combined Plasmodium-human reference in only 0.7±0.0 CPU hours. Overall, the results suggest that combining all references into a single reference database and using a 23 nt seed length reduces the computational time, while maximizing specificity. Similar results were found for simulated sequence reads from a mock metagenomic data set. We found similar improvements to computation time in a publicly available human-only data set.
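The recommendation in this abstract (one combined reference, a 23 nt seed) could be scripted roughly as below. This is a minimal sketch, not the authors' pipeline: the file names are hypothetical, the helper functions are invented for illustration, and the code only builds the command line rather than running it. It assumes bwa's `-k` option (minimum seed length) and `-t` option (threads), and that `bwa index` has already been run on the combined FASTA.

```python
import shlex
from pathlib import Path


def combine_references(fasta_paths, combined_path):
    """Concatenate several FASTA files into one combined reference."""
    with open(combined_path, "w") as out:
        for path in fasta_paths:
            text = Path(path).read_text()
            # Make sure each file ends with a newline so records do not fuse.
            out.write(text if text.endswith("\n") else text + "\n")
    return combined_path


def bwa_mem_command(reference, reads_1, reads_2, seed_length=23, threads=1):
    """Build (but do not run) a bwa mem command line.

    -k sets the minimum seed length; -t sets the thread count.
    """
    return ["bwa", "mem", "-k", str(seed_length), "-t", str(threads),
            reference, reads_1, reads_2]


# Hypothetical file names, for illustration only.
cmd = bwa_mem_command("combined.fa", "reads_1.fq", "reads_2.fq")
print(shlex.join(cmd))
```

Returning the command as a list keeps it safe to pass to `subprocess.run` without shell quoting issues, should one choose to execute it.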
A Public Health Model for the Molecular Surveillance of HIV Transmission in San Diego, California

PubMed Central

May, Susanne; Tweeten, Samantha; Drumright, Lydia; Pacold, Mary E.; Kosakovsky Pond, Sergei L.; Pesano, Rick L.; Lie, Yolanda S.; Richman, Douglas D.; Frost, Simon D.W.; Woelk, Christopher H.; Little, Susan J.

2009-01-01

Background Current public health efforts often use molecular technologies to identify and contain communicable disease networks, but not for HIV. Here, we investigate how molecular epidemiology can be used to identify highly-related HIV networks within a population and how voluntary contact tracing of sexual partners can be used to selectively target these networks. Methods We evaluated the use of HIV-1 pol sequences obtained from participants of a community-recruited cohort (n=268) and a primary infection research cohort (n=369) to define highly related transmission clusters and the use of contact tracing to link other individuals (n=36) within these clusters. The presence of transmitted drug resistance was interpreted from the pol sequences (Calibrated Population Resistance v3.0). Results Phylogenetic clustering was conservatively defined when the genetic distance between any two pol sequences was <1%, which identified 34 distinct transmission clusters within the combined community-recruited and primary infection research cohorts containing 160 individuals. Although sequences from the epidemiologically-linked partners represented approximately 5% of the total sequences, they clustered with 60% of the sequences that clustered from the combined cohorts (O.R. 21.7; p=<0.01). Major resistance to at least one class of antiretroviral medication was found in 19% of clustering sequences. Conclusions Phylogenetic methods can be used to identify individuals who are within highly related transmission groups, and contact tracing of epidemiologically-linked partners of recently infected individuals can be used to link into previously-defined transmission groups. These methods could be used to implement selectively targeted prevention interventions. PMID:19098493
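The clustering rule described in this abstract (linking sequences whose genetic distance is below 1%) can be illustrated with a small sketch. It assumes pre-aligned sequences and uses a raw proportion of differing sites as the distance, which may differ from the distance measure used in the study; the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of differing sites between two aligned, equal-length sequences.

    Sites with a gap ('-') or ambiguous base ('N') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        differing += a != b
    return differing / compared if compared else 1.0


def clusters(sequences, threshold=0.01):
    """Group sequence ids linked by any pairwise distance below threshold."""
    parent = {name: name for name in sequences}

    def find(x):
        # Union-find lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sequences, 2):
        if p_distance(sequences[a], sequences[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in sequences:
        groups.setdefault(find(name), set()).add(name)
    # Singletons are not transmission clusters.
    return [g for g in groups.values() if len(g) > 1]


# Toy 200 nt sequences: s2 differs from s1 at 1 site, s3 at 8 sites.
base = "ACGT" * 50
toy = {"s1": base, "s2": "T" + base[1:], "s3": "G" * 10 + base[10:]}
print(clusters(toy))
```

Here only s1 and s2 fall under the 1% threshold, so they form the single reported cluster.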
Combinatorial Pooling Enables Selective Sequencing of the Barley Gene Space

PubMed Central

Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R.; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J.

2013-01-01

For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding. PMID:23592960
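The idea of pooling and deconvolution can be illustrated with a toy design. The sketch below is not the pooling design used in the paper: it simply gives each clone a distinct combination of pools and assigns a read to a clone when the set of pools in which the read is observed matches that clone's signature exactly. The function names and clone ids are hypothetical, and a practical design would also need to tolerate overlapping clones and sequencing errors.

```python
from itertools import combinations


def design_pools(clone_ids, n_pools, pools_per_clone):
    """Assign each clone a distinct combination of pools (its signature)."""
    signatures = combinations(range(n_pools), pools_per_clone)
    design = {}
    for clone, signature in zip(clone_ids, signatures):
        design[clone] = frozenset(signature)
    if len(design) < len(clone_ids):
        raise ValueError("not enough pool combinations for all clones")
    return design


def deconvolve(observed_pools, design):
    """Return the clone whose signature equals the observed pool set, else None."""
    lookup = {signature: clone for clone, signature in design.items()}
    return lookup.get(frozenset(observed_pools))


# Ten clones need only five pools when each clone goes into two of them,
# instead of ten separate barcoded libraries.
clone_ids = [f"BAC_{i}" for i in range(10)]
design = design_pools(clone_ids, n_pools=5, pools_per_clone=2)
print(deconvolve({1, 2}, design))     # read seen in pools 1 and 2
print(deconvolve({0, 1, 2}, design))  # ambiguous pattern, left unassigned
```

The saving comes from the number of combinations growing much faster than the number of pools, which is why pooling can replace one barcode per sample.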
Combinatorial pooling enables selective sequencing of the barley gene space.

PubMed

Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J

2013-04-01

For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.
Electromagnetic navigational bronchoscopy and robotic-assisted thoracic surgery.

PubMed

Christie, Sara

2014-06-01

With the use of electromagnetic navigational bronchoscopy and robotics, lung lesions can be diagnosed and resected during one surgical procedure. Global positioning system technology allows surgeons to identify and mark a thoracic tumor, and then robotics technology allows them to perform minimally invasive resection and cancer staging procedures. Nurses on the perioperative robotics team must consider the logistics of providing safe and competent care when performing combined procedures during one surgical encounter. Instrumentation, OR organization and room setup, and patient positioning are important factors to consider to complete the procedure systematically and efficiently. This revolutionary concept of combining navigational bronchoscopy with robotics requires a team of dedicated nurses to facilitate the sequence of events essential for providing optimal patient outcomes in highly advanced surgical procedures. Copyright © 2014 AORN, Inc. Published by Elsevier Inc. All rights reserved.
Single Cell Multi-Omics Technology: Methodology and Application.

PubMed

Hu, Youjin; An, Qin; Sheu, Katherine; Trejo, Brandon; Fan, Shuxin; Guo, Ying

2018-01-01

In the era of precision medicine, multi-omics approaches enable the integration of data from diverse omics platforms, providing multi-faceted insight into the interrelation of these omics layers on disease processes. Single cell sequencing technology can dissect the genotypic and phenotypic heterogeneity of bulk tissue and promises to deepen our understanding of the underlying mechanisms governing both health and disease. Through modification and combination of single cell assays available for transcriptome, genome, epigenome, and proteome profiling, single cell multi-omics approaches have been developed to simultaneously and comprehensively study not only the unique genotypic and phenotypic characteristics of single cells, but also the combined regulatory mechanisms evident only at single cell resolution. In this review, we summarize the state-of-the-art single cell multi-omics methods and discuss their applications, challenges, and future directions.
Single Cell Multi-Omics Technology: Methodology and Application

PubMed Central

Hu, Youjin; An, Qin; Sheu, Katherine; Trejo, Brandon; Fan, Shuxin; Guo, Ying

2018-01-01

In the era of precision medicine, multi-omics approaches enable the integration of data from diverse omics platforms, providing multi-faceted insight into the interrelation of these omics layers on disease processes. Single cell sequencing technology can dissect the genotypic and phenotypic heterogeneity of bulk tissue and promises to deepen our understanding of the underlying mechanisms governing both health and disease. Through modification and combination of single cell assays available for transcriptome, genome, epigenome, and proteome profiling, single cell multi-omics approaches have been developed to simultaneously and comprehensively study not only the unique genotypic and phenotypic characteristics of single cells, but also the combined regulatory mechanisms evident only at single cell resolution. In this review, we summarize the state-of-the-art single cell multi-omics methods and discuss their applications, challenges, and future directions. PMID:29732369
[OMICS AND BIG DATA, MAJOR ADVANCES TOWARDS PERSONALIZED MEDICINE OF THE FUTURE?].

PubMed

Scheen, A J

2015-01-01

The increasing interest in personalized medicine has evolved together with two major technological advances. First, the new-generation, rapid and less expensive DNA sequencing method, combined with remarkable progress in molecular biology leading to the post-genomic era (transcriptomics, proteomics, metabolomics). Second, the refinement of computing tools (IT), which allows the immediate analysis of a huge amount of data (especially those resulting from the omics approaches) and thus creates a new universe for medical research, that of analysis by computerized modelling. This article for scientific communication and popularization briefly describes the main advances in these two fields of interest. This technological progress is combined with advances in communication, which make possible the development of artificial intelligence. These major advances will most probably form the basis of the future personalized medicine.

Research progress of plant population genomics based on high-throughput sequencing.

PubMed

Wang, Yun-sheng

2016-08-01

Population genomics, a new paradigm for population genetics, combines the concepts and techniques of genomics with the theoretical system of population genetics and improves our understanding of microevolution through identification of site-specific and genome-wide effects using genotyping of genome-wide polymorphic sites. With the appearance and improvement of next-generation high-throughput sequencing technology, the number of plant species with complete genome sequences has increased rapidly, and large-scale resequencing has also been carried out in recent years. Parallel sequencing has also been done in some plant species without complete genome sequences. These studies have greatly promoted the development of population genomics and deepened our understanding of the genetic diversity, level of linkage disequilibrium, selection effect, demographic history and molecular mechanism of complex traits of relevant plant populations at a genomic level. In this review, I briefly introduce the concept and research methods of population genomics and summarize the research progress of plant population genomics based on high-throughput sequencing. I also discuss the prospects as well as existing problems of plant population genomics in order to provide references for related studies.

Applications and Case Studies of the Next-Generation Sequencing Technologies in Food, Nutrition and Agriculture.

USDA-ARS's Scientific Manuscript database

Next-generation sequencing technologies are able to produce high-throughput short sequence reads in a cost-effective fashion. The emergence of these technologies has not only facilitated genome sequencing but also changed the landscape of life sciences. Here I survey their major applications ranging...

Recent Applications of DNA Sequencing Technologies in Food, Nutrition and Agriculture

USDA-ARS's Scientific Manuscript database

Next-generation DNA sequencing technologies are able to produce millions of short sequence reads in a high-throughput, cost-effective fashion. The emergence of these technologies has not only facilitated genome sequencing but also changed the landscape of life sciences. This review surveys their rec...
[Sequencing technology in gene diagnosis and its application].

PubMed

Yibin, Guo

2014-11-01

The study of gene mutation is one of the hot topics in the field of life science nowadays, and the related detection methods and diagnostic technology have developed rapidly. Sequencing technology plays an indispensable role in the definite diagnosis and classification of genetic diseases. In this review, we summarize the research progress in sequencing technology, evaluate the advantages and disadvantages of the first to third generations of sequencing technology, and describe its application in gene diagnosis. We also offer forecasts and prospects on its development trend.
Global assessment of genomic variation in cattle by genome resequencing and high-throughput genotyping

PubMed Central

2011-01-01

Background Integration of genomic variation with phenotypic information is an effective approach for uncovering genotype-phenotype associations. This requires an accurate identification of the different types of variation in individual genomes. Results We report the integration of the whole genome sequence of a single Holstein Friesian bull with data from single nucleotide polymorphism (SNP) and comparative genomic hybridization (CGH) array technologies to determine a comprehensive spectrum of genomic variation. The performance of resequencing SNP detection was assessed by combining SNPs that were identified to be either in identity by descent (IBD) or in copy number variation (CNV) with results from SNP array genotyping. Coding insertions and deletions (indels) were found to be enriched for size in multiples of 3 and were located near the N- and C-termini of proteins. For larger indels, a combination of split-read and read-pair approaches proved to be complementary in finding different signatures. CNVs were identified on the basis of the depth of sequenced reads, and by using SNP and CGH arrays. Conclusions Our results provide high resolution mapping of diverse classes of genomic variation in an individual bovine genome and demonstrate that structural variation surpasses sequence variation as the main component of genomic variability. Better accuracy of SNP detection was achieved with little loss of sensitivity when algorithms that implemented mapping quality were used. IBD regions were found to be instrumental for calculating resequencing SNP accuracy, while SNP detection within CNVs tended to be less reliable. CNV discovery was affected dramatically by platform resolution and coverage biases. The combined data for this study showed that at a moderate level of sequencing coverage, an ensemble of platforms and tools can be applied together to maximize the accurate detection of sequence and structural variants. PMID:22082336
Applications of nanotechnology, next generation sequencing and microarrays in biomedical research.

PubMed

Elingaramil, Sauli; Li, Xiaolong; He, Nongyue

2013-07-01

Next-generation sequencing technologies, microarrays and advances in bionanotechnology have had an enormous impact on research within a short time frame. This impact appears certain to increase further as many biomedical institutions are now acquiring these prevailing new technologies. Beyond conventional sampling of genome content, wide-ranging applications are rapidly evolving for next-generation sequencing, microarrays and nanotechnology. To date, these technologies have been applied in a variety of contexts, including whole-genome sequencing, targeted resequencing and discovery of transcription factor binding sites, noncoding RNA expression profiling and molecular diagnostics. This paper thus discusses current applications of nanotechnology, next-generation sequencing technologies and microarrays in biomedical research and highlights the transforming potential these technologies offer.
Combining 3D structure of real video and synthetic objects

NASA Astrophysics Data System (ADS)

Kim, Man-Bae; Song, Mun-Sup; Kim, Do-Kyoon

1998-04-01

This paper presents a new approach to combining real video and synthetic objects. The purpose of this work is to use the proposed technology in the fields of advanced animation, virtual reality, games, and so forth. Computer graphics has been used in the fields previously mentioned. Recently, some applications have added real video to graphic scenes for the purpose of augmenting the realism that computer graphics lacks. This approach, called augmented or mixed reality, can produce a more realistic environment than the exclusive use of computer graphics. Our approach differs from virtual reality and augmented reality in that computer-generated graphic objects are combined with 3D structure extracted from monocular image sequences. The extraction of the 3D structure requires the estimation of 3D depth followed by the construction of a height map. Graphic objects are then combined with the height map. The realization of our proposed approach is carried out in the following steps: (1) We derive 3D structure from test image sequences. The extraction of the 3D structure requires the estimation of depth and the construction of a height map. Due to the contents of the test sequence, the height map represents the 3D structure. (2) The height map is modeled by Delaunay triangulation or Bezier surface and each planar surface is texture-mapped. (3) Finally, graphic objects are combined with the height map. Because the 3D structure of the height map is already known, Step (3) is easily manipulated. Following this procedure, we produced an animation video demonstrating the combination of the 3D structure and graphic models. Users can navigate the realistic 3D world whose associated image is rendered on the display monitor.
Treatment of mature landfill leachate by internal micro-electrolysis integrated with coagulation: a comparative study on a novel sequencing batch reactor based on zero valent iron.

PubMed

Ying, Diwen; Peng, Juan; Xu, Xinyan; Li, Kan; Wang, Yalin; Jia, Jinping

2012-08-30

A comparative study of treating mature landfill leachate with various treatment processes was conducted to investigate whether combining internal micro-electrolysis (IME) without aeration and IME with full aeration in one reactor was an efficient treatment for mature landfill leachate. A specifically designed novel sequencing batch internal micro-electrolysis reactor (SIME) with the latest automation technology was employed in the experiment. Experimental data showed that the combined processes obtained a high COD removal efficiency of 73.7 ± 1.3%, which was 15.2% and 24.8% higher than that of the IME with and without aeration, respectively. The SIME reactor also exhibited a COD removal efficiency of 86.1 ± 3.8% for mature landfill leachate in continuous operation, which is much higher (p<0.05) than that of the conventional treatments of electrolysis (22.8-47.0%), coagulation-sedimentation (18.5-22.2%), and the Fenton process (19.9-40.2%). The innovative concept behind this excellent performance is the combined effect of the reductive and oxidative processes of the IME, and the integration of electro-coagulation. Operating parameters, including the initial pH, Fe/C mass ratio, air flow rate, and addition of H2O2, were optimized. All results show that the SIME reactor is a promising and efficient technology for treating mature landfill leachate. Copyright © 2012 Elsevier B.V. All rights reserved.
Critical slowing down associated with regime shifts in the US housing market

NASA Astrophysics Data System (ADS)

Tan, James Peng Lung; Cheong, Siew Siew Ann

2014-02-01

Complex systems are described by a large number of variables with strong and nonlinear interactions. Such systems frequently undergo regime shifts. Combining insights from bifurcation theory in nonlinear dynamics and the theory of critical transitions in statistical physics, we know that critical slowing down and critical fluctuations occur close to such regime shifts. In this paper, we show how universal precursors expected from such critical transitions can be used to forecast regime shifts in the US housing market. In the housing permit, volume of homes sold and percentage of homes sold for gain data, we detected strong early warning signals associated with a sequence of coupled regime shifts, starting from a Subprime Mortgage Loans transition in 2003-2004 and ending with the Subprime Crisis in 2007-2008. Weaker signals of critical slowing down were also detected in the US housing market data during the 1997-1998 Asian Financial Crisis and the 2000-2001 Technology Bubble Crisis. Backed by various macroeconomic data, we propose a scenario whereby hot money flowing back into the US during the Asian Financial Crisis fueled the Technology Bubble. When the Technology Bubble collapsed in 2000-2001, the hot money then flowed into the US housing market, triggering the Subprime Mortgage Loans transition in 2003-2004 and an ensuing sequence of transitions. We showed how this sequence of coupled transitions unfolded in space and in time over the whole of the US.
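Critical slowing down and critical fluctuations are commonly tracked with two rolling-window statistics: lag-1 autocorrelation and variance, both of which tend to rise as a system approaches a transition. The following is a minimal sketch of those two indicators on a generic time series; it is not the authors' analysis, the function names are hypothetical, and practical use would typically detrend the data first.

```python
from statistics import mean, pvariance


def lag1_autocorrelation(values):
    """Lag-1 autocorrelation of a sequence (0.0 if the sequence is constant)."""
    mu = mean(values)
    denom = sum((v - mu) ** 2 for v in values)
    if denom == 0:
        return 0.0
    num = sum((a - mu) * (b - mu) for a, b in zip(values, values[1:]))
    return num / denom


def rolling_indicators(series, window):
    """Return (variance, lag-1 autocorrelation) for each sliding window."""
    out = []
    for start in range(len(series) - window + 1):
        chunk = series[start:start + window]
        out.append((pvariance(chunk), lag1_autocorrelation(chunk)))
    return out


# A rapidly alternating series has negative lag-1 autocorrelation,
# while a slowly drifting one has positive lag-1 autocorrelation.
alternating = [1, -1, 1, -1, 1, -1, 1, -1]
drifting = [0, 1, 2, 3, 4, 5, 6, 7]
print(lag1_autocorrelation(alternating), lag1_autocorrelation(drifting))
print(rolling_indicators(alternating + drifting, window=8)[-1])
```

A sustained upward trend in either indicator across successive windows is what would be read as an early warning signal.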
Nanofluidic Device with Embedded Nanopore

NASA Astrophysics Data System (ADS)

Zhang, Yuning; Reisner, Walter

2014-03-01

Nanofluidic-based devices are robust tools for biomolecular sensing and single DNA manipulation. Nanopore-based DNA sensing has attractive features that make it a leading candidate as a single-molecule DNA sequencing technology. Nanochannel-based extension of DNA, combined with enzymatic or denaturation-based barcoding schemes, is already a powerful approach for genome analysis. We believe that there is revolutionary potential in devices that combine nanochannels with nanopore detectors. In particular, due to the fast translocation of a DNA molecule through a standard nanopore configuration, there is an unfavorable trade-off between signal and sequence resolution. With a combined nanochannel-nanopore device, based on embedding a nanopore inside a nanochannel, we can in principle gain independent control over both DNA translocation speed and sensing signal, solving the key drawback of the standard nanopore configuration. We demonstrate that we can detect, using fluorescence microscopy, successful translocation of DNA from the nanochannel out through the nanopore, a possible method to 'select' a given barcode for further analysis. We also show that in equilibrium DNA will not escape through an embedded sub-persistence length nanopore until a certain voltage bias is added.
A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance.

PubMed

Thomsen, Martin Christen Frølund; Ahrenfeldt, Johanne; Cisneros, Jose Luis Bellod; Jurtz, Vanessa; Larsen, Mette Voldby; Hasman, Henrik; Aarestrup, Frank Møller; Lund, Ole

2016-01-01

Recent advances in whole genome sequencing have made the technology available for routine use in microbiological laboratories. However, a major obstacle for using this technology is the availability of simple and automatic bioinformatics tools. Based on previously published and already available web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the individual services. The reported results enable a rapid overview of the major results, and comparing that to the previously found results showed that the platform is reliable and able to correctly predict the species and find most of the expected genes automatically. In conclusion, a combined bioinformatics platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https://cge.cbs.dtu.dk/services/CGEpipeline-1.1 and it is the intention that it will continue to be expanded with new features as these become available.
Use of the Minion nanopore sequencer for rapid sequencing of avian influenza virus isolates

USDA-ARS's Scientific Manuscript database

A relatively new sequencing technology, the MinION nanopore sequencer, provides a platform that is smaller, faster, and cheaper than existing Next Generation Sequencing (NGS) technologies. The MinION sequences individual strands of DNA and can produce millions of sequencing reads. The cost of the s...
What Advances Are Being Made in DNA Sequencing?

MedlinePlus

... to identify genetic variations; both methods rely on new technologies that allow rapid sequencing of large amounts of ... describes the different sequencing technologies and what the new technologies have meant for the study of the genetic ...
Integrative workflows for metagenomic analysis

PubMed Central

Ladoukakis, Efthymios; Kolisis, Fragiskos N.; Chatziioannou, Aristotelis A.

2014-01-01

The rapid evolution of all sequencing technologies, described by the term Next Generation Sequencing (NGS), has revolutionized metagenomic analysis. They constitute a combination of high-throughput analytical protocols, coupled to delicate measuring techniques, in order to potentially discover, properly assemble and map allelic sequences to the correct genomes, achieving particularly high yields for only a fraction of the cost of traditional processes (i.e., Sanger). From a bioinformatic perspective, this boils down to many GB of data being generated from each single sequencing experiment, rendering data management, or even storage, a critical bottleneck for the overall analytical endeavor. The enormous complexity is further aggravated by the versatility of the processing steps available, represented by the numerous bioinformatic tools that are essential for each analytical task in order to fully unveil the genetic content of a metagenomic dataset. These disparate tasks range from simple, yet non-trivial, quality control of raw data to exceptionally complex protein annotation procedures, requiring a high level of expertise for their proper application or the neat implementation of the whole workflow. Furthermore, a bioinformatic analysis of such scale requires large computational resources, making cloud computing infrastructures the sole realistic solution. In this review article we discuss the different integrative bioinformatic solutions available that address the aforementioned issues, by performing a critical assessment of the available automated pipelines for data management, quality control, and annotation of metagenomic data, embracing various major sequencing technologies and applications. PMID:25478562
Customisation of the exome data analysis pipeline using a combinatorial approach.

PubMed

Pattnaik, Swetansu; Vaidyanathan, Srividya; Pooja, Durgad G; Deepak, Sa; Panda, Binay

2012-01-01

The advent of next generation sequencing (NGS) technologies has revolutionised the way biologists produce, analyse and interpret data. Although NGS platforms provide a cost-effective way to discover genome-wide variants from a single experiment, variants discovered by NGS need follow-up validation due to the high error rates associated with various sequencing chemistries. Recently, whole exome sequencing has been proposed as an affordable option compared to whole genome runs, but it still requires follow-up validation of all the novel exomic variants. Customarily, a consensus approach is used to overcome the systematic errors inherent to the sequencing technology, alignment and post-alignment variant detection algorithms. However, the aforementioned approach warrants the use of multiple sequencing chemistries, multiple alignment tools and multiple variant callers, which may not be viable in terms of time and money for individual investigators with limited informatics know-how. Biologists often lack the requisite training to deal with the huge amount of data produced by NGS runs and face difficulty in choosing from the list of freely available analytical tools for NGS data analysis. Hence, there is a need to customise the NGS data analysis pipeline to preferentially retain true variants by minimising the incidence of false positives and to make the choice of the right analytical tools easier. To this end, we have sampled different freely available tools used at the alignment and post-alignment stages, suggesting the use of the most suitable combination determined by a simple framework of pre-existing metrics to create significant datasets.
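The consensus approach mentioned in this abstract can be pictured as keeping only the variants that several independent call sets agree on. The following is a minimal sketch under stated assumptions: the function name, the tuple layout (chromosome, position, reference allele, alternate allele) and the `min_support` parameter are hypothetical and do not come from the paper or from any particular variant caller.

```python
from collections import Counter


def consensus_variants(call_sets, min_support=2):
    """Keep variants reported by at least `min_support` of the call sets.

    Each call set is an iterable of (chrom, pos, ref, alt) tuples, for
    example one set per aligner/caller combination (assumed layout).
    """
    support = Counter()
    for calls in call_sets:
        # Convert to a set so one caller counts at most once per variant.
        support.update(set(calls))
    return sorted(v for v, n in support.items() if n >= min_support)


if __name__ == "__main__":
    # Made-up calls, for illustration only.
    caller_a = [("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")]
    caller_b = [("chr1", 100, "A", "G"), ("chr2", 50, "G", "A")]
    caller_c = [("chr1", 100, "A", "G"), ("chr1", 200, "C", "T")]
    for variant in consensus_variants([caller_a, caller_b, caller_c]):
        print(variant)
```

Raising `min_support` trades sensitivity for fewer false positives, which is the balance the abstract describes when it speaks of preferentially retaining true variants.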
The future scalability of pH-based genome sequencers: A theoretical perspective

NASA Astrophysics Data System (ADS)

Go, Jonghyun; Alam, Muhammad A.

2013-10-01

Sequencing of the human genome is an essential prerequisite for personalized medicine and early prognosis of various genetic diseases. The state-of-the-art, high-throughput genome sequencing technologies provide improved sequencing; however, their reliance on relatively expensive optical detection schemes has prevented widespread adoption of the technology in routine care. In contrast, the recently announced pH-based electronic genome sequencers achieve fast sequencing at low cost because of their compatibility with current microelectronics technology. While the progress in technology development has been rapid, the physics of the sequencing chips and the potential for future scaling (and therefore, cost reduction) remain unexplored. In this article, we develop a theoretical framework and a scaling theory to explain the principle of operation of the pH-based sequencing chips and use the framework to explore various perceived scaling limits of the technology related to signal-to-noise ratio, well-to-well crosstalk, and sequencing accuracy. We also address several limitations inherent to the key steps of pH-based genome sequencers, which are widely shared by many other sequencing platforms on the market but have not been properly explained so far.
A genome resource to address mechanisms of developmental programming: determination of the fetal sheep heart transcriptome.

PubMed

Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P

2012-06-15

The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development.
A genome resource to address mechanisms of developmental programming: determination of the fetal sheep heart transcriptome

PubMed Central

Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P

2012-01-01

The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development. PMID:22508961
Detection of the CLOCK/BMAL1 heterodimer using a nucleic acid probe with cycling probe technology.

PubMed

Nakagawa, Kazuhiro; Yamamoto, Takuro; Yasuda, Akio

2010-09-15

An isothermal signal amplification technique for specific DNA sequences, known as cycling probe technology (CPT), has enabled rapid acquisition of genomic information. Here we report an analogous technique for the detection of an activated transcription factor, a transcription element-binding assay with fluorescent amplification by apurinic/apyrimidinic (AP) site lysis cycle (TEFAL). This simple amplification assay can detect activated transcription factors by using a unique nucleic acid probe containing a consensus binding sequence and an AP site, which enables the CPT reaction with AP endonuclease. In this article, we demonstrate that this method detects the functional CLOCK/BMAL1 heterodimer via the TEFAL probe containing the E-box consensus sequence to which the CLOCK/BMAL1 heterodimer binds. Using TEFAL combined with immunoassays, we measured oscillations in the amount of CLOCK/BMAL1 heterodimer in serum-stimulated HeLa cells. Furthermore, we succeeded in measuring the circadian accumulation of the functional CLOCK/BMAL1 heterodimer in human buccal mucosa cells. TEFAL contributes greatly to the study of transcription factor activation in mammalian tissues and cell extracts and is a powerful tool for less invasive investigation of human circadian rhythms. Copyright © 2010 Elsevier Inc. All rights reserved.
Computer-aided Detection of Prostate Cancer with MRI: Technology and Applications.

PubMed

Liu, Lizhi; Tian, Zhiqiang; Zhang, Zhenfeng; Fei, Baowei

2016-08-01

One in six men will develop prostate cancer in his lifetime. Early detection and accurate diagnosis of the disease can improve cancer survival and reduce treatment costs. Recently, imaging of prostate cancer has greatly advanced since the introduction of multiparametric magnetic resonance imaging (mp-MRI). Mp-MRI consists of T2-weighted sequences combined with functional sequences including dynamic contrast-enhanced MRI, diffusion-weighted MRI, and magnetic resonance spectroscopy imaging. Because of the large volume of data and variations in imaging sequences, detection can be affected by multiple factors such as observer variability and the visibility and complexity of the lesions. To improve quantitative assessment of the disease, various computer-aided detection systems have been designed to help radiologists in their clinical practice. This review paper presents an overview of the literature on computer-aided detection of prostate cancer with mp-MRI, including the technology and its applications. The aim of the survey is threefold: an introduction for those new to the field, an overview for those working in the field, and a reference for those searching for literature on a specific application. Copyright © 2016 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
Two-color, 30 second microwave-accelerated Metal-Enhanced Fluorescence DNA assays: a new Rapid Catch and Signal (RCS) technology.

PubMed

Dragan, Anatoliy I; Golberg, Karina; Elbaz, Amit; Marks, Robert; Zhang, Yongxia; Geddes, Chris D

2011-03-07

For analyses of DNA fragment sequences in solution we introduce a 2-color DNA assay, utilizing a combination of the Metal-Enhanced Fluorescence (MEF) effect and microwave-accelerated DNA hybridization. The assay is based on a new "Catch and Signal" technology, i.e. the simultaneous specific recognition of two target DNA sequences in one well by complementary anchor-ssDNAs, attached to silver island films (SiFs). It is shown that fluorescent labels (Alexa 488 and Alexa 594), covalently attached to ssDNA fragments, play the role of biosensor recognition probes, demonstrating strong response upon DNA hybridization, locating fluorophores in close proximity to silver NPs, which is ideal for MEF. Subsequently the emission dramatically increases, while the excited state lifetime decreases. It is also shown that 30s microwave irradiation of wells, containing DNA molecules, considerably (~1000-fold) speeds up the highly selective hybridization of DNA fragments at ambient temperature. The 2-color "Catch and Signal" DNA assay platform can radically expedite quantitative analysis of genome DNA sequences, creating a simple and fast bio-medical platform for nucleic acid analysis. Copyright © 2010 Elsevier B.V. All rights reserved.
Early Pleistocene occurrence of Acheulian technology in North China

NASA Astrophysics Data System (ADS)

Li, Xingwen; Ao, Hong; Dekkers, Mark J.; Roberts, Andrew P.; Zhang, Peng; Lin, Shan; Huang, Weiwen; Hou, Yamei; Zhang, Weihua; An, Zhisheng

2017-01-01

Acheulian tools with their associated level of cognizance heralded a major threshold in the evolution of hominin technology, culture and behavior. Thus, unraveling occurrence ages of Acheulian technology across different regions worldwide constitutes a key aspect of understanding the archeology of early human evolution. Here we present a magneto-cyclochronology for the Acheulian assemblage from Sanmenxia Basin, Loess Plateau, North China. Our results place a sequence of stable normal and reversed paleomagnetic polarities within a regional lithostratigraphic context. The Acheulian assemblage is dated to be older than the Matuyama-Brunhes boundary at 0.78 Ma, and is found in strata that are probably equivalent to a weak paleosol subunit within loess layer L9 in the Chinese loess-paleosol sequence, which corresponds to marine isotope stage (MIS) 23, a relatively subdued interglacial period with age range of ∼0.89-0.92 Ma. This age of ∼0.9 Ma implies that Acheulian stone tools were unambiguously present in North China during the Early Pleistocene. It distinctly enlarges the geographic distribution of Acheulian technology and brings its occurrence in North China back into the Early Pleistocene, which is contemporaneous with its first emergence in Europe. Combined with other archeological records, the larger area over which Acheulian technology existed in East Asia during the terminal Early Pleistocene has important implications for understanding early human occupation of North China.

[Target gene sequence capture and next generation sequencing technology to diagnose four children with Alagille syndrome].

PubMed

Gao, M L; Zhong, X M; Ma, X; Ning, H J; Zhu, D; Zou, J Z

2016-06-02

To make a genetic diagnosis of Alagille syndrome (ALGS) patients using target gene sequence capture and next generation sequencing technology. Target gene sequence capture and next generation sequencing were used to analyze the ALGS genes of 4 patients. They were hospitalized at the Affiliated Hospital, Capital Institute of Pediatrics between January 2014 and December 2015; 2 cases had a clinical diagnosis of typical ALGS and 2 of atypical ALGS. Blood samples were collected from patients and their parents, and genomic DNA was extracted from lymphocytes. Target gene sequence capture and next generation sequencing were performed. Sanger sequencing was used to confirm the results in the patients and their parents. Cholestasis, heart defects, inverted triangular face and butterfly vertebrae were the main clinical features in the 4 male patients. Age at first hospital visit ranged from 3 months and 14 days to 3 years and 1 month. The age of onset ranged from 3 days to 42 days (median 23 days). According to the clinical diagnostic criteria of ALGS, patient 1 and patient 2 were considered typical ALGS. The other 2 patients were considered atypical ALGS. Four Jagged 1 (JAG1) pathogenic mutations were detected. Three different missense mutations were detected in patients 1 to 3 with ALGS (c.839C>T (p.W280X), c.703G>A (p.R235X), c.1720C>T (p.V574M)). The JAG1 mutation of patient 3 was reported for the first time. Patient 4 had one novel insertion mutation (c.1779_1780insA (p.Ile594AsnfsTer23)). Parental analysis verified that the JAG1 missense mutations of the 3 patients were de novo. The results of Sanger sequencing were consistent with the results of next generation sequencing. Target gene sequence capture combined with next generation sequencing can detect two pathogenic genes in ALGS and simultaneously test genes of other related diseases in infantile cholestatic diseases, offering high throughput, high efficiency and low cost. It may support molecular diagnosis and treatment decisions for clinicians and has good prospects for clinical application.
Improving livestock for agriculture - technological progress from random transgenesis to precision genome editing heralds a new era.

PubMed

Laible, Götz; Wei, Jingwei; Wagner, Stefan

2015-01-01

Humans have a long history in shaping the genetic makeup of livestock to optimize production and meet growing human demands for food and other animal products. Until recently, this has only been possible through traditional breeding and selection, which is a painstakingly slow process of accumulating incremental gains over a long period. The development of transgenic livestock technology offers a more direct approach with the possibility for making genetic improvements with greater impact and within a single generation. However, initially the technology was hampered by technical difficulties and limitations, which have now largely been overcome by progressive improvements over the past 30 years. Particularly, the advent of genome editing in combination with homologous recombination has added a new level of efficiency and precision that holds much promise for the genetic improvement of livestock using the increasing knowledge of the phenotypic impact of genetic sequence variants. So far not a single line of transgenic livestock has gained approval for commercialization. The step change to genome-edited livestock with precise sequence changes may accelerate the path to market, provided applications of this new technology for agriculture can deliver, in addition to economic incentives for producers, also compelling benefits for animals, consumers, and the environment. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
An innovative diagnostic technology for the codon mutation C580Y in kelch13 of Plasmodium falciparum with MinION nanopore sequencer.

PubMed

Imai, Kazuo; Tarumoto, Norihito; Runtuwene, Lucky Ronald; Sakai, Jun; Hayashida, Kyoko; Eshita, Yuki; Maeda, Ryuichiro; Tuda, Josef; Ohno, Hideaki; Murakami, Takashi; Maesaki, Shigefumi; Suzuki, Yutaka; Yamagishi, Junya; Maeda, Takuya

2018-05-29

The recent spread of artemisinin (ART)-resistant Plasmodium falciparum represents an emerging global threat to public health. In Southeast Asia, the C580Y mutation of kelch13 (k13) is the dominant mutation of ART-resistant P. falciparum. Therefore, a simple method for the detection of C580Y mutation is urgently needed to enable widespread routine surveillance in the field. The aim of this study is to develop a new diagnostic procedure for the C580Y mutation using loop-mediated isothermal amplification (LAMP) combined with the MinION nanopore sequencer. A LAMP assay for the k13 gene of P. falciparum to detect the C580Y mutation was successfully developed. The detection limit of this procedure was 10 copies of the reference plasmid harboring the k13 gene within 60 min. Thereafter, amplicon sequencing of the LAMP products using the MinION nanopore sequencer was performed to clarify the nucleotide sequences of the gene. The C580Y mutation was identified based on the sequence data collected from MinION reads 30 min after the start of sequencing. Further, clinical evaluation of the LAMP assay in 34 human blood samples collected from patients with P. falciparum malaria in Indonesia revealed a positive detection rate of 100%. All LAMP amplicons of up to 12 specimens were simultaneously sequenced using MinION. The results of sequencing were consistent with those of the conventional PCR and Sanger sequencing protocol. All procedures from DNA extraction to variant calling were completed within 3 h. The C580Y mutation was not found among these 34 P. falciparum isolates in Indonesia. An innovative method combining LAMP and MinION will enable simple, rapid, and high-sensitivity detection of the C580Y mutation of P. falciparum, even in resource-limited situations in developing countries.
Across the Gap: Geochronological and Sedimentological Analyses from the Late Pleistocene-Holocene Sequence of Goda Buticha, Southeastern Ethiopia

PubMed Central

Asrat, Asfawossen; Bahain, Jean-Jacques; Chapon, Cécile; Douville, Eric; Fragnol, Carole; Hernandez, Marion; Hovers, Erella; Leplongeon, Alice; Martin, Loïc; Pleurdeau, David; Pearson, Osbjorn; Puaud, Simon; Assefa, Zelalem

2017-01-01

Goda Buticha is a cave site near Dire Dawa in southeastern Ethiopia that contains an archaeological sequence sampling the late Pleistocene and Holocene of the region. The sedimentary sequence displays complex cultural, chronological and sedimentological histories that seem incongruent with one another. A first set of radiocarbon ages suggested a long sedimentological gap from the end of Marine Isotopic Stage (MIS) 3 to the mid-Holocene. Macroscopic observations suggest that the main sedimentological change does not coincide with the chronostratigraphic hiatus. The cultural sequence shows technological continuity with a late persistence of artifacts that are usually attributed to the Middle Stone Age into the younger parts of the stratigraphic sequence, yet become increasingly associated with lithic artifacts typically related to the Later Stone Age. While not a unique case, this combination of features is unusual in the Horn of Africa. In order to evaluate the possible implications of these observations, sedimentological analyses combined with optically stimulated luminescence (OSL) were conducted. The OSL data now extend the radiocarbon chronology up to 63 ± 7 ka; they also confirm the existence of the chronological gap between 24.8 ± 2.6 ka and 7.5 ± 0.3 ka. The sedimentological analyses suggest that the origin and mode of deposition were largely similar throughout the whole sequence, although the anthropic and faunal activities increased in the younger levels. Regional climatic records are used to support the sedimentological observations and interpretations. We discuss the implications of the sedimentological and dating analyses for understanding cultural processes in the region. PMID:28125597
Across the Gap: Geochronological and Sedimentological Analyses from the Late Pleistocene-Holocene Sequence of Goda Buticha, Southeastern Ethiopia.

PubMed

Tribolo, Chantal; Asrat, Asfawossen; Bahain, Jean-Jacques; Chapon, Cécile; Douville, Eric; Fragnol, Carole; Hernandez, Marion; Hovers, Erella; Leplongeon, Alice; Martin, Loïc; Pleurdeau, David; Pearson, Osbjorn; Puaud, Simon; Assefa, Zelalem

2017-01-01

Goda Buticha is a cave site near Dire Dawa in southeastern Ethiopia that contains an archaeological sequence sampling the late Pleistocene and Holocene of the region. The sedimentary sequence displays complex cultural, chronological and sedimentological histories that seem incongruent with one another. A first set of radiocarbon ages suggested a long sedimentological gap from the end of Marine Isotopic Stage (MIS) 3 to the mid-Holocene. Macroscopic observations suggest that the main sedimentological change does not coincide with the chronostratigraphic hiatus. The cultural sequence shows technological continuity with a late persistence of artifacts that are usually attributed to the Middle Stone Age into the younger parts of the stratigraphic sequence, yet become increasingly associated with lithic artifacts typically related to the Later Stone Age. While not a unique case, this combination of features is unusual in the Horn of Africa. In order to evaluate the possible implications of these observations, sedimentological analyses combined with optically stimulated luminescence (OSL) were conducted. The OSL data now extend the radiocarbon chronology up to 63 ± 7 ka; they also confirm the existence of the chronological gap between 24.8 ± 2.6 ka and 7.5 ± 0.3 ka. The sedimentological analyses suggest that the origin and mode of deposition were largely similar throughout the whole sequence, although the anthropic and faunal activities increased in the younger levels. Regional climatic records are used to support the sedimentological observations and interpretations. We discuss the implications of the sedimentological and dating analyses for understanding cultural processes in the region.
Combining Next Generation Sequencing with Bulked Segregant Analysis to Fine Map a Stem Moisture Locus in Sorghum (Sorghum bicolor L. Moench).

PubMed

Han, Yucui; Lv, Peng; Hou, Shenglin; Li, Suying; Ji, Guisu; Ma, Xue; Du, Ruiheng; Liu, Guoqing

2015-01-01

Sorghum is one of the most promising bioenergy crops. Stem juice yield, together with stem sugar concentration, determines sugar yield in sweet sorghum. Bulked segregant analysis (BSA) is a gene mapping technique for identifying genomic regions containing genetic loci affecting a trait of interest that when combined with deep sequencing could effectively accelerate the gene mapping process. In this study, a dry stem sorghum landrace was characterized and the stem water controlling locus, qSW6, was fine mapped using QTL analysis and the combined BSA and deep sequencing technologies. Results showed that: (i) In sorghum variety Jiliang 2, stem water content was around 80% before flowering stage. It dropped to 75% during grain filling with little difference between different internodes. In landrace G21, stem water content keeps dropping after the flag leaf stage. The drop from 71% at flowering time progressed to 60% at grain filling time. Large differences exist between different internodes with the lowest (51%) at the 7th and 8th internodes at dough stage. (ii) A quantitative trait locus (QTL) controlling stem water content mapped on chromosome 6 between SSR markers Ch6-2 and gpsb069 explained about 34.7-56.9% of the phenotypic variation for the 5th to 10th internodes, respectively. (iii) BSA and deep sequencing analysis narrowed the associated region to 339 kb containing 38 putative genes. The results could help reveal molecular mechanisms underlying juice yield of sorghum and thus to improve total sugar yield.
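Combining bulked segregant analysis with deep sequencing is often done by comparing allele frequencies between the two bulks at each SNP. One widely used statistic is the SNP-index (alternate-allele reads divided by total reads) and its difference between bulks. The abstract does not state which statistic the authors used, so the sketch below is only an assumed illustration; the function names and the read counts are hypothetical.

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads carrying the alternate allele (None if no coverage)."""
    if total_reads == 0:
        return None
    return alt_reads / total_reads


def delta_snp_index(bulk_one, bulk_two):
    """Difference in SNP-index between two bulks at one SNP.

    Each argument is an (alt_reads, total_reads) pair. Values near +1 or -1
    suggest linkage to the trait; values near 0 suggest no linkage.
    """
    first = snp_index(*bulk_one)
    second = snp_index(*bulk_two)
    if first is None or second is None:
        return None
    return first - second


if __name__ == "__main__":
    # Made-up read counts at three SNPs: (bulk one, bulk two).
    snps = {
        "snp_a": ((15, 20), (5, 20)),
        "snp_b": ((10, 20), (11, 22)),
        "snp_c": ((0, 0), (8, 16)),
    }
    for name, (one, two) in snps.items():
        print(name, delta_snp_index(one, two))
```

In practice the statistic is usually averaged over sliding windows along the chromosome, and the window with the strongest signal defines the candidate interval.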
The salt-responsive transcriptome of chickpea roots and nodules via deepSuperSAGE

PubMed Central

2011-01-01

Background The combination of high-throughput transcript profiling and next-generation sequencing technologies is a prerequisite for genome-wide comprehensive transcriptome analysis. Our recent innovation of deepSuperSAGE is based on an advanced SuperSAGE protocol and its combination with massively parallel pyrosequencing on Roche's 454 sequencing platform. As a demonstration of the power of this combination, we have chosen the salt stress transcriptomes of roots and nodules of the third most important legume crop chickpea (Cicer arietinum L.). While our report is more technology-oriented, it nevertheless addresses a major world-wide problem for crops generally: high salinity. Together with low temperatures and water stress, high salinity is responsible for crop losses of millions of tons of various legume (and other) crops. Continuously deteriorating environmental conditions will combine with salinity stress to further compromise crop yields. As a good example for such stress-exposed crop plants, we started to characterize salt stress responses of chickpeas on the transcriptome level. Results We used deepSuperSAGE to detect early global transcriptome changes in salt-stressed chickpea. The salt stress responses of 86,919 transcripts representing 17,918 unique 26 bp deepSuperSAGE tags (UniTags) from roots of the salt-tolerant variety INRAT-93 two hours after treatment with 25 mM NaCl were characterized. Additionally, the expression of 57,281 transcripts representing 13,115 UniTags was monitored in nodules of the same plants. From a total of 144,200 analyzed 26 bp tags in roots and nodules together, 21,401 unique transcripts were identified. Of these, only 363 and 106 specific transcripts, respectively, were commonly up- or down-regulated (>3.0-fold) under salt stress in both organs, witnessing a differential organ-specific response to stress. 
Profiting from recent pioneer works on massive cDNA sequencing in chickpea, more than 9,400 UniTags were able to be linked to UniProt entries. Additionally, gene ontology (GO) categories over-representation analysis enabled to filter out enriched biological processes among the differentially expressed UniTags. Subsequently, the gathered information was further cross-checked with stress-related pathways. From several filtered pathways, here we focus exemplarily on transcripts associated with the generation and scavenging of reactive oxygen species (ROS), as well as on transcripts involved in Na+ homeostasis. Although both processes are already very well characterized in other plants, the information generated in the present work is of high value. Information on expression profiles and sequence similarity for several hundreds of transcripts of potential interest is now available. Conclusions This report demonstrates, that the combination of the high-throughput transcriptome profiling technology SuperSAGE with one of the next-generation sequencing platforms allows deep insights into the first molecular reactions of a plant exposed to salinity. Cross validation with recent reports enriched the information about the salt stress dynamics of more than 9,000 chickpea ESTs, and enlarged their pool of alternative transcripts isoforms. As an example for the high resolution of the employed technology that we coin deepSuperSAGE, we demonstrate that ROS-scavenging and -generating pathways undergo strong global transcriptome changes in chickpea roots and nodules already 2 hours after onset of moderate salt stress (25 mM NaCl). Additionally, a set of more than 15 candidate transcripts are proposed to be potential components of the salt overly sensitive (SOS) pathway in chickpea. Newly identified transcript isoforms are potential targets for breeding novel cultivars with high salinity tolerance. 
We demonstrate that these targets can be integrated into breeding schemes by micro-arrays and RT-PCR assays downstream of the generation of 26 bp tags by SuperSAGE. PMID:21320317
The salt-responsive transcriptome of chickpea roots and nodules via deepSuperSAGE.

PubMed

Molina, Carlos; Zaman-Allah, Mainassara; Khan, Faheema; Fatnassi, Nadia; Horres, Ralf; Rotter, Björn; Steinhauer, Diana; Amenc, Laurie; Drevon, Jean-Jacques; Winter, Peter; Kahl, Günter

2011-02-14

The combination of high-throughput transcript profiling and next-generation sequencing technologies is a prerequisite for genome-wide comprehensive transcriptome analysis. Our recent innovation of deepSuperSAGE is based on an advanced SuperSAGE protocol and its combination with massively parallel pyrosequencing on Roche's 454 sequencing platform. As a demonstration of the power of this combination, we have chosen the salt stress transcriptomes of roots and nodules of the third most important legume crop chickpea (Cicer arietinum L.). While our report is more technology-oriented, it nevertheless addresses a major world-wide problem for crops generally: high salinity. Together with low temperatures and water stress, high salinity is responsible for crop losses of millions of tons of various legume (and other) crops. Continuously deteriorating environmental conditions will combine with salinity stress to further compromise crop yields. As a good example for such stress-exposed crop plants, we started to characterize salt stress responses of chickpeas on the transcriptome level. We used deepSuperSAGE to detect early global transcriptome changes in salt-stressed chickpea. The salt stress responses of 86,919 transcripts representing 17,918 unique 26 bp deepSuperSAGE tags (UniTags) from roots of the salt-tolerant variety INRAT-93 two hours after treatment with 25 mM NaCl were characterized. Additionally, the expression of 57,281 transcripts representing 13,115 UniTags was monitored in nodules of the same plants. From a total of 144,200 analyzed 26 bp tags in roots and nodules together, 21,401 unique transcripts were identified. Of these, only 363 and 106 specific transcripts, respectively, were commonly up- or down-regulated (>3.0-fold) under salt stress in both organs, witnessing a differential organ-specific response to stress. Profiting from recent pioneer works on massive cDNA sequencing in chickpea, more than 9,400 UniTags were able to be linked to UniProt entries. 
Additionally, gene ontology (GO) categories over-representation analysis enabled to filter out enriched biological processes among the differentially expressed UniTags. Subsequently, the gathered information was further cross-checked with stress-related pathways. From several filtered pathways, here we focus exemplarily on transcripts associated with the generation and scavenging of reactive oxygen species (ROS), as well as on transcripts involved in Na+ homeostasis. Although both processes are already very well characterized in other plants, the information generated in the present work is of high value. Information on expression profiles and sequence similarity for several hundreds of transcripts of potential interest is now available. This report demonstrates, that the combination of the high-throughput transcriptome profiling technology SuperSAGE with one of the next-generation sequencing platforms allows deep insights into the first molecular reactions of a plant exposed to salinity. Cross validation with recent reports enriched the information about the salt stress dynamics of more than 9,000 chickpea ESTs, and enlarged their pool of alternative transcripts isoforms. As an example for the high resolution of the employed technology that we coin deepSuperSAGE, we demonstrate that ROS-scavenging and -generating pathways undergo strong global transcriptome changes in chickpea roots and nodules already 2 hours after onset of moderate salt stress (25 mM NaCl). Additionally, a set of more than 15 candidate transcripts are proposed to be potential components of the salt overly sensitive (SOS) pathway in chickpea. Newly identified transcript isoforms are potential targets for breeding novel cultivars with high salinity tolerance. We demonstrate that these targets can be integrated into breeding schemes by micro-arrays and RT-PCR assays downstream of the generation of 26 bp tags by SuperSAGE.
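The >3.0-fold criterion mentioned in this record can be illustrated with a small sketch. This is not the authors' pipeline: the normalization to tags per million, the pseudocount, and the function names `tags_per_million` and `regulated_tags` are assumptions made only to show how tag counts from two libraries might be compared against a fold-change threshold.

```python
def tags_per_million(counts):
    """Scale raw tag counts so libraries of different depth are comparable."""
    total = sum(counts.values())
    return {tag: 1e6 * n / total for tag, n in counts.items()}

def regulated_tags(control, treated, threshold=3.0, pseudo=1.0):
    """Label tags whose normalized abundance changes more than `threshold`-fold.

    `pseudo` is an assumed pseudocount that avoids division by zero for
    tags seen in only one library.
    """
    c = tags_per_million(control)
    t = tags_per_million(treated)
    result = {}
    for tag in set(c) | set(t):
        ratio = (t.get(tag, 0.0) + pseudo) / (c.get(tag, 0.0) + pseudo)
        if ratio > threshold:
            result[tag] = "up"
        elif ratio < 1.0 / threshold:
            result[tag] = "down"
    return result

# Hypothetical toy counts, not data from the study.
control = {"tagA": 10, "tagB": 50, "tagC": 40}
treated = {"tagA": 50, "tagB": 10, "tagC": 40}
calls = regulated_tags(control, treated)
```

Tags whose ratio stays within the threshold in both directions are simply left out of the result.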
Combining one-step Sanger sequencing with phasing probe hybridization for HLA class I typing yields rapid, G-group resolution predicting 99% of unique full length protein sequences.

PubMed

Tu, Bin; Masaberg, Carly; Hou, Lihua; Behm, Daniel; Brescia, Peter; Cha, Nuri; Kariyawasam, Kanthi; Lee, Jar How; Nong, Thoa; Sells, John; Tausch, Paul; Yang, Ruyan; Ng, Jennifer; Hurley, Carolyn Katovich

2017-02-01

Sanger-based DNA sequencing of exons 2+3 of HLA class I alleles from a heterozygote frequently results in two or more alternative genotypes. This study was undertaken to reduce the time and effort required to produce a single high resolution HLA genotype. Samples were typed in parallel by Sanger sequencing and oligonucleotide probe hybridization. This workflow, together with optimization of analysis software, was tested and refined during the typing of over 42,000 volunteers for an unrelated hematopoietic progenitor cell donor registry. Next generation DNA sequencing (NGS) was applied to over 1000 of these samples to identify the alleles present within the G group designations. Single genotypes at G level resolution were obtained for over 95% of the loci without additional assays. The vast majority of alleles identified (>99%) were the primary allele giving the G groups their name. Only 0.7% of the alleles identified encoded protein variants that were not detected by a focus on the antigen recognition domain (ARD)-encoding exons. Our combined method routinely provides biologically relevant typing resolution at the level of the ARD. It can be applied to both single samples or to large volume typing supporting either bone marrow or solid organ transplantation using technologies currently available in many HLA laboratories. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Targeted gene panel sequencing in children with very early onset inflammatory bowel disease--evaluation and prospective analysis.

PubMed

Kammermeier, Jochen; Drury, Suzanne; James, Chela T; Dziubak, Robert; Ocaka, Louise; Elawad, Mamoun; Beales, Philip; Lench, Nicholas; Uhlig, Holm H; Bacchelli, Chiara; Shah, Neil

2014-11-01

Multiple monogenetic conditions with partially overlapping phenotypes can present with inflammatory bowel disease (IBD)-like intestinal inflammation. With novel genotype-specific therapies emerging, establishing a molecular diagnosis is becoming increasingly important. We have introduced targeted next-generation sequencing (NGS) technology as a prospective screening tool in children with very early onset IBD (VEOIBD). We evaluated the coverage of 40 VEOIBD genes in two separate cohorts undergoing targeted gene panel sequencing (TGPS) (n=25) and whole exome sequencing (WES) (n=20). TGPS revealed causative mutations in four genes (IL10RA, EPCAM, TTC37 and SKIV2L), discovered unexpected phenotypes and directly influenced clinical decision making by supporting as well as avoiding haematopoietic stem cell transplantation. TGPS resulted in significantly higher median coverage when compared with WES, fewer coverage deficiencies and improved variant detection across established VEOIBD genes. Excluding or confirming known VEOIBD genotypes should be considered early in the disease course in all cases of therapy-refractory VEOIBD, as it can have a direct impact on patient management. Combining both described NGS technologies would compensate for the limitations of WES for disease-specific application while offering the opportunity for novel gene discovery in the research setting. Published by the BMJ Publishing Group Limited.
RNAi screen for rapid therapeutic target identification in leukemia patients

PubMed Central

Tyner, Jeffrey W.; Deininger, Michael W.; Loriaux, Marc M.; Chang, Bill H.; Gotlib, Jason R.; Willis, Stephanie G.; Erickson, Heidi; Kovacsovics, Tibor; O'Hare, Thomas; Heinrich, Michael C.; Druker, Brian J.

2009-01-01

Targeted therapy has vastly improved outcomes in certain types of cancer. Extension of this paradigm across a broad spectrum of malignancies will require an efficient method to determine the molecular vulnerabilities of cancerous cells. Improvements in sequencing technology will soon enable high-throughput sequencing of entire genomes of cancer patients; however, determining the relevance of identified sequence variants will require complementary functional analyses. Here, we report an RNAi-assisted protein target identification (RAPID) technology that individually assesses targeting of each member of the tyrosine kinase gene family. We demonstrate that RAPID screening of primary leukemia cells from 30 patients identifies targets that are critical to survival of the malignant cells from 10 of these individuals. We identify known, activating mutations in JAK2 and K-RAS, as well as patient-specific sensitivity to down-regulation of FLT1, CSF1R, PDGFR, ROR1, EPHA4/5, JAK1/3, LMTK3, LYN, FYN, PTK2B, and N-RAS. We also describe a previously undescribed, somatic, activating mutation in the thrombopoietin receptor that is sensitive to down-stream pharmacologic inhibition. Hence, the RAPID technique can quickly identify molecular vulnerabilities in malignant cells. Combination of this technique with whole-genome sequencing will represent an ideal tool for oncogenic target identification such that specific therapies can be matched with individual patients. PMID:19433805
Cellular Reprogramming, Genome Editing, and Alternative CRISPR Cas9 Technologies for Precise Gene Therapy of Duchenne Muscular Dystrophy

PubMed Central

Xu, Huaigeng

2017-01-01

In the past decade, the development of two innovative technologies, namely, induced pluripotent stem cells (iPSCs) and the CRISPR Cas9 system, has enabled researchers to model diseases derived from patient cells and precisely edit DNA sequences of interest, respectively. In particular, Duchenne muscular dystrophy (DMD) has been an exemplary monogenic disease model for combining these technologies to demonstrate that genome editing can correct genetic mutations in DMD patient-derived iPSCs. DMD is an X-linked genetic disorder caused by mutations that disrupt the open reading frame of the dystrophin gene, which plays a critical role in stabilizing muscle cells during contraction and relaxation. The CRISPR Cas9 system has been shown to be capable of targeting the dystrophin gene and rescuing its expression in in vitro patient-derived iPSCs and in vivo DMD mouse models. In this review, we highlight recent advances made using the CRISPR Cas9 system to correct genetic mutations and discuss how emerging CRISPR technologies and iPSCs in a combined platform can play a role in bringing a therapy for DMD closer to the clinic. PMID:28607562
Attomole-level Genomics with Single-molecule Direct DNA, cDNA and RNA Sequencing Technologies.

PubMed

Ozsolak, Fatih

2016-01-01

With the introduction of next-generation sequencing (NGS) technologies in 2005, the domination of microarrays in genomics quickly came to an end due to NGS's superior technical performance and cost advantages. By enabling genetic analysis capabilities that were not possible previously, NGS technologies have started to play an integral role in all areas of biomedical research. This chapter outlines the low-quantity DNA and cDNA sequencing capabilities and applications developed with the Helicos single molecule DNA sequencing technology.
Small indels induced by CRISPR/Cas9 in the 5' region of microRNA lead to its depletion and Drosha processing retardance.

PubMed

Jiang, Qian; Meng, Xing; Meng, Lingwei; Chang, Nannan; Xiong, Jingwei; Cao, Huiqing; Liang, Zicai

2014-01-01

MicroRNA knockout by genome editing technologies is promising. In order to extend the application of the technology and to investigate the function of a specific miRNA, we used CRISPR/Cas9 to deplete human miR-93 from a cluster by targeting its 5' region in HeLa cells. Various small indels were induced in the targeted region containing the Drosha processing site and seed sequences. Interestingly, we found that even a single nucleotide deletion led to complete knockout of the target miRNA with high specificity. Functional knockout was confirmed by phenotype analysis. Furthermore, de novo microRNAs were not found by RNA-seq. Nevertheless, expression of the pri-microRNAs was increased. When combined with structural analysis, the data indicated that biogenesis was impaired. Altogether, we showed that small indels in the 5' region of a microRNA result in sequence depletion as well as Drosha processing retard.
New technology and resources for cryptococcal research

PubMed Central

Zhang, Nannan; Park, Yoon-Dong; Williamson, Peter R.

2014-01-01

Rapid advances in molecular biology and genome sequencing have enabled the generation of new technology and resources for cryptococcal research. RNAi-mediated specific gene knockdown has become routine and more efficient by utilizing modified shRNA plasmids and convergent promoter RNAi constructs. This system was recently applied in a high-throughput screen to identify genes involved in host-pathogen interactions. Gene deletion efficiencies have also been improved by increasing rates of homologous recombination through a number of approaches, including a combination of double-joint PCR with split-marker transformation, the use of dominant selectable markers and the introduction of Cre-loxP systems into Cryptococcus. Moreover, visualization of cryptococcal proteins has become more facile using fusions with codon-optimized fluorescent tags, such as green or red fluorescent proteins or mCherry. Using recent genome-wide analytical tools, new transcriptional factors and regulatory proteins have been identified in novel virulence-related signaling pathways by employing microarray analysis, RNA-sequencing and proteomic analysis. PMID:25460849
Advanced Virus Detection Technologies Interest Group (AVDTIG): Efforts on High Throughput Sequencing (HTS) for Virus Detection.

PubMed

Khan, Arifa S; Vacante, Dominick A; Cassart, Jean-Pol; Ng, Siemon H S; Lambert, Christophe; Charlebois, Robert L; King, Kathryn E

Several nucleic-acid based technologies have recently emerged with capabilities for broad virus detection. One of these, high throughput sequencing, has the potential for novel virus detection because this method does not depend upon prior viral sequence knowledge. However, the use of high throughput sequencing for testing biologicals poses greater challenges as compared to other newly introduced tests due to its technical complexities and big data bioinformatics. Thus, the Advanced Virus Detection Technologies Users Group was formed as a joint effort by regulatory and industry scientists to facilitate discussions and provide a forum for sharing data and experiences using advanced new virus detection technologies, with a focus on high throughput sequencing technologies. The group was initiated as a task force that was coordinated by the Parenteral Drug Association and subsequently became the Advanced Virus Detection Technologies Interest Group to continue efforts for using new technologies for detection of adventitious viruses with broader participation, including international government agencies, academia, and technology service providers. © PDA, Inc. 2016.
De novo assembly and next-generation sequencing to analyse full-length gene variants from codon-barcoded libraries.

PubMed

Cho, Namjin; Hwang, Byungjin; Yoon, Jung-ki; Park, Sangun; Lee, Joongoo; Seo, Han Na; Lee, Jeewon; Huh, Sunghoon; Chung, Jinsoo; Bang, Duhee

2015-09-21

Interpreting epistatic interactions is crucial for understanding evolutionary dynamics of complex genetic systems and unveiling structure and function of genetic pathways. Although high resolution mapping of en masse variant libraries enables molecular biologists to address genotype-phenotype relationships, long-read sequencing technology remains indispensable to assess the functional relationship between mutations that lie far apart. Here, we introduce JigsawSeq for multiplexed sequence identification of pooled gene variant libraries by combining a codon-based molecular barcoding strategy and de novo assembly of short-read data. We first validated JigsawSeq on small sub-pools and observed high precision and recall at various experimental settings. With extensive simulations, we then applied JigsawSeq to large-scale gene variant libraries to show that our method can be reliably scaled using next-generation sequencing. JigsawSeq may serve as a rapid screening tool for functional genomics and offer the opportunity to explore evolutionary trajectories of protein variants.
The advantages of SMRT sequencing.

PubMed

Roberts, Richard J; Carneiro, Mauricio O; Schatz, Michael C

2013-07-03

Of the current next-generation sequencing technologies, SMRT sequencing is sometimes overlooked. However, attributes such as long reads, modified base detection and high accuracy make SMRT a useful technology and an ideal approach to the complete sequencing of small genomes.
Data Interoperability of Whole Exome Sequencing (WES) Based Mutational Burden Estimates from Different Laboratories

PubMed Central

Qiu, Ping; Pang, Ling; Arreaza, Gladys; Maguire, Maureen; Chang, Ken C. N.; Marton, Matthew J.; Levitan, Diane

2016-01-01

Immune checkpoint inhibitors, which unleash a patient’s own T cells to kill tumors, are revolutionizing cancer treatment. Several independent studies suggest that higher non-synonymous mutational burden assessed by whole exome sequencing (WES) in tumors is associated with improved objective response, durable clinical benefit, and progression-free survival in immune checkpoint inhibitors treatment. Next-generation sequencing (NGS) is a promising technology being used in the clinic to direct patient treatment. Cancer genome WES poses a unique challenge due to tumor heterogeneity and sequencing artifacts introduced by formalin-fixed, paraffin-embedded (FFPE) tissue. In order to evaluate the data interoperability of WES data from different sources to survey tumor mutational landscape, we compared WES data of several tumor/normal matched samples from five commercial vendors. A large data discrepancy was observed from vendors’ self-reported data. Independent data analysis from vendors’ raw NGS data shows that whole exome sequencing data from qualified vendors can be combined and analyzed uniformly to derive comparable quantitative estimates of tumor mutational burden. PMID:27136543
Accurate read-based metagenome characterization using a hierarchical suite of unique signatures

PubMed Central

Freitas, Tracey Allen K.; Li, Po-E; Scholz, Matthew B.; Chain, Patrick S. G.

2015-01-01

A major challenge in the field of shotgun metagenomics is the accurate identification of organisms present within a microbial community, based on classification of short sequence reads. Though existing microbial community profiling methods have attempted to rapidly classify the millions of reads output from modern sequencers, the combination of incomplete databases, similarity among otherwise divergent genomes, errors and biases in sequencing technologies, and the large volumes of sequencing data required for metagenome sequencing has led to unacceptably high false discovery rates (FDR). Here, we present the application of a novel, gene-independent and signature-based metagenomic taxonomic profiling method with significantly and consistently smaller FDR than any other available method. Our algorithm circumvents false positives using a series of non-redundant signature databases and examines Genomic Origins Through Taxonomic CHAllenge (GOTTCHA). GOTTCHA was tested and validated on 20 synthetic and mock datasets ranging in community composition and complexity, was applied successfully to data generated from spiked environmental and clinical samples, and robustly demonstrates superior performance compared with other available tools. PMID:25765641

Fungal Genomics Program

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grigoriev, Igor

The JGI Fungal Genomics Program aims to scale up sequencing and analysis of fungal genomes to explore the diversity of fungi important for energy and the environment, and to promote functional studies on a system level. Combining new sequencing technologies and comparative genomics tools, JGI is now leading the world in fungal genome sequencing and analysis. Over 120 sequenced fungal genomes with analytical tools are available via MycoCosm (www.jgi.doe.gov/fungi), a web-portal for fungal biologists. Our model of interacting with user communities, unique among other sequencing centers, helps organize these communities, improves genome annotation and analysis work, and facilitates new larger-scale genomic projects. This resulted in 20 high-profile papers published in 2011 alone and contributing to the Genomics Encyclopedia of Fungi, which targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts). Our next grand challenges include larger scale exploration of fungal diversity (1000 fungal genomes), developing molecular tools for DOE-relevant model organisms, and analysis of complex systems and metagenomes.
FACS single cell index sorting is highly reliable and determines immune phenotypes of clonally expanded T cells.

PubMed

Penter, Livius; Dietze, Kerstin; Bullinger, Lars; Westermann, Jörg; Rahn, Hans-Peter; Hansmann, Leo

2018-03-14

FACS index sorting allows the isolation of single cells with retrospective identification of each single cell's high-dimensional immune phenotype. We experimentally determine the error rate of index sorting and combine the technology with T cell receptor sequencing to identify clonal T cell expansion in aplastic anemia bone marrow as an example. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A Next-Generation Sequencing Primer—How Does It Work and What Can It Do?

PubMed Central

Alekseyev, Yuriy O.; Fazeli, Roghayeh; Yang, Shi; Basran, Raveen; Miller, Nancy S.

2018-01-01

Next-generation sequencing refers to a high-throughput technology that determines the nucleic acid sequences and identifies variants in a sample. The technology has been introduced into clinical laboratory testing and produces test results for precision medicine. Since next-generation sequencing is relatively new, graduate students, medical students, pathology residents, and other physicians may benefit from a primer to provide a foundation about basic next-generation sequencing methods and applications, as well as specific examples where it has had diagnostic and prognostic utility. Next-generation sequencing technology grew out of advances in multiple fields to produce a sophisticated laboratory test with tremendous potential. Next-generation sequencing may be used in the clinical setting to look for specific genetic alterations in patients with cancer, diagnose inherited conditions such as cystic fibrosis, and detect and profile microbial organisms. This primer will review DNA sequencing technology, the commercialization of next-generation sequencing, and clinical uses of next-generation sequencing. Specific applications where next-generation sequencing has demonstrated utility in oncology are provided. PMID:29761157
B-MIC: An Ultrafast Three-Level Parallel Sequence Aligner Using MIC.

PubMed

Cui, Yingbo; Liao, Xiangke; Zhu, Xiaoqian; Wang, Bingqiang; Peng, Shaoliang

2016-03-01

Sequence alignment is the central process of sequence analysis, in which raw sequencing data are mapped to a reference genome. The large amount of data generated by NGS is far beyond the processing capabilities of existing alignment tools. Consequently, sequence alignment becomes the bottleneck of sequence analysis. Intensive computing power is required to address this challenge. Intel recently announced the MIC coprocessor, which can provide massive computing power. The Tianhe-2, now the world's fastest supercomputer, is equipped with three MIC coprocessors per compute node. A key feature of sequence alignment is that different reads are independent. Considering this property, we proposed a MIC-oriented three-level parallelization strategy to speed up BWA, a widely used sequence alignment tool, and developed our ultrafast parallel sequence aligner: B-MIC. B-MIC contains three levels of parallelization: firstly, parallelization of data IO and reads alignment by a three-stage parallel pipeline; secondly, parallelization enabled by MIC coprocessor technology; thirdly, inter-node parallelization implemented by MPI. In this paper, we demonstrate that B-MIC outperforms BWA by a combination of those techniques, using an Inspur NF5280M server and the Tianhe-2 supercomputer. To the best of our knowledge, B-MIC is the first sequence alignment tool to run on Intel MIC, and it can achieve more than fivefold speedup over the original BWA while maintaining the alignment precision.
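The observation that different reads are independent is what makes this kind of parallelization possible. The sketch below illustrates only that property, using Python threads on a single machine. It is not B-MIC or BWA, and it does not model the MIC or MPI levels: `align_read` is a hypothetical toy stand-in that performs an exact substring search against an invented reference string.

```python
from concurrent.futures import ThreadPoolExecutor

REFERENCE = "ACGTACGTTAGCCGATAGGCTA"  # toy reference, illustrative only

def align_read(read):
    """Toy stand-in for an aligner: exact-match position, or -1 if absent."""
    return REFERENCE.find(read)

def align_batch(reads, workers=4):
    """Map reads concurrently; no read depends on the result of another."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the same order as the input reads.
        return list(pool.map(align_read, reads))

positions = align_batch(["ACGT", "TAGC", "GGGG"])
```

Because each call touches only its own read and the read-only reference, the batch can be split across workers without any locking, which is the same reasoning that allows a real aligner to be distributed across threads, coprocessors, and nodes.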
A computational proposal for designing structured RNA pools for in vitro selection of RNAs.

PubMed

Kim, Namhee; Gan, Hin Hark; Schlick, Tamar

2007-04-01

Although in vitro selection technology is a versatile experimental tool for discovering novel synthetic RNA molecules, finding complex RNA molecules is difficult because most RNAs identified from random sequence pools are simple motifs, consistent with recent computational analysis of such sequence pools. Thus, enriching in vitro selection pools with complex structures could increase the probability of discovering novel RNAs. Here we develop an approach for engineering sequence pools that links RNA sequence space regions with corresponding structural distributions via a "mixing matrix" approach combined with a graph theory analysis. We define five classes of mixing matrices motivated by covariance mutations in RNA; these constructs define nucleotide transition rates and are applied to chosen starting sequences to yield specific nonrandom pools. We examine the coverage of sequence space as a function of the mixing matrix and starting sequence via clustering analysis. We show that, in contrast to random sequences, which are associated only with a local region of sequence space, our designed pools, including a structured pool for GTP aptamers, can target specific motifs. It follows that experimental synthesis of designed pools can benefit from using optimized starting sequences, mixing matrices, and pool fractions associated with each of our constructed pools as a guide. Automation of our approach could provide practical tools for pool design applications for in vitro selection of RNAs and related problems.
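The idea of applying a mixing matrix of nucleotide transition rates to a starting sequence can be sketched as follows. The matrix values, the starting sequence, and the names `mutate` and `generate_pool` are hypothetical and are not taken from the paper. The five matrix classes, the graph theory analysis, and the clustering step described in the abstract are not reproduced here.

```python
import random

BASES = "ACGU"

# Hypothetical mixing matrix: key = base in the starting sequence,
# list = probabilities of emitting A, C, G, U at that position.
# Each row sums to 1; the values are illustrative only.
MIXING_MATRIX = {
    "A": [0.70, 0.10, 0.10, 0.10],
    "C": [0.10, 0.70, 0.10, 0.10],
    "G": [0.10, 0.10, 0.70, 0.10],
    "U": [0.10, 0.10, 0.10, 0.70],
}

def mutate(start, matrix, rng):
    """Draw one pool member by sampling each position from its matrix row."""
    return "".join(rng.choices(BASES, weights=matrix[base])[0] for base in start)

def generate_pool(start, matrix, size, seed=0):
    """Generate `size` sequences biased toward `start` by the mixing matrix."""
    rng = random.Random(seed)
    return [mutate(start, matrix, rng) for _ in range(size)]

pool = generate_pool("GGGAAACCCUUU", MIXING_MATRIX, size=5)
```

With an identity matrix the pool collapses to copies of the starting sequence, and with uniform rows it becomes a fully random pool, so the matrix controls how far the designed pool strays from its starting point.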
Combining Cell Type-Restricted Adenoviral Targeting with Immunostaining and Flow Cytometry to Identify Cells-of-Origin of Lung Cancer.

PubMed

Best, Sarah A; Kersbergen, Ariena; Asselin-Labat, Marie-Liesse; Sutherland, Kate D

2018-01-01

Lung cancers display considerable intertumoral heterogeneity, leading to the classification of distinct tumor subtypes. Our understanding of the genetic aberrations that underlie tumor subtypes has been greatly enhanced by recent genomic sequencing studies and state-of-the-art gene targeting technologies, highlighting evidence that distinct lung cancer subtypes may be derived from different "cells-of-origin". Here, we describe the intra-tracheal delivery of cell type-restricted Ad5-Cre viruses into the lungs of adult mice, combined with immunohistochemical and flow cytometry strategies for the detection of lung cancer-initiating cells in vivo.
[Progress of gene editing technologies and prospect in traditional Chinese medicine].

PubMed

Ma, Yan-Yan; Li, Jing-Zhe; Gao, Er-Ning; Qian, Dan; Zhong, Ju-Ying; Liu, Chang-Zhen

2017-01-01

Gene editing is a class of technologies that make precise modifications to the genome. It can be used to knock out, knock in, or replace a specific DNA fragment, and to make accurate edits at the genome level. The essence of the technique is a change in DNA sequence achieved through non-homologous end joining repair and homologous recombination repair, combined with specific DNA target recognition and an endonuclease. This technology has a wide range of development prospects and high application value in scientific research, agriculture, medical treatment and other fields. In the field of gene therapy, gene editing technology has achieved cross-time success in cancers such as leukemia, genetic disorders such as hemophilia, thalassemia, multiple muscle nutritional disorders and retrovirus associated infectious diseases such as AIDS and other diseases. The preparation work for new experimental methods and animal models combined with gene editing technology is under rapid development and improvement. Laboratories around the world have also applied gene editing techniques in prevention of malaria, organ transplantation, biological pharmaceuticals, agricultural breeding improvement, resurrection of extinct species, and other research areas. This paper summarizes the application and development status of gene editing techniques in the above fields, preliminarily explores the potential application prospects of the technology in the field of traditional Chinese medicine, and discusses the present controversy and thoughts. Copyright© by the Chinese Pharmaceutical Association.
Tilted pillar array fabrication by the combination of proton beam writing and soft lithography for microfluidic cell capture Part 2: Image sequence analysis based evaluation and biological application.

PubMed

Járvás, Gábor; Varga, Tamás; Szigeti, Márton; Hajba, László; Fürjes, Péter; Rajta, István; Guttman, András

2018-02-01

As a continuation of our previously published work, this paper presents a detailed evaluation of a microfabricated cell capture device utilizing a doubly tilted micropillar array. The device was fabricated using a novel hybrid technology based on the combination of proton beam writing and conventional lithography techniques. Tilted pillars offer unique flow characteristics and support enhanced fluidic interaction for improved immunoaffinity-based cell capture. The performance of the microdevice was evaluated with an in-house single-cell tracking system based on image sequence analysis. Individual cell tracking allowed in-depth analysis of the cell-chip surface interaction mechanism from a hydrodynamic point of view. Simulation results were validated by using the hybrid device and the optimized surface functionalization procedure. Finally, the cell capture capability of this new generation microdevice was demonstrated by efficiently arresting cells from a HT29 cell-line suspension. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Methods, Tools and Current Perspectives in Proteogenomics *

PubMed Central

Ruggles, Kelly V.; Krug, Karsten; Wang, Xiaojing; Clauser, Karl R.; Wang, Jing; Payne, Samuel H.; Fenyö, David; Zhang, Bing; Mani, D. R.

2017-01-01

With combined technological advancements in high-throughput next-generation sequencing and deep mass spectrometry-based proteomics, proteogenomics, i.e. the integrative analysis of proteomic and genomic data, has emerged as a new research field. Early efforts in the field were focused on improving protein identification using sample-specific genomic and transcriptomic sequencing data. More recently, integrative analysis of quantitative measurements from genomic and proteomic studies has yielded novel insights into gene expression regulation, cell signaling, and disease. Many methods and tools have been developed or adapted to enable an array of integrative proteogenomic approaches. In this article, we systematically classify published methods and tools into four major categories: (1) Sequence-centric proteogenomics; (2) Analysis of proteogenomic relationships; (3) Integrative modeling of proteogenomic data; and (4) Data sharing and visualization. We provide a comprehensive review of methods and available tools in each category and highlight their typical applications. PMID:28456751
An integrated approach to fast and informative morphological vouchering of nematodes for applications in molecular barcoding

PubMed Central

De Ley, Paul; De Ley, Irma Tandingan; Morris, Krystalynne; Abebe, Eyualem; Mundo-Ocampo, Manuel; Yoder, Melissa; Heras, Joseph; Waumann, Dora; Rocha-Olivares, Axayácatl; Jay Burr, A.H; Baldwin, James G; Thomas, W. Kelley

2005-01-01

Molecular surveys of meiofaunal diversity face some interesting methodological challenges when it comes to interstitial nematodes from soils and sediments. Morphology-based surveys are greatly limited in processing speed, while barcoding approaches for nematodes are hampered by difficulties of matching sequence data with traditional taxonomy. Intermediate technology is needed to bridge the gap between both approaches. An example of such technology is video capture and editing microscopy, which consists of the recording of taxonomically informative multifocal series of microscopy images as digital video clips. The integration of multifocal imaging with sequence analysis of the D2D3 region of large subunit (LSU) rDNA is illustrated here in the context of a combined morphological and barcode sequencing survey of marine nematodes from Baja California and California. The resulting video clips and sequence data are made available online in the database NemATOL (http://nematol.unh.edu/). Analyses of 37 barcoded nematodes suggest that these represent at least 32 species, none of which matches available D2D3 sequences in public databases. The recorded multifocal vouchers allowed us to identify most specimens to genus, and will be used to match specimens with subsequent species identifications and descriptions of preserved specimens. Like molecular barcodes, multifocal voucher archives are part of a wider effort at structuring and changing the process of biodiversity discovery. We argue that data-rich surveys and phylogenetic tools for analysis of barcode sequences are an essential component of the exploration of phyla with a high fraction of undiscovered species. Our methods are also directly applicable to other meiofauna such as for example gastrotrichs and tardigrades. PMID:16214752
Sequencing technologies - the next generation.

PubMed

Metzker, Michael L

2010-01-01

Demand has never been greater for revolutionary technologies that deliver fast, inexpensive and accurate genome information. This challenge has catalysed the development of next-generation sequencing (NGS) technologies. The inexpensive production of large volumes of sequence data is the primary advantage over conventional methods. Here, I present a technical review of template preparation, sequencing and imaging, genome alignment and assembly approaches, and recent advances in current and near-term commercially available NGS instruments. I also outline the broad range of applications for NGS technologies, in addition to providing guidelines for platform selection to address biological questions of interest.
Genome Structural Diversity among 31 Bordetella pertussis Isolates from Two Recent U.S. Whooping Cough Statewide Epidemics.

PubMed

Bowden, Katherine E; Weigand, Michael R; Peng, Yanhui; Cassiday, Pamela K; Sammons, Scott; Knipe, Kristen; Rowe, Lori A; Loparev, Vladimir; Sheth, Mili; Weening, Keeley; Tondella, M Lucia; Williams, Margaret M

2016-01-01

During 2010 and 2012, California and Vermont, respectively, experienced statewide epidemics of pertussis with differences seen in the demographic affected, case clinical presentation, and molecular epidemiology of the circulating strains. To overcome limitations of the current molecular typing methods for pertussis, we utilized whole-genome sequencing to gain a broader understanding of how current circulating strains are causing large epidemics. Through the use of combined next-generation sequencing technologies, this study compared de novo, single-contig genome assemblies from 31 out of 33 Bordetella pertussis isolates collected during two separate pertussis statewide epidemics and 2 resequenced vaccine strains. Final genome architecture assemblies were verified with whole-genome optical mapping. Sixteen distinct genome rearrangement profiles were observed in epidemic isolate genomes, all of which were distinct from the genome structures of the two resequenced vaccine strains. These rearrangements appear to be mediated by repetitive sequence elements, such as high-copy-number mobile genetic elements and rRNA operons. Additionally, novel and previously identified single nucleotide polymorphisms were detected in 10 virulence-related genes in the epidemic isolates. Whole-genome variation analysis identified state-specific variants, and coding regions bearing nonsynonymous mutations were classified into functional annotated orthologous groups. Comprehensive studies on whole genomes are needed to understand the resurgence of pertussis and develop novel tools to better characterize the molecular epidemiology of evolving B. pertussis populations. IMPORTANCE Pertussis, or whooping cough, is the most poorly controlled vaccine-preventable bacterial disease in the United States, which has experienced a resurgence for more than a decade. Once viewed as a monomorphic pathogen, B. pertussis strains circulating during epidemics exhibit diversity visible on a genome structural level, previously undetectable by traditional sequence analysis using short-read technologies. For the first time, we combine short- and long-read sequencing platforms with restriction optical mapping for single-contig, de novo assembly of 31 isolates to investigate two geographically and temporally independent U.S. pertussis epidemics. These complete genomes reshape our understanding of B. pertussis evolution and strengthen molecular epidemiology toward one day understanding the resurgence of pertussis.
Regional comparisons of on-site solar potential in the residential and industrial sectors

NASA Astrophysics Data System (ADS)

Gatzke, A. E.; Skewes-Cox, A. O.

1980-10-01

Regional and subregional differences in the potential development of decentralized solar technologies are studied. Two sectors of the economy were selected for intensive analysis: the residential and industrial sectors. The sequence of analysis follows the same general steps: (1) selection of appropriate prototypes within each land use sector disaggregated by census region; (2) characterization of the end-use energy demand of each prototype in order to match an appropriate decentralized solar technology to the energy demand; (3) assessment of the energy conservation potential within each prototype limited by land use patterns, technology efficiency, and variation in solar insolation; and (4) evaluation of the regional and subregional differences in the land use implications of decentralized energy supply technologies that result from the combination of energy demand, energy supply potential, and the subsequent addition of increasingly more restrictive policies to increase the percent contribution of on-site solar energy.
Base-resolution detection of N4-methylcytosine in genomic DNA using 4mC-Tet-assisted-bisulfite-sequencing

DOE PAGES

Yu, Miao; Ji, Lexiang; Neumann, Drexel A.; ...

2015-07-15

Restriction-modification (R-M) systems pose a major barrier to DNA transformation and genetic engineering of bacterial species. Systematic identification of DNA methylation in R-M systems, including N6-methyladenine (6mA), 5-methylcytosine (5mC) and N4-methylcytosine (4mC), will enable strategies to make these species genetically tractable. Although single-molecule, real-time (SMRT) sequencing technology is capable of detecting 4mC directly for any bacterial species regardless of whether an assembled genome exists or not, it does not scale to profiling hundreds to thousands of samples as well as the commonly used next-generation sequencing technologies. Here, we present 4mC-Tet-assisted bisulfite-sequencing (4mC-TAB-seq), a next-generation sequencing method that rapidly and cost-efficiently reveals the genome-wide locations of 4mC for bacterial species with an available assembled reference genome. In 4mC-TAB-seq, both cytosines and 5mCs are read out as thymines, whereas only 4mCs are read out as cytosines, revealing their specific positions throughout the genome. We applied 4mC-TAB-seq to study the methylation of a member of the hyperthermophilic genus, Caldicellulosiruptor, in which 4mC-related restriction is a major barrier to DNA transformation from other species. Lastly, in combination with MethylC-seq, both 4mC- and 5mC-containing motifs are identified, which can assist in rapid and efficient genetic engineering of these bacteria in the future.
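The readout rule stated in this abstract (unmodified C and 5mC read as T, only 4mC still reads as C) can be illustrated with a small sketch. This is not the authors' pipeline: the function names `simulate_conversion` and `call_4mc_sites` are hypothetical, and the sketch assumes a single forward-strand read that aligns to the reference without gaps or sequencing errors.

```python
def simulate_conversion(reference: str, protected_sites: set[int]) -> str:
    """Hypothetical helper: mimic the stated readout rule on one strand.

    Every C is read out as T unless its 0-based position is listed in
    protected_sites (standing in for 4mC), in which case it stays C.
    """
    return "".join(
        "T" if base == "C" and i not in protected_sites else base
        for i, base in enumerate(reference.upper())
    )


def call_4mc_sites(reference: str, converted_read: str) -> list[int]:
    """Return 0-based positions where the reference has C and the read still shows C."""
    if len(reference) != len(converted_read):
        raise ValueError("sketch assumes an ungapped, full-length alignment")
    return [
        i
        for i, (ref, obs) in enumerate(zip(reference.upper(), converted_read.upper()))
        if ref == "C" and obs == "C"
    ]


reference = "ACGTCCGA"
read = simulate_conversion(reference, protected_sites={4})
print(read)                             # ATGTCTGA
print(call_4mc_sites(reference, read))  # [4]
```

A real analysis would also have to handle the reverse strand, incomplete conversion, and coverage-based statistics, none of which are modeled here.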
Microfluidic single-cell whole-transcriptome sequencing.

PubMed

Streets, Aaron M; Zhang, Xiannian; Cao, Chen; Pang, Yuhong; Wu, Xinglong; Xiong, Liang; Yang, Lu; Fu, Yusi; Zhao, Liang; Tang, Fuchou; Huang, Yanyi

2014-05-13

Single-cell whole-transcriptome analysis is a powerful tool for quantifying gene expression heterogeneity in populations of cells. Many techniques have, thus, been recently developed to perform transcriptome sequencing (RNA-Seq) on individual cells. To probe subtle biological variation between samples with limiting amounts of RNA, more precise and sensitive methods are still required. We adapted a previously developed strategy for single-cell RNA-Seq that has shown promise for superior sensitivity and implemented the chemistry in a microfluidic platform for single-cell whole-transcriptome analysis. In this approach, single cells are captured and lysed in a microfluidic device, where mRNAs with poly(A) tails are reverse-transcribed into cDNA. Double-stranded cDNA is then collected and sequenced using a next generation sequencing platform. We prepared 94 libraries consisting of single mouse embryonic cells and technical replicates of extracted RNA and thoroughly characterized the performance of this technology. Microfluidic implementation increased mRNA detection sensitivity as well as improved measurement precision compared with tube-based protocols. With 0.2 M reads per cell, we were able to reconstruct a majority of the bulk transcriptome with 10 single cells. We also quantified variation between and within different types of mouse embryonic cells and found that enhanced measurement precision, detection sensitivity, and experimental throughput aided the distinction between biological variability and technical noise. With this work, we validated the advantages of an early approach to single-cell RNA-Seq and showed that the benefits of combining microfluidic technology with high-throughput sequencing will be valuable for large-scale efforts in single-cell transcriptome analysis.
Comparison and evaluation of two exome capture kits and sequencing platforms for variant calling.

PubMed

Zhang, Guoqiang; Wang, Jianfeng; Yang, Jin; Li, Wenjie; Deng, Yutian; Li, Jing; Huang, Jun; Hu, Songnian; Zhang, Bing

2015-08-05

To promote the clinical application of next-generation sequencing, it is important to obtain accurate and consistent variants of target genomic regions at low cost. Ion Proton, the latest updated semiconductor-based sequencing instrument from Life Technologies, is designed to provide investigators with an inexpensive platform for human whole exome sequencing that achieves a rapid turnaround time. However, few studies have comprehensively compared and evaluated the accuracy of variant calling between Ion Proton and Illumina sequencing platforms such as HiSeq 2000, which is the most popular sequencing platform for the human genome. The Ion Proton sequencer combined with the Ion TargetSeq Exome Enrichment Kit makes up TargetSeq-Proton, whereas SureSelect-HiSeq is based on the Agilent SureSelect Human All Exon v4 Kit and the HiSeq 2000 sequencer. Here, we sequenced exonic DNA from four human blood samples using both TargetSeq-Proton and SureSelect-HiSeq. We then called variants in the exonic regions that overlapped between the two exome capture kits (33.6 Mb). The rates of shared variant loci called by the two sequencing platforms ranged from 68.0 to 75.3% across the four samples, whereas the concordance of co-detected variant loci reached 99%. Sanger sequencing validation revealed that the validation rate of concordant single nucleotide polymorphisms (SNPs) (91.5%) was higher than that of SNPs specific to TargetSeq-Proton (60.0%) or specific to SureSelect-HiSeq (88.3%). With regard to 1-bp small insertions and deletions (InDels), the Sanger-validated rates of concordant variants (100.0%) and SureSelect-HiSeq-specific variants (89.6%) were higher than that of TargetSeq-Proton-specific variants (15.8%). In the sequencing of exonic regions, combining the two sequencing strategies (SureSelect-HiSeq and TargetSeq-Proton) increased the variant calling specificity for concordant variant loci and the sensitivity for variant loci called by any one platform. However, for platform-specific variants, the accuracy of variant calling by HiSeq 2000 was higher than that of Ion Proton, particularly for InDel detection. Moreover, the variant calling software also influences the detection of SNPs and, especially, InDels in Ion Proton exome sequencing.
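The two headline metrics in this abstract, the rate of shared variant loci and the genotype concordance at co-detected loci, can be sketched with plain set operations. The abstract does not state which denominator was used for the shared rate, so the sketch below assumes shared loci divided by the union of both call sets; the function name `compare_calls` and the toy loci are hypothetical.

```python
Locus = tuple[str, int]  # (chromosome, 1-based position)


def compare_calls(calls_a: dict[Locus, str], calls_b: dict[Locus, str]) -> dict:
    """Compare two platforms' variant calls, each given as locus -> genotype."""
    loci_a, loci_b = set(calls_a), set(calls_b)
    shared = loci_a & loci_b
    union = loci_a | loci_b
    # A shared locus is concordant only if both platforms report the same genotype.
    concordant = {locus for locus in shared if calls_a[locus] == calls_b[locus]}
    return {
        # Assumption: shared rate = shared loci / all loci called by either platform.
        "shared_rate": len(shared) / len(union) if union else 0.0,
        "concordance": len(concordant) / len(shared) if shared else 0.0,
        "only_a": loci_a - loci_b,
        "only_b": loci_b - loci_a,
    }


# Toy call sets, invented purely for illustration.
platform_a = {("chr1", 100): "A/G", ("chr1", 200): "C/T", ("chr2", 50): "G/G"}
platform_b = {("chr1", 100): "A/G", ("chr1", 200): "C/C", ("chr3", 10): "T/A"}

result = compare_calls(platform_a, platform_b)
print(result["shared_rate"])  # 0.5
print(result["concordance"])  # 0.5
```

Taking the intersection of the two call sets favors specificity, while taking the union favors sensitivity, which matches the trade-off the abstract reports for combining the two strategies.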
CloVR: a virtual machine for automated and portable sequence analysis from the desktop using cloud computing.

PubMed

Angiuoli, Samuel V; Matalka, Malcolm; Gussman, Aaron; Galens, Kevin; Vangala, Mahesh; Riley, David R; Arze, Cesar; White, James R; White, Owen; Fricke, W Florian

2011-08-30

Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom-built virtual machines to distribute pre-packaged, pre-configured software. We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition, CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. The CloVR VM and associated architecture lower the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high-throughput data processing.
Deciphering the distance to antibiotic resistance for the pneumococcus using genome sequencing data

PubMed Central

Mobegi, Fredrick M.; Cremers, Amelieke J. H.; de Jonge, Marien I.; Bentley, Stephen D.; van Hijum, Sacha A. F. T.; Zomer, Aldert

2017-01-01

Advances in genome sequencing technologies and genome-wide association studies (GWAS) have provided unprecedented insights into the molecular basis of microbial phenotypes and enabled the identification of the underlying genetic variants in real populations. However, utilization of genome sequencing in clinical phenotyping of bacteria is challenging due to the lack of reliable and accurate approaches. Here, we report a method for predicting microbial resistance patterns using genome sequencing data. We analyzed whole genome sequences of 1,680 Streptococcus pneumoniae isolates from four independent populations using GWAS and identified probable hotspots of genetic variation which correlate with phenotypes of resistance to essential classes of antibiotics. With the premise that accumulation of putative resistance-conferring SNPs, potentially in combination with specific resistance genes, precedes full resistance, we retrogressively surveyed the hotspot loci and quantified the number of SNPs and/or genes, which if accumulated would confer full resistance to an otherwise susceptible strain. We name this approach the ‘distance to resistance’. It can be used to identify the creep towards complete antibiotics resistance in bacteria using genome sequencing. This approach serves as a basis for the development of future sequencing-based methods for predicting resistance profiles of bacterial strains in hospital microbiology and public health settings. PMID:28205635
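The "distance to resistance" idea described above amounts to counting how many resistance-associated markers a susceptible strain still lacks. The sketch below is only an illustration of that counting step under strong simplifying assumptions: markers are treated as an unweighted set, the marker names are placeholders rather than real loci, and `distance_to_resistance` is a hypothetical function name, not the authors' software.

```python
def distance_to_resistance(isolate_markers: set[str], resistance_profile: set[str]) -> int:
    """Count the markers (SNPs and/or genes) in the resistance profile that the isolate lacks.

    A distance of 0 means the isolate already carries every marker in the
    profile; larger values mean more changes would still have to accumulate.
    """
    return len(resistance_profile - isolate_markers)


# Placeholder marker names, invented for illustration only.
profile = {"snp_1", "snp_2", "snp_3", "gene_x"}
isolates = {
    "isolate_A": {"snp_1"},
    "isolate_B": {"snp_1", "snp_2", "gene_x"},
    "isolate_C": {"snp_1", "snp_2", "snp_3", "gene_x"},
}

for name, markers in isolates.items():
    print(name, distance_to_resistance(markers, profile))
# isolate_A 3
# isolate_B 1
# isolate_C 0
```

Ranking isolates by this count gives a simple view of which strains are closest to the full resistance profile.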
DOE Office of Scientific and Technical Information (OSTI.GOV)

Hraber, Peter; Korber, Bette; Wagh, Kshitij

Within-host genetic sequencing from samples collected over time provides a dynamic view of how viruses evade host immunity. Immune-driven mutations might stimulate neutralization breadth by selecting antibodies adapted to cycles of immune escape that generate within-subject epitope diversity. Comprehensive identification of immune-escape mutations is experimentally and computationally challenging. With current technology, many more viral sequences can readily be obtained than can be tested for binding and neutralization, making down-selection necessary. Typically, this is done manually, by picking variants that represent different time-points and branches on a phylogenetic tree. Such strategies are likely to miss many relevant mutations and combinations of mutations, and to be redundant for other mutations. Longitudinal Antigenic Sequences and Sites from Intrahost Evolution (LASSIE) uses transmitted founder loss to identify virus “hot-spots” under putative immune selection and chooses sequences that represent recurrent mutations in selected sites. LASSIE favors earliest sequences in which mutations arise. Here, with well-characterized longitudinal Env sequences, we confirmed selected sites were concentrated in antibody contacts and selected sequences represented diverse antigenic phenotypes. Finally, practical applications include rapidly identifying immune targets under selective pressure within a subject, selecting minimal sets of reagents for immunological assays that characterize evolving antibody responses, and for immunogens in polyvalent “cocktail” vaccines.
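The site-selection step based on transmitted founder (TF) loss can be sketched as a per-site frequency calculation. This is a simplified illustration, not the LASSIE implementation: it assumes pre-aligned, equal-length sequences from a single time point, defines TF loss at a site as the fraction of sampled sequences whose residue differs from the TF residue, and uses a hypothetical `cutoff` parameter to pick sites.

```python
def tf_loss(tf_seq: str, sample_seqs: list[str]) -> list[float]:
    """Per-site fraction of sampled sequences that no longer carry the TF residue."""
    if not sample_seqs:
        raise ValueError("need at least one sampled sequence")
    if any(len(seq) != len(tf_seq) for seq in sample_seqs):
        raise ValueError("sketch assumes sequences are aligned to the TF sequence")
    n = len(sample_seqs)
    return [
        sum(seq[i] != tf_residue for seq in sample_seqs) / n
        for i, tf_residue in enumerate(tf_seq)
    ]


def selected_sites(tf_seq: str, sample_seqs: list[str], cutoff: float = 0.8) -> list[int]:
    """0-based sites whose TF loss meets or exceeds the (hypothetical) cutoff."""
    return [i for i, loss in enumerate(tf_loss(tf_seq, sample_seqs)) if loss >= cutoff]


# Toy alignment, invented for illustration.
tf = "MKVNT"
samples = ["MKVNT", "MRVNT", "MRVDT", "MRVNT", "MRVNA"]

print(tf_loss(tf, samples))         # [0.0, 0.8, 0.0, 0.2, 0.2]
print(selected_sites(tf, samples))  # [1]
```

The second stage described in the abstract, choosing the earliest sequences that represent recurrent mutations at the selected sites, is not covered by this sketch.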
Simulation-Based Evaluation of Learning Sequences for Instructional Technologies

ERIC Educational Resources Information Center

McEneaney, John E.

2016-01-01

Instructional technologies critically depend on systematic design, and learning hierarchies are a commonly advocated tool for designing instructional sequences. But hierarchies routinely allow numerous sequences and choosing an optimal sequence remains an unsolved problem. This study explores a simulation-based approach to modeling learning…

Advances in DNA sequencing technologies for high resolution HLA typing.

PubMed

Cereb, Nezih; Kim, Hwa Ran; Ryu, Jaejun; Yang, Soo Young

2015-12-01

This communication describes our experience in large-scale G group-level high resolution HLA typing using three different DNA sequencing platforms - ABI 3730 xl, Illumina MiSeq and PacBio RS II. Recent advances in DNA sequencing technologies, so-called next generation sequencing (NGS), have brought breakthroughs in deciphering the genetic information in all living species at a large scale and at an affordable level. The NGS DNA indexing system allows sequencing multiple genes for a large number of individuals in a single run. Our laboratory has adopted and used these technologies for HLA molecular testing services. We found that each sequencing technology has its own strengths and weaknesses, and their sequencing performances complement each other. HLA genes are highly complex and genotyping them is quite challenging. Using these three sequencing platforms, we were able to meet all requirements for G group-level high resolution and high volume HLA typing. Copyright © 2015 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. All rights reserved.
Genomics and Genetics in the Biology of Adaptation to Exercise

PubMed Central

Bouchard, Claude; Rankinen, Tuomo; Timmons, James A.

2014-01-01

This chapter is devoted to the role of genetic variation and gene-exercise interactions in the biology of adaptation to exercise. There is evidence from genetic epidemiology research that DNA sequence differences contribute to human variation in physical activity level, cardiorespiratory fitness in the untrained state, cardiovascular and metabolic response to acute exercise, and responsiveness to regular exercise. Methodological and technological advances have made it possible to undertake the molecular dissection of the genetic component of complex, multifactorial traits, such as those of interest to exercise biology, in terms of tissue expression profile, genes, and allelic variants. The evidence from animal models and human studies is considered. Data on candidate genes, genome-wide linkage results, genome-wide association findings, expression arrays, and combinations of these approaches are reviewed. Combining transcriptomic and genomic technologies has been shown to be more powerful, as evidenced by the development of a recent molecular predictor of the ability to increase VO2max with exercise training. For exercise as a behavior and physiological fitness as a state to be major players in public health policies will require that the role of human individuality and the influence of DNA sequence differences be understood. Likewise, progress in the use of exercise in therapeutic medicine will depend to a large extent on our ability to identify the favorable responders for given physiological properties to a given exercise regimen. PMID:23733655
Illuminating the Black Box of Genome Sequence Assembly: A Free Online Tool to Introduce Students to Bioinformatics

ERIC Educational Resources Information Center

Taylor, D. Leland; Campbell, A. Malcolm; Heyer, Laurie J.

2013-01-01

Next-generation sequencing technologies have greatly reduced the cost of sequencing genomes. With the current sequencing technology, a genome is broken into fragments and sequenced, producing millions of "reads." A computer algorithm pieces these reads together in the genome assembly process. PHAST is a set of online modules…
dBBQs: dataBase of Bacterial Quality scores.

PubMed

Wanchai, Visanu; Patumcharoenpol, Preecha; Nookaew, Intawat; Ussery, David

2017-12-28

It is well known that genome sequencing technologies are becoming significantly cheaper and faster. As a result, the exponential growth of sequencing data in public databases allows us to explore ever-larger collections of genome sequences. It is less well known, however, that the majority of sequenced genomes available in public databases are not complete but are drafts of varying quality. We have calculated quality scores for around 100,000 bacterial genomes from all major genome repositories and put them in a fast and easy-to-use database. Prokaryotic genomic data from all sources were collected and combined to make a non-redundant set of bacterial genomes. The genome quality score for each was calculated from four measurements: assembly quality, the number of rRNA genes, the number of tRNA genes, and the occurrence of conserved functional domains. The dataBase of Bacterial Quality scores (dBBQs) was designed to store and retrieve quality scores. It offers fast searching and download features, and the results can be used for further analysis. In addition, the search results are shown in an interactive JavaScript chart framework using DC.js. Analysis of quality scores across major public genome databases finds that around 68% of the genomes are of acceptable quality for many uses. dBBQs (available at http://arc-gem.uams.edu/dbbqs ) provides genome quality scores for all available prokaryotic genome sequences with a user-friendly Web interface. These scores can be used as cut-offs to obtain a high-quality set of genomes for testing bioinformatics tools or improving analyses. Moreover, all data from the four measurements that were combined to make the quality score for each genome can potentially be used for further analysis. dBBQs will be updated regularly and is free to use for non-commercial purposes.
Multiple-Frame Detection of Subpixel Targets in Thermal Image Sequences

NASA Technical Reports Server (NTRS)

Thompson, David R.; Kremens, Robert

2013-01-01

The new technology in this approach combines the subpixel detection information from multiple frames of a sequence to achieve a more sensitive detection result, using only the information found in the images themselves. It is taken as a constraint that the method is automated, robust, and computationally feasible for field networks with constrained computation and data rates. This precludes simply downloading a video stream for pixel-wise co-registration on the ground. It is also important that this method not require precise knowledge of sensor position or direction, because such information is often not available. It is also assumed that the scene in question is approximately planar, which is appropriate for a high-altitude airborne or orbital view.
Fuzzy logic based on-line fault detection and classification in transmission line.

PubMed

Adhikari, Shuma; Sinha, Nidul; Dorendrajit, Thingam

2016-01-01

This study presents fuzzy-logic-based online fault detection and classification for transmission lines using National Instruments Compact Reconfigurable I/O (CRIO) devices based on Programmable Automation and Control technology. LabVIEW software combined with CRIO can perform real-time data acquisition on a transmission line. When a fault occurs in the system, the current waveforms are distorted by transients, and their pattern changes according to the type of fault. The three-phase alternating current, zero-sequence, and positive-sequence current data generated by LabVIEW through the CRIO-9067 are processed directly for relaying. The results show that the proposed technique delivers correct tripping action and fault-type classification at high speed and can therefore be employed in practical applications.
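The zero-sequence and positive-sequence currents mentioned above are standard symmetrical components of the three phase currents. The abstract does not say how they were computed in LabVIEW, so the following is only a sketch of the textbook (Fortescue) transformation on phasors; the function name `sequence_currents` and the example phasors are assumptions made for illustration.

```python
import cmath
import math

# Rotation operator a = 1 at an angle of 120 degrees.
A = cmath.rect(1.0, 2 * math.pi / 3)


def sequence_currents(ia: complex, ib: complex, ic: complex) -> tuple[complex, complex, complex]:
    """Return (zero, positive, negative) sequence phasors from phase phasors a, b, c."""
    i0 = (ia + ib + ic) / 3
    i1 = (ia + A * ib + A**2 * ic) / 3
    i2 = (ia + A**2 * ib + A * ic) / 3
    return i0, i1, i2


# Balanced a-b-c set: only the positive-sequence component is non-zero.
balanced = (
    cmath.rect(1.0, 0.0),
    cmath.rect(1.0, -2 * math.pi / 3),
    cmath.rect(1.0, 2 * math.pi / 3),
)
print([round(abs(x), 6) for x in sequence_currents(*balanced)])

# Idealized single line-to-ground fault on phase a (phases b and c carry no
# current): all three sequence components are equal to one third of the fault current.
print([round(abs(x), 6) for x in sequence_currents(3.0, 0.0, 0.0)])
```

A non-zero zero-sequence current is the usual indicator that ground is involved in a fault, which is why this component is a natural input to a fault classifier.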
Single-Molecule Denaturation Mapping of Genomic DNA in Nanofluidic Channels

NASA Astrophysics Data System (ADS)

Reisner, Walter; Larsen, Niels; Kristensen, Anders; Tegenfeldt, Jonas O.; Flyvbjerg, Henrik

2009-03-01

We have developed a new DNA barcoding technique based on the partial denaturation of extended fluorescently labeled DNA molecules. We partially melt DNA extended in nanofluidic channels via a combination of local heating and added chemical denaturants. The melted molecules, imaged via a standard fluorescence videomicroscopy setup, exhibit a nonuniform fluorescence profile corresponding to a series of local dips and peaks in the intensity trace along the stretched molecule. We show that this barcode is consistent with the presence of locally melted regions and can be explained by calculations of sequence-dependent melting probability. We believe this melting mapping technology is the first optically based single molecule technique sensitive to genome wide sequence variation that does not require an additional enzymatic labeling or restriction scheme.
Local Alignment Tool Based on Hadoop Framework and GPU Architecture

PubMed Central

Hung, Che-Lun; Hua, Guan-Jie

2014-01-01

With the rapid growth of next generation sequencing technologies, such as Slex, more and more data have been discovered and published. Computational performance is an important issue when analyzing data at this scale. Recently, many tools, such as SOAP, have been implemented on Hadoop and GPU parallel computing architectures. BLASTP is an important tool, implemented on GPU architectures, for biologists to compare protein sequences. A single GPU is not sufficient for handling big biological data. Therefore, we implement a distributed BLASTP by combining Hadoop and multiple GPUs. The experimental results show that the proposed method improves on the performance of BLASTP on a single GPU and also achieves high availability and fault tolerance. PMID:24955362
Local alignment tool based on Hadoop framework and GPU architecture.

PubMed

Hung, Che-Lun; Hua, Guan-Jie

2014-01-01

With the rapid growth of next generation sequencing technologies, such as Slex, more and more data have been discovered and published. Computational performance is an important issue when analyzing data at this scale. Recently, many tools, such as SOAP, have been implemented on Hadoop and GPU parallel computing architectures. BLASTP is an important tool, implemented on GPU architectures, for biologists to compare protein sequences. A single GPU is not sufficient for handling big biological data. Therefore, we implement a distributed BLASTP by combining Hadoop and multiple GPUs. The experimental results show that the proposed method improves on the performance of BLASTP on a single GPU and also achieves high availability and fault tolerance.
A Hybrid Approach for CpG Island Detection in the Human Genome.

PubMed

Yang, Cheng-Hong; Lin, Yu-Da; Chiang, Yi-Cheng; Chuang, Li-Yeh

2016-01-01

CpG islands have been demonstrated to influence local chromatin structures and simplify the regulation of gene activity. However, the accurate and rapid determination of CpG islands for whole DNA sequences remains experimentally and computationally challenging. A novel procedure is proposed to detect CpG islands by combining clustering technology with a sliding-window method based on particle swarm optimization (PSO). Clustering technology is used to detect the locations of all possible CpG islands and process the data, which obviates extensive and unnecessary processing of DNA fragments and thus improves the efficiency of the sliding-window-based PSO search. The proposed approach, named ClusterPSO, provides versatile and highly sensitive detection of CpG islands in the human genome. The detection efficiency of ClusterPSO is compared with that of eight CpG island detection methods on the human genome. In comparisons of sensitivity, specificity, accuracy, performance coefficient (PC), and correlation coefficient (CC), ClusterPSO showed superior detection ability among all of the tested methods. Moreover, combining clustering technology with the PSO method overcomes their respective drawbacks while maintaining their advantages. Thus, clustering technology could be hybridized with optimization algorithms to improve CpG island detection. The prediction accuracy of ClusterPSO was quite high, indicating that the combination of CpGcluster and PSO has several advantages over CpGcluster or PSO alone. In addition, ClusterPSO significantly reduced implementation time.
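For readers unfamiliar with sliding-window CpG island detection, the basic scoring step can be sketched as follows. This is not ClusterPSO: it omits both the clustering stage and the particle swarm search, and simply applies the widely used Gardiner-Garden and Frommer thresholds (window of 200 bp, GC content of at least 50%, observed/expected CpG ratio of at least 0.6). The function names and the toy sequence are assumptions made for illustration.

```python
def cpg_stats(window: str) -> tuple[float, float]:
    """Return (GC content, observed/expected CpG ratio) for one uppercase window."""
    n = len(window)
    c = window.count("C")
    g = window.count("G")
    cpg = window.count("CG")
    gc_content = (c + g) / n
    # Observed/expected CpG = (#CpG * N) / (#C * #G); defined as 0 when C or G is absent.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def candidate_windows(seq: str, window: int = 200, step: int = 1,
                      min_gc: float = 0.5, min_oe: float = 0.6) -> list[int]:
    """0-based start positions of windows that pass both thresholds."""
    seq = seq.upper()
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc_content, obs_exp = cpg_stats(seq[start:start + window])
        if gc_content >= min_gc and obs_exp >= min_oe:
            hits.append(start)
    return hits


# Toy sequence: 200 bp of CG repeats followed by 200 bp of AT repeats.
toy = "CG" * 100 + "AT" * 100
print(candidate_windows(toy, window=200, step=100))  # [0, 100]
```

Scanning every window of a whole genome this way is expensive, which is the cost the abstract says the clustering step is meant to reduce by narrowing the search to likely regions first.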
Soil DNA metabarcoding and high-throughput sequencing as a forensic tool: considerations, potential limitations and recommendations.

PubMed

Young, J M; Austin, J J; Weyrich, L S

2017-02-01

Analysis of physical evidence is typically a deciding factor in forensic casework by establishing what transpired at a scene or who was involved. Forensic geoscience is an emerging multi-disciplinary science that can offer significant benefits to forensic investigations. Soil is a powerful, nearly 'ideal' contact trace evidence, as it is highly individualistic, easy to characterise, has a high transfer and retention probability, and is often overlooked in attempts to conceal evidence. However, many real-life cases encounter close proximity soil samples or soils with low inorganic content, which cannot be easily discriminated based on current physical and chemical analysis techniques. The capability to improve forensic soil discrimination, and identify key indicator taxa from soil using the organic fraction is currently lacking. The development of new DNA sequencing technologies offers the ability to generate detailed genetic profiles from soils and enhance current forensic soil analyses. Here, we discuss the use of DNA metabarcoding combined with high-throughput sequencing (HTS) technology to distinguish between soils from different locations in a forensic context. Specifically, we provide recommendations for best practice, outline the potential limitations encountered in a forensic context and describe the future directions required to integrate soil DNA analysis into casework. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Opportunities and challenges provided by cloud repositories for bioinformatics-enabled drug discovery.

PubMed

Dalpé, Gratien; Joly, Yann

2014-09-01

Healthcare-related bioinformatics databases are increasingly offering the possibility to maintain, organize, and distribute DNA sequencing data. Different national and international institutions are currently hosting such databases that offer researchers website platforms where they can obtain sequencing data on which they can perform different types of analysis. Until recently, this process remained mostly one-dimensional, with most analysis concentrated on a limited amount of data. However, newer genome sequencing technology is producing a huge amount of data that current computer facilities are unable to handle. An alternative approach has been to start adopting cloud computing services for combining the information embedded in genomic and model system biology data, patient healthcare records, and clinical trials' data. In this new technological paradigm, researchers use virtual space and computing power from existing commercial or not-for-profit cloud service providers to access, store, and analyze data via different application programming interfaces. Cloud services are an alternative to the need of larger data storage; however, they raise different ethical, legal, and social issues. The purpose of this Commentary is to summarize how cloud computing can contribute to bioinformatics-based drug discovery and to highlight some of the outstanding legal, ethical, and social issues that are inherent in the use of cloud services. © 2014 Wiley Periodicals, Inc.
[Application of multiple polymorphism genetic markers in determination of half sibling sharing a same mother].

PubMed

Que, Ting-zhi; Zhao, Shu-min; Li, Cheng-tao

2010-08-01

Determination strategies for a half sibling sharing the same mother were investigated through the detection of autosomal and X-chromosomal STR (X-STR) loci and polymorphisms in the hypervariable (HV) regions of mitochondrial DNA (mtDNA). Genomic DNA was extracted from blood stain samples of 3 full siblings and one putative half sibling sharing the same mother with them. Fifteen autosomal STR loci were genotyped with the Sinofiler kit, and 19 X-STR loci were genotyped with the Mentype Argus X-8 kit and a 16-plex in-house system. Polymorphisms of mtDNA HV-I and HV-II were also detected with sequencing technology. A full sibling relationship between the putative half sibling and each of the 3 full siblings was excluded based on the results of autosomal STR genotyping and calculation of the full sibling index (FSI) and half sibling index (HSI). Sequencing of mtDNA HV-I and HV-II showed that all 4 samples came from the same maternal line. X-STR genotyping results determined that the putative half sibling shared the same mother with the 3 full siblings. It is reliable to combine three different genotyping technologies, namely autosomal STR, X-STR and sequencing of mtDNA HV-I and HV-II, for determination of a half sibling sharing the same mother.
[Engineered spider silk: the intelligent biomaterial of the future. Part I].

PubMed

Florczak, Anna; Piekoś, Konrad; Kaźmierska, Katarzyna; Mackiewicz, Andrzej; Dams-Kozłowska, Hanna

2011-06-17

The unique properties of spider silk, such as strength, extensibility, toughness, biocompatibility and biodegradability, are the reasons for the recent development in silk biomaterial technology. For a long time scientific progress was impeded by limited access to spider silk. However, the development of molecular biology strategies was a turning point in synthetic spider silk protein design. The sequences of engineered spider silk are based on the consensus motifs of the corresponding natural equivalents. Moreover, the engineered silk proteins may be modified in order to gain a new function. The strategy of hybrid proteins constructed at the DNA level combines the sequence of engineered silk, which is responsible for the biomaterial structure, with the sequence of a polypeptide that allows functionalization of the silk biomaterial. The functional domains may comprise receptor binding sites, enzymes, metal or sugar binding sites and others. Currently, advanced research is being conducted which, on the one hand, focuses on establishing the particular silk structure and understanding the process of silk thread formation in nature. On the other hand, there are attempts to improve methods of engineered spider silk protein production. Due to acquired knowledge and recent progress in synthetic protein technology, engineered silk will turn into the intelligent biomaterial of the future, while its industrial production scale will trigger a biotechnological revolution.
PET/MRI in Oncological Imaging: State of the Art

PubMed Central

Bashir, Usman; Mallia, Andrew; Stirling, James; Joemon, John; MacKewn, Jane; Charles-Edwards, Geoff; Goh, Vicky; Cook, Gary J.

2015-01-01

Positron emission tomography (PET) combined with magnetic resonance imaging (MRI) is a hybrid technology which has recently gained interest as a potential cancer imaging tool. Compared with CT, MRI is advantageous due to its lack of ionizing radiation, superior soft-tissue contrast resolution, and wider range of acquisition sequences. Several studies have shown PET/MRI to be equivalent to PET/CT in most oncological applications, possibly superior in certain body parts, e.g., head and neck, pelvis, and in certain situations, e.g., cancer recurrence. This review will update the readers on recent advances in PET/MRI technology and review key literature, while highlighting the strengths and weaknesses of PET/MRI in cancer imaging. PMID:26854157
Functional Nanomaterials for Environmental Applications and Bioassemblies

NASA Astrophysics Data System (ADS)

Nguyen, Michelle Anne

The rational design of nanomaterials has yielded new technologies that have revolutionized numerous diverse fields. The work detailed herein first describes the application of photocatalytic nanomaterials towards the environmental remediation of harmful toxins. Specifically, a low-temperature solution-phase synthetic route for size-controlled Cu2O octahedra particles was developed, and these materials were evaluated as catalysts for the photocatalytic degradation of aromatic organic compounds. Moreover, cubic Cu2O/Pd composite structures were fabricated and demonstrated to be effective photocatalysts for the generation of H2 and the reductive dehalogenation of polychlorinated biphenyls, well-known carcinogens present at many contaminated sites around the world. This photocatalytic approach to environmental remediation exemplifies the adaptation of light-driven technologies and sustainable practices to energy-intensive catalytic systems. In addition, this work also investigates the organic/inorganic interface of peptide-mediated Au nanoparticles as a means to identify rational design principles for materials binding peptide sequences for the advancement of stimuli-responsive bionanoassemblies. Factors inherent to peptide sequences that can promote strong materials-binding affinity and/or effective nanoparticle stabilization capability were identified in order to progress biomimetic technologies. These findings were elucidated using a combinational approach of peptide binding experiments to Au in partnership with molecular dynamics simulations. Overall, this work demonstrates the growing applications of nanomaterials in remediation technologies and aids in the understanding of the origins of peptide material affinity and nanoparticle stabilization.
Characterization and analysis of a transcriptome from the boreal spider crab Hyas araneus.

PubMed

Harms, Lars; Frickenhaus, Stephan; Schiffer, Melanie; Mark, Felix C; Storch, Daniela; Pörtner, Hans-Otto; Held, Christoph; Lucassen, Magnus

2013-12-01

Research investigating the genetic basis of physiological responses has significantly broadened our understanding of the mechanisms underlying organismic response to environmental change. However, genomic data are currently available for few taxa only, thus excluding physiological model species from this approach. In this study we report the transcriptome of the model organism Hyas araneus from Spitsbergen (Arctic). We generated 20,479 transcripts, using the 454 GS FLX sequencing technology in combination with an Illumina HiSeq sequencing approach. Annotation by Blastx revealed 7159 blast hits in the NCBI non-redundant protein database. The comparison between the spider crab H. araneus transcriptome and EST libraries of the American lobster Homarus americanus and the porcelain crab Petrolisthes cinctipes yielded 3229/2581 sequences with a significant hit, respectively. The clustering by the Markov Clustering Algorithm (MCL) revealed a common core of 1710 clusters present in all three species and 5903 unique clusters for H. araneus. The combined sequencing approaches generated transcripts that will greatly expand the limited genomic data available for crustaceans. We introduce the MCL clustering for transcriptome comparisons as a simple approach to estimate similarities between transcriptomic libraries of different size and quality and to analyze homologies within the selected group of species. In particular, we identified a large variety of reverse transcriptase (RT) sequences not only in the H. araneus transcriptome and other decapod crustaceans, but also sea urchin, supporting the hypothesis of a heritable, anti-viral immunity and the proposed viral fragment integration by host-derived RTs in marine invertebrates. © 2013.
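For readers unfamiliar with the Markov Clustering Algorithm named in this abstract, the following is a minimal pure-Python sketch of its core loop (alternating expansion and inflation of a column-stochastic matrix). It is a toy illustration only: the inflation value, tolerance, and input graph are assumptions, and the study presumably used the standard MCL software on sequence-similarity scores rather than anything like this code.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected graph given as a square adjacency matrix."""
    n = len(adjacency)
    # Add self-loops, then normalize columns.
    m = normalize_columns(
        [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)])
    for _ in range(max_iter):
        expanded = matmul(m, m)                                   # expansion
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded])  # inflation
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Rows with a non-zero diagonal are attractors; their non-zero
    # entries list the members of one cluster.
    clusters = set()
    for i in range(n):
        if m[i][i] > 1e-6:
            clusters.add(frozenset(j for j in range(n) if m[i][j] > 1e-6))
    return sorted(sorted(c) for c in clusters)


if __name__ == "__main__":
    # Toy graph: two disconnected triangles (nodes 0-2 and 3-5).
    graph = [
        [0, 1, 1, 0, 0, 0],
        [1, 0, 1, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 1],
        [0, 0, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 0],
    ]
    print(mcl(graph))
```

In a transcriptome comparison the nodes would be sequences and the edge weights similarity scores; a cluster containing sequences from all species would then count toward the shared core.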
Metagenomic and near full-length 16S rRNA sequence data in support of the phylogenetic analysis of the rumen bacterial community in steers

USDA-ARS's Scientific Manuscript database

Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplificatio...
Sequence investigation of 34 forensic autosomal STRs with massively parallel sequencing.

PubMed

Zhang, Suhua; Niu, Yong; Bian, Yingnan; Dong, Rixia; Liu, Xiling; Bao, Yun; Jin, Chao; Zheng, Hancheng; Li, Chengtao

2018-05-01

STRs vary not only in the length of the repeat units and the number of repeats but also in the region with which they conform to an incremental repeat pattern. Massively parallel sequencing (MPS) offers new possibilities in the analysis of STRs since it can simultaneously sequence multiple targets in a single reaction and capture potential internal sequence variations. Here, we sequenced 34 STRs applied in the forensic community of China with a custom-designed panel. MPS performance was evaluated through sequencing read analysis, a concordance study and sensitivity testing. High-coverage sequencing data were obtained to determine the constitute ratios and heterozygous balance. No actual inconsistent genotypes were observed between capillary electrophoresis (CE) and MPS, demonstrating the reliability of the panel and the MPS technology. With the sequencing data from the 200 investigated individuals, 346 and 418 alleles were obtained via CE and MPS technologies, respectively, at the 34 STRs, indicating that MPS technology provides higher discrimination than CE detection. The whole study demonstrated that STR genotyping with the custom panel and MPS technology has the potential not only to reveal length and sequence variations but also to satisfy the demands of high throughput and high multiplexing with acceptable sensitivity.
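The reason sequencing can report more alleles than length-based CE typing can be shown with a small sketch: two alleles of the same length collapse into one length-based allele but remain distinct when compared by sequence. The repeat sequences below are invented for illustration and are not taken from the study; `count_alleles` is a hypothetical helper name.

```python
def count_alleles(repeat_regions):
    """Return (alleles distinguishable by length, alleles distinguishable by sequence)."""
    by_length = {len(seq) for seq in repeat_regions}  # what a length-based assay sees
    by_sequence = set(repeat_regions)                 # what a sequence-based assay sees
    return len(by_length), len(by_sequence)


if __name__ == "__main__":
    # Hypothetical repeat regions observed at one locus.
    observed = [
        "TCTA" * 10,                      # 10 repeats
        "TCTA" * 5 + "TCTG" + "TCTA" * 4, # also 10 repeats, one internal variant
        "TCTA" * 11,                      # 11 repeats
    ]
    n_length, n_sequence = count_alleles(observed)
    print(n_length, n_sequence)
```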
Prediction of response to anti-EGFR antibody-based therapies by multigene sequencing in colorectal cancer patients.

PubMed

Lupini, Laura; Bassi, Cristian; Mlcochova, Jitka; Musa, Gentian; Russo, Marta; Vychytilova-Faltejskova, Petra; Svoboda, Marek; Sabbioni, Silvia; Nemecek, Radim; Slaby, Ondrej; Negrini, Massimo

2015-10-27

The anti-epidermal growth factor receptor (EGFR) monoclonal antibodies (moAbs) cetuximab or panitumumab are administered to colorectal cancer (CRC) patients who harbor wild-type RAS proto-oncogenes. However, a percentage of patients do not respond to this treatment. In addition to mutations in the RAS genes, mutations in other genes, such as BRAF, PIK3CA, or PTEN, could be involved in the resistance to anti-EGFR moAb therapy. In order to develop a comprehensive approach for the detection of mutations and to eventually identify other genes responsible for resistance to anti-EGFR moAbs, we investigated a panel of 21 genes by parallel sequencing on the Ion Torrent Personal Genome Machine platform. We sequenced 65 CRCs that were treated with cetuximab or panitumumab. Among these, 37 samples were responsive and 28 were resistant. We confirmed that mutations in EGFR-pathway genes (KRAS, NRAS, BRAF, PIK3CA) were relevant for conferring resistance to therapy and could predict response (p = 0.001). After exclusion of KRAS, NRAS, BRAF and PIK3CA combined mutations could still significantly associate to resistant phenotype (p = 0.045, by Fisher exact test). In addition, mutations in FBXW7 and SMAD4 were prevalent in cases that were non-responsive to anti-EGFR moAb. After we combined the mutations of all genes (excluding KRAS), the ability to predict response to therapy improved significantly (p = 0.002, by Fisher exact test). The combination of mutations at KRAS and at the five gene panel demonstrates the usefulness and feasibility of multigene sequencing to assess response to anti-EGFR moAbs. The application of parallel sequencing technology in clinical practice, in addition to its innate ability to simultaneously examine the genetic status of several cancer genes, proved to be more accurate and sensitive than the traditional approaches presently in use.
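The p-values in this abstract are attributed to the Fisher exact test. As a reminder of how that test works on a 2x2 table (for example mutated versus wild-type against resistant versus responsive), here is a minimal two-sided implementation using only the standard library. The counts in the example are made up and are not the study's data, which the abstract does not report.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of every table at least as unlikely as the observed one.
    total = sum(p for p in (prob(x) for x in range(lowest, highest + 1))
                if p <= p_observed * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical counts: 3 mutated/resistant, 0 mutated/responsive,
    # 0 wild-type/resistant, 3 wild-type/responsive.
    print(fisher_exact_two_sided(3, 0, 0, 3))
```

Summing all tables no more probable than the observed one is the common convention for the two-sided version; other conventions exist and can give slightly different values.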

Characterizing ncRNAs in Human Pathogenic Protists Using High-Throughput Sequencing Technology

PubMed Central

Collins, Lesley Joan

2011-01-01

ncRNAs are key genes in many human diseases including cancer and viral infection, as well as providing critical functions in pathogenic organisms such as fungi, bacteria, viruses, and protists. Until now the identification and characterization of ncRNAs associated with disease has been slow or inaccurate requiring many years of testing to understand complicated RNA and protein gene relationships. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, small nucleolar RNAs (snoRNAs), and long ncRNAs on a genomic scale, making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not an application of high priority for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how it relates to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, where large evolutionary distance has meant difficulties in comparing ncRNAs with those from model eukaryotes. A combination of biological, computational, and sequencing approaches has enabled easier classification of ncRNA classes such as snoRNAs, but has also aided the identification of novel classes. It is hoped that a higher level of understanding of ncRNA expression and interaction may aid in the development of less harsh treatment for protist-based diseases. PMID:22303390
Transcript-specific, single-nucleotide polymorphism discovery and linkage analysis in hexaploid bread wheat (Triticum aestivum L.).

PubMed

Allen, Alexandra M; Barker, Gary L A; Berry, Simon T; Coghill, Jane A; Gwilliam, Rhian; Kirby, Susan; Robinson, Phil; Brenchley, Rachel C; D'Amore, Rosalinda; McKenzie, Neil; Waite, Darren; Hall, Anthony; Bevan, Michael; Hall, Neil; Edwards, Keith J

2011-12-01

Food security is a global concern and substantial yield increases in cereal crops are required to feed the growing world population. Wheat is one of the three most important crops for human and livestock feed. However, the complexity of the genome coupled with a decline in genetic diversity within modern elite cultivars has hindered the application of marker-assisted selection (MAS) in breeding programmes. A crucial step in the successful application of MAS in breeding programmes is the development of cheap and easy to use molecular markers, such as single-nucleotide polymorphisms. To mine selected elite wheat germplasm for intervarietal single-nucleotide polymorphisms, we have used expressed sequence tags derived from public sequencing programmes and next-generation sequencing of normalized wheat complementary DNA libraries, in combination with a novel sequence alignment and assembly approach. Here, we describe the development and validation of a panel of 1114 single-nucleotide polymorphisms in hexaploid bread wheat using competitive allele-specific polymerase chain reaction genotyping technology. We report the genotyping results of these markers on 23 wheat varieties, selected to represent a broad cross-section of wheat germplasm including a number of elite UK varieties. Finally, we show that, using relatively simple technology, it is possible to rapidly generate a linkage map containing several hundred single-nucleotide polymorphism markers in the doubled haploid mapping population of Avalon × Cadenza. © 2011 The Authors. Plant Biotechnology Journal © 2011 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.
Single-cell sequencing reveals karyotype heterogeneity in murine and human malignancies.

PubMed

Bakker, Bjorn; Taudt, Aaron; Belderbos, Mirjam E; Porubsky, David; Spierings, Diana C J; de Jong, Tristan V; Halsema, Nancy; Kazemier, Hinke G; Hoekstra-Wakker, Karina; Bradley, Allan; de Bont, Eveline S J M; van den Berg, Anke; Guryev, Victor; Lansdorp, Peter M; Colomé-Tatché, Maria; Foijer, Floris

2016-05-31

Chromosome instability leads to aneuploidy, a state in which cells have abnormal numbers of chromosomes, and is found in two out of three cancers. In a chromosomal instable p53 deficient mouse model with accelerated lymphomagenesis, we previously observed whole chromosome copy number changes affecting all lymphoma cells. This suggests that chromosome instability is somehow suppressed in the aneuploid lymphomas or that selection for frequently lost/gained chromosomes out-competes the CIN-imposed mis-segregation. To distinguish between these explanations and to examine karyotype dynamics in chromosome instable lymphoma, we use a newly developed single-cell whole genome sequencing (scWGS) platform that provides a complete and unbiased overview of copy number variations (CNV) in individual cells. To analyse these scWGS data, we develop AneuFinder, which allows annotation of copy number changes in a fully automated fashion and quantification of CNV heterogeneity between cells. Single-cell sequencing and AneuFinder analysis reveals high levels of copy number heterogeneity in chromosome instability-driven murine T-cell lymphoma samples, indicating ongoing chromosome instability. Application of this technology to human B cell leukaemias reveals different levels of karyotype heterogeneity in these cancers. Our data show that even though aneuploid tumours select for particular and recurring chromosome combinations, single-cell analysis using AneuFinder reveals copy number heterogeneity. This suggests ongoing chromosome instability that other platforms fail to detect. As chromosome instability might drive tumour evolution, karyotype analysis using single-cell sequencing technology could become an essential tool for cancer treatment stratification.
Longitudinal Antigenic Sequences and Sites from Intra-Host Evolution (LASSIE) identifies immune-selected HIV variants

DOE PAGES

Hraber, Peter; Korber, Bette; Wagh, Kshitij; ...

2015-10-21

Within-host genetic sequencing from samples collected over time provides a dynamic view of how viruses evade host immunity. Immune-driven mutations might stimulate neutralization breadth by selecting antibodies adapted to cycles of immune escape that generate within-subject epitope diversity. Comprehensive identification of immune-escape mutations is experimentally and computationally challenging. With current technology, many more viral sequences can readily be obtained than can be tested for binding and neutralization, making down-selection necessary. Typically, this is done manually, by picking variants that represent different time-points and branches on a phylogenetic tree. Such strategies are likely to miss many relevant mutations and combinations of mutations, and to be redundant for other mutations. Longitudinal Antigenic Sequences and Sites from Intrahost Evolution (LASSIE) uses transmitted founder loss to identify virus “hot-spots” under putative immune selection and chooses sequences that represent recurrent mutations in selected sites. LASSIE favors earliest sequences in which mutations arise. Here, with well-characterized longitudinal Env sequences, we confirmed selected sites were concentrated in antibody contacts and selected sequences represented diverse antigenic phenotypes. Finally, practical applications include rapidly identifying immune targets under selective pressure within a subject, selecting minimal sets of reagents for immunological assays that characterize evolving antibody responses, and for immunogens in polyvalent “cocktail” vaccines.
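One way to picture "transmitted founder loss" is as the fraction of sequences at a time point that no longer carry the transmitted-founder (TF) residue at a given alignment position. The sketch below computes that quantity and keeps positions where it reaches a cutoff at any time point. This is a reading of the abstract, not the published LASSIE code; the function names, the 0.8 cutoff, and the toy alignment are all assumptions.

```python
def tf_loss_by_site(tf_seq, aligned_seqs):
    """Fraction of sequences that differ from the TF residue at each aligned position."""
    n = len(aligned_seqs)
    return [sum(1 for s in aligned_seqs if s[i] != tf_seq[i]) / n
            for i in range(len(tf_seq))]


def selected_sites(tf_seq, samples_by_time, min_loss=0.8):
    """Positions whose TF loss reaches min_loss at one or more time points."""
    sites = set()
    for seqs in samples_by_time.values():
        for i, loss in enumerate(tf_loss_by_site(tf_seq, seqs)):
            if loss >= min_loss:
                sites.add(i)
    return sorted(sites)


if __name__ == "__main__":
    # Toy aligned protein fragments keyed by days since infection (hypothetical).
    tf = "ACDE"
    samples = {
        0: ["ACDE", "ACDE"],
        30: ["AKDE"] * 9 + ["ACDE"],
    }
    print(selected_sites(tf, samples))
```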
kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

PubMed Central

Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

2013-01-01

Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
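The k-mer sequence features mentioned in this abstract can be illustrated with a short sketch that turns a DNA sequence into a normalized k-mer frequency vector, which is the kind of input a support vector machine would then be trained on. Only the feature-extraction step is shown; it is a simplified assumption of how such features might be built (for instance, it does not merge reverse-complement k-mers) and is not the kmer-SVM server's implementation.

```python
from collections import Counter
from itertools import product


def kmer_vector(seq, k=3):
    """Return normalized counts of every A/C/G/T k-mer, in lexicographic order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    # k-mers containing other symbols (such as N) are ignored.
    total = sum(counts[w] for w in vocabulary)
    return [counts[w] / total if total else 0.0 for w in vocabulary]


if __name__ == "__main__":
    vector = kmer_vector("ACGT", k=2)
    print(len(vector), vector[1])  # vector[1] is the frequency of "AC"
```

Vectors built this way for bound (positive) and unbound (negative) regions could then be passed to any SVM implementation; the largest weights of a linear model would point to the most predictive k-mers.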
kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets.

PubMed

Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S; Beer, Michael A

2013-07-01

Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167-80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org.
De novo peptide sequencing by deep learning

PubMed Central

Tran, Ngoc Hieu; Zhang, Xianglilan; Xin, Lei; Shan, Baozhen; Li, Ming

2017-01-01

De novo peptide sequencing from tandem MS data is the key technology in proteomics for the characterization of proteins, especially for new sequences, such as mAbs. In this study, we propose a deep neural network model, DeepNovo, for de novo peptide sequencing. DeepNovo architecture combines recent advances in convolutional neural networks and recurrent neural networks to learn features of tandem mass spectra, fragment ions, and sequence patterns of peptides. The networks are further integrated with local dynamic programming to solve the complex optimization task of de novo sequencing. We evaluated the method on a wide variety of species and found that DeepNovo considerably outperformed state of the art methods, achieving 7.7–22.9% higher accuracy at the amino acid level and 38.1–64.0% higher accuracy at the peptide level. We further used DeepNovo to automatically reconstruct the complete sequences of antibody light and heavy chains of mouse, achieving 97.5–100% coverage and 97.2–99.5% accuracy, without assisting databases. Moreover, DeepNovo is retrainable to adapt to any sources of data and provides a complete end-to-end training and prediction solution to the de novo sequencing problem. Not only does our study extend the deep learning revolution to a new field, but it also shows an innovative approach in solving optimization problems by using deep learning and dynamic programming. PMID:28720701
Case Study of a Small Scale Polytechnic Entrepreneurship Capstone Course Sequence

ERIC Educational Resources Information Center

Webster, Rustin D.; Kopp, Richard

2017-01-01

A multidisciplinary entrepreneurial senior capstone has been created for engineering technology students at a research I land-grant university statewide extension. The two semester course sequence welcomes students from Mechanical Engineering Technology, Electrical Engineering Technology, Computer Graphics Technology, and Organizational…
Combining and Sequencing Games Skills

ERIC Educational Resources Information Center

Belka, David E.

2004-01-01

This article discusses the combination of skills into sequences. Combining skills into usable, challenging, and meaningful sequences is often neglected or under-used in many school and community game programs. Reasons for this under-use are discussed. Combinations of skills build on proficiency in performing separate skills and serve as…
Precision medicine for cancer with next-generation functional diagnostics.

PubMed

Friedman, Adam A; Letai, Anthony; Fisher, David E; Flaherty, Keith T

2015-12-01

Precision medicine is about matching the right drugs to the right patients. Although this approach is technology agnostic, in cancer there is a tendency to make precision medicine synonymous with genomics. However, genome-based cancer therapeutic matching is limited by incomplete biological understanding of the relationship between phenotype and cancer genotype. This limitation can be addressed by functional testing of live patient tumour cells exposed to potential therapies. Recently, several 'next-generation' functional diagnostic technologies have been reported, including novel methods for tumour manipulation, molecularly precise assays of tumour responses and device-based in situ approaches; these address the limitations of the older generation of chemosensitivity tests. The promise of these new technologies suggests a future diagnostic strategy that integrates functional testing with next-generation sequencing and immunoprofiling to precisely match combination therapies to individual cancer patients.
A survey of enabling technologies in synthetic biology

PubMed Central

2013-01-01

Background: Realizing constructive applications of synthetic biology requires continued development of enabling technologies as well as policies and practices to ensure these technologies remain accessible for research. Broadly defined, enabling technologies for synthetic biology include any reagent or method that, alone or in combination with associated technologies, provides the means to generate any new research tool or application. Because applications of synthetic biology likely will embody multiple patented inventions, it will be important to create structures for managing intellectual property rights that best promote continued innovation. Monitoring the enabling technologies of synthetic biology will facilitate the systematic investigation of property rights coupled to these technologies and help shape policies and practices that impact the use, regulation, patenting, and licensing of these technologies. Results: We conducted a survey among a self-identifying community of practitioners engaged in synthetic biology research to obtain their opinions and experiences with technologies that support the engineering of biological systems. Technologies widely used and considered enabling by survey participants included public and private registries of biological parts, standard methods for physical assembly of DNA constructs, genomic databases, software tools for search, alignment, analysis, and editing of DNA sequences, and commercial services for DNA synthesis and sequencing. Standards and methods supporting measurement, functional composition, and data exchange were less widely used though still considered enabling by a subset of survey participants. Conclusions: The set of enabling technologies compiled from this survey provides insight into the many and varied technologies that support innovation in synthetic biology. Many of these technologies are widely accessible for use, either by virtue of being in the public domain or through legal tools such as non-exclusive licensing. Access to some patent protected technologies is less clear and use of these technologies may be subject to restrictions imposed by material transfer agreements or other contract terms. We expect the technologies considered enabling for synthetic biology to change as the field advances. By monitoring the enabling technologies of synthetic biology and addressing the policies and practices that impact their development and use, our hope is that the field will be better able to realize its full potential. PMID:23663447
Application of whole genome shotgun sequencing for detection and characterization of genetically modified organisms and derived products.

PubMed

Holst-Jensen, Arne; Spilsberg, Bjørn; Arulandhu, Alfred J; Kok, Esther; Shi, Jianxin; Zel, Jana

2016-07-01

The emergence of high-throughput, massive or next-generation sequencing technologies has created a completely new foundation for molecular analyses. Various selective enrichment processes are commonly applied to facilitate detection of predefined (known) targets. Such approaches, however, inevitably introduce a bias and are prone to miss unknown targets. Here we review the application of high-throughput sequencing technologies and the preparation of fit-for-purpose whole genome shotgun sequencing libraries for the detection and characterization of genetically modified and derived products. The potential impact of these new sequencing technologies for the characterization, breeding selection, risk assessment, and traceability of genetically modified organisms and genetically modified products is yet to be fully acknowledged. The published literature is reviewed, and the prospects for future developments and use of the new sequencing technologies for these purposes are discussed.
Historical Perspective, Development and Applications of Next-Generation Sequencing in Plant Virology

PubMed Central

Barba, Marina; Czosnek, Henryk; Hadidi, Ahmed

2014-01-01

Next-generation high throughput sequencing technologies became available at the onset of the 21st century. They provide a highly efficient, rapid, and low cost DNA sequencing platform beyond the reach of the standard and traditional DNA sequencing technologies developed in the late 1970s. They are continually improved to become faster, more efficient and cheaper. They have been used in many fields of biology since 2004. In 2009, next-generation sequencing (NGS) technologies began to be applied to several areas of plant virology including virus/viroid genome sequencing, discovery and detection, ecology and epidemiology, replication and transcription. Identification and characterization of known and unknown viruses and/or viroids in infected plants are currently among the most successful applications of these technologies. It is expected that NGS will play very significant roles in many research and non-research areas of plant virology. PMID:24399207
FDA's Activities Supporting Regulatory Application of "Next Gen" Sequencing Technologies.

PubMed

Wilson, Carolyn A; Simonyan, Vahan

2014-01-01

Applications of next-generation sequencing (NGS) technologies require availability and access to an information technology (IT) infrastructure and bioinformatics tools for large amounts of data storage and analyses. The U.S. Food and Drug Administration (FDA) anticipates that the use of NGS data to support regulatory submissions will continue to increase as the scientific and clinical communities become more familiar with the technologies and identify more ways to apply these advanced methods to support development and evaluation of new biomedical products. FDA laboratories are conducting research on different NGS platforms and developing the IT infrastructure and bioinformatics tools needed to enable regulatory evaluation of the technologies and the data sponsors will submit. A High-performance Integrated Virtual Environment, or HIVE, has been launched, and development and refinement continues as a collaborative effort between the FDA and George Washington University to provide the tools to support these needs. The use of a highly parallelized environment facilitated by use of distributed cloud storage and computation has resulted in a platform that is both rapid and responsive to changing scientific needs. The FDA plans to further develop in-house capacity in this area, while also supporting engagement by the external community, by sponsoring an open, public workshop to discuss NGS technologies and data formats standardization, and to promote the adoption of interoperability protocols in September 2014. Next-generation sequencing (NGS) technologies are enabling breakthroughs in how the biomedical community is developing and evaluating medical products. One example is the potential application of this method to the detection and identification of microbial contaminants in biologic products. In order for the U.S. 
Food and Drug Administration (FDA) to be able to evaluate the utility of this technology, we need to have the information technology infrastructure and bioinformatics tools to be able to store and analyze large amounts of data. To address this need, we have developed the High-performance Integrated Virtual Environment, or HIVE. HIVE uses a combination of distributed cloud storage and distributed cloud computations to provide a platform that is both rapid and responsive to support the growing and increasingly diverse scientific and regulatory needs of FDA scientists in their evaluation of NGS in research and ultimately for evaluation of NGS data in regulatory submissions. © PDA, Inc. 2014.
Recent advances in rice genome and chromosome structure research by fluorescence in situ hybridization (FISH).

PubMed

Ohmido, Nobuko; Fukui, Kiichi; Kinoshita, Toshiro

2010-01-01

Fluorescence in situ hybridization (FISH) is an effective method for the physical mapping of genes and repetitive DNA sequences on chromosomes. Physical mapping of unique nucleotide sequences on specific rice chromosome regions was performed using a combination of chromosome identification and highly sensitive FISH. Increases in the detection sensitivity of smaller DNA sequences and improvements in spatial resolution have ushered in a new phase in FISH technology. Thus, it is now possible to perform in situ hybridization on somatic chromosomes, pachytene chromosomes, and even on extended DNA fibers (EDFs). Pachytene-FISH allows the integration of genetic linkage maps and quantitative chromosome maps. Visualization methods using FISH can reveal the spatial organization of the centromere, heterochromatin/euchromatin, and the terminal structures of rice chromosomes. Furthermore, EDF-FISH and the DNA combing technique can resolve a spatial distance of 1 kb between adjacent DNA sequences, and the detection of even a 300-bp target is now feasible. The copy numbers of various repetitive sequences and the sizes of various DNA molecules were quantitatively measured using the molecular combing technique. This review describes the significance of these advances in molecular cytology in rice and discusses future applications in plant studies using visualization techniques.
Application of advanced cytometric and molecular technologies to minimal residual disease monitoring

NASA Astrophysics Data System (ADS)

Leary, James F.; He, Feng; Reece, Lisa M.

2000-04-01

Minimal residual disease monitoring presents a number of theoretical and practical challenges. Recently it has been possible to meet some of these challenges by combining a number of new advanced biotechnologies. Monitoring the number of residual tumor cells requires complex cocktails of molecular probes that collectively provide sensitivities of detection on the order of one residual tumor cell per million total cells. Ultra-high-speed, multiparameter flow cytometry is capable of analyzing cells at rates in excess of 100,000 cells/sec. Residual tumor selection marker cocktails can be optimized by use of receiver operating characteristic analysis. New data-minimizing techniques, when combined with multivariate statistical or neural network classifications of tumor cells, can more accurately predict residual tumor cell frequencies. The combination of these techniques can, under at least some circumstances, detect frequencies of tumor cells as low as one cell in a million with an accuracy of over 98 percent correct classification. Detection of mutations in tumor suppressor genes requires isolation of these rare tumor cells and single-cell DNA sequencing. Rare residual tumor cells can be isolated at the single-cell level by high-resolution single-cell sorting. Molecular characterization of tumor suppressor gene mutations can be accomplished using a combination of single-cell polymerase chain reaction amplification of specific gene sequences followed by TA cloning techniques and DNA sequencing. Mutations as small as a single base pair in a tumor suppressor gene of a single sorted tumor cell have been detected using these methods. Using new amplification procedures and DNA microarrays it should be possible to extend the capabilities shown in this paper to screening of multiple DNA mutations in tumor suppressor and other genes on small numbers of sorted metastatic tumor cells.
Top-down analysis of protein samples by de novo sequencing techniques.

PubMed

Vyatkina, Kira; Wu, Si; Dekker, Lennard J M; VanDuijn, Martijn M; Liu, Xiaowen; Tolić, Nikola; Luider, Theo M; Paša-Tolić, Ljiljana; Pevzner, Pavel A

2016-09-15

Recent technological advances have made high-resolution mass spectrometers affordable to many laboratories, thus boosting rapid development of top-down mass spectrometry and creating a need for efficient methods for analyzing this kind of data. We describe a method for analysis of protein samples from top-down tandem mass spectrometry data, which capitalizes on de novo sequencing of fragments of the proteins present in the sample. Our algorithm takes as input a set of de novo amino acid strings derived from the given mass spectra using the recently proposed Twister approach, and combines them into aggregated strings endowed with offsets. The former typically constitute accurate sequence fragments of sufficiently well-represented proteins from the sample being analyzed, while the latter indicate their location in the protein sequence, and also bear information on post-translational modifications and fragmentation patterns. Availability: freely available on the web at http://bioinf.spbau.ru/en/twister. Contact: vyatkina@spbau.ru or ppevzner@ucsd.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Genome sequence of Plasmopara viticola and insight into the pathogenic mechanism

PubMed Central

Yin, Ling; An, Yunhe; Qu, Junjie; Li, Xinlong; Zhang, Yali; Dry, Ian; Wu, Huijuan; Lu, Jiang

2017-01-01

Plasmopara viticola causes downy mildew disease of grapevine, which is one of the most devastating diseases of viticulture worldwide. Here we report a 101.3 Mb whole genome sequence of P. viticola isolate ‘JL-7-2’ obtained by a combination of Illumina and PacBio sequencing technologies. The P. viticola genome contains 17,014 putative protein-coding genes and has ~26% repetitive sequences. A total of 1,301 putative secreted proteins, including 100 putative RXLR effectors and 90 CRN effectors, were identified in this genome. In the secretome, 261 potential pathogenicity genes and 95 carbohydrate-active enzymes were predicted. Transcriptional analysis revealed that most of the RXLR effectors, pathogenicity genes and carbohydrate-active enzymes were significantly up-regulated during infection. Comparative genomic analysis revealed that P. viticola evolved independently from the Arabidopsis downy mildew pathogen Hyaloperonospora arabidopsidis. The availability of the P. viticola genome provides a valuable resource not only for comparative genomic analysis and evolutionary studies among oomycetes, but also for enhancing our knowledge of the mechanism of interactions between this biotrophic pathogen and its host. PMID:28417959
Community-led comparative genomic and phenotypic analysis of the aquaculture pathogen Pseudomonas baetica a390T sequenced by Ion semiconductor and Nanopore technologies

PubMed Central

Beaton, Ainsley; Lood, Cédric; Cunningham-Oakes, Edward; MacFadyen, Alison; Mullins, Alex J; Bestawy, Walid El; Botelho, João; Chevalier, Sylvie; Dalzell, Chloe; Dolan, Stephen K; Faccenda, Alberto; Ghequire, Maarten G K; Higgins, Steven; Kutschera, Alexander; Murray, Jordan; Redway, Martha; Salih, Talal; Smith, Brian A; Smits, Nathan; Thomson, Ryan; Woodcock, Stuart; Cornelis, Pierre; Lavigne, Rob; van Noort, Vera

2018-01-01

Pseudomonas baetica strain a390T is the type strain of this recently described species and here we present its high-contiguity draft genome. To celebrate the 16th International Conference on Pseudomonas, the genome of P. baetica strain a390T was sequenced using a unique combination of Ion Torrent semiconductor and Oxford Nanopore methods as part of a collaborative community-led project. Combining high-quality Ion Torrent sequences with long Nanopore reads rapidly gave a high-contiguity, high-quality, 16-contig genome sequence. Whole genome phylogenetic analysis places P. baetica within the P. koreensis clade of the P. fluorescens group. Comparison of the main genomic features of P. baetica with a variety of other Pseudomonas spp. suggests that it is a highly adaptable organism, typical of the genus. This strain was originally isolated from the liver of a diseased wedge sole fish, and genotypic and phenotypic analyses show that it is tolerant to osmotic stress and to oxytetracycline. PMID:29579234
Multiplexed Sequence Encoding: A Framework for DNA Communication.

PubMed

Zakeri, Bijan; Carr, Peter A; Lu, Timothy K

2016-01-01

Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication (data encoding, data transfer, and data extraction) and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system, Multiplexed Sequence Encoding (MuSE), that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA.
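The idea of converting plaintext into DNA while avoiding homopolymers can be illustrated with a minimal sketch. This is not the iKeys scheme from the abstract, whose key layout is not given here; it swaps in a simple rotating ternary code, in which each base-3 digit selects one of the three bases that differ from the previous base, so no base ever repeats. The function names and the choice of six ternary digits per byte are assumptions made for illustration.

```python
# Illustrative only: a rotating ternary code, NOT the iKeys mapping from the abstract.
BASES = "ACGT"
TRITS_PER_BYTE = 6  # 3**6 = 729 >= 256, so six base-3 digits cover one byte


def byte_to_trits(value):
    """Return the base-3 digits of a byte, most significant digit first."""
    trits = []
    for _ in range(TRITS_PER_BYTE):
        value, digit = divmod(value, 3)
        trits.append(digit)
    return trits[::-1]


def encode(text):
    """Encode text as DNA in which no two adjacent bases are identical."""
    previous = "A"  # assumed starting context; the first emitted base is never 'A'
    out = []
    for byte in text.encode("utf-8"):
        for trit in byte_to_trits(byte):
            choices = [b for b in BASES if b != previous]
            previous = choices[trit]
            out.append(previous)
    return "".join(out)


def decode(dna):
    """Invert encode(): recover the text from a homopolymer-free DNA string."""
    previous = "A"
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != previous]
        trits.append(choices.index(base))
        previous = base
    data = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        data.append(value)
    return data.decode("utf-8")


message = "MuSE"
dna = encode(message)
```

Each character costs six nucleotides in this sketch, which is less dense than a two-bit-per-base code; that overhead is the price paid for guaranteeing that no base is ever followed by itself.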

Use of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells.

PubMed

Xin, Yurong; Kim, Jinrang; Ni, Min; Wei, Yi; Okamoto, Haruka; Lee, Joseph; Adler, Christina; Cavino, Katie; Murphy, Andrew J; Yancopoulos, George D; Lin, Hsin Chieh; Gromada, Jesper

2016-03-22

This study provides an assessment of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells. The system combines microfluidic technology and nanoliter-scale reactions. We sequenced 622 cells, allowing identification of 341 islet cells with high-quality gene expression profiles. The cells clustered into populations of α-cells (5%), β-cells (92%), δ-cells (1%), and pancreatic polypeptide cells (2%). We identified cell-type-specific transcription factors and pathways primarily involved in nutrient sensing and oxidation and cell signaling. Unexpectedly, 281 cells had to be removed from the analysis due to low viability, low sequencing quality, or contamination resulting in the detection of more than one islet hormone. Collectively, we provide a resource for identification of high-quality gene expression datasets to help expand insights into genes and pathways characterizing islet cell types. We reveal limitations in the C1 Fluidigm cell capture process resulting in contaminated cells with altered gene expression patterns. This calls for caution when interpreting single-cell transcriptomics data using the C1 Fluidigm system.
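The contamination criterion mentioned above (detection of more than one islet hormone in a single cell) can be sketched as a simple filter. The sketch below is an assumption-laden illustration, not the authors' pipeline: the gene symbols are the usual mouse hormone genes, while the expression threshold, the function names, and the example values are hypothetical.

```python
# Hypothetical QC sketch; the threshold and the example expression values are invented.
HORMONE_GENES = {
    "Gcg": "alpha",   # glucagon
    "Ins1": "beta",   # insulin 1
    "Ins2": "beta",   # insulin 2
    "Sst": "delta",   # somatostatin
    "Ppy": "PP",      # pancreatic polypeptide
}


def hormones_detected(expression, threshold=100.0):
    """Return the islet cell types whose hormone gene exceeds the threshold."""
    return {
        cell_type
        for gene, cell_type in HORMONE_GENES.items()
        if expression.get(gene, 0.0) > threshold
    }


def classify_cell(expression, threshold=100.0):
    """Label a cell by its single detected hormone, or flag it for removal."""
    detected = hormones_detected(expression, threshold)
    if len(detected) == 1:
        return next(iter(detected))
    return "multi-hormone" if detected else "no-hormone"


cells = {
    "cell_01": {"Ins1": 5200.0, "Ins2": 8100.0, "Gcg": 3.0},
    "cell_02": {"Gcg": 4300.0, "Ins2": 12.0},
    "cell_03": {"Ins2": 6000.0, "Sst": 2500.0},  # two hormones: flagged
}
labels = {name: classify_cell(expr) for name, expr in cells.items()}

# Counts quoted in the abstract: cells sequenced minus cells retained.
removed = 622 - 341  # 281, matching the number of cells reported as removed
```

In practice the cutoff would need to be chosen from the data, since ambient RNA can produce low-level hormone signal even in uncontaminated cells.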
RIKEN Integrated Sequence Analysis (RISA) System—384-Format Sequencing Pipeline with 384 Multicapillary Sequencer

PubMed Central

Shibata, Kazuhiro; Itoh, Masayoshi; Aizawa, Katsunori; Nagaoka, Sumiharu; Sasaki, Nobuya; Carninci, Piero; Konno, Hideaki; Akiyama, Junichi; Nishi, Katsuo; Kitsunai, Tokuji; Tashiro, Hideo; Itoh, Mari; Sumi, Noriko; Ishii, Yoshiyuki; Nakamura, Shin; Hazama, Makoto; Nishine, Tsutomu; Harada, Akira; Yamamoto, Rintaro; Matsumoto, Hiroyuki; Sakaguchi, Sumito; Ikegami, Takashi; Kashiwagi, Katsuya; Fujiwake, Syuji; Inoue, Kouji; Togawa, Yoshiyuki; Izawa, Masaki; Ohara, Eiji; Watahiki, Masanori; Yoneda, Yuko; Ishikawa, Tomokazu; Ozawa, Kaori; Tanaka, Takumi; Matsuura, Shuji; Kawai, Jun; Okazaki, Yasushi; Muramatsu, Masami; Inoue, Yorinao; Kira, Akira; Hayashizaki, Yoshihide

2000-01-01

The RIKEN high-throughput 384-format sequencing pipeline (RISA system) including a 384-multicapillary sequencer (the so-called RISA sequencer) was developed for the RIKEN mouse encyclopedia project. The RISA system consists of colony picking, template preparation, sequencing reaction, and the sequencing process. A novel high-throughput 384-format capillary sequencer system (RISA sequencer system) was developed for the sequencing process. This system consists of a 384-multicapillary auto sequencer (RISA sequencer), a 384-multicapillary array assembler (CAS), and a 384-multicapillary casting device. The RISA sequencer can simultaneously analyze 384 independent sequencing products. The optical system is a scanning system chosen after careful comparison with an image detection system for the simultaneous detection of the 384-capillary array. This scanning system can be used with any fluorescent-labeled sequencing reaction (chain termination reaction), including transcriptional sequencing based on RNA polymerase, which was originally developed by us, and cycle sequencing based on thermostable DNA polymerase. For long-read sequencing, 380 out of 384 sequences (99.2%) were successfully analyzed and the average read length, with more than 99% accuracy, was 654.4 bp. A single RISA sequencer can analyze 216 kb with >99% accuracy in 2.4 h (90 kb/h). For short-read sequencing to cluster the 3′ end and 5′ end sequencing by reading 350 bp, 384 samples can be analyzed in 1.5 h. We have also developed a RISA inoculator, RISA filtrator and densitometer, RISA plasmid preparator which can handle throughput of 40,000 samples in 17.5 h, and a high-throughput RISA thermal cycler which has four 384-well sites. The combination of these technologies allowed us to construct the RISA system consisting of 16 RISA sequencers, which can process 50,000 DNA samples per day. 
One haploid genome shotgun sequence of a higher organism, such as human, mouse, rat, domestic animals, and plants, can be revealed by seven RISA systems within one month. PMID:11076861
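The system-level figures quoted in this record can be related to one another with a little arithmetic. The sketch below uses only numbers stated in the abstract (16 sequencers, 50,000 samples per day, 384 samples per 1.5 h short-read run, 40,000 plasmid preparations in 17.5 h); the assumption of uninterrupted 24-hour operation is mine and gives an upper bound rather than a measured rate.

```python
# Back-of-the-envelope check using figures quoted in the abstract.
SEQUENCERS = 16
SAMPLES_PER_DAY = 50_000
PLATE_SIZE = 384                 # samples per 384-capillary run
SHORT_READ_RUN_HOURS = 1.5
PREP_SAMPLES, PREP_HOURS = 40_000, 17.5

# Daily load each sequencer must carry to reach the stated system throughput.
per_sequencer = SAMPLES_PER_DAY / SEQUENCERS        # 3125.0 samples/day
plates_per_sequencer = per_sequencer / PLATE_SIZE   # about 8.14 runs/day

# Upper bound on short-read runs per day, assuming round-the-clock operation.
max_short_runs = 24 / SHORT_READ_RUN_HOURS          # 16.0 runs/day
max_short_samples = max_short_runs * PLATE_SIZE     # 6144.0 samples/day

# Plasmid preparator rate implied by the stated batch size and duration.
prep_rate = PREP_SAMPLES / PREP_HOURS               # about 2286 samples/h
```

On these numbers, the stated 50,000 samples per day asks each sequencer for roughly eight 384-sample runs daily, about half of the sixteen short-read runs that would fit into 24 hours.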
A Bioinformatic Pipeline for Monitoring of the Mutational Stability of Viral Drug Targets with Deep-Sequencing Technology.

PubMed

Kravatsky, Yuri; Chechetkin, Vladimir; Fedoseeva, Daria; Gorbacheva, Maria; Kravatskaya, Galina; Kretova, Olga; Tchurikov, Nickolai

2017-11-23

The efficient development of antiviral drugs, including efficient antiviral small interfering RNAs (siRNAs), requires continuous monitoring of the strict correspondence between a drug and the related highly variable viral DNA/RNA target(s). Deep sequencing is able to provide an assessment of both the general target conservation and the frequency of particular mutations in the different target sites. The aim of this study was to develop a reliable bioinformatic pipeline for the analysis of millions of short, deep sequencing reads corresponding to selected highly variable viral sequences that are drug target(s). The suggested bioinformatic pipeline combines the available programs and the ad hoc scripts based on an original algorithm of the search for the conserved targets in the deep sequencing data. We also present the statistical criteria for the threshold of reliable mutation detection and for the assessment of variations between corresponding data sets. These criteria are robust against the possible sequencing errors in the reads. As an example, the bioinformatic pipeline is applied to the study of the conservation of RNA interference (RNAi) targets in human immunodeficiency virus 1 (HIV-1) subtype A. The developed pipeline is freely available to download at the website http://virmut.eimb.ru/. Brief comments and comparisons between VirMut and other pipelines are also presented.
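A threshold for reliable mutation detection that is robust to sequencing errors can be illustrated with a simple binomial model. This is a generic sketch, not the criterion implemented in VirMut, which the abstract does not spell out: it assumes that sequencing errors at a position are independent with a known per-base rate, and it asks how many variant reads are needed before errors alone become an unlikely explanation. The function names, the error rate, and the significance level are all illustrative assumptions.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X <= k - 1)."""
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return max(0.0, 1.0 - below)


def min_reliable_count(depth, error_rate, alpha=1e-3):
    """Smallest variant read count whose tail probability under errors alone is < alpha."""
    for k in range(depth + 1):
        if binomial_tail(depth, k, error_rate) < alpha:
            return k
    return depth + 1  # no count at this depth is sufficient


def is_reliable_mutation(variant_reads, depth, error_rate, alpha=1e-3):
    """True if the observed variant count exceeds what sequencing error would explain."""
    return variant_reads >= min_reliable_count(depth, error_rate, alpha)


# Example: 1,000 reads covering a target site, assumed 1% per-base error rate.
threshold = min_reliable_count(1000, 0.01)
```

A real pipeline would also have to correct for testing many positions at once and for error rates that vary by sequence context, both of which this sketch ignores.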
Pulmonary parenchyma segmentation in thin CT image sequences with spectral clustering and geodesic active contour model based on similarity

NASA Astrophysics Data System (ADS)

He, Nana; Zhang, Xiaolong; Zhao, Juanjuan; Zhao, Huilan; Qiang, Yan

2017-07-01

While the popular thin-layer scanning technology of spiral CT has helped to improve the diagnosis of lung diseases, the large volume of images it produces also dramatically increases physicians' workload in lesion detection. Computer-aided diagnosis techniques such as lesion segmentation in thin CT sequences have been developed to address this issue, but it remains a challenge to achieve high segmentation efficiency and accuracy without substantial manual intervention. In this paper, we present our research on automated segmentation of lung parenchyma with an improved geodesic active contour model, the geodesic active contour model based on similarity (GACBS). Combining a spectral clustering algorithm based on Nystrom (SCN) with GACBS, this algorithm first extracts key image slices, then uses these slices to generate an initial contour of the pulmonary parenchyma for un-segmented slices with an interpolation algorithm, and finally segments the lung parenchyma of the un-segmented slices. Experimental results show that the segmentation results generated by our method are close to those of manual segmentation, with an average volume overlap ratio of 91.48%.
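The abstract does not define its volume overlap ratio. A common choice for comparing an automatic segmentation with a manual one is the intersection-over-union (Jaccard) of the two voxel sets, and the sketch below assumes that definition; other papers use the Dice coefficient instead, which would give a different number for the same masks. The function name and the toy masks are hypothetical.

```python
def volume_overlap_ratio(auto_mask, manual_mask):
    """Intersection-over-union of two flattened binary voxel masks (assumed definition)."""
    if len(auto_mask) != len(manual_mask):
        raise ValueError("masks must cover the same voxels")
    intersection = sum(1 for a, m in zip(auto_mask, manual_mask) if a and m)
    union = sum(1 for a, m in zip(auto_mask, manual_mask) if a or m)
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0


# Toy example: eight voxels, 1 = lung parenchyma, 0 = background.
auto = [0, 1, 1, 1, 0, 0, 1, 1]
manual = [0, 1, 1, 0, 0, 1, 1, 1]
ratio = volume_overlap_ratio(auto, manual)  # 4 shared voxels out of 6 in the union
```

For a full CT volume the same computation would be applied to the flattened three-dimensional masks, averaged over cases to obtain a summary figure comparable to the one reported.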
[The principle and application of the single-molecule real-time sequencing technology].

PubMed

Yanhu, Liu; Lu, Wang; Li, Yu

2015-03-01

The last decade witnessed the explosive development of third-generation sequencing strategies, including single-molecule real-time sequencing (SMRT), true single-molecule sequencing (tSMS™) and single-molecule nanopore DNA sequencing. In this review, we summarize the principle, performance and application of the SMRT sequencing technology. Compared with the traditional Sanger method and the next-generation sequencing (NGS) technologies, the SMRT approach has several advantages, including long read length, high speed, PCR-free library preparation and the capability of direct detection of epigenetic modifications. However, its low accuracy, which results mostly from insertions and deletions, is a notable disadvantage, so the raw sequence data need to be corrected before assembly. To date, SMRT sequencing has been a good fit for de novo genomic sequencing and high-quality assemblies of small genomes. In the future, it is expected to play an important role in epigenetics, transcriptomic sequencing, and assemblies of large genomes.
Human genome project: revolutionizing biology through leveraging technology

NASA Astrophysics Data System (ADS)

Dahl, Carol A.; Strausberg, Robert L.

1996-04-01

The Human Genome Project (HGP) is an international project to develop genetic, physical, and sequence-based maps of the human genome. Since the inception of the HGP it has been clear that substantially improved technology would be required to meet the scientific goals, particularly in order to acquire the complete sequence of the human genome, and that these technologies coupled with the information forthcoming from the project would have a dramatic effect on the way biomedical research is performed in the future. In this paper, we discuss the state-of-the-art for genomic DNA sequencing, technological challenges that remain, and the potential technological paths that could yield substantially improved genomic sequencing technology. The impact of the technology developed from the HGP is broad-reaching and a discussion of other research and medical applications that are leveraging HGP-derived DNA analysis technologies is included. The multidisciplinary approach to the development of new technologies that has been successful for the HGP provides a paradigm for facilitating new genomic approaches toward understanding the biological role of functional elements and systems within the cell, including those encoded within genomic DNA and their molecular products.
ATRF Houses the Latest DNA Sequencing Technologies | Poster

Cancer.gov

DeVine, Ashley (Staff Writer)

By the end of October, the Advanced Technology Research Facility (ATRF) will be one of the few facilities in the world to house all of the latest DNA sequencing technologies.
Fluorescent labeling of NASBA amplified tmRNA molecules for microarray applications

PubMed Central

Scheler, Ott; Glynn, Barry; Parkel, Sven; Palta, Priit; Toome, Kadri; Kaplinski, Lauris; Remm, Maido; Maher, Majella; Kurg, Ants

2009-01-01

Background Here we present a promising novel microbial diagnostic method that combines the sensitivity of Nucleic Acid Sequence Based Amplification (NASBA) with the high information content of microarray technology for the detection of bacterial tmRNA molecules. The NASBA protocol was modified to include aminoallyl-UTP (aaUTP) molecules that were incorporated into nascent RNA during the NASBA reaction. Post-amplification labeling with fluorescent dye was carried out subsequently and tmRNA hybridization signal intensities were measured using microarray technology. Significant optimization of the labeled NASBA protocol was required to maintain the required sensitivity of the reactions. Results Two different aaUTP salts were evaluated and optimum final concentrations were identified for both. A final concentration of 2 mM aaUTP Li-salt in the NASBA reaction resulted in the highest microarray signals overall, twice as high as the strongest signals with 1 mM aaUTP Na-salt. Conclusion We have successfully demonstrated an efficient combination of NASBA amplification technology with microarray-based hybridization detection. The method is applicable to many different areas of microbial diagnostics including environmental monitoring, biothreat detection, industrial process monitoring and clinical microbiology. PMID:19445684
Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

PubMed

Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

2017-07-01

PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.
Combined investigation of Eddy current and ultrasonic techniques for composite materials NDE

NASA Technical Reports Server (NTRS)

Davis, C. W.; Nath, S.; Fulton, J. P.; Namkung, M.

1993-01-01

Advanced composites are not without trade-offs. Their increased designability brings an increase in the complexity of their internal geometry and, as a result, an increase in the number of failure modes associated with a defect. When two or more isotropic materials are combined in a composite, the isotropic material failure modes may also combine. In a laminate, matrix delamination, cracking and crazing, and voids and porosity will often combine with fiber breakage, shattering, waviness, and separation to bring about ultimate structural failure. This combining of failure modes can result in defect boundaries of different sizes, corresponding to the failure of each structural component. This paper discusses a dual-technology nondestructive evaluation (NDE) study, using eddy current (EC) and ultrasonics (UT), of graphite/epoxy (gr/ep) laminate samples. Eddy current and ultrasonic raster (C-scan) imaging were used together to characterize the effects of mechanical impact damage, high temperature thermal damage and various types of inserts in gr/ep laminate samples of various stacking sequences.
Assembly and features of secondary metabolite biosynthetic gene clusters in Streptomyces ansochromogenes.

PubMed

Zhong, Xingyu; Tian, Yuqing; Niu, Guoqing; Tan, Huarong

2013-07-01

A draft genome sequence of Streptomyces ansochromogenes 7100 was generated using 454 sequencing technology. In combination with local BLAST searches and gap filling techniques, a comprehensive antiSMASH-based method was adopted to assemble the secondary metabolite biosynthetic gene clusters in the draft genome of S. ansochromogenes. A total of at least 35 putative gene clusters were identified and assembled. Transcriptional analysis showed that 20 of the 35 gene clusters were expressed in either or all of the three different media tested, whereas the other 15 gene clusters were silent in all three different media. This study provides a comprehensive method to identify and assemble secondary metabolite biosynthetic gene clusters in draft genomes of Streptomyces, and will significantly promote functional studies of these secondary metabolite biosynthetic gene clusters.
CloVR: A virtual machine for automated and portable sequence analysis from the desktop using cloud computing

PubMed Central

2011-01-01

Background Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom-built virtual machines to distribute pre-packaged, pre-configured software. Results We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition, CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. Conclusion The CloVR VM and associated architecture lower the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high throughput data processing. PMID:21878105
The European Classical Swine Fever Virus Database: Blueprint for a Pathogen-Specific Sequence Database with Integrated Sequence Analysis Tools

PubMed Central

Postel, Alexander; Schmeiser, Stefanie; Zimmermann, Bernd; Becher, Paul

2016-01-01

Molecular epidemiology has become an indispensable tool in the diagnosis of diseases and in tracing the infection routes of pathogens. Due to advances in conventional sequencing and the development of high throughput technologies, the field of sequence determination is in the process of being revolutionized. Platforms for sharing sequence information and providing standardized tools for phylogenetic analyses are becoming increasingly important. The database (DB) of the European Union (EU) and World Organisation for Animal Health (OIE) Reference Laboratory for classical swine fever offers one of the world’s largest semi-public virus-specific sequence collections combined with a module for phylogenetic analysis. The classical swine fever (CSF) DB (CSF-DB) became a valuable tool for supporting diagnosis and epidemiological investigations of this highly contagious disease in pigs with high socio-economic impacts worldwide. The DB has been re-designed and now allows for the storage and analysis of traditionally used, well established genomic regions and of larger genomic regions including complete viral genomes. We present an application example for the analysis of highly similar viral sequences obtained in an endemic disease situation and introduce the new geographic “CSF Maps” tool. The concept of this standardized and easy-to-use DB with an integrated genetic typing module is suited to serve as a blueprint for similar platforms for other human or animal viruses. PMID:27827988
Sequence2Vec: a novel embedding approach for modeling transcription factor binding affinity landscape.

PubMed

Dai, Hanjun; Umarov, Ramzan; Kuwahara, Hiroyuki; Li, Yu; Song, Le; Gao, Xin

2017-11-15

An accurate characterization of transcription factor (TF)-DNA affinity landscape is crucial to a quantitative understanding of the molecular mechanisms underpinning endogenous gene regulation. While recent advances in biotechnology have brought the opportunity for building binding affinity prediction methods, the accurate characterization of TF-DNA binding affinity landscape still remains a challenging problem. Here we propose a novel sequence embedding approach for modeling the transcription factor binding affinity landscape. Our method represents DNA binding sequences as a hidden Markov model which captures both position specific information and long-range dependency in the sequence. A cornerstone of our method is a novel message passing-like embedding algorithm, called Sequence2Vec, which maps these hidden Markov models into a common nonlinear feature space and uses these embedded features to build a predictive model. Our method is a novel combination of the strength of probabilistic graphical models, feature space embedding and deep learning. We conducted comprehensive experiments on over 90 large-scale TF-DNA datasets which were measured by different high-throughput experimental technologies. Sequence2Vec outperforms alternative machine learning methods as well as the state-of-the-art binding affinity prediction methods. Our program is freely available at https://github.com/ramzan1990/sequence2vec. Contact: xin.gao@kaust.edu.sa or lsong@cc.gatech.edu. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
A high-density consensus map of barley linking DArT markers to SSR, RFLP and STS loci and agricultural traits

PubMed Central

Wenzl, Peter; Li, Haobing; Carling, Jason; Zhou, Meixue; Raman, Harsh; Paul, Edie; Hearnden, Phillippa; Maier, Christina; Xia, Ling; Caig, Vanessa; Ovesná, Jaroslava; Cakir, Mehmet; Poulsen, David; Wang, Junping; Raman, Rosy; Smith, Kevin P; Muehlbauer, Gary J; Chalmers, Ken J; Kleinhofs, Andris; Huttner, Eric; Kilian, Andrzej

2006-01-01

Background Molecular marker technologies are undergoing a transition from largely serial assays measuring DNA fragment sizes to hybridization-based technologies with high multiplexing levels. Diversity Arrays Technology (DArT) is a hybridization-based technology that is increasingly being adopted by barley researchers. There is a need to integrate the information generated by DArT with previous data produced with gel-based marker technologies. The goal of this study was to build a high-density consensus linkage map from the combined datasets of ten populations, most of which were simultaneously typed with DArT and Simple Sequence Repeat (SSR), Restriction Enzyme Fragment Polymorphism (RFLP) and/or Sequence Tagged Site (STS) markers. Results The consensus map, built using a combination of JoinMap 3.0 software and several purpose-built perl scripts, comprised 2,935 loci (2,085 DArT, 850 other loci) and spanned 1,161 cM. It contained a total of 1,629 'bins' (unique loci), with an average inter-bin distance of 0.7 ± 1.0 cM (median = 0.3 cM). More than 98% of the map could be covered with a single DArT assay. The arrangement of loci was very similar to, and almost as optimal as, the arrangement of loci in component maps built for individual populations. The locus order of a synthetic map derived from merging the component maps without considering the segregation data was only slightly inferior. The distribution of loci along chromosomes indicated centromeric suppression of recombination in all chromosomes except 5H. DArT markers appeared to have a moderate tendency toward hypomethylated, gene-rich regions in distal chromosome areas. On the average, 14 ± 9 DArT loci were identified within 5 cM on either side of SSR, RFLP or STS loci previously identified as linked to agricultural traits. 
Conclusion Our barley consensus map provides a framework for transferring genetic information between different marker systems and for deploying DArT markers in molecular breeding schemes. The study also highlights the need for improved software for building consensus maps from high-density segregation data of multiple populations. PMID:16904008
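The bin statistics quoted in the abstract above (number of bins, mean and median inter-bin distance) can be illustrated with a minimal Python sketch. It assumes that loci mapped to the same position on a chromosome share one bin and that positions are rounded to 0.1 cM; the function names and the rounding precision are hypothetical, and this is not the authors' JoinMap/perl workflow.

```python
from statistics import mean, median

def bin_positions(positions_cm, precision=1):
    """Collapse loci that map to the same rounded position into one bin.

    Returns the sorted bin positions (in cM) for a single chromosome.
    """
    return sorted({round(p, precision) for p in positions_cm})

def inter_bin_distances(positions_cm, precision=1):
    """Distances (in cM) between adjacent bins on a single chromosome."""
    bins = bin_positions(positions_cm, precision)
    return [round(b - a, precision) for a, b in zip(bins, bins[1:])]

# Hypothetical toy chromosome: two loci co-segregate at 0.0 cM.
loci = [0.0, 0.0, 0.5, 2.0]
gaps = inter_bin_distances(loci)
summary = {"bins": len(bin_positions(loci)), "mean": mean(gaps), "median": median(gaps)}
```

For a whole map, the gaps would be computed per chromosome and pooled before taking the mean and median, so that no distance is measured across a chromosome boundary.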
Using expected sequence features to improve basecalling accuracy of amplicon pyrosequencing data.

PubMed

Rask, Thomas S; Petersen, Bent; Chen, Donald S; Day, Karen P; Pedersen, Anders Gorm

2016-04-22

Amplicon pyrosequencing targets a known genetic region and thus inherently produces reads highly anticipated to have certain features, such as conserved nucleotide sequence, and in the case of protein coding DNA, an open reading frame. Pyrosequencing errors, consisting mainly of nucleotide insertions and deletions, are on the other hand likely to disrupt open reading frames. Such an inverse relationship between errors and expectation based on prior knowledge can be used advantageously to guide the process known as basecalling, i.e. the inference of nucleotide sequence from raw sequencing data. The new basecalling method described here, named Multipass, implements a probabilistic framework for working with the raw flowgrams obtained by pyrosequencing. For each sequence variant Multipass calculates the likelihood and nucleotide sequence of several most likely sequences given the flowgram data. This probabilistic approach enables integration of basecalling into a larger model where other parameters can be incorporated, such as the likelihood for observing a full-length open reading frame at the targeted region. We apply the method to 454 amplicon pyrosequencing data obtained from a malaria virulence gene family, where Multipass generates 20% more error-free sequences than current state of the art methods, and provides sequence characteristics that allow generation of a set of high confidence error-free sequences. This novel method can be used to increase accuracy of existing and future amplicon sequencing data, particularly where extensive prior knowledge is available about the obtained sequences, for example in analysis of the immunoglobulin VDJ region where Multipass can be combined with a model for the known recombining germline genes. Multipass is available for Roche 454 data at http://www.cbs.dtu.dk/services/MultiPass-1.0, and the concept can potentially be implemented for other sequencing technologies as well.
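The idea of letting an expected open reading frame guide the choice among candidate sequences can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Multipass model: the candidate log-likelihoods are taken as given, and the `orf_bonus` parameter and both function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def has_open_reading_frame(seq, frame=0):
    """True if seq, read from `frame`, is a whole number of codons with no stop codon."""
    coding = seq.upper()[frame:]
    if len(coding) == 0 or len(coding) % 3 != 0:
        return False  # an insertion or deletion has shifted the frame
    codons = (coding[i:i + 3] for i in range(0, len(coding), 3))
    return all(codon not in STOP_CODONS for codon in codons)

def rank_candidates(candidates, orf_bonus=10.0):
    """Re-rank (sequence, log_likelihood) pairs, favouring ORF-preserving candidates.

    `orf_bonus` is a hypothetical log-scale prior added to candidates that keep
    the reading frame open; a real model would estimate this weight from data.
    """
    scored = [
        (log_lik + (orf_bonus if has_open_reading_frame(seq) else 0.0), seq)
        for seq, log_lik in candidates
    ]
    return [seq for _, seq in sorted(scored, reverse=True)]

# Hypothetical candidates for one read: the first has a one-base insertion.
ranked = rank_candidates([("ATGAAACCGT", -1.0), ("ATGAAACCG", -3.0)])
```

In this toy case the frame-preserving candidate is ranked first even though its raw likelihood is lower, which is the kind of trade-off the prior is meant to express.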
Pelagic and benthic communities of the Antarctic ecosystem of Potter Cove: Genomics and ecological implications.

PubMed

Abele, D; Vazquez, S; Buma, A G J; Hernandez, E; Quiroga, C; Held, C; Frickenhaus, S; Harms, L; Lopez, J L; Helmke, E; Mac Cormack, W P

2017-06-01

Molecular technologies are more frequently applied in Antarctic ecosystem research and the growing amount of sequence-based information available in databases adds a new dimension to understanding the response of Antarctic organisms and communities to environmental change. We apply molecular techniques, including fingerprinting, and amplicon and metagenome sequencing, to understand biodiversity and phylogeography to resolve adaptive processes in an Antarctic coastal ecosystem from microbial to macrobenthic organisms and communities. Interpretation of the molecular data is not only achieved by their combination with classical methods (pigment analyses or microscopy), but furthermore by combining molecular with environmental data (e.g., sediment characteristics, biogeochemistry or oceanography) in space and over time. The studies form part of a long-term ecosystem investigation in Potter Cove on King-George Island, Antarctica, in which we follow the effects of rapid retreat of the local glacier on the cove ecosystem. We formulate and encourage new approaches to integrate molecular tools into Antarctic ecosystem research, environmental conservation actions, and polar ocean observatories. Copyright © 2017 Elsevier B.V. All rights reserved.
High sensitive and direct fluorescence detection of single viral DNA sequences by integration of double strand probes onto microgels particles.

PubMed

Aliberti, A; Cusano, A M; Battista, E; Causa, F; Netti, P A

2016-02-21

A novel class of probes for fluorescence detection was developed and combined with microgel particles for highly sensitive fluorescence detection of nucleic acids. A double strand probe with an optimized fluorophore-quencher couple was designed for the detection of different lengths of nucleic acids (39 nt and 100 nt). This probe proved efficient in target detection in different contexts and remained specific even in the presence of serum proteins. The conjugation of double strand probes onto polymeric microgels allows for a sensitive detection of DNA sequences from HIV, HCV and SARS corona viruses with LODs of 1.4 fM, 3.7 fM and 1.4 fM, respectively, and a dynamic range of 10^-9 to 10^-15 M. This combination enhances detection sensitivity by almost five orders of magnitude compared with the probe alone. The proposed platform, based on the integration of an innovative double strand probe into microgel particles, represents an attractive alternative to conventional sensitive DNA detection technologies that rely on amplification methods.
Electrochemical techniques on sequence-specific PCR amplicon detection for point-of-care applications.

PubMed

Luo, Xiaoteng; Hsing, I-Ming

2009-10-01

Nucleic acid based analysis provides accurate differentiation among closely affiliated species and this species- and sequence-specific detection technique would be particularly useful for point-of-care (POC) testing for prevention and early detection of highly infectious and damaging diseases. Electrochemical (EC) detection and polymerase chain reaction (PCR) are two indispensable steps, in our view, in a nucleic acid based point-of-care testing device as the former, in comparison with the fluorescence counterpart, provides inherent advantages of detection sensitivity, device miniaturization and operation simplicity, and the latter offers an effective way to boost the amount of targets to a detectable quantity. In this mini-review, we will highlight some of the interesting investigations using the combined EC detection and PCR amplification approaches for end-point detection and real-time monitoring. The promise of current approaches and the direction for future investigations will be discussed. It would be our view that the synergistic effect of the combined EC-PCR steps in a portable device provides a promising detection technology platform that will be ready for point-of-care applications in the near future.
Using the QCM Biosensor-Based T7 Phage Display Combined with Bioinformatics Analysis for Target Identification of Bioactive Small Molecule.

PubMed

Takakusagi, Yoichi; Takakusagi, Kaori; Sugawara, Fumio; Sakaguchi, Kengo

2018-01-01

Identification of target proteins that directly bind to a bioactive small molecule is of great interest in terms of clarifying the mode of action of the small molecule as well as elucidating the biological phenomena at the molecular level. Of the experimental technologies available, T7 phage display allows comprehensive screening of small molecule-recognizing amino acid sequences from the peptide libraries displayed on the T7 phage capsid. Here, we describe a T7 phage display strategy that is combined with a quartz-crystal microbalance (QCM) biosensor as an affinity selection platform and with bioinformatics analysis of small molecule-recognizing short peptides. This method dramatically enhances the efficacy and throughput of the screening for small molecule-recognizing amino acid sequences without repeated rounds of selection. Subsequent execution of bioinformatics programs allows combinatorial and comprehensive discovery of the target proteins of small molecules, together with their binding sites, regardless of protein sample insolubility, instability, or inaccessibility of the fixed small molecules to internally located binding sites on larger target proteins, which are limitations of conventional proteomics approaches.

Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

DOEpatents

Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S

2013-06-25

A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.
Examining the Causes and Consequences of Short-Term Behavioral Change during the Middle Stone Age at Sibudu, South Africa

PubMed Central

Conard, Nicholas J.; Will, Manuel

2015-01-01

Sibudu in KwaZulu-Natal (South Africa) with its rich and high-resolution archaeological sequence provides an ideal case study to examine the causes and consequences of short-term variation in the behavior of modern humans during the Middle Stone Age (MSA). We present the results from a technological analysis of 11 stratified lithic assemblages which overlie the Howiesons Poort deposits and all date to ~58 ka. Based on technological and typological attributes, we conducted inter-assemblage comparisons to characterize the nature and tempo of cultural change in successive occupations. This work identified considerable short-term variation with clear temporal trends throughout the sequence, demonstrating that knappers at Sibudu varied their technology over short time spans. The lithic assemblages can be grouped into three cohesive units which differ from each other in the procurement of raw materials, the frequency in the methods of core reduction, the kind of blanks produced, and in the nature of tools the inhabitants of Sibudu made and used. These groups of assemblages represent different strategies of lithic technology, which build upon each other in a gradual, cumulative manner. We also identify a clear pattern of development toward what we have previously defined as the Sibudan cultural taxonomic unit. Contextualizing these results on larger geographical scales shows that the later phase of the MSA during MIS 3 in KwaZulu-Natal and southern Africa is one of dynamic cultural change rather than of stasis or stagnation as has at times been claimed. In combination with environmental, subsistence and contextual information, our high-resolution data on lithic technology suggest that short-term behavioral variability at Sibudu can be best explained by changes in technological organization and socio-economic dynamics instead of environmental forcing. PMID:26098694
A typing scheme for the honeybee pathogen Melissococcus plutonius allows detection of disease transmission events and a study of the distribution of variants.

PubMed

Haynes, Edward; Helgason, Thorunn; Young, J Peter W; Thwaites, Richard; Budge, Giles E

2013-08-01

Melissococcus plutonius is the bacterial pathogen that causes European Foulbrood of honeybees, a globally important honeybee brood disease. We have used next-generation sequencing to identify highly polymorphic regions in an otherwise genetically homogeneous organism, and used these loci to create a modified MLST scheme. This synthesis of a proven typing scheme format with next-generation sequencing combines reliability and low costs with insights only available from high-throughput sequencing technologies. Using this scheme we show that the global distribution of M. plutonius variants is not uniform. We use the scheme in epidemiological studies to trace movements of infective material around England, insights that would have been impossible to confirm without the typing scheme. We also demonstrate the persistence of local variants over time. © 2013 Crown copyright. Reproduced with the permission of the Controller of Her Majesty's Stationery Office/Queen’s Printer for Scotland and Food and Environment Research Agency.
MR CAT scan: a modular approach for hybrid imaging.

PubMed

Hillenbrand, C; Hahn, D; Haase, A; Jakob, P M

2000-07-01

In this study, a modular concept for NMR hybrid imaging is presented. This concept essentially integrates different imaging modules in a sequential fashion and is therefore called CAT (combined acquisition technique). CAT is not a single specific measurement sequence, but rather a sequence design concept whereby distinct acquisition techniques with varying imaging parameters are employed in rapid succession in order to cover k-space. The power of the CAT approach is that it provides a high flexibility toward the acquisition optimization with respect to the available imaging time and the desired image quality. Important CAT sequence optimization steps include the appropriate choice of the k-space coverage ratio and the application of mixed bandwidth technology. Details of both the CAT methodology and possible CAT acquisition strategies, such as FLASH/EPI-, RARE/EPI- and FLASH/BURST-CAT are provided. Examples from imaging experiments in phantoms and healthy volunteers including mixed bandwidth acquisitions are provided to demonstrate the feasibility of the proposed CAT concept.
Identification, Characterization, and Recombinant Expression of Epidermicin NI01, a Novel Unmodified Bacteriocin Produced by Staphylococcus epidermidis That Displays Potent Activity against Staphylococci

PubMed Central

Sandiford, Stephanie

2012-01-01

We describe the discovery, purification, characterization, and expression of an antimicrobial peptide, epidermicin NI01, which is an unmodified bacteriocin produced by Staphylococcus epidermidis strain 224. It is a highly cationic, hydrophobic, plasmid-encoded peptide that exhibits potent antimicrobial activity toward a wide range of pathogenic Gram-positive bacteria including methicillin-resistant Staphylococcus aureus (MRSA), enterococci, and biofilm-forming S. epidermidis strains. Purification of the peptide was achieved using a combination of hydrophobic interaction, cation exchange, and high-performance liquid chromatography (HPLC). Matrix-assisted laser desorption ionization–time of flight (MALDI-TOF) analysis yielded a molecular mass of 6,074 Da, and partial sequence data of the peptide were elucidated using a combination of tandem mass spectrometry (MS/MS) and de novo sequencing. The draft genome sequence of the producing strain was obtained using 454 pyrosequencing technology, thus enabling the identification of the structural gene using the de novo peptide sequence data previously obtained. Epidermicin NI01 contains 51 residues with four tryptophan and nine lysine residues, and the sequence showed approximately 50% identity to peptides lacticin Z, lacticin Q, and aureocin A53, all of which belong to a new family of unmodified type II-like bacteriocins. The peptide is active in the nanomolar range against S. epidermidis, MRSA isolates, and vancomycin-resistant enterococci. Other unique features displayed by epidermicin include a high degree of protease stability and the ability to retain antimicrobial activity over a pH range of 2 to 10, and exposure to the peptide does not result in development of resistance in susceptible isolates. In this study we also show the structural gene alone can be cloned into Escherichia coli strain BL21(DE3), and expression yields active peptide. PMID:22155816
A Simple Method for Visualization of Locus-Specific H4K20me1 Modifications in Living Caenorhabditis elegans Single Cells.

PubMed

Shinkai, Yoichi; Kuramochi, Masahiro; Doi, Motomichi

2018-05-03

Recently, advances in next-generation sequencing technologies have enabled genome-wide analyses of epigenetic modifications; however, it remains difficult to analyze the states of histone modifications at a single-cell resolution in living multicellular organisms because of the heterogeneity within cellular populations. Here we describe a simple method to visualize histone modifications on the specific sequence of target locus at a single-cell resolution in living Caenorhabditis elegans, by combining the LacO/LacI system and a genetically-encoded H4K20me1-specific probe, "mintbody". We demonstrate that Venus-labeled mintbody and mTurquoise2-labeled LacI can co-localize on an artificial chromosome carrying both the target locus and LacO sequences, where H4K20me1 marks the target locus. We demonstrate that our visualization method can precisely detect H4K20me1 depositions on the her-1 gene sequences on the artificial chromosome, to which the dosage compensation complex binds to regulate sex determination. The degree of H4K20me1 deposition on the her-1 sequences on the artificial chromosome correlated strongly with sex, suggesting that, using the artificial chromosome, this method can reflect context-dependent changes of H4K20me1 on endogenous genomes. Furthermore, we demonstrate live imaging of H4K20me1 depositions on the artificial chromosome. Combined with ChIP assays, this mintbody-LacO/LacI visualization method will enable analysis of developmental and context-dependent alterations of locus-specific histone modifications in specific cells and elucidation of the underlying molecular mechanisms. Copyright © 2018, G3: Genes, Genomes, Genetics.
Resolving Relationships among the Megadiverse Butterflies and Moths with a Novel Pipeline for Anchored Phylogenomics.

PubMed

Breinholt, Jesse W; Earl, Chandra; Lemmon, Alan R; Lemmon, Emily Moriarty; Xiao, Lei; Kawahara, Akito Y

2018-01-01

The advent of next-generation sequencing technology has allowed for the collection of large portions of the genome for phylogenetic analysis. Hybrid enrichment and transcriptomics are two techniques that leverage next-generation sequencing and have shown much promise. However, methods for processing hybrid enrichment data are still limited. We developed a pipeline for anchored hybrid enrichment (AHE) read assembly, orthology determination, contamination screening, and data processing for sequences flanking the target "probe" region. We apply this approach to study the phylogeny of butterflies and moths (Lepidoptera), a megadiverse group of more than 157,000 described species with poorly understood deep-level phylogenetic relationships. We introduce a new, 855 locus AHE kit for Lepidoptera phylogenetics and compare resulting trees to those from transcriptomes. The enrichment kit was designed from existing genomes, transcriptomes, and expressed sequence tags and was used to capture sequence data from 54 species from 23 lepidopteran families. Phylogenies estimated from AHE data were largely congruent with trees generated from transcriptomes, with strong support for relationships at all but the deepest taxonomic levels. We combine AHE and transcriptomic data to generate a new Lepidoptera phylogeny, representing 76 exemplar species in 42 families. The tree provides robust support for many relationships, including those among the seven butterfly families. The addition of AHE data to an existing transcriptomic dataset lowers node support along the Lepidoptera backbone, but firmly places taxa with AHE data on the phylogeny. Combining taxa sequenced for AHE with existing transcriptomes and genomes resulted in a tree with strong support for (Calliduloidea + Gelechioidea + Thyridoidea) + (Papilionoidea + Pyraloidea + Macroheterocera).
To examine the efficacy of AHE at a shallow taxonomic level, phylogenetic analyses were also conducted on a sister group representing a more recent divergence, the Saturniidae and Sphingidae. These analyses utilized sequences from the probe region and data flanking it, nearly doubling the size of the dataset; resulting trees supported new phylogenetic relationships, especially within the Saturniidae and Sphingidae (e.g., Hemarina derived in the latter). We hope that our data processing pipeline, hybrid enrichment gene set, and approach of combining AHE data with transcriptomes will be useful for the broader systematics community. © The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Rapid evaluation and quality control of next generation sequencing data with FaQCs.

PubMed

Lo, Chien-Chi; Chain, Patrick S G

2014-11-19

Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.
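Because read quality often declines toward the end of a read, a common first step in quality control is to trim low-quality bases from the 3' end. The Python sketch below shows one simple way to do this, assuming Phred+33 encoded quality strings; the function names and the `min_q` threshold are hypothetical, and this is not the trimming algorithm FaQCs itself implements.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]

def trim_3prime(seq, quality, min_q=20, offset=33):
    """Remove trailing bases whose Phred score is below `min_q`.

    Returns the trimmed (sequence, quality) pair; both may be empty if every
    base falls below the threshold.
    """
    scores = phred_scores(quality, offset)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], quality[:end]

# Hypothetical read: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
trimmed_seq, trimmed_qual = trim_3prime("ACGTAC", "IIII##")
```

A production tool would typically add a minimum-length filter after trimming and process reads in parallel, as the abstract describes.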
Discrimination of the Lactobacillus acidophilus group using sequencing, species-specific PCR and SNaPshot mini-sequencing technology based on the recA gene.

PubMed

Huang, Chien-Hsun; Chang, Mu-Tzu; Huang, Mu-Chiou; Wang, Li-Tin; Huang, Lina; Lee, Fwu-Ling

2012-10-01

To clearly identify specific species and subspecies of the Lactobacillus acidophilus group using phenotypic and genotypic (16S rDNA sequence analysis) techniques alone is difficult. The aim of this study was to use the recA gene for species discrimination in the L. acidophilus group, as well as to develop a species-specific primer and single nucleotide polymorphism primer based on the recA gene sequence for species and subspecies identification. The average sequence similarity for the recA gene among type strains was 80.0%, and most members of the L. acidophilus group could be clearly distinguished. The species-specific primer was designed according to the recA gene sequencing, which was employed for polymerase chain reaction with the template DNA of Lactobacillus strains. A single 231-bp species-specific band was found only in L. delbrueckii. A SNaPshot mini-sequencing assay using recA as a target gene was also developed. The specificity of the mini-sequencing assay was evaluated using 31 strains of L. delbrueckii species and was able to unambiguously discriminate strains belonging to the subspecies L. delbrueckii subsp. bulgaricus. The phylogenetic relationships of most strains in the L. acidophilus group can be resolved using recA gene sequencing, and a novel method to identify the species and subspecies of the L. delbrueckii and L. delbrueckii subsp. bulgaricus was developed by species-specific polymerase chain reaction combined with SNaPshot mini-sequencing. Copyright © 2012 Society of Chemical Industry.
VaDiR: an integrated approach to Variant Detection in RNA.

PubMed

Neums, Lisa; Suenaga, Seiji; Beyerlein, Peter; Anders, Sara; Koestler, Devin; Mariani, Andrea; Chien, Jeremy

2018-02-01

Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered. But the cost associated with it is currently limiting its general use in research. Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost associated with capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning the functional significance of the mutations when these mutations are observed in the non-coding region or in genes that are not expressed in cancer tissue. We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA (VaDiR) that integrates 3 variant callers, namely: SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels. Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.
Fungal proteomics: from identification to function.

PubMed

Doyle, Sean

2011-08-01

Some fungi cause disease in humans and plants, while others have demonstrable potential for the control of insect pests. In addition, fungi are also a rich reservoir of therapeutic metabolites and industrially useful enzymes. Detailed analysis of fungal biochemistry is now enabled by multiple technologies including protein mass spectrometry, genome and transcriptome sequencing and advances in bioinformatics. Yet, the assignment of function to fungal proteins, encoded either by in silico annotated, or unannotated genes, remains problematic. The purpose of this review is to describe the strategies used by many researchers to reveal protein function in fungi, and more importantly, to consolidate the nomenclature of 'unknown function protein' as opposed to 'hypothetical protein' - once any protein has been identified by protein mass spectrometry. A combination of approaches including comparative proteomics, pathogen-induced protein expression and immunoproteomics are outlined, which, when used in combination with a variety of other techniques (e.g. functional genomics, microarray analysis, immunochemical and infection model systems), appear to yield comprehensive and definitive information on protein function in fungi. The relative advantages of proteomic, as opposed to transcriptomic-only, analyses are also described. In the future, combined high-throughput, quantitative proteomics, allied to transcriptomic sequencing, are set to reveal much about protein function in fungi. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
Investigation of bacterial and archaeal communities: novel protocols using modern sequencing by Illumina MiSeq and traditional DGGE-cloning.

PubMed

Kraková, Lucia; Šoltys, Katarína; Budiš, Jaroslav; Grivalský, Tomáš; Ďuriš, František; Pangallo, Domenico; Szemes, Tomáš

2016-09-01

Different protocols based on Illumina high-throughput DNA sequencing and denaturing gradient gel electrophoresis (DGGE)-cloning were developed and applied for investigating hot spring related samples. The study was focused on three target genes: archaeal and bacterial 16S rRNA and mcrA of methanogenic microflora. Shorter read lengths of the currently most popular technology of sequencing by Illumina do not allow analysis of the complete 16S rRNA region, or of longer gene fragments, as was the case of Sanger sequencing. Here, we demonstrate that there is no need for special indexed or tailed primer sets dedicated to short variable regions of 16S rRNA since the presented approach allows the analysis of complete bacterial 16S rRNA amplicons (V1-V9) and longer archaeal 16S rRNA and mcrA sequences. Sample augmented with transposon is represented by a set of approximately 300 bp long fragments that can be easily sequenced by Illumina MiSeq. Furthermore, a low proportion of chimeric sequences was observed. DGGE-cloning based strategies were performed combining semi-nested PCR, DGGE and clone library construction. Comparing both investigation methods, a certain degree of complementarity was observed confirming that the DGGE-cloning approach is not obsolete. Novel protocols were created for several types of laboratories, utilizing the traditional DGGE technique or using the most modern Illumina sequencing.
Sunflower Hybrid Breeding: From Markers to Genomic Selection

PubMed Central

Dimitrijevic, Aleksandra; Horn, Renate

2018-01-01

In sunflower, molecular markers for simple traits as, e.g., fertility restoration, high oleic acid content, herbicide tolerance or resistances to Plasmopara halstedii, Puccinia helianthi, or Orobanche cumana have been successfully used in marker-assisted breeding programs for years. However, agronomically important complex quantitative traits like yield, heterosis, drought tolerance, oil content or selection for disease resistance, e.g., against Sclerotinia sclerotiorum have been challenging and will require genome-wide approaches. Plant genetic resources for sunflower are being collected and conserved worldwide that represent valuable resources to study complex traits. Sunflower association panels provide the basis for genome-wide association studies, overcoming disadvantages of biparental populations. Advances in technologies and the availability of the sunflower genome sequence made novel approaches on the whole genome level possible. Genotype-by-sequencing, and whole genome sequencing based on next generation sequencing technologies facilitated the production of large amounts of SNP markers for high density maps as well as SNP arrays and allowed genome-wide association studies and genomic selection in sunflower. Genome wide or candidate gene based association studies have been performed for traits like branching, flowering time, resistance to Sclerotinia head and stalk rot. First steps in genomic selection with regard to hybrid performance and hybrid oil content have shown that genomic selection can successfully address complex quantitative traits in sunflower and will help to speed up sunflower breeding programs in the future. To make sunflower more competitive toward other oil crops higher levels of resistance against pathogens and better yield performance are required. In addition, optimizing plant architecture toward a more complex growth type for higher plant densities has the potential to considerably increase yields per hectare. 
Integrative approaches combining omic technologies (genomics, transcriptomics, proteomics, metabolomics and phenomics) using bioinformatic tools will facilitate the identification of target genes and markers for complex traits and will give a better insight into the mechanisms behind the traits. PMID:29387071
High-throughput sequence alignment using Graphics Processing Units

PubMed Central

Schatz, Michael C; Trapnell, Cole; Delcher, Arthur L; Varshney, Amitabh

2007-01-01

Background The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. Results This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. Conclusion MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU. PMID:18070356
Nanopore-based fourth-generation DNA sequencing technology.

PubMed

Feng, Yanxiao; Zhang, Yuechuan; Ying, Cuifeng; Wang, Deqiang; Du, Chunlei

2015-02-01

Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.
Single-cell sequencing in stem cell biology.

PubMed

Wen, Lu; Tang, Fuchou

2016-04-15

Cell-to-cell variation and heterogeneity are fundamental and intrinsic characteristics of stem cell populations, but these differences are masked when bulk cells are used for omic analysis. Single-cell sequencing technologies serve as powerful tools to dissect cellular heterogeneity comprehensively and to identify distinct phenotypic cell types, even within a 'homogeneous' stem cell population. These technologies, including single-cell genome, epigenome, and transcriptome sequencing technologies, have been developing rapidly in recent years. The application of these methods to different types of stem cells, including pluripotent stem cells and tissue-specific stem cells, has led to exciting new findings in the stem cell field. In this review, we discuss the recent progress as well as future perspectives in the methodologies and applications of single-cell omic sequencing technologies.
A Simple and Efficient Method for Assembling TALE Protein Based on Plasmid Library

PubMed Central

Xu, Huarong; Xin, Ying; Zhang, Tingting; Ma, Lixia; Wang, Xin; Chen, Zhilong; Zhang, Zhiying

2013-01-01

The DNA binding domain of the transcription activator-like effectors (TALEs) from Xanthomonas sp. consists of tandem repeats that can be rearranged according to a simple cipher to target new DNA sequences with high DNA-binding specificity. This technology has been successfully applied in a variety of species for genome engineering. However, assembling long TALE tandem repeats remains a big challenge precluding wide use of this technology. Although several new methodologies for efficiently assembling TALE repeats have been recently reported, all of them require either sophisticated facilities or skilled technicians to carry them out. Here, we describe a simple and efficient method for generating customized TALE nucleases (TALENs) and TALE transcription factors (TALE-TFs) based on a TALE repeat tetramer library. A tetramer library consisting of 256 tetramers covers all possible combinations of 4 base pairs. A set of unique primers was designed for amplification of these tetramers. PCR products were assembled by one step of digestion/ligation reaction. 12 TALE constructs including 4 TALEN pairs targeted to mouse Gt(ROSA)26Sor gene and mouse Mstn gene sequences as well as 4 TALE-TF constructs targeted to mouse Oct4, c-Myc, Klf4 and Sox2 gene promoter sequences were generated by using our method. The construction routines took 3 days and parallel constructions were available. The rate of positive clones during colony PCR verification was 64% on average. Sequencing results suggested that all TALE constructs were assembled with a high success rate. This is a rapid and cost-efficient method using the most common enzymes and facilities with a high success rate. PMID:23840477
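The figure of 256 tetramers in the abstract above follows from 4 possible bases at each of 4 positions (4^4 = 256). The short Python sketch below enumerates that space; the helper `tetramer_blocks` is a hypothetical illustration that splits a target into consecutive 4-base blocks and assumes, as a simplification not stated in the abstract, that the target length is a multiple of 4.

```python
from itertools import product

BASES = "ACGT"

# Every ordered combination of 4 bases: 4**4 = 256 tetramers.
tetramers = ["".join(combo) for combo in product(BASES, repeat=4)]


def tetramer_blocks(target):
    """Split a target sequence into consecutive 4-base blocks.

    Hypothetical helper for illustration only; it assumes the target
    length is a multiple of 4 and contains only A, C, G and T.
    """
    target = target.upper()
    if len(target) % 4 != 0:
        raise ValueError("target length must be a multiple of 4")
    if set(target) - set(BASES):
        raise ValueError("target may only contain A, C, G and T")
    return [target[i:i + 4] for i in range(0, len(target), 4)]


print(len(tetramers))                   # size of the tetramer space
print(tetramer_blocks("ACGTTTGACCAA"))  # made-up 12-base target
```

Each block returned by the helper is, by construction, one of the 256 enumerated tetramers.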
A simple and efficient method for assembling TALE protein based on plasmid library.

PubMed

Zhang, Zhiqiang; Li, Duo; Xu, Huarong; Xin, Ying; Zhang, Tingting; Ma, Lixia; Wang, Xin; Chen, Zhilong; Zhang, Zhiying

2013-01-01

The DNA binding domain of the transcription activator-like effectors (TALEs) from Xanthomonas sp. consists of tandem repeats that can be rearranged according to a simple cipher to target new DNA sequences with high DNA-binding specificity. This technology has been successfully applied in a variety of species for genome engineering. However, assembling long TALE tandem repeats remains a big challenge precluding wide use of this technology. Although several new methodologies for efficiently assembling TALE repeats have been recently reported, all of them require either sophisticated facilities or skilled technicians to carry them out. Here, we describe a simple and efficient method for generating customized TALE nucleases (TALENs) and TALE transcription factors (TALE-TFs) based on a TALE repeat tetramer library. A tetramer library consisting of 256 tetramers covers all possible combinations of 4 base pairs. A set of unique primers was designed for amplification of these tetramers. PCR products were assembled by one step of digestion/ligation reaction. 12 TALE constructs including 4 TALEN pairs targeted to mouse Gt(ROSA)26Sor gene and mouse Mstn gene sequences as well as 4 TALE-TF constructs targeted to mouse Oct4, c-Myc, Klf4 and Sox2 gene promoter sequences were generated by using our method. The construction routines took 3 days and parallel constructions were available. The rate of positive clones during colony PCR verification was 64% on average. Sequencing results suggested that all TALE constructs were assembled with a high success rate. This is a rapid and cost-efficient method using the most common enzymes and facilities with a high success rate.
The sequence of sequencers: The history of sequencing DNA

PubMed Central

Heather, James M.; Chain, Benjamin

2016-01-01

Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. PMID:26554401

Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

ScienceCinema

Patel, Kamlesh D.

2018-01-22

Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.
Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Patel, Kamlesh D.

2012-06-01

Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.
Open source tools to exploit DNA sequence data from livestock species

USDA-ARS's Scientific Manuscript database

Next-Generation Sequencing (NGS) is a recent technological development that allows researchers to rapidly determine the DNA sequence of an individual. The decrease in cost of NGS has brought the technology into the realm of practical applications in livestock genomics, where it can be used to genera...
Fluorescence in situ hybridization and optical mapping to correct scaffold arrangement in the tomato genome

USDA-ARS's Scientific Manuscript database

Modern biological analyses are often assisted by recent technologies making the sequencing of complex genomes both technically possible and feasible. We recently sequenced the tomato genome that, like many eukaryotic genomes, is large and complex. Current sequencing technologies allow the developmen...
Single-Molecule Electrical Random Resequencing of DNA and RNA

NASA Astrophysics Data System (ADS)

Ohshiro, Takahito; Matsubara, Kazuki; Tsutsui, Makusu; Furuhashi, Masayuki; Taniguchi, Masateru; Kawai, Tomoji

2012-07-01

Two paradigm shifts in DNA sequencing technologies--from bulk to single molecules and from optical to electrical detection--are expected to realize label-free, low-cost DNA sequencing that does not require PCR amplification. It will lead to development of high-throughput third-generation sequencing technologies for personalized medicine. Although nanopore devices have been proposed as third-generation DNA-sequencing devices, a significant milestone in these technologies has been attained by demonstrating a novel technique for resequencing DNA using electrical signals. Here we report single-molecule electrical resequencing of DNA and RNA using a hybrid method of identifying single-base molecules via tunneling currents and random sequencing. Our method reads sequences of nine types of DNA oligomers. The complete sequence of 5'-UGAGGUA-3' from the let-7 microRNA family was also identified by creating a composite of overlapping fragment sequences, which was randomly determined using tunneling current conducted by single-base molecules as they passed between a pair of nanoelectrodes.
Sequencing Computer-Assisted Learning of Transformations of Trigonometric Functions

ERIC Educational Resources Information Center

Ross, John A.; Bruce, Catherine D.; Sibbald, Timothy M.

2011-01-01

Studies incorporating technology into the teaching of trigonometry, although sparse, have demonstrated positive effects on student achievement. The optimal sequence for integrating technology with teacher-led mathematics instruction has not been determined. Our research investigated whether technology has a greater impact on student achievement…
The first FDA marketing authorizations of next-generation sequencing technology and tests: challenges, solutions and impact for future assays.

PubMed

Bijwaard, Karen; Dickey, Jennifer S; Kelm, Kellie; Težak, Živana

2015-01-01

The rapid emergence and clinical translation of novel high-throughput sequencing technologies created a need to clarify the regulatory pathway for the evaluation and authorization of these unique technologies. Recently, the US FDA authorized for marketing four next generation sequencing (NGS)-based diagnostic devices which consisted of two heritable disease-specific assays, library preparation reagents and a NGS platform that are intended for human germline targeted sequencing from whole blood. These first authorizations can serve as a case study in how different types of NGS-based technology are reviewed by the FDA. In this manuscript we describe challenges associated with the evaluation of these novel technologies and provide an overview of what was reviewed. Besides making validated NGS-based devices available for in vitro diagnostic use, these first authorizations create a regulatory path for similar future instruments and assays.
Assessing the performance of the Oxford Nanopore Technologies MinION

PubMed Central

Laver, T.; Harrison, J.; O’Neill, P.A.; Moore, K.; Farbos, A.; Paszkiewicz, K.; Studholme, D.J.

2015-01-01

The Oxford Nanopore Technologies (ONT) MinION is a new sequencing technology that potentially offers read lengths of tens of kilobases (kb) limited only by the length of DNA molecules presented to it. The device has a low capital cost, is by far the most portable DNA sequencer available, and can produce data in real-time. It has numerous prospective applications including improving genome sequence assemblies and resolution of repeat-rich regions. Before such a technology is widely adopted, it is important to assess its performance and limitations in respect of throughput and accuracy. In this study we assessed the performance of the MinION by re-sequencing three bacterial genomes, with very different nucleotide compositions ranging from 28.6% to 70.7%; the high G + C strain was underrepresented in the sequencing reads. We estimate the error rate of the MinION (after base calling) to be 38.2%. Mean and median read lengths were 2 kb and 1 kb respectively, while the longest single read was 98 kb. The whole length of a 5 kb rRNA operon was covered by a single read. As the first nanopore-based single molecule sequencer available to researchers, the MinION is an exciting prospect; however, the current error rate limits its ability to compete with existing sequencing technologies, though we do show that MinION sequence reads can enhance contiguity of de novo assembly when used in conjunction with Illumina MiSeq data. PMID:26753127
Identification of MicroRNA Targets of Capsicum spp. Using MiRTrans—a Trans-Omics Approach

PubMed Central

Zhang, Lu; Qin, Cheng; Mei, Junpu; Chen, Xiaocui; Wu, Zhiming; Luo, Xirong; Cheng, Jiaowen; Tang, Xiangqun; Hu, Kailin; Li, Shuai C.

2017-01-01

The microRNA (miRNA) can regulate the transcripts that are involved in eukaryotic cell proliferation, differentiation, and metabolism. Especially for plants, our understanding of miRNA targets is still limited. Early attempts at prediction based on sequence alignments have been plagued by enormous numbers of false positives. It is helpful to improve target prediction specificity by incorporating other data sources such as the dependency between miRNA and transcript expression or even transcripts cleaved by miRNA regulation, which are referred to as trans-omics data. In this paper, we developed MiRTrans (Prediction of MiRNA targets by Trans-omics data) to explore miRNA targets by incorporating miRNA sequencing, transcriptome sequencing, and degradome sequencing. MiRTrans consisted of three major steps. First, the target transcripts of miRNAs were predicted by scrutinizing their sequence characteristics and collected as an initial potential targets pool. Second, false positive targets were eliminated if the expression of miRNA and its targets were weakly correlated by lasso regression. Third, degradome sequencing was utilized to capture the miRNA targets by examining the cleaved transcripts that are regulated by miRNAs. Finally, the predicted targets from the second and third step were combined by Fisher's combination test. MiRTrans was applied to identify the miRNA targets for Capsicum spp. (i.e., pepper). It can generate more functional miRNA targets than sequence-based predictions by evaluating functional enrichment. MiRTrans identified 58 miRNA-transcript pairs with high confidence from 18 miRNA families conserved in eudicots. Most of these targets were transcription factors; this lent support to the role of miRNA as key regulator in pepper. To the best of our knowledge, this work is the first attempt to investigate the miRNA targets of pepper, as well as their regulatory networks.
Surprisingly, only a small proportion of miRNA-transcript pairs were shared between degradome sequencing and expression dependency predictions, suggesting that miRNA targets predicted by a single technology alone may be prone to report false negatives. PMID:28443105
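The abstract above states that predictions from the expression and degradome steps were combined by Fisher's combination test. As a minimal sketch of the textbook form of that test (not the MiRTrans implementation, whose per-step p-values are not described here), the Python below combines k independent p-values through the statistic -2 * sum(ln p), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The function name and the example p-values are assumptions for illustration.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    Illustrative sketch only. The statistic X = -2 * sum(ln p_i) is
    chi-square distributed with 2k degrees of freedom under the null,
    and for even degrees of freedom the survival function has the
    closed form exp(-X/2) * sum_{i<k} (X/2)**i / i!.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(pvalues)
    half = -sum(math.log(p) for p in pvalues)  # X / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Made-up p-values for one miRNA-transcript pair from two evidence sources.
print(fisher_combined_pvalue([0.05, 0.05]))
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the closed form.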
Pyrosequencing®-Based Identification of Low-Frequency Mutations Enriched Through Enhanced-ice-COLD-PCR.

PubMed

How-Kit, Alexandre; Tost, Jörg

2015-01-01

A number of molecular diagnostic assays have been developed in the last years for mutation detection. Although these methods have become increasingly sensitive, most of them are incompatible with a sequencing-based readout and require prior knowledge of the mutation present in the sample. Consequently, coamplification at low denaturation (COLD)-PCR-based methods have been developed and combine a high analytical sensitivity due to mutation enrichment in the sample with the identification of known or unknown mutations by downstream sequencing experiments. Among these methods, the recently developed Enhanced-ice-COLD-PCR appeared as the most powerful method as it outperformed the other COLD-PCR-based methods in terms of the mutation enrichment and due to the simplicity of the experimental setup of the assay. Indeed, E-ice-COLD-PCR is very versatile as it can be used on all types of PCR platforms and is applicable to different types of samples including fresh frozen, FFPE, and plasma samples. The technique relies on the incorporation of an LNA containing blocker probe in the PCR reaction followed by selective heteroduplex denaturation enabling amplification of the mutant allele while amplification of the wild-type allele is prevented. Combined with Pyrosequencing®, which is a very quantitative high-resolution sequencing technology, E-ice-COLD-PCR can detect and identify mutations with a limit of detection down to 0.01%.
Adrenal Insufficiency, Sex Reversal, and Angelman Syndrome due to Uniparental Disomy Unmasking a Mutation in CYP11A1.

PubMed

Kim, Ahlee; Fujimoto, Masanobu; Hwa, Vivian; Backeljauw, Philippe; Dauber, Andrew

2018-01-01

Cholesterol side-chain cleavage enzyme (P450scc) deficiency is a rare genetic disorder causing primary adrenal insufficiency with or without a 46,XY disorder of sexual development (DSD). Herein, we report a case of the combination of primary adrenal insufficiency, a DSD (testes with female external genitalia in a setting of a 47,XXY karyotype), and Angelman syndrome. Comprehensive genetic analyses were performed, including a single nucleotide polymorphism microarray and whole-exome sequencing. In vitro studies were performed to evaluate the pathogenicity of the novel mutation that was identified by whole-exome sequencing. The patient was found to have segmental uniparental disomy (UPD) of chromosome 15 explaining her diagnosis of Angelman syndrome. Whole-exome sequencing further revealed a novel homozygous intronic variant in CYP11A1, the gene encoding P450scc, found within the region of UPD. In vitro studies confirmed that this variant led to decreased efficiency of CYP11A1 splicing. We report the first case of the combination of 2 rare genetic disorders, Angelman syndrome, and P450scc deficiency. After 20 years of diagnostic efforts, significant advances in genetic diagnostic technology allowed us to determine that these 2 disorders originate from a unified genetic etiology, segmental UPD unmasking a novel recessive mutation in CYP11A1. © 2018 S. Karger AG, Basel.
A reference human genome dataset of the BGISEQ-500 sequencer.

PubMed

Huang, Jie; Liang, Xinming; Xuan, Yuankai; Geng, Chunyu; Li, Yuxiang; Lu, Haorong; Qu, Shoufang; Mei, Xianglin; Chen, Hongbo; Yu, Ting; Sun, Nan; Rao, Junhua; Wang, Jiahao; Zhang, Wenwei; Chen, Ying; Liao, Sha; Jiang, Hui; Liu, Xin; Yang, Zhaopeng; Mu, Feng; Gao, Shangxian

2017-05-01

BGISEQ-500 is a new desktop sequencer developed by BGI. Using DNA nanoball and combinational probe anchor synthesis developed from Complete Genomics™ sequencing technologies, it generates short reads at a large scale. Here, we present the first human whole-genome sequencing dataset of BGISEQ-500. The dataset was generated by sequencing the widely used cell line HG001 (NA12878) in two sequencing runs of paired-end 50 bp (PE50) and two sequencing runs of paired-end 100 bp (PE100). We also include examples of the raw images from the sequencer for reference. Finally, we identified variations using this dataset, estimated the accuracy of the variations, and compared to that of the variations identified from similar amounts of publicly available HiSeq2500 data. We found similar single nucleotide polymorphism (SNP) detection accuracy for the BGISEQ-500 PE100 data (false positive rate [FPR] = 0.00020%, sensitivity = 96.20%) compared to the PE150 HiSeq2500 data (FPR = 0.00017%, sensitivity = 96.60%), and better SNP detection accuracy than the PE50 data (FPR = 0.0006%, sensitivity = 94.15%). But for insertions and deletions (indels), we found lower accuracy for BGISEQ-500 data (FPR = 0.00069% and 0.00067% for PE100 and PE50 respectively, sensitivity = 88.52% and 70.93%) than the HiSeq2500 data (FPR = 0.00032%, sensitivity = 96.28%). Our dataset can serve as the reference dataset, providing basic information not just for future development, but also for all research and applications based on the new sequencing platform. © The Authors 2017. Published by Oxford University Press.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Lo, Chien -Chi; Chain, Patrick S. G.

Background: Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Results: Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. Conclusion: FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.
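The abstract above notes that read quality often declines as the read is extended and that poor quality data can be removed. The Python sketch below illustrates one generic way to do this, trimming low-quality bases from the 3' end of a read; it is not FaQCs's algorithm, and the function names, the Phred+33 encoding assumption and the threshold of 20 are all illustrative choices.

```python
def phred33_scores(quality_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(char) - 33 for char in quality_string]


def trim_3prime(sequence, quality_string, min_quality=20):
    """Drop trailing bases whose quality is below min_quality.

    Generic illustration only (not the FaQCs trimming algorithm).
    Returns the trimmed sequence and its matching quality string.
    """
    if len(sequence) != len(quality_string):
        raise ValueError("sequence and quality string differ in length")
    scores = phred33_scores(quality_string)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


# Made-up read: six high-quality bases followed by two poor ones.
print(trim_3prime("ACGTACGT", "IIIIII#!"))
```

A production tool would typically also handle 5' trimming, minimum length filters and paired reads, none of which are sketched here.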
Enhancing genome assemblies by integrating non-sequence based data

PubMed Central

2011-01-01

Introduction Many genome projects were underway before the advent of high-throughput sequencing and have thus been supported by a wealth of genome information from other technologies. Such information frequently takes the form of linkage and physical maps, both of which can provide a substantial amount of data useful in de novo sequencing projects. Furthermore, the recent abundance of genome resources enables the use of conserved synteny maps identified in related species to further enhance genome assemblies. Methods The tammar wallaby (Macropus eugenii) is a model marsupial mammal with a low coverage genome. However, we have access to extensive comparative maps containing over 14,000 markers constructed through the physical mapping of conserved loci, chromosome painting and comprehensive linkage maps. Using a custom Bioperl pipeline, information from the maps was aligned to assembled tammar wallaby contigs using BLAT. This data was used to construct pseudo paired-end libraries with intervals ranging from 5-10 MB. We then used Bambus (a program designed to scaffold eukaryotic genomes by ordering and orienting contigs through the use of paired-end data) to scaffold our libraries. To determine how map data compares to sequence based approaches to enhance assemblies, we repeated the experiment using a 0.5× coverage of unique reads from 4 KB and 8 KB Illumina paired-end libraries. Finally, we combined both the sequence and non-sequence-based data to determine how a combined approach could further enhance the quality of the low coverage de novo reconstruction of the tammar wallaby genome. Results Using the map data alone, we were able to order 2.2% of the initial contigs into scaffolds, and increase the N50 scaffold size to 39 KB (36 KB in the original assembly). Using only the 0.5× paired-end sequence based data, 53% of the initial contigs were assigned to scaffolds.
Combining both data sets resulted in a further 2% increase in the number of initial contigs integrated into a scaffold (55% total) but a 35% increase in N50 scaffold size over the use of sequence-based data alone. Conclusions We provide a relatively simple pipeline utilizing existing bioinformatics tools to integrate map data into a genome assembly which is available at http://www.mcb.uconn.edu/fac.php?name=paska. While the map data only contributed minimally to assigning the initial contigs to scaffolds in the new assembly, it greatly increased the N50 size. This process added structure to our low coverage assembly, greatly increasing its utility in further analyses. PMID:21554765
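The results above are expressed in terms of N50 scaffold size. As a minimal sketch of that metric under its usual definition (the length L such that scaffolds of length at least L together contain at least half of the total assembled bases), the function below computes N50 from a list of lengths. The function name and the example lengths are hypothetical and are not taken from the tammar wallaby assembly.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() requires at least one length")


# Hypothetical scaffold lengths in kilobases.
example = [80, 70, 50, 40, 30, 20]
print(n50(example))
```

Because N50 depends only on the length distribution, joining a small number of contigs into long scaffolds can raise it sharply even when few contigs are affected, which is consistent with the pattern the abstract reports for the map data.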
Enhancing genome assemblies by integrating non-sequence based data.

PubMed

Heider, Thomas N; Lindsay, James; Wang, Chenwei; O'Neill, Rachel J; Pask, Andrew J

2011-05-28

Many genome projects were underway before the advent of high-throughput sequencing and have thus been supported by a wealth of genome information from other technologies. Such information frequently takes the form of linkage and physical maps, both of which can provide a substantial amount of data useful in de novo sequencing projects. Furthermore, the recent abundance of genome resources enables the use of conserved synteny maps identified in related species to further enhance genome assemblies. The tammar wallaby (Macropus eugenii) is a model marsupial mammal with a low coverage genome. However, we have access to extensive comparative maps containing over 14,000 markers constructed through the physical mapping of conserved loci, chromosome painting and comprehensive linkage maps. Using a custom Bioperl pipeline, information from the maps was aligned to assembled tammar wallaby contigs using BLAT. This data was used to construct pseudo paired-end libraries with intervals ranging from 5-10 MB. We then used Bambus (a program designed to scaffold eukaryotic genomes by ordering and orienting contigs through the use of paired-end data) to scaffold our libraries. To determine how map data compares to sequence based approaches to enhance assemblies, we repeated the experiment using a 0.5× coverage of unique reads from 4 KB and 8 KB Illumina paired-end libraries. Finally, we combined both the sequence and non-sequence-based data to determine how a combined approach could further enhance the quality of the low coverage de novo reconstruction of the tammar wallaby genome. Using the map data alone, we were able to order 2.2% of the initial contigs into scaffolds, and increase the N50 scaffold size to 39 KB (36 KB in the original assembly). Using only the 0.5× paired-end sequence based data, 53% of the initial contigs were assigned to scaffolds.
Combining both data sets resulted in a further 2% increase in the number of initial contigs integrated into a scaffold (55% total) but a 35% increase in N50 scaffold size over the use of sequence-based data alone. We provide a relatively simple pipeline utilizing existing bioinformatics tools to integrate map data into a genome assembly which is available at http://www.mcb.uconn.edu/fac.php?name=paska. While the map data only contributed minimally to assigning the initial contigs to scaffolds in the new assembly, it greatly increased the N50 size. This process added structure to our low coverage assembly, greatly increasing its utility in further analyses.
Yellow fever vector live-virus vaccines: West Nile virus vaccine development.

PubMed

Arroyo, J; Miller, C A; Catalan, J; Monath, T P

2001-08-01

By combining molecular-biological techniques with our increased understanding of the effect of gene sequence modification on viral function, yellow fever 17D, a positive-strand RNA virus vaccine, has been manipulated to induce a protective immune response against viruses of the same family (e.g. Japanese encephalitis and dengue viruses). Triggered by the emergence of West Nile virus infections in the New World afflicting humans, horses and birds, the success of this recombinant technology has prompted the rapid development of a live-virus attenuated candidate vaccine against West Nile virus.
Inaugural Genomics Automation Congress and the coming deluge of sequencing data.

PubMed

Creighton, Chad J

2010-10-01

Presentations at Select Biosciences's first 'Genomics Automation Congress' (Boston, MA, USA) in 2010 focused on next-generation sequencing and the platforms and methodology around them. The meeting provided an overview of sequencing technologies, both new and emerging. Speakers shared their recent work on applying sequencing to profile cells for various levels of biomolecular complexity, including DNA sequences, DNA copy, DNA methylation, mRNA and microRNA. With sequencing time and costs continuing to drop dramatically, a virtual explosion of very large sequencing datasets is at hand, which will probably present challenges and opportunities for high-level data analysis and interpretation, as well as for information technology infrastructure.
Combining micelle-clay sorption to solar photo-Fenton processes for domestic wastewater treatment.

PubMed

Brienza, Monica; Nir, Shlomo; Plantard, Gael; Goetz, Vincent; Chiron, Serge

2018-06-08

A tertiary treatment of effluent from a biological domestic wastewater treatment plant was tested by combining filtration and solar photocatalysis. Adsorption was carried out by a sequence of two column filters, the first one filled with granular activated carbon (GAC) and the second one with granulated nano-composite of micelle-montmorillonite mixed with sand (20:100, w/w). The applied solar advanced oxidation process was homogeneous photo-Fenton photocatalysis using peroxymonosulfate (PMS) as oxidant agent. This combination of simple, robust, and low-cost technologies aimed to ensure water disinfection and emerging contaminants (ECs, mainly pharmaceuticals) removal. The filtration step showed good performances in removing dissolved organic matter and practically removing all bacteria such as Escherichia coli and Enterococcus faecalis from the secondary treated water. Solar advanced oxidation processes were efficient in elimination of trace levels of ECs. The final effluent presented an improved sanitary level with acceptable chemical and biological characteristics for irrigation.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

DOEpatents

Gardner, Shea N [San Leandro, CA; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA; Young, Jennifer A [Berkeley, CA; Clague, David S [Livermore, CA

2011-01-18

A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
The sequence of sequencers: The history of sequencing DNA.

PubMed

Heather, James M; Chain, Benjamin

2016-01-01

Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

Regularized rare variant enrichment analysis for case-control exome sequencing data.

PubMed

Larson, Nicholas B; Schaid, Daniel J

2014-02-01

Rare variants have recently garnered an immense amount of attention in genetic association analysis. However, unlike methods traditionally used for single marker analysis in GWAS, rare variant analysis often requires some method of aggregation, since single marker approaches are poorly powered for typical sequencing study sample sizes. Advancements in sequencing technologies have rendered next-generation sequencing platforms a realistic alternative to traditional genotyping arrays. Exome sequencing in particular not only provides base-level resolution of genetic coding regions, but also a natural paradigm for aggregation via genes and exons. Here, we propose the use of penalized regression in combination with variant aggregation measures to identify rare variant enrichment in exome sequencing data. In contrast to marginal gene-level testing, we simultaneously evaluate the effects of rare variants in multiple genes, focusing on gene-based least absolute shrinkage and selection operator (LASSO) and exon-based sparse group LASSO models. By using gene membership as a grouping variable, the sparse group LASSO can be used as a gene-centric analysis of rare variants while also providing a penalized approach toward identifying specific regions of interest. We apply extensive simulations to evaluate the performance of these approaches with respect to specificity and sensitivity, comparing these results to multiple competing marginal testing methods. Finally, we discuss our findings and outline future research. © 2013 WILEY PERIODICALS, INC.
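For orientation, the penalized models named in this abstract can be written out. The following is a commonly used formulation of the sparse group LASSO and is offered as an assumption about the general form, not as the authors' exact objective, whose weights and loss the abstract does not specify. Here \ell is the log-likelihood (logistic for case-control data), \beta^{(g)} is the coefficient sub-vector for group g (for example, the exons of one gene), p_g is the size of that group, and \lambda and \alpha are tuning parameters; all of these symbols are introduced here for illustration.

```latex
\hat{\beta} \;=\; \arg\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\,\ell(\beta_0, \beta)
  \;+\; (1-\alpha)\,\lambda \sum_{g=1}^{G} \sqrt{p_g}\,\bigl\lVert \beta^{(g)} \bigr\rVert_2
  \;+\; \alpha\,\lambda\,\lVert \beta \rVert_1 ,
  \qquad 0 \le \alpha \le 1 .
```

Setting \alpha = 1 recovers the ordinary LASSO, which selects individual predictors, while \alpha = 0 gives the group LASSO, which selects whole groups. Intermediate values select groups and also individual members within a selected group, which matches the abstract's description of a gene-centric analysis that can still point to specific regions of interest.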
Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species

PubMed Central

2013-01-01

Background The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. Results In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. Conclusions Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another. PMID:23870653
Next generation sequencing in clinical medicine: Challenges and lessons for pathology and biomedical informatics.

PubMed

Gullapalli, Rama R; Desai, Ketaki V; Santana-Santos, Lucas; Kant, Jeffrey A; Becich, Michael J

2012-01-01

The Human Genome Project (HGP) provided the initial draft of mankind's DNA sequence in 2001. The HGP was produced by 23 collaborating laboratories using Sanger sequencing of mapped regions as well as shotgun sequencing techniques in a process that occupied 13 years at a cost of ~$3 billion. Today, Next Generation Sequencing (NGS) techniques represent the next phase in the evolution of DNA sequencing technology at dramatically reduced cost compared to traditional Sanger sequencing. A single laboratory today can sequence the entire human genome in a few days for a few thousand dollars in reagents and staff time. Routine whole exome or even whole genome sequencing of clinical patients is well within the realm of affordability for many academic institutions across the country. This paper reviews current sequencing technology methods and upcoming advancements in sequencing technology as well as challenges associated with data generation, data manipulation and data storage. Implementation of routine NGS data in cancer genomics is discussed along with potential pitfalls in the interpretation of the NGS data. The overarching importance of bioinformatics in the clinical implementation of NGS is emphasized. We also review the issue of physician education, which is also an important consideration for the successful implementation of NGS in the clinical workplace. NGS technologies represent a golden opportunity for the next generation of pathologists to be at the leading edge of the personalized medicine approaches coming our way. Often under-emphasized issues of data access and control as well as potential ethical implications of whole genome NGS are also discussed. Despite some challenges, it's hard not to be optimistic about the future of personalized genome sequencing and its potential impact on patient care and the advancement of knowledge of human biology and disease in the near future.
Next generation sequencing in clinical medicine: Challenges and lessons for pathology and biomedical informatics

PubMed Central

Gullapalli, Rama R.; Desai, Ketaki V.; Santana-Santos, Lucas; Kant, Jeffrey A.; Becich, Michael J.

2012-01-01

The Human Genome Project (HGP) provided the initial draft of mankind's DNA sequence in 2001. The HGP was produced by 23 collaborating laboratories using Sanger sequencing of mapped regions as well as shotgun sequencing techniques in a process that occupied 13 years at a cost of ~$3 billion. Today, Next Generation Sequencing (NGS) techniques represent the next phase in the evolution of DNA sequencing technology at dramatically reduced cost compared to traditional Sanger sequencing. A single laboratory today can sequence the entire human genome in a few days for a few thousand dollars in reagents and staff time. Routine whole exome or even whole genome sequencing of clinical patients is well within the realm of affordability for many academic institutions across the country. This paper reviews current sequencing technology methods and upcoming advancements in sequencing technology as well as challenges associated with data generation, data manipulation and data storage. Implementation of routine NGS data in cancer genomics is discussed along with potential pitfalls in the interpretation of the NGS data. The overarching importance of bioinformatics in the clinical implementation of NGS is emphasized. We also review the issue of physician education, which is also an important consideration for the successful implementation of NGS in the clinical workplace. NGS technologies represent a golden opportunity for the next generation of pathologists to be at the leading edge of the personalized medicine approaches coming our way. Often under-emphasized issues of data access and control as well as potential ethical implications of whole genome NGS are also discussed. Despite some challenges, it's hard not to be optimistic about the future of personalized genome sequencing and its potential impact on patient care and the advancement of knowledge of human biology and disease in the near future. PMID:23248761
Whole genome sequencing of a begomovirus-resistant tomato inbred reveals introgressions from wild Solanum species

USDA-ARS's Scientific Manuscript database

The low cost of next generation sequencing (NGS) technology and the availability of a large number of well annotated plant genomes has made sequencing technology useful to breeding programs. With the published high quality tomato reference genome of the processing cultivar Heinz 1706, we can now uti...
Analysis of plant microbe interactions in the era of next generation sequencing technologies

PubMed Central

Knief, Claudia

2014-01-01

Next generation sequencing (NGS) technologies have impressively accelerated research in biological science in recent years by enabling the production of large volumes of sequence data at a drastically lower price per base, compared to traditional sequencing methods. The recent and ongoing developments in the field allow addressing research questions in plant-microbe biology that were not conceivable just a few years ago. The present review provides an overview of NGS technologies and their usefulness for the analysis of microorganisms that live in association with plants. Possible limitations of the different sequencing systems, in particular sources of errors and bias, are critically discussed and methods are disclosed that help to overcome these shortcomings. A focus will be on the application of NGS methods in metagenomic studies, including the analysis of microbial communities by amplicon sequencing, which can be considered as a targeted metagenomic approach. Different applications of NGS technologies are exemplified by selected research articles that address the biology of the plant associated microbiota to demonstrate the worth of the new methods. PMID:24904612
[Review of Second Generation Sequencing and Its Application in Forensic Genetics].

PubMed

Zhang, S H; Bian, Y N; Zhao, Q; Li, C T

2016-08-01

The rapid development of second generation sequencing (SGS) within the past few years has increased data throughput and read length while substantially bringing down the sequencing cost. This has produced new breakthroughs in biology and ushered forensic genetics into a new era. Based on the history of sequencing application in forensic genetics, this paper reviews the importance of sequencing technologies for genetic marker detection. The application status and potential of SGS in forensic genetics are discussed based on the already explored SGS platforms of Roche, Illumina and Life Technologies. With these platforms, DNA markers (SNP, STR), RNA markers (mRNA, microRNA) and whole mtDNA can be sequenced. However, development and validation of application kits, maturation of analysis software, connection to the existing databases and the possible ethical issues arising with big data will be the key factors that determine whether this technology can substitute or supplement PCR-CE, the mature technology, and be widely used for case detection. Copyright© by the Editorial Department of Journal of Forensic Medicine.
Nanopore-CMOS Interfaces for DNA Sequencing

PubMed Central

Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim

2016-01-01

DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered in by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces. PMID:27509529
Nanopore-CMOS Interfaces for DNA Sequencing.

PubMed

Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim

2016-08-06

DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered in by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces.
The history and advances of reversible terminators used in new generations of sequencing technology.

PubMed

Chen, Fei; Dong, Mengxing; Ge, Meng; Zhu, Lingxiang; Ren, Lufeng; Liu, Guocheng; Mu, Rong

2013-02-01

DNA sequencing using reversible terminators, as one sequencing by synthesis strategy, has garnered a great deal of interest due to its popular application in the second-generation high-throughput DNA sequencing technology. In this review, we describe the history of development, classification, and working mechanism of this technology. We also outline the screening strategies for DNA polymerases to accommodate the reversible terminators as substrates during polymerization; in particular, we introduce the "REAP" method developed by us. At the end of this review, we discuss current limitations of this approach and provide potential solutions to extend its application. Copyright © 2013. Production and hosting by Elsevier Ltd.
Future of breeding by genome editing is in the hands of regulators.

PubMed

Jones, Huw D

2015-01-01

We are witnessing the timely convergence of several technologies that together will have significant impact on research, human health and in animal and plant breeding. The exponential increase in genome and expressed sequence data, the ability to compile, analyze and mine these data via sophisticated bioinformatics procedures on high-powered computers, and developments in various molecular and in-vitro cellular techniques combine to underpin novel developments in research and commercial biotechnology. Arguably the most important of these is genome editing, which encompasses a suite of site directed nucleases (SDN) that can be designed to cut or otherwise modify predetermined DNA sequences in the genome and result in targeted insertions, deletions, or other changes for genetic improvement. It is a powerful and adaptive technology for animal and plant science, with huge relevance for plant and animal breeding. But this promise will be realized only if the regulatory oversight is proportionate to the potential hazards and has broad support from consumers, researchers and commercial interests. Despite significant progress in research and development and one genome edited crop close to commercialization, in most regions of the world it still remains unclear how or whether this fledgling technology will be regulated. The various risk management authorities and biotechnology regulators have a unique opportunity to set up a logical, appropriate and workable regulatory framework for gene editing that, unlike the situation for GMOs, would have broad support from stakeholders.
Touch HDR: photograph enhancement by user controlled wide dynamic range adaptation

NASA Astrophysics Data System (ADS)

Verrall, Steve; Siddiqui, Hasib; Atanassov, Kalin; Goma, Sergio; Ramachandra, Vikas

2013-03-01

High Dynamic Range (HDR) technology enables photographers to capture a greater range of tonal detail. HDR is typically used to bring out detail in a dark foreground object set against a bright background. HDR technologies include multi-frame HDR and single-frame HDR. Multi-frame HDR requires the combination of a sequence of images taken at different exposures. Single-frame HDR requires histogram equalization post-processing of a single image, a technique referred to as local tone mapping (LTM). Images generated using HDR technology can look less natural than their non-HDR counterparts. Sometimes it is only desired to enhance small regions of an original image. For example, it may be desired to enhance the tonal detail of one subject's face while preserving the original background. The Touch HDR technique described in this paper achieves these goals by enabling selective blending of HDR and non-HDR versions of the same image to create a hybrid image. The HDR version of the image can be generated by either multi-frame or single-frame HDR. Selective blending can be performed as a post-processing step, for example, as a feature of a photo editor application, at any time after the image has been captured. HDR and non-HDR blending is controlled by a weighting surface, which is configured by the user through a sequence of touches on a touchscreen.
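The selective blending described above can be sketched as a per-pixel weighted average, where the weighting surface holds a value between 0 (keep the original pixel) and 1 (use the HDR pixel). This is a minimal illustration that assumes a linear blend on single-channel images stored as nested lists; the abstract does not give the actual blending formula or how touches shape the weighting surface, and the function name and sample values are hypothetical.

```python
def blend(original, hdr, weights):
    """Per-pixel mix: w * hdr + (1 - w) * original, with each w in [0, 1]."""
    return [
        [w * h + (1.0 - w) * o for o, h, w in zip(o_row, h_row, w_row)]
        for o_row, h_row, w_row in zip(original, hdr, weights)
    ]


# Hypothetical 2x2 grayscale images with intensities in [0, 1].
original = [[0.2, 0.2], [0.2, 0.2]]
hdr = [[0.8, 0.8], [0.8, 0.8]]
# Weighting surface: 0 keeps the original, 1 takes the HDR pixel, 0.5 mixes them.
weights = [[0.0, 0.5], [1.0, 0.5]]

hybrid = blend(original, hdr, weights)
print(hybrid)
```

In a touch-driven editor, each touch would presumably raise the weights in a smooth neighborhood around the touched point so that the transition between HDR and non-HDR regions is not visible as a hard edge.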
The Impact of Normalization Methods on RNA-Seq Data Analysis

PubMed Central

Zyprych-Walczak, J.; Szabelska, A.; Handschuh, L.; Górczak, K.; Klamecka, K.; Figlerowicz, M.; Siatkowski, I.

2015-01-01

High-throughput sequencing technologies, such as the Illumina Hi-seq, are powerful new tools for investigating a wide range of biological and medical problems. Massive and complex data sets produced by the sequencers create a need for development of statistical and computational methods that can tackle the analysis and management of data. The data normalization is one of the most crucial steps of data processing and this process must be carefully considered as it has a profound effect on the results of the analysis. In this work, we focus on a comprehensive comparison of five normalization methods related to sequencing depth, widely used for transcriptome sequencing (RNA-seq) data, and their impact on the results of gene expression analysis. Based on this study, we suggest a universal workflow that can be applied for the selection of the optimal normalization procedure for any particular data set. The described workflow includes calculation of the bias and variance values for the control genes, sensitivity and specificity of the methods, and classification errors as well as generation of the diagnostic plots. Combining the above information facilitates the selection of the most appropriate normalization method for the studied data sets and determines which methods can be used interchangeably. PMID:26176014
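To make the idea of depth-related normalization concrete, the sketch below applies total-count scaling (counts per million), one of the simplest such methods. The abstract does not name the five methods it compares, so this is an illustrative assumption and not a reproduction of the study's workflow; the function name and the sample data are hypothetical.

```python
def counts_per_million(counts: dict[str, list[int]]) -> dict[str, list[float]]:
    """Scale each sample's per-gene read counts by its total sequencing depth."""
    normalized = {}
    for sample, gene_counts in counts.items():
        depth = sum(gene_counts)
        if depth == 0:
            raise ValueError(f"sample {sample!r} has no reads")
        normalized[sample] = [c * 1_000_000 / depth for c in gene_counts]
    return normalized


# Hypothetical counts for three genes; sample_b was sequenced twice as deeply.
raw = {"sample_a": [10, 30, 60], "sample_b": [20, 60, 120]}
cpm = counts_per_million(raw)
print(cpm)
```

After scaling, the two samples have identical profiles, showing that the raw difference was due to depth alone. Total-count scaling can be distorted by a few very highly expressed genes, which is one reason several alternative methods exist and need to be compared on a given data set.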
High-throughput discovery of mutations in tef semi-dwarfing genes by next-generation sequencing analysis.

PubMed

Zhu, Qihui; Smith, Shavannor M; Ayele, Mulu; Yang, Lixing; Jogi, Ansuya; Chaluvadi, Srinivasa R; Bennetzen, Jeffrey L

2012-11-01

Tef (Eragrostis tef) is a major cereal crop in Ethiopia. Lodging is the primary constraint to increasing productivity in this allotetraploid species, accounting for losses of ∼15-45% in yield each year. As a first step toward identifying semi-dwarf varieties that might have improved lodging resistance, an ∼6× fosmid library was constructed and used to identify both homeologues of the dw3 semi-dwarfing gene of Sorghum bicolor. An EMS mutagenized population, consisting of ∼21,210 tef plants, was planted and leaf materials were collected into 23 superpools. Two dwarfing candidate genes, homeologues of dw3 of sorghum and rht1 of wheat, were sequenced directly from each superpool with 454 technology, and 120 candidate mutations were identified. Out of 10 candidates tested, six independent mutations were validated by Sanger sequencing, including two predicted detrimental mutations in both dw3 homeologues with a potential to improve lodging resistance in tef through further breeding. This study demonstrates that high-throughput sequencing can identify potentially valuable mutations in under-studied plant species like tef and has provided mutant lines that can now be combined and tested in breeding programs for improved lodging resistance.
Using comparative genome analysis to identify problems in annotated microbial genomes.

PubMed

Poptsova, Maria S; Gogarten, J Peter

2010-07-01

Genome annotation is a tedious task that is mostly done by automated methods; however, the accuracy of these approaches has been questioned since the beginning of the sequencing era. Genome annotation is a multilevel process, and errors can emerge at different stages: during sequencing, as a result of gene-calling procedures, and in the process of assigning gene functions. Missed or wrongly annotated genes differentially impact different types of analyses. Here we discuss and demonstrate how the methods of comparative genome analysis can refine annotations by locating missing orthologues. We also discuss possible reasons for errors and show that the second-generation annotation systems, which combine multiple gene-calling programs with similarity-based methods, perform much better than the first annotation tools. Since old errors may propagate to the newly sequenced genomes, we emphasize that the problem of continuously updating popular public databases is an urgent and unresolved one. Due to the progress in genome-sequencing technologies, automated annotation techniques will remain the main approach in the future. Researchers need to be aware of the existing errors in the annotation of even well-studied genomes, such as Escherichia coli, and consider additional quality control for their results.
Multiplexed Sequence Encoding: A Framework for DNA Communication

PubMed Central

Zakeri, Bijan; Carr, Peter A.; Lu, Timothy K.

2016-01-01

Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication—data encoding, data transfer & data extraction—and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system—Multiplexed Sequence Encoding (MuSE)—that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA. PMID:27050646
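The encoding goal mentioned above, converting plaintext into DNA while avoiding homopolymers, can be illustrated with a short sketch. This is not the authors' iKeys design, which the abstract does not specify; it swaps in a generic rotating base-3 code in which each nucleotide is chosen from the three that differ from the previous one. The function names, the six-trits-per-byte layout, and the virtual starting nucleotide `prev="A"` are all assumptions made for illustration.

```python
NUCS = "ACGT"
TRITS_PER_BYTE = 6  # 3**6 = 729 >= 256, so six base-3 digits cover one byte


def byte_to_trits(value):
    """Return the six base-3 digits of a byte, most significant first."""
    trits = []
    for _ in range(TRITS_PER_BYTE):
        trits.append(value % 3)
        value //= 3
    return trits[::-1]


def encode(text, prev="A"):
    """Encode text as DNA with no two identical adjacent nucleotides."""
    dna = []
    for byte in text.encode("utf-8"):
        for trit in byte_to_trits(byte):
            # Only the three nucleotides that differ from the previous one are allowed.
            choices = [n for n in NUCS if n != prev]
            prev = choices[trit]
            dna.append(prev)
    return "".join(dna)


def decode(dna, prev="A"):
    """Invert encode(): recover base-3 digits, then bytes, then text."""
    trits = []
    for nuc in dna:
        choices = [n for n in NUCS if n != prev]
        trits.append(choices.index(nuc))
        prev = nuc
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        out.append(value)
    return out.decode("utf-8")
```

Because every nucleotide differs from its predecessor, the longest homopolymer run in the output has length one, at the cost of six nucleotides per byte.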
Mutation detection in the human HSP70B′ gene by denaturing high-performance liquid chromatography

PubMed Central

Hecker, Karl H.; Asea, Alexzander; Kobayashi, Kaoru; Green, Stacy; Tang, Dan; Calderwood, Stuart K.

2000-01-01

Variances, particularly single nucleotide polymorphisms (SNP), in the genomic sequence of individuals are the primary key to understanding gene function as it relates to differences in the susceptibility to disease, environmental influences, and therapy. In this report, the HSP70B′ gene is the target sequence for mutation detection in biopsy samples from human prostate cancer patients undergoing combined hyperthermia and radiation therapy at the Dana-Farber Cancer Institute, using temperature-modulated heteroduplex analysis (TMHA). The underlying principles of TMHA for mutation detection using DHPLC technology are discussed. The procedures involved in amplicon design for mutation analysis by DHPLC are detailed. The melting behavior of the complete coding sequence of the target gene is characterized using WAVEMAKER™ software. Four overlapping amplicons, which span the complete coding region of the HSP70B′ gene, amenable to mutation detection by DHPLC were identified based on the software-predicted melting profile of the target sequence. TMHA was performed on PCR products of individual amplicons of the HSP70B′ gene on the WAVE® Nucleic Acid Fragment Analysis System. The criteria for mutation calling by comparing wild-type and mutant chromatographic patterns are discussed. PMID:11189446
Mutation detection in the human HSP70B' gene by denaturing high-performance liquid chromatography.

PubMed

Hecker, K H; Asea, A; Kobayashi, K; Green, S; Tang, D; Calderwood, S K

2000-11-01

Variances, particularly single nucleotide polymorphisms (SNP), in the genomic sequence of individuals are the primary key to understanding gene function as it relates to differences in the susceptibility to disease, environmental influences, and therapy. In this report, the HSP70B' gene is the target sequence for mutation detection in biopsy samples from human prostate cancer patients undergoing combined hyperthermia and radiation therapy at the Dana-Farber Cancer Institute, using temperature-modulated heteroduplex analysis (TMHA). The underlying principles of TMHA for mutation detection using DHPLC technology are discussed. The procedures involved in amplicon design for mutation analysis by DHPLC are detailed. The melting behavior of the complete coding sequence of the target gene is characterized using WAVEMAKER software. Four overlapping amplicons, which span the complete coding region of the HSP70B' gene, amenable to mutation detection by DHPLC were identified based on the software-predicted melting profile of the target sequence. TMHA was performed on PCR products of individual amplicons of the HSP70B' gene on the WAVE Nucleic Acid Fragment Analysis System. The criteria for mutation calling by comparing wild-type and mutant chromatographic patterns are discussed.
Microbes, metagenomes and marine mammals: enabling the next generation of scientist to enter the genomic era

PubMed Central

2013-01-01

Background: The revolution in DNA sequencing technology continues unabated, and is affecting all aspects of the biological and medical sciences. The training and recruitment of the next generation of researchers who are able to use and exploit the new technology is severely lacking and potentially negatively influencing research and development efforts to advance genome biology. Here we present a cross-disciplinary course that provides undergraduate students with practical experience in running a next generation sequencing instrument through to the analysis and annotation of the generated DNA sequences. Results: Many labs across the world are installing next generation sequencing technology and we show that the undergraduate students produce quality sequence data and were excited to participate in cutting edge research. The students conducted the work flow from DNA extraction, library preparation, running the sequencing instrument, to the extraction and analysis of the data. They sequenced microbes, metagenomes, and a marine mammal, the Californian sea lion, Zalophus californianus. The students met sequencing quality controls, had no detectable contamination in the targeted DNA sequences, provided publication quality data, and became part of an international collaboration to investigate carcinomas in carnivores. Conclusions: Students learned important skills for their future education and career opportunities, and a perceived increase in students’ ability to conduct independent scientific research was measured. DNA sequencing is rapidly expanding in the life sciences. Teaching undergraduates to use the latest technology to sequence genomic DNA ensures they are ready to meet the challenges of the genomic era and allows them to participate in annotating the tree of life. PMID:24007365
The Genome Sequencer FLX System--longer reads, more applications, straight forward bioinformatics and more complete data sets.

PubMed

Droege, Marcus; Hill, Brendon

2008-08-31

The Genome Sequencer FLX System (GS FLX), powered by 454 Sequencing, is a next-generation DNA sequencing technology featuring a unique mix of long reads, exceptional accuracy, and ultra-high throughput. It has been proven to be the most versatile of all currently available next-generation sequencing technologies, supporting many high-profile studies in over seven applications categories. GS FLX users have pursued innovative research in de novo sequencing, re-sequencing of whole genomes and target DNA regions, metagenomics, and RNA analysis. 454 Sequencing is a powerful tool for human genetics research, having recently re-sequenced the genome of an individual human, currently re-sequencing the complete human exome and targeted genomic regions using the NimbleGen sequence capture process, and detected low-frequency somatic mutations linked to cancer.

Optical Communications Channel Combiner

NASA Technical Reports Server (NTRS)

Quirk, Kevin J.; Nguyen, Danh H.; Nguyen, Huy

2012-01-01

NASA has identified deep-space optical communications links as an integral part of a unified space communication network in order to provide data rates in excess of 100 Mb/s. The distances and limited power inherent in a deep-space optical downlink necessitate the use of photon-counting detectors and a power-efficient modulation such as pulse position modulation (PPM). For the output of each photodetector, whether from a separate telescope or a portion of the detection area, a communication receiver estimates a log-likelihood ratio for each PPM slot. To realize the full effective aperture of these receivers, their outputs must be combined prior to information decoding. A channel combiner was developed to synchronize the log-likelihood ratio (LLR) sequences of multiple receivers and then combine them into a single LLR sequence for information decoding. The channel combiner synchronizes the LLR sequences of up to three receivers and then combines these into a single LLR sequence for output. The channel combiner has three channel inputs, each of which takes as input a sequence of four-bit LLRs for each PPM slot in a codeword via a XAUI 10 Gb/s quad optical fiber interface. The cross-correlations between the channels' LLR time series are calculated and used to synchronize the sequences prior to combining. The output of the channel combiner is a sequence of four-bit LLRs for each PPM slot in a codeword via a XAUI 10 Gb/s quad optical fiber interface. The unit is controlled through a 1 Gb/s Ethernet UDP/IP interface. A deep-space optical communication link has not yet been demonstrated. This ground-station channel combiner was developed to demonstrate this capability and is unique in its ability to process such a signal.
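The synchronize-then-combine step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the ground-station implementation: the function names, the use of NumPy, the brute-force circular search over candidate lags, and the choice to combine by summing (which is appropriate when the receivers' observations are conditionally independent) are all assumptions of this sketch.

```python
import numpy as np


def estimate_lag(ref, other, max_lag):
    """Return the shift (in samples) by which `other` trails `ref`.

    The lag is the one that maximizes the cross-correlation of the two series.
    """
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        # A circular shift keeps the sketch short; a streaming design
        # would use a sliding window instead (assumption).
        score = float(np.dot(ref, np.roll(other, -lag)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def combine_llrs(channels, max_lag=16):
    """Align every channel to the first one, then sum the LLR sequences."""
    ref = np.asarray(channels[0], dtype=float)
    combined = ref.copy()
    for channel in channels[1:]:
        channel = np.asarray(channel, dtype=float)
        lag = estimate_lag(ref, channel, max_lag)
        combined += np.roll(channel, -lag)
    return combined
```

For example, `combine_llrs([rx0, rx1, rx2])` with three equal-length arrays would mirror the three-input arrangement in the abstract; quantization to four-bit LLRs is left out of the sketch.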
Sequencing to Station in 12 Months (Targeting Orbital 5 Launch, March 30th)

NASA Technical Reports Server (NTRS)

Smith, David J.; Burton, Aaron Steven

2015-01-01

The Biomolecule Sequencer is a Commercial Off-The-Shelf device developed by Oxford Nanopore Technologies and implements a method of DNA sequencing unlike any other current sequencers. The device measures changes in electrical current through a nanopore depending on the sequence of the DNA strand that is passing through it. Since the technology is built on nanometer-scale ion pores, the hardware itself is exceptionally small (3 x 1 x 58 inches), lightweight (less than 120 grams with USB cable), and powered only by a USB connection. The sequencing device is permanent, while the flow cells, to which the samples are added, are periodically replaced. The goal of our upcoming technology demonstration on ISS is to provide evidence that DNA sequencing in space is possible, which holds the exciting potential to enable the identification of microorganisms, monitor changes in microbes and humans in response to spaceflight, and possibly aid in the detection of DNA-based life elsewhere in the universe.
Next-generation sequencing in the clinic: promises and challenges.

PubMed

Xuan, Jiekun; Yu, Ying; Qing, Tao; Guo, Lei; Shi, Leming

2013-11-01

The advent of next generation sequencing (NGS) technologies has revolutionized the field of genomics, enabling fast and cost-effective generation of genome-scale sequence data with exquisite resolution and accuracy. Over the past years, rapid technological advances led by academic institutions and companies have continued to broaden NGS applications from research to the clinic. A recent crop of discoveries have highlighted the medical impact of NGS technologies on Mendelian and complex diseases, particularly cancer. However, the ever-increasing pace of NGS adoption presents enormous challenges in terms of data processing, storage, management and interpretation as well as sequencing quality control, which hinder the translation from sequence data into clinical practice. In this review, we first summarize the technical characteristics and performance of current NGS platforms. We further highlight advances in the applications of NGS technologies towards the development of clinical diagnostics and therapeutics. Common issues in NGS workflows are also discussed to guide the selection of NGS platforms and pipelines for specific research purposes. Published by Elsevier Ireland Ltd.
Technical-economical analysis of selected decentralized technologies for municipal wastewater treatment in the city of Rome.

PubMed

Gavasci, Renato; Chiavola, Agostina; Spizzirri, Massimo

2010-01-01

Several wastewater treatment technologies were evaluated as alternative systems to the more traditional centralized continuous flow system to serve decentralized areas of the city of Rome (Italy). For instance, the following technologies were selected: (1) Constructed wetlands, (2) Membrane Biological Reactor, (3) Deep Shaft, (4) Sequencing Batch Reactor, and (5) Combined Filtration and UV-disinfection. Such systems were distinguished based on the limits they are potentially capable of accomplishing on the effluent. Consequently, the SBR and DS were grouped together for their capability to comply with the standards for the discharge into surface waters (according to the Italian D.Lgs. 152/06, Table 1, All. 5), whereas the MBR and tertiary system (Filtration+UVc-disinfection) were considered together as they should be able to allow effluent discharge into soil (according to the Italian D.Lgs. 152/06, Table 4, All. 5) and/or reuse (according to the Italian D.M. 185/03). Both groups of technologies were evaluated in comparison with the more common continuous flow treatment sequence consisting of a biological activated sludge tank followed by the secondary settlement, with final chlorination. CWs were studied separately as a solution for decentralized urban areas with limited population. After the analysis of the main technical features, an economical estimate was carried out taking into account the investment, operation and maintenance costs as a function of the plant's capacity. The analysis was based on real data provided by the Company who manages the entire water system of the City of Rome (Acea Ato 2 S.p.A.). A preliminary design of the treatment plants using some of the selected technologies was finally carried out.
Genetically engineered livestock for agriculture: a generation after the first transgenic animal research conference.

PubMed

Murray, James D; Maga, Elizabeth A

2016-06-01

At the time of the first Transgenic Animal Research Conference, the lack of knowledge about promoter, enhancer and coding regions of genes of interest greatly hampered our efforts to create transgenes that would express appropriately in livestock. Additionally, we were limited to gene insertion by pronuclear microinjection. As predicted then, widespread genome sequencing efforts and technological advancements have profoundly altered what we can do. There have been many developments in technology to create transgenic animals since we first met at Granlibakken in 1997, including the advent of somatic cell nuclear transfer-based cloning and gene editing. We can now create new transgenes that will express when and where we want and can target precisely in the genome where we want to make a change or insert a transgene. With the large number of sequenced genomes, we have unprecedented access to sequence information including control regions, coding regions, and known allelic variants. These technological developments have ushered in new and renewed enthusiasm for the production of transgenic animals among scientists and animal agriculturalists around the world, both for the production of more relevant biomedical research models as well as for agricultural applications. However, even though great advancements have been made in our ability to control gene expression and target genetic changes in our animals, there still are no genetically engineered animal products on the market for food. Worldwide, there has been a failure of the regulatory processes to effectively move forward. Estimates suggest the world will need to increase our current food production 70% by 2050; that is, we will have to produce the total amount of food each year that has been consumed by mankind over the past 500 years. The combination of transgenic animal technology and gene editing will become increasingly important tools to help feed the world. However, to date the practical benefits of these technologies have not yet reached consumers in any country and in the absence of predictable, science-based regulatory programs it is unlikely that the benefits will be realized in the short to medium term.
Rapid evaluation and quality control of next generation sequencing data with FaQCs

DOE PAGES

Lo, Chien-Chi; Chain, Patrick S. G.

2014-12-01

Background: Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Results: Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. Conclusion: FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.
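Because read quality often declines toward the end of a read, a common quality-control step is to trim low-quality bases from the 3' end. The sketch below shows that general idea only; it is not FaQCs's algorithm or interface. The function names, the Phred+33 quality encoding, and the default thresholds are assumptions chosen for illustration.

```python
def phred33_scores(quality_string):
    """Decode a Phred+33 FASTQ quality string into integer scores."""
    return [ord(char) - 33 for char in quality_string]


def trim_3prime(sequence, quality_string, min_quality=20):
    """Drop trailing bases whose quality score is below `min_quality`."""
    scores = phred33_scores(quality_string)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


def passes_filter(sequence, min_length=50):
    """Keep a trimmed read only if enough bases remain (threshold is an assumption)."""
    return len(sequence) >= min_length
```

In Phred+33, the character `I` encodes a score of 40 and `#` a score of 2, so a read ending in `##` would lose its last two bases at the default threshold.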
A 454 multiplex sequencing method for rapid and reliable genotyping of highly polymorphic genes in large-scale studies.

PubMed

Galan, Maxime; Guivier, Emmanuel; Caraux, Gilles; Charbonnel, Nathalie; Cosson, Jean-François

2010-05-11

High-throughput sequencing technologies offer new perspectives for biomedical, agronomical and evolutionary research. Promising progress now concerns the application of these technologies to large-scale studies of genetic variation. Such studies require the genotyping of high numbers of samples. This is theoretically possible using 454 pyrosequencing, which generates billions of base pairs of sequence data. However, several challenges arise: first in the attribution of each read produced to its original sample, and second, in bioinformatic analyses to distinguish true from artifactual sequence variation. This pilot study proposes a new application for the 454 GS FLX platform, allowing the individual genotyping of thousands of samples in one run. A probabilistic model has been developed to demonstrate the reliability of this method. DNA amplicons from 1,710 rodent samples were individually barcoded using a combination of tags located in forward and reverse primers. Amplicons consisted of 222 bp fragments corresponding to DRB exon 2, a highly polymorphic gene in mammals. A total of 221,789 reads were obtained, of which 153,349 were finally assigned to original samples. Rules based on a probabilistic model and a four-step procedure were developed to validate sequences and provide a confidence level for each genotype. The method gave promising results, with the genotyping of DRB exon 2 sequences for 1,407 samples from 24 different rodent species and the sequencing of 392 variants in one half of a 454 run. Using replicates, we estimated that the reproducibility of genotyping reached 95%. This new approach is a promising alternative to classical methods involving electrophoresis-based techniques for variant separation and cloning-sequencing for sequence determination. The 454 system is less costly and less time-consuming and may enhance the reliability of genotypes obtained when high numbers of samples are studied. It opens up new perspectives for the study of evolutionary and functional genetics of highly polymorphic genes like major histocompatibility complex genes in vertebrates or loci regulating self-compatibility in plants. Important applications in biomedical research will include the detection of individual variation in disease susceptibility. Similarly, agronomy will benefit from this approach, through the study of genes implicated in productivity or disease susceptibility traits.
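The read-attribution step, assigning each read to its sample from the combination of forward and reverse tags, can be sketched as a dictionary lookup. This is a hypothetical illustration rather than the authors' pipeline: the function names and tag sequences are invented, tags are assumed to have a fixed length, to match exactly, and to be stored as they appear at the two ends of the read.

```python
def build_tag_table(samples):
    """Map each (forward_tag, reverse_tag) pair to its sample name.

    `samples` is a dict of sample name -> (forward_tag, reverse_tag).
    """
    return {tags: name for name, tags in samples.items()}


def assign_read(read, table, tag_length):
    """Return the sample a read belongs to, or None if the tag pair is unknown."""
    forward_tag = read[:tag_length]
    reverse_tag = read[-tag_length:]
    return table.get((forward_tag, reverse_tag))


def strip_tags(read, tag_length):
    """Remove both tags, leaving the amplicon sequence."""
    return read[tag_length:-tag_length]
```

Using tag pairs rather than single tags keeps the primer set small: n forward tags and m reverse tags can label up to n × m samples. Reads whose tag pair is not in the table are left unassigned, which is one way a fraction of reads can end up without a sample.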
Comparing memory-efficient genome assemblers on stand-alone and cloud infrastructures.

PubMed

Kleftogiannis, Dimitrios; Kalnis, Panos; Bajic, Vladimir B

2013-01-01

A fundamental problem in bioinformatics is genome assembly. Next-generation sequencing (NGS) technologies produce large volumes of fragmented genome reads, which require large amounts of memory to assemble the complete genome efficiently. With recent improvements in DNA sequencing technologies, it is expected that the memory footprint required for the assembly process will increase dramatically and will emerge as a limiting factor in processing widely available NGS-generated reads. In this report, we compare current memory-efficient techniques for genome assembly with respect to quality, memory consumption and execution time. Our experiments prove that it is possible to generate draft assemblies of reasonable quality on conventional multi-purpose computers with very limited available memory by choosing suitable assembly methods. Our study reveals the minimum memory requirements for different assembly programs even when data volume exceeds memory capacity by orders of magnitude. By combining existing methodologies, we propose two general assembly strategies that can improve short-read assembly approaches and result in reduction of the memory footprint. Finally, we discuss the possibility of utilizing cloud infrastructures for genome assembly and we comment on some findings regarding suitable computational resources for assembly.
Large scale analysis of the mutational landscape in HT-SELEX improves aptamer discovery

PubMed Central

Hoinka, Jan; Berezhnoy, Alexey; Dao, Phuong; Sauna, Zuben E.; Gilboa, Eli; Przytycka, Teresa M.

2015-01-01

High-Throughput (HT) SELEX combines SELEX (Systematic Evolution of Ligands by EXponential Enrichment), a method for aptamer discovery, with massively parallel sequencing technologies. This emerging technology provides data for a global analysis of the selection process and for simultaneous discovery of a large number of candidates but currently lacks dedicated computational approaches for their analysis. To close this gap, we developed novel in-silico methods to analyze HT-SELEX data and utilized them to study the emergence of polymerase errors during HT-SELEX. Rather than considering these errors as a nuisance, we demonstrated their utility for guiding aptamer discovery. Our approach builds on two main advancements in aptamer analysis: AptaMut—a novel technique allowing for the identification of polymerase errors conferring an improved binding affinity relative to the ‘parent’ sequence and AptaCluster—an aptamer clustering algorithm which is to our best knowledge, the only currently available tool capable of efficiently clustering entire aptamer pools. We applied these methods to an HT-SELEX experiment developing aptamers against Interleukin 10 receptor alpha chain (IL-10RA) and experimentally confirmed our predictions thus validating our computational methods. PMID:25870409
Modifications of allergenicity linked to food technologies.

PubMed

Moneret-Vautrin, D A

1998-01-01

The prevalence of food allergies (FA) has increased over the past fifteen years. The reasons suggested are changes in dietary behaviour and the evolution of food technologies. New cases of FA have been described with chayote, rambutan, arguta, pumpkin seeds, custard apple, and with mycoproteins from Fusarium.... Additives using food proteins are at high risk: caseinates, lysozyme, cochineal red, papain, alpha-amylase, lactase etc. Heating can reduce allergenicity or create neo-allergens, as well as storage, inducing the synthesis of allergenic stress or PR proteins. Aeroallergens (mites, moulds) contaminate foods and can induce allergic reactions. Involuntary contamination by peanut proteins on production lines is a problem which is not yet solved. Genetically modified plants are at risk of allergenicity, requiring methodological steps of investigations: the comparison of the amino-acid sequence of the transferred protein with the sequence of known allergens, the evaluation of thermo degradability and of the denaturation by pepsin and trypsin are required, as well as the study with sera from patients allergic to the plant producing the gene. The combination of enzymatic hydrolysis, heating, or the development of genetically modified plants may offer new alternatives towards hypoallergenic foods (57 references).
Future technologies for monitoring HIV drug resistance and cure.

PubMed

Parikh, Urvi M; McCormick, Kevin; van Zyl, Gert; Mellors, John W

2017-03-01

Sensitive, scalable and affordable assays are critically needed for monitoring the success of interventions for preventing, treating and attempting to cure HIV infection. This review evaluates current and emerging technologies that are applicable for both surveillance of HIV drug resistance (HIVDR) and characterization of HIV reservoirs that persist despite antiretroviral therapy and are obstacles to curing HIV infection. Next-generation sequencing (NGS) has the potential to be adapted into high-throughput, cost-efficient approaches for HIVDR surveillance and monitoring during continued scale-up of antiretroviral therapy and rollout of preexposure prophylaxis. Similarly, improvements in PCR and NGS are resulting in higher throughput single genome sequencing to detect intact proviruses and to characterize HIV integration sites and clonal expansions of infected cells. Current population genotyping methods for resistance monitoring are high cost and low throughput. NGS, combined with simpler sample collection and storage matrices (e.g. dried blood spots), has considerable potential to broaden global surveillance and patient monitoring for HIVDR. Recent adaptions of NGS to identify integration sites of HIV in the human genome and to characterize the integrated HIV proviruses are likely to facilitate investigations of the impact of experimental 'curative' interventions on HIV reservoirs.
The history of MR imaging as seen through the pages of Radiology.

PubMed

Edelman, Robert R

2014-11-01

The first reports in Radiology pertaining to magnetic resonance (MR) imaging were published in 1980, 7 years after Paul Lauterbur pioneered the first MR images and 9 years after the first human computed tomographic images were obtained. Historical advances in the research and clinical applications of MR imaging very much parallel the remarkable advances in MR imaging technology. These advances can be roughly classified into hardware (eg, magnets, gradients, radiofrequency [RF] coils, RF transmitter and receiver, MR imaging-compatible biopsy devices) and imaging techniques (eg, pulse sequences, parallel imaging, and so forth). Image quality has been dramatically improved with the introduction of high-field-strength superconducting magnets, digital RF systems, and phased-array coils. Hybrid systems, such as MR/positron emission tomography (PET), combine the superb anatomic and functional imaging capabilities of MR imaging with the unsurpassed capability of PET to demonstrate tissue metabolism. Supported by the improvements in hardware, advances in pulse sequence design and image reconstruction techniques have spurred dramatic improvements in imaging speed and the capability for studying tissue function. In this historical review, the history of MR imaging technology and developing research and clinical applications, as seen through the pages of Radiology, will be considered.
Next generation sequencing technology: a powerful tool for the genome characterization of sugarcane mosaic virus from Sorghum almum

USDA-ARS's Scientific Manuscript database

Next generation sequencing (NGS) technology was used to analyze the occurrence of viruses in Sorghum almum plants in Florida exhibiting mosaic symptoms. Total RNA was extracted from symptomatic leaves and used as a template for cDNA library preparation. The resulting library was sequenced on an Illu...
Whole-genome sequencing for comparative genomics and de novo genome assembly.

PubMed

Benjak, Andrej; Sala, Claudia; Hartkoorn, Ruben C

2015-01-01

Next-generation sequencing technologies for whole-genome sequencing of mycobacteria are rapidly becoming an attractive alternative to more traditional sequencing methods. In particular this technology is proving useful for genome-wide identification of mutations in mycobacteria (comparative genomics) as well as for de novo assembly of whole genomes. Next-generation sequencing however generates a vast quantity of data that can only be transformed into a usable and comprehensible form using bioinformatics. Here we describe the methodology one would use to prepare libraries for whole-genome sequencing, and the basic bioinformatics to identify mutations in a genome following Illumina HiSeq or MiSeq sequencing, as well as de novo genome assembly following sequencing using Pacific Biosciences (PacBio).
Biomolecule Sequencer: Next-Generation DNA Sequencing Technology for In-Flight Environmental Monitoring, Research, and Beyond

NASA Technical Reports Server (NTRS)

Smith, David J.; Burton, Aaron; Castro-Wallace, Sarah; John, Kristen; Stahl, Sarah E.; Dworkin, Jason Peter; Lupisella, Mark L.

2016-01-01

On the International Space Station (ISS), technologies capable of rapid microbial identification and disease diagnostics are not currently available. NASA still relies upon sample return for comprehensive, molecular-based sample characterization. Next-generation DNA sequencing is a powerful approach for identifying microorganisms in air, water, and surfaces onboard spacecraft. The Biomolecule Sequencer payload, manifested to SpaceX-9 and scheduled on the Increment 4748 research plan (June 2016), will assess the functionality of a commercially-available next-generation DNA sequencer in the microgravity environment of ISS. The MinION device from Oxford Nanopore Technologies (Oxford, UK) measures picoamp changes in electrical current dependent on nucleotide sequences of the DNA strand migrating through nanopores in the system. The hardware is exceptionally small (9.5 x 3.2 x 1.6 cm), lightweight (120 grams), and powered only by a USB connection. For the ISS technology demonstration, the Biomolecule Sequencer will be powered by a Microsoft Surface Pro3. Ground-prepared samples containing lambda bacteriophage, Escherichia coli, and mouse genomic DNA, will be launched and stored frozen on the ISS until experiment initiation. Immediately prior to sequencing, a crew member will collect and thaw frozen DNA samples, connect the sequencer to the Surface Pro3, inject thawed samples into a MinION flow cell, and initiate sequencing. At the completion of the sequencing run, data will be downlinked for ground analysis. Identical, synchronous ground controls will be used for data comparisons to determine sequencer functionality, run-time sequence, current dynamics, and overall accuracy. 
We will present our latest results from the ISS flight experiment, the first time DNA has ever been sequenced in space, and discuss the many potential applications of the Biomolecule Sequencer for environmental monitoring, medical diagnostics, higher fidelity and more adaptable Space Biology Human Research Program investigations, and even life detection experiments for astrobiology missions.
Deep sequencing of evolving pathogen populations: applications, errors, and bioinformatic solutions

PubMed Central

2014-01-01

Deep sequencing harnesses the high throughput nature of next generation sequencing technologies to generate population samples, treating information contained in individual reads as meaningful. Here, we review applications of deep sequencing to pathogen evolution. Pioneering deep sequencing studies from the virology literature are discussed, such as whole genome Roche-454 sequencing analyses of the dynamics of the rapidly mutating pathogens hepatitis C virus and HIV. Extension of the deep sequencing approach to bacterial populations is then discussed, including the impacts of emerging sequencing technologies. While it is clear that deep sequencing has unprecedented potential for assessing the genetic structure and evolutionary history of pathogen populations, bioinformatic challenges remain. We summarise current approaches to overcoming these challenges, in particular methods for detecting low frequency variants in the context of sequencing error and reconstructing individual haplotypes from short reads. PMID:24428920
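To illustrate the problem of detecting low-frequency variants against a background of sequencing error, the following is a minimal sketch, not a method taken from the review. It assumes that errors at a site are independent and occur at a single known per-base rate, and it asks how surprising the observed number of variant reads would be if every one of them were an error (a one-sided binomial test). The function names, the default error rate of 0.01, and the threshold `alpha` are hypothetical choices for illustration only.

```python
from math import exp, lgamma, log


def _log_binom_pmf(k: int, n: int, p: float) -> float:
    """Log of the Binomial(n, p) probability mass at k, for 0 < p < 1."""
    return (
        lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        + k * log(p) + (n - k) * log(1 - p)
    )


def binomial_tail(depth: int, alt_reads: int, error_rate: float) -> float:
    """P(X >= alt_reads) when X ~ Binomial(depth, error_rate).

    This is the chance of seeing at least `alt_reads` variant reads at a
    site covered by `depth` reads if sequencing error alone produced them.
    """
    if alt_reads <= 0:
        return 1.0
    tail = sum(
        exp(_log_binom_pmf(k, depth, error_rate))
        for k in range(alt_reads, depth + 1)
    )
    return min(1.0, tail)


def is_candidate_variant(
    depth: int,
    alt_reads: int,
    error_rate: float = 0.01,  # assumed per-base error rate (illustrative)
    alpha: float = 1e-3,       # assumed significance threshold (illustrative)
) -> bool:
    """Flag a site whose variant-read count is unlikely under error alone."""
    return binomial_tail(depth, alt_reads, error_rate) < alpha
```

Under these assumptions, a site with 30 variant reads out of 1000 would be flagged, whereas 12 out of 1000 would not, because about 10 erroneous reads are expected at a 1% error rate. Real pipelines typically also account for base quality, strand bias, and multiple testing across sites, which this sketch ignores.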
Evaluation of second-generation sequencing of 19 dilated cardiomyopathy genes for clinical applications.

PubMed

Gowrisankar, Sivakumar; Lerner-Ellis, Jordan P; Cox, Stephanie; White, Emily T; Manion, Megan; LeVan, Kevin; Liu, Jonathan; Farwell, Lisa M; Iartchouk, Oleg; Rehm, Heidi L; Funke, Birgit H

2010-11-01

Medical sequencing for diseases with locus and allelic heterogeneities has been limited by the high cost and low throughput of traditional sequencing technologies. "Second-generation" sequencing (SGS) technologies allow the parallel processing of a large number of genes and, therefore, offer great promise for medical sequencing; however, their use in clinical laboratories is still in its infancy. Our laboratory offers clinical resequencing for dilated cardiomyopathy (DCM) using an array-based platform that interrogates 19 of more than 30 genes known to cause DCM. We explored both the feasibility and cost effectiveness of using PCR amplification followed by SGS technology for sequencing these 19 genes in a set of five samples enriched for known sequence alterations (109 unique substitutions and 27 insertions and deletions). While the analytical sensitivity for substitutions was comparable to that of the DCM array (98%), SGS technology performed better than the DCM array for insertions and deletions (90.6% versus 58%). Overall, SGS performed substantially better than did the current array-based testing platform; however, the operational cost and projected turnaround time do not meet our current standards. Therefore, efficient capture methods and/or sample pooling strategies that shorten the turnaround time and decrease reagent and labor costs are needed before implementing this platform into routine clinical applications.
Newborn Sequencing in Genomic Medicine and Public Health

PubMed Central

Agrawal, Pankaj B.; Bailey, Donald B.; Beggs, Alan H.; Brenner, Steven E.; Brower, Amy M.; Cakici, Julie A.; Ceyhan-Birsoy, Ozge; Chan, Kee; Chen, Flavia; Currier, Robert J.; Dukhovny, Dmitry; Green, Robert C.; Harris-Wai, Julie; Holm, Ingrid A.; Iglesias, Brenda; Joseph, Galen; Kingsmore, Stephen F.; Koenig, Barbara A.; Kwok, Pui-Yan; Lantos, John; Leeder, Steven J.; Lewis, Megan A.; McGuire, Amy L.; Milko, Laura V.; Mooney, Sean D.; Parad, Richard B.; Pereira, Stacey; Petrikin, Joshua; Powell, Bradford C.; Powell, Cynthia M.; Puck, Jennifer M.; Rehm, Heidi L.; Risch, Neil; Roche, Myra; Shieh, Joseph T.; Veeraraghavan, Narayanan; Watson, Michael S.; Willig, Laurel; Yu, Timothy W.; Urv, Tiina; Wise, Anastasia L.

2017-01-01

The rapid development of genomic sequencing technologies has decreased the cost of genetic analysis to the extent that it seems plausible that genome-scale sequencing could have widespread availability in pediatric care. Genomic sequencing provides a powerful diagnostic modality for patients who manifest symptoms of monogenic disease and an opportunity to detect health conditions before their development. However, many technical, clinical, ethical, and societal challenges should be addressed before such technology is widely deployed in pediatric practice. This article provides an overview of the Newborn Sequencing in Genomic Medicine and Public Health Consortium, which is investigating the application of genome-scale sequencing in newborns for both diagnosis and screening. PMID:28096516
Human genetics and genomics a decade after the release of the draft sequence of the human genome.

PubMed

Naidoo, Nasheen; Pawitan, Yudi; Soong, Richie; Cooper, David N; Ku, Chee-Seng

2011-10-01

Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade.
Human genetics and genomics a decade after the release of the draft sequence of the human genome

PubMed Central

2011-01-01

Substantial progress has been made in human genetics and genomics research over the past ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments have been made in cataloguing genetic variation, the International HapMap Project, and with respect to advances in genotyping technologies. These developments are vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. The high-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade. PMID:22155605

Progress in ion torrent semiconductor chip based sequencing.

PubMed

Merriman, Barry; Rothberg, Jonathan M

2012-12-01

In order for next-generation sequencing to become widely used as a diagnostic in the healthcare industry, sequencing instrumentation will need to be mass produced with a high degree of quality and economy. One way to achieve this is to recast DNA sequencing in a format that fully leverages the manufacturing base created for computer chips, complementary metal-oxide semiconductor chip fabrication, which is the current pinnacle of large scale, high quality, low-cost manufacturing of high technology. To achieve this, ideally the entire sensory apparatus of the sequencer would be embodied in a standard semiconductor chip, manufactured in the same fab facilities used for logic and memory chips. Recently, such a sequencing chip, and the associated sequencing platform, has been developed and commercialized by Ion Torrent, a division of Life Technologies, Inc. Here we provide an overview of this semiconductor chip based sequencing technology, and summarize the progress made since its commercial introduction. We describe in detail the progress in chip scaling, sequencing throughput, read length, and accuracy. We also summarize the enhancements in the associated platform, including sample preparation, data processing, and engagement of the broader development community through open source and crowdsourcing initiatives. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Individual sequences in large sets of gene sequences may be distinguished efficiently by combinations of shared sub-sequences

PubMed Central

Gibbs, Mark J; Armstrong, John S; Gibbs, Adrian J

2005-01-01

Background: Most current DNA diagnostic tests for identifying organisms use specific oligonucleotide probes that are complementary in sequence to, and hence only hybridise with, the DNA of one target species. By contrast, in traditional taxonomy, specimens are usually identified by 'dichotomous keys' that use combinations of characters shared by different members of the target set. Using one specific character for each target is the least efficient strategy for identification. Using combinations of shared bisectionally-distributed characters is much more efficient, and this strategy is most efficient when they separate the targets in a progressively binary way. Results: We have developed a practical method for finding minimal sets of sub-sequences that identify individual sequences, and could be targeted by combinations of probes, so that the efficient strategy of traditional taxonomic identification could be used in DNA diagnosis. The sizes of minimal sub-sequence sets depended mostly on sequence diversity and sub-sequence length and interactions between these parameters. We found that 201 distinct cytochrome oxidase subunit-1 (CO1) genes from moths (Lepidoptera) were distinguished using only 15 sub-sequences 20 nucleotides long, whereas only 8–10 sub-sequences 6–10 nucleotides long were required to distinguish the CO1 genes of 92 species from the 9 largest orders of insects. Conclusion: The presence/absence of sub-sequences in a set of gene sequences can be used like the questions in a traditional dichotomous taxonomic key; hybridisation probes complementary to such sub-sequences should provide a very efficient means for identifying individual species, subtypes or genotypes. Sequence diversity and sub-sequence length are the major factors that determine the numbers of distinguishing sub-sequences in any set of sequences. PMID:15817134
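The idea of identifying sequences by the presence or absence of shared sub-sequences can be sketched in a few lines. The code below is an illustrative greedy heuristic and not the authors' published method: it repeatedly picks the sub-sequence of length k that separates the most still-confused pairs of sequences, until every sequence has a unique presence/absence signature. A greedy choice does not guarantee a minimal set; note that any binary scheme needs at least ceil(log2 n) probes for n sequences (8 for 201 sequences). All function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def kmers(seq: str, k: int) -> set[str]:
    """All distinct sub-sequences of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_distinguishing_kmers(seqs: dict[str, str], k: int) -> list[str]:
    """Greedily choose k-mers whose presence/absence separates all sequences."""
    present = {name: kmers(s, k) for name, s in seqs.items()}
    candidates = sorted(set().union(*present.values()))
    # Pairs of sequences that no chosen k-mer has separated yet.
    unresolved = set(combinations(sorted(seqs), 2))
    chosen: list[str] = []

    while unresolved:
        def gain(kmer: str) -> int:
            # Number of unresolved pairs in which exactly one member has kmer.
            return sum(
                (kmer in present[a]) != (kmer in present[b])
                for a, b in unresolved
            )

        best = max(candidates, key=gain, default=None)
        if best is None or gain(best) == 0:
            raise ValueError("some sequences have identical k-mer content")
        chosen.append(best)
        unresolved = {
            (a, b) for a, b in unresolved
            if (best in present[a]) == (best in present[b])
        }
    return chosen


def signature(seq: str, probes: list[str]) -> tuple[bool, ...]:
    """Presence/absence pattern of the chosen probes in one sequence."""
    return tuple(p in seq for p in probes)


# Toy example with made-up sequences (not data from the paper).
toy = {"a": "AAAC", "b": "AACG", "c": "CGTT", "d": "GTTA"}
probes = greedy_distinguishing_kmers(toy, k=2)
signatures = {name: signature(s, probes) for name, s in toy.items()}
```

Each selected probe splits at least one group of sequences that were still indistinguishable, so the heuristic never returns more than n - 1 probes for n sequences.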
Testing of the NASA Hypersonics Project Combined Cycle Engine Large Scale Inlet Mode Transition Experiment (CCE LIMX)

NASA Technical Reports Server (NTRS)

Saunders, J. D.; Stueber, T. J.; Thomas, S. R.; Suder, K. L.; Weir, L. J.; Sanders, B. W.

2012-01-01

The status of an effort to develop Turbine Based Combined Cycle (TBCC) propulsion is described. This propulsion technology can enable reliable and reusable space launch systems. TBCC propulsion offers improved performance and safety over rocket propulsion. The potential to realize aircraft-like operations and reduced maintenance are additional benefits. Among the most critical TBCC enabling technologies are: 1) mode transition from turbine to scramjet propulsion, 2) high Mach turbine engines and 3) TBCC integration. To address these TBCC challenges, the effort is centered on a propulsion mode transition experiment and includes analytical research. The test program, the Combined-Cycle Engine Large Scale Inlet Mode Transition Experiment (CCE LIMX), was conceived to integrate TBCC propulsion with proposed hypersonic vehicles. The goals address: (1) dual inlet operability and performance, (2) mode-transition sequences enabling a switch between turbine and scramjet flow paths, and (3) turbine engine transients during transition. Four test phases are planned from which a database can be used to both validate design and analysis codes and characterize operability and integration issues for TBCC propulsion. In this paper we discuss the research objectives, features of the CCE hardware and test plans, and status of the parametric inlet characterization testing which began in 2011. This effort is sponsored by the NASA Fundamental Aeronautics Hypersonics project.
A new generation of cancer genome diagnostics for routine clinical use: overcoming the roadblocks to personalized cancer medicine.

PubMed

Heuckmann, J M; Thomas, R K

2015-09-01

The identification of 'druggable' kinase gene alterations has revolutionized cancer treatment in the last decade by providing new, successfully targetable drug targets. Thus, genotyping tumors to match the right patients with the right drugs has become a clinical routine. Today, advances in sequencing technology and computational genome analyses enable the discovery of a constantly growing number of genome alterations relevant for clinical decision making. As a consequence, several technological approaches have emerged in order to deal with these rapidly increasing demands for clinical cancer genome analyses. Here, we describe challenges on the path to the broad introduction of diagnostic cancer genome analyses and the technologies that can be applied to overcome them. We define three generations of molecular diagnostics that are in clinical use. The latest generation of these approaches involves deep, and thus highly sensitive, sequencing of all therapeutically relevant types of genome alterations (mutations, copy number alterations and rearrangements/fusions) in a single assay. Such approaches therefore have substantial advantages (less time and less tissue required) over PCR-based methods that typically have to be combined with fluorescence in situ hybridization for detection of gene amplifications and fusions. Since these new technologies work reliably on routine diagnostic formalin-fixed, paraffin-embedded specimens, they can help expedite the broad introduction of personalized cancer therapy into the clinic by providing comprehensive, sensitive and accurate cancer genome diagnoses in 'real-time'. © The Author 2015. Published by Oxford University Press on behalf of the European Society for Medical Oncology. All rights reserved. For permissions, please email: journals.permissions@oup.com.
One chromosome, one contig: complete microbial genomes from long-read sequencing and assembly.

PubMed

Koren, Sergey; Phillippy, Adam M

2015-02-01

Like a jigsaw puzzle with large pieces, a genome sequenced with long reads is easier to assemble. However, recent sequencing technologies have favored lowering per-base cost at the expense of read length. This has dramatically reduced sequencing cost, but resulted in fragmented assemblies, which negatively affect downstream analyses and hinder the creation of finished (gapless, high-quality) genomes. In contrast, emerging long-read sequencing technologies can now produce reads tens of kilobases in length, enabling the automated finishing of microbial genomes for under $1000. This promises to improve the quality of reference databases and facilitate new studies of chromosomal structure and variation. We present an overview of these new technologies and the methods used to assemble long reads into complete genomes. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.
New CRISPR-Cas systems from uncultivated microbes

NASA Astrophysics Data System (ADS)

Burstein, David; Harrington, Lucas B.; Strutt, Steven C.; Probst, Alexander J.; Anantharaman, Karthik; Thomas, Brian C.; Doudna, Jennifer A.; Banfield, Jillian F.

2017-02-01

CRISPR-Cas systems provide microbes with adaptive immunity by employing short DNA sequences, termed spacers, that guide Cas proteins to cleave foreign DNA. Class 2 CRISPR-Cas systems are streamlined versions, in which a single RNA-bound Cas protein recognizes and cleaves target sequences. The programmable nature of these minimal systems has enabled researchers to repurpose them into a versatile technology that is broadly revolutionizing biological and clinical research. However, current CRISPR-Cas technologies are based solely on systems from isolated bacteria, leaving the vast majority of enzymes from organisms that have not been cultured untapped. Metagenomics, the sequencing of DNA extracted directly from natural microbial communities, provides access to the genetic material of a huge array of uncultivated organisms. Here, using genome-resolved metagenomics, we identify a number of CRISPR-Cas systems, including the first reported Cas9 in the archaeal domain of life, to our knowledge. This divergent Cas9 protein was found in little-studied nanoarchaea as part of an active CRISPR-Cas system. In bacteria, we discovered two previously unknown systems, CRISPR-CasX and CRISPR-CasY, which are among the most compact systems yet discovered. Notably, all required functional components were identified by metagenomics, enabling validation of robust in vivo RNA-guided DNA interference activity in Escherichia coli. Interrogation of environmental microbial communities combined with in vivo experiments allows us to access an unprecedented diversity of genomes, the content of which will expand the repertoire of microbe-based biotechnologies.
New CRISPR–Cas systems from uncultivated microbes

DOE PAGES

Burstein, David; Harrington, Lucas B.; Strutt, Steven C.; ...

2016-12-22

We present that CRISPR-Cas systems provide microbes with adaptive immunity by employing short DNA sequences, termed spacers, that guide Cas proteins to cleave foreign DNA. Class 2 CRISPR-Cas systems are streamlined versions, in which a single RNA-bound Cas protein recognizes and cleaves target sequences. The programmable nature of these minimal systems has enabled researchers to repurpose them into a versatile technology that is broadly revolutionizing biological and clinical research. However, current CRISPR-Cas technologies are based solely on systems from isolated bacteria, leaving the vast majority of enzymes from organisms that have not been cultured untapped. Metagenomics, the sequencing of DNA extracted directly from natural microbial communities, provides access to the genetic material of a huge array of uncultivated organisms. Here, using genome-resolved metagenomics, we identify a number of CRISPR-Cas systems, including the first reported Cas9 in the archaeal domain of life, to our knowledge. This divergent Cas9 protein was found in little-studied nanoarchaea as part of an active CRISPR-Cas system. In bacteria, we discovered two previously unknown systems, CRISPR-CasX and CRISPR-CasY, which are among the most compact systems yet discovered. Notably, all required functional components were identified by metagenomics, enabling validation of robust in vivo RNA-guided DNA interference activity in Escherichia coli. Lastly, interrogation of environmental microbial communities combined with in vivo experiments allows us to access an unprecedented diversity of genomes, the content of which will expand the repertoire of microbe-based biotechnologies.
New CRISPR–Cas systems from uncultivated microbes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Burstein, David; Harrington, Lucas B.; Strutt, Steven C.

We present that CRISPR-Cas systems provide microbes with adaptive immunity by employing short DNA sequences, termed spacers, that guide Cas proteins to cleave foreign DNA. Class 2 CRISPR-Cas systems are streamlined versions, in which a single RNA-bound Cas protein recognizes and cleaves target sequences. The programmable nature of these minimal systems has enabled researchers to repurpose them into a versatile technology that is broadly revolutionizing biological and clinical research. However, current CRISPR-Cas technologies are based solely on systems from isolated bacteria, leaving the vast majority of enzymes from organisms that have not been cultured untapped. Metagenomics, the sequencing of DNA extracted directly from natural microbial communities, provides access to the genetic material of a huge array of uncultivated organisms. Here, using genome-resolved metagenomics, we identify a number of CRISPR-Cas systems, including the first reported Cas9 in the archaeal domain of life, to our knowledge. This divergent Cas9 protein was found in little-studied nanoarchaea as part of an active CRISPR-Cas system. In bacteria, we discovered two previously unknown systems, CRISPR-CasX and CRISPR-CasY, which are among the most compact systems yet discovered. Notably, all required functional components were identified by metagenomics, enabling validation of robust in vivo RNA-guided DNA interference activity in Escherichia coli. Lastly, interrogation of environmental microbial communities combined with in vivo experiments allows us to access an unprecedented diversity of genomes, the content of which will expand the repertoire of microbe-based biotechnologies.
The genetic basis of new treatment modalities in melanoma.

PubMed

Kunz, Manfred

2015-01-01

In recent years, intracellular signal transduction via RAS-RAF-MEK-ERK has been successfully targeted in new treatment approaches for melanoma using small molecule inhibitors against activated BRAF (V600E mutation) and activated MEK1/2. Mutated c-KIT has also been identified as a promising target. Meanwhile, evidence has been provided that combinations of BRAF inhibitors and MEK1/2 inhibitors are more promising than single-agent treatments. Moreover, new treatment algorithms favor sequential treatment using BRAF inhibitors and newly developed immunotherapies targeting cytotoxic T-lymphocyte antigen 4 (CTLA-4) or programmed cell death 1 (PD-1). In-depth molecular analyses have uncovered new mechanisms of treatment resistance and recurrence, which may affect future treatment decisions. Moreover, next-generation sequencing data have shown that recurrent lesions harbor specific genetic aberrations. At the same time, high-throughput sequencing studies of melanoma have revealed a series of new candidates for future treatment approaches, such as ERBB4, GRIN2A, GRM3, and RAC1. More recent bioinformatic technologies have provided genetic evidence for extensive tumor heterogeneity and tumor clonality of solid tumors, which might also be of relevance for melanoma. However, these technologies have not yet been applied to this tumor. This review provides an overview of the genetic basis of current treatment of melanoma, treatment resistance and recurrences, including new treatment perspectives based on recent high-throughput sequencing data. Moreover, future aspects of individualized treatment based on each patient's individual mutational landscape are discussed.
Next-Generation Immune Repertoire Sequencing as a Clue to Elucidate the Landscape of Immune Modulation by Host-Gut Microbiome Interactions.

PubMed

Ichinohe, Tatsuo; Miyama, Takahiko; Kawase, Takakazu; Honjo, Yasuko; Kitaura, Kazutaka; Sato, Hiroyuki; Shin-I, Tadasu; Suzuki, Ryuji

2018-01-01

The human immune system is a fine network consisting of innumerable functional cells that balance immunity and tolerance against various endogenous and environmental challenges. Although advances in modern immunology have revealed a role for many unique immune cell subsets, technologies that enable us to capture the whole landscape of immune responses against specific antigens have not been available to date. Acquired immunity against various microorganisms, including the host microbiome, is principally founded on T cell and B cell populations, each of which expresses antigen-specific receptors that define a unique clonotype. Over the past several years, high-throughput next-generation sequencing has been developed as a powerful tool to profile T- and B-cell receptor repertoires in a given individual at the single-cell level. Sophisticated immuno-bioinformatic analyses using this innovative methodology have already been implemented in the clinical development of antibody engineering, vaccine design, and cellular immunotherapy. In this article, we aim to discuss the possible application of high-throughput immune receptor sequencing in the field of nutritional and intestinal immunology. Although there are still unsolved caveats, this emerging technology combined with single-cell transcriptomics/proteomics provides a critical tool to unveil the previously unrecognized principle of host-microbiome immune homeostasis. Accumulation of such knowledge will lead to the development of effective ways for personalized immune modulation through deeper understanding of the mechanisms by which the intestinal environment affects our immune ecosystem.
Genes for seed longevity in barley identified by genomic analysis on Near Isogenic Lines.

PubMed

Wozny, Dorothee; Kramer, Katharina; Finkemeier, Iris; Acosta, Ivan F; Koornneef, Maarten

2018-05-09

Genes controlling differences in seed longevity between two barley (Hordeum vulgare) accessions were identified by combining quantitative genetics 'omics' technologies in Near Isogenic Lines (NILs). The NILs were derived from crosses between the spring barley landraces L94 from Ethiopia and Cebada Capa from Argentina. A combined transcriptome and proteome analysis on mature, non-aged seeds of the two parental lines and the L94 NILs by RNA-sequencing and total seed proteomic profiling identified the UDP-glycosyltransferase MLOC_11661.1 as candidate gene for the QTL on 2H, and the NADP-dependent malic enzyme (NADP-ME) MLOC_35785.1 as possible downstream target gene. To validate these candidates, they were expressed in Arabidopsis under the control of constitutive promoters to attempt complementing the T-DNA knock-out line nadp-me1. Both the NADP-ME MLOC_35785.1 and the UDP-glycosyltransferase MLOC_11661.1 were able to rescue the nadp-me1 seed longevity phenotype. In the case of the UDP-glycosyltransferase, with high accumulation in NILs, only the coding sequence of Cebada Capa had a rescue effect. This article is protected by copyright. All rights reserved.
Microbial genomics, transcriptomics and proteomics: new discoveries in decomposition research using complementary methods.

PubMed

Baldrian, Petr; López-Mondéjar, Rubén

2014-02-01

Molecular methods for the analysis of biomolecules have undergone rapid technological development in the last decade. The advent of next-generation sequencing methods and improvements in instrumental resolution enabled the analysis of complex transcriptome, proteome and metabolome data, as well as a detailed annotation of microbial genomes. The mechanisms of decomposition by model fungi have been described in unprecedented detail by the combination of genome sequencing, transcriptomics and proteomics. The increasing number of available genomes for fungi and bacteria shows that the genetic potential for decomposition of organic matter is widespread among taxonomically diverse microbial taxa, while expression studies document the importance of the regulation of expression in decomposition efficiency. Importantly, high-throughput methods of nucleic acid analysis used for the analysis of metagenomes and metatranscriptomes indicate the high diversity of decomposer communities in natural habitats and their taxonomic composition. Today, the metaproteomics of natural habitats is of interest. In combination with advanced analytical techniques to explore the products of decomposition and the accumulation of information on the genomes of environmentally relevant microorganisms, advanced methods in microbial ecophysiology should increase our understanding of the complex processes of organic matter transformation.
Nanochannel Device with Embedded Nanopore: a New Approach for Single-Molecule DNA Analysis and Manipulation

NASA Astrophysics Data System (ADS)

Zhang, Yuning; Reisner, Walter

2012-02-01

Nanopore and nanochannel based devices are robust methods for biomolecular sensing and single DNA manipulation. Nanopore-based DNA sensing has attractive features that make it a leading candidate as a single-molecule DNA sequencing technology. Nanochannel based extension of DNA, combined with enzymatic or denaturation-based barcoding schemes, is already a powerful approach for genome analysis. We believe that there is revolutionary potential in devices that combine nanochannels with nanopore detectors. In particular, due to the fast translocation of a DNA molecule through a standard nanopore configuration, there is an unfavorable trade-off between signal and sequence resolution. With a combined nanochannel-nanopore device, based on embedding a nanopore inside a nanochannel, we can in principle gain independent control over both DNA translocation speed and sensing signal, solving the key drawback of the standard nanopore configuration. We will discuss our recent progress on device fabrication and characterization. In particular, we demonstrate that we can detect, using fluorescence microscopy, successful translocation of DNA from the nanochannel out through the nanopore, a possible method to 'select' a given barcode for further analysis. We also show that in equilibrium DNA will not escape through an embedded sub-persistence length nanopore, suggesting that the embedded pore could be used as a nanoscale window through which to interrogate a nanochannel extended DNA molecule.
Nanochannel Device with Embedded Nanopore: a New Approach for Single-Molecule DNA Analysis and Manipulation

NASA Astrophysics Data System (ADS)

Zhang, Yuning; Reisner, Walter

2013-03-01

Nanopore and nanochannel based devices are robust methods for biomolecular sensing and single DNA manipulation. Nanopore-based DNA sensing has attractive features that make it a leading candidate as a single-molecule DNA sequencing technology. Nanochannel based extension of DNA, combined with enzymatic or denaturation-based barcoding schemes, is already a powerful approach for genome analysis. We believe that there is revolutionary potential in devices that combine nanochannels with embedded pore detectors. In particular, due to the fast translocation of a DNA molecule through a standard nanopore configuration, there is an unfavorable trade-off between signal and sequence resolution. With a combined nanochannel-nanopore device, based on embedding a pore inside a nanochannel, we can in principle gain independent control over both DNA translocation speed and sensing signal, solving the key draw-back of the standard nanopore configuration. We demonstrate that we can optically detect successful translocation of DNA from the nanochannel out through the nanopore, a possible method to 'select' a given barcode for further analysis. In particular, we show that in equilibrium DNA will not escape through an embedded sub-persistence length nanopore, suggesting that the pore could be used as a nanoscale window through which to interrogate a nanochannel extended DNA molecule. Furthermore, electrical measurements through the nanopore are performed, indicating that DNA sensing is feasible using the nanochannel-nanopore device.
Whole-genome sequencing in bacteriology: state of the art

PubMed Central

Dark, Michael J

2013-01-01

Over the last ten years, genome sequencing capabilities have expanded exponentially. There have been tremendous advances in sequencing technology, DNA sample preparation, genome assembly, and data analysis. This has led to advances in a number of facets of bacterial genomics, including metagenomics, clinical medicine, bacterial archaeology, and bacterial evolution. This review examines the strengths and weaknesses of techniques in bacterial genome sequencing, upcoming technologies, and assembly techniques, and highlights recent studies that demonstrate new applications for bacterial genomics. PMID:24143115
Mississippi Curriculum Framework for Forestry Technology (Program CIP: 03.0401--Forest Harvesting and Production Technology). Postsecondary Programs.

ERIC Educational Resources Information Center

Mississippi Research and Curriculum Unit for Vocational and Technical Education, State College.

This document, which is intended for use by community and junior colleges throughout Mississippi, contains curriculum frameworks for the course sequences in the forestry technology program cluster. Presented in the introductory section are a description of the program and suggested course sequence. Section I lists baseline competencies for the…
Reference quality assembly of the 3.5 Gb genome of Capsicum annuum from a single linked-read library

USDA-ARS's Scientific Manuscript database

Linked-Read sequencing technology has recently been employed successfully for de novo assembly of multiple human genomes; however, the utility of this technology for complex plant genomes is unproven. We evaluated the technology for this purpose by sequencing the 3.5 gigabase (Gb) diploid pepper (Cap...
Mississippi Curriculum Framework for Medical Radiologic Technology (Radiography) (CIP: 51.0907--Medical Radiologic Technology). Postsecondary Programs.

ERIC Educational Resources Information Center

Mississippi Research and Curriculum Unit for Vocational and Technical Education, State College.

This document, which is intended for use by community and junior colleges throughout Mississippi, contains curriculum frameworks for the course sequences in the radiologic technology program. Presented in the introductory section are a description of the program and suggested course sequence. Section I lists baseline competencies for the program,…
Mississippi Curriculum Framework for Civil Technology (Program CIP: 15.0201--Civil Engineering/Civil Technology). Postsecondary Programs.

ERIC Educational Resources Information Center

Mississippi Research and Curriculum Unit for Vocational and Technical Education, State College.

This document, which is intended for use by community and junior colleges throughout Mississippi, contains curriculum frameworks for the course sequences in the civil technology programs cluster. Presented in the introductory section are a description of the program and suggested course sequence. Section I lists baseline competencies, and section…
Mississippi Curriculum Framework for Medical Laboratory Technology Programs (CIP: 51.1004--Medical Laboratory Technology). Postsecondary Programs.

ERIC Educational Resources Information Center

Mississippi Research and Curriculum Unit for Vocational and Technical Education, State College.

This document, which is intended for use by community and junior colleges throughout Mississippi, contains curriculum frameworks for the course sequences in the medical laboratory technology program. Presented in the introductory section are a description of the program and suggested course sequence. Section I lists baseline competencies, and…

The Complete Mitochondrial Genome of Gossypium hirsutum and Evolutionary Analysis of Higher Plant Mitochondrial Genomes

PubMed Central

Su, Aiguo; Geng, Jianing; Grover, Corrinne E.; Hu, Songnian; Hua, Jinping

2013-01-01

Background: Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant. Sequencing the cotton mitochondrial (mt) genome could aid research on the evolution of plant mt genomes. Methodology/Principal Findings: We sequenced the Gossypium hirsutum mt genome using 454 technology combined with screening of a fosmid library and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the locations of cp-like migrated fragments and found several closely linked with mitochondrial genes. Conclusion: The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that the evolution of mt genomes is consistent with plant taxonomy but independent among different species. PMID:23940520
The complete mitochondrial genome of Gossypium hirsutum and evolutionary analysis of higher plant mitochondrial genomes.

PubMed

Liu, Guozheng; Cao, Dandan; Li, Shuangshuang; Su, Aiguo; Geng, Jianing; Grover, Corrinne E; Hu, Songnian; Hua, Jinping

2013-01-01

Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant. Sequencing the cotton mitochondrial (mt) genome could aid research on the evolution of plant mt genomes. We sequenced the Gossypium hirsutum mt genome using 454 technology combined with screening of a fosmid library and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the locations of cp-like migrated fragments and found several closely linked with mitochondrial genes. The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that the evolution of mt genomes is consistent with plant taxonomy but independent among different species.
gCUP: rapid GPU-based HIV-1 co-receptor usage prediction for next-generation sequencing.

PubMed

Olejnik, Michael; Steuwer, Michel; Gorlatch, Sergei; Heider, Dominik

2014-11-15

Next-generation sequencing (NGS) has a large potential in HIV diagnostics, and genotypic prediction models have been developed and successfully tested in recent years. However, although highly accurate, these models lack the computational efficiency to reach their full potential. In this study, we demonstrate the use of graphics processing units (GPUs) in combination with a computational prediction model for HIV tropism. Our new model, named gCUP, is parallelized and optimized for GPUs, is highly accurate, and can classify >175 000 sequences per second on an NVIDIA GeForce GTX 460. The computational efficiency of our new model is the next step to enable NGS technologies to reach clinical significance in HIV diagnostics. Moreover, our approach is not limited to HIV tropism prediction, but can also be easily adapted to other settings, e.g. drug resistance prediction. The source code can be downloaded at http://www.heiderlab.de. Contact: d.heider@wz-straubing.de. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
THE MASTER PROTOCOL CONCEPT

PubMed Central

Allegra, Carmen J.

2015-01-01

During the past decade, biomedical technologies have undergone an explosive evolution, from the publication of the first complete human genome in 2003, after more than a decade of effort and at a cost of hundreds of millions of dollars, to the present time, when a complete genomic sequence can be available in less than a day and at a small fraction of the cost of the original sequence. The widespread availability of next-generation genomic sequencing has opened the door to the development of precision oncology. The need to test multiple new targeted agents, both alone and in combination with other targeted therapies as well as classic cytotoxic agents, demands the development of novel therapeutic platforms (particularly Master Protocols) capable of efficiently and effectively testing multiple targeted agents or targeted therapeutic strategies in relatively small patient subpopulations. Here, we describe the Master Protocol concept, with a focus on the expected gains and complexities of the use of this design. An overview of Master Protocols currently active or in development is provided along with a more extensive discussion of the Lung Master Protocol (Lung-MAP study). PMID:26433553
Unifying cancer and normal RNA sequencing data from different sources

PubMed Central

Wang, Qingguo; Armenia, Joshua; Zhang, Chao; Penson, Alexander V.; Reznik, Ed; Zhang, Liguo; Minet, Thais; Ochoa, Angelica; Gross, Benjamin E.; Iacobuzio-Donahue, Christine A.; Betel, Doron; Taylor, Barry S.; Gao, Jianjiong; Schultz, Nikolaus

2018-01-01

Driven by recent advances in next-generation sequencing (NGS) technologies and an urgent need to decode complex human diseases, a multitude of large-scale studies, such as the Genotype-Tissue Expression project (GTEx) and The Cancer Genome Atlas (TCGA), have recently produced an unprecedented volume of whole-transcriptome sequencing (RNA-seq) data. While these data offer new opportunities to identify the mechanisms underlying disease, the comparison of data from different sources remains challenging, due to differences in sample and data processing. Here, we developed a pipeline that processes and unifies RNA-seq data from different studies, which includes uniform realignment, gene expression quantification, and batch effect removal. We find that uniform alignment and quantification are not sufficient when combining RNA-seq data from different sources and that the removal of other batch effects is essential to facilitate data comparison. We have processed data from GTEx and TCGA and successfully corrected for study-specific biases, enabling comparative analysis between TCGA and GTEx. The normalized datasets are available for download on figshare. PMID:29664468
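The batch effect removal step mentioned in this abstract can be illustrated with a toy example. The abstract does not say which correction method the pipeline uses, so the sketch below is only an assumption-laden illustration: it applies simple per-batch mean centering to log-expression values, and the function name `center_by_batch` and the sample labels are hypothetical. Note that plain centering would also erase real biological differences wherever the batch coincides with the condition of interest, which is why dedicated methods model covariates explicitly.

```python
from collections import defaultdict
from statistics import mean


def center_by_batch(expr, batch_of):
    """Shift each batch so its per-gene mean equals the overall per-gene mean.

    expr:     {sample: {gene: log-expression}} (every sample has every gene)
    batch_of: {sample: batch label}, e.g. the study a sample came from
    """
    # Group sample names by batch label.
    by_batch = defaultdict(list)
    for sample in expr:
        by_batch[batch_of[sample]].append(sample)

    genes = sorted({gene for values in expr.values() for gene in values})
    # Overall mean of each gene across all samples, used as the common target.
    grand = {g: mean(expr[s][g] for s in expr) for g in genes}

    corrected = {s: {} for s in expr}
    for samples in by_batch.values():
        for g in genes:
            batch_mean = mean(expr[s][g] for s in samples)
            for s in samples:
                # Remove the batch-specific offset, keep the overall level.
                corrected[s][g] = expr[s][g] - batch_mean + grand[g]
    return corrected


# Toy data: one gene, two samples per (hypothetical) study.
expr = {
    "gtex_1": {"GENE_A": 1.0},
    "gtex_2": {"GENE_A": 3.0},
    "tcga_1": {"GENE_A": 5.0},
    "tcga_2": {"GENE_A": 7.0},
}
batch_of = {"gtex_1": "GTEx", "gtex_2": "GTEx", "tcga_1": "TCGA", "tcga_2": "TCGA"}
corrected = center_by_batch(expr, batch_of)
```

After correction both toy batches share the same mean for `GENE_A`, while the spread within each batch is unchanged.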
Mapping Argonaute and conventional RNA-binding protein interactions with RNA at single-nucleotide resolution using HITS-CLIP and CIMS analysis

PubMed Central

Moore, Michael; Zhang, Chaolin; Gantman, Emily Conn; Mele, Aldo; Darnell, Jennifer C.; Darnell, Robert B.

2014-01-01

Summary: Identifying sites where RNA-binding proteins (RNABPs) interact with target RNAs opens the door to understanding the vast complexity of RNA regulation. UV-crosslinking and immunoprecipitation (CLIP) is a transformative technology in which RNAs purified from in vivo cross-linked RNA-protein complexes are sequenced to reveal footprints of RNABP:RNA contacts. CLIP combined with high-throughput sequencing (HITS-CLIP) is a generalizable strategy to produce transcriptome-wide RNA binding maps with higher accuracy and resolution than standard RNA immunoprecipitation (RIP) profiling or purely computational approaches. Applying CLIP to Argonaute proteins has expanded the utility of this approach to mapping binding sites for microRNAs and other small regulatory RNAs. Finally, recent advances in data analysis take advantage of crosslink-induced mutation sites (CIMS) to refine RNA-binding maps to single-nucleotide resolution. Once IP conditions are established, HITS-CLIP takes approximately eight days to prepare RNA for sequencing. Established pipelines for data analysis, including for CIMS, take 3-4 days. PMID:24407355
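The idea behind CIMS analysis, that mutations recurring at the same position across independent CLIP tags mark a likely crosslink site, can be sketched as a simple counting step. This is a minimal illustration only: the published analysis relies on statistical significance testing rather than a fixed cutoff, and the function name `recurrent_mutation_sites`, the input layout, and the `min_tags` threshold are all assumptions made here.

```python
from collections import Counter


def recurrent_mutation_sites(tag_mutations, min_tags=2):
    """Return genomic sites mutated in at least `min_tags` unique CLIP tags.

    tag_mutations: one iterable of (chromosome, position) pairs per unique tag.
    """
    counts = Counter()
    for sites in tag_mutations:
        # Use a set so each tag contributes at most once to a given site.
        counts.update(set(sites))
    return sorted(site for site, n in counts.items() if n >= min_tags)


# Three hypothetical tags; only chr1:10 is mutated in more than one of them.
tags = [
    [("chr1", 10)],
    [("chr1", 10), ("chr1", 25)],
    [("chr2", 7)],
]
candidate_sites = recurrent_mutation_sites(tags)
```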
OTG-snpcaller: An Optimized Pipeline Based on TMAP and GATK for SNP Calling from Ion Torrent Data

PubMed Central

Huang, Wenpan; Xi, Feng; Lin, Lin; Zhi, Qihuan; Zhang, Wenwei; Tang, Y. Tom; Geng, Chunyu; Lu, Zhiyuan; Xu, Xun

2014-01-01

Because the new Proton platform from Life Technologies produces markedly different data from those of the Illumina platform, the conventional Illumina data analysis pipeline cannot be used directly. We developed an optimized SNP calling method using TMAP and GATK (OTG-snpcaller). This method combines our own optimized processes, Remove Duplicates According to AS Tag (RDAST) and Alignment Optimize Structure (AOS), with TMAP and GATK to call SNPs from Proton data. We sequenced four sets of exomes captured by Agilent SureSelect and NimbleGen SeqCap EZ Kit, using Life Technologies' Ion Proton sequencer. We then applied OTG-snpcaller and compared our results with those from Torrent Variant Caller. The results indicated that OTG-snpcaller can reduce both false positive and false negative rates. Moreover, we compared our results with Illumina results generated by GATK best practices, and we found that the results of these two platforms were comparable. The good performance in variant calling using GATK best practices can be primarily attributed to the high quality of the Illumina sequences. PMID:24824529
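The abstract names the RDAST step but does not describe how it works. Going only by the name, one plausible reading is that reads sharing a mapping position are treated as duplicates and the one with the best alignment score (the SAM `AS` tag) is kept. The sketch below illustrates that reading; the `Read` record and the function `keep_best_per_position` are hypothetical and are not taken from OTG-snpcaller.

```python
from typing import NamedTuple


class Read(NamedTuple):
    name: str
    chrom: str
    pos: int              # leftmost mapped position
    reverse: bool         # True if mapped to the reverse strand
    alignment_score: int  # value of the SAM "AS" tag


def keep_best_per_position(reads):
    """Among reads with the same chromosome, position and strand, keep the
    one with the highest alignment score (assumed duplicate criterion)."""
    best = {}
    for read in reads:
        key = (read.chrom, read.pos, read.reverse)
        current = best.get(key)
        if current is None or read.alignment_score > current.alignment_score:
            best[key] = read
    return sorted(best.values(), key=lambda r: (r.chrom, r.pos, r.reverse))


reads = [
    Read("r1", "chr1", 100, False, 50),
    Read("r2", "chr1", 100, False, 80),  # duplicate of r1 with a better score
    Read("r3", "chr1", 200, False, 60),
]
kept = [read.name for read in keep_best_per_position(reads)]
```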
A Brief Survey of Media Access Control, Data Link Layer, and Protocol Technologies for Lunar Surface Communications

NASA Technical Reports Server (NTRS)

Wallett, Thomas M.

2009-01-01

This paper surveys and describes some of the existing media access control and data link layer technologies for possible application in lunar surface communications and the advanced wideband Direct Sequence Code Division Multiple Access (DSCDMA) conceptual systems utilizing phased-array technology that will evolve in the next decade. Time Division Multiple Access (TDMA) and Code Division Multiple Access (CDMA) are standard Media Access Control (MAC) techniques that can be incorporated into lunar surface communications architectures. Another novel hybrid technique that is currently being developed for use with smart antenna technology combines the advantages of CDMA with those of TDMA. The relatively new and varied wireless LAN data link layer protocols that are continually under development offer distinct advantages for lunar surface applications over the legacy protocols, which are not wireless. Also, several communication transport and routing protocols can be chosen with characteristics commensurate with smart antenna systems to provide spacecraft communications for links exhibiting high capacity on the surface of the Moon. The proper choices depend on the specific communication requirements.
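As a rough illustration of how direct-sequence CDMA lets several transmitters share one channel, the sketch below spreads two bit streams with orthogonal 4-chip Walsh codes, adds them as an idealized noise-free channel would, and recovers each stream by correlation. The station names, code assignment, and function names are assumptions for this example and do not come from the surveyed systems.

```python
# Two rows of the 4x4 Walsh-Hadamard matrix; their dot product is zero.
CODE_ROVER = [1, -1, 1, -1]
CODE_LANDER = [1, 1, -1, -1]


def spread(bits, code):
    """Multiply every data bit (+1 or -1) by the full chip sequence."""
    return [bit * chip for bit in bits for chip in code]


def despread(signal, code):
    """Correlate each chip-length block with the code and take the sign."""
    n = len(code)
    bits = []
    for start in range(0, len(signal), n):
        block = signal[start:start + n]
        correlation = sum(s * c for s, c in zip(block, code))
        bits.append(1 if correlation > 0 else -1)
    return bits


rover_bits = [1, -1, 1]
lander_bits = [-1, -1, 1]

# Idealized channel: both transmissions simply add, chip by chip.
channel = [a + b for a, b in zip(spread(rover_bits, CODE_ROVER),
                                 spread(lander_bits, CODE_LANDER))]

recovered_rover = despread(channel, CODE_ROVER)
recovered_lander = despread(channel, CODE_LANDER)
```

Because the two codes are orthogonal, the other station's contribution cancels in each correlation, so both bit streams are recovered from the same summed signal.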
Genome Sequencing of Steroid Producing Bacteria Using Ion Torrent Technology and a Reference Genome.

PubMed

Sola-Landa, Alberto; Rodríguez-García, Antonio; Barreiro, Carlos; Pérez-Redondo, Rosario

2017-01-01

Next-generation sequencing technology has greatly eased bacterial genome sequencing, and several tens of thousands of genomes have been sequenced during the last 10 years. Most genome projects are published as draft versions; however, for certain applications the complete genome sequence is required. In this chapter, we describe the strategy that allowed the complete genome sequencing of Mycobacterium neoaurum NRRL B-3805, an industrial strain exploited for steroid production, using Ion Torrent sequencing reads and the genome of a closely related strain as the reference. This protocol can be applied to analyze the genetic variations between closely related strains; for example, to elucidate the point mutations between a parental strain and a random mutagenesis-derived mutant.
Potential in vivo roles of nucleic acid triple-helices

PubMed Central

Buske, Fabian A

2011-01-01

The ability of double-stranded DNA to form a triple-helical structure by hydrogen bonding with a third strand is well established, but the biological functions of these structures remain largely unknown. There is considerable, albeit circumstantial, evidence for the existence of nucleic acid triplexes in vivo, and their potential participation in a variety of biological processes, including chromatin organization, DNA repair, transcriptional regulation and RNA processing, has been investigated in a number of studies to date. There is also a range of possible mechanisms to regulate triplex formation through differential expression of triplex-forming RNAs, alteration of chromatin accessibility, sequence unwinding and nucleotide modifications. With the advent of next-generation sequencing technology combined with targeted approaches to isolate triplexes, it is now possible to survey triplex formation with respect to genomic context, abundance and dynamic changes during differentiation and development, which may open up new vistas in understanding genome biology and gene regulation. PMID:21525785
Design of a final approach spacing tool for TRACON air traffic control

NASA Technical Reports Server (NTRS)

Davis, Thomas J.; Erzberger, Heinz; Bergeron, Hugh

1989-01-01

This paper describes an automation tool that assists air traffic controllers in the Terminal Radar Approach Control (TRACON) Facilities in providing safe and efficient sequencing and spacing of arrival traffic. The automation tool, referred to as the Final Approach Spacing Tool (FAST), allows the controller to interactively choose various levels of automation and advisory information ranging from predicted time errors to speed and heading advisories for controlling time error. FAST also uses a timeline to display current scheduling and sequencing information for all aircraft in the TRACON airspace. FAST combines accurate predictive algorithms and state-of-the-art mouse and graphical interface technology to present advisory information to the controller. Furthermore, FAST exchanges various types of traffic information and communicates with automation tools being developed for the Air Route Traffic Control Center. Thus it is part of an integrated traffic management system for arrival traffic at major terminal areas.
Identifying Differentially Abundant Metabolic Pathways in Metagenomic Datasets

NASA Astrophysics Data System (ADS)

Liu, Bo; Pop, Mihai

Enabled by rapid advances in sequencing technology, metagenomic studies aim to characterize entire communities of microbes, bypassing the need for culturing individual bacterial members. One major goal of such studies is to identify specific functional adaptations of microbial communities to their habitats. Here we describe a powerful analytical method (MetaPath) that can identify differentially abundant pathways in metagenomic datasets, relying on a combination of metagenomic sequence data and prior metabolic pathway knowledge. We show that MetaPath outperforms other common approaches when evaluated on simulated datasets. We also demonstrate the power of our methods in analyzing two publicly available metagenomic datasets: a comparison of the gut microbiome of obese and lean twins, and a comparison of the gut microbiome of infant and adult subjects. We demonstrate that the subpathways identified by our method provide valuable insights into the biological activities of the microbiome.
Intelligent Access to Sequence and Structure Databases (IASSD) - an interface for accessing information from major web databases.

PubMed

Ganguli, Sayak; Gupta, Manoj Kumar; Basu, Protip; Banik, Rahul; Singh, Pankaj Kumar; Vishal, Vineet; Bera, Abhisek Ranjan; Chakraborty, Hirak Jyoti; Das, Sasti Gopal

2014-01-01

With the advent of the age of big data and advances in high-throughput technology, accessing data has become one of the most important steps in the entire knowledge discovery process. Most users are not able to decipher the query results obtained when nonspecific keywords or combinations of keywords are used. Intelligent Access to Sequence and Structure Databases (IASSD) is a desktop application for the Windows operating system. It is written in Java and utilizes the Web Services Description Language (WSDL) files and JAR files of E-utilities of various databases such as the National Center for Biotechnology Information (NCBI) and the Protein Data Bank (PDB). In addition, IASSD allows the user to view protein structures using a Jmol application, which supports conditional editing. The JAR file is freely available by e-mail from the corresponding author.
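To give a sense of what a specific, field-restricted query looks like when sent to NCBI programmatically, the sketch below builds an E-utilities `esearch` request URL. It uses the URL-based E-utilities interface rather than the WSDL stubs that IASSD is described as using, and it only constructs the URL without contacting the server. The helper name `esearch_url` and the example search term are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Base address of the URL-based NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an esearch URL for one Entrez database.

    db:     Entrez database name, e.g. "protein" or "nucleotide"
    term:   Entrez query; field tags such as [Organism] narrow the search
    retmax: maximum number of record identifiers to return
    """
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{EUTILS_BASE}/esearch.fcgi?{query}"


# A field-restricted query instead of a bare, nonspecific keyword.
url = esearch_url("protein", "insulin AND human[Organism]", retmax=5)
```

Restricting the term with a field tag is one way to avoid the unmanageable result sets that nonspecific keywords tend to produce.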
RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome.

PubMed

Wenger, Yvan; Galliot, Brigitte

2013-03-25

Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48'909 unique sequences including splice variants, representing approximately 24'450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10'597 novel Hydra transcripts that encode 529 evolutionarily conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encode short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11'270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events.
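The principle of giving Illumina reads primacy over 454 reads for homopolymer errors can be sketched in a few lines. The abstract does not describe the actual procedure, so this is an assumed simplification: two overlapping sequences are compared after collapsing runs of identical bases, and if they then agree, the disagreement is attributed to run length and the Illumina version is kept. The function names are hypothetical.

```python
from itertools import groupby


def collapse_homopolymers(seq):
    """Reduce every run of identical bases to a single base (AAACC -> AC)."""
    return "".join(base for base, _ in groupby(seq))


def reconcile(illumina_seq, seq_454):
    """Return a consensus for two sequences covering the same region.

    Identical sequences are returned unchanged. If they differ only in
    homopolymer run lengths, the Illumina version wins. Any other
    disagreement returns None so it can be examined separately.
    """
    if illumina_seq == seq_454:
        return illumina_seq
    if collapse_homopolymers(illumina_seq) == collapse_homopolymers(seq_454):
        return illumina_seq
    return None


# The 454 read has one T too few in the homopolymer run.
consensus = reconcile("ACGTTTTA", "ACGTTTA")
```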
RNAseq versus genome-predicted transcriptomes: a large population of novel transcripts identified in an Illumina-454 Hydra transcriptome

PubMed Central

2013-01-01

Background: Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. Results: To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48’909 unique sequences including splice variants, representing approximately 24’450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10’597 novel Hydra transcripts that encode 529 evolutionarily conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encode short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11’270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. Conclusions: We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events. PMID:23530871
Massively parallel sequencing of 124 SNPs included in the precision ID identity panel in three East Asian minority ethnicities.

PubMed

Liu, Jing; Wang, Zheng; He, Guanglin; Zhao, Xueying; Wang, Mengge; Luo, Tao; Li, Chengtao; Hou, Yiping

2018-07-01

Massively parallel sequencing (MPS) technologies can sequence many targeted regions of multiple samples simultaneously and are gaining great interest in the forensic community. The Precision ID Identity Panel contains 90 autosomal SNPs and 34 upper Y-Clade SNPs, and was designed with small amplicons and optimized for degraded or challenging forensic samples. Here, 184 unrelated individuals from three East Asian minority ethnicities (Tibetan, Uygur and Hui) were analyzed using the Precision ID Identity Panel and the Ion PGM System. The sequencing performance and corresponding forensic statistical parameters of this MPS-SNP panel were investigated. The inter-population relationships and substructures among three investigated populations and 30 worldwide populations were further investigated using PCA, MDS, cladogram and STRUCTURE. No significant deviation from Hardy-Weinberg equilibrium (HWE) and Linkage Disequilibrium (LD) tests was observed across all 90 autosomal SNPs. The combined matching probability (CMP) for Tibetan, Uygur and Hui were 2.5880 × 10^-33, 1.7480 × 10^-35 and 4.6326 × 10^-34 respectively, and the combined power of exclusion (CPE) were 0.999999386152271, 0.999999607712827 and 0.999999696360182 respectively. For 34 Y-SNPs, only 16 haplogroups were obtained, but the haplogroup distributions differ among the three populations. Tibetans from the Sino-Tibetan population and Hui with multiple ethnicities with an admixture population have genetic affinity with East Asian populations, while Uygurs of a Eurasian admixture population have similar genetic components to the South Asian populations and are distributed between East Asian and European populations. The aforementioned results suggest that the Precision ID Identity Panel is informative and polymorphic in three investigated populations and could be used as an effective tool for human forensics. Copyright © 2018 Elsevier B.V. All rights reserved.
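The combined statistics quoted in this abstract are conventionally obtained by multiplying per-locus values across independent markers. The sketch below shows that arithmetic under the usual definitions: a locus's matching probability is the sum of its squared genotype frequencies, CMP is the product of the per-locus matching probabilities, and CPE is one minus the product of the per-locus non-exclusion probabilities. The function names and toy numbers are illustrative assumptions and are not data from the study.

```python
from math import prod


def matching_probability(genotype_freqs):
    """Chance that two random individuals share a genotype at one locus."""
    return sum(f * f for f in genotype_freqs)


def combined_matching_probability(per_locus_mp):
    """Product of per-locus matching probabilities (independent loci)."""
    return prod(per_locus_mp)


def combined_power_of_exclusion(per_locus_pe):
    """One minus the chance that no locus excludes (independent loci)."""
    return 1.0 - prod(1.0 - pe for pe in per_locus_pe)


# Toy biallelic SNP with allele frequency 0.5 under Hardy-Weinberg
# proportions: genotype frequencies are 0.25, 0.5 and 0.25.
mp_one_snp = matching_probability([0.25, 0.5, 0.25])
cmp_two_snps = combined_matching_probability([mp_one_snp, mp_one_snp])
cpe_two_loci = combined_power_of_exclusion([0.5, 0.5])
```

With 90 autosomal SNPs, repeated multiplication of per-locus values well below one is what drives the CMP down to the very small orders of magnitude reported above.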
Impact of gluten-friendly™ technology on wheat kernel endosperm and gluten protein structure in seeds by light and electron microscopy.

PubMed

Landriscina, L; D'Agnello, P; Bevilacqua, A; Corbo, M R; Sinigaglia, M; Lamacchia, C

2017-04-15

The main aim of this paper was to assess the impact of Gluten-Friendly™ (GF) technology (Italian priority patent n° 102015000084813 filed on 17th December 2015) on wheat kernel endosperm morphology and gluten protein structure, using SEM, light and immunofluorescent microscopy. Microscopy was combined with immunodetection with specific antibodies for gliadins, γ-gliadins, LMW subunits and antigenic epitopes to gain a better understanding of the technology at a molecular level. The results showed significant changes to gluten proteins after GF treatment; cross-reactivity towards the antibodies recognizing almost the entire range of gluten proteins as well as the antigenic epitopes through the sequences QQSF, QQSY, PEQPFPQGC and QQPFP was significantly reduced. The present study confirms the results from our previous work and shows, for the first time, the mechanism by which a chemical-physical treatment abolishes the antigenic capacity of gluten. Copyright © 2016 Elsevier Ltd. All rights reserved.
Molecular inversion probe assay for allelic quantitation

PubMed Central

Ji, Hanlee; Welch, Katrina

2010-01-01

Molecular inversion probe (MIP) technology has been demonstrated to be a robust platform for large-scale dual genotyping and copy number analysis. Applications in human genomic and genetic studies include the possibility of running dual germline genotyping and combined copy number variation ascertainment. MIPs analyze large numbers of specific genetic target sequences in parallel, relying on interrogation of a barcode tag rather than direct hybridization of genomic DNA to an array. The MIP approach does not replace, but is complementary to, many of the copy number technologies in use today. Some specific advantages of MIP technology include less DNA required (37 ng vs. 250 ng), lower dependence on DNA quality, greater dynamic range (amplifications detected up to copy number 60), “cleaner” allele-specific information (less SNP crosstalk/contamination), and better marker quality (fewer individual MIPs than SNPs needed to identify copy number changes). MIPs can be considered a candidate gene (targeted whole genome) approach and can find specific areas of interest that may otherwise be missed with other methods. PMID:19488872
High-throughput multiplex HLA-typing by ligase detection reaction (LDR) and universal array (UA) approach.

PubMed

Consolandi, Clarissa

2009-01-01

One major goal of genetic research is to understand the role of genetic variation in living systems. In humans, by far the most common type of such variation involves differences in single DNA nucleotides, and is thus termed single nucleotide polymorphism (SNP). The need for improvement in throughput and reliability of traditional techniques makes it necessary to develop new technologies. Thus the past few years have witnessed an extraordinary surge of interest in DNA microarray technology. This new technology offers the first great hope for providing a systematic way to explore the genome. It permits very rapid analysis of thousands of genes for the purpose of gene discovery, sequencing, mapping, expression, and polymorphism detection. We generated a series of analytical tools to address the manufacturing, detection and data analysis components of a microarray experiment. In particular, we set up a universal array approach in combination with a PCR-LDR (polymerase chain reaction-ligation detection reaction) strategy for allele identification in the HLA gene.
Newborn Sequencing in Genomic Medicine and Public Health.

PubMed

Berg, Jonathan S; Agrawal, Pankaj B; Bailey, Donald B; Beggs, Alan H; Brenner, Steven E; Brower, Amy M; Cakici, Julie A; Ceyhan-Birsoy, Ozge; Chan, Kee; Chen, Flavia; Currier, Robert J; Dukhovny, Dmitry; Green, Robert C; Harris-Wai, Julie; Holm, Ingrid A; Iglesias, Brenda; Joseph, Galen; Kingsmore, Stephen F; Koenig, Barbara A; Kwok, Pui-Yan; Lantos, John; Leeder, Steven J; Lewis, Megan A; McGuire, Amy L; Milko, Laura V; Mooney, Sean D; Parad, Richard B; Pereira, Stacey; Petrikin, Joshua; Powell, Bradford C; Powell, Cynthia M; Puck, Jennifer M; Rehm, Heidi L; Risch, Neil; Roche, Myra; Shieh, Joseph T; Veeraraghavan, Narayanan; Watson, Michael S; Willig, Laurel; Yu, Timothy W; Urv, Tiina; Wise, Anastasia L

2017-02-01

The rapid development of genomic sequencing technologies has decreased the cost of genetic analysis to the extent that it seems plausible that genome-scale sequencing could have widespread availability in pediatric care. Genomic sequencing provides a powerful diagnostic modality for patients who manifest symptoms of monogenic disease and an opportunity to detect health conditions before their development. However, many technical, clinical, ethical, and societal challenges should be addressed before such technology is widely deployed in pediatric practice. This article provides an overview of the Newborn Sequencing in Genomic Medicine and Public Health Consortium, which is investigating the application of genome-scale sequencing in newborns for both diagnosis and screening. Copyright © 2017 by the American Academy of Pediatrics.

Continuities in stone flaking technology at Liang Bua, Flores, Indonesia.

PubMed

Moore, M W; Sutikna, T; Jatmiko; Morwood, M J; Brumm, A

2009-11-01

This study examines trends in stone tool reduction technology at Liang Bua, Flores, Indonesia, where excavations have revealed a stratified artifact sequence spanning 95k.yr. The reduction sequence practiced throughout the Pleistocene was straightforward and unchanging. Large flakes were produced off-site and carried into the cave where they were reduced centripetally and bifacially by four techniques: freehand, burination, truncation, and bipolar. The locus of technological complexity at Liang Bua was not in knapping products, but in the way techniques were integrated. This reduction sequence persisted across the Pleistocene/Holocene boundary with a minor shift favoring unifacial flaking after 11ka. Other stone-related changes occurred at the same time, including the first appearance of edge-glossed flakes, a change in raw material selection, and more frequent fire-induced damage to stone artifacts. Later in the Holocene, technological complexity was generated by "adding-on" rectangular-sectioned stone adzes to the reduction sequence. The Pleistocene pattern is directly associated with Homo floresiensis skeletal remains and the Holocene changes correlate with the appearance of Homo sapiens. The one reduction sequence continues across this hominin replacement.
Diagnostic Applications of Next Generation Sequencing in Immunogenetics and Molecular Oncology

PubMed Central

Grumbt, Barbara; Eck, Sebastian H.; Hinrichsen, Tanja; Hirv, Kaimo

2013-01-01

Summary With the introduction of the next generation sequencing (NGS) technologies, remarkable new diagnostic applications have been established in daily routine. Implementation of NGS is challenging in clinical diagnostics, but definite advantages and new diagnostic possibilities make the switch to the technology inevitable. In addition to the higher sequencing capacity, clonal sequencing of single molecules, multiplexing of samples, higher diagnostic sensitivity, workflow miniaturization, and cost benefits are some of the valuable features of the technology. After the recent advances, NGS emerged as a proven alternative to classical Sanger sequencing in the typing of human leukocyte antigens (HLA). By virtue of the clonal amplification of single DNA molecules, ambiguous typing results can be avoided. Simultaneously, a higher sample throughput can be achieved by tagging of DNA molecules with multiplex identifiers and pooling of PCR products before sequencing. In our experience, up to 380 samples can be typed for HLA-A, -B, and -DRB1 in high-resolution during every sequencing run. In molecular oncology, NGS shows a markedly increased sensitivity in comparison to the conventional Sanger sequencing and is developing into the standard diagnostic tool in detection of somatic mutations in cancer cells with great impact on personalized treatment of patients. PMID:23922545
Ultra-deep sequencing enables high-fidelity recovery of biodiversity for bulk arthropod samples without PCR amplification

PubMed Central

2013-01-01

Background Next-generation-sequencing (NGS) technologies combined with a classic DNA barcoding approach have enabled fast and credible measurement for biodiversity of mixed environmental samples. However, the PCR amplification involved in nearly all existing NGS protocols inevitably introduces taxonomic biases. In the present study, we developed new Illumina pipelines without PCR amplifications to analyze terrestrial arthropod communities. Results Mitochondrial enrichment directly followed by Illumina shotgun sequencing, at an ultra-high sequence volume, enabled the recovery of Cytochrome c Oxidase subunit 1 (COI) barcode sequences, which allowed for the estimation of species composition at high fidelity for a terrestrial insect community. With 15.5 Gbp of Illumina data, approximately 97% and 92% of the 37 input Operational Taxonomic Units (OTUs) were detected with and without the reference barcode library, respectively, while only 1 novel OTU was found in the latter case. Additionally, a relatively strong correlation between the sequencing volume and the total biomass was observed for species from the bulk sample, suggesting a potential solution to reveal relative abundance. Conclusions The ability of the new Illumina PCR-free pipeline for DNA metabarcoding to detect small arthropod specimens and its tendency to avoid most, if not all, false positives suggests its great potential in biodiversity-related surveillance, such as in biomonitoring programs. However, further improvement for mitochondrial enrichment is likely needed for the application of the new pipeline in analyzing arthropod communities at higher diversity. PMID:23587339
Whole-Genome Sequencing and Assembly with High-Throughput, Short-Read Technologies

PubMed Central

Sundquist, Andreas; Ronaghi, Mostafa; Tang, Haixu; Pevzner, Pavel; Batzoglou, Serafim

2007-01-01

While recently developed short-read sequencing technologies may dramatically reduce the sequencing cost and eventually achieve the $1000 goal for re-sequencing, their limitations prevent the de novo sequencing of eukaryotic genomes with the standard shotgun sequencing protocol. We present SHRAP (SHort Read Assembly Protocol), a sequencing protocol and assembly methodology that utilizes high-throughput short-read technologies. We describe a variation on hierarchical sequencing with two crucial differences: (1) we select a clone library from the genome randomly rather than as a tiling path and (2) we sample clones from the genome at high coverage and reads from the clones at low coverage. We assume that 200 bp read lengths with a 1% error rate and inexpensive random fragment cloning on whole mammalian genomes are feasible. Our assembly methodology is based on first ordering the clones and subsequently performing read assembly in three stages: (1) local assemblies of regions significantly smaller than a clone size, (2) clone-sized assemblies of the results of stage 1, and (3) chromosome-sized assemblies. By aggressively localizing the assembly problem during the first stage, our method succeeds in assembling short, unpaired reads sampled from repetitive genomes. We tested our assembler using simulated reads from D. melanogaster and human chromosomes 1, 11, and 21, and produced assemblies with large sets of contiguous sequence and a misassembly rate comparable to other draft assemblies. Tested on D. melanogaster and the entire human genome, our clone-ordering method produces accurate maps, thereby localizing fragment assembly and enabling the parallelization of the subsequent steps of our pipeline. Thus, we have demonstrated that truly inexpensive de novo sequencing of mammalian genomes will soon be possible with high-throughput, short-read technologies using our methodology. PMID:17534434
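The two-level sampling described in the SHRAP abstract (clones at high coverage, reads per clone at low coverage) can be illustrated with a short coverage calculation. This is a minimal sketch, not the authors' code: the function names and the numeric values are hypothetical, and the covered-fraction formula is the standard Poisson (Lander-Waterman) approximation rather than anything stated in the abstract.

```python
import math


def effective_read_coverage(clone_coverage, read_coverage_per_clone):
    """Genome-wide read depth when reads are sampled from clones.

    If clones cover the genome at depth C and reads cover each clone
    at depth r, the total read bases equal r * C * genome_length,
    so the genome-wide read depth is C * r.
    """
    return clone_coverage * read_coverage_per_clone


def expected_fraction_covered(coverage):
    """Expected fraction of bases hit at least once at a given depth,
    under the Poisson (Lander-Waterman) approximation."""
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    clone_cov = 10.0      # high clone coverage of the genome
    read_cov = 2.0        # low read coverage within each clone
    total = effective_read_coverage(clone_cov, read_cov)
    print(f"genome-wide read depth: {total:.1f}x")
    print(f"fraction of one clone covered by its own reads: "
          f"{expected_fraction_covered(read_cov):.3f}")
```

The second figure shows why a single clone cannot be assembled from its own low-coverage reads alone, which is consistent with the abstract's emphasis on ordering overlapping clones first and assembling reads from neighbouring clones together.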
Pivoting the Plant Immune System from Dissection to Deployment

PubMed Central

Dangl, Jeffery L.; Horvath, Diana M.; Staskawicz, Brian J.

2013-01-01

Diverse and rapidly evolving pathogens cause plant diseases and epidemics that threaten crop yield and food security around the world. Research over the last 25 years has led to an increasingly clear conceptual understanding of the molecular components of the plant immune system. Combined with ever-cheaper DNA-sequencing technology and the rich diversity of germ plasm manipulated for over a century by plant breeders, we now have the means to begin development of durable (long-lasting) disease resistance beyond the limits imposed by conventional breeding and in a manner that will replace costly and unsustainable chemical controls. PMID:23950531
Analysis of Protein-DNA Interaction by Chromatin Immunoprecipitation and DNA Tiling Microarray (ChIP-on-chip).

PubMed

Gao, Hui; Zhao, Chunyan

2018-01-01

Chromatin immunoprecipitation (ChIP) has become the most effective and widely used tool to study the interactions between specific proteins or modified forms of proteins and a genomic DNA region. Combined with genome-wide profiling technologies, such as microarray hybridization (ChIP-on-chip) or massively parallel sequencing (ChIP-seq), ChIP could provide a genome-wide mapping of in vivo protein-DNA interactions in various organisms. Here, we describe a protocol of ChIP-on-chip that uses tiling microarray to obtain a genome-wide profiling of ChIPed DNA.
Method for shallow junction formation

DOEpatents

Weiner, K.H.

1996-10-29

A doping sequence is disclosed that reduces the cost and complexity of forming source/drain regions in complementary metal oxide silicon (CMOS) integrated circuit technologies. The process combines the use of patterned excimer laser annealing, dopant-saturated spin-on glass, silicide contact structures and interference effects created by thin dielectric layers to produce source and drain junctions that are ultrashallow in depth but exhibit low sheet and contact resistance. The process utilizes no photolithography and can be achieved without the use of expensive vacuum equipment. The process margins are wide, and yield loss due to contact of the ultrashallow dopants is eliminated. 8 figs.
Method for shallow junction formation

DOEpatents

Weiner, Kurt H.

1996-01-01

A doping sequence that reduces the cost and complexity of forming source/drain regions in complementary metal oxide silicon (CMOS) integrated circuit technologies. The process combines the use of patterned excimer laser annealing, dopant-saturated spin-on glass, silicide contact structures and interference effects created by thin dielectric layers to produce source and drain junctions that are ultrashallow in depth but exhibit low sheet and contact resistance. The process utilizes no photolithography and can be achieved without the use of expensive vacuum equipment. The process margins are wide, and yield loss due to contact of the ultrashallow dopants is eliminated.
NCBI prokaryotic genome annotation pipeline.

PubMed

Tatusova, Tatiana; DiCuccio, Michael; Badretdin, Azat; Chetvernin, Vyacheslav; Nawrocki, Eric P; Zaslavsky, Leonid; Lomsadze, Alexandre; Pruitt, Kim D; Borodovsky, Mark; Ostell, James

2016-08-19

Recent technological advances have opened unprecedented opportunities for large-scale sequencing and analysis of populations of pathogenic species in disease outbreaks, as well as for large-scale diversity studies aimed at expanding our knowledge across the whole domain of prokaryotes. To meet the challenge of timely interpretation of structure, function and meaning of this vast genetic information, a comprehensive approach to automatic genome annotation is critically needed. In collaboration with Georgia Tech, NCBI has developed a new approach to genome annotation that combines alignment based methods with methods of predicting protein-coding and RNA genes and other functional elements directly from sequence. A new gene finding tool, GeneMarkS+, uses the combined evidence of protein and RNA placement by homology as an initial map of annotation to generate and modify ab initio gene predictions across the whole genome. Thus, the new NCBI's Prokaryotic Genome Annotation Pipeline (PGAP) relies more on sequence similarity when confident comparative data are available, while it relies more on statistical predictions in the absence of external evidence. The pipeline provides a framework for generation and analysis of annotation on the full breadth of prokaryotic taxonomy. For additional information on PGAP see https://www.ncbi.nlm.nih.gov/genome/annotation_prok/ and the NCBI Handbook, https://www.ncbi.nlm.nih.gov/books/NBK174280/. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Surveying N2O-producing pathways in bacteria.

PubMed

Stein, Lisa Y

2011-01-01

Nitrous oxide (N2O) is produced by bacteria as an intermediate of both dissimilatory and detoxification pathways under a range of oxygen levels, although the majority of N2O is released in suboxic to anoxic environments. N2O production under physiologically relevant conditions appears to require the reduction of nitric oxide (NO) produced from the oxidation of hydroxylamine (nitrification), reduction of nitrite (denitrification), or by host cells of pathogenic bacteria. In a single bacterial isolate, N2O-producing pathways can be complex, overlapping, involve multiple enzymes with the same function, and require multiple layers of regulatory machinery. This overview discusses how to identify known N2O-producing inventory and regulatory sequences within bacterial genome sequences and basic physiological approaches for investigating the function of that inventory. A multitude of review articles have been published on individual enzymes, pathways, regulation, and environmental significance of N2O-production encompassing a large diversity of bacterial isolates. The combination of next-generation deep sequencing platforms, emerging proteomics technologies, and basic microbial physiology can be used to expand what is known about N2O-producing pathways in individual bacterial species to discover novel inventory and unifying features of pathways. A combination of approaches is required to understand and generalize the function and control of N2O production across a range of temporal and spatial scales within natural and host environments. Copyright © 2011 Elsevier Inc. All rights reserved.
The challenges of sequencing by synthesis.

PubMed

Fuller, Carl W; Middendorf, Lyle R; Benner, Steven A; Church, George M; Harris, Timothy; Huang, Xiaohua; Jovanovich, Stevan B; Nelson, John R; Schloss, Jeffery A; Schwartz, David C; Vezenov, Dmitri V

2009-11-01

DNA sequencing-by-synthesis (SBS) technology, using a polymerase or ligase enzyme as its core biochemistry, has already been incorporated in several second-generation DNA sequencing systems with significant performance. Notwithstanding the substantial success of these SBS platforms, challenges continue to limit the ability to reduce the cost of sequencing a human genome to $100,000 or less. Achieving dramatically reduced cost with enhanced throughput and quality will require the seamless integration of scientific and technological effort across disciplines within biochemistry, chemistry, physics and engineering. The challenges include sample preparation, surface chemistry, fluorescent labels, optimizing the enzyme-substrate system, optics, instrumentation, understanding tradeoffs of throughput versus accuracy, and read-length/phasing limitations. By framing these challenges in a manner accessible to a broad community of scientists and engineers, we hope to solicit input from the broader research community on means of accelerating the advancement of genome sequencing technology.
MPRAnator: a web-based tool for the design of massively parallel reporter assay experiments

PubMed Central

Georgakopoulos-Soares, Ilias; Jain, Naman; Gray, Jesse M; Hemberg, Martin

2017-01-01

Motivation: With the rapid advances in DNA synthesis and sequencing technologies and the continuing decline in the associated costs, high-throughput experiments can be performed to investigate the regulatory role of thousands of oligonucleotide sequences simultaneously. Nevertheless, designing high-throughput reporter assay experiments such as massively parallel reporter assays (MPRAs) and similar methods remains challenging. Results: We introduce MPRAnator, a set of tools that facilitate rapid design of MPRA experiments. With MPRA Motif design, a set of variables provides fine control of how motifs are placed into sequences, thereby allowing the investigation of the rules that govern transcription factor (TF) occupancy. MPRA single-nucleotide polymorphism design can be used to systematically examine the functional effects of single or combinations of single-nucleotide polymorphisms at regulatory sequences. Finally, the Transmutation tool allows for the design of negative controls by permitting scrambling, reversing, complementing or introducing multiple random mutations in the input sequences or motifs. Availability and implementation: MPRAnator tool set is implemented in Python, Perl and Javascript and is freely available at www.genomegeek.com and www.sanger.ac.uk/science/tools/mpranator. The source code is available on www.github.com/hemberg-lab/MPRAnator/ under the MIT license. The REST API allows programmatic access to MPRAnator using simple URLs. Contact: igs@sanger.ac.uk or mh26@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27605100
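The negative-control operations named in the abstract (scrambling, reversing, complementing, introducing random mutations) are simple string transformations. The sketch below illustrates them under stated assumptions: it is not MPRAnator's code or API, all function names are hypothetical, and it handles only the unambiguous DNA alphabet A, C, G, T.

```python
import random

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse(seq):
    """Return the sequence read back to front."""
    return seq[::-1]


def complement(seq):
    """Return the base-wise complement (A<->T, C<->G)."""
    return seq.translate(_COMPLEMENT)


def scramble(seq, rng):
    """Shuffle the bases, preserving base composition."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def mutate(seq, n_mutations, rng):
    """Substitute n_mutations distinct positions with a different base."""
    if n_mutations > len(seq):
        raise ValueError("more mutations requested than positions available")
    bases = list(seq)
    for pos in rng.sample(range(len(bases)), n_mutations):
        # Exclude the current base so every chosen position really changes.
        choices = [b for b in "ACGT" if b != bases[pos].upper()]
        bases[pos] = rng.choice(choices)
    return "".join(bases)


if __name__ == "__main__":
    rng = random.Random(0)          # seeded for reproducible controls
    motif = "TGACGTCA"              # hypothetical example motif
    print(reverse(motif), complement(motif))
    print(scramble(motif, rng), mutate(motif, 2, rng))
```

Passing an explicit `random.Random` instance keeps the generated controls reproducible, which matters when the same library design has to be regenerated later.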
MPRAnator: a web-based tool for the design of massively parallel reporter assay experiments.

PubMed

Georgakopoulos-Soares, Ilias; Jain, Naman; Gray, Jesse M; Hemberg, Martin

2017-01-01

With the rapid advances in DNA synthesis and sequencing technologies and the continuing decline in the associated costs, high-throughput experiments can be performed to investigate the regulatory role of thousands of oligonucleotide sequences simultaneously. Nevertheless, designing high-throughput reporter assay experiments such as massively parallel reporter assays (MPRAs) and similar methods remains challenging. We introduce MPRAnator, a set of tools that facilitate rapid design of MPRA experiments. With MPRA Motif design, a set of variables provides fine control of how motifs are placed into sequences, thereby allowing the investigation of the rules that govern transcription factor (TF) occupancy. MPRA single-nucleotide polymorphism design can be used to systematically examine the functional effects of single or combinations of single-nucleotide polymorphisms at regulatory sequences. Finally, the Transmutation tool allows for the design of negative controls by permitting scrambling, reversing, complementing or introducing multiple random mutations in the input sequences or motifs. MPRAnator tool set is implemented in Python, Perl and Javascript and is freely available at www.genomegeek.com and www.sanger.ac.uk/science/tools/mpranator. The source code is available on www.github.com/hemberg-lab/MPRAnator/ under the MIT license. The REST API allows programmatic access to MPRAnator using simple URLs. Contact: igs@sanger.ac.uk or mh26@sanger.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Deep-sequencing to resolve complex diversity of apicomplexan parasites in platypuses and echidnas: Proof of principle for wildlife disease investigation.

PubMed

Šlapeta, Jan; Saverimuttu, Stefan; Vogelnest, Larry; Sangster, Cheryl; Hulst, Frances; Rose, Karrie; Thompson, Paul; Whittington, Richard

2017-11-01

The short-beaked echidna (Tachyglossus aculeatus) and the platypus (Ornithorhynchus anatinus) are iconic egg-laying monotremes (Mammalia: Monotremata) from Australasia. The aim of this study was to demonstrate the utility of diversity profiles in disease investigations of monotremes. Using small subunit (18S) rDNA amplicon deep-sequencing we demonstrated the presence of apicomplexan parasites and confirmed by direct and cloned amplicon gene sequencing Theileria ornithorhynchi, Theileria tachyglossi, Eimeria echidnae and Cryptosporidium fayeri. Using a combination of samples from healthy and diseased animals, we show a close evolutionary relationship between species of coccidia (Eimeria) and piroplasms (Theileria) from the echidna and platypus. The presence of E. echidnae was demonstrated in faeces and tissues affected by disseminated coccidiosis. Moreover, the presence of E. echidnae DNA in the blood of echidnas was associated with atoxoplasma-like stages in white blood cells, suggesting Hepatozoon tachyglossi blood stages are disseminated E. echidnae stages. These next-generation DNA sequencing technologies are suited to material and organisms that have not been previously characterised and for which the material is scarce. The deep sequencing approach supports traditional diagnostic methods, including microscopy, clinical pathology and histopathology, to better define the status quo. This approach is particularly suitable for wildlife disease investigation. Copyright © 2017 Elsevier B.V. All rights reserved.
Whole-Exome Sequencing to Decipher the Genetic Heterogeneity of Hearing Loss in a Chinese Family with Deaf by Deaf Mating

PubMed Central

Qing, Jie; Yan, Denise; Zhou, Yuan; Liu, Qiong; Wu, Weijing; Xiao, Zian; Liu, Yuyuan; Liu, Jia; Du, Lilin; Xie, Dinghua; Liu, Xue Zhong

2014-01-01

Inherited deafness has been shown to have high genetic heterogeneity. For many decades, linkage analysis and candidate gene approaches have been the main tools to elucidate the genetics of hearing loss. However, this associated study design is costly, time-consuming, and unsuitable for small families. This is mainly due to the inadequate numbers of available affected individuals, locus heterogeneity, and assortative mating. Exome sequencing has now become a technically feasible and cost-effective method for detection of disease variants underlying Mendelian disorders due to the recent advances in next-generation sequencing (NGS) technologies. In the present study, we have combined both the Deafness Gene Mutation Detection Array and exome sequencing to identify deafness causative variants in a large Chinese composite family with deaf by deaf mating. The simultaneous screening of the 9 common deafness mutations using the allele-specific PCR based universal array resulted in the identification of the 1555A>G mutation in the mitochondrial DNA (mtDNA) 12S rRNA in affected individuals in one branch of the family. We then subjected the mutation-negative cases to exome sequencing and identified novel causative variants in the MYH14 and WFS1 genes. This report confirms the effective use of an NGS technique to detect pathogenic mutations in affected individuals who were not candidates for classical genetic studies. PMID:25289672
SMART SKINS - A Development Roadmap

NASA Astrophysics Data System (ADS)

Lochocki, Joseph M.

1990-02-01

The Air Force Project Forecast II identified a number of key technology initiatives for development. This paper addresses one such initiative, PT-16, Smart Skins. The concept of the Smart Skin is introduced by briefly highlighting its attributes and potential advantages over standard avionics packaging and maintenance, and the paper then goes on to describe some of the key ingredients necessary for its development. Problem areas are brought out along with some of the required trades that must be made. Finally, a time-phased development roadmap is introduced which shows Calspan's proposed sequence of technology development programs that can, in combination, lead to first functional Smart Skins implementations in narrowband form in the late 1990's and in wideband form in the first decade of the twenty-first century. A Smart Skins implementation in integral aircraft skin structure form will take at least until 2010.
Mississippi Curriculum Framework for Emergency Medical Technology--Basic (Program CIP: 51.0904). Emergency Medical Technology--Paramedic (Program CIP: 51.0904). Postsecondary Programs.

ERIC Educational Resources Information Center

Mississippi Research and Curriculum Unit for Vocational and Technical Education, State College.

This document, which is intended for use by community and junior colleges throughout Mississippi, contains curriculum frameworks for the course sequences in the emergency medical technology (EMT) programs cluster. Presented in the introductory section are a description of the program and suggested course sequence. Section I lists baseline…
Review of sequencing platforms and their applications in phaeochromocytoma and paragangliomas.

PubMed

Pillai, Suja; Gopalan, Vinod; Lam, Alfred King-Yin

2017-08-01

Genetic testing is recommended for patients with phaeochromocytoma (PCC) and paraganglioma (PGL) because of their genetic heterogeneity and heritability. Due to the large number of susceptibility genes associated with PCC/PGL, next-generation sequencing (NGS) technology is ideally suited for carrying out genetic screening of these individuals. New generations of DNA sequencing technologies facilitate the development of comprehensive genetic testing in PCC/PGL at a lower cost. Whole-exome sequencing and targeted NGS are the preferred methods for screening of PCC/PGL, both having precise mutation detection methods and low costs. RNA sequencing and DNA methylation studies using NGS technology in PCC/PGL can be adopted to act as diagnostic or prognostic biomarkers as well as in planning targeted epigenetic treatment of patients with PCC/PGL. The designs of NGS having a high depth of coverage and robust analytical pipelines can lead to the successful detection of a wide range of genomic defects in PCC/PGL. Nevertheless, the major challenges of this technology must be addressed before it has practical applications in the clinical diagnostics to fulfill the goal of personalized medicine in PCC/PGL. In future, novel approaches of sequencing, such as third and fourth generation sequencing can alter the workflow, cost, analysis, and interpretation of genomics associated with PCC/PGL. Copyright © 2017 Elsevier B.V. All rights reserved.
Gene calling and bacterial genome annotation with BG7.

PubMed

Tobes, Raquel; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Kovach, Evdokim; Alekhin, Alexey; Pareja, Eduardo

2015-01-01

New massive sequencing technologies are providing many bacterial genome sequences from diverse taxa but a refined annotation of these genomes is crucial for obtaining scientific findings and new knowledge. Thus, bacterial genome annotation has emerged as a key point to investigate in bacteria. Any efficient tool designed specifically to annotate bacterial genomes sequenced with massively parallel technologies has to consider the specific features of bacterial genomes (absence of introns and scarcity of nonprotein-coding sequence) and of next-generation sequencing (NGS) technologies (presence of errors and not perfectly assembled genomes). These features make it convenient to focus on coding regions and, hence, on protein sequences that are the elements directly related with biological functions. In this chapter we describe how to annotate bacterial genomes with BG7, an open-source tool based on a protein-centered gene calling/annotation paradigm. BG7 is specifically designed for the annotation of bacterial genomes sequenced with NGS. This tool is tolerant of sequence errors and retains its capabilities when annotating highly fragmented genomes or mixed sequences coming from several genomes (such as those obtained from metagenomics samples). BG7 has been designed with scalability as a requirement, with a computing infrastructure completely based on cloud computing (Amazon Web Services).
Construction of a high-density genetic map for grape using next generation restriction-site associated DNA sequencing

PubMed Central

2012-01-01

Background Genetic mapping and QTL detection are powerful methodologies in plant improvement and breeding. Construction of a high-density and high-quality genetic map would be of great benefit in the production of superior grapes to meet human demand. High throughput and low cost of the recently developed next generation sequencing (NGS) technology have resulted in its wide application in genome research. Sequencing restriction-site associated DNA (RAD) might be an efficient strategy to simplify genotyping. Combining NGS with RAD has proven to be powerful for single nucleotide polymorphism (SNP) marker development. Results An F1 population of 100 individual plants was developed. In-silico digestion-site prediction was used to select an appropriate restriction enzyme for construction of a RAD sequencing library. Next generation RAD sequencing was applied to genotype the F1 population and its parents. Applying a cluster strategy for SNP modulation, a total of 1,814 high-quality SNP markers were developed: 1,121 of these were mapped to the female genetic map, 759 to the male map, and 1,646 to the integrated map. A comparison of the genetic maps to the published Vitis vinifera genome revealed both conservation and variations. Conclusions The applicability of next generation RAD sequencing for genotyping a grape F1 population was demonstrated, leading to the successful development of a genetic map with high density and quality using our designed SNP markers. Detailed analysis revealed that this newly developed genetic map can be used for a variety of genome investigations, such as QTL detection, sequence assembly and genome comparison. PMID:22908993
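The in-silico digestion step mentioned in the abstract amounts to locating recognition sites in a reference sequence and examining the resulting fragment sizes. A minimal sketch follows; it is not the authors' pipeline, the function names are hypothetical, and EcoRI (G^AATTC, cutting one base into the site) is used only as a well-known illustration, since the abstract does not name the enzyme that was chosen.

```python
import re


def cut_positions(sequence, site, cut_offset):
    """Return 0-based cut coordinates for every occurrence of `site`.

    `cut_offset` is the number of bases from the start of the
    recognition site to the cut on the top strand (1 for EcoRI).
    A lookahead pattern is used so overlapping sites are also found.
    """
    pattern = re.compile(f"(?={re.escape(site.upper())})")
    return [m.start() + cut_offset for m in pattern.finditer(sequence.upper())]


def fragment_lengths(sequence, site, cut_offset):
    """Return the lengths of the fragments produced by a complete digest."""
    bounds = [0] + cut_positions(sequence, site, cut_offset) + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Toy sequence with two EcoRI sites, for illustration only.
    toy = "AAGAATTCCCGAATTCTT"
    print(cut_positions(toy, "GAATTC", 1))
    print(fragment_lengths(toy, "GAATTC", 1))
```

Running the same functions over a reference genome for several candidate enzymes gives the number of sites per enzyme, which is the kind of information one would weigh when choosing an enzyme for a RAD library.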

Overcoming bias and systematic errors in next generation sequencing data.

PubMed

Taub, Margaret A; Corrada Bravo, Hector; Irizarry, Rafael A

2010-12-10

Considerable time and effort has been spent in developing analysis and quality assessment methods to allow the use of microarrays in a clinical setting. As is the case for microarrays and other high-throughput technologies, data from new high-throughput sequencing technologies are subject to technological and biological biases and systematic errors that can impact downstream analyses. Only when these issues can be readily identified and reliably adjusted for will clinical applications of these new technologies be feasible. Although much work remains to be done in this area, we describe consistently observed biases that should be taken into account when analyzing high-throughput sequencing data. In this article, we review current knowledge about these biases, discuss their impact on analysis results, and propose solutions.
Future of breeding by genome editing is in the hands of regulators

PubMed Central

Jones, Huw D

2015-01-01

ABSTRACT We are witnessing the timely convergence of several technologies that together will have significant impact on research, human health and in animal and plant breeding. The exponential increase in genome and expressed sequence data, the ability to compile, analyze and mine these data via sophisticated bioinformatics procedures on high-powered computers, and developments in various molecular and in-vitro cellular techniques combine to underpin novel developments in research and commercial biotechnology. Arguably the most important of these is genome editing which encompasses a suite of site directed nucleases (SDN) that can be designed to cut, or otherwise modify predetermined DNA sequences in the genome and result in targeted insertions, deletions, or other changes for genetic improvement. It is a powerful and adaptive technology for animal and plant science, with huge relevance for plant and animal breeding. But this promise will be realized only if the regulatory oversight is proportionate to the potential hazards and has broad support from consumers, researchers and commercial interests. Despite significant progress in research and development and one genome edited crop close to commercialization, in most regions of the world it still remains unclear how or whether this fledgling technology will be regulated. The various risk management authorities and biotechnology regulators have a unique opportunity to set up a logical, appropriate and workable regulatory framework for gene editing that, unlike the situation for GMOs, would have broad support from stakeholders. PMID:26930115
The genome revolution and its role in understanding complex diseases.

PubMed

Hofker, Marten H; Fu, Jingyuan; Wijmenga, Cisca

2014-10-01

The completion of the human genome sequence in 2003 clearly marked the beginning of a new era for biomedical research. It spurred technological progress that was unprecedented in the life sciences, including the development of high-throughput technologies to detect genetic variation and gene expression. The study of genetics has become "big data science". One of the current goals of genetic research is to use genomic information to further our understanding of common complex diseases. An essential first step made towards this goal was by the identification of thousands of single nucleotide polymorphisms showing robust association with hundreds of different traits and diseases. As insight into common genetic variation has expanded enormously and the technology to identify more rare variation has become available, we can utilize these advances to gain a better understanding of disease etiology. This will lead to developments in personalized medicine and P4 healthcare. Here, we review some of the historical events and perspectives before and after the completion of the human genome sequence. We also describe the success of large-scale genetic association studies and how these are expected to yield more insight into complex disorders. We show how we can now combine gene-oriented research and systems-based approaches to develop more complex models to help explain the etiology of common diseases. This article is part of a Special Issue entitled: From Genome to Function. Copyright © 2014 Elsevier B.V. All rights reserved.
Highlights from a Mach 4 Experimental Demonstration of Inlet Mode Transition for Turbine-Based Combined Cycle Hypersonic Propulsion

NASA Technical Reports Server (NTRS)

Foster, Lancert E.; Saunders, John D., Jr.; Sanders, Bobby W.; Weir, Lois J.

2012-01-01

NASA is focused on technologies for combined cycle, air-breathing propulsion systems to enable reusable launch systems for access to space. Turbine Based Combined Cycle (TBCC) propulsion systems offer specific impulse (Isp) improvements over rocket-based propulsion systems in the subsonic takeoff and return mission segments along with improved safety. Among the most critical TBCC enabling technologies are: 1) mode transition from the low speed propulsion system to the high speed propulsion system, 2) high Mach turbine engine development and 3) innovative turbine based combined cycle integration. To address these challenges, NASA initiated an experimental mode transition task including analytical methods to assess the state-of-the-art of propulsion system performance and design codes. One effort has been the Combined-Cycle Engine Large Scale Inlet Mode Transition Experiment (CCE-LIMX) which is a fully integrated TBCC propulsion system with flowpath sizing consistent with previous NASA and DoD proposed Hypersonic experimental flight test plans. This experiment was tested in the NASA GRC 10 by 10-Foot Supersonic Wind Tunnel (SWT) Facility. The goal of this activity is to address key hypersonic combined-cycle engine issues including: (1) dual integrated inlet operability and performance issues-unstart constraints, distortion constraints, bleed requirements, and controls, (2) mode-transition sequence elements caused by switching between the turbine and the ramjet/scramjet flowpaths (imposed variable geometry requirements), and (3) turbine engine transients (and associated time scales) during transition. Testing of the initial inlet and dynamic characterization phases were completed and smooth mode transition was demonstrated. A database focused on a Mach 4 transition speed with limited off-design elements was developed and will serve to guide future TBCC system studies and to validate higher level analyses.
Multiplexed Fragaria chloroplast genome sequencing

Treesearch

W. Njuguna; A. Liston; R. Cronn; N.V. Bassil

2010-01-01

A method to sequence multiple chloroplast genomes using ultra high throughput sequencing technologies was recently described. Complete chloroplast genome sequences can resolve phylogenetic relationships at low taxonomic levels and identify informative point mutations and indels. The objective of this research was to sequence multiple Fragaria...
HIA: a genome mapper using hybrid index-based sequence alignment.

PubMed

Choi, Jongpill; Park, Kiejung; Cho, Seong Beom; Chung, Myungguen

2015-01-01

A number of alignment tools have been developed to align sequencing reads to the human reference genome. The scale of information from next-generation sequencing (NGS) experiments, however, is increasing rapidly. Recent studies based on NGS technology have routinely produced exome or whole-genome sequences from several hundreds or thousands of samples. To accommodate the increasing need to analyze very large NGS data sets, it is necessary to develop faster, more sensitive and accurate mapping tools. HIA uses two indices, a hash table index and a suffix array index. The hash table performs direct lookup of a q-gram, and the suffix array performs very fast lookup of variable-length strings by exploiting binary search. We observed that combining hash table and suffix array (hybrid index) is much faster than the suffix array method for finding a substring in the reference sequence. Here, we define a matching region (MR) as a longest common substring between a reference and a read, and a candidate alignment region (CAR) as a list of MRs that are close to each other. The hybrid index is used to find CARs between a reference and a read. We found that aligning only the unmatched regions in the CAR is much faster than aligning the whole CAR. In benchmark analysis, HIA outperformed the other aligners in mapping speed, without significant loss of mapping accuracy. Our experiments show that the hybrid of hash table and suffix array is useful in terms of speed for mapping NGS sequencing reads to the human reference genome sequence. In conclusion, our tool is appropriate for aligning massive data sets generated by NGS sequencing.
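The hybrid-index idea described in the HIA abstract can be illustrated with a toy sketch. This is not the HIA implementation: the function names, the q-gram length `Q = 4`, and the example reference string are assumptions chosen for illustration. The hash table maps each q-gram to its reference positions for direct lookup, and a naively built suffix array supports binary search for variable-length substrings.

```python
from collections import defaultdict

Q = 4  # q-gram length; an illustrative assumption, not HIA's actual setting


def build_hash_index(ref, q=Q):
    """Map every q-gram in ref to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(ref) - q + 1):
        index[ref[i:i + q]].append(i)
    return index


def build_suffix_array(ref):
    """Naive suffix array: suffix start positions in lexicographic order."""
    return sorted(range(len(ref)), key=lambda i: ref[i:])


def suffix_array_find(ref, sa, pattern):
    """Binary search for all occurrences of a variable-length pattern."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if ref[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if ref[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def hybrid_find(ref, hash_index, sa, pattern, q=Q):
    """Seed with the hash table when possible, else fall back to the suffix array."""
    if len(pattern) < q:
        return suffix_array_find(ref, sa, pattern)
    # Direct q-gram lookup, then verify the rest of the pattern at each hit.
    candidates = hash_index.get(pattern[:q], [])
    return [p for p in candidates if ref.startswith(pattern, p)]


ref = "ACGTACGTTACG"
hash_index = build_hash_index(ref)
sa = build_suffix_array(ref)
print(hybrid_find(ref, hash_index, sa, "ACGTT"))  # [4]
print(hybrid_find(ref, hash_index, sa, "ACG"))    # [0, 4, 9]
```

In this sketch the hash table handles patterns of at least `Q` characters and the suffix array handles shorter ones; a production mapper would use a far more compact suffix array construction than sorting full suffixes.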
Peroxidase gene discovery from the horseradish transcriptome.

PubMed

Näätsaari, Laura; Krainer, Florian W; Schubert, Michael; Glieder, Anton; Thallinger, Gerhard G

2014-03-24

Horseradish peroxidases (HRPs) from Armoracia rusticana have long been utilized as reporters in various diagnostic assays and histochemical stainings. Regardless of their increasing importance in the field of life sciences and suggested uses in medical applications, chemical synthesis and other industrial applications, the HRP isoenzymes, their substrate specificities and enzymatic properties are poorly characterized. Due to lacking sequence information of natural isoenzymes and the low levels of HRP expression in heterologous hosts, commercially available HRP is still extracted as a mixture of isoenzymes from the roots of A. rusticana. In this study, a normalized, size-selected A. rusticana transcriptome library was sequenced using 454 Titanium technology. The resulting reads were assembled into 14871 isotigs with an average length of 1133 bp. Sequence databases, ORF finding and ORF characterization were utilized to identify peroxidase genes from the 14871 isotigs generated by de novo assembly. The sequences were manually reviewed and verified with Sanger sequencing of PCR amplified genomic fragments, resulting in the discovery of 28 secretory peroxidases, 23 of them previously unknown. A total of 22 isoenzymes including allelic variants were successfully expressed in Pichia pastoris and showed peroxidase activity with at least one of the substrates tested, thus enabling their development into commercial pure isoenzymes. This study demonstrates that transcriptome sequencing combined with sequence motif search is a powerful concept for the discovery and quick supply of new enzymes and isoenzymes from any plant or other eukaryotic organisms. Identification and manual verification of the sequences of 28 HRP isoenzymes do not only contribute a set of peroxidases for industrial, biological and biomedical applications, but also provide valuable information on the reliability of the approach in identifying and characterizing a large group of isoenzymes.
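The ORF-finding step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `find_orfs`, the `min_codons` threshold, and the restriction to the forward strand are assumptions made to keep the example short.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for ATG..stop ORFs on the forward strand.

    start is the index of the A in ATG; end is exclusive and includes the
    stop codon. min_codons counts the start and stop codons as well.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and keep scanning this frame
    return orfs


print(find_orfs("CCATGAAATTTTAGGG"))  # [(2, 14, 2)]
```

A real transcriptome analysis would also scan the reverse complement and apply a much larger minimum length before comparing candidates against sequence databases.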
Peroxidase gene discovery from the horseradish transcriptome

PubMed Central

2014-01-01

Background Horseradish peroxidases (HRPs) from Armoracia rusticana have long been utilized as reporters in various diagnostic assays and histochemical stainings. Regardless of their increasing importance in the field of life sciences and suggested uses in medical applications, chemical synthesis and other industrial applications, the HRP isoenzymes, their substrate specificities and enzymatic properties are poorly characterized. Due to lacking sequence information of natural isoenzymes and the low levels of HRP expression in heterologous hosts, commercially available HRP is still extracted as a mixture of isoenzymes from the roots of A. rusticana. Results In this study, a normalized, size-selected A. rusticana transcriptome library was sequenced using 454 Titanium technology. The resulting reads were assembled into 14871 isotigs with an average length of 1133 bp. Sequence databases, ORF finding and ORF characterization were utilized to identify peroxidase genes from the 14871 isotigs generated by de novo assembly. The sequences were manually reviewed and verified with Sanger sequencing of PCR amplified genomic fragments, resulting in the discovery of 28 secretory peroxidases, 23 of them previously unknown. A total of 22 isoenzymes including allelic variants were successfully expressed in Pichia pastoris and showed peroxidase activity with at least one of the substrates tested, thus enabling their development into commercial pure isoenzymes. Conclusions This study demonstrates that transcriptome sequencing combined with sequence motif search is a powerful concept for the discovery and quick supply of new enzymes and isoenzymes from any plant or other eukaryotic organisms. 
Identification and manual verification of the sequences of 28 HRP isoenzymes do not only contribute a set of peroxidases for industrial, biological and biomedical applications, but also provide valuable information on the reliability of the approach in identifying and characterizing a large group of isoenzymes. PMID:24666710
Mismatch and G-Stack Modulated Probe Signals on SNP Microarrays

PubMed Central

Binder, Hans; Fasold, Mario; Glomb, Torsten

2009-01-01

Background Single nucleotide polymorphism (SNP) arrays are important tools widely used for genotyping and copy number estimation. This technology utilizes the specific affinity of fragmented DNA for binding to surface-attached oligonucleotide DNA probes. We analyze the variability of the probe signals of Affymetrix GeneChip SNP arrays as a function of the probe sequence to identify relevant sequence motifs which potentially cause systematic biases of genotyping and copy number estimates. Methodology/Principal Findings The probe design of GeneChip SNP arrays enables us to disentangle different sources of intensity modulations such as the number of mismatches per duplex, matched and mismatched base pairings including nearest and next-nearest neighbors and their position along the probe sequence. The effect of probe sequence was estimated in terms of triple-motifs with central matches and mismatches which include all 256 combinations of possible base pairings. The probe/target interactions on the chip can be decomposed into nearest neighbor contributions which correlate well with free energy terms of DNA/DNA-interactions in solution. The effect of mismatches is about twice as large as that of canonical pairings. Runs of guanines (G) and the particular type of mismatched pairings formed in cross-allelic probe/target duplexes constitute sources of systematic biases of the probe signals with consequences for genotyping and copy number estimates. The poly-G effect seems to be related to the crowded arrangement of probes which facilitates complex formation of neighboring probes with at minimum three adjacent G's in their sequence. Conclusions The applied method of “triple-averaging” represents a model-free approach to estimate the mean intensity contributions of different sequence motifs which can be applied in calibration algorithms to correct signal values for sequence effects. Rules for appropriate sequence corrections are suggested. PMID:19924253
Identification, validation and high-throughput genotyping of transcribed gene SNPs in cassava.

PubMed

Ferguson, Morag E; Hearne, Sarah J; Close, Timothy J; Wanamaker, Steve; Moskal, William A; Town, Christopher D; de Young, Joe; Marri, Pradeep Reddy; Rabbi, Ismail Yusuf; de Villiers, Etienne P

2012-03-01

The availability of genomic resources can facilitate progress in plant breeding through the application of advanced molecular technologies for crop improvement. This is particularly important in the case of less researched crops such as cassava, a staple and food security crop for more than 800 million people. Here, expressed sequence tags (ESTs) were generated from five drought stressed and well-watered cassava varieties. Two cDNA libraries were developed: one from root tissue (CASR), the other from leaf, stem and stem meristem tissue (CASL). Sequencing generated 706 contigs and 3,430 singletons. These sequences were combined with those from two other EST sequencing initiatives and filtered based on the sequence quality. Quality sequences were aligned using CAP3 and embedded in a Windows browser called HarvEST:Cassava which is made available. HarvEST:Cassava consists of a Unigene set of 22,903 quality sequences. A total of 2,954 putative SNPs were identified. Of these 1,536 SNPs from 1,170 contigs and 53 cassava genotypes were selected for SNP validation using Illumina's GoldenGate assay. As a result 1,190 SNPs were validated technically and biologically. The location of validated SNPs on scaffolds of the cassava genome sequence (v.4.1) is provided. A diversity assessment of 53 cassava varieties reveals some sub-structure based on the geographical origin, greater diversity in the Americas as opposed to Africa, and similar levels of diversity in West Africa and southern, eastern and central Africa. The resources presented allow for improved genetic dissection of economically important traits and the application of modern genomics-based approaches to cassava breeding and conservation.
VDJServer: A Cloud-Based Analysis Portal and Data Commons for Immune Repertoire Sequences and Rearrangements.

PubMed

Christley, Scott; Scarborough, Walter; Salinas, Eddie; Rounds, William H; Toby, Inimary T; Fonner, John M; Levin, Mikhail K; Kim, Min; Mock, Stephen A; Jordan, Christopher; Ostmeyer, Jared; Buntzman, Adam; Rubelt, Florian; Davila, Marco L; Monson, Nancy L; Scheuermann, Richard H; Cowell, Lindsay G

2018-01-01

Recent technological advances in immune repertoire sequencing have created tremendous potential for advancing our understanding of adaptive immune response dynamics in various states of health and disease. Immune repertoire sequencing produces large, highly complex data sets, however, which require specialized methods and software tools for their effective analysis and interpretation. VDJServer is a cloud-based analysis portal for immune repertoire sequence data that provides access to a suite of tools for a complete analysis workflow, including modules for preprocessing and quality control of sequence reads, V(D)J gene segment assignment, repertoire characterization, and repertoire comparison. VDJServer also provides sophisticated visualizations for exploratory analysis. It is accessible through a standard web browser via a graphical user interface designed for use by immunologists, clinicians, and bioinformatics researchers. VDJServer provides a data commons for public sharing of repertoire sequencing data, as well as private sharing of data between users. We describe the main functionality and architecture of VDJServer and demonstrate its capabilities with use cases from cancer immunology and autoimmunity. VDJServer provides a complete analysis suite for human and mouse T-cell and B-cell receptor repertoire sequencing data. The combination of its user-friendly interface and high-performance computing allows large immune repertoire sequencing projects to be analyzed with no programming or software installation required. VDJServer is a web-accessible cloud platform that provides access through a graphical user interface to a data management infrastructure, a collection of analysis tools covering all steps in an analysis, and an infrastructure for sharing data along with workflows, results, and computational provenance. VDJServer is a free, publicly available, and open-source licensed resource.
Pyrosequencing the Manduca sexta larval midgut transcriptome: messages for digestion, detoxification and defence.

PubMed

Pauchet, Y; Wilkinson, P; Vogel, H; Nelson, D R; Reynolds, S E; Heckel, D G; ffrench-Constant, R H

2010-02-01

The tobacco hornworm Manduca sexta is an important model for insect physiology but genomic and transcriptomic data are currently lacking. Following a recent pyrosequencing study generating immune related expressed sequence tags (ESTs), here we use this new technology to define the M. sexta larval midgut transcriptome. We generated over 387,000 midgut ESTs, using a combination of Sanger and 454 sequencing, and classified predicted proteins into those involved in digestion, detoxification and immunity. In many cases the depth of 454 pyrosequencing coverage allowed us to define the entire cDNA sequence of a particular gene. Many new M. sexta genes are described including up to 36 new cytochrome P450s, some of which have been implicated in the metabolism of host plant-derived nicotine. New lepidopteran gene families such as the beta-fructofuranosidases, previously thought to be restricted to Bombyx mori, are also described. An unexpectedly high number of ESTs were involved in immunity, for example 39 contigs encoding serpins, and the increasingly appreciated role of the midgut in insect immunity is discussed. Similar studies of other tissues will allow for a tissue by tissue description of the M. sexta transcriptome and will form an essential complementary step on the road to genome sequencing and annotation.
Persistence and evolution of allergen-specific IgE repertoires during subcutaneous specific immunotherapy

PubMed Central

Levin, Mattias; King, Jasmine J.; Glanville, Jacob; Jackson, Katherine J. L.; Looney, Timothy J.; Hoh, Ramona A.; Mari, Adriano; Andersson, Morgan; Greiff, Lennart; Fire, Andrew Z.; Boyd, Scott D.; Ohlin, Mats

2016-01-01

Background Specific immunotherapy (SIT) is the only treatment with proven long-term curative potential in allergic disease. Allergen-specific IgE is the causative agent of allergic disease, and antibodies contribute to SIT, but the effects of SIT on aeroallergen-specific B cell repertoires are not well understood. Objective To characterize the IgE sequences expressed by allergen-specific B cells, and track the fate of these B cell clones during SIT. Methods We have used high-throughput antibody gene sequencing and identification of allergen-specific IgE using combinatorial antibody fragment library technology to analyze immunoglobulin repertoires of blood and nasal mucosa of aeroallergen-sensitized individuals before and during the first year of subcutaneous SIT. Results Of 52 distinct allergen-specific IgE heavy chains from eight allergic donors, 37 were also detected by high-throughput antibody gene sequencing of blood, nasal mucosa, or both sample types. The allergen-specific clones had increased persistence, higher likelihood of belonging to clones expressing other switched isotypes, and possibly larger clone size than the rest of the IgE repertoire. Clone members in nasal tissue showed close mutational relationships. Conclusion Combining functional binding studies, deep antibody repertoire sequencing, and information on clinical outcomes in larger studies may in the future aid assessment of SIT mechanisms and efficacy. PMID:26559321
Breaking the 1000-gene barrier for Mimivirus using ultra-deep genome and transcriptome sequencing.

PubMed

Legendre, Matthieu; Santini, Sébastien; Rico, Alain; Abergel, Chantal; Claverie, Jean-Michel

2011-03-04

Mimivirus, a giant dsDNA virus infecting Acanthamoeba, is the prototype of the Mimiviridae family, the latest addition to the family of the nucleocytoplasmic large DNA viruses (NCLDVs). Its 1.2 Mb-genome was initially predicted to encode 917 genes. A subsequent RNA-Seq analysis precisely mapped many transcript boundaries and identified 75 new genes. We now report a much deeper analysis using the SOLiD™ technology combining RNA-Seq of the Mimivirus transcriptome during the infectious cycle (202.4 Million reads), and a complete genome re-sequencing (45.3 Million reads). This study corrected the genome sequence and identified several single nucleotide polymorphisms. Our results also provided clear evidence of previously overlooked transcription units, including an important RNA polymerase subunit distantly related to Euryarchaea homologues. The total Mimivirus gene count is now 1018, 11% greater than the original annotation. This study highlights the huge progress brought about by ultra-deep sequencing for the comprehensive annotation of virus genomes, opening the door to a complete one-nucleotide resolution level description of their transcriptional activity, and to the realistic modeling of the viral genome expression at the ultimate molecular level. This work also illustrates the need to go beyond bioinformatics-only approaches for the annotation of short protein and non-coding genes in viral genomes.
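The "11% greater" figure in the abstract follows directly from the two gene counts it reports. The short check below uses only those two numbers; the helper name `percent_increase` is a hypothetical name introduced for illustration.

```python
def percent_increase(original, updated):
    """Relative increase of updated over original, in percent."""
    return 100.0 * (updated - original) / original


original_annotation = 917  # genes in the initial prediction (from the abstract)
current_count = 1018       # genes reported after ultra-deep sequencing

added = current_count - original_annotation
print(added)  # 101
print(round(percent_increase(original_annotation, current_count), 1))  # 11.0
```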
High-throughput sequencing and morphology perform equally well for benthic monitoring of marine ecosystems

PubMed Central

Lejzerowicz, Franck; Esling, Philippe; Pillet, Loïc; Wilding, Thomas A.; Black, Kenneth D.; Pawlowski, Jan

2015-01-01

Environmental diversity surveys are crucial for the bioassessment of anthropogenic impacts on marine ecosystems. Traditional benthic monitoring relying on morphotaxonomic inventories of macrofaunal communities is expensive, time-consuming and expertise-demanding. High-throughput sequencing of environmental DNA barcodes (metabarcoding) offers an alternative to describe biological communities. However, whether the metabarcoding approach meets the quality standards of benthic monitoring remains to be tested. Here, we compared morphological and eDNA/RNA-based inventories of metazoans from samples collected at 10 stations around a fish farm in Scotland, including near-cage and distant zones. For each of 5 replicate samples per station, we sequenced the V4 region of the 18S rRNA gene using the Illumina technology. After filtering, we obtained 841,766 metazoan sequences clustered in 163 Operational Taxonomic Units (OTUs). We assigned the OTUs by combining local BLAST searches with phylogenetic analyses. We calculated two commonly used indices: the Infaunal Trophic Index and the AZTI Marine Biotic Index. We found that the molecular data faithfully reflect the morphology-based indices and provide an equivalent assessment of the impact associated with fish farm activities. We advocate that future benthic monitoring should integrate metabarcoding as a rapid and accurate tool for the evaluation of the quality of marine benthic ecosystems. PMID:26355099
A Sequence for Sentence-Combining Instruction.

ERIC Educational Resources Information Center

Lawlor, Joseph

Although sentence combining practice has been shown to be an effective instructional technique for improving students' writing, scant attention has been paid to the appropriate sequence for such instruction. Studies of the natural development of oral and written language point out two general trends that should be considered in sequencing sentence…
Improving ESL Writing Using an Online Formulaic Sequence Word-Combination Checker

ERIC Educational Resources Information Center

Grami, G. M. A.; Alkazemi, B. Y.

2016-01-01

Writing correct English sentences can be challenging. Furthermore, writing correct formulaic sequences can be especially difficult because accepted combinations do not follow clear rules governing which words appear together in a sequence. One solution is to provide examples of correct usage accompanied by statistical feedback from web-based…
Effectiveness of sodium azide alone compared to sodium azide in combination with methyl nitrosourea for rice mutagenesis

USDA-ARS's Scientific Manuscript database

Rice seeds of the temperate japonica cultivar Kitaake were mutagenized with sodium azide alone and in combination with methyl nitrosourea. Using the reduced representation sequencing method Restriction Enzyme Sequence Comparative Analysis (RESCAN), the mutation densities, types and local sequence co...
A New Way to Introduce Microarray Technology in a Lecture/Laboratory Setting by Studying the Evolution of This Modern Technology

ERIC Educational Resources Information Center

Rowland-Goldsmith, Melissa

2009-01-01

DNA microarray is an ordered grid containing known sequences of DNA, which represent many of the genes in a particular organism. Each DNA sequence is unique to a specific gene. This technology enables the researcher to screen many genes from cells or tissue grown in different conditions. We developed an undergraduate lecture and laboratory…
Next-Generation Sequencing in Oncology: Genetic Diagnosis, Risk Prediction and Cancer Classification

PubMed Central

Kamps, Rick; Brandão, Rita D.; van den Bosch, Bianca J.; Paulussen, Aimee D. C.; Xanthoulea, Sofia; Blok, Marinus J.; Romano, Andrea

2017-01-01

Next-generation sequencing (NGS) technology has expanded in the last decades with significant improvements in the reliability, sequencing chemistry, pipeline analyses, data interpretation and costs. Such advances make the use of NGS feasible in clinical practice today. This review describes the recent technological developments in NGS applied to the field of oncology. A number of clinical applications are reviewed, i.e., mutation detection in inherited cancer syndromes based on DNA-sequencing, detection of spliceogenic variants based on RNA-sequencing, DNA-sequencing to identify risk modifiers and application for pre-implantation genetic diagnosis, cancer somatic mutation analysis, pharmacogenetics and liquid biopsy. Concluding remarks, clinical limitations, implications and ethical considerations that relate to the different applications are provided. PMID:28146134

Single molecule real-time (SMRT) sequencing comes of age: applications and utilities for medical diagnostics

PubMed Central

Ardui, Simon; Ameur, Adam; Vermeesch, Joris R; Hestand, Matthew S

2018-01-01

Abstract Short read massive parallel sequencing has emerged as a standard diagnostic tool in the medical setting. However, short read technologies have inherent limitations such as GC bias, difficulties mapping to repetitive elements, trouble discriminating paralogous sequences, and difficulties in phasing alleles. Long read single molecule sequencers resolve these obstacles. Moreover, they offer higher consensus accuracies and can detect epigenetic modifications from native DNA. The first commercially available long read single molecule platform was the RS system based on PacBio's single molecule real-time (SMRT) sequencing technology, which has since evolved into their RSII and Sequel systems. Here we capsulize how SMRT sequencing is revolutionizing constitutional, reproductive, cancer, microbial and viral genetic testing. PMID:29401301
Molecular characterization and combined genotype association study of bovine cluster of differentiation 14 gene with clinical mastitis in crossbred dairy cattle

PubMed Central

Selvan, A. Sakthivel; Gupta, I. D.; Verma, A.; Chaudhari, M. V.; Magotra, A.

2016-01-01

Aim: The present study was undertaken to characterize the cluster of differentiation 14 (CD14) gene and to analyze its combined genotypes in order to explore its association with clinical mastitis in Karan Fries (KF) cows maintained in the National Dairy Research Institute herd, Karnal. Materials and Methods: Genomic DNA was extracted from the blood of 94 randomly selected lactating KF cattle by the phenol-chloroform method. After checking its quality and quantity, polymerase chain reaction (PCR) was carried out using six sets of reported gene-specific primers to amplify the complete KF CD14 gene. The forward and reverse sequences for each PCR fragment were assembled to form the complete sequence for the respective region of the KF CD14 gene. Multiple sequence alignments of the edited sequence with the reported Bos taurus reference sequence (EU148610.1) were performed with ClustalW software to identify single nucleotide polymorphisms (SNPs). Basic Local Alignment Search Tool analysis was performed to compare the sequence identity of the KF CD14 gene with other species. Restriction fragment length polymorphism (RFLP) analysis was carried out in all KF cows using the Helicobacter pylori 188I (Hpy188I) (contig 2) and Haemophilus influenzae I (HinfI) (contig 4) restriction enzymes (RE). Cows were assigned genotypes obtained by PCR-RFLP analysis, and the association study was done using the Chi-square (χ2) test. The genotypes of contigs (loci) 2 and 4 were combined for each animal to construct combined genotype patterns. Results: Two types of KF sequences were obtained: one of 2630 bp having one insertion at nucleotide (nt) position 616 and one deletion at nt position 1117, and the other of 2629 bp having only one deletion at nt position 615. ClustalW multiple alignments of the KF CD14 gene sequence with the B. taurus cattle sequence (EU148610.1) revealed 24 nt changes (SNPs).
Cows were also screened using PCR-RFLP with Hpy188I (contig 2) and HinfI (contig 4) RE, which revealed three genotypes each that differed significantly regarding mastitis incidence. The two loci allow a maximum of nine combined genotype patterns, of which only eight were observed: AACC, AACD, AADD, ABCD, ABDD, BBCC, BBCD, and BBDD. The combined genotype ABCC was not observed in the studied population of KF cows. Out of 94 animals, those with the AACD combined genotype (10.63%) were found not to be affected by mastitis, and those with the ABDD combined genotype had the highest mastitis incidence, 15.96%. Conclusion: AACD typed cows were found to be least susceptible to mastitis incidence as compared to other combined genotypes. PMID:27536026
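The combined-genotype count in this abstract can be reproduced with a short enumeration. This is a sketch under stated assumptions: the abstract does not list the per-locus genotype labels, so the labels AA/AB/BB for contig 2 and CC/CD/DD for contig 4 are inferred from the combined names, and the function name `combined_genotypes` is hypothetical.

```python
from itertools import product


def combined_genotypes(locus_a, locus_b):
    """All pairings of one genotype from each locus, joined into one label."""
    return [a + b for a, b in product(locus_a, locus_b)]


# Per-locus labels are an assumption inferred from the combined names.
contig2 = ["AA", "AB", "BB"]  # Hpy188I locus
contig4 = ["CC", "CD", "DD"]  # HinfI locus

# Combined genotypes reported as observed in the abstract.
observed = {"AACC", "AACD", "AADD", "ABCD", "ABDD", "BBCC", "BBCD", "BBDD"}

possible = combined_genotypes(contig2, contig4)
missing = [g for g in possible if g not in observed]

print(len(possible))  # 9
print(missing)        # ['ABCC']
```

Three genotypes at each of two loci give 3 x 3 = 9 possible patterns, and the only one absent from the reported list is ABCC, in agreement with the abstract.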
Molecular characterization and combined genotype association study of bovine cluster of differentiation 14 gene with clinical mastitis in crossbred dairy cattle.

PubMed

Selvan, A Sakthivel; Gupta, I D; Verma, A; Chaudhari, M V; Magotra, A

2016-07-01

The present study was undertaken to characterize the cluster of differentiation 14 (CD14) gene and to analyze its combined genotypes in order to explore their association with clinical mastitis in Karan Fries (KF) cows maintained in the National Dairy Research Institute herd, Karnal. Genomic DNA was extracted from the blood of 94 randomly selected lactating KF cattle by the phenol-chloroform method. After its quality and quantity were checked, polymerase chain reaction (PCR) was carried out using six sets of reported gene-specific primers to amplify the complete KF CD14 gene. The forward and reverse sequences for each PCR fragment were assembled to form the complete sequence for the respective region of the KF CD14 gene. Multiple sequence alignments of the edited sequence with the reported Bos taurus reference sequence (EU148610.1) were performed with ClustalW software to identify single nucleotide polymorphisms (SNPs). Basic Local Alignment Search Tool analysis was performed to compare the sequence identity of the KF CD14 gene with other species. Restriction fragment length polymorphism (RFLP) analysis was carried out in all KF cows using the Helicobacter pylori 188I (Hpy188I) (contig 2) and Haemophilus influenzae I (HinfI) (contig 4) restriction enzymes (RE). Cows were assigned genotypes obtained by PCR-RFLP analysis, and the association study was done using the Chi-square (χ²) test. The genotypes of contigs (loci) 2 and 4 were combined for each animal to construct combined genotype patterns. Two types of KF sequences were obtained: one of 2630 bp with one insertion at nucleotide (nt) position 616 and one deletion at nt position 1117, and another of 2629 bp with only one deletion at nt position 615. ClustalW multiple alignment of the KF CD14 gene sequence with the B. taurus sequence (EU148610.1) revealed 24 nt changes (SNPs).
Cows were also screened using PCR-RFLP with Hpy188I (contig 2) and HinfI (contig 4) RE, which revealed three genotypes each that differed significantly regarding mastitis incidence. The two loci can yield at most nine combined genotype patterns, of which only eight were observed: AACC, AACD, AADD, ABCD, ABDD, BBCC, BBCD, and BBDD. The combined genotype ABCC was not observed in the studied population of KF cows. Out of 94 animals, those with the AACD combined genotype (10.63%) were not affected by mastitis, and animals with the ABDD combined genotype had the highest mastitis incidence, 15.96%. AACD-typed cows were found to be the least susceptible to mastitis compared with the other combined genotypes.
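The count of nine possible combined genotypes can be checked with a short sketch. It assumes, as the combined names suggest but the abstract does not state outright, that the contig 2 genotypes are AA, AB, and BB and the contig 4 genotypes are CC, CD, and DD; the variable names are illustrative only.

```python
from itertools import product

# Assumed per-locus genotypes, inferred from the combined genotype names.
locus_contig2 = ["AA", "AB", "BB"]   # Hpy188I locus (assumed labels)
locus_contig4 = ["CC", "CD", "DD"]   # HinfI locus (assumed labels)

# Every pairing of one genotype per locus: 3 x 3 = 9 combined patterns.
possible = [a + b for a, b in product(locus_contig2, locus_contig4)]

# Combined genotypes reported as observed in the abstract.
observed = ["AACC", "AACD", "AADD", "ABCD", "ABDD", "BBCC", "BBCD", "BBDD"]

missing = [g for g in possible if g not in observed]
print(len(possible), missing)
```

Under these assumptions the only pattern absent from the observed list is ABCC, which agrees with the abstract.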
Virtual surgical planning for treatment of severe mandibular retrognathia with collapsed occlusion using contemporary surgical and prosthodontic protocols.

PubMed

Dhima, Matilda; Salinas, Thomas J; Rieck, Kevin L

2013-11-01

To meet functional and esthetic needs in an older adult for treatment of complex skeletal and dentoalveolar deformities using contemporary surgical and prosthodontic protocols. An older adult with dentoalveolar complex and skeletal deformity (mandibular retrognathia) was treated by a combination of virtual planning and current surgical and prosthodontic protocols. Treatment planning steps and sequencing are presented. Skeletal, soft tissue, and dental harmonies were attained without biological or mechanical complications. Definitive oral rehabilitation was completed with a maxillary complete denture and a mandibular metal ceramic fixed implant-retained prosthesis. A surgical and prosthodontic team approach in combination with technologic advances can predictably optimize esthetic and functional outcomes for patients with complex skeletal and dentoalveolar deformities. Copyright © 2013 American Association of Oral and Maxillofacial Surgeons. Published by Elsevier Inc. All rights reserved.
Combination of COFRADIC and high temperature-extended column length conventional liquid chromatography: a very efficient way to tackle complex protein samples, such as serum.

PubMed

Sandra, Koen; Verleysen, Katleen; Labeur, Christine; Vanneste, Lies; D'Hondt, Filip; Thomas, Grégoire; Kas, Koen; Gevaert, Kris; Vandekerckhove, Joël; Sandra, Pat

2007-03-01

The previously reported COmbined FRActional DIagonal Chromatography (COFRADIC) methodology, in which a subset of peptides representative of their parent proteins is sorted, is particularly powerful for whole proteome analysis. This peptide-centric technology is built around diagonal chromatography, where peptide separations are crucial. This paper presents high efficiency peptide separations, in which four 250 x 2.1 mm, 5 µm Zorbax 300SB-C18 columns (total length 1 m) were coupled at an operating temperature of 60 °C using a dedicated LC oven and conventional LC equipment. The high efficiency separations were combined with the COFRADIC procedure. This extremely powerful combination resulted, for the analysis of serum, in an increase in the uniquely identified peptide sequences by a factor of 2.6, compared to the COFRADIC procedure on a 25 cm column. This is a reflection of the increased peak capacity obtained on the 1 m column, which was calculated to be a factor 2.7 higher than on the 25 cm column. Besides more efficient sorting, less ion suppression was noticed.
Identification and correction of systematic error in high-throughput sequence data

PubMed Central

2011-01-01

Background A feature common to all DNA sequencing technologies is the presence of base-call errors in the sequenced reads. The implications of such errors are application specific, ranging from minor informatics nuisances to major problems affecting biological inferences. Recently developed "next-gen" sequencing technologies have greatly reduced the cost of sequencing, but have been shown to be more error prone than previous technologies. Both position specific (depending on the location in the read) and sequence specific (depending on the sequence in the read) errors have been identified in Illumina and Life Technology sequencing platforms. We describe a new type of systematic error that manifests as statistically unlikely accumulations of errors at specific genome (or transcriptome) locations. Results We characterize and describe systematic errors using overlapping paired reads from high-coverage data. We show that such errors occur in approximately 1 in 1000 base pairs, and that they are highly replicable across experiments. We identify motifs that are frequent at systematic error sites, and describe a classifier that distinguishes heterozygous sites from systematic error. Our classifier is designed to accommodate data from experiments in which the allele frequencies at heterozygous sites are not necessarily 0.5 (such as in the case of RNA-Seq), and can be used with single-end datasets. Conclusions Systematic errors can easily be mistaken for heterozygous sites in individuals, or for SNPs in population analyses. Systematic errors are particularly problematic in low coverage experiments, or in estimates of allele-specific expression from RNA-Seq data. Our characterization of systematic error has allowed us to develop a program, called SysCall, for identifying and correcting such errors. We conclude that correction of systematic errors is important to consider in the design and interpretation of high-throughput sequencing experiments. PMID:22099972
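What "statistically unlikely accumulations of errors" at a site could mean can be illustrated with a binomial tail probability. This is only a sketch under stated assumptions: it is not the SysCall classifier, the background base-call error rate of 0.01 is a hypothetical value, and the function name is invented for illustration. A small tail probability flags a non-random accumulation of mismatches; it does not by itself separate a heterozygous site from a systematic error.

```python
from math import comb

def binomial_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p).

    n: number of reads covering the site
    k: number of reads showing a mismatch at the site
    p: assumed independent per-read base-call error rate (hypothetical)
    """
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical site: 100 reads, 10 mismatches, assumed error rate 0.01.
print(binomial_tail(100, 10, 0.01))
```

If mismatches were independent random errors at the assumed rate, ten or more in 100 reads would be very improbable, so such a site would merit a closer look.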
Analysis of Illumina Microbial Assemblies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Clum, Alicia; Foster, Brian; Froula, Jeff

2010-05-28

Since the emergence of second generation sequencing technologies, the evaluation of different sequencing approaches and their assembly strategies for different types of genomes has become an important undertaking. Next generation sequencing technologies dramatically increase sequence throughput while decreasing cost, making them an attractive tool for whole genome shotgun sequencing. To compare different approaches for de-novo whole genome assembly, appropriate tools and a solid understanding of both quantity and quality of the underlying sequence data are crucial. Here, we performed an in-depth analysis of short-read Illumina sequence assembly strategies for bacterial and archaeal genomes. Different types of Illumina libraries as well as different trim parameters and assemblers were evaluated. Results of the comparative analysis and sequencing platforms will be presented. The goal of this analysis is to develop a cost-effective approach for the increased throughput of the generation of high quality microbial genomes.
Coal-oil coprocessing at HTI - development and improvement of the technology

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stalzer, R.H.; Lee, L.K.; Hu, J.

1995-12-31

Co-Processing refers to the combined processing of coal and petroleum-derived heavy oil feedstocks. The coal feedstocks used are those typically utilized in direct coal liquefaction: bituminous, subbituminous, and lignites. The petroleum-derived oil is typically a petroleum residuum containing at least 70 W% material boiling above 525 °C. The combined coal and oil feedstocks are processed simultaneously with the dual objective of liquefying the coal and upgrading the petroleum-derived residuum to lower boiling (<525 °C) premium products. HTI's investigation of the Co-Processing technology has included work performed in laboratory, bench and PDU scale operations. The concept of co-processing technology is quite simple and a natural outgrowth of the work done with direct coal liquefaction. A 36 month program to evaluate new process concepts in coal-oil coprocessing at the bench-scale was begun in September 1994 and runs until September 1997. Included in this continuous bench-scale program are provisions to examine new improvements in areas such as: interstage product separation, feedstock concentrations (coal/oil), improved supported/dispersed catalysts, optimization of reactor temperature sequencing, and in-line hydrotreating. This does not preclude other ideas from DOE contracts and other sources that can lead to improved product quality and economics. This research work has led to important findings which significantly increased liquid yields, improved product quality, and improved process economics.
Comparative analysis of RNAi screening technologies at genome-scale reveals an inherent processing inefficiency of the plasmid-based shRNA hairpin.

PubMed

Bhinder, Bhavneet; Shum, David; Djaballah, Hakim

2014-02-01

RNAi screening in combination with the genome-sequencing projects would constitute the Holy Grail of modern genetics, enabling discovery and validation towards a better understanding of fundamental biology leading to novel targets to combat disease. Hit discordance at the inter-screen level, together with the lack of reproducibility, is emerging as the technology's main pitfall. To examine some of the underlying factors leading to such discrepancies, we reasoned that perhaps there is an inherent difference in knockdown efficiency of the various RNAi technologies. For this purpose, we utilized the two most popular ones, chemically synthesized siRNA duplex and plasmid-based shRNA hairpin, in order to perform a head to head comparison. Using a previously developed gain-of-function assay probing modulators of the miRNA biogenesis pathway, we first executed an siRNA screen against the Silencer Select V4.0 library (AMB) nominating 1,273 gene candidates, followed by an shRNA screen against the TRC1 library (TRC1) nominating 497 gene candidates. We observed a poor overlap of only 29 hits given that there are 15,068 overlapping genes between the two libraries, with DROSHA as the only common hit out of the seven known core miRNA biogenesis genes. Distinct genes interacting with the same biogenesis regulators were observed in both screens, with a dismal cross-network overlap of only 3 genes (DROSHA, TGFBR1, and DIS3). Taken together, our study demonstrates differential knockdown activities between the two technologies, possibly due to the inefficient intracellular processing and potential cell-type specificity determinants in generating intended targeting sequences for the plasmid-based shRNA hairpins, and suggests this observed inefficiency as a potential culprit in addressing the lack of reproducibility.
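For scale, the reported overlap can be set against a chance baseline. The sketch below computes the overlap expected if the two hit lists were drawn independently and at random from the 15,068 shared genes. It assumes that every nominated hit lies within the shared gene set, which the abstract does not state, so the result is a rough reference point rather than a reanalysis.

```python
# Figures taken from the abstract.
shared_genes = 15068      # genes present in both libraries
sirna_hits = 1273         # candidates nominated by the siRNA screen
shrna_hits = 497          # candidates nominated by the shRNA screen
observed_overlap = 29     # hits common to both screens

# Assumption (not from the abstract): all hits fall inside the shared set
# and the two screens nominate genes independently at random.
expected_overlap = sirna_hits * shrna_hits / shared_genes

print(f"expected by chance: {expected_overlap:.2f}, observed: {observed_overlap}")
```

Under these assumptions the chance expectation comes to roughly 42 genes, so the observed 29 is no larger than a random pairing would give, consistent with the abstract's description of the overlap as poor.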
Methods and statistics for combining motif match scores.

PubMed

Bailey, T L; Gribskov, M

1998-01-01

Position-specific scoring matrices are useful for representing and searching for protein sequence motifs. A sequence family can often be described by a group of one or more motifs, and an effective search must combine the scores for matching a sequence to each of the motifs in the group. We describe three methods for combining match scores and estimating the statistical significance of the combined scores and evaluate the search quality (classification accuracy) and the accuracy of the estimate of statistical significance of each. The three methods are: 1) sum of scores, 2) sum of reduced variates, 3) product of score p-values. We show that method 3) is superior to the other two methods in both regards, and that combining motif scores indeed gives better search accuracy. The MAST sequence homology search algorithm utilizing the product of p-values scoring method is available for interactive use and downloading at URL http://www.sdsc.edu/MEME.
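Method 3, the product of score p-values, can be sketched with the standard result for the product of n independent p-values that are uniform on (0, 1) under the null hypothesis: the probability of a product at least as small as x is x times the sum of (-ln x)^i / i! for i from 0 to n-1. Whether MAST uses exactly this expression is not confirmed by the abstract, and the function name below is illustrative.

```python
from math import factorial, log, prod

def combined_pvalue(pvalues):
    """Significance of the product of independent, uniform p-values.

    Returns P(product <= x), where x is the observed product of the n
    per-motif p-values, assuming independence and continuous scores.
    """
    n = len(pvalues)
    x = prod(pvalues)
    if x == 0.0:
        return 0.0
    neg_log = -log(x)
    return x * sum(neg_log**i / factorial(i) for i in range(n))

# Two hypothetical motif matches, each with p = 0.1.
print(combined_pvalue([0.1, 0.1]))
```

The raw product of two p-values of 0.1 is 0.01, but the combined significance is larger than that, because a small product is easier to reach by chance when several motifs are multiplied together.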
Genome Sequencing Technologies and Nursing: What Are the Roles of Nurses and Nurse Scientists?

PubMed

Taylor, Jacquelyn Y; Wright, Michelle L; Hickey, Kathleen T; Housman, David E

Advances in DNA sequencing technology have resulted in an abundance of personalized data with challenging clinical utility and meaning for clinicians. This wealth of data has potential to dramatically impact the quality of healthcare. Nurses are at the focal point in educating patients regarding relevant healthcare needs; therefore, an understanding of sequencing technology and utilizing these data are critical. The objective of this study was to explicate the role of nurses and nurse scientists as integral members of healthcare teams in improving understanding of DNA sequencing data and translational genomics for patients. A history of the nurse role in newborn screening is used as an exemplar. This study serves as an exemplar on how genome sequencing has been utilized in nursing science and incorporates linkages of other omics approaches used by nurses that are included in this special issue. This special issue showcased nurse scientists conducting multi-omic research from various methods, including targeted candidate genes, pharmacogenomics, proteomics, epigenomics, and the microbiome. From this vantage point, we provide an overview of the roles of nurse scientists in genome sequencing research and provide recommendations for the best utilization of nurses and nurse scientists related to genome sequencing.
Meeting the challenges of non-referenced genome assembly from short-read sequence data

Treesearch

M. Parks; A. Liston; R. Cronn

2010-01-01

Massively parallel sequencing technologies (MPST) offer unprecedented opportunities for novel sequencing projects. MPST, while offering tremendous sequencing capacity, are typically most effective in resequencing projects (as opposed to the sequencing of novel genomes) due to the fact that sequence is returned in relatively short reads. Nonetheless, there is great...
A hybrid systems strategy for automated spacecraft tour design and optimization

NASA Astrophysics Data System (ADS)

Stuart, Jeffrey R.

As the number of operational spacecraft increases, autonomous operation is rapidly evolving into a critical necessity. Additionally, the capability to rapidly generate baseline trajectories greatly expands the range of options available to analysts as they explore the design space to meet mission demands. Thus, a general strategy is developed, one that is suitable for the construction of flight plans for both Earth-based and interplanetary spacecraft that encounter multiple objects, where these multiple encounters comprise a "tour". The proposed scheme is flexible in implementation and can readily be adjusted to a variety of mission architectures. Heuristic algorithms that autonomously generate baseline tour trajectories and, when appropriate, adjust reference solutions in the presence of rapidly changing environments are investigated. Furthermore, relative priorities for ranking the targets are explicitly accommodated during the construction of potential tour sequences. As a consequence, a priori, as well as newly acquired, knowledge concerning the target objects enhances the potential value of the ultimate encounter sequences. A variety of transfer options are incorporated, from rendezvous arcs enabled by low-thrust engines to more conventional impulsive orbit adjustments via chemical propulsion technologies. When advantageous, trajectories are optimized in terms of propellant consumption via a combination of indirect and direct methods; such a combination of available technologies is an example of hybrid optimization. Additionally, elements of hybrid systems theory, i.e., the blending of dynamical states, some discrete and some continuous, are integrated into the high-level tour generation scheme. For a preliminary investigation, this strategy is applied to mission design scenarios for a Sun-Jupiter Trojan asteroid tour as well as orbital debris removal for near-Earth applications.
The impact of next-generation sequencing on genomics

PubMed Central

Zhang, Jun; Chiodini, Rod; Badr, Ahmed; Zhang, Genfa

2011-01-01

This article reviews basic concepts, general applications, and the potential impact of next-generation sequencing (NGS) technologies on genomics, with particular reference to currently available and possible future platforms and bioinformatics. NGS technologies have demonstrated the capacity to sequence DNA at unprecedented speed, thereby enabling previously unimaginable scientific achievements and novel biological applications. But, the massive data produced by NGS also presents a significant challenge for data storage, analyses, and management solutions. Advanced bioinformatic tools are essential for the successful application of NGS technology. As evidenced throughout this review, NGS technologies will have a striking impact on genomic research and the entire biological field. With its ability to tackle the unsolved challenges unconquered by previous genomic technologies, NGS is likely to unravel the complexity of the human genome in terms of genetic variations, some of which may be confined to susceptible loci for some common human conditions. The impact of NGS technologies on genomics will be far reaching and likely change the field for years to come. PMID:21477781
Mining and Development of Novel SSR Markers Using Next Generation Sequencing (NGS) Data in Plants.

PubMed

Taheri, Sima; Lee Abdullah, Thohirah; Yusop, Mohd Rafii; Hanafi, Mohamed Musa; Sahebi, Mahbod; Azizi, Parisa; Shamshiri, Redmond Ramin

2018-02-13

Microsatellites, or simple sequence repeats (SSRs), are one of the most informative and multi-purpose genetic markers exploited in plant functional genomics. However, the discovery of SSRs and development using traditional methods are laborious, time-consuming, and costly. Recently, the availability of high-throughput sequencing technologies has enabled researchers to identify a substantial number of microsatellites at less cost and effort than traditional approaches. Illumina is a noteworthy transcriptome sequencing technology that is currently used in SSR marker development. Although 454 pyrosequencing datasets can be used for SSR development, this type of sequencing is no longer supported. This review aims to present an overview of the next generation sequencing, with a focus on the efficient use of de novo transcriptome sequencing (RNA-Seq) and related tools for mining and development of microsatellites in plants.
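Computational SSR mining of the kind this review covers can be illustrated with a minimal regular-expression scan. It is a sketch only: the function name and thresholds (units of 2 to 6 nt, at least four tandem copies) are arbitrary assumptions, and it is not the method of any tool named in the review.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    Limitation of this simple sketch: a homopolymer run such as AAAAAAAA
    is reported with the unit "AA" because units shorter than min_unit
    are not considered.
    """
    # Lazy unit match, so the shortest repeating unit is preferred.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

# Hypothetical read containing an (AG)5 microsatellite.
print(find_ssrs("TTGACAGAGAGAGAGTCC"))
```

Dedicated SSR mining tools add handling for imperfect and compound repeats and for primer design, which this sketch leaves out.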
Advanced Applications of Next-Generation Sequencing Technologies to Orchid Biology.

PubMed

Yeh, Chuan-Ming; Liu, Zhong-Jian; Tsai, Wen-Chieh

2018-01-01

Next-generation sequencing technologies are revolutionizing biology by permitting transcriptome sequencing, whole-genome sequencing and resequencing, and genome-wide single nucleotide polymorphism profiling. Orchid research has benefited from this breakthrough, and a few orchid genomes are now available; new biological questions can be approached and new breeding strategies can be designed. The first part of this review describes the unique features of orchid biology. The second part provides an overview of the current next-generation sequencing platforms, many of which are already used in plant laboratories. The third part summarizes the state of orchid transcriptome and genome sequencing and illustrates current achievements. The genetic sequences currently obtained will not only provide a broad scope for the study of orchid biology, but also serve as a starting point for uncovering the mystery of orchid evolution.
The genome sequence of ectromelia virus Naval and Cornell isolates from outbreaks in North America.

PubMed

Mavian, Carla; López-Bueno, Alberto; Bryant, Neil A; Seeger, Kathy; Quail, Michael A; Harris, David; Barrell, Bart; Alcami, Antonio

2014-08-01

Ectromelia virus (ECTV) is the causative agent of mousepox, a disease of laboratory mouse colonies and an excellent model for human smallpox. We report the genome sequence of two isolates from outbreaks in laboratory mouse colonies in the USA in 1995 and 1999: ECTV-Naval and ECTV-Cornell, respectively. The genomes of ECTV-Naval and ECTV-Cornell were sequenced by the 454-Roche technology. The ECTV-Naval genome was also sequenced by the Sanger and Illumina technologies in order to evaluate these technologies for poxvirus genome sequencing. Genomic comparisons revealed that ECTV-Naval and ECTV-Cornell correspond to the same virus isolated from independent outbreaks. Both ECTV-Naval and ECTV-Cornell are extremely virulent in susceptible BALB/c mice, similar to ECTV-Moscow. This is consistent with the ECTV-Naval genome sharing 98.2% DNA sequence identity with that of ECTV-Moscow, and indicates that the genetic differences with ECTV-Moscow do not affect the virulence of ECTV-Naval in the mousepox model of footpad infection. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
Genome Wide Characterization of Simple Sequence Repeats in Cucumber

USDA-ARS's Scientific Manuscript database

The whole genome of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...
Long Reads: their Purpose and Place.

PubMed

Pollard, Martin O; Gurdasani, Deepti; Mentzer, Alexander J; Porter, Tarryn; Sandhu, Manjinder S

2018-05-14

In recent years long read technologies have moved from being a niche and specialist field to a point of relative maturity likely to feature frequently in the genomic landscape. Analogous to next generation sequencing (NGS), the cost of sequencing using long read technologies has materially dropped whilst the instrument throughput continues to increase. Together these changes present the prospect of sequencing large numbers of individuals with the aim of fully characterising genomes at high resolution. In this article, we will endeavour to present an introduction to long read technologies showing: what long reads are; how they are distinct from short reads; why long reads are useful; and how they are being used. We will highlight the recent developments in this field, and the applications and potential of these technologies in medical research, and clinical diagnostics and therapeutics.
Biofilm-Growing Bacteria Involved in the Corrosion of Concrete Wastewater Pipes: Protocols for Comparative Metagenomic Analyses

EPA Science Inventory

Advances in high-throughput next-generation sequencing (NGS) technology for direct sequencing of environmental DNA (i.e. shotgun metagenomics) are transforming the field of microbiology. NGS technologies are now regularly being applied in comparative metagenomic studies, which pr...

Genome assembly reborn: recent computational challenges

PubMed Central

2009-01-01

Research into genome assembly algorithms has experienced a resurgence due to new challenges created by the development of next generation sequencing technologies. Several genome assemblers have been published in recent years specifically targeted at the new sequence data; however, the ever-changing technological landscape leads to the need for continued research. In addition, the low cost of next generation sequencing data has led to an increased use of sequencing in new settings. For example, the new field of metagenomics relies on large-scale sequencing of entire microbial communities instead of isolate genomes, leading to new computational challenges. In this article, we outline the major algorithmic approaches for genome assembly and describe recent developments in this domain. PMID:19482960
Accurate multiplex polony sequencing of an evolved bacterial genome.

PubMed

Shendure, Jay; Porreca, Gregory J; Reppas, Nikos B; Lin, Xiaoxia; McCutcheon, John P; Rosenbaum, Abraham M; Wang, Michael D; Zhang, Kun; Mitra, Robi D; Church, George M

2005-09-09

We describe a DNA sequencing technology in which a commonly available, inexpensive epifluorescence microscope is converted to rapid nonelectrophoretic DNA sequencing automation. We apply this technology to resequence an evolved strain of Escherichia coli at less than one error per million consensus bases. A cell-free, mate-paired library provided single DNA molecules that were amplified in parallel to 1-micrometer beads by emulsion polymerase chain reaction. Millions of beads were immobilized in a polyacrylamide gel and subjected to automated cycles of sequencing by ligation and four-color imaging. Cost per base was roughly one-ninth as much as that of conventional sequencing. Our protocols were implemented with off-the-shelf instrumentation and reagents.
P7-S Combining Workflow-Based Project Organization with Protein-Dependant Data Retrieval for the Retrieval of Extensive Proteome Information

PubMed Central

Glandorf, J.; Thiele, H.; Macht, M.; Vorm, O.; Podtelejnikov, A.

2007-01-01

In the course of a full-scale proteomics experiment, the handling of the data as well as the retrieval of the relevant information from the results is a major challenge due to the massive amount of generated data (gel images, chromatograms, and spectra) as well as associated result information (sequences, literature, etc.). To obtain meaningful information from these data, one has to filter the results in an easy way. Possibilities to do so can be based on GO terms or structural features such as transmembrane domains, involvement in certain pathways, etc. In this presentation we will show how a combination of a software package with a workflow-based result organization (Bruker ProteinScape) and a protein-centered data-mining software (Proxeon ProteinCenter) can assist in the comparison of the results from large projects, such as comparison of cross-platform results from 2D PAGE/MS with shotgun LC-ESI-MS/MS. We will present differences between different technologies and show how these differences can be easily identified and how they allow us to draw conclusions on the involved technologies.
Combining Comprehensive Analysis of Off-Site Lambda Phage Integration with a CRISPR-Based Means of Characterizing Downstream Physiology.

PubMed

Tanouchi, Yu; Covert, Markus W

2017-09-19

During its lysogenic life cycle, the phage genome is integrated into the host chromosome by site-specific recombination. In this report, we analyze lambda phage integration into noncanonical sites using next-generation sequencing and show that it generates significant genetic diversity by targeting over 300 unique sites in the host Escherichia coli genome. Moreover, these integration events can have important phenotypic consequences for the host, including changes in cell motility and increased antibiotic resistance. Importantly, the new technologies that we developed to enable this study (sequencing secondary sites using next-generation sequencing and then selecting relevant lysogens using clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9-based selection) are broadly applicable to other phage-bacterium systems. IMPORTANCE Bacteriophages play an important role in bacterial evolution through lysogeny, where the phage genome is integrated into the host chromosome. While phage integration generally occurs at a specific site in the host chromosome, it is also known to occur at other, so-called secondary sites. In this study, we developed a new experimental technology to comprehensively study secondary integration sites and discovered that phage can integrate into over 300 unique sites in the host genome, resulting in significant genetic diversity in bacteria. We further developed an assay to examine the phenotypic consequence of such diverse integration events and found that phage integration can cause changes in evolutionarily relevant traits such as bacterial motility and increases in antibiotic resistance. Importantly, our method is readily applicable to other phage-bacterium systems. Copyright © 2017 Tanouchi and Covert.
Application of population sequencing (POPSEQ) for ordering and inputting genotyping-by-sequencing markers in hexaploid wheat

USDA-ARS's Scientific Manuscript database

The advancement of next-generation sequencing technologies in conjunction with new bioinformatics tools enabled fine-tuning of sequence-based high resolution mapping strategies for complex genomes. Although genotyping-by-sequencing (GBS) provides a large number of markers, its application for assoc...
A Glance at Microsatellite Motifs from 454 Sequencing Reads of Watermelon Genomic DNA

USDA-ARS's Scientific Manuscript database

A single 454 (Life Sciences Sequencing Technology) run of Charleston Gray watermelon (Citrullus lanatus var. lanatus) genomic DNA was performed and sequence data were assembled. A large scale identification of simple sequence repeat (SSR) was performed and SSR sequence data were used for the develo...
Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo) genome assembly and analysis

USDA-ARS's Scientific Manuscript database

Next-generation sequencing technologies were used to rapidly and efficiently sequence the genome of the domestic turkey (Meleagris gallopavo). The current genome assembly (~1.1 Gb) includes 917 Mb of sequence assigned to chromosomes. Innate heterozygosity of the sequenced bird allowed discovery of...
Sequencing of adenine in DNA by scanning tunneling microscopy

NASA Astrophysics Data System (ADS)

Tanaka, Hiroyuki; Taniguchi, Masateru

2017-08-01

The development of DNA sequencing technology utilizing the detection of a tunnel current is important for next-generation sequencer technologies based on single-molecule analysis technology. Using a scanning tunneling microscope, we previously reported that dI/dV measurements and dI/dV mapping revealed that the guanine base (purine base) of DNA adsorbed onto the Cu(111) surface has a characteristic peak at Vs = -1.6 V. If, in addition to guanine, the other purine base of DNA, namely, adenine, can be distinguished, then by reading all the purine bases of each single strand of a DNA double helix, the entire base sequence of the original double helix can be determined due to the complementarity of the DNA base pair. Therefore, the ability to read adenine is important from the viewpoint of sequencing. Here, we report on the identification of adenine by STM topographic and spectroscopic measurements using a synthetic DNA oligomer and viral DNA.
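The complementarity argument can be made concrete with a small sketch. It assumes idealized, error-free purine reads from both strands, already aligned position by position to the top strand; the function names and the example sequence are hypothetical and are not taken from the paper.

```python
# A purine read on the bottom strand implies its partner on the top strand.
PARTNER_OF_PURINE = {"A": "T", "G": "C"}

def purine_mask(strand):
    """Keep purines (A, G) and hide pyrimidines, as an ideal purine-only read."""
    return "".join(base if base in "AG" else "-" for base in strand)

def reconstruct_top_strand(top_purines, bottom_purines):
    """Rebuild the top strand from purine-only reads of both strands."""
    sequence = []
    for top, bottom in zip(top_purines, bottom_purines):
        if top in "AG":
            sequence.append(top)                        # purine read directly
        elif bottom in "AG":
            sequence.append(PARTNER_OF_PURINE[bottom])  # inferred pyrimidine
        else:
            raise ValueError("no purine read at this base pair")
    return "".join(sequence)

top = "ATGCCGTA"       # hypothetical top strand
bottom = "TACGGCAT"    # its partner strand, listed in pairing order
print(reconstruct_top_strand(purine_mask(top), purine_mask(bottom)))
```

Every base pair contains exactly one purine, so each position is covered by a read from one strand or the other. That is why distinguishing adenine in addition to guanine would, in principle, suffice to recover the whole sequence.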
Deep Sequencing in Infectious Diseases: Immune and Pathogen Repertoires for the Improvement of Patient Outcomes.

PubMed

Burkholder, William F; Newell, Evan W; Poidinger, Michael; Chen, Swaine; Fink, Katja

2017-01-01

The inaugural workshop "Deep Sequencing in Infectious Diseases: Immune and Pathogen Repertoires for the Improvement of Patient Outcomes" was held in Singapore on 13-14 October 2016. The aim of the workshop was to discuss the latest trends in using high-throughput sequencing, bioinformatics, and allied technologies to analyze immune and pathogen repertoires and their interplay within the host, bringing together key international players in the field and Singapore-based researchers and clinician-scientists. The focus was in particular on the application of these technologies for the improvement of patient diagnosis, prognosis and treatment, and for other broad public health outcomes. The presentations by scientists and clinicians showed the potential of deep sequencing technology to capture the coevolution of adaptive immunity and pathogens. For clinical applications, some key challenges remain, such as the long turnaround time and relatively high cost of deep sequencing for pathogen identification and characterization and the lack of international standardization in immune repertoire analysis.
Deep Sequencing in Infectious Diseases: Immune and Pathogen Repertoires for the Improvement of Patient Outcomes

PubMed Central

Burkholder, William F.; Newell, Evan W.; Poidinger, Michael; Chen, Swaine; Fink, Katja

2017-01-01

The inaugural workshop “Deep Sequencing in Infectious Diseases: Immune and Pathogen Repertoires for the Improvement of Patient Outcomes” was held in Singapore on 13–14 October 2016. The aim of the workshop was to discuss the latest trends in using high-throughput sequencing, bioinformatics, and allied technologies to analyze immune and pathogen repertoires and their interplay within the host, bringing together key international players in the field and Singapore-based researchers and clinician-scientists. The focus was in particular on the application of these technologies for the improvement of patient diagnosis, prognosis and treatment, and for other broad public health outcomes. The presentations by scientists and clinicians showed the potential of deep sequencing technology to capture the coevolution of adaptive immunity and pathogens. For clinical applications, some key challenges remain, such as the long turnaround time and relatively high cost of deep sequencing for pathogen identification and characterization and the lack of international standardization in immune repertoire analysis. PMID:28620372
The diploid genome sequence of an Asian individual

PubMed Central

Wang, Jun; Wang, Wei; Li, Ruiqiang; Li, Yingrui; Tian, Geng; Goodman, Laurie; Fan, Wei; Zhang, Junqing; Li, Jun; Zhang, Juanbin; Guo, Yiran; Feng, Binxiao; Li, Heng; Lu, Yao; Fang, Xiaodong; Liang, Huiqing; Du, Zhenglin; Li, Dong; Zhao, Yiqing; Hu, Yujie; Yang, Zhenzhen; Zheng, Hancheng; Hellmann, Ines; Inouye, Michael; Pool, John; Yi, Xin; Zhao, Jing; Duan, Jinjie; Zhou, Yan; Qin, Junjie; Ma, Lijia; Li, Guoqing; Yang, Zhentao; Zhang, Guojie; Yang, Bin; Yu, Chang; Liang, Fang; Li, Wenjie; Li, Shaochuan; Li, Dawei; Ni, Peixiang; Ruan, Jue; Li, Qibin; Zhu, Hongmei; Liu, Dongyuan; Lu, Zhike; Li, Ning; Guo, Guangwu; Zhang, Jianguo; Ye, Jia; Fang, Lin; Hao, Qin; Chen, Quan; Liang, Yu; Su, Yeyang; san, A.; Ping, Cuo; Yang, Shuang; Chen, Fang; Li, Li; Zhou, Ke; Zheng, Hongkun; Ren, Yuanyuan; Yang, Ling; Gao, Yang; Yang, Guohua; Li, Zhuo; Feng, Xiaoli; Kristiansen, Karsten; Wong, Gane Ka-Shu; Nielsen, Rasmus; Durbin, Richard; Bolund, Lars; Zhang, Xiuqing; Li, Songgang; Yang, Huanming; Wang, Jian

2009-01-01

Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics. PMID:18987735
Sequencing consolidates molecular markers with plant breeding practice.

PubMed

Yang, Huaan; Li, Chengdao; Lam, Hon-Ming; Clements, Jonathan; Yan, Guijun; Zhao, Shancen

2015-05-01

Many molecular markers have been developed with contemporary sequencing technologies, but few have been successfully applied in breeding; we therefore review how sequencing can facilitate marker-assisted selection in plant breeding. The growing global population and shrinking arable land area require efficient plant breeding. Novel marker-assisted strategies have proven effective for genetic gains. Cutting-edge sequencing technologies now deliver a deluge of genomes and genetic variations, highlighting the potential of marker development. However, a large gap still exists between the potential of molecular markers and actual plant breeding practices. In this review, we discuss marker-assisted breeding from a historical perspective, describe the road from crop sequencing to breeding, and highlight how sequencing facilitates the application of markers in breeding practice.
Arrays of probes for positional sequencing by hybridization

DOEpatents

Cantor, Charles R [Boston, MA]; Przetakiewicz, Marek [East Boston, MA]; Smith, Cassandra L [Boston, MA]; Sano, Takeshi [Waltham, MA]

2008-01-15

This invention is directed to methods and reagents useful for sequencing nucleic acid targets utilizing sequencing by hybridization technology comprising probes, arrays of probes and methods whereby sequence information is obtained rapidly and efficiently in discrete packages. That information can be used for the detection, identification, purification and complete or partial sequencing of a particular target nucleic acid. When coupled with a ligation step, these methods can be performed under a single set of hybridization conditions. The invention also relates to the replication of probe arrays and methods for making and replicating arrays of probes which are useful for the large scale manufacture of diagnostic aids used to screen biological samples for specific target sequences. Arrays created using PCR technology may comprise probes with 5'- and/or 3'-overhangs.
Contribution of Tryptophan Residues to the Combining Site of a Monoclonal Anti Dinitrophenyl Spin-Label Antibody

DTIC Science & Technology

1987-01-01

identified in the difference spectra, implying that there are five to seven tryptophans within 17 Å of the spin-label hapten. Amino acid sequences...of the heavy and light chains were obtained by a combination of amino acid and DNA sequencing. A molecular model was constructed from the sequence...Clore & acids yields detailed information about the amino acid com- Gronenborn, 1982, 1983). This technique should also identify position of the combining
From Conventional to Next Generation Sequencing of Epstein-Barr Virus Genomes.

PubMed

Kwok, Hin; Chiang, Alan Kwok Shing

2016-02-24

Genomic sequences of Epstein-Barr virus (EBV) have been of interest because the virus is associated with cancers, such as nasopharyngeal carcinoma, and conditions such as infectious mononucleosis. The progress of whole-genome EBV sequencing has been limited by the inefficiency and cost of first-generation sequencing technology. With the advancement of next-generation sequencing (NGS) and target enrichment strategies, an increasing number of EBV genomes have been published. These genomes were sequenced using different approaches, either with or without EBV DNA enrichment. This review provides an overview of the EBV genomes published to date, and a description of the sequencing technology and bioinformatic analyses employed in generating these sequences. We further explore ways in which the quality of sequencing data can be improved, such as using DNA oligos for capture hybridization, and longer insert sizes and read lengths in the sequencing runs. These advances will enable large-scale genomic sequencing of EBV, which will facilitate a better understanding of the genetic variations of EBV in different geographic regions and the discovery of potentially pathogenic variants in specific diseases.
The role of next generation sequencing for the development and testing of veterinary biologics

USDA-ARS's Scientific Manuscript database

Next generation sequencing technology has become widely available and it offers many new opportunities in vaccine technology. Both human and veterinary medicine have numerous examples of adventitious agents being found in live vaccines. In veterinary medicine a continuing trend is the use of viral ...
From prenatal genomic diagnosis to fetal personalized medicine: progress and challenges

PubMed Central

Bianchi, Diana W

2015-01-01

Thus far, the focus of personalized medicine has been the prevention and treatment of conditions that affect adults. Although advances in genetic technology have been applied more frequently to prenatal diagnosis than to fetal treatment, genetic and genomic information is beginning to influence pregnancy management. Recent developments in sequencing the fetal genome combined with progress in understanding fetal physiology using gene expression arrays indicate that we could have the technical capabilities to apply an individualized medicine approach to the fetus. Here I review recent advances in prenatal genetic diagnostics, the challenges associated with these new technologies and how the information derived from them can be used to advance fetal care. Historically, the goal of prenatal diagnosis has been to provide an informed choice to prospective parents. We are now at a point where that goal can and should be expanded to incorporate genetic, genomic and transcriptomic data to develop new approaches to fetal treatment. PMID:22772565
Affinity purification of bacterial outer membrane vesicles (OMVs) utilizing a His-tag mutant.

PubMed

Alves, Nathan J; Turner, Kendrick B; DiVito, Kyle A; Daniele, Michael A; Walper, Scott A

To facilitate the rapid purification of bacterial outer membrane vesicles (OMVs), we developed two plasmid constructs that utilize a truncated, transmembrane protein to present an exterior histidine repeat sequence. We chose OmpA, a highly abundant porin protein, as the protein scaffold and utilized the lac promoter to allow for inducible control of the epitope-presenting construct. OMVs containing mutant OmpA-His6 were purified directly from Escherichia coli culture media on an immobilized metal affinity chromatography (IMAC) Ni-NTA resin. This enabling technology can be combined with other molecular tools directed at OMV packaging to facilitate the separation of modified/cargo-loaded OMV from their wt counterparts. In addition to numerous applications in the pharmaceutical and environmental remediation industries, this technology can be utilized to enhance basic research capabilities in the area of elucidating endogenous OMV function. Published by Elsevier Masson SAS.
Whole genome DNA methylation: beyond genes silencing.

PubMed

Tirado-Magallanes, Roberto; Rebbani, Khadija; Lim, Ricky; Pradhan, Sriharsa; Benoukraf, Touati

2017-01-17

The combination of DNA bisulfite treatment with high-throughput sequencing technologies has enabled investigation of genome-wide DNA methylation at near base pair level resolution, far beyond that of the kilobase-long canonical CpG islands that initially revealed the biological relevance of this covalent DNA modification. The latest high-resolution studies have revealed a role for very punctual DNA methylation in chromatin plasticity, gene regulation and splicing. Here, we aim to outline the major biological consequences of DNA methylation recently discovered. We also discuss the necessity of tuning DNA methylation resolution into an adequate scale to ease the integration of the methylome information with other chromatin features and transcription events such as gene expression, nucleosome positioning, transcription factors binding dynamic, gene splicing and genomic imprinting. Finally, our review sheds light on DNA methylation heterogeneity in cell population and the different approaches used for its assessment, including the contribution of single cell DNA analysis technology.
Whole genome DNA methylation: beyond genes silencing

PubMed Central

Tirado-Magallanes, Roberto; Rebbani, Khadija; Lim, Ricky; Pradhan, Sriharsa; Benoukraf, Touati

2017-01-01

The combination of DNA bisulfite treatment with high-throughput sequencing technologies has enabled investigation of genome-wide DNA methylation at near base pair level resolution, far beyond that of the kilobase-long canonical CpG islands that initially revealed the biological relevance of this covalent DNA modification. The latest high-resolution studies have revealed a role for very punctual DNA methylation in chromatin plasticity, gene regulation and splicing. Here, we aim to outline the major biological consequences of DNA methylation recently discovered. We also discuss the necessity of tuning DNA methylation resolution into an adequate scale to ease the integration of the methylome information with other chromatin features and transcription events such as gene expression, nucleosome positioning, transcription factors binding dynamic, gene splicing and genomic imprinting. Finally, our review sheds light on DNA methylation heterogeneity in cell population and the different approaches used for its assessment, including the contribution of single cell DNA analysis technology. PMID:27895318

RVMAB: Using the Relevance Vector Machine Model Combined with Average Blocks to Predict the Interactions of Proteins from Protein Sequences.

PubMed

An, Ji-Yong; You, Zhu-Hong; Meng, Fan-Rong; Xu, Shu-Juan; Wang, Yin

2016-05-18

Protein-Protein Interactions (PPIs) play essential roles in most cellular processes. Knowledge of PPIs is becoming increasingly important, which has prompted the development of technologies capable of discovering PPIs on a large scale. Although many high-throughput biological technologies have been proposed to detect PPIs, they have unavoidable shortcomings, including cost, time intensity, and inherently high false positive and false negative rates. For these reasons, in silico methods are attracting much attention due to their good performance in predicting PPIs. In this paper, we propose a novel computational method known as RVM-AB that combines the Relevance Vector Machine (RVM) model and Average Blocks (AB) to predict PPIs from protein sequences. The main improvements result from representing protein sequences using the AB feature representation on a Position Specific Scoring Matrix (PSSM), reducing the influence of noise using Principal Component Analysis (PCA), and using an RVM-based classifier. We performed five-fold cross-validation experiments on yeast and Helicobacter pylori datasets, and achieved very high accuracies of 92.98% and 95.58% respectively, which is significantly better than previous works. In addition, we obtained good prediction accuracies of 88.31%, 89.46%, 91.08%, 91.55%, and 94.81% on five other independent datasets (C. elegans, M. musculus, H. sapiens, H. pylori, and E. coli) for cross-species prediction. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM) classifier on the yeast dataset. The experimental results demonstrate that our RVM-AB method is clearly better than the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can be an automatic decision support tool.
To facilitate extensive studies for future proteomics research, we developed a freely available web server called RVMAB-PPI in Hypertext Preprocessor (PHP) for predicting PPIs. The web server including source code and the datasets are available at http://219.219.62.123:8888/ppi_ab/.
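The Average Blocks representation mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the block count or the PSSM layout, so the 20 row-blocks and the rows-by-columns list-of-lists input used here are assumptions, and `average_blocks` is a hypothetical helper name rather than code from the RVMAB-PPI server. The PCA and RVM steps are not shown.

```python
def average_blocks(pssm, n_blocks=20):
    """Return a flat feature vector: the column means of each row-block.

    pssm is assumed to be a list of rows (one per residue), each row a list
    of scores (one per amino-acid column). n_blocks=20 is an assumption.
    """
    n_rows = len(pssm)
    if n_rows < n_blocks:
        raise ValueError("sequence is shorter than the number of blocks")
    n_cols = len(pssm[0])
    features = []
    for b in range(n_blocks):
        # Integer arithmetic splits the rows into near-equal consecutive blocks.
        start = b * n_rows // n_blocks
        end = (b + 1) * n_rows // n_blocks
        block = pssm[start:end]
        for c in range(n_cols):
            features.append(sum(row[c] for row in block) / len(block))
    return features


# Toy input: 4 residues x 2 columns, split into 2 blocks.
toy = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(average_blocks(toy, n_blocks=2))
```

Under these assumptions, a PSSM with 20 columns always yields a fixed-length vector of 400 values regardless of protein length, which is what lets proteins of different lengths feed the same classifier.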
3D Protein structure prediction with genetic tabu search algorithm

PubMed Central

2010-01-01

Background: Protein structure prediction (PSP) has important applications in different fields, such as drug design and disease prediction. In protein structure prediction, there are two important issues. The first is the design of the structure model and the second is the design of the optimization technology. Because of the complexity of realistic protein structures, the structure model adopted in this paper is a simplified model, called the off-lattice AB model. Once the structure model is chosen, an optimization technology is needed to search for the best conformation of a protein sequence under that model. However, PSP is an NP-hard problem even if the simplest model is assumed. Thus, many algorithms have been developed to solve the global optimization problem. In this paper, a hybrid algorithm, which combines a genetic algorithm (GA) and a tabu search (TS) algorithm, is developed to complete this task. Results: In order to develop an efficient optimization algorithm, several improved strategies are developed for the proposed genetic tabu search algorithm. The combined use of these strategies can improve the efficiency of the algorithm. In these strategies, tabu search introduced into the crossover and mutation operators can improve the local search capability, the adoption of a variable population size strategy can maintain the diversity of the population, and the ranking selection strategy can improve the possibility of an individual with a low energy value entering the next generation. Experiments are performed with Fibonacci sequences and real protein sequences. Experimental results show that the lowest energy obtained by the proposed GATS algorithm is lower than that obtained by previous methods. Conclusions: The hybrid algorithm has the advantages of both the genetic algorithm and the tabu search algorithm.
It makes use of the advantage of multiple search points in genetic algorithm, and can overcome poor hill-climbing capability in the conventional genetic algorithm by using the flexible memory functions of TS. Compared with some previous algorithms, GATS algorithm has better performance in global optimization and can predict 3D protein structure more effectively. PMID:20522256
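A rough sketch of one ingredient described above, a mutation step restricted by a tabu list, is given here. It is not the authors' GATS implementation: it omits crossover, the variable population size and ranking selection, and it uses the two-dimensional form of the off-lattice AB energy (bend term plus a Lennard-Jones-like term with coefficients 1, 0.5 and -0.5 for AA, BB and AB pairs), whereas the paper addresses 3D structures. All function names, step sizes and the tabu length are assumptions.

```python
import math
import random

# Pair coefficients of the AB model: AA attracts strongly, BB weakly, AB repels.
COEFF = {("A", "A"): 1.0, ("B", "B"): 0.5, ("A", "B"): -0.5, ("B", "A"): -0.5}


def fibonacci_sequence(k):
    """S0 = 'A', S1 = 'B', S(i+1) = S(i-1) + S(i)."""
    prev, cur = "A", "B"
    if k == 0:
        return prev
    for _ in range(k - 1):
        prev, cur = cur, prev + cur
    return cur


def ab_energy_2d(seq, angles):
    """Energy of a 2D AB chain with unit bonds; len(angles) == len(seq) - 2."""
    xs, ys, heading = [0.0, 1.0], [0.0, 0.0], 0.0
    for theta in angles:
        heading += theta
        xs.append(xs[-1] + math.cos(heading))
        ys.append(ys[-1] + math.sin(heading))
    energy = sum(0.25 * (1.0 - math.cos(t)) for t in angles)
    for i in range(len(seq) - 2):
        for j in range(i + 2, len(seq)):
            r2 = (xs[i] - xs[j]) ** 2 + (ys[i] - ys[j]) ** 2
            r6 = max(r2, 1e-12) ** 3  # guard against overlapping residues
            energy += 4.0 * (1.0 / r6 ** 2 - COEFF[(seq[i], seq[j])] / r6)
    return energy


def tabu_mutation(seq, angles, rng, tabu, tries=20, step=0.3, tabu_len=10):
    """Try random single-angle moves; keep the best one that is not tabu."""
    best = None
    for _ in range(tries):
        idx = rng.randrange(len(angles))
        delta = rng.uniform(-step, step)
        move = (idx, delta > 0)
        if move in tabu:
            continue
        cand = list(angles)
        cand[idx] += delta
        e = ab_energy_2d(seq, cand)
        if best is None or e < best[2]:
            best = (move, cand, e)
    if best is None:
        return list(angles), ab_energy_2d(seq, angles)
    # Forbid immediately reversing the accepted move, and cap the list length.
    tabu.append((best[0][0], not best[0][1]))
    del tabu[:-tabu_len]
    return best[1], best[2]


def search(seq, iterations=50, seed=0):
    """Repeated tabu mutation from a straight chain; returns start and best energy."""
    rng = random.Random(seed)
    angles = [0.0] * (len(seq) - 2)
    start_e = best_e = ab_energy_2d(seq, angles)
    tabu = []
    for _ in range(iterations):
        angles, e = tabu_mutation(seq, angles, rng, tabu)
        best_e = min(best_e, e)
    return start_e, best_e


seq = fibonacci_sequence(6)  # the 13-residue Fibonacci sequence
print(seq, search(seq))
```

Because the tabu step accepts the best admissible neighbour even when it is worse than the current conformation, the search can leave a local minimum; the best energy seen so far is tracked separately.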
Identification of susceptibility genes and genetic modifiers of human diseases

NASA Astrophysics Data System (ADS)

Abel, Kenneth; Kammerer, Stefan; Hoyal, Carolyn; Reneland, Rikard; Marnellos, George; Nelson, Matthew R.; Braun, Andreas

2005-03-01

The completion of the human genome sequence enables the discovery of genes involved in common human disorders. The successful identification of these genes is dependent on the availability of informative sample sets, validated marker panels, a high-throughput scoring technology, and a strategy for combining these resources. We have developed a universal platform technology based on mass spectrometry (MassARRAY) for analyzing nucleic acids with high precision and accuracy. To fuel this technology, we generated more than 100,000 validated assays for single nucleotide polymorphisms (SNPs) covering virtually all known and predicted human genes. We also established a large DNA sample bank comprising more than 50,000 consented healthy and diseased individuals. This combination of reagents and technology allows the execution of large-scale genome-wide association studies. Taking advantage of MassARRAY's capability for quantitative analysis of nucleic acids, allele frequencies are estimated in sample pools containing large numbers of individual DNAs. Comparing pools as a first-pass "filtering" step offers a substantial advantage in throughput and cost over individual genotyping. We employed this approach in numerous genome-wide, hypothesis-free searches to identify genes associated with common complex diseases, such as breast cancer, osteoporosis, and osteoarthritis, and genes involved in quantitative traits like high-density lipoprotein cholesterol (HDL-c) levels and central fat. Access to additional well-characterized patient samples through collaborations allows us to conduct replication studies that validate true disease genes. These discoveries will expand our understanding of genetic disease predisposition, and our ability for early diagnosis and determination of specific disease subtype or progression stage.
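The pooled first-pass filter described above can be sketched as follows. This is an illustration only: the abstract does not say how allele signals are converted to frequencies, so the simple signal ratio, the 0.05 difference threshold, the SNP identifiers and the signal values are all assumptions, and the function names are hypothetical.

```python
def pooled_allele_frequency(signal_a, signal_b):
    """Estimate the frequency of allele A in a pool from two allele signals."""
    total = signal_a + signal_b
    if total <= 0:
        raise ValueError("no signal for either allele")
    return signal_a / total


def first_pass_filter(case_pool, control_pool, min_diff=0.05):
    """Return SNPs whose pooled allele frequency differs between the two pools.

    Each pool maps a SNP identifier to (signal_a, signal_b). SNPs passing the
    threshold would go on to individual genotyping; the rest are dropped.
    """
    candidates = []
    for snp, (a, b) in case_pool.items():
        if snp not in control_pool:
            continue
        diff = pooled_allele_frequency(a, b) - pooled_allele_frequency(*control_pool[snp])
        if abs(diff) >= min_diff:
            candidates.append((snp, round(diff, 4)))
    # Largest absolute difference first.
    return sorted(candidates, key=lambda item: -abs(item[1]))


# Made-up signals for two hypothetical SNPs.
cases = {"snp1": (60, 40), "snp2": (50, 50)}
controls = {"snp1": (50, 50), "snp2": (51, 49)}
print(first_pass_filter(cases, controls))
```

A real analysis would also account for measurement error between replicate pools before calling a difference; that step is left out of this sketch.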
Genomics of crop wild relatives: expanding the gene pool for crop improvement.

PubMed

Brozynska, Marta; Furtado, Agnelo; Henry, Robert J

2016-04-01

Plant breeders require access to new genetic diversity to satisfy the demands of a growing human population for more food that can be produced in a variable or changing climate and to deliver the high-quality food with nutritional and health benefits demanded by consumers. The close relatives of domesticated plants, crop wild relatives (CWRs), represent a practical gene pool for use by plant breeders. Genomics of CWR generates data that support the use of CWR to expand the genetic diversity of crop plants. Advances in DNA sequencing technology are enabling the efficient sequencing of CWR and their increased use in crop improvement. As the sequencing of genomes of major crop species is completed, attention has shifted to analysis of the wider gene pool of major crops including CWR. A combination of de novo sequencing and resequencing is required to efficiently explore useful genetic variation in CWR. Analysis of the nuclear genome, transcriptome and maternal (chloroplast and mitochondrial) genome of CWR is facilitating their use in crop improvement. Genome analysis results in discovery of useful alleles in CWR and identification of regions of the genome in which diversity has been lost in domestication bottlenecks. Targeting of high priority CWR for sequencing will maximize the contribution of genome sequencing of CWR. Coordination of global efforts to apply genomics has the potential to accelerate access to and conservation of the biodiversity essential to the sustainability of agriculture and food production. © 2015 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.
Mycobacterium tuberculosis and whole-genome sequencing: how close are we to unleashing its full potential?

PubMed

Satta, G; Lipman, M; Smith, G P; Arnold, C; Kon, O M; McHugh, T D

2018-06-01

Nearly two decades after completion of the genome sequence of Mycobacterium tuberculosis (MTB), and with the advent of next generation sequencing technologies (NGS), whole-genome sequencing (WGS) has been applied to a wide range of clinical scenarios. Starting in 2017, England is the first country in the world to pioneer its use on a national scale for the diagnosis of tuberculosis, detection of drug resistance, and typing of MTB. This narrative review critically analyses the current applications of WGS for MTB and explains how close we are to realizing its full potential as a diagnostic, epidemiologic, and research tool. We searched for reports (both original articles and reviews) published in English up to 31 May 2017, with combinations of the following keywords: whole-genome sequencing, Mycobacterium, and tuberculosis. MEDLINE, Embase, and Scopus were used as search engines. We included articles that covered different aspects of whole-genome sequencing in relation to MTB. This review focuses on three main themes: the role of WGS for the prediction of drug susceptibility, MTB outbreak investigation and genetic diversity, and research applications of NGS. Many of the original expectations have been accomplished, and we believe that with its unprecedented sensitivity and power, WGS has the potential to address many unanswered questions in the near future. However, caution is still needed when interpreting WGS data as there are some important limitations to be aware of, from correct interpretation of drug susceptibilities to the bioinformatic support needed. Copyright © 2017 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
Genomic Encyclopedia of Type Strains, Phase I: The one thousand microbial genomes (KMG-I) project

DOE PAGES

Kyrpides, Nikos C.; Woyke, Tanja; Eisen, Jonathan A.; ...

2014-06-15

The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project was launched by the JGI in 2007 as a pilot project with the objective of sequencing 250 bacterial and archaeal genomes. The two major goals of that project were (a) to test the hypothesis that there are many benefits to the use of the phylogenetic diversity of organisms in the tree of life as a primary criterion for generating their genome sequence and (b) to develop the necessary framework, technology and organization for large-scale sequencing of microbial isolate genomes. While the GEBA pilot project has not yet been entirely completed, both of the original goals have already been successfully accomplished, leading the way for the next phase of the project. Here we propose taking the GEBA project to the next level, by generating high quality draft genomes for 1,000 bacterial and archaeal strains. This represents a combined 16-fold increase in both scale and speed as compared to the GEBA pilot project (250 isolate genomes in 4+ years). We will follow a similar approach for organism selection and sequencing prioritization as was done for the GEBA pilot project (i.e. phylogenetic novelty, availability and growth of cultures of type strains and DNA extraction capability), focusing on type strains as this ensures reproducibility of our results and provides the strongest linkage between genome sequences and other knowledge about each strain. In turn, this project will constitute a pilot phase of a larger effort that will target the genome sequences of all available type strains of the Bacteria and Archaea.
Unravelling the complexity of microRNA-mediated gene regulation in black pepper (Piper nigrum L.) using high-throughput small RNA profiling.

PubMed

Asha, Srinivasan; Sreekumar, Sweda; Soniya, E V

2016-01-01

Analysis of high-throughput small RNA deep sequencing data, in combination with black pepper transcriptome sequences, revealed microRNA-mediated gene regulation in black pepper (Piper nigrum L.). Black pepper is an important spice crop and its berries are used worldwide as a natural food additive that contributes unique flavour to foods. In the present study, to characterize microRNAs from black pepper, we generated a small RNA library from black pepper leaf and sequenced it by Illumina high-throughput sequencing technology. MicroRNAs belonging to a total of 303 conserved miRNA families were identified from the sRNAome data. Subsequent analysis from recently sequenced black pepper transcriptome confirmed precursor sequences of 50 conserved miRNAs and four potential novel miRNA candidates. Stem-loop qRT-PCR experiments demonstrated differential expression of eight conserved miRNAs in black pepper. Computational analysis of targets of the miRNAs showed 223 potential black pepper unigene targets that encode diverse transcription factors and enzymes involved in plant development, disease resistance, metabolic and signalling pathways. RLM-RACE experiments further mapped miRNA-mediated cleavage at five of the mRNA targets. In addition, miRNA isoforms corresponding to 18 miRNA families were also identified from black pepper. This study presents the first large-scale identification of microRNAs from black pepper and provides the foundation for the future studies of miRNA-mediated gene regulation of stress responses and diverse metabolic processes in black pepper.
Genomic Encyclopedia of Type Strains, Phase I: The one thousand microbial genomes (KMG-I) project

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kyrpides, Nikos C.; Woyke, Tanja; Eisen, Jonathan A.

The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project was launched by the JGI in 2007 as a pilot project with the objective of sequencing 250 bacterial and archaeal genomes. The two major goals of that project were (a) to test the hypothesis that there are many benefits to the use of the phylogenetic diversity of organisms in the tree of life as a primary criterion for generating their genome sequence and (b) to develop the necessary framework, technology and organization for large-scale sequencing of microbial isolate genomes. While the GEBA pilot project has not yet been entirely completed, both of the original goals have already been successfully accomplished, leading the way for the next phase of the project. Here we propose taking the GEBA project to the next level, by generating high quality draft genomes for 1,000 bacterial and archaeal strains. This represents a combined 16-fold increase in both scale and speed as compared to the GEBA pilot project (250 isolate genomes in 4+ years). We will follow a similar approach for organism selection and sequencing prioritization as was done for the GEBA pilot project (i.e. phylogenetic novelty, availability and growth of cultures of type strains and DNA extraction capability), focusing on type strains as this ensures reproducibility of our results and provides the strongest linkage between genome sequences and other knowledge about each strain. In turn, this project will constitute a pilot phase of a larger effort that will target the genome sequences of all available type strains of the Bacteria and Archaea.
Estimation of daily stream flow of southeastern coastal plain watersheds by combining estimated magnitude and sequence

Treesearch

Herbert Ssegane; Devendra M. Amatya; E.W. Tollner; Zhaohua Dai; Jami E. Nettles

2013-01-01

Commonly used methods to predict streamflow at ungauged watersheds implicitly predict streamflow magnitude and temporal sequence concurrently. An alternative approach that has not been fully explored is the conceptualization of streamflow as a composite of two separable components of magnitude and sequence, where each component is estimated separately and then combined...
Recent patents of nanopore DNA sequencing technology: progress and challenges.

PubMed

Zhou, Jianfeng; Xu, Bingqian

2010-11-01

DNA sequencing techniques have developed rapidly in recent decades, driven primarily by the Human Genome Project. Among the newly proposed techniques, nanopore sequencing is considered a suitable candidate for single-molecule DNA sequencing at ultrahigh speed and very low cost. Several fabrication and modification techniques have been developed to produce robust and well-defined nanopore devices. Considerable effort has also gone into applying nanopores to analyze the properties of DNA molecules. Compared with traditional sequencing techniques, nanopores have demonstrated distinct advantages in the main practical issues, such as sample preparation, sequencing speed, cost-effectiveness and read length. Although challenges remain, recent research on improving the capabilities of nanopores has brought the field closer to its ultimate goal: sequencing an individual DNA strand at single-nucleotide resolution. This patent review briefly highlights recent developments and technological achievements in DNA analysis and sequencing at the single-molecule level, focusing on nanopore-based methods.
ABACAS: algorithm-based automatic contiguation of assembled sequences

PubMed Central

Assefa, Samuel; Keane, Thomas M.; Otto, Thomas D.; Newbold, Chris; Berriman, Matthew

2009-01-01

Summary: Due to the availability of new sequencing technologies, we are now increasingly interested in sequencing closely related strains of existing finished genomes. Recently a number of de novo and mapping-based assemblers have been developed to produce high quality draft genomes from new sequencing technology reads. New tools are necessary to take contigs from a draft assembly through to a fully contiguated genome sequence. ABACAS is intended as a tool to rapidly contiguate (align, order, orientate), visualize and design primers to close gaps on shotgun assembled contigs based on a reference sequence. The input to ABACAS is a set of contigs which will be aligned to the reference genome, ordered and orientated, visualized in the ACT comparative browser, and optimal primer sequences are automatically generated. Availability and Implementation: ABACAS is implemented in Perl and is freely available for download from http://abacas.sourceforge.net Contact: sa4@sanger.ac.uk PMID:19497936
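The ordering and orientation step described above can be sketched in a few lines. This is not ABACAS itself (which is implemented in Perl); it assumes the alignment against the reference has already been done and that each placed contig has a single best hit given as a reference start position and a strand. The function names and the input layout are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string (other characters pass through unchanged)."""
    return seq.translate(COMPLEMENT)[::-1]


def contiguate(contigs, hits):
    """Order and orient contigs along a reference.

    contigs: dict mapping contig name to sequence.
    hits: dict mapping contig name to (reference_start, strand), strand '+' or '-'.
    Returns (ordered, unplaced): ordered is a list of (name, oriented sequence)
    sorted by reference start; unplaced lists contigs with no hit.
    """
    placed = sorted((name for name in contigs if name in hits),
                    key=lambda name: hits[name][0])
    ordered = []
    for name in placed:
        _, strand = hits[name]
        seq = contigs[name]
        # Minus-strand contigs are flipped so every contig reads like the reference.
        ordered.append((name, seq if strand == "+" else reverse_complement(seq)))
    unplaced = [name for name in contigs if name not in hits]
    return ordered, unplaced


# Toy example with made-up contigs and hit coordinates.
contigs = {"c1": "AAGT", "c2": "CCGG", "c3": "TTTT"}
hits = {"c1": (500, "-"), "c2": (10, "+")}
print(contiguate(contigs, hits))
```

Gaps between neighbouring contigs in the ordered list are the places where primers would then be designed for gap closure; that part is outside this sketch.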
Comparison of Next-Generation Sequencing Systems

PubMed Central

Liu, Lin; Li, Yinhu; Li, Siliang; Hu, Ni; He, Yimin; Pong, Ray; Lin, Danni; Lu, Lihua; Law, Maggie

2012-01-01

With the fast development and wide application of next-generation sequencing (NGS) technologies, genomic sequence information is within reach to aid the achievement of goals to decode life mysteries, make better crops, detect pathogens, and improve quality of life. NGS systems are typically represented by SOLiD/Ion Torrent PGM from Life Sciences, Genome Analyzer/HiSeq 2000/MiSeq from Illumina, and GS FLX Titanium/GS Junior from Roche. Beijing Genomics Institute (BGI), which possesses the world's biggest sequencing capacity, has multiple NGS systems including 137 HiSeq 2000, 27 SOLiD, one Ion Torrent PGM, one MiSeq, and one 454 sequencer. We have accumulated extensive experience in sample handling, sequencing, and bioinformatics analysis. In this paper, the technologies of these systems are reviewed, and first-hand data from extensive experience are summarized and analyzed to discuss the advantages and specifics associated with each sequencing system. Finally, applications of NGS are summarized. PMID:22829749
Massively Parallel DNA Sequencing Facilitates Diagnosis of Patients with Usher Syndrome Type 1

PubMed Central

Yoshimura, Hidekane; Iwasaki, Satoshi; Nishio, Shin-ya; Kumakawa, Kozo; Tono, Tetsuya; Kobayashi, Yumiko; Sato, Hiroaki; Nagai, Kyoko; Ishikawa, Kotaro; Ikezono, Tetsuo; Naito, Yasushi; Fukushima, Kunihiro; Oshikawa, Chie; Kimitsuki, Takashi; Nakanishi, Hiroshi; Usami, Shin-ichi

2014-01-01

Usher syndrome is an autosomal recessive disorder manifesting hearing loss, retinitis pigmentosa and vestibular dysfunction, and having three clinical subtypes. Usher syndrome type 1 is the most severe subtype due to its profound hearing loss, lack of vestibular responses, and retinitis pigmentosa that appears in prepuberty. Six of the corresponding genes have been identified, making early diagnosis through DNA testing possible, with many immediate and several long-term advantages for patients and their families. However, conventional genetic techniques, such as direct sequence analysis, are both time-consuming and expensive. Targeted exon sequencing of selected genes using massively parallel DNA sequencing technology will potentially enable us to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using this technique combined with direct sequence analysis, we screened 17 unrelated Usher syndrome type 1 patients and detected probable pathogenic variants in 16 of them (94.1%), each of whom carried at least one mutation. Seven patients had the MYO7A mutation (41.2%), which is the most common type in Japanese. Most of the mutations were detected only by massively parallel DNA sequencing. We report here four patients who had probable pathogenic mutations in two different Usher syndrome type 1 genes, and one case of MYO7A/PCDH15 digenic inheritance. This is the first report of Usher syndrome mutation analysis using massively parallel DNA sequencing and of the frequency of Usher syndrome type 1 genes in Japanese. Mutation screening using this technique has the power to quickly identify mutations of many causative genes while maintaining cost-benefit performance. In addition, the simultaneous mutation analysis of large numbers of genes is useful for detecting mutations in different genes that are possibly disease modifiers or of digenic inheritance. PMID:24618850
Clustering evolving proteins into homologous families.

PubMed

Chan, Cheong Xin; Mahbob, Maisarah; Ragan, Mark A

2013-04-08

Clustering sequences into groups of putative homologs (families) is a critical first step in many areas of comparative biology and bioinformatics. The performance of clustering approaches in delineating biologically meaningful families depends strongly on characteristics of the data, including content bias and degree of divergence. New, highly scalable methods have recently been introduced to cluster the very large datasets being generated by next-generation sequencing technologies. However, there has been little systematic investigation of how characteristics of the data impact the performance of these approaches. Using clusters from a manually curated dataset as reference, we examined the performance of a widely used graph-based Markov clustering algorithm (MCL) and a greedy heuristic approach (UCLUST) in delineating protein families coded by three sets of bacterial genomes of different G+C content. Both MCL and UCLUST generated clusters that are comparable to the reference sets at specific parameter settings, although UCLUST tends to under-cluster compositionally biased sequences (G+C content 33% and 66%). Using simulated data, we sought to assess the individual effects of sequence divergence, rate heterogeneity, and underlying G+C content. Performance decreased with increasing sequence divergence, decreasing among-site rate variation, and increasing G+C bias. Two MCL-based methods recovered the simulated families more accurately than did UCLUST. MCL using local alignment distances is more robust across the investigated range of sequence features than are greedy heuristics using distances based on global alignment. Our results demonstrate that sequence divergence, rate heterogeneity and content bias can individually and in combination affect the accuracy with which MCL and UCLUST can recover homologous protein families. For application to data that are more divergent, and exhibit higher among-site rate variation and/or content bias, MCL may often be the better choice, especially if computational resources are not limiting.
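For readers unfamiliar with Markov clustering, the core loop of MCL (alternating expansion and inflation on a column-stochastic similarity matrix) can be sketched in a few lines of plain Python. This is a toy illustration, not the MCL program or the settings evaluated in the study: the inflation value of 2.0, the added self-loops, the thresholds, and the six-node example graph are all assumptions chosen for the example.

```python
def normalize_columns(m):
    """Scale each column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / total if total else 0.0
    return out


def expand(m):
    """Expansion: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation: raise entries to a power, then renormalize columns."""
    return normalize_columns([[x ** power for x in row] for row in m])


def mcl(adjacency, inflation=2.0, max_iterations=50, tol=1e-9):
    """Cluster an undirected similarity graph given as a square matrix."""
    n = len(adjacency)
    # Add self-loops (an assumed, commonly used convention), then normalize.
    m = normalize_columns(
        [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)])
    for _ in range(max_iterations):
        updated = inflate(expand(m), inflation)
        change = max(abs(updated[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = updated
        if change < tol:
            break
    # Each row that still carries weight lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy similarity graph: two separate triangles of mutually similar proteins.
graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
families = mcl(graph)
```

In practice the matrix entries would be pairwise similarity scores rather than 0/1 values, and a higher inflation value generally yields finer-grained clusters.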
Validation of Genotyping-By-Sequencing Analysis in Populations of Tetraploid Alfalfa by 454 Sequencing

PubMed Central

Rocher, Solen; Jean, Martine; Castonguay, Yves; Belzile, François

2015-01-01

Genotyping-by-sequencing (GBS) is a relatively low-cost high throughput genotyping technology based on next generation sequencing and is applicable to orphan species with no reference genome. A combination of genome complexity reduction and multiplexing with DNA barcoding provides a simple and affordable way to resolve allelic variation between plant samples or populations. GBS was performed on ApeKI libraries using DNA from 48 genotypes each of two heterogeneous populations of tetraploid alfalfa (Medicago sativa spp. sativa): the synthetic cultivar Apica (ATF0) and a derived population (ATF5) obtained after five cycles of recurrent selection for superior tolerance to freezing (TF). Nearly 400 million reads were obtained from two lanes of an Illumina HiSeq 2000 sequencer and analyzed with the Universal Network-Enabled Analysis Kit (UNEAK) pipeline designed for species with no reference genome. Following the application of whole dataset-level filters, 11,694 single nucleotide polymorphism (SNP) loci were obtained. About 60% had a significant match on the Medicago truncatula syntenic genome. The accuracy of allelic ratios and genotype calls based on GBS data was directly assessed using 454 sequencing on a subset of SNP loci scored in eight plant samples. Sequencing depth in this study was not sufficient for accurate tetraploid allelic dosage, but reliable genotype calls based on diploid allelic dosage were obtained when using additional quality filtering. Principal Component Analysis of SNP loci in plant samples revealed that a small proportion (<5%) of the genetic variability assessed by GBS is able to differentiate ATF0 and ATF5. Our results confirm that analysis of GBS data using UNEAK is a reliable approach for genome-wide discovery of SNP loci in outcrossed polyploids. PMID:26115486
Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study.

PubMed

Cerdeira, Louise Teixeira; Carneiro, Adriana Ribeiro; Ramos, Rommel Thiago Jucá; de Almeida, Sintia Silva; D'Afonseca, Vivian; Schneider, Maria Paula Cruz; Baumbach, Jan; Tauch, Andreas; McCulloch, John Anthony; Azevedo, Vasco Ariston Carvalho; Silva, Artur

2011-08-01

Due to the advent of the so-called Next-Generation Sequencing (NGS) technologies, the amount of monetary and temporal resources for whole-genome sequencing has been reduced by several orders of magnitude. Sequence reads can be assembled either by anchoring them directly onto an available reference genome (classical reference assembly), or can be concatenated by overlap (de novo assembly). The latter strategy is preferable because it tends to maintain the architecture of the genome sequence; however, depending on the NGS platform used, the shortness of read lengths causes tremendous problems in the subsequent genome assembly phase, impeding closing of the entire genome sequence. To address the problem, we developed a multi-pronged hybrid de novo strategy combining De Bruijn graph and Overlap-Layout-Consensus methods, which was used to assemble from short reads the entire genome of Corynebacterium pseudotuberculosis strain I19, a bacterium with immense importance in veterinary medicine that causes Caseous Lymphadenitis in ruminants, principally ovines and caprines. Briefly, contigs were assembled de novo from the short reads and were only oriented using a reference genome by anchoring. Remaining gaps were closed using iterative anchoring of short reads by craning to gap flanks. Finally, we compare the genome sequence assembled using our hybrid strategy to a classical reference assembly using the same data as input and show that with the availability of a reference genome, it pays off to use the hybrid de novo strategy, rather than a classical reference assembly, because more genome sequences are preserved using the former. Copyright © 2011 Elsevier B.V. All rights reserved.
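The De Bruijn graph idea mentioned above can be shown with a small, self-contained Python sketch. It covers only the graph-building half of the hybrid strategy and is not the authors' pipeline: the function names, the k-mer size of 4, and the three toy reads are assumptions made for illustration, and the walk stops at the first branch or repeat instead of resolving it.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in a k-mer."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unambiguous(graph, start):
    """Extend a contig from `start` while exactly one successor exists."""
    contig = start
    node = start
    seen = {start}
    while len(graph.get(node, ())) == 1:
        (successor,) = graph[node]
        if successor in seen:
            break  # stop at a cycle instead of looping forever
        contig += successor[-1]
        seen.add(successor)
        node = successor
    return contig


# Three overlapping toy reads sampled from the sequence ATGGCGTACC.
reads = ["ATGGCG", "GGCGTA", "CGTACC"]
graph = build_de_bruijn(reads, k=4)
contig = walk_unambiguous(graph, "ATG")
```

Short reads make k small relative to repeats in the genome, which is where branches appear in the graph and where an overlap-based or reference-guided step, as in the hybrid approach, becomes useful.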
Monitoring and Surveillance of Marine Invasive Species in Californian Waters by DNA Barcoding: Methodological and Analytical Solutions

NASA Astrophysics Data System (ADS)

Campbell, T. L.; Geller, J. B.; Heller, P.; Ruiz, G.; Chang, A.; McCann, L.; Ceballos, L.; Marraffini, M.; Ashton, G.; Larson, K.; Havard, S.; Meagher, K.; Wheelock, M.; Drake, C.; Rhett, G.

2016-02-01

The Ballast Water Management Act, the Marine Invasive Species Act, and the Coastal Ecosystem Protection Act require the California Department of Fish and Wildlife to monitor and evaluate the extent of biological invasions in the state's marine and estuarine waters. This has been performed statewide, using a variety of methodologies. Conventional sample collection and processing are laborious, slow and costly, and may require considerable taxonomic expertise and detailed, time-consuming microscopic study of multiple specimens. These factors limit the volume of biomass that can be searched for introduced species. New technologies continue to reduce the cost and increase the throughput of genetic analyses, which are becoming efficient alternatives to traditional morphological analysis for identification, monitoring and surveillance of marine invasive species. Using next-generation sequencing of mitochondrial Cytochrome c oxidase subunit I (COI) and nuclear large subunit ribosomal RNA (LSU), we analyzed over 15,000 individual marine invertebrates collected in Californian waters. We have created sequence databases of California native and non-native species to assist in molecular identification and surveillance in North American waters. Metagenetics, the next-generation sequencing of environmental samples with comparison to DNA sequence databases, is a faster and cost-effective alternative to individual sample analysis. We have sequenced DNA from biomass collected from whole settlement plates and plankton in California harbors, and used our introduced species database to create species lists. We can combine these species lists for individual marinas with collected environmental data, such as temperature, salinity, and dissolved oxygen, to understand the ecology of marine invasions. Here we discuss high throughput sampling, sequencing, and COASTLINE, our data analysis answer to the challenges of working with hundreds of millions of sequencing reads from tens of thousands of specimens.
An efficient and scalable graph modeling approach for capturing information at different levels in next generation sequencing reads

PubMed Central

2013-01-01

Background: Next generation sequencing technologies have greatly advanced many research areas of the biomedical sciences through their capability to generate massive amounts of genetic information at unprecedented rates. The advent of next generation sequencing has led to the development of numerous computational tools to analyze and assemble the millions to billions of short sequencing reads produced by these technologies. While these tools filled an important gap, current approaches for storing, processing, and analyzing short read datasets generally have remained simple and lack the complexity needed to efficiently model the produced reads and assemble them correctly. Results: Previously, we presented an overlap graph coarsening scheme for modeling read overlap relationships on multiple levels. Most current read assembly and analysis approaches use a single graph or set of clusters to represent the relationships among a read dataset. Instead, we use a series of graphs to represent the reads and their overlap relationships across a spectrum of information granularity. At each information level our algorithm is capable of generating clusters of reads from the reduced graph, forming an integrated graph modeling and clustering approach for read analysis and assembly. Previously we applied our algorithm to simulated and real 454 datasets to assess its ability to efficiently model and cluster next generation sequencing data. In this paper we extend our algorithm to large simulated and real Illumina datasets to demonstrate that our algorithm is practical for both sequencing technologies. Conclusions: Our overlap graph theoretic algorithm is able to model next generation sequencing reads at various levels of granularity through the process of graph coarsening. Additionally, our model allows for efficient representation of the read overlap relationships, is scalable for large datasets, and is practical for both Illumina and 454 sequencing technologies. PMID:24564333
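One level of graph coarsening on a read overlap graph can be sketched as follows. The abstract does not specify the coarsening rule, so this example substitutes heavy-edge matching, a common general-purpose coarsening heuristic, and should not be read as the authors' scheme. The function name, the edge weights (taken here to be overlap lengths), and the four-read toy graph are assumptions.

```python
from collections import defaultdict


def coarsen_once(edges, n_nodes):
    """Merge read pairs joined by the heaviest edges into single coarse nodes.

    edges: dict mapping (u, v) read-index pairs to an overlap weight.
    Returns (node_to_cluster, coarse_edges).
    """
    node_to_cluster = {}
    next_id = 0
    # Visit edges from heaviest to lightest; match each read at most once.
    for (u, v), _weight in sorted(edges.items(), key=lambda kv: -kv[1]):
        if u not in node_to_cluster and v not in node_to_cluster:
            node_to_cluster[u] = node_to_cluster[v] = next_id
            next_id += 1
    # Reads left unmatched become singleton coarse nodes.
    for node in range(n_nodes):
        if node not in node_to_cluster:
            node_to_cluster[node] = next_id
            next_id += 1
    # Edges between different coarse nodes are kept, with weights summed.
    coarse_edges = defaultdict(int)
    for (u, v), weight in edges.items():
        cu, cv = node_to_cluster[u], node_to_cluster[v]
        if cu != cv:
            coarse_edges[(min(cu, cv), max(cu, cv))] += weight
    return node_to_cluster, dict(coarse_edges)


# Toy overlap graph: four reads in a chain, weights are overlap lengths.
overlaps = {(0, 1): 50, (1, 2): 30, (2, 3): 45}
clusters, coarse = coarsen_once(overlaps, n_nodes=4)
```

Applying the same function repeatedly to its own output would give a series of progressively smaller graphs, each of which also defines a clustering of the original reads.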
Eye vision system using programmable micro-optics and micro-electronics

NASA Astrophysics Data System (ADS)

Riza, Nabeel A.; Amin, M. Junaid; Riza, Mehdi N.

2014-02-01

Proposed is a novel eye vision system that combines advanced micro-optic and microelectronic technologies, including programmable micro-optic devices, pico-projectors, Radio Frequency (RF) and optical wireless communication and control links, energy harvesting and storage devices, and remote wireless energy transfer capabilities. This portable, lightweight system can measure eye refractive powers, optimize light conditions for the eye under test, conduct color-blindness tests, and implement eye strain relief and eye muscle exercises via time-sequenced imaging. Described are the basic design of the proposed system and its first-stage experimental results for vision spherical lens refractive error correction.
