Test Input Generation for Red-Black Trees using Abstraction
NASA Technical Reports Server (NTRS)
Visser, Willem; Pasareanu, Corina S.; Pelanek, Radek
2005-01-01
We consider the problem of test input generation for code that manipulates complex data structures. Test inputs are sequences of method calls from the data structure interface. We describe test input generation techniques that rely on state matching to avoid generation of redundant tests. Exhaustive techniques use explicit state model checking to explore all the possible test sequences up to predefined input sizes. Lossy techniques rely on abstraction mappings to compute and store abstract versions of the concrete states; they explore under-approximations of all the possible test sequences. We have implemented the techniques on top of the Java PathFinder model checker and we evaluate them using a Java implementation of red-black trees.
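The state-matching idea can be illustrated with a small sketch. It is not the authors' Java PathFinder implementation: a plain Python frozenset stands in for the red-black tree, the function name explore and the choice of "container size" as the abstraction mapping are assumptions made only for illustration.

```python
from collections import deque

def explore(values, max_len, abstraction=lambda state: state):
    """Breadth-first search over add/remove call sequences.

    A successor is kept only if its (abstracted) state was not seen before,
    so redundant test sequences are never generated. With the identity
    abstraction this is exhaustive state matching; with a coarser mapping
    it becomes a lossy under-approximation.
    """
    start = frozenset()              # hypothetical stand-in for an empty tree
    seen = {abstraction(start)}
    tests = []                       # call sequences that reached a new state
    queue = deque([(start, ())])
    while queue:
        state, calls = queue.popleft()
        if len(calls) == max_len:    # bound on the test sequence length
            continue
        for op in ("add", "remove"):
            for v in values:
                nxt = state | {v} if op == "add" else state - {v}
                key = abstraction(nxt)
                if key in seen:      # state matching: skip redundant tests
                    continue
                seen.add(key)
                seq = calls + ((op, v),)
                tests.append(seq)
                queue.append((nxt, seq))
    return tests

exhaustive = explore((1, 2, 3), max_len=3)
lossy = explore((1, 2, 3), max_len=3, abstraction=len)
```

In this toy setting the exhaustive search keeps one test per distinct container content, while the size abstraction keeps far fewer tests at the price of missing states that differ only in their elements.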
Next-Generation Sequencing in the Mycology Lab.
Zoll, Jan; Snelders, Eveline; Verweij, Paul E; Melchers, Willem J G
New state-of-the-art techniques in sequencing offer valuable tools both for detecting mycobiota and for understanding the molecular mechanisms of resistance against antifungal compounds and of virulence. The introduction of new sequencing platforms with enhanced capacity and reduced costs for sequence analysis provides a potentially powerful tool in mycological diagnosis and research. In this review, we summarize the applications of next-generation sequencing techniques in mycology.
King, Brian R; Aburdene, Maurice; Thompson, Alex; Warres, Zach
2014-01-01
Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success for detection of coding regions in a gene. Recently, these methods are being used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence that is used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.
Optical Processing Techniques For Pseudorandom Sequence Prediction
NASA Astrophysics Data System (ADS)
Gustafson, Steven C.
1983-11-01
Pseudorandom sequences are series of apparently random numbers generated, for example, by linear or nonlinear feedback shift registers. An important application of these sequences is in spread spectrum communication systems, in which, for example, the transmitted carrier phase is digitally modulated rapidly and pseudorandomly and in which the information to be transmitted is incorporated as a slow modulation in the pseudorandom sequence. In this case the transmitted information can be extracted only by a receiver that uses for demodulation the same pseudorandom sequence used by the transmitter, and thus this type of communication system has a very high immunity to third-party interference. However, if a third party can predict in real time the probable future course of the transmitted pseudorandom sequence given past samples of this sequence, then interference immunity can be significantly reduced. In this application effective pseudorandom sequence prediction techniques should be (1) applicable in real time to rapid (e.g., megahertz) sequence generation rates, (2) applicable to both linear and nonlinear pseudorandom sequence generation processes, and (3) applicable to error-prone past sequence samples of limited number and continuity. Certain optical processing techniques that may meet these requirements are discussed in this paper. In particular, techniques based on incoherent optical processors that perform general linear transforms or (more specifically) matrix-vector multiplications are considered. Computer simulation examples are presented which indicate that significant prediction accuracy can be obtained using these transforms for simple pseudorandom sequences. However, the useful prediction of more complex pseudorandom sequences will probably require the application of more sophisticated optical processing techniques.
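As a concrete picture of a linear feedback shift register source, the following minimal sketch steps a 4-bit register until it returns to its seed. The register width, the tap positions and the function name lfsr_states are illustrative assumptions and are not taken from the paper.

```python
def lfsr_states(seed=0b0001):
    """Return the successive states of a 4-bit Fibonacci LFSR.

    Feedback is the XOR of the two lowest bits, shifted in at the top.
    The register is stepped until it comes back to the seed.
    """
    if seed & 0xF == 0:
        raise ValueError("seed must be non-zero")   # all-zero state is stuck
    states = []
    state = seed & 0xF
    while True:
        states.append(state)
        feedback = (state ^ (state >> 1)) & 1       # XOR of bit 0 and bit 1
        state = (state >> 1) | (feedback << 3)      # shift right, insert feedback
        if state == (seed & 0xF):
            return states

states = lfsr_states()
bits = [s & 1 for s in states]   # output bit is the lowest register bit
```

With these taps the register visits every non-zero 4-bit state once before repeating, which is the maximum-length behaviour that makes such sequences look random while remaining fully determined by the seed and the taps.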
Next-Generation Technologies for Multiomics Approaches Including Interactome Sequencing
Ohashi, Hiroyuki; Miyamoto-Sato, Etsuko
2015-01-01
The development of high-speed analytical techniques such as next-generation sequencing and microarrays allows high-throughput analysis of biological information at a low cost. These techniques contribute to medical and bioscience advancements and provide new avenues for scientific research. Here, we outline a variety of new innovative techniques and discuss their use in omics research (e.g., genomics, transcriptomics, metabolomics, proteomics, and interactomics). We also discuss the possible applications of these methods, including an interactome sequencing technology that we developed, in future medical and life science research. PMID:25649523
Impact of Next Generation Sequencing Techniques in Food Microbiology
Mayo, Baltasar; Rachid, Caio T. C. C; Alegría, Ángel; Leite, Analy M. O; Peixoto, Raquel S; Delgado, Susana
2014-01-01
Taking Maxam-Gilbert and Sanger sequencing as the first generation, recent years have seen an explosion of newly developed sequencing strategies, which are usually referred to as next generation sequencing (NGS) techniques. NGS techniques are high-throughput and produce thousands or even millions of sequences at the same time. These sequences allow for the accurate identification of microbial taxa, including uncultivable organisms and those present in small numbers. In specific applications, NGS provides a complete inventory of all microbial operons and genes present or being expressed under different study conditions. NGS techniques are revolutionizing the field of microbial ecology and have recently been used to examine several food ecosystems. After a short introduction to the most common NGS systems and platforms, this review addresses how NGS techniques have been employed in the study of food microbiota and food fermentations, and discusses their limits and perspectives. The most important findings are reviewed, including those made in the study of the microbiota of milk, fermented dairy products, and plant-, meat- and fish-derived fermented foods. The knowledge that can be gained on microbial diversity, population structure and population dynamics via the use of these technologies could be vital in improving the monitoring and manipulation of foods and fermented food products. It should also improve the safety of these foods. PMID:25132799
Altimari, Annalisa; de Biase, Dario; De Maglio, Giovanna; Gruppioni, Elisa; Capizzi, Elisa; Degiovanni, Alessio; D’Errico, Antonia; Pession, Annalisa; Pizzolitto, Stefano; Fiorentino, Michelangelo; Tallini, Giovanni
2013-01-01
Detection of KRAS mutations in archival pathology samples is critical for therapeutic appropriateness of anti-EGFR monoclonal antibodies in colorectal cancer. We compared the sensitivity, specificity, and accuracy of Sanger sequencing, ARMS-Scorpion (TheraScreen®) real-time polymerase chain reaction (PCR), pyrosequencing, chip array hybridization, and 454 next-generation sequencing to assess KRAS codon 12 and 13 mutations in 60 nonconsecutive selected cases of colorectal cancer. Twenty of the 60 cases were detected as wild-type KRAS by all methods with 100% specificity. Among the 40 mutated cases, 13 were discrepant with at least one method. The sensitivity was 85%, 90%, 93%, and 92%, and the accuracy was 90%, 93%, 95%, and 95% for Sanger sequencing, TheraScreen real-time PCR, pyrosequencing, and chip array hybridization, respectively. The main limitation of Sanger sequencing was its low analytical sensitivity, whereas TheraScreen real-time PCR, pyrosequencing, and chip array hybridization showed higher sensitivity but suffered from the limitations of predesigned assays. Concordance between the methods was k = 0.79 for Sanger sequencing and k > 0.85 for the other techniques. Tumor cell enrichment correlated significantly with the abundance of KRAS-mutated deoxyribonucleic acid (DNA), evaluated as ΔCt for TheraScreen real-time PCR (P = 0.03), percentage of mutation for pyrosequencing (P = 0.001), ratio for chip array hybridization (P = 0.003), and percentage of mutation for 454 next-generation sequencing (P = 0.004). Also, 454 next-generation sequencing showed the best cross correlation for quantification of mutation abundance compared with all the other methods (P < 0.001). Our comparison showed the superiority of next-generation sequencing over the other techniques in terms of sensitivity and specificity. Next-generation sequencing will replace Sanger sequencing as the reference technique for diagnostic detection of KRAS mutation in archival tumor tissues. 
PMID:23950653
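The reported percentages follow from the standard definitions of sensitivity, specificity and accuracy. The sketch below works through the Sanger sequencing figures; the raw counts (34 detected and 6 missed among the 40 mutated cases) are not stated in the abstract and are inferred here from the reported 85% sensitivity, so they should be read as an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # mutated cases correctly detected
    specificity = tn / (tn + fp)                 # wild-type cases correctly called
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # all correct calls over all cases
    return sensitivity, specificity, accuracy

# Assumed counts: 40 mutated cases (34 detected, 6 missed), 20 wild-type cases.
sens, spec, acc = diagnostic_metrics(tp=34, fn=6, tn=20, fp=0)
```

Under these assumed counts the function reproduces the 85% sensitivity, 100% specificity and 90% accuracy quoted for Sanger sequencing.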
Talking Drums: Generating drum grooves with neural networks
NASA Astrophysics Data System (ADS)
Hutchings, P.
2017-05-01
Presented is a method of generating a full drum kit part for a provided kick-drum sequence. A sequence to sequence neural network model used in natural language translation was adopted to encode multiple musical styles and an online survey was developed to test different techniques for sampling the output of the softmax function. The strongest results were found using a sampling technique that drew from the three most probable outputs at each subdivision of the drum pattern but the consistency of output was found to be heavily dependent on style.
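Sampling from the three most probable outputs can be sketched as follows. This is a generic top-k sampler written with the Python standard library, not the author's code; the function names and the example logits are assumptions.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_top_k(logits, k=3, rng=random):
    """Draw one index from the k most probable softmax outputs.

    The probabilities of the k candidates are used as sampling weights,
    so less likely candidates are still chosen occasionally.
    """
    probs = softmax(logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    weights = [probs[i] for i in top]
    return rng.choices(top, weights=weights, k=1)[0]

# Hypothetical scores for five possible drum events at one subdivision.
logits = [2.0, 0.1, 1.5, -1.0, 0.7]
draws = [sample_top_k(logits, k=3, rng=random.Random(i)) for i in range(200)]
```

Restricting the draw to a few candidates keeps the output varied without letting very unlikely events through, which is one plausible reason such a scheme could sound more consistent than sampling from the full distribution.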
2016-07-06
Targeted next-generation sequencing for the detection of ciprofloxacin resistance markers using molecular inversion probes Christopher P...development and evaluation of a panel of 44 single-stranded molecular inversion probes (MIPs) coupled to next-generation sequencing (NGS) for the...padlock and molecular inversion probes as upfront enrichment steps for use with NGS showed the specificity and multiplexability of these techniques
Association mining of dependency between time series
NASA Astrophysics Data System (ADS)
Hafez, Alaaeldin
2001-03-01
Time series analysis is considered a crucial component of strategic control across a broad variety of disciplines in business, science and engineering. Time series data is a sequence of observations collected over intervals of time. Each time series describes a phenomenon as a function of time. Analysis of time series data includes discovering trends (or patterns) in a time series sequence. In the last few years, data mining has emerged and been recognized as a new technology for data analysis. Data mining is the process of discovering potentially valuable patterns, associations, trends, sequences and dependencies in data. Data mining techniques can discover information that many traditional business analysis and statistical techniques fail to deliver. In this paper, we adapt and innovate data mining techniques to analyze time series data. By using data mining techniques, maximal frequent patterns are discovered and used in predicting future sequences or trends, where trends describe the behavior of a sequence. In order to include different types of time series (e.g. irregular and non-systematic), we consider past frequent patterns of the same time sequences (local patterns) and of other dependent time sequences (global patterns). We use the word 'dependent' instead of the word 'similar' to emphasize real-life time series where two time series sequences could be completely different (in values, shapes, etc.), but still react to the same conditions in a dependent way. In this paper, we propose the Dependence Mining Technique, which could be used in predicting time series sequences. The proposed technique consists of three phases: (a) for all time series sequences, generate their trend sequences, (b) discover maximal frequent trend patterns and generate pattern vectors (to keep information of frequent trend patterns), and (c) use trend pattern vectors to predict future time series sequences.
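The first two phases can be pictured with a small sketch. The abstract does not define the trend alphabet or the mining algorithm, so the up/down/steady encoding, the fixed pattern length and both function names below are assumptions; in particular, counting fixed-length patterns is a simplification of mining maximal frequent patterns.

```python
from collections import Counter

def trend_sequence(series, tol=0.0):
    """Encode consecutive changes as 'U' (up), 'D' (down) or 'S' (steady)."""
    symbols = []
    for prev, cur in zip(series, series[1:]):
        diff = cur - prev
        symbols.append("U" if diff > tol else "D" if diff < -tol else "S")
    return "".join(symbols)

def frequent_patterns(trends, length, min_count):
    """Count every trend pattern of a given length and keep the frequent ones."""
    counts = Counter(trends[i:i + length]
                     for i in range(len(trends) - length + 1))
    return {p: c for p, c in counts.items() if c >= min_count}

trends = trend_sequence([1, 2, 3, 2, 3, 4, 3])
patterns = frequent_patterns(trends, length=2, min_count=2)
```

Two series with very different values can yield the same trend string, which is the sense in which the abstract calls series dependent rather than similar.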
Probabilistic topic modeling for the analysis and classification of genomic sequences
2015-01-01
Background Studies on genomic sequences for classification and taxonomic identification have a leading role in the biomedical field and in the analysis of biodiversity. These studies focus on the so-called barcode genes, representing a well-defined region of the whole genome. Recently, alignment-free techniques have been gaining importance because they are able to overcome the drawbacks of sequence alignment techniques. In this paper a new alignment-free method for DNA sequence clustering and classification is proposed. The method is based on k-mer representation and text mining techniques. Methods The presented method is based on Probabilistic Topic Modeling, a statistical technique originally proposed for text documents. Probabilistic topic models are able to find in a document corpus the topics (recurrent themes) characterizing classes of documents. This technique, applied to DNA sequences representing the documents, exploits the frequency of fixed-length k-mers and builds a generative model for a training group of sequences. This generative model, obtained through the Latent Dirichlet Allocation (LDA) algorithm, is then used to classify a large set of genomic sequences. Results and conclusions We performed classification of over 7000 16S DNA barcode sequences taken from the Ribosomal Database Project (RDP) repository, training probabilistic topic models. The proposed method is compared to the RDP tool and the Support Vector Machine (SVM) classification algorithm in an extensive set of trials using both complete sequences and short sequence snippets (from 400 bp to 25 bp). Our method reaches very similar results to the RDP classifier and SVM for complete sequences. The most interesting results are obtained when short sequence snippets are considered. In these conditions the proposed method outperforms RDP and SVM with ultra-short sequences and exhibits a smooth decrease of performance, at every taxonomic level, when the sequence length is decreased. PMID:25916734
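The k-mer representation that turns a DNA sequence into a "document" of words can be sketched in a few lines. This covers only the counting step that would feed a topic model; the function name kmer_counts and the example sequence are assumptions, and the LDA training itself is not shown.

```python
from collections import Counter

def kmer_counts(sequence, k):
    """Count all overlapping substrings of length k in a DNA sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

counts = kmer_counts("ACGTACG", k=3)
```

Each sequence then becomes a bag of k-mer "words", so a corpus of sequences can be handled with the same tools as a corpus of text documents.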
Choi, Jung-Han
2011-01-01
This study aimed to evaluate the effect of different screw-tightening sequences, torques, and methods on the strains generated on an internal-connection implant (Astra Tech) superstructure with good fit. An edentulous mandibular master model and a metal framework directly connected to four parallel implants with a passive fit to each other were fabricated. Six stone casts were made from a dental stone master model by a splinted impression technique to represent a well-fitting situation with the metal framework. Strains generated by four screw-tightening sequences (1-2-3-4, 4-3-2-1, 2-4-3-1, and 2-3-1-4), two torques (10 and 20 Ncm), and two methods (one-step and two-step) were evaluated. In the two-step method, screws were tightened to the initial torque (10 Ncm) in a predetermined screw-tightening sequence and then to the final torque (20 Ncm) in the same sequence. Strains were recorded twice by three strain gauges attached to the framework (superior face midway between abutments). Deformation data were analyzed using multiple analysis of variance at a .05 level of statistical significance. In all stone casts, strains were produced by connection of the superstructure, regardless of screw-tightening sequence, torque, and method. No statistically significant differences in superstructure strains were found based on screw-tightening sequences (range, -409.8 to -413.8 μm/m), torques (-409.7 and -399.1 μm/m), or methods (-399.1 and -410.3 μm/m). Within the limitations of this in vitro study, screw-tightening sequence, torque, and method were not critical factors for the strain generated on a well-fitting internal-connection implant superstructure by the splinted impression technique. Further studies are needed to evaluate the effect of screw-tightening techniques on the preload stress in various different clinical situations.
A note on chaotic unimodal maps and applications.
Zhou, C T; He, X T; Yu, M Y; Chew, L Y; Wang, X G
2006-09-01
Based on the word-lift technique of symbolic dynamics of one-dimensional unimodal maps, we investigate the relation between chaotic kneading sequences and linear maximum-length shift-register sequences. Theoretical and numerical evidence that the set of the maximum-length shift-register sequences is a subset of the set of the universal sequence of one-dimensional chaotic unimodal maps is given. By stabilizing unstable periodic orbits on superstable periodic orbits, we also develop techniques to control the generation of long binary sequences.
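For readers unfamiliar with kneading sequences, the sketch below computes the symbolic itinerary of the critical point for one familiar unimodal map. The choice of the logistic map x -> r*x*(1-x), the L/C/R alphabet and the function name are assumptions for illustration; the paper's word-lift technique is not reproduced here.

```python
def kneading_sequence(r, n):
    """First n symbols of the itinerary of the critical point x = 0.5.

    Each iterate is labelled 'L' if it falls left of the critical point,
    'R' if it falls right of it, and 'C' if it lands exactly on it.
    """
    x = 0.5
    symbols = []
    for _ in range(n):
        x = r * x * (1.0 - x)     # one step of the logistic map
        symbols.append("L" if x < 0.5 else "R" if x > 0.5 else "C")
    return "".join(symbols)

fully_chaotic = kneading_sequence(4.0, 4)
superstable = kneading_sequence(2.0, 3)
```

At r = 2 the critical point is itself a fixed point, which is the simplest case of a superstable periodic orbit; reading L and R as binary digits is what allows such itineraries to be compared with shift-register sequences.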
A high-speed on-chip pseudo-random binary sequence generator for multi-tone phase calibration
NASA Astrophysics Data System (ADS)
Gommé, Liesbeth; Vandersteen, Gerd; Rolain, Yves
2011-07-01
An on-chip reference generator is conceived by adopting the technique of decimating a pseudo-random binary sequence (PRBS) signal into parallel sequences. This is of great benefit when high-speed generation of PRBS and PRBS-derived signals is the objective. The design is implemented in standard CMOS logic, which is available in commercial libraries, to provide the logic functions for the generator. The design allows the user to select the periodicity of the PRBS and the PRBS-derived signals. Characterization of the on-chip generator establishes its performance and reveals promising specifications.
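The decimation idea can be shown in software with a short sketch. It assumes a 4-bit maximal-length register with illustrative taps and a decimation factor of two; none of these parameters, nor the function names, come from the paper, and the hardware design is not modelled.

```python
def prbs15():
    """One period (15 bits) of a 4-bit maximal-length LFSR sequence."""
    state, bits = 0b0001, []
    for _ in range(15):
        bits.append(state & 1)                   # output the lowest bit
        feedback = (state ^ (state >> 1)) & 1    # XOR of bit 0 and bit 1
        state = (state >> 1) | (feedback << 3)
    return bits

def decimate(bits, factor):
    """Split a bit stream into `factor` parallel streams, one per phase."""
    return [bits[phase::factor] for phase in range(factor)]

def interleave(streams):
    """Multiplex equally long parallel streams back into one serial stream."""
    return [b for group in zip(*streams) for b in group]

one_period = prbs15()
full_rate = one_period * 2                # two periods, so the length is even
even, odd = decimate(full_rate, 2)
rebuilt = interleave([even, odd])
```

Each parallel stream runs at half the serial rate, and multiplexing them restores the full-rate sequence. For this maximal-length sequence the even-phase stream is itself a cyclic shift of the original, so the low-rate streams could in principle be produced by copies of the same kind of generator.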
Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment
2011-01-01
Background Many Bioinformatics studies begin with a multiple sequence alignment as the foundation for their research. This is because multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence structure relationships. Results In this paper, we have proposed a Vertical Decomposition with Genetic Algorithm (VDGA) for Multiple Sequence Alignment (MSA). In VDGA, we divide the sequences vertically into two or more subsequences, and then solve them individually using a guide tree approach. Finally, we combine all the subsequences to generate a new multiple sequence alignment. This technique is applied on the solutions of the initial generation and of each child generation within VDGA. We have used two mechanisms to generate an initial population in this research: the first mechanism is to generate guide trees with randomly selected sequences and the second is shuffling the sequences inside such trees. Two different genetic operators have been implemented with VDGA. To test the performance of our algorithm, we have compared it with existing well-known methods, namely PRRP, CLUSTALX, DIALIGN, HMMT, SB_PIMA, ML_PIMA, MULTALIGN, and PILEUP8, and also other methods, based on Genetic Algorithms (GA), such as SAGA, MSA-GA and RBT-GA, by solving a number of benchmark datasets from BAliBase 2.0. Conclusions The experimental results showed that the VDGA with three vertical divisions was the most successful variant for most of the test cases in comparison to other divisions considered with VDGA. The experimental results also confirmed that VDGA outperformed the other methods considered in this research. PMID:21867510
Optimization of conditions to sequence long cDNAs from viruses
USDA-ARS's Scientific Manuscript database
Fourth generation sequencing with the MinION nanopore sequencer provides an opportunity to obtain deep coverage and long reads for single molecules. This will benefit studies on RNA viruses. In the past, Sanger, Illumina, and Ion Torrent sequencing have been utilized to study RNA viruses. Both technique...
USDA-ARS's Scientific Manuscript database
Current technologies with next generation sequencing have revolutionized metagenomics analysis of clinical samples. To achieve the non-selective amplification and recovery of low abundance genetic sequences, a simplified Sequence-Independent, Single-Primer Amplification (SISPA) technique in combinat...
Gullapalli, Rama R.; Desai, Ketaki V.; Santana-Santos, Lucas; Kant, Jeffrey A.; Becich, Michael J.
2012-01-01
The Human Genome Project (HGP) provided the initial draft of mankind's DNA sequence in 2001. The HGP was produced by 23 collaborating laboratories using Sanger sequencing of mapped regions as well as shotgun sequencing techniques in a process that occupied 13 years at a cost of ~$3 billion. Today, Next Generation Sequencing (NGS) techniques represent the next phase in the evolution of DNA sequencing technology at dramatically reduced cost compared to traditional Sanger sequencing. A single laboratory today can sequence the entire human genome in a few days for a few thousand dollars in reagents and staff time. Routine whole exome or even whole genome sequencing of clinical patients is well within the realm of affordability for many academic institutions across the country. This paper reviews current sequencing technology methods and upcoming advancements in sequencing technology as well as challenges associated with data generation, data manipulation and data storage. Implementation of routine NGS data in cancer genomics is discussed along with potential pitfalls in the interpretation of the NGS data. The overarching importance of bioinformatics in the clinical implementation of NGS is emphasized.[7] We also review the issue of physician education which also is an important consideration for the successful implementation of NGS in the clinical workplace. NGS technologies represent a golden opportunity for the next generation of pathologists to be at the leading edge of the personalized medicine approaches coming our way. Often under-emphasized issues of data access and control as well as potential ethical implications of whole genome NGS sequencing are also discussed. Despite some challenges, it's hard not to be optimistic about the future of personalized genome sequencing and its potential impact on patient care and the advancement of knowledge of human biology and disease in the near future. PMID:23248761
USDA-ARS's Scientific Manuscript database
Early stage infections caused by fungal/oomycete spores can remain undetected until signs or symptoms develop. Serological and molecular techniques are currently used for detecting these pathogens. Next-generation sequencing (NGS) has potential as a diagnostic tool, due to the capacity to target mul...
ERIC Educational Resources Information Center
Bowling, Bethany; Zimmer, Erin; Pyatt, Robert E.
2014-01-01
Although the development of next-generation (NextGen) sequencing technologies has revolutionized genomic research and medicine, the incorporation of these topics into the classroom is challenging, given an implied high degree of technical complexity. We developed an easy-to-implement, interactive classroom activity investigating the similarities…
Szymanski, Maciej; Karlowski, Wojciech M
2016-01-01
In eukaryotes, ribosomal 5S rRNAs are products of multigene families organized within clusters of tandemly repeated units. Accumulation of genomic data obtained from a variety of organisms demonstrated that the potential 5S rRNA coding sequences show a large number of variants, often incompatible with folding into a correct secondary structure. Here, we present results of an analysis of a large set of short RNA sequences generated by the next generation sequencing techniques, to address the problem of heterogeneity of the 5S rRNA transcripts in Arabidopsis and identification of potentially functional rRNA-derived fragments.
Cao, Yu; Fanning, Séamus; Proos, Sinéad; Jordan, Kieran; Srikumar, Shabarinath
2017-01-01
The development of next generation sequencing (NGS) techniques has enabled researchers to study and understand the world of microorganisms from broader and deeper perspectives. The contemporary advances in DNA sequencing technologies have not only enabled finer characterization of bacterial genomes but also provided deeper taxonomic identification of complex microbiomes. A microbiome, in its genomic essence, is the combined genetic material of the microorganisms inhabiting an environment, whether that environment is a particular body econiche (e.g., human intestinal contents) or a food manufacturing facility econiche (e.g., floor drain). To date, 16S rDNA sequencing, metagenomics and metatranscriptomics are the three basic sequencing strategies used in the taxonomic identification and characterization of food-related microbiomes. These sequencing strategies have used different NGS platforms for DNA and RNA sequence identification. Traditionally, 16S rDNA sequencing has played a key role in understanding the taxonomic composition of a food-related microbiome. Recently, metagenomic approaches have resulted in improved understanding of a microbiome by providing a species-level/strain-level characterization. Further, metatranscriptomic approaches have contributed to the functional characterization of the complex interactions between different microbial communities within a single microbiome. Many studies have highlighted the use of NGS techniques in investigating the microbiome of fermented foods. However, the utilization of NGS techniques in studying the microbiome of non-fermented foods is limited. This review provides a brief overview of the advances in DNA sequencing chemistries as the technology progressed through the first, next and third generations and highlights how NGS provided a deeper understanding of food-related microbiomes with special focus on non-fermented foods. PMID:29033905
Unlocking hidden genomic sequence
Keith, Jonathan M.; Cochran, Duncan A. E.; Lala, Gita H.; Adams, Peter; Bryant, Darryn; Mitchelson, Keith R.
2004-01-01
Despite the success of conventional Sanger sequencing, significant regions of many genomes still present major obstacles to sequencing. Here we propose a novel approach with the potential to alleviate a wide range of sequencing difficulties. The technique involves extracting target DNA sequence from variants generated by introduction of random mutations. The introduction of mutations does not destroy original sequence information, but distributes it amongst multiple variants. Some of these variants lack problematic features of the target and are more amenable to conventional sequencing. The technique has been successfully demonstrated with mutation levels up to an average 18% base substitution and has been used to read previously intractable poly(A), AT-rich and GC-rich motifs. PMID:14973330
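Why random mutations distribute rather than destroy the sequence information can be seen from a toy consensus calculation. The abstract does not describe the reconstruction algorithm, so the per-position majority vote below is a swapped-in simplification; it assumes substitution-only variants that are already aligned and of equal length, and the example reads are invented for illustration.

```python
from collections import Counter

def consensus(variants):
    """Recover a sequence by majority vote at each aligned position."""
    if len({len(v) for v in variants}) != 1:
        raise ValueError("variants must be non-empty and of equal length")
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*variants))

# Hypothetical variants of the target ACGTACGT, each with one substitution.
variants = ["TCGTACGT", "ACCTACGT", "ACGTAGGT", "ACGTACGA"]
recovered = consensus(variants)
```

Because each mutation hits a different position in a different variant, every position is still read correctly by most variants, so the original sequence survives in the set as a whole.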
Dilliott, Allison A; Farhan, Sali M K; Ghani, Mahdi; Sato, Christine; Liang, Eric; Zhang, Ming; McIntyre, Adam D; Cao, Henian; Racacho, Lemuel; Robinson, John F; Strong, Michael J; Masellis, Mario; Bulman, Dennis E; Rogaeva, Ekaterina; Lang, Anthony; Tartaglia, Carmela; Finger, Elizabeth; Zinman, Lorne; Turnbull, John; Freedman, Morris; Swartz, Rick; Black, Sandra E; Hegele, Robert A
2018-04-04
Next-generation sequencing (NGS) is quickly revolutionizing how research into the genetic determinants of constitutional disease is performed. The technique is highly efficient with millions of sequencing reads being produced in a short time span and at relatively low cost. Specifically, targeted NGS is able to focus investigations to genomic regions of particular interest based on the disease of study. Not only does this further reduce costs and increase the speed of the process, but it lessens the computational burden that often accompanies NGS. Although targeted NGS is restricted to certain regions of the genome, preventing identification of potential novel loci of interest, it can be an excellent technique when faced with a phenotypically and genetically heterogeneous disease, for which there are previously known genetic associations. Because of the complex nature of the sequencing technique, it is important to closely adhere to protocols and methodologies in order to achieve sequencing reads of high coverage and quality. Further, once sequencing reads are obtained, a sophisticated bioinformatics workflow is utilized to accurately map reads to a reference genome, to call variants, and to ensure the variants pass quality metrics. Variants must also be annotated and curated based on their clinical significance, which can be standardized by applying the American College of Medical Genetics and Genomics Pathogenicity Guidelines. The methods presented herein will display the steps involved in generating and analyzing NGS data from a targeted sequencing panel, using the ONDRISeq neurodegenerative disease panel as a model, to identify variants that may be of clinical significance.
ChIP-seq: advantages and challenges of a maturing technology.
Park, Peter J
2009-10-01
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is a technique for genome-wide profiling of DNA-binding proteins, histone modifications or nucleosomes. Owing to the tremendous progress in next-generation sequencing technology, ChIP-seq offers higher resolution, less noise and greater coverage than its array-based predecessor ChIP-chip. With the decreasing cost of sequencing, ChIP-seq has become an indispensable tool for studying gene regulation and epigenetic mechanisms. In this Review, I describe the benefits and challenges in harnessing this technique with an emphasis on issues related to experimental design and data analysis. ChIP-seq experiments generate large quantities of data, and effective computational analysis will be crucial for uncovering biological mechanisms.
Computerized technique for recording board defect data
R. Bruce Anderson; R. Edward Thomas; Charles J. Gatchell; Neal D. Bennett
1993-01-01
A computerized technique for recording board defect data has been developed that is faster and more accurate than manual techniques. The lumber database generated by this technique is a necessary input to computer simulation models that estimate potential cutting yields from various lumber breakdown sequences. The technique allows collection of detailed information...
Ship Speed Retrieval From Single Channel TerraSAR-X Data
NASA Astrophysics Data System (ADS)
Soccorsi, Matteo; Lehner, Susanne
2010-04-01
A method to estimate the speed of a moving ship is presented. The technique, introduced in Kirscht (1998), is extended to marine application and validated on TerraSAR-X High-Resolution (HR) data. The generation of a sequence of single-look SAR images from a single-channel image corresponds to an image time series with reduced resolution. This allows applying change detection techniques on the time series to evaluate the velocity components in range and azimuth of the ship. The evaluation of the displacement vector of a moving target in consecutive images of the sequence allows the estimation of the azimuth velocity component. The range velocity component is estimated by evaluating the variation of the signal amplitude during the sequence. In order to apply the technique on TerraSAR-X Spot Light (SL) data a further processing step is needed. The phase has to be corrected as presented in Eineder et al. (2009) due to the SL acquisition mode; otherwise the image sequence cannot be generated. The analysis, validated where possible by the Automatic Identification System (AIS), was performed in the framework of the ESA project MARISS.
Archer, Stuart K; Shirokikh, Nikolay E; Preiss, Thomas
2015-04-01
Most applications for RNA-seq require the depletion of abundant transcripts to gain greater coverage of the underlying transcriptome. The sequences to be targeted for depletion depend on application and species and in many cases may not be supported by commercial depletion kits. This unit describes a method for generating RNA-seq libraries that incorporates probe-directed degradation (PDD), which can deplete any unwanted sequence set, with the low-bias split-adapter method of library generation (although many other library generation methods are in principle compatible). The overall strategy is suitable for applications requiring customized sequence depletion or where faithful representation of fragment ends and lack of sequence bias is paramount. We provide guidelines to rapidly design specific probes against the target sequence, and a detailed protocol for library generation using the split-adapter method including several strategies for streamlining the technique and reducing adapter dimer content. Copyright © 2015 John Wiley & Sons, Inc.
Bioinspired second harmonic generation
NASA Astrophysics Data System (ADS)
Sonay, Ali Y.; Pantazis, Periklis
2017-07-01
Second harmonic generation (SHG) is a microscopic technique applicable to a broad spectrum of biological and medical imaging due to its excellent photostability, high signal-to-noise ratio (SNR) and narrow emission profile. Current SHG microscopy techniques rely on two main contrast modalities. These are endogenous SHG generated by tissue structures, which is clinically relevant but cannot be targeted to another location, or SHG nanoprobes, inorganic nanocrystals that can be directed to proteins and cells of interest, but cannot be applied for clinical imaging due to their chemical composition. Here we analyzed the SHG signal generated by large-scale peptide assemblies. Our results show that the sequence of peptides plays an important role in both the morphology and the SHG signal of the peptide assemblies. Changing the peptide sequence allows confinement of a large number of peptides to smaller voxels, generating an intense SHG signal. With miniaturization of these peptides and proper functionalization strategies, such bioinspired nanoparticles could emerge as valuable tools for clinical imaging.
Counting of oligomers in sequences generated by Markov chains for DNA motif discovery.
Shan, Gao; Zheng, Wei-Mou
2009-02-01
By means of the technique of the imbedded Markov chain, an efficient algorithm is proposed to exactly calculate the first and second moments of word counts and the probability for a word to occur at least once in random texts generated by a Markov chain. A generating function is introduced directly from the imbedded Markov chain to derive asymptotic approximations for the problem. Two Z-scores, one based on the number of sequences with hits and the other on the total number of word hits in a set of sequences, are examined for discovery of motifs on a set of promoter sequences extracted from the A. thaliana genome. Source code is available at http://www.itp.ac.cn/zheng/oligo.c.
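The "at least once" probability mentioned in this abstract can be illustrated with a small dynamic program. The following is only a sketch under stated assumptions, not the authors' algorithm or their oligo.c code: it assumes a first-order Markov chain over the letters, tracks the pair (last letter, matching-automaton state), and treats a completed match as absorbing. The function names `matching_automaton` and `prob_at_least_once` are hypothetical.

```python
def matching_automaton(word, alphabet):
    """delta[state][letter] = length of the longest prefix of `word`
    that is a suffix of word[:state] + letter."""
    m = len(word)
    delta = []
    for state in range(m + 1):
        row = {}
        for letter in alphabet:
            candidate = word[:state] + letter
            k = min(m, len(candidate))
            while k > 0 and candidate[-k:] != word[:k]:
                k -= 1
            row[letter] = k
        delta.append(row)
    return delta


def prob_at_least_once(word, n, init, trans):
    """Probability that `word` occurs at least once in a text of length n.

    init[a]     : probability that the first letter is a
    trans[a][b] : probability that letter b follows letter a
    """
    if n <= 0 or not word:
        return 0.0
    alphabet = sorted(init)
    m = len(word)
    delta = matching_automaton(word, alphabet)

    # dist[(letter, state)] = probability mass that has NOT yet seen the word
    dist = {}
    for a in alphabet:
        s = delta[0][a]
        if s < m:
            dist[(a, s)] = dist.get((a, s), 0.0) + init[a]

    for _ in range(n - 1):
        nxt = {}
        for (a, s), p in dist.items():
            for b in alphabet:
                t = delta[s][b]
                if t < m:  # reaching state m means the word occurred
                    nxt[(b, t)] = nxt.get((b, t), 0.0) + p * trans[a][b]
        dist = nxt

    return 1.0 - sum(dist.values())
```

With a uniform chain (every entry of `init` and `trans` equal to 0.25 over A, C, G, T), the result for the word "AA" in a text of length 3 can be checked by hand with inclusion-exclusion: 1/16 + 1/16 - 1/64 = 7/64.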
Analyzing Immunoglobulin Repertoires
Chaudhary, Neha; Wesemann, Duane R.
2018-01-01
Somatic assembly of T cell receptor and B cell receptor (BCR) genes produces a vast diversity of lymphocyte antigen recognition capacity. The advent of efficient high-throughput sequencing of lymphocyte antigen receptor genes has recently generated unprecedented opportunities for exploration of adaptive immune responses. With these opportunities have come significant challenges in understanding the analysis techniques that most accurately reflect underlying biological phenomena. In this regard, sample preparation and sequence analysis techniques, which have largely been borrowed and adapted from other fields, continue to evolve. Here, we review current methods and challenges of library preparation, sequencing and statistical analysis of lymphocyte receptor repertoire studies. We discuss the general steps in the process of immune repertoire generation including sample preparation, platforms available for sequencing, processing of sequencing data, measurable features of the immune repertoire, and the statistical tools that can be used for analysis and interpretation of the data. Because BCR analysis harbors additional complexities, such as immunoglobulin (Ig) (i.e., antibody) gene somatic hypermutation and class switch recombination, the emphasis of this review is on Ig/BCR sequence analysis. PMID:29593723
Zhou, Bin; Lin, Xudong; Wang, Wei; Halpin, Rebecca A.; Bera, Jayati; Stockwell, Timothy B.; Barr, Ian G.
2014-01-01
Although human influenza B virus (IBV) is a significant human pathogen, its great genetic diversity has limited our ability to universally amplify the entire genome for subsequent sequencing or vaccine production. The generation of sequence data via next-generation approaches and the rapid cloning of viral genes are critical for basic research, diagnostics, antiviral drugs, and vaccines to combat IBV. To overcome the difficulty of amplifying the diverse and ever-changing IBV genome, we developed and optimized techniques that amplify the complete segmented negative-sense RNA genome from any IBV strain in a single tube/well (IBV genomic amplification [IBV-GA]). Amplicons for >1,000 diverse IBV genomes from different sample types (e.g., clinical specimens) were generated and sequenced using this robust technology. These approaches are sensitive, robust, and sequence independent (i.e., universally amplify past, present, and future IBVs), which facilitates next-generation sequencing and advanced genomic diagnostics. Importantly, special terminal sequences engineered into the optimized IBV-GA2 products also enable ligation-free cloning to rapidly generate reverse-genetics plasmids, which can be used for the rescue of recombinant viruses and/or the creation of vaccine seed stock. PMID:24501036
Analysis of Litopenaeus vannamei Transcriptome Using the Next-Generation DNA Sequencing Technique
Li, Chaozheng; Weng, Shaoping; Chen, Yonggui; Yu, Xiaoqiang; Lü, Ling; Zhang, Haiqing; He, Jianguo; Xu, Xiaopeng
2012-01-01
Background: Pacific white shrimp (Litopenaeus vannamei), the major species of farmed shrimp in the world, has been attracting extensive study, which requires more and more genome background knowledge. The currently available transcriptome data of L. vannamei are insufficient for research requirements, and have not been adequately assembled and annotated. Methodology/Principal Findings: This is the first study that used a next-generation high-throughput DNA sequencing technique, the Solexa/Illumina GA II method, to analyze the transcriptome from whole bodies of L. vannamei larvae. More than 2.4 Gb of raw data were generated, and 109,169 unigenes with a mean length of 396 bp were assembled using the SOAP denovo software. 73,505 unigenes (>200 bp) with good quality sequences were selected and subjected to annotation analysis, among which 37.80% can be matched in NCBI Nr database, 37.3% matched in Swissprot, and 44.1% matched in TrEMBL. Using the BLAST and BLAST2Go software, 11,153 unigenes were classified into 25 Clusters of Orthologous Groups of proteins (COG) categories, 8171 unigenes were assigned into 51 Gene ontology (GO) functional groups, and 18,154 unigenes were divided into 220 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. To preliminarily verify part of the results of assembly and annotation, 12 assembled unigenes that are homologous to many embryo development-related genes were chosen and subjected to RT-PCR for electrophoresis and Sanger sequencing analyses, and to real-time PCR for expression profile analyses during embryo development. Conclusions/Significance: The L. vannamei transcriptome analyzed using the next-generation sequencing technique enriches the information on L. vannamei genes, which will facilitate our understanding of the genome background of crustaceans, and promote studies on L. vannamei. PMID:23071809
Agricultural biodiversity in the post-genomics era
USDA-ARS's Scientific Manuscript database
The toolkit available for assessing and utilizing biological diversity within agricultural systems is rapidly expanding. In particular, genome and transcriptome re-sequencing as well as genome complexity reduction techniques are gaining popularity as the cost of generating short read sequence data d...
DNA Base-Calling from a Nanopore Using a Viterbi Algorithm
Timp, Winston; Comer, Jeffrey; Aksimentiev, Aleksei
2012-01-01
Nanopore-based DNA sequencing is the most promising third-generation sequencing method. It has superior read length, speed, and sample requirements compared with state-of-the-art second-generation methods. However, base-calling still presents substantial difficulty because the resolution of the technique is limited compared with the measured signal/noise ratio. Here we demonstrate a method to decode 3-bp-resolution nanopore electrical measurements into a DNA sequence using a Hidden Markov model. This method shows tremendous potential for accuracy (∼98%), even with a poor signal/noise ratio. PMID:22677395
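To make the decoding step concrete, here is a minimal sketch of Viterbi decoding for a generic hidden Markov model, working in log space. It is not the authors' base-caller: the abstract describes 3-bp-resolution measurements, whereas the toy usage below assumes single-base hidden states and two hypothetical discretized current levels ("low", "high"), chosen purely for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for `observations`."""

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # score[t][s] = best log-probability of any path ending in state s at time t
    score = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]  # back[t][s] = predecessor of s on that best path

    for obs in observations[1:]:
        prev = score[-1]
        cur, ptr = {}, {}
        for s in states:
            best_prev = max(states, key=lambda r: prev[r] + log(trans_p[r][s]))
            cur[s] = prev[best_prev] + log(trans_p[best_prev][s]) + log(emit_p[s][obs])
            ptr[s] = best_prev
        score.append(cur)
        back.append(ptr)

    # Trace back from the best final state.
    last = max(states, key=lambda s: score[-1][s])
    path = [last]
    for ptr in reversed(back[1:]):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Toy usage: two hypothetical bases with noisy two-level current readings.
bases = ["A", "C"]
start = {"A": 0.5, "C": 0.5}
trans = {"A": {"A": 0.5, "C": 0.5}, "C": {"A": 0.5, "C": 0.5}}
emit = {"A": {"low": 0.9, "high": 0.1}, "C": {"low": 0.1, "high": 0.9}}
decoded = viterbi(["low", "high", "high", "low"], bases, start, trans, emit)
```

Because the toy transition matrix is uniform, the decoded path simply follows the most likely emission at each position; a realistic model would encode which k-mer can follow which in the transition probabilities.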
Navigating Microbiological Food Safety in the Era of Whole-Genome Sequencing
Nasheri, Neda; Petronella, Nicholas; Pagotto, Franco
2016-01-01
SUMMARY The epidemiological investigation of a foodborne outbreak, including identification of related cases, source attribution, and development of intervention strategies, relies heavily on the ability to subtype the etiological agent at a high enough resolution to differentiate related from nonrelated cases. Historically, several different molecular subtyping methods have been used for this purpose; however, emerging techniques, such as single nucleotide polymorphism (SNP)-based techniques, that use whole-genome sequencing (WGS) offer a resolution that was previously not possible. With WGS, unlike traditional subtyping methods that lack complete information, data can be used to elucidate phylogenetic relationships and disease-causing lineages can be tracked and monitored over time. The subtyping resolution and evolutionary context provided by WGS data allow investigators to connect related illnesses that would be missed by traditional techniques. The added advantage of data generated by WGS is that these data can also be used for secondary analyses, such as virulence gene detection, antibiotic resistance gene profiling, synteny comparisons, mobile genetic element identification, and geographic attribution. In addition, several software packages are now available to generate in silico results for traditional molecular subtyping methods from the whole-genome sequence, allowing for efficient comparison with historical databases. Metagenomic approaches using next-generation sequencing have also been successful in the detection of nonculturable foodborne pathogens. This review addresses state-of-the-art techniques in microbial WGS and analysis and then discusses how this technology can be used to help support food safety investigations. Retrospective outbreak investigations using WGS are presented to provide organism-specific examples of the benefits, and challenges, associated with WGS in comparison to traditional molecular subtyping techniques. PMID:27559074
Quantum-Sequencing: Fast electronic single DNA molecule sequencing
NASA Astrophysics Data System (ADS)
Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant
2014-03-01
A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective, single-molecule sequencing method. Here, we present the first demonstration of a unique "electronic fingerprint" of all nucleotides (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic state of the nucleobases shifts depending on the pH, with the most distinct states identified at acidic pH. We also demonstrate identification of single nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of the beta lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.
Tian, Yao; Smith, David Roy
2016-05-01
Thousands of mitochondrial genomes have been sequenced, but there are comparatively few available mitochondrial transcriptomes. This might soon be changing. High-throughput RNA sequencing (RNA-Seq) techniques have made it fast and cheap to generate massive amounts of mitochondrial transcriptomic data. Here, we explore the utility of RNA-Seq for assembling mitochondrial genomes and studying their expression patterns. Specifically, we investigate the mitochondrial transcriptomes from Polytomella non-photosynthetic green algae, which have among the smallest, most reduced mitochondrial genomes from the Archaeplastida as well as fragmented rRNA-coding regions, palindromic genes, and linear chromosomes with telomeres. Isolation of whole genomic RNA from the four known Polytomella species followed by Illumina paired-end sequencing generated enough mitochondrial-derived reads to easily recover almost-entire mitochondrial genome sequences. Read-mapping and coverage statistics also gave insights into Polytomella mitochondrial transcriptional architecture, revealing polycistronic transcripts and the expression of telomeres and palindromic genes. Ultimately, RNA-Seq is a promising, cost-effective technique for studying mitochondrial genetics, but it does have drawbacks, which are discussed. One of its greatest potentials, as shown here, is that it can be used to generate near-complete mitochondrial genome sequences, which could be particularly useful in situations where there is a lack of available mtDNA data. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Skipping Strategy (SS) for Initial Population of Job-Shop Scheduling Problem
NASA Astrophysics Data System (ADS)
Abdolrazzagh-Nezhad, M.; Nababan, E. B.; Sarim, H. M.
2018-03-01
Generating the initial population in the job-shop scheduling problem (JSSP) is an essential step toward obtaining a near-optimal solution. Techniques used to solve JSSP are computationally demanding. A skipping strategy (SS) is employed to acquire the initial population after the sequence of jobs on machines and the sequence of operations (expressed in Plates-jobs and mPlates-jobs) are determined. The proposed technique is applied to benchmark datasets and the results are compared with those of other initialization techniques. It is shown that the initial population obtained from the SS approach could generate an optimal solution.
Nissan, J; Gross, M; Shifman, A; Assif, D
2001-07-01
Unfavorable stress distribution and occlusal overload have been reported to result in failures ranging from screw loosening to loss of osseointegration. The purpose of this study was to assess the effect of different tightening forces and sequences, with different operators, on stresses generated on an accurately fitting implant superstructure on multiple working casts made with a splinted impression technique. The effects of different tightening forces (10 and 20 Ncm) were assessed with the use of 30 stone casts made from a metal master model with a splinted impression technique. Stresses generated were recorded by 4 strain gauges attached to the superior surface of the master framework. A multiple analysis of variance with repeated measures was performed to test for significant differences among the groups. Tightening force values at 10 Ncm ranged from 150.43 to 256 Ncm. At 20 Ncm, microstrain values ranged from 149.43 to 284.37 Ncm. Microstrain values related to the sequence of tightening ranged from 150.8 to 308.43 Ncm (left to right) and 154.63 to 274.80 Ncm (right to left). For the different operators, microstrain values ranged from 100.13 to 206.07 Ncm. No statistically significant differences among the variables of tightening force, tightening sequence, and operators were found (P >.05). The interaction between groups and strain gauges was also found to be nonsignificant (P >.05). The potential of variable tightening force and tightening sequence to generate unfavorable preload stresses can be minimized through use of the splinted impression technique, which ensures an accurately fitting superstructure.
DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.
Sucher, Nikolaus J; Hennell, James R; Carles, Maria C
2012-01-01
DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.
Meher, J K; Meher, P K; Dash, G N; Raval, M K
2012-01-01
The first step in the gene identification problem based on genomic signal processing is to convert character strings into numerical sequences. These numerical sequences are then analysed spectrally or using digital filtering techniques for the period-3 peaks, which are present in exons (coding areas) and absent in introns (non-coding areas). In this paper, we have shown that single-indicator sequences can be generated by encoding schemes based on physico-chemical properties. Two new methods are proposed for generating single-indicator sequences based on hydration energy and dipole moments. The proposed methods produce a high peak at exon locations and effectively suppress false exons (intron regions having a greater peak than exon regions), resulting in a high discriminating factor, sensitivity and specificity.
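The period-3 measurement described here can be sketched in a few lines: encode the bases as numbers, then evaluate the discrete Fourier transform at the frequency corresponding to period 3. This is a rough illustration only. The abstract does not give the hydration-energy or dipole-moment values, so the mapping below is a plain binary indicator for one base (an assumption), and the function name `period3_power` is hypothetical.

```python
import cmath


def period3_power(seq, encode):
    """Normalized spectral power of the encoded sequence at period 3 (k = N/3)."""
    x = [encode[base] for base in seq]
    n = len(x)
    # DFT coefficient at angular frequency 2*pi/3, i.e. one cycle per codon.
    coeff = sum(v * cmath.exp(-2j * cmath.pi * i / 3) for i, v in enumerate(x))
    return abs(coeff) ** 2 / n


# Assumed single-indicator encoding: 1 for G, 0 otherwise. A physico-chemical
# scheme would substitute real-valued property values for each base here.
g_indicator = {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0}

codon_like = "GCA" * 20  # G recurs at every third position
no_period3 = "GC" * 30   # period 2, so no period-3 component

strong = period3_power(codon_like, g_indicator)
weak = period3_power(no_period3, g_indicator)
```

In practice the measure is usually computed over a sliding window along the genome, so that peaks can be localized to candidate exons; the fixed sequences above only show the contrast between a period-3 signal and its absence.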
Using chaos to generate variations on movement sequences
NASA Astrophysics Data System (ADS)
Bradley, Elizabeth; Stuart, Joshua
1998-12-01
We describe a method for introducing variations into predefined motion sequences using a chaotic symbol-sequence reordering technique. A progression of symbols representing the body positions in a dance piece, martial arts form, or other motion sequence is mapped onto a chaotic trajectory, establishing a symbolic dynamics that links the movement sequence and the attractor structure. A variation on the original piece is created by generating a trajectory with slightly different initial conditions, inverting the mapping, and using special corpus-based graph-theoretic interpolation schemes to smooth any abrupt transitions. Sensitive dependence guarantees that the variation is different from the original; the attractor structure and the symbolic dynamics guarantee that the two resemble one another in both aesthetic and mathematical senses.
Molecular taxonomic techniques such as DNA barcoding offer interesting new capabilities for studying community biodiversity for applications like biological monitoring. Beyond DNA barcoding, new DNA sequencing technologies (i.e. Next-Generation Sequencing) present even greater po...
Mathias, Patrick C; Turner, Emily H; Scroggins, Sheena M; Salipante, Stephen J; Hoffman, Noah G; Pritchard, Colin C; Shirts, Brian H
2016-03-01
To apply techniques for ancestry and sex computation from next-generation sequencing (NGS) data as an approach to confirm sample identity and detect sample processing errors. We combined a principal component analysis method with k-nearest neighbors classification to compute the ancestry of patients undergoing NGS testing. By combining this calculation with X chromosome copy number data, we determined the sex and ancestry of patients for comparison with self-report. We also modeled the sensitivity of this technique in detecting sample processing errors. We applied this technique to 859 patient samples with reliable self-report data. Our k-nearest neighbors ancestry screen had an accuracy of 98.7% for patients reporting a single ancestry. Visual inspection of principal component plots was consistent with self-report in 99.6% of single-ancestry and mixed-ancestry patients. Our model demonstrates that approximately two-thirds of potential sample swaps could be detected in our patient population using this technique. Patient ancestry can be estimated from NGS data incidentally sequenced in targeted panels, enabling an inexpensive quality control method when coupled with patient self-report. © American Society for Clinical Pathology, 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
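The classification step can be sketched as follows, assuming the principal component coordinates have already been computed elsewhere. Everything here is illustrative rather than the authors' pipeline: the reference points, the labels "A" and "B", the choice of k = 3, and the helper names `knn_label` and `is_possible_swap` are all hypothetical.

```python
import math
from collections import Counter


def knn_label(query, references, k=3):
    """Majority label among the k reference points nearest to `query`.

    references: list of (coordinates, label) pairs in principal-component space.
    """
    nearest = sorted(references, key=lambda ref: math.dist(query, ref[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def is_possible_swap(pred_ancestry, pred_sex, reported_ancestry, reported_sex):
    """Flag a sample whose computed ancestry or sex disagrees with self-report."""
    return pred_ancestry != reported_ancestry or pred_sex != reported_sex


# Hypothetical reference panel: two well-separated clusters in PC1/PC2 space.
panel = [
    ((0.0, 0.0), "A"), ((0.1, 0.0), "A"), ((0.0, 0.1), "A"),
    ((5.0, 5.0), "B"), ((5.1, 5.0), "B"), ((5.0, 5.1), "B"),
]
predicted = knn_label((0.05, 0.05), panel)
flagged = is_possible_swap(predicted, "F", "B", "F")
```

A mismatch flag of this kind only indicates that a sample deserves a second look; as the abstract notes, a swap between two patients with the same sex and ancestry would not be detected this way.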
DNA base-calling from a nanopore using a Viterbi algorithm.
Timp, Winston; Comer, Jeffrey; Aksimentiev, Aleksei
2012-05-16
Nanopore-based DNA sequencing is the most promising third-generation sequencing method. It has superior read length, speed, and sample requirements compared with state-of-the-art second-generation methods. However, base-calling still presents substantial difficulty because the resolution of the technique is limited compared with the measured signal/noise ratio. Here we demonstrate a method to decode 3-bp-resolution nanopore electrical measurements into a DNA sequence using a Hidden Markov model. This method shows tremendous potential for accuracy (~98%), even with a poor signal/noise ratio. Copyright © 2012 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Generating constrained randomized sequences: item frequency matters.
French, Robert M; Perruchet, Pierre
2009-11-01
All experimental psychologists understand the importance of randomizing lists of items. However, randomization is generally constrained, and these constraints (in particular, not allowing immediately repeated items), which are designed to eliminate particular biases, frequently engender others. We describe a simple Monte Carlo randomization technique that solves a number of these problems. However, in many experimental settings, we are concerned not only with the number and distribution of items but also with the number and distribution of transitions between items. The algorithm mentioned above provides no control over this. We therefore introduce a simple technique that uses transition tables for generating correctly randomized sequences. We present an analytic method of producing item-pair frequency tables and item-pair transitional probability tables when immediate repetitions are not allowed. We illustrate these difficulties and how to overcome them, with reference to a classic article on word segmentation in infants. Finally, we provide free access to an Excel file that allows users to generate transition tables with up to 10 different item types, as well as to generate appropriately distributed randomized sequences of any length without immediately repeated elements. This file is freely available from http://leadserv.u-bourgogne.fr/IMG/xls/TransitionMatrix.xls.
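As a rough illustration of the kind of constrained randomization discussed here, the sketch below reshuffles a list until no item immediately repeats, then tallies the item-to-item transitions so their distribution can be inspected. This rejection-sampling approach is an assumption made for illustration; it is not necessarily the Monte Carlo technique or the transition-table method of the article, and the function names are hypothetical.

```python
import random
from collections import Counter


def shuffle_without_repeats(items, rng, max_tries=10000):
    """Reshuffle until no two adjacent items are equal (rejection sampling)."""
    seq = list(items)
    for _ in range(max_tries):
        rng.shuffle(seq)
        if all(a != b for a, b in zip(seq, seq[1:])):
            return seq
    raise RuntimeError("no repeat-free ordering found within max_tries")


def transition_counts(seq):
    """Count how often each ordered pair (current item, next item) occurs."""
    return Counter(zip(seq, seq[1:]))


rng = random.Random(2009)      # fixed seed so the example is reproducible
trial_items = list("AAAABBBBCCCC")
order = shuffle_without_repeats(trial_items, rng)
pairs = transition_counts(order)
```

Inspecting `pairs` over many generated sequences is one way to see the abstract's point: when item frequencies are unequal, forbidding immediate repeats skews the transition frequencies even though the item counts stay fixed.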
Transforming clinical microbiology with bacterial genome sequencing.
Didelot, Xavier; Bowden, Rory; Wilson, Daniel J; Peto, Tim E A; Crook, Derrick W
2012-09-01
Whole-genome sequencing of bacteria has recently emerged as a cost-effective and convenient approach for addressing many microbiological questions. Here, we review the current status of clinical microbiology and how it has already begun to be transformed by using next-generation sequencing. We focus on three essential tasks: identifying the species of an isolate, testing its properties, such as resistance to antibiotics and virulence, and monitoring the emergence and spread of bacterial pathogens. We predict that the application of next-generation sequencing will soon be sufficiently fast, accurate and cheap to be used in routine clinical microbiology practice, where it could replace many complex current techniques with a single, more efficient workflow.
Transforming clinical microbiology with bacterial genome sequencing
2016-01-01
Whole genome sequencing of bacteria has recently emerged as a cost-effective and convenient approach for addressing many microbiological questions. Here we review the current status of clinical microbiology and how it has already begun to be transformed by the use of next-generation sequencing. We focus on three essential tasks: identifying the species of an isolate, testing its properties such as resistance to antibiotics and virulence, and monitoring the emergence and spread of bacterial pathogens. The application of next-generation sequencing will soon be sufficiently fast, accurate and cheap to be used in routine clinical microbiology practice, where it could replace many complex current techniques with a single, more efficient workflow. PMID:22868263
Wood, Henry M; Belvedere, Ornella; Conway, Caroline; Daly, Catherine; Chalkley, Rebecca; Bickerdike, Melissa; McKinley, Claire; Egan, Phil; Ross, Lisa; Hayward, Bruce; Morgan, Joanne; Davidson, Leslie; MacLennan, Ken; Ong, Thian K; Papagiannopoulos, Kostas; Cook, Ian; Adams, David J; Taylor, Graham R; Rabbitts, Pamela
2010-08-01
The use of next-generation sequencing technologies to produce genomic copy number data has recently been described. Most approaches, however, rely on optimal starting DNA, and are therefore unsuitable for the analysis of formalin-fixed paraffin-embedded (FFPE) samples, which largely precludes the analysis of many tumour series. We have sought to challenge the limits of this technique with regard to quality and quantity of starting material and the depth of sequencing required. We confirm that the technique can be used to interrogate DNA from cell lines, fresh frozen material and FFPE samples to assess copy number variation. We show that as little as 5 ng of DNA is needed to generate a copy number karyogram, and follow this up with data from a series of FFPE biopsies and surgical samples. We have used various levels of sample multiplexing to demonstrate the adjustable resolution of the methodology, depending on the number of samples and available resources. We also demonstrate reproducibility by use of replicate samples and comparison with microarray-based comparative genomic hybridization (aCGH) and digital PCR. This technique can be valuable in both the analysis of routine diagnostic samples and in examining large repositories of fixed archival material.
From genomics to functional markers in the era of next-generation sequencing.
Salgotra, R K; Gupta, B B; Stewart, C N
2014-03-01
The availability of complete genome sequences, along with other genomic resources for Arabidopsis, rice, pigeon pea, soybean and other crops, has revolutionized our understanding of the genetic make-up of plants. Next-generation DNA sequencing (NGS) has facilitated single nucleotide polymorphism discovery in plants. Functionally characterized sequences can be identified and functional markers (FMs) for important traits can be developed with ever-increasing ease. FMs are derived from sequence polymorphisms found in allelic variants of a functional gene. Linkage disequilibrium-based association mapping and homologous recombinants have been developed for identification of "perfect" markers for their use in crop improvement practices. Compared with many other molecular markers, FMs derived from functionally characterized gene sequences using NGS techniques provide opportunities to develop high-yielding plant genotypes resistant to various stresses at a fast pace.
Peck, Michelle A; Sturk-Andreaggi, Kimberly; Thomas, Jacqueline T; Oliver, Robert S; Barritt-Ross, Suzanne; Marshall, Charla
2018-05-01
Generating mitochondrial genome (mitogenome) data from reference samples in a rapid and efficient manner is critical to harnessing the greater power of discrimination of the entire mitochondrial DNA (mtDNA) marker. The method of long-range target enrichment, Nextera XT library preparation, and Illumina sequencing on the MiSeq is a well-established technique for generating mitogenome data from high-quality samples. To this end, a validation was conducted for this mitogenome method processing up to 24 samples simultaneously along with analysis in the CLC Genomics Workbench and utilizing the AQME (AFDIL-QIAGEN mtDNA Expert) tool to generate forensic profiles. This validation followed the Federal Bureau of Investigation's Quality Assurance Standards (QAS) for forensic DNA testing laboratories and the Scientific Working Group on DNA Analysis Methods (SWGDAM) validation guidelines. The evaluation of control DNA, non-probative samples, blank controls, mixtures, and nonhuman samples demonstrated the validity of this method. Specifically, the sensitivity was established at ≥25 pg of nuclear DNA input for accurate mitogenome profile generation. Unreproducible low-level variants were observed in samples with low amplicon yields. Further, variant quality was shown to be a useful metric for identifying sequencing error and crosstalk. Success of this method was demonstrated with a variety of reference sample substrates and extract types. These studies further demonstrate the advantages of using NGS techniques by highlighting the quantitative nature of heteroplasmy detection. The results presented herein from more than 175 samples processed in ten sequencing runs, show this mitogenome sequencing method and analysis strategy to be valid for the generation of reference data. Copyright © 2018 Elsevier B.V. All rights reserved.
Optimized scheduling technique of null subcarriers for peak power control in 3GPP LTE downlink.
Cho, Soobum; Park, Sang Kyu
2014-01-01
Orthogonal frequency division multiple access (OFDMA) is a key multiple access technique for the long term evolution (LTE) downlink. However, high peak-to-average power ratio (PAPR) can cause the degradation of power efficiency. The well-known PAPR reduction technique, dummy sequence insertion (DSI), can be a realistic solution because of its structural simplicity. However, the large usage of subcarriers for the dummy sequences may decrease the transmitted data rate in the DSI scheme. In this paper, a novel DSI scheme is applied to the LTE system. Firstly, we obtain the null subcarriers in single-input single-output (SISO) and multiple-input multiple-output (MIMO) systems, respectively; then, optimized dummy sequences are inserted into the obtained null subcarriers. Simulation results show that the Walsh-Hadamard transform (WHT) sequence is the best for the dummy sequence and that the ratio of 16 to 20 for the WHT and randomly generated sequences has the maximum PAPR reduction performance. The near-optimal number of iterations is derived to prevent exhaustive iteration. It is also shown that there is no bit error rate (BER) degradation with the proposed technique in the LTE downlink system.
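The quantity being minimized, and the basic idea of dummy sequence insertion, can be sketched as follows. This is a toy illustration under assumptions, not the scheme evaluated in the paper: the symbol has only 8 subcarriers, the candidate dummy sequences are arbitrary values rather than Walsh-Hadamard sequences, and the function names are hypothetical.

```python
import cmath


def idft(freq):
    """Inverse DFT: frequency-domain subcarrier values to time-domain samples."""
    n = len(freq)
    return [
        sum(f * cmath.exp(2j * cmath.pi * k * t / n) for k, f in enumerate(freq)) / n
        for t in range(n)
    ]


def papr(freq):
    """Peak-to-average power ratio (linear, not dB) of one OFDM symbol."""
    power = [abs(v) ** 2 for v in idft(freq)]
    return max(power) / (sum(power) / len(power))


def best_dummy(data, null_positions, candidates):
    """Try each candidate dummy sequence in the null subcarriers and
    return (lowest PAPR, corresponding symbol)."""
    best = None
    for dummy in candidates:
        symbol = list(data)
        for pos, value in zip(null_positions, dummy):
            symbol[pos] = value
        score = papr(symbol)
        if best is None or score < best[0]:
            best = (score, symbol)
    return best


# Toy symbol: six data subcarriers followed by two null subcarriers.
data = [1, 1, 1, 1, 1, 1, 0, 0]
candidates = [(1, 1), (-1, -1), (1, -1), (0, 0)]
lowest_papr, chosen_symbol = best_dummy(data, [6, 7], candidates)
```

Two limiting cases are useful sanity checks: a symbol with every subcarrier set to 1 concentrates all its energy in a single time sample, giving a PAPR equal to the number of subcarriers, while a single active subcarrier has constant envelope and a PAPR of 1.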
Optimized Scheduling Technique of Null Subcarriers for Peak Power Control in 3GPP LTE Downlink
Park, Sang Kyu
2014-01-01
Orthogonal frequency division multiple access (OFDMA) is a key multiple access technique for the long term evolution (LTE) downlink. However, high peak-to-average power ratio (PAPR) can cause the degradation of power efficiency. The well-known PAPR reduction technique, dummy sequence insertion (DSI), can be a realistic solution because of its structural simplicity. However, the large usage of subcarriers for the dummy sequences may decrease the transmitted data rate in the DSI scheme. In this paper, a novel DSI scheme is applied to the LTE system. Firstly, we obtain the null subcarriers in single-input single-output (SISO) and multiple-input multiple-output (MIMO) systems, respectively; then, optimized dummy sequences are inserted into the obtained null subcarrier. Simulation results show that Walsh-Hadamard transform (WHT) sequence is the best for the dummy sequence and the ratio of 16 to 20 for the WHT and randomly generated sequences has the maximum PAPR reduction performance. The number of near optimal iteration is derived to prevent exhausted iterations. It is also shown that there is no bit error rate (BER) degradation with the proposed technique in LTE downlink system. PMID:24883376
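As a rough illustration of the quantity this abstract sets out to reduce, the sketch below computes the PAPR of a single OFDM symbol from its subcarrier values. It does not implement DSI or the null-subcarrier scheduling of the paper; the function names, the 16-subcarrier symbol size, and the absence of oversampling are all assumptions made for the example.

```python
import cmath
import math


def idft(freq):
    """Naive inverse DFT: time-domain samples from subcarrier symbols."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def papr_db(freq):
    """Peak-to-average power ratio of one OFDM symbol, in dB."""
    power = [abs(x) ** 2 for x in idft(freq)]
    mean_power = sum(power) / len(power)
    return 10 * math.log10(max(power) / mean_power)


n = 16  # hypothetical number of subcarriers

# Identical symbols on every subcarrier add coherently at t = 0,
# which gives the largest possible PAPR for n subcarriers.
all_ones = [1 + 0j] * n

# A single active subcarrier is a constant-envelope complex exponential.
single = [0j] * n
single[3] = 1 + 0j

worst = papr_db(all_ones)
flat = papr_db(single)
```

The two inputs bracket the range: the coherent case reaches 10·log10(16) dB, while the single-subcarrier case stays at 0 dB. A dummy-sequence scheme would search for values on unused subcarriers that pull a data symbol's PAPR toward the lower end.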
New tool to assemble repetitive regions using next-generation sequencing data
NASA Astrophysics Data System (ADS)
Kuśmirek, Wiktor; Nowak, Robert M.; Neumann, Łukasz
2017-08-01
Next-generation sequencing techniques produce large amounts of sequencing data. Some parts of the genome are composed of repetitive DNA sequences, which are very problematic for existing genome assemblers. We propose a modification of the DNA assembly algorithm that uses the relative frequency of reads to properly reconstruct repetitive sequences. The new approach was implemented and tested; as a demonstration of the capability of our software, we present some results for model organisms. The implementation uses a three-layer software architecture in which the presentation layer, data processing layer, and data storage layer are kept separate. Source code, a demo application with a web interface, and additional data are available at the project web page: http://dnaasm.sourceforge.net.
Chemical genomic profiling via barcode sequencing to predict compound mode of action
Piotrowski, Jeff S.; Simpkins, Scott W.; Li, Sheena C.; Deshpande, Raamesh; McIlwain, Sean; Ong, Irene; Myers, Chad L.; Boone, Charlie; Andersen, Raymond J.
2015-01-01
Chemical genomics is an unbiased, whole-cell approach to characterizing novel compounds to determine mode of action and cellular target. Our version of this technique is built upon barcoded deletion mutants of Saccharomyces cerevisiae and has been adapted to a high-throughput methodology using next-generation sequencing. Here we describe the steps to generate a chemical genomic profile from a compound of interest, and how to use this information to predict molecular mechanism and targets of bioactive compounds. PMID:25618354
On the decomposition of synchronous state machines using sequence invariant state machines
NASA Technical Reports Server (NTRS)
Hebbalalu, K.; Whitaker, S.; Cameron, K.
1992-01-01
This paper presents a few techniques for the decomposition of Synchronous State Machines of medium to large sizes into smaller component machines. The methods are based on the nature of the transitions and sequences of states in the machine and on the number and variety of inputs to the machine. The results of the decomposition, and of using the Sequence Invariant State Machine (SISM) Design Technique for generating the component machines, include great ease and quickness in the design and implementation processes. Furthermore, there is increased flexibility in making modifications to the original design leading to negligible re-design time.
Nanopore-based fourth-generation DNA sequencing technology.
Feng, Yanxiao; Zhang, Yuechuan; Ying, Cuifeng; Wang, Deqiang; Du, Chunlei
2015-02-01
Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.
Lu, Emily; Elizondo-Riojas, Miguel-Angel; Chang, Jeffrey T; Volk, David E
2014-06-10
Next-generation sequencing results from bead-based aptamer libraries have demonstrated that traditional DNA/RNA alignment software is insufficient. This is particularly true for X-aptamers containing specialty bases (W, X, Y, Z, ...) that are identified by special encoding. Thus, we sought an automated program that uses the inherent design scheme of bead-based X-aptamers to create a hypothetical reference library and Markov modeling techniques to provide improved alignments. Aptaligner provides this feature as well as length error and noise level cutoff features, is parallelized to run on multiple central processing units (cores), and sorts sequences from a single chip into projects and subprojects.
Marshall, Owen J; Southall, Tony D; Cheetham, Seth W; Brand, Andrea H
2016-09-01
This protocol is an extension to: Nat. Protoc. 2, 1467-1478 (2007); doi:10.1038/nprot.2007.148; published online 7 June 2007. The ability to profile transcription and chromatin binding in a cell-type-specific manner is a powerful aid to understanding cell-fate specification and cellular function in multicellular organisms. We recently developed targeted DamID (TaDa) to enable genome-wide, cell-type-specific profiling of DNA- and chromatin-binding proteins in vivo without cell isolation. As a protocol extension, this article describes substantial modifications to an existing protocol, and it offers additional applications. TaDa builds upon DamID, a technique for detecting genome-wide DNA-binding profiles of proteins, by coupling it with the GAL4 system in Drosophila to enable both temporal and spatial resolution. TaDa ensures that Dam-fusion proteins are expressed at very low levels, thus avoiding toxicity and potential artifacts from overexpression. The modifications to the core DamID technique presented here also increase the speed of sample processing and throughput, and adapt the method to next-generation sequencing technology. TaDa is robust, reproducible and highly sensitive. Compared with other methods for cell-type-specific profiling, the technique requires no cell-sorting, cross-linking or antisera, and binding profiles can be generated from as few as 10,000 total induced cells. By profiling the genome-wide binding of RNA polymerase II (Pol II), TaDa can also identify transcribed genes in a cell-type-specific manner. Here we describe a detailed protocol for carrying out TaDa experiments and preparing the material for next-generation sequencing. Although we developed TaDa in Drosophila, it should be easily adapted to other organisms with an inducible expression system. Once transgenic animals are obtained, the entire experimental procedure, from collecting tissue samples to generating sequencing libraries, can be accomplished within 5 d.
Random sequences generation through optical measurements by phase-shifting interferometry
NASA Astrophysics Data System (ADS)
François, M.; Grosges, T.; Barchiesi, D.; Erra, R.; Cornet, A.
2012-04-01
The development of new techniques for producing random sequences with a high level of security is a challenging topic of research in modern cryptography. The proposed method is based on the measurement by phase-shifting interferometry of the speckle signals of the interaction between light and structures. We show how the combination of amplitude and phase distributions (maps) under a numerical process can produce random sequences. The produced sequences satisfy all the statistical requirements of randomness and can be used in cryptographic schemes.
Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids
NASA Astrophysics Data System (ADS)
Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant
2014-03-01
Tunneling microscopy and spectroscopy have been used extensively in the physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials, and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition, we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.
Single-Molecule Electrical Random Resequencing of DNA and RNA
NASA Astrophysics Data System (ADS)
Ohshiro, Takahito; Matsubara, Kazuki; Tsutsui, Makusu; Furuhashi, Masayuki; Taniguchi, Masateru; Kawai, Tomoji
2012-07-01
Two paradigm shifts in DNA sequencing technologies--from bulk to single molecules and from optical to electrical detection--are expected to realize label-free, low-cost DNA sequencing that does not require PCR amplification. It will lead to development of high-throughput third-generation sequencing technologies for personalized medicine. Although nanopore devices have been proposed as third-generation DNA-sequencing devices, a significant milestone in these technologies has been attained by demonstrating a novel technique for resequencing DNA using electrical signals. Here we report single-molecule electrical resequencing of DNA and RNA using a hybrid method of identifying single-base molecules via tunneling currents and random sequencing. Our method reads sequences of nine types of DNA oligomers. The complete sequence of 5'-UGAGGUA-3' from the let-7 microRNA family was also identified by creating a composite of overlapping fragment sequences, which was randomly determined using tunneling current conducted by single-base molecules as they passed between a pair of nanoelectrodes.
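The idea of building a complete sequence as a composite of overlapping fragment reads can be sketched with a simple greedy overlap merge. This is an illustrative toy, not the authors' procedure: the fragment strings, the function names, and the greedy strategy are assumptions, and real tunneling-current reads would also carry base-calling errors that this sketch ignores.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more; return the remaining contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Hypothetical overlapping reads covering the let-7 sequence 5'-UGAGGUA-3'.
reads = ["UGAG", "GAGGU", "GGUA"]
contigs = greedy_assemble(reads)
```

With these three reads the merge order is UGAG + GAGGU, then the result + GGUA, leaving the single contig UGAGGUA. The sketch does not handle a read that is wholly contained inside another, and greedy merging can mis-assemble repeats.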
The sequence of sequencers: The history of sequencing DNA
Heather, James M.; Chain, Benjamin
2016-01-01
Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. PMID:26554401
Pulse Compression Techniques for Laser Generated Ultrasound
NASA Technical Reports Server (NTRS)
Anastasi, R. F.; Madaras, E. I.
1999-01-01
Laser-generated ultrasound for nondestructive evaluation has an optical power density limit due to rapid high heating that causes material damage. This damage threshold limits the generated ultrasound amplitude, which impacts nondestructive evaluation inspection capability. To increase ultrasound signal levels and improve the ultrasound signal-to-noise ratio without exceeding laser power limitations, it is possible to use pulse compression techniques. The approach illustrated here uses a 150 mW laser diode modulated with a pseudo-random sequence and signal correlation. Results demonstrate the successful generation of ultrasonic bulk waves in aluminum and graphite-epoxy composite materials using a modulated low-power laser diode and illustrate ultrasound bandwidth control.
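The principle of pulse compression with a pseudo-random code can be sketched in a few lines: a known ±1 sequence is buried in noise of comparable amplitude, and cross-correlating the received trace with the code concentrates its energy into a single peak at the arrival time. The code length (127 chips), the delay, the noise level, and the seeded generator are all assumptions for the example and are not taken from the report.

```python
import random


def cross_correlate(received, code):
    """Sliding dot product of the received trace with the known code."""
    n = len(code)
    return [
        sum(received[lag + i] * code[i] for i in range(n))
        for lag in range(len(received) - n + 1)
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical pseudo-random modulation code of +/-1 chips.
code = [rng.choice((-1.0, 1.0)) for _ in range(127)]

# Simulated received trace: unit-variance noise with the code added
# at an unknown delay, so each sample has roughly 0 dB signal-to-noise.
true_delay = 40
trace = [rng.gauss(0.0, 1.0) for _ in range(300)]
for i, chip in enumerate(code):
    trace[true_delay + i] += chip

corr = cross_correlate(trace, code)
estimated_delay = max(range(len(corr)), key=corr.__getitem__)
```

Correlation sums 127 aligned chips coherently while the noise adds incoherently, so the peak stands well above the background even though no individual sample does. This is what allows a low peak power source to stand in for a single high-energy pulse.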
Fritsch, Leonie; Fischer, Rainer; Wambach, Christoph; Dudek, Max; Schillberg, Stefan; Schröper, Florian
2015-08-01
Simple and reliable, high-throughput techniques to detect the zygosity of transgenic events in plants are valuable for biotechnology and plant breeding companies seeking robust genotyping data for the assessment of new lines and the monitoring of breeding programs. We show that next-generation sequencing (NGS) applied to short PCR products spanning the transgene integration site provides accurate zygosity data that are more robust and reliable than those generated by PCR-based methods. The NGS reads covered the 5' border of the transgenic events (incorporating part of the transgene and the flanking genomic DNA), or the genomic sequences flanking the unfilled transgene integration site at the wild-type locus. We compared the NGS method to competitive real-time PCR with transgene-specific and wild-type-specific primer/probe pairs, one pair matching the 5' genomic flanking sequence and 5' part of the transgene and the other matching the unfilled transgene integration site. Although both NGS and real-time PCR provided useful zygosity data, the NGS technique was favorable because it needed fewer optimization steps. It also provided statistically more-reliable evidence for the presence of each allele because each product was often covered by more than 100 reads. The NGS method is also more suitable for the genotyping of large panels of plants because up to 80 million reads can be produced in one sequencing run. Our novel method is therefore ideal for the rapid and accurate genotyping of large numbers of samples.
DNA-based random number generation in security circuitry.
Gearheart, Christy M; Arazi, Benjamin; Rouchka, Eric C
2010-06-01
DNA-based circuit design is an area of research in which traditional silicon-based technologies are replaced by naturally occurring phenomena taken from biochemistry and molecular biology. This research focuses on further developing DNA-based methodologies to mimic digital data manipulation. While exhibiting fundamental principles, this work was done in conjunction with the vision that DNA-based circuitry, when the technology matures, will form the basis for a tamper-proof security module, revolutionizing the meaning and concept of tamper-proofing and possibly preventing it altogether based on accurate scientific observations. A paramount part of such a solution would be self-generation of random numbers. A novel prototype schema employs solid phase synthesis of oligonucleotides for random construction of DNA sequences; temporary storage and retrieval is achieved through plasmid vectors. A discussion of how to evaluate sequence randomness is included, as well as how these techniques are applied to a simulation of the random number generation circuitry. Simulation results show generated sequences successfully pass three selected NIST random number generation tests specified for security applications.
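The abstract does not say which three NIST tests were selected, so the sketch below shows only the frequency (monobit) test from NIST SP 800-22 as a representative example of such a check. The two-bits-per-base encoding (A=00, C=01, G=10, T=11) and the function names are assumptions made for illustration.

```python
import math

# Hypothetical mapping from nucleotides to bit pairs.
BASE_BITS = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "T": (1, 1)}


def dna_to_bits(sequence):
    """Flatten a DNA string into a list of bits using BASE_BITS."""
    bits = []
    for base in sequence.upper():
        bits.extend(BASE_BITS[base])
    return bits


def monobit_p_value(bits):
    """NIST SP 800-22 frequency (monobit) test p-value."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)  # ones count as +1, zeros as -1
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


balanced = monobit_p_value(dna_to_bits("ACGT" * 25))
biased = monobit_p_value(dna_to_bits("A" * 100))
```

A sequence passes at the usual significance level when the p-value is at least 0.01. The balanced example passes even though it is perfectly periodic, which is why the monobit test is only ever used alongside other tests, such as runs or block-frequency tests.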
DNA nanomapping using CRISPR-Cas9 as a programmable nanoparticle.
Mikheikin, Andrey; Olsen, Anita; Leslie, Kevin; Russell-Pavier, Freddie; Yacoot, Andrew; Picco, Loren; Payton, Oliver; Toor, Amir; Chesney, Alden; Gimzewski, James K; Mishra, Bud; Reed, Jason
2017-11-21
Progress in whole-genome sequencing using short-read (e.g., <150 bp), next-generation sequencing technologies has reinvigorated interest in high-resolution physical mapping to fill technical gaps that are not well addressed by sequencing. Here, we report two technical advances in DNA nanotechnology and single-molecule genomics: (1) we describe a labeling technique (CRISPR-Cas9 nanoparticles) for high-speed AFM-based physical mapping of DNA and (2) the first successful demonstration of using DVD optics to image DNA molecules with high-speed AFM. As a proof of principle, we used this new "nanomapping" method to detect and precisely map BCL2-IGH translocations present in lymph node biopsies of follicular lymphoma patients. This HS-AFM "nanomapping" technique can be complementary to both sequencing and other physical mapping approaches.
The use of PacBio and Hi-C data in de novo assembly of the goat genome
USDA-ARS's Scientific Manuscript database
Generating de novo reference genome assemblies for non-model organisms is a laborious task that often requires a large amount of data from several sequencing platforms and cytogenetic surveys. By using PacBio sequence data and new library creation techniques, we present a de novo, high quality refer...
Watson, Christopher M.; Crinnion, Laura A.; Gurgel‐Gianetti, Juliana; Harrison, Sally M.; Daly, Catherine; Antanavicuite, Agne; Lascelles, Carolina; Markham, Alexander F.; Pena, Sergio D. J.; Bonthron, David T.
2015-01-01
Autozygosity mapping is a powerful technique for the identification of rare, autosomal recessive, disease‐causing genes. The ease with which this category of disease gene can be identified has greatly increased through the availability of genome‐wide SNP genotyping microarrays and subsequently of exome sequencing. Although these methods have simplified the generation of experimental data, its analysis, particularly when disparate data types must be integrated, remains time consuming. Moreover, the huge volume of sequence variant data generated from next generation sequencing experiments opens up the possibility of using these data instead of microarray genotype data to identify disease loci. To allow these two types of data to be used in an integrated fashion, we have developed AgileVCFMapper, a program that performs both the mapping of disease loci by SNP genotyping and the analysis of potentially deleterious variants using exome sequence variant data, in a single step. This method does not require microarray SNP genotype data, although analysis with a combination of microarray and exome genotype data enables more precise delineation of disease loci, due to superior marker density and distribution. PMID:26037133
Considerations for standardizing predictive molecular pathology for cancer prognosis.
Fiorentino, Michelangelo; Scarpelli, Marina; Lopez-Beltran, Antonio; Cheng, Liang; Montironi, Rodolfo
2017-01-01
Molecular tests that were once ancillary to the core business of cyto-histopathology are becoming the most relevant workload in pathology departments after histopathology/cytopathology and before autopsies. This has resulted from innovations in molecular biology techniques, which have developed at an incredibly fast pace. Areas covered: Most of the current widely used techniques in molecular pathology such as FISH, direct sequencing, pyrosequencing, and allele-specific PCR will be replaced by massive parallel sequencing that will not be considered next generation, but rather, will be considered to be current generation sequencing. The pre-analytical steps of molecular techniques such as DNA extraction or sample preparation will be largely automated. Moreover, all the molecular pathology instruments will be part of an integrated workflow that traces the sample from extraction to the analytical steps until the results are reported; these steps will be guided by expert laboratory information systems. In situ hybridization and immunohistochemistry for quantification will be largely digitalized as much as histology will be mostly digitalized rather than viewed using microscopy. Expert commentary: This review summarizes the technical and regulatory issues concerning the standardization of molecular tests in pathology. A vision of the future perspectives of technological changes is also provided.
Single-Cell RNA Sequencing of Glioblastoma Cells.
Sen, Rajeev; Dolgalev, Igor; Bayin, N Sumru; Heguy, Adriana; Tsirigos, Aris; Placantonakis, Dimitris G
2018-01-01
Single-cell RNA sequencing (sc-RNASeq) is a recently developed technique used to evaluate the transcriptome of individual cells. As opposed to conventional RNASeq in which entire populations are sequenced in bulk, sc-RNASeq can be beneficial when trying to better understand gene expression patterns in markedly heterogeneous populations of cells or when trying to identify transcriptional signatures of rare cells that may be underrepresented when using conventional bulk RNASeq. In this method, we describe the generation and analysis of cDNA libraries from single patient-derived glioblastoma cells using the C1 Fluidigm system. The protocol details the use of the C1 integrated fluidics circuit (IFC) for capturing, imaging and lysing cells; performing reverse transcription; and generating cDNA libraries that are ready for sequencing and analysis.
Bontems, Franck; Baerlocher, Loic; Mehenni, Sabrina; Bahechar, Ilham; Farinelli, Laurent; Dosch, Roland
2011-02-18
Fish models like medaka, stickleback or zebrafish provide a valuable resource to study vertebrate genes. However, finding genetic variants e.g. mutations in the genome is still arduous. Here we used a combination of microarray capturing and next generation sequencing to identify the affected gene in the mozartkugelp11cv (mzlp11cv) mutant zebrafish. We discovered a 31-bp deletion in macf1 demonstrating the potential of this technique to efficiently isolate mutations in a vertebrate genome. Copyright © 2011 Elsevier Inc. All rights reserved.
Next-generation sequencing: hype and hope for development of personalized radiation therapy?
Tinhofer, Ingeborg; Niehr, Franziska; Konschak, Robert; Liebs, Sandra; Munz, Matthias; Stenzinger, Albrecht; Weichert, Wilko; Keilholz, Ulrich; Budach, Volker
2015-08-28
The introduction of next-generation sequencing (NGS) in the field of cancer research has boosted worldwide efforts of genome-wide personalized oncology aiming at identifying predictive biomarkers and novel actionable targets. Despite considerable progress in understanding the molecular biology of distinct cancer entities by the use of this revolutionary technology and despite contemporaneous innovations in drug development, translation of NGS findings into improved concepts for cancer treatment remains a challenge. The aim of this article is to describe briefly the NGS platforms for DNA sequencing and, in more detail, key achievements and unresolved hurdles. A special focus will be placed on potential clinical applications of this innovative technique in the field of radiation oncology.
Zhang, Ran; Yin, Yinliang; Zhang, Yujun; Li, Kexin; Zhu, Hongxia; Gong, Qin; Wang, Jianwu; Hu, Xiaoxiang; Li, Ning
2012-01-01
As the number of transgenic livestock increases, reliable detection and molecular characterization of transgene integration sites and copy number are crucial not only for interpreting the relationship between the integration site and the specific phenotype but also for commercial and economic demands. However, the ability of conventional PCR techniques to detect incomplete and multiple integration events is limited, making it technically challenging to characterize transgenes. Next-generation sequencing has enabled cost-effective, routine and widespread high-throughput genomic analysis. Here, we demonstrate the use of next-generation sequencing to extensively characterize cattle harboring a 150-kb human lactoferrin transgene that was initially analyzed by chromosome walking without success. Using this approach, the sites upstream and downstream of the target gene integration site in the host genome were identified at the single nucleotide level. The sequencing result was verified by event-specific PCR for the integration sites and FISH for the chromosomal location. Sequencing depth analysis revealed that multiple copies of the incomplete target gene and the vector backbone were present in the host genome. Upon integration, complex recombination was also observed between the target gene and the vector backbone. These findings indicate that next-generation sequencing is a reliable and accurate approach for the molecular characterization of the transgene sequence, integration sites and copy number in transgenic species. PMID:23185606
Generative adversarial networks for brain lesion detection
NASA Astrophysics Data System (ADS)
Alex, Varghese; Safwan, K. P. Mohammed; Chennamsetty, Sai Saketh; Krishnamurthi, Ganapathy
2017-02-01
Manual segmentation of brain lesions from Magnetic Resonance Images (MRI) is cumbersome and introduces errors due to inter-rater variability. This paper introduces a semi-supervised technique for detection of brain lesions from MRI using Generative Adversarial Networks (GANs). A GAN comprises a Generator network and a Discriminator network, which are trained simultaneously with the objective of one bettering the other. The networks were trained using non-lesion patches (n=13,000) from 4 different MR sequences. The networks were trained on the BraTS dataset, and patches were extracted from regions excluding the tumor region. The Generator network generates data by modeling the underlying probability distribution of the training data (P_Data). The Discriminator learns the posterior probability P(Label | Data) by classifying training data and generated data as "Real" or "Fake", respectively. The Generator, upon learning the joint distribution, produces images/patches such that the performance of the Discriminator on them is random, i.e. P(Label | Data = GeneratedData) = 0.5. During testing, the Discriminator assigns posterior probability values close to 0.5 for patches from non-lesion regions, while patches centered on lesions arise from a different distribution (P_Lesion) and hence are assigned lower posterior probability values by the Discriminator. On the test set (n=14), the proposed technique achieves a whole tumor Dice score of 0.69, sensitivity of 91% and specificity of 59%. Additionally, the Generator network was capable of generating non-lesion patches from various MR sequences.
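The three figures reported at the end of this abstract (Dice score, sensitivity, specificity) all derive from the same confusion counts between a predicted mask and a reference mask. A minimal sketch follows, using flat lists of 0/1 labels; the function names and the toy masks are assumptions and do not reproduce the paper's data.

```python
def confusion_counts(pred, truth):
    """Return (tp, fp, fn, tn) for two equal-length binary label lists."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth):
    """Dice score, sensitivity and specificity of a binary segmentation."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),  # fraction of lesion voxels found
        "specificity": tn / (tn + fp),  # fraction of healthy voxels kept
    }


# Toy example: 8 voxels, 3 of which are lesion in the reference mask.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = segmentation_metrics(pred, truth)
```

Here the counts are 2 true positives, 1 false positive, 1 false negative, and 4 true negatives. A detector that over-predicts lesion, as the reported 91% sensitivity against 59% specificity suggests, raises sensitivity at the cost of specificity and of the Dice score.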
Reading biological processes from nucleotide sequences
NASA Astrophysics Data System (ADS)
Murugan, Anand
Cellular processes have traditionally been investigated by techniques of imaging and biochemical analysis of the molecules involved. The recent rapid progress in our ability to manipulate and read nucleic acid sequences gives us direct access to the genetic information that directs and constrains biological processes. While sequence data is being used widely to investigate genotype-phenotype relationships and population structure, here we use sequencing to understand biophysical mechanisms. We present work on two different systems. First, in chapter 2, we characterize the stochastic genetic editing mechanism that produces diverse T-cell receptors in the human immune system. We do this by inferring statistical distributions of the underlying biochemical events that generate T-cell receptor coding sequences from the statistics of the observed sequences. This inferred model quantitatively describes the potential repertoire of T-cell receptors that can be produced by an individual, providing insight into its potential diversity and the probability of generation of any specific T-cell receptor. Then in chapter 3, we present work on understanding the functioning of regulatory DNA sequences in both prokaryotes and eukaryotes. Here we use experiments that measure the transcriptional activity of large libraries of mutagenized promoters and enhancers and infer models of the sequence-function relationship from this data. For the bacterial promoter, we infer a physically motivated 'thermodynamic' model of the interaction of DNA-binding proteins and RNA polymerase determining the transcription rate of the downstream gene. For the eukaryotic enhancers, we infer heuristic models of the sequence-function relationship and use these models to find synthetic enhancer sequences that optimize inducibility of expression. 
Both projects demonstrate the utility of sequence information in conjunction with sophisticated statistical inference techniques for dissecting underlying biophysical mechanisms.
NASA Astrophysics Data System (ADS)
Holden, Todd; Marchese, P.; Tremberger, G., Jr.; Cheung, E.; Subramaniam, R.; Sullivan, R.; Schneider, P.; Flamholz, A.; Lieberman, D.; Cheung, T.
2008-08-01
We have characterized function related DNA sequences of various organisms using informatics techniques, including fractal dimension calculation, nucleotide and multi-nucleotide statistics, and sequence fluctuation analysis. Our analysis shows trends which differentiate extremophile from non-extremophile organisms, which could be reproduced in extraterrestrial life. Among the systems studied are radiation repair genes, genes involved in thermal shocks, and genes involved in drug resistance. We also evaluate sequence level changes that have occurred during short term evolution (several thousand generations) under extreme conditions.
Engineering of a DNA Polymerase for Direct m6A Sequencing.
Aschenbrenner, Joos; Werner, Stephan; Marchand, Virginie; Adam, Martina; Motorin, Yuri; Helm, Mark; Marx, Andreas
2018-01-08
Methods for the detection of RNA modifications are of fundamental importance for advancing epitranscriptomics. N6-methyladenosine (m6A) is the most abundant RNA modification in mammalian mRNA and is involved in the regulation of gene expression. Current detection techniques are laborious and rely on antibody-based enrichment of m6A-containing RNA prior to sequencing, since m6A modifications are generally "erased" during reverse transcription (RT). To overcome the drawbacks associated with indirect detection, we aimed to generate novel DNA polymerase variants for direct m6A sequencing. Therefore, we developed a screen to evolve an RT-active KlenTaq DNA polymerase variant that sets a mark for N6-methylation. We identified a mutant that exhibits increased misincorporation opposite m6A compared to unmodified A. Application of the generated DNA polymerase in next-generation sequencing allowed the identification of m6A sites directly from the sequencing data of untreated RNA samples. © 2017 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.
Ordulu, Zehra; Wong, Kristen E; Currall, Benjamin B; Ivanov, Andrew R; Pereira, Shahrin; Althari, Sara; Gusella, James F; Talkowski, Michael E; Morton, Cynthia C
2014-05-01
With recent rapid advances in genomic technologies, precise delineation of structural chromosome rearrangements at the nucleotide level is becoming increasingly feasible. In this era of "next-generation cytogenetics" (i.e., an integration of traditional cytogenetic techniques and next-generation sequencing), a consensus nomenclature is essential for accurate communication and data sharing. Currently, nomenclature for describing the sequencing data of these aberrations is lacking. Herein, we present a system called Next-Gen Cytogenetic Nomenclature, which is concordant with the International System for Human Cytogenetic Nomenclature (2013). This system starts with the alignment of rearrangement sequences by BLAT or BLAST (alignment tools) and arrives at a concise and detailed description of chromosomal changes. To facilitate usage and implementation of this nomenclature, we are developing a program designated BLA(S)T Output Sequence Tool of Nomenclature (BOSToN), a demonstrative version of which is accessible online. A standardized characterization of structural chromosomal rearrangements is essential both for research analyses and for application in the clinical setting. Copyright © 2014 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
DNA barcoding Indian freshwater fishes.
Lakra, Wazir Singh; Singh, M; Goswami, Mukunda; Gopalakrishnan, A; Lal, K K; Mohindra, V; Sarkar, U K; Punia, P P; Singh, K V; Bhatt, J P; Ayyappan, S
2016-11-01
DNA barcoding is a promising technique for species identification using a short mitochondrial DNA sequence of the cytochrome c oxidase I (COI) gene. In the present study, DNA barcodes were generated from 72 species of freshwater fish covering the Orders Cypriniformes, Siluriformes, Perciformes, Synbranchiformes, and Osteoglossiformes representing 50 genera and 19 families. All the samples were collected from diverse sites except the species endemic to a particular location. Most of the barcoded species were represented by multiple specimens. A total of 284 COI sequences were generated. After amplification and sequencing of a 700 base pair fragment of COI, primers were trimmed, which invariably generated a 655 base pair barcode sequence. The average Kimura two-parameter (K2P) distances within species, genera, families, and orders were 0.40%, 9.60%, 13.10%, and 17.16%, respectively. DNA barcodes discriminated congeneric species without any confusion. The study strongly validated the efficiency of COI as an ideal marker for DNA barcoding of Indian freshwater fishes.
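The Kimura two-parameter distance reported above can be illustrated with a short sketch. It uses the standard K2P formula d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions between two aligned sequences. The function name and the handling of gaps and ambiguous bases are assumptions made for illustration and are not taken from the study.

```python
import math

PURINES = {"A", "G"}

def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper: sites where either sequence has a gap or an
    ambiguous base are skipped (an assumption, not the study's rule).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitions (P)
    q = transversions / compared  # proportion of transversions (Q)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Averaging this distance over all specimen pairs within a species, genus, family or order would give summary values of the kind quoted in the abstract, assuming the sequences have already been aligned.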
SU-E-J-90: MRI-Based Treatment Simulation and Patient Setup for Radiation Therapy of Brain Cancer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Y; Cao, M; Han, F
2014-06-01
Purpose: Traditional radiation therapy of cancer is heavily dependent on CT. CT provides excellent depiction of the bones but lacks good soft tissue contrast, which makes contouring difficult. Often, MRIs are fused with CT to take advantage of their superior soft tissue contrast. Such an approach has drawbacks. It is desirable to perform treatment simulation entirely based on MRI. To achieve MR-based simulation for radiation therapy, bone imaging is an important challenge because of the low MR signal intensity from bone due to its ultra-short T2 and T1, which presents difficulty for both dose calculation and patient setup in terms of digitally reconstructed radiograph (DRR) generation. Current solutions will either require manual bone contouring or multiple MR scans. We present a technique to generate DRR using MRI with an Ultra Short Echo Time (UTE) sequence which is applicable to both OBI and ExacTrac 2D patient setup. Methods: Seven brain cancer patients were scanned at 1.5 Tesla using a radial UTE sequence. The sequence acquires two images at two different echo times. The two images were processed using in-house software. The resultant bone images were subsequently loaded into commercial systems to generate DRRs. Simulation and patient clinical on-board images were used to evaluate 2D patient setup with MRI-DRRs. Results: The majority of bones are well visualized in all patients. The fused image of patient CT with the MR bone image demonstrates the accuracy of automatic bone identification using our technique. The generated DRR is of good quality. Accuracy of 2D patient setup by using MRI-DRR is comparable to CT-based 2D patient setup. Conclusion: This study shows the potential of DRR generation with a single MR sequence. Further work will be needed on MR sequence development and post-processing procedure to achieve robust MR bone imaging for other human sites in addition to brain.
The sequence of sequencers: The history of sequencing DNA.
Heather, James M; Chain, Benjamin
2016-01-01
Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Beltman, Joost B; Urbanus, Jos; Velds, Arno; van Rooij, Nienke; Rohr, Jan C; Naik, Shalin H; Schumacher, Ton N
2016-04-02
Next generation sequencing (NGS) of amplified DNA is a powerful tool to describe genetic heterogeneity within cell populations that can both be used to investigate the clonal structure of cell populations and to perform genetic lineage tracing. For applications in which both abundant and rare sequences are biologically relevant, the relatively high error rate of NGS techniques complicates data analysis, as it is difficult to distinguish rare true sequences from spurious sequences that are generated by PCR or sequencing errors. This issue, for instance, applies to cellular barcoding strategies that aim to follow the amount and type of offspring of single cells, by supplying these with unique heritable DNA tags. Here, we use genetic barcoding data from the Illumina HiSeq platform to show that straightforward read threshold-based filtering of data is typically insufficient to filter out spurious barcodes. Importantly, we demonstrate that specific sequencing errors occur at an approximately constant rate across different samples that are sequenced in parallel. We exploit this observation by developing a novel approach to filter out spurious sequences. Application of our new method demonstrates its value in the identification of true sequences amongst spurious sequences in biological data sets.
Cell-free DNA and next-generation sequencing in the service of personalized medicine for lung cancer
Bennett, Catherine W.; Berchem, Guy; Kim, Yeoun Jin; El-Khoury, Victoria
2016-01-01
Personalized medicine has emerged as the future of cancer care to ensure that patients receive individualized treatment specific to their needs. In order to provide such care, molecular techniques that enable oncologists to diagnose, treat, and monitor tumors are necessary. In the field of lung cancer, cell free DNA (cfDNA) shows great potential as a less invasive liquid biopsy technique, and next-generation sequencing (NGS) is a promising tool for analysis of tumor mutations. In this review, we outline the evolution of cfDNA and NGS and discuss the progress of using them in a clinical setting for patients with lung cancer. We also present an analysis of the role of cfDNA as a liquid biopsy technique and NGS as an analytical tool in studying EGFR and MET, two frequently mutated genes in lung cancer. Ultimately, we hope that using cfDNA and NGS for cancer diagnosis and treatment will become standard for patients with lung cancer and across the field of oncology. PMID:27589834
Schramm, Chaim A; Sheng, Zizhang; Zhang, Zhenhai; Mascola, John R; Kwong, Peter D; Shapiro, Lawrence
2016-01-01
The rapid advance of massively parallel or next-generation sequencing technologies has made possible the characterization of B cell receptor repertoires in ever greater detail, and these developments have triggered a proliferation of software tools for processing and annotating these data. Of especial interest, however, is the capability to track the development of specific antibody lineages across time, which remains beyond the scope of most current programs. We have previously reported on the use of techniques such as inter- and intradonor analysis and CDR3 tracing to identify transcripts related to an antibody of interest. Here, we present Software for the Ontogenic aNalysis of Antibody Repertoires (SONAR), capable of automating both general repertoire analysis and specialized techniques for investigating specific lineages. SONAR annotates next-generation sequencing data, identifies transcripts in a lineage of interest, and tracks lineage development across multiple time points. SONAR also generates figures, such as identity-divergence plots and longitudinal phylogenetic "birthday" trees, and provides interfaces to other programs such as DNAML and BEAST. SONAR can be downloaded as a ready-to-run Docker image or manually installed on a local machine. In the latter case, it can also be configured to take advantage of a high-performance computing cluster for the most computationally intensive steps, if available. In summary, this software provides a useful new tool for the processing of large next-generation sequencing datasets and the ontogenic analysis of neutralizing antibody lineages. SONAR can be found at https://github.com/scharch/SONAR, and the Docker image can be obtained from https://hub.docker.com/r/scharch/sonar/.
Jun, Goo; Wing, Mary Kate; Abecasis, Gonçalo R; Kang, Hyun Min
2015-06-01
The analysis of next-generation sequencing data is computationally and statistically challenging because of the massive volume of data and imperfect data quality. We present GotCloud, a pipeline for efficiently detecting and genotyping high-quality variants from large-scale sequencing data. GotCloud automates sequence alignment, sample-level quality control, variant calling, filtering of likely artifacts using machine-learning techniques, and genotype refinement using haplotype information. The pipeline can process thousands of samples in parallel and requires less computational resources than current alternatives. Experiments with whole-genome and exome-targeted sequence data generated by the 1000 Genomes Project show that the pipeline provides effective filtering against false positive variants and high power to detect true variants. Our pipeline has already contributed to variant detection and genotyping in several large-scale sequencing projects, including the 1000 Genomes Project and the NHLBI Exome Sequencing Project. We hope it will now prove useful to many medical sequencing studies. © 2015 Jun et al.; Published by Cold Spring Harbor Laboratory Press.
Design automation techniques for custom LSI arrays
NASA Technical Reports Server (NTRS)
Feller, A.
1975-01-01
The standard cell design automation technique is described as an approach for generating random logic PMOS, CMOS or CMOS/SOS custom large scale integration arrays with low initial nonrecurring costs and quick turnaround time or design cycle. The system is composed of predesigned circuit functions or cells and computer programs capable of automatic placement and interconnection of the cells in accordance with an input data net list. The program generates a set of instructions to drive an automatic precision artwork generator. A series of support design automation and simulation programs are described, including programs for verifying correctness of the logic on the arrays, performing dc and dynamic analysis of MOS devices, and generating test sequences.
Next generation sequencing (NGS): a golden tool in forensic toolkit.
Aly, S M; Sabri, D M
DNA analysis is a cornerstone of contemporary forensic science. DNA sequencing technologies are powerful tools that enriched the molecular sciences in the past through Sanger sequencing and continue to advance them through next-generation sequencing (NGS). NGS has excellent potential to expand the molecular applications in forensic science by overcoming the pitfalls of the conventional sequencing method. The main advantage of NGS over the conventional method is that it analyses a large number of genetic markers simultaneously and yields high-resolution genetic data. These advantages will help in solving several challenges, such as mixture analysis and dealing with minute degraded samples. With these new technologies, many markers can be examined to obtain important biological data such as age, geographical origin, tissue type, externally visible traits and identification of monozygotic twins. NGS can also provide data on microbes, insects, plants and soil, which are of great medico-legal importance. Despite the dozens of forensic studies involving NGS, there are requirements to be met before this technology can be used routinely in forensic cases. Thus, there is a great need for more studies that address the robustness of these techniques. Therefore, this work highlights the applications of forensic sciences in the era of massively parallel sequencing.
Jain, Mamta; Kumar, Anil; Choudhary, Rishabh Charan
2017-06-01
In this article, we propose an improved diagonal queue medical image steganography scheme for transmitting secret patient medical data using a chaotic standard map, a linear feedback shift register, and the Rabin cryptosystem, improving on a previous technique (Jain and Lenka in Springer Brain Inform 3:39-51, 2016). The proposed algorithm comprises four stages: generation of pseudo-random sequences (by the linear feedback shift register and the standard chaotic map), permutation and XORing using the pseudo-random sequences, encryption using the Rabin cryptosystem, and steganography using the improved diagonal queues. Security analysis has been carried out. Performance is assessed using MSE, PSNR and maximum embedding capacity, as well as by histogram analysis between stego and cover images of various brain diseases.
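The pseudo-random sequence and XOR stages can be pictured with a minimal sketch. It covers only a linear feedback shift register keystream and the XOR step; the chaotic standard map, the permutation, the Rabin encryption and the diagonal queue embedding are not shown. The register width, the tap positions (a commonly used 16-bit configuration), the seed and all function names are assumptions chosen for illustration, not the parameters of the proposed algorithm.

```python
def lfsr_bits(seed, taps=(16, 14, 13, 11), nbits=16):
    """Yield bits from a Fibonacci LFSR.

    The width and taps are illustrative assumptions; the abstract does
    not state which register the authors use.
    """
    state = seed & ((1 << nbits) - 1)
    if state == 0:
        raise ValueError("seed must be non-zero, or the register stays at 0")
    while True:
        feedback = 0
        for t in taps:
            feedback ^= (state >> (nbits - t)) & 1
        yield state & 1  # output the bit that is about to be shifted out
        state = (state >> 1) | (feedback << (nbits - 1))

def keystream(seed, n_bytes):
    """Pack LFSR output bits into n_bytes bytes, most significant bit first."""
    bits = lfsr_bits(seed)
    out = bytearray()
    for _ in range(n_bytes):
        byte = 0
        for _ in range(8):
            byte = (byte << 1) | next(bits)
        out.append(byte)
    return bytes(out)

def xor_with_keystream(data, seed):
    """XOR data with the LFSR keystream; applying it twice restores data."""
    return bytes(d ^ k for d, k in zip(data, keystream(seed, len(data))))

# Hypothetical payload and seed, for illustration only
secret = b"patient record"
masked = xor_with_keystream(secret, seed=0xACE1)
restored = xor_with_keystream(masked, seed=0xACE1)
```

An LFSR keystream on its own is not cryptographically secure, which may be one reason the scheme described above combines it with a chaotic map and a public-key cryptosystem.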
NASA Astrophysics Data System (ADS)
Sherwood, R.; Mutz, D.; Estlin, T.; Chien, S.; Backes, P.; Norris, J.; Tran, D.; Cooper, B.; Rabideau, G.; Mishkin, A.; Maxwell, S.
2001-07-01
This article discusses a proof-of-concept prototype for ground-based automatic generation of validated rover command sequences from high-level science and engineering activities. This prototype is based on ASPEN, the Automated Scheduling and Planning Environment. This artificial intelligence (AI)-based planning and scheduling system will automatically generate a command sequence that will execute within resource constraints and satisfy flight rules. An automated planning and scheduling system encodes rover design knowledge and uses search and reasoning techniques to automatically generate low-level command sequences while respecting rover operability constraints, science and engineering preferences, environmental predictions, and also adhering to hard temporal constraints. This prototype planning system has been field-tested using the Rocky 7 rover at JPL and will be field-tested on more complex rovers to prove its effectiveness before transferring the technology to flight operations for an upcoming NASA mission. Enabling goal-driven commanding of planetary rovers greatly reduces the requirements for highly skilled rover engineering personnel. This in turn greatly reduces mission operations costs. In addition, goal-driven commanding permits a faster response to changes in rover state (e.g., faults) or science discoveries by removing the time-consuming manual sequence validation process, allowing rapid "what-if" analyses, and thus reducing overall cycle times.
Davidsson, Marcus; Diaz-Fernandez, Paula; Schwich, Oliver D.; Torroba, Marcos; Wang, Gang; Björklund, Tomas
2016-01-01
Detailed characterization and mapping of oligonucleotide function in vivo is generally a very time-consuming effort that only allows for hypothesis-driven subsampling of the full sequence to be analysed. Recent advances in deep sequencing together with highly efficient parallel oligonucleotide synthesis and cloning techniques have, however, opened up entirely new ways to map genetic function in vivo. Here we present a novel, optimized protocol for the generation of universally applicable, barcode-labelled plasmid libraries. The libraries are designed to enable the production of viral vector preparations assessing coding or non-coding RNA function in vivo. When generating high-diversity libraries, it is a challenge to achieve efficient cloning, unambiguous barcoding and detailed characterization using low-cost sequencing technologies. With the presented protocol, a diversity of above 3 million uniquely barcoded adeno-associated viral (AAV) plasmids can be achieved in a single reaction through a process achievable in any molecular biology laboratory. This approach opens up a multitude of in vivo assessments, from the evaluation of enhancer and promoter regions to the optimization of genome editing. The generated plasmid libraries are also useful for validation of sequencing clustering algorithms, and we here validate the newly presented message passing clustering process named Starcode. PMID:27874090
Haimovich, Adrian D.; Muir, Paul; Isaacs, Farren J.
2016-01-01
Next-generation DNA sequencing has revealed the complete genome sequences of numerous organisms, establishing a fundamental and growing understanding of genetic variation and phenotypic diversity. Engineering at the gene, network and whole-genome scale aims to introduce targeted genetic changes both to explore emergent phenotypes and to introduce new functionalities. Expansion of these approaches into massively parallel platforms establishes the ability to generate targeted genome modifications, elucidating causal links between genotype and phenotype, as well as the ability to design and reprogramme organisms. In this Review, we explore techniques and applications in genome engineering, outlining key advances and defining challenges. PMID:26260262
Choi, Jung-Han; Lim, Young-Jun; Kim, Chang-Whe; Kim, Myung-Joo
2009-01-01
This study evaluated the effect of different screw-tightening sequences, forces, and methods on the stresses generated on a well-fitting internal-connection implant (Astra Tech) superstructure. A metal framework directly connected to four parallel implants was fabricated on a fully edentulous mandibular resin model. Six stone casts with four implant replicas were made from a pickup impression of the superstructure to represent a "well-fitting" situation. Stresses generated by four screw-tightening sequences (1-2-3-4, 4-3-2-1, 2-4-3-1, and 2-3-1-4), two forces (10 and 20 Ncm), and two methods (one-step and two-step) were evaluated. In the two-step method, screws were tightened to the initial torque (10 Ncm) in a predetermined screw-tightening sequence and then to the final torque (20 Ncm) in the same sequence. Stresses were recorded twice by three strain gauges attached to the framework (superior face midway between abutments). Deformation data were analyzed using multiple analysis of variance at a .05 level of statistical significance. In all stone casts, stresses were produced by the superstructure connection, regardless of screw-tightening sequence, force, and method. No statistically significant differences for superstructure preload stresses were found based on screw-tightening sequences (-180.0 to -181.6 microm/m) or forces (-163.4 and -169.2 microm/m) (P > .05). However, different screw-tightening methods induced different stresses on the superstructure. The two-step screw-tightening method (-180.1 microm/m) produced significantly higher stress than the one-step method (-169.2 microm/m) (P = .0457). Within the limitations of this in vitro study, screw-tightening sequence and force were not critical factors in the stress generated on a well-fitting internal-connection implant superstructure. The stress caused by the two-step method was greater than that produced using the one-step method. Further studies are needed to evaluate the effect of screw-tightening techniques on preload stress in various different clinical situations.
Mapping the zebrafish brain methylome using reduced representation bisulfite sequencing
Chatterjee, Aniruddha; Ozaki, Yuichi; Stockwell, Peter A; Horsfield, Julia A; Morison, Ian M; Nakagawa, Shinichi
2013-01-01
Reduced representation bisulfite sequencing (RRBS) has been used to profile DNA methylation patterns in mammalian genomes such as human, mouse and rat. The methylome of the zebrafish, an important animal model, has not yet been characterized at base-pair resolution using RRBS. Therefore, we evaluated the technique of RRBS in this model organism by generating four single-nucleotide resolution DNA methylomes of adult zebrafish brain. We performed several simulations to show the distribution of fragments and enrichment of CpGs in different in silico reduced representation genomes of zebrafish. Four RRBS brain libraries generated 98 million sequenced reads and had higher frequencies of multiple mapping than equivalent human RRBS libraries. The zebrafish methylome indicates there is higher global DNA methylation in the zebrafish genome compared with its equivalent human methylome. This observation was confirmed by RRBS of zebrafish liver. High coverage CpG dinucleotides are enriched in CpG island shores more than in the CpG island core. We found that 45% of the mapped CpGs reside in gene bodies, and 7% in gene promoters. This analysis provides a roadmap for generating reproducible base-pair level methylomes for zebrafish using RRBS and our results provide the first evidence that RRBS is a suitable technique for global methylation analysis in zebrafish. PMID:23975027
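An in silico reduced representation genome of the kind simulated above can be sketched as a restriction digest followed by size selection. The sketch assumes the enzyme is MspI (recognition site CCGG, cut after the first C), which is the enzyme usually used for RRBS; the abstract does not name the enzyme or the size window, so the 40 to 220 bp default and all function names are illustrative assumptions.

```python
import re

def mspi_fragments(sequence, site="CCGG", cut_offset=1):
    """Split a sequence at every recognition site (C^CGG by default).

    The lookahead pattern finds overlapping sites as well. The enzyme
    choice is an assumption; the abstract does not specify it.
    """
    seq = sequence.upper()
    cuts = [m.start() + cut_offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[s:e] for s, e in zip(bounds, bounds[1:]) if e > s]

def size_select(fragments, min_len=40, max_len=220):
    """Keep fragments inside a size window (hypothetical default window)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]

def cpg_count(fragment):
    """Number of CpG dinucleotides in a fragment."""
    return fragment.count("CG")

# Toy sequence with a small window, for illustration only
toy = "AAACCGGTTTCCGGAAA"
fragments = mspi_fragments(toy)
selected = size_select(fragments, min_len=5, max_len=10)
cpgs_in_selected = sum(cpg_count(f) for f in selected)
```

Running the same functions over each chromosome of a reference genome and comparing the CpG counts in the selected fragments with the genome-wide count would give a rough estimate of CpG enrichment for a chosen size window.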
Woo, Kevin L; Rieucau, Guillaume
2008-07-01
The increasing use of the video playback technique in behavioural ecology reveals a growing need to ensure better control of the visual stimuli that focal animals experience. Technological advances now allow researchers to develop computer-generated animations instead of using video sequences of live-acting demonstrators. However, care must be taken to match the motion characteristics (speed and velocity) of the animation to the original video source. Here, we present a tool based on the use of an optic flow analysis program to measure the resemblance of motion characteristics of computer-generated animations compared to videos of live-acting animals. We examined three distinct displays (tail-flick (TF), push-up body rock (PUBR), and slow arm wave (SAW)) exhibited by animations of Jacky dragons (Amphibolurus muricatus) that were compared to the original video sequences of live lizards. We found no significant differences between the motion characteristics of videos and animations across all three displays. Our results showed that our animations are similar to the videos in the speed and velocity features of each display. Researchers need to ensure that similar motion characteristics in animation and video stimuli are represented, and this feature is a critical component in the future success of the video playback technique.
Generative technique for dynamic infrared image sequences
NASA Astrophysics Data System (ADS)
Zhang, Qian; Cao, Zhiguo; Zhang, Tianxu
2001-09-01
This paper discusses a technique for generating dynamic infrared images. Because an infrared sensor differs from a CCD camera in its imaging mechanism, it forms the infrared image by receiving the infrared radiation of the scene (including target and background). The infrared imaging sensor is strongly affected by atmospheric radiation, environmental radiation and the attenuation of radiation during atmospheric transfer. Therefore, the paper first analyzes the influence of each kind of radiation on imaging and provides the formula for calculating radiation; the passive scene and the active scene are analyzed separately. The calculation methods for the passive scene are then provided, and the functions of the scene model, the atmospheric transmission model and the material physical attribute databases are explained. Second, based on the infrared imaging model, the design idea, the implementation approach and the software framework of the infrared image sequence simulation software on an SGI workstation are introduced. Third, following this design, an example of simulated infrared image sequences is presented, with the sea and sky as background, a warship as target and an aircraft as the viewpoint. Finally, the simulation is evaluated as a whole and an improvement scheme is presented.
Keller, A; Danner, N; Grimmer, G; Ankenbrand, M; von der Ohe, K; von der Ohe, W; Rost, S; Härtel, S; Steffan-Dewenter, I
2015-03-01
The identification of pollen plays an important role in ecology, palaeo-climatology, honey quality control and other areas. Currently, expert knowledge and reference collections are essential to identify pollen origin through light microscopy. Pollen identification through molecular sequencing and DNA barcoding has been proposed as an alternative approach, but the assessment of mixed pollen samples originating from multiple plant species is still a tedious and error-prone task. Next-generation sequencing has been proposed to avoid this hindrance. In this study we assessed mixed pollen probes through next-generation sequencing of amplicons from the highly variable, species-specific internal transcribed spacer 2 region of nuclear ribosomal DNA. Further, we developed a bioinformatic workflow to analyse these high-throughput data with a newly created reference database. To evaluate the feasibility, we compared results from classical identification based on light microscopy from the same samples with our sequencing results. We assessed in total 16 mixed pollen samples, 14 originated from honeybee colonies and two from solitary bee nests. The sequencing technique resulted in higher taxon richness (deeper assignments and more identified taxa) compared to light microscopy. Abundance estimations from sequencing data were significantly correlated with counted abundances through light microscopy. Simulation analyses of taxon specificity and sensitivity indicate that 96% of taxa present in the database are correctly identifiable at the genus level and 70% at the species level. Next-generation sequencing thus presents a useful and efficient workflow to identify pollen at the genus and species level without requiring specialised palynological expert knowledge. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.
Zhou, Ruo-Nan; Hu, Zan-Min
2007-01-01
The technique of chromosome microdissection and microcloning has been under development for more than 20 years. As a bridge between cytogenetics and molecular genetics, it leads to a number of applications: chromosome painting probe isolation, genetic linkage map and physical map construction, and expressed sequence tag (EST) generation. During those 20 years, this technique has not only benefited from other technological advances but has also cross-fertilized with other techniques. Today, it has become a practical technique with extensive uses. The purpose of this article is to review the development of this technique and its application in the field of genomic research. Moreover, a new method of generating ESTs of specific chromosomes developed by our lab is introduced. By using this method, the technique of chromosome microdissection and microcloning would be more valuable in the advancement of genomic research. PMID:18645627
Classification of a set of vectors using self-organizing map- and rule-based technique
NASA Astrophysics Data System (ADS)
Ae, Tadashi; Okaniwa, Kaishirou; Nosaka, Kenzaburou
2005-02-01
Various objects, such as pictures, music and texts, exist in our environment. We form a view of these objects by looking, reading or listening. Our view is deeply connected to our behavior and is very important for understanding it. We form a view of an object and decide the next action (data selection, etc.) according to that view. Such a series of actions constitutes a sequence. Therefore, we propose a method that acquires a view as a vector from several words describing the view, and we apply the vector to sequence generation. We focus on sequences of data that a user selects from a multimedia database containing pictures, music, movies, etc. These data cannot be stereotyped because the view of them differs from user to user. Therefore, we represent the structure of the multimedia database by the vector representing the user's view and the stereotyped vector, and acquire sequences containing the structure as elements. Such a vector can be classified by a SOM (Self-Organizing Map). The Hidden Markov Model (HMM) is a method for generating sequences. Therefore, we use an HMM in which a state corresponds to the representative vector of the user's view, and acquire sequences containing the change of the user's view. We call it the Vector-state Markov Model (VMM). We introduce rough set theory as a rule-based technique, which plays the role of classifying sets of data such as the sets of "Tour".
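Sequence generation with a Markov model whose states stand for user views can be sketched as follows. This is a generic hidden Markov model sampler, not the authors' Vector-state Markov Model: the state labels, item names and all probabilities are hypothetical, and the SOM classification and rough set steps are not shown.

```python
import random

# Hypothetical model: each state is a label for a representative view vector
START = {"calm": 0.6, "lively": 0.4}
TRANSITIONS = {
    "calm": {"calm": 0.7, "lively": 0.3},
    "lively": {"calm": 0.4, "lively": 0.6},
}
EMISSIONS = {
    "calm": {"landscape.jpg": 0.5, "piano.mp3": 0.5},
    "lively": {"festival.mpg": 0.7, "rock.mp3": 0.3},
}

def pick(rng, distribution):
    """Draw one key from a {key: probability} mapping."""
    keys = list(distribution)
    return rng.choices(keys, weights=[distribution[k] for k in keys])[0]

def generate_sequence(start, transitions, emissions, length, seed=None):
    """Sample (state, item) pairs from a hidden Markov model."""
    rng = random.Random(seed)
    state = pick(rng, start)
    sequence = []
    for _ in range(length):
        item = pick(rng, emissions[state])       # item selected under this view
        sequence.append((state, item))
        state = pick(rng, transitions[state])    # the view may change
    return sequence

selections = generate_sequence(START, TRANSITIONS, EMISSIONS, length=5, seed=1)
```

Each pair in the result records the view that was active and the item selected under it, so the change of view over time can be read directly from the sequence.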
Jazaeri Farsani, Seyed Mohammad; Deijs, Martin; Dijkman, Ronald; Molenkamp, Richard; Jeeninga, Rienk E; Ieven, Margareta; Goossens, Herman; van der Hoek, Lia
2015-01-01
Background: Currently, virus discovery is mainly based on molecular techniques. Here, we propose a method that relies on virus culturing combined with state-of-the-art sequencing techniques. The most natural ex vivo culture system was used to enable replication of respiratory viruses. Method: Three respiratory clinical samples were tested on well-differentiated pseudostratified tracheobronchial human airway epithelial (HAE) cultures grown at an air–liquid interface, which resemble the airway epithelium. Cells were stained with convalescent serum of the patients to identify infected cells and apical washes were analyzed by VIDISCA-454, a next-generation sequencing virus discovery technique. Results: Infected cells were observed for all three samples. Sequencing subsequently indicated that the cells were infected by either human coronavirus OC43, influenzavirus B, or influenzavirus A. The sequence reads covered a large part of the genome (52%, 82%, and 57%, respectively). Conclusion: We present here a new method for virus discovery that requires a virus culture on primary cells and an antibody detection. The virus in the harvest can be used to characterize the viral genome sequence and cell tropism, and it also provides progeny virus to initiate experiments to fulfill Koch's postulates. PMID:25482367
Cooper, James; Ding, Yi; Song, Jiuzhou; Zhao, Keji
2017-11-01
Increased chromatin accessibility is a feature of cell-type-specific cis-regulatory elements; therefore, mapping of DNase I hypersensitive sites (DHSs) enables the detection of active regulatory elements of transcription, including promoters, enhancers, insulators and locus-control regions. Single-cell DNase sequencing (scDNase-seq) is a method of detecting genome-wide DHSs when starting with either single cells or <1,000 cells from primary cell sources. This technique enables genome-wide mapping of hypersensitive sites in a wide range of cell populations that cannot be analyzed using conventional DNase I sequencing because of the requirement for millions of starting cells. Fresh cells, formaldehyde-cross-linked cells or cells recovered from formalin-fixed paraffin-embedded (FFPE) tissue slides are suitable for scDNase-seq assays. To generate scDNase-seq libraries, cells are lysed and then digested with DNase I. Circular carrier plasmid DNA is included during subsequent DNA purification and library preparation steps to prevent loss of the small quantity of DHS DNA. Libraries are generated for high-throughput sequencing on the Illumina platform using standard methods. Preparation of scDNase-seq libraries requires only 2 d. The materials and molecular biology techniques described in this protocol should be accessible to any general molecular biology laboratory. Processing of high-throughput sequencing data requires basic bioinformatics skills and uses publicly available bioinformatics software.
Laboratory procedures to generate viral metagenomes.
Thurber, Rebecca V; Haynes, Matthew; Breitbart, Mya; Wegley, Linda; Rohwer, Forest
2009-01-01
This collection of laboratory protocols describes the steps to collect viruses from various samples with the specific aim of generating viral metagenome sequence libraries (viromes). Viral metagenomics, the study of uncultured viral nucleic acid sequences from different biomes, relies on several concentration, purification, extraction, sequencing and heuristic bioinformatic methods. No single technique can provide an all-inclusive approach, and therefore the protocols presented here will be discussed in terms of hypothetical projects. However, care must be taken to individualize each step depending on the source and type of viral-particles. This protocol is a description of the processes we have successfully used to: (i) concentrate viral particles from various types of samples, (ii) eliminate contaminating cells and free nucleic acids and (iii) extract, amplify and purify viral nucleic acids. Overall, a sample can be processed to isolate viral nucleic acids suitable for high-throughput sequencing in approximately 1 week.
Hwang, Young Sun; Seo, Minseok; Choi, Hee Jung; Kim, Sang Kyung; Kim, Heebal; Han, Jae Yong
2018-04-01
The chicken is a valuable model organism, especially in evolutionary and embryology research, because its embryonic development occurs in the egg. However, despite its scientific importance, no transcriptome data have been generated for deciphering the early developmental stages of the chicken because of practical and technical constraints in accessing pre-oviposited embryos. Here, we determine for the first time the entire transcriptome of pre-oviposited avian embryos, including oocyte, zygote, and intrauterine embryos from Eyal-Giladi and Kochav stage I (EGK.I) to EGK.X, collected using a noninvasive approach. We also compare RNA-sequencing data obtained using bulked embryo sequencing and single embryo/cell sequencing techniques. The raw sequencing data were preprocessed with two genome builds, Galgal4 and Galgal5, and the expression of 17,108 and 26,102 genes was quantified in the respective builds. There were some differences between the two techniques, as well as between the two genome builds, and these were affected by the emergence of long intergenic noncoding RNA annotations. The first transcriptome datasets of pre-oviposited early chicken embryos based on bulked and single embryo sequencing techniques will serve as a valuable resource for investigating early avian embryogenesis, for comparative studies among vertebrates, and for novel gene annotation in the chicken genome.
Pitfalls in genetic testing: the story of missed SCN1A mutations.
Djémié, Tania; Weckhuysen, Sarah; von Spiczak, Sarah; Carvill, Gemma L; Jaehn, Johanna; Anttonen, Anna-Kaisa; Brilstra, Eva; Caglayan, Hande S; de Kovel, Carolien G; Depienne, Christel; Gaily, Eija; Gennaro, Elena; Giraldez, Beatriz G; Gormley, Padhraig; Guerrero-López, Rosa; Guerrini, Renzo; Hämäläinen, Eija; Hartmann, Corinna; Hernandez-Hernandez, Laura; Hjalgrim, Helle; Koeleman, Bobby P C; Leguern, Eric; Lehesjoki, Anna-Elina; Lemke, Johannes R; Leu, Costin; Marini, Carla; McMahon, Jacinta M; Mei, Davide; Møller, Rikke S; Muhle, Hiltrud; Myers, Candace T; Nava, Caroline; Serratosa, Jose M; Sisodiya, Sanjay M; Stephani, Ulrich; Striano, Pasquale; van Kempen, Marjan J A; Verbeek, Nienke E; Usluer, Sunay; Zara, Federico; Palotie, Aarno; Mefford, Heather C; Scheffer, Ingrid E; De Jonghe, Peter; Helbig, Ingo; Suls, Arvid
2016-07-01
Sanger sequencing, still the standard technique for genetic testing in most diagnostic laboratories and until recently widely used in research, is gradually being complemented by next-generation sequencing (NGS). However, no single mutation detection technique is perfect in identifying all mutations. Therefore, we wondered to what extent inconsistencies between Sanger sequencing and NGS affect the molecular diagnosis of patients. Since mutations in SCN1A, the major gene implicated in epilepsy, are found in the majority of Dravet syndrome (DS) patients, we focused on missed SCN1A mutations. We sent out a survey to 16 genetic centers performing SCN1A testing. We collected data on 28 mutations initially missed using Sanger sequencing. All patients were falsely reported as SCN1A mutation-negative, due to both technical limitations and human errors. We illustrate the pitfalls of Sanger sequencing and, most importantly, provide evidence that SCN1A mutations are an even more frequent cause of DS than already anticipated.
Applying next-generation DNA sequencing technology to aquatic bioassessment
The growing challenges for environmental monitoring and assessment have pushed standard techniques to the limits of their application. Current biological monitoring programs often require considerable time and workload to provide environmental condition assessments. New molecular...
Automating the generation of lexical patterns for processing free text in clinical documents.
Meng, Frank; Morioka, Craig
2015-09-01
Many tasks in natural language processing utilize lexical pattern-matching techniques, including information extraction (IE), negation identification, and syntactic parsing. However, it is generally difficult to derive patterns that achieve acceptable levels of recall while also remaining highly precise. We present a multiple sequence alignment (MSA)-based technique that automatically generates patterns, thereby leveraging language usage to determine the context of words that influence a given target. MSAs capture the commonalities among word sequences and are able to reveal areas of linguistic stability and variation. In this way, MSAs provide a systemic approach to generating lexical patterns that are generalizable, which will both increase recall levels and maintain high levels of precision. The MSA-generated patterns exhibited consistent F1, F0.5, and F2 scores compared to two baseline techniques for IE across four different tasks. Both baseline techniques performed well for some tasks and less well for others, but MSA was found to consistently perform at a high level for all four tasks. The performance of MSA on the four extraction tasks indicates the method's versatility. The results show that the MSA-based patterns are able to handle the extraction of individual data elements as well as relations between two concepts without the need for large amounts of manual intervention. We presented an MSA-based framework for generating lexical patterns that showed consistently high levels of both performance and recall over four different extraction tasks when compared to baseline methods. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
Schramm, Chaim A.; Sheng, Zizhang; Zhang, Zhenhai; Mascola, John R.; Kwong, Peter D.; Shapiro, Lawrence
2016-01-01
The rapid advance of massively parallel or next-generation sequencing technologies has made possible the characterization of B cell receptor repertoires in ever greater detail, and these developments have triggered a proliferation of software tools for processing and annotating these data. Of especial interest, however, is the capability to track the development of specific antibody lineages across time, which remains beyond the scope of most current programs. We have previously reported on the use of techniques such as inter- and intradonor analysis and CDR3 tracing to identify transcripts related to an antibody of interest. Here, we present Software for the Ontogenic aNalysis of Antibody Repertoires (SONAR), capable of automating both general repertoire analysis and specialized techniques for investigating specific lineages. SONAR annotates next-generation sequencing data, identifies transcripts in a lineage of interest, and tracks lineage development across multiple time points. SONAR also generates figures, such as identity–divergence plots and longitudinal phylogenetic “birthday” trees, and provides interfaces to other programs such as DNAML and BEAST. SONAR can be downloaded as a ready-to-run Docker image or manually installed on a local machine. In the latter case, it can also be configured to take advantage of a high-performance computing cluster for the most computationally intensive steps, if available. In summary, this software provides a useful new tool for the processing of large next-generation sequencing datasets and the ontogenic analysis of neutralizing antibody lineages. SONAR can be found at https://github.com/scharch/SONAR, and the Docker image can be obtained from https://hub.docker.com/r/scharch/sonar/. PMID:27708645
Detection of a novel herpesvirus from bats in the Philippines.
Sano, Kaori; Okazaki, Sachiko; Taniguchi, Satoshi; Masangkay, Joseph S; Puentespina, Roberto; Eres, Eduardo; Cosico, Edison; Quibod, Niña; Kondo, Taisuke; Shimoda, Hiroshi; Hatta, Yuuki; Mitomo, Shumpei; Oba, Mami; Katayama, Yukie; Sassa, Yukiko; Furuya, Tetsuya; Nagai, Makoto; Une, Yumi; Maeda, Ken; Kyuwa, Shigeru; Yoshikawa, Yasuhiro; Akashi, Hiroomi; Omatsu, Tsutomu; Mizutani, Tetsuya
2015-08-01
Bats are natural hosts of many zoonotic viruses. Monitoring bat viruses is important to detect novel bat-borne infectious diseases. In this study, next generation sequencing techniques and conventional PCR were used to analyze intestine, lung, and blood clot samples collected from wild bats captured at three locations in Davao region, in the Philippines in 2012. Different viral genes belonging to the Retroviridae and Herpesviridae families were identified using next generation sequencing. The existence of herpesvirus in the samples was confirmed by PCR using herpesvirus consensus primers. The nucleotide sequences of the resulting PCR amplicons were 166-bp. Further phylogenetic analysis identified that the virus from which this nucleotide sequence was obtained belonged to the Gammaherpesvirinae subfamily. PCR using primers specific to the nucleotide sequence obtained revealed that the infection rate among the captured bats was 30 %. In this study, we present the partial genome of a novel gammaherpesvirus detected from wild bats. Our observations also indicate that this herpesvirus may be widely distributed in bat populations in Davao region.
Using Tablet for visual exploration of second-generation sequencing data.
Milne, Iain; Stephen, Gordon; Bayer, Micha; Cock, Peter J A; Pritchard, Leighton; Cardle, Linda; Shaw, Paul D; Marshall, David
2013-03-01
The advent of second-generation sequencing (2GS) has provided a range of significant new challenges for the visualization of sequence assemblies. These include the large volume of data being generated, short-read lengths and different data types and data formats associated with the diversity of new sequencing technologies. This article illustrates how Tablet, a high-performance graphical viewer for visualization of 2GS assemblies and read mappings, plays an important role in the analysis of these data. We present Tablet, and through a selection of use cases, demonstrate its value in quality assurance and scientific discovery, through features such as whole-reference coverage overviews, variant highlighting, paired-end read mark-up, GFF3-based feature tracks and protein translations. We discuss the computing and visualization techniques utilized to provide a rich and responsive graphical environment that enables users to view a range of file formats with ease. Tablet installers can be freely downloaded from http://bioinf.hutton.ac.uk/tablet in 32 or 64-bit versions for Windows, OS X, Linux or Solaris. For further details on Tablet, contact tablet@hutton.ac.uk.
Eichmeier, Aleš; Komínková, Marcela; Komínek, Petr; Baránek, Miroslav
2016-01-01
Comprehensive next generation sequencing virus detection was used to detect the whole spectrum of viruses and viroids in selected grapevines from the Czech Republic. The novel NGS approach was based on sequencing libraries of small RNA isolated from grapevine vascular tissues. Eight previously partially characterized grapevines of diverse varieties were selected and subjected to analysis: Chardonnay, Laurot, Guzal Kara, and rootstock Kober 125AA from the Moravia wine-producing region; plus Müller-Thurgau and Pinot Noir from the Bohemia wine-producing region, both in the Czech Republic. Using next generation sequencing of small RNA, 8 viruses and 2 viroids were detected in the set of eight grapevines, confirming the high effectiveness of the technique in plant virology and supporting previous data on multiply infected grapevines in Czech vineyards. Among the pathogens detected, the Grapevine rupestris vein feathering virus and Grapevine yellow speckle viroid 1 were recorded in the Czech Republic for the first time.
Genetics of pediatric obesity.
Manco, Melania; Dallapiccola, Bruno
2012-07-01
Onset of obesity has been anticipated at earlier ages, and prevalence has dramatically increased worldwide over the past decades. Epidemic obesity is mainly attributable to modern lifestyle, but family studies prove the significant role of genes in the individual's predisposition to obesity. Advances in genotyping technologies have raised great hope and expectations that genetic testing will pave the way to personalized medicine and that complex traits such as obesity will be prevented even before birth. Given the pressing offer of direct-to-consumer genetic testing services from private companies to estimate the individual's risk for complex phenotypes including obesity, the present review offers pediatricians an update of the state of the art on the genomics of obesity in childhood. Discrepancies with respect to genomics of adult obesity are discussed. After an appraisal of findings from genome-wide association studies in pediatric populations, the rare variant-common disease hypothesis, the theoretical soil for next-generation sequencing techniques, is discussed in contrast to the common disease-common variant hypothesis. Next-generation sequencing techniques are expected to fill the gap of "missing heritability" of obesity, identifying rare variants associated with the trait and clarifying the role of epigenetics in its heritability. Pediatric obesity emerges as a complex phenotype, modulated by unique gene-environment interactions that occur in periods of life and are "permissive" for the programming of adult obesity. With the advent of next-generation sequencing techniques and advances in the field of exposomics, sensitive and specific tools to predict the obesity risk as early as possible are the challenge for the next decade.
RNA sequencing: current and prospective uses in metabolic research.
Vikman, Petter; Fadista, Joao; Oskolkov, Nikolay
2014-10-01
Previous global RNA analysis was restricted to known transcripts in species with a defined transcriptome. Next generation sequencing has transformed transcriptomics by making it possible to analyse expressed genes with exon-level resolution from any tissue in any species without any a priori knowledge of which genes are being expressed, their splice patterns or their nucleotide sequence. In addition, RNA sequencing is a more sensitive technique than microarrays, with a larger dynamic range, and it also allows for investigation of imprinting and allele-specific expression. This can be done for a cost that is able to compete with that of a microarray, making RNA sequencing a technique available to most researchers. Therefore RNA sequencing has recently become the state of the art with regard to large-scale RNA investigations and has to a large extent replaced microarrays. The only drawback is the large amount of data produced, which together with the complexity of the data can make a researcher spend far more time on analysis than on performing the actual experiment. © 2014 Society for Endocrinology.
Next-generation sequencing for endocrine cancers: Recent advances and challenges.
Suresh, Padmanaban S; Venkatesh, Thejaswini; Tsutsumi, Rie; Shetty, Abhishek
2017-05-01
Contemporary molecular biology research tools have enriched numerous areas of biomedical research that address challenging diseases, including endocrine cancers (pituitary, thyroid, parathyroid, adrenal, testicular, ovarian, and neuroendocrine cancers). These tools have placed several intriguing clues before the scientific community. Endocrine cancers pose a major challenge in health care and research despite considerable attempts by researchers to understand their etiology. Microarray analyses have provided gene signatures from many cells, tissues, and organs that can differentiate healthy states from diseased ones, and even show patterns that correlate with stages of a disease. Microarray data can also elucidate the responses of endocrine tumors to therapeutic treatments. The rapid progress in next-generation sequencing methods has overcome many of the initial challenges of these technologies, and their advantages over microarray techniques have enabled them to emerge as valuable aids for clinical research applications (prognosis, identification of drug targets, etc.). A comprehensive review describing the recent advances in next-generation sequencing methods and their application in the evaluation of endocrine and endocrine-related cancers is lacking. The main purpose of this review is to illustrate the concepts that collectively constitute our current view of the possibilities offered by next-generation sequencing technological platforms, challenges to relevant applications, and perspectives on the future of clinical genetic testing of patients with endocrine tumors. We focus on recent discoveries in the use of next-generation sequencing methods for clinical diagnosis of endocrine tumors in patients and conclude with a discussion on persisting challenges and future objectives.
Using Next-Generation Sequencing to Explore Genetics and Race in the High School Classroom
Yang, Xinmiao; Hartman, Mark R.; Harrington, Kristin T.; Etson, Candice M.; Fierman, Matthew B.; Slonim, Donna K.; Walt, David R.
2017-01-01
With the development of new sequencing and bioinformatics technologies, concepts relating to personal genomics play an increasingly important role in our society. To promote interest and understanding of sequencing and bioinformatics in the high school classroom, we developed and implemented a laboratory-based teaching module called “The Genetics of Race.” This module uses the topic of race to engage students with sequencing and genetics. In the experimental portion of this module, students isolate their own mitochondrial DNA using standard biotechnology techniques and collect next-generation sequencing data to determine which of their classmates are most and least genetically similar to themselves. We evaluated the efficacy of this module by administering a pretest/posttest evaluation to measure student knowledge related to sequencing and bioinformatics, and we also conducted a survey at the conclusion of the module to assess student attitudes. Upon completion of our Genetics of Race module, students demonstrated significant learning gains, with lower-performing students obtaining the highest gains, and developed more positive attitudes toward scientific research. PMID:28408407
Symbolic dynamics techniques for complex systems: Application to share price dynamics
NASA Astrophysics Data System (ADS)
Xu, Dan; Beck, Christian
2017-05-01
The symbolic dynamics technique is well known for low-dimensional dynamical systems and chaotic maps, and lies at the roots of the thermodynamic formalism of dynamical systems. Here we show that this technique can also be successfully applied to time series generated by complex systems of much higher dimensionality. Our main example is the investigation of share price returns in a coarse-grained way. A nontrivial spectrum of Rényi entropies is found. We study how the spectrum depends on the time scale of returns, the sector of stocks considered, as well as the number of symbols used for the symbolic description. Overall our analysis confirms that in the symbol space transition probabilities of observed share price returns depend on the entire history of previous symbols, thus emphasizing the need for a modelling based on non-Markovian stochastic processes. Our method allows for quantitative comparisons of entirely different complex systems, for example the statistics of symbol sequences generated by share price returns using 4 symbols can be compared with that of genomic sequences.
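The coarse-graining and entropy steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the fixed thresholds used to map returns onto 4 symbols, and the choice of estimating block probabilities by simple counting are all assumptions made for illustration.

```python
import math
from collections import Counter

def symbolize(returns, thresholds):
    """Map each return to a symbol: the number of thresholds it exceeds.
    With 3 sorted thresholds this yields the symbols 0, 1, 2, 3."""
    return [sum(r > t for t in thresholds) for r in returns]

def renyi_entropy(symbols, block_len, q):
    """Renyi entropy of order q for symbol blocks of length block_len,
    with block probabilities estimated from observed frequencies."""
    blocks = [tuple(symbols[i:i + block_len])
              for i in range(len(symbols) - block_len + 1)]
    total = len(blocks)
    probs = [count / total for count in Counter(blocks).values()]
    if q == 1:  # Shannon limit of the Renyi family
        return -sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** q for p in probs)) / (1 - q)

# Hypothetical returns and thresholds, chosen only to exercise the code.
returns = [-0.02, -0.005, 0.005, 0.02]
symbols = symbolize(returns, thresholds=[-0.01, 0.0, 0.01])
entropy = renyi_entropy(symbols, block_len=1, q=2)
```

Evaluating `renyi_entropy` over a range of `q` values gives one spectrum; repeating this for different `block_len` values or thresholds corresponds to varying the history length and the number of symbols.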
Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing
Matochko, Wadim L.; Derda, Ratmir
2013-01-01
Next-generation sequencing techniques empower selection of ligands from phage-display libraries because they can detect low-abundance clones and quantify changes in the copy numbers of clones without excessive selection rounds. Identification of errors in deep sequencing data is the most critical step in this process because these techniques have error rates >1%. Mechanisms that yield errors in Illumina and other techniques have been proposed, but no reports to date describe error analysis in phage libraries. Our paper focuses on error analysis of 7-mer peptide libraries sequenced by the Illumina method. The low theoretical complexity of this phage library, as compared to the complexity of long genetic reads and genomes, allowed us to describe this library using a convenient linear vector and operator framework. We describe a phage library as an $N \times 1$ frequency vector $\mathbf{n} = \|n_i\|$, where $n_i$ is the copy number of the $i$th sequence and $N$ is the theoretical diversity, that is, the total number of all possible sequences. Any manipulation of the library is an operator acting on $\mathbf{n}$. Selection, amplification, or sequencing could be described as a product of an $N \times N$ matrix and a stochastic sampling operator ($Sa$). The latter is a random diagonal matrix that describes sampling of a library. In this paper, we focus on the properties of $Sa$ and use them to define the sequencing operator ($Seq$). Sequencing without any bias and errors is $Seq = Sa\,I_N$, where $I_N$ is an $N \times N$ unity matrix. Any bias in sequencing changes $I_N$ to a nonunity matrix. We identified a diagonal censorship matrix ($CEN$), which describes elimination, or statistically significant downsampling, of specific reads during the sequencing process. PMID:24416071
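The vector-and-operator description above can be illustrated with a small Python sketch. This is an illustration under stated assumptions, not the authors' implementation: diagonal matrices are stored as plain lists, the sampling operator is modelled as drawing a fixed number of reads without replacement, and biased sequencing is taken to be the sampling operator applied after a diagonal censorship matrix. All function names are hypothetical.

```python
import random

def diagonal_apply(diag, vector):
    """Apply a diagonal matrix, stored as its diagonal, to a column vector."""
    return [d * v for d, v in zip(diag, vector)]

def sampling_diagonal(n, depth, rng):
    """Diagonal of a hypothetical sampling operator Sa: draw `depth` reads
    without replacement and return sampled_copies / original_copies for
    every sequence (0.0 where the sequence is absent)."""
    pool = [i for i, copies in enumerate(n) for _ in range(copies)]
    counts = [0] * len(n)
    for i in rng.sample(pool, depth):  # ValueError if depth > len(pool)
        counts[i] += 1
    return [c / copies if copies else 0.0 for c, copies in zip(counts, n)]

def sequence(n, depth, cen=None, seed=0):
    """Sequencing as Sa applied after a diagonal bias matrix (assumption).
    cen=None uses the identity, i.e. unbiased sequencing Seq = Sa I_N."""
    rng = random.Random(seed)
    cen = cen if cen is not None else [1] * len(n)
    censored = [round(x) for x in diagonal_apply(cen, n)]
    sa = sampling_diagonal(censored, depth, rng)
    return [round(x) for x in diagonal_apply(sa, censored)]

# Toy library with N = 4 possible sequences (copy numbers are made up).
library = [50, 30, 20, 0]
unbiased = sequence(library, depth=40)
biased = sequence(library, depth=40, cen=[1, 0, 1, 1])  # censor sequence 1
```

Because every operator here is diagonal, storing only the diagonal keeps the cost linear in N; a full N × N matrix would be impractical for a real 7-mer peptide library.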
Y and W Chromosome Assemblies: Approaches and Discoveries.
Tomaszkiewicz, Marta; Medvedev, Paul; Makova, Kateryna D
2017-04-01
Hundreds of vertebrate genomes have been sequenced and assembled to date. However, most sequencing projects have ignored the sex chromosomes unique to the heterogametic sex - Y and W - that are known as sex-limited chromosomes (SLCs). Indeed, haploid and repetitive Y chromosomes in species with male heterogamety (XY), and W chromosomes in species with female heterogamety (ZW), are difficult to sequence and assemble. Nevertheless, obtaining their sequences is important for understanding the intricacies of vertebrate genome function and evolution. Recent progress has been made towards the adaptation of next-generation sequencing (NGS) techniques to deciphering SLC sequences. We review here currently available methodology and results with regard to SLC sequencing and assembly. We focus on vertebrates, but bring in some examples from other taxa. Copyright © 2017 Elsevier Ltd. All rights reserved.
Development of DNA-Free Sediment for Ecological Assays with Genomic Endpoints
Recent advances in genomics are currently being exploited to discern ecological changes that have conventionally been measured using laborious counting techniques. For example, next generation sequencing technologies can be used to create DNA libraries from benthic community ass...
Storyboard method of end-user programming with natural language configuration
Bouchard, Ann M; Osbourn, Gordon C
2013-11-19
A technique for end-user programming includes populating a template with graphically illustrated actions and then invoking a command to generate a screen element based on the template. The screen element is rendered within a computing environment and provides a mechanism for triggering execution of a sequence of user actions. The sequence of user actions is based at least in part on the graphically illustrated actions populated into the template.
Next-generation sequencing in clinical virology: Discovery of new viruses.
Datta, Sibnarayan; Budhauliya, Raghvendra; Das, Bidisha; Chatterjee, Soumya; Vanlalhmuaka; Veer, Vijay
2015-08-12
Viruses are a cause of significant health problems worldwide, especially in developing nations. Due to different anthropological activities, human populations are exposed to different viral pathogens, many of which emerge as outbreaks. In such situations, discovery of novel viruses is of the utmost importance for deciding prevention and treatment strategies. Since the last century, a number of different virus discovery methods, based on cell culture inoculation and sequence-independent PCR, have been used for identification of a variety of viruses. However, the recent emergence and commercial availability of next-generation sequencers (NGS) has entirely changed the field of virus discovery. These massively parallel sequencing platforms can sequence genetic material from a very heterogeneous mix with high sensitivity. Moreover, these platforms work in a sequence-independent manner, making them ideal tools for virus discovery. However, for their application in clinics, sample preparation or enrichment is necessary to detect low-abundance virus populations. A number of techniques have also been developed for enrichment of viral nucleic acids. In this manuscript, we review the evolution of sequencing, the NGS technologies available today, and widely used virus enrichment technologies. We also discuss the challenges associated with their application in clinical virus discovery.
Hamula, Camille L A; Peng, Hanyong; Wang, Zhixin; Tyrrell, Gregory J; Li, Xing-Fang; Le, X Chris
2016-03-15
Streptococcus pyogenes is a clinically important pathogen consisting of various serotypes determined by different M proteins expressed on the cell surface. The M type is therefore a useful marker to monitor the spread of invasive S. pyogenes in a population. Serotyping and nucleic acid amplification/sequencing methods for the identification of M types are laborious, inconsistent, and usually confined to reference laboratories. The primary objective of this work is to develop a technique that enables generation of aptamers binding to specific M-types of S. pyogenes. We describe here an in vitro technique that directly used live bacterial cells and the Systematic Evolution of Ligands by Exponential Enrichment (SELEX) strategy. Live S. pyogenes cells were incubated with DNA libraries consisting of 40-nucleotide randomized sequences. Those sequences that bound to the cells were separated, amplified using polymerase chain reaction (PCR), purified using gel electrophoresis, and served as the input DNA pool for the next round of SELEX selection. A specially designed forward primer containing extended polyA20/5Sp9 facilitated gel electrophoresis purification of ssDNA after PCR amplification. A counter-selection step using non-target cells was introduced to improve selectivity. DNA libraries of different starting sequence diversity (10^16 and 10^14) were compared. Aptamer pools from each round of selection were tested for their binding to the target and non-target cells using flow cytometry. Selected aptamer pools were then cloned and sequenced. Individual aptamer sequences were screened on the basis of their binding to the 10 M-types that were used as targets. Aptamer pools obtained from SELEX rounds 5-8 showed high affinity to the target S. pyogenes cells. Tests against non-target Streptococcus bovis, Streptococcus pneumoniae, and Enterococcus species demonstrated selectivity of these aptamers for binding to S. pyogenes.
Several aptamer sequences were found to bind preferentially to the M11 M-type of S. pyogenes. Estimated binding dissociation constants (Kd) were in the low nanomolar range for the M11 specific sequences; for example, sequence E-CA20 had a Kd of 7±1 nM. These affinities are comparable to those of a monoclonal antibody. The improved bacterial cell-SELEX technique is successful in generating aptamers selective for S. pyogenes and some of its M-types. These aptamers are potentially useful for detecting S. pyogenes, achieving binding profiles of the various M-types, and developing new M-typing technologies for non-specialized laboratories or point-of-care testing. Copyright © 2015 Elsevier Inc. All rights reserved.
Haider, Nadia
2017-01-01
Investigation of genetic variation and phylogenetic relationships among date palm (Phoenix dactylifera L.) cultivars is useful for their conservation and genetic improvement. Various molecular markers such as restriction fragment length polymorphisms (RFLPs), simple sequence repeat (SSR), representational difference analysis (RDA), and amplified fragment length polymorphism (AFLP) have been developed to molecularly characterize date palm cultivars. PCR-based markers random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) are powerful tools to determine the relatedness of date palm cultivars that are difficult to distinguish morphologically. In this chapter, the principles, materials, and methods of RAPD and ISSR techniques are presented. Analysis of data generated from these two techniques and the use of these data to reveal phylogenetic relationships among date palm cultivars are also discussed.
Ryu, Nari; Lee, Seokwon; Park, Hong-Joon; Lee, Byeonghyeon; Kwon, Tae-Jun; Bok, Jinwoong; Park, Chan Ik; Lee, Kyu-Yup; Baek, Jeong-In; Kim, Un-Kyung
2017-09-05
Hereditary hearing loss (HHL) is a common genetically heterogeneous disorder, which follows Mendelian inheritance in humans. Because of this heterogeneity, the identification of the causative gene of HHL by linkage analysis or Sanger sequencing has shown economic and temporal limitations. With recent advances in next-generation sequencing (NGS) techniques, rapid identification of a causative gene via massively parallel sequencing is now possible. We recruited a Korean family with three generations exhibiting autosomal dominant inheritance of hearing loss (HL), and the clinical information about this family revealed that no other symptoms accompanied HL. To identify a causative mutation of HL in this family, we performed whole-exome sequencing of 4 family members, 3 affected and 1 unaffected. As a result, a novel splicing mutation, c.763+1G>T, in the solute carrier family 17, member 8 (SLC17A8) gene was identified in the patients, and the genotypes of the mutation co-segregated with the phenotype of HL. Additionally, this mutation was not detected in 100 Koreans with normal hearing. Via NGS, we detected a novel splicing mutation that might influence hearing ability in patients with autosomal dominant non-syndromic HL. Our data suggest that this technique is a powerful tool to discover causative genetic factors of HL and facilitate diagnoses of the primary cause of HHL. Copyright © 2017 Elsevier B.V. All rights reserved.
Characterization of the Gut Microbiome Using 16S or Shotgun Metagenomics
Jovel, Juan; Patterson, Jordan; Wang, Weiwei; Hotte, Naomi; O'Keefe, Sandra; Mitchel, Troy; Perry, Troy; Kao, Dina; Mason, Andrew L.; Madsen, Karen L.; Wong, Gane K.-S.
2016-01-01
The advent of next generation sequencing (NGS) has enabled investigations of the gut microbiome with unprecedented resolution and throughput. This has stimulated the development of sophisticated bioinformatics tools to analyze the massive amounts of data generated. Researchers therefore need a clear understanding of the key concepts required for the design, execution and interpretation of NGS experiments on microbiomes. We conducted a literature review and used our own data to determine which approaches work best. The two main approaches for analyzing the microbiome, 16S ribosomal RNA (rRNA) gene amplicons and shotgun metagenomics, are illustrated with analyses of libraries designed to highlight their strengths and weaknesses. Several methods for taxonomic classification of bacterial sequences are discussed. We present simulations to assess the number of sequences that are required to perform reliable appraisals of bacterial community structure. To the extent that fluctuations in the diversity of gut bacterial populations correlate with health and disease, we emphasize various techniques for the analysis of bacterial communities within samples (α-diversity) and between samples (β-diversity). Finally, we demonstrate techniques to infer the metabolic capabilities of a bacterial community from these 16S and shotgun data. PMID:27148170
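The distinction between within-sample (α) and between-sample (β) diversity can be made concrete with a short Python sketch. The abstract does not say which indices the authors used; the Shannon index and Bray-Curtis dissimilarity below are common choices picked here as assumptions, and the taxon counts are invented.

```python
import math

def shannon_index(counts):
    """Alpha diversity: Shannon entropy (natural log) of taxon abundances
    within one sample. Taxa with zero counts contribute nothing."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(sample_a, sample_b):
    """Beta diversity: Bray-Curtis dissimilarity between two samples whose
    count vectors list the same taxa in the same order.
    0 means identical composition, 1 means no shared taxa."""
    difference = sum(abs(a - b) for a, b in zip(sample_a, sample_b))
    return difference / (sum(sample_a) + sum(sample_b))

# Hypothetical read counts for four taxa in two gut samples.
healthy = [10, 10, 10, 10]
dysbiotic = [37, 1, 1, 1]

alpha_healthy = shannon_index(healthy)
alpha_dysbiotic = shannon_index(dysbiotic)
beta = bray_curtis(healthy, dysbiotic)
```

In this toy case the evenly distributed sample has the higher Shannon index, while the Bray-Curtis value summarizes how far apart the two compositions are.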
Germline transformation of the butterfly Bicyclus anynana.
Marcus, Jeffrey M; Ramos, Diane M; Monteiro, Antónia
2004-08-07
Ecological and evolutionary theory has frequently been inspired by the diversity of colour patterns on the wings of butterflies. More recently, these varied patterns have also become model systems for studying the evolution of developmental mechanisms. A technique that will facilitate our understanding of butterfly colour-pattern development is germline transformation. Germline transformation permits functional tests of candidate gene products and of cis-regulatory regions, and provides a means of generating new colour-pattern mutants by insertional mutagenesis. We report the successful transformation of the African satyrid butterfly Bicyclus anynana with two different transposable element vectors, Hermes and piggyBac, each carrying EGFP coding sequences driven by the 3XP3 synthetic enhancer that drives gene expression in the eyes. Candidate lines identified by screening for EGFP in adult eyes were later confirmed by PCR amplification of a fragment of the EGFP coding sequence from genomic DNA. Flanking DNA surrounding the insertions was amplified by inverse PCR and sequenced. Transformation rates were 5% for piggyBac and 10.2% for Hermes. Ultimately, the new data generated by these techniques may permit an integrated understanding of the developmental genetics of colour-pattern formation and of the ecological and evolutionary processes in which these patterns play a role.
Zheng, Guanglou; Fang, Gengfa; Shankaran, Rajan; Orgun, Mehmet A; Zhou, Jie; Qiao, Li; Saleem, Kashif
2017-05-01
Generating random binary sequences (BSes) is a fundamental requirement in cryptography. A BS is a sequence of N bits, and each bit has a value of 0 or 1. For securing sensors within wireless body area networks (WBANs), electrocardiogram (ECG)-based BS generation methods have been widely investigated in which interpulse intervals (IPIs) from each heartbeat cycle are processed to produce BSes. Using these IPI-based methods to generate a 128-bit BS in real time normally takes around half a minute. In order to improve the time efficiency of such methods, this paper presents an ECG multiple fiducial-points based binary sequence generation (MFBSG) algorithm. The technique of discrete wavelet transforms is employed to detect arrival time of these fiducial points, such as P, Q, R, S, and T peaks. Time intervals between them, including RR, RQ, RS, RP, and RT intervals, are then calculated based on this arrival time, and are used as ECG features to generate random BSes with low latency. According to our analysis on real ECG data, these ECG feature values exhibit the property of randomness and, thus, can be utilized to generate random BSes. Compared with the schemes that solely rely on IPIs to generate BSes, this MFBSG algorithm uses five feature values from one heart beat cycle, and can be up to five times faster than the solely IPI-based methods. So, it achieves a design goal of low latency. According to our analysis, the complexity of the algorithm is comparable to that of fast Fourier transforms. These randomly generated ECG BSes can be used as security keys for encryption or authentication in a WBAN system.
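The speed-up claimed above follows from simple counting: more feature values per heartbeat means fewer heartbeats per key. The sketch below assumes a hypothetical encoding in which each interval is quantized to whole milliseconds and its 4 low-order bits are kept; the abstract does not specify the actual MFBSG encoding, so both the function names and the 4-bit choice are illustrative assumptions.

```python
def intervals_to_bits(intervals_ms, bits_per_value=4):
    """Concatenate the low-order bits of each quantized interval (assumed encoding)."""
    mask = (1 << bits_per_value) - 1
    chunks = []
    for value in intervals_ms:
        quantized = int(round(value)) & mask  # keep only the low-order bits
        chunks.append(format(quantized, "0{}b".format(bits_per_value)))
    return "".join(chunks)

def beats_needed(key_bits, features_per_beat, bits_per_value=4):
    """Number of heartbeats required to collect key_bits bits (ceiling division)."""
    bits_per_beat = features_per_beat * bits_per_value
    return -(-key_bits // bits_per_beat)

# One feature per beat (IPI only) versus five features per beat
# (RR, RQ, RS, RP, and RT intervals), for a 128-bit sequence.
ipi_only = beats_needed(128, features_per_beat=1)
multi_fiducial = beats_needed(128, features_per_beat=5)
```

Under these assumptions an IPI-only scheme needs 32 beats and a five-feature scheme needs 7, which is consistent with the "up to five times faster" figure.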
Development of DNA-Free Sediment for Ecological Assays with Genomic Endpoints (NAC SETAC)
Recent advances in genomics are currently being exploited to discern ecological changes that have conventionally been measured using laborious counting techniques. For example, next generation sequencing technologies can be used to create DNA libraries from benthic community ass...
Characterizing differential gene expression in polyploid grasses lacking a reference transcriptome
USDA-ARS's Scientific Manuscript database
Basal transcriptome characterization and differential gene expression in response to varying conditions are often addressed through next generation sequencing (NGS) and data analysis techniques. While these strategies are commonly used, there are countless tools, pipelines, data analysis methods an...
Phenotypic mutant library: potential for gene discovery
USDA-ARS's Scientific Manuscript database
The rapid development of high throughput and affordable Next- Generation Sequencing (NGS) techniques has renewed interest in gene discovery using forward genetics. The conventional forward genetic approach starts with isolation of mutants with a phenotype of interest, mapping the mutation within a s...
RBT-GA: a novel metaheuristic for solving the Multiple Sequence Alignment problem.
Taheri, Javid; Zomaya, Albert Y
2009-07-07
Multiple Sequence Alignment (MSA) has always been an active area of research in Bioinformatics. MSA is mainly focused on discovering biologically meaningful relationships among different sequences or proteins in order to investigate the underlying main characteristics/functions. This information is also used to generate phylogenetic trees. This paper presents a novel approach, namely RBT-GA, to solve the MSA problem using a hybrid solution methodology combining the Rubber Band Technique (RBT) and the Genetic Algorithm (GA) metaheuristic. RBT is inspired by the behavior of an elastic Rubber Band (RB) on a plate with several poles, which are analogous to locations in the input sequences that could potentially be biologically related. A GA attempts to mimic the evolutionary processes of life in order to locate optimal solutions in an often very complex landscape. RBT-GA is a population-based optimization algorithm designed to find the optimal alignment for a set of input protein sequences. In this novel technique, each alignment answer is modeled as a chromosome consisting of several poles in the RBT framework. These poles resemble locations in the input sequences that are most likely to be correlated and/or biologically related. A GA-based optimization process improves these chromosomes gradually, yielding a set of mostly optimal answers for the MSA problem. RBT-GA is tested with one of the well-known benchmark suites (BALiBASE 2.0) in this area. The obtained results show the superiority of the proposed technique even in the case of formidable sequences.
Application of the MIDAS approach for analysis of lysine acetylation sites.
Evans, Caroline A; Griffiths, John R; Unwin, Richard D; Whetton, Anthony D; Corfe, Bernard M
2013-01-01
Multiple Reaction Monitoring Initiated Detection and Sequencing (MIDAS™) is a mass spectrometry-based technique for the detection and characterization of specific post-translational modifications (Unwin et al. 4:1134-1144, 2005), for example acetylated lysine residues (Griffiths et al. 18:1423-1428, 2007). The MIDAS™ technique has application for discovery and analysis of acetylation sites. It is a hypothesis-driven approach that requires a priori knowledge of the primary sequence of the target protein and a proteolytic digest of this protein. MIDAS essentially performs a targeted search for the presence of modified, for example acetylated, peptides. The detection is based on the combination of the predicted molecular weight (measured as mass-charge ratio) of the acetylated proteolytic peptide and a diagnostic fragment (product ion of m/z 126.1), which is generated by specific fragmentation of acetylated peptides during collision induced dissociation performed in tandem mass spectrometry (MS) analysis. Sequence information is subsequently obtained which enables acetylation site assignment. The technique of MIDAS was later trademarked by ABSciex for targeted protein analysis where an MRM scan is combined with full MS/MS product ion scan to enable sequence confirmation.
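The detection logic described above pairs a predicted precursor m/z with the diagnostic product ion at m/z 126.1. The sketch below shows how the predicted precursor value could be computed for an acetylated peptide, using standard monoisotopic residue masses and the +42.010565 Da mass shift of one acetyl group. The function names and the example peptide are illustrative assumptions and not part of the MIDAS software.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565    # added once per peptide for the free termini
PROTON = 1.007276    # mass of one charge-carrying proton
ACETYL = 42.010565   # mass shift of one acetylation (C2H2O)

def peptide_mz(sequence, charge, n_acetyl=0):
    """Predicted m/z of a peptide carrying n_acetyl acetyl groups."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + n_acetyl * ACETYL
    return (mass + charge * PROTON) / charge

def mrm_transition(sequence, charge, n_acetyl=1, diagnostic_ion=126.1):
    """Return (precursor m/z, product m/z) for a targeted search."""
    return (peptide_mz(sequence, charge, n_acetyl), diagnostic_ion)

# Hypothetical tryptic peptide with one acetylated lysine, doubly charged.
precursor, product = mrm_transition("AGKVR", charge=2)
```

Each candidate peptide from the in silico digest yields one such transition, so the a priori knowledge of the protein sequence is what defines the list of precursor values to monitor.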
Roden, Suzanne E; Dutton, Peter H; Morin, Phillip A
2009-01-01
The green sea turtle, Chelonia mydas, was used as a case study for single nucleotide polymorphism (SNP) discovery in a species that has little genetic sequence information available. As green turtles have a complex population structure, additional nuclear markers other than microsatellites could add to our understanding of their complex life history. Amplified fragment length polymorphism technique was used to generate sets of random fragments of genomic DNA, which were then electrophoretically separated with precast gels, stained with SYBR green, excised, and directly sequenced. It was possible to perform this method without the use of polyacrylamide gels, radioactive or fluorescent labeled primers, or hybridization methods, reducing the time, expense, and safety hazards of SNP discovery. Within 13 loci, 2547 base pairs were screened, resulting in the discovery of 35 SNPs. Using this method, it was possible to yield a sufficient number of loci to screen for SNP markers without the availability of prior sequence information.
Continuities in stone flaking technology at Liang Bua, Flores, Indonesia.
Moore, M W; Sutikna, T; Jatmiko; Morwood, M J; Brumm, A
2009-11-01
This study examines trends in stone tool reduction technology at Liang Bua, Flores, Indonesia, where excavations have revealed a stratified artifact sequence spanning 95k.yr. The reduction sequence practiced throughout the Pleistocene was straightforward and unchanging. Large flakes were produced off-site and carried into the cave where they were reduced centripetally and bifacially by four techniques: freehand, burination, truncation, and bipolar. The locus of technological complexity at Liang Bua was not in knapping products, but in the way techniques were integrated. This reduction sequence persisted across the Pleistocene/Holocene boundary with a minor shift favoring unifacial flaking after 11ka. Other stone-related changes occurred at the same time, including the first appearance of edge-glossed flakes, a change in raw material selection, and more frequent fire-induced damage to stone artifacts. Later in the Holocene, technological complexity was generated by "adding-on" rectangular-sectioned stone adzes to the reduction sequence. The Pleistocene pattern is directly associated with Homo floresiensis skeletal remains and the Holocene changes correlate with the appearance of Homo sapiens. The one reduction sequence continues across this hominin replacement.
Manlig, Erika; Wahlberg, Per
2017-01-01
Sodium bisulphite treatment of DNA combined with next generation sequencing (NGS) is a powerful combination for the interrogation of genome-wide DNA methylation profiles. Library preparation for whole genome bisulphite sequencing (WGBS) is challenging due to side effects of the bisulphite treatment, which leads to extensive DNA damage. Recently, a new generation of methods for bisulphite sequencing library preparation has been devised. They are based on initial bisulphite treatment of the DNA, followed by adaptor tagging of single-stranded DNA fragments, and enable WGBS using low quantities of input DNA. In this study, we present a novel approach for quick and cost-effective WGBS library preparation that is based on splinted adaptor tagging (SPLAT) of bisulphite-converted single-stranded DNA. Moreover, we validate SPLAT against three commercially available WGBS library preparation techniques, two of which are based on bisulphite treatment prior to adaptor tagging and one of which is a conventional WGBS method. PMID:27899585
Zhang, Lu; Xu, Jinhao; Ma, Jinbiao
2016-07-25
RNA-binding proteins exert important biological functions by specifically recognizing RNA motifs. SELEX (systematic evolution of ligands by exponential enrichment), an in vitro selection method, can obtain consensus motifs with high affinity and specificity for many target molecules from DNA or RNA libraries. Here, we combined SELEX with next-generation sequencing to study protein-RNA interactions in vitro. A pool of RNAs with 20 bp random sequences was transcribed from a T7 promoter, and the target protein was inserted into a plasmid containing an SBP-tag, which can be captured by streptavidin beads. After only one cycle, the specific RNA motif can be obtained, which dramatically improves the selection efficiency. Using this method, we found that the human hnRNP A1 RRMs domain (UP1 domain) bound RNA motifs containing AGG and AG sequences. The EMSA experiment indicated that hnRNP A1 RRMs could bind the obtained RNA motif. Taken together, this approach provides a rapid and effective way to study the RNA binding specificity of proteins.
DNA copy number, including telomeres and mitochondria, assayed using next-generation sequencing.
Castle, John C; Biery, Matthew; Bouzek, Heather; Xie, Tao; Chen, Ronghua; Misura, Kira; Jackson, Stuart; Armour, Christopher D; Johnson, Jason M; Rohl, Carol A; Raymond, Christopher K
2010-04-16
DNA copy number variations occur within populations and aberrations can cause disease. We sought to develop an improved lab-automatable, cost-efficient, accurate platform to profile DNA copy number. We developed a sequencing-based assay of nuclear, mitochondrial, and telomeric DNA copy number that draws on the unbiased nature of next-generation sequencing and incorporates techniques developed for RNA expression profiling. To demonstrate this platform, we assayed UMC-11 cells using 5 million 33 nt reads and found tremendous copy number variation, including regions of single and homogeneous deletions and amplifications to 29 copies; 5 times more mitochondria and 4 times less telomeric sequence than a pool of non-diseased, blood-derived DNA; and that UMC-11 was derived from a male individual. The described assay outputs absolute copy number, outputs an error estimate (p-value), and is more accurate than array-based platforms at high copy number. The platform enables profiling of mitochondrial levels and telomeric length. The assay is lab-automatable and has a genomic resolution and cost that are tunable based on the number of sequence reads.
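The idea that copy number can be read from sequence-read counts can be sketched as a depth ratio against a reference sample. The sketch below is a simplified illustration, assuming reads have already been counted per genomic bin and that the reference is diploid; it is not the authors' assay, which also reports an error estimate, and the bin counts are made up.

```python
def copy_number(sample_counts, reference_counts, reference_ploidy=2):
    """Estimate copy number per bin from read counts relative to a reference.

    Each count list holds reads per genomic bin, in the same bin order.
    Counts are normalized by library size before the ratio is taken.
    Bins with no reference reads are reported as None.
    """
    sample_total = sum(sample_counts)
    reference_total = sum(reference_counts)
    estimates = []
    for s, r in zip(sample_counts, reference_counts):
        if r == 0:
            estimates.append(None)
            continue
        ratio = (s / sample_total) / (r / reference_total)
        estimates.append(reference_ploidy * ratio)
    return estimates

# Hypothetical counts for four bins: two normal, one amplified, one deleted.
sample = [100, 100, 200, 0]
reference = [50, 50, 50, 50]
estimates = copy_number(sample, reference)
```

Normalizing by total reads is the simplest choice and becomes biased when a large fraction of the genome is amplified or deleted; a median-based normalization is a common alternative. Resolution scales with the number of reads per bin, which matches the tunable resolution and cost described above.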
HomSI: a homozygous stretch identifier from next-generation sequencing data.
Görmez, Zeliha; Bakir-Gungor, Burcu; Sagiroglu, Mahmut Samil
2014-02-01
In consanguineous families, as a result of inheriting the same genomic segments through both parents, the individuals have stretches of their genomes that are homozygous. This situation leads to the prevalence of recessive diseases among the members of these families. Homozygosity mapping is based on this observation, and in consanguineous families, several recessive disease genes have been discovered with the help of this technique. Researchers typically use single nucleotide polymorphism arrays to determine the homozygous regions and then search for the disease gene by sequencing the genes within these candidate disease loci. Recently, the advent of next-generation sequencing has enabled the concurrent identification of homozygous regions and the detection of mutations relevant for diagnosis, using data from a single sequencing experiment. In this respect, we have developed a novel tool that identifies homozygous regions using deep sequence data. Using *.vcf (variant call format) files as input, our program identifies the majority of homozygous regions found by microarray single nucleotide polymorphism genotype data. HomSI software is freely available at www.igbam.bilgem.tubitak.gov.tr/softwares/HomSI, with an online manual.
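Finding homozygous stretches in variant calls can be sketched as a scan for runs of consecutive homozygous genotypes. The sketch below is not the HomSI algorithm, whose details are not given in the abstract; it assumes a single-sample VCF with the genotype (GT) as the first FORMAT field, and the `min_sites` threshold and example records are hypothetical.

```python
def homozygous_runs(vcf_lines, min_sites=3):
    """Return (chrom, start, end, n_sites) for runs of homozygous calls."""
    runs = []
    current = None  # [chrom, start, end, n_sites] of the run being extended
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        genotype = fields[9].split(":")[0].replace("|", "/")
        alleles = genotype.split("/")
        is_hom = "." not in alleles and len(set(alleles)) == 1
        if is_hom and current and current[0] == chrom:
            current[2] = pos      # extend the run to this site
            current[3] += 1
        else:
            # A heterozygous call or a new chromosome closes the run.
            if current and current[3] >= min_sites:
                runs.append(tuple(current))
            current = [chrom, pos, pos, 1] if is_hom else None
    if current and current[3] >= min_sites:
        runs.append(tuple(current))
    return runs

def record(chrom, pos, gt):
    """Build a minimal single-sample VCF data line (illustration only)."""
    return "\t".join([chrom, str(pos), ".", "A", "G", "50", "PASS", ".", "GT", gt])

example_vcf = [
    "##fileformat=VCFv4.2",
    record("chr1", 100, "1/1"), record("chr1", 200, "1/1"),
    record("chr1", 300, "0/0"), record("chr1", 400, "0/1"),
    record("chr1", 500, "1/1"), record("chr2", 100, "1/1"),
]
runs = homozygous_runs(example_vcf)
```

A practical tool would also tolerate occasional heterozygous calls caused by sequencing error and would weigh run length in base pairs rather than in site counts; both are left out here for brevity.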
DNA copy number, including telomeres and mitochondria, assayed using next-generation sequencing
2010-01-01
Background DNA copy number variations occur within populations and aberrations can cause disease. We sought to develop an improved lab-automatable, cost-efficient, accurate platform to profile DNA copy number. Results We developed a sequencing-based assay of nuclear, mitochondrial, and telomeric DNA copy number that draws on the unbiased nature of next-generation sequencing and incorporates techniques developed for RNA expression profiling. To demonstrate this platform, we assayed UMC-11 cells using 5 million 33 nt reads and found tremendous copy number variation, including regions of single and homogeneous deletions and amplifications to 29 copies; 5 times more mitochondria and 4 times less telomeric sequence than a pool of non-diseased, blood-derived DNA; and that UMC-11 was derived from a male individual. Conclusion The described assay outputs absolute copy number, outputs an error estimate (p-value), and is more accurate than array-based platforms at high copy number. The platform enables profiling of mitochondrial levels and telomeric length. The assay is lab-automatable and has a genomic resolution and cost that are tunable based on the number of sequence reads. PMID:20398377
The identification of cis-regulatory elements: A review from a machine learning perspective.
Li, Yifeng; Chen, Chih-Yu; Kaye, Alice M; Wasserman, Wyeth W
2015-12-01
The majority of the human genome consists of non-coding regions that have been called junk DNA. However, recent studies have unveiled that these regions contain cis-regulatory elements, such as promoters, enhancers, silencers, insulators, etc. These regulatory elements can play crucial roles in controlling gene expressions in specific cell types, conditions, and developmental stages. Disruption to these regions could contribute to phenotype changes. Precisely identifying regulatory elements is key to deciphering the mechanisms underlying transcriptional regulation. Cis-regulatory events are complex processes that involve chromatin accessibility, transcription factor binding, DNA methylation, histone modifications, and the interactions between them. The development of next-generation sequencing techniques has allowed us to capture these genomic features in depth. Applied analysis of genome sequences for clinical genetics has increased the urgency for detecting these regions. However, the complexity of cis-regulatory events and the deluge of sequencing data require accurate and efficient computational approaches, in particular, machine learning techniques. In this review, we describe machine learning approaches for predicting transcription factor binding sites, enhancers, and promoters, primarily driven by next-generation sequencing data. Data sources are provided in order to facilitate testing of novel methods. The purpose of this review is to attract computational experts and data scientists to advance this field. Crown Copyright © 2015. Published by Elsevier Ireland Ltd. All rights reserved.
Next-generation sequencing library construction on a surface.
Feng, Kuan; Costa, Justin; Edwards, Jeremy S
2018-05-30
Next-generation sequencing (NGS) has revolutionized almost all fields of biology, agriculture and medicine, and is widely utilized to analyse genetic variation. Over the past decade, the NGS pipeline has been steadily improved, and the entire process is currently relatively straightforward. However, NGS instrumentation still requires upfront library preparation, which can be a laborious process, requiring significant hands-on time. Herein, we present a simple but robust approach to streamline library preparation by utilizing surface bound transposases to construct DNA libraries directly on a flowcell surface. The surface bound transposases directly fragment genomic DNA while simultaneously attaching the library molecules to the flowcell. We sequenced and analysed a Drosophila genome library generated by this surface tagmentation approach, and we showed that our surface bound library quality was comparable to the quality of the library from a commercial kit. In addition to the time and cost savings, our approach does not require PCR amplification of the library, which eliminates potential problems associated with PCR duplicates. We described the first study to construct libraries directly on a flowcell. We believe our technique could be incorporated into the existing Illumina sequencing pipeline to simplify the workflow, reduce costs, and improve data quality.
Next generation sequencing applications for breast cancer research
PETRIC, ROXANA COJOCNEANU; POP, LAURA-ANCUTA; JURJ, ANCUTA; RADULY, LAJOS; DUMITRASCU, DAN; DRAGOS, NICOLAE; NEAGOE, IOANA BERINDAN
2015-01-01
For some time, cancer has not been thought of as a disease, but as a multifaceted, heterogeneous complex of genotypic and phenotypic manifestations leading to tumorigenesis. Due to recent technological progress, the outcome of cancer patients can be greatly improved by introducing into clinical practice the advantages brought about by the development of next generation sequencing techniques. Biomedical suppliers have come up with various applications which medical researchers can use to characterize a patient’s disease from a molecular and genetic point of view in order to provide caregivers with rapid and relevant information to guide them in choosing the most appropriate course of treatment, with maximum efficiency and minimal side effects. Breast cancer, whose incidence has risen dramatically, is a good candidate for these novel diagnostic and therapeutic approaches, particularly when referring to specific sequencing panels which are designed to detect germline or somatic mutations in genes that are involved in breast cancer tumorigenesis and progression. Benchtop next generation sequencing machines are becoming a more common presence in the clinical setting, empowering physicians to better treat their patients by offering early diagnosis alternatives and targeted remedies, and bringing medicine a step closer to achieving its ultimate goal, personalized therapy. PMID:26609257
Piazza, Rocco; Magistroni, Vera; Pirola, Alessandra; Redaelli, Sara; Spinelli, Roberta; Redaelli, Serena; Galbiati, Marta; Valletta, Simona; Giudici, Giovanni; Cazzaniga, Giovanni; Gambacorti-Passerini, Carlo
2013-01-01
Copy number alterations (CNA) are common events occurring in leukaemias and solid tumors. Comparative Genome Hybridization (CGH) is currently the gold standard technique to analyze CNAs; however, CGH analysis requires dedicated instruments and is able to perform only low-resolution Loss of Heterozygosity (LOH) analyses. Here we present CEQer (Comparative Exome Quantification analyzer), a new graphical, event-driven tool for CNA/allelic-imbalance (AI) coupled analysis of exome sequencing data. By using case-control matched exome data, CEQer performs a comparative digital exonic quantification to generate CNA data and couples this information with exome-wide LOH and allelic imbalance detection. This data is used to build mixed statistical/heuristic models allowing the identification of CNA/AI events. To test our tool, we initially used in silico generated data, then we performed whole-exome sequencing of 20 leukemic specimens and corresponding matched controls and analyzed the results using CEQer. Taken together, these analyses showed that the combined use of comparative digital exon quantification and LOH/AI detection generates very accurate CNA data. Therefore, we propose CEQer as an efficient, robust and user-friendly graphical tool for the identification of CNA/AI in the context of whole-exome sequencing data.
Chang, Elizabeth; Pourmal, Sergei; Zhou, Chun; Kumar, Rupesh; Teplova, Marianna; Pavletich, Nikola P; Marians, Kenneth J; Erdjument-Bromage, Hediye
2016-07-01
In recent history, alternative approaches to Edman sequencing have been investigated, and to this end, the Association of Biomolecular Resource Facilities (ABRF) Protein Sequencing Research Group (PSRG) initiated studies in 2014 and 2015, looking into bottom-up and top-down N-terminal (Nt) dimethyl derivatization of standard quantities of intact proteins with the aim to determine Nt sequence information. We have expanded this initiative and used low picomole amounts of myoglobin to determine the efficiency of Nt-dimethylation. Application of this approach on protein domains, generated by limited proteolysis of overexpressed proteins, confirms that it is a universal labeling technique and is very sensitive when compared with Edman sequencing. Finally, we compared Edman sequencing and Nt-dimethylation of the same polypeptide fragments; results confirm that there is agreement in the identity of the Nt amino acid sequence between these 2 methods.
High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing.
Lagarde, Julien; Uszczynska-Ratajczak, Barbara; Carbonell, Silvia; Pérez-Lluch, Sílvia; Abad, Amaya; Davis, Carrie; Gingeras, Thomas R; Frankish, Adam; Harrow, Jennifer; Guigo, Roderic; Johnson, Rory
2017-12-01
Accurate annotation of genes and their transcripts is a foundation of genomics, but currently no annotation technique combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, and thousands more remain uncataloged, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), which combines targeted RNA capture with third-generation long-read sequencing. Here we present an experimental reannotation of the GENCODE intergenic lncRNA populations in matched human and mouse tissues that resulted in novel transcript models for 3,574 and 561 gene loci, respectively. CLS approximately doubled the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enabled us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a long-standing bottleneck in transcriptome annotation and generates manual-quality full-length transcript models at high-throughput scales.
Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis
2012-01-01
Background Multiplexing has become the major limitation of next-generation sequencing (NGS) when applied to low-complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach permits simultaneous analysis of only up to several dozen samples. Results Here we introduce pair-barcode sequencing (PBS), an economical and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using a SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously discovered by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated and about 64% of them were assigned to both of the barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns were captured from the two clinical groups. The strong correlation using different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid. Conclusions By employing the PBS approach in NGS, large-scale multiplexed pooled samples could be practically analyzed in parallel, so that high-throughput sequencing economically meets the requirements of samples with low sequencing-throughput demands. PMID:22276739
Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis.
Tu, Jing; Ge, Qinyu; Wang, Shengqin; Wang, Lei; Sun, Beili; Yang, Qi; Bai, Yunfei; Lu, Zuhong
2012-01-25
Multiplexing has become the major limitation of next-generation sequencing (NGS) when applied to low-complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach permits simultaneous analysis of only up to several dozen samples. Here we introduce pair-barcode sequencing (PBS), an economical and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using a SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously discovered by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated and about 64% of them were assigned to both of the barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns were captured from the two clinical groups. The strong correlation using different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid. By employing the PBS approach in NGS, large-scale multiplexed pooled samples could be practically analyzed in parallel, so that high-throughput sequencing economically meets the requirements of samples with low sequencing-throughput demands.
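The combinatorial gain of pairing barcodes (4 forward × 8 reverse = 32 libraries from only 12 barcode sequences) can be sketched as a lookup table keyed on the barcode pair. The barcode sequences and function names below are hypothetical placeholders, not the ones used in the study, and exact matching is assumed for simplicity.

```python
from itertools import product

# Hypothetical barcode sequences, for illustration only.
FORWARD = ["ACGT", "TGCA", "GATC", "CTAG"]
REVERSE = ["AACC", "CCAA", "GGTT", "TTGG", "AGAG", "CTCT", "GAGA", "TCTC"]

def build_sample_table(forward, reverse):
    """Map every (forward, reverse) barcode pair to a sample index."""
    return {pair: index for index, pair in enumerate(product(forward, reverse))}

def assign_read(forward_tag, reverse_tag, table):
    """Return the sample index for a read, or None if either tag is unknown."""
    return table.get((forward_tag, reverse_tag))

table = build_sample_table(FORWARD, REVERSE)
n_samples = len(table)                       # 4 * 8 pairs
first = assign_read("ACGT", "AACC", table)
unassigned = assign_read("ACGT", "NNNN", table)
```

A read is assigned only when both tags are recognized, which is why a fraction of reads (about 36% in the pilot runs described above) ends up without a sample. Real demultiplexers usually also allow a small number of mismatches per tag.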
Sequence Ready Characterization of the Pericentromeric Region of 19p12
DOE Office of Scientific and Technical Information (OSTI.GOV)
Evan E. Eichler
2006-08-31
Current mapping and sequencing strategies have been inadequate within the proximal portion of 19p12 due, in part, to the presence of a recently expanded ZNF (zinc-finger) gene family and the presence of large (25-50 kb) inverted beta-satellite repeat structures which bracket this tandemly duplicated gene family. The virtual absence of classically defined “unique” sequence within the region has hampered efforts to identify and characterize a suitable minimal tiling path of clones which can be used as templates required for finished sequencing of the region. The goal of this proposal is to develop and implement a novel sequence-anchor strategy to generate a contiguous BAC map of the most proximal portion of chromosome 19p12 for the purpose of complete sequence characterization. The target region will be an estimated 4.5 Mb of DNA extending from STS marker D19S450 (the beginning of the ZNF gene cluster) to the centromeric (alpha-satellite) junction of 19p11. The approach will entail 1) pre-selection of 19p12 BAC and cosmid clones (NIH approved library) utilizing both 19p12-unique and 19p12-specific repeat probes (Eichler et al., 1998); 2) the generation of a BAC/cosmid end-sequence map across the region with a density of one marker every 8 kb; 3) the development of a second generation of STS (sequence tagged sites) which will be used to identify and verify clonal overlap at the level of the sequence; 4) incorporation of these sequence-anchored overlapping clones into existing cosmid/BAC restriction maps developed at Livermore National Laboratory; and 5) validation of the organization of this region utilizing high-resolution FISH techniques (extended chromatin analysis) on monochromosomal 19 somatic cell hybrids and parental cell lines of source material. The data generated will be used in the selection of the most parsimonious tiling path of BAC clones to be sequenced as part of the JGI effort on chromosome 19 and should serve as a model for the sequence characterization of other difficult regions of the human genome.
Burden, S; Lin, Y-X; Zhang, R
2005-03-01
Although a great deal of research has been undertaken in the area of promoter prediction, prediction techniques are still not fully developed. Many algorithms tend to exhibit poor specificity, generating many false positives, or poor sensitivity. The neural network prediction program NNPP2.2 is one such example. To improve the NNPP2.2 prediction technique, the distance between the transcription start site (TSS) associated with the promoter and the translation start site (TLS) of the subsequent gene coding region has been studied for Escherichia coli K12 bacteria. An empirical probability distribution that is consistent for all E. coli promoters has been established. This information is combined with the results from NNPP2.2 to create a new technique called TLS-NNPP, which improves the specificity of promoter prediction. The technique is shown to be effective using E. coli DNA sequences; however, it is applicable to any organism for which a set of promoters has been experimentally defined. The data used in this project and the prediction results for the tested sequences can be obtained from http://www.uow.edu.au/~yanxia/E_Coli_paper/SBurden_Results.xls. Contact: alh98@uow.edu.au.
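One way to picture combining a predictor's score with an empirical TSS-to-TLS distance distribution is to weight each candidate by how often its distance was seen among known promoters. The abstract does not state the actual combination rule used in TLS-NNPP, so the multiplication below, the bin width, and all names and numbers are assumptions for illustration only.

```python
def empirical_distribution(distances, bin_width=50):
    """Histogram of observed TSS-to-TLS distances, normalized to probabilities."""
    counts = {}
    for d in distances:
        bin_index = d // bin_width
        counts[bin_index] = counts.get(bin_index, 0) + 1
    total = len(distances)
    return {bin_index: c / total for bin_index, c in counts.items()}

def rescore(candidates, distribution, bin_width=50):
    """Weight each (score, distance) candidate by its distance probability.

    Assumed rule: combined = predictor score * empirical probability of the
    candidate's TSS-to-TLS distance. Unseen distance bins get weight 0.
    """
    return [
        score * distribution.get(distance // bin_width, 0.0)
        for score, distance in candidates
    ]

# Hypothetical training distances (bp) and two candidates with equal scores.
known_distances = [20, 30, 40, 120]
distribution = empirical_distribution(known_distances)
combined = rescore([(0.8, 25), (0.8, 300)], distribution)
```

Here the two candidates have the same predictor score, but the one at an implausible distance from the translation start is suppressed, which is the sense in which distance information can reduce false positives.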
Web Navigation Sequences Automation in Modern Websites
NASA Astrophysics Data System (ADS)
Montoto, Paula; Pan, Alberto; Raposo, Juan; Bellas, Fernando; López, Javier
Most of today's web sources are designed to be used by humans and do not provide suitable interfaces for software programs. That is why a growing interest has arisen in so-called web automation applications, which are widely used for purposes such as B2B integration, automated testing of web applications, or technology and business watch. Previous proposals assume models for generating and reproducing navigation sequences that cannot correctly deal with new websites using technologies such as AJAX: on the one hand, existing systems only allow recording simple navigation actions and, on the other hand, they are unable to detect the end of the effects caused by a user action. In this paper, we propose a set of new techniques to record and execute web navigation sequences that can deal with all the complexity existing in AJAX-based web sites. We also present an exhaustive evaluation of the proposed techniques that shows very promising results.
Compression of computer generated phase-shifting hologram sequence using AVC and HEVC
NASA Astrophysics Data System (ADS)
Xing, Yafei; Pesquet-Popescu, Béatrice; Dufaux, Frederic
2013-09-01
With the capability of achieving twice the compression ratio of Advanced Video Coding (AVC) with similar reconstruction quality, High Efficiency Video Coding (HEVC) is expected to become the new leading technique of video coding. In order to reduce the storage and transmission burden of digital holograms, in this paper we propose to use HEVC for compressing phase-shifting digital hologram sequences (PSDHS). By simulating phase-shifting digital holography (PSDH) interferometry, interference patterns between illuminated three-dimensional (3D) virtual objects and the stepwise phase-changed reference wave are generated as digital holograms. The hologram sequences are obtained by the movement of the virtual objects and compressed by AVC and HEVC. The experimental results show that AVC and HEVC compress PSDHS efficiently, with HEVC giving better performance. A good compression rate and reconstruction quality can be obtained with bitrates above 15000 kbps.
Next generation sequencing--implications for clinical practice.
Raffan, Eleanor; Semple, Robert K
2011-01-01
Genetic testing in inherited disease has traditionally relied upon recognition of the presenting clinical syndrome and targeted analysis of genes known to be linked to that syndrome. Consequently, many patients with genetic syndromes remain without a specific diagnosis. New 'next-generation' sequencing (NGS) techniques permit simultaneous sequencing of enormous amounts of DNA. A slew of research publications have recently demonstrated the tremendous power of these technologies in increasing understanding of human genetic disease. These approaches are likely to be increasingly employed in routine diagnostic practice, but the scale of the genetic information yielded about individuals means that caution must be exercised to avoid net harm in this setting. Use of NGS in a research setting will increasingly have a major but indirect beneficial impact on clinical practice. However, important technical, ethical and social challenges need to be addressed through informed professional and public dialogue before it finds its mature niche as a direct tool in the clinical diagnostic armoury.
Application of Genomic Technologies to the Breeding of Trees
Badenes, Maria L.; Fernández i Martí, Angel; Ríos, Gabino; Rubio-Cabetas, María J.
2016-01-01
The recent introduction of next generation sequencing (NGS) technologies represents a major revolution in providing new tools for identifying the genes and/or genomic intervals controlling important traits for selection in breeding programs. In perennial fruit trees with long generation times and large sizes of adult plants, the impact of these techniques is even more important. High-throughput DNA sequencing technologies have provided complete annotated sequences in many important tree species. Most of the high-throughput genotyping platforms described are being used for studies of genetic diversity and population structure. Dissection of complex traits became possible through the availability of genome sequences along with phenotypic variation data, which allow the elucidation of the causative genetic differences that give rise to observed phenotypic variation. Association mapping facilitates the association between genetic markers and phenotype in unstructured and complex populations, identifying molecular markers for assisted selection and breeding. Also, genomic data provide in silico identification and characterization of genes and gene families related to important traits, enabling new tools for molecular marker assisted selection in tree breeding. Deep sequencing of transcriptomes is also a powerful tool for the analysis of precise expression levels of each gene in a sample. It consists of quantifying short cDNA reads, obtained by NGS technologies, in order to compare the entire transcriptomes between genotypes and environmental conditions. The miRNAs are non-coding short RNAs involved in the regulation of different physiological processes, which can be identified by high-throughput sequencing of RNA libraries obtained by reverse transcription of purified short RNAs, and by in silico comparison with known miRNAs from other species.
Altogether, NGS techniques and their applications have increased the resources for plant breeding in tree species, closing the former gap of genetic tools between trees and annual species. PMID:27895664
Identifying novel sequence variants of RNA 3D motifs
Zirbel, Craig L.; Roll, James; Sweeney, Blake A.; Petrov, Anton I.; Pirrung, Meg; Leontis, Neocles B.
2015-01-01
Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723
Minimizing structural vibrations with Input Shaping (TM)
NASA Technical Reports Server (NTRS)
Singhose, Bill; Singer, Neil
1995-01-01
A new method for commanding machines to move with increased dynamic performance was developed. This method is an enhanced version of input shaping, a patented vibration suppression algorithm. The technique intercepts a command input to a system and converts it into a command that moves the mechanical system with increased performance and reduced residual vibration. This document describes many advanced methods for generating highly optimized shaping sequences which are tuned to particular systems. The shaping sequence is important because it determines the trade-off between the move/settle time of the system and the insensitivity of the input shaping algorithm to variations or uncertainties in the machine being controlled. For example, a system with a 5 Hz resonance that takes 1 second to settle can be improved to settle instantaneously using a 0.2 second shaping sequence (thus improving settle time by a factor of 5). This system could vary by plus or minus 15% in its natural frequency and still have no apparent vibration. However, the same system shaped with a 0.3 second shaping sequence could tolerate plus or minus 40% or more variation in natural frequency. This document describes how to generate sequences that maximize performance, sequences that maximize insensitivity, and sequences that trade off between the two. Several software tools are documented and included.
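The 5 Hz example is consistent with a three-impulse zero-vibration-and-derivative (ZVD) shaper, whose duration equals one vibration period (0.2 s at 5 Hz). The report does not state which shaper it uses, so the following Python sketch is only an illustration under that assumption; the function names are hypothetical and the system is modeled as a single lightly damped mode.

```python
import math

def zvd_shaper(freq_hz, zeta=0.0):
    """Return (times, amplitudes) of a three-impulse ZVD shaper.

    freq_hz is the undamped natural frequency, zeta the damping ratio.
    """
    k = math.exp(-zeta * math.pi / math.sqrt(1.0 - zeta ** 2))
    damped_period = 1.0 / (freq_hz * math.sqrt(1.0 - zeta ** 2))
    norm = (1.0 + k) ** 2
    times = [0.0, damped_period / 2.0, damped_period]
    amps = [1.0 / norm, 2.0 * k / norm, k ** 2 / norm]  # amplitudes sum to 1
    return times, amps

def residual_vibration(times, amps, freq_hz, zeta=0.0):
    """Residual vibration relative to the unshaped command (0 means none)."""
    w = 2.0 * math.pi * freq_hz
    wd = w * math.sqrt(1.0 - zeta ** 2)
    c = sum(a * math.exp(zeta * w * t) * math.cos(wd * t) for t, a in zip(times, amps))
    s = sum(a * math.exp(zeta * w * t) * math.sin(wd * t) for t, a in zip(times, amps))
    return math.exp(-zeta * w * times[-1]) * math.hypot(c, s)

def shape_command(command, dt, times, amps):
    """Convolve a sampled command with the impulse sequence."""
    out = [0.0] * (len(command) + int(round(times[-1] / dt)))
    for t, a in zip(times, amps):
        shift = int(round(t / dt))
        for i, u in enumerate(command):
            out[i + shift] += a * u
    return out

if __name__ == "__main__":
    times, amps = zvd_shaper(5.0)
    print("impulse times:", times, "amplitudes:", amps)
    for f in (4.25, 5.0, 5.75):  # design frequency and +/-15%
        print(f, "Hz residual:", residual_vibration(times, amps, f))
```

Under these assumptions the residual vibration is essentially zero at 5 Hz and stays within a few percent at plus or minus 15%, which matches the trade-off described: a longer sequence buys more tolerance to frequency error at the cost of a longer move.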
Milius, Robert P; Heuer, Michael; Valiga, Daniel; Doroschak, Kathryn J; Kennedy, Caleb J; Bolon, Yung-Tsi; Schneider, Joel; Pollack, Jane; Kim, Hwa Ran; Cereb, Nezih; Hollenbach, Jill A; Mack, Steven J; Maiers, Martin
2015-12-01
We present an electronic format for exchanging data for HLA and KIR genotyping with extensions for next-generation sequencing (NGS). This format addresses NGS data exchange by refining the Histoimmunogenetics Markup Language (HML) to conform to the proposed Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) reporting guidelines (miring.immunogenomics.org). Our refinements of HML include two major additions. First, NGS is supported by new XML structures to capture additional NGS data and metadata required to produce a genotyping result, including analysis-dependent (dynamic) and method-dependent (static) components. A full genotype, consensus sequence, and the surrounding metadata are included directly, while the raw sequence reads and platform documentation are externally referenced. Second, genotype ambiguity is fully represented by integrating Genotype List Strings, which use a hierarchical set of delimiters to represent allele and genotype ambiguity in a complete and accurate fashion. HML also continues to enable the transmission of legacy methods (e.g. site-specific oligonucleotide, sequence-specific priming, and Sequence Based Typing (SBT)), adding features such as allowing multiple group-specific sequencing primers, and fully leveraging techniques that combine multiple methods to obtain a single result, such as SBT integrated with NGS. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
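The hierarchical delimiters of a Genotype List String can be illustrated with a short recursive splitter. This Python sketch assumes the delimiter precedence of the published GL String grammar (`^` between loci, `|` between ambiguous genotypes, `+` between gene copies, `~` between phased alleles, `/` between ambiguous alleles); the function name and example alleles are illustrative only and are not part of HML.

```python
# Delimiters from lowest to highest binding strength (assumed GL String precedence).
DELIMITERS = ["^", "|", "+", "~", "/"]

def parse_gl_string(text, level=0):
    """Split a GL String into nested lists, one nesting level per delimiter."""
    if level == len(DELIMITERS):
        return text  # a single allele name
    parts = text.split(DELIMITERS[level])
    return [parse_gl_string(part, level + 1) for part in parts]

if __name__ == "__main__":
    # One locus, one genotype, two gene copies; the first copy is allele-ambiguous.
    example = "HLA-A*01:01/HLA-A*01:02+HLA-A*24:02"
    print(parse_gl_string(example))
```

The nesting order of the result is locus, genotype, gene copy, haplotype, allele, so an ambiguity at any level shows up as a list with more than one element at that depth.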
Applications of Single-Cell Sequencing for Multiomics.
Xu, Yungang; Zhou, Xiaobo
2018-01-01
Single-cell sequencing interrogates the sequence or chromatin information from individual cells with advanced next-generation sequencing technologies. It provides a higher resolution of cellular differences and a better understanding of the underlying genetic and epigenetic mechanisms of an individual cell in the context of its survival and adaptation to microenvironment. However, it is more challenging to perform single-cell sequencing and downstream data analysis, owing to the minimal amount of starting materials, sample loss, and contamination. In addition, due to the picogram level of the amount of nucleic acids used, heavy amplification is often needed during sample preparation of single-cell sequencing, resulting in the uneven coverage, noise, and inaccurate quantification of sequencing data. All these unique properties raise challenges in and thus high demands for computational methods that specifically fit single-cell sequencing data. We here comprehensively survey the current strategies and challenges for multiple single-cell sequencing, including single-cell transcriptome, genome, and epigenome, beginning with a brief introduction to multiple sequencing techniques for single cells.
srRNA evolution and phylogenetic relationships of the genus Naegleria (Protista: Rhizopoda).
Baverstock, P R; Illana, S; Christy, P E; Robinson, B S; Johnson, A M
1989-05-01
A rapid RNA sequencing technique was used to partially sequence the small-subunit ribosomal RNA (srRNA) of four species of the amoeboid genus Naegleria. The extent of nucleotide sequence divergence between the two most divergent species was roughly similar to that found between mammals and frogs. However, the pattern of variation among the Naegleria species was quite different from that found for those species of tetrapods characterized to date. A phylogenetic analysis of the consensus Naegleria sequence showed that Naegleria was not monophyletic with either Acanthamoeba castellanii or Dictyostelium discoideum, two other amoebas for which sequences were available. It was shown that the semiconserved regions of the srRNA molecule evolve in a clocklike fashion and that the clock is time dependent rather than generation dependent.
Random Sequence for Optimal Low-Power Laser Generated Ultrasound
NASA Astrophysics Data System (ADS)
Vangi, D.; Virga, A.; Gulino, M. S.
2017-08-01
Low-power laser-generated ultrasound has lately been gaining importance in the research world, thanks to the possibility of investigating the structural integrity of a mechanical component through a non-contact, Non-Destructive Testing (NDT) procedure. The ultrasound signals are, however, very low in amplitude, making it necessary to apply pre-processing and post-processing operations to detect them. The cross-correlation technique is used in this work, meaning that a random signal must be used as laser input. For this purpose, a highly random and simple-to-create code called the T sequence, capable of enhancing ultrasound detectability, is introduced (it was not previously available in the state of the art). Several important parameters which characterize the T sequence can influence the process: the number of pulses Npulses, the pulse duration δ and the distance between pulses dpulses. A Finite Element (FE) model of a 3 mm steel disk was initially developed to analytically study the longitudinal ultrasound generation mechanism and the obtainable outputs. Later, experimental tests showed that the T sequence is highly flexible for ultrasound detection purposes, making it optimal to use high Npulses and δ but low dpulses. In the end, apart from describing all phenomena that arise in the low-power laser generation process, the results of this study are also important for setting up an effective NDT procedure using this technology.
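The role of cross-correlation with a random input can be sketched with a toy simulation. The abstract does not define how the T sequence is constructed, so the code below substitutes a generic random on/off pulse train; the echo gain, noise level, delay and all function names are assumptions chosen only to show a weak echo being recovered from noise.

```python
import random

def random_pulse_train(n_samples, pulse_probability, rng):
    """Generic random on/off excitation (a stand-in, not the T sequence)."""
    return [1.0 if rng.random() < pulse_probability else 0.0 for _ in range(n_samples)]

def cross_correlate(reference, received, max_lag):
    """Correlate the mean-removed reference with the received signal at lags 0..max_lag."""
    mean = sum(reference) / len(reference)
    centered = [x - mean for x in reference]
    return [
        sum(c * received[i + lag] for i, c in enumerate(centered))
        for lag in range(max_lag + 1)
    ]

def estimate_delay(n_samples=4000, delay=37, echo_gain=0.2, noise_std=0.5,
                   max_lag=100, seed=1):
    """Simulate a weak delayed echo in noise and return the lag of the correlation peak."""
    rng = random.Random(seed)
    pulses = random_pulse_train(n_samples, 0.5, rng)
    # Received signal: Gaussian noise plus an attenuated, delayed copy of the input.
    received = [rng.gauss(0.0, noise_std) for _ in range(n_samples + max_lag)]
    for i, p in enumerate(pulses):
        received[i + delay] += echo_gain * p
    corr = cross_correlate(pulses, received, max_lag)
    return max(range(len(corr)), key=corr.__getitem__)

if __name__ == "__main__":
    print("estimated delay (samples):", estimate_delay())
```

In this toy setting the echo is smaller than the noise on every individual sample, yet the correlation peak should sit at the true delay because the contributions of many pulses add coherently while the noise does not.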
Ishii, Satoshi; Sadowsky, Michael J
2009-04-01
A large number of repetitive DNA sequences are found in multiple sites in the genomes of numerous bacteria, archaea and eukarya. While the functions of many of these repetitive sequence elements are unknown, they have proven to be useful as the basis of several powerful tools for use in molecular diagnostics, medical microbiology, epidemiological analyses and environmental microbiology. The repetitive sequence-based PCR or rep-PCR DNA fingerprint technique uses primers targeting several of these repetitive elements and PCR to generate unique DNA profiles or 'fingerprints' of individual microbial strains. Although this technique has been extensively used to examine diversity among a variety of prokaryotic microorganisms, rep-PCR DNA fingerprinting can also be applied to microbial ecology and microbial evolution studies since it has the power to distinguish microbes at the strain or isolate level. Recent advances in rep-PCR methodology have resulted in increased accuracy, reproducibility and throughput. In this minireview, we summarize recent improvements in rep-PCR DNA fingerprinting methodology, and discuss its applications to address fundamentally important questions in microbial ecology and evolution.
Mouraviev, Vladimir; McDonald, Michael
2018-06-01
The understanding of infection phenotypes has shifted from the planktonic to the biofilm type, implicating bacterial biofilms in recurrent infection. To date, no medical treatment exists that specifically targets biofilms in the human host. Similarly, the identification of a biofilm has relied upon the analysis of tissue samples with electron microscopy or DNA identification with polymerase chain reaction (PCR) and sequencing. Standard culture and sensitivity testing is not able to detect the presence of biofilms. Two types of molecular microbial diagnostic testing 'levels' are performed as noted below. In both types of analysis, the microbial DNA is extracted from the patient's sample. The patient report contains information about the pathogenic bacterial and fungal microorganisms detected, the bacterial load and genes conferring resistance to different antibiotics. Once the bacteria have been identified, antibiotic recommendations are made based on research confirming the effectiveness of treatment. The technique was tested in 112 patients in different areas of urology for prevention and treatment purposes. The clinical application of next-generation sequencing in different clinical phase I-II trials (acute cystitis in 56 patients, rectal swabs before transrectal prostate biopsy in 32 men, neurogenic bladder in 13 patients, chronic bacterial prostatitis in 17 men) demonstrated that this novel approach extends our knowledge about the microbiome of the urogenital tract in both men and women. DNA sequencing has a high sensitivity for detecting bacterial and fungal associations and reveals antibiotic resistance genes, allowing targeted and individualized prevention and treatment of urinary tract infection (UTI) with improved efficacy compared to the standard culture and sensitivity technique.
Next-generation DNA sequencing technology enables the discovery of new concepts regarding the role of microorganisms in diseases of the urinary tract, with an individualized approach for more accurate diagnosis, prevention, prophylaxis and treatment of UTI.
RBT-GA: a novel metaheuristic for solving the multiple sequence alignment problem
Taheri, Javid; Zomaya, Albert Y
2009-01-01
Background: Multiple Sequence Alignment (MSA) has always been an active area of research in Bioinformatics. MSA is mainly focused on discovering biologically meaningful relationships among different sequences or proteins in order to investigate the underlying main characteristics/functions. This information is also used to generate phylogenetic trees. Results: This paper presents a novel approach, namely RBT-GA, to solve the MSA problem using a hybrid solution methodology combining the Rubber Band Technique (RBT) and the Genetic Algorithm (GA) metaheuristic. RBT is inspired by the behavior of an elastic Rubber Band (RB) on a plate with several poles, which are analogous to locations in the input sequences that could potentially be biologically related. A GA attempts to mimic the evolutionary processes of life in order to locate optimal solutions in an often very complex landscape. RBT-GA is a population-based optimization algorithm designed to find the optimal alignment for a set of input protein sequences. In this novel technique, each alignment answer is modeled as a chromosome consisting of several poles in the RBT framework. These poles resemble locations in the input sequences that are most likely to be correlated and/or biologically related. A GA-based optimization process improves these chromosomes gradually, yielding a set of mostly optimal answers for the MSA problem. Conclusion: RBT-GA is tested with one of the well-known benchmark suites (BALiBASE 2.0) in this area. The obtained results show the superiority of the proposed technique even in the case of formidable sequences. PMID:19594869
Zhang, Haitao; Wu, Chenxue; Chen, Zewei; Liu, Zhao; Zhu, Yunhong
2017-01-01
Analyzing large-scale spatial-temporal k-anonymity datasets recorded in location-based service (LBS) application servers can benefit some LBS applications. However, such analyses can allow adversaries to make inference attacks that cannot be handled by spatial-temporal k-anonymity methods or other methods for protecting sensitive knowledge. In response to this challenge, first we defined a destination location prediction attack model based on privacy-sensitive sequence rules mined from large scale anonymity datasets. Then we proposed a novel on-line spatial-temporal k-anonymity method that can resist such inference attacks. Our anti-attack technique generates new anonymity datasets with awareness of privacy-sensitive sequence rules. The new datasets extend the original sequence database of anonymity datasets to hide the privacy-sensitive rules progressively. The process includes two phases: off-line analysis and on-line application. In the off-line phase, sequence rules are mined from an original sequence database of anonymity datasets, and privacy-sensitive sequence rules are developed by correlating privacy-sensitive spatial regions with spatial grid cells among the sequence rules. In the on-line phase, new anonymity datasets are generated upon LBS requests by adopting specific generalization and avoidance principles to hide the privacy-sensitive sequence rules progressively from the extended sequence anonymity datasets database. We conducted extensive experiments to test the performance of the proposed method, and to explore the influence of the parameter K value. The results demonstrated that our proposed approach is faster and more effective for hiding privacy-sensitive sequence rules in terms of hiding sensitive rules ratios to eliminate inference attacks. 
Our method also had fewer side effects in terms of generating new sensitive rules ratios than the traditional spatial-temporal k-anonymity method, and had basically the same side effects in terms of non-sensitive rules variation ratios with the traditional spatial-temporal k-anonymity method. Furthermore, we also found the performance variation tendency from the parameter K value, which can help achieve the goal of hiding the maximum number of original sensitive rules while generating a minimum of new sensitive rules and affecting a minimum number of non-sensitive rules. PMID:28767687
A New Three Dimensional Based Key Generation Technique in AVK
NASA Astrophysics Data System (ADS)
Banerjee, Subhasish; Dutta, Manash Pratim; Bhunia, Chandan Tilak
2017-08-01
In the modern era, ensuring a high order of security has become a foremost objective of computer networks. Over the last few decades, many researchers have contributed to achieving secrecy over the communication channel. Toward perfect security, Shannon did the pioneering work on the perfect secrecy theorem and showed that the secrecy of shared information can be maintained if the key is variable rather than static. In this regard, a key generation technique is proposed in which the key can be changed whenever a new block of data needs to be exchanged. In our scheme, the keys vary not only in bit sequence but also in size. An experimental study is also included in this article to demonstrate the correctness and effectiveness of the proposed technique.
NASA Astrophysics Data System (ADS)
El-Assaad, Atlal; Dawy, Zaher; Nemer, Georges; Kobeissy, Firas
2017-01-01
The crucial biological role of proteases has become visible with the development of the degradomics discipline, which is concerned with determining protease/substrate pairs and the resulting breakdown products (BDPs) that can be used as putative biomarkers with differing biological and clinical significance. In the field of cancer biology, matrix metalloproteinases (MMPs) have been shown to generate protein BDPs that are indicative of malignant growth, while in the field of neural injury, the calpain-2 and caspase-3 proteases generate BDP fragments that are indicative of different neural cell death mechanisms in different injury scenarios. Advanced proteomic techniques have shown remarkable progress in identifying these BDPs experimentally. In this work, we present a bioinformatics-based prediction method that identifies protease-associated BDPs with high precision and efficiency. The method utilizes state-of-the-art sequence matching and alignment algorithms. It starts by locating consensus sequence occurrences and their variants in any set of protein substrates, generating all fragments resulting from cleavage. The method requires O(mn) space and O(Nmn) time, where N, m, and n are the number of protein sequences, the length of the consensus sequence, and the length of each protein sequence, respectively. Finally, the proposed methodology is validated against βII-spectrin protein, a validated brain injury biomarker.
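The scan-and-cut step can be sketched in a few lines of Python. This is not the authors' implementation: it uses a naive window scan (O(mn) time per protein, hence O(Nmn) over N proteins) rather than an alignment algorithm, and the motif `DXXD` with a cut after its fourth residue is a hypothetical example loosely modeled on Asp-directed cleavage. The substrate sequence and function names are likewise invented for illustration.

```python
def find_cleavage_sites(sequence, motif, cut_offset):
    """Return cut positions for every motif occurrence; 'X' matches any residue.

    A cut position p means the bond between sequence[p - 1] and sequence[p].
    """
    sites = []
    m = len(motif)
    for start in range(len(sequence) - m + 1):
        window = sequence[start:start + m]
        if all(p == "X" or p == r for p, r in zip(motif, window)):
            sites.append(start + cut_offset)
    return sites

def breakdown_products(sequence, sites):
    """List every fragment bounded by two cut points or termini (partial digestion included)."""
    bounds = sorted({0, len(sequence), *sites})
    fragments = []
    for i in range(len(bounds)):
        for j in range(i + 1, len(bounds)):
            if (bounds[i], bounds[j]) != (0, len(sequence)):  # skip the uncut protein
                fragments.append(sequence[bounds[i]:bounds[j]])
    return fragments

if __name__ == "__main__":
    substrate = "MKDEVDGSAADQTDLLK"  # made-up sequence
    cuts = find_cleavage_sites(substrate, "DXXD", cut_offset=4)
    print(cuts)
    print(breakdown_products(substrate, cuts))
```

Allowing wildcard positions is one simple way to cover "variants" of a consensus; a position-specific scoring matrix would be a natural refinement if graded matches are needed.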
Sequenced Peer Revision: Creating Competence and Community
ERIC Educational Resources Information Center
Bowman, Ingrid K.; Robertson, John
2013-01-01
Mastering techniques of self- and peer revision is a valuable tool for all writers, especially US-educated Generation 1.5 students, whose near fluency enables them to dialogue successfully about their writing. Using action research, 2 academic writing instructors systematically trained students to more responsibly and effectively revise their…
USDA-ARS's Scientific Manuscript database
Ethyl methanesulfonate (EMS) efficiently generates high-density mutations in genomes. Conventionally, these mutations are identified by techniques that can detect single-nucleotide mismatches in heteroduplexes of individual PCR amplicons. We applied whole-genome sequencing to 256-phenotyped mutant l...
Xu, Yi-Hua; Manoharan, Herbert T; Pitot, Henry C
2007-09-01
The bisulfite genomic sequencing technique is one of the most widely used techniques to study sequence-specific DNA methylation because of its unambiguous ability to reveal DNA methylation status at single-nucleotide resolution. One characteristic feature of the bisulfite genomic sequencing technique is that a number of sample sequence files will be produced from a single DNA sample. The PCR products of bisulfite-treated DNA samples cannot be sequenced directly because they are heterogeneous in nature; therefore they should be cloned into suitable plasmids and then sequenced. This procedure generates an enormous number of sample DNA sequence files and also adds extra bases belonging to the plasmids to the sequence, which causes problems in the final sequence comparison. Finding the methylation status for each CpG in each sample sequence is not an easy job. CpG PatternFinder was developed for this purpose. The main functions of CpG PatternFinder are: (i) to analyze the reference sequence to obtain CpG and non-CpG-C residue position information; (ii) to tailor sample sequence files (delete insertions and mark deletions from the sample sequence files) based on a configuration of ClustalW multiple alignment; (iii) to align sample sequence files with a reference file to obtain bisulfite conversion efficiency and CpG methylation status; and (iv) to produce graphics, highlighted aligned sequence text and a summary report which can be easily exported to the Microsoft Office suite. CpG PatternFinder is designed to operate cooperatively with BioEdit, a freeware program available on the internet. It can handle up to 100 files of sample DNA sequences simultaneously, and the total CpG pattern analysis process can be finished in minutes. CpG PatternFinder is an ideal software tool for DNA methylation studies to determine the differential methylation pattern in a large number of individuals in a population.
Previously we developed the CpG Analyzer program; CpG PatternFinder is our further effort to create software tools for DNA methylation studies.
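Functions (i) and (iii) can be illustrated with a minimal Python sketch. It is not CpG PatternFinder's code: it assumes the sample has already been trimmed and aligned to the reference (same length, upper case, same strand), and it relies only on the standard bisulfite logic that unmethylated cytosines read as T while methylated cytosines remain C. The sequences and function names are made up for illustration.

```python
def cytosine_positions(reference):
    """Return (CpG cytosine positions, non-CpG cytosine positions) in the reference."""
    cpg, non_cpg = [], []
    for i, base in enumerate(reference):
        if base == "C":
            if i + 1 < len(reference) and reference[i + 1] == "G":
                cpg.append(i)
            else:
                non_cpg.append(i)
    return cpg, non_cpg

def call_methylation(reference, sample):
    """Call CpG methylation and estimate bisulfite conversion efficiency.

    Returns (status, efficiency): status maps each CpG position to True
    (methylated, read as C), False (unmethylated, read as T) or None
    (uninformative base or gap); efficiency is the fraction of non-CpG
    cytosines read as T, or None if no such position is informative.
    """
    if len(reference) != len(sample):
        raise ValueError("sample must be aligned to the reference")
    cpg, non_cpg = cytosine_positions(reference)
    status = {i: {"C": True, "T": False}.get(sample[i]) for i in cpg}
    informative = [i for i in non_cpg if sample[i] in ("C", "T")]
    converted = sum(1 for i in informative if sample[i] == "T")
    efficiency = converted / len(informative) if informative else None
    return status, efficiency

if __name__ == "__main__":
    reference = "TCGACTCGGCATCG"
    sample = "TCGATTTGGCATCG"  # hypothetical clone read after bisulfite treatment
    print(call_methylation(reference, sample))
```

Non-CpG cytosines serve as the internal control here because they are assumed to be unmethylated, so any that still read as C indicate incomplete conversion.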
Strategies for automatic planning: A collection of ideas
NASA Technical Reports Server (NTRS)
Collins, Carol; George, Julia; Zamani, Elaine
1989-01-01
The main goal of the Jet Propulsion Laboratory (JPL) is to obtain science return from interplanetary probes. The uplink process is concerned with communicating commands to a spacecraft in order to achieve science objectives. There are two main parts to the development of the command file which is sent to a spacecraft. First, the activity planning process integrates the science requests for utilization of spacecraft time into a feasible sequence. Then the command generation process converts the sequence into a set of commands. The development of a feasible sequence plan is an expensive and labor intensive process requiring many months of effort. In order to save time and manpower in the uplink process, automation of parts of this process is desired. There is an ongoing effort to develop automatic planning systems. This has met with some success, but has also been informative about the nature of this effort. It is now clear that innovative techniques and state-of-the-art technology will be required in order to produce a system which can provide automatic sequence planning. As part of this effort to develop automatic planning systems, a survey of the literature was conducted, looking for known techniques which may be applicable to our work. Descriptions of and references for these methods are given, together with ideas for applying the techniques to automatic planning.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Korenberg, J.R.
The ultimate goal of this research is to generate and apply novel technologies to speed completion and integration of the human genome map and sequence with biomedical problems. To do this, techniques were developed and genome-wide resources generated. This includes a genome-wide Mapped and Integrated BAC/PAC Resource that has been used for gene finding, map completion and anchoring, breakpoint definition and sequencing. In the last period of the grant, the Human Mapped BAC/PAC Resource was also applied to determine regions of human variation and to develop a novel paradigm of primate evolution through to humans. Further, in order to more rapidly evaluate animal models of human disease, a BAC Map of the mouse was generated in collaboration with the MTI Genome Center, Dr. Bruce Birren.
A simple method for MR elastography: a gradient-echo type multi-echo sequence.
Numano, Tomokazu; Mizuhara, Kazuyuki; Hata, Junichi; Washio, Toshikatsu; Homma, Kazuhiro
2015-01-01
To demonstrate the feasibility of a novel MR elastography (MRE) technique based on a conventional gradient-echo type multi-echo MR sequence which does not need additional bipolar magnetic field gradients (motion encoding gradient: MEG), yet is sensitive to vibration. In a gradient-echo type multi-echo MR sequence, several images are produced from each echo of the train with different echo times (TEs). If these echoes are synchronized with the vibration, each readout's gradient lobes achieve a MEG-like effect, and the later generated echo causes a greater MEG-like effect. The sequence was tested for the tissue-mimicking agarose gel phantoms and the psoas major muscles of healthy volunteers. It was confirmed that the readout gradient lobes caused an MEG-like effect and the later TE images had higher sensitivity to vibrations. The magnitude image of later generated echo suffered the T2 decay and the susceptibility artifacts, but the wave image and elastogram of later generated echo were unaffected by these effects. In in vivo experiments, this method was able to measure the mean shear modulus of the psoas major muscle. From the results of phantom experiments and volunteer studies, it was shown that this method has clinical application potential. Copyright © 2014 Elsevier Inc. All rights reserved.
Piazza, Rocco; Magistroni, Vera; Pirola, Alessandra; Redaelli, Sara; Spinelli, Roberta; Redaelli, Serena; Galbiati, Marta; Valletta, Simona; Giudici, Giovanni; Cazzaniga, Giovanni; Gambacorti-Passerini, Carlo
2013-01-01
Copy number alterations (CNA) are common events occurring in leukaemias and solid tumors. Comparative Genome Hybridization (CGH) is currently the gold standard technique to analyze CNAs; however, CGH analysis requires dedicated instruments and is able to perform only low resolution Loss of Heterozygosity (LOH) analyses. Here we present CEQer (Comparative Exome Quantification analyzer), a new graphical, event-driven tool for CNA/allelic-imbalance (AI) coupled analysis of exome sequencing data. By using case-control matched exome data, CEQer performs a comparative digital exonic quantification to generate CNA data and couples this information with exome-wide LOH and allelic imbalance detection. These data are used to build mixed statistical/heuristic models allowing the identification of CNA/AI events. To test our tool, we initially used in silico generated data, then we performed whole-exome sequencing from 20 leukemic specimens and corresponding matched controls and we analyzed the results using CEQer. Taken globally, these analyses showed that the combined use of comparative digital exon quantification and LOH/AI allows generating very accurate CNA data. Therefore, we propose CEQer as an efficient, robust and user-friendly graphical tool for the identification of CNA/AI in the context of whole-exome sequencing data. PMID:24124457
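As a rough illustration of what a comparative, case-versus-control exon quantification can look like, the sketch below computes a depth-normalised log2 ratio per exon. The abstract does not specify how CEQer performs this step, so this is a generic read-depth approach and not the tool's method; the function name `exon_log2_ratios`, the dictionary layout and the `pseudocount` parameter are all assumptions made for the example.

```python
import math


def exon_log2_ratios(case_counts, control_counts, pseudocount=0.5):
    """Per-exon log2(case / control) after library-size normalisation.

    case_counts / control_counts: dict mapping exon id -> read count
    (hypothetical input layout). A ratio near 0 suggests no copy number
    change; clearly positive or negative values suggest gain or loss.
    """
    case_total = sum(case_counts.values())
    control_total = sum(control_counts.values())
    if case_total == 0 or control_total == 0:
        raise ValueError("both samples need at least one read")

    ratios = {}
    for exon, control in control_counts.items():
        case = case_counts.get(exon, 0)
        # The pseudocount avoids log(0) and division by zero on empty exons.
        case_frac = (case + pseudocount) / case_total
        control_frac = (control + pseudocount) / control_total
        ratios[exon] = math.log2(case_frac / control_frac)
    return ratios
```

Dividing by the total read count of each sample keeps a difference in sequencing depth between case and control from being read as a genome-wide gain or loss.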
Genetic markers, genotyping methods & next generation sequencing in Mycobacterium tuberculosis
Desikan, Srinidhi; Narayanan, Sujatha
2015-01-01
Molecular epidemiology (ME) is one of the main areas in tuberculosis research and is widely used to study the transmission epidemics and outbreaks of tubercle bacilli. It exploits the presence of various polymorphisms in the genome of the bacteria that can be widely used as genetic markers. Many DNA typing methods apply these genetic markers to differentiate various strains and to study the evolutionary relationships between them. The three widely used genotyping tools to differentiate Mycobacterium tuberculosis strains are IS6110 restriction fragment length polymorphism (RFLP), spacer oligotyping (Spoligotyping), and mycobacterial interspersed repeat units - variable number of tandem repeats (MIRU-VNTR). A new prospect for ME was introduced with the development of whole genome sequencing (WGS) and next generation sequencing (NGS) methods, in which the entire genome is sequenced; this not only helps in pointing out minute differences between the various sequences but also saves time and cost. NGS is also found to be useful in identifying single nucleotide polymorphisms (SNPs), in comparative genomics and in studying various aspects of transmission dynamics. These techniques enable the identification of mycobacterial strains and also facilitate the study of their phylogenetic and evolutionary traits. PMID:26205019
The utility of Next Generation Sequencing for molecular diagnostics in Rett syndrome.
Vidal, Silvia; Brandi, Núria; Pacheco, Paola; Gerotina, Edgar; Blasco, Laura; Trotta, Jean-Rémi; Derdak, Sophia; Del Mar O'Callaghan, Maria; Garcia-Cazorla, Àngels; Pineda, Mercè; Armstrong, Judith
2017-09-25
Rett syndrome (RTT) is an early-onset neurodevelopmental disorder that almost exclusively affects girls and is totally disabling. Three genes have been identified that cause RTT: MECP2, CDKL5 and FOXG1. However, the etiology in some RTT patients remains unknown. Recently, next generation sequencing (NGS) has promoted genetic diagnoses because of the quickness and affordability of the method. To evaluate the usefulness of NGS in genetic diagnosis, we present the genetic study of RTT-like patients using different techniques based on this technology. We studied 1577 patients with RTT-like clinical diagnoses and reviewed patients who were previously studied and thought to have RTT genes by Sanger sequencing. Genetically, 477 of 1577 patients with an RTT-like suspicion have been diagnosed. Positive results were found in 30% by Sanger sequencing, 23% with a custom panel, 24% with a commercial panel and 32% with whole exome sequencing. A genetic study using NGS allows the study of a larger number of genes associated with RTT-like symptoms simultaneously, providing genetic study of a wider group of patients as well as significantly reducing the response time and cost of the study.
Human Y chromosome copy number variation in the next generation sequencing era and beyond.
Massaia, Andrea; Xue, Yali
2017-05-01
The human Y chromosome provides a fertile ground for structural rearrangements owing to its haploidy and high content of repeated sequences. The methodologies used for copy number variation (CNV) studies have developed over the years. Low-throughput techniques based on direct observation of rearrangements were developed early on, and are still used, often to complement array-based or sequencing approaches which have limited power in regions with high repeat content and specifically in the presence of long, identical repeats, such as those found in human sex chromosomes. Some specific rearrangements have been investigated for decades; because of their effects on fertility, or their outstanding evolutionary features, the interest in these has not diminished. However, following the flourishing of large-scale genomics, several studies have investigated CNVs across the whole chromosome. These studies sometimes employ data generated within large genomic projects such as the DDD study or the 1000 Genomes Project, and often survey large samples of healthy individuals without any prior selection. Novel technologies based on sequencing long molecules, and combinations of technologies, promise to stimulate the study of Y-CNVs in the immediate future.
Jézéquel, Laetitia; Loeper, Jacqueline; Pompon, Denis
2008-11-01
Combinatorial libraries coding for mosaic enzymes with predefined crossover points constitute useful tools to address and model structure-function relationships and for functional optimization of enzymes based on multivariate statistics. The presented method, called sequence-independent generation of a chimera-ordered library (SIGNAL), allows easy shuffling of any predefined amino acid segment between two or more proteins. This method is particularly well adapted to the exchange of protein structural modules. The procedure could also be well suited to generate ordered combinatorial libraries independent of sequence similarities in a robotized manner. Sequence segments to be recombined are first extracted by PCR from a single-stranded template coding for an enzyme of interest using a biotin-avidin-based method. This technique allows the reduction of parental template contamination in the final library. Specific PCR primers allow amplification of two complementary mosaic DNA fragments, overlapping in the region to be exchanged. Fragments are finally reassembled using a fusion PCR. The process is illustrated via the construction of a set of mosaic CYP2B enzymes using this highly modular approach.
High-throughput sequencing in veterinary infection biology and diagnostics.
Belák, S; Karlsson, O E; Leijon, M; Granberg, F
2013-12-01
Sequencing methods have improved rapidly since the first versions of the Sanger techniques, facilitating the development of very powerful tools for detecting and identifying various pathogens, such as viruses, bacteria and other microbes. The ongoing development of high-throughput sequencing (HTS; also known as next-generation sequencing) technologies has resulted in a dramatic reduction in DNA sequencing costs, making the technology more accessible to the average laboratory. In this White Paper of the World Organisation for Animal Health (OIE) Collaborating Centre for the Biotechnology-based Diagnosis of Infectious Diseases in Veterinary Medicine (Uppsala, Sweden), several approaches and examples of HTS are summarised, and their diagnostic applicability is briefly discussed. Selected future aspects of HTS are outlined, including the need for bioinformatic resources, with a focus on improving the diagnosis and control of infectious diseases in veterinary medicine.
Johnstone, Emily; Wyatt, Jonathan J; Henry, Ann M; Short, Susan C; Sebag-Montefiore, David; Murray, Louise; Kelly, Charles G; McCallum, Hazel M; Speight, Richard
2018-01-01
Magnetic resonance imaging (MRI) offers superior soft-tissue contrast as compared with computed tomography (CT), which is conventionally used for radiation therapy treatment planning (RTP) and patient positioning verification, resulting in improved target definition. The 2 modalities are co-registered for RTP; however, this introduces a systematic error. Implementing an MRI-only radiation therapy workflow would be advantageous because this error would be eliminated, the patient pathway simplified, and patient dose reduced. Unlike CT, in MRI there is no direct relationship between signal intensity and electron density; however, various methodologies for MRI-only RTP have been reported. A systematic review of these methods was undertaken. The PRISMA guidelines were followed. Embase and Medline databases were searched (1996 to March 2017) for studies that generated synthetic CT scans (sCTs) for MRI-only radiation therapy. Sixty-one articles met the inclusion criteria. This review showed that MRI-only RTP techniques could be grouped into 3 categories: (1) bulk density override; (2) atlas-based; and (3) voxel-based techniques, which all produce an sCT scan from MR images. Bulk density override techniques either used a single homogeneous or multiple tissue override. The former produced large dosimetric errors (>2%) in some cases and the latter frequently required manual bone contouring. Atlas-based techniques used both single and multiple atlases and included methods incorporating pattern recognition techniques. Clinically acceptable sCTs were reported, but atypical anatomy led to erroneous results in some cases. Voxel-based techniques included methods using routine and specialized MRI sequences, namely ultra-short echo time imaging. High-quality sCTs were produced; however, use of multiple sequences led to long scanning times increasing the chances of patient movement. Using nonroutine sequences would currently be problematic in most radiation therapy centers.
Atlas-based and voxel-based techniques were found to be the most clinically useful methods, with some studies reporting dosimetric differences of <1% between planning on the sCT and CT and <1-mm deviations when using sCTs for positional verification. Copyright © 2017 Elsevier Inc. All rights reserved.
Li, Man; Ling, Cheng; Xu, Qi; Gao, Jingyang
2018-02-01
Sequence classification is crucial in predicting the function of newly discovered sequences. In recent years, the prediction of the incremental large-scale and diversity of sequences has heavily relied on the involvement of machine-learning algorithms. To improve prediction accuracy, these algorithms must confront the key challenge of extracting valuable features. In this work, we propose a feature-enhanced protein classification approach, considering the rich generation of multiple sequence alignment algorithms, N-gram probabilistic language model and the deep learning technique. The essence behind the proposed method is that if each group of sequences can be represented by one feature sequence, composed of homologous sites, there should be less loss when the sequence is rebuilt, when a more relevant sequence is added to the group. On the basis of this consideration, the prediction becomes whether a query sequence belonging to a group of sequences can be transferred to calculate the probability that the new feature sequence evolves from the original one. The proposed work focuses on the hierarchical classification of G-protein Coupled Receptors (GPCRs), which begins by extracting the feature sequences from the multiple sequence alignment results of the GPCRs sub-subfamilies. The N-gram model is then applied to construct the input vectors. Finally, these vectors are imported into a convolutional neural network to make a prediction. The experimental results elucidate that the proposed method provides significant performance improvements. The classification error rate of the proposed method is reduced by at least 4.67% (family level I) and 5.75% (family Level II), in comparison with the current state-of-the-art methods. The implementation program of the proposed work is freely available at: https://github.com/alanFchina/CNN .
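The step in which an N-gram model turns a feature sequence into an input vector can be sketched as follows. This is only an illustrative guess at the general idea, using relative N-gram frequencies over a fixed vocabulary; it is not the code from the repository cited above, and the function name `ngram_vector`, the default `n=2` and the 20-letter alphabet are assumptions.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def ngram_vector(sequence, n=2, alphabet=AMINO_ACIDS):
    """Return relative N-gram frequencies in a fixed vocabulary order.

    The vector has len(alphabet) ** n entries, so sequences of different
    lengths map to inputs of the same size. N-grams that contain symbols
    outside the alphabet are ignored.
    """
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))
    vocabulary = ["".join(p) for p in product(alphabet, repeat=n)]
    total = sum(counts[gram] for gram in vocabulary)
    return [counts[gram] / total if total else 0.0 for gram in vocabulary]
```

A fixed vocabulary order matters here because a convolutional network expects every input vector to have the same layout.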
Rickert, Keith W; Grinberg, Luba; Woods, Robert M; Wilson, Susan; Bowen, Michael A; Baca, Manuel
2016-01-01
The enormous diversity created by gene recombination and somatic hypermutation makes de novo protein sequencing of monoclonal antibodies a uniquely challenging problem. Modern mass spectrometry-based sequencing will rarely, if ever, provide a single unambiguous sequence for the variable domains. A more likely outcome is computation of an ensemble of highly similar sequences that can satisfy the experimental data. This outcome can result in the need for empirical testing of many candidate sequences, sometimes iteratively, to identify one which can replicate the activity of the parental antibody. Here we describe an improved approach to antibody protein sequencing by using phage display technology to generate a combinatorial library of sequences that satisfy the mass spectrometry data, and selecting for functional candidates that bind antigen. This approach was used to reverse engineer 2 commercially-obtained monoclonal antibodies against murine CD137. Proteomic data enabled us to assign the majority of the variable domain sequences, with the exception of 3-5% of the sequence located within or adjacent to complementarity-determining regions. To efficiently resolve the sequence in these regions, small phage-displayed libraries were generated and subjected to antigen binding selection. Following enrichment of antigen-binding clones, 2 clones were selected for each antibody and recombinantly expressed as antigen-binding fragments (Fabs). In both cases, the reverse-engineered Fabs exhibited identical antigen binding affinity, within error, as Fabs produced from the commercial IgGs. This combination of proteomic and protein engineering techniques provides a useful approach to simplifying the technically challenging process of reverse engineering monoclonal antibodies from protein material.
Rickert, Keith W.; Grinberg, Luba; Woods, Robert M.; Wilson, Susan; Bowen, Michael A.; Baca, Manuel
2016-01-01
The enormous diversity created by gene recombination and somatic hypermutation makes de novo protein sequencing of monoclonal antibodies a uniquely challenging problem. Modern mass spectrometry-based sequencing will rarely, if ever, provide a single unambiguous sequence for the variable domains. A more likely outcome is computation of an ensemble of highly similar sequences that can satisfy the experimental data. This outcome can result in the need for empirical testing of many candidate sequences, sometimes iteratively, to identify one which can replicate the activity of the parental antibody. Here we describe an improved approach to antibody protein sequencing by using phage display technology to generate a combinatorial library of sequences that satisfy the mass spectrometry data, and selecting for functional candidates that bind antigen. This approach was used to reverse engineer 2 commercially-obtained monoclonal antibodies against murine CD137. Proteomic data enabled us to assign the majority of the variable domain sequences, with the exception of 3–5% of the sequence located within or adjacent to complementarity-determining regions. To efficiently resolve the sequence in these regions, small phage-displayed libraries were generated and subjected to antigen binding selection. Following enrichment of antigen-binding clones, 2 clones were selected for each antibody and recombinantly expressed as antigen-binding fragments (Fabs). In both cases, the reverse-engineered Fabs exhibited identical antigen binding affinity, within error, as Fabs produced from the commercial IgGs. This combination of proteomic and protein engineering techniques provides a useful approach to simplifying the technically challenging process of reverse engineering monoclonal antibodies from protein material. PMID:26852694
A field ornithologist’s guide to genomics: Practical considerations for ecology and conservation
Oyler-McCance, Sara J.; Oh, Kevin; Langin, Kathryn; Aldridge, Cameron L.
2016-01-01
Vast improvements in sequencing technology have made it practical to simultaneously sequence millions of nucleotides distributed across the genome, opening the door for genomic studies in virtually any species. Ornithological research stands to benefit in three substantial ways. First, genomic methods enhance our ability to parse and simultaneously analyze both neutral and non-neutral genomic regions, thus providing insight into adaptive evolution and divergence. Second, the sheer quantity of sequence data generated by current sequencing platforms allows increased precision and resolution in analyses. Third, high-throughput sequencing can benefit applications that focus on a small number of loci that are otherwise prohibitively expensive, time-consuming, and technically difficult using traditional sequencing methods. These advances have improved our ability to understand evolutionary processes like speciation and local adaptation, but they also offer many practical applications in the fields of population ecology, migration tracking, conservation planning, diet analyses, and disease ecology. This review provides a guide for field ornithologists interested in incorporating genomic approaches into their research program, with an emphasis on techniques related to ecology and conservation. We present a general overview of contemporary genomic approaches and methods, as well as important considerations when selecting a genomic technique. We also discuss research questions that are likely to benefit from utilizing high-throughput sequencing instruments, highlighting select examples from recent avian studies.
BrEPS 2.0: Optimization of sequence pattern prediction for enzyme annotation.
Dudek, Christian-Alexander; Dannheim, Henning; Schomburg, Dietmar
2017-01-01
The prediction of gene functions is crucial for a large number of different life science areas. Faster high throughput sequencing techniques generate more and larger datasets. The manual annotation by classical wet-lab experiments is not suitable for these large amounts of data. We showed earlier that the automatic sequence pattern-based BrEPS protocol, based on manually curated sequences, can be used for the prediction of enzymatic functions of genes. The growing sequence databases provide the opportunity for more reliable patterns, but are also a challenge for the implementation of automatic protocols. We reimplemented and optimized the BrEPS pattern generation to be applicable for larger datasets in an acceptable timescale. The primary improvement of the new BrEPS protocol is the enhanced data selection step. Manually curated annotations from Swiss-Prot are used as a reliable source for function prediction of enzymes observed on protein level. The pool of sequences is extended by highly similar sequences from TrEMBL and Swiss-Prot. This allows us to restrict the selection of Swiss-Prot entries, without losing the diversity of sequences needed to generate significant patterns. Additionally, a supporting pattern type was introduced by extending the patterns at semi-conserved positions with highly similar amino acids. Extended patterns have an increased complexity, increasing the chance to match more sequences, without losing the essential structural information of the pattern. To enhance the usability of the database, we introduced enzyme function prediction based on consensus EC numbers and IUBMB enzyme nomenclature. BrEPS is part of the Braunschweig Enzyme Database (BRENDA) and is available on a completely redesigned website and as download. The database can be downloaded and used with the BrEPScmd command line tool for large scale sequence analysis.
The BrEPS website and downloads for the database creation tool, command line tool and database are freely accessible at http://breps.tu-bs.de.
BrEPS 2.0: Optimization of sequence pattern prediction for enzyme annotation
Schomburg, Dietmar
2017-01-01
The prediction of gene functions is crucial for a large number of different life science areas. Faster high throughput sequencing techniques generate more and larger datasets. The manual annotation by classical wet-lab experiments is not suitable for these large amounts of data. We showed earlier that the automatic sequence pattern-based BrEPS protocol, based on manually curated sequences, can be used for the prediction of enzymatic functions of genes. The growing sequence databases provide the opportunity for more reliable patterns, but are also a challenge for the implementation of automatic protocols. We reimplemented and optimized the BrEPS pattern generation to be applicable for larger datasets in an acceptable timescale. The primary improvement of the new BrEPS protocol is the enhanced data selection step. Manually curated annotations from Swiss-Prot are used as a reliable source for function prediction of enzymes observed on protein level. The pool of sequences is extended by highly similar sequences from TrEMBL and Swiss-Prot. This allows us to restrict the selection of Swiss-Prot entries, without losing the diversity of sequences needed to generate significant patterns. Additionally, a supporting pattern type was introduced by extending the patterns at semi-conserved positions with highly similar amino acids. Extended patterns have an increased complexity, increasing the chance to match more sequences, without losing the essential structural information of the pattern. To enhance the usability of the database, we introduced enzyme function prediction based on consensus EC numbers and IUBMB enzyme nomenclature. BrEPS is part of the Braunschweig Enzyme Database (BRENDA) and is available on a completely redesigned website and as download. The database can be downloaded and used with the BrEPScmd command line tool for large scale sequence analysis.
The BrEPS website and downloads for the database creation tool, command line tool and database are freely accessible at http://breps.tu-bs.de. PMID:28750104
Cardiovascular magnetic resonance physics for clinicians: part II
2012-01-01
This is the second of two reviews that is intended to cover the essential aspects of cardiovascular magnetic resonance (CMR) physics in a way that is understandable and relevant to clinicians using CMR in their daily practice. Starting with the basic pulse sequences and contrast mechanisms described in part I, it briefly discusses further approaches to accelerate image acquisition. It then continues by showing in detail how the contrast behaviour of black blood fast spin echo and bright blood cine gradient echo techniques can be modified by adding rf preparation pulses to derive a number of more specialised pulse sequences. The simplest examples described include T2-weighted oedema imaging, fat suppression and myocardial tagging cine pulse sequences. Two further important derivatives of the gradient echo pulse sequence, obtained by adding preparation pulses, are used in combination with the administration of a gadolinium-based contrast agent for myocardial perfusion imaging and the assessment of myocardial tissue viability using a late gadolinium enhancement (LGE) technique. These two imaging techniques are discussed in more detail, outlining the basic principles of each pulse sequence, the practical steps required to achieve the best results in a clinical setting and, in the case of perfusion, explaining some of the factors that influence current approaches to perfusion image analysis. The key principles of contrast-enhanced magnetic resonance angiography (CE-MRA) are also explained in detail, especially focusing on timing of the acquisition following contrast agent bolus administration, and current approaches to achieving time resolved MRA. Alternative MRA techniques that do not require the use of an exogenous contrast agent are summarised, and the specialised pulse sequence used to image the coronary arteries, using respiratory navigator gating, is described in detail.
The article concludes by explaining the principle behind phase contrast imaging techniques which create images that represent the phase of the MR signal rather than the magnitude. It is shown how this principle can be used to generate velocity maps by designing gradient waveforms that give rise to a relative phase change that is proportional to velocity. Choice of velocity encoding range and key pitfalls in the use of this technique are discussed. PMID:22995744
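The proportionality between phase and velocity, and the role of the velocity encoding range, can be made concrete with a small sketch. It assumes the common convention that a phase shift of π corresponds to the velocity encoding value (VENC); the function names are hypothetical, and phase offsets and background corrections that a real reconstruction would need are left out.

```python
import math


def velocity_from_phase(phase_rad, venc):
    """Map a phase difference in radians to velocity (same unit as venc).

    Assumes the convention that +pi radians corresponds to +venc.
    """
    return venc * phase_rad / math.pi


def measured_phase(true_velocity, venc):
    """Phase a scanner would record for a given velocity, wrapped to [-pi, pi)."""
    phase = math.pi * true_velocity / venc
    return (phase + math.pi) % (2 * math.pi) - math.pi


# A 200 cm/s jet imaged with venc = 150 cm/s wraps around and is
# reconstructed with the wrong sign and magnitude (aliasing).
aliased = velocity_from_phase(measured_phase(200.0, 150.0), 150.0)
```

The last line shows why the encoding range has to be chosen above the highest expected velocity: anything beyond it wraps around instead of saturating.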
FRESCO: Referential compression of highly similar sequences.
Wandelt, Sebastian; Leser, Ulf
2013-01-01
In many applications, sets of similar texts or sequences are of high importance. Prominent examples are revision histories of documents or genomic sequences. Modern high-throughput sequencing technologies are able to generate DNA sequences at an ever-increasing rate. In parallel to the decreasing experimental time and cost necessary to produce DNA sequences, computational requirements for analysis and storage of the sequences are steeply increasing. Compression is a key technology to deal with this challenge. Recently, referential compression schemes, storing only the differences between a to-be-compressed input and a known reference sequence, gained a lot of interest in this field. In this paper, we propose a general open-source framework to compress large amounts of biological sequence data called Framework for REferential Sequence COmpression (FRESCO). Our basic compression algorithm is shown to be one to two orders of magnitude faster than comparable related work, while achieving similar compression ratios. We also propose several techniques to further increase compression ratios, while still retaining the advantage in speed: 1) selecting a good reference sequence; and 2) rewriting a reference sequence to allow for better compression. In addition, we propose a new way of further boosting the compression ratios by applying referential compression to already referentially compressed files (second-order compression). This technique allows for compression ratios way beyond state of the art, for instance, 4,000:1 and higher for human genomes. We evaluate our algorithms on a large data set from three different species (more than 1,000 genomes, more than 3 TB) and on a collection of versions of Wikipedia pages. Our results show that real-time compression of highly similar sequences at high compression ratios is possible on modern hardware.
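The core idea of referential compression, storing only how the input differs from a known reference, can be shown with a toy encoder. This is not the FRESCO algorithm: it is a deliberately simple greedy matcher over a k-mer index, and the names `compress` and `decompress`, the operation tuples and the default `k=4` are assumptions made for illustration.

```python
def compress(reference, text, k=4):
    """Encode text as ('M', start, length) matches into reference
    and ('L', chars) literal runs for everything that does not match."""
    # Index every k-mer of the reference by its start positions.
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], []).append(i)

    ops, literal, pos = [], [], 0
    while pos < len(text):
        best_start, best_len = -1, 0
        for start in index.get(text[pos:pos + k], []):
            length = 0
            while (start + length < len(reference)
                   and pos + length < len(text)
                   and reference[start + length] == text[pos + length]):
                length += 1
            if length > best_len:
                best_start, best_len = start, length
        if best_len >= k:
            if literal:  # flush pending literal characters first
                ops.append(("L", "".join(literal)))
                literal = []
            ops.append(("M", best_start, best_len))
            pos += best_len
        else:
            literal.append(text[pos])
            pos += 1
    if literal:
        ops.append(("L", "".join(literal)))
    return ops


def decompress(reference, ops):
    """Rebuild the original text from the reference and the operations."""
    parts = []
    for op in ops:
        if op[0] == "M":
            parts.append(reference[op[1]:op[1] + op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)
```

For two sequences that differ by a single substitution the output is two long matches around a one-character literal, which is where the high ratios on highly similar genomes come from.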
Empirical transfer functions for stations in the Central California seismological network
Bakun, W.H.; Dratler, Jay
1976-01-01
A sequence of calibration signals composed of a station identification code, a transient from the release of the seismometer mass at rest from a known displacement from the equilibrium position, and a transient from a known step in voltage to the amplifier input is generated by the automatic daily calibration system (ADCS) now operational in the U.S. Geological Survey central California seismographic network. Documentation of a sequence of interactive programs to compute, from the calibration data, the complex transfer functions for the seismographic system (ground motion through digitizer), the electronics (amplifier through digitizer), and the seismometer alone is presented. The analysis utilizes the Fourier transform technique originally suggested by Espinosa et al. (1962). Section I is a general description of seismographic calibration. Section II contrasts the 'Fourier transform' and the 'least-squares' techniques for analyzing transient calibration signals. Theoretical considerations for the Fourier transform technique used here are described in Section III. Section IV is a detailed description of the sequence of calibration signals generated by the ADCS. Section V is a brief 'cookbook description' of the calibration programs; Section VI contains a detailed sample program execution. Section VII suggests the uses of the resultant empirical transfer functions. Supplemental interactive programs, by which smooth response functions suitable for reducing seismic data to ground motion can be obtained, are also documented in Section VII. Appendices A and B contain complete listings of the Fortran source codes, while Appendix C is an update containing preliminary results obtained from an analysis of some of the calibration signals from stations in the seismographic network near Oroville, California.
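The general idea behind a Fourier transform approach to calibration, estimating a complex transfer function as the ratio of the output spectrum to the input spectrum of a known transient, can be sketched as below. This is a generic illustration and not a reconstruction of the Fortran programs documented in the report; the function names and the plain O(n^2) DFT are choices made to keep the example self-contained.

```python
import cmath


def dft(samples):
    """Plain discrete Fourier transform (O(n^2), for illustration only)."""
    n = len(samples)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(samples))
            for k in range(n)]


def transfer_function(input_signal, output_signal, eps=1e-12):
    """Estimate H[k] = Y[k] / X[k] from a known input and recorded output.

    Frequency bins where the input has (almost) no energy cannot be
    estimated and are returned as None.
    """
    if len(input_signal) != len(output_signal):
        raise ValueError("input and output must have the same length")
    spectrum_in = dft(input_signal)
    spectrum_out = dft(output_signal)
    return [y / x if abs(x) > eps else None
            for x, y in zip(spectrum_in, spectrum_out)]
```

Because the result is complex, each bin carries both the amplitude response (its magnitude) and the phase response (its argument) of the system at that frequency.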
Spread spectrum communications. Volume 1, 2 & 3
NASA Technical Reports Server (NTRS)
Simon, M. K.; Levitt, B. K.; Omura, J. K.; Scholtz, R. A.
1985-01-01
The design and operation of spread-spectrum (SS) communication systems are examined in an introductory text intended for graduate engineering students and practicing engineers. Chapters are devoted to an overview of SS systems, the historical origins of SS, basic concepts and system models, antijam communication systems, pseudonoise generators, coherent direct-sequence systems, noncoherent frequency-hopped systems, coherent and differentially coherent modulation techniques, pseudonoise acquisition and tracking in direct-sequence receivers, time and frequency synchronization of frequency-hopped receivers, low-probability-of-intercept communication, and multiple-access communication. Graphs, diagrams, and photographs are provided.
Optimization of algorithm of coding of genetic information of Chlamydia
NASA Astrophysics Data System (ADS)
Feodorova, Valentina A.; Ulyanov, Sergey S.; Zaytsev, Sergey S.; Saltykov, Yury V.; Ulianova, Onega V.
2018-04-01
A new method of coding genetic information using coherent optical fields is developed. A universal technique for transforming the nucleotide sequences of a bacterial gene into a laser speckle pattern is suggested. Reference speckle patterns of the nucleotide sequences of the omp1 gene of typical wild strains of Chlamydia trachomatis of genovars D, E, F, G, J and K, as well as Chlamydia psittaci serovar I, are generated. The algorithm for coding gene information into a speckle pattern is optimized. Fully developed speckles with Gaussian statistics for gene-based speckles have been used as the criterion of optimization.
Image processing methods used to simulate flight over remotely sensed data
NASA Technical Reports Server (NTRS)
Mortensen, H. B.; Hussey, K. J.; Mortensen, R. A.
1988-01-01
It has been demonstrated that image processing techniques can provide an effective means of simulating flight over remotely sensed data (Hussey et al. 1986). This paper explains the methods used to simulate and animate three-dimensional surfaces from two-dimensional imagery. The preprocessing techniques used on the input data, the selection of the animation sequence, the generation of the animation frames, and the recording of the animation are covered. The software used for all steps is discussed.
Smart, Matthew; Cornman, Robert S.; Iwanowicz, Deborah; McDermott-Kubeczko, Margaret; Pettis, Jeff S; Spivak, Marla S; Otto, Clint R.
2017-01-01
Taxonomic identification of pollen has historically been accomplished via light microscopy but requires specialized knowledge and reference collections, particularly when identification to lower taxonomic levels is necessary. Recently, next-generation sequencing technology has been used as a cost-effective alternative for identifying bee-collected pollen; however, this novel approach has not been tested on a spatially or temporally robust number of pollen samples. Here, we compare pollen identification results derived from light microscopy and DNA sequencing techniques with samples collected from honey bee colonies embedded within a gradient of intensive agricultural landscapes in the Northern Great Plains throughout the 2010–2011 growing seasons. We demonstrate that at all taxonomic levels, DNA sequencing was able to discern a greater number of taxa, and was particularly useful for the identification of infrequently detected species. Importantly, substantial phenological overlap did occur for commonly detected taxa using either technique, suggesting that DNA sequencing is an appropriate, and enhancing, substitutive technique for accurately capturing the breadth of bee-collected species of pollen present across agricultural landscapes. We also show that honey bees located in high and low intensity agricultural settings forage on dissimilar plants, though with overlap of the most abundantly collected pollen taxa. We highlight practical applications of utilizing sequencing technology, including addressing ecological issues surrounding land use, climate change, importance of taxa relative to abundance, and evaluating the impact of conservation program habitat enhancement efforts.
Iwanowicz, Deborah; Olson, Deanna H.; Adams, Michael J.; Adams, Cynthia; Anderson, Chauncey; Blaustein, Andrew R; Densmore, Christine L.; Figiel, Chester; Schill, William B.; Chestnut, Tara
2017-01-01
StatsDB: platform-agnostic storage and understanding of next generation sequencing run metrics
Ramirez-Gonzalez, Ricardo H.; Leggett, Richard M.; Waite, Darren; Thanki, Anil; Drou, Nizar; Caccamo, Mario; Davey, Robert
2014-01-01
Modern sequencing platforms generate enormous quantities of data in ever-decreasing amounts of time. Additionally, techniques such as multiplex sequencing allow one run to contain hundreds of different samples. With such data comes a significant challenge to understand its quality and to understand how the quality and yield are changing across instruments and over time. As well as the desire to understand historical data, sequencing centres often have a duty to provide clear summaries of individual run performance to collaborators or customers. We present StatsDB, an open-source software package for storage and analysis of next generation sequencing run metrics. The system has been designed for incorporation into a primary analysis pipeline, either at the programmatic level or via integration into existing user interfaces. Statistics are stored in an SQL database and APIs provide the ability to store and access the data while abstracting the underlying database design. This abstraction allows simpler, wider querying across multiple fields than is possible by the manual steps and calculation required to dissect individual reports, e.g. “provide metrics about nucleotide bias in libraries using adaptor barcode X, across all runs on sequencer A, within the last month”. The software is supplied with modules for storage of statistics from FastQC, a commonly used tool for analysis of sequence reads, but the open nature of the database schema means it can be easily adapted to other tools. Currently at The Genome Analysis Centre (TGAC), reports are accessed through our LIMS system or through a standalone GUI tool, but the API and supplied examples make it easy to develop custom reports and to interface with other packages. PMID:24627795
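The kind of cross-run query quoted in the abstract can be sketched with Python's built-in sqlite3 module. The schema below (tables `run` and `metric`, the metric name `gc_percent`, and the helper `metric_for`) is entirely hypothetical and is not StatsDB's actual schema or API; it only shows how storing run metrics in SQL turns such a question into a single filtered join.

```python
import sqlite3
from typing import List, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a toy, made-up schema and data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE run (
            id INTEGER PRIMARY KEY,
            sequencer TEXT, barcode TEXT, run_date TEXT
        );
        CREATE TABLE metric (
            run_id INTEGER REFERENCES run(id),
            name TEXT, value REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO run VALUES (?, ?, ?, ?)",
        [
            (1, "A", "X", "2014-01-05"),
            (2, "A", "Y", "2014-01-06"),
            (3, "B", "X", "2014-01-07"),
            (4, "A", "X", "2013-11-01"),
            (5, "A", "X", "2014-01-20"),
        ],
    )
    conn.executemany(
        "INSERT INTO metric VALUES (?, ?, ?)",
        [
            (1, "gc_percent", 41.0),
            (2, "gc_percent", 39.5),
            (3, "gc_percent", 40.2),
            (4, "gc_percent", 44.1),
            (5, "gc_percent", 43.0),
        ],
    )
    return conn


def metric_for(conn: sqlite3.Connection, name: str, sequencer: str,
               barcode: str, since: str) -> List[Tuple[int, float]]:
    """Return (run id, value) for one metric, filtered by sequencer,
    barcode, and earliest run date (ISO dates compare as strings)."""
    return conn.execute(
        """
        SELECT r.id, m.value
        FROM metric AS m
        JOIN run AS r ON r.id = m.run_id
        WHERE m.name = ? AND r.sequencer = ?
          AND r.barcode = ? AND r.run_date >= ?
        ORDER BY r.run_date
        """,
        (name, sequencer, barcode, since),
    ).fetchall()


conn = build_demo_db()
print(metric_for(conn, "gc_percent", "A", "X", "2014-01-01"))
# [(1, 41.0), (5, 43.0)]
```

A fixed start date is passed in rather than computing "the last month" from the current time, so the example stays deterministic.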
Applying phylogenetic analysis to viral livestock diseases: moving beyond molecular typing.
Olvera, Alex; Busquets, Núria; Cortey, Marti; de Deus, Nilsa; Ganges, Llilianne; Núñez, José Ignacio; Peralta, Bibiana; Toskano, Jennifer; Dolz, Roser
2010-05-01
Changes in livestock production systems in recent years have altered the presentation of many diseases resulting in the need for more sophisticated control measures. At the same time, new molecular assays have been developed to support the diagnosis of animal viral disease. Nucleotide sequences generated by these diagnostic techniques can be used in phylogenetic analysis to infer phenotypes by sequence homology and to perform molecular epidemiology studies. In this review, some key elements of phylogenetic analysis are highlighted, such as the selection of the appropriate neutral phylogenetic marker, the proper phylogenetic method and different techniques to test the reliability of the resulting tree. Examples are given of current and future applications of phylogenetic reconstructions in viral livestock diseases. Copyright 2009 Elsevier Ltd. All rights reserved.
Generation and migration of hydrocarbons in offshore South Texas Gulf Coast sediments
NASA Astrophysics Data System (ADS)
Huc, A. Y.; Hunt, J. M.
1980-08-01
The hydrocarbon content of two thick Tertiary sequences from the offshore Gulf Coast (South Padre Island and Mustang Island) was studied using a headspace technique, thermal distillation, pyrolysis and solvent extraction. The threshold of oil generation was determined to occur in the range of 3050 m (10,000 ft; 120°C) in Miocene sediments. In the South Padre Island well, the distribution of the different classes of hydrocarbons along the sedimentary column suggests some updip migration processes are occurring.
Chiu, Elliott S; Hoover, Edward A; VandeWoude, Sue
2018-01-10
Feline leukemia virus (FeLV) was the first feline retrovirus discovered, and is associated with multiple fatal disease syndromes in cats, including lymphoma. The original research conducted on FeLV employed classical virological techniques. As methods have evolved to allow FeLV genetic characterization, investigators have continued to unravel the molecular pathology associated with this fascinating agent. In this review, we discuss how FeLV classification, transmission, and disease-inducing potential have been defined sequentially by viral interference assays, Sanger sequencing, PCR, and next-generation sequencing. In particular, we highlight the influences of endogenous FeLV and host genetics that represent FeLV research opportunities on the near horizon.
Ribas, Laia; Pardo, Belén G; Fernández, Carlos; Alvarez-Diós, José Antonio; Gómez-Tato, Antonio; Quiroga, María Isabel; Planas, Josep V; Sitjà-Bobadilla, Ariadna; Martínez, Paulino; Piferrer, Francesc
2013-03-15
Genomic resources for plant and animal species that are under exploitation primarily for human consumption are increasingly important, among other things, for understanding physiological processes and for establishing adequate genetic selection programs. Currently available techniques for high-throughput sequencing have been implemented in a number of species, including fish, to obtain a proper description of the transcriptome. The objective of this study was to generate a comprehensive transcriptomic database in turbot, a highly priced farmed fish species in Europe, with potential expansion to other areas of the world, for which there are unsolved production bottlenecks, to better understand reproductive- and immune-related functions. This information is essential to implement marker assisted selection programs useful for the turbot industry. Expressed sequence tags were generated by Sanger sequencing of cDNA libraries from different immune-related tissues after several parasitic challenges. The resulting database ("Turbot 2 database") was enlarged with sequences generated from a 454 sequencing run of brain-hypophysis-gonadal axis-derived RNA obtained from turbot at different development stages. The assembly of Sanger and 454 sequences generated 52,427 consensus sequences ("Turbot 3 database"), of which 23,661 were successfully annotated. A total of 1,410 sequences were confirmed to be related to reproduction, and key genes involved in sex differentiation and maturation were identified for the first time in turbot (AR, AMH, SRY-related genes, CYP19A, ZPGs, STAR, FSHR, etc.). Similarly, 2,241 sequences were related to the immune system and several novel key immune genes were identified (BCL, TRAF, NCK, CD28 and TOLLIP, among others). The number of genes of many relevant reproduction- and immune-related pathways present in the database was 50-90% of the total gene count of each pathway.
In addition, 1,237 microsatellites and 7,362 single nucleotide polymorphisms (SNPs) were also compiled. Further, 2,976 putative natural antisense transcripts (NATs) including microRNAs were also identified. The combined sequencing strategies employed here significantly increased the turbot genomic resources available, including 34,400 novel sequences. The generated database contains a larger number of genes relevant for reproduction- and immune-associated studies, with an excellent coverage of most genes present in many relevant physiological pathways. This database also allowed the identification of many microsatellites and SNP markers that will be very useful for population and genome screening and a valuable aid in marker assisted selection programs.
2013-01-01
Background: Genomic resources for plant and animal species that are under exploitation primarily for human consumption are increasingly important, among other things, for understanding physiological processes and for establishing adequate genetic selection programs. Currently available techniques for high-throughput sequencing have been implemented in a number of species, including fish, to obtain a proper description of the transcriptome. The objective of this study was to generate a comprehensive transcriptomic database in turbot, a highly priced farmed fish species in Europe, with potential expansion to other areas of the world, for which there are unsolved production bottlenecks, to better understand reproductive- and immune-related functions. This information is essential to implement marker assisted selection programs useful for the turbot industry. Results: Expressed sequence tags were generated by Sanger sequencing of cDNA libraries from different immune-related tissues after several parasitic challenges. The resulting database (“Turbot 2 database”) was enlarged with sequences generated from a 454 sequencing run of brain-hypophysis-gonadal axis-derived RNA obtained from turbot at different development stages. The assembly of Sanger and 454 sequences generated 52,427 consensus sequences (“Turbot 3 database”), of which 23,661 were successfully annotated. A total of 1,410 sequences were confirmed to be related to reproduction, and key genes involved in sex differentiation and maturation were identified for the first time in turbot (AR, AMH, SRY-related genes, CYP19A, ZPGs, STAR, FSHR, etc.). Similarly, 2,241 sequences were related to the immune system and several novel key immune genes were identified (BCL, TRAF, NCK, CD28 and TOLLIP, among others). The number of genes of many relevant reproduction- and immune-related pathways present in the database was 50–90% of the total gene count of each pathway.
In addition, 1,237 microsatellites and 7,362 single nucleotide polymorphisms (SNPs) were also compiled. Further, 2,976 putative natural antisense transcripts (NATs) including microRNAs were also identified. Conclusions: The combined sequencing strategies employed here significantly increased the turbot genomic resources available, including 34,400 novel sequences. The generated database contains a larger number of genes relevant for reproduction- and immune-associated studies, with an excellent coverage of most genes present in many relevant physiological pathways. This database also allowed the identification of many microsatellites and SNP markers that will be very useful for population and genome screening and a valuable aid in marker assisted selection programs. PMID:23497389
Neural-network-designed pulse sequences for robust control of singlet-triplet qubits
NASA Astrophysics Data System (ADS)
Yang, Xu-Chen; Yung, Man-Hong; Wang, Xin
2018-04-01
Composite pulses are essential for universal manipulation of singlet-triplet spin qubits. In the absence of noise, they are required to perform arbitrary single-qubit operations due to the special control constraint of a singlet-triplet qubit, while in a noisy environment, more complicated sequences have been developed to dynamically correct the error. Tailoring these sequences typically requires numerically solving a set of nonlinear equations. Here we demonstrate that these pulse sequences can be generated by a well-trained, double-layer neural network. For sequences designed for the noise-free case, the trained neural network is capable of producing almost exactly the same pulses known in the literature. For more complicated noise-correcting sequences, the neural network produces pulses with slightly different line shapes, but the robustness against noises remains comparable. These results indicate that the neural network can be a judicious and powerful alternative to existing techniques in developing pulse sequences for universal fault-tolerant quantum computation.
Shore, Sabrina; Henderson, Jordana M; Lebedev, Alexandre; Salcedo, Michelle P; Zon, Gerald; McCaffrey, Anton P; Paul, Natasha; Hogrefe, Richard I
2016-01-01
For most sample types, the automation of RNA and DNA sample preparation workflows enables high throughput next-generation sequencing (NGS) library preparation. Greater adoption of small RNA (sRNA) sequencing has been hindered by high sample input requirements and inherent ligation side products formed during library preparation. These side products, known as adapter dimer, are very similar in size to the tagged library. Most sRNA library preparation strategies thus employ a gel purification step to isolate tagged library from adapter dimer contaminants. At very low sample inputs, adapter dimer side products dominate the reaction and limit the sensitivity of this technique. Here we address the need for improved specificity of sRNA library preparation workflows with a novel library preparation approach that uses modified adapters to suppress adapter dimer formation. This workflow allows for lower sample inputs and elimination of the gel purification step, which in turn allows for an automatable sRNA library preparation protocol.
Lauerman, Lloyd H
2004-12-01
Since the discovery of the polymerase chain reaction (PCR) 20 years ago, an avalanche of scientific publications have reported major developments and changes in specialized equipment, reagents, sample preparation, computer programs and techniques, generated through business, government and university research. The requirement for genetic sequences for primer selection and validation has been greatly facilitated by the development of new sequencing techniques, machines and computer programs. Genetic libraries, such as GenBank, EMBL and DDBJ continue to accumulate a wealth of genetic sequence information for the development and validation of molecular-based diagnostic procedures concerning human and veterinary disease agents. The mechanization of various aspects of the PCR assay, such as robotics, microfluidics and nanotechnology, has made it possible for the rapid advancement of new procedures. Real-time PCR, DNA microarray and DNA chips utilize these newer techniques in conjunction with computer and computer programs. Instruments for hand-held PCR assays are being developed. The PCR and reverse transcription-PCR (RT-PCR) assays have greatly accelerated the speed and accuracy of diagnoses of human and animal disease, especially of the infectious agents that are difficult to isolate or demonstrate. The PCR has made it possible to genetically characterize a microbial isolate inexpensively and rapidly for identification, typing and epidemiological comparison.
de la Fuente, Gabriel; Belanche, Alejandro; Girwood, Susan E.; Pinloche, Eric; Wilkinson, Toby; Newbold, C. Jamie
2014-01-01
The development of next generation sequencing has challenged the use of other molecular fingerprinting methods used to study microbial diversity. We analysed the bacterial diversity in the rumen of defaunated sheep following the introduction of different protozoal populations, using both next generation sequencing (NGS: Ion Torrent PGM) and terminal restriction fragment length polymorphism (T-RFLP). Although absolute numbers differed, there was a high correlation between NGS and T-RFLP in terms of richness and diversity with R values of 0.836 and 0.781 for richness and Shannon-Wiener index, respectively. Dendrograms for both datasets were also highly correlated (Mantel test = 0.742). Eighteen OTUs and ten genera were significantly impacted by the addition of rumen protozoa, with an increase in the relative abundance of Prevotella, Bacteroides and Ruminobacter, related to an increase in free ammonia levels in the rumen. Our findings suggest that classic fingerprinting methods are still valuable tools to study microbial diversity and structure in complex environments but that NGS techniques now provide cost-effective alternatives that provide a far greater level of information on the individual members of the microbial population. PMID:25051490
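The two diversity measures compared above can be computed from a vector of per-taxon counts. The Python sketch below shows richness, the Shannon-Wiener index H' = -Σ p_i ln p_i, and a Pearson correlation between two sets of per-sample values. It assumes that the reported R values are Pearson coefficients, which the abstract does not state, and the function names and sample counts are illustrative only, not the study's data.

```python
import math
from typing import Sequence


def richness(counts: Sequence[float]) -> int:
    """Number of taxa (or T-RFLP peaks) with a non-zero count."""
    return sum(1 for c in counts if c > 0)


def shannon_wiener(counts: Sequence[float]) -> float:
    """Shannon-Wiener index H' = -sum(p_i * ln p_i) over non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up counts for one sample: four equally abundant taxa.
even_sample = [10, 10, 10, 10]
print(richness(even_sample))        # 4
print(shannon_wiener(even_sample))  # equals ln(4) for a perfectly even sample
```

Computing both indices per sample for each method and passing the two resulting lists to `pearson_r` reproduces the style of comparison described, under the stated assumption about R.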
Jung, Yeonjoo; Kim, Pora; Jung, Yeonhwa; Keum, Juhee; Kim, Soon-Nam; Choi, Yong Soo; Do, In-Gu; Lee, Jinseon; Choi, So-Jung; Kim, Sujin; Lee, Jong-Eun; Kim, Jhingook; Lee, Sanghyuk; Kim, Jaesang
2012-06-01
An increasing number of chromosomal aberrations is being identified in solid tumors providing novel biomarkers for various types of cancer and new insights into the mechanisms of carcinogenesis. We applied next generation sequencing technique to analyze the transcriptome of the non-small cell lung carcinoma (NSCLC) cell line H2228 and discovered a fusion transcript composed of multiple exons of ALK (anaplastic lymphoma receptor tyrosine kinase) and PTPN3 (protein tyrosine phosphatase, nonreceptor Type 3). Detailed analysis of the genomic structure revealed that a portion of genomic region encompassing Exons 10 and 11 of ALK has been translocated into the intronic region between Exons 2 and 3 of PTPN3. The key net result appears to be the null mutation of one allele of PTPN3, a gene with tumor suppressor activity. Consistently, ectopic expression of PTPN3 in NSCLC cell lines led to inhibition of colony formation. Our study confirms the utility of next generation sequencing as a tool for the discovery of somatic mutations and has led to the identification of a novel mutation in NSCLC that may be of diagnostic, prognostic, and therapeutic importance. Copyright © 2012 Wiley Periodicals, Inc.
Arpeggio: harmonic compression of ChIP-seq data reveals protein-chromatin interaction signatures
Stanton, Kelly Patrick; Parisi, Fabio; Strino, Francesco; Rabin, Neta; Asp, Patrik; Kluger, Yuval
2013-01-01
Researchers generating new genome-wide data in an exploratory sequencing study can gain biological insights by comparing their data with well-annotated data sets possessing similar genomic patterns. Data compression techniques are needed for efficient comparisons of a new genomic experiment with large repositories of publicly available profiles. Furthermore, data representations that allow comparisons of genomic signals from different platforms and across species enhance our ability to leverage these large repositories. Here, we present a signal processing approach that characterizes protein–chromatin interaction patterns at length scales of several kilobases. This allows us to efficiently compare numerous chromatin-immunoprecipitation sequencing (ChIP-seq) data sets consisting of many types of DNA-binding proteins collected from a variety of cells, conditions and organisms. Importantly, these interaction patterns broadly reflect the biological properties of the binding events. To generate these profiles, termed Arpeggio profiles, we applied harmonic deconvolution techniques to the autocorrelation profiles of the ChIP-seq signals. We used 806 publicly available ChIP-seq experiments and showed that Arpeggio profiles with similar spectral densities shared biological properties. Arpeggio profiles of ChIP-seq data sets revealed characteristics that are not easily detected by standard peak finders. They also allowed us to relate sequencing data sets from different genomes, experimental platforms and protocols. Arpeggio is freely available at http://sourceforge.net/p/arpeggio/wiki/Home/. PMID:23873955
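The abstract states that Arpeggio profiles are derived from the autocorrelation profiles of ChIP-seq signals. The Python sketch below covers only that first ingredient, a normalized autocorrelation of a one-dimensional coverage signal. It does not implement the harmonic deconvolution step or any other part of the Arpeggio software, and the function name and toy signal are assumptions for illustration.

```python
from typing import List, Sequence


def autocorrelation(signal: Sequence[float], max_lag: int) -> List[float]:
    """Normalized autocorrelation of a signal for lags 0..max_lag.

    The signal is mean-centered, and each lagged sum of products is divided
    by the lag-0 sum, so the value at lag 0 is exactly 1.0.
    """
    n = len(signal)
    if n == 0 or max_lag >= n:
        raise ValueError("max_lag must be smaller than the signal length")
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    norm = sum(c * c for c in centered)
    if norm == 0:
        raise ValueError("autocorrelation undefined for a constant signal")
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / norm
        for lag in range(max_lag + 1)
    ]


# Toy coverage track with a peak every 4 bins (illustrative, not real data).
coverage = [1, 0, 0, 0] * 8
acf = autocorrelation(coverage, max_lag=4)
print(acf[4] > acf[2])  # True: the lag matching the peak spacing stands out
```

Periodic structure in the binding signal shows up as peaks in the autocorrelation at lags matching the spacing, which is the kind of length-scale information a spectral analysis can then summarize.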
Arpeggio: harmonic compression of ChIP-seq data reveals protein-chromatin interaction signatures.
Stanton, Kelly Patrick; Parisi, Fabio; Strino, Francesco; Rabin, Neta; Asp, Patrik; Kluger, Yuval
2013-09-01
Researchers generating new genome-wide data in an exploratory sequencing study can gain biological insights by comparing their data with well-annotated data sets possessing similar genomic patterns. Data compression techniques are needed for efficient comparisons of a new genomic experiment with large repositories of publicly available profiles. Furthermore, data representations that allow comparisons of genomic signals from different platforms and across species enhance our ability to leverage these large repositories. Here, we present a signal processing approach that characterizes protein-chromatin interaction patterns at length scales of several kilobases. This allows us to efficiently compare numerous chromatin-immunoprecipitation sequencing (ChIP-seq) data sets consisting of many types of DNA-binding proteins collected from a variety of cells, conditions and organisms. Importantly, these interaction patterns broadly reflect the biological properties of the binding events. To generate these profiles, termed Arpeggio profiles, we applied harmonic deconvolution techniques to the autocorrelation profiles of the ChIP-seq signals. We used 806 publicly available ChIP-seq experiments and showed that Arpeggio profiles with similar spectral densities shared biological properties. Arpeggio profiles of ChIP-seq data sets revealed characteristics that are not easily detected by standard peak finders. They also allowed us to relate sequencing data sets from different genomes, experimental platforms and protocols. Arpeggio is freely available at http://sourceforge.net/p/arpeggio/wiki/Home/.
Analysis of ethanol-soluble extractives in southern pine wood by low-field proton NMR
Thomas L. Eberhardt; Thomas Elder; Nicole Labbe
2007-01-01
Low-field proton NMR was evaluated as a nondestructive and rapid technique for measuring ethanol-soluble extractives in southern pine wood. Matchstick-sized wood specimens were steeped in extractive-containing solutions to generate extractive-enriched samples for analysis. Decay curves obtained by the Carr-Purcell-Meiboom-Gill (CPMG) pulse sequence were analyzed with...
The Real Science Crisis: Bleak Prospects for Young Researchers
ERIC Educational Resources Information Center
Monastersky, Richard
2007-01-01
It is the best of times and worst of times to start a science career in the United States. Researchers today have access to powerful new tools and techniques--such as rapid gene sequencers and giant telescopes--that have accelerated the pace of discovery beyond the imagination of previous generations. But for many of today's graduate students, the…
Eastman, Alexander W.; Yuan, Ze-Chun
2015-01-01
Advances in sequencing technology have drastically increased the depth and feasibility of bacterial genome sequencing. However, little information is available that details the specific techniques and procedures employed during genome sequencing despite the large numbers of published genomes. Shotgun approaches employed by second-generation sequencing platforms have necessitated the development of robust bioinformatics tools for in silico assembly, and complete assembly is limited by the presence of repetitive DNA sequences and multi-copy operons. Typically, re-sequencing with multiple platforms and laborious, targeted Sanger sequencing are employed to finish a draft bacterial genome. Here we describe a novel strategy based on the identification and targeted sequencing of repetitive rDNA operons to expedite bacterial genome assembly and finishing. Our strategy was validated by finishing the genome of Paenibacillus polymyxa strain CR1, a bacterium with potential in sustainable agriculture and bio-based processes. An analysis of the 38 contigs contained in the P. polymyxa strain CR1 draft genome revealed 12 repetitive rDNA operons with varied intragenic and flanking regions of variable length, unanimously located at contig boundaries and within contig gaps. These highly similar but not identical rDNA operons were experimentally verified and sequenced simultaneously with multiple, specially designed primer sets. This approach also identified and corrected significant sequence rearrangement generated during the initial in silico assembly of sequencing reads. Our approach reduces the required effort associated with blind primer walking for contig assembly, increasing both the speed and feasibility of genome finishing. Our study further reinforces the notion that repetitive DNA elements are major limiting factors for genome finishing. Moreover, we provided a step-by-step workflow for genome finishing, which may guide future bacterial genome finishing projects.
PMID:25653642
Single Nucleobase Identification Using Biophysical Signatures from Nanoelectronic Quantum Tunneling.
Korshoj, Lee E; Afsari, Sepideh; Khan, Sajida; Chatterjee, Anushree; Nagpal, Prashant
2017-03-01
Nanoelectronic DNA sequencing can provide an important alternative to sequencing-by-synthesis by reducing sample preparation time, cost, and complexity as a high-throughput next-generation technique with accurate single-molecule identification. However, sample noise and signature overlap continue to prevent high-resolution and accurate sequencing results. Probing the molecular orbitals of chemically distinct DNA nucleobases offers a path for facile sequence identification, but molecular entropy (from nucleotide conformations) makes such identification difficult when relying only on the energies of lowest-unoccupied and highest-occupied molecular orbitals (LUMO and HOMO). Here, nine biophysical parameters are developed to better characterize molecular orbitals of individual nucleobases, intended for single-molecule DNA sequencing using quantum tunneling of charges. For this analysis, theoretical models for quantum tunneling are combined with transition voltage spectroscopy to obtain measurable parameters unique to the molecule within an electronic junction. Scanning tunneling spectroscopy is then used to measure these nine biophysical parameters for DNA nucleotides, and a modified machine learning algorithm identified nucleobases. The new parameters significantly improve base calling over merely using LUMO and HOMO frontier orbital energies. Furthermore, high accuracies for identifying DNA nucleobases were observed at different pH conditions. These results have significant implications for developing a robust and accurate high-throughput nanoelectronic DNA sequencing technique. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Porto, William F; Pires, Állan S; Franco, Octavio L
2017-08-07
Antimicrobial activity prediction tools aim to aid the discovery of novel antimicrobial peptide (AMP) sequences using machine learning methods. Such approaches have gained increasing importance in the generation of novel synthetic peptides by means of rational design techniques. This study focused on the ability of such approaches to predict the antimicrobial activities of sequences that were previously characterized at the protein level by in vitro studies. Using four web servers and one standalone software package, we evaluated 78 sequences generated by the so-called linguistic model, comprising 40 designed and 38 shuffled sequences, with ∼60% and ∼25% identity to AMPs, respectively. The ab initio molecular modelling of such sequences indicated that the structure does not affect the predictions, as both sets present similar structures. Overall, the systems failed to predict shuffled versions of designed peptides, as the two are identical in amino acid composition, which results in accuracies below 30%. The prediction accuracy is negatively affected by the low specificity of all systems evaluated here, even though they reached 100% sensitivity. Our results suggest that complementary approaches with high specificity, not necessarily high accuracy, should be developed to be used together with the current systems, overcoming their limitations. Copyright © 2017 Elsevier Ltd. All rights reserved.
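The relationship between sensitivity, specificity, and accuracy that drives this conclusion can be made concrete with a short Python sketch. The function name and the confusion-matrix counts below are illustrative assumptions and are not data from the study; they describe a hypothetical predictor that labels every peptide as antimicrobial.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy from confusion-matrix counts.

    tp/fn count true actives predicted active/inactive;
    tn/fp count true inactives predicted inactive/active.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")
    return {
        "sensitivity": tp / positives,          # fraction of actives found
        "specificity": tn / negatives,          # fraction of inactives rejected
        "accuracy": (tp + tn) / (positives + negatives),
    }


# Hypothetical predictor that calls everything "antimicrobial" on a made-up
# set of 20 active and 30 inactive peptides.
metrics = classification_metrics(tp=20, fp=30, tn=0, fn=0)
print(metrics)
# {'sensitivity': 1.0, 'specificity': 0.0, 'accuracy': 0.4}
```

A predictor of this kind reaches perfect sensitivity while its accuracy is capped by the share of true actives in the test set, which is why a high-specificity complementary method can add value even if its own accuracy is modest.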
Selection of peptides binding to metallic borides by screening M13 phage display libraries.
Ploss, Martin; Facey, Sandra J; Bruhn, Carina; Zemel, Limor; Hofmann, Kathrin; Stark, Robert W; Albert, Barbara; Hauer, Bernhard
2014-02-10
Metal borides are a class of inorganic solids that is much less known and investigated than, for example, metal oxides or intermetallics. At the same time it is a highly versatile and interesting class of compounds in terms of physical and chemical properties, like semiconductivity, ferromagnetism, or catalytic activity. This makes these substances attractive for the generation of new materials. Very little is known about the interaction between organic materials and borides. To generate nanostructured and composite materials which consist of metal borides and organic modifiers it is necessary to develop new synthetic strategies. Phage peptide display libraries are commonly used to select peptides that bind specifically to metals, metal oxides, and semiconductors. Further, these binding peptides can serve as templates to control the nucleation and growth of inorganic nanoparticles. Additionally, the combination of two different binding motifs into a single bifunctional phage could be useful for the generation of new composite materials. In this study, we have identified a unique set of sequences that bind to amorphous and crystalline nickel boride (Ni3B) nanoparticles, from a random peptide library using the phage display technique. Using this technique, strong binders were identified that are selective for nickel boride. Sequence analysis of the peptides revealed that the sequences exhibit similar, yet subtly different, patterns of amino acid usage. Although a predominant binding motif was not observed, certain charged amino acids emerged as essential in specific binding to both substrates. The 7-mer peptide sequence LGFREKE, isolated on amorphous Ni3B, emerged as the best binder for both substrates. Fluorescence microscopy and atomic force microscopy confirmed the specific binding affinity of LGFREKE-expressing phage to amorphous and crystalline Ni3B nanoparticles.
This study is, to our knowledge, the first to identify peptides that bind specifically to amorphous and to crystalline Ni3B nanoparticles. We think that the identified strong binding sequences described here could potentially serve for the utilisation of M13 phage as a viable alternative to other methods to create tailor-made boride composite materials or new catalytic surfaces by a biologically driven nano-assembly synthesis and structuring.
Forman, Michael A; Young, Derek
2012-09-18
Examples of methods for generating data based on a communications channel are described. In one such example, a processing unit may generate a first vector representation based in part on at least two characteristics of a communications channel. A constellation having at least two dimensions may be addressed with the first vector representation to identify a first symbol associated with the first vector representation. The constellation represents a plurality of regions, each region associated with a respective symbol. The symbol may be used to generate data, which may be stored in an electronic storage medium and used as a cryptographic key or a spreading code or hopping sequence in a modulation technique.
Marshall, Charla; Sturk-Andreaggi, Kimberly; Daniels-Higginbotham, Jennifer; Oliver, Robert Sean; Barritt-Ross, Suzanne; McMahon, Timothy P
2017-11-01
Next-generation ancient DNA technologies have the potential to assist in the analysis of degraded DNA extracted from forensic specimens. Mitochondrial genome (mitogenome) sequencing, specifically, may be of benefit to samples that fail to yield forensically relevant genetic information using conventional PCR-based techniques. This report summarizes the Armed Forces Medical Examiner System's Armed Forces DNA Identification Laboratory's (AFMES-AFDIL) performance evaluation of a Next-Generation Sequencing protocol for degraded and chemically treated past accounting samples. The procedure involves hybridization capture for targeted enrichment of mitochondrial DNA, massively parallel sequencing using Illumina chemistry, and an automated bioinformatic pipeline for forensic mtDNA profile generation. A total of 22 non-probative samples and associated controls were processed in the present study, spanning a range of DNA quantity and quality. Data were generated from over 100 DNA libraries by ten DNA analysts over the course of five months. The results show that the mitogenome sequencing procedure is reliable and robust, sensitive to low template (one ng control DNA) as well as degraded DNA, and specific to the analysis of the human mitogenome. Haplotypes were overall concordant between NGS replicates and with previously generated Sanger control region data. Due to the inherent risk for contamination when working with low-template, degraded DNA, a contamination assessment was performed. The consumables were shown to be void of human DNA contaminants and suitable for forensic use. Reagent blanks and negative controls were analyzed to determine the background signal of the procedure. This background signal was then used to set analytical and reporting thresholds, which were designated at 4.0X (limit of detection) and 10.0X (limit of quantitation) average coverage across the mitogenome, respectively.
Nearly all human samples exceeded the reporting threshold, although coverage was reduced in chemically treated samples resulting in a ∼58% passing rate for these poor-quality samples. A concordance assessment demonstrated the reliability of the NGS data when compared to known Sanger profiles. One case sample was shown to be mixed with a co-processed sample and two reagent blanks indicated the presence of DNA above the analytical threshold. This contamination was attributed to sequencing crosstalk from simultaneously sequenced high-quality samples to include the positive control. Overall this study demonstrated that hybridization capture and Illumina sequencing provide a viable method for mitogenome sequencing of degraded and chemically treated skeletal DNA samples, yet may require alternative measures of quality control. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
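The two coverage thresholds quoted in this abstract can be read as a simple three-way classification of a sample's average mitogenome coverage. The following is a minimal sketch under stated assumptions: the function name, the category labels, and the treatment of values exactly at a threshold (counted as meeting it) are illustrative choices, not details given in the abstract.

```python
# Thresholds quoted in the abstract (average coverage across the mitogenome).
ANALYTICAL_THRESHOLD = 4.0   # limit of detection, in X
REPORTING_THRESHOLD = 10.0   # limit of quantitation, in X


def classify_coverage(avg_coverage: float) -> str:
    """Classify a sample by average coverage.

    Assumption: a value exactly equal to a threshold counts as meeting it;
    the abstract does not state how boundary values are handled.
    """
    if avg_coverage < ANALYTICAL_THRESHOLD:
        return "below analytical threshold"
    if avg_coverage < REPORTING_THRESHOLD:
        return "detected, not reportable"
    return "reportable"


# Hypothetical coverage values, for illustration only.
for cov in (2.5, 6.0, 35.0):
    print(cov, classify_coverage(cov))
```

Under this reading, a reagent blank "above the analytical threshold" is one whose average coverage reaches at least 4.0X.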
A photogrammetric technique for generation of an accurate multispectral optical flow dataset
NASA Astrophysics Data System (ADS)
Kniaz, V. V.
2017-06-01
The availability of an accurate dataset is a key requirement for the successful development of an optical flow estimation algorithm. A large number of freely available optical flow datasets have been developed in recent years and have given rise to many powerful algorithms. However, most of these datasets include only images captured in the visible spectrum. This paper focuses on the creation of a multispectral optical flow dataset with an accurate ground truth. The generation of an accurate ground truth optical flow is a rather complex problem, as no device for error-free optical flow measurement has been developed to date. Existing methods for ground truth optical flow estimation are based on hidden textures, 3D modelling or laser scanning. Such techniques either work only with a synthetic optical flow or provide a sparse ground truth optical flow. In this paper a new photogrammetric method for generation of an accurate ground truth optical flow is proposed. The method combines the accuracy and density of synthetic optical flow datasets with the flexibility of laser scanning based techniques. A multispectral dataset including various image sequences was generated using the developed method. The dataset is freely available on the accompanying web site.
Hsu, Chung-Lun; Jiang, Haowei; Venkatesh, A G; Hall, Drew A
2015-10-01
Over the past two decades, nanopores have been a promising technology for next generation deoxyribonucleic acid (DNA) sequencing. Here, we present a hybrid semi-digital transimpedance amplifier (HSD-TIA) to sense the minute current signatures introduced by single-stranded DNA (ssDNA) translocating through a nanopore, while discharging the baseline current using a semi-digital feedback loop. The amplifier achieves fast settling by adaptively tuning a DC compensation current when a step input is detected. A noise cancellation technique reduces the total input-referred current noise caused by the parasitic input capacitance. Measurement results show the performance of the amplifier with 31.6 MΩ mid-band gain, 950 kHz bandwidth, and 8.5 fA/√Hz input-referred current noise, a 2× noise reduction due to the noise cancellation technique. The settling response is demonstrated by observing the insertion of a protein nanopore in a lipid bilayer. Using the nanopore, the HSD-TIA was able to measure ssDNA translocation events.
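To put the quoted amplifier figures in context, the short calculation below converts the mid-band gain to dBΩ and estimates an integrated RMS input-referred noise. It is a rough sketch only: it assumes the 8.5 fA/√Hz density is flat across the full 950 kHz bandwidth, which the abstract does not claim, and transimpedance amplifier noise commonly rises with frequency because of input capacitance.

```python
import math

# Figures quoted in the abstract.
gain_ohm = 31.6e6            # mid-band transimpedance gain, in ohms
noise_density = 8.5e-15      # input-referred current noise, in A/sqrt(Hz)
bandwidth_hz = 950e3         # bandwidth, in Hz

# Transimpedance gain expressed in dB relative to 1 ohm.
gain_dbohm = 20 * math.log10(gain_ohm)

# Integrated RMS noise, ASSUMING a flat (white) density over the whole band.
rms_noise_a = noise_density * math.sqrt(bandwidth_hz)

print(f"gain: {gain_dbohm:.1f} dBOhm")
print(f"integrated noise (flat-density assumption): {rms_noise_a * 1e12:.2f} pA rms")
```

With these inputs the gain comes out at roughly 150 dBΩ and the flat-density noise estimate at a little over 8 pA rms.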
Linear and nonlinear frequency- and time-domain spectroscopy with multiple frequency combs.
Bennett, Kochise; Rouxel, Jeremy R; Mukamel, Shaul
2017-09-07
Two techniques that employ equally spaced trains of optical pulses to map an optical high frequency into a low frequency modulation of the signal that can be detected in real time are compared. The development of phase-stable optical frequency combs has opened up new avenues to metrology and spectroscopy. The ability to generate a series of frequency spikes with precisely controlled separation permits a fast, highly accurate sampling of the material response. Recently, pairs of frequency combs with slightly different repetition rates have been utilized to down-convert material susceptibilities from the optical to microwave regime where they can be recorded in real time. We show how this one-dimensional dual comb technique can be extended to multiple dimensions by using several combs. We demonstrate how nonlinear susceptibilities can be quickly acquired using this technique. In a second class of techniques, sequences of ultrafast mode locked laser pulses are used to recover pathways of interactions contributing to nonlinear susceptibilities by using a photo-acoustic modulation varying along the sequences. We show that these techniques can be viewed as a time-domain analog of the multiple frequency comb scheme.
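The down-conversion performed by a pair of combs with slightly different repetition rates can be written out in a few lines. This is the standard textbook picture of dual-comb detection, given here as a sketch rather than the authors' derivation; it assumes both combs share the same offset frequency $f_0$, and the numerical values in the comments are hypothetical.

```latex
% Comb 1 has repetition rate f_r; comb 2 has repetition rate f_r + \Delta f_r.
\begin{align}
\nu^{(1)}_n &= f_0 + n f_r, \\
\nu^{(2)}_n &= f_0 + n\,(f_r + \Delta f_r), \\
f^{\mathrm{beat}}_n &= \nu^{(2)}_n - \nu^{(1)}_n = n\,\Delta f_r .
\end{align}
% Neighbouring optical lines are separated by f_r, while neighbouring beat
% notes are separated by \Delta f_r, so the optical spectrum is compressed
% by the factor f_r / \Delta f_r.
% Hypothetical numbers: f_r = 100 MHz and \Delta f_r = 100 Hz give a factor
% of 10^6, so a 1 THz optical span maps onto a 1 MHz radio-frequency span.
```

Each optical comb line is thus tagged by a distinct beat frequency, which is what allows the material response to be recorded in real time with ordinary electronics.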
Langley, Alexander R.; Gräf, Stefan; Smith, James C.; Krude, Torsten
2016-01-01
Next-generation sequencing has enabled the genome-wide identification of human DNA replication origins. However, different approaches to mapping replication origins, namely (i) sequencing isolated small nascent DNA strands (SNS-seq); (ii) sequencing replication bubbles (bubble-seq) and (iii) sequencing Okazaki fragments (OK-seq), show only limited concordance. To address this controversy, we describe here an independent high-resolution origin mapping technique that we call initiation site sequencing (ini-seq). In this approach, newly replicated DNA is directly labelled with digoxigenin-dUTP near the sites of its initiation in a cell-free system. The labelled DNA is then immunoprecipitated and genomic locations are determined by DNA sequencing. Using this technique we identify >25,000 discrete origin sites at sub-kilobase resolution on the human genome, with high concordance between biological replicates. Most activated origins identified by ini-seq are found at transcriptional start sites and contain G-quadruplex (G4) motifs. They tend to cluster in early-replicating domains, providing a correlation between early replication timing and local density of activated origins. Origins identified by ini-seq show highest concordance with sites identified by SNS-seq, followed by OK-seq and bubble-seq. Furthermore, germline origins identified by positive nucleotide distribution skew jumps overlap with origins identified by ini-seq and OK-seq more frequently and more specifically than do sites identified by either SNS-seq or bubble-seq. PMID:27587586
Langley, Alexander R; Gräf, Stefan; Smith, James C; Krude, Torsten
2016-12-01
Next-generation sequencing has enabled the genome-wide identification of human DNA replication origins. However, different approaches to mapping replication origins, namely (i) sequencing isolated small nascent DNA strands (SNS-seq); (ii) sequencing replication bubbles (bubble-seq) and (iii) sequencing Okazaki fragments (OK-seq), show only limited concordance. To address this controversy, we describe here an independent high-resolution origin mapping technique that we call initiation site sequencing (ini-seq). In this approach, newly replicated DNA is directly labelled with digoxigenin-dUTP near the sites of its initiation in a cell-free system. The labelled DNA is then immunoprecipitated and genomic locations are determined by DNA sequencing. Using this technique we identify >25,000 discrete origin sites at sub-kilobase resolution on the human genome, with high concordance between biological replicates. Most activated origins identified by ini-seq are found at transcriptional start sites and contain G-quadruplex (G4) motifs. They tend to cluster in early-replicating domains, providing a correlation between early replication timing and local density of activated origins. Origins identified by ini-seq show highest concordance with sites identified by SNS-seq, followed by OK-seq and bubble-seq. Furthermore, germline origins identified by positive nucleotide distribution skew jumps overlap with origins identified by ini-seq and OK-seq more frequently and more specifically than do sites identified by either SNS-seq or bubble-seq. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Micropipette force probe to quantify single-cell force generation: application to T-cell activation
Sawicka, Anna; Babataheri, Avin; Dogniaux, Stéphanie; Barakat, Abdul I.; Gonzalez-Rodriguez, David; Hivroz, Claire; Husson, Julien
2017-01-01
In response to engagement of surface molecules, cells generate active forces that regulate many cellular processes. Developing tools that permit gathering mechanical and morphological information on these forces is of the utmost importance. Here we describe a new technique, the micropipette force probe, that uses a micropipette as a flexible cantilever that can aspirate at its tip a bead that is coated with molecules of interest and is brought in contact with the cell. This technique simultaneously allows tracking the resulting changes in cell morphology and mechanics as well as measuring the forces generated by the cell. To illustrate the power of this technique, we applied it to the study of human primary T lymphocytes (T-cells). It allowed the fine monitoring of pushing and pulling forces generated by T-cells in response to various activating antibodies and bending stiffness of the micropipette. We further dissected the sequence of mechanical and morphological events occurring during T-cell activation to model force generation and to reveal heterogeneity in the cell population studied. We also report the first measurement of the changes in Young’s modulus of T-cells during their activation, showing that T-cells stiffen within the first minutes of the activation process. PMID:28931600
THE DISCOVERY OF SOLAR-LIKE ACTIVITY CYCLES BEYOND THE END OF THE MAIN SEQUENCE?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Route, Matthew, E-mail: mroute@purdue.edu
2016-10-20
The long-term magnetic behavior of objects near the cooler end of the stellar main sequence is poorly understood. Most theoretical work on the generation of magnetism in these ultracool dwarfs (spectral type ≥M7 stars and brown dwarfs) suggests that their magnetic fields should not change in strength and direction. Using polarized radio emission measurements of their magnetic field orientations, I demonstrate that these cool, low-mass, fully convective objects appear to undergo magnetic polarity reversals analogous to those that occur on the Sun. This powerful new technique potentially indicates that the patterns of magnetic activity displayed by the Sun continue to exist, despite the fully convective interiors of these objects, in contravention of several leading theories of the generation of magnetic fields by internal dynamos.
Limitations and possibilities of low cell number ChIP-seq
2012-01-01
Background Chromatin immunoprecipitation coupled with high-throughput DNA sequencing (ChIP-seq) offers high resolution, genome-wide analysis of DNA-protein interactions. However, current standard methods require abundant starting material in the range of 1–20 million cells per immunoprecipitation, and remain a bottleneck to the acquisition of biologically relevant epigenetic data. Using a ChIP-seq protocol optimised for low cell numbers (down to 100,000 cells / IP), we examined the performance of the ChIP-seq technique on a series of decreasing cell numbers. Results We present an enhanced native ChIP-seq method tailored to low cell numbers that represents a 200-fold reduction in input requirements over existing protocols. The protocol was tested over a range of starting cell numbers covering three orders of magnitude, enabling determination of the lower limit of the technique. At low input cell numbers, increased levels of unmapped and duplicate reads reduce the number of unique reads generated, and can drive up sequencing costs and affect sensitivity if ChIP is attempted from too few cells. Conclusions The optimised method presented here considerably reduces the input requirements for performing native ChIP-seq. It extends the applicability of the technique to isolated primary cells and rare cell populations (e.g. biobank samples, stem cells), and in many cases will alleviate the need for cell culture and any associated alteration of epigenetic marks. However, this study highlights a challenge inherent to ChIP-seq from low cell numbers: as cell input numbers fall, levels of unmapped sequence reads and PCR-generated duplicate reads rise. We discuss a number of solutions to overcome the effects of reducing cell number that may aid further improvements to ChIP performance. PMID:23171294
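Two quantities in this abstract are simple ratios and can be checked directly: the fold reduction in input cells, and the share of sequenced reads that remain usable once unmapped and duplicate reads are discarded. The sketch below uses hypothetical function names and hypothetical read counts; only the cell numbers come from the abstract, and the 200-fold figure is reproduced by taking the upper end (20 million) of the stated 1–20 million range.

```python
def fold_reduction(standard_cells: int, low_input_cells: int) -> float:
    """Ratio between standard and low-input starting cell numbers."""
    return standard_cells / low_input_cells


def unique_read_fraction(total_reads: int, unmapped_reads: int,
                         duplicate_reads: int) -> float:
    """Fraction of reads left after removing unmapped and duplicate reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    unique_reads = total_reads - unmapped_reads - duplicate_reads
    return unique_reads / total_reads


# Cell numbers from the abstract: up to 20 million vs. 100,000 per IP.
print(fold_reduction(20_000_000, 100_000))

# Hypothetical read counts, for illustration only.
print(unique_read_fraction(10_000_000, 2_000_000, 3_000_000))
```

As the unmapped and duplicate counts grow at low cell input, the unique fraction falls, so more total reads must be sequenced to reach the same number of unique reads.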
Sutton, Lesley-Ann; Ljungström, Viktor; Mansouri, Larry; Young, Emma; Cortese, Diego; Navrkalova, Veronika; Malcikova, Jitka; Muggen, Alice F; Trbusek, Martin; Panagiotidis, Panagiotis; Davi, Frederic; Belessi, Chrysoula; Langerak, Anton W; Ghia, Paolo; Pospisilova, Sarka; Stamatopoulos, Kostas; Rosenquist, Richard
2015-03-01
Next-generation sequencing has revealed novel recurrent mutations in chronic lymphocytic leukemia, particularly in patients with aggressive disease. Here, we explored targeted re-sequencing as a novel strategy to assess the mutation status of genes with prognostic potential. To this end, we utilized HaloPlex targeted enrichment technology and designed a panel including nine genes: ATM, BIRC3, MYD88, NOTCH1, SF3B1 and TP53, which have been linked to the prognosis of chronic lymphocytic leukemia, and KLHL6, POT1 and XPO1, which are less characterized but were found to be recurrently mutated in various sequencing studies. A total of 188 chronic lymphocytic leukemia patients with poor prognostic features (unmutated IGHV, n=137; IGHV3-21 subset #2, n=51) were sequenced on the HiSeq 2000 and data were analyzed using well-established bioinformatics tools. Using a conservative cutoff of 10% for the mutant allele, we found that 114/180 (63%) patients carried at least one mutation, with mutations in ATM, BIRC3, NOTCH1, SF3B1 and TP53 accounting for 149/177 (84%) of all mutations. We selected 155 mutations for Sanger validation (variant allele frequency, 10-99%) and 93% (144/155) of mutations were confirmed; notably, all 11 discordant variants had a variant allele frequency between 11-27%, hence at the detection limit of conventional Sanger sequencing. Technical precision was assessed by repeating the entire HaloPlex procedure for 63 patients; concordance was found for 77/82 (94%) mutations. In summary, this study demonstrates that targeted next-generation sequencing is an accurate and reproducible technique potentially suitable for routine screening, eventually as a stand-alone test without the need for confirmation by Sanger sequencing. Copyright© Ferrata Storti Foundation.
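The percentages reported in this abstract follow directly from the stated counts, which a few lines of arithmetic confirm. The helper name below is an illustrative choice; the numerators and denominators are taken as printed in the abstract.

```python
def percent(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# Counts as printed in the abstract.
reported = {
    "patients with at least one mutation": (114, 180),
    "mutations in ATM, BIRC3, NOTCH1, SF3B1, TP53": (149, 177),
    "mutations confirmed by Sanger": (144, 155),
    "mutations concordant on repeat": (77, 82),
}

for label, (n, d) in reported.items():
    print(f"{label}: {n}/{d} = {percent(n, d)}%")
```

The 11 discordant variants are the difference between the 155 mutations selected for Sanger validation and the 144 confirmed.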
Cingolani, Pablo; Patel, Viral M.; Coon, Melissa; Nguyen, Tung; Land, Susan J.; Ruden, Douglas M.; Lu, Xiangyi
2012-01-01
This paper describes a new program SnpSift for filtering differential DNA sequence variants between two or more experimental genomes after genotoxic chemical exposure. Here, we illustrate how SnpSift can be used to identify candidate phenotype-relevant variants including single nucleotide polymorphisms, multiple nucleotide polymorphisms, insertions, and deletions (InDels) in mutant strains isolated from genome-wide chemical mutagenesis of Drosophila melanogaster. First, the genomes of two independently isolated mutant fly strains that are allelic for a novel recessive male-sterile locus generated by genotoxic chemical exposure were sequenced using the Illumina next-generation DNA sequencer to obtain 20- to 29-fold coverage of the euchromatic sequences. The sequencing reads were processed and variants were called using standard bioinformatic tools. Next, SnpEff was used to annotate all sequence variants and their potential mutational effects on associated genes. Then, SnpSift was used to filter and select differential variants that potentially disrupt a common gene in the two allelic mutant strains. The potential causative DNA lesions were partially validated by capillary sequencing of polymerase chain reaction-amplified DNA in the genetic interval as defined by meiotic mapping and deletions that remove defined regions of the chromosome. Of the five candidate genes located in the genetic interval, the Pka-like gene CG12069 was found to carry a separate pre-mature stop codon mutation in each of the two allelic mutants whereas the other four candidate genes within the interval have wild-type sequences. The Pka-like gene is therefore a strong candidate gene for the male-sterile locus. These results demonstrate that combining SnpEff and SnpSift can expedite the identification of candidate phenotype-causative mutations in chemically mutagenized Drosophila strains. This technique can also be used to characterize the variety of mutations generated by genotoxic chemicals. 
PMID:22435069
Identifiability, genomics and U.K. data protection law.
Curren, Liam; Boddington, Paula; Gowans, Heather; Hawkins, Naomi; Kanellopoulou, Nadja; Kaye, Jane; Melham, Karen
2010-09-01
Analyses of individuals' genomes--their entire DNA sequence--have increased knowledge about the links between genetics and disease. Anticipated advances in 'next generation' DNA-sequencing techniques will see the routine research use of whole genomes, rather than distinct parts, within the next few years. The scientific benefits of genomic research are, however, accompanied by legal and ethical concerns. Despite the assumption that genetic research data can and will be rendered anonymous, participants' identities can sometimes be elucidated, which could cause data protection legislation to apply. We undertake a timely reappraisal of these laws--particularly new penalties--and identifiability in genomic research.
Cortijo, Sandra; Charoensawan, Varodom; Roudier, François; Wigge, Philip A
2018-01-01
Chromatin immunoprecipitation combined with next-generation sequencing (ChIP-seq) is a powerful technique to investigate in vivo transcription factor (TF) binding to DNA, as well as chromatin marks. Here we provide a detailed protocol for all the key steps to perform ChIP-seq in Arabidopsis thaliana roots, also working on other A. thaliana tissues and in most non-ligneous plants. We detail all steps from material collection, fixation, chromatin preparation, immunoprecipitation, library preparation, and finally computational analysis based on a combination of publicly available tools.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mukamel, Shaul, E-mail: smukamel@uci.edu; Bakker, Huib J.
Multidimensional signals are generated by subjecting molecules to sequences of short optical pulses and recording correlation plots related to the various controlled delay periods. These techniques which span all the way from the THz to the x-ray regimes provide qualitatively new structural and dynamical molecular information not available from conventional one-dimensional techniques. This issue surveys the recent experimental and theoretical progresses in this rapidly developing 20 year old field which illustrates the novel insights provided by multidimensional techniques into electronic and nuclear motions. It should serve as a valuable source for experts in the field and help introduce newcomers to this exciting and challenging branch of nonlinear spectroscopy.
Application of resequencing to rice genomics, functional genomics and evolutionary analysis
2014-01-01
Rice is a model system used for crop genomics studies. The completion of the rice genome draft sequences in 2002 not only accelerated functional genome studies, but also initiated a new era of resequencing rice genomes. Based on the reference genome in rice, next-generation sequencing (NGS) using the high-throughput sequencing system can efficiently accomplish whole genome resequencing of various genetic populations and diverse germplasm resources. Resequencing technology has been effectively utilized in evolutionary analysis, rice genomics and functional genomics studies. This technique is beneficial for both bridging the knowledge gap between genotype and phenotype and facilitating molecular breeding via gene design in rice. Here, we also discuss the limitations, applications and future prospects of rice resequencing. PMID:25006357
Smart, M D; Cornman, R S; Iwanowicz, D D; McDermott-Kubeczko, M; Pettis, J S; Spivak, M S; Otto, C R V
2017-02-01
Taxonomic identification of pollen has historically been accomplished via light microscopy but requires specialized knowledge and reference collections, particularly when identification to lower taxonomic levels is necessary. Recently, next-generation sequencing technology has been used as a cost-effective alternative for identifying bee-collected pollen; however, this novel approach has not been tested on a spatially or temporally robust number of pollen samples. Here, we compare pollen identification results derived from light microscopy and DNA sequencing techniques with samples collected from honey bee colonies embedded within a gradient of intensive agricultural landscapes in the Northern Great Plains throughout the 2010-2011 growing seasons. We demonstrate that at all taxonomic levels, DNA sequencing was able to discern a greater number of taxa, and was particularly useful for the identification of infrequently detected species. Importantly, substantial phenological overlap did occur for commonly detected taxa using either technique, suggesting that DNA sequencing is an appropriate, and enhancing, substitutive technique for accurately capturing the breadth of bee-collected species of pollen present across agricultural landscapes. We also show that honey bees located in high and low intensity agricultural settings forage on dissimilar plants, though with overlap of the most abundantly collected pollen taxa. We highlight practical applications of utilizing sequencing technology, including addressing ecological issues surrounding land use, climate change, importance of taxa relative to abundance, and evaluating the impact of conservation program habitat enhancement efforts. Published by Oxford University Press on behalf of Entomological Society of America 2016. This work is written by US Government employees and is in the public domain in the US.
A Next-Generation Sequencing Primer—How Does It Work and What Can It Do?
Alekseyev, Yuriy O.; Fazeli, Roghayeh; Yang, Shi; Basran, Raveen; Miller, Nancy S.
2018-01-01
Next-generation sequencing refers to a high-throughput technology that determines the nucleic acid sequences and identifies variants in a sample. The technology has been introduced into clinical laboratory testing and produces test results for precision medicine. Since next-generation sequencing is relatively new, graduate students, medical students, pathology residents, and other physicians may benefit from a primer to provide a foundation about basic next-generation sequencing methods and applications, as well as specific examples where it has had diagnostic and prognostic utility. Next-generation sequencing technology grew out of advances in multiple fields to produce a sophisticated laboratory test with tremendous potential. Next-generation sequencing may be used in the clinical setting to look for specific genetic alterations in patients with cancer, diagnose inherited conditions such as cystic fibrosis, and detect and profile microbial organisms. This primer will review DNA sequencing technology, the commercialization of next-generation sequencing, and clinical uses of next-generation sequencing. Specific applications where next-generation sequencing has demonstrated utility in oncology are provided. PMID:29761157
BATCH-GE: Batch analysis of Next-Generation Sequencing data for genome editing assessment
Boel, Annekatrien; Steyaert, Woutert; De Rocker, Nina; Menten, Björn; Callewaert, Bert; De Paepe, Anne; Coucke, Paul; Willaert, Andy
2016-01-01
Targeted mutagenesis by the CRISPR/Cas9 system is currently revolutionizing genetics. The ease of this technique has enabled genome engineering in-vitro and in a range of model organisms and has pushed experimental dimensions to unprecedented proportions. Due to its tremendous progress in terms of speed, read length, throughput and cost, Next-Generation Sequencing (NGS) has been increasingly used for the analysis of CRISPR/Cas9 genome editing experiments. However, the current tools for genome editing assessment lack flexibility and fall short in the analysis of large amounts of NGS data. Therefore, we designed BATCH-GE, an easy-to-use bioinformatics tool for batch analysis of NGS-generated genome editing data, available from https://github.com/WouterSteyaert/BATCH-GE.git. BATCH-GE detects and reports indel mutations and other precise genome editing events and calculates the corresponding mutagenesis efficiencies for a large number of samples in parallel. Furthermore, this new tool provides flexibility by allowing the user to adapt a number of input variables. The performance of BATCH-GE was evaluated in two genome editing experiments, aiming to generate knock-out and knock-in zebrafish mutants. This tool will not only contribute to the evaluation of CRISPR/Cas9-based experiments, but will be of use in any genome editing experiment and has the ability to analyze data from every organism with a sequenced genome. PMID:27461955
A system of nonlinear set valued variational inclusions.
Tang, Yong-Kun; Chang, Shih-Sen; Salahuddin, Salahuddin
2014-01-01
In this paper, we study existence theorems and techniques for finding solutions of a system of nonlinear set valued variational inclusions in Hilbert spaces. To overcome the difficulties due to the presence of a proper convex lower semicontinuous function ϕ and a mapping g in the considered problems, we use the resolvent operator technique to suggest an iterative algorithm for computing approximate solutions of the system of nonlinear set valued variational inclusions. The convergence of the iterative sequences generated by the algorithm is also proved. 49J40; 47H06.
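The resolvent operator technique can be illustrated on a one-dimensional toy problem. The sketch below is not the algorithm of the paper, which treats a system of set valued inclusions with a mapping g; it is a generic forward-backward iteration for the scalar inclusion 0 ∈ (x − b) + λ∂|x|, where the resolvent of ρλ∂|·| is the soft-thresholding map. All names and parameter values are assumptions chosen for illustration.

```python
def soft_threshold(v: float, t: float) -> float:
    """Resolvent (I + t * d|.|)^(-1) applied to v, i.e. soft-thresholding."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def resolvent_iteration(b: float, lam: float, rho: float = 0.5,
                        x0: float = 0.0, iters: int = 60) -> float:
    """Iterate x <- J(x - rho * A(x)) for A(x) = x - b and phi(x) = lam * |x|.

    J is the resolvent of rho * d(phi). For 0 < rho < 2 the iteration
    converges to the unique solution soft_threshold(b, lam).
    """
    x = x0
    for _ in range(iters):
        x = soft_threshold(x - rho * (x - b), rho * lam)
    return x


# Toy instance: b = 3 and lam = 1, whose exact solution is 2.
print(resolvent_iteration(b=3.0, lam=1.0))
```

The fixed point satisfies x = J(x − ρA(x)), which is equivalent to the original inclusion; this equivalence is what makes resolvent-based iterative schemes of this kind possible.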
Visualization of impact damage of composite plates by means of the Moire technique
NASA Technical Reports Server (NTRS)
Knauss, W. G.; Babcock, C. D.; Chai, H.
1980-01-01
The phenomenological aspects of propagation damage due to low velocity impact on heavily loaded graphite-epoxy composite laminates were investigated using high speed photography coupled with the moire fringe technique. High speed moire motion records of the impacted specimens are presented. The results provide information on the time scale and sequence of the failure process. While the generation of the initial damage cannot always be separated temporally from the spreading of the damage, the latter takes place on the average with a speed on the order of 200 m/sec.
Solopchuk, Oleg; Alamia, Andrea; Dricot, Laurence; Duque, Julie; Zénon, Alexandre
2017-12-01
Neuroimaging studies have repeatedly emphasized the role of the supplementary motor area (SMA) in motor sequence learning, but interferential approaches have led to inconsistent findings. Here, we aimed to test the role of the SMA in motor skill learning by combining interferential and neuroimaging techniques. Sixteen subjects were trained on simple finger movement sequences for 4 days. Afterwards, they underwent two neuroimaging sessions, in which they executed both trained and novel sequences. Prior to entering the scanner, the subjects received inhibitory transcranial magnetic stimulation (TMS) over the SMA or a control site. Using multivariate fMRI analysis, we confirmed that motor training enhances the neural representation of motor sequences in the SMA, in accordance with previous findings. However, although SMA inhibition altered sequence representation (i.e. between-sequence decoding accuracy) in this area, behavioural performance remained unimpaired. Our findings question the causal link between the neuroimaging correlate of elementary motor sequence representation in the SMA and sequence generation, calling for a more thorough investigation of the role of this region in performance of learned motor sequences. Copyright © 2017 Elsevier Inc. All rights reserved.
Hafler, Brian P
2017-03-01
Inherited retinal dystrophies are a significant cause of vision loss and are characterized by the loss of photoreceptors and the retinal pigment epithelium (RPE). Mutations in approximately 250 genes cause inherited retinal degenerations with a high degree of genetic heterogeneity. New techniques in next-generation sequencing are allowing the comprehensive analysis of all retinal disease genes, thus changing the approach to the molecular diagnosis of inherited retinal dystrophies. This review serves to analyze clinical progress in genetic diagnostic testing and implications for retinal gene therapy. A literature search of PubMed and OMIM was conducted to identify relevant articles on inherited retinal dystrophies. Next-generation genetic sequencing allows the simultaneous analysis of all the approximately 250 genes that cause inherited retinal dystrophies. Reported diagnostic rates are high and range from 51% to 57%. These new sequencing tools are highly accurate, with sensitivities of 97.9% and specificities of 100%. Retinal gene therapy clinical trials are underway for multiple genes including RPE65, ABCA4, CHM, RS1, MYO7A, CNGA3, CNGB3, ND4, and MERTK, for which a molecular diagnosis may be beneficial for patients. Comprehensive next-generation genetic sequencing of all retinal dystrophy genes is changing the paradigm for how retinal specialists perform genetic testing for inherited retinal degenerations. Not only are high diagnostic yields obtained, but mutations in genes with novel clinical phenotypes are also identified. In the era of retinal gene therapy clinical trials, identifying specific genetic defects will increasingly be of use to identify patients who may enroll in clinical studies and benefit from novel therapies.
Leung, Ross Ka-Kit; Dong, Zhi Qiang; Sa, Fei; Chong, Cheong Meng; Lei, Si Wan; Tsui, Stephen Kwok-Wing; Lee, Simon Ming-Yuen
2014-02-01
Minor variants have significant implications in quasispecies evolution, early cancer detection and non-invasive fetal genotyping but their accurate detection by next-generation sequencing (NGS) is hampered by sequencing errors. We generated sequencing data from mixtures at predetermined ratios in order to provide insight into sequencing errors and variations that can arise for which simulation cannot be performed. The information also enables better parameterization in depth of coverage, read quality and heterogeneity, library preparation techniques, technical repeatability for mathematical modeling, theory development and simulation experimental design. We devised minor variant authentication rules that achieved 100% accuracy in both testing and validation experiments. The rules are free from tedious inspection of alignment accuracy, sequencing read quality or errors introduced by homopolymers. The authentication processes only require minor variants to: (1) have minimum depth of coverage larger than 30; (2) be reported by (a) four or more variant callers, or (b) DiBayes or LoFreq, plus SNVer (or BWA when no results are returned by SNVer), and with the interassay coefficient of variation (CV) no larger than 0.1. Quantification accuracy undermined by sequencing errors could neither be overcome by ultra-deep sequencing, nor recruiting more variant callers to reach a consensus, such that consistent underestimation and overestimation (i.e. low CV) were observed. To accommodate stochastic error and adjust the observed ratio within a specified accuracy, we presented a proof of concept for the use of a double calibration curve for quantification, which provides an important reference towards potential industrial-scale fabrication of calibrants for NGS.
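The authentication rules listed in this abstract can be sketched as a small predicate. This is only one reading of the abstract: the function name, the argument names, and the choice to apply the CV limit to every accepted variant (rather than only to branch (b)) are assumptions made for illustration, not the authors' implementation.

```python
def authenticate_minor_variant(min_depth, callers, interassay_cv,
                               snver_returned_results=True):
    """Return True if a candidate minor variant passes the abstract's rules.

    Hypothetical helper: names and the scope of the CV check are assumptions.
    """
    # Rule (1): minimum depth of coverage must be larger than 30.
    if min_depth <= 30:
        return False
    # Interassay coefficient of variation must be no larger than 0.1
    # (assumed here to apply to both branches of rule 2).
    if interassay_cv > 0.1:
        return False
    callers = set(callers)
    # Rule (2a): reported by four or more variant callers.
    if len(callers) >= 4:
        return True
    # Rule (2b): DiBayes or LoFreq, plus SNVer
    # (or BWA when SNVer returned no results).
    has_primary = bool(callers & {"DiBayes", "LoFreq"})
    confirming = "SNVer" if snver_returned_results else "BWA"
    return has_primary and confirming in callers
```

For example, a variant at depth 45 reported by LoFreq and SNVer with a CV of 0.05 passes, while the same variant at depth 30 does not.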
Identifying and reducing error in cluster-expansion approximations of protein energies.
Hahn, Seungsoo; Ashenberg, Orr; Grigoryan, Gevorg; Keating, Amy E
2010-12-01
Protein design involves searching a vast space for sequences that are compatible with a defined structure. This can pose significant computational challenges. Cluster expansion is a technique that can accelerate the evaluation of protein energies by generating a simple functional relationship between sequence and energy. The method consists of several steps. First, for a given protein structure, a training set of sequences with known energies is generated. Next, this training set is used to expand energy as a function of clusters consisting of single residues, residue pairs, and higher order terms, if required. The accuracy of the sequence-based expansion is monitored and improved using cross-validation testing and iterative inclusion of additional clusters. As a trade-off for evaluation speed, the cluster-expansion approximation causes prediction errors, which can be reduced by including more training sequences, including higher order terms in the expansion, and/or reducing the sequence space described by the cluster expansion. This article analyzes the sources of error and introduces a method whereby accuracy can be improved by judiciously reducing the described sequence space. The method is applied to describe the sequence-stability relationship for several protein structures: coiled-coil dimers and trimers, a PDZ domain, and T4 lysozyme as examples with computationally derived energies, and SH3 domains in amphiphysin-1 and endophilin-1 as examples where the expanded pseudo-energies are obtained from experiments. Our open-source software package Cluster Expansion Version 1.0 allows users to expand their own energy function of interest and thereby apply cluster expansion to custom problems in protein design. © 2010 Wiley Periodicals, Inc.
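To make the "simple functional relationship between sequence and energy" concrete, the sketch below evaluates a cluster expansion truncated at pair terms. The coefficient tables, their key layout, and the function name are hypothetical; fitting the coefficients from a training set (the step the abstract describes) is not shown, and this is not the API of the authors' Cluster Expansion package.

```python
from itertools import combinations


def cluster_expansion_energy(sequence, constant, point_terms, pair_terms):
    """Evaluate E(seq) = constant + sum of single-residue terms + sum of pair terms.

    point_terms maps (position, residue) -> contribution.
    pair_terms maps (pos_i, res_i, pos_j, res_j) with pos_i < pos_j -> contribution.
    Clusters missing from the tables contribute zero.
    """
    energy = constant
    # Single-residue (point) clusters.
    for pos, res in enumerate(sequence):
        energy += point_terms.get((pos, res), 0.0)
    # Residue-pair clusters over all position pairs i < j.
    for (i, a), (j, b) in combinations(enumerate(sequence), 2):
        energy += pair_terms.get((i, a, j, b), 0.0)
    return energy


# Made-up coefficients for a two-position toy problem.
point = {(0, "A"): 0.5, (1, "L"): -0.25}
pair = {(0, "A", 1, "L"): -0.5}
energy_AL = cluster_expansion_energy("AL", -1.0, point, pair)
```

Because evaluation is a handful of table lookups, it is much cheaper than recomputing a structure-based energy, which is the speed-for-accuracy trade-off the abstract mentions.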
Simulative research on generating UWB signals by all-optical BPF
NASA Astrophysics Data System (ADS)
Yang, Chunyong; Hou, Rui; Chen, Shaoping
2007-11-01
Simulation is used to investigate the generation and distribution of ultra-wideband (UWB) signals over optical fiber. Numerical results for the frequency response show that the system exhibits band-pass filter characteristics, and that the shorter the wavelength, the wider the bandwidth at lower frequencies. Transmission performance simulation for a 12.5 Gb/s pseudo-random sequence also shows that a Gaussian pulse signal, after transport over the fiber, is similar to the FCC UWB waveform mask in the time domain and to the FCC spectrum specification in the frequency domain.
Perturbation Techniques in Condition-Controlled Freeze-Thaw Heat Transfer
1993-06-01
is substituted into the governing equations for the problem. By equating the coefficients of each power of ε to zero, one can generate a sequence of...that a digital computer can be used to generate as many terms as desirable. The scheme circumvents the mounting algebraic labor entailed in manual... [equation (219) and the boundary conditions (220) for u2 are not recoverable from the extracted text] Zero-order
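As a generic illustration of the procedure this excerpt describes, and not the report's own freeze-thaw equations, consider a simple model problem. The model equation and the symbol ε for the small parameter are assumptions chosen for clarity. Substituting the series and equating the coefficient of each power of ε to zero gives a sequence of linear problems:

```latex
\begin{aligned}
&u' + u = \varepsilon u^{2}, \quad u(0) = 1, \qquad
  u(t) = u_0(t) + \varepsilon u_1(t) + \varepsilon^{2} u_2(t) + \cdots \\
&O(\varepsilon^{0}):\; u_0' + u_0 = 0,\; u_0(0) = 1
  \;\Rightarrow\; u_0 = e^{-t} \\
&O(\varepsilon^{1}):\; u_1' + u_1 = u_0^{2},\; u_1(0) = 0
  \;\Rightarrow\; u_1 = e^{-t} - e^{-2t} \\
&O(\varepsilon^{2}):\; u_2' + u_2 = 2 u_0 u_1,\; u_2(0) = 0
\end{aligned}
```

Each order depends only on lower-order solutions, which is why the terms can be generated mechanically once the pattern is established.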
Natural and Genetically Engineered Proteins for Tissue Engineering
Gomes, Sílvia; Leonor, Isabel B.; Mano, João F.; Reis, Rui L.
2011-01-01
To overcome the limitations of traditionally used autografts, allografts and, to a lesser extent, synthetic materials, there is a need to develop a new generation of scaffolds with adequate mechanical and structural support, control of cell attachment, migration, proliferation and differentiation and with bio-resorbable features. This suite of properties would allow the body to heal itself at the same rate as implant degradation. Genetic engineering offers a route to this level of control of biomaterial systems. The possibility of expressing biological components found in nature and of modifying or bioengineering them further offers a path towards multifunctional biomaterial systems. This includes opportunities to generate new protein sequences, new self-assembling peptides or fusions of different bioactive domains or protein motifs. New protein sequences with tunable properties can be generated that can be used as new biomaterials. In this review we address some of the most frequently used proteins for tissue engineering and biomedical applications and describe the techniques most commonly used to functionalize protein-based biomaterials by combining them with bioactive molecules to enhance biological performance. We also highlight the use of genetic engineering for heterologous protein expression and the synthesis of new protein-based biopolymers, focusing on the advantages of these functionalized biopolymers when compared with their counterparts extracted directly from nature and modified by techniques such as physical adsorption or chemical modification. PMID:22058578
Hodzic, Jasin; Gurbeta, Lejla; Omanovic-Miklicanin, Enisa; Badnjevic, Almir
2017-01-01
Introduction: Major advancements in DNA sequencing methods introduced in the first decade of the new millennium initiated a rapid expansion of sequencing studies, which yielded a tremendous amount of DNA sequence data, including whole sequenced genomes of various species, among them plants. A set of novel sequencing platforms, often collectively named “next-generation sequencing” (NGS), completely transformed the life sciences by allowing extensive throughput while greatly reducing the necessary time, labor and cost of any sequencing endeavor. Purpose: The purpose of this paper is to present an overview of the NGS platforms used to produce the current compendium of published draft genomes of various plants, namely Roche/454, ABI/SOLiD, and Solexa/Illumina, and to determine the most frequently used platform for whole genome sequencing of plants in light of genotyping of the immortelle plant. Materials and methods: 45 papers were selected (presenting 47 plant genome draft sequences), and the sequencing techniques and NGS platforms (Roche/454, ABI/SOLiD and Illumina/Solexa) used in the selected papers were determined. Subsequently, the frequency of usage of each platform or combination of platforms was calculated. Results: Illumina/Solexa platforms are used either as the sole sequencing tool, in 40.42% of published genomes, or in combination with other platforms, in an additional 48.94% of published genomes, followed by Roche/454 platforms, used in combination with the traditional Sanger sequencing method (10.64%) and never as a sole tool. ABI/SOLiD was only used in combination with Illumina/Solexa and Roche/454, in 4.25% of publications. Conclusions: Illumina/Solexa platforms are by far the most preferred by researchers, most probably due to the most affordable sequencing costs.
Taking into consideration the current economic situation in the Balkans region, Illumina/Solexa is the best (if not the only) platform choice if sequencing of the immortelle plant (Helichrysum arenarium) is to be performed by researchers in this region. PMID:28974852
Imabayashi, Yumi; Moriyama, Masafumi; Takeshita, Toru; Ieda, Shinsuke; Hayashida, Jun-Nosuke; Tanaka, Akihiko; Maehara, Takashi; Furukawa, Sachiko; Ohta, Miho; Kubota, Keigo; Yamauchi, Masaki; Ishiguro, Noriko; Yamashita, Yoshihisa; Nakamura, Seiji
2016-06-16
Oral candidiasis is closely associated with changes in oral fungal biodiversity and is caused primarily by Candida albicans. However, the widespread use of empiric and prophylactic antifungal drugs has caused a shift in fungal biodiversity towards other Candida or yeast species. Recently, next-generation sequencing (NGS) has provided an improvement over conventional culture techniques, allowing rapid comprehensive analysis of oral fungal biodiversity. In this study, we used NGS to examine the oral fungal biodiversity of 27 patients with pseudomembranous oral candidiasis (POC) and 66 healthy controls. The total number of fungal species in patients with POC and healthy controls was 67 and 86, respectively. The copy number of total PCR products and the proportion of non-C. albicans, especially C. dubliniensis, in patients with POC, were higher than those in healthy controls. The detection patterns in patients with POC were similar to those in controls after antifungal treatment. Interestingly, the number of fungal species and the copy number of total PCR products in healthy controls increased with aging. These results suggest that high fungal biodiversity and aging might be involved in the pathogenesis of oral candidiasis. We therefore conclude that NGS is a useful technique for investigating oral candida infections.
A deep learning method for lincRNA detection using auto-encoder algorithm.
Yu, Ning; Yu, Zeng; Pan, Yi
2017-12-06
The RNA sequencing technique (RNA-seq) enables scientists to develop novel data-driven methods for discovering more unidentified lincRNAs. Meanwhile, knowledge-based technologies are experiencing a potential revolution ignited by the new deep learning methods. By scanning the newly found data set from RNA-seq, scientists have found that: (1) the expression of lincRNAs appears to be regulated, that is, relevance exists along the DNA sequences; (2) lincRNAs contain some conserved patterns/motifs tethered together by non-conserved regions. These two pieces of evidence provide the rationale for adopting knowledge-based deep learning methods in lincRNA detection. Similar to coding region transcription, non-coding regions are split at transcriptional sites. However, regulatory RNAs rather than messenger RNAs are generated. That is, the transcribed RNAs participate in biological processes as regulatory units instead of generating proteins. Identifying these transcriptional regions from non-coding regions is the first step towards lincRNA recognition. The auto-encoder method achieves 100% and 92.4% prediction accuracy on transcription sites over the putative data sets. The experimental results also show the excellent performance of the predictive deep neural network on the lincRNA data sets compared with a support vector machine and a traditional neural network. In addition, the method is validated on the newly discovered lincRNA data set, and one unreported transcription site is found by feeding the whole annotated sequences through the deep learning machine, which indicates that the deep learning method has extensive ability for lincRNA prediction. The transcriptional sequences of lincRNAs are collected from the annotated human DNA genome data. Subsequently, a two-layer deep neural network is developed for lincRNA detection, which adopts the auto-encoder algorithm and utilizes different encoding schemes to obtain the best performance over intergenic DNA sequence data.
Driven by these newly annotated lincRNA data, deep learning methods based on the auto-encoder algorithm can exert their capability in knowledge learning in order to capture the useful features and the information correlation along DNA genome sequences for lincRNA detection. To our knowledge, this is the first application of deep learning techniques to identifying lincRNA transcription sequences.
Optical digital chaos cryptography
NASA Astrophysics Data System (ADS)
Arenas-Pingarrón, Álvaro; González-Marcos, Ana P.; Rivas-Moscoso, José M.; Martín-Pereda, José A.
2007-10-01
In this work we present a new way to mask the data in a one-user communication system when direct sequence - code division multiple access (DS-CDMA) techniques are used. The code is generated by a digital chaotic generator, originally proposed by us and previously reported for a chaos cryptographic system. It is demonstrated that if the user's data signal is encoded with a binary phase-shift keying (BPSK) technique, usual in DS-CDMA, it can be easily recovered from a time-frequency domain representation. To avoid this situation, a new system is presented in which a previous dispersive stage is applied to the data signal. A time-frequency domain analysis is performed, and the devices required at the transmitter and receiver end, both user-independent, are presented for the optical domain.
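The baseline scheme that this abstract starts from, BPSK data spread by a chaotically generated DS-CDMA code, can be sketched as follows. The logistic map is used here only as a stand-in chaotic generator; the authors' digital chaotic generator, the dispersive stage, and the optical implementation are not reproduced, and all function names and parameters are assumptions.

```python
def logistic_chips(n, x0=0.37, r=3.99):
    """Generate n bipolar chips (+1/-1) by thresholding a logistic map.

    Stand-in for a chaotic code generator; x0 and r are arbitrary choices.
    """
    chips = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        chips.append(1 if x >= 0.5 else -1)
    return chips


def spread(bits, chips_per_bit, code):
    """Map bits to BPSK symbols (+1/-1) and multiply each by its chips."""
    symbols = [1 if b else -1 for b in bits]
    return [s * code[i * chips_per_bit + k]
            for i, s in enumerate(symbols)
            for k in range(chips_per_bit)]


def despread(signal, chips_per_bit, code):
    """Correlate each bit interval with the same code and take the sign."""
    bits = []
    for i in range(len(signal) // chips_per_bit):
        start = i * chips_per_bit
        corr = sum(signal[start + k] * code[start + k]
                   for k in range(chips_per_bit))
        bits.append(1 if corr > 0 else 0)
    return bits


data = [1, 0, 1, 1, 0]
code = logistic_chips(len(data) * 8)
recovered = despread(spread(data, 8, code), 8, code)
```

A receiver that knows the code recovers the data by correlation; the abstract's point is that with plain BPSK the data can also leak through a time-frequency representation, which motivates the added dispersive stage.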
Credo, Grace M; Su, Xing; Wu, Kai; Elibol, Oguz H; Liu, David J; Reddy, Bobby; Tsai, Ta-Wei; Dorvel, Brian R; Daniels, Jonathan S; Bashir, Rashid; Varma, Madoo
2012-03-21
We introduce a label-free approach for sensing polymerase reactions on deoxyribonucleic acid (DNA) using a chelator-modified silicon-on-insulator field-effect transistor (SOI-FET) that exhibits selective and reversible electrical response to pyrophosphate anions. The chemical modification of the sensor surface was designed to include rolling-circle amplification (RCA) DNA colonies for locally enhanced pyrophosphate (PPi) signal generation and sensors with immobilized chelators for capture and surface-sensitive detection of diffusible reaction by-products. While detecting arrays of enzymatic base incorporation reactions is typically accomplished using optical fluorescence or chemiluminescence techniques, our results suggest that it is possible to develop scalable and portable PPi-specific sensors and platforms for broad biomedical applications such as DNA sequencing and microbe detection using surface-sensitive electrical readout techniques.
NRGC: a novel referential genome compression algorithm.
Saha, Subrata; Rajasekaran, Sanguthevar
2016-11-15
Next-generation sequencing techniques produce millions to billions of short reads. The procedure is not only very cost effective but can also be done in a laboratory environment. The state-of-the-art sequence assemblers then construct the whole genomic sequence from these reads. Current cutting-edge computing technology makes it possible to build genomic sequences from billions of reads at minimal cost and time. As a consequence, we see an explosion of biological sequences in recent times. In turn, the cost of storing the sequences in physical memory or transmitting them over the internet is becoming a major bottleneck for research and future medical applications. Data compression techniques are among the most important remedies in this context. We are in need of suitable data compression algorithms that can exploit the inherent structure of biological sequences. Although standard data compression algorithms are prevalent, they are not suitable for compressing biological sequencing data effectively. In this article, we propose a novel referential genome compression algorithm (NRGC) to effectively and efficiently compress genomic sequences. We have done rigorous experiments to evaluate NRGC on a set of real human genomes. The simulation results show that our algorithm is indeed an effective genome compression algorithm that performs better than the best-known algorithms in most cases. Compression and decompression times are also very impressive. The implementations are freely available for non-commercial purposes. They can be downloaded from: http://www.engr.uconn.edu/~rajasek/NRGC.zip CONTACT: rajasek@engr.uconn.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
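The general idea of referential compression, storing a target genome as pointers into a shared reference plus a few literal bases, can be sketched as below. The abstract does not describe NRGC's actual algorithm, so this greedy k-mer matcher is a generic illustration only; the function names, the operation format, and the value of k are assumptions.

```python
def ref_compress(reference, target, k=4):
    """Encode target as ("copy", start, length) and ("lit", base) operations."""
    # Index the first occurrence of every k-mer in the reference.
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], i)
    ops = []
    pos = 0
    while pos < len(target):
        start = index.get(target[pos:pos + k])
        if start is None:
            # No k-mer match at this position: emit one literal base.
            ops.append(("lit", target[pos]))
            pos += 1
            continue
        # Extend the match as far as reference and target agree.
        length = 0
        while (pos + length < len(target)
               and start + length < len(reference)
               and target[pos + length] == reference[start + length]):
            length += 1
        ops.append(("copy", start, length))
        pos += length
    return ops


def ref_decompress(reference, ops):
    """Rebuild the target from the reference and the operation list."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            _, start, length = op
            parts.append(reference[start:start + length])
        else:
            parts.append(op[1])
    return "".join(parts)


reference = "ACGTACGTTAGC"
target = "ACGTTAGCGACGTA"
ops = ref_compress(reference, target)
restored = ref_decompress(reference, ops)
```

When the target is highly similar to the reference, as two human genomes are, most of the output is a short list of long copy operations, which is where the compression gain comes from.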
Smit, Kyra N; van Poppelen, Natasha M; Vaarwater, Jolanda; Verdijk, Robert; van Marion, Ronald; Kalirai, Helen; Coupland, Sarah E; Thornton, Sophie; Farquhar, Neil; Dubbink, Hendrikus-Jan; Paridaens, Dion; de Klein, Annelies; Kiliç, Emine
2018-05-01
Uveal melanoma is a highly aggressive cancer of the eye, in which nearly 50% of the patients die from metastasis. It is the most common type of primary eye cancer in adults. Chromosome and mutation status have been shown to correlate with the disease-free survival. Loss of chromosome 3 and inactivating mutations in BAP1, which is located on chromosome 3, are strongly associated with 'high-risk' tumors that metastasize early. Other genes often involved in uveal melanoma are SF3B1 and EIF1AX, which are found to be mutated in intermediate- and low-risk tumors, respectively. To obtain genetic information on all genes in one test, we developed a targeted sequencing method that can detect mutations in uveal melanoma genes and chromosomal anomalies in chromosomes 1, 3, and 8. With as little as 10 ng DNA, we obtained enough coverage on all genes to detect mutations, such as substitutions, deletions, and insertions. These results were validated with Sanger sequencing in 28 samples. In >90% of the cases, the BAP1 mutation status corresponded to the BAP1 immunohistochemistry. The results obtained in the Ion Torrent single-nucleotide polymorphism assay were confirmed with several other techniques, such as fluorescence in situ hybridization, multiplex ligation-dependent probe amplification, and Illumina SNP array. By validating our assay in 27 formalin-fixed paraffin-embedded and 43 fresh uveal melanomas, we show that mutations and chromosome status can reliably be obtained using targeted next-generation sequencing. Implementing this technique as a diagnostic pathology application for uveal melanoma will allow prediction of the patients' metastatic risk and potentially assess eligibility for new therapies.
Wan, Shixiang; Zou, Quan
2017-01-01
Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The extreme increase in next-generation sequencing has resulted in a shortage of efficient ultra-large biological sequence alignment approaches able to cope with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files of more than 1 GB) sequence analyses. Based on HAlign and the Spark distributed computing system, we implement a highly cost-efficient and time-efficient tool, HAlign-II, to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. Experiments on large-scale DNA and protein data sets, with files of more than 1 GB, showed that HAlign-II could save time and space. It outperformed the current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resources. HAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.
Lip-reading enhancement for law enforcement
NASA Astrophysics Data System (ADS)
Theobald, Barry J.; Harvey, Richard; Cox, Stephen J.; Lewis, Colin; Owen, Gari P.
2006-09-01
Accurate lip-reading techniques would be of enormous benefit for agencies involved in counter-terrorism and other law-enforcement areas. Unfortunately, there are very few skilled lip-readers, and it is apparently a difficult skill to transmit, so the area is under-resourced. In this paper we investigate the possibility of making the lip-reading task more amenable to a wider range of operators by enhancing lip movements in video sequences using active appearance models. These are generative, parametric models commonly used to track faces in images and video sequences. The parametric nature of the model allows a face in an image to be encoded in terms of a few tens of parameters, while the generative nature allows faces to be re-synthesised using the parameters. The aim of this study is to determine if exaggerating lip-motions in video sequences by amplifying the parameters of the model improves lip-reading ability. We also present results of lip-reading tests undertaken by experienced (but non-expert) adult subjects who claim to use lip-reading in their speech recognition process. The results, which are comparisons of word error-rates on unprocessed and processed video, are mixed. We find that there appears to be the potential to improve the word error rate but, for the method to improve the intelligibility there is need for more sophisticated tracking and visual modelling. Our technique can also act as an expression or visual gesture amplifier and so has applications to animation and the presentation of information via avatars or synthetic humans.
Targeting vector construction through recombineering.
Malureanu, Liviu A
2011-01-01
Gene targeting in mouse embryonic stem cells is an essential, yet still very expensive and highly time-consuming, tool and method to study gene function at the organismal level or to create mouse models of human diseases. Conventional cloning-based methods have been largely used for generating targeting vectors, but are hampered by a number of limiting factors, including the variety and location of restriction enzymes in the gene locus of interest, the specific PCR amplification of repetitive DNA sequences, and cloning of large DNA fragments. Recombineering is a technique that exploits the highly efficient homologous recombination function encoded by λ phage in Escherichia coli. Bacteriophage-based recombination can recombine homologous sequences as short as 30-50 bases, allowing manipulations such as insertion, deletion, or mutation of virtually any genomic region. The large availability of mouse genomic bacterial artificial chromosome (BAC) libraries covering most of the genome facilitates the retrieval of genomic DNA sequences from the bacterial chromosomes through recombineering. This chapter describes a successfully applied protocol and aims to be a detailed guide through the steps of generation of targeting vectors through recombineering.
Rasmussen, Ulla; Svenning, Mette M.
1998-01-01
The presence of repeated DNA (short tandemly repeated repetitive [STRR] and long tandemly repeated repetitive [LTRR]) sequences in the genome of cyanobacteria was used to generate a fingerprint method for symbiotic and free-living isolates. Primers corresponding to the STRR and LTRR sequences were used in the PCR, resulting in a method which generates specific fingerprints for individual isolates. The method was useful both with purified DNA and with intact cyanobacterial filaments or cells as templates for the PCR. Twenty-three Nostoc isolates from a total of 35 were symbiotic isolates from the angiosperm Gunnera species, including isolates from the same Gunnera species as well as from different species. The results show a genetic similarity among isolates from different Gunnera species as well as a genetic heterogeneity among isolates from the same Gunnera species. Isolates which have been postulated to be closely related or identical revealed similar results by the PCR method, indicating that the technique is useful for clustering of even closely related strains. The method was applied to nonheterocystous cyanobacteria from which a fingerprint pattern was obtained. PMID:16349487
Structural Analysis of Biodiversity
Sirovich, Lawrence; Stoeckle, Mark Y.; Zhang, Yu
2010-01-01
Large, recently-available genomic databases cover a wide range of life forms, suggesting opportunity for insights into genetic structure of biodiversity. In this study we refine our recently-described technique using indicator vectors to analyze and visualize nucleotide sequences. The indicator vector approach generates correlation matrices, dubbed Klee diagrams, which represent a novel way of assembling and viewing large genomic datasets. To explore its potential utility, here we apply the improved algorithm to a collection of almost 17000 DNA barcode sequences covering 12 widely-separated animal taxa, demonstrating that indicator vectors for classification gave correct assignment in all 11000 test cases. Indicator vector analysis revealed discontinuities corresponding to species- and higher-level taxonomic divisions, suggesting an efficient approach to classification of organisms from poorly-studied groups. As compared to standard distance metrics, indicator vectors preserve diagnostic character probabilities, enable automated classification of test sequences, and generate high-information density single-page displays. These results support application of indicator vectors for comparative analysis of large nucleotide data sets and raise prospect of gaining insight into broad-scale patterns in the genetic structure of biodiversity. PMID:20195371
Algorithms exploiting ultrasonic sensors for subject classification
NASA Astrophysics Data System (ADS)
Desai, Sachi; Quoraishee, Shafik
2009-09-01
Proposed here is a series of techniques exploiting micro-Doppler ultrasonic sensors capable of characterizing various detected mammalian targets based on their physiological movements, captured as a series of robust features. Employed is a combination of unique and conventional digital signal processing techniques arranged in such a manner that they become capable of classifying a series of walkers. These feature-extraction processes develop a robust feature space capable of discriminating the various movements generated by bipeds and quadrupeds, further subdivided into large or small. These movements can be exploited to provide specific information about a given signature, dividing it into a series of subset signatures and exploiting wavelets to generate start/stop times. After viewing a series of spectrograms of the signature, we are able to see distinct differences and, utilizing kurtosis, we generate an envelope detector capable of isolating each of the corresponding step cycles generated during a walk. The walk cycle is defined as one complete sequence of walking/running, beginning with the foot pushing off the ground and concluding when it returns to the ground. This time information segments the events that are readily seen in the spectrogram but obscured in the temporal domain into individual walk sequences. Each walking sequence is then translated into a three-dimensional waterfall plot defining the expected energy value associated with the motion at a particular instant of time and frequency. The value is repeatable for each particular class and can be employed to discriminate the events. Highly reliable classification is realized by exploiting a classifier trained on a candidate sample space derived from the associated gyrations created by motion from actors of interest.
The classifier developed herein provides a capability to classify events as adult humans, child humans, horses, and dogs at potentially high rates based on the tested sample space. The algorithm developed and described will provide utility to a sensor modality that is underused for human intrusion detection because of the current high rate of generated false alarms. The active ultrasonic sensor, coupled in a multi-modal sensor suite with binary, less descriptive sensors such as seismic devices, realizes a greater accuracy rate for detection of persons of interest for homeland purposes.
Artificial neural network study on organ-targeting peptides
NASA Astrophysics Data System (ADS)
Jung, Eunkyoung; Kim, Junhyoung; Choi, Seung-Hoon; Kim, Minkyoung; Rhee, Hokyoung; Shin, Jae-Min; Choi, Kihang; Kang, Sang-Kee; Lee, Nam Kyung; Choi, Yun-Jaie; Jung, Dong Hyun
2010-01-01
We report a new approach to studying organ targeting of peptides on the basis of peptide sequence information. The positive control data sets consist of organ-targeting peptide sequences identified by the peroral phage-display technique for four organs, and the negative control data are prepared from random sequences. The capacity of our models to make appropriate predictions is validated by statistical indicators including sensitivity, specificity, enrichment curve, and the area under the receiver operating characteristic (ROC) curve (the ROC score). The VHSE descriptor produces statistically significant training models, and the models with simple neural network architectures show slightly greater predictive power than those with complex ones. The training and test set statistics indicate that our models could discriminate between organ-targeting and random sequences. We anticipate that our models will be applicable to the selection of organ-targeting peptides for generating peptide drugs or peptidomimetics.
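The validation indicators named in this abstract can be illustrated with a minimal sketch. The labels and scores below are hypothetical toy values, not data from the study, and the ROC score is computed with the generic rank (Mann-Whitney) formulation rather than any procedure the authors describe.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity), calling scores >= threshold positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_score(labels, scores):
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive example scores higher, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = organ-targeting peptide, 0 = random sequence.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]

print(sensitivity_specificity(labels, scores, 0.5))
print(roc_score(labels, scores))
```

The pairwise loop is quadratic in the number of examples, which is acceptable for a sketch; a sort-based rank computation would be the usual choice for large sets.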
Reduction of display artifacts by random sampling
NASA Technical Reports Server (NTRS)
Ahumada, A. J., Jr.; Nagel, D. C.; Watson, A. B.; Yellott, J. I., Jr.
1983-01-01
The application of random-sampling techniques to remove visible artifacts (such as flicker, moire patterns, and paradoxical motion) introduced in TV-type displays by discrete sequential scanning is discussed and demonstrated. Sequential-scanning artifacts are described; the window of visibility defined in spatiotemporal frequency space by Watson and Ahumada (1982 and 1983) and Watson et al. (1983) is explained; the basic principles of random sampling are reviewed and illustrated by the case of the human retina; and it is proposed that the sampling artifacts can be replaced by random noise, which can then be shifted to frequency-space regions outside the window of visibility. Vertical sequential, single-random-sequence, and continuously renewed random-sequence plotting displays generating 128 points at update rates up to 130 Hz are applied to images of stationary and moving lines, and best results are obtained with the single random sequence for the stationary lines and with the renewed random sequence for the moving lines.
Mapping RNA Structure In Vitro with SHAPE Chemistry and Next-Generation Sequencing (SHAPE-Seq).
Watters, Kyle E; Lucks, Julius B
2016-01-01
Mapping RNA structure with selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) chemistry has proven to be a versatile method for characterizing RNA structure in a variety of contexts. SHAPE reagents covalently modify RNAs in a structure-dependent manner to create adducts at the 2'-OH group of the ribose backbone at nucleotides that are structurally flexible. The positions of these adducts are detected using reverse transcriptase (RT) primer extension, which stops one nucleotide before the modification, to create a pool of cDNAs whose lengths reflect the location of SHAPE modification. Quantification of the cDNA pools is used to estimate the "reactivity" of each nucleotide in an RNA molecule to the SHAPE reagent. High reactivities indicate nucleotides that are structurally flexible, while low reactivities indicate nucleotides that are inflexible. These SHAPE reactivities can then be used to infer RNA structures by restraining RNA structure prediction algorithms. Here, we provide a state-of-the-art protocol describing how to perform in vitro RNA structure probing with SHAPE chemistry using next-generation sequencing to quantify cDNA pools and estimate reactivities (SHAPE-Seq). The use of next-generation sequencing allows for higher throughput, more consistent data analysis, and multiplexing capabilities. The technique described herein, SHAPE-Seq v2.0, uses a universal reverse transcription priming site that is ligated to the RNA after SHAPE modification. The introduced priming site allows for the structural analysis of an RNA independent of its sequence.
2014-12-01
The recent development of methods applying next-generation sequencing to microbial community characterization has led to the proliferation of these studies in a wide variety of sample types. Yet, variation in the physical properties of environmental samples demands that optimal DNA extraction techniques be explored for each new environment. The microbiota associated with many species of insects offer an extraction challenge as they are frequently surrounded by an armored exoskeleton, inhibiting disruption of the tissues within. In this study, we examine the efficacy of several commonly used protocols for extracting bacterial DNA from ants. While bacterial community composition recovered using Illumina 16S rRNA amplicon sequencing was not detectably biased by any method, the quantity of bacterial DNA varied drastically, reducing the number of samples that could be amplified and sequenced. These results indicate that the concentration necessary for dependable sequencing is around 10,000 copies of target DNA per microliter. Exoskeletal pulverization and tissue digestion increased the reliability of extractions, suggesting that these steps should be included in any study of insect-associated microorganisms that relies on obtaining microbial DNA from intact body segments. Although laboratory and analysis techniques should be standardized across diverse sample types as much as possible, minimal modifications such as these will increase the number of environments in which bacterial communities can be successfully studied.
Kerkhof, Jennifer; Schenkel, Laila C; Reilly, Jack; McRobbie, Sheri; Aref-Eshghi, Erfan; Stuart, Alan; Rupar, C Anthony; Adams, Paul; Hegele, Robert A; Lin, Hanxin; Rodenhiser, David; Knoll, Joan; Ainsworth, Peter J; Sadikovic, Bekim
2017-11-01
Next-generation sequencing (NGS) technology has rapidly replaced Sanger sequencing in the assessment of sequence variations in clinical genetics laboratories. One major limitation of current NGS approaches is their limited ability to detect copy number variations (CNVs) larger than approximately 50 bp. Because these represent a major mutational burden in many genetic disorders, parallel CNV assessment using alternate supplemental methods, along with the NGS analysis, is normally required, resulting in increased labor, costs, and turnaround times. The objective of this study was to clinically validate a novel CNV detection algorithm using targeted clinical NGS gene panel data. We have applied this approach in a retrospective cohort of 391 samples and a prospective cohort of 2375 samples and found a 100% sensitivity (95% CI, 89%-100%) for 37 unique events and a high degree of specificity to detect CNVs across nine distinct targeted NGS gene panels. This NGS CNV pipeline enables stand-alone first-tier assessment for CNV and sequence variants in a clinical laboratory setting, dispensing with the need for parallel CNV analysis using classic techniques, such as microarray, long-range PCR, or multiplex ligation-dependent probe amplification. This NGS CNV pipeline can also be applied to the assessment of complex genomic regions, including pseudogenic DNA sequences, such as the PMS2CL gene, and to mitochondrial genome heteroplasmy detection. Copyright © 2017 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Lee, Yujung; Kim, Changshin; Park, YoungJoon; Pyun, Jung-A; Kwack, KyuBum
2016-12-01
Premature ovarian failure (POF) is characterized by heterogeneous genetic causes such as chromosomal abnormalities and variants in causal genes. Recent technical developments have made it possible for next-generation sequencing (NGS) to detect genome-wide variants, including chromosomal abnormalities. Among 37 Korean POF patients, an XY karyotype with deletions of the distal part of the Y chromosome (Yp11.32-31 and the Yp12 end part) was observed in two patients through NGS. Six deleterious variants in POF genes were also detected, which might explain the pathogenesis of POF with abnormalities in the sex chromosomes. Additionally, the two POF patients had no mutation in SRY, but three non-synonymous variants were detected in genes related to sex reversal. These findings suggest candidate causes of POF and sex reversal and show the suitability of NGS for approaching the heterogeneous pathogenesis of POF. Copyright © 2016 Elsevier Inc. All rights reserved.
In vivo generation of DNA sequence diversity for cellular barcoding
Peikon, Ian D.; Gizatullina, Diana I.; Zador, Anthony M.
2014-01-01
Heterogeneity is a ubiquitous feature of biological systems. A complete understanding of such systems requires a method for uniquely identifying and tracking individual components and their interactions with each other. We have developed a novel method of uniquely tagging individual cells in vivo with a genetic ‘barcode’ that can be recovered by DNA sequencing. Our method is a two-component system comprised of a genetic barcode cassette whose fragments are shuffled by Rci, a site-specific DNA invertase. The system is highly scalable, with the potential to generate theoretical diversities in the billions. We demonstrate the feasibility of this technique in Escherichia coli. Currently, this method could be employed to track the dynamics of populations of microbes through various bottlenecks. Advances of this method should prove useful in tracking interactions of cells within a network, and/or heterogeneity within complex biological samples. PMID:25013177
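The scaling claim above can be made concrete with a back-of-the-envelope count. This is a sketch under a stated assumption, not the authors' own formula: it supposes that each of n cassette fragments can end up in any position and in either orientation, giving an upper bound of n! × 2^n arrangements. Real recombination may not reach every arrangement, so treat these numbers as a ceiling.

```python
from math import factorial


def cassette_diversity(n_fragments):
    """Upper bound on distinct barcodes, ASSUMING every ordering (n!) and
    every combination of fragment orientations (2**n) is reachable."""
    return factorial(n_fragments) * 2 ** n_fragments


for n in (4, 8, 11):
    print(n, cassette_diversity(n))
# 4 384
# 8 10321920
# 11 81749606400
```

Under this counting model, a cassette with around ten fragments already exceeds a billion theoretical barcodes, which is consistent in order of magnitude with the "diversities in the billions" quoted in the abstract.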
NASA Astrophysics Data System (ADS)
Shao, Xupeng
2017-04-01
Glutenite bodies are widely developed in the northern Minfeng zone of the Dongying Sag, and their litho-electric relationship is not clear. In addition, because the conventional sequence stratigraphic research method involves too many subjective human factors, it has limited further progress in regional sequence stratigraphic research. Compared with conventional methods, the wavelet transform technique based on logging data and the time-frequency analysis technique based on seismic data have the advantage of dividing sequence stratigraphy quantitatively. Building on the conventional sequence research method, this paper used these techniques to divide the fourth-order sequences of the upper Es4 in the northern Minfeng zone of the Dongying Sag. The research shows that the two techniques are essentially consistent, in that both divide sequence stratigraphy quantitatively in the frequency domain. The wavelet transform technique has high resolution and is suitable for areas with wells. The seismic time-frequency analysis technique has wide applicability but low resolution. The two techniques should be combined. The upper Es4 in the northern Minfeng zone of the Dongying Sag is a complete third-order sequence, which can be further subdivided into 5 fourth-order sequences that have the depositional characteristics of fining-upward sequences. Key words: Dongying Sag, northern Minfeng zone, wavelet transform technique, time-frequency analysis technique, the upper Es4, sequence stratigraphy
Genome-wide profiling of DNA-binding proteins using barcode-based multiplex Solexa sequencing.
Raghav, Sunil Kumar; Deplancke, Bart
2012-01-01
Chromatin immunoprecipitation (ChIP) is a commonly used technique to detect the in vivo binding of proteins to DNA. ChIP is now routinely paired to microarray analysis (ChIP-chip) or next-generation sequencing (ChIP-Seq) to profile the DNA occupancy of proteins of interest on a genome-wide level. Because ChIP-chip introduces several biases, most notably due to the use of a fixed number of probes, ChIP-Seq has quickly become the method of choice as, depending on the sequencing depth, it is more sensitive, quantitative, and provides a greater binding site location resolution. With the ever increasing number of reads that can be generated per sequencing run, it has now become possible to analyze several samples simultaneously while maintaining sufficient sequence coverage, thus significantly reducing the cost per ChIP-Seq experiment. In this chapter, we provide a step-by-step guide on how to perform multiplexed ChIP-Seq analyses. As a proof-of-concept, we focus on the genome-wide profiling of RNA Polymerase II as measuring its DNA occupancy at different stages of any biological process can provide insights into the gene regulatory mechanisms involved. However, the protocol can also be used to perform multiplexed ChIP-Seq analyses of other DNA-binding proteins such as chromatin modifiers and transcription factors.
Reference-free compression of high throughput sequencing data with a probabilistic de Bruijn graph.
Benoit, Gaëtan; Lemaitre, Claire; Lavenier, Dominique; Drezen, Erwan; Dayris, Thibault; Uricaru, Raluca; Rizk, Guillaume
2015-09-14
Data volumes generated by next-generation sequencing (NGS) technologies are now a major concern for both data storage and transmission. This has triggered the need for methods more efficient than general-purpose compression tools, such as the widely used gzip. We present a novel reference-free method for compressing data from high-throughput sequencing technologies. Our approach, implemented in the software LEON, employs techniques derived from existing assembly principles. The method is based on a reference probabilistic de Bruijn graph, built de novo from the set of reads and stored in a Bloom filter. Each read is encoded as a path in this graph by memorizing an anchoring k-mer and a list of bifurcations. The same probabilistic de Bruijn graph is used to perform a lossy transformation of the quality scores, which yields higher compression rates without losing information pertinent to downstream analyses. LEON was run on various real sequencing datasets (whole genome, exome, RNA-seq or metagenomics). In all cases, LEON showed higher overall compression ratios than state-of-the-art compression software. On a C. elegans whole genome sequencing dataset, LEON divided the original file size by more than 20. LEON is open source software, distributed under the GNU Affero GPL license, available for download at http://gatb.inria.fr/software/leon/.
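The idea of encoding a read as an anchoring k-mer plus a list of bifurcations can be sketched in a few lines. This is a simplified illustration, not LEON's implementation: an exact Python set stands in for the Bloom filter (so there are no false positives), the anchor is simply the read's first k-mer, and sequencing errors are ignored. All function names are hypothetical.

```python
def build_kmer_set(reads, k):
    """Collect every k-mer of every read (exact stand-in for a Bloom filter)."""
    return {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}


def successors(kmer, kmers):
    """k-mers in the graph that overlap `kmer` by k-1 bases on the right."""
    suffix = kmer[1:]
    return [suffix + base for base in "ACGT" if suffix + base in kmers]


def encode_read(read, k, kmers):
    """Store the anchor k-mer, the read length, and the base chosen at each
    node that has more than one successor (a bifurcation)."""
    bifurcations = []
    for i in range(1, len(read) - k + 1):
        previous = read[i - 1:i - 1 + k]
        if len(successors(previous, kmers)) > 1:
            bifurcations.append(read[i + k - 1])
    return read[:k], len(read), bifurcations


def decode_read(anchor, length, bifurcations, kmers):
    """Walk the graph from the anchor, consuming a stored base only at forks."""
    read, kmer, choices = anchor, anchor, iter(bifurcations)
    while len(read) < length:
        options = successors(kmer, kmers)
        kmer = options[0] if len(options) == 1 else kmer[1:] + next(choices)
        read += kmer[-1]
    return read


# Hypothetical toy reads; real data would be millions of reads with larger k.
reads = ["ACGTACGGA", "CGTACGTTC"]
K = 4
kmers = build_kmer_set(reads, K)
for r in reads:
    encoded = encode_read(r, K, kmers)
    assert decode_read(*encoded, kmers) == r
    print(r, "->", encoded)
```

The saving comes from unambiguous stretches of the graph: wherever a node has a single successor, the next base costs nothing to store, and only the choices made at forks are recorded.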
RNA-Seq analysis to capture the transcriptome landscape of a single cell
Tang, Fuchou; Barbacioru, Catalin; Nordman, Ellen; Xu, Nanlan; Bashkirov, Vladimir I; Lao, Kaiqin; Surani, M. Azim
2013-01-01
We describe here a protocol for digital transcriptome analysis in a single mouse blastomere using a deep sequencing approach. An individual blastomere was first isolated and put into lysate buffer by mouth pipette. Reverse transcription was then performed directly on the whole cell lysate. After this, the free primers were removed by Exonuclease I and a poly(A) tail was added to the 3′ end of the first-strand cDNA by Terminal Deoxynucleotidyl Transferase. Then the single cell cDNAs were amplified by 20 plus 9 cycles of PCR. Then 100-200 ng of these amplified cDNAs were used to construct a sequencing library. The sequencing library can be used for deep sequencing using the SOLiD system. Compared with the cDNA microarray technique, our assay can capture up to 75% more genes expressed in early embryos. The protocol can generate deep sequencing libraries within 6 days for 16 single cell samples. PMID:20203668
Virus characterization and discovery in formalin-fixed paraffin-embedded tissues.
Bodewes, Rogier; van Run, Peter R W A; Schürch, Anita C; Koopmans, Marion P G; Osterhaus, Albert D M E; Baumgärtner, Wolfgang; Kuiken, Thijs; Smits, Saskia L
2015-03-01
Detection and characterization of novel viruses is frequently hampered by the lack of properly stored materials. Especially for the retrospective identification of viruses responsible for past disease outbreaks, often only formalin-fixed paraffin-embedded (FFPE) tissue samples are available. Although FFPE tissues can be used to detect known viral sequences, the application of FFPE tissues for detection of novel viruses is currently unclear. In the present study it was shown that sequence-independent amplification in combination with next-generation sequencing can be used to detect sequences of known and unknown viruses, although with relatively low sensitivity. These findings indicate that this technique could be useful for detecting novel viral sequences in FFPE tissues collected from humans and animals with disease of unknown origin, when other samples are not available. In addition, application of this method to FFPE tissues allows the detected viral sequences to be correlated with the presence of histopathological changes in the corresponding tissue sections. Copyright © 2015 Elsevier B.V. All rights reserved.
DNA/RNA transverse current sequencing: intrinsic structural noise from neighboring bases
Alvarez, Jose R.; Skachkov, Dmitry; Massey, Steven E.; Kalitsov, Alan; Velev, Julian P.
2015-01-01
Nanopore DNA sequencing via transverse current has emerged as a promising candidate for third-generation sequencing technology. It produces long read lengths which could alleviate problems with assembly errors inherent in current technologies. However, the high error rates of nanopore sequencing have to be addressed. A very important source of the error is the intrinsic noise in the current arising from carrier dispersion along the chain of the molecule, i.e., from the influence of neighboring bases. In this work we perform calculations of the transverse current within an effective multi-orbital tight-binding model derived from first-principles calculations of the DNA/RNA molecules, to study the effect of this structural noise on the error rates in DNA/RNA sequencing via transverse current in nanopores. We demonstrate that a statistical technique, utilizing not only the currents through the nucleotides but also the correlations in the currents, can in principle reduce the error rate below any desired precision. PMID:26150827
DS-SS with de Bruijn sequences for secure Inter Satellite Links
NASA Astrophysics Data System (ADS)
Spinsante, S.; Warty, C.; Gambi, E.
Today, both the military and commercial sectors are placing an increased emphasis on global communications. This has prompted the development of several Low Earth Orbit satellite systems that promise a worldwide connectivity and real-time voice, data and video communications. Constellations that avoid repeated uplink and downlink work by exploiting Inter Satellite Links have proved to be very economical in space routing. However, traditionally Inter Satellite Links were considered to be out of reach for any malicious activity and thus little, or no security was employed. This paper proposes a secured Inter Satellite Links based network, built upon the adoption of the Direct Sequence Spread Spectrum technique, with binary de Bruijn sequences used as spreading codes. Selected sequences from the de Bruijn family may be used over directional spot beams. The main intent of the paper is to propose a secure and robust communication link for the next generation of satellite communications, relying on a classical spread spectrum approach employing innovative sequences.
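To make the spreading-code idea concrete, the sketch below generates one binary de Bruijn sequence and uses it for direct-sequence spreading and correlation-based despreading. It is an illustration under stated assumptions: the generator produces only the lexicographically smallest sequence of a given order (via the standard Lyndon-word construction), whereas the paper refers to selected sequences from the whole family, and the channel model (no noise, perfect synchronization) is a deliberate simplification.

```python
def de_bruijn_binary(n):
    """Binary de Bruijn sequence of order n (length 2**n), built by
    concatenating Lyndon words whose length divides n."""
    a = [0] * (2 * n)
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, 2):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence


def spread(bits, code):
    """Multiply each data bit (mapped to +/-1) by the full chip sequence."""
    chips = [1 if c else -1 for c in code]
    return [(1 if b else -1) * c for b in bits for c in chips]


def despread(signal, code):
    """Correlate each symbol-length block with the code and take the sign."""
    chips = [1 if c else -1 for c in code]
    n = len(chips)
    bits = []
    for i in range(0, len(signal), n):
        correlation = sum(s * c for s, c in zip(signal[i:i + n], chips))
        bits.append(1 if correlation > 0 else 0)
    return bits


code = de_bruijn_binary(4)          # 16 chips per data bit
data = [1, 0, 1, 1]                 # hypothetical payload
received = spread(data, code)
received[0] *= -1                   # flip a few chips to mimic interference
received[5] *= -1
received[9] *= -1
print(despread(received, code))     # recovers [1, 0, 1, 1]
```

A receiver that does not know the code cannot perform this correlation, which is the basic property a spread-spectrum security scheme relies on; the large size of the de Bruijn family is what makes the code harder to guess.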
Relaxation time estimation in surface NMR
Grunewald, Elliot D.; Walsh, David O.
2017-03-21
NMR relaxation time estimation methods and corresponding apparatus generate two or more alternating current transmit pulses with arbitrary amplitudes, time delays, and relative phases; apply a surface NMR acquisition scheme in which initial preparatory pulses, the properties of which may be fixed across a set of multiple acquisition sequences, are transmitted at the start of each acquisition sequence and are followed by one or more depth-sensitive pulses, the pulse moments of which are varied across the set of multiple acquisition sequences; and apply processing techniques in which recorded NMR response data are used to estimate NMR properties and the relaxation times T1 and T2* as a function of position, as well as one-dimensional and two-dimensional distributions of T1 versus T2* as a function of subsurface position.
Structural optimization with approximate sensitivities
NASA Technical Reports Server (NTRS)
Patnaik, S. N.; Hopkins, D. A.; Coroneos, R.
1994-01-01
Computational efficiency in structural optimization can be enhanced if the intensive computations associated with the calculation of the sensitivities, that is, gradients of the behavior constraints, are reduced. An approximation to the gradients of the behavior constraints that can be generated with a small amount of numerical calculation is proposed. Structural optimization with these approximate sensitivities produced the correct optimum solution. Approximate gradients performed well for different nonlinear programming methods, such as the sequence of unconstrained minimization technique, method of feasible directions, sequence of quadratic programming, and sequence of linear programming. Structural optimization with approximate gradients can reduce by one third the CPU time that would otherwise be required to solve the problem with explicit closed-form gradients. The proposed gradient approximation shows potential to reduce the intensive computation that has been associated with traditional structural optimization.
Mori, Kazuki; Shirasawa, Kenta; Nogata, Hitoshi; Hirata, Chiharu; Tashiro, Kosuke; Habu, Tsuyoshi; Kim, Sangwan; Himeno, Shuichi; Kuhara, Satoru; Ikegami, Hidetoshi
2017-01-25
With the aim of identifying sex determinants of fig, we generated the first draft genome sequence of fig and conducted subsequent analyses. Linkage analysis with a high-density genetic map established by a restriction-site associated sequencing technique, and a genome-wide association study followed by whole-genome resequencing analysis, identified two missense mutations in the RESPONSIVE-TO-ANTAGONIST1 (RAN1) orthologue, which encodes a copper-transporting ATPase, that were completely associated with the sex phenotypes of the investigated figs. This result suggests that RAN1 is a possible sex determinant candidate in the fig genome. The genomic resources and genetic findings obtained in this study can contribute to a general understanding of Ficus species and provide insight into the sex determination systems of fig and other plants.
NASA Astrophysics Data System (ADS)
Poston, Chloe N.; Higgs, Richard E.; You, Jinsam; Gelfanova, Valentina; Hale, John E.; Knierman, Michael D.; Siegel, Robert; Gutierrez, Jesus A.
2014-07-01
De novo sequencing by mass spectrometry (MS) allows for the determination of the complete amino acid (AA) sequence of a given protein based on the mass difference of detected ions from MS/MS fragmentation spectra. The technique relies on obtaining specific masses that can be attributed to characteristic theoretical masses of AAs. A major limitation of de novo sequencing by MS is the inability to distinguish between the isobaric residues leucine (Leu) and isoleucine (Ile). Incorrect identification of Ile as Leu or vice versa often results in loss of activity in recombinant antibodies. This functional ambiguity is commonly resolved with costly and time-consuming AA mutation and peptide sequencing experiments. Here, we describe a set of orthogonal biochemical protocols, which experimentally determine the identity of Ile or Leu residues in monoclonal antibodies (mAb) based on the selectivity that leucine aminopeptidase shows for n-terminal Leu residues and the cleavage preference for Leu by chymotrypsin. The resulting observations are combined with germline frequencies and incorporated into a logistic regression model, called Predictor for Xle Sites (PXleS) to provide a statistical likelihood for the identity of Leu at an ambiguous site. We demonstrate that PXleS can generate a probability for an Xle site in mAbs with 96% accuracy. The implementation of PXleS precludes the expression of several possible sequences and, therefore, reduces the overall time and resources required to go from spectra generation to a biologically active sequence for a mAb when an Ile or Leu residue is in question.
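The Leu/Ile ambiguity follows directly from how de novo sequencing reads residues off mass differences. The sketch below looks up which residues match a given mass difference between adjacent fragment ions, using standard monoisotopic residue masses. The function name and the tolerance value are hypothetical choices for illustration, not parameters from the study.

```python
# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residues_for_mass_difference(delta, tolerance=0.005):
    """Return residues whose mass matches `delta` within `tolerance` Da.
    The default tolerance is an assumed instrument accuracy."""
    return sorted(aa for aa, mass in RESIDUE_MASS.items()
                  if abs(mass - delta) <= tolerance)


print(residues_for_mass_difference(57.021))     # ['G']
print(residues_for_mass_difference(128.0586))   # ['Q']
print(residues_for_mass_difference(113.084))    # ['I', 'L']
```

Gln and Lys differ by about 0.036 Da and can be separated with sufficient mass accuracy, but Leu and Ile have identical elemental composition, so no tolerance setting resolves them. That is why the abstract turns to enzymatic selectivity and germline frequencies instead.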
Human HOXA5 homeodomain enhances protein transduction and its application to vascular inflammation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Ji Young; Park, Kyoung sook; Cho, Eun Jung
2011-07-01
Highlights: We have developed an E. coli protein expression vector including human-specific gene sequences for protein cellular delivery. The plasmid was generated by ligating nucleotides 770-817 of the homeobox A5 mRNA sequence. HOXA5-APE1/Ref-1 inhibited TNF-α-induced monocyte adhesion to endothelial cells. The human HOXA5-PTD vector provides a powerful research tool for uncovering cellular functions of proteins or for the generation of human PTD-containing proteins. Abstract: Cellular protein delivery is an emerging technique by which exogenous recombinant proteins are delivered into mammalian cells across the membrane. We have developed an Escherichia coli expression vector including human-specific gene sequences for protein cellular delivery. The plasmid was generated by ligating nucleotides 770-817 of the homeobox A5 mRNA sequence, which corresponds to the protein transduction domain (PTD) of homeodomain protein A5 (HOXA5), into a pET expression vector. The cellular uptake of HOXA5-PTD-EGFP was detected within 1 min, and its transduction reached a maximum at 1 h within cell lysates. The cellular uptake of HOXA5-EGFP at 37 °C was greater than at 4 °C. To study the functional role of human HOXA5-PTD, we purified HOXA5-APE1/Ref-1 and applied it to monocyte adhesion. Pretreatment with HOXA5-APE1/Ref-1 (100 nM) inhibited TNF-α-induced monocyte adhesion to endothelial cells, compared with HOXA5-EGFP. Taken together, our data suggest that the human HOXA5-PTD vector provides a powerful research tool for uncovering cellular functions of proteins or for the generation of human PTD-containing proteins.
Coarsening strategies for unstructured multigrid techniques with application to anisotropic problems
NASA Technical Reports Server (NTRS)
Morano, E.; Mavriplis, D. J.; Venkatakrishnan, V.
1995-01-01
Over the years, multigrid has been demonstrated as an efficient technique for solving inviscid flow problems. However, for viscous flows, convergence rates often degrade. This is generally due to the required use of stretched meshes (i.e., the aspect-ratio AR = delta y/delta x is much less than 1) in order to capture the boundary layer near the body. Usual techniques for generating a sequence of grids that produce proper convergence rates on isotropic meshes are not adequate for stretched meshes. This work focuses on the solution of Laplace's equation, discretized through a Galerkin finite-element formulation on unstructured stretched triangular meshes. A coarsening strategy is proposed and results are discussed.
Coarsening Strategies for Unstructured Multigrid Techniques with Application to Anisotropic Problems
NASA Technical Reports Server (NTRS)
Morano, E.; Mavriplis, D. J.; Venkatakrishnan, V.
1996-01-01
Over the years, multigrid has been demonstrated as an efficient technique for solving inviscid flow problems. However, for viscous flows, convergence rates often degrade. This is generally due to the required use of stretched meshes (i.e. the aspect-ratio AR = (delta)y/(delta)x much less than 1) in order to capture the boundary layer near the body. Usual techniques for generating a sequence of grids that produce proper convergence rates on isotropic meshes are not adequate for stretched meshes. This work focuses on the solution of Laplace's equation, discretized through a Galerkin finite-element formulation on unstructured stretched triangular meshes. A coarsening strategy is proposed and results are discussed.
Automated Sequence Generation Process and Software
NASA Technical Reports Server (NTRS)
Gladden, Roy
2007-01-01
"Automated sequence generation" (autogen) signifies both a process and software used to automatically generate sequences of commands to operate various spacecraft. The autogen software comprises the autogen script plus the Activity Plan Generator (APGEN) program. APGEN can be used for planning missions and command sequences.
An extension of command shaping methods for controlling residual vibration using frequency sampling
NASA Technical Reports Server (NTRS)
Singer, Neil C.; Seering, Warren P.
1992-01-01
The authors present an extension to the impulse shaping technique for commanding machines to move with reduced residual vibration. The extension, called frequency sampling, is a method for generating constraints that are used to obtain shaping sequences which minimize residual vibration in systems such as robots whose resonant frequencies change during motion. The authors present a review of impulse shaping methods, a development of the proposed extension, and a comparison of results of tests conducted on a simple model of the space shuttle robot arm. Frequency shaping provides a method for minimizing the impulse sequence duration required to give the desired insensitivity.
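As background for the extension described above, the following sketch builds the standard two-impulse zero-vibration (ZV) shaper from the input-shaping literature and evaluates the residual vibration it leaves at a few sampled frequencies. It is not the authors' frequency-sampling method; it only shows the baseline quantity (residual vibration as a function of the actual resonant frequency) that constraints placed at sampled frequencies would act on. The function names and the 1 Hz design frequency are hypothetical.

```python
import math


def zv_shaper(natural_freq_hz, damping_ratio):
    """Two-impulse ZV shaper as a list of (time_s, amplitude) pairs."""
    root = math.sqrt(1.0 - damping_ratio ** 2)
    omega_d = 2.0 * math.pi * natural_freq_hz * root   # damped frequency
    k = math.exp(-damping_ratio * math.pi / root)
    return [(0.0, 1.0 / (1.0 + k)), (math.pi / omega_d, k / (1.0 + k))]


def residual_vibration(shaper, freq_hz, damping_ratio):
    """Residual vibration amplitude, relative to an unshaped unit impulse,
    for a second-order mode at freq_hz driven by the impulse sequence."""
    omega = 2.0 * math.pi * freq_hz
    omega_d = omega * math.sqrt(1.0 - damping_ratio ** 2)
    t_last = shaper[-1][0]
    c = sum(a * math.exp(damping_ratio * omega * t) * math.cos(omega_d * t)
            for t, a in shaper)
    s = sum(a * math.exp(damping_ratio * omega * t) * math.sin(omega_d * t)
            for t, a in shaper)
    return math.exp(-damping_ratio * omega * t_last) * math.hypot(c, s)


shaper = zv_shaper(1.0, 0.0)             # designed for a 1 Hz undamped mode
for f in (0.8, 0.9, 1.0, 1.1, 1.2):      # sampled actual frequencies
    print(f, round(residual_vibration(shaper, f, 0.0), 3))
```

The residual is essentially zero at the design frequency and grows as the actual frequency drifts away from it, which is the sensitivity problem that motivates adding constraints at several sampled frequencies for systems whose resonances change during motion.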
Wheat EST resources for functional genomics of abiotic stress
Houde, Mario; Belcaid, Mahdi; Ouellet, François; Danyluk, Jean; Monroy, Antonio F; Dryanova, Ani; Gulick, Patrick; Bergeron, Anne; Laroche, André; Links, Matthew G; MacCarthy, Luke; Crosby, William L; Sarhan, Fathey
2006-01-01
Background: Wheat is an excellent species to study freezing tolerance and other abiotic stresses. However, the sequence of the wheat genome has not been completely characterized due to its complexity and large size. To circumvent this obstacle and identify genes involved in cold acclimation and associated stresses, a large scale EST sequencing approach was undertaken by the Functional Genomics of Abiotic Stress (FGAS) project. Results: We generated 73,521 quality-filtered ESTs from eleven cDNA libraries constructed from wheat plants exposed to various abiotic stresses and at different developmental stages. In addition, 196,041 ESTs for which tracefiles were available from the National Science Foundation wheat EST sequencing program and DuPont were also quality-filtered and used in the analysis. Clustering of the combined ESTs with d2_cluster and TGICL yielded a few large clusters containing several thousand ESTs that were refractory to routine clustering techniques. To resolve this problem, the sequence proximity and "bridges" were identified by an e-value distance graph to manually break clusters into smaller groups. Assembly of the resolved ESTs generated a 75,488 unique sequence set (31,580 contigs and 43,908 singletons/singlets). Digital expression analyses indicated that the FGAS dataset is enriched in stress-regulated genes compared to the other public datasets. Over 43% of the unique sequence set was annotated and classified into functional categories according to Gene Ontology. Conclusion: We have annotated 29,556 different sequences, an almost 5-fold increase in annotated sequences compared to the available wheat public databases. Digital expression analysis combined with gene annotation helped in the identification of several pathways associated with abiotic stress. The genomic resources and knowledge developed by this project will contribute to a better understanding of the different mechanisms that govern stress tolerance in wheat and other cereals.
PMID:16772040
HAFLER, BRIAN P.
2017-01-01
Purpose Inherited retinal dystrophies are a significant cause of vision loss and are characterized by the loss of photoreceptors and the retinal pigment epithelium (RPE). Mutations in approximately 250 genes cause inherited retinal degenerations with a high degree of genetic heterogeneity. New techniques in next-generation sequencing are allowing the comprehensive analysis of all retinal disease genes, thus changing the approach to the molecular diagnosis of inherited retinal dystrophies. This review serves to analyze clinical progress in genetic diagnostic testing and implications for retinal gene therapy. Methods A literature search of PubMed and OMIM was conducted to identify relevant articles on inherited retinal dystrophies. Results Next-generation genetic sequencing allows the simultaneous analysis of all the approximately 250 genes that cause inherited retinal dystrophies. Reported diagnostic rates are high, ranging from 51% to 57%. These new sequencing tools are highly accurate, with sensitivities of 97.9% and specificities of 100%. Retinal gene therapy clinical trials are underway for multiple genes including RPE65, ABCA4, CHM, RS1, MYO7A, CNGA3, CNGB3, ND4, and MERTK for which a molecular diagnosis may be beneficial for patients. Conclusion Comprehensive next-generation genetic sequencing of all retinal dystrophy genes is changing the paradigm for how retinal specialists perform genetic testing for inherited retinal degenerations. Not only are high diagnostic yields obtained, but mutations in genes with novel clinical phenotypes are also identified. In the era of retinal gene therapy clinical trials, identifying specific genetic defects will increasingly be of use to identify patients who may enroll in clinical studies and benefit from novel therapies. PMID:27753762
Kim, Taehyung; Tyndel, Marc S; Huang, Haiming; Sidhu, Sachdev S; Bader, Gary D; Gfeller, David; Kim, Philip M
2012-03-01
Peptide recognition domains and transcription factors play crucial roles in cellular signaling. They bind linear stretches of amino acids or nucleotides, respectively, with high specificity. Experimental techniques that assess the binding specificity of these domains, such as microarrays or phage display, can retrieve thousands of distinct ligands, providing detailed insight into binding specificity. In particular, the advent of next-generation sequencing has recently increased the throughput of such methods by several orders of magnitude. These advances have helped reveal the presence of distinct binding specificity classes that co-exist within a set of ligands interacting with the same target. Here, we introduce a software system called MUSI that can rapidly analyze very large data sets of binding sequences to determine the relevant binding specificity patterns. Our pipeline provides two major advances. First, it can detect previously unrecognized multiple specificity patterns in any data set. Second, it offers integrated processing of very large data sets from next-generation sequencing machines. The results are visualized as multiple sequence logos describing the different binding preferences of the protein under investigation. We demonstrate the performance of MUSI by analyzing recent phage display data for human SH3 domains as well as microarray data for mouse transcription factors.
Thaitrong, Numrin; Kim, Hanyoup; Renzi, Ronald F; Bartsch, Michael S; Meagher, Robert J; Patel, Kamlesh D
2012-12-01
We have developed an automated quality control (QC) platform for next-generation sequencing (NGS) library characterization by integrating a droplet-based digital microfluidic (DMF) system with a capillary-based reagent delivery unit and a quantitative CE module. Using an in-plane capillary-DMF interface, a prepared sample droplet was actuated into position between the ground electrode and the inlet of the separation capillary to complete the circuit for an electrokinetic injection. Using a DNA ladder as an internal standard, the CE module with a compact LIF detector was capable of detecting dsDNA in the range of 5-100 pg/μL, suitable for the amount of DNA required by the Illumina Genome Analyzer sequencing platform. This DMF-CE platform consumes tenfold less sample volume than the current Agilent BioAnalyzer QC technique, preserving precious sample while providing necessary sensitivity and accuracy for optimal sequencing performance. The ability of this microfluidic system to validate NGS library preparation was demonstrated by examining the effects of limited-cycle PCR amplification on the size distribution and the yield of Illumina-compatible libraries, demonstrating that as few as ten cycles of PCR bias the size distribution of the library toward undesirable larger fragments. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Experimental measurement of dolphin thrust generated during a tail stand using DPIV
NASA Astrophysics Data System (ADS)
Wei, Timothy; Fish, Frank; Williams, Terrie; Wu, Vicki; Sherman, Erica; Misfeldt, Mitchel; Ringenberg, Hunter; Rogers, Dylan
2016-11-01
The thrust generated by dolphins doing tail stands was measured using DPIV. The technique entailed measuring vortex strength associated with the tail motion and correlating it to above water video sequences showing the amount of the dolphin's body that was being lifted out of the water. The underlying drivers for this research included: i) understanding the physiology, hydrodynamics and efficiency of dolphin locomotion, ii) developing non-invasive measurement techniques for studying marine swimming and iii) quantifying the actual propulsive capabilities of these animals. Two different bottlenose dolphins at the Long Marine Lab at UC-Santa Cruz were used as test subjects. Application of the Kutta-Joukowski Theorem on measured vortex circulations yielded thrust values that were well correlated with estimates of dolphin body weight being supported above water. This demonstrates that the tail motion can be interpreted as a flapping hydrofoil that can generate a sustained thrust roughly equal to the dolphin's weight. Videos of DPIV measurements overlaid with the dolphins will be presented along with thrust/weight data.
Research progress of plant population genomics based on high-throughput sequencing.
Wang, Yun-sheng
2016-08-01
Population genomics, a new paradigm for population genetics, combines the concepts and techniques of genomics with the theoretical system of population genetics and improves our understanding of microevolution through identification of site-specific and genome-wide effects using genotyping of genome-wide polymorphic sites. With the appearance and improvement of next-generation high-throughput sequencing technology, the number of plant species with complete genome sequences has increased rapidly, and large-scale resequencing has also been carried out in recent years. Parallel sequencing has also been done in some plant species without complete genome sequences. These studies have greatly promoted the development of population genomics and deepened our understanding of the genetic diversity, level of linkage disequilibrium, selection effects, demographic history and molecular mechanisms of complex traits of the relevant plant populations at a genomic level. In this review, I briefly introduce the concept and research methods of population genomics and summarize the research progress of plant population genomics based on high-throughput sequencing. I also discuss the prospects as well as the existing problems of plant population genomics in order to provide references for related studies.
Gulati, Ashima; Somlo, Stefan
2018-05-01
The genesis of whole exome sequencing as a powerful tool for detailing the protein coding sequence of the human genome was conceptualized based on the availability of next-generation sequencing technology and knowledge of the human reference genome. The field of pediatric nephrology enriched with molecularly unsolved phenotypes is allowing the clinical and research application of whole exome sequencing to enable novel gene discovery and provide amendment of phenotypic misclassification. Recent studies in the field have informed us that newer high-throughput sequencing techniques are likely to be of high yield when applied in conjunction with conventional genomic approaches such as linkage analysis and other strategies used to focus subsequent analysis. They have also emphasized the need for the validation of novel genetic findings in large collaborative cohorts and the production of robust corroborative biological data. The well-structured application of comprehensive genomic testing in clinical and research arenas will hopefully continue to advance patient care and precision medicine, but does call for attention to be paid to its integrated challenges.
Zou, Xiaohui; Tang, Guangpeng; Zhao, Xiang; Huang, Yan; Chen, Tao; Lei, Mingyu; Chen, Wenbing; Yang, Lei; Zhu, Wenfei; Zhuang, Li; Yang, Jing; Feng, Zhaomin; Wang, Dayan; Wang, Dingming; Shu, Yuelong
2017-03-01
Many viruses can cause respiratory diseases in humans. Although great advances have been achieved in methods of diagnosis, it remains challenging to identify pathogens in unexplained pneumonia (UP) cases. In this study, we applied next-generation sequencing (NGS) technology and a metagenomic approach to detect and characterize respiratory viruses in UP cases from Guizhou Province, China. A total of 33 oropharyngeal swabs were obtained from hospitalized UP patients and subjected to NGS. An unbiased metagenomic analysis pipeline identified 13 virus species in 16 samples. Human rhinovirus C was the virus most frequently detected and was identified in seven samples. Human measles virus, adenovirus B 55 and coxsackievirus A10 were also identified. Metagenomic sequencing also provided virus genomic sequences, which enabled genotype characterization and phylogenetic analysis. For cases of multiple infection, metagenomic sequencing afforded information regarding the quantity of each virus in the sample, which could be used to evaluate each virus's role in the disease. Our study highlights the potential of metagenomic sequencing for pathogen identification in UP cases.
[Advances of Molecular Diagnostic Techniques Application in Clinical Diagnosis].
Ying, Bin-Wu
2016-11-01
Over the past 20 years, clinical molecular diagnostic technology has developed rapidly and has become the most promising field in clinical laboratory medicine. In particular, with the development of genomics, clinical molecular diagnostic methods will reveal the nature of clinical diseases at a deeper level, thus guiding clinical diagnosis and treatment. Many molecular diagnostic projects have been routinely applied in clinical work. This paper reviews advances in the application of clinical diagnostic techniques in infectious diseases, tumors and genetic disorders, including nucleic acid amplification, biochips, next-generation sequencing, and automated molecular systems, among others.
Molecular diagnostics of periodontitis.
Korona-Głowniak, Izabela; Siwiec, Radosław; Berger, Marcin; Malm, Anna; Szymańska, Jolanta
2017-01-28
The microorganisms that form dental plaque are the main cause of periodontitis. Their identification and the understanding of the complex relationships and interactions that involve these microorganisms, environmental factors and the host's health status enable improvement in diagnostics and targeted therapy in patients with periodontitis. To this end, molecular diagnostics techniques (both techniques based on the polymerase chain reaction and those involving nucleic acid analysis via hybridization) come increasingly into use. On the basis of a literature review, the following methods are presented: polymerase chain reaction (PCR), real-time polymerase chain reaction (real-time PCR), 16S rRNA-encoding gene sequencing, checkerboard and reverse-capture checkerboard hybridization, microarrays, denaturing gradient gel electrophoresis (DGGE), temperature gradient gel electrophoresis (TGGE), as well as terminal restriction fragment length polymorphism (TRFLP) and next generation sequencing (NGS). The advantages and drawbacks of each method in the examination of periopathogens are indicated. The techniques listed above allow fast detection of even small quantities of pathogen present in diagnostic material and prove particularly useful to detect microorganisms that are difficult or impossible to grow in a laboratory.
DNA assembly technique simplifies the construction of infectious clone of fowl adenovirus.
Zou, Xiao-Hui; Bi, Zhi-Xiang; Guo, Xiao-Juan; Zhang, Zun; Zhao, Yang; Wang, Min; Zhu, Ya-Lu; Jie, Hong-Ying; Yu, Yang; Hung, Tao; Lu, Zhuo-Zhuang
2018-07-01
Plasmid bearing adenovirus genome is generally constructed with the method of homologous recombination in E. coli BJ5183 strain. Here, we utilized Gibson gene assembly technique to generate infectious clone of fowl adenovirus 4 (FAdV-4). Primers flanked with partial inverted terminal repeat (ITR) sequence of FAdV-4 were synthesized to amplify a plasmid backbone containing kanamycin-resistant gene and pBR322 origin (KAN-ORI). DNA assembly was carried out by combining the KAN-ORI fragment, virus genomic DNA and DNA assembly master mix. E. coli competent cells were transformed with the assembled product, and plasmids (pKFAV4) were extracted and confirmed to contain viral genome by restriction analysis and sequencing. Virus was successfully rescued from linear pKFAV4-transfected chicken LMH cells. This approach was further verified in cloning of human adenovirus 5 genome. Our results indicated that DNA assembly technique simplified the construction of infectious clone of adenovirus, suggesting its possible application in virus traditional or reverse genetics. Copyright © 2018 Elsevier B.V. All rights reserved.
Single-Cell Sequencing for Drug Discovery and Drug Development.
Wu, Hongjin; Wang, Charles; Wu, Shixiu
2017-01-01
Next-generation sequencing (NGS), particularly single-cell sequencing, has revolutionized the scale and scope of genomic and biomedical research. Recent technological advances in NGS and single-cell studies have made deep whole-genome (DNA-seq), whole-epigenome and whole-transcriptome sequencing (RNA-seq) at the single-cell level feasible. NGS at the single-cell level expands our view of the genome, epigenome and transcriptome and allows the genome, epigenome and transcriptome of any organism to be explored without a priori assumptions and with unprecedented throughput, and it does so with single-nucleotide resolution. NGS is also a very powerful tool for drug discovery and drug development. In this review, we describe the current state of single-cell sequencing techniques, which can provide a new, more powerful and precise approach for analyzing the effects of drugs on treated cells and tissues. Our review discusses single-cell whole genome/exome sequencing (scWGS/scWES), single-cell transcriptome sequencing (scRNA-seq), single-cell bisulfite sequencing (scBS), and multiple omics of single-cell sequencing. We also highlight the advantages and challenges of each of these approaches. Finally, we describe, elaborate on and speculate about the potential applications of single-cell sequencing for drug discovery and drug development. Copyright © Bentham Science Publishers.
Rapid and accurate pyrosequencing of angiosperm plastid genomes
Moore, Michael J; Dhingra, Amit; Soltis, Pamela S; Shaw, Regina; Farmerie, William G; Folta, Kevin M; Soltis, Douglas E
2006-01-01
Background Plastid genome sequence information is vital to several disciplines in plant biology, including phylogenetics and molecular biology. The past five years have witnessed a dramatic increase in the number of completely sequenced plastid genomes, fuelled largely by advances in conventional Sanger sequencing technology. Here we report a further significant reduction in time and cost for plastid genome sequencing through the successful use of a newly available pyrosequencing platform, the Genome Sequencer 20 (GS 20) System (454 Life Sciences Corporation), to rapidly and accurately sequence the whole plastid genomes of the basal eudicot angiosperms Nandina domestica (Berberidaceae) and Platanus occidentalis (Platanaceae). Results More than 99.75% of each plastid genome was simultaneously obtained during two GS 20 sequence runs, to an average depth of coverage of 24.6× in Nandina and 17.3× in Platanus. The Nandina and Platanus plastid genomes shared essentially identical gene complements and possessed the typical angiosperm plastid structure and gene arrangement. To assess the accuracy of the GS 20 sequence, over 45 kilobases of sequence were generated for each genome using conventional sequencing. Overall error rates of 0.043% and 0.031% were observed in GS 20 sequence for Nandina and Platanus, respectively. More than 97% of all observed errors were associated with homopolymer runs, with ~60% of all errors associated with homopolymer runs of 5 or more nucleotides and ~50% of all errors associated with regions of extensive homopolymer runs. No substitution errors were present in either genome. Error rates were generally higher in the single-copy and noncoding regions of both plastid genomes relative to the inverted repeat and coding regions. Conclusion Highly accurate and essentially complete sequence information was obtained for the Nandina and Platanus plastid genomes using the GS 20 System. 
More importantly, the high accuracy observed in the GS 20 plastid genome sequence was generated for a significant reduction in time and cost over traditional shotgun-based genome sequencing techniques, although with approximately half the coverage of previously reported GS 20 de novo genome sequence. The GS 20 should be broadly applicable to angiosperm plastid genome sequencing, and therefore promises to expand the scale of plant genetic and phylogenetic research dramatically. PMID:16934154
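The abstract above attributes most GS 20 errors to homopolymer runs, including runs of 5 or more nucleotides. As a minimal sketch of how such runs could be located in an assembled sequence, the following Python snippet scans a string for stretches of a single repeated base. The function name, the threshold default and the example sequence are assumptions made for illustration; they are not taken from the study.

```python
import re

def homopolymer_runs(sequence, min_length=5):
    """Return (start, base, length) for every run of one base >= min_length."""
    # ([ACGT]) captures one base; \1{n,} requires at least n more copies of it.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(sequence.upper())]

# Hypothetical fragment: a 6-base T run, a 5-base G run and a 4-base A run.
example = "ACGTTTTTTAGGGGGCAAAAC"
runs = homopolymer_runs(example)
print(runs)
```

Positions flagged this way could then be cross-checked against an independent sequence, which is the kind of comparison the abstract describes using conventional sequencing.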
Wu, Nicholas C.; Young, Arthur P.; Al-Mawsawi, Laith Q.; Olson, C. Anders; Feng, Jun; Qi, Hangfei; Luan, Harding H.; Li, Xinmin; Wu, Ting-Ting
2014-01-01
ABSTRACT Viral proteins often display several functions which require multiple assays to dissect their genetic basis. Here, we describe a systematic approach to screen for loss-of-function mutations that confer a fitness disadvantage under a specified growth condition. Our methodology was achieved by genetically monitoring a mutant library under two growth conditions, with and without interferon, by deep sequencing. We employed a molecular tagging technique to distinguish true mutations from sequencing error. This approach enabled us to identify mutations that were negatively selected against, in addition to those that were positively selected for. Using this technique, we identified loss-of-function mutations in the influenza A virus NS segment that were sensitive to type I interferon in a high-throughput fashion. Mechanistic characterization further showed that a single substitution, D92Y, resulted in the inability of NS to inhibit RIG-I ubiquitination. The approach described in this study can be applied under any specified condition for any virus that can be genetically manipulated. IMPORTANCE Traditional genetics focuses on a single genotype-phenotype relationship, whereas high-throughput genetics permits phenotypic characterization of numerous mutants in parallel. High-throughput genetics often involves monitoring of a mutant library with deep sequencing. However, deep sequencing suffers from a high error rate (∼0.1 to 1%), which is usually higher than the occurrence frequency for individual point mutations within a mutant library. Therefore, only mutations that confer a fitness advantage can be identified with confidence due to an enrichment in the occurrence frequency. In contrast, it is impossible to identify deleterious mutations using most next-generation sequencing techniques. In this study, we have applied a molecular tagging technique to distinguish true mutations from sequencing errors. 
It enabled us to identify mutations that underwent negative selection, in addition to mutations that experienced positive selection. This study provides a proof of concept by screening for loss-of-function mutations on the influenza A virus NS segment that are involved in its anti-interferon activity. PMID:24965464
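The molecular tagging idea in the abstract above can be sketched in a few lines: reads that carry the same tag are assumed to be copies of one original molecule, so a base seen in only a minority of those copies is likely a sequencing error, while a base shared by all copies is likely a true mutation. The snippet below is a simplified illustration under that assumption; the function name, thresholds, tags and reads are hypothetical and do not reproduce the authors' pipeline.

```python
from collections import Counter, defaultdict

def consensus_by_tag(reads, min_reads=3, min_agreement=0.9):
    """Collapse reads sharing a molecular tag into one consensus sequence.

    reads: iterable of (tag, sequence) pairs; sequences with the same tag
    are assumed to be equal-length copies of one original molecule.
    """
    groups = defaultdict(list)
    for tag, seq in reads:
        groups[tag].append(seq)

    consensus = {}
    for tag, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few copies to vote out sequencing errors
        bases = []
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            # Positions without enough agreement are masked, not guessed.
            bases.append(base if count / len(seqs) >= min_agreement else "N")
        consensus[tag] = "".join(bases)
    return consensus

reads = [
    ("AAGT", "ACGTAC"), ("AAGT", "ACGTAC"), ("AAGT", "ACGAAC"),  # one copy has an error
    ("CCTA", "ACTTAC"), ("CCTA", "ACTTAC"), ("CCTA", "ACTTAC"),  # consistent mutation
    ("GGAT", "ACGTAC"),                                          # singleton, dropped
]
result = consensus_by_tag(reads, min_reads=3, min_agreement=0.6)
print(result)
```

In this toy input the isolated A in the first tag group is voted out, whereas the G-to-T change in the second group survives because every copy carries it.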
Malware analysis using visualized image matrices.
Han, KyoungSoo; Kang, BooJoong; Im, Eul Gyu
2014-01-01
This paper proposes a novel malware visual analysis method that contains not only a visualization method to convert binary files into images, but also a similarity calculation method between these images. The proposed method generates RGB-colored pixels on image matrices using the opcode sequences extracted from malware samples and calculates the similarities for the image matrices. Particularly, our proposed methods are available for packed malware samples by applying them to the execution traces extracted through dynamic analysis. When the images are generated, we can reduce the overheads by extracting the opcode sequences only from the blocks that include the instructions related to staple behaviors such as functions and application programming interface (API) calls. In addition, we propose a technique that generates a representative image for each malware family in order to reduce the number of comparisons for the classification of unknown samples and the colored pixel information in the image matrices is used to calculate the similarities between the images. Our experimental results show that the image matrices of malware can effectively be used to classify malware families both statically and dynamically with accuracy of 0.9896 and 0.9732, respectively.
Xavier, Miguel J; Nixon, Brett; Roman, Shaun D; Aitken, Robert John
2018-01-01
Current approaches for DNA extraction and fragmentation from mammalian spermatozoa provide several challenges for the investigation of the oxidative stress burden carried in the genome of male gametes. Indeed, the potential introduction of oxidative DNA damage induced by reactive oxygen species, reducing agents (dithiothreitol or beta-mercaptoethanol), and DNA shearing techniques used in the preparation of samples for chromatin immunoprecipitation and next-generation sequencing serves to confound the reliability and accuracy of the results obtained. Here we report optimised methodology that minimises, or completely eliminates, exposure to DNA damaging compounds during extraction and fragmentation procedures. Specifically, we show that Micrococcal nuclease (MNase) digestion prior to cellular lysis generates a greater DNA yield with minimal collateral oxidation while randomly fragmenting the entire paternal genome. This modified methodology represents a significant improvement over traditional fragmentation achieved via sonication in the preparation of genomic DNA from human spermatozoa for downstream applications, such as next-generation sequencing. We also present a redesigned bioinformatic pipeline framework adjusted to correctly analyse this form of data and detect statistically relevant targets of oxidation.
Synthetic generation of influenza vaccine viruses for rapid response to pandemics.
Dormitzer, Philip R; Suphaphiphat, Pirada; Gibson, Daniel G; Wentworth, David E; Stockwell, Timothy B; Algire, Mikkel A; Alperovich, Nina; Barro, Mario; Brown, David M; Craig, Stewart; Dattilo, Brian M; Denisova, Evgeniya A; De Souza, Ivna; Eickmann, Markus; Dugan, Vivien G; Ferrari, Annette; Gomila, Raul C; Han, Liqun; Judge, Casey; Mane, Sarthak; Matrosovich, Mikhail; Merryman, Chuck; Palladino, Giuseppe; Palmer, Gene A; Spencer, Terika; Strecker, Thomas; Trusheim, Heidi; Uhlendorff, Jennifer; Wen, Yingxia; Yee, Anthony C; Zaveri, Jayshree; Zhou, Bin; Becker, Stephan; Donabedian, Armen; Mason, Peter W; Glass, John I; Rappuoli, Rino; Venter, J Craig
2013-05-15
During the 2009 H1N1 influenza pandemic, vaccines for the virus became available in large quantities only after human infections peaked. To accelerate vaccine availability for future pandemics, we developed a synthetic approach that very rapidly generated vaccine viruses from sequence data. Beginning with hemagglutinin (HA) and neuraminidase (NA) gene sequences, we combined an enzymatic, cell-free gene assembly technique with enzymatic error correction to allow rapid, accurate gene synthesis. We then used these synthetic HA and NA genes to transfect Madin-Darby canine kidney (MDCK) cells that were qualified for vaccine manufacture with viral RNA expression constructs encoding HA and NA and plasmid DNAs encoding viral backbone genes. Viruses for use in vaccines were rescued from these MDCK cells. We performed this rescue with improved vaccine virus backbones, increasing the yield of the essential vaccine antigen, HA. Generation of synthetic vaccine seeds, together with more efficient vaccine release assays, would accelerate responses to influenza pandemics through a system of instantaneous electronic data exchange followed by real-time, geographically dispersed vaccine production.
Centralized Planning for Multiple Exploratory Robots
NASA Technical Reports Server (NTRS)
Estlin, Tara; Rabideau, Gregg; Chien, Steve; Barrett, Anthony
2005-01-01
A computer program automatically generates plans for a group of robotic vehicles (rovers) engaged in geological exploration of terrain. The program rapidly generates multiple command sequences that can be executed simultaneously by the rovers. Starting from a set of high-level goals, the program creates a sequence of commands for each rover while respecting hardware constraints and limitations on resources of each rover and of hardware (e.g., a radio communication terminal) shared by all the rovers. First, a separate model of each rover is loaded into a centralized planning subprogram. The centralized planning software uses the models of the rovers plus an iterative repair algorithm to resolve conflicts posed by demands for resources and by constraints associated with all the rovers and the shared hardware. During repair, heuristics are used to make planning decisions that will result in solutions that will be better and will be found faster than would otherwise be possible. In particular, techniques from prior solutions of the multiple-traveling-salesmen problem are used as heuristics to generate plans in which the paths taken by the rovers to assigned scientific targets are shorter than they would otherwise be.
Swept Impact Seismic Technique (SIST)
Park, C.B.; Miller, R.D.; Steeples, D.W.; Black, R.A.
1996-01-01
A coded seismic technique is developed that can result in a higher signal-to-noise ratio than a conventional single-pulse method does. The technique is cost-effective and time-efficient and therefore well suited for shallow-reflection surveys where high resolution and cost-effectiveness are critical. A low-power impact source transmits a few to several hundred high-frequency broad-band seismic pulses during several seconds of recording time according to a deterministic coding scheme. The coding scheme consists of a time-encoded impact sequence in which the rate of impact (cycles/s) changes linearly with time providing a broad range of impact rates. Impact times used during the decoding process are recorded on one channel of the seismograph. The coding concept combines the vibroseis swept-frequency and the Mini-Sosie random impact concepts. The swept-frequency concept greatly improves the suppression of correlation noise with much fewer impacts than normally used in the Mini-Sosie technique. The impact concept makes the technique simple and efficient in generating high-resolution seismic data especially in the presence of noise. The transfer function of the impact sequence simulates a low-cut filter with the cutoff frequency the same as the lowest impact rate. This property can be used to attenuate low-frequency ground-roll noise without using an analog low-cut filter or a spatial source (or receiver) array as is necessary with a conventional single-pulse method. Because of the discontinuous coding scheme, the decoding process is accomplished by a "shift-and-stacking" method that is much simpler and quicker than cross-correlation. The simplicity of the coding allows the mechanical design of the source to remain simple. Several different types of mechanical systems could be adapted to generate a linear impact sweep. In addition, the simplicity of the coding also allows the technique to be used with conventional acquisition systems, with only minor modifications.
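The "shift-and-stacking" decoding described in the abstract above can be illustrated with a small synthetic example: the record is shifted back by each known impact time and the shifted copies are summed, so the response to a single impact adds coherently while the overlapping responses from other impacts do not. The sketch below is an assumption-laden illustration in plain Python; the sample interval, sweep rates, record length and the single-spike earth response are all hypothetical and are not taken from the paper.

```python
def linear_sweep_times(rate_start, rate_end, duration):
    """Impact times (s) whose rate (impacts/s) changes linearly over the sweep."""
    times, t = [], 0.0
    while t < duration:
        times.append(t)
        rate = rate_start + (rate_end - rate_start) * t / duration
        t += 1.0 / rate
    return times

def shift_and_stack(record, impact_samples, out_len):
    """Decode by shifting the record back by each impact time and summing."""
    stacked = [0.0] * out_len
    for start in impact_samples:
        for i in range(out_len):
            if start + i < len(record):
                stacked[i] += record[start + i]
    return stacked

# Synthetic check: every impact produces one reflection 30 samples later.
dt = 0.001                                   # sample interval in seconds (assumed)
times = linear_sweep_times(10.0, 40.0, 2.0)  # 10 to 40 impacts/s over 2 s (assumed)
impact_samples = [round(t / dt) for t in times]
record = [0.0] * 2500
for s in impact_samples:
    record[s + 30] += 1.0                    # hypothetical earth response: unit spike

decoded = shift_and_stack(record, impact_samples, out_len=100)
peak = max(range(len(decoded)), key=decoded.__getitem__)
print(peak, decoded[peak], len(times))
```

Because the interval between impacts keeps changing, contributions from neighbouring impacts land at different lags and stay small, while the true arrival at sample 30 stacks once per impact. This is one way to picture why the abstract reports good suppression of correlation noise with relatively few impacts.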
Huang, Jianguo; Chen, Mark; Whitley, Melodi Javid; Kuo, Hsuan-Cheng; Xu, Eric S.; Walens, Andrea; Mowery, Yvonne M.; Van Mater, David; Eward, William C.; Cardona, Diana M.; Luo, Lixia; Ma, Yan; Lopez, Omar M.; Nelson, Christopher E.; Robinson-Hamm, Jacqueline N.; Reddy, Anupama; Dave, Sandeep S.; Gersbach, Charles A.; Dodd, Rebecca D.; Kirsch, David G.
2017-01-01
Genetically engineered mouse models that employ site-specific recombinase technology are important tools for cancer research but can be costly and time-consuming. The CRISPR-Cas9 system has been adapted to generate autochthonous tumours in mice, but how these tumours compare to tumours generated by conventional recombinase technology remains to be fully explored. Here we use CRISPR-Cas9 to generate multiple subtypes of primary sarcomas efficiently in wild type and genetically engineered mice. These data demonstrate that CRISPR-Cas9 can be used to generate multiple subtypes of soft tissue sarcomas in mice. Primary sarcomas generated with CRISPR-Cas9 and Cre recombinase technology had similar histology, growth kinetics, copy number variation and mutational load as assessed by whole exome sequencing. These results show that sarcomas generated with CRISPR-Cas9 technology are similar to sarcomas generated with conventional modelling techniques and suggest that CRISPR-Cas9 can be used to more rapidly generate genotypically and phenotypically similar cancers. PMID:28691711
Lossy compression of quality scores in genomic data.
Cánovas, Rodrigo; Moffat, Alistair; Turpin, Andrew
2014-08-01
Next-generation sequencing technologies are revolutionizing medicine. Data from sequencing technologies are typically represented as a string of bases, an associated sequence of per-base quality scores and other metadata, and in aggregate can require a large amount of space. The quality scores show how accurate the bases are with respect to the sequencing process, that is, how confident the sequencer is of having called them correctly, and are the largest component in datasets in which they are retained. Previous research has examined how to store sequences of bases effectively; here we add to that knowledge by examining methods for compressing quality scores. The quality values originate in a continuous domain, and so if a fidelity criterion is introduced, it is possible to introduce flexibility in the way these values are represented, allowing lossy compression over the quality score data. We present existing compression options for quality score data, and then introduce two new lossy techniques. Experiments measuring the trade-off between compression ratio and information loss are reported, including quantifying the effect of lossy representations on a downstream application that carries out single nucleotide polymorphism and insert/deletion detection. The new methods are demonstrably superior to other techniques when assessed against the spectrum of possible trade-offs between storage required and fidelity of representation. An implementation of the methods described here is available at https://github.com/rcanovas/libCSAM. Contact: rcanovas@student.unimelb.edu.au. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
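To make the notion of a fidelity criterion concrete, the sketch below applies generic uniform binning to a string of Phred+33 quality characters: every score is replaced by the midpoint of a fixed-width bin, which shrinks the symbol alphabet (and so helps a later entropy coder) while bounding the per-score error by half the bin width. This is a textbook-style illustration only and is not one of the two new techniques introduced in the paper; the function names, the bin width and the example quality string are assumptions.

```python
def decode_phred33(qual_string):
    """Convert a FASTQ quality string (Phred+33 encoding) to integer scores."""
    return [ord(ch) - 33 for ch in qual_string]

def quantize(scores, bin_width):
    """Map each score to the midpoint of its fixed-width bin (lossy)."""
    return [(q // bin_width) * bin_width + bin_width // 2 for q in scores]

def max_abs_error(original, lossy):
    """Largest per-score distortion introduced by the lossy mapping."""
    return max(abs(a - b) for a, b in zip(original, lossy))

scores = decode_phred33("IIIIHHGF@@5+")   # hypothetical read qualities
lossy = quantize(scores, bin_width=8)
print(sorted(set(scores)), sorted(set(lossy)), max_abs_error(scores, lossy))
```

Widening the bins trades fidelity for compressibility, which is the kind of trade-off the abstract reports measuring against downstream variant detection.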
Shen, Hong-Bin; Yi, Dong-Liang; Yao, Li-Xiu; Yang, Jie; Chou, Kuo-Chen
2008-10-01
In the postgenomic age, with the avalanche of protein sequences generated and relatively slow progress in determining their structures by experiments, it is important to develop automated methods to predict the structure of a protein from its sequence. The membrane proteins are a special group in the protein family that accounts for approximately 30% of all proteins; however, solved membrane protein structures only represent less than 1% of known protein structures to date. Although a great success has been achieved for developing computational intelligence techniques to predict secondary structures in both globular and membrane proteins, there is still much challenging work in this regard. In this review article, we firstly summarize the recent progress of automation methodology development in predicting protein secondary structures, especially in membrane proteins; we will then give some future directions in this research field.
Use of the alpha shape to quantify finite helical axis dispersion during simulated spine movements.
McLachlin, Stewart D; Bailey, Christopher S; Dunning, Cynthia E
2016-01-04
In biomechanical studies examining joint kinematics the most common measurement is range of motion (ROM), yet other techniques, such as the finite helical axis (FHA), may elucidate the changes in the 3D motion pathology more effectively. One of the deficiencies with the FHA technique is in quantifying the axes generated throughout a motion sequence. This study attempted to solve this issue via a computational geometric technique known as the alpha shape, which bounds a set of point data within a closed boundary similar to a convex hull. The purpose of this study was to use the alpha shape as an additional tool to visualize and quantify FHA dispersion between intact and injured cadaveric spine movements and compare these changes to the gold-standard ROM measurements. Flexion-extension, axial rotation, and lateral bending were simulated with five C5-C6 motion segments using a spinal loading simulator and Optotrak motion tracking system. Specimens were first tested intact followed by a simulated injury model. ROM and the FHAs were calculated post-hoc, with alpha shapes and convex hulls generated from the anatomic planar intercept points of the FHAs. While both ROM and the boundary shape areas increased with injury (p<0.05), no consistent geometric trends in the alpha shape growth were identified. The alpha shape area was sensitive to the alpha value chosen and values examined below 2.5 created more than one closed boundary. Ultimately, the alpha shape presents as a useful technique to quantify sequences of joint kinematics described by scatter plots such as FHA intercept data. Copyright © 2015. Published by Elsevier Ltd.
SU-F-T-527: A Novel Dynamic Multileaf Collimator Leaf-Sequencing Algorithm in Radiation Therapy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jing, J; Lin, H; Chow, J
Purpose: A novel leaf-sequencing algorithm is developed for generating arbitrary beam intensity profiles in discrete levels using dynamic multileaf collimator (MLC). The efficiency of this dynamic MLC leaf-sequencing method was evaluated using external beam treatment plans delivered by intensity modulated radiation therapy technique. Methods: To qualify and validate this algorithm, integral test for the beam segment of MLC generated by the CORVUS treatment planning system was performed with clinical intensity map experiments. The treatment plans were optimized and the fluence maps for all photon beams were determined. This algorithm started with the algebraic expression for the area under the beam profile. The coefficients in the expression can be transformed into the specifications for the leaf-setting sequence. The leaf optimization procedure was then applied and analyzed for clinically relevant intensity profiles in cancer treatment. Results: The macrophysical effect of this method can be described by volumetric plan evaluation tools such as dose-volume histograms (DVHs). The DVH results are in good agreement compared to those from the CORVUS treatment planning system. Conclusion: We developed a dynamic MLC method to examine the stability of leaf speed including effects of acceleration and deceleration of leaf motion in order to make sure the stability of leaf speed did not affect the intensity profile generated. It was found that the mechanical requirements were better satisfied using this method. The Project is sponsored by the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry.
Didelot, Audrey; Kotsopoulos, Steve K; Lupo, Audrey; Pekin, Deniz; Li, Xinyu; Atochin, Ivan; Srinivasan, Preethi; Zhong, Qun; Olson, Jeff; Link, Darren R; Laurent-Puig, Pierre; Blons, Hélène; Hutchison, J Brian; Taly, Valerie
2013-05-01
Assessment of DNA integrity and quantity remains a bottleneck for high-throughput molecular genotyping technologies, including next-generation sequencing. In particular, DNA extracted from paraffin-embedded tissues, a major potential source of tumor DNA, varies widely in quality, leading to unpredictable sequencing data. We describe a picoliter droplet-based digital PCR method that enables simultaneous detection of DNA integrity and the quantity of amplifiable DNA. Using a multiplex assay, we detected 4 different target lengths (78, 159, 197, and 550 bp). Assays were validated with human genomic DNA fragmented to sizes of 170 bp to 3000 bp. The technique was validated with DNA quantities as low as 1 ng. We evaluated 12 DNA samples extracted from paraffin-embedded lung adenocarcinoma tissues. One sample contained no amplifiable DNA. The fractions of amplifiable DNA for the 11 other samples were between 0.05% and 10.1% for 78-bp fragments and ≤1% for longer fragments. Four samples were chosen for enrichment and next-generation sequencing. The quality of the sequencing data was in agreement with the results of the DNA-integrity test. Specifically, DNA with low integrity yielded sequencing results with lower levels of coverage and uniformity and had higher levels of false-positive variants. The development of DNA-quality assays will enable researchers to downselect samples or process more DNA to achieve reliable genome sequencing with the highest possible efficiency of cost and effort, as well as minimize the waste of precious samples. © 2013 American Association for Clinical Chemistry.
Meason-Smith, Courtney; Diesel, Alison; Patterson, Adam P; Older, Caitlin E; Johnson, Timothy J; Mansell, Joanne M; Suchodolski, Jan S; Rodrigues Hoffmann, Aline
2017-02-01
Next generation sequencing (NGS) studies have demonstrated a diverse skin-associated microbiota and microbial dysbiosis associated with atopic dermatitis in people and in dogs. The skin of cats has yet to be investigated using NGS techniques. We hypothesized that the fungal microbiota of healthy feline skin would be similar to that of dogs, with a predominance of environmental fungi, and that fungal dysbiosis would be present on the skin of allergic cats. The study included eleven healthy cats and nine cats diagnosed with one or more cutaneous hypersensitivity disorders, including flea bite, food-induced and nonflea nonfood-induced hypersensitivity. Healthy cats were sampled at twelve body sites and allergic cats at six sites. DNA was isolated and Illumina sequencing was performed targeting the internal transcribed spacer region of fungi. Sequences were processed using the bioinformatics software QIIME. The most abundant fungal sequences from the skin of all cats were classified as Cladosporium and Alternaria. The mucosal sites, including nostril, conjunctiva and reproductive tracts, had the fewest fungi, whereas the pre-aural space had the most. Allergic feline skin had significantly greater amounts of Agaricomycetes and Sordariomycetes, and significantly less Epicoccum compared to healthy feline skin. The skin of healthy cats appears to have a more diverse fungal microbiota compared to previous studies, and a fungal dysbiosis is noted in the skin of allergic cats. Future studies assessing the temporal stability of the skin microbiota in cats will be useful in determining whether the microbiota sequenced using NGS are colonizers or transient microbes. © 2016 ESVD and ACVD.
P41 IDENTIFICATION OF GLIOMA SPECIFIC APTAMER TARGETS
Arora, Mohit; Alder, Jane; Lawrence, Clare; Davis, Charles; Dawson, Tim; Hall, Greg; Shaw, Lisa
2014-01-01
INTRODUCTION: Aptamers are in vitro generated DNA and RNA sequences which are randomly created as a library, with multiple permutations and combinations. These are then exposed to the target structure against which we want an aptamer ‘selected’ using Systematic Evolution of Ligands by Exponential enrichment (SELEX). METHOD: Commercially available glioma and glial cell lines and in-house generated primary glioma cultures were used. Modified aptamers based on published sequences against glioma cell lines and newly generated sequences were used in the project to identify their binding targets. Cy3- or biotin-conjugated aptamers were incubated with live glioma cell cultures and imaged using confocal or light microscopy. To determine the target ligand, aptamers were then reacted with glial cell lysate and subjected to precipitation using streptavidin agarose beads and SDS polyacrylamide electrophoresis. Proteins were analysed by mass spectroscopy. RESULTS: Known and unknown aptamer protein ligands were co-precipitated. Ku70, Ku80 were precipitated along with nucleolin and related proteins. CONCLUSION: The aptamer has shown preferential binding to glioma cells and could act as a delivery system for therapeutic payloads. The aptamer targets Ku70 and Ku80, which are known to be over expressed in other forms of cancer but their role in gliomagenesis has not been fully elucidated. Other novel proteins have also been identified. Thus the aptamer co-precipitation technique has identified potential glioma biomarkers that may be of clinical significance.
Variable speed wind turbine generator with zero-sequence filter
Muljadi, Eduard
1998-01-01
A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero frequency sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility.
Variable Speed Wind Turbine Generator with Zero-sequence Filter
Muljadi, Eduard
1998-08-25
A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero frequency sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility.
Variable speed wind turbine generator with zero-sequence filter
Muljadi, E.
1998-08-25
A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero frequency sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility. 14 figs.
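The separation of the two current components described in the records above can be sketched numerically. A balanced three phase (positive sequence) set sums to zero at every instant, so the average of the three phase currents isolates the component that is identical in all phases. The sketch below only illustrates that arithmetic; the function names, amplitudes and the 37 Hz excitation frequency are hypothetical, and it does not model the patented filter or generator.

```python
import math


def commanded_currents(t, i_pos, f_pos, i_zero, f_zero=60.0):
    """Three phase currents at time t: a balanced positive sequence part
    (phases 120 degrees apart) plus a zero sequence part common to all phases."""
    zero = i_zero * math.cos(2 * math.pi * f_zero * t)
    return [
        i_pos * math.cos(2 * math.pi * f_pos * t - k * 2 * math.pi / 3) + zero
        for k in range(3)
    ]


def split_components(currents):
    """Split instantaneous phase currents into the common (zero sequence) value
    and the balanced remainder of each phase."""
    zero = sum(currents) / 3.0
    balanced = [i - zero for i in currents]
    return zero, balanced


# Hypothetical operating point: 10 A excitation at 37 Hz, 4 A zero sequence at 60 Hz.
t = 0.0013
phases = commanded_currents(t, i_pos=10.0, f_pos=37.0, i_zero=4.0)
zero, balanced = split_components(phases)
print(zero)           # equals 4.0 * cos(2*pi*60*t) up to rounding
print(sum(balanced))  # approximately 0: the balanced part carries no common component
```

Because the balanced part has no common component, a path that only passes the common component sees none of the variable frequency excitation, and a star connected winding without a neutral return carries none of the common component, which matches the behaviour the abstract attributes to the filter and the stator windings.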
Now and Next-Generation Sequencing Techniques: Future of Sequence Analysis Using Cloud Computing
Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav
2012-01-01
Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed “cloud computing”) has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows. PMID:23248640
Now and next-generation sequencing techniques: future of sequence analysis using cloud computing.
Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav
2012-01-01
Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed "cloud computing") has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows.
NASA Technical Reports Server (NTRS)
Wallace, G. R.; Weathers, G. D.; Graf, E. R.
1973-01-01
The statistics of filtered pseudorandom digital sequences called hybrid-sum sequences, formed from the modulo-two sum of several maximum-length sequences, are analyzed. The results indicate that a relation exists between the statistics of the filtered sequence and the characteristic polynomials of the component maximum length sequences. An analysis procedure is developed for identifying a large group of sequences with good statistical properties for applications requiring the generation of analog pseudorandom noise. By use of the analysis approach, the filtering process is approximated by the convolution of the sequence with a sum of unit step functions. A parameter reflecting the overall statistical properties of filtered pseudorandom sequences is derived. This parameter is called the statistical quality factor. A computer algorithm to calculate the statistical quality factor for the filtered sequences is presented, and the results for two examples of sequence combinations are included. The analysis reveals that the statistics of the signals generated with the hybrid-sum generator are potentially superior to the statistics of signals generated with maximum-length generators. Furthermore, fewer calculations are required to evaluate the statistics of a large group of hybrid-sum generators than are required to evaluate the statistics of the same size group of approximately equivalent maximum-length sequences.
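The construction named in the abstract, a modulo-two sum of maximum-length sequences, can be sketched in a few lines. The example below is an illustration only: the two characteristic polynomials (x^3 + x + 1 and x^4 + x + 1), the seeds and the helper names are assumptions, and neither the filtering step nor the statistical quality factor from the report is reproduced.

```python
def m_sequence(taps, degree, length):
    """Bits of the linear recurrence s[n+degree] = XOR of s[n+t] for t in taps,
    started from the state 1, 0, ..., 0."""
    state = [1] + [0] * (degree - 1)
    out = []
    for _ in range(length):
        out.append(state[0])
        new_bit = 0
        for t in taps:
            new_bit ^= state[t]
        state = state[1:] + [new_bit]
    return out


def hybrid_sum(*sequences):
    """Modulo-two (XOR) sum of several equal-length bit sequences."""
    return [sum(bits) % 2 for bits in zip(*sequences)]


def period(bits):
    """Smallest shift p under which the observed window repeats itself."""
    n = len(bits)
    for p in range(1, n):
        if all(bits[i] == bits[i + p] for i in range(n - p)):
            return p
    return n


n = 210
a = m_sequence((0, 1), 3, n)  # x^3 + x + 1, period 7
b = m_sequence((0, 1), 4, n)  # x^4 + x + 1, period 15
h = hybrid_sum(a, b)
print(period(a), period(b), period(h))  # 7 15 105
```

With coprime component periods, the combined sequence repeats only after their product, so two short generators yield a much longer pseudorandom sequence.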
Sequence capture of ultraconserved elements from bird museum specimens.
McCormack, John E; Tsai, Whitney L E; Faircloth, Brant C
2016-09-01
New DNA sequencing technologies are allowing researchers to explore the genomes of the millions of natural history specimens collected prior to the molecular era. Yet, we know little about how well specific next-generation sequencing (NGS) techniques work with the degraded DNA typically extracted from museum specimens. Here, we use one type of NGS approach, sequence capture of ultraconserved elements (UCEs), to collect data from bird museum specimens as old as 120 years. We targeted 5060 UCE loci in 27 western scrub-jays (Aphelocoma californica) representing three evolutionary lineages that could be species, and we collected an average of 3749 UCE loci containing 4460 single nucleotide polymorphisms (SNPs). Despite older specimens producing fewer and shorter loci in general, we collected thousands of markers from even the oldest specimens. More sequencing reads per individual helped to boost the number of UCE loci we recovered from older specimens, but more sequencing was not as successful at increasing the length of loci. We detected contamination in some samples and determined that contamination was more prevalent in older samples that were subject to less sequencing. For the phylogeny generated from concatenated UCE loci, contamination led to incorrect placement of some individuals. In contrast, a species tree constructed from SNPs called within UCE loci correctly placed individuals into three monophyletic groups, perhaps because of the stricter analytical procedures used for SNP calling. This study and other recent studies on the genomics of museum specimens have profound implications for natural history collections, where millions of older specimens should now be considered genomic resources. © 2015 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
NASA Technical Reports Server (NTRS)
Gladden, Roy
2007-01-01
Version 2.0 of the autogen software has been released. "Autogen" (automated sequence generation) signifies both a process and software used to implement the process of automated generation of sequences of commands in a standard format for uplink to spacecraft. Autogen requires fewer workers than are needed for older manual sequence-generation processes and reduces sequence-generation times from weeks to minutes.
Periodic, On-Demand, and User-Specified Information Reconciliation
NASA Technical Reports Server (NTRS)
Kolano, Paul
2007-01-01
Automated sequence generation (autogen) signifies both a process and software used to automatically generate sequences of commands to operate various spacecraft. Autogen requires fewer workers than are needed for older manual sequence-generation processes and reduces sequence-generation times from weeks to minutes. The autogen software comprises the autogen script plus the Activity Plan Generator (APGEN) program. APGEN can be used for planning missions and command sequences. APGEN includes a graphical user interface that facilitates scheduling of activities on a time line and affords a capability to automatically expand, decompose, and schedule activities.
Single-shot thermal ghost imaging using wavelength-division multiplexing
NASA Astrophysics Data System (ADS)
Deng, Chao; Suo, Jinli; Wang, Yuwang; Zhang, Zhili; Dai, Qionghai
2018-01-01
Ghost imaging (GI) is an emerging technique that reconstructs the target scene from its correlated measurements with a sequence of patterns. Restricted by the multi-shot principle, GI usually requires long acquisition time and is limited in observation of dynamic scenes. To handle this problem, this paper proposes a single-shot thermal ghost imaging scheme via a wavelength-division multiplexing technique. Specifically, we generate thousands of correlated patterns simultaneously by modulating a broadband light source with a wavelength dependent diffuser. These patterns carry the scene's spatial information and then the correlated photons are coupled into a spectrometer for the final reconstruction. This technique increases the speed of ghost imaging and promotes the applications in dynamic ghost imaging with high scalability and compatibility.
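The correlation step that underlies ghost imaging in general can be sketched as follows. Each pattern illuminates the scene, a single "bucket" value records the total transmitted light, and the image is recovered by correlating the bucket values with each pattern pixel. This toy uses every binary pattern on a four-pixel scene so that the result is exact; it does not model the wavelength-division multiplexing, the diffuser or the spectrometer of the paper, and all names are hypothetical.

```python
from itertools import product


def bucket_signal(scene, pattern):
    """Total light collected when `pattern` illuminates `scene` (single-pixel detector)."""
    return sum(s * p for s, p in zip(scene, pattern))


def ghost_reconstruct(patterns, buckets):
    """Per-pixel correlation <B * P_i> - <B> * <P_i> over all measurements."""
    m = len(patterns)
    mean_b = sum(buckets) / m
    image = []
    for i in range(len(patterns[0])):
        mean_p = sum(p[i] for p in patterns) / m
        mean_bp = sum(b * p[i] for b, p in zip(buckets, patterns)) / m
        image.append(mean_bp - mean_b * mean_p)
    return image


scene = [0.0, 1.0, 0.5, 0.0]                  # unknown transmission of four pixels
patterns = list(product((0, 1), repeat=4))    # all 16 binary illumination patterns
buckets = [bucket_signal(scene, p) for p in patterns]
print(ghost_reconstruct(patterns, buckets))   # [0.0, 0.25, 0.125, 0.0]
```

The reconstruction is the scene scaled by the variance of a pattern pixel (0.25 here). With random rather than exhaustive patterns the same estimator converges to this value as the number of patterns grows, which is why conventional multi-shot acquisition is slow.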
The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.
Murray, Vincent; Chen, Jon K; Tanaka, Mark M
2016-07-01
The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'- TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
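The consensus 5'-RTGT*AY reported above can be turned into a simple sequence scan. The sketch below finds every occurrence of the hexanucleotide on one strand and reports the index of the starred T. It is an illustration of the motif only, not the authors' analysis pipeline; the function name is hypothetical, and the reverse-complement strand is not searched.

```python
import re

# R = purine (A or G), Y = pyrimidine (C or T); lookahead allows overlapping hits.
MOTIF = re.compile(r"(?=([AG]TGTA[CT]))")


def preferred_cleavage_sites(seq):
    """Return (motif_start, starred_T_index) pairs, 0-based, for each RTGT*AY hit."""
    seq = seq.upper()
    return [(m.start(), m.start() + 3) for m in MOTIF.finditer(seq)]


print(preferred_cleavage_sites("CCATGTACGGGTGTATTT"))  # [(2, 5), (10, 13)]
print(preferred_cleavage_sites("ATGTATGTAC"))          # [(0, 3), (4, 7)]
```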
Denoising DNA deep sequencing data—high-throughput sequencing errors and their correction
Laehnemann, David; Borkhardt, Arndt
2016-01-01
Characterizing the errors generated by common high-throughput sequencing platforms and telling true genetic variation from technical artefacts are two interdependent steps, essential to many analyses such as single nucleotide variant calling, haplotype inference, sequence assembly and evolutionary studies. Both random and systematic errors can show a specific occurrence profile for each of the six prominent sequencing platforms surveyed here: 454 pyrosequencing, Complete Genomics DNA nanoball sequencing, Illumina sequencing by synthesis, Ion Torrent semiconductor sequencing, Pacific Biosciences single-molecule real-time sequencing and Oxford Nanopore sequencing. There is a large variety of programs available for error removal in sequencing read data, which differ in the error models and statistical techniques they use, the features of the data they analyse, the parameters they determine from them and the data structures and algorithms they use. We highlight the assumptions they make and for which data types these hold, providing guidance which tools to consider for benchmarking with regard to the data properties. While no benchmarking results are included here, such specific benchmarks would greatly inform tool choices and future software development. The development of stand-alone error correctors, as well as single nucleotide variant and haplotype callers, could also benefit from using more of the knowledge about error profiles and from (re)combining ideas from the existing approaches presented here. PMID:26026159
Ehrhardt, J; Säring, D; Handels, H
2007-01-01
Modern tomographic imaging devices enable the acquisition of spatial and temporal image sequences. However, the spatial and temporal resolution of such devices is limited, and therefore image interpolation techniques are needed to represent images at a desired level of discretization. This paper presents a method for structure-preserving interpolation between neighboring slices in temporal or spatial image sequences. In a first step, the spatiotemporal velocity field between image slices is determined using an optical flow-based registration method in order to establish spatial correspondence between adjacent slices. An iterative algorithm is applied using the spatial and temporal image derivatives and a spatiotemporal smoothing step. Afterwards, the calculated velocity field is used to generate an interpolated image at the desired time by averaging intensities between corresponding points. Three quantitative measures are defined to evaluate the performance of the interpolation method. The behavior and capability of the algorithm are demonstrated on synthetic images. A set of 17 temporal and spatial image sequences is used to compare the optical flow-based interpolation method to linear and shape-based interpolation. The quantitative results show that the optical flow-based method outperforms linear and shape-based interpolation by a statistically significant margin. The interpolation method presented is able to generate image sequences with the spatial or temporal resolution needed for image comparison, analysis or visualization tasks. Quantitative and qualitative measures extracted from synthetic phantoms and medical image data show that the new method has advantages over linear and shape-based interpolation.
Integrative workflows for metagenomic analysis
Ladoukakis, Efthymios; Kolisis, Fragiskos N.; Chatziioannou, Aristotelis A.
2014-01-01
The rapid evolution of all sequencing technologies, described by the term Next Generation Sequencing (NGS), has revolutionized metagenomic analysis. These technologies constitute a combination of high-throughput analytical protocols, coupled to delicate measuring techniques, in order to potentially discover, properly assemble and map allelic sequences to the correct genomes, achieving particularly high yields for only a fraction of the cost of traditional processes (i.e., Sanger). From a bioinformatic perspective, this boils down to many GB of data being generated from each single sequencing experiment, rendering the management, or even the storage, of the data a critical bottleneck with respect to the overall analytical endeavor. The enormous complexity is further aggravated by the versatility of the processing steps available, represented by the numerous bioinformatic tools that are essential for each analytical task in order to fully unveil the genetic content of a metagenomic dataset. These disparate tasks range from simple, yet non-trivial, quality control of raw data to exceptionally complex protein annotation procedures, requiring a high level of expertise for their proper application or the neat implementation of the whole workflow. Furthermore, a bioinformatic analysis of such scale requires grand computational resources, imposing the utilization of cloud computing infrastructures as the sole realistic solution. In this review article we discuss the different integrative bioinformatic solutions available that address the aforementioned issues, by performing a critical assessment of the available automated pipelines for data management, quality control, and annotation of metagenomic data, embracing various major sequencing technologies and applications. PMID:25478562
Evaluation of next generation sequencing for the analysis of Eimeria communities in wildlife.
Vermeulen, Elke T; Lott, Matthew J; Eldridge, Mark D B; Power, Michelle L
2016-05-01
Next-generation sequencing (NGS) techniques are well-established for studying bacterial communities but not yet for microbial eukaryotes. Parasite communities remain poorly studied, due in part to the lack of reliable and accessible molecular methods to analyse eukaryotic communities. We aimed to develop and evaluate a methodology to analyse communities of the protozoan parasite Eimeria from populations of the Australian marsupial Petrogale penicillata (brush-tailed rock-wallaby) using NGS. An oocyst purification method for small sample sizes and polymerase chain reaction (PCR) protocol for the 18S rRNA locus targeting Eimeria was developed and optimised prior to sequencing on the Illumina MiSeq platform. A data analysis approach was developed by modifying methods from bacterial metagenomics and utilising existing Eimeria sequences in GenBank. Operational taxonomic unit (OTU) assignment at a high similarity threshold (97%) was more accurate at assigning Eimeria contigs into Eimeria OTUs but at a lower threshold (95%) there was greater resolution between OTU consensus sequences. The assessment of two amplification PCR methods prior to Illumina MiSeq, single and nested PCR, determined that single PCR was more sensitive to Eimeria as more Eimeria OTUs were detected in single amplicons. We have developed a simple and cost-effective approach to a data analysis pipeline for community analysis of eukaryotic organisms using Eimeria communities as a model. The pipeline provides a basis for evaluation using other eukaryotic organisms and potential for diverse community analysis studies. Copyright © 2016 Elsevier B.V. All rights reserved.
Breinholt, Jesse W; Earl, Chandra; Lemmon, Alan R; Lemmon, Emily Moriarty; Xiao, Lei; Kawahara, Akito Y
2018-01-01
The advent of next-generation sequencing technology has allowed for the collection of large portions of the genome for phylogenetic analysis. Hybrid enrichment and transcriptomics are two techniques that leverage next-generation sequencing and have shown much promise. However, methods for processing hybrid enrichment data are still limited. We developed a pipeline for anchored hybrid enrichment (AHE) read assembly, orthology determination, contamination screening, and data processing for sequences flanking the target "probe" region. We apply this approach to study the phylogeny of butterflies and moths (Lepidoptera), a megadiverse group of more than 157,000 described species with poorly understood deep-level phylogenetic relationships. We introduce a new, 855-locus AHE kit for Lepidoptera phylogenetics and compare resulting trees to those from transcriptomes. The enrichment kit was designed from existing genomes, transcriptomes, and expressed sequence tags and was used to capture sequence data from 54 species from 23 lepidopteran families. Phylogenies estimated from AHE data were largely congruent with trees generated from transcriptomes, with strong support for relationships at all but the deepest taxonomic levels. We combine AHE and transcriptomic data to generate a new Lepidoptera phylogeny, representing 76 exemplar species in 42 families. The tree provides robust support for many relationships, including those among the seven butterfly families. The addition of AHE data to an existing transcriptomic dataset lowers node support along the Lepidoptera backbone, but firmly places taxa with AHE data on the phylogeny. Combining taxa sequenced for AHE with existing transcriptomes and genomes resulted in a tree with strong support for (Calliduloidea + Gelechioidea + Thyridoidea) + (Papilionoidea + Pyraloidea + Macroheterocera).
To examine the efficacy of AHE at a shallow taxonomic level, phylogenetic analyses were also conducted on a sister group representing a more recent divergence, the Saturniidae and Sphingidae. These analyses utilized sequences from the probe region and data flanking it, which nearly doubled the size of the dataset; resulting trees supported new phylogenetic relationships, especially within the Saturniidae and Sphingidae (e.g., Hemarina derived in the latter). We hope that our data processing pipeline, hybrid enrichment gene set, and approach of combining AHE data with transcriptomes will be useful for the broader systematics community. © The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Universal sequence map (USM) of arbitrary discrete sequences
2002-01-01
Background: For over a decade the idea of representing biological sequences in a continuous coordinate space has maintained its appeal but not been fully realized. The basic idea is that any sequence of symbols may define trajectories in the continuous space conserving all its statistical properties. Ideally, such a representation would allow scale independent sequence analysis – without the context of fixed memory length. A simple example would consist of being able to infer the homology between two sequences solely by comparing the coordinates of any two homologous units. Results: We have successfully identified such an iterative function for bijective mapping ψ of discrete sequences into objects of continuous state space that enable scale-independent sequence analysis. The technique, named Universal Sequence Mapping (USM), is applicable to sequences with an arbitrary length and arbitrary number of unique units and generates a representation where map distance estimates sequence similarity. The novel USM procedure is based on earlier work by these and other authors on the properties of Chaos Game Representation (CGR). The latter enables the representation of 4 unit type sequences (like DNA) as an order free Markov Chain transition table. The properties of USM are illustrated with test data and can be verified for other data by using the accompanying web-based tool: http://bioinformatics.musc.edu/~jonas/usm/. Conclusions: USM is shown to enable a statistical mechanics approach to sequence analysis. The scale independent representation frees sequence analysis from the need to assume a memory length in the investigation of syntactic rules. PMID:11895567
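The Chaos Game Representation on which USM builds can be sketched for the four-letter DNA alphabet as follows. This is only the classical CGR iteration, not the USM procedure itself; the corner assignment, the starting point and the name `cgr_coordinates` are assumptions made for illustration.

```python
# Assumed corner assignment of the unit square; other conventions exist.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_coordinates(sequence, start=(0.5, 0.5)):
    """Return the CGR trajectory of a DNA sequence.

    Each step moves the current point halfway towards the corner
    assigned to the next base, so a point encodes its sequence history.
    """
    x, y = start
    points = []
    for base in sequence.upper():
        cx, cy = CORNERS[base]  # raises KeyError on non-ACGT symbols
        x = (x + cx) / 2.0
        y = (y + cy) / 2.0
        points.append((x, y))
    return points
```

Because every step halves the distance to the chosen corner, two sequences that share their last k symbols end within 2^-k of each other in each coordinate, which is the sense in which map distance reflects shared sequence context.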
Random whole metagenomic sequencing for forensic discrimination of soils.
Khodakova, Anastasia S; Smith, Renee J; Burgoyne, Leigh; Abarno, Damien; Linacre, Adrian
2014-01-01
Here we assess the ability of random whole metagenomic sequencing approaches to discriminate between similar soils from two geographically distinct urban sites for application in forensic science. Repeat samples from two parklands in residential areas separated by approximately 3 km were collected and the DNA was extracted. Shotgun, whole genome amplification (WGA) and single arbitrarily primed DNA amplification (AP-PCR) based sequencing techniques were then used to generate soil metagenomic profiles. Full and subsampled metagenomic datasets were then annotated against M5NR/M5RNA (taxonomic classification) and SEED Subsystems (metabolic classification) databases. Further comparative analyses were performed using a number of statistical tools including: hierarchical agglomerative clustering (CLUSTER); similarity profile analysis (SIMPROF); non-metric multidimensional scaling (NMDS); and canonical analysis of principal coordinates (CAP) at all major levels of taxonomic and metabolic classification. Our data showed that shotgun and WGA-based approaches generated highly similar metagenomic profiles for the soil samples such that the soil samples could not be distinguished accurately. An AP-PCR based approach was shown to be successful at obtaining reproducible site-specific metagenomic DNA profiles, which in turn were employed for successful discrimination of visually similar soil samples collected from two different locations.
Lima, Jakelyne; Cerdeira, Louise Teixeira; Bol, Erick; Schneider, Maria Paula Cruz; Silva, Artur; Azevedo, Vasco; Abelém, Antônio Jorge Gomes
2012-01-01
Improvements in genome sequencing techniques have resulted in the generation of huge volumes of data. As a consequence of this progress, the genome assembly stage demands even more computational power, since the incoming sequence files contain large amounts of data. To speed up the process, it is often necessary to distribute the workload among a group of machines. However, this requires hardware and software solutions specially configured for this purpose. Grid computing tries to simplify this process of aggregating resources, but does not always offer the best possible performance due to the heterogeneity and decentralized management of its resources. Thus, it is necessary to develop software that takes these peculiarities into account. To this end, we developed an algorithm to optimize the operation of the de novo assembly software ABySS in grids. We ran ABySS with and without the algorithm we developed in the grid simulator SimGrid. Tests showed that our algorithm is viable, flexible, and scalable even in a heterogeneous environment, and that it improved the genome assembly time in computational grids without changing its quality. PMID:22461785
Protocol matters: which methylome are you actually studying?
Robinson, Mark D; Statham, Aaron L; Speed, Terence P; Clark, Susan J
2011-01-01
The field of epigenetics is now capitalizing on the vast number of emerging technologies, largely based on second-generation sequencing, which interrogate DNA methylation status and histone modifications genome-wide. However, getting an exhaustive and unbiased view of a methylome at a reasonable cost is proving to be a significant challenge. In this article, we take a closer look at the impact of the DNA sequence and bias effects introduced to datasets by genome-wide DNA methylation technologies and, where possible, explore the bioinformatics tools that deconvolve them. There remains much to be learned about the performance of genome-wide technologies, the data we mine from these assays and how it reflects the actual biology. While there are several methods to interrogate the DNA methylation status genome-wide, our opinion is that no single technique suitably covers the minimum criteria of high coverage and high resolution at a reasonable cost. In fact, the fraction of the methylome that is studied currently depends entirely on the inherent biases of the protocol employed. There is promise for this to change, as the third generation of sequencing technologies is expected to again ‘revolutionize’ the way that we study genomes and epigenomes. PMID:21566704
McCann, Joshua C.; Wickersham, Tryon A.; Loor, Juan J.
2014-01-01
Diversity in the forestomach microbiome is one of the key features of ruminant animals. The diverse microbial community adapts to a wide array of dietary feedstuffs and management strategies. Understanding rumen microbiome composition, adaptation, and function has global implications ranging from climatology to applied animal production. Classical knowledge of rumen microbiology was based on anaerobic, culture-dependent methods. Next-generation sequencing and other molecular techniques have uncovered novel features of the rumen microbiome. For instance, pyrosequencing of the 16S ribosomal RNA gene has revealed the taxonomic identity of bacteria and archaea to the genus level, and when complemented with barcoding adds multiple samples to a single run. Whole genome shotgun sequencing generates true metagenomic sequences to predict the functional capability of a microbiome, and can also be used to construct genomes of isolated organisms. Integration of high-throughput data describing the rumen microbiome with classic fermentation and animal performance parameters has produced meaningful advances and opened additional areas for study. In this review, we highlight recent studies of the rumen microbiome in the context of cattle production focusing on nutrition, rumen development, animal efficiency, and microbial function. PMID:24940050
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dumeige, Yannick
We theoretically analyze the second-harmonic generation process in a sequence of unidirectionally coupled doubly resonant whispering gallery mode semiconductor resonators. By using a convenient design, it is possible to coherently sum the second-harmonic fields generated inside each resonator. We show that resonator coupling allows the bandwidth of the phase-matching curve to be increased with respect to single-resonator configurations while simultaneously taking advantage of the resonant feature of the resonators. This quasi-phase-matching technique could be applied to obtain small-footprint nonlinear devices with large bandwidth and limited nonlinear losses. The results are discussed in the framework of the slow-light-effect enhancement of second-order optical nonlinearities.
WONOEP appraisal: new genetic approaches to study epilepsy
Rossignol, Elsa; Kobow, Katja; Simonato, Michele; Loeb, Jeffrey A.; Grisar, Thierry; Gilby, Krista L.; Vinet, Jonathan; Kadam, Shilpa D.; Becker, Albert J.
2014-01-01
Objective: New genetic investigation techniques, including next-generation sequencing, epigenetic profiling, cell lineage mapping, targeted genetic manipulation of specific neuronal cell types, stem cell reprogramming and optogenetic manipulations within epileptic networks, are progressively unravelling the mysteries of epileptogenesis and ictogenesis. These techniques have opened new avenues to discover the molecular basis of epileptogenesis and to study the physiological impacts of mutations in epilepsy-associated genes on a multilayer level, from cells to circuits. Methods: This manuscript reviews recently published applications of these new genetic technologies in the study of epilepsy, as well as work presented by the authors at the genetic session of the XII Workshop on the Neurobiology of Epilepsy in Quebec, Canada. Results: Next-generation sequencing is providing investigators with an unbiased means to assess the molecular causes of sporadic forms of epilepsy and has revealed the complexity and genetic heterogeneity of sporadic epilepsy disorders. To assess the functional impact of mutations in these newly identified genes on specific neuronal cell-types during brain development, new modeling strategies in animals, including conditional genetics in mice and in utero knockdown approaches, are enabling functional validation with exquisite cell-type and temporal specificity. In addition, optogenetics, using cell-type specific Cre recombinase driver lines, is enabling investigators to dissect networks involved in epilepsy. Genetically-encoded cell-type labeling is also providing new means to assess the role of the non-neuronal components of epileptic networks such as glial cells.
Furthermore, beyond its role in revealing coding variants involved in epileptogenesis, next-generation sequencing can be used to assess the epigenetic modifications that lead to sustained network hyperexcitability in epilepsy, including methylation changes in gene promoters and non-coding RNAs involved in modifying gene expression following seizures. In addition, genetically-based bioluminescent reporters are providing new opportunities to assess neuronal activity and neurotransmitter levels both in vitro and in vivo in the context of epilepsy. Finally, genetically rederived neurons generated from patient iPS cells and genetically-modified zebrafish have become high-throughput means to investigate disease mechanisms and potential new therapies. Significance: Genetics has considerably changed the field of epilepsy research and is paving the way for better diagnosis and therapies for patients with epilepsy. PMID:24965021
Chen, Bo-Ruei; Hale, Devin C; Ciolek, Peter J; Runge, Kurt W
2012-05-03
Barcodes are unique DNA sequence tags that can be used to specifically label individual mutants. The barcode-tagged open reading frame (ORF) haploid deletion mutant collections in the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe allow for high-throughput mutant phenotyping because the relative growth of mutants in a population can be determined by monitoring the proportions of their associated barcodes. While these mutant collections have greatly facilitated genome-wide studies, mutations in essential genes are not present, and the roles of these genes are not as easily studied. To further support genome-scale research in S. pombe, we generated a barcode-tagged fission yeast insertion mutant library that has the potential of generating viable mutations in both essential and non-essential genes and can be easily analyzed using standard molecular biological techniques. An insertion vector containing a selectable ura4+ marker and a random barcode was used to generate a collection of 10,000 fission yeast insertion mutants stored individually in 384-well plates and as six pools of mixed mutants. Individual barcodes are flanked by Sfi I recognition sites and can be oligomerized in a unique orientation to facilitate barcode sequencing. Independent genetic screens on a subset of mutants suggest that this library contains a diverse collection of single insertion mutations. We present several approaches to determine insertion sites. This collection of S. pombe barcode-tagged insertion mutants is well-suited for genome-wide studies. Because insertion mutations may eliminate, reduce or alter the function of essential and non-essential genes, this library will contain strains with a wide range of phenotypes that can be assayed by their associated barcodes. The design of the barcodes in this library allows for barcode sequencing using next generation or standard benchtop cloning approaches.
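The idea of reading relative mutant growth from barcode proportions can be sketched as below. The function names, the pseudocount and the use of a log2 ratio are illustrative assumptions, not the analysis used by the authors.

```python
import math
from collections import Counter


def barcode_proportions(barcode_reads):
    """Return the fraction of reads carrying each barcode.

    `barcode_reads` is an iterable of barcode strings, one per read.
    """
    counts = Counter(barcode_reads)
    total = sum(counts.values())
    return {barcode: n / total for barcode, n in counts.items()}


def log2_fold_change(before, after, pseudocount=1e-6):
    """Compare barcode proportions between two time points.

    Positive values indicate a mutant that grew relative to the pool;
    negative values indicate one that was depleted. The pseudocount
    (an assumption) avoids division by zero for barcodes that vanish.
    """
    barcodes = set(before) | set(after)
    return {
        bc: math.log2(
            (after.get(bc, 0.0) + pseudocount)
            / (before.get(bc, 0.0) + pseudocount)
        )
        for bc in barcodes
    }
```

For example, a barcode that rises from half to three quarters of the reads in a pool gets a positive score, while the barcode it displaced gets a negative one.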
Recently Patented Viral Nucleotide Sequences and Generation of Virus-Derived Vaccines.
Venkataraman, Srividhya; Ahmad, Tauqeer; Haidar, Mounir A; Hefferon, Kathleen L
2017-01-01
With an increase in comprehension of the molecular biology of viruses, there has been a recent surge in the application of virus sequences and viral gene expression strategies towards the diagnosis and treatment of diseases. The scope of the patenting landscape has widened as a result and the current review discusses patents pertaining to live / attenuated viral vaccines. The vaccines addressed here have been developed by both conventional means as well as by the state-of-the-art genetic engineering techniques. This review also addresses the applications of these patents for clinical and biotechnological purposes. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
ChIP-seq and RNA-seq methods to study circadian control of transcription in mammals
Takahashi, Joseph S.; Kumar, Vivek; Nakashe, Prachi; Koike, Nobuya; Huang, Hung-Chung; Green, Carla B.; Kim, Tae-Kyung
2015-01-01
Genome-wide analyses have revolutionized our ability to study the transcriptional regulation of circadian rhythms. The advent of next-generation sequencing methods has facilitated the use of two such technologies, ChIP-seq and RNA-seq. In this chapter, we describe detailed methods and protocols for these two techniques, with emphasis on their usage in circadian rhythm experiments in the mouse liver, a major target organ of the circadian clock system. Critical factors for these methods are highlighted and issues arising with time series samples for ChIP-seq and RNA-seq are discussed. Finally detailed protocols for library preparation suitable for Illumina sequencing platforms are presented. PMID:25662462
Scalable Parallel Methods for Analyzing Metagenomics Data at Extreme Scale
DOE Office of Scientific and Technical Information (OSTI.GOV)
Daily, Jeffrey A.
2015-05-01
The field of bioinformatics and computational biology is currently experiencing a data revolution. The exciting prospect of making fundamental biological discoveries is fueling the rapid development and deployment of numerous cost-effective, high-throughput next-generation sequencing technologies. The result is that the DNA and protein sequence repositories are being bombarded with new sequence information. Databases are continuing to report a Moore’s law-like growth trajectory in their database sizes, roughly doubling every 18 months. In what seems to be a paradigm-shift, individual projects are now capable of generating billions of raw sequence data that need to be analyzed in the presence of already annotated sequence information. While it is clear that data-driven methods, such as sequencing homology detection, are becoming the mainstay in the field of computational life sciences, the algorithmic advancements essential for implementing complex data analytics at scale have mostly lagged behind. Sequence homology detection is central to a number of bioinformatics applications including genome sequencing and protein family characterization. Given millions of sequences, the goal is to identify all pairs of sequences that are highly similar (or “homologous”) on the basis of alignment criteria. While there are optimal alignment algorithms to compute pairwise homology, their deployment for large-scale is currently not feasible; instead, heuristic methods are used at the expense of quality. In this dissertation, we present the design and evaluation of a parallel implementation for conducting optimal homology detection on distributed memory supercomputers. Our approach uses a combination of techniques from asynchronous load balancing (viz. work stealing, dynamic task counters), data replication, and exact-matching filters to achieve homology detection at scale.
Results for a collection of 2.56M sequences show parallel efficiencies of ~75-100% on up to 8K cores, representing a time-to-solution of 33 seconds. We extend this work with a detailed analysis of single-node sequence alignment performance using the latest CPU vector instruction set extensions. Preliminary results reveal that current sequence alignment algorithms are unable to fully utilize widening vector registers.
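As a point of reference for what "optimal" pairwise alignment involves, the following is a minimal sketch of the Smith-Waterman local alignment score. The scoring values and the linear gap penalty are assumptions chosen for illustration; the dissertation's parallel, vectorized implementation is not reproduced here.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Uses the standard dynamic-programming recurrence with a linear gap
    penalty and keeps only two rows of the matrix in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                           # start a new local alignment
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best
```

The cost is proportional to the product of the two sequence lengths for every pair, which is why all-against-all comparison of millions of sequences calls for filtering and parallel execution.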
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guillermin, M.; Colombier, J. P.; Audouard, E.
2010-07-15
With an interest in pulsed laser deposition and remote spectroscopy techniques, we explore here the potential of laser pulses temporally tailored on ultrafast time scales to control the expansion and the excitation degree of various ablation products including atomic species and nanoparticulates. Taking advantage of automated pulse-shaping techniques, an adaptive procedure based on spectroscopic feedback is applied to regulate the irradiance and enhance the optical emission of monocharged aluminum ions with respect to the neutral signal. This leads to optimized pulses usually consisting of a series of femtosecond peaks distributed on a longer picosecond sequence. The ablation features induced by the optimized pulse are compared with those determined by picosecond pulses generated by imposed second-order dispersion or by double pulse sequences with adjustable picosecond separation. This allows us to analyze the influence of fast- and slow-varying envelope features on the material heating and the resulting plasma excitation degree. Using various optimal pulse forms including designed asymmetric shapes, we analyze the establishment of surface pre-excitation that enables conditions of enhanced radiation coupling. Thin films elaborated by unshaped femtosecond laser pulses and by optimized, stretched, or double pulse sequences are compared, indicating that the nanoparticle generation efficiency is strongly influenced by the temporal shaping of the laser irradiation. A thermodynamic scenario involving supercritical heating is proposed to explain enhanced ionization rates and lower particulate density for optimal pulses. Numerical one-dimensional hydrodynamic simulations for the excited matter support the interpretation of the experimental results in terms of relative efficiency of various relaxation paths for excited matter above or below the thermodynamic stability limits.
The calculation results underline the role of the temperature and density gradients along the ablated plasma plume, which lead to spatially distinct locations of excited species. Moreover, the nanoparticle sizes are computed based on liquid layer ejection followed by a Rayleigh and Taylor instability decomposition, in good agreement with the experimental findings.
Continuous all-optical deceleration of molecular beams
NASA Astrophysics Data System (ADS)
Jayich, Andrew; Chen, Gary; Long, Xueping; Wang, Anna; Campbell, Wesley
2014-05-01
A significant impediment to generating ultracold molecules is slowing a molecular beam to velocities where the molecules can be cooled and trapped. We report on progress toward addressing this issue with a general optical deceleration technique for molecular and atomic beams. We propose addressing the molecular beam with a pump and dump pulse sequence from a mode-locked laser. The pump pulse counter-propagates with respect to the beam and drives the molecules to the excited state. The dump pulse co-propagates and stimulates emission, driving the molecules back to the ground state. This cycle transfers 2 ℏk of momentum and can generate very large optical forces, not limited by the spontaneous emission lifetime of the molecule or atom. Importantly, avoiding spontaneous emission limits the branching to dark states. This technique can later be augmented with cooling and trapping. We are working towards demonstrating this optical force by accelerating a cold atomic sample.
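As a rough worked estimate of what a transfer of 2 ℏk per pump-dump cycle implies, one can write the following. The symbols m (particle mass), λ (laser wavelength) and v_0 (initial beam velocity) are introduced here for illustration and do not appear in the abstract; the estimate assumes every cycle transfers the full 2 ℏk and ignores Doppler shifts and imperfect pulse efficiency.

$$\Delta v = \frac{2\hbar k}{m} = \frac{2h}{m\lambda}, \qquad N \approx \frac{m v_0}{2\hbar k} = \frac{m v_0 \lambda}{2h}$$

Here k = 2π/λ is the optical wavenumber, Δv is the velocity change per cycle, and N is the approximate number of cycles needed to bring a particle of initial velocity v_0 to rest.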
Dimension reduction techniques for the integrative analysis of multi-omics data
Zeleznik, Oana A.; Thallinger, Gerhard G.; Kuster, Bernhard; Gholami, Amin M.
2016-01-01
State-of-the-art next-generation sequencing, transcriptomics, proteomics and other high-throughput ‘omics' technologies enable the efficient generation of large experimental data sets. These data may yield unprecedented knowledge about molecular pathways in cells and their role in disease. Dimension reduction approaches have been widely used in exploratory analysis of single omics data sets. This review will focus on dimension reduction approaches for simultaneous exploratory analyses of multiple data sets. These methods extract the linear relationships that best explain the correlated structure across data sets, the variability both within and between variables (or observations) and may highlight data issues such as batch effects or outliers. We explore dimension reduction techniques as one of the emerging approaches for data integration, and how these can be applied to increase our understanding of biological systems in normal physiological function and disease. PMID:26969681
How B-Cell Receptor Repertoire Sequencing Can Be Enriched with Structural Antibody Data
Kovaltsuk, Aleksandr; Krawczyk, Konrad; Galson, Jacob D.; Kelly, Dominic F.; Deane, Charlotte M.; Trück, Johannes
2017-01-01
Next-generation sequencing of immunoglobulin gene repertoires (Ig-seq) allows the investigation of large-scale antibody dynamics at a sequence level. However, structural information, a crucial descriptor of antibody binding capability, is not collected in Ig-seq protocols. Developing systematic relationships between the antibody sequence information gathered from Ig-seq and low-throughput techniques such as X-ray crystallography could radically improve our understanding of antibodies. The mapping of Ig-seq datasets to known antibody structures can indicate structurally, and perhaps functionally, uncharted areas. Furthermore, contrasting naïve and antigenically challenged datasets using structural antibody descriptors should provide insights into antibody maturation. As the number of antibody structures steadily increases and more and more Ig-seq datasets become available, the opportunities that arise from combining the two types of information increase as well. Here, we review how these data types enrich one another and show potential for advancing our knowledge of the immune system and improving antibody engineering. PMID:29276518
Sequencing ebola and marburg viruses genomes using microarrays.
Hardick, Justin; Woelfel, Roman; Gardner, Warren; Ibrahim, Sofi
2016-08-01
Periodic outbreaks of Ebola and Marburg hemorrhagic fevers have occurred in Africa over the past four decades with case fatality rates reaching as high as 90%. The latest Ebola outbreak in West Africa in 2014 raised concerns that these infections can spread across continents and pose serious health risks. Early and accurate identification of the causative agents is necessary to contain outbreaks. In this report, we describe a sequencing-by-hybridization (SBH) technique using high density microarrays to identify Ebola and Marburg viruses. The microarrays were designed to interrogate the sequences of entire viral genomes, and were evaluated with three species of Ebolavirus (Reston, Sudan, and Zaire), and three strains of Marburgvirus (Angola, Musoke, and Ravn). The results showed that the consensus sequences generated with four or more hybridizations had 92.1-98.9% accuracy over 95-99% of the genomes. Additionally, with SBH microarrays it was possible to distinguish between different strains of the Lake Victoria Marburgvirus. J. Med. Virol. 88:1303-1308, 2016. © 2016 Wiley Periodicals, Inc.
DNA-based techniques for authentication of processed food and food supplements.
Lo, Yat-Tung; Shaw, Pang-Chui
2018-02-01
Authentication of food or food supplements with medicinal values is important to avoid adverse toxic effects, protect consumer rights, and support certification. Compared to morphological and spectrometric techniques, molecular authentication is found to be accurate, sensitive and reliable. However, DNA degradation and inclusion of inhibitors may lead to failure in PCR amplification. This paper reviews the existing DNA extraction and PCR protocols, and the use of small size DNA markers with sufficient discriminative power for molecular authentication. Various emerging molecular techniques, such as isothermal amplification for on-site diagnosis, next-generation sequencing for high-throughput species identification, high resolution melting analysis for quick species differentiation, and DNA array techniques for rapid detection and quantitative determination in food products, are also discussed. Copyright © 2017 Elsevier Ltd. All rights reserved.
Image encryption using random sequence generated from generalized information domain
NASA Astrophysics Data System (ADS)
Xia-Yan, Zhang; Guo-Ji, Zhang; Xuan, Li; Ya-Zhou, Ren; Jie-Hua, Wu
2016-05-01
A novel image encryption method based on the random sequence generated from the generalized information domain and permutation-diffusion architecture is proposed. The random sequence is generated by reconstruction from the generalized information file and discrete trajectory extraction from the data stream. The trajectory address sequence is used to generate a P-box to shuffle the plain image while random sequences are treated as keystreams. A new factor called drift factor is employed to accelerate and enhance the performance of the random sequence generator. An initial value is introduced to make the encryption method an approximately one-time pad. Experimental results show that the random sequences pass the NIST statistical test with a high ratio and extensive analysis demonstrates that the new encryption scheme has superior security.
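The permutation-diffusion architecture named in the abstract can be sketched in a few lines. Python's `random.Random` is used here purely as a stand-in for the paper's random sequence generator: it is not cryptographically secure, and the seed, function names and chaining rule are assumptions made for illustration only.

```python
import random


def _permutation_and_keystream(n, seed):
    """Derive a position permutation (P-box) and a byte keystream.

    Stand-in generator: random.Random is NOT secure and is not the
    generalized-information-domain generator described in the abstract.
    """
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)
    keystream = [rng.randrange(256) for _ in range(n)]
    return perm, keystream


def encrypt(pixels, seed):
    """Shuffle pixel positions, then XOR with keystream and previous cipher byte."""
    perm, keystream = _permutation_and_keystream(len(pixels), seed)
    shuffled = [pixels[p] for p in perm]       # permutation stage
    cipher, prev = [], 0
    for value, key in zip(shuffled, keystream):
        c = value ^ key ^ prev                 # diffusion stage (chained XOR)
        cipher.append(c)
        prev = c
    return cipher


def decrypt(cipher, seed):
    """Invert the diffusion stage, then undo the permutation."""
    perm, keystream = _permutation_and_keystream(len(cipher), seed)
    shuffled, prev = [], 0
    for c, key in zip(cipher, keystream):
        shuffled.append(c ^ key ^ prev)
        prev = c
    pixels = [0] * len(cipher)
    for i, p in enumerate(perm):
        pixels[p] = shuffled[i]
    return pixels
```

Pixels are treated as a flat list of 8-bit values; chaining each cipher byte into the next is one common way to make a single changed pixel affect all later output.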
Converting from DDOR SASF to APF
NASA Technical Reports Server (NTRS)
Gladden, Roy E.; Khanampompan, Teerapat; Fisher, Forest W.
2008-01-01
A computer program called ddor_sasf2apf converts delta-door (delta differential one-way range) requests from an SASF (spacecraft activity sequence file) format to an APF (apgen plan file) format for use in the Mars Reconnaissance Orbiter (MRO) mission-planning-and-sequencing process. The APF is used as an input to APGEN/AUTOGEN in the MRO activity-planning and command-sequence-generating process to sequence the delta-door (DDOR) activity. The DDOR activity is a spacecraft tracking technique for determining spacecraft location. The input to ddor_sasf2apf is an input request SASF provided by an observation team that utilizes DDOR. ddor_sasf2apf parses this DDOR SASF input, rearranging parameters and reformatting the request to produce an APF file for use in AUTOGEN and/or APGEN. The benefit afforded by ddor_sasf2apf is to enable the use of the DDOR SASF file earlier in the planning stage of the command-sequence-generating process and to produce sequences, optimized for DDOR operations, that are more accurate and more robust than would otherwise be possible.
Nanowire-nanopore transistor sensor for DNA detection during translocation
NASA Astrophysics Data System (ADS)
Xie, Ping; Xiong, Qihua; Fang, Ying; Qing, Quan; Lieber, Charles
2011-03-01
Nanopore sequencing, a promising low-cost, high-throughput sequencing technique, was proposed more than a decade ago. Due to the incompatibility between the small ionic current signal and the fast translocation speed, and the technical difficulties of large-scale integration of nanopores for direct ionic current sequencing, alternative methods relying on integrated DNA sensors have been proposed, such as those using capacitive coupling or tunnelling current, but none of them has been experimentally demonstrated yet. Here we show, for the first time, an amplified sensor signal experimentally recorded from a nanowire-nanopore field effect transistor sensor during DNA translocation. Independent multi-channel recording was also demonstrated for the first time. Our results suggest that the signal arises from a highly localized potential change caused by DNA translocation in non-balanced buffer conditions. Given that this method may produce larger signals for smaller nanopores, we hope our experiment can be a starting point for a new generation of nanopore sequencing devices with larger signal, higher bandwidth and large-scale multiplexing capability, and finally realize the ultimate goal of low-cost, high-throughput sequencing.
Yang, Litao; Liang, Wanqi; Jiang, Lingxi; Li, Wenquan; Cao, Wei; Wilson, Zoe A; Zhang, Dabing
2008-06-04
Real-time PCR techniques are widely used for nucleic acid analysis, but one limitation of the real-time PCR methods most frequently employed today is the high cost of the labeled probe for each target molecule. We describe a real-time PCR technique employing attached universal duplex probes (AUDP), which has the advantage over current real-time PCR methods of generating fluorescence by probe hydrolysis and strand displacement. AUDP involves one set of universal duplex probes in which the 5' end of the fluorescent probe (FP) and a complementary quenching probe (QP) lie in close proximity so that fluorescence can be quenched. The PCR primer pair with attached universal template (UT) and the FP are identical to the UT sequence. We have shown that the AUDP technique can be used for detecting multiple target DNA sequences in both simplex and duplex real-time PCR assays for gene expression analysis, genotype identification, and genetically modified organism (GMO) quantification with sensitivity, reproducibility, and repeatability comparable to other real-time PCR methods. The results from gene expression analysis, genotype identification, and GMO quantification using AUDP real-time PCR assays indicate that the technique has been successfully applied in nucleic acid analysis, and that it will offer an alternative way for nucleic acid analysis with high efficiency, reliability, and flexibility at low cost.
Design of Cyclic Peptide Based Glucose Receptors and Their Application in Glucose Sensing.
Li, Chao; Chen, Xin; Zhang, Fuyuan; He, Xingxing; Fang, Guozhen; Liu, Jifeng; Wang, Shuo
2017-10-03
Glucose assay is of great scientific significance in clinical diagnostics and bioprocess monitoring, and designing new glucose receptors is necessary for the development of more sensitive, selective, and robust glucose detection techniques. Herein, a series of cyclic peptide (CP) glucose receptors were designed to mimic the binding sites of glucose binding protein (GBP), and the CPs' sequence contained amino acid sites Asp, Asn, His, Asp, and Arg, which constituted the first layer interactions of GBP. The properties of these CPs used as a glucose receptor or substitute for the GBP were studied by using a quartz crystal microbalance (QCM) technique. It was found that CPs can form a self-assembled monolayer at the Au quartz electrode surface, and the monolayer's properties were characterized by using cyclic voltammetry, electrochemical impedance spectroscopy, and atomic force microscopy. The CPs' binding affinity to saccharides (i.e., galactose, fructose, lactose, sucrose, and maltose) was investigated, and the CPs' sensitivity and selectivity toward glucose were found to be dependent upon the configuration, i.e., the amino acid sequence of the CPs. The cyclic unit with a cyclo[-CNDNHCRDNDC-] sequence gave the highest selectivity and sensitivity for glucose sensing. This work suggests that a synthetic peptide bearing a particular functional sequence could be applied for developing a new generation of glucose receptors and would find wide application in biological, life science, and clinical diagnostics fields.
Radioresistance of GGG Sequences to Prompt Strand Break Formation from Direct-Type Radiation Damage
Black, Paul J.; Miller, Adam S.; Hayes, Jeffrey J.
2016-01-01
Purpose As humans, we are constantly exposed to ionizing radiation from natural, man-made and cosmic sources which can damage DNA, leading to deleterious effects including cancer incidence. In this work we introduce a method to monitor strand breaks resulting from damage due to the direct effect of ionizing radiation and provide evidence for sequence-dependent effects leading to strand breaks. Materials and methods To analyze only DNA strand breaks caused by radiation damage due to the direct effect of ionizing radiation, we combined an established technique to generate dehydrated DNA samples with a technique to analyze single strand breaks on short oligonucleotide sequences via denaturing gel electrophoresis. Results We find that direct damage primarily results in a reduced number of strand breaks in guanine triplet regions (GGG) when compared to isolated guanine (G) bases with identical flanking base context. In addition, we observe strand break behavior possibly indicative of protection of guanine bases when flanked by pyrimidines, and sensitization of guanine to strand break when flanked by adenine (A) bases in both isolated G and GGG cases. Conclusions These observations provide insight into the strand break behavior in GGG regions damaged via the direct effect of ionizing radiation. In addition, this could be indicative of DNA sequences that are naturally more susceptible to strand break due to the direct effect of ionizing radiation. PMID:27349757
Aires-de-Sousa, João; Aires-de-Sousa, Luisa
2003-01-01
We propose representing individual positions in DNA sequences by virtual potentials generated by other bases of the same sequence. This is a compact representation of the neighbourhood of a base. The distribution of the virtual potentials over the whole sequence can be used as a representation of the entire sequence (SEQREP code). It is a flexible code, with a length independent of the sequence size, does not require previous alignment, and is convenient for processing by neural networks or statistical techniques. To evaluate its biological significance, the SEQREP code was used for training Kohonen self-organizing maps (SOMs) in two applications: (a) detection of Alu sequences, and (b) classification of sequences encoding for HIV-1 envelope glycoprotein (env) into subtypes A-G. It was demonstrated that SOMs clustered sequences belonging to different classes into distinct regions. For independent test sets, very high rates of correct predictions were obtained (97% in the first application, 91% in the second). Possible areas of application of SEQREP codes include functional genomics, phylogenetic analysis, detection of repetitions, database retrieval, and automatic alignment. Software for representing sequences by SEQREP code, and for training Kohonen SOMs is made freely available from http://www.dq.fct.unl.pt/qoa/jas/seqrep. Supplementary material is available at http://www.dq.fct.unl.pt/qoa/jas/seqrep/bioinf2002
B-MIC: An Ultrafast Three-Level Parallel Sequence Aligner Using MIC.
Cui, Yingbo; Liao, Xiangke; Zhu, Xiaoqian; Wang, Bingqiang; Peng, Shaoliang
2016-03-01
Sequence alignment, the mapping of raw sequencing data to a reference genome, is the central process of sequence analysis. The large amount of data generated by NGS is far beyond the processing capabilities of existing alignment tools. Consequently, sequence alignment becomes the bottleneck of sequence analysis. Intensive computing power is required to address this challenge. Intel recently announced the MIC coprocessor, which can provide massive computing power. The Tianhe-2, the world's fastest supercomputer, is now equipped with three MIC coprocessors per compute node. A key feature of sequence alignment is that different reads are independent. Considering this property, we proposed a MIC-oriented three-level parallelization strategy to speed up BWA, a widely used sequence alignment tool, and developed our ultrafast parallel sequence aligner: B-MIC. B-MIC contains three levels of parallelization: firstly, parallelization of data IO and read alignment by a three-stage parallel pipeline; secondly, parallelization enabled by MIC coprocessor technology; thirdly, inter-node parallelization implemented by MPI. In this paper, we demonstrate that B-MIC outperforms BWA by a combination of those techniques using an Inspur NF5280M server and the Tianhe-2 supercomputer. To the best of our knowledge, B-MIC is the first sequence alignment tool to run on Intel MIC, and it can achieve more than fivefold speedup over the original BWA while maintaining the alignment precision.
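The read-independence property mentioned in the abstract above can be illustrated with a small sketch. This is not B-MIC or BWA: the reference string, the `map_read` exact-match "aligner", and the thread pool are all hypothetical stand-ins chosen to keep the example self-contained. It only shows that, because each read is mapped without reference to any other, the work can be handed to a pool of workers and the results collected in input order.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy reference; a real aligner would use an indexed genome.
REFERENCE = "ACGTACGTTAGCCGATAGCTAGGCTA"


def map_read(read):
    """Toy stand-in for alignment: offset of the first exact match, or -1."""
    return REFERENCE.find(read)


def map_reads(reads, workers=4):
    """Map every read independently; pool.map preserves the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(map_read, reads))


reads = ["TAGCC", "GGCTA", "AAAAA"]
positions = map_reads(reads)
print(positions)
```

Threads are used here only for portability of the sketch; in CPython they would not speed up CPU-bound alignment, which is why real tools rely on native threads, coprocessors, or MPI ranks as the abstract describes.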
Display of travelling 3D scenes from single integral-imaging capture
NASA Astrophysics Data System (ADS)
Martinez-Corral, Manuel; Dorado, Adrian; Hong, Seok-Min; Sola-Pikabea, Jorge; Saavedra, Genaro
2016-06-01
Integral imaging (InI) is a 3D auto-stereoscopic technique that captures and displays 3D images. We present a method for easily projecting the information recorded with this technique by transforming the integral image into a plenoptic image, as well as choosing, at will, the field of view (FOV) and the focused plane of the displayed plenoptic image. Furthermore, with this method we can generate, from a single integral image, a sequence of images that simulates a camera travelling through the scene. Applying this method makes it possible to improve the quality of 3D display images and videos.
Distal airways in humans: dynamic hyperpolarized 3He MR imaging--feasibility
NASA Technical Reports Server (NTRS)
Tooker, Angela C.; Hong, Kwan Soo; McKinstry, Erin L.; Costello, Philip; Jolesz, Ferenc A.; Albert, Mitchell S.
2003-01-01
Dynamic hyperpolarized helium 3 (3He) magnetic resonance (MR) imaging of the human airways is achieved by using a fast gradient-echo pulse sequence during inhalation. The resulting dynamic images show differential contrast enhancement of both distal airways and the lung periphery, unlike static hyperpolarized 3He MR images on which only the lung periphery is seen. With this technique, up to seventh-generation airway branching can be visualized. Copyright RSNA, 2003.
A Method that Will Captivate U.
Martin, Sophie; Coller, Jeff
2015-09-03
In an age of next-generation sequencing, the ability to purify RNA transcripts has become a critical issue. In this issue, Duffy et al. (2015) improve on a pre-existing technique of RNA labeling and purification by 4-thiouridine tagging. By increasing the efficiency of RNA capture, this method will enhance the ability to study RNA dynamics, especially for transcripts normally inefficiently captured by previous methods. Copyright © 2015 Elsevier Inc. All rights reserved.
STITCHER: A web resource for high-throughput design of primers for overlapping PCR applications.
O'Halloran, Damien M
2015-06-01
Overlapping PCR is routinely used in a wide number of molecular applications. These include stitching PCR fragments together, generating fluorescent transcriptional and translational fusions, inserting mutations, making deletions, and PCR cloning. Overlapping PCR is also used for genotyping by traditional PCR techniques and in detection experiments using techniques such as loop-mediated isothermal amplification (LAMP). STITCHER is a web tool providing a central resource for researchers conducting all types of overlapping PCR experiments with an intuitive interface for automated primer design that's fast, easy to use, and freely available online (http://ohalloranlab.net/STITCHER.html). STITCHER can handle both single sequence and multi-sequence input, and specific features facilitate numerous other PCR applications, including assembly PCR, adapter PCR, and primer walking. Field PCR, and in particular, LAMP, offers promise as an on site tool for pathogen detection in underdeveloped areas, and STITCHER includes off-target detection features for pathogens commonly targeted using LAMP technology.
NASA Technical Reports Server (NTRS)
Olds, John R.; Marcus, Leland
2002-01-01
This paper is written in support of the on-going research into conceptual space vehicle design conducted at the Space Systems Design Laboratory (SSDL) at the Georgia Institute of Technology. Research at the SSDL follows a sequence of a number of the traditional aerospace disciplines. The sequence of disciplines and the interrelationship among them is shown in the Design Structure Matrix (DSM). The discipline of Weights and Sizing occupies a central location in the design of a new space vehicle. Weights and Sizing interacts, either in a feed-forward or feed-back manner, with every other discipline in the DSM. Because of this principal location, accuracy in Weights and Sizing is integral to producing an accurate model of a space vehicle concept. Instead of using conceptual level techniques, a simplified Finite Element Analysis (FEA) technique is described as applied to the problem of the Liquid Oxygen (LOX) tank bending loads applied to the forward Liquid Hydrogen (LH2) tank of the Georgia Tech Air Breathing Launch Vehicle (ABLV).
NASA Astrophysics Data System (ADS)
Schiffer, A.; Gardner, M. N.; Lynn, R. H.; Tagarielli, V. L.
2017-03-01
Experiments were conducted on an aqueous growth medium containing cultures of Escherichia coli (E. coli) XL1-Blue, to investigate, in a single experiment, the effect of two types of dynamic mechanical loading on cellular integrity. A bespoke shock tube was used to subject separate portions of a planktonic bacterial culture to two different loading sequences: (i) shock compression followed by cavitation, and (ii) shock compression followed by spray. The apparatus allows the generation of an adjustable loading shock wave of magnitude up to 300 MPa in a sterile laboratory environment. Cultures of E. coli were tested with this apparatus and the spread-plate technique was used to measure the survivability after mechanical loading. The loading sequence (ii) gave higher mortality than (i), suggesting that the bacteria are more vulnerable to shear deformation and cavitation than to hydrostatic compression. We present the results of preliminary experiments and suggestions for further experimental work; we discuss the potential applications of this technique to sterilize large volumes of fluid samples.
NASA Astrophysics Data System (ADS)
Both, P.; Green, A. P.; Gray, C. J.; Šardzík, R.; Voglmeir, J.; Fontana, C.; Austeri, M.; Rejzek, M.; Richardson, D.; Field, R. A.; Widmalm, G.; Flitsch, S. L.; Eyers, C. E.
2014-01-01
Mass spectrometry is the primary analytical technique used to characterize the complex oligosaccharides that decorate cell surfaces. Monosaccharide building blocks are often simple epimers, which when combined produce diastereomeric glycoconjugates indistinguishable by mass spectrometry. Structure elucidation frequently relies on assumptions that biosynthetic pathways are highly conserved. Here, we show that biosynthetic enzymes can display unexpected promiscuity, with human glycosyltransferase pp-α-GanT2 able to utilize both uridine diphosphate N-acetylglucosamine and uridine diphosphate N-acetylgalactosamine, leading to the synthesis of epimeric glycopeptides in vitro. Ion-mobility mass spectrometry (IM-MS) was used to separate these structures and, significantly, enabled characterization of the attached glycan based on the drift times of the monosaccharide product ions generated following collision-induced dissociation. Finally, ion-mobility mass spectrometry following fragmentation was used to determine the nature of both the reducing and non-reducing glycans of a series of epimeric disaccharides and the branched pentasaccharide Man3 glycan, demonstrating that this technique may prove useful for the sequencing of complex oligosaccharides.
Tahriri, Farzad; Dawal, Siti Zawiah Md; Taha, Zahari
2014-01-01
A new multiobjective dynamic fuzzy genetic algorithm is applied to solve a fuzzy mixed-model assembly line sequencing problem in which the primary goals are to minimize the total make-span and the number of setups simultaneously. Trapezoidal fuzzy numbers are used for variables such as operation and travelling time in order to generate results that are more accurate and more representative of real-case data. An improved genetic algorithm called fuzzy adaptive genetic algorithm (FAGA) is proposed in order to solve this optimization model. In establishing the FAGA, five dynamic fuzzy parameter controllers are devised in which a fuzzy expert experience controller (FEEC) is integrated with an automatic learning dynamic fuzzy controller (ALDFC) technique. The enhanced algorithm dynamically adjusts the population size, number of generations, tournament candidate, crossover rate, and mutation rate instead of using fixed control parameters. The main idea is to improve the performance and effectiveness of existing GAs by dynamic adjustment and control of the five parameters. Verification and validation of the dynamic fuzzy GA are carried out by developing test-beds and testing using a multiobjective fuzzy mixed production assembly line sequencing optimization problem. The simulation results show that the proposed optimization algorithm outperforms the standard genetic algorithm on the mixed assembly line sequencing model. PMID:24982962
Allen, Jonathan E.; Brown, Trevor S.; Gardner, Shea N.; McLoughlin, Kevin S.; Forsberg, Jonathan A.; Kirkup, Benjamin C.; Chromy, Brett A.; Luciw, Paul A.; Elster, Eric A.
2014-01-01
Combat wound healing and resolution are highly affected by the resident microbial flora. We therefore sought to achieve comprehensive detection of microbial populations in wounds using novel genomic technologies and bioinformatics analyses. We employed a microarray capable of detecting all sequenced pathogens for interrogation of 124 wound samples from extremity injuries in combat-injured U.S. service members. A subset of samples was also processed via next-generation sequencing and metagenomic analysis. Array analysis detected microbial targets in 51% of all wound samples, with Acinetobacter baumannii being the most frequently detected species. Multiple Pseudomonas species were also detected in tissue biopsy specimens. Detection of the Acinetobacter plasmid pRAY correlated significantly with wound failure, while detection of enteric-associated bacteria was associated significantly with successful healing. Whole-genome sequencing revealed broad microbial biodiversity between samples. The total wound bioburden did not associate significantly with wound outcome, although temporal shifts were observed over the course of treatment. Given that standard microbiological methods do not detect the full range of microbes in each wound, these data emphasize the importance of supplementation with molecular techniques for thorough characterization of wound-associated microbes. Future application of genomic protocols for assessing microbial content could allow application of specialized care through early and rapid identification and management of critical patterns in wound bioburden. PMID:24829242
Wang, Jinjin; Yu, Xiaomu; Zhao, Kai; Zhang, Yaoguang; Tong, Jingou; Peng, Zuogang
2012-01-01
Megalobrama pellegrini is an endemic fish species found in the upper Yangtze River basin in China. This species has become endangered due to the construction of the Three Gorges Dam and overfishing. However, the available genetic data for this species are limited. Here, we developed 26 polymorphic microsatellite markers from the M. pellegrini genome using next-generation sequencing techniques. A total of 257,497 raw reads were obtained from a quarter-plate run on the 454 GS-FLX titanium platform and 49,811 unique sequences were generated with an average length of 404 bp; 24,522 (49.2%) sequences contained microsatellite repeats. Of the 53 loci screened, 33 were amplified successfully and 26 were polymorphic. The genetic diversity in M. pellegrini was moderate, with an average of 3.08 alleles per locus, and the mean observed and expected heterozygosity were 0.47 and 0.51, respectively. In addition, we tested cross-species amplification for all 33 loci in four additional breams: M. amblycephala, M. skolkovii, M. terminalis, and Sinibrama wui. The cross-species amplification showed a significantly high level of transferability (79%–97%), which might be due to their very close genetic relationships. The polymorphic microsatellites developed in the current study will not only contribute to further conservation genetic studies and parentage analyses of this endangered species, but also facilitate future work on the other closely related species. PMID:22489139
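For readers unfamiliar with the two heterozygosity figures quoted above, the following sketch shows how such values are commonly computed for a single locus. It is an illustration only: the genotype list is made up, and the expected heterozygosity uses the simple form 1 - Σ pᵢ² without the small-sample correction that the authors may or may not have applied.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one diploid locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    Expected heterozygosity is 1 - sum(p_i^2) over allele frequencies p_i
    (uncorrected estimator; an assumption made for this sketch).
    """
    n = len(genotypes)
    observed = sum(a != b for a, b in genotypes) / n
    allele_counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    expected = 1.0 - sum((c / total) ** 2 for c in allele_counts.values())
    return observed, expected


# Hypothetical microsatellite allele sizes (bp) for four individuals.
genotypes = [(150, 150), (150, 154), (154, 158), (150, 154)]
ho, he = heterozygosity(genotypes)
print(f"Ho={ho:.3f} He={he:.3f}")
```

Averaging these per-locus values over all polymorphic loci gives study-level means of the kind reported in the abstract.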
Rapid Generation of Optimal Asteroid Powered Descent Trajectories Via Convex Optimization
NASA Technical Reports Server (NTRS)
Pinson, Robin; Lu, Ping
2015-01-01
This paper investigates a convex optimization based method that can rapidly generate the fuel optimal asteroid powered descent trajectory. The ultimate goal is to autonomously design the optimal powered descent trajectory on-board the spacecraft immediately prior to the descent burn. Compared to a planetary powered landing problem, the major difficulty is the complex gravity field near the surface of an asteroid that cannot be approximated by a constant gravity field. This paper uses relaxation techniques and a successive solution process that seeks the solution to the original nonlinear, nonconvex problem through the solutions to a sequence of convex optimal control problems.
A generative, probabilistic model of local protein structure.
Boomsma, Wouter; Mardia, Kanti V; Taylor, Charles C; Ferkinghoff-Borg, Jesper; Krogh, Anders; Hamelryck, Thomas
2008-07-01
Despite significant progress in recent years, protein structure prediction maintains its status as one of the prime unsolved problems in computational biology. One of the key remaining challenges is an efficient probabilistic exploration of the structural space that correctly reflects the relative conformational stabilities. Here, we present a fully probabilistic, continuous model of local protein structure in atomic detail. The generative model makes efficient conformational sampling possible and provides a framework for the rigorous analysis of local sequence-structure correlations in the native state. Our method represents a significant theoretical and practical improvement over the widely used fragment assembly technique by avoiding the drawbacks associated with a discrete and nonprobabilistic approach.
RNActive® Technology: Generation and Testing of Stable and Immunogenic mRNA Vaccines.
Rauch, Susanne; Lutz, Johannes; Kowalczyk, Aleksandra; Schlake, Thomas; Heidenreich, Regina
2017-01-01
Developing effective mRNA vaccines poses certain challenges concerning mRNA stability and ability to induce sufficient immune stimulation and requires a specific panel of techniques for production and testing. Here, we describe the production of stabilized mRNA with enhanced immunogenicity, generated using conventional nucleotides only, by introducing changes to the mRNA sequence and by complexation with the nucleotide-binding peptide protamine (RNActive® technology). Methods described here include the synthesis, purification, and protamine complexation of mRNA vaccines as well as a comprehensive panel of in vitro and in vivo methods for evaluation of vaccine quality and immunogenicity.
Nuel, Gregory; Regad, Leslie; Martin, Juliette; Camproux, Anne-Claude
2010-01-26
Background In bioinformatics it is common to search for a pattern of interest in a potentially large set of rather short sequences (upstream gene regions, proteins, exons, etc.). Although many methodological approaches allow practitioners to compute the distribution of a pattern count in a random sequence generated by a Markov source, no specific developments have taken into account the counting of occurrences in a set of independent sequences. We aim to address this problem by deriving efficient approaches and algorithms to perform these computations both for low and high complexity patterns in the framework of homogeneous or heterogeneous Markov models. Results The latest advances in the field allowed us to use a technique of optimal Markov chain embedding based on deterministic finite automata to introduce three innovative algorithms. Algorithm 1 is the only one able to deal with heterogeneous models. It also permits to avoid any product of convolution of the pattern distribution in individual sequences. When working with homogeneous models, Algorithm 2 yields a dramatic reduction in the complexity by taking advantage of previous computations to obtain moment generating functions efficiently. In the particular case of low or moderate complexity patterns, Algorithm 3 exploits power computation and binary decomposition to further reduce the time complexity to a logarithmic scale. All these algorithms and their relative interest in comparison with existing ones were then tested and discussed on a toy-example and three biological data sets: structural patterns in protein loop structures, PROSITE signatures in a bacterial proteome, and transcription factors in upstream gene regions. On these data sets, we also compared our exact approaches to the tempting approximation that consists in concatenating the sequences in the data set into a single sequence.
Conclusions Our algorithms prove to be effective and able to handle real data sets with multiple sequences, as well as biological patterns of interest, even when the latter display a high complexity (PROSITE signatures for example). In addition, these exact algorithms allow us to avoid the edge effect observed under the single sequence approximation, which leads to erroneous results, especially when the marginal distribution of the model displays a slow convergence toward the stationary distribution. We end up with a discussion on our method and on its potential improvements. PMID:20205909
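The idea of Markov chain embedding through a deterministic finite automaton can be made concrete with a small sketch. It is not one of the three algorithms from the abstract above; it is a naive baseline under stated assumptions: the pattern is a single word, occurrences may overlap, the model is a homogeneous first-order Markov chain, and the counts of independent sequences are combined by plain convolution (the very step Algorithm 1 is said to avoid). All function names are hypothetical.

```python
from collections import defaultdict


def kmp_automaton(word, alphabet):
    """DFA for a single word: delta[q][a] is the length of the longest
    prefix of `word` that is a suffix of word[:q] + a."""
    m = len(word)
    delta = [dict() for _ in range(m + 1)]
    for q in range(m + 1):
        for a in alphabet:
            s = word[:q] + a
            k = min(m, len(s))
            while k > 0 and s[-k:] != word[:k]:
                k -= 1
            delta[q][a] = k
    return delta


def count_distribution(word, length, init, trans):
    """Exact distribution of the number of (overlapping) occurrences of
    `word` in a first-order Markov sequence of the given length (>= 1).
    init[a] is P(X1 = a); trans[a][b] is P(next = b | current = a)."""
    alphabet = sorted(init)
    delta = kmp_automaton(word, alphabet)
    m = len(word)
    # Embedded chain state: (last letter, DFA state, occurrences so far).
    dist = defaultdict(float)
    for a in alphabet:
        q = delta[0][a]
        dist[(a, q, int(q == m))] += init[a]
    for _ in range(length - 1):
        nxt = defaultdict(float)
        for (a, q, k), p in dist.items():
            for b in alphabet:
                q2 = delta[q][b]
                nxt[(b, q2, k + int(q2 == m))] += p * trans[a][b]
        dist = nxt
    counts = defaultdict(float)
    for (_, _, k), p in dist.items():
        counts[k] += p
    return [counts.get(k, 0.0) for k in range(max(counts) + 1)]


def convolve(p, q):
    """Distribution of the sum of two independent counts."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


# Toy example: word "aa" over {a, b} with uniform letters, two sequences.
init = {"a": 0.5, "b": 0.5}
trans = {"a": {"a": 0.5, "b": 0.5}, "b": {"a": 0.5, "b": 0.5}}
single = count_distribution("aa", 3, init, trans)
total = convolve(single, count_distribution("aa", 4, init, trans))
print(single)
```

Because each sequence restarts the automaton from its initial state, no occurrence can straddle two sequences, which is the edge effect that concatenating the data set into a single sequence would introduce.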
Design and implementation of an optical Gaussian noise generator
NASA Astrophysics Data System (ADS)
Zão, Leonardo; Loss, Gustavo; Coelho, Rosângela
2009-08-01
A design of a fast and accurate optical Gaussian noise generator is proposed and demonstrated. The noise sample generation is based on the Box-Muller algorithm. The functions were implemented on a high-speed Altera Stratix EP1S25 field-programmable gate array (FPGA) development kit, which enabled the generation of 150 million 16-bit noise samples per second. The Gaussian noise generator required only 7.4% of the FPGA logic elements, 1.2% of the RAM memory, 0.04% of the ROM memory, and a laser source. The optical pulses were generated by a laser source externally modulated by the data bit samples using the frequency-shift keying technique. The accuracy of the noise samples was evaluated for different sequence sizes and confidence intervals. The noise sample pattern was validated by the Bhattacharyya distance (Bd) and the autocorrelation function. The results showed that the proposed design of the optical Gaussian noise generator is very promising for evaluating the performance of optical communications channels with very low bit-error-rate values.
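As background for the abstract above, the Box-Muller algorithm turns pairs of uniform random numbers into pairs of independent standard-normal samples. The sketch below is a plain floating-point software version for illustration; it says nothing about the fixed-point FPGA implementation in the paper, and the function name and seed are arbitrary choices.

```python
import math
import random


def box_muller(n, seed=0):
    """Return n standard-normal samples via the basic Box-Muller transform."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        u1 = 1.0 - rng.random()  # shift to (0, 1] so log(u1) is defined
        u2 = rng.random()
        radius = math.sqrt(-2.0 * math.log(u1))
        angle = 2.0 * math.pi * u2
        # Each pair of uniforms yields two independent Gaussian samples.
        samples.append(radius * math.cos(angle))
        samples.append(radius * math.sin(angle))
    return samples[:n]


samples = box_muller(100_000)
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(f"mean={mean:.3f} var={var:.3f}")
```

The sample mean and variance should land close to 0 and 1; hardware designs typically replace the logarithm, square root, and trigonometric functions with lookup tables or piecewise approximations.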
Ancient DNA sequence revealed by error-correcting codes.
Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo
2015-07-10
A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228
Electricity generation and microbial community analysis of alcohol powered microbial fuel cells.
Kim, Jung Rae; Jung, Sok Hee; Regan, John M; Logan, Bruce E
2007-09-01
Two different microbial fuel cell (MFC) configurations were investigated for electricity production from ethanol and methanol: a two-chambered, aqueous-cathode MFC; and a single-chamber direct-air cathode MFC. Electricity was generated in the two-chamber system at a maximum power density typical of this system (40 ± 2 mW/m2) and a Coulombic efficiency (CE) ranging from 42% to 61% using ethanol. When bacteria were transferred into a single-chamber MFC known to produce higher power densities with different substrates, the maximum power density increased to 488 ± 12 mW/m2 (CE = 10%) with ethanol. The voltage generated exhibited saturation kinetics as a function of ethanol concentration in the two-chambered MFC, with a half-saturation constant (Ks) of 4.86 mM. Methanol was also examined as a possible substrate, but it did not result in appreciable electricity generation. Analysis of the anode biofilm and suspension from a two-chamber MFC with ethanol using 16S rDNA-based techniques indicated that bacteria with sequences similar to Proteobacterium Core-1 (33.3% of clone library sequences), Azoarcus sp. (17.4%), and Desulfuromonas sp. M76 (15.9%) were significant members of the anode chamber community. These results indicate that ethanol can be used for sustained electricity generation at room temperature using bacteria on the anode in an MFC.
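The saturation kinetics mentioned above can be sketched with the usual Monod-type form V = Vmax · S / (Ks + S). Only Ks = 4.86 mM comes from the abstract; the functional form is the conventional one assumed here, and the maximum voltage `v_max` is a hypothetical placeholder, since the abstract does not report it.

```python
def saturation_voltage(conc_mM, v_max, ks_mM=4.86):
    """Monod-type saturation curve: voltage as a function of ethanol
    concentration. ks_mM is the half-saturation constant from the
    abstract; v_max is an assumed, illustrative maximum voltage."""
    return v_max * conc_mM / (ks_mM + conc_mM)


v_max = 1.0  # normalized; hypothetical value
for conc in (1.0, 4.86, 20.0, 100.0):
    frac = saturation_voltage(conc, v_max)
    print(f"{conc:6.2f} mM -> {frac:.2f} of v_max")
```

By construction the curve reaches half of `v_max` at a concentration equal to Ks and flattens out at higher concentrations, which is what "saturation" refers to.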
Rapid response sensor for analyzing Special Nuclear Material
Mitra, S. S.; Doron, O.; Chen, A. X.; ...
2015-06-18
Rapid in-situ analytical techniques are attractive for characterizing Special Nuclear Material (SNM). Present techniques are time consuming and require sample dissolution. Proof-of-principle studies are performed to demonstrate the utility of employing low energy neutrons from a portable pulsed neutron generator for non-destructive isotopic analysis of nuclear material. In particular, time-sequenced data acquisition, operating synchronously with the pulsing of a neutron generator, partitions the characteristic elemental prompt gamma-rays according to the type of the reaction: inelastic neutron scattering reactions during the ON state and thermal neutron capture reactions during the OFF state of the generator. Thus, the key challenge is isolating these signature gamma-rays from the prompt fission and β-delayed gamma-rays that are also produced during the neutron interrogation. A commercial digital multi-channel analyzer has been specially customized to enable time-resolved gamma-ray spectral data to be acquired in multiple user-defined time bins within each of the ON/OFF gate periods of the neutron generator. Preliminary results on new signatures from depleted uranium as well as modeling and benchmarking of the concept are presented; however, this approach should be applicable for virtually all forms of SNM.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Torella, JP; Lienert, F; Boehm, CR
2014-08-07
Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies (for example, repeated terminator and insulator sequences) that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.
Torella, Joseph P.; Lienert, Florian; Boehm, Christian R.; Chen, Jan-Hung; Way, Jeffrey C.; Silver, Pamela A.
2016-01-01
Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts and hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies — for example repeated terminator and insulator sequences — that complicate recombination-based assembly. We and others have recently developed DNA assembly methods that we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly-assembled constructs, or into high-quality combinatorial libraries in only 2–3 days. If the DNA parts must be generated from scratch, an additional 2–5 days are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques. PMID:25101822
Utturkar, Sagar M.; Klingeman, Dawn Marie; Land, Miriam L.; ...
2014-06-14
Our motivation with this work was to assess the potential of different types of sequence data combined with de novo and hybrid assembly approaches to improve existing draft genome sequences. Illumina, 454 and PacBio sequencing technologies were used to generate de novo and hybrid genome assemblies for four different bacteria, which were assessed for quality using summary statistics (e.g. number of contigs, N50) and in silico evaluation tools. Differences in predictions of multiple copies of rDNA operons for each respective bacterium were evaluated by PCR and Sanger sequencing, and then the validated results were applied as an additional criterion to rank assemblies. In general, assemblies using longer PacBio reads were better able to resolve repetitive regions. In this study, the combination of Illumina and PacBio sequence data assembled through the ALLPATHS-LG algorithm gave the best summary statistics and most accurate rDNA operon number predictions. This study will aid others looking to improve existing draft genome assemblies. As to availability and implementation, all assembly tools except CLC Genomics Workbench are freely available under GNU General Public License.
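The N50 summary statistic mentioned in this abstract can be sketched as follows. This uses the common definition (the length of the contig at which the cumulative sum of contig lengths, taken from longest to shortest, first reaches half of the total assembly length); the function name `n50` is hypothetical and the sketch is not taken from the study's own tooling.

```python
# Minimal sketch of the N50 statistic under its common definition.
# The function name and interface are illustrative assumptions.
def n50(contig_lengths):
    """Return the N50 of an iterable of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length
```

A higher N50 generally indicates a more contiguous assembly, which is why it is often reported alongside the number of contigs.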
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aw, Tiong Gim; Howe, Adina; Rose, Joan B.
2014-12-01
Genomic-based molecular techniques are emerging as powerful tools that allow a comprehensive characterization of water and wastewater microbiomes. Most recently, next generation sequencing (NGS) technologies which produce large amounts of sequence data are beginning to impact the field of environmental virology. In this study, NGS and bioinformatics have been employed for the direct detection and characterization of viruses in wastewater and of viruses isolated after cell culture. Viral particles were concentrated and purified from sewage samples by polyethylene glycol precipitation. Viral nucleic acid was extracted and randomly amplified prior to sequencing using Illumina technology, yielding a total of 18 million sequence reads. Most of the viral sequences detected could not be characterized, indicating the great viral diversity that is yet to be discovered. This sewage virome was dominated by bacteriophages and contained sequences related to known human pathogenic viruses such as adenoviruses (species B, C and F), polyomaviruses JC and BK and enteroviruses (type B). An array of other animal viruses was also found, suggesting unknown zoonotic viruses. This study demonstrated the feasibility of metagenomic approaches to characterize viruses in complex environmental water samples.
JavaScript DNA translator: DNA-aligned protein translations.
Perry, William L
2002-12-01
There are many instances in molecular biology when it is necessary to identify ORFs in a DNA sequence. While programs exist for displaying protein translations in multiple ORFs in alignment with a DNA sequence, they are often expensive, exist as add-ons to software that must be purchased, or are only compatible with a particular operating system. JavaScript DNA Translator is a shareware application written in JavaScript, a scripting language interpreted by the Netscape Communicator and Internet Explorer Web browsers, which makes it compatible with several different operating systems. While the program uses a familiar Web page interface, it requires no connection to the Internet since calculations are performed on the user's own computer. The program analyzes one or multiple DNA sequences and generates translations in up to six reading frames aligned to a DNA sequence, in addition to displaying translations as separate sequences in FASTA format. ORFs within a reading frame can also be displayed as separate sequences. Flexible formatting options are provided, including the ability to hide ORFs below a minimum size specified by the user. The program is available free of charge at the BioTechniques Software Library (www.Biotechniques.com).
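The six-reading-frame translation described in this abstract can be sketched in a few lines. This is an illustrative Python sketch using the standard genetic code, not the JavaScript DNA Translator's own code; the function names and the frame labels (`+1` to `-3`) are assumptions made for the example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translations(dna: str) -> dict:
    """Return translations for frames +1..+3 and -1..-3 (labels are illustrative)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

For example, `six_frame_translations("ATGGCCTAA")["+1"]` gives `"MA*"`; ORFs could then be found by splitting each frame's translation at the `*` stop symbols.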
RNA inverse folding using Monte Carlo tree search.
Yang, Xiufeng; Yoshizoe, Kazuki; Taneda, Akito; Tsuda, Koji
2017-11-06
Artificially synthesized RNA molecules provide important ways for creating a variety of novel functional molecules. State-of-the-art RNA inverse folding algorithms can design simple and short RNA sequences of specific GC content, that fold into the target RNA structure. However, their performance is not satisfactory in complicated cases. We present a new inverse folding algorithm called MCTS-RNA, which uses Monte Carlo tree search (MCTS), a technique that has shown exceptional performance in Computer Go recently, to represent and discover the essential part of the sequence space. To obtain high accuracy, initial sequences generated by MCTS are further improved by a series of local updates. Our algorithm has an ability to control the GC content precisely and can deal with pseudoknot structures. Using common benchmark datasets for evaluation, MCTS-RNA showed a lot of promise as a standard method of RNA inverse folding. MCTS-RNA is available at https://github.com/tsudalab/MCTS-RNA .
The dynamics of genome replication using deep sequencing
Müller, Carolin A.; Hawkins, Michelle; Retkute, Renata; Malla, Sunir; Wilson, Ray; Blythe, Martin J.; Nakato, Ryuichiro; Komata, Makiko; Shirahige, Katsuhiko; de Moura, Alessandro P.S.; Nieduszynski, Conrad A.
2014-01-01
Eukaryotic genomes are replicated from multiple DNA replication origins. We present complementary deep sequencing approaches to measure origin location and activity in Saccharomyces cerevisiae. Measuring the increase in DNA copy number during a synchronous S-phase allowed the precise determination of genome replication. To map origin locations, replication forks were stalled close to their initiation sites; therefore, copy number enrichment was limited to origins. Replication timing profiles were generated from asynchronous cultures using fluorescence-activated cell sorting. Applying this technique we show that the replication profiles of haploid and diploid cells are indistinguishable, indicating that both cell types use the same cohort of origins with the same activities. Finally, increasing sequencing depth allowed the direct measure of replication dynamics from an exponentially growing culture. This is the first time this approach, called marker frequency analysis, has been successfully applied to a eukaryote. These data provide a high-resolution resource and methodological framework for studying genome biology. PMID:24089142
López-Bueno, Alberto; Parras-Moltó, Marcos; López-Barrantes, Olivia; Belda, Sylvia; Alejo, Alí
2017-05-01
Molluscum contagiosum virus (MCV) is the sole member of the Molluscipoxvirus genus and causes a highly prevalent human disease of the skin characterized by the formation of a variable number of lesions that can persist for prolonged periods of time. Two major genotypes, subtype 1 and subtype 2, are recognized, although currently only a single complete genomic sequence corresponding to MCV subtype 1 is available. Using next-generation sequencing techniques, we report the complete genomic sequence of four new MCV isolates, including the first one derived from a subtype 2. Comparisons suggest a relatively distant evolutionary split between both MCV subtypes. Further, our data illustrate concurrent circulation of distinct viruses within a population and reveal the existence of recombination events among them. These results help identify a set of MCV genes with potentially relevant roles in molluscum contagiosum epidemiology and pathogenesis.
Hyperpolarized 13C pyruvate mouse brain metabolism with absorptive-mode EPSI at 1 T
NASA Astrophysics Data System (ADS)
Miloushev, Vesselin Z.; Di Gialleonardo, Valentina; Salamanca-Cardona, Lucia; Correa, Fabian; Granlund, Kristin L.; Keshari, Kayvan R.
2017-02-01
The expected signal in echo-planar spectroscopic imaging experiments was explicitly modeled jointly in spatial and spectral dimensions. Using this as a basis, absorptive-mode type detection can be achieved by appropriate choice of spectral delays and post-processing techniques. We discuss the effects of gradient imperfections and demonstrate the implementation of this sequence at low field (1.05 T), with application to hyperpolarized [1-13C] pyruvate imaging of the mouse brain. The sequence achieves sufficient signal-to-noise to monitor the conversion of hyperpolarized [1-13C] pyruvate to lactate in the mouse brain. Hyperpolarized pyruvate imaging of mouse brain metabolism using an absorptive-mode EPSI sequence can be applied to more sophisticated murine disease and treatment models. The simple modifications presented in this work, which permit absorptive-mode detection, are directly translatable to human clinical imaging and generate improved absorptive-mode spectra without the need for refocusing pulses.
Bozan, Mahir; Akyol, Çağrı; Ince, Orhan; Aydin, Sevcan; Ince, Bahar
2017-09-01
The anaerobic digestion of lignocellulosic wastes is considered an efficient method for managing the world's energy shortages and resolving contemporary environmental problems. However, the recalcitrance of lignocellulosic biomass represents a barrier to maximizing biogas production. The purpose of this review is to examine the extent to which sequencing methods can be employed to monitor such biofuel conversion processes. From a microbial perspective, we present a detailed insight into anaerobic digesters that utilize lignocellulosic biomass and discuss some benefits and disadvantages associated with the microbial sequencing techniques that are typically applied. We further evaluate the extent to which a hybrid approach incorporating a variation of existing methods can be utilized to develop a more in-depth understanding of microbial communities. It is hoped that this deeper knowledge will enhance the reliability and extent of research findings with the end objective of improving the stability of anaerobic digesters that manage lignocellulosic biomass.
Prediction of novel pre-microRNAs with high accuracy through boosting and SVM.
Zhang, Yuanwei; Yang, Yifan; Zhang, Huan; Jiang, Xiaohua; Xu, Bo; Xue, Yu; Cao, Yunxia; Zhai, Qian; Zhai, Yong; Xu, Mingqing; Cooke, Howard J; Shi, Qinghua
2011-05-15
High-throughput deep-sequencing technology has generated an unprecedented number of expressed short sequence reads, presenting not only an opportunity but also a challenge for prediction of novel microRNAs. To verify the existence of candidate microRNAs, we have to show that these short sequences can be processed from candidate pre-microRNAs. However, it is laborious and time consuming to verify these using existing experimental techniques. Therefore, here, we describe a new method, miRD, which is constructed using two feature selection strategies based on support vector machines (SVMs) and boosting method. It is a high-efficiency tool for novel pre-microRNA prediction with accuracy up to 94.0% among different species. miRD is implemented in PHP/PERL+MySQL+R and can be freely accessed at http://mcg.ustc.edu.cn/rpg/mird/mird.php.
Characterization of Canna yellow mottle virus in a New Host, Alpinia purpurata, in Hawaii.
Zhang, Jingxin; Dey, Kishore K; Lin, Birun; Borth, Wayne B; Melzer, Michael J; Sether, Diane; Wang, Yanan; Wang, I-Chin; Shen, Huifang; Pu, Xiaoming; Sun, Dayuan; Hu, John S
2017-06-01
Canna yellow mottle virus (CaYMV) is an important badnavirus infecting Canna spp. worldwide. This is the first report of CaYMV in flowering ginger (Alpinia purpurata) in Hawaii, where it is associated with yellow mottling and necrosis of leaves, vein streaking, and stunted plants. We have sequenced CaYMV in A. purpurata (CaYMV-Ap) using a combination of next-generation sequencing and traditional Sanger sequencing techniques. The complete genome of CaYMV-Ap was 7,120 bp with an organization typical of other Badnavirus species. Our results indicated that CaYMV-Ap was present in the episomal form in infected flowering ginger. We determined that this virus disease is prevalent in Hawaii and could potentially have significant economic impact on the marketing of A. purpurata as cut flowers. There is a potential concern that the host range of CaYMV-Ap may expand to include other important tropical plants.
Xing, Mei-Ning; Zhang, Xue-Zhu; Huang, He
2012-01-01
Feedstock for biofuel synthesis is transitioning to lignocellulosic biomass to address criticism over competition between first generation biofuels and food production. As microbial catalysis is increasingly applied for the conversion of biomass to biofuels, increased importance has been placed on the development of novel enzymes. With revolutionary advances in sequencer technology and metagenomic sequencing, mining enzymes from microbial communities for biofuel synthesis is becoming more and more practical. The present article highlights the latest research progress on the special characteristics of metagenomic sequencing, which has been a powerful tool for new enzyme discovery and gene functional analysis in the biomass energy field. Critical enzymes recently developed for the pretreatment and conversion of lignocellulosic materials are evaluated with respect to their activity and stability, with additional explorations into xylanase, laccase, amylase, chitinase, and lipolytic biocatalysts for other biomass feedstocks. Copyright © 2012 Elsevier Inc. All rights reserved.
Determination of feature generation methods for PTZ camera object tracking
NASA Astrophysics Data System (ADS)
Doyle, Daniel D.; Black, Jonathan T.
2012-06-01
Object detection and tracking using computer vision (CV) techniques have been widely applied to sensor fusion applications. Many papers continue to be written that speed up performance and increase learning of artificially intelligent systems through improved algorithms, workload distribution, and information fusion. Military application of real-time tracking systems is becoming more and more complex with an ever-increasing need for fusion and CV techniques to actively track and control dynamic systems. Examples include the use of metrology systems for tracking and measuring micro air vehicles (MAVs) and autonomous navigation systems for controlling MAVs. This paper seeks to contribute to the determination of select tracking algorithms that best track a moving object using a pan/tilt/zoom (PTZ) camera applicable to both of the examples presented. The select feature generation algorithms compared in this paper are the trained Scale-Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF), the Mixture of Gaussians (MoG) background subtraction method, the Lucas-Kanade optical flow method (2000) and the Farneback optical flow method (2003). The matching algorithm used in this paper for the trained feature generation algorithms is the Fast Library for Approximate Nearest Neighbors (FLANN). The BSD licensed OpenCV library is used extensively to demonstrate the viability of each algorithm and its performance. Initial testing is performed on a sequence of images using a stationary camera. Further testing is performed on a sequence of images such that the PTZ camera is moving in order to capture the moving object. Comparisons are made based upon accuracy, speed and memory.
International Barcode of Life: Focus on big biodiversity in South Africa.
Adamowicz, Sarah J; Hollingsworth, Peter M; Ratnasingham, Sujeevan; van der Bank, Michelle
2017-11-01
Participants in the 7th International Barcode of Life Conference (Kruger National Park, South Africa, 20-24 November 2017) share the latest findings in DNA barcoding research and its increasingly diversified applications. Here, we review prevailing trends synthesized from among 429 invited and contributed abstracts, which are collated in this open-access special issue of Genome. Hosted for the first time on the African continent, the 7th Conference places special emphasis on the evolutionary origins, biogeography, and conservation of African flora and fauna. Within Africa and elsewhere, DNA barcoding and related techniques are being increasingly used for wildlife forensics and for the validation of commercial products, such as medicinal plants and seafood species. A striking trend of the conference is the dramatic rise of studies on environmental DNA (eDNA) and on diverse uses of high-throughput sequencing techniques. Emerging techniques in these areas are opening new avenues for environmental biomonitoring, managing species-at-risk and invasive species, and revealing species interaction networks in unprecedented detail. Contributors call for the development of validated community standards for high-throughput sequence data generation and analysis, to enable the full potential of these methods to be realized for understanding and managing biodiversity on a global scale.
Discovery of rare, diagnostic AluYb8/9 elements in diverse human populations.
Feusier, Julie; Witherspoon, David J; Scott Watkins, W; Goubert, Clément; Sasani, Thomas A; Jorde, Lynn B
2017-01-01
Polymorphic human Alu elements are excellent tools for assessing population structure, and new retrotransposition events can contribute to disease. Next-generation sequencing has greatly increased the potential to discover Alu elements in human populations, and various sequencing and bioinformatics methods have been designed to tackle the problem of detecting these highly repetitive elements. However, current techniques for Alu discovery may miss rare, polymorphic Alu elements. Combining multiple discovery approaches may provide a better profile of the polymorphic Alu mobilome. Alu Yb8/9 elements have been a focus of our recent studies as they are young subfamilies (~2.3 million years old) that contribute ~30% of recent polymorphic Alu retrotransposition events. Here, we update our ME-Scan methods for detecting Alu elements and apply these methods to discover new insertions in a large set of individuals with diverse ancestral backgrounds. We identified 5,288 putative Alu insertion events, including several hundred novel Alu Yb8/9 elements from 213 individuals from 18 diverse human populations. Hundreds of these loci were specific to continental populations, and 23 non-reference population-specific loci were validated by PCR. We provide high-quality sequence information for 68 rare Alu Yb8/9 elements, of which 11 have hallmarks of an active source element. Our subfamily distribution of rare Alu Yb8/9 elements is consistent with previous datasets, and may be representative of rare loci. We also find that while ME-Scan and low-coverage, whole-genome sequencing (WGS) detect different Alu elements in 41 1000 Genomes individuals, the two methods yield similar population structure results. Current in-silico methods for Alu discovery may miss rare, polymorphic Alu elements. Therefore, using multiple techniques can provide a more accurate profile of Alu elements in individuals and populations. 
We improved our false-negative rate as an indicator of sample quality for future ME-Scan experiments. In conclusion, we demonstrate that ME-Scan is a good supplement for next-generation sequencing methods and is well-suited for population-level analyses.
Brain MR imaging at ultra-low radiofrequency power.
Sarkar, Subhendra N; Alsop, David C; Madhuranthakam, Ananth J; Busse, Reed F; Robson, Philip M; Rofsky, Neil M; Hackney, David B
2011-05-01
To explore the lower limits for radiofrequency (RF) power-induced specific absorption rate (SAR) achievable at 1.5 T for brain magnetic resonance (MR) imaging without loss of tissue signal or contrast present in high-SAR clinical imaging in order to create a potentially viable MR method at ultra-low RF power to image tissues containing implanted devices. An institutional review board-approved HIPAA-compliant prospective MR study design was used, with written informed consent from all subjects prior to MR sessions. Seven healthy subjects were imaged prospectively at 1.5 T with ultra-low-SAR optimized three-dimensional (3D) fast spin-echo (FSE) and fluid-attenuated inversion-recovery (FLAIR) T2-weighted sequences and an ultra-low-SAR 3D spoiled gradient-recalled acquisition in the steady state T1-weighted sequence. Corresponding high-SAR two-dimensional (2D) clinical sequences were also performed. In addition to qualitative comparisons, absolute signal-to-noise ratios (SNRs) and contrast-to-noise ratios (CNRs) for multicoil, parallel imaging acquisitions were generated by using a Monte Carlo method for quantitative comparison between ultra-low-SAR and high-SAR results. There were minor to moderate differences in the absolute tissue SNR and CNR values and in qualitative appearance of brain images obtained by using ultra-low-SAR and high-SAR techniques. High-SAR 2D T2-weighted imaging produced slightly higher SNR, while ultra-low-SAR 3D technique not only produced higher SNR for T1-weighted and FLAIR images but also higher CNRs for all three sequences for most of the brain tissues. The 3D techniques adopted here led to a decrease in the absorbed RF power by two orders of magnitude at 1.5 T, and still the image quality was preserved within clinically acceptable imaging times. RSNA, 2011
Hohenlohe, Paul A.; Day, Mitch D.; Amish, Stephen J.; Miller, Michael R.; Kamps-Hughes, Nick; Boyer, Matthew C.; Muhlfeld, Clint C.; Allendorf, Fred W.; Johnson, Eric A.; Luikart, Gordon
2013-01-01
Rapid and inexpensive methods for genomewide single nucleotide polymorphism (SNP) discovery and genotyping are urgently needed for population management and conservation. In hybridized populations, genomic techniques that can identify and genotype thousands of species-diagnostic markers would allow precise estimates of population- and individual-level admixture as well as identification of 'super invasive' alleles, which show elevated rates of introgression above the genomewide background (likely due to natural selection). Techniques like restriction-site-associated DNA (RAD) sequencing can discover and genotype large numbers of SNPs, but they have been limited by the length of continuous sequence data they produce with Illumina short-read sequencing. We present a novel approach, overlapping paired-end RAD sequencing, to generate RAD contigs of >300–400 bp. These contigs provide sufficient flanking sequence for design of high-throughput SNP genotyping arrays and strict filtering to identify duplicate paralogous loci. We applied this approach in five populations of native westslope cutthroat trout that previously showed varying (low) levels of admixture from introduced rainbow trout (RBT). We produced 77 141 RAD contigs and used these data to filter and genotype 3180 previously identified species-diagnostic SNP loci. Our population-level and individual-level estimates of admixture were generally consistent with previous microsatellite-based estimates from the same individuals. However, we observed slightly lower admixture estimates from genomewide markers, which might result from natural selection against certain genome regions, different genomic locations for microsatellites vs. RAD-derived SNPs and/or sampling error from the small number of microsatellite loci (n = 7). We also identified candidate adaptive super invasive alleles from RBT that had excessively high admixture proportions in hybridized cutthroat trout populations.
JVM: Java Visual Mapping tool for next generation sequencing read.
Yang, Ye; Liu, Juan
2015-01-01
We developed a program, JVM (Java Visual Mapping), for mapping next generation sequencing reads to a reference sequence. The program is implemented in Java and is designed to deal with millions of short reads generated by the Illumina sequencing technology. It employs a seed index strategy and octal encoding operations for sequence alignment. JVM is useful for DNA-Seq and RNA-Seq when dealing with single-end resequencing. JVM is a desktop application that supports read capacities from 1 MB to 10 GB.
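A seed index strategy of the general kind mentioned in this abstract can be sketched as below. This is a generic seed-and-verify illustration in Python, not JVM's actual implementation: the abstract does not describe JVM's index layout or its octal encoding, so the function names, the use of the read's first k bases as the seed, and the mismatch threshold are all assumptions.

```python
from collections import defaultdict


def build_seed_index(reference: str, k: int) -> dict:
    """Map every k-mer in the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read: str, reference: str, index: dict, k: int,
             max_mismatches: int = 0) -> list:
    """Seed with the read's first k bases, then verify each candidate position."""
    hits = []
    for pos in index.get(read[:k], []):
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # the read would run past the end of the reference
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            hits.append(pos)
    return hits
```

The index is built once per reference and reused for every read, so each read only needs to be compared against the few positions that share its seed rather than against the whole reference.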
Malware Analysis Using Visualized Image Matrices
Im, Eul Gyu
2014-01-01
This paper proposes a novel malware visual analysis method that contains not only a visualization method to convert binary files into images, but also a similarity calculation method between these images. The proposed method generates RGB-colored pixels on image matrices using the opcode sequences extracted from malware samples and calculates the similarities for the image matrices. Particularly, our proposed methods are available for packed malware samples by applying them to the execution traces extracted through dynamic analysis. When the images are generated, we can reduce the overheads by extracting the opcode sequences only from the blocks that include the instructions related to staple behaviors such as functions and application programming interface (API) calls. In addition, we propose a technique that generates a representative image for each malware family in order to reduce the number of comparisons for the classification of unknown samples and the colored pixel information in the image matrices is used to calculate the similarities between the images. Our experimental results show that the image matrices of malware can effectively be used to classify malware families both statically and dynamically with accuracy of 0.9896 and 0.9732, respectively. PMID:25133202
Comparing memory-efficient genome assemblers on stand-alone and cloud infrastructures.
Kleftogiannis, Dimitrios; Kalnis, Panos; Bajic, Vladimir B
2013-01-01
A fundamental problem in bioinformatics is genome assembly. Next-generation sequencing (NGS) technologies produce large volumes of fragmented genome reads, which require large amounts of memory to assemble the complete genome efficiently. With recent improvements in DNA sequencing technologies, it is expected that the memory footprint required for the assembly process will increase dramatically and will emerge as a limiting factor in processing widely available NGS-generated reads. In this report, we compare current memory-efficient techniques for genome assembly with respect to quality, memory consumption and execution time. Our experiments prove that it is possible to generate draft assemblies of reasonable quality on conventional multi-purpose computers with very limited available memory by choosing suitable assembly methods. Our study reveals the minimum memory requirements for different assembly programs even when data volume exceeds memory capacity by orders of magnitude. By combining existing methodologies, we propose two general assembly strategies that can improve short-read assembly approaches and result in reduction of the memory footprint. Finally, we discuss the possibility of utilizing cloud infrastructures for genome assembly and we comment on some findings regarding suitable computational resources for assembly.
Automatic summarization of soccer highlights using audio-visual descriptors.
Raventós, A; Quijada, R; Torres, Luis; Tarrés, Francesc
2015-01-01
Automatic summarization of sports video content has been an object of great interest for many years. Although semantic description techniques have been proposed, many of the approaches still rely on low-level video descriptors that render quite limited results due to the complexity of the problem and to the low capability of the descriptors to represent semantic content. In this paper, a new approach for automatic highlights summarization of soccer videos using audio-visual descriptors is presented. The approach is based on the segmentation of the video sequence into shots that are further analyzed to determine their relevance and interest. Of special interest in the approach is the use of the audio information, which provides additional robustness to the overall performance of the summarization system. For every video shot, a set of low- and mid-level audio-visual descriptors is computed and later suitably combined in order to obtain different relevance measures based on empirical knowledge rules. The final summary is generated by selecting those shots with the highest interest according to the specifications of the user and the results of the relevance measures. A variety of results are presented with real soccer video sequences that prove the validity of the approach.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Y; Subashi, E; Yin, F
Purpose: Current retrospective 4D-MRI provides superior tumor-to-tissue contrast and accurate respiratory motion information for radiotherapy motion management. The developed 4D-MRI techniques based on 2D-MRI image sorting require a high frame-rate of the MR sequences. However, several MRI sequences provide excellent image quality but have low frame-rate. This study aims at developing a novel retrospective 3D k-space sorting 4D-MRI technique using radial k-space acquisition MRI sequences to improve 4D-MRI image quality and temporal-resolution for imaging irregular organ/tumor respiratory motion. Methods: The method is based on a RF-spoiled, steady-state, gradient-recalled sequence with minimal echo time. A 3D radial k-space data acquisition trajectory was used for sampling the datasets. Each radial spoke readout data line starts from the 3D center of Field-of-View. Respiratory signal can be extracted from the k-space center data point of each spoke. The spoke data was sorted based on its self-synchronized respiratory signal using phase sorting. Subsequently, 3D reconstruction was conducted to generate the time-resolved 4D-MRI images. As a feasibility study, this technique was implemented on a digital human phantom XCAT. The respiratory motion was controlled by an irregular motion profile. To validate using k-space center data as a respiratory surrogate, we compared it with the XCAT input controlling breathing profile. Tumor motion trajectories measured on reconstructed 4D-MRI were compared to the average input trajectory. The mean absolute amplitude difference (D) was calculated. Results: The signal extracted from k-space center data matches well with the input controlling respiratory profile of XCAT. The relative amplitude error was 8.6% and the relative phase error was 3.5%. XCAT 4D-MRI demonstrated a clear motion pattern with little serrated artifacts. D of tumor trajectories was 0.21mm, 0.23mm and 0.23mm in SI, AP and ML directions, respectively. Conclusion: A novel retrospective 3D k-space sorting 4D-MRI technique has been developed and evaluated on human digital phantom. NIH (1R21CA165384-01A1)
Hiremath, Pavana J; Farmer, Andrew; Cannon, Steven B; Woodward, Jimmy; Kudapa, Himabindu; Tuteja, Reetu; Kumar, Ashish; Bhanuprakash, Amindala; Mulaosmanovic, Benjamin; Gujaria, Neha; Krishnamurthy, Laxmanan; Gaur, Pooran M; Kavikishor, Polavarapu B; Shah, Trushar; Srinivasan, Ramamurthy; Lohse, Marc; Xiao, Yongli; Town, Christopher D; Cook, Douglas R; May, Gregory D; Varshney, Rajeev K
2011-10-01
Chickpea (Cicer arietinum L.) is an important legume crop in the semi-arid regions of Asia and Africa. Gains in crop productivity have been low however, particularly because of biotic and abiotic stresses. To help enhance crop productivity using molecular breeding techniques, next generation sequencing technologies such as Roche/454 and Illumina/Solexa were used to determine the sequence of most gene transcripts and to identify drought-responsive genes and gene-based molecular markers. A total of 103,215 tentative unique sequences (TUSs) have been produced from 435,018 Roche/454 reads and 21,491 Sanger expressed sequence tags (ESTs). Putative functions were determined for 49,437 (47.8%) of the TUSs, and gene ontology assignments were determined for 20,634 (41.7%) of the TUSs. Comparison of the chickpea TUSs with the Medicago truncatula genome assembly (Mt 3.5.1 build) resulted in 42,141 aligned TUSs with putative gene structures (including 39,281 predicted intron/splice junctions). Alignment of ∼37 million Illumina/Solexa tags generated from drought-challenged root tissues of two chickpea genotypes against the TUSs identified 44,639 differentially expressed TUSs. The TUSs were also used to identify a diverse set of markers, including 728 simple sequence repeats (SSRs), 495 single nucleotide polymorphisms (SNPs), 387 conserved orthologous sequence (COS) markers, and 2088 intron-spanning region (ISR) markers. This resource will be useful for basic and applied research for genome analysis and crop improvement in chickpea. Plant Biotechnology Journal © 2011 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd. No claim to original US government works.
Next-Generation Sequencing and Genome Editing in Plant Virology
Hadidi, Ahmed; Flores, Ricardo; Candresse, Thierry; Barba, Marina
2016-01-01
Next-generation sequencing (NGS) has been applied to plant virology since 2009. NGS provides highly efficient, rapid, low cost DNA, or RNA high-throughput sequencing of the genomes of plant viruses and viroids and of the specific small RNAs generated during the infection process. These small RNAs, which cover frequently the whole genome of the infectious agent, are 21–24 nt long and are known as vsRNAs for viruses and vd-sRNAs for viroids. NGS has been used in a number of studies in plant virology including, but not limited to, discovery of novel viruses and viroids as well as detection and identification of those pathogens already known, analysis of genome diversity and evolution, and study of pathogen epidemiology. The genome engineering editing method, clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system has been successfully used recently to engineer resistance to DNA geminiviruses (family, Geminiviridae) by targeting different viral genome sequences in infected Nicotiana benthamiana or Arabidopsis plants. The DNA viruses targeted include tomato yellow leaf curl virus and merremia mosaic virus (begomovirus); beet curly top virus and beet severe curly top virus (curtovirus); and bean yellow dwarf virus (mastrevirus). The technique has also been used against the RNA viruses zucchini yellow mosaic virus, papaya ringspot virus and turnip mosaic virus (potyvirus) and cucumber vein yellowing virus (ipomovirus, family, Potyviridae) by targeting the translation initiation genes eIF4E in cucumber or Arabidopsis plants. From these recent advances of major importance, it is expected that NGS and CRISPR-Cas technologies will play a significant role in the very near future in advancing the field of plant virology and connecting it with other related fields of biology. PMID:27617007
Assessing the impact of transcriptomics, proteomics and metabolomics on fungal phytopathology.
Tan, Kar-Chun; Ipcho, Simon V S; Trengove, Robert D; Oliver, Richard P; Solomon, Peter S
2009-09-01
SUMMARY Peer-reviewed literature is today littered with exciting new tools and techniques that are being used in all areas of biology and medicine. Transcriptomics, proteomics and, more recently, metabolomics are three of these techniques that have impacted on fungal plant pathology. Used individually, each of these techniques can generate a plethora of data that could occupy a laboratory for years. When used in combination, they have the potential to comprehensively dissect a system at the transcriptional and translational level. Transcriptomics, or quantitative gene expression profiling, is arguably the most familiar to researchers in the field of fungal plant pathology. Microarrays have been the primary technique for the last decade, but others are now emerging. Proteomics has also been exploited by the fungal phytopathogen community, but perhaps not to its potential. A lack of genome sequence information has frustrated proteomics researchers and has largely contributed to this technique not fulfilling its potential. The coming of the genome sequencing era has partially alleviated this problem. Metabolomics is the most recent of these techniques to emerge and is concerned with the non-targeted profiling of all metabolites in a given system. Metabolomics studies on fungal plant pathogens are only just beginning to appear, although its potential to dissect many facets of the pathogen and disease will see its popularity increase quickly. This review assesses the impact of transcriptomics, proteomics and metabolomics on fungal plant pathology over the last decade and discusses their futures. Each of the techniques is described briefly with further reading recommended. Key examples highlighting the application of these technologies to fungal plant pathogens are also reviewed.
Dudley, Dawn M.; Chin, Emily N.; Bimber, Benjamin N.; Sanabani, Sabri S.; Tarosso, Leandro F.; Costa, Priscilla R.; Sauer, Mariana M.; Kallas, Esper G.; O'Connor, David H.
2012-01-01
Background Great efforts have been made to increase accessibility of HIV antiretroviral therapy (ART) in low and middle-income countries. The threat of wide-scale emergence of drug resistance could severely hamper ART scale-up efforts. Population-based surveillance of transmitted HIV drug resistance ensures the use of appropriate first-line regimens to maximize efficacy of ART programs where drug options are limited. However, traditional HIV genotyping is extremely expensive, providing a cost barrier to wide-scale and frequent HIV drug resistance surveillance. Methods/Results We have developed a low-cost laboratory-scale next-generation sequencing-based genotyping method to monitor drug resistance. We designed primers specifically to amplify protease and reverse transcriptase from Brazilian HIV subtypes and developed a multiplexing scheme using multiplex identifier tags to minimize cost while providing more robust data than traditional genotyping techniques. Using this approach, we characterized drug resistance from plasma in 81 HIV infected individuals collected in São Paulo, Brazil. We describe the complexities of analyzing next-generation sequencing data and present a simplified open-source workflow to analyze drug resistance data. From this data, we identified drug resistance mutations in 20% of treatment naïve individuals in our cohort, which is similar to frequencies identified using traditional genotyping in Brazilian patient samples. Conclusion The developed ultra-wide sequencing approach described here allows multiplexing of at least 48 patient samples per sequencing run, 4 times more than the current genotyping method. This method is also 4-fold more sensitive (5% minimal detection frequency vs. 20%) at a cost 3–5× less than the traditional Sanger-based genotyping method. Lastly, by using a benchtop next-generation sequencer (Roche/454 GS Junior), this approach can be more easily implemented in low-resource settings. 
This data provides proof-of-concept that next-generation HIV drug resistance genotyping is a feasible and low-cost alternative to current genotyping methods and may be particularly beneficial for in-country surveillance of transmitted drug resistance. PMID:22574170
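The multiplexing idea above, in which reads from many patient samples share one sequencing run and are separated afterwards by their identifier tags, can be sketched as follows. This is a minimal illustration and not the authors' workflow: the tag sequences, the sample names, the function name `demultiplex` and the exact-match rule at the start of each read are all assumptions.

```python
def demultiplex(reads, tag_to_sample):
    """Assign each read to a sample by the tag at its 5' end.

    Assumptions (hypothetical, for illustration only): all tags have the
    same length, sit at the very start of the read, and must match exactly.
    """
    tag_len = len(next(iter(tag_to_sample)))
    by_sample = {sample: [] for sample in tag_to_sample.values()}
    unassigned = []
    for read in reads:
        sample = tag_to_sample.get(read[:tag_len])
        if sample is None:
            unassigned.append(read)                   # unknown or damaged tag
        else:
            by_sample[sample].append(read[tag_len:])  # strip the tag
    return by_sample, unassigned


# Hypothetical tags and reads.
tags = {"ACGT": "patient_01", "TGCA": "patient_02"}
reads = ["ACGTCCTCAGGTCAC", "TGCACCTCAAATCAC", "GGGGCCTCAGGTCAC"]
by_sample, unassigned = demultiplex(reads, tags)
```

A production pipeline would normally tolerate sequencing errors inside the tag; exact matching is used here only to keep the sketch short.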
System, method and apparatus for generating phrases from a database
NASA Technical Reports Server (NTRS)
McGreevy, Michael W. (Inventor)
2004-01-01
Phrase generation is a method of generating sequences of terms, such as phrases, that may occur within a database of subsets containing sequences of terms, such as text. A database is provided and a relational model of the database is created. A query is then input. The query includes a term or a sequence of terms or multiple individual terms or multiple sequences of terms or combinations thereof. Next, several sequences of terms that are contextually related to the query are assembled from contextual relations in the model of the database. The sequences of terms are then sorted and output. Phrase generation can also be an iterative process used to produce sequences of terms from a relational model of a database.
Methods for MHC genotyping in non-model vertebrates.
Babik, W
2010-03-01
Genes of the major histocompatibility complex (MHC) are considered a paradigm of adaptive evolution at the molecular level and as such are frequently investigated by evolutionary biologists and ecologists. Accurate genotyping is essential for understanding of the role that MHC variation plays in natural populations, but may be extremely challenging. Here, I discuss the DNA-based methods currently used for genotyping MHC in non-model vertebrates, as well as techniques likely to find widespread use in the future. I also highlight the aspects of MHC structure that are relevant for genotyping, and detail the challenges posed by the complex genomic organization and high sequence variation of MHC loci. Special emphasis is placed on designing appropriate PCR primers, accounting for artefacts and the problem of genotyping alleles from multiple, co-amplifying loci, a strategy which is frequently necessary due to the structure of the MHC. The suitability of typing techniques is compared in various research situations, strategies for efficient genotyping are discussed and areas of likely progress in future are identified. This review addresses the well established typing methods such as the Single Strand Conformation Polymorphism (SSCP), Denaturing Gradient Gel Electrophoresis (DGGE), Reference Strand Conformational Analysis (RSCA) and cloning of PCR products. In addition, it includes the intriguing possibility of direct amplicon sequencing followed by the computational inference of alleles and also next generation sequencing (NGS) technologies; the latter technique may, in the future, find widespread use in typing complex multilocus MHC systems. © 2009 Blackwell Publishing Ltd.
Sma3s: a three-step modular annotator for large sequence datasets.
Muñoz-Mérida, Antonio; Viguera, Enrique; Claros, M Gonzalo; Trelles, Oswaldo; Pérez-Pulido, Antonio J
2014-08-01
Automatic sequence annotation is an essential component of modern 'omics' studies, which aim to extract information from large collections of sequence data. Most existing tools use sequence homology to establish evolutionary relationships and assign putative functions to sequences. However, it can be difficult to define a similarity threshold that achieves sufficient coverage without sacrificing annotation quality. Defining the correct configuration is critical and can be challenging for non-specialist users. Thus, the development of robust automatic annotation techniques that generate high-quality annotations without needing expert knowledge would be very valuable for the research community. We present Sma3s, a tool for automatically annotating very large collections of biological sequences from any kind of gene library or genome. Sma3s is composed of three modules that progressively annotate query sequences using either: (i) very similar homologues, (ii) orthologous sequences or (iii) terms enriched in groups of homologous sequences. We trained the system using several random sets of known sequences, demonstrating average sensitivity and specificity values of ~85%. In conclusion, Sma3s is a versatile tool for high-throughput annotation of a wide variety of sequence datasets that outperforms the accuracy of other well-established annotation algorithms, and it can enrich existing database annotations and uncover previously hidden features. Importantly, Sma3s has already been used in the functional annotation of two published transcriptomes. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Differentially Private Frequent Sequence Mining via Sampling-based Candidate Pruning
Xu, Shengzhi; Cheng, Xiang; Li, Zhengyi; Xiong, Li
2016-01-01
In this paper, we study the problem of mining frequent sequences under the rigorous differential privacy model. We explore the possibility of designing a differentially private frequent sequence mining (FSM) algorithm which can achieve both high data utility and a high degree of privacy. We found, in differentially private FSM, the amount of required noise is proportionate to the number of candidate sequences. If we could effectively reduce the number of unpromising candidate sequences, the utility and privacy tradeoff can be significantly improved. To this end, by leveraging a sampling-based candidate pruning technique, we propose a novel differentially private FSM algorithm, which is referred to as PFS2. The core of our algorithm is to utilize sample databases to further prune the candidate sequences generated based on the downward closure property. In particular, we use the noisy local support of candidate sequences in the sample databases to estimate which sequences are potentially frequent. To improve the accuracy of such private estimations, a sequence shrinking method is proposed to enforce the length constraint on the sample databases. Moreover, to decrease the probability of misestimating frequent sequences as infrequent, a threshold relaxation method is proposed to relax the user-specified threshold for the sample databases. Through formal privacy analysis, we show that our PFS2 algorithm is ε-differentially private. Extensive experiments on real datasets illustrate that our PFS2 algorithm can privately find frequent sequences with high accuracy. PMID:26973430
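The observation that the required noise grows with the number of candidate sequences can be illustrated with the standard Laplace mechanism. The sketch below is not the PFS2 algorithm: the even split of the privacy budget across candidates, the function names and the toy database are assumptions made for illustration.

```python
import random


def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of database sequences that contain the pattern."""
    return sum(is_subsequence(pattern, s) for s in database)


def noise_scale(num_candidates, epsilon):
    """Laplace scale when one input sequence can change every candidate's
    support by at most 1 (assumed), so the L1 sensitivity is num_candidates."""
    return num_candidates / epsilon


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_supports(candidates, database, epsilon, rng):
    scale = noise_scale(len(candidates), epsilon)
    return {c: support(c, database) + laplace_noise(scale, rng)
            for c in candidates}


database = ["abc", "acb", "bca"]           # toy sequences of items
candidates = [("a", "b"), ("b", "c")]      # hypothetical candidate set
released = noisy_supports(candidates, database, 1.0, random.Random(0))
```

Under these assumptions, pruning ten candidates down to two lowers the Laplace scale from 10/ε to 2/ε, which is the utility gain that candidate pruning aims for.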
Phage-mediated Delivery of Targeted sRNA Constructs to Knock Down Gene Expression in E. coli.
Bernheim, Aude G; Libis, Vincent K; Lindner, Ariel B; Wintermute, Edwin H
2016-03-20
RNA-mediated knockdowns are widely used to control gene expression. This versatile family of techniques makes use of short RNA (sRNA) that can be synthesized with any sequence and designed to complement any gene targeted for silencing. Because sRNA constructs can be introduced to many cell types directly or using a variety of vectors, gene expression can be repressed in living cells without laborious genetic modification. The most common RNA knockdown technology, RNA interference (RNAi), makes use of the endogenous RNA-induced silencing complex (RISC) to mediate sequence recognition and cleavage of the target mRNA. Applications of this technique are therefore limited to RISC-expressing organisms, primarily eukaryotes. Recently, a new generation of RNA biotechnologists have developed alternative mechanisms for controlling gene expression through RNA, and so made possible RNA-mediated gene knockdowns in bacteria. Here we describe a method for silencing gene expression in E. coli that functionally resembles RNAi. In this system a synthetic phagemid is designed to express sRNA, which may be designed to target any sequence. The expression construct is delivered to a population of E. coli cells with non-lytic M13 phage, after which it is able to stably replicate as a plasmid. Antisense recognition and silencing of the target mRNA is mediated by the Hfq protein, endogenous to E. coli. This protocol includes methods for designing the antisense sRNA, constructing the phagemid vector, packaging the phagemid into M13 bacteriophage, preparing a live cell population for infection, and performing the infection itself. The fluorescent protein mKate2 and the antibiotic resistance gene chloramphenicol acetyltransferase (CAT) are targeted to generate representative data and to quantify knockdown effectiveness.
Anis, Eman; Hawkins, Ian K; Ilha, Marcia R S; Woldemeskel, Moges W; Saliki, Jeremiah T; Wilkes, Rebecca P
2018-07-01
The laboratory diagnosis of infectious diseases, especially those caused by mixed infections, is challenging. Routinely, it requires submission of multiple samples to separate laboratories. Advances in next-generation sequencing (NGS) have provided the opportunity for development of a comprehensive method to identify infectious agents. This study describes the use of target-specific primers for PCR-mediated amplification with the NGS technology in which pathogen genomic regions of interest are enriched and selectively sequenced from clinical samples. In the study, 198 primers were designed to target 43 common bovine and small-ruminant bacterial, fungal, viral, and parasitic pathogens, and a bioinformatics tool was specifically constructed for the detection of targeted pathogens. The primers were confirmed to detect the intended pathogens by testing reference strains and isolates. The method was then validated using 60 clinical samples (including tissues, feces, and milk) that were also tested with other routine diagnostic techniques. The detection limits of the targeted NGS method were evaluated using 10 representative pathogens that were also tested by quantitative PCR (qPCR), and the NGS method was able to detect the organisms from samples with qPCR threshold cycle (CT) values in the 30s. The method was successful for the detection of multiple pathogens in the clinical samples, including some additional pathogens missed by the routine techniques because the specific tests needed for the particular organisms were not performed. The results demonstrate the feasibility of the approach and indicate that it is possible to incorporate NGS as a diagnostic tool in a cost-effective manner into a veterinary diagnostic laboratory. Copyright © 2018 Anis et al.
W-curve alignments for HIV-1 genomic comparisons.
Cork, Douglas J; Lembark, Steven; Tovanabutra, Sodsai; Robb, Merlin L; Kim, Jerome H
2010-06-01
The W-curve was originally developed as a graphical visualization technique for viewing DNA and RNA sequences. Its ability to render features of DNA also makes it suitable for computational studies. Its main advantage in this area is utilizing a single-pass algorithm for comparing the sequences. Avoiding recursion during sequence alignments offers advantages for speed and in-process resources. The graphical technique also allows for multiple models of comparison to be used depending on the nucleotide patterns embedded in similar whole genomic sequences. The W-curve approach allows us to compare large numbers of samples quickly. We are currently tuning the algorithm to accommodate quirks specific to HIV-1 genomic sequences so that it can be used to aid in diagnostic and vaccine efforts. Tracking the molecular evolution of the virus has been greatly hampered by gap associated problems predominantly embedded within the envelope gene of the virus. Gaps and hypermutation of the virus slow conventional string based alignments of the whole genome. This paper describes the W-curve algorithm itself, and how we have adapted it for comparison of similar HIV-1 genomes. A treebuilding method is developed with the W-curve that utilizes a novel Cylindrical Coordinate distance method and gap analysis method. HIV-1 C2-V5 env sequence regions from a Mother/Infant cohort study are used in the comparison. The output distance matrix and neighbor results produced by the W-curve are functionally equivalent to those from Clustal for C2-V5 sequences in the mother/infant pairs infected with CRF01_AE. Significant potential exists for utilizing this method in place of conventional string based alignment of HIV-1 genomes, such as Clustal X. With W-curve heuristic alignment, it may be possible to obtain clinically useful results in a short time, short enough to affect clinical choices for acute treatment.
A description of the W-curve generation process, including a comparison technique of aligning extremes of the curves to effectively phase-shift them past the HIV-1 gap problem, is presented. Besides yielding similar neighbor-joining phenogram topologies, most Mother and Infant C2-V5 sequences in the cohort pairs geometrically map closest to each other, indicating that W-curve heuristics overcame any gap problem.
Malc, Ewa P.; Jayakody, Chatura N.; Tsuruta, James K.; Mieczkowski, Piotr A.; Janzen, William P.; Dayton, Paul A.
2015-01-01
A perfluorocarbon nanodroplet formulation is shown to be an effective cavitation enhancement agent, enabling rapid and consistent fragmentation of genomic DNA in a standard ultrasonic water bath. This nanodroplet-enhanced method produces genomic DNA libraries and next-generation sequencing results indistinguishable from DNA samples fragmented in dedicated commercial acoustic sonication equipment, and with higher throughput. This technique thus enables widespread access to fast bench-top genomic DNA fragmentation. PMID:26186461
DNA Cryptography and Deep Learning using Genetic Algorithm with NW algorithm for Key Generation.
Kalsi, Shruti; Kaur, Harleen; Chang, Victor
2017-12-05
Cryptography is the science of applying complex mathematics and logic to design strong methods both to hide data, called encryption, and to retrieve the original data, called decryption. The purpose of cryptography is to transmit a message between a sender and receiver such that an eavesdropper is unable to comprehend it. To accomplish this, we need not only a strong algorithm but also a strong key and a strong concept for the encryption and decryption process. We have introduced a concept of DNA Deep Learning Cryptography, which is defined as a technique of concealing data in terms of DNA sequence and deep learning. In the cryptographic technique, each letter of the alphabet is converted into a different combination of the four bases, namely Adenine (A), Cytosine (C), Guanine (G) and Thymine (T), which make up human deoxyribonucleic acid (DNA). Actual implementations with DNA do not go beyond the laboratory level and are expensive. To bring DNA computing to a digital level, easy and effective algorithms are proposed in this paper. In the proposed work we introduce, first, a method and its implementation for key generation based on the theory of natural selection using a Genetic Algorithm with the Needleman-Wunsch (NW) algorithm and, second, a method for implementing encryption and decryption based on DNA computing using the biological operations Transcription, Translation, DNA Sequencing and Deep Learning.
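The conversion of characters into combinations of the four bases can be illustrated with a simple two-bits-per-base encoding. The sketch below only encodes and decodes; it provides no secrecy and is not the key generation or encryption scheme proposed in the paper. The particular bit-to-base mapping and the function names are assumptions.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def text_to_dna(text):
    """Encode text as bases, four bases per UTF-8 byte (2 bits each)."""
    out = []
    for byte in text.encode("utf-8"):
        for shift in (6, 4, 2, 0):               # most significant pair first
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_text(dna):
    """Inverse of text_to_dna."""
    if len(dna) % 4 != 0:
        raise ValueError("length must be a multiple of 4 bases")
    data = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        data.append(byte)
    return data.decode("utf-8")


encoded = text_to_dna("Hi")      # "CAGACGGC"
decoded = dna_to_text(encoded)   # "Hi"
```

In a real scheme the mapping itself, or a key applied on top of it, would have to be secret; here it is fixed and public.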
Touch HDR: photograph enhancement by user controlled wide dynamic range adaptation
NASA Astrophysics Data System (ADS)
Verrall, Steve; Siddiqui, Hasib; Atanassov, Kalin; Goma, Sergio; Ramachandra, Vikas
2013-03-01
High Dynamic Range (HDR) technology enables photographers to capture a greater range of tonal detail. HDR is typically used to bring out detail in a dark foreground object set against a bright background. HDR technologies include multi-frame HDR and single-frame HDR. Multi-frame HDR requires the combination of a sequence of images taken at different exposures. Single-frame HDR requires histogram equalization post-processing of a single image, a technique referred to as local tone mapping (LTM). Images generated using HDR technology can look less natural than their non-HDR counterparts. Sometimes it is only desired to enhance small regions of an original image. For example, it may be desired to enhance the tonal detail of one subject's face while preserving the original background. The Touch HDR technique described in this paper achieves these goals by enabling selective blending of HDR and non-HDR versions of the same image to create a hybrid image. The HDR version of the image can be generated by either multi-frame or single-frame HDR. Selective blending can be performed as a post-processing step, for example, as a feature of a photo editor application, at any time after the image has been captured. HDR and non-HDR blending is controlled by a weighting surface, which is configured by the user through a sequence of touches on a touchscreen.
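The selective blending described above can be sketched as a per-pixel weighted average of the HDR and non-HDR images. How the paper builds the weighting surface from touches is not stated in the abstract; the Gaussian fall-off around each touch point, the parameter `sigma` and the function names below are assumptions.

```python
import math


def weight_surface(width, height, touches, sigma):
    """Weights in [0, 1]: 1 at each touch point with an assumed Gaussian
    fall-off, and 0 everywhere when there are no touches."""
    surface = []
    for y in range(height):
        row = []
        for x in range(width):
            w = 0.0
            for tx, ty in touches:
                d2 = (x - tx) ** 2 + (y - ty) ** 2
                w = max(w, math.exp(-d2 / (2.0 * sigma ** 2)))
            row.append(w)
        surface.append(row)
    return surface


def blend(hdr, original, weights):
    """Hybrid pixel = w * HDR + (1 - w) * original (grayscale rows)."""
    return [[w * h + (1.0 - w) * o for h, o, w in zip(hr, orow, wr)]
            for hr, orow, wr in zip(hdr, original, weights)]


hdr = [[100.0, 100.0, 100.0]]
original = [[0.0, 0.0, 0.0]]
weights = weight_surface(3, 1, [(0, 0)], sigma=1.0)
hybrid = blend(hdr, original, weights)
```

With a single touch at the left pixel, the hybrid image equals the HDR image there and fades toward the original with distance.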
NASA Astrophysics Data System (ADS)
Sen, Suman
DNA, RNA and protein are three pivotal biomolecules in humans and other organisms, playing decisive roles in function, appearance, disease development and other physiological phenomena. Hence, sequencing of these biomolecules is of prime interest to the scientific community. Single molecular identification of their building blocks can be done by a technique called Recognition Tunneling (RT) based on Scanning Tunneling Microscope (STM). A single layer of specially designed recognition molecules is attached to the STM electrodes, which trap the targeted molecules (DNA nucleoside monophosphates, RNA nucleoside monophosphates or amino acids) inside the STM nanogap. Depending on their different binding interactions with the recognition molecules, the analyte molecules generate stochastic signal trains accommodating their "electronic fingerprints". Signal features are used to detect the molecules using a machine learning algorithm and different molecules can be identified with significantly high accuracy. This, in turn, paves the way for a rapid, economical nanopore sequencing platform, overcoming the drawbacks of Next Generation Sequencing (NGS) techniques. To read DNA nucleotides with high accuracy in an STM tunnel junction, a series of nitrogen-based heterocycles were designed and examined to check their capabilities to interact with naturally occurring DNA nucleotides by hydrogen bonding in the tunnel junction. These recognition molecules are Benzimidazole, Imidazole, Triazole and Pyrrole. Benzimidazole proved to be the best among them, showing DNA nucleotide classification accuracy close to 99%. Also, the Imidazole reader can read an abasic monophosphate (AP), a product of depurination or depyrimidination that occurs 10,000 times per human cell per day. In another study, I have investigated a new universal reader, 1-(2-mercaptoethyl)pyrene (Pyrene reader) based on stacking interactions, which should be more specific to the canonical DNA nucleosides.
In addition, the Pyrene reader showed higher DNA base-calling accuracy compared to the Imidazole reader, the workhorse in our previous projects. In my other projects, various amino acids and RNA nucleoside monophosphates were also classified with significantly high accuracy using RT. Twenty naturally occurring amino acids and various RNA nucleosides (four canonical and two modified) were successfully identified. Thus, we envision nanopore sequencing of biomolecules using Recognition Tunneling (RT) that should provide a comprehensive improvement over current technologies in terms of time, chemical and instrumental cost, and capability of de novo sequencing.
Trujillano, Daniel; Bullich, Gemma; Ossowski, Stephan; Ballarín, José; Torra, Roser; Estivill, Xavier; Ars, Elisabet
2014-09-01
Molecular diagnostics of autosomal dominant polycystic kidney disease (ADPKD) relies on mutation screening of PKD1 and PKD2, which is complicated by extensive allelic heterogeneity and the presence of six highly homologous sequences of PKD1. To date, specific sequencing of PKD1 requires laborious long-range amplifications. The high cost and long turnaround time of PKD1 and PKD2 mutation analysis using conventional techniques limits its widespread application in clinical settings. We performed targeted next-generation sequencing (NGS) of PKD1 and PKD2. Pooled barcoded DNA patient libraries were enriched by in-solution hybridization with PKD1 and PKD2 capture probes. Bioinformatics analysis was performed using an in-house developed pipeline. We validated the assay in a cohort of 36 patients with previously known PKD1 and PKD2 mutations and five control individuals. Then, we used the same assay and bioinformatics analysis in a discovery cohort of 12 uncharacterized patients. We detected 35 out of 36 known definitely, highly likely, and likely pathogenic mutations in the validation cohort, including two large deletions. In the discovery cohort, we detected 11 different pathogenic mutations in 10 out of 12 patients. This study demonstrates that laborious long-range PCRs of the repeated PKD1 region can be avoided by in-solution enrichment of PKD1 and PKD2 and NGS. This strategy significantly reduces the cost and time for simultaneous PKD1 and PKD2 sequence analysis, facilitating routine genetic diagnostics of ADPKD.
Method for high resolution magnetic resonance analysis using magic angle technique
Wind, Robert A.; Hu, Jian Zhi
2003-11-25
A method of performing a magnetic resonance analysis of a biological object that includes placing the biological object in a main magnetic field and in a radio frequency field, the main magnetic field having a static field direction; rotating the biological object at a rotational frequency of less than about 100 Hz around an axis positioned at an angle of about 54°44' relative to the main magnetic static field direction; pulsing the radio frequency to provide a sequence that includes a magic angle turning pulse segment; and collecting data generated by the pulsed radio frequency. According to another embodiment, the radio frequency is pulsed to provide a sequence capable of producing a spectrum that is substantially free of spinning sideband peaks.
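The rotation-axis angle quoted in the record matches the standard magic angle, the angle θ at which 3cos²θ − 1 vanishes, that is θ = arccos(1/√3). This relation is textbook NMR background rather than something stated in the abstract, and the function names in this short check are assumptions.

```python
import math


def magic_angle_degrees():
    """Angle at which 3*cos(theta)**2 - 1 = 0, in degrees."""
    return math.degrees(math.acos(1.0 / math.sqrt(3.0)))


def to_degrees_minutes(angle):
    """Split a positive angle in decimal degrees into degrees and arcminutes."""
    degrees = int(angle)
    minutes = (angle - degrees) * 60.0
    return degrees, minutes


theta = magic_angle_degrees()              # about 54.7356 degrees
degrees, minutes = to_degrees_minutes(theta)
```

Rounded to the nearest arcminute this gives 54 degrees 44 minutes.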
Fuzzy logic based on-line fault detection and classification in transmission line.
Adhikari, Shuma; Sinha, Nidul; Dorendrajit, Thingam
2016-01-01
This study presents fuzzy-logic-based online fault detection and classification for transmission lines using National Instruments Compact Reconfigurable I/O (CRIO) devices based on Programmable Automation and Control technology. LabVIEW software combined with CRIO can perform real-time data acquisition from the transmission line. When a fault occurs in the system, the current waveforms are distorted by transients, and their pattern changes according to the type of fault. The three-phase alternating current, zero-sequence, and positive-sequence current data generated by LabVIEW through the CRIO-9067 are processed directly for relaying. The results show that the proposed technique is capable of correct tripping action and fault-type classification at high speed and can therefore be employed in practical applications.
Alteration of hairpin ribozyme specificity utilizing PCR.
DeGrandis, P; Hampel, A; Galasinski, S; Borneman, J; Siwkowski, A; Altschuler, M
1994-12-01
We have developed a method by which a researcher can quickly alter the specificity of a trans hairpin ribozyme. Utilizing this PCR method, two oligonucleotides, and any target vector, new ribozyme template sequences can be generated without the synthesis of longer oligonucleotides. We have produced templates with altered specificity for both standard and modified (larger) ribozymes. After transcription, these ribozymes show specific cleavage activity with the new substrate beta-glucuronidase (GUS), and no activity against the original substrate (HIV-1, 5' leader sequence). Utilizing this technique, it is also possible to produce an inactive ribozyme that can be used as an antisense control. Applications of this procedure would provide a rapid and economical system for the assessment of trans ribozyme activity.
Dohrn, Maike F; Glöckle, Nicola; Mulahasanovic, Lejla; Heller, Corina; Mohr, Julia; Bauer, Christine; Riesch, Erik; Becker, Andrea; Battke, Florian; Hörtnagel, Konstanze; Hornemann, Thorsten; Suriyanarayanan, Saranya; Blankenburg, Markus; Schulz, Jörg B; Claeys, Kristl G; Gess, Burkhard; Katona, Istvan; Ferbert, Andreas; Vittore, Debora; Grimm, Alexander; Wolking, Stefan; Schöls, Ludger; Lerche, Holger; Korenke, G Christoph; Fischer, Dirk; Schrank, Bertold; Kotzaeridou, Urania; Kurlemann, Gerhard; Dräger, Bianca; Schirmacher, Anja; Young, Peter; Schlotter-Weigel, Beate; Biskup, Saskia
2017-12-01
Hereditary neuropathies comprise a wide variety of chronic diseases associated with more than 80 genes identified to date. We herein examined 612 index patients with either a Charcot-Marie-Tooth phenotype, hereditary sensory neuropathy, familial amyloid neuropathy, or small fiber neuropathy using a customized multigene panel based on the next generation sequencing technique. In 121 cases (19.8%), we identified at least one putative pathogenic mutation. Of these, 54.4% showed an autosomal dominant, 33.9% an autosomal recessive, and 11.6% an X-linked inheritance. The most frequently affected genes were PMP22 (16.4%), GJB1 (10.7%), MPZ, and SH3TC2 (both 9.9%), and MFN2 (8.3%). We further detected likely or known pathogenic variants in HINT1, HSPB1, NEFL, PRX, IGHMBP2, NDRG1, TTR, EGR2, FIG4, GDAP1, LMNA, LRSAM1, POLG, TRPV4, AARS, BIC2, DHTKD1, FGD4, HK1, INF2, KIF5A, PDK3, REEP1, SBF1, SBF2, SCN9A, and SPTLC2 with a declining frequency. Thirty-four novel variants were considered likely pathogenic not having previously been described in association with any disorder in the literature. In one patient, two homozygous mutations in HK1 were detected in the multigene panel, but not by whole exome sequencing. A novel missense mutation in KIF5A was considered pathogenic because of the highly compatible phenotype. In one patient, the plasma sphingolipid profile could functionally prove the pathogenicity of a mutation in SPTLC2. One pathogenic mutation in MPZ was identified after being previously missed by Sanger sequencing. We conclude that panel based next generation sequencing is a useful, time- and cost-effective approach to assist clinicians in identifying the correct diagnosis and enable causative treatment considerations. © 2017 International Society for Neurochemistry.
Next-Generation Sequencing Reveals Significant Bacterial Diversity of Botrytized Wine
Bokulich, Nicholas A.; Joseph, C. M. Lucy; Allen, Greg; Benson, Andrew K.; Mills, David A.
2012-01-01
While wine fermentation has long been known to involve complex microbial communities, the composition and role of bacteria other than a select set of lactic acid bacteria (LAB) has often been assumed either negligible or detrimental. This study served as a pilot study for using barcoded amplicon next-generation sequencing to profile bacterial community structure in wines and grape musts, comparing the taxonomic depth achieved by sequencing two different domains of prokaryotic 16S rDNA (V4 and V5). This study was designed to serve two goals: 1) to empirically determine the most taxonomically informative 16S rDNA target region for barcoded amplicon sequencing of wine, comparing V4 and V5 domains of bacterial 16S rDNA to terminal restriction fragment length polymorphism (TRFLP) of LAB communities; and 2) to explore the bacterial communities of wine fermentation to better understand the biodiversity of wine at a depth previously unattainable using other techniques. Analysis of amplicons from the V4 and V5 provided similar views of the bacterial communities of botrytized wine fermentations, revealing a broad diversity of low-abundance taxa not traditionally associated with wine, as well as atypical LAB communities initially detected by TRFLP. The V4 domain was determined as the more suitable read for wine ecology studies, as it provided greater taxonomic depth for profiling LAB communities. In addition, targeted enrichment was used to isolate two species of Alphaproteobacteria from a finished fermentation. Significant differences in diversity between inoculated and uninoculated samples suggest that Saccharomyces inoculation exerts selective pressure on bacterial diversity in these fermentations, most notably suppressing abundance of acetic acid bacteria. 
These results determine the bacterial diversity of botrytized wines to be far higher than previously realized, providing further insight into the fermentation dynamics of these wines, and demonstrate the utility of next-generation sequencing for wine ecology studies. PMID:22563494
Generating Models of Surgical Procedures using UMLS Concepts and Multiple Sequence Alignment
Meng, Frank; D’Avolio, Leonard W.; Chen, Andrew A.; Taira, Ricky K.; Kangarloo, Hooshang
2005-01-01
Surgical procedures can be viewed as a process composed of a sequence of steps performed on, by, or with the patient’s anatomy. This sequence is typically the pattern followed by surgeons when generating surgical report narratives for documenting surgical procedures. This paper describes a methodology for semi-automatically deriving a model of conducted surgeries, utilizing a sequence of derived Unified Medical Language System (UMLS) concepts for representing surgical procedures. A multiple sequence alignment was computed from a collection of such sequences and was used for generating the model. These models have the potential of being useful in a variety of informatics applications such as information retrieval and automatic document generation. PMID:16779094
Hybridization and sequencing of nucleic acids using base pair mismatches
Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua
2001-01-01
Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.
Micropipette force probe to quantify single-cell force generation: application to T-cell activation.
Sawicka, Anna; Babataheri, Avin; Dogniaux, Stéphanie; Barakat, Abdul I; Gonzalez-Rodriguez, David; Hivroz, Claire; Husson, Julien
2017-11-07
In response to engagement of surface molecules, cells generate active forces that regulate many cellular processes. Developing tools that permit gathering mechanical and morphological information on these forces is of the utmost importance. Here we describe a new technique, the micropipette force probe, that uses a micropipette as a flexible cantilever that can aspirate at its tip a bead that is coated with molecules of interest and is brought in contact with the cell. This technique simultaneously allows tracking the resulting changes in cell morphology and mechanics as well as measuring the forces generated by the cell. To illustrate the power of this technique, we applied it to the study of human primary T lymphocytes (T-cells). It allowed the fine monitoring of pushing and pulling forces generated by T-cells in response to various activating antibodies and bending stiffness of the micropipette. We further dissected the sequence of mechanical and morphological events occurring during T-cell activation to model force generation and to reveal heterogeneity in the cell population studied. We also report the first measurement of the changes in Young's modulus of T-cells during their activation, showing that T-cells stiffen within the first minutes of the activation process. © 2017 Sawicka et al. This article is distributed by The American Society for Cell Biology under license from the author(s). Two months after publication it is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).
Assessing the Impact of Assemblers on Virus Detection in a De Novo Metagenomic Analysis Pipeline.
White, Daniel J; Wang, Jing; Hall, Richard J
2017-09-01
Applying high-throughput sequencing to pathogen discovery is a relatively new field, the objective of which is to find disease-causing agents when little or no background information on disease is available. Key steps in the process are the generation of millions of sequence reads from an infected tissue sample, followed by assembly of these reads into longer, contiguous stretches of nucleotide sequences, and then identification of the contigs by matching them to known databases, such as those stored at GenBank or Ensembl. This technique, that is, de novo metagenomics, is particularly useful when the pathogen is viral and strong discriminatory power can be achieved. However, recently, we found that striking differences in results can be achieved when different assemblers were used. In this study, we test formally the impact of five popular assemblers (MIRA, VELVET, METAVELVET, SPADES, and OMEGA) on the detection of a novel virus and assembly of its whole genome in a data set for which we have confirmed the presence of the virus by empirical laboratory techniques, and compare the overall performance between assemblers. Our results show that if results from only one assembler are considered, biologically important reads can easily be overlooked. The impacts of these results on the field of pathogen discovery are considered.
Long Read Alignment with Parallel MapReduce Cloud Platform
Al-Absi, Ahmed Abdulhakim; Kang, Dae-Ki
2015-01-01
Genomic sequence alignment is an important technique to decode genome sequences in bioinformatics. Next-Generation Sequencing technologies produce genomic data of longer reads. Cloud platforms are adopted to address the problems arising from storage and analysis of large genomic data. Existing genes sequencing tools for cloud platforms predominantly consider short read gene sequences and adopt the Hadoop MapReduce framework for computation. However, serial execution of map and reduce phases is a problem in such systems. Therefore, in this paper, we introduce Burrows-Wheeler Aligner's Smith-Waterman Alignment on Parallel MapReduce (BWASW-PMR) cloud platform for long sequence alignment. The proposed cloud platform adopts a widely accepted and accurate BWA-SW algorithm for long sequence alignment. A custom MapReduce platform is developed to overcome the drawbacks of the Hadoop framework. A parallel execution strategy of the MapReduce phases and optimization of Smith-Waterman algorithm are considered. Performance evaluation results exhibit an average speed-up of 6.7 considering BWASW-PMR compared with the state-of-the-art Bwasw-Cloud. An average reduction of 30% in the map phase makespan is reported across all experiments comparing BWASW-PMR with Bwasw-Cloud. Optimization of Smith-Waterman results in reducing the execution time by 91.8%. The experimental study proves the efficiency of BWASW-PMR for aligning long genomic sequences on cloud platforms. PMID:26839887
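For readers unfamiliar with the Smith-Waterman step mentioned in this abstract, the following is a minimal sketch of the textbook local-alignment recurrence in Python. It is not the BWA-SW heuristic or the BWASW-PMR implementation; the scoring values (match 2, mismatch -1, linear gap -2) and the function name are assumptions chosen only for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Uses the basic dynamic-programming recurrence with a linear gap penalty
    (an assumption; production aligners typically use affine gaps).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; every cell is floored at 0 so alignments can restart.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(0, diag, up, left)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("TTTACGTAAA", "GGACGTCC"))
```

The matrix fill is quadratic in the read lengths, which is one reason long-read aligners rely on heuristics and parallel execution rather than the plain recurrence.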
Stelzl, Evelyn; Haas, Bernhard; Bauer, Bernd; Zhang, Sherry; Fiss, Ellen H; Hillman, Grantland; Hamilton, Aaron T; Mehta, Rochak; Heil, Marintha L; Marins, Ed G; Santner, Brigitte I; Kessler, Harald H
2017-01-01
Hepatitis C virus (HCV) intergenotypic recombinant forms have been reported for various HCV genotypes/subtypes in several countries worldwide. In a recent study, four patients living in Austria had been identified to be possibly infected with a recombinant HCV strain. To clarify results and determine the point of recombination, full-genome next-generation sequencing using the Illumina MiSeq v2 300 cycle kit (Illumina, San Diego, CA, USA) was performed in the present study. Samples of all of the patients contained the recombinant HCV strain 2k/1b. The point of recombination was found to be within the HCV NS2 gene between nucleotide positions 3189-3200 based on H77 numbering. While three of four patients were male and had a migration background from Chechnya (n = 2) and Azerbaijan (n = 1), the fourth patient was a female born in Austria. Three of the four patients including the female had intravenous drug abuse as a risk factor for HCV transmission. While sequencing techniques are limited to a few specialized laboratories, a genotyping assay that uses both ends of the HCV genome should be employed to identify patients infected with a recombinant HCV strain. The correct identification of recombinant strains also has an impact considering the tailored choice of anti-HCV treatment.
ToTem: a tool for variant calling pipeline optimization.
Tom, Nikola; Tom, Ondrej; Malcikova, Jitka; Pavlova, Sarka; Kubesova, Blanka; Rausch, Tobias; Kolarik, Miroslav; Benes, Vladimir; Bystry, Vojtech; Pospisilova, Sarka
2018-06-26
High-throughput bioinformatics analyses of next generation sequencing (NGS) data often require challenging pipeline optimization. The key problem is choosing appropriate tools and selecting the best parameters for optimal precision and recall. Here we introduce ToTem, a tool for automated pipeline optimization. ToTem is a stand-alone web application with a comprehensive graphical user interface (GUI). ToTem is written in Java and PHP with an underlying connection to a MySQL database. Its primary role is to automatically generate, execute and benchmark different variant calling pipeline settings. Our tool allows an analysis to be started from any level of the process and with the possibility of plugging almost any tool or code. To prevent an over-fitting of pipeline parameters, ToTem ensures the reproducibility of these by using cross validation techniques that penalize the final precision, recall and F-measure. The results are interpreted as interactive graphs and tables allowing an optimal pipeline to be selected, based on the user's priorities. Using ToTem, we were able to optimize somatic variant calling from ultra-deep targeted gene sequencing (TGS) data and germline variant detection in whole genome sequencing (WGS) data. ToTem is a tool for automated pipeline optimization which is freely available as a web application at https://totem.software .
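The benchmarking metrics named in this abstract (precision, recall, and F-measure) can be illustrated with a small Python sketch. This is not ToTem code: the function name, the representation of variants as tuples, and the use of the balanced F1 form of the F-measure are assumptions for illustration, and the cross-validation penalty described in the abstract is not modeled.

```python
def precision_recall_f1(called, truth):
    """Compare a set of called variants with a truth set.

    Both arguments are sets of hashable variant keys, for example
    (chromosome, position, ref, alt) tuples.
    """
    tp = len(called & truth)   # variants called and present in the truth set
    fp = len(called - truth)   # variants called but absent from the truth set
    fn = len(truth - called)   # true variants that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


truth_set = {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"),
             ("chr2", 50, "C", "T"), ("chr3", 10, "T", "G")}
called_set = {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"),
              ("chr4", 5, "A", "G")}
print(precision_recall_f1(called_set, truth_set))
```

In this toy example two of three calls are correct and two of four true variants are found, so precision is 2/3 and recall is 1/2.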
Mapping 3D genome architecture through in situ DNase Hi-C.
Ramani, Vijay; Cusanovich, Darren A; Hause, Ronald J; Ma, Wenxiu; Qiu, Ruolan; Deng, Xinxian; Blau, C Anthony; Disteche, Christine M; Noble, William S; Shendure, Jay; Duan, Zhijun
2016-11-01
With the advent of massively parallel sequencing, considerable work has gone into adapting chromosome conformation capture (3C) techniques to study chromosomal architecture at a genome-wide scale. We recently demonstrated that the inactive murine X chromosome adopts a bipartite structure using a novel 3C protocol, termed in situ DNase Hi-C. Like traditional Hi-C protocols, in situ DNase Hi-C requires that chromatin be chemically cross-linked, digested, end-repaired, and proximity-ligated with a biotinylated bridge adaptor. The resulting ligation products are optionally sheared, affinity-purified via streptavidin bead immobilization, and subjected to traditional next-generation library preparation for Illumina paired-end sequencing. Importantly, in situ DNase Hi-C obviates the dependence on a restriction enzyme to digest chromatin, instead relying on the endonuclease DNase I. Libraries generated by in situ DNase Hi-C have a higher effective resolution than traditional Hi-C libraries, which makes them valuable in cases in which high sequencing depth is allowed for, or when hybrid capture technologies are expected to be used. The protocol described here, which involves ∼4 d of bench work, is optimized for the study of mammalian cells, but it can be broadly applicable to any cell or tissue of interest, given experimental parameter optimization.
Spatio-Temporal Dynamics of Impulse Responses to Figure Motion in Optic Flow Neurons
Lee, Yu-Jen; Jönsson, H. Olof; Nordström, Karin
2015-01-01
White noise techniques have been used widely to investigate sensory systems in both vertebrates and invertebrates. White noise stimuli are powerful in their ability to rapidly generate data that help the experimenter decipher the spatio-temporal dynamics of neural and behavioral responses. One type of white noise stimuli, maximal length shift register sequences (m-sequences), have recently become particularly popular for extracting response kernels in insect motion vision. We here use such m-sequences to extract the impulse responses to figure motion in hoverfly lobula plate tangential cells (LPTCs). Figure motion is behaviorally important and many visually guided animals orient towards salient features in the surround. We show that LPTCs respond robustly to figure motion in the receptive field. The impulse response is scaled down in amplitude when the figure size is reduced, but its time course remains unaltered. However, a low contrast stimulus generates a slower response with a significantly longer time-to-peak and half-width. Impulse responses in females have a slower time-to-peak than males, but are otherwise similar. Finally we show that the shapes of the impulse response to a figure and a widefield stimulus are very similar, suggesting that the figure response could be coded by the same input as the widefield response. PMID:25955416
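A maximal length shift register sequence of the kind referred to in this abstract can be produced with a linear feedback shift register. The Python sketch below uses a 4-bit register with feedback taps at positions 4 and 3, a commonly tabulated maximal-length choice; the register size, taps, seed, and function names are illustrative assumptions and are not the stimulus parameters used in the study.

```python
def m_sequence(taps=(4, 3), seed=(1, 0, 0, 0)):
    """Return one period of a shift-register sequence as a list of 0/1 bits.

    taps are 1-based register positions XORed to form the feedback bit.
    The output is maximal length (period 2**n - 1) only for suitable taps.
    """
    state = list(seed)
    if not any(state):
        raise ValueError("seed must contain at least one 1")
    n = len(state)
    out = []
    for _ in range(2 ** n - 1):
        out.append(state[-1])            # output the last register cell
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]     # XOR of the tapped cells
        state = [feedback] + state[:-1]  # shift and insert the feedback bit
    return out


def circular_autocorrelation(bits, shift):
    """Circular autocorrelation after mapping bits 0/1 to +1/-1."""
    signs = [1 - 2 * b for b in bits]
    n = len(signs)
    return sum(signs[i] * signs[(i + shift) % n] for i in range(n))


seq = m_sequence()
print(seq)
print([circular_autocorrelation(seq, k) for k in range(len(seq))])
```

The flat off-peak autocorrelation is the property that makes such sequences behave like white noise when estimating impulse responses by cross-correlation.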
From Conventional to Next Generation Sequencing of Epstein-Barr Virus Genomes.
Kwok, Hin; Chiang, Alan Kwok Shing
2016-02-24
Genomic sequences of Epstein-Barr virus (EBV) have been of interest because the virus is associated with cancers, such as nasopharyngeal carcinoma, and conditions such as infectious mononucleosis. The progress of whole-genome EBV sequencing has been limited by the inefficiency and cost of the first-generation sequencing technology. With the advancement of next-generation sequencing (NGS) and target enrichment strategies, an increasing number of EBV genomes have been published. These genomes were sequenced using different approaches, either with or without EBV DNA enrichment. This review provides an overview of the EBV genomes published to date, and a description of the sequencing technology and bioinformatic analyses employed in generating these sequences. We further explored ways through which the quality of sequencing data can be improved, such as using DNA oligos for capture hybridization, and longer insert size and read length in the sequencing runs. These advances will enable large-scale genomic sequencing of EBV which will facilitate a better understanding of the genetic variations of EBV in different geographic regions and discovery of potentially pathogenic variants in specific diseases.
Program Synthesizes UML Sequence Diagrams
NASA Technical Reports Server (NTRS)
Barry, Matthew R.; Osborne, Richard N.
2006-01-01
A computer program called "Rational Sequence" generates Unified Modeling Language (UML) sequence diagrams of a target Java program running on a Java virtual machine (JVM). Rational Sequence thereby performs a reverse engineering function that aids in the design documentation of the target Java program. Whereas the construction of sequence diagrams was previously a tedious manual process, Rational Sequence generates UML sequence diagrams automatically from the running Java code.
Yang, Litao; Liang, Wanqi; Jiang, Lingxi; Li, Wenquan; Cao, Wei; Wilson, Zoe A; Zhang, Dabing
2008-01-01
Background: Real-time PCR techniques are widely used for nucleic acid analysis, but one limitation of the real-time PCR methods most frequently employed at present is the high cost of the labeled probe for each target molecule. Results: We describe a real-time PCR technique employing attached universal duplex probes (AUDP), which has the advantage of generating fluorescence by probe hydrolysis and strand displacement over current real-time PCR methods. AUDP involves one set of universal duplex probes in which the 5' end of the fluorescent probe (FP) and a complementary quenching probe (QP) lie in close proximity so that fluorescence can be quenched. The PCR primer pair with attached universal template (UT) and the FP are identical to the UT sequence. We have shown that the AUDP technique can be used for detecting multiple target DNA sequences in both simplex and duplex real-time PCR assays for gene expression analysis, genotype identification, and genetically modified organism (GMO) quantification with comparable sensitivity, reproducibility, and repeatability with other real-time PCR methods. Conclusion: The results from gene expression analysis, genotype identification, and GMO quantification using AUDP real-time PCR assays indicate that the AUDP real-time PCR technique has been successfully applied in nucleic acid analysis, and the developed AUDP real-time PCR technique will offer an alternative way for nucleic acid analysis with high efficiency, reliability, and flexibility at low cost. PMID:18522756
USDA-ARS's Scientific Manuscript database
Next-generation sequencing technologies were used to rapidly and efficiently sequence the genome of the domestic turkey (Meleagris gallopavo). The current genome assembly (~1.1 Gb) includes 917 Mb of sequence assigned to chromosomes. Innate heterozygosity of the sequenced bird allowed discovery of...
McKernan, Kevin J.; Spangler, Jessica; Zhang, Lei; Tadigotla, Vasisht; McLaughlin, Stephen; Warner, Jason; Zare, Amir; Boles, Richard G.
2014-01-01
We have developed a PCR method, coined Déjà vu PCR, that utilizes six nucleotides in PCR with two methyl specific restriction enzymes that respectively digest these additional nucleotides. Use of this enzyme-and-nucleotide combination enables what we term a “DNA diode”, where DNA can advance in a laboratory in only one direction and cannot feedback into upstream assays. Here we describe aspects of this method that enable consecutive amplification with the introduction of a 5th and 6th base while simultaneously providing methylation dependent mitochondrial DNA enrichment. These additional nucleotides enable a novel DNA decontamination technique that generates ephemeral and easy to decontaminate DNA. PMID:24788618
Recent patents of nanopore DNA sequencing technology: progress and challenges.
Zhou, Jianfeng; Xu, Bingqian
2010-11-01
DNA sequencing techniques have developed rapidly in recent decades, primarily driven by the Human Genome Project. Among the proposed new techniques, nanopore sequencing has been considered a suitable candidate for single-molecule DNA sequencing with ultrahigh speed and very low cost. Several fabrication and modification techniques have been developed to produce robust and well-defined nanopore devices. Many efforts have also been made to apply nanopores to analyze the properties of DNA molecules. Compared with traditional sequencing techniques, nanopores have demonstrated distinct advantages in the main practical issues, such as sample preparation, sequencing speed, cost-effectiveness, and read length. Although challenges remain, recent research on improving the capabilities of nanopores has shed light on how to achieve the ultimate goal: sequencing individual DNA strands at single-nucleotide resolution. This patent review briefly highlights recent developments and technological achievements in DNA analysis and sequencing at the single-molecule level, focusing on nanopore-based methods.
Applications of nanotechnology, next generation sequencing and microarrays in biomedical research.
Elingaramil, Sauli; Li, Xiaolong; He, Nongyue
2013-07-01
Next-generation sequencing technologies, microarrays and advances in bionanotechnology have had an enormous impact on research within a short time frame. This impact appears certain to increase further as many biomedical institutions are now acquiring these prevailing new technologies. Beyond conventional sampling of genome content, wide-ranging applications are rapidly evolving for next-generation sequencing, microarrays and nanotechnology. To date, these technologies have been applied in a variety of contexts, including whole-genome sequencing, targeted resequencing and discovery of transcription factor binding sites, noncoding RNA expression profiling and molecular diagnostics. This paper thus discusses current applications of nanotechnology, next-generation sequencing technologies and microarrays in biomedical research and highlights the transforming potential these technologies offer.
Overcoming Sequence Misalignments with Weighted Structural Superposition
Khazanov, Nickolay A.; Damm-Ganamet, Kelly L.; Quang, Daniel X.; Carlson, Heather A.
2012-01-01
An appropriate structural superposition identifies similarities and differences between homologous proteins that are not evident from sequence alignments alone. We have coupled our Gaussian-weighted RMSD (wRMSD) tool with a sequence aligner and seed extension (SE) algorithm to create a robust technique for overlaying structures and aligning sequences of homologous proteins (HwRMSD). HwRMSD overcomes errors in the initial sequence alignment that would normally propagate into a standard RMSD overlay. SE can generate a corrected sequence alignment from the improved structural superposition obtained by wRMSD. HwRMSD’s robust performance and its superiority over standard RMSD are demonstrated over a range of homologous proteins. Its better overlay results in corrected sequence alignments with good agreement to HOMSTRAD. Finally, HwRMSD is compared to established structural alignment methods: FATCAT, SSM, CE, and DaliLite. Most methods are comparable at placing residue pairs within 2 Å, but HwRMSD places many more residue pairs within 1 Å, providing a clear advantage. Such high accuracy is essential in drug design, where small distances can have a large impact on computational predictions. This level of accuracy is also needed to correct sequence alignments in an automated fashion, especially for omics-scale analysis. HwRMSD can align homologs with low sequence identity and large conformational differences, cases where both sequence-based and structural-based methods may fail. The HwRMSD pipeline overcomes the dependency of structural overlays on initial sequence pairing and removes the need to determine the best sequence-alignment method, substitution matrix, and gap parameters for each unique pair of homologs. PMID:22733542
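To illustrate why a Gaussian weighting makes an RMSD less sensitive to a few badly paired residues, the Python sketch below scores already-superposed coordinate pairs with weights of the form exp(-d²/c). The weight form, the scaling constant c = 5.0 Å², and the function names are assumptions made for this example; the abstract does not give the formula, and the iterative superposition and seed-extension steps of HwRMSD are not reproduced here.

```python
import math


def squared_distances(coords_a, coords_b):
    """Squared distance between each pair of matched 3D points."""
    return [sum((p - q) ** 2 for p, q in zip(a, b))
            for a, b in zip(coords_a, coords_b)]


def plain_rmsd(coords_a, coords_b):
    """Standard RMSD: every matched pair contributes equally."""
    sq = squared_distances(coords_a, coords_b)
    return math.sqrt(sum(sq) / len(sq))


def gaussian_weighted_rmsd(coords_a, coords_b, c=5.0):
    """Weighted RMSD with assumed weights w_i = exp(-d_i**2 / c)."""
    sq = squared_distances(coords_a, coords_b)
    weights = [math.exp(-d2 / c) for d2 in sq]
    return math.sqrt(sum(w * d2 for w, d2 in zip(weights, sq)) / sum(weights))


# Three well-matched pairs (0.1 apart) and one outlier pair (10 apart).
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
structure_b = [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0), (2.1, 0.0, 0.0), (13.0, 0.0, 0.0)]
print(plain_rmsd(structure_a, structure_b))
print(gaussian_weighted_rmsd(structure_a, structure_b))
```

In this toy case the single outlier dominates the plain RMSD, while the weighted value stays close to the 0.1 separation of the well-matched pairs.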
Yang, Lisha; Ijaz, Iqra; Cheng, Jingliang; Wei, Chunli; Tan, Xiaojun; Khan, Md Asaduzzaman; Fu, Xiaodong; Fu, Junjiang
2018-01-01
Choroideremia is a rare X-linked recessive inherited disorder that causes chorioretinal dystrophy, leading to visual impairment in its early stages and eventually to total blindness in the affected person. It is caused by mutations in the CHM gene. In this study, we recruited a pedigree with choroideremia and detected a nonsense variant (c.C799T:p.R267X) in CHM of the proband (I:1). Different primer sets for amplification refractory mutation system (ARMS) were designed and PCR conditions were optimized. Then, we evaluated the sequence variant in the patient, carrier, and a fetus using the ARMS technique to identify whether they inherited the pathogenic gene from the parental generation; we used amniotic fluid DNA for the diagnosis of the gene in the fetus. The primer pairs, WT2+C and MT+C, amplified highly specific products in different DNAs, which were verified by Sanger sequencing. Based on our results, the ARMS technique is a fast, accurate, and reliable prenatal gene diagnostic tool to assess CHM variants. Taken together, our study indicates that the ARMS technique can be used as a potential molecular tool in the prenatal diagnosis of mutations for choroideremia as well as other genetic diseases in undeveloped and developing countries, where there might be a shortage of medical resources and supplies.
Biswas, Sovan; Sen, Suman; Im, JongOne; Biswas, Sudipta; Krstic, Predrag; Ashcroft, Brian; Borges, Chad; Zhao, Yanan; Lindsay, Stuart; Zhang, Peiming
2016-12-27
A reader molecule, which recognizes all the naturally occurring nucleobases in an electron tunnel junction, is required for sequencing DNA by a recognition tunneling (RT) technique, referred to as a universal reader. In the present study, we have designed a series of heterocyclic carboxamides based on hydrogen bonding and a large-sized pyrene ring based on a π-π stacking interaction as universal reader candidates. Each of these compounds was synthesized to bear a thiolated linker for attachment to metal electrodes and examined for their interactions with naturally occurring DNA nucleosides and nucleotides by ¹H NMR, ESI-MS, computational calculations, and surface plasmon resonance. RT measurements were carried out in a scanning tunneling microscope. All of these molecules generated electrical signals with DNA nucleotides in tunneling junctions under physiological conditions (phosphate buffered aqueous solution, pH 7.4). Using a support vector machine as a tool for data analysis, we found that these candidates distinguished among naturally occurring DNA nucleotides with the accuracy of pyrene (by π-π stacking interactions) > azole carboxamides (by hydrogen-bonding interactions). In addition, the pyrene reader operated efficiently in a larger tunnel junction. However, the azole carboxamide could read abasic (AP) monophosphate, a product from spontaneous base hydrolysis or an intermediate of base excision repair. Thus, we envision that sequencing DNA using both π-π stacking and hydrogen-bonding-based universal readers in parallel should generate more comprehensive genome sequences than sequencing based on either reader molecule alone.
Genome-wide gene–gene interaction analysis for next-generation sequencing
Zhao, Jinying; Zhu, Yun; Xiong, Momiao
2016-01-01
The critical barrier in interaction analysis for next-generation sequencing (NGS) data is that the traditional pairwise interaction analysis that is suitable for common variants is difficult to apply to rare variants because of their prohibitive computational time, large number of tests and low power. The great challenges for successful detection of interactions with NGS data are (1) the demands in the paradigm of changes in interaction analysis; (2) severe multiple testing; and (3) heavy computations. To meet these challenges, we shift the paradigm of interaction analysis between two SNPs to interaction analysis between two genomic regions. In other words, we take a gene as a unit of analysis and use functional data analysis techniques as dimensional reduction tools to develop a novel statistic to collectively test interaction between all possible pairs of SNPs within two genome regions. By intensive simulations, we demonstrate that the functional logistic regression for interaction analysis has the correct type 1 error rates and higher power to detect interaction than the currently used methods. The proposed method was applied to a coronary artery disease dataset from the Wellcome Trust Case Control Consortium (WTCCC) study and the Framingham Heart Study (FHS) dataset, and the early-onset myocardial infarction (EOMI) exome sequence datasets with European origin from the NHLBI's Exome Sequencing Project. We discovered that 6 of 27 pairs of significantly interacted genes in the FHS were replicated in the independent WTCCC study and 24 pairs of significantly interacted genes after applying Bonferroni correction in the EOMI study. PMID:26173972
NGS-based likelihood ratio for identifying contributors in two- and three-person DNA mixtures.
Chan Mun Wei, Joshua; Zhao, Zicheng; Li, Shuai Cheng; Ng, Yen Kaow
2018-06-01
DNA fingerprinting, also known as DNA profiling, serves as a standard procedure in forensics to identify a person by the short tandem repeat (STR) loci in their DNA. By comparing the STR loci between DNA samples, practitioners can calculate a probability of match to identify the contributors of a DNA mixture. Most existing methods are based on 13 core STR loci which were identified by the Federal Bureau of Investigation (FBI). Analyses of DNA mixtures based on these loci for forensic purposes are highly variable in procedures, and suffer from subjectivity as well as bias in complex mixture interpretation. With the emergence of next-generation sequencing (NGS) technologies, the sequencing of billions of DNA molecules can be parallelized, thus greatly increasing throughput and reducing the associated costs. This allows the creation of new techniques that incorporate more loci to enable complex mixture interpretation. In this paper, we propose a computation for likelihood ratio that uses NGS data for DNA testing on mixed samples. We have applied the method to 4480 simulated DNA mixtures, which consist of various mixture proportions of 8 unrelated whole-genome sequencing data. The results confirm the feasibility of utilizing NGS data in DNA mixture interpretations. We observed an average likelihood ratio as high as 285,978 for two-person mixtures. Using our method, all 224 identity tests for two-person mixtures and three-person mixtures were correctly identified. Copyright © 2018 Elsevier Ltd. All rights reserved.
Information entropy of humpback whale songs.
Suzuki, Ryuji; Buck, John R; Tyack, Peter L
2006-03-01
The structure of humpback whale (Megaptera novaeangliae) songs was examined using information theory techniques. The song is an ordered sequence of individual sound elements separated by gaps of silence. Song samples were converted into sequences of discrete symbols by both human and automated classifiers. This paper analyzes the song structure in these symbol sequences using information entropy estimators and autocorrelation estimators. Both parametric and nonparametric entropy estimators are applied to the symbol sequences representing the songs. The results provide quantitative evidence consistent with the hierarchical structure proposed for these songs by Payne and McVay [Science 173, 587-597 (1971)]. Specifically, this analysis demonstrates that: (1) There is a strong structural constraint, or syntax, in the generation of the songs, and (2) the structural constraints exhibit periodicities with periods of 6-8 and 180-400 units. This implies that no empirical Markov model is capable of representing the songs' structure. The results are robust to the choice of either human or automated song-to-symbol classifiers. In addition, the entropy estimates indicate that the maximum amount of information that could be communicated by the sequence of sounds made is less than 1 bit per second.
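As a rough illustration of the kind of estimator mentioned in the abstract above, the sketch below computes plug-in (frequency-based) estimates of the Shannon entropy and of the first-order conditional entropy of a discrete symbol sequence. The function names and the toy sequence are hypothetical, and this is not the estimator used by the authors; it only shows how a strong sequential constraint lowers the conditional entropy relative to the unconditional one.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Plug-in estimate of H(X) in bits from the symbol frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def conditional_entropy(symbols):
    """Plug-in estimate of H(X_t | X_{t-1}) in bits from bigram counts."""
    pairs = list(zip(symbols, symbols[1:]))
    pair_counts = Counter(pairs)
    prev_counts = Counter(prev for prev, _ in pairs)
    n = len(pairs)
    h = 0.0
    for (prev, _), c in pair_counts.items():
        # c / n is the joint probability, c / prev_counts[prev] the conditional one
        h -= (c / n) * math.log2(c / prev_counts[prev])
    return h


# Toy, fully periodic "song" (hypothetical data): three equally frequent units.
song = list("ABCABCABCABC")
print(shannon_entropy(song))      # log2(3), about 1.585 bits per symbol
print(conditional_entropy(song))  # 0.0: each unit fully determines the next
```

Plug-in estimates of this kind are biased for short sequences, so they are best read as a first look rather than a final measurement.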
Lou, Tzu-Fang; Weidmann, Chase A; Killingsworth, Jordan; Tanaka Hall, Traci M; Goldstrohm, Aaron C; Campbell, Zachary T
2017-04-15
RNA-binding proteins (RBPs) collaborate to control virtually every aspect of RNA function. Tremendous progress has been made in the area of global assessment of RBP specificity using next-generation sequencing approaches both in vivo and in vitro. Understanding how protein-protein interactions enable precise combinatorial regulation of RNA remains a significant problem. Addressing this challenge requires tools that can quantitatively determine the specificities of both individual proteins and multimeric complexes in an unbiased and comprehensive way. One approach utilizes in vitro selection, high-throughput sequencing, and sequence-specificity landscapes (SEQRS). We outline a SEQRS experiment focused on obtaining the specificity of a multi-protein complex between Drosophila RBPs Pumilio (Pum) and Nanos (Nos). We discuss the necessary controls in this type of experiment and examine how the resulting data can be complemented with structural and cell-based reporter assays. Additionally, SEQRS data can be integrated with functional genomics data to uncover biological function. Finally, we propose extensions of the technique that will enhance our understanding of multi-protein regulatory complexes assembled onto RNA. Copyright © 2016 Elsevier Inc. All rights reserved.
Demidov, German; Simakova, Tamara; Vnuchkova, Julia; Bragin, Anton
2016-10-22
Multiplex polymerase chain reaction (PCR) is a common enrichment technique for targeted massive parallel sequencing (MPS) protocols. MPS is widely used in biomedical research and clinical diagnostics as a fast and accurate tool for the detection of short genetic variations. However, identification of larger variations such as structural variants and copy number variations (CNV) remains a challenge for targeted MPS. Some approaches and tools for structural variant detection have been proposed, but they have limitations and often require datasets of a certain type, size and expected number of amplicons affected by CNVs. In this paper, we describe a novel algorithm for high-resolution germinal CNV detection in PCR-enriched targeted sequencing data and present an accompanying tool. We have developed a machine learning algorithm for the detection of large duplications and deletions in targeted sequencing data generated with a PCR-based enrichment step. We have performed verification studies and established the algorithm's sensitivity and specificity. We have compared the developed tool with other available methods applicable to the described data and found that it performs better. We showed that our method has high specificity and sensitivity for high-resolution copy number detection in targeted sequencing data using a large cohort of samples.
Itskov, Vladimir; Curto, Carina; Pastalkova, Eva; Buzsáki, György
2011-01-01
Hippocampal neurons can display reliable and long-lasting sequences of transient firing patterns, even in the absence of changing external stimuli. We suggest that time-keeping is an important function of these sequences, and propose a network mechanism for their generation. We show that sequences of neuronal assemblies recorded from rat hippocampal CA1 pyramidal cells can reliably predict elapsed time (15-20 sec) during wheel running with a precision of 0.5sec. In addition, we demonstrate the generation of multiple reliable, long-lasting sequences in a recurrent network model. These sequences are generated in the presence of noisy, unstructured inputs to the network, mimicking stationary sensory input. Identical initial conditions generate similar sequences, whereas different initial conditions give rise to distinct sequences. The key ingredients responsible for sequence generation in the model are threshold-adaptation and a Mexican-hat-like pattern of connectivity among pyramidal cells. This pattern may arise from recurrent systems such as the hippocampal CA3 region or the entorhinal cortex. We hypothesize that mechanisms that evolved for spatial navigation also support tracking of elapsed time in behaviorally relevant contexts. PMID:21414904
Gong, Jun; Pan, Kathy; Fakih, Marwan; Pal, Sumanta; Salgia, Ravi
2018-03-20
Advancements in next-generation sequencing have greatly enhanced the development of biomarker-driven cancer therapies. The affordability and availability of next-generation sequencers have allowed for the commercialization of next-generation sequencing platforms that have found widespread use for clinical-decision making and research purposes. Despite the greater availability of tumor molecular profiling by next-generation sequencing at our doorsteps, the achievement of value-based care, or improving patient outcomes while reducing overall costs or risks, in the era of precision oncology remains a looming challenge. In this review, we highlight available data through a pre-established and conceptualized framework for evaluating value-based medicine to assess the cost (efficiency), clinical benefit (effectiveness), and toxicity (safety) of genomic profiling in cancer care. We also provide perspectives on future directions of next-generation sequencing from targeted panels to whole-exome or whole-genome sequencing and describe potential strategies needed to attain value-based genomics.
Gong, Jun; Pan, Kathy; Fakih, Marwan; Pal, Sumanta; Salgia, Ravi
2018-01-01
Advancements in next-generation sequencing have greatly enhanced the development of biomarker-driven cancer therapies. The affordability and availability of next-generation sequencers have allowed for the commercialization of next-generation sequencing platforms that have found widespread use for clinical-decision making and research purposes. Despite the greater availability of tumor molecular profiling by next-generation sequencing at our doorsteps, the achievement of value-based care, or improving patient outcomes while reducing overall costs or risks, in the era of precision oncology remains a looming challenge. In this review, we highlight available data through a pre-established and conceptualized framework for evaluating value-based medicine to assess the cost (efficiency), clinical benefit (effectiveness), and toxicity (safety) of genomic profiling in cancer care. We also provide perspectives on future directions of next-generation sequencing from targeted panels to whole-exome or whole-genome sequencing and describe potential strategies needed to attain value-based genomics. PMID:29644010
Optimum projection pattern generation for grey-level coded structured light illumination systems
NASA Astrophysics Data System (ADS)
Porras-Aguilar, Rosario; Falaggis, Konstantinos; Ramos-Garcia, Ruben
2017-04-01
Structured light illumination (SLI) systems are well-established optical inspection techniques for noncontact 3D surface measurements. A common technique is multi-frequency sinusoidal SLI that obtains the phase map at various fringe periods in order to estimate the absolute phase, and hence, the 3D surface information. Nevertheless, multi-frequency SLI systems employ multiple measurement planes (e.g. four phase shifted frames) to obtain the phase at a given fringe period. It is therefore an age-old challenge to obtain the absolute surface information using fewer measurement frames. Grey level (GL) coding techniques have been developed as an attempt to reduce the number of planes needed, because a spatio-temporal GL sequence employing p discrete grey-levels and m frames has the potential to unwrap up to p^m fringes. Nevertheless, one major disadvantage of GL based SLI techniques is that there are often errors near the border of each stripe, because an ideal stepwise intensity change cannot be measured. If the step-change in intensity is a single discrete grey-level unit, this problem can usually be overcome by applying an appropriate threshold. However, severe errors occur if the intensity change at the border of the stripe exceeds several discrete grey-level units. In this work, an optimum GL based technique is presented that generates a series of projection patterns with a minimal gradient in the intensity. It is shown that when using this technique, the errors near the border of the stripes can be significantly reduced. This improvement is achieved through the choice of generated patterns, and does not involve additional hardware or special post-processing techniques. The performance of the method is validated using both simulations and experiments. The reported technique is generic, works with an arbitrary number of frames, and can employ an arbitrary number of grey-levels.
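To make the counting argument and the single-unit step condition concrete, the following sketch builds a reflected p-ary Gray code: p^m codewords of length m in which neighbouring codewords differ by exactly one grey-level unit in exactly one frame. This is a standard textbook construction used here only for illustration; whether it matches the patterns proposed in the paper is an assumption, and the function name is hypothetical.

```python
def reflected_gray_code(p, m):
    """Return p**m codewords (lists of m grey levels in 0..p-1).

    Codeword k gives the grey level of stripe k in each of the m frames.
    Adjacent codewords differ by +/-1 in a single frame only.
    """
    codes = [[]]
    for _ in range(m):
        new_codes = []
        for i, prefix in enumerate(codes):
            # Sweep the new digit upwards, then downwards, alternately,
            # so the digit does not jump when the prefix changes.
            digits = range(p) if i % 2 == 0 else range(p - 1, -1, -1)
            for d in digits:
                new_codes.append(prefix + [d])
        codes = new_codes
    return codes


codes = reflected_gray_code(p=3, m=2)
print(len(codes))  # 9 distinguishable stripes, i.e. 3**2
for a, b in zip(codes, codes[1:]):
    # Total grey-level change between neighbouring stripes is one unit.
    assert sum(abs(x - y) for x, y in zip(a, b)) == 1
```

With this ordering, the intensity step at every stripe border is a single grey-level unit, which is the situation the abstract describes as recoverable by thresholding.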
Video-signal synchronizes registration of visual evoked responses.
Vít, F; Kuba, M; Kremlácek, J; Kubová, Z; Horevaj, M
1996-01-01
Autodesk Animator software offers a suitable technique for visual stimulation in the registration of visual evoked responses (VERs). However, it is not possible to generate pulses that are synchronous with the animated sequences on any output port of the computer. These pulses are necessary for the synchronization of the computer that registers the VERs. We present the principle of a circuit that synchronizes the analyzer with the stimulation computer running Autodesk Animator software.
Edge enhancement of color images using a digital micromirror device.
Di Martino, J Matías; Flores, Jorge L; Ayubi, Gastón A; Alonso, Julia R; Fernández, Ariel; Ferrari, José A
2012-06-01
A method for orientation-selective enhancement of edges in color images is proposed. The method utilizes the capacity of digital micromirror devices to generate a positive and a negative color replica of the image used as input. When both images are slightly displaced and imaged together, one obtains an image with enhanced edges. The proposed technique does not require a coherent light source or precise alignment. The proposed method could be potentially useful for processing large image sequences in real time. Validation experiments are presented.
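The principle described above can be mimicked numerically. The sketch below is a one-dimensional, single-channel toy (hypothetical function name, made-up pixel values), not a model of the optical setup: it adds an image row to the negative of a slightly shifted copy, so that uniform regions give a constant level and only edges along the shift direction stand out.

```python
def edge_enhance_row(row, shift=1, max_value=255):
    """Superimpose a row with the negative of its shifted copy.

    positive(x) = row[x]
    negative(x) = max_value - row[x + shift]
    The sum is max_value wherever the row is locally flat and deviates
    from max_value where the intensity changes over `shift` pixels.
    """
    out = []
    for x in range(len(row) - shift):
        positive = row[x]
        negative = max_value - row[x + shift]
        out.append(positive + negative)
    return out


row = [10, 10, 10, 200, 200, 200]  # a single dark-to-bright edge
print(edge_enhance_row(row))       # [255, 255, 65, 255, 255]
```

Applying the shift along rows or along columns selects which edge orientation is enhanced, which is one way to read the "orientation-selective" property mentioned in the abstract.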
Time domain convergence properties of Lyapunov stable penalty methods
NASA Technical Reports Server (NTRS)
Kurdila, A. J.; Sunkel, John
1991-01-01
Linear hyperbolic partial differential equations are analyzed using standard techniques to show that a sequence of solutions generated by the Lyapunov stable penalty equations approaches the solution of the differential-algebraic equations governing the dynamics of multibody problems arising in linear vibrations. The analysis does not require that the system be conservative and does not impose any specific integration scheme. Variational statements are derived which bound the error in approximation by the norm of the constraint violation obtained in the approximate solutions.
USDA-ARS's Scientific Manuscript database
Next generation sequencing technologies have vastly changed the approach of sequencing of the 16S rRNA gene for studies in microbial ecology. Three distinct technologies are available for large-scale 16S sequencing. All three are subject to biases introduced by sequencing error rates, amplificatio...
deFUME: Dynamic exploration of functional metagenomic sequencing data.
van der Helm, Eric; Geertz-Hansen, Henrik Marcus; Genee, Hans Jasper; Malla, Sailesh; Sommer, Morten Otto Alexander
2015-07-31
Functional metagenomic selections represent a powerful technique that is widely applied for identification of novel genes from complex metagenomic sources. However, whereas hundreds to thousands of clones can be easily generated and sequenced over a few days of experiments, analyzing the data is time consuming and constitutes a major bottleneck for experimental researchers in the field. Here we present the deFUME web server, an easy-to-use web-based interface for processing, annotation and visualization of functional metagenomics sequencing data, tailored to meet the requirements of non-bioinformaticians. The web-server integrates multiple analysis steps into one single workflow: read assembly, open reading frame prediction, and annotation with BLAST, InterPro and GO classifiers. Analysis results are visualized in an online dynamic web-interface. The deFUME webserver provides a fast track from raw sequence to a comprehensive visual data overview that facilitates effortless inspection of gene function, clustering and distribution. The webserver is available at cbs.dtu.dk/services/deFUME/and the source code is distributed at github.com/EvdH0/deFUME.
Mapping wide row crops with video sequences acquired from a tractor moving at treatment speed.
Sainz-Costa, Nadir; Ribeiro, Angela; Burgos-Artizzu, Xavier P; Guijarro, María; Pajares, Gonzalo
2011-01-01
This paper presents a mapping method for wide row crop fields. The resulting map shows the crop rows and weeds present in the inter-row spacing. Because field videos are acquired with a camera mounted on top of an agricultural vehicle, a method for image sequence stabilization was needed and consequently designed and developed. The proposed stabilization method uses the centers of some crop rows in the image sequence as features to be tracked, which compensates for the lateral movement (sway) of the camera and leaves the pitch unchanged. A region of interest is selected using the tracked features, and an inverse perspective technique transforms the selected region into a bird's-eye view that is centered on the image and that enables map generation. The algorithm developed has been tested on several video sequences of different fields recorded at different times and under different lighting conditions, with good initial results. Indeed, lateral displacements of up to 66% of the inter-row spacing were suppressed through the stabilization process, and crop rows in the resulting maps appear straight.
Guo, Yinshan; Shi, Guangli; Liu, Zhendong; Zhao, Yuhui; Yang, Xiaoxu; Zhu, Junchi; Li, Kun; Guo, Xiuwu
2015-01-01
In this study, 149 F1 plants from the interspecific cross between 'Red Globe' (Vitis vinifera L.) and 'Shuangyou' (Vitis amurensis Rupr.) and the parents were used to construct a molecular genetic linkage map by using the specific length amplified fragment sequencing technique. DNA sequencing generated 41.282 Gb of data consisting of 206,411,693 paired-end reads. The average sequencing depths were 68.35 for 'Red Globe,' 63.65 for 'Shuangyou,' and 8.01 for each progeny. In all, 115,629 high-quality specific length amplified fragments were detected, of which 42,279 were polymorphic. The genetic map was constructed using 7,199 of these polymorphic markers. These polymorphic markers were assigned to 19 linkage groups; the total length of the map was 1929.13 cM, with an average distance of 0.28 cM between adjacent markers. To our knowledge, the genetic maps constructed in this study contain the largest number of molecular markers. These high-density genetic maps might form the basis for the fine quantitative trait loci mapping and molecular-assisted breeding of grape.
High compression image and image sequence coding
NASA Technical Reports Server (NTRS)
Kunt, Murat
1989-01-01
The digital representation of an image requires a very large number of bits. This number is even larger for an image sequence. The goal of image coding is to reduce this number, as much as possible, and reconstruct a faithful duplicate of the original picture or image sequence. Early efforts in image coding, solely guided by information theory, led to a plethora of methods. The compression ratio reached a plateau around 10:1 a couple of years ago. Recent progress in the study of the brain mechanism of vision and scene analysis has opened new vistas in picture coding. Directional sensitivity of the neurones in the visual pathway combined with the separate processing of contours and textures has led to a new class of coding methods capable of achieving compression ratios as high as 100:1 for images and around 300:1 for image sequences. Recent progress on some of the main avenues of object-based methods is presented. These second generation techniques make use of contour-texture modeling, new results in neurophysiology and psychophysics and scene analysis.
Design of DNA pooling to allow incorporation of covariates in rare variants analysis.
Guan, Weihua; Li, Chun
2014-01-01
Rapid advances in next-generation sequencing technologies facilitate genetic association studies of an increasingly wide array of rare variants. To capture the rare or less common variants, a large number of individuals will be needed. However, the cost of a large scale study using whole genome or exome sequencing is still high. DNA pooling can serve as a cost-effective approach, but with a potential limitation that the identity of individual genomes would be lost and therefore individual characteristics and environmental factors could not be adjusted for in association analysis, which may result in power loss and a biased estimate of genetic effect. For case-control studies, we propose a design strategy for pool creation and an analysis strategy that allows covariate adjustment, using a multiple imputation technique. Simulations show that our approach can obtain reasonable estimates of genotypic effect with only a slight loss of power compared to the much more expensive approach of sequencing individual genomes. Our design and analysis strategies enable more powerful and cost-effective sequencing studies of complex diseases, while allowing incorporation of covariate adjustment.
Mesoscopic modeling of DNA denaturation rates: Sequence dependence and experimental comparison
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dahlen, Oda, E-mail: oda.dahlen@ntnu.no; Erp, Titus S. van, E-mail: titus.van.erp@ntnu.no
Using rare event simulation techniques, we calculated DNA denaturation rate constants for a range of sequences and temperatures for the Peyrard-Bishop-Dauxois (PBD) model with two different parameter sets. We studied a larger variety of sequences compared to previous studies that only consider DNA homopolymers and DNA sequences containing an equal amount of weak AT- and strong GC-base pairs. Our results show that, contrary to previous findings, an even distribution of the strong GC-base pairs does not always result in the fastest possible denaturation. In addition, we applied an adaptation of the PBD model to study hairpin denaturation for which experimental data are available. This is the first quantitative study in which dynamical results from the mesoscopic PBD model have been compared with experiments. Our results show that present parameterized models, although giving good results regarding thermodynamic properties, overestimate denaturation rates by orders of magnitude. We believe that our dynamical approach is, therefore, an important tool for verifying DNA models and for developing next generation models that have higher predictive power than present ones.
Hodel, Richard G. J.; Segovia-Salcedo, M. Claudia; Landis, Jacob B.; Crowl, Andrew A.; Sun, Miao; Liu, Xiaoxian; Gitzendanner, Matthew A.; Douglas, Norman A.; Germain-Aubrey, Charlotte C.; Chen, Shichao; Soltis, Douglas E.; Soltis, Pamela S.
2016-01-01
Microsatellites, or simple sequence repeats (SSRs), have long played a major role in genetic studies due to their typically high polymorphism. They have diverse applications, including genome mapping, forensics, ascertaining parentage, population and conservation genetics, identification of the parentage of polyploids, and phylogeography. We compare SSRs and newer methods, such as genotyping by sequencing (GBS) and restriction site associated DNA sequencing (RAD-Seq), and offer recommendations for researchers considering which genetic markers to use. We also review the variety of techniques currently used for identifying microsatellite loci and developing primers, with a particular focus on those that make use of next-generation sequencing (NGS). Additionally, we review software for microsatellite development and report on an experiment to assess the utility of currently available software for SSR development. Finally, we discuss the future of microsatellites and make recommendations for researchers preparing to use microsatellites. We argue that microsatellites still have an important place in the genomic age as they remain effective and cost-efficient markers. PMID:27347456
Nadin-Davis, S A; Huang, W; Wandeler, A I
1996-03-01
Since its recognition as a discrete epizootic in Florida in the early 1950s, the raccoon strain of rabies virus (RV) has spread over almost the entire eastern seaboard of the US and now threatens to enter the southernmost regions of Canada. To characterise this RV strain in more detail, nucleotide sequencing of the N and G genes, encoding the nucleoprotein and glycoprotein, respectively, of representative isolates has been undertaken. This sequence information generated a conserved restriction map of the N gene, thereby permitting unequivocal identification of this strain by molecular techniques. Comparisons of the predicted nucleoprotein and glycoprotein products with those of other RV strains identified a number of amino acid sequence variations conserved only in the raccoon strain. This information was used to design strain-specific primers targeted to the N gene sequences encoding these residues. The incorporation of these primers into a multiplex polymerase chain reaction (PCR) protocol permitted easy and rapid discrimination between the raccoon RV strain and indigenous Ontario RVs.
Fluorescence-based strategies to investigate the structure and dynamics of aptamer-ligand complexes
NASA Astrophysics Data System (ADS)
Perez-Gonzalez, Cibran; Lafontaine, Daniel; Penedo, J.
2016-08-01
In addition to the helical nature of double-stranded DNA and RNA, single-stranded oligonucleotides can arrange themselves into tridimensional structures containing loops, bulges, internal hairpins and many other motifs. This ability has been used for more than two decades to generate oligonucleotide sequences, so-called aptamers, that can recognize certain metabolites with high affinity and specificity. More recently, this library of artificially-generated nucleic acid aptamers has been expanded by the discovery that naturally occurring RNA sequences control bacterial gene expression in response to cellular concentration of a given metabolite. The application of fluorescence methods has been pivotal to characterize in detail the structure and dynamics of these aptamer-ligand complexes in solution. This is mostly due to the intrinsic high sensitivity of fluorescence methods and also to significant improvements in solid-phase synthesis, post-synthetic labelling strategies and optical instrumentation that took place during the last decade. In this work, we provide an overview of the most widely employed fluorescence methods to investigate aptamer structure and function by describing the use of aptamers labelled with a single dye in fluorescence quenching and anisotropy assays. The use of 2-aminopurine as a fluorescent analog of adenine to monitor local changes in structure and fluorescence resonance energy transfer (FRET) to follow long-range conformational changes is also covered in detail. The last part of the review is dedicated to the application of fluorescence techniques based on single-molecule microscopy, a technique that has revolutionized our understanding of nucleic acid structure and dynamics. 
We finally describe the advantages of monitoring ligand-binding and conformational changes, one molecule at a time, to decipher the complexity of regulatory aptamers and summarize the emerging folding and ligand-binding models arising from the application of these single-molecule FRET microscopy techniques.
Fluorescence-Based Strategies to Investigate the Structure and Dynamics of Aptamer-Ligand Complexes
Perez-Gonzalez, Cibran; Lafontaine, Daniel A.; Penedo, J. Carlos
2016-01-01
In addition to the helical nature of double-stranded DNA and RNA, single-stranded oligonucleotides can arrange themselves into tridimensional structures containing loops, bulges, internal hairpins and many other motifs. This ability has been used for more than two decades to generate oligonucleotide sequences, so-called aptamers, that can recognize certain metabolites with high affinity and specificity. More recently, this library of artificially-generated nucleic acid aptamers has been expanded by the discovery that naturally occurring RNA sequences control bacterial gene expression in response to cellular concentration of a given metabolite. The application of fluorescence methods has been pivotal to characterize in detail the structure and dynamics of these aptamer-ligand complexes in solution. This is mostly due to the intrinsic high sensitivity of fluorescence methods and also to significant improvements in solid-phase synthesis, post-synthetic labeling strategies and optical instrumentation that took place during the last decade. In this work, we provide an overview of the most widely employed fluorescence methods to investigate aptamer structure and function by describing the use of aptamers labeled with a single dye in fluorescence quenching and anisotropy assays. The use of 2-aminopurine as a fluorescent analog of adenine to monitor local changes in structure and fluorescence resonance energy transfer (FRET) to follow long-range conformational changes is also covered in detail. The last part of the review is dedicated to the application of fluorescence techniques based on single-molecule microscopy, a technique that has revolutionized our understanding of nucleic acid structure and dynamics. 
We finally describe the advantages of monitoring ligand-binding and conformational changes, one molecule at a time, to decipher the complexity of regulatory aptamers and summarize the emerging folding and ligand-binding models arising from the application of these single-molecule FRET microscopy techniques. PMID:27536656
Di Pietro, C; Di Pietro, V; Emmanuele, G; Ferro, A; Maugeri, T; Modica, E; Pigola, G; Pulvirenti, A; Purrello, M; Ragusa, M; Scalia, M; Shasha, D; Travali, S; Zimmitti, V
2003-01-01
In this paper we present a new Multiple Sequence Alignment (MSA) algorithm called AntiClusAl. The method makes use of the commonly used idea of aligning homologous sequences belonging to classes generated by some clustering algorithm, and then continuing the alignment process in a bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment in each cluster makes use of progressive alignment with the 1-median (center) of the cluster. The 1-median of a set S of sequences is the element of S which minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to make use of a simple and natural algorithmic technique based on randomized tournaments which has been successfully applied to large size search problems in general metric spaces. In particular, a clustering algorithm called Antipole tree and an approximate linear 1-median computation are used. Our algorithm, compared with Clustal W, a widely used tool for MSA, shows better running times with fully comparable alignment quality. A successful biological application showing high amino acid conservation during evolution of Xenopus laevis SOD2 is also cited.
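The contrast between the exact quadratic 1-median and a tournament-style approximation can be sketched as follows. This is a minimal illustration of the general randomized-tournament idea under stated assumptions, not the AntiClusAl or Antipole implementation: the Hamming distance on equal-length strings, the group size, and all function names are hypothetical choices made for the example.

```python
import random


def hamming(a, b):
    """Toy distance for equal-length strings (an assumption of this sketch)."""
    return sum(x != y for x, y in zip(a, b))


def exact_one_median(items, dist):
    """Element minimising the summed distance to all others.

    Minimising the sum is equivalent to minimising the average.
    Needs O(n^2) distance evaluations.
    """
    return min(items, key=lambda s: sum(dist(s, t) for t in items))


def tournament_one_median(items, dist, group_size=3, rng=None):
    """Approximate 1-median via randomized tournaments.

    Each round splits the pool into small random groups, keeps the exact
    1-median of each group, and repeats on the winners. With a constant
    group size the total number of distance evaluations grows linearly
    with the number of items.
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    rng = rng or random.Random(0)
    pool = list(items)
    while len(pool) > group_size:
        rng.shuffle(pool)
        groups = [pool[i:i + group_size] for i in range(0, len(pool), group_size)]
        pool = [exact_one_median(g, dist) for g in groups]
    return exact_one_median(pool, dist)


seqs = ["ACGT", "ACGA", "ACGT", "TCGT", "ACGT"]
print(exact_one_median(seqs, hamming))       # ACGT
print(tournament_one_median(seqs, hamming))  # an element of seqs, usually near the centre
```

The tournament result is only an approximation: it is always a member of the input set, but it is not guaranteed to coincide with the exact 1-median.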
Songbirds and humans apply different strategies in a sound sequence discrimination task.
Seki, Yoshimasa; Suzuki, Kenta; Osawa, Ayumi M; Okanoya, Kazuo
2013-01-01
The abilities of animals and humans to extract rules from sound sequences have previously been compared using observation of spontaneous responses and conditioning techniques. However, the results were inconsistently interpreted across studies, possibly due to methodological and/or species differences. Therefore, we examined the strategies for discrimination of sound sequences in Bengalese finches and humans using the same protocol. Birds were trained on a GO/NOGO task to discriminate between two categories of sound stimulus generated based on an "AAB" or "ABB" rule. The sound elements used were taken from a variety of male (M) and female (F) calls, such that the sequences could be represented as MMF and MFF. In test sessions, FFM and FMM sequences, which were never presented in the training sessions but conformed to the rule, were presented as probe stimuli. The results suggested two discriminative strategies were being applied: (1) memorizing sound patterns of either GO or NOGO stimuli and generating the appropriate responses for only those sounds; and (2) using the repeated element as a cue. There was no evidence that the birds successfully extracted the abstract rule (i.e., AAB and ABB); MMF-GO subjects did not produce a GO response for FFM and vice versa. Next we examined whether those strategies were also applicable for human participants on the same task. The results and questionnaires revealed that participants extracted the abstract rule, and most of them employed it to discriminate the sequences. This strategy was never observed in bird subjects, although some participants used strategies similar to the birds when responding to the probe stimuli. Our results showed that the human participants applied the abstract rule in the task even without instruction but Bengalese finches did not, thereby reconfirming that humans have an ability to extract abstract rules from sound sequences that is distinct from that of non-human animals.
ProDaMa: an open source Python library to generate protein structure datasets.
Armano, Giuliano; Manconi, Andrea
2009-10-02
The huge difference between the number of known sequences and known tertiary structures has justified the use of automated methods for protein analysis. Although a general methodology to solve these problems has not yet been devised, researchers are engaged in developing more accurate techniques and algorithms whose training plays a relevant role in determining their performance. From this perspective, particular importance is given to the training data used in experiments, and researchers are often engaged in the generation of specialized datasets that meet their requirements. To facilitate the task of generating specialized datasets we devised and implemented ProDaMa, an open source Python library that provides classes for retrieving, organizing, updating, analyzing, and filtering protein data. ProDaMa has been used to generate specialized datasets useful for secondary structure prediction and to develop a collaborative web application aimed at generating and sharing protein structure datasets. The library, the related database, and the documentation are freely available at the URL http://iasc.diee.unica.it/prodama.
Development of an Automatic Grid Generator for Multi-Element High-Lift Wings
NASA Technical Reports Server (NTRS)
Eberhardt, Scott; Wibowo, Pratomo; Tu, Eugene
1996-01-01
The procedure to generate the grid around a complex wing configuration is presented in this report. The automatic grid generation utilizes the Modified Advancing Front Method as a predictor and an elliptic scheme as a corrector. The scheme will advance the surface grid one cell outward and the newly obtained grid is corrected using the Laplace equation. The predictor-corrector step ensures that the grid produced will be smooth for every configuration. The predictor-corrector scheme is extended for a complex wing configuration. A new technique is developed to deal with the grid generation in the wing-gaps and on the flaps. It will create the grids that fill the gap on the wing surface and the gap created by the flaps. The scheme recognizes these configurations automatically so that minimal user input is required. By utilizing an appropriate sequence in advancing the grid points on a wing surface, the automatic grid generation for complex wing configurations is achieved.
A Window Into Clinical Next-Generation Sequencing-Based Oncology Testing Practices.
Nagarajan, Rakesh; Bartley, Angela N; Bridge, Julia A; Jennings, Lawrence J; Kamel-Reid, Suzanne; Kim, Annette; Lazar, Alexander J; Lindeman, Neal I; Moncur, Joel; Rai, Alex J; Routbort, Mark J; Vasalos, Patricia; Merker, Jason D
2017-12-01
Context: Detection of acquired variants in cancer is a paradigm of precision medicine, yet little has been reported about clinical laboratory practices across a broad range of laboratories. Objective: To use College of American Pathologists proficiency testing survey results to report on the results from surveys on next-generation sequencing-based oncology testing practices. Design: College of American Pathologists proficiency testing survey results from more than 250 laboratories currently performing molecular oncology testing were used to determine laboratory trends in next-generation sequencing-based oncology testing. Results: These presented data provide key information about the number of laboratories that currently offer or are planning to offer next-generation sequencing-based oncology testing. Furthermore, we present data from 60 laboratories performing next-generation sequencing-based oncology testing regarding specimen requirements and assay characteristics. The findings indicate that most laboratories are performing tumor-only targeted sequencing to detect single-nucleotide variants and small insertions and deletions, using desktop sequencers and predesigned commercial kits. Despite these trends, a diversity of approaches to testing exists. Conclusions: This information should be useful to further inform a variety of topics, including national discussions involving clinical laboratory quality systems, regulation and oversight of next-generation sequencing-based oncology testing, and precision oncology efforts in a data-driven manner.
Transcriptome-based differentiation of closely-related Miscanthus lines.
Chouvarine, Philippe; Cooksey, Amanda M; McCarthy, Fiona M; Ray, David A; Baldwin, Brian S; Burgess, Shane C; Peterson, Daniel G
2012-01-01
Distinguishing between individuals is critical to those conducting animal/plant breeding, food safety/quality research, diagnostic and clinical testing, and evolutionary biology studies. Classical genetic identification studies are based on marker polymorphisms, but polymorphism-based techniques are time and labor intensive and often cannot distinguish between closely related individuals. Illumina sequencing technologies provide the detailed sequence data required for rapid and efficient differentiation of related species, lines/cultivars, and individuals in a cost-effective manner. Here we describe the use of Illumina high-throughput exome sequencing, coupled with SNP mapping, as a rapid means of distinguishing between related cultivars of the lignocellulosic bioenergy crop giant miscanthus (Miscanthus × giganteus). We provide the first exome sequence database for Miscanthus species complete with Gene Ontology (GO) functional annotations. A SNP comparative analysis of rhizome-derived cDNA sequences was successfully utilized to distinguish three Miscanthus × giganteus cultivars from each other and from other Miscanthus species. Moreover, the resulting phylogenetic tree generated from SNP frequency data parallels the known breeding history of the plants examined. Some of the giant miscanthus plants exhibit considerable sequence divergence. Here we describe an analysis of Miscanthus in which high-throughput exome sequencing was utilized to differentiate between closely related genotypes despite the current lack of a reference genome sequence. We functionally annotated the exome sequences and provide resources to support Miscanthus systems biology. In addition, we demonstrate the use of the commercial high-performance cloud computing to do computational GO annotation.
Vega, Ana I; Medrano, Celia; Navarrete, Rosa; Desviat, Lourdes R; Merinero, Begoña; Rodríguez-Pombo, Pilar; Vitoria, Isidro; Ugarte, Magdalena; Pérez-Cerdá, Celia; Pérez, Belen
2016-10-01
Glycogen storage disease (GSD) is an umbrella term for a group of genetic disorders that involve the abnormal metabolism of glycogen; to date, 23 types of GSD have been identified. The nonspecific clinical presentation of GSD and the lack of specific biomarkers mean that Sanger sequencing is now widely relied on for making a diagnosis. However, this gene-by-gene sequencing technique is both laborious and costly, which is a consequence of the number of genes to be sequenced and the large size of some genes. This work reports the use of massive parallel sequencing to diagnose patients at our laboratory in Spain using either a customized gene panel (targeted exome sequencing) or the Illumina Clinical-Exome TruSight One Gene Panel (clinical exome sequencing (CES)). Sequence variants were matched against biochemical and clinical hallmarks. Pathogenic mutations were detected in 23 patients. Twenty-two mutations were recognized (mostly loss-of-function mutations), including 11 that were novel in GSD-associated genes. In addition, CES detected five patients with mutations in ALDOB, LIPA, NKX2-5, CPT2, or ANO5. Although these genes are not involved in GSD, they are associated with overlapping phenotypic characteristics such as hepatic, muscular, and cardiac dysfunction. These results show that next-generation sequencing, in combination with the detection of biochemical and clinical hallmarks, provides an accurate, high-throughput means of making genetic diagnoses of GSD and related diseases. Genet Med 18(10), 1037-1043.
Whole-genome sequencing for comparative genomics and de novo genome assembly.
Benjak, Andrej; Sala, Claudia; Hartkoorn, Ruben C
2015-01-01
Next-generation sequencing technologies for whole-genome sequencing of mycobacteria are rapidly becoming an attractive alternative to more traditional sequencing methods. In particular this technology is proving useful for genome-wide identification of mutations in mycobacteria (comparative genomics) as well as for de novo assembly of whole genomes. Next-generation sequencing however generates a vast quantity of data that can only be transformed into a usable and comprehensible form using bioinformatics. Here we describe the methodology one would use to prepare libraries for whole-genome sequencing, and the basic bioinformatics to identify mutations in a genome following Illumina HiSeq or MiSeq sequencing, as well as de novo genome assembly following sequencing using Pacific Biosciences (PacBio).
NASA Astrophysics Data System (ADS)
Peng, Chong; Sui, Zhenghong; Zhou, Wei; Hu, Yiyi; Mi, Ping; Jiang, Minjie; Li, Xiaodong; Ruan, Xudong
2018-06-01
Gracilariopsis lemaneiformis is an economically important agarophyte, which contains high quality gel and shows a high growth rate. Wild populations of G. lemaneiformis displayed resident divergence, though with low genetic diversity, as revealed by amplified fragment length polymorphism (AFLP) and simple sequence repeat (SSR) analyses. In addition, different strains of G. lemaneiformis are diverse in morphology. The marked inconsistency between genetic background and physiological characteristics strongly suggests regulation at the epigenetic level. In this study, the DNA methylation change in G. lemaneiformis among different generation branches and under different temperature stresses was assessed using the methylation sensitive amplified polymorphism (MSAP) technique. DNA methylation levels differed among generation branches. The fully and totally methylated DNA levels were lowest in the second generation branch and highest in the third generation. The total methylation level was 61.11%, 60.88% and 64.12% at 15°C, 22°C and 26°C, respectively. Compared with the control group (22°C), the fully methylated and totally methylated ratios were increased in both experimental groups (15°C and 26°C). All cytosine methylation/demethylation transform (CMDT) events were further analyzed. High temperature treatment induced more CMDT than low temperature treatment did.
Microfluidic droplet platform for ultrahigh-throughput single-cell screening of biodiversity.
Terekhov, Stanislav S; Smirnov, Ivan V; Stepanova, Anastasiya V; Bobik, Tatyana V; Mokrushina, Yuliana A; Ponomarenko, Natalia A; Belogurov, Alexey A; Rubtsova, Maria P; Kartseva, Olga V; Gomzikova, Marina O; Moskovtsev, Alexey A; Bukatin, Anton S; Dubina, Michael V; Kostryukova, Elena S; Babenko, Vladislav V; Vakhitova, Maria T; Manolov, Alexander I; Malakhova, Maja V; Kornienko, Maria A; Tyakht, Alexander V; Vanyushkina, Anna A; Ilina, Elena N; Masson, Patrick; Gabibov, Alexander G; Altman, Sidney
2017-03-07
Ultrahigh-throughput screening (uHTS) techniques can identify unique functionality from millions of variants. To mimic the natural selection mechanisms that occur by compartmentalization in vivo, we developed a technique based on single-cell encapsulation in droplets of a monodisperse microfluidic double water-in-oil-in-water emulsion (MDE). Biocompatible MDE enables in-droplet cultivation of different living species. The combination of droplet-generating machinery with FACS followed by next-generation sequencing and liquid chromatography-mass spectrometry analysis of the secretomes of encapsulated organisms yielded detailed genotype/phenotype descriptions. This platform was probed with uHTS for biocatalysts anchored to yeast with enrichment close to the theoretically calculated limit and cell-to-cell interactions. MDE-FACS allowed the identification of human butyrylcholinesterase mutants that undergo self-reactivation after inhibition by the organophosphorus agent paraoxon. The versatility of the platform allowed the identification of bacteria, including slow-growing oral microbiota species that suppress the growth of a common pathogen, Staphylococcus aureus, and predicted which genera were associated with inhibitory activity.
Current Progress of Genetically Engineered Pig Models for Biomedical Research
Gün, Gökhan
2014-01-01
The first transgenic pigs were generated for agricultural purposes about three decades ago. Since then, the micromanipulation techniques of pig oocytes and embryos expanded from pronuclear injection of foreign DNA to somatic cell nuclear transfer, intracytoplasmic sperm injection-mediated gene transfer, lentiviral transduction, and cytoplasmic injection. Mechanistically, the passive transgenesis approach based on random integration of foreign DNA evolved into active genetic engineering techniques based on the transient activity of ectopic enzymes, such as transposases, recombinases, and programmable nucleases. Whole-genome sequencing and annotation of advanced genome maps of the pig complemented these developments. The full implementation of these tools promises to immensely increase the efficiency and, in parallel, to reduce the costs for the generation of genetically engineered pigs. Today, the major application of genetically engineered pigs is found in the field of biomedical disease modeling. It is anticipated that genetically engineered pigs will increasingly be used in biomedical research, since this model shows several similarities to humans with regard to physiology, metabolism, genome organization, pathology, and aging. PMID:25469311
Automatic Command Sequence Generation
NASA Technical Reports Server (NTRS)
Fisher, Forest; Gladded, Roy; Khanampompan, Teerapat
2007-01-01
Automatic Sequence Generator (Autogen) Version 3.0 software automatically generates command sequences for the Mars Reconnaissance Orbiter (MRO) and several other JPL spacecraft operated by the multi-mission support team. Autogen uses standard JPL sequencing tools like APGEN, ASP, SEQGEN, and the DOM database to automate the generation of uplink command products, Spacecraft Command Message Format (SCMF) files, and the corresponding ground command products, DSN Keywords Files (DKF). Autogen supports all the major multi-mission mission phases including the cruise, aerobraking, mapping/science, and relay mission phases. Autogen is a Perl script, which functions within the mission operations UNIX environment. It consists of two parts: a set of model files and the autogen Perl script. Autogen encodes the behaviors of the system into a model and encodes algorithms for context sensitive customizations of the modeled behaviors. The model includes knowledge of different mission phases and how the resultant command products must differ for these phases. The executable software portion of Autogen automates the setup and use of APGEN for constructing a spacecraft activity sequence file (SASF). The setup includes file retrieval through the DOM (Distributed Object Manager), an object database used to store project files. This step retrieves all the needed input files for generating the command products. Depending on the mission phase, Autogen also uses the ASP (Automated Sequence Processor) and SEQGEN to generate the command product sent to the spacecraft. Autogen also provides the means for customizing sequences through the use of configuration files. By automating the majority of the sequence generation process, Autogen eliminates many sequence generation errors commonly introduced by manually constructing spacecraft command sequences.
Through the layering of commands into the sequence by a series of scheduling algorithms, users are able to rapidly and reliably construct the desired uplink command products. With the aid of Autogen, sequences may be produced in a matter of hours instead of weeks, with a significant reduction in the number of people on the sequence team. As a result, the uplink product generation process is significantly streamlined and mission risk is significantly reduced. Autogen is used for operations of MRO, Mars Global Surveyor (MGS), Mars Exploration Rover (MER), Mars Odyssey, and will be used for operations of Phoenix. Autogen Version 3.0 is the operational version of Autogen including the MRO adaptation for the cruise mission phase, and was also used for development of the aerobraking and mapping mission phases for MRO.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Daum, Christopher; Zane, Matthew; Han, James
2011-01-31
The U.S. Department of Energy (DOE) Joint Genome Institute's (JGI) Production Sequencing group is committed to the generation of high-quality genomic DNA sequence to support the mission areas of renewable energy generation, global carbon management, and environmental characterization and clean-up. Within the JGI's Production Sequencing group, a robust Illumina Genome Analyzer and HiSeq pipeline has been established. Optimization of the sequencer pipelines has been ongoing with the aim of continual process improvement of the laboratory workflow, reducing operational costs and project cycle times to increase sample throughput, and improving the overall quality of the sequence generated. A sequence QC analysis pipeline has been implemented to automatically generate read and assembly level quality metrics. The foremost of these optimization projects, along with sequencing and operational strategies, throughput numbers, and sequencing quality results will be presented.
Fernández-Caballero Rico, Jose Ángel; Chueca Porcuna, Natalia; Álvarez Estévez, Marta; Mosquera Gutiérrez, María Del Mar; Marcos Maeso, María Ángeles; García, Federico
2018-02-01
To show how to generate a consensus sequence from massively parallel sequencing data obtained in routine HIV anti-retroviral resistance studies that may be suitable for molecular epidemiology studies. Paired Sanger (Trugene-Siemens) and next-generation sequencing (NGS) (454 GSJunior-Roche) HIV RT and protease sequences from 62 patients were studied. NGS consensus sequences were generated using Mesquite, using 10%, 15%, and 20% thresholds. Molecular evolutionary genetics analysis (MEGA) was used for phylogenetic studies. At a 10% threshold, NGS-Sanger sequences from 17/62 patients were phylogenetically related, with a median bootstrap value of 88% (IQR 83.5-95.5). Association increased to 36/62 sequences (median bootstrap 94%, IQR 85.5-98) using a 15% threshold. Maximum association was at the 20% threshold, with 61/62 sequences associated, and a median bootstrap value of 99% (IQR 98-100). A safe method is presented to generate consensus sequences from HIV-NGS data at a 20% threshold, which will prove useful for molecular epidemiological studies. Copyright © 2016 Elsevier España, S.L.U. and Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.
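One plausible reading of a thresholded NGS consensus can be sketched as follows. The study used Mesquite, so this is not its procedure. It assumes the threshold is a per-position minority-variant cutoff, that every base at or above the cutoff is kept, and that mixtures are written with IUPAC ambiguity codes; the function name and toy reads are hypothetical.

```python
from collections import Counter

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

def consensus(aligned_reads, threshold):
    """Column-wise consensus of equal-length aligned reads.
    A base is kept at a position if its frequency among A/C/G/T calls is at
    least `threshold`; gaps and other symbols are ignored. Positions where
    no base qualifies are reported as 'N'."""
    out = []
    for pos in range(len(aligned_reads[0])):
        counts = Counter(r[pos] for r in aligned_reads if r[pos] in "ACGT")
        total = sum(counts.values())
        kept = frozenset(b for b, c in counts.items()
                         if total and c / total >= threshold)
        out.append(IUPAC.get(kept, "N"))
    return "".join(out)

if __name__ == "__main__":
    reads = ["ACGT"] * 7 + ["ACAT"] * 3   # toy data: 30% minority A at position 3
    print(consensus(reads, 0.20))         # minority variant kept as an ambiguity
    print(consensus(reads, 0.40))         # minority variant dropped
```

Under this reading, a lower threshold admits more minority variants and therefore more ambiguity codes into the consensus.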
DOE Office of Scientific and Technical Information (OSTI.GOV)
Weinzierl, Marion; Yeates, Anthony R.; Mackay, Duncan H.
2016-05-20
In this paper, we develop a new technique for driving global non-potential simulations of the Sun’s coronal magnetic field solely from sequences of radial magnetic maps of the solar photosphere. A primary challenge to driving such global simulations is that the required horizontal electric field cannot be uniquely determined from such maps. We show that an “inductive” electric field solution similar to that used by previous authors successfully reproduces specific features of the coronal field evolution in both single and multiple bipole simulations. For these cases, the true solution is known because the electric field was generated from a surface flux-transport model. The match for these cases is further improved by including the non-inductive electric field contribution from surface differential rotation. Then, using this reconstruction method for the electric field, we show that a coronal non-potential simulation can be successfully driven from a sequence of ADAPT maps of the photospheric radial field, without including additional physical observations which are not routinely available.
Schiffer, A.; Gardner, M. N.; Lynn, R. H.
2017-01-01
Experiments were conducted on an aqueous growth medium containing cultures of Escherichia coli (E. coli) XL1-Blue, to investigate, in a single experiment, the effect of two types of dynamic mechanical loading on cellular integrity. A bespoke shock tube was used to subject separate portions of a planktonic bacterial culture to two different loading sequences: (i) shock compression followed by cavitation, and (ii) shock compression followed by spray. The apparatus allows the generation of an adjustable loading shock wave of magnitude up to 300 MPa in a sterile laboratory environment. Cultures of E. coli were tested with this apparatus and the spread-plate technique was used to measure the survivability after mechanical loading. The loading sequence (ii) gave higher mortality than (i), suggesting that the bacteria are more vulnerable to shear deformation and cavitation than to hydrostatic compression. We present the results of preliminary experiments and suggestions for further experimental work; we discuss the potential applications of this technique to sterilize large volumes of fluid samples. PMID:28405383
Ahmed, Towfiq; Haraldsen, Jason T; Rehr, John J; Di Ventra, Massimiliano; Schuller, Ivan; Balatsky, Alexander V
2014-03-28
Nanopore-based sequencing has demonstrated a significant potential for the development of fast, accurate, and cost-efficient fingerprinting techniques for next generation molecular detection and sequencing. We propose a specific multilayered graphene-based nanopore device architecture for the recognition of single biomolecules. Molecular detection and analysis can be accomplished through the detection of transverse currents as the molecule or DNA base translocates through the nanopore. To increase the overall signal-to-noise ratio and the accuracy, we implement a new 'multi-point cross-correlation' technique for identification of DNA bases or other molecules on the single molecular level. We demonstrate that the cross-correlations between each nanopore will greatly enhance the transverse current signal for each molecule. We implement first-principles transport calculations for DNA bases surveyed across a multilayered graphene nanopore system to illustrate the advantages of the proposed geometry. A time-series analysis of the cross-correlation functions illustrates the potential of this method for enhancing the signal-to-noise ratio. This work constitutes a significant step forward in facilitating fingerprinting of single biomolecules using solid state technology.
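The idea of cross-correlating signals recorded at successive layers can be illustrated with a toy time-series sketch. It does not reproduce the authors' first-principles transport calculations or their multi-point scheme. The pulse shape, trace length, noise level, and all function names are assumptions.

```python
import random

PULSE = [1.0, 3.0, 5.0, 3.0, 1.0]  # hypothetical current signature of one base

def make_trace(delay, sigma, rng, length=200):
    """Synthetic transverse-current trace: the pulse starts at `delay`,
    with independent Gaussian noise of standard deviation `sigma`."""
    trace = [rng.gauss(0.0, sigma) for _ in range(length)]
    for i, p in enumerate(PULSE):
        trace[delay + i] += p
    return trace

def cross_correlation(x, y, lag):
    """Mean of (x[t] - mean_x) * (y[t + lag] - mean_y) over valid t."""
    n = len(x) - lag
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((x[t] - mx) * (y[t + lag] - my) for t in range(n)) / n

def best_lag(x, y, max_lag):
    """Lag in 0..max_lag at which the cross-correlation is largest."""
    return max(range(max_lag + 1), key=lambda lag: cross_correlation(x, y, lag))

if __name__ == "__main__":
    rng = random.Random(0)
    layer1 = make_trace(50, 0.5, rng)   # molecule passes layer 1 at t = 50
    layer2 = make_trace(57, 0.5, rng)   # and layer 2 seven samples later
    print("estimated transit lag:", best_lag(layer1, layer2, 20))
```

Because the noise in the two layers is independent, it averages toward zero in the product, while the shared pulse contributes coherently at the true transit lag.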
Schiffer, A; Gardner, M N; Lynn, R H; Tagarielli, V L
2017-03-01
Experiments were conducted on an aqueous growth medium containing cultures of Escherichia coli (E. coli) XL1-Blue, to investigate, in a single experiment, the effect of two types of dynamic mechanical loading on cellular integrity. A bespoke shock tube was used to subject separate portions of a planktonic bacterial culture to two different loading sequences: (i) shock compression followed by cavitation, and (ii) shock compression followed by spray. The apparatus allows the generation of an adjustable loading shock wave of magnitude up to 300 MPa in a sterile laboratory environment. Cultures of E. coli were tested with this apparatus and the spread-plate technique was used to measure the survivability after mechanical loading. The loading sequence (ii) gave higher mortality than (i), suggesting that the bacteria are more vulnerable to shear deformation and cavitation than to hydrostatic compression. We present the results of preliminary experiments and suggestions for further experimental work; we discuss the potential applications of this technique to sterilize large volumes of fluid samples.
Mohammadi, Mohammad; Rasaee, Mohammad Javad; Rajabibazl, Masoumeh; Paknejad, Malihe; Zare, Mehrak; Mohammadzadeh, Sara
2007-08-01
PR81 is an anti-MUC1 monoclonal antibody (MAb) that was generated against human MUC1 mucin and reacts with breast cancerous tissue, MUC1-positive cell lines (MCF-7, BT-20, and T-47D), and a synthetic peptide including the tandem repeat sequence of MUC1. Here we characterized the binding properties of PR81 against the tandem repeat of MUC1 by two different epitope mapping techniques, namely, PEPSCAN and phage display. Epitope mapping of PR81 MAb by PEPSCAN revealed a minimal consensus binding sequence, PDTRP, which is found on the MUC1 peptide as the most important epitope. Using the phage display peptide library, we identified the motif PD(T/S/G)RP as an epitope and the motif AVGLSPDGSRGV as a mimotope recognized by PR81. Results of these two methods showed that the two residues, arginine and aspartic acid, have important roles in antibody binding and that threonine can be substituted by either glycine or serine. These results may be of importance in tailoring antigens used in immunoassays.
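The epitope motif PD(T/S/G)RP reported above maps directly onto a regular expression. A minimal sketch (the function name is hypothetical) scans a peptide string for it. The mimotope AVGLSPDGSRGV does not contain the motif as a literal substring, which is consistent with its description as a mimotope and not as the epitope itself.

```python
import re

# PD(T/S/G)RP: proline, aspartic acid, one of T/S/G, arginine, proline.
EPITOPE = re.compile(r"PD[TSG]RP")

def find_epitopes(peptide):
    """Return (start_index, matched_text) for each motif occurrence."""
    return [(m.start(), m.group()) for m in EPITOPE.finditer(peptide)]

if __name__ == "__main__":
    for peptide in ("PDTRP", "PDSRP", "PDGRP", "AVGLSPDGSRGV"):
        print(peptide, find_epitopes(peptide))
```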
Diaz, Maureen H; Winchell, Jonas M
2016-01-01
Over the past decade there have been significant advancements in the methods used for detecting and characterizing Mycoplasma pneumoniae, a common cause of respiratory illness and community-acquired pneumonia worldwide. The repertoire of available molecular diagnostics has greatly expanded from nucleic acid amplification techniques (NAATs) that encompass a variety of chemistries used for detection, to more sophisticated characterizing methods such as multi-locus variable-number tandem-repeat analysis (MLVA), Multi-locus sequence typing (MLST), matrix-assisted laser desorption ionization-time-of-flight mass spectrometry (MALDI-TOF MS), single nucleotide polymorphism typing, and numerous macrolide susceptibility profiling methods, among others. These many molecular-based approaches have been developed and employed to continually increase the level of discrimination and characterization in order to better understand the epidemiology and biology of M. pneumoniae. This review will summarize recent molecular techniques and procedures and lend perspective to how each has enhanced the current understanding of this organism and will emphasize how Next Generation Sequencing may serve as a resource for researchers to gain a more comprehensive understanding of the genomic complexities of this insidious pathogen.
Robust infrared target tracking using discriminative and generative approaches
NASA Astrophysics Data System (ADS)
Asha, C. S.; Narasimhadhan, A. V.
2017-09-01
The process of designing an efficient tracker for thermal infrared imagery is one of the most challenging tasks in computer vision. Although a lot of advancement has been achieved in RGB videos over the decades, textureless and colorless properties of objects in thermal imagery pose hard constraints in the design of an efficient tracker. Tracking of an object using a single feature or a technique often fails to achieve greater accuracy. Here, we propose an effective method to track an object in infrared imagery based on a combination of discriminative and generative approaches. The discriminative technique runs two complementary methods in parallel: a kernelized correlation filter with spatial features and an AdaBoost classifier with pixel intensity features. After obtaining optimized locations through discriminative approaches, the generative technique is applied to determine the best target location using a linear search method. Unlike the baseline algorithms, the proposed method estimates the scale of the target by Lucas-Kanade homography estimation. To evaluate the proposed method, extensive experiments are conducted on 17 challenging infrared image sequences obtained from the LTIR dataset, and a significant improvement of mean distance precision and mean overlap precision is accomplished as compared with the existing trackers. Further, a quantitative and qualitative assessment of the proposed approach with the state-of-the-art trackers is illustrated to clearly demonstrate an overall increase in performance.
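The two evaluation metrics named above, distance precision and overlap precision, can be sketched for a single sequence as follows. The abstract does not state its thresholds, so the 20-pixel and 0.5 IoU cutoffs are assumptions based on common tracking-benchmark practice. The (x, y, width, height) box layout and the function names are also assumptions.

```python
import math

def center_error(a, b):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return math.hypot((ax + aw / 2) - (bx + bw / 2),
                      (ay + ah / 2) - (by + bh / 2))

def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def distance_precision(pred, truth, max_px=20.0):
    """Fraction of frames whose center error is within `max_px` pixels."""
    return sum(center_error(p, t) <= max_px for p, t in zip(pred, truth)) / len(truth)

def overlap_precision(pred, truth, min_iou=0.5):
    """Fraction of frames whose IoU with the ground truth is at least `min_iou`."""
    return sum(iou(p, t) >= min_iou for p, t in zip(pred, truth)) / len(truth)

if __name__ == "__main__":
    truth = [(0, 0, 10, 10), (0, 0, 10, 10)]
    pred = [(0, 0, 10, 10), (30, 0, 10, 10)]   # second frame: tracker drifted
    print(distance_precision(pred, truth), overlap_precision(pred, truth))
```

Averaging these per-sequence values over all sequences would give the mean distance precision and mean overlap precision.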
van Huet, Ramon A. C.; Pierrache, Laurence H.M.; Meester-Smoor, Magda A.; Klaver, Caroline C.W.; van den Born, L. Ingeborgh; Hoyng, Carel B.; de Wijs, Ilse J.; Collin, Rob W. J.; Hoefsloot, Lies H.
2015-01-01
Purpose To determine the efficacy of multiple versions of a commercially available arrayed primer extension (APEX) microarray chip for autosomal recessive retinitis pigmentosa (arRP). Methods We included 250 probands suspected of arRP who were genetically analyzed with the APEX microarray between January 2008 and November 2013. The mode of inheritance had to be autosomal recessive according to the pedigree (including isolated cases). If the microarray identified a heterozygous mutation, we performed Sanger sequencing of exons and exon–intron boundaries of that specific gene. The efficacy of this microarray chip with the additional Sanger sequencing approach was determined by the percentage of patients that received a molecular diagnosis. We also collected data from genetic tests other than the APEX analysis for arRP to provide a detailed description of the molecular diagnoses in our study cohort. Results The APEX microarray chip for arRP identified the molecular diagnosis in 21 (8.5%) of the patients in our cohort. Additional Sanger sequencing yielded a second mutation in 17 patients (6.8%), thereby establishing the molecular diagnosis. In total, 38 patients (15.2%) received a molecular diagnosis after analysis using the microarray and additional Sanger sequencing approach. Further genetic analyses after a negative result of the arRP microarray (n = 107) resulted in a molecular diagnosis of arRP (n = 23), autosomal dominant RP (n = 5), X-linked RP (n = 2), and choroideremia (n = 1). Conclusions The efficacy of the commercially available APEX microarray chips for arRP appears to be low, most likely caused by the limitations of this technique and the genetic and allelic heterogeneity of RP. Diagnostic yields up to 40% have been reported for next-generation sequencing (NGS) techniques that, as expected, thereby outperform targeted APEX analysis. PMID:25999674
Leichty, Aaron R; Brisson, Dustin
2014-10-01
Population genomic analyses have demonstrated power to address major questions in evolutionary and molecular microbiology. Collecting populations of genomes is hindered in many microbial species by the absence of a cost-effective and practical method to collect ample quantities of sufficiently pure genomic DNA for next-generation sequencing. Here we present a simple method to amplify genomes of a target microbial species present in a complex, natural sample. The selective whole genome amplification (SWGA) technique amplifies target genomes using nucleotide sequence motifs that are common in the target microbe genome, but rare in the background genomes, to prime the highly processive phi29 polymerase. SWGA thus selectively amplifies the target genome from samples in which it originally represented a minor fraction of the total DNA. The post-SWGA samples are enriched in target genomic DNA and are ideal for population resequencing. We demonstrate the efficacy of SWGA using both laboratory-prepared mixtures of cultured microbes and a natural host-microbe association. Targeted amplification of Borrelia burgdorferi mixed with Escherichia coli at genome ratios of 1:2000 resulted in >10^5-fold amplification of the target genomes with <6.7-fold amplification of the background. SWGA-treated genomic extracts from Wolbachia pipientis-infected Drosophila melanogaster resulted in up to 70% of high-throughput resequencing reads mapping to the W. pipientis genome. By contrast, 2-9% of sequencing reads were derived from W. pipientis without prior amplification. The SWGA technique results in high sequencing coverage at a fraction of the sequencing effort, thus allowing population genomic studies at affordable costs. Copyright © 2014 by the Genetics Society of America.
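The motif-selection idea in this abstract (motifs common in the target genome but rare in the background) can be illustrated with a small k-mer ranking. This is only a sketch under assumptions: the function names, the toy sequences, the per-base frequency ratio and the +1 pseudocount are all hypothetical choices for illustration, not the published SWGA primer-selection procedure.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def rank_motifs(target, background, k, top=3):
    """Rank k-mers by (target frequency) / (background frequency), per base.

    Hypothetical scoring: a +1 pseudocount on the background count avoids
    division by zero for motifs that are absent from the background.
    """
    target_counts = kmer_counts(target, k)
    background_counts = kmer_counts(background, k)
    scored = []
    for motif, n_target in target_counts.items():
        f_target = n_target / len(target)
        f_background = (background_counts.get(motif, 0) + 1) / len(background)
        scored.append((f_target / f_background, motif))
    scored.sort(reverse=True)
    return [motif for _, motif in scored[:top]]


# Toy sequences, invented for the example.
target = "ACGTACGTACGTTTGACGTACGT"
background = "GGGCCCGGGCCCAAATTTGGGCCCGGGCCC"
print(rank_motifs(target, background, k=4))
```

A real primer design would also have to weigh melting temperature, motif spacing along the genome and primer-dimer formation, none of which this sketch models.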
Wall, Melanie; Cheslack-Postava, Keely; Hu, Mei-Chen; Feng, Tianshu; Griesler, Pamela; Kandel, Denise B
2018-01-01
This study sought to specify (1) the position of nonmedical prescription opioids (NMPO) in drug initiation sequences among Millennials (1979-96), Generation X (1964-79), and Baby Boomers (1949-64) and (2) gender and racial/ethnic differences in sequences among Millennials. Data are from the 2013-2014 National Surveys on Drug Use and Health (n = 73,026). We identified statistically significant drug initiation sequences involving alcohol/cigarettes, marijuana, NMPO, cocaine, and heroin using a novel method distinguishing significant sequences from patterns expected only due to correlations induced by common liability among drugs. Alcohol/cigarettes followed by marijuana was the most common sequence. NMPO or cocaine use after marijuana, and heroin use after NMPO or cocaine, differed by generation. Among successively younger generations, NMPO after marijuana and heroin after NMPO increased. Millennials were more likely to initiate NMPO than cocaine after marijuana; Generation X and Baby Boomers were less likely (odds ratios = 1.4;0.3;0.2). Millennials were more likely than Generation X and Baby Boomers to use heroin after NMPO (hazards ratios = 7.1;3.4;2.5). In each generation, heroin users were far more likely to start heroin after both NMPO and cocaine than either alone. Sequences were similar by gender. Fewer paths were significant among African-Americans. NMPOs play a more prominent role in drug initiation sequences among Millennials than prior generations. Among Millennials, NMPO use is more likely than cocaine to follow marijuana use. In all generations, transition to heroin from NMPO significantly occurs only when both NMPO and cocaine have been used. Delineation of drug sequences suggests optimal points in development for prevention and treatment efforts. Copyright © 2017 Elsevier B.V. All rights reserved.
Reinharz, Vladimir; Ponty, Yann; Waldispühl, Jérôme
2013-07-01
The design of RNA sequences folding into predefined secondary structures is a milestone for many synthetic biology and gene therapy studies. Most of the current software uses similar local search strategies (i.e. a random seed is progressively adapted to acquire the desired folding properties) and more importantly do not allow the user to control explicitly the nucleotide distribution such as the GC-content in their sequences. However, the latter is an important criterion for large-scale applications as it could presumably be used to design sequences with better transcription rates and/or structural plasticity. In this article, we introduce IncaRNAtion, a novel algorithm to design RNA sequences folding into target secondary structures with a predefined nucleotide distribution. IncaRNAtion uses a global sampling approach and weighted sampling techniques. We show that our approach is fast (i.e. running time comparable or better than local search methods), seedless (we remove the bias of the seed in local search heuristics) and successfully generates high-quality sequences (i.e. thermodynamically stable) for any GC-content. To complete this study, we develop a hybrid method combining our global sampling approach with local search strategies. Remarkably, our glocal methodology overcomes both local and global approaches for sampling sequences with a specific GC-content and target structure. IncaRNAtion is available at csb.cs.mcgill.ca/incarnation/. Supplementary data are available at Bioinformatics online.
ERGC: an efficient referential genome compression algorithm
Saha, Subrata; Rajasekaran, Sanguthevar
2015-01-01
Motivation: Genome sequencing has become faster and more affordable. Consequently, the number of available complete genomic sequences is increasing rapidly. As a result, the cost to store, process, analyze and transmit the data is becoming a bottleneck for research and future medical applications. So, the need for devising efficient data compression and data reduction techniques for biological sequencing data is growing by the day. Although a number of standard data compression algorithms exist, they are not efficient in compressing biological data. These generic algorithms do not exploit some inherent properties of the sequencing data while compressing. To exploit statistical and information-theoretic properties of genomic sequences, we need specialized compression algorithms. Five different next-generation sequencing data compression problems have been identified and studied in the literature. We propose a novel algorithm for one of these problems known as reference-based genome compression. Results: We have done extensive experiments using five real sequencing datasets. The results on real genomes show that our proposed algorithm is indeed competitive and performs better than the best known algorithms for this problem. It achieves compression ratios that are better than those of the currently best performing algorithms. The time to compress and decompress the whole genome is also very promising. Availability and implementation: The implementations are freely available for non-commercial purposes. They can be downloaded from http://engr.uconn.edu/~rajasek/ERGC.zip. Contact: rajasek@engr.uconn.edu PMID:26139636
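To make the notion of reference-based genome compression concrete, the sketch below encodes a target sequence as a list of matches against a reference plus literal runs for the parts that differ. It is a generic greedy illustration, not the ERGC algorithm itself (the abstract gives no algorithmic details); the function names, the seed length `k` and the toy sequences are assumptions.

```python
def compress(reference, target, k=4):
    """Encode target as ('M', ref_pos, length) matches and ('L', text) literals."""
    # Index the first occurrence of every k-mer in the reference.
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], i)

    ops, literal, i = [], [], 0
    while i < len(target):
        pos = index.get(target[i:i + k])
        if pos is None:
            # No k-mer seed here: emit this base as a literal.
            literal.append(target[i])
            i += 1
            continue
        # Extend the seed match for as long as both sequences agree.
        length = k
        while (i + length < len(target)
               and pos + length < len(reference)
               and target[i + length] == reference[pos + length]):
            length += 1
        if literal:
            ops.append(("L", "".join(literal)))
            literal = []
        ops.append(("M", pos, length))
        i += length
    if literal:
        ops.append(("L", "".join(literal)))
    return ops


def decompress(reference, ops):
    """Rebuild the target from the reference and the operation list."""
    parts = []
    for op in ops:
        if op[0] == "L":
            parts.append(op[1])
        else:
            parts.append(reference[op[1]:op[1] + op[2]])
    return "".join(parts)


reference = "ACGTTGCAAGGCTTAACCGG"
target = "ACGTTGCATGGCTTAACCGG"  # one substitution relative to the reference
ops = compress(reference, target)
print(ops)
print(decompress(reference, ops) == target)
```

Because two closely related genomes differ at few positions, most of the target collapses into a handful of match records, which is where the gain over generic compressors comes from.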
YAMAT-seq: an efficient method for high-throughput sequencing of mature transfer RNAs
Shigematsu, Megumi; Honda, Shozo; Loher, Phillipe; Telonis, Aristeidis G.; Rigoutsos, Isidore
2017-01-01
Besides translation, transfer RNAs (tRNAs) play many non-canonical roles in various biological pathways and exhibit highly variable expression profiles. To unravel the emerging complexities of tRNA biology and the molecular mechanisms underlying them, an efficient tRNA sequencing method is required. However, the rigid structure of tRNA has presented a challenge to the development of such methods. We report the development of Y-shaped Adapter-ligated MAture TRNA sequencing (YAMAT-seq), an efficient and convenient method for high-throughput sequencing of mature tRNAs. YAMAT-seq circumvents the issue of inefficient adapter ligation, a characteristic of conventional RNA sequencing methods for mature tRNAs, by employing the efficient and specific ligation of a Y-shaped adapter to mature tRNAs using T4 RNA Ligase 2. Subsequent cDNA amplification and next-generation sequencing successfully yield numerous mature tRNA sequences. YAMAT-seq has high specificity for mature tRNAs and high sensitivity to detect most isoacceptors from minute amounts of total RNA. Moreover, YAMAT-seq shows quantitative capability to estimate expression levels of mature tRNAs, and has high reproducibility and broad applicability for various cell lines. YAMAT-seq thus provides a high-throughput technique for identifying tRNA profiles and their regulation in various transcriptomes, which could play important regulatory roles in translation and other biological processes. PMID:28108659
2012-01-01
Background Barcodes are unique DNA sequence tags that can be used to specifically label individual mutants. The barcode-tagged open reading frame (ORF) haploid deletion mutant collections in the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe allow for high-throughput mutant phenotyping because the relative growth of mutants in a population can be determined by monitoring the proportions of their associated barcodes. While these mutant collections have greatly facilitated genome-wide studies, mutations in essential genes are not present, and the roles of these genes are not as easily studied. To further support genome-scale research in S. pombe, we generated a barcode-tagged fission yeast insertion mutant library that has the potential of generating viable mutations in both essential and non-essential genes and can be easily analyzed using standard molecular biological techniques. Results An insertion vector containing a selectable ura4+ marker and a random barcode was used to generate a collection of 10,000 fission yeast insertion mutants stored individually in 384-well plates and as six pools of mixed mutants. Individual barcodes are flanked by Sfi I recognition sites and can be oligomerized in a unique orientation to facilitate barcode sequencing. Independent genetic screens on a subset of mutants suggest that this library contains a diverse collection of single insertion mutations. We present several approaches to determine insertion sites. Conclusions This collection of S. pombe barcode-tagged insertion mutants is well-suited for genome-wide studies. Because insertion mutations may eliminate, reduce or alter the function of essential and non-essential genes, this library will contain strains with a wide range of phenotypes that can be assayed by their associated barcodes. The design of the barcodes in this library allows for barcode sequencing using next generation or standard benchtop cloning approaches. PMID:22554201
Stable MSAP markers for the distinction of Vitis vinifera cv Pinot noir clones.
Ocaña, Juan; Walter, Bernard; Schellenbaum, Paul
2013-11-01
Grapevine is one of the most economically important fruit crops. Molecular markers have been used to study grapevine diversity. For instance, simple sequence repeats are a powerful tool for identification of grapevine cultivars, while amplified fragment length polymorphisms have shown their usefulness in intra-varietal diversity studies. Other techniques such as sequence-specific amplified polymorphism are based on the presence of mobile elements in the genome, but their detection lies upon their activity. Relevant attention has been drawn toward epigenetic sources of variation. In this study, a set of Vitis vinifera cv Pinot noir clones were analyzed using the methylation-sensitive amplified polymorphism technique with isoschizomers MspI and HpaII. Nine out of fourteen selective primer combinations were informative and generated two types of polymorphic fragments which were categorized as "stable" and "unstable." In total, 23 stable fragments were detected and they discriminated 92.5 % of the studied clones. Detected stable polymorphisms were either common to several clones, restricted to a few clones or unique to a single clone. The identification of these stable epigenetic markers will be useful in clonal diversity studies. We highlight the relevance of stable epigenetic variation in V. vinifera clones and analyze at which level these markers could be applicable for the development of forthright techniques for clonal distinction.
Zhong, Xingyu; Tian, Yuqing; Niu, Guoqing; Tan, Huarong
2013-07-01
A draft genome sequence of Streptomyces ansochromogenes 7100 was generated using 454 sequencing technology. In combination with local BLAST searches and gap filling techniques, a comprehensive antiSMASH-based method was adopted to assemble the secondary metabolite biosynthetic gene clusters in the draft genome of S. ansochromogenes. A total of at least 35 putative gene clusters were identified and assembled. Transcriptional analysis showed that 20 of the 35 gene clusters were expressed in either or all of the three different media tested, whereas the other 15 gene clusters were silent in all three different media. This study provides a comprehensive method to identify and assemble secondary metabolite biosynthetic gene clusters in draft genomes of Streptomyces, and will significantly promote functional studies of these secondary metabolite biosynthetic gene clusters.
Phage display as a technology delivering on the promise of peptide drug discovery.
Hamzeh-Mivehroud, Maryam; Alizadeh, Ali Akbar; Morris, Michael B; Church, W Bret; Dastmalchi, Siavoush
2013-12-01
Phage display represents an important approach in the development pipeline for producing peptide and peptidomimetic therapeutics. Using randomly generated DNA sequences and molecular biology techniques, large diverse peptide libraries can be displayed on the phage surface. The phage library can be incubated with a target of interest and the phage that bind can be isolated and sequenced to reveal the displayed peptides' primary structure. In this review, we focus on the 'mechanics' of the phage display process, whilst highlighting many diverse and subtle ways it has been used to further the drug-development process, including the potential for the phage particle itself to be used as a drug carrier targeted to a particular pathogen or cell type in the body. Copyright © 2013 Elsevier Ltd. All rights reserved.
Gillingham, Dennis; Sauter, Basilius
2018-05-06
Drugs that covalently modify DNA are components of most chemotherapy regimens, often serving as first-line treatments. Classically the chemical reactivity of DNA alkylators has been determined in vitro with short oligonucleotides. Here we use next generation RNA sequencing to report on the chemoselectivity of alkylating agents. We develop the method with the well-known clinically used DNA modifying drugs streptozotocin and temozolomide, and then apply the technique to profile RNA modification with uncharacterized alkylation reactions such as with powerful electrophiles like trimethylsilyldiazomethane. The multiplexed and massively parallel format of NGS allows analyses of chemical reactivity in nucleic acids to be accomplished in less time with greater statistical power. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Detection of generalized synchronization using echo state networks
NASA Astrophysics Data System (ADS)
Ibáñez-Soria, D.; Garcia-Ojalvo, J.; Soria-Frisch, A.; Ruffini, G.
2018-03-01
Generalized synchronization between coupled dynamical systems is a phenomenon of relevance in applications that range from secure communications to physiological modelling. Here, we test the capabilities of reservoir computing and, in particular, echo state networks for the detection of generalized synchronization. A nonlinear dynamical system consisting of two coupled Rössler chaotic attractors is used to generate temporal series consisting of time-locked generalized synchronized sequences interleaved with unsynchronized ones. Correctly tuned, echo state networks are able to efficiently discriminate between unsynchronized and synchronized sequences even in the presence of relatively high levels of noise. Compared to other state-of-the-art techniques of synchronization detection, the online capabilities of the proposed Echo State Network based methodology make it a promising choice for real-time applications aiming to monitor dynamical synchronization changes in continuous signals.
Non-coding RNAs in virology: an RNA genomics approach.
Isaac, Christopher; Patel, Trushar R; Zovoilis, Athanasios
2018-04-01
Advances in sequencing technologies and bioinformatic analysis techniques have greatly improved our understanding of various classes of RNAs and their functions. Despite not coding for proteins, non-coding RNAs (ncRNAs) are emerging as essential biomolecules fundamental for cellular functions and cell survival. Interestingly, ncRNAs produced by viruses not only control the expression of viral genes, but also influence host cell regulation and circumvent host innate immune response. Correspondingly, ncRNAs produced by the host genome can play a key role in host-virus interactions. In this article, we will first discuss a number of types of viral and mammalian ncRNAs associated with viral infections. Subsequently, we also describe the new possibilities and opportunities that RNA genomics and next-generation sequencing technologies provide for studying ncRNAs in virology.
Assignment of the SLA alleles and reproductive potential of selective breeding Duroc pig lines.
Soe, Ok Kar; Ohba, Yasunori; Imaeda, Noriaki; Nishii, Naohito; Takasu, Masaki; Yoshioka, Gou; Kawata, Hisako; Shigenari, Atsuko; Uenishi, Hirohide; Inoko, Hidetoshi; Ando, Asako; Kitagawa, Hitoshi
2008-01-01
Pigs with defined swine leukocyte antigen (SLA) haplotypes and their detailed information are useful for transplantation and immunological studies. We developed two herds of SLA homozygous Duroc pigs with novel SLA haplotypes and characterized their reproductive potential. For selective inbreeding, a pair of Duroc pigs was chosen as initial breeders, and substantial breeding within progenies was carried out for eight generations. In the selective breeding Duroc pigs, SLA haplotypes were assigned by nucleotide sequence determination of reverse transcription polymerase chain reaction (RT-PCR) products of three SLA classical class I genes and two class II genes. Based on this sequence information, we developed a rapid and simple SLA class II DNA typing method by polymerase chain reaction-sequence specific primer (PCR-SSP) technique. As a complementary method for the characterization of the SLA haplotypes, genetic polymorphisms of 36 microsatellite (MS) markers within the SLA region were also analyzed in the selective breeding pigs with SLA homozygous/heterozygous haplotypes. Among the selective breeding pigs from the third to fifth generations, only two SLA haplotypes were identified by the RT-PCR based SLA typing method; Hp-27.30 (SLA-1*08an03, SLA-1*06an04, SLA-2*0102, SLA-3*0101 DRB1*1101 and DQB1*0503) and Hp-60.13 (SLA-1*an02, SLA-2*1002, SLA-3*0502, DRB1*0403 and DQB1*0303). In these two SLA haplotypes, two class I haplotypes, Hp-27.0 and Hp-60.0, are novel. Furthermore, two class II haplotypes, Hp-0.30 and Hp-0.13, which were previously reported in Korean native pigs and pigs of Hanford breed, respectively, were also assigned by a simple assay using a PCR-SSP technique in the entire selective breeding stock. Moreover, two haplotype specific MS patterns were observed across the entire SLA region in the selective breeding (homozygous/heterozygous) pigs. No morphological abnormalities were observed in selective breeding pigs. 
The theoretical inbreeding coefficient at the eighth generation was 78.5%. In all generations of selective breeding pigs, litter sizes were comparable, but progenies from the fifth to eighth generations had significantly lower weaning weights (P < 0.01) than those of the non-selective breeding pigs. We established and characterized SLA homozygous Duroc herds with two kinds of haplotypes that can be used as a new resource for transplantation and other biomedical studies.
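The 78.5% figure can be reproduced with the textbook recurrence for continuous full-sib mating, F_t = (1 + 2 F_{t-1} + F_{t-2}) / 4. The abstract does not state the mating scheme, so the following is a sketch under assumptions: strict full-sib mating in every generation, unrelated non-inbred founders counted as generation 0, and their offspring as generation 1. The function name is hypothetical.

```python
def full_sib_inbreeding(generations):
    """Return [F_0, ..., F_n] for continuous full-sib mating.

    Assumes unrelated, non-inbred founders (generation 0) whose offspring
    (generation 1) are therefore also non-inbred.
    """
    f = [0.0, 0.0]
    for _ in range(2, generations + 1):
        # Standard full-sib recurrence: F_t = (1 + 2*F_{t-1} + F_{t-2}) / 4
        f.append((1 + 2 * f[-1] + f[-2]) / 4)
    return f[:generations + 1]


for generation, value in enumerate(full_sib_inbreeding(8)):
    print(generation, f"{value:.3f}")
```

Under these assumptions the eighth generation comes out at 0.785, which matches the reported 78.5%; a different founder relationship or generation numbering would shift the series.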
Concatenated shift registers generating maximally spaced phase shifts of PN-sequences
NASA Technical Reports Server (NTRS)
Hurd, W. J.; Welch, L. R.
1977-01-01
A large class of linearly concatenated shift registers is shown to generate approximately maximally spaced phase shifts of pn-sequences, for use in pseudorandom number generation. A constructive method is presented for finding members of this class, for almost all degrees for which primitive trinomials exist. The sequences which result are not normally characterized by trinomial recursions, which is desirable since trinomial sequences can have some undesirable randomness properties.
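For readers unfamiliar with pn-sequences, the sketch below generates one from a single shift register defined by the primitive trinomial x^4 + x + 1 and checks its period. It illustrates only the basic trinomial recursion that the abstract contrasts against, not the concatenated-register construction of the report; the function name and the choice of degree 4 are assumptions made to keep the example small.

```python
def lfsr_sequence(taps, state, length):
    """Generate bits from a Fibonacci LFSR.

    With d = len(state), the recursion is a[n+d] = XOR of a[n+t] for t in taps.
    """
    state = list(state)
    out = []
    for _ in range(length):
        out.append(state[0])
        new_bit = 0
        for t in taps:
            new_bit ^= state[t]
        state = state[1:] + [new_bit]
    return out


# x^4 + x + 1 corresponds to a[n+4] = a[n+1] XOR a[n].
seq = lfsr_sequence(taps=(0, 1), state=(1, 0, 0, 0), length=30)

# Smallest shift at which the first 15 bits repeat.
period = next(p for p in range(1, 16) if seq[:15] == seq[p:p + 15])
print(seq[:15], period)

# A phase-shifted copy of the same pn-sequence starts from a later state.
shifted = lfsr_sequence(taps=(0, 1), state=seq[5:9], length=15)
print(shifted == seq[5:20])
```

A maximal-length sequence of degree d has period 2^d - 1, so d = 4 gives 15. Widely spaced phase shifts of one such sequence are what a parallel pseudorandom number generator would draw on.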
An Overview on Prenatal Screening for Chromosomal Aberrations.
Hixson, Lucas; Goel, Srishti; Schuber, Paul; Faltas, Vanessa; Lee, Jessica; Narayakkadan, Anjali; Leung, Ho; Osborne, Jim
2015-10-01
This article is a review of current and emerging methods used for prenatal detection of chromosomal aneuploidies. Chromosomal anomalies in the developing fetus can occur in any pregnancy and lead to death prior to or shortly after birth or to costly lifelong disabilities. Early detection of fetal chromosomal aneuploidies, an atypical number of certain chromosomes, can help parents evaluate their pregnancy options. Current diagnostic methods include maternal serum sampling or nuchal translucency testing, which are minimally invasive diagnostics, but lack sensitivity and specificity. The gold standard, karyotyping, requires amniocentesis or chorionic villus sampling, which are highly invasive and can cause abortions. In addition, many of these methods have long turnaround times, which can cause anxiety in mothers. Next-generation sequencing of fetal DNA in maternal blood enables minimally invasive, sensitive, and reasonably rapid analysis of fetal chromosomal anomalies and can be of clinical utility to parents. This review covers traditional methods and next-generation sequencing techniques for diagnosing aneuploidies in terms of clinical utility, technological characteristics, and market potential. © 2015 Society for Laboratory Automation and Screening.
NASA Astrophysics Data System (ADS)
He, Nana; Zhang, Xiaolong; Zhao, Juanjuan; Zhao, Huilan; Qiang, Yan
2017-07-01
While the popular thin-layer scanning technology of spiral CT has helped to improve diagnoses of lung diseases, the large volumes of scanning images produced by the technology also dramatically increase the load of physicians in lesion detection. Computer-aided diagnosis techniques such as lesion segmentation in thin CT sequences have been developed to address this issue, but it remains a challenge to achieve high segmentation efficiency and accuracy without much manual intervention. In this paper, we present our research on automated segmentation of lung parenchyma with an improved geodesic active contour model, the geodesic active contour model based on similarity (GACBS). Combining a spectral clustering algorithm based on Nystrom (SCN) with GACBS, this algorithm first extracts key image slices, then uses these slices to generate an initial contour of pulmonary parenchyma of un-segmented slices with an interpolation algorithm, and finally segments lung parenchyma of un-segmented slices. Experimental results show that the segmentation results generated by our method are close to what manual segmentation can produce, with an average volume overlap ratio of 91.48%.
Mitochondrial DNA variant at HVI region as a candidate of genetic markers of type 2 diabetes
NASA Astrophysics Data System (ADS)
Gumilar, Gun Gun; Purnamasari, Yunita; Setiadi, Rahmat
2016-02-01
Mitochondrial DNA (mtDNA) is maternally inherited, and mtDNA mutations can contribute to the excess of maternal inheritance of type 2 diabetes. Owing to its high mutation rate, the hypervariable region I (HVI) is one of the mtDNA regions most often associated with the disease. This study was therefore conducted to determine the genetic variants of human mtDNA HVI related to type 2 diabetes in four samples taken from four generations of one lineage. The steps included lysis of hair follicles, amplification of the mtDNA HVI fragment by polymerase chain reaction (PCR), detection of PCR products by agarose gel electrophoresis, measurement of the mtDNA concentration with a UV-Vis spectrophotometer, determination of the nucleotide sequence by direct sequencing, and analysis of the sequencing results with the SeqMan DNASTAR program. Comparison of the sample nucleotide sequences with the revised Cambridge Reference Sequence (rCRS) revealed six shared mutations: C16147T, T16189C, C16193del, T16127C, A16235G, and A16293C. After the data were compared with secondary data from Mitomap and NCBI, two mutations, T16189C and T16217C, were identified as candidate genetic markers of type 2 diabetes, even though these mutations were also found in the generations without a type 2 diabetes diagnosis. The results of this study are expected to contribute to the database of human mtDNA genetic variants associated with metabolic diseases, so that in the future it can be used in various fields, especially medicine.
Song, Beng-Kah; Nadarajah, Kalaivani; Romanov, Michael N; Ratnam, Wickneswari
2005-01-01
The construction of BAC-contig physical maps is an important step towards a partial or ultimate genome sequence analysis. Here, we describe our initial efforts to apply an overgo approach to screen a BAC library of the Malaysian wild rice species, Oryza rufipogon. Overgo design is based on repetitive element masking and sequence uniqueness, and uses short probes (approximately 40 bp), making this method highly efficient and specific. Pairs of 24-bp oligos that contain an 8-bp overlap were developed from the publicly available genomic sequences of the cultivated rice, O. sativa, to generate 20 overgo probes for a 1-Mb region that encompasses a yield enhancement QTL yld1.1 in O. rufipogon. The advantages of a high similarity in melting temperature, hybridization kinetics and specific activities of overgos further enabled a pooling strategy for library screening by filter hybridization. Two pools of ten overgos each were hybridized to high-density filters representing the O. rufipogon genomic BAC library. These screening tests succeeded in providing 69 PCR-verified positive hits from a total of 23,040 BAC clones of the entire O. rufipogon library. A minimal tiling path of clones was generated to contribute to a fully covered BAC-contig map of the targeted 1-Mb region. The developed protocol for overgo design based on O. sativa sequences as a comparative genomic framework, and the pooled overgo hybridization screening technique are suitable means for high-resolution physical mapping and the identification of BAC candidates for sequencing.
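The oligo geometry described here (two 24-bp oligos with an 8-bp overlap spanning a probe of about 40 bp) can be sketched as follows. This covers only the splitting step; the repeat masking and uniqueness checks mentioned in the abstract are not modelled, and the function names and the 40-bp example region are invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def overgo_pair(region, oligo_len=24, overlap=8):
    """Split a probe region into two oligos that anneal over `overlap` bases.

    The forward oligo is the 5' end of the region; the reverse oligo is the
    reverse complement of the 3' end, so their 3' ends are complementary.
    """
    probe_len = 2 * oligo_len - overlap  # 40 bp for 24-mers with an 8-bp overlap
    if len(region) != probe_len:
        raise ValueError(f"region must be exactly {probe_len} bp")
    forward = region[:oligo_len]
    reverse = reverse_complement(region[probe_len - oligo_len:])
    return forward, reverse


# Hypothetical 40-bp unique region, written in four 10-bp chunks.
region = "ACGTTGCAAG" "GCTTAACCGG" "ATCGATTACA" "GGCTTAGCAT"
forward, reverse = overgo_pair(region)
print(forward)
print(reverse)
```

After annealing, each oligo's 3' end can be extended along the other as template, which is how a labelled double-stranded probe of the full length is typically obtained in overgo protocols.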
LightAssembler: fast and memory-efficient assembly algorithm for high-throughput sequencing reads.
El-Metwally, Sara; Zakaria, Magdi; Hamza, Taher
2016-11-01
The deluge of current sequenced data has exceeded Moore's Law, more than doubling every 2 years since the next-generation sequencing (NGS) technologies were invented. Accordingly, we will be able to generate more and more data with high speed at fixed cost, but lack the computational resources to store, process and analyze it. With error-prone high-throughput NGS reads and genomic repeats, the assembly graph contains a massive amount of redundant nodes and branching edges. Most assembly pipelines require this large graph to reside in memory to start their workflows, which is intractable for mammalian genomes. Resource-efficient genome assemblers combine both the power of advanced computing techniques and innovative data structures to encode the assembly graph efficiently in computer memory. LightAssembler is a lightweight assembly algorithm designed to be executed on a desktop machine. It uses a pair of cache oblivious Bloom filters, one holding a uniform sample of [Formula: see text]-spaced sequenced [Formula: see text]-mers and the other holding [Formula: see text]-mers classified as likely correct, using a simple statistical test. LightAssembler contains a light implementation of the graph traversal and simplification modules that achieves comparable assembly accuracy and contiguity to other competing tools. Our method reduces the memory usage by [Formula: see text] compared to the resource-efficient assemblers using benchmark datasets from GAGE and Assemblathon projects. While LightAssembler can be considered as a gap-based sequence assembler, different gap sizes result in an almost constant assembly size and genome coverage. https://github.com/SaraEl-Metwally/LightAssembler CONTACT: sarah_almetwally4@mans.edu.eg Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
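The memory saving in this kind of assembler comes from storing k-mers in Bloom filters rather than in an exact set. The sketch below is a plain, textbook Bloom filter applied to the k-mers of one read. It is not the cache-oblivious variant or the sampling scheme used by LightAssembler; the class name, the filter size, the hash construction and the value of k are all assumptions for illustration.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, item):
        # Derive n_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))


def kmers(read, k):
    """Yield every overlapping k-mer of a read."""
    return (read[i:i + k] for i in range(len(read) - k + 1))


read = "ACGTTGCAAGGCTTAACCGG"
bloom = BloomFilter()
for kmer in kmers(read, k=5):
    bloom.add(kmer)

print(all(kmer in bloom for kmer in kmers(read, k=5)))
```

A query for a k-mer that was never inserted usually returns False but can occasionally return True; assemblers built on Bloom filters have to tolerate or filter out those false positives during graph traversal.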
Neural Sequence Generation Using Spatiotemporal Patterns of Inhibition.
Cannon, Jonathan; Kopell, Nancy; Gardner, Timothy; Markowitz, Jeffrey
2015-11-01
Stereotyped sequences of neural activity are thought to underlie reproducible behaviors and cognitive processes ranging from memory recall to arm movement. One of the most prominent theoretical models of neural sequence generation is the synfire chain, in which pulses of synchronized spiking activity propagate robustly along a chain of cells connected by highly redundant feedforward excitation. But recent experimental observations in the avian song production pathway during song generation have shown excitatory activity interacting strongly with the firing patterns of inhibitory neurons, suggesting a process of sequence generation more complex than feedforward excitation. Here we propose a model of sequence generation inspired by these observations in which a pulse travels along a spatially recurrent excitatory chain, passing repeatedly through zones of local feedback inhibition. In this model, synchrony and robust timing are maintained not through redundant excitatory connections, but rather through the interaction between the pulse and the spatiotemporal pattern of inhibition that it creates as it circulates the network. These results suggest that spatially and temporally structured inhibition may play a key role in sequence generation.
Dipeptide Sequence Determination: Analyzing Phenylthiohydantoin Amino Acids by HPLC
NASA Astrophysics Data System (ADS)
Barton, Janice S.; Tang, Chung-Fei; Reed, Steven S.
2000-02-01
Amino acid composition and sequence determination, important techniques for characterizing peptides and proteins, are essential for predicting conformation and studying sequence alignment. This experiment presents improved, fundamental methods of sequence analysis for an upper-division biochemistry laboratory. Working in pairs, students use the Edman reagent to prepare phenylthiohydantoin derivatives of amino acids for determination of the sequence of an unknown dipeptide. With a single HPLC technique, students identify both the N-terminal amino acid and the composition of the dipeptide. This method yields good precision of retention times and allows use of a broad range of amino acids as components of the dipeptide. Students learn fundamental principles and techniques of sequence analysis and HPLC.
WebLogo: A Sequence Logo Generator
Crooks, Gavin E.; Hon, Gary; Chandonia, John-Marc; Brenner, Steven E.
2004-01-01
WebLogo generates sequence logos, graphical representations of the patterns within a multiple sequence alignment. Sequence logos provide a richer and more precise description of sequence similarity than consensus sequences and can rapidly reveal significant features of the alignment otherwise difficult to perceive. Each logo consists of stacks of letters, one stack for each position in the sequence. The overall height of each stack indicates the sequence conservation at that position (measured in bits), whereas the height of symbols within the stack reflects the relative frequency of the corresponding amino or nucleic acid at that position. WebLogo has been enhanced recently with additional features and options, to provide a convenient and highly configurable sequence logo generator. A command line interface and the complete, open WebLogo source code are available for local installation and customization. PMID:15173120
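The stack heights described above follow from the information content of each alignment column: the maximum entropy for the alphabet minus the observed entropy, in bits. The sketch below computes that quantity and the per-symbol heights for a toy DNA alignment. The function names and the alignment are invented, and the small-sample correction that logo tools commonly apply is deliberately left out, so this is an approximation rather than WebLogo's exact calculation.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=4):
    """Information content (bits) of one column: log2(alphabet) - entropy."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def logo_heights(alignment, alphabet_size=4):
    """For each position, map each symbol to its height in the logo stack."""
    heights = []
    for column in zip(*alignment):
        info = column_information(column, alphabet_size)
        freqs = Counter(column)
        # Each symbol gets a share of the stack proportional to its frequency.
        heights.append({sym: info * n / len(column) for sym, n in freqs.items()})
    return heights


# Toy alignment of four equal-length DNA sequences.
alignment = ["ACGT", "ACGA", "ACCT", "AGGT"]
for position, stack in enumerate(logo_heights(alignment)):
    print(position, {sym: round(h, 3) for sym, h in stack.items()})
```

A fully conserved DNA column reaches the maximum of 2 bits, while a column with all four bases at equal frequency has height 0; for protein alignments the same code applies with `alphabet_size=20`.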