Science.gov

Sample records for automated dna sequencing

  1. Automated DNA Sequencing System

    SciTech Connect

    Armstrong, G.A.; Ekkebus, C.P.; Hauser, L.J.; Kress, R.L.; Mural, R.J.

    1999-04-25

    Oak Ridge National Laboratory (ORNL) is developing a core DNA sequencing facility to support biological research endeavors at ORNL and to conduct basic sequencing automation research. This facility is novel because its development is based on existing standard biology laboratory equipment; thus, the development process is of interest to the many small laboratories trying to use automation to control costs and increase throughput. Before automation, biology laboratory personnel purified DNA, completed cycle sequencing, and prepared 96-well sample plates with commercially available hardware designed specifically for each step in the process. Following purification and thermal cycling, an automated sequencing machine was used for the sequencing. A technician handled all movement of the 96-well sample plates between machines. To automate the process, ORNL is adding a CRS Robotics A-465 arm, ABI 377 sequencing machine, automated centrifuge, automated refrigerator, and possibly an automated SpeedVac. The entire system will be integrated with one central controller that will direct each machine and the robot. The goal of this system is to completely automate the sequencing procedure from bacterial cell samples through ready-to-be-sequenced DNA and ultimately to completed sequence. The system will be flexible and will accommodate different chemistries than existing automated sequencing lines. The system will be expanded in the future to include colony picking and/or actual sequencing. This discrete-event DNA sequencing system will demonstrate that smaller sequencing labs can achieve cost-effective automation as the laboratory grows.

  2. A Demonstration of Automated DNA Sequencing.

    ERIC Educational Resources Information Center

    Latourelle, Sandra; Seidel-Rogol, Bonnie

    1998-01-01

    Details a simulation that employs a paper-and-pencil model to demonstrate the principles behind automated DNA sequencing. Discusses the advantages of automated sequencing as well as the chemistry of automated DNA sequencing. (DDR)

  3. Automated hybridization/imaging device for fluorescent multiplex DNA sequencing

    DOEpatents

    Weiss, Robert B.; Kimball, Alvin W.; Gesteland, Raymond F.; Ferguson, F. Mark; Dunn, Diane M.; Di Sera, Leonard J.; Cherry, Joshua L.

    1995-01-01

    A method is disclosed for automated multiplex sequencing of DNA with an integrated automated imaging hybridization chamber system. This system comprises a hybridization chamber device for mounting a membrane containing size-fractionated multiplex sequencing reaction products, apparatus for fluid delivery to the chamber device, imaging apparatus for light delivery to the membrane and image recording of fluorescence emanating from the membrane while in the chamber device, and programmable controller apparatus for controlling operation of the system. The multiplex reaction products are hybridized with a probe, then an enzyme (such as alkaline phosphatase) is bound to a binding moiety on the probe, and a fluorogenic substrate (such as a benzothiazole derivative) is introduced into the chamber device by the fluid delivery apparatus. The enzyme converts the fluorogenic substrate into a fluorescent product which, when illuminated in the chamber device with a beam of light from the imaging apparatus, excites fluorescence of the fluorescent product to produce a pattern of hybridization. The pattern of hybridization is imaged by a CCD camera component of the imaging apparatus to obtain a series of digital signals. These signals are converted by the controller apparatus into a string of nucleotides corresponding to the nucleotide sequence an automated sequence reader. The method and apparatus are also applicable to other membrane-based applications such as colony and plaque hybridization and Southern, Northern, and Western blots.

  4. Automated hybridization/imaging device for fluorescent multiplex DNA sequencing

    DOEpatents

    Weiss, R.B.; Kimball, A.W.; Gesteland, R.F.; Ferguson, F.M.; Dunn, D.M.; Di Sera, L.J.; Cherry, J.L.

    1995-11-28

    A method is disclosed for automated multiplex sequencing of DNA with an integrated automated imaging hybridization chamber system. This system comprises a hybridization chamber device for mounting a membrane containing size-fractionated multiplex sequencing reaction products, apparatus for fluid delivery to the chamber device, imaging apparatus for light delivery to the membrane and image recording of fluorescence emanating from the membrane while in the chamber device, and programmable controller apparatus for controlling operation of the system. The multiplex reaction products are hybridized with a probe, then an enzyme (such as alkaline phosphatase) is bound to a binding moiety on the probe, and a fluorogenic substrate (such as a benzothiazole derivative) is introduced into the chamber device by the fluid delivery apparatus. The enzyme converts the fluorogenic substrate into a fluorescent product which, when illuminated in the chamber device with a beam of light from the imaging apparatus, excites fluorescence of the fluorescent product to produce a pattern of hybridization. The pattern of hybridization is imaged by a CCD camera component of the imaging apparatus to obtain a series of digital signals. These signals are converted by the controller apparatus into a string of nucleotides corresponding to the nucleotide sequence an automated sequence reader. The method and apparatus are also applicable to other membrane-based applications such as colony and plaque hybridization and Southern, Northern, and Western blots. 9 figs.

  5. D-Tailor: automated analysis and design of DNA sequences

    PubMed Central

    Guimaraes, Joao C.; Rocha, Miguel; Arkin, Adam P.; Cambray, Guillaume

    2014-01-01

    Motivation: Current advances in DNA synthesis, cloning and sequencing technologies afford high-throughput implementation of artificial sequences into living cells. However, flexible computational tools for multi-objective sequence design are lacking, limiting the potential of these technologies. Results: We developed DNA-Tailor (D-Tailor), a fully extendable software framework, for property-based design of synthetic DNA sequences. D-Tailor permits the seamless integration of multiple sequence analysis tools into a generic Monte Carlo simulation that evolves sequences toward any combination of rationally defined properties. As proof of principle, we show that D-Tailor is capable of designing sequence libraries comprising all possible combinations among three different sequence properties influencing translation efficiency in Escherichia coli. The capacity to design artificial sequences that systematically sample any given parameter space should support the implementation of more rigorous experimental designs. Availability: Source code is available for download at https://sourceforge.net/projects/dtailor/ Contact: aparkin@lbl.gov or cambray.guillaume@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online (D-Tailor Tutorial). PMID:24398007
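
    The idea of evolving a sequence toward a defined property through random moves can be illustrated with a minimal sketch. This is not D-Tailor's code or API: the function names, the single GC-content objective, and the greedy acceptance rule (a simplification of a Monte Carlo search) are all assumptions chosen for illustration.

```python
import random


def gc_content(seq):
    """Return the fraction of G or C bases in seq (a string or list of bases)."""
    return sum(base in "GC" for base in seq) / len(seq)


def design_toward_gc(seq, target_gc, tolerance=0.01, max_steps=10_000, seed=0):
    """Mutate one random base per step, keeping moves that do not worsen the score.

    Hypothetical helper: the score is the distance between the current
    GC content and target_gc. The search stops once the score is within
    tolerance or max_steps is exhausted.
    """
    rng = random.Random(seed)
    current = list(seq)
    score = abs(gc_content(current) - target_gc)
    for _ in range(max_steps):
        if score <= tolerance:
            break
        pos = rng.randrange(len(current))
        old = current[pos]
        current[pos] = rng.choice([b for b in "ACGT" if b != old])
        new_score = abs(gc_content(current) - target_gc)
        if new_score <= score:
            score = new_score      # accept the mutation
        else:
            current[pos] = old     # reject and restore the previous base
    return "".join(current)


designed = design_toward_gc("ATATATATATATATATATAT", target_gc=0.5)
print(designed, gc_content(designed))
```

    A multi-objective design in the spirit of the abstract would replace the single score with a combination of several property scores; that extension is left out here.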

  6. Primer effect in the detection of mitochondrial DNA point heteroplasmy by automated sequencing.

    PubMed

    Calatayud, Marta; Ramos, Amanda; Santos, Cristina; Aluja, Maria Pilar

    2013-06-01

    The correct detection of mitochondrial DNA (mtDNA) heteroplasmy by automated sequencing presents methodological constraints. The main goals of this study are to investigate the effect of sense and distance of primers in heteroplasmy detection and to test if there are differences in the accurate determination of heteroplasmy involving transitions or transversions. A gradient of the heteroplasmy levels was generated for mtDNA positions 9477 (transition G/A) and 15,452 (transversion C/A). Amplification and subsequent sequencing with forward and reverse primers, situated at 550 and 150 bp from the heteroplasmic positions, were performed. Our data provide evidence that there is a significant difference between the use of forward and reverse primers. The forward primer is the primer that seems to give a better approximation to the real proportion of the variants. No significant differences were found concerning the distance at which the sequencing primers were placed or between the analysis of transitions and transversions. The data collected in this study are a starting point that offers a glimpse of the importance of the sequencing primers in the accurate detection of point heteroplasmy, providing additional insight into the overall automated sequencing strategy.

  7. Aptaligner: automated software for aligning pseudorandom DNA X-aptamers from next-generation sequencing data.

    PubMed

    Lu, Emily; Elizondo-Riojas, Miguel-Angel; Chang, Jeffrey T; Volk, David E

    2014-06-10

    Next-generation sequencing results from bead-based aptamer libraries have demonstrated that traditional DNA/RNA alignment software is insufficient. This is particularly true for X-aptamers containing specialty bases (W, X, Y, Z, ...) that are identified by special encoding. Thus, we sought an automated program that uses the inherent design scheme of bead-based X-aptamers to create a hypothetical reference library and Markov modeling techniques to provide improved alignments. Aptaligner provides this feature as well as length error and noise level cutoff features, is parallelized to run on multiple central processing units (cores), and sorts sequences from a single chip into projects and subprojects.

  8. Automated one-step DNA sequencing based on nanoliter reaction volumes and capillary electrophoresis.

    PubMed

    Pang, H M; Yeung, E S

    2000-08-01

    An integrated system with a nano-reactor for cycle-sequencing reaction coupled to on-line purification and capillary gel electrophoresis has been demonstrated. Fifty nanoliters of reagent solution, which includes dye-labeled terminators, polymerase, BSA and template, was aspirated and mixed with the template inside the nano-reactor followed by cycle-sequencing reaction. The reaction products were then purified by a size-exclusion chromatographic column operated at 50 degrees C followed by room temperature on-line injection of the DNA fragments into a capillary for gel electrophoresis. Over 450 bases of DNA can be separated and identified. As little as 25 nl reagent solution can be used for the cycle-sequencing reaction with a slightly shorter read length. Significant savings on reagent cost are achieved because the remaining stock solution can be reused without contamination. The steps of cycle sequencing, on-line purification, injection, DNA separation, capillary regeneration, gel-filling and fluidic manipulation were performed with complete automation. This system can be readily multiplexed for high-throughput DNA sequencing or PCR analysis directly from templates or even biological materials.

  9. Automation and integration of multiplexed on-line sample preparation with capillary electrophoresis for DNA sequencing

    SciTech Connect

    Tan, H.

    1999-03-31

    The purpose of this research is to develop a multiplexed sample processing system in conjunction with multiplexed capillary electrophoresis for high-throughput DNA sequencing. The concept from DNA template to called bases was first demonstrated with a manually operated single capillary system. Later, an automated microfluidic system with 8 channels based on the same principle was successfully constructed. The instrument automatically processes 8 templates through reaction, purification, denaturation, pre-concentration, injection, separation and detection in a parallel fashion. A multiplexed freeze/thaw switching principle and a distribution network were implemented to manage flow direction and sample transportation. Dye-labeled terminator cycle-sequencing reactions are performed in an 8-capillary array in a hot air thermal cycler. Subsequently, the sequencing ladders are directly loaded into a corresponding size-exclusion chromatographic column operated at approximately 60 °C for purification. On-line denaturation and stacking injection for capillary electrophoresis is simultaneously accomplished at a cross assembly set at approximately 70 °C. Not only the separation capillary array but also the reaction capillary array and purification columns can be regenerated after every run. DNA sequencing data from this system allow base calling up to 460 bases with accuracy of 98%.

  10. Automated DNA mutation detection using universal conditions direct sequencing: application to ten muscular dystrophy genes

    PubMed Central

    2009-01-01

    sequences are reported in this paper. Conclusion This automated process allows laboratories to discover DNA variations in a short time and at low cost. PMID:19835634

  11. Automated Workflow for Preparation of cDNA for Cap Analysis of Gene Expression on a Single Molecule Sequencer

    PubMed Central

    Nagao-Sato, Sayaka; Saijo, Eri; Lassmann, Timo; Kanamori-Katayama, Mutsumi; Kaiho, Ai; Lizio, Marina; Kawaji, Hideya; Carninci, Piero; Forrest, Alistair R. R.; Hayashizaki, Yoshihide

    2012-01-01

    Background Cap analysis of gene expression (CAGE) is a 5′ sequence tag technology to globally determine transcriptional starting sites in the genome and their expression levels and has most recently been adapted to the HeliScope single molecule sequencer. Despite significant simplifications in the CAGE protocol, it has until now been a labour-intensive protocol. Methodology In this study we set out to adapt the protocol to a robotic workflow, which would increase throughput and reduce handling. The automated CAGE cDNA preparation system we present here can prepare 96 ‘HeliScope ready’ CAGE cDNA libraries in 8 days, as opposed to 6 weeks by a manual operator. We compare the results obtained using the same RNA in manual libraries and across multiple automation batches to assess reproducibility. Conclusions We show that the sequencing was highly reproducible and comparable to manual libraries with an 8-fold increase in productivity. The automated CAGE cDNA preparation system can prepare 96 CAGE sequencing samples simultaneously. Finally we discuss how the system could be used for CAGE on Illumina/SOLiD platforms, RNA-seq and full-length cDNA generation. PMID:22303458

  12. DNA Sequencing

    DOEpatents

    Tabor, Stanley; Richardson, Charles C.

    1995-04-25

    A method for sequencing a strand of DNA, including the steps of: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions favoring primer extension to form nucleic acid fragments complementary to the DNA to be sequenced; labelling the nucleic acid fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.

  13. High-speed automated DNA sequencing utilizing from-the-side laser excitation

    NASA Astrophysics Data System (ADS)

    Westphall, Michael S.; Brumley, Robert L., Jr.; Buxton, Erin C.; Smith, Lloyd M.

    1995-04-01

    The Human Genome Initiative is an ambitious international effort to map and sequence the three billion bases of DNA encoded in the human genome. If successfully completed, the resultant sequence database will be a tool of unparalleled power for biomedical research. One of the major challenges of this project is in the area of DNA sequencing technology. At this time, virtually all DNA sequencing is based upon the separation of DNA fragments in high resolution polyacrylamide gels. This method, as generally practiced, is one to two orders of magnitude too slow and expensive for the successful completion of the Human Genome Project. One reasonable approach to improved sequencing of DNA fragments is to increase the performance of such gel-based sequencing methods. Decreased sequencing times may be obtained by increasing the magnitude of the electric field employed. This is not possible with conventional sequencing, due to the fact that the additional heat associated with the increased electric field cannot be adequately dissipated. Recent developments in the use of thin gels have addressed this problem. Performing electrophoresis in ultrathin (50 to 100 microns) gels greatly increases the heat transfer efficiency, thus allowing the benefits of larger electric fields to be obtained. An increase in separation speed of about an order of magnitude is readily achieved. Thin gels have successfully been used in capillary and slab formats. A detection system has been designed for use with a multiple fluorophore sequencing strategy in horizontal ultrathin slab gels. The system employs laser through-the-side excitation and a cooled CCD detector; this allows for the parallel detection of up to 24 sets of four fluorescently labeled DNA sequencing reactions during their electrophoretic separation in ultrathin (115 micrometers) denaturing polyacrylamide gels. Four hundred bases of sequence information is obtained from 100 ng of M13 template DNA in an hour, corresponding to an

  14. DNA Sequencing apparatus

    DOEpatents

    Tabor, Stanley; Richardson, Charles C.

    1992-01-01

    An automated DNA sequencing apparatus having a reactor for providing at least two series of DNA products formed from a single primer and a DNA strand, each DNA product of a series differing in molecular weight and having a chain terminating agent at one end; separating means for separating the DNA products to form a series of bands, the intensity of substantially all nearby bands in a different series being different, band reading means for determining the position an This invention was made with government support including a grant from the U.S. Public Health Service, contract number AI-06045. The U.S. government has certain rights in the invention.

  15. A fully automated 384 capillary array for DNA sequencer. Final report

    SciTech Connect

    Li, Qingbo; Kane, T

    2003-03-20

    Phase I SpectruMedix has successfully developed an automatic 96-capillary array DNA prototype based on the multiplexed capillary electrophoresis system originated from Ames Laboratory-USDOE, Iowa State University. With computer control of all steps involved in a 96-capillary array running cycle, the prototype instrument (the SCE9600) is now capable of sequencing 450 base pairs (bp) per capillary, or 48,000 bp per instrument run within 2 hrs. Phase II of this grant involved the advancement of the core 96 capillary technologies, as well as designing a high density 384 capillary prototype. True commercialization of the 96 capillary instrument involved finalization of the gel matrix, streamlining the instrument hardware, creating a more reliable capillary cartridge, and further advancement of the data processing software. Together these silos of technology create a truly commercializable product (the SCE9610) capable of meeting the operation needs of the sequencing centers.
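
    The per-run throughput quoted above follows from simple arithmetic, sketched below. The function names are illustrative assumptions; the inputs (96 capillaries, 450 bp per capillary, a 2 hr run) are the figures given in the abstract. Note that 96 capillaries at 450 bp each give 43,200 bp per run, whereas the quoted 48,000 bp per run would correspond to 500 bp per capillary.

```python
def bases_per_run(capillaries, read_length_bp):
    """Total bases read in one instrument run."""
    return capillaries * read_length_bp


def bases_per_hour(capillaries, read_length_bp, run_hours):
    """Average bases read per hour over a run of run_hours."""
    return bases_per_run(capillaries, read_length_bp) / run_hours


print(bases_per_run(96, 450))      # 43200
print(bases_per_hour(96, 450, 2))  # 21600.0
print(48_000 / 96)                 # 500.0 bp per capillary implied by the quoted total
```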

  16. Identification of the DNA bases of a DNase I footprint by the use of dye primer sequencing on an automated capillary DNA analysis instrument.

    PubMed

    Zianni, Michael; Tessanne, Kimberly; Merighi, Massimo; Laguna, Rick; Tabita, F R

    2006-04-01

    We have adapted the techniques of DNA footprint analysis to an Applied Biosystems 3730 DNA Analyzer. The use of fluorescently labeled primers eliminates the need for radioactively labeled nucleotides, as well as slab gel electrophoresis, and takes advantage of commonly available automated fluorescent capillary electrophoresis instruments. With fluorescently labeled primers and dideoxynucleotide DNA sequencing, we have shown that the terminal base of each digested fragment may be accurately identified with a capillary-based instrument. Polymerase chain reaction (PCR) was performed with a 6FAM-labeled primer to amplify a typical target promoter region. This PCR product was then incubated with a transcriptional activator protein, or bovine serum albumin as a control, and then partially digested with DNase I. A clone of the promoter was sequenced with the Thermo Sequenase Dye Primer Manual Cycle Sequencing kit (USB) and the FAM-labeled primer. Through the use of GeneMapper software, the Thermo Sequenase and DNase I digestion products were accurately aligned, providing a ready means to assign correct nucleotides to each peak from the DNA footprint. This method was used to characterize the binding of two different transcriptional activator proteins to their respective promoter regions.

  17. Identification of the DNA Bases of a DNase I Footprint by the Use of Dye Primer Sequencing on an Automated Capillary DNA Analysis Instrument

    PubMed Central

    Zianni, Michael; Tessanne, Kimberly; Merighi, Massimo; Laguna, Rick; Tabita, F.R.

    2006-01-01

    We have adapted the techniques of DNA footprint analysis to an Applied Biosystems 3730 DNA Analyzer. The use of fluorescently labeled primers eliminates the need for radioactively labeled nucleotides, as well as slab gel electrophoresis, and takes advantage of commonly available automated fluorescent capillary electrophoresis instruments. With fluorescently labeled primers and dideoxynucleotide DNA sequencing, we have shown that the terminal base of each digested fragment may be accurately identified with a capillary-based instrument. Polymerase chain reaction (PCR) was performed with a 6FAM-labeled primer to amplify a typical target promoter region. This PCR product was then incubated with a transcriptional activator protein, or bovine serum albumin as a control, and then partially digested with DNase I. A clone of the promoter was sequenced with the Thermo Sequenase Dye Primer Manual Cycle Sequencing kit (USB) and the FAM-labeled primer. Through the use of GeneMapper software, the Thermo Sequenase and DNase I digestion products were accurately aligned, providing a ready means to assign correct nucleotides to each peak from the DNA footprint. This method was used to characterize the binding of two different transcriptional activator proteins to their respective promoter regions. PMID:16741237

  18. j5 DNA assembly design automation.

    PubMed

    Hillson, Nathan J

    2014-01-01

    Modern standardized methodologies, described in detail in the previous chapters of this book, have enabled the software-automated design of optimized DNA construction protocols. This chapter describes how to design (combinatorial) scar-less DNA assembly protocols using the web-based software j5. j5 assists biomedical and biotechnological researchers in constructing DNA by automating the design of optimized protocols for flanking homology sequence as well as type IIS endonuclease-mediated DNA assembly methodologies. Unlike any other software tool available today, j5 designs scar-less combinatorial DNA assembly protocols, performs a cost-benefit analysis to identify which portions of an assembly process would be less expensive to outsource to a DNA synthesis service provider, and designs hierarchical DNA assembly strategies to mitigate anticipated poor assembly junction sequence performance. Software integrated with j5 adds significant value to the j5 design process through graphical user-interface enhancement and downstream liquid-handling robotic laboratory automation.

  19. Factors to be considered for robust high-throughput automated DNA sequencing using a multiple-capillary array instrument

    NASA Astrophysics Data System (ADS)

    Carrilho, Emanuel; Miller, Arthur W.; Ruiz-Martinez, Marie C.; Kotler, Lev; Kesilman, Jeffrey; Karger, Barry L.

    1997-05-01

    The overall goal of our program is to develop a robust, high throughput, fully automated DNA sequencing instrument based on replaceable polymer solutions using a multicapillary array. Significant effort has already been devoted to column and polymer chemistry in order to obtain long read lengths per run in fast analysis time. In this paper we report on progress in instrument considerations and data processing software. A simple instrument design, based on no moving parts for continuous illumination of the capillaries and detection of the fluorescent light was used for this study. Our polymer solution replacement system with the permanent connection between the buffer/chamber manifold and capillary columns on the detector side is designed to prevent the trapping of air bubbles during matrix solution replacement. A special construction of a column-electrode couple on the injection side precludes air trapping during sample injection from small sample volumes. Our in-house software now features the significant reduction of the crosstalk signal from neighbor columns, which may be a potential problem in densely packed large capillary array sequencers.

  20. Speciation of Bacillus spp. in honey produced in Northern Ireland by employment of 16S rDNA PCR and automated DNA sequencing techniques.

    PubMed

    Tolba, Ola; Earle, J A Philip; Millar, B Cherie; Rooney, Paul J; Moore, John E

    2007-12-01

    Phenotypic speciation of foodborne Bacillus spp. remains problematic in terms of obtaining a reliable identification. In this study, we wished to identify several bacterial isolates from honey produced in Northern Ireland, and which belonged to the genus Bacillus, through employment of a molecular identification scheme based on PCR amplification of universal regions of the 16S rRNA operon in combination with direct automated sequencing of the resulting amplicons. Seven samples of honey and related materials (propolis) were examined microbiologically and were demonstrated to have total viable counts (TVC) ranging from <100 to 1700 colony-forming units/g. No yeasts or filamentous fungi were isolated from the honey materials. Several bacterial isolates were identified using this method, yielding two different genera (Paenibacillus and Bacillus), as well as four Bacillus species, namely Bacillus pumilus, B. licheniformis, B. subtilis and B. fusiformis, with B. pumilus the most frequently identified species present. When the use of molecular identification methods is justified, employment of partial 16S rDNA PCR and sequencing provides a valuable and reliable method of identification of Bacillus spp. from foodstuffs and negates associated problems of conventional laboratory and phenotypic identification.

  1. Algorithms for automated DNA assembly

    PubMed Central

    Densmore, Douglas; Hsiau, Timothy H.-C.; Kittleson, Joshua T.; DeLoache, Will; Batten, Christopher; Anderson, J. Christopher

    2010-01-01

    Generating a defined set of genetic constructs within a large combinatorial space provides a powerful method for engineering novel biological functions. However, the process of assembling more than a few specific DNA sequences can be costly, time consuming and error prone. Even if a correct theoretical construction scheme is developed manually, it is likely to be suboptimal by any number of cost metrics. Modular, robust and formal approaches are needed for exploring these vast design spaces. By automating the design of DNA fabrication schemes using computational algorithms, we can eliminate human error while reducing redundant operations, thus minimizing the time and cost required for conducting biological engineering experiments. Here, we provide algorithms that optimize the simultaneous assembly of a collection of related DNA sequences. We compare our algorithms to an exhaustive search on a small synthetic dataset and our results show that our algorithms can quickly find an optimal solution. Comparison with random search approaches on two real-world datasets shows that our algorithms can also quickly find lower-cost solutions for large datasets. PMID:20335162
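
    The benefit of reducing redundant operations when assembling related constructs can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a fixed left-to-right binary joining order and only reuses shared prefixes, and the function name and part labels are hypothetical.

```python
def assembly_steps(constructs):
    """Count binary join steps without and with reuse of shared intermediates.

    Each construct is a tuple of part names assembled left to right, so a
    construct of n parts needs n - 1 joins. With reuse, every distinct
    prefix of two or more parts is built only once across all constructs.
    """
    naive = sum(len(c) - 1 for c in constructs)
    shared = {tuple(c[:k]) for c in constructs for k in range(2, len(c) + 1)}
    return naive, len(shared)


# Hypothetical constructs that share a promoter and ribosome binding site.
constructs = [
    ("promoter", "rbs", "gfp", "term_a"),
    ("promoter", "rbs", "yfp", "term_a"),
    ("promoter", "rbs", "gfp", "term_b"),
]
print(assembly_steps(constructs))  # (9, 6)
```

    An optimizer of the kind the abstract describes would also choose the joining order itself, which this sketch keeps fixed.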

  2. Image analysis for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Palaniappan, Kannappan; Huang, Thomas S.

    1991-07-01

    There is a great deal of interest in automating the process of DNA (deoxyribonucleic acid) sequencing to support the analysis of genomic DNA such as the Human and Mouse Genome projects. In one class of gel-based sequencing protocols autoradiograph images are generated in the final step and usually require manual interpretation to reconstruct the DNA sequence represented by the image. The need to handle a large volume of sequence information necessitates automation of the manual autoradiograph reading step through image analysis in order to reduce the length of time required to obtain sequence data and reduce transcription errors. Various adaptive image enhancement, segmentation and alignment methods were applied to autoradiograph images. The methods are adaptive to the local characteristics of the image such as noise, background signal, or presence of edges. Once the two-dimensional data is converted to a set of aligned one-dimensional profiles waveform analysis is used to determine the location of each band which represents one nucleotide in the sequence. Different classification strategies including a rule-based approach are investigated to map the profile signals, augmented with the original two-dimensional image data as necessary, to textual DNA sequence information.
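
    The step from aligned one-dimensional profiles to a textual sequence can be sketched as follows. This is a simplified illustration, not the method of the paper: it assumes one intensity profile per base lane, a fixed height threshold instead of adaptive processing, and that the profile index increases with fragment length. All names are hypothetical.

```python
def find_bands(profile, min_height):
    """Return indices of local maxima at or above min_height in a 1-D profile."""
    bands = []
    for i in range(1, len(profile) - 1):
        rises = profile[i] > profile[i - 1]
        not_below_next = profile[i] >= profile[i + 1]
        if profile[i] >= min_height and rises and not_below_next:
            bands.append(i)
    return bands


def read_sequence(lane_profiles, min_height):
    """Merge the bands from all lanes by position and spell out the bases."""
    calls = []
    for base, profile in lane_profiles.items():
        for position in find_bands(profile, min_height):
            calls.append((position, base))
    return "".join(base for _, base in sorted(calls))


# Hypothetical aligned lane profiles, one per base.
lanes = {
    "A": [0, 9, 0, 0, 0, 0, 0],
    "C": [0, 0, 0, 9, 0, 0, 0],
    "G": [0, 0, 0, 0, 0, 9, 0],
    "T": [0, 0, 0, 0, 0, 0, 0],
}
print(read_sequence(lanes, min_height=5))  # ACG
```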

  3. Initial analysis of non-typical Leber hereditary optic neuropathy (LHON) at onset and late developing demyelinating disease in Italian patients by SSCP and automated DNA sequence analysis

    SciTech Connect

    Sartore, M.; Semeraro, A.; Fortina, P.

    1994-09-01

    LHON is a mitochondrial genetic disease characterized by maternal inheritance and late onset of blindness caused by bilateral retinal degeneration. A number of molecular defects are known affecting expression of seven mitochondrial genes encoding subunits of respiratory chain complex I, III and IV. We screened genomic DNA from Italian patients for seven of the known point mutations in the ND-1, ND-4 and ND-6 subunits of complex I by PCR followed by SSCP and restriction enzyme digestion. Most of the patients had nonfamilial bilateral visual loss with partial or no recovery and normal neurological examination. Fundoscopic examination revealed that none of the patients had features typical of LHON. Nine of 21 patients (43%) showed multifocal CNS demyelination on MRI. Our results show aberrant SSCP patterns for a PCR product from the ND-4 subunit in one affected child and his mother. Sfa NI and Mae III digestions suggested the absence of a previously defined LHON mutation, and automated DNA sequence analysis revealed two A to G neutral sequence polymorphisms in the third position of codons 351 and 353. In addition, PCR products from the same two samples and an unrelated one showed abnormal SSCP patterns for the ND-1 subunit region of complex I due to the presence of a T to C change at nt 4,216 which was demonstrated after Nla III digestion of PCR products and further confirmed by DNA sequence analysis. Our results indicate that additional defects are present in the Italian population, and identification of abnormal SSCP patterns followed by targeted automated DNA sequence analysis is a reasonable strategy for delineation of new LHON mutations.

  4. Comparison of Boiling and Robotics Automation Method in DNA Extraction for Metagenomic Sequencing of Human Oral Microbes

    PubMed Central

    Shinozaki, Natsuko; Ye, Bin; Tsuboi, Akito; Nagasaki, Masao; Yamashita, Riu

    2016-01-01

    The rapid improvement of next-generation sequencing performance now enables us to analyze huge sample sets with more than ten thousand specimens. However, DNA extraction can still be a limiting step in such metagenomic approaches. In this study, we analyzed human oral microbes to compare the performance of three DNA extraction methods: PowerSoil (a method widely used in this field), QIAsymphony (a robotics method), and a simple boiling method. Dental plaque was initially collected from three volunteers in the pilot study and then expanded to 12 volunteers in the follow-up study. Bacterial flora was estimated by sequencing the V4 region of 16S rRNA following species-level profiling. Our results indicate that the efficiency of PowerSoil and QIAsymphony was comparable to the boiling method. Therefore, the boiling method may be a promising alternative because of its simplicity, cost effectiveness, and short handling time. Moreover, this method was reliable for estimating bacterial species and could be used in the future to examine the correlation between oral flora and health status. Despite this, differences in the efficiency of DNA extraction for various bacterial species were observed among the three methods. Based on these findings, there is no “gold standard” for DNA extraction. In future, we suggest that the DNA extraction method should be selected on a case-by-case basis considering the aims and specimens of the study. PMID:27104353

  5. Comparison of Boiling and Robotics Automation Method in DNA Extraction for Metagenomic Sequencing of Human Oral Microbes.

    PubMed

    Yamagishi, Junya; Sato, Yukuto; Shinozaki, Natsuko; Ye, Bin; Tsuboi, Akito; Nagasaki, Masao; Yamashita, Riu

    2016-01-01

    The rapid improvement of next-generation sequencing performance now enables us to analyze huge sample sets with more than ten thousand specimens. However, DNA extraction can still be a limiting step in such metagenomic approaches. In this study, we analyzed human oral microbes to compare the performance of three DNA extraction methods: PowerSoil (a method widely used in this field), QIAsymphony (a robotics method), and a simple boiling method. Dental plaque was initially collected from three volunteers in the pilot study and then expanded to 12 volunteers in the follow-up study. Bacterial flora was estimated by sequencing the V4 region of 16S rRNA following species-level profiling. Our results indicate that the efficiency of PowerSoil and QIAsymphony was comparable to the boiling method. Therefore, the boiling method may be a promising alternative because of its simplicity, cost effectiveness, and short handling time. Moreover, this method was reliable for estimating bacterial species and could be used in the future to examine the correlation between oral flora and health status. Despite this, differences in the efficiency of DNA extraction for various bacterial species were observed among the three methods. Based on these findings, there is no "gold standard" for DNA extraction. In future, we suggest that the DNA extraction method should be selected on a case-by-case basis considering the aims and specimens of the study.

  6. A Microfluidic Device for Preparing Next Generation DNA Sequencing Libraries and for Automating Other Laboratory Protocols That Require One or More Column Chromatography Steps

    PubMed Central

    Tan, Swee Jin; Phan, Huan; Gerry, Benjamin Michael; Kuhn, Alexandre; Hong, Lewis Zuocheng; Min Ong, Yao; Poon, Polly Suk Yean; Unger, Marc Alexander; Jones, Robert C.; Quake, Stephen R.; Burkholder, William F.

    2013-01-01

    Library preparation for next-generation DNA sequencing (NGS) remains a key bottleneck in the sequencing process which can be relieved through improved automation and miniaturization. We describe a microfluidic device for automating laboratory protocols that require one or more column chromatography steps and demonstrate its utility for preparing Next Generation sequencing libraries for the Illumina and Ion Torrent platforms. Sixteen different libraries can be generated simultaneously with significantly reduced reagent cost and hands-on time compared to manual library preparation. Using an appropriate column matrix and buffers, size selection can be performed on-chip following end-repair, dA tailing, and linker ligation, so that the libraries eluted from the chip are ready for sequencing. The core architecture of the device ensures uniform, reproducible column packing without user supervision and accommodates multiple routine protocol steps in any sequence, such as reagent mixing and incubation; column packing, loading, washing, elution, and regeneration; capture of eluted material for use as a substrate in a later step of the protocol; and removal of one column matrix so that two or more column matrices with different functional properties can be used in the same protocol. The microfluidic device is mounted on a plastic carrier so that reagents and products can be aliquoted and recovered using standard pipettors and liquid handling robots. The carrier-mounted device is operated using a benchtop controller that seals and operates the device with programmable temperature control, eliminating any requirement for the user to manually attach tubing or connectors. In addition to NGS library preparation, the device and controller are suitable for automating other time-consuming and error-prone laboratory protocols requiring column chromatography steps, such as chromatin immunoprecipitation. PMID:23894273

  7. A microfluidic device for preparing next generation DNA sequencing libraries and for automating other laboratory protocols that require one or more column chromatography steps.

    PubMed

    Tan, Swee Jin; Phan, Huan; Gerry, Benjamin Michael; Kuhn, Alexandre; Hong, Lewis Zuocheng; Min Ong, Yao; Poon, Polly Suk Yean; Unger, Marc Alexander; Jones, Robert C; Quake, Stephen R; Burkholder, William F

    2013-01-01

    Library preparation for next-generation DNA sequencing (NGS) remains a key bottleneck in the sequencing process which can be relieved through improved automation and miniaturization. We describe a microfluidic device for automating laboratory protocols that require one or more column chromatography steps and demonstrate its utility for preparing Next Generation sequencing libraries for the Illumina and Ion Torrent platforms. Sixteen different libraries can be generated simultaneously with significantly reduced reagent cost and hands-on time compared to manual library preparation. Using an appropriate column matrix and buffers, size selection can be performed on-chip following end-repair, dA tailing, and linker ligation, so that the libraries eluted from the chip are ready for sequencing. The core architecture of the device ensures uniform, reproducible column packing without user supervision and accommodates multiple routine protocol steps in any sequence, such as reagent mixing and incubation; column packing, loading, washing, elution, and regeneration; capture of eluted material for use as a substrate in a later step of the protocol; and removal of one column matrix so that two or more column matrices with different functional properties can be used in the same protocol. The microfluidic device is mounted on a plastic carrier so that reagents and products can be aliquoted and recovered using standard pipettors and liquid handling robots. The carrier-mounted device is operated using a benchtop controller that seals and operates the device with programmable temperature control, eliminating any requirement for the user to manually attach tubing or connectors. In addition to NGS library preparation, the device and controller are suitable for automating other time-consuming and error-prone laboratory protocols requiring column chromatography steps, such as chromatin immunoprecipitation.

  8. DNA sequencing conference, 2

    SciTech Connect

    Cook-Deegan, R.M.; Venter, J.C.; Gilbert, W.; Mulligan, J.; Mansfield, B.K.

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  9. Automated DNA profile analysis.

    PubMed

    Graham, Eleanor A M

    2005-12-01

    DNA profile analysis is not a simple process. Stringent demands are placed on the accuracy and consistency of forensic evidence so that complex, robust, and reproducible guidelines are necessary to assist the analyst and ensure mistakes are eliminated before a final profile is reported. The guidelines used for forensic DNA profile interpretation are formulated by investigation and statistical evaluation of all aspects of the analytical procedure. All the resulting rules, formulas, and thresholds are perfectly suited to programming of "expert systems": software programs that imitate the human expert in decision-based processes to formulate a conclusion. Expert systems in forensic DNA analysis will contribute greatly to this field by increasing analytical throughput. The net result of this will be an increase in the human resources available for the research and development of improved methodologies, to ensure that forensic DNA profiling continues to advance at its current impressive rate.

  10. Whole-Genome Sequencing: Automated, Nonindexed Library Preparation.

    PubMed

    Mardis, Elaine; McCombie, W Richard

    2017-03-01

    This protocol describes an automated procedure for constructing a nonindexed Illumina DNA library and relies on the use of a CyBi-SELMA automated pipetting machine, the Covaris E210 shearing instrument, and the epMotion 5075. With this method, genomic DNA fragments are produced by sonication, using high-frequency acoustic energy to shear DNA. Here, double-stranded DNA is fragmented when exposed to the energy of adaptive focused acoustic shearing (AFA). The resulting DNA fragments are ligated to adaptors, amplified by polymerase chain reaction (PCR), and subjected to size selection using magnetic beads. The product is suitable for use as template in whole-genome sequencing.

  11. DNA Sequencing by Capillary Electrophoresis

    PubMed Central

    Karger, Barry L.; Guttman, Andras

    2009-01-01

    Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA sequencing methods have evolved from the labor intensive slab gel electrophoresis, through automated multicapillary electrophoresis systems using fluorophore labeling with multispectral imaging, to the “next generation” technologies of cyclic array, hybridization based, nanopore and single molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes was only possible by the advent of modern sequencing technologies that was a result of step by step advances with a contribution of academics, medical personnel and instrument companies. While next generation sequencing is moving ahead at break-neck speed, the multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this prospective, we wish to overview the role of capillary electrophoresis in DNA sequencing based in part on several of our articles in this journal. PMID:19517496

  12. Whole-Genome Sequencing: Automated, Indexed Library Preparation.

    PubMed

    Mardis, Elaine; McCombie, W Richard

    2017-03-01

    This protocol describes an automated procedure for constructing an indexed Illumina DNA library. With this method, genomic DNA fragments are produced by sonication, using high-frequency acoustic energy to shear DNA. Double-stranded DNA (dsDNA) will fragment when exposed to the energy of adaptive focused acoustic shearing (AFA). The resulting DNA fragments are ligated to adaptors, amplified by polymerase chain reaction (PCR), and subjected to size selection using magnetic beads. The product is suitable for use as template in whole-genome sequencing.

  13. High throughput visualization and analysis of human short tandem repeat polymorphisms (STRPs) using an infrared based automated DNA sequencer

    SciTech Connect

    Jang, G.Y.; Gartside, B.O.; Brumbaugh, J.A.

    1994-09-01

    Short tandem repeat polymorphisms (STRPs) including di-, tri-, and tetranucleotide repeats are very useful markers for gene mapping, genetic diagnosis, and forensics because they are highly polymorphic, abundant throughout the mammalian genome, and amplifiable by the polymerase chain reaction (PCR). In order to meet the demand for large scale gene mapping and genetic diagnosis, the overall speed of genotyping STRPs should be increased. Automated detection systems using laser irradiation along with infrared fluorescently labeled PCR primers or dATPs have been used for improved detection of PCR amplified STRPs when compared to conventional radioactive detection methods. Here, we report protocols that provide high throughput visualization and analysis of PCR amplified STRPs using the LI-COR Model 4000S DNA Sequencer. Short (15 cm separation distance) gels were used which produced rapid migration of the DNA fragments to the scanning detector (less than 1 hour from sample loading to detection of up to 350 base long DNA fragments). Seven percent denaturing acrylamide gels with a constant 2000 V were employed for fast and adequate resolution. A 64 well format was used for loading (60 samples with 4 lanes for standard markers). Multiple loading of samples (using the same gel up to 3 times) has been achieved. In addition, we have multiplexed more than one locus per lane. By applying these conditions (60 samples x 3 loci x 3 loads/gel x 2 gels/day) it is possible to generate images for over 1000 person loci in less than a day. Images were analyzed using Scanalytics RFLPscan software. The output data gave each allele size in number of base pairs.
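
    The daily throughput quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only; the variable names are invented here, and the value of 3 loads per gel is an assumption taken from the stated maximum of reusing the same gel up to 3 times.

```python
# Throughput estimate for the gel protocol described in the abstract.
samples_per_load = 60  # 64-well format, 4 lanes reserved for standard markers
loci_per_lane = 3      # multiplexed loci per lane, as in the abstract's formula
loads_per_gel = 3      # assumption: the stated maximum number of loadings per gel
gels_per_day = 2       # as in the abstract's formula

person_loci_per_day = (
    samples_per_load * loci_per_lane * loads_per_gel * gels_per_day
)
print(person_loci_per_day)  # 1080, consistent with "over 1000 person loci"
```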

  14. Automated DNA extraction from pollen in honey.

    PubMed

    Guertler, Patrick; Eicheldinger, Adelina; Muschler, Paul; Goerlich, Ottmar; Busch, Ulrich

    2014-04-15

    In recent years, honey has become subject of DNA analysis due to potential risks evoked by microorganisms, allergens or genetically modified organisms. However, so far, only a few DNA extraction procedures are available, mostly time-consuming and laborious. Therefore, we developed an automated DNA extraction method from pollen in honey based on a CTAB buffer-based DNA extraction using the Maxwell 16 instrument and the Maxwell 16 FFS Nucleic Acid Extraction System, Custom-Kit. We altered several components and extraction parameters and compared the optimised method with a manual CTAB buffer-based DNA isolation method. The automated DNA extraction was faster and resulted in higher DNA yield and sufficient DNA purity. Real-time PCR results obtained after automated DNA extraction are comparable to results after manual DNA extraction. No PCR inhibition was observed. The applicability of this method was further successfully confirmed by analysis of different routine honey samples.

  15. Transposon facilitated DNA sequencing

    SciTech Connect

    Berg, D.E.; Berg, C.M.; Huang, H.V.

    1990-01-01

    The purpose of this research is to investigate and develop methods that exploit the power of bacterial transposable elements for large scale DNA sequencing. Our premise is that the use of transposons to put primer binding sites randomly in target DNAs should provide access to all portions of large DNA fragments, without the inefficiencies of methods involving random subcloning and attendant repetitive sequencing, or of sequential synthesis of many oligonucleotide primers that are used to march systematically along a DNA molecule. Two unrelated bacterial transposons, Tn5 and γδ, are being used because they have both proven useful for molecular analyses, and because they differ sufficiently in mechanism and specificity of transposition to merit parallel development.

  16. Automated Sequence Generation Process and Software

    NASA Technical Reports Server (NTRS)

    Gladden, Roy

    2007-01-01

    "Automated sequence generation" (autogen) signifies both a process and software used to automatically generate sequences of commands to operate various spacecraft. The autogen software comprises the autogen script plus the Activity Plan Generator (APGEN) program. APGEN can be used for planning missions and command sequences.

  17. DNA sequences encoding osteoinductive products

    SciTech Connect

    Wang, E.A.; Wozney, J.M.; Rosen, V.

    1991-05-07

    This patent describes an isolated DNA sequence encoding an osteoinductive protein, the DNA sequence comprising a coding sequence. It comprises: nucleotide No. 1 through nucleotide No. 387, nucleotide No. 356 through nucleotide No. 1543, nucleotide No. 402 through nucleotide No. 1626, naturally occurring allelic sequences and equivalent degenerative codon sequences, and sequences which hybridize to any of these sequences under stringent hybridization conditions; and encode a protein characterized by the ability to induce the formation of bone and/or cartilage.

  18. Towards single molecule DNA sequencing

    NASA Astrophysics Data System (ADS)

    Liu, Hao

    Single molecule DNA sequencing technology has been a hot research topic in recent decades because it holds the promise of sequencing a human genome in a fast and affordable way, which will eventually make personalized medicine possible. Single molecule differentiation and DNA translocation control are the two main challenges in all single molecule DNA sequencing methods. In this thesis, I will first introduce DNA sequencing technology development and its application, and then explain the performance and limitation of prior art in detail. Following that, I will show a single molecule DNA base differentiation result obtained in recognition tunneling experiments. Furthermore, I will explain the assembly of a nanofluidic platform for single strand DNA translocation, which holds the promise of being integrated into a single molecule DNA sequencing instrument for DNA translocation control. Taken together, my dissertation research demonstrated the potential of using recognition tunneling techniques to serve as a general readout system for single molecule DNA sequencing application.

  19. Automated Identification of Nucleotide Sequences

    NASA Technical Reports Server (NTRS)

    Osman, Shariff; Venkateswaran, Kasthuri; Fox, George; Zhu, Dian-Hui

    2007-01-01

    STITCH is a computer program that processes raw nucleotide-sequence data to automatically remove unwanted vector information, perform reverse-complement comparison, stitch shorter sequences together to make longer ones to which the shorter ones presumably belong, and search against the user's choice of private and Internet-accessible public 16S rRNA databases. ["16S rRNA" denotes a ribosomal ribonucleic acid (rRNA) sequence that is common to all organisms.] In STITCH, a template 16S rRNA sequence is used to position forward and reverse reads. STITCH then automatically searches known 16S rRNA sequences in the user's chosen database(s) to find the sequence most similar to (the sequence that lies at the smallest edit distance from) each spliced sequence. The result of processing by STITCH is the identification of the most similar well-described bacterium. Whereas previously commercially available software for analyzing genetic sequences operates on one sequence at a time, STITCH can manipulate multiple sequences simultaneously to perform the aforementioned operations. A typical analysis of several dozen sequences (length of the order of 10^3 base pairs) by use of STITCH is completed in a few minutes, whereas such an analysis performed by use of prior software takes hours or days.
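
    Two of the operations named in the abstract, reverse-complement comparison and a search for the reference at the smallest edit distance, can be illustrated with a short sketch. This is not STITCH's code: the function names, the toy reference sequences, and the choice of plain Levenshtein distance are all assumptions made for illustration.

```python
# Illustrative sketch only; not the STITCH implementation.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def closest_reference(query: str, references: dict) -> tuple:
    """Return (name, distance) of the reference nearest to the query,
    trying the query in both orientations."""
    best_name, best_dist = None, None
    for name, ref in references.items():
        dist = min(edit_distance(query, ref),
                   edit_distance(reverse_complement(query), ref))
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist


# Hypothetical toy "database"; real 16S rRNA sequences are about 1,500 bases.
toy_refs = {"ref_a": "ACGTACGT", "ref_b": "TTTTGGGG"}
print(closest_reference("ACGTACGA", toy_refs))
```

    The quadratic-time distance used here is adequate for a handful of short sequences; a production tool would more likely rely on an indexed or heuristic search.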

  20. A Bioluminometric Method of DNA Sequencing

    NASA Technical Reports Server (NTRS)

    Ronaghi, Mostafa; Pourmand, Nader; Stolc, Viktor; Arnold, Jim (Technical Monitor)

    2001-01-01

    Pyrosequencing is a bioluminometric single-tube DNA sequencing method that takes advantage of co-operativity between four enzymes to monitor DNA synthesis. In this sequencing-by-synthesis method, a cascade of enzymatic reactions yields detectable light, which is proportional to incorporated nucleotides. Pyrosequencing has the advantages of accuracy, flexibility and parallel processing. It can be easily automated. Furthermore, the technique dispenses with the need for labeled primers, labeled nucleotides and gel-electrophoresis. In this chapter, the use of this technique for different applications is discussed.
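
    The statement that the light signal is proportional to the number of incorporated nucleotides can be made concrete with an idealized simulation. The sketch below assumes a noise-free signal, a fixed cyclic dispensation order, and that the strand being synthesized is given directly; the function names are invented for illustration and do not come from the chapter.

```python
# Idealized pyrogram simulation; assumes perfect, noise-free chemistry.
def pyrogram(read: str, dispensation: str) -> list:
    """For each dispensed nucleotide, count how many consecutive bases of
    the growing strand it extends (the idealized light signal)."""
    signals = []
    pos = 0
    for nt in dispensation:
        count = 0
        while pos < len(read) and read[pos] == nt:
            count += 1
            pos += 1
        signals.append((nt, count))
    return signals


def decode(signals: list) -> str:
    """Rebuild the synthesized sequence from (nucleotide, signal) pairs."""
    return "".join(nt * count for nt, count in signals)


signals = pyrogram("GGATC", "ACGT" * 3)
print(signals)
print(decode(signals))
```

    In this model a homopolymer run such as "GG" yields a single peak of height 2, which is why signal proportionality matters for reading repeated bases.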

  1. The Dynamics of DNA Sequencing.

    ERIC Educational Resources Information Center

    Morvillo, Nancy

    1997-01-01

    Describes a paper-and-pencil activity that helps students understand DNA sequencing and expands student understanding of DNA structure, replication, and gel electrophoresis. Appropriate for advanced biology students who are familiar with the Sanger method. (DDR)

  2. Biosensors for DNA sequence detection

    NASA Technical Reports Server (NTRS)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  4. Graphene nanodevices for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Heerema, Stephanie J.; Dekker, Cees

    2016-02-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with nanopores. Owing to its unique structure and properties, graphene provides interesting opportunities for the development of a new sequencing technology. In recent years, a wide range of creative ideas for graphene sequencers have been theoretically proposed and the first experimental demonstrations have begun to appear. Here, we review the different approaches to using graphene nanodevices for DNA sequencing, which involve DNA passing through graphene nanopores, nanogaps, and nanoribbons, and the physisorption of DNA on graphene nanostructures. We discuss the advantages and problems of each of these key techniques, and provide a perspective on the use of graphene in future DNA sequencing technology.

  5. Automated Sequence Processor: Something Old, Something New

    NASA Technical Reports Server (NTRS)

    Streiffert, Barbara; Schrock, Mitchell; Fisher, Forest; Himes, Terry

    2012-01-01

    High productivity is required for operations teams to meet schedules. Risk must be minimized. Scripting is used to automate processes. Scripts perform essential operations functions. The Automated Sequence Processor (ASP) was a grass-roots task built to automate the command uplink process. A system engineering task for ASP revitalization was organized. ASP is a set of approximately 200 scripts written in Perl, C Shell, AWK, and other scripting languages. ASP processes, checks, and packages non-interactive commands automatically. Non-interactive commands are guaranteed to be safe and have been checked by hardware or software simulators. ASP checks that commands are non-interactive. ASP processes the commands through a command simulator and then packages them if there are no errors. ASP must be active 24 hours/day, 7 days/week.

  7. Sequence independent amplification of DNA

    DOEpatents

    Bohlander, Stefan K.

    1998-01-01

    The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei.

  8. Sequence independent amplification of DNA

    DOEpatents

    Bohlander, S.K.

    1998-03-24

    The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei. 25 figs.

  9. Web Navigation Sequences Automation in Modern Websites

    NASA Astrophysics Data System (ADS)

    Montoto, Paula; Pan, Alberto; Raposo, Juan; Bellas, Fernando; López, Javier

    Most of today's web sources are designed to be used by humans, but they do not provide suitable interfaces for software programs. That is why a growing interest has arisen in so-called web automation applications that are widely used for different purposes such as B2B integration, automated testing of web applications or technology and business watch. Previous proposals assume models for generating and reproducing navigation sequences that are not able to correctly deal with new websites using technologies such as AJAX: on the one hand, existing systems only allow recording simple navigation actions and, on the other hand, they are unable to detect the end of the effects caused by a user action. In this paper, we propose a set of new techniques to record and execute web navigation sequences able to deal with all the complexity existing in AJAX-based web sites. We also present an exhaustive evaluation of the proposed techniques that shows very promising results.

  10. A simple and rapid preparation of M13 sequencing templates for manual and automated dideoxy sequencing.

    PubMed Central

    Kristensen, T; Voss, H; Ansorge, W

    1987-01-01

    A simple and rapid procedure for the preparation of M13 single stranded DNA sequencing templates which does not involve phenol extractions and alcohol precipitations is described. Bacteriophages are precipitated from media supernatants with acetic acid and recovered on glass fiber filters. Subsequent dissociation of the phages and removal of contaminants is performed while the DNA is bound to the glass. Finally, the purified DNA is eluted in a small volume of low-salt buffer. The yield is higher than that obtained by standard methods. The simplified procedure takes less than 30 minutes and does not demand special skills or equipment; the sequence resolution is as good as that obtained by standard procedures both with the Klenow fragment and T7 DNA polymerase, with radioactive labelling as well as in automated sequencing with a fluorescent label. PMID:3615197

  11. j5 DNA assembly design automation software.

    PubMed

    Hillson, Nathan J; Rosengarten, Rafael D; Keasling, Jay D

    2012-01-20

    Recent advances in Synthetic Biology have yielded standardized and automatable DNA assembly protocols that enable a broad range of biotechnological research and development. Unfortunately, the experimental design required for modern scar-less multipart DNA assembly methods is frequently laborious, time-consuming, and error-prone. Here, we report the development and deployment of a web-based software tool, j5, which automates the design of scar-less multipart DNA assembly protocols including SLIC, Gibson, CPEC, and Golden Gate. The key innovations of the j5 design process include cost optimization, leveraging DNA synthesis when cost-effective to do so, the enforcement of design specification rules, hierarchical assembly strategies to mitigate likely assembly errors, and the instruction of manual or automated construction of scar-less combinatorial DNA libraries. Using a GFP expression testbed, we demonstrate that j5 designs can be executed with the SLIC, Gibson, or CPEC assembly methods, used to build combinatorial libraries with the Golden Gate assembly method, and applied to the preparation of linear gene deletion cassettes for E. coli. The DNA assembly design algorithms reported here are generally applicable to broad classes of DNA construction methodologies and could be implemented to supplement other DNA assembly design tools. Taken together, these innovations save researchers time and effort, reduce the frequency of user design errors and off-target assembly products, decrease research costs, and enable scar-less multipart and combinatorial DNA construction at scales unfeasible without computer-aided design.

  12. Chromosome specific repetitive DNA sequences

    DOEpatents

    Moyzis, Robert K.; Meyne, Julianne

    1991-01-01

    A method is provided for determining specific nucleotide sequences useful in forming a probe which can identify specific chromosomes, preferably through in situ hybridization within the cell itself. In one embodiment, chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members. This invention is the result of a contract with the Department of Energy (Contract No. W-7405-ENG-36).

  13. Stepping stones in DNA sequencing

    PubMed Central

    Stranneheim, Henrik; Lundeberg, Joakim

    2012-01-01

    In recent years there have been tremendous advances in our ability to rapidly and cost-effectively sequence DNA. This has revolutionized the fields of genetics and biology, leading to a deeper understanding of the molecular events in life processes. The rapid technological advances have enormously expanded sequencing opportunities and applications, but also imposed strains and challenges on steps prior to sequencing and in the downstream process of handling and analysis of these massive amounts of sequence data. Traditionally, sequencing has been limited to small DNA fragments of approximately one thousand bases (derived from the organism's genome) due to issues in maintaining a high sequence quality and accuracy for longer read lengths. Although many technological breakthroughs have been made, currently the commercially available massively parallel sequencing methods have not been able to resolve this issue. However, recent announcements in nanopore sequencing hold the promise of removing this read-length limitation, enabling sequencing of larger intact DNA fragments. The ability to sequence longer intact DNA with high accuracy is a major stepping stone towards greatly simplifying the downstream analysis and increasing the power of sequencing compared to today. This review covers some of the technical advances in sequencing that have opened up new frontiers in genomics. PMID:22887891

  14. Thermoelectric method for sequencing DNA.

    PubMed

    Nestorova, Gergana G; Guilbeau, Eric J

    2011-05-21

    This study describes a novel, thermoelectric method for DNA sequencing in a microfluidic device. The method measures the heat released when DNA polymerase inserts a deoxyribonucleoside triphosphate into a primed DNA template. The study describes the principle of operation of a laminar flow microfluidic chip with a reaction zone that contains DNA template/primer complex immobilized to the inner surface of the device's lower channel wall. A thin-film thermopile attached to the external surface of the lower channel wall measures the dynamic change in temperature that results when Klenow polymerase inserts a deoxyribonucleoside triphosphate into the DNA template. The intrinsic rejection of common-mode thermal signals by the thermopile in combination with hydrodynamic focused flow allows for the measurement of temperature changes on the order of 10^-4 K without control of ambient temperature. To demonstrate the method, we report the sequencing of a model oligonucleotide containing 12 bases. Results demonstrate that it is feasible to sequence DNA by measuring the heat released during nucleotide incorporation. This thermoelectric method may offer a new approach to DNA sequencing for personalized medicine applications. © The Royal Society of Chemistry 2011

  15. Biotools: Patenting DNA sequences

    SciTech Connect

    Yablonsky, M.D.; Hone, W.J.

    1995-07-01

    The decision, known as In re Deuel², rejects the PTO's interpretation of a previous decision of the Federal Circuit and makes it more possible that a "nucleic acid of a particular sequence" - commonly known as a gene sequence - may be patentable. 15 refs.

  16. Duplication in DNA Sequences

    NASA Astrophysics Data System (ADS)

    Ito, Masami; Kari, Lila; Kincaid, Zachary; Seki, Shinnosuke

    The duplication and repeat-deletion operations are the basis of a formal language theoretic model of errors that can occur during DNA replication. During DNA replication, subsequences of a strand of DNA may be copied several times (resulting in duplications) or skipped (resulting in repeat-deletions). As formal language operations, iterated duplication and repeat-deletion of words and languages have been well studied in the literature. However, little is known about single-step duplications and repeat-deletions. In this paper, we investigate several properties of these operations, including closure properties of language families in the Chomsky hierarchy and equations involving these operations. We also make progress toward a characterization of regular languages that are generated by duplicating a regular language.
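
    The two single-step operations described in this abstract can be illustrated with a short sketch. It assumes the usual reading of the definitions (duplication turns xyz into xyyz for a non-empty factor y, and repeat-deletion is the inverse); the function names are illustrative and are not taken from the paper.

```python
def duplications(word: str) -> set[str]:
    """Return every word obtained from `word` by duplicating one non-empty factor.

    A word w = x y z (y non-empty) yields x y y z.
    """
    results = set()
    n = len(word)
    for i in range(n):
        for j in range(i + 1, n + 1):
            # word[i:j] is the factor y; insert a second copy right after it.
            results.add(word[:j] + word[i:j] + word[j:])
    return results


def repeat_deletions(word: str) -> set[str]:
    """Return every word obtained from `word` by removing one copy of an adjacent repeat.

    A word w = x y y z (y non-empty) yields x y z.
    """
    results = set()
    n = len(word)
    for i in range(n):
        for length in range(1, (n - i) // 2 + 1):
            first = word[i:i + length]
            second = word[i + length:i + 2 * length]
            if first == second:
                results.add(word[:i + length] + word[i + 2 * length:])
    return results
```

    Under these definitions every word produced by `duplications(w)` has `w` among its repeat-deletions, which is a quick way to sanity-check the pair.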

  17. The sequence of sequencers: The history of sequencing DNA.

    PubMed

    Heather, James M; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way.

  18. Arduino-based automation of a DNA extraction system.

    PubMed

    Kim, Kyung-Won; Lee, Mi-So; Ryu, Mun-Ho; Kim, Jong-Won

    2015-01-01

    There have been many studies to detect infectious diseases with the molecular genetic method. This study presents an automation process for a DNA extraction system based on microfluidics and magnetic bead, which is part of a portable molecular genetic test system. This DNA extraction system consists of a cartridge with chambers, syringes, four linear stepper actuators, and a rotary stepper actuator. The actuators provide a sequence of steps in the DNA extraction process, such as transporting, mixing, and washing for the gene specimen, magnetic bead, and reagent solutions. The proposed automation system consists of a PC-based host application and an Arduino-based controller. The host application compiles a G code sequence file and interfaces with the controller to execute the compiled sequence. The controller executes stepper motor axis motion, time delay, and input-output manipulation. It drives the stepper motor with an open library, which provides a smooth linear acceleration profile. The controller also provides a homing sequence to establish the motor's reference position, and hard limit checking to prevent any over-travelling. The proposed system was implemented and its functionality was investigated, especially regarding positioning accuracy and velocity profile.

  19. DNA Sequencing Sensors: An Overview

    PubMed Central

    Garrido-Cardenas, Jose Antonio; Garcia-Maroto, Federico; Alvarez-Bermejo, Jose Antonio; Manzano-Agugliaro, Francisco

    2017-01-01

    The first sequencing of a complete genome was published forty years ago by the double Nobel Prize in Chemistry winner Frederick Sanger. That corresponded to the small sized genome of a bacteriophage, but since then there have been many complex organisms whose DNA has been sequenced. This was possible thanks to continuous advances in the fields of biochemistry and molecular genetics, but also in other areas such as nanotechnology and computing. Nowadays, sequencing sensors based on genetic material have little to do with those used by Sanger. The emergence of mass sequencing sensors, or new generation sequencing (NGS), meant a quantitative leap both in the volume of genetic material that could be sequenced in each trial and in the time per run and its cost. One can envisage that incoming technologies, already known as fourth generation sequencing, will continue to cheapen the trials by increasing DNA reading lengths in each run. All of this would be impossible without sensors and detection systems becoming smaller and more precise. This article provides a comprehensive overview on sensors for DNA sequencing developed within the last 40 years. PMID:28335417

  20. Statistical properties of DNA sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-01-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
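
    As a rough illustration of detrended fluctuation analysis applied to a DNA sequence, the sketch below uses the commonly described first-order form of the method: build a "DNA walk", remove a linear trend in boxes of size n, and fit the slope of log F(n) against log n. The purine/pyrimidine step rule, the box sizes, and all function names are assumptions made for this example, not details taken from the abstract.

```python
import math


def dna_walk(seq: str) -> list[float]:
    """Integrated profile: +1 for a pyrimidine (C/T), -1 otherwise (assumed rule)."""
    steps = [1.0 if base in "CT" else -1.0 for base in seq.upper()]
    mean = sum(steps) / len(steps)
    profile, total = [], 0.0
    for step in steps:
        total += step - mean
        profile.append(total)
    return profile


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fluctuation(profile, box: int) -> float:
    """Root-mean-square deviation from the local linear trend for one box size."""
    xs = list(range(box))
    n_boxes = len(profile) // box
    total = 0.0
    for k in range(n_boxes):
        segment = profile[k * box:(k + 1) * box]
        slope, intercept = linear_fit(xs, segment)
        total += sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, segment))
    return math.sqrt(total / (n_boxes * box))


def dfa_exponent(seq: str, boxes=(8, 16, 32, 64, 128)) -> float:
    """Slope of log F(n) versus log n; sequences with zero fluctuation are not handled."""
    profile = dna_walk(seq)
    log_n = [math.log(b) for b in boxes]
    log_f = [math.log(fluctuation(profile, b)) for b in boxes]
    alpha, _ = linear_fit(log_n, log_f)
    return alpha
```

    For an uncorrelated random sequence the exponent should come out near 0.5; values clearly above 0.5 would indicate the kind of long-range correlation the abstract reports for non-coding regions.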

  2. Haplogrouping mitochondrial DNA sequences in Legal Medicine/Forensic Genetics.

    PubMed

    Bandelt, Hans-Jürgen; van Oven, Mannis; Salas, Antonio

    2012-11-01

    Haplogrouping refers to the classification of (partial) mitochondrial DNA (mtDNA) sequences into haplogroups using the current knowledge of the worldwide mtDNA phylogeny. Haplogroup assignment of mtDNA control-region sequences assists in the focused comparison with closely related complete mtDNA sequences and thus serves two main goals in forensic genetics: first is the a posteriori quality analysis of sequencing results and second is the prediction of relevant coding-region sites for confirmation or further refinement of haplogroup status. The latter may be important in forensic casework where discrimination power needs to be as high as possible. However, most articles published in forensic genetics perform haplogrouping only in a rudimentary or incorrect way. The present study features PhyloTree as the key tool for assigning control-region sequences to haplogroups and elaborates on additional Web-based searches for finding near-matches with complete mtDNA genomes in the databases. In contrast, none of the automated haplogrouping tools available can yet compete with manual haplogrouping using PhyloTree plus additional Web-based searches, especially when confronted with artificial recombinants still present in forensic mtDNA datasets. We review and classify the various attempts at haplogrouping by using a multiplex approach or relying on automated haplogrouping. Furthermore, we re-examine a few articles in forensic journals providing mtDNA population data where appropriate haplogrouping following PhyloTree immediately highlights several kinds of sequence errors.

  3. Structural Complexity of DNA Sequence

    PubMed Central

    Liou, Cheng-Yuan; Cheng, Wei-Chen; Tsai, Huai-Ying

    2013-01-01

    In modern bioinformatics, finding an efficient way to allocate sequence fragments with biological functions is an important issue. This paper presents a structural approach based on context-free grammars extracted from original DNA or protein sequences. This approach is radically different from all those statistical methods. Furthermore, this approach is compared with a topological entropy-based method for consistency and difference of the complexity results. PMID:23662161

  4. Apparatus for improved DNA sequencing

    DOEpatents

    Douthart, Richard J.; Crowell, Shannon L.

    1996-01-01

    This invention is a means for the rapid sequencing of DNA samples. More specifically, it consists of a new design direct blotting electrophoresis unit. The DNA sequence is deposited on a membrane attached to a rotating drum. Initial data compaction is facilitated by the use of a machined multi-channeled plate called a ribbon channel plate. Each channel is an isolated mini gel system much like a gel filled capillary. The system as a whole, however, is in a slab gel like format with the advantages of uniformity and easy reusability. The system can be used in different embodiments. The drum system is unique in that after deposition the drum rotates the deposited DNA into a large non-buffer open space where processing and detection can occur. The drum can also be removed in toto to special workstations for downstream processing, multiplexing and detection.

  5. Apparatus for improved DNA sequencing

    DOEpatents

    Douthart, R.J.; Crowell, S.L.

    1996-05-07

    This invention is a means for the rapid sequencing of DNA samples. More specifically, it consists of a new design direct blotting electrophoresis unit. The DNA sequence is deposited on a membrane attached to a rotating drum. Initial data compaction is facilitated by the use of a machined multi-channeled plate called a ribbon channel plate. Each channel is an isolated mini gel system much like a gel filled capillary. The system as a whole, however, is in a slab gel like format with the advantages of uniformity and easy reusability. The system can be used in different embodiments. The drum system is unique in that after deposition the drum rotates the deposited DNA into a large non-buffer open space where processing and detection can occur. The drum can also be removed in toto to special workstations for downstream processing, multiplexing and detection. 18 figs.

  6. Sequencing PCR-amplified DNA in lipoprotein and cardiovascular disease research.

    PubMed

    Youngblood, Victoria; Taylor, James G

    2013-01-01

    The discovery of novel genetic variants and mutations in lipoprotein and cardiovascular disease research requires DNA sequencing. Large-scale genomics facilities will increasingly accomplish this with a combination of "next-generation" DNA sequencing methodologies. However, laboratories with limited access to these emerging technologies can still support focused genomic studies with the use of automated Sanger sequencing. Here, we describe two robust methods for medium-throughput DNA sequencing from PCR-amplified fragments of genomic DNA.

  7. Automation of a single-DNA molecule stretching device.

    PubMed

    Sørensen, Kristian Tølbøl; Lopacinska, Joanna M; Tommerup, Niels; Silahtaroglu, Asli; Kristensen, Anders; Marie, Rodolphe

    2015-06-01

    We automate the manipulation of genomic-length DNA in a nanofluidic device based on real-time analysis of fluorescence images. In our protocol, individual molecules are picked from a microchannel and stretched with pN forces using pressure driven flows. The millimeter-long DNA fragments free flowing in micro- and nanofluidics emit low fluorescence and change shape, thus challenging the image analysis for machine vision. We demonstrate a set of image processing steps that increase the intrinsically low signal-to-noise ratio associated with single-molecule fluorescence microscopy. Furthermore, we demonstrate how to estimate the length of molecules by continuous real-time image stitching and how to increase the effective resolution of a pressure controller by pulse width modulation. The sequence of image-processing steps addresses the challenges of genomic-length DNA visualization; however, they should also be general to other applications of fluorescence-based microfluidics.
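
    The idea of using pulse width modulation to gain effective resolution can be sketched generically: a controller that can only be set in coarse steps alternates between the two nearest settings so that the time-averaged output matches an intermediate target. The function below is a hypothetical illustration; the step size, period length, and units are assumptions and do not come from the paper.

```python
import math


def pwm_schedule(target: float, step: float, period_ticks: int) -> list[float]:
    """Return one period of setpoints whose average approximates `target`.

    `step` is the (assumed) native resolution of the controller and
    `period_ticks` is the number of equal time slices in one PWM period.
    """
    low = math.floor(target / step) * step   # nearest setting at or below target
    high = low + step                        # next setting above
    duty = (target - low) / step             # fraction of time spent at `high`
    high_ticks = round(duty * period_ticks)
    return [high] * high_ticks + [low] * (period_ticks - high_ticks)
```

    With a step of 1.0 and ten ticks per period, a target of 2.3 is approximated by three ticks at 3.0 and seven at 2.0; a longer period allows finer duty cycles at the cost of slower averaging.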

  8. The sequence of sequencers: The history of sequencing DNA

    PubMed Central

    Heather, James M.; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. PMID:26554401

  9. A scalable, fully automated process for construction of sequence-ready barcoded libraries for 454.

    PubMed

    Lennon, Niall J; Lintner, Robert E; Anderson, Scott; Alvarez, Pablo; Barry, Andrew; Brockman, William; Daza, Riza; Erlich, Rachel L; Giannoukos, Georgia; Green, Lisa; Hollinger, Andrew; Hoover, Cindi A; Jaffe, David B; Juhn, Frank; McCarthy, Danielle; Perrin, Danielle; Ponchner, Karen; Powers, Taryn L; Rizzolo, Kamran; Robbins, Dana; Ryan, Elizabeth; Russ, Carsten; Sparrow, Todd; Stalker, John; Steelman, Scott; Weiand, Michael; Zimmer, Andrew; Henn, Matthew R; Nusbaum, Chad; Nicol, Robert

    2010-01-01

    We present an automated, high throughput library construction process for 454 technology. Sample handling errors and cross-contamination are minimized via end-to-end barcoding of plasticware, along with molecular DNA barcoding of constructs. Automation-friendly magnetic bead-based size selection and cleanup steps have been devised, eliminating major bottlenecks and significant sources of error. Using this methodology, one technician can create 96 sequence-ready 454 libraries in 2 days, a dramatic improvement over the standard method.

  10. Channel plate for DNA sequencing

    DOEpatents

    Douthart, Richard J.; Crowell, Shannon L.

    1998-01-01

    This invention is a channel plate that facilitates data compaction in DNA sequencing. The channel plate has a length, a width and a thickness, and further has a plurality of channels that are parallel. Each channel has a depth partially through the thickness of the channel plate. Additionally an interface edge permits electrical communication across an interface through a buffer to a deposition membrane surface.

  11. Channel plate for DNA sequencing

    DOEpatents

    Douthart, R.J.; Crowell, S.L.

    1998-01-13

    This invention is a channel plate that facilitates data compaction in DNA sequencing. The channel plate has a length, a width and a thickness, and further has a plurality of channels that are parallel. Each channel has a depth partially through the thickness of the channel plate. Additionally an interface edge permits electrical communication across an interface through a buffer to a deposition membrane surface. 15 figs.

  12. DNA Sequencing Using Capillary Electrophoresis

    SciTech Connect

    Dr. Barry Karger

    2011-05-09

    The overall goal of this program was to develop capillary electrophoresis as the tool to sequence the human genome for the first time. Our program was part of the Human Genome Project. In this work, we were highly successful, and the replaceable polymer we developed, linear polyacrylamide, was used by the DOE sequencing lab in California to sequence a significant portion of the human genome using the MegaBACE multiple capillary array electrophoresis instrument. In this final report, we summarize our efforts and success. We began our work by separating double-stranded oligonucleotides by capillary electrophoresis using cross-linked polyacrylamide gels in fused silica capillaries. This work showed the potential of the methodology. However, preparation of such cross-linked gel capillaries was difficult and poorly reproducible and, even more important, the columns were not very stable. We improved stability by using non-cross-linked linear polyacrylamide. Here, the entangled linear chains could move when osmotic pressure (e.g. sample injection) was imposed on the polymer matrix. This relaxation of the polymer dissipated the stress in the column. Our next advance was to use significantly lower concentrations of the linear polyacrylamide so that the polymer could be automatically blown out after each run and replaced with fresh linear polymer solution. In this way, a new column was available for each analytical run. Finally, while testing many linear polymers, we selected linear polyacrylamide as the best matrix as it was the most hydrophilic polymer available. Under our DOE program, we demonstrated initially the success of the linear polyacrylamide to separate double-stranded DNA. We note that the method is used even today to assay purity of double-stranded DNA fragments. Our focus, of course, was on the separation of single-stranded DNA for sequencing purposes. In one paper, we demonstrated the success of our approach in sequencing up to 500 bases. Other

  13. Automated design of programmable enzyme-driven DNA circuits.

    PubMed

    van Roekel, Hendrik W H; Meijer, Lenny H H; Masroor, Saeed; Félix Garza, Zandra C; Estévez-Torres, André; Rondelez, Yannick; Zagaris, Antonios; Peletier, Mark A; Hilbers, Peter A J; de Greef, Tom F A

    2015-06-19

    Molecular programming allows for the bottom-up engineering of biochemical reaction networks in a controlled in vitro setting. These engineered biochemical reaction networks yield important insight in the design principles of biological systems and can potentially enrich molecular diagnostic systems. The DNA polymerase-nickase-exonuclease (PEN) toolbox has recently been used to program oscillatory and bistable biochemical networks using a minimal number of components. Previous work has reported the automatic construction of in silico descriptions of biochemical networks derived from the PEN toolbox, paving the way for generating networks of arbitrary size and complexity in vitro. Here, we report an automated approach that further bridges the gap between an in silico description and in vitro realization. A biochemical network of arbitrary complexity can be globally screened for parameter values that display the desired function and combining this approach with robustness analysis further increases the chance of successful in vitro implementation. Moreover, we present an automated design procedure for generating optimal DNA sequences, exhibiting key characteristics deduced from the in silico analysis. Our in silico method has been tested on a previously reported network, the Oligator, and has also been applied to the design of a reaction network capable of displaying adaptation in one of its components. Finally, we experimentally characterize unproductive sequestration of the exonuclease to phosphorothioate protected ssDNA strands. The strong nonlinearities in the degradation of active components caused by this unintended cross-coupling are shown computationally to have a positive effect on adaptation quality.

  14. Scar-less multi-part DNA assembly design automation

    DOEpatents

    Hillson, Nathan J.

    2016-06-07

    The present invention provides a method of designing an implementation of a DNA assembly. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding flanking homology sequences to each of the DNA oligos. In another exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding optimized overhang sequences to each of the DNA oligos.
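
    The three steps of the first embodiment (ordered fragment list, one oligo pair per fragment, flanking homology added to each oligo) can be sketched as below. This is a simplified, hypothetical illustration for a linear assembly, not the patented design procedure: the fixed annealing and homology lengths, the dictionary layout, and the function names are all assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def plan_flanking_oligos(fragments, anneal_len=20, homology_len=20):
    """For each fragment, in assembly order, return forward and reverse oligos.

    The forward oligo is prefixed with the tail of the upstream neighbour and
    the reverse oligo covers the head of the downstream neighbour, so adjacent
    amplified fragments share `homology_len` bases at each junction.
    """
    plan = []
    last = len(fragments) - 1
    for i, frag in enumerate(fragments):
        upstream = fragments[i - 1][-homology_len:] if i > 0 else ""
        downstream = fragments[i + 1][:homology_len] if i < last else ""
        forward = upstream + frag[:anneal_len]
        reverse = reverse_complement(frag[-anneal_len:] + downstream)
        plan.append({"fragment": i, "forward": forward, "reverse": reverse})
    return plan
```

    A real design tool would also check melting temperatures, secondary structure, and junction uniqueness; those checks are deliberately left out of this sketch.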

  15. Nanopore DNA sequencing with MspA.

    PubMed

    Derrington, Ian M; Butler, Tom Z; Collins, Marcus D; Manrao, Elizabeth; Pavlenok, Mikhail; Niederweis, Michael; Gundlach, Jens H

    2010-09-14

    Nanopore sequencing has the potential to become a direct, fast, and inexpensive DNA sequencing technology. The simplest form of nanopore DNA sequencing utilizes the hypothesis that individual nucleotides of single-stranded DNA passing through a nanopore will uniquely modulate an ionic current flowing through the pore, allowing the record of the current to yield the DNA sequence. We demonstrate that the ionic current through the engineered Mycobacterium smegmatis porin A, MspA, has the ability to distinguish all four DNA nucleotides and resolve single-nucleotides in single-stranded DNA when double-stranded DNA temporarily holds the nucleotides in the pore constriction. Passing DNA with a series of double-stranded sections through MspA provides proof of principle of a simple DNA sequencing method using a nanopore. These findings highlight the importance of MspA in the future of nanopore sequencing.

  16. Nanopore DNA sequencing with MspA

    PubMed Central

    Derrington, Ian M.; Butler, Tom Z.; Collins, Marcus D.; Manrao, Elizabeth; Pavlenok, Mikhail; Niederweis, Michael; Gundlach, Jens H.

    2010-01-01

    Nanopore sequencing has the potential to become a direct, fast, and inexpensive DNA sequencing technology. The simplest form of nanopore DNA sequencing utilizes the hypothesis that individual nucleotides of single-stranded DNA passing through a nanopore will uniquely modulate an ionic current flowing through the pore, allowing the record of the current to yield the DNA sequence. We demonstrate that the ionic current through the engineered Mycobacterium smegmatis porin A, MspA, has the ability to distinguish all four DNA nucleotides and resolve single-nucleotides in single-stranded DNA when double-stranded DNA temporarily holds the nucleotides in the pore constriction. Passing DNA with a series of double-stranded sections through MspA provides proof of principle of a simple DNA sequencing method using a nanopore. These findings highlight the importance of MspA in the future of nanopore sequencing. PMID:20798343

  17. Particle sizer and DNA sequencer

    DOEpatents

    Olivares, Jose A.; Stark, Peter C.

    2005-09-13

    An electrophoretic device separates and detects particles such as DNA fragments, proteins, and the like. The device has a capillary which is coated with a coating with a low refractive index such as Teflon® AF. A sample of particles is fluorescently labeled and injected into the capillary. The capillary is filled with an electrolyte buffer solution. An electrical field is applied across the capillary causing the particles to migrate from a first end of the capillary to a second end of the capillary. A detector light beam is then scanned along the length of the capillary to detect the location of the separated particles. The device is amenable to a high throughput system by providing additional capillaries. The device can also be used to determine the actual size of the particles and for DNA sequencing.

  18. Genetic mapping and DNA sequencing

    SciTech Connect

    Speed, T.; Waterman, M.S.

    1996-12-31

    The Human Genome Initiative has as its primary objective the characterization of the human genome. High-resolution linkage maps of genetic markers will play an important role in completing the human genome project. This is one of two volumes based on the proceedings of the 1994 IMA Summer Program on Molecular Biology and comprises Weeks 1 and 2 of the four-week program. This volume focuses on genetic mapping and DNA sequencing. Selected papers are indexed separately for inclusion in the Energy Science and Technology Database.

  19. Automated Gene Ontology annotation for anonymous sequence data.

    PubMed

    Hennig, Steffen; Groth, Detlef; Lehrach, Hans

    2003-07-01

    Gene Ontology (GO) is the most widely accepted attempt to construct a unified and structured vocabulary for the description of genes and their products in any organism. Annotation by GO terms is performed in most of the current genome projects, which besides generality has the advantage of being very convenient for computer based classification methods. However, direct use of GO in small sequencing projects is not easy, especially for species not commonly represented in public databases. We present a software package (GOblet), which performs annotation based on GO terms for anonymous cDNA or protein sequences. It uses the species independent GO structure and vocabulary together with a series of protein databases collected from various sites, to perform a detailed GO annotation by sequence similarity searches. The sensitivity and the reference protein sets can be selected by the user. GOblet runs automatically and is available as a public service on our web server. The paper also addresses the reliability of automated GO annotations by using a reference set of more than 6000 human proteins. The GOblet server is accessible at http://goblet.molgen.mpg.de.

  20. EasyExonPrimer: automated primer design for exon sequences.

    PubMed

    Wu, Xiaolin; Munroe, David J

    2006-01-01

    EasyExonPrimer is a web-based software that automates the design of PCR primers to amplify exon sequences from genomic DNA. EasyExonPrimer is written in Perl and uses Primer3 to design PCR primers based on the genome builds and annotation databases available at the University of California, Santa Cruz (UCSC) Genome Browser database (http://genome.ucsc.edu/). It masks repeats and known single nucleotide polymorphism (SNP) sites in the genome and designs standardised primers using optimised conditions. Users can input genes by RefSeq mRNA ID, gene name or keyword. The primer design is optimised for large-scale resequencing of exons. For exons larger than 1 kb, the user has the option of breaking the exon sequence down into overlapping smaller fragments. All primer pairs are then verified using the In-Silico PCR software to test for uniqueness in the genome. We have designed >1000 pairs of primers for 90 genes; 95% of the primer pairs successfully amplified exon sequences under standard PCR conditions without requiring further optimisation. EasyExonPrimer is available from http://129.43.22.27/~primer/. The source code is also available upon request. Xiaolin Wu (forestwu@mail.nih.gov).
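
    The option of breaking exons larger than 1 kb into overlapping smaller fragments can be illustrated with a small tiling routine. This is a sketch under assumed parameters (a 1000 bp maximum and a 100 bp overlap, half-open coordinates); it is not EasyExonPrimer's actual Perl implementation, and the function name is hypothetical.

```python
def split_exon(start: int, end: int, max_len: int = 1000, overlap: int = 100):
    """Tile the half-open interval [start, end) with windows of at most `max_len`.

    Consecutive windows share `overlap` bases so that primer pairs designed
    for neighbouring windows leave no gap in coverage.
    """
    if max_len <= overlap:
        raise ValueError("max_len must be larger than overlap")
    if end - start <= max_len:
        return [(start, end)]          # short exon: one amplicon is enough
    windows = []
    pos = start
    while True:
        stop = min(pos + max_len, end)
        windows.append((pos, stop))
        if stop == end:
            break
        pos = stop - overlap           # step back to create the overlap
    return windows
```

    Each returned window would then be passed to a primer design step such as Primer3; note that the last window can be much shorter than the others, which a production tool might rebalance.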

  1. Laser desorption mass spectrometry for DNA analysis and sequencing

    SciTech Connect

    Chen, C.H.; Taranenko, N.I.; Tang, K.; Allman, S.L.

    1995-03-01

    Laser desorption mass spectrometry has been considered as a potential new method for fast DNA sequencing. Our approach is to use matrix-assisted laser desorption to produce parent ions of DNA segments and a time-of-flight mass spectrometer to identify the sizes of DNA segments. Thus, the approach is similar to gel electrophoresis sequencing using Sanger's enzymatic method. However, gel, radioactive tagging, and dye labeling are not required. In addition, the sequencing process can possibly be finished within a few hundred microseconds instead of hours and days. In order to use mass spectrometry for fast DNA sequencing, the following three criteria need to be satisfied. They are (1) detection of large DNA segments, (2) sensitivity reaching the femtomole region, and (3) mass resolution good enough to separate DNA segments of a single nucleotide difference. It has previously been very difficult to detect large DNA segments by mass spectrometry due to the fragile chemical properties of DNA and low detection sensitivity of DNA ions. We discovered several new matrices to increase the production of DNA ions. By innovative design of a mass spectrometer, we can increase the ion energy up to 45 keV to enhance the detection sensitivity. Recently, we succeeded in detecting a DNA segment with 500 nucleotides. The sensitivity was 100 femtomole. Thus, we have fulfilled two key criteria for using mass spectrometry for fast DNA sequencing. The major effort in the near future is to improve the resolution. Different approaches are being pursued. When high-resolution mass spectrometry can be achieved and automation of sample preparation is developed, a sequencing speed of 500 megabases per year may become feasible.

  2. Automated DNA extraction for large numbers of plant samples.

    PubMed

    Mehle, Nataša; Nikolić, Petra; Rupar, Matevž; Boben, Jana; Ravnikar, Maja; Dermastia, Marina

    2013-01-01

    The method described here is a rapid, total DNA extraction procedure applicable to a large number of plant samples requiring pathogen detection. The procedure combines a simple and quick homogenization step of crude extracts with DNA extraction based upon the binding of DNA to magnetic beads. DNA is purified in an automated process in which the magnetic beads are transferred through a series of washing buffers. The eluted DNA is suitable for efficient amplification in PCR reactions.

  3. Plant DNA sequencing for phylogenetic analyses: from plants to sequences.

    PubMed

    Neves, Susana S; Forrest, Laura L

    2011-01-01

    DNA sequences are important sources of data for phylogenetic analysis. Nowadays, DNA sequencing is a routine technique in molecular biology laboratories. However, there are specific questions associated with project design and sequencing of plant samples for phylogenetic analysis, which may not be familiar to researchers starting in the field. This chapter gives an overview of methods and protocols involved in the sequencing of plant samples, including general recommendations on the selection of species/taxa and DNA regions to be sequenced, and field collection of plant samples. Protocols of plant sample preparation, DNA extraction, PCR and cloning, which are critical to the success of molecular phylogenetic projects, are described in detail. Common problems of sequencing (using the Sanger method) are also addressed. Possible applications of second-generation sequencing techniques in plant phylogenetics are briefly discussed. Finally, orientation on the preparation of sequence data for phylogenetic analyses and submission to public databases is also given.

  4. Automated Gel Size Selection to Improve the Quality of Next-generation Sequencing Libraries Prepared from Environmental Water Samples.

    PubMed

    Uyaguari-Diaz, Miguel I; Slobodan, Jared R; Nesbitt, Matthew J; Croxen, Matthew A; Isaac-Renton, Judith; Prystajecky, Natalie A; Tang, Patrick

    2015-04-17

    Next-generation sequencing of environmental samples can be challenging because of the variable DNA quantity and quality in these samples. High quality DNA libraries are needed for optimal results from next-generation sequencing. Environmental samples such as water may have low quality and quantities of DNA as well as contaminants that co-precipitate with DNA. The mechanical and enzymatic processes involved in extraction and library preparation may further damage the DNA. Gel size selection enables purification and recovery of DNA fragments of a defined size for sequencing applications. Nevertheless, this task is one of the most time-consuming steps in the DNA library preparation workflow. The protocol described here enables complete automation of agarose gel loading, electrophoretic analysis, and recovery of targeted DNA fragments. In this study, we describe a high-throughput approach to prepare high quality DNA libraries from freshwater samples that can be applied also to other environmental samples. We used an indirect approach to concentrate bacterial cells from environmental freshwater samples; DNA was extracted using a commercially available DNA extraction kit, and DNA libraries were prepared using a commercial transposon-based protocol. DNA fragments of 500 to 800 bp were gel size selected using Ranger Technology, an automated electrophoresis workstation. Sequencing of the size-selected DNA libraries demonstrated significant improvements to read length and quality of the sequencing reads.

  5. The Value of DNA Sequencing - TCGA

    Cancer.gov

    DNA sequencing: what it tells us about DNA changes in cancer, how looking across many tumors will help to identify meaningful changes and potential drug targets, and how genomics is changing the way we think about cancer.

  6. Method for sequencing DNA base pairs

    DOEpatents

    Sessler, Andrew M.; Dawson, John

    1993-01-01

    The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source.

  7. DNA sequence from Cretaceous period bone fragments.

    PubMed

    Woodward, S R; Weyand, N J; Bunnell, M

    1994-11-18

    DNA was extracted from 80-million-year-old bone fragments found in strata of the Upper Cretaceous Blackhawk Formation in the roof of an underground coal mine in eastern Utah. This DNA was used as the template in a polymerase chain reaction that amplified and sequenced a portion of the gene encoding mitochondrial cytochrome b. These sequences differ from all other cytochrome b sequences investigated, including those in the GenBank and European Molecular Biology Laboratory databases. DNA isolated from these bone fragments and the resulting gene sequences demonstrate that small fragments of DNA may survive in bone for millions of years.

  8. Small scale sequence automation pays big dividends

    NASA Technical Reports Server (NTRS)

    Nelson, Bill

    1994-01-01

    Galileo sequence design and integration are supported by a suite of formal software tools. Sequence review, however, is largely a manual process with reviewers scanning hundreds of pages of cryptic computer printouts to verify sequence correctness. Beginning in 1990, a series of small, PC based sequence review tools evolved. Each tool performs a specific task but all have a common 'look and feel'. The narrow focus of each tool means simpler operation, and easier creation, testing, and maintenance. Benefits from these tools are (1) decreased review time by factors of 5 to 20 or more with a concomitant reduction in staffing, (2) increased review accuracy, and (3) excellent returns on time invested.

  9. Fibonacci Sequence and Supramolecular Structure of DNA.

    PubMed

    Shabalkin, I P; Grigor'eva, E Yu; Gudkova, M V; Shabalkin, P I

    2016-05-01

    We proposed a new model of supramolecular DNA structure. As in our previously developed model of primary DNA structure [11-15], the 3D structure of the DNA molecule is assembled according to a mathematical rule known as the Fibonacci sequence. Unlike the primary DNA structure, the supramolecular 3D structure is assembled from complex moieties, including a regular tetrahedron and a regular octahedron, that consist of monomers, the elements of the primary DNA structure. The moieties of the supramolecular DNA structure, which form fragments of a regular spatial lattice, are bound via linker (joint) sequences of the DNA chain. The lattice perceives and transmits information signals over a considerable distance without acoustic aberrations. Linker sequences expand the conformational space between lattice segments, allowing them to slide relative to each other under the action of external forces. In this case, sliding is provided by stretching of the stacked linker sequences.

  10. Identification of Bacterial Species in Kuwaiti Waters Through DNA Sequencing

    NASA Astrophysics Data System (ADS)

    Chen, K.

    2017-01-01

    With the objective of identifying the bacterial diversity associated with the ecosystems of various Kuwaiti seas, bacteria were cultured and isolated from 3 water samples. Because of difficulties in culturing and isolating fecal coliforms on the selective agar plates, bacterial isolates from marine agar plates were selected for molecular identification. 16S rRNA genes were successfully amplified from the genomes of the selected isolates using universal eubacterial 16S rRNA primers. The resulting amplification products were subjected to automated DNA sequencing. The partial 16S rDNA sequences obtained were compared directly with sequences in the NCBI database using BLAST, as well as with the sequences available from the Ribosomal Database Project (RDP).

  11. Sequence and Structure Dependent DNA-DNA Interactions

    NASA Astrophysics Data System (ADS)

    Kopchick, Benjamin; Qiu, Xiangyun

    Molecular forces between dsDNA strands are largely dominated by electrostatics and have been extensively studied. Quantitative knowledge has been accumulated on how DNA-DNA interactions are modulated by varied biological constituents such as ions, cationic ligands, and proteins. Despite its central role in biology, the sequence of DNA has not received substantial attention, and "random" DNA sequences are typically used in biophysical studies. However, ~50% of the human genome is composed of non-random-sequence DNA, particularly repetitive sequences. Furthermore, covalent modifications of DNA such as methylation play key roles in gene function. Such DNAs with specific sequences or modifications often take on structures other than the canonical B-form. Here we present a series of quantitative measurements of DNA-DNA forces made with the osmotic stress method on different DNA sequences, from short repeats to the most frequent sequences in the genome, and on modifications such as bromination and methylation. We observe peculiar behaviors that appear to be strongly correlated with the incurred structural changes. We speculate on the causes in terms of differences in the hydration shell and DNA surface structures.

  12. Automated Selection Of Pictures In Sequences

    NASA Technical Reports Server (NTRS)

    Rorvig, Mark E.; Shelton, Robert O.

    1995-01-01

    Method of automated selection of film or video motion-picture frames for storage or examination developed. Beneficial in situations in which quantity of visual information available exceeds amount stored or examined by humans in reasonable amount of time, and/or necessary to reduce large number of motion-picture frames to few conveying significantly different information in manner intermediate between movie and comic book or storyboard. For example, computerized vision system monitoring industrial process programmed to sound alarm when changes in scene exceed normal limits.

  14. [DNA extraction from bones and teeth using AutoMate Express forensic DNA extraction system].

    PubMed

    Gao, Lin-Lin; Xu, Nian-Lai; Xie, Wei; Ding, Shao-Cheng; Wang, Dong-Jing; Ma, Li-Qin; Li, You-Ying

    2013-04-01

    The aim was to explore a new method for automated DNA extraction from bones and teeth. Samples of 33 bones and 15 teeth were prepared by the freeze-mill method and the manual method, respectively. DNA was extracted from the triturated samples with the AutoMate Express forensic DNA extraction system and quantified. DNA extraction from bones and teeth was completed in 3 hours using the AutoMate Express forensic DNA extraction system. There was no statistical difference between the two methods in the DNA concentration of bones. Both bones and teeth gave good STR typing results with the freeze-mill method, and the DNA concentration from teeth was higher than that obtained with the manual method. The AutoMate Express forensic DNA extraction system offers a new method for extracting DNA from bones and teeth that can be applied in forensic practice.

  15. Using DNA looping to measure sequence dependent DNA elasticity

    NASA Astrophysics Data System (ADS)

    Kandinov, Alan; Raghunathan, Krishnan; Meiners, Jens-Christian

    2012-10-01

    We are using tethered particle motion (TPM) microscopy to observe protein-mediated DNA looping in the lactose repressor system in DNA constructs with varying AT / CG content. We use these data to determine the persistence length of the DNA as a function of its sequence content and compare the data to direct micromechanical measurements with constant-force axial optical tweezers. The data from the TPM experiments show a much smaller sequence effect on the persistence length than the optical tweezers experiments.

  16. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, Richard A.; Huang, Xiaohua C.; Quesada, Mark A.

    1995-01-01

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels.

  17. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.

  18. Automated Sequence Preprocessing in a Large-Scale Sequencing Environment

    PubMed Central

    Wendl, Michael C.; Dear, Simon; Hodgson, Dave; Hillier, LaDeana

    1998-01-01

    A software system for transforming fragments from four-color fluorescence-based gel electrophoresis experiments into assembled sequence is described. It has been developed for large-scale processing of all trace data, including shotgun and finishing reads, regardless of clone origin. Design considerations are discussed in detail, as are programming implementation and graphic tools. The importance of input validation, record tracking, and use of base quality values is emphasized. Several quality analysis metrics are proposed and applied to sample results from recently sequenced clones. Such quantities prove to be a valuable aid in evaluating modifications of sequencing protocol. The system is in full production use at both the Genome Sequencing Center and the Sanger Centre, for which combined production is ∼100,000 sequencing reads per week. PMID:9750196

  19. Fractal analysis of DNA sequence data

    SciTech Connect

    Berthelsen, C.L.

    1993-01-01

    DNA sequence databases are growing at an almost exponential rate. New analysis methods are needed to extract knowledge about the organization of nucleotides from this vast amount of data. Fractal analysis is a new scientific paradigm that has been used successfully in many domains including the biological and physical sciences. Biological growth is a nonlinear dynamic process, and some have suggested that considering fractal geometry as a biological design principle may be most productive. This research is an exploratory study of the application of fractal analysis to DNA sequence data. A simple random fractal, the random walk, is used to represent DNA sequences. The fractal dimension of these walks is then estimated using the "sandbox method". Analysis of 164 human DNA sequences compared to three types of control sequences (random, base-content matched, and dimer-content matched) reveals that long-range correlations are present in DNA that are not explained by base or dimer frequencies. The study also revealed that the fractal dimension of coding sequences was significantly lower than that of sequences that were primarily noncoding, indicating the presence of longer-range correlations in functional sequences. The multifractal spectrum is used to analyze fractals that are heterogeneous and have a different fractal dimension for subsets with different scalings. The multifractal spectrum of the random walks of twelve mitochondrial genome sequences was estimated. Eight vertebrate mtDNA sequences had uniformly lower spectra values than did four invertebrate mtDNA sequences. Thus, vertebrate mitochondria show significantly longer-range correlations than invertebrate mitochondria. The higher multifractal spectra values for invertebrate mitochondria suggest a more random organization of the sequences. This research also includes considerable theoretical work on the effects of finite size, embedding dimension, and scaling ranges.
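
    The random-walk representation mentioned in this abstract can be sketched in a few lines of Python. The abstract does not state which step rule was used, so the purine/pyrimidine rule below (+1 for A or G, -1 for C or T) is an assumption borrowed from common DNA-walk practice; the names STEP and dna_walk are hypothetical, and the sandbox estimate of the fractal dimension is not attempted here.

```python
from itertools import accumulate

# Assumed step rule (not taken from the abstract): purines step up,
# pyrimidines step down.
STEP = {"A": 1, "G": 1, "C": -1, "T": -1}


def dna_walk(seq):
    """Return the cumulative displacement after each base of the sequence.

    Characters other than A, C, G, T raise KeyError in this sketch.
    """
    return list(accumulate(STEP[base] for base in seq.upper()))


print(dna_walk("AGCTTA"))
```

    A sequence with no compositional structure would give a walk resembling an unbiased random walk, while long-range correlations show up as persistent drifts.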

  20. Alignment method for spectrograms of DNA sequences.

    PubMed

    Bucur, Anca; van Leeuwen, Jasper; Dimitrova, Nevenka; Mittal, Chetan

    2010-01-01

    DNA spectrograms express the periodicities of each of the four nucleotides A, T, C, and G in one or several genomic sequences to be analyzed. DNA spectral analysis can be applied to systematically investigate DNA patterns, which may correspond to relevant biological features. As opposed to looking at nucleotide sequences, spectrogram analysis may detect structural characteristics in very long sequences that are not identifiable by sequence alignment. Alignment of DNA spectrograms can be used to facilitate analysis of very long sequences or entire genomes at different resolutions. Standard clustering algorithms have been used in spectral analysis to find strong patterns in spectra. However, as they use a global distance metric, these algorithms can only detect strong patterns coexisting in several frequencies. In this paper, we propose a new method and several algorithms for aligning spectra suitable for efficient spectral analysis and allowing for the easy detection of strong patterns in both single frequencies and multiple frequencies.
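
    As a rough illustration of how the periodicities of each nucleotide can be expressed, the Python sketch below converts a sequence into four binary indicator signals and computes a power spectrum for each with a plain discrete Fourier transform. This is a generic construction assumed for illustration, not the alignment method proposed in the paper; the function names are hypothetical, and a spectrogram would apply the same step to successive sliding windows.

```python
import cmath


def indicator(seq, base):
    """Binary signal: 1 where the sequence has `base`, 0 elsewhere."""
    return [1 if b == base else 0 for b in seq.upper()]


def power_spectrum(signal):
    """Squared DFT magnitude at each frequency index (O(n^2) direct sum)."""
    n = len(signal)
    spectrum = []
    for k in range(n):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def base_spectra(seq):
    """Power spectrum of the indicator signal of each nucleotide."""
    return {base: power_spectrum(indicator(seq, base)) for base in "ATCG"}


spectra = base_spectra("ATG" * 4)
# A period-3 repeat of length 12 concentrates power at index 12 / 3 = 4.
print(round(spectra["A"][4], 6))
```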

  1. DNA sequencing: bench to bedside and beyond

    PubMed Central

    Hutchison, Clyde A.

    2007-01-01

    Fifteen years elapsed between the discovery of the double helix (1953) and the first DNA sequencing (1968). Modern DNA sequencing began in 1977, with development of the chemical method of Maxam and Gilbert and the dideoxy method of Sanger, Nicklen and Coulson, and with the first complete DNA sequence (phage ϕX174), which demonstrated that sequence could give profound insights into genetic organization. Incremental improvements allowed sequencing of molecules >200 kb (human cytomegalovirus) leading to an avalanche of data that demanded computational analysis and spawned the field of bioinformatics. The US Human Genome Project spurred sequencing activity. By 1992 the first ‘sequencing factory’ was established, and others soon followed. The first complete cellular genome sequences, from bacteria, appeared in 1995 and other eubacterial, archaebacterial and eukaryotic genomes were soon sequenced. Competition between the public Human Genome Project and Celera Genomics produced working drafts of the human genome sequence, published in 2001, but refinement and analysis of the human genome sequence will continue for the foreseeable future. New ‘massively parallel’ sequencing methods are greatly increasing sequencing capacity, but further innovations are needed to achieve the ‘thousand dollar genome’ that many feel is prerequisite to personalized genomic medicine. These advances will also allow new approaches to a variety of problems in biology, evolution and the environment. PMID:17855400

  2. Nucleotide capacitance calculation for DNA sequencing

    SciTech Connect

    Lu, Jun-Qiang; Zhang, Xiaoguang

    2008-01-01

    Using a first-principles linear response theory, the capacitances of the DNA nucleotides adenine, cytosine, guanine and thymine are calculated. The difference in capacitance between the nucleotides is studied with respect to conformational distortion. The result suggests that although an alternating-current capacitance measurement of a single-stranded DNA chain threaded through nano-gap electrodes may not be sufficient as a stand-alone method for rapid DNA sequencing, the capacitance of the nucleotides should be taken into consideration in any GHz-frequency electric measurements and may also serve as an additional criterion for identifying the DNA sequence.

  3. Visible periodicity of strong nucleosome DNA sequences.

    PubMed

    Salih, Bilal; Tripathi, Vijay; Trifonov, Edward N

    2015-01-01

    Fifteen years ago, Lowary and Widom assembled nucleosomes on synthetic random sequence DNA molecules, selected the strongest nucleosomes and discovered that the TA dinucleotides in these strong nucleosome sequences often appear at 10-11 bases from one another or at distances which are multiples of this period. We repeated this experiment computationally, on large ensembles of natural genomic sequences, by selecting the strongest nucleosomes, i.e. those in which the distances between like-named dinucleotides are multiples of 10.4 bases, the structural and sequence period of nucleosome DNA. The analysis confirmed the periodicity of TA dinucleotides in the strong nucleosomes, and also revealed other periodic sequence elements, notably classical AA and TT dinucleotides. The matrices of DNA bendability and their simple linear forms (nucleosome positioning motifs) are calculated from the strong nucleosome DNA sequences. The motifs are in full accord with nucleosome positioning sequences derived earlier, thus confirming that the new technique does indeed detect strong nucleosomes. Species- and isochore-specific variations of the matrices and of the positioning motifs are demonstrated. The strong nucleosome DNA sequences manifest the strongest nucleosome positioning sequence signals observed to date, showing the dinucleotide periodicities in directly observable rather than hidden form.
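
    The kind of distance analysis described here, counting how far apart like-named dinucleotides occur, can be sketched as follows. This is only an illustrative tally written in Python, not the authors' selection procedure; the function name dinucleotide_distances and the max_dist cutoff are hypothetical.

```python
from collections import Counter


def dinucleotide_distances(seq, dinuc="TA", max_dist=40):
    """Count distances between all pairs of occurrences of one dinucleotide.

    Only pairs at most max_dist bases apart are counted.
    """
    seq = seq.upper()
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]
    counts = Counter()
    for idx, first in enumerate(positions):
        for second in positions[idx + 1:]:
            distance = second - first
            if distance > max_dist:
                break  # positions are sorted, so later ones are farther away
            counts[distance] += 1
    return counts


# Toy sequence with TA placed every 10 bases.
toy = "TA" + "C" * 8 + "TA" + "C" * 8 + "TA"
print(sorted(dinucleotide_distances(toy).items()))
```

    In a periodic sequence the tally peaks at the period and its multiples, which is the signature the abstract describes for strong nucleosome DNA.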

  4. Coupled amplification and sequencing of genomic DNA.

    PubMed Central

    Ruano, G; Kidd, K K

    1991-01-01

    Addition of dideoxyribonucleotides during the exponential phase of the PCR should result in the synthesis of two complementary sequence ladders. We have explored this hypothesis to develop coupled amplification and sequencing of genomic DNA. Coupled amplification and sequencing is a biphasic method for sequencing both strands of template as they are amplified. Stage I selects and amplifies a single target from the genomic DNA sample. Stage II accomplishes the sequencing as well as additional amplification of the target using aliquots from the stage I reaction mixed with end-labeled primer and dideoxynucleotides. We have successfully applied coupled amplification and sequencing to a 300-base-pair fragment 4 kilobases upstream from HOX2B directly from human whole genomic DNA. PMID:1672768

  5. Counterintuitive DNA Sequence Dependence in Supercoiling-Induced DNA Melting

    PubMed Central

    Vlijm, Rifka; v.d. Torre, Jaco; Dekker, Cees

    2015-01-01

    The metabolism of DNA in cells relies on the balance between hybridized double-stranded DNA (dsDNA) and local de-hybridized regions of ssDNA that provide access to binding proteins. Traditional melting experiments, in which short pieces of dsDNA are heated up until the point of melting into ssDNA, have determined that AT-rich sequences have a lower binding energy than GC-rich sequences. In cells, however, the double-stranded backbone of DNA is destabilized by negative supercoiling, and not by temperature. To investigate the effect of GC content on DNA melting induced by negative supercoiling, we studied DNA molecules with a GC content ranging from 38% to 77%, using single-molecule magnetic tweezer measurements in which the length of a single DNA molecule is measured as a function of applied stretching force and supercoiling density. At low force (<0.5 pN), supercoiling results in twisting of the dsDNA backbone and loop formation (plectonemes), without inducing any DNA melting. This process was not influenced by the DNA sequence. When negative supercoiling is introduced at increasing force, local melting of DNA is induced. For the different DNA molecules, we measured a characteristic force Fchar at which negative supercoiling induces local melting of the dsDNA. Surprisingly, GC-rich sequences melt at lower forces than AT-rich sequences: Fchar = 0.56 pN for 77% GC but 0.73 pN for 38% GC. An explanation for this counterintuitive effect is provided by the realization that supercoiling densities of a few percent only induce melting of a few percent of the base pairs. As a consequence, denaturation bubbles occur in local AT-rich regions and the sequence-dependent effect arises from an increased DNA bending/torsional energy associated with the plectonemes. This new insight indicates that an increased GC-content adjacent to AT-rich DNA regions will enhance local opening of the double-stranded DNA helix. PMID:26513573

  6. Applications of mass spectrometry to DNA fingerprinting and DNA sequencing

    SciTech Connect

    Jacobson, K.B.; Buchanan, M.V.; Chen, C.H.; Doktycz, M.J.; McLuckey, S.A.; Arlinghaus, H.F.

    1993-06-01

    DNA fingerprinting and sequencing rely on polyacrylamide gel electrophoresis to determine the sizes of the DNA fragments. Innovative alternatives to polyacrylamide gel electrophoresis are under investigation for characterization of such fingerprinting and sequencing. One method uses stable isotopes of tin and other elements to label the DNA, whereas other procedures do not require labels. The detectors in each case are mass spectrometers that detect either the stable isotopes or the DNA fragments themselves. If successful, these methods will speed up the rate of DNA analysis by one or two orders of magnitude.

  7. Applications of mass spectrometry to DNA fingerprinting and DNA sequencing

    SciTech Connect

    Jacobson, K.B.; Buchanan, M.V.; Chen, C.H.; Doktycz, M.J.; McLuckey, S.A.; Arlinghaus, H.F.

    1993-01-01

    DNA fingerprinting and sequencing rely on polyacrylamide gel electrophoresis to determine the sizes of the DNA fragments. Innovative alternatives to polyacrylamide gel electrophoresis are under investigation for characterization of such fingerprinting and sequencing. One method uses stable isotopes of tin and other elements to label the DNA, whereas other procedures do not require labels. The detectors in each case are mass spectrometers that detect either the stable isotopes or the DNA fragments themselves. If successful, these methods will speed up the rate of DNA analysis by one or two orders of magnitude.

  8. Data structures for DNA sequence manipulation.

    PubMed Central

    Lawrence, C B

    1986-01-01

    Two data structures designated Fragment and Construct are described. The Fragment data structure defines a continuous nucleic acid sequence from a unique genetic origin. The Construct defines a continuous sequence composed of sequences from multiple genetic origins. These data structures are manipulated by a set of software tools to simulate the construction of mosaic recombinant DNA molecules. They are also used as an interface between sequence data banks and analytical programs. PMID:3753765
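
    The two structures can be pictured with a minimal Python sketch. The abstract does not give their actual fields or the language of the original tools, so the attribute names (origin, sequence, parts) and methods below are hypothetical stand-ins that only mirror the stated definitions: a Fragment is one continuous sequence from a single genetic origin, and a Construct is a continuous sequence built from several of them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Fragment:
    """A continuous nucleic acid sequence from one genetic origin."""
    origin: str    # hypothetical label for the source molecule
    sequence: str


@dataclass
class Construct:
    """A continuous sequence assembled from Fragments of several origins."""
    parts: List[Fragment] = field(default_factory=list)

    def append(self, fragment):
        # Simulates ligating one more fragment onto the growing molecule.
        self.parts.append(fragment)

    def sequence(self):
        return "".join(part.sequence for part in self.parts)

    def origins(self):
        return [part.origin for part in self.parts]


mosaic = Construct()
mosaic.append(Fragment("vector", "GAATTC"))   # illustrative data only
mosaic.append(Fragment("insert", "AAGCTT"))
print(mosaic.sequence(), mosaic.origins())
```

    Keeping the origin with each part is what lets a simulated mosaic molecule be traced back to its source sequences.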

  9. EGNAS: an exhaustive DNA sequence design algorithm

    PubMed Central

    2012-01-01

    Background The molecular recognition based on the complementary base pairing of deoxyribonucleic acid (DNA) is the fundamental principle in the fields of genetics, DNA nanotechnology and DNA computing. We present an exhaustive DNA sequence design algorithm that generates sets containing a maximum number of sequences with defined properties. EGNAS (Exhaustive Generation of Nucleic Acid Sequences) offers the possibility of controlling both interstrand and intrastrand properties. The guanine-cytosine content can be adjusted. Sequences can be forced to start and end with guanine or cytosine. This option reduces the risk of “fraying” of DNA strands. It is possible to limit cross hybridizations of a defined length, and to adjust the uniqueness of sequences. Self-complementarity and hairpin structures of certain length can be avoided. Sequences and subsequences can optionally be forbidden. Furthermore, sequences can be designed to have minimum interactions with predefined strands and neighboring sequences. Results The algorithm is realized in a C++ program. TAG sequences can be generated and combined with primers for single-base extension reactions, which were described for multiplexed genotyping of single nucleotide polymorphisms. Thereby, possible foldback through intrastrand interaction of TAG-primer pairs can be limited. The design of sequences for specific attachment of molecular constructs to DNA origami is presented. Conclusions We developed a new software tool called EGNAS for the design of unique nucleic acid sequences. The presented exhaustive algorithm generates larger sets of sequences than previous software under equal constraints. EGNAS is freely available for noncommercial use at http://www.chm.tu-dresden.de/pc6/EGNAS. PMID:22716030
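
    A few of the constraints listed in this abstract (adjustable GC content, G or C at both ends, avoidance of self-complementary stretches of a given length) can be expressed as simple checks. The Python sketch below is not EGNAS, which is a C++ program whose internals are not described here; it filters a single candidate rather than generating sets exhaustively, and all function names and thresholds are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_content(seq):
    """Fraction of G and C bases; seq must be non-empty."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_gc_ends(seq):
    """True if the sequence starts and ends with G or C."""
    seq = seq.upper()
    return seq[0] in "GC" and seq[-1] in "GC"


def has_self_complement(seq, k):
    """True if any length-k window also occurs in the reverse complement."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    return any(seq[i:i + k] in rc for i in range(len(seq) - k + 1))


def passes(seq, gc_min=0.4, gc_max=0.6, k=4):
    """Apply the three illustrative constraints to one candidate."""
    return (has_gc_ends(seq)
            and gc_min <= gc_content(seq) <= gc_max
            and not has_self_complement(seq, k))


print(passes("GACATC"), passes("GAATTC"))
```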

  10. A test matrix sequencer for research test facility automation

    NASA Technical Reports Server (NTRS)

    Mccartney, Timothy P.; Emery, Edward F.

    1990-01-01

    The hardware and software configuration of a Test Matrix Sequencer, a general purpose test matrix profiler that was developed for research test facility automation at the NASA Lewis Research Center, is described. The system provides set points to controllers and contact closures to data systems during the course of a test. The Test Matrix Sequencer consists of a microprocessor controlled system which is operated from a personal computer. The software program, which is the main element of the overall system is interactive and menu driven with pop-up windows and help screens. Analog and digital input/output channels can be controlled from a personal computer using the software program. The Test Matrix Sequencer provides more efficient use of aeronautics test facilities by automating repetitive tasks that were once done manually.

  11. DNA sequencing using electrical conductance measurements of a DNA polymerase

    NASA Astrophysics Data System (ADS)

    Chen, Yu-Shiun; Lee, Chia-Hui; Hung, Meng-Yen; Pan, Hsu-An; Chiou, Jin-Chern; Huang, G. Steven

    2013-06-01

    The development of personalized medicine, in which medical treatment is customized to an individual on the basis of genetic information, requires techniques that can sequence DNA quickly and cheaply. Single-molecule sequencing technologies, such as nanopores, can potentially be used to sequence long strands of DNA without labels or amplification, but a viable technique has yet to be established. Here, we show that single DNA molecules can be sequenced by monitoring the electrical conductance of a phi29 DNA polymerase as it incorporates unlabelled nucleotides into a template strand of DNA. The conductance of the polymerase is measured by attaching it to a protein transistor that consists of an antibody molecule (immunoglobulin G) bound to two gold nanoparticles, which are in turn connected to source and drain electrodes. The electrical conductance of the DNA polymerase exhibits well-separated plateaux that are ~3 pA in height. Each plateau corresponds to an individual base and is formed at a rate of ~22 nucleotides per second. Additional spikes appear on top of the plateaux and can be used to discriminate between the four different nucleotides. We also show that the sequencing platform works with a variety of DNA polymerases and can sequence difficult templates such as homopolymers.

  12. Streamlining DNA Barcoding Protocols: Automated DNA Extraction and a New cox1 Primer in Arachnid Systematics

    PubMed Central

    Vidergar, Nina; Toplak, Nataša; Kuntner, Matjaž

    2014-01-01

    Background DNA barcoding is a popular tool in taxonomic and phylogenetic studies, but for most animal lineages protocols for obtaining the barcoding sequences (mitochondrial cytochrome C oxidase subunit I, cox1, also known as CO1) are not standardized. Our aim was to explore an optimal strategy for arachnids, focusing on the species-richest lineage, spiders, by (1) improving an automated DNA extraction protocol, (2) testing the performance of commonly used primer combinations, and (3) developing a new cox1 primer suitable for more efficient alignment and phylogenetic analyses. Methodology We used exemplars of 15 species from all major spider clades, processed a range of spider tissues of varying size and quality, optimized genomic DNA extraction using the MagMAX Express magnetic particle processor (an automated high throughput DNA extraction system), and tested cox1 amplification protocols emphasizing the standard barcoding region using ten routinely employed primer pairs. Results The best results were obtained with the commonly used Folmer primers (LCO1490/HCO2198) that capture the standard barcode region, and with the C1-J-2183/C1-N-2776 primer pair that amplifies its extension. However, C1-J-2183 is designed too close to HCO2198 for well-interpreted, continuous sequence data, and in practice the resulting sequences from the two primer pairs rarely overlap. We therefore designed a new forward primer C1-J-2123 60 base pairs upstream of the C1-J-2183 binding site. The success rate of this new primer (93%) matched that of C1-J-2183. Conclusions The use of C1-J-2123 allows full, indel-free overlap of sequences obtained with the standard Folmer primers and with the C1-J-2123 primer pair. Our preliminary tests suggest that in addition to spiders, C1-J-2123 will also perform in other arachnids and several other invertebrates. We provide optimal PCR protocols for these primer sets, and recommend using them for systematic efforts beyond DNA barcoding. PMID:25415202

  13. Method for sequencing DNA base pairs

    DOEpatents

    Sessler, A.M.; Dawson, J.

    1993-12-14

    The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source. 6 figures.

  14. Automated serial extraction of DNA and RNA from biobanked tissue specimens.

    PubMed

    Mathot, Lucy; Wallin, Monica; Sjöblom, Tobias

    2013-08-19

    With increasing biobanking of biological samples, methods for large scale extraction of nucleic acids are in demand. The lack of such techniques designed for extraction from tissues results in a bottleneck in downstream genetic analyses, particularly in the field of cancer research. We have developed an automated procedure for tissue homogenization and extraction of DNA and RNA into separate fractions from the same frozen tissue specimen. A purpose developed magnetic bead based technology to serially extract both DNA and RNA from tissues was automated on a Tecan Freedom Evo robotic workstation. 864 fresh-frozen human normal and tumor tissue samples from breast and colon were serially extracted in batches of 96 samples. Yields and quality of DNA and RNA were determined. The DNA was evaluated in several downstream analyses, and the stability of RNA was determined after 9 months of storage. The extracted DNA performed consistently well in processes including PCR-based STR analysis, HaloPlex selection and deep sequencing on an Illumina platform, and gene copy number analysis using microarrays. The RNA has performed well in RT-PCR analyses and maintains integrity upon storage. The technology described here enables the processing of many tissue samples simultaneously with a high quality product and a time and cost reduction for the user. This reduces the sample preparation bottleneck in cancer research. The open automation format also enables integration with upstream and downstream devices for automated sample quantitation or storage.

  15. Automated serial extraction of DNA and RNA from biobanked tissue specimens

    PubMed Central

    2013-01-01

    Background With increasing biobanking of biological samples, methods for large scale extraction of nucleic acids are in demand. The lack of such techniques designed for extraction from tissues results in a bottleneck in downstream genetic analyses, particularly in the field of cancer research. We have developed an automated procedure for tissue homogenization and extraction of DNA and RNA into separate fractions from the same frozen tissue specimen. A purpose developed magnetic bead based technology to serially extract both DNA and RNA from tissues was automated on a Tecan Freedom Evo robotic workstation. Results 864 fresh-frozen human normal and tumor tissue samples from breast and colon were serially extracted in batches of 96 samples. Yields and quality of DNA and RNA were determined. The DNA was evaluated in several downstream analyses, and the stability of RNA was determined after 9 months of storage. The extracted DNA performed consistently well in processes including PCR-based STR analysis, HaloPlex selection and deep sequencing on an Illumina platform, and gene copy number analysis using microarrays. The RNA has performed well in RT-PCR analyses and maintains integrity upon storage. Conclusions The technology described here enables the processing of many tissue samples simultaneously with a high quality product and a time and cost reduction for the user. This reduces the sample preparation bottleneck in cancer research. The open automation format also enables integration with upstream and downstream devices for automated sample quantitation or storage. PMID:23957867

  16. Nanopore DNA sequencing using kinetic proofreading

    NASA Astrophysics Data System (ADS)

    Ling, Xinsheng

    We propose a method of DNA sequencing by combining the physical method of nanopore electrical measurements and Southern's sequencing-by-hybridization. The new key ingredient, essential to both lowering the costs and increasing the precision, is an asymmetric nanopore sandwich device capable of measuring the DNA hybridization probe twice separated by a designed waiting time. Those incorrect probes appearing only once in nanopore ionic current traces are discriminated from the correct ones that appear twice. This method of discrimination is similar to the principle of kinetic proofreading proposed by Hopfield and Ninio in gene transcription and translation processes. An error analysis of this nanopore kinetic proofreading (nKP) technique for DNA sequencing is carried out in comparison with the most precise 3' dideoxy termination method developed by Sanger.
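    The double-detection rule described in this abstract can be illustrated with a minimal sketch. It assumes each detection is recorded as a (probe, time) pair and that a probe counts as correct when two of its detections are separated by the designed waiting time within a tolerance. The function name `proofread`, the `wait` and `tol` parameters, and the sample events are hypothetical and are not taken from the record.

```python
from collections import defaultdict

def proofread(events, wait, tol):
    """Return probes seen twice with a gap close to the designed waiting time.

    events: iterable of (probe_id, detection_time) pairs (hypothetical format).
    wait:   designed waiting time between the two measurements.
    tol:    allowed deviation from `wait` (an assumption of this sketch).
    """
    times = defaultdict(list)
    for probe, t in events:
        times[probe].append(t)

    accepted = set()
    for probe, ts in times.items():
        ts.sort()
        # A probe detected only once has no consecutive pair and is rejected.
        if any(abs((b - a) - wait) <= tol for a, b in zip(ts, ts[1:])):
            accepted.add(probe)
    return accepted

# Made-up detections: ACGT is seen twice about 1.0 apart, TTGA only once,
# and GGCA twice but with the wrong spacing.
events = [("ACGT", 0.0), ("ACGT", 1.02), ("TTGA", 0.4),
          ("GGCA", 2.0), ("GGCA", 2.3)]
print(sorted(proofread(events, wait=1.0, tol=0.1)))
```

    Requiring the spacing to match the waiting time, rather than merely counting detections, is a design choice of this sketch; the abstract itself only states that correct probes appear twice.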

  17. Extracting biological knowledge from DNA sequences

    SciTech Connect

    De La Vega, F.M.; Thieffry, D.; Collado-Vides, J.

    1996-12-31

    This session describes the elucidation of information from DNA sequences and the challenges computational biologists face in their task of summarizing and deciphering the human genome. Techniques discussed include methods from statistics, information theory, artificial intelligence and linguistics. 1 ref.

  18. gargammel: a sequence simulator for ancient DNA.

    PubMed

    Renaud, Gabriel; Hanghøj, Kristian; Willerslev, Eske; Orlando, Ludovic

    2016-10-29

    Ancient DNA has emerged as a remarkable tool to infer the history of extinct species and past populations. However, many of its characteristics, such as extensive fragmentation, damage and contamination, can influence downstream analyses. To help investigators measure how these could impact their analyses in silico, we have developed gargammel, a package that simulates ancient DNA fragments given a set of known reference genomes. Our package simulates the entire molecular process from post-mortem DNA fragmentation and DNA damage to experimental sequencing errors, and reproduces the most common biases observed in ancient DNA datasets.
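    The three stages named in this abstract (fragmentation, damage, sequencing error) can be sketched as a toy simulator. This is not gargammel's interface or model: the function `simulate_fragment`, its parameters, the exponential length distribution, and the restriction of C-to-T damage to the first three bases are all simplifying assumptions made for illustration, and contamination is not modeled.

```python
import random

def simulate_fragment(reference, mean_len=50, deam_rate=0.3,
                      err_rate=0.01, rng=None):
    """Toy ancient-DNA read: fragment, damage, then add sequencing errors.

    All parameter names and defaults are hypothetical. `reference` must be
    a non-empty string over the alphabet ACGT.
    """
    rng = rng or random.Random()

    # 1. Fragmentation: short fragment with an exponentially distributed length.
    length = int(rng.expovariate(1.0 / mean_len)) + 1
    length = min(length, len(reference))
    start = rng.randrange(len(reference) - length + 1)
    frag = list(reference[start:start + length])

    # 2. Damage: C->T deamination, applied here only near the 5' end.
    for i in range(min(3, len(frag))):
        if frag[i] == "C" and rng.random() < deam_rate:
            frag[i] = "T"

    # 3. Sequencing error: substitute a different base at a uniform rate.
    for i, base in enumerate(frag):
        if rng.random() < err_rate:
            frag[i] = rng.choice([b for b in "ACGT" if b != base])

    return start, "".join(frag)

rng = random.Random(0)
start, read = simulate_fragment("ACGT" * 50, rng=rng)
print(start, len(read))
```

    Returning the start position alongside the read makes it possible to compare the simulated fragment with the reference segment it came from, which is the kind of in silico check the abstract describes.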

  19. Compression of Multiple DNA Sequences Using Intra-Sequence and Inter-Sequence Similarities.

    PubMed

    Cheng, Kin-On; Wu, Paula; Law, Ngai-Fong; Siu, Wan-Chi

    2015-01-01

    Traditionally, intra-sequence similarity is exploited for compressing a single DNA sequence. Recently, remarkable compression performance of individual DNA sequence from the same population is achieved by encoding its difference with a nearly identical reference sequence. Nevertheless, there is a lack of general algorithms that also allow less similar reference sequences. In this work, we extend the intra-sequence to the inter-sequence similarity in that approximate matches of subsequences are found between the DNA sequence and a set of reference sequences. Hence, a set of nearly identical DNA sequences from the same population or a set of partially similar DNA sequences like chromosome sequences and DNA sequences of related species can be compressed together. For practical compressors, the compressed size is usually influenced by the compression order of sequences. Fast search algorithms for the optimal compression order are thus developed for multiple-sequence compression. Experimental results on artificial and real datasets demonstrate that our proposed multiple-sequence compression methods with fast compression order search are able to achieve good compression performance under different levels of similarity in the multiple DNA sequences.
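    The idea of encoding a sequence as its difference from a nearly identical reference can be shown with a small sketch. It uses Python's standard `difflib.SequenceMatcher` as a stand-in matcher and is not the algorithm of the paper; the operation names `copy` and `lit` and the functions `encode` and `decode` are hypothetical.

```python
from difflib import SequenceMatcher

def encode(reference, target):
    """Describe `target` as copies from `reference` plus literal inserts."""
    ops = []
    matcher = SequenceMatcher(None, reference, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Shared run: store only the reference coordinates.
            ops.append(("copy", i1, i2))
        elif j2 > j1:
            # Replaced or inserted bases: store them literally.
            ops.append(("lit", target[j1:j2]))
        # Pure deletions contribute nothing to the target and are skipped.
    return ops

def decode(reference, ops):
    """Rebuild the target from the reference and the operation list."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            parts.append(reference[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)

reference = "ACGTACGTTTGACCAGTACGATCG"
target = "ACGTACGTTAGACCAGTCGATCGA"
ops = encode(reference, target)
assert decode(reference, ops) == target
print(len(ops), "operations")
```

    When the two sequences are nearly identical, most of the target is covered by a few long `copy` operations, which is what makes reference-based encoding compact. A real compressor would also entropy-code the operations, which this sketch omits.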

  20. Automated carboxy-terminal sequence analysis of peptides.

    PubMed Central

    Bailey, J. M.; Shenoy, N. R.; Ronk, M.; Shively, J. E.

    1992-01-01

    Proteins and peptides can be sequenced from the carboxy-terminus with isothiocyanate reagents to produce amino acid thiohydantoin derivatives. Previous studies in our laboratory have focused on solution phase conditions for formation of the peptidylthiohydantoins with trimethylsilylisothiocyanate (TMS-ITC) and for hydrolysis of these peptidylthiohydantoins into an amino acid thiohydantoin derivative and a new shortened peptide capable of continued degradation (Bailey, J. M. & Shively, J. E., 1990, Biochemistry 29, 3145-3156). The current study is a continuation of this work and describes the construction of an instrument for automated C-terminal sequencing, the application of the thiocyanate chemistry to peptides covalently coupled to a novel polyethylene solid support (Shenoy, N. R., Bailey, J. M., & Shively, J. E., 1992, Protein Sci. 1, 58-67), the use of sodium trimethylsilanolate as a novel reagent for the specific cleavage of the derivatized C-terminal amino acid, and the development of methodology to sequence through the difficult amino acid, aspartate. Automated programs are described for the C-terminal sequencing of peptides covalently attached to carboxylic acid-modified polyethylene. The chemistry involves activation with acetic anhydride, derivatization with TMS-ITC, and cleavage of the derivatized C-terminal amino acid with sodium trimethylsilanolate. The thiohydantoin amino acid is identified by on-line high performance liquid chromatography using a Phenomenex Ultracarb 5 ODS(30) column and a triethylamine/phosphoric acid buffer system containing pentanesulfonic acid. The generality of our automated C-terminal sequencing methodology was examined by sequencing model peptides containing all 20 of the common amino acids. All of the amino acids were found to sequence in high yield (90% or greater) except for asparagine and aspartate, which could be only partially removed, and proline, which was found not to be capable of derivatization. In spite of these

  1. Development and Evaluation of an Automated Annotation Pipeline and cDNA Annotation System

    PubMed Central

    Kasukawa, Takeya; Furuno, Masaaki; Nikaido, Itoshi; Bono, Hidemasa; Hume, David A.; Bult, Carol; Hill, David P.; Baldarelli, Richard; Gough, Julian; Kanapin, Alexander; Matsuda, Hideo; Schriml, Lynn M.; Hayashizaki, Yoshihide; Okazaki, Yasushi; Quackenbush, John

    2003-01-01

    Manual curation has long been held to be the “gold standard” for functional annotation of DNA sequence. Our experience with the annotation of more than 20,000 full-length cDNA sequences revealed problems with this approach, including inaccurate and inconsistent assignment of gene names, as well as many good assignments that were difficult to reproduce using only computational methods. For the FANTOM2 annotation of more than 60,000 cDNA clones, we developed a number of methods and tools to circumvent some of these problems, including an automated annotation pipeline that provides high-quality preliminary annotation for each sequence by introducing an “uninformative filter” that eliminates uninformative annotations, controlled vocabularies to accurately reflect both the functional assignments and the evidence supporting them, and a highly refined, Web-based manual annotation tool that allows users to view a wide array of sequence analyses and to assign gene names and putative functions using a consistent nomenclature. The ultimate utility of our approach is reflected in the low rate of reassignment of automated assignments by manual curation. Based on these results, we propose a new standard for large-scale annotation, in which the initial automated annotations are manually investigated and then computational methods are iteratively modified and improved based on the results of manual curation. PMID:12819153

  2. Nanogrid rolling circle DNA sequencing

    DOEpatents

    Church, George M.; Porreca, Gregory J.; Shendure, Jay; Rosenbaum, Abraham Meir

    2017-04-18

    The present invention relates to methods for sequencing a polynucleotide immobilized on an array having a plurality of specific regions each having a defined diameter size, including synthesizing a concatemer of a polynucleotide by rolling circle amplification, wherein the concatemer has a cross-sectional diameter greater than the diameter of a specific region, immobilizing the concatemer to the specific region to make an immobilized concatemer, and sequencing the immobilized concatemer.

  3. DNA sequencing using fluorescence background electroblotting membrane

    DOEpatents

    Caldwell, Karin D.; Chu, Tun-Jen; Pitt, William G.

    1992-01-01

    A method for the multiplex sequencing of DNA is disclosed which comprises the electroblotting of specific base terminated DNA fragments, which have been resolved by gel electrophoresis, onto the surface of a neutral non-aromatic polymeric microporous membrane exhibiting low background fluorescence which has been surface modified to contain amino groups. Polypropylene membranes are preferred and the introduction of amino groups is accomplished by subjecting the membrane to radio or microwave frequency plasma discharge in the presence of an aminating agent, preferably ammonia. The membrane, containing physically adsorbed DNA fragments on its surface after the electroblotting, is then treated with crosslinking means such as UV radiation or a glutaraldehyde spray to chemically bind the DNA fragments to the membrane through said amino groups contained on the surface thereof. The DNA fragments chemically bound to the membrane are subjected to hybridization probing with a tagged probe specific to the sequence of the DNA fragments. The tagging may be by either fluorophores or radioisotopes. The tagged probes hybridized to said target DNA fragments are detected and read by laser induced fluorescence detection or autoradiograms. The use of aminated low fluorescent background membranes allows the use of fluorescent detection and reading even when the available amount of DNA to be sequenced is small. The DNA bound to the membranes may be reprobed numerous times.

  4. DNA sequencing using fluorescence background electroblotting membrane

    DOEpatents

    Caldwell, K.D.; Chu, T.J.; Pitt, W.G.

    1992-05-12

    A method for the multiplex sequencing of DNA is disclosed which comprises the electroblotting of specific base terminated DNA fragments, which have been resolved by gel electrophoresis, onto the surface of a neutral non-aromatic polymeric microporous membrane exhibiting low background fluorescence which has been surface modified to contain amino groups. Polypropylene membranes are preferred and the introduction of amino groups is accomplished by subjecting the membrane to radio or microwave frequency plasma discharge in the presence of an aminating agent, preferably ammonia. The membrane, containing physically adsorbed DNA fragments on its surface after the electroblotting, is then treated with crosslinking means such as UV radiation or a glutaraldehyde spray to chemically bind the DNA fragments to the membrane through amino groups contained on the surface. The DNA fragments chemically bound to the membrane are subjected to hybridization probing with a tagged probe specific to the sequence of the DNA fragments. The tagging may be by either fluorophores or radioisotopes. The tagged probes hybridized to the target DNA fragments are detected and read by laser induced fluorescence detection or autoradiograms. The use of aminated low fluorescent background membranes allows the use of fluorescent detection and reading even when the available amount of DNA to be sequenced is small. The DNA bound to the membranes may be reprobed numerous times. No Drawings

  5. Nucleotide sequence of mouse satellite DNA.

    PubMed Central

    Hörz, W; Altenburger, W

    1981-01-01

    The nucleotide sequence of uncloned mouse satellite DNA has been determined by analyzing Sau96I restriction fragments that correspond to the repeat unit of the satellite DNA. An unambiguous sequence of 234 bp has been obtained. The sequence of the first 250 bases from dimeric satellite fragments present in Sau96I limit digests corresponds almost exactly to two tandemly arranged monomer sequences including a complete Sau96I site in the center. This is in agreement with the hypothesis that a low level of divergence which cannot be detected in sequence analyses of uncloned DNA is responsible for the appearance of dimeric fragments. Most of the sequence of the 5% fraction of Sau96 monomers that are susceptible to TaqI has also been determined and has been found to agree completely with the prototype sequence. The monomer sequence is internally repetitious being composed of eight diverged subrepeats. The divergence pattern has interesting implications for theories on the evolution of mouse satellite DNA. PMID:6261227

  6. Sequencing Intractable DNA to Close Microbial Genomes

    SciTech Connect

    Hurt, Jr., Richard Ashley; Brown, Steven D; Podar, Mircea; Palumbo, Anthony Vito; Elias, Dwayne A

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled intractable, resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  7. Intranuclear Anchoring of Repetitive DNA Sequences

    PubMed Central

    Weipoltshammer, Klara; Schöfer, Christian; Almeder, Marlene; Philimonenko, Vlada V.; Frei, Klemens; Wachtler, Franz; Hozák, Pavel

    1999-01-01

    Centromeres, telomeres, and ribosomal gene clusters consist of repetitive DNA sequences. To assess their contributions to the spatial organization of the interphase genome, their interactions with the nucleoskeleton were examined in quiescent and activated human lymphocytes. The nucleoskeletons were prepared using “physiological” conditions. The resulting structures were probed for specific DNA sequences of centromeres, telomeres, and ribosomal genes by in situ hybridization; the electroeluted DNA fractions were examined by blot hybridization. In both nonstimulated and stimulated lymphocytes, centromeric alpha-satellite repeats were almost exclusively found in the eluted fraction, while telomeric sequences remained attached to the nucleoskeleton. Ribosomal genes showed a transcription-dependent attachment pattern: in unstimulated lymphocytes, transcriptionally inactive ribosomal genes located outside the nucleolus were eluted completely. When comparing transcription unit and intergenic spacer, significantly more of the intergenic spacer was removed. In activated lymphocytes, considerable but similar amounts of both rDNA fragments were eluted. The results demonstrate that: (a) the various repetitive DNA sequences differ significantly in their intranuclear anchoring, (b) telomeric rather than centromeric DNA sequences form stable attachments to the nucleoskeleton, and (c) different attachment mechanisms might be responsible for the interaction of ribosomal genes with the nucleoskeleton. PMID:10613900

  8. Nanopore-CMOS Interfaces for DNA Sequencing.

    PubMed

    Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim

    2016-08-06

    DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces.

  9. Osmylated DNA, a novel concept for sequencing DNA using nanopores.

    PubMed

    Kanavarioti, Anastassia

    2015-03-27

    Sanger sequencing has led the advances in molecular biology, while faster and cheaper next generation technologies are urgently needed. A newer approach exploits nanopores, natural or solid-state, set in an electrical field, and obtains base sequence information from current variations due to the passage of a ssDNA molecule through the pore. A hurdle in this approach is the fact that the four bases are chemically comparable to each other which leads to small differences in current obstruction. 'Base calling' becomes even more challenging because most nanopores sense a short sequence and not individual bases. Perhaps sequencing DNA via nanopores would be more manageable, if only the bases were two, and chemically very different from each other; a sequence of 1s and 0s comes to mind. Osmylated DNA comes close to such a sequence of 1s and 0s. Osmylation is the addition of osmium tetroxide bipyridine across the C5-C6 double bond of the pyrimidines. Osmylation adds almost 400% mass to the reactive base, creates a sterically and electronically notably different molecule, labeled 1, compared to the unreactive purines, labeled 0. If osmylated DNA were successfully sequenced, the result would be a sequence of osmylated pyrimidines (1), and purines (0), and not of the actual nucleobases. To solve this problem we studied the osmylation reaction with short oligos and with M13mp18, a long ssDNA, developed a UV-vis assay to measure extent of osmylation, and designed two protocols. Protocol A uses mild conditions and yields osmylated thymidines (1), while leaving the other three bases (0) practically intact. Protocol B uses harsher conditions and effectively osmylates both pyrimidines, but not the purines. Applying these two protocols also to the complement of the target polynucleotide yields a total of four osmylated strands that collectively could define the actual base sequence of the target DNA.
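    How four binary readouts could jointly define the base sequence can be worked through in a short sketch. It assumes idealized, error-free labeling (Protocol A marks only T, Protocol B marks C and T) and that the complementary strand is read 5' to 3', so its readouts are reversed before alignment. The function names and the example sequence are hypothetical and do not come from the record.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def protocol_a(strand):
    """Idealized Protocol A readout: only thymine is osmylated (1)."""
    return [1 if base == "T" else 0 for base in strand]

def protocol_b(strand):
    """Idealized Protocol B readout: both pyrimidines (C, T) are osmylated."""
    return [1 if base in "CT" else 0 for base in strand]

def reverse_complement(strand):
    return "".join(COMPLEMENT[base] for base in reversed(strand))

def decode(a_target, b_target, a_comp, b_comp):
    """Recover the target bases from the four binary readouts."""
    # The complement is antiparallel, so reverse its readouts to align them.
    a_comp, b_comp = a_comp[::-1], b_comp[::-1]
    bases = []
    for at, bt, ac, bc in zip(a_target, b_target, a_comp, b_comp):
        if at:        # T marked directly by Protocol A
            bases.append("T")
        elif bt:      # pyrimidine that is not T
            bases.append("C")
        elif ac:      # T on the complement pairs with A on the target
            bases.append("A")
        elif bc:      # C on the complement pairs with G on the target
            bases.append("G")
        else:
            raise ValueError("inconsistent readouts")
    return "".join(bases)

target = "GATTACAC"
comp = reverse_complement(target)
decoded = decode(protocol_a(target), protocol_b(target),
                 protocol_a(comp), protocol_b(comp))
assert decoded == target
```

    In this idealized setting three of the four readouts already determine every base; the fourth adds redundancy that could help when labeling is incomplete, a case this sketch does not model.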

  10. Osmylated DNA, a novel concept for sequencing DNA using nanopores

    NASA Astrophysics Data System (ADS)

    Kanavarioti, Anastassia

    2015-03-01

    Sanger sequencing has led the advances in molecular biology, while faster and cheaper next generation technologies are urgently needed. A newer approach exploits nanopores, natural or solid-state, set in an electrical field, and obtains base sequence information from current variations due to the passage of a ssDNA molecule through the pore. A hurdle in this approach is the fact that the four bases are chemically comparable to each other which leads to small differences in current obstruction. ‘Base calling’ becomes even more challenging because most nanopores sense a short sequence and not individual bases. Perhaps sequencing DNA via nanopores would be more manageable, if only the bases were two, and chemically very different from each other; a sequence of 1s and 0s comes to mind. Osmylated DNA comes close to such a sequence of 1s and 0s. Osmylation is the addition of osmium tetroxide bipyridine across the C5-C6 double bond of the pyrimidines. Osmylation adds almost 400% mass to the reactive base, creates a sterically and electronically notably different molecule, labeled 1, compared to the unreactive purines, labeled 0. If osmylated DNA were successfully sequenced, the result would be a sequence of osmylated pyrimidines (1), and purines (0), and not of the actual nucleobases. To solve this problem we studied the osmylation reaction with short oligos and with M13mp18, a long ssDNA, developed a UV-vis assay to measure extent of osmylation, and designed two protocols. Protocol A uses mild conditions and yields osmylated thymidines (1), while leaving the other three bases (0) practically intact. Protocol B uses harsher conditions and effectively osmylates both pyrimidines, but not the purines. Applying these two protocols also to the complement of the target polynucleotide yields a total of four osmylated strands that collectively could define the actual base sequence of the target DNA.

  11. [Characterization and modification of phage T7 DNA polymerase for use in DNA sequencing]: Progress report

    SciTech Connect

    Not Available

    1992-01-01

    This project focuses on the DNA polymerase and accessory proteins of phage T7 for use in DNA sequence analysis. T7 DNA polymerase (gene 5 protein) interacts with accessory proteins for the acquisition of properties such as processivity that are necessary for DNA replication. One goal is to understand these interactions in order to modify the proteins to increase their usefulness with DNA sequence analysis. Using a genetically modified gene 5 protein lacking 3' to 5' exonuclease activity we have found that in the presence of manganese there is no discrimination against dideoxynucleotides, a property that enables novel approaches to DNA sequencing using automated technology. Pyrophosphorolysis can create problems in DNA sequence determination, a problem that can be eliminated by the addition of pyrophosphatase. Crystals of the gene 5 protein/thioredoxin complex have now been obtained and X-ray diffraction analysis will be undertaken once their quality has been improved. Amino acid changes in gene 5 protein have been identified that alter its interaction with thioredoxin. Characterization of these proteins should help determine how thioredoxin confers processivity on polymerization. We have characterized the T7 DNA binding protein, the gene 2.5 protein, and shown that it interacts with gene 5 protein and gene 4 protein. The gene 2.5 protein mediates homologous base pairing and strand uptake. Gene 5.5 protein interacts with E. coli H1 protein and affects gene expression. Biochemical and genetic studies on the T7 56-kDa gene 4 protein, the helicase, are focused on its physical interaction with T7 DNA polymerase and the mechanism by which the hydrolysis of nucleoside triphosphates fuels its unidirectional translocation on DNA.

  12. [Characterization and modification of phage T7 DNA polymerase for use in DNA sequencing]: Progress report

    SciTech Connect

    Not Available

    1992-12-31

    This project focuses on the DNA polymerase and accessory proteins of phage T7 for use in DNA sequence analysis. T7 DNA polymerase (gene 5 protein) interacts with accessory proteins for the acquisition of properties such as processivity that are necessary for DNA replication. One goal is to understand these interactions in order to modify the proteins to increase their usefulness with DNA sequence analysis. Using a genetically modified gene 5 protein lacking 3' to 5' exonuclease activity we have found that in the presence of manganese there is no discrimination against dideoxynucleotides, a property that enables novel approaches to DNA sequencing using automated technology. Pyrophosphorolysis can create problems in DNA sequence determination, a problem that can be eliminated by the addition of pyrophosphatase. Crystals of the gene 5 protein/thioredoxin complex have now been obtained and X-ray diffraction analysis will be undertaken once their quality has been improved. Amino acid changes in gene 5 protein have been identified that alter its interaction with thioredoxin. Characterization of these proteins should help determine how thioredoxin confers processivity on polymerization. We have characterized the T7 DNA binding protein, the gene 2.5 protein, and shown that it interacts with gene 5 protein and gene 4 protein. The gene 2.5 protein mediates homologous base pairing and strand uptake. Gene 5.5 protein interacts with E. coli H1 protein and affects gene expression. Biochemical and genetic studies on the T7 56-kDa gene 4 protein, the helicase, are focused on its physical interaction with T7 DNA polymerase and the mechanism by which the hydrolysis of nucleoside triphosphates fuels its unidirectional translocation on DNA.

  13. Bacterial identification and subtyping using DNA microarray and DNA sequencing.

    PubMed

    Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D

    2012-01-01

    The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies not only will identify, isotype, and serotype pathogenic bacteria, but also it will aid in the discovery of new gene functions by detecting gene expressions in different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing by using 454 pyrosequencing is so cost effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing technology is a simple to use technique that can produce accurate and quantitative analysis of DNA sequences with a great speed. The deposition of massive amounts of bacterial genomic information in databanks is creating fingerprint phylogenetic analysis that will ultimately replace several technologies such as Pulsed Field Gel Electrophoresis. In this chapter, we will review (1) the use of DNA microarray using fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.

  14. Dynamics and control of DNA sequence amplification

    NASA Astrophysics Data System (ADS)

    Marimuthu, Karthikeyan; Chakrabarti, Raj

    2014-10-01

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.

  15. Dynamics and control of DNA sequence amplification

    SciTech Connect

    Marimuthu, Karthikeyan; Chakrabarti, Raj

    2014-10-28

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.

  16. Dynamics and control of DNA sequence amplification.

    PubMed

    Marimuthu, Karthikeyan; Chakrabarti, Raj

    2014-10-28

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.

  17. Quadruplex DNA: sequence, topology and structure

    PubMed Central

    Burge, Sarah; Parkinson, Gary N.; Hazel, Pascale; Todd, Alan K.; Neidle, Stephen

    2006-01-01

    G-quadruplexes are higher-order DNA and RNA structures formed from G-rich sequences that are built around tetrads of hydrogen-bonded guanine bases. Potential quadruplex sequences have been identified in G-rich eukaryotic telomeres, and more recently in non-telomeric genomic DNA, e.g. in nuclease-hypersensitive promoter regions. The natural role and biological validation of these structures is starting to be explored, and there is particular interest in them as targets for therapeutic intervention. This survey focuses on the folding and structural features on quadruplexes formed from telomeric and non-telomeric DNA sequences, and examines fundamental aspects of topology and the emerging relationships with sequence. Emphasis is placed on information from the high-resolution methods of X-ray crystallography and NMR, and their scope and current limitations are discussed. Such information, together with biological insights, will be important for the discovery of drugs targeting quadruplexes from particular genes. PMID:17012276
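    As a rough illustration of how "potential quadruplex sequences" can be located in a G-rich sequence, the sketch below scans one strand with a regular expression. The folding rule it encodes (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) is a commonly used heuristic and is an assumption here; the abstract does not state a search pattern, and the function name `find_pqs` is hypothetical.

```python
import re

# Assumed heuristic (not taken from the abstract): four G-runs of length >= 3,
# separated by loops of 1-7 nucleotides of any type.
PQS = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")


def find_pqs(seq):
    """Return (start, end, matched_text) for non-overlapping hits on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in PQS.finditer(seq)]


# Four copies of the human telomeric repeat contain one candidate motif.
telomeric = "TTAGGG" * 4
print(find_pqs(telomeric))
```

    A match only flags a candidate; whether a sequence actually folds, and into which topology, is the structural question the survey addresses. The complementary strand would need a separate scan for C-rich runs.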

  18. Female-specific DNA sequences in geese.

    PubMed

    Huang, M C; Lin, W C; Horng, Y M; Rouvier, R; Huang, C W

    2003-07-01

    1. The OPAE random primers (Operon Technologies, Inc., CA) were used for random amplified polymorphic DNA (RAPD) fingerprinting in Chinese, White Roman and Landaise geese. One of these primers, OPAE-06, produced a 938-bp sex-specific fragment in all females and in no males of Chinese geese only. 2. A novel female-specific DNA sequence in Chinese goose was cloned and sequenced. Two primers, CGSex-F and CGSex-R, were designed in order to amplify a 912-bp sex-specific polymerase chain reaction (PCR) fragment on genomic DNA from female geese. 3. It was shown that a simple and effective PCR-based sexing technique could be used in the three goose breeds studied. 4. Nucleotide sequencing of the sex-specific fragments in White Roman and Landaise geese was performed and sequence differences were observed among these three breeds.

  19. Automated screening for small organic ligands using DNA-encoded chemical libraries.

    PubMed

    Decurtins, Willy; Wichert, Moreno; Franzini, Raphael M; Buller, Fabian; Stravs, Michael A; Zhang, Yixin; Neri, Dario; Scheuermann, Jörg

    2016-04-01

    DNA-encoded chemical libraries (DECLs) are collections of organic compounds that are individually linked to different oligonucleotides, serving as amplifiable identification barcodes. As all compounds in the library can be identified by their DNA tags, they can be mixed and used in affinity-capture experiments on target proteins of interest. In this protocol, we describe the screening process that allows the identification of the few binding molecules within the multiplicity of library members. First, the automated affinity selection process physically isolates binding library members. Second, the DNA codes of the isolated binders are PCR-amplified and subjected to high-throughput DNA sequencing. Third, the obtained sequencing data are evaluated using a C++ program and the results are displayed using MATLAB software. The resulting selection fingerprints facilitate the discrimination of binding from nonbinding library members. The described procedures allow the identification of small organic ligands to biological targets from a DECL within 10 d.

  20. Compressing DNA sequence databases with coil

    PubMed Central

    White, W Timothy J; Hendy, Michael D

    2008-01-01

    Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work. PMID:18489794
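    To give a feel for why general-purpose compression of flat files leaves room for improvement, the sketch below compares zlib (the Lempel-Ziv family behind gzip) with a naive fixed 2-bits-per-base packing on a synthetic sequence. This is only a baseline comparison under stated assumptions: it is not coil's edit-tree coding, the helper names `pack_2bit` and `unpack_2bit` are hypothetical, and a uniformly random sequence is not representative of real EST data.

```python
import random
import zlib

BASES = "ACGT"
CODE = {base: i for i, base in enumerate(BASES)}


def pack_2bit(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte (last byte zero-padded)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out)


def unpack_2bit(data, length):
    """Invert pack_2bit; `length` is the original number of bases."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


rng = random.Random(0)  # fixed seed so the synthetic sequence is reproducible
seq = "".join(rng.choice(BASES) for _ in range(10_000))
packed = pack_2bit(seq)
deflated = zlib.compress(seq.encode("ascii"), 9)
print(len(seq), len(packed), len(deflated))
```

    Real databases also contain headers, ambiguity codes and many near-duplicate sequences; exploiting similarity between entries is where a database-aware method can plausibly gain over both baselines.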

  1. Quantum-Sequencing: Fast electronic single DNA molecule sequencing

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective, single-molecule sequencing method. Here, we present the first demonstration of unique "electronic fingerprint" of all nucleotides (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic state of the nucleobases shifts depending on the pH, with most distinct states identified at acidic pH. We also demonstrate identification of single nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of beta lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.

  2. Inferring ethnicity from mitochondrial DNA sequence

    PubMed Central

    2011-01-01

    Background The assignment of DNA samples to coarse population groups can be a useful but difficult task. One such example is the inference of coarse ethnic groupings for forensic applications. Ethnicity plays an important role in forensic investigation and can be inferred with the help of genetic markers. Because it is maternally inherited, present in high copy number, and robustly persistent in degraded samples, mitochondrial DNA may be useful for inferring coarse ethnicity. In this study, we compare the performance of methods for inferring ethnicity from the sequence of the hypervariable region of the mitochondrial genome. Results We present the results of comprehensive experiments conducted on datasets extracted from the mtDNA population database, showing that ethnicity inference based on support vector machines (SVM) achieves an overall accuracy of 80-90%, consistently outperforming nearest neighbor and discriminant analysis methods previously proposed in the literature. We also evaluate methods of handling missing data and characterize the most informative segments of the hypervariable region of the mitochondrial genome. Conclusions Support vector machines can be used to infer coarse ethnicity from a small region of mitochondrial DNA sequence with surprisingly high accuracy. In the presence of missing data, utilizing only the regions common to the training sequences and a test sequence proves to be the best strategy. Given these results, SVM algorithms are likely to also be useful in other DNA sequence classification applications. PMID:21554759

  3. Sequencing of long stretches of repetitive DNA

    PubMed Central

    De Bustos, Alfredo; Cuadrado, Angeles; Jouve, Nicolás

    2016-01-01

    Repetitive DNA is widespread in eukaryotic genomes, in some cases making up more than 80% of the total. SSRs are a type of repetitive DNA formed by short motifs repeated in tandem arrays. In some species, SSRs may be organized into long stretches, usually associated with the constitutive heterochromatin. Variation in repeats can alter the expression of genes, and changes in the number of repeats have been linked to certain human diseases. Unfortunately, the molecular characterization of these repeats has been hampered by technical limitations related to cloning and sequencing. Indeed, most sequenced genomes contain gaps owing to repetitive DNA-related assembly difficulties. This paper reports an alternative method for sequencing of long stretches of repetitive DNA based on the combined use of 1) a linear vector to stabilize the cloning process, and 2) the use of exonuclease III for obtaining progressive deletions of SSR-rich fragments. This strategy allowed the sequencing of a fragment containing a stretch of 6.2 kb of continuous SSRs. To demonstrate that this procedure can sequence other kinds of repetitive DNA, it was used to examine a 4.5 kb fragment containing a cluster of 15 repeats of the 5S rRNA gene of barley. PMID:27819354
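    The definition of SSRs given above (short motifs repeated in tandem arrays) can be made concrete with a small scanner. This is a minimal sketch, assuming motifs of 1 to 6 bases and at least four tandem copies; both thresholds and the function name `find_ssrs` are illustrative choices, not values from the paper.

```python
import re


def find_ssrs(seq, max_motif=6, min_copies=4):
    """Return (start, motif, copies) for each tandem array of a short motif."""
    # Lazy motif group so the shortest repeating unit is preferred,
    # followed by at least (min_copies - 1) further copies of it.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_copies - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


print(find_ssrs("GGT" + "AC" * 5 + "TTG"))
```

    A scanner like this only works on sequence that has already been read correctly; the cloning and assembly difficulties described in the abstract arise before this step.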

  4. Assessing graphene nanopores for sequencing DNA.

    PubMed

    Wells, David B; Belkin, Maxim; Comer, Jeffrey; Aksimentiev, Aleksei

    2012-08-08

    Using all-atom molecular dynamics and atomic-resolution Brownian dynamics, we simulate the translocation of single-stranded DNA through graphene nanopores and characterize the ionic current blockades produced by DNA nucleotides. We find that transport of single DNA strands through graphene nanopores may occur in single nucleotide steps. For certain pore geometries, hydrophobic interactions with the graphene membrane lead to a dramatic reduction in the conformational fluctuations of the nucleotides in the nanopores. Furthermore, we show that ionic current blockades produced by different DNA nucleotides are, in general, indicative of the nucleotide type, but very sensitive to the orientation of the nucleotides in the nanopore. Taken together, our simulations suggest that strand sequencing of DNA by measuring the ionic current blockades in graphene nanopores may be possible, given that the conformation of DNA nucleotides in the nanopore can be controlled through precise engineering of the nanopore surface.

  5. Sequencing mitochondrial DNA polymorphisms by hybridization

    SciTech Connect

    Chee, M.S.; Lockhart, D.J.; Hubbell, E.

    1994-09-01

    We have investigated the use of DNA chips for genetic analysis, using human mitochondrial DNA (mtDNA) as a model. The DNA chips are made up of ordered arrays of DNA oligonucleotide probes, synthesized on a glass substrate using photolithographic techniques. The synthesis site for each different probe is specifically addressed by illumination of the substrate through a photolithographic mask, achieving selective deprotection. Nucleoside phosphoramidites bearing photolabile protecting groups are coupled only to exposed sites. Repeated cycles of deprotection and coupling generate all the probes in parallel. The set of 4^N N-mer probes can be synthesized in only 4N steps. Any subset can be synthesized in 4N or fewer steps. Sequences amplified from the D-loop region of human mitochondrial DNA (mtDNA) were fluorescently labelled and hybridized to DNA chips containing probes specific for mtDNA. Each nucleotide of a 1.3 kb region spanning the D loop is represented by four probes on the chip. Each probe has a different base at the position of interest: together they comprise a set of A, C, G and T probes which are otherwise identical. In principle, only one probe-target hybrid will be a perfect match. The other three will be single base mismatches. Fluorescence imaging of the hybridized chip allows quantification of hybridization signals. Heterozygous mixtures of sequences can also be characterized. We have developed software to quantitate and interpret the hybridization signals, and to call the sequence automatically. Results of sequence analysis of human mtDNAs will be presented.
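    The four-probes-per-nucleotide design described above can be sketched as follows. This is an illustrative reconstruction, not the authors' software: the probe length, the function names `tiling_probes` and `call_base`, and the simple strongest-signal rule are assumptions, and probes are written in the sense of the reference rather than as its complement to keep the example short.

```python
BASES = "ACGT"


def tiling_probes(reference, probe_len=15):
    """Yield (position, {base: probe}) for every position with full flanks.

    probe_len is assumed to be odd so the interrogated base sits in the centre.
    """
    half = probe_len // 2
    for pos in range(half, len(reference) - half):
        window = reference[pos - half:pos + half + 1]
        # Four probes, identical except for the central base.
        probes = {b: window[:half] + b + window[half + 1:] for b in BASES}
        yield pos, probes


def call_base(intensities):
    """Call the base whose probe gave the strongest hybridization signal."""
    return max(intensities, key=intensities.get)


reference = "ACGTACGTACGTACGTACGT"
tiles = dict(tiling_probes(reference, probe_len=5))
print(len(tiles), tiles[2])
print(call_base({"A": 0.1, "C": 0.9, "G": 0.2, "T": 0.1}))
```

    A region of L bases interrogated this way needs roughly 4L probes, far fewer than the 4^N probes of a complete N-mer set. A heterozygous mixture would show two comparable signals at one position, which a plain maximum does not capture.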

  6. DNA sequencing by synthesis based on elongation delay detection

    NASA Astrophysics Data System (ADS)

    Manturov, Alexey O.; Grigoryev, Anton V.

    2015-03-01

    One of the most important problems in modern genetics, biology and medicine is the determination of the primary nucleotide sequence of the DNA of living organisms (DNA sequencing). This paper describes a label-free DNA sequencing approach based on observing the discrete dynamics of the DNA elongation phase. The proposed sequencing principle is studied by numerical simulation. The numerical model is based on a cellular automaton that simulates the elongation stage (growth of DNA strands) and the dynamics of nucleotide incorporation into the growing DNA strand. Estimates were obtained for the number of copied DNA sequences needed to reach a required probability of detecting nucleotide incorporation events and of determining the DNA sequence correctly. The proposed approach can be applied to all known DNA sequencing devices that operate on the "sequencing by synthesis" principle.

  7. Automated genomic DNA extraction from saliva using the QIAxtractor.

    PubMed

    Keijzer, Henry; Endenburg, Silvia C; Smits, Marcel G; Koopmann, Miriam

    2010-05-01

    Venipuncture is an invasive way of obtaining whole blood as a source of high-quality genomic DNA in sufficient amounts. Patients, medical doctors and researchers prefer obtaining DNA from non-invasive sources. Saliva collected with cotton swabs (Salivette) is increasingly being used to study chemical compounds, and it can also be a source of DNA. However, extracting DNA from Salivettes is very laborious and time consuming. Therefore, we developed a protocol for automated genomic DNA extraction from saliva collected in Salivette using the QIAxtractor. Saliva (0.1-2.0 mL) was collected by chewing on a Salivette for 1-2 min. A total of 70 samples, collected from healthy volunteers, were extracted with the QIAxtractor robot and a Qiagen DX reagent pack. Quantity and quality were assessed using UV spectrometry and real-time polymerase chain reaction (PCR) (substitution at position -729 in the CYP1A2 gene). The average DNA concentration from the saliva samples was 6.0 µg/mL (95% CI 5.4-6.6 µg/mL). In 100% of the saliva samples, PCR products were detected with an average cycle threshold of 23.1 (95% CI 22.6-23.6). DNA can be extracted in sufficient amounts from Salivette with a fully automated system with a short turnaround time. Real-time PCR can be performed with these samples.

  8. Unzipping of DNA with correlated base sequence.

    PubMed

    Allahverdyan, A E; Gevorkian, Zh S; Hu, Chin-Kun; Wu, Ming-Chya

    2004-06-01

    We consider force-induced unzipping transition for a heterogeneous DNA model with a correlated base sequence. Both finite-range and long-range correlated situations are considered. It is shown that finite-range correlations increase stability of DNA with respect to the external unzipping force. Due to long-range correlations the number of unzipped base pairs displays two widely different scenarios depending on the details of the base sequence: either there is no unzipping phase transition at all, or the transition is realized via a sequence of jumps with magnitude comparable to the size of the system. Both scenarios are different from the behavior of the average number of unzipped base pairs (non-self-averaging). The results can be relevant for explaining the biological purpose of correlated structures in DNA.

  9. Automated Antibody De Novo Sequencing and Its Utility in Biopharmaceutical Discovery

    NASA Astrophysics Data System (ADS)

    Sen, K. Ilker; Tang, Wilfred H.; Nayak, Shruti; Kil, Yong J.; Bern, Marshall; Ozoglu, Berk; Ueberheide, Beatrix; Davis, Darryl; Becker, Christopher

    2017-05-01

    Applications of antibody de novo sequencing in the biopharmaceutical industry range from the discovery of new antibody drug candidates to identifying reagents for research and determining the primary structure of innovator products for biosimilar development. When murine, phage display, or patient-derived monoclonal antibodies against a target of interest are available, but the cDNA or the original cell line is not, de novo protein sequencing is required to humanize and recombinantly express these antibodies, followed by in vitro and in vivo testing for functional validation. Availability of fully automated software tools for monoclonal antibody de novo sequencing enables efficient and routine analysis. Here, we present a novel method to automatically de novo sequence antibodies using mass spectrometry and the Supernovo software. The robustness of the algorithm is demonstrated through a series of stress tests.

  10. Automated Antibody De Novo Sequencing and Its Utility in Biopharmaceutical Discovery

    NASA Astrophysics Data System (ADS)

    Sen, K. Ilker; Tang, Wilfred H.; Nayak, Shruti; Kil, Yong J.; Bern, Marshall; Ozoglu, Berk; Ueberheide, Beatrix; Davis, Darryl; Becker, Christopher

    2017-01-01

    Applications of antibody de novo sequencing in the biopharmaceutical industry range from the discovery of new antibody drug candidates to identifying reagents for research and determining the primary structure of innovator products for biosimilar development. When murine, phage display, or patient-derived monoclonal antibodies against a target of interest are available, but the cDNA or the original cell line is not, de novo protein sequencing is required to humanize and recombinantly express these antibodies, followed by in vitro and in vivo testing for functional validation. Availability of fully automated software tools for monoclonal antibody de novo sequencing enables efficient and routine analysis. Here, we present a novel method to automatically de novo sequence antibodies using mass spectrometry and the Supernovo software. The robustness of the algorithm is demonstrated through a series of stress tests.

  11. Statistical and linguistic features of DNA sequences

    NASA Technical Reports Server (NTRS)

    Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
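    Detrended Fluctuation Analysis, named above, can be sketched in a few lines. The version below is a minimal first-order variant under stated assumptions: the sequence is mapped to +1/-1 steps by a purine/pyrimidine rule, the steps are summed into a walk, a straight line is fitted and subtracted inside each box, and the exponent is the slope of log F(n) against log n. The box sizes, the mapping and all function names are illustrative choices, not details taken from the abstract.

```python
import math
import random


def purine_pyrimidine_steps(seq):
    """Assumed mapping: purines (A, G) -> +1, pyrimidines (C, T) -> -1."""
    return [1 if base in "AG" else -1 for base in seq.upper()]


def dfa_fluctuation(steps, box):
    """Root-mean-square fluctuation F(box) of the locally detrended walk."""
    profile, total = [], 0.0
    for s in steps:  # integrate the steps into a walk
        total += s
        profile.append(total)
    n_boxes = len(profile) // box
    xs = range(box)
    x_mean = (box - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sq_sum = 0.0
    for b in range(n_boxes):
        ys = profile[b * box:(b + 1) * box]
        y_mean = sum(ys) / box
        slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
        # Squared residuals about the least-squares line fitted in this box.
        sq_sum += sum((y - y_mean - slope * (x - x_mean)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sq_sum / (n_boxes * box))


def dfa_exponent(steps, boxes=(8, 16, 32, 64, 128)):
    """Least-squares slope of log F(n) versus log n."""
    lx = [math.log(n) for n in boxes]
    ly = [math.log(dfa_fluctuation(steps, n)) for n in boxes]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    return sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / sum((a - mx) ** 2 for a in lx)


rng = random.Random(1)
random_seq = "".join(rng.choice("ACGT") for _ in range(8192))
alpha_random = dfa_exponent(purine_pyrimidine_steps(random_seq))
alpha_alternating = dfa_exponent([1, -1] * 512)
print(round(alpha_random, 2), round(alpha_alternating, 2))
```

    An uncorrelated sequence should give an exponent near 0.5, while long-range power-law correlations of the kind reported for noncoding regions would show up as an exponent above 0.5. The strictly alternating walk is included only as a sanity check with an exponent near zero.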

  12. Statistical and linguistic features of DNA sequences

    NASA Technical Reports Server (NTRS)

    Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.

  13. DNA sequencing technology, walking with modular primers. Final report

    SciTech Connect

    Ulanovsky, L.

    1996-12-31

    The success of the Human Genome Project depends on the development of adequate technology for rapid and inexpensive DNA sequencing, which will also benefit biomedical research in general. The authors are working on DNA technologies that eliminate primer synthesis, the main bottleneck in sequencing by primer walking. They have developed modular primers that are assembled from three 5-mer, 6-mer or 7-mer modules selected from a presynthesized library of as few as 1,000 oligonucleotides ({double_bond}4, {double_bond}5, {double_bond}7). The three modules anneal contiguously at the selected template site and prime there uniquely, even though each is not unique for the most part when used alone. This technique is expected to speed up primer walking 30 to 50 fold, and reduce the sequencing cost by a factor of 5 to 15. Time and expense will be saved on primer synthesis itself and even more so due to closed-loop automation of primer walking, made possible by the instant availability of primers. Apart from saving time and cost, closed-loop automation would also minimize the errors and complications associated with human intervention between the walks. The author has also developed two additional approaches to primer-library based sequencing. One involves a branched structure of modular primers which has a distinctly different mechanism of achieving priming specificity. The other introduces the concept of "Differential Extension with Nucleotide Subsets" as an approach increasing priming specificity, priming strength and allowing cycle sequencing. These approaches are expected to be more robust than the original version of the modular primer technique.

  14. DNA methylation detection: bisulfite genomic sequencing analysis.

    PubMed

    Li, Yuanyuan; Tollefsbol, Trygve O

    2011-01-01

    DNA methylation, which most commonly occurs at the C5 position of cytosines within CpG dinucleotides, plays a pivotal role in many biological procedures such as gene expression, embryonic development, cellular proliferation, differentiation, and chromosome stability. Aberrant DNA methylation is often associated with loss of DNA homeostasis and genomic instability leading to the development of human diseases such as cancer. The importance of DNA methylation creates an urgent demand for effective methods with high sensitivity and reliability to explore innovative diagnostic and therapeutic strategies. Bisulfite genomic sequencing developed by Frommer and colleagues was recognized as a revolution in DNA methylation analysis based on conversion of genomic DNA by using sodium bisulfite. Besides various merits of the bisulfite genomic sequencing method such as being highly qualitative and quantitative, it serves as a fundamental principle to many derived methods to better interpret the mystery of DNA methylation. Here, we present a protocol currently frequently used in our laboratory that has proven to yield optimal outcomes. We also discuss the potential technical problems and troubleshooting notes for a variety of applications in this field.
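    The readout principle behind bisulfite sequencing can be illustrated in silico. The sketch relies on the standard chemistry, which the abstract does not spell out: bisulfite treatment converts unmethylated cytosine to uracil, which is read as thymine after amplification, while 5-methylcytosine is left unchanged. The function names and the toy sequence are hypothetical, and only the treated strand is modelled.

```python
def bisulfite_convert(seq, methylated_positions=()):
    """Simulate conversion: unmethylated C reads as T, methylated C stays C."""
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def call_methylation(reference, converted_read):
    """Return {position: is_methylated} for every reference C with a C or T readout."""
    calls = {}
    for i, (ref, obs) in enumerate(zip(reference.upper(), converted_read.upper())):
        if ref == "C" and obs in "CT":
            calls[i] = obs == "C"
    return calls


reference = "ACGTCGAC"
read = bisulfite_convert(reference, methylated_positions=[1])
print(read, call_methylation(reference, read))
```

    In practice many reads are compared per position, so methylation is reported as a fraction rather than a yes/no call, and incomplete conversion has to be controlled for.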

  15. A microchannel electrophoresis DNA sequencing system

    SciTech Connect

    Madabhushi, R S; Warth, T; Balch, J W; Bass, M; Brewer, L R; Copeland, A C; Davidson, J C; Fitch, J P; Kegelmeyer, L M; Kimbrough, J R; McCready, P; Nelson, D; Pastrone, R L; Richardson, P M; Swierkowski, S P; Tarte, L A; Vainer, M

    1999-01-01

    In order to increase the DNA sequencing throughput of the Joint Genome Institute, we have developed a microchannel electrophoresis system. The critical new and unique elements of this system include 1) a process for the production of arrays of 96 and 384 microchannels on bonded glass substrates up to 14 x 58 cm and 2) new sieving media for high resolution and high speed separations. With custom fabrication apparatus, microchannels are etched in a borosilicate substrate, and then fusion bonded to a top substrate 1.1 mm thick that has access holes formed in it. SEM examination shows a typical microchannel to be 40 micrometers deep x 180 micrometers wide by 46 cm long. This technology offers significant advantages over discrete capillaries or conventional slab-gel approaches. High throughput DNA sequencing with over 550 base pairs resolution has been achieved in roughly half the time of conventional sequencers. In February 1999, we begin a pre-production evaluation protocol for the microchannel and for three glass capillary electrophoresis systems (two from industry and one developed by Lawrence Berkeley National Laboratory for the Joint Genome Institute). In order to utilize these instruments for DNA production sequencing, we have been evaluating and implementing software to convert raw electropherograms into called DNA bases with an associated probability of error. Our original intent was to utilize the DNA base calling software known as Plan and Phred developed by the University of Washington. This software has been outstanding for our slab gel electrophoresis systems currently in the production facility. In our tests and evaluations of this software applied to microchannel data, we observed that the electropherograms are of a different statistical and underlying signal structure compared to slab gels. Even with substantial modifications to the software, base calling performance was not satisfactory for the microchannel data. In this paper, we will present o The
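    The "associated probability of error" attached to each called base is conventionally reported on the Phred quality scale, Q = -10 log10(P). The sketch below shows that conversion and the expected number of errors in a read; it illustrates the scale only, not the base-calling software discussed in the abstract, and the function names are hypothetical.

```python
import math


def error_prob_to_quality(p):
    """Phred-scale quality for an error probability p (0 < p <= 1)."""
    return -10.0 * math.log10(p)


def quality_to_error_prob(q):
    """Error probability implied by a Phred-scale quality value q."""
    return 10.0 ** (-q / 10.0)


def expected_errors(qualities):
    """Expected number of miscalled bases in a read with the given qualities."""
    return sum(quality_to_error_prob(q) for q in qualities)


print(error_prob_to_quality(0.001), quality_to_error_prob(20), expected_errors([10, 20, 30]))
```

    Such values are meaningful only if the caller is calibrated for the instrument producing the traces, which is consistent with the difficulty reported above when slab-gel software was applied to microchannel data.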

  16. Automated N-Glycosylation Sequencing Of Biopharmaceuticals By Capillary Electrophoresis.

    PubMed

    Szigeti, Marton; Guttman, Andras

    2017-09-15

    Comprehensive analysis of the N-linked carbohydrates of glycoproteins is attracting growing interest in both the biopharmaceutical and biomedical fields. In addition to high resolution glycosylation profiling, sugar residue and linkage specific enzymes are also routinely used for exoglycosidase digestion based carbohydrate sequencing. The latter, albeit introduced decades ago, is still mostly practiced by following tedious and time-consuming manual processes. In this paper we introduce an automated carbohydrate sequencing approach using the appropriate exoglycosidase enzymes in conjunction with the utilization of some of the features of a capillary electrophoresis (CE) instrument to speed up the process. The enzymatic reactions were accomplished within the temperature controlled sample storage compartment of a capillary electrophoresis unit and the separation capillary was also utilized for accurate delivery of the exoglycosidase enzymes. CE analysis was conducted after each digestion step, obtaining in this way the sequence information of N-glycans in 60 and 128 minutes using the semi- and the fully-automated methods, respectively.

  17. The DNA sequence specificity of bleomycin cleavage in a systematically altered DNA sequence.

    PubMed

    Gautam, Shweta D; Chen, Jon K; Murray, Vincent

    2017-08-01

    Bleomycin is an anti-tumour agent that is clinically used to treat several types of cancers. Bleomycin cleaves DNA at specific DNA sequences and recent genome-wide DNA sequencing specificity data indicated that the sequence 5'-RTGT*AY (where T* is the site of bleomycin cleavage, R is G/A and Y is T/C) is preferentially cleaved by bleomycin in human cells. Based on this DNA sequence, we constructed a plasmid clone to explore this bleomycin cleavage preference. By systematic variation of single nucleotides in the 5'-RTGT*AY sequence, we were able to investigate the effect of nucleotide changes on bleomycin cleavage efficiency. We observed that the preferred consensus DNA sequence for bleomycin cleavage in the plasmid clone was 5'-YYGT*AW (where W is A/T). The most highly cleaved sequence was 5'-TCGT*AT and, in fact, the seven most highly cleaved sequences conformed to the consensus sequence 5'-YYGT*AW. A comparison with genome-wide results was also performed and while the core sequence was similar in both environments, the surrounding nucleotides were different.
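
    The degenerate symbols used in the abstract (R = G/A, Y = T/C, W = A/T) are standard IUPAC nucleotide codes, so checking whether a site conforms to a consensus reduces to a pattern match. The Python sketch below illustrates this. The function names are assumptions, the cleavage marker (*) is dropped from the site strings, and only the codes mentioned in the abstract are included.

```python
import re

# IUPAC codes used in the abstract, mapped to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",  # purine
    "Y": "[CT]",  # pyrimidine
    "W": "[AT]",  # weak (A or T)
}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Translate a consensus such as 'YYGTAW' into a compiled regex."""
    return re.compile("".join(IUPAC[symbol] for symbol in consensus))


def conforms(site: str, consensus: str) -> bool:
    """Return True if the whole site matches the degenerate consensus."""
    return consensus_to_regex(consensus).fullmatch(site) is not None
```

    With these definitions, the most highly cleaved plasmid sequence, TCGTAT, conforms to YYGTAW but not to the genome-wide RTGTAY, because its first base (T) is not a purine.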

  18. New Stopping Criteria for Segmenting DNA Sequences

    NASA Astrophysics Data System (ADS)

    Li, Wentian

    2001-06-01

    We propose a solution to the stopping-criterion problem in segmenting inhomogeneous DNA sequences with complex statistical patterns. This new stopping criterion is based on the Bayesian information criterion in the model selection framework. When this criterion is applied to telomere of S. cerevisiae and the complete sequence of E. coli, borders of biologically meaningful units were identified, and a more reasonable number of domains was obtained. We also introduce a measure called segmentation strength which can be used to control the delineation of large domains. The relationship between the average domain size and the threshold of segmentation strength is determined for several genome sequences.
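
    One way to read a BIC-based stopping rule is sketched below in Python. It is a minimal illustration under stated assumptions, not the paper's exact formulation: each segment is modeled as independent draws from its own base composition, a split is assumed to add four free parameters (three composition parameters plus the breakpoint), and the split is accepted only if twice the log-likelihood gain exceeds the BIC penalty. All function names are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(seq: str) -> float:
    """Maximized multinomial log-likelihood (in nats) of the bases in seq."""
    n = len(seq)
    return sum(c * math.log(c / n) for c in Counter(seq).values())


def best_split(seq: str):
    """Return (gain, position) for the split with the largest likelihood gain."""
    whole = log_likelihood(seq)
    best_gain, best_pos = 0.0, None
    for i in range(1, len(seq)):
        gain = log_likelihood(seq[:i]) + log_likelihood(seq[i:]) - whole
        if gain > best_gain:
            best_gain, best_pos = gain, i
    return best_gain, best_pos


def bic_accepts_split(seq: str, extra_params: int = 4):
    """Accept the best split only if it lowers BIC (assumed parameter count)."""
    if len(seq) < 2:
        return False, None
    gain, pos = best_split(seq)
    penalty = extra_params * math.log(len(seq))
    return 2.0 * gain > penalty, pos
```

    Applied recursively to each accepted segment, a rule of this kind stops on its own once no remaining split pays for its extra parameters. The brute-force scan over split positions is quadratic and is meant only for short illustrative sequences.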

  19. New Stopping Criteria for Segmenting DNA Sequences

    SciTech Connect

    Li, Wentian

    2001-06-18

    We propose a solution to the stopping-criterion problem in segmenting inhomogeneous DNA sequences with complex statistical patterns. This new stopping criterion is based on the Bayesian information criterion in the model selection framework. When this criterion is applied to telomere of S. cerevisiae and the complete sequence of E. coli, borders of biologically meaningful units were identified, and a more reasonable number of domains was obtained. We also introduce a measure called segmentation strength which can be used to control the delineation of large domains. The relationship between the average domain size and the threshold of segmentation strength is determined for several genome sequences.

  20. DNA Sequence Alignment during Homologous Recombination

    PubMed Central

    Greene, Eric C.

    2016-01-01

    Homologous recombination allows for the regulated exchange of genetic information between two different DNA molecules of identical or nearly identical sequence composition, and is a major pathway for the repair of double-stranded DNA breaks. A key facet of homologous recombination is the ability of recombination proteins to perfectly align the damaged DNA with homologous sequence located elsewhere in the genome. This reaction is referred to as the homology search and is akin to the target searches conducted by many different DNA-binding proteins. Here I briefly highlight early investigations into the homology search mechanism, and then describe more recent research. Based on these studies, I summarize a model that includes a combination of intersegmental transfer, short-distance one-dimensional sliding, and length-specific microhomology recognition to efficiently align DNA sequences during the homology search. I also suggest some future directions to help further our understanding of the homology search. Where appropriate, I direct the reader to other recent reviews describing various issues related to homologous recombination. PMID:27129270

  1. DNA Sequence Alignment during Homologous Recombination.

    PubMed

    Greene, Eric C

    2016-05-27

    Homologous recombination allows for the regulated exchange of genetic information between two different DNA molecules of identical or nearly identical sequence composition, and is a major pathway for the repair of double-stranded DNA breaks. A key facet of homologous recombination is the ability of recombination proteins to perfectly align the damaged DNA with homologous sequence located elsewhere in the genome. This reaction is referred to as the homology search and is akin to the target searches conducted by many different DNA-binding proteins. Here I briefly highlight early investigations into the homology search mechanism, and then describe more recent research. Based on these studies, I summarize a model that includes a combination of intersegmental transfer, short-distance one-dimensional sliding, and length-specific microhomology recognition to efficiently align DNA sequences during the homology search. I also suggest some future directions to help further our understanding of the homology search. Where appropriate, I direct the reader to other recent reviews describing various issues related to homologous recombination. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.

  2. Automated amplicon design suitable for analysis of DNA variants by melting techniques.

    PubMed

    Ekstrøm, Per Olaf; Nakken, Sigve; Johansen, Morten; Hovig, Eivind

    2015-11-11

    DNA analysis technology has developed tremendously in recent years, and present deep sequencing techniques offer unprecedented opportunities for detailed and high-throughput DNA variant detection. Although the cost of DNA sequencing per base pair analyzed has decreased exponentially, focused and target-specific methods are still much in use for analysis of DNA variants. With increasing capacity in the analytical procedures, an equal demand for automated amplicon and primer design has emerged. We have constructed a web-based tool that is able to batch design DNA variant assays suitable for analysis by denaturing gel/capillary electrophoresis and high resolution melting. The tool is developed as a computational workflow that implements one of the most widely used primer design tools, followed by validation of primer specificity, as well as calculation and visualization of the melting properties of the resulting amplicon, with or without an artificial high melting domain attached. The tool will be useful for scientists applying DNA melting techniques in analysis of DNA variations. The tool is freely available at http://meltprimer.ous-research.no/. Herein, we demonstrate a novel tool that covers the whole amplicon design workflow necessary for groups that use melting equilibrium techniques to separate DNA variants.

  3. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera which gives a decrease in signal.

  4. The first determination of DNA sequence of a specific gene.

    PubMed

    Inouye, Masayori

    2016-05-10

    How and when was the first DNA sequence of a gene determined? In 1977, F. Sanger came up with an innovative technology to sequence DNA by using chain terminators, and determined the entire DNA sequence of the 5375-base genome of bacteriophage φX 174 (Sanger et al., 1977). While Sanger's achievement has been recognized as the first DNA sequencing of genes, we had determined the DNA sequence of a gene, albeit a partial sequence, 11 years before Sanger's DNA sequence (Okada et al., 1966).

  5. Imaging of DNA sequences with chemiluminescence.

    PubMed Central

    Tizard, R; Cate, R L; Ramachandran, K L; Wysk, M; Voyta, J C; Murphy, O J; Bronstein, I

    1990-01-01

    We have coupled a chemiluminescent detection method that uses an alkaline phosphatase label to the genomic DNA sequencing protocol of Church and Gilbert [Church, G. M. & Gilbert, W. (1984) Proc. Natl. Acad. Sci. USA 81, 1991-1995]. Images of sequence ladders are obtained on x-ray film with exposure times of less than 30 min, as compared to 40 h required for a similar exposure with a 32P-labeled oligomer. Chemically cleaved DNA from a sequencing gel is transferred to a nylon membrane, and specific sequence ladders are selected by hybridization to DNA oligonucleotides labeled with alkaline phosphatase or with biotin, leading directly or indirectly to deposition of enzyme. If a biotinylated probe is used, an incubation with avidin-alkaline phosphatase conjugate follows. The membrane is soaked in the chemiluminescent substrate (AMPPD) and is exposed to film. Dephosphorylation of AMPPD leads in a two-step pathway to a highly localized emission of visible light. The demonstrated shorter exposure times may improve the efficiency of a serial reprobing strategy such as the multiplex sequencing approach of Church and Kieffer-Higgins [Church, G. M. & Kieffer-Higgins, S. (1988) Science 240, 185-188]. PMID:2191292

  6. DNA sequencing by nanopores: advances and challenges

    NASA Astrophysics Data System (ADS)

    Agah, Shaghayegh; Zheng, Ming; Pasquali, Matteo; Kolomeisky, Anatoly B.

    2016-10-01

    Developing inexpensive and simple DNA sequencing methods capable of detecting entire genomes in short periods of time could revolutionize the world of medicine and technology. It will also lead to major advances in our understanding of fundamental biological processes. It has been shown that nanopores have the ability of single-molecule sensing of various biological molecules rapidly and at a low cost. This has stimulated significant experimental efforts in developing DNA sequencing techniques by utilizing biological and artificial nanopores. In this review, we discuss recent progress in the nanopore sequencing field with a focus on the nature of nanopores and on sensing mechanisms during the translocation. Current challenges and alternative methods are also discussed.

  7. Repetitive DNA sequences in Mycoplasma pneumoniae.

    PubMed Central

    Wenzel, R; Herrmann, R

    1988-01-01

    Two different types of repetitive DNA sequences, called RepMP1 and RepMP2, were identified in the genome of Mycoplasma pneumoniae. The number of these repeated elements, their nucleotide sequence and their localization on a physical map of the M. pneumoniae genome were determined. The results show that RepMP1 appears at least 10 times and RepMP2 at least 8 times in the genome. The repeated elements are dispersed on the chromosome and, in three cases, linked to each other by a homologous DNA sequence of 400 bp. The elements themselves are 300 bp (for RepMP1) and 150 bp (for RepMP2) long, showing a high degree of homology. One copy of RepMP2 is a translated part of the gene for the major cytadhesin protein P1 which is responsible for the adsorption of M. pneumoniae to its host cell. PMID:3138660

  8. Nanopore-CMOS Interfaces for DNA Sequencing

    PubMed Central

    Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim

    2016-01-01

    DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces. PMID:27509529

  9. mtDNAprofiler: a Web application for the nomenclature and comparison of human mitochondrial DNA sequences.

    PubMed

    Yang, In Seok; Lee, Hwan Young; Yang, Woo Ick; Shin, Kyoung-Jin

    2013-07-01

    Mitochondrial DNA (mtDNA) is a valuable tool in the fields of forensic, population, and medical genetics. However, recording and comparing mtDNA control region or entire genome sequences would be difficult if researchers are not familiar with mtDNA nomenclature conventions. Therefore, mtDNAprofiler, a Web application, was designed for the analysis and comparison of mtDNA sequences in a string format or as a list of mtDNA single-nucleotide polymorphisms (mtSNPs). mtDNAprofiler which comprises four mtDNA sequence-analysis tools (mtDNA nomenclature, mtDNA assembly, mtSNP conversion, and mtSNP concordance-check) supports not only the accurate analysis of mtDNA sequences via an automated nomenclature function, but also consistent management of mtSNP data via direct comparison and validity-check functions. Since mtDNAprofiler consists of four tools that are associated with key steps of mtDNA sequence analysis, mtDNAprofiler will be helpful for researchers working with mtDNA. mtDNAprofiler is freely available at http://mtprofiler.yonsei.ac.kr.

  10. Sequence-Dependent Persistence Lengths of DNA.

    PubMed

    Mitchell, Jonathan S; Glowacki, Jaroslaw; Grandchamp, Alexandre E; Manning, Robert S; Maddocks, John H

    2017-04-11

    A Monte Carlo code applied to the cgDNA coarse-grain rigid-base model of B-form double-stranded DNA is used to predict a sequence-averaged persistence length of lF = 53.5 nm in the sense of Flory, and of lp = 160 bp or 53.5 nm in the sense of apparent tangent-tangent correlation decay. These estimates are slightly higher than the consensus experimental values of 150 bp or 50 nm, but we believe the agreement to be good given that the cgDNA model is itself parametrized from molecular dynamics simulations of short fragments of length 10-20 bp, with no explicit fit to persistence length. Our Monte Carlo simulations further predict that there can be substantial dependence of persistence lengths on the specific sequence [Formula: see text] of a fragment. We propose, and confirm the numerical accuracy of, a simple factorization that separates the part of the apparent tangent-tangent correlation decay [Formula: see text] attributable to intrinsic shape, from a part [Formula: see text] attributable purely to stiffness, i.e., a sequence-dependent version of what has been called sequence-averaged dynamic persistence length l̅d (=58.8 nm within the cgDNA model). For ensembles of both random and λ-phage fragments, the apparent persistence length [Formula: see text] has a standard deviation of 4 nm over sequence, whereas our dynamic persistence length [Formula: see text] has a standard deviation of only 1 nm. However, there are notable dynamic persistence length outliers, including poly(A) (exceptionally straight and stiff), poly(TA) (tightly coiled and exceptionally soft), and phased A-tract sequence motifs (exceptionally bent and stiff). The results of our numerical simulations agree reasonably well with both molecular dynamics simulation and diverse experimental data including minicircle cyclization rates and stereo cryo-electron microscopy images.

  11. Programmable and automated bead-based microfluidics for versatile DNA microarrays under isothermal conditions.

    PubMed

    Penchovsky, Robert

    2013-06-21

    Advances in modern genomic research depend heavily on applications of various devices for automated high- or ultra-throughput arrays. Micro- and nanofluidics offer possibilities for miniaturization and integration of many different arrays onto a single device. Therefore, such devices are becoming a platform of choice for developing analytical instruments for modern biotechnology. This paper presents an implementation of a bead-based microfluidic platform for fully automated and programmable DNA microarrays. The devices are designed to work under isothermal conditions as DNA immobilization and hybridization transfer are performed under steady temperature using reversible pH alterations of reaction solutions. This offers the possibility for integration of more selection modules onto a single chip compared to maintaining a temperature gradient. This novel technology allows integration of many modules on a single reusable chip, reducing the application cost. The method takes advantage of demonstrated high-speed DNA hybridization kinetics and denaturation on beads under flow conditions, the high fidelity of DNA hybridization, and the small sample volumes needed. The microfluidic devices are applied for a single nucleotide polymorphism analysis and DNA sequencing by synthesis without the need for a fluorescent removal step. Apart from that, the microfluidic platform presented is applicable to many areas of modern biotechnology, including biosensor devices, DNA hybridization microarrays, molecular computation, on-chip nucleic acid selection, and high-throughput screening of chemical libraries for drug discovery.

  12. megasat: automated inference of microsatellite genotypes from sequence data.

    PubMed

    Zhan, Luyao; Paterson, Ian G; Fraser, Bonnie A; Watson, Beth; Bradbury, Ian R; Nadukkalam Ravindran, Praveen; Reznick, David; Beiko, Robert G; Bentzen, Paul

    2017-03-01

    megasat is software that enables genotyping of microsatellite loci using next-generation sequencing data. Microsatellites are amplified in large multiplexes, and then sequenced in pooled amplicons. megasat reads sequence files and automatically scores microsatellite genotypes. It uses fuzzy matches to allow for sequencing errors and applies decision rules to account for amplification artefacts, including nontarget amplification products, replication slippage during PCR (amplification stutter) and differential amplification of alleles. An important feature of megasat is the generation of histograms of the length-frequency distributions of amplification products for each locus and each individual. These histograms, analogous to electropherograms traditionally used to score microsatellite genotypes, enable rapid evaluation and editing of automatically scored genotypes. megasat is written in Perl, runs on Windows, Mac OS X and Linux systems, and includes a simple graphical user interface. We demonstrate megasat using data from guppy, Poecilia reticulata. We genotype 1024 guppies at 43 microsatellites per run on an Illumina MiSeq sequencer. We evaluated the accuracy of automatically called genotypes using two methods, based on pedigree and repeat genotyping data, and obtained estimates of mean genotyping error rates of 0.021 and 0.012. In both estimates, three loci accounted for a disproportionate fraction of genotyping errors; conversely, 26 loci were scored with 0-1 detected error (error rate ≤0.007). Our results show that with appropriate selection of loci, automated genotyping of microsatellite loci can be achieved with very high throughput, low genotyping error and very low genotyping costs.

  13. Automated ribosomal DNA fingerprinting by capillary electrophoresis of PCR products.

    PubMed

    Martin, F; Vairelles, D; Henrion, B

    1993-10-01

    Capillary electrophoresis (CE) provides a rapid and automated technique for the analysis of subnanogram amounts of DNA fragments generated by the polymerase chain reaction (PCR). Here we describe the implementation of size-selective CE for DNA profiling and restriction fragment length polymorphism analysis of amplified polymorphic spacers of ribosomal DNA from fungi. Separations of unpurified and isopropanol-precipitated PCR-amplified DNA fragments in the size range of 20-1600 base pairs were achieved in less than 20 min with the use of hydroxypropyl methylcellulose as a sieving medium. The amplified internal transcribed spacer (ITS) and intergenic spacer (IGS) of RNA genes could be sized by coelectrophoresing a standard size ladder mixed with every sample, thereby eliminating errors in size estimation. This, together with the strictly controlled conditions of CE, permits the discrimination of amplified rDNA fragments differing by only a few base pairs in length. Inter- and intraspecific variation in the size and number of restriction sites of the amplified rDNA spacers from the ectomycorrhizal basidiomycetes Laccaria laccata and Laccaria bicolor was observed and most strains could thus be reliably genotyped by PCR-CE. Multiple amplified IGS fragments of heterogeneous size were detected in several strains. This polymorphism is due to the occurrence of 5S rDNA subrepeats (i.e., multiple annealing of primer) within IGS. With CE, in contrast to slab gel electrophoresis, run times are short, the capillary can be reused, and full automation is feasible. Data acquisition and analysis are computer-controlled, which facilitates locus identification and reduces error, especially when large numbers of PCR products must be analyzed. (ABSTRACT TRUNCATED AT 250 WORDS)

  14. DNA Sequencing Using an Engineered Protein Nanopore

    NASA Astrophysics Data System (ADS)

    Gundlach, Jens H.

    2010-03-01

    Inexpensive and fast sequencing of DNA is of paramount importance to medicine, the life sciences and to many other applications. Because of the nanometer diameter of DNA a nanometer-scale reader directly interfaced to macroscopic observables seems particularly attractive. We are working on a new single molecule technique based on a biological pore embedded in a lipid bilayer. When a voltage is applied across the bilayer an ion current is measured that flows through the nanometer opening of the pore. Poly-negatively charged single stranded DNA passes through the pore and reduces the ion current with the remaining ion current being indicative of the nucleotide type in the constriction of the pore. The protein pore that we introduced to the field, MspA, has a shape ideally suited to nanopore sequencing, has robustness comparable to solid state devices, is easily reproduced with sub-nanometer level precision and is engineerable using genetic mutations. I will present proof-of-principle data showing that this technique can lead to a direct very inexpensive and fast sequencing technology. The experimental electronic signatures of the DNA translocation process provide an ideal test bed for molecular dynamics simulations, which in turn allows developing intuition and prediction of nanoscale dynamics.

  15. Evaluation of an automated protocol for efficient and reliable DNA extraction of dietary samples.

    PubMed

    Wallinger, Corinna; Staudacher, Karin; Sint, Daniela; Thalinger, Bettina; Oehm, Johannes; Juen, Anita; Traugott, Michael

    2017-08-01

    Molecular techniques have become an important tool to empirically assess feeding interactions. The increased usage of next-generation sequencing approaches has stressed the need for fast DNA extraction that does not compromise DNA quality. Dietary samples here pose a particular challenge, as these demand high-quality DNA extraction procedures for obtaining the minute quantities of short-fragmented food DNA. Automatic high-throughput procedures significantly decrease time and costs and allow for standardization of extracting total DNA. However, these approaches have not yet been evaluated for dietary samples. We tested the efficiency of an automatic DNA extraction platform and a traditional CTAB protocol, employing a variety of dietary samples including invertebrate whole-body extracts as well as invertebrate and vertebrate gut content samples and feces. Extraction efficacy was quantified using the proportions of successful PCR amplifications of both total and prey DNA, and cost was estimated in terms of time and material expense. For extraction of total DNA, the automated platform performed better for both invertebrate and vertebrate samples. This was also true for prey detection in vertebrate samples. For the dietary analysis in invertebrates, there is still room for improvement when using the high-throughput system for optimal DNA yields. Overall, the automated DNA extraction system proved to be a promising alternative to labor-intensive, low-throughput manual extraction methods such as CTAB. It opens up the opportunity for extensive use of this cost-efficient and innovative methodology, at low contamination risk, in trophic ecology as well.

  16. Sequence-specific recognition of DNA nanostructures.

    PubMed

    Rusling, David A; Fox, Keith R

    2014-05-15

    DNA is the most exploited biopolymer for the programmed self-assembly of objects and devices that exhibit nanoscale-sized features. One of the most useful properties of DNA nanostructures is their ability to be functionalized with additional non-nucleic acid components. The introduction of such a component is often achieved by attaching it to an oligonucleotide that is part of the nanostructure, or hybridizing it to single-stranded overhangs that extend beyond or above the nanostructure surface. However, restrictions in nanostructure design and/or the self-assembly process can limit the suitability of these procedures. An alternative strategy is to couple the component to a DNA recognition agent that is capable of binding to duplex sequences within the nanostructure. This offers the advantage that it requires little, if any, alteration to the nanostructure and can be achieved after structure assembly. In addition, since the molecular recognition of DNA can be controlled by varying pH and ionic conditions, such systems offer tunable properties that are distinct from simple Watson-Crick hybridization. Here, we describe methodology that has been used to exploit and characterize the sequence-specific recognition of DNA nanostructures, with the aim of generating functional assemblies for bionanotechnology and synthetic biology applications.

  17. A simple automated instrument for DNA extraction in forensic casework.

    PubMed

    Montpetit, Shawn A; Fitch, Ian T; O'Donnell, Patrick T

    2005-05-01

    The Qiagen BioRobot EZ1 is a small, rapid, and reliable automated DNA extraction instrument capable of extracting DNA from up to six samples in as few as 20 min using magnetic bead technology. The San Diego Police Department Crime Laboratory has validated the BioRobot EZ1 for the DNA extraction of evidence and reference samples in forensic casework. The BioRobot EZ1 was evaluated for use on a variety of different evidence sample types including blood, saliva, and semen evidence. The performance of the BioRobot EZ1 with regard to DNA recovery and potential cross-contamination was also assessed. DNA yields obtained with the BioRobot EZ1 were comparable to those from organic extraction. The BioRobot EZ1 was effective at removing PCR inhibitors, which often co-purify with DNA in organic extractions. The incorporation of the BioRobot EZ1 into forensic casework has streamlined the DNA analysis process by reducing the need for labor-intensive phenol-chloroform extractions.

  18. Compilation of DNA sequences of Escherichia coli

    PubMed Central

    Kröger, Manfred

    1989-01-01

    We have compiled the DNA sequence data for E. coli K12 available from the GENBANK and EMBO databases and, over a period of several years, independently from the literature. We have introduced all available genetic map data and have arranged the sequences accordingly. As far as possible the overlaps are deleted, and a total of 940,449 individual bp is found to have been determined by the beginning of 1989. This corresponds to a total of 19.92% of the entire E. coli chromosome consisting of about 4,720 kbp. This number may actually be higher by some extra 2% derived from the sequence of lysogenic bacteriophage lambda and the various insertion sequences. This compilation may be available in machine readable form from one of the international databanks in the future. PMID:2654890
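
    The coverage figure quoted in the abstract can be checked directly from the two numbers it gives. The short Python sketch below does only that arithmetic; the variable names are illustrative.

```python
# Values quoted in the abstract.
sequenced_bp = 940_449          # non-overlapping base pairs compiled
chromosome_bp = 4_720 * 1_000   # about 4,720 kbp for the whole chromosome

coverage_percent = 100 * sequenced_bp / chromosome_bp
print(f"{coverage_percent:.2f}%")  # prints 19.92%
```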

  19. Systematic and fully automated identification of protein sequence patterns.

    PubMed

    Hart, R K; Royyuru, A K; Stolovitzky, G; Califano, A

    2000-01-01

    We present an efficient algorithm to systematically and automatically identify patterns in protein sequence families. The procedure is based on the Splash deterministic pattern discovery algorithm and on a framework to assess the statistical significance of patterns. We demonstrate its application to the fully automated discovery of patterns in 974 PROSITE families (the complete subset of PROSITE families which are defined by patterns and contain DR records). Splash generates patterns with better specificity and undiminished sensitivity, or vice versa, in 28% of the families; identical statistics were obtained in 48% of the families, worse statistics in 15%, and mixed behavior in the remaining 9%. In about 75% of the cases, Splash patterns identify sequence sites that overlap more than 50% with the corresponding PROSITE pattern. The procedure is sufficiently rapid to enable its use for daily curation of existing motif and profile databases. Third, our results show that the statistical significance of discovered patterns correlates well with their biological significance. The trypsin subfamily of serine proteases is used to illustrate this method's ability to exhaustively discover all motifs in a family that are statistically and biologically significant. Finally, we discuss applications of sequence patterns to multiple sequence alignment and the training of more sensitive score-based motif models, akin to the procedure used by PSI-BLAST. All results are available at http://www.research.ibm.com/spat/.

  20. [Economical sequencing of DNA with terminators].

    PubMed

    Kraev, A S; Mironov, V N

    1990-01-01

    We describe several improvements of the chain-termination DNA sequencing procedure of Sanger et al. For template preparation we use 0.3 ml cultures of M13 clones, grown in standard 1.5 ml polypropylene tubes. The sequencing experiment differs from those previously described in the use of deoxyNTP labelled with phosphorus-33 (a low energy isotope with a half-life of 25 days, commercially produced in the USSR), and in a "quasi-end labelling" reaction preceding the DNA synthesis in the presence of dideoxyNTPs. The combination of phosphorus-33 and quasi-end labelling produces very sharp sequencing ladders that equal or exceed in quality those obtained with sulphur-35, and only an overnight exposure with a conventional X-ray film is required. The use of plastic tubes for bacterial growth and of 60-well microchambers for carrying out sequencing reactions results in a substantial saving of time and cost in routine "middle scale" sequencing (both types of plasticware are produced in the USSR).

  1. Sequencing and Analysis of Neanderthal Genomic DNA

    PubMed Central

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Pääbo, Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2008-01-01

    Our knowledge of Neanderthals is based on a limited number of remains and artifacts from which we must make inferences about their biology, behavior, and relationship to ourselves. Here, we describe the characterization of these extinct hominids from a new perspective, based on the development of a Neanderthal metagenomic library and its high-throughput sequencing and analysis. Several lines of evidence indicate that the 65,250 base pairs of hominid sequence so far identified in the library are of Neanderthal origin, the strongest being the ascertainment of sequence identities between Neanderthal and chimpanzee at sites where the human genomic sequence is different. These results enabled us to calculate the human-Neanderthal divergence time based on multiple randomly distributed autosomal loci. Our analyses suggest that on average the Neanderthal genomic sequence we obtained and the reference human genome sequence share a most recent common ancestor ~706,000 years ago, and that the human and Neanderthal ancestral populations split ~370,000 years ago, before the emergence of anatomically modern humans. Our finding that the Neanderthal and human genomes are at least 99.5% identical led us to develop and successfully implement a targeted method for recovering specific ancient DNA sequences from metagenomic libraries. This initial analysis of the Neanderthal genome advances our understanding of the evolutionary relationship of Homo sapiens and Homo neanderthalensis and signifies the dawn of Neanderthal genomics. PMID:17110569

  2. Genetic algorithms for DNA sequence assembly.

    PubMed

    Parsons, R; Forrest, S; Burks, C

    1993-01-01

    This paper describes a genetic algorithm application to the DNA sequence assembly problem. The genetic algorithm uses a sorted order representation for representing the orderings of fragments. Two different fitness functions, both based on pairwise overlap strengths between fragments, are tested. The paper concludes that the genetic algorithm is a promising method for fragment assembly problems, achieving usable solutions quickly, but that the current fitness functions are flawed and that other representations might be more appropriate.
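
    As an illustration of a fitness function "based on pairwise overlap strengths between fragments", the following is a minimal sketch, not the authors' implementation. It assumes that overlap strength is simply the length of the longest exact suffix/prefix match and that a candidate solution is a plain permutation of fragment indices (the paper's sorted order representation is not reproduced here). The names overlap and ordering_fitness are hypothetical.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def ordering_fitness(fragments, order):
    """Sum of overlaps between fragments that are adjacent in `order`.

    Assumption: higher is better, and only neighbouring fragments
    contribute. This is one plausible fitness, not the paper's exact one.
    """
    return sum(
        overlap(fragments[i], fragments[j])
        for i, j in zip(order, order[1:])
    )


fragments = ["ATGGCGT", "GCGTACG", "TACGTTA"]
print(ordering_fitness(fragments, [0, 1, 2]))  # ordering that chains the overlaps
print(ordering_fitness(fragments, [2, 0, 1]))  # a worse ordering scores lower
```

    A genetic algorithm would then mutate and recombine candidate orderings and keep those with higher fitness; exact string matching is used here only to keep the sketch short.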

  3. Identification of genes in anonymous DNA sequences. Annual performance report, February 1, 1991--January 31, 1992

    SciTech Connect

    Fields, C.A.

    1996-06-01

    The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human, and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler) has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0 has now been completed, and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.

  4. Complete sequence of Euglena gracilis chloroplast DNA.

    PubMed Central

    Hallick, R B; Hong, L; Drager, R G; Favreau, M R; Monfort, A; Orsat, B; Spielmann, A; Stutz, E

    1993-01-01

    We report the complete DNA sequence of the Euglena gracilis, Pringsheim strain Z chloroplast genome. This circular DNA is 143,170 bp, counting only one copy of a 54 bp tandem repeat sequence that is present in variable copy number within a single culture. The overall organization of the genome involves a tandem array of three complete and one partial ribosomal RNA operons, and a large single copy region. There are genes for the 16S, 5S, and 23S rRNAs of the 70S chloroplast ribosomes, 27 different tRNA species, 21 ribosomal proteins plus the gene for elongation factor EF-Tu, three RNA polymerase subunits, and 27 known photosynthesis-related polypeptides. Several putative genes of unknown function have also been identified, including five within large introns, and five with amino acid sequence similarity to genes in other organisms. This genome contains at least 149 introns. There are 72 individual group II introns, 46 individual group III introns, 10 group II introns and 18 group III introns that are components of twintrons (introns-within-introns), and three additional introns suspected to be twintrons composed of multiple group II and/or group III introns, but not yet characterized. At least 54,804 bp, or 38.3% of the total DNA content is represented by introns. PMID:8346031

  5. DNA SEQUENCING RESEARCH GROUP (DSRG) 2003—A GENERAL SURVEY OF CORE DNA SEQUENCING FACILITIES

    PubMed Central

    Wiebe, Glenis J.; Pershad, Rashmi; Escobar, Helaman; Hawes, John W.; Hunter, Timothy; Jackson-Machelski, Emily; Knudtson, Kevin L.; Robertson, Margaret; Thannhauser, Theodore W.

    2003-01-01

    DNA sequencing core facilities serve as centralized resources within both academic and commercial institutions, providing expertise in the area of DNA analysis. The composition and configuration of these facilities continue to evolve in response to new developments in instrumentation and methodology. The goal of the 2003 DNA Sequencing Research Group (DSRG) survey was to identify recent changes in staffing, funding, instrumentation, services, and customer relations. Responses to 58 survey questions from 30 participants are presented to offer a look at the current typical DNA core sequencing facility. The results from this study will serve as a resource for institutions to benchmark their shared core laboratories, and to give facility directors an opportunity to compare and contrast their respective services and experiences.

  6. Automated sequencing batch bioreactor under extreme peaks of 4-chlorophenol.

    PubMed

    Bultrón, G; Schoeb, M E; Moreno, J

    2003-01-01

    The operation of a sequencing batch bioreactor is evaluated when high concentration peaks of a toxic compound (4-chlorophenol, 4CP) are introduced into the reactor. A control strategy based on the dissolved oxygen concentration, measured on line, is utilized. To detect the end of the reaction period, the automated system searches for the moment when the dissolved oxygen has passed through a minimum, as a consequence of the metabolic activity of the microorganisms, and immediately afterwards reaches a maximum due to the saturation of the water (similar to the self-cycling fermentation, SCF, strategy). The dissolved oxygen signal was sent to a personal computer via data acquisition and control using MATLAB and the SIMULINK package. The system operating under the automated strategy remained stable when the microorganisms (acclimated to an initial concentration of 350 mg 4CP/L) were exposed to isolated concentration peaks of 600 mg 4CP/L. 4CP concentration peaks of 1,050 mg/L or more disturbed the system only over the short to medium term (one month). The 1,400 mg/L peak caused a shutdown in the metabolic activity of the microorganisms that led to reactor failure. The biomass acclimated with the SCF strategy can only partially withstand variations in the toxic influent since, once the influent becomes inhibitory, the system fails.
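
    The end-of-reaction criterion described above (dissolved oxygen passes through a minimum, then rises to saturation) can be sketched as follows. This is an offline illustration on a recorded trace, not the authors' MATLAB/SIMULINK controller. The function name end_of_reaction, the tolerance tol, and the sample values are assumptions made for the example.

```python
def end_of_reaction(do, saturation, tol=0.05):
    """Return the first sample index, after the DO minimum, at which the
    dissolved oxygen is back within `tol` (fractional) of saturation.

    Returns None if the trace is empty or never recovers after its minimum.
    """
    if not do:
        return None
    # Index of the dissolved-oxygen minimum (peak metabolic demand).
    i_min = min(range(len(do)), key=do.__getitem__)
    threshold = (1.0 - tol) * saturation
    for i in range(i_min + 1, len(do)):
        if do[i] >= threshold:
            return i
    return None


# Hypothetical trace in mg O2/L; saturation assumed to be 8.0 mg/L.
trace = [7.5, 5.0, 3.0, 2.0, 2.5, 4.0, 6.5, 7.8, 7.9]
print(end_of_reaction(trace, saturation=8.0))
```

    An online controller would apply the same test to a sliding window of filtered readings instead of the whole trace.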

  7. Random Coding Bounds for DNA Codes Based on Fibonacci Ensembles of DNA Sequences

    DTIC Science & Technology

    2008-07-01

    Dates covered: 6 Jul 08 – 11 Jul 08. Abstract (excerpt): ...coding bound on the rate of DNA codes is proved. To obtain the bound, we use some ensembles of DNA sequences which are generalizations of the Fibonacci sequences. Subject terms: DNA Codes, Fibonacci Ensembles, DNA Computing, Code Optimization.

  8. An oligonucleotide hybridization approach to DNA sequencing.

    PubMed

    Khrapko, K R; Lysov YuP; Khorlyn, A A; Shick, V V; Florentiev, V L; Mirzabekov, A D

    1989-10-09

    We have proposed a DNA sequencing method based on hybridization of a DNA fragment to be sequenced with the complete set of fixed-length oligonucleotides (e.g., 4^8 = 65,536 possible 8-mers) immobilized individually as dots of a 2-D matrix [(1989) Dokl. Akad. Nauk SSSR 303, 1508-1511]. It was shown that the list of hybridizing octanucleotides is sufficient for the computer-assisted reconstruction of the structures for 80% of random-sequence fragments up to 200 bases long, based on the analysis of the octanucleotide overlapping. Here a refinement of the method and some experimental data are presented. We have performed hybridizations with oligonucleotides immobilized on a glass plate, and obtained their dissociation curves down to heptanucleotides. Other approaches, e.g., an additional hybridization of short oligonucleotides which continuously extend duplexes formed between the fragment and immobilized oligonucleotides, should considerably increase either the probability of unambiguous reconstruction, or the length of reconstructed sequences, or decrease the size of immobilized oligonucleotides.
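
    The reconstruction step (recovering a fragment from the list of oligonucleotides it hybridizes to) can be illustrated with a small sketch. It is not the authors' program. It assumes an error-free hybridization list, a known starting k-mer, and a simple rule that gives up whenever the next overlapping k-mer is not unique. The names spectrum and reconstruct are hypothetical.

```python
from __future__ import annotations


def spectrum(seq: str, k: int) -> set[str]:
    """All distinct k-mers occurring in `seq` (the ideal hybridization list)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reconstruct(kmers: set[str], start: str) -> str | None:
    """Extend `start` by repeatedly appending the unique k-mer whose first
    k-1 bases match the current end. Returns None on ambiguity or dead end."""
    k = len(start)
    remaining = set(kmers) - {start}
    seq = start
    while remaining:
        tail = seq[-(k - 1):]
        candidates = [m for m in remaining if m[:-1] == tail]
        if len(candidates) != 1:
            return None  # branching or broken overlap chain
        seq += candidates[0][-1]
        remaining.remove(candidates[0])
    return seq


print(4 ** 8)  # number of distinct 8-mers: 65536
print(reconstruct(spectrum("ATGCGTAC", 3), "ATG"))  # unambiguous chain
print(reconstruct(spectrum("ACGTCGA", 3), "ACG"))   # repeated "CG" makes it ambiguous
```

    The second example shows why only a fraction of random fragments can be reconstructed: a repeated (k-1)-mer creates a branch that overlap analysis alone cannot resolve.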

  9. Text mining of DNA sequence homology searches.

    PubMed

    McCallum, John; Ganesh, Siva

    2003-01-01

    Primary tasks in analysis and annotation of expressed sequence tag (EST) datasets are to identify similarity among sequences by unsupervised clustering and assign putative function based on BLAST homology searches. We investigated the usefulness of text mining as a simple approach for further higher-level clustering of EST datasets using IBM Intelligent Miner for Text v2.3 tools. Agglomerative and k-means clustering tools were used to cluster BLASTx homology search documents from two onion EST datasets and optimised by pre-processing and pruning. Subjective evaluation confirmed that these tools provided biologically useful and complementary views of the two libraries, provided new insights into their composition and revealed clusters previously identified by human experts. We compared BLASTx textual clusters for two gene families with their DNA sequence-based clusters and confirmed that these shared similar morphology.

  10. Sequence-of-Events-Driven Automation of the Deep Space Network

    NASA Technical Reports Server (NTRS)

    Hill, R., Jr.; Fayyad, K.; Smyth, C.; Santos, T.; Chen, R.; Chien, S.; Bevan, R.

    1996-01-01

    In February 1995, sequence-of-events (SOE)-driven automation technology was demonstrated for a Voyager telemetry downlink track at DSS 13. This demonstration entailed automated generation of an operations procedure (in the form of a temporal dependency network) from project SOE information using artificial intelligence planning technology and automated execution of the temporal dependency network using the link monitor and control operator assistant system. This article describes the overall approach to SOE-driven automation that was demonstrated, identifies gaps in SOE definitions and project profiles that hamper automation, and provides detailed measurements of the knowledge engineering effort required for automation.

  12. Analysis of DNA Sequence Variants Detected by High Throughput Sequencing

    PubMed Central

    Adams, David R; Sincan, Murat; Fajardo, Karin Fuentes; Mullikin, James C; Pierson, Tyler M; Toro, Camilo; Boerkoel, Cornelius F; Tifft, Cynthia J; Gahl, William A; Markello, Tom C

    2014-01-01

    The Undiagnosed Diseases Program at the National Institutes of Health uses High Throughput Sequencing (HTS) to diagnose rare and novel diseases. HTS techniques generate large numbers of DNA sequence variants, which must be analyzed and filtered to find candidates for disease causation. Despite the publication of an increasing number of successful exome-based projects, there has been little formal discussion of the analytic steps applied to HTS variant lists. We present the results of our experience with over 30 families for whom HTS sequencing was used in an attempt to find clinical diagnoses. For each family, exome sequence was augmented with high-density SNP-array data. We present a discussion of the theory and practical application of each analytic step and provide example data to illustrate our approach. The paper is designed to provide an analytic roadmap for variant analysis, thereby enabling a wide range of researchers and clinical genetics practitioners to perform direct analysis of HTS data for their patients and projects. PMID:22290882

  13. Aspects of coverage in medical DNA sequencing

    PubMed Central

    Wendl, Michael C; Wilson, Richard K

    2008-01-01

    Background: DNA sequencing is now emerging as an important component in biomedical studies of diseases like cancer. Short-read, highly parallel sequencing instruments are expected to be used heavily for such projects, but many design specifications have yet to be conclusively established. Perhaps the most fundamental of these is the redundancy required to detect sequence variations, which bears directly upon genomic coverage and the consequent resolving power for discerning somatic mutations. Results: We address the medical sequencing coverage problem via an extension of the standard mathematical theory of haploid coverage. The expected diploid multi-fold coverage, as well as its generalization for aneuploidy, are derived, and these expressions can be readily evaluated for any project. The resulting theory is used as a scaling law to calibrate performance to that of standard BAC sequencing at 8× to 10× redundancy, i.e. for expected coverages that exceed 99% of the unique sequence. A differential strategy is formalized for tumor/normal studies wherein tumor samples are sequenced more deeply than normal ones. In particular, both tumor alleles should be detected at least twice, while both normal alleles are detected at least once. Our theory predicts these requirements can be met for tumor and normal redundancies of approximately 26× and 21×, respectively. We explain why these values do not differ by a factor of 2, as might intuitively be expected. Future technology developments should prompt even deeper sequencing of tumors, but the 21× value for normal samples is essentially a constant. Conclusion: Given the assumptions of standard coverage theory, our model gives pragmatic estimates for required redundancy. The differential strategy should be an efficient means of identifying potential somatic mutations for further study. PMID:18485222
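
    For orientation, the standard haploid coverage theory that the abstract takes as its starting point can be evaluated in a few lines. The sketch below assumes the usual Poisson (Lander-Waterman style) model, in which the depth at a position is Poisson-distributed with mean equal to the redundancy c. It does not reproduce the paper's diploid or aneuploid expressions. The function name frac_covered is hypothetical.

```python
import math


def frac_covered(c: float, min_depth: int = 1) -> float:
    """Expected fraction of positions covered at least `min_depth` times
    when depth is Poisson-distributed with mean `c` (haploid model)."""
    below = sum(
        math.exp(-c) * c ** i / math.factorial(i)
        for i in range(min_depth)
    )
    return 1.0 - below


for c in (8, 10):
    print(c, frac_covered(c))       # covered at least once
print(10, frac_covered(10, 2))      # covered at least twice
```

    Under this model the 8× to 10× range does exceed 99% expected coverage, which is consistent with the calibration point quoted in the abstract.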

  14. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1988-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330

  15. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1987-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3575113

  16. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1989-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889

  17. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1990-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227

  18. A Molecular Fraction Collecting Tool for the ABI 310 Automated Sequencer

    PubMed Central

    Lin, Ming-Tseh; Rich, Roy G.; Shipley, Royce F.; Hafez, Michael J.; Tseng, Li-Hui; Murphy, Kathleen M.; Gocke, Christopher D.; Eshleman, James R.

    2007-01-01

    Several methods exist to retrieve and purify DNA fragments after agarose or polyacrylamide gel electrophoresis for subsequent analyses. However, molecules present in low concentration and molecules similar in size to their neighbors are difficult to purify. Capillary electrophoresis has become popular in molecular diagnostic laboratories because of its automation, excellent resolution, and high sensitivity. In the current study, the ABI Prism 310 Genetic Analyzer was reconfigured into a fraction collector by adapting the standard gel block to accommodate a collection tube at the distal end of capillary. The time to collect the desired peaks was estimated by extrapolating from standard capillary electrophoresis using the original gel block. Fraction collection from a mixture of DNA fragments amplified from wild type and several internal tandem duplication mutations of the FMS-like tyrosine kinase 3 (Flt3) gene yielded highly purified DNA fragments containing internal tandem duplication mutations and predictable electrokinetics using the reconstructed gel block. The reconfigured instrument could successfully isolate DNA amplicons from extremely low-amplitude peaks (110 relative fluorescent units), which were undetectable using polyacrylamide gel electrophoresis. In addition, we successfully isolated bands that were only three bases apart that comigrated on polyacrylamide gel electrophoresis. DNA sequencing was used to confirm that the correct peaks were recovered at sufficient purity. PMID:17916601

  19. Porcine parvovirus: DNA sequence and genome organization.

    PubMed

    Ranz, A I; Manclús, J J; Díaz-Aroca, E; Casal, J I

    1989-10-01

    We have determined the nucleotide sequence of an almost full-length clone of porcine parvovirus (PPV). The sequence is 4973 nucleotides (nt) long. The 3' end of virion DNA shows a Y-shaped configuration homologous to rodent parvoviruses. The 5' end of virion DNA shows a repetition of 127 nt at the carboxy terminus of the capsid proteins. The overall organization of the PPV genome is similar to those of other autonomous parvoviruses. There are two large open reading frames (ORFs) that almost entirely cover the genome, both located in the same frame of the complementary strand. The left ORF encodes the non-structural protein NS1 and the right ORF encodes the capsid proteins (VP1, VP2 and VP3). Promoter analysis, location of splicing sites and putative amino acid sequences for the viral proteins show a high homology of PPV with feline panleukopenia virus and canine parvoviruses (FPV and CPV) and rodent parvovirus. Therefore we conclude that PPV is related to the Kilham rat virus (KRV) group of autonomous parvoviruses formed by KRV, minute virus of mice, Lu III, H-1, FPV and CPV.

  20. Method for priming and DNA sequencing

    SciTech Connect

    Mugasimangalam, R.C.; Ulanovsky, L.E.

    1997-12-01

    A method is presented for improving the priming specificity of an oligonucleotide primer that is non-unique in a nucleic acid template which includes selecting a continuous stretch of several nucleotides in the template DNA where one of the four bases does not occur in the stretch. This also includes bringing the template DNA in contact with a non-unique primer partially or fully complementary to the sequence immediately upstream of the selected sequence stretch. This results in polymerase-mediated differential extension of the primer in the presence of a subset of deoxyribonucleotide triphosphates that does not contain the base complementary to the base absent in the selected sequence stretch. These reactions occur at a temperature sufficiently low for allowing the extension of the non-unique primer. The method causes polymerase-mediated extension reactions in the presence of all four natural deoxyribonucleotide triphosphates or modifications. At this high temperature discrimination occurs against priming sites of the non-unique primer where the differential extension has not made the primer sufficiently stable to prime. However, the primer extended at the selected stretch is sufficiently stable to prime.
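
    The first step of the method, selecting a continuous stretch of template in which one of the four bases does not occur, lends itself to a short sketch. The code below is only an illustration under simple assumptions: the template is an uppercase A/C/G/T string, and candidate stretches are reported as maximal runs of at least min_len bases. The function name base_free_stretches and the min_len parameter are hypothetical and do not come from the source.

```python
def base_free_stretches(template, min_len):
    """Return sorted (start, end, missing_base) tuples, end exclusive, for
    maximal runs of `template` of length >= min_len lacking `missing_base`."""
    hits = []
    for base in "ACGT":
        start = None
        # Appending `base` as a sentinel closes a run that reaches the end.
        for i, nt in enumerate(template + base):
            if nt != base:
                if start is None:
                    start = i
            else:
                if start is not None and i - start >= min_len:
                    hits.append((start, i, base))
                start = None
    return sorted(hits)


for start, end, missing in base_free_stretches("ACGTTTGGCCA", 5):
    print(start, end, missing, "ACGTTTGGCCA"[start:end])
```

    Each reported stretch could then be paired with the sequence immediately upstream of it when looking for a primer site, as the abstract describes.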

  1. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    NASA Astrophysics Data System (ADS)

    Chen, C. H. Winston; Taranenko, N. I.; Zhu, Y. F.; Chung, C. N.; Allman, S. L.

    1997-05-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, we recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. Our preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, we applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  2. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    SciTech Connect

    Winston Chen, C.H.; Taranenko, N.I.; Zhu, Y.F.; Chung, C.N.; Allman, S.L.

    1997-03-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, the authors recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. The preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, the authors applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  3. Preparation of yeast mitochondrial DNA for direct sequence analysis.

    PubMed

    Valach, Matus; Tomaska, Lubomir; Nosek, Jozef

    2008-08-01

    We describe two simple protocols for preparation of templates for direct sequencing of yeast mitochondrial DNA (mtDNA) by automatic DNA analyzers. The protocols work with a range of yeast species and yield a sufficient quantity and quality of the template DNA. In combination with primer-walking strategy, they can be used either as an alternative or a complementary approach to shot-gun sequencing of random fragment DNA libraries. We demonstrate that the templates are suitable for re-sequencing of the mtDNA for comparative analyses of intraspecific variability of yeast strains as well as for primary determination of the complete mitochondrial genome sequence.

  4. Sequence dependent hole evolution in DNA.

    PubMed

    Lakhno, V D

    2004-06-01

    The paper examines the dynamical behavior of a radical cation (G(+*)) generated in a double stranded DNA for different oligonucleotide sequences. The resonance hole tunneling through an oligonucleotide sequence is studied by the method of numerical integration of self-consistent quantum-mechanical equations. The hole motion is considered quantum mechanically and nucleotide base oscillations are treated classically. The results obtained demonstrate a strong dependence of charge transfer on the type of nucleotide sequence. The rates of the hole transfer are calculated for different nucleotide sequences and compared with experimental data on the transfer from (G(+*)) to a GGG unit.

  5. Recent advances in DNA sequencing techniques

    NASA Astrophysics Data System (ADS)

    Singh, Rama Shankar

    2013-06-01

    Successful mapping of the draft human genome in 2001 and more recent mapping of the human microbiome genome in 2012 have relied heavily on the parallel processing of the second generation/Next Generation Sequencing (NGS) DNA machines at a cost of several million dollars and long computer processing times. These have been mainly biochemical approaches. Here a system analysis approach is used to review these techniques by identifying the requirements, specifications, test methods, error estimates, repeatability, reliability and trends in the cost reduction. The first generation, NGS and the Third Generation Single Molecule Real Time (SMART) detection sequencing methods are reviewed. Based on the National Human Genome Research Institute (NHGRI) data, the achieved cost reductions of 1.5 times per yr. from Sep. 2001 to July 2007; 7 times per yr. from Oct. 2007 to Apr. 2010; and 2.5 times per yr. from July 2010 to Jan. 2012 are discussed.
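
    To get a feel for what these yearly rates imply over each period, the fold reduction can be compounded as rate raised to the number of years. The sketch below assumes that each quoted rate is a constant multiplicative factor per year and that period lengths can be counted in whole months. The function name fold_reduction is hypothetical, and the printed values are derived from the quoted rates, not taken from NHGRI data.

```python
def fold_reduction(rate_per_year: float, months: int) -> float:
    """Total fold reduction if cost falls by `rate_per_year` times each year."""
    return rate_per_year ** (months / 12)


# (period, quoted rate per year, length in months)
periods = [
    ("Sep 2001 to Jul 2007", 1.5, 70),
    ("Oct 2007 to Apr 2010", 7.0, 30),
    ("Jul 2010 to Jan 2012", 2.5, 18),
]

for label, rate, months in periods:
    print(f"{label}: about {fold_reduction(rate, months):.1f}-fold")
```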

  6. MAGIC-SPP: a database-driven DNA sequence processing package with associated management tools

    PubMed Central

    Liang, Chun; Sun, Feng; Wang, Haiming; Qu, Junfeng; Freeman, Robert M; Pratt, Lee H; Cordonnier-Pratt, Marie-Michèle

    2006-01-01

    Background: Processing raw DNA sequence data is an especially challenging task for relatively small laboratories and core facilities that produce as many as 5000 or more DNA sequences per week from multiple projects in widely differing species. To meet this challenge, we have developed the flexible, scalable, and automated sequence processing package described here. Results: MAGIC-SPP is a DNA sequence processing package consisting of an Oracle 9i relational database, a Perl pipeline, and user interfaces implemented either as JavaServer Pages (JSP) or as a Java graphical user interface (GUI). The database not only serves as a data repository, but also controls processing of trace files. MAGIC-SPP includes an administrative interface, a laboratory information management system, and interfaces for exploring sequences, monitoring quality control, and troubleshooting problems related to sequencing activities. In the sequence trimming algorithm it employs new features designed to improve performance with respect to concerns such as concatenated linkers, identification of the expected start position of a vector insert, and extending the useful length of trimmed sequences by bridging short regions of low quality when the following high quality segment is sufficiently long to justify doing so. Conclusion: MAGIC-SPP has been designed to minimize human error, while simultaneously being robust, versatile, flexible and automated. It offers a unique combination of features that permit administration by a biologist with little or no informatics background. It is well suited to both individual research programs and core facilities. PMID:16522212
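
    The idea of "bridging short regions of low quality when the following high quality segment is sufficiently long" can be sketched as follows. This is not MAGIC-SPP's Perl code. The function name trim_with_bridging, the thresholds q_min, max_gap and min_next, and the rule of keeping the longest merged region are all assumptions chosen for illustration.

```python
def trim_with_bridging(quals, q_min=20, max_gap=3, min_next=10):
    """Return (start, end), end exclusive, of the region to keep.

    quals    : per-base quality scores (e.g. Phred values)
    q_min    : minimum score for a base to count as high quality
    max_gap  : longest low-quality gap that may be bridged
    min_next : minimum length of the following high-quality run
    """
    # 1. Collect maximal runs of high-quality bases.
    runs, start = [], None
    for i, q in enumerate(list(quals) + [-1]):  # sentinel closes the last run
        if q >= q_min and start is None:
            start = i
        elif q < q_min and start is not None:
            runs.append((start, i))
            start = None
    if not runs:
        return (0, 0)

    # 2. Bridge a short gap only if the run after it is long enough.
    merged = [runs[0]]
    for s, e in runs[1:]:
        prev_start, prev_end = merged[-1]
        if s - prev_end <= max_gap and e - s >= min_next:
            merged[-1] = (prev_start, e)
        else:
            merged.append((s, e))

    # 3. Keep the longest resulting region.
    return max(merged, key=lambda r: r[1] - r[0])


# Hypothetical read: 20 good bases, 2 poor, 15 good, 10 poor, 4 good.
quals = [30] * 20 + [10] * 2 + [30] * 15 + [5] * 10 + [30] * 4
print(trim_with_bridging(quals))             # the 2-base gap is bridged
print(trim_with_bridging(quals, max_gap=1))  # no bridging allowed
```

    With bridging enabled the kept region grows from 20 to 37 bases in this example, which is the kind of gain in useful read length the abstract refers to.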

  7. A Microfluidic DNA Library Preparation Platform for Next-Generation Sequencing

    PubMed Central

    Sinha, Anupama; Bent, Zachary W.; Solberg, Owen D.; Williams, Kelly P.; Langevin, Stanley A.; Renzi, Ronald F.; Van De Vreugde, James L.; Meagher, Robert J.; Schoeniger, Joseph S.; Lane, Todd W.; Branda, Steven S.; Bartsch, Michael S.; Patel, Kamlesh D.

    2013-01-01

    Next-generation sequencing (NGS) is emerging as a powerful tool for elucidating genetic information for a wide range of applications. Unfortunately, the surging popularity of NGS has not yet been accompanied by an improvement in automated techniques for preparing formatted sequencing libraries. To address this challenge, we have developed a prototype microfluidic system for preparing sequencer-ready DNA libraries for analysis by Illumina sequencing. Our system combines droplet-based digital microfluidic (DMF) sample handling with peripheral modules to create a fully-integrated, sample-in library-out platform. In this report, we use our automated system to prepare NGS libraries from samples of human and bacterial genomic DNA. E. coli libraries prepared on-device from 5 ng of total DNA yielded excellent sequence coverage over the entire bacterial genome, with >99% alignment to the reference genome, even genome coverage, and good quality scores. Furthermore, we produced a de novo assembly on a previously unsequenced multi-drug resistant Klebsiella pneumoniae strain BAA-2146 (KpnNDM). The new method described here is fast, robust, scalable, and automated. Our device for library preparation will assist in the integration of NGS technology into a wide variety of laboratories, including small research laboratories and clinical laboratories. PMID:23894387

  8. A microfluidic DNA library preparation platform for next-generation sequencing.

    PubMed

    Kim, Hanyoup; Jebrail, Mais J; Sinha, Anupama; Bent, Zachary W; Solberg, Owen D; Williams, Kelly P; Langevin, Stanley A; Renzi, Ronald F; Van De Vreugde, James L; Meagher, Robert J; Schoeniger, Joseph S; Lane, Todd W; Branda, Steven S; Bartsch, Michael S; Patel, Kamlesh D

    2013-01-01

    Next-generation sequencing (NGS) is emerging as a powerful tool for elucidating genetic information for a wide range of applications. Unfortunately, the surging popularity of NGS has not yet been accompanied by an improvement in automated techniques for preparing formatted sequencing libraries. To address this challenge, we have developed a prototype microfluidic system for preparing sequencer-ready DNA libraries for analysis by Illumina sequencing. Our system combines droplet-based digital microfluidic (DMF) sample handling with peripheral modules to create a fully-integrated, sample-in library-out platform. In this report, we use our automated system to prepare NGS libraries from samples of human and bacterial genomic DNA. E. coli libraries prepared on-device from 5 ng of total DNA yielded excellent sequence coverage over the entire bacterial genome, with >99% alignment to the reference genome, even genome coverage, and good quality scores. Furthermore, we produced a de novo assembly on a previously unsequenced multi-drug resistant Klebsiella pneumoniae strain BAA-2146 (KpnNDM). The new method described here is fast, robust, scalable, and automated. Our device for library preparation will assist in the integration of NGS technology into a wide variety of laboratories, including small research laboratories and clinical laboratories.

  9. Transverse Electronic Signature of DNA for Electronic Sequencing

    NASA Astrophysics Data System (ADS)

    Xu, Mingsheng; Endres, Robert G.; Arakawa, Yasuhiko

    In recent years, the proliferation of large-scale DNA sequencing projects for applications in clinical medicine and health care has driven the search for new methods that could reduce the time and cost. The commonly used Sanger sequencing method relies on the chemistry to read the bases in DNA and is far too slow and expensive for reading personal genetic codes. There were earlier attempts to sequence DNA by directly visualizing the nucleotide composition of the DNA molecules by scanning tunneling microscopy (STM). However, sequencing DNA based on directly imaging DNA's atomic structure has not yet been successful. In Chap. 9, Xu, Endres, and Arakawa report a potential physical alternative by detecting unique transverse electronic signatures of DNA bases using ultrahigh vacuum STM. Supported by the principles, calculations and statistical analyses, these authors argue that it would be possible to directly sequence DNA by the STM-based technology without any modification of the DNA.

  10. Improving DNA sequencing accuracy and throughput

    SciTech Connect

    Nelson, D.O.

    1996-12-31

    LLNL is beginning to explore statistical approaches to the problem of determining the DNA sequence underlying data obtained from fluorescence-based gel electrophoresis. The features of this problem that make it interesting to statisticians include: (1) the underlying mechanics of electrophoresis are quite complex and still not completely understood; (2) the yield of fragments of any given size can be quite small and variable; (3) the mobility of fragments of a given size can depend on the terminating base; (4) the data consist of samples from one or more continuous, non-stationary signals; (5) boundaries between segments generated by distinct elements of the underlying sequence are ill-defined or nonexistent in the signal; and (6) the sampling rate of the signal greatly exceeds the rate of evolution of the underlying discrete sequence. Current approaches to base calling address only some of these issues, and usually in a heuristic, ad hoc way. In this article we describe some of our initial efforts towards increasing base calling accuracy and throughput by providing a rational, statistical foundation to the process of deducing sequence from signal. 31 refs., 12 figs.

  11. Image correlation method for DNA sequence alignment.

    PubMed

    Curilem Saldías, Millaray; Villarroel Sassarini, Felipe; Muñoz Poblete, Carlos; Vargas Vásquez, Asticio; Maureira Butler, Iván

    2012-01-01

    The complexity of searches and the volume of genomic data make sequence alignment one of bioinformatics' most active research areas. New alignment approaches have incorporated digital signal processing techniques. Among these, correlation methods are highly sensitive. This paper proposes a novel sequence alignment method based on 2-dimensional images, where each nucleic acid base is represented as a fixed gray intensity pixel. Query and known database sequences are coded to their pixel representation and sequence alignment is handled as an object-recognition-in-a-scene problem. Query and database become object and scene, respectively. An image correlation process is carried out in order to search for the best match between them. Given that this procedure can be implemented in an optical correlator, the correlation could eventually be accomplished at light speed. This paper shows an initial research stage where results were "digitally" obtained by simulating an optical correlation of DNA sequences represented as images. A total of 303 queries (variable lengths from 50 to 4500 base pairs) and 100 scenes represented by 100 x 100 images each (in total, one million base pair database) were considered for the image correlation analysis. The results showed that correlations reached very high sensitivity (99.01%) and specificity (98.99%), and outperformed BLAST when mutation numbers increased. However, digital correlation processes were a hundred times slower than BLAST. We are currently starting an initiative to evaluate the correlation speed of a real experimental optical correlator. By doing this, we expect to fully exploit optical correlation light properties. As the optical correlator works jointly with the computer, digital algorithms should also be optimized. The results presented in this paper are encouraging and support the study of image correlation methods on sequence alignment.
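
    The pixel-coding and correlation idea can be sketched in a simplified one-dimensional form. This is an illustration only, not the authors' implementation: the paper works with 2-dimensional images, whereas the sketch slides a query over a single row of pixels. The gray levels in `GRAY` and the function names are assumptions made for this example.

```python
import math

# Hypothetical gray levels: the abstract only says each base maps to a
# fixed gray intensity, not which intensities were used.
GRAY = {"A": 0.0, "C": 85.0, "G": 170.0, "T": 255.0}


def encode(seq):
    """Code a DNA string as a row of gray-level pixels."""
    return [GRAY[base] for base in seq.upper()]


def ncc(x, y):
    """Zero-mean normalized cross-correlation of two equal-length rows."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den if den > 0 else 0.0


def best_match(query, scene):
    """Slide the query (object) over the scene; return (offset, score)."""
    q, s = encode(query), encode(scene)
    if not q or len(q) > len(s):
        return None
    best_offset, best_score = -1, -2.0  # any real score is >= -1
    for offset in range(len(s) - len(q) + 1):
        score = ncc(q, s[offset:offset + len(q)])
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


if __name__ == "__main__":
    print(best_match("CGTAGGCA", "TTGACCGTAGGCATTACG"))
```

    An exact occurrence of the query gives a correlation of 1 at its offset; mismatches lower the peak rather than eliminating it, which is the property that makes correlation tolerant of mutations.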

  12. Image Correlation Method for DNA Sequence Alignment

    PubMed Central

    Curilem Saldías, Millaray; Villarroel Sassarini, Felipe; Muñoz Poblete, Carlos; Vargas Vásquez, Asticio; Maureira Butler, Iván

    2012-01-01

    The complexity of searches and the volume of genomic data make sequence alignment one of bioinformatics' most active research areas. New alignment approaches have incorporated digital signal processing techniques. Among these, correlation methods are highly sensitive. This paper proposes a novel sequence alignment method based on 2-dimensional images, where each nucleic acid base is represented as a fixed gray intensity pixel. Query and known database sequences are coded to their pixel representation and sequence alignment is handled as an object-recognition-in-a-scene problem. Query and database become object and scene, respectively. An image correlation process is carried out in order to search for the best match between them. Given that this procedure can be implemented in an optical correlator, the correlation could eventually be accomplished at light speed. This paper shows an initial research stage where results were “digitally” obtained by simulating an optical correlation of DNA sequences represented as images. A total of 303 queries (variable lengths from 50 to 4500 base pairs) and 100 scenes represented by 100 x 100 images each (in total, one million base pair database) were considered for the image correlation analysis. The results showed that correlations reached very high sensitivity (99.01%) and specificity (98.99%), and outperformed BLAST when mutation numbers increased. However, digital correlation processes were a hundred times slower than BLAST. We are currently starting an initiative to evaluate the correlation speed of a real experimental optical correlator. By doing this, we expect to fully exploit optical correlation light properties. As the optical correlator works jointly with the computer, digital algorithms should also be optimized. The results presented in this paper are encouraging and support the study of image correlation methods on sequence alignment. PMID:22761742

  13. DNA sequencing by denaturation: experimental proof of concept with an integrated fluidic device

    PubMed Central

    Chen, Ying-Ja; Roller, Eric E.; Huang, Xiaohua

    2010-01-01

    We report the proof of concept of a novel DNA sequencing technology called sequencing by denaturation (SBD). SBD is based on the Sanger sequencing reaction performed on amplified target templates immobilized on a solid surface followed by the denaturation of these Sanger fragments. As these fluorescently labeled fragments denature sequentially, the fluorescence intensities in the four channels corresponding to the four base types are monitored in a flow cell. A sequencing instrument with a microfluidic flowcell has been custom-designed to integrate automated fluidics, temperature control, and fluorescence imaging. The denaturation profiles of several synthetic oligonucleotides were measured with this system and our data demonstrated the ability to sequence short DNA by SBD. SBD is a simple and fast method with the potential to improve the speed and cost of large-scale genome re-sequencing. PMID:20390134

  14. MG-Digger: An Automated Pipeline to Search for Giant Virus-Related Sequences in Metagenomes

    PubMed Central

    Verneau, Jonathan; Levasseur, Anthony; Raoult, Didier; La Scola, Bernard; Colson, Philippe

    2016-01-01

    The number of metagenomic studies conducted each year is growing dramatically. Storage and analysis of such big data is difficult and time-consuming. Interestingly, analysis shows that environmental and human metagenomes include a significant amount of non-annotated sequences, representing a ‘dark matter.’ We established a bioinformatics pipeline that automatically detects metagenome reads matching query sequences from a given set and applied this tool to the detection of sequences matching large and giant DNA viral members of the proposed order Megavirales or virophages. A total of 1,045 environmental and human metagenomes (≈ 1 Terabase) were collected, processed, and stored on our bioinformatics server. In addition, nucleotide and protein sequences from 93 Megavirales representatives, including 19 giant viruses of amoeba, and 5 virophages, were collected. The pipeline was generated by scripts written in Python language and entitled MG-Digger. Metagenomes previously found to contain megavirus-like sequences were tested as controls. MG-Digger was able to annotate 100s of metagenome sequences as best matching those of giant viruses. These sequences were most often found to be similar to phycodnavirus or mimivirus sequences, but included reads related to recently available pandoraviruses, Pithovirus sibericum, and faustoviruses. Compared to other tools, MG-Digger combined stand-alone use on Linux or Windows operating systems through a user-friendly interface, implementation of ready-to-use customized metagenome databases and query sequence databases, adjustable parameters for BLAST searches, and creation of output files containing selected reads with best match identification. Compared to Metavir 2, a reference tool in viral metagenome analysis, MG-Digger detected 8% more true positive Megavirales-related reads in a control metagenome. The present work shows that massive, automated and recurrent analyses of metagenomes are effective in improving knowledge about

  15. MG-Digger: An Automated Pipeline to Search for Giant Virus-Related Sequences in Metagenomes.

    PubMed

    Verneau, Jonathan; Levasseur, Anthony; Raoult, Didier; La Scola, Bernard; Colson, Philippe

    2016-01-01

    The number of metagenomic studies conducted each year is growing dramatically. Storage and analysis of such big data is difficult and time-consuming. Interestingly, analysis shows that environmental and human metagenomes include a significant amount of non-annotated sequences, representing a 'dark matter.' We established a bioinformatics pipeline that automatically detects metagenome reads matching query sequences from a given set and applied this tool to the detection of sequences matching large and giant DNA viral members of the proposed order Megavirales or virophages. A total of 1,045 environmental and human metagenomes (≈ 1 Terabase) were collected, processed, and stored on our bioinformatics server. In addition, nucleotide and protein sequences from 93 Megavirales representatives, including 19 giant viruses of amoeba, and 5 virophages, were collected. The pipeline was generated by scripts written in Python language and entitled MG-Digger. Metagenomes previously found to contain megavirus-like sequences were tested as controls. MG-Digger was able to annotate 100s of metagenome sequences as best matching those of giant viruses. These sequences were most often found to be similar to phycodnavirus or mimivirus sequences, but included reads related to recently available pandoraviruses, Pithovirus sibericum, and faustoviruses. Compared to other tools, MG-Digger combined stand-alone use on Linux or Windows operating systems through a user-friendly interface, implementation of ready-to-use customized metagenome databases and query sequence databases, adjustable parameters for BLAST searches, and creation of output files containing selected reads with best match identification. Compared to Metavir 2, a reference tool in viral metagenome analysis, MG-Digger detected 8% more true positive Megavirales-related reads in a control metagenome. The present work shows that massive, automated and recurrent analyses of metagenomes are effective in improving knowledge about the

  16. Chimeric proteins for detection and quantitation of DNA mutations, DNA sequence variations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    Chimeric proteins having both DNA mutation binding activity and nuclease activity are synthesized by recombinant technology. The proteins are of the general formula A-L-B and B-L-A where A is a peptide having DNA mutation binding activity, L is a linker and B is a peptide having nuclease activity. The chimeric proteins are useful for detection and identification of DNA sequence variations including DNA mutations (including DNA damage and mismatches) by binding to the DNA mutation and cutting the DNA once the DNA mutation is detected.

  17. Determining orientation and direction of DNA sequences

    DOEpatents

    Goodwin, Edwin H.; Meyne, Julianne

    2000-01-01

    Determining orientation and direction of DNA sequences. A method by which fluorescence in situ hybridization can be made strand specific is described. Cell cultures are grown in a medium containing a halogenated nucleotide. The analog is partially incorporated in one DNA strand of each chromatid. This substitution takes place in opposite strands of the two sister chromatids. After staining with the fluorescent DNA-binding dye Hoechst 33258, cells are exposed to long-wavelength ultraviolet light which results in numerous strand nicks. These nicks enable the substituted strand to be denatured and solubilized by heat, treatment with high or low pH aqueous solutions, or by immersing the strands in 2×SSC (0.3 M NaCl + 0.03 M sodium citrate), to name three procedures. It is unnecessary to enzymatically digest the strands using Exo III or another exonuclease in order to excise and solubilize nucleotides starting at the sites of the nicks. The denaturing/solubilizing process removes most of the substituted strand while leaving the prereplication strand largely intact. Hybridization of a single-stranded probe of a tandem repeat arranged in a head-to-tail orientation will result in hybridization only to the chromatid with the complementary strand present.

  18. P46-S A New Method for the Purification of DNA Sequencing Reactions

    PubMed Central

    Amparo, G.; Harrold, M.; Pistacchi, S.

    2007-01-01

    BigDye XTerminator is a new single-tube method for purifying DNA sequencing reactions prior to electrophoretic analysis. Sequencing reactions purified using this method display very few artifacts from residual dye-labeled nucleotides (dye blobs), as well as excellent recovery of the smallest extension products. BigDye XTerminator also has unique workflow advantages over conventional purification schemes such as ethanol precipitation or spin column purifications. BigDye XTerminator is added directly to finished DNA sequencing reactions and mixed. After mixing, the samples purified with BigDye XTerminator can be directly injected from the reaction tube without additional transfer steps. Performance of BigDye XTerminator with a variety of DNA sequencing reactions will be demonstrated. Workflow and automation issues will also be described.

  19. Interlaboratory concordance of DNA sequence analysis to detect reverse transcriptase mutations in HIV-1 proviral DNA. ACTG Sequencing Working Group. AIDS Clinical Trials Group.

    PubMed

    Demeter, L M; D'Aquila, R; Weislow, O; Lorenzo, E; Erice, A; Fitzgibbon, J; Shafer, R; Richman, D; Howard, T M; Zhao, Y; Fisher, E; Huang, D; Mayers, D; Sylvester, S; Arens, M; Sannerud, K; Rasheed, S; Johnson, V; Kuritzkes, D; Reichelderfer, P; Japour, A

    1998-11-01

    Thirteen laboratories evaluated the reproducibility of sequencing methods to detect drug resistance mutations in HIV-1 reverse transcriptase (RT). Blinded, cultured peripheral blood mononuclear cell pellets were distributed to each laboratory. Each laboratory used its preferred method for sequencing proviral DNA. Differences in protocols included: DNA purification; number of PCR amplifications; PCR product purification; sequence/location of PCR/sequencing primers; sequencing template; sequencing reaction label; sequencing polymerase; and use of manual versus automated methods to resolve sequencing reaction products. Five unknowns were evaluated. Thirteen laboratories submitted 39,043 nucleotide assignments spanning codons 10-256 of HIV-1 RT. A consensus nucleotide assignment (defined as agreement among ≥75% of laboratories) could be made in over 99% of nucleotide positions, and was more frequent in the three laboratory isolates. The overall rate of discrepant nucleotide assignments was 0.29%. A consensus nucleotide assignment could not be made at RT codon 41 in the clinical isolate tested. Clonal analysis revealed that this was due to the presence of a mixture of wild-type and mutant genotypes. These observations suggest that sequencing methodologies currently in use in ACTG laboratories to sequence HIV-1 RT yield highly concordant results for laboratory strains; however, more discrepancies among laboratories may occur when clinical isolates are tested.
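
    The consensus rule stated in the abstract (agreement among at least 75% of laboratories) can be expressed as a short sketch. The function names and the toy inputs are hypothetical; the study's own data handling is not described in the abstract.

```python
from collections import Counter


def consensus_call(calls, threshold=0.75):
    """Return the base reported by at least `threshold` of the
    laboratories at one position, or None when no consensus exists."""
    if not calls:
        return None
    base, count = Counter(calls).most_common(1)[0]
    return base if count / len(calls) >= threshold else None


def consensus_sequence(lab_sequences, threshold=0.75):
    """Apply the rule position by position to equal-length sequences,
    one sequence per laboratory."""
    return [consensus_call(column, threshold) for column in zip(*lab_sequences)]


if __name__ == "__main__":
    # Toy example with 13 laboratories at a single position.
    print(consensus_call(["A"] * 10 + ["G"] * 3))  # 10/13 is about 77%
    print(consensus_call(["A"] * 9 + ["G"] * 4))   # 9/13 is about 69%
```

    A position where a sample contains a mixture of genotypes, as at RT codon 41 in the clinical isolate, would tend to split the calls and return no consensus under this rule.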

  20. Development of an Automated DNA Detection System Using an Electrochemical DNA Chip Technology

    NASA Astrophysics Data System (ADS)

    Hongo, Sadato; Okada, Jun; Hashimoto, Koji; Tsuji, Koichi; Nikaido, Masaru; Gemma, Nobuhiro

    A new compact automated DNA detection system, Genelyzer™, has been developed. After a sample solution is injected into a cassette with a built-in electrochemical DNA chip, all processes from the hybridization reaction to detection and analysis run fully automatically. To detect sample DNA, the system measures electrical currents at the electrodes that arise from an oxidation reaction of electrochemically active intercalator molecules bound to hybridized DNA. The intercalator is supplied as a reagent solution by a fluid supply unit of the system. The feasibility test showed that six single nucleotide polymorphisms (SNPs) associated with rheumatoid arthritis (RA) could be typed simultaneously within two hours and that all the results were consistent with those of conventional typing methods. This system is expected to open the way to DNA testing applications such as infectious disease testing, personalized medicine, food inspection, and forensics.

  1. Non-random DNA fragmentation in next-generation sequencing

    NASA Astrophysics Data System (ADS)

    Poptsova, Maria S.; Il'Icheva, Irina A.; Nechipurenko, Dmitry Yu.; Panchenko, Larisa A.; Khodikov, Mingian V.; Oparina, Nina Y.; Polozov, Robert V.; Nechipurenko, Yury D.; Grokhovsky, Sergei L.

    2014-03-01

    Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments and sequencing them in a massively parallel fashion. The multiple overlapping segments, termed "reads", are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, we previously showed that for sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of hydrodynamic forces on DNA produce similar bias. Consideration of this non-random DNA fragmentation may help reveal which factors influence the non-uniform coverage of various genomic regions, and to what extent.
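
    For orientation, the idealized baseline that the random-break assumption implies can be written down as a small sketch. It uses the standard Lander-Waterman approximation (mean coverage c = N·L/G, with the fraction of bases left uncovered approximately e^(-c)); this is a textbook model, not something taken from the paper, and the function names and example numbers are hypothetical. The bias reported in the abstract would show up as departures from this uniform expectation.

```python
import math


def mean_coverage(n_reads, read_len, genome_len):
    """Expected depth per base if reads land uniformly at random."""
    return n_reads * read_len / genome_len


def fraction_uncovered(coverage):
    """Poisson approximation: chance that a given base is hit by no read."""
    return math.exp(-coverage)


def reads_needed(target_coverage, read_len, genome_len):
    """Number of reads required to reach a target mean depth."""
    return math.ceil(target_coverage * genome_len / read_len)


if __name__ == "__main__":
    # Hypothetical example: 5 Mb bacterial genome, 100-base reads.
    c = mean_coverage(1_500_000, 100, 5_000_000)
    print(c, fraction_uncovered(c), reads_needed(30, 100, 5_000_000))
```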

  2. DNA extraction from vegetative tissue for next-generation sequencing.

    PubMed

    Furtado, Agnelo

    2014-01-01

    The quality of extracted DNA is crucial for several applications in molecular biology. If the DNA is to be used for next-generation sequencing (NGS), then microgram quantities of good-quality DNA are required. In addition, the DNA must be substantially of high molecular weight so that it can be used for library preparation and sequencing. Contaminating phenol or starch in the isolated DNA can be easily removed by filtration through kit-based cartridges. In this chapter we describe a simple two-reagent DNA extraction protocol that yields a high quality and quantity of DNA, which can be used for different applications including NGS.

  3. Solid-Phase Purification of Synthetic DNA Sequences.

    PubMed

    Grajkowski, Andrzej; Cieslak, Jacek; Beaucage, Serge L

    2016-08-05

    Although high-throughput methods for solid-phase synthesis of DNA sequences are currently available for synthetic biology applications and technologies for large-scale production of nucleic acid-based drugs have been exploited for various therapeutic indications, little has been done to develop high-throughput procedures for the purification of synthetic nucleic acid sequences. An efficient process for purification of phosphorothioate and native DNA sequences is described herein. This process consists of functionalizing commercial aminopropylated silica gel with aminooxyalkyl functions to enable capture of DNA sequences carrying a 5'-siloxyl ether linker with a "keto" function through an oximation reaction. Deoxyribonucleoside phosphoramidites functionalized with the 5'-siloxyl ether linker were prepared in yields of 75-83% and incorporated last into the solid-phase assembly of DNA sequences. Capture of nucleobase- and phosphate-deprotected DNA sequences released from the synthesis support is demonstrated to proceed near quantitatively. After shorter than full-length DNA sequences were washed from the capture support, the purified DNA sequences were released from this support upon treatment with tetra-n-butylammonium fluoride in dry DMSO. The purity of released DNA sequences exceeds 98%. The scalability and high-throughput features of the purification process are demonstrated without sacrificing purity of the DNA sequences.

  4. Automated degenerate PCR primer design for high-throughput sequencing improves efficiency of viral sequencing.

    PubMed

    Li, Kelvin; Shrivastava, Susmita; Brownley, Anushka; Katzel, Dan; Bera, Jayati; Nguyen, Anh Thu; Thovarai, Vishal; Halpin, Rebecca; Stockwell, Timothy B

    2012-11-06

    In a high-throughput environment, to PCR amplify and sequence a large set of viral isolates from populations that are potentially heterogeneous and continuously evolving, the use of degenerate PCR primers is an important strategy. Degenerate primers allow for the PCR amplification of a wider range of viral isolates with only one set of pre-mixed primers, thus increasing amplification success rates and minimizing the necessity for genome finishing activities. To successfully select a large set of degenerate PCR primers necessary to tile across an entire viral genome and maximize their success, this process is best performed computationally. We have developed a fully automated degenerate PCR primer design system that plays a key role in the J. Craig Venter Institute's (JCVI) high-throughput viral sequencing pipeline. A consensus viral genome, or a set of consensus segment sequences in the case of a segmented virus, is specified using IUPAC ambiguity codes in the consensus template sequence to represent the allelic diversity of the target population. PCR primer pairs are then selected computationally to produce a minimal amplicon set capable of tiling across the full length of the specified target region. As part of the tiling process, primer pairs are computationally screened to meet the criteria for successful PCR with one of two described amplification protocols. The actual sequencing success rates for designed primers for measles virus, mumps virus, human parainfluenza virus 1 and 3, human respiratory syncytial virus A and B and human metapneumovirus are described, where >90% of designed primer pairs were able to consistently successfully amplify >75% of the isolates. Augmenting our previously developed and published JCVI Primer Design Pipeline, we achieved similarly high sequencing success rates with only minor software modifications. The recommended methodology for the construction of the consensus sequence that encapsulates the allelic variation of the targeted
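
    How IUPAC ambiguity codes represent allelic diversity in a consensus template, and what a degenerate primer stands for, can be shown with a minimal sketch. The code table is the standard IUPAC nucleotide alphabet; the function names are hypothetical and this is not the JCVI pipeline's code, which also screens primers against PCR criteria not modeled here.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Reverse lookup: set of observed bases -> single ambiguity code.
CODE_FOR = {frozenset(bases): code for code, bases in IUPAC.items()}


def consensus_code(column):
    """Collapse the bases seen at one alignment column into one code."""
    return CODE_FOR[frozenset(column.upper())]


def degeneracy(primer):
    """Number of distinct oligos in the pre-mixed degenerate primer."""
    total = 1
    for code in primer.upper():
        total *= len(IUPAC[code])
    return total


def expand(primer):
    """List every concrete sequence the degenerate primer represents."""
    return ["".join(p) for p in product(*(IUPAC[c] for c in primer.upper()))]


def matches(primer, target):
    """True if an unambiguous target site is covered by the primer."""
    return len(primer) == len(target) and all(
        t in IUPAC[p] for p, t in zip(primer.upper(), target.upper())
    )


if __name__ == "__main__":
    print(consensus_code("AAGA"), degeneracy("ACRTY"), expand("ARY"))
```

    Each ambiguity position multiplies the number of oligos in the mix, so a design tool has to trade coverage of variant isolates against primer degeneracy.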

  5. What Advances Are Being Made in DNA Sequencing?

    MedlinePlus

    ... of DNA building blocks (nucleotides) in an individual's genetic code, called DNA sequencing, has advanced the study of ... a breakthrough that helped scientists determine the human genetic code, but it is time-consuming and expensive. The ...

  6. An automated sample preparation system with mini-reactor to isolate and process submegabase fragments of bacterial DNA.

    PubMed

    Mollova, Emilia T; Patil, Vishal A; Protozanova, Ekaterina; Zhang, Meng; Gilmanshin, Rudolf

    2009-08-15

    Existing methods for extraction and processing of large fragments of bacterial genomic DNA are manual, time-consuming, and prone to variability in DNA quality and recovery. To solve these problems, we have designed and built an automated fluidic system with a mini-reactor. Balancing flows through and tangential to the ultrafiltration membrane in the reactor, cells and then released DNA can be immobilized and subjected to a series of consecutive processing steps. The steps may include enzymatic reactions, tag hybridization, buffer exchange, and selective removal of cell debris and by-products of the reactions. The system can produce long DNA fragments (up to 0.5 Mb) of bacterial genome restriction digest and perform DNA tagging with fluorescent sequence-specific probes. The DNA obtained is of high purity and floating free in solution, and it can be directly analyzed by pulsed-field gel electrophoresis (PFGE) or used in applications requiring submegabase DNA fragments. PFGE-ready samples of DNA restriction digests can be produced in as little as 2.1 h and require less than 10^8 cells. All fluidic operations are automated except for the injection of the sample and reagents.

  7. Micropreparative capillary gel electrophoresis of DNA: rapid expressed sequence tag library construction.

    PubMed

    Shi, Liang; Khandurina, Julia; Ronai, Zsolt; Li, Bi-Yu; Kwan, Wai King; Wang, Xun; Guttman, András

    2003-01-01

    A capillary gel electrophoresis based automated DNA fraction collection technique was developed to support a novel DNA fragment-pooling strategy for expressed sequence tag (EST) library construction. The cDNA population is first cleaved by BsaJ I and EcoR I restriction enzymes, and then subpooled by selective ligation with specific adapters followed by polymerase chain reaction (PCR) amplification and labeling. Combination of this cDNA fingerprinting method with high-resolution capillary gel electrophoresis separation and precise fractionation of individual cDNA transcript representatives avoids redundant fragment selection and concomitant repetitive sequencing of abundant transcripts. Using a computer-controlled capillary electrophoresis device, the transcript representatives were separated by size and fractions were automatically collected every 30 s into 96-well plates. The high resolving power of the sieving matrix ensured sequencing grade separation of the DNA fragments (i.e., single-base resolution) and successful fraction collection. Performance and precision of the fraction collection procedure were validated by PCR amplification of the collected DNA fragments followed by capillary electrophoresis analysis for size and purity verification. The collected and PCR-amplified transcript representatives, ranging up to several hundred base pairs, were then sequenced to create an EST library.

  8. Microfluidic devices for DNA sequencing: sample preparation and electrophoretic analysis.

    PubMed

    Paegel, Brian M; Blazej, Robert G; Mathies, Richard A

    2003-02-01

    Modern DNA sequencing 'factories' have revolutionized biology by completing the human genome sequence, but in the race to completion we are left with inefficient, cumbersome, and costly macroscale processes and supporting facilities. During the same period, microfabricated DNA sequencing, sample processing and analysis devices have advanced rapidly toward the goal of a 'sequencing lab-on-a-chip'. Integrated microfluidic processing dramatically reduces analysis time and reagent consumption, and eliminates costly and unreliable macroscale robotics and laboratory apparatus. A microfabricated device for high-throughput DNA sequencing that couples clone isolation, template amplification, Sanger extension, purification, and electrophoretic analysis in a single microfluidic circuit is now attainable.

  9. Yeast DNA sequences initiating gene expression in Escherichia coli.

    PubMed

    Lewin, Astrid; Tran, Thi Tuyen; Jacob, Daniela; Mayer, Martin; Freytag, Barbara; Appel, Bernd

    2004-01-01

    DNA transfer between pro- and eukaryotes occurs either during natural horizontal gene transfer or as a result of the employment of gene technology. We analysed the capacity of DNA sequences from a eukaryotic donor organism (Saccharomyces cerevisiae) to serve as promoter region in a prokaryotic recipient (Escherichia coli) by creating fusions between promoterless luxAB genes from Vibrio harveyi and random DNA sequences from S. cerevisiae and measuring the luminescence of transformed E. coli. Fifty-four out of 100 randomly analysed S. cerevisiae DNA sequences caused considerable gene expression in E. coli. Determination of transcription start sites within six selected yeast sequences in E. coli confirmed the existence of bacterial -10 and -35 consensus sequences at appropriate distances upstream from transcription initiation sites. Our results demonstrate that the probability of transcription of transferred eukaryotic DNA in bacteria is extremely high and does not require the insertion of the transferred DNA behind a promoter of the recipient genome.

  10. Automated and Multiplexed Soft Lithography for the Production of Low-Density DNA Microarrays

    PubMed Central

    Fredonnet, Julie; Foncy, Julie; Cau, Jean-Christophe; Séverac, Childérick; François, Jean Marie; Trévisiol, Emmanuelle

    2016-01-01

    Microarrays are established research tools for genotyping, expression profiling, or molecular diagnostics in which DNA molecules are precisely addressed to the surface of a solid support. This study assesses the fabrication of low-density oligonucleotide arrays using an automated microcontact printing device, the InnoStamp 40®. This automated device allows a multiplexed deposition of oligoprobes on a functionalized surface by the use of a MacroStamp™ bearing 64 individual pillars each mounted with 50 circular micropatterns (spots) of 160 µm diameter at 320 µm pitch. Reliability and reuse of the MacroStamp™ were shown to be fast and robust by a simple washing step in 96% ethanol. The low-density microarrays printed on either epoxysilane or dendrimer-functionalized slides (DendriSlides) showed excellent hybridization response with complementary sequences at unusually low probe and target concentrations, since the actual probe density immobilized by this technology was at least 10-fold lower than with the conventional mechanical spotting. In addition, we found a comparable hybridization response in terms of fluorescence intensity between spotted and printed oligoarrays with a 1 nM complementary target by using a 50-fold lower probe concentration to produce the oligoarrays by the microcontact printing method. Taken together, our results lend support to the potential development of this multiplexed microcontact printing technology employing soft lithography as an alternative, cost-competitive tool for fabrication of low-density DNA microarrays. PMID:27681742

  11. DNA sequence determination by hybridization: A strategy for efficient large-scale sequencing

    SciTech Connect

    Drmanac, R.; Drmanac, S.; Strezoska, Z.; Paunesku, T.; Labat, I.; Zeremski, M.; Snoddy, J.; Crkvenjakov, R.; Funkhouser, W.K.; Koop, B.; Hood, L.

    1993-06-11

    The concept of sequencing by hybridization (SBH) makes use of an array of all possible n-nucleotide oligomers (n-mers) to identify n-mers present in an unknown DNA sequence. Computational approaches can then be used to assemble the complete sequence. As a validation of this concept, the sequences of three DNA fragments, 343 base pairs in length, were determined with octamer oligonucleotides. Possible applications of SBH include physical mapping (ordering) of overlapping DNA clones, sequence checking, DNA fingerprinting comparisons of normal and disease-causing genes, and the identification of DNA fragments with particular sequence motifs in complementary DNA and genomic libraries. The SBH techniques may accelerate the mapping and sequencing phases of the human genome project. 22 refs., 3 figs.
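
    The two steps named above (reading out the n-mer content, then assembling computationally) can be sketched on a toy example. This is a simplified illustration under strong assumptions: hybridization is error-free, every (n-1)-mer occurs only once in the target, and n = 4 is used for brevity where the abstract used octamers. The function names are hypothetical.

```python
def kmer_spectrum(seq, n):
    """Return the set of all n-mers present in seq (the idealized array readout)."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def assemble(spectrum):
    """Rebuild a sequence from its n-mer spectrum by following (n-1)-base overlaps.

    Raises ValueError when the spectrum is ambiguous (a repeated (n-1)-mer
    creates a branch point) or has no unique starting n-mer.
    """
    n = len(next(iter(spectrum)))
    by_prefix = {}
    for kmer in spectrum:
        by_prefix.setdefault(kmer[:n - 1], []).append(kmer)
    suffixes = {kmer[1:] for kmer in spectrum}
    # The first n-mer is the one whose prefix is not the suffix of any other n-mer.
    starts = [k for k in spectrum if k[:n - 1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting n-mer")
    current = starts[0]
    result = current
    used = {current}
    while True:
        candidates = [k for k in by_prefix.get(current[1:], []) if k not in used]
        if not candidates:
            break
        if len(candidates) > 1:
            raise ValueError("branch point: repeated (n-1)-mer")
        current = candidates[0]
        used.add(current)
        result += current[-1]  # each step contributes exactly one new base
    if used != spectrum:
        raise ValueError("some n-mers were not placed")
    return result


target = "ATGGCGTGCA"
rebuilt = assemble(kmer_spectrum(target, 4))
```

    Repeats longer than n - 1 bases make the walk ambiguous, which is one reason longer probes or additional information are needed for larger targets.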

  12. DNA Sequence Determination by Hybridization: A Strategy for Efficient Large-Scale Sequencing

    NASA Astrophysics Data System (ADS)

    Drmanac, R.; Drmanac, S.; Strezoska, Z.; Paunesku, T.; Labat, I.; Zeremski, M.; Snoddy, J.; Funkhouser, W. K.; Koop, B.; Hood, L.; Crkvenjakov, R.

    1993-06-01

    The concept of sequencing by hybridization (SBH) makes use of an array of all possible n-nucleotide oligomers (n-mers) to identify n-mers present in an unknown DNA sequence. Computational approaches can then be used to assemble the complete sequence. As a validation of this concept, the sequences of three DNA fragments, 343 base pairs in length, were determined with octamer oligonucleotides. Possible applications of SBH include physical mapping (ordering) of overlapping DNA clones, sequence checking, DNA fingerprinting comparisons of normal and disease-causing genes, and the identification of DNA fragments with particular sequence motifs in complementary DNA and genomic libraries. The SBH techniques may accelerate the mapping and sequencing phases of the human genome project.

  13. A novel constraint for thermodynamically designing DNA sequences.

    PubMed

    Zhang, Qiang; Wang, Bin; Wei, Xiaopeng; Zhou, Changjun

    2013-01-01

    Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap.
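
    For reference, the baseline Hamming-distance constraint that the abstract contrasts with its thermodynamic criterion can be sketched as follows. This shows only the conventional constraint, not the authors' free-energy-gap method or their genetic algorithm; the function names and the example threshold are assumptions.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def satisfies_hamming_constraint(seqs, d_min):
    """True if every pair of sequences in the set differs in at least d_min positions."""
    return all(hamming(a, b) >= d_min for a, b in combinations(seqs, 2))


candidate_set = ["AAAA", "CCCC", "GGGG"]
ok = satisfies_hamming_constraint(candidate_set, d_min=4)
```

    A set can pass this check and still hybridize undesirably, because mismatch counts ignore base-stacking and GC content; that gap is what a thermodynamic constraint is meant to address.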

  14. Highly conserved repetitive DNA sequences are present at human centromeres.

    PubMed Central

    Grady, D L; Ratliff, R L; Robinson, D L; McCanlies, E C; Meyne, J; Moyzis, R K

    1992-01-01

    Highly conserved repetitive DNA sequence clones, largely consisting of (GGAAT)n repeats, have been isolated from a human recombinant repetitive DNA library by high-stringency hybridization with rodent repetitive DNA. This sequence, the predominant repetitive sequence in human satellites II and III, is similar to the essential core DNA of the Saccharomyces cerevisiae centromere, centromere DNA element (CDE) III. In situ hybridization to human telophase and Drosophila polytene chromosomes shows localization of the (GGAAT)n sequence to centromeric regions. Hyperchromicity studies indicate that the (GGAAT)n sequence exhibits unusual hydrogen bonding properties. The purine-rich strand alone has the same thermal stability as the duplex. Hyperchromicity studies of synthetic DNA variants indicate that all sequences with the composition (AATGN)n exhibit this unusual thermal stability. DNA-mobility-shift assays indicate that specific HeLa-cell nuclear proteins recognize this sequence with a relative affinity greater than 10^5. The extreme evolutionary conservation of this DNA sequence, its centromeric location, its unusual hydrogen bonding properties, its high affinity for specific nuclear proteins, and its similarity to functional centromeres isolated from yeast suggest that this sequence may be a component of the functional human centromere. PMID:1542662

  15. Evaluation of automated and manual DNA purification methods for detecting Ricinus communis DNA during ricin investigations.

    PubMed

    Hutchins, Anne S; Astwood, Michael J; Saah, J Royden; Michel, Pierre A; Newton, Bruce R; Dauphin, Leslie A

    2014-03-01

    In April of 2013, letters addressed to the President of the United States and other government officials were intercepted and found to be contaminated with ricin, heightening awareness about the need to evaluate laboratory methods for detecting ricin. This study evaluated commercial DNA purification methods for isolating Ricinus communis DNA as measured by real-time polymerase chain reaction (PCR). Four commercially available DNA purification methods (two automated, MagNA Pure compact and MagNA Pure LC, and two manual, MasterPure complete DNA and RNA purification kit and QIAamp DNA blood mini kit) were evaluated. We compared their ability to purify detectable levels of R. communis DNA from four different sample types, including crude preparations of ricin that could be used for biological crimes or acts of bioterrorism. Castor beans, spiked swabs, and spiked powders were included to simulate sample types typically tested during criminal and public health investigations. Real-time PCR analysis indicated that the QIAamp kit resulted in the greatest sensitivity for ricin preparations; the MasterPure kit performed best with spiked powders. The four methods detected equivalent levels by real-time PCR when castor beans and spiked swabs were used. All four methods yielded DNA free of PCR inhibitors as determined by the use of a PCR inhibition control assay. This study demonstrated that DNA purification methods differ in their ability to purify R. communis DNA; therefore, the purification method used for a given sample type can influence the sensitivity of real-time PCR assays for R. communis. Published by Elsevier Ireland Ltd.

  16. DNA sequencing by denaturation: principle and thermodynamic simulations.

    PubMed

    Chen, Ying-Ja; Huang, Xiaohua

    2009-01-01

    We describe a new DNA sequencing method called sequencing by denaturation (SBD). A Sanger dideoxy sequencing reaction is performed on the templates on a solid surface to generate a ladder of DNA fragments randomly terminated by fluorescently labeled dideoxyribonucleotides. The labeled DNA fragments are sequentially denatured from the templates and the process is monitored by measuring the change in fluorescence intensities from the surface. By analyzing the denaturation profiles, the base sequence of the template can be determined. Using thermodynamic principles, we simulated the denaturation profiles of a series of oligonucleotides ranging from 12 to 32 bases and developed a base-calling algorithm to decode the sequences. These simulations demonstrate that DNA molecules up to 20 bases can be sequenced by SBD. Experimental measurements of the melting profiles of DNA fragments in solution confirm that DNA sequences can be determined by SBD. The potential limitations and advantages of SBD are discussed. With SBD, millions of sequencing reactions can be performed on a small area on a surface in parallel with a very small amount of sequencing reagents. Therefore, DNA sequencing by SBD could potentially result in a significant increase in speed and reduction in cost in large-scale genome resequencing.

  17. Nanopores: A journey towards DNA sequencing

    PubMed Central

    Wanunu, Meni

    2013-01-01

    More than ever, nucleic acids are recognized as key building blocks in many of life's processes, and the science of studying these molecular wonders at the single-molecule level is thriving. A new method of doing so was introduced in the mid-1990s. This method is exceedingly simple: a nanoscale pore that spans an impermeable thin membrane is placed between two chambers that contain an electrolyte, and voltage is applied across the membrane using two electrodes. These conditions lead to a steady stream of ion flow across the pore. Nucleic acid molecules in solution can be driven through the pore, and structural features of the biomolecules are observed as measurable changes in the trans-membrane ion current. In essence, a nanopore is a high-throughput ion microscope and a single-molecule force apparatus. Nanopores are taking center stage as a tool that promises to read a DNA sequence, and this promise has resulted in overwhelming academic, industrial, and national interest. Regardless of the fate of future nanopore applications, in the process of this 16-year-long exploration, many studies have validated the indispensability of nanopores in the toolkit of single-molecule biophysics. This review surveys past and current studies related to nucleic acid biophysics, and will hopefully provoke a discussion of immediate and future prospects for the field. PMID:22658507

  18. Advances in high throughput DNA sequence data compression.

    PubMed

    Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz

    2016-06-01

    Advances in high throughput sequencing technologies and reduction in cost of sequencing have led to exponential growth in high throughput DNA sequence data. This growth has posed challenges such as storage, retrieval, and transmission of sequencing data. Data compression is used to cope with these challenges. Various methods have been developed to compress genomic and sequencing data. In this article, we present a comprehensive review of compression methods for genome and reads compression. Algorithms are categorized as referential or reference free. Experimental results and comparative analysis of various methods for data compression are presented. Finally, key challenges and research directions in DNA sequence data compression are highlighted.

  19. Configuring the Orion Guidance, Navigation, and Control Flight Software for Automated Sequencing

    NASA Technical Reports Server (NTRS)

    Odegard, Ryan G.; Siliwinski, Tomasz K.; King, Ellis T.; Hart, Jeremy J.

    2010-01-01

    The Orion Crew Exploration Vehicle is being designed with greater automation capabilities than any other crewed spacecraft in NASA's history. The Guidance, Navigation, and Control (GN&C) flight software architecture is designed to provide a flexible and evolvable framework that accommodates increasing levels of automation over time. Within the GN&C flight software, a data-driven approach is used to configure software. This approach allows data reconfiguration and updates to automated sequences without requiring recompilation of the software. Because of the great dependency of the automation and the flight software on the configuration data, the data management is a vital component of the processes for software certification, mission design, and flight operations. To enable the automated sequencing and data configuration of the GN&C subsystem on Orion, a desktop database configuration tool has been developed. The database tool allows the specification of the GN&C activity sequences, the automated transitions in the software, and the corresponding parameter reconfigurations. These aspects of the GN&C automation on Orion are all coordinated via data management, and the database tool provides the ability to test the automation capabilities during the development of the GN&C software. In addition to providing the infrastructure to manage the GN&C automation, the database tool has been designed with capabilities to import and export artifacts for simulation analysis and documentation purposes. Furthermore, the database configuration tool, currently used to manage simulation data, is envisioned to evolve into a mission planning tool for generating and testing GN&C software sequences and configurations. A key enabler of the GN&C automation design, the database tool allows both the creation and maintenance of the data artifacts, as well as serving the critical role of helping to manage, visualize, and understand the data-driven parameters both during software development
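
    The data-driven idea described above (activity sequences, automated transitions, and parameter reconfigurations held in data rather than in compiled code) can be sketched generically. Nothing here reflects the actual Orion flight software or its database tool: the JSON layout, the activity names, the event names, and the `Sequencer` class are all hypothetical.

```python
import json

# Hypothetical configuration: editing this data changes the sequence
# without touching the Sequencer code below.
CONFIG_JSON = """
{
  "initial": "coast",
  "activities": {
    "coast": {"params": {"gain": 0.1}, "next": "burn",  "when": "burn_ready"},
    "burn":  {"params": {"gain": 0.8}, "next": "entry", "when": "burn_complete"},
    "entry": {"params": {"gain": 0.5}, "next": null,    "when": null}
  }
}
"""


class Sequencer:
    """Steps through activities defined entirely by configuration data."""

    def __init__(self, config):
        self.activities = config["activities"]
        self.current = config["initial"]
        self.params = dict(self.activities[self.current]["params"])

    def step(self, events):
        """Advance to the next activity if the current trigger event is present."""
        activity = self.activities[self.current]
        trigger = activity.get("when")
        if trigger is not None and trigger in events:
            self.current = activity["next"]
            # Apply the parameter reconfiguration tied to the new activity.
            self.params.update(self.activities[self.current]["params"])
        return self.current


seq = Sequencer(json.loads(CONFIG_JSON))
trace = [seq.step(set()), seq.step({"burn_ready"}), seq.step({"burn_complete"})]
```

    Because behavior depends on the data, the configuration itself has to be validated and version-controlled as carefully as code, which is the data-management concern the abstract raises.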

  1. Sequence Recognition in the Pairing of DNA Duplexes

    NASA Astrophysics Data System (ADS)

    Kornyshev, A. A.; Leikin, S.

    2001-04-01

    Pairing of DNA fragments with homologous sequences occurs in gene shuffling, DNA repair, and other vital processes. While chemical individuality of base pairs is hidden inside the double helix, x-ray and NMR studies revealed sequence-dependent modulation of the structure of the DNA backbone. Here we show that the resulting modulation of the DNA surface charge pattern enables duplexes longer than ~50 base pairs to recognize sequence homology electrostatically at a distance of up to several water layers. This may explain the local recognition observed in pairing of homologous chromosomes and the observed length dependence of homologous recombination.

  2. Detection of DNA Methylation by Whole-Genome Bisulfite Sequencing.

    PubMed

    Li, Qing; Hermanson, Peter J; Springer, Nathan M

    2018-01-01

    DNA methylation plays an important role in the regulation of the expression of transposons and genes. Various methods have been developed to assay DNA methylation levels. Bisulfite sequencing is considered to be the "gold standard" for single-base resolution measurement of DNA methylation levels. Coupled with next-generation sequencing, whole-genome bisulfite sequencing (WGBS) allows DNA methylation to be evaluated at a genome-wide scale. Here, we describe a protocol for WGBS in plant species with large genomes. This protocol has been successfully applied to assay genome-wide DNA methylation levels in maize and barley. This protocol has also been successfully coupled with sequence capture technology to assay DNA methylation levels in a targeted set of genomic regions.

  3. Characterization of group A Streptococcus strains recovered from Mexican children with pharyngitis by automated DNA sequencing of virulence-related genes: unexpectedly large variation in the gene (sic) encoding a complement-inhibiting protein.

    PubMed Central

    Mejia, L M; Stockbauer, K E; Pan, X; Cravioto, A; Musser, J M

    1997-01-01

    Sequence variation was studied in several target genes in 54 strains of group A Streptococcus (GAS) cultured from children with pharyngitis in Mexico City. Although 16 distinct emm alleles were identified, only 4 had not been previously described. Virtually all bacteria (31 of 33 [94%]) with the streptococcal pyrogenic exotoxin gene (speA) had emm1-related, emm3, or emm6 alleles. The gene (sic) encoding an extracellular GAS protein that inhibits complement function was unusually variable among isolates with the emm1 family of alleles, with a total of seven variants identified. The data suggest that many GAS strains infecting Mexican children are genetically similar to organisms commonly encountered in the United States and western Europe. Sequence variation in the sic gene is useful for rapid differentiation among GAS isolates with the emm1 family of alleles. PMID:9399523

  4. Chimeric DNA methyltransferases target DNA methylation to specific DNA sequences and repress expression of target genes

    PubMed Central

    Li, Fuyang; Papworth, Monika; Minczuk, Michal; Rohde, Christian; Zhang, Yingying; Ragozin, Sergei; Jeltsch, Albert

    2007-01-01

    Gene silencing by targeted DNA methylation has potential applications in basic research and therapy. To establish targeted methylation in human cell lines, the catalytic domains (CDs) of mouse Dnmt3a and Dnmt3b DNA methyltransferases (MTases) were fused to different DNA binding domains (DBD) of GAL4 and an engineered Cys2His2 zinc finger domain. We demonstrated that (i) Dense DNA methylation can be targeted to specific regions in gene promoters using chimeric DNA MTases. (ii) Site-specific methylation leads to repression of genes controlled by various cellular or viral promoters. (iii) Mutations affecting any of the DBD, MTase or target DNA sequences reduce targeted methylation and gene silencing. (iv) Targeted DNA methylation is effective in repressing Herpes Simplex Virus type 1 (HSV-1) infection in cell culture with the viral titer reduced by at least 18-fold in the presence of an MTase fused to an engineered zinc finger DBD, which binds a single site in the promoter of HSV-1 gene IE175k. In short, we show here that it is possible to direct DNA MTase activity to predetermined sites in DNA, achieve targeted gene silencing in mammalian cell lines and interfere with HSV-1 propagation. PMID:17151075

  5. Laser Desorption Mass Spectrometry for DNA Sequencing and Analysis

    NASA Astrophysics Data System (ADS)

    Chen, C. H. Winston; Taranenko, N. I.; Golovlev, V. V.; Isola, N. R.; Allman, S. L.

    1998-03-01

    Rapid DNA sequencing and/or analysis is critically important for biomedical research. In the past, gel electrophoresis has been the primary tool to achieve DNA analysis and sequencing. However, gel electrophoresis is a time-consuming and labor-intensive process. Recently, we have developed and used laser desorption mass spectrometry (LDMS) to achieve sequencing of ss-DNA longer than 100 nucleotides. With LDMS, we succeeded in sequencing DNA in seconds instead of the hours or days required by gel electrophoresis. In addition to sequencing, we also applied LDMS for the detection of DNA probes for hybridization. LDMS was also used to detect short tandem repeats for forensic applications. Clinical applications for disease diagnosis such as cystic fibrosis caused by base deletion and point mutation have also been demonstrated. Experimental details will be presented in the meeting.

  6. Food Fish Identification from DNA Extraction through Sequence Analysis

    ERIC Educational Resources Information Center

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed 3rd- and 4th-year undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  7. cDNA cloning and sequencing of tarantula hemocyanin subunits.

    PubMed

    Voit, R; Feldmaier-Fuchs, G

    1990-01-01

    Tarantula heart cDNA libraries were screened with synthetic oligonucleotide probes deduced from the highly conserved amino acid sequences of the two copper-binding sites, copper A and copper B, found in chelicerate hemocyanins. Positive cDNA clones could be obtained and four different cDNA types were characterized.

  8. Scanning probe and nanopore DNA sequencing: core techniques and possibilities.

    PubMed

    Lund, John; Parviz, Babak A

    2009-01-01

    We provide an overview of the current state of research towards DNA sequencing using nanopore and scanning probe techniques. Additionally, we provide methods for the creation of two key experimental platforms for studies relating to nanopore and scanning probe DNA studies: a synthetic nanopore apparatus and an atomically flat conductive substrate with stretched DNA molecules.

  10. A 4D representation of DNA sequences and its application

    NASA Astrophysics Data System (ADS)

    Liao, Bo; Tan, Mingshu; Ding, Kequan

    2005-02-01

    A 4D representation of DNA sequences has been derived for mathematical denotation of DNA sequence. The 4D representation also avoids the loss of information accompanying alternative 2D and 3D representations. The geometrical centers of the 4D graph of DNA sequences indicate the distribution of base frequencies. An interesting phenomenon is observed for Goat and Gallus β-globin genomes with high G + C content. The examination of similarities/dissimilarities among the coding sequences of the first exon of the β-globin gene of different species illustrates the utility of the approach.

  11. DNA sequence mapping by fluorescence in situ hybridization

    SciTech Connect

    Brandriff, B.F.; Gordon, L.A.; Trask, B.J.

    1991-01-01

    Various types of DNA probes, such as total genomic DNA, repetitive sequences, unique sequences, and composites of chromosome-specific DNA probes, can be used with fluorescence in situ hybridization (FISH) techniques to address research questions having to do with localization, mapping, and distribution of DNA in situ. FISH involves the formation of a heteroduplex between such DNA probes and chromatin targets on a microscope slide, which can be visualized with fluorescent reporter molecules. Three chromatin targets - metaphase chromosomes, somatic interphases, and zygote interphases - offer increasingly extended states of chromatin which can be strategically selected, individually or in combination, to address specific research questions of interest.

  12. Characteristics of cloned repeated DNA sequences in the barley genome

    SciTech Connect

    Anan'ev, E.V.; Bochkanov, S.S.; Ryzhik, M.V.; Sonina, N.V.; Chernyshev, A.I.; Shchipkova, N.I.; Yakovleva, E.Yu.

    1986-12-01

    A partial clone library of barley DNA fragments based on plasmid pBR325 was created. The cloned EcoRI-fragments of chromosomal DNA are from 2 to 14 kbp in length. More than 95% of the barley DNA inserts comprise repeated sequences of different complexity and copy number. Certain of these DNA sequences are from families comprising at least 1% of the barley genome. A significant proportion of the clones hybridize with numerous sets of restriction fragments of genome DNA and they are dispersed throughout the barley chromosomes.

  13. Affordable Hands-On DNA Sequencing and Genotyping: An Exercise for Teaching DNA Analysis to Undergraduates

    ERIC Educational Resources Information Center

    Shah, Kushani; Thomas, Shelby; Stein, Arnold

    2013-01-01

    In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C…

  15. DNA polymerases drive DNA sequencing-by-synthesis technologies: both past and present.

    PubMed

    Chen, Cheng-Yao

    2014-01-01

    Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. Escherichia coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger's dideoxy chain-terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific, genome sequence information accessible via public database. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today's standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides respectively. Third generation, real-time single molecule sequencing platform requires slightly different enzyme properties. Enterobacterial phage ϕ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, ϕ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore-, and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies.

  16. DNA polymerase having modified nucleotide binding site for DNA sequencing

    DOEpatents

    Tabor, Stanley; Richardson, Charles

    1997-01-01

    Modified gene encoding a modified DNA polymerase wherein the modified polymerase incorporates dideoxynucleotides at least 20-fold better compared to the corresponding deoxynucleotides as compared with the corresponding naturally-occurring DNA polymerase.

  17. DNA polymerase having modified nucleotide binding site for DNA sequencing

    DOEpatents

    Tabor, S.; Richardson, C.

    1997-03-25

    A modified gene encoding a modified DNA polymerase is disclosed. The modified polymerase incorporates dideoxynucleotides at least 20-fold better compared to the corresponding deoxynucleotides as compared with the corresponding naturally-occurring DNA polymerase. 6 figs.

  18. Mapping DNA polymerase errors by single-molecule sequencing

    SciTech Connect

    Lee, David F.; Lu, Jenny; Chang, Seungwoo; Loparo, Joseph J.; Xie, Xiaoliang S.

    2016-05-16

    Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replication product is tagged with a unique nucleotide sequence before amplification. Here, this allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases.
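
    The barcoding strategy described above can be sketched as grouping reads by barcode and taking a per-position majority vote, so that an error seen in only one read of a product is discarded while a change shared by all reads of that product is kept. This is a simplified illustration, not the authors' pipeline: reads are assumed to be pre-trimmed, equal in length, and free of indels, and the function name and `min_reads` threshold are hypothetical.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=3):
    """Collapse (barcode, sequence) reads into one consensus sequence per barcode.

    Barcodes with fewer than `min_reads` reads, or with reads of unequal
    length, are skipped rather than guessed at.
    """
    groups = defaultdict(list)
    for barcode, seq in reads:
        groups[barcode].append(seq)

    consensus = {}
    for barcode, seqs in groups.items():
        if len(seqs) < min_reads:
            continue
        if any(len(s) != len(seqs[0]) for s in seqs):
            continue
        # Majority base at each position across all reads of this product.
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


reads = [
    ("AAC", "ACGT"), ("AAC", "ACGT"), ("AAC", "ACTT"),  # one read carries a sequencing error
    ("GGT", "ACCT"), ("GGT", "ACCT"), ("GGT", "ACCT"),  # change shared by every read of the product
    ("TTT", "ACGT"),                                    # too few reads to call
]
calls = consensus_by_barcode(reads)
```

    Comparing each consensus against the known template (here, hypothetically, ACGT) would then flag the GGT product as carrying a candidate polymerase error at the third position.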

  19. Neandertal DNA sequences and the origin of modern humans.

    PubMed

    Krings, M; Stone, A; Schmitz, R W; Krainitzki, H; Stoneking, M; Pääbo, S

    1997-07-11

    DNA was extracted from the Neandertal-type specimen found in 1856 in western Germany. By sequencing clones from short overlapping PCR products, a hitherto unknown mitochondrial (mt) DNA sequence was determined. Multiple controls indicate that this sequence is endogenous to the fossil. Sequence comparisons with human mtDNA sequences, as well as phylogenetic analyses, show that the Neandertal sequence falls outside the variation of modern humans. Furthermore, the age of the common ancestor of the Neandertal and modern human mtDNAs is estimated to be four times greater than that of the common ancestor of human mtDNAs. This suggests that Neandertals went extinct without contributing mtDNA to modern humans.

  20. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    NASA Astrophysics Data System (ADS)

    Chechetkin, V. R.; Lobzin, V. V.

    2004-07-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It may efficiently be applied to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for the local and non-local structural segmentation in DNA sequences. The results are illustrated with the model examples and analysis of intervening exon-intron segments in the protein-coding regions.
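
    As a rough illustration of the quantity described here, the following Python sketch computes a Shannon entropy over the normalized Fourier power spectrum of one base's indicator sequence. The normalization is an assumption made for this example and may differ from the one used by the authors; `spectral_entropy` and the two test sequences are hypothetical. Lower values indicate that the spectral power is concentrated in a few harmonics, that is, a more ordered sequence.

```python
import cmath
import math

def spectral_entropy(seq, base):
    """Shannon entropy of the normalized power spectrum of one base's indicator sequence."""
    n = len(seq)
    indicator = [1.0 if s == base else 0.0 for s in seq]
    # Structure factors |rho(q_k)|^2 / n for the nonzero harmonics k = 1..n-1.
    power = []
    for k in range(1, n):
        rho = sum(x * cmath.exp(-2j * math.pi * k * m / n)
                  for m, x in enumerate(indicator))
        power.append(abs(rho) ** 2 / n)
    total = sum(power)
    if total == 0:
        return 0.0  # base absent: the spectrum is identically zero
    probs = [p / total for p in power]
    return -sum(p * math.log(p) for p in probs if p > 0)

periodic = "ACG" * 8                  # period-3 repeat: power sits in two harmonics
mixed = "ACGGTCAATGCCTAGGATCCAGTT"    # same length, irregular placement of A
```

    The direct sum costs O(N^2) and is meant only for short sequences; an FFT would be the usual choice for genomic lengths.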

  1. Application of 2-D graphical representation of DNA sequence

    NASA Astrophysics Data System (ADS)

    Liao, Bo; Tan, Mingshu; Ding, Kequan

    2005-10-01

    Recently, we proposed a 2-D graphical representation of DNA sequence [Bo Liao, A 2-D graphical representation of DNA sequence, Chem. Phys. Lett. 401 (2005) 196-199]. Based on this representation, we consider properties of mutations and compute the similarities among 11 mitochondrial sequences belonging to different species. The elements of the similarity matrix are used to construct a phylogenetic tree. Unlike most existing phylogeny construction methods, the proposed method does not require multiple alignment.

  2. Evolution of a complex minisatellite DNA sequence.

    PubMed

    Barros, Paula; Blanco, Miguel G; Boán, Francisco; Gómez-Márquez, Jaime

    2008-11-01

    Minisatellites are tandem repeats of short DNA units widely distributed in genomes. However, the information on their dynamics in a phylogenetic context is very limited. Here we have studied the organization of the MsH43 locus in several species of primates and from these data we have reconstructed the evolutionary history of this complex minisatellite. Overall, with the exception of gibbon, MsH43 has an organization that is asymmetric, since the distribution of repeats is distinct between the 5' and 3' halves, and heterogeneous since there are many different repeats, some of them characteristic of each species. Inspection of the MsH43 arrays showed the existence of many duplications and deletions, suggesting the implication of slippage processes in the generation of polymorphism. Concerning the evolutionary history of this minisatellite, we propose that the birth of MsH43 may be situated before the divergence of Old World Monkeys since we found the existence of some MsH43 repeat motifs in prosimians and New World Monkeys. The analysis of MsH43 in apes revealed the existence of an evolutionary breakpoint in the pathway that originated African great apes and humans. Remarkably, human MsH43 is more homologous to orang-utan than to the corresponding sequence in gorilla and chimpanzee. This finding does not comply with the evolutionary paradigm that continuous alterations occur during the course of genome evolution. To adjust our results to the standard phylogeny of primates, we propose the existence of a wandering allele that was maintained almost unaltered during the period that extends between orang-utan and humans.

  3. Cross-utilizing hyperchaotic and DNA sequences for image encryption

    NASA Astrophysics Data System (ADS)

    Zhan, Kun; Wei, Dong; Shi, Jinhui; Yu, Jun

    2017-01-01

    The hyperchaotic sequence and the DNA sequence are utilized jointly for image encryption. A four-dimensional hyperchaotic system is used to generate a pseudorandom sequence. The main idea is to apply the hyperchaotic sequence to almost all steps of the encryption. All intensity values of an input image are converted to a serial binary digit stream, and the bitstream is scrambled globally by the hyperchaotic sequence. DNA algebraic operation and complementation are performed between the hyperchaotic sequence and the DNA sequence to obtain a robust encryption performance. The experiment results demonstrate that the encryption algorithm achieves the performance of the state-of-the-art methods in terms of quality, security, and robustness against noise and cropping attacks.

  4. An auditory display tool for DNA sequence analysis.

    PubMed

    Temple, Mark D

    2017-04-24

    DNA Sonification refers to the use of an auditory display to convey the information content of DNA sequence data. Six sonification algorithms are presented that each produce an auditory display. These algorithms are logically designed from the simple through to the more complex. Three of these parse individual nucleotides, nucleotide pairs or codons into musical notes to give rise to 4, 16 or 64 notes, respectively. Codons may also be parsed degenerately into 20 notes with respect to the genetic code. Lastly, nucleotide pairs can be parsed as two separate frames or codons can be parsed as three reading frames giving rise to multiple streams of audio. The most informative sonification algorithm reads the DNA sequence as codons in three reading frames to produce three concurrent streams of audio in an auditory display. This approach is advantageous since start and stop codons in any frame directly start or stop the audio in that frame, leaving the other frames unaffected. Using these methods, DNA sequences such as open reading frames or repetitive DNA sequences can be distinguished from one another. These sonification tools are available through a webpage interface in which an input DNA sequence can be processed in real time to produce an auditory display playable directly within the browser. The potential of this approach as an analytical tool is discussed with reference to auditory displays derived from test sequences including simple nucleotide sequences, repetitive DNA sequences and coding or non-coding genes. This study presents a proof-of-concept that some properties of a DNA sequence can be identified through sonification alone and argues for their inclusion within the toolkit of DNA sequence browsers as an adjunct to existing visual and analytical tools.
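
    The three-reading-frame scheme can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's actual algorithm: the codon-to-note table `CODON_TO_NOTE` simply numbers the 64 codons 0..63, `frame_streams` is a hypothetical name, and the choice to play the start codon but silence the stop codon is a guess. Input is assumed to be uppercase A, C, G, T only.

```python
from itertools import product

# Hypothetical mapping: each of the 64 codons gets a note index 0..63.
CODON_TO_NOTE = {"".join(c): i for i, c in enumerate(product("ACGT", repeat=3))}
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"

def frame_streams(seq):
    """Return three note streams, one per reading frame; None marks silence."""
    streams = []
    for offset in range(3):
        playing = False
        notes = []
        for i in range(offset, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon == START_CODON:
                playing = True       # a start codon switches this frame's audio on
            elif codon in STOP_CODONS:
                playing = False      # a stop codon switches it off again
            notes.append(CODON_TO_NOTE[codon] if playing else None)
        streams.append(notes)
    return streams

streams = frame_streams("ATGGCCTAAGG")
```

    Each frame keeps its own `playing` flag, which is what leaves the other two frames unaffected when one frame meets a start or stop codon.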

  5. Presence of Bacterial Phage-Like DNA Sequences in Commercial Taq DNA Polymerase Reagents

    PubMed Central

    Newsome, Tamara; Li, Bing-Jie; Zou, Nianxiang; Lo, Shyh-Ching

    2004-01-01

    Many studies have reported the presence of bacterial DNA contamination in commercial Taq DNA polymerase reagents. This is the first report of the presence of phage-like DNA sequences in certain commercial Taq DNA polymerase reagents. Precautions are needed when using amplification reagents with exogenous DNAs. PMID:15131208

  6. Advanced microinstrumentation for rapid DNA sequencing and large DNA fragment separation

    SciTech Connect

    Balch, J.; Davidson, J.; Brewer, L.; Gingrich, J.; Koo, J.; Mariella, R.; Carrano, A.

    1995-01-25

    Our efforts to develop novel technology for a rapid DNA sequencer and large fragment analysis system based upon gel electrophoresis are described. We are using microfabrication technology to build dense arrays of high speed micro electrophoresis lanes that will ultimately increase the sequencing rate of DNA by at least 100 times the rate of current sequencers. We have demonstrated high resolution DNA fragment separation needed for sequencing in polyacrylamide microgels formed in glass microchannels. We have built prototype arrays of microchannels having up to 48 channels. Significant progress has also been made in developing a sensitive fluorescence detection system based upon a confocal microscope design that will enable the diagnostics and detection of DNA fragments in ultrathin microchannel gels. Development of a rapid DNA sequencer and fragment analysis system will have a major impact on future DNA instrumentation used in clinical, molecular and forensic analysis of DNA fragments.

  7. Simulations Using Random-Generated DNA and RNA Sequences

    ERIC Educational Resources Information Center

    Bryce, C. F. A.

    1977-01-01

    Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…
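
    The classroom exercises listed in this record translate naturally into a few lines of modern code. The sketch below uses Python rather than the BASIC of the original program, which is not reproduced here; all function names are hypothetical, and uniform base frequencies are an assumption.

```python
import random
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def random_dna(length, seed=None):
    """Draw each base independently and uniformly from A, C, G, T."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))

def reverse_complement(seq):
    """Complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def base_composition(seq):
    """Fraction of each base in the sequence."""
    counts = Counter(seq)
    return {b: counts[b] / len(seq) for b in "ACGT"}

def codon_counts(seq):
    """Count triplets in the first reading frame, ignoring a trailing partial codon."""
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))

seq = random_dna(300, seed=1)
```

    Passing a fixed `seed` makes a run reproducible, which is convenient when a whole class should work on the same sequence.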

  8. A comparison of methods for forensic DNA extraction: Chelex-100® and the QIAGEN DNA Investigator Kit (manual and automated).

    PubMed

    Phillips, Kirsty; McCallum, Nicola; Welch, Lindsey

    2012-03-01

    Efficient isolation of DNA from a sample is the basis for successful forensic DNA profiling. There are many DNA extraction methods available and they vary in their ability to efficiently extract the DNA, as well as in processing time, operator intervention, contamination risk and ease of use. In recent years, automated robots have been made available which speed up processing time and decrease the amount of operator input. This project was set up to investigate the efficiency of three DNA extraction methods, two manual (Chelex®-100 and the QIAGEN DNA Investigator Kit) and one automated (QIAcube), using both buccal cells and blood stains as the DNA source. Extracted DNA was quantified using real-time PCR in order to assess the amount of DNA present in each sample. Selected samples were then amplified using AmpFlSTR SGM Plus amplification kit. The results suggested that there was no statistical difference between results gained for the different methods investigated, but the automated QIAcube robot made sample processing much simpler and quicker without introducing DNA contamination.

  9. DNA Shape versus Sequence Variations in the Protein Binding Process.

    PubMed

    Chen, Chuanying; Pettitt, B Montgomery

    2016-02-02

    The binding process of a protein with a DNA involves three stages: approach, encounter, and association. It has been known that the complexation of protein and DNA involves mutual conformational changes, especially for a specific sequence association. However, it is still unclear how the conformation and the information in the DNA sequences affect the binding process. What is the extent to which the DNA structure adopted in the complex is induced by protein binding, or is instead intrinsic to the DNA sequence? In this study, we used the multiscale simulation method to explore the binding process of a protein with DNA in terms of DNA sequence, conformation, and interactions. We found that in the approach stage the protein can bind both the major and minor groove of the DNA, but uses different features to locate the binding site. The intrinsic conformational properties of the DNA play a significant role in this binding stage. By comparing the specific DNA with the nonspecific in unbound, intermediate, and associated states, we found that for a specific DNA sequence, ∼40% of the bending in the association forms is intrinsic and that ∼60% is induced by the protein. The protein does not induce appreciable bending of nonspecific DNA. In addition, we proposed that the DNA shape variations induced by protein binding are required in the early stage of the binding process, so that the protein is able to approach, encounter, and form an intermediate at the correct site on DNA. Copyright © 2016 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  10. Multiplexed Sequence Encoding: A Framework for DNA Communication

    PubMed Central

    Zakeri, Bijan; Carr, Peter A.; Lu, Timothy K.

    2016-01-01

    Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication—data encoding, data transfer & data extraction—and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system—Multiplexed Sequence Encoding (MuSE)—that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA. PMID:27050646

  11. [DNA analysis for the post genome-sequencing era].

    PubMed

    Kambara, Hideki

    2002-05-01

    With the completion of human genome sequencing, the new post genome-sequencing era has started. The major task is to clarify the function of genes and to apply this information in medicine and in various industrial fields. Various DNA analysis methods and instruments for gene expression profiling as well as genetic diversity including SNPs typing are required and have been developed. Here, the history and technologies related to DNA analysis including the Wada project in the early 1980s, and the Human genome project from 1990 are described. Various new technologies have been developed in this decade. They include a capillary gel array DNA sequencer, DNA chips, bead probe arrays, a new DNA sequencing method using pyrosequencing and an efficient SNP typing method by BAMPER.

  12. A mathematical model and numerical method for thermoelectric DNA sequencing

    NASA Astrophysics Data System (ADS)

    Shi, Liwei; Guilbeau, Eric J.; Nestorova, Gergana; Dai, Weizhong

    2014-05-01

    Single nucleotide polymorphisms (SNPs) are single base pair variations within the genome that are important indicators of genetic predisposition towards specific diseases. This study explores the feasibility of SNP detection using a thermoelectric sequencing method that measures the heat released when DNA polymerase inserts a deoxyribonucleoside triphosphate into a DNA strand. We propose a three-dimensional mathematical model that governs the DNA sequencing device with a reaction zone that contains DNA template/primer complex immobilized to the surface of the lower channel wall. The model is then solved numerically. Concentrations of reactants and the temperature distribution are obtained. Results indicate that when the nucleoside is complementary to the next base in the DNA template, polymerization occurs lengthening the complementary polymer and releasing thermal energy with a measurable temperature change, implying that the thermoelectric conceptual device for sequencing DNA may be feasible for identifying specific genes in individuals.

  13. DNA Shape Dominates Sequence Affinity in Nucleosome Formation

    NASA Astrophysics Data System (ADS)

    Freeman, Gordon S.; Lequieu, Joshua P.; Hinckley, Daniel M.; Whitmer, Jonathan K.; de Pablo, Juan J.

    2014-10-01

    Nucleosomes provide the basic unit of compaction in eukaryotic genomes, and the mechanisms that dictate their position at specific locations along a DNA sequence are of central importance to genetics. In this Letter, we employ molecular models of DNA and proteins to elucidate various aspects of nucleosome positioning. In particular, we show how DNA's histone affinity is encoded in its sequence-dependent shape, including subtle deviations from the ideal straight B-DNA form and local variations of minor groove width. By relying on high-precision simulations of the free energy of nucleosome complexes, we also demonstrate that, depending on DNA's intrinsic curvature, histone binding can be dominated by bending interactions or electrostatic interactions. More generally, the results presented here explain how sequence, manifested as the shape of the DNA molecule, dominates molecular recognition in the problem of nucleosome positioning.

  14. Translating Sanger-based routine DNA diagnostics into generic massive parallel ion semiconductor sequencing.

    PubMed

    Diekstra, Adinda; Bosgoed, Ermanno; Rikken, Alwin; van Lier, Bart; Kamsteeg, Erik-Jan; Tychon, Marloes; Derks, Ronny C; van Soest, Ronald A; Mensenkamp, Arjen R; Scheffer, Hans; Neveling, Kornelia; Nelen, Marcel R

    2015-01-01

    Dideoxy-based chain termination sequencing developed by Sanger is the gold standard sequencing approach and allows clinical diagnostics of disorders with relatively low genetic heterogeneity. Recently, new next generation sequencing (NGS) technologies have found their way into diagnostic laboratories, enabling the sequencing of large targeted gene panels or exomes. The development of benchtop NGS instruments now allows the analysis of single genes or small gene panels, making these platforms increasingly competitive with Sanger sequencing. We developed a generic automated ion semiconductor sequencing work flow that can be used in a clinical setting and can serve as a substitute for Sanger sequencing. Standard amplicon-based enrichment remained identical to PCR for Sanger sequencing. A novel postenrichment pooling strategy was developed, limiting the number of library preparations and reducing sequencing costs up to 70% compared to Sanger sequencing. A total of 1224 known pathogenic variants were analyzed, yielding an analytical sensitivity of 99.92% and specificity of 99.99%. In a second experiment, a total of 100 patient-derived DNA samples were analyzed using a blind analysis. The results showed an analytical sensitivity of 99.60% and specificity of 99.98%, comparable to Sanger sequencing. Ion semiconductor sequencing can be a first choice mutation scanning technique, independent of the genes analyzed. © 2014 American Association for Clinical Chemistry.

  15. Enrichment by hybridisation of long DNA fragments for Nanopore sequencing

    PubMed Central

    Eckert, Sabine E.; Chan, Jackie Z.-M.; Houniet, Darren; Breuer, Judy

    2016-01-01

    Enrichment of DNA by hybridisation is an important tool which enables users to gather target-focused next-generation sequence data in an economical fashion. Current in-solution methods capture short fragments of around 200–300 nt, potentially missing key structural information such as recombination or translocations often found in viral or bacterial pathogens. The increasing use of long-read third-generation sequencers requires methods and protocols to be adapted for their specific requirements. Here, we present a variation of the traditional bait–capture approach which can selectively enrich large fragments of DNA or cDNA from specific bacterial and viral pathogens, for sequencing on long-read sequencers. We enriched cDNA from cultured influenza virus A, human cytomegalovirus (HCMV) and genomic DNA from two strains of Mycobacterium tuberculosis (M. tb) from a background of cell line or spiked human DNA. We sequenced the enriched samples on the Oxford Nanopore MinION™ and the Illumina MiSeq platform and present an evaluation of the method, together with analysis of the sequence data. We found that unenriched influenza A and HCMV samples had no reads matching the target organism due to the high background of DNA from the cell line used to culture the pathogen. In contrast, enriched samples sequenced on the MinION™ platform had 57 % and 99 % best-quality on-target reads respectively. PMID:28785419

  16. Real-time DNA sequencing from single polymerase molecules.

    PubMed

    Korlach, Jonas; Bjornson, Keith P; Chaudhuri, Bidhan P; Cicero, Ronald L; Flusberg, Benjamin A; Gray, Jeremy J; Holden, David; Saxena, Ravi; Wegener, Jeffrey; Turner, Stephen W

    2010-01-01

    Pacific Biosciences has developed a method for real-time sequencing of single DNA molecules (Eid et al., 2009), with intrinsic sequencing rates of several bases per second and read lengths into the kilobase range. Conceptually, this sequencing approach is based on eavesdropping on the activity of DNA polymerase carrying out template-directed DNA polymerization. Performed in a highly parallel operational mode, sequential base additions catalyzed by each polymerase are detected with terminal phosphate-linked, fluorescence-labeled nucleotides. This chapter will first outline the principle of this single-molecule, real-time (SMRT) DNA sequencing method, followed by descriptions of its underlying components and typical sequencing run conditions. Two examples are provided which illustrate that, in addition to the DNA sequence, the dynamics of DNA polymerization from each enzyme molecule is directly accessible: the determination of base-specific kinetic parameters from single-molecule sequencing reads, and the characterization of DNA synthesis rate heterogeneities. Copyright 2010 Elsevier Inc. All rights reserved.

  17. The number of reduced alignments between two DNA sequences

    PubMed Central

    2014-01-01

    Background: In this study we consider DNA sequences as mathematical strings. Total and reduced alignments between two DNA sequences have been considered in the literature to measure their similarity. Results for explicit representations of some alignments have been already obtained. Results: We present exact, explicit and computable formulas for the number of different possible alignments between two DNA sequences and a new formula for a class of reduced alignments. Conclusions: A unified approach for a wide class of alignments between two DNA sequences has been provided. The formula is computable and, if complemented by software development, will provide a deeper insight into the theory of sequence alignment and give rise to new comparison methods. AMS Subject Classification: Primary 92B05, 33C20; secondary 39A14, 65Q30. PMID:24684679
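
    For orientation, the count of total alignments can be sketched with a short recurrence. Under the common convention that every alignment column is a match/mismatch, a gap in the first sequence, or a gap in the second, the number of alignments of sequences of lengths m and n satisfies f(m, n) = f(m-1, n) + f(m, n-1) + f(m-1, n-1) with f(m, 0) = f(0, n) = 1, which gives the Delannoy numbers. The Python sketch below implements only this standard recurrence; it does not reproduce the paper's explicit formulas or its count of reduced alignments, and the function name is hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def total_alignments(m, n):
    """Number of gapped alignments of two sequences of lengths m and n."""
    if m == 0 or n == 0:
        return 1  # only one way: align the remaining letters against gaps
    return (total_alignments(m - 1, n)        # last column: letter over gap
            + total_alignments(m, n - 1)      # last column: gap over letter
            + total_alignments(m - 1, n - 1)) # last column: letter over letter

small_counts = [total_alignments(k, k) for k in range(1, 4)]
```

    The values grow very quickly with sequence length, which is why closed-form and computable expressions are of interest.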

  18. An Evolution Based Biosensor Receptor DNA Sequence Generation Algorithm

    PubMed Central

    Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M.; Lee, Jaewan; Zang, Yupeng

    2010-01-01

    A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements. PMID:22315543

  19. Visual Automated Fluorescence Electrophoresis Provides Simultaneous Quality, Quantity, and Molecular Weight Spectra for Genomic DNA from Archived Neonatal Blood Spots

    PubMed Central

    Klassen, Tara L.; Drabek, Janice; Tomson, Torjbörn; Sveinsson, Olafur; von Döbeln, Ulrika; Noebels, Jeffrey L.; Goldman, Alicia M.

    2014-01-01

    The Guthrie 903 card archived dried blood spots (DBSs) are a unique but terminal resource amenable for individual and population-wide genomic profiling. The limited amounts of DBS-derived genomic DNA (gDNA) can be whole genome amplified, producing sufficient gDNA for genomic applications, albeit with variable success; optimizing the isolation of high-quality DNA from these finite, low-yield specimens is essential. Agarose gel electrophoresis and spectrophotometry are established postextraction quality control (QC) methods but lack the power to disclose detailed structural, qualitative, or quantitative aspects that underlie gDNA failure in downstream applications. Visual automated fluorescence electrophoresis (VAFE) is a novel QC technology that affords precise quality, quantity, and molecular weight of double-stranded DNA from a single microliter of sample. We extracted DNA from 3-mm DBSs archived in the Swedish Neonatal Repository for >30 years and performed the first quantitative and qualitative analyses of DBS-derived DNA on VAFE, before and after whole genome amplification, in parallel with traditional QC methods. The VAFE QC data were correlated with subsequent sample performance in PCR, sequencing, and high-density comparative genome hybridization array. We observed improved standardization of nucleic acid quantity, quality and integrity, and high performance in the downstream genomic technologies. Addition of VAFE measures in QC increases confidence in the validity of genetic data and allows cost-effective downstream analysis of gDNA for investigational and diagnostic applications. PMID:23518217

  20. Biological nanopore MspA for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Manrao, Elizabeth A.

    Unlocking the information hidden in the human genome provides insight into the inner workings of complex biological systems and can be used to greatly improve health-care. In order to allow for widespread sequencing, new technologies are required that provide fast and inexpensive readings of DNA. Nanopore sequencing is a third generation DNA sequencing technology that is currently being developed to fulfill this need. In nanopore sequencing, a voltage is applied across a small pore in an electrolyte solution and the resulting ionic current is recorded. When DNA passes through the channel, the ionic current is partially blocked. If the DNA bases uniquely modulate the ionic current flowing through the channel, the time trace of the current can be related to the sequence of DNA passing through the pore. There are two main challenges to realizing nanopore sequencing: identifying a pore with sensitivity to single nucleotides and controlling the translocation of DNA through the pore so that the small single nucleotide current signatures are distinguishable from background noise. In this dissertation, I explore the use of Mycobacterium smegmatis porin A (MspA) for nanopore sequencing. In order to determine MspA's sensitivity to single nucleotides, DNA strands of various compositions are held in the pore as the resulting ionic current is measured. DNA is immobilized in MspA by attaching it to a large molecule which acts as an anchor. This technique confirms the single nucleotide resolution of the pore and additionally shows that MspA is sensitive to epigenetic modifications and single nucleotide polymorphisms. The forces from the electric field within MspA, the effective charge of nucleotides, and elasticity of DNA are estimated using a Freely Jointed Chain model of single stranded DNA. These results offer insight into the interactions of DNA within the pore. 
With the nucleotide sensitivity of MspA confirmed, a method is introduced to controllably pass DNA through the pore

  1. Identification of genes in anonymous DNA sequences. Final report: Report period, 15 April 1993--15 April 1994

    SciTech Connect

    Fields, C.A.

    1994-09-01

    This Report concludes the DOE Human Genome Program project, "Identification of Genes in Anonymous DNA Sequence." The central goals of this project have been (1) understanding the problem of identifying genes in anonymous sequences, and (2) development of tools, primarily the automated identification system gm, for identifying genes. The activities supported under the previous award are summarized here to provide a single complete report on the activities supported as part of the project from its inception to its completion.

  2. An Optimal Seed Based Compression Algorithm for DNA Sequences

    PubMed Central

    Gopalakrishnan, Gopakumar; Karunakaran, Muralikrishnan

    2016-01-01

    This paper proposes a seed based lossless compression algorithm to compress a DNA sequence which uses a substitution method that is similar to the Lempel-Ziv compression scheme. The proposed method exploits the repetition structures that are inherent in DNA sequences by creating an offline dictionary which contains all such repeats along with the details of mismatches. By ensuring that only promising mismatches are allowed, the method achieves a compression ratio that is on par with or better than that of the existing lossless DNA sequence compression algorithms. PMID:27555868

  3. DNA Methyltransferase Accessibility Protocol for Individual Templates by Deep Sequencing

    PubMed Central

    Darst, Russell P.; Nabilsi, Nancy H.; Pardo, Carolina E.; Riva, Alberto; Kladde, Michael P.

    2013-01-01

    A single-molecule probe of chromatin structure can uncover dynamic chromatin states and rare epigenetic variants of biological importance that bulk measures of chromatin structure miss. In bisulfite genomic sequencing, each sequenced clone records the methylation status of multiple sites on an individual molecule of DNA. An exogenous DNA methyltransferase can thus be used to image nucleosomes and other protein–DNA complexes. In this chapter, we describe the adaptation of this technique, termed Methylation Accessibility Protocol for individual templates, to modern high-throughput sequencing, which both simplifies the workflow and extends its utility. PMID:22929770

  4. DNA sequence analysis with droplet-based microfluidics

    PubMed Central

    Abate, Adam R.; Hung, Tony; Sperling, Ralph A.; Mary, Pascaline; Rotem, Assaf; Agresti, Jeremy J.; Weiner, Michael A.; Weitz, David A.

    2014-01-01

    Droplet-based microfluidic techniques can form and process micrometer scale droplets at thousands per second. Each droplet can house an individual biochemical reaction, allowing millions of reactions to be performed in minutes with small amounts of total reagent. This versatile approach has been used for engineering enzymes, quantifying concentrations of DNA in solution, and screening protein crystallization conditions. Here, we use it to read the sequences of DNA molecules with a FRET-based assay. Using probes of different sequences, we interrogate a target DNA molecule for polymorphisms. With a larger probe set, additional polymorphisms can be interrogated as well as targets of arbitrary sequence. PMID:24185402

  5. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    NASA Astrophysics Data System (ADS)

    Nielsen, Peter E.

    2008-10-01

    Peptide nucleic acids (PNA) can be designed to target duplex DNA with very high sequence specificity and efficiency via various binding modes. We have designed three domain PNA clamps, that bind stably to predefined decameric homopurine targets in large dsDNA molecules and via a third PNA domain sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technology of protein dsDNA structures.

  6. Current-voltage characteristics of double-strand DNA sequences

    NASA Astrophysics Data System (ADS)

    Bezerril, L. M.; Moreira, D. A.; Albuquerque, E. L.; Fulco, U. L.; de Oliveira, E. L.; de Sousa, J. S.

    2009-09-01

    We use a tight-binding formulation to investigate the transmissivity and the current-voltage (I-V) characteristics of sequences of double-strand DNA molecules. In order to reveal the relevance of the underlying correlations in the nucleotide distribution, we compare the results for the genomic DNA sequence with those of artificial sequences (the long-range correlated Fibonacci and Rudin-Shapiro ones) and a random sequence, which is a kind of prototype of a short-range correlated system. The random sequence is presented here with the same first-neighbor pair correlations as the human DNA sequence. We found that the long-range character of the correlations is important to the transmissivity spectra, although the I-V curves seem to be mostly influenced by the short-range correlations.
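    For readers unfamiliar with the Fibonacci sequence mentioned above, the following is a minimal sketch of how a two-letter Fibonacci word is generated by repeated substitution. The choice of the letters G and C and the function name `fibonacci_word` are illustrative assumptions; the abstract does not state which nucleotides or which substitution rule the authors used.

```python
def fibonacci_word(generations: int, a: str = "G", b: str = "C") -> str:
    """Build a Fibonacci word by applying the substitution a -> ab, b -> a.

    The letters are placeholders (an assumption for illustration only).
    """
    word = a
    for _ in range(generations):
        # Rewrite every symbol of the current word in one pass.
        word = "".join(a + b if ch == a else a for ch in word)
    return word


for g in range(5):
    w = fibonacci_word(g)
    print(g, len(w), w)
```

    The word lengths follow the Fibonacci numbers (1, 2, 3, 5, 8, ...), which is the source of the long-range order in such a chain.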

  7. Sequence-dependent DNA deformability studied using molecular dynamics simulations

    PubMed Central

    Fujii, Satoshi; Kono, Hidetoshi; Takenaka, Shigeori; Go, Nobuhiro; Sarai, Akinori

    2007-01-01

    Proteins recognize specific DNA sequences not only through direct contact between amino acids and bases, but also indirectly based on the sequence-dependent conformation and deformability of the DNA (indirect readout). We used molecular dynamics simulations to analyze the sequence-dependent DNA conformations of all 136 possible tetrameric sequences sandwiched between CGCG sequences. The deformability of dimeric steps obtained from the simulations is consistent with that derived from crystal structures. The simulation results further showed that the conformation and deformability of the tetramers can depend strongly on the flanking base pairs. The xATx tetramers are the most rigid and are not affected by the flanking base pairs, whereas the xYRx tetramers are the most flexible and change their conformations depending on the base pairs at both ends, suggesting that tetramers with the same central dimer can show different deformabilities. These results suggest that analysis of dimeric steps alone may overlook some conformational features of DNA and provide insight into the mechanism of indirect readout during protein–DNA recognition. Moreover, the sequence dependence of DNA conformation and deformability may be used to estimate the contribution of indirect readout to the specificity of protein–DNA recognition as well as nucleosome positioning and large-scale behavior of nucleic acids. PMID:17766249
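    The figure of 136 tetramers can be checked by counting: there are 4^4 = 256 four-base strings, but in double-stranded DNA a sequence and its reverse complement describe the same duplex. The sketch below enumerates the distinct duplexes; the function names are illustrative and are not taken from the paper.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def unique_duplex_kmers(k: int) -> set[str]:
    """Collect one canonical representative per double-stranded k-mer."""
    canonical = set()
    for bases in product("ACGT", repeat=k):
        seq = "".join(bases)
        # A sequence and its reverse complement are the same duplex,
        # so keep only the lexicographically smaller of the two.
        canonical.add(min(seq, reverse_complement(seq)))
    return canonical


print(len(unique_duplex_kmers(4)))  # number of distinct tetramer duplexes
print(len(unique_duplex_kmers(2)))  # number of distinct dimer steps
```

    The same count follows by hand: 16 tetramers are their own reverse complement, so the total is (256 - 16) / 2 + 16 = 136.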

  8. DNA Sequencing by Hexagonal Boron Nitride Nanopore: A Computational Study

    PubMed Central

    Zhang, Liuyang; Wang, Xianqiao

    2016-01-01

    The single molecule detection associated with DNA sequencing has motivated intensive efforts to identify single DNA bases. However, little research has been reported utilizing single-layer hexagonal boron nitride (hBN) for DNA sequencing. Here we employ molecular dynamics simulations to explore pathways for single-strand DNA (ssDNA) sequencing by nanopore on the hBN sheet. We first investigate the adhesive strength between nucleobases and the hBN sheet, which provides the foundation for the hBN-base interaction and nanopore sequencing mechanism. Simulation results show that the purine base has a more remarkable energy profile and affinity than the pyrimidine base on the hBN sheet. The threading of ssDNA through the hBN nanopore can be clearly identified due to their different energy profiles and conformations with circular nanopores on the hBN sheet. The sequencing process is orientation dependent when the shape of the hBN nanopore deviates from the circle. Our results open up a promising avenue to explore the capability of DNA sequencing by hBN nanopore.

  9. Guanine tracts enhance sequence directed DNA bends.

    PubMed Central

    Milton, D L; Casper, M L; Wills, N M; Gesteland, R F

    1990-01-01

    Synthetic DNA fragments were constructed to determine the effect of G tracts, in conjunction with periodically spaced A tracts, on DNA bends. Relative length measurements showed that the G tracts spaced at the half helical turn enhanced the DNA bend. When the G tract was interrupted with a thymine or shortened to one or two guanines, the relative lengths decreased. If the G tract was replaced with either an A tract or a T tract, the bend was cancelled. Replacement with a C tract decreased the relative length to that of a thymine interruption suggesting that bend enhancement due to G tracts requires A tracts on the same strand. PMID:2315040

  10. Plasmonic Nanopores for Trapping, Controlling Displacement, and Sequencing of DNA

    PubMed Central

    2015-01-01

    With the aim of developing a DNA sequencing methodology, we theoretically examine the feasibility of using nanoplasmonics to control the translocation of a DNA molecule through a solid-state nanopore and to read off sequence information using surface-enhanced Raman spectroscopy. Using molecular dynamics simulations, we show that high-intensity optical hot spots produced by a metallic nanostructure can arrest DNA translocation through a solid-state nanopore, thus providing a physical knob for controlling the DNA speed. Switching the plasmonic field on and off can displace the DNA molecule in discrete steps, sequentially exposing neighboring fragments of a DNA molecule to the pore as well as to the plasmonic hot spot. Surface-enhanced Raman scattering from the exposed DNA fragments contains information about their nucleotide composition, possibly allowing the identification of the nucleotide sequence of a DNA molecule transported through the hot spot. The principles of plasmonic nanopore sequencing can be extended to detection of DNA modifications and RNA characterization. PMID:26401685

  11. DNA sequence compression using the burrows-wheeler transform.

    PubMed

    Adjeroh, Don; Zhang, Yong; Mukherjee, Amar; Powell, Matt; Bell, Tim

    2002-01-01

    We investigate off-line dictionary oriented approaches to DNA sequence compression, based on the Burrows-Wheeler Transform (BWT). The preponderance of short repeating patterns is an important phenomenon in biological sequences. Here, we propose off-line methods to compress DNA sequences that exploit the different repetition structures inherent in such sequences. Repetition analysis is performed based on the relationship between the BWT and important pattern matching data structures, such as the suffix tree and suffix array. We discuss how the proposed approach can be incorporated in the BWT compression pipeline.
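    As background for the relationship between the BWT and the suffix array mentioned above, here is a minimal sketch of the transform and its inverse. It uses a naive suffix sort and a `$` sentinel, both of which are simplifying assumptions for illustration; it is not the compression method proposed by the authors.

```python
def suffix_array(text: str) -> list[int]:
    """Naive suffix array: start positions sorted by suffix (fine for short inputs)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler Transform read off the suffix array."""
    assert sentinel not in text, "sentinel must not occur in the input"
    text += sentinel
    # The BWT character for each sorted suffix is the character preceding it;
    # index -1 wraps around to the sentinel for the suffix starting at 0.
    return "".join(text[i - 1] for i in suffix_array(text))


def inverse_bwt(last: str, sentinel: str = "$") -> str:
    """Invert the transform by following the first-column/last-column mapping."""
    # A stable sort of last-column positions yields the first column in order.
    order = sorted(range(len(last)), key=lambda i: last[i])
    first = [last[i] for i in order]
    out, row = [], 0  # row 0 starts with the sentinel
    for _ in range(len(last) - 1):
        row = order[row]
        out.append(first[row])
    return "".join(out)


encoded = bwt("GATTACA")
print(encoded, inverse_bwt(encoded))
```

    Repeated patterns in the input tend to produce runs of identical characters in the transformed string, which is what makes it a useful front end for compression.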

  12. Mapping and sequencing DNA using nanopores and nanodetectors.

    PubMed

    Thompson, John F; Oliver, John S

    2012-12-01

    Even prior to the introduction of capillary DNA sequencers, nanopores were discussed as a low-cost, high-throughput substrate for sequencing. Since then, other next-generation sequencing technologies have been developed and achieved widespread use, but nanopores have lagged behind due to difficulties in generating usable sequence data. The practical and theoretical issues of translocation speed and signal detection encountered when attempting to sequence DNA with nanopores are discussed. Various methods that different laboratories have used to overcome difficulties in biologically based and solid-state nanopores are also presented. Different approaches designed to circumvent the overriding issue of detecting signals from individual bases in a time-resolved manner in nanopores are described. For example, genomic positional sequencing utilizes hybridization of short oligonucleotide probes to very long DNA templates and then detects these probes by variations in current blockade in solid-state nanodetectors. The positions of the probes relative to each other and relative to the ends of the DNA are determined by measuring the time between current blockade peaks. By assembling many such measurements, it is possible to overcome the problems encountered when attempting to sequence DNA at high speed in nanopores, providing the potential for true de novo sequencing of large genomes on a routine basis. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Semiconductor-based DNA sequencing of histone modification states.

    PubMed

    Cheng, Christine S; Rai, Kunal; Garber, Manuel; Hollinger, Andrew; Robbins, Dana; Anderson, Scott; Macbeth, Alyssa; Tzou, Austin; Carneiro, Mauricio O; Raychowdhury, Raktima; Russ, Carsten; Hacohen, Nir; Gershenwald, Jeffrey E; Lennon, Niall; Nusbaum, Chad; Chin, Lynda; Regev, Aviv; Amit, Ido

    2013-01-01

    The recent development of a semiconductor-based, non-optical DNA sequencing technology promises scalable, low-cost and rapid sequence data production. The technology has previously been applied mainly to genomic sequencing and targeted re-sequencing. Here we demonstrate the utility of Ion Torrent semiconductor-based sequencing for sensitive, efficient and rapid chromatin immunoprecipitation followed by sequencing (ChIP-seq) through the application of sample preparation methods that are optimized for ChIP-seq on the Ion Torrent platform. We leverage this method for epigenetic profiling of tumour tissues.

  14. Sequence organisation in nuclear DNA from Physarum polycephalum: methylation of repetitive sequences.

    PubMed

    Whittaker, P A; McLachlan, A; Hardman, N

    1981-02-25

    Nuclear DNA from the slime mould Physarum polycephalum is digested by the restriction endonuclease HpaII to generate a high molecular weight and a low molecular weight component. These are referred to as the M+ and the M- compartment, respectively. Sequences that are present in the M+ compartment are cleaved by MspI, the restriction enzyme isoschizomer of HpaII, thus showing that the recognition sequences for these enzymes in M+ DNA contain methylated CpG doublets. The distribution of repetitive sequences in the M+ and M- DNA compartments was investigated by comparison of the 'fingerprint' patterns of total Physarum DNA and isolated M+ DNA after digestion using different restriction endonucleases, and by probing for the presence of specific repetitive sequences in Southern blots of M+ and M- DNA by the use of cloned DNA segments. Both types of experiment indicate that many repetitive sequences are shared by both compartments, though some repetitive sequences appear to be considerably enriched, or are present exclusively, either in M+ DNA or in M- DNA.

  15. Sequence organisation in nuclear DNA from Physarum polycephalum: methylation of repetitive sequences.

    PubMed Central

    Whittaker, P A; McLachlan, A; Hardman, N

    1981-01-01

    Nuclear DNA from the slime mould Physarum polycephalum is digested by the restriction endonuclease HpaII to generate a high molecular weight and a low molecular weight component. These are referred to as the M+ and the M- compartment, respectively. Sequences that are present in the M+ compartment are cleaved by MspI, the restriction enzyme isoschizomer of HpaII, thus showing that the recognition sequences for these enzymes in M+ DNA contain methylated CpG doublets. The distribution of repetitive sequences in the M+ and M- DNA compartments was investigated by comparison of the 'fingerprint' patterns of total Physarum DNA and isolated M+ DNA after digestion using different restriction endonucleases, and by probing for the presence of specific repetitive sequences in Southern blots of M+ and M- DNA by the use of cloned DNA segments. Both types of experiment indicate that many repetitive sequences are shared by both compartments, though some repetitive sequences appear to be considerably enriched, or are present exclusively, either in M+ DNA or in M- DNA. PMID:6262717

  16. DNA sequencing using polymerase substrate-binding kinetics

    PubMed Central

    Previte, Michael John Robert; Zhou, Chunhong; Kellinger, Matthew; Pantoja, Rigo; Chen, Cheng-Yao; Shi, Jin; Wang, BeiBei; Kia, Amirali; Etchin, Sergey; Vieceli, John; Nikoomanzar, Ali; Bomati, Erin; Gloeckner, Christian; Ronaghi, Mostafa; He, Molly Min

    2015-01-01

    Next-generation sequencing (NGS) has transformed genomic research by decreasing the cost of sequencing. However, whole-genome sequencing is still costly and complex for diagnostics purposes. In the clinical space, targeted sequencing has the advantage of allowing researchers to focus on specific genes of interest. Routine clinical use of targeted NGS mandates inexpensive instruments, fast turnaround time and an integrated and robust workflow. Here we demonstrate a version of the Sequencing by Synthesis (SBS) chemistry that potentially can become a preferred targeted sequencing method in the clinical space. This sequencing chemistry uses natural nucleotides and is based on real-time recording of the differential polymerase/DNA-binding kinetics in the presence of correct or mismatch nucleotides. This ensemble SBS chemistry has been implemented on an existing Illumina sequencing platform with integrated cluster amplification. We discuss the advantages of this sequencing chemistry for targeted sequencing as well as its limitations for other applications. PMID:25612848

  17. ATRF Houses the Latest DNA Sequencing Technologies | Poster

    Cancer.gov

    By Ashley DeVine, Staff Writer. By the end of October, the Advanced Technology Research Facility (ATRF) will be one of the few facilities in the world to house all of the latest DNA sequencing technologies.

  18. Microchannel DNA Sequencing by End-Labelled Free Solution Electrophoresis

    SciTech Connect

    Barron, A.

    2005-09-29

    The further development of End-Labeled Free-Solution Electrophoresis will greatly simplify DNA separation and sequencing on microfluidic devices. The development and optimization of drag-tags is critical to the success of this research.

  20. Characterizing self-similarity in bacteria DNA sequences

    NASA Astrophysics Data System (ADS)

    Lu, Xin; Sun, Zhirong; Chen, Huimin; Li, Yanda

    1998-09-01

    In this paper some parametric methods are introduced to characterize the self-similarity of DNA sequences. Compared with Fourier analysis, these methods perform statistically more stably and yield more reliable results. Using these methods, eight whole genomes of bacteria provided by NCBI are analyzed. Long-range correlation properties in the nucleotide density distribution along these DNA sequences are explored. Estimation results show that the long-range correlation structure prevails through the entire molecule of DNA. Higher order statistics through coarse graining reveal that rather than multifractal, there are only monofractal phenomena presented in the sequences. Hence, the nucleotide density distribution can be modeled asymptotically as fractional Gaussian noise. This result points to a new direction for analyzing and understanding the intrinsic structures of DNA sequences.

  1. Local alignment of two-base encoded DNA sequence

    PubMed Central

    Homer, Nils; Merriman, Barry; Nelson, Stanley F

    2009-01-01

    Background: DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. Results: We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two-base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. Conclusion: The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, and facilitating genome re-sequencing efforts based on this form of sequence data. PMID:19508732
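    To illustrate why decoding is deterministic yet sensitive to measurement errors, the sketch below encodes each adjacent pair of bases as one of four "colors". The abstract does not spell out the code; the XOR rule used here (A=0, C=1, G=2, T=3) is the commonly described color-space scheme and is an assumption, as are the function names.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_colors(seq: str) -> tuple[str, list[int]]:
    """Return the first base plus one color per adjacent base pair."""
    codes = [BASE_TO_BITS[b] for b in seq]
    return seq[0], [a ^ b for a, b in zip(codes, codes[1:])]


def decode_colors(first_base: str, colors: list[int]) -> str:
    """Rebuild the base sequence from the first base and the color calls."""
    code = BASE_TO_BITS[first_base]
    out = [first_base]
    for color in colors:
        code ^= color  # each color is the transition from the previous base
        out.append(BITS_TO_BASE[code])
    return "".join(out)


first, colors = encode_colors("ATGGC")
print(colors, decode_colors(first, colors))

# A single miscalled color corrupts every base downstream of it.
bad = colors.copy()
bad[1] = 2
print(decode_colors(first, bad))
```

    Because one color error changes all subsequent decoded bases, decoding first and aligning afterwards loses information, which motivates decoding and aligning in a single dynamic program as the abstract describes.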

  2. Identifying individuals by sequencing mitochondrial DNA from teeth.

    PubMed

    Ginther, C; Issel-Tarver, L; King, M C

    1992-10-01

    Mitochondrial DNA (mtDNA) was extracted from teeth stored from 3 months to 20 years, including teeth from the semi-skeletonized remains of a murder victim which had been buried for 10 months. Tooth donors and/or their maternal relatives provided blood or buccal cells, from which mtDNA was also extracted. Enzymatic amplification and direct sequencing of roughly 650 nucleotides from two highly polymorphic regions of mtDNA yielded identical sequences for each comparison of tooth and fresh DNA. Our results suggest that teeth provide an excellent source for high molecular weight mtDNA that can be valuable for extending the time in which decomposed human remains can be genetically identified.

  3. Sequence-Specific Molecular Lithography on Single DNA Molecules

    NASA Astrophysics Data System (ADS)

    Keren, Kinneret; Krueger, Michael; Gilad, Rachel; Ben-Yoseph, Gdalyahu; Sivan, Uri; Braun, Erez

    2002-07-01

    Recent advances in the realization of individual molecular-scale electronic devices emphasize the need for novel tools and concepts capable of assembling such devices into large-scale functional circuits. We demonstrated sequence-specific molecular lithography on substrate DNA molecules by harnessing homologous recombination by RecA protein. In a sequence-specific manner, we patterned the coating of DNA with metal, localized labeled molecular objects and grew metal islands on specific sites along the DNA substrate, and generated molecularly accurate stable DNA junctions for patterning the DNA substrate connectivity. In our molecular lithography, the information encoded in the DNA molecules replaces the masks used in conventional microelectronics, and the RecA protein serves as the resist. The molecular lithography works with high resolution over a broad range of length scales from nanometers to many micrometers.

  4. Nucleotide correlations and electronic transport of DNA sequences

    NASA Astrophysics Data System (ADS)

    Albuquerque, E. L.; Vasconcelos, M. S.; Lyra, M. L.; de Moura, F. A. B. F.

    2005-02-01

    We use a tight-binding formulation to investigate the transmissivity and wave-packet dynamics of sequences of single-strand DNA molecules made up from the nucleotides guanine (G), adenine (A), cytosine (C), and thymine (T). In order to reveal the relevance of the underlying correlations in the nucleotide distribution, we compare the results for the genomic DNA sequence with those of two artificial sequences: (i) the Rudin-Shapiro one, which has long-range correlations; (ii) a random sequence, which is a kind of prototype of a short-range correlated system, presented here with the same first-neighbor pair correlations as the human DNA sequence. We found that the long-range character of the correlations is important to the persistence of resonances of finite segments. On the other hand, the wave-packet dynamics seems to be mostly influenced by the short-range correlations.
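    For reference, the Rudin-Shapiro sequence in its standard ±1 form can be generated as below: term n is -1 when the binary expansion of n contains an odd number of (possibly overlapping) "11" pairs, and +1 otherwise. The abstract does not say how the sequence is mapped onto the four nucleotides, so this sketch stops at the ±1 form; the function name is illustrative.

```python
def rudin_shapiro(n_terms: int) -> list[int]:
    """Return the first n_terms of the +1/-1 Rudin-Shapiro sequence."""
    terms = []
    for n in range(n_terms):
        # n & (n >> 1) has one set bit for every adjacent "11" pair in n.
        pairs = bin(n & (n >> 1)).count("1")
        terms.append(-1 if pairs % 2 else 1)
    return terms


print(rudin_shapiro(8))
```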

  5. Single Molecule Electrical Sequencing of DNA and RNA

    NASA Astrophysics Data System (ADS)

    Taniguchi, Masateru

    2013-03-01

    Gating nanopore devices are composed of nanopores with embedded nanoelectrodes, and they are expected to be one of the core devices used to realize label-free, low-cost DNA sequencing, subsequently leading to $1000-genome sequencing technologies. The operating principle of these nanodevices is based on identifying single base molecules of single DNA passing through a nanopore using a tunneling current between nanoelectrodes. We successfully identified single base molecules of DNA and RNA using tunneling currents. To make gating nanopore devices fit for practical use, core technologies should be integrated on one device chip. One core technology is the identification of single DNA and RNA composed of many base molecules using tunneling currents. We have succeeded in the single-molecule electrical sequencing of DNA and RNA formed by 3 and 7 base molecules, respectively, using a hybrid method of identifying single base molecules via a tunneling current and random sequencing. A method that controls the speed of a single DNA passing through a nanopore is one core technology that determines the speed and accuracy of sequencing. We successfully developed a method that controls the translocation speed of a single DNA by three orders of magnitude using a voltage between nanoelectrodes.

  6. Real-Time DNA Sequencing in the Antarctic Dry Valleys Using the Oxford Nanopore Sequencer

    PubMed Central

    Johnson, Sarah S.; Zaikova, Elena; Goerlitz, David S.; Bai, Yu; Tighe, Scott W.

    2017-01-01

    The ability to sequence DNA outside of the laboratory setting has enabled novel research questions to be addressed in the field in diverse areas, ranging from environmental microbiology to viral epidemics. Here, we demonstrate the application of offline DNA sequencing of environmental samples using a hand-held nanopore sequencer in a remote field location: the McMurdo Dry Valleys, Antarctica. Sequencing was performed using a MK1B MinION sequencer from Oxford Nanopore Technologies (ONT; Oxford, United Kingdom) that was equipped with software to operate without internet connectivity. One-direction (1D) genomic libraries were prepared using portable field techniques on DNA isolated from desiccated microbial mats. By adequately insulating the sequencer and laptop, it was possible to run the sequencing protocol for up to 2½ h under arduous conditions. PMID:28337073

  7. Real-Time DNA Sequencing in the Antarctic Dry Valleys Using the Oxford Nanopore Sequencer.

    PubMed

    Johnson, Sarah S; Zaikova, Elena; Goerlitz, David S; Bai, Yu; Tighe, Scott W

    2017-04-01

    The ability to sequence DNA outside of the laboratory setting has enabled novel research questions to be addressed in the field in diverse areas, ranging from environmental microbiology to viral epidemics. Here, we demonstrate the application of offline DNA sequencing of environmental samples using a hand-held nanopore sequencer in a remote field location: the McMurdo Dry Valleys, Antarctica. Sequencing was performed using a MK1B MinION sequencer from Oxford Nanopore Technologies (ONT; Oxford, United Kingdom) that was equipped with software to operate without internet connectivity. One-direction (1D) genomic libraries were prepared using portable field techniques on DNA isolated from desiccated microbial mats. By adequately insulating the sequencer and laptop, it was possible to run the sequencing protocol for up to 2½ h under arduous conditions.

  8. Periodic organisation of foldback sequences in Physarum polycephalum nuclear DNA.

    PubMed

    Hardman, N; Jack, P L

    1978-07-01

    Nuclear DNA from the slime mould Physarum polycephalum is shown to contain interspersed inverted repeat sequences, such that denatured fragments of DNA containing pairs of these sequences form intra-chain duplexes under appropriate conditions. The organisation and distribution of the nucleotide sequences responsible for the formation of foldback structures in Physarum DNA have been investigated using the electron microscope. The majority of foldback duplexes have sizes ranging up to 800 base pairs, and about 60-80% of DNA molecules 2.2 × 10^4 bases in length contain interspersed foldback elements. The size of individual foldback duplexes, and also the length of the intervening sequences which separate them, are non-random. The results can best be explained by a model in which separate foldback foci in Physarum DNA are spaced periodically at regular intervals. The regions containing foldback foci are thought to contain smaller, tandemly-arranged sequences of discrete sizes, in some cases related to other nucleotide sequences of a similar nature in the same locality in Physarum DNA.

  9. Periodic organisation of foldback sequences in Physarum polycephalum nuclear DNA.

    PubMed Central

    Hardman, N; Jack, P L

    1978-01-01

    Nuclear DNA from the slime mould Physarum polycephalum is shown to contain interspersed inverted repeat sequences, such that denatured fragments of DNA containing pairs of these sequences form intra-chain duplexes under appropriate conditions. The organisation and distribution of the nucleotide sequences responsible for the formation of foldback structures in Physarum DNA have been investigated using the electron microscope. The majority of foldback duplexes have sizes ranging up to 800 base pairs, and about 60-80% of DNA molecules 2.2 × 10^4 bases in length contain interspersed foldback elements. The size of individual foldback duplexes, and also the length of the intervening sequences which separate them, are non-random. The results can best be explained by a model in which separate foldback foci in Physarum DNA are spaced periodically at regular intervals. The regions containing foldback foci are thought to contain smaller, tandemly-arranged sequences of discrete sizes, in some cases related to other nucleotide sequences of a similar nature in the same locality in Physarum DNA. PMID:566909

  10. Nuclear and mitochondrial DNA sequences from two Denisovan individuals

    PubMed Central

    Sawyer, Susanna; Renaud, Gabriel; Viola, Bence; Hublin, Jean-Jacques; Gansauge, Marie-Theres; Shunkov, Michael V.; Derevianko, Anatoly P.; Prüfer, Kay; Pääbo, Svante

    2015-01-01

    Denisovans, a sister group of Neandertals, have been described on the basis of a nuclear genome sequence from a finger phalanx (Denisova 3) found in Denisova Cave in the Altai Mountains. The only other Denisovan specimen described to date is a molar (Denisova 4) found at the same site. This tooth carries a mtDNA sequence similar to that of Denisova 3. Here we present nuclear DNA sequences from Denisova 4 and a morphological description, as well as mitochondrial and nuclear DNA sequence data, from another molar (Denisova 8) found in Denisova Cave in 2010. This new molar is similar to Denisova 4 in being very large and lacking traits typical of Neandertals and modern humans. Nuclear DNA sequences from the two molars form a clade with Denisova 3. The mtDNA of Denisova 8 is more diverged and has accumulated fewer substitutions than the mtDNAs of the other two specimens, suggesting Denisovans were present in the region over an extended period. The nuclear DNA sequence diversity among the three Denisovans is comparable to that among six Neandertals, but lower than that among present-day humans. PMID:26630009

  11. Biodiversity, genomes, and DNA sequence databases.

    PubMed

    Leipe, D D

    1996-12-01

    There are approximately 1.4 million organisms on this planet that have been described morphologically but there is no comparable coverage of biodiversity at the molecular level. Little more than 1% of the known species have been subject to any molecular scrutiny and eukaryotic genome projects have focused on a group of closely related model organisms. The past year, however, has seen an approximately 80% increase in the number of species represented in sequence databases and the completion of the sequencing of three prokaryotic genomes. Large-scale sequencing projects seem set to begin coverage of a wider range of the eukaryotic diversity, including green plants, microsporidians and diplomonads.

  12. Effects of sequence on DNA wrapping around histones

    NASA Astrophysics Data System (ADS)

    Ortiz, Vanessa

    2011-03-01

    A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).

  13. ASAP: automated sequence annotation pipeline for web-based updating of sequence information with a local dynamic database.

    PubMed

    Kossenkov, Andrew; Manion, Frank J; Korotkov, Eugene; Moloshok, Thomas D; Ochs, Michael F

    2003-03-22

    The automated sequence annotation pipeline (ASAP) is designed to ease routine investigation of new functional annotations on unknown sequences, such as expressed sequence tags (ESTs), through querying of web-accessible resources and maintenance of a local database. The system allows easy use of the output from one search as the input for a new search, as well as the filtering of results. The database is used to store formats and parameters and information for parsing data from web sites. The database permits easy updating of format information should a site modify the format of a query or of a returned web page.

  14. Probe mapping to facilitate transposon-based DNA sequencing

    SciTech Connect

    Strausbaugh, L.D.; Bourke, M.T.; Sommer, M.T.; Coon, M.E.; Berg, C.M.

    1990-08-01

    A promising strategy for DNA sequencing exploits transposons to provide mobile sites for the binding of sequencing primers. For such a strategy to be maximally efficient, the location and orientation of the transposon must be readily determined and the insertion sites should be randomly distributed. The authors demonstrate an efficient probe-based method for the localization and orientation of transposon-borne primer sites, which is adaptable to large-scale sequencing strategies. This approach requires no prior restriction enzyme mapping or knowledge of the cloned sequence and eliminates the inefficiency inherent in totally random sequencing methods. To test the efficiency of probe mapping, 49 insertions of the transposon γδ (Tn1000) in a cloned fragment of Drosophila melanogaster DNA were mapped and oriented. In addition, oligonucleotide primers specific for unique subterminal γδ segments were used to prime dideoxynucleotide double-stranded sequencing. These data provided an opportunity to rigorously examine γδ insertion sites. The insertions were quite randomly distributed, even though the target DNA fragment had both A+T-rich and G+C-rich regions; in G+C-rich DNA, the insertions were found in A+T-rich valleys. These data demonstrate that γδ is an excellent choice for supplying mobile primer binding sites to cloned DNA and that transposon-based probe mapping permits the sequences of large cloned segments to be determined without any subcloning.

  15. Googling DNA sequences on the World Wide Web.

    PubMed

    Hajibabaei, Mehrdad; Singer, Gregory A C

    2009-11-10

    New web-based technologies provide an excellent opportunity for sharing and accessing information and for using the web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bioinformatics applications. We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. We developed an alignment-independent, character-based algorithm that divides a sequence library (DNA barcodes) and the query sequence into words. The actual search is conducted by conventional search tools such as the freely available Google Desktop Search. We implemented our algorithm in two exemplar packages. We developed pre- and post-processing software to provide customized input and output services, respectively. Our analysis of all publicly available DNA barcode sequences shows high accuracy as well as rapid results. Our method makes use of conventional web-based technologies for specialized genetic data. It provides a robust and efficient solution for sequence search on the web. The integration of our search method for large-scale sequence libraries such as DNA barcodes provides an excellent web-based tool for accessing this information and linking it to other available categories of information on the web.
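
    The word-based search described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the word length, the use of overlapping words, the in-memory dictionary standing in for a conventional text search engine, and the ranking by number of shared words. The abstract does not give these details, and the names to_words, build_index and search are hypothetical.

```python
def to_words(sequence, word_length=10, step=1):
    """Split a DNA sequence into fixed-length words (overlapping when step=1)."""
    seq = sequence.upper()
    last_start = len(seq) - word_length
    return [seq[i:i + word_length] for i in range(0, last_start + 1, step)]


def build_index(library, word_length=10):
    """Map each word to the set of library entries that contain it.

    `library` is a dict of {entry name: sequence}; the dict index is a
    stand-in for whatever text search engine would hold the words.
    """
    index = {}
    for name, seq in library.items():
        for word in set(to_words(seq, word_length)):
            index.setdefault(word, set()).add(name)
    return index


def search(index, query, word_length=10):
    """Rank library entries by how many distinct query words they share."""
    hits = {}
    for word in set(to_words(query, word_length)):
        for name in index.get(word, ()):
            hits[name] = hits.get(name, 0) + 1
    # Most shared words first; ties broken by name for a stable order.
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    toy_library = {"sp_a": "ACGTACGTTAGC", "sp_b": "TTTTGGGGCCCC"}
    toy_index = build_index(toy_library, word_length=4)
    print(search(toy_index, "ACGTTAGC", word_length=4))
```

    Because matching is done on exact words, no alignment step is needed, which is the property the abstract calls alignment independent; the cost is that a single mismatch removes every word that overlaps it.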

  16. Affordable hands-on DNA sequencing and genotyping: an exercise for teaching DNA analysis to undergraduates.

    PubMed

    Shah, Kushani; Thomas, Shelby; Stein, Arnold

    2013-01-01

    In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C Sanger sequencing reactions. They prepare and run the gels, perform Southern blots (which require only 10 min), and detect sequencing ladders using a colorimetric detection system. Students enlarge their sequencing ladders from digital images of their small nylon membranes, and read the sequence manually. They compare their reads with the actual DNA sequence using BLAST2. After mastering the DNA sequencing system, students prepare their own DNA from a cheek swab, polymerase chain reaction-amplify a region of their DNA that encompasses a SNP of interest, and perform sequencing to determine their genotype at the SNP position. A family pedigree can also be constructed. The SNP chosen by the instructor was rs17822931, which is in the ABCC11 gene and is the determinant of human earwax type. Genotypes at the rs17822931 site vary in different ethnic populations. © 2013 by The International Union of Biochemistry and Molecular Biology.

  17. Terminal repetitive sequences in herpesvirus saimiri virion DNA.

    PubMed

    Bankier, A T; Dietrich, W; Baer, R; Barrell, B G; Colbère-Garapin, F; Fleckenstein, B; Bodemer, W

    1985-07-01

    The H-DNA repeat unit of Herpesvirus saimiri strain 11 was cloned in plasmid vector pAGO, and the nucleotide sequence was determined by the dideoxy chain termination method. One unit of repetitive DNA has 1,444 base pairs with 70.8% G+C content. The structural features of repeat DNA sequences at the termini of intact virion M-DNA (160 kilobases) and orientation of reiterated DNA were analyzed by radioactive end labeling of M-DNA, followed by cleavage of the end fragments with restriction endonucleases. The termini appeared to be blunt ended with a 5'-phosphate group, probably generated during encapsidation by cleavage in the immediate vicinity of the single ApaI recognition site in the H-DNA repeat unit. The sequence did not reveal sizeable open reading frames, the longest hypothetical peptide from H-DNA being 85 amino acids. There was no evidence for an mRNA promoter or terminator element, and H-DNA-specific transcription could not be found in productively infected cells.

  18. DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

    PubMed

    Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

    2012-01-01

    DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.

  19. Automated quantification of lumbar vertebral kinematics from dynamic fluoroscopic sequences

    NASA Astrophysics Data System (ADS)

    Camp, Jon; Zhao, Kristin; Morel, Etienne; White, Dan; Magnuson, Dixon; Gay, Ralph; An, Kai-Nan; Robb, Richard

    2009-02-01

    We hypothesize that the vertebra-to-vertebra patterns of spinal flexion and extension motion of persons with lower back pain will differ from those of persons who are pain-free. Thus, it is our goal to measure the motion of individual lumbar vertebrae noninvasively from dynamic fluoroscopic sequences. Two-dimensional normalized mutual information-based image registration was used to track frame-to-frame motion. Software was developed that required the operator to identify each vertebra on the first frame of the sequence using a four-point "caliper" placed at the posterior and anterior edges of the inferior and superior end plates of the target vertebrae. The program then resolved the individual motions of each vertebra independently throughout the entire sequence. To validate the technique, 6 cadaveric lumbar spine specimens were potted in polymethylmethacrylate and instrumented with optoelectric sensors. The specimens were then placed in a custom dynamic spine simulator and moved through flexion-extension cycles while kinematic data and fluoroscopic sequences were simultaneously acquired. We found strong correlation between the absolute flexion-extension range of motion of each vertebra as recorded by the optoelectric system and as determined from the fluoroscopic sequence via registration. We conclude that this method is a viable way of noninvasively assessing two-dimensional vertebral motion.
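
    To make the registration criterion named in this abstract concrete, the following is a rough sketch of normalized mutual information used as a similarity score. It is heavily simplified and is not the authors' software: it uses one common definition, NMI = (H(A) + H(B)) / H(A, B), works on one-dimensional lists of already-quantized intensities rather than two-dimensional images, and searches integer shifts exhaustively. Those choices, and the function names, are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (natural log) of a histogram given as a Counter."""
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def normalized_mutual_information(a, b):
    """NMI = (H(A) + H(B)) / H(A, B) for two equal-length intensity lists.

    The value is 2.0 when each list determines the other, and approaches
    1.0 as the two become statistically independent.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    h_joint = entropy(Counter(zip(a, b)), n)
    if h_joint == 0.0:
        # Degenerate case (both inputs constant); 1.0 is a convention chosen here.
        return 1.0
    return (entropy(Counter(a), n) + entropy(Counter(b), n)) / h_joint


def best_shift(reference, moving, max_shift):
    """Return (shift, score) maximizing NMI when reference[i] pairs with moving[i + shift]."""
    best = None
    for shift in range(-max_shift, max_shift + 1):
        lo = max(0, -shift)
        hi = min(len(reference), len(moving) - shift)
        if hi - lo < 2:
            continue  # too little overlap to score
        score = normalized_mutual_information(
            reference[lo:hi], moving[lo + shift:hi + shift]
        )
        if best is None or score > best[1]:
            best = (shift, score)
    return best


if __name__ == "__main__":
    frame_1 = [0, 0, 1, 2, 3, 2, 1, 0, 0, 0]
    frame_2 = [0, 0, 0, 0, 1, 2, 3, 2, 1, 0]  # same profile moved 2 samples
    print(best_shift(frame_1, frame_2, max_shift=3))
```

    A practical two-dimensional registration would also optimize rotation, interpolate between pixels and bin intensities into a joint histogram; the sketch keeps only the scoring idea.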

  20. Ancient DNA sequence revealed by error-correcting codes.

    PubMed

    Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo

    2015-07-10

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code.

  1. The properties and applications of single-molecule DNA sequencing

    PubMed Central

    2011-01-01

    Single-molecule sequencing enables DNA or RNA to be sequenced directly from biological samples, making it well-suited for diagnostic and clinical applications. Here we review the properties and applications of this rapidly evolving and promising technology. PMID:21349208

  2. Ancient DNA sequence revealed by error-correcting codes

    PubMed Central

    Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo

    2015-01-01

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228

  3. Human gamma X satellite DNA: an X chromosome specific centromeric DNA sequence.

    PubMed

    Lee, C; Li, X; Jabs, E W; Court, D; Lin, C C

    1995-11-01

    The cosmid clone, CX16-2D12, was previously localized to the centromeric region of the human X chromosome and shown to lack human X-specific alpha satellite DNA. A 1.2 kb EcoRI fragment was subcloned from the CX16-2D12 cosmid and was named 2D12/E2. DNA sequencing revealed that this 1,205 bp fragment consisted of approximately five tandemly repeated DNA monomers of 220 bp. DNA sequence homology between the monomers of 2D12/E2 ranged from 72.8% to 78.6%. Interestingly, DNA sequence analysis of the 2D12/E2 clone displayed a change in monomer unit orientation between nucleotide positions 585-586 from a "tail-to-head" arrangement to a "head-to-tail" configuration. This may reflect the existence of at least one inversion within this repetitive DNA array in the centromeric region of the human X chromosome. The DNA consensus sequence derived from a compilation of these 220 bp monomers had approximately 62% DNA sequence similarity to the previously determined gamma 8 satellite DNA consensus sequence. Comparison of the 2D12/E2 and gamma 8 consensus sequences revealed a 20 bp DNA sequence that was well conserved in both DNA consensus sequences. Slot-blot analysis revealed that this repetitive DNA sequence comprises approximately 0.015% of the human genome, similar to that found with gamma 8 satellite DNA. These observations suggest that this satellite DNA clone is derived from a subfamily of gamma satellite DNA and is thus designated gamma X satellite DNA. When genomic DNA from six unrelated males and two unrelated females was cut with SstI or HpaI and separated by pulsed-field gel electrophoresis, no restriction fragment length polymorphisms were observed for either gamma X (2D12/E2) or gamma 8 (50E4) probes. Fluorescence in situ hybridization localized the 2D12/E2 clone to the lateral sides of the primary constriction specifically on the human X chromosome.

  4. Microfabricated bioprocessor for integrated nanoliter-scale Sanger DNA sequencing.

    PubMed

    Blazej, Robert G; Kumaresan, Palani; Mathies, Richard A

    2006-05-09

    An efficient, nanoliter-scale microfabricated bioprocessor integrating all three Sanger sequencing steps, thermal cycling, sample purification, and capillary electrophoresis, has been developed and evaluated. Hybrid glass-polydimethylsiloxane (PDMS) wafer-scale construction is used to combine 250-nl reactors, affinity-capture purification chambers, high-performance capillary electrophoresis channels, and pneumatic valves and pumps onto a single microfabricated device. Lab-on-a-chip-level integration enables complete Sanger sequencing from only 1 fmol of DNA template. Up to 556 continuous bases were sequenced with 99% accuracy, demonstrating read lengths required for de novo sequencing of human and other complex genomes. The performance of this miniaturized DNA sequencer provides a benchmark for predicting the ultimate cost and efficiency limits of Sanger sequencing.

  5. Beyond reasonable doubt: evolution from DNA sequences.

    PubMed

    White, W Timothy J; Zhong, Bojian; Penny, David

    2013-01-01

    We demonstrate quantitatively that, as predicted by evolutionary theory, sequences of homologous proteins from different species converge as we go further and further back in time. The converse, a non-evolutionary model, can be expressed as probabilities, and the test works for chloroplast, nuclear and mitochondrial sequences, as well as for sequences that diverged at different time depths. Even on our conservative test, the probability that chance could produce the observed levels of ancestral convergence for just one of the eight datasets of 51 proteins is ≈1×10⁻¹⁹ and combined over 8 datasets is ≈1×10⁻¹³². By comparison, there are about 10⁸⁰ protons in the universe, hence the probability that the sequences could have been produced by a process involving unrelated ancestral sequences is about 10⁵⁰ lower than picking, among all protons, the same proton at random twice in a row. A non-evolutionary control model shows no convergence, and only a small number of parameters are required to account for the observations. It is time that researchers insisted that doubters put up testable alternatives to evolution.

  6. Automated high throughput nucleic acid purification from formalin-fixed paraffin-embedded tissue samples for next generation sequence analysis

    PubMed Central

    Haile, Simon; Pandoh, Pawan; McDonald, Helen; Corbett, Richard D.; Tsao, Philip; Kirk, Heather; MacLeod, Tina; Jones, Martin; Bilobram, Steve; Brooks, Denise; Smailus, Duane; Steidl, Christian; Scott, David W.; Bala, Miruna; Hirst, Martin; Miller, Diane; Moore, Richard A.; Mungall, Andrew J.; Coope, Robin J.; Ma, Yussanne; Zhao, Yongjun; Holt, Rob A.; Jones, Steven J.

    2017-01-01

    Curation and storage of formalin-fixed, paraffin-embedded (FFPE) samples are standard procedures in hospital pathology laboratories around the world. Many thousands of such samples exist and could be used for next generation sequencing analysis. Retrospective analyses of such samples are important for identifying molecular correlates of carcinogenesis, treatment history and disease outcomes. Two major hurdles in using FFPE material for sequencing are the damaged nature of the nucleic acids and the labor-intensive nature of nucleic acid purification. These limitations and a number of other issues that span multiple steps from nucleic acid purification to library construction are addressed here. We optimized and automated a 96-well magnetic bead-based extraction protocol that can be scaled to large cohorts and is compatible with automation. Using sets of 32 and 91 individual FFPE samples respectively, we generated libraries from 100 ng of total RNA and DNA starting amounts with 95–100% success rate. The use of the resulting RNA in micro-RNA sequencing was also demonstrated. In addition to offering the potential of scalability and rapid throughput, the yield obtained with lower input requirements makes these methods applicable to clinical samples where tissue abundance is limiting. PMID:28570594

  7. Automated high throughput nucleic acid purification from formalin-fixed paraffin-embedded tissue samples for next generation sequence analysis.

    PubMed

    Haile, Simon; Pandoh, Pawan; McDonald, Helen; Corbett, Richard D; Tsao, Philip; Kirk, Heather; MacLeod, Tina; Jones, Martin; Bilobram, Steve; Brooks, Denise; Smailus, Duane; Steidl, Christian; Scott, David W; Bala, Miruna; Hirst, Martin; Miller, Diane; Moore, Richard A; Mungall, Andrew J; Coope, Robin J; Ma, Yussanne; Zhao, Yongjun; Holt, Rob A; Jones, Steven J; Marra, Marco A

    2017-01-01

    Curation and storage of formalin-fixed, paraffin-embedded (FFPE) samples are standard procedures in hospital pathology laboratories around the world. Many thousands of such samples exist and could be used for next generation sequencing analysis. Retrospective analyses of such samples are important for identifying molecular correlates of carcinogenesis, treatment history and disease outcomes. Two major hurdles in using FFPE material for sequencing are the damaged nature of the nucleic acids and the labor-intensive nature of nucleic acid purification. These limitations and a number of other issues that span multiple steps from nucleic acid purification to library construction are addressed here. We optimized and automated a 96-well magnetic bead-based extraction protocol that can be scaled to large cohorts and is compatible with automation. Using sets of 32 and 91 individual FFPE samples respectively, we generated libraries from 100 ng of total RNA and DNA starting amounts with 95-100% success rate. The use of the resulting RNA in micro-RNA sequencing was also demonstrated. In addition to offering the potential of scalability and rapid throughput, the yield obtained with lower input requirements makes these methods applicable to clinical samples where tissue abundance is limiting.

  8. Electronic Transport and Thermopower in Aperiodic DNA Sequences

    NASA Astrophysics Data System (ADS)

    Roche, Stephan; Maciá, Enrique

    A detailed study of charge transport properties of synthetic and genomic DNA sequences is reported. Genomic sequences of the Chromosome 22, λ-bacteriophage, and D1s80 genes of Human and Pygmy chimpanzee are considered in this work, and compared with both periodic and quasiperiodic (Fibonacci) sequences of nucleotides. Charge transfer efficiency is compared for all these different sequences, and large variations in charge transfer efficiency, stemming from sequence-dependent effects, are reported. In addition, basic characteristics of tunneling currents, including contact effects, are described. Finally, the thermoelectric power of nucleobases connected in between metallic contacts at different temperatures is presented.

  9. Sequence specificity of DNA cleavage by Micrococcus luteus γ endonuclease

    SciTech Connect

    Hentosh, P.; Henner, W.D.; Reynolds, R.J.

    1985-04-01

    DNA fragments of defined sequence have been used to determine the sites of cleavage by γ-endonuclease activity in extracts prepared from Micrococcus luteus. End-labeled DNA restriction fragments of pBR322 DNA that had been irradiated under nitrogen in the presence of potassium iodide or t-butanol were treated with M. luteus γ endonuclease and analyzed on irradiated DNA preferentially at the positions of cytosines and thymines. DNA cleavage occurred immediately to the 3' side of pyrimidines in irradiated DNA and resulted in fragments that terminate in a 5'-phosphoryl group. These studies indicate that both altered cytosines and thymines may be important DNA lesions requiring repair after exposure to γ radiation.

  10. A simple, high-resolution method for establishing DNA binding affinity and sequence selectivity.

    PubMed

    Boger, D L; Fink, B E; Brunette, S R; Tse, W C; Hedrick, M P

    2001-06-27

    Full details of the development of a simple, nondestructive, and high-throughput method for establishing DNA binding affinity and sequence selectivity are described. The method is based on the loss of fluorescence derived from the displacement of ethidium bromide or thiazole orange from the DNA of interest or, in selected instances, the change in intrinsic fluorescence of a DNA binding agent itself and is applicable for assessing relative or absolute DNA binding affinities. Enlisting a library of hairpin deoxyoligonucleotides containing all five base pair (512 hairpins) or four base pair (136 hairpins) sequences displayed in a 96-well format, a compound's rank order binding to all possible sequences is generated, resulting in a high-resolution definition of its sequence selectivity using this fluorescent intercalator displacement (FID) assay. As such, the technique complements the use of footprinting or affinity cleavage for the establishment of DNA binding selectivity and provides the information at a higher resolution. The merged bar graphs generated by this rank order binding provide a qualitative way to compare, or profile, DNA binding affinity and selectivity. The 96-well format assay (512 hairpins) can be conducted at a minimal cost (presently ca. $100 for hairpin deoxyoligonucleotides/assay with ethidium bromide or less with thiazole orange), with a rapid readout using a fluorescent plate reader (15 min), and is adaptable to automation (Tecan Genesis Workstation 100 robotic system). Its use in generating a profile of DNA binding selectivity for several agents including distamycin A, netropsin, DAPI, Hoechst 33258, and berenil is described. Techniques for establishing binding constants from quantitative titrations are compared, and recommendations are made for use of a Scatchard or curve fitting analysis of the titration binding curves as a reliable means to quantitate the binding affinity.
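
    The library sizes quoted in this abstract (512 five-base-pair and 136 four-base-pair sequences) agree with the number of distinct double-stranded sequences obtained when a sequence and its reverse complement are counted as one duplex. That reading is an assumption here, since the abstract does not say how the libraries were enumerated, but the counts can be checked with a short sketch (function names are illustrative):

```python
from itertools import product

# Translation table for Watson-Crick complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def unique_duplex_sequences(length):
    """List every sequence of the given length, keeping one member of each
    {sequence, reverse complement} pair (the alphabetically smaller one)."""
    seen = set()
    for bases in product("ACGT", repeat=length):
        seq = "".join(bases)
        seen.add(min(seq, reverse_complement(seq)))
    return sorted(seen)


if __name__ == "__main__":
    for n in (4, 5):
        print(n, len(unique_duplex_sequences(n)))
```

    For odd lengths no sequence equals its own reverse complement, so the count is simply 4^5 / 2 = 512. For length 4 there are 4^2 = 16 self-complementary sequences, giving (256 - 16) / 2 + 16 = 136.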

  11. DNA-protein recognition and sequence-dependent variations of DNA conformational properties

    NASA Astrophysics Data System (ADS)

    Vologodskii, Alexander

    2015-03-01

    Parameters of B-DNA, the major form of the double helix, depend on its sequence. This dependence can contribute to the recognition of specific DNA sequences by proteins. Here we try to analyze this contribution quantitatively. In the first approach to this goal we used experimental data on the sequence dependence of DNA bending rigidity and its helical repeat. The solution data on these parameters of B-DNA were derived from the experiments on cyclization of short DNA fragments with specially designed sequences. The data allowed calculating the sequence variations of DNA bending energy, as well as the variations of the energy of torsional deformation of the double helix associated with a protein binding. The results show that DNA conformational parameters can have very limited influence on the sequence specificity of protein binding. In the second approach we analyzed the experimental data on the binding affinity of the nucleosome core with DNA fragments of different sequences. The conclusions derived in these two approaches are in good agreement with one another.

  12. Reticuloendotheliosis Virus Nucleic Acid Sequences in Cellular DNA

    PubMed Central

    Kang, Chil-Yong; Temin, Howard M.

    1974-01-01

    Reticuloendotheliosis virus 60S RNA labeled with 125I, or reticuloendotheliosis virus complementary DNA labeled with 3H, were hybridized to DNAs from infected chicken and pheasant cells. Most of the sequences of the viral RNA were found in the infected cell DNAs. The reticuloendotheliosis viruses, therefore, replicate through a DNA intermediate. The same labeled nucleic acids were hybridized to DNA of uninfected chicken, pheasant, quail, turkey, and duck. About 10% of the sequences of reticuloendotheliosis virus RNA were present in the DNA of uninfected chicken, pheasant, quail, and turkey. None were detected in DNA of duck. The specificity of the hybridization was shown by competition between unlabeled and 125I-labeled viral RNAs and by determination of melting temperatures. In contrast, 125I-labeled RNA of Rous-associated virus-O, an avian leukosis-sarcoma virus, hybridized 55% to DNA of uninfected chicken, 20% to DNA of uninfected pheasant, 15% to DNA of uninfected quail, 10% to DNA of uninfected turkey, and less than 1% to DNA of uninfected duck. PMID:4372393

  13. Polyamide platinum anticancer complexes designed to target specific DNA sequences.

    PubMed

    Jaramillo, David; Wheate, Nial J; Ralph, Stephen F; Howard, Warren A; Tor, Yitzhak; Aldrich-Wright, Janice R

    2006-07-24

    Two new platinum complexes, trans-chlorodiammine[N-(2-aminoethyl)-4-[4-(N-methylimidazole-2-carboxamido)-N-methylpyrrole-2-carboxamido]-N-methylpyrrole-2-carboxamide]platinum(II) chloride (DJ1953-2) and trans-chlorodiammine[N-(6-aminohexyl)-4-[4-(N-methylimidazole-2-carboxamido)-N-methylpyrrole-2-carboxamido]-N-methylpyrrole-2-carboxamide]platinum(II) chloride (DJ1953-6) have been synthesized as proof-of-concept molecules in the design of agents that can specifically target genes in DNA. Coordinate covalent binding to DNA was demonstrated with electrospray ionization mass spectrometry. Using circular dichroism, these complexes were found to show greater DNA binding affinity to the target sequence, d(CATTGTCAGAC)(2), than toward either d(GTCTGTCAATG)(2), which contains different flanking sequences, or d(CATTGAGAGAC)(2), which contains a double base pair mismatch sequence. DJ1953-2 unwinds the DNA helix by around 13 degrees, but neither metal complex significantly affects the DNA melting temperature. Unlike simple DNA minor groove binders, DJ1953-2 is able to inhibit, in vitro, RNA synthesis. The cytotoxicity of both metal complexes in the L1210 murine leukaemia cell line was also determined, with DJ1953-6 (34 µM) more active than DJ1953-2 (>50 µM). These results demonstrate the potential of polyamide platinum complexes and provide the structural basis for designer agents that are able to recognize biologically relevant sequences and prevent DNA transcription and replication.

  14. Sequencing of adenine in DNA by scanning tunneling microscopy

    NASA Astrophysics Data System (ADS)

    Tanaka, Hiroyuki; Taniguchi, Masateru

    2017-08-01

    The development of DNA sequencing technology utilizing the detection of a tunnel current is important for next-generation sequencer technologies based on single-molecule analysis technology. Using a scanning tunneling microscope, we previously reported that dI/dV measurements and dI/dV mapping revealed that the guanine base (purine base) of DNA adsorbed onto the Cu(111) surface has a characteristic peak at Vs = -1.6 V. If, in addition to guanine, the other purine base of DNA, namely, adenine, can be distinguished, then by reading all the purine bases of each single strand of a DNA double helix, the entire base sequence of the original double helix can be determined due to the complementarity of the DNA base pair. Therefore, the ability to read adenine is important from the viewpoint of sequencing. Here, we report on the identification of adenine by STM topographic and spectroscopic measurements using a synthetic DNA oligomer and viral DNA.
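
    The complementarity argument in this abstract (purines read on both strands are enough to recover the whole sequence) can be shown with a small sketch. It assumes idealized, error-free reads in which each position is reported as A, G, or unreadable ("."), and that both reads are written 5'->3' on their own strand; the function names and the read format are hypothetical and do not describe the authors' data.

```python
# Translation table for Watson-Crick complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def purine_read(strand):
    """Simulate a read that identifies only purines; pyrimidines become '.'."""
    return "".join(base if base in "AG" else "." for base in strand)


def reconstruct(read_top, read_bottom):
    """Rebuild the top strand from purine-only reads of both strands.

    A purine seen on the bottom strand implies its pyrimidine partner
    (A pairs with T, G pairs with C) at the facing position on the top strand.
    """
    if len(read_top) != len(read_bottom):
        raise ValueError("reads must cover the same number of base pairs")
    partner = {"A": "T", "G": "C"}
    top = []
    # The bottom read runs antiparallel, so reverse it to face the top read.
    for top_base, bottom_base in zip(read_top, reversed(read_bottom)):
        if top_base in "AG" and bottom_base == ".":
            top.append(top_base)
        elif top_base == "." and bottom_base in "AG":
            top.append(partner[bottom_base])
        else:
            raise ValueError("reads are inconsistent with base pairing")
    return "".join(top)


if __name__ == "__main__":
    original = "GATTACAGC"
    top_read = purine_read(original)
    bottom_read = purine_read(reverse_complement(original))
    print(top_read, bottom_read, reconstruct(top_read, bottom_read))
```

    Every base pair contains exactly one purine, so exactly one of the two reads is informative at each position; this is why reading guanine alone is not sufficient and adenine must also be distinguishable.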

  15. Molecular Poltergeists: Mitochondrial DNA Copies (numts) in Sequenced Nuclear Genomes

    PubMed Central

    Hazkani-Covo, Einat; Zeller, Raymond M.; Martin, William

    2010-01-01

    The natural transfer of DNA from mitochondria to the nucleus generates nuclear copies of mitochondrial DNA (numts) and is an ongoing evolutionary process, as genome sequences attest. In humans, five different numts cause genetic disease and a dozen human loci are polymorphic for the presence of numts, underscoring the rapid rate at which mitochondrial sequences reach the nucleus over evolutionary time. In the laboratory and in nature, numts enter the nuclear DNA via non-homologous end joining (NHEJ) at double-strand breaks (DSBs). The frequency of numt insertions among 85 sequenced eukaryotic genomes reveals that numt content is strongly correlated with genome size, suggesting that the numt insertion rate might be limited by DSB frequency. Polymorphic numts in humans link maternally inherited mitochondrial genotypes to nuclear DNA haplotypes during the past, offering new opportunities to associate nuclear markers with mitochondrial markers back in time. PMID:20168995

  16. Efficient depletion of host DNA contamination in malaria clinical sequencing.

    PubMed

    Oyola, Samuel O; Gu, Yong; Manske, Magnus; Otto, Thomas D; O'Brien, John; Alcock, Daniel; Macinnis, Bronwyn; Berriman, Matthew; Newbold, Chris I; Kwiatkowski, Dominic P; Swerdlow, Harold P; Quail, Michael A

    2013-03-01

    The cost of whole-genome sequencing (WGS) is decreasing rapidly as next-generation sequencing technology continues to advance, and the prospect of making WGS available for public health applications is becoming a reality. So far, a number of studies have demonstrated the use of WGS as an epidemiological tool for typing and controlling outbreaks of microbial pathogens. Success of these applications is hugely dependent on efficient generation of clean genetic material that is free from host DNA contamination for rapid preparation of sequencing libraries. The presence of large amounts of host DNA severely affects the efficiency of characterizing pathogens using WGS and is therefore a serious impediment to clinical and epidemiological sequencing for health care and public health applications. We have developed a simple enzymatic treatment method that takes advantage of the methylation of human DNA to selectively deplete host contamination from clinical samples prior to sequencing. Using malaria clinical samples with over 80% human host DNA contamination, we show that the enzymatic treatment enriches Plasmodium falciparum DNA up to ∼9-fold and generates high-quality, nonbiased sequence reads covering >98% of 86,158 catalogued typeable single-nucleotide polymorphism loci.

  17. Bioinformatics analysis of circulating cell-free DNA sequencing data.

    PubMed

    Chan, Landon L; Jiang, Peiyong

    2015-10-01

    The discovery of cell-free DNA molecules in plasma has opened up numerous opportunities in noninvasive diagnosis. Cell-free DNA molecules have become increasingly recognized as promising biomarkers for detection and management of many diseases. The advent of next generation sequencing has provided unprecedented opportunities to scrutinize the characteristics of cell-free DNA molecules in plasma in a genome-wide fashion and at single-base resolution. Consequently, clinical applications of circulating cell-free DNA analysis have not only revolutionized noninvasive prenatal diagnosis but also facilitated cancer detection and monitoring toward an era of blood-based personalized medicine. With the remarkably increasing throughput and lowering cost of next generation sequencing, bioinformatics analysis becomes increasingly demanding to understand the large amount of data generated by these sequencing platforms. In this Review, we highlight the major bioinformatics algorithms involved in the analysis of cell-free DNA sequencing data. Firstly, we briefly describe the biological properties of these molecules and provide an overview of the general bioinformatics approach for the analysis of cell-free DNA. Then, we discuss the specific upstream bioinformatics considerations concerning the analysis of sequencing data of circulating cell-free DNA, followed by further detailed elaboration on each key clinical situation in noninvasive prenatal diagnosis and cancer management where downstream bioinformatics analysis is heavily involved. We also discuss bioinformatics analysis as well as clinical applications of the newly developed massively parallel bisulfite sequencing of cell-free DNA. Finally, we offer our perspectives on the future development of bioinformatics in noninvasive diagnosis. Copyright © 2015 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.

  18. Identification of repeats in DNA sequences using nucleotide distribution uniformity.

    PubMed

    Yin, Changchuan

    2017-01-07

    Repetitive elements are important in genomic structures, functions and regulations, yet effective methods in precisely identifying repetitive elements in DNA sequences are not fully accessible, and the relationship between repetitive elements and periodicities of genomes is not clearly understood. We present an ab initio method to quantitatively detect repetitive elements and infer the consensus repeat pattern in repetitive elements. The method uses the measure of the distribution uniformity of nucleotides at periodic positions in DNA sequences or genomes. It can identify periodicities, consensus repeat patterns, copy numbers and perfect levels of repetitive elements. The results of using the method on different DNA sequences and genomes demonstrate efficacy and accuracy in identifying repeat patterns and periodicities. The complexity of the method is linear with respect to the lengths of the analyzed sequences. The Python programs in this study are freely available to the public upon request or at https://github.com/cyinbox/DNADU. Copyright © 2016 Elsevier Ltd. All rights reserved.
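
    The idea of scoring how unevenly nucleotides are distributed at periodic positions can be illustrated with a small sketch. This is not the DNADU code from the repository above; the function names and the chi-square-style score are assumptions chosen only to make the concept concrete.

```python
from collections import Counter


def periodic_nonuniformity(seq, period):
    """Chi-square-style score of how far base counts at each phase of
    `period` deviate from the sequence-wide composition (illustrative
    measure, not the published one)."""
    seq = seq.upper()
    n = len(seq)
    if n == 0 or not 1 <= period <= n:
        raise ValueError("period must be between 1 and the sequence length")
    overall = Counter(seq)
    score = 0.0
    for phase in range(period):
        column = seq[phase::period]          # every base at this phase
        counts = Counter(column)
        for base in "ACGT":
            expected = overall[base] * len(column) / n
            if expected > 0:
                score += (counts[base] - expected) ** 2 / expected
    return score / n                         # normalise by sequence length


def consensus_repeat(seq, period):
    """Most frequent base at each phase of the candidate period."""
    seq = seq.upper()
    if not 1 <= period <= len(seq):
        raise ValueError("period must be between 1 and the sequence length")
    return "".join(
        Counter(seq[phase::period]).most_common(1)[0][0]
        for phase in range(period)
    )
```

    Multiples of the true period score equally well under this measure, so a caller would normally keep the smallest high-scoring period. Each period is scored in a single pass over the sequence, in line with the linear complexity described in the abstract.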

  19. Ultrasensitive fluorescence detection of DNA sequencing gels

    SciTech Connect

    Mathies, R.A.

    1991-01-01

    During the three years of this grant we have: (1) Developed and applied a new theory for optimizing high-sensitivity fluorescence detection. (2) Developed and patented a new high-sensitivity confocal-fluorescence laser-excited gel-scanner. (3) Applied this scanner to the development of a new class of versatile and sensitive fluorescent dyes for DNA detection. (4) Developed methods for the detection of single fluorescent molecules by fluorescence burst detection. 11 refs., 10 figs.

  20. RapTOR: Automated sequencing library preparation and suppression for rapid pathogen characterization (7th Annual SFAF Meeting, 2012)

    ScienceCinema

    Lane, Todd [SNL

    2016-07-12

    Todd Lane on "RapTOR: Automated sequencing library preparation and suppression for rapid pathogen characterization" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  1. RapTOR: Automated sequencing library preparation and suppression for rapid pathogen characterization (7th Annual SFAF Meeting, 2012)

    SciTech Connect

    Lane, Todd

    2012-06-01

    Todd Lane on "RapTOR: Automated sequencing library preparation and suppression for rapid pathogen characterization" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  2. Automated DNA extraction from genetically modified maize using aminosilane-modified bacterial magnetic particles.

    PubMed

    Ota, Hiroyuki; Lim, Tae-Kyu; Tanaka, Tsuyoshi; Yoshino, Tomoko; Harada, Manabu; Matsunaga, Tadashi

    2006-09-18

    A novel, automated system, PNE-1080, equipped with eight automated pestle units and a spectrophotometer was developed for genomic DNA extraction from maize using aminosilane-modified bacterial magnetic particles (BMPs). The use of aminosilane-modified BMPs allowed highly accurate DNA recovery. The (A260 - A320):(A280 - A320) ratio of the extracted DNA was 1.9 ± 0.1. The DNA was sufficiently pure for PCR analysis. The PNE-1080 offered rapid assay completion (30 min) with high accuracy. Furthermore, the results of real-time PCR confirmed that our proposed method permitted the accurate determination of genetically modified DNA composition and correlated well with results obtained by conventional cetyltrimethylammonium bromide (CTAB)-based methods.
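
    The background-corrected purity ratio quoted above is simple arithmetic: the 320 nm reading is subtracted from both the 260 nm and 280 nm readings before dividing. A minimal sketch follows; the function name and the example absorbance values are hypothetical, not taken from the paper.

```python
def corrected_purity_ratio(a260, a280, a320):
    """Return (A260 - A320) / (A280 - A320), the background-corrected
    nucleic-acid purity ratio."""
    denominator = a280 - a320
    if denominator <= 0:
        raise ValueError("A280 must exceed the A320 background reading")
    return (a260 - a320) / denominator


# Hypothetical spectrophotometer readings, for illustration only.
ratio = corrected_purity_ratio(a260=0.97, a280=0.52, a320=0.02)
```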

  3. Sequence of figwort mosaic virus DNA (caulimovirus group).

    PubMed Central

    Richins, R D; Scholthof, H B; Shepherd, R J

    1987-01-01

    The nucleotide sequence of an infectious clone of figwort mosaic virus (FMV) was determined using the dideoxynucleotide chain termination method. The double-stranded DNA genome (7743 base pairs) contained eight open reading frames (ORFs), seven of which corresponded approximately in size and location to the ORFs found in the genome of cauliflower mosaic virus (CaMV) and carnation etched ring virus (CERV). ORFs I and V of FMV demonstrated the highest degrees of nucleotide and amino acid sequence homology with the equivalent coding regions of CaMV and CERV. Regions II, III and IV showed somewhat less homology with the analogous regions of CaMV and CERV, and ORF VI showed homology with the corresponding gene of CaMV and CERV in only a short segment near the middle of the putative gene product. A 16 nucleotide sequence, complementary to the 3' terminus of methionine initiator tRNA (tRNAimet) and presumed to be the primer binding site for initiation of reverse transcription to produce minus strand DNA, was found in the FMV genome near the discontinuity in the minus strand. Sequences near the three interruptions in the plus strand of FMV DNA bear strong resemblance to similarly located sequences of 3 other caulimoviruses and are inferred to be initiation sites for second strand DNA synthesis. Additional conserved sequences in the small and large intergenic regions are pointed out including a highly conserved 35 bp sequence that occurs in the latter region. PMID:3671088

  4. Sequence of figwort mosaic virus DNA (caulimovirus group).

    PubMed

    Richins, R D; Scholthof, H B; Shepherd, R J

    1987-10-26

    The nucleotide sequence of an infectious clone of figwort mosaic virus (FMV) was determined using the dideoxynucleotide chain termination method. The double-stranded DNA genome (7743 base pairs) contained eight open reading frames (ORFs), seven of which corresponded approximately in size and location to the ORFs found in the genome of cauliflower mosaic virus (CaMV) and carnation etched ring virus (CERV). ORFs I and V of FMV demonstrated the highest degrees of nucleotide and amino acid sequence homology with the equivalent coding regions of CaMV and CERV. Regions II, III and IV showed somewhat less homology with the analogous regions of CaMV and CERV, and ORF VI showed homology with the corresponding gene of CaMV and CERV in only a short segment near the middle of the putative gene product. A 16 nucleotide sequence, complementary to the 3' terminus of methionine initiator tRNA (tRNAimet) and presumed to be the primer binding site for initiation of reverse transcription to produce minus strand DNA, was found in the FMV genome near the discontinuity in the minus strand. Sequences near the three interruptions in the plus strand of FMV DNA bear strong resemblance to similarly located sequences of 3 other caulimoviruses and are inferred to be initiation sites for second strand DNA synthesis. Additional conserved sequences in the small and large intergenic regions are pointed out including a highly conserved 35 bp sequence that occurs in the latter region.

  5. PCR Primers for Metazoan Mitochondrial 12S Ribosomal DNA Sequences

    PubMed Central

    Machida, Ryuji J.; Kweskin, Matthew; Knowlton, Nancy

    2012-01-01

    Background: Assessment of the biodiversity of communities of small organisms is most readily done using PCR-based analysis of environmental samples consisting of mixtures of individuals. Known as metagenetics, this approach has transformed understanding of microbial communities and is beginning to be applied to metazoans as well. Unlike microbial studies, where analysis of the 16S ribosomal DNA sequence is standard, the best gene for metazoan metagenetics is less clear. In this study we designed a set of PCR primers for the mitochondrial 12S ribosomal DNA sequence based on 64 complete mitochondrial genomes and then tested their efficacy. Methodology/Principal Findings: A total of 64 complete mitochondrial genome sequences representing all metazoan classes available in GenBank were downloaded using the NCBI Taxonomy Browser. Alignment of sequences was performed for the excised mitochondrial 12S ribosomal DNA sequences, and conserved regions were identified for all 64 mitochondrial genomes. These regions were used to design a primer pair that flanks a more variable region in the gene. Then all of the complete metazoan mitochondrial genomes available in NCBI's Organelle Genome Resources database were used to determine the percentage of taxa that would likely be amplified using these primers. Results suggest that these primers will amplify target sequences for many metazoans. Conclusions/Significance: Newly designed 12S ribosomal DNA primers have considerable potential for metazoan metagenetic analysis because of their ability to amplify sequences from many metazoans. PMID:22536450
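
    The in silico step of estimating what percentage of taxa a primer pair would amplify can be sketched as a mismatch-tolerant search for both binding sites. The abstract does not give the authors' matching criteria, so the mismatch threshold, the function names and the toy sequences used to exercise the code are all assumptions, and degenerate bases are not handled.

```python
def find_site(template, primer, max_mismatches):
    """Index of the first window matching `primer` with at most
    `max_mismatches` substitutions, or -1 if there is none."""
    for i in range(len(template) - len(primer) + 1):
        window = template[i:i + len(primer)]
        if sum(a != b for a, b in zip(window, primer)) <= max_mismatches:
            return i
    return -1


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def is_amplifiable(template, forward, reverse, max_mismatches=2):
    """True if the forward primer site is followed downstream by the
    reverse-complemented reverse primer site."""
    start = find_site(template, forward, max_mismatches)
    if start < 0:
        return False
    downstream = template[start + len(forward):]
    return find_site(downstream, reverse_complement(reverse), max_mismatches) >= 0


def percent_amplified(templates, forward, reverse, max_mismatches=2):
    """Percentage of templates predicted to amplify."""
    hits = sum(is_amplifiable(t, forward, reverse, max_mismatches) for t in templates)
    return 100.0 * hits / len(templates)
```

    Raising `max_mismatches` makes the prediction more permissive, which is one simple way to explore how sensitive the estimated coverage is to primer-template mismatches.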

  6. Recognizing a Single Base in an Individual DNA Strand: A Step Toward Nanopore DNA Sequencing

    PubMed Central

    Ashkenasy, N.; Sánchez-Quesada, J.; Ghadiri, M. R.; Bayley, H.

    2007-01-01

    Functional supramolecular chemistry at the single-molecule level. Single strands of DNA can be captured inside α-hemolysin transmembrane pore protein to form single-species α-HL·DNA pseudorotaxanes. This process can be used to identify a single adenine nucleotide at a specific location on a strand of DNA by the characteristic reductions in the α-HL ion conductance. This study suggests that α-HL-mediated single-molecule DNA sequencing might be fundamentally feasible. PMID:15666419

  7. Comparison of Automated and Manual DNA Isolation Methods for DNA Methylation Analysis of Biopsy, Fresh Frozen, and Formalin-Fixed, Paraffin-Embedded Colorectal Cancer Samples.

    PubMed

    Kalmár, Alexandra; Péterfia, Bálint; Wichmann, Barnabás; Patai, Árpád V; Barták, Barbara K; Nagy, Zsófia B; Furi, István; Tulassay, Zsolt; Molnár, Béla

    2015-12-01

    Automated DNA isolation can decrease hands-on time in routine pathology. Our aim was to apply automated DNA isolation and perform DNA methylation analyses. DNA isolation was performed manually from fresh frozen (CRC = 10, normal = 10) specimens and colonic biopsies (CRC = 10, healthy = 10) with QIAamp DNA Mini Kit and from FFPE blocks (CRC = 10, normal = 10) with QIAamp DNA FFPET Kit. Automated DNA isolation was performed with MagNA Pure DNA and Viral NA SV kit on MagNA Pure 96 system. DNA methylation of MAL, SFRP1, and SFRP2 was analyzed with methylation-specific high-resolution melting analysis. Yield of automatically isolated samples was equal in fresh frozen samples and significantly lower compared to manually isolated biopsy and FFPE samples. OD260/280 of fresh frozen and biopsy samples was similar after both isolations, whereas automated isolation resulted in lower purity in FFPE samples. Both protocols resulted in similar OD260/230 from fresh frozen samples; the automated isolation method was superior in biopsies and the manual protocol in FFPE samples. DNA methylation of biopsies and fresh frozen samples was highly similar after both methods, whereas results of automatically and manually isolated FFPE samples were different. Automated DNA isolation from fresh frozen samples can be suitable for high-throughput laboratories. © 2015 Society for Laboratory Automation and Screening.

  8. A scalable, fully automated process for construction of sequence-ready human exome targeted capture libraries

    PubMed Central

    2011-01-01

    Genome targeting methods enable cost-effective capture of specific subsets of the genome for sequencing. We present here an automated, highly scalable method for carrying out the Solution Hybrid Selection capture approach that provides a dramatic increase in scale and throughput of sequence-ready libraries produced. Significant process improvements and a series of in-process quality control checkpoints are also added. These process improvements can also be used in a manual version of the protocol. PMID:21205303

  9. Analysis of sequence variation in Gnathostoma spinigerum mitochondrial DNA by single-strand conformation polymorphism analysis and DNA sequence.

    PubMed

    Ngarmamonpirat, Charinthon; Waikagul, Jitra; Petmitr, Songsak; Dekumyoy, Paron; Rojekittikhun, Wichit; Anantapruti, Malinee T

    2005-03-01

    Morphological variations were observed in the advance third stage larvae of Gnathostoma spinigerum collected from swamp eel (Fluta alba), the second intermediate host. Larvae with typical and three atypical types were chosen for partial cytochrome c oxidase subunit I (COI) gene sequence analysis. A 450 bp polymerase chain reaction product of the COI gene was amplified from mitochondrial DNA. The variations were analyzed by single-strand conformation polymorphism and DNA sequencing. The nucleotide variations of the COI gene in the four types of larvae indicated the presence of an intra-specific variation of mitochondrial DNA in the G. spinigerum population.

  10. Automated genomic DNA purification options in agricultural applications using MagneSil paramagnetic particles

    NASA Astrophysics Data System (ADS)

    Bitner, Rex M.; Koller, Susan C.

    2002-06-01

    The automated high throughput purification of genomic DNA from plant materials can be performed using MagneSil paramagnetic particles on the Beckman-Coulter FX, BioMek 2000, and the Tecan Genesis robot. Similar automated methods are available for DNA purifications from animal blood. These methods eliminate organic extractions, lengthy incubations and cumbersome filter plates. The DNA is suitable for applications such as PCR and RAPD analysis. Methods are described for processing traditionally difficult samples such as those containing large amounts of polyphenolics or oils, while still maintaining a high level of DNA purity. The robotic protocols have been optimized for agricultural applications such as marker assisted breeding, seed-quality testing, and SNP discovery and scoring. In addition to high yield purification of DNA from plant samples or animal blood, the use of Promega's DNA-IQ purification system is also described. This method allows for the purification of a narrow range of DNA regardless of the amount of additional DNA that is present in the initial sample. This simultaneous Isolation and Quantification of DNA allows the DNA to be used directly in applications such as PCR, SNP analysis, and RAPD, without the need for separate quantitation of the DNA.

  11. An adaptive, object oriented strategy for base calling in DNA sequence analysis.

    PubMed Central

    Giddings, M C; Brumley, R L; Haker, M; Smith, L M

    1993-01-01

    An algorithm has been developed for the determination of nucleotide sequence from data produced in fluorescence-based automated DNA sequencing instruments employing the four-color strategy. This algorithm takes advantage of object oriented programming techniques for modularity and extensibility. The algorithm is adaptive in that data sets from a wide variety of instruments and sequencing conditions can be used with good results. Confidence values are provided on the base calls as an estimate of accuracy. The algorithm iteratively employs confidence determinations from several different modules, each of which examines a different feature of the data for accurate peak identification. Modules within this system can be added or removed for increased performance or for application to a different task. In comparisons with commercial software, the algorithm performed well. PMID:8233787

  12. Theoretical modelling of epigenetically modified DNA sequences

    PubMed Central

    Carvalho, Alexandra Teresa Pires; Gouveia, Maria Leonor; Raju Kanna, Charan; Wärmländer, Sebastian K. T. S.; Platts, Jamie; Kamerlin, Shina Caroline Lynn

    2015-01-01

    We report herein a set of calculations designed to examine the effects of epigenetic modifications on the structure of DNA. The incorporation of methyl, hydroxymethyl, formyl and carboxy substituents at the 5-position of cytosine is shown to hardly affect the geometry of CG base pairs, but to result in rather larger changes to hydrogen-bond and stacking binding energies, as predicted by dispersion-corrected density functional theory (DFT) methods. The same modifications within double-stranded GCG and ACA trimers exhibit rather larger structural effects, when including the sugar-phosphate backbone as well as sodium counterions and implicit aqueous solvation. In particular, changes are observed in the buckle and propeller angles within base pairs and the slide and roll values of base pair steps, but these leave the overall helical shape of DNA essentially intact. The structures so obtained are useful as a benchmark of faster methods, including molecular mechanics (MM) and hybrid quantum mechanics/molecular mechanics (QM/MM) methods. We show that previously developed MM parameters satisfactorily reproduce the trimer structures, as do QM/MM calculations which treat bases with dispersion-corrected DFT and the sugar-phosphate backbone with AMBER. The latter are improved by inclusion of all six bases in the QM region, since a truncated model including only the central CG base pair in the QM region is considerably further from the DFT structure. This QM/MM method is then applied to a set of double-stranded DNA heptamers derived from a recent X-ray crystallographic study, whose size puts a DFT study beyond our current computational resources. These data show that still larger structural changes are observed than in base pairs or trimers, leading us to conclude that it is important to model epigenetic modifications within realistic molecular contexts. PMID:26448859

  13. Measurement of the sequence specificity of covalent DNA modification by antineoplastic agents using Taq DNA polymerase.

    PubMed Central

    Ponti, M; Forrow, S M; Souhami, R L; D'Incalci, M; Hartley, J A

    1991-01-01

    A polymerase stop assay has been developed to determine the DNA nucleotide sequence specificity of covalent modification by antineoplastic agents using the thermostable DNA polymerase from Thermus aquaticus and synthetic labelled primers. The products of linear amplification are run on sequencing gels to reveal the sites of covalent drug binding. The method has been studied in detail for a number of agents including nitrogen mustards, platinum analogues and mitomycin C, and the sequence specificities obtained accord with those obtained by other procedures. The assay is advantageous in that it is not limited to a single type of DNA lesion (as in the piperidine cleavage assay for guanine-N7 alkylation), does not require a strand breakage step, and is more sensitive than other primer extension procedures which have only one cycle of polymerization. In particular the method has considerable potential for examining the sequence selectivity of damage and repair in single copy gene sequences in genomic DNA from cells. PMID:2057351

  14. Probabilistic models for semisupervised discriminative motif discovery in DNA sequences.

    PubMed

    Kim, Jong Kyoung; Choi, Seungjin

    2011-01-01

    Methods for discriminative motif discovery in DNA sequences identify transcription factor binding sites (TFBSs), searching only for patterns that differentiate two sets (positive and negative sets) of sequences. On one hand, discriminative methods increase the sensitivity and specificity of motif discovery, compared to generative models. On the other hand, generative models can easily exploit unlabeled sequences to better detect functional motifs when labeled training samples are limited. In this paper, we develop a hybrid generative/discriminative model which enables us to make use of unlabeled sequences in the framework of discriminative motif discovery, leading to semisupervised discriminative motif discovery. Numerical experiments on yeast ChIP-chip data for discovering DNA motifs demonstrate that the best performance is obtained between the purely generative and the purely discriminative models, and that semisupervised learning improves performance when labeled sequences are limited.

  15. Implementation of an Automated High-Throughput Plasmid DNA Production Pipeline.

    PubMed

    Billeci, Karen; Suh, Christopher; Di Ioia, Tina; Singh, Lovejit; Abraham, Ryan; Baldwin, Anne; Monteclaro, Stephen

    2016-12-01

    Biologics sample management facilities are often responsible for a diversity of large-molecule reagent types, such as DNA, RNAi, and protein libraries. Historically, the management of large molecules was dispersed into multiple laboratories. As methodologies to support pathway discovery, antibody discovery, and protein production have become high throughput, the implementation of automation and centralized inventory management tools has become important. To this end, to improve sample tracking, throughput, and accuracy, we have implemented a module-based automation system integrated into inventory management software using multiple platforms (Hamilton, Hudson, Dynamic Devices, and Brooks). Here we describe the implementation of these systems with a focus on high-throughput plasmid DNA production management.

  16. Selective enrichment of damaged DNA molecules for ancient genome sequencing

    PubMed Central

    2014-01-01

    Contamination by present-day human and microbial DNA is one of the major hindrances for large-scale genomic studies using ancient biological material. We describe a new molecular method, U selection, which exploits one of the most distinctive features of ancient DNA—the presence of deoxyuracils—for selective enrichment of endogenous DNA against a complex background of contamination during DNA library preparation. By applying the method to Neanderthal DNA extracts that are heavily contaminated with present-day human DNA, we show that the fraction of useful sequence information increases ∼10-fold and that the resulting sequences are more efficiently depleted of human contamination than when using purely computational approaches. Furthermore, we show that U selection can lead to a four- to fivefold increase in the proportion of endogenous DNA sequences relative to those of microbial contaminants in some samples. U selection may thus help to lower the costs for ancient genome sequencing of nonhuman samples also. PMID:25081630

  17. Applications of recursive segmentation to the analysis of DNA sequences.

    PubMed

    Li, Wentian; Bernaola-Galván, Pedro; Haghighi, Fatameh; Grosse, Ivo

    2002-07-01

    Recursive segmentation is a procedure that partitions a DNA sequence into domains with a homogeneous composition of the four nucleotides A, C, G and T. This procedure can also be applied to any sequence converted from a DNA sequence, such as to a binary strong(G + C)/weak(A + T) sequence, to a binary sequence indicating the presence or absence of the dinucleotide CpG, or to a sequence indicating both the base and the codon position information. We apply various conversion schemes in order to address the following five DNA sequence analysis problems: isochore mapping, CpG island detection, locating the origin and terminus of replication in bacterial genomes, finding complex repeats in telomere sequences, and delineating coding and noncoding regions. We find that the recursive segmentation procedure can successfully detect isochore borders, CpG islands, and the origin and terminus of replication, but it needs improvement for detecting complex repeats as well as borders between coding and noncoding regions.
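
    A minimal sketch of the recursive procedure follows. The abstract does not state the split criterion, so this sketch assumes the Jensen-Shannon divergence between the compositions of the left and right parts, and it replaces any statistical significance test with a fixed, arbitrary threshold. Function names and default parameters are illustrative.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy in bits of a symbol-count table."""
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def best_split(seq, min_len):
    """Cut point that maximises the Jensen-Shannon divergence between the
    two sides, considering only cuts that leave >= min_len symbols each."""
    n = len(seq)
    total = Counter(seq)
    h_all = entropy(total, n)
    left = Counter()
    best_pos, best_js = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        if i < min_len or n - i < min_len:
            continue
        right = total - left                 # counts to the right of the cut
        js = h_all - (i / n) * entropy(left, i) - ((n - i) / n) * entropy(right, n - i)
        if js > best_js:
            best_pos, best_js = i, js
    return best_pos, best_js


def segment(seq, threshold=0.1, min_len=10, offset=0):
    """Recursively split `seq`; returns sorted domain borders (0-based
    indices where a new domain starts)."""
    pos, js = best_split(seq, min_len)
    if pos is None or js < threshold:        # homogeneous enough: stop
        return []
    return (segment(seq[:pos], threshold, min_len, offset)
            + [offset + pos]
            + segment(seq[pos:], threshold, min_len, offset + pos))
```

    Because the function works on any symbol sequence, the conversions mentioned in the abstract (for example mapping G and C to "S" and A and T to "W") can be applied to the input before calling `segment`.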

  18. A Fast Algorithm for Exonic Regions Prediction in DNA Sequences

    PubMed Central

    Saberkari, Hamidreza; Shamsi, Mousa; Heravi, Hamed; Sedaaghi, Mohammad Hossein

    2013-01-01

    The main purpose of this paper is to introduce a fast method for gene prediction in DNA sequences based on the period-3 property in exons. First, the symbolic DNA sequences were converted to a digital signal using the electron ion interaction potential method. Then, to reduce the effect of background noise in the period-3 spectrum, we applied a three-level discrete wavelet transform to the input digital signal. Finally, the Goertzel algorithm was used to extract period-3 components in the filtered DNA sequence. The proposed algorithm decreases the computational complexity and hence increases the speed of the process. Accurate detection of small exons in DNA sequences is another advantage of the algorithm. The ability of the proposed algorithm in exon prediction was compared with several existing methods at the nucleotide level using: (i) specificity-sensitivity values; (ii) receiver operating characteristic (ROC) curves; and (iii) area under the ROC curve. Simulation results confirmed that the proposed method can be used as a promising tool for exon prediction in DNA sequences. PMID:24672762
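
    Two of the steps named above, the electron ion interaction potential (EIIP) mapping and the Goertzel evaluation of the period-3 component, can be sketched briefly. The wavelet denoising step is omitted here, and the window length, step size and function names are assumptions rather than the authors' settings.

```python
import math

# Standard EIIP values for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_signal(seq):
    """Map a DNA string to its numeric EIIP representation."""
    return [EIIP[base] for base in seq.upper()]


def goertzel_power(x, k):
    """Squared magnitude of DFT bin k of x, computed with the Goertzel
    recurrence (one pass, no full FFT needed)."""
    n = len(x)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for sample in x:
        s0 = sample + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def period3_profile(seq, window=90, step=3):
    """Period-3 power (DFT bin N/3) in sliding windows along the sequence."""
    if window % 3 != 0:
        raise ValueError("window length must be a multiple of 3")
    signal = eiip_signal(seq)
    k = window // 3
    return [
        goertzel_power(signal[i:i + window], k)
        for i in range(0, len(signal) - window + 1, step)
    ]
```

    Only one frequency bin is needed per window, which is why a single-bin method such as Goertzel can be cheaper than computing a full spectrum. Peaks in the resulting profile would be candidate coding regions, subject to a threshold that this sketch does not choose.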

  19. Improved algorithm for analysis of DNA sequences using multiresolution transformation.

    PubMed

    Inbamalar, T M; Sivakumar, R

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA), the ribonucleic acid (RNA), and the proteins. Fast and precise identification of the protein coding regions in DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solutions with greater background noise. Hence, improvements in accuracy, computational complexity, and reduction in background noise are essential in identification of the protein coding regions in the DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using electron ion interaction potential (EIIP) representation. Then the discrete wavelet transformation is taken. The absolute value of the energy is found, followed by a proper threshold. The test is conducted using the databases available in the National Center for Biotechnology Information (NCBI) site. A comparative analysis confirms the efficiency of the proposed system.

  20. Improved Algorithm for Analysis of DNA Sequences Using Multiresolution Transformation

    PubMed Central

    Inbamalar, T. M.; Sivakumar, R.

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA), the ribonucleic acid (RNA), and the proteins. Fast and precise identification of the protein coding regions in DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solutions with greater background noise. Hence, improvements in accuracy, computational complexity, and reduction in background noise are essential in identification of the protein coding regions in the DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using electron ion interaction potential (EIIP) representation. Then the discrete wavelet transformation is taken. The absolute value of the energy is found, followed by a proper threshold. The test is conducted using the databases available in the National Center for Biotechnology Information (NCBI) site. A comparative analysis confirms the efficiency of the proposed system. PMID:26000337

  1. Management of High-Throughput DNA Sequencing Projects: Alpheus

    PubMed Central

    Miller, Neil A.; Kingsmore, Stephen F.; Farmer, Andrew; Langley, Raymond J.; Mudge, Joann; Crow, John A.; Gonzalez, Alvaro J.; Schilkey, Faye D.; Kim, Ryan J.; van Velkinburgh, Jennifer; May, Gregory D.; Black, C. Forrest; Myers, M. Kathy; Utsey, John P.; Frost, Nicholas S.; Sugarbaker, David J.; Bueno, Raphael; Gullans, Stephen R.; Baxter, Susan M.; Day, Steve W.; Retzel, Ernest F.

    2009-01-01

    High-throughput DNA sequencing has enabled systems biology to begin to address areas in health, agricultural and basic biological research. Concomitant with the opportunities is an absolute necessity to manage significant volumes of high-dimensional and inter-related data and analysis. Alpheus is an analysis pipeline, database and visualization software for use with massively parallel DNA sequencing technologies that feature multi-gigabase throughput characterized by relatively short reads, such as Illumina-Solexa (sequencing-by-synthesis), Roche-454 (pyrosequencing) and Applied Biosystems' SOLiD (sequencing-by-ligation). Alpheus enables alignment to reference sequence(s), detection of variants and enumeration of sequence abundance, including expression levels in transcriptome sequence. Alpheus is able to detect several types of variants, including non-synonymous and synonymous single nucleotide polymorphisms (SNPs), insertions/deletions (indels), premature stop codons, and splice isoforms. Variant detection is aided by the ability to filter variant calls based on consistency, expected allele frequency, sequence quality, coverage, and variant type in order to minimize false positives while maximizing the identification of true positives. Alpheus also enables comparisons of genes with variants between cases and controls or bulk segregant pools. Sequence-based differential expression comparisons can be developed, with data export to SAS JMP Genomics for statistical analysis. PMID:20151039
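
    Filtering variant calls on quality, coverage, allele frequency and variant type, as described above, amounts to applying a set of thresholds to each call. The sketch below is not Alpheus code or its API; the record fields, function names and default thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    # Hypothetical record layout, for illustration only.
    position: int
    variant_type: str        # e.g. "SNP" or "indel"
    quality: float           # sequence/base quality score
    coverage: int            # reads covering the position
    allele_frequency: float  # fraction of reads supporting the variant


def passes_filters(call, min_quality=20.0, min_coverage=10,
                   min_allele_frequency=0.2, allowed_types=("SNP", "indel")):
    """True if the call clears every threshold (all defaults are arbitrary)."""
    return (
        call.variant_type in allowed_types
        and call.quality >= min_quality
        and call.coverage >= min_coverage
        and call.allele_frequency >= min_allele_frequency
    )


def filter_calls(calls, **thresholds):
    """Keep only the calls that pass; thresholds override the defaults."""
    return [call for call in calls if passes_filters(call, **thresholds)]
```

    Tightening the thresholds trades sensitivity for fewer false positives, which is the balance the abstract describes.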

  2. Mitochondrial DNA Sequence Analysis - Validation and Use for Forensic Casework.

    PubMed

    Holland, M M; Parsons, T J

    1999-06-01

    With the discovery of the polymerase chain reaction (PCR) in the mid-1980s, the last in a series of critical molecular biology techniques (to include the isolation of DNA from human and non-human biological material, and primary sequence analysis of DNA) had been developed to rapidly analyze minute quantities of mitochondrial DNA (mtDNA). This was especially true for mtDNA isolated from challenged sources, such as ancient or aged skeletal material and hair shafts. One of the beneficiaries of this work has been the forensic community. Over the last decade, a significant amount of research has been conducted to develop PCR-based sequencing assays for the mtDNA control region (CR), which have subsequently been used to further characterize the CR. As a result, the reliability of these assays has been investigated, the limitations of the procedures have been determined, and critical aspects of the analysis process have been identified, so that careful control and monitoring will provide the basis for reliable testing. With the application of these assays to forensic identification casework, mtDNA sequence analysis has been properly validated, and is a reliable procedure for the examination of biological evidence encountered in forensic criminalistic cases. Copyright © 1999 Central Police University.

  3. Ribosomal DNA copy number loss and sequence variation in cancer

    PubMed Central

    Xu, Baoshan; Li, Hua; Perry, John M.; Singh, Vijay Pratap; Yu, Zulin; Zakari, Musinu; Li, Linheng

    2017-01-01

    Ribosomal DNA is one of the most variable regions in the human genome with respect to copy number. Despite the importance of rDNA for cellular function, we know virtually nothing about what governs its copy number, stability, and sequence in the mammalian genome due to challenges associated with mapping and analysis. We applied computational and droplet digital PCR approaches to measure rDNA copy number in normal and cancer states in human and mouse genomes. We find that copy number and sequence can change in cancer genomes. Counterintuitively, human cancer genomes show a loss of copies, accompanied by global copy number co-variation. The sequence can also be more variable in the cancer genome. Cancer genomes with lower copies have mutational evidence of mTOR hyperactivity. The PTEN phosphatase is a tumor suppressor that is critical for genome stability and a negative regulator of the mTOR kinase pathway. Surprisingly, but consistent with the human cancer genomes, hematopoietic cancer stem cells from a Pten-/- mouse model for leukemia have lower rDNA copy number than normal tissue, despite increased proliferation, rRNA production, and protein synthesis. Loss of copies occurs early and is associated with hypersensitivity to DNA damage. Therefore, copy loss is a recurrent feature in cancers associated with mTOR activation. Ribosomal DNA copy number may be a simple and useful indicator of whether a cancer will be sensitive to DNA damaging treatments. PMID:28640831

  4. Ribosomal DNA copy number loss and sequence variation in cancer.

    PubMed

    Xu, Baoshan; Li, Hua; Perry, John M; Singh, Vijay Pratap; Unruh, Jay; Yu, Zulin; Zakari, Musinu; McDowell, William; Li, Linheng; Gerton, Jennifer L

    2017-06-01

    Ribosomal DNA is one of the most variable regions in the human genome with respect to copy number. Despite the importance of rDNA for cellular function, we know virtually nothing about what governs its copy number, stability, and sequence in the mammalian genome due to challenges associated with mapping and analysis. We applied computational and droplet digital PCR approaches to measure rDNA copy number in normal and cancer states in human and mouse genomes. We find that copy number and sequence can change in cancer genomes. Counterintuitively, human cancer genomes show a loss of copies, accompanied by global copy number co-variation. The sequence can also be more variable in the cancer genome. Cancer genomes with lower copies have mutational evidence of mTOR hyperactivity. The PTEN phosphatase is a tumor suppressor that is critical for genome stability and a negative regulator of the mTOR kinase pathway. Surprisingly, but consistent with the human cancer genomes, hematopoietic cancer stem cells from a Pten-/- mouse model for leukemia have lower rDNA copy number than normal tissue, despite increased proliferation, rRNA production, and protein synthesis. Loss of copies occurs early and is associated with hypersensitivity to DNA damage. Therefore, copy loss is a recurrent feature in cancers associated with mTOR activation. Ribosomal DNA copy number may be a simple and useful indicator of whether a cancer will be sensitive to DNA damaging treatments.

  5. Aozan: an automated post-sequencing data-processing pipeline.

    PubMed

    Perrin, Sandrine; Firmo, Cyril; Lemoine, Sophie; Le Crom, Stéphane; Jourdren, Laurent

    2017-07-15

    Data management and quality control of output from Illumina sequencers is a disk space- and time-consuming task. Thus, we developed Aozan to automatically handle data transfer, demultiplexing, conversion and quality control once a run has finished. This software greatly improves run data management and the monitoring of run statistics via automatic emails and HTML web reports. Aozan is implemented in Java and Python, supported on Linux systems, and distributed under the GPLv3 License at: http://www.outils.genomique.biologie.ens.fr/aozan/ . Aozan source code is available on GitHub: https://github.com/GenomicParisCentre/aozan . Contact: aozan@biologie.ens.fr.

  6. DNA sequence and structure requirements for cleavage of V(D)J recombination signal sequences.

    PubMed Central

    Cuomo, C A; Mundy, C L; Oettinger, M A

    1996-01-01

    Purified RAG1 and RAG2 proteins can cleave DNA at V(D)J recombination signals. In dissecting the DNA sequence and structural requirements for cleavage, we find that the heptamer and nonamer motifs of the recombination signal sequence can independently direct both steps of the cleavage reaction. Proper helical spacing between these two elements greatly enhances the efficiency of cleavage, whereas improper spacing can lead to interference between the two elements. The signal sequences are surprisingly tolerant of structural variation and function efficiently when nicks, gaps, and mismatched bases are introduced or even when the signal sequence is completely single stranded. Sequence alterations that facilitate unpairing of the bases at the signal/coding border activate the cleavage reaction, suggesting that DNA distortion is critical for V(D)J recombination. PMID:8816481

  7. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    PubMed Central

    2011-01-01

    Background: Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results: The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions: PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/. PMID:21385349

  8. Label-free DNA sequencing using Millikan detection.

    PubMed

    Dettloff, Roger; Leiske, Danielle; Chow, Andrea; Farinas, Javier

    2015-10-15

    A label-free method for DNA sequencing based on the principle of the Millikan oil drop experiment was developed. This sequencing-by-synthesis approach sensed increases in bead charge as nucleotides were added by a polymerase to DNA templates attached to beads. The balance between an electrical force, which was dependent on the number of nucleotide charges on a bead, and opposing hydrodynamic drag and restoring tether forces resulted in a bead velocity that was a function of the number of nucleotides attached to the bead. The velocity of beads tethered via a polymer to a microfluidic channel and subjected to an oscillating electric field was measured using dark-field microscopy and used to determine how many nucleotides were incorporated during each sequencing-by-synthesis cycle. Increases in bead velocity of approximately 1% were reliably detected during DNA polymerization, allowing for sequencing of short DNA templates. The method could lead to a low-cost, high-throughput sequencing platform that could enable routine sequencing in medical applications.

  9. Label-Free DNA Sequencing Using Millikan Detection

    PubMed Central

    Dettloff, Roger; Leiske, Danielle; Chow, Andrea; Farinas, Javier

    2015-01-01

    A label-free method for DNA sequencing based on the principle of the Millikan oil drop experiment was developed. This sequencing-by-synthesis approach sensed increases in bead charge as nucleotides were added by a polymerase to DNA templates attached to beads. The balance between an electrical force, which was dependent on the number of nucleotide charges on a bead, and opposing hydrodynamic drag and restoring tether forces resulted in a bead velocity that was a function of the number of nucleotides attached to the bead. The velocity of beads tethered via a polymer to a microfluidic channel and subjected to an oscillating electric field was measured using dark-field microscopy and used to determine how many nucleotides were incorporated during each sequencing-by-synthesis cycle. Increases in bead velocity of ~1% were reliably detected during DNA polymerization, allowing for sequencing of short DNA templates. The method could lead to a low-cost, high-throughput sequencing platform that could enable routine sequencing in medical applications. PMID:26151683

  10. Mapping DNA polymerase errors by single-molecule sequencing

    PubMed Central

    Lee, David F.; Lu, Jenny; Chang, Seungwoo; Loparo, Joseph J.; Xie, Xiaoliang S.

    2016-01-01

    Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replication product is tagged with a unique nucleotide sequence before amplification. This allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases. PMID:27185891
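
    The barcoding idea in this abstract can be illustrated with a minimal sketch. It assumes that reads sharing a barcode are already aligned and of equal length, and it uses a simple per-position majority vote; the function name `consensus_by_barcode` and the `min_reads` threshold are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=3):
    """Collapse (barcode, sequence) reads into one consensus per barcode.

    Assumes all reads sharing a barcode have the same length and are
    already aligned; barcodes seen fewer than min_reads times are dropped.
    """
    groups = defaultdict(list)
    for barcode, seq in reads:
        groups[barcode].append(seq)

    consensus = {}
    for barcode, seqs in groups.items():
        if len(seqs) < min_reads:
            continue
        # Majority vote per column: an isolated sequencing error is outvoted,
        # whereas a change present in every read of the product is kept.
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


reads = [
    ("AAC", "ACGT"), ("AAC", "ACGT"), ("AAC", "ACTT"),  # one sequencing error
    ("GGT", "ATGT"), ("GGT", "ATGT"), ("GGT", "ATGT"),  # change seen in all reads
    ("TTA", "ACGT"),                                    # too few reads to trust
]
result = consensus_by_barcode(reads)
```

    In this toy input the single discordant read under barcode AAC is outvoted, while the change shared by every read under barcode GGT survives into the consensus.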

  11. Mapping DNA polymerase errors by single-molecule sequencing

    DOE PAGES

    Lee, David F.; Lu, Jenny; Chang, Seungwoo; ...

    2016-05-16

    Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replication product is tagged with a unique nucleotide sequence before amplification. Here, this allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases.

  12. VoSeq: a voucher and DNA sequence web application.

    PubMed

    Peña, Carlos; Malm, Tobias

    2012-01-01

    There is an ever-growing number of molecular phylogenetic studies published, due in part to the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report new software that facilitates easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit).

  13. Hiding message into DNA sequence through DNA coding and chaotic maps.

    PubMed

    Liu, Guoyan; Liu, Hongjun; Kadir, Abdurahman

    2014-09-01

    The paper proposes an improved reversible substitution method for hiding data in a deoxyribonucleic acid (DNA) sequence. Four measures are taken to enhance robustness and enlarge the hiding capacity: the secret message is encoded by DNA coding, encrypted with a pseudo-random sequence, assigned relative hiding locations generated by a piecewise linear chaotic map, and embedded, in its encoded and encrypted form, into a randomly selected DNA sequence using the complementary rule. The key space and the hiding capacity are analyzed. Experimental results indicate that the proposed method outperforms competing methods with respect to robustness and capacity.
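
    Two of the ingredients named in this abstract, DNA coding and the piecewise linear chaotic map (PWLCM), can be sketched as follows. The 2-bit-per-base coding table, the helper `hiding_offsets`, and all parameter values are assumptions chosen for illustration; the paper's actual coding rule, encryption step, and complementary-rule embedding are not reproduced here.

```python
# Assumed coding rule (one of several possible); not taken from the paper.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode_dna(message: bytes) -> str:
    """Map every pair of message bits to one nucleotide."""
    bits = "".join(f"{byte:08b}" for byte in message)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode_dna(bases: str) -> bytes:
    """Invert encode_dna: four nucleotides back to one byte."""
    bits = "".join(BASE_TO_BITS[b] for b in bases)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def pwlcm(x: float, p: float) -> float:
    """One step of the piecewise linear chaotic map, with 0 < p < 0.5."""
    if x >= 0.5:
        x = 1.0 - x  # the map is symmetric about x = 0.5
    if x < p:
        return x / p
    return (x - p) / (0.5 - p)


def hiding_offsets(x0: float, p: float, count: int, max_gap: int) -> list:
    """Hypothetical helper: derive offsets in 1..max_gap from the chaotic orbit."""
    offsets, x = [], x0
    for _ in range(count):
        x = pwlcm(x, p)
        offsets.append(1 + int(x * max_gap) % max_gap)
    return offsets


encoded = encode_dna(b"Hi")
offsets = hiding_offsets(x0=0.3141, p=0.27, count=len(encoded), max_gap=8)
```

    Because the orbit is fully determined by the initial value and the control parameter, a receiver who knows both (here `x0` and `p`) can regenerate the same offsets, which is why such values can act as part of a key.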

  14. Correlations in DNA sequences across the three domains of life

    NASA Astrophysics Data System (ADS)

    Guharay, Sabyasachi; Hunt, Brian R.; Yorke, James A.; White, Owen R.

    2000-11-01

    We report statistical studies of correlation properties of ∼7500 gene sequences, covering coding (exon) and non-coding (intron) sequences for DNA and primary amino acid sequences for proteins, across all three domains of life, namely Eukaryotes (cells with nuclei), Prokaryotes (bacteria) and Archaea (archaebacteria). Mutual information function, power spectrum and Hölder exponent analyses show exons with somewhat greater correlation content than the introns studied. These results are further confirmed with hypothesis testing. While ∼30% of the Eukaryote coding sequences show distinct correlations above noise threshold, this is true for only ∼10% of the Prokaryote and Archaea coding sequences. For protein sequences, we observe correlation lengths similar to that of “random” sequences.
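
    As a rough illustration of the mutual information function mentioned in this abstract, the sketch below estimates I(k), the mutual information in bits between symbols k positions apart, from simple frequency counts. The function name and the plug-in estimator are assumptions; the authors' exact estimator and any noise-threshold correction are not specified in the abstract.

```python
from collections import Counter
from math import log2


def mutual_information(seq: str, k: int) -> float:
    """Plug-in estimate of mutual information (bits) at separation k."""
    pairs = list(zip(seq, seq[k:]))  # (symbol at i, symbol at i + k)
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    # Sum of p(a, b) * log2(p(a, b) / (p(a) * p(b))) over observed pairs.
    return sum(
        (c / n) * log2(c * n / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )


periodic = "ACGT" * 50
mi_periodic = mutual_information(periodic, 4)   # each symbol predicts itself
mi_constant = mutual_information("A" * 100, 3)  # no information to share
```

    A perfectly periodic four-letter sequence gives 2 bits at a separation equal to its period, while a sequence with no variability gives 0 bits; real sequences fall in between, and short sequences bias this estimator upward.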

  15. Automated reconstruction of 3D scenes from sequences of images

    NASA Astrophysics Data System (ADS)

    Pollefeys, M.; Koch, R.; Vergauwen, M.; Van Gool, L.

    Modelling of 3D objects from image sequences is a challenging problem and has been an important research topic in the areas of photogrammetry and computer vision for many years. In this paper, a system is presented which automatically extracts a textured 3D surface model from a sequence of images of a scene. The system can deal with unknown camera settings. In addition, the parameters of this camera are allowed to change during acquisition (e.g., by zooming or focusing). No prior knowledge about the scene is necessary to build the 3D models. Therefore, this system offers a high degree of flexibility. The system is based on state-of-the-art algorithms recently developed in computer vision. The 3D modelling task is decomposed into a number of successive steps. Gradually, more knowledge of the scene and the camera setup is retrieved. At this point, the obtained accuracy is not yet at the level required for most metrology applications, but the visual quality is very convincing. This system has been applied to a number of applications in archaeology. The Roman site of Sagalassos (southwest Turkey) was used as a test case to illustrate the potential of this new approach.

  16. Evaluation of four automated protocols for extraction of DNA from FTA cards.

    PubMed

    Stangegaard, Michael; Børsting, Claus; Ferrero-Miliani, Laura; Frank-Hansen, Rune; Poulsen, Lena; Hansen, Anders J; Morling, Niels

    2013-10-01

    Extraction of DNA using magnetic bead-based techniques on automated DNA extraction instruments provides a fast, reliable, and reproducible method for DNA extraction from various matrices. Here, we have compared the yield and quality of DNA extracted from FTA cards using four automated extraction protocols on three different instruments. The extraction processes were repeated up to six times with the same pieces of FTA cards. The sample material on the FTA cards was either blood or buccal cells. With the QIAamp DNA Investigator and QIAsymphony DNA Investigator kits, it was possible to extract DNA from the FTA cards in all six rounds of extractions in sufficient amount and quality to obtain complete short tandem repeat (STR) profiles on a QIAcube and a QIAsymphony SP. With the PrepFiler Express kit, almost all the extractable DNA was extracted in the first two rounds of extractions. Furthermore, we demonstrated that it was possible to successfully extract sufficient DNA for STR profiling from previously processed FTA card pieces that had been stored at 4 °C for up to 1 year. This showed that rare or precious FTA card samples may be saved for future analyses even though some DNA was already extracted from the FTA cards.

  17. Highly efficient automated extraction of DNA from old and contemporary skeletal remains.

    PubMed

    Zupanič Pajnič, Irena; Debska, Magdalena; Gornjak Pogorelc, Barbara; Vodopivec Mohorčič, Katja; Balažic, Jože; Zupanc, Tomaž; Štefanič, Borut; Geršak, Ksenija

    2016-01-01

    We optimised the automated extraction of DNA from old and contemporary skeletal remains using the AutoMate Express system and the PrepFiler BTA kit. Twenty-four contemporary and 25 old skeletal remains from WWII were analysed. For each skeleton, extraction using only 0.05 g of powder was performed according to the manufacturer's recommendations (no demineralisation - ND method). Since only 32% of full profiles were obtained from aged and 58% from contemporary casework skeletons, the extraction protocol was modified to acquire higher quality DNA and genomic DNA was obtained after full demineralisation (FD method). The nuclear DNA of the samples was quantified using the Investigator Quantiplex kit and STR typing was performed using the NGM kit to evaluate the performance of tested extraction methods. In the aged DNA samples, 64% of full profiles were obtained using the FD method. For the contemporary skeletal remains the performance of the ND method was closer to the FD method compared to the old skeletons, giving 58% of full profiles with the ND method and 71% of full profiles using the FD method. The extraction of DNA from only 0.05 g of bone or tooth powder using the AutoMate Express has proven highly successful in the recovery of DNA from old and contemporary skeletons, especially with the modified FD method. We believe that the results obtained will contribute to the possibilities of using automated devices for extracting DNA from skeletal remains, which would shorten the procedures for obtaining high-quality DNA from skeletons in forensic laboratories.

  18. Spatially localized generation of nucleotide sequence-specific DNA damage

    PubMed Central

    Oh, Dennis H.; King, Brett A.; Boxer, Steven G.; Hanawalt, Philip C.

    2001-01-01

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen–DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320–400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA–psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen–TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  19. Dialects of the DNA Uptake Sequence in Neisseriaceae

    PubMed Central

    Frye, Stephan A.; Nilsen, Mariann; Tønjum, Tone; Ambur, Ole Herman

    2013-01-01

    In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS), which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS–dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5′-CTG-3′ is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS–dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic transformation

  20. DNA sequence alignment by microhomology sampling during homologous recombination.

    PubMed

    Qi, Zhi; Redding, Sy; Lee, Ja Yil; Gibb, Bryan; Kwon, YoungHo; Niu, Hengyao; Gaines, William A; Sung, Patrick; Greene, Eric C

    2015-02-26

    Homologous recombination (HR) mediates the exchange of genetic information between sister or homologous chromatids. During HR, members of the RecA/Rad51 family of recombinases must somehow search through vast quantities of DNA sequence to align and pair single-strand DNA (ssDNA) with a homologous double-strand DNA (dsDNA) template. Here, we use single-molecule imaging to visualize Rad51 as it aligns and pairs homologous DNA sequences in real time. We show that Rad51 uses a length-based recognition mechanism while interrogating dsDNA, enabling robust kinetic selection of 8-nucleotide (nt) tracts of microhomology, which kinetically confines the search to sites with a high probability of being a homologous target. Successful pairing with a ninth nucleotide coincides with an additional reduction in binding free energy, and subsequent strand exchange occurs in precise 3-nt steps, reflecting the base triplet organization of the presynaptic complex. These findings provide crucial new insights into the physical and evolutionary underpinnings of DNA recombination. Copyright © 2015 Elsevier Inc. All rights reserved.

  1. Mitochondrial DNA Sequence Divergence among Lycopersicon and Related Solanum Species

    PubMed Central

    McClean, Phillip E.; Hanson, Maureen R.

    1986-01-01

    Sequence divergence among the mitochondrial (mt) DNA of nine Lycopersicon and two closely related Solanum species was estimated using the shared fragment method. A portion of each mt genome was highlighted by probing total DNA with a series of plasmid clones containing mt-specific DNA fragments from Lycopersicon pennellii. A total of 660 fragments were compared. As calculated by the shared fragment method, sequence divergence among the mtDNAs ranged from 0.4% for the L. esculentum-L. esculentum var. cerasiforme pair to 2.7% for the Solanum rickii-L. pimpinellifolium and L. cheesmanii-L. chilense pairs. The mtDNA divergence is higher than that reported for Lycopersicon chloroplast (cp) DNA, which indicates that the DNAs of the two plant organelles are evolving at different rates. The percentages of shared fragments were used to construct a phenogram that illustrates the present-day relationships of the mtDNAs. The mtDNA-derived phenogram places L. hirsutum closer to L. esculentum than taxonomic and cpDNA comparisons. Further, the recent assignment of L. pennellii to the genus Lycopersicon is supported by the mtDNA analysis. PMID:17246320

  2. Accelerating Computation of DNA Sequence Alignment in Distributed Environment

    NASA Astrophysics Data System (ADS)

    Guo, Tao; Li, Guiyang; Deaton, Russel

    Sequence similarity and alignment are among the most important operations in computational biology. However, analyzing large sets of DNA sequences is often impractical on a regular PC. Using multiple threads with the JavaParty mechanism, this project extended the capabilities of regular Java to a distributed environment for the simulation of DNA computation. With the aid of JavaParty and a multithreaded design, the study demonstrated that the modified Java program could perform parallel computing without using RMI or socket communication. In this paper, an efficient method for modeling and comparing DNA sequences with dynamic programming and JavaParty is proposed for the first time. Additionally, results of this method in a distributed environment are discussed.
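
    The abstract does not say which dynamic-programming recurrence was used, so the sketch below substitutes the textbook Needleman-Wunsch global alignment score with a linear gap penalty as a stand-in. It is single-threaded Python rather than JavaParty, and the scoring values are arbitrary assumptions.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    # Row 0: aligning an empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of `a` against an empty string
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Best of: align the two characters, gap in `b`, or gap in `a`.
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


identical = global_alignment_score("GATTACA", "GATTACA")
one_gap = global_alignment_score("ACGT", "AGT")
```

    Each cell depends only on its upper, left, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another; that property is what generally makes this kind of table amenable to multithreaded or distributed evaluation.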

  3. Metagenomics to paleogenomics: large-scale sequencing of mammoth DNA.

    PubMed

    Poinar, Hendrik N; Schwarz, Carsten; Qi, Ji; Shapiro, Beth; Macphee, Ross D E; Buigues, Bernard; Tikhonov, Alexei; Huson, Daniel H; Tomsho, Lynn P; Auch, Alexander; Rampp, Markus; Miller, Webb; Schuster, Stephan C

    2006-01-20

    We sequenced 28 million base pairs of DNA in a metagenomics approach, using a woolly mammoth (Mammuthus primigenius) sample from Siberia. As a result of exceptional sample preservation and the use of a recently developed emulsion polymerase chain reaction and pyrosequencing technique, 13 million base pairs (45.4%) of the sequencing reads were identified as mammoth DNA. Sequence identity between our data and African elephant (Loxodonta africana) was 98.55%, consistent with a paleontologically based divergence date of 5 to 6 million years. The sample includes a surprisingly small diversity of environmental DNAs. The high percentage of endogenous DNA recoverable from this single mammoth would allow for completion of its genome, unleashing the field of paleogenomics.

  4. Applications of high-throughput DNA sequencing to benign hematology

    PubMed Central

    Gallagher, Patrick G.

    2013-01-01

    The development of novel technologies for high-throughput DNA sequencing is having a major impact on our ability to measure and define normal and pathologic variation in humans. This review discusses advances in DNA sequencing that have been applied to benign hematologic disorders, including those affecting the red blood cell, the neutrophil, and other white blood cell lineages. Relevant examples of how these approaches have been used for disease diagnosis, gene discovery, and studying complex traits are provided. High-throughput DNA sequencing technology holds significant promise for impacting clinical care. This includes development of improved disease detection and diagnosis, better understanding of disease progression and stratification of risk of disease-specific complications, and development of improved therapeutic strategies, particularly patient-specific pharmacogenomics-based therapy, with monitoring of therapy by genomic biomarkers. PMID:24021670

  5. Compilation of DNA sequences of Escherichia coli (update 1991)

    PubMed Central

    Kröger, Manfred; Wahl, Ralf; Rice, Peter

    1991-01-01

    We have compiled the DNA sequence data for E. coli available from the GENBANK and EMBL data libraries and over a period of several years independently from the literature. This is the third listing, replacing the former listing and increasing it by roughly one fifth. However, in order to save space this printed version contains DNA sequence information only. The complete compilation is now available in machine readable form from the EMBL data library (ECD release 6). After deletion of all detected overlaps, a total of 1 492 282 individual bp had been determined by the beginning of 1991. This corresponds to a total of 31.62% of the entire E. coli chromosome consisting of about 4,720 kbp. This number may actually be higher by some extra 2.5% derived from lysogenic bacteriophage lambda and various DNA sequences already received for statistical purposes only. PMID:2041799

  6. Multiple Base Substitution Corrections in DNA Sequence Evolution

    NASA Astrophysics Data System (ADS)

    Kowalczuk, M.; Mackiewicz, P.; Szczepanik, D.; Nowicka, A.; Dudkiewicz, M.; Dudek, M. R.; Cebrat, S.

    We discuss the inability of Jukes and Cantor's one-parameter model and Kimura's two-parameter model to describe the evolution of asymmetric DNA molecules. The standard distance measure between two DNA sequences, which is the number of substitutions per site, should include the effect of multiple base substitutions separately for each type of base. Otherwise, the respective tables of substitutions cannot reconstruct a DNA molecule that is asymmetric with respect to composition. Based on Kimura's neutral theory, we have derived a linear law for the correlation between the mean survival time of nucleotides under constant mutation pressure and their fraction in the genome. In light of this law, we discuss the corrections to Kimura's theory needed to describe the evolution of genomes with asymmetric nucleotide composition. We consider the particular case of the strongly asymmetric Borrelia burgdorferi genome and discuss in detail the corrections that should be introduced into the distance measure between two DNA sequences to include multiple base substitutions.
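
    For reference, the two standard distance corrections that this abstract takes as its starting point can be written down in a few lines. The sketch uses the textbook Jukes-Cantor and Kimura two-parameter formulas; the helper `count_differences` is a hypothetical convenience, and none of the composition-specific corrections proposed by the authors are implemented here.

```python
from math import log, sqrt

PURINES = {"A", "G"}


def jc69_distance(p: float) -> float:
    """Jukes-Cantor distance from the proportion p of differing sites."""
    return -0.75 * log(1.0 - 4.0 * p / 3.0)


def k2p_distance(P: float, Q: float) -> float:
    """Kimura two-parameter distance from transition (P) and
    transversion (Q) proportions."""
    return -0.5 * log((1.0 - 2.0 * P - Q) * sqrt(1.0 - 2.0 * Q))


def count_differences(s1: str, s2: str):
    """Return (transition, transversion) proportions for aligned sequences."""
    transitions = transversions = 0
    for x, y in zip(s1, s2):
        if x == y:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if (x in PURINES) == (y in PURINES):
            transitions += 1
        else:
            transversions += 1
    return transitions / len(s1), transversions / len(s1)


P, Q = count_differences("AAGT", "GACT")
d_jc = jc69_distance(0.1)
d_k2p = k2p_distance(0.1 / 3, 0.2 / 3)
```

    Both formulas assume equal equilibrium frequencies of the four bases, which is the assumption the abstract identifies as problematic for compositionally asymmetric genomes. When transitions make up exactly one third of all differences, the two distances coincide, as the last two lines show.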

  7. Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy

    PubMed Central

    Schmid, Andreas K.; Davis, Ronald W.

    2016-01-01

    DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitate the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work has shown that using scattering electrons of much lower energy suppresses beam damage to DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. Both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging. PMID:27149617

  8. Facilitated diffusion on mobile DNA: configurational traps and sequence heterogeneity.

    PubMed

    Brackley, C A; Cates, M E; Marenduzzo, D

    2012-10-19

    We present Brownian dynamics simulations of the facilitated diffusion of a protein, modeled as a sphere with a binding site on its surface, along DNA, modeled as a semiflexible polymer. We consider both the effect of DNA organization in three dimensions and of sequence heterogeneity. We find that in a network of DNA loops, which are thought to be present in bacterial DNA, the search process is very sensitive to the spatial location of the target within such loops. Therefore, specific genes might be repressed or promoted by changing the local topology of the genome. On the other hand, sequence heterogeneity creates traps which normally slow down facilitated diffusion. When suitably positioned, though, these traps can, surprisingly, render the search process much more efficient.

  9. Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy.

    PubMed

    Mankos, Marian; Persson, Henrik H J; N'Diaye, Alpha T; Shadman, Khashayar; Schmid, Andreas K; Davis, Ronald W

    2016-01-01

    DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitate the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work using scattering electrons with much lower energy has been shown to suppress beam damage on DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. Both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging.

  10. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1990-10-09

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  11. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, James H.; Keller, Richard A.; Martin, John C.; Moyzis, Robert K.; Ratliff, Robert L.; Shera, E. Brooks; Stewart, Carleton C.

    1990-01-01

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed.

  12. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1987-10-07

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  13. Next Generation DNA Sequencing and the Future of Genomic Medicine

    PubMed Central

    Anderson, Matthew W.; Schrijver, Iris

    2010-01-01

    In the years since the first complete human genome sequence was reported, there has been a rapid development of technologies to facilitate high-throughput sequence analysis of DNA (termed “next-generation” sequencing). These novel approaches to DNA sequencing offer the promise of complete genomic analysis at a cost feasible for routine clinical diagnostics. However, the ability to more thoroughly interrogate genomic sequence raises a number of important issues with regard to result interpretation, laboratory workflow, data storage, and ethical considerations. This review describes the current high-throughput sequencing platforms commercially available, and compares the inherent advantages and disadvantages of each. The potential applications for clinical diagnostics are considered, as well as the need for software and analysis tools to interpret the vast amount of data generated. Finally, we discuss the clinical and ethical implications of the wealth of genetic information generated by these methods. Despite the challenges, we anticipate that the evolution and refinement of high-throughput DNA sequencing technologies will catalyze a new era of personalized medicine based on individualized genomic analysis. PMID:24710010

  14. Ancient mtDNA sequences from the First Australians revisited

    PubMed Central

    Subramanian, Sankar; Wright, Joanne L.; Endicott, Phillip; Westaway, Michael Carrington; Huynen, Leon; Parson, Walther; Millar, Craig D.; Willerslev, Eske; Lambert, David M.

    2016-01-01

    The publication in 2001 by Adcock et al. [Adcock GJ, et al. (2001) Proc Natl Acad Sci USA 98(2):537–542] in PNAS reported the recovery of short mtDNA sequences from ancient Australians, including the 42,000-y-old Mungo Man [Willandra Lakes Hominid (WLH3)]. This landmark study in human ancient DNA suggested that an early modern human mitochondrial lineage emerged in Asia and that the theory of modern human origins could no longer be considered solely through the lens of the “Out of Africa” model. To evaluate these claims, we used second generation DNA sequencing and capture methods as well as PCR-based and single-primer extension (SPEX) approaches to reexamine the same four Willandra Lakes and Kow Swamp 8 (KS8) remains studied in the work by Adcock et al. Two of the remains sampled contained no identifiable human DNA (WLH15 and WLH55), whereas the Mungo Man (WLH3) sample contained no Aboriginal Australian DNA. KS8 reveals human mitochondrial sequences that differ from the previously inferred sequence. Instead, we recover a total of five modern European contaminants from Mungo Man (WLH3). We show that the remaining sample (WLH4) contains ∼1.4% human DNA, from which we assembled two complete mitochondrial genomes. One of these was a previously unidentified Aboriginal Australian haplotype belonging to haplogroup S2 that we sequenced to a high coverage. The other was a contaminating modern European mitochondrial haplotype. Although none of the sequences that we recovered matched those reported by Adcock et al., except a contaminant, these findings show the feasibility of obtaining important information from ancient Aboriginal Australian remains. PMID:27274055

  15. Ancient mtDNA sequences from the First Australians revisited.

    PubMed

    Heupink, Tim H; Subramanian, Sankar; Wright, Joanne L; Endicott, Phillip; Westaway, Michael Carrington; Huynen, Leon; Parson, Walther; Millar, Craig D; Willerslev, Eske; Lambert, David M

    2016-06-21

    The publication in 2001 by Adcock et al. [Adcock GJ, et al. (2001) Proc Natl Acad Sci USA 98(2):537-542] in PNAS reported the recovery of short mtDNA sequences from ancient Australians, including the 42,000-y-old Mungo Man [Willandra Lakes Hominid (WLH3)]. This landmark study in human ancient DNA suggested that an early modern human mitochondrial lineage emerged in Asia and that the theory of modern human origins could no longer be considered solely through the lens of the "Out of Africa" model. To evaluate these claims, we used second generation DNA sequencing and capture methods as well as PCR-based and single-primer extension (SPEX) approaches to reexamine the same four Willandra Lakes and Kow Swamp 8 (KS8) remains studied in the work by Adcock et al. Two of the remains sampled contained no identifiable human DNA (WLH15 and WLH55), whereas the Mungo Man (WLH3) sample contained no Aboriginal Australian DNA. KS8 reveals human mitochondrial sequences that differ from the previously inferred sequence. Instead, we recover a total of five modern European contaminants from Mungo Man (WLH3). We show that the remaining sample (WLH4) contains ∼1.4% human DNA, from which we assembled two complete mitochondrial genomes. One of these was a previously unidentified Aboriginal Australian haplotype belonging to haplogroup S2 that we sequenced to a high coverage. The other was a contaminating modern European mitochondrial haplotype. Although none of the sequences that we recovered matched those reported by Adcock et al., except a contaminant, these findings show the feasibility of obtaining important information from ancient Aboriginal Australian remains.

  16. Multiple sequence alignment: in pursuit of homologous DNA positions.

    PubMed

    Kumar, Sudhir; Filipski, Alan

    2007-02-01

    DNA sequence alignment is a prerequisite to virtually all comparative genomic analyses, including the identification of conserved sequence motifs, estimation of evolutionary divergence between sequences, and inference of historical relationships among genes and species. While it is mere common sense that inaccuracies in multiple sequence alignments can have detrimental effects on downstream analyses, it is important to know the extent to which the inferences drawn from these alignments are robust to errors and biases inherent in all sequence alignments. A survey of investigations into strengths and weaknesses of sequence alignments reveals, as expected, that alignment quality is generally poor for two distantly related sequences and can often be improved by adding additional sequences as stepping stones between distantly related species. Errors in sequence alignment are also found to have a significant negative effect on subsequent inference of sequence divergence, phylogenetic trees, and conserved motifs. However, our understanding of alignment biases remains rudimentary, and sequence alignment procedures continue to be used somewhat like benign formatting operations to make sequences equal in length. Because of the central role these alignments now play in our endeavors to establish the tree of life and to identify important parts of genomes through evolutionary functional genomics, we see a need for increased community effort to investigate influences of alignment bias on the accuracy of large-scale comparative genomics.

  17. Mapping DNA methylation with high-throughput nanopore sequencing.

    PubMed

    Rand, Arthur C; Jain, Miten; Eizenga, Jordan M; Musselman-Brown, Audrey; Olsen, Hugh E; Akeson, Mark; Paten, Benedict

    2017-04-01

    DNA chemical modifications regulate genomic function. We present a framework for mapping cytosine and adenosine methylation with the Oxford Nanopore Technologies MinION using this nanopore sequencer's ionic current signal. We map three cytosine variants and two adenine variants. The results show that our model is sensitive enough to detect changes in genomic DNA methylation levels as a function of growth phase in Escherichia coli.

  18. Automating HIV Drug Resistance Genotyping with RECall, a Freely Accessible Sequence Analysis Tool

    PubMed Central

    Woods, Conan K.; Brumme, Chanson J.; Liu, Tommy F.; Chui, Celia K. S.; Chu, Anna L.; Wynhoven, Brian; Hall, Tom A.; Trevino, Christina; Shafer, Robert W.

    2012-01-01

    Genotypic HIV drug resistance testing is routinely used to guide clinical decisions. While genotyping methods can be standardized, a slow, labor-intensive, and subjective manual sequence interpretation step is required. We therefore performed external validation of our custom software RECall, a fully automated sequence analysis pipeline. HIV-1 drug resistance genotyping was performed on 981 clinical samples at the Stanford Diagnostic Virology Laboratory. Sequencing trace files were first interpreted manually by a laboratory technician and subsequently reanalyzed by RECall, without intervention. The relative performances of the two methods were assessed by determination of the concordance of nucleotide base calls, identification of key resistance-associated substitutions, and HIV drug resistance susceptibility scoring by the Stanford Sierra algorithm. RECall is freely available at http://pssm.cfenet.ubc.ca. In total, 875 of 981 sequences were analyzed by both human and RECall interpretation. RECall analysis required minimal hands-on time and resulted in a 25-fold improvement in processing speed (∼150 technician-hours versus ∼6 computation-hours). Excellent concordance was obtained between human and automated RECall interpretation (99.7% agreement for >1,000,000 bases compared). Nearly all discordances (99.4%) were due to nucleotide mixtures being called by one method but not the other. Similarly, 98.6% of key antiretroviral resistance-associated mutations observed were identified by both methods, resulting in 98.5% concordance of resistance susceptibility interpretations. This automated sequence analysis tool provides both standardization of analysis and a significant improvement in data workflow. The time-consuming, error-prone, and dreadfully boring manual sequence analysis step is replaced with a fully automated system without compromising the accuracy of reported HIV drug resistance data. PMID:22403431

  19. Mitochondrial DNA sequences from a 7000-year old brain.

    PubMed Central

    Pääbo, S; Gifford, J A; Wilson, A C

    1988-01-01

    Pieces of mitochondrial DNA from a 7000-year-old human brain were amplified by the polymerase chain reaction and sequenced. Albumin and high concentrations of polymerase were required to overcome a factor in the brain extract that inhibits amplification. For this and other sources of ancient DNA, we find an extreme inverse dependence of the amplification efficiency on the length of the sequence to be amplified. This property of ancient DNA distinguishes it from modern DNA and thus provides a new criterion of authenticity for use in research on ancient DNA. The brain is from an individual recently excavated from Little Salt Spring in southwestern Florida and the anthropologically informative sequences it yielded are the first obtained from archaeologically retrieved remains. The sequences show that this ancient individual belonged to a mitochondrial lineage that is rare in the Old World and not previously known to exist among Native Americans. Our finding brings to three the number of maternal lineages known to have been involved in the prehistoric colonization of the New World. PMID:3186445

  20. Sequence dependence of transcription factor-mediated DNA looping.

    PubMed

    Johnson, Stephanie; Lindén, Martin; Phillips, Rob

    2012-09-01

    DNA is subject to large deformations in a wide range of biological processes. Two key examples illustrate how such deformations influence the readout of the genetic information: the sequestering of eukaryotic genes by nucleosomes and DNA looping in transcriptional regulation in both prokaryotes and eukaryotes. These kinds of regulatory problems are now becoming amenable to systematic quantitative dissection with a powerful dialogue between theory and experiment. Here, we use a single-molecule experiment in conjunction with a statistical mechanical model to test quantitative predictions for the behavior of DNA looping at short length scales and to determine how DNA sequence affects looping at these lengths. We calculate and measure how such looping depends upon four key biological parameters: the strength of the transcription factor binding sites, the concentration of the transcription factor, and the length and sequence of the DNA loop. Our studies lead to the surprising insight that sequences that are thought to be especially favorable for nucleosome formation because of high flexibility lead to no systematically detectable effect of sequence on looping, and begin to provide a picture of the distinctions between the short length scale mechanics of nucleosome formation and looping.

  1. The HTS barcode checker pipeline, a tool for automated detection of illegally traded species from high-throughput sequencing data

    PubMed Central

    2014-01-01

    Background: Mixtures of internationally traded organic substances can contain parts of species protected by the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). These mixtures often raise the suspicion of border control and customs offices, which can lead to confiscation, for example in the case of Traditional Chinese medicines (TCMs). High-throughput sequencing of DNA barcoding markers obtained from such samples provides insight into species constituents of mixtures, but manual cross-referencing of results against the CITES appendices is labor intensive. Matching DNA barcodes against NCBI GenBank using BLAST may yield misleading results both as false positives, due to incorrectly annotated sequences, and false negatives, due to spurious taxonomic re-assignment. Incongruence between the taxonomies of CITES and NCBI GenBank can result in erroneous estimates of illegal trade. Results: The HTS barcode checker pipeline is an application for automated processing of sets of 'next generation' barcode sequences to determine whether these contain DNA barcodes obtained from species listed on the CITES appendices. This analytical pipeline builds upon and extends existing open-source applications for BLAST matching against the NCBI GenBank reference database and for taxonomic name reconciliation. In a single operation, reads are converted into taxonomic identifications matched with names on the CITES appendices. By inclusion of a blacklist and additional names databases, the HTS barcode checker pipeline prevents false positives and resolves taxonomic heterogeneity. Conclusions: The HTS barcode checker pipeline can detect and correctly identify DNA barcodes of CITES-protected species from reads obtained from TCM samples in just a few minutes. The pipeline facilitates and improves molecular monitoring of trade in endangered species, and can aid in safeguarding these species from extinction in the wild. The HTS barcode checker pipeline is

  2. Transcriptome analysis by strand-specific sequencing of complementary DNA.

    PubMed

    Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey

    2009-10-01

    High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online.

  3. Isolation of a sex-linked DNA sequence in cranes.

    PubMed

    Duan, W; Fuerst, P A

    2001-01-01

    A female-specific DNA fragment (CSL-W; crane sex-linked DNA on W chromosome) was cloned from female whooping cranes (Grus americana). From the nucleotide sequence of CSL-W, a set of polymerase chain reaction (PCR) primers was identified which amplify a 227-230 bp female-specific fragment from all existing crane species and some other noncrane species. A duplicated version of the DNA segment, which is found to have a larger size (231-235 bp) than CSL-W in both sexes, was also identified, and was designated CSL-NW (crane sex-linked DNA on non-W chromosome). The nucleotide similarity between the sequences of CSL-W and CSL-NW from whooping cranes was 86.3%. The CSL primers do not amplify any sequence from mammalian DNA, limiting the potential for contamination from human sources. Using the CSL primers in combination with a quick DNA extraction method allows the noninvasive identification of crane gender in less than 10 h. A test of the methodology was carried out on fully developed body feathers from 18 captive cranes and resulted in 100% successful identification.

  4. Bayesian estimation of sequence damage in ancient DNA.

    PubMed

    Ho, Simon Y W; Heupink, Tim H; Rambaut, Andrew; Shapiro, Beth

    2007-06-01

    DNA extracted from archaeological and paleontological remains is usually damaged by biochemical processes postmortem. Some of these processes lead to changes in the structure of the DNA molecule, which can result in the incorporation of incorrect nucleotides during polymerase chain reaction. These base misincorporations, or miscoding lesions, can lead to the inclusion of spurious additional mutations in ancient DNA (aDNA) data sets. This has the potential to affect the outcome of phylogenetic and population genetic analyses, including estimates of mutation rates and genetic diversity. We present a novel model, termed the delta model, which estimates the amount of damage in DNA data and accounts for its effects in a Bayesian phylogenetic framework. The ability of the delta model to estimate damage is first investigated using a simulation study. The model is then applied to 13 aDNA data sets. The amount of damage in these data sets is shown to be significant but low (about 1 damaged base per 750 nt), suggesting that precautions for limiting the influence of damaged sites, such as cloning and enzymatic treatment, are worthwhile. The results also suggest that relatively high rates of mutation previously estimated from aDNA data are not entirely an artifact of sequence damage and are likely to be due to other factors such as the persistence of transient polymorphisms. The delta model appears to be particularly useful for placing upper credibility limits on the amount of sequence damage in an alignment, and this capacity might be beneficial for future aDNA studies or for the estimation of sequencing errors in modern DNA.

  5. Fast comparison of DNA sequences by oligonucleotide profiling

    PubMed Central

    Arnau, Vicente; Gallach, Miguel; Marín, Ignacio

    2008-01-01

    Background: The comparison of DNA sequences is a traditional problem in genomics and bioinformatics. Many new opportunities emerge due to the improvement of personal computers, allowing the implementation of novel strategies of analysis. Findings: We describe a new program, called UVWORD, which determines the number of times that each DNA word present in a sequence (target) is found in a second sequence (source), a procedure that we have called oligonucleotide profiling. On a standard computer, the user may search for words of a size ranging from k = 1 to k = 14 nucleotides. Average counts for groups of contiguous words may also be established. The rate of analysis on standard computers ranges from 3.4 million (k = 14) to 16 million words per second (1 ≤ k ≤ 8). This makes fast screening of even the longest known DNA molecules feasible. Discussion: We show that the combination of the ability to analyze relatively long words, which occur very rarely by chance, and the speed of the program makes it possible to perform novel types of screening, complementary to those provided by standard programs such as BLAST. This method can be used to determine oligonucleotide content, to characterize the distribution of repetitive sequences in chromosomes, to determine the evolutionary conservation of sequences in different species, to establish regions of similar DNA among chromosomes or genomes, etc. PMID:18710530
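
    The profiling procedure described in this abstract can be illustrated with a short sketch: for every word of length k in the target, report how often that word occurs in the source. This is only an illustration of the idea under simple assumptions (plain strings, overlapping words, exact matches); it is not UVWORD's implementation, and the function names are hypothetical.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping word of length k in a sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def oligo_profile(target, source, k):
    """For each k-word along the target, return its number of occurrences in the source.

    The result has one entry per target position, so runs of high counts
    mark target regions whose words are frequent in the source.
    """
    source_counts = kmer_counts(source, k)
    # Counter returns 0 for words that never occur in the source.
    return [source_counts[target[i:i + k]] for i in range(len(target) - k + 1)]


profile = oligo_profile("ACGT", "ACGACG", k=2)
print(profile)  # [2, 2, 0]: AC and CG occur twice in the source, GT never
```

    Building the source table once and then scanning the target keeps the work roughly linear in the combined sequence length, which is the property that makes this kind of screening fast.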

  6. RNA–DNA sequence differences in Saccharomyces cerevisiae

    PubMed Central

    Wang, Isabel X.; Grunseich, Christopher; Chung, Youree G.; Kwak, Hojoong; Ramrattan, Girish; Zhu, Zhengwei; Cheung, Vivian G.

    2016-01-01

    Alterations of RNA sequences and structures, such as those from editing and alternative splicing, result in two or more RNA transcripts from a DNA template. It was thought that in yeast, RNA editing only occurs in tRNAs. Here, we found that Saccharomyces cerevisiae have all 12 types of RNA–DNA sequence differences (RDDs) in the mRNA. We showed these sequence differences are propagated to proteins, as we identified peptides encoded by the RNA sequences in addition to those by the DNA sequences at RDD sites. RDDs are significantly enriched at regions with R-loops. A screen of yeast mutants showed that RDD formation is affected by mutations in genes regulating R-loops. Loss-of-function mutations in ribonuclease H, senataxin, and topoisomerase I that resolve RNA–DNA hybrids lead to increases in RDD frequency. Our results demonstrate that RDD is a conserved process that diversifies transcriptomes and proteomes and provide a mechanistic link between R-loops and RDDs. PMID:27638543

  7. Real-time DNA sequencing from single polymerase molecules.

    PubMed

    Eid, John; Fehr, Adrian; Gray, Jeremy; Luong, Khai; Lyle, John; Otto, Geoff; Peluso, Paul; Rank, David; Baybayan, Primo; Bettman, Brad; Bibillo, Arkadiusz; Bjornson, Keith; Chaudhuri, Bidhan; Christians, Frederick; Cicero, Ronald; Clark, Sonya; Dalal, Ravindra; Dewinter, Alex; Dixon, John; Foquet, Mathieu; Gaertner, Alfred; Hardenbol, Paul; Heiner, Cheryl; Hester, Kevin; Holden, David; Kearns, Gregory; Kong, Xiangxu; Kuse, Ronald; Lacroix, Yves; Lin, Steven; Lundquist, Paul; Ma, Congcong; Marks, Patrick; Maxham, Mark; Murphy, Devon; Park, Insil; Pham, Thang; Phillips, Michael; Roy, Joy; Sebra, Robert; Shen, Gene; Sorenson, Jon; Tomaney, Austin; Travers, Kevin; Trulson, Mark; Vieceli, John; Wegener, Jeffrey; Wu, Dawn; Yang, Alicia; Zaccarin, Denis; Zhao, Peter; Zhong, Frank; Korlach, Jonas; Turner, Stephen

    2009-01-02

    We present single-molecule, real-time sequencing data obtained from a DNA polymerase performing uninterrupted template-directed synthesis using four distinguishable fluorescently labeled deoxyribonucleoside triphosphates (dNTPs). We detected the temporal order of their enzymatic incorporation into a growing DNA strand with zero-mode waveguide nanostructure arrays, which provide optical observation volume confinement and enable parallel, simultaneous detection of thousands of single-molecule sequencing reactions. Conjugation of fluorophores to the terminal phosphate moiety of the dNTPs allows continuous observation of DNA synthesis over thousands of bases without steric hindrance. The data report directly on polymerase dynamics, revealing distinct polymerization states and pause sites corresponding to DNA secondary structure. Sequence data were aligned with the known reference sequence to assay biophysical parameters of polymerization for each template position. Consensus sequences were generated from the single-molecule reads at 15-fold coverage, showing a median accuracy of 99.3%, with no systematic error beyond fluorophore-dependent error rates.
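
    To make the idea of a consensus built from multiple single-molecule reads concrete, the sketch below takes a per-position majority vote over reads that cover the same region. It is a simplified illustration, not the consensus procedure used in the study: it assumes the reads have already been aligned to common coordinates, are of equal length, and use "-" for gaps, and the function name is hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Return the per-column majority base across pre-aligned reads.

    Gap characters ("-") are ignored when voting; a column containing only
    gaps yields "-" in the consensus. Ties go to the base seen first.
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")
    consensus = []
    for column in zip(*aligned_reads):
        bases = [base for base in column if base != "-"]
        if not bases:
            consensus.append("-")
            continue
        consensus.append(Counter(bases).most_common(1)[0][0])
    return "".join(consensus)


reads = ["ACGT", "ACGA", "ACCT"]  # three reads, i.e. 3-fold coverage
print(majority_consensus(reads))  # ACGT
```

    With errors that are independent between reads, adding coverage makes it less likely that the wrong base wins a column, which is why consensus accuracy can exceed single-read accuracy.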

  8. Rapid genotyping of carcinogenic human papillomavirus by loop-mediated isothermal amplification using a new automated DNA test (Clinichip HPV™).

    PubMed

    Satoh, Toyomi; Matsumoto, Koji; Fujii, Takuma; Sato, Osamu; Gemma, Nobuhiro; Onuki, Mamiko; Saito, Hiroshi; Aoki, Daisuke; Hirai, Yasuo; Yoshikawa, Hiroyuki

    2013-03-01

    This study was designed to evaluate the Clinichip HPV test, a new DNA test that detects carcinogenic human papillomavirus (HPV) rapidly by loop-mediated isothermal amplification and performs genotyping of all 13 carcinogenic types using automated DNA chip technology with an assay time of 2.5 h. Using this test, 247 Japanese women (109 with normal cytology, 43 with cervical intraepithelial neoplasia grade 1, 60 with cervical intraepithelial neoplasia grade 2/3 and 35 with invasive cervical cancer) were tested for carcinogenic HPV genotypes. The results were compared to those obtained by polymerase chain reaction-amplified DNA sequencing using 13 type-specific primers. Overall, there was very good agreement for the detection of carcinogenic HPV between the Clinichip test and direct sequencing, with 95.5% total agreement and a kappa value of 0.91. Comparison of the detection of individual HPV types shows that the overall agreement was also high (range: 96.8-100%). In women with cervical intraepithelial neoplasia grade 2 or worse, the detection rate of carcinogenic HPV was 95.7% by both the Clinichip test and the direct-sequencing method, indicating complete agreement between the two methods. In conclusion, it was found that the Clinichip test is a promising new laboratory method for genotyping of carcinogenic HPV.

  9. Reduced-stringency DNA reassociation: sequence specific duplex formation.

    PubMed Central

    Burr, H E; Schimke, R T

    1982-01-01

    Reduced-stringency DNA reassociation conditions allow low stability duplexes to be detected in prokaryotic, plant, fish, avian, mammalian, and primate genomes. Highly diverged families of sequences can be detected in avian, mouse, and human unique sequence DNAs. Such a family has been described among twelve species of birds; based on species specific melting profiles and fractionation of sequences belonging to this family, it was concluded that permissive reassociation conditions did not artifactually produce low stability structures (1). We report S1 nuclease and optical melting experiments, and further fractionation of the diverged family to confirm sequence specific DNA reassociation at 50 degrees in 0.5 M phosphate buffer. PMID:6278429

  10. Optimized DNA extraction and metagenomic sequencing of airborne microbial communities.

    PubMed

    Jiang, Wenjun; Liang, Peng; Wang, Buying; Fang, Jianhuo; Lang, Jidong; Tian, Geng; Jiang, Jingkun; Zhu, Ting F

    2015-05-01

    Metagenomic sequencing has been widely used for the study of microbial communities from various environments such as soil, ocean, sediment and fresh water. Nonetheless, metagenomic sequencing of microbial communities in the air remains technically challenging, partly owing to the limited mass of collectable atmospheric particulate matter and the low biological content it contains. Here we present an optimized protocol for extracting up to tens of nanograms of airborne microbial genomic DNA from collected particulate matter. With an improved sequencing library preparation protocol, this quantity is sufficient for downstream applications, such as metagenomic sequencing for sampling various genes from the airborne microbial community. The described protocol takes ∼12 h of bench time over 2-3 d, and it can be performed with standard molecular biology equipment in the laboratory. A modified version of this protocol may also be used for genomic DNA extraction from other environmental samples of limited mass or low biological content.

  11. Re-identification of DNA through an automated linkage process.

    PubMed Central

    Malin, B.; Sweeney, L.

    2001-01-01

    This work demonstrates how seemingly anonymous DNA database entries can be related to publicly available health information to uniquely and specifically identify the persons who are the subjects of the information even though the DNA information contains no accompanying explicit identifiers such as name, address, or Social Security number and contains no additional fields of personal information. The software program, REID (Re-Identification of DNA), iteratively uncovers unique occurrences in visit-disease patterns across data collections that reveal inferences about the identities of the patients who are the subject of the DNA. Using real-world data, REID established identifiable linkages in 33-100% of the 10,886 cases explicitly surveyed over 8 gene-based diseases. PMID:11825223

  12. Contrasting DNA sequence organisation patterns in sauropsidian genomes.

    PubMed

    Epplen, J T; Diedrich, U; Wagenmann, M; Schmidtke, J; Engel, W

    1979-11-01

    The genomic DNA organisation patterns of four sauropsidian species, namely Python reticularis, Caiman crocodilus, Terrapene carolina triungius and Columba livia domestica were investigated by reassociation of short and long DNA fragments, by hyperchromicity measurements of reannealed fragments and by length estimations of S1-nuclease resistant repetitive duplexes. While the genomic DNA of the three reptilian species shows a short period interspersion pattern, the genome of the avian species is organised in a long period interspersion pattern apparently typical for birds. These findings are discussed in view of the close phylogenetic relationships of birds and reptiles, and also with regard to a possible relationship between the extent of sequence interspersion and genome size.

  13. An optimization approach and its application to compare DNA sequences

    NASA Astrophysics Data System (ADS)

    Liu, Liwei; Li, Chao; Bai, Fenglan; Zhao, Qi; Wang, Ying

    2015-02-01

    Studying the evolutionary relationship between biological sequences by comparing and analyzing gene sequences has become one of the main tasks in bioinformatics research. Many valid methods have been applied to DNA sequence alignment. In this paper, we propose a novel comparison method based on the Lempel-Ziv (LZ) complexity to compare biological sequences. Moreover, we introduce a new distance measure and make use of the corresponding similarity matrix to construct a phylogenic tree without multiple sequence alignment. Further, we construct phylogenic trees for 24 species of Eutherian mammals and 48 countries of Hepatitis E virus (HEV) by an optimization approach. The results indicate that this new method improves the efficiency of sequence comparison and successfully constructs phylogenies.
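
    As a rough illustration of alignment-free comparison by LZ complexity, the sketch below counts the phrases in the Lempel-Ziv (1976) exhaustive parsing of a string and combines the counts into a normalized distance. The abstract does not state the paper's new distance measure, so the formula in `lz_distance` is an assumption (a commonly used normalized form), and both function names are hypothetical.

```python
def lz_complexity(s: str) -> int:
    """Count phrases in the Lempel-Ziv (1976) exhaustive parsing of s."""
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Grow the phrase while it can still be copied from the text that
        # precedes its final character.
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1          # one new phrase (the last one may be incomplete)
        i += length
    return count


def lz_distance(s: str, q: str) -> float:
    """Assumed normalized distance; not the measure defined in the paper.

    Both sequences must be non-empty.
    """
    cs, cq = lz_complexity(s), lz_complexity(q)
    extra_sq = lz_complexity(s + q) - cs   # new phrases q adds after s
    extra_qs = lz_complexity(q + s) - cq   # new phrases s adds after q
    return max(extra_sq, extra_qs) / max(cs, cq)


def distance_matrix(seqs):
    """Pairwise distances, usable as input to a tree-building method."""
    return [[lz_distance(a, b) for b in seqs] for a in seqs]
```

    For example, `lz_complexity("AACGTACCATTG")` parses the string as A, AC, G, T, ACC, AT, TG. A matrix from `distance_matrix` could then be passed to any standard distance-based tree builder; that step is left out here.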

  14. Feature Extraction From DNA Sequences by Multifractal Analysis

    DTIC Science & Technology

    2007-11-02

    genome may lead to an understanding of the genome and to the understanding of life. Recently a draft sequence of the human genome ...which covers 96% of the entire human genome containing base pairs, has been published by the Human Genome Project (HGP) and Celera Genomics. However...time series model based on the global structure of the complete genome, and showed long-range correlations in the bacteria DNA sequences. Although

  15. Fast DNA sequencing by electrical means inches closer

    NASA Astrophysics Data System (ADS)

    Di Ventra, Massimiliano

    2013-08-01

    The sequencing of the human genome offered a glimpse of future medical practices, where information retrieved from the genome could be harnessed to inform treatment decisions. However, making DNA sequencing accessible enough for widespread use poses a number of challenges. This perspective article traces the progress made in the field so far and looks at how close we may be already to real-life applications.

  16. Base-sequence-dependent sliding of proteins on DNA.

    PubMed

    Barbi, M; Place, C; Popkov, V; Salerno, M

    2004-10-01

    The possibility that the sliding motion of proteins on DNA is influenced by the base sequence through a base pair reading interaction is considered. Referring to the case of the T7 RNA-polymerase, we show that the protein should follow a noise-influenced sequence-dependent motion which deviates from the standard random walk usually assumed. The general validity and the implications of the results are discussed.

  17. Rotating rod renewable microcolumns for automated, solid-phase DNA hybridization studies.

    PubMed

    Bruckner-Lea, C J; Stottlemyre, M S; Holman, D A; Grate, J W; Brockman, F J; Chandler, D P

    2000-09-01

    The development of a new temperature-controlled renewable microcolumn flow cell for solid-phase nucleic acid hybridization in an automated sequential injection system is described. The flow cell included a stepper motor-driven rotating rod with the working end cut to a 45-degree angle. In one position, the end of the rod prevented passage of microbeads while allowing fluid flow; rotation of the rod by 180 degrees released the beads. This system was used to rapidly test many hybridization and elution protocols to examine the temperature and solution conditions required for sequence-specific nucleic acid hybridization. Target nucleic acids labeled with a near-infrared fluorescent dye were detected immediately postcolumn during all column perfusion and elution steps using a flow-through fluorescence detector. Temperature control of the column and the presence of Triton X-100 surfactant were critical for specific hybridization. Perfusion of the column with complementary oligonucleotide (200 microL, 10 nM) resulted in hybridization with 8% of the DNA binding sites on the microbeads with a solution residence time of less than 1 s and a total sample perfusion time of 40 s. The use of the renewable column system for detection of an unlabeled PCR product in a sandwich assay was also demonstrated.

  18. Visual automated fluorescence electrophoresis provides simultaneous quality, quantity, and molecular weight spectra for genomic DNA from archived neonatal blood spots.

    PubMed

    Klassen, Tara L; Drabek, Janice; Tomson, Torjbörn; Sveinsson, Olafur; von Döbeln, Ulrika; Noebels, Jeffrey L; Goldman, Alicia M

    2013-05-01

    The Guthrie 903 card archived dried blood spots (DBSs) are a unique but terminal resource amenable for individual and population-wide genomic profiling. The limited amounts of DBS-derived genomic DNA (gDNA) can be whole genome amplified, producing sufficient gDNA for genomic applications, albeit with variable success; optimizing the isolation of high-quality DNA from these finite, low-yield specimens is essential. Agarose gel electrophoresis and spectrophotometry are established postextraction quality control (QC) methods but lack the power to disclose detailed structural, qualitative, or quantitative aspects that underlie gDNA failure in downstream applications. Visual automated fluorescence electrophoresis (VAFE) is a novel QC technology that affords precise quality, quantity, and molecular weight of double-stranded DNA from a single microliter of sample. We extracted DNA from 3-mm DBSs archived in the Swedish Neonatal Repository for >30 years and performed the first quantitative and qualitative analyses of DBS-derived DNA on VAFE, before and after whole genome amplification, in parallel with traditional QC methods. The VAFE QC data were correlated with subsequent sample performance in PCR, sequencing, and high-density comparative genome hybridization array. We observed improved standardization of nucleic acid quantity, quality and integrity, and high performance in the downstream genomic technologies. Addition of VAFE measures in QC increases confidence in the validity of genetic data and allows cost-effective downstream analysis of gDNA for investigational and diagnostic applications. Copyright © 2013 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  19. Alignments of DNA and protein sequences containing frameshift errors.

    PubMed

    Guan, X; Uberbacher, E C

    1996-02-01

    Molecular sequences, like all experimental data, are subject to error. Many current DNA sequencing protocols have very significant error rates and often generate artefactual insertions and deletions of bases (indels) which corrupt the translation of sequences and compromise the detection of protein homologies. The impact of these errors on the utility of molecular sequence data is dependent on the analytic technique used to interpret the data. In the presence of frameshift errors, standard algorithms using six-frame translation can miss important homologies because only subfragments of the correct translation are available in any given frame. We present a new algorithm which can detect and correct frameshift errors in DNA sequences during comparison of translated sequences with protein sequences in the databases. This algorithm can recognize homologous proteins sharing 30% identity even in the presence of a 7% frameshift error rate. Our algorithm uses dynamic programming, producing a guaranteed optimal alignment in the presence of frameshifts, and has a sensitivity equivalent to Smith-Waterman. The computational efficiency of the algorithm is O(nm) where n and m are the sizes of two sequences being compared. The algorithm does not rely on prior knowledge or heuristic rules and performs significantly better than any previously reported method.
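
    The problem this abstract describes, that a single inserted base leaves only fragments of the correct translation in any one reading frame, can be seen in a small sketch. This is not the authors' algorithm; it only translates the three forward frames of a made-up sequence with the standard genetic code. The sequence, the inserted base, and the function names are all hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, frame: int = 0) -> str:
    """Translate one forward reading frame; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


def forward_frames(dna: str) -> list:
    """Translations of the three forward frames (reverse strand omitted)."""
    return [translate(dna, f) for f in range(3)]


# Made-up coding sequence and a hypothetical sequencing error:
# one extra base inserted after position 9.
clean = "ATGGCTGAAACCGGTCATTGGCCA"
noisy = clean[:9] + "T" + clean[9:]

if __name__ == "__main__":
    print(translate(clean))        # full protein from the error-free read
    print(forward_frames(noisy))   # protein split across frames 0 and 1
```

    In the noisy read, the start of the protein survives in frame 0 and the rest in frame 1, so a frame-by-frame database search sees two short fragments instead of one homologous sequence.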

  20. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    PubMed Central

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie; Liévin, Jacques; Körzdörfer, Thomas; Rotaru, Alexandru; Gothelf, Kurt V.; Besenbacher, Flemming; Bald, Ilko

    2014-01-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross sections for electron induced single strand breaks in specific 13-mer oligonucleotides we used atomic force microscopy analysis of DNA origami based DNA nanoarrays. We investigated the DNA sequences 5′-TT(XYX)₃TT with X = A, G, C and Y = T, BrU (5-bromouracil) and found absolute strand break cross sections between 2.66 × 10⁻¹⁴ cm² and 7.06 × 10⁻¹⁴ cm². The highest cross section was found for 5′-TT(ATA)₃TT and 5′-TT(ABrUA)₃TT, respectively. BrU is a radiosensitizer, which was discussed to be used in cancer radiation therapy. The replacement of T by BrU into the investigated DNA sequences leads to a slight increase of the absolute strand break cross sections resulting in sequence-dependent enhancement factors between 1.14 and 1.66. Nevertheless, the variation of strand break cross sections due to the specific nucleotide sequence is considerably higher. Thus, the present results suggest the development of targeted radiosensitizers for cancer radiation therapy. PMID:25487346

  1. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    NASA Astrophysics Data System (ADS)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie; Liévin, Jacques; Körzdörfer, Thomas; Rotaru, Alexandru; Gothelf, Kurt V.; Besenbacher, Flemming; Bald, Ilko

    2014-12-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross sections for electron induced single strand breaks in specific 13-mer oligonucleotides we used atomic force microscopy analysis of DNA origami based DNA nanoarrays. We investigated the DNA sequences 5′-TT(XYX)₃TT with X = A, G, C and Y = T, BrU (5-bromouracil) and found absolute strand break cross sections between 2.66 × 10⁻¹⁴ cm² and 7.06 × 10⁻¹⁴ cm². The highest cross section was found for 5′-TT(ATA)₃TT and 5′-TT(ABrUA)₃TT, respectively. BrU is a radiosensitizer, which was discussed to be used in cancer radiation therapy. The replacement of T by BrU into the investigated DNA sequences leads to a slight increase of the absolute strand break cross sections resulting in sequence-dependent enhancement factors between 1.14 and 1.66. Nevertheless, the variation of strand break cross sections due to the specific nucleotide sequence is considerably higher. Thus, the present results suggest the development of targeted radiosensitizers for cancer radiation therapy.

  2. Evaluation of automated and manual commercial DNA extraction methods for recovery of Brucella DNA from suspensions and spiked swabs.

    PubMed

    Dauphin, Leslie A; Hutchins, Rebecca J; Bost, Liberty A; Bowen, Michael D

    2009-12-01

    This study evaluated automated and manual commercial DNA extraction methods for their ability to recover DNA from Brucella species in phosphate-buffered saline (PBS) suspension and from spiked swab specimens. Six extraction methods, representing several of the methodologies which are commercially available for DNA extraction, as well as representing various throughput capacities, were evaluated: the MagNA Pure Compact and the MagNA Pure LC instruments, the IT 1-2-3 DNA sample purification kit, the MasterPure Complete DNA and RNA purification kit, the QIAamp DNA blood mini kit, and the UltraClean microbial DNA isolation kit. These six extraction methods were performed upon three pathogenic Brucella species: B. abortus, B. melitensis, and B. suis. Viability testing of the DNA extracts indicated that all six extraction methods were efficient at inactivating virulent Brucella spp. Real-time PCR analysis using Brucella genus- and species-specific TaqMan assays revealed that use of the MasterPure kit resulted in superior levels of detection from bacterial suspensions, while the MasterPure kit and MagNA Pure Compact performed equally well for extraction of spiked swab samples. This study demonstrated that DNA extraction methodologies differ in their ability to recover Brucella DNA from PBS bacterial suspensions and from swab specimens and, thus, that the extraction method used for a given type of sample matrix can influence the sensitivity of real-time PCR assays for Brucella.

  3. Investigation of a Sybr-Green-Based Method to Validate DNA Sequences for DNA Computing

    DTIC Science & Technology

    2005-05-01

    stranded DNA. We previously demonstrated that this technique can be exploited to distinguish between stably-hybridized Watson-Crick duplexes and...et al., 2004) we described the difference between the canonical Watson-Crick base pairs of DNA and the usually less stable mismatches that can also...computing, cross-hybridized duplexes represent errors. It is therefore crucial that DNA sequences be designed so that the formation of a Watson-Crick

  4. Mitochondrial DNA sequence evolution in the Arctoidea.

    PubMed Central

    Zhang, Y P; Ryder, O A

    1993-01-01

    Some taxa in the superfamily Arctoidea, such as the giant panda and the lesser panda, have presented puzzles to taxonomists. In the present study, approximately 397 bases of the cytochrome b gene, 364 bases of the 12S rRNA gene, and 74 bases of the tRNA(Thr) and tRNA(Pro) genes from the giant panda, lesser panda, kinkajou, raccoon, coatimundi, and all species of the Ursidae were sequenced. The high transition/transversion ratios in cytochrome b and RNA genes prior to saturation suggest that the presumed transition bias may represent a trend for some mammalian lineages rather than strictly a primate phenomenon. Transversions in the 12S rRNA gene accumulate in arctoids at about half the rate reported for artiodactyls. Different arctoid lineages evolve at different rates: the kinkajou, a procyonid, evolves the fastest, 1.7-1.9 times faster than the slowest lineage that comprises the spectacled and polar bears. Generation-time effect can only partially explain the different rates of nucleotide substitution in arctoids. Our results based on parsimony analysis show that the giant panda is more closely related to bears than to the lesser panda; the lesser panda is neither closely related to bears nor to the New World procyonids. The kinkajou, raccoon, and coatimundi diverged from each other very early, even though they group together. The polar bear is closely related to the spectacled bear, and they began to diverge from a common mitochondrial ancestor approximately 2 million years ago. Relationships of the remaining five bear species are derived. PMID:8415740

  5. Magnetic bead purification of labeled DNA fragments for high-throughput capillary electrophoresis sequencing

    SciTech Connect

    Elkin, Christopher; Kapur, Hitesh; Smith, Troy; Humphries, David; Pollard, Martin; Hammon, Nancy; Hawkins, Trevor

    2001-09-15

    We have developed an automated purification method for terminator sequencing products based on a magnetic bead technology. This 384-well protocol generates labeled DNA fragments that are essentially free of contaminants for less than $0.005 per reaction. In comparison to laborious ethanol precipitation protocols, this method increases the phred20 read length by forty bases with various DNA templates such as PCR fragments, plasmids, cosmids and RCA products. Our method eliminates centrifugation and is compatible with both the MegaBACE 1000 and ABI Prism 3700 capillary instruments. As of September 2001, this method has produced over 1.6 million samples with 93 percent averaging 620 phred20 bases as part of the Joint Genome Institute's production process.

  6. DNA shotgun sequencing analysis of Garcinia mangostana L. variety Mesta.

    PubMed

    Abu Bakar, Syuhaidah; Kumar, Suresh; Loke, Kok-Keong; Goh, Hoe-Han; Mohd Noor, Normah

    2017-06-01

    Mangosteen (Garcinia mangostana Linn.) is an ultra-tropical tree characterized by its unique dark purple fruits with white flesh. The xanthone-rich purple pericarp tissue contains valuable compounds with medicinal properties. Following previously reported genome sequencing of a common variety of mangosteen [1], we performed another whole genome sequencing of a commercially popular variety of this fruit species (var. Mesta) for comparative analysis of its genome composition. Raw reads of the DNA sequencing project were deposited to SRA database with the accession number SRX2709728.

  7. Compilation and analysis of Escherichia coli promoter DNA sequences.

    PubMed Central

    Hawley, D K; McClure, W R

    1983-01-01

    The DNA sequences of 168 promoter regions (-50 to +10) for Escherichia coli RNA polymerase were compiled. The complete listing was divided into two groups depending upon whether or not the promoter had been defined by genetic (promoter mutations) or biochemical (5' end determination) criteria. A consensus promoter sequence based on homologies among 112 well-defined promoters was determined that was in substantial agreement with previous compilations. In addition, we have tabulated 98 promoter mutations. Nearly all of the altered base pairs in the mutants conform to the following general rule: down-mutations decrease homology and up-mutations increase homology to the consensus sequence. PMID:6344016
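
    The rule stated above (down-mutations decrease and up-mutations increase homology to the consensus) can be illustrated with a minimal sketch that takes a per-column majority vote over aligned sequences and counts matching positions. The five hexamers below are made-up toy data, not entries from the compilation, and the function names are hypothetical.

```python
from collections import Counter


def consensus(aligned):
    """Most frequent base in each column of equal-length aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def homology(seq, cons):
    """Number of positions at which seq matches the consensus."""
    return sum(a == b for a, b in zip(seq, cons))


# Toy aligned hexamers (hypothetical, for illustration only).
toy_sites = ["TATAAT", "TATGAT", "TACAAT", "GATAAT", "TATAAA"]
cons = consensus(toy_sites)

promoter = "TATGAT"        # one mismatch to the toy consensus
up_mutant = "TATAAT"       # G -> A restores the consensus base
down_mutant = "TCTGAT"     # A -> C removes another consensus base

if __name__ == "__main__":
    for name, seq in [("promoter", promoter),
                      ("up-mutant", up_mutant),
                      ("down-mutant", down_mutant)]:
        print(name, seq, homology(seq, cons))
```

    A majority vote ignores how strongly each position is conserved; a position weight matrix would be the natural refinement, but the abstract gives no detail on the scoring actually used.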

  8. Effect of Noise on DNA Sequencing via Transverse Electronic Transport

    PubMed Central

    Krems, Matt; Zwolak, Michael; Pershin, Yuriy V.; Di Ventra, Massimiliano

    2009-01-01

    Previous theoretical studies have shown that measuring the transverse current across DNA strands while they translocate through a nanopore or channel may provide a statistically distinguishable signature of the DNA bases, and may thus allow for rapid DNA sequencing. However, fluctuations of the environment, such as ionic and DNA motion, introduce important scattering processes that may affect the viability of this approach to sequencing. To understand this issue, we have analyzed a simple model that captures the role of this complex environment in electronic dephasing and its ability to remove charge carriers from current-carrying states. We find that these effects do not strongly influence the current distributions due to the off-resonant nature of tunneling through the nucleotides—a result we expect to be a common feature of transport in molecular junctions. In particular, only large scattering strengths, as compared to the energetic gap between the molecular states and the Fermi level, significantly alter the form of the current distributions. Since this gap itself is quite large, the current distributions remain protected from this type of noise, further supporting the possibility of using transverse electronic transport measurements for DNA sequencing. PMID:19804730

  9. Generalized Levy-walk model for DNA nucleotide sequences

    NASA Technical Reports Server (NTRS)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Simons, M.; Stanley, H. E.

    1993-01-01

    We propose a generalized Levy walk to model fractal landscapes observed in noncoding DNA sequences. We find that this model provides a very close approximation to the empirical data and explains a number of statistical properties of genomic DNA sequences such as the distribution of strand-biased regions (those with an excess of one type of nucleotide) as well as local changes in the slope of the correlation exponent $\alpha$. The generalized Levy-walk model simultaneously accounts for the long-range correlations in noncoding DNA sequences and for the apparently paradoxical finding of long subregions of biased random walks (length $l_j$) within these correlated sequences. In the generalized Levy-walk model, the $l_j$ are chosen from a power-law distribution $P(l_j) \propto l_j^{-\mu}$. The correlation exponent $\alpha$ is related to $\mu$ through $\alpha = 2 - \mu/2$ if $2 < \mu < 3$. The model is consistent with the finding of "repetitive elements" of variable length interspersed within noncoding DNA.
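
    To make the relation between the segment-length exponent and the correlation exponent concrete, here is a simplified sketch. It draws segment lengths from a power law by inverse-transform sampling and moves in one randomly chosen direction for the whole segment; the generalized model in the abstract uses biased random walks inside each subregion, so this fully biased version is a stated simplification. The minimum length `l_min`, the seed, and all function names are assumptions.

```python
import random


def correlation_exponent(mu: float) -> float:
    """alpha = 2 - mu/2, valid only for 2 < mu < 3 as stated in the abstract."""
    if not 2 < mu < 3:
        raise ValueError("relation is stated only for 2 < mu < 3")
    return 2 - mu / 2


def sample_length(mu: float, rng: random.Random, l_min: float = 1.0) -> float:
    """Draw l >= l_min with density proportional to l**(-mu)."""
    u = 1.0 - rng.random()                   # uniform on (0, 1]
    return l_min * u ** (-1.0 / (mu - 1.0))  # inverse of the survival function


def levy_walk(n_steps: int, mu: float, seed: int = 0) -> list:
    """Walk of n_steps unit steps built from power-law distributed segments."""
    rng = random.Random(seed)
    walk, pos = [], 0
    while len(walk) < n_steps:
        length = max(1, int(sample_length(mu, rng)))
        step = rng.choice((-1, 1))           # direction held for the segment
        for _ in range(length):
            pos += step
            walk.append(pos)
            if len(walk) == n_steps:
                break
    return walk
```

    For instance, `correlation_exponent(2.5)` gives 0.75, between the uncorrelated value 0.5 (approached as the exponent nears 3) and the fully persistent value 1 (approached as it nears 2).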

  10. Educational Software for the Analysis of DNA and Protein Sequences.

    ERIC Educational Resources Information Center

    Maloy, Stanley; Olson, Sue

    1989-01-01

    Describes the development of the microcomputer-based educational software, DNAzoom, which was designed to introduce undergraduates in molecular biology to computer analysis of DNA and protein sequences. Highlights include graphical presentation of data, the functional use of color, a menu-oriented interface, and students' evaluations of the software.…

  11. Decoding long nanopore sequencing reads of natural DNA.

    PubMed

    Laszlo, Andrew H; Derrington, Ian M; Ross, Brian C; Brinkerhoff, Henry; Adey, Andrew; Nova, Ian C; Craig, Jonathan M; Langford, Kyle W; Samson, Jenny Mae; Daza, Riza; Doering, Kenji; Shendure, Jay; Gundlach, Jens H

    2014-08-01

    Nanopore sequencing of DNA is a single-molecule technique that may achieve long reads, low cost and high speed with minimal sample preparation and instrumentation. Here, we build on recent progress with respect to nanopore resolution and DNA control to interpret the procession of ion current levels observed during the translocation of DNA through the pore MspA. As approximately four nucleotides affect the ion current of each level, we measured the ion current corresponding to all 256 four-nucleotide combinations (quadromers). This quadromer map is highly predictive of ion current levels of previously unmeasured sequences derived from the bacteriophage phi X 174 genome. Furthermore, we show nanopore sequencing reads of phi X 174 up to 4,500 bases in length, which can be unambiguously aligned to the phi X 174 reference genome, and demonstrate proof-of-concept utility with respect to hybrid genome assembly and polymorphism detection. This work provides a foundation for nanopore sequencing of long, natural DNA strands.
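
    The idea of a quadromer map can be sketched as a lookup table over all 4^4 = 256 four-nucleotide words, applied with a sliding window to predict the sequence of current levels for a known DNA sequence. The numeric values below are placeholders derived from list order, not measured currents, and the names `QUADROMERS`, `PLACEHOLDER_MAP`, and `predict_levels` are hypothetical.

```python
from itertools import product

# All 256 four-nucleotide combinations, in lexicographic ACGT order.
QUADROMERS = ["".join(p) for p in product("ACGT", repeat=4)]

# Placeholder values only: a real map would hold one measured ion current
# per quadromer.
PLACEHOLDER_MAP = {q: float(i) for i, q in enumerate(QUADROMERS)}


def predict_levels(sequence: str, quadromer_map: dict) -> list:
    """One predicted level per four-base window, stepping one base at a time."""
    return [quadromer_map[sequence[i:i + 4]]
            for i in range(len(sequence) - 3)]


if __name__ == "__main__":
    print(len(QUADROMERS))
    print(predict_levels("ACGTA", PLACEHOLDER_MAP))
```

    A sequence of n bases yields n - 3 predicted levels. Aligning a measured level trace to such predictions is a separate step that this sketch does not attempt.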

  12. Derivatized versions of ligase enzymes for constructing DNA sequences

    DOEpatents

    Mariella, Jr., Raymond P.; Christian, Allen T.; Tucker, James D.; Dzenitis, John M.; Papavasiliou, Alexandros P.

    2006-08-15

    A method of making very long, double-stranded synthetic poly-nucleotides. A multiplicity of short oligonucleotides is provided. The short oligonucleotides are sequentially hybridized to each other. Enzymatic ligation of the oligonucleotides provides a contiguous piece of PCR-ready DNA of predetermined sequence.

  13. DNA methylation mapping by tag-modified bisulfite genomic sequencing.

    PubMed

    Han, Weiguo; Cauchi, Stephane; Herman, James G; Spivack, Simon D

    2006-08-01

    A tag-modified bisulfite genomic sequencing (tBGS) method employing direct cycle sequencing of polymerase chain reaction (PCR) products at kilobase scale, without conventional DNA fragment cloning, was developed for simplified evaluation of DNA methylation sites. The method entails subjecting bisulfite-modified genomic DNA to a second-round PCR amplification employing GC-tagged primers. Qualitative results from tBGS closely correlated with those from conventional BGS (R=0.935, p=0.002). In application, the intertissue and interindividual CpG methylation differences in promoter sequence for two genes, CYP1B1 and GSTP1, were then explored across four human tissue types (peripheral blood cells, exfoliated buccal cells, paired nontumor-tumor lung tissues), and two lung cell types in culture (normal NHBE and malignant A549). Predominantly conserved methylation maps for the two gene promoters were apparent across donors and tissues. At any given CpG site, variation in the degree of methylation could be determined by the relative height of C and T peaks in the sequencing trace. Methylation maps for the GSTP1 promoter diverged between NHBE (unmethylated) and A549 (completely methylated) cells in a previously unexplored upstream region, correlating with a 2.7-fold difference in GSTP1 mRNA expression (p<0.01). The tBGS method simplifies detailed methylation scanning of kilobase-scale genomic DNA, facilitating more ambitious genomic methylation mapping studies.
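
    The abstract says the degree of methylation at a CpG site was read from the relative height of the C and T peaks in the sequencing trace. A minimal sketch of one common way to turn that into a number is the ratio C / (C + T), since bisulfite converts unmethylated cytosines so that they read as T while methylated ones still read as C. Whether the authors used exactly this ratio is not stated, so the formula, the function names, and the peak heights in the usage line are assumptions.

```python
def methylation_fraction(c_peak: float, t_peak: float) -> float:
    """Estimated methylated fraction at one CpG site: C / (C + T)."""
    total = c_peak + t_peak
    if total <= 0:
        raise ValueError("no C or T signal at this position")
    return c_peak / total


def methylation_map(peaks: dict) -> dict:
    """Map each CpG position to its estimated fraction.

    `peaks` maps a position label to a (C height, T height) pair.
    """
    return {site: methylation_fraction(c, t) for site, (c, t) in peaks.items()}
```

    With made-up peak heights, `methylation_map({"CpG1": (30, 10), "CpG2": (0, 50)})` would report site CpG1 as three-quarters methylated and CpG2 as unmethylated.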

  14. Light-generated oligonucleotide arrays for rapid DNA sequence analysis.

    PubMed Central

    Pease, A C; Solas, D; Sullivan, E J; Cronin, M T; Holmes, C P; Fodor, S P

    1994-01-01

    In many areas of molecular biology there is a need to rapidly extract and analyze genetic information; however, current technologies for DNA sequence analysis are slow and labor intensive. We report here how modern photolithographic techniques can be used to facilitate sequence analysis by generating miniaturized arrays of densely packed oligonucleotide probes. These probe arrays, or DNA chips, can then be applied to parallel DNA hybridization analysis, directly yielding sequence information. In a preliminary experiment, a 1.28 x 1.28 cm array of 256 different octanucleotides was produced in 16 chemical reaction cycles, requiring 4 hr to complete. The hybridization pattern of fluorescently labeled oligonucleotide targets was then detected by epifluorescence microscopy. The fluorescence signals from complementary probes were 5-35 times stronger than those with single or double base-pair hybridization mismatches, demonstrating specificity in the identification of complementary sequences. This method should prove to be a powerful tool for rapid investigations in human genetics and diagnostics, pathogen detection, and DNA molecular recognition. PMID:8197176

  15. RNA-DNA sequence differences spell genetic code ambiguities

    PubMed Central

    Nielsen, Michael L.

    2011-01-01

    A recent paper in Science by Li et al. (2011) reports widespread sequence differences in the human transcriptome between RNAs and their encoding genes, termed RNA-DNA differences (RDDs). The findings could add a new layer of complexity to gene expression, but the study has been criticized. PMID:22567189

  16. Wide-field imaging design for a multiple-capillary DNA-sequencing system

    NASA Astrophysics Data System (ADS)

    Nay, Lyle M.; Sinclair, Robert; Swerdlow, Harold

    1997-05-01

    A laser-induced fluorescence detection system compatible with a capillary electrophoresis array was developed. The design incorporates fiber-optic excitation and a detection system including a diffraction grating and a CCD camera. The system employs no moving parts and is capable of producing data comparable to commercially available systems. It is based on a spectrally-resolved four-dye sequencing scheme. The conceptual design was proven; however, refinements must be made to optimize performance for high-throughput capillary-array DNA sequencing. Automated sample preparation and loading in combination with a refillable separation-matrix capillary-array system could prove to be an invaluable tool for completion of the Human Genome Project.

  17. Sequence heterogeneity accelerates protein search for targets on DNA

    SciTech Connect

    Shvets, Alexey A.; Kolomeisky, Anatoly B.

    2015-12-28

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  18. Sequence heterogeneity accelerates protein search for targets on DNA

    NASA Astrophysics Data System (ADS)

    Shvets, Alexey A.; Kolomeisky, Anatoly B.

    2015-12-01

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  19. Probing the linearity and nonlinearity in DNA sequences

    NASA Astrophysics Data System (ADS)

    Tsonis, Anastasios A.; Heller, Fred L.; Tsonis, Panagiotis A.

    2002-09-01

    In this paper, we apply the principles of information theory that relate to the definition of nonlinear predictability, which is a measure that describes both the linear and nonlinear components of a system. By comparing this measure to a measure of linear predictability, one can assess whether a given system has a strong linear or a strong nonlinear component. This provides insights as to whether the system should be modeled by a nonlinear or a linear model. We apply these ideas to DNA sequences. Our results, which extend previous results on this issue, indicate that all DNA sequences (coding and noncoding) exhibit strong nonlinear structure. At the same time, the results provide insights into DNA structure and possible clues about evolutionary mechanisms.

  20. Electronic density of states in sequence dependent DNA molecules

    NASA Astrophysics Data System (ADS)

    de Oliveira, B. P. W.; Albuquerque, E. L.; Vasconcelos, M. S.

    2006-09-01

    We report in this work a numerical study of the electronic density of states (DOS) in π-stacked arrays of DNA single-strand segments made up from the nucleotides guanine G, adenine A, cytosine C and thymine T, forming Rudin-Shapiro (RS) as well as Fibonacci (FB) polyGC quasiperiodic sequences. Both structures are constructed starting from a G nucleotide as seed and following their respective inflation rules. Our theoretical method uses Dyson's equation together with a transfer-matrix treatment, within an electronic tight-binding Hamiltonian model, suitable to describe the DNA segments modelled by the quasiperiodic chains. We compared the DOS spectra found for the quasiperiodic structure to those using a sequence of natural DNA, as part of the human chromosome Ch22, with a remarkable concordance, as far as the RS structure is concerned. The electronic spectrum shows several peaks, corresponding to localized states, as well as a striking self-similar aspect.
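
    The seed-and-inflation construction described above can be sketched in a few lines. The abstract does not state the substitution rules; the ones used here (Fibonacci: G → GC, C → G; Rudin-Shapiro: G → GC, C → GA, A → TC, T → TA) are assumptions based on how these chains are commonly defined for DNA models, and the names `inflate`, `FIBONACCI_RULES`, and `RUDIN_SHAPIRO_RULES` are hypothetical.

```python
# Assumed inflation (substitution) rules; not given in the abstract itself.
FIBONACCI_RULES = {"G": "GC", "C": "G"}
RUDIN_SHAPIRO_RULES = {"G": "GC", "C": "GA", "A": "TC", "T": "TA"}


def inflate(seed, rules, generations):
    """Apply the substitution rules to every symbol, `generations` times."""
    sequence = seed
    for _ in range(generations):
        # Each nucleotide is replaced by its image under the rule set.
        sequence = "".join(rules[base] for base in sequence)
    return sequence


if __name__ == "__main__":
    # Both chains start from a single G nucleotide as seed.
    print(inflate("G", FIBONACCI_RULES, 4))      # GCGGCGCG
    print(inflate("G", RUDIN_SHAPIRO_RULES, 3))  # GCGAGCTC
```

    Under these assumed rules the Fibonacci chain uses only G and C and grows with Fibonacci-number lengths, while the Rudin-Shapiro chain uses all four nucleotides and doubles in length at each generation.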

  1. Automated sequencing of insoluble peptides using detergent. Bacteriophage f1 coat protein.

    PubMed

    Bailey, G S; Gillett, D; Hill, D F; Petersen, G B

    1977-04-10

    Peptides which are highly nonpolar and insoluble under moderate conditions of pH and ionic strength cannot be subjected to automated sequence analysis. We report a method for solubilization of one such peptide, bacteriophage f1 coat protein, by chemical modification in the presence of sodium dodecyl sulfate. Following this treatment the 50-residue peptide was degraded stepwise in an automated sequenator using a single cleavage Quadrol program with high repetitive yield through residue 47. We also report a modified program using detergent incorporated into dimethylallylamine buffer which permitted sequencing with high repetitive yields for at least the first 18 residues of the unmodified and otherwise highly insoluble coat protein. The presence of detergent caused no observable difficulties in detection of residues by gas chromatography, thin layer chromatography, or amino acid analysis.

  2. Automated carboxy-terminal sequence analysis of peptides and proteins using diphenyl phosphoroisothiocyanatidate.

    PubMed Central

    Bailey, J. M.; Nikfarjam, F.; Shenoy, N. R.; Shively, J. E.

    1992-01-01

    Proteins and peptides can be sequenced from the carboxy-terminus with isothiocyanate reagents to produce amino acid thiohydantoin derivatives. Previous studies in our laboratory have focused on the automation of the thiocyanate chemistry using acetic anhydride and trimethylsilylisothiocyanate (TMS-ITC) to derivatize the C-terminal amino acid to a thiohydantoin and sodium trimethylsilanolate for specific hydrolysis of the derivatized C-terminal amino acid (Bailey, J.M., Shenoy, N.R., Ronk, M., & Shively, J.E., 1992, Protein Sci. 1, 68-80). A major limitation of this approach was the need to activate the C-terminus with acetic anhydride. We now describe the use of a new reagent, diphenyl phosphoroisothiocyanatidate (DPP-ITC) and pyridine, which combines the activation and derivatization steps to produce peptidylthiohydantoins. Previous work by Kenner et al. (Kenner, G.W., Khorana, H.G., & Stedman, R.J., 1953, Chem. Soc. J., 673-678) with this reagent demonstrated slow kinetics. Several days were required for complete reaction. We show here that the inclusion of pyridine was found to promote the formation of C-terminal thiohydantoins by DPP-ITC resulting in complete conversion of the C-terminal amino acid to a thiohydantoin in less than 1 h. Reagents such as imidazole, triazine, and tetrazole were also found to promote the reaction with DPP-ITC as effectively as pyridine. General base catalysts, such as triethylamine, do not promote the reaction, but are required to convert the C-terminal carboxylic acid to a salt prior to the reaction with DPP-ITC and pyridine. By introducing the DPP-ITC reagent and pyridine in separate steps in an automated sequencer, we observed improved sequencing yields for amino acids normally found difficult to derivatize with acetic anhydride/TMS-ITC. This was particularly true for aspartic acid, which now can be sequenced in yields comparable to most of the other amino acids. 
Automated programs are described for the C-terminal sequencing of

  3. Sequence- and structure-dependent DNA base dynamics: Synthesis, structure, and dynamics of site and sequence specifically spin-labeled DNA

    SciTech Connect

    Spaltenstein, A.; Robinson, B.H.; Hopkins, P.B.

    1989-11-28

    A nitroxide spin-labeled analogue of thymidine (1a), in which the methyl group is replaced by an acetylene-tethered nitroxide, was evaluated as a probe for structural and dynamics studies of sequence specifically spin-labeled DNA. Residue 1a was incorporated into synthetic deoxyoligonucleotides by using automated phosphite triester methods. ¹H NMR, CD, and thermal denaturation studies indicate that 1a (T) does not significantly alter the structure of 5′-d(CGCGAATT*CGCG) from that of the native dodecamer. EPR studies on monomer, single-stranded, and duplexed DNA show that 1a readily distinguishes environments of different rigidity. Comparison of the general line-shape features of the observed EPR spectra of several small duplexes (12-mer, 24-mer) with simulated EPR spectra assuming isotropic motion suggests that probe 1a monitors global tumbling of small duplexes. Increasing the length of the DNA oligomers results in significant deviation from isotropic motion, with line-shape features similar to those of calculated spectra of objects with isotropic rotational correlation times of 20-100 ns. EPR spectra of a spin-labeled GT mismatch and a T bulge in long DNAs are distinct from those of spin-labeled Watson-Crick paired DNAs, further demonstrating the value of EPR as a tool in the evaluation of local dynamic and structural features in macromolecules.

  4. Sequence-dependent folding of DNA three-way junctions

    PubMed Central

    Assenberg, René; Weston, Anthony; Cardy, Don L. N.; Fox, Keith R.

    2002-01-01

    Three-way DNA junctions can adopt several different conformers, which differ in the coaxial stacking of the arms. These structural variants are often dominated by one conformer, which is determined by the DNA sequence. In this study we have compared several three-way DNA junctions in order to assess how the arrangement of bases around the branch point affects the conformer distribution. The results show that rearranging the different arms, while retaining their base sequences, can affect the conformer distribution. In some instances this generates a structure that appears to contain parallel coaxially stacked helices rather than the usual anti-parallel arrangement. Although the conformer equilibrium can be affected by the order of purines and pyrimidines around the branch point, this is not sufficient to predict the conformer distribution. We find that the folding of three-way junctions can be separated into two groups of dinucleotide steps. These two groups show distinctive stacking properties in B-DNA, suggesting there is a correlation between B-DNA stacking and coaxial stacking in DNA junctions. PMID:12466538

  5. Reiterated DNA sequences in Rhizobium and Agrobacterium spp.

    PubMed Central

    Flores, M; González, V; Brom, S; Martínez, E; Piñero, D; Romero, D; Dávila, G; Palacios, R

    1987-01-01

    Repeated DNA sequences are a general characteristic of eucaryotic genomes. Although several examples of DNA reiteration have been found in procaryotic organisms, only in the case of the archaebacteria Halobacterium halobium and Halobacterium volcanii [C. Sapienza and W. F. Doolittle, Nature (London) 295:384-389, 1982], has DNA reiteration been reported as a common genomic feature. The genomes of two Rhizobium phaseoli strains, one Rhizobium meliloti strain, and one Agrobacterium tumefaciens strain were analyzed for the presence of repetitive DNA. Rhizobium and Agrobacterium spp. are closely related soil bacteria that interact with plants and that belong to the taxonomical family Rhizobiaceae. Rhizobium species establish a nitrogen-fixing symbiosis in the roots of legumes, whereas Agrobacterium species is a pathogen in different plants. The four strains revealed a large number of repeated DNA sequences. The family size was usually small, from 2 to 5 elements, but some presented more than 10 elements. Rhizobium and Agrobacterium spp. contain large plasmids in addition to the chromosomes. Analysis of the two Rhizobium strains indicated that DNA reiteration is not confined to the chromosome or to some plasmids but is a property of the whole genome. PMID:3450286

  6. A blind testing design for authenticating ancient DNA sequences.

    PubMed

    Yang, H; Golenberg, E M; Shoshani, J

    1997-04-01

    Reproducibility is a serious concern among researchers of ancient DNA. We designed a blind testing procedure to evaluate laboratory accuracy and authenticity of ancient DNA obtained from closely related extant and extinct species. Soft tissue and bones of fossil and contemporary museum proboscideans were collected and identified based on morphology by one researcher, and other researchers carried out DNA testing on the samples, which were assigned anonymous numbers. DNA extracted using three principal isolation methods served as template in PCR amplifications of a segment of the cytochrome b gene (mitochondrial genome), and the PCR product was directly sequenced and analyzed. The results show that such a blind testing design performed in one laboratory, when coupled with phylogenetic analysis, can nonarbitrarily test the consistency and reliability of ancient DNA results. Such reproducible results obtained from the blind testing can increase confidence in the authenticity of ancient sequences obtained from postmortem specimens and avoid bias in phylogenetic analysis. A blind testing design may be applicable as an alternative to confirm ancient DNA results in one laboratory when independent testing by two laboratories is not available.

  7. Applying machine learning techniques to DNA sequence analysis

    SciTech Connect

    Shavlik, J.W.

    1992-01-01

    We are developing a machine learning system that modifies existing knowledge about specific types of biological sequences. It does this by considering sample members and nonmembers of the sequence motif being learned. Using this information (which we call a "domain theory"), our learning algorithm produces a more accurate representation of the knowledge needed to categorize future sequences. Specifically, the KBANN algorithm maps inference rules, such as consensus sequences, into a neural (connectionist) network. Neural network training techniques then use the training examples to refine these inference rules. We have been applying this approach to several problems in DNA sequence analysis and have also been extending the capabilities of our learning system along several dimensions.

  8. Applying machine learning techniques to DNA sequence analysis

    SciTech Connect

    Shavlik, J.W. (Dept. of Computer Sciences); Noordewier, M.O. (Dept. of Computer Science)

    1992-01-01

    We are primarily developing a machine learning (ML) system that modifies existing knowledge about specific types of biological sequences. It does this by considering sample members and nonmembers of the sequence motif being learned. Using this information, our learning algorithm produces a more accurate representation of the knowledge needed to categorize future sequences. Specifically, our KBANN algorithm maps inference rules about a given recognition task into a neural network. Neural network training techniques then use the training examples to refine these inference rules. We call these rules a domain theory, following the convention in the machine learning community. We have been applying this approach to several problems in DNA sequence analysis. In addition, we have been extending the capabilities of our learning system along several dimensions. We have also been investigating parallel algorithms that perform sequence alignments in the presence of frameshift errors.
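
    The step of mapping an inference rule into a network unit can be illustrated with a small sketch. It follows one common description of the KBANN translation for conjunctive rules: each required antecedent gets weight ω, each negated antecedent gets weight −ω, and the bias is set so the unit is active only when every antecedent is satisfied. The value ω = 4, the feature names such as `T@-36`, and the helpers `rule_to_unit` and `activation` are assumptions for illustration and do not come from the abstract. The subsequent refinement by network training is not shown.

```python
import math

OMEGA = 4.0  # assumed link weight for rule antecedents


def rule_to_unit(positive, negated=()):
    """Translate a conjunctive rule into (weights, bias) for one sigmoid unit."""
    weights = {name: OMEGA for name in positive}
    weights.update({name: -OMEGA for name in negated})
    # Bias places the threshold between "all positives true" and "one missing".
    bias = -(len(positive) - 0.5) * OMEGA
    return weights, bias


def activation(weights, bias, true_features):
    """Sigmoid output of the unit given the set of features that are true."""
    net = bias + sum(w for name, w in weights.items() if name in true_features)
    return 1.0 / (1.0 + math.exp(-net))


if __name__ == "__main__":
    # Hypothetical consensus-sequence rule: motif :- T@-36, T@-35, G@-34.
    weights, bias = rule_to_unit(["T@-36", "T@-35", "G@-34"])
    print(activation(weights, bias, {"T@-36", "T@-35", "G@-34"}))  # above 0.5
    print(activation(weights, bias, {"T@-36", "T@-35"}))           # below 0.5
```

    Because the rule is encoded as ordinary weights and a bias, gradient-based training can later adjust them, which is the sense in which training examples refine the initial rules.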

  9. Improving the PCR protocol to amplify a repetitive DNA sequence.

    PubMed

    Riet, J; Ramos, L R V; Lewis, R V; Marins, L F

    2017-09-21

    Although PCR-based techniques have become an essential tool in the field of molecular and genetic research, the amplification of repetitive DNA sequences is limited. This is due to the truncated nature of the amplified sequences, which are also prone to errors during DNA polymerase-based amplification. The complex structure of repetitive DNA can form hairpin loops, which promote dissociation of the polymerase from the template, impairing complete amplification, and leading to the formation of incomplete fragments that serve as megaprimers. These megaprimers anneal with other sequences, generating unexpected fragments in each PCR cycle. Our gene model, MaSp1, is 1037-bp long, with 68% GC content, and its amino acid sequence is characterized by poly-alanine-glycine motifs, which represent the repetitive codon consensus. We describe the amplification of the MaSp1 gene through minor changes in the PCR program. The results show that a denaturation temperature of 98°C is the key determinant in the amplification of the MaSp1 partial gene sequence.

  10. CGGBP1 mitigates cytosine methylation at repetitive DNA sequences.

    PubMed

    Agarwal, Prasoon; Collier, Paul; Fritz, Markus Hsi-Yang; Benes, Vladimir; Wiklund, Helena Jernberg; Westermark, Bengt; Singh, Umashankar

    2015-05-16

    CGGBP1 is a repetitive DNA-binding transcription regulator with target sites at CpG-rich sequences such as CGG repeats and Alu-SINEs and L1-LINEs. The role of CGGBP1 as a possible mediator of CpG methylation however remains unknown. At CpG-rich sequences cytosine methylation is a major mechanism of transcriptional repression. Concordantly, gene-rich regions typically carry lower levels of CpG methylation than the repetitive elements. It is well known that at interspersed repeats Alu-SINEs and L1-LINEs high levels of CpG methylation constitute a transcriptional silencing and retrotransposon inactivating mechanism. Here, we have studied genome-wide CpG methylation with or without CGGBP1-depletion. By high throughput sequencing of bisulfite-treated genomic DNA we have identified CGGBP1 to be a negative regulator of CpG methylation at repetitive DNA sequences. In addition, we have studied CpG methylation alterations on Alu and L1 retrotransposons in CGGBP1-depleted cells using a novel bisulfite-treatment and high throughput sequencing approach. The results clearly show that CGGBP1 is a possible bidirectional regulator of CpG methylation at Alus, and acts as a repressor of methylation at L1 retrotransposons.

  11. Cloning, sequencing and analysis of dnaK -dnaJ gene cluster of Bacillus megaterium.

    PubMed

    Bao, Fangming; Gong, Lei; Shao, Weilan

    2008-12-01

    The DNA fragment of heat shock genes (hrcA-grpE-dnaK-dnaJ) containing complete hrcA-grpE-dnaK operon and the transcription unit of dnaJ was cloned, sequenced and analyzed from Bacillus megaterium RF5. The sequences of hrcA, grpE and dnaJ are reported for the first time, and their coding products exhibit 60%, 63% and 81% identity to the homologs of B. subtilis. A sigmaA-type promoter of Gram-positive bacteria (PA1) and a terminator were located upstream of the hrcA and downstream of dnaK, and a Controlling inverted repeat of chaperone expression element (CIRCE) was identified between PA1 and hrcA. Another sigmaA-type promoter (PA2) and a terminator were found upstream and downstream of dnaJ, indicating B. megaterium has a transcription unit containing a single gene dnaJ. The structure of dnaJ transcription unit is more similar to that of Listeria monocytogenes than other species of Bacillus. A partial protein-based phylogenetic tree, derived from Gram-positive bacteria using HrcA sequence, indicated a closer phylogenetic relationship between B. megaterium and Geobacillus species than other two Bacillus species.

  12. Environmental DNA sequencing primers for eutardigrades and bdelloid rotifers

    PubMed Central

    2009-01-01

    Background: The time it takes to isolate individuals from environmental samples and then extract DNA from each individual is one of the problems with generating molecular data from meiofauna such as eutardigrades and bdelloid rotifers. The lack of consistent morphological information and the extreme abundance of these classes makes morphological identification of rare, or even common cryptic taxa a large and unwieldy task. This limits the ability to perform large-scale surveys of the diversity of these organisms. Here we demonstrate a culture-independent molecular survey approach that enables the generation of large amounts of eutardigrade and bdelloid rotifer sequence data directly from soil. Our PCR primers, specific to the 18s small-subunit rRNA gene, were developed for both eutardigrades and bdelloid rotifers. Results: The developed primers successfully amplified DNA of their target organism from various soil DNA extracts. This was confirmed by both the BLAST similarity searches and phylogenetic analyses. Tardigrades showed much better phylogenetic resolution than bdelloids. Both groups of organisms exhibited varying levels of endemism. Conclusion: The development of clade-specific primers for characterizing eutardigrades and bdelloid rotifers from environmental samples should greatly increase our ability to characterize the composition of these taxa in environmental samples. Environmental sequencing as shown here differs from other molecular survey methods in that there is no need to pre-isolate the organisms of interest from soil in order to amplify their DNA. The DNA sequences obtained from methods that do not require culturing can be identified post-hoc and placed phylogenetically as additional closely related sequences are obtained from morphologically identified conspecifics. 
Our non-cultured environmental sequence based approach will be able to provide a rapid and large-scale screening of the presence, absence and diversity of Bdelloidea and Eutardigrada in

  13. Environmental DNA sequencing primers for eutardigrades and bdelloid rotifers.

    PubMed

    Robeson, Michael S; Costello, Elizabeth K; Freeman, Kristen R; Whiting, Jeremy; Adams, Byron; Martin, Andrew P; Schmidt, Steve K

    2009-12-11

    The time it takes to isolate individuals from environmental samples and then extract DNA from each individual is one of the problems with generating molecular data from meiofauna such as eutardigrades and bdelloid rotifers. The lack of consistent morphological information and the extreme abundance of these classes makes morphological identification of rare, or even common cryptic taxa a large and unwieldy task. This limits the ability to perform large-scale surveys of the diversity of these organisms. Here we demonstrate a culture-independent molecular survey approach that enables the generation of large amounts of eutardigrade and bdelloid rotifer sequence data directly from soil. Our PCR primers, specific to the 18s small-subunit rRNA gene, were developed for both eutardigrades and bdelloid rotifers. The developed primers successfully amplified DNA of their target organism from various soil DNA extracts. This was confirmed by both the BLAST similarity searches and phylogenetic analyses. Tardigrades showed much better phylogenetic resolution than bdelloids. Both groups of organisms exhibited varying levels of endemism. The development of clade-specific primers for characterizing eutardigrades and bdelloid rotifers from environmental samples should greatly increase our ability to characterize the composition of these taxa in environmental samples. Environmental sequencing as shown here differs from other molecular survey methods in that there is no need to pre-isolate the organisms of interest from soil in order to amplify their DNA. The DNA sequences obtained from methods that do not require culturing can be identified post-hoc and placed phylogenetically as additional closely related sequences are obtained from morphologically identified conspecifics. Our non-cultured environmental sequence based approach will be able to provide a rapid and large-scale screening of the presence, absence and diversity of Bdelloidea and Eutardigrada in a variety of soils.

  14. Spectral sum rules and search for periodicities in DNA sequences

    NASA Astrophysics Data System (ADS)

    Chechetkin, V. R.

    2011-04-01

    Periodic patterns play the important regulatory and structural roles in genomic DNA sequences. Commonly, the underlying periodicities should be understood in a broad statistical sense, since the corresponding periodic patterns have been strongly distorted by the random point mutations and insertions/deletions during molecular evolution. The latent periodicities in DNA sequences can be efficiently displayed by Fourier transform. The criteria of significance for observed periodicities are obtained via the comparison versus the counterpart characteristics of the reference random sequences. We show that the restrictions imposed on the significance criteria by the rigorous spectral sum rules can be rationally described with De Finetti distribution. This distribution provides the convenient intermediate asymptotic form between Rayleigh distribution and exact combinatoric theory.
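
    The idea of displaying latent periodicities by Fourier transform can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the paper: each nucleotide is turned into a 0/1 indicator sequence, the squared Fourier amplitudes of the four indicators are summed at each harmonic k, and the result is divided by the sequence length (a normalization chosen here for convenience). The function name `power_spectrum` is hypothetical, and the significance criteria and sum rules discussed in the abstract are not implemented.

```python
import cmath


def power_spectrum(seq):
    """Return [(k, power)] for harmonics k = 1 .. len(seq) // 2.

    The power at k sums |DFT|^2 over the indicator sequences of A, C, G, T.
    A harmonic k corresponds to a period of len(seq) / k positions.
    """
    n = len(seq)
    spectrum = []
    for k in range(1, n // 2 + 1):
        total = 0.0
        for base in "ACGT":
            # Fourier amplitude of the indicator sequence for this base.
            amplitude = sum(
                cmath.exp(-2j * cmath.pi * k * m / n)
                for m, symbol in enumerate(seq)
                if symbol == base
            )
            total += abs(amplitude) ** 2
        spectrum.append((k, total / n))
    return spectrum


if __name__ == "__main__":
    sequence = "ACG" * 20  # toy sequence with an exact period of 3
    k_peak, _ = max(power_spectrum(sequence), key=lambda item: item[1])
    print("strongest period:", len(sequence) / k_peak)
```

    For real genomic sequences the peaks are blurred by mutations and indels, which is why the observed spectrum has to be compared against reference random sequences before a periodicity is called significant.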

  15. Nanopore-based Fourth-generation DNA Sequencing Technology

    PubMed Central

    Feng, Yanxiao; Zhang, Yuechuan; Ying, Cuifeng; Wang, Deqiang; Du, Chunlei

    2015-01-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications. PMID:25743089

  16. Methyl-binding DNA capture Sequencing for Patient Tissues

    PubMed Central

    Jadhav, Rohit R.; Wang, Yao V.; Hsu, Ya-Ting; Liu, Joseph; Garcia, Dawn; Lai, Zhao; Huang, Tim H. M.; Jin, Victor X.

    2016-01-01

    Methylation is one of the essential epigenetic modifications to the DNA, which is responsible for the precise regulation of genes required for stable development and differentiation of different tissue types. Dysregulation of this process is often the hallmark of various diseases like cancer. Here, we outline one of the recent sequencing techniques, Methyl-Binding DNA Capture sequencing (MBDCap-seq), used to quantify methylation in various normal and disease tissues for large patient cohorts. We describe a detailed protocol of this affinity enrichment approach along with a bioinformatics pipeline to achieve optimal quantification. This technique has been used to sequence hundreds of patients across various cancer types as a part of the 1,000 methylome project (Cancer Methylome System). PMID:27842364

  17. DNA sequence analysis using hierarchical ART-based classification networks

    SciTech Connect

    LeBlanc, C.; Hruska, S.I.; Katholi, C.R.; Unnasch, T.R.

    1994-12-31

    Adaptive resonance theory (ART) describes a class of artificial neural network architectures that act as classification tools which self-organize, work in real-time, and require no retraining to classify novel sequences. We have adapted ART networks to provide support to scientists attempting to categorize tandem repeat DNA fragments from Onchocerca volvulus. In this approach, sequences of DNA fragments are presented to multiple ART-based networks which are linked together into two (or more) tiers; the first provides coarse sequence classification while the subsequent tiers refine the classifications as needed. The overall rating of the resulting classification of fragments is measured using statistical techniques based on those introduced to validate results from traditional phylogenetic analysis. Tests of the Hierarchical ART-based Classification Network, or HABclass network, indicate its value as a fast, easy-to-use classification tool which adapts to new data without retraining on previously classified data.

  18. Internal Transcribed Spacer 2 (nu ITS2 rRNA) Sequence-Structure Phylogenetics: Towards an Automated Reconstruction of the Green Algal Tree of Life

    PubMed Central

    Buchheim, Mark A.; Keller, Alexander; Koetschan, Christian; Förster, Frank; Merget, Benjamin; Wolf, Matthias

    2011-01-01

    Background: Chloroplast-encoded genes (matK and rbcL) have been formally proposed for use in DNA barcoding efforts targeting embryophytes. Extending such a protocol to chlorophytan green algae, though, is fraught with problems including non homology (matK) and heterogeneity that prevents the creation of a universal PCR toolkit (rbcL). Some have advocated the use of the nuclear-encoded, internal transcribed spacer two (ITS2) as an alternative to the traditional chloroplast markers. However, the ITS2 is broadly perceived to be insufficiently conserved or to be confounded by introgression or biparental inheritance patterns, precluding its broad use in phylogenetic reconstruction or as a DNA barcode. A growing body of evidence has shown that simultaneous analysis of nucleotide data with secondary structure information can overcome at least some of the limitations of ITS2. The goal of this investigation was to assess the feasibility of an automated, sequence-structure approach for analysis of ITS2 data from a large sampling of phylum Chlorophyta. Methodology/Principal Findings: Sequences and secondary structures from 591 chlorophycean, 741 trebouxiophycean and 938 ulvophycean algae, all obtained from the ITS2 Database, were aligned using a sequence structure-specific scoring matrix. Phylogenetic relationships were reconstructed by Profile Neighbor-Joining coupled with a sequence structure-specific, general time reversible substitution model. Results from analyses of the ITS2 data were robust at multiple nodes and showed considerable congruence with results from published phylogenetic analyses. Conclusions/Significance: Our observations on the power of automated, sequence-structure analyses of ITS2 to reconstruct phylum-level phylogenies of the green algae validate this approach to assessing diversity for large sets of chlorophytan taxa. 
Moreover, our results indicate that objections to the use of ITS2 for DNA barcoding should be weighed against the utility of an automated

  19. Redundancy modulation of nuclear DNA sequences in Dasypyrum villosum.

    PubMed

    Frediani, M; Colonna, N; Cremonini, R; De Pace, C; Delre, V; Caccia, R; Cionini, P G

    1994-05-01

    In order to assess fluid domains in the genome of Dasypyrum villosum, Feulgen/DNA cytophotometric determinations and molecular and cytological DNA-DNA hybridization experiments were carried out in resting embryos and developing seedlings from yellow and brown caryopses belonging to different populations. The cytophotometric data showed that the basic amount of nuclear DNA is, on average, 12% higher in 2-day-old seedlings from yellow caryopses as compared to those from brown caryopses. It increases in each individual during seed germination, to a higher extent in seedlings from yellow caryopses than in those from brown caryopses. DNA content also differs up to 13% between plants within a caryopsis-colour group and up to 40% between populations. Dot-blot hybridization of a 396-bp D. villosum-specific DNA repeat to genomic DNA extracted from embryos in dry seeds, or from seedlings belonging to single progenies of plants from different populations, confirmed the cytophotometric results. The redundancy in the genome of sequences hybridizing to the 396-bp element differs significantly both between populations and between plant progenies within a population. During seed germination these sequences are the more amplified the less they are redundant in the genome of resting embryos, and amplification occurs to a significantly greater extent in seedlings from yellow caryopses than in those from brown caryopses. ³H-labelled 396-bp sequences hybridize at or near the telomeres of most chromosome pairs though only to the shorter of the two subtelocentric pairs. The hybridization level is higher in seedlings from yellow caryopses than in those from brown caryopses, and a linear correlation exists between the number of silver grains counted over the labelled regions of each chromosome pair in the two groups of seedlings. Possible control mechanisms of the observed changes in the nuclear genome, and the role of these changes in developmental regulation and environmental

  20. Chloroplast DNA Sequence Homologies among Vascular Plants

    PubMed Central

    Lamppa, Gayle K.; Bendich, Arnold J.

    1979-01-01

    The extent of sequence conservation in the chloroplast genome of higher plants has been investigated. Supercoiled chloroplast DNA, prepared from pea seedlings, was labeled in vitro and used as a probe in reassociation experiments with a high concentration of total DNAs extracted from several angiosperms, gymnosperms, and lower vascular plants. In each case the probe reassociation was accelerated, demonstrating that some chloroplast sequences have been highly conserved throughout the evolution of vascular plants. Only among the flowering plants were distinct levels of cross-reaction with the pea chloroplast probe evident; broad bean and barley exhibited the highest and lowest levels, respectively. With the hydroxylapatite assay these levels decreased with a decrease in probe fragment length (from 1,860 to 735 bases), indicating that many conserved sequences in the chloroplast genome are separated by divergent sequences on a rather fine scale. Despite differences observed in levels of homology with the hydroxylapatite assay, S1 nuclease analysis of heteroduplexes showed that outside of the pea family the extent of sequence relatedness between the probe and various heterologous DNAs is approximately the same: 30%. In our interpretation, the fundamental changes in the chloroplast genome during angiosperm evolution involved the rearrangement of this 30% with respect to the more rapidly changing sequences of the genome. These rearrangements may have been more extensive in dicotyledons than in monocotyledons. We have estimated the amount of conserved and divergent DNA interspersed between one another. From the reassociation experiments, determinations were made of the percentage of chloroplast DNA in total DNA extracts from different higher plants; this value remained relatively constant when compared with the large variation in the diploid genome size of the plants. PMID:16660786

  1. Identification of parasite DNA in common bile duct stones by PCR and DNA sequencing

    PubMed Central

    Jang, Ji Sun; Kim, Kyung Ho; Yu, Jae-Ran

    2007-01-01

    We attempted to identify parasite DNA in the biliary stones of humans via PCR and DNA sequencing. Genomic DNA was isolated from each of 15 common bile duct (CBD) stones and 5 gallbladder (GB) stones. The patients who had the CBD stones suffered from cholangitis, and the patients with GB stones showed acute cholecystitis, respectively. The 28S and 18S rDNA genes were amplified successfully from 3 and/or 1 common bile duct stone samples, and then cloned and sequenced. The 28S and 18S rDNA sequences were highly conserved among isolates. Identity of the obtained 28S D1 rDNA with that of Clonorchis sinensis was higher than 97.6%, and identity of the 18S rDNA with that of other Ascarididae was 97.9%. Almost no intra-specific variations were detected in the 28S and 18S rDNA with the exception of a few nucleotide variations, i.e., substitution and deletion. These findings suggest that C. sinensis and Ascaris lumbricoides may be related with the biliary stone formation and development. PMID:18165713

  2. Automated Device for Asynchronous Extraction of RNA, DNA, or Protein Biomarkers from Surrogate Patient Samples.

    PubMed

    Bitting, Anna L; Bordelon, Hali; Baglia, Mark L; Davis, Keersten M; Creecy, Amy E; Short, Philip A; Albert, Laura E; Karhade, Aditya V; Wright, David W; Haselton, Frederick R; Adams, Nicholas M

    2016-12-01

    Many biomarker-based diagnostic methods are inhibited by nontarget molecules in patient samples, necessitating biomarker extraction before detection. We have developed a simple device that purifies RNA, DNA, or protein biomarkers from complex biological samples without robotics or fluid pumping. The device design is based on functionalized magnetic beads, which capture biomarkers and remove background biomolecules by magnetically transferring the beads through processing solutions arrayed within small-diameter tubing. The process was automated by wrapping the tubing around a disc-like cassette and rotating it past a magnet using a programmable motor. This device recovered biomarkers at ~80% of the operator-dependent extraction method published previously. The device was validated by extracting biomarkers from a panel of surrogate patient samples containing clinically relevant concentrations of (1) influenza A RNA in nasal swabs, (2) Escherichia coli DNA in urine, (3) Mycobacterium tuberculosis DNA in sputum, and (4) Plasmodium falciparum protein and DNA in blood. The device successfully extracted each biomarker type from samples representing low levels of clinically relevant infectivity (i.e., 7.3 copies/µL of influenza A RNA, 405 copies/µL of E. coli DNA, 0.22 copies/µL of TB DNA, 167 copies/µL of malaria parasite DNA, and 2.7 pM of malaria parasite protein). © 2015 Society for Laboratory Automation and Screening.

  3. Automated cleaning and pre-processing of immunoglobulin gene sequences from high-throughput sequencing

    PubMed Central

    Michaeli, Miri; Noga, Hila; Tabibian-Keissar, Hilla; Barshack, Iris; Mehr, Ramit

    2012-01-01

    High-throughput sequencing (HTS) yields tens of thousands to millions of sequences that require a large amount of pre-processing work to clean various artifacts. Such cleaning cannot be performed manually. Existing programs are not suitable for immunoglobulin (Ig) genes, which are variable and often highly mutated. This paper describes Ig High-Throughput Sequencing Cleaner (Ig-HTS-Cleaner), a program containing a simple cleaning procedure that successfully deals with pre-processing of Ig sequences derived from HTS, and Ig Insertion—Deletion Identifier (Ig-Indel-Identifier), a program for identifying legitimate and artifact insertions and/or deletions (indels). Our programs were designed for analyzing Ig gene sequences obtained by 454 sequencing, but they are applicable to all types of sequences and sequencing platforms. Ig-HTS-Cleaner and Ig-Indel-Identifier have been implemented in Java and saved as executable JAR files, supported on Linux and MS Windows. No special requirements are needed in order to run the programs, except for correctly constructing the input files as explained in the text. The programs' performance has been tested and validated on real and simulated data sets. PMID:23293637

  4. AutoMate Express™ forensic DNA extraction system for the extraction of genomic DNA from biological samples.

    PubMed

    Liu, Jason Y; Zhong, Chang; Holt, Allison; Lagace, Robert; Harrold, Michael; Dixon, Alan B; Brevnov, Maxim G; Shewale, Jaiprakash G; Hennessy, Lori K

    2012-07-01

    The AutoMate Express™ Forensic DNA Extraction System was developed for automatic isolation of DNA from a variety of forensic biological samples. The performance of the system was investigated using a wide range of biological samples. Depending on the sample type, either PrepFiler™ lysis buffer or PrepFiler BTA™ lysis buffer was used to lyse the samples. After lysis and removal of the substrate using a LySep™ column, the lysate in the sample tubes was loaded onto the AutoMate Express™ instrument and DNA was extracted using one of the two instrument extraction protocols. Our study showed that DNA was recovered from as little as 0.025 μL of blood. DNA extracted from casework-type samples was free of detectable PCR inhibitors and the short tandem repeat profiles were complete, conclusive, and devoid of any PCR artifacts. The system also showed consistent performance from day-to-day operation. 2012 American Academy of Forensic Sciences. Published 2012. This article is a U.S. Government work and is in the public domain in the U.S.A.

  5. Entire Mitochondrial DNA Sequencing on Massively Parallel Sequencing for the Korean Population

    PubMed Central

    2017-01-01

    Mitochondrial DNA (mtDNA) genome analysis has been a potent tool in forensic practice as well as in the understanding of human phylogeny in the maternal lineage. The traditional mtDNA analysis is focused on the control region, but the introduction of massively parallel sequencing (MPS) has made the typing of the entire mtDNA genome (mtGenome) more accessible for routine analysis. The complete mtDNA information can provide large amounts of novel genetic data for diverse populations as well as improved discrimination power for identification. The genetic diversity of the mtDNA sequence in different ethnic populations has been revealed through MPS analysis, but MPS data for the entire mtGenome are limited for the Korean population, and the existing data focus mainly on the control region. In this study, the complete mtGenome data for 186 Koreans, obtained using Ion Torrent Personal Genome Machine (PGM) technology and retrieved from rather common mtDNA haplogroups based on the control region sequence, are described. The results showed that 24 haplogroups, determined with hypervariable regions only, branched into 47 subhaplogroups, and point heteroplasmy was more frequent in the coding regions. In addition, sequence variations in the coding regions observed in this study were compared with those presented in other reports on different populations, and there were similar features observed in the sequence variants for the predominant haplogroups among East Asian populations, such as Haplogroup D and macrohaplogroups M9, G, and D. This study is expected to be the trigger for the development of Korean specific mtGenome data followed by numerous future studies. PMID:28244283

  6. Detecting and Analyzing DNA Sequencing Errors: Toward a Higher Quality of the Bacillus subtilis Genome Sequence

    PubMed Central

    Médigue, Claudine; Rose, Matthias; Viari, Alain; Danchin, Antoine

    1999-01-01

    During the determination of a DNA sequence, the introduction of artifactual frameshifts and/or in-frame stop codons in putative genes can lead to misprediction of gene products. Detection of such errors with a method based on protein similarity matching is only possible when related sequences are available in databases. Here, we present a method to detect frameshift errors in DNA sequences that is based on the intrinsic properties of the coding sequences. It combines the results of two analyses, the search for translational initiation/termination sites and the prediction of coding regions. This method was used to screen the complete Bacillus subtilis genome sequence and the regions flanking putative errors were resequenced for verification. This procedure allowed us to correct the sequence and to analyze in detail the nature of the errors. Interestingly, in several cases in-frame termination codons or frameshifts were not sequencing errors but confirmed to be present in the chromosome, indicating that the genes are either nonfunctional (pseudogenes) or subject to regulatory processes such as programmed translational frameshifts. The method can be used for checking the quality of the sequences produced by any prokaryotic genome sequencing project. PMID:10568751
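
    One symptom named in the abstract above, an in-frame stop codon inside a putative gene, can be flagged with a very small scan. The sketch below is illustrative only and is not the authors' method, which combines a search for initiation/termination sites with coding-region prediction. The function name is hypothetical, and the standard stop codons TAA, TAG, and TGA are assumed.

```python
# Illustrative sketch only: flag stop codons that interrupt a putative coding
# sequence. Assumes the standard stop codons; the function name is hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def in_frame_stops(cds):
    """Return 0-based start positions of stop codons in frame 0,
    ignoring the final complete codon (the expected terminator)."""
    cds = cds.upper()
    # Start index of the last complete codon in frame 0.
    last_codon_start = len(cds) - len(cds) % 3 - 3
    return [i for i in range(0, last_codon_start, 3)
            if cds[i:i + 3] in STOP_CODONS]
```

    A hit from such a scan is only a candidate error: as the abstract notes, some in-frame stops are real features of the chromosome and were confirmed by resequencing.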

  7. Next generation sequencing of DNA-launched Chikungunya vaccine virus

    SciTech Connect

    Hidajat, Rachmat; Nickols, Brian; Forrester, Naomi; Tretyakova, Irina; Weaver, Scott; Pushko, Peter

    2016-03-15

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3′ untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. - Highlights: • Chikungunya virus (CHIKV) is an emerging pandemic threat. • In vivo DNA-launched attenuated CHIKV is a novel vaccine technology. • DNA-launched virus was sequenced using HiSeq2000 and compared to the 181/25 virus. • DNA-launched virus has lower frequency of SNPs at E2-12 and E2-82 attenuation loci.

  8. DNA sequence of the Escherichia coli tonB gene.

    PubMed Central

    Postle, K; Good, R F

    1983-01-01

    The nucleotide sequence of a cloned section of the Escherichia coli chromosome containing the tonB gene has been determined. Transcription initiation and termination sites for tonB RNA have been determined by S1 nuclease mapping. The tonB promoter and terminator resemble other E. coli promoters and terminators; the sequence of the tonB terminator region suggests that it may function bidirectionally. The DNA sequence specifies an open translation reading frame between the 5' and 3' RNA termini whose location is consistent with the position of previously isolated tonB::IS1 mutations. The DNA sequence predicts a proline-rich protein with a calculated size of 26.1-26.6 kilodaltons (239-244 amino acids), depending on which of three potential initiation codons is utilized. The predicted NH2 terminus of tonB protein resembles the proteolytically cleaved signal sequences of E. coli periplasmic and outer membrane proteins; the overall hydrophilic character of the protein sequence suggests that the bulk of the tonB protein is not embedded within the inner or outer membrane. A significant discrepancy exists between the calculated size of tonB protein and the apparent size of 36 kilodaltons determined by sodium dodecyl sulfate/polyacrylamide gel electrophoresis. PMID:6310567

  9. Delineating relative homogeneous G+C domains in DNA sequences.

    PubMed

    Li, W

    2001-10-03

    The concept of homogeneity of G+C content is always relative and subjective. This point is emphasized and quantified in this paper using a simple example of one sequence segmented into two subsequences. Whether the sequence is homogeneous or not can be answered by whether the two-subsequence model describes the DNA sequence better than the one-sequence model. There are at least three equivalent ways of looking at the 1-to-2 segmentation: Jensen-Shannon divergence measure, log likelihood ratio test, and model selection using Bayesian information criterion. Once a criterion is chosen, a DNA sequence can be recursively segmented into multiple domains. We use one subjective criterion called segmentation strength based on the Bayesian information criterion. Whether or not a sequence is homogeneous and how many domains it has depend on this criterion. We compare six different genome sequences (yeast S. cerevisiae chromosome III and IV, bacterium M. pneumoniae, human major histocompatibility complex sequence, longest contigs in human chromosome 21 and 22) by recursive segmentations at different strength criteria. Results by recursive segmentation confirm that yeast chromosome IV is more homogeneous than yeast chromosome III, human chromosome 21 is more homogeneous than human chromosome 22, and bacterial genomes may not be homogeneous due to short segments with distinct base compositions. The recursive segmentation also provides a quantitative criterion for identifying isochores in human sequences. Some features of our recursive segmentation, such as the possibility of delineating domain borders accurately, are superior to those of the moving-window approach commonly used in such analyses.
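
    The 1-to-2 segmentation described above can be sketched with the Jensen-Shannon divergence, one of the three equivalent views the abstract lists. This is a minimal sketch under stated assumptions, not the paper's implementation: it uses the four-base composition rather than G+C content alone, weights each subsequence by its length, and omits the BIC-based segmentation strength and the recursion. All function names are hypothetical.

```python
import math
from collections import Counter

def entropy(seq):
    """Shannon entropy (bits) of the symbol composition of seq."""
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())

def js_divergence(seq, i):
    """Length-weighted Jensen-Shannon divergence of the split seq[:i] | seq[i:]."""
    left, right = seq[:i], seq[i:]
    n = len(seq)
    return (entropy(seq)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))

def best_split(seq):
    """Return (position, divergence) for the split with the largest divergence."""
    return max(((i, js_divergence(seq, i)) for i in range(1, len(seq))),
               key=lambda pair: pair[1])
```

    A recursive segmentation would apply best_split again to each half and stop once the divergence falls below a chosen threshold; the choice of that threshold is the subjective criterion the abstract emphasizes.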

  10. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data.

    PubMed

    Miller, Mark P; Knaus, Brian J; Mullins, Thomas D; Haig, Susan M

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
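
    The third module's task, finding sequences that contain microsatellites matching user-specified parameters, can be illustrated with a regular-expression backreference. This is a hedged sketch and not SSR_pipeline's actual search algorithm or interface: the function name, parameter names, and default values are assumptions, and compound microsatellites are not handled.

```python
import re

def find_microsatellites(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for simple tandem repeats in seq.

    Hypothetical helper: a motif of min_motif..max_motif bases followed by at
    least (min_repeats - 1) further exact copies of itself.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits
```

    The lazy quantifier on the motif group makes the search prefer the shortest repeating unit, so a run of AC is reported with motif AC rather than ACAC.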

  11. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    USGS Publications Warehouse

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).

  12. A measure of DNA sequence similarity by Fourier Transform with applications on hierarchical clustering.

    PubMed

    Yin, Changchuan; Chen, Ying; Yau, Stephen S-T

    2014-10-21

    Multiple sequence alignment (MSA) is a prominent method for classification of DNA sequences, yet it is hampered by inherent limitations in computational complexity. Alignment-free methods have been developed over the past decade for more efficient comparison and classification of DNA sequences than MSA. However, most alignment-free methods may lose structural and functional information of DNA sequences because they are based on feature extractions. Therefore, they may not fully reflect the actual differences among DNA sequences. Alignment-free methods with information conservation are needed for more accurate comparison and classification of DNA sequences. We propose a new alignment-free similarity measure of DNA sequences using the Discrete Fourier Transform (DFT). In this method, we map DNA sequences into four binary indicator sequences and apply DFT to the indicator sequences to transform them into the frequency domain. The Euclidean distance of full DFT power spectra of the DNA sequences is used as the similarity distance metric. To compare the DFT power spectra of DNA sequences with different lengths, we propose an even scaling method to extend shorter DFT power spectra to equal the longest length of the sequences compared. After the DFT power spectra are evenly scaled, the DNA sequences are compared in the same DFT frequency space dimensionality. We assess the accuracy of the similarity metric in hierarchical clustering using simulated DNA and virus sequences. The results demonstrate that the DFT based method is an effective and accurate measure of DNA sequence similarity.
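
    The core of the measure described above (indicator sequences, DFT, Euclidean distance between power spectra) can be sketched as follows. Two assumptions are made here that the abstract does not confirm: the four indicator power spectra are summed into one spectrum, and only equal-length sequences are compared, so the even scaling step is left out. Function names are hypothetical, and the direct O(n^2) DFT is used for clarity rather than speed.

```python
import cmath
import math

def power_spectrum(seq):
    """Summed DFT power spectrum of the four binary indicator sequences of seq."""
    seq = seq.upper()
    n = len(seq)
    spectrum = [0.0] * n
    for base in "ACGT":
        # Indicator sequence: 1 where seq has this base, 0 elsewhere.
        indicator = [1.0 if s == base else 0.0 for s in seq]
        for k in range(n):
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(indicator))
            spectrum[k] += abs(coeff) ** 2
    return spectrum

def dft_distance(seq_a, seq_b):
    """Euclidean distance between power spectra of two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch omits the even-scaling step; lengths must match")
    ps_a, ps_b = power_spectrum(seq_a), power_spectrum(seq_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ps_a, ps_b)))
```

    For real data an FFT routine would replace the inner sum, and sequences of different lengths would first need their spectra brought to a common length.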

  13. Bacterial and fungal DNA extraction from positive blood culture bottles: a manual and an automated protocol.

    PubMed

    Mäki, Minna

    2015-01-01

    When adapting a gene amplification-based method for routine sepsis diagnostics using a blood culture sample as the specimen type, a prerequisite for a successful and sensitive downstream analysis is an efficient DNA extraction step. In recent years, a number of in-house and commercial DNA extraction solutions have become available. Careful evaluation with respect to cell wall disruption of various microbes and subsequent recovery of microbial DNA without putative gene amplification inhibitors should be conducted prior to selecting the most feasible DNA extraction solution for the downstream analysis used. Since gene amplification technologies have been developed to be highly sensitive for a broad range of microbial species, it is also important to confirm that the sample preparation reagents and materials used are bioburden-free to avoid any risk of false-positive result reporting or interference with the diagnostic process. Here, one manual and one automated DNA extraction system feasible for blood culture samples are described.

  14. Automated centrifugal-microfluidic platform for DNA purification using laser burst valve and coriolis effect.

    PubMed

    Choi, Min-Seong; Yoo, Jae-Chern

    2015-04-01

    We report a fully automated DNA purification platform with a micropored membrane in the channel utilizing centrifugal microfluidics on a lab-on-a-disc (LOD). The microfluidic flow in the LOD, into which the reagents are injected for DNA purification, is controlled by a single motor and laser burst valve. The sample and reagents pass successively through the micropored membrane in the channel when each laser burst valve is opened. The Coriolis effect is used by rotating the LOD bi-directionally to increase the purity of the DNA, thereby preventing the mixing of the waste and elution solutions. The total process from the lysed sample injection into the LOD to obtaining the purified DNA was finished within 7 min with only one manual step. The experimental result for Salmonella shows that the proposed microfluidic platform is comparable to the existing devices in terms of the purity and yield of DNA.

  15. Pericentric satellite DNA sequences in Pipistrellus pipistrellus (Vespertilionidae; Chiroptera).

    PubMed

    Barragán, M J L; Martínez, S; Marchal, J A; Fernández, R; Bullejos, M; Díaz de la Guardia, R; Sánchez, A

    2003-09-01

    This paper reports the molecular and cytogenetic characterization of a HindIII family of satellite DNA in the bat species Pipistrellus pipistrellus. This satellite is organized in tandem repeats of 418 bp monomer units, and represents approximately 3% of the whole genome. The consensus sequence from five cloned monomer units has an A-T content of 62.20%. We have found differences in the ladder pattern of bands between two populations of the same species. These differences are probably because of the absence of the target sites for the HindIII enzyme in most monomer units of one population, but not in the other. Fluorescent in situ hybridization (FISH) localized the satellite DNA in the pericentromeric regions of all autosomes and the X chromosome, but it was absent from the Y chromosome. Digestion of genomic DNAs with HpaII and its isoschizomer MspI demonstrated that these repetitive DNA sequences are not methylated. Other bat species were tested for the presence of this repetitive DNA. It was absent in five Vespertilionidae and one Rhinolophidae species, indicating that it could be a species/genus specific, repetitive DNA family.

  16. Semi-automated unidirectional sequence analysis for mutation detection in a clinical diagnostic setting.

    PubMed

    Ellard, Sian; Shields, Beverley; Tysoe, Carolyn; Treacy, Rebecca; Yau, Shu; Mattocks, Christopher; Wallace, Andrew

    2009-06-01

    The past 10 years have seen an improvement in sequence data quality due to the introduction of capillary sequencers and new sequencing chemistries. In parallel, new software programs for automated mutation detection have been developed. We evaluated the sensitivity of semiautomated unidirectional sequence analysis for the detection of heterozygous base substitutions using the Mutation Surveyor software package. Detection rates for heterozygous base substitutions in 29 genes by automated and visual inspection were compared. Examples of heterozygous bases not detected in one direction during bidirectional analysis were also sought through a national survey of United Kingdom (UK) genetics laboratories. Sequence quality was assessed in a consecutive cohort of 50 patients for whom the 39 exons of the ABCC8 gene had been sequenced in one direction. A total of 701 different heterozygous base substitutions were detected by the software with no false negatives (sensitivity ≥99.57%). Four examples of heterozygous bases missed in one direction during bidirectional analysis were reported. Two were detected using unidirectional analysis settings, and the other two bases had low-quality scores. Of the 1950 amplicons examined, 97.2% had a quality score ≥30 and an average PHRED-like score ≥50 for the defined region of interest, and 98.1% of the 323,650 bases had a PHRED score >40. We found no evidence to support a requirement for bidirectional sequencing. Semiautomated analysis of good quality unidirectional sequence data has high sensitivity and is suitable for heterozygote mutation scanning in clinical diagnostic laboratories. Further work is required to determine minimum quality parameters for semiautomated analysis.
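
    The quality thresholds quoted above are easier to read as error probabilities. The sketch below uses the standard Phred definition, Q = -10 log10(P); whether the "PHRED-like" scores reported by the software follow exactly this scale is an assumption, and the function names are hypothetical.

```python
import math

def phred_to_error_probability(q):
    """Probability that a base call is wrong, given a Phred quality score q."""
    return 10 ** (-q / 10)

def error_probability_to_phred(p):
    """Phred quality score for a base-call error probability p."""
    return -10 * math.log10(p)
```

    On this scale a score of 30 corresponds to an expected error rate of 1 in 1,000 base calls, and a score of 40 to 1 in 10,000.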

  17. RNA sequencing using fluorescent-labeled dideoxynucleotides and automated fluorescence detection.

    PubMed Central

    Bauer, G J

    1990-01-01

    Although dideoxy-terminated sequencing of RNA, using reverse transcriptase and oligodeoxynucleotide primers, is now a well established method, its accuracy is limited by sequence ambiguities due to unspecific chain termination events. A protocol is described which circumvents these ambiguities by using fluorescence labels tagged to dideoxynucleotides. Only chain terminations caused by dideoxynucleotides were detected, while prematurely terminated cDNAs remained undetectable. In addition, the remaining multiple signals at nucleotide positions can be assigned to sequence heterogeneities within the RNA sequence to be determined. PMID:1690393

  18. A frameshift error detection algorithm for DNA sequencing projects.

    PubMed Central

    Fichant, G A; Quentin, Y

    1995-01-01

    During the determination of DNA sequences, frameshift errors are not the most frequent but they are the most bothersome as they corrupt the amino acid sequence over several residues. Detection of such errors by sequence alignment is only possible when related sequences are found in the databases. To avoid this limitation, we have developed a new tool based on the distribution of non-overlapping 3-tuples or 6-tuples in the three frames of an ORF. The method relies upon the result of a correspondence analysis. It has been extensively tested on Bacillus subtilis and Saccharomyces cerevisiae sequences and has also been examined with human sequences. The results indicate that it can detect frameshift errors affecting as few as 20 bp with a low rate of false positives (no more than 1.0/1000 bp scanned). The proposed algorithm can be used to scan a large collection of data, but it is mainly intended for laboratory practice as a tool for checking the quality of the sequences produced during a sequencing project. PMID:7659513
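
    The raw input to the method above is the distribution of non-overlapping 3-tuples (or 6-tuples) in each of the three frames of an ORF. A minimal sketch of that counting step follows; it does not implement the correspondence analysis on which the published tool relies, and the function name is hypothetical.

```python
from collections import Counter

def frame_tuple_counts(orf, k=3):
    """Count non-overlapping k-tuples in each of the three frames of orf.

    Returns a list of three Counters, one per frame offset (0, 1, 2).
    """
    orf = orf.upper()
    counts = []
    for frame in range(3):
        tuples = (orf[i:i + k] for i in range(frame, len(orf) - k + 1, k))
        counts.append(Counter(tuples))
    return counts
```

    A frameshift error moves the downstream sequence into a different frame, so the tuple distribution in the nominal coding frame would be expected to start resembling that of a non-coding frame after the error.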

  19. MEME: discovering and analyzing DNA and protein sequence motifs.

    PubMed

    Bailey, Timothy L; Williams, Nadya; Misleh, Chris; Li, Wilfred W

    2006-07-01

    MEME (Multiple EM for Motif Elicitation) is one of the most widely used tools for searching for novel 'signals' in sets of biological sequences. Applications include the discovery of new transcription factor binding sites and protein domains. MEME works by searching for repeated, ungapped sequence patterns that occur in the DNA or protein sequences provided by the user. Users can perform MEME searches via the web server hosted by the National Biomedical Computation Resource (http://meme.nbcr.net) and several mirror sites. Through the same web server, users can also access the Motif Alignment and Search Tool to search sequence databases for matches to motifs encoded in several popular formats. By clicking on buttons in the MEME output, users can compare the motifs discovered in their input sequences with databases of known motifs, search sequence databases for matches to the motifs and display the motifs in various formats. This article describes the freely accessible web server and its architecture, and discusses ways to use MEME effectively to find new sequence patterns in biological sequences and analyze their significance.

  20. MEME: discovering and analyzing DNA and protein sequence motifs

    PubMed Central

    Bailey, Timothy L.; Williams, Nadya; Misleh, Chris; Li, Wilfred W.

    2006-01-01

    MEME (Multiple EM for Motif Elicitation) is one of the most widely used tools for searching for novel ‘signals’ in sets of biological sequences. Applications include the discovery of new transcription factor binding sites and protein domains. MEME works by searching for repeated, ungapped sequence patterns that occur in the DNA or protein sequences provided by the user. Users can perform MEME searches via the web server hosted by the National Biomedical Computation Resource (http://meme.nbcr.net) and several mirror sites. Through the same web server, users can also access the Motif Alignment and Search Tool to search sequence databases for matches to motifs encoded in several popular formats. By clicking on buttons in the MEME output, users can compare the motifs discovered in their input sequences with databases of known motifs, search sequence databases for matches to the motifs and display the motifs in various formats. This article describes the freely accessible web server and its architecture, and discusses ways to use MEME effectively to find new sequence patterns in biological sequences and analyze their significance. PMID:16845028

  1. Prediction of fine-tuned promoter activity from DNA sequence

    PubMed Central

    Siwo, Geoffrey; Rider, Andrew; Tan, Asako; Pinapati, Richard; Emrich, Scott; Chawla, Nitesh; Ferdig, Michael

    2016-01-01

    The quantitative prediction of transcriptional activity of genes using promoter sequence is fundamental to the engineering of biological systems for industrial purposes and understanding the natural variation in gene expression. To catalyze the development of new algorithms for this purpose, the Dialogue on Reverse Engineering Assessment and Methods (DREAM) organized a community challenge seeking predictive models of promoter activity given normalized promoter activity data for 90 ribosomal protein promoters driving expression of a fluorescent reporter gene. By developing an unbiased modeling approach that performs an iterative search for predictive DNA sequence features using the frequencies of various k-mers, inferred DNA mechanical properties and spatial positions of promoter sequences, we achieved the best performer status in this challenge. The specific predictive features used in the model included the frequency of the nucleotide G, the length of polymeric tracts of T and TA, the frequencies of 6 distinct trinucleotides and 12 tetranucleotides, and the predicted protein deformability of the DNA sequence. Our method accurately predicted the activity of 20 natural variants of ribosomal protein promoters (Spearman correlation r = 0.73) as compared to 33 laboratory-mutated variants of the promoters (r = 0.57) in a test set that was hidden from participants. Notably, our model differed substantially from the rest in 2 main ways: i) it did not explicitly utilize transcription factor binding information implying that subtle DNA sequence features are highly associated with gene expression, and ii) it was entirely based on features extracted exclusively from the 100 bp region upstream from the translational start site demonstrating that this region encodes much of the overall promoter activity. 
The findings from this study have important implications for the engineering of predictable gene expression systems and the evolution of gene expression in naturally occurring
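
    Several of the sequence features named in the abstract above (frequency of G, length of poly-T tracts, k-mer frequencies) are simple to compute. The sketch below shows one possible feature extractor; it is not the authors' model, it omits the DNA mechanical properties and positional features, and the function and key names are hypothetical.

```python
import re
from collections import Counter

def promoter_features(seq, k=3):
    """Compute a few simple sequence features for a promoter region."""
    seq = seq.upper()
    n = len(seq)
    # Overlapping k-mers, normalized to frequencies.
    kmers = Counter(seq[i:i + k] for i in range(n - k + 1))
    total = sum(kmers.values())
    t_tracts = re.findall(r"T+", seq)
    return {
        "g_frequency": seq.count("G") / n,
        "longest_t_tract": max((len(t) for t in t_tracts), default=0),
        "kmer_frequencies": {kmer: c / total for kmer, c in kmers.items()},
    }
```

    Features of this kind would then be passed to a regression model and scored against measured promoter activity, for example with a rank correlation as reported in the abstract.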

  2. Prediction of fine-tuned promoter activity from DNA sequence.

    PubMed

    Siwo, Geoffrey; Rider, Andrew; Tan, Asako; Pinapati, Richard; Emrich, Scott; Chawla, Nitesh; Ferdig, Michael

    2016-01-01

    The quantitative prediction of transcriptional activity of genes using promoter sequence is fundamental to the engineering of biological systems for industrial purposes and understanding the natural variation in gene expression. To catalyze the development of new algorithms for this purpose, the Dialogue on Reverse Engineering Assessment and Methods (DREAM) organized a community challenge seeking predictive models of promoter activity given normalized promoter activity data for 90 ribosomal protein promoters driving expression of a fluorescent reporter gene. By developing an unbiased modeling approach that performs an iterative search for predictive DNA sequence features using the frequencies of various k-mers, inferred DNA mechanical properties and spatial positions of promoter sequences, we achieved the best performer status in this challenge. The specific predictive features used in the model included the frequency of the nucleotide G, the length of polymeric tracts of T and TA, the frequencies of 6 distinct trinucleotides and 12 tetranucleotides, and the predicted protein deformability of the DNA sequence. Our method accurately predicted the activity of 20 natural variants of ribosomal protein promoters (Spearman correlation r = 0.73) as compared to 33 laboratory-mutated variants of the promoters (r = 0.57) in a test set that was hidden from participants. Notably, our model differed substantially from the rest in 2 main ways: i) it did not explicitly utilize transcription factor binding information, implying that subtle DNA sequence features are highly associated with gene expression, and ii) it was entirely based on features extracted exclusively from the 100 bp region upstream from the translational start site, demonstrating that this region encodes much of the overall promoter activity. The findings from this study have important implications for the engineering of predictable gene expression systems and the evolution of gene expression in naturally occurring
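
    The sequence-derived features named in this abstract (frequency of G, tract lengths of T and TA, trinucleotide frequencies) can be illustrated with a minimal Python sketch. The function names and the feature layout are assumptions made for illustration; they are not the authors' model, and the DNA deformability feature is not covered.

```python
from collections import Counter
from itertools import product
from typing import Dict


def kmer_frequencies(seq: str, k: int) -> Dict[str, float]:
    """Relative frequency of every k-mer over A/C/G/T, using overlapping windows."""
    seq = seq.upper()
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(windows)
    total = max(len(windows), 1)  # avoid division by zero on very short input
    return {"".join(p): counts["".join(p)] / total
            for p in product("ACGT", repeat=k)}


def longest_tract(seq: str, unit: str) -> int:
    """Length in bases of the longest run of back-to-back copies of a non-empty `unit`."""
    seq, unit = seq.upper(), unit.upper()
    best = 0
    for start in range(len(seq)):
        copies = 0
        while seq.startswith(unit, start + copies * len(unit)):
            copies += 1
        best = max(best, copies * len(unit))
    return best


def promoter_features(seq: str) -> Dict[str, float]:
    """Hypothetical feature vector: G frequency, T and TA tracts, all trinucleotides."""
    features = {
        "freq_G": kmer_frequencies(seq, 1)["G"],
        "longest_T_tract": float(longest_tract(seq, "T")),
        "longest_TA_tract": float(longest_tract(seq, "TA")),
    }
    for kmer, freq in kmer_frequencies(seq, 3).items():
        features["freq_" + kmer] = freq
    return features
```

    A feature table built this way for each promoter could then be passed to any regression method; which features to keep would still have to be decided by a search such as the iterative one the abstract describes.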

  3. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a pu...

    GOVERNMENT RIGHTS: This application was funded under Department of Energy Contract DE-AC02-76ER01338. The U.S. Government has certain rights under this application and any patent issuing thereon.

  4. Development of a protein microarray using sequence-specific DNA binding domain on DNA chip surface

    SciTech Connect

    Choi, Yoo Seong; Pack, Seung Pil; Yoo, Young Je (E-mail: yjyoo@snu.ac.kr)

    2005-04-22

    A protein microarray based on a DNA microarray platform was developed to identify protein-protein interactions in vitro. A conventional DNA chip surface prepared with a 156-bp PCR product served as the substrate for the protein microarray. A high-affinity, sequence-specific DNA binding domain, the GAL4 DNA binding domain, was introduced as the fusion partner of a model target protein, enhanced green fluorescent protein. The target protein was immobilized directly on the DNA chip surface in an oriented manner. Finally, a monoclonal antibody against the target protein was used to detect the immobilized protein on the surface. This study shows that a conventional DNA chip can be used directly to make a protein microarray, and that this novel protein microarray can serve as a tool for identifying protein-protein interactions.

  5. A CLIQUE algorithm using DNA computing techniques based on closed-circle DNA sequences.

    PubMed

    Zhang, Hongyan; Liu, Xiyu

    2011-07-01

    DNA computing has been applied in broad fields such as graph theory, finite state problems, and combinatorial problems. DNA computing approaches are well suited to solving many combinatorial problems because of their vast parallelism and high-density storage. The CLIQUE algorithm is one of the grid-based clustering techniques for spatial data. It is a combinatorial problem over the dense cells. We therefore use DNA computing with closed-circle DNA sequences to execute the CLIQUE algorithm for two-dimensional data. In our study, the process of clustering becomes a parallel biochemical reaction, and the DNA sequences representing the marked cells can be combined to form closed-circle DNA sequences. This strategy is a new application of DNA computing. Although the strategy is only for two-dimensional data, it provides a new idea: to consider the grids as vertices in a graph and transform the search problem into a combinatorial problem.
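
    For orientation, the grid-based idea behind this abstract (mark dense cells, then treat adjacent dense cells as connected vertices of a graph) can be sketched conventionally in Python. This is an ordinary in-silico sketch, not the DNA computing procedure of the paper; the function names, the `cell_size` and `min_points` parameters, and the 4-neighbour adjacency rule are assumptions.

```python
from collections import Counter, deque


def dense_cells(points, cell_size, min_points):
    """Return the set of grid cells (ix, iy) that hold at least `min_points` points."""
    counts = Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)
    return {cell for cell, n in counts.items() if n >= min_points}


def connect_cells(cells):
    """Group dense cells into clusters by joining cells that share an edge."""
    unvisited = set(cells)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, queue = {seed}, deque([seed])
        while queue:  # breadth-first search over adjacent dense cells
            cx, cy = queue.popleft()
            for neighbour in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if neighbour in unvisited:
                    unvisited.remove(neighbour)
                    cluster.add(neighbour)
                    queue.append(neighbour)
        clusters.append(cluster)
    return clusters
```

    Viewing each dense cell as a graph vertex is what turns the clustering into a connectivity search, which is the reformulation the abstract proposes to carry out with DNA strands.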

  6. Directly repeated sequences associated with pathogenic mitochondrial DNA deletions.

    PubMed Central

    Johns, D R; Rutledge, S L; Stine, O C; Hurko, O

    1989-01-01

    We determined the nucleotide sequences of junctional regions associated with large deletions of mitochondrial DNA found in four unrelated individuals with a phenotype of chronic progressive external ophthalmoplegia. In each patient, the deletion breakpoint occurred within a directly repeated sequence of 13-18 base pairs, present in different regions of the normal mitochondrial genome, separated by 4.5-7.7 kilobases. In two patients, the deletions were identical. When all four repeated sequences are compared, a consensus sequence of 11 nucleotides emerges, similar to putative recombination signals, suggesting the involvement of a recombinational event. Partially deleted and normal mitochondrial DNAs were found in all tissues examined, but in very different proportions, indicating that these mutations originated before the primary cell layers diverged. PMID:2813377
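
    A direct repeat of the kind described here (the same short sequence occurring twice on one strand, some kilobases apart) can be located with a simple word index. The sketch below is an illustrative assumption, not the method used in the study: it finds exact repeats of one fixed length and measures separation from start to start.

```python
from collections import defaultdict


def direct_repeats(seq, length, min_gap, max_gap):
    """Find exact direct repeats of `length` bases whose start positions
    are between `min_gap` and `max_gap` bases apart (inclusive).

    Returns a sorted list of (first_start, second_start, repeated_word).
    """
    positions = defaultdict(list)
    for i in range(len(seq) - length + 1):
        positions[seq[i:i + length]].append(i)

    hits = []
    for word, starts in positions.items():
        for idx, first in enumerate(starts):
            for second in starts[idx + 1:]:
                if min_gap <= second - first <= max_gap:
                    hits.append((first, second, word))
    return sorted(hits)
```

    For the ranges quoted in the abstract, one would call it with `length` between 13 and 18 and a gap window of roughly 4500 to 7700 bases; imperfect repeats would need an approximate matcher instead.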

  7. Cloning and sequencing of cDNA and genomic DNA encoding PDM phosphatase of Fusarium moniliforme.

    PubMed

    Yoshida, Hiroshi; Iizuka, Mari; Narita, Takao; Norioka, Naoko; Norioka, Shigemi

    2006-12-01

    PDM phosphatase was purified approximately 500-fold through six steps from the extract of dried powder of the culture filtrate of Fusarium moniliforme. The purified preparation appeared homogeneous on SDS-PAGE although the protein band was broad. Amino acid sequence information was collected on tryptic peptides from this preparation. cDNA cloning was carried out based on the information. A full-length cDNA was obtained and sequenced. The sequence had an open reading frame of 651 amino acid residues with a molecular mass of 69,988 Da. Cloning and sequencing of the genomic DNA corresponding to the cDNA was also conducted. The deduced amino acid sequence could account for many but not all of the tryptic peptides, suggesting presence of contaminant protein(s). SDS-PAGE analysis after chemical deglycosylation showed two proteins with molecular masses of 58 and 68 kDa. This implied that the 58 kDa protein had been copurified with PDM phosphatase. Homology search showed that PDM phosphatase belongs to the purple acid phosphatase family, which is widely distributed in the biosphere. Sequence data of fungal purple acid phosphatases were collected from the database. Processing of the data revealed presence of two types, whose evolutionary relationships were discussed.

  8. Automated Reconstruction of Whole-Genome Phylogenies from Short-Sequence Reads

    PubMed Central

    Bertels, Frederic; Silander, Olin K.; Pachkov, Mikhail; Rainey, Paul B.; van Nimwegen, Erik

    2014-01-01

    Studies of microbial evolutionary dynamics are being transformed by the availability of affordable high-throughput sequencing technologies, which allow whole-genome sequencing of hundreds of related taxa in a single study. Reconstructing a phylogenetic tree of these taxa is generally a crucial step in any evolutionary analysis. Instead of constructing genome assemblies for all taxa, annotating these assemblies, and aligning orthologous genes, many recent studies 1) directly map raw sequencing reads to a single reference sequence, 2) extract single nucleotide polymorphisms (SNPs), and 3) infer the phylogenetic tree using maximum likelihood methods from the aligned SNP positions. However, here we show that, when using such methods to reconstruct phylogenies from sets of simulated sequences, both the exclusion of nonpolymorphic positions and the alignment to a single reference genome introduce systematic biases and errors in phylogeny reconstruction. To address these problems, we developed a new method that combines alignments from mappings to multiple reference sequences and show that this successfully removes biases from the reconstructed phylogenies. We implemented this method as a web server named REALPHY (Reference sequence Alignment-based Phylogeny builder), which fully automates phylogenetic reconstruction from raw sequencing reads. PMID:24600054
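
    Step 2 of the common pipeline described above, reducing an alignment to its SNP columns, is easy to state in code, which also makes visible what is thrown away. The sketch below is an assumption-laden illustration (a dictionary of equal-length aligned strings, with `-` and `N` treated as missing), not REALPHY's implementation.

```python
def polymorphic_columns(alignment):
    """Return indices of columns where at least two different bases are called.

    `alignment` maps a taxon name to its aligned sequence; all sequences
    must have the same length. Gaps ('-') and 'N' are treated as missing.
    """
    sequences = list(alignment.values())
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("aligned sequences must have equal length")

    columns = []
    for col in range(length):
        called = {s[col].upper() for s in sequences} - {"-", "N"}
        if len(called) > 1:
            columns.append(col)
    return columns


def snp_alignment(alignment):
    """Keep only the polymorphic columns, discarding every invariant position."""
    cols = polymorphic_columns(alignment)
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}
```

    The number of discarded columns (alignment length minus `len(polymorphic_columns(...))`) is exactly the nonpolymorphic information whose exclusion the abstract reports as a source of bias.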

  9. Automation of cDNA microarray hybridization and washing yields improved data quality.

    PubMed

    Yauk, Carole; Berndt, Lynn; Williams, Andrew; Douglas, George R

    2005-07-29

    Microarray technology allows the analysis of whole-genome transcription within a single hybridization, and has become a standard research tool. It is extremely important to minimize variation in order to obtain high quality microarray data that can be compared among experiments and laboratories. The majority of facilities implement manual hybridization approaches for microarray studies. We developed an automated method for cDNA microarray hybridization that uses equivalent pre-hybridization, hybridization and washing conditions to the suggested manual protocol. The automated method significantly decreased variability across microarray slides compared to manual hybridization. Although normalized signal intensities for buffer-only spots across the chips were identical, significantly reduced variation and inter-quartile ranges were obtained using the automated workstation. This decreased variation led to improved correlation among technical replicates across slides in both the Cy3 and Cy5 channels.

  10. Human somatostatin I: sequence of the cDNA.

    PubMed Central

    Shen, L P; Pictet, R L; Rutter, W J

    1982-01-01

    RNA has been isolated from a human pancreatic somatostatinoma and used to prepare a cDNA library. After prescreening, clones containing somatostatin I sequences were identified by hybridization with an anglerfish somatostatin I-cloned cDNA probe. From the nucleotide sequence of two of these clones, we have deduced an essentially full-length mRNA sequence, including the preprosomatostatin coding region, 105 nucleotides from the 5' untranslated region and the complete 150-nucleotide 3' untranslated region. The coding region predicts a 116-amino acid precursor protein (Mr, 12,727) that contains somatostatin-14 and -28 at its COOH terminus. The predicted amino acid sequence of human somatostatin-28 is identical to that of somatostatin-28 isolated from the porcine and ovine species. A comparison of the amino acid sequences of human and anglerfish preprosomatostatin I indicated that the COOH-terminal region encoding somatostatin-14 and the adjacent 6 amino acids are highly conserved, whereas the remainder of the molecule, including the signal peptide region, is more divergent. However, many of the amino acid differences found in the pro region of the human and anglerfish proteins are conservative changes. This suggests that the propeptides have a similar secondary structure, which in turn may imply a biological function for this region of the molecule. PMID:6126875

  11. cDNA sequences of two apolipoproteins from lamprey

    SciTech Connect

    Pontes, M.; Xu, X.; Graham, D.; Riley, M.; Doolittle, R.F.

    1987-03-24

    The messages for two small but abundant apolipoproteins found in lamprey blood plasma were cloned with the aid of oligonucleotide probes based on amino-terminal sequences. In both cases, numerous clones were identified in a lamprey liver cDNA library, consistent with the great abundance of these proteins in lamprey blood. One of the cDNAs (LAL1) has a coding region of 105 amino acids that corresponds to a 21-residue signal peptide, a putative 8-residue propeptide, and the 76-residue mature protein found in blood. The other cDNA (LAL2) codes for a total of 191 residues, the first 23 of which constitute a signal peptide. The two proteins, which occur in the high-density lipoprotein fraction of ultracentrifuged plasma, have amino acid compositions similar to those of apolipoproteins found in mammalian blood; computer analysis indicates that the sequences are largely helix-permissive. When the sequences were searched against an amino acid sequence data base, rat apolipoprotein IV was the best matching candidate in both cases. Although a reasonable alignment can be made with that sequence and LAL1, definitive assignment of the two lamprey proteins to typical mammalian classes cannot be made at this point.

  12. Viral Discovery and Sequence Recovery Using DNA Microarrays

    PubMed Central

    Wang, David; Urisman, Anatoly; Liu, Yu-Tsueng; Springer, Michael; Ksiazek, Thomas G; Erdman, Dean D; Mardis, Elaine R; Hickenbotham, Matthew; Magrini, Vincent; Eldred, James; Latreille, J. Phillipe; Wilson, Richard K; Ganem, Don

    2003-01-01

    Because of the constant threat posed by emerging infectious diseases and the limitations of existing approaches used to identify new pathogens, there is a great demand for new technological methods for viral discovery. We describe herein a DNA microarray-based platform for novel virus identification and characterization. Central to this approach was a DNA microarray designed to detect a wide range of known viruses as well as novel members of existing viral families; this microarray contained the most highly conserved 70mer sequences from every fully sequenced reference viral genome in GenBank. During an outbreak of severe acute respiratory syndrome (SARS) in March 2003, hybridization to this microarray revealed the presence of a previously uncharacterized coronavirus in a viral isolate cultivated from a SARS patient. To further characterize this new virus, approximately 1 kb of the unknown virus genome was cloned by physically recovering viral sequences hybridized to individual array elements. Sequencing of these fragments confirmed that the virus was indeed a new member of the coronavirus family. This combination of array hybridization followed by direct viral sequence recovery should prove to be a general strategy for the rapid identification and characterization of novel viruses and emerging infectious disease. PMID:14624234

  13. Phylogenetic relationships of the Gomphales based on nuc-25S-rDNA, mit-12S-rDNA, and mit-atp6-DNA combined sequences

    Treesearch

    Admir J. Giachini; Kentaro Hosaka; Eduardo Nouhra; Joseph Spatafora; James M. Trappe

    2010-01-01

    Phylogenetic relationships among Geastrales, Gomphales, Hysterangiales, and Phallales were estimated via combined sequences: nuclear large subunit ribosomal DNA (nuc-25S-rDNA), mitochondrial small subunit ribosomal DNA (mit-12S-rDNA), and mitochondrial atp6 DNA (mit-atp6-DNA). Eighty-one taxa comprising 19 genera and 58 species...

  14. The implementation of bit-parallelism for DNA sequence alignment

    NASA Astrophysics Data System (ADS)

    Setyorini; Kuspriyanto; Widyantoro, D. H.; Pancoro, A.

    2017-05-01

    Dynamic programming (DP) remains the central algorithm of biological sequence alignment, and computing the matching scores is its most time-consuming step. Bit-parallelism is an approximate string matching technique that converts cell-by-cell processing of the DP matrix into word-unit processing, where each machine word holds a group of cells. It computes the scores column-wise. Borrowing from the way computer systems process whole words, the technique promises to reduce the time spent computing scores in the DP matrix. In this paper, we implement the bit-parallelism technique for DNA sequence alignment. Our implementation spends less time on score computation but still needs improvement in the reconstruction process.
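
    One well-known instance of column-wise bit-parallel scoring is the Myers bit-vector algorithm in the formulation by Hyyrö, which computes unit-cost edit distance. The abstract does not say which variant the authors implemented, so the Python sketch below is offered only as an illustration of the technique; it relies on Python's arbitrary-size integers in place of fixed machine words.

```python
def bit_parallel_edit_distance(pattern: str, text: str) -> int:
    """Global (Levenshtein) edit distance, one DP column per text character.

    Vertical score differences of a whole column are packed into two bit
    vectors: vp (difference +1) and vn (difference -1).
    """
    m = len(pattern)
    if m == 0:
        return len(text)

    # Match masks: bit i of peq[c] is set when pattern[i] == c.
    peq = {}
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)

    mask = (1 << m) - 1      # keeps every vector at m bits
    last = 1 << (m - 1)      # bit for the bottom row of the DP matrix
    vp, vn = mask, 0         # first column is 0, 1, 2, ..., m
    score = m

    for ch in text:
        pm = peq.get(ch, 0)
        d0 = ((((pm & vp) + vp) ^ vp) | pm | vn) & mask   # diagonal difference is 0
        hp = (vn | ~(d0 | vp)) & mask                     # horizontal difference +1
        hn = d0 & vp                                      # horizontal difference -1
        if hp & last:
            score += 1
        if hn & last:
            score -= 1
        hp = ((hp << 1) | 1) & mask   # shift in +1: the top row grows by one per column
        hn = (hn << 1) & mask
        vp = (hn | ~(d0 | hp)) & mask
        vn = hp & d0
    return score
```

    Each loop iteration updates an entire DP column with a handful of word operations, which is where the speed-up over cell-by-cell scoring comes from. The sketch returns only the score; recovering the alignment itself needs extra bookkeeping, which is the step the abstract flags as needing improvement.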

  15. A syntactic pattern recognition system for DNA sequences

    SciTech Connect

    Searls, D.; Dong, S.

    1993-12-31

    The authors review both theoretical and practical results of a linguistic approach to studying the structure of features of DNA sequences. Using generative grammars, complex assemblages can not only be described and analyzed abstractly, but also concretely, such that features can be searched for by a general-purpose parser. Their parser, called GENLANG, uses an extended logic grammar formalism and has found features as complex as tRNA genes, group I introns, and protein-encoding genes, within input sequences on a genomic scale.

  16. Detection theory in identification of RNA-DNA sequence differences using RNA-sequencing.

    PubMed

    Toung, Jonathan M; Lahens, Nicholas; Hogenesch, John B; Grant, Gregory

    2014-01-01

    Advances in sequencing technology have allowed for detailed analyses of the transcriptome at single-nucleotide resolution, facilitating the study of RNA editing or sequence differences between RNA and DNA genome-wide. In humans, two types of post-transcriptional RNA editing processes are known to occur: A-to-I deamination by ADAR and C-to-U deamination by APOBEC1. In addition to these sequence differences, researchers have reported the existence of all 12 types of RNA-DNA sequence differences (RDDs); however, the validity of these claims is debated, as many studies claim that technical artifacts account for the majority of these non-canonical sequence differences. In this study, we used a detection theory approach to evaluate the performance of RNA-Sequencing (RNA-Seq) and associated aligners in accurately identifying RNA-DNA sequence differences. By generating simulated RNA-Seq datasets containing RDDs, we assessed the effect of alignment artifacts and sequencing error on the sensitivity and false discovery rate of RDD detection. Overall, we found that even in the presence of sequencing errors, false negative and false discovery rates of RDD detection can be contained below 10% with relatively lenient thresholds. We also assessed the ability of various filters to target false positive RDDs and found them to be effective in discriminating between true and false positives. Lastly, we used the optimal thresholds we identified from our simulated analyses to identify RDDs in a human lymphoblastoid cell line. We found approximately 6,000 RDDs, the majority of which are A-to-G edits and likely to be mediated by ADAR. Moreover, we found the majority of non A-to-G RDDs to be associated with poorer alignments and conclude from these results that the evidence for widespread non-canonical RDDs in humans is weak. Overall, we found RNA-Seq to be a powerful technique for surveying RDDs genome-wide when coupled with the appropriate thresholds and filters.
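
    The quantities this abstract reports (sensitivity, false negative rate, false discovery rate) follow directly from comparing the simulated, known RDD sites against the sites a pipeline calls. A minimal Python sketch, assuming each site is represented by a hashable tuple such as (chromosome, position, change); this representation is an illustrative choice, not the authors' data format.

```python
def detection_metrics(true_sites, called_sites):
    """Compare called sites against the known (simulated) sites.

    Returns sensitivity, false negative rate, and false discovery rate.
    Rates with an empty denominator are reported as 0.0.
    """
    true_sites, called_sites = set(true_sites), set(called_sites)
    tp = len(true_sites & called_sites)   # true sites that were called
    fp = len(called_sites - true_sites)   # called sites that are not real
    fn = len(true_sites - called_sites)   # true sites that were missed

    positives = tp + fn
    calls = tp + fp
    return {
        "sensitivity": tp / positives if positives else 0.0,
        "false_negative_rate": fn / positives if positives else 0.0,
        "false_discovery_rate": fp / calls if calls else 0.0,
    }
```

    Sweeping a calling threshold and recomputing these values at each setting gives the trade-off curve from which thresholds like those mentioned in the abstract can be chosen.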

  17. Sequences sufficient for programming imprinted germline DNA methylation defined.

    PubMed

    Park, Yoon Jung; Herman, Herry; Gao, Ying; Lindroth, Anders M; Hu, Benjamin Y; Murphy, Patrick J; Putnam, James R; Soloway, Paul D

    2012-01-01

    Epigenetic marks are fundamental to normal development, but little is known about signals that dictate their placement. Insights have been provided by studies of imprinted loci in mammals, where monoallelic expression is epigenetically controlled. Imprinted expression is regulated by DNA methylation programmed during gametogenesis in a sex-specific manner and maintained after fertilization. At Rasgrf1 in mouse, paternal-specific DNA methylation on a differential methylation domain (DMD) requires downstream tandem repeats. The DMD and repeats constitute a binary switch regulating paternal-specific expression. Here, we define sequences sufficient for imprinted methylation using two transgenic mouse lines: One carries the entire Rasgrf1 cluster (RC); the second carries only the DMD and repeats (DR) from Rasgrf1. The RC transgene recapitulated all aspects of imprinting seen at the endogenous locus. DR underwent proper DNA methylation establishment in sperm and erasure in oocytes, indicating the DMD and repeats are sufficient to program imprinted DNA methylation in germlines. Both transgenes produce a DMD-spanning pit-RNA, previously shown to be necessary for imprinted DNA methylation at the endogenous locus. We show that when pit-RNA expression is controlled by the repeats, it regulates DNA methylation in cis only and not in trans. Interestingly, pedigree history dictated whether established DR methylation patterns were maintained after fertilization. When DR was paternally transmitted followed by maternal transmission, the unmethylated state that was properly established in the female germlines could not be maintained. This provides a model for transgenerational epigenetic inheritance in mice.

  18. cisExpress: motif detection in DNA sequences.

    PubMed

    Triska, Martin; Grocutt, David; Southern, James; Murphy, Denis J; Tatarinova, Tatiana

    2013-09-01

    One of the major challenges for contemporary bioinformatics is the analysis and accurate annotation of genomic datasets to enable extraction of useful information about the functional role of DNA sequences. This article describes a novel genome-wide statistical approach to the detection of specific DNA sequence motifs based on similarities between the promoters of similarly expressed genes. This new tool, cisExpress, is especially designed for use with large datasets, such as those generated by publicly accessible whole genome and transcriptome projects. cisExpress uses a task farming algorithm to exploit all available computational cores within a shared memory node. We demonstrate the robust nature and validity of the proposed method. It is applicable for use with a wide range of genomic databases for any species of interest. cisExpress is available at www.cisexpress.org.

  19. Silicene as a new potential DNA sequencing device

    NASA Astrophysics Data System (ADS)

    Amorim, Rodrigo G.; Scheicher, Ralph H.

    2015-04-01

    Silicene, a hexagonal buckled 2D allotrope of silicon, shows potential as a platform for numerous new applications, and may allow for easier integration with existing silicon-based microelectronics than graphene. Here, we show that silicene could function as an electrical DNA sequencing device. We investigated the stability of this novel nano-bio system, its electronic properties and the pronounced effects on the transverse electronic transport, i.e., changes in the transmission and the conductance caused by adsorption of each nucleobase, explored by us through the non-equilibrium Green’s function method. Intriguingly, despite the relatively weak interaction between nucleobases and silicene, significant changes in the transmittance at zero bias are predicted by us, in particular for the two nucleobases cytosine and guanine. Our findings suggest that silicene could be utilized as an integrated-circuit biosensor as part of a lab-on-a-chip device for DNA sequencing.

  20. cisExpress: motif detection in DNA sequences

    PubMed Central

    Triska, Martin; Grocutt, David; Southern, James; Murphy, Denis J.; Tatarinova, Tatiana

    2013-01-01

    Motivation: One of the major challenges for contemporary bioinformatics is the analysis and accurate annotation of genomic datasets to enable extraction of useful information about the functional role of DNA sequences. This article describes a novel genome-wide statistical approach to the detection of specific DNA sequence motifs based on similarities between the promoters of similarly expressed genes. This new tool, cisExpress, is especially designed for use with large datasets, such as those generated by publicly accessible whole genome and transcriptome projects. cisExpress uses a task farming algorithm to exploit all available computational cores within a shared memory node. We demonstrate the robust nature and validity of the proposed method. It is applicable for use with a wide range of genomic databases for any species of interest. Availability: cisExpress is available at www.cisexpress.org. Contact: tatiana.tatarinova@usc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23793750