Science.gov

Sample records for automated dna sequencing

  1. Automated DNA Sequencing System

    SciTech Connect

    Armstrong, G.A.; Ekkebus, C.P.; Hauser, L.J.; Kress, R.L.; Mural, R.J.

    1999-04-25

    Oak Ridge National Laboratory (ORNL) is developing a core DNA sequencing facility to support biological research endeavors at ORNL and to conduct basic sequencing automation research. This facility is novel because its development is based on existing standard biology laboratory equipment; thus, the development process is of interest to the many small laboratories trying to use automation to control costs and increase throughput. Before automation, biology laboratory personnel purified DNA, completed cycle sequencing, and prepared 96-well sample plates with commercially available hardware designed specifically for each step in the process. Following purification and thermal cycling, an automated sequencing machine was used for the sequencing. A technician handled all movement of the 96-well sample plates between machines. To automate the process, ORNL is adding a CRS Robotics A-465 arm, ABI 377 sequencing machine, automated centrifuge, automated refrigerator, and possibly an automated SpeedVac. The entire system will be integrated with one central controller that will direct each machine and the robot. The goal of this system is to completely automate the sequencing procedure from bacterial cell samples through ready-to-be-sequenced DNA and ultimately to completed sequence. The system will be flexible and will accommodate different chemistries than existing automated sequencing lines. The system will be expanded in the future to include colony picking and/or actual sequencing. This discrete-event DNA sequencing system will demonstrate that smaller sequencing labs can achieve cost-effective automation as the laboratory grows.

  2. Automated DNA sequencing.

    PubMed

    Wallis, Yvonne; Morrell, Natalie

    2011-01-01

    Fluorescent cycle sequencing of PCR products is a multistage process and several methodologies are available to perform each stage. This chapter will describe the more commonly utilised dye-terminator cycle sequencing approach using BigDye® terminator chemistry (Applied Biosystems) ready for analysis on a 3730 DNA genetic analyzer. Even though DNA sequencing is one of the most common and robust techniques performed in molecular laboratories it may not always produce desirable results. The causes of the most common problems will also be discussed in this chapter. PMID:20938839

  3. Use of an automated capillary DNA sequencer to investigate the interaction of cisplatin with telomeric DNA sequences.

    PubMed

    Paul, Moumita; Murray, Vincent

    2012-03-01

    The determination of the sequence selectivity of DNA-damaging agents is very important in elucidating the mechanism of action of anti-tumour drugs. The development of automated capillary DNA sequencers with fluorescent labelling has enabled a more precise method for DNA sequence specificity analysis. In this work we utilized the ABI 3730 capillary sequencer with laser-induced fluorescence to examine the sequence selectivity of cisplatin with purified DNA sequences. The use of this automated machine enabled a higher degree of precision of both position and intensity of cisplatin-DNA adducts than previously possible with manual and automated slab gel procedures. A problem with artefact bands was overcome by ethanol precipitation. It was found that cisplatin strongly formed adducts with telomeric DNA sequences. PMID:21678458

  4. Automated hybridization/imaging device for fluorescent multiplex DNA sequencing

    DOEpatents

    Weiss, Robert B.; Kimball, Alvin W.; Gesteland, Raymond F.; Ferguson, F. Mark; Dunn, Diane M.; Di Sera, Leonard J.; Cherry, Joshua L.

    1995-01-01

    A method is disclosed for automated multiplex sequencing of DNA with an integrated automated imaging hybridization chamber system. This system comprises a hybridization chamber device for mounting a membrane containing size-fractionated multiplex sequencing reaction products, apparatus for fluid delivery to the chamber device, imaging apparatus for light delivery to the membrane and image recording of fluorescence emanating from the membrane while in the chamber device, and programmable controller apparatus for controlling operation of the system. The multiplex reaction products are hybridized with a probe, then an enzyme (such as alkaline phosphatase) is bound to a binding moiety on the probe, and a fluorogenic substrate (such as a benzothiazole derivative) is introduced into the chamber device by the fluid delivery apparatus. The enzyme converts the fluorogenic substrate into a fluorescent product which, when illuminated in the chamber device with a beam of light from the imaging apparatus, excites fluorescence of the fluorescent product to produce a pattern of hybridization. The pattern of hybridization is imaged by a CCD camera component of the imaging apparatus to obtain a series of digital signals. These signals are converted by the controller apparatus into a string of nucleotides corresponding to the nucleotide sequence an automated sequence reader. The method and apparatus are also applicable to other membrane-based applications such as colony and plaque hybridization and Southern, Northern, and Western blots.

  5. Automated hybridization/imaging device for fluorescent multiplex DNA sequencing

    DOEpatents

    Weiss, R.B.; Kimball, A.W.; Gesteland, R.F.; Ferguson, F.M.; Dunn, D.M.; Di Sera, L.J.; Cherry, J.L.

    1995-11-28

    A method is disclosed for automated multiplex sequencing of DNA with an integrated automated imaging hybridization chamber system. This system comprises a hybridization chamber device for mounting a membrane containing size-fractionated multiplex sequencing reaction products, apparatus for fluid delivery to the chamber device, imaging apparatus for light delivery to the membrane and image recording of fluorescence emanating from the membrane while in the chamber device, and programmable controller apparatus for controlling operation of the system. The multiplex reaction products are hybridized with a probe, then an enzyme (such as alkaline phosphatase) is bound to a binding moiety on the probe, and a fluorogenic substrate (such as a benzothiazole derivative) is introduced into the chamber device by the fluid delivery apparatus. The enzyme converts the fluorogenic substrate into a fluorescent product which, when illuminated in the chamber device with a beam of light from the imaging apparatus, excites fluorescence of the fluorescent product to produce a pattern of hybridization. The pattern of hybridization is imaged by a CCD camera component of the imaging apparatus to obtain a series of digital signals. These signals are converted by the controller apparatus into a string of nucleotides corresponding to the nucleotide sequence an automated sequence reader. The method and apparatus are also applicable to other membrane-based applications such as colony and plaque hybridization and Southern, Northern, and Western blots. 9 figs.

  6. A new technique for determining the distribution of N7-methylguanine using an automated DNA sequencer.

    PubMed

    Shoukry, S; Anderson, M W; Glickman, B W

    1991-11-01

    We have developed a method to determine rapidly the sequence specificity of DNA alkylation resulting from chemical treatment. The utility of this approach is demonstrated here in a study of the sequence specificity of alkylation by dimethylsulphate (DMS). The method is independent of the sequence chosen and makes use of the polymerase chain reaction (PCR) to generate a fluorescently labelled DNA target. In this study, a 302 bp segment of the Escherichia coli lacI gene was amplified and the product purified by liquid chromatography on a Mono Q column. This DNA was alkylated with DMS and treated with hot piperidine to produce single-strand breaks at sites of N7 alkylation. The distribution of the break points, and hence the position and extent of alkylation, were determined on an Applied Biosystems 370A automated DNA sequencer. PMID:1682064

  7. Aptaligner: Automated Software for Aligning Pseudorandom DNA X-Aptamers from Next-Generation Sequencing Data

    PubMed Central

    2015-01-01

    Next-generation sequencing results from bead-based aptamer libraries have demonstrated that traditional DNA/RNA alignment software is insufficient. This is particularly true for X-aptamers containing specialty bases (W, X, Y, Z, ...) that are identified by special encoding. Thus, we sought an automated program that uses the inherent design scheme of bead-based X-aptamers to create a hypothetical reference library and Markov modeling techniques to provide improved alignments. Aptaligner provides this feature as well as length error and noise level cutoff features, is parallelized to run on multiple central processing units (cores), and sorts sequences from a single chip into projects and subprojects. PMID:24866698

  8. Aptaligner: automated software for aligning pseudorandom DNA X-aptamers from next-generation sequencing data.

    PubMed

    Lu, Emily; Elizondo-Riojas, Miguel-Angel; Chang, Jeffrey T; Volk, David E

    2014-06-10

    Next-generation sequencing results from bead-based aptamer libraries have demonstrated that traditional DNA/RNA alignment software is insufficient. This is particularly true for X-aptamers containing specialty bases (W, X, Y, Z, ...) that are identified by special encoding. Thus, we sought an automated program that uses the inherent design scheme of bead-based X-aptamers to create a hypothetical reference library and Markov modeling techniques to provide improved alignments. Aptaligner provides this feature as well as length error and noise level cutoff features, is parallelized to run on multiple central processing units (cores), and sorts sequences from a single chip into projects and subprojects. PMID:24866698

  9. Automation and integration of multiplexed on-line sample preparation with capillary electrophoresis for DNA sequencing

    SciTech Connect

    Tan, H.

    1999-03-31

    The purpose of this research is to develop a multiplexed sample processing system in conjunction with multiplexed capillary electrophoresis for high-throughput DNA sequencing. The concept from DNA template to called bases was first demonstrated with a manually operated single capillary system. Later, an automated microfluidic system with 8 channels based on the same principle was successfully constructed. The instrument automatically processes 8 templates through reaction, purification, denaturation, pre-concentration, injection, separation and detection in a parallel fashion. A multiplexed freeze/thaw switching principle and a distribution network were implemented to manage flow direction and sample transportation. Dye-labeled terminator cycle-sequencing reactions are performed in an 8-capillary array in a hot air thermal cycler. Subsequently, the sequencing ladders are directly loaded into a corresponding size-exclusion chromatographic column operated at {approximately} 60 C for purification. On-line denaturation and stacking injection for capillary electrophoresis is simultaneously accomplished at a cross assembly set at {approximately} 70 C. Not only the separation capillary array but also the reaction capillary array and purification columns can be regenerated after every run. DNA sequencing data from this system allow base calling up to 460 bases with accuracy of 98%.

  10. Automated DNA mutation detection using universal conditions direct sequencing: application to ten muscular dystrophy genes

    PubMed Central

    2009-01-01

    sequences are reported in this paper. Conclusion: This automated process allows laboratories to discover DNA variations in a short time and at low cost. PMID:19835634

  11. DNA Sequencing

    DOEpatents

    Tabor, Stanley; Richardson, Charles C.

    1995-04-25

    A method for sequencing a strand of DNA, including the steps of: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions favoring primer extension to form nucleic acid fragments complementary to the DNA to be sequenced; labelling the nucleic acid fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.

  12. High-speed automated DNA sequencing utilizing from-the-side laser excitation

    NASA Astrophysics Data System (ADS)

    Westphall, Michael S.; Brumley, Robert L., Jr.; Buxton, Erin C.; Smith, Lloyd M.

    1995-04-01

    The Human Genome Initiative is an ambitious international effort to map and sequence the three billion bases of DNA encoded in the human genome. If successfully completed, the resultant sequence database will be a tool of unparalleled power for biomedical research. One of the major challenges of this project is in the area of DNA sequencing technology. At this time, virtually all DNA sequencing is based upon the separation of DNA fragments in high resolution polyacrylamide gels. This method, as generally practiced, is one to two orders of magnitude too slow and expensive for the successful completion of the Human Genome Project. One reasonable approach to improved sequencing of DNA fragments is to increase the performance of such gel-based sequencing methods. Decreased sequencing times may be obtained by increasing the magnitude of the electric field employed. This is not possible with conventional sequencing, due to the fact that the additional heat associated with the increased electric field cannot be adequately dissipated. Recent developments in the use of thin gels have addressed this problem. Performing electrophoresis in ultrathin (50 to 100 microns) gels greatly increases the heat transfer efficiency, thus allowing the benefits of larger electric fields to be obtained. An increase in separation speed of about an order of magnitude is readily achieved. Thin gels have successfully been used in capillary and slab formats. A detection system has been designed for use with a multiple fluorophore sequencing strategy in horizontal ultrathin slab gels. The system employs laser through-the-side excitation and a cooled CCD detector; this allows for the parallel detection of up to 24 sets of four fluorescently labeled DNA sequencing reactions during their electrophoretic separation in ultrathin (115 micrometers) denaturing polyacrylamide gels. Four hundred bases of sequence information is obtained from 100 ng of M13 template DNA in an hour, corresponding to an
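As a rough worked example of the figures quoted in the abstract above, the sketch below multiplies the per-set read length by the number of reaction sets detected in parallel. It assumes, beyond what the abstract states, that all 24 sets yield the full 400 bases within the same hour; the function name `bases_per_hour` is illustrative only.

```python
def bases_per_hour(parallel_sets: int, bases_per_set: int, run_hours: float) -> float:
    """Aggregate read throughput for a run in which every parallel set
    produces the same number of bases (an assumption of this sketch)."""
    if run_hours <= 0:
        raise ValueError("run_hours must be positive")
    return parallel_sets * bases_per_set / run_hours


# Figures from the abstract: up to 24 sets, 400 bases each, in one hour.
throughput = bases_per_hour(parallel_sets=24, bases_per_set=400, run_hours=1.0)
print(throughput)  # 9600.0
```

Under those assumptions the instrument would read on the order of 9,600 bases per hour; real runs with fewer loaded sets or shorter reads would fall below that.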

  13. DNA Sequencing apparatus

    DOEpatents

    Tabor, Stanley; Richardson, Charles C.

    1992-01-01

    An automated DNA sequencing apparatus having a reactor for providing at least two series of DNA products formed from a single primer and a DNA strand, each DNA product of a series differing in molecular weight and having a chain terminating agent at one end; separating means for separating the DNA products to form a series of bands, the intensity of substantially all nearby bands in a different series being different; band reading means for determining the position an This invention was made with government support including a grant from the U.S. Public Health Service, contract number AI-06045. The U.S. government has certain rights in the invention.

  14. A fully automated 384 capillary array for DNA sequencer. Final report

    SciTech Connect

    Li, Qingbo; Kane, T

    2003-03-20

    Phase I: SpectruMedix has successfully developed an automatic 96-capillary array DNA prototype based on the multiplexed capillary electrophoresis system originated from Ames Laboratory-USDOE, Iowa State University. With computer control of all steps involved in a 96-capillary array running cycle, the prototype instrument (the SCE9600) is now capable of sequencing 450 base pairs (bp) per capillary, or 48,000 bp per instrument run within 2 hrs. Phase II of this grant involved the advancement of the core 96 capillary technologies, as well as designing a high density 384 capillary prototype. True commercialization of the 96 capillary instrument involved finalization of the gel matrix, streamlining the instrument hardware, creating a more reliable capillary cartridge, and further advancement of the data processing software. Together these silos of technology create a truly commercializable product (the SCE9610) capable of meeting the operation needs of the sequencing centers.
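The per-run figures in the abstract above can be checked with simple arithmetic. The sketch below is only that: the function names are illustrative, and the 384-capillary value is a hypothetical projection that assumes the same read length per capillary, which the abstract does not state.

```python
def run_yield_bp(capillaries: int, read_length_bp: int) -> int:
    """Total bases per run if every capillary gives the same read length."""
    return capillaries * read_length_bp


def read_length_for_yield(total_bp: int, capillaries: int) -> float:
    """Average read length per capillary implied by a quoted run total."""
    return total_bp / capillaries


# 96 capillaries at the quoted 450 bp each.
print(run_yield_bp(96, 450))               # 43200
# Read length implied by the quoted 48,000 bp per run.
print(read_length_for_yield(48_000, 96))   # 500.0
# Hypothetical projection for a 384-capillary array at 450 bp each.
print(run_yield_bp(384, 450))              # 172800
```

The product of the two quoted per-capillary figures (43,200 bp) is somewhat below the quoted 48,000 bp per run, which corresponds to about 500 bp per capillary; the abstract does not say which figure is the rounded one.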

  15. Algorithms for automated DNA assembly

    PubMed Central

    Densmore, Douglas; Hsiau, Timothy H.-C.; Kittleson, Joshua T.; DeLoache, Will; Batten, Christopher; Anderson, J. Christopher

    2010-01-01

    Generating a defined set of genetic constructs within a large combinatorial space provides a powerful method for engineering novel biological functions. However, the process of assembling more than a few specific DNA sequences can be costly, time consuming and error prone. Even if a correct theoretical construction scheme is developed manually, it is likely to be suboptimal by any number of cost metrics. Modular, robust and formal approaches are needed for exploring these vast design spaces. By automating the design of DNA fabrication schemes using computational algorithms, we can eliminate human error while reducing redundant operations, thus minimizing the time and cost required for conducting biological engineering experiments. Here, we provide algorithms that optimize the simultaneous assembly of a collection of related DNA sequences. We compare our algorithms to an exhaustive search on a small synthetic dataset and our results show that our algorithms can quickly find an optimal solution. Comparison with random search approaches on two real-world datasets shows that our algorithms can also quickly find lower-cost solutions for large datasets. PMID:20335162
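To illustrate what "reducing redundant operations" can mean when several related constructs are built together, here is a minimal sketch. It is not the authors' algorithm: it assumes strictly left-to-right pairwise joins and only reuses shared prefixes, and the part names are hypothetical.

```python
from typing import Iterable, Sequence, Set, Tuple


def joins_without_sharing(goals: Iterable[Sequence[str]]) -> int:
    """Pairwise joins needed if each construct is built independently."""
    return sum(len(goal) - 1 for goal in goals if len(goal) > 0)


def joins_with_shared_prefixes(goals: Iterable[Sequence[str]]) -> int:
    """Pairwise joins needed if any intermediate that is a common prefix
    of several constructs is built once and reused."""
    intermediates: Set[Tuple[str, ...]] = set()
    for goal in goals:
        # Every prefix of length >= 2 is the product of exactly one join.
        for end in range(2, len(goal) + 1):
            intermediates.add(tuple(goal[:end]))
    return len(intermediates)


# Hypothetical part names, for illustration only.
goals = [
    ("pA", "rbs", "gfp", "term"),
    ("pA", "rbs", "rfp", "term"),
    ("pB", "rbs", "gfp", "term"),
]
print(joins_without_sharing(goals))       # 9
print(joins_with_shared_prefixes(goals))  # 8
```

Even this naive prefix reuse saves one join on three constructs; a fuller optimizer would also consider other join orders and shared internal or suffix fragments.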

  16. Image analysis for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Palaniappan, Kannappan; Huang, Thomas S.

    1991-07-01

    There is a great deal of interest in automating the process of DNA (deoxyribonucleic acid) sequencing to support the analysis of genomic DNA such as the Human and Mouse Genome projects. In one class of gel-based sequencing protocols, autoradiograph images are generated in the final step and usually require manual interpretation to reconstruct the DNA sequence represented by the image. The need to handle a large volume of sequence information necessitates automation of the manual autoradiograph reading step through image analysis in order to reduce the length of time required to obtain sequence data and reduce transcription errors. Various adaptive image enhancement, segmentation and alignment methods were applied to autoradiograph images. The methods are adaptive to the local characteristics of the image such as noise, background signal, or presence of edges. Once the two-dimensional data is converted to a set of aligned one-dimensional profiles, waveform analysis is used to determine the location of each band, which represents one nucleotide in the sequence. Different classification strategies including a rule-based approach are investigated to map the profile signals, augmented with the original two-dimensional image data as necessary, to textual DNA sequence information.
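The band-location step described above can be pictured with a very small sketch. This is not the authors' waveform analysis: it simply reports local maxima of a one-dimensional lane profile that exceed a fixed height, and the threshold and sample values are made up for illustration.

```python
from typing import List, Sequence


def find_bands(profile: Sequence[float], min_height: float) -> List[int]:
    """Return indices of local maxima that are at least min_height.

    A point counts as a local maximum when it is strictly higher than its
    left neighbour and not lower than its right neighbour.
    """
    bands: List[int] = []
    for i in range(1, len(profile) - 1):
        is_peak = profile[i - 1] < profile[i] >= profile[i + 1]
        if is_peak and profile[i] >= min_height:
            bands.append(i)
    return bands


# Made-up intensity profile along one lane; the bump at index 10 is
# below the threshold and is treated as background.
lane = [0.1, 0.2, 0.9, 0.3, 0.1, 0.15, 0.7, 0.8, 0.2, 0.1, 0.25, 0.1]
print(find_bands(lane, min_height=0.5))  # [2, 7]
```

A fixed threshold is the weakest part of this sketch; the adaptive methods mentioned in the abstract would instead adjust to local noise and background.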

  17. Initial analysis of non-typical Leber hereditary optic neuropathy (LHON) at onset and late developing demyelinating disease in Italian patients by SSCP and automated DNA sequence analysis

    SciTech Connect

    Sartore, M.; Semeraro, A.; Fortina, P.

    1994-09-01

    LHON is a mitochondrial genetic disease characterized by maternal inheritance and late onset of blindness caused by bilateral retinal degeneration. A number of molecular defects are known affecting expression of seven mitochondrial genes encoding subunits of respiratory chain complex I, III and IV. We screened genomic DNA from Italian patients for seven of the known point mutations in the ND-1, ND-4 and ND-6 subunits of complex I by PCR followed by SSCP and restriction enzyme digestion. Most of the patients had nonfamilial bilateral visual loss with partial or no recovery and normal neurological examination. Fundoscopic examination revealed that none of the patients had features typical of LHON. Nine of 21 patients (43%) showed multifocal CNS demyelination on MRI. Our results show aberrant SSCP patterns for a PCR product from the ND-4 subunit in one affected child and his mother. Sfa NI and Mae III digestions suggested the absence of a previously defined LHON mutation, and automated DNA sequence analysis revealed two A to G neutral sequence polymorphisms in the third position of codons 351 and 353. In addition, PCR products from the same two samples and an unrelated one showed abnormal SSCP patterns for the ND-1 subunit region of complex I due to the presence of a T to C change at nt 4,216 which was demonstrated after Nla III digestion of PCR products and further confirmed by DNA sequence analysis. Our results indicate that additional defects are present in the Italian population, and identification of abnormal SSCP patterns followed by targeted automated DNA sequence analysis is a reasonable strategy for delineation of new LHON mutations.

  18. j5 DNA assembly design automation.

    PubMed

    Hillson, Nathan J

    2014-01-01

    Modern standardized methodologies, described in detail in the previous chapters of this book, have enabled the software-automated design of optimized DNA construction protocols. This chapter describes how to design (combinatorial) scar-less DNA assembly protocols using the web-based software j5. j5 assists biomedical and biotechnological researchers in constructing DNA by automating the design of optimized protocols for flanking homology sequence as well as type IIS endonuclease-mediated DNA assembly methodologies. Unlike any other software tool available today, j5 designs scar-less combinatorial DNA assembly protocols, performs a cost-benefit analysis to identify which portions of an assembly process would be less expensive to outsource to a DNA synthesis service provider, and designs hierarchical DNA assembly strategies to mitigate anticipated poor assembly junction sequence performance. Software integrated with j5 adds significant value to the j5 design process through graphical user-interface enhancement and downstream liquid-handling robotic laboratory automation. PMID:24395369

  19. Comparison of Boiling and Robotics Automation Method in DNA Extraction for Metagenomic Sequencing of Human Oral Microbes.

    PubMed

    Yamagishi, Junya; Sato, Yukuto; Shinozaki, Natsuko; Ye, Bin; Tsuboi, Akito; Nagasaki, Masao; Yamashita, Riu

    2016-01-01

    The rapid improvement of next-generation sequencing performance now enables us to analyze huge sample sets with more than ten thousand specimens. However, DNA extraction can still be a limiting step in such metagenomic approaches. In this study, we analyzed human oral microbes to compare the performance of three DNA extraction methods: PowerSoil (a method widely used in this field), QIAsymphony (a robotics method), and a simple boiling method. Dental plaque was initially collected from three volunteers in the pilot study and then expanded to 12 volunteers in the follow-up study. Bacterial flora was estimated by sequencing the V4 region of 16S rRNA following species-level profiling. Our results indicate that the efficiency of PowerSoil and QIAsymphony was comparable to the boiling method. Therefore, the boiling method may be a promising alternative because of its simplicity, cost effectiveness, and short handling time. Moreover, this method was reliable for estimating bacterial species and could be used in the future to examine the correlation between oral flora and health status. Despite this, differences in the efficiency of DNA extraction for various bacterial species were observed among the three methods. Based on these findings, there is no "gold standard" for DNA extraction. In future, we suggest that the DNA extraction method should be selected on a case-by-case basis considering the aims and specimens of the study. PMID:27104353

  20. Comparison of Boiling and Robotics Automation Method in DNA Extraction for Metagenomic Sequencing of Human Oral Microbes

    PubMed Central

    Shinozaki, Natsuko; Ye, Bin; Tsuboi, Akito; Nagasaki, Masao; Yamashita, Riu

    2016-01-01

    The rapid improvement of next-generation sequencing performance now enables us to analyze huge sample sets with more than ten thousand specimens. However, DNA extraction can still be a limiting step in such metagenomic approaches. In this study, we analyzed human oral microbes to compare the performance of three DNA extraction methods: PowerSoil (a method widely used in this field), QIAsymphony (a robotics method), and a simple boiling method. Dental plaque was initially collected from three volunteers in the pilot study and then expanded to 12 volunteers in the follow-up study. Bacterial flora was estimated by sequencing the V4 region of 16S rRNA following species-level profiling. Our results indicate that the efficiency of PowerSoil and QIAsymphony was comparable to the boiling method. Therefore, the boiling method may be a promising alternative because of its simplicity, cost effectiveness, and short handling time. Moreover, this method was reliable for estimating bacterial species and could be used in the future to examine the correlation between oral flora and health status. Despite this, differences in the efficiency of DNA extraction for various bacterial species were observed among the three methods. Based on these findings, there is no “gold standard” for DNA extraction. In future, we suggest that the DNA extraction method should be selected on a case-by-case basis considering the aims and specimens of the study. PMID:27104353

  1. A Microfluidic Device for Preparing Next Generation DNA Sequencing Libraries and for Automating Other Laboratory Protocols That Require One or More Column Chromatography Steps

    PubMed Central

    Tan, Swee Jin; Phan, Huan; Gerry, Benjamin Michael; Kuhn, Alexandre; Hong, Lewis Zuocheng; Min Ong, Yao; Poon, Polly Suk Yean; Unger, Marc Alexander; Jones, Robert C.; Quake, Stephen R.; Burkholder, William F.

    2013-01-01

    Library preparation for next-generation DNA sequencing (NGS) remains a key bottleneck in the sequencing process which can be relieved through improved automation and miniaturization. We describe a microfluidic device for automating laboratory protocols that require one or more column chromatography steps and demonstrate its utility for preparing Next Generation sequencing libraries for the Illumina and Ion Torrent platforms. Sixteen different libraries can be generated simultaneously with significantly reduced reagent cost and hands-on time compared to manual library preparation. Using an appropriate column matrix and buffers, size selection can be performed on-chip following end-repair, dA tailing, and linker ligation, so that the libraries eluted from the chip are ready for sequencing. The core architecture of the device ensures uniform, reproducible column packing without user supervision and accommodates multiple routine protocol steps in any sequence, such as reagent mixing and incubation; column packing, loading, washing, elution, and regeneration; capture of eluted material for use as a substrate in a later step of the protocol; and removal of one column matrix so that two or more column matrices with different functional properties can be used in the same protocol. The microfluidic device is mounted on a plastic carrier so that reagents and products can be aliquoted and recovered using standard pipettors and liquid handling robots. The carrier-mounted device is operated using a benchtop controller that seals and operates the device with programmable temperature control, eliminating any requirement for the user to manually attach tubing or connectors. In addition to NGS library preparation, the device and controller are suitable for automating other time-consuming and error-prone laboratory protocols requiring column chromatography steps, such as chromatin immunoprecipitation. PMID:23894273

  2. DNA sequencing conference, 2

    SciTech Connect

    Cook-Deegan, R.M.; Venter, J.C.; Gilbert, W.; Mulligan, J.; Mansfield, B.K.

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  3. DNA Sequencing by Capillary Electrophoresis

    PubMed Central

    Karger, Barry L.; Guttman, Andras

    2009-01-01

    Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA sequencing methods have evolved from the labor-intensive slab gel electrophoresis, through automated multicapillary electrophoresis systems using fluorophore labeling with multispectral imaging, to the “next generation” technologies of cyclic array, hybridization based, nanopore and single molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes was only possible by the advent of modern sequencing technologies that was a result of step by step advances with a contribution of academics, medical personnel and instrument companies. While next generation sequencing is moving ahead at break-neck speed, the multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this perspective, we wish to overview the role of capillary electrophoresis in DNA sequencing based in part on several of our articles in this journal. PMID:19517496

  4. DNA sequencing: chemical methods

    SciTech Connect

    Ambrose, B.J.B.; Pless, R.C.

    1987-01-01

    Limited base-specific or base-selective cleavage of a defined DNA fragment yields polynucleotide products, the length of which correlates with the positions of the particular base (or bases) in the original fragment. Sverdlov and co-workers recognized the possibility of using this principle for the determination of DNA sequences. In 1977 a fully elaborated method was introduced based on this principle, which allowed routine analysis of DNA sequences over distances greater than 100 nucleotide units from a defined, radiolabeled terminus. Six procedures for partial cleavage were described. Simultaneous parallel resolution of an appropriate set of partial cleavage mixtures by polyacrylamide gel electrophoresis, followed by visualization of the radioactive bands by autoradiography, allows the deduction of nucleotide sequence.

  5. Automated DNA Base Pair Calling Algorithm

    Energy Science and Technology Software Center (ESTSC)

    1999-07-07

    The procedure solves the problem of calling the DNA base pair sequence from two channel electropherogram separations in an automated fashion. The core of the program involves a peak picking algorithm based upon first, second, and third derivative spectra for each electropherogram channel, signal levels as a function of time, peak spacing, base pair signal to noise sequence patterns, frequency vs ratio of the two channel histograms, and confidence levels generated during the run. The ratios of the two channels at peak centers can be used to accurately and reproducibly determine the base pair sequence. A further enhancement is a novel Gaussian deconvolution used to determine the peak heights used in generating the ratio.
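The idea of picking peaks and then taking the two-channel ratio at each peak center can be sketched as follows. This is a simplified illustration, not the ESTSC program: it uses only a first-difference sign change on the summed signal, omits the higher derivatives, peak-spacing checks and Gaussian deconvolution mentioned above, and stops short of mapping ratios to bases because the abstract gives no calibration. All names and sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def peak_ratios(
    ch1: Sequence[float], ch2: Sequence[float], min_signal: float
) -> List[Tuple[int, float]]:
    """Find peak centers in the summed signal and return (index, ch1/ch2).

    A peak center is a sample where the first difference changes from
    positive to non-positive (so the local curvature is negative) and the
    summed signal is at least min_signal.
    """
    if len(ch1) != len(ch2):
        raise ValueError("channels must have the same length")
    total = [a + b for a, b in zip(ch1, ch2)]
    first = [total[i + 1] - total[i] for i in range(len(total) - 1)]
    results: List[Tuple[int, float]] = []
    for i in range(1, len(total) - 1):
        rising_then_falling = first[i - 1] > 0 and first[i] <= 0
        # Skip peaks where channel 2 is zero to avoid division by zero.
        if rising_then_falling and total[i] >= min_signal and ch2[i] > 0:
            results.append((i, ch1[i] / ch2[i]))
    return results


# Made-up two-channel trace with two peaks of different channel ratios.
ch1 = [0, 1, 4, 1, 0, 1, 2, 1, 0]
ch2 = [0, 1, 2, 1, 0, 2, 8, 2, 0]
print(peak_ratios(ch1, ch2, min_signal=3))  # [(2, 2.0), (6, 0.25)]
```

In a real base caller each ratio would then be compared against calibrated ratio ranges, one per base, which would have to come from the instrument and dye chemistry in use.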

  6. Complementary DNA sequencing: Expressed sequence tags and human genome project

    SciTech Connect

    Adams, M.D.; Kelley, J.M.; Gocayne, J.D.; Dubnick, M.; Wu, A.; Olde, B.; Moreno, R.F.; Kerlavage, A.R.; McCombie, W.R.; Venter, J.C.; Polymeropoulos, M.H.; Hong Xiao; Merril, C.R.

    1991-06-21

    Automated partial DNA sequencing was conducted on more than 600 randomly selected human brain complementary DNA (cDNA) clones to generate expressed sequence tags (ESTs). ESTs have applications in the discovery of new human genes, mapping of the human genome, and identification of coding regions in genomic sequences. Of the sequences generated, 337 represent new genes, including 48 with significant similarity to genes from other organisms, such as a yeast RNA polymerase II subunit; Drosophila kinesin, Notch, and Enhancer of split; and a murine tyrosine kinase receptor. Forty-six ESTs were mapped to chromosomes after amplification by the polymerase chain reaction. This fast approach to cDNA characterization will facilitate the tagging of most human genes in a few years at a fraction of the cost of complete genomic sequencing, provide new genetic markers, and serve as a resource in diverse biological research fields.

  7. Indexing Similar DNA Sequences

    NASA Astrophysics Data System (ADS)

    Huang, Songbo; Lam, T. W.; Sung, W. K.; Tam, S. L.; Yiu, S. M.

    To study the genetic variations of a species, one basic operation is to search for occurrences of patterns in a large number of very similar genomic sequences. To build an indexing data structure on the concatenation of all sequences may require a lot of memory. In this paper, we propose a new scheme to index highly similar sequences by taking advantage of the similarity among the sequences. To store r sequences with k common segments, our index requires only O(n + NlogN) bits of memory, where n is the total length of the common segments and N is the total length of the distinct regions in all texts. The total length of all sequences is rn + N, and any scheme to store these sequences requires Ω(n + N) bits. Searching for a pattern P of length m takes O(m + m logN + m log(rk)psc(P) + occlogn), where psc(P) is the number of prefixes of P that appear as a suffix of some common segments and occ is the number of occurrences of P in all sequences. In practice, rk ≤ N, and psc(P) is usually a small constant. We have implemented our solution and evaluated our solution using real DNA sequences. The experiments show that the memory requirement of our solution is much less than that required by BWT built on the concatenation of all sequences. When compared to the other existing solution (RLCSA), we use less memory with faster searching time.

  8. Transposon facilitated DNA sequencing

    SciTech Connect

    Berg, D.E.; Berg, C.M.; Huang, H.V.

    1990-01-01

    The purpose of this research is to investigate and develop methods that exploit the power of bacterial transposable elements for large scale DNA sequencing. Our premise is that the use of transposons to put primer binding sites randomly in target DNAs should provide access to all portions of large DNA fragments, without the inefficiencies of methods involving random subcloning and attendant repetitive sequencing, or of sequential synthesis of many oligonucleotide primers that are used to march systematically along a DNA molecule. Two unrelated bacterial transposons, Tn5 and γδ, are being used because they have both proven useful for molecular analyses, and because they differ sufficiently in mechanism and specificity of transposition to merit parallel development.

  9. Construction and evaluation of a capillary electrophoresis DNA sequencer

    SciTech Connect

    Drossman, H.

    1992-01-01

    This dissertation describes the construction and evaluation of an automated DNA sequencer using capillary gel electrophoresis (CGE) for separating single-strand DNA fragments and a fluorescence detector for analyzing labeled fragments. Theories governing the electrophoretic separation of DNA, dispersion processes in CGE and high sensitivity fluorescence detection are reviewed. The CGE DNA sequencer is compared with current DNA sequencing instruments and with projections of future DNA sequencing instruments. Parameters affecting the limits of detection, DNA sample loading, sample mobility and resolution are evaluated. Predictions for the future of capillary electrophoresis for large-scale sequencing projects are presented.

  10. Automated DNA extraction from pollen in honey.

    PubMed

    Guertler, Patrick; Eicheldinger, Adelina; Muschler, Paul; Goerlich, Ottmar; Busch, Ulrich

    2014-04-15

    In recent years, honey has become subject of DNA analysis due to potential risks evoked by microorganisms, allergens or genetically modified organisms. However, so far, only a few DNA extraction procedures are available, mostly time-consuming and laborious. Therefore, we developed an automated DNA extraction method from pollen in honey based on a CTAB buffer-based DNA extraction using the Maxwell 16 instrument and the Maxwell 16 FFS Nucleic Acid Extraction System, Custom-Kit. We altered several components and extraction parameters and compared the optimised method with a manual CTAB buffer-based DNA isolation method. The automated DNA extraction was faster and resulted in higher DNA yield and sufficient DNA purity. Real-time PCR results obtained after automated DNA extraction are comparable to results after manual DNA extraction. No PCR inhibition was observed. The applicability of this method was further successfully confirmed by analysis of different routine honey samples. PMID:24295710

  11. Pyrosequencing sheds light on DNA sequencing.

    PubMed

    Ronaghi, M

    2001-01-01

    DNA sequencing is one of the most important platforms for the study of biological systems today. Sequence determination is most commonly performed using dideoxy chain termination technology. Recently, pyrosequencing has emerged as a new sequencing methodology. This technique is a widely applicable, alternative technology for the detailed characterization of nucleic acids. Pyrosequencing has the potential advantages of accuracy, flexibility, parallel processing, and can be easily automated. Furthermore, the technique dispenses with the need for labeled primers, labeled nucleotides, and gel-electrophoresis. This article considers key features regarding different aspects of pyrosequencing technology, including the general principles, enzyme properties, sequencing modes, instrumentation, and potential applications. PMID:11156611

  12. A Bioluminometric Method of DNA Sequencing

    NASA Technical Reports Server (NTRS)

    Ronaghi, Mostafa; Pourmand, Nader; Stolc, Viktor; Arnold, Jim (Technical Monitor)

    2001-01-01

    Pyrosequencing is a bioluminometric single-tube DNA sequencing method that takes advantage of co-operativity between four enzymes to monitor DNA synthesis. In this sequencing-by-synthesis method, a cascade of enzymatic reactions yields detectable light, which is proportional to incorporated nucleotides. Pyrosequencing has the advantages of accuracy, flexibility and parallel processing. It can be easily automated. Furthermore, the technique dispenses with the need for labeled primers, labeled nucleotides and gel-electrophoresis. In this chapter, the use of this technique for different applications is discussed.
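
    The statement that the emitted light is proportional to the number of incorporated nucleotides can be illustrated with an idealized pyrogram in Python. This is a sketch under simplifying assumptions (one light unit per incorporation, no noise, no incomplete extension); the function names, the dispensation order, and the example sequence are hypothetical.

```python
def pyrogram(synthesized, dispensation):
    """Idealized light signal for each dispensed nucleotide.

    `synthesized` is the strand being built, in order of synthesis.
    A dispensed base is incorporated once for every consecutive
    matching position, so a homopolymer run of length n gives signal n.
    """
    signals = []
    pos = 0
    for base in dispensation:
        run = 0
        while pos < len(synthesized) and synthesized[pos] == base:
            run += 1
            pos += 1
        signals.append(run)
    return signals


def decode(signals, dispensation):
    """Read the sequence back from the signal heights."""
    return "".join(base * n for base, n in zip(dispensation, signals))


order = "ACGT" * 3                   # cyclic dispensation, an assumption
signals = pyrogram("GGATTC", order)
print(signals)
print(decode(signals, order))
```

    In this idealization a signal of 2 marks a two-base homopolymer; in practice the proportionality becomes harder to resolve for longer runs, a limitation the sketch ignores.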

  13. Automated Identification of Nucleotide Sequences

    NASA Technical Reports Server (NTRS)

    Osman, Shariff; Venkateswaran, Kasthuri; Fox, George; Zhu, Dian-Hui

    2007-01-01

    STITCH is a computer program that processes raw nucleotide-sequence data to automatically remove unwanted vector information, perform reverse-complement comparison, stitch shorter sequences together to make longer ones to which the shorter ones presumably belong, and search against the user's choice of private and Internet-accessible public 16S rRNA databases. ["16S rRNA" denotes a ribosomal ribonucleic acid (rRNA) sequence that is common to all organisms.] In STITCH, a template 16S rRNA sequence is used to position forward and reverse reads. STITCH then automatically searches known 16S rRNA sequences in the user's chosen database(s) to find the sequence most similar to (the sequence that lies at the smallest edit distance from) each spliced sequence. The result of processing by STITCH is the identification of the most similar well-described bacterium. Whereas previously commercially available software for analyzing genetic sequences operates on one sequence at a time, STITCH can manipulate multiple sequences simultaneously to perform the aforementioned operations. A typical analysis of several dozen sequences (length of the order of 10^3 base pairs) by use of STITCH is completed in a few minutes, whereas such an analysis performed by use of prior software takes hours or days.
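
    Two of the operations named above, reverse-complementing a read and finding the database sequence at the smallest edit distance, can be sketched in Python as follows. This is not STITCH's code: the function names, the tiny in-memory "database", and the use of plain Levenshtein distance are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement each base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


def edit_distance(a, b):
    """Levenshtein distance by dynamic programming, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete from a
                            curr[j - 1] + 1,            # insert into a
                            prev[j - 1] + (ca != cb)))  # match or substitute
        prev = curr
    return prev[-1]


def closest_reference(query, references):
    """Name of the reference at the smallest edit distance from `query`."""
    return min(references, key=lambda name: edit_distance(query, references[name]))


# Hypothetical reference set and a reverse read with one mismatch.
references = {"ref_a": "ACGTACGT", "ref_b": "TTGACCAA"}
reverse_read = "TTGGTCAT"
query = reverse_complement(reverse_read)
print(query, closest_reference(query, references))
```

    The quadratic-time distance is adequate for reads of about a thousand bases against a modest database; a production tool would more likely use a heuristic or indexed search.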

  14. The Dynamics of DNA Sequencing.

    ERIC Educational Resources Information Center

    Morvillo, Nancy

    1997-01-01

    Describes a paper-and-pencil activity that helps students understand DNA sequencing and expands student understanding of DNA structure, replication, and gel electrophoresis. Appropriate for advanced biology students who are familiar with the Sanger method. (DDR)

  15. Biosensors for DNA sequence detection

    NASA Technical Reports Server (NTRS)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  16. Graphene nanodevices for DNA sequencing.

    PubMed

    Heerema, Stephanie J; Dekker, Cees

    2016-02-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with nanopores. Owing to its unique structure and properties, graphene provides interesting opportunities for the development of a new sequencing technology. In recent years, a wide range of creative ideas for graphene sequencers have been theoretically proposed and the first experimental demonstrations have begun to appear. Here, we review the different approaches to using graphene nanodevices for DNA sequencing, which involve DNA passing through graphene nanopores, nanogaps, and nanoribbons, and the physisorption of DNA on graphene nanostructures. We discuss the advantages and problems of each of these key techniques, and provide a perspective on the use of graphene in future DNA sequencing technology. PMID:26839258

  17. Graphene nanodevices for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Heerema, Stephanie J.; Dekker, Cees

    2016-02-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with nanopores. Owing to its unique structure and properties, graphene provides interesting opportunities for the development of a new sequencing technology. In recent years, a wide range of creative ideas for graphene sequencers have been theoretically proposed and the first experimental demonstrations have begun to appear. Here, we review the different approaches to using graphene nanodevices for DNA sequencing, which involve DNA passing through graphene nanopores, nanogaps, and nanoribbons, and the physisorption of DNA on graphene nanostructures. We discuss the advantages and problems of each of these key techniques, and provide a perspective on the use of graphene in future DNA sequencing technology.

  18. Automated cycle sequencing with Taquenase: protocols for internal labeling, dye primer and "doublex" simultaneous sequencing.

    PubMed

    Voss, H; Nentwich, U; Duthie, S; Wiemann, S; Benes, V; Zimmermann, J; Ansorge, W

    1997-08-01

    This paper describes automated cycle sequencing protocols for internal labeling, dye primer and "doublex" simultaneous sequencing using Taquenase, a new genetically modified DNA polymerase with increased thermostability. Sequencing performance both with labeled and unlabeled primer yields uniform unambiguous signals up to the resolution limit of the sequencing gels. Primer walking with internal labeling was successfully performed on P1-derived artificial chromosome (PAC) constructs with 130-kb inserts. Taquenase, a commercially available modified thermostable sequencing enzyme (delta 280, F667Y Taq DNA polymerase), incorporates a variety of fluorescent dNTPs carrying fluorescein isothiocyanate, TexasRed or Cy5 labels during the cycle-sequencing process with higher efficiency than other thermostable DNA polymerases. Comparison to other modified Taq DNA polymerases suggests that the particular N-terminal deletion of Taquenase rather than the presence of the F667Y mutation is responsible for the efficient incorporation and extension of labeled dNTPs. Taquenase makes feasible highly accurate "doublex" simultaneous cycle sequencing on both strands of template DNA with two internal labels or two dye-labeled primers in combination with the EMBL-2-dye DNA sequencing system, ARAKIS, or with two commercial DNA sequencers. It allows up to 2000 bases at > 99% accuracy to be determined in a single reaction. PMID:9266089

  19. Sequence independent amplification of DNA

    DOEpatents

    Bohlander, S.K.

    1998-03-24

    The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei. 25 figs.

  20. Sequence independent amplification of DNA

    DOEpatents

    Bohlander, Stefan K.

    1998-01-01

    The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei.

  1. Chromosome specific repetitive DNA sequences

    DOEpatents

    Moyzis, Robert K.; Meyne, Julianne

    1991-01-01

    A method is provided for determining specific nucleotide sequences useful in forming a probe which can identify specific chromosomes, preferably through in situ hybridization within the cell itself. In one embodiment, chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members. This invention is the result of a contract with the Department of Energy (Contract No. W-7405-ENG-36).

  2. j5 DNA assembly design automation software.

    PubMed

    Hillson, Nathan J; Rosengarten, Rafael D; Keasling, Jay D

    2012-01-20

    Recent advances in Synthetic Biology have yielded standardized and automatable DNA assembly protocols that enable a broad range of biotechnological research and development. Unfortunately, the experimental design required for modern scar-less multipart DNA assembly methods is frequently laborious, time-consuming, and error-prone. Here, we report the development and deployment of a web-based software tool, j5, which automates the design of scar-less multipart DNA assembly protocols including SLIC, Gibson, CPEC, and Golden Gate. The key innovations of the j5 design process include cost optimization, leveraging DNA synthesis when cost-effective to do so, the enforcement of design specification rules, hierarchical assembly strategies to mitigate likely assembly errors, and the instruction of manual or automated construction of scar-less combinatorial DNA libraries. Using a GFP expression testbed, we demonstrate that j5 designs can be executed with the SLIC, Gibson, or CPEC assembly methods, used to build combinatorial libraries with the Golden Gate assembly method, and applied to the preparation of linear gene deletion cassettes for E. coli. The DNA assembly design algorithms reported here are generally applicable to broad classes of DNA construction methodologies and could be implemented to supplement other DNA assembly design tools. Taken together, these innovations save researchers time and effort, reduce the frequency of user design errors and off-target assembly products, decrease research costs, and enable scar-less multipart and combinatorial DNA construction at scales unfeasible without computer-aided design. PMID:23651006

  3. Automated Sequence Processor: Something Old, Something New

    NASA Technical Reports Server (NTRS)

    Streiffert, Barbara; Schrock, Mitchell; Fisher, Forest; Himes, Terry

    2012-01-01

    High productivity is required for operations teams to meet schedules. Risk must be minimized. Scripting is used to automate processes. Scripts perform essential operations functions. The Automated Sequence Processor (ASP) was a grass-roots task built to automate the command uplink process. A system engineering task for ASP revitalization was organized. ASP is a set of approximately 200 scripts written in Perl, C Shell, AWK and other scripting languages. ASP processes, checks, and packages non-interactive commands automatically. Non-interactive commands are guaranteed to be safe and have been checked by hardware or software simulators. ASP checks that commands are non-interactive. ASP processes the commands through a command simulator and then packages them if there are no errors. ASP must be active 24 hours/day, 7 days/week.

  4. Detecting seeded motifs in DNA sequences.

    PubMed

    Pizzi, Cinzia; Bortoluzzi, Stefania; Bisognin, Andrea; Coppe, Alessandro; Danieli, Gian Antonio

    2005-01-01

    The problem of detecting DNA motifs with functional relevance in real biological sequences is difficult due to a number of biological, statistical and computational issues and also because of the lack of knowledge about the structure of searched patterns. Many algorithms are implemented in fully automated processes, which are often based upon a guess of input parameters from the user at the very first step. In this paper, we present a novel method for the detection of seeded DNA motifs, composed by regions with a different extent of variability. The method is based on a multi-step approach, which was implemented in a motif searching web tool (MOST). Overrepresented exact patterns are extracted from input sequences and clustered to produce motifs core regions, which are then extended and scored to generate seeded motifs. The combination of automated pattern discovery algorithms and different display tools for the evaluation and selection of results at several analysis steps can potentially lead to much more meaningful results than complete automation can produce. Experimental results on different yeast and human real datasets proved the methodology to be a promising solution for finding seeded motifs. MOST web tool is freely available at http://telethon.bio.unipd.it/bioinfo/MOST. PMID:16141193
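
    The first step described above, extracting overrepresented exact patterns from the input sequences, can be sketched in Python as a k-mer count compared against a background expectation. This is not the MOST implementation: the uniform background (all 4^k patterns equally likely), the threshold `min_ratio`, the function names, and the toy sequences are assumptions, and the clustering, extension, and scoring steps are not shown.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every length-k window across all input sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def overrepresented(sequences, k, min_ratio):
    """k-mers seen at least `min_ratio` times more often than expected
    under a deliberately crude uniform background."""
    counts = count_kmers(sequences, k)
    windows = sum(counts.values())
    expected = windows / 4 ** k  # expected count of any one k-mer
    return sorted(kmer for kmer, c in counts.items() if c >= min_ratio * expected)


# Toy input: three short sequences sharing the planted pattern GACGT.
sequences = ["TTGACGTCAA", "CCGACGTGGT", "AGACGTTTCA"]
print(overrepresented(sequences, k=5, min_ratio=100))
```

    A realistic background model would use the base composition of the genome in question rather than a uniform distribution, which is one reason the choice of input parameters matters at this step.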

  5. Detecting seeded motifs in DNA sequences

    PubMed Central

    Pizzi, Cinzia; Bortoluzzi, Stefania; Bisognin, Andrea; Coppe, Alessandro; Danieli, Gian Antonio

    2005-01-01

    The problem of detecting DNA motifs with functional relevance in real biological sequences is difficult due to a number of biological, statistical and computational issues and also because of the lack of knowledge about the structure of searched patterns. Many algorithms are implemented in fully automated processes, which are often based upon a guess of input parameters from the user at the very first step. In this paper, we present a novel method for the detection of seeded DNA motifs, composed by regions with a different extent of variability. The method is based on a multi-step approach, which was implemented in a motif searching web tool (MOST). Overrepresented exact patterns are extracted from input sequences and clustered to produce motifs core regions, which are then extended and scored to generate seeded motifs. The combination of automated pattern discovery algorithms and different display tools for the evaluation and selection of results at several analysis steps can potentially lead to much more meaningful results than complete automation can produce. Experimental results on different yeast and human real datasets proved the methodology to be a promising solution for finding seeded motifs. MOST web tool is freely available at . PMID:16141193

  6. An automated hydrodynamic process for controlled, unbiased DNA shearing.

    PubMed

    Thorstenson, Y R; Hunicke-Smith, S P; Oefner, P J; Davis, R W

    1998-08-01

    An automated, inexpensive, easy-to-use, and reproducible technique for controlled, random DNA fragmentation has been developed. The technique is based on point-sink hydrodynamics that result when a DNA sample is forced through a small hole by a syringe pump. Commercially available components are used to reduce the cost and complexity of the instrument. The design is optimized to reduce the volume of sample required and to speed processing time. Shearing of the samples can be completely automated by computer control. Ninety percent of sheared DNA fragments fall within a twofold size distribution that is highly reproducible. Three parameters are critical: the flow geometry, the flow rate, and a minimum number of iterations. Shearing is reproducible over a wide range of temperatures, DNA concentrations, and initial DNA size. The cloning efficiency of the sheared DNA is very good even without end repair, the distribution of assembled sequences is random, and there is no sequence bias at the ends of sheared fragments that have been cloned. The instrument, called the Point-sink Shearer (PtS), has already been exported successfully to many other laboratories. PMID:9724331

  7. Characterization by automated DNA sequencing of mutations in the gene (rpoB) encoding the RNA polymerase beta subunit in rifampin-resistant Mycobacterium tuberculosis strains from New York City and Texas.

    PubMed Central

    Kapur, V; Li, L L; Iordanescu, S; Hamrick, M R; Wanger, A; Kreiswirth, B N; Musser, J M

    1994-01-01

    Automated DNA sequencing was used to characterize mutations associated with rifampin resistance in a 69-bp region of the gene, rpoB, encoding the beta subunit of RNA polymerase in Mycobacterium tuberculosis. The data confirmed that greater than 90% of rifampin-resistant strains have sequence alterations in this region and showed that most are missense mutations. The analysis also identified several mutant rpoB alleles not previously associated with resistant organisms and one short region of rpoB that had an unusually high frequency of insertions and deletions. Although many strains with an identical IS6110 restriction fragment length polymorphism pattern have the same variant rpoB allele, some do not, a result that suggests the occurrence of evolutionary divergence at the clone level. PMID:8027320

  8. Microchips for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Mastrangelo, Carlos H.; Palaniappan, S.; Man, Piu Francis; Burns, Mark A.; Burke, David T.

    1999-08-01

    Genetic information is vital for understanding features and response of an organism. In humans, genetic errors are linked to the development of major diseases such as cancer and diabetes. In order to maximally exploit this information it is necessary to develop miniature sequencing assays that are rapid and inexpensive. In this paper we show how this could be attained with microfluidic chips that contain integrated assays. To date simple silicon/glass chips aimed for sequencing purpose have been realized; but these chips are not yet practical. Some of the solutions that are used to bring these devices closer to commercial applications are discussed.

  9. Statistical properties of DNA sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-01-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
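
    Detrended fluctuation analysis can be sketched compactly in Python. The version here is a minimal illustration rather than the authors' code: it assumes a "DNA walk" that steps +1 for a pyrimidine and -1 for a purine, non-overlapping boxes with a straight-line fit in each, and a scaling exponent taken as the slope of log F(n) against log n. The function names, box sizes, and random test sequence are hypothetical.

```python
import math
import random


def dna_walk(sequence):
    """Cumulative walk: +1 for a pyrimidine (C, T), -1 for a purine (A, G).
    The step convention is an assumption of this sketch."""
    y, total = [], 0
    for base in sequence:
        total += 1 if base in "CT" else -1
        y.append(total)
    return y


def dfa_fluctuation(y, n):
    """RMS deviation of y from a least-squares line fitted in each
    non-overlapping box of length n >= 2 (a trailing remainder is dropped)."""
    xs = range(n)
    x_mean = (n - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sq_sum, count = 0.0, 0
    for start in range(0, len(y) - n + 1, n):
        box = y[start:start + n]
        y_mean = sum(box) / n
        slope = sum((x - x_mean) * (v - y_mean) for x, v in zip(xs, box)) / sxx
        for x, v in zip(xs, box):
            trend = y_mean + slope * (x - x_mean)  # local linear trend
            sq_sum += (v - trend) ** 2
            count += 1
    return math.sqrt(sq_sum / count)


def scaling_exponent(y, box_sizes):
    """Least-squares slope of log F(n) against log n."""
    lx = [math.log(n) for n in box_sizes]
    ly = [math.log(dfa_fluctuation(y, n)) for n in box_sizes]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    return sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / sum((a - mx) ** 2 for a in lx)


random.seed(0)
seq = "".join(random.choice("ACGT") for _ in range(4096))  # uncorrelated toy sequence
alpha = scaling_exponent(dna_walk(seq), [8, 16, 32, 64, 128])
print(round(alpha, 2))
```

    For an uncorrelated sequence such as this random one, the exponent is expected to lie near 0.5; values clearly above 0.5 over a wide range of box sizes are what would indicate the long-range correlation discussed in the abstract.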

  10. Statistical properties of DNA sequences

    NASA Astrophysics Data System (ADS)

    Peng, C.-K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-02-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range - indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the “non-stationarity” feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33 301 coding and 29 453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.

  11. Arduino-based automation of a DNA extraction system.

    PubMed

    Kim, Kyung-Won; Lee, Mi-So; Ryu, Mun-Ho; Kim, Jong-Won

    2015-01-01

    There have been many studies to detect infectious diseases with the molecular genetic method. This study presents an automation process for a DNA extraction system based on microfluidics and magnetic bead, which is part of a portable molecular genetic test system. This DNA extraction system consists of a cartridge with chambers, syringes, four linear stepper actuators, and a rotary stepper actuator. The actuators provide a sequence of steps in the DNA extraction process, such as transporting, mixing, and washing for the gene specimen, magnetic bead, and reagent solutions. The proposed automation system consists of a PC-based host application and an Arduino-based controller. The host application compiles a G code sequence file and interfaces with the controller to execute the compiled sequence. The controller executes stepper motor axis motion, time delay, and input-output manipulation. It drives the stepper motor with an open library, which provides a smooth linear acceleration profile. The controller also provides a homing sequence to establish the motor's reference position, and hard limit checking to prevent any over-travelling. The proposed system was implemented and its functionality was investigated, especially regarding positioning accuracy and velocity profile. PMID:26409535

  12. DNA Sequences at a Glance

    PubMed Central

    Pinho, Armando J.; Garcia, Sara P.; Pratas, Diogo; Ferreira, Paulo J. S. G.

    2013-01-01

    Data summarization and triage is one of the current top challenges in visual analytics. The goal is to let users visually inspect large data sets and examine or request data with particular characteristics. The need for summarization and visual analytics is also felt when dealing with digital representations of DNA sequences. Genomic data sets are growing rapidly, making their analysis increasingly more difficult, and raising the need for new, scalable tools. For example, being able to look at very large DNA sequences while immediately identifying potentially interesting regions would provide the biologist with a flexible exploratory and analytical tool. In this paper we present a new concept, the “information profile”, which provides a quantitative measure of the local complexity of a DNA sequence, independently of the direction of processing. The computation of the information profiles is computationally tractable: we show that it can be done in time proportional to the length of the sequence. We also describe a tool to compute the information profiles of a given DNA sequence, and use the genome of the fission yeast Schizosaccharomyces pombe strain 972 h− and five human chromosomes 22 for illustration. We show that information profiles are useful for detecting large-scale genomic regularities by visual inspection. Several discovery strategies are possible, including the standalone analysis of single sequences, the comparative analysis of sequences from individuals from the same species, and the comparative analysis of sequences from different organisms. The comparison scale can be varied, allowing the users to zoom-in on specific details, or obtain a broad overview of a long segment. Software applications have been made available for non-commercial use at http://bioinformatics.ua.pt/software/dna-at-glance. PMID:24278218

  13. DNA sequences at a glance.

    PubMed

    Pinho, Armando J; Garcia, Sara P; Pratas, Diogo; Ferreira, Paulo J S G

    2013-01-01

    Data summarization and triage is one of the current top challenges in visual analytics. The goal is to let users visually inspect large data sets and examine or request data with particular characteristics. The need for summarization and visual analytics is also felt when dealing with digital representations of DNA sequences. Genomic data sets are growing rapidly, making their analysis increasingly more difficult, and raising the need for new, scalable tools. For example, being able to look at very large DNA sequences while immediately identifying potentially interesting regions would provide the biologist with a flexible exploratory and analytical tool. In this paper we present a new concept, the "information profile", which provides a quantitative measure of the local complexity of a DNA sequence, independently of the direction of processing. The computation of the information profiles is computationally tractable: we show that it can be done in time proportional to the length of the sequence. We also describe a tool to compute the information profiles of a given DNA sequence, and use the genome of the fission yeast Schizosaccharomyces pombe strain 972 h(-) and five human chromosomes 22 for illustration. We show that information profiles are useful for detecting large-scale genomic regularities by visual inspection. Several discovery strategies are possible, including the standalone analysis of single sequences, the comparative analysis of sequences from individuals from the same species, and the comparative analysis of sequences from different organisms. The comparison scale can be varied, allowing the users to zoom-in on specific details, or obtain a broad overview of a long segment. Software applications have been made available for non-commercial use at http://bioinformatics.ua.pt/software/dna-at-glance. PMID:24278218

  14. Structural Complexity of DNA Sequence

    PubMed Central

    Liou, Cheng-Yuan; Cheng, Wei-Chen; Tsai, Huai-Ying

    2013-01-01

    In modern bioinformatics, finding an efficient way to allocate sequence fragments with biological functions is an important issue. This paper presents a structural approach based on context-free grammars extracted from original DNA or protein sequences. This approach is radically different from all those statistical methods. Furthermore, this approach is compared with a topological entropy-based method for consistency and difference of the complexity results. PMID:23662161

  15. Method enabling fast partial sequencing of cDNA clones.

    PubMed

    Nordström, T; Gharizadeh, B; Pourmand, N; Nyren, P; Ronaghi, M

    2001-05-15

    Pyrosequencing is a nonelectrophoretic single-tube DNA sequencing method that takes advantage of cooperativity between four enzymes to monitor DNA synthesis. To investigate the feasibility of the recently developed technique for tag sequencing, 64 colonies of a selected cDNA library from human were sequenced by both pyrosequencing and Sanger DNA sequencing. To determine the needed length for finding a unique DNA sequence, 100 sequence tags from human were retrieved from the database and different lengths from each sequence were randomly analyzed. An homology search based on 20 and 30 nucleotides produced 97 and 98% unique hits, respectively. An homology search based on 100 nucleotides could identify all searched genes. Pyrosequencing was employed to produce sequence data for 30 nucleotides. A similar search using BLAST revealed 16 different genes. Forty-six percent of the sequences shared homology with one gene at different positions. Two of the 64 clones had unique sequences. The search results from pyrosequencing were in 100% agreement with conventional DNA sequencing methods. The possibility of using a fully automated pyrosequencer machine for future high-throughput tag sequencing is discussed. PMID:11355860
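
    The tag-length question above can be illustrated with a small sketch that asks, for a given tag length, what fraction of sequences in a collection carry a tag found in no other sequence. It uses exact substring matching rather than a homology search such as BLAST, and the function name `unique_tag_fraction` and its parameters are hypothetical.

```python
def unique_tag_fraction(sequences, tag_length, offset=0):
    """Fraction of sequences whose tag occurs in no other sequence.

    The tag is the substring of length tag_length starting at offset.
    Exact matching is an assumption; the study used homology searches.
    """
    unique = 0
    for seq in sequences:
        tag = seq[offset:offset + tag_length]
        if len(tag) < tag_length:
            continue  # sequence too short to provide a full tag
        # The tag always matches its own sequence, so one hit means unique.
        hits = sum(1 for other in sequences if tag in other)
        if hits == 1:
            unique += 1
    return unique / len(sequences)
```

    Running this over increasing values of `tag_length` shows how quickly tags become unique in a given collection.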

  16. Apparatus for improved DNA sequencing

    DOEpatents

    Douthart, R.J.; Crowell, S.L.

    1996-05-07

    This invention is a means for the rapid sequencing of DNA samples. More specifically, it consists of a new design direct blotting electrophoresis unit. The DNA sequence is deposited on a membrane attached to a rotating drum. Initial data compaction is facilitated by the use of a machined multi-channeled plate called a ribbon channel plate. Each channel is an isolated mini gel system much like a gel filled capillary. The system as a whole, however, is in a slab gel like format with the advantages of uniformity and easy reusability. The system can be used in different embodiments. The drum system is unique in that after deposition the drum rotates the deposited DNA into a large non-buffer open space where processing and detection can occur. The drum can also be removed in toto to special workstations for downstream processing, multiplexing and detection. 18 figs.

  17. Apparatus for improved DNA sequencing

    DOEpatents

    Douthart, Richard J.; Crowell, Shannon L.

    1996-01-01

    This invention is a means for the rapid sequencing of DNA samples. More specifically, it consists of a new design direct blotting electrophoresis unit. The DNA sequence is deposited on a membrane attached to a rotating drum. Initial data compaction is facilitated by the use of a machined multi-channeled plate called a ribbon channel plate. Each channel is an isolated mini gel system much like a gel filled capillary. The system as a whole, however, is in a slab gel like format with the advantages of uniformity and easy reusability. The system can be used in different embodiments. The drum system is unique in that after deposition the drum rotates the deposited DNA into a large non-buffer open space where processing and detection can occur. The drum can also be removed in toto to special workstations for downstream processing, multiplexing and detection.

  18. Engineered DNA sequence syntax inspector.

    PubMed

    Hsiau, Timothy Hwei-Chung; Anderson, J Christopher

    2014-02-21

    DNAs encoding polypeptides often contain design errors that cause experiments to prematurely fail. One class of design errors is incorrect or missing elements in the DNA, here termed syntax errors. We have identified three major causes of syntax errors: point mutations from sequencing or manual data entry, gene structure misannotation, and unintended open reading frames (ORFs). The Engineered DNA Sequence Syntax Inspector (EDSSI) is an online bioinformatics pipeline that checks for syntax errors through three steps. First, ORF prediction in input DNA sequences is done by GeneMark; next, homologous sequences are retrieved by BLAST, and finally, syntax errors in the protein sequence are predicted by using the SIFT algorithm. We show that the EDSSI is able to identify previously published examples of syntactical errors and also show that our indel addition to the SIFT program is 97% accurate on a test set of Escherichia coli proteins. The EDSSI is available at http://andersonlab.qb3.berkeley.edu/Software/EDSSI/. PMID:24364864
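
    As a minimal illustration of what checking for unintended ORFs involves, the sketch below scans the three forward reading frames for ATG-to-stop stretches. It is a naive scan and not GeneMark or any part of the EDSSI pipeline; `find_orfs` and `min_codons` are hypothetical names.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    start is the index of the A in ATG; end is one past the stop codon.
    min_codons counts the codons before the stop, including the ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and keep scanning
    return orfs
```

    A fuller check would also scan the reverse complement and compare the ORFs found against the ones the designer intended.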

  19. The sequence of sequencers: The history of sequencing DNA

    PubMed Central

    Heather, James M.; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. PMID:26554401

  20. Automated DNA electrophoresis, hybridization and detection

    SciTech Connect

    Zapolski, E.J.; Gersten, D.M.; Golab, T.J.; Ledley, R.S.

    1986-05-01

    A fully automated, computer controlled system for nucleic acid hybridization analysis has been devised and constructed. In practice, DNA is digested with restriction endonuclease enzyme(s) and loaded into the system by pipette; ³²P-labelled nucleic acid probe(s) is loaded into the nine hybridization chambers. Instructions for all the steps in the automated process are specified by answering questions that appear on the computer screen at the start of the experiment. Subsequent steps are performed automatically. The system performs horizontal electrophoresis in agarose gel, fixes the fragments to a solid phase matrix, denatures, neutralizes, prehybridizes, hybridizes, washes, dries and detects the radioactivity according to the specifications given by the operator. The results, printed out at the end, give the positions on the matrix to which radioactivity remains hybridized following stringent washing.

  1. Dynamical model for DNA sequences

    NASA Astrophysics Data System (ADS)

    Allegrini, P.; Barbi, M.; Grigolini, P.; West, B. J.

    1995-11-01

    We address the problem of DNA sequences, developing a "dynamical" method based on the assumption that the statistical properties of DNA paths are determined by the joint action of two processes, one deterministic with long-range correlations, and the other random and δ-function correlated. The generator of the deterministic evolution is a nonlinear map, belonging to a class of maps recently tailored to mimic the processes of weak chaos that are responsible for the birth of anomalous diffusion. It is assumed that the deterministic process corresponds to unknown biological rules that determine the DNA path, whereas the noise mimics the influence of an infinite-dimensional environment on the biological process under study. We prove that the resulting diffusion process, if the effect of the random process is neglected, is an α-stable Lévy process with 1<α<2. We also show that, if the diffusion process is determined by the joint action of the deterministic and the random process, the correlation effects of the "deterministic dynamics" are cancelled on the short-range scale, but show up in the long-range one. We denote our prescription to generate statistical sequences as the copying mistake map (CMM). We carry out our analysis of several DNA sequences and their CMM realizations with a variety of techniques, and we especially focus on a method of regression to equilibrium, which we call the Onsager analysis. With these techniques we establish the statistical equivalence of the real DNA sequences with their CMM realizations. We show that long-range correlations are present in exons as well as in introns, but are difficult to detect, since the exon "dynamics" is shown to be determined by the entanglement of three distinct and independent CMMs.

  2. Automation of a single-DNA molecule stretching device.

    PubMed

    Sørensen, Kristian Tølbøl; Lopacinska, Joanna M; Tommerup, Niels; Silahtaroglu, Asli; Kristensen, Anders; Marie, Rodolphe

    2015-06-01

    We automate the manipulation of genomic-length DNA in a nanofluidic device based on real-time analysis of fluorescence images. In our protocol, individual molecules are picked from a microchannel and stretched with pN forces using pressure driven flows. The millimeter-long DNA fragments free flowing in micro- and nanofluidics emit low fluorescence and change shape, thus challenging the image analysis for machine vision. We demonstrate a set of image processing steps that increase the intrinsically low signal-to-noise ratio associated with single-molecule fluorescence microscopy. Furthermore, we demonstrate how to estimate the length of molecules by continuous real-time image stitching and how to increase the effective resolution of a pressure controller by pulse width modulation. The sequence of image-processing steps addresses the challenges of genomic-length DNA visualization; however, they should also be general to other applications of fluorescence-based microfluidics. PMID:26133839
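
    The idea of raising a controller's effective resolution by pulse width modulation can be sketched as follows: when the target pressure falls between two adjacent output levels, the controller alternates between them so that the time average matches the target. This is a generic illustration under assumed units and names (`pwm_schedule`, `step`, `period_ticks`), not the authors' implementation.

```python
import math

def pwm_schedule(target, step, period_ticks=10):
    """Approximate a target pressure using only multiples of step.

    Returns one period of per-tick setpoints and their time average.
    Units are arbitrary; step is the controller's native resolution.
    """
    low = math.floor(target / step) * step   # nearest level below target
    high = low + step                        # nearest level above target
    duty = (target - low) / step             # fraction of time spent high
    high_ticks = round(duty * period_ticks)
    schedule = [high] * high_ticks + [low] * (period_ticks - high_ticks)
    average = sum(schedule) / period_ticks
    return schedule, average
```

    With `period_ticks` ticks per period, the averaged output can be set in increments of `step / period_ticks`, provided the fluidic system responds slowly compared with the switching period.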

  3. Automated DNA diagnostics using an ELISA-based oligonucleotide ligation assay.

    PubMed Central

    Nickerson, D A; Kaiser, R; Lappin, S; Stewart, J; Hood, L; Landegren, U

    1990-01-01

    DNA diagnostics, the detection of specific DNA sequences, will play an increasingly important role in medicine as the molecular basis of human disease is defined. Here, we demonstrate an automated, nonisotopic strategy for DNA diagnostics using amplification of target DNA segments by the polymerase chain reaction (PCR) and the discrimination of allelic sequence variants by a colorimetric oligonucleotide ligation assay (OLA). We have applied the automated PCR/OLA procedure to diagnosis of common genetic diseases, such as sickle cell anemia and cystic fibrosis (delta F508 mutation), and to genetic linkage mapping of gene segments in the human T-cell receptor beta-chain locus. The automated PCR/OLA strategy provides a rapid system for diagnosis of genetic, malignant, and infectious diseases as well as a powerful approach to genetic linkage mapping of chromosomes and forensic DNA typing. PMID:2247466

  4. Channel plate for DNA sequencing

    DOEpatents

    Douthart, R.J.; Crowell, S.L.

    1998-01-13

    This invention is a channel plate that facilitates data compaction in DNA sequencing. The channel plate has a length, a width and a thickness, and further has a plurality of channels that are parallel. Each channel has a depth partially through the thickness of the channel plate. Additionally an interface edge permits electrical communication across an interface through a buffer to a deposition membrane surface. 15 figs.

  5. Channel plate for DNA sequencing

    DOEpatents

    Douthart, Richard J.; Crowell, Shannon L.

    1998-01-01

    This invention is a channel plate that facilitates data compaction in DNA sequencing. The channel plate has a length, a width and a thickness, and further has a plurality of channels that are parallel. Each channel has a depth partially through the thickness of the channel plate. Additionally an interface edge permits electrical communication across an interface through a buffer to a deposition membrane surface.

  6. DNA Sequencing Using capillary Electrophoresis

    SciTech Connect

    Dr. Barry Karger

    2011-05-09

    The overall goal of this program was to develop capillary electrophoresis as the tool to be used to sequence for the first time the Human Genome. Our program was part of the Human Genome Project. In this work, we were highly successful and the replaceable polymer we developed, linear polyacrylamide, was used by the DOE sequencing lab in California to sequence a significant portion of the human genome using the MegaBACE multiple capillary array electrophoresis instrument. In this final report, we summarize our efforts and success. We began our work by separating by capillary electrophoresis double strand oligonucleotides using cross-linked polyacrylamide gels in fused silica capillaries. This work showed the potential of the methodology. However, preparation of such cross-linked gel capillaries was difficult with poor reproducibility, and even more important, the columns were not very stable. We improved stability by using non-cross linked linear polyacrylamide. Here, the entangled linear chains could move when osmotic pressure (e.g. sample injection) was imposed on the polymer matrix. This relaxation of the polymer dissipated the stress in the column. Our next advance was to use significantly lower concentrations of the linear polyacrylamide so that the polymer could be automatically blown out after each run and replaced with fresh linear polymer solution. In this way, a new column was available for each analytical run. Finally, while testing many linear polymers, we selected linear polyacrylamide as the best matrix as it was the most hydrophilic polymer available. Under our DOE program, we demonstrated initially the success of the linear polyacrylamide to separate double strand DNA. We note that the method is used even today to assay purity of double stranded DNA fragments. Our focus, of course, was on the separation of single stranded DNA for sequencing purposes. In one paper, we demonstrated the success of our approach in sequencing up to 500 bases. Other

  7. Nanopore DNA sequencing with MspA.

    PubMed

    Derrington, Ian M; Butler, Tom Z; Collins, Marcus D; Manrao, Elizabeth; Pavlenok, Mikhail; Niederweis, Michael; Gundlach, Jens H

    2010-09-14

    Nanopore sequencing has the potential to become a direct, fast, and inexpensive DNA sequencing technology. The simplest form of nanopore DNA sequencing utilizes the hypothesis that individual nucleotides of single-stranded DNA passing through a nanopore will uniquely modulate an ionic current flowing through the pore, allowing the record of the current to yield the DNA sequence. We demonstrate that the ionic current through the engineered Mycobacterium smegmatis porin A, MspA, has the ability to distinguish all four DNA nucleotides and resolve single-nucleotides in single-stranded DNA when double-stranded DNA temporarily holds the nucleotides in the pore constriction. Passing DNA with a series of double-stranded sections through MspA provides proof of principle of a simple DNA sequencing method using a nanopore. These findings highlight the importance of MspA in the future of nanopore sequencing. PMID:20798343

  8. Nanopore DNA sequencing with MspA

    PubMed Central

    Derrington, Ian M.; Butler, Tom Z.; Collins, Marcus D.; Manrao, Elizabeth; Pavlenok, Mikhail; Niederweis, Michael; Gundlach, Jens H.

    2010-01-01

    Nanopore sequencing has the potential to become a direct, fast, and inexpensive DNA sequencing technology. The simplest form of nanopore DNA sequencing utilizes the hypothesis that individual nucleotides of single-stranded DNA passing through a nanopore will uniquely modulate an ionic current flowing through the pore, allowing the record of the current to yield the DNA sequence. We demonstrate that the ionic current through the engineered Mycobacterium smegmatis porin A, MspA, has the ability to distinguish all four DNA nucleotides and resolve single-nucleotides in single-stranded DNA when double-stranded DNA temporarily holds the nucleotides in the pore constriction. Passing DNA with a series of double-stranded sections through MspA provides proof of principle of a simple DNA sequencing method using a nanopore. These findings highlight the importance of MspA in the future of nanopore sequencing. PMID:20798343

  9. Scar-less multi-part DNA assembly design automation

    DOEpatents

    Hillson, Nathan J.

    2016-06-07

    The present invention provides a method of designing an implementation of a DNA assembly. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding flanking homology sequences to each of the DNA oligos. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding optimized overhang sequences to each of the DNA oligos.
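
    The first embodiment, planning flanking homology for an ordered list of fragments, can be sketched in a few lines. The version below assumes a linear assembly and simply borrows the adjoining end of each neighbouring fragment as the flank. The function name `plan_flanking_homology`, the `overlap` parameter, and the dictionary layout are hypothetical and are not taken from the patent.

```python
def plan_flanking_homology(fragments, overlap=20):
    """For fragments given in assembly order, list the flanks to add.

    Assumes a linear (non-circular) assembly: each fragment gets the
    last `overlap` bases of its left neighbour and the first `overlap`
    bases of its right neighbour; end fragments get one empty flank.
    """
    plan = []
    for i, frag in enumerate(fragments):
        left = fragments[i - 1][-overlap:] if i > 0 else ""
        right = fragments[i + 1][:overlap] if i < len(fragments) - 1 else ""
        plan.append({"fragment": frag, "left_flank": left, "right_flank": right})
    return plan
```

    A practical design tool would also split the homology between neighbouring oligos, handle circular constructs, and check melting temperatures, none of which is attempted here.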

  10. Towards modeling DNA sequences as automata

    NASA Astrophysics Data System (ADS)

    Burks, Christian; Farmer, Doyne

    1984-01-01

    We seek to describe a starting point for modeling the evolution and role of DNA sequences within the framework of cellular automata by discussing the current understanding of genetic information storage in DNA sequences. This includes alternately viewing the role of DNA in living organisms as a simple scheme and as a complex scheme; a brief review of strategies for identifying and classifying patterns in DNA sequences; and finally, notes towards establishing DNA-like automata models, including a discussion of the extent of experimentally determined DNA sequence data present in the database at Los Alamos.

  11. Fluorescence-detected DNA sequencing

    SciTech Connect

    Haugland, R.P.

    1990-01-01

    Our research effort funded by this grant primarily focused on development of suitable fluorescent dyes for DNA sequencing studies. Prior to our efforts, the dyes being used in commercial DNA sequencers were various versions of fluorescein dyes for the shorter wavelengths and of rhodamine dyes for the longer wavelengths. Our initial goal was to synthesize a set of four dyes that could all be excited by the 488 and 514 nm lines of the argon laser and that have emission spectra that minimize spectral overlap. The specific result sought was higher fluorescent intensity, particularly of the longest wavelength dyes, than was available using existing dyes. Another important property of the desired set of dyes was uniform ionic charge in order to have minimum interference on the electrophoretic mobility during the sequencing. During the period of this grant we prepared and characterized four types of dyes: fluorescent bifluorophores, derivatives of rhodamine dyes, derivatives of rhodol dyes and derivatives of boron dipyrromethene difluoride (BODIPY™) dyes.

  12. Particle sizer and DNA sequencer

    DOEpatents

    Olivares, Jose A.; Stark, Peter C.

    2005-09-13

    An electrophoretic device separates and detects particles such as DNA fragments, proteins, and the like. The device has a capillary which is coated with a coating with a low refractive index such as Teflon® AF. A sample of particles is fluorescently labeled and injected into the capillary. The capillary is filled with an electrolyte buffer solution. An electrical field is applied across the capillary causing the particles to migrate from a first end of the capillary to a second end of the capillary. A detector light beam is then scanned along the length of the capillary to detect the location of the separated particles. The device is amenable to a high throughput system by providing additional capillaries. The device can also be used to determine the actual size of the particles and for DNA sequencing.

  13. Genetic mapping and DNA sequencing

    SciTech Connect

    Speed, T.; Waterman, M.S.

    1996-12-31

    The Human Genome Initiative has as its primary objective the characterization of the human genome. High-resolution linkage maps of genetic markers will play an important role in completing the human genome project. This is one of two volumes based on the proceedings of the 1994 IMA Summer Program on Molecular Biology and comprises Weeks 1 and 2 of the four-week program. This volume focuses on genetic mapping and DNA sequencing. Selected papers are indexed separately for inclusion in the Energy Science and Technology Database.

  14. Laser desorption mass spectrometry for DNA analysis and sequencing

    SciTech Connect

    Chen, C.H.; Taranenko, N.I.; Tang, K.; Allman, S.L.

    1995-03-01

    Laser desorption mass spectrometry has been considered as a potential new method for fast DNA sequencing. Our approach is to use matrix-assisted laser desorption to produce parent ions of DNA segments and a time-of-flight mass spectrometer to identify the sizes of DNA segments. Thus, the approach is similar to gel electrophoresis sequencing using Sanger's enzymatic method. However, gel, radioactive tagging, and dye labeling are not required. In addition, the sequencing process can possibly be finished within a few hundred microseconds instead of hours and days. In order to use mass spectrometry for fast DNA sequencing, the following three criteria need to be satisfied. They are (1) detection of large DNA segments, (2) sensitivity reaching the femtomole region, and (3) mass resolution good enough to separate DNA segments of a single nucleotide difference. It has been very difficult to detect large DNA segments by mass spectrometry before due to the fragile chemical properties of DNA and low detection sensitivity of DNA ions. We discovered several new matrices to increase the production of DNA ions. By innovative design of a mass spectrometer, we can increase the ion energy up to 45 keV to enhance the detection sensitivity. Recently, we succeeded in detecting a DNA segment with 500 nucleotides. The sensitivity was 100 femtomole. Thus, we have fulfilled two key criteria for using mass spectrometry for fast DNA sequencing. The major effort in the near future is to improve the resolution. Different approaches are being pursued. When high mass resolution can be achieved and automation of sample preparation is developed, a sequencing speed of 500 megabases per year may become feasible.

  15. Variable copy number DNA sequences in rice.

    PubMed

    Kikuchi, S; Takaiwa, F; Oono, K

    1987-12-01

    We have cloned two types of variable copy number DNA sequences from the rice embryo genome. One of these sequences, which was cloned in pRB301, was amplified about 50-fold during callus formation and diminished in copy number to the embryonic level during regeneration. The other clone, named pRB401, showed the reciprocal pattern. The copy numbers of both sequences were changed even in the early developmental stage and eliminated from nuclear DNA along with growth of the plant. Sequencing analysis of the pRB301 insert revealed some open reading frames and direct repeat structures, but corresponding sequences were not identified in the EMBL and LASL DNA databases. Sequencing of the nuclear genomic fragment cloned in pRB401 revealed the presence of the 3'rps12-rps7 region of rice chloroplast DNA. Our observations suggest that during callus formation (dedifferentiation), regeneration and the growth process the copy numbers of some DNA sequences are variable and that nuclear integrated chloroplast DNA acts as a variable copy number sequence in the rice genome. Based on data showing a common sequence in mitochondria and chloroplast DNA of maize (Stern and Lonsdale 1982) and that the rps12 gene of tobacco chloroplast DNA is a divided gene (Torazawa et al. 1986), it is suggested that the sequence on the inverted repeat structure of chloroplast DNA may have the character of a movable genetic element. PMID:3481021

  16. The Value of DNA Sequencing - TCGA

    Cancer.gov

    DNA sequencing: what it tells us about DNA changes in cancer, how looking across many tumors will help to identify meaningful changes and potential drug targets, and how genomics is changing the way we think about cancer.

  17. Method for sequencing DNA base pairs

    DOEpatents

    Sessler, Andrew M.; Dawson, John

    1993-01-01

    The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source.

  18. Fibonacci Sequence and Supramolecular Structure of DNA.

    PubMed

    Shabalkin, I P; Grigor'eva, E Yu; Gudkova, M V; Shabalkin, P I

    2016-05-01

    We proposed a new model of supramolecular DNA structure. Similar to our previously developed model of primary DNA structure [11-15], the 3D structure of the DNA molecule is assembled in accordance with a mathematical rule known as the Fibonacci sequence. Unlike primary DNA structure, supramolecular 3D structure is assembled from complex moieties including a regular tetrahedron and a regular octahedron consisting of monomers, elements of the primary DNA structure. The moieties of the supramolecular DNA structure forming fragments of regular spatial lattice are bound via linker (joint) sequences of the DNA chain. The lattice perceives and transmits information signals over a considerable distance without acoustic aberrations. Linker sequences expand conformational space between lattice segments allowing their sliding relative to each other under the action of external forces. In this case, sliding is provided by stretching of the stacked linker sequences. PMID:27265133

  19. Sequence and Structure Dependent DNA-DNA Interactions

    NASA Astrophysics Data System (ADS)

    Kopchick, Benjamin; Qiu, Xiangyun

    Molecular forces between dsDNA strands are largely dominated by electrostatics and have been extensively studied. Quantitative knowledge has been accumulated on how DNA-DNA interactions are modulated by varied biological constituents such as ions, cationic ligands, and proteins. Despite its central role in biology, the sequence of DNA has not received substantial attention and "random" DNA sequences are typically used in biophysical studies. However, ~50% of the human genome is composed of non-random-sequence DNAs, particularly repetitive sequences. Furthermore, covalent modifications of DNA such as methylation play key roles in gene functions. Such DNAs with specific sequences or modifications often take on structures other than the canonical B-form. Here we present a series of quantitative measurements of the DNA-DNA forces with the osmotic stress method on different DNA sequences, from short repeats to the most frequent sequences in the genome, and to modifications such as bromination and methylation. We observe peculiar behaviors that appear to be strongly correlated with the incurred structural changes. We speculate on the causalities in terms of the differences in hydration shell and DNA surface structures.

  20. Atypical regions in large genomic DNA sequences

    SciTech Connect

    Scherer, S.; McPeek, M.S.; Speed, T.P.

    1994-07-19

    Large genomic DNA sequences contain regions with distinctive patterns of sequence organization. The authors describe a method using logarithms of probabilities based on seventh-order Markov chains to rapidly identify genomic sequences that do not resemble models of genome organization built from compilations of octanucleotide usage. Data bases have been constructed from Escherichia coli and Saccharomyces cerevisiae DNA sequences of >1000 nt and human sequences of >10,000 nt. Atypical genes and clusters of genes have been located in bacteriophage, yeast, and primate DNA sequences. The authors consider criteria for statistical significance of the results, offer possible explanations for the observed variation in genome organization, and give additional applications of these methods in DNA sequence analysis.
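
    The scoring idea, log-probabilities under a seventh-order Markov chain estimated from octanucleotide counts, can be sketched as below. This is a generic illustration and not the authors' code. The function names, the add-one pseudocount, and the use of natural logarithms are assumptions.

```python
import math
from collections import Counter

def train_markov(seqs, order=7):
    """Count (order+1)-mers and their leading contexts in training sequences.

    With order=7 the counted words are octanucleotides.
    """
    kmer_counts, context_counts = Counter(), Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - order):
            kmer_counts[s[i:i + order + 1]] += 1
            context_counts[s[i:i + order]] += 1
    return kmer_counts, context_counts

def window_log_prob(window, model, order=7, pseudocount=1.0):
    """Sum of log P(next base | previous `order` bases) over a window.

    Strongly negative totals flag windows that do not resemble the
    training data. The pseudocount (an assumption) avoids log(0).
    """
    kmer_counts, context_counts = model
    window = window.upper()
    total = 0.0
    for i in range(len(window) - order):
        kmer = window[i:i + order + 1]
        num = kmer_counts[kmer] + pseudocount
        den = context_counts[kmer[:order]] + 4 * pseudocount
        total += math.log(num / den)
    return total
```

    Sliding `window_log_prob` along a long sequence and looking for unusually low scores is one simple way to shortlist atypical regions; assessing their statistical significance is a separate step.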

  1. Sequence Affects the Cyclization of DNA Minicircles.

    PubMed

    Wang, Qian; Pettitt, B Montgomery

    2016-03-17

    Understanding how the sequence of a DNA molecule affects its dynamic properties is a central problem affecting biochemistry and biotechnology. The process of cyclizing short DNA, as a critical step in molecular cloning, lacks a comprehensive picture of the kinetic process containing sequence information. We have elucidated this process by using coarse-grained simulations, enhanced sampling methods, and recent theoretical advances. We are able to identify the types and positions of structural defects during the looping process at a base-pair level. Correlations along a DNA molecule dictate critical sequence positions that can affect the looping rate. Structural defects change the bending elasticity of the DNA molecule from a harmonic to subharmonic potential with respect to bending angles. We explore the subelastic chain as a possible model in loop formation kinetics. A sequence-dependent model is developed to qualitatively predict the relative loop formation time as a function of DNA sequence. PMID:26938490

  2. Using DNA looping to measure sequence dependent DNA elasticity

    NASA Astrophysics Data System (ADS)

    Kandinov, Alan; Raghunathan, Krishnan; Meiners, Jens-Christian

    2012-10-01

    We are using tethered particle motion (TPM) microscopy to observe protein-mediated DNA looping in the lactose repressor system in DNA constructs with varying AT / CG content. We use these data to determine the persistence length of the DNA as a function of its sequence content and compare the data to direct micromechanical measurements with constant-force axial optical tweezers. The data from the TPM experiments show a much smaller sequence effect on the persistence length than the optical tweezers experiments.

  3. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.

  4. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, Richard A.; Huang, Xiaohua C.; Quesada, Mark A.

    1995-01-01

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in said lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radio-isotope labels.

  5. Small scale sequence automation pays big dividends

    NASA Technical Reports Server (NTRS)

    Nelson, Bill

    1994-01-01

    Galileo sequence design and integration are supported by a suite of formal software tools. Sequence review, however, is largely a manual process with reviewers scanning hundreds of pages of cryptic computer printouts to verify sequence correctness. Beginning in 1990, a series of small, PC based sequence review tools evolved. Each tool performs a specific task but all have a common 'look and feel'. The narrow focus of each tool means simpler operation, and easier creation, testing, and maintenance. Benefits from these tools are (1) decreased review time by factors of 5 to 20 or more with a concomitant reduction in staffing, (2) increased review accuracy, and (3) excellent returns on time invested.

  6. Fractal analysis of DNA sequence data

    SciTech Connect

    Berthelsen, C.L.

    1993-01-01

    DNA sequence databases are growing at an almost exponential rate. New analysis methods are needed to extract knowledge about the organization of nucleotides from this vast amount of data. Fractal analysis is a new scientific paradigm that has been used successfully in many domains including the biological and physical sciences. Biological growth is a nonlinear dynamic process and some have suggested that to consider fractal geometry as a biological design principle may be most productive. This research is an exploratory study of the application of fractal analysis to DNA sequence data. A simple random fractal, the random walk, is used to represent DNA sequences. The fractal dimension of these walks is then estimated using the "sandbox method". Analysis of 164 human DNA sequences compared to three types of control sequences (random, base-content matched, and dimer-content matched) reveals that long-range correlations are present in DNA that are not explained by base or dimer frequencies. The study also revealed that the fractal dimension of coding sequences was significantly lower than sequences that were primarily noncoding, indicating the presence of longer-range correlations in functional sequences. The multifractal spectrum is used to analyze fractals that are heterogeneous and have a different fractal dimension for subsets with different scalings. The multifractal spectrum of the random walks of twelve mitochondrial genome sequences was estimated. Eight vertebrate mtDNA sequences had uniformly lower spectra values than did four invertebrate mtDNA sequences. Thus, vertebrate mitochondria show significantly longer-range correlations than do invertebrate mitochondria. The higher multifractal spectra values for invertebrate mitochondria suggest a more random organization of the sequences. This research also includes considerable theoretical work on the effects of finite size, embedding dimension, and scaling ranges.
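The random-walk representation and the sandbox estimate mentioned in this abstract can be sketched roughly as follows. The abstract does not state how bases map to steps or how centers and radii were chosen, so the two-dimensional step table, the choice of centers, and the least-squares fit below are all illustrative assumptions.

```python
import math

# Assumed (hypothetical) mapping of bases to unit steps on a 2-D lattice.
STEPS = {"A": (-1, 0), "T": (1, 0), "C": (0, -1), "G": (0, 1)}

def dna_walk(seq):
    """Convert a DNA string into the list of lattice points visited by the walk."""
    x = y = 0
    points = [(0, 0)]
    for base in seq.upper():
        if base in STEPS:                   # ignore ambiguous symbols
            dx, dy = STEPS[base]
            x, y = x + dx, y + dy
            points.append((x, y))
    return points

def sandbox_dimension(points, radii, centers):
    """Slope of log<N(r)> against log r, where N(r) is the number of distinct
    visited sites within distance r of a center, averaged over the centers."""
    sites = list(set(points))
    xs, ys = [], []
    for r in radii:
        counts = [
            sum(1 for (px, py) in sites if (px - cx) ** 2 + (py - cy) ** 2 <= r * r)
            for (cx, cy) in centers
        ]
        xs.append(math.log(r))
        ys.append(math.log(sum(counts) / len(counts)))
    # Ordinary least-squares slope of ys on xs.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

As a sanity check, a walk that only ever steps in one direction traces a straight line, and the estimate should come out close to 1. Centers are best taken away from the ends of the walk so that the sandboxes are not truncated.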

  7. Fractal Analysis of DNA Sequence Data

    NASA Astrophysics Data System (ADS)

    Berthelsen, Cheryl Lynn

    DNA sequence databases are growing at an almost exponential rate. New analysis methods are needed to extract knowledge about the organization of nucleotides from this vast amount of data. Fractal analysis is a new scientific paradigm that has been used successfully in many domains including the biological and physical sciences. Biological growth is a nonlinear dynamic process and some have suggested that to consider fractal geometry as a biological design principle may be most productive. This research is an exploratory study of the application of fractal analysis to DNA sequence data. A simple random fractal, the random walk, is used to represent DNA sequences. The fractal dimension of these walks is then estimated using the "sandbox method." Analysis of 164 human DNA sequences compared to three types of control sequences (random, base-content matched, and dimer-content matched) reveals that long-range correlations are present in DNA that are not explained by base or dimer frequencies. The study also revealed that the fractal dimension of coding sequences was significantly lower than sequences that were primarily noncoding, indicating the presence of longer-range correlations in functional sequences. The multifractal spectrum is used to analyze fractals that are heterogeneous and have a different fractal dimension for subsets with different scalings. The multifractal spectrum of the random walks of twelve mitochondrial genome sequences was estimated. Eight vertebrate mtDNA sequences had uniformly lower spectra values than did four invertebrate mtDNA sequences. Thus, vertebrate mitochondria show significantly longer-range correlations than do invertebrate mitochondria. The higher multifractal spectra values for invertebrate mitochondria suggest a more random organization of the sequences. This research also includes considerable theoretical work on the effects of finite size, embedding dimension, and scaling ranges.

  8. Counterintuitive DNA Sequence Dependence in Supercoiling-Induced DNA Melting

    PubMed Central

    Vlijm, Rifka; v.d. Torre, Jaco; Dekker, Cees

    2015-01-01

    The metabolism of DNA in cells relies on the balance between hybridized double-stranded DNA (dsDNA) and local de-hybridized regions of ssDNA that provide access to binding proteins. Traditional melting experiments, in which short pieces of dsDNA are heated up until the point of melting into ssDNA, have determined that AT-rich sequences have a lower binding energy than GC-rich sequences. In cells, however, the double-stranded backbone of DNA is destabilized by negative supercoiling, and not by temperature. To investigate the effect of GC content on DNA melting induced by negative supercoiling, we studied DNA molecules with a GC content ranging from 38% to 77%, using single-molecule magnetic tweezer measurements in which the length of a single DNA molecule is measured as a function of applied stretching force and supercoiling density. At low force (<0.5 pN), supercoiling results in twisting of the dsDNA backbone and loop formation (plectonemes), without inducing any DNA melting. This process was not influenced by the DNA sequence. When negative supercoiling is introduced at increasing force, local melting of DNA is introduced. We measured for the different DNA molecules a characteristic force Fchar, at which negative supercoiling induces local melting of the dsDNA. Surprisingly, GC-rich sequences melt at lower forces than AT-rich sequences: Fchar = 0.56 pN for 77% GC but 0.73 pN for 38% GC. An explanation for this counterintuitive effect is provided by the realization that supercoiling densities of a few percent only induce melting of a few percent of the base pairs. As a consequence, denaturation bubbles occur in local AT-rich regions and the sequence-dependent effect arises from an increased DNA bending/torsional energy associated with the plectonemes. This new insight indicates that an increased GC-content adjacent to AT-rich DNA regions will enhance local opening of the double-stranded DNA helix. PMID:26513573

  9. Automated shielding analysis sequences for spent fuel casks

    SciTech Connect

    Tang, J.S.; Parks, C.V.; Hermann, O.W.

    1987-01-01

    Two important Shielding Analysis Sequences (SAS) have recently been developed within the SCALE computational system. These sequences significantly enhance the existing SCALE system capabilities for evaluating radiation doses exterior to spent fuel casks. These new control module sequences (SAS1 and SAS4) and their capabilities are discussed and demonstrated, together with the existing SAS2 sequence that is used to generate radiation sources for spent fuel. Particular attention is given to the new SAS4 sequence which provides an automated scheme for generating and using biasing parameters in a subsequent Monte Carlo analysis of a cask.

  10. DNA sequencing: bench to bedside and beyond

    PubMed Central

    Hutchison, Clyde A.

    2007-01-01

    Fifteen years elapsed between the discovery of the double helix (1953) and the first DNA sequencing (1968). Modern DNA sequencing began in 1977, with development of the chemical method of Maxam and Gilbert and the dideoxy method of Sanger, Nicklen and Coulson, and with the first complete DNA sequence (phage ϕX174), which demonstrated that sequence could give profound insights into genetic organization. Incremental improvements allowed sequencing of molecules >200 kb (human cytomegalovirus) leading to an avalanche of data that demanded computational analysis and spawned the field of bioinformatics. The US Human Genome Project spurred sequencing activity. By 1992 the first ‘sequencing factory’ was established, and others soon followed. The first complete cellular genome sequences, from bacteria, appeared in 1995 and other eubacterial, archaebacterial and eukaryotic genomes were soon sequenced. Competition between the public Human Genome Project and Celera Genomics produced working drafts of the human genome sequence, published in 2001, but refinement and analysis of the human genome sequence will continue for the foreseeable future. New ‘massively parallel’ sequencing methods are greatly increasing sequencing capacity, but further innovations are needed to achieve the ‘thousand dollar genome’ that many feel is prerequisite to personalized genomic medicine. These advances will also allow new approaches to a variety of problems in biology, evolution and the environment. PMID:17855400

  11. Applications of mass spectrometry to DNA fingerprinting and DNA sequencing

    SciTech Connect

    Jacobson, K.B.; Buchanan, M.V.; Chen, C.H.; Doktycz, M.J.; McLuckey, S.A.; Arlinghaus, H.F.

    1993-01-01

    DNA fingerprinting and sequencing rely on polyacrylamide gel electrophoresis to determine the sizes of the DNA fragments. Innovative alternatives to polyacrylamide gel electrophoresis are under investigation for characterization of such fingerprinting and sequencing. One method uses stable isotopes of tin and other elements to label the DNA, whereas other procedures do not require labels. The detectors in each case are mass spectrometers that detect either the stable isotopes or the DNA fragments themselves. If successful, these methods will speed up the rate of DNA analysis by one or two orders of magnitude.

  12. Applications of mass spectrometry to DNA fingerprinting and DNA sequencing

    SciTech Connect

    Jacobson, K.B.; Buchanan, M.V.; Chen, C.H.; Doktycz, M.J.; McLuckey, S.A.; Arlinghaus, H.F.

    1993-06-01

    DNA fingerprinting and sequencing rely on polyacrylamide gel electrophoresis to determine the sizes of the DNA fragments. Innovative alternatives to polyacrylamide gel electrophoresis are under investigation for characterization of such fingerprinting and sequencing. One method uses stable isotopes of tin and other elements to label the DNA, whereas other procedures do not require labels. The detectors in each case are mass spectrometers that detect either the stable isotopes or the DNA fragments themselves. If successful, these methods will speed up the rate of DNA analysis by one or two orders of magnitude.

  13. Data management for re-sequencing DNA

    SciTech Connect

    Ying Jiahsu; Gilson, H.; Long, K.; Gibbs, R.A.

    1993-12-31

    The human genome project has greatly stimulated the advancement of techniques to sequence large fragments of DNA. The development of improved molecular methods has also simplified the process of comparing shorter, homologous DNA sequences from different individuals and species. This process of 're-sequencing' DNA has applications in medical genetics, in evolutionary studies, and for the identification of complex molecular variation that may explain multifactorial traits. Intrinsic differences in the processes of 'sequencing' and 're-sequencing' suggest new requirements for data management tools. A data management scheme for a 're-sequencing' project is demonstrated using the Virtual Notebook System, a flexible multi-user tool designed as a metaphor of the laboratory notebook.

  14. Automated Selection Of Pictures In Sequences

    NASA Technical Reports Server (NTRS)

    Rorvig, Mark E.; Shelton, Robert O.

    1995-01-01

    Method of automated selection of film or video motion-picture frames for storage or examination developed. Beneficial in situations in which quantity of visual information available exceeds amount stored or examined by humans in reasonable amount of time, and/or necessary to reduce large number of motion-picture frames to few conveying significantly different information in manner intermediate between movie and comic book or storyboard. For example, computerized vision system monitoring industrial process programmed to sound alarm when changes in scene exceed normal limits.

  15. Automated correction of genome sequence errors

    PubMed Central

    Gajer, Pawel; Schatz, Michael; Salzberg, Steven L.

    2004-01-01

    By using information from an assembly of a genome, a new program called AutoEditor significantly improves base calling accuracy over that achieved by previous algorithms. This in turn improves the overall accuracy of genome sequences and facilitates the use of these sequences for polymorphism discovery. We describe the algorithm and its application in a large set of recent genome sequencing projects. The number of erroneous base calls in these projects was reduced by 80%. In an analysis of over one million corrections, we found that AutoEditor made just one error per 8828 corrections. By substantially increasing the accuracy of base calling, AutoEditor can dramatically accelerate the process of finishing genomes, which involves closing all gaps and ensuring minimum quality standards for the final sequence. It also greatly improves our ability to discover single nucleotide polymorphisms (SNPs) between closely related strains and isolates of the same species. PMID:14744981

  16. Method for sequencing DNA base pairs

    DOEpatents

    Sessler, A.M.; Dawson, J.

    1993-12-14

    The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source. 6 figures.

  17. Nucleotide sequence of bacteriophage fd DNA.

    PubMed Central

    Beck, E; Sommer, R; Auerswald, E A; Kurz, C; Zink, B; Osterburg, G; Schaller, H; Sugimoto, K; Sugisaki, H; Okamoto, T; Takanami, M

    1978-01-01

    The sequence of the 6,408 nucleotides of bacteriophage fd DNA has been determined. This allows the exact organisation of the filamentous phage genome to be deduced and provides easy access to DNA segments of known structure and function. PMID:745987

  18. Streamlining DNA Barcoding Protocols: Automated DNA Extraction and a New cox1 Primer in Arachnid Systematics

    PubMed Central

    Vidergar, Nina; Toplak, Nataša; Kuntner, Matjaž

    2014-01-01

    Background DNA barcoding is a popular tool in taxonomic and phylogenetic studies, but for most animal lineages protocols for obtaining the barcoding sequences—mitochondrial cytochrome C oxidase subunit I (cox1 AKA CO1)—are not standardized. Our aim was to explore an optimal strategy for arachnids, focusing on the species-richest lineage, spiders by (1) improving an automated DNA extraction protocol, (2) testing the performance of commonly used primer combinations, and (3) developing a new cox1 primer suitable for more efficient alignment and phylogenetic analyses. Methodology We used exemplars of 15 species from all major spider clades, processed a range of spider tissues of varying size and quality, optimized genomic DNA extraction using the MagMAX Express magnetic particle processor—an automated high throughput DNA extraction system—and tested cox1 amplification protocols emphasizing the standard barcoding region using ten routinely employed primer pairs. Results The best results were obtained with the commonly used Folmer primers (LCO1490/HCO2198) that capture the standard barcode region, and with the C1-J-2183/C1-N-2776 primer pair that amplifies its extension. However, C1-J-2183 is designed too close to HCO2198 for well-interpreted, continuous sequence data, and in practice the resulting sequences from the two primer pairs rarely overlap. We therefore designed a new forward primer C1-J-2123 60 base pairs upstream of the C1-J-2183 binding site. The success rate of this new primer (93%) matched that of C1-J-2183. Conclusions The use of C1-J-2123 allows full, indel-free overlap of sequences obtained with the standard Folmer primers and with C1-J-2123 primer pair. Our preliminary tests suggest that in addition to spiders, C1-J-2123 will also perform in other arachnids and several other invertebrates. We provide optimal PCR protocols for these primer sets, and recommend using them for systematic efforts beyond DNA barcoding. PMID:25415202

  19. Automated Sequence Preprocessing in a Large-Scale Sequencing Environment

    PubMed Central

    Wendl, Michael C.; Dear, Simon; Hodgson, Dave; Hillier, LaDeana

    1998-01-01

    A software system for transforming fragments from four-color fluorescence-based gel electrophoresis experiments into assembled sequence is described. It has been developed for large-scale processing of all trace data, including shotgun and finishing reads, regardless of clone origin. Design considerations are discussed in detail, as are programming implementation and graphic tools. The importance of input validation, record tracking, and use of base quality values is emphasized. Several quality analysis metrics are proposed and applied to sample results from recently sequenced clones. Such quantities prove to be a valuable aid in evaluating modifications of sequencing protocol. The system is in full production use at both the Genome Sequencing Center and the Sanger Centre, for which combined production is ∼100,000 sequencing reads per week. PMID:9750196

  20. DNA sequencing using fluorescence background electroblotting membrane

    DOEpatents

    Caldwell, K.D.; Chu, T.J.; Pitt, W.G.

    1992-05-12

    A method for the multiplex sequencing of DNA is disclosed which comprises the electroblotting of specific base-terminated DNA fragments, which have been resolved by gel electrophoresis, onto the surface of a neutral non-aromatic polymeric microporous membrane exhibiting low background fluorescence which has been surface modified to contain amino groups. Polypropylene membranes are preferable, and the introduction of amino groups is accomplished by subjecting the membrane to radio or microwave frequency plasma discharge in the presence of an aminating agent, preferably ammonia. The membrane, containing physically adsorbed DNA fragments on its surface after the electroblotting, is then treated with crosslinking means such as UV radiation or a glutaraldehyde spray to chemically bind the DNA fragments to the membrane through amino groups contained on the surface. The DNA fragments chemically bound to the membrane are subjected to hybridization probing with a tagged probe specific to the sequence of the DNA fragments. The tagging may be by either fluorophores or radioisotopes. The tagged probes hybridized to the target DNA fragments are detected and read by laser induced fluorescence detection or autoradiograms. The use of aminated low fluorescent background membranes allows the use of fluorescent detection and reading even when the available amount of DNA to be sequenced is small. The DNA bound to the membranes may be reprobed numerous times. No Drawings

  1. DNA sequencing using fluorescence background electroblotting membrane

    DOEpatents

    Caldwell, Karin D.; Chu, Tun-Jen; Pitt, William G.

    1992-01-01

    A method for the multiplex sequencing of DNA is disclosed which comprises the electroblotting of specific base-terminated DNA fragments, which have been resolved by gel electrophoresis, onto the surface of a neutral non-aromatic polymeric microporous membrane exhibiting low background fluorescence which has been surface modified to contain amino groups. Polypropylene membranes are preferable, and the introduction of amino groups is accomplished by subjecting the membrane to radio or microwave frequency plasma discharge in the presence of an aminating agent, preferably ammonia. The membrane, containing physically adsorbed DNA fragments on its surface after the electroblotting, is then treated with crosslinking means such as UV radiation or a glutaraldehyde spray to chemically bind the DNA fragments to the membrane through said amino groups contained on the surface thereof. The DNA fragments chemically bound to the membrane are subjected to hybridization probing with a tagged probe specific to the sequence of the DNA fragments. The tagging may be by either fluorophores or radioisotopes. The tagged probes hybridized to said target DNA fragments are detected and read by laser induced fluorescence detection or autoradiograms. The use of aminated low fluorescent background membranes allows the use of fluorescent detection and reading even when the available amount of DNA to be sequenced is small. The DNA bound to the membranes may be reprobed numerous times.

  2. Model annotation for synthetic biology: automating model to nucleotide sequence conversion

    PubMed Central

    Misirli, Goksel; Hallinan, Jennifer S.; Yu, Tommy; Lawson, James R.; Wimalaratne, Sarala M.; Cooling, Michael T.; Wipat, Anil

    2011-01-01

    Motivation: The need for the automated computational design of genetic circuits is becoming increasingly apparent with the advent of ever more complex and ambitious synthetic biology projects. Currently, most circuits are designed through the assembly of models of individual parts such as promoters, ribosome binding sites and coding sequences. These low level models are combined to produce a dynamic model of a larger device that exhibits a desired behaviour. The larger model then acts as a blueprint for physical implementation at the DNA level. However, the conversion of models of complex genetic circuits into DNA sequences is a non-trivial undertaking due to the complexity of mapping the model parts to their physical manifestation. Automating this process is further hampered by the lack of computationally tractable information in most models. Results: We describe a method for automatically generating DNA sequences from dynamic models implemented in CellML and Systems Biology Markup Language (SBML). We also identify the metadata needed to annotate models to facilitate automated conversion, and propose and demonstrate a method for the markup of these models using RDF. Our algorithm has been implemented in a software tool called MoSeC. Availability: The software is available from the authors' web site http://research.ncl.ac.uk/synthetic_biology/downloads.html. Contact: anil.wipat@ncl.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21296753

  3. Automated serial extraction of DNA and RNA from biobanked tissue specimens

    PubMed Central

    2013-01-01

    Background With increasing biobanking of biological samples, methods for large scale extraction of nucleic acids are in demand. The lack of such techniques designed for extraction from tissues results in a bottleneck in downstream genetic analyses, particularly in the field of cancer research. We have developed an automated procedure for tissue homogenization and extraction of DNA and RNA into separate fractions from the same frozen tissue specimen. A purpose developed magnetic bead based technology to serially extract both DNA and RNA from tissues was automated on a Tecan Freedom Evo robotic workstation. Results 864 fresh-frozen human normal and tumor tissue samples from breast and colon were serially extracted in batches of 96 samples. Yields and quality of DNA and RNA were determined. The DNA was evaluated in several downstream analyses, and the stability of RNA was determined after 9 months of storage. The extracted DNA performed consistently well in processes including PCR-based STR analysis, HaloPlex selection and deep sequencing on an Illumina platform, and gene copy number analysis using microarrays. The RNA has performed well in RT-PCR analyses and maintains integrity upon storage. Conclusions The technology described here enables the processing of many tissue samples simultaneously with a high quality product and a time and cost reduction for the user. This reduces the sample preparation bottleneck in cancer research. The open automation format also enables integration with upstream and downstream devices for automated sample quantitation or storage. PMID:23957867

  4. The expanding scope of DNA sequencing

    PubMed Central

    Shendure, Jay; Aiden, Erez Lieberman

    2014-01-01

    In just seven years, next-generation technologies have reduced the cost and increased the speed of DNA sequencing by four orders of magnitude, and experiments requiring many millions of sequencing reads are now routine. In research, sequencing is being applied not only to assemble genomes and to investigate the genetic basis of human disease, but also to explore myriad phenomena in organismic and cellular biology. In the clinic, the utility of sequence data is being intensively evaluated in diverse contexts, including reproductive medicine, oncology and infectious disease. A recurrent theme in the development of new sequencing applications is the creative ‘recombination’ of existing experimental building blocks. However, there remain many potentially high-impact applications of next-generation DNA sequencing that are not yet fully realized. PMID:23138308

  5. Sequencing Intractable DNA to Close Microbial Genomes

    SciTech Connect

    Hurt, Jr., Richard Ashley; Brown, Steven D; Podar, Mircea; Palumbo, Anthony Vito; Elias, Dwayne A

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled intractable resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  6. Osmylated DNA, a novel concept for sequencing DNA using nanopores

    NASA Astrophysics Data System (ADS)

    Kanavarioti, Anastassia

    2015-03-01

    Sanger sequencing has led the advances in molecular biology, while faster and cheaper next generation technologies are urgently needed. A newer approach exploits nanopores, natural or solid-state, set in an electrical field, and obtains base sequence information from current variations due to the passage of a ssDNA molecule through the pore. A hurdle in this approach is the fact that the four bases are chemically comparable to each other which leads to small differences in current obstruction. ‘Base calling’ becomes even more challenging because most nanopores sense a short sequence and not individual bases. Perhaps sequencing DNA via nanopores would be more manageable, if only the bases were two, and chemically very different from each other; a sequence of 1s and 0s comes to mind. Osmylated DNA comes close to such a sequence of 1s and 0s. Osmylation is the addition of osmium tetroxide bipyridine across the C5-C6 double bond of the pyrimidines. Osmylation adds almost 400% mass to the reactive base, creates a sterically and electronically notably different molecule, labeled 1, compared to the unreactive purines, labeled 0. If osmylated DNA were successfully sequenced, the result would be a sequence of osmylated pyrimidines (1), and purines (0), and not of the actual nucleobases. To solve this problem we studied the osmylation reaction with short oligos and with M13mp18, a long ssDNA, developed a UV-vis assay to measure extent of osmylation, and designed two protocols. Protocol A uses mild conditions and yields osmylated thymidines (1), while leaving the other three bases (0) practically intact. Protocol B uses harsher conditions and effectively osmylates both pyrimidines, but not the purines. Applying these two protocols also to the complement of the target polynucleotide yields a total of four osmylated strands that collectively could define the actual base sequence of the target DNA.
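How four binary reads could jointly determine the base sequence can be illustrated with a small sketch. It assumes an idealized case that the abstract does not claim: perfectly selective labeling, error-free reads, and complement-strand reads reported 5'→3' (so they are reversed to align with the target). The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def osmylation_read(strand, protocol):
    """Idealized 1/0 read: Protocol A marks T only, Protocol B marks T and C."""
    marked = {"A": "T", "B": "TC"}[protocol]
    return "".join("1" if base in marked else "0" for base in strand)

# (target A, target B, complement A, complement B) at one aligned position.
PATTERN_TO_BASE = {
    ("1", "1", "0", "0"): "T",  # T reacts under both protocols on the target
    ("0", "1", "0", "0"): "C",  # C reacts only under Protocol B on the target
    ("0", "0", "1", "1"): "A",  # A pairs with T on the complement
    ("0", "0", "0", "1"): "G",  # G pairs with C on the complement
}

def decode(target_a, target_b, comp_a, comp_b):
    """Recover the target sequence from the four binary reads."""
    # Reverse the complement reads so that index i refers to target position i.
    comp_a, comp_b = comp_a[::-1], comp_b[::-1]
    bases = []
    for pattern in zip(target_a, target_b, comp_a, comp_b):
        if pattern not in PATTERN_TO_BASE:
            raise ValueError(f"inconsistent label pattern {pattern}")
        bases.append(PATTERN_TO_BASE[pattern])
    return "".join(bases)
```

For example, generating the four reads from a target and its reverse complement with `osmylation_read` and passing them to `decode` returns the original target. Real nanopore data would need an error-tolerant alignment of the four reads rather than this position-by-position lookup.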

  7. Dynamics and control of DNA sequence amplification

    SciTech Connect

    Marimuthu, Karthikeyan; Chakrabarti, Raj

    2014-10-28

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.

  8. Dynamics and control of DNA sequence amplification

    NASA Astrophysics Data System (ADS)

    Marimuthu, Karthikeyan; Chakrabarti, Raj

    2014-10-01

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.

  9. Quadruplex DNA: sequence, topology and structure

    PubMed Central

    Burge, Sarah; Parkinson, Gary N.; Hazel, Pascale; Todd, Alan K.; Neidle, Stephen

    2006-01-01

    G-quadruplexes are higher-order DNA and RNA structures formed from G-rich sequences that are built around tetrads of hydrogen-bonded guanine bases. Potential quadruplex sequences have been identified in G-rich eukaryotic telomeres, and more recently in non-telomeric genomic DNA, e.g. in nuclease-hypersensitive promoter regions. The natural role and biological validation of these structures is starting to be explored, and there is particular interest in them as targets for therapeutic intervention. This survey focuses on the folding and structural features on quadruplexes formed from telomeric and non-telomeric DNA sequences, and examines fundamental aspects of topology and the emerging relationships with sequence. Emphasis is placed on information from the high-resolution methods of X-ray crystallography and NMR, and their scope and current limitations are discussed. Such information, together with biological insights, will be important for the discovery of drugs targeting quadruplexes from particular genes. PMID:17012276

  10. A test matrix sequencer for research test facility automation

    NASA Technical Reports Server (NTRS)

    Mccartney, Timothy P.; Emery, Edward F.

    1990-01-01

    The hardware and software configuration of a Test Matrix Sequencer, a general purpose test matrix profiler that was developed for research test facility automation at the NASA Lewis Research Center, is described. The system provides set points to controllers and contact closures to data systems during the course of a test. The Test Matrix Sequencer consists of a microprocessor controlled system which is operated from a personal computer. The software program, which is the main element of the overall system is interactive and menu driven with pop-up windows and help screens. Analog and digital input/output channels can be controlled from a personal computer using the software program. The Test Matrix Sequencer provides more efficient use of aeronautics test facilities by automating repetitive tasks that were once done manually.

  11. Sequencing mitochondrial DNA polymorphisms by hybridization

    SciTech Connect

    Chee, M.S.; Lockhart, D.J.; Hubbell, E.

    1994-09-01

    We have investigated the use of DNA chips for genetic analysis, using human mitochondrial DNA (mtDNA) as a model. The DNA chips are made up of ordered arrays of DNA oligonucleotide probes, synthesized on a glass substrate using photolithographic techniques. The synthesis site for each different probe is specifically addressed by illumination of the substrate through a photolithographic mask, achieving selective deprotection. Nucleoside phosphoramidites bearing photolabile protecting groups are coupled only to exposed sites. Repeated cycles of deprotection and coupling generate all the probes in parallel. The set of 4^N N-mer probes can be synthesized in only 4N steps. Any subset can be synthesized in 4N or fewer steps. Sequences amplified from the D-loop region of human mtDNA were fluorescently labelled and hybridized to DNA chips containing probes specific for mtDNA. Each nucleotide of a 1.3 kb region spanning the D loop is represented by four probes on the chip. Each probe has a different base at the position of interest: together they comprise a set of A, C, G and T probes which are otherwise identical. In principle, only one probe-target hybrid will be a perfect match. The other three will be single base mismatches. Fluorescence imaging of the hybridized chip allows quantification of hybridization signals. Heterozygous mixtures of sequences can also be characterized. We have developed software to quantitate and interpret the hybridization signals, and to call the sequence automatically. Results of sequence analysis of human mtDNAs will be presented.
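
    The four-probes-per-position design and the idea of calling each base from the strongest hybridization signal can be sketched as below. This is an illustrative assumption, not the authors' software: the function names, the flank length, and the min_ratio threshold are hypothetical, and probes are written in the sense of the reference rather than as its complement.

```python
BASES = "ACGT"


def tiling_probes(reference, pos, flank=7):
    """Four probes centred on position pos, identical except for the central base."""
    left = reference[max(0, pos - flank):pos]
    right = reference[pos + 1:pos + 1 + flank]
    return {base: left + base + right for base in BASES}


def call_base(intensities, min_ratio=1.5):
    """Pick the base whose probe gives the strongest signal.

    Returns "N" when the best signal does not clearly exceed the runner-up,
    as might happen for a mixture of sequences (threshold is an assumption).
    """
    ranked = sorted(intensities.items(), key=lambda item: item[1], reverse=True)
    (best, top), (_, second) = ranked[0], ranked[1]
    if second > 0 and top / second < min_ratio:
        return "N"
    return best


def call_sequence(per_position_intensities):
    """Call one base per interrogated position."""
    return "".join(call_base(signal) for signal in per_position_intensities)
```

    A position where the A probe reads 900 and the others read about 100 is called as A, while two nearly equal top signals produce an ambiguous call.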

  12. Quantum-Sequencing: Fast electronic single DNA molecule sequencing

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective, single-molecule sequencing method. Here, we present the first demonstration of a unique "electronic fingerprint" of all nucleotides (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic state of the nucleobases shifts depending on the pH, with the most distinct states identified at acidic pH. We also demonstrate identification of single nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of the beta lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.

  13. Automated screening for small organic ligands using DNA-encoded chemical libraries.

    PubMed

    Decurtins, Willy; Wichert, Moreno; Franzini, Raphael M; Buller, Fabian; Stravs, Michael A; Zhang, Yixin; Neri, Dario; Scheuermann, Jörg

    2016-04-01

    DNA-encoded chemical libraries (DECLs) are collections of organic compounds that are individually linked to different oligonucleotides, serving as amplifiable identification barcodes. As all compounds in the library can be identified by their DNA tags, they can be mixed and used in affinity-capture experiments on target proteins of interest. In this protocol, we describe the screening process that allows the identification of the few binding molecules within the multiplicity of library members. First, the automated affinity selection process physically isolates binding library members. Second, the DNA codes of the isolated binders are PCR-amplified and subjected to high-throughput DNA sequencing. Third, the obtained sequencing data are evaluated using a C++ program and the results are displayed using MATLAB software. The resulting selection fingerprints facilitate the discrimination of binding from nonbinding library members. The described procedures allow the identification of small organic ligands to biological targets from a DECL within 10 d. PMID:26985574

  14. Automated carboxy-terminal sequence analysis of peptides.

    PubMed Central

    Bailey, J. M.; Shenoy, N. R.; Ronk, M.; Shively, J. E.

    1992-01-01

    Proteins and peptides can be sequenced from the carboxy-terminus with isothiocyanate reagents to produce amino acid thiohydantoin derivatives. Previous studies in our laboratory have focused on solution phase conditions for formation of the peptidylthiohydantoins with trimethylsilylisothiocyanate (TMS-ITC) and for hydrolysis of these peptidylthiohydantoins into an amino acid thiohydantoin derivative and a new shortened peptide capable of continued degradation (Bailey, J. M. & Shively, J. E., 1990, Biochemistry 29, 3145-3156). The current study is a continuation of this work and describes the construction of an instrument for automated C-terminal sequencing, the application of the thiocyanate chemistry to peptides covalently coupled to a novel polyethylene solid support (Shenoy, N. R., Bailey, J. M., & Shively, J. E., 1992, Protein Sci. 1, 58-67), the use of sodium trimethylsilanolate as a novel reagent for the specific cleavage of the derivatized C-terminal amino acid, and the development of methodology to sequence through the difficult amino acid, aspartate. Automated programs are described for the C-terminal sequencing of peptides covalently attached to carboxylic acid-modified polyethylene. The chemistry involves activation with acetic anhydride, derivatization with TMS-ITC, and cleavage of the derivatized C-terminal amino acid with sodium trimethylsilanolate. The thiohydantoin amino acid is identified by on-line high performance liquid chromatography using a Phenomenex Ultracarb 5 ODS(30) column and a triethylamine/phosphoric acid buffer system containing pentanesulfonic acid. The generality of our automated C-terminal sequencing methodology was examined by sequencing model peptides containing all 20 of the common amino acids. All of the amino acids were found to sequence in high yield (90% or greater) except for asparagine and aspartate, which could be only partially removed, and proline, which was found not to be capable of derivatization. In spite of these

  15. Fluorogenic DNA Sequencing in PDMS Microreactors

    PubMed Central

    Sims, Peter A.; Greenleaf, William J.; Duan, Haifeng; Xie, X. Sunney

    2012-01-01

    We have developed a multiplex sequencing-by-synthesis method combining terminal-phosphate labeled fluorogenic nucleotides (TPLFNs) and resealable microreactors. In the presence of phosphatase, the incorporation of a non-fluorescent TPLFN into a DNA primer by DNA polymerase results in a fluorophore. We immobilize DNA templates within polydimethylsiloxane (PDMS) microreactors, sequentially introduce one of the four identically labeled TPLFNs, seal the microreactors, allow template-directed TPLFN incorporation, and measure the signal from the fluorophores trapped in the microreactors. This workflow allows sequencing in a manner akin to pyrosequencing but without constant monitoring of each microreactor. With cycle times of <10 minutes, we demonstrate 30 base reads with ∼99% raw accuracy. “Fluorogenic pyrosequencing” combines benefits of pyrosequencing, such as rapid turn-around, native DNA generation, and single-color detection, with benefits of fluorescence-based approaches, such as highly sensitive detection and simple parallelization. PMID:21666670
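
    The cycle of sequentially introducing one nucleotide type and reading the trapped signal can be illustrated with an idealized flow simulation. The sketch assumes that the signal in each flow is proportional to the number of bases incorporated (so a homopolymer run gives a proportionally larger signal) and that incorporation is complete and error free; the function names and the flow order "ACGT" are hypothetical choices, not details taken from the abstract.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def flowgram(template, flow_order="ACGT", max_flows=40):
    """Simulate idealized flows over a template.

    The template is read in the direction the polymerase travels. Each flow
    yields (nucleotide, number of incorporations in that flow).
    """
    pos = 0
    flows = []
    for i in range(max_flows):
        if pos >= len(template):
            break  # template fully copied
        nucleotide = flow_order[i % len(flow_order)]
        count = 0
        # Extend while the next template base pairs with the flowed nucleotide.
        while pos < len(template) and COMPLEMENT[template[pos]] == nucleotide:
            count += 1
            pos += 1
        flows.append((nucleotide, count))
    return flows


def read_from_flows(flows):
    """Reconstruct the synthesized strand from the per-flow counts."""
    return "".join(nucleotide * count for nucleotide, count in flows)
```

    For the template "TTGCA", the simulated flows are A×2, C×1, G×1, T×1, which reads out as the synthesized strand "AACGT".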

  16. Statistical and linguistic features of DNA sequences

    NASA Technical Reports Server (NTRS)

    Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
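
    The entropy and redundancy measures mentioned above can be sketched for DNA "words" of length k. The definitions below are the standard Shannon ones (redundancy taken as one minus the ratio of observed to maximal entropy); the choice of overlapping k-mers as words and the function names are assumptions for illustration and may differ from the authors' exact procedure.

```python
import math
from collections import Counter


def kmer_entropy(seq, k=1):
    """Shannon entropy in bits of the overlapping k-mer distribution of seq."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    total = len(kmers)
    if total == 0:
        return 0.0
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy(seq, k=1):
    """1 - H / H_max, where H_max = log2(4**k) = 2k bits for a four-letter alphabet."""
    return 1.0 - kmer_entropy(seq, k) / (2.0 * k)


def rank_frequency(seq, k=3):
    """Zipf-style table: k-mer counts sorted from most to least frequent."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return Counter(kmers).most_common()
```

    A sequence that uses all four bases equally has 2 bits of single-base entropy and zero redundancy, while a homopolymer has zero entropy and redundancy 1.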

  17. Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.

    PubMed

    Gupta, P D

    2016-10-01

    In health care, the importance of DNA sequencing has been fully established. Sanger capillary electrophoresis DNA sequencing is time consuming and cumbersome, and hence expensive. Lately, because of its versatility, DNA sequencing has become a household name, and there is therefore an urgent need for a simple, fast, inexpensive DNA sequencing technology. Efforts were made at the beginning of this century, and nanopore DNA sequencing technology was developed; it is still in its infancy but is nevertheless the futuristic technology. PMID:27605732

  18. Replication pattern of human repeated DNA sequences.

    PubMed

    Meneveri, R; Agresti, A; Breviario, D; Ginelli, E

    1984-10-01

    Either aphidicolin- or thymidine-synchronized human HL-60 cells were used to study the replication pattern of a family of human repetitive DNA sequences, the Eco RI 340 bp family (alpha RI-DNA), and of the ladders of fragments generated in total human DNA after digestion with XbaI and HaeIII (alpha satellite sequences). DNAs replicated in early, middle-early, middle-late and late S periods were labelled with BUdR or with [3H]thymidine. The efficiency of the cell synchronization procedure was confirmed by the transition from a high-GC to a high-AT average base composition of the DNA synthesized going from early to late S periods. By hybridizing EcoRI 340 bp repetitive fragments to BUdR-DNAs it was found that this family of sequences is replicated throughout the entire S period. Comparing fluorograph densitometric scans of [3H]DNAs to the scans of ethidium bromide patterns of total HL-60 DNA digested with XbaI and HaeIII, it was observed that DNA synthesized in different S periods is characterized by approximately the same ladder of fragments, while the intensity of each band may vary through the S phase; in particular, the XbaI 2.4 kb fragment becomes undetectable in late S. PMID:6089891

  19. Sequence change and phylogenetic signal in muscoid COII DNA sequences.

    PubMed

    Szalanski, Allen L; Owens, Carrie B

    2003-08-01

    The complete DNA sequences of the mtDNA cytochrome oxidase II gene from house fly, Musca domestica, face fly, Musca autumnalis, stable fly, Stomoxys calcitrans, horn fly, Haematobia irritans, and black garbage fly, Hydrotaea aenescens, are reported. The nucleotide sequence codes for a 229 amino acid peptide. The COII sequence is A + T rich (74.1%), with up to 12.3% nucleotide and 8.4% amino acid divergence among the five taxa. Of the 688 nucleotides encoding the gene, 135 nucleotide sites (19.6%) are variable, and 55 (8.0%) are phylogenetically informative. A phylogenetic analysis, using three calliphorids as the outgroup taxa, indicates that the two haematophagous species, horn fly and stable fly, form a sister group. PMID:14631656
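
    The summary statistics quoted above (A + T content, pairwise divergence, variable sites, and phylogenetically informative sites) can be computed from an alignment along the following lines. This is a minimal sketch with hypothetical function names; it assumes gap-free, equal-length aligned sequences, uses uncorrected p-distance for divergence, and uses the usual parsimony criterion for informative sites, none of which is confirmed as the authors' exact method.

```python
from collections import Counter


def at_content(seq):
    """Fraction of A and T bases in a sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def p_distance(a, b):
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def variable_sites(alignment):
    """Number of alignment columns with more than one state."""
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


def informative_sites(alignment):
    """Columns in which at least two states each occur in at least two sequences."""
    count = 0
    for column in zip(*alignment):
        shared_states = [n for n in Counter(column).values() if n >= 2]
        if len(shared_states) >= 2:
            count += 1
    return count
```

    As a consistency check on the reported figures, 135 of 688 sites is 19.6% and 55 of 688 is 8.0%.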

  20. Automated synthesis of distillation sequences using fuzzy logic and simulation

    SciTech Connect

    Flowers, T.L.; Harrison, B.K.; Niccolai, M.J.

    1994-08-01

    An automated distillation sequencing system (DSEQSYS) is presented, which consists of three components: a control program, a fuzzy heuristic synthesis program, and a process simulator. DSEQSYS, when applied to problems previously reported in the literature, overcomes some of the disadvantages of using heuristics or mathematical programming alone. DSEQSYS can address problems involving nonsharp separations, nonideal chemical behavior, and conflicting heuristics. A simple approach for converting the traditional separation heuristics into corresponding fuzzy heuristics is also demonstrated.

  1. DNA Sequencing in Cultural Heritage.

    PubMed

    Vai, Stefania; Lari, Martina; Caramelli, David

    2016-02-01

    During the last three decades, DNA analysis on degraded samples revealed itself as an important research tool in anthropology, archaeozoology, molecular evolution, and population genetics. Application on topics such as determination of species origin of prehistoric and historic objects, individual identification of famous personalities, characterization of particular samples important for historical, archeological, or evolutionary reconstructions, confers to the paleogenetics an important role also for the enhancement of cultural heritage. A really fast improvement in methodologies in recent years led to a revolution that permitted recovering even complete genomes from highly degraded samples with the possibility to go back in time 400,000 years for samples from temperate regions and 700,000 years for permafrozen remains and to analyze even more recent material that has been subjected to hard biochemical treatments. Here we propose a review on the different methodological approaches used so far for the molecular analysis of degraded samples and their application on some case studies. PMID:27572991

  2. A microchannel electrophoresis DNA sequencing system

    SciTech Connect

    Madabhushi, R S; Warth, T; Balch, J W; Bass, M; Brewer, L R; Copeland, A C; Davidson, J C; Fitch, J P; Kegelmeyer, L M; Kimbrough, J R; McCready, P; Nelson, D; Pastrone, R L; Richardson, P M; Swierkowski, S P; Tarte, L A; Vainer, M

    1999-01-01

    In order to increase the DNA sequencing throughput of the Joint Genome Institute, we have developed a microchannel electrophoresis system. The critical new and unique elements of this system include 1) a process for the production of arrays of 96 and 384 microchannels on bonded glass substrates up to 14 x 58 cm and 2) new sieving media for high resolution and high speed separations. With custom fabrication apparatus, microchannels are etched in a borosilicate substrate, and then fusion bonded to a top substrate 1.1 mm thick that has access holes formed in it. SEM examination shows a typical microchannel to be 40 micrometers deep x 180 micrometers wide by 46 cm long. This technology offers significant advantages over discrete capillaries or conventional slab-gel approaches. High throughput DNA sequencing with over 550 base pairs resolution has been achieved in roughly half the time of conventional sequencers. In February 1999, we begin a pre-production evaluation protocol for the microchannel and for three glass capillary electrophoresis systems (two from industry and one developed by Lawrence Berkeley National Laboratory for the Joint Genome Institute). In order to utilize these instruments for DNA production sequencing, we have been evaluating and implementing software to convert raw electropherograms into called DNA bases with an associated probability of error. Our original intent was to utilize the DNA base calling software known as Plan and Phred developed by the University of Washington. This software has been outstanding for our slab gel electrophoresis systems currently in the production facility. In our tests and evaluations of this software applied to microchannel data, we observed that the electropherograms are of a different statistical and underlying signal structure compared to slab gels. Even with substantial modifications to the software, base calling performance was not satisfactory for the microchannel data. In this paper, we will present o The

  3. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.

  4. DNA Sequence Alignment during Homologous Recombination.

    PubMed

    Greene, Eric C

    2016-05-27

    Homologous recombination allows for the regulated exchange of genetic information between two different DNA molecules of identical or nearly identical sequence composition, and is a major pathway for the repair of double-stranded DNA breaks. A key facet of homologous recombination is the ability of recombination proteins to perfectly align the damaged DNA with homologous sequence located elsewhere in the genome. This reaction is referred to as the homology search and is akin to the target searches conducted by many different DNA-binding proteins. Here I briefly highlight early investigations into the homology search mechanism, and then describe more recent research. Based on these studies, I summarize a model that includes a combination of intersegmental transfer, short-distance one-dimensional sliding, and length-specific microhomology recognition to efficiently align DNA sequences during the homology search. I also suggest some future directions to help further our understanding of the homology search. Where appropriate, I direct the reader to other recent reviews describing various issues related to homologous recombination. PMID:27129270

  5. DNA sequencing via transverse electronic transport

    NASA Astrophysics Data System (ADS)

    Lagerqvist, Johan; Zwolak, Michael; di Ventra, Massimiliano

    2006-03-01

    Recently, it was theoretically shown that transverse current measurements could be used to distinguish the different bases of single stranded DNA. [1] If electrodes are embedded in a device, e.g., a nanopore, which allows translocation of ss-DNA, the strand can be sequenced by continuous measurement of the current in the direction perpendicular to the DNA backbone. [1] However, variations of the electronic signatures of each base in a real device due to structural fluctuations, counter-ions, water and other sources of noise will be important obstacles to overcome in order to make this theoretical proposal a reality. In order to explore these effects we have coupled molecular dynamics simulations with transport calculations to obtain the real time transverse current of ss-DNA translocating into a nanopore. We find that distributions of currents for each base are indeed different even in the presence of all the sources of noise discussed above. These results further support the original proposal [1] that fast DNA sequencing could be done using transverse current measurements. Work supported by the National Human Genome Research Institute. [1] M. Zwolak and M. Di Ventra, "Electronic Signature of DNA Nucleotides via Transverse Transport", Nano Lett. 5, 421 (2005).

  6. Imaging of DNA sequences with chemiluminescence

    SciTech Connect

    Tizard, R.; Cate, R.L.; Ramachandran, K.L.; Wysk, M.; Voyta, J.C.; Murphy, O.J.; Bronstein, I.

    1990-06-01

    We have coupled a chemiluminescent detection method that uses an alkaline phosphatase label to the genomic DNA sequencing protocol of Church and Gilbert. Images of sequence ladders are obtained on x-ray film with exposure times of less than 30 min, as compared to 40 h required for a similar exposure with a 32P-labeled oligomer. Chemically cleaved DNA from a sequencing gel is transferred to a nylon membrane, and specific sequence ladders are selected by hybridization to DNA oligonucleotides labeled with alkaline phosphatase or with biotin, leading directly or indirectly to deposition of enzyme. If a biotinylated probe is used, an incubation with avidin-alkaline phosphatase conjugate follows. The membrane is soaked in the chemiluminescent substrate (AMPPD) and is exposed to film. Dephosphorylation of AMPPD leads in a two-step pathway to a highly localized emission of visible light. The demonstrated shorter exposure times may improve the efficiency of a serial reprobing strategy such as the multiplex sequencing approach of Church and Kieffer-Higgins.

  7. The DNA sequence of human chromosome 7.

    PubMed

    Hillier, Ladeana W; Fulton, Robert S; Fulton, Lucinda A; Graves, Tina A; Pepin, Kymberlie H; Wagner-McPherson, Caryn; Layman, Dan; Maas, Jason; Jaeger, Sara; Walker, Rebecca; Wylie, Kristine; Sekhon, Mandeep; Becker, Michael C; O'Laughlin, Michelle D; Schaller, Mark E; Fewell, Ginger A; Delehaunty, Kimberly D; Miner, Tracie L; Nash, William E; Cordes, Matt; Du, Hui; Sun, Hui; Edwards, Jennifer; Bradshaw-Cordum, Holland; Ali, Johar; Andrews, Stephanie; Isak, Amber; Vanbrunt, Andrew; Nguyen, Christine; Du, Feiyu; Lamar, Betty; Courtney, Laura; Kalicki, Joelle; Ozersky, Philip; Bielicki, Lauren; Scott, Kelsi; Holmes, Andrea; Harkins, Richard; Harris, Anthony; Strong, Cynthia Madsen; Hou, Shunfang; Tomlinson, Chad; Dauphin-Kohlberg, Sara; Kozlowicz-Reilly, Amy; Leonard, Shawn; Rohlfing, Theresa; Rock, Susan M; Tin-Wollam, Aye-Mon; Abbott, Amanda; Minx, Patrick; Maupin, Rachel; Strowmatt, Catrina; Latreille, Phil; Miller, Nancy; Johnson, Doug; Murray, Jennifer; Woessner, Jeffrey P; Wendl, Michael C; Yang, Shiaw-Pyng; Schultz, Brian R; Wallis, John W; Spieth, John; Bieri, Tamberlyn A; Nelson, Joanne O; Berkowicz, Nicolas; Wohldmann, Patricia E; Cook, Lisa L; Hickenbotham, Matthew T; Eldred, James; Williams, Donald; Bedell, Joseph A; Mardis, Elaine R; Clifton, Sandra W; Chissoe, Stephanie L; Marra, Marco A; Raymond, Christopher; Haugen, Eric; Gillett, Will; Zhou, Yang; James, Rose; Phelps, Karen; Iadanoto, Shawn; Bubb, Kerry; Simms, Elizabeth; Levy, Ruth; Clendenning, James; Kaul, Rajinder; Kent, W James; Furey, Terrence S; Baertsch, Robert A; Brent, Michael R; Keibler, Evan; Flicek, Paul; Bork, Peer; Suyama, Mikita; Bailey, Jeffrey A; Portnoy, Matthew E; Torrents, David; Chinwalla, Asif T; Gish, Warren R; Eddy, Sean R; McPherson, John D; Olson, Maynard V; Eichler, Evan E; Green, Eric D; Waterston, Robert H; Wilson, Richard K

    2003-07-10

    Human chromosome 7 has historically received prominent attention in the human genetics community, primarily related to the search for the cystic fibrosis gene and the frequent cytogenetic changes associated with various forms of cancer. Here we present more than 153 million base pairs representing 99.4% of the euchromatic sequence of chromosome 7, the first metacentric chromosome completed so far. The sequence has excellent concordance with previously established physical and genetic maps, and it exhibits an unusual amount of segmentally duplicated sequence (8.2%), with marked differences between the two arms. Our initial analyses have identified 1,150 protein-coding genes, 605 of which have been confirmed by complementary DNA sequences, and an additional 941 pseudogenes. Of genes confirmed by transcript sequences, some are polymorphic for mutations that disrupt the reading frame. PMID:12853948

  8. Nanopore-CMOS Interfaces for DNA Sequencing.

    PubMed

    Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim

    2016-01-01

    DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces. PMID:27509529

  9. Repetitive DNA sequences in Mycoplasma pneumoniae.

    PubMed Central

    Wenzel, R; Herrmann, R

    1988-01-01

    Two types of different repetitive DNA sequences called RepMP1 and RepMP2 were identified in the genome of Mycoplasma pneumoniae. The number of these repeated elements, their nucleotide sequence and their localization on a physical map of the M. pneumoniae genome were determined. The results show that RepMP1 appears at least 10 times and RepMP2 at least 8 times in the genome. The repeated elements are dispersed on the chromosome and, in three cases, linked to each other by a homologous DNA sequence of 400 bp. The elements themselves are 300 bp (for RepMP1) and 150 bp (for RepMP2) long showing a high degree of homology. One copy of RepMP2 is a translated part of the gene for the major cytadhesin protein P1 which is responsible for the adsorption of M. pneumoniae to its host cell. PMID:3138660

  10. Sequence-specific DNA nicking endonucleases.

    PubMed

    Xu, Shuang-yong

    2015-08-01

    A group of small HNH nicking endonucleases (NEases) was discovered recently from phage or prophage genomes that nick double-stranded DNA sites ranging from 3 to 5 bp in the presence of Mg2+ or Mn2+. The cosN site of phage HK97 contains a gp74 nicking site AC↑CGC, which is similar to AC↑CGR (R=A/G) of N.ϕGamma encoded by Bacillus phage Gamma. A minimal nicking domain of 76 amino acid residues from N.ϕGamma could be fused to other DNA binding partners to generate chimeric NEases with new specificities. The biological roles of a few small HNH endonucleases (HNHE, gp74 of HK97, gp37 of ϕSLT, ϕ12 HNHE) have been demonstrated in phage and pathogenicity island DNA packaging. Another group of NEases with 3- to 7-bp specificities are either natural components of restriction systems or engineered from type IIS restriction endonucleases. A phage group I intron-encoded HNH homing endonuclease, I-PfoP3I, was found to nick DNA sites of 14-16 bp. I-TslI encoded by T7-like ΦI appeared to nick DNA sites with a 9-bp core sequence. DNA nicking and labeling have been applied to optical mapping to aid genome sequence assembly and detection of large insertion/deletion mutations in genomic DNA of cancer cells. Nicking enzyme-mediated amplification reaction has been applied to rapid diagnostic testing of influenza A and B in clinical settings and for construction of DNA-based Boolean logic gates. The clustered regularly interspaced short palindromic repeats-ribonucleoprotein complex consisting of engineered Cas9 nickases in conjunction with tracrRNA:crRNA or a single-guide RNA has been successfully used in genome modifications. PMID:26352356
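
    The two recognition sites quoted above, AC↑CGC and AC↑CGR (R = A or G), lend themselves to a simple pattern search. The sketch below is an illustration under stated assumptions: the dictionary keys and the function name are hypothetical, only the strand passed in is scanned (the opposite strand would need its reverse complement), and a lookahead is used so that overlapping sites are not missed.

```python
import re

# Recognition pattern and offset of the nick from the start of the site.
# Both sites are nicked after the first two bases (AC^CG...).
SITES = {
    "gp74": ("ACCGC", 2),          # AC^CGC, cosN site of phage HK97
    "N.phiGamma": ("ACCG[AG]", 2), # AC^CGR with R = A or G
}


def nick_positions(strand, enzyme):
    """0-based index of the base immediately 3' of each nick on the given strand."""
    pattern, offset = SITES[enzyme]
    # Zero-width lookahead lets finditer report overlapping matches.
    regex = re.compile("(?=(%s))" % pattern)
    return [match.start() + offset for match in regex.finditer(strand.upper())]
```

    For the strand "TTACCGATT", the N.phiGamma pattern matches at index 2 and the nick falls between indices 3 and 4, so the function reports position 4.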

  11. Toward a visualization of DNA sequences.

    PubMed

    Cox, David N; Tharp, Alan L

    2010-01-01

    Most biologists associate pattern discovery in DNA with finding repetitive sequences or commonalities across several sequences. However, pattern discovery is not limited to finding repetitions and commonalities. Pattern discovery also involves identifying objects and distinguishing objects from one another. Human vision is unmatched in its ability to identify and distinguish objects. Considerable research into human vision has revealed to a fair degree the visual cues that our brains use to segment an image into separate regions and entities. In this paper, we consider some of these visual cues to construct a novel graphical representation of a DNA sequence. We exploit one of these cues, proximity, to segment DNA into visibly distinct regions and structures. We also demonstrate how to manipulate proximity to identify motifs visually. Lastly, we demonstrate how an additional cue, color, can be used to visualize the Shannon entropy associated with different structures. The presence of large numbers of such regions and structures in DNA suggests that they likely play some important biological role and would be interesting targets for further research. PMID:20865527

  12. DNA Sequencing Using an Engineered Protein Nanopore

    NASA Astrophysics Data System (ADS)

    Gundlach, Jens H.

    2010-03-01

    Inexpensive and fast sequencing of DNA is of paramount importance to medicine, the life sciences and to many other applications. Because of the nanometer diameter of DNA a nanometer-scale reader directly interfaced to macroscopic observables seems particularly attractive. We are working on a new single molecule technique based on a biological pore embedded in a lipid bilayer. When a voltage is applied across the bilayer an ion current is measured that flows through the nanometer opening of the pore. Poly-negatively charged single stranded DNA passes through the pore and reduces the ion current with the remaining ion current being indicative of the nucleotide type in the constriction of the pore. The protein pore that we introduced to the field, MspA, has a shape ideally suited to nanopore sequencing, has robustness comparable to solid state devices, is easily reproduced with sub-nanometer level precision and is engineerable using genetic mutations. I will present proof-of-principle data showing that this technique can lead to a direct very inexpensive and fast sequencing technology. The experimental electronic signatures of the DNA translocation process provide an ideal test bed for molecular dynamics simulations, which in turn allows developing intuition and prediction of nanoscale dynamics.

  13. "Doublex" fluorescent DNA sequencing: two independent sequences obtained simultaneously in one reaction with internal labeling and unlabeled primers.

    PubMed

    Wiemann, S; Stegemann, J; Zimmermann, J; Voss, H; Benes, V; Ansorge, W

    1996-02-15

    The novel "doublex" DNA sequencing technique that makes it possible to obtain simultaneously two independent sequences from one sequencing reaction with the use of unlabeled primers and internal labeling is described. The different sequencing products are labeled in parallel with fluorescein-15-dATP and Texas red-5-dCTP present in the same tube. The characteristics of T7 DNA polymerase are exploited to ensure that only either of the labeled dNTPs is incorporated into the corresponding sequencing products. Specificity of labeling is ensured by the selection of primers. One of the unlabeled primers is chosen to be followed by an "A," the other by a "C" to be incorporated immediately downstream from the primer binding site. The doublex sequencing technique is applicable to the simultaneous sequencing of either the same DNA template/strand or a mixture of different templates. Combinations of unlabeled and labeled primers in the same sequencing reaction are also possible. The two sequences can be determined in parallel and on-line in the same lanes of a gel with a novel automated DNA sequencer, which was previously described for use with labeled primers. PMID:8714594

  14. Local Renyi entropic profiles of DNA sequences

    PubMed Central

    Vinga, Susana; Almeida, Jonas S

    2007-01-01

    Background In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. Results The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at . Conclusion The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures. PMID:17939871
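
    The abstract above builds on Chaos Game Representation (CGR) coordinates and a Parzen-window density estimate. As a rough illustration only, the sketch below maps a sequence to CGR points and computes an order-2 (quadratic) Rényi entropy with Gaussian kernels. The corner assignment, the choice of order 2, the kernel width `sigma`, and the function names are all assumptions made for this example; this is not the authors' Matlab code and does not reproduce their fractal kernel.

```python
from math import exp, log, pi
from typing import List, Tuple

# Assumed corner assignment for the unit square (a common CGR convention).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq: str) -> List[Tuple[float, float]]:
    """Map each prefix of seq to a point: move halfway to the base's corner."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq:
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def renyi2_entropy(points: List[Tuple[float, float]], sigma: float = 0.1) -> float:
    """Quadratic Renyi entropy of a Gaussian Parzen estimate (hypothetical helper).

    Convolving two isotropic Gaussians of variance sigma^2 gives one of
    variance 2*sigma^2, so the integral of the squared density reduces to
    an average of pairwise Gaussian kernel values.
    """
    n = len(points)
    var = 2.0 * sigma * sigma
    norm = 1.0 / (2.0 * pi * var)
    total = 0.0
    for x1, y1 in points:
        for x2, y2 in points:
            d2 = (x1 - x2) ** 2 + (y1 - y2) ** 2
            total += norm * exp(-d2 / (2.0 * var))
    return -log(total / (n * n))
```

    Under these assumptions a homopolymer run collapses toward one corner and yields a lower entropy than a sequence whose points spread over the square, which is the intuition behind using the measure as a randomness estimate.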

  15. Linguistic features of noncoding DNA sequences

    NASA Astrophysics Data System (ADS)

    Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C.-K.; Simons, M.; Stanley, H. E.

    1994-12-01

    We extend the Zipf approach to analyzing linguistic texts to the statistical study of DNA base pair sequences, and find that the noncoding regions are more similar to natural languages than the coding regions. We also adapt the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function, and demonstrate that noncoding regions in eukaryotes display a smaller entropy and larger redundancy than coding regions, supporting the possibility that noncoding regions of DNA may carry biological information.
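
    The entropy and redundancy comparison described above can be illustrated with a small sketch. It assumes one common definition, redundancy = 1 - H(n)/(2n), where H(n) is the Shannon entropy in bits of overlapping n-tuples and 2n bits is the maximum for a four-letter alphabet. The function names are hypothetical and the exact estimator used in the paper may differ.

```python
from collections import Counter
from math import log2


def block_entropy(seq: str, n: int) -> float:
    """Shannon entropy (bits) of the overlapping n-tuples found in seq."""
    words = [seq[i:i + n] for i in range(len(seq) - n + 1)]
    counts = Counter(words)
    total = len(words)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def redundancy(seq: str, n: int) -> float:
    """1 - H(n)/H_max, with H_max = 2n bits for the alphabet {A, C, G, T}."""
    return 1.0 - block_entropy(seq, n) / (2.0 * n)
```

    With this definition a sequence that uses all four bases equally often has zero single-base redundancy, while a homopolymer has redundancy 1; real sequences fall in between, and estimates for larger n need long sequences to be reliable.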

  16. Metagenomics: DNA sequencing of environmental samples

    SciTech Connect

    Tringe, Susannah Green; Rubin, Edward M.

    2005-09-01

    While genomics has classically focused on pure, easy-to-obtain samples, such as microbes that grow readily in culture or large animals and plants, these organisms represent but a fraction of the living or once living organisms of interest. Many species are difficult to study in isolation, because they fail to grow in laboratory culture, depend on other organisms for critical processes, or have become extinct. DNA sequence-based methods circumvent these obstacles, as DNA can be directly isolated from live or dead cells in a variety of contexts, and have led to the emergence of a new field referred to as metagenomics.

  17. Compilation of DNA sequences of Escherichia coli

    PubMed Central

    Kröger, Manfred

    1989-01-01

    We have compiled the DNA sequence data for E.coli K12 available from the GENBANK and EMBO databases and over a period of several years independently from the literature. We have introduced all available genetic map data and have arranged the sequences accordingly. As far as possible the overlaps are deleted and a total of 940,449 individual bp is found to be determined till the beginning of 1989. This corresponds to a total of 19.92% of the entire E.coli chromosome consisting of about 4,720 kbp. This number may actually be higher by some extra 2% derived from the sequence of lysogenic bacteriophage lambda and the various insertion sequences. This compilation may be available in machine readable form from one of the international databanks in some future. PMID:2654890

  18. Effective Automated Feature Construction and Selection for Classification of Biological Sequences

    PubMed Central

    Kamath, Uday; De Jong, Kenneth; Shehu, Amarda

    2014-01-01

    Background Many open problems in bioinformatics involve elucidating underlying functional signals in biological sequences. DNA sequences, in particular, are characterized by rich architectures in which functional signals are increasingly found to combine local and distal interactions at the nucleotide level. Problems of interest include detection of regulatory regions, splice sites, exons, hypersensitive sites, and more. These problems naturally lend themselves to formulation as classification problems in machine learning. When classification is based on features extracted from the sequences under investigation, success is critically dependent on the chosen set of features. Methodology We present an algorithmic framework (EFFECT) for automated detection of functional signals in biological sequences. We focus here on classification problems involving DNA sequences which state-of-the-art work in machine learning shows to be challenging and involve complex combinations of local and distal features. EFFECT uses a two-stage process to first construct a set of candidate sequence-based features and then select a most effective subset for the classification task at hand. Both stages make heavy use of evolutionary algorithms to efficiently guide the search towards informative features capable of discriminating between sequences that contain a particular functional signal and those that do not. Results To demonstrate its generality, EFFECT is applied to three separate problems of importance in DNA research: the recognition of hypersensitive sites, splice sites, and ALU sites. Comparisons with state-of-the-art algorithms show that the framework is both general and powerful. 
    In addition, a detailed analysis of the constructed features shows that they contain valuable biological information about DNA architecture, allowing biologists and other researchers to directly inspect the features and potentially use the insights obtained to assist wet-laboratory studies on retainment or

  19. Highly multiplexed DNA sequencing by capillary electrophoresis

    SciTech Connect

    Yeung, E.S.; Ueno, K.; Chang, H.T.

    1994-12-31

    It is obvious that irrespective of whichever basic technology is eventually selected to sequence the entire human genome there are substantial gains to be made if a high degree of multiplexing of parallel runs can be implemented. Such multiplexing should not involve expensive instrumentation and should not require additional personnel, or else the main objective of cost reduction will not be satisfied even though the total time for sequencing is reduced. In the last two years, several research groups have shown that capillary electrophoresis (CE) is an attractive alternative for DNA sequencing. Part of the improvement in sequencing speed in CE is counteracted by the inherent ability of slab gels for accommodating multiple lanes in a single run. Recently, the authors have developed several excitation schemes for highly multiplexed capillary electrophoresis. Detection at the pM level was demonstrated. The authors report here the use of a novel excitation geometry to simultaneously monitor 100 capillary tubes during electrophoresis. This represents a truly parallel multiplexing scheme for high-speed DNA sequencing.

  20. ASTRAL, a hyperspectral imaging DNA sequencer

    NASA Astrophysics Data System (ADS)

    O'Brien, Kevin M.; Wren, Jonathan; Davé, Varshal K.; Bai, Diane; Anderson, Richard D.; Rayner, Simon; Evans, Glen A.; Dabiri, Ali E.; Garner, Harold R.

    1998-05-01

    We are developing a prototype automatic DNA sequencer which utilizes polyacrylamide slab gels imaged through a novel optical detection system. The design of this prototype sequencer allows direct optical coupling over the entire read area of the gel and hyperspectrographic separation and detection of the fluorescence emission. The machine has no moving parts. All the major components incorporated in this prototype are currently available "off the shelf," thus reducing equipment development time and decreasing costs. Software developed for data acquisition, analysis, and conversion to other standard formats facilitates compatibility.

  1. DNA sequences, recombinant DNA molecules and processes producing human phospholipase inhibitor polypeptides

    SciTech Connect

    Wallner, B.P.; Pepinsky, R.B.; Garwin, J.L.

    1989-11-07

    This patent describes a recombinant DNA molecule. It comprises a DNA sequence coding for a phospholipase inhibitor polypeptide and being selected from the group consisting of: the cDNA insert of ALC, DNA sequences which code on expression for a phospholipase inhibitor, and DNA sequences which are degenerate as a result of the genetic code to either of the foregoing DNA sequences and which code on expression for a phospholipase inhibitor.

  2. Sequencing and Analysis of Neanderthal Genomic DNA

    PubMed Central

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Pääbo, Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2008-01-01

    Our knowledge of Neanderthals is based on a limited number of remains and artifacts from which we must make inferences about their biology, behavior, and relationship to ourselves. Here, we describe the characterization of these extinct hominids from a new perspective, based on the development of a Neanderthal metagenomic library and its high-throughput sequencing and analysis. Several lines of evidence indicate that the 65,250 base pairs of hominid sequence so far identified in the library are of Neanderthal origin, the strongest being the ascertainment of sequence identities between Neanderthal and chimpanzee at sites where the human genomic sequence is different. These results enabled us to calculate the human-Neanderthal divergence time based on multiple randomly distributed autosomal loci. Our analyses suggest that on average the Neanderthal genomic sequence we obtained and the reference human genome sequence share a most recent common ancestor ~706,000 years ago, and that the human and Neanderthal ancestral populations split ~370,000 years ago, before the emergence of anatomically modern humans. Our finding that the Neanderthal and human genomes are at least 99.5% identical led us to develop and successfully implement a targeted method for recovering specific ancient DNA sequences from metagenomic libraries. This initial analysis of the Neanderthal genome advances our understanding of the evolutionary relationship of Homo sapiens and Homo neanderthalensis and signifies the dawn of Neanderthal genomics. PMID:17110569

  3. Identification of genes in anonymous DNA sequences. Annual performance report, February 1, 1991--January 31, 1992

    SciTech Connect

    Fields, C.A.

    1996-06-01

    The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler), has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0, has now been completed, and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.

  4. Imaging of DNA sequences with chemiluminescence

    SciTech Connect

    Tizard, R.; Cate, R.L.; Ramachandran, K.L.; Wysk, M.; Bronstein, I.; Voyta, J.C.; Murphy, O.J.

    1989-12-31

    We have coupled a chemiluminescent method for detecting oligonucleotides labeled with alkaline phosphatase to the genomic DNA sequencing protocol of Church and Gilbert. Images of sequence ladders obtained on x-ray film in a 30 minute exposure are comparable to those from a 40 hour exposure with 3000 Ci/mmol ³²P probes. Chemically cleaved DNA from a sequencing gel is transferred to a nylon membrane, and specific sequence ladders are selected by hybridization to an oligonucleotide probe conjugated either to biotin or to alkaline phosphatase. If biotinylated probe is used, then an avidin-alkaline phosphatase conjugate is subsequently bound. This membrane, bearing immobilized alkaline phosphatase, is incubated with the commercially available chemiluminescent substrate disodium 3-(4-methoxyspiro[1,2-dioxetane-3,2'-tricyclo[3.3.1.1³,⁷]decan]-4-yl)phenyl phosphate (AMPPD). Dephosphorylation of AMPPD leads in a two step pathway to a highly localized emission of visible light.

  5. Imaging of DNA sequences with chemiluminescence

    SciTech Connect

    Tizard, R.; Cate, R.L.; Ramachandran, K.L.; Wysk, M.; Bronstein, I.; Voyta, J.C.; Murphy, O.J.

    1989-01-01

    We have coupled a chemiluminescent method for detecting oligonucleotides labeled with alkaline phosphatase to the genomic DNA sequencing protocol of Church and Gilbert. Images of sequence ladders obtained on x-ray film in a 30 minute exposure are comparable to those from a 40 hour exposure with 3000 Ci/mmol ³²P probes. Chemically cleaved DNA from a sequencing gel is transferred to a nylon membrane, and specific sequence ladders are selected by hybridization to an oligonucleotide probe conjugated either to biotin or to alkaline phosphatase. If biotinylated probe is used, then an avidin-alkaline phosphatase conjugate is subsequently bound. This membrane, bearing immobilized alkaline phosphatase, is incubated with the commercially available chemiluminescent substrate disodium 3-(4-methoxyspiro(1,2-dioxetane-3,2'-tricyclo(3.3.1.1³,⁷)decan)-4-yl)phenyl phosphate (AMPPD). Dephosphorylation of AMPPD leads in a two step pathway to a highly localized emission of visible light.

  6. Accurate restoration of DNA sequences. Progress report

    SciTech Connect

    Churchill, G.A.

    1994-05-01

    The primary objectives of this project are the development of (1) a general stochastic model for DNA sequencing errors, (2) algorithms to restore the original DNA sequence, and (3) statistical methods to assess the accuracy of this restoration. A secondary objective is to develop new algorithms for fragment assembly. Initially a stochastic model that assumes errors are independent and uniformly distributed will be developed. Generalizations of the basic model will be developed to account for (1) decay of accuracy along fragments, (2) variable error rates among fragments, (3) sequence dependent errors (e.g. homopolymeric runs), and (4) strand-specific systematic errors (e.g. compressions). The emphasis of this project will be the development of a theoretical basis for determining sequence accuracy. However, new algorithms are proposed and these will be implemented as software (in the C programming language). This software will be tested using real and simulated data. It will be modular in design and will be made available for distribution to the scientific community.

  7. Automated mass spectrometric sequence determination of cyclic peptide library members.

    PubMed

    Redman, James E; Wilcoxen, Keith M; Ghadiri, M Reza

    2003-01-01

    Cyclic peptides have come under scrutiny as potential antimicrobial therapeutic agents. Combinatorial split-and-pool synthesis of cyclic peptides can afford single compound per well libraries for antimicrobial screening, new lead identification, and construction of quantitative structure-activity relationships (QSAR). Here, we report a new sequencing protocol for rapid identification of the members of a cyclic peptide library based on automated computer analysis of mass spectra, obviating the need for library encoding/decoding strategies. Furthermore, the software readily integrates with common spreadsheet and database packages to facilitate data visualization and archiving. The utility of the new MS-sequencing approach is demonstrated using sonic spray ionization ion trap MS and MS/MS spectrometry on a single compound per bead cyclic peptide library and validated with individually synthesized pure cyclic D,L-alpha-peptides. PMID:12523832

  8. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1988-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330

  9. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1989-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889

  10. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1990-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227

  11. DNA separations in microfabricated devices with automated capillary sample introduction.

    PubMed

    Smith, E M; Xu, H; Ewing, A G

    2001-01-01

    A novel method is presented for automated injection of DNA samples into microfabricated separation devices via capillary electrophoresis. A single capillary is used to electrokinetically inject discrete plugs of DNA into an array of separation lanes on a glass chip. A computer-controlled micromanipulator is used to automate this injection process and to repeat injections into five parallel lanes several times over the course of the experiment. After separation, labeled DNA samples are detected by laser-induced fluorescence. Five serial separations of 6-carboxyfluorescein (FAM)-labeled oligonucleotides in five parallel lanes are shown, resulting in the analysis of 25 samples in 25 min. It is estimated that approximately 550 separations of these same oligonucleotides could be performed in one hour by increasing the number of lanes to 37 and optimizing the rate of the manipulator movement. Capillary sample introduction into chips allows parallel separations to be continuously performed in serial, yielding high throughput and minimal need for operator intervention. PMID:11288906

  12. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    NASA Astrophysics Data System (ADS)

    Chen, C. H. Winston; Taranenko, N. I.; Zhu, Y. F.; Chung, C. N.; Allman, S. L.

    1997-05-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, we recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. Our preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, we applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  13. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    SciTech Connect

    Winston Chen, C.H.; Taranenko, N.I.; Zhu, Y.F.; Chung, C.N.; Allman, S.L.

    1997-03-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, the authors recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. The preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, the authors applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  14. Porcine parvovirus: DNA sequence and genome organization.

    PubMed

    Ranz, A I; Manclús, J J; Díaz-Aroca, E; Casal, J I

    1989-10-01

    We have determined the nucleotide sequence of an almost full-length clone of porcine parvovirus (PPV). The sequence is 4973 nucleotides (nt) long. The 3' end of virion DNA shows a Y-shaped configuration homologous to rodent parvoviruses. The 5' end of virion DNA shows a repetition of 127 nt at the carboxy terminus of the capsid proteins. The overall organization of the PPV genome is similar to those of other autonomous parvoviruses. There are two large open reading frames (ORFs) that almost entirely cover the genome, both located in the same frame of the complementary strand. The left ORF encodes the non-structural protein NS1 and the right ORF encodes the capsid proteins (VP1, VP2 and VP3). Promoter analysis, location of splicing sites and putative amino acid sequences for the viral proteins show a high homology of PPV with feline panleukopenia virus and canine parvoviruses (FPV and CPV) and rodent parvovirus. Therefore we conclude that PPV is related to the Kilham rat virus (KRV) group of autonomous parvoviruses formed by KRV, minute virus of mice, Lu III, H-1, FPV and CPV. PMID:2794971

  15. Automated detection of point mutations using fluorescent sequence trace subtraction.

    PubMed Central

    Bonfield, J K; Rada, C; Staden, R

    1998-01-01

    The final step in the detection of mutations is to determine the sequence of the suspected mutant and to compare it with that of the wild-type, and for this fluorescence-based sequencing instruments are widely used. We describe some simple algorithms for comparing sequence traces which, as part of our sequence assembly and analysis package, are proving useful for the discovery of mutations and which may also help to identify misplaced readings in sequence assembly projects. The mutations can be detected automatically by a new program called TRACE_DIFF and new types of trace display in our program GAP4 greatly simplify visual checking of the assigned changes. To assess the accuracy of the automatic mutation detection algorithm we analysed 214 sequence readings from hypermutating DNA comprising a total of 108 497 bases. After the readings were assembled there were 1232 base differences, including 392 Ns and 166 alignment characters. Visual inspection of the traces established that of the 1232 differences, 353 were real mutations while the rest were due to base calling errors. The TRACE_DIFF algorithm automatically identified all but 36, with 28 false positives. Further information about the software can be obtained from http://www.mrc-lmb.cam.ac.uk/pubseq/ PMID:9649626
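
    The general idea of trace subtraction can be sketched as follows. This is a simplified illustration, not the TRACE_DIFF algorithm: it assumes the two four-channel traces are already aligned sample by sample and scaled to comparable amplitudes, and the `Trace` type, function names, and threshold are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical trace layout: channel letter -> amplitude at each sample.
Trace = Dict[str, List[float]]


def subtract_traces(wild: Trace, mutant: Trace) -> Trace:
    """Per-channel difference (mutant minus wild-type) of two aligned traces."""
    return {b: [m - w for w, m in zip(wild[b], mutant[b])] for b in "ACGT"}


def candidate_mutations(diff: Trace, threshold: float) -> List[Tuple[int, str, str]]:
    """Samples where one channel drops and another rises by more than threshold.

    Returns (sample index, base lost, base gained) for each candidate.
    """
    hits = []
    for i in range(len(diff["A"])):
        lost = [b for b in "ACGT" if diff[b][i] < -threshold]
        gained = [b for b in "ACGT" if diff[b][i] > threshold]
        if lost and gained:
            hits.append((i, lost[0], gained[0]))
    return hits
```

    A substitution appears in the difference trace as a negative peak in the wild-type channel paired with a positive peak in the mutant channel. For the evaluation reported above, 353 real mutations with 36 missed means 317 were found, which is about 90% sensitivity; with 28 false positives, roughly 92% of the 345 automatic calls were real.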

  16. Chimeric proteins for detection and quantitation of DNA mutations, DNA sequence variations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    Chimeric proteins having both DNA mutation binding activity and nuclease activity are synthesized by recombinant technology. The proteins are of the general formula A-L-B and B-L-A where A is a peptide having DNA mutation binding activity, L is a linker and B is a peptide having nuclease activity. The chimeric proteins are useful for detection and identification of DNA sequence variations including DNA mutations (including DNA damage and mismatches) by binding to the DNA mutation and cutting the DNA once the DNA mutation is detected.

  17. Recent advances in DNA sequencing techniques

    NASA Astrophysics Data System (ADS)

    Singh, Rama Shankar

    2013-06-01

    Successful mapping of the draft human genome in 2001 and more recent mapping of the human microbiome genome in 2012 have relied heavily on the parallel processing of the second generation/Next Generation Sequencing (NGS) DNA machines at a cost of several million dollars and long computer processing times. These have been mainly biochemical approaches. Here a system analysis approach is used to review these techniques by identifying the requirements, specifications, test methods, error estimates, repeatability, reliability and trends in the cost reduction. The first generation, NGS and the Third Generation Single Molecule Real Time (SMART) detection sequencing methods are reviewed. Based on the National Human Genome Research Institute (NHGRI) data, the achieved cost reductions of 1.5 times per year from Sep. 2001 to July 2007, 7 times per year from Oct. 2007 to Apr. 2010, and 2.5 times per year from July 2010 to Jan. 2012 are discussed.

  18. Poincaré recurrences of DNA sequences

    NASA Astrophysics Data System (ADS)

    Frahm, K. M.; Shepelyansky, D. L.

    2012-01-01

    We analyze the statistical properties of Poincaré recurrences of Homo sapiens, mammalian, and other DNA sequences taken from the Ensembl Genome data base with up to 15 billion base pairs. We show that the probability of Poincaré recurrences decays in an algebraic way with the Poincaré exponent β≈4 even if the oscillatory dependence is well pronounced. The correlations between recurrences decay with an exponent ν≈0.6 that leads to an anomalous superdiffusive walk. However, for Homo sapiens sequences, with the largest available statistics, the diffusion coefficient converges to a finite value on distances larger than one million base pairs. We argue that the approach based on Poincaré recurrences determines new proximity features between different species and sheds a new light on their evolution history.

  19. Elucidating population histories using genomic DNA sequences.

    PubMed

    Vigilant, Linda

    2009-04-01

    In 1993, Cliff Jolly suggested that rather than debating species definitions and classifications, energy would be better spent investigating multidimensional patterns of variation and gene flow among populations. Until now, however, genetic studies of wild primate populations have been limited to very small portions of the genome. Access to complete genome sequences of humans, chimpanzees, macaques, and other primates makes it possible to design studies surveying substantial amounts of DNA sequence variation at multiple genetic loci in representatives of closely related but distinct wild primate populations. Such data can be analyzed with new approaches that estimate not only when populations diverged but also the relative amounts and directions of subsequent gene flow. These analyses will reemphasize the difficulty of achieving consistent species and subspecies definitions by revealing the extent of variation in the amount and duration of gene flow accompanying population divergences. PMID:19817223

  20. Direct Detection and Sequencing of Damaged DNA Bases

    PubMed Central

    2011-01-01

    Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications. PMID:22185597

  1. SNP discovery through de novo deep sequencing using the next generation of DNA sequencers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The production of high volumes of DNA sequence data using new technologies has permitted more efficient identification of single nucleotide polymorphisms in vertebrate genomes. This chapter presented practical methodology for production and analysis of DNA sequence data for SNP discovery....

  2. Improving DNA sequencing accuracy and throughput

    SciTech Connect

    Nelson, D.O.

    1996-12-31

    LLNL is beginning to explore statistical approaches to the problem of determining the DNA sequence underlying data obtained from fluorescence-based gel electrophoresis. The features of this problem that make it interesting to statisticians include: (1) the underlying mechanics of electrophoresis is quite complex and still not completely understood; (2) the yield of fragments of any given size can be quite small and variable; (3) the mobility of fragments of a given size can depend on the terminating base; (4) the data consist of samples from one or more continuous, non-stationary signals; (5) boundaries between segments generated by distinct elements of the underlying sequence are ill-defined or nonexistent in the signal; and (6) the sampling rate of the signal greatly exceeds the rate of evolution of the underlying discrete sequence. Current approaches to base calling address only some of these issues, and usually in a heuristic, ad hoc way. In this article we describe some of our initial efforts towards increasing base calling accuracy and throughput by providing a rational, statistical foundation to the process of deducing sequence from signal. 31 refs., 12 figs.

  3. Image correlation method for DNA sequence alignment.

    PubMed

    Curilem Saldías, Millaray; Villarroel Sassarini, Felipe; Muñoz Poblete, Carlos; Vargas Vásquez, Asticio; Maureira Butler, Iván

    2012-01-01

    The complexity of searches and the volume of genomic data make sequence alignment one of bioinformatics' most active research areas. New alignment approaches have incorporated digital signal processing techniques. Among these, correlation methods are highly sensitive. This paper proposes a novel sequence alignment method based on 2-dimensional images, where each nucleic acid base is represented as a fixed gray intensity pixel. Query and known database sequences are coded to their pixel representation and sequence alignment is handled as an object-recognition-in-a-scene problem. Query and database become object and scene, respectively. An image correlation process is carried out in order to search for the best match between them. Given that this procedure can be implemented in an optical correlator, the correlation could eventually be accomplished at light speed. This paper shows an initial research stage where results were "digitally" obtained by simulating an optical correlation of DNA sequences represented as images. A total of 303 queries (variable lengths from 50 to 4500 base pairs) and 100 scenes represented by 100 x 100 images each (in total, one million base pair database) were considered for the image correlation analysis. The results showed that correlations reached very high sensitivity (99.01%) and specificity (98.99%) and outperformed BLAST when mutation numbers increased. However, digital correlation processes were a hundred times slower than BLAST. We are currently starting an initiative to evaluate the correlation speed process of a real experimental optical correlator. By doing this, we expect to fully exploit optical correlation light properties. As the optical correlator works jointly with the computer, digital algorithms should also be optimized. The results presented in this paper are encouraging and support the study of image correlation methods on sequence alignment. PMID:22761742
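
    The correlation idea in this abstract can be illustrated with a minimal sketch. It is a simplification, not the authors' method: it works on 1-dimensional intensity vectors rather than 2-dimensional images, the gray levels in GRAY are hypothetical (the abstract does not state which intensities were used), and the function names are invented for illustration.

```python
# Hypothetical gray levels: the abstract says only that each base maps to a
# fixed gray intensity, not which values were chosen.
GRAY = {"A": 0.0, "C": 1.0, "G": 2.0, "T": 3.0}


def encode(seq):
    """Convert a DNA string into a list of gray intensities."""
    return [GRAY[base] for base in seq.upper()]


def normalized_correlation(query, window):
    """Pearson correlation between two equal-length intensity vectors."""
    n = len(query)
    mean_q = sum(query) / n
    mean_w = sum(window) / n
    num = sum((q - mean_q) * (w - mean_w) for q, w in zip(query, window))
    var_q = sum((q - mean_q) ** 2 for q in query)
    var_w = sum((w - mean_w) ** 2 for w in window)
    if var_q == 0 or var_w == 0:
        return 0.0  # a constant vector carries no pattern to correlate
    return num / (var_q * var_w) ** 0.5


def best_match(query_seq, scene_seq):
    """Slide the query along the scene; return (offset, score) of the peak."""
    query = encode(query_seq)
    scene = encode(scene_seq)
    best_pos, best_score = -1, float("-inf")
    for i in range(len(scene) - len(query) + 1):
        score = normalized_correlation(query, scene[i:i + len(query)])
        if score > best_score:
            best_pos, best_score = i, score
    return best_pos, best_score
```

    For example, best_match("GTAAGC", "TTGACCGTAAGCTTAGGCAT") places the correlation peak at offset 6, where the query occurs exactly. A mutated query would produce a lower but still detectable peak, which is the property the abstract credits for the method's sensitivity.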

  4. Detection of specific DNA sequences by fluorescence amplification: a color complementation assay.

    PubMed Central

    Chehab, F F; Kan, Y W

    1989-01-01

    We have developed a color complementation assay that allows rapid screening of specific genomic DNA sequences. It is based on the simultaneous amplification of two or more DNA segments with fluorescent oligonucleotide primers such that the generation of a color, or combination of colors, can be visualized and used for diagnosis. Color complementation assay obviates the need for gel electrophoresis and has been applied to the detection of a large and small gene deletion, a chromosomal translocation, an infectious agent, and a single-base substitution. DNA amplification with fluorescent oligonucleotide primers has also been used to multiplex and discriminate five different amplified DNA loci simultaneously. Each primer set is conjugated to a different dye, and the fluorescence of each dye respective to its amplified DNA locus is scored on a fluorometer. This method is valuable for DNA diagnostics of genetic, acquired, and infectious diseases, as well as in DNA forensics. It also lends itself to complete automation. PMID:2594760

  5. Automating the Photogrammetric Bridging Based on MMS Image Sequence Processing

    NASA Astrophysics Data System (ADS)

    Silva, J. F. C.; Lemes Neto, M. C.; Blasechi, V.

    2014-11-01

    The photogrammetric bridging or traverse is a special bundle block adjustment (BBA) for connecting a sequence of stereo-pairs and determining the exterior orientation parameters (EOP). An object point must be imaged in more than one stereo-pair. In each stereo-pair the distance ratio between an object and its corresponding image point varies significantly. We propose to automate the photogrammetric bridging based on a fully automatic extraction of homologous points in stereo-pairs and on an arbitrary Cartesian datum to refer the EOP and tie points. The technique uses the SIFT algorithm, and keypoints are matched by comparing their similarity descriptors and selecting the smallest distance. All the matched points are used as tie points. The technique was applied initially to two pairs. The block formed by four images was treated by BBA. The process continues to the end of the sequence and is semiautomatic because each block is processed independently and the transition from one block to the next depends on the operator. Besides four-image blocks (two pairs), we experimented with other arrangements with block sizes of six, eight, and up to twenty images (respectively, three, four, five and up to ten bases). After the whole sequence of image pairs had been adjusted sequentially in each experiment, a simultaneous BBA was run so as to estimate the EOP set of each image. The results for classical ("normal case") pairs were analyzed based on standard statistics regularly applied to phototriangulation, and they show figures to validate the process.

  6. A Microfluidic DNA Library Preparation Platform for Next-Generation Sequencing

    PubMed Central

    Sinha, Anupama; Bent, Zachary W.; Solberg, Owen D.; Williams, Kelly P.; Langevin, Stanley A.; Renzi, Ronald F.; Van De Vreugde, James L.; Meagher, Robert J.; Schoeniger, Joseph S.; Lane, Todd W.; Branda, Steven S.; Bartsch, Michael S.; Patel, Kamlesh D.

    2013-01-01

    Next-generation sequencing (NGS) is emerging as a powerful tool for elucidating genetic information for a wide range of applications. Unfortunately, the surging popularity of NGS has not yet been accompanied by an improvement in automated techniques for preparing formatted sequencing libraries. To address this challenge, we have developed a prototype microfluidic system for preparing sequencer-ready DNA libraries for analysis by Illumina sequencing. Our system combines droplet-based digital microfluidic (DMF) sample handling with peripheral modules to create a fully-integrated, sample-in library-out platform. In this report, we use our automated system to prepare NGS libraries from samples of human and bacterial genomic DNA. E. coli libraries prepared on-device from 5 ng of total DNA yielded excellent sequence coverage over the entire bacterial genome, with >99% alignment to the reference genome, even genome coverage, and good quality scores. Furthermore, we produced a de novo assembly on a previously unsequenced multi-drug resistant Klebsiella pneumoniae strain BAA-2146 (KpnNDM). The new method described here is fast, robust, scalable, and automated. Our device for library preparation will assist in the integration of NGS technology into a wide variety of laboratories, including small research laboratories and clinical laboratories. PMID:23894387

  7. (abstract) Automated Constraint Checking of Spacecraft Command Sequences

    NASA Technical Reports Server (NTRS)

    Horvath, Joan; Alkalaj, Leon; Schneider, Karl; Spitale, Joseph

    1994-01-01

    Making certain that spacecraft command sequences do not violate any constraints is often tedious and expensive in terms of both personnel and software development. To reduce this cost, we have pursued the development of a flexible system for specifying models of spacecraft behavior in response to commands as well as constraints on that behavior. The potential need for modeling complex spacecraft behavior required that the system be designed to be usable both on a conventional workstation and a parallel supercomputer. Finally, it needed to be intuitive enough for the intended mission operations users to easily design sets of rules and models to automate tedious, resource-consuming constraint checking of commands. We have defined a Specification And Verification Environment (SAVE) for spacecraft flight rules.

  8. A Molecular Fraction Collecting Tool for the ABI 310 Automated Sequencer

    PubMed Central

    Lin, Ming-Tseh; Rich, Roy G.; Shipley, Royce F.; Hafez, Michael J.; Tseng, Li-Hui; Murphy, Kathleen M.; Gocke, Christopher D.; Eshleman, James R.

    2007-01-01

    Several methods exist to retrieve and purify DNA fragments after agarose or polyacrylamide gel electrophoresis for subsequent analyses. However, molecules present in low concentration and molecules similar in size to their neighbors are difficult to purify. Capillary electrophoresis has become popular in molecular diagnostic laboratories because of its automation, excellent resolution, and high sensitivity. In the current study, the ABI Prism 310 Genetic Analyzer was reconfigured into a fraction collector by adapting the standard gel block to accommodate a collection tube at the distal end of capillary. The time to collect the desired peaks was estimated by extrapolating from standard capillary electrophoresis using the original gel block. Fraction collection from a mixture of DNA fragments amplified from wild type and several internal tandem duplication mutations of the FMS-like tyrosine kinase 3 (Flt3) gene yielded highly purified DNA fragments containing internal tandem duplication mutations and predictable electrokinetics using the reconstructed gel block. The reconfigured instrument could successfully isolate DNA amplicons from extremely low-amplitude peaks (110 relative fluorescent units), which were undetectable using polyacrylamide gel electrophoresis. In addition, we successfully isolated bands that were only three bases apart that comigrated on polyacrylamide gel electrophoresis. DNA sequencing was used to confirm that the correct peaks were recovered at sufficient purity. PMID:17916601

  9. A compilation of partial sequences of randomly selected cDNA clones from the rat incisor.

    PubMed

    Matsuki, Y; Nakashima, M; Amizuka, N; Warshawsky, H; Goltzman, D; Yamada, K M; Yamada, Y

    1995-01-01

    The formation of tooth organs is regulated by a series of developmental programs. We have initiated a genome project with the ultimate goal of identifying novel genes important for tooth development. As an initial approach, we constructed a unidirectional cDNA library from the non-calcified portion of incisors of 3- to 4-week-old rats, sequenced cDNA clones, and classified their sequences by homology search through the GenBank data base and the PIR protein data base. Here, we report partial DNA sequences obtained by automated DNA sequencing on 400 cDNA clones randomly selected from the library. Of the sequences determined, 51% represented sequences of new genes that were not related to any previously reported gene. Twenty-six percent of the clones strongly matched genes and proteins in the data bases, including amelogenin, alpha 1(I) and alpha 2(I) collagen chains, osteonectin, and decorin. Nine percent of clones revealed partial sequence homology to known genes such as transcription factors and cell surface receptors. A significant number of the previously identified genes were expressed redundantly and were found to encode extracellular matrix proteins. Identification and cataloging of cDNA clones in these tissues are the first step toward identification of markers expressed in a tissue- or stage-specific manner, as well as the genetic linkage study of tooth anomalies. Further characterization of the clones described in this paper should lead to the discovery of novel genes important for tooth development. PMID:7876422

  10. Determining orientation and direction of DNA sequences

    DOEpatents

    Goodwin, Edwin H.; Meyne, Julianne

    2000-01-01

    Determining orientation and direction of DNA sequences. A method by which fluorescence in situ hybridization can be made strand specific is described. Cell cultures are grown in a medium containing a halogenated nucleotide. The analog is partially incorporated in one DNA strand of each chromatid. This substitution takes place in opposite strands of the two sister chromatids. After staining with the fluorescent DNA-binding dye Hoechst 33258, cells are exposed to long-wavelength ultraviolet light which results in numerous strand nicks. These nicks enable the substituted strand to be denatured and solubilized by heat, treatment with high or low pH aqueous solutions, or by immersing the strands in 2×SSC (0.3 M NaCl + 0.03 M sodium citrate), to name three procedures. It is unnecessary to enzymatically digest the strands using Exo III or another exonuclease in order to excise and solubilize nucleotides starting at the sites of the nicks. The denaturing/solubilizing process removes most of the substituted strand while leaving the prereplication strand largely intact. Hybridization of a single-stranded probe of a tandem repeat arranged in a head-to-tail orientation will result in hybridization only to the chromatid with the complementary strand present.

  11. Sequence-of-events-driven automation of the deep space network

    NASA Technical Reports Server (NTRS)

    Hill, R., Jr.; Fayyad, K.; Smyth, C.; Santos, T.; Chen, R.; Chien, S.; Bevan, R.

    1996-01-01

    In February 1995, sequence-of-events (SOE)-driven automation technology was demonstrated for a Voyager telemetry downlink track at DSS 13. This demonstration entailed automated generation of an operations procedure (in the form of a temporal dependency network) from project SOE information using artificial intelligence planning technology and automated execution of the temporal dependency network using the link monitor and control operator assistant system. This article describes the overall approach to SOE-driven automation that was demonstrated, identifies gaps in SOE definitions and project profiles that hamper automation, and provides detailed measurements of the knowledge engineering effort required for automation.

  12. Using Huffman coding method to visualize and analyze DNA sequences.

    PubMed

    Qi, Zhao-Hui; Li, Ling; Qi, Xiao-Qin

    2011-11-30

    On the basis of the Huffman coding method, we propose a new graphical representation of DNA sequence. The representation can avoid degeneracy and loss of information in the transfer of data from a DNA sequence to its graphical representation. Then a multicomponent vector from the representation is introduced to characterize quantitatively DNA sequences. The components of the vector are derived from the graphical representation of DNA primary sequence. The examination of similarities and dissimilarities among the complete coding sequences of β-globin gene of 11 species and six ND6 proteins shows the utility of the scheme. PMID:21953557
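
    As a rough illustration of the coding step this abstract builds on, the sketch below derives Huffman codes for the bases of a sequence from their frequencies. It does not reproduce the authors' graphical representation or their multicomponent vector, which the abstract does not specify; the function names are invented for illustration.

```python
import heapq
from collections import Counter


def huffman_codes(seq):
    """Return a {symbol: bitstring} Huffman code built from symbol counts."""
    freq = Counter(seq)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tiebreak, partial code table). The unique
    # integer tiebreak keeps Python from ever comparing the dictionaries.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, low = heapq.heappop(heap)
        w2, _, high = heapq.heappop(heap)
        # Prefix the two lightest subtrees with 0 and 1 and merge them.
        merged = {sym: "0" + code for sym, code in low.items()}
        merged.update({sym: "1" + code for sym, code in high.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def encode(seq, codes):
    """Concatenate the code words for each symbol of seq."""
    return "".join(codes[base] for base in seq)
```

    Because Huffman codes are prefix-free, the bit string can be decoded back to the original sequence without ambiguity, which is consistent with the abstract's claim that the representation avoids loss of information. Frequent bases receive shorter code words: for "AAAAAAAACCCCGGT" the code lengths are 1, 2, 3 and 3 bits for A, C, G and T.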

  13. Non-random DNA fragmentation in next-generation sequencing

    PubMed Central

    Poptsova, Maria S.; Il'icheva, Irina A.; Nechipurenko, Dmitry Yu.; Panchenko, Larisa A.; Khodikov, Mingian V.; Oparina, Nina Y.; Polozov, Robert V.; Nechipurenko, Yury D.; Grokhovsky, Sergei L.

    2014-01-01

    Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments, and their massive parallel sequencing. The multiple overlapping segments termed “reads” are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, previously we showed that for the sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of the hydrodynamic forces on DNA, produce similar bias. Consideration of this non-random DNA fragmentation may allow one to unravel what factors and to what extent influence the non-uniform coverage of various genomic regions. PMID:24681819

  14. Non-random DNA fragmentation in next-generation sequencing

    NASA Astrophysics Data System (ADS)

    Poptsova, Maria S.; Il'Icheva, Irina A.; Nechipurenko, Dmitry Yu.; Panchenko, Larisa A.; Khodikov, Mingian V.; Oparina, Nina Y.; Polozov, Robert V.; Nechipurenko, Yury D.; Grokhovsky, Sergei L.

    2014-03-01

    Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments, and their massive parallel sequencing. The multiple overlapping segments termed “reads” are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, previously we showed that for the sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of the hydrodynamic forces on DNA, produce similar bias. Consideration of this non-random DNA fragmentation may allow one to unravel what factors and to what extent influence the non-uniform coverage of various genomic regions.

  15. Automated Analysis of Dynamic Ca2+ Signals in Image Sequences

    PubMed Central

    Francis, Michael; Waldrup, Josh; Qian, Xun; Taylor, Mark S.

    2014-01-01

    Intracellular Ca2+ signals are commonly studied with fluorescent Ca2+ indicator dyes and microscopy techniques. However, quantitative analysis of Ca2+ imaging data is time consuming and subject to bias. Automated signal analysis algorithms based on region of interest (ROI) detection have been implemented for one-dimensional line scan measurements, but there is no current algorithm which integrates optimized identification and analysis of ROIs in two-dimensional image sequences. Here an algorithm for rapid acquisition and analysis of ROIs in image sequences is described. It utilizes ellipses fit to noise filtered signals in order to determine optimal ROI placement, and computes Ca2+ signal parameters of amplitude, duration and spatial spread. This algorithm was implemented as a freely available plugin for ImageJ (NIH) software. Together with analysis scripts written for the open source statistical processing software R, this approach provides a high-capacity pipeline for performing quick statistical analysis of experimental output. The authors suggest that use of this analysis protocol will lead to a more complete and unbiased characterization of physiologic Ca2+ signaling. PMID:24962784

  16. MG-Digger: An Automated Pipeline to Search for Giant Virus-Related Sequences in Metagenomes.

    PubMed

    Verneau, Jonathan; Levasseur, Anthony; Raoult, Didier; La Scola, Bernard; Colson, Philippe

    2016-01-01

    The number of metagenomic studies conducted each year is growing dramatically. Storage and analysis of such big data is difficult and time-consuming. Interestingly, analysis shows that environmental and human metagenomes include a significant amount of non-annotated sequences, representing a 'dark matter.' We established a bioinformatics pipeline that automatically detects metagenome reads matching query sequences from a given set and applied this tool to the detection of sequences matching large and giant DNA viral members of the proposed order Megavirales or virophages. A total of 1,045 environmental and human metagenomes (≈ 1 Terabase) were collected, processed, and stored on our bioinformatics server. In addition, nucleotide and protein sequences from 93 Megavirales representatives, including 19 giant viruses of amoeba, and 5 virophages, were collected. The pipeline was generated by scripts written in Python language and entitled MG-Digger. Metagenomes previously found to contain megavirus-like sequences were tested as controls. MG-Digger was able to annotate 100s of metagenome sequences as best matching those of giant viruses. These sequences were most often found to be similar to phycodnavirus or mimivirus sequences, but included reads related to recently available pandoraviruses, Pithovirus sibericum, and faustoviruses. Compared to other tools, MG-Digger combined stand-alone use on Linux or Windows operating systems through a user-friendly interface, implementation of ready-to-use customized metagenome databases and query sequence databases, adjustable parameters for BLAST searches, and creation of output files containing selected reads with best match identification. Compared to Metavir 2, a reference tool in viral metagenome analysis, MG-Digger detected 8% more true positive Megavirales-related reads in a control metagenome. The present work shows that massive, automated and recurrent analyses of metagenomes are effective in improving knowledge about the

  17. MG-Digger: An Automated Pipeline to Search for Giant Virus-Related Sequences in Metagenomes

    PubMed Central

    Verneau, Jonathan; Levasseur, Anthony; Raoult, Didier; La Scola, Bernard; Colson, Philippe

    2016-01-01

    The number of metagenomic studies conducted each year is growing dramatically. Storage and analysis of such big data is difficult and time-consuming. Interestingly, analysis shows that environmental and human metagenomes include a significant amount of non-annotated sequences, representing a ‘dark matter.’ We established a bioinformatics pipeline that automatically detects metagenome reads matching query sequences from a given set and applied this tool to the detection of sequences matching large and giant DNA viral members of the proposed order Megavirales or virophages. A total of 1,045 environmental and human metagenomes (≈ 1 Terabase) were collected, processed, and stored on our bioinformatics server. In addition, nucleotide and protein sequences from 93 Megavirales representatives, including 19 giant viruses of amoeba, and 5 virophages, were collected. The pipeline was generated by scripts written in Python language and entitled MG-Digger. Metagenomes previously found to contain megavirus-like sequences were tested as controls. MG-Digger was able to annotate 100s of metagenome sequences as best matching those of giant viruses. These sequences were most often found to be similar to phycodnavirus or mimivirus sequences, but included reads related to recently available pandoraviruses, Pithovirus sibericum, and faustoviruses. Compared to other tools, MG-Digger combined stand-alone use on Linux or Windows operating systems through a user-friendly interface, implementation of ready-to-use customized metagenome databases and query sequence databases, adjustable parameters for BLAST searches, and creation of output files containing selected reads with best match identification. Compared to Metavir 2, a reference tool in viral metagenome analysis, MG-Digger detected 8% more true positive Megavirales-related reads in a control metagenome. The present work shows that massive, automated and recurrent analyses of metagenomes are effective in improving knowledge about

  18. SeqTrace: A Graphical Tool for Rapidly Processing DNA Sequencing Chromatograms

    PubMed Central

    Stucky, Brian J.

    2012-01-01

    Modern applications of Sanger DNA sequencing often require converting a large number of chromatogram trace files into high-quality DNA sequences for downstream analyses. Relatively few nonproprietary software tools are available to assist with this process. SeqTrace is a new, free, and open-source software application that is designed to automate the entire workflow by facilitating easy batch processing of large numbers of trace files. SeqTrace can identify, align, and compute consensus sequences from matching forward and reverse traces, filter low-quality base calls, and end-trim finished sequences. The software features a graphical interface that includes a full-featured chromatogram viewer and sequence editor. SeqTrace runs on most popular operating systems and is freely available, along with supporting documentation, at http://seqtrace.googlecode.com/. PMID:22942788

  19. DNA Shape versus Sequence Variations in the Protein Binding Process.

    PubMed

    Chen, Chuanying; Pettitt, B Montgomery

    2016-02-01

    The binding process of a protein with DNA involves three stages: approach, encounter, and association. It has been known that the complexation of protein and DNA involves mutual conformational changes, especially for a specific sequence association. However, it is still unclear how the conformation and the information in the DNA sequences affect the binding process. What is the extent to which the DNA structure adopted in the complex is induced by protein binding, or is instead intrinsic to the DNA sequence? In this study, we used the multiscale simulation method to explore the binding process of a protein with DNA in terms of DNA sequence, conformation, and interactions. We found that in the approach stage the protein can bind both the major and minor groove of the DNA, but uses different features to locate the binding site. The intrinsic conformational properties of the DNA play a significant role in this binding stage. By comparing the specific DNA with the nonspecific in unbound, intermediate, and associated states, we found that for a specific DNA sequence, ∼40% of the bending in the association forms is intrinsic and that ∼60% is induced by the protein. The protein does not induce appreciable bending of nonspecific DNA. In addition, we proposed that the DNA shape variations induced by protein binding are required in the early stage of the binding process, so that the protein is able to approach, encounter, and form an intermediate at the correct site on DNA. PMID:26840719

  20. Development of an Automated DNA Detection System Using an Electrochemical DNA Chip Technology

    NASA Astrophysics Data System (ADS)

    Hongo, Sadato; Okada, Jun; Hashimoto, Koji; Tsuji, Koichi; Nikaido, Masaru; Gemma, Nobuhiro

    A new compact automated DNA detection system, Genelyzer™, has been developed. After a sample solution is injected into a cassette with a built-in electrochemical DNA chip, all processes from the hybridization reaction to detection and analysis run fully automatically. To detect a sample DNA, the system measures electrical currents at the electrodes produced by the oxidation of electrochemically active intercalator molecules bound to hybridized DNA. The intercalator is supplied as a reagent solution by a fluid supply unit of the system. The feasibility test showed that simultaneous typing of six single nucleotide polymorphisms (SNPs) associated with rheumatoid arthritis (RA) was completed within two hours and that all results were consistent with those of conventional typing methods. This system is expected to open the way to DNA testing applications such as tests for infectious diseases, personalized medicine, food inspection, and forensics.

  1. Inferring coalescence times from DNA sequence data.

    PubMed

    Tavaré, S; Balding, D J; Griffiths, R C; Donnelly, P

    1997-02-01

    The paper is concerned with methods for the estimation of the coalescence time (time since the most recent common ancestor) of a sample of intraspecies DNA sequences. The methods take advantage of prior knowledge of population demography, in addition to the molecular data. While some theoretical results are presented, a central focus is on computational methods. These methods are easy to implement, and, since explicit formulae tend to be either unavailable or unilluminating, they are also more useful and more informative in most applications. Extensions are presented that allow for the effects of uncertainty in our knowledge of population size and mutation rates, for variability in population sizes, for regions of different mutation rate, and for inference concerning the coalescence time of the entire population. The methods are illustrated using recent data from the human Y chromosome. PMID:9071603
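
    For orientation, the quantity being estimated can be illustrated under the standard constant-size coalescent, in which k lineages wait an exponential time with rate k(k-1)/2 before the next coalescence (time measured in units of the haploid population size in generations). The sketch below is not the paper's inference method and uses no sequence data or demographic prior; it only compares the closed-form expected time to the most recent common ancestor, 2(1 - 1/n), with a simulation. All function names are invented for illustration.

```python
import random


def expected_tmrca(n):
    """Closed-form E[TMRCA] for n sampled lineages, in coalescent units."""
    return 2.0 * (1.0 - 1.0 / n)


def simulate_tmrca(n, rng):
    """Draw one TMRCA by summing exponential waiting times for k = n..2."""
    total = 0.0
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2.0  # number of lineage pairs that can coalesce
        total += rng.expovariate(rate)
    return total


def mean_simulated_tmrca(n, replicates, seed=0):
    """Average TMRCA over independent replicates with a fixed seed."""
    rng = random.Random(seed)
    return sum(simulate_tmrca(n, rng) for _ in range(replicates)) / replicates
```

    For a sample of 10 sequences, expected_tmrca(10) gives 1.8 coalescent units, and mean_simulated_tmrca(10, 20000) should land close to that value. The paper's extensions (uncertain population size and mutation rate, variable population size) would change the waiting-time distribution and are not modeled here.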

  2. On 2D graphical representation of DNA sequence of nondegeneracy

    NASA Astrophysics Data System (ADS)

    Zhang, Yusen; Liao, Bo; Ding, Kequan

    2005-08-01

    Some two-dimensional (2D) graphical representations of DNA sequences have been given by Gates, Nandy, Leong and Mogenthaler, Randić, and Liao et al., which give visual characterizations of DNA sequences. In this Letter, we introduce a nondegeneracy 2D graphical representation of DNA sequence, which is different from Randić's novel 2D representation and Liao's 2D representation. We also present the nondegeneracy forms corresponding to the representations of Gates, Nandy, Leong and Mogenthaler.

  3. Duplication count distributions in DNA sequences

    NASA Astrophysics Data System (ADS)

    Sindi, Suzanne S.; Hunt, Brian R.; Yorke, James A.

    2008-12-01

    We study quantitative features of complex repetitive DNA in several genomes by studying sequences that are sufficiently long that they are unlikely to have repeated by chance. For each genome we study, we determine the number of identical copies, the “duplication count,” of each sequence of length 40, that is, of each “40-mer.” We say a 40-mer is “repeated” if its duplication count is at least 2. We focus mainly on “complex” 40-mers, those without short internal repetitions. We find that we can classify most of the complex repeated 40-mers into two categories: one category has its copies clustered closely together on one chromosome, the other has its copies distributed widely across multiple chromosomes. For each genome and each of the categories above, we compute N(c), the number of 40-mers that have duplication count c, for each integer c. In each case, we observe a power-law-like decay in N(c) as c increases from 3 to 50 or higher. In particular, we find that N(c) decays much more slowly than would be predicted by evolutionary models where each 40-mer is equally likely to be duplicated. We also analyze an evolutionary model that does reflect the slow decay of N(c).
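
    The two quantities defined in this abstract, the duplication count of each k-mer and the distribution N(c), can be computed with a short sketch. It is a simplification under stated assumptions: it counts exact copies on one strand only and does not apply the authors' "complex" filter or their clustered-versus-dispersed classification, whose details the abstract does not give. The function names are invented for illustration.

```python
from collections import Counter


def duplication_counts(genome, k=40):
    """Map each k-mer in the genome to its number of identical copies."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def count_distribution(kmer_counts):
    """Return N(c): how many distinct k-mers have duplication count c."""
    return Counter(kmer_counts.values())
```

    With a toy sequence and k = 3, duplication_counts("ACGACGACGTT", k=3) finds ACG three times, CGA and GAC twice each, and CGT and GTT once, so N(1) = 2, N(2) = 2 and N(3) = 1. On a real genome one would plot N(c) against c on log-log axes to look for the power-law-like decay the abstract reports.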

  4. What Advances Are Being Made in DNA Sequencing?

    MedlinePlus

    ... the future. For more information about DNA sequencing technologies and their use: Genetics Home Reference discusses whether ... the University of Washington describes the different sequencing technologies and what the new technologies have meant for ...

  5. DNA Sequence Determination by Hybridization: A Strategy for Efficient Large-Scale Sequencing

    NASA Astrophysics Data System (ADS)

    Drmanac, R.; Drmanac, S.; Strezoska, Z.; Paunesku, T.; Labat, I.; Zeremski, M.; Snoddy, J.; Funkhouser, W. K.; Koop, B.; Hood, L.; Crkvenjakov, R.

    1993-06-01

    The concept of sequencing by hybridization (SBH) makes use of an array of all possible n-nucleotide oligomers (n-mers) to identify n-mers present in an unknown DNA sequence. Computational approaches can then be used to assemble the complete sequence. As a validation of this concept, the sequences of three DNA fragments, 343 base pairs in length, were determined with octamer oligonucleotides. Possible applications of SBH include physical mapping (ordering) of overlapping DNA clones, sequence checking, DNA fingerprinting comparisons of normal and disease-causing genes, and the identification of DNA fragments with particular sequence motifs in complementary DNA and genomic libraries. The SBH techniques may accelerate the mapping and sequencing phases of the human genome project.
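
    The idea of identifying the n-mers present in a sequence and then assembling them computationally can be sketched as follows. The abstract does not describe the assembly algorithm, so this is a naive overlap walk under strong assumptions: every n-mer occurs once, there are no hybridization errors, and each (n-1)-base overlap is unambiguous. The names `spectrum` and `reconstruct` are hypothetical.

```python
def spectrum(sequence, k):
    """Return the set of k-mers present in `sequence` (an idealized array readout)."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def reconstruct(kmers):
    """Rebuild a sequence from its k-mer set by following unique overlaps.

    Raises ValueError when the spectrum has no unique start or contains
    a branch point, because the sequence is then not determined.
    """
    by_prefix = {}
    for kmer in kmers:
        by_prefix.setdefault(kmer[:-1], []).append(kmer)
    suffixes = {kmer[1:] for kmer in kmers}

    # The first k-mer is the only one whose prefix is not another k-mer's suffix.
    starts = [kmer for kmer in kmers if kmer[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting k-mer")

    current = starts[0]
    result = current
    used = {current}
    while True:
        candidates = [c for c in by_prefix.get(current[1:], []) if c not in used]
        if not candidates:
            break
        if len(candidates) > 1:
            raise ValueError("branch point: spectrum is ambiguous")
        current = candidates[0]
        used.add(current)
        result += current[-1]

    if len(used) != len(kmers):
        raise ValueError("some k-mers were not used")
    return result


target = "ATGGCGTGCA"
rebuilt = reconstruct(spectrum(target, 4))  # recovers the original toy sequence
```

    With k=3 the same toy sequence contains a repeated overlap (TG) and the walk raises an error, which shows why longer probes such as octamers reduce ambiguity.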

  6. DNA sequence determination by hybridization: A strategy for efficient large-scale sequencing

    SciTech Connect

    Drmanac, R.; Drmanac, S.; Strezoska, Z.; Paunesku, T.; Labat, I.; Zeremski, M.; Snoddy, J.; Crkvenjakov, R.; Funkhouser, W.K.; Koop, B.; Hood, L.

    1993-06-11

    The concept of sequencing by hybridization (SBH) makes use of an array of all possible n-nucleotide oligomers (n-mers) to identify n-mers present in an unknown DNA sequence. Computational approaches can then be used to assemble the complete sequence. As a validation of this concept, the sequences of three DNA fragments, 343 base pairs in length, were determined with octamer oligonucleotides. Possible applications of SBH include physical mapping (ordering) of overlapping DNA clones, sequence checking, DNA fingerprinting comparisons of normal and disease-causing genes, and the identification of DNA fragments with particular sequence motifs in complementary DNA and genomic libraries. The SBH techniques may accelerate the mapping and sequencing phases of the human genome project. 22 refs., 3 figs.

  7. Kinetoplast DNA minicircles: regions of extensive sequence divergence.

    PubMed Central

    Rogers, W O; Wirth, D F

    1987-01-01

    Previous work has shown that the kinetoplast minicircle DNA of Leishmania species exhibits species-specific sequence divergence and this observation has led to the development of a DNA probe-based diagnostic test for leishmaniasis. In the work reported here, we demonstrate that the minicircle is composed of three types of DNA sequences with differing specificities reflecting different rates of DNA sequence change. A library of cloned fragments of kinetoplast DNA (kDNA) from Leishmania mexicana amazonensis was prepared and the cloned subfragments were found to contain DNA sequences with different taxonomic specificities based on hybridization analysis with various species of Leishmania. Four groups of subfragments were found: those that hybridized with a large number of Leishmania sp. as well as sequences unique to the species, subspecies, or isolate. Analysis of nested deletions of a single, full-length minicircle demonstrates that these different taxonomic specificities are contained within a single minicircle. This implies that different regions of a single minicircle have DNA sequences that diverge at different rates. These sequences represent potentially valuable tools in diagnostic, epidemiologic, and ecological studies of leishmaniasis and provide the basis for a model of kDNA sequence evolution. PMID:3025880

  8. A Novel Constraint for Thermodynamically Designing DNA Sequences

    PubMed Central

    Zhang, Qiang; Wang, Bin; Wei, Xiaopeng; Zhou, Changjun

    2013-01-01

    Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap. PMID:24015217
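
    For context, the Hamming distance constraint that this abstract uses as its baseline can be sketched in a few lines. This illustrates only the conventional constraint, not the authors' thermodynamic one, which would require a minimum free energy calculation that is not reproduced here. The function names are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def satisfies_hamming_constraint(sequences, d):
    """True if every pair in the set differs in at least `d` positions."""
    return all(hamming(a, b) >= d for a, b in combinations(sequences, 2))


codewords = ["AAAA", "CCCC", "GGGG"]
ok = satisfies_hamming_constraint(codewords, 4)  # every pair differs everywhere
```

    Two sequences can be far apart in Hamming distance and still hybridize stably after a shift or through a bulge, which is the kind of limitation a thermodynamic criterion is meant to address.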

  9. A novel constraint for thermodynamically designing DNA sequences.

    PubMed

    Zhang, Qiang; Wang, Bin; Wei, Xiaopeng; Zhou, Changjun

    2013-01-01

    Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap. PMID:24015217

  10. Guanine-rich sequences inhibit proofreading DNA polymerases

    PubMed Central

    Zhu, Xiao-Jing; Sun, Shuhui; Xie, Binghua; Hu, Xuemei; Zhang, Zunyi; Qiu, Mengsheng; Dai, Zhong-Min

    2016-01-01

    DNA polymerases with proofreading activity are important for accurate amplification of target DNA. Although numerous efforts have been made to improve proofreading DNA polymerases, they are more prone to failure in PCR than non-proofreading DNA polymerases. Here we showed that proofreading DNA polymerases can be inhibited by certain primers. Further analysis showed that G-rich sequences such as GGGGG and GGGGHGG can cause PCR failure with proofreading DNA polymerases but not with Taq DNA polymerase. The inhibitory effect of these G-rich sequences is caused by G-quadruplex formation and is dose dependent. Primers containing G-rich inhibitory sequences can be used in PCR at a lower concentration to amplify their target DNA fragment. PMID:27349576
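
    A primer can be screened for the two motifs named in this abstract with a regular expression. The sketch assumes that H is the IUPAC ambiguity code for A, C, or T, which the abstract does not state explicitly, and it checks only the strand as written. The function name is hypothetical.

```python
import re

# Lookahead so that overlapping occurrences are all reported.
# Assumption: H in GGGGHGG is the IUPAC code for A, C, or T.
G_RICH_PATTERN = re.compile(r"(?=(GGGGG|GGGG[ACT]GG))")


def find_g_rich_motifs(primer):
    """Return (position, motif) pairs for G-rich motifs found in `primer`."""
    return [(m.start(), m.group(1)) for m in G_RICH_PATTERN.finditer(primer.upper())]


hits = find_g_rich_motifs("ACGGGGTGGA")
# One hit: the motif GGGGTGG starting at index 2.
```

    An empty result does not guarantee that a primer is free of G-quadruplex-forming potential; the pattern covers only the two examples given in the abstract.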

  11. Advances in high throughput DNA sequence data compression.

    PubMed

    Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz

    2016-06-01

    Advances in high throughput sequencing technologies and reduction in cost of sequencing have led to exponential growth in high throughput DNA sequence data. This growth has posed challenges such as storage, retrieval, and transmission of sequencing data. Data compression is used to cope with these challenges. Various methods have been developed to compress genomic and sequencing data. In this article, we present a comprehensive review of compression methods for genome and reads compression. Algorithms are categorized as referential or reference free. Experimental results and comparative analysis of various methods for data compression are presented. Finally, key challenges and research directions in DNA sequence data compression are highlighted. PMID:26846812

  12. Preparing DNA Libraries for Multiplexed Paired-End Deep Sequencing for Illumina GA Sequencers

    PubMed Central

    Son, Mike S.; Taylor, Ronald K.

    2011-01-01

    Whole genome sequencing, also known as deep sequencing, is becoming a more affordable and efficient way to identify SNP mutations, deletions and insertions in DNA sequences across several different strains. Two major obstacles preventing the widespread use of deep sequencers are the costs involved in services used to prepare DNA libraries for sequencing and the overall accuracy of the sequencing data. This Unit describes the preparation of DNA libraries for multiplexed paired-end sequencing using the Illumina GA series sequencer. Self-preparation of DNA libraries can help reduce overall expenses, especially if optimization is required for the different samples, and use of the Illumina GA Sequencer can improve the quality of the data. PMID:21400673

  13. Chimeric DNA methyltransferases target DNA methylation to specific DNA sequences and repress expression of target genes

    PubMed Central

    Li, Fuyang; Papworth, Monika; Minczuk, Michal; Rohde, Christian; Zhang, Yingying; Ragozin, Sergei; Jeltsch, Albert

    2007-01-01

    Gene silencing by targeted DNA methylation has potential applications in basic research and therapy. To establish targeted methylation in human cell lines, the catalytic domains (CDs) of mouse Dnmt3a and Dnmt3b DNA methyltransferases (MTases) were fused to different DNA binding domains (DBD) of GAL4 and an engineered Cys2His2 zinc finger domain. We demonstrated that (i) Dense DNA methylation can be targeted to specific regions in gene promoters using chimeric DNA MTases. (ii) Site-specific methylation leads to repression of genes controlled by various cellular or viral promoters. (iii) Mutations affecting any of the DBD, MTase or target DNA sequences reduce targeted methylation and gene silencing. (iv) Targeted DNA methylation is effective in repressing Herpes Simplex Virus type 1 (HSV-1) infection in cell culture with the viral titer reduced by at least 18-fold in the presence of an MTase fused to an engineered zinc finger DBD, which binds a single site in the promoter of HSV-1 gene IE175k. In short, we show here that it is possible to direct DNA MTase activity to predetermined sites in DNA, achieve targeted gene silencing in mammalian cell lines and interfere with HSV-1 propagation. PMID:17151075

  14. Affordable Hands-On DNA Sequencing and Genotyping: An Exercise for Teaching DNA Analysis to Undergraduates

    ERIC Educational Resources Information Center

    Shah, Kushani; Thomas, Shelby; Stein, Arnold

    2013-01-01

    In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C…

  15. Next Generation Sequencing to Characterize Mitochondrial Genomic DNA Heteroplasmy

    PubMed Central

    Huang, Taosheng

    2015-01-01

    This protocol describes a methodology for characterizing mitochondrial DNA (mtDNA) heteroplasmy with parallel sequencing. Mitochondria play a very important role in key cellular functions. Each eukaryotic cell contains hundreds of mitochondria with hundreds of mitochondrial genomes. Mutant and wild-type mtDNA may co-exist as heteroplasmy and cause human disease. The purpose of this methodology is to simultaneously determine the mtDNA sequence and quantify the heteroplasmy level. The protocol includes PCR amplification of the mitochondrial genome in two fragments. The PCR products are then mixed at an equimolar ratio. The samples are barcoded and sequenced with high-throughput next-generation sequencing technology. We found that this technology is highly sensitive, specific, and accurate in determining mtDNA mutations and the degree of heteroplasmy. PMID:21975941
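
    The abstract does not give a formula for the heteroplasmy level. A common way to express it, assumed here purely for illustration, is the fraction of reads at a position that do not carry the reference base. The function name and the read counts are hypothetical.

```python
def heteroplasmy_level(base_counts, reference_base):
    """Fraction of reads at one position that differ from the reference base.

    `base_counts` maps each observed base to its read count at that position.
    Assumption: sequencing errors are ignored; a real analysis would apply
    quality filters and an error threshold.
    """
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("no reads cover this position")
    non_reference = total - base_counts.get(reference_base, 0)
    return non_reference / total


# Hypothetical pileup: 900 reads support the reference A, 100 support G.
level = heteroplasmy_level({"A": 900, "G": 100}, "A")
# level is 0.1, i.e. 10% of reads carry the variant.
```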

  16. DNA polymerases drive DNA sequencing-by-synthesis technologies: both past and present

    PubMed Central

    Chen, Cheng-Yao

    2014-01-01

    Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. Escherichia coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger’s dideoxy chain-terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific, genome sequence information accessible via public database. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today’s standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides respectively. Third generation, real-time single molecule sequencing platform requires slightly different enzyme properties. Enterobacterial phage ϕ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, ϕ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore-, and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies. PMID:25009536

  17. Laser Desorption Mass Spectrometry for DNA Sequencing and Analysis

    NASA Astrophysics Data System (ADS)

    Chen, C. H. Winston; Taranenko, N. I.; Golovlev, V. V.; Isola, N. R.; Allman, S. L.

    1998-03-01

    Rapid DNA sequencing and/or analysis is critically important for biomedical research. In the past, gel electrophoresis has been the primary tool to achieve DNA analysis and sequencing. However, gel electrophoresis is a time-consuming and labor-intensive process. Recently, we have developed and used laser desorption mass spectrometry (LDMS) to achieve sequencing of ss-DNA longer than 100 nucleotides. With LDMS, we succeeded in sequencing DNA in seconds instead of the hours or days required by gel electrophoresis. In addition to sequencing, we also applied LDMS to the detection of DNA probes for hybridization. LDMS was also used to detect short tandem repeats for forensic applications. Clinical applications for disease diagnosis, such as cystic fibrosis caused by base deletion and point mutation, have also been demonstrated. Experimental details will be presented at the meeting.

  18. Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives

    PubMed Central

    Knapp, Michael; Hofreiter, Michael

    2010-01-01

    The invention of next-generation-sequencing has revolutionized almost all fields of genetics, but few have profited from it as much as the field of ancient DNA research. From its beginnings as an interesting but rather marginal discipline, ancient DNA research is now on its way into the centre of evolutionary biology. In less than a year from its invention next-generation-sequencing had increased the amount of DNA sequence data available from extinct organisms by several orders of magnitude. Ancient DNA research is now not only adding a temporal aspect to evolutionary studies and allowing for the observation of evolution in real time, it also provides important data to help understand the origins of our own species. Here we review progress that has been made in next-generation-sequencing of ancient DNA over the past five years and evaluate sequencing strategies and future directions. PMID:24710043

  19. Nanopores: A journey towards DNA sequencing

    PubMed Central

    Wanunu, Meni

    2013-01-01

    Much more than ever, nucleic acids are recognized as key building blocks in many of life's processes, and the science of studying these molecular wonders at the single-molecule level is thriving. A new method of doing so was introduced in the mid-1990s. This method is exceedingly simple: a nanoscale pore that spans across an impermeable thin membrane is placed between two chambers that contain an electrolyte, and voltage is applied across the membrane using two electrodes. These conditions lead to a steady stream of ion flow across the pore. Nucleic acid molecules in solution can be driven through the pore, and structural features of the biomolecules are observed as measurable changes in the trans-membrane ion current. In essence, a nanopore is a high-throughput ion microscope and a single-molecule force apparatus. Nanopores are taking center stage as a tool that promises to read a DNA sequence, and this promise has resulted in overwhelming academic, industrial, and national interest. Regardless of the fate of future nanopore applications, in the process of this 16-year-long exploration, many studies have validated the indispensability of nanopores in the toolkit of single-molecule biophysics. This review surveys past and current studies related to nucleic acid biophysics, and will hopefully provoke a discussion of immediate and future prospects for the field. PMID:22658507

  20. Food Fish Identification from DNA Extraction through Sequence Analysis

    ERIC Educational Resources Information Center

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed 3rd- and 4th-year undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  1. Characteristics of cloned repeated DNA sequences in the barley genome

    SciTech Connect

    Anan'ev, E.V.; Bochkanov, S.S.; Ryzhik, M.V.; Sonina, N.V.; Chernyshev, A.I.; Shchipkova, N.I.; Yakovleva, E.Yu.

    1986-12-01

    A partial clone library of barley DNA fragments based on plasmid pBR325 was created. The cloned EcoRI-fragments of chromosomal DNA are from 2 to 14 kbp in length. More than 95% of the barley DNA inserts comprise repeated sequences of different complexity and copy number. Certain of these DNA sequences are from families comprising at least 1% of the barley genome. A significant proportion of the clones hybridize with numerous sets of restriction fragments of genome DNA and they are dispersed throughout the barley chromosomes.

  2. Sequence-Specific DNA Binding by a Short Peptide Dimer

    NASA Astrophysics Data System (ADS)

    Talanian, Robert V.; McKnight, C. James; Kim, Peter S.

    1990-08-01

    A recently described class of DNA binding proteins is characterized by the "bZIP" motif, which consists of a basic region that contacts DNA and an adjacent "leucine zipper" that mediates protein dimerization. A peptide model for the basic region of the yeast transcriptional activator GCN4 has been developed in which the leucine zipper has been replaced by a disulfide bond. The 34-residue peptide dimer, but not the reduced monomer, binds DNA with nanomolar affinity at 4 °C. DNA binding is sequence-specific as judged by deoxyribonuclease I footprinting. Circular dichroism spectroscopy suggests that the peptide adopts a helical structure when bound to DNA. These results demonstrate directly that the GCN4 basic region is sufficient for sequence-specific DNA binding and suggest that a major function of the GCN4 leucine zipper is simply to mediate protein dimerization. Our approach provides a strategy for the design of short sequence-specific DNA binding peptides.

  3. Deconvolving the recognition of DNA shape from sequence.

    PubMed

    Abe, Namiko; Dror, Iris; Yang, Lin; Slattery, Matthew; Zhou, Tianyin; Bussemaker, Harmen J; Rohs, Remo; Mann, Richard S

    2015-04-01

    Protein-DNA binding is mediated by the recognition of the chemical signatures of the DNA bases and the 3D shape of the DNA molecule. Because DNA shape is a consequence of sequence, it is difficult to dissociate these modes of recognition. Here, we tease them apart in the context of Hox-DNA binding by mutating residues that, in a co-crystal structure, only recognize DNA shape. Complexes made with these mutants lose the preference to bind sequences with specific DNA shape features. Introducing shape-recognizing residues from one Hox protein to another swapped binding specificities in vitro and gene regulation in vivo. Statistical machine learning revealed that the accuracy of binding specificity predictions improves by adding shape features to a model that only depends on sequence, and feature selection identified shape features important for recognition. Thus, shape readout is a direct and independent component of binding site selection by Hox proteins. PMID:25843630

  4. Use of robotics in high-throughput DNA sequencing.

    PubMed

    Keeney, Stephen

    2011-01-01

    Until relatively recently, full sequencing of genes consisting of more than several exons was not considered practicable within a routine diagnostic context. As a result, many approaches to unknown mutation detection in a specific gene involved a mutation pre-screening step to limit the amount of DNA sequencing required. Protocols to pre-screen for mutations and limit the amount of DNA sequencing may not localise every base change present and/or require considerable levels of manual intervention. Advances in technology, allied with careful protocol design, now permit direct DNA sequencing to be applied to larger areas of gene sequence, allowing unequivocal mutation identification in the area of a gene being analysed. The protocol described below utilises robotic systems, allied to custom-designed PCR primers, to facilitate rapid DNA sequencing of multiple gene targets. The general approach is amenable to adaptation for use with multi-channel pipettes. PMID:20938842

  5. DNA polymerase having modified nucleotide binding site for DNA sequencing

    DOEpatents

    Tabor, S.; Richardson, C.

    1997-03-25

    A modified gene encoding a modified DNA polymerase is disclosed. The modified polymerase incorporates dideoxynucleotides at least 20-fold better compared to the corresponding deoxynucleotides as compared with the corresponding naturally-occurring DNA polymerase. 6 figs.

  6. DNA polymerase having modified nucleotide binding site for DNA sequencing

    DOEpatents

    Tabor, Stanley; Richardson, Charles

    1997-01-01

    Modified gene encoding a modified DNA polymerase wherein the modified polymerase incorporates dideoxynucleotides at least 20-fold better compared to the corresponding deoxynucleotides as compared with the corresponding naturally-occurring DNA polymerase.

  7. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    NASA Astrophysics Data System (ADS)

    Chechetkin, V. R.; Lobzin, V. V.

    2004-07-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It may efficiently be applied to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for the local and non-local structural segmentation in DNA sequences. The results are illustrated with the model examples and analysis of intervening exon-intron segments in the protein-coding regions.
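
    A spectral entropy of this general kind can be sketched for a single nucleotide type. The abstract does not give the authors' normalization or their segmentation criteria, so the sketch below assumes a generic form: the squared Fourier amplitudes of a base indicator sequence are normalized to sum to one and fed into the Shannon entropy. All function names are hypothetical.

```python
import cmath
import math


def structure_factors(sequence, base):
    """Squared Fourier amplitudes of the indicator sequence for `base`.

    Harmonics 1..N//2 are returned; the zero harmonic only reflects
    base composition and is skipped.
    """
    n = len(sequence)
    indicator = [1.0 if s == base else 0.0 for s in sequence]
    factors = []
    for h in range(1, n // 2 + 1):
        amplitude = sum(
            x * cmath.exp(-2j * math.pi * h * m / n)
            for m, x in enumerate(indicator)
        )
        factors.append(abs(amplitude) ** 2 / n)
    return factors


def spectral_entropy(sequence, base):
    """Shannon entropy of the normalized spectrum (assumed normalization)."""
    factors = structure_factors(sequence, base)
    total = sum(factors)
    if total == 0:
        return 0.0
    probabilities = [f / total for f in factors]
    return -sum(p * math.log(p) for p in probabilities if p > 0)


periodic = spectral_entropy("ACGT" * 8, "A")      # power sits in two harmonics
spread = spectral_entropy("A" + "C" * 31, "A")    # power is flat over all harmonics
```

    A strictly periodic sequence concentrates its power in a few harmonics and gives a low entropy, while a flat spectrum gives the maximum value, which is the sense in which the quantity measures structural ordering.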

  8. Integrated platform for detection of DNA sequence variants using capillary array electrophoresis

    SciTech Connect

    Qingbro, Li; Liu, Zhaowei; Monroe, Heidi M; Culiat, Cymbeline T

    2002-08-01

    We have developed a highly versatile platform that performs temperature gradient capillary electrophoresis (TGCE) for mutation/single-nucleotide polymorphism (SNP) detection, sequencing and mutation/SNP genotyping for identification of sequence variants on an automated 24-, 96- or 192-capillary array instrument. In the first mode, multiple DNA samples consisting of homoduplexes and heteroduplexes are separated by CE, during which a temperature gradient is applied that covers all possible temperatures of 50% melting equilibrium (Tms) for the samples. The differences in Tms result in separation of homoduplexes from heteroduplexes, thereby identifying the presence of DNA variants. The sequencing mode is then used to determine the exact location of the mutation/SNPs in the DNA variants. The first two modes allow the rapid identification of variants from the screening of a large number of samples. Only the variants need to be sequenced. The third mode utilizes multiplexed single-base extensions (SBEs) to survey mutations and SNPs at the known sites of DNA sequence. The TGCE approach combined with sequencing and SBE is fast and cost-effective for high-throughput mutation/SNP detection.

  9. Progress towards DNA sequencing at the single molecule level

    SciTech Connect

    Goodwin, P.M.; Affleck, R.L.; Ambrose, W.P.

    1995-12-01

    We describe progress towards sequencing DNA at the single molecule level. Our technique involves incorporation of fluorescently tagged nucleotides into a targeted sequence, anchoring the labeled DNA strand in a flowing stream, sequential exonuclease digestion of the DNA strand, and efficient detection and identification of single tagged nucleotides. Experiments demonstrating strand-specific exonuclease digestion of fluorescently labeled DNA anchored in flow as well as the detection of single cleaved fluorescently tagged nucleotides from a small number of anchored DNA fragments are described. We find that the turnover rate of Escherichia coli exonuclease III on fluorescently labeled DNA in flow at 36 °C is approximately 7 nucleotides per DNA strand per second, which is approximately the same as that measured for this enzyme on native DNA under static, saturated (excess enzyme) conditions. Experiments demonstrating the efficient detection of single fluorescent molecules delivered electrokinetically to an approximately 3 pL probe volume are also described.

  10. Advanced microinstrumentation for rapid DNA sequencing and large DNA fragment separation

    SciTech Connect

    Balch, J.; Davidson, J.; Brewer, L.; Gingrich, J.; Koo, J.; Mariella, R.; Carrano, A.

    1995-01-25

    Our efforts to develop novel technology for a rapid DNA sequencer and large fragment analysis system based upon gel electrophoresis are described. We are using microfabrication technology to build dense arrays of high speed micro electrophoresis lanes that will ultimately increase the sequencing rate of DNA by at least 100 times the rate of current sequencers. We have demonstrated high resolution DNA fragment separation needed for sequencing in polyacrylamide microgels formed in glass microchannels. We have built prototype arrays of microchannels having up to 48 channels. Significant progress has also been made in developing a sensitive fluorescence detection system based upon a confocal microscope design that will enable the diagnostics and detection of DNA fragments in ultrathin microchannel gels. Development of a rapid DNA sequencer and fragment analysis system will have a major impact on future DNA instrumentation used in clinical, molecular and forensic analysis of DNA fragments.

  11. Analysis of separate isolates of Bordetella pertussis repeated DNA sequences.

    PubMed

    McPheat, W L; Hanson, J H; Livey, I; Robertson, J S

    1989-06-01

    Two independent isolates of a Bordetella pertussis repeated DNA unit were sequenced and shown to be an insertion sequence element with five nucleotide differences between the two copies. The sequences were 1053 bp in length with near-perfect terminal inverted repeats of 28 bp, had three open reading frames, and were each flanked by short direct repeats. The two insertion sequences showed considerable homology to two other B. pertussis repeated DNA sequences reported recently: IS481 and a 530 bp repeated DNA unit. The B. pertussis insertion sequence would appear to comprise a group of closely related sequences differing mainly in flanking direct repeats and the terminal inverted repeats. The two isolates reported here, which were from the adenylate cyclase and agglutinogen 2 regions of the genome, were numbered IS481v1 and IS481v2 respectively. PMID:2559151

  12. Advances in DNA sequencing technologies for high resolution HLA typing.

    PubMed

    Cereb, Nezih; Kim, Hwa Ran; Ryu, Jaejun; Yang, Soo Young

    2015-12-01

    This communication describes our experience in large-scale G group-level high resolution HLA typing using three different DNA sequencing platforms - ABI 3730 xl, Illumina MiSeq and PacBio RS II. Recent advances in DNA sequencing technologies, so-called next generation sequencing (NGS), have brought breakthroughs in deciphering the genetic information in all living species at a large scale and at an affordable level. The NGS DNA indexing system allows sequencing multiple genes for large number of individuals in a single run. Our laboratory has adopted and used these technologies for HLA molecular testing services. We found that each sequencing technology has its own strengths and weaknesses, and their sequencing performances complement each other. HLA genes are highly complex and genotyping them is quite challenging. Using these three sequencing platforms, we were able to meet all requirements for G group-level high resolution and high volume HLA typing. PMID:26423536

  13. Conserved Sequences at the Origin of Adenovirus DNA Replication

    PubMed Central

    Stillman, Bruce W.; Topp, William C.; Engler, Jeffrey A.

    1982-01-01

    The origin of adenovirus DNA replication lies within an inverted sequence repetition at either end of the linear, double-stranded viral DNA. Initiation of DNA replication is primed by a deoxynucleoside that is covalently linked to a protein, which remains bound to the newly synthesized DNA. We demonstrate that virion-derived DNA-protein complexes from five human adenovirus serological subgroups (A to E) can act as a template for both the initiation and the elongation of DNA replication in vitro, using nuclear extracts from adenovirus type 2 (Ad2)-infected HeLa cells. The heterologous template DNA-protein complexes were not as active as the homologous Ad2 DNA, most probably due to inefficient initiation by Ad2 replication factors. In an attempt to identify common features which may permit this replication, we have also sequenced the inverted terminal repeated DNA from human adenovirus serotypes Ad4 (group E), Ad9 and Ad10 (group D), and Ad31 (group A), and we have compared these to previously determined sequences from Ad2 and Ad5 (group C), Ad7 (group B), and Ad12 and Ad18 (group A) DNA. In all cases, the sequence around the origin of DNA replication can be divided into two structural domains: a proximal A · T-rich region which is partially conserved among these serotypes, and a distal G · C-rich region which is less well conserved. The G · C-rich region contains sequences similar to sequences present in papovavirus replication origins. The two domains may reflect a dual mechanism for initiation of DNA replication: adenovirus-specific protein priming of replication, and subsequent utilization of this primer by host replication factors for completion of DNA synthesis. PMID:7143575

  14. Multiplexed Sequence Encoding: A Framework for DNA Communication.

    PubMed

    Zakeri, Bijan; Carr, Peter A; Lu, Timothy K

    2016-01-01

    Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication (data encoding, data transfer and data extraction) and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system, Multiplexed Sequence Encoding (MuSE), that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA. PMID:27050646

  15. Multiplexed Sequence Encoding: A Framework for DNA Communication

    PubMed Central

    Zakeri, Bijan; Carr, Peter A.; Lu, Timothy K.

    2016-01-01

    Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication—data encoding, data transfer & data extraction—and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system—Multiplexed Sequence Encoding (MuSE)—that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA. PMID:27050646
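
    The encoding goal described above (turning plaintext into DNA while avoiding homopolymers) can be illustrated with a minimal sketch. This is not the iKeys scheme from the paper, whose details are not given in the abstract. It assumes a simple rotating base-3 code in which each ternary digit picks one of the three nucleotides that differ from the previous one, so no base is ever repeated. The function names and the `start` parameter are hypothetical.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # 3**6 = 729, enough to cover one byte (0..255)


def text_to_dna(text: str, start: str = "A") -> str:
    """Encode text as DNA with no two identical adjacent bases (assumed scheme)."""
    prev = start  # virtual base preceding the message; never emitted
    out = []
    for byte in text.encode("utf-8"):
        # Split the byte into six base-3 digits, least significant first.
        trits = []
        for _ in range(TRITS_PER_BYTE):
            trits.append(byte % 3)
            byte //= 3
        # Emit most significant digit first; each digit selects one of the
        # three bases that differ from the previous base.
        for t in reversed(trits):
            choices = [b for b in BASES if b != prev]
            prev = choices[t]
            out.append(prev)
    return "".join(out)


def dna_to_text(dna: str, start: str = "A") -> str:
    """Invert text_to_dna; `start` must match the value used for encoding."""
    prev = start
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    data = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for t in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + t
        data.append(value)
    return data.decode("utf-8")
```

    The cost of this sketch is density: six nucleotides per byte instead of the four that an unconstrained two-bit code would need. A real design would also have to handle synthesis length limits and error correction.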

  16. Quantitative Comparison of Large-Scale DNA Enrichment Sequencing Data.

    PubMed

    Lienhard, Matthias; Chavez, Lukas

    2016-01-01

    DNA enrichment followed by sequencing (DNA-IP seq) is a versatile tool in molecular biology with a wide variety of applications. Computational analysis of differential DNA enrichment between conditions is important for identifying epigenetic alterations in disease compared to healthy controls and for revealing dynamic epigenetic modifications throughout normal and distorted cell differentiation and development. We present a protocol for genome-wide comparative analysis of DNA-IP sequencing data to identify statistically significant differential sequencing coverage between two conditions by considering variation across replicates. The protocol provides a detailed description for the comparative analysis of DNA-IP sequencing data including basic data processing, quality controls, and identification of differential enrichment using the Bioconductor package "MEDIPS". PMID:27008016

  17. Compiling Multicopy Single-Stranded DNA Sequences from Bacterial Genome Sequences

    PubMed Central

    Yoo, Wonseok; Lim, Dongbin

    2016-01-01

    A retron is a bacterial retroelement that encodes an RNA gene and a reverse transcriptase (RT). The former, once transcribed, works as a template primer for reverse transcription by the latter. The resulting DNA is covalently linked to the upstream part of the RNA; this chimera is called multicopy single-stranded DNA (msDNA), which is extrachromosomal DNA found in many bacterial species. Based on the conserved features in the eight known msDNA sequences, we developed a detection method and applied it to scan National Center for Biotechnology Information (NCBI) RefSeq bacterial genome sequences. Among 16,844 bacterial sequences possessing a retron-type RT domain, we identified 48 unique types of msDNA. Currently, the biological role of msDNA is not well understood. Our work will be a useful tool in studying the distribution, evolution, and physiological role of msDNA. PMID:27103888

  18. Characterization of group A Streptococcus strains recovered from Mexican children with pharyngitis by automated DNA sequencing of virulence-related genes: unexpectedly large variation in the gene (sic) encoding a complement-inhibiting protein.

    PubMed

    Mejia, L M; Stockbauer, K E; Pan, X; Cravioto, A; Musser, J M

    1997-12-01

    Sequence variation was studied in several target genes in 54 strains of group A Streptococcus (GAS) cultured from children with pharyngitis in Mexico City. Although 16 distinct emm alleles were identified, only 4 had not been previously described. Virtually all bacteria (31 of 33 [94%]) with the streptococcal pyrogenic exotoxin gene (speA) had emm1-related, emm3, or emm6 alleles. The gene (sic) encoding an extracellular GAS protein that inhibits complement function was unusually variable among isolates with the emm1 family of alleles, with a total of seven variants identified. The data suggest that many GAS strains infecting Mexican children are genetically similar to organisms commonly encountered in the United States and western Europe. Sequence variation in the sic gene is useful for rapid differentiation among GAS isolates with the emm1 family of alleles. PMID:9399523

  19. Characterization of group A Streptococcus strains recovered from Mexican children with pharyngitis by automated DNA sequencing of virulence-related genes: unexpectedly large variation in the gene (sic) encoding a complement-inhibiting protein.

    PubMed Central

    Mejia, L M; Stockbauer, K E; Pan, X; Cravioto, A; Musser, J M

    1997-01-01

    Sequence variation was studied in several target genes in 54 strains of group A Streptococcus (GAS) cultured from children with pharyngitis in Mexico City. Although 16 distinct emm alleles were identified, only 4 had not been previously described. Virtually all bacteria (31 of 33 [94%]) with the streptococcal pyrogenic exotoxin gene (speA) had emm1-related, emm3, or emm6 alleles. The gene (sic) encoding an extracellular GAS protein that inhibits complement function was unusually variable among isolates with the emm1 family of alleles, with a total of seven variants identified. The data suggest that many GAS strains infecting Mexican children are genetically similar to organisms commonly encountered in the United States and western Europe. Sequence variation in the sic gene is useful for rapid differentiation among GAS isolates with the emm1 family of alleles. PMID:9399523

  20. An Evolution Based Biosensor Receptor DNA Sequence Generation Algorithm

    PubMed Central

    Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M.; Lee, Jaewan; Zang, Yupeng

    2010-01-01

    A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements. PMID:22315543

  1. Biological nanopore MspA for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Manrao, Elizabeth A.

    Unlocking the information hidden in the human genome provides insight into the inner workings of complex biological systems and can be used to greatly improve health-care. In order to allow for widespread sequencing, new technologies are required that provide fast and inexpensive readings of DNA. Nanopore sequencing is a third generation DNA sequencing technology that is currently being developed to fulfill this need. In nanopore sequencing, a voltage is applied across a small pore in an electrolyte solution and the resulting ionic current is recorded. When DNA passes through the channel, the ionic current is partially blocked. If the DNA bases uniquely modulate the ionic current flowing through the channel, the time trace of the current can be related to the sequence of DNA passing through the pore. There are two main challenges to realizing nanopore sequencing: identifying a pore with sensitivity to single nucleotides and controlling the translocation of DNA through the pore so that the small single nucleotide current signatures are distinguishable from background noise. In this dissertation, I explore the use of Mycobacterium smegmatis porin A (MspA) for nanopore sequencing. In order to determine MspA's sensitivity to single nucleotides, DNA strands of various compositions are held in the pore as the resulting ionic current is measured. DNA is immobilized in MspA by attaching it to a large molecule which acts as an anchor. This technique confirms the single nucleotide resolution of the pore and additionally shows that MspA is sensitive to epigenetic modifications and single nucleotide polymorphisms. The forces from the electric field within MspA, the effective charge of nucleotides, and elasticity of DNA are estimated using a Freely Jointed Chain model of single stranded DNA. These results offer insight into the interactions of DNA within the pore. 
    With the nucleotide sensitivity of MspA confirmed, a method is introduced to controllably pass DNA through the pore

  2. Integrated on-line system for DNA sequencing by capillary electrophoresis: From template to called bases

    SciTech Connect

    Ton, H.; Yeung, E.S.

    1997-02-15

    An integrated on-line prototype for coupling a microreactor to capillary electrophoresis for DNA sequencing has been demonstrated. A dye-labeled terminator cycle-sequencing reaction is performed in a fused-silica capillary. Subsequently, the sequencing ladder is directly injected into a size-exclusion chromatographic column operated at nearly 95 °C for purification. On-line injection to a capillary for electrophoresis is accomplished at a junction set at nearly 70 °C. High temperature at the purification column and injection junction prevents the renaturation of DNA fragments during on-line transfer without affecting the separation. The high solubility of DNA in and the relatively low ionic strength of 1× TE buffer permit both effective purification and electrokinetic injection of the DNA sample. The system is compatible with highly efficient separations by a replaceable poly(ethylene oxide) polymer solution in uncoated capillary tubes. Future automation and adaptation to a multiple-capillary array system should allow high-speed, high-throughput DNA sequencing from templates to called bases in one step. 32 refs., 5 figs.

  3. DNA sequence analysis with droplet-based microfluidics

    PubMed Central

    Abate, Adam R.; Hung, Tony; Sperling, Ralph A.; Mary, Pascaline; Rotem, Assaf; Agresti, Jeremy J.; Weiner, Michael A.; Weitz, David A.

    2014-01-01

    Droplet-based microfluidic techniques can form and process micrometer scale droplets at thousands per second. Each droplet can house an individual biochemical reaction, allowing millions of reactions to be performed in minutes with small amounts of total reagent. This versatile approach has been used for engineering enzymes, quantifying concentrations of DNA in solution, and screening protein crystallization conditions. Here, we use it to read the sequences of DNA molecules with a FRET-based assay. Using probes of different sequences, we interrogate a target DNA molecule for polymorphisms. With a larger probe set, additional polymorphisms can be interrogated as well as targets of arbitrary sequence. PMID:24185402

  4. DNA Methyltransferase Accessibility Protocol for Individual Templates by Deep Sequencing

    PubMed Central

    Darst, Russell P.; Nabilsi, Nancy H.; Pardo, Carolina E.; Riva, Alberto; Kladde, Michael P.

    2013-01-01

    A single-molecule probe of chromatin structure can uncover dynamic chromatin states and rare epigenetic variants of biological importance that bulk measures of chromatin structure miss. In bisulfite genomic sequencing, each sequenced clone records the methylation status of multiple sites on an individual molecule of DNA. An exogenous DNA methyltransferase can thus be used to image nucleosomes and other protein–DNA complexes. In this chapter, we describe the adaptation of this technique, termed Methylation Accessibility Protocol for individual templates, to modern high-throughput sequencing, which both simplifies the workflow and extends its utility. PMID:22929770

  5. An Optimal Seed Based Compression Algorithm for DNA Sequences

    PubMed Central

    Gopalakrishnan, Gopakumar; Karunakaran, Muralikrishnan

    2016-01-01

    This paper proposes a seed-based lossless compression algorithm for DNA sequences that uses a substitution method similar to the Lempel-Ziv compression scheme. The proposed method exploits the repetition structures that are inherent in DNA sequences by creating an offline dictionary which contains all such repeats along with the details of mismatches. By ensuring that only promising mismatches are allowed, the method achieves a compression ratio that is on par with or better than existing lossless DNA sequence compression algorithms. PMID:27555868

  6. Profiling DNA Methylomes from Microarray to Genome-Scale Sequencing

    PubMed Central

    Huang, Yi-Wen; Huang, Tim H.-M.; Wang, Li-Shu

    2010-01-01

    DNA cytosine methylation is a central epigenetic modification which plays critical roles in cellular processes including genome regulation, development and disease. Here, we review current and emerging microarray and next-generation sequencing based technologies that enhance our knowledge of DNA methylation profiling. Each methodology has its own limitations and unique applications, and combinations of several modalities may help build the entire methylome. With advances in next-generation sequencing technologies, it is now possible to globally map DNA cytosine methylation at single-base resolution, providing new insights into the regulation and dynamics of DNA methylation in genomes. PMID:20218736

  7. Profiling DNA methylomes from microarray to genome-scale sequencing.

    PubMed

    Huang, Yi-Wei; Huang, Tim H-M; Wang, Li-Shu

    2010-04-01

    DNA cytosine methylation is a central epigenetic modification which plays critical roles in cellular processes including genome regulation, development and disease. Here, we review current and emerging microarray and next-generation sequencing based technologies that enhance our knowledge of DNA methylation profiling. Each methodology has its own limitations and unique applications, and combinations of several modalities may help build the entire methylome. With advances in next-generation sequencing technologies, it is now possible to globally map DNA cytosine methylation at single-base resolution, providing new insights into the regulation and dynamics of DNA methylation in genomes. PMID:20218736

  8. Current-voltage characteristics of double-strand DNA sequences

    NASA Astrophysics Data System (ADS)

    Bezerril, L. M.; Moreira, D. A.; Albuquerque, E. L.; Fulco, U. L.; de Oliveira, E. L.; de Sousa, J. S.

    2009-09-01

    We use a tight-binding formulation to investigate the transmissivity and the current-voltage (I-V) characteristics of sequences of double-strand DNA molecules. In order to reveal the relevance of the underlying correlations in the nucleotides distribution, we compare the results for the genomic DNA sequence with those of artificial sequences (the long-range correlated Fibonacci and Rudin-Shapiro one) and a random sequence, which is a kind of prototype of a short-range correlated system. The random sequence is presented here with the same first neighbors pair correlations of the human DNA sequence. We found that the long-range character of the correlations is important to the transmissivity spectra, although the I-V curves seem to be mostly influenced by the short-range correlations.

  9. Configuring the Orion Guidance, Navigation, and Control Flight Software for Automated Sequencing

    NASA Technical Reports Server (NTRS)

    Odegard, Ryan G.; Siliwinski, Tomasz K.; King, Ellis T.; Hart, Jeremy J.

    2010-01-01

    The Orion Crew Exploration Vehicle is being designed with greater automation capabilities than any other crewed spacecraft in NASA's history. The Guidance, Navigation, and Control (GN&C) flight software architecture is designed to provide a flexible and evolvable framework that accommodates increasing levels of automation over time. Within the GN&C flight software, a data-driven approach is used to configure software. This approach allows data reconfiguration and updates to automated sequences without requiring recompilation of the software. Because of the great dependency of the automation and the flight software on the configuration data, the data management is a vital component of the processes for software certification, mission design, and flight operations. To enable the automated sequencing and data configuration of the GN&C subsystem on Orion, a desktop database configuration tool has been developed. The database tool allows the specification of the GN&C activity sequences, the automated transitions in the software, and the corresponding parameter reconfigurations. These aspects of the GN&C automation on Orion are all coordinated via data management, and the database tool provides the ability to test the automation capabilities during the development of the GN&C software. In addition to providing the infrastructure to manage the GN&C automation, the database tool has been designed with capabilities to import and export artifacts for simulation analysis and documentation purposes. Furthermore, the database configuration tool, currently used to manage simulation data, is envisioned to evolve into a mission planning tool for generating and testing GN&C software sequences and configurations. A key enabler of the GN&C automation design, the database tool allows both the creation and maintenance of the data artifacts, as well as serving the critical role of helping to manage, visualize, and understand the data-driven parameters both during software development

  10. Identification of genes in anonymous DNA sequences. Final report: Report period, 15 April 1993--15 April 1994

    SciTech Connect

    Fields, C.A.

    1994-09-01

    This Report concludes the DOE Human Genome Program project, "Identification of Genes in Anonymous DNA Sequence." The central goals of this project have been (1) understanding the problem of identifying genes in anonymous sequences, and (2) development of tools, primarily the automated identification system gm, for identifying genes. The activities supported under the previous award are summarized here to provide a single complete report on the activities supported as part of the project from its inception to its completion.

  11. Sequence specific generation of a DNA panhandle permits PCR amplification of unknown flanking DNA.

    PubMed Central

    Jones, D H; Winistorfer, S C

    1992-01-01

    We present a novel method for the PCR amplification of unknown DNA that flanks a known segment directly from human genomic DNA. PCR requires that primer annealing sites be present on each end of the DNA segment that is to be amplified. In this method, known DNA is placed on the uncharacterized side of the sequence of interest via DNA polymerase mediated generation of a PCR template that is shaped like a pan with a handle. Generation of this template permits specific amplification of the unknown sequence. Taq (DNA) polymerase was used to form the original template and to generate the PCR product. 2.2 kb of the beta-globin gene, and 657 bp of the 5' flanking region of the cystic fibrosis transmembrane conductance regulator gene, were amplified directly from human genomic DNA using primers that initially flank only one side of the region amplified. This method will provide a powerful tool for acquiring DNA sequence information. PMID:1371352

  12. Plasmonic Nanopores for Trapping, Controlling Displacement, and Sequencing of DNA

    PubMed Central

    2015-01-01

    With the aim of developing a DNA sequencing methodology, we theoretically examine the feasibility of using nanoplasmonics to control the translocation of a DNA molecule through a solid-state nanopore and to read off sequence information using surface-enhanced Raman spectroscopy. Using molecular dynamics simulations, we show that high-intensity optical hot spots produced by a metallic nanostructure can arrest DNA translocation through a solid-state nanopore, thus providing a physical knob for controlling the DNA speed. Switching the plasmonic field on and off can displace the DNA molecule in discrete steps, sequentially exposing neighboring fragments of a DNA molecule to the pore as well as to the plasmonic hot spot. Surface-enhanced Raman scattering from the exposed DNA fragments contains information about their nucleotide composition, possibly allowing the identification of the nucleotide sequence of a DNA molecule transported through the hot spot. The principles of plasmonic nanopore sequencing can be extended to detection of DNA modifications and RNA characterization. PMID:26401685

  13. Plasmonic Nanopores for Trapping, Controlling Displacement, and Sequencing of DNA.

    PubMed

    Belkin, Maxim; Chao, Shu-Han; Jonsson, Magnus P; Dekker, Cees; Aksimentiev, Aleksei

    2015-11-24

    With the aim of developing a DNA sequencing methodology, we theoretically examine the feasibility of using nanoplasmonics to control the translocation of a DNA molecule through a solid-state nanopore and to read off sequence information using surface-enhanced Raman spectroscopy. Using molecular dynamics simulations, we show that high-intensity optical hot spots produced by a metallic nanostructure can arrest DNA translocation through a solid-state nanopore, thus providing a physical knob for controlling the DNA speed. Switching the plasmonic field on and off can displace the DNA molecule in discrete steps, sequentially exposing neighboring fragments of a DNA molecule to the pore as well as to the plasmonic hot spot. Surface-enhanced Raman scattering from the exposed DNA fragments contains information about their nucleotide composition, possibly allowing the identification of the nucleotide sequence of a DNA molecule transported through the hot spot. The principles of plasmonic nanopore sequencing can be extended to detection of DNA modifications and RNA characterization. PMID:26401685

  14. Semiconductor-based DNA sequencing of histone modification states.

    PubMed

    Cheng, Christine S; Rai, Kunal; Garber, Manuel; Hollinger, Andrew; Robbins, Dana; Anderson, Scott; Macbeth, Alyssa; Tzou, Austin; Carneiro, Mauricio O; Raychowdhury, Raktima; Russ, Carsten; Hacohen, Nir; Gershenwald, Jeffrey E; Lennon, Niall; Nusbaum, Chad; Chin, Lynda; Regev, Aviv; Amit, Ido

    2013-01-01

    The recent development of a semiconductor-based, non-optical DNA sequencing technology promises scalable, low-cost and rapid sequence data production. The technology has previously been applied mainly to genomic sequencing and targeted re-sequencing. Here we demonstrate the utility of Ion Torrent semiconductor-based sequencing for sensitive, efficient and rapid chromatin immunoprecipitation followed by sequencing (ChIP-seq) through the application of sample preparation methods that are optimized for ChIP-seq on the Ion Torrent platform. We leverage this method for epigenetic profiling of tumour tissues. PMID:24157732

  15. Semiconductor-based DNA sequencing of histone modification states

    PubMed Central

    Cheng, Christine S.; Rai, Kunal; Garber, Manuel; Hollinger, Andrew; Robbins, Dana; Anderson, Scott; Macbeth, Alyssa; Tzou, Austin; Carneiro, Mauricio O.; Raychowdhury, Raktima; Russ, Carsten; Hacohen, Nir; Gershenwald, Jeffrey E.; Lennon, Niall; Nusbaum, Chad; Chin, Lynda; Regev, Aviv; Amit, Ido

    2013-01-01

    The recent development of a semiconductor-based, non-optical DNA sequencing technology promises scalable, low-cost and rapid sequence data production. The technology has previously been applied mainly to genomic sequencing and targeted re-sequencing. Here we demonstrate the utility of Ion Torrent semiconductor-based sequencing for sensitive, efficient and rapid chromatin immunoprecipitation followed by sequencing (ChIP-seq) through the application of sample preparation methods that are optimized for ChIP-seq on the Ion Torrent platform. We leverage this method for epigenetic profiling of tumour tissues. PMID:24157732

  16. ATRF Houses the Latest DNA Sequencing Technologies | Poster

    Cancer.gov

    By Ashley DeVine, Staff Writer. By the end of October, the Advanced Technology Research Facility (ATRF) will be one of the few facilities in the world to house all of the latest DNA sequencing technologies.

  17. Microchannel DNA Sequencing by End-Labelled Free Solution Electrophoresis

    SciTech Connect

    Barron, A.

    2005-09-29

    The further development of End-Labeled Free-Solution Electrophoresis will greatly simplify DNA separation and sequencing on microfluidic devices. The development and optimization of drag-tags is critical to the success of this research.

  18. DNA sequencing using polymerase substrate-binding kinetics

    PubMed Central

    Previte, Michael John Robert; Zhou, Chunhong; Kellinger, Matthew; Pantoja, Rigo; Chen, Cheng-Yao; Shi, Jin; Wang, BeiBei; Kia, Amirali; Etchin, Sergey; Vieceli, John; Nikoomanzar, Ali; Bomati, Erin; Gloeckner, Christian; Ronaghi, Mostafa; He, Molly Min

    2015-01-01

    Next-generation sequencing (NGS) has transformed genomic research by decreasing the cost of sequencing. However, whole-genome sequencing is still costly and complex for diagnostics purposes. In the clinical space, targeted sequencing has the advantage of allowing researchers to focus on specific genes of interest. Routine clinical use of targeted NGS mandates inexpensive instruments, fast turnaround time and an integrated and robust workflow. Here we demonstrate a version of the Sequencing by Synthesis (SBS) chemistry that potentially can become a preferred targeted sequencing method in the clinical space. This sequencing chemistry uses natural nucleotides and is based on real-time recording of the differential polymerase/DNA-binding kinetics in the presence of correct or mismatch nucleotides. This ensemble SBS chemistry has been implemented on an existing Illumina sequencing platform with integrated cluster amplification. We discuss the advantages of this sequencing chemistry for targeted sequencing as well as its limitations for other applications. PMID:25612848

  19. DNA sequencing using polymerase substrate-binding kinetics.

    PubMed

    Previte, Michael John Robert; Zhou, Chunhong; Kellinger, Matthew; Pantoja, Rigo; Chen, Cheng-Yao; Shi, Jin; Wang, BeiBei; Kia, Amirali; Etchin, Sergey; Vieceli, John; Nikoomanzar, Ali; Bomati, Erin; Gloeckner, Christian; Ronaghi, Mostafa; He, Molly Min

    2015-01-01

    Next-generation sequencing (NGS) has transformed genomic research by decreasing the cost of sequencing. However, whole-genome sequencing is still costly and complex for diagnostics purposes. In the clinical space, targeted sequencing has the advantage of allowing researchers to focus on specific genes of interest. Routine clinical use of targeted NGS mandates inexpensive instruments, fast turnaround time and an integrated and robust workflow. Here we demonstrate a version of the Sequencing by Synthesis (SBS) chemistry that potentially can become a preferred targeted sequencing method in the clinical space. This sequencing chemistry uses natural nucleotides and is based on real-time recording of the differential polymerase/DNA-binding kinetics in the presence of correct or mismatch nucleotides. This ensemble SBS chemistry has been implemented on an existing Illumina sequencing platform with integrated cluster amplification. We discuss the advantages of this sequencing chemistry for targeted sequencing as well as its limitations for other applications. PMID:25612848

  20. Levenshtein error-correcting barcodes for multiplexed DNA sequencing

    PubMed Central

    2013-01-01

    Background: High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fractions of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called multiplexing approach relies on a specific DNA tag or barcode that is attached to the sequencing or amplification primer and hence appears at the beginning of the sequence in every read. After sequencing, each sample read is identified on the basis of the respective barcode sequence. Alterations of DNA barcodes during synthesis, primer ligation, DNA amplification, or sequencing may lead to incorrect sample identification unless the error is revealed and corrected. This can be accomplished by implementing error correcting algorithms and codes. This barcoding strategy increases the total number of correctly identified samples, thus improving overall sequencing efficiency. Two popular sets of error-correcting codes are Hamming codes and Levenshtein codes. Results: Levenshtein codes operate only on words of known length. Since a DNA sequence with an embedded barcode is essentially one continuous long word, application of the classical Levenshtein algorithm is problematic. In this paper we demonstrate the decreased error correction capability of Levenshtein codes in a DNA context and suggest an adaptation of Levenshtein codes that is proven to efficiently correct nucleotide errors in DNA sequences. In our adaptation we take the DNA context into account and redefine the word length whenever an insertion or deletion is revealed. In simulations we show the superior error correction capability of the new method compared to traditional Levenshtein and Hamming based codes in the presence of multiple errors. Conclusion: We present an adaptation of Levenshtein codes to DNA contexts capable of correction of a pre-defined number of insertion, deletion, and substitution mutations. Our improved
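
    As a point of reference for the abstract above, the following minimal sketch shows the classical approach it critiques: computing Levenshtein (edit) distance between a fixed-length read prefix and each known barcode, then assigning the read to the closest barcode. It is not the authors' adapted code. The function names, the `max_errors` threshold, and the example barcodes are assumptions made for illustration.

```python
from typing import Iterable, Optional


def levenshtein(a: str, b: str) -> int:
    """Classical edit distance: insertions, deletions, substitutions cost 1."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def assign_barcode(read_prefix: str, barcodes: Iterable[str],
                   max_errors: int = 1) -> Optional[str]:
    """Return the unique closest barcode within max_errors, else None."""
    scored = sorted((levenshtein(read_prefix, bc), bc) for bc in barcodes)
    if not scored:
        return None
    best_dist, best_bc = scored[0]
    if best_dist > max_errors:
        return None  # too many errors to trust any assignment
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None  # two barcodes are equally close: ambiguous
    return best_bc
```

    Note that this sketch compares a prefix of fixed length with each barcode. After an insertion or deletion the true barcode boundary shifts within the read, which is the word-length problem the abstract describes and which this sketch does not address.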

  1. Local alignment of two-base encoded DNA sequence

    PubMed Central

    Homer, Nils; Merriman, Barry; Nelson, Stanley F

    2009-01-01

    Background: DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. Results: We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two-base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. Conclusion: The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, and facilitating genome re-sequencing efforts based on this form of sequence data. PMID:19508732
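
    To make the decoding problem in the abstract above concrete, here is a minimal sketch of two-base encoding and its deterministic decoding. The abstract does not spell out the code, so this assumes the commonly used color-space mapping in which each adjacent pair of bases is represented by the XOR of their indices (A=0, C=1, G=2, T=3). The function names are hypothetical, and the sketch does not implement the authors' alignment algorithm.

```python
BASES = "ACGT"
INDEX = {base: i for i, base in enumerate(BASES)}


def encode_colors(seq: str) -> list[int]:
    """Map each adjacent base pair to a color 0..3 (XOR of base indices)."""
    return [INDEX[a] ^ INDEX[b] for a, b in zip(seq, seq[1:])]


def decode_colors(first_base: str, colors: list[int]) -> str:
    """Rebuild the base sequence from a known first base and the colors."""
    out = [first_base]
    for color in colors:
        # Each color tells us how to step from the previous base to the next.
        out.append(BASES[INDEX[out[-1]] ^ color])
    return "".join(out)


if __name__ == "__main__":
    colors = encode_colors("ATGGCA")
    print(colors)
    print(decode_colors("A", colors))
    # A single mis-read color changes every base decoded after it.
    corrupted = colors.copy()
    corrupted[1] = 2
    print(decode_colors("A", corrupted))
```

    Because each decoded base depends on the one before it, one measurement error corrupts the rest of a naively decoded read. That propagation is why decoding and alignment benefit from being solved together rather than one after the other.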

  2. Discovering simple DNA sequences by the algorithmic significance method.

    PubMed

    Milosavljević, A; Jurka, J

    1993-08-01

    A new method, 'algorithmic significance', is proposed as a tool for discovery of patterns in DNA sequences. The main idea is that patterns can be discovered by finding ways to encode the observed data concisely. In this sense, the method can be viewed as a formal version of the Occam's Razor principle. In this paper the method is applied to discover significantly simple DNA sequences. We define DNA sequences to be simple if they contain repeated occurrences of certain 'words' and thus can be encoded in a small number of bits. Such a definition includes minisatellites and microsatellites. A standard dynamic programming algorithm for data compression is applied to compute the minimal encoding lengths of sequences in linear time. An electronic mail server for identification of simple sequences based on the proposed method has been installed at the Internet address pythia@anl.gov. PMID:8402207
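The idea of discovering simple sequences through concise encoding can be sketched in Python. This is only an illustration under stated assumptions: zlib is swapped in for the authors' dynamic programming encoder, the null model is taken to be 2 bits per base, and the bound used (a saving of d bits has null probability at most 2^-d) is the commonly cited form of the algorithmic significance result rather than something stated in the abstract. Function names are hypothetical.

```python
import zlib


def bits_saved(seq: str) -> int:
    """Null-model length (assumed 2 bits per base) minus zlib-encoded length, in bits."""
    null_bits = 2 * len(seq)
    encoded_bits = 8 * len(zlib.compress(seq.encode("ascii"), 9))
    return null_bits - encoded_bits


def significance_bound(d: int) -> float:
    """Assumed bound on the null probability of saving at least d bits."""
    return 1.0 if d <= 0 else 2.0 ** -d


microsatellite = "CA" * 500          # toy dinucleotide repeat, 1000 bases
d = bits_saved(microsatellite)
print(d > 0, significance_bound(d))
```

A general-purpose compressor carries header overhead, so short sequences rarely show a saving here; a purpose-built encoder such as the one the abstract describes would not have that limitation.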

  3. Nuclear and mitochondrial DNA sequences from two Denisovan individuals.

    PubMed

    Sawyer, Susanna; Renaud, Gabriel; Viola, Bence; Hublin, Jean-Jacques; Gansauge, Marie-Theres; Shunkov, Michael V; Derevianko, Anatoly P; Prüfer, Kay; Kelso, Janet; Pääbo, Svante

    2015-12-22

    Denisovans, a sister group of Neandertals, have been described on the basis of a nuclear genome sequence from a finger phalanx (Denisova 3) found in Denisova Cave in the Altai Mountains. The only other Denisovan specimen described to date is a molar (Denisova 4) found at the same site. This tooth carries a mtDNA sequence similar to that of Denisova 3. Here we present nuclear DNA sequences from Denisova 4 and a morphological description, as well as mitochondrial and nuclear DNA sequence data, from another molar (Denisova 8) found in Denisova Cave in 2010. This new molar is similar to Denisova 4 in being very large and lacking traits typical of Neandertals and modern humans. Nuclear DNA sequences from the two molars form a clade with Denisova 3. The mtDNA of Denisova 8 is more diverged and has accumulated fewer substitutions than the mtDNAs of the other two specimens, suggesting Denisovans were present in the region over an extended period. The nuclear DNA sequence diversity among the three Denisovans is comparable to that among six Neandertals, but lower than that among present-day humans. PMID:26630009

  4. Nuclear and mitochondrial DNA sequences from two Denisovan individuals

    PubMed Central

    Sawyer, Susanna; Renaud, Gabriel; Viola, Bence; Hublin, Jean-Jacques; Gansauge, Marie-Theres; Shunkov, Michael V.; Derevianko, Anatoly P.; Prüfer, Kay; Pääbo, Svante

    2015-01-01

    Denisovans, a sister group of Neandertals, have been described on the basis of a nuclear genome sequence from a finger phalanx (Denisova 3) found in Denisova Cave in the Altai Mountains. The only other Denisovan specimen described to date is a molar (Denisova 4) found at the same site. This tooth carries a mtDNA sequence similar to that of Denisova 3. Here we present nuclear DNA sequences from Denisova 4 and a morphological description, as well as mitochondrial and nuclear DNA sequence data, from another molar (Denisova 8) found in Denisova Cave in 2010. This new molar is similar to Denisova 4 in being very large and lacking traits typical of Neandertals and modern humans. Nuclear DNA sequences from the two molars form a clade with Denisova 3. The mtDNA of Denisova 8 is more diverged and has accumulated fewer substitutions than the mtDNAs of the other two specimens, suggesting Denisovans were present in the region over an extended period. The nuclear DNA sequence diversity among the three Denisovans is comparable to that among six Neandertals, but lower than that among present-day humans. PMID:26630009

  5. Effects of sequence on DNA wrapping around histones

    NASA Astrophysics Data System (ADS)

    Ortiz, Vanessa

    2011-03-01

    A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).

  6. Efficient DNA sequencing on microtiter plates using dried reagents and Bst DNA polymerase.

    PubMed

    Earley, J J; Kuivaniemi, H; Prockop, D J; Tromp, G

    1993-01-01

    Sequenase, Taq DNA polymerase and Bst DNA polymerase were tested for sequencing of DNA on microtiter plates using dried down reagents. Several parameters were investigated to expedite the drying process while minimizing damage to the enzyme. Sequenase did not tolerate drying very well, and frequently generated sequences with weak signals and many sites of premature termination. With Taq DNA polymerase it was possible to obtain sequences of good quality. However, there was considerable variation of results between experiments and between batches of microtiter plates. Bst DNA polymerase generated sequences of excellent quality. It was stable for more than a week in dried-down state at -20 degrees C and at least overnight at room temperature. The method described here using Bst DNA polymerase is well suited for laboratory robots and workstations that typically employ 96-well microtiter plates. PMID:8173079

  7. Analysis of the DNA Sequencing Quality and Efficiency of the Apollo100 Robotic Microcycler in a Core Facility Setting

    PubMed Central

    Logsdon, M. E.; Trounstine, M. C.; Zianni, M. R.

    2011-01-01

    Sanger, or dideoxynucleotide sequencing, is an important tool for biomolecular research. An important trend in DNA sequencing is to find new and innovative ways to provide high-quality, reliable sequences in a more efficient manner, using automated capillary electrophoresis. The Apollo100 combines Sanger cycle sequencing and solid-phase reversible immobilization for product purification in a single instrument with robotic liquid handling and microfluidic (Microscale On-chip Valve) chips that have onboard thermal cycling and pneumatic mixing. Experiments were performed to determine how the DNA sequencing results from the Apollo100 compared with conventional, manual methods used in a core facility setting. Through rigorous experimentation of multiple baseline runs and a dilution series of template concentration, the Apollo100 generated sequencing that exceeded 900 bases with a quality score of 20 or above. When comparing actual client samples of amplicons, plasmids, and cosmids, Apollo100 sequencing results did not differ significantly from those reactions prepared manually. In addition, bacterial genomic DNA was sequenced successfully, directly with the Apollo100, although results were of lower quality than the standard manual method. As a result of the microscale capabilities, the Apollo100 offers valuable savings with respect to the quantity of reagents consumed compared with current manual sequencing methods, thereby continuing the demand for smaller template and reagent requirements. In conclusion, the Apollo100 can generate high-quality DNA sequences for common templates equivalent to those produced using manual sequencing methods and increases efficiency through reduced labor and reagents. PMID:21738437
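The "quality score of 20" threshold in this record can be made concrete with a small Python sketch. It assumes the score is a Phred-scaled value, Q = -10 log10(p), which the abstract does not state explicitly; the function names and the example scores are hypothetical.

```python
import math


def error_probability(q: float) -> float:
    """Per-base error probability for a Phred-scaled quality score."""
    return 10 ** (-q / 10)


def phred(p: float) -> float:
    """Phred-scaled quality score for a per-base error probability."""
    return -10 * math.log10(p)


def count_at_or_above(qualities, threshold=20):
    """Number of bases whose quality meets the threshold."""
    return sum(1 for q in qualities if q >= threshold)


print(error_probability(20))               # about 0.01, i.e. 1 error in 100 bases
print(count_at_or_above([35, 20, 19, 8]))  # 2
```

Under this assumption, a read with more than 900 bases at Q20 or above has an expected error rate of at most 1% at each of those positions.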

  8. The DNA damage checkpoint allows recombination between divergent DNA sequences in budding yeast

    PubMed Central

    George, Carolyn M.; Lyndaker, Amy M.; Alani, Eric

    2011-01-01

    In the early steps of homologous recombination, single-stranded DNA (ssDNA) from a broken chromosome invades homologous sequence located in a sister or homolog donor. In genomes that contain numerous repetitive DNA elements or gene paralogs, recombination can potentially occur between non-allelic/divergent (homeologous) sequences that share sequence identity. Such recombination events can lead to lethal chromosomal deletions or rearrangements. However, homeologous recombination events can be suppressed through rejection mechanisms that involve recognition of DNA mismatches in heteroduplex DNA by mismatch repair factors, followed by active unwinding of the heteroduplex DNA by helicases. Because factors required for heteroduplex rejection are hypothesized to be targets and/or effectors of the DNA damage response (DDR), a cell cycle control mechanism that ensures timely and efficient repair, we tested whether the DDR, and more specifically, the RAD9 gene, had a role in regulating rejection. We performed these studies using a DNA repair assay that measures repair by single-strand annealing (SSA) of a double-strand break (DSB) using homeologous DNA templates. We found that repair of homeologous DNA sequences, but not identical sequences, induced a RAD9- dependent cell cycle delay in the G2 stage of the cell cycle. Repair through a divergent DNA template occurred more frequently in RAD9 compared to rad9Δ strains. However, repair in rad9Δ mutants could be restored to wild-type levels if a G2 delay was induced by nocodazole. These results suggest that cell cycle arrest induced by the Rad9-dependent DDR allows repair between divergent DNA sequences despite the potential for creating deleterious genome rearrangements, and illustrates the importance of additional cellular mechanisms that act to suppress recombination between divergent DNA sequences. PMID:21978436

  9. HLA typing by direct DNA sequencing.

    PubMed

    Smith, Linda K

    2012-01-01

    Sequencing-based typing is a high resolution method for the identification of HLA polymorphisms. The majority of HLA Class I alleles can be discriminated by their exon 2 and 3 sequence, and for Class II alleles, exon 2 is generally sufficient. There are polymorphic positions in other exons which may require additional sequencing to exclude certain alleles with differences outside exon 2 and 3, depending on the clinical requirement and relevant accreditation guidelines. The process involves selective amplification of target alleles by PCR, agarose gel electrophoresis of the PCR products to assess the quantity and quality, followed by purification of PCR amplicons to remove excess primer and dNTPs. Cycle sequencing reactions using Applied Biosystems™ BigDye® Terminator Ready Reaction v1.1 or v3.1 Kit are performed, then purification of sequence reactions before electrophoresing using Applied Biosystems™ 3730 or 3730XL Genetic Analyser (or similar). Data is processed by specialised software packages, which compare the sample sequence to the sequences of all possible theoretical allele combinations to assign an accurate genotype. Examination of all nucleotides, both at conserved and polymorphic positions, enables the direct identification of new alleles, which may not be possible with techniques such as SSP and SSO typing. PMID:22665229

  10. Amplification of human papillomavirus DNA sequences by using conserved primers.

    PubMed Central

    Gregoire, L; Arella, M; Campione-Piccardo, J; Lancaster, W D

    1989-01-01

    The polymerase chain reaction has potential for use in the detection of small amounts of human papillomavirus (HPV) viral nucleic acids present in clinical specimens. However, new HPV types for which no probes exist would remain undetected by using type-specific primers for the polymerase chain reaction before hybridization. Primers corresponding to highly conserved HPV sequences may be useful for detecting low amounts of known HPV DNA as well as new HPV types. Here we analyze a pair of primers derived from conserved sequences within the E1 open reading frame for HPV sequence amplification by using the polymerase chain reaction. The longest perfect homology among HPV sequences is a 12-mer within the first exon of E1M. A region of conserved amino acids coded by the E1 open reading frame allowed the detection of another highly conserved region about 850 base pairs downstream. Two 21-mers derived from these conserved regions were used to amplify sequences from all HPV DNAs used as templates. The amplified DNA was shown to be specific for HPV sequences within the E1 open reading frame. DNA from HPVs whose sequences were not available were amplified by using these two primers. HPV DNA sequences in clinical specimens could also be amplified with the primers. PMID:2556429

  11. Channel catfish, Ictalurus punctatus, cyclophilin B cDNA sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cyclophilin B is a member of the highly conserved immunophilins and is found ubiquitously within cells. The complete sequence of the channel catfish cyclophilin B cDNA gene consisted of 996 nucleotides. Analysis of the nucleotide sequence reveals one open reading frame and 5’- and 3’-end untranslated...

  12. Ancient DNA sequence revealed by error-correcting codes.

    PubMed

    Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo

    2015-01-01

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228

  13. An integer programming approach to DNA sequence assembly.

    PubMed

    Chang, Youngjung; Sahinidis, Nikolaos V

    2011-08-10

    De novo sequence assembly is a ubiquitous combinatorial problem in all DNA sequencing technologies. In the presence of errors in the experimental data, the assembly problem is computationally challenging, and its solution may not lead to a unique reconstruct. The enumeration of all alternative solutions is important in drawing a reliable conclusion on the target sequence, and is often overlooked in the heuristic approaches that are currently available. In this paper, we develop an integer programming formulation and global optimization solution strategy to solve the sequence assembly problem with errors in the data. We also propose an efficient technique to identify all alternative reconstructs. When applied to examples of sequencing-by-hybridization, our approach dramatically increases the length of DNA sequences that can be handled with global optimality certificate to over 10,000, which is more than 10 times longer than previously reported. For some problem instances, alternative solutions exhibited a wide range of different ability in reproducing the target DNA sequence. Therefore, it is important to utilize the methodology proposed in this paper in order to obtain all alternative solutions to reliably infer the true reconstruct. These alternative solutions can be used to refine the obtained results and guide the design of further experiments to correctly reconstruct the target DNA sequence. PMID:21864794

  14. Ancient DNA sequence revealed by error-correcting codes

    PubMed Central

    Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo

    2015-01-01

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228

  15. Do short, frequent DNA sequence motifs mould the epigenome?

    PubMed

    Quante, Timo; Bird, Adrian

    2016-04-01

    'Epigenome' refers to the panoply of chemical modifications borne by DNA and its associated proteins that locally affect genome function. Epigenomic patterns are thought to be determined by external constraints resulting from development, disease and the environment, but DNA sequence is also a potential influence. We propose that domains of relatively uniform DNA base composition may modulate the epigenome through cell type-specific proteins that recognize short, frequent sequence motifs. Differential recruitment of epigenomic modifiers may adjust gene expression in multigene blocks as an alternative to tuning the activity of each gene separately, thus simplifying gene expression programming. PMID:26837845

  16. Sequence specificity of DNA cleavage by Micrococcus luteus γ-endonuclease

    SciTech Connect

    Hentosh, P.; Henner, W.D.; Reynolds, R.J.

    1985-04-01

    DNA fragments of defined sequence have been used to determine the sites of cleavage by γ-endonuclease activity in extracts prepared from Micrococcus luteus. End-labeled DNA restriction fragments of pBR322 DNA that had been irradiated under nitrogen in the presence of potassium iodide or t-butanol were treated with M. luteus γ-endonuclease and analyzed; the enzyme cleaved irradiated DNA preferentially at the positions of cytosines and thymines. DNA cleavage occurred immediately to the 3' side of pyrimidines in irradiated DNA and resulted in fragments that terminate in a 5'-phosphoryl group. These studies indicate that both altered cytosines and thymines may be important DNA lesions requiring repair after exposure to γ radiation.

  17. Electronic Transport and Thermopower in Aperiodic DNA Sequences

    NASA Astrophysics Data System (ADS)

    Roche, Stephan; Maciá, Enrique

    A detailed study of charge transport properties of synthetic and genomic DNA sequences is reported. Genomic sequences of the Chromosome 22, λ-bacteriophage, and D1s80 genes of Human and Pygmy chimpanzee are considered in this work, and compared with both periodic and quasiperiodic (Fibonacci) sequences of nucleotides. Charge transfer efficiency is compared for all these different sequences, and large variations in charge transfer efficiency, stemming from sequence-dependent effects, are reported. In addition, basic characteristics of tunneling currents, including contact effects, are described. Finally, the thermoelectric power of nucleobases connected in between metallic contacts at different temperatures is presented.

  18. DNA linking number change induced by sequence-specific DNA-binding proteins

    PubMed Central

    Chen, Bo; Xiao, Yazhong; Liu, Chang; Li, Chenzhong; Leng, Fenfei

    2010-01-01

    Sequence-specific DNA-binding proteins play a key role in many fundamental biological processes, such as transcription, DNA replication and recombination. Very often, these DNA-binding proteins introduce structural changes to the target DNA-binding sites including DNA bending, twisting or untwisting and wrapping, which in many cases induce a linking number change (ΔLk) to the DNA-binding site. Due to the lack of a feasible approach, ΔLk induced by sequence-specific DNA-binding proteins has not been fully explored. In this paper we successfully constructed a series of DNA plasmids that carry many tandem copies of a DNA-binding site for one sequence-specific DNA-binding protein, such as λ O, LacI, GalR, CRP and AraC. In this case, the protein-induced ΔLk was greatly amplified and can be measured experimentally. Indeed, not only were we able to simultaneously determine the protein-induced ΔLk and the DNA-binding constant for λ O and GalR, but also we demonstrated that the protein-induced ΔLk is an intrinsic property for these sequence-specific DNA-binding proteins. Our results also showed that protein-mediated DNA looping by AraC and LacI can induce a ΔLk to the plasmid DNA templates. Furthermore, we demonstrated that the protein-induced ΔLk does not correlate with the protein-induced DNA bending by the DNA-binding proteins. PMID:20185570

  19. Folding complex DNA nanostructures from limited sets of reusable sequences

    PubMed Central

    Niekamp, Stefan; Blumer, Katy; Nafisi, Parsa M.; Tsui, Kathy; Garbutt, John; Douglas, Shawn M.

    2016-01-01

    Scalable production of DNA nanostructures remains a substantial obstacle to realizing new applications of DNA nanotechnology. Typical DNA nanostructures comprise hundreds of DNA oligonucleotide strands, where each unique strand requires a separate synthesis step. New design methods that reduce the strand count for a given shape while maintaining overall size and complexity would be highly beneficial for efficiently producing DNA nanostructures. Here, we report a method for folding a custom template strand by binding individual staple sequences to multiple locations on the template. We built several nanostructures for well-controlled testing of various design rules, and demonstrate folding of a 6-kb template by as few as 10 unique strand sequences binding to 10 ± 2 locations on the template strand. PMID:27036861

  20. Elongation method for electronic structure calculations of random DNA sequences.

    PubMed

    Orimoto, Yuuichi; Liu, Kai; Aoki, Yuriko

    2015-10-30

    We applied the ab initio order-N elongation (ELG) method to calculate electronic structures of various deoxyribonucleic acid (DNA) models. We aim to test potential application of the method for building a database of DNA electronic structures. The ELG method mimics polymerization reactions on a computer and meets the requirements for linear scaling computational efficiency and high accuracy, even for huge systems. As a benchmark test, we applied the method for calculations of various types of random sequenced A- and B-type DNA models with and without counterions. In each case, the ELG method maintained high accuracy with small errors in energy on the order of 10⁻⁸ hartree/atom compared with conventional calculations. We demonstrate that the ELG method can provide valuable information such as stabilization energies and local densities of states for each DNA sequence. In addition, we discuss the "restarting" feature of the ELG method for constructing a database that exhaustively covers DNA species. PMID:26337429

  1. Bayesian classification for promoter prediction in human DNA sequences

    NASA Astrophysics Data System (ADS)

    Bercher, J.-F.; Jardin, P.; Duriez, B.

    2006-11-01

    Many computational methods are now available for data retrieval and analysis of genomic sequences, but some functional sites remain difficult to characterize. In this work, we examine the problem of promoter localization in human DNA sequences. Promoters are regulatory regions that govern the expression of genes, and their prediction is reputedly difficult, so the issue remains open. We present the Chaos Game Representation (CGR) of DNA sequences, which has many interesting properties, and the notion of a 'genomic signature', which has proved relevant in phylogeny applications. Based on this notion, we develop a (naïve) Bayesian classifier, evaluate its performance, and show that its adaptive implementation makes it possible to reveal or assess core-promoter positions along a DNA sequence.
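The Chaos Game Representation mentioned in this record can be sketched in a few lines of Python. The corner assignment (A, C, G, T at the corners of the unit square) follows a common convention and is an assumption here, as are the function names. The `signature` grid is one simple way to turn CGR points into k-mer counts; it is not claimed to be the authors' definition of the genomic signature or their classifier.

```python
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Each point is the midpoint between the previous point and the base's corner."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]          # raises KeyError on non-ACGT symbols
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def signature(seq, k=2):
    """Count CGR points per cell of a 2^k x 2^k grid (one cell per k-mer)."""
    n = 2 ** k
    grid = [[0] * n for _ in range(n)]
    for x, y in cgr_points(seq)[k - 1:]:  # first k-1 points lack a full k-mer
        grid[int(y * n)][int(x * n)] += 1
    return grid


print(cgr_points("AG"))          # [(0.25, 0.25), (0.625, 0.625)]
print(signature("ACGTACGT", k=2))
```

Each grid cell collects the points whose last k bases are the same word, so the grid is a k-mer frequency table laid out in two dimensions.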

  2. Cloning and sequencing of chloroperoxidase cDNA.

    PubMed Central

    Fang, G H; Kenigsberg, P; Axley, M J; Nuell, M; Hager, L P

    1986-01-01

    An oligo-d(T)12-18 primed cDNA library has been prepared from Caldariomyces fumago mRNA. A clone containing a full-length insert was sequenced on the supercoiled plasmid, pBR322. The complete primary sequence of chloroperoxidase has been derived. We have also determined about 73% of the peptide sequence by amino acid sequencing. The DNA sequence data match all of the available known peptide sequences. The mature polypeptide contains 300 amino acids having a combined molecular weight of 32,974 daltons. A putative signal peptide of 21 amino acids is proposed from DNA sequence data. The chloroperoxidase gene encodes three potential glycosylation sites recognized as Asn-X-Thr/Ser sequences. Three cysteine residues are found in the protein sequence. A small region around Cys87 bears a minimal homology to the active site of cytochrome P450cam. No other heme protein homologues can be detected. We propose that Cys87 serves as a thiolate ligand to the iron of the heme prosthetic group. A rare arginine codon, AGG, is used three times out of twelve in contrast to the very infrequent use of this codon in E. coli or yeast. PMID:3774552
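Scanning a protein sequence for Asn-X-Thr/Ser motifs, as mentioned in this record, can be sketched with a regular expression in Python. Excluding proline at the X position is a common convention that the abstract does not state, so it is an assumption here. The function name and the toy peptide are hypothetical; the peptide is not the chloroperoxidase sequence.

```python
import re

# Lookahead keeps overlapping motifs; [^P] assumes proline is not allowed at X.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str):
    """Return 1-based positions of the Asn in each Asn-X-Thr/Ser motif."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


print(find_sequons("MNGTANPSNNSS"))   # [2, 9, 10]
```

Positions are reported 1-based to match the residue numbering style used in the abstract (for example Cys87).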

  3. DNA sequence of the yeast transketolase gene.

    PubMed

    Fletcher, T S; Kwee, I L; Nakada, T; Largman, C; Martin, B M

    1992-02-18

    Transketolase (EC 2.2.1.1) is the enzyme that, together with aldolase, forms a reversible link between the glycolytic and pentose phosphate pathways. We have cloned and sequenced the transketolase gene from yeast (Saccharomyces cerevisiae). This is the first transketolase gene of the pentose phosphate shunt to be sequenced from any source. The molecular mass of the proposed translated protein is 73,976 daltons, in good agreement with the observed molecular mass of about 75,000 daltons. The 5'-nontranslated region of the gene is similar to other yeast genes. There is no evidence of 5'-splice junctions or branch points in the sequence. The 3'-nontranslated region contains the polyadenylation signal (AATAAA), 80 base pairs downstream from the termination codon. A high degree of homology is found between yeast transketolase and dihydroxyacetone synthase (formaldehyde transketolase) from the yeast Hansenula polymorpha. The overall sequence identity between these two proteins is 37%, with four regions of much greater similarity. The regions from amino acid residues 98-131, 157-182, 410-433, and 474-489 have sequence identities of 74%, 66%, 83%, and 82%, respectively. One of these regions (157-182) includes a possible thiamin pyrophosphate (TPP) binding domain, and another (410-433) may contain the catalytic domain. PMID:1737042

  4. Palindromic sequence artifacts generated during next generation sequencing library preparation from historic and ancient DNA.

    PubMed

    Star, Bastiaan; Nederbragt, Alexander J; Hansen, Marianne H S; Skage, Morten; Gilfillan, Gregor D; Bradbury, Ian R; Pampoulie, Christophe; Stenseth, Nils Chr; Jakobsen, Kjetill S; Jentoft, Sissel

    2014-01-01

    Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5' and 3'-ends of sequencing reads. The palindromic sequences themselves have specific properties - the bases at the 5'-end align well to the reference genome, whereas extensive misalignments exist among the bases at the terminal 3'-end. The terminal 3' bases are artificial extensions likely caused by the occurrence of hairpin loops in single stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3'-end of DNA strands, with the 5'-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias. PMID:24608104
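The interrupted palindromes described in this record (a 3' end that is the reverse complement of the 5' end of the same read) can be flagged with a simple Python check. This sketch requires an exact match and a hypothetical minimum length; it is not the authors' detection procedure, and real reads would likely need mismatch tolerance. The example read is invented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase ACGT string."""
    return seq.translate(COMPLEMENT)[::-1]


def terminal_palindrome_length(read: str, min_len: int = 4) -> int:
    """Longest k >= min_len where the last k bases are the reverse complement
    of the first k bases; 0 if there is none. k is capped at half the read."""
    for k in range(len(read) // 2, min_len - 1, -1):
        if read[-k:] == revcomp(read[:k]):
            return k
    return 0


# Toy read: 5' stem, a 3-base loop, then the reverse complement of the stem.
stem = "ACCGTTAG"
read = stem + "GGA" + revcomp(stem)
print(terminal_palindrome_length(read))   # 8
```

A read flagged this way could have its 3' extension trimmed before alignment, which matches the abstract's observation that the 5' bases align well while the terminal 3' bases do not.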

  5. Rényi continuous entropy of DNA sequences.

    PubMed

    Vinga, Susana; Almeida, Jonas S

    2004-12-01

    Entropy measures of DNA sequences estimate their randomness or, inversely, their repeatability. L-block Shannon discrete entropy accounts for the empirical distribution of all length-L words and has convergence problems for finite sequences. A new entropy measure that extends Shannon's formalism is proposed. Rényi's quadratic entropy, calculated with the Parzen window density estimation method applied to CGR/USM continuous maps of DNA sequences, constitutes a novel technique to evaluate global sequence randomness without some of the former method's drawbacks. The asymptotic behaviour of this new measure was analytically deduced and the calculation of entropies for several synthetic and experimental biological sequences was performed. The results obtained were compared with the distributions of the null model of randomness obtained by simulation. The biological sequences have shown a different p-value according to the kernel resolution of Parzen's method, which might indicate an unknown level of organization of their patterns. This new technique can be very useful in the study of DNA sequence complexity and provide additional tools for DNA entropy estimation. The main MATLAB applications developed and additional material are available at the webpage . Specialized functions can be obtained from the authors. PMID:15501469
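The combination of a continuous sequence map with a Parzen window estimate can be sketched in Python. This is a minimal sketch, assuming a CGR map with a conventional corner assignment and an isotropic Gaussian kernel of width `sigma`; both are assumptions, as are the function names. It uses the standard closed form for quadratic entropy under Gaussian kernels, H2 = -ln((1/N²) Σᵢ Σⱼ G(xᵢ - xⱼ; 2σ²)), and is not a port of the authors' MATLAB code.

```python
import math

CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Chaos game map: move halfway towards the corner of each successive base."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def renyi_quadratic_entropy(points, sigma=0.1):
    """Quadratic Renyi entropy of a 2-D Parzen estimate with Gaussian kernels.
    Pairwise kernels have variance 2*sigma^2 per axis; self-pairs are included."""
    n = len(points)
    norm = 1.0 / (4.0 * math.pi * sigma ** 2)
    total = 0.0
    for xi, yi in points:
        for xj, yj in points:
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            total += norm * math.exp(-d2 / (4.0 * sigma ** 2))
    return -math.log(total / (n * n))


h_repeat = renyi_quadratic_entropy(cgr_points("A" * 48))
h_mixed = renyi_quadratic_entropy(cgr_points("ACGT" * 12))
print(h_repeat < h_mixed)   # the homopolymer clusters in one corner, so lower entropy
```

The double loop is quadratic in sequence length, which is acceptable for short toy inputs; the dependence of the result on `sigma` corresponds to the kernel resolution effect noted in the abstract.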

  6. DNA sequence organization in the genomes of five marine invertebrates.

    PubMed

    Goldberg, R B; Crain, W R; Ruderman, J V; Moore, G P; Barnett, T R; Higgins, R C; Gelfand, R A; Galau, G A; Britten, R J; Davidson, E H

    1975-07-21

    The arrangement of repetitive and non-repetitive sequence was studied in the genomic DNA of the oyster (Crassostrea virginica), the surf clam (Spisula solidissima), the horseshoe crab (Limulus polyphemus), a nemertean worm (Cerebratulus lacteus) and a jellyfish (Aurelia aurita). Except for the jellyfish these animals belong to the protostomial branch of animal evolution, for which little information regarding DNA sequence organization has previously been available. The reassociation kinetics of short (250-300 nucleotide) and long (2,000-3,000 nucleotide) DNA fragments was studied by the hydroxyapatite method. It was shown that in each case a major fraction of the DNA consists of single copy sequences less than about 3,000 nucleotides in length, interspersed with short repetitive sequences. The lengths of the repetitive sequences were estimated by optical hyperchromicity and S1 nuclease measurements made on renaturation products. All the genomes studied include a prominent fraction of interspersed repetitive sequences about 300 nucleotides in length, as well as longer repetitive sequence regions. PMID:238802

  7. Sequence of figwort mosaic virus DNA (caulimovirus group).

    PubMed Central

    Richins, R D; Scholthof, H B; Shepherd, R J

    1987-01-01

    The nucleotide sequence of an infectious clone of figwort mosaic virus (FMV) was determined using the dideoxynucleotide chain termination method. The double-stranded DNA genome (7743 base pairs) contained eight open reading frames (ORFs), seven of which corresponded approximately in size and location to the ORFs found in the genome of cauliflower mosaic virus (CaMV) and carnation etched ring virus (CERV). ORFs I and V of FMV demonstrated the highest degrees of nucleotide and amino acid sequence homology with the equivalent coding regions of CaMV and CERV. Regions II, III and IV showed somewhat less homology with the analogous regions of CaMV and CERV, and ORF VI showed homology with the corresponding gene of CaMV and CERV in only a short segment near the middle of the putative gene product. A 16 nucleotide sequence, complementary to the 3' terminus of methionine initiator tRNA (tRNAimet) and presumed to be the primer binding site for initiation of reverse transcription to produce minus strand DNA, was found in the FMV genome near the discontinuity in the minus strand. Sequences near the three interruptions in the plus strand of FMV DNA bear strong resemblance to similarly located sequences of 3 other caulimoviruses and are inferred to be initiation sites for second strand DNA synthesis. Additional conserved sequences in the small and large intergenic regions are pointed out including a highly conserved 35 bp sequence that occurs in the latter region. PMID:3671088

  8. PCR Primers for Metazoan Mitochondrial 12S Ribosomal DNA Sequences

    PubMed Central

    Machida, Ryuji J.; Kweskin, Matthew; Knowlton, Nancy

    2012-01-01

    Background: Assessment of the biodiversity of communities of small organisms is most readily done using PCR-based analysis of environmental samples consisting of mixtures of individuals. Known as metagenetics, this approach has transformed understanding of microbial communities and is beginning to be applied to metazoans as well. Unlike microbial studies, where analysis of the 16S ribosomal DNA sequence is standard, the best gene for metazoan metagenetics is less clear. In this study we designed a set of PCR primers for the mitochondrial 12S ribosomal DNA sequence based on 64 complete mitochondrial genomes and then tested their efficacy. Methodology/Principal Findings: A total of 64 complete mitochondrial genome sequences representing all metazoan classes available in GenBank were downloaded using the NCBI Taxonomy Browser. Alignment of sequences was performed for the excised mitochondrial 12S ribosomal DNA sequences, and conserved regions were identified for all 64 mitochondrial genomes. These regions were used to design a primer pair that flanks a more variable region in the gene. Then all of the complete metazoan mitochondrial genomes available in NCBI's Organelle Genome Resources database were used to determine the percentage of taxa that would likely be amplified using these primers. Results suggest that these primers will amplify target sequences for many metazoans. Conclusions/Significance: Newly designed 12S ribosomal DNA primers have considerable potential for metazoan metagenetic analysis because of their ability to amplify sequences from many metazoans. PMID:22536450

  9. Sequence of figwort mosaic virus DNA (caulimovirus group).

    PubMed

    Richins, R D; Scholthof, H B; Shepherd, R J

    1987-10-26

    The nucleotide sequence of an infectious clone of figwort mosaic virus (FMV) was determined using the dideoxynucleotide chain termination method. The double-stranded DNA genome (7743 base pairs) contained eight open reading frames (ORFs), seven of which corresponded approximately in size and location to the ORFs found in the genome of cauliflower mosaic virus (CaMV) and carnation etched ring virus (CERV). ORFs I and V of FMV demonstrated the highest degrees of nucleotide and amino acid sequence homology with the equivalent coding regions of CaMV and CERV. Regions II, III and IV showed somewhat less homology with the analogous regions of CaMV and CERV, and ORF VI showed homology with the corresponding gene of CaMV and CERV in only a short segment near the middle of the putative gene product. A 16 nucleotide sequence, complementary to the 3' terminus of methionine initiator tRNA (tRNAimet) and presumed to be the primer binding site for initiation of reverse transcription to produce minus strand DNA, was found in the FMV genome near the discontinuity in the minus strand. Sequences near the three interruptions in the plus strand of FMV DNA bear strong resemblance to similarly located sequences of 3 other caulimoviruses and are inferred to be initiation sites for second strand DNA synthesis. Additional conserved sequences in the small and large intergenic regions are pointed out including a highly conserved 35 bp sequence that occurs in the latter region. PMID:3671088

  10. Measurement of the sequence specificity of covalent DNA modification by antineoplastic agents using Taq DNA polymerase.

    PubMed Central

    Ponti, M; Forrow, S M; Souhami, R L; D'Incalci, M; Hartley, J A

    1991-01-01

    A polymerase stop assay has been developed to determine the DNA nucleotide sequence specificity of covalent modification by antineoplastic agents using the thermostable DNA polymerase from Thermus aquaticus and synthetic labelled primers. The products of linear amplification are run on sequencing gels to reveal the sites of covalent drug binding. The method has been studied in detail for a number of agents including nitrogen mustards, platinum analogues and mitomycin C, and the sequence specificities obtained accord with those obtained by other procedures. The assay is advantageous in that it is not limited to a single type of DNA lesion (as in the piperidine cleavage assay for guanine-N7 alkylation), does not require a strand breakage step, and is more sensitive than other primer extension procedures which have only one cycle of polymerization. In particular the method has considerable potential for examining the sequence selectivity of damage and repair in single copy gene sequences in genomic DNA from cells. PMID:2057351

  11. Spatially localized generation of nucleotide sequence-specific DNA damage.

    PubMed

    Oh, D H; King, B A; Boxer, S G; Hanawalt, P C

    2001-09-25

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen-DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320-400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA-psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen-TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  12. Automated detection of meteors in observed image sequence

    NASA Astrophysics Data System (ADS)

    Šimberová, Stanislava; Suk, Tomáš

    2015-12-01

    We propose a new detection technique based on statistical characteristics of images in the video sequence. Displayed over time, these characteristics make it possible to catch any bright track during the whole sequence. We applied our method to the image datacubes that are created from camera pictures of the night sky. A meteor flying through the Earth's atmosphere leaves a light trail lasting a few seconds on the sky background. We developed a special technique to recognize this event automatically in the complete observed video sequence. For further analysis leading to precise recognition of the object, we suggest applying Fourier and Hough transformations.

  13. Selective enrichment of damaged DNA molecules for ancient genome sequencing

    PubMed Central

    2014-01-01

    Contamination by present-day human and microbial DNA is one of the major hindrances for large-scale genomic studies using ancient biological material. We describe a new molecular method, U selection, which exploits one of the most distinctive features of ancient DNA—the presence of deoxyuracils—for selective enrichment of endogenous DNA against a complex background of contamination during DNA library preparation. By applying the method to Neanderthal DNA extracts that are heavily contaminated with present-day human DNA, we show that the fraction of useful sequence information increases ∼10-fold and that the resulting sequences are more efficiently depleted of human contamination than when using purely computational approaches. Furthermore, we show that U selection can lead to a four- to fivefold increase in the proportion of endogenous DNA sequences relative to those of microbial contaminants in some samples. U selection may thus help to lower the costs for ancient genome sequencing of nonhuman samples also. PMID:25081630

  14. Detection, sequence patterns and function of unusual DNA structures.

    PubMed Central

    Anderson, J N

    1986-01-01

    Unusual DNA structures were detected by an electrophoretic procedure in which DNA fragments were separated according to size on agarose gels and then by shape on polyacrylamide gels. Fragments from yeast centromeres migrated faster in polyacrylamide than predicted from their base composition and size and this property was attributed to a nonrandom distribution of oligomeric A tracts that exhibited minima at 10-11 base intervals. Fragments from seven loci in 107 kb of DNA migrated anomalously slowly and these fragments contained blocks of A2-6 in a 10-11 base periodicity which is indicative of bent DNA. The most pronounced bent sequences were found within yeast ARS1 and centered at 245 and 240 bp from the left and right ends of the adenovirus genome. Each sequence is approximately 150 bp away from a replication origin and the adenovirus sequences are within 50 bp of enhancers. Nuclear matrix attachment sites, which are also adjacent to enhancers, contain sequences characteristic of bent DNA. These results suggest that bent structures reside at the base of DNA loops in chromosomes. PMID:3786134

  15. Improved Algorithm for Analysis of DNA Sequences Using Multiresolution Transformation

    PubMed Central

    Inbamalar, T. M.; Sivakumar, R.

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA), the ribonucleic acid (RNA), and the proteins. Fast and precise identification of the protein coding regions in a DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solutions with greater background noise. Hence, improvements in accuracy, computational complexity, and reduction in background noise are essential in identification of the protein coding regions in the DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using the electron ion interaction potential (EIIP) representation. Then the discrete wavelet transform is taken. The absolute value of the energy is found, followed by a proper threshold. The test is conducted using the databases available on the National Centre for Biotechnology Information (NCBI) site. The comparative analysis is done and it ensures the efficiency of the proposed system. PMID:26000337
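
    The pipeline outlined in this abstract (EIIP mapping, discrete wavelet transform, energy, threshold) can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract does not name the wavelet or the threshold, so a single-level Haar transform and a caller-supplied `threshold` are used here as stand-ins. The EIIP values are the standard published ones for the four nucleotides.

```python
import math

# Standard EIIP values for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_signal(seq):
    """Convert a DNA string to its numeric EIIP sequence (unknown symbols skipped)."""
    return [EIIP[base] for base in seq.upper() if base in EIIP]


def haar_dwt(signal):
    """Single-level Haar DWT; returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        signal = signal[:-1]  # drop the last sample of an odd-length input
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def flag_high_energy(coeffs, threshold):
    """Mark coefficients whose energy (squared magnitude) reaches the threshold."""
    return [abs(c * c) >= threshold for c in coeffs]
```

    In practice the flagged positions would be mapped back to sequence coordinates (each coefficient covers two bases at this level) and compared against annotated coding regions.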

  16. Improved algorithm for analysis of DNA sequences using multiresolution transformation.

    PubMed

    Inbamalar, T M; Sivakumar, R

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA), the ribonucleic acid (RNA), and the proteins. Fast and precise identification of the protein coding regions in a DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solutions with greater background noise. Hence, improvements in accuracy, computational complexity, and reduction in background noise are essential in identification of the protein coding regions in the DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using the electron ion interaction potential (EIIP) representation. Then the discrete wavelet transform is taken. The absolute value of the energy is found, followed by a proper threshold. The test is conducted using the databases available on the National Centre for Biotechnology Information (NCBI) site. The comparative analysis is done and it ensures the efficiency of the proposed system. PMID:26000337

  17. Mitochondrial DNA Sequence Analysis - Validation and Use for Forensic Casework.

    PubMed

    Holland, M M; Parsons, T J

    1999-06-01

    With the discovery of the polymerase chain reaction (PCR) in the mid-1980's, the last in a series of critical molecular biology techniques (to include the isolation of DNA from human and non-human biological material, and primary sequence analysis of DNA) had been developed to rapidly analyze minute quantities of mitochondrial DNA (mtDNA). This was especially true for mtDNA isolated from challenged sources, such as ancient or aged skeletal material and hair shafts. One of the beneficiaries of this work has been the forensic community. Over the last decade, a significant amount of research has been conducted to develop PCR-based sequencing assays for the mtDNA control region (CR), which have subsequently been used to further characterize the CR. As a result, the reliability of these assays has been investigated, the limitations of the procedures have been determined, and critical aspects of the analysis process have been identified, so that careful control and monitoring will provide the basis for reliable testing. With the application of these assays to forensic identification casework, mtDNA sequence analysis has been properly validated, and is a reliable procedure for the examination of biological evidence encountered in forensic criminalistic cases. PMID:26255820

  18. Sequence specificity of psoralen photobinding to DNA: a quantitative approach.

    PubMed

    Gia, O; Magno, S M; Garbesi, A; Colonna, F P; Palumbo, M

    1992-12-01

    The effects of different DNA sequences on the photoreaction of various furocoumarin derivatives were investigated from a quantitative point of view using a number of self-complementary oligonucleotides. These contained 5'-TA and 5'-AT residues, having various flanking sequences. The furocoumarins included classical bifunctional derivatives, such as 8-methoxy- and 5-methoxypsoralen, as well as monofunctional compounds, such as angelicin and benzopsoralen. Taking into account the thermodynamic constant for noncovalent binding of each psoralen to each DNA sequence, the rate constants for the photobinding process to each fragment were evaluated. The extent of photoreaction is greatly affected by the DNA sequence examined. While sequences of the type 5'-(GTAC)n are quite reactive towards all furocoumarins, 5'-TATA exhibited a reduced rate of photobinding using monofunctional psoralens. In addition, terminal 5'-TA groups were the least reactive with 5- and 8-methoxypsoralen, but not with angelicin or benzopsoralen. Also, 5'-AT-containing fragments exhibited remarkably variable responses toward monofunctional or bifunctional psoralen derivatives. As a general trend the photoreactivity rate of the former is less sequence-sensitive, the ratio between maximum and minimum being less than 2 for the examined fragments. The same ratio is about 3.4 for 8-methoxypsoralen and 6.2 for 5-methoxypsoralen. This approach, in combination with footprinting studies, appears to be quite useful for a quantitative investigation of the process of covalent binding of psoralens to specific sites in DNA. PMID:1445915

  19. Mapping DNA polymerase errors by single-molecule sequencing.

    PubMed

    Lee, David F; Lu, Jenny; Chang, Seungwoo; Loparo, Joseph J; Xie, Xiaoliang S

    2016-07-27

    Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replication product is tagged with a unique nucleotide sequence before amplification. This allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases. PMID:27185891
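
    The barcoding idea in this abstract, comparing several reads of the same tagged product so that sequencing errors can be voted out, can be sketched in Python. This is a simplified illustration and not the authors' pipeline: reads are assumed to be pre-aligned and of equal length, and `min_reads` is a hypothetical cutoff.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=3):
    """Group (barcode, sequence) pairs and take a per-position majority vote."""
    groups = defaultdict(list)
    for barcode, seq in reads:
        groups[barcode].append(seq)
    consensus = {}
    for barcode, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few reads to separate sequencing errors from real ones
        length = min(len(s) for s in seqs)
        consensus[barcode] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0] for i in range(length)
        )
    return consensus


def differences(template, consensus_seq):
    """List (position, template base, observed base) for each mismatch."""
    return [
        (i, t, c)
        for i, (t, c) in enumerate(zip(template, consensus_seq))
        if t != c
    ]
```

    A mismatch that survives the vote within one barcode group is a candidate polymerase error, whereas a mismatch seen in only one read of the group is more likely a sequencing error.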

  20. Label-free DNA sequencing using Millikan detection.

    PubMed

    Dettloff, Roger; Leiske, Danielle; Chow, Andrea; Farinas, Javier

    2015-10-15

    A label-free method for DNA sequencing based on the principle of the Millikan oil drop experiment was developed. This sequencing-by-synthesis approach sensed increases in bead charge as nucleotides were added by a polymerase to DNA templates attached to beads. The balance between an electrical force, which was dependent on the number of nucleotide charges on a bead, and opposing hydrodynamic drag and restoring tether forces resulted in a bead velocity that was a function of the number of nucleotides attached to the bead. The velocity of beads tethered via a polymer to a microfluidic channel and subjected to an oscillating electric field was measured using dark-field microscopy and used to determine how many nucleotides were incorporated during each sequencing-by-synthesis cycle. Increases in bead velocity of approximately 1% were reliably detected during DNA polymerization, allowing for sequencing of short DNA templates. The method could lead to a low-cost, high-throughput sequencing platform that could enable routine sequencing in medical applications. PMID:26151683

  1. BARCRAWL and BARTAB: software tools for the design and implementation of barcoded primers for highly multiplexed DNA sequencing

    PubMed Central

    Frank, Daniel N

    2009-01-01

    Background: Advances in automated DNA sequencing technology have greatly increased the scale of genomic and metagenomic studies. An increasingly popular means of increasing project throughput is by multiplexing samples during the sequencing phase. This can be achieved by covalently linking short, unique "barcode" DNA segments to genomic DNA samples, for instance through incorporation of barcode sequences in PCR primers. Although several strategies have been described to ensure that barcode sequences are unique and robust to sequencing errors, these have not been integrated into the overall primer design process, thus potentially introducing bias into PCR amplification and/or sequencing steps. Results: Barcrawl is a software program that facilitates the design of barcoded primers for multiplexed high-throughput sequencing. The program bartab can be used to deconvolute DNA sequence datasets produced by the use of multiple barcoded primers. This paper describes the functions implemented by barcrawl and bartab and presents a proof-of-concept case study of both programs in which barcoded rRNA primers were designed and validated by high-throughput sequencing. Conclusion: Barcrawl and bartab can benefit researchers who are engaged in metagenomic projects that employ multiplexed specimen processing. The source code is released under the GNU general public license and can be accessed at . PMID:19874596
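
    The deconvolution step mentioned in this abstract can be illustrated with a short Python sketch. It does not reproduce bartab's actual algorithm or options: the function names, the `max_mismatches` parameter, and the "unassigned" bin are hypothetical, and barcodes are assumed to sit at the 5' end of each read.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, barcodes, max_mismatches=1):
    """Sort reads into per-sample bins by their leading barcode.

    `barcodes` maps sample name -> barcode sequence. A read is assigned only
    if exactly one barcode matches within `max_mismatches`; the barcode is
    trimmed from assigned reads. Sample names must not be "unassigned".
    """
    bins = {name: [] for name in barcodes}
    bins["unassigned"] = []
    for read in reads:
        hits = [
            name
            for name, bc in barcodes.items()
            if len(read) >= len(bc) and hamming(read[: len(bc)], bc) <= max_mismatches
        ]
        if len(hits) == 1:
            bins[hits[0]].append(read[len(barcodes[hits[0]]):])
        else:
            bins["unassigned"].append(read)  # no match, or ambiguous match
    return bins
```

    Tolerating one mismatch is only safe when every pair of barcodes differs at three or more positions, which is the kind of constraint a barcode design tool would enforce.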

  2. Theoretical modelling of epigenetically modified DNA sequences.

    PubMed

    Carvalho, Alexandra Teresa Pires; Gouveia, Maria Leonor; Raju Kanna, Charan; Wärmländer, Sebastian K T S; Platts, Jamie; Kamerlin, Shina Caroline Lynn

    2015-01-01

    We report herein a set of calculations designed to examine the effects of epigenetic modifications on the structure of DNA. The incorporation of methyl, hydroxymethyl, formyl and carboxy substituents at the 5-position of cytosine is shown to hardly affect the geometry of CG base pairs, but to result in rather larger changes to hydrogen-bond and stacking binding energies, as predicted by dispersion-corrected density functional theory (DFT) methods. The same modifications within double-stranded GCG and ACA trimers exhibit rather larger structural effects, when including the sugar-phosphate backbone as well as sodium counterions and implicit aqueous solvation. In particular, changes are observed in the buckle and propeller angles within base pairs and the slide and roll values of base pair steps, but these leave the overall helical shape of DNA essentially intact. The structures so obtained are useful as a benchmark of faster methods, including molecular mechanics (MM) and hybrid quantum mechanics/molecular mechanics (QM/MM) methods. We show that previously developed MM parameters satisfactorily reproduce the trimer structures, as do QM/MM calculations which treat bases with dispersion-corrected DFT and the sugar-phosphate backbone with AMBER. The latter are improved by inclusion of all six bases in the QM region, since a truncated model including only the central CG base pair in the QM region is considerably further from the DFT structure. This QM/MM method is then applied to a set of double-stranded DNA heptamers derived from a recent X-ray crystallographic study, whose size puts a DFT study beyond our current computational resources. These data show that still larger structural changes are observed than in base pairs or trimers, leading us to conclude that it is important to model epigenetic modifications within realistic molecular contexts. PMID:26448859

  3. Correlations in DNA sequences across the three domains of life

    NASA Astrophysics Data System (ADS)

    Guharay, Sabyasachi; Hunt, Brian R.; Yorke, James A.; White, Owen R.

    2000-11-01

    We report statistical studies of correlation properties of ∼7500 gene sequences, covering coding (exon) and non-coding (intron) sequences for DNA and primary amino acid sequences for proteins, across all three domains of life, namely Eukaryotes (cells with nuclei), Prokaryotes (bacteria) and Archaea (archaebacteria). Mutual information function, power spectrum and Hölder exponent analyses show exons with somewhat greater correlation content than the introns studied. These results are further confirmed with hypothesis testing. While ∼30% of the Eukaryote coding sequences show distinct correlations above noise threshold, this is true for only ∼10% of the Prokaryote and Archaea coding sequences. For protein sequences, we observe correlation lengths similar to that of “random” sequences.
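
    Of the analyses named in this abstract, the mutual information function is the simplest to illustrate. The Python sketch below estimates I(k), the mutual information in bits between symbols separated by k positions, from empirical pair frequencies. It is a plain plug-in estimator chosen for illustration and makes no claim to match the authors' implementation or their noise-threshold correction.

```python
import math
from collections import Counter


def mutual_information(seq, k):
    """Plug-in estimate of mutual information (bits) between seq[i] and seq[i+k]."""
    pairs = [(seq[i], seq[i + k]) for i in range(len(seq) - k)]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)   # marginal of the first symbol
    right = Counter(b for _, b in pairs)  # marginal of the second symbol
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((left[a] / n) * (right[b] / n)))
    return mi
```

    For a finite sequence this estimator is biased upward, so values are normally compared against those from shuffled copies of the same sequence before being read as real correlations.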

  4. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    PubMed Central

    2011-01-01

    Background: Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results: The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions: PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/. PMID:21385349

  5. Spatial Control of DNA Reaction Networks by DNA Sequence

    PubMed Central

    Allen, Peter B.; Chen, Xi; Ellington, Andrew D.

    2013-01-01

    We have developed a set of DNA circuits that execute during gel electrophoresis to yield immobile, fluorescent features in the gel. The parallel execution of orthogonal circuits led to the simultaneous production of different fluorescent lines at different positions in the gel. The positions of the lines could be rationally manipulated by changing the mobilities of the reactants. The ability to program at the nanoscale so as to produce patterns at the macroscale is a step towards programmable, synthetic chemical systems for generating defined spatiotemporal patterns. PMID:23143151

  6. Spatially localized generation of nucleotide sequence-specific DNA damage

    PubMed Central

    Oh, Dennis H.; King, Brett A.; Boxer, Steven G.; Hanawalt, Philip C.

    2001-01-01

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen–DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320–400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA–psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen–TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  7. Dialects of the DNA uptake sequence in Neisseriaceae.

    PubMed

    Frye, Stephan A; Nilsen, Mariann; Tønjum, Tone; Ambur, Ole Herman

    2013-04-01

    In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS), which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS-dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5'-CTG-3' is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS-dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic transformation in

  8. Automated genomic DNA purification options in agricultural applications using MagneSil paramagnetic particles

    NASA Astrophysics Data System (ADS)

    Bitner, Rex M.; Koller, Susan C.

    2002-06-01

    The automated high throughput purification of genomic DNA from plant materials can be performed using MagneSil paramagnetic particles on the Beckman-Coulter FX, BioMek 2000, and the Tecan Genesis robot. Similar automated methods are available for DNA purifications from animal blood. These methods eliminate organic extractions, lengthy incubations and cumbersome filter plates. The DNA is suitable for applications such as PCR and RAPD analysis. Methods are described for processing traditionally difficult samples such as those containing large amounts of polyphenolics or oils, while still maintaining a high level of DNA purity. The robotic protocols have been optimized for agricultural applications such as marker assisted breeding, seed-quality testing, and SNP discovery and scoring. In addition to high yield purification of DNA from plant samples or animal blood, the use of Promega's DNA-IQ purification system is also described. This method allows for the purification of a narrow range of DNA regardless of the amount of additional DNA that is present in the initial sample. This simultaneous Isolation and Quantification of DNA allows the DNA to be used directly in applications such as PCR, SNP analysis, and RAPD, without the need for separate quantitation of the DNA.

  9. A novel chaotic image encryption scheme using DNA sequence operations

    NASA Astrophysics Data System (ADS)

    Wang, Xing-Yuan; Zhang, Ying-Qian; Bao, Xue-Mei

    2015-10-01

    In this paper, we propose a novel image encryption scheme based on DNA (Deoxyribonucleic acid) sequence operations and chaotic system. Firstly, we perform bitwise exclusive OR operation on the pixels of the plain image using the pseudorandom sequences produced by the spatiotemporal chaos system, i.e., CML (coupled map lattice). Secondly, a DNA matrix is obtained by encoding the confused image using a kind of DNA encoding rule. Then we generate the new initial conditions of the CML according to this DNA matrix and the previous initial conditions, which can make the encryption result closely depend on every pixel of the plain image. Thirdly, the rows and columns of the DNA matrix are permuted. Then, the permuted DNA matrix is confused once again. At last, after decoding the confused DNA matrix using a kind of DNA decoding rule, we obtain the ciphered image. Experimental results and theoretical analysis show that the scheme is able to resist various attacks, so it has extraordinarily high security.
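
    The first two stages described above (XOR with a chaotic keystream, then DNA encoding) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: a single logistic map stands in for the coupled map lattice, the 2-bit rule A=00, C=01, G=10, T=11 is just one possible DNA encoding rule, and the permutation and second confusion stages are omitted. All function names and the parameters x0 and r are hypothetical.

```python
# Assumed 2-bit encoding rule (one of several possible rules).
RULE = {0: "A", 1: "C", 2: "G", 3: "T"}
INVERSE_RULE = {base: value for value, base in RULE.items()}


def keystream(n, x0=0.3141, r=3.99):
    """Return n pseudorandom bytes from a logistic map (stand-in for the CML)."""
    x, out = x0, []
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) % 256)
    return out


def dna_encode(data):
    """Encode each byte as four bases, most significant bit pair first."""
    return "".join(RULE[(b >> shift) & 3] for b in data for shift in (6, 4, 2, 0))


def dna_decode(bases):
    """Invert dna_encode: every four bases become one byte."""
    out = []
    for i in range(0, len(bases), 4):
        value = 0
        for ch in bases[i:i + 4]:
            value = (value << 2) | INVERSE_RULE[ch]
        out.append(value)
    return bytes(out)


def encrypt(pixels, x0=0.3141):
    """XOR the pixel bytes with the keystream, then DNA-encode the result."""
    ks = keystream(len(pixels), x0)
    return dna_encode(bytes(p ^ k for p, k in zip(pixels, ks)))


def decrypt(bases, x0=0.3141):
    """DNA-decode, then undo the XOR with the same keystream."""
    mixed = dna_decode(bases)
    ks = keystream(len(mixed), x0)
    return bytes(m ^ k for m, k in zip(mixed, ks))
```

    In this sketch the initial condition x0 plays the role of the secret key; the scheme in the abstract additionally derives new initial conditions from the DNA matrix, which is not reproduced here.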

  10. Rapid DNA sequencing by horizontal ultrathin gel electrophoresis.

    PubMed Central

    Brumley, R L; Smith, L M

    1991-01-01

    A horizontal polyacrylamide gel electrophoresis apparatus has been developed that decreases the time required to separate the DNA fragments produced in enzymatic sequencing reactions. The configuration of this apparatus and the use of circulating coolant directly under the glass plates result in heat exchange that is approximately nine times more efficient than passive thermal transfer methods commonly used. Bubble-free gels as thin as 25 microns can be routinely cast on this device. The application to these ultrathin gels of electric fields up to 250 volts/cm permits the rapid separation of multiple DNA sequencing reactions in parallel. When used in conjunction with 32P-based autoradiography, the DNA bands appear substantially sharper than those obtained in conventional electrophoresis. This increased sharpness permits shorter autoradiographic exposure times and longer sequence reads. PMID:1870968

  11. Compilation of DNA sequences of Escherichia coli (update 1991)

    PubMed Central

    Kröger, Manfred; Wahl, Ralf; Rice, Peter

    1991-01-01

    We have compiled the DNA sequence data for E. coli available from the GENBANK and EMBL data libraries and over a period of several years independently from the literature. This is the third listing replacing and increasing the former listing roughly by one fifth. However, in order to save space this printed version contains DNA sequence information only. The complete compilation is now available in machine readable form from the EMBL data library (ECD release 6). After deletion of all detected overlaps, a total of 1 492 282 individual bp is found to have been determined by the beginning of 1991. This corresponds to a total of 31.62% of the entire E. coli chromosome consisting of about 4,720 kbp. This number may actually be higher by some extra 2.5% derived from lysogenic bacteriophage lambda and various DNA sequences already received for statistical purposes only. PMID:2041799

  12. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1987-10-07

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  13. Sequence-specific binding of luzopeptin to DNA.

    PubMed Central

    Fox, K R; Davies, H; Adams, G R; Portugal, J; Waring, M J

    1988-01-01

    We have examined the binding of luzopeptin, an antitumor antibiotic, to five DNA fragments of varying base composition. The drug forms a tight, possibly covalent, complex with the DNA causing a reduction in mobility on nondenaturing polyacrylamide gels and some smearing of the bands consistent with intramolecular cross-linking of DNA duplexes. DNAase I and micrococcal nuclease footprinting experiments suggest that the drug binds best to regions containing alternating A and T residues, although no consensus di- or trinucleotide sequence emerges. Binding to other sites is not excluded and at moderate ligand concentrations the DNA is almost totally protected from enzyme attack. Ligand-induced enhancement of DNAase I cleavage is observed at both AT and GC-rich regions. The sequence selectivity and characteristics of luzopeptin binding are quite different from those of echinomycin, a bifunctional intercalator of related structure. PMID:3362673

  14. Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy

    PubMed Central

    Schmid, Andreas K.; Davis, Ronald W.

    2016-01-01

    DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitates the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work using scattering electrons with much lower energy has shown to suppress beam damage on DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. Both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging. PMID:27149617

  15. Multiple Base Substitution Corrections in DNA Sequence Evolution

    NASA Astrophysics Data System (ADS)

    Kowalczuk, M.; Mackiewicz, P.; Szczepanik, D.; Nowicka, A.; Dudkiewicz, M.; Dudek, M. R.; Cebrat, S.

    We discuss the inability of Jukes and Cantor's one-parameter model and Kimura's two-parameter model to describe the evolution of asymmetric DNA molecules. The standard distance measure between two DNA sequences, which is the number of substitutions per site, should include the effect of multiple base substitutions separately for each type of base. Otherwise, the respective tables of substitutions cannot reconstruct the asymmetric DNA molecule with respect to the composition. Based on Kimura's neutral theory, we have derived a linear law for the correlation of the mean survival time of nucleotides under constant mutation pressure and their fraction in the genome. According to the law, the corrections to Kimura's theory have been discussed to describe the evolution of genomes with asymmetric nucleotide composition. We consider the particular case of the strongly asymmetric Borrelia burgdorferi genome and we discuss in detail the corrections, which should be introduced into the distance measure between two DNA sequences to include multiple base substitutions.
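
    For orientation, the two standard corrections that the abstract takes as its starting point can be sketched as below. These are the usual textbook forms of the Jukes-Cantor and Kimura two-parameter distances, not the composition-specific corrections derived by the authors. The function names are hypothetical, and the sketch assumes two aligned sequences of equal length containing only A, C, G and T.

```python
import math

PURINES = {"A", "G"}


def substitution_proportions(s1, s2):
    """Return (P, Q): proportions of transition and transversion differences."""
    if len(s1) != len(s2) or not s1:
        raise ValueError("sequences must be aligned and non-empty")
    transitions = transversions = 0
    for a, b in zip(s1.upper(), s2.upper()):
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    n = len(s1)
    return transitions / n, transversions / n


def jukes_cantor_distance(s1, s2):
    """One-parameter correction: d = -(3/4) ln(1 - 4p/3)."""
    P, Q = substitution_proportions(s1, s2)
    p = P + Q
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p_distance(s1, s2):
    """Two-parameter correction: d = -(1/2) ln((1 - 2P - Q) sqrt(1 - 2Q))."""
    P, Q = substitution_proportions(s1, s2)
    return -0.5 * math.log((1.0 - 2.0 * P - Q) * math.sqrt(1.0 - 2.0 * Q))
```

    Both formulas treat all four bases symmetrically, which is the limitation the abstract addresses; for highly diverged sequences the logarithm's argument can become non-positive and math.log raises ValueError.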

  16. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, James H.; Keller, Richard A.; Martin, John C.; Moyzis, Robert K.; Ratliff, Robert L.; Shera, E. Brooks; Stewart, Carleton C.

    1990-01-01

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed.

  17. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1990-10-09

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  18. Automated quantification of lumbar vertebral kinematics from dynamic fluoroscopic sequences

    NASA Astrophysics Data System (ADS)

    Camp, Jon; Zhao, Kristin; Morel, Etienne; White, Dan; Magnuson, Dixon; Gay, Ralph; An, Kai-Nan; Robb, Richard

    2009-02-01

    We hypothesize that the vertebra-to-vertebra patterns of spinal flexion and extension motion of persons with lower back pain will differ from those of persons who are pain-free. Thus, it is our goal to measure the motion of individual lumbar vertebrae noninvasively from dynamic fluoroscopic sequences. Two-dimensional normalized mutual information-based image registration was used to track frame-to-frame motion. Software was developed that required the operator to identify each vertebra on the first frame of the sequence using a four-point "caliper" placed at the posterior and anterior edges of the inferior and superior end plates of the target vertebrae. The program then resolved the individual motions of each vertebra independently throughout the entire sequence. To validate the technique, 6 cadaveric lumbar spine specimens were potted in polymethylmethacrylate and instrumented with optoelectric sensors. The specimens were then placed in a custom dynamic spine simulator and moved through flexion-extension cycles while kinematic data and fluoroscopic sequences were simultaneously acquired. We found strong correlation between the absolute flexion-extension range of motion of each vertebra as recorded by the optoelectric system and as determined from the fluoroscopic sequence via registration. We conclude that this method is a viable way of noninvasively assessing two-dimensional vertebral motion.
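
    The similarity measure named above can be illustrated with a short sketch. The abstract does not say which normalization was used; the code below assumes the common form NMI = (H(A) + H(B)) / H(A, B) and treats each image region as a flat list of integer intensities. The function names are hypothetical, and a registration routine would evaluate this score over candidate rotations and translations and keep the best one.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (bits) of a histogram given as a Counter."""
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalized_mutual_information(a, b):
    """NMI of two equally sized intensity lists; ranges from 1 to 2."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    h_a = entropy(Counter(a), n)
    h_b = entropy(Counter(b), n)
    h_ab = entropy(Counter(zip(a, b)), n)  # joint histogram of paired pixels
    if h_ab == 0.0:
        raise ValueError("NMI is undefined when both inputs are constant")
    return (h_a + h_b) / h_ab
```

    With this form, perfectly aligned identical regions score 2 and statistically independent regions score 1.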

  19. Ancient mtDNA sequences from the First Australians revisited

    PubMed Central

    Subramanian, Sankar; Wright, Joanne L.; Endicott, Phillip; Westaway, Michael Carrington; Huynen, Leon; Parson, Walther; Millar, Craig D.; Willerslev, Eske; Lambert, David M.

    2016-01-01

    The publication in 2001 by Adcock et al. [Adcock GJ, et al. (2001) Proc Natl Acad Sci USA 98(2):537–542] in PNAS reported the recovery of short mtDNA sequences from ancient Australians, including the 42,000-y-old Mungo Man [Willandra Lakes Hominid (WLH3)]. This landmark study in human ancient DNA suggested that an early modern human mitochondrial lineage emerged in Asia and that the theory of modern human origins could no longer be considered solely through the lens of the “Out of Africa” model. To evaluate these claims, we used second generation DNA sequencing and capture methods as well as PCR-based and single-primer extension (SPEX) approaches to reexamine the same four Willandra Lakes and Kow Swamp 8 (KS8) remains studied in the work by Adcock et al. Two of the remains sampled contained no identifiable human DNA (WLH15 and WLH55), whereas the Mungo Man (WLH3) sample contained no Aboriginal Australian DNA. KS8 reveals human mitochondrial sequences that differ from the previously inferred sequence. Instead, we recover a total of five modern European contaminants from Mungo Man (WLH3). We show that the remaining sample (WLH4) contains ∼1.4% human DNA, from which we assembled two complete mitochondrial genomes. One of these was a previously unidentified Aboriginal Australian haplotype belonging to haplogroup S2 that we sequenced to a high coverage. The other was a contaminating modern European mitochondrial haplotype. Although none of the sequences that we recovered matched those reported by Adcock et al., except a contaminant, these findings show the feasibility of obtaining important information from ancient Aboriginal Australian remains. PMID:27274055

  20. Ancient mtDNA sequences from the First Australians revisited.

    PubMed

    Heupink, Tim H; Subramanian, Sankar; Wright, Joanne L; Endicott, Phillip; Westaway, Michael Carrington; Huynen, Leon; Parson, Walther; Millar, Craig D; Willerslev, Eske; Lambert, David M

    2016-06-21

    The publication in 2001 by Adcock et al. [Adcock GJ, et al. (2001) Proc Natl Acad Sci USA 98(2):537-542] in PNAS reported the recovery of short mtDNA sequences from ancient Australians, including the 42,000-y-old Mungo Man [Willandra Lakes Hominid (WLH3)]. This landmark study in human ancient DNA suggested that an early modern human mitochondrial lineage emerged in Asia and that the theory of modern human origins could no longer be considered solely through the lens of the "Out of Africa" model. To evaluate these claims, we used second generation DNA sequencing and capture methods as well as PCR-based and single-primer extension (SPEX) approaches to reexamine the same four Willandra Lakes and Kow Swamp 8 (KS8) remains studied in the work by Adcock et al. Two of the remains sampled contained no identifiable human DNA (WLH15 and WLH55), whereas the Mungo Man (WLH3) sample contained no Aboriginal Australian DNA. KS8 reveals human mitochondrial sequences that differ from the previously inferred sequence. Instead, we recover a total of five modern European contaminants from Mungo Man (WLH3). We show that the remaining sample (WLH4) contains ∼1.4% human DNA, from which we assembled two complete mitochondrial genomes. One of these was a previously unidentified Aboriginal Australian haplotype belonging to haplogroup S2 that we sequenced to a high coverage. The other was a contaminating modern European mitochondrial haplotype. Although none of the sequences that we recovered matched those reported by Adcock et al., except a contaminant, these findings show the feasibility of obtaining important information from ancient Aboriginal Australian remains. PMID:27274055

  1. Biased distribution of DNA uptake sequences towards genome maintenance genes.

    PubMed

    Davidsen, Tonje; Rødland, Einar A; Lagesen, Karin; Seeberg, Erling; Rognes, Torbjørn; Tønjum, Tone

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within coding regions are the DNA uptake sequences (DUS) required for natural genetic transformation. More importantly, we found a significantly higher density of DUS within genes involved in DNA repair, recombination, restriction-modification and replication than in any other annotated gene group in these organisms. Pasteurella multocida also displayed high frequencies of a putative DUS identical to that previously identified in H. influenzae and with a skewed distribution towards genome maintenance genes, indicating that this bacterium might be transformation competent under certain conditions. These results imply that the high frequency of DUS in genome maintenance genes is conserved among phylogenetically divergent species and is thus of significant biological importance. Increased DUS density is expected to enhance DNA uptake and the over-representation of DUS in genome maintenance genes might reflect facilitated recovery of genome preserving functions. For example, transient and beneficial increase in genome instability can be allowed during pathogenesis simply through loss of antimutator genes, since these DUS-containing sequences will be preferentially recovered. Furthermore, uptake of such genes could provide a mechanism for facilitated recovery from DNA damage after genotoxic stress. PMID:14960717
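
    The kind of search described above (most frequent k-mers, then motif density per gene group) can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the function names are hypothetical, the motif is supplied by the caller, and density is reported as occurrences per kilobase counted on both strands.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_counts(seq, k):
    """Count every overlapping k-mer on the given strand."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    return sum(1 for i in range(len(seq) - len(motif) + 1)
               if seq.startswith(motif, i))


def motif_density_per_kb(seq, motif):
    """Occurrences of motif on either strand per 1000 bp of seq.

    A palindromic motif would be counted twice by this simple version.
    """
    seq, motif = seq.upper(), motif.upper()
    hits = (count_overlapping(seq, motif)
            + count_overlapping(seq, reverse_complement(motif)))
    return 1000.0 * hits / len(seq)
```

    Comparing motif_density_per_kb across the concatenated sequences of each annotated gene group would give the kind of per-group density contrast the abstract reports; kmer_counts(seq, 10).most_common(5) lists the most frequent 10-mers.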

  2. Sequence-selective binding of an ellipticine derivative to DNA.

    PubMed Central

    Bailly, C; OhUigin, C; Rivalle, C; Bisagni, E; Hénichart, J P; Waring, M J

    1990-01-01

    The DNA sequence specificity of an ellipticine derivative bearing an aminoalkyl side chain has been determined by a variety of footprinting methods. The drug exhibits sequence selective binding and discriminates against runs of adenines or thymines. Binding is shown to occur at various sequences with a preference for GC rich regions of DNA. A large enhancement of DNAase I and of hydroxyl radical cleavage in regions rich in A's or T's is observed together with hyperreactivity of adenines towards diethylpyrocarbonate in the presence of drug. This indicates the occurrence of drug-induced changes in critical conformational features of DNA. The total absence of hyperreactivity of guanine residues towards diethylpyrocarbonate appears to be related to the sequence selectivity of drug binding. No alteration of the dimethyl sulphate and methylene blue-induced cleavage of DNA is observed. Irradiation of ellipticine derivative-DNA complexes with UV light followed by alkali treatment leads to selective photocleavage at guanine residues, consistent with the deduced degree of selectivity of the binding reaction. PMID:2173825

  3. Distribution of repetitious sequences in chick nuclear DNA

    PubMed Central

    Tapiero, H.; Monier, M.N.; Shaool, D.; Harel, J.

    1974-01-01

    By an improved method of hydroxylapatite chromatography, the reassociated sequences of chick nuclear DNA were isolated, and their base composition analysed. By increasing the amount of reassociation, the G + C content of the renatured sequences decreased progressively to reach a mean value corresponding to that of the total DNA. In order to study the distribution of the families, or groups of families, having different amounts of reassociation, DNA was fractionated by CsCl density gradient centrifugation. Fractions having different G + C content were obtained, and their reassociation rates analysed. At a high Cot value of renaturation (Cot=50) the amount of reassociated sequences included in the high or in the low buoyant density DNA fractions was approximately the same, but their G + C content was, as expected, different. At lower Cot values of renaturation (between a Cot of 0.2 and a Cot of 10), the results indicated a heterogeneity of the repeated sequences in the A + T rich DNA fractions, as compared to the G + C rich ones. PMID:4213036

  4. Sequence dependence of transcription factor-mediated DNA looping

    PubMed Central

    Johnson, Stephanie; Lindén, Martin; Phillips, Rob

    2012-01-01

    DNA is subject to large deformations in a wide range of biological processes. Two key examples illustrate how such deformations influence the readout of the genetic information: the sequestering of eukaryotic genes by nucleosomes and DNA looping in transcriptional regulation in both prokaryotes and eukaryotes. These kinds of regulatory problems are now becoming amenable to systematic quantitative dissection with a powerful dialogue between theory and experiment. Here, we use a single-molecule experiment in conjunction with a statistical mechanical model to test quantitative predictions for the behavior of DNA looping at short length scales and to determine how DNA sequence affects looping at these lengths. We calculate and measure how such looping depends upon four key biological parameters: the strength of the transcription factor binding sites, the concentration of the transcription factor, and the length and sequence of the DNA loop. Our studies lead to the surprising insight that sequences that are thought to be especially favorable for nucleosome formation because of high flexibility lead to no systematically detectable effect of sequence on looping, and begin to provide a picture of the distinctions between the short length scale mechanics of nucleosome formation and looping. PMID:22718983

  5. Mitochondrial DNA sequences from a 7000-year old brain.

    PubMed Central

    Pääbo, S; Gifford, J A; Wilson, A C

    1988-01-01

    Pieces of mitochondrial DNA from a 7000-year-old human brain were amplified by the polymerase chain reaction and sequenced. Albumin and high concentrations of polymerase were required to overcome a factor in the brain extract that inhibits amplification. For this and other sources of ancient DNA, we find an extreme inverse dependence of the amplification efficiency on the length of the sequence to be amplified. This property of ancient DNA distinguishes it from modern DNA and thus provides a new criterion of authenticity for use in research on ancient DNA. The brain is from an individual recently excavated from Little Salt Spring in southwestern Florida and the anthropologically informative sequences it yielded are the first obtained from archaeologically retrieved remains. The sequences show that this ancient individual belonged to a mitochondrial lineage that is rare in the Old World and not previously known to exist among Native Americans. Our finding brings to three the number of maternal lineages known to have been involved in the prehistoric colonization of the New World. PMID:3186445

  6. Mitochondrial DNA sequences in the nuclear genome of a locust.

    PubMed

    Gellissen, G; Bradfield, J Y; White, B N; Wyatt, G R

    The endosymbiotic theory of the origin of mitochondria is widely accepted, and implies that loss of genes from the mitochondria to the nucleus of eukaryotic cells has occurred over evolutionary time. However, evidence at the DNA sequence level for gene transfer between these organelles has so far been limited to a single example, the demonstration that a mitochondrial ATPase subunit gene of Neurospora crassa has an homologous partner in the nuclear genome. From a gene library of the insect, Locusta migratoria, we have now isolated two clones, representing separate fragments of nuclear DNA, which contain sequences homologous to the mitochondrial genes for ribosomal RNA, as well as regions of homology with highly repeated nuclear sequences. The results suggest the transfer of sequences between mitochondrial and nuclear genomes, followed by evolutionary divergence. PMID:6298629

  7. DNA Qualification Workflow for Next Generation Sequencing of Histopathological Samples

    PubMed Central

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T.; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for

  8. DNA qualification workflow for next generation sequencing of histopathological samples.

    PubMed

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for

  9. RapTOR: Automated sequencing library preparation and suppression for rapid pathogen characterization (7th Annual SFAF Meeting, 2012)

    ScienceCinema

    Lane, Todd [SNL

    2013-02-11

    Todd Lane on "RapTOR: Automated sequencing library preparation and suppression for rapid pathogen characterization" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  10. RapTOR: Automated sequencing library preparation and suppression for rapid pathogen characterization (7th Annual SFAF Meeting, 2012)

    SciTech Connect

    Lane, Todd

    2012-06-01

    Todd Lane on "RapTOR: Automated sequencing library preparation and suppression for rapid pathogen characterization" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.

  11. Defining the sequence requirements for the positioning of base J in DNA using SMRT sequencing

    PubMed Central

    Genest, Paul-Andre; Baugh, Loren; Taipale, Alex; Zhao, Wanqi; Jan, Sabrina; van Luenen, Henri G.A.M.; Korlach, Jonas; Clark, Tyson; Luong, Khai; Boitano, Matthew; Turner, Steve; Myler, Peter J.; Borst, Piet

    2015-01-01

    Base J (β-D-glucosyl-hydroxymethyluracil) replaces 1% of T in the Leishmania genome and is only found in telomeric repeats (99%) and in regions where transcription starts and stops. This highly restricted distribution must be co-determined by the thymidine hydroxylases (JBP1 and JBP2) that catalyze the initial step in J synthesis. To determine the DNA sequences recognized by JBP1/2, we used SMRT sequencing of DNA segments inserted into plasmids grown in Leishmania tarentolae. We show that SMRT sequencing recognizes base J in DNA. Leishmania DNA segments that normally contain J also picked up J when present in the plasmid, whereas control sequences did not. Even a segment of only 10 telomeric (GGGTTA) repeats was modified in the plasmid. We show that J modification usually occurs at pairs of Ts on opposite DNA strands, separated by 12 nucleotides. Modifications occur near G-rich sequences capable of forming G-quadruplexes and JBP2 is needed, as it does not occur in JBP2-null cells. We propose a model whereby de novo J insertion is mediated by JBP2. JBP1 then binds to J and hydroxylates another T 13 bp downstream (but not upstream) on the complementary strand, allowing JBP1 to maintain existing J following DNA replication. PMID:25662217

  12. An optimization approach and its application to compare DNA sequences

    NASA Astrophysics Data System (ADS)

    Liu, Liwei; Li, Chao; Bai, Fenglan; Zhao, Qi; Wang, Ying

    2015-02-01

    Studying the evolutionary relationships between biological sequences by comparing and analyzing gene sequences has become one of the main tasks in bioinformatics research. Many valid methods have been applied to DNA sequence alignment. In this paper, we propose a novel method based on Lempel-Ziv (LZ) complexity to compare biological sequences. Moreover, we introduce a new distance measure and use the corresponding similarity matrix to construct a phylogenic tree without multiple sequence alignment. Further, we construct phylogenic trees for 24 species of Eutherian mammals and 48 countries of Hepatitis E virus (HEV) by an optimization approach. The results indicate that this new method improves the efficiency of sequence comparison and successfully constructs phylogenies.
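
As a rough illustration of alignment-free comparison based on LZ complexity, the sketch below counts phrases in an LZ76-style parsing and derives a normalized distance from it. The abstract does not state the authors' new distance measure, so the formula used here is a swapped-in, commonly used normalized LZ distance, not the paper's; the function names are hypothetical.

```python
def lz_complexity(s):
    """Number of phrases in an LZ76-style parsing of s.

    Each phrase is extended for as long as it can still be copied from
    the text preceding its last character.
    """
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count


def lz_distance(s, q):
    """Normalized LZ-based distance (an assumed stand-in measure).

    Measures how many new phrases one sequence needs when it is appended
    to the other, scaled by the larger individual complexity.
    """
    cs, cq = lz_complexity(s), lz_complexity(q)
    return max(lz_complexity(s + q) - cs, lz_complexity(q + s) - cq) / max(cs, cq)


d_same = lz_distance("ACGT", "ACGT")
d_diff = lz_distance("ACGT", "TTTT")
```

A matrix of such pairwise distances could then be passed to any standard tree-building method; that step is omitted here.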

  13. DNA sequence of the maize transposable element Dissociation.

    PubMed

    Döring, H P; Tillmann, E; Starlinger, P

    The DNA sequence of the terminal 4.2 kilobases (kb) of the 30-kb insertion in the endosperm sucrose synthase gene of maize mutant sh-m5933 shows that it comprises two identical 2,040-base pair (bp) segments, one inserted in the reverse direction into the other. We suggest that the 2,040-bp sequence is an example of the transposable element Dissociation described by Barbara McClintock. PMID:6318121

  14. Fast DNA sequencing by electrical means inches closer

    NASA Astrophysics Data System (ADS)

    Di Ventra, Massimiliano

    2013-08-01

    The sequencing of the human genome offered a glimpse of future medical practices, where information retrieved from the genome could be harnessed to inform treatment decisions. However, making DNA sequencing accessible enough for widespread use poses a number of challenges. This perspective article traces the progress made in the field so far and looks at how close we may be already to real-life applications.

  15. Restriction and sequence alterations affect DNA uptake sequence-dependent transformation in Neisseria meningitidis.

    PubMed

    Ambur, Ole Herman; Frye, Stephan A; Nilsen, Mariann; Hovland, Eirik; Tønjum, Tone

    2012-01-01

    Transformation is a complex process that involves several interactions from the binding and uptake of naked DNA to homologous recombination. Some actions affect transformation favourably whereas others act to limit it. Here, meticulous manipulation of a single type of transforming DNA allowed for quantifying the impact of three different mediators of meningococcal transformation: NlaIV restriction, homologous recombination and the DNA Uptake Sequence (DUS). In the wildtype, an inverse relationship between the transformation frequency and the number of NlaIV restriction sites in DNA was observed when the transforming DNA harboured a heterologous region for selection (ermC) but not when the transforming DNA was homologous with only a single nucleotide heterology. The influence of homologous sequence in transforming DNA was further studied using plasmids with a small interruption or larger deletions in the recombinogenic region and these alterations were found to impair transformation frequency. In contrast, a particularly potent positive driver of DNA uptake in Neisseria sp. are short DUS in the transforming DNA. However, the molecular mechanism(s) responsible for DUS specificity remains unknown. Increasing the number of DUS in the transforming DNA was here shown to exert a positive effect on transformation. Furthermore, an influence of variable placement of DUS relative to the homologous region in the donor DNA was documented for the first time. No effect of altering the orientation of DUS was observed. These observations suggest that DUS is important at an early stage in the recognition of DNA, but does not exclude the existence of more than one level of DUS specificity in the sequence of events that constitute transformation. New knowledge on the positive and negative drivers of transformation may in a larger perspective illuminate both the mechanisms and the evolutionary role(s) of one of the most conserved mechanisms in nature: homologous recombination. PMID

  16. Restriction and Sequence Alterations Affect DNA Uptake Sequence-Dependent Transformation in Neisseria meningitidis

    PubMed Central

    Ambur, Ole Herman; Frye, Stephan A.; Nilsen, Mariann; Hovland, Eirik; Tønjum, Tone

    2012-01-01

    Transformation is a complex process that involves several interactions from the binding and uptake of naked DNA to homologous recombination. Some actions affect transformation favourably whereas others act to limit it. Here, meticulous manipulation of a single type of transforming DNA allowed for quantifying the impact of three different mediators of meningococcal transformation: NlaIV restriction, homologous recombination and the DNA Uptake Sequence (DUS). In the wildtype, an inverse relationship between the transformation frequency and the number of NlaIV restriction sites in DNA was observed when the transforming DNA harboured a heterologous region for selection (ermC) but not when the transforming DNA was homologous with only a single nucleotide heterology. The influence of homologous sequence in transforming DNA was further studied using plasmids with a small interruption or larger deletions in the recombinogenic region and these alterations were found to impair transformation frequency. In contrast, a particularly potent positive driver of DNA uptake in Neisseria sp. are short DUS in the transforming DNA. However, the molecular mechanism(s) responsible for DUS specificity remains unknown. Increasing the number of DUS in the transforming DNA was here shown to exert a positive effect on transformation. Furthermore, an influence of variable placement of DUS relative to the homologous region in the donor DNA was documented for the first time. No effect of altering the orientation of DUS was observed. These observations suggest that DUS is important at an early stage in the recognition of DNA, but does not exclude the existence of more than one level of DUS specificity in the sequence of events that constitute transformation. New knowledge on the positive and negative drivers of transformation may in a larger perspective illuminate both the mechanisms and the evolutionary role(s) of one of the most conserved mechanisms in nature: homologous recombination. PMID

  17. Essential DNA sequence for the replication of Rts1.

    PubMed Central

    Itoh, Y; Kamio, Y; Terawaki, Y

    1987-01-01

    The promoter sequence of the mini-Rts1 repA gene encoding the 33,000-dalton RepA protein that is essential for replication was defined by RNA polymerase protection experiments and by analyzing RepA protein synthesized in maxicells harboring mini-Rts1 derivatives deleted upstream of or within the presumptive promoter region. The -10 region of the promoter which shows homology to the incII repeat sequences overlaps two inverted repeats. One of the repeats forms a pair with a sequence in the -35 region, and the other forms a pair with the translation initiation region. The replication origin region, ori(Rts1), which was determined by supplying RepA protein in trans, was localized within 188 base pairs in a region containing three incII repeats and four GATC sequences. Dyad dnaA boxes that exist upstream from the GATC sequences appeared to be dispensable for the origin function, but deletion of both dnaA boxes from ori(Rts1) resulted in reduced replication frequency, suggesting that host-encoded DnaA protein is involved in the replication of Rts1 as a stimulatory element. Combination of the minimal repA and ori(Rts1) segments, even in the reverse orientation compared with the natural sequence, resulted in reconstitution of an autonomously replicating molecule. PMID:3546265

  18. Automated detection of cardiac phase from intracoronary ultrasound image sequences.

    PubMed

    Sun, Zheng; Dong, Yi; Li, Mengchan

    2015-01-01

    Intracoronary ultrasound (ICUS) is a widely used interventional imaging modality in clinical diagnosis and treatment of cardiac vessel diseases. Due to cyclic cardiac motion and pulsatile blood flow within the lumen, coronary arterial dimensions change and the imaging catheter moves relative to the lumen during continuous pullback of the catheter. This motion causes cyclic changes in the image intensity of the acquired image sequence, so information on cardiac phases is implicit in a non-gated ICUS image sequence. A 1-D phase signal reflecting cardiac cycles was extracted according to cyclical changes in local gray levels in ICUS images. The local extrema of the signal were then detected to retrieve cardiac phases and to retrospectively gate the image sequence. Results on clinically acquired in vivo image data showed that an average inter-frame dissimilarity lower than 0.1 was achievable with our technique. In terms of computational efficiency and complexity, the proposed method was shown to be competitive with current methods. The average frame processing time was lower than 30 ms. We effectively reduced the effect of image noise, useless textures, and non-vessel regions on the phase signal detection by discarding signal components caused by non-cardiac factors. PMID:26406038
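
The gating step described here, detecting local extrema of a 1-D phase signal, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the phase signal has already been extracted and filtered, and the function name `local_extrema` is hypothetical.

```python
def local_extrema(signal):
    """Return (maxima, minima) as lists of frame indices.

    A frame is a local maximum (minimum) if its value is strictly
    greater (smaller) than both neighbours. Endpoints are ignored.
    """
    maxima, minima = [], []
    for i in range(1, len(signal) - 1):
        prev, cur, nxt = signal[i - 1], signal[i], signal[i + 1]
        if prev < cur > nxt:
            maxima.append(i)
        elif prev > cur < nxt:
            minima.append(i)
    return maxima, minima


# Toy phase signal with two full cycles; frames at the maxima would be
# kept as one gated subsequence (same cardiac phase in each cycle).
phase_signal = [0, 1, 0, -1, 0, 1, 0]
gated_frames, _ = local_extrema(phase_signal)
```

Selecting one frame per cycle at the same extremum is what makes the retained frames comparable across heartbeats.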

  19. A novel 2-D graphical representation of DNA sequences of low degeneracy

    NASA Astrophysics Data System (ADS)

    Guo, Xiaofeng; Randic, Milan; Basak, Subhash C.

    2001-12-01

    Some 2-D and 3-D graphical representations of DNA sequences have been given by Nandy, Leong and Morgenthaler, and Randic et al., which give visual characterizations of DNA sequences. In this Letter, we introduce a novel graphical representation of DNA sequences by taking four special vectors in 2-D space to represent the four nucleic acid bases in DNA sequences, so that a DNA sequence is denoted on a plane by a successive vector sequence, which is also a directed walk on the plane. It is shown that the novel graphical representation of DNA sequences has lower degeneracy and less overlapping.
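
The idea of mapping each base to a 2-D vector and drawing the sequence as a directed walk can be sketched as below. The abstract does not give the four vectors, so those in `BASE_VECTORS` are hypothetical placeholders chosen so that every step moves in the positive x direction, which prevents the walk from retracing itself.

```python
# Hypothetical base-to-vector assignment (not taken from the paper).
BASE_VECTORS = {
    "A": (2, 1),
    "T": (2, -1),
    "G": (1, 2),
    "C": (1, -2),
}


def dna_walk(seq):
    """Return the list of (x, y) points visited by the directed walk,
    starting at the origin and adding one vector per base."""
    x, y = 0, 0
    points = [(x, y)]
    for base in seq.upper():
        dx, dy = BASE_VECTORS[base]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


path = dna_walk("ATGC")
```

Because the x coordinate strictly increases, no point is visited twice in this toy assignment, which is the kind of reduced overlap the abstract refers to.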

  20. Magnetic bead purification of labeled DNA fragments for high-throughput capillary electrophoresis sequencing

    SciTech Connect

    Elkin, Christopher; Kapur, Hitesh; Smith, Troy; Humphries, David; Pollard, Martin; Hammon, Nancy; Hawkins, Trevor

    2001-09-15

    We have developed an automated purification method for terminator sequencing products based on a magnetic bead technology. This 384-well protocol generates labeled DNA fragments that are essentially free of contaminants for less than $0.005 per reaction. In comparison to laborious ethanol precipitation protocols, this method increases the phred20 read length by forty bases with various DNA templates such as PCR fragments, plasmids, cosmids and RCA products. Our method eliminates centrifugation and is compatible with both the MegaBACE 1000 and ABI Prism 3700 capillary instruments. As of September 2001, this method has produced over 1.6 million samples with 93 percent averaging 620 phred20 bases as part of the Joint Genome Institute's Production Process.

  1. Effect of Noise on DNA Sequencing via Transverse Electronic Transport

    PubMed Central

    Krems, Matt; Zwolak, Michael; Pershin, Yuriy V.; Di Ventra, Massimiliano

    2009-01-01

    Previous theoretical studies have shown that measuring the transverse current across DNA strands while they translocate through a nanopore or channel may provide a statistically distinguishable signature of the DNA bases, and may thus allow for rapid DNA sequencing. However, fluctuations of the environment, such as ionic and DNA motion, introduce important scattering processes that may affect the viability of this approach to sequencing. To understand this issue, we have analyzed a simple model that captures the role of this complex environment in electronic dephasing and its ability to remove charge carriers from current-carrying states. We find that these effects do not strongly influence the current distributions due to the off-resonant nature of tunneling through the nucleotides, a result we expect to be a common feature of transport in molecular junctions. In particular, only large scattering strengths, as compared to the energetic gap between the molecular states and the Fermi level, significantly alter the form of the current distributions. Since this gap itself is quite large, the current distributions remain protected from this type of noise, further supporting the possibility of using transverse electronic transport measurements for DNA sequencing. PMID:19804730

  2. Generalized Levy-walk model for DNA nucleotide sequences

    NASA Technical Reports Server (NTRS)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Simons, M.; Stanley, H. E.

    1993-01-01

    We propose a generalized Levy walk to model fractal landscapes observed in noncoding DNA sequences. We find that this model provides a very close approximation to the empirical data and explains a number of statistical properties of genomic DNA sequences such as the distribution of strand-biased regions (those with an excess of one type of nucleotide) as well as local changes in the slope of the correlation exponent $\alpha$. The generalized Levy-walk model simultaneously accounts for the long-range correlations in noncoding DNA sequences and for the apparently paradoxical finding of long subregions of biased random walks (length $l_j$) within these correlated sequences. In the generalized Levy-walk model, the $l_j$ are chosen from a power-law distribution $P(l_j) \propto l_j^{-\mu}$. The correlation exponent $\alpha$ is related to $\mu$ through $\alpha = 2 - \mu/2$ if $2 < \mu < 3$. The model is consistent with the finding of "repetitive elements" of variable length interspersed within noncoding DNA.
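
A minimal simulation sketch of such a walk is given below. It draws segment lengths from a continuous power law with exponent mu by inverse-transform sampling, gives each segment a random bias direction, and evaluates the stated relation between alpha and mu. The lower cutoff of 1, the bias strength, and all function names are assumptions made for illustration, not details from the paper.

```python
import random


def correlation_exponent(mu):
    """alpha = 2 - mu/2, valid for 2 < mu < 3 as stated in the abstract."""
    if not 2 < mu < 3:
        raise ValueError("relation stated only for 2 < mu < 3")
    return 2 - mu / 2


def sample_length(mu, rng):
    """Draw l >= 1 with density proportional to l**(-mu).

    Inverse-transform sampling: CDF(l) = 1 - l**(-(mu - 1)).
    """
    u = rng.random()  # u in [0, 1), so 1 - u is in (0, 1]
    return (1 - u) ** (-1 / (mu - 1))


def levy_walk(n_steps, mu, bias=0.1, seed=0):
    """Sequence of +1/-1 steps built from biased segments of power-law length."""
    rng = random.Random(seed)
    steps = []
    while len(steps) < n_steps:
        segment = max(1, int(sample_length(mu, rng)))
        p_up = 0.5 + bias * rng.choice((-1, 1))  # bias direction per segment
        for _ in range(segment):
            steps.append(1 if rng.random() < p_up else -1)
    return steps[:n_steps]


walk = levy_walk(1000, mu=2.5)
alpha = correlation_exponent(2.5)
```

For mu = 2.5 the relation gives alpha = 0.75, between the uncorrelated value 0.5 and the fully persistent value 1.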

  3. Sequence-specific DNA primer effects on telomerase polymerization activity.

    PubMed Central

    Lee, M S; Blackburn, E H

    1993-01-01

    The ribonucleoprotein enzyme telomerase synthesizes one strand of telomeric DNA by copying a template sequence within the RNA moiety of the enzyme. Kinetic studies of this polymerization reaction were used to analyze the mechanism and properties of the telomerase from Tetrahymena thermophila. This enzyme synthesizes TTGGGG repeats, the telomeric DNA sequence of this species, by elongating a DNA primer whose 3' end base pairs with the template-forming domain of the RNA. The enzyme was found to act nonprocessively with short (10- to 12-nucleotide) primers but to become processive as TTGGGG repeats were added. Variation of the 5' sequences of short primers with a common 3' end identified sequence-specific effects which are distinct from those involving base pairing of the 3' end of the primer with the RNA template and which can markedly induce enzyme activity by increasing the catalytic rate of the telomerase polymerization reaction. These results identify an additional mechanistic basis for telomere and DNA end recognition by telomerase in vivo. PMID:8413255

  4. Decoding long nanopore sequencing reads of natural DNA.

    PubMed

    Laszlo, Andrew H; Derrington, Ian M; Ross, Brian C; Brinkerhoff, Henry; Adey, Andrew; Nova, Ian C; Craig, Jonathan M; Langford, Kyle W; Samson, Jenny Mae; Daza, Riza; Doering, Kenji; Shendure, Jay; Gundlach, Jens H

    2014-08-01

    Nanopore sequencing of DNA is a single-molecule technique that may achieve long reads, low cost and high speed with minimal sample preparation and instrumentation. Here, we build on recent progress with respect to nanopore resolution and DNA control to interpret the procession of ion current levels observed during the translocation of DNA through the pore MspA. As approximately four nucleotides affect the ion current of each level, we measured the ion current corresponding to all 256 four-nucleotide combinations (quadromers). This quadromer map is highly predictive of ion current levels of previously unmeasured sequences derived from the bacteriophage phi X 174 genome. Furthermore, we show nanopore sequencing reads of phi X 174 up to 4,500 bases in length, which can be unambiguously aligned to the phi X 174 reference genome, and demonstrate proof-of-concept utility with respect to hybrid genome assembly and polymorphism detection. This work provides a foundation for nanopore sequencing of long, natural DNA strands. PMID:24964173
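
The count of 256 quadromers follows from 4 bases in 4 positions (4^4 = 256). The sketch below enumerates them and shows how a sequence would be read as overlapping four-nucleotide windows, each of which could be looked up in a quadromer map. The map values here are made-up placeholders, not measured currents, and the function names are hypothetical.

```python
from itertools import product

BASES = "ACGT"


def all_quadromers():
    """All four-nucleotide combinations, in lexicographic order."""
    return ["".join(p) for p in product(BASES, repeat=4)]


def quadromers_in(seq):
    """Overlapping 4-nt windows of seq, one per expected current level."""
    return [seq[i:i + 4] for i in range(len(seq) - 3)]


def predict_levels(seq, quadromer_map):
    """Look up a predicted level for each window of seq."""
    return [quadromer_map[q] for q in quadromers_in(seq)]


# Placeholder map: each quadromer gets its index as a fake "level".
toy_map = {q: float(i) for i, q in enumerate(all_quadromers())}
levels = predict_levels("ACGTA", toy_map)
```

A sequence of length n produces n - 3 windows, so consecutive levels share three of their four nucleotides.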

  5. DNA-sequence-specific erasers of epigenetic memory.

    PubMed

    Mozgova, Iva; Köhler, Claudia

    2016-05-27

    How epigenetic regulators find their specific targets remains a challenging question. Two parallel studies show that REF6, a plant H3K27me3 demethylase, binds a specific DNA motif via its zinc-finger domains and recruits the SWI/SNF-type ATPase BRAHMA, demonstrating a sequence-specific recruitment mechanism for a chromatin-modifying complex. PMID:27230685

  6. Derivatized versions of ligase enzymes for constructing DNA sequences

    DOEpatents

    Mariella, Jr., Raymond P.; Christian, Allen T.; Tucker, James D.; Dzenitis, John M.; Papavasiliou, Alexandros P.

    2006-08-15

    A method of making very long, double-stranded synthetic poly-nucleotides. A multiplicity of short oligonucleotides is provided. The short oligonucleotides are sequentially hybridized to each other. Enzymatic ligation of the oligonucleotides provides a contiguous piece of PCR-ready DNA of predetermined sequence.

  7. Mitochondrial DNA sequence evolution in the Arctoidea.

    PubMed Central

    Zhang, Y P; Ryder, O A

    1993-01-01

    Some taxa in the superfamily Arctoidea, such as the giant panda and the lesser panda, have presented puzzles to taxonomists. In the present study, approximately 397 bases of the cytochrome b gene, 364 bases of the 12S rRNA gene, and 74 bases of the tRNA(Thr) and tRNA(Pro) genes from the giant panda, lesser panda, kinkajou, raccoon, coatimundi, and all species of the Ursidae were sequenced. The high transition/transversion ratios in cytochrome b and RNA genes prior to saturation suggest that the presumed transition bias may represent a trend for some mammalian lineages rather than strictly a primate phenomenon. Transversions in the 12S rRNA gene accumulate in arctoids at about half the rate reported for artiodactyls. Different arctoid lineages evolve at different rates: the kinkajou, a procyonid, evolves the fastest, 1.7-1.9 times faster than the slowest lineage that comprises the spectacled and polar bears. Generation-time effect can only partially explain the different rates of nucleotide substitution in arctoids. Our results based on parsimony analysis show that the giant panda is more closely related to bears than to the lesser panda; the lesser panda is neither closely related to bears nor to the New World procyonids. The kinkajou, raccoon, and coatimundi diverged from each other very early, even though they group together. The polar bear is closely related to the spectacled bear, and they began to diverge from a common mitochondrial ancestor approximately 2 million years ago. Relationships of the remaining five bear species are derived. PMID:8415740

  8. Sequence heterogeneity accelerates protein search for targets on DNA

    NASA Astrophysics Data System (ADS)

    Shvets, Alexey A.; Kolomeisky, Anatoly B.

    2015-12-01

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  9. Sequence heterogeneity accelerates protein search for targets on DNA

    SciTech Connect

    Shvets, Alexey A.; Kolomeisky, Anatoly B.

    2015-12-28

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  10. The HTS barcode checker pipeline, a tool for automated detection of illegally traded species from high-throughput sequencing data

    PubMed Central

    2014-01-01

    Background Mixtures of internationally traded organic substances can contain parts of species protected by the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). These mixtures often raise the suspicion of border control and customs offices, which can lead to confiscation, for example in the case of Traditional Chinese medicines (TCMs). High-throughput sequencing of DNA barcoding markers obtained from such samples provides insight into species constituents of mixtures, but manual cross-referencing of results against the CITES appendices is labor intensive. Matching DNA barcodes against NCBI GenBank using BLAST may yield misleading results both as false positives, due to incorrectly annotated sequences, and false negatives, due to spurious taxonomic re-assignment. Incongruence between the taxonomies of CITES and NCBI GenBank can result in erroneous estimates of illegal trade. Results The HTS barcode checker pipeline is an application for automated processing of sets of 'next generation' barcode sequences to determine whether these contain DNA barcodes obtained from species listed on the CITES appendices. This analytical pipeline builds upon and extends existing open-source applications for BLAST matching against the NCBI GenBank reference database and for taxonomic name reconciliation. In a single operation, reads are converted into taxonomic identifications matched with names on the CITES appendices. By inclusion of a blacklist and additional names databases, the HTS barcode checker pipeline prevents false positives and resolves taxonomic heterogeneity. Conclusions The HTS barcode checker pipeline can detect and correctly identify DNA barcodes of CITES-protected species from reads obtained from TCM samples in just a few minutes. The pipeline facilitates and improves molecular monitoring of trade in endangered species, and can aid in safeguarding these species from extinction in the wild. The HTS barcode checker pipeline is

  11. DNA sequence alignment by microhomology sampling during homologous recombination

    PubMed Central

    Qi, Zhi; Redding, Sy; Lee, Ja Yil; Gibb, Bryan; Kwon, YoungHo; Niu, Hengyao; Gaines, William A.; Sung, Patrick

    2015-01-01

    Homologous recombination (HR) mediates the exchange of genetic information between sister or homologous chromatids. During HR, members of the RecA/Rad51 family of recombinases must somehow search through vast quantities of DNA sequence to align and pair ssDNA with a homologous dsDNA template. Here we use single-molecule imaging to visualize Rad51 as it aligns and pairs homologous DNA sequences in real-time. We show that Rad51 uses a length-based recognition mechanism while interrogating dsDNA, enabling robust kinetic selection of 8-nucleotide (nt) tracts of microhomology, which kinetically confines the search to sites with a high probability of being a homologous target. Successful pairing with a 9th nucleotide coincides with an additional reduction in binding free energy and subsequent strand exchange occurs in precise 3-nt steps, reflecting the base triplet organization of the presynaptic complex. These findings provide crucial new insights into the physical and evolutionary underpinnings of DNA recombination. PMID:25684365
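
As a purely computational analogy to the length-based selection described here, the sketch below lists positions in a duplex sequence that share at least one contiguous 8-nt tract with a single-stranded query. It says nothing about kinetics or binding energies; the function name and the exact-match criterion are illustrative assumptions.

```python
def microhomology_sites(ssdna, dsdna, k=8):
    """Start positions in dsdna whose k-nt window also occurs in ssdna.

    Only exact matches on the given strand are considered; the
    complementary strand is ignored in this sketch.
    """
    query_kmers = {ssdna[i:i + k] for i in range(len(ssdna) - k + 1)}
    return [
        j
        for j in range(len(dsdna) - k + 1)
        if dsdna[j:j + k] in query_kmers
    ]


query = "ACGTACGGTT"
target = "TTTACGTACGGAAA"
sites_8 = microhomology_sites(query, target, k=8)
sites_9 = microhomology_sites(query, target, k=9)
```

In this toy pair, the target shares an 8-nt tract with the query but not a 9-nt one, so raising k by one removes the site.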

  12. Automated reconstruction of 3D scenes from sequences of images

    NASA Astrophysics Data System (ADS)

    Pollefeys, M.; Koch, R.; Vergauwen, M.; Van Gool, L.

    Modelling of 3D objects from image sequences is a challenging problem and has been an important research topic in the areas of photogrammetry and computer vision for many years. In this paper, a system is presented which automatically extracts a textured 3D surface model from a sequence of images of a scene. The system can deal with unknown camera settings. In addition, the parameters of this camera are allowed to change during acquisition (e.g., by zooming or focusing). No prior knowledge about the scene is necessary to build the 3D models. Therefore, this system offers a high degree of flexibility. The system is based on state-of-the-art algorithms recently developed in computer vision. The 3D modelling task is decomposed into a number of successive steps. Gradually, more knowledge of the scene and the camera setup is retrieved. At this point, the obtained accuracy is not yet at the level required for most metrology applications, but the visual quality is very convincing. This system has been applied to a number of applications in archaeology. The Roman site of Sagalassos (southwest Turkey) was used as a test case to illustrate the potential of this new approach.

  13. Rotating rod renewable microcolumns for automated, solid-phase DNA hybridization studies.

    PubMed

    Bruckner-Lea, C J; Stottlemyre, M S; Holman, D A; Grate, J W; Brockman, F J; Chandler, D P

    2000-09-01

    The development of a new temperature-controlled renewable microcolumn flow cell for solid-phase nucleic acid hybridization in an automated sequential injection system is described. The flow cell included a stepper motor-driven rotating rod with the working end cut to a 45-degree angle. In one position, the end of the rod prevented passage of microbeads while allowing fluid flow; rotation of the rod by 180 degrees released the beads. This system was used to rapidly test many hybridization and elution protocols to examine the temperature and solution conditions required for sequence-specific nucleic acid hybridization. Target nucleic acids labeled with a near-infrared fluorescent dye were detected immediately postcolumn during all column perfusion and elution steps using a flow-through fluorescence detector. Temperature control of the column and the presence of Triton X-100 surfactant were critical for specific hybridization. Perfusion of the column with complementary oligonucleotide (200 µL, 10 nM) resulted in hybridization with 8% of the DNA binding sites on the microbeads with a solution residence time of less than 1 s and a total sample perfusion time of 40 s. The use of the renewable column system for detection of an unlabeled PCR product in a sandwich assay was also demonstrated. PMID:10994975

  14. The DNA sequence and comparative analysis of human chromosome 10.

    PubMed

    Deloukas, P; Earthrowl, M E; Grafham, D V; Rubenfield, M; French, L; Steward, C A; Sims, S K; Jones, M C; Searle, S; Scott, C; Howe, K; Hunt, S E; Andrews, T D; Gilbert, J G R; Swarbreck, D; Ashurst, J L; Taylor, A; Battles, J; Bird, C P; Ainscough, R; Almeida, J P; Ashwell, R I S; Ambrose, K D; Babbage, A K; Bagguley, C L; Bailey, J; Banerjee, R; Bates, K; Beasley, H; Bray-Allen, S; Brown, A J; Brown, J Y; Burford, D C; Burrill, W; Burton, J; Cahill, P; Camire, D; Carter, N P; Chapman, J C; Clark, S Y; Clarke, G; Clee, C M; Clegg, S; Corby, N; Coulson, A; Dhami, P; Dutta, I; Dunn, M; Faulkner, L; Frankish, A; Frankland, J A; Garner, P; Garnett, J; Gribble, S; Griffiths, C; Grocock, R; Gustafson, E; Hammond, S; Harley, J L; Hart, E; Heath, P D; Ho, T P; Hopkins, B; Horne, J; Howden, P J; Huckle, E; Hynds, C; Johnson, C; Johnson, D; Kana, A; Kay, M; Kimberley, A M; Kershaw, J K; Kokkinaki, M; Laird, G K; Lawlor, S; Lee, H M; Leongamornlert, D A; Laird, G; Lloyd, C; Lloyd, D M; Loveland, J; Lovell, J; McLaren, S; McLay, K E; McMurray, A; Mashreghi-Mohammadi, M; Matthews, L; Milne, S; Nickerson, T; Nguyen, M; Overton-Larty, E; Palmer, S A; Pearce, A V; Peck, A I; Pelan, S; Phillimore, B; Porter, K; Rice, C M; Rogosin, A; Ross, M T; Sarafidou, T; Sehra, H K; Shownkeen, R; Skuce, C D; Smith, M; Standring, L; Sycamore, N; Tester, J; Thorpe, A; Torcasso, W; Tracey, A; Tromans, A; Tsolas, J; Wall, M; Walsh, J; Wang, H; Weinstock, K; West, A P; Willey, D L; Whitehead, S L; Wilming, L; Wray, P W; Young, L; Chen, Y; Lovering, R C; Moschonas, N K; Siebert, R; Fechtel, K; Bentley, D; Durbin, R; Hubbard, T; Doucette-Stamm, L; Beck, S; Smith, D R; Rogers, J

    2004-05-27

    The finished sequence of human chromosome 10 comprises a total of 131,666,441 base pairs. It represents 99.4% of the euchromatic DNA and includes one megabase of heterochromatic sequence within the pericentromeric region of the short and long arm of the chromosome. Sequence annotation revealed 1,357 genes, of which 816 are protein coding, and 430 are pseudogenes. We observed widespread occurrence of overlapping coding genes (either strand) and identified 67 antisense transcripts. Our analysis suggests that both inter- and intrachromosomal segmental duplications have impacted on the gene count on chromosome 10. Multispecies comparative analysis indicated that we can readily annotate the protein-coding genes with current resources. We estimate that over 95% of all coding exons were identified in this study. Assessment of single base changes between the human chromosome 10 and chimpanzee sequence revealed nonsense mutations in only 21 coding genes with respect to the human sequence. PMID:15164054

  15. Terminal region sequence variations in variola virus DNA.

    PubMed

    Massung, R F; Loparev, V N; Knight, J C; Totmenin, A V; Chizhikov, V E; Parsons, J M; Safronov, P F; Gutorov, V V; Shchelkunov, S N; Esposito, J J

    1996-07-15

    Genome DNA terminal region sequences were determined for a Brazilian alastrim variola minor virus strain Garcia-1966 that was associated with a 0.8% case-fatality rate and African smallpox strains Congo-1970 and Somalia-1977 associated with variola major (9.6%) and minor (0.4%) mortality rates, respectively. A base sequence identity of ≥ 98.8% was determined after aligning 30 kb of the left- or right-end region sequences with cognate sequences previously determined for Asian variola major strains India-1967 (31% death rate) and Bangladesh-1975 (18.5% death rate). The deduced amino acid sequences of putative proteins of ≥ 65 amino acids also showed relatively high identity, although the Asian and African viruses were clearly more related to each other than to alastrim virus. Alastrim virus contained only 10 of 70 proteins that were 100% identical to homologs in Asian strains, and 7 alastrim-specific proteins were noted. PMID:8661439

  16. Applying machine learning techniques to DNA sequence analysis

    SciTech Connect

    Shavlik, J.W. (Dept. of Computer Sciences); Noordewier, M.O. (Dept. of Computer Science)

    1992-01-01

    We are primarily developing a machine learning (ML) system that modifies existing knowledge about specific types of biological sequences. It does this by considering sample members and nonmembers of the sequence motif being learned. Using this information, our learning algorithm produces a more accurate representation of the knowledge needed to categorize future sequences. Specifically, our KBANN algorithm maps inference rules about a given recognition task into a neural network. Neural network training techniques then use the training examples to refine these inference rules. We call these rules a domain theory, following the convention in the machine learning community. We have been applying this approach to several problems in DNA sequence analysis. In addition, we have been extending the capabilities of our learning system along several dimensions. We have also been investigating parallel algorithms that perform sequence alignments in the presence of frameshift errors.

  17. The DNA sequence and comparative analysis of human chromosome 20.

    PubMed

    Deloukas, P; Matthews, L H; Ashurst, J; Burton, J; Gilbert, J G; Jones, M; Stavrides, G; Almeida, J P; Babbage, A K; Bagguley, C L; Bailey, J; Barlow, K F; Bates, K N; Beard, L M; Beare, D M; Beasley, O P; Bird, C P; Blakey, S E; Bridgeman, A M; Brown, A J; Buck, D; Burrill, W; Butler, A P; Carder, C; Carter, N P; Chapman, J C; Clamp, M; Clark, G; Clark, L N; Clark, S Y; Clee, C M; Clegg, S; Cobley, V E; Collier, R E; Connor, R; Corby, N R; Coulson, A; Coville, G J; Deadman, R; Dhami, P; Dunn, M; Ellington, A G; Frankland, J A; Fraser, A; French, L; Garner, P; Grafham, D V; Griffiths, C; Griffiths, M N; Gwilliam, R; Hall, R E; Hammond, S; Harley, J L; Heath, P D; Ho, S; Holden, J L; Howden, P J; Huckle, E; Hunt, A R; Hunt, S E; Jekosch, K; Johnson, C M; Johnson, D; Kay, M P; Kimberley, A M; King, A; Knights, A; Laird, G K; Lawlor, S; Lehvaslaiho, M H; Leversha, M; Lloyd, C; Lloyd, D M; Lovell, J D; Marsh, V L; Martin, S L; McConnachie, L J; McLay, K; McMurray, A A; Milne, S; Mistry, D; Moore, M J; Mullikin, J C; Nickerson, T; Oliver, K; Parker, A; Patel, R; Pearce, T A; Peck, A I; Phillimore, B J; Prathalingam, S R; Plumb, R W; Ramsay, H; Rice, C M; Ross, M T; Scott, C E; Sehra, H K; Shownkeen, R; Sims, S; Skuce, C D; Smith, M L; Soderlund, C; Steward, C A; Sulston, J E; Swann, M; Sycamore, N; Taylor, R; Tee, L; Thomas, D W; Thorpe, A; Tracey, A; Tromans, A C; Vaudin, M; Wall, M; Wallis, J M; Whitehead, S L; Whittaker, P; Willey, D L; Williams, L; Williams, S A; Wilming, L; Wray, P W; Hubbard, T; Durbin, R M; Bentley, D R; Beck, S; Rogers, J

    The finished sequence of human chromosome 20 comprises 59,187,298 base pairs (bp) and represents 99.4% of the euchromatic DNA. A single contig of 26 megabases (Mb) spans the entire short arm, and five contigs separated by gaps totalling 320 kb span the long arm of this metacentric chromosome. An additional 234,339 bp of sequence has been determined within the pericentromeric region of the long arm. We annotated 727 genes and 168 pseudogenes in the sequence. About 64% of these genes have a 5' and a 3' untranslated region and a complete open reading frame. Comparative analysis of the sequence of chromosome 20 to whole-genome shotgun-sequence data of two other vertebrates, the mouse Mus musculus and the puffer fish Tetraodon nigroviridis, provides an independent measure of the efficiency of gene annotation, and indicates that this analysis may account for more than 95% of all coding exons and almost all genes. PMID:11780052

  18. Applying machine learning techniques to DNA sequence analysis

    SciTech Connect

    Shavlik, J.W.

    1992-01-01

    We are developing a machine learning system that modifies existing knowledge about specific types of biological sequences. It does this by considering sample members and nonmembers of the sequence motif being learned. Using this information (which we call a "domain theory"), our learning algorithm produces a more accurate representation of the knowledge needed to categorize future sequences. Specifically, the KBANN algorithm maps inference rules, such as consensus sequences, into a neural (connectionist) network. Neural network training techniques then use the training examples to refine these inference rules. We have been applying this approach to several problems in DNA sequence analysis and have also been extending the capabilities of our learning system along several dimensions.
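
    The rule-to-network mapping described in this record can be illustrated with a minimal sketch. It assumes the commonly described KBANN-style translation of a conjunctive rule into one sigmoid unit: a weight of +ω for each required antecedent, -ω for each negated antecedent, and a bias of -(P - 0.5)ω, where P is the number of required antecedents. The value of ω, the feature names such as "p1=T", and the function names are hypothetical choices for illustration and are not taken from the abstract.

```python
import math

OMEGA = 4.0  # assumed link weight; any clearly positive value works for this sketch


def rule_to_unit(positive, negated, omega=OMEGA):
    """Translate one conjunctive rule into weights and a bias for a sigmoid unit.

    positive: antecedents that must be true; negated: antecedents that must be false.
    """
    weights = {name: omega for name in positive}
    weights.update({name: -omega for name in negated})
    # The bias is set so the unit switches on only when every required
    # antecedent is active and no negated antecedent is active.
    bias = -(len(positive) - 0.5) * omega
    return weights, bias


def activate(weights, bias, inputs):
    """Sigmoid activation of the unit; features absent from inputs count as 0."""
    net = bias + sum(w * inputs.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-net))


# Hypothetical consensus rule: motif present if positions 1-3 read T, A, T.
weights, bias = rule_to_unit(["p1=T", "p2=A", "p3=T"], [])
full_match = activate(weights, bias, {"p1=T": 1, "p2=A": 1, "p3=T": 1})
partial_match = activate(weights, bias, {"p1=T": 1, "p2=A": 1})
```

    In this sketch the unit reproduces the rule before any training (output above 0.5 only for a full match); gradient-based training on examples could then adjust the weights, which is the refinement step the abstract refers to.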

  19. A novel approach to sequence validating protein expression clones with automated decision making

    PubMed Central

    Taycher, Elena; Rolfs, Andreas; Hu, Yanhui; Zuo, Dongmei; Mohr, Stephanie E; Williamson, Janice; LaBaer, Joshua

    2007-01-01

    Background Whereas the molecular assembly of protein expression clones is readily automated and routinely accomplished in high throughput, sequence verification of these clones is still largely performed manually, an arduous and time consuming process. The ultimate goal of validation is to determine if a given plasmid clone matches its reference sequence sufficiently to be "acceptable" for use in protein expression experiments. Given the accelerating increase in availability of tens of thousands of unverified clones, there is a strong demand for rapid, efficient and accurate software that automates clone validation. Results We have developed an Automated Clone Evaluation (ACE) system – the first comprehensive, multi-platform, web-based plasmid sequence verification software package. ACE automates the clone verification process by defining each clone sequence as a list of multidimensional discrepancy objects, each describing a difference between the clone and its expected sequence including the resulting polypeptide consequences. To evaluate clones automatically, this list can be compared against user acceptance criteria that specify the allowable number of discrepancies of each type. This strategy allows users to re-evaluate the same set of clones against different acceptance criteria as needed for use in other experiments. ACE manages the entire sequence validation process including contig management, identifying and annotating discrepancies, determining if discrepancies correspond to polymorphisms and clone finishing. Designed to manage thousands of clones simultaneously, ACE maintains a relational database to store information about clones at various completion stages, project processing parameters and acceptance criteria. In a direct comparison, the automated analysis by ACE took less time and was more accurate than a manual analysis of a 93 gene clone set. Conclusion ACE was designed to facilitate high throughput clone sequence verification projects. The

  20. Detection of DNA sequence polymorphisms in human genomic DNA by using denaturing gradient gel blots

    SciTech Connect

    Gray, M.R.

    1992-02-01

    Denaturing gradient gel electrophoresis can detect sequence differences outside restriction-enzyme recognition sites. DNA sequence polymorphisms can be detected as restriction-fragment melting polymorphisms (RFMPs) in genomic DNA by using blots made from denaturing gradient gels. In contrast to the use of Southern blots to find sequence differences, denaturing gradient gel blots can detect differences almost anywhere, not just at 4-6-bp restriction-enzyme recognition sites. Human genomic DNA was digested with one of several randomly selected 4-bp recognition-site restriction enzymes, electrophoresed in denaturing gradient gels, and transferred to nylon membranes. The blots were hybridized with radioactive probes prepared from the factor VIII, type II collagen, insulin receptor, β2-adrenergic receptor, and 21-hydroxylase genes; in unrelated individuals, several RFMPs were found in fragments from every locus tested. No restriction map or sequence information was used to detect RFMPs.

  1. Environmental DNA sequencing primers for eutardigrades and bdelloid rotifers

    PubMed Central

    2009-01-01

    Background The time it takes to isolate individuals from environmental samples and then extract DNA from each individual is one of the problems with generating molecular data from meiofauna such as eutardigrades and bdelloid rotifers. The lack of consistent morphological information and the extreme abundance of these classes makes morphological identification of rare, or even common cryptic taxa a large and unwieldy task. This limits the ability to perform large-scale surveys of the diversity of these organisms. Here we demonstrate a culture-independent molecular survey approach that enables the generation of large amounts of eutardigrade and bdelloid rotifer sequence data directly from soil. Our PCR primers, specific to the 18S small-subunit rRNA gene, were developed for both eutardigrades and bdelloid rotifers. Results The developed primers successfully amplified DNA of their target organism from various soil DNA extracts. This was confirmed by both the BLAST similarity searches and phylogenetic analyses. Tardigrades showed much better phylogenetic resolution than bdelloids. Both groups of organisms exhibited varying levels of endemism. Conclusion The development of clade-specific primers for characterizing eutardigrades and bdelloid rotifers from environmental samples should greatly increase our ability to characterize the composition of these taxa in environmental samples. Environmental sequencing as shown here differs from other molecular survey methods in that there is no need to pre-isolate the organisms of interest from soil in order to amplify their DNA. The DNA sequences obtained from methods that do not require culturing can be identified post-hoc and placed phylogenetically as additional closely related sequences are obtained from morphologically identified conspecifics. Our non-cultured environmental sequence based approach will be able to provide a rapid and large-scale screening of the presence, absence and diversity of Bdelloidea and Eutardigrada in

  2. Numerical characterization of DNA sequences in a 2-D graphical representation scheme of low degeneracy

    NASA Astrophysics Data System (ADS)

    Guo, Xiaofeng; Nandy, Ashesh

    2003-02-01

    Some 2-D and 3-D graphical representations of DNA sequences have been given by Gate, Nandy, Leong, Randic, and Guo et al. Based on 2-D graphical representation of DNA sequences, Raychaudhury and Nandy introduced the first-order moments of the x and y coordinates and the radius of the plot of a DNA sequence for indexing scheme and similarity measures of DNA sequences. In this Letter, based on Guo's novel 2-D graphical representation of DNA sequences of low degeneracy, we introduce the improved first-order moments of the x and y coordinates and the radius of DNA sequences, and the distance of two DNA sequences. The new descriptors of DNA sequences give a good numerical characterization of DNA sequences, which have lower degeneracy.
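
    As a rough illustration of the descriptors named in this record, the sketch below computes first-order moments and a plot radius for a simple 2-D walk. It assumes a Nandy-style axis assignment (A and G step along the x axis in opposite directions, C and T along the y axis); it does not implement the low-degeneracy representation or the improved moments that the Letter introduces, and all function names are hypothetical.

```python
import math

# Assumed axis assignment for the 2-D walk (an illustrative convention).
STEPS = {"A": (-1, 0), "G": (1, 0), "C": (0, 1), "T": (0, -1)}


def walk(seq):
    """Return the list of (x, y) points visited by the sequence walk."""
    x = y = 0
    points = []
    for base in seq.upper():
        dx, dy = STEPS[base]
        x += dx
        y += dy
        points.append((x, y))
    return points


def first_order_moments(seq):
    """Mean x, mean y, and radius sqrt(mu_x^2 + mu_y^2) of the plotted points."""
    points = walk(seq)
    if not points:
        raise ValueError("sequence must not be empty")
    mu_x = sum(p[0] for p in points) / len(points)
    mu_y = sum(p[1] for p in points) / len(points)
    return mu_x, mu_y, math.hypot(mu_x, mu_y)


def moment_distance(seq_a, seq_b):
    """Euclidean distance between the (mu_x, mu_y) descriptors of two sequences."""
    ax, ay, _ = first_order_moments(seq_a)
    bx, by, _ = first_order_moments(seq_b)
    return math.hypot(ax - bx, ay - by)
```

    In this simple walk a segment such as "AG" returns to its starting point, so different sequences can trace the same path; that overlap is the kind of degeneracy a low-degeneracy representation aims to reduce.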

  3. DNA sequence copy number analysis by Comparative Genomic Hybridization (CGH)

    SciTech Connect

    Pinkel, D.; Kallioniemi, A.; Kallioniemi, O.; Waldman, F.; Sudar, D.; Gray, I.; Rutovitz, D.; Piper, I.

    1993-01-01

    Comparative Genomic Hybridization (CGH) uses the kinetics of in situ hybridization to compare the copy numbers of different DNA sequences within the same genome and the copy numbers of the same sequences among different genomes. In a typical application genomic DNA from a tumor and from normal cells are differentially labeled and simultaneously hybridized to normal metaphase chromosomes, and detected with different fluorochromes. Properly registered images of each fluorochrome are obtained using a microscope equipped with multi-band filters and a CCD camera. Digital image analysis permits measurement of intensity ratio profiles along each of the target chromosomes. Studies of cells with known aberrations indicate that the intensity ratio at each position is proportional to the ratio of the copy numbers of the sequences that bind there in the tumor and normal genomes. Analytical challenges posed by the need to efficiently obtain copy number karyotypes are discussed.

  4. Nanopore-based Fourth-generation DNA Sequencing Technology

    PubMed Central

    Feng, Yanxiao; Zhang, Yuechuan; Ying, Cuifeng; Wang, Deqiang; Du, Chunlei

    2015-01-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications. PMID:25743089

  5. Nanopore-based fourth-generation DNA sequencing technology.

    PubMed

    Feng, Yanxiao; Zhang, Yuechuan; Ying, Cuifeng; Wang, Deqiang; Du, Chunlei

    2015-02-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications. PMID:25743089

  6. Wide-field imaging design for a multiple-capillary DNA-sequencing system

    NASA Astrophysics Data System (ADS)

    Nay, Lyle M.; Sinclair, Robert; Swerdlow, Harold

    1997-05-01

    A laser-induced fluorescence detection system compatible with a capillary electrophoresis array was developed. The design incorporates fiber-optic excitation and a detection system including a diffraction grating and a CCD camera. The system employs no moving parts and is capable of producing data comparable to commercially available systems. It is based on a spectrally-resolved four-dye sequencing scheme. The conceptual design was proven; however, refinements must be made to optimize performance for high-throughput capillary-array DNA sequencing. Automated sample preparation and loading in combination with a refillable separation-matrix capillary-array system could prove to be an invaluable tool for completion of the Human Genome Project.

  7. The DNA-bending protein HMG-1 enhances progesterone receptor binding to its target DNA sequences.

    PubMed Central

    Oñate, S A; Prendergast, P; Wagner, J P; Nissen, M; Reeves, R; Pettijohn, D E; Edwards, D P

    1994-01-01

    Steroid hormone receptors are ligand-dependent transcriptional activators that exert their effects by binding as dimers to cis-acting DNA sequences termed hormone response elements. When human progesterone receptor (PR), expressed as a full-length protein in a baculovirus system, was purified to homogeneity, it retained its ability to bind hormonal ligand and to dimerize but exhibited a dramatic loss in DNA binding activity for specific progesterone response elements (PREs). Addition of nuclear extracts from several cellular sources restored DNA binding activity, suggesting that PR requires a ubiquitous accessory protein for efficient interaction with specific DNA sequences. Here we have demonstrated that the high-mobility-group chromatin protein HMG-1, as a highly purified protein, dramatically enhanced binding of purified PR to PREs in gel mobility shift assays. This effect appeared to be highly selective for HMG-1, since a number of other nonspecific proteins failed to enhance PRE binding. Moreover, HMG-1 was effective when added in stoichiometric amounts with receptor, and it was capable of enhancing the DNA binding of both the A and B amino-terminal variants of PR. The presence of HMG-1 measurably increased the binding affinity of purified PR by 10-fold when a synthetic palindromic PRE was the target DNA. The increase in binding affinity for a partial palindromic PRE present in natural target genes was greater than 10-fold. Coimmunoprecipitation assays using anti-PR or anti-HMG-1 antibodies demonstrated that both PR and HMG-1 are present in the enhanced complex with PRE. HMG-1 protein has two conserved DNA binding domains (A and B), which recognize DNA structure rather than specific sequences. The A- or B-box domain expressed and purified from Escherichia coli independently stimulated the binding of PR to PRE, and the B box was able to functionally substitute for HMG-1 in enhancing PR binding. 
DNA ligase-mediated ring closure assays demonstrated that both the

  8. A measure of DNA sequence similarity by Fourier Transform with applications on hierarchical clustering.

    PubMed

    Yin, Changchuan; Chen, Ying; Yau, Stephen S-T

    2014-10-21

    Multiple sequence alignment (MSA) is a prominent method for classification of DNA sequences, yet it is hampered by inherent limitations in computational complexity. Alignment-free methods have been developed over the past decade for more efficient comparison and classification of DNA sequences than MSA. However, most alignment-free methods may lose structural and functional information of DNA sequences because they are based on feature extractions. Therefore, they may not fully reflect the actual differences among DNA sequences. Alignment-free methods with information conservation are needed for more accurate comparison and classification of DNA sequences. We propose a new alignment-free similarity measure of DNA sequences using the Discrete Fourier Transform (DFT). In this method, we map DNA sequences into four binary indicator sequences and apply DFT to the indicator sequences to transform them into the frequency domain. The Euclidean distance of full DFT power spectra of the DNA sequences is used as the similarity distance metric. To compare the DFT power spectra of DNA sequences with different lengths, we propose an even scaling method to extend shorter DFT power spectra to equal the longest length of the sequences compared. After the DFT power spectra are evenly scaled, the DNA sequences are compared in the same DFT frequency space dimensionality. We assess the accuracy of the similarity metric in hierarchical clustering using simulated DNA and virus sequences. The results demonstrate that the DFT based method is an effective and accurate measure of DNA sequence similarity. PMID:24911780
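
    The pipeline outlined in this record (indicator sequences, DFT, power spectrum, even scaling, Euclidean distance) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the DFT is computed naively in O(n^2), the scaling step uses plain linear interpolation between neighbouring spectrum values (the paper's exact scaling formula is not given in the abstract), and all function names are hypothetical.

```python
import cmath
import math


def power_spectrum(seq):
    """Sum of the DFT power spectra of the four binary indicator sequences."""
    seq = seq.upper()
    n = len(seq)
    spectrum = [0.0] * n
    for base in "ACGT":
        indicator = [1.0 if ch == base else 0.0 for ch in seq]
        for k in range(n):
            coeff = sum(
                v * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, v in enumerate(indicator)
            )
            spectrum[k] += abs(coeff) ** 2
    return spectrum


def even_scale(spectrum, m):
    """Stretch a spectrum to length m by linear interpolation (assumed scheme)."""
    n = len(spectrum)
    if m < n:
        raise ValueError("target length must not be shorter than the spectrum")
    if m == n:
        return list(spectrum)
    if n == 1:
        return [spectrum[0]] * m
    scaled = []
    for k in range(m):
        q = k * (n - 1) / (m - 1)      # fractional position in the original spectrum
        r = min(int(q), n - 2)         # left neighbour, clamped at the last interval
        frac = q - r
        scaled.append(spectrum[r] + frac * (spectrum[r + 1] - spectrum[r]))
    return scaled


def dft_distance(seq_a, seq_b):
    """Euclidean distance between evenly scaled power spectra of two sequences."""
    pa, pb = power_spectrum(seq_a), power_spectrum(seq_b)
    m = max(len(pa), len(pb))
    pa, pb = even_scale(pa, m), even_scale(pb, m)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))
```

    For real sequence lengths a fast Fourier transform would replace the naive loop; the quadratic version is kept here only so the sketch needs no third-party library.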

  9. Impacts of degraded DNA on restriction enzyme associated DNA sequencing (RADSeq).

    PubMed

    Graham, Carly F; Glenn, Travis C; McArthur, Andrew G; Boreham, Douglas R; Kieran, Troy; Lance, Stacey; Manzon, Richard G; Martino, Jessica A; Pierson, Todd; Rogers, Sean M; Wilson, Joanna Y; Somers, Christopher M

    2015-11-01

    Degraded DNA from suboptimal field sampling is common in molecular ecology. However, its impact on techniques that use restriction site associated next-generation DNA sequencing (RADSeq, GBS) is unknown. We experimentally examined the effects of in situ DNA degradation on data generation for a modified double-digest RADSeq approach (3RAD). We generated libraries using genomic DNA serially extracted from the muscle tissue of 8 individual lake whitefish (Coregonus clupeaformis) following 0-, 12-, 48- and 96-h incubation at room temperature posteuthanasia. This treatment of the tissue resulted in input DNA that ranged in quality from nearly intact to highly sheared. All samples were sequenced as a multiplexed pool on an Illumina MiSeq. Libraries created from low to moderately degraded DNA (12-48 h) performed well. In contrast, the number of RADtags per individual, number of variable sites, and percentage of identical RADtags retained were all dramatically reduced when libraries were made using highly degraded DNA (96-h group). This reduction in performance was largely due to a significant and unexpected loss of raw reads as a result of poor quality scores. Our findings remained consistent after changes in restriction enzymes, modified fold coverage values (2- to 16-fold), and additional read-length trimming. We conclude that starting DNA quality is an important consideration for RADSeq; however, the approach remains robust until genomic DNA is extensively degraded. PMID:25783180

  10. Colloquium: Physical approaches to DNA sequencing and detection

    NASA Astrophysics Data System (ADS)

    Zwolak, Michael; di Ventra, Massimiliano

    2008-01-01

    With the continued improvement of sequencing technologies, the prospect of genome-based medicine is now at the forefront of scientific research. To realize this potential, however, a revolutionary sequencing method is needed for the cost-effective and rapid interrogation of individual genomes. This capability is likely to be provided by a physical approach to probing DNA at the single-nucleotide level. This is in sharp contrast to current techniques and instruments that probe (through chemical elongation, electrophoresis, and optical detection) length differences and terminating bases of strands of DNA. Several physical approaches to DNA detection have the potential to deliver fast and low-cost sequencing. Central to these approaches is the concept of nanochannels or nanopores, which allow for the spatial confinement of DNA molecules. In addition to their possible impact in medicine and biology, the methods offer ideal test beds to study open scientific issues and challenges in the relatively unexplored area at the interface between solids, liquids, and biomolecules at the nanometer length scale. This Colloquium emphasizes the physics behind these methods and ideas, critically describes their advantages and drawbacks, and discusses future research opportunities in the field.

  11. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    USGS Publications Warehouse

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
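
    To make the third step (finding simple sequence repeats that meet user-specified parameters) concrete, here is a small regular-expression sketch. It is not the SSR_pipeline search algorithm, whose details the abstract does not give; the function name, parameter names, and default thresholds are hypothetical, and compound microsatellites are not handled.

```python
import re


def find_microsatellites(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq.

    The motif group is lazy, so the shortest repeating unit is tried first.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs reported as longer motifs
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Example: an (AC)5 repeat embedded in flanking sequence.
example_hits = find_microsatellites("TTG" + "AC" * 5 + "GGT")
```

    Raising max_motif toward 25 would cover the motif range mentioned in the abstract, at the cost of more regex backtracking on long reads.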

  12. Phosphorothioate primers improve the amplification of DNA sequences by DNA polymerases with proofreading activity.

    PubMed Central

    Skerra, A

    1992-01-01

    Two thermostable DNA polymerases with proofreading activity--Vent DNA polymerase and Pfu DNA polymerase--have attracted recent attention, mainly because of their enhanced fidelities during amplification of DNA sequences by the polymerase chain reaction. A severe disadvantage for their practical application, however, results from the observation that due to their 3' to 5' exonuclease activities these enzymes degrade the oligodeoxynucleotides serving as primers for the DNA synthesis. It is demonstrated that this exonucleolytic attack on the primer molecules can be efficiently prevented by the introduction of single phosphorothioate bonds at their 3' termini. This strategy, which can be easily accomplished using routine DNA synthesis methodology, may open the way to a widespread use of these novel enzymes in the polymerase chain reaction. PMID:1641322

  13. A CLIQUE algorithm using DNA computing techniques based on closed-circle DNA sequences.

    PubMed

    Zhang, Hongyan; Liu, Xiyu

    2011-07-01

    DNA computing has been applied in broad fields such as graph theory, finite state problems, and combinatorial problems. DNA computing approaches are well suited to solving many combinatorial problems because of their vast parallelism and high-density storage. The CLIQUE algorithm is one of the grid-based clustering techniques for spatial data. It is the combinatorial problem of the density cells. Therefore we utilize DNA computing using the closed-circle DNA sequences to execute the CLIQUE algorithm for the two-dimensional data. In our study, the process of clustering becomes a parallel bio-chemical reaction and the DNA sequences representing the marked cells can be combined to form closed-circle DNA sequences. This strategy is a new application of DNA computing. Although the strategy is only for the two-dimensional data, it provides a new idea to consider the grids to be vertexes in a graph and transform the search problem into a combinatorial problem. PMID:21511001

  14. Systematic analysis of mRNA 5' coding sequence incompleteness in Danio rerio: an automated EST-based approach

    PubMed Central

    Frabetti, Flavia; Casadei, Raffaella; Lenzi, Luca; Canaider, Silvia; Vitale, Lorenza; Facchin, Federica; Carinci, Paolo; Zannotti, Maria; Strippoli, Pierluigi

    2007-01-01

    Background All standard methods for cDNA cloning are affected by a potential inability to effectively clone the 5' region of mRNA. The aim of this work was to estimate mRNA open reading frame (ORF) 5' region sequence completeness in the model organism Danio rerio (zebrafish). Results We implemented a novel automated approach (5'_ORF_Extender) that systematically compares available expressed sequence tags (ESTs) with all the zebrafish experimentally determined mRNA sequences, identifies additional sequence stretches at 5' region and scans for the presence of all conditions needed to define a new, extended putative ORF. Our software was able to identify 285 (3.3%) mRNAs with putatively incomplete ORFs at 5' region and, in three example cases selected (selt1a, unc119.2, nppa), the extended coding region at 5' end was cloned by reverse transcription-polymerase chain reaction (RT-PCR). Conclusion The implemented method, which could also be useful for the analysis of other genomes, allowed us to describe the relevance of the "5' end mRNA artifact" problem for genomic annotation and functional genomic experiment design in zebrafish. Open peer review This article was reviewed by Alexey V. Kochetov (nominated by Mikhail Gelfand), Shamil Sunyaev, and Gáspár Jékely. For the full reviews, please go to the Reviewers' Comments section. PMID:18042283
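
    The kind of check such an approach has to perform can be sketched as follows. This is not the 5'_ORF_Extender code; it assumes a simplified set of conditions (the extra 5' sequence must be in frame with the known ORF, contain no in-frame stop codon between the new and the old start, and contain an in-frame ATG), and the function name and inputs are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extend_orf_5prime(upstream, orf):
    """Return the ORF extended to the most upstream in-frame ATG, or None.

    upstream: extra 5' sequence (e.g. from an EST) ending right before the ORF.
    orf: the annotated ORF, starting at its current start codon.
    """
    upstream = upstream.upper()
    orf = orf.upper()
    best_start = None
    pos = len(upstream) - 3
    # Walk backwards codon by codon, staying in frame with the annotated ORF.
    while pos >= 0:
        codon = upstream[pos:pos + 3]
        if codon in STOP_CODONS:
            break  # an in-frame stop blocks any further extension
        if codon == "ATG":
            best_start = pos
        pos -= 3
    if best_start is None:
        return None
    return upstream[best_start:] + orf
```

    For example, with the upstream stretch "CCATGGCTGAA" and the ORF "ATGAAATAA", the sketch finds an in-frame ATG two codons upstream and returns the longer putative ORF; an upstream stretch with an in-frame stop closer to the ORF returns None.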

  15. Induced topological changes in DNA complexes: influence of DNA sequences and small molecule structures

    PubMed Central

    Hunt, Rebecca A.; Munde, Manoj; Kumar, Arvind; Ismail, Mohamed A.; Farahat, Abdelbasset A.; Arafa, Reem K.; Say, Martial; Batista-Parra, Adalgisa; Tevis, Denise; Boykin, David W.; Wilson, W. David

    2011-01-01

    Heterocyclic diamidines are compounds with antiparasitic properties that target the minor groove of kinetoplast DNA. The mechanism of action of these compounds is unknown, but topological changes to DNA structures are likely to be involved. In this study, we have developed a polyacrylamide gel electrophoresis-based screening method to determine topological effects of heterocyclic diamidines on four minor groove target sequences: AAAAA, TTTAA, AAATT and ATATA. The AAAAA and AAATT sequences have the largest intrinsic bend, whereas the TTTAA and ATATA sequences are relatively straight. The changes caused by binding of the compounds are sequence dependent, but generally the topological effects on AAAAA and AAATT are similar as are the effects on TTTAA and ATATA. A total of 13 compounds with a variety of structural differences were evaluated for topological changes to DNA. All compounds decrease the mobility of the ATATA sequence that is consistent with decreased minor groove width and bending of the relatively straight DNA into the minor groove. Similar, but generally smaller, effects are seen with TTTAA. The intrinsically bent AAAAA and AAATT sequences, which have more narrow minor grooves, have smaller mobility changes on binding that are consistent with increased or decreased bending depending on compound structure. PMID:21266485

  16. Prediction of fine-tuned promoter activity from DNA sequence

    PubMed Central

    Siwo, Geoffrey; Rider, Andrew; Tan, Asako; Pinapati, Richard; Emrich, Scott; Chawla, Nitesh; Ferdig, Michael

    2016-01-01

    The quantitative prediction of transcriptional activity of genes using promoter sequence is fundamental to the engineering of biological systems for industrial purposes and understanding the natural variation in gene expression. To catalyze the development of new algorithms for this purpose, the Dialogue on Reverse Engineering Assessment and Methods (DREAM) organized a community challenge seeking predictive models of promoter activity given normalized promoter activity data for 90 ribosomal protein promoters driving expression of a fluorescent reporter gene. By developing an unbiased modeling approach that performs an iterative search for predictive DNA sequence features using the frequencies of various k-mers, inferred DNA mechanical properties and spatial positions of promoter sequences, we achieved the best performer status in this challenge. The specific predictive features used in the model included the frequency of the nucleotide G, the length of polymeric tracts of T and TA, the frequencies of 6 distinct trinucleotides and 12 tetranucleotides, and the predicted protein deformability of the DNA sequence. Our method accurately predicted the activity of 20 natural variants of ribosomal protein promoters (Spearman correlation r = 0.73) as compared to 33 laboratory-mutated variants of the promoters (r = 0.57) in a test set that was hidden from participants. Notably, our model differed substantially from the rest in 2 main ways: i) it did not explicitly utilize transcription factor binding information implying that subtle DNA sequence features are highly associated with gene expression, and ii) it was entirely based on features extracted exclusively from the 100 bp region upstream from the translational start site demonstrating that this region encodes much of the overall promoter activity. 
The findings from this study have important implications for the engineering of predictable gene expression systems and the evolution of gene expression in naturally occurring

  17. Evolution of Protein-binding DNA Sequences through Competitive Binding

    NASA Astrophysics Data System (ADS)

    Peng, Weiqun; Gerland, Ulrich; Hwa, Terence; Levine, Herbert

    2002-03-01

    The dynamics of in vitro DNA evolution controlled via competitive binding of DNA sequences to proteins have been explored in a recent serial transfer experiment (B. Dubertret, S. Liu, Q. Ouyang, A. Libchaber, Phys. Rev. Lett. 86, 6022 (2001)). Motivated by the experiment, we investigate a continuum model for this evolution process in various parameter regimes. We establish a self-consistent mean-field evolution equation, determine its dynamical properties and finite population size corrections. In addition, we discuss the experimental implications of our results.

  18. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a pu… Government rights: This application was funded under Department of Energy Contract DE-AC02-76ER01338. The U.S. Government has certain rights under this application and any patent issuing thereon.

  19. Client side decompression technique provides faster DNA sequence data delivery.

    PubMed

    Sufi, Fahim; Fang, Qiang; Cosic, Irena; Ferguson, Roy

    2005-01-01

    DNA sequences are generally very long chains of sequentially linked nucleotides. There are four different nucleotides and combinations of these build the nucleotide information of sequence files contained in data sources. When a user searches for any sequence for an organism, a compressed sequence file can be sent from the data source to the user. The compressed file then can be decompressed at the client end resulting in reduced transmission time over the Internet. A compression algorithm that provides a moderately high compression rate with minimal decompression time is proposed in this paper. We also compare a number of different compression techniques for achieving efficient delivery methods from an intelligent genomic search agent over the Internet. PMID:17282828
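
    Because a nucleotide sequence uses only four symbols, each base fits in 2 bits. The sketch below illustrates that idea with a simple fixed-width packing and the matching client-side unpacking. It is not the algorithm proposed in the paper, whose details the abstract does not give; the names `pack` and `unpack` and the 4-byte length prefix are assumptions made for illustration, and input is assumed to contain only A, C, G and T.

```python
# Illustrative 2-bit packing of a nucleotide string.
# Assumption: this is NOT the paper's algorithm, only a minimal example
# of why a four-letter alphabet compresses to a quarter of its ASCII size.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack an A/C/G/T string into bytes, 4 bases per byte.

    A 4-byte big-endian length prefix records how many bases are real,
    so padding in the final byte can be discarded on unpacking.
    """
    out = bytearray(len(seq).to_bytes(4, "big"))
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out)


def unpack(data: bytes) -> str:
    """Reverse pack(): read the length prefix, then 4 bases per byte."""
    n = int.from_bytes(data[:4], "big")
    bases = []
    for byte in data[4:]:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:n])
```

    With this layout a 10-base sequence occupies 3 payload bytes plus the prefix, and `unpack(pack(seq))` returns the original string.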

  20. Internal Transcribed Spacer 2 (nu ITS2 rRNA) Sequence-Structure Phylogenetics: Towards an Automated Reconstruction of the Green Algal Tree of Life

    PubMed Central

    Buchheim, Mark A.; Keller, Alexander; Koetschan, Christian; Förster, Frank; Merget, Benjamin; Wolf, Matthias

    2011-01-01

    Background: Chloroplast-encoded genes (matK and rbcL) have been formally proposed for use in DNA barcoding efforts targeting embryophytes. Extending such a protocol to chlorophytan green algae, though, is fraught with problems including non-homology (matK) and heterogeneity that prevents the creation of a universal PCR toolkit (rbcL). Some have advocated the use of the nuclear-encoded, internal transcribed spacer two (ITS2) as an alternative to the traditional chloroplast markers. However, the ITS2 is broadly perceived to be insufficiently conserved or to be confounded by introgression or biparental inheritance patterns, precluding its broad use in phylogenetic reconstruction or as a DNA barcode. A growing body of evidence has shown that simultaneous analysis of nucleotide data with secondary structure information can overcome at least some of the limitations of ITS2. The goal of this investigation was to assess the feasibility of an automated, sequence-structure approach for analysis of ITS2 data from a large sampling of phylum Chlorophyta. Methodology/Principal Findings: Sequences and secondary structures from 591 chlorophycean, 741 trebouxiophycean and 938 ulvophycean algae, all obtained from the ITS2 Database, were aligned using a sequence structure-specific scoring matrix. Phylogenetic relationships were reconstructed by Profile Neighbor-Joining coupled with a sequence structure-specific, general time reversible substitution model. Results from analyses of the ITS2 data were robust at multiple nodes and showed considerable congruence with results from published phylogenetic analyses. Conclusions/Significance: Our observations on the power of automated, sequence-structure analyses of ITS2 to reconstruct phylum-level phylogenies of the green algae validate this approach to assessing diversity for large sets of chlorophytan taxa. 
Moreover, our results indicate that objections to the use of ITS2 for DNA barcoding should be weighed against the utility of an automated

  1. DNA sequence representation by trianders and determinative degree of nucleotides

    PubMed Central

    Duplij, Diana; Duplij, Steven

    2005-01-01

    A new version of DNA walks is introduced in which nucleotides are regarded as unequal in their contribution to a walk, which allows us to study thoroughly the “fine structure” of nucleotide sequences. The approach is based on the assumption that nucleotides have an inner abstract characteristic, the determinative degree, which reflects phenomenological properties of the genetic code and is adjusted to the physical properties of nucleotides. We consider each codon position independently, which gives three separate walks characterized by different angles and lengths; such an object is called a triander, and it reflects the “strength” of each branch. A general method is proposed for identifying a DNA sequence “by triander”, which can be treated as a unique “genogram” (or “gene passport”). Two- and three-dimensional trianders are considered. The difference in the fine structure of sequences between genes and the intergenic space is shown. A clear triplet signal was found in coding sequences; it is absent in the intergenic space and is independent of sequence length. This paper presents a topological classification of trianders, which may allow detailed signatures of functionally different genomic regions to be worked out. PMID:16052707

  2. cDNA sequences of two apolipoproteins from lamprey

    SciTech Connect

    Pontes, M.; Xu, X.; Graham, D.; Riley, M.; Doolittle, R.F.

    1987-03-24

    The messages for two small but abundant apolipoproteins found in lamprey blood plasma were cloned with the aid of oligonucleotide probes based on amino-terminal sequences. In both cases, numerous clones were identified in a lamprey liver cDNA library, consistent with the great abundance of these proteins in lamprey blood. One of the cDNAs (LAL1) has a coding region of 105 amino acids that corresponds to a 21-residue signal peptide, a putative 8-residue propeptide, and the 76-residue mature protein found in blood. The other cDNA (LAL2) codes for a total of 191 residues, the first 23 of which constitute a signal peptide. The two proteins, which occur in the high-density lipoprotein fraction of ultracentrifuged plasma, have amino acid compositions similar to those of apolipoproteins found in mammalian blood; computer analysis indicates that the sequences are largely helix-permissive. When the sequences were searched against an amino acid sequence data base, rat apolipoprotein IV was the best matching candidate in both cases. Although a reasonable alignment can be made with that sequence and LAL1, definitive assignment of the two lamprey proteins to typical mammalian classes cannot be made at this point.

  3. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison

    PubMed Central

    2003-01-01

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA. PMID:12734555

  4. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison.

    PubMed

    Kato, Mikio

    2003-01-01

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA. PMID:12734555
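
    The two calculations named in this abstract can be sketched as follows, assuming the member sequences are already aligned to equal length. The abstract does not state which distance measure was used; the uncorrected proportion of differing sites (p-distance) below is a stand-in chosen for simplicity, and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def column_frequencies(aligned):
    """Per-position nucleotide frequencies for equal-length aligned sequences."""
    n = len(aligned)
    return [
        {base: count / n for base, count in Counter(column).items()}
        for column in zip(*aligned)
    ]


def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ.

    Assumption: an uncorrected p-distance, not necessarily the measure
    used in the study.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairwise_distances(aligned):
    """p-distance for every unordered pair, keyed by sequence indices."""
    return {
        (i, j): p_distance(aligned[i], aligned[j])
        for i, j in combinations(range(len(aligned)), 2)
    }
```

    Positions where one base has a frequency near 1 across the array are homogenized sites; lower pairwise distances within a species than between species are the pattern expected under concerted evolution.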

  5. [Determination of hepatitis B virus genotypes by DNA sequence analysis in patients from Ankara, Turkey].

    PubMed

    Külah, Canan; Cirak, Meltem Yalinay

    2010-04-01

    Hepatitis B virus (HBV) genotypes vary depending on the geographical region. The HBV genotype identified in Turkey has been genotype D, which is found as the single, homogeneously disseminated genotype. The aim of this study was to determine HBV genotypes in a group of HBV infected patients who were admitted to a university hospital in Ankara, Turkey. Serum samples from 84 HBsAg-positive, anti-HBs-negative patients (52 male, 32 female) with HBV infection were included in the study. Anti-HBc was positive in 95.2%, HBeAg was positive in 47.6% and anti-HBe was positive in 11.9% of the patients. Mean HBV-DNA levels of the patients were 5.7 × 10^7 ± 4.6 × 10^7 IU/ml; mean ALT levels were 131 ± 171 IU/ml and mean AST levels were 98 ± 170 IU/ml. HBV-DNA was extracted from serum by the phenol-chloroform method and PCR was performed to amplify the S gene region of HBV-DNA. Cycle sequencing of PCR products was performed by a commercial "Cy5/Cy5.5 Dye Primer Cycle Sequencing Kit" (Visible Genetics, Canada) based on the dideoxy chain termination method. The sequences were read and analyzed in an automated fluorescence-based DNA-sequencing system (Long-Read Tower System, Visible Genetics, Canada). The nucleotide sequences of the patient samples were compared with the previously reported sequences in gene bank for each genotype. According to the comparative analysis of S-sequences of all patient samples with the published sequences of the genotypes in gene bank, all of the 84 hepatitis B strains (100%) were shown to be related to D genotypic group, subtype ayw. A phylogenetic analysis was performed and phylogenetic trees were constructed using programs in the PHYLIP phylogeny inference package. The patient samples clustered within the genotypic group D. According to these results, the main HBV genotype in our patients was genotype D in accordance with the previous molecular epidemiologic information on HBV in this geographic area. HBV genotype determination may help to

  6. Rapid DNA Sequencing by Direct Nanoscale Reading of Nucleotide Bases on Individual DNA Chains

    SciTech Connect

    Lee, James Weifu; Meller, Amit

    2007-01-01

    Since the independent invention of DNA sequencing by Sanger and by Gilbert 30 years ago, it has grown from a small scale technique capable of reading several kilobase-pair of sequence per day into today's multibillion dollar industry. This growth has spurred the development of new sequencing technologies that do not involve either electrophoresis or Sanger sequencing chemistries. Sequencing by Synthesis (SBS) involves multiple parallel micro-sequencing addition events occurring on a surface, where data from each round is detected by imaging. New High Throughput Technologies for DNA Sequencing and Genomics is the second volume in the Perspectives in Bioanalysis series, which looks at the electroanalytical chemistry of nucleic acids and proteins, development of electrochemical sensors and their application in biomedicine and in the new fields of genomics and proteomics. The authors have expertly formatted the information for a wide variety of readers, including new developments that will inspire students and young scientists to create new tools for science and medicine in the 21st century. Reviews of complementary developments in Sanger and SBS sequencing chemistries, capillary electrophoresis and microdevice integration, MS sequencing and applications set the framework for the book.

  7. Negatively supercoiled simian virus 40 DNA contains Z-DNA segments within transcriptional enhancer sequences

    NASA Technical Reports Server (NTRS)

    Nordheim, A.; Rich, A.

    1983-01-01

    Three 8-base pair (bp) segments of alternating purine-pyrimidine from the simian virus 40 enhancer region form Z-DNA on negative supercoiling; minichromosome DNase I-hypersensitive sites determined by others bracket these three segments. A survey of transcriptional enhancer sequences reveals a pattern of potential Z-DNA-forming regions which occur in pairs 50-80 bp apart. This may influence local chromatin structure and may be related to transcriptional activation.
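
    A survey of this kind starts by locating stretches where purines (A, G) and pyrimidines (C, T) strictly alternate. The following minimal sketch scans a sequence for such runs of at least 8 bp, the segment length quoted above. The function name and the half-open span convention are assumptions, and alternation alone marks only a potential Z-DNA-forming region, not a demonstrated one. Input is assumed to contain only A, C, G and T.

```python
PURINES = set("AG")


def alternating_runs(seq, min_len=8):
    """Return (start, end) half-open spans of strict purine/pyrimidine alternation.

    Assumption: every character is A, C, G or T; anything that is not a
    purine is treated as a pyrimidine.
    """
    seq = seq.upper()
    spans = []
    start = 0
    for i in range(1, len(seq) + 1):
        # A run ends at the sequence end or where two neighbours share a class.
        if i == len(seq) or (seq[i] in PURINES) == (seq[i - 1] in PURINES):
            if i - start >= min_len:
                spans.append((start, i))
            start = i
    return spans
```

    Pairs of reported spans can then be compared by the distance between their start positions to look for the 50-80 bp spacing described in the abstract.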

  8. Automated centrifugal-microfluidic platform for DNA purification using laser burst valve and coriolis effect.

    PubMed

    Choi, Min-Seong; Yoo, Jae-Chern

    2015-04-01

    We report a fully automated DNA purification platform with a micropored membrane in the channel utilizing centrifugal microfluidics on a lab-on-a-disc (LOD). The microfluidic flow in the LOD, into which the reagents are injected for DNA purification, is controlled by a single motor and laser burst valve. The sample and reagents pass successively through the micropored membrane in the channel when each laser burst valve is opened. The Coriolis effect is used by rotating the LOD bi-directionally to increase the purity of the DNA, thereby preventing the mixing of the waste and elution solutions. The total process from the lysed sample injection into the LOD to obtaining the purified DNA was finished within 7 min with only one manual step. The experimental result for Salmonella shows that the proposed microfluidic platform is comparable to the existing devices in terms of the purity and yield of DNA. PMID:25737025

  9. Silicene as a new potential DNA sequencing device.

    PubMed

    Amorim, Rodrigo G; Scheicher, Ralph H

    2015-04-17

    Silicene, a hexagonal buckled 2D allotrope of silicon, shows potential as a platform for numerous new applications, and may allow for easier integration with existing silicon-based microelectronics than graphene. Here, we show that silicene could function as an electrical DNA sequencing device. We investigated the stability of this novel nano-bio system, its electronic properties and the pronounced effects on the transverse electronic transport, i.e., changes in the transmission and the conductance caused by adsorption of each nucleobase, explored by us through the non-equilibrium Green's function method. Intriguingly, despite the relatively weak interaction between nucleobases and silicene, significant changes in the transmittance at zero bias are predicted by us, in particular for the two nucleobases cytosine and guanine. Our findings suggest that silicene could be utilized as an integrated-circuit biosensor as part of a lab-on-a-chip device for DNA sequencing. PMID:25797645

  10. Effect of dephasing on DNA sequencing via transverse electronic transport

    SciTech Connect

    Zwolak, Michael; Krems, Matt; Pershin, Yuriy V; Di Ventra, Massimiliano

    2009-01-01

    We study theoretically the effects of dephasing on DNA sequencing in a nanopore via transverse electronic transport. To do this, we couple classical molecular dynamics simulations with transport calculations using scattering theory. Previous studies, which did not include dephasing, have shown that by measuring the transverse current of a particular base multiple times, one can get distributions of currents for each base that are distinguishable. We introduce a dephasing parameter into transport calculations to simulate the effects of the ions and other fluctuations. These effects lower the overall magnitude of the current, but have little effect on the current distributions themselves. The results of this work further indicate that distinguishing DNA bases via transverse electronic transport has potential as a sequencing tool.

  11. Recent progress in atomistic simulation of electrical current DNA sequencing.

    PubMed

    Kim, Han Seul; Kim, Yong-Hoon

    2015-07-15

    We review recent advances in the DNA sequencing method based on measurements of transverse electrical currents. Device configurations proposed in the literature are classified according to whether the molecular fingerprints appear as the major (Mode I) or perturbing (Mode II) current signals. Scanning tunneling microscope and tunneling electrode gap configurations belong to the former category, while the nanochannels with or without an embedded nanopore belong to the latter. The molecular sensing mechanisms of Modes I and II roughly correspond to the electron tunneling and electrochemical gating, respectively. Special emphasis is placed on computer simulation studies, which have been playing a critical role in the initiation and development of the field. We also highlight low-dimensional nanomaterials such as carbon nanotubes, graphene, and graphene nanoribbons that allow the novel Mode II approach. Finally, several issues in previous computational studies are discussed, which points to future research directions toward more reliable simulation of electrical current DNA sequencing devices. PMID:25744599

  12. Model for the distributions of k-mers in DNA sequences

    NASA Astrophysics Data System (ADS)

    Chen, Yaw-Hwang; Nyeo, Su-Long; Yeh, Chiung-Yuh

    2005-07-01

    The evolutionary features based on the distributions of k-mers in the DNA sequences of various organisms are studied. The organisms are classified into three groups based on their evolutionary periods: (a) E. coli and T. pallidum, (b) yeast, zebrafish, A. thaliana, and fruit fly, (c) mouse, chicken, and human. The distributions of 6-mers of these three groups are shown to be, respectively, (a) unimodal, (b) unimodal with peaks generally shifted to smaller frequencies of occurrence, (c) bimodal. To describe the bimodal feature of the k-mer distributions of group (c), a model based on the cytosine-guanine “CG” content of the DNA sequences is introduced and shown to provide reasonably good agreement.
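
    The distribution discussed here is a histogram over k-mers: for each occurrence count, how many distinct k-mers appear that many times. A minimal sketch of how such a distribution might be tabulated is given below; the function names are hypothetical, windows containing characters other than A, C, G and T are skipped by assumption, and k-mers that never occur are not represented in the histogram.

```python
from collections import Counter

VALID = set("ACGT")


def kmer_counts(seq, k=6):
    """Count overlapping k-mers, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return Counter(w for w in windows if set(w) <= VALID)


def occurrence_histogram(counts):
    """Map occurrence count -> number of distinct k-mers with that count."""
    return Counter(counts.values())
```

    Plotting the histogram for a whole genome with k = 6 is what would reveal whether the distribution has one peak or two.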

  13. Automated carboxy-terminal sequence analysis of peptides and proteins using diphenyl phosphoroisothiocyanatidate.

    PubMed Central

    Bailey, J. M.; Nikfarjam, F.; Shenoy, N. R.; Shively, J. E.

    1992-01-01

    Proteins and peptides can be sequenced from the carboxy-terminus with isothiocyanate reagents to produce amino acid thiohydantoin derivatives. Previous studies in our laboratory have focused on the automation of the thiocyanate chemistry using acetic anhydride and trimethylsilylisothiocyanate (TMS-ITC) to derivatize the C-terminal amino acid to a thiohydantoin and sodium trimethylsilanolate for specific hydrolysis of the derivatized C-terminal amino acid (Bailey, J.M., Shenoy, N.R., Ronk, M., & Shively, J.E., 1992, Protein Sci. 1, 68-80). A major limitation of this approach was the need to activate the C-terminus with acetic anhydride. We now describe the use of a new reagent, diphenyl phosphoroisothiocyanatidate (DPP-ITC) and pyridine, which combines the activation and derivatization steps to produce peptidylthiohydantoins. Previous work by Kenner et al. (Kenner, G.W., Khorana, H.G., & Stedman, R.J., 1953, Chem. Soc. J., 673-678) with this reagent demonstrated slow kinetics. Several days were required for complete reaction. We show here that the inclusion of pyridine was found to promote the formation of C-terminal thiohydantoins by DPP-ITC resulting in complete conversion of the C-terminal amino acid to a thiohydantoin in less than 1 h. Reagents such as imidazole, triazine, and tetrazole were also found to promote the reaction with DPP-ITC as effectively as pyridine. General base catalysts, such as triethylamine, do not promote the reaction, but are required to convert the C-terminal carboxylic acid to a salt prior to the reaction with DPP-ITC and pyridine. By introducing the DPP-ITC reagent and pyridine in separate steps in an automated sequencer, we observed improved sequencing yields for amino acids normally found difficult to derivatize with acetic anhydride/TMS-ITC. This was particularly true for aspartic acid, which now can be sequenced in yields comparable to most of the other amino acids. 
Automated programs are described for the C-terminal sequencing of

  14. Mitochondrial DNA and nuclear DNA from normal rat liver have a common sequence.

    PubMed Central

    Hadler, H I; Dimitrijevic, B; Mahalingam, R

    1983-01-01

    Although Pst I does not cut the circular mitochondrial genome of the rat, BamHI generates from this genome two unequal fragments of DNA. Each of these fragments was cloned in pBR322. Nuclear DNA was digested from rat liver singly or doubly with Pst I and BamHI, and it was demonstrated that nuclear DNA shared a common sequence with the larger mitochondrial DNA BamHI fragment. The cloned larger mitochondrial DNA fragment was further subdivided with HindIII into four pieces that were labeled and then used to probe the double-digested nuclear DNA. The hybridization data showed that the common sequence is less than 3 kilobase pairs long and lies within the part of the mitochondrial genome containing the D-loop and a portion of the rRNA genes. It therefore appears that, as in lower eukaryotes, there are shared sequences between the nuclear and mitochondrial genomes in mammals. PMID:6579536

  15. DNA sequencing by multiple capillaries that form a waveguide

    SciTech Connect

    Dhadwal, S.H.; Quesada, M.A.; Studier, F.W.

    1997-05-01

    A 12-capillary prototype electrophoresis system for DNA sequencing has been constructed. Laser illumination is introduced into an optical waveguide that is formed by an array of individual capillaries that serve both as the optical elements of the periodic array and as the channels containing sieving media for electrophoresis. A theoretical framework and experimental data will be presented to illustrate the viability of this approach.

  16. Computational optimisation of targeted DNA sequencing for cancer detection.

    PubMed

    Martinez, Pierre; McGranahan, Nicholas; Birkbak, Nicolai Juul; Gerlinger, Marco; Swanton, Charles

    2013-01-01

    Despite recent progress thanks to next-generation sequencing technologies, personalised cancer medicine is still hampered by intra-tumour heterogeneity and drug resistance. As most patients with advanced metastatic disease face poor survival, there is need to improve early diagnosis. Analysing circulating tumour DNA (ctDNA) might represent a non-invasive method to detect mutations in patients, facilitating early detection. In this article, we define reduced gene panels from publicly available datasets as a first step to assess and optimise the potential of targeted ctDNA scans for early tumour detection. Dividing 4,467 samples into one discovery and two independent validation cohorts, we show that up to 76% of 10 cancer types harbour at least one mutation in a panel of only 25 genes, with high sensitivity across most tumour types. Our analyses demonstrate that targeting "hotspot" regions would introduce biases towards in-frame mutations and would compromise the reproducibility of tumour detection. PMID:24296834
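
    The quantity reported above, the share of samples carrying at least one mutation in a small gene panel, can be computed directly from a mapping of samples to mutated genes. The sketch below also includes a greedy panel builder as one plausible way to choose a reduced panel. The abstract does not say how the authors selected their panels, so the greedy strategy, the function names and the input layout are all assumptions.

```python
from collections import Counter


def panel_coverage(sample_mutations, panel):
    """Fraction of samples with at least one mutated gene in the panel."""
    if not sample_mutations:
        return 0.0
    panel = set(panel)
    hits = sum(1 for genes in sample_mutations.values() if panel & set(genes))
    return hits / len(sample_mutations)


def greedy_panel(sample_mutations, size):
    """Greedily add the gene that covers the most still-uncovered samples.

    Assumption: a generic set-cover heuristic, not the published method.
    Ties are broken alphabetically so the result is deterministic.
    """
    uncovered = {s: set(g) for s, g in sample_mutations.items()}
    panel = []
    while len(panel) < size:
        tally = Counter(g for genes in uncovered.values() for g in genes)
        if not tally:
            break  # every remaining sample has no mutated genes
        best = max(sorted(tally), key=tally.__getitem__)
        panel.append(best)
        uncovered = {s: g for s, g in uncovered.items() if best not in g}
    return panel
```

    Coverage should be measured on samples that were not used to build the panel, which is the role of the independent validation cohorts described in the abstract.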

  17. Color image encryption scheme using CML and DNA sequence operations.

    PubMed

    Wang, Xing-Yuan; Zhang, Hui-Li; Bao, Xue-Mei

    2016-06-01

    In this paper, an encryption algorithm for color images using a chaotic system and DNA (deoxyribonucleic acid) sequence operations is proposed. The three components of the color plain image are employed to construct a matrix, and a confusion operation is then performed on the pixel matrix generated by the spatiotemporal chaos system, i.e., CML (coupled map lattice). DNA encoding and decoding rules are introduced in the permutation phase. The extended Hamming distance is proposed to generate new initial values for the CML iteration in combination with the color plain image. The rows and columns of the DNA matrix are permuted, and the color cipher image is then obtained from this matrix. Theoretical analysis and experimental results show that the cryptosystem is secure and practical, and that it is suitable for encrypting color images of any size. PMID:27026385
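
    In DNA-based image encryption schemes, an encoding rule typically maps each pair of bits to one nucleotide, so an 8-bit pixel value becomes four bases. The sketch below shows one such rule (00 to A, 01 to C, 10 to G, 11 to T) and its inverse. Which rules the paper uses, and how they are selected, is not stated in the abstract; the rule and the function names here are assumptions for illustration only.

```python
# One possible 2-bit-to-base rule. Assumption: the paper's actual rule set
# is not given in the abstract.
RULE = {"00": "A", "01": "C", "10": "G", "11": "T"}
INVERSE = {base: bits for bits, base in RULE.items()}


def encode_pixel(value):
    """Encode an 8-bit channel value (0-255) as four DNA bases."""
    if not 0 <= value <= 255:
        raise ValueError("value must fit in 8 bits")
    bits = format(value, "08b")
    return "".join(RULE[bits[i:i + 2]] for i in range(0, 8, 2))


def decode_pixel(bases):
    """Decode four DNA bases back to the 8-bit channel value."""
    return int("".join(INVERSE[b] for b in bases), 2)
```

    Encoding every channel value of an image this way yields a DNA matrix four times as wide as the pixel matrix, whose rows and columns can then be permuted.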

  18. Computational optimisation of targeted DNA sequencing for cancer detection

    NASA Astrophysics Data System (ADS)

    Martinez, Pierre; McGranahan, Nicholas; Birkbak, Nicolai Juul; Gerlinger, Marco; Swanton, Charles

    2013-12-01

    Despite recent progress thanks to next-generation sequencing technologies, personalised cancer medicine is still hampered by intra-tumour heterogeneity and drug resistance. As most patients with advanced metastatic disease face poor survival, there is need to improve early diagnosis. Analysing circulating tumour DNA (ctDNA) might represent a non-invasive method to detect mutations in patients, facilitating early detection. In this article, we define reduced gene panels from publicly available datasets as a first step to assess and optimise the potential of targeted ctDNA scans for early tumour detection. Dividing 4,467 samples into one discovery and two independent validation cohorts, we show that up to 76% of 10 cancer types harbour at least one mutation in a panel of only 25 genes, with high sensitivity across most tumour types. Our analyses demonstrate that targeting "hotspot" regions would introduce biases towards in-frame mutations and would compromise the reproducibility of tumour detection.

  19. Contrasting DNA sequence organisation patterns in sauropsidian genomes.

    PubMed

    Epplen, J T; Diedrich, U; Wagenmann, M; Schmidtke, J; Engel, W

    1979-11-01

    The genomic DNA organisation patterns of four sauropsidian species, namely Python reticulatus, Caiman crocodilus, Terrapene carolina triunguis and Columba livia domestica, were investigated by reassociation of short and long DNA fragments, by hyperchromicity measurements of reannealed fragments and by length estimations of S1-nuclease resistant repetitive duplexes. While the genomic DNA of the three reptilian species shows a short period interspersion pattern, the genome of the avian species is organised in a long period interspersion pattern apparently typical for birds. These findings are discussed in view of the close phylogenetic relationships of birds and reptiles, and also with regard to a possible relationship between the extent of sequence interspersion and genome size. PMID:533670

  20. HPV-QUEST: A highly customized system for automated HPV sequence analysis capable of processing Next Generation sequencing data set.

    PubMed

    Yin, Li; Yao, Jiqiang; Gardner, Brent P; Chang, Kaifen; Yu, Fahong; Goodenow, Maureen M

    2012-01-01

    Next Generation sequencing (NGS) applied to human papilloma viruses (HPV) can provide sensitive methods to investigate the molecular epidemiology of multiple type HPV infection. Currently a genotyping system with a comprehensive collection of updated HPV reference sequences and a capacity to handle NGS data sets is lacking. HPV-QUEST was developed as an automated and rapid HPV genotyping system. The web-based HPV-QUEST subtyping algorithm was developed using HTML, PHP, Perl scripting language, and MySQL as the database backend. HPV-QUEST includes a database of annotated HPV reference sequences with updated nomenclature covering 5 genera, 14 species and 150 mucosal and cutaneous types to genotype blasted query sequences. HPV-QUEST processes up to 10 megabases of sequences within 1 to 2 minutes. Results are reported in HTML, text and Excel formats and display e-value, BLAST score, and local and coverage identities; provide genus, species, type, infection site and risk for the best matched reference HPV sequence; and produce results ready for additional analyses. PMID:22570520

  1. DNA topology confers sequence specificity to nonspecific architectural proteins.

    PubMed

    Wei, Juan; Czapla, Luke; Grosner, Michael A; Swigon, David; Olson, Wilma K

    2014-11-25

    Topological constraints placed on short fragments of DNA change the disorder found in chain molecules randomly decorated by nonspecific, architectural proteins into tightly organized 3D structures. The bacterial heat-unstable (HU) protein builds up, counter to expectations, in greater quantities and at particular sites along simulated DNA minicircles and loops. Moreover, the placement of HU along loops with the "wild-type" spacing found in the Escherichia coli lactose (lac) and galactose (gal) operons precludes access to key recognition elements on DNA. The HU protein introduces a unique spatial pathway in the DNA upon closure. The many ways in which the protein induces nearly the same closed circular configuration point to the statistical advantage of its nonspecificity. The rotational settings imposed on DNA by the repressor proteins, by contrast, introduce sequential specificity in HU placement, with the nonspecific protein accumulating at particular loci on the constrained duplex. Thus, an architectural protein with no discernible DNA sequence-recognizing features becomes site-specific and potentially assumes a functional role upon loop formation. The locations of HU on the closed DNA reflect long-range mechanical correlations. The protein responds to DNA shape and deformability—the stiff, naturally straight double-helical structure—rather than to the unique features of the constituent base pairs. The structures of the simulated loops suggest that HU architecture, like nucleosomal architecture, which modulates the ability of regulatory proteins to recognize their binding sites in the context of chromatin, may influence repressor-operator interactions in the context of the bacterial nucleoid. PMID:25385626

  2. Comparison of DNA Quantification Methods for Next Generation Sequencing

    PubMed Central

    Robin, Jérôme D.; Ludlow, Andrew T.; LaRanger, Ryan; Wright, Woodring E.; Shay, Jerry W.

    2016-01-01

    Next Generation Sequencing (NGS) is a powerful tool that depends on loading a precise amount of DNA onto a flowcell. NGS strategies have expanded our ability to investigate genomic phenomena by referencing mutations in cancer and diseases through large-scale genotyping, developing methods to map rare chromatin interactions (4C; 5C and Hi-C) and identifying chromatin features associated with regulatory elements (ChIP-seq, Bis-Seq, ChiA-PET). While many methods are available for DNA library quantification, there is no unambiguous gold standard. Most techniques use PCR to amplify DNA libraries to obtain sufficient quantities for optical density measurement. However, increased PCR cycles can distort the library’s heterogeneity and prevent the detection of rare variants. In this analysis, we compared new digital PCR technologies (droplet digital PCR; ddPCR, ddPCR-Tail) with standard methods for the titration of NGS libraries. DdPCR-Tail is comparable to qPCR and fluorometry (QuBit) and allows sensitive quantification by analysis of barcode repartition after sequencing of multiplexed samples. This study provides a direct comparison between quantification methods throughout a complete sequencing experiment and provides the impetus to use ddPCR-based quantification for improvement of NGS quality. PMID:27048884

  3. Large microchannel array fabrication and results for DNA sequencing

    SciTech Connect

    Pastrone, R L; Balch, J W; Brewer, L R; Copeland, A C; Davidson, J C; Fitch, J P; Kimbrough, J R; Madabhushi, R S; Richardson, P M; Swierkowski, S P; Tarte, L A; Vainer, M

    1999-01-07

    We have developed a process for the production of microchannel arrays on bonded glass substrates up to 14 x 58 cm, for DNA sequencing. Arrays of 96 and 384 microchannels, each 46 cm long, have been built. This technology offers significant advantages over discrete capillaries or conventional slab-gel approaches. High throughput DNA sequencing with over 550 base pairs resolution has been achieved. With custom fabrication apparatus, microchannels are etched in a borosilicate substrate, and then fusion bonded to a top substrate 1.1 mm thick that has access holes formed in it. SEM examination shows a typical microchannel to be 40 x 180 micrometers by 46 cm long; the etch is approximately isotropic, leaving a key undercut, for forming a rounded channel. The surface roughness at the bottom of the 40 micrometer deep channel has been profilometer measured to be as low as 20 nm; the roughness at the top surface was 2 nm. Etch uniformity of about 5% has been obtained using a 22% vol. HF / 78% acetic acid solution. The simple lithography, etching, and bonding of these substrates enables efficient production of these arrays and extremely precise replication from master masks and precision machining with a mandrel. Keywords: microchannels, microchannel plates, DNA sequencing, electrophoresis, borosilicate glass

  4. Comparison of DNA Quantification Methods for Next Generation Sequencing.

    PubMed

    Robin, Jérôme D; Ludlow, Andrew T; LaRanger, Ryan; Wright, Woodring E; Shay, Jerry W

    2016-01-01

    Next Generation Sequencing (NGS) is a powerful tool that depends on loading a precise amount of DNA onto a flowcell. NGS strategies have expanded our ability to investigate genomic phenomena by referencing mutations in cancer and diseases through large-scale genotyping, developing methods to map rare chromatin interactions (4C; 5C and Hi-C) and identifying chromatin features associated with regulatory elements (ChIP-seq, Bis-Seq, ChiA-PET). While many methods are available for DNA library quantification, there is no unambiguous gold standard. Most techniques use PCR to amplify DNA libraries to obtain sufficient quantities for optical density measurement. However, increased PCR cycles can distort the library's heterogeneity and prevent the detection of rare variants. In this analysis, we compared new digital PCR technologies (droplet digital PCR; ddPCR, ddPCR-Tail) with standard methods for the titration of NGS libraries. DdPCR-Tail is comparable to qPCR and fluorometry (QuBit) and allows sensitive quantification by analysis of barcode repartition after sequencing of multiplexed samples. This study provides a direct comparison between quantification methods throughout a complete sequencing experiment and provides the impetus to use ddPCR-based quantification for improvement of NGS quality. PMID:27048884

  5. Compilation of DNA sequences of Escherichia coli (update 1992)

    PubMed Central

    Kröger, Manfred; Wahl, Ralf; Schachtel, Gabriel; Rice, Peter

    1992-01-01

    We have compiled the DNA sequence data for E.coli available from the GENBANK and EMBL data libraries and over a period of several years independently from the literature. This is the fourth listing replacing and increasing the former listings substantially. However, in order to save space this printed version contains DNA sequence information only if it is publicly available in electronic form. The complete compilation including a full set of genetic map data and the E.coli protein index can be obtained in machine readable form from the EMBL data library (ECD release 10) or from the CD-ROM version of this supplement issue directly. After deletion of all detected overlaps a total of 1 820 237 individual bp is found to be determined till the beginning of 1992. This corresponds to a total of 38.56% of the entire E.coli chromosome consisting of about 4,720 kbp. This number may actually be higher by some extra 2.5% derived from lysogenic bacteriophage lambda and various DNA sequences already received for other strains of E.coli. PMID:1598239
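
    The coverage figure quoted in this abstract can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the chromosome length is taken as exactly 4,720,000 bp, which the abstract gives only as "about 4,720 kbp".

```python
def percent_sequenced(determined_bp: int, chromosome_bp: int) -> float:
    """Return the share of the chromosome covered by determined bases, in percent."""
    return 100.0 * determined_bp / chromosome_bp


# Figures from the abstract; the chromosome length is an approximation.
determined_bp = 1_820_237
chromosome_bp = 4_720_000  # assumed exact value for "about 4,720 kbp"

coverage = percent_sequenced(determined_bp, chromosome_bp)
print(f"{coverage:.2f}% of the chromosome determined")
```

    Under that assumption the result agrees with the 38.56% stated in the abstract.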

  6. DNA sequence of the Serratia marcescens lipoprotein gene

    PubMed Central

    Nakamura, Kenzo; Inouye, Masayori

    1980-01-01

    The Serratia marcescens gene for the outer membrane lipoprotein (lpp) was cloned in λ phage vector Charon 14. The recombinant phage was very unstable, and the lpp gene with a 300-base-pair deletion at the transcription termination site was further cloned in pBR322. The DNA sequence of 834 base pairs encompassing the lpp gene was determined and compared with that of the Escherichia coli lpp gene. The sequence comparisons exhibit several unique features. (i) The promoter region is highly conserved (84% homology) and has an extremely high A+T content (78%) as in E. coli (80%). (ii) The 5′ nontranslated region of the lipoprotein mRNA is also highly conserved (95% homology). (iii) In the DNA sequence corresponding to the signal peptide of this secretory protein, there are three drastic changes, including addition of one base pair and deletion of four base pairs in S. marcescens as compared to E. coli. The resultant alterations in the amino acid sequence, however, do not change the basic properties of the signal peptide, which are assumed to be essential for its function in the secretory mechanism. (iv) The DNA sequence from the amino terminus to the 51st residue of the mature lipoprotein is highly conserved (95% homology) and there is no amino acid substitution. (v) The DNA sequence corresponding to the seven amino acid residues at the carboxyl terminus has only 42% homology, resulting in four amino acid substitutions. (vi) Within the section of 40 base pairs beginning with the termination codon (UAA) and ending immediately before the oligo(T) transcription termination site in the E. coli lpp gene, there is about 60% homology. However, after this section, there is no obvious homology between the two sequences, probably because of a deletion of 300 base pairs at this region. (vii) Seven stable stem-and-loop structures could be formed in the mRNA region. (viii) Alterations in the third position of codons used in the lpp gene suggest that the gene has evolved somewhat

  7. Sequencing degraded DNA from non-destructively sampled museum specimens for RAD-tagging and low-coverage shotgun phylogenetics.

    PubMed

    Tin, Mandy Man-Ying; Economo, Evan Philip; Mikheyev, Alexander Sergeyevich

    2014-01-01

    Ancient and archival DNA samples are valuable resources for the study of diverse historical processes. In particular, museum specimens provide access to biotas distant in time and space, and can provide insights into ecological and evolutionary changes over time. However, archival specimens are difficult to handle; they are often fragile and irreplaceable, and typically contain only short segments of denatured DNA. Here we present a set of tools for processing such samples for state-of-the-art genetic analysis. First, we report a protocol for minimally destructive DNA extraction of insect museum specimens, which produced sequenceable DNA from all of the samples assayed. The 11 specimens analyzed had fragmented DNA, rarely exceeding 100 bp in length, and could not be amplified by conventional PCR targeting the mitochondrial cytochrome oxidase I gene. Our approach made these samples amenable to analysis with commonly used next-generation sequencing-based molecular analytic tools, including RAD-tagging and shotgun genome re-sequencing. First, we used museum ant specimens from three species, each with its own reference genome, for RAD-tag mapping. We were able to use the degraded DNA sequences, which were sequenced in full, to identify duplicate reads and filter them prior to base calling. Second, we re-sequenced six Hawaiian Drosophila species, with millions of years of divergence, but with only a single available reference genome. Despite a shallow coverage of 0.37 ± 0.42 per base, we could recover a sufficient number of overlapping SNPs to fully resolve the species tree, which was consistent with earlier karyotypic studies, and previous molecular studies, at least in the regions of the tree that these studies could resolve. Although developed for use with degraded DNA, all of these techniques are readily applicable to more recent tissue, and are suitable for liquid handling automation. PMID:24828244

  8. RNA sequencing using fluorescent-labeled dideoxynucleotides and automated fluorescence detection.

    PubMed Central

    Bauer, G J

    1990-01-01

    Although dideoxy-terminated sequencing of RNA, using reverse transcriptase and oligodeoxynucleotide primers, is now a well-established method, the accuracy is limited by sequence ambiguities due to unspecific chain termination events. A protocol is described which circumvents these ambiguities by using fluorescence labels tagged to dideoxynucleotides. Only chain terminations caused by dideoxynucleotides were detected, while prematurely terminated cDNAs remain undetectable. In addition, the remaining multiple signals at nucleotide positions can be assigned to sequence heterogeneities within the RNA sequence to be determined. PMID:1690393

  9. DNA Sequence Chromatogram Browsing Using JAVA and CORBA

    PubMed Central

    Parsons, Jeremy D.; Buehler, Eugen; Hillier, LaDeana

    1999-01-01

    DNA sequence chromatograms (traces) are the primary data source for all large-scale genomic and expressed sequence tags (ESTs) sequencing projects. Access to the sequencing trace assists many later analyses, for example contig assembly and polymorphism detection, but obtaining and using traces is problematic. Traces are not collected and published centrally, they are much larger than the base calls derived from them, and viewing them requires the interactivity of a local graphical client with local data. To provide efficient global access to DNA traces, we developed a client/server system based on flexible Java components integrated into other applications including an applet for use in a WWW browser and a stand-alone trace viewer. Client/server interaction is facilitated by CORBA middleware which provides a well-defined interface, a naming service, and location independence. [The software is packaged as a Jar file available from the following URL: http://www.ebi.ac.uk/∼jparsons. Links to working examples of the trace viewers can be found at http://corba.ebi.ac.uk/EST. All the Washington University mouse EST traces are available for browsing at the same URL.] PMID:10077534

  10. Correlation approach to identify coding regions in DNA sequences

    NASA Technical Reports Server (NTRS)

    Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1994-01-01

    Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.

  11. Analysis of the complete DNA sequence of murine cytomegalovirus.

    PubMed Central

    Rawlinson, W D; Farrell, H E; Barrell, B G

    1996-01-01

    The complete DNA sequence of the Smith strain of murine cytomegalovirus (MCMV) was determined from virion DNA by using a whole-genome shotgun approach. The genome has an overall G+C content of 58.7%, consists of 230,278 bp, and is arranged as a single unique sequence with short (31-bp) terminal direct repeats and several short internal repeats. Significant similarity to the genome of the sequenced human cytomegalovirus (HCMV) strain AD169 is evident, particularly for 78 open reading frames encoded by the central part of the genome. There is a very similar distribution of G+C content across the two genomes. Sequences toward the ends of the MCMV genome encode tandem arrays of homologous glycoproteins (gps) arranged as two gene families. The left end encodes 15 gps that represent one family, and the right end encodes a different family of 11 gps. A homolog (m144) of cellular major histocompatibility complex (MHC) class I genes is located at the end of the genome opposite the HCMV MHC class I homolog (UL18). G protein-coupled receptor (GCR) homologs (M33 and M78) occur in positions congruent with two (UL33 and UL78) of the four putative HCMV GCR homologs. Counterparts of all of the known enzyme homologs in HCMV are present in the MCMV genome, including the phosphotransferase gene (M97), whose product phosphorylates ganciclovir in HCMV-infected cells, and the assembly protein (M80). PMID:8971012
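
    The G+C statistics mentioned in this abstract (an overall G+C percentage and its distribution along a genome) are simple to compute. The following is a minimal sketch, not the authors' method: the function names, the non-overlapping window scheme, and the toy sequence are all assumptions made for illustration.

```python
from typing import List


def gc_content(sequence: str) -> float:
    """Return the percentage of G and C bases in a sequence."""
    sequence = sequence.upper()
    gc = sum(1 for base in sequence if base in "GC")
    return 100.0 * gc / len(sequence)


def windowed_gc(sequence: str, window: int, step: int) -> List[float]:
    """Return G+C percentages for successive full-length windows."""
    return [
        gc_content(sequence[start:start + window])
        for start in range(0, len(sequence) - window + 1, step)
    ]


# Toy sequence (hypothetical), GC-rich at the left end and AT-rich at the right.
toy = "GGCCGCGCATATATTA"
print(gc_content(toy))         # overall G+C percentage
print(windowed_gc(toy, 8, 8))  # G+C percentage per 8-base window
```

    Comparing two genomes as described would amount to computing such window profiles for each and plotting them side by side.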

  12. Complete VAX/VMS DNA/protein sequence analysis system

    SciTech Connect

    Smith, D.W.

    1987-05-01

    A complete yet flexible system of programs and database libraries for analysis of DNA, RNA and protein sequences is implemented for VAX/VMS computers. Types of analysis include 1) construction and analysis of chimeric sequences (cloning in the VAX), 2) multiple analysis of one or more single sequences, 3) search and comparison studies using sequence libraries, and 4) direct input and analysis of experimental data. Published groups of programs, including the Staden, Los Alamos, Zuker, Pearson, and PHYLIP programs, are used. GenBank and EMBL DNA libraries and PIR and Doolittle NEWAT protein libraries are available, with associated programs. The system is tutorial, with online documentation for relevant VAX software, the programs, and the databases. The complete documentation is flexibly maintained on reserve via computer printout placed in 3-ring binders. Command files are used extensively; porting of the entire system to another VAX/VMS system requires modification of a single command. Users of the system are members of a VAX group, with automatic implementation of the system upon login. The present system occupies about 140,000 blocks, and is easily expanded, or contracted, as desired. The UCSD system is used extensively for both teaching and research purposes. Use of microcomputers emulating Tektronix 4014 graphics terminals permits saving of graphics output to disk for subsequent modification to generate high quality publishable figures.

  13. DNA Sequence Determinants Controlling Affinity, Stability and Shape of DNA Complexes Bound by the Nucleoid Protein Fis

    PubMed Central

    Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; Johnson, Reid C.

    2016-01-01

    The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. The affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions. PMID:26959646

  14. DNA Sequence Determinants Controlling Affinity, Stability and Shape of DNA Complexes Bound by the Nucleoid Protein Fis.

    PubMed

    Hancock, Stephen P; Stella, Stefano; Cascio, Duilio; Johnson, Reid C

    2016-01-01

    The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. The affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions. PMID:26959646

  15. Demographic history of India and mtDNA-sequence diversity.

    PubMed Central

    Mountain, J L; Hebert, J M; Bhattacharyya, S; Underhill, P A; Ottolenghi, C; Gadgil, M; Cavalli-Sforza, L L

    1995-01-01

    The demographic history of India was examined by comparing mtDNA sequences obtained from members of three culturally divergent Indian subpopulations (endogamous caste groups). While an inferred tree revealed some clustering according to caste affiliation, there was no clear separation into three genetically distinct groups along caste lines. Comparison of pairwise nucleotide difference distributions, however, did indicate a difference in growth patterns between two of the castes. The Brahmin population appears to have undergone either a rapid expansion or steady growth. The low-ranking Mukri caste, however, may have either maintained a roughly constant population size or undergone multiple bottlenecks during that period. Comparison of the Indian sequences to those obtained from other populations, using a tree, revealed that the Indian sequences, along with all other non-African samples, form a starlike cluster. This cluster may represent a major expansion, possibly originating in southern Asia, taking place at some point after modern humans initially left Africa. PMID:7717409

  16. Conservation patterns in angiosperm rDNA ITS2 sequences.

    PubMed Central

    Hershkovitz, M A; Zimmer, E A

    1996-01-01

    The two internal transcribed spacers (ITS1 and ITS2) of nuclear ribosomal DNA have become commonly exploited sources of informative variation for interspecific-/intergeneric-level phylogenetic analyses among angiosperms and other eukaryotes. We present an alignment in which one-third to one-half of the ITS2 sequence is alignable above the family level in angiosperms and a phenetic analysis showing that ITS2 contains information sufficient to diagnose lineages at several hierarchical levels. Base compositional analysis shows that angiosperm ITS2 is inherently GC-rich, and that the proportion of T is much more variable than that for other bases. We propose a general model of angiosperm ITS2 secondary structure that shows common pairing relationships for most of the conserved sequence tracts. Variations in our secondary structure predictions for sequences from different taxa indicate that compensatory mutation is not limited to paired positions. PMID:8760866

  17. On-Demand Indexing for Referential Compression of DNA Sequences

    PubMed Central

    Alves, Fernando; Cogo, Vinicius; Wandelt, Sebastian; Leser, Ulf; Bessani, Alysson

    2015-01-01

    The decreasing cost of genome sequencing is creating a demand for scalable storage and processing tools and techniques to deal with the large amounts of generated data. Referential compression is one of these techniques, in which the similarity between the DNA of organisms of the same or an evolutionarily close species is exploited to reduce the storage demands of genome sequences up to 700 times. The general idea is to store in the compressed file only the differences between the to-be-compressed sequence and a well-known reference sequence. In this paper, we propose a method for improving the performance of referential compression by removing the most costly phase of the process, the complete reference indexing. Our approach, called On-Demand Indexing (ODI), compresses human chromosomes five to ten times faster than other state-of-the-art tools (on average), while achieving similar compression ratios. PMID:26146838
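
    The general idea of referential compression described in this abstract can be sketched in a few lines. The code below is a toy illustration only and is not the ODI method: it builds a complete k-mer index of the reference (the very step ODI is said to avoid), and the function names, token format, and example sequences are hypothetical.

```python
from typing import Dict, List, Tuple, Union

# A token is either a (reference offset, length) match or a single literal base.
Token = Union[Tuple[int, int], str]


def build_index(reference: str, k: int) -> Dict[str, int]:
    """Map each k-mer of the reference to its first position."""
    index: Dict[str, int] = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], i)
    return index


def compress(target: str, reference: str, k: int = 4) -> List[Token]:
    """Greedily encode the target as matches against the reference plus literals."""
    index = build_index(reference, k)
    tokens: List[Token] = []
    i = 0
    while i < len(target):
        start = index.get(target[i:i + k])
        if start is None:
            tokens.append(target[i])  # no k-mer match: store the base itself
            i += 1
            continue
        length = k
        # Extend the match for as long as target and reference agree.
        while (i + length < len(target)
               and start + length < len(reference)
               and target[i + length] == reference[start + length]):
            length += 1
        tokens.append((start, length))
        i += length
    return tokens


def decompress(tokens: List[Token], reference: str) -> str:
    """Rebuild the target from the tokens and the same reference."""
    parts = []
    for token in tokens:
        if isinstance(token, str):
            parts.append(token)
        else:
            start, length = token
            parts.append(reference[start:start + length])
    return "".join(parts)


reference = "ACGTACGTTTGACCA"
target = "ACGTACGATTGACCA"  # differs from the reference by one substitution
tokens = compress(target, reference)
print(tokens)
print(decompress(tokens, reference) == target)
```

    Only the positions where the target departs from the reference cost a literal; long shared stretches collapse into a single (offset, length) pair, which is where the storage saving comes from.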

  18. Compilation of DNA sequences of Escherichia coli (update 1990)

    PubMed Central

    Kröger, Manfred; Wahl, Ralf; Rice, Peter

    1990-01-01

    We have compiled the DNA sequence data for E.coli available from the GENBANK and EMBL data libraries and over a period of several years independently from the literature. This is the second listing replacing and increasing the former listing roughly by one third. After deletion of all detected overlaps a total of 1 248 696 individual bp is found to be determined till the beginning of 1990. This corresponds to a total of 26.46% of the entire E.coli chromosome consisting of about 4,720 kbp. This number may actually be higher by some extra 2% derived from the sequence of lysogenic bacteriophage lambda and various insertion sequences. This compilation is now available in machine readable form from the EMBL data library. PMID:2185457

  19. Denoising DNA deep sequencing data—high-throughput sequencing errors and their correction

    PubMed Central

    Laehnemann, David; Borkhardt, Arndt

    2016-01-01

    Characterizing the errors generated by common high-throughput sequencing platforms and telling true genetic variation from technical artefacts are two interdependent steps, essential to many analyses such as single nucleotide variant calling, haplotype inference, sequence assembly and evolutionary studies. Both random and systematic errors can show a specific occurrence profile for each of the six prominent sequencing platforms surveyed here: 454 pyrosequencing, Complete Genomics DNA nanoball sequencing, Illumina sequencing by synthesis, Ion Torrent semiconductor sequencing, Pacific Biosciences single-molecule real-time sequencing and Oxford Nanopore sequencing. There is a large variety of programs available for error removal in sequencing read data, which differ in the error models and statistical techniques they use, the features of the data they analyse, the parameters they determine from them and the data structures and algorithms they use. We highlight the assumptions they make and for which data types these hold, providing guidance which tools to consider for benchmarking with regard to the data properties. While no benchmarking results are included here, such specific benchmarks would greatly inform tool choices and future software development. The development of stand-alone error correctors, as well as single nucleotide variant and haplotype callers, could also benefit from using more of the knowledge about error profiles and from (re)combining ideas from the existing approaches presented here. PMID:26026159

  20. The most frequent short sequences in non-coding DNA.

    PubMed

    Subirana, Juan A; Messeguer, Xavier

    2010-03-01

    The purpose of this work is to determine the most frequent short sequences in non-coding DNA. They may play a role in maintaining the structure and function of eukaryotic chromosomes. We present a simple method for the detection and analysis of such sequences in several genomes, including Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster and Homo sapiens. We also study two chromosomes of man and mouse with a length similar to the whole genomes of the other species. We provide a list of the most common sequences of 9-14 bases in each genome. As expected, they are present in human Alu sequences. Our programs may also give a graph and a list of their position in the genome. Detection of clusters is also possible. In most cases, these sequences contain few alternating regions. Their intrinsic structure and their influence on nucleosome formation are not known. In particular, we have found new features of short sequences in C. elegans, which are distributed in heterogeneous clusters. They appear as punctuation marks in the chromosomes. Such clusters are not found in either A. thaliana or D. melanogaster. We discuss the possibility that they play a role in centromere function and homolog recognition in meiosis. PMID:19966278
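
    Finding the most frequent short sequences of a given length amounts to counting k-mers. The following is a minimal sketch under stated assumptions: it is not the authors' program, the function name is hypothetical, overlapping occurrences are counted, and only one strand is scanned.

```python
from collections import Counter
from typing import List, Tuple


def most_frequent_kmers(sequence: str, k: int, top: int = 5) -> List[Tuple[str, int]]:
    """Return the `top` most common overlapping k-mers with their counts."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return counts.most_common(top)


# Toy input (hypothetical); the abstract reports results for lengths of 9-14 bases.
print(most_frequent_kmers("ATATATAT", k=2, top=1))
```

    For whole genomes, a dictionary of all k-mers at k = 9-14 can become large, so a practical tool would likely need a more memory-conscious counting scheme.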

  1. Maternal Plasma DNA and RNA Sequencing for Prenatal Testing.

    PubMed

    Tamminga, Saskia; van Maarle, Merel; Henneman, Lidewij; Oudejans, Cees B M; Cornel, Martina C; Sistermans, Erik A

    2016-01-01

    Cell-free DNA (cfDNA) testing has recently become indispensable in diagnostic testing and screening. In the prenatal setting, this type of testing is often called noninvasive prenatal testing (NIPT). With a number of techniques, using either next-generation sequencing or single nucleotide polymorphism-based approaches, fetal cfDNA in maternal plasma can be analyzed to screen for rhesus D genotype, common chromosomal aneuploidies, and increasingly for testing other conditions, including monogenic disorders. With regard to screening for common aneuploidies, challenges arise when implementing NIPT in current prenatal settings. Depending on the method used (targeted or nontargeted), chromosomal anomalies other than trisomy 21, 18, or 13 can be detected, either of fetal or maternal origin, also referred to as unsolicited or incidental findings. For various biological reasons, there is a small chance of having either a false-positive or false-negative NIPT result, or no result, also referred to as a "no-call." Both pre- and posttest counseling for NIPT should include discussing potential discrepancies. Since NIPT remains a screening test, a positive NIPT result should be confirmed by invasive diagnostic testing (either by chorionic villus biopsy or by amniocentesis). As the scope of NIPT is widening, professional guidelines need to discuss the ethics of what to offer and how to offer. In this review, we discuss the current biochemical, clinical, and ethical challenges of cfDNA testing in the prenatal setting and its future perspectives including novel applications that target RNA instead of DNA. PMID:27117661

  2. Next generation sequencing of DNA-launched Chikungunya vaccine virus.

    PubMed

    Hidajat, Rachmat; Nickols, Brian; Forrester, Naomi; Tretyakova, Irina; Weaver, Scott; Pushko, Peter

    2016-03-01

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3' untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. PMID:26855330

  3. Electromechanical Signatures for DNA Sequencing through a Mechanosensitive Nanopore.

    PubMed

    Farimani, A Barati; Heiranian, M; Aluru, N R

    2015-02-19

    Biological nanopores have been extensively used for DNA base detection since these pores are widely available and tunable through mutations. Distinguishing bases of nucleic acids by passing them through nanopores has so far primarily relied on electrical signals, specifically ionic currents through the nanopores. However, the low signal-to-noise ratio makes detection of ionic currents difficult. In this study, we show that the initially closed mechanosensitive channel of large conductance (MscL) protein pore opens for single-stranded DNA (ssDNA) translocation under an applied electric field. As each nucleotide translocates through the pore, a unique mechanical signal is observed: specifically, the tension in the membrane containing the MscL pore is different for each nucleotide. In addition to the membrane tension, we found that the ionic current is also different for the four nucleotide types. The initially closed MscL adapts its opening for nucleotide translocation due to the flexibility of the pore. This unique operation of MscL provides single nucleotide resolution in both electrical and mechanical signals. Finally, we also show that the speed of DNA translocation is roughly 1 order of magnitude slower in MscL compared to Mycobacterium smegmatis porin A (MspA), suggesting MscL to be an attractive protein pore for DNA sequencing. PMID:26262481

  4. Complete genome sequence of mitochondrial DNA (mtDNA) of Chlorella sorokiniana.

    PubMed

    Orsini, Massimiliano; Costelli, Cristina; Malavasi, Veronica; Cusano, Roberto; Concas, Alessandro; Angius, Andrea; Cao, Giacomo

    2016-01-01

    The complete sequence of the mitochondrial genome of the Chlorella sorokiniana strain (SAG 111-8 k) is presented in this work. Within the Chlorella genus, it represents the second species with a completely sequenced and annotated mitochondrial genome (GenBank accession no. KM241869). The genome consists of circular chromosomes of 52,528 bp and encodes a total of 31 protein coding genes, 3 rRNAs and 26 tRNAs. The overall AT content of the C. sorokiniana mtDNA is 70.89%, while the coding sequence is of 97.4%. PMID:25186028

  5. Renewable Microcolumns for Automated DNA Purification and Flow-through Amplification: From Sediment Samples through Polymerase Chain Reaction

    SciTech Connect

    Bruckner-Lea, Cindy J.; Tsukuda, Toyoko; Dockendorff, Brian P.; Follansbee, James C.; Kingsley, Mark T.; Ocampo, Catherine O.; Stults, Jennie R.; Chandler, Darrell P.

    2001-12-01

    There is an increasing need for field-portable systems for the detection and characterization of microorganisms in the environment. Nucleic acids analysis is frequently the method of choice for discriminating between bacteria in complex systems, but standard protocols are difficult to automate and current microfluidic devices are not configured specifically for environmental sample analysis. In this report, we describe the development of an integrated DNA purification and PCR amplification system and demonstrate its use for the automated purification and amplification of Geobacter chapelli DNA (genomic DNA or plasmid targets) from sediments. The system includes renewable separation columns for the automated capture and release of microparticle purification matrices, and can be easily reprogrammed for new separation chemistries and sample types. The DNA extraction efficiency for the automated system ranged from 3 to 25 percent, depending on the length and concentration of the DNA target. The system was more efficient than batch capture methods for the recovery of dilute genomic DNA even though the reagent volumes were smaller than required for the batch procedure. The automated DNA concentration and purification module was coupled to a flow-through, Peltier-controlled DNA amplification chamber, and used to successfully purify and amplify genomic and plasmid DNA from sediment extracts. Cleaning protocols were also developed to allow reuse of the integrated sample preparation system, including the flow-through PCR tube.

  6. Using Synthetic Nanopores for Single-Molecule Analyses: Detecting SNPs, Trapping DNA Molecules, and the Prospects for Sequencing DNA

    ERIC Educational Resources Information Center

    Dimitrov, Valentin V.

    2009-01-01

    This work focuses on studying properties of DNA molecules and DNA-protein interactions using synthetic nanopores, and it examines the prospects of sequencing DNA using synthetic nanopores. We have developed a method for discriminating between alleles that uses a synthetic nanopore to measure the binding of a restriction enzyme to DNA. There exists…

  7. Automated microfluidic DNA/RNA extraction with both disposable and reusable components

    NASA Astrophysics Data System (ADS)

    Kim, Jungkyu; Johnson, Michael; Hill, Parker; Sonkul, Rahul S.; Kim, Jongwon; Gale, Bruce K.

    2012-01-01

    An automated microfluidic nucleic acid extraction system was fabricated with a multilayer polydimethylsiloxane (PDMS) structure that consists of sample wells, microvalves, a micropump and a disposable microfluidic silica cartridge. Both the microvalves and micropump structures were fabricated in a single layer and are operated pneumatically using a 100 µm PDMS membrane. To fabricate the disposable microfluidic silica cartridge, two-cavity structures were made in a PDMS replica to fit the stacked silica membranes. A handheld controller for the microvalves and pumps was developed to enable system automation. With purified ribonucleic acid (RNA), whole blood and E. coli samples, the automated microfluidic nucleic acid extraction system was validated with a guanidine-based solid phase extraction procedure. An extraction efficiency of ~90% for deoxyribonucleic acid (DNA) and ~54% for RNA was obtained in 12 min from whole blood and E. coli samples, respectively. In addition, the same quantity and quality of extracted DNA was confirmed by polymerase chain reaction (PCR) amplification. The PCR also presented the appropriate amplification and melting profiles. Automated, programmable fluid control and physical separation of the reusable components and the disposable components significantly decrease the assay time and manufacturing cost and increase the flexibility and compatibility of the system with downstream components.

  8. Isolation and analysis of high quality nuclear DNA with reduced organellar DNA for plant genome sequencing and resequencing

    PubMed Central

    2011-01-01

    Background: High throughput sequencing (HTS) technologies have revolutionized the field of genomics by drastically reducing the cost of sequencing, making it feasible for individual labs to sequence or resequence plant genomes. Obtaining high quality, high molecular weight DNA from plants poses significant challenges due to the high copy number of chloroplast and mitochondrial DNA, as well as high levels of phenolic compounds and polysaccharides. Multiple methods have been used to isolate DNA from plants; the CTAB method is commonly used to isolate total cellular DNA from plants that contain nuclear DNA, as well as chloroplast and mitochondrial DNA. Alternatively, DNA can be isolated from nuclei to minimize chloroplast and mitochondrial DNA contamination. Results: We describe optimized protocols for isolation of nuclear DNA from eight different plant species encompassing both monocot and eudicot species. These protocols use nuclei isolation to minimize chloroplast and mitochondrial DNA contamination. We also developed a protocol to determine the number of chloroplast and mitochondrial DNA copies relative to the nuclear DNA using quantitative real time PCR (qPCR). We compared DNA isolated from nuclei to total cellular DNA isolated with the CTAB method. As expected, DNA isolated from nuclei consistently yielded nuclear DNA with fewer chloroplast and mitochondrial DNA copies, as compared to the total cellular DNA prepared with the CTAB method. This protocol will allow for analysis of the quality and quantity of nuclear DNA before starting a plant whole genome sequencing or resequencing experiment. Conclusions: Extracting high quality, high molecular weight nuclear DNA in plants has the potential to be a bottleneck in the era of whole genome sequencing and resequencing. The methods that are described here provide a framework for researchers to extract and quantify nuclear DNA in multiple types of plants. PMID:21599914

  9. Base sequence effects on interactions of aromatic mutagens with DNA

    SciTech Connect

    Geacintov, N.E.

    1992-09-30

    The chemical binding of bulky, mutagenic and carcinogenic polynuclear aromatic compounds to certain base-sequences in genomic DNA is known to inhibit DNA replication, and to induce mutations and cancer. In particular, sequences that contain multiple consecutive guanines appear to be hot spots of mutation. The objectives of this research are to determine how the base sequence around the mutagen-modified target bases influences the local DNA conformation and gives rise to mispairing of bases, or deletions, near the lesion. Oligonucleotides containing one, two, or three guanines were synthesized and chemically reacted with the mutagen anti-7,8-dihydroxy-9,10-epoxy-benzo(a)pyrene (BPDE), one of the most mutagenic and tumorigenic metabolites of benzo(a)pyrene. Adducts are formed in which only one of the guanines is modified by trans or cis addition to the exocyclic amino group. The BPDE-oligonucleotides are separated chromatographically, and the site of modification is established by Maxam-Gilbert high resolution gel electrophoresis techniques. The thermodynamic properties of duplexes using complementary, or partially complementary strands were examined. In the latter, the base opposite the modified guanine was varied in order to investigate the probability of mispairing of the modified G with A, T and G. The successful synthesis of stereospecific and site-specific mutagen-oligonucleotide adducts opens new possibilities for correlating adduct structure-biological activity relationships, and may thus lead to a better understanding of base-sequence effects in mutagenesis induced by energy-related bulky polynuclear aromatic chemicals.

  10. DNA Targeting Sequence Improves Magnetic Nanoparticle-Based Plasmid DNA Transfection Efficiency in Model Neurons

    PubMed Central

    Vernon, Matthew M.; Dean, David A.; Dobson, Jon

    2015-01-01

    Efficient non-viral plasmid DNA transfection of most stem cells, progenitor cells and primary cell lines currently presents an obstacle for many applications within gene therapy research. From a standpoint of efficiency and cell viability, magnetic nanoparticle-based DNA transfection is a promising gene vectoring technique because it has demonstrated rapid and improved transfection outcomes when compared to alternative non-viral methods. Recently, our research group introduced oscillating magnet arrays that resulted in further improvements to this novel plasmid DNA (pDNA) vectoring technology. Continued improvements to nanomagnetic transfection techniques have focused primarily on magnetic nanoparticle (MNP) functionalization and transfection parameter optimization: cell confluence, growth media, serum starvation, magnet oscillation parameters, etc. Noting that none of these parameters can assist in the nuclear translocation of delivered pDNA following MNP-pDNA complex dissociation in the cell’s cytoplasm, inclusion of a cassette feature for pDNA nuclear translocation is theoretically justified. In this study incorporation of a DNA targeting sequence (DTS) feature in the transfecting plasmid improved transfection efficiency in model neurons, presumably from increased nuclear translocation. This observation became most apparent when comparing the response of the dividing SH-SY5Y precursor cell to the non-dividing and differentiated SH-SY5Y neuroblastoma cells. PMID:26287182

  11. SSR_pipeline--computer software for the identification of microsatellite sequences from paired-end Illumina high-throughput DNA sequence data

    USGS Publications Warehouse

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (SSRs; for example, microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains three analysis modules along with a fourth control module that can be used to automate analyses of large volumes of data. The modules are used to (1) identify the subset of paired-end sequences that pass quality standards, (2) align paired-end reads into a single composite DNA sequence, and (3) identify sequences that possess microsatellites conforming to user specified parameters. Each of the three separate analysis modules also can be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc). All modules are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, Windows). The program suite relies on a compiled Python extension module to perform paired-end alignments. Instructions for compiling the extension from source code are provided in the documentation. Users who do not have Python installed on their computers or who do not have the ability to compile software also may choose to download packaged executable files. These files include all Python scripts, a copy of the compiled extension module, and a minimal installation of Python in a single binary executable. See program documentation for more information.
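
    The third analysis step described above (finding sequences that contain microsatellites matching user-specified parameters) can be illustrated with a minimal sketch. This is not the SSR_pipeline code; the function name find_ssrs and the parameters min_unit, max_unit and min_repeats are hypothetical names chosen for illustration, and a simple regular expression stands in for whatever search the real program performs.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for each perfect tandem repeat.

    Hypothetical helper, not part of SSR_pipeline. The lazy quantifier
    makes the regex prefer the shortest repeating unit at each position.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            # Skip homopolymer runs such as AAAAAAAA reported as (AA)n.
            continue
        repeats = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, repeats))
    return hits

# Toy read containing five tandem copies of the dinucleotide AC.
print(find_ssrs("TTGACACACACACAGGT"))
```

    In this toy read the repeat starts at index 3 and the unit AC occurs five times in a row. A real pipeline would additionally apply the quality filtering and paired-end merging steps (1) and (2) before searching.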

  12. DNA sequence determinants controlling affinity, stability and shape of DNA complexes bound by the nucleoid protein Fis

    DOE PAGESBeta

    Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; Johnson, Reid C.; Leng, Fenfei

    2016-03-09

    The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.

  13. Long DNA sequences and large data sets: investigating the Quaternary via ancient DNA

    NASA Astrophysics Data System (ADS)

    Hofreiter, Michael

    2008-12-01

    Progress in technical development has allowed piecing together increasingly long DNA sequences from subfossil remains of both extinct and extant species. At the same time, more and more species are analyzed on the population level, leading to a better understanding of population dynamics over time. Finally, new sequencing techniques have allowed targeting complete nuclear genomes of extinct species. The sequences obtained yield insights into a variety of research fields. First, phylogenetic relationships can be resolved with much greater accuracy and it becomes possible to date divergence events of species during and before the Quaternary. Second, large data sets in population genetics facilitate the assessment of changes in genetic diversity over time, an approach that has substantially revised our views about phylogeographic patterns and population dynamics. In the future, the combination of population genetics with long DNA sequences, e.g. complete mitochondrial (mt) DNA genomes, should allow much more precise estimates of population size changes. This will enable us to make inferences about, and hopefully understand, the causes of faunal turnover and extinctions during the Quaternary. Third, with regard to the nuclear genome, complete genes and genomes can now be sequenced and studied with regard to their function, revealing insights about the numerous traits of extinct species that are not preserved in the fossil record.

  14. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1995-03-21

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 11 figures.

  15. cDNA encoding a polypeptide including a hevein sequence

    SciTech Connect

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    2000-07-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  16. Sequence Heterogeneity Accelerates Protein Search for Targets on DNA

    NASA Astrophysics Data System (ADS)

    Shvets, Alexey; Kolomeisky, Anatoly

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry and heterogeneity of a genome. The work was supported by the Welch Foundation (Grant C-1559), by the NSF (Grant CHE-1360979), and by the Center for Theoretical Biological Physics sponsored by the NSF (Grant PHY-1427654).

  17. Random-breakage mapping method applied to human DNA sequences

    NASA Technical Reports Server (NTRS)

    Lobrich, M.; Rydberg, B.; Cooper, P. K.; Chatterjee, A. (Principal Investigator)

    1996-01-01

    The random-breakage mapping method [Game et al. (1990) Nucleic Acids Res., 18, 4453-4461] was applied to DNA sequences in human fibroblasts. The methodology involves NotI restriction endonuclease digestion of DNA from irradiated cells, followed by pulsed-field gel electrophoresis, Southern blotting and hybridization with DNA probes recognizing the single copy sequences of interest. The Southern blots show a band for the unbroken restriction fragments and a smear below this band due to radiation induced random breaks. This smear pattern contains two discontinuities in intensity at positions that correspond to the distance of the hybridization site to each end of the restriction fragment. By analyzing the positions of those discontinuities we confirmed the previously mapped position of the probe DXS1327 within a NotI fragment on the X chromosome, thus demonstrating the validity of the technique. We were also able to position the probes D21S1 and D21S15 with respect to the ends of their corresponding NotI fragments on chromosome 21. A third chromosome 21 probe, D21S11, has previously been reported to be close to D21S1, although an uncertainty about a second possible location existed. Since both probes D21S1 and D21S11 hybridized to a single NotI fragment and yielded a similar smear pattern, this uncertainty is removed by the random-breakage mapping method.
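
    Why the smear should show two discontinuities can be seen in a toy calculation. The sketch below is an illustration only, not the authors' analysis: it assumes exactly one break per fragment at every possible position, and the function names, fragment length (1000 units) and probe position (300 units from one end) are made up for the example.

```python
def probe_fragment_size(length, probe, brk):
    """Size of the piece that still carries the probe after one break.

    A break left of the probe leaves the right-hand piece (length - brk);
    a break right of the probe leaves the left-hand piece (brk).
    """
    return length - brk if brk < probe else brk

def size_histogram(length, probe, bin_width):
    """Count probe-carrying pieces per size bin over all single-break positions."""
    counts = {}
    for brk in range(1, length):
        if brk == probe:
            continue  # a break exactly at the probe site is ignored in this toy model
        size = probe_fragment_size(length, probe, brk)
        bin_start = (size // bin_width) * bin_width
        counts[bin_start] = counts.get(bin_start, 0) + 1
    return counts

hist = size_histogram(length=1000, probe=300, bin_width=100)
for bin_start in sorted(hist):
    print(bin_start, hist[bin_start])
```

    Under these assumptions no probe-carrying piece is smaller than 300 units, and the count per bin roughly doubles at 700 units, which is the distance from the probe to the other end. The two steps therefore sit at the probe's distance to each end of the fragment, as the abstract states; multiple breaks per fragment would blur but not move them.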

  18. Isolation of Human Genomic DNA Sequences with Expanded Nucleobase Selectivity.

    PubMed

    Rathi, Preeti; Maurer, Sara; Kubik, Grzegorz; Summerer, Daniel

    2016-08-10

    We report the direct isolation of user-defined DNA sequences from the human genome with programmable selectivity for both canonical and epigenetic nucleobases. This is enabled by the use of engineered transcription-activator-like effectors (TALEs) as DNA major groove-binding probes in affinity enrichment. The approach provides the direct quantification of 5-methylcytosine (5mC) levels at single genomic nucleotide positions in a strand-specific manner. We demonstrate the simple, multiplexed typing of a variety of epigenetic cancer biomarker 5mC with custom TALE mixes. Compared to antibodies as the most widely used affinity probes for 5mC analysis, i.e., employed in the methylated DNA immunoprecipitation (MeDIP) protocol, TALEs provide superior sensitivity, resolution and technical ease. We engineer a range of size-reduced TALE repeats and establish full selectivity profiles for their binding to all five human cytosine nucleobases. These provide insights into their nucleobase recognition mechanisms and reveal the ability of TALEs to isolate genomic target sequences with selectivity for single 5-hydroxymethylcytosine and, in combination with sodium borohydride reduction, single 5-formylcytosine nucleobases. PMID:27429302

  19. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  20. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 12 figs.

  1. A Simulation of DNA Sequencing Utilizing 3M Post-It[R] Notes

    ERIC Educational Resources Information Center

    Christensen, Doug

    2009-01-01

    This article presents an inexpensive and equipment-free approach to teaching the technical aspects of DNA sequencing. The activity described requires an instructor with a familiarity with DNA sequencing technology but provides a straightforward method of teaching the technical aspects of sequencing in the absence of expensive sequencing equipment. The final sequence…

  2. Genetic variability of Taenia saginata inferred from mitochondrial DNA sequences.

    PubMed

    Rostami, Sima; Salavati, Reza; Beech, Robin N; Babaei, Zahra; Sharbatkhori, Mitra; Harandi, Majid Fasihi

    2015-04-01

    Taenia saginata is an important tapeworm, infecting humans in many parts of the world. The present study was undertaken to identify inter- and intraspecific variation of T. saginata isolated from cattle in different parts of Iran using two mitochondrial genes, CO1 and 12S rRNA. Up to 105 bovine specimens of T. saginata were collected from 20 slaughterhouses in three provinces of Iran. DNA was extracted from the metacestode Cysticercus bovis. After PCR amplification, sequencing of the CO1 and 12S rRNA genes was carried out, and two phylogenetic analyses of the sequence data were generated by Bayesian inference on CO1 and 12S rRNA sequences. Sequence analyses of the CO1 and 12S rRNA genes showed 11 and 29 representative profiles, respectively. The level of pairwise nucleotide variation between individual haplotypes of the CO1 gene was 0.3-2.4%, while the overall nucleotide variation among all 11 haplotypes was 4.6%. For 12S rRNA sequence data, the level of pairwise nucleotide variation was 0.2-2.5% and the overall nucleotide variation was determined as 5.8% among the 29 haplotypes of the 12S rRNA gene. Considerable genetic diversity was found in both mitochondrial genes, particularly in the 12S rRNA gene. PMID:25687521
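
    A range of pairwise nucleotide variation such as the one reported above can be computed from aligned haplotypes as sketched below. The abstract does not state which distance measure was used; this sketch assumes the simple uncorrected proportion of differing sites (p-distance), and the three 10-base haplotypes are invented toy data, not sequences from the study.

```python
from itertools import combinations

def p_distance(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * diffs / len(seq_a)

def pairwise_range(haplotypes):
    """Smallest and largest p-distance over all pairs of haplotypes."""
    dists = [p_distance(a, b) for a, b in combinations(haplotypes.values(), 2)]
    return min(dists), max(dists)

# Invented toy haplotypes (assumption: already aligned, no gaps).
toy = {"H1": "ACGTACGTAC", "H2": "ACGTACGTAT", "H3": "ACGAACGTAT"}
print(pairwise_range(toy))
```

    Here H1 and H2 differ at one site, H2 and H3 at one site, and H1 and H3 at two sites, so the pairwise range spans 10% to 20% for these deliberately short sequences.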

  3. 3D-dynamic representation of DNA sequences.

    PubMed

    Wąż, Piotr; Bielińska-Wąż, Dorota

    2014-03-01

    A new 3D graphical representation of DNA sequences is introduced. This representation is called 3D-dynamic representation. It is a generalization of the 2D-dynamic representation. The sequences are represented by sets of "material points" in the 3D space. The resulting 3D-dynamic graphs are treated as rigid bodies. The descriptors characterizing the graphs are analogous to the ones used in the classical dynamics. The classification diagrams derived from this representation are presented and discussed. Due to the third dimension, "the history of the graph" can be recognized graphically because the 3D-dynamic graph does not overlap with itself. Specific parts of the graphs correspond to specific parts of the sequence. This feature is essential for graphical comparisons of the sequences. Numerically, both 2D and 3D approaches are of high quality. In particular, a difference in a single base between two sequences can be identified and correctly described (one can identify which base) by both 2D and 3D methods. PMID:24567158

  4. Phylogenetic inference of Indian malaria vectors from multilocus DNA sequences.

    PubMed

    Dixit, Jyotsana; Srivastava, Hemlata; Sharma, Meenu; Das, Manoj K; Singh, O P; Raghavendra, K; Nanda, Nutan; Dash, Aditya P; Saksena, D N; Das, Aparup

    2010-08-01

    Inferences on the taxonomic positions, phylogenetic interrelationships and divergence times among closely related species of medical importance are essential to understand evolutionary patterns among species, on the basis of which disease control measures could be devised. In this respect, malaria is one of the important mosquito-borne diseases of tropical and sub-tropical parts of the globe. The taxonomic status of malaria vectors has so far been documented based on morphological, cytological and a few molecular genetic features. However, the use of multilocus DNA sequences in phylogenetic inference is still rare. India contains one of the richest resources of mosquito species diversity, but little molecular taxonomic information is available on Indian malaria vectors. We herewith utilized the whole genome sequence information of An. gambiae to amplify and sequence three orthologous nuclear genetic regions in six Indian malaria vector species (An. culicifacies, An. minimus, An. sundaicus, An. fluviatilis, An. annularis and An. stephensi). Further, we utilized the previously published DNA sequence information on the COII and ITS2 genes in all the six species, bringing the total number of loci to five. Multilocus molecular phylogenetic study of Indian anophelines and An. gambiae was conducted at each individual genetic region using Neighbour Joining (NJ), Maximum Likelihood (ML), Maximum Parsimony (MP) and Bayesian approaches. Although tree topologies with the COII and ITS2 genes were similar, the other three genetic regions did not yield similar tree topologies. In general, the reconstructed phylogenetic status of Indian malaria vectors follows the pattern based on morphological and cytological classifications, which was reconfirmed with the COII and ITS2 genetic regions. Further, divergence times based on COII gene sequences were estimated among the seven Anopheles species which corroborate the earlier hypothesis on the radiation of different species of the Anopheles

  5. Compilation of DNA sequences of Escherichia coli (update 1993).

    PubMed Central

    Kröger, M; Wahl, R; Rice, P

    1993-01-01

    We have compiled the DNA sequence data for E. coli available from the GENBANK and EMBL data libraries and, over a period of several years, independently from the literature. This is the fifth listing, replacing and substantially extending the former listings. However, in order to save space, this printed version contains DNA sequence information only if it is publicly available in electronic form. The complete compilation, including a full set of genetic map data and the E. coli protein index, can be obtained in machine-readable form from the EMBL data library (ECD release 15) as a part of the CD-ROM issue of the EMBL sequence database, released and updated every three months. After deletion of all detected overlaps, a total of 2,353,635 individual bp had been determined by the end of April 1993. This corresponds to 49.87% of the entire E. coli chromosome, which consists of about 4,720 kbp. This number may actually be higher by 9161 bp derived from other strains of E. coli. PMID:8332520
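
    The coverage figure follows directly from the two numbers quoted in the abstract. As a quick check, taking the chromosome size as exactly 4,720,000 bp (the abstract says "about 4,720 kbp", so the last digits are only indicative):

```latex
\[
\frac{2{,}353{,}635\ \text{bp}}{4{,}720{,}000\ \text{bp}} \approx 0.49865 \approx 49.87\%
\]
\[
\frac{(2{,}353{,}635 + 9{,}161)\ \text{bp}}{4{,}720{,}000\ \text{bp}}
  = \frac{2{,}362{,}796}{4{,}720{,}000} \approx 50.06\%
\]
```

    The second line shows the effect of counting the additional 9161 bp from other strains: under the same assumed chromosome size, coverage would just pass one half.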

  6. Automated property optimization via ab initio O(N) elongation method: Application to (hyper-)polarizability in DNA

    NASA Astrophysics Data System (ADS)

    Orimoto, Yuuichi; Aoki, Yuriko

    2016-07-01

    An automated property optimization method was developed based on the ab initio O(N) elongation (ELG) method and applied to the optimization of nonlinear optical (NLO) properties in DNA as a first test. The ELG method mimics a polymerization reaction on a computer, and the reaction terminal of a starting cluster is attacked by monomers sequentially to elongate the electronic structure of the system by solving in each step a limited space including the terminal (localized molecular orbitals at the terminal) and monomer. The ELG-finite field (ELG-FF) method for calculating (hyper-)polarizabilities was used as the engine program of the optimization method, and it was found to show linear scaling efficiency while maintaining high computational accuracy for a random sequenced DNA model. Furthermore, the self-consistent field convergence was significantly improved by using the ELG-FF method compared with a conventional method, and it can lead to more feasible NLO property values in the FF treatment. The automated optimization method successfully chose an appropriate base pair from four base pairs (A, T, G, and C) for each elongation step according to an evaluation function. From test optimizations for the first order hyper-polarizability (β) in DNA, a substantial difference was observed depending on optimization conditions between "choose-maximum" (choose a base pair giving the maximum β for each step) and "choose-minimum" (choose a base pair giving the minimum β). In contrast, there was an ambiguous difference between these conditions for optimizing the second order hyper-polarizability (γ) because of the small absolute value of γ and the limitation of numerical differential calculations in the FF method. It can be concluded that the ab initio level property optimization method introduced here can be an effective step towards an advanced computer aided material design method as long as the numerical limitation of the FF method is taken into account.
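
    The finite-field (FF) idea referred to above, obtaining (hyper-)polarizabilities by numerically differentiating the energy with respect to an applied field, can be sketched with central differences. This is a generic illustration, not the ELG-FF program: it assumes the common Taylor convention E(F) = E0 - mu*F - alpha*F^2/2 - beta*F^3/6 - gamma*F^4/24 along a single field direction, and toy_energy with its coefficients is an invented stand-in for a real quantum-chemical energy calculation.

```python
def finite_field_beta_gamma(energy, field):
    """Estimate beta and gamma from energies at fields 0, +/-F and +/-2F.

    Assumes E(F) = E0 - mu*F - alpha*F**2/2 - beta*F**3/6 - gamma*F**4/24,
    so beta = -d3E/dF3 and gamma = -d4E/dF4 at F = 0.
    """
    e0 = energy(0.0)
    ep1, em1 = energy(field), energy(-field)
    ep2, em2 = energy(2 * field), energy(-2 * field)
    # Central-difference formulas for the third and fourth derivatives.
    beta = -(ep2 - 2 * ep1 + 2 * em1 - em2) / (2 * field**3)
    gamma = -(ep2 - 4 * ep1 + 6 * e0 - 4 * em1 + em2) / field**4
    return beta, gamma

def toy_energy(f, e0=-1.0, mu=0.5, alpha=2.0, beta=30.0, gamma=400.0):
    """Invented model energy; a real run would call an electronic-structure code."""
    return e0 - mu * f - alpha * f**2 / 2 - beta * f**3 / 6 - gamma * f**4 / 24

beta_est, gamma_est = finite_field_beta_gamma(toy_energy, field=0.01)
print(beta_est, gamma_est)
```

    For this quartic model the formulas recover the input values of 30 and 400 up to rounding error. Because the gamma estimate divides small energy differences by F^4, rounding and convergence noise in the energies is strongly amplified, which is consistent with the numerical limitation of the FF treatment that the abstract mentions for the second-order hyperpolarizability.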

  7. Automated property optimization via ab initio O(N) elongation method: Application to (hyper-)polarizability in DNA.

    PubMed

    Orimoto, Yuuichi; Aoki, Yuriko

    2016-07-14

    An automated property optimization method was developed based on the ab initio O(N) elongation (ELG) method and applied to the optimization of nonlinear optical (NLO) properties in DNA as a first test. The ELG method mimics a polymerization reaction on a computer, and the reaction terminal of a starting cluster is attacked by monomers sequentially to elongate the electronic structure of the system by solving in each step a limited space including the terminal (localized molecular orbitals at the terminal) and monomer. The ELG-finite field (ELG-FF) method for calculating (hyper-)polarizabilities was used as the engine program of the optimization method, and it was found to show linear scaling efficiency while maintaining high computational accuracy for a random sequenced DNA model. Furthermore, the self-consistent field convergence was significantly improved by using the ELG-FF method compared with a conventional method, and it can lead to more feasible NLO property values in the FF treatment. The automated optimization method successfully chose an appropriate base pair from four base pairs (A, T, G, and C) for each elongation step according to an evaluation function. From test optimizations for the first order hyper-polarizability (β) in DNA, a substantial difference was observed depending on optimization conditions between "choose-maximum" (choose a base pair giving the maximum β for each step) and "choose-minimum" (choose a base pair giving the minimum β). In contrast, there was an ambiguous difference between these conditions for optimizing the second order hyper-polarizability (γ) because of the small absolute value of γ and the limitation of numerical differential calculations in the FF method. It can be concluded that the ab initio level property optimization method introduced here can be an effective step towards an advanced computer aided material design method as long as the numerical limitation of the FF method is taken into account. PMID:27421397
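
    The per-step selection rule described above ("choose-maximum" or "choose-minimum" among the four base pairs) can be sketched as a greedy loop. This is only an illustration: `greedy_elongate` and `toy_property` are hypothetical names, and the toy evaluation function stands in for the ELG-FF (hyper-)polarizability calculation, which is not reproduced here.

```python
from typing import Callable

BASE_PAIRS = ["A", "T", "G", "C"]  # the four candidates named in the abstract


def greedy_elongate(n_steps: int,
                    evaluate: Callable[[str], float],
                    mode: str = "max",
                    start: str = "") -> str:
    """Grow a sequence one unit at a time, keeping the best candidate per step.

    `evaluate` is a stand-in for the real property calculation (assumption).
    mode="max" mimics "choose-maximum"; mode="min" mimics "choose-minimum".
    """
    if mode not in ("max", "min"):
        raise ValueError("mode must be 'max' or 'min'")
    pick = max if mode == "max" else min
    seq = start
    for _ in range(n_steps):
        # Try every candidate at the current terminal and keep the best one.
        seq += pick(BASE_PAIRS, key=lambda bp: evaluate(seq + bp))
    return seq


def toy_property(seq: str) -> float:
    """Hypothetical additive property used only to exercise the loop."""
    weights = {"A": 1.0, "T": 2.0, "G": 3.0, "C": 4.0}
    return sum(weights[b] for b in seq)
```

    A greedy choice of this kind fixes each unit before the next is added, so it does not guarantee a globally optimal sequence.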

  8. Discovering Motifs in Ranked Lists of DNA Sequences

    PubMed Central

    Eden, Eran; Lipson, Doron; Yogev, Sivan; Yakhini, Zohar

    2007-01-01

    Computational methods for discovery of sequence elements that are enriched in a target set compared with a background set are fundamental in molecular biology research. One example is the discovery of transcription factor binding motifs that are inferred from ChIP–chip (chromatin immuno-precipitation on a microarray) measurements. Several major challenges in sequence motif discovery still require consideration: (i) the need for a principled approach to partitioning the data into target and background sets; (ii) the lack of rigorous models and of an exact p-value for measuring motif enrichment; (iii) the need for an appropriate framework for accounting for motif multiplicity; (iv) the tendency, in many of the existing methods, to report presumably significant motifs even when applied to randomly generated data. In this paper we present a statistical framework for discovering enriched sequence elements in ranked lists that resolves these four issues. We demonstrate the implementation of this framework in a software application, termed DRIM (discovery of rank imbalanced motifs), which identifies sequence motifs in lists of ranked DNA sequences. We applied DRIM to ChIP–chip and CpG methylation data and obtained the following results. (i) Identification of 50 novel putative transcription factor (TF) binding sites in yeast ChIP–chip data. The biological function of some of them was further investigated to gain new insights on transcription regulation networks in yeast. For example, our discoveries enable the elucidation of the network of the TF ARO80. Another finding concerns a systematic TF binding enhancement to sequences containing CA repeats. (ii) Discovery of novel motifs in human cancer CpG methylation data. Remarkably, most of these motifs are similar to DNA sequence elements bound by the Polycomb complex that promotes histone methylation. Our findings thus support a model in which histone methylation and CpG methylation are mechanistically linked. Overall
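
    One way to test for enrichment in a ranked list without fixing a target/background split in advance is to scan every cutoff and keep the smallest hypergeometric tail probability. The sketch below assumes a 0/1 occurrence vector ordered by rank; the function names are hypothetical, and the returned minimum is not corrected for having tried many cutoffs, so it is not the exact p-value the abstract refers to.

```python
from math import comb


def hypergeom_tail(N: int, B: int, n: int, b: int) -> float:
    """P(X >= b) for X ~ Hypergeometric(N items, B hits in total, n drawn)."""
    upper = min(n, B)
    numer = sum(comb(B, k) * comb(N - B, n - k) for k in range(b, upper + 1))
    return numer / comb(N, n)


def min_hypergeom_score(occurrence: list) -> tuple:
    """Scan all cutoffs of a ranked 0/1 vector.

    occurrence[i] is 1 if the motif occurs in the sequence ranked i-th.
    Returns (smallest tail probability, cutoff where it occurs).
    The minimum is an uncorrected score, not a p-value.
    """
    N, B = len(occurrence), sum(occurrence)
    best_p, best_n, hits = 1.0, 0, 0
    for n in range(1, N + 1):
        hits += occurrence[n - 1]  # motif occurrences among the top n
        p = hypergeom_tail(N, B, n, hits)
        if p < best_p:
            best_p, best_n = p, n
    return best_p, best_n
```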

  9. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens

    PubMed Central

    Shokralla, Shadi; Gibson, Joel F; Nikbakht, Hamid; Janzen, Daniel H; Hallwachs, Winnie; Hajibabaei, Mehrdad

    2014-01-01

    DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large-scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next-generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high-target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next-generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10-mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full-length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16 lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full-length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next-generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content. PMID:24641208
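
    Retrieving per-specimen reads from a pooled run by their 10-mer tags can be sketched as below. The sketch assumes each read starts with an exact, error-free tag; the function name, tag sequences and specimen labels are made up for illustration and are not taken from the study.

```python
from collections import defaultdict

TAG_LEN = 10  # the abstract describes unique 10-mer tags


def demultiplex(reads, tag_to_specimen):
    """Group reads by their leading tag and strip the tag.

    Assumes exact tag matches at the start of each read (no error tolerance).
    Reads with an unknown tag are dropped.
    """
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:TAG_LEN], read[TAG_LEN:]
        specimen = tag_to_specimen.get(tag)
        if specimen is not None:
            bins[specimen].append(insert)
    return dict(bins)


# Hypothetical tags and reads, for illustration only.
example_tags = {"ACGTACGTAC": "specimen_1", "TTGGCCAATT": "specimen_2"}
example_reads = [
    "ACGTACGTAC" + "GGG",
    "TTGGCCAATT" + "AAA",
    "NNNNNNNNNN" + "CCC",  # unknown tag, dropped
    "ACGTACGTAC" + "TTT",
]
```

    The quoted run fraction follows directly from the lane count: two lanes of a 16-lane run is 2/16 = 12.5%.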

  10. Legume genomics: understanding biology through DNA and RNA sequencing

    PubMed Central

    O'Rourke, Jamie A.; Bolon, Yung-Tsi; Bucciarelli, Bruna; Vance, Carroll P.

    2014-01-01

    Background The legume family (Leguminosae) consists of approx. 17 000 species. A few of these species, including, but not limited to, Phaseolus vulgaris, Cicer arietinum and Cajanus cajan, are important dietary components, providing protein for approx. 300 million people worldwide. Additional species, including soybean (Glycine max) and alfalfa (Medicago sativa), are important crops utilized mainly in animal feed. In addition, legumes are important contributors to biological nitrogen, forming symbiotic relationships with rhizobia to fix atmospheric N2 and providing up to 30 % of available nitrogen for the next season of crops. The application of high-throughput genomic technologies including genome sequencing projects, genome re-sequencing (DNA-seq) and transcriptome sequencing (RNA-seq) by the legume research community has provided major insights into genome evolution, genomic architecture and domestication. Scope and Conclusions This review presents an overview of the current state of legume genomics and explores the role that next-generation sequencing technologies play in advancing legume genomics. The adoption of next-generation sequencing and implementation of associated bioinformatic tools has allowed researchers to turn each species of interest into their own model organism. To illustrate the power of next-generation sequencing, an in-depth overview of the transcriptomes of both soybean and white lupin (Lupinus albus) is provided. The soybean transcriptome focuses on analysing seed development in two near-isogenic lines, examining the role of transporters, oil biosynthesis and nitrogen utilization. The white lupin transcriptome analysis examines how phosphate deficiency alters gene expression patterns, inducing the formation of cluster roots. Such studies illustrate the power of next-generation sequencing and bioinformatic analyses in elucidating the gene networks underlying biological processes. PMID:24769535

  11. Retroviral DNA Sequences as a Means for Determining Ancient Diets

    PubMed Central

    Rivera-Perez, Jessica I.; Cano, Raul J.; Narganes-Storde, Yvonne; Chanlatte-Baik, Luis; Toranzos, Gary A.

    2015-01-01

    For ages, specialists from varying fields have studied the diets of the primeval inhabitants of our planet, detecting diet remains in archaeological specimens using a range of morphological and biochemical methods. Recently, metagenomic ancient DNA studies have allowed for the comparison of the fecal and gut microbiomes associated with archaeological specimens from various regions of the world; however, the complex dynamics represented in those microbial communities still remain unclear. Theoretically, as with eukaryotic DNA, the presence of genes from key microbes or enzymes, as well as the presence of DNA from viruses specific to key organisms, may suggest the ingestion of specific diet components. In this study we demonstrate that ancient virus DNA obtained from coprolites also provides information for reconstructing the host’s diet, as inferred from sequences obtained from pre-Columbian coprolites. This represents a novel and reliable approach to determine new components as well as validate the previously suggested diets of extinct cultures and animals. Furthermore, to our knowledge this represents the first description of the eukaryotic viral diversity found in paleofaeces belonging to pre-Columbian cultures. PMID:26660678

  12. Retroviral DNA Sequences as a Means for Determining Ancient Diets.

    PubMed

    Rivera-Perez, Jessica I; Cano, Raul J; Narganes-Storde, Yvonne; Chanlatte-Baik, Luis; Toranzos, Gary A

    2015-01-01

    For ages, specialists from varying fields have studied the diets of the primeval inhabitants of our planet, detecting diet remains in archaeological specimens using a range of morphological and biochemical methods. Recently, metagenomic ancient DNA studies have allowed for the comparison of the fecal and gut microbiomes associated with archaeological specimens from various regions of the world; however, the complex dynamics represented in those microbial communities still remain unclear. Theoretically, as with eukaryotic DNA, the presence of genes from key microbes or enzymes, as well as the presence of DNA from viruses specific to key organisms, may suggest the ingestion of specific diet components. In this study we demonstrate that ancient virus DNA obtained from coprolites also provides information for reconstructing the host's diet, as inferred from sequences obtained from pre-Columbian coprolites. This represents a novel and reliable approach to determine new components as well as validate the previously suggested diets of extinct cultures and animals. Furthermore, to our knowledge this represents the first description of the eukaryotic viral diversity found in paleofaeces belonging to pre-Columbian cultures. PMID:26660678

  13. Graphical representation for DNA sequences via joint diagonalization of matrix pencil.

    PubMed

    Yu, Hong-Jie; Huang, De-Shuang

    2013-05-01

    Graphical representations provide us with a tool allowing visual inspection of the sequences. To visualize and compare different DNA sequences, a novel alignment-free method is proposed in this paper for both graphical representation and similarity analysis of sequences. We introduce a transformation to represent each DNA sequence with a neighboring nucleotide matrix. Then, based on approximate joint diagonalization theory, we transform each DNA primary sequence into a corresponding eigenvalue vector (EVV), which can be considered a numerical characterization of the DNA sequence. Meanwhile, we obtain a graphical representation of the DNA sequence by plotting the EVV in a 2-D plane. Moreover, using k-means, we cluster these feature curves of sequences into several reasonable subclasses. In addition, similarity analyses are performed by computing the distances among the obtained vectors. This approach contains more sequence information, and it analyzes all the involved sequence information jointly rather than separately. A typical dendrogram constructed by this method demonstrates the effectiveness of our approach. PMID:24592449

  14. Entropy and long-range correlations in DNA sequences.

    PubMed

    Melnik, S S; Usatenko, O V

    2014-12-01

    We analyze the structure of DNA molecules of different organisms by using the additive Markov chain approach. Transforming nucleotide sequences into binary strings, we perform statistical analysis of the corresponding "texts". We develop the theory of N-step additive binary stationary ergodic Markov chains and analyze their differential entropy. Supposing that the correlations are weak, we express the conditional probability function of the chain by means of the pair correlation function and represent the entropy as a functional of the pair correlator. Since the model uses two-point correlators instead of the probability of a block occurring, it makes it possible to calculate the entropy of subsequences at much longer distances than with the use of the standard methods. We utilize the obtained analytical result for numerical evaluation of the entropy of coarse-grained DNA texts. We believe that the entropy study can be used for biological classification of living species. PMID:25213853
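
    The first two ingredients mentioned above, a binary coding of the nucleotide text and a two-point (pair) correlator, can be sketched as follows. The purine/pyrimidine mapping is an assumption (the abstract does not say which binary coding was used), the function names are hypothetical, and the entropy functional itself is not reproduced.

```python
def to_binary(seq: str) -> list:
    """Map a nucleotide string to 0/1.

    Assumed coding: purines (A, G) -> 1, pyrimidines (C, T) -> 0.
    """
    return [1 if base in "AG" else 0 for base in seq.upper()]


def pair_correlation(bits: list, r: int) -> float:
    """Empirical two-point correlator K(r) = <a_i * a_(i+r)> - <a>^2."""
    n = len(bits)
    if not 0 < r < n:
        raise ValueError("r must satisfy 0 < r < len(bits)")
    mean = sum(bits) / n
    # Average the product over all pairs separated by distance r.
    pair_mean = sum(bits[i] * bits[i + r] for i in range(n - r)) / (n - r)
    return pair_mean - mean * mean
```

    For a strictly alternating purine/pyrimidine string the correlator alternates in sign with distance, which makes a convenient sanity check.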

  15. In vivo generation of DNA sequence diversity for cellular barcoding

    PubMed Central

    Peikon, Ian D.; Gizatullina, Diana I.; Zador, Anthony M.

    2014-01-01

    Heterogeneity is a ubiquitous feature of biological systems. A complete understanding of such systems requires a method for uniquely identifying and tracking individual components and their interactions with each other. We have developed a novel method of uniquely tagging individual cells in vivo with a genetic ‘barcode’ that can be recovered by DNA sequencing. Our method is a two-component system comprised of a genetic barcode cassette whose fragments are shuffled by Rci, a site-specific DNA invertase. The system is highly scalable, with the potential to generate theoretical diversities in the billions. We demonstrate the feasibility of this technique in Escherichia coli. Currently, this method could be employed to track the dynamics of populations of microbes through various bottlenecks. Advances of this method should prove useful in tracking interactions of cells within a network, and/or heterogeneity within complex biological samples. PMID:25013177

  16. Development of positive control materials for DNA-based detection of cystic fibrosis: Cloning and sequencing of 31 mutations

    SciTech Connect

    Iovannisci, D.; Brown, C.; Winn-Deen, E.

    1994-09-01

    The cloning and sequencing of the gene associated with cystic fibrosis (CF) now provides the opportunity for earlier detection and carrier screening through DNA-based detection schemes. To date, over 300 mutations have been reported to the CF Consortium; however, only 30 mutations have been observed frequently enough world-wide to warrant routine screening. Many of these mutations are not available as cloned material or as established tissue culture cell lines to aid in the development of DNA-based detection assays. We have therefore cloned the 30 most frequently reported mutations, plus the mutation R347H due to its association with male infertility (31 mutations, total). Two approaches were employed: direct PCR amplification, where mutations were available from patient sources, and site-directed PCR mutagenesis of normal genomic DNA to generate the remaining mutations. After amplification, products were cloned into a sequencing vector, bacterial transformants were screened by a novel method (PCR/oligonucleotide ligation assay/sequence-coded separation), and plasmid DNA sequences were determined by automated fluorescent methods on the Applied Biosystems 373A. Mixing of the clones allows the construction of artificial genotypes useful as positive control material for assay validation. A second round of mutagenesis, resulting in the construction of plasmids bearing multiple mutations, will be evaluated for their utility as reagent control materials in kit development.

  17. Automation of Molecular-Based Analyses: A Primer on Massively Parallel Sequencing

    PubMed Central

    Nguyen, Lan; Burnett, Leslie

    2014-01-01

    Recent advances in genetics have been enabled by new genetic sequencing techniques called massively parallel sequencing (MPS) or next-generation sequencing. Through the ability to sequence in parallel hundreds of thousands to millions of DNA fragments, the cost and time required for sequencing has dramatically decreased. There are a number of different MPS platforms currently available and being used in Australia. Although they differ in the underlying technology involved, their overall processes are very similar: DNA fragmentation, adaptor ligation, immobilisation, amplification, sequencing reaction and data analysis. MPS is being used in research, translational and increasingly now also in clinical settings. Common applications include sequencing of whole genomes, whole exomes or targeted genes for disease-causing gene discovery, genetic diagnosis and targeted cancer therapy. Even though the revolution that is occurring with MPS is exciting due to its increasing use, improving and emerging technologies and new applications, significant challenges still exist. Particularly challenging issues are the bioinformatics required for data analysis, interpretation of results and the ethical dilemma of ‘incidental findings’. PMID:25336762

  18. Rapid Quantification of Hepatitis B Virus DNA by Automated Sample Preparation and Real-Time PCR

    PubMed Central

    Stelzl, Evelyn; Muller, Zsofia; Marth, Egon; Kessler, Harald H.

    2004-01-01

    Monitoring of hepatitis B virus (HBV) DNA in serum by molecular methods has become the standard for assessment of the replicative activity of HBV. Several molecular assays for the detection and quantification of HBV DNA have been described. However, they usually lack automated sample preparation. Moreover, those assays, which are based on PCR, are limited by a short dynamic range (2 to 3 log units). In the present study, the use of RealArt HBV LC PCR Reagents in conjunction with automated extraction on the COBAS AMPLIPREP analyzer was evaluated. Members of an HBV proficiency program panel were tested; linearity, interassay, and intra-assay variations were determined. The performance of the assay in a routine clinical laboratory was evaluated with a total of 117 clinical specimens. When members of the HBV proficiency program panel were tested by the new molecular assay, the results were found to be within ±0.5 log unit of the results obtained by reference laboratories. Determination of linearity resulted in a quasilinear curve over more than 6 log units. The interassay variation of the RealArt HBV LC PCR Reagents by use of the automated sample preparation protocol ranged from 16 to 73%, and the intra-assay variation ranged from 9 to 40%. When clinical samples were tested by the new assay with the automated sample preparation protocol and the results were compared with those obtained by the COBAS AMPLICOR HBV MONITOR Test with manual sample preparation, the results for 76% of all samples with positive results by both tests were found to be within ±0.5 log unit and the results for another 18% differed by between 0.5 and 1.0 log unit. In conclusion, the real-time PCR assay with automated sample preparation proved to be suitable for the routine molecular laboratory and required less hands-on time. PMID:15184417

  19. Nonlinear analysis of correlations in Alu repeat sequences in DNA

    NASA Astrophysics Data System (ADS)

    Xiao, Yi; Huang, Yanzhao; Li, Mingfeng; Xu, Ruizhen; Xiao, Saifeng

    2003-12-01

    We report on a nonlinear analysis of deterministic structures in Alu repeats, one of the richest repetitive DNA sequences in the human genome. Alu repeats contain the recognition sites for the restriction endonuclease AluI, which is what gives them their name. Using the nonlinear prediction method developed in chaos theory, we find that all Alu repeats have novel deterministic structures and show strong nonlinear correlations that are absent from exon and intron sequences. Furthermore, the deterministic structures of Alus of younger subfamilies show panlike shapes. As young Alus can be seen as mutation free copies from the “master genes,” it may be suggested that the deterministic structures of the older subfamilies are results of an evolution from a “panlike” structure to a more diffuse correlation pattern due to mutation.

  20. Optimizing Data Intensive GPGPU Computations for DNA Sequence Alignment

    PubMed Central

    Trapnell, Cole; Schatz, Michael C.

    2009-01-01

    MUMmerGPU uses highly-parallel commodity graphics processing units (GPU) to accelerate the data-intensive computation of aligning next generation DNA sequence data to a reference sequence for use in diverse applications such as disease genotyping and personal genomics. MUMmerGPU 2.0 features a new stackless depth-first-search print kernel and is 13× faster than the serial CPU version of the alignment code and nearly 4× faster in total computation time than MUMmerGPU 1.0. We exhaustively examined 128 GPU data layout configurations to improve register footprint and running time and conclude higher occupancy has greater impact than reduced latency. MUMmerGPU is available open-source at http://mummergpu.sourceforge.net. PMID:20161021

  1. PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context

    PubMed Central

    Zhou, Jiyun; Xu, Ruifeng; He, Yulan; Lu, Qin; Wang, Hongpeng; Kong, Bing

    2016-01-01

    Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employed only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of the support vector machines (SVM) classifier and ensemble learning. Results on the PDNA-62 and the PDNA-224 datasets demonstrate that features extracted from spatial context provide more information than those from sequence context and the combination of them gives more performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that the interactions between binding sites next to each other are important for protein-DNA recognition and their binding ability. The comparison between our proposed PDNAsite method and the existing methods indicates that PDNAsite outperforms most of the existing methods and is a useful tool for DNA-binding site identification. A web server for our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely accessible to the biological research community. PMID:27282833

  2. PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context.

    PubMed

    Zhou, Jiyun; Xu, Ruifeng; He, Yulan; Lu, Qin; Wang, Hongpeng; Kong, Bing

    2016-01-01

    Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employed only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of the support vector machines (SVM) classifier and ensemble learning. Results on the PDNA-62 and the PDNA-224 datasets demonstrate that features extracted from spatial context provide more information than those from sequence context and the combination of them gives more performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that the interactions between binding sites next to each other are important for protein-DNA recognition and their binding ability. The comparison between our proposed PDNAsite method and the existing methods indicates that PDNAsite outperforms most of the existing methods and is a useful tool for DNA-binding site identification. A web server for our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely accessible to the biological research community. PMID:27282833

  3. The Israel DNA database--the establishment of a rapid, semi-automated analysis system.

    PubMed

    Zamir, Ashira; Dell'Ariccia-Carmon, Aviva; Zaken, Neomi; Oz, Carla

    2012-03-01

    The Israel Police DNA database, also known as IPDIS (Israel Police DNA Index System), has been operating since February 2007. During that time more than 135,000 reference samples have been uploaded and more than 2000 hits reported. We have developed an effective semi-automated system that includes two automated punchers, three liquid handler robots and four genetic analyzers. An in-house LIMS program enables full tracking of every sample through the entire process of registration, pre-PCR handling, analysis of profiles, uploading to the database, hit reports and ultimately storage. The LIMS is also responsible for the future tracking of samples and their profiles to be expunged from the database according to the Israeli DNA legislation. The database is administered by an in-house developed software program, where reference and evidentiary profiles are uploaded, stored, searched and matched. The DNA database has proven to be an effective investigative tool which has gained the confidence of the Israeli public and on which the Israel National Police force has grown to rely. PMID:21727053

  4. Sequence dependence of isothermal DNA amplification via EXPAR

    PubMed Central

    Qian, Jifeng; Ferguson, Tanya M.; Shinde, Deepali N.; Ramírez-Borrero, Alissa J.; Hintze, Arend; Adami, Christoph; Niemz, Angelika

    2012-01-01

    Isothermal nucleic acid amplification is becoming increasingly important for molecular diagnostics. Therefore, new computational tools are needed to facilitate assay design. In the isothermal EXPonential Amplification Reaction (EXPAR), template sequences with similar thermodynamic characteristics perform very differently. To understand what causes this variability, we characterized the performance of 384 template sequences, and used this data to develop two computational methods to predict EXPAR template performance based on sequence: a position weight matrix approach with support vector machine classifier, and RELIEF attribute evaluation with Naïve Bayes classification. The methods identified well and poorly performing EXPAR templates with 67–70% sensitivity and 77–80% specificity. We combined these methods into a computational tool that can accelerate new assay design by ruling out likely poor performers. Furthermore, our data suggest that variability in template performance is linked to specific sequence motifs. Cytidine, a pyrimidine base, is over-represented in certain positions of well-performing templates. Guanosine and adenosine, both purine bases, are over-represented in similar regions of poorly performing templates, frequently as GA or AG dimers. Since polymerases have a higher affinity for purine oligonucleotides, polymerase binding to GA-rich regions of a single-stranded DNA template may promote non-specific amplification in EXPAR and other nucleic acid amplification reactions. PMID:22416064
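
    The position weight matrix idea mentioned above can be sketched as follows: build per-position log-odds scores from equal-length example templates, then score a new sequence against them. The pseudocount, the uniform 0.25 background and the function names are assumptions for illustration; the support vector machine stage and the study's actual features are not reproduced.

```python
import math

ALPHABET = "ACGT"
BACKGROUND = 0.25  # assumed uniform background frequency


def build_pwm(seqs, pseudocount=1.0):
    """Per-position log2-odds matrix from equal-length sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    pwm = []
    for pos in range(length):
        # Start from the pseudocount so unseen bases keep a finite score.
        counts = {b: pseudocount for b in ALPHABET}
        for s in seqs:
            counts[s[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / BACKGROUND)
                    for b in ALPHABET})
    return pwm


def score(pwm, seq):
    """Sum of per-position log-odds; higher means closer to the examples."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match the matrix")
    return sum(column[base] for column, base in zip(pwm, seq))
```

    In a classifier setting, scores like these would be one input feature among others rather than a decision rule on their own.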

  5. H3 and H4 histone cDNA sequences from Xenopus: a sequence comparison of H4 genes.

    PubMed Central

    Turner, P C; Woodland, H R

    1982-01-01

    Ovarian poly (A) + RNA from Xenopus laevis and Xenopus borealis was used to construct two cDNA libraries which were screened for histone sequences. cDNA clones to H4 mRNA were obtained from both species and an H3 cDNA clone from Xenopus laevis. The complete DNA sequences of these clones have been determined and are presented. These new sequences are compared with other H3 and H4 DNA sequences both in the coding and 3' noncoding regions. We find that there is considerable non-random codon usage in ten H4 genes. In addition there are some sequence similarities in the 3' noncoding regions of H3 and H4 genes. PMID:6896750

  6. "Feature" mapping of the HLA-C linked DNA region: Construction by sequencing from nested deletions

    SciTech Connect

    Krishnan, B.R.; Chaplin, D.D.

    1994-09-01

    The HLA complex located on chromosome 6p spans ~4 Mb and is gene dense. To enable systematic analysis of less well-characterized portions of HLA, we are defining significant "features" of these DNA regions: locations of putative genes (prediction of exons by GRAIL analysis) and Alu elements, regions with homology to the database, and regions of evolutionarily conserved DNA sequence. Initially, we cloned a 35 kb DNA segment adjacent to HLA-C into a transposon γδ-based cosmid vector designed for generating nested deletions in vivo. Over 70 informative nested deletions were obtained and sequenced by fluorescent-automated technology. Islands of DNA sequences were obtained and used to construct a feature map of the 35 kb HLA segment. Our data (i) defined the organization of the previously identified keratinocyte-specific S gene, (ii) generated the DNA sequence of two evolutionarily conserved DNA segments, and (iii) located otherwise undefined putative exons and Alu elements. The construction of such feature maps of large DNA segments using the nested deletion-sequencing approach provides an efficient means to identify DNA segments meriting systematic and detailed analysis.

  7. Noncontinuously binding loop-out primers for avoiding problematic DNA sequences in PCR and sanger sequencing.

    PubMed

    Sumner, Kelli; Swensen, Jeffrey J; Procter, Melinda; Jama, Mohamed; Wooderchak-Donahue, Whitney; Lewis, Tracey; Fong, Michael; Hubley, Lindsey; Schwarz, Monica; Ha, Youna; Paul, Eleri; Brulotte, Benjamin; Lyon, Elaine; Bayrak-Toydemir, Pinar; Mao, Rong; Pont-Kingdon, Genevieve; Best, D Hunter

    2014-09-01

    We present a method in which noncontinuously binding (loop-out) primers are used to exclude regions of DNA that typically interfere with PCR amplification and/or analysis by Sanger sequencing. Several scenarios were tested using this design principle, including M13-tagged PCR primers, non-M13-tagged PCR primers, and sequencing primers. With this technique, a single oligonucleotide is designed in two segments that flank, but do not include, a short region of problematic DNA sequence. During PCR amplification or sequencing, the problematic region is looped-out from the primer binding site, where it does not interfere with the reaction. Using this method, we successfully excluded regions of up to 46 nucleotides. Loop-out primers were longer than traditional primers (27 to 40 nucleotides) and had higher melting temperatures. This method allows the use of a standardized PCR protocol throughout an assay, keeps the number of PCRs to a minimum, reduces the chance for laboratory error, and, above all, does not interrupt the clinical laboratory workflow. PMID:25017792
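
    The two-segment design described above can be sketched as a string operation: take a flank on each side of the problematic region and join them, leaving the region itself out of the oligonucleotide. The function name, coordinates and flank lengths are hypothetical, and the sketch ignores strand orientation, melting temperature and any M13 tag.

```python
def loop_out_primer(template: str, loop_start: int, loop_len: int,
                    left_len: int, right_len: int) -> str:
    """Join the two flanks of a region so the region is skipped.

    template   : strand the primer sequence is read from (assumed given 5'->3')
    loop_start : 0-based index where the problematic region begins
    loop_len   : length of the region to loop out
    left_len   : number of bases taken immediately upstream of the region
    right_len  : number of bases taken immediately downstream of the region
    """
    loop_end = loop_start + loop_len
    if loop_start - left_len < 0 or loop_end + right_len > len(template):
        raise ValueError("flanks extend beyond the template")
    left = template[loop_start - left_len:loop_start]
    right = template[loop_end:loop_end + right_len]
    return left + right
```

    The resulting oligonucleotide is longer than a single-segment primer would be for the same site, consistent with the longer primer lengths reported above.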

  8. Complete genome sequence of chloroplast DNA (cpDNA) of Chlorella sorokiniana.

    PubMed

    Orsini, Massimiliano; Cusano, Roberto; Costelli, Cristina; Malavasi, Veronica; Concas, Alessandro; Angius, Andrea; Cao, Giacomo

    2016-01-01

    The complete chloroplast genome sequence of Chlorella sorokiniana strain (SAG 111-8 k) is presented in this study. The genome consists of circular chromosomes of 109,811 bp, which encode a total of 109 genes, including 74 proteins, 3 rRNAs and 31 tRNAs. Moreover, introns are not detected and all genes are present in single copy. The overall AT content of the C. sorokiniana cpDNA is 65.9%, that of the coding sequence is 59.1%, and a large inverted repeat (IR) is not observed. PMID:24865923

  9. Automated DNA extraction platforms offer solutions to challenges of assessing microbial biofouling in oil production facilities.

    PubMed

    Oldham, Athenia L; Drilling, Heather S; Stamps, Blake W; Stevenson, Bradley S; Duncan, Kathleen E

    2012-01-01

    The analysis of microbial assemblages in industrial, marine, and medical systems can inform decisions regarding quality control or mitigation. Modern molecular approaches to detect, characterize, and quantify microorganisms provide rapid and thorough measures unbiased by the need for cultivation. The requirement of timely extraction of high quality nucleic acids for molecular analysis is faced with specific challenges when used to study the influence of microorganisms on oil production. Production facilities are often ill equipped for nucleic acid extraction techniques, making the preservation and transportation of samples off-site a priority. As a potential solution, the possibility of extracting nucleic acids on-site using automated platforms was tested. The performance of two such platforms, the Fujifilm QuickGene-Mini80™ and Promega Maxwell®16 was compared to a widely used manual extraction kit, MOBIO PowerBiofilm™ DNA Isolation Kit, in terms of ease of operation, DNA quality, and microbial community composition. Three pipeline biofilm samples were chosen for these comparisons; two contained crude oil and corrosion products and the third transported seawater. Overall, the two more automated extraction platforms produced higher DNA yields than the manual approach. DNA quality was evaluated for amplification by quantitative PCR (qPCR) and end-point PCR to generate 454 pyrosequencing libraries for 16S rRNA microbial community analysis. Microbial community structure, as assessed by DGGE analysis and pyrosequencing, was comparable among the three extraction methods. Therefore, the use of automated extraction platforms should enhance the feasibility of rapidly evaluating microbial biofouling at remote locations or those with limited resources. PMID:23168231

  10. Automated DNA extraction platforms offer solutions to challenges of assessing microbial biofouling in oil production facilities

    PubMed Central

    2012-01-01

    The analysis of microbial assemblages in industrial, marine, and medical systems can inform decisions regarding quality control or mitigation. Modern molecular approaches to detect, characterize, and quantify microorganisms provide rapid and thorough measures unbiased by the need for cultivation. The requirement of timely extraction of high quality nucleic acids for molecular analysis is faced with specific challenges when used to study the influence of microorganisms on oil production. Production facilities are often ill equipped for nucleic acid extraction techniques, making the preservation and transportation of samples off-site a priority. As a potential solution, the possibility of extracting nucleic acids on-site using automated platforms was tested. The performance of two such platforms, the Fujifilm QuickGene-Mini80™ and Promega Maxwell®16 was compared to a widely used manual extraction kit, MOBIO PowerBiofilm™ DNA Isolation Kit, in terms of ease of operation, DNA quality, and microbial community composition. Three pipeline biofilm samples were chosen for these comparisons; two contained crude oil and corrosion products and the third transported seawater. Overall, the two more automated extraction platforms produced higher DNA yields than the manual approach. DNA quality was evaluated for amplification by quantitative PCR (qPCR) and end-point PCR to generate 454 pyrosequencing libraries for 16S rRNA microbial community analysis. Microbial community structure, as assessed by DGGE analysis and pyrosequencing, was comparable among the three extraction methods. Therefore, the use of automated extraction platforms should enhance the feasibility of rapidly evaluating microbial biofouling at remote locations or those with limited resources. PMID:23168231

  11. Phylogenomics of phrynosomatid lizards: conflicting signals from sequence capture versus restriction site associated DNA sequencing.

    PubMed

    Leaché, Adam D; Chavez, Andreas S; Jones, Leonard N; Grummer, Jared A; Gottscho, Andrew D; Linkem, Charles W

    2015-03-01

    Sequence capture and restriction site associated DNA sequencing (RADseq) are popular methods for obtaining large numbers of loci for phylogenetic analysis. These methods are typically used to collect data at different evolutionary timescales; sequence capture is primarily used for obtaining conserved loci, whereas RADseq is designed for discovering single nucleotide polymorphisms (SNPs) suitable for population genetic or phylogeographic analyses. Phylogenetic questions that span both "recent" and "deep" timescales could benefit from either type of data, but studies that directly compare the two approaches are lacking. We compared phylogenies estimated from sequence capture and double digest RADseq (ddRADseq) data for North American phrynosomatid lizards, a species-rich and diverse group containing nine genera that began diversifying approximately 55 Ma. Sequence capture resulted in 584 loci that provided a consistent and strong phylogeny using concatenation and species tree inference. However, the phylogeny estimated from the ddRADseq data was sensitive to the bioinformatics steps used for determining homology, detecting paralogs, and filtering missing data. The topological conflicts among the SNP trees were not restricted to any particular timescale, but instead were associated with short internal branches. Species tree analysis of the largest SNP assembly, which also included the most missing data, supported a topology that matched the sequence capture tree. This preferred phylogeny provides strong support for the paraphyly of the earless lizard genera Holbrookia and Cophosaurus, suggesting that the earless morphology either evolved twice or evolved once and was subsequently lost in Callisaurus. PMID:25663487

  12. Hypervariable minisatellite DNA sequences in the Indian peafowl Pavo cristatus.

    PubMed

    Hanotte, O; Burke, T; Armour, J A; Jeffreys, A J

    1991-04-01

    We report here for the first time the large-scale isolation of hypervariable minisatellite DNA sequences from a non-human species, the Indian peafowl (Pavo cristatus). A size-selected genomic DNA fraction, rich in hypervariable minisatellites, was cloned into Charomid 9-36. This library was screened using two multilocus hypervariable probes, 33.6 and 33.15 and also, in a "probe-walking" approach, with five of the peafowl minisatellites initially isolated. Forty-eight positively hybridizing clones were characterized and found to originate from 30 different loci, 18 of which were polymorphic. Five of these variable minisatellite loci were studied further. They all showed Mendelian inheritance. The heterozygosities of these loci were relatively low (range 22-78%) in comparison with those of previously cloned human loci, as expected in view of inbreeding in our semicaptive study population. No new length allele mutations were observed in families and the mean mutation rate per locus is low (less than 0.004, 95% confidence maximum). These loci were also investigated by cross-species hybridization in related taxa. The ability of the probes to detect hypervariable sequences in other species within the same avian family was found to vary, from those probes that are species-specific to those that are apparently general to the family. We also illustrate the potential usefulness of these probes for paternity analysis in a study of sexual selection, and discuss the general application of specific hypervariable probes in behavioral and evolutionary studies. PMID:1674723

  13. Partition enrichment of nucleotide sequences (PINS)--a generally applicable, sequence based method for enrichment of complex DNA samples.

    PubMed

    Kvist, Thomas; Sondt-Marcussen, Line; Mikkelsen, Marie Just

    2014-01-01

    The dwindling cost of DNA sequencing is driving transformative changes in various biological disciplines including medicine, thus resulting in an increased need for routine sequencing. Preparation of samples suitable for sequencing is the starting point of any practical application, but enrichment of the target sequence over background DNA is often laborious and of limited sensitivity, thereby limiting the usefulness of sequencing. The present paper describes a new method, Probability directed Isolation of Nucleic acid Sequences (PINS), for enrichment of DNA, enabling the sequencing of a large DNA region surrounding a small known sequence. A 275,000 fold enrichment of a target DNA sample containing integrated human papilloma virus is demonstrated. Specifically, a sample containing 0.0028 copies of target sequence per ng of total DNA was enriched to 786 copies per ng. The starting concentration of 0.0028 target copies per ng corresponds to one copy of target in a background of 100,000 complete human genomes. The enriched sample was subsequently amplified using rapid genome walking and the resulting DNA sequence revealed not only the sequence of the truncated virus, but also 1026 base pairs 5' and 50 base pairs 3' to the integration site in chromosome 8. The demonstrated enrichment method is extremely sensitive and selective and requires only minimal knowledge of the sequence to be enriched, and will therefore enable sequencing where the target concentration relative to background is too low to allow the use of other sample preparation methods or where significant parts of the target sequence are unknown. PMID:25203653
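
    The figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the two concentrations given in the abstract; the mass of about 3.3 pg per haploid human genome is an outside assumption, not a value stated by the authors.

```python
# Concentrations as quoted in the abstract (target copies per ng of total DNA).
start_copies_per_ng = 0.0028
end_copies_per_ng = 786.0

fold_enrichment = end_copies_per_ng / start_copies_per_ng

# Assumption (not from the abstract): one haploid human genome weighs ~3.3 pg.
HAPLOID_GENOME_PG = 3.3
genomes_per_ng = 1000.0 / HAPLOID_GENOME_PG
genomes_per_target_copy = genomes_per_ng / start_copies_per_ng

print(f"fold enrichment: {fold_enrichment:,.0f}")
print(f"genomes per target copy: {genomes_per_target_copy:,.0f}")
```

    The ratio comes out at roughly 2.8 x 10^5, the same order as the 275,000-fold figure in the abstract; the small difference presumably reflects rounding of the reported concentrations. Under the assumed genome mass, one target copy sits in a background of roughly 10^5 genomes, consistent with the abstract's statement.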

  14. Partition Enrichment of Nucleotide Sequences (PINS) - A Generally Applicable, Sequence Based Method for Enrichment of Complex DNA Samples

    PubMed Central

    Kvist, Thomas; Sondt-Marcussen, Line; Mikkelsen, Marie Just

    2014-01-01

    The dwindling cost of DNA sequencing is driving transformative changes in various biological disciplines including medicine, thus resulting in an increased need for routine sequencing. Preparation of samples suitable for sequencing is the starting point of any practical application, but enrichment of the target sequence over background DNA is often laborious and of limited sensitivity, thereby limiting the usefulness of sequencing. The present paper describes a new method, Probability directed Isolation of Nucleic acid Sequences (PINS), for enrichment of DNA, enabling the sequencing of a large DNA region surrounding a small known sequence. A 275,000 fold enrichment of a target DNA sample containing integrated human papilloma virus is demonstrated. Specifically, a sample containing 0.0028 copies of target sequence per ng of total DNA was enriched to 786 copies per ng. The starting concentration of 0.0028 target copies per ng corresponds to one copy of target in a background of 100,000 complete human genomes. The enriched sample was subsequently amplified using rapid genome walking and the resulting DNA sequence revealed not only the sequence of the truncated virus, but also 1026 base pairs 5′ and 50 base pairs 3′ to the integration site in chromosome 8. The demonstrated enrichment method is extremely sensitive and selective and requires only minimal knowledge of the sequence to be enriched, and will therefore enable sequencing where the target concentration relative to background is too low to allow the use of other sample preparation methods or where significant parts of the target sequence are unknown. PMID:25203653

  15. Mylodon darwinii DNA sequences from ancient fecal hair shafts.

    PubMed

    Clack, Andrew A; MacPhee, Ross D E; Poinar, Hendrik N

    2012-01-20

    Preserved hair has been increasingly used as an ancient DNA source in high throughput sequencing endeavors, and it may actually offer several advantages compared to more traditional ancient DNA substrates like bone. However, cold environments have yielded the most informative ancient hair specimens, while its preservation, and thus utility, in temperate regions is not well documented. Coprolites could represent a previously underutilized preservation substrate for hairs, which, if present therein, represent macroscopic packages of specific cells that are relatively simple to separate, clean and process. In this pilot study, we report amplicons 147-152 base pairs in length (w/primers) from hair shafts preserved in a south Chilean coprolite attributed to Darwin's extinct ground sloth, Mylodon darwinii. Our results suggest that hairs preserved in coprolites from temperate cave environments can serve as an effective source of ancient DNA. This bodes well for potential molecular-based population and phylogeographic studies on sloths, several species of which have been understudied despite leaving numerous coprolites in caves across the Americas. PMID:21640569

  16. Statistical methods for detecting periodic fragments in DNA sequence data

    PubMed Central

    2011-01-01

    Background: Period 10 dinucleotides are structurally and functionally validated factors that influence the ability of DNA to form nucleosomes, histone core octamers. Robust identification of periodic signals in DNA sequences is therefore required to understand nucleosome organisation in genomes. While various techniques for identifying periodic components in genomic sequences have been proposed or adopted, the requirements for such techniques have not been considered in detail and confirmatory testing for a priori specified periods has not been developed. Results: We compared the estimation accuracy and suitability for confirmatory testing of autocorrelation, discrete Fourier transform (DFT), integer period discrete Fourier transform (IPDFT) and a previously proposed Hybrid measure. A number of different statistical significance procedures were evaluated but a blockwise bootstrap proved superior. When applied to synthetic data whose period-10 signal had been eroded, or for which the signal was approximately period-10, the Hybrid technique exhibited superior properties during exploratory period estimation. In contrast, confirmatory testing using the blockwise bootstrap procedure identified IPDFT as having the greatest statistical power. These properties were validated on yeast sequences defined from a ChIP-chip study where the Hybrid metric confirmed the expected dominance of period-10 in nucleosome associated DNA but IPDFT identified more significant occurrences of period-10. Application to the whole genomes of yeast and mouse identified ~21% and ~19% respectively of these genomes as spanned by period-10 nucleosome positioning sequences (NPS). Conclusions: For estimating the dominant period, we find the Hybrid period estimation method empirically to be the most effective for both eroded and approximate periodicity. The blockwise bootstrap was found to be effective as a significance measure, performing particularly well in the problem of period detection in the

  17. First paraben substituted cyclotetraphosphazene compounds and DNA interaction analysis with a new automated biosensor.

    PubMed

    Çiftçi, Gönül Yenilmez; Şenkuytu, Elif; İncir, Saadet Elif; Yuksel, Fatma; Ölçer, Zehra; Yıldırım, Tuba; Kılıç, Adem; Uludağ, Yıldız

    2016-06-15

    Cancer, as one of the leading causes of death in the world, is caused by malignant cell division and growth that depends on rapid DNA replication. To develop anti-cancer drugs, this feature of cancer could be exploited by utilizing DNA-damaging molecules. To achieve this, the paraben substituted cyclotetraphosphazene compounds have been synthesized for the first time and their effect on DNA (genotoxicity) has been investigated. The conventional genotoxicity testing methods are laborious, time-consuming and expensive. Biosensor based assays provide an alternative for investigating these drug/compound-DNA interactions. Here, for the first time, a new, easy and rapid screening method has been used to investigate the DNA damage, which is based on an automated biosensor device that relies on the real-time electrochemical profiling (REP™) technology. In both the biosensor based screening method and the in vitro biological assay, compounds 9 and 11 (propyl and benzyl substituted cyclotetraphosphazene compounds, respectively) resulted in higher DNA damage than the others, with 65% and 80% activity reduction, respectively. PMID:26852202

  18. Development of an Automated Microfluidic System for DNA Collection, Amplification, and Detection of Pathogens

    SciTech Connect

    Hagan, Bethany S.; Bruckner-Lea, Cynthia J.

    2002-12-01

    This project was focused on developing and testing automated routines for a microfluidic Pathogen Detection System. The basic pathogen detection routine has three primary components: cell concentration, DNA amplification, and detection. In cell concentration, magnetic beads are held in a flow cell by an electromagnet. Sample liquid is passed through the flow cell and bacterial cells attach to the beads. These beads are then released into a small volume of fluid and delivered to the Peltier device for cell lysis and DNA amplification. The cells are lysed during initial heating in the Peltier device, and the released DNA is amplified using polymerase chain reaction (PCR) or strand displacement amplification (SDA). Once amplified, the DNA is then delivered to a laser induced fluorescence detection unit in which the sample is detected. These three components create a flexible platform that can be used for pathogen detection in liquid and sediment samples. Future developments of the system will include on-line DNA detection during DNA amplification and improved capture and release methods for the magnetic beads during cell concentration.

  19. Construction of a Sequencing Library from Circulating Cell-Free DNA.

    PubMed

    Fang, Nan; Löffert, Dirk; Akinci-Tolun, Rumeysa; Heitz, Katja; Wolf, Alexander

    2016-01-01

    Circulating DNA is cell-free DNA (cfDNA) in serum or plasma that can be used for non-invasive prenatal testing, as well as cancer diagnosis, prognosis, and stratification. High-throughput sequence analysis of the cfDNA with next-generation sequencing technologies has proven to be a highly sensitive and specific method in detecting and characterizing mutations in cancer and other diseases, as well as aneuploidy during pregnancy. This unit describes detailed procedures to extract circulating cfDNA from human serum and plasma and generate sequencing libraries from a wide concentration range of circulating DNA. © 2016 by John Wiley & Sons, Inc. PMID:27038390

  20. Long-range correlations and charge transport properties of DNA sequences

    NASA Astrophysics Data System (ADS)

    Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui

    2010-04-01

    By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on the above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5 [...] sequence displays a transition from correlation behavior to anticorrelation behavior. The resonant peaks of the transmission coefficient in genomic sequences can survive in longer sequence length than in random sequences but in shorter sequence length than in quasiperiodic sequences. It is shown that the genomic sequences have long-range correlation properties to some extent but the correlations are not strong enough to maintain the scale invariance properties.
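
    The rescaled-range (Hurst) analysis named in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the purine/pyrimidine mapping to +1/-1, the window sizes, the test sequences, and all function names are assumptions chosen for illustration. The Hurst exponent is estimated as the slope of log(R/S) against log(window size).

```python
import math
import random


def dna_to_steps(seq):
    """Assumed mapping: purines (A, G) -> +1, pyrimidines (C, T) -> -1."""
    return [1.0 if base in "AG" else -1.0 for base in seq.upper()]


def rescaled_range(window):
    """R/S of one window: range of cumulative deviations over the std dev."""
    n = len(window)
    mean = sum(window) / n
    cumulative = lowest = highest = 0.0
    for value in window:
        cumulative += value - mean
        lowest = min(lowest, cumulative)
        highest = max(highest, cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    return (highest - lowest) / std if std > 0 else None


def hurst_exponent(steps, window_sizes):
    """Least-squares slope of log(mean R/S) versus log(window size)."""
    points = []
    for n in window_sizes:
        values = []
        for start in range(0, len(steps) - n + 1, n):
            rs = rescaled_range(steps[start:start + n])
            if rs:  # skip windows with zero variance or zero range
                values.append(rs)
        if values:
            points.append((math.log(n), math.log(sum(values) / len(values))))
    if len(points) < 2:
        raise ValueError("need at least two usable window sizes")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den


sizes = [16, 32, 64, 128, 256, 512, 1024]
random_seq = "".join(random.Random(0).choices("ACGT", k=16384))
alternating_seq = "AC" * 8192  # strictly alternating purine/pyrimidine

h_random = hurst_exponent(dna_to_steps(random_seq), sizes)
h_alternating = hurst_exponent(dna_to_steps(alternating_seq), sizes)
print(f"random: {h_random:.2f}  alternating: {h_alternating:.2f}")
```

    Exponents near 0.5 indicate an uncorrelated sequence, values below 0.5 anticorrelation, and values above 0.5 persistent correlation. Estimates from short windows tend to be biased somewhat upward, so the random sequence here is expected to land slightly above 0.5, while the strictly alternating sequence gives an exponent of essentially zero.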

  1. Single-stranded DNA ligation and XLF-stimulated incompatible DNA end ligation by the XRCC4-DNA ligase IV complex: influence of terminal DNA sequence.

    PubMed

    Gu, Jiafeng; Lu, Haihui; Tsai, Albert G; Schwarz, Klaus; Lieber, Michael R

    2007-01-01

    The double-strand DNA break repair pathway, non-homologous DNA end joining (NHEJ), is distinctive for the flexibility of its nuclease, polymerase and ligase activities. Here we find that the joining of ends by XRCC4-ligase IV is markedly influenced by the terminal sequence, and a steric hindrance model can account for this. XLF (Cernunnos) stimulates the joining of both incompatible DNA ends and compatible DNA ends at physiologic concentrations of Mg2+, but only of incompatible DNA ends at higher concentrations of Mg2+, suggesting charge neutralization between the two DNA ends within the ligase complex. XRCC4-DNA ligase IV has the distinctive ability to ligate poly-dT single-stranded DNA and long dT overhangs in a Ku- and XLF-independent manner, but not other homopolymeric DNA. The dT preference of the ligase is interesting given the sequence bias of the NHEJ polymerase. These distinctive properties of the XRCC4-DNA ligase IV complex explain important aspects of its in vivo roles. PMID:17717001

  2. DNA sequences, recombinant DNA molecules and processes for producing bovine growth hormone-like polypeptides in high yield

    SciTech Connect

    Buell, G.N.

    1987-09-15

    This patent describes a process for increasing the yield of a bovine growth hormone-like polypeptide to at least 100 times that of a bovine growth hormone-like polypeptide encoded by a DNA sequence. The process comprises the steps of culturing a host transformed with a recombinant DNA molecule comprising a DNA sequence encoding a Met λ or λ bovine growth hormone-like polypeptide operatively linked to an expression control sequence. The λ is an amino terminal deletion from the amino acid sequence of mature bovine growth hormone.

  3. Sequence rearrangement and duplication of double stranded fibronectin cDNA probably occurring during cDNA synthesis by AMV reverse transcriptase and Escherichia coli DNA polymerase I.

    PubMed Central

    Fagan, J B; Pastan, I; de Crombrugghe, B

    1980-01-01

    Two cloned cDNAs derived from the mRNA for cell fibronectin have been sequenced, providing evidence that transcription with AMV reverse transcriptase or Escherichia coli DNA polymerase I may not always result in double stranded cDNA that is exactly homologous with its mRNA template. Instead, the sequences of these cloned cDNAs are consistent with the duplication and rearrangement of sequences during synthesis of double stranded cDNA. PMID:6159581

  4. The evolution processes of DNA sequences, languages and carols

    NASA Astrophysics Data System (ADS)

    Hauck, Jürgen; Henkel, Dorothea; Mika, Klaus

    2001-04-01

    The sequences of bases A, T, C and G of about 100 enolase, secA and cytochrome DNA were analyzed for attractive or repulsive interactions by the numbers T1, T2, T3; r of nearest, next-nearest and third neighbor bases of the same kind and the concentration r = other bases/analyzed base. The area of possible T1, T2 values is limited by the linear borders T2 = 2T1 - 2, T2 = 0 or T1 = 0 for clustering, attractive or repulsive interactions and the border T2 = -2T1 + 2(2 - r) for a variation from repulsive to attractive interactions at r ≤ 2. Clustering is preferred by most bases in sequences of enolases and secA's. Major deviations with repulsive interactions of some bases are observed for archaea bacteria in secA and for highly developed animals and the human species in enolase sequences. The borders of the structure map for enthalpy stabilized structures with maximum interactions are approached in few cases. Most letters of the natural languages and some music notes are at the borders of the structure map.

  5. The DNA sequence of the human X chromosome.

    PubMed

    Ross, Mark T; Grafham, Darren V; Coffey, Alison J; Scherer, Steven; McLay, Kirsten; Muzny, Donna; Platzer, Matthias; Howell, Gareth R; Burrows, Christine; Bird, Christine P; Frankish, Adam; Lovell, Frances L; Howe, Kevin L; Ashurst, Jennifer L; Fulton, Robert S; Sudbrak, Ralf; Wen, Gaiping; Jones, Matthew C; Hurles, Matthew E; Andrews, T Daniel; Scott, Carol E; Searle, Stephen; Ramser, Juliane; Whittaker, Adam; Deadman, Rebecca; Carter, Nigel P; Hunt, Sarah E; Chen, Rui; Cree, Andrew; Gunaratne, Preethi; Havlak, Paul; Hodgson, Anne; Metzker, Michael L; Richards, Stephen; Scott, Graham; Steffen, David; Sodergren, Erica; Wheeler, David A; Worley, Kim C; Ainscough, Rachael; Ambrose, Kerrie D; Ansari-Lari, M Ali; Aradhya, Swaroop; Ashwell, Robert I S; Babbage, Anne K; Bagguley, Claire L; Ballabio, Andrea; Banerjee, Ruby; Barker, Gary E; Barlow, Karen F; Barrett, Ian P; Bates, Karen N; Beare, David M; Beasley, Helen; Beasley, Oliver; Beck, Alfred; Bethel, Graeme; Blechschmidt, Karin; Brady, Nicola; Bray-Allen, Sarah; Bridgeman, Anne M; Brown, Andrew J; Brown, Mary J; Bonnin, David; Bruford, Elspeth A; Buhay, Christian; Burch, Paula; Burford, Deborah; Burgess, Joanne; Burrill, Wayne; Burton, John; Bye, Jackie M; Carder, Carol; Carrel, Laura; Chako, Joseph; Chapman, Joanne C; Chavez, Dean; Chen, Ellson; Chen, Guan; Chen, Yuan; Chen, Zhijian; Chinault, Craig; Ciccodicola, Alfredo; Clark, Sue Y; Clarke, Graham; Clee, Chris M; Clegg, Sheila; Clerc-Blankenburg, Kerstin; Clifford, Karen; Cobley, Vicky; Cole, Charlotte G; Conquer, Jen S; Corby, Nicole; Connor, Richard E; David, Robert; Davies, Joy; Davis, Clay; Davis, John; Delgado, Oliver; Deshazo, Denise; Dhami, Pawandeep; Ding, Yan; Dinh, Huyen; Dodsworth, Steve; Draper, Heather; Dugan-Rocha, Shannon; Dunham, Andrew; Dunn, Matthew; Durbin, K James; Dutta, Ireena; Eades, Tamsin; Ellwood, Matthew; Emery-Cohen, Alexandra; Errington, Helen; Evans, Kathryn L; Faulkner, Louisa; Francis, Fiona; Frankland, John; Fraser, Audrey 
E; Galgoczy, Petra; Gilbert, James; Gill, Rachel; Glöckner, Gernot; Gregory, Simon G; Gribble, Susan; Griffiths, Coline; Grocock, Russell; Gu, Yanghong; Gwilliam, Rhian; Hamilton, Cerissa; Hart, Elizabeth A; Hawes, Alicia; Heath, Paul D; Heitmann, Katja; Hennig, Steffen; Hernandez, Judith; Hinzmann, Bernd; Ho, Sarah; Hoffs, Michael; Howden, Phillip J; Huckle, Elizabeth J; Hume, Jennifer; Hunt, Paul J; Hunt, Adrienne R; Isherwood, Judith; Jacob, Leni; Johnson, David; Jones, Sally; de Jong, Pieter J; Joseph, Shirin S; Keenan, Stephen; Kelly, Susan; Kershaw, Joanne K; Khan, Ziad; Kioschis, Petra; Klages, Sven; Knights, Andrew J; Kosiura, Anna; Kovar-Smith, Christie; Laird, Gavin K; Langford, Cordelia; Lawlor, Stephanie; Leversha, Margaret; Lewis, Lora; Liu, Wen; Lloyd, Christine; Lloyd, David M; Loulseged, Hermela; Loveland, Jane E; Lovell, Jamieson D; Lozado, Ryan; Lu, Jing; Lyne, Rachael; Ma, Jie; Maheshwari, Manjula; Matthews, Lucy H; McDowall, Jennifer; McLaren, Stuart; McMurray, Amanda; Meidl, Patrick; Meitinger, Thomas; Milne, Sarah; Miner, George; Mistry, Shailesh L; Morgan, Margaret; Morris, Sidney; Müller, Ines; Mullikin, James C; Nguyen, Ngoc; Nordsiek, Gabriele; Nyakatura, Gerald; O'Dell, Christopher N; Okwuonu, Geoffery; Palmer, Sophie; Pandian, Richard; Parker, David; Parrish, Julia; Pasternak, Shiran; Patel, Dina; Pearce, Alex V; Pearson, Danita M; Pelan, Sarah E; Perez, Lesette; Porter, Keith M; Ramsey, Yvonne; Reichwald, Kathrin; Rhodes, Susan; Ridler, Kerry A; Schlessinger, David; Schueler, Mary G; Sehra, Harminder K; Shaw-Smith, Charles; Shen, Hua; Sheridan, Elizabeth M; Shownkeen, Ratna; Skuce, Carl D; Smith, Michelle L; Sotheran, Elizabeth C; Steingruber, Helen E; Steward, Charles A; Storey, Roy; Swann, R Mark; Swarbreck, David; Tabor, Paul E; Taudien, Stefan; Taylor, Tineace; Teague, Brian; Thomas, Karen; Thorpe, Andrea; Timms, Kirsten; Tracey, Alan; Trevanion, Steve; Tromans, Anthony C; d'Urso, Michele; Verduzco, Daniel; Villasana, Donna; 
Waldron, Lenee; Wall, Melanie; Wang, Qiaoyan; Warren, James; Warry, Georgina L; Wei, Xuehong; West, Anthony; Whitehead, Siobhan L; Whiteley, Mathew N; Wilkinson, Jane E; Willey, David L; Williams, Gabrielle; Williams, Leanne; Williamson, Angela; Williamson, Helen; Wilming, Laurens; Woodmansey, Rebecca L; Wray, Paul W; Yen, Jennifer; Zhang, Jingkun; Zhou, Jianling; Zoghbi, Huda; Zorilla, Sara; Buck, David; Reinhardt, Richard; Poustka, Annemarie; Rosenthal, André; Lehrach, Hans; Meindl, Alfons; Minx, Patrick J; Hillier, Ladeana W; Willard, Huntington F; Wilson, Richard K; Waterston, Robert H; Rice, Catherine M; Vaudin, Mark; Coulson, Alan; Nelson, David L; Weinstock, George; Sulston, John E; Durbin, Richard; Hubbard, Tim; Gibbs, Richard A; Beck, Stephan; Rogers, Jane; Bentley, David R

    2005-03-17

    The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence. PMID:15772651

  6. The DNA sequence of the human X chromosome

    PubMed Central

    Ross, Mark T.; Grafham, Darren V.; Coffey, Alison J.; Scherer, Steven; McLay, Kirsten; Muzny, Donna; Platzer, Matthias; Howell, Gareth R.; Burrows, Christine; Bird, Christine P.; Frankish, Adam; Lovell, Frances L.; Howe, Kevin L.; Ashurst, Jennifer L.; Fulton, Robert S.; Sudbrak, Ralf; Wen, Gaiping; Jones, Matthew C.; Hurles, Matthew E.; Andrews, T. Daniel; Scott, Carol E.; Searle, Stephen; Ramser, Juliane; Whittaker, Adam; Deadman, Rebecca; Carter, Nigel P.; Hunt, Sarah E.; Chen, Rui; Cree, Andrew; Gunaratne, Preethi; Havlak, Paul; Hodgson, Anne; Metzker, Michael L.; Richards, Stephen; Scott, Graham; Steffen, David; Sodergren, Erica; Wheeler, David A.; Worley, Kim C.; Ainscough, Rachael; Ambrose, Kerrie D.; Ansari-Lari, M. Ali; Aradhya, Swaroop; Ashwell, Robert I. S.; Babbage, Anne K.; Bagguley, Claire L.; Ballabio, Andrea; Banerjee, Ruby; Barker, Gary E.; Barlow, Karen F.; Barrett, Ian P.; Bates, Karen N.; Beare, David M.; Beasley, Helen; Beasley, Oliver; Beck, Alfred; Bethel, Graeme; Blechschmidt, Karin; Brady, Nicola; Bray-Allen, Sarah; Bridgeman, Anne M.; Brown, Andrew J.; Brown, Mary J.; Bonnin, David; Bruford, Elspeth A.; Buhay, Christian; Burch, Paula; Burford, Deborah; Burgess, Joanne; Burrill, Wayne; Burton, John; Bye, Jackie M.; Carder, Carol; Carrel, Laura; Chako, Joseph; Chapman, Joanne C.; Chavez, Dean; Chen, Ellson; Chen, Guan; Chen, Yuan; Chen, Zhijian; Chinault, Craig; Ciccodicola, Alfredo; Clark, Sue Y.; Clarke, Graham; Clee, Chris M.; Clegg, Sheila; Clerc-Blankenburg, Kerstin; Clifford, Karen; Cobley, Vicky; Cole, Charlotte G.; Conquer, Jen S.; Corby, Nicole; Connor, Richard E.; David, Robert; Davies, Joy; Davis, Clay; Davis, John; Delgado, Oliver; DeShazo, Denise; Dhami, Pawandeep; Ding, Yan; Dinh, Huyen; Dodsworth, Steve; Draper, Heather; Dugan-Rocha, Shannon; Dunham, Andrew; Dunn, Matthew; Durbin, K. James; Dutta, Ireena; Eades, Tamsin; Ellwood, Matthew; Emery-Cohen, Alexandra; Errington, Helen; Evans, Kathryn L.; Faulkner, Louisa; Francis, Fiona; Frankland, John; Fraser, Audrey E.; Galgoczy, Petra; Gilbert, James; Gill, Rachel; Glöckner, Gernot; Gregory, Simon G.; Gribble, Susan; Griffiths, Coline; Grocock, Russell; Gu, Yanghong; Gwilliam, Rhian; Hamilton, Cerissa; Hart, Elizabeth A.; Hawes, Alicia; Heath, Paul D.; Heitmann, Katja; Hennig, Steffen; Hernandez, Judith; Hinzmann, Bernd; Ho, Sarah; Hoffs, Michael; Howden, Phillip J.; Huckle, Elizabeth J.; Hume, Jennifer; Hunt, Paul J.; Hunt, Adrienne R.; Isherwood, Judith; Jacob, Leni; Johnson, David; Jones, Sally; de Jong, Pieter J.; Joseph, Shirin S.; Keenan, Stephen; Kelly, Susan; Kershaw, Joanne K.; Khan, Ziad; Kioschis, Petra; Klages, Sven; Knights, Andrew J.; Kosiura, Anna; Kovar-Smith, Christie; Laird, Gavin K.; Langford, Cordelia; Lawlor, Stephanie; Leversha, Margaret; Lewis, Lora; Liu, Wen; Lloyd, Christine; Lloyd, David M.; Loulseged, Hermela; Loveland, Jane E.; Lovell, Jamieson D.; Lozado, Ryan; Lu, Jing; Lyne, Rachael; Ma, Jie; Maheshwari, Manjula; Matthews, Lucy H.; McDowall, Jennifer; McLaren, Stuart; McMurray, Amanda; Meidl, Patrick; Meitinger, Thomas; Milne, Sarah; Miner, George; Mistry, Shailesh L.; Morgan, Margaret; Morris, Sidney; Müller, Ines; Mullikin, James C.; Nguyen, Ngoc; Nordsiek, Gabriele; Nyakatura, Gerald; O’Dell, Christopher N.; Okwuonu, Geoffery; Palmer, Sophie; Pandian, Richard; Parker, David; Parrish, Julia; Pasternak, Shiran; Patel, Dina; Pearce, Alex V.; Pearson, Danita M.; Pelan, Sarah E.; Perez, Lesette; Porter, Keith M.; Ramsey, Yvonne; Reichwald, Kathrin; Rhodes, Susan; Ridler, Kerry A.; Schlessinger, David; Schueler, Mary G.; Sehra, Harminder K.; Shaw-Smith, Charles; Shen, Hua; Sheridan, Elizabeth M.; Shownkeen, Ratna; Skuce, Carl D.; Smith, Michelle L.; Sotheran, Elizabeth C.; Steingruber, Helen E.; Steward, Charles A.; Storey, Roy; Swann, R. Mark; Swarbreck, David; Tabor, Paul E.; Taudien, Stefan; Taylor, Tineace; Teague, Brian; Thomas, Karen; Thorpe, Andrea; Timms, Kirsten; Tracey, Alan; Trevanion, Steve; Tromans, Anthony C.; d’Urso, Michele; Verduzco, Daniel; Villasana, Donna; Waldron, Lenee; Wall, Melanie; Wang, Qiaoyan; Warren, James; Warry, Georgina L.; Wei, Xuehong; West, Anthony; Whitehead, Siobhan L.; Whiteley, Mathew N.; Wilkinson, Jane E.; Willey, David L.; Williams, Gabrielle; Williams, Leanne; Williamson, Angela; Williamson, Helen; Wilming, Laurens; Woodmansey, Rebecca L.; Wray, Paul W.; Yen, Jennifer; Zhang, Jingkun; Zhou, Jianling; Zoghbi, Huda; Zorilla, Sara; Buck, David; Reinhardt, Richard; Poustka, Annemarie; Rosenthal, André; Lehrach, Hans; Meindl, Alfons; Minx, Patrick J.; Hillier, LaDeana W.; Willard, Huntington F.; Wilson, Richard K.; Waterston, Robert H.; Rice, Catherine M.; Vaudin, Mark; Coulson, Alan; Nelson, David L.; Weinstock, George; Sulston, John E.; Durbin, Richard; Hubbard, Tim; Gibbs, Richard A.; Beck, Stephan; Rogers, Jane; Bentley, David R.

    2009-01-01

    The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence. PMID:15772651

  7. Sequence-selective metal ion binding to DNA oligonucleotides.

    PubMed

    Frøystein, N A; Davis, J T; Reid, B R; Sletten, E

    1993-07-01

    Metal ion titrations of several DNA oligonucleotides, 10 dodecamers and one decamer, have been monitored by 1H NMR spectroscopy in order to elucidate metal ion binding patterns. Also, the effects of paramagnetic impurities on resonance linewidths and NOESY cross-peak intensities have been reversed by EDTA back-titration experiments. 1H 1D NMR spectra were recorded after successive additions of aliquots of different metal salts to oligonucleotide samples. Paramagnetic manganese(II) salts were used in most cases, but a few samples were also titrated with diamagnetic zinc(II). From this study, we conclude that there exists a sequence-selective metal ion binding pattern. The metal ions bind predominantly to 5'-G in the contexts 5'-GG and 5'-GA. The order of preference seems to be GG ≥ GA > GT >> GC. No evidence of metal ion binding to 5'-G in 5'-GC steps or to non-G residues was found. The H6 or H8 resonances on preceding (5'-) bases were affected by the adjacent bound paramagnetic metal ion, but no effect was observed on the protons of the succeeding (3'-) base. The metal binding site in the duplexes is most likely at G-N7, as manifested by the pronounced paramagnetic line broadening or diamagnetic shift of the G-H8 signal. This sequence selectivity may be qualitatively explained by a sequence-dependent variation in the molecular electrostatic potentials of guanine residues (MEPs) along the oligonucleotide chain. PMID:8363924

  8. Development of DNA Damage Response Signaling Biomarkers using Automated, Quantitative Image Analysis

    PubMed Central

    Nikolaishvilli-Feinberg, Nana; Cohen, Stephanie M.; Midkiff, Bentley; Zhou, Yingchun; Olorvida, Mark; Ibrahim, Joseph G.; Omolo, Bernard; Shields, Janiel M.; Thomas, Nancy E.; Groben, Pamela A.; Kaufmann, William K.

    2014-01-01

    The DNA damage response (DDR) coordinates DNA repair with cell cycle checkpoints to ameliorate or mitigate the pathological effects of DNA damage. Automated quantitative analysis (AQUA) and Tissue Studio are commercial technologies that use digitized immunofluorescence microscopy images to quantify antigen expression in defined tissue compartments. Because DDR is commonly activated in cancer and may reflect genetic instability within the lesion, a method to quantify DDR in cancer offers potential diagnostic and/or prognostic value. In this study, both AQUA and Tissue Studio algorithms were used to quantify the DDR in radiation-damaged skin fibroblasts, melanoma cell lines, moles, and primary and metastatic melanomas. Digital image analysis results for three markers of DDR (γH2AX, P-ATM, P-Chk2) correlated with immunoblot data for irradiated fibroblasts, whereas only γH2AX and P-Chk2 correlated with immunoblot data in melanoma cell lines. Melanoma cell lines displayed substantial variation in γH2AX and P-Chk2 expression, and P-Chk2 expression was significantly correlated with radioresistance. Moles, primary melanomas, and melanoma metastases in brain, lung and liver displayed substantial variation in γH2AX expression, similar to that observed in melanoma cell lines. Automated digital analysis of immunofluorescent images stained for DDR biomarkers may be useful for predicting tumor response to radiation and chemotherapy. PMID:24309508

  9. Development of DNA damage response signaling biomarkers using automated, quantitative image analysis.

    PubMed

    Nikolaishvilli-Feinberg, Nana; Cohen, Stephanie M; Midkiff, Bentley; Zhou, Yingchun; Olorvida, Mark; Ibrahim, Joseph G; Omolo, Bernard; Shields, Janiel M; Thomas, Nancy E; Groben, Pamela A; Kaufmann, William K; Miller, C Ryan

    2014-03-01

    The DNA damage response (DDR) coordinates DNA repair with cell cycle checkpoints to ameliorate or mitigate the pathological effects of DNA damage. Automated quantitative analysis (AQUA) and Tissue Studio are commercial technologies that use digitized immunofluorescence microscopy images to quantify antigen expression in defined tissue compartments. Because DDR is commonly activated in cancer and may reflect genetic instability within the lesion, a method to quantify DDR in cancer offers potential diagnostic and/or prognostic value. In this study, both AQUA and Tissue Studio algorithms were used to quantify the DDR in radiation-damaged skin fibroblasts, melanoma cell lines, moles, and primary and metastatic melanomas. Digital image analysis results for three markers of DDR (γH2AX, P-ATM, P-Chk2) correlated with immunoblot data for irradiated fibroblasts, whereas only γH2AX and P-Chk2 correlated with immunoblot data in melanoma cell lines. Melanoma cell lines displayed substantial variation in γH2AX and P-Chk2 expression, and P-Chk2 expression was significantly correlated with radioresistance. Moles, primary melanomas, and melanoma metastases in brain, lung and liver displayed substantial variation in γH2AX expression, similar to that observed in melanoma cell lines. Automated digital analysis of immunofluorescent images stained for DDR biomarkers may be useful for predicting tumor response to radiation and chemotherapy. PMID:24309508

  10. Challenges in DNA motion control and sequence readout using nanopore devices

    PubMed Central

    Carson, Spencer; Wanunu, Meni

    2016-01-01

    Nanopores are being hailed as a potential next-generation DNA sequencer that could provide cheap, high-throughput DNA analysis. In this review we present a detailed summary of the various sensing techniques being investigated for use in DNA sequencing and mapping applications. A crucial impasse to the success of nanopores as a reliable DNA analysis tool is the fast and stochastic nature of DNA translocation. We discuss the incorporation of biological motors to step DNA through a pore base-by-base, as well as the many experimental modifications attempted for the purpose of slowing and controlling DNA transport. PMID:25642629

  11. Simultaneous loading of 200 sample lanes for DNA sequencing on vertical and horizontal, standard and ultrathin gels.

    PubMed

    Erfle, H; Ventzki, R; Voss, H; Rechmann, S; Benes, V; Stegemann, J; Ansorge, W

    1997-06-01

    We have developed a simple and efficient technique for automated parallel loading of ≥200 lanes on a 30 cm-wide gel in automated DNA sequencing, using porous filter materials and an associated manual or robotic system. The samples are loaded onto the teeth of a comb made of the porous material. The comb, with samples, is inserted directly above the straight edge of the polymerized gel. The samples are driven from the comb into the gel by the applied electrical field. A particularly advantageous aspect of this method is the elimination of the thin gel walls separating the sample wells in the standard gel loading technique. The time for sample loading is significantly reduced to a few minutes. The loading technique is applicable to horizontal or vertical systems, with standard or ultrathin gels. PMID:9153326

  12. SeqMule: automated pipeline for analysis of human exome/genome sequencing data.

    PubMed

    Guo, Yunfei; Ding, Xiaolei; Shen, Yufeng; Lyon, Gholson J; Wang, Kai

    2015-01-01

    Next-generation sequencing (NGS) technology has greatly helped us identify disease-contributory variants for Mendelian diseases. However, users are often faced with issues such as software compatibility, complicated configuration, and no access to high-performance computing facility. Discrepancies exist among aligners and variant callers. We developed a computational pipeline, SeqMule, to perform automated variant calling from NGS data on human genomes and exomes. SeqMule integrates computational-cluster-free parallelization capability built on top of the variant callers, and facilitates normalization/intersection of variant calls to generate consensus set with high confidence. SeqMule integrates 5 alignment tools, 5 variant calling algorithms and accepts various combinations all by one-line command, therefore allowing highly flexible yet fully automated variant calling. In a modern machine (2 Intel Xeon X5650 CPUs, 48 GB memory), when fast turn-around is needed, SeqMule generates annotated VCF files in a day from a 30X whole-genome sequencing data set; when more accurate calling is needed, SeqMule generates consensus call set that improves over single callers, as measured by both Mendelian error rate and consistency. SeqMule supports Sun Grid Engine for parallel processing, offers turn-key solution for deployment on Amazon Web Services, allows quality check, Mendelian error check, consistency evaluation, HTML-based reports. SeqMule is available at http://seqmule.openbioinformatics.org. PMID:26381817
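
    The consensus step described above (intersecting variant calls from several callers) can be illustrated with a minimal sketch. This is not SeqMule's implementation: the function name `consensus_calls`, the `min_callers` threshold, and the `(chromosome, position, ref, alt)` variant key are all assumptions chosen for illustration, and the calls are assumed to have been normalized to a common representation beforehand.

```python
from collections import Counter


def consensus_calls(call_sets, min_callers=2):
    """Return variants reported by at least `min_callers` of the call sets."""
    support = Counter()
    for calls in call_sets:
        # Convert to a set so each caller votes at most once per variant.
        support.update(set(calls))
    return {variant for variant, n in support.items() if n >= min_callers}


# Hypothetical calls, keyed by (chromosome, position, ref, alt).
caller_a = {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")}
caller_b = {("chr1", 1000, "A", "G"), ("chr3", 42, "G", "A")}
caller_c = {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")}

if __name__ == "__main__":
    for variant in sorted(consensus_calls([caller_a, caller_b, caller_c])):
        print(variant)
```

    Raising `min_callers` to the number of callers gives a strict intersection; lowering it to 1 gives the union.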

  13. SeqMule: automated pipeline for analysis of human exome/genome sequencing data

    PubMed Central

    Guo, Yunfei; Ding, Xiaolei; Shen, Yufeng; Lyon, Gholson J.; Wang, Kai

    2015-01-01

    Next-generation sequencing (NGS) technology has greatly helped us identify disease-contributory variants for Mendelian diseases. However, users are often faced with issues such as software compatibility, complicated configuration, and no access to high-performance computing facility. Discrepancies exist among aligners and variant callers. We developed a computational pipeline, SeqMule, to perform automated variant calling from NGS data on human genomes and exomes. SeqMule integrates computational-cluster-free parallelization capability built on top of the variant callers, and facilitates normalization/intersection of variant calls to generate consensus set with high confidence. SeqMule integrates 5 alignment tools, 5 variant calling algorithms and accepts various combinations all by one-line command, therefore allowing highly flexible yet fully automated variant calling. In a modern machine (2 Intel Xeon X5650 CPUs, 48 GB memory), when fast turn-around is needed, SeqMule generates annotated VCF files in a day from a 30X whole-genome sequencing data set; when more accurate calling is needed, SeqMule generates consensus call set that improves over single callers, as measured by both Mendelian error rate and consistency. SeqMule supports Sun Grid Engine for parallel processing, offers turn-key solution for deployment on Amazon Web Services, allows quality check, Mendelian error check, consistency evaluation, HTML-based reports. SeqMule is available at http://seqmule.openbioinformatics.org. PMID:26381817

  14. Automated method for tracing leading and trailing processes of migrating neurons in confocal image sequences

    SciTech Connect

    Kerekes, Ryan A; Gleason, Shaun Scott; Trivedi, Dr. Niraj; Solecki, Dr. David

    2010-01-01

    Segmentation, tracking, and tracing of neurons in video imagery are important steps in many neuronal migration studies and can be inaccurate and time-consuming when performed manually. In this paper, we present an automated method for tracing the leading and trailing processes of migrating neurons in time-lapse image stacks acquired with a confocal fluorescence microscope. In our approach, we first locate and track the soma of the cell of interest by smoothing each frame and tracking the local maxima through the sequence. We then trace the leading process in each frame by starting at the center of the soma and stepping repeatedly in the most likely direction of the leading process. This direction is found at each step by examining second derivatives of fluorescent intensity along curves of constant radius around the current point. Tracing terminates after a fixed number of steps or when fluorescent intensity drops below a fixed threshold. We evolve the resulting trace to form an improved trace that more closely follows the approximate centerline of the leading process. We apply a similar algorithm to the trailing process of the cell by starting the trace in the opposite direction. We demonstrate our algorithm on two time-lapse confocal video sequences of migrating cerebellar granule neurons (CGNs). We show that the automated traces closely approximate ground truth traces to within 1 or 2 pixels on average. Additionally, we compute line intensity profiles of fluorescence along the automated traces and quantitatively demonstrate their similarity to manually generated profiles in terms of fluorescence peak locations.
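
    The first stage described above (smooth each frame, then follow the local maximum from frame to frame) might look roughly like the sketch below. It is only an illustration under stated assumptions, not the authors' code: the 3x3 mean filter, the square search window of size `radius`, and the helper names `box_smooth`, `track_soma`, and `make_blob_frame` are all hypothetical choices, and the process-tracing stage is not covered.

```python
def box_smooth(frame):
    """3x3 mean filter; edge pixels average over the neighbours that exist."""
    rows, cols = len(frame), len(frame[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [frame[rr][cc]
                    for rr in range(max(0, r - 1), min(rows, r + 2))
                    for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = sum(vals) / len(vals)
    return out


def track_soma(frames, start, radius=2):
    """Follow the brightest smoothed pixel near the previous soma position."""
    path = []
    r0, c0 = start
    for frame in frames:
        smooth = box_smooth(frame)
        rows, cols = len(smooth), len(smooth[0])
        # Only search a window around the last known position.
        candidates = [(smooth[r][c], r, c)
                      for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1))
                      for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1))]
        _, r0, c0 = max(candidates)
        path.append((r0, c0))
    return path


def make_blob_frame(center, size=7, value=9):
    """Hypothetical test image: a bright 3x3 blob on a dark background."""
    frame = [[0] * size for _ in range(size)]
    for r in range(center[0] - 1, center[0] + 2):
        for c in range(center[1] - 1, center[1] + 2):
            frame[r][c] = value
    return frame


if __name__ == "__main__":
    frames = [make_blob_frame(p) for p in [(2, 2), (3, 3), (4, 4)]]
    print(track_soma(frames, start=(2, 2)))
```

    Restricting the search to a window around the previous position is one simple way to keep the tracker on the cell of interest when other bright cells are in view.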

  15. Population variation of human mtDNA control region sequences detected by enzymatic amplification and sequence-specific oligonucleotide probes.

    PubMed Central

    Stoneking, M; Hedgecock, D; Higuchi, R G; Vigilant, L; Erlich, H A

    1991-01-01

    A method for detecting sequence variation of hypervariable segments of the mtDNA control region was developed. The technique uses hybridization of sequence-specific oligonucleotide (SSO) probes to DNA sequences that have been amplified by PCR. The nucleotide sequences of the two hypervariable segments of the mtDNA control region from 52 individuals were determined; these sequences were then used to define nine regions suitable for SSO typing. A total of 23 SSO probes were used to detect sequence variants at these nine regions in 525 individuals from five ethnic groups (African, Asian, Caucasian, Japanese, and Mexican). The SSO typing revealed an enormous amount of variability, with 274 mtDNA types observed among these 525 individuals and with diversity values, for each population, exceeding .95. For each of the nine mtDNA regions significant differences in the frequencies of sequence variants were observed between these five populations. The mtDNA SSO-typing system was successfully applied to a case involving individual identification of skeletal remains; the probability of a random match was approximately 0.7%. The potential useful applications of this mtDNA SSO-typing system thus include the analysis of individual identity as well as population genetic studies. PMID:1990843

  16. No-wash ethanol precipitation of dye-labeled reaction products improves DNA sequencing reads.

    PubMed

    Fujikura, Kohei

    2015-01-01

    The advent of DNA sequencing has significantly accelerated molecular biology and clinical genetic testing. Despite recent increases in next-generation sequencing throughput, the most popular platform for DNA sequencing is still the multi-capillary DNA sequencer, which is ideally suited for small-scale sequencing projects and is highly accurate. However, the methods remain time-consuming and laborious. Here, I describe a modified ethylenediaminetetraacetic acid (EDTA) method that skips the washing step in ethanol precipitation. My improvements to standard methods save labor, time, and cost per run and increase the sequence reads by 5 to 10%. This modified method will provide immediate benefits to many researchers. PMID:25256164

  17. AST: An Automated Sequence-Sampling Method for Improving the Taxonomic Diversity of Gene Phylogenetic Trees

    PubMed Central

    Zhou, Chan; Mao, Fenglou; Yin, Yanbin; Huang, Jinling; Gogarten, Johann Peter; Xu, Ying

    2014-01-01

    A challenge in phylogenetic inference of gene trees is how to properly sample a large pool of homologous sequences to derive a good representative subset of sequences. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods may not be able to deal with a large pool of sequences due to their high demand in computing resources; (2) applications analyzing a collection of gene trees may prefer to use trees with fewer operational taxonomic units (OTUs), for instance for the detection of horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we have tested it to solve four problems, namely, inference of the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16 S ribosomal RNAs and glycosyl-transferase gene family 8, and a study of ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, hence making the tool generally useful to phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php. PMID:24892935

  18. Generation of single-stranded DNA by the polymerase chain reaction and its application to direct sequencing of the HLA-DQA locus.

    PubMed Central

    Gyllensten, U B; Erlich, H A

    1988-01-01

    Single-copy sequences can be enzymatically amplified from genomic DNA by the polymerase chain reaction. By using unequal molar amounts of the two amplification primers, it is possible in a single step to amplify a single-copy gene and produce an excess of single-stranded DNA of a chosen strand for direct sequencing or for use as a hybridization probe. Further, individual alleles in a heterozygote can be sequenced directly by using allele-specific oligonucleotides either in the amplification reaction or as sequencing primers. By using these methods, we have studied the allelic diversity at the HLA-DQA locus and its association with the serologically defined HLA-DR and -DQ types. This analysis has revealed a total of eight alleles and three additional haplotypes. This procedure has wide applications in screening for mutations in human genes and facilitates the linking of enzymatic amplification of genes to automated sequencing. PMID:3174659

  19. Targeted multiplex next-generation sequencing: advances in techniques of mitochondrial and nuclear DNA sequencing for population genomics.

    PubMed

    Hancock-Hanser, Brittany L; Frey, Amy; Leslie, Matthew S; Dutton, Peter H; Archer, Frederick I; Morin, Phillip A

    2013-03-01

    Next-generation sequencing (NGS) is emerging as an efficient and cost-effective tool in population genomic analyses of nonmodel organisms, allowing simultaneous resequencing of many regions of multi-genomic DNA from multiplexed samples. Here, we detail our synthesis of protocols for targeted resequencing of mitochondrial and nuclear loci by generating indexed genomic libraries for multiplexing up to 100 individuals in a single sequencing pool, and then enriching the pooled library using custom DNA capture arrays. Our use of DNA sequence from one species to capture and enrich the sequencing libraries of another species (i.e. cross-species DNA capture) indicates that efficient enrichment occurs when sequences are up to about 12% divergent, allowing us to take advantage of genomic information in one species to sequence orthologous regions in related species. In addition to a complete mitochondrial genome on each array, we have included between 43 and 118 nuclear loci for low-coverage sequencing of between 18 kb and 87 kb of DNA sequence per individual for single nucleotide polymorphisms discovery from 50 to 100 individuals in a single sequencing lane. Using this method, we have generated a total of over 500 whole mitochondrial genomes from seven cetacean species and green sea turtles. The greater variation detected in mitogenomes relative to short mtDNA sequences is helping to resolve genetic structure ranging from geographic to species-level differences. These NGS and analysis techniques have allowed for simultaneous population genomic studies of mtDNA and nDNA with greater genomic coverage and phylogeographic resolution than has previously been possible in marine mammals and turtles. PMID:23351075

  20. Episodic Statistics of Evolutionary Substitutions in DNA Sequences

    NASA Astrophysics Data System (ADS)

    West, Bruce J.

    1998-03-01

    The number of molecular substitutions occurring in a DNA sequence in a given time interval is described by a fractional-difference equation whose statistics are described by a truncated Levy distribution and which has an inverse power law correlation function. This is an empirically motivated stochastic model of molecular evolution and does not address the evolutionary mechanisms that lead to substitutions. The Levy stable process yields a Fano Factor, the ratio of the variance to the mean in the number of molecular substitutions, that increases as a power law in time. This prediction agrees with the observed statistics across 49 different genes in mammals. This model of molecular evolution is episodic and is consistent with the punctuated equilibrium model of macroevolution without making additional statistical assumptions.
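
    The Fano factor named above is simply the variance of the substitution counts divided by their mean. A minimal sketch follows; the function name `fano_factor`, the use of the population variance, and the sample counts are assumptions for illustration and are not taken from the paper.

```python
from statistics import mean, pvariance


def fano_factor(counts):
    """Return variance / mean of the counts (population variance assumed)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("Fano factor is undefined when the mean count is zero")
    return pvariance(counts) / m


# Hypothetical substitution counts for one time window across several lineages.
example_counts = [0, 4, 1, 7, 0, 6]

if __name__ == "__main__":
    print(fano_factor(example_counts))
```

    For a Poisson process the ratio stays near 1 regardless of the length of the time window, so a value that grows with time, as the abstract reports, signals overdispersed (episodic) substitution.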

  1. Experiences and achievements in automated image sequence orientation for close-range photogrammetric projects

    NASA Astrophysics Data System (ADS)

    Barazzetti, Luigi; Forlani, Gianfranco; Remondino, Fabio; Roncella, Riccardo; Scaioni, Marco

    2011-07-01

    Automatic image orientation of close-range image blocks is becoming a task of increasing importance in the practice of photogrammetry. Although image orientation procedures based on interactive tie point measurements do not require any preferential block structure, the use of structured sequences can help to accomplish this task in an automated way. Automatic orientation of image sequences has been widely investigated in the Computer Vision community. Here the method is generally named "Structure from Motion" (SfM), or "Structure and Motion". These refer to the simultaneous estimation of the image orientation parameters and 3D object points of a scene from a set of image correspondences. Such approaches, that generally disregard camera calibration data, do not ensure an accurate 3D reconstruction, which is a requirement for photogrammetric projects. The major contribution of SfM is therefore viewed in the photogrammetric community as a powerful tool to automatically provide a dense set of tie points as well as initial parameters for a final rigorous bundle adjustment. The paper, after a brief overview of automatic procedures for close-range image sequence orientation, will show some characteristic examples. Although powerful and reliable image orientation solutions are nowadays available at research level, there are certain questions that are still open. Thus the paper will also report some open issues, like the geometric characteristics of the sequences, scene's texture and shape, ground constraints (control points and/or free-network adjustment), feature matching techniques, outlier rejection and bundle adjustment models.

  2. Rapid removal of unincorporated label and proteins from DNA sequencing reactions.

    PubMed

    Kaczorowski, T; Sektas, M

    1996-04-01

    This article presents a simple and rapid method for removal of unincorporated label and proteins from DNA sequencing reactions by using Wizard purification resin. This method can be successfully applied for preparation of end-labeled oligonucleotides free of unincorporated label, which is important in experiments (including DNA sequencing) when the level of background should be as low as possible. Also, this method is effective in removal of proteins from DNA sequencing reactions. PMID:8734430

  3. Factorial Moments Analyses Show a Characteristic Length Scale in DNA Sequences

    NASA Astrophysics Data System (ADS)

    Mohanty, A. K.; Narayana Rao, A. V. S. S.

    2000-02-01

    A unique feature of most of the DNA sequences, found through the factorial moments analysis, is the existence of a characteristic length scale around which the density distribution is nearly Poissonian. Above this point, the DNA sequences, irrespective of their intron contents, show long range correlations with a significant deviation from the Gaussian statistics, while, below this point, the DNA statistics are essentially Gaussian. The famous DNA walk representation is also shown to be a special case of the present analysis.
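
    As a rough illustration of the quantity involved, the sketch below computes the normalized factorial moment F_q = <n(n-1)...(n-q+1)> / <n>^q over per-window counts, which equals 1 for a Poisson distribution. The abstract does not say how the sequence is mapped to counts, so counting one nucleotide in non-overlapping windows is an assumption here, as are the names `window_counts` and `factorial_moment`.

```python
def window_counts(sequence, base, width):
    """Count `base` in consecutive non-overlapping windows of `width` letters.

    A trailing partial window is dropped.
    """
    return [sequence[i:i + width].count(base)
            for i in range(0, len(sequence) - width + 1, width)]


def factorial_moment(counts, q):
    """Normalized factorial moment F_q of non-negative integer counts."""
    if q < 1:
        raise ValueError("q must be a positive integer")
    n_bins = len(counts)
    mean_count = sum(counts) / n_bins
    if mean_count == 0:
        raise ValueError("F_q is undefined when every window is empty")
    total = 0
    for n in counts:
        term = 1
        for k in range(q):
            term *= n - k  # falling factorial n(n-1)...(n-q+1)
        total += term
    return (total / n_bins) / mean_count ** q


if __name__ == "__main__":
    counts = window_counts("AAGCTTAAGG", "A", 5)
    print(counts, factorial_moment(counts, 2))
```

    Repeating the calculation over a range of window widths and looking for where F_q is closest to 1 is one way to search for the nearly Poissonian length scale the abstract mentions.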

  4. The Orion GN and C Data-Driven Flight Software Architecture for Automated Sequencing and Fault Recovery

    NASA Technical Reports Server (NTRS)

    King, Ellis; Hart, Jeremy; Odegard, Ryan

    2010-01-01

    The Orion Crew Exploration Vehicle (CEV) is being designed to include significantly more automation capability than either the Space Shuttle or the International Space Station (ISS). In particular, the vehicle flight software has requirements to accommodate increasingly automated missions throughout all phases of flight. A data-driven flight software architecture will provide an evolvable automation capability to sequence through Guidance, Navigation & Control (GN&C) flight software modes and configurations while maintaining the required flexibility and human control over the automation. This flexibility is a key aspect needed to address the maturation of operational concepts, to permit ground and crew operators to gain trust in the system and mitigate unpredictability in human spaceflight. To allow for mission flexibility and reconfigurability, a data-driven approach is being taken to load the mission event plan as well as the flight software artifacts associated with the GN&C subsystem. A database of GN&C level sequencing data is presented which manages and tracks the mission specific and algorithm parameters to provide a capability to schedule GN&C events within mission segments. The flight software data schema for performing automated mission sequencing is presented with a concept of operations for interactions with ground and onboard crew members. A prototype architecture for fault identification, isolation and recovery interactions with the automation software is presented and discussed as a forward work item.

  5. SequenceL: Automated Parallel Algorithms Derived from CSP-NT Computational Laws

    NASA Technical Reports Server (NTRS)

    Cooke, Daniel; Rushton, Nelson

    2013-01-01

    With the introduction of new parallel architectures like the Cell and multicore chips from IBM, Intel, AMD, and ARM, as well as the petascale processing available for high-end computing, a larger number of programmers will need to write parallel codes. Adding the parallel control structure to the sequence, selection, and iterative control constructs increases the complexity of code development, which often results in increased development costs and decreased reliability. SequenceL is a high-level programming language, that is, a programming language that is closer to a human's way of thinking than to a machine's. Historically, high-level languages have resulted in decreased development costs and increased reliability, at the expense of performance. In recent applications at JSC and in industry, SequenceL has demonstrated the usual advantages of high-level programming in terms of low cost and high reliability. SequenceL programs, however, have run at speeds typically comparable with, and in many cases faster than, their counterparts written in C and C++ when run on single-core processors. Moreover, SequenceL is able to generate parallel executables automatically for multicore hardware, gaining parallel speedups without any extra effort from the programmer beyond what is required to write the sequential/single-core code. A SequenceL-to-C++ translator has been developed that automatically renders readable multithreaded C++ from a combination of a SequenceL program and sample data input. The SequenceL language is based on two fundamental computational laws, Consume-Simplify-Produce (CSP) and Normalize-Transpose (NT), which enable it to automate the creation of parallel algorithms from high-level code that has no annotations of parallelism whatsoever. In our anecdotal experience, SequenceL development has been in every case less costly than development of the same algorithm in sequential (that is, single-core, single process) C or C++, and an order of magnitude less
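
    The Normalize-Transpose (NT) idea can be illustrated outside SequenceL. The Python sketch below is not SequenceL and not the authors' translator; `nt_apply` is a hypothetical helper written only for illustration. When a function defined on scalars receives list arguments, it replicates the scalar arguments to the common length (normalize) and applies the function position by position (transpose). Each position is independent of the others, which is the property that would make such code a candidate for automatic parallelization.

```python
from typing import Any, Callable


def nt_apply(f: Callable[..., Any], *args: Any) -> Any:
    """Apply scalar function f, lifting it over list arguments (hypothetical helper)."""
    lengths = {len(a) for a in args if isinstance(a, list)}
    if not lengths:
        # All arguments are scalars: apply f directly.
        return f(*args)
    if len(lengths) != 1:
        raise ValueError("sequence arguments must have the same length")
    n = lengths.pop()
    # Normalize: replicate each scalar so every argument has length n.
    normalized = [a if isinstance(a, list) else [a] * n for a in args]
    # Transpose: regroup by position, then recurse so nested lists are handled.
    return [nt_apply(f, *column) for column in zip(*normalized)]


def add(x, y):
    return x + y


print(nt_apply(add, [1, 2, 3], 10))                # [11, 12, 13]
print(nt_apply(add, [[1, 2], [3, 4]], [10, 20]))   # [[11, 12], [23, 24]]
```

    The list comprehension in the last line of `nt_apply` runs sequentially here; it only marks where independent work appears.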

  6. Automated DNA-preparation system for bacteria out of air sampler liquids

    NASA Astrophysics Data System (ADS)

    Gransee, Rainer; Röser, Tina; Drese, Klaus Stefan; Düchs, Dominik; Disqué, Claudia; Zoll, Gudrun; Köhne, Stefan; Ritzi-Lehnert, Marion

    2012-06-01

    Preventing bacterial contaminations is a significant challenge in applications across a variety of industries, e.g. in food processing, the life sciences or biohazard detection. Here we present a fully automated lab-on-a-chip system wherein a disposable microfluidic chip moulded by polymeric injection is inserted into an operating device. Liquid samples, here obtained from an air sampler, can be processed to extract and lyse bacteria, and subsequently to purify their DNA using a silica matrix. After the washing and elution steps, the DNA solution is dispensed into a reaction vessel for further analysis in a conventional laboratory polymerase chain reaction (PCR) device. We demonstrate the workability and efficiency of our approach with results from a 9 ml liquid sample spiked with E. coli.

  7. Sequence selective naked-eye detection of DNA harnessing extension of oligonucleotide-modified nucleotides.

    PubMed

    Verga, Daniela; Welter, Moritz; Marx, Andreas

    2016-02-01

    DNA polymerases can efficiently and sequence-selectively incorporate oligonucleotide (ODN)-modified nucleotides, and the incorporated oligonucleotide strand can be employed as a primer in rolling circle amplification (RCA). The effective amplification of the DNA primer by Φ29 DNA polymerase allows the sequence-selective hybridisation of the amplified strand with a G-quadruplex DNA sequence that has horseradish peroxidase-like activity. Based on these findings we develop a system that allows DNA detection with single-base resolution by the naked eye. PMID:26774580

  8. Rate variation of DNA sequence evolution in the Drosophila lineages.

    PubMed Central

    Takano, T S

    1998-01-01

    Rate constancy of DNA sequence evolution was examined for three species of Drosophila, using two samples: the published sequences of eight genes from regions of the normal recombination rates and new data of the four AS-C (ac, sc, l'sc and ase) and ci genes. The AS-C and ci genes were chosen because these genes are located in the regions of very reduced recombination in Drosophila melanogaster and their locations remain unchanged throughout the entire lineages involved, yielding less effect of ancestral polymorphism in the study of rate constancy. The synonymous substitution pattern of the three lineages was found to be erratic in both samples. The dispersion index for replacement substitution was relatively high for the per, G6pd and ac genes. A significant heterogeneity was found in the number of synonymous substitutions in the three lineages between the two samples of genes with different recombination rates. This is partly due to a lack of the lineage effect in the D. melanogaster and Drosophila simulans lineages in the AS-C and ci genes in contrast to Akashi's observation of genes in regions of normal recombination. The higher codon bias in Drosophila yakuba as compared with D. melanogaster and D. simulans was observed in the four AS-C genes, which suggests change(s) in action of natural selection involved in codon usage on these genes. Fluctuating selection intensity may also be responsible for the observed locus-lineage interaction effects in synonymous substitution. PMID:9611206

  9. Sequence Capture versus Restriction Site Associated DNA Sequencing for Shallow Systematics.

    PubMed

    Harvey, Michael G; Smith, Brian Tilston; Glenn, Travis C; Faircloth, Brant C; Brumfield, Robb T

    2016-09-01

    Sequence capture and restriction site associated DNA sequencing (RAD-Seq) are two genomic enrichment strategies for applying next-generation sequencing technologies to systematics studies. At shallow timescales, such as within species, RAD-Seq has been widely adopted among researchers, although there has been little discussion of the potential limitations and benefits of RAD-Seq and sequence capture. We discuss a series of issues that may impact the utility of sequence capture and RAD-Seq data for shallow systematics in non-model species. We review prior studies that used both methods, and investigate differences between the methods by re-analyzing existing RAD-Seq and sequence capture data sets from a Neotropical bird (Xenops minutus). We suggest that the strengths of RAD-Seq data sets for shallow systematics are the wide dispersion of markers across the genome, the relative ease and cost of laboratory work, the deep coverage and read overlap at recovered loci, and the high overall information that results. Sequence capture's benefits include flexibility and repeatability in the genomic regions targeted, success using low-quality samples, more straightforward read orthology assessment, and higher per-locus information content. The utility of a method in systematics, however, rests not only on its performance within a study, but on the comparability of data sets and inferences with those of prior work. In RAD-Seq data sets, comparability is compromised by low overlap of orthologous markers across species and the sensitivity of genetic diversity in a data set to an interaction between the level of natural heterozygosity in the samples examined and the parameters used for orthology assessment. In contrast, sequence capture of conserved genomic regions permits interrogation of the same loci across divergent species, which is preferable for maintaining comparability among data sets and studies for the purpose of drawing general conclusions about the impact of

  10. True single-molecule DNA sequencing of a Pleistocene horse bone

    PubMed Central

    Orlando, Ludovic; Ginolhac, Aurelien; Raghavan, Maanasa; Vilstrup, Julia; Rasmussen, Morten; Magnussen, Kim; Steinmann, Kathleen E.; Kapranov, Philipp; Thompson, John F.; Zazula, Grant; Froese, Duane; Moltke, Ida; Shapiro, Beth; Hofreiter, Michael; Al-Rasheid, Khaled A.S.; Gilbert, M. Thomas P.; Willerslev, Eske

    2011-01-01

    Second-generation sequencing platforms have revolutionized the field of ancient DNA, opening access to complete genomes of past individuals and extinct species. However, these platforms are dependent on library construction and amplification steps that may result in sequences that do not reflect the original DNA template composition. This is particularly true for ancient DNA, where templates have undergone extensive damage post-mortem. Here, we report the results of the first “true single molecule sequencing” of ancient DNA. We generated 115.9 Mb and 76.9 Mb of DNA sequences from a permafrost-preserved Pleistocene horse bone using the Helicos HeliScope and Illumina GAIIx platforms, respectively. We find that the percentage of endogenous DNA sequences derived from the horse is higher among the Helicos data than Illumina data. This result indicates that the molecular biology tools used to generate sequencing libraries of ancient DNA molecules, as required for second-generation sequencing, introduce biases into the data that reduce the efficiency of the sequencing process and limit our ability to fully explore the molecular complexity of ancient DNA extracts. We demonstrate that simple modifications to the standard Helicos DNA template preparation protocol further increase the proportion of horse DNA for this sample by threefold. Comparison of Helicos-specific biases and sequence errors in modern DNA with those in ancient DNA also reveals extensive cytosine deamination damage at the 3′ ends of ancient templates, indicating the presence of 3′-sequence overhangs. Our results suggest that paleogenomes could be sequenced in an unprecedented manner by combining current second- and third-generation sequencing approaches. PMID:21803858

  11. newDNA-Prot: Prediction of DNA-binding proteins by employing support vector machine and a comprehensive sequence representation.

    PubMed

    Zhang, Yanping; Xu, Jun; Zheng, Wei; Zhang, Chen; Qiu, Xingye; Chen, Ke; Ruan, Jishou

    2014-10-01

    Identification of DNA-binding proteins is essential in studying cellular activities, as DNA-binding proteins play a pivotal role in gene regulation. In this study, we propose newDNA-Prot, a DNA-binding protein predictor that employs a support vector machine classifier and a comprehensive feature representation. The sequence representations are categorized into 6 groups: primary sequence based, evolutionary profile based, predicted secondary structure based, predicted relative solvent accessibility based, physicochemical property based and biological function based features. The mRMR, wrapper and two-stage feature selection methods are employed for removing irrelevant features and reducing redundant features. Experiments demonstrate that the two-stage method performs better than the mRMR and wrapper methods. We also perform a statistical analysis on the selected features, and the results show that more than 95% of the selected features are statistically significant and that they cover all 6 feature groups. The newDNA-Prot method is compared with several state-of-the-art algorithms, including iDNA-Prot, DNAbinder and DNA-Prot. The results demonstrate that the newDNA-Prot method outperforms the iDNA-Prot, DNAbinder and DNA-Prot methods. More specifically, newDNA-Prot improves on the runner-up method, DNA-Prot, by around 10% on several evaluation measures. The proposed newDNA-Prot method is available at http://sourceforge.net/projects/newdnaprot/ PMID:25240115

  12. Sequence analysis of mitochondrial DNA hypervariable regions using infrared fluorescence detection.

    PubMed

    Steffens, D L; Roy, R

    1998-06-01

    The non-coding region of the mitochondrial genome provides an attractive target for human forensic identification studies. Two hypervariable (HV) regions, each approximately 250-350 bp in length, contain the majority of mitochondrial DNA (mtDNA) sequence variability among different individuals. Various approaches to determine mtDNA sequence were evaluated utilizing highly sensitive infrared (IR) fluorescence detection. HV regions were amplified either together or separately and cycle-sequenced using a Thermo Sequenase protocol. An M13 universal primer sequence tail covalently attached to the 5' terminus of an amplification primer facilitated electrophoretic analysis and direct sequencing of the amplification products using IR detection. PMID:9631201

  13. Crystal Structure of Human Thymine DNA Glycosylase Bound to DNA Elucidates Sequence-Specific Mismatch Recognition

    SciTech Connect

    Maiti, A.; Morgan, M.T.; Pozharski, E.; Drohat, A.C.

    2009-05-19

    Cytosine methylation at CpG dinucleotides produces m⁵CpG, an epigenetic modification that is important for transcriptional regulation and genomic stability in vertebrate cells. However, m⁵C deamination yields mutagenic G·T mispairs, which are implicated in genetic disease, cancer, and aging. Human thymine DNA glycosylase (hTDG) removes T from G·T mispairs, producing an abasic (or AP) site, and follow-on base excision repair proteins restore the G·C pair. hTDG is inactive against normal A·T pairs, and is most effective for G·T mispairs and other damage located in a CpG context. The molecular basis of these important catalytic properties has remained unknown. Here, we report a crystal structure of hTDG (catalytic domain, hTDG^cat) in complex with abasic DNA, at 2.8 Å resolution. Surprisingly, the enzyme crystallized in a 2:1 complex with DNA, one subunit bound at the abasic site, as anticipated, and the other at an undamaged (nonspecific) site. Isothermal titration calorimetry and electrophoretic mobility-shift experiments indicate that hTDG and hTDG^cat can bind abasic DNA with 1:1 or 2:1 stoichiometry. Kinetics experiments show that the 1:1 complex is sufficient for full catalytic (base excision) activity, suggesting that the 2:1 complex, if adopted in vivo, might be important for some other activity of hTDG, perhaps binding interactions with other proteins. Our structure reveals interactions that promote the stringent specificity for guanine versus adenine as the pairing partner of the target base and interactions that likely confer CpG sequence specificity. We find striking differences between hTDG and its prokaryotic ortholog (MUG), despite the relatively high (32%) sequence identity.

  14. Extraction of complementary from non-complementary DNA sequences through phase separation and centrifugation

    NASA Astrophysics Data System (ADS)

    Robins, Taiquitha; McPherson, Dacia; Zhu, Chenhui; Moran, Mark; Walba, Dave; Zanchetta, Giuliano; Bellini, Tommaso; Clark, Noel

    2008-03-01

    Double-stranded deoxyribonucleic acid (DNA) is known to form lyotropic liquid crystal (LC) phases, nematic and then columnar, with increasing DNA concentration in water. Single-stranded DNA does not form liquid crystal phases. We study the phase separation of both long (900 bp) and short (6-20 bp) DNA. In a mixed solution of self-complementary sequences (scDNA) and non-self-complementary sequences (nscDNA), the scDNA forms double helices and hence LC phases, while the nscDNA stays in the isotropic phase; the LC appears in the form of phase-separated droplets. We report results of the use of centrifugation to produce complete spatial segregation of complementary and noncomplementary DNA, based on their different LC-formation tendencies.

  15. Studying long 16S rDNA sequences with ultrafast-metagenomic sequence classification using exact alignments (Kraken).

    PubMed

    Valenzuela-González, Fabiola; Martínez-Porchas, Marcel; Villalpando-Canchola, Enrique; Vargas-Albores, Francisco

    2016-03-01

    Ultrafast-metagenomic sequence classification using exact alignments (Kraken) is a novel approach to classify 16S rDNA sequences. The classifier is based on mapping short sequences to the lowest common ancestor and performing alignments to form subtrees with specific weights in each taxon node. This study aimed to evaluate the classification performance of Kraken with long 16S rDNA random environmental sequences produced by cloning and then Sanger sequenced. A total of 480 clones were isolated and expanded, and 264 of these clones formed contigs (1352 ± 153 bp). The same sequences were analyzed using the Ribosomal Database Project (RDP) classifier. Deeper classification performance was achieved by Kraken than by the RDP: 73% of the contigs were classified up to the species or variety levels, whereas 67% of these contigs were classified no further than the genus level by the RDP. The results also demonstrated that unassembled sequences analyzed by Kraken provide similar or even deeper information. Moreover, sequences that did not form contigs, which are usually discarded by other programs, provided meaningful information when analyzed by Kraken. Finally, it appears that the assembly step for Sanger sequences can be eliminated when using Kraken. Kraken accumulates the information of both sequence senses, providing additional elements for the classification. In conclusion, the results demonstrate that Kraken is an excellent choice for the taxonomic assignment of sequences obtained by Sanger sequencing or by third-generation sequencing, of which the main goal is to generate longer sequences. PMID:26812576
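
    The classification idea summarized above (exact short-sequence matches mapped to a lowest common ancestor, then weighted over the taxonomy) can be sketched in a few lines of Python. This is a toy illustration, not Kraken's implementation: the taxonomy `PARENT`, the reference sequences, the helper names and the very short k are all hypothetical, and the weighting shown (sum of hits along each root-to-node path) is one plausible reading of the abstract.

```python
from collections import Counter

# Hypothetical toy taxonomy: child -> parent (the root has parent None).
PARENT = {"root": None, "Bacteria": "root", "Escherichia": "Bacteria",
          "E. coli": "Escherichia", "Salmonella": "Bacteria"}


def lineage(taxon):
    """Return the path from a taxon up to the root."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = PARENT[taxon]
    return path


def lca(a, b):
    """Lowest common ancestor of two taxa in the toy tree."""
    ancestors = set(lineage(a))
    return next(t for t in lineage(b) if t in ancestors)


def build_db(references, k):
    """Map each k-mer to the LCA of all reference taxa that contain it."""
    db = {}
    for taxon, seq in references.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            db[kmer] = lca(db[kmer], taxon) if kmer in db else taxon
    return db


def classify(read, db, k):
    """Pick the hit taxon whose root-to-node path collects the most k-mer hits."""
    kmers = (read[i:i + k] for i in range(len(read) - k + 1))
    hits = Counter(db[kmer] for kmer in kmers if kmer in db)
    if not hits:
        return None  # no exact k-mer match: unclassified
    return max(hits, key=lambda t: sum(hits[a] for a in lineage(t)))


refs = {"E. coli": "ACGTACGGTC", "Salmonella": "ACGTTTGACC"}  # made-up sequences
db = build_db(refs, k=4)
print(db["ACGT"])                      # Bacteria (k-mer shared by both references)
print(classify("ACGTACGG", db, k=4))   # E. coli
```

    A k-mer shared by both references is stored at their common ancestor, so it supports every lineage below that node without deciding between them.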

  16. Sequencing of megabase plus DNA by hybridization: Method development ENT. Final technical progress report

    SciTech Connect

    Crkvenjakov, R.; Drmanac, R.

    1991-01-31

    Sequencing by hybridization (SBH) is the only sequencing method based on the experimental determination of the content of oligonucleotide sequences. The data acquisition relies on the natural process of base pairing. It is possible to determine the content of complementary oligosequences in the target DNA by the process of hybridization with oligonucleotide probes of known sequences.
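
    The principle can be made concrete with a small Python sketch. It assumes an idealized, error-free experiment in which hybridization reports exactly which k-base oligonucleotides occur in the target; `spectrum` and `reconstruct` are hypothetical helper names, and the reconstruction shown only handles the simple case in which the overlaps form a single unbranched chain.

```python
def spectrum(target, k):
    """Set of k-mers present in the target (idealized hybridization readout)."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def reconstruct(kmers, k):
    """Chain k-mers by (k-1)-base overlaps; fails if the chain is not unique."""
    by_prefix = {}
    for kmer in kmers:
        by_prefix.setdefault(kmer[:k - 1], []).append(kmer)
    suffixes = {kmer[1:] for kmer in kmers}
    # The first k-mer is the only one whose prefix is no other k-mer's suffix.
    starts = [kmer for kmer in kmers if kmer[:k - 1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting oligonucleotide")
    current = starts[0]
    seq, used = current, 1
    while current[1:] in by_prefix:
        candidates = by_prefix[current[1:]]
        if len(candidates) != 1:
            raise ValueError("branch in overlap graph; sequence not unique")
        current = candidates[0]
        seq += current[-1]
        used += 1
        if used > len(kmers):
            raise ValueError("cycle in overlap graph")
    if used != len(kmers):
        raise ValueError("some oligonucleotides were not used")
    return seq


target = "ATGGCGTGCA"  # made-up example sequence
print(reconstruct(spectrum(target, 4), 4))  # ATGGCGTGCA
```

    With k = 3 the same target yields two oligonucleotides starting with TG, so the chain branches and `reconstruct` raises an error; longer probes resolve this particular ambiguity.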

  17. A Glance at Microsatellite Motifs from 454 Sequencing Reads of Watermelon Genomic DNA

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A single 454 (Life Sciences Sequencing Technology) run of Charleston Gray watermelon (Citrullus lanatus var. lanatus) genomic DNA was performed and sequence data were assembled. A large scale identification of simple sequence repeat (SSR) was performed and SSR sequence data were used for the develo...
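
    The record does not say which tool was used to identify SSRs. As a minimal sketch of the general idea, the Python function below finds perfect tandem repeats with a regular expression; the name `find_ssrs`, its thresholds and the example sequence are assumptions made for illustration.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    The lazy quantifier prefers the shortest repeat unit; units shorter than
    min_unit (mononucleotide runs by default) are not reported as such.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    results = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        results.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return results


# Made-up sequence containing (AT)5 and (CAG)4.
print(find_ssrs("GGATATATATATCCAGCAGCAGCAGTT"))
# [(2, 'AT', 5), (13, 'CAG', 4)]
```

    Start positions are 0-based. Imperfect or compound repeats would need a more tolerant search than this exact-match pattern.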

  18. [Sequencing of low-molecular-weight DNA in blood plasma of irradiated rats].

    PubMed

    Vasilieva, I N; Bespalov, V G; Zinkin, V N; Podgornaya, O I

    2015-01-01

    Extracellular low-molecular-weight DNA in the blood of irradiated rats was sequenced for the first time. Screening the sequences against the DDBJ database revealed homology with various parts of the rodent genome. Sequences of low-molecular-weight DNA in rat plasma are enriched with G/C pairs and long interspersed elements relative to the rat genome. DNA sequences in the blood of rats irradiated at doses of 8 and 100 Gy differ markedly. Sequencing data for extracellular DNA from healthy humans and from humans with pathology were analyzed. DNA sequences of irradiated rats differ from the human ones in their abundance of long interspersed elements. This new knowledge lays the foundation for development of minimally invasive technologies for diagnosing the probability of pathology and monitoring the adaptive resources of people in extreme environments. PMID:25958466

  19. DUC-Curve, a highly compact 2D graphical representation of DNA sequences and its application in sequence alignment

    NASA Astrophysics Data System (ADS)

    Li, Yushuang; Liu, Qian; Zheng, Xiaoqi

    2016-08-01

    A highly compact and simple 2D graphical representation of DNA sequences, named DUC-Curve, is constructed by mapping the four nucleotides to a unit circle in a cyclic order. DUC-Curve can directly reveal nucleotide and dinucleotide compositions and microsatellite structure in DNA sequences. Moreover, it can also be used for DNA sequence alignment. Taking geometric center vectors of DUC-Curves as sequence descriptors, we perform similarity analysis on the first exons of β-globin genes of 11 species, oncogene TP53 of 27 species and 24 influenza A viruses, respectively. The reasonable results obtained illustrate that the proposed method is very effective in sequence comparison problems, and will at least play a complementary role in classification and clustering problems.
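
    The abstract does not give the exact DUC-Curve construction, so the Python sketch below only illustrates the general pattern it describes: nucleotides placed on a unit circle, a 2D curve built from them, and a geometric center used as a sequence descriptor. The placement in `UNIT` (A, C, G, T at 0°, 90°, 180°, 270°), the cumulative-walk curve and the Euclidean comparison are assumptions for illustration, not the published method.

```python
import math

# Assumed placement of the nucleotides on the unit circle (not from the paper).
UNIT = {"A": (1.0, 0.0), "C": (0.0, 1.0), "G": (-1.0, 0.0), "T": (0.0, -1.0)}


def curve(seq):
    """Cumulative walk: each base adds its unit vector to the current point."""
    x = y = 0.0
    points = []
    for base in seq.upper():
        dx, dy = UNIT[base]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def center(seq):
    """Geometric center (mean point) of the curve, used as a descriptor."""
    points = curve(seq)
    if not points:
        raise ValueError("empty sequence")
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def distance(seq_a, seq_b):
    """Euclidean distance between the two center descriptors."""
    return math.dist(center(seq_a), center(seq_b))


print(curve("ACGT"))             # [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
print(center("ACGT"))            # (0.5, 0.5)
print(distance("ACGT", "ACGT"))  # 0.0
```

    A two-number descriptor like this is compact but lossy: different sequences can share a center, which is consistent with the abstract's framing of the method as a complementary tool.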

  20. Water Mediates Recognition of DNA Sequence via Ionic Current Blockade in a Biological Nanopore.

    PubMed

    Bhattacharya, Swati; Yoo, Jejoong; Aksimentiev, Aleksei

    2016-04-26

    Electric field-driven translocation of DNA strands through biological nanopores has been shown to produce blockades of the nanopore ionic current that depend on the nucleotide composition of the strands. Coupling a biological nanopore MspA to a DNA processing enzyme has made DNA sequencing via measurement of ionic current blockades possible. Nevertheless, the physical mechanism enabling the DNA sequence readout has remained undetermined. Here, we report the results of all-atom molecular dynamics simulations that elucidated the physical mechanism of ionic current blockades in the biological nanopore MspA. We find that the amount of water displaced from the nanopore by the DNA strand determines the nanopore ionic current, whereas the steric and base-stacking properties of the DNA nucleotides determine the amount of water displaced. Unexpectedly, we find the effective force on DNA in MspA to undergo large fluctuations, which may produce insertion errors in the DNA sequence readout. PMID:27054820