Science.gov

Sample records for active dna sequences

  1. DNA Sequencing

    DOEpatents

    Tabor, Stanley; Richardson, Charles C.

    1995-04-25

    A method for sequencing a strand of DNA, including the steps of: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions favoring primer extension to form nucleic acid fragments complementary to the DNA to be sequenced; labelling the nucleic acid fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.

  2. Heterogeneity of mammalian DNA ligase detected on activity and DNA sequencing gels.

    PubMed Central

    Mezzina, M; Sarasin, A; Politi, N; Bertazzoni, U

    1984-01-01

    A new method to detect DNA ligase activity in situ after NaDodSO4 polyacrylamide gel electrophoresis has been developed. After renaturation of active polypeptides the ligase reaction occurs in situ by incubating the intact gel in the presence of Mg++ and ATP. Further treatment with alkaline phosphatase removes the unligated 5'-32P-end of oligo (dT) used as a substrate and active polypeptides having ligase activity are identified by autoradiography. Analysis on DNA sequencing gels of the oligo (dT) reaction products present in the activity bands ensures that the radioactive material detected in activity gels or in standard in vitro ligase assays corresponds unambiguously to a ligase activity. Using these methods, we have analysed the purified phage T4 DNA ligase, and the activities present in crude extracts and in purified fractions from monkey kidney (CV1-P) cells. The purified T4 enzyme yields one or two active peptides with Mr values of 60,000 and 70,000. Crude extracts from CV1-P cells contain several polypeptides having DNA ligase activity. Partial purification of these extracts shows that DNA ligase I isolated from hydroxylapatite column is enriched in polypeptides with Mr 200,000, 150,000 and 120,000, while DNA ligase II is enriched in those with Mr 60,000 and 70,000. PMID:6377238

  3. A human cellular sequence implicated in trk oncogene activation is DNA damage inducible

    SciTech Connect

    Ben-Ishai, R.; Scharf, R.; Sharon, R.; Kapten, I.

    1990-08-01

    Xeroderma pigmentosum cells, which are deficient in the repair of UV light-induced DNA damage, have been used to clone DNA-damage-inducible transcripts in human cells. The cDNA clone designated pC-5 hybridizes on RNA gel blots to a 1-kilobase transcript, which is moderately abundant in nontreated cells and whose synthesis is enhanced in human cells following UV irradiation or treatment with several other DNA-damaging agents. UV-enhanced transcription of C-5 RNA is transient and occurs at lower fluences and to a greater extent in DNA-repair-deficient than in DNA-repair-proficient cells. Southern blot analysis indicates that the C-5 gene belongs to a multigene family. A cDNA clone containing the complete coding sequence of C-5 was isolated. Sequence analysis revealed that it is homologous to a human cellular sequence encoding the amino-terminal activating sequence of the trk-2h chimeric oncogene. The presence of DNA-damage-responsive sequences at the 5' end of a chimeric oncogene could result in enhanced expression of the oncogene in response to carcinogens.

  4. Biases during DNA extraction of activated sludge samples revealed by high throughput sequencing.

    PubMed

    Guo, Feng; Zhang, Tong

    2013-05-01

    Standardization of DNA extraction is a fundamental issue of fidelity and comparability in investigations of environmental microbial communities. Commercial kits for soil or feces are often adopted for studies of activated sludge because of a lack of specific kits, but they have never been evaluated regarding their effectiveness and potential biases based on high throughput sequencing. In this study, seven common DNA extraction kits were evaluated, based on not only yield/purity but also sequencing results, using two activated sludge samples (two sub-samples each, i.e. ethanol-fixed and fresh, as-is). The results indicate that the bead-beating step is necessary for DNA extraction from activated sludge. The two kits without the bead-beating step yielded very low amounts of DNA, and the least abundant operational taxonomic units (OTUs), and significantly underestimated the Gram-positive Actinobacteria, Nitrospirae, Chloroflexi, and Alphaproteobacteria and overestimated Gammaproteobacteria, Deltaproteobacteria, Bacteroidetes, and the rare phyla whose cell walls might have been readily broken. Among the other five kits, FastDNA® SPIN Kit for Soil extracted the most and the purest DNA. Although the number of total OTUs obtained using this kit was not the highest, the abundant OTUs and abundance of Actinobacteria demonstrated its efficiency. The three MoBio kits and one ZR kit produced fair results, but had a relatively low DNA yield and/or less Actinobacteria-related sequences. Moreover, the 50 % ethanol fixation increased the DNA yield, but did not change the sequenced microbial community in a significant way. Based on the present study, the FastDNA SPIN kit for Soil is recommended for DNA extraction of activated sludge samples. More importantly, the selection of the DNA extraction kit must be done carefully if the samples contain dominant lysing-resistant groups, such as Actinobacteria and Nitrospirae.

  5. The Dynamics of DNA Sequencing.

    ERIC Educational Resources Information Center

    Morvillo, Nancy

    1997-01-01

    Describes a paper-and-pencil activity that helps students understand DNA sequencing and expands student understanding of DNA structure, replication, and gel electrophoresis. Appropriate for advanced biology students who are familiar with the Sanger method. (DDR)

  6. Substrate specificity and sequence-dependent activity of the Saccharomyces cerevisiae 3-methyladenine DNA glycosylase (Mag).

    PubMed

    Lingaraju, Gondichatnahalli M; Kartalou, Maria; Meira, Lisiane B; Samson, Leona D

    2008-06-01

    DNA glycosylases initiate base excision repair by first binding, then excising aberrant DNA bases. Saccharomyces cerevisiae encodes a 3-methyladenine (3MeA) DNA glycosylase, Mag, that recognizes 3MeA and various other DNA lesions including 1,N6-ethenoadenine (epsilon A), hypoxanthine (Hx) and abasic (AP) sites. In the present study, we explore the relative substrate specificity of Mag for these lesions and in addition, show that Mag also recognizes cisplatin cross-linked adducts, but does not catalyze their excision. Through competition binding and activity studies, we show that in the context of a random DNA sequence Mag binds epsilon A and AP-sites the most tightly, followed by the cross-linked 1,2-d(ApG) cisplatin adduct. While epsilon A binding and excision by Mag was robust in this sequence context, binding and excision of Hx was extremely poor. We further studied the recognition of epsilon A and Hx by Mag, when these lesions are present at different positions within A:T and G:C tracts. Overall, epsilon A was slightly less well excised from each position within the A:T and G:C tracts compared to excision from the random sequence, whereas Hx excision was greatly increased in these sequence contexts (by up to 7-fold) compared to the random sequence. However, given most sequence contexts, Mag had a clear preference for epsilon A relative to Hx, except in the TTXTT (X=epsilon A or Hx) sequence context from which Mag removed both lesions with almost equal efficiency. We discuss how DNA sequence context affects base excision by various 3MeA DNA glycosylases.

  7. Gene activation properties of a mouse DNA sequence isolated by expression selection.

    PubMed Central

    von Hoyningen-Huene, V; Norbury, C; Griffiths, M; Fried, M

    1986-01-01

    The MES-1 element was previously isolated from restricted total mouse cellular DNA by "expression selection"--the ability to reactivate expression of a test gene devoid of its 5' enhancer sequences. MES-1 has been tested in long-term transformation and short-term CAT expression assays. In both assays MES-1 is active independent of orientation and at a distance when placed 5' to the test gene. The element is active with heterologous promoters and functions efficiently in both rat and mouse cells. MES-1 activates expression by increasing transcription from the test gene's own start (cap) site. Thus the expression selection technique can be used for the isolation of DNA sequences with enhancer-like properties from total cellular DNA. PMID:3016657

  8. DNA sequences that activate isocitrate lyase gene expression during late embryogenesis and during postgerminative growth.

    PubMed Central

    Zhang, J Z; Santes, C M; Engel, M L; Gasser, C S; Harada, J J

    1996-01-01

    We analyzed DNA sequences that regulate the expression of an isocitrate lyase gene from Brassica napus L. during late embryogenesis and during postgerminative growth to determine whether glyoxysomal function is induced by a common mechanism at different developmental stages. beta-Glucuronidase constructs were used both in transient expression assays in B. napus and in transgenic Arabidopsis thaliana to identify the segments of the isocitrate lyase 5' flanking region that influence promoter activity. DNA sequences that play the principal role in activating the promoter during post-germinative growth are located more than 1,200 bp upstream of the gene. Distinct DNA sequences that were sufficient for high-level expression during late embryogenesis but only low-level expression during postgerminative growth were also identified. Other parts of the 5' flanking region increased promoter activity both in developing seed and in seedlings. We conclude that a combination of elements is involved in regulating the isocitrate lyase gene and that distinct DNA sequences play primary roles in activating the gene in embryos and in seedlings. These findings suggest that different signals contribute to the induction of glyoxysomal function during these two developmental stages. We also showed that some of the constructs were expressed differently in transient expression assays and in transgenic plants. PMID:8934622

  9. Pentaprobe: a comprehensive sequence for the one-step detection of DNA-binding activities.

    PubMed

    Kwan, Ann H Y; Czolij, Robert; Mackay, Joel P; Crossley, Merlin

    2003-10-15

    The rapid increase in the number of novel proteins identified in genome projects necessitates simple and rapid methods for assigning function. We describe a strategy for determining whether novel proteins possess typical sequence-specific DNA-binding activity. Many proteins bind recognition sequences of 5 bp or less. Given that there are 4^5 possible 5 bp sites, one might expect the length of sequence required to cover all possibilities would be 4^5 x 5 or 5120 nt. But by allowing overlaps, utilising both strands and using a computer algorithm to generate the minimum sequence, we find the length required is only 516 base pairs. We generated this sequence as six overlapping double-stranded oligonucleotides, termed pentaprobe, and used it in gel retardation experiments to assess DNA binding by both known and putative DNA-binding proteins from several protein families. We have confirmed binding by the zinc finger proteins BKLF, Eos and Pegasus, the Ets domain protein PU.1 and the treble clef N- and C-terminal fingers of GATA-1. We also showed that the N-terminal zinc finger domain of FOG-1 does not behave as a typical DNA-binding domain. Our results suggest that pentaprobe, and related sequences such as hexaprobe, represent useful tools for probing protein function.
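
The counting argument in this abstract can be checked with a short sketch. The code below is not the authors' algorithm, which the abstract does not describe. It assumes a standard de Bruijn construction to show how overlaps shrink the one-strand length, and then counts reverse-complement classes to show why about 516 bp is the floor once both strands are used. All function and variable names are illustrative.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence read on the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]


def de_bruijn(alphabet: str, n: int) -> str:
    """Cyclic de Bruijn sequence B(k, n), built by the standard
    Lyndon-word concatenation; every n-mer occurs exactly once."""
    k = len(alphabet)
    a = [0] * (k * n)
    out = []

    def db(t: int, p: int) -> None:
        if t > n:
            if n % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in out)


def sites_in(seq: str, n: int) -> set:
    """All n-mers readable on the given strand of a linear sequence."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


SITE_LEN = 5
all_sites = {"".join(p) for p in product("ACGT", repeat=SITE_LEN)}
n_sites = len(all_sites)                      # 4^5 sites
naive_length = n_sites * SITE_LEN             # no overlaps allowed

# One strand, overlaps allowed: linearise the cyclic de Bruijn sequence.
cyclic = de_bruijn("ACGT", SITE_LEN)
one_strand = cyclic + cyclic[:SITE_LEN - 1]

# Both strands: a site and its reverse complement need only appear once.
strand_classes = {min(s, reverse_complement(s)) for s in all_sites}
two_strand_lower_bound = len(strand_classes) + SITE_LEN - 1

print(n_sites, naive_length, len(one_strand), two_strand_lower_bound)
```

Because no odd-length site can equal its own reverse complement, the 1024 sites pair off into 512 classes, and a linear sequence needs at least 512 + 4 = 516 positions to show one member of each class. That matches the 516 bp reported in the abstract.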

  10. DNA sequencing conference, 2

    SciTech Connect

    Cook-Deegan, R.M.; Venter, J.C.; Gilbert, W.; Mulligan, J.; Mansfield, B.K.

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  11. Automated DNA Sequencing System

    SciTech Connect

    Armstrong, G.A.; Ekkebus, C.P.; Hauser, L.J.; Kress, R.L.; Mural, R.J.

    1999-04-25

    Oak Ridge National Laboratory (ORNL) is developing a core DNA sequencing facility to support biological research endeavors at ORNL and to conduct basic sequencing automation research. This facility is novel because its development is based on existing standard biology laboratory equipment; thus, the development process is of interest to the many small laboratories trying to use automation to control costs and increase throughput. Before automation, biology laboratory personnel purified DNA, completed cycle sequencing, and prepared 96-well sample plates with commercially available hardware designed specifically for each step in the process. Following purification and thermal cycling, an automated sequencing machine was used for the sequencing. A technician handled all movement of the 96-well sample plates between machines. To automate the process, ORNL is adding a CRS Robotics A-465 arm, ABI 377 sequencing machine, automated centrifuge, automated refrigerator, and possibly an automated SpeedVac. The entire system will be integrated with one central controller that will direct each machine and the robot. The goal of this system is to completely automate the sequencing procedure from bacterial cell samples through ready-to-be-sequenced DNA and ultimately to completed sequence. The system will be flexible and will accommodate different chemistries than existing automated sequencing lines. The system will be expanded in the future to include colony picking and/or actual sequencing. This discrete event, DNA sequencing system will demonstrate that smaller sequencing labs can achieve cost-effective automation as the laboratory grows.

  12. cDNA sequence and chromosomal localization of human enterokinase, the proteolytic activator of trypsinogen.

    PubMed

    Kitamoto, Y; Veile, R A; Donis-Keller, H; Sadler, J E

    1995-04-11

    Enterokinase is a serine protease of the duodenal brush border membrane that cleaves trypsinogen and produces active trypsin, thereby leading to the activation of many pancreatic digestive enzymes. Overlapping cDNA clones that encode the complete human enterokinase amino acid sequence were isolated from a human intestine cDNA library. Starting from the first ATG codon, the composite 3696 nt cDNA sequence contains an open reading frame of 3057 nt that encodes a 784 amino acid heavy chain followed by a 235 amino acid light chain; the two chains are linked by at least one disulfide bond. The heavy chain contains a potential N-terminal myristoylation site, a potential signal anchor sequence near the amino terminus, and six structural motifs that are found in otherwise unrelated proteins. These domains resemble motifs of the LDL receptor (two copies), complement component C1r (two copies), the metalloprotease meprin (one copy), and the macrophage scavenger receptor (one copy). The enterokinase light chain is homologous to the trypsin-like serine proteinases. These structural features are conserved among human, bovine, and porcine enterokinase. By Northern blotting, a 4.4 kb enterokinase mRNA was detected only in small intestine. The enterokinase gene was localized to human chromosome 21q21 by fluorescence in situ hybridization.
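
The lengths quoted in this abstract can be cross-checked with a few lines of arithmetic. This is only a consistency sketch: the constant names are illustrative, and it assumes the standard three nucleotides per codon.

```python
# Figures taken from the abstract above.
CDNA_NT = 3696          # composite cDNA length from the first ATG
ORF_NT = 3057           # open reading frame
HEAVY_CHAIN_AA = 784
LIGHT_CHAIN_AA = 235
NT_PER_CODON = 3

total_aa = HEAVY_CHAIN_AA + LIGHT_CHAIN_AA    # residues in both chains
coding_nt = total_aa * NT_PER_CODON           # nucleotides needed to encode them
remaining_nt = CDNA_NT - ORF_NT               # cDNA outside the reading frame

print(total_aa, coding_nt, remaining_nt)
```

The two chains together account for the whole reading frame (1019 residues x 3 = 3057 nt), which leaves 639 nt of the composite cDNA outside it.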

  13. DNA sequence, structure, and tyrosine kinase activity of the Drosophila melanogaster abelson proto-oncogene homolog

    SciTech Connect

    Henkemeyer, M.J.; Bennett, R.L.; Gertler, F.B.; Hoffmann, F.M.

    1988-02-01

    The authors report their molecular characterization of the Drosophila melanogaster Abelson gene (abl), a gene in which recessive loss-of-function mutations result in lethality at the pupal stage of development. This essential gene consists of 10 exons extending over 26 kilobase pairs of genomic DNA. The DNA sequence encodes a protein of 1,520 amino acids with strong sequence similarity to the human c-abl proto-oncogene beginning in the type 1b 5' exon and extending through the region essential for tyrosine kinase activity. When the tyrosine kinase homologous region was expressed in Escherichia coli, phosphorylation of proteins on tyrosine residues was observed with an antiphosphotyrosine antibody. These results show that the abl gene is highly conserved through evolution and encodes a functional tyrosine protein kinase required for Drosophila development.

  14. DNA Sequencing apparatus

    DOEpatents

    Tabor, Stanley; Richardson, Charles C.

    1992-01-01

    An automated DNA sequencing apparatus having a reactor for providing at least two series of DNA products formed from a single primer and a DNA strand, each DNA product of a series differing in molecular weight and having a chain terminating agent at one end; separating means for separating the DNA products to form a series of bands, the intensity of substantially all nearby bands in a different series being different, band reading means for determining the position an This invention was made with government support including a grant from the U.S. Public Health Service, contract number AI-06045. The U.S. government has certain rights in the invention.

  15. Indirect DNA Sequence Recognition and Its Impact on Nuclease Cleavage Activity.

    PubMed

    Lambert, Abigail R; Hallinan, Jazmine P; Shen, Betty W; Chik, Jennifer K; Bolduc, Jill M; Kulshina, Nadia; Robins, Lori I; Kaiser, Brett K; Jarjour, Jordan; Havens, Kyle; Scharenberg, Andrew M; Stoddard, Barry L

    2016-06-07

    LAGLIDADG meganucleases are DNA cleaving enzymes used for genome engineering. While their cleavage specificity can be altered using several protein engineering and selection strategies, their overall targetability is limited by highly specific indirect recognition of the central four base pairs within their recognition sites. In order to examine the physical basis of indirect sequence recognition and to expand the number of such nucleases available for genome engineering, we have determined the target sites, DNA-bound structures, and central four cleavage fidelities of nine related enzymes. Subsequent crystallographic analyses of a meganuclease bound to two noncleavable target sites, each containing a single inactivating base pair substitution at its center, indicate that a localized slip of the mutated base pair causes a small change in the DNA backbone conformation that results in a loss of metal occupancy at one binding site, eliminating cleavage activity.

  16. Mitochondrial DNA sequence variation is associated with free-living activity energy expenditure in the elderly.

    PubMed

    Tranah, Gregory J; Lam, Ernest T; Katzman, Shana M; Nalls, Michael A; Zhao, Yiqiang; Evans, Daniel S; Yokoyama, Jennifer S; Pawlikowska, Ludmila; Kwok, Pui-Yan; Mooney, Sean; Kritchevsky, Stephen; Goodpaster, Bret H; Newman, Anne B; Harris, Tamara B; Manini, Todd M; Cummings, Steven R

    2012-09-01

    The decline in activity energy expenditure underlies a range of age-associated pathological conditions, neuromuscular and neurological impairments, disability, and mortality. The majority (90%) of the energy needs of the human body are met by mitochondrial oxidative phosphorylation (OXPHOS). OXPHOS is dependent on the coordinated expression and interaction of genes encoded in the nuclear and mitochondrial genomes. We examined the role of mitochondrial genomic variation in free-living activity energy expenditure (AEE) and physical activity levels (PAL) by sequencing the entire (~16.5 kilobases) mtDNA from 138 Health, Aging, and Body Composition Study participants. Among the common mtDNA variants, the hypervariable region 2 m.185G>A variant was significantly associated with AEE (p=0.001) and PAL (p=0.0005) after adjustment for multiple comparisons. Several unique nonsynonymous variants were identified in the extremes of AEE with some occurring at highly conserved sites predicted to affect protein structure and function. Of interest is the p.T194M, CytB substitution in the lower extreme of AEE occurring at a residue in the Qi site of complex III. Among participants with low activity levels, the burden of singleton variants was 30% higher across the entire mtDNA and OXPHOS complex I when compared to those having moderate to high activity levels. A significant pooled variant association across the hypervariable 2 region was observed for AEE and PAL. These results suggest that mtDNA variation is associated with free-living AEE in older persons and may generate new hypotheses by which specific mtDNA complexes, genes, and variants may contribute to the maintenance of activity levels in late life.

  17. Qualitative analysis of sequence specific binding of flavones to DNA using restriction endonuclease activity assays.

    PubMed

    Duran, Elizabeth; Ramsauer, Victoria P; Ballester, Maria; Torrenegra, Ruben D; Rodriguez, Oscar E; Winkle, Stephen A

    2013-08-01

    Flavones, found in nature as secondary plant metabolites, have shown efficacy as anti-cancer agents. We have examined the binding of two flavones, 5,7-dihydroxy-3,6,8-trimethoxy-2-phenyl-4H-chromen-4-one (5,7-dihydroxy-3,6,8-trimethoxy flavone; FlavA) and 3,5-dihydroxy-6,7,8-trimethoxy-2-phenyl-4H-chromen-4-one (3,5-dihydroxy-6,7,8-trimethoxy flavone; FlavB), to phiX174 RF DNA using restriction enzyme activity assays employing the restriction enzymes Alw44, AvaII, BssHII, DraI, MluI, NarI, NciI, NruI, PstI, and XhoI. These enzymes possess differing target and flanking sequences allowing for observation of sequence specificity analysis. Using restriction enzymes that cleave once with a mixture of supercoiled and relaxed DNA substrates provides for observation of topological effects on binding. FlavA and FlavB show differing sequence specificities in their respective binding to phiX. For example, with relaxed DNA, FlavA shows inhibition of cleavage with DraI (reaction site (5') TTTAAA) but not BssHII ((5') GCGCGC) while FlavB shows the opposite results. Evidence for topological specificity is also observed. Molecular modeling and conformational analysis of the flavones suggests that the phenyl ring of FlavB is coplanar with the flavonoid ring while the phenyl ring of FlavA is at an angle relative to the flavonoid ring. This may account for aspects of the observed sequence and topological specificities in the effects on restriction enzyme activity.
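
As a rough illustration of how "enzymes that cleave once" could be picked out computationally, the sketch below scans a sequence for the two recognition sites quoted in the abstract. The test sequence is a made-up toy string, not phiX174, and the helper names are hypothetical. Both sites are palindromic, so searching one strand is enough here.

```python
# Recognition sites as quoted in the abstract (5' to 3').
RECOGNITION_SITES = {"DraI": "TTTAAA", "BssHII": "GCGCGC"}


def find_sites(sequence: str, site: str) -> list:
    """Return 0-based start positions of every (possibly overlapping)
    occurrence of a recognition site in a linear sequence."""
    sequence, site = sequence.upper(), site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def single_cutters(sequence: str, sites: dict) -> list:
    """Names of enzymes whose site occurs exactly once in the sequence."""
    return sorted(name for name, site in sites.items()
                  if len(find_sites(sequence, site)) == 1)


# Hypothetical toy substrate, for illustration only.
toy_sequence = "GGATTTAAACCGCGCGCTTAGCGCGCAA"

dra_positions = find_sites(toy_sequence, RECOGNITION_SITES["DraI"])
bss_positions = find_sites(toy_sequence, RECOGNITION_SITES["BssHII"])
print(dra_positions, bss_positions, single_cutters(toy_sequence, RECOGNITION_SITES))
```

A real analysis would load the phiX174 sequence from a curated record and would also need to treat the RF DNA as circular, which this linear scan does not do.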

  18. Analysis of the relationship between ribosomal DNA ITS sequences and active components in Rhodiola plants.

    PubMed

    Zhang, D J; Yuan, W T; Li, M T; Zhang, Y H

    2016-12-23

    Rhodiola plants are a valuable resource in traditional Chinese medicine. The objective of this study was to evaluate the correlation between ribosomal DNA internal transcribed spacer (ITS) sequences and the three active components in Rhodiola plants. For this, we determined ITS sequence polymorphisms and the concentrations of active components salidroside, tyrosol, and gallic acid in different Rhodiola species from the Tibetan Plateau. In a total of 23 Rhodiola samples, 16 different haplotypes were defined based on their ITS sequences. Analysis of the active components in these same samples revealed that salidroside was not detected in species with haplotypes H4, H5, or H10, tyrosol was not detected with haplotypes H3, H5, H7, H10, H14, or H15, and gallic acid was detected with all haplotypes except H14 and H15. In addition, the concentrations of salidroside, tyrosol and gallic acid varied between samples with different haplotypes as well as those with the same haplotype, implying that no significant correlation exists between haplotype and salidroside, tyrosol or gallic acid concentrations. However, a statistically significant positive correlation was observed among these three active components.

  19. GT-2: in vivo transcriptional activation activity and definition of novel twin DNA binding domains with reciprocal target sequence selectivity.

    PubMed

    Ni, M; Dehesh, K; Tepperman, J M; Quail, P H

    1996-06-01

    GT-2 is a novel DNA binding protein that interacts with a triplet of functionally defined, positively acting GT-box motifs (GT1-bx, GT2-bx, and GT3-bx) in the rice phytochrome A gene (PHYA) promoter. Data from a transient transfection assay used here show that recombinant GT-2 enhanced transcription from both homologous and heterologous GT-box-containing promoters, thereby indicating that this protein can function as a transcriptional activator in vivo. Previously, we have shown that GT-2 contains separate DNA binding determinants in its N- and C-terminal halves, with binding site preferences for the GT3-bx and GT2-bx promoter motifs, respectively. Here, we demonstrate that the minimal DNA binding domains reside within dual 90-amino acid polypeptide segments encompassing duplicated sequences, termed trihelix regions, in each half of the molecule, plus 15 additional immediately adjacent amino acids downstream. These minimal binding domains retained considerable target sequence selectivity for the different GT-box motifs, but this selectivity was enhanced by a separate polypeptide segment farther downstream on the C-terminal side of each trihelix region. Therefore, the data indicate that the twin DNA binding domains of GT-2 each consist of a general GT-box recognition core with intrinsic differential binding activity toward closely related target motifs and a modified sequence conferring higher resolution reciprocal selectivity between these motifs.

  20. Bulged Invader probes: activated duplexes for mixed-sequence dsDNA recognition with improved thermodynamic and kinetic profiles.

    PubMed

    Guenther, Dale C; Karmakar, Saswata; Hrdlicka, Patrick J

    2015-10-18

    Double-stranded oligonucleotides with +1 interstrand zipper arrangements of intercalator-functionalized nucleotides are energetically activated for recognition of mixed-sequence double-stranded DNA. Incorporation of nonyl (C9) bulges at specific positions of these probes results in more highly affine (>5-fold), faster (>4-fold) and more persistent dsDNA recognition relative to conventional Invader probes.

  1. Transposon facilitated DNA sequencing

    SciTech Connect

    Berg, D.E.; Berg, C.M.; Huang, H.V.

    1990-01-01

    The purpose of this research is to investigate and develop methods that exploit the power of bacterial transposable elements for large scale DNA sequencing. Our premise is that the use of transposons to put primer binding sites randomly in target DNAs should provide access to all portions of large DNA fragments, without the inefficiencies of methods involving random subcloning and attendant repetitive sequencing, or of sequential synthesis of many oligonucleotide primers that are used to march systematically along a DNA molecule. Two unrelated bacterial transposons, Tn5 and γδ, are being used because they have both proven useful for molecular analyses, and because they differ sufficiently in mechanism and specificity of transposition to merit parallel development.

  2. Cloning and sequence analysis of a cDNA clone coding for the mouse GM2 activator protein.

    PubMed Central

    Bellachioma, G; Stirling, J L; Orlacchio, A; Beccari, T

    1993-01-01

    A cDNA (1.1 kb) containing the complete coding sequence for the mouse GM2 activator protein was isolated from a mouse macrophage library using a cDNA for the human protein as a probe. There was a single ATG located 12 bp from the 5' end of the cDNA clone followed by an open reading frame of 579 bp. Northern blot analysis of mouse macrophage RNA showed that there was a single band with a mobility corresponding to a size of 2.3 kb. We deduce from this that the mouse mRNA, in common with the mRNA for the human GM2 activator protein, has a long 3' untranslated sequence of approx. 1.7 kb. Alignment of the mouse and human deduced amino acid sequences showed 68% identity overall and 75% identity for the sequence on the C-terminal side of the first 31 residues, which in the human GM2 activator protein contains the signal peptide. Hydropathicity plots showed great similarity between the mouse and human sequences even in regions of low sequence similarity. There is a single N-glycosylation site in the mouse GM2 activator protein sequence (Asn151-Phe-Thr) which differs in its location from the single site reported in the human GM2 activator protein sequence (Asn63-Val-Thr). PMID:7689829

  3. DNA-stacking interactions determine the sequence specificity of the deoxyribonuclease activity of 1,10-phenanthroline-copper ion.

    PubMed

    Schaeffer, F; Rimsky, S; Spassky, A

    1996-07-26

    Bis(1,10-phenanthroline)-copper(I) ion (OP2Cu+) binds reversibly to B-DNA and makes single-stranded cuts by oxidative attack on the deoxyribose moiety. The deoxyribonuclease activity is sequence-dependent yet not nucleotide-specific at the cutting site. OP2Cu+ sequence specificity was analysed in terms of local variations of DNA stability. Kinetic constants of strand cleavage were measured at sequence positions on the two strands and converted into activation free energies of the cleavage reaction. DNA unwinding free energies were calculated from the base sequence using B-DNA stacking parameters for calculations. The two free-energy variations were statistically compared for a series of DNA restriction fragments bearing the binding sites of regulatory proteins and representing a total of 345 DNA base positions. This study shows that the mean activation free energy of strand cleavage at a pair of opposing sugars across the DNA minor groove varies like the unwinding free energy of the DNA sequence delimited by opposing sugars (3 to 4 bp). A statistical equality between the two free-energy variations is demonstrated when considering the sum of the two cleavage events at the opposing sugars. Systematic deviations between the two free-energy distributions were observed at specific sequences, including polypurine-polypyrimidine tracts (AnTm/AmTn, CnTmCp/GpAmGn), alternating purine-pyrimidine tracts ((TA)n/(TA)n, (TG)n/(CA)n) and at certain G+C-rich triplets (GGC, GCC and CGC). The physical significance of these observations is discussed and a model of OP2Cu+ binding and cleavage specificity based on the free-energy equality is proposed.
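
The abstract does not say which relation was used to convert cleavage rate constants into activation free energies. A common choice is the Eyring equation from transition-state theory, and the sketch below assumes it, with a transmission coefficient of 1 and first-order rate constants in s^-1. The function name and the example rate constants are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618       # J / (mol K)
BOLTZMANN = 1.380649e-23         # J / K
PLANCK = 6.62607015e-34          # J s


def activation_free_energy(rate_constant_per_s: float,
                           temperature_k: float = 298.15) -> float:
    """Eyring estimate of the activation free energy in J/mol:
    dG = R T ln(k_B T / (h k)), assuming a transmission coefficient of 1."""
    if rate_constant_per_s <= 0:
        raise ValueError("rate constant must be positive")
    prefactor = BOLTZMANN * temperature_k / PLANCK
    return GAS_CONSTANT * temperature_k * math.log(prefactor / rate_constant_per_s)


# Hypothetical cleavage rate constants at two sequence positions (s^-1).
slow_site, fast_site = 1.0e-4, 1.0e-3
difference = activation_free_energy(slow_site) - activation_free_energy(fast_site)
print(round(difference / 1000, 2), "kJ/mol")
```

Under this assumption only ratios of rate constants matter when positions are compared: a tenfold difference in cleavage rate corresponds to R T ln 10, about 5.7 kJ/mol at 298 K, whatever the absolute rates are.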

  4. DNA sequences encoding osteoinductive products

    SciTech Connect

    Wang, E.A.; Wozney, J.M.; Rosen, V.

    1991-05-07

    This patent describes an isolated DNA sequence encoding an osteoinductive protein, the DNA sequence comprising a coding sequence. It comprises: nucleotide No.1 through nucleotide No.387, nucleotide No.356 through nucleotide No.1543, nucleotide No.402 through nucleotide No.1626, naturally occurring allelic sequences and equivalent degenerative codon sequences and sequences which hybridize to any of sequences under stringent hybridization conditions; and encode a protein characterized by the ability to induce the formation of bone and/or cartilage.

  5. Nucleosome positioning and kinetics near transcription-start-site barriers are controlled by interplay between active remodeling and DNA sequence.

    PubMed

    Parmar, Jyotsana J; Marko, John F; Padinhateeri, Ranjith

    2014-01-01

    We investigate how DNA sequence, ATP-dependent chromatin remodeling and nucleosome-depleted 'barriers' co-operate to determine the kinetics of nucleosome organization, in a stochastic model of nucleosome positioning and dynamics. We find that 'statistical' positioning of nucleosomes against 'barriers', hypothesized to control chromatin structure near transcription start sites, requires active remodeling and therefore cannot be described using equilibrium statistical mechanics. We show that, unlike steady-state occupancy, DNA site exposure kinetics near a barrier is dominated by DNA sequence rather than by proximity to the barrier itself. The timescale for formation of positioning patterns near barriers is proportional to the timescale for active nucleosome eviction. We also show that there are strong gene-to-gene variations in nucleosome positioning near barriers, which are eliminated by averaging over many genes. Our results suggest that measurement of nucleosome kinetics can reveal information about sequence-dependent regulation that is not apparent in steady-state nucleosome occupancy.

  6. Targeting of the activation-induced cytosine deaminase is strongly influenced by the sequence and structure of the targeted DNA.

    PubMed

    Shen, Hong Ming; Ratnam, Sarayu; Storb, Ursula

    2005-12-01

    Activation-induced deaminase (AID) initiates immunoglobulin somatic hypermutation (SHM). Since in vitro AID was shown to deaminate cytosines on single-stranded DNA or the nontranscribed strand, it remained a puzzle how in vivo AID targets both DNA strands equally. Here we investigate the roles of transcription and DNA sequence in cytosine deamination. Strikingly different results are found with different substrates. Depending on the target sequence, the transcribed DNA strand is targeted as well as or better than the nontranscribed strand. The preferential targeting is not related to the frequency of AID hot spots. Comparison of cytosine deamination by AID and bisulfite shows different targeting patterns suggesting that AID may locally unwind the DNA. We conclude that somatic hypermutation on both DNA strands is the natural outcome of AID action on a transcribed gene; furthermore, the DNA sequence or structure and topology play major roles in targeting AID in vitro and in vivo. On the other hand, the lack of mutations in the first approximately 100 nucleotides and beyond about 1 to 2 kb from the promoter of immunoglobulin genes during SHM must be due to special conditions of transcription and chromatin in vivo.

  7. Biosensors for DNA sequence detection

    NASA Technical Reports Server (NTRS)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  8. Efficient DNA fingerprinting based on the targeted sequencing of active retrotransposon insertion sites using a bench-top high-throughput sequencing platform.

    PubMed

    Monden, Yuki; Yamamoto, Ayaka; Shindo, Akiko; Tahara, Makoto

    2014-10-01

    In many crop species, DNA fingerprinting is required for the precise identification of cultivars to protect the rights of breeders. Many families of retrotransposons have multiple copies throughout the eukaryotic genome and their integrated copies are inherited genetically. Thus, their insertion polymorphisms among cultivars are useful for DNA fingerprinting. In this study, we conducted a DNA fingerprinting based on the insertion polymorphisms of active retrotransposon families (Rtsp-1 and LIb) in sweet potato. Using 38 cultivars, we identified 2,024 insertion sites in the two families with an Illumina MiSeq sequencing platform. Of these insertion sites, 91.4% appeared to be polymorphic among the cultivars and 376 cultivar-specific insertion sites were identified, which were converted directly into cultivar-specific sequence-characterized amplified region (SCAR) markers. A phylogenetic tree was constructed using these insertion sites, which corresponded well with known pedigree information, thereby indicating their suitability for genetic diversity studies. Thus, the genome-wide comparative analysis of active retrotransposon insertion sites using the bench-top MiSeq sequencing platform is highly effective for DNA fingerprinting without any requirement for whole genome sequence information. This approach may facilitate the development of practical polymerase chain reaction-based cultivar diagnostic system and could also be applied to the determination of genetic relationships.
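    The classification of insertion sites described in this abstract can be illustrated with a small sketch. It assumes each cultivar is represented by a set of insertion-site labels; the function name, the data layout and the toy labels are hypothetical and are not taken from the study. A site is treated as polymorphic when it is absent from at least one cultivar, and as cultivar-specific when it occurs in exactly one.

```python
from collections import Counter


def classify_insertion_sites(sites_by_cultivar):
    """Return (polymorphic sites, cultivar-specific sites per cultivar).

    sites_by_cultivar: dict mapping cultivar name -> set of site labels
    (a hypothetical layout chosen for illustration only).
    """
    # Count in how many cultivars each insertion site was observed.
    counts = Counter(
        site for sites in sites_by_cultivar.values() for site in set(sites)
    )
    n_cultivars = len(sites_by_cultivar)

    # Polymorphic: missing from at least one cultivar.
    polymorphic = {site for site, c in counts.items() if c < n_cultivars}

    # Cultivar-specific: present in exactly one cultivar.
    specific = {
        cultivar: {site for site in sites if counts[site] == 1}
        for cultivar, sites in sites_by_cultivar.items()
    }
    return polymorphic, specific


if __name__ == "__main__":
    toy = {"cv1": {"s1", "s2", "s3"}, "cv2": {"s1", "s2", "s4"}, "cv3": {"s1", "s5"}}
    polymorphic, specific = classify_insertion_sites(toy)
    all_sites = set().union(*toy.values())
    print(len(polymorphic) / len(all_sites))  # fraction of polymorphic sites
    print(specific)
```

    In this toy layout, the cultivar-specific sets play the role of candidate marker sites; converting them into SCAR markers is laboratory work that the sketch does not attempt to model.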

  9. Efficient DNA Fingerprinting Based on the Targeted Sequencing of Active Retrotransposon Insertion Sites Using a Bench-Top High-Throughput Sequencing Platform

    PubMed Central

    Monden, Yuki; Yamamoto, Ayaka; Shindo, Akiko; Tahara, Makoto

    2014-01-01

    In many crop species, DNA fingerprinting is required for the precise identification of cultivars to protect the rights of breeders. Many families of retrotransposons have multiple copies throughout the eukaryotic genome and their integrated copies are inherited genetically. Thus, their insertion polymorphisms among cultivars are useful for DNA fingerprinting. In this study, we conducted a DNA fingerprinting based on the insertion polymorphisms of active retrotransposon families (Rtsp-1 and LIb) in sweet potato. Using 38 cultivars, we identified 2,024 insertion sites in the two families with an Illumina MiSeq sequencing platform. Of these insertion sites, 91.4% appeared to be polymorphic among the cultivars and 376 cultivar-specific insertion sites were identified, which were converted directly into cultivar-specific sequence-characterized amplified region (SCAR) markers. A phylogenetic tree was constructed using these insertion sites, which corresponded well with known pedigree information, thereby indicating their suitability for genetic diversity studies. Thus, the genome-wide comparative analysis of active retrotransposon insertion sites using the bench-top MiSeq sequencing platform is highly effective for DNA fingerprinting without any requirement for whole genome sequence information. This approach may facilitate the development of practical polymerase chain reaction-based cultivar diagnostic system and could also be applied to the determination of genetic relationships. PMID:24935865

  10. DNA sequence-selective C8-linked pyrrolobenzodiazepine-heterocyclic polyamide conjugates show anti-tubercular-specific activities.

    PubMed

    Brucoli, Federico; Guzman, Juan D; Basher, Mohammad A; Evangelopoulos, Dimitrios; McMahon, Eleanor; Munshi, Tulika; McHugh, Timothy D; Fox, Keith R; Bhakta, Sanjib

    2016-12-01

    New chemotherapeutic agents with novel mechanisms of action are in urgent need to combat the tuberculosis pandemic. A library of 12 C8-linked pyrrolo[2,1-c][1,4]benzodiazepine (PBD)-heterocyclic polyamide conjugates (1-12) was evaluated for anti-tubercular activity and DNA sequence selectivity. The PBD conjugates were screened against slow-growing Mycobacterium bovis Bacillus Calmette-Guérin and M. tuberculosis H37Rv, and fast-growing Escherichia coli, Pseudomonas putida and Rhodococcus sp. RHA1 bacteria. DNase I footprinting and DNA thermal denaturation experiments were used to determine the molecules' DNA recognition properties. The PBD conjugates were highly selective for the mycobacterial strains and exhibited significant growth inhibitory activity against the pathogenic M. tuberculosis H37Rv, with compound 4 showing MIC values (MIC=0.08 mg l(-1)) similar to those of rifampin and isoniazid. DNase I footprinting results showed that the PBD conjugates with three heterocyclic moieties had enhanced sequence selectivity and produced larger footprints, with distinct cleavage patterns compared with the two-heterocyclic chain PBD conjugates. DNA melting experiments indicated a covalent binding of the PBD conjugates to two AT-rich DNA-duplexes containing either a central GGATCC or GTATAC sequence, and showed that the polyamide chains affect the interactions of the molecules with DNA. The PBD-C8 conjugates tested in this study have a remarkable anti-mycobacterial activity and can be further developed as DNA-targeted anti-tubercular drugs.

  11. Graphene nanodevices for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Heerema, Stephanie J.; Dekker, Cees

    2016-02-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with nanopores. Owing to its unique structure and properties, graphene provides interesting opportunities for the development of a new sequencing technology. In recent years, a wide range of creative ideas for graphene sequencers have been theoretically proposed and the first experimental demonstrations have begun to appear. Here, we review the different approaches to using graphene nanodevices for DNA sequencing, which involve DNA passing through graphene nanopores, nanogaps, and nanoribbons, and the physisorption of DNA on graphene nanostructures. We discuss the advantages and problems of each of these key techniques, and provide a perspective on the use of graphene in future DNA sequencing technology.

  12. Synthesis, anti-mycobacterial activity and DNA sequence-selectivity of a library of biaryl-motifs containing polyamides.

    PubMed

    Brucoli, Federico; Guzman, Juan D; Maitra, Arundhati; James, Colin H; Fox, Keith R; Bhakta, Sanjib

    2015-07-01

    The alarming rise of extensively drug-resistant tuberculosis (XDR-TB) strains compels the development of new molecules with novel modes of action to control this world health emergency. Distamycin analogues containing N-terminal biaryl-motifs 2(1-5)(1-7) were synthesised using a solution-phase approach and evaluated for their anti-mycobacterial activity and DNA-sequence selectivity. Thiophene dimer motif-containing polyamide 2(2,6) exhibited 10-fold higher inhibitory activity against Mycobacterium tuberculosis compared to distamycin and library member 2(5,7) showed high binding affinity for the 5'-ACATAT-3' sequence.

  13. Sequence independent amplification of DNA

    DOEpatents

    Bohlander, Stefan K.

    1998-01-01

    The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei.

  14. Sequence independent amplification of DNA

    DOEpatents

    Bohlander, S.K.

    1998-03-24

    The present invention is a rapid sequence-independent amplification procedure (SIA). Even minute amounts of DNA from various sources can be amplified independent of any sequence requirements of the DNA or any a priori knowledge of any sequence characteristics of the DNA to be amplified. This method allows, for example, the sequence independent amplification of microdissected chromosomal material and the reliable construction of high quality fluorescent in situ hybridization (FISH) probes from YACs or from other sources. These probes can be used to localize YACs on metaphase chromosomes but also--with high efficiency--in interphase nuclei. 25 figs.

  15. Chromosome specific repetitive DNA sequences

    DOEpatents

    Moyzis, Robert K.; Meyne, Julianne

    1991-01-01

    A method is provided for determining specific nucleotide sequences useful in forming a probe which can identify specific chromosomes, preferably through in situ hybridization within the cell itself. In one embodiment, chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members. This invention is the result of a contract with the Department of Energy (Contract No. W-7405-ENG-36).

  16. Activity of FEN1 endonuclease on nucleosome substrates is dependent upon DNA sequence but not flap orientation.

    PubMed

    Jagannathan, Indu; Pepenella, Sharon; Hayes, Jeffrey J

    2011-05-20

    We demonstrated previously that human FEN1 endonuclease, an enzyme involved in excising single-stranded DNA flaps that arise during Okazaki fragment processing and base excision repair, cleaves model flap substrates assembled into nucleosomes. Here we explore the effect of flap orientation with respect to the surface of the histone octamer on nucleosome structure and FEN1 activity in vitro. We find that orienting the flap substrate toward the histone octamer does not significantly alter the rotational orientation of two different nucleosome positioning sequences on the surface of the histone octamer but does cause minor perturbation of nucleosome structure. Surprisingly, flaps oriented toward the nucleosome surface are accessible to FEN1 cleavage in nucleosomes containing the Xenopus 5S positioning sequence. In contrast, neither flaps oriented toward nor away from the nucleosome surface are cleaved by the enzyme in nucleosomes containing the high-affinity 601 nucleosome positioning sequence. The data are consistent with a model in which sequence-dependent motility of DNA on the nucleosome is a major determinant of FEN1 activity. The implications of these findings for the activity of FEN1 in vivo are discussed.

  17. Biotools: Patenting DNA sequences

    SciTech Connect

    Yablonsky, M.D.; Hone, W.J.

    1995-07-01

    The decision, known as In re Deuel, rejects the PTO's interpretation of a previous decision of the Federal Circuit and makes it more possible that a "nucleic acid of a particular sequence" - commonly known as a gene sequence - may be patentable. 15 refs.

  18. DNA sequencing: bench to bedside and beyond

    PubMed Central

    Hutchison, Clyde A.

    2007-01-01

    Fifteen years elapsed between the discovery of the double helix (1953) and the first DNA sequencing (1968). Modern DNA sequencing began in 1977, with development of the chemical method of Maxam and Gilbert and the dideoxy method of Sanger, Nicklen and Coulson, and with the first complete DNA sequence (phage ϕX174), which demonstrated that sequence could give profound insights into genetic organization. Incremental improvements allowed sequencing of molecules >200 kb (human cytomegalovirus) leading to an avalanche of data that demanded computational analysis and spawned the field of bioinformatics. The US Human Genome Project spurred sequencing activity. By 1992 the first ‘sequencing factory’ was established, and others soon followed. The first complete cellular genome sequences, from bacteria, appeared in 1995 and other eubacterial, archaebacterial and eukaryotic genomes were soon sequenced. Competition between the public Human Genome Project and Celera Genomics produced working drafts of the human genome sequence, published in 2001, but refinement and analysis of the human genome sequence will continue for the foreseeable future. New ‘massively parallel’ sequencing methods are greatly increasing sequencing capacity, but further innovations are needed to achieve the ‘thousand dollar genome’ that many feel is prerequisite to personalized genomic medicine. These advances will also allow new approaches to a variety of problems in biology, evolution and the environment. PMID:17855400

  19. Length heterogeneity at conserved sequence block 2 in human mitochondrial DNA acts as a rheostat for RNA polymerase POLRMT activity

    PubMed Central

    Tan, Benedict G.; Wellesley, Frederick C.; Savery, Nigel J.; Szczelkun, Mark D.

    2016-01-01

    The guanine (G)-tract of conserved sequence block 2 (CSB 2) in human mitochondrial DNA can result in transcription termination due to formation of a hybrid G-quadruplex between the nascent RNA and the nontemplate DNA strand. This structure can then influence genome replication, stability and localization. Here we surveyed the frequency of variation in sequence identity and length at CSB 2 amongst human mitochondrial genomes and used in vitro transcription to assess the effects of this length heterogeneity on the activity of the mitochondrial RNA polymerase, POLRMT. In general, increased G-tract length correlated with increased termination levels. However, variation in the population favoured CSB 2 sequences which produced efficient termination while particularly weak or strong signals were avoided. For all variants examined, the 3′ end of the transcripts mapped to the same downstream sequences and were prevented from terminating by addition of the transcription factor TEFM. We propose that CSB 2 length heterogeneity allows variation in the efficiency of transcription termination without affecting the position of the products or the capacity for regulation by TEFM. PMID:27436287

  20. The sequence of sequencers: The history of sequencing DNA.

    PubMed

    Heather, James M; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way.

  1. Characterization of sequence elements from Malvastrum yellow vein betasatellite regulating promoter activity and DNA replication

    PubMed Central

    2012-01-01

    Background: Many monopartite begomoviruses are associated with betasatellites, but only a few promoters from these betasatellites have been isolated and studied. In this study, the βC1 promoter from Malvastrum yellow vein betasatellite (MYVB) was characterized and important sequence elements were identified to modulate promoter activity and replication of MYVB. Results: A 991 nucleotide (nt) fragment upstream of the translation start site of the βC1 open reading frame of MYVB and a series of deletions within this fragment were constructed and fused to the β-glucuronidase (GUS) and green fluorescent protein (GFP) reporter genes, respectively. Agrobacterium-mediated transient expression assays showed that the 991 nt fragment was functional and that a 28 nt region (between −390 nt and −418 nt), which includes a 5′UTR Py-rich stretch motif, was important for promoter activity. Replication assays using Nicotiana benthamiana leaf discs and whole plants showed that deletion of the 5′UTR Py-rich stretch impaired viral satellite replication in the presence of the helper virus. Transgenic assays demonstrated that the 991 nt fragment conferred a constitutive expression pattern in transgenic tobacco plants and that a 214 nt fragment at the 3'-end of this sequence was sufficient to drive this expression pattern. Conclusion: Our results showed that the βC1 promoter of MYVB displayed a constitutive expression pattern and a 5′UTR Py-rich stretch motif regulated both βC1 promoter activity and MYVB replication. PMID:23057573

  2. The influence of nucleotide sequence and temperature on the activity of thermostable DNA polymerases.

    PubMed

    Montgomery, Jesse L; Rejali, Nick; Wittwer, Carl T

    2014-05-01

    Extension rates of a thermostable, deletion-mutant polymerase were measured from 50°C to 90°C using a fluorescence activity assay adapted for real-time PCR instruments. Substrates with a common hairpin (6-base loop and a 14-bp stem) were synthesized with different 10-base homopolymer tails. Rates for A, C, G, T, and 7-deaza-G incorporation at 75°C were 81, 150, 214, 46, and 120 seconds(-1). Rates for U were half as fast as T and did not increase with increasing concentration. Hairpin substrates with 25-base tails from 0% to 100% GC content had maximal extension rates near 60% GC and were predicted from the template sequence and mononucleotide incorporation rates to within 30% for most sequences. Addition of dimethyl sulfoxide at 7.5% increased rates to within 1% to 17% of prediction for templates with 40% to 90% GC. When secondary structure was designed into the template region, extension rates decreased. Oligonucleotide probes reduced extension rates by 65% (5'-3' exo-) and 70% (5'-3' exo+). When using a separate primer and a linear template to form a polymerase substrate, rates were dependent on both the primer melting temperature (Tm) and the annealing/extension temperature. Maximum rates were observed from Tm to Tm - 5°C with little extension by Tm + 5°C. Defining the influence of sequence and temperature on polymerase extension will enable more rapid and efficient PCR.
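    The idea of predicting an extension rate from the template sequence and the mononucleotide incorporation rates can be sketched as follows. The abstract does not give the formula the authors used, so the model here is an assumption: each incorporation step is taken to last 1/rate seconds, and the overall rate is the number of bases divided by the summed step times. The reading that the quoted rates apply to the incorporated base (the complement of the template base) is likewise an interpretation, and the function name is hypothetical.

```python
# Incorporation rates at 75 °C quoted in the abstract (nucleotides per second),
# keyed by the nucleotide being incorporated.
INCORPORATION_RATE = {"A": 81.0, "C": 150.0, "G": 214.0, "T": 46.0}

# Watson-Crick complement: template base -> incorporated base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def predicted_extension_rate(template):
    """Average extension rate (nt/s) along a template under the assumed model.

    Assumption (not from the source): step times add, so the overall rate is
    len(template) / sum(1 / rate of each incorporated base).
    """
    incorporated = [COMPLEMENT[base] for base in template.upper()]
    total_time = sum(1.0 / INCORPORATION_RATE[base] for base in incorporated)
    return len(incorporated) / total_time


if __name__ == "__main__":
    for template in ("AAAAAAAAAA", "CCCCCCCCCC", "ACGTACGTAC"):
        print(template, round(predicted_extension_rate(template), 1))
```

    Under this assumption a homopolymer template simply reproduces the single-nucleotide rate, while mixed templates are dominated by the slowest incorporations. Temperature, secondary structure and additives such as dimethyl sulfoxide, all of which the abstract reports as influential, are not modelled.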

  3. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera which gives a decrease in signal.

  4. Image analysis for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Palaniappan, Kannappan; Huang, Thomas S.

    1991-07-01

    There is a great deal of interest in automating the process of DNA (deoxyribonucleic acid) sequencing to support the analysis of genomic DNA such as the Human and Mouse Genome projects. In one class of gel-based sequencing protocols autoradiograph images are generated in the final step and usually require manual interpretation to reconstruct the DNA sequence represented by the image. The need to handle a large volume of sequence information necessitates automation of the manual autoradiograph reading step through image analysis in order to reduce the length of time required to obtain sequence data and reduce transcription errors. Various adaptive image enhancement, segmentation and alignment methods were applied to autoradiograph images. The methods are adaptive to the local characteristics of the image such as noise, background signal, or presence of edges. Once the two-dimensional data is converted to a set of aligned one-dimensional profiles waveform analysis is used to determine the location of each band which represents one nucleotide in the sequence. Different classification strategies including a rule-based approach are investigated to map the profile signals, augmented with the original two-dimensional image data as necessary, to textual DNA sequence information.

  5. DNA Sequencing Sensors: An Overview

    PubMed Central

    Garrido-Cardenas, Jose Antonio; Garcia-Maroto, Federico; Alvarez-Bermejo, Jose Antonio; Manzano-Agugliaro, Francisco

    2017-01-01

    The first sequencing of a complete genome was published forty years ago by the double Nobel Prize in Chemistry winner Frederick Sanger. That corresponded to the small sized genome of a bacteriophage, but since then there have been many complex organisms whose DNA has been sequenced. This was possible thanks to continuous advances in the fields of biochemistry and molecular genetics, but also in other areas such as nanotechnology and computing. Nowadays, sequencing sensors based on genetic material have little to do with those used by Sanger. The emergence of mass sequencing sensors, or new generation sequencing (NGS), meant a quantitative leap both in the volume of genetic material that was able to be sequenced in each trial, as well as in the time per run and its cost. One can envisage that incoming technologies, already known as fourth generation sequencing, will continue to cheapen the trials by increasing DNA reading lengths in each run. All of this would be impossible without sensors and detection systems becoming smaller and more precise. This article provides a comprehensive overview on sensors for DNA sequencing developed within the last 40 years. PMID:28335417

  6. Statistical properties of DNA sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Simons, M.; Stanley, H. E.

    1995-01-01

    We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
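    Detrended fluctuation analysis, as named in this abstract, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the mapping of bases to steps (pyrimidine +1, purine -1), the box sizes and all function names are assumptions. The sequence is turned into a cumulative walk, a least-squares line is removed within each box of n bases, and the exponent is the slope of log F(n) against log n. An exponent near 0.5 is the usual signature of an uncorrelated sequence, and larger values suggest long-range correlation.

```python
import math


def dna_walk(seq):
    """Cumulative walk of the sequence (assumed mapping: C/T = +1, otherwise -1)."""
    steps = [1.0 if base in "CT" else -1.0 for base in seq.upper()]
    mean = sum(steps) / len(steps)
    walk, position = [], 0.0
    for step in steps:
        position += step - mean  # remove the overall compositional bias
        walk.append(position)
    return walk


def fluctuation(walk, box):
    """RMS deviation of the walk from a line fitted separately in each box."""
    n_boxes = len(walk) // box
    x_mean = (box - 1) / 2.0
    sxx = sum((x - x_mean) ** 2 for x in range(box))
    total = 0.0
    for b in range(n_boxes):
        segment = walk[b * box:(b + 1) * box]
        y_mean = sum(segment) / box
        sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(segment))
        slope = sxy / sxx
        total += sum(
            (y - y_mean - slope * (x - x_mean)) ** 2 for x, y in enumerate(segment)
        )
    return math.sqrt(total / (n_boxes * box))


def dfa_exponent(seq, boxes=(4, 8, 16, 32, 64)):
    """Least-squares slope of log F(n) versus log n over the given box sizes."""
    walk = dna_walk(seq)
    xs = [math.log(n) for n in boxes]
    # Note: a walk with zero fluctuation (e.g. a homopolymer) makes log() fail.
    ys = [math.log(fluctuation(walk, n)) for n in boxes]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den
```

    The sequence passed to dfa_exponent should be several times longer than the largest box size so that each F(n) is averaged over many boxes; the small boxes used here are only suitable for a quick check, not for detecting correlations over thousands of base pairs.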

  7. A conserved sequence extending motif III of the motor domain in the Snf2-family DNA translocase Rad54 is critical for ATPase activity.

    PubMed

    Zhang, Xiao-Ping; Janke, Ryan; Kingsley, James; Luo, Jerry; Fasching, Clare; Ehmsen, Kirk T; Heyer, Wolf-Dietrich

    2013-01-01

    Rad54 is a dsDNA-dependent ATPase that translocates on duplex DNA. Its ATPase function is essential for homologous recombination, a pathway critical for meiotic chromosome segregation, repair of complex DNA damage, and recovery of stalled or broken replication forks. In recombination, Rad54 cooperates with Rad51 protein and is required to dissociate Rad51 from heteroduplex DNA to allow access by DNA polymerases for recombination-associated DNA synthesis. Sequence analysis revealed that Rad54 contains a perfect match to the consensus PIP box sequence, a widely spread PCNA interaction motif. Indeed, Rad54 interacts directly with PCNA, but this interaction is not mediated by the Rad54 PIP box-like sequence. This sequence is located as an extension of motif III of the Rad54 motor domain and is essential for full Rad54 ATPase activity. Mutations in this motif render Rad54 non-functional in vivo and severely compromise its activities in vitro. Further analysis demonstrated that such mutations affect dsDNA binding, consistent with the location of this sequence motif on the surface of the cleft formed by two RecA-like domains, which likely forms the dsDNA binding site of Rad54. Our study identified a novel sequence motif critical for Rad54 function and showed that even perfect matches to the PIP box consensus may not necessarily identify PCNA interaction sites.

  8. Isolation and sequencing of active origins of DNA replication by nascent strand capture and release (NSCR)

    PubMed Central

    Kunnev, Dimiter; Freeland, Amy; Qin, Maochun; Wang, Jianmin; Pruitt, Steven C.

    2015-01-01

    Nascent strand capture and release (NSCR) is a method for isolation of short nascent strands to identify origins of DNA replication. The protocol provided involves isolation of total DNA, denaturation, size fractionation on a sucrose gradient, 5′-biotinylation of the appropriate size nucleic acids, binding to streptavidin-coated magnetic beads, intensive washing, and specific release of only the RNA-containing chimeric nascent strand DNA using ribonuclease I (RNase I). The method has been applied to mammalian cells derived from proliferative tissues and cell culture but could be used for any system where DNA replication is primed by a small RNA resulting in chimeric RNA-DNA molecules. PMID:26949711

  9. Intranuclear Anchoring of Repetitive DNA Sequences

    PubMed Central

    Weipoltshammer, Klara; Schöfer, Christian; Almeder, Marlene; Philimonenko, Vlada V.; Frei, Klemens; Wachtler, Franz; Hozák, Pavel

    1999-01-01

    Centromeres, telomeres, and ribosomal gene clusters consist of repetitive DNA sequences. To assess their contributions to the spatial organization of the interphase genome, their interactions with the nucleoskeleton were examined in quiescent and activated human lymphocytes. The nucleoskeletons were prepared using “physiological” conditions. The resulting structures were probed for specific DNA sequences of centromeres, telomeres, and ribosomal genes by in situ hybridization; the electroeluted DNA fractions were examined by blot hybridization. In both nonstimulated and stimulated lymphocytes, centromeric alpha-satellite repeats were almost exclusively found in the eluted fraction, while telomeric sequences remained attached to the nucleoskeleton. Ribosomal genes showed a transcription-dependent attachment pattern: in unstimulated lymphocytes, transcriptionally inactive ribosomal genes located outside the nucleolus were eluted completely. When comparing transcription unit and intergenic spacer, significantly more of the intergenic spacer was removed. In activated lymphocytes, considerable but similar amounts of both rDNA fragments were eluted. The results demonstrate that: (a) the various repetitive DNA sequences differ significantly in their intranuclear anchoring, (b) telomeric rather than centromeric DNA sequences form stable attachments to the nucleoskeleton, and (c) different attachment mechanisms might be responsible for the interaction of ribosomal genes with the nucleoskeleton. PMID:10613900

  10. Structural Complexity of DNA Sequence

    PubMed Central

    Liou, Cheng-Yuan; Cheng, Wei-Chen; Tsai, Huai-Ying

    2013-01-01

    In modern bioinformatics, finding an efficient way to allocate sequence fragments with biological functions is an important issue. This paper presents a structural approach based on context-free grammars extracted from original DNA or protein sequences. This approach is radically different from all those statistical methods. Furthermore, this approach is compared with a topological entropy-based method for consistency and difference of the complexity results. PMID:23662161

  11. Chimeric proteins for detection and quantitation of DNA mutations, DNA sequence variations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    Chimeric proteins having both DNA mutation binding activity and nuclease activity are synthesized by recombinant technology. The proteins are of the general formula A-L-B and B-L-A where A is a peptide having DNA mutation binding activity, L is a linker and B is a peptide having nuclease activity. The chimeric proteins are useful for detection and identification of DNA sequence variations including DNA mutations (including DNA damage and mismatches) by binding to the DNA mutation and cutting the DNA once the DNA mutation is detected.

  12. Apparatus for improved DNA sequencing

    DOEpatents

    Douthart, Richard J.; Crowell, Shannon L.

    1996-01-01

    This invention is a means for the rapid sequencing of DNA samples. More specifically, it consists of a new design direct blotting electrophoresis unit. The DNA sequence is deposited on a membrane attached to a rotating drum. Initial data compaction is facilitated by the use of a machined multi-channeled plate called a ribbon channel plate. Each channel is an isolated mini gel system much like a gel filled capillary. The system as a whole, however, is in a slab gel like format with the advantages of uniformity and easy reusability. The system can be used in different embodiments. The drum system is unique in that after deposition the drum rotates the deposited DNA into a large non-buffer open space where processing and detection can occur. The drum can also be removed in toto to special workstations for downstream processing, multiplexing and detection.

  13. Apparatus for improved DNA sequencing

    DOEpatents

    Douthart, R.J.; Crowell, S.L.

    1996-05-07

    This invention is a means for the rapid sequencing of DNA samples. More specifically, it consists of a new design direct blotting electrophoresis unit. The DNA sequence is deposited on a membrane attached to a rotating drum. Initial data compaction is facilitated by the use of a machined multi-channeled plate called a ribbon channel plate. Each channel is an isolated mini gel system much like a gel filled capillary. The system as a whole, however, is in a slab gel like format with the advantages of uniformity and easy reusability. The system can be used in different embodiments. The drum system is unique in that after deposition the drum rotates the deposited DNA into a large non-buffer open space where processing and detection can occur. The drum can also be removed in toto to special workstations for downstream processing, multiplexing and detection. 18 figs.

  14. The sequence of sequencers: The history of sequencing DNA

    PubMed Central

    Heather, James M.; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. PMID:26554401

  15. A novel, evolutionarily conserved gene family with putative sequence-specific single-stranded DNA-binding activity.

    PubMed

    Castro, Patricia; Liang, Hong; Liang, Jan C; Nagarajan, Lalitha

    2002-07-01

    Complete and partial deletions of chromosome 5q are recurrent cytogenetic anomalies associated with aggressive myeloid malignancies. Earlier, we identified an approximately 1.5-Mb region of loss at 5q13.3 between the loci D5S672 and D5S620 in primary leukemic blasts. A leukemic cell line, ML3, is diploid for all of chromosome 5, except for an inversion-coupled translocation within the D5S672-D5S620 interval. Here, we report the development of a bacterial artificial chromosome (BAC) contig to define the breakpoint and the identification of a novel gene SSBP2, the target of disruption in ML3 cells. A preliminary evaluation of SSBP2 as a tumor suppressor gene in primary leukemic blasts and cell lines suggests that the remaining allele does not undergo intragenic mutations. SSBP2 is one of three members of a closely related, evolutionarily conserved, and ubiquitously expressed gene family. SSBP3 is the human ortholog of a chicken gene, CSDP, that encodes a sequence-specific single-stranded DNA-binding protein. SSBP3 localizes to chromosome 1p31.3, and the third member, SSBP4, maps to chromosome 19p13.1. Chromosomal localization and the putative single-stranded DNA-binding activity suggest that all three members of this family are capable of potential tumor suppressor activity by gene dosage or other epigenetic mechanisms.

  16. Robustness of nucleosome patterns in the presence of DNA sequence-specific free energy landscapes and active remodeling

    NASA Astrophysics Data System (ADS)

    Nuebler, Johannes; Obermayer, Benedikt; Möbius, Wolfram; Wolff, Michael; Gerland, Ulrich

    Proper positioning of nucleosomes in eukaryotic cells is important for transcription regulation. When averaged over many genes, nucleosome positions in coding regions follow a simple oscillatory pattern, which is described to a surprising degree of accuracy by a simple one-dimensional gas model for particles interacting via a soft-core repulsion. The quantitative agreement is surprising given that nucleosome positions are known to be determined by a complex interplay of mechanisms including DNA sequence-specific nucleosome stability and active repositioning of nucleosomes by remodeling enzymes. We rationalize the observed robustness of the simple oscillatory pattern by showing that the main effect of several known nucleosome positioning mechanisms is a renormalization of the particle interaction. For example, "disorder" from sequence-specific affinities leads to an apparent softening, while active remodeling can result in apparent softening for directional sliding or apparent stiffening for clamping mechanisms. We suggest that such parameter renormalization can explain the apparent difference of nucleosome properties in two yeast species, S. cerevisiae and S. pombe.

  17. Channel plate for DNA sequencing

    DOEpatents

    Douthart, Richard J.; Crowell, Shannon L.

    1998-01-01

    This invention is a channel plate that facilitates data compaction in DNA sequencing. The channel plate has a length, a width and a thickness, and further has a plurality of channels that are parallel. Each channel has a depth partially through the thickness of the channel plate. Additionally an interface edge permits electrical communication across an interface through a buffer to a deposition membrane surface.

  18. Channel plate for DNA sequencing

    DOEpatents

    Douthart, R.J.; Crowell, S.L.

    1998-01-13

    This invention is a channel plate that facilitates data compaction in DNA sequencing. The channel plate has a length, a width and a thickness, and further has a plurality of channels that are parallel. Each channel has a depth partially through the thickness of the channel plate. Additionally an interface edge permits electrical communication across an interface through a buffer to a deposition membrane surface. 15 figs.

  19. DNA Sequencing Using capillary Electrophoresis

    SciTech Connect

    Dr. Barry Karger

    2011-05-09

    The overall goal of this program was to develop capillary electrophoresis as the tool to be used to sequence the Human Genome for the first time. Our program was part of the Human Genome Project. In this work, we were highly successful, and the replaceable polymer we developed, linear polyacrylamide, was used by the DOE sequencing lab in California to sequence a significant portion of the human genome using the MegaBase multiple capillary array electrophoresis instrument. In this final report, we summarize our efforts and success. We began our work by using capillary electrophoresis to separate double-stranded oligonucleotides on cross-linked polyacrylamide gels in fused silica capillaries. This work showed the potential of the methodology. However, preparation of such cross-linked gel capillaries was difficult, with poor reproducibility, and, even more important, the columns were not very stable. We improved stability by using non-cross-linked linear polyacrylamide. Here, the entangled linear chains could move when osmotic pressure (e.g. sample injection) was imposed on the polymer matrix. This relaxation of the polymer dissipated the stress in the column. Our next advance was to use significantly lower concentrations of the linear polyacrylamide, so that the polymer could be automatically blown out after each run and replaced with fresh linear polymer solution. In this way, a new column was available for each analytical run. Finally, while testing many linear polymers, we selected linear polyacrylamide as the best matrix as it was the most hydrophilic polymer available. Under our DOE program, we initially demonstrated the success of linear polyacrylamide in separating double-stranded DNA. We note that the method is used even today to assay the purity of double-stranded DNA fragments. Our focus, of course, was on the separation of single-stranded DNA for sequencing purposes. In one paper, we demonstrated the success of our approach in sequencing up to 500 bases. Other

  20. Particle sizer and DNA sequencer

    DOEpatents

    Olivares, Jose A.; Stark, Peter C.

    2005-09-13

    An electrophoretic device separates and detects particles such as DNA fragments, proteins, and the like. The device has a capillary which is coated with a coating with a low refractive index such as Teflon® AF. A sample of particles is fluorescently labeled and injected into the capillary. The capillary is filled with an electrolyte buffer solution. An electrical field is applied across the capillary causing the particles to migrate from a first end of the capillary to a second end of the capillary. A detector light beam is then scanned along the length of the capillary to detect the location of the separated particles. The device is amenable to a high throughput system by providing additional capillaries. The device can also be used to determine the actual size of the particles and for DNA sequencing.

  1. 5-Methylcytosine DNA glycosylase activity is also present in the human MBD4 (G/T mismatch glycosylase) and in a related avian sequence.

    PubMed

    Zhu, B; Zheng, Y; Angliker, H; Schwarz, S; Thiry, S; Siegmann, M; Jost, J P

    2000-11-01

    A 1468 bp cDNA coding for the chicken homolog of the human MBD4 G/T mismatch DNA glycosylase was isolated and sequenced. The derived amino acid sequence (416 amino acids) shows 46% identity with the human MBD4 and the conserved catalytic region at the C-terminal end (170 amino acids) has 90% identity. The non-conserved region of the avian protein has no consensus sequence for the methylated DNA binding domain. The recombinant proteins from human and chicken have G/T mismatch as well as 5-methylcytosine (5-MeC) DNA glycosylase activities. When tested by gel shift assays, human recombinant protein with or without the methylated DNA binding domain binds equally well to symmetrically methylated, hemimethylated, and non-methylated DNA. However, the enzyme has 5-MeC DNA glycosylase activity only with the hemimethylated DNA. Footprinting of human MBD4 and of an N-terminal deletion mutant with partially depurinated and depyrimidinated substrate reveals a selective binding of the proteins to the modified substrate around the CpG. As for 5-MeC DNA glycosylase purified from chicken embryos, MBD4 does not use oligonucleotides containing mCpA, mCpT or mCpC as substrates. An mCpG within an A+T-rich oligonucleotide is a much better substrate than an A+T-poor sequence. The Km of human MBD4 for hemimethylated DNA is approximately 10^-7 M, with a Vmax of approximately 10^-11 mol/h/µg protein. Deletion mutations show that G/T mismatch and 5-MeC DNA glycosylase are located in the C-terminal conserved region. In sharp contrast to the 5-MeC DNA glycosylase isolated from the chicken embryo DNA demethylation complex, the two enzymatic activities of MBD4 are strongly inhibited by RNA. In situ hybridization with antisense RNA indicates that MBD4 is only located in dividing cells of differentiating embryonic tissues.
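
    The kinetic constants quoted above can be put to work in a small calculation. The following is a minimal sketch that assumes the enzyme follows simple Michaelis-Menten kinetics (the abstract reports only Km and Vmax, not the fitted model); the function name `reaction_rate` and the constants `KM_MOLAR` and `VMAX_MOL_PER_H_PER_UG` are illustrative names, not taken from the paper.

```python
# Approximate values reported in the abstract for human MBD4
# acting on hemimethylated DNA.
KM_MOLAR = 1e-7                 # Km, in mol/L
VMAX_MOL_PER_H_PER_UG = 1e-11   # Vmax, in mol/h per microgram of protein


def reaction_rate(substrate_molar, km=KM_MOLAR, vmax=VMAX_MOL_PER_H_PER_UG):
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S]).

    Assumes simple Michaelis-Menten behaviour, which the abstract
    does not state explicitly.
    """
    if substrate_molar < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax * substrate_molar / (km + substrate_molar)


if __name__ == "__main__":
    for s in (1e-8, 1e-7, 1e-6):
        print(f"[S] = {s:.0e} M -> v = {reaction_rate(s):.2e} mol/h/ug")
```

    Under this assumption the rate is half of Vmax when the substrate concentration equals Km, and it approaches Vmax as the substrate concentration grows well beyond Km.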

  2. Cathepsin B from the white shrimp Litopenaeus vannamei: cDNA sequence analysis, tissues-specific expression and biological activity.

    PubMed

    Stephens, A; Rojo, L; Araujo-Bernal, S; Garcia-Carreño, F; Muhlia-Almazan, A

    2012-01-01

    Cathepsin B is a cysteine proteinase scarcely studied in crustaceans. Its function has not been clearly described in shrimp species belonging to the sub-order Dendrobranchiata, which includes the white shrimp Litopenaeus vannamei and other species from the Penaeidae family. Studies on vertebrates suggest that these lysosomal enzymes hydrolyze protein intracellularly, as do other cysteine proteinases. However, the expression of the gene encoding the shrimp cathepsin B in the midgut gland was affected by starvation in a similar way as other digestive proteinases which extracellularly hydrolyze food protein. In this study the white shrimp L. vannamei cathepsin B (LvCathB) cDNA was sequenced and characterized. Its gene expression was evaluated in various shrimp tissues, and changes in the mRNA amounts were compared with those observed for other digestive proteinases from the midgut gland during starvation. By using qRT-PCR it was found that LvCathB is expressed in most shrimp tissues except in pleopods and eye stalk. Changes in LvCathB mRNA during starvation suggest that the enzyme participates in intracellular protein hydrolysis but also, after food ingestion, participates in hydrolyzing food proteins extracellularly, as confirmed by the high activity levels we found in the gastric juice and midgut gland of the white shrimp.

  3. Synthetic oligonucleotides with particular base sequences from the cDNA encoding proteins of Mycobacterium bovis BCG induce interferons and activate natural killer cells.

    PubMed

    Tokunaga, T; Yano, O; Kuramoto, E; Kimura, Y; Yamamoto, T; Kataoka, T; Yamamoto, S

    1992-01-01

    Thirteen kinds of 45-mer single-stranded oligonucleotide, having sequence randomly selected from the known cDNA encoding BCG proteins, were tested for their capability to augment natural killer (NK) cell activity of mouse spleen cells in vitro. Six out of the 13 oligonucleotides showed the activity, while the others did not. In order to know the minimal and essential sequence(s) responsible for the biological activity, 2 kinds of 30-mer and 5 kinds of 15-mer oligonucleotide fragments of an active 45-mer nucleotide were tested for their activity. One of the 30-mer oligonucleotides, designated BCG-A4a, was active, but the other 30-mer was inactive. All of the 15-mer oligonucleotide fragments were inactive. The BCG-A4a also stimulated the spleen cells to produce interferon (IFN)-alpha and -gamma. An experiment using anti-IFN antisera showed that the NK cell activation by the oligonucleotide was ascribed to the IFN-alpha produced. It was noticed that all of the biologically active oligonucleotides possessed one or more palindrome sequence(s), and the inactive ones did not, with an exception of a 45-mer inactive oligonucleotide containing overlapping palindrome sequences (GGGCCCGGG). These findings strongly suggest that certain palindrome sequences, like GACGTC, GGCGCC and TGCGCA, are essential for 30-mer oligonucleotides, like BCG-A4a, to induce IFNs.
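
    The palindrome property mentioned above can be checked mechanically. The following is a minimal sketch that assumes "palindrome" is meant in the usual molecular-biology sense (a sequence equal to its own reverse complement), which the abstract does not spell out; the helper names `reverse_complement`, `is_palindromic`, and `palindromic_sites` are illustrative only.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the sequence reads the same on both strands (5' to 3')."""
    s = seq.upper()
    return s == reverse_complement(s)


def palindromic_sites(seq, k=6):
    """List (offset, k-mer) pairs for every palindromic k-mer in seq."""
    s = seq.upper()
    return [
        (i, s[i:i + k])
        for i in range(len(s) - k + 1)
        if is_palindromic(s[i:i + k])
    ]


if __name__ == "__main__":
    for hexamer in ("GACGTC", "GGCGCC", "TGCGCA"):
        print(hexamer, is_palindromic(hexamer))
    # The 9-mer from the inactive oligonucleotide contains two
    # overlapping palindromic hexamers.
    print(palindromic_sites("GGGCCCGGG"))
```

    With this definition, the three hexamers named in the abstract all qualify, and the 9-mer GGGCCCGGG yields the two overlapping hexamers GGGCCC and CCCGGG.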

  4. Plant DNA sequencing for phylogenetic analyses: from plants to sequences.

    PubMed

    Neves, Susana S; Forrest, Laura L

    2011-01-01

    DNA sequences are important sources of data for phylogenetic analysis. Nowadays, DNA sequencing is a routine technique in molecular biology laboratories. However, there are specific questions associated with project design and sequencing of plant samples for phylogenetic analysis, which may not be familiar to researchers starting in the field. This chapter gives an overview of methods and protocols involved in the sequencing of plant samples, including general recommendations on the selection of species/taxa and DNA regions to be sequenced, and field collection of plant samples. Protocols of plant sample preparation, DNA extraction, PCR and cloning, which are critical to the success of molecular phylogenetic projects, are described in detail. Common problems of sequencing (using the Sanger method) are also addressed. Possible applications of second-generation sequencing techniques in plant phylogenetics are briefly discussed. Finally, orientation on the preparation of sequence data for phylogenetic analyses and submission to public databases is also given.

  5. The Value of DNA Sequencing - TCGA

    Cancer.gov

    DNA sequencing: what it tells us about DNA changes in cancer, how looking across many tumors will help to identify meaningful changes and potential drug targets, and how genomics is changing the way we think about cancer.

  6. Method for sequencing DNA base pairs

    DOEpatents

    Sessler, Andrew M.; Dawson, John

    1993-01-01

    The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source.

  7. DNA sequence from Cretaceous period bone fragments.

    PubMed

    Woodward, S R; Weyand, N J; Bunnell, M

    1994-11-18

    DNA was extracted from 80-million-year-old bone fragments found in strata of the Upper Cretaceous Blackhawk Formation in the roof of an underground coal mine in eastern Utah. This DNA was used as the template in a polymerase chain reaction that amplified and sequenced a portion of the gene encoding mitochondrial cytochrome b. These sequences differ from all other cytochrome b sequences investigated, including those in the GenBank and European Molecular Biology Laboratory databases. DNA isolated from these bone fragments and the resulting gene sequences demonstrate that small fragments of DNA may survive in bone for millions of years.

  8. "First generation" automated DNA sequencing technology.

    PubMed

    Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

    2011-10-01

    Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines.

  9. Fibonacci Sequence and Supramolecular Structure of DNA.

    PubMed

    Shabalkin, I P; Grigor'eva, E Yu; Gudkova, M V; Shabalkin, P I

    2016-05-01

    We proposed a new model of supramolecular DNA structure. As in our previously developed model of primary DNA structure [11-15], the 3D structure of the DNA molecule is assembled in accordance with a mathematical rule known as the Fibonacci sequence. Unlike the primary DNA structure, the supramolecular 3D structure is assembled from complex moieties, including a regular tetrahedron and a regular octahedron, that consist of monomers, the elements of the primary DNA structure. The moieties of the supramolecular DNA structure, which form fragments of a regular spatial lattice, are bound via linker (joint) sequences of the DNA chain. The lattice perceives and transmits information signals over a considerable distance without acoustic aberrations. Linker sequences expand the conformational space between lattice segments, allowing them to slide relative to each other under the action of external forces. In this case, sliding is provided by stretching of the stacked linker sequences.

  10. Sequence and Structure Dependent DNA-DNA Interactions

    NASA Astrophysics Data System (ADS)

    Kopchick, Benjamin; Qiu, Xiangyun

    Molecular forces between dsDNA strands are largely dominated by electrostatics and have been extensively studied. Quantitative knowledge has been accumulated on how DNA-DNA interactions are modulated by varied biological constituents such as ions, cationic ligands, and proteins. Despite its central role in biology, the sequence of DNA has not received substantial attention, and "random" DNA sequences are typically used in biophysical studies. However, ~50% of the human genome is composed of non-random-sequence DNAs, particularly repetitive sequences. Furthermore, covalent modifications of DNA such as methylation play key roles in gene functions. Such DNAs with specific sequences or modifications often take on structures other than the canonical B-form. Here we present a series of quantitative measurements of the DNA-DNA forces with the osmotic stress method on different DNA sequences, from short repeats to the most frequent sequences in the genome, and on modifications such as bromination and methylation. We observe peculiar behaviors that appear to be strongly correlated with the incurred structural changes. We speculate on the causes in terms of differences in the hydration shell and DNA surface structures.

  11. Hepatitis B virus X protein inhibits p53 sequence-specific DNA binding, transcriptional activity, and association with transcription factor ERCC3.

    PubMed Central

    Wang, X W; Forrester, K; Yeh, H; Feitelson, M A; Gu, J R; Harris, C C

    1994-01-01

    Chronic active hepatitis caused by infection with hepatitis B virus, a DNA virus, is a major risk factor for human hepatocellular carcinoma. Since the oncogenicity of several DNA viruses is dependent on the interaction of their viral oncoproteins with cellular tumor-suppressor gene products, we investigated the interaction between hepatitis B virus X protein (HBX) and human wild-type p53 protein. HBX complexes with the wild-type p53 protein and inhibits its sequence-specific DNA binding in vitro. HBX expression also inhibits p53-mediated transcriptional activation in vivo and the in vitro association of p53 and ERCC3, a general transcription factor involved in nucleotide excision repair. Therefore, HBX may affect a wide range of p53 functions and contribute to the molecular pathogenesis of human hepatocellular carcinoma. Images PMID:8134379

  12. Using DNA looping to measure sequence dependent DNA elasticity

    NASA Astrophysics Data System (ADS)

    Kandinov, Alan; Raghunathan, Krishnan; Meiners, Jens-Christian

    2012-10-01

    We are using tethered particle motion (TPM) microscopy to observe protein-mediated DNA looping in the lactose repressor system in DNA constructs with varying AT / CG content. We use these data to determine the persistence length of the DNA as a function of its sequence content and compare the data to direct micromechanical measurements with constant-force axial optical tweezers. The data from the TPM experiments show a much smaller sequence effect on the persistence length than the optical tweezers experiments.

  13. The Human SWI-SNF Complex Protein p270 Is an ARID Family Member with Non-Sequence-Specific DNA Binding Activity

    PubMed Central

    Dallas, Peter B.; Pacchione, Stephen; Wilsker, Deborah; Bowrin, Valerie; Kobayashi, Ryuji; Moran, Elizabeth

    2000-01-01

    p270 is an integral member of human SWI-SNF complexes, first identified through its shared antigenic specificity with p300 and CREB binding protein. The deduced amino acid sequence of p270 reported here indicates that it is a member of an evolutionarily conserved family of proteins distinguished by the presence of a DNA binding motif termed ARID (AT-rich interactive domain). The ARID consensus and other structural features are common to both p270 and yeast SWI1, suggesting that p270 is a human counterpart of SWI1. The approximately 100-residue ARID sequence is present in a series of proteins strongly implicated in the regulation of cell growth, development, and tissue-specific gene expression. Although about a dozen ARID proteins can be identified from database searches, to date, only Bright (a regulator of B-cell-specific gene expression), dead ringer (a Drosophila melanogaster gene product required for normal development), and MRF-2 (which represses expression from the cytomegalovirus enhancer) have been analyzed directly in regard to their DNA binding properties. Each binds preferentially to AT-rich sites. In contrast, p270 shows no sequence preference in its DNA binding activity, thereby demonstrating that AT-rich binding is not an intrinsic property of ARID domains and that ARID family proteins may be involved in a wider range of DNA interactions. PMID:10757798

  14. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.

  15. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, Richard A.; Huang, Xiaohua C.; Quesada, Mark A.

    1995-01-01

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in said lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radio-isotope labels.

  16. Fractal analysis of DNA sequence data

    SciTech Connect

    Berthelsen, C.L.

    1993-01-01

    DNA sequence databases are growing at an almost exponential rate. New analysis methods are needed to extract knowledge about the organization of nucleotides from this vast amount of data. Fractal analysis is a new scientific paradigm that has been used successfully in many domains including the biological and physical sciences. Biological growth is a nonlinear dynamic process and some have suggested that to consider fractal geometry as a biological design principle may be most productive. This research is an exploratory study of the application of fractal analysis to DNA sequence data. A simple random fractal, the random walk, is used to represent DNA sequences. The fractal dimension of these walks is then estimated using the "sandbox method". Analysis of 164 human DNA sequences compared to three types of control sequences (random, base-content matched, and dimer-content matched) reveals that long-range correlations are present in DNA that are not explained by base or dimer frequencies. The study also revealed that the fractal dimension of coding sequences was significantly lower than that of sequences that were primarily noncoding, indicating the presence of longer-range correlations in functional sequences. The multifractal spectrum is used to analyze fractals that are heterogeneous and have a different fractal dimension for subsets with different scalings. The multifractal spectrum of the random walks of twelve mitochondrial genome sequences was estimated. Eight vertebrate mtDNA sequences had uniformly lower spectra values than did four invertebrate mtDNA sequences. Thus, vertebrate mitochondria show significantly longer-range correlations than invertebrate mitochondria. The higher multifractal spectra values for invertebrate mitochondria suggest a more random organization of the sequences. This research also includes considerable theoretical work on the effects of finite size, embedding dimension, and scaling ranges.
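
    The random-walk representation mentioned above can be illustrated briefly. The following is a minimal sketch of one common mapping, assumed here for illustration: a one-dimensional purine/pyrimidine walk in which C or T steps up and A or G steps down. The abstract does not say which mapping or embedding dimension the report uses, and this sketch covers only the representation step, not the sandbox estimate of fractal dimension. The name `dna_walk` is hypothetical.

```python
# Step rule (an assumption, not taken from the report):
# pyrimidines (C, T) step +1, purines (A, G) step -1.
STEP = {"C": 1, "T": 1, "A": -1, "G": -1}


def dna_walk(seq):
    """Return the cumulative walk positions for a DNA sequence.

    The walk starts at 0 and has one more entry than the sequence has
    bases. Unrecognised symbols (for example N) contribute a step of 0.
    """
    position = 0
    walk = [position]
    for base in seq.upper():
        position += STEP.get(base, 0)
        walk.append(position)
    return walk


if __name__ == "__main__":
    print(dna_walk("ACGT"))
    print(dna_walk("CCTTCT"))
```

    A sequence rich in alternating purines and pyrimidines stays near zero, whereas a pyrimidine-rich stretch drifts steadily upward; departures of the walk from ordinary random-walk scaling are what a long-range correlation analysis looks for.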

  17. Alignment method for spectrograms of DNA sequences.

    PubMed

    Bucur, Anca; van Leeuwen, Jasper; Dimitrova, Nevenka; Mittal, Chetan

    2010-01-01

    DNA spectrograms express the periodicities of each of the four nucleotides A, T, C, and G in one or several genomic sequences to be analyzed. DNA spectral analysis can be applied to systematically investigate DNA patterns, which may correspond to relevant biological features. As opposed to looking at nucleotide sequences, spectrogram analysis may detect structural characteristics in very long sequences that are not identifiable by sequence alignment. Alignment of DNA spectrograms can be used to facilitate analysis of very long sequences or entire genomes at different resolutions. Standard clustering algorithms have been used in spectral analysis to find strong patterns in spectra. However, as they use a global distance metric, these algorithms can only detect strong patterns coexisting in several frequencies. In this paper, we propose a new method and several algorithms for aligning spectra suitable for efficient spectral analysis and allowing for the easy detection of strong patterns in both single frequencies and multiple frequencies.
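
    The idea of per-nucleotide periodicities can be sketched as follows. This is a minimal illustration, not the authors' alignment method: it assumes each nucleotide is turned into a binary indicator signal and analysed with a plain discrete Fourier transform over a single window, whereas a full spectrogram would repeat this over sliding windows. The names `indicator`, `power_spectrum`, and `nucleotide_spectra` are hypothetical.

```python
import cmath


def indicator(seq, base):
    """Binary signal: 1.0 where seq has the given base, else 0.0."""
    return [1.0 if b == base else 0.0 for b in seq.upper()]


def power_spectrum(signal):
    """Squared DFT magnitudes for frequencies k = 0 .. n // 2."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def nucleotide_spectra(seq):
    """Power spectrum of the indicator signal for each of A, T, C, G."""
    return {base: power_spectrum(indicator(seq, base)) for base in "ATCG"}


if __name__ == "__main__":
    seq = "ATG" * 8  # toy sequence with an exact period of 3
    spectrum_a = nucleotide_spectra(seq)["A"]
    # Skip k = 0, which only reflects the overall base count.
    peak = max(range(1, len(spectrum_a)), key=spectrum_a.__getitem__)
    print("peak frequency index:", peak, "period:", len(seq) / peak)
```

    In the toy sequence the strongest nonzero frequency corresponds to a period of three bases, which is the kind of single-frequency pattern the abstract says global distance metrics can miss.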

  18. Mineralocorticoid Receptor (MR) trans-Activation of Inflammatory AP-1 Signaling: DEPENDENCE ON DNA SEQUENCE, MR CONFORMATION, AND AP-1 FAMILY MEMBER EXPRESSION.

    PubMed

    Dougherty, Edward J; Elinoff, Jason M; Ferreyra, Gabriela A; Hou, Angela; Cai, Rongman; Sun, Junfeng; Blaine, Kevin P; Wang, Shuibang; Danner, Robert L

    2016-11-04

    Glucocorticoids are commonly used to treat inflammatory disorders. The glucocorticoid receptor (GR) can tether to inflammatory transcription factor complexes, such as NFκB and AP-1, and trans-repress the transcription of cytokines, chemokines, and adhesion molecules. In contrast, aldosterone and the mineralocorticoid receptor (MR) primarily promote cardiovascular inflammation by incompletely understood mechanisms. Although MR has been shown to weakly repress NFκB, its role in modulating AP-1 has not been established. Here, the effects of GR and MR on NFκB and AP-1 signaling were directly compared using a variety of ligands, two different AP-1 consensus sequences, GR and MR DNA-binding domain mutants, and siRNA knockdown or overexpression of core AP-1 family members. Both GR and MR repressed an NFκB reporter without influencing p65 or p50 binding to DNA. Likewise, neither GR nor MR affected AP-1 binding, but repression or activation of AP-1 reporters occurred in a ligand-, AP-1 consensus sequence-, and AP-1 family member-specific manner. Notably, aldosterone interactions with both GR and MR demonstrated a potential to activate AP-1. DNA-binding domain mutations that eliminated the ability of GR and MR to cis-activate a hormone response element-driven reporter variably affected the strength and polarity of these responses. Importantly, MR modulation of NFκB and AP-1 signaling was consistent with a trans-mechanism, and AP-1 effects were confirmed for specific gene targets in primary human cells. Steroid nuclear receptor trans-effects on inflammatory signaling are context-dependent and influenced by nuclear receptor conformation, DNA sequence, and the expression of heterologous binding partners. Aldosterone activation of AP-1 may contribute to its proinflammatory effects in the vasculature.

  19. Sequence dependent modulating effect of camptothecin on the DNA-cleaving activity of the calf thymus type I topoisomerase.

    PubMed Central

    Gromova, I I; Buchman, V L; Abagyan, R A; Ulyanov, A V; Bronstein, I B

    1990-01-01

    High-resolution mapping of topo I cleavages in the regions of human DNA including the oncogene c-Ha-ras and p53 has revealed three kinds of topo I cleavage sites: cleavage sites not affected by camptothecin, cleavage sites reinforced only in the presence of camptothecin, and cleavage sites which weaken in the presence of camptothecin. Statistical analysis of sequences revealed certain nucleotide or dinucleotide preferences for the three groups studied. The preferences in camptothecin-reduced sites predominate upstream from the cleavage point, whereas in camptothecin-induced sites the situation is reversed. The influence of camptothecin on cleavage sites induced by two molecular forms of topo I has also been studied. PMID:2155407

  20. The evolutionary pathway from a biologically inactive polypeptide sequence to a folded, active structural mimic of DNA

    PubMed Central

    Kanwar, Nisha; Roberts, Gareth A.; Cooper, Laurie P.; Stephanou, Augoustinos S.; Dryden, David T.F.

    2016-01-01

    The protein Ocr (overcome classical restriction) from bacteriophage T7 acts as a mimic of DNA and inhibits all Type I restriction/modification (RM) enzymes. Ocr is a homodimer of 116 amino acids and adopts an elongated structure that resembles the shape of a bent 24 bp DNA molecule. Each monomer includes 34 acidic residues and only six basic residues. We have delineated the mimicry of Ocr by focusing on the electrostatic contribution of its negatively charged amino acids using directed evolution of a synthetic form of Ocr, termed pocr, in which all of the 34 acidic residues were substituted for a neutral amino acid. In vivo analyses confirmed that pocr did not display any antirestriction activity. Here, we have subjected the gene encoding pocr to several rounds of directed evolution in which codons for the corresponding acidic residues found in Ocr were specifically re-introduced. An in vivo selection assay was used to detect antirestriction activity after each round of mutation. Our results demonstrate the variation in importance of the acidic residues in regions of Ocr corresponding to different parts of the DNA target which it is mimicking and for the avoidance of deleterious effects on the growth of the host. PMID:27095198

  1. The evolutionary pathway from a biologically inactive polypeptide sequence to a folded, active structural mimic of DNA.

    PubMed

    Kanwar, Nisha; Roberts, Gareth A; Cooper, Laurie P; Stephanou, Augoustinos S; Dryden, David T F

    2016-05-19

    The protein Ocr (overcome classical restriction) from bacteriophage T7 acts as a mimic of DNA and inhibits all Type I restriction/modification (RM) enzymes. Ocr is a homodimer of 116 amino acids and adopts an elongated structure that resembles the shape of a bent 24 bp DNA molecule. Each monomer includes 34 acidic residues and only six basic residues. We have delineated the mimicry of Ocr by focusing on the electrostatic contribution of its negatively charged amino acids using directed evolution of a synthetic form of Ocr, termed pocr, in which all of the 34 acidic residues were substituted for a neutral amino acid. In vivo analyses confirmed that pocr did not display any antirestriction activity. Here, we have subjected the gene encoding pocr to several rounds of directed evolution in which codons for the corresponding acidic residues found in Ocr were specifically re-introduced. An in vivo selection assay was used to detect antirestriction activity after each round of mutation. Our results demonstrate the variation in importance of the acidic residues in regions of Ocr corresponding to different parts of the DNA target which it is mimicking and for the avoidance of deleterious effects on the growth of the host.

  2. Nucleotide capacitance calculation for DNA sequencing

    SciTech Connect

    Lu, Jun-Qiang; Zhang, Xiaoguang

    2008-01-01

    Using a first-principles linear response theory, the capacitances of the DNA nucleotides adenine, cytosine, guanine and thymine are calculated. The difference in capacitance between the nucleotides is studied with respect to conformational distortion. The result suggests that although an alternating-current capacitance measurement of a single-stranded DNA chain threaded through nano-gap electrodes may not be sufficient as a stand-alone method for rapid DNA sequencing, the capacitance of the nucleotides should be taken into consideration in any GHz-frequency electric measurements and may also serve as an additional criterion for identifying the DNA sequence.

  3. Novel function of the poly(c)-binding protein α-CP2 as a transcriptional activator that binds to single-stranded DNA sequences.

    PubMed

    Kang, Duk-Hee; Song, Kyu Young; Wei, Li-Na; Law, Ping-Yee; Loh, Horace H; Choi, Hack Sun

    2013-11-01

    α-complex protein 2 (α-CP2) is known as an RNA-binding protein that interacts in a sequence-specific manner with single-stranded polycytosine [poly(C)]. This protein is involved in various post-transcriptional regulations, such as mRNA stabilization and translational regulation. In this study, the full-length mouse α-CP2 gene was expressed in an insoluble form with an N-terminal histidine tag in Escherichia coli and purified to homogeneity using affinity column chromatography. Its identity was confirmed using matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry. Recombinant α-CP2 was expressed and refolded. The protein folding conditions for denatured α-CP2 were optimized. DNA and RNA electrophoretic mobility shift assays demonstrated that the recombinant α-CP2 is capable of binding to both single-stranded DNA and RNA poly(C) sequences. Furthermore, plasmids expressing α-CP2 activated the expression of a luciferase reporter when co-transfected with a single-stranded (pGL-SS) construct containing a poly(C) sequence. To our knowledge, this study demonstrates for the first time that α-CP2 functions as a transcriptional activator by binding to a single-stranded poly(C) sequence.

  4. Visible periodicity of strong nucleosome DNA sequences.

    PubMed

    Salih, Bilal; Tripathi, Vijay; Trifonov, Edward N

    2015-01-01

    Fifteen years ago, Lowary and Widom assembled nucleosomes on synthetic random sequence DNA molecules, selected the strongest nucleosomes and discovered that the TA dinucleotides in these strong nucleosome sequences often appear at 10-11 bases from one another or at distances which are multiples of this period. We repeated this experiment computationally, on large ensembles of natural genomic sequences, by selecting the strongest nucleosomes--i.e. those with such distances between like-named dinucleotides, multiples of 10.4 bases, the structural and sequence period of nucleosome DNA. The analysis confirmed the periodicity of TA dinucleotides in the strong nucleosomes, and revealed as well other periodic sequence elements, notably classical AA and TT dinucleotides. The matrices of DNA bendability and their simple linear forms--nucleosome positioning motifs--are calculated from the strong nucleosome DNA sequences. The motifs are in full accord with nucleosome positioning sequences derived earlier, thus confirming that the new technique, indeed, detects strong nucleosomes. Species- and isochore-specific variations of the matrices and of the positioning motifs are demonstrated. The strong nucleosome DNA sequences manifest the highest hitherto nucleosome positioning sequence signals, showing the dinucleotide periodicities in directly observable rather than in hidden form.
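
    The periodicity described above can be made visible with a simple distance histogram. The following is a minimal sketch, not the authors' procedure: the function name `dinucleotide_distances`, the `max_dist` cutoff, and the toy sequence are all assumptions made for illustration. It counts how often two occurrences of a dinucleotide lie a given number of bases apart.

```python
from collections import Counter


def dinucleotide_distances(seq, dinuc="TA", max_dist=40):
    """Count distances (in bases) between all pairs of `dinuc` occurrences.

    Only pairs separated by at most `max_dist` bases are counted.
    """
    seq = seq.upper()
    # Start positions of every occurrence of the dinucleotide.
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]
    counts = Counter()
    for idx, first in enumerate(positions):
        for second in positions[idx + 1:]:
            distance = second - first
            if distance > max_dist:
                break  # positions are sorted, so later ones are even farther
            counts[distance] += 1
    return counts


# Toy sequence (made up): TA placed exactly every 10 bases.
toy = "TAGGCCGGCC" * 5
hist = dinucleotide_distances(toy)
for distance in sorted(hist):
    print(distance, hist[distance])
```

    On real nucleosome sequences the peaks would be expected near multiples of the 10.4-base period rather than at exact multiples of 10, so some binning or smoothing of the histogram would likely be needed.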

  5. Counterintuitive DNA Sequence Dependence in Supercoiling-Induced DNA Melting

    PubMed Central

    Vlijm, Rifka; v.d. Torre, Jaco; Dekker, Cees

    2015-01-01

    The metabolism of DNA in cells relies on the balance between hybridized double-stranded DNA (dsDNA) and local de-hybridized regions of ssDNA that provide access to binding proteins. Traditional melting experiments, in which short pieces of dsDNA are heated up until the point of melting into ssDNA, have determined that AT-rich sequences have a lower binding energy than GC-rich sequences. In cells, however, the double-stranded backbone of DNA is destabilized by negative supercoiling, and not by temperature. To investigate what the effect of GC content is on DNA melting induced by negative supercoiling, we studied DNA molecules with a GC content ranging from 38% to 77%, using single-molecule magnetic tweezer measurements in which the length of a single DNA molecule is measured as a function of applied stretching force and supercoiling density. At low force (<0.5 pN), supercoiling results in twisting of the dsDNA backbone and loop formation (plectonemes), without inducing any DNA melting. This process was not influenced by the DNA sequence. When negative supercoiling is introduced at increasing force, local melting of DNA is introduced. We measured for the different DNA molecules a characteristic force Fchar, at which negative supercoiling induces local melting of the dsDNA. Surprisingly, GC-rich sequences melt at lower forces than AT-rich sequences: Fchar = 0.56 pN for 77% GC but 0.73 pN for 38% GC. An explanation for this counterintuitive effect is provided by the realization that supercoiling densities of a few percent only induce melting of a few percent of the base pairs. As a consequence, denaturation bubbles occur in local AT-rich regions and the sequence-dependent effect arises from an increased DNA bending/torsional energy associated with the plectonemes. This new insight indicates that an increased GC-content adjacent to AT-rich DNA regions will enhance local opening of the double-stranded DNA helix. PMID:26513573

  6. Applications of mass spectrometry to DNA fingerprinting and DNA sequencing

    SciTech Connect

    Jacobson, K.B.; Buchanan, M.V.; Chen, C.H.; Doktycz, M.J.; McLuckey, S.A.; Arlinghaus, H.F.

    1993-01-01

    DNA fingerprinting and sequencing rely on polyacrylamide gel electrophoresis to determine the sizes of the DNA fragments. Innovative alternatives to polyacrylamide gel electrophoresis are under investigation for characterization of such fingerprinting and sequencing. One method uses stable isotopes of tin and other elements to label the DNA, whereas other procedures do not require labels. The detectors in each case are mass spectrometers that detect either the stable isotopes or the DNA fragments themselves. If successful, these methods will speed up the rate of DNA analysis by one or two orders of magnitude.

  7. Applications of mass spectrometry to DNA fingerprinting and DNA sequencing

    SciTech Connect

    Jacobson, K.B.; Buchanan, M.V.; Chen, C.H.; Doktycz, M.J.; McLuckey, S.A.; Arlinghaus, H.F.

    1993-06-01

    DNA fingerprinting and sequencing rely on polyacrylamide gel electrophoresis to determine the sizes of the DNA fragments. Innovative alternatives to polyacrylamide gel electrophoresis are under investigation for characterization of such fingerprinting and sequencing. One method uses stable isotopes of tin and other elements to label the DNA, whereas other procedures do not require labels. The detectors in each case are mass spectrometers that detect either the stable isotopes or the DNA fragments themselves. If successful, these methods will speed up the rate of DNA analysis by one or two orders of magnitude.

  8. Data structures for DNA sequence manipulation.

    PubMed Central

    Lawrence, C B

    1986-01-01

    Two data structures designated Fragment and Construct are described. The Fragment data structure defines a continuous nucleic acid sequence from a unique genetic origin. The Construct defines a continuous sequence composed of sequences from multiple genetic origins. These data structures are manipulated by a set of software tools to simulate the construction of mosaic recombinant DNA molecules. They are also used as an interface between sequence data banks and analytical programs. PMID:3753765
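
    The abstract gives only the definitions of the two structures, so the sketch below is one possible reading rather than the original design. The field names (`origin`, `sequence`, `parts`) and the methods `subfragment` and `ligate` are hypothetical; they illustrate how a Construct could be assembled from Fragments while keeping track of each piece's genetic origin.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Fragment:
    """A continuous nucleic acid sequence from a single genetic origin."""
    origin: str     # hypothetical label for the source molecule
    sequence: str

    def subfragment(self, start: int, end: int) -> "Fragment":
        # Cutting a fragment keeps its origin.
        return Fragment(self.origin, self.sequence[start:end])


@dataclass
class Construct:
    """A continuous sequence composed of fragments from several origins."""
    parts: List[Fragment] = field(default_factory=list)

    def ligate(self, fragment: Fragment) -> "Construct":
        # Return a new construct with the fragment appended at the end.
        return Construct(self.parts + [fragment])

    @property
    def sequence(self) -> str:
        return "".join(part.sequence for part in self.parts)

    def origins(self) -> List[str]:
        return [part.origin for part in self.parts]


# Simulate opening a vector and placing an insert between its two halves.
vector = Fragment("vector", "GGATCCAAGCTT")
insert = Fragment("insert", "ATGAAATAG")
mosaic = (Construct()
          .ligate(vector.subfragment(0, 6))
          .ligate(insert)
          .ligate(vector.subfragment(6, 12)))
print(mosaic.sequence)
print(mosaic.origins())
```

    Keeping the parts as a list rather than a single joined string is what lets an analytical program ask which origin a given stretch of the mosaic came from.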

  9. EGNAS: an exhaustive DNA sequence design algorithm

    PubMed Central

    2012-01-01

    Background: The molecular recognition based on the complementary base pairing of deoxyribonucleic acid (DNA) is the fundamental principle in the fields of genetics, DNA nanotechnology and DNA computing. We present an exhaustive DNA sequence design algorithm that generates sets containing a maximum number of sequences with defined properties. EGNAS (Exhaustive Generation of Nucleic Acid Sequences) offers the possibility of controlling both interstrand and intrastrand properties. The guanine-cytosine content can be adjusted. Sequences can be forced to start and end with guanine or cytosine. This option reduces the risk of “fraying” of DNA strands. It is possible to limit cross hybridizations of a defined length, and to adjust the uniqueness of sequences. Self-complementarity and hairpin structures of certain length can be avoided. Sequences and subsequences can optionally be forbidden. Furthermore, sequences can be designed to have minimum interactions with predefined strands and neighboring sequences. Results: The algorithm is realized in a C++ program. TAG sequences can be generated and combined with primers for single-base extension reactions, which were described for multiplexed genotyping of single nucleotide polymorphisms. Thereby, possible foldback through intrastrand interaction of TAG-primer pairs can be limited. The design of sequences for specific attachment of molecular constructs to DNA origami is presented. Conclusions: We developed a new software tool called EGNAS for the design of unique nucleic acid sequences. The presented exhaustive algorithm generates larger sets of sequences than previous software under equal constraints. EGNAS is freely available for noncommercial use at http://www.chm.tu-dresden.de/pc6/EGNAS. PMID:22716030
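
    A few of the constraints listed above can be illustrated with a brute-force filter. This is a sketch in Python, not the EGNAS program (which the abstract says is written in C++) and not its algorithm: the function names, the thresholds `gc_min`, `gc_max` and `k`, and the way self-complementarity is tested are all assumptions chosen for illustration.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_self_complement(seq, k):
    """True if some k-mer of seq also occurs in its reverse complement."""
    rc = reverse_complement(seq)
    return any(seq[i:i + k] in rc for i in range(len(seq) - k + 1))


def passes(seq, gc_min=0.4, gc_max=0.6, k=4):
    """Apply three example constraints (thresholds are assumed values)."""
    return (seq[0] in "GC" and seq[-1] in "GC"          # G/C at both ends
            and gc_min <= gc_fraction(seq) <= gc_max    # GC content window
            and not has_self_complement(seq, k))        # no k-base self-pairing


def enumerate_candidates(length):
    """Exhaustively test every sequence of the given length."""
    for bases in product("ACGT", repeat=length):
        seq = "".join(bases)
        if passes(seq):
            yield seq


candidates = list(enumerate_candidates(6))
print(len(candidates), candidates[:5])
```

    Plain enumeration grows as 4^n and is practical only for very short sequences. It also checks each sequence in isolation, whereas limiting cross hybridization between members of a set requires comparing sequences with one another.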

  10. DNA sequencing using electrical conductance measurements of a DNA polymerase

    NASA Astrophysics Data System (ADS)

    Chen, Yu-Shiun; Lee, Chia-Hui; Hung, Meng-Yen; Pan, Hsu-An; Chiou, Jin-Chern; Huang, G. Steven

    2013-06-01

    The development of personalized medicine--in which medical treatment is customized to an individual on the basis of genetic information--requires techniques that can sequence DNA quickly and cheaply. Single-molecule sequencing technologies, such as nanopores, can potentially be used to sequence long strands of DNA without labels or amplification, but a viable technique has yet to be established. Here, we show that single DNA molecules can be sequenced by monitoring the electrical conductance of a phi29 DNA polymerase as it incorporates unlabelled nucleotides into a template strand of DNA. The conductance of the polymerase is measured by attaching it to a protein transistor that consists of an antibody molecule (immunoglobulin G) bound to two gold nanoparticles, which are in turn connected to source and drain electrodes. The electrical conductance of the DNA polymerase exhibits well-separated plateaux that are ~3 pA in height. Each plateau corresponds to an individual base and is formed at a rate of ~22 nucleotides per second. Additional spikes appear on top of the plateaux and can be used to discriminate between the four different nucleotides. We also show that the sequencing platform works with a variety of DNA polymerases and can sequence difficult templates such as homopolymers.

  11. Identification of base and backbone contacts used for DNA sequence recognition and high-affinity binding by LAC9, a transcription activator containing a C6 zinc finger

    SciTech Connect

    Halvorsen, Yuan-Di C.; Nandabalan, K.; Dickson, R.C.

    1991-04-01

    The LAC9 protein of Kluyveromyces lactis is a transcriptional regulator of genes in the lactose-galactose regulon. To regulate transcription, LAC9 must bind to 17-bp upstream activator sequences (UASs) located in front of each target gene. LAC9 is homologous to the GAL4 protein of Saccharomyces cerevisiae, and the two proteins must bind DNA in a very similar manner. In this paper the authors show that high-affinity, sequence-specific binding by LAC9 dimers is mediated primarily by 3 bp at each end of the UAS. In addition, at least one half of the UAS must have a GC or CG base pair at position 1 for high-affinity binding; LAC9 binds preferentially to the half containing the GC base pair. Hydroxyl radical footprinting shows that a LAC9 dimer binds an unusually broad region on one face of the DNA helix. Because of the data, they suggest that LAC9 contacts positions 6, 7, and 8, both plus and minus, of the UAS, which are separated by more than one turn of the DNA helix, and twists part way around the DNA, thus protecting the broad region of the minor groove between the major-groove contacts.

  12. Method for sequencing DNA base pairs

    DOEpatents

    Sessler, A.M.; Dawson, J.

    1993-12-14

    The base pairs of a DNA structure are sequenced with the use of a scanning tunneling microscope (STM). The DNA structure is scanned by the STM probe tip, and, as it is being scanned, the DNA structure is separately subjected to a sequence of infrared radiation from four different sources, each source being selected to preferentially excite one of the four different bases in the DNA structure. Each particular base being scanned is subjected to such sequence of infrared radiation from the four different sources as that particular base is being scanned. The DNA structure as a whole is separately imaged for each subjection thereof to radiation from one only of each source. 6 figures.

  13. Extracting biological knowledge from DNA sequences

    SciTech Connect

    De La Vega, F.M.; Thieffry, D.; Collado-Vides, J.

    1996-12-31

    This session describes the elucidation of information from DNA sequences and the challenges computational biologists face in summarizing and deciphering the human genome. Techniques discussed include methods from statistics, information theory, artificial intelligence and linguistics. 1 ref.

  14. Nanopore DNA sequencing using kinetic proofreading

    NASA Astrophysics Data System (ADS)

    Ling, Xinsheng

    We propose a method of DNA sequencing by combining the physical method of nanopore electrical measurements and Southern's sequencing-by-hybridization. The new key ingredient, essential to both lowering the costs and increasing the precision, is an asymmetric nanopore sandwich device capable of measuring the DNA hybridization probe twice, separated by a designed waiting time. Those incorrect probes appearing only once in nanopore ionic current traces are discriminated from the correct ones that appear twice. This method of discrimination is similar to the principle of kinetic proofreading proposed by Hopfield and Ninio in gene transcription and translation processes. An error analysis of this nanopore kinetic proofreading (nKP) technique for DNA sequencing is carried out in comparison with the most precise 3' dideoxy termination method developed by Sanger.

  15. gargammel: a sequence simulator for ancient DNA.

    PubMed

    Renaud, Gabriel; Hanghøj, Kristian; Willerslev, Eske; Orlando, Ludovic

    2016-10-29

    Ancient DNA has emerged as a remarkable tool to infer the history of extinct species and past populations. However, many of its characteristics, such as extensive fragmentation, damage and contamination, can influence downstream analyses. To help investigators measure how these could impact their analyses in silico, we have developed gargammel, a package that simulates ancient DNA fragments given a set of known reference genomes. Our package simulates the entire molecular process from post-mortem DNA fragmentation and DNA damage to experimental sequencing errors, and reproduces the most common biases observed in ancient DNA datasets.
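
    To give a feel for what such a simulation involves, here is a toy sketch of two of the steps named above, fragmentation and damage. It does not use or reproduce gargammel's interface or models: the function `simulate_fragments`, its parameters, and the damage model (C-to-T substitutions whose probability decays with distance from the 5' end) are simplifying assumptions, and contamination and sequencing errors are left out entirely.

```python
import random


def simulate_fragments(reference, n, min_len=30, max_len=80,
                       deam_rate=0.3, decay=0.5, seed=0):
    """Draw n short fragments from `reference` and add toy 5' damage.

    All parameter values are illustrative assumptions, not fitted estimates.
    """
    rng = random.Random(seed)  # seeded so the output is reproducible
    fragments = []
    for _ in range(n):
        # Fragmentation: pick a random length and a random start position.
        length = rng.randint(min_len, min(max_len, len(reference)))
        start = rng.randint(0, len(reference) - length)
        bases = list(reference[start:start + length])
        # Damage: C -> T with a probability that halves at each step
        # away from the 5' end (position 0) when decay = 0.5.
        for i, base in enumerate(bases):
            if base == "C" and rng.random() < deam_rate * (decay ** i):
                bases[i] = "T"
        fragments.append("".join(bases))
    return fragments


# Made-up 200-base reference for demonstration purposes.
reference = "".join(random.Random(1).choice("ACGT") for _ in range(200))
fragments = simulate_fragments(reference, 5)
for fragment in fragments:
    print(len(fragment), fragment)
```

    Because the generator is seeded, repeated runs give identical fragments, which makes it easier to compare how a downstream analysis responds to a change in one parameter.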

  16. Image Correlation Method for DNA Sequence Alignment

    PubMed Central

    Curilem Saldías, Millaray; Villarroel Sassarini, Felipe; Muñoz Poblete, Carlos; Vargas Vásquez, Asticio; Maureira Butler, Iván

    2012-01-01

    The complexity of searches and the volume of genomic data make sequence alignment one of bioinformatics' most active research areas. New alignment approaches have incorporated digital signal processing techniques. Among these, correlation methods are highly sensitive. This paper proposes a novel sequence alignment method based on 2-dimensional images, where each nucleic acid base is represented as a fixed gray intensity pixel. Query and known database sequences are coded to their pixel representation and sequence alignment is handled as an object-recognition-in-a-scene problem. Query and database become object and scene, respectively. An image correlation process is carried out in order to search for the best match between them. Given that this procedure can be implemented in an optical correlator, the correlation could eventually be accomplished at light speed. This paper shows an initial research stage where results were “digitally” obtained by simulating an optical correlation of DNA sequences represented as images. A total of 303 queries (variable lengths from 50 to 4500 base pairs) and 100 scenes represented by 100 x 100 images each (in total, one million base pair database) were considered for the image correlation analysis. The results showed that correlations reached very high sensitivity (99.01%) and specificity (98.99%), and outperformed BLAST when mutation numbers increased. However, digital correlation processes were a hundred times slower than BLAST. We are currently starting an initiative to evaluate the correlation speed process of a real experimental optical correlator. By doing this, we expect to fully exploit optical correlation light properties. As the optical correlator works jointly with the computer, digital algorithms should also be optimized. The results presented in this paper are encouraging and support the study of image correlation methods on sequence alignment. PMID:22761742
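
    The idea of coding bases as gray levels and searching by correlation can be sketched in one dimension. The paper works with 2-dimensional images and a simulated optical correlator; the version below is a deliberately reduced illustration, and the gray values in `GRAY`, the function names, and the use of a normalized correlation coefficient are assumptions rather than the authors' settings.

```python
import math

# Assumed gray intensities, one fixed value per base.
GRAY = {"A": 0.0, "C": 85.0, "G": 170.0, "T": 255.0}


def encode(seq):
    """Turn a DNA string into a list of pixel intensities."""
    return [GRAY[base] for base in seq.upper()]


def normalized_correlation(x, y):
    """Pearson-style correlation of two equal-length intensity lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                    * sum((b - mean_y) ** 2 for b in y))
    return num / den if den else 0.0


def best_match(query, database):
    """Slide the query (object) along the database (scene)."""
    q = encode(query)
    d = encode(database)
    scores = [normalized_correlation(q, d[i:i + len(q)])
              for i in range(len(d) - len(q) + 1)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


database = "TTGACCGTAGGCTAACGTTAGC"
query = "GGCTAACG"
offset, score = best_match(query, database)
print(offset, round(score, 3))
```

    One caveat of any single-channel coding is that it imposes an artificial ordering on the bases, so some mismatches lower the correlation more than others.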

  17. Image correlation method for DNA sequence alignment.

    PubMed

    Curilem Saldías, Millaray; Villarroel Sassarini, Felipe; Muñoz Poblete, Carlos; Vargas Vásquez, Asticio; Maureira Butler, Iván

    2012-01-01

    The complexity of searches and the volume of genomic data make sequence alignment one of bioinformatics' most active research areas. New alignment approaches have incorporated digital signal processing techniques. Among these, correlation methods are highly sensitive. This paper proposes a novel sequence alignment method based on 2-dimensional images, where each nucleic acid base is represented as a fixed gray intensity pixel. Query and known database sequences are coded to their pixel representation and sequence alignment is handled as an object-recognition-in-a-scene problem. Query and database become object and scene, respectively. An image correlation process is carried out in order to search for the best match between them. Given that this procedure can be implemented in an optical correlator, the correlation could eventually be accomplished at light speed. This paper shows an initial research stage where results were "digitally" obtained by simulating an optical correlation of DNA sequences represented as images. A total of 303 queries (variable lengths from 50 to 4500 base pairs) and 100 scenes represented by 100 x 100 images each (in total, one million base pair database) were considered for the image correlation analysis. The results showed that correlations reached very high sensitivity (99.01%) and specificity (98.99%), and outperformed BLAST when mutation numbers increased. However, digital correlation processes were a hundred times slower than BLAST. We are currently starting an initiative to evaluate the correlation speed process of a real experimental optical correlator. By doing this, we expect to fully exploit optical correlation light properties. As the optical correlator works jointly with the computer, digital algorithms should also be optimized. The results presented in this paper are encouraging and support the study of image correlation methods on sequence alignment.

  18. The B cell coactivator Bob1 shows DNA sequence-dependent complex formation with Oct-1/Oct-2 factors, leading to differential promoter activation.

    PubMed

    Gstaiger, M; Georgiev, O; van Leeuwen, H; van der Vliet, P; Schaffner, W

    1996-06-03

    We have shown previously that both octamer binding transcription factors, namely the ubiquitous Oct-1 and the B cell-specific Oct-2A protein, can be enhanced in transcriptional activity by their association with the B cell-specific coactivator protein Bob1, also called OBF-1 or OCA-B. Here we study the structural requirements for ternary complex formation of DNA-Oct-Bob1 and coactivation function of Bob1. In analogy to DNA-bound transcription factors, Bob1 has a modular structure that includes an interaction domain (amino acids 1-65) and a C-terminal domain (amino acids 65-256), both important for transcriptional activation. A mutational analysis has resolved a region of seven amino acids (amino acids 26-32) in the N-terminus of Bob1 that are important for contacting the DNA binding POU domain of Oct-1 or Oct-2. In contrast to the viral coactivator VP16 (vmw65), which interacts with Oct-1 via the POU homeosubdomain, Bob1 association with Oct factors requires residues located in the POU-specific subdomain. Because the same residues are also involved in DNA recognition, we surmised that this association would affect the DNA binding specificity of the Oct-Bob1 complex compared with free Oct factors. While Oct-1 or Oct-2 bind to a large variety of octamer sequences, Bob1 ternary complex formation is indeed highly selective and occurs only in a subset of these sequences, leading to the differential coactivation of octamer-containing promoters. The results uncover a new level in selectivity that furthers our understanding in the regulation of cell type-specific gene expression.

  19. Identification of base and backbone contacts used for DNA sequence recognition and high-affinity binding by LAC9, a transcription activator containing a C6 zinc finger.

    PubMed Central

    Halvorsen, Y D; Nandabalan, K; Dickson, R C

    1991-01-01

    The LAC9 protein of Kluyveromyces lactis is a transcriptional regulator of genes in the lactose-galactose regulon. To regulate transcription, LAC9 must bind to 17-bp upstream activator sequences (UASs) located in front of each target gene. LAC9 is homologous to the GAL4 protein of Saccharomyces cerevisiae, and the two proteins must bind DNA in a very similar manner. In this paper we show that high-affinity, sequence-specific binding by LAC9 dimers is mediated primarily by 3 bp at each end of the UAS: [Formula: see text]. In addition, at least one half of the UAS must have a GC or CG base pair at position 1 for high-affinity binding; LAC9 binds preferentially to the half containing the GC base pair. Bases at positions 2, 3, and 4 in each half of the UAS make little if any contribution to binding. The center base pair is not essential for high-affinity LAC9 binding when DNA-binding activity is measured in vitro. However, the center base pair must play an essential role in vivo, since all natural UASs have 17, not 16, bp. Hydroxyl radical footprinting shows that a LAC9 dimer binds an unusually broad region on one face of the DNA helix. Because of the data, we suggest that LAC9 contacts positions 6, 7, and 8, both plus and minus, of the UAS, which are separated by more than one turn of the DNA helix, and twists part way around the DNA, thus protecting the broad region of the minor groove between the major-groove contacts. PMID:2005880

  20. DNA sequencing using fluorescence background electroblotting membrane

    DOEpatents

    Caldwell, Karin D.; Chu, Tun-Jen; Pitt, William G.

    1992-01-01

    A method for the multiplex sequencing of DNA is disclosed which comprises the electroblotting of specific base terminated DNA fragments, which have been resolved by gel electrophoresis, onto the surface of a neutral non-aromatic polymeric microporous membrane exhibiting low background fluorescence which has been surface modified to contain amino groups. Polypropylene membranes are preferred, and the introduction of amino groups is accomplished by subjecting the membrane to radio or microwave frequency plasma discharge in the presence of an aminating agent, preferably ammonia. The membrane, containing physically adsorbed DNA fragments on its surface after the electroblotting, is then treated with crosslinking means such as UV radiation or a glutaraldehyde spray to chemically bind the DNA fragments to the membrane through said amino groups contained on the surface thereof. The DNA fragments chemically bound to the membrane are subjected to hybridization probing with a tagged probe specific to the sequence of the DNA fragments. The tagging may be by either fluorophores or radioisotopes. The tagged probes hybridized to said target DNA fragments are detected and read by laser induced fluorescence detection or autoradiograms. The use of aminated low fluorescent background membranes allows the use of fluorescent detection and reading even when the available amount of DNA to be sequenced is small. The DNA bound to the membranes may be reprobed numerous times.

  1. DNA sequencing using fluorescence background electroblotting membrane

    DOEpatents

    Caldwell, K.D.; Chu, T.J.; Pitt, W.G.

    1992-05-12

    A method for the multiplex sequencing of DNA is disclosed which comprises the electroblotting of specific base terminated DNA fragments, which have been resolved by gel electrophoresis, onto the surface of a neutral non-aromatic polymeric microporous membrane exhibiting low background fluorescence which has been surface modified to contain amino groups. Polypropylene membranes are preferred, and the introduction of amino groups is accomplished by subjecting the membrane to radio or microwave frequency plasma discharge in the presence of an aminating agent, preferably ammonia. The membrane, containing physically adsorbed DNA fragments on its surface after the electroblotting, is then treated with crosslinking means such as UV radiation or a glutaraldehyde spray to chemically bind the DNA fragments to the membrane through amino groups contained on the surface. The DNA fragments chemically bound to the membrane are subjected to hybridization probing with a tagged probe specific to the sequence of the DNA fragments. The tagging may be by either fluorophores or radioisotopes. The tagged probes hybridized to the target DNA fragments are detected and read by laser induced fluorescence detection or autoradiograms. The use of aminated low fluorescent background membranes allows the use of fluorescent detection and reading even when the available amount of DNA to be sequenced is small. The DNA bound to the membranes may be reprobed numerous times. No Drawings

  2. Sequencing Intractable DNA to Close Microbial Genomes

    SciTech Connect

    Hurt, Jr., Richard Ashley; Brown, Steven D; Podar, Mircea; Palumbo, Anthony Vito; Elias, Dwayne A

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled intractable, resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  3. Nanopore-CMOS Interfaces for DNA Sequencing.

    PubMed

    Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim

    2016-08-06

    DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces.

  4. Osmylated DNA, a novel concept for sequencing DNA using nanopores

    NASA Astrophysics Data System (ADS)

    Kanavarioti, Anastassia

    2015-03-01

    Sanger sequencing has led the advances in molecular biology, while faster and cheaper next generation technologies are urgently needed. A newer approach exploits nanopores, natural or solid-state, set in an electrical field, and obtains base sequence information from current variations due to the passage of a ssDNA molecule through the pore. A hurdle in this approach is the fact that the four bases are chemically comparable to each other, which leads to small differences in current obstruction. ‘Base calling’ becomes even more challenging because most nanopores sense a short sequence and not individual bases. Perhaps sequencing DNA via nanopores would be more manageable, if only the bases were two, and chemically very different from each other; a sequence of 1s and 0s comes to mind. Osmylated DNA comes close to such a sequence of 1s and 0s. Osmylation is the addition of osmium tetroxide bipyridine across the C5-C6 double bond of the pyrimidines. Osmylation adds almost 400% mass to the reactive base and creates a sterically and electronically notably different molecule, labeled 1, compared to the unreactive purines, labeled 0. If osmylated DNA were successfully sequenced, the result would be a sequence of osmylated pyrimidines (1), and purines (0), and not of the actual nucleobases. To solve this problem we studied the osmylation reaction with short oligos and with M13mp18, a long ssDNA, developed a UV-vis assay to measure extent of osmylation, and designed two protocols. Protocol A uses mild conditions and yields osmylated thymidines (1), while leaving the other three bases (0) practically intact. Protocol B uses harsher conditions and effectively osmylates both pyrimidines, but not the purines. Applying these two protocols also to the complement of the target polynucleotide yields a total of four osmylated strands that collectively could define the actual base sequence of the target DNA.
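
    The way four binary reads could jointly define the base sequence can be sketched as follows. This is an idealized illustration, not the authors' procedure: it assumes perfectly selective osmylation, an error-free per-base 0/1 readout, and complement reads reported 5' to 3' (so they are reversed before alignment). All function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))

def protocol_a(seq):
    """Idealized Protocol A: only T is osmylated (1)."""
    return "".join("1" if b == "T" else "0" for b in seq)

def protocol_b(seq):
    """Idealized Protocol B: both pyrimidines (C, T) are osmylated (1)."""
    return "".join("1" if b in "CT" else "0" for b in seq)

def decode(a_target, b_target, a_comp, b_comp):
    """Recover the target sequence from four binary reads.

    a_comp and b_comp are reads of the complementary strand; they are
    reversed here so that position i pairs with position i of the target.
    """
    a_comp, b_comp = a_comp[::-1], b_comp[::-1]
    bases = []
    for at, bt, ac, bc in zip(a_target, b_target, a_comp, b_comp):
        if at == "1":
            bases.append("T")      # T reacts under Protocol A
        elif bt == "1":
            bases.append("C")      # pyrimidine that is not T
        elif ac == "1":
            bases.append("A")      # partner on the complement is T
        elif bc == "1":
            bases.append("G")      # partner on the complement is C
        else:
            raise ValueError("inconsistent reads")
    return "".join(bases)

target = "GATTACA"
comp = reverse_complement(target)
reads = (protocol_a(target), protocol_b(target),
         protocol_a(comp), protocol_b(comp))
print(decode(*reads))
```

    Under these assumptions three of the four reads would already suffice; the fourth acts as a consistency check.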

  5. Chimeric DNA methyltransferases target DNA methylation to specific DNA sequences and repress expression of target genes

    PubMed Central

    Li, Fuyang; Papworth, Monika; Minczuk, Michal; Rohde, Christian; Zhang, Yingying; Ragozin, Sergei; Jeltsch, Albert

    2007-01-01

    Gene silencing by targeted DNA methylation has potential applications in basic research and therapy. To establish targeted methylation in human cell lines, the catalytic domains (CDs) of mouse Dnmt3a and Dnmt3b DNA methyltransferases (MTases) were fused to different DNA binding domains (DBD) of GAL4 and an engineered Cys2His2 zinc finger domain. We demonstrated that (i) Dense DNA methylation can be targeted to specific regions in gene promoters using chimeric DNA MTases. (ii) Site-specific methylation leads to repression of genes controlled by various cellular or viral promoters. (iii) Mutations affecting any of the DBD, MTase or target DNA sequences reduce targeted methylation and gene silencing. (iv) Targeted DNA methylation is effective in repressing Herpes Simplex Virus type 1 (HSV-1) infection in cell culture with the viral titer reduced by at least 18-fold in the presence of an MTase fused to an engineered zinc finger DBD, which binds a single site in the promoter of HSV-1 gene IE175k. In short, we show here that it is possible to direct DNA MTase activity to predetermined sites in DNA, achieve targeted gene silencing in mammalian cell lines and interfere with HSV-1 propagation. PMID:17151075

  6. Bacterial identification and subtyping using DNA microarray and DNA sequencing.

    PubMed

    Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D

    2012-01-01

    The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies not only will identify, isotype, and serotype pathogenic bacteria, but also it will aid in the discovery of new gene functions by detecting gene expressions in different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing by using 454 pyrosequencing is so cost effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing technology is a simple to use technique that can produce accurate and quantitative analysis of DNA sequences with a great speed. The deposition of massive amounts of bacterial genomic information in databanks is creating fingerprint phylogenetic analysis that will ultimately replace several technologies such as Pulsed Field Gel Electrophoresis. In this chapter, we will review (1) the use of DNA microarray using fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.

  7. Quadruplex DNA: sequence, topology and structure

    PubMed Central

    Burge, Sarah; Parkinson, Gary N.; Hazel, Pascale; Todd, Alan K.; Neidle, Stephen

    2006-01-01

    G-quadruplexes are higher-order DNA and RNA structures formed from G-rich sequences that are built around tetrads of hydrogen-bonded guanine bases. Potential quadruplex sequences have been identified in G-rich eukaryotic telomeres, and more recently in non-telomeric genomic DNA, e.g. in nuclease-hypersensitive promoter regions. The natural role and biological validation of these structures is starting to be explored, and there is particular interest in them as targets for therapeutic intervention. This survey focuses on the folding and structural features of quadruplexes formed from telomeric and non-telomeric DNA sequences, and examines fundamental aspects of topology and the emerging relationships with sequence. Emphasis is placed on information from the high-resolution methods of X-ray crystallography and NMR, and their scope and current limitations are discussed. Such information, together with biological insights, will be important for the discovery of drugs targeting quadruplexes from particular genes. PMID:17012276

  8. Female-specific DNA sequences in geese.

    PubMed

    Huang, M C; Lin, W C; Horng, Y M; Rouvier, R; Huang, C W

    2003-07-01

    1. The OPAE random primers (Operon Technologies, Inc., CA) were used for random amplified polymorphic DNA (RAPD) fingerprinting in Chinese, White Roman and Landaise geese. One of these primers, OPAE-06, produced a 938-bp sex-specific fragment in all females and in no males of Chinese geese only. 2. A novel female-specific DNA sequence in Chinese goose was cloned and sequenced. Two primers, CGSex-F and CGSex-R, were designed in order to amplify a 912-bp sex-specific polymerase chain reaction (PCR) fragment on genomic DNA from female geese. 3. It was shown that a simple and effective PCR-based sexing technique could be used in the three goose breeds studied. 4. Nucleotide sequencing of the sex-specific fragments in White Roman and Landaise geese was performed and sequence differences were observed among these three breeds.

  9. Dynamics and control of DNA sequence amplification

    NASA Astrophysics Data System (ADS)

    Marimuthu, Karthikeyan; Chakrabarti, Raj

    2014-10-01

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.
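
    As a point of reference for the term "geometric amplification", the following minimal sketch computes the copy number after a series of cycles, each with its own efficiency. It is a textbook yield calculation, not the control-theoretic model of the paper; the function name and the per-cycle efficiency list are illustrative assumptions.

```python
def amplify(initial_copies, efficiencies):
    """Copy number after successive cycles.

    Each cycle multiplies the copy number by (1 + efficiency), where an
    efficiency of 1.0 means every template is duplicated in that cycle.
    """
    copies = float(initial_copies)
    for eff in efficiencies:
        if not 0.0 <= eff <= 1.0:
            raise ValueError("efficiency must lie in [0, 1]")
        copies *= 1.0 + eff
    return copies

# Ideal case: 30 perfectly efficient cycles from a single template.
print(amplify(1, [1.0] * 30))
# Efficiency dropping in late cycles lowers the final yield.
print(amplify(1, [1.0] * 20 + [0.5] * 10))
```

    Because the yield is a product over cycles, any operating condition that raises per-cycle efficiency compounds across the whole reaction.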

  10. Dynamics and control of DNA sequence amplification

    SciTech Connect

    Marimuthu, Karthikeyan; Chakrabarti, Raj

    2014-10-28

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.

  11. Compressing DNA sequence databases with coil

    PubMed Central

    White, W Timothy J; Hendy, Michael D

    2008-01-01

    Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work. PMID:18489794

  12. Quantum-Sequencing: Fast electronic single DNA molecule sequencing

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective, single-molecule sequencing method. Here, we present the first demonstration of a unique "electronic fingerprint" of all nucleotides (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic state of the nucleobases shifts depending on the pH, with the most distinct states identified at acidic pH. We also demonstrate identification of single nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of the beta lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.

  13. Active DNA unwinding dynamics during processive DNA replication.

    PubMed

    Morin, José A; Cao, Francisco J; Lázaro, José M; Arias-Gonzalez, J Ricardo; Valpuesta, José M; Carrascosa, José L; Salas, Margarita; Ibarra, Borja

    2012-05-22

    Duplication of double-stranded DNA (dsDNA) requires a fine-tuned coordination between the DNA replication and unwinding reactions. Using optical tweezers, we probed the coupling dynamics between these two activities when they are simultaneously carried out by individual Phi29 DNA polymerase molecules replicating a dsDNA hairpin. We used the wild-type and an unwinding deficient polymerase variant and found that mechanical tension applied on the DNA and the DNA sequence modulate in different ways the replication, unwinding rates, and pause kinetics of each polymerase. However, incorporation of pause kinetics in a model to quantify the unwinding reaction reveals that both polymerases destabilize the fork with the same active mechanism and offers insights into the topological strategies that could be used by the Phi29 DNA polymerase and other DNA replication systems to couple unwinding and replication reactions.

  14. Expression of active human blood clotting factor IX in transgenic mice: use of a cDNA with complete mRNA sequence.

    PubMed Central

    Choo, K H; Raphael, K; McAdam, W; Peterson, M G

    1987-01-01

    Haemophilia B is a bleeding disorder caused by a functional deficiency of the clotting factor IX. A full length human factor IX complementary DNA clone containing all the natural mRNA sequences plus some flanking intron sequences was constructed with a metallothionein promoter and introduced into transgenic mice by microinjection into the pronuclei of fertilised eggs. The transgenic mice expressed high levels of messenger RNA, gamma-carboxylated and glycosylated protein, and biological clotting activity that are indistinguishable from normal human plasma factor IX. This study demonstrates the feasibility of expressing highly complex heterologous proteins in transgenic mice. It also provides the groundwork for the production of large amounts of human factor IX in larger transgenic livestock for therapeutic use, and the investigation of alternative genetic therapies for haemophilia B. PMID:3029708

  15. Inferring ethnicity from mitochondrial DNA sequence

    PubMed Central

    2011-01-01

    Background The assignment of DNA samples to coarse population groups can be a useful but difficult task. One such example is the inference of coarse ethnic groupings for forensic applications. Ethnicity plays an important role in forensic investigation and can be inferred with the help of genetic markers. Being maternally inherited, present in high copy number, and robustly persistent in degraded samples, mitochondrial DNA may be useful for inferring coarse ethnicity. In this study, we compare the performance of methods for inferring ethnicity from the sequence of the hypervariable region of the mitochondrial genome. Results We present the results of comprehensive experiments conducted on datasets extracted from the mtDNA population database, showing that ethnicity inference based on support vector machines (SVM) achieves an overall accuracy of 80-90%, consistently outperforming nearest neighbor and discriminant analysis methods previously proposed in the literature. We also evaluate methods of handling missing data and characterize the most informative segments of the hypervariable region of the mitochondrial genome. Conclusions Support vector machines can be used to infer coarse ethnicity from a small region of mitochondrial DNA sequence with surprisingly high accuracy. In the presence of missing data, utilizing only the regions common to the training sequences and a test sequence proves to be the best strategy. Given these results, SVM algorithms are likely to also be useful in other DNA sequence classification applications. PMID:21554759

  16. Sequencing of long stretches of repetitive DNA

    PubMed Central

    De Bustos, Alfredo; Cuadrado, Angeles; Jouve, Nicolás

    2016-01-01

    Repetitive DNA is widespread in eukaryotic genomes, in some cases making up more than 80% of the total. SSRs are a type of repetitive DNA formed by short motifs repeated in tandem arrays. In some species, SSRs may be organized into long stretches, usually associated with the constitutive heterochromatin. Variation in repeats can alter the expression of genes, and changes in the number of repeats have been linked to certain human diseases. Unfortunately, the molecular characterization of these repeats has been hampered by technical limitations related to cloning and sequencing. Indeed, most sequenced genomes contain gaps owing to repetitive DNA-related assembly difficulties. This paper reports an alternative method for sequencing of long stretches of repetitive DNA based on the combined use of 1) a linear vector to stabilize the cloning process, and 2) the use of exonuclease III for obtaining progressive deletions of SSR-rich fragments. This strategy allowed the sequencing of a fragment containing a stretch of 6.2 kb of continuous SSRs. To demonstrate that this procedure can sequence other kinds of repetitive DNA, it was used to examine a 4.5 kb fragment containing a cluster of 15 repeats of the 5S rRNA gene of barley. PMID:27819354

  17. DNA sequencing by synthesis based on elongation delay detection

    NASA Astrophysics Data System (ADS)

    Manturov, Alexey O.; Grigoryev, Anton V.

    2015-03-01

    One of the most important problems in modern genetics, biology and medicine is the determination of the primary nucleotide sequence of the DNA of living organisms (DNA sequencing). This paper describes a label-free DNA sequencing approach based on observation of the discrete dynamics of the DNA elongation phase. The proposed DNA sequencing principle is studied by numerical simulation. The numerical model for the proposed label-free approach is based on a cellular automaton, which can simulate the elongation stage (growth of DNA strands) and the dynamics of nucleotide incorporation into the growing DNA strand. Estimates were obtained for the number of copied DNA sequences needed to reach a required probability of detecting nucleotide incorporation events and of determining the DNA sequence correctly. The proposed approach can be applied to all known DNA sequencing devices that operate on the "sequencing by synthesis" principle.

  18. Unzipping of DNA with correlated base sequence.

    PubMed

    Allahverdyan, A E; Gevorkian, Zh S; Hu, Chin-Kun; Wu, Ming-Chya

    2004-06-01

    We consider force-induced unzipping transition for a heterogeneous DNA model with a correlated base sequence. Both finite-range and long-range correlated situations are considered. It is shown that finite-range correlations increase stability of DNA with respect to the external unzipping force. Due to long-range correlations the number of unzipped base pairs displays two widely different scenarios depending on the details of the base sequence: either there is no unzipping phase transition at all, or the transition is realized via a sequence of jumps with magnitude comparable to the size of the system. Both scenarios are different from the behavior of the average number of unzipped base pairs (non-self-averaging). The results can be relevant for explaining the biological purpose of correlated structures in DNA.

  19. Statistical and linguistic features of DNA sequences

    NASA Technical Reports Server (NTRS)

    Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
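
    A minimal sketch of Detrended Fluctuation Analysis applied to a DNA sequence is given below. It is not the authors' code: the purine/pyrimidine step rule (A, G = +1; C, T = -1), the box sizes, and all function names are assumptions chosen for illustration. The exponent alpha is the slope of log F(n) against log n; a value near 0.5 indicates no long-range correlation.

```python
import math
import random

def dna_walk(seq):
    """Integrate a +1/-1 step series: purines (A, G) up, pyrimidines down."""
    walk, total = [], 0
    for base in seq.upper():
        total += 1 if base in "AG" else -1
        walk.append(total)
    return walk

def fluctuation(walk, n):
    """RMS deviation of the walk from a linear fit in each box of size n."""
    xs = range(n)
    x_mean = (n - 1) / 2
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sq_sum, count = 0.0, 0
    for start in range(0, len(walk) - n + 1, n):
        box = walk[start:start + n]
        y_mean = sum(box) / n
        slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, box)) / sxx
        for x, y in zip(xs, box):
            sq_sum += (y - (y_mean + slope * (x - x_mean))) ** 2
            count += 1
    return math.sqrt(sq_sum / count)

def dfa_exponent(seq, box_sizes=(8, 16, 32, 64, 128)):
    """Least-squares slope of log F(n) versus log n."""
    walk = dna_walk(seq)
    pts = [(math.log(n), math.log(fluctuation(walk, n))) for n in box_sizes]
    mx = sum(a for a, _ in pts) / len(pts)
    my = sum(b for _, b in pts) / len(pts)
    num = sum((a - mx) * (b - my) for a, b in pts)
    den = sum((a - mx) ** 2 for a, _ in pts)
    return num / den

random.seed(1)
shuffled = "".join(random.choice("ACGT") for _ in range(4096))
print(round(dfa_exponent(shuffled), 2))  # expected to be close to 0.5
```

    Detrending each box separately is what lets the method cope with the "non-stationary" drift in base composition mentioned in the abstract.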

  20. A Bioluminometric Method of DNA Sequencing

    NASA Technical Reports Server (NTRS)

    Ronaghi, Mostafa; Pourmand, Nader; Stolc, Viktor; Arnold, Jim (Technical Monitor)

    2001-01-01

    Pyrosequencing is a bioluminometric single-tube DNA sequencing method that takes advantage of co-operativity between four enzymes to monitor DNA synthesis. In this sequencing-by-synthesis method, a cascade of enzymatic reactions yields detectable light, which is proportional to incorporated nucleotides. Pyrosequencing has the advantages of accuracy, flexibility and parallel processing. It can be easily automated. Furthermore, the technique dispenses with the need for labeled primers, labeled nucleotides and gel-electrophoresis. In this chapter, the use of this technique for different applications is discussed.
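
    The proportionality between light signal and incorporated nucleotides can be illustrated with a toy simulation. This sketch assumes a fixed cyclic dispensation order (A, C, G, T), noise-free signals, and complete incorporation in every flow; the function names are hypothetical and the enzymatic cascade itself is not modeled.

```python
def pyrogram(new_strand, order="ACGT"):
    """Simulate flow signals for the strand being synthesized.

    Each dispensed nucleotide yields a signal equal to the number of
    consecutive positions it fills (0 if it does not match the next base).
    """
    if any(base not in order for base in new_strand):
        raise ValueError("strand contains a base outside the dispensation order")
    flows, pos, i = [], 0, 0
    while pos < len(new_strand):
        nuc = order[i % len(order)]
        run = 0
        while pos < len(new_strand) and new_strand[pos] == nuc:
            run += 1
            pos += 1
        flows.append((nuc, run))
        i += 1
    return flows

def call_bases(flows):
    """Turn (nucleotide, signal) pairs back into a sequence."""
    return "".join(nuc * round(signal) for nuc, signal in flows)

flows = pyrogram("GGATC")
print(flows)
print(call_bases(flows))
```

    A homopolymer such as GG appears as a single flow with double intensity, which is why quantitative signal measurement matters for this method.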

  1. Sequence specificity of DNA cleavage by Micrococcus luteus γ endonuclease

    SciTech Connect

    Hentosh, P.; Henner, W.D.; Reynolds, R.J.

    1985-04-01

    DNA fragments of defined sequence have been used to determine the sites of cleavage by γ-endonuclease activity in extracts prepared from Micrococcus luteus. End-labeled DNA restriction fragments of pBR322 DNA that had been irradiated under nitrogen in the presence of potassium iodide or t-butanol were treated with M. luteus γ endonuclease and analyzed on irradiated DNA preferentially at the positions of cytosines and thymines. DNA cleavage occurred immediately to the 3' side of pyrimidines in irradiated DNA and resulted in fragments that terminate in a 5'-phosphoryl group. These studies indicate that both altered cytosines and thymines may be important DNA lesions requiring repair after exposure to γ radiation.

  2. Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.

    PubMed

    Gupta, P D

    2016-10-01

    In health care, the importance of DNA sequencing has been fully established. Sanger's capillary electrophoresis DNA sequencing methodology is time consuming and cumbersome, and hence more expensive. Lately, because of its versatility, DNA sequencing has become a household name, and there is therefore an urgent need for a simple, fast, inexpensive DNA sequencing technology. At the beginning of this century efforts were made, and nanopore DNA sequencing technology was developed; it is still in its infancy, but it is nevertheless the futuristic technology.

  3. A microchannel electrophoresis DNA sequencing system

    SciTech Connect

    Madabhushi, R S; Warth, T; Balch, J W; Bass, M; Brewer, L R; Copeland, A C; Davidson, J C; Fitch, J P; Kegelmeyer, L M; Kimbrough, J R; McCready, P; Nelson, D; Pastrone, R L; Richardson, P M; Swierkowski, S P; Tarte, L A; Vainer, M

    1999-01-01

    In order to increase the DNA sequencing throughput of the Joint Genome Institute, we have developed a microchannel electrophoresis system. The critical new and unique elements of this system include 1) a process for the production of arrays of 96 and 384 microchannels on bonded glass substrates up to 14 x 58 cm and 2) new sieving media for high resolution and high speed separations. With custom fabrication apparatus, microchannels are etched in a borosilicate substrate, and then fusion bonded to a top substrate 1.1 mm thick that has access holes formed in it. SEM examination shows a typical microchannel to be 40 micrometers deep x 180 micrometers wide by 46 cm long. This technology offers significant advantages over discrete capillaries or conventional slab-gel approaches. High throughput DNA sequencing with over 550 base pairs resolution has been achieved in roughly half the time of conventional sequencers. In February 1999, we begin a pre-production evaluation protocol for the microchannel and for three glass capillary electrophoresis systems (two from industry and one developed by Lawrence Berkeley National Laboratory for the Joint Genome Institute). In order to utilize these instruments for DNA production sequencing, we have been evaluating and implementing software to convert raw electropherograms into called DNA bases with an associated probability of error. Our original intent was to utilize the DNA base calling software known as Plan and Phred developed by the University of Washington. This software has been outstanding for our slab gel electrophoresis systems currently in the production facility. In our tests and evaluations of this software applied to microchannel data, we observed that the electropherograms are of a different statistical and underlying signal structure compared to slab gels. Even with substantial modifications to the software, base calling performance was not satisfactory for the microchannel data. In this paper, we will present…

  4. A Simulation of DNA Sequencing Utilizing 3M Post-It® Notes

    ERIC Educational Resources Information Center

    Christensen, Doug

    2009-01-01

    An inexpensive and equipment-free approach to teaching the technical aspects of DNA sequencing is presented. The activity described requires an instructor familiar with DNA sequencing technology but provides a straightforward method of teaching the technical aspects of sequencing in the absence of expensive sequencing equipment. The final sequence…

  5. DNA sequence and structure requirements for cleavage of V(D)J recombination signal sequences.

    PubMed Central

    Cuomo, C A; Mundy, C L; Oettinger, M A

    1996-01-01

    Purified RAG1 and RAG2 proteins can cleave DNA at V(D)J recombination signals. In dissecting the DNA sequence and structural requirements for cleavage, we find that the heptamer and nonamer motifs of the recombination signal sequence can independently direct both steps of the cleavage reaction. Proper helical spacing between these two elements greatly enhances the efficiency of cleavage, whereas improper spacing can lead to interference between the two elements. The signal sequences are surprisingly tolerant of structural variation and function efficiently when nicks, gaps, and mismatched bases are introduced or even when the signal sequence is completely single stranded. Sequence alterations that facilitate unpairing of the bases at the signal/coding border activate the cleavage reaction, suggesting that DNA distortion is critical for V(D)J recombination. PMID:8816481

  6. New Stopping Criteria for Segmenting DNA Sequences

    SciTech Connect

    Li, Wentian

    2001-06-18

    We propose a solution to the stopping-criterion problem in segmenting inhomogeneous DNA sequences with complex statistical patterns. This new stopping criterion is based on the Bayesian information criterion in the model selection framework. When this criterion is applied to telomere of S. cerevisiae and the complete sequence of E. coli, borders of biologically meaningful units were identified, and a more reasonable number of domains was obtained. We also introduce a measure called segmentation strength which can be used to control the delineation of large domains. The relationship between the average domain size and the threshold of segmentation strength is determined for several genome sequences.
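
    One way a BIC-based stopping rule for a single segmentation step could look is sketched below. This is an assumption-laden illustration rather than the paper's exact formulation: each segment is modeled as independent draws from its own base composition, the best cut maximizes the log-likelihood gain, and the cut is accepted only if twice that gain exceeds a penalty of 4 extra parameters (three base frequencies plus the cut position) times ln N. The parameter count and all names are assumptions.

```python
import math
from collections import Counter

def neg_log_likelihood(counts, n):
    """n times the Shannon entropy (in nats) of a segment's composition."""
    return -sum(c * math.log(c / n) for c in counts.values() if c)

def best_split(seq):
    """Return (cut index, log-likelihood gain) of the best two-way split."""
    n = len(seq)
    total = Counter(seq)
    whole = neg_log_likelihood(total, n)
    left = Counter()
    best_i, best_gain = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        right = total - left
        gain = (whole - neg_log_likelihood(left, i)
                - neg_log_likelihood(right, n - i))
        if gain > best_gain:
            best_i, best_gain = i, gain
    return best_i, best_gain

def accept_split(seq, extra_params=4):
    """Apply the BIC test: accept if 2 * gain > extra_params * ln(N)."""
    i, gain = best_split(seq)
    ok = i is not None and 2 * gain > extra_params * math.log(len(seq))
    return i, ok

print(accept_split("A" * 50 + "C" * 50))  # a clear compositional border
print(accept_split("ACGT" * 25))          # homogeneous, split rejected
```

    In a full segmentation the same test would be applied recursively to each accepted sub-segment, with recursion stopping wherever the test fails.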

  7. New Stopping Criteria for Segmenting DNA Sequences

    NASA Astrophysics Data System (ADS)

    Li, Wentian

    2001-06-01

    We propose a solution to the stopping-criterion problem in segmenting inhomogeneous DNA sequences with complex statistical patterns. This new stopping criterion is based on the Bayesian information criterion in the model selection framework. When this criterion is applied to telomere of S. cerevisiae and the complete sequence of E. coli, borders of biologically meaningful units were identified, and a more reasonable number of domains was obtained. We also introduce a measure called segmentation strength which can be used to control the delineation of large domains. The relationship between the average domain size and the threshold of segmentation strength is determined for several genome sequences.

  8. DNA Sequence Alignment during Homologous Recombination.

    PubMed

    Greene, Eric C

    2016-05-27

    Homologous recombination allows for the regulated exchange of genetic information between two different DNA molecules of identical or nearly identical sequence composition, and is a major pathway for the repair of double-stranded DNA breaks. A key facet of homologous recombination is the ability of recombination proteins to perfectly align the damaged DNA with homologous sequence located elsewhere in the genome. This reaction is referred to as the homology search and is akin to the target searches conducted by many different DNA-binding proteins. Here I briefly highlight early investigations into the homology search mechanism, and then describe more recent research. Based on these studies, I summarize a model that includes a combination of intersegmental transfer, short-distance one-dimensional sliding, and length-specific microhomology recognition to efficiently align DNA sequences during the homology search. I also suggest some future directions to help further our understanding of the homology search. Where appropriate, I direct the reader to other recent reviews describing various issues related to homologous recombination.

  9. The first determination of DNA sequence of a specific gene.

    PubMed

    Inouye, Masayori

    2016-05-10

    How and when was the first DNA sequence of a gene determined? In 1977, F. Sanger came up with an innovative technology to sequence DNA by using chain terminators, and determined the entire DNA sequence of the 5375-base genome of bacteriophage φX174 (Sanger et al., 1977). While Sanger's achievement has been recognized as the first DNA sequencing of genes, we had determined the DNA sequence of a gene, albeit a partial sequence, 11 years before Sanger's DNA sequence (Okada et al., 1966).

  10. Imaging of DNA sequences with chemiluminescence.

    PubMed Central

    Tizard, R; Cate, R L; Ramachandran, K L; Wysk, M; Voyta, J C; Murphy, O J; Bronstein, I

    1990-01-01

    We have coupled a chemiluminescent detection method that uses an alkaline phosphatase label to the genomic DNA sequencing protocol of Church and Gilbert [Church, G. M. & Gilbert, W. (1984) Proc. Natl. Acad. Sci. USA 81, 1991-1995]. Images of sequence ladders are obtained on x-ray film with exposure times of less than 30 min, as compared to 40 h required for a similar exposure with a 32P-labeled oligomer. Chemically cleaved DNA from a sequencing gel is transferred to a nylon membrane, and specific sequence ladders are selected by hybridization to DNA oligonucleotides labeled with alkaline phosphatase or with biotin, leading directly or indirectly to deposition of enzyme. If a biotinylated probe is used, an incubation with avidin-alkaline phosphatase conjugate follows. The membrane is soaked in the chemiluminescent substrate (AMPPD) and is exposed to film. Dephosphorylation of AMPPD leads in a two-step pathway to a highly localized emission of visible light. The demonstrated shorter exposure times may improve the efficiency of a serial reprobing strategy such as the multiplex sequencing approach of Church and Kieffer-Higgins [Church, G. M. & Kieffer-Higgins, S. (1988) Science 240, 185-188]. Images PMID:2191292

  11. Nanopore-CMOS Interfaces for DNA Sequencing

    PubMed Central

    Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim

    2016-01-01

    DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces. PMID:27509529

  12. Repetitive DNA sequences in Mycoplasma pneumoniae.

    PubMed Central

    Wenzel, R; Herrmann, R

    1988-01-01

    Two types of different repetitive DNA sequences called RepMP1 and RepMP2 were identified in the genome of Mycoplasma pneumoniae. The number of these repeated elements, their nucleotide sequence and their localization on a physical map of the M. pneumoniae genome were determined. The results show that RepMP1 appears at least 10 times and RepMP2 at least 8 times in the genome. The repeated elements are dispersed on the chromosome and, in three cases, linked to each other by a homologous DNA sequence of 400 bp. The elements themselves are 300 bp (for RepMP1) and 150 bp (for RepMP2) long showing a high degree of homology. One copy of RepMP2 is a translated part of the gene for the major cytadhesin protein P1 which is responsible for the adsorption of M. pneumoniae to its host cell. Images PMID:3138660

  13. DNA sequencing by nanopores: advances and challenges

    NASA Astrophysics Data System (ADS)

    Agah, Shaghayegh; Zheng, Ming; Pasquali, Matteo; Kolomeisky, Anatoly B.

    2016-10-01

    Developing inexpensive and simple DNA sequencing methods capable of detecting entire genomes in short periods of time could revolutionize the world of medicine and technology. It will also lead to major advances in our understanding of fundamental biological processes. It has been shown that nanopores have the ability of single-molecule sensing of various biological molecules rapidly and at a low cost. This has stimulated significant experimental efforts in developing DNA sequencing techniques by utilizing biological and artificial nanopores. In this review, we discuss recent progress in the nanopore sequencing field with a focus on the nature of nanopores and on sensing mechanisms during the translocation. Current challenges and alternative methods are also discussed.

  14. Sequence-Dependent Persistence Lengths of DNA.

    PubMed

    Mitchell, Jonathan S; Glowacki, Jaroslaw; Grandchamp, Alexandre E; Manning, Robert S; Maddocks, John H

    2017-03-24

    A Monte Carlo code applied to the cgDNA coarse-grain rigid-base model of B-form double-stranded DNA is used to predict a sequence-averaged persistence length of lF = 53.5 nm in the sense of Flory, and of lp = 160 bp or 53.5 nm in the sense of apparent tangent-tangent correlation decay. These estimates are slightly higher than the consensus experimental values of 150 bp or 50 nm, but we believe the agreement to be good given that the cgDNA model is itself parametrized from molecular dynamics simulations of short fragments of length 10-20 bp, with no explicit fit to persistence length. Our Monte Carlo simulations further predict that there can be substantial dependence of persistence lengths on the specific sequence [Formula: see text] of a fragment. We propose, and confirm the numerical accuracy of, a simple factorization that separates the part of the apparent tangent-tangent correlation decay [Formula: see text] attributable to intrinsic shape, from a part [Formula: see text] attributable purely to stiffness, i.e., a sequence-dependent version of what has been called sequence-averaged dynamic persistence length l̅d (=58.8 nm within the cgDNA model). For ensembles of both random and λ-phage fragments, the apparent persistence length [Formula: see text] has a standard deviation of 4 nm over sequence, whereas our dynamic persistence length [Formula: see text] has a standard deviation of only 1 nm. However, there are notable dynamic persistence length outliers, including poly(A) (exceptionally straight and stiff), poly(TA) (tightly coiled and exceptionally soft), and phased A-tract sequence motifs (exceptionally bent and stiff). The results of our numerical simulations agree reasonably well with both molecular dynamics simulation and diverse experimental data including minicircle cyclization rates and stereo cryo-electron microscopy images.

  15. Sequence-specific recognition of DNA nanostructures.

    PubMed

    Rusling, David A; Fox, Keith R

    2014-05-15

    DNA is the most exploited biopolymer for the programmed self-assembly of objects and devices that exhibit nanoscale-sized features. One of the most useful properties of DNA nanostructures is their ability to be functionalized with additional non-nucleic acid components. The introduction of such a component is often achieved by attaching it to an oligonucleotide that is part of the nanostructure, or hybridizing it to single-stranded overhangs that extend beyond or above the nanostructure surface. However, restrictions in nanostructure design and/or the self-assembly process can limit the suitability of these procedures. An alternative strategy is to couple the component to a DNA recognition agent that is capable of binding to duplex sequences within the nanostructure. This offers the advantage that it requires little, if any, alteration to the nanostructure and can be achieved after structure assembly. In addition, since the molecular recognition of DNA can be controlled by varying pH and ionic conditions, such systems offer tunable properties that are distinct from simple Watson-Crick hybridization. Here, we describe methodology that has been used to exploit and characterize the sequence-specific recognition of DNA nanostructures, with the aim of generating functional assemblies for bionanotechnology and synthetic biology applications.

  16. Compilation of DNA sequences of Escherichia coli

    PubMed Central

    Kröger, Manfred

    1989-01-01

    We have compiled the DNA sequence data for E. coli K12 available from the GENBANK and EMBO databases and, over a period of several years, independently from the literature. We have introduced all available genetic map data and have arranged the sequences accordingly. As far as possible the overlaps are deleted, and a total of 940,449 individual bp is found to have been determined up to the beginning of 1989. This corresponds to a total of 19.92% of the entire E. coli chromosome consisting of about 4,720 kbp. This number may actually be higher by some extra 2% derived from the sequence of lysogenic bacteriophage lambda and the various insertion sequences. This compilation may be available in machine readable form from one of the international databanks in the future. PMID:2654890

  17. Complete sequence of Euglena gracilis chloroplast DNA.

    PubMed Central

    Hallick, R B; Hong, L; Drager, R G; Favreau, M R; Monfort, A; Orsat, B; Spielmann, A; Stutz, E

    1993-01-01

    We report the complete DNA sequence of the Euglena gracilis, Pringsheim strain Z chloroplast genome. This circular DNA is 143,170 bp, counting only one copy of a 54 bp tandem repeat sequence that is present in variable copy number within a single culture. The overall organization of the genome involves a tandem array of three complete and one partial ribosomal RNA operons, and a large single copy region. There are genes for the 16S, 5S, and 23S rRNAs of the 70S chloroplast ribosomes, 27 different tRNA species, 21 ribosomal proteins plus the gene for elongation factor EF-Tu, three RNA polymerase subunits, and 27 known photosynthesis-related polypeptides. Several putative genes of unknown function have also been identified, including five within large introns, and five with amino acid sequence similarity to genes in other organisms. This genome contains at least 149 introns. There are 72 individual group II introns, 46 individual group III introns, 10 group II introns and 18 group III introns that are components of twintrons (introns-within-introns), and three additional introns suspected to be twintrons composed of multiple group II and/or group III introns, but not yet characterized. At least 54,804 bp, or 38.3% of the total DNA content is represented by introns. PMID:8346031
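
    As a quick consistency check on the figures quoted above, the intron classes can be tallied and the intron fraction recomputed. This is only an arithmetic sketch; the dictionary keys are labels chosen here for illustration.

```python
# Intron counts as listed in the abstract (labels are illustrative).
introns = {
    "individual group II": 72,
    "individual group III": 46,
    "group II within twintrons": 10,
    "group III within twintrons": 18,
    "suspected twintrons, uncharacterized": 3,
}

GENOME_BP = 143_170   # circular genome, one copy of the 54 bp repeat
INTRON_BP = 54_804    # minimum intron content stated in the abstract

total_introns = sum(introns.values())
intron_percent = round(100 * INTRON_BP / GENOME_BP, 1)

print(total_introns)   # 149
print(intron_percent)  # 38.3
```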

  18. DNA SEQUENCING RESEARCH GROUP (DSRG) 2003—A GENERAL SURVEY OF CORE DNA SEQUENCING FACILITIES

    PubMed Central

    Wiebe, Glenis J.; Pershad, Rashmi; Escobar, Helaman; Hawes, John W.; Hunter, Timothy; Jackson-Machelski, Emily; Knudtson, Kevin L.; Robertson, Margaret; Thannhauser, Theodore W.

    2003-01-01

    DNA sequencing core facilities serve as centralized resources within both academic and commercial institutions, providing expertise in the area of DNA analysis. The composition and configuration of these facilities continue to evolve in response to new developments in instrumentation and methodology. The goal of the 2003 DNA Sequencing Research Group (DSRG) survey was to identify recent changes in staffing, funding, instrumentation, services, and customer relations. Responses to 58 survey questions from 30 participants are presented to offer a look at the current typical DNA core sequencing facility. The results from this study will serve as a resource for institutions to benchmark their shared core laboratories, and to give facility directors an opportunity to compare and contrast their respective services and experiences.

  19. Random Coding Bounds for DNA Codes Based on Fibonacci Ensembles of DNA Sequences

    DTIC Science & Technology

    2008-07-01

    ... coding bound on the rate of DNA codes is proved. To obtain the bound, we use some ensembles of DNA sequences which are generalizations of the Fibonacci sequences. Subject terms: DNA Codes, Fibonacci Ensembles, DNA Computing, Code Optimization. Dates covered: 6 Jul 08 to 11 Jul 08.

  20. An oligonucleotide hybridization approach to DNA sequencing.

    PubMed

    Khrapko, K R; Lysov YuP; Khorlyn, A A; Shick, V V; Florentiev, V L; Mirzabekov, A D

    1989-10-09

    We have proposed a DNA sequencing method based on hybridization of a DNA fragment to be sequenced with the complete set of fixed-length oligonucleotides (e.g., 4^8 = 65,536 possible 8-mers) immobilized individually as dots of a 2-D matrix [(1989) Dokl. Akad. Nauk SSSR 303, 1508-1511]. It was shown that the list of hybridizing octanucleotides is sufficient for the computer-assisted reconstruction of the structures for 80% of random-sequence fragments up to 200 bases long, based on the analysis of the octanucleotide overlapping. Here a refinement of the method and some experimental data are presented. We have performed hybridizations with oligonucleotides immobilized on a glass plate, and obtained their dissociation curves down to heptanucleotides. Other approaches, e.g., an additional hybridization of short oligonucleotides which continuously extend duplexes formed between the fragment and immobilized oligonucleotides, should considerably increase either the probability of unambiguous reconstruction, or the length of reconstructed sequences, or decrease the size of immobilized oligonucleotides.
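
    The overlap-based reconstruction described above can be illustrated with a small sketch. It is not the authors' algorithm: it assumes an error-free hybridization spectrum, a known starting k-mer, and no repeated k-mers in the fragment, and it simply gives up when an extension is ambiguous. The function names and the toy fragment are hypothetical.

```python
def spectrum(seq, k):
    """Set of all k-mers in seq, i.e. the probes an ideal array would report."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reconstruct(kmers, start):
    """Extend from `start` by (k-1)-base overlaps; None if ambiguous or stuck."""
    k = len(start)
    remaining = set(kmers) - {start}
    seq = start
    while remaining:
        suffix = seq[-(k - 1):]
        candidates = [m for m in remaining if m.startswith(suffix)]
        if len(candidates) != 1:
            return None  # zero or several possible extensions
        seq += candidates[0][-1]
        remaining.remove(candidates[0])
    return seq


fragment = "ATGGCGTGCA"  # toy fragment chosen for illustration
print(4 ** 8)                                          # size of the full 8-mer set
print(reconstruct(spectrum(fragment, 3), fragment[:3]))  # ambiguous with 3-mers
print(reconstruct(spectrum(fragment, 4), fragment[:4]))  # unique with 4-mers
```

    In this toy case the 3-mer spectrum branches after "ATG" (both "TGG" and "TGC" overlap it), while the 4-mer spectrum yields a single path, which mirrors why longer probes raise the probability of unambiguous reconstruction.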

  1. Text mining of DNA sequence homology searches.

    PubMed

    McCallum, John; Ganesh, Siva

    2003-01-01

    Primary tasks in analysis and annotation of expressed sequence tag (EST) datasets are to identify similarity among sequences by unsupervised clustering and assign putative function based on BLAST homology searches. We investigated the usefulness of text mining as a simple approach for further higher-level clustering of EST datasets using IBM Intelligent Miner for Text v2.3 tools. Agglomerative and k-means clustering tools were used to cluster BLASTx homology search documents from two onion EST datasets and optimised by pre-processing and pruning. Subjective evaluation confirmed that these tools provided biologically useful and complementary views of the two libraries, provided new insights into their composition and revealed clusters previously identified by human experts. We compared BLASTx textual clusters for two gene families with their DNA sequence-based clusters and confirmed that these shared similar morphology.

  2. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1987-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3575113

  3. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1989-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889

  4. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1988-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330

  5. A comprehensive list of cloned human DNA sequences

    PubMed Central

    Schmidtke, Jörg; Cooper, David N.

    1990-01-01

    A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227

  6. Aspects of coverage in medical DNA sequencing

    PubMed Central

    Wendl, Michael C; Wilson, Richard K

    2008-01-01

    Background: DNA sequencing is now emerging as an important component in biomedical studies of diseases like cancer. Short-read, highly parallel sequencing instruments are expected to be used heavily for such projects, but many design specifications have yet to be conclusively established. Perhaps the most fundamental of these is the redundancy required to detect sequence variations, which bears directly upon genomic coverage and the consequent resolving power for discerning somatic mutations. Results: We address the medical sequencing coverage problem via an extension of the standard mathematical theory of haploid coverage. The expected diploid multi-fold coverage, as well as its generalization for aneuploidy, are derived and these expressions can be readily evaluated for any project. The resulting theory is used as a scaling law to calibrate performance to that of standard BAC sequencing at 8× to 10× redundancy, i.e. for expected coverages that exceed 99% of the unique sequence. A differential strategy is formalized for tumor/normal studies wherein tumor samples are sequenced more deeply than normal ones. In particular, both tumor alleles should be detected at least twice, while both normal alleles are detected at least once. Our theory predicts these requirements can be met for tumor and normal redundancies of approximately 26× and 21×, respectively. We explain why these values do not differ by a factor of 2, as might intuitively be expected. Future technology developments should prompt even deeper sequencing of tumors, but the 21× value for normal samples is essentially a constant. Conclusion: Given the assumptions of standard coverage theory, our model gives pragmatic estimates for required redundancy. The differential strategy should be an efficient means of identifying potential somatic mutations for further study. PMID:18485222
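
    For orientation, the standard haploid starting point can be sketched as follows. This assumes the usual Poisson model in which the number of reads covering a base has mean equal to the redundancy; it does not reproduce the diploid or aneuploid expressions derived in the paper, and the function name is hypothetical.

```python
import math


def prob_covered_at_least(redundancy, k=1):
    """P(a base is covered by >= k reads) under a Poisson(redundancy) model."""
    below = sum(
        math.exp(-redundancy) * redundancy ** j / math.factorial(j)
        for j in range(k)
    )
    return 1.0 - below


for c in (8, 10):
    # Expected fraction of unique sequence covered at least once.
    print(c, prob_covered_at_least(c))
```

    Under this model, 8× redundancy already gives an expected coverage above 99%, consistent with the calibration range quoted in the abstract.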

  7. Negatively supercoiled simian virus 40 DNA contains Z-DNA segments within transcriptional enhancer sequences

    NASA Technical Reports Server (NTRS)

    Nordheim, A.; Rich, A.

    1983-01-01

    Three 8-base pair (bp) segments of alternating purine-pyrimidine from the simian virus 40 enhancer region form Z-DNA on negative supercoiling; minichromosome DNase I-hypersensitive sites determined by others bracket these three segments. A survey of transcriptional enhancer sequences reveals a pattern of potential Z-DNA-forming regions which occur in pairs 50-80 bp apart. This may influence local chromatin structure and may be related to transcriptional activation.

  8. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    SciTech Connect

    Winston Chen, C.H.; Taranenko, N.I.; Zhu, Y.F.; Chung, C.N.; Allman, S.L.

    1997-03-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, the authors recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. The preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, the authors applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  9. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    NASA Astrophysics Data System (ADS)

    Chen, C. H. Winston; Taranenko, N. I.; Zhu, Y. F.; Chung, C. N.; Allman, S. L.

    1997-05-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, we recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. Our preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, we applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  10. [Characterization and modification of phage T7 DNA polymerase for use in DNA sequencing]: Progress report

    SciTech Connect

    Not Available

    1992-01-01

    This project focuses on the DNA polymerase and accessory proteins of phage T7 for use in DNA sequence analysis. T7 DNA polymerase (gene 5 protein) interacts with accessory proteins for the acquisition of properties such as processivity that are necessary for DNA replication. One goal is to understand these interactions in order to modify the proteins to increase their usefulness with DNA sequence analysis. Using a genetically modified gene 5 protein lacking 3' to 5' exonuclease activity we have found that in the presence of manganese there is no discrimination against dideoxynucleotides, a property that enables novel approaches to DNA sequencing using automated technology. Pyrophosphorolysis can create problems in DNA sequence determination, a problem that can be eliminated by the addition of pyrophosphatase. Crystals of the gene 5 protein/thioredoxin complex have now been obtained and X-ray diffraction analysis will be undertaken once their quality has been improved. Amino acid changes in gene 5 protein have been identified that alter its interaction with thioredoxin. Characterization of these proteins should help determine how thioredoxin confers processivity on polymerization. We have characterized the T7 DNA binding protein, the gene 2.5 protein, and shown that it interacts with gene 5 protein and gene 4 protein. The gene 2.5 protein mediates homologous base pairing and strand uptake. Gene 5.5 protein interacts with E. coli H1 protein and affects gene expression. Biochemical and genetic studies on the T7 56-kDa gene 4 protein, the helicase, are focused on its physical interaction with T7 DNA polymerase and the mechanism by which the hydrolysis of nucleoside triphosphates fuels its unidirectional translocation on DNA.

  11. [Characterization and modification of phage T7 DNA polymerase for use in DNA sequencing]: Progress report

    SciTech Connect

    Not Available

    1992-12-31

    This project focuses on the DNA polymerase and accessory proteins of phage T7 for use in DNA sequence analysis. T7 DNA polymerase (gene 5 protein) interacts with accessory proteins for the acquisition of properties such as processivity that are necessary for DNA replication. One goal is to understand these interactions in order to modify the proteins to increase their usefulness with DNA sequence analysis. Using a genetically modified gene 5 protein lacking 3' to 5' exonuclease activity we have found that in the presence of manganese there is no discrimination against dideoxynucleotides, a property that enables novel approaches to DNA sequencing using automated technology. Pyrophosphorolysis can create problems in DNA sequence determination, a problem that can be eliminated by the addition of pyrophosphatase. Crystals of the gene 5 protein/thioredoxin complex have now been obtained and X-ray diffraction analysis will be undertaken once their quality has been improved. Amino acid changes in gene 5 protein have been identified that alter its interaction with thioredoxin. Characterization of these proteins should help determine how thioredoxin confers processivity on polymerization. We have characterized the T7 DNA binding protein, the gene 2.5 protein, and shown that it interacts with gene 5 protein and gene 4 protein. The gene 2.5 protein mediates homologous base pairing and strand uptake. Gene 5.5 protein interacts with E. coli H1 protein and affects gene expression. Biochemical and genetic studies on the T7 56-kDa gene 4 protein, the helicase, are focused on its physical interaction with T7 DNA polymerase and the mechanism by which the hydrolysis of nucleoside triphosphates fuels its unidirectional translocation on DNA.

  12. Sequence dependent hole evolution in DNA.

    PubMed

    Lakhno, V D

    2004-06-01

    The paper examines the dynamical behavior of a radical cation (G⁺•) generated in a double stranded DNA for different oligonucleotide sequences. The resonance hole tunneling through an oligonucleotide sequence is studied by the method of numerical integration of self-consistent quantum-mechanical equations. The hole motion is considered quantum mechanically and nucleotide base oscillations are treated classically. The results obtained demonstrate a strong dependence of charge transfer on the type of nucleotide sequence. The rates of the hole transfer are calculated for different nucleotide sequences and compared with experimental data on the transfer from (G⁺•) to a GGG unit.

  13. Recent advances in DNA sequencing techniques

    NASA Astrophysics Data System (ADS)

    Singh, Rama Shankar

    2013-06-01

    Successful mapping of the draft human genome in 2001 and more recent mapping of the human microbiome genome in 2012 have relied heavily on the parallel processing of the second generation/Next Generation Sequencing (NGS) DNA machines at a cost of several million dollars and long computer processing times. These have been mainly biochemical approaches. Here a system analysis approach is used to review these techniques by identifying the requirements, specifications, test methods, error estimates, repeatability, reliability and trends in the cost reduction. The first generation, NGS and the Third Generation Single Molecule Real Time (SMART) detection sequencing methods are reviewed. Based on the National Human Genome Research Institute (NHGRI) data, the achieved cost reductions of 1.5 times per yr. from Sep. 2001 to July 2007; 7 times per yr. from Oct. 2007 to Apr. 2010; and 2.5 times per yr. from July 2010 to Jan. 2012 are discussed.
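
    To make the quoted rates concrete, the sketch below converts each "times per yr." figure into a cumulative fold reduction over its period. It assumes the rate compounds as a constant multiplicative factor per year, which is an interpretation made here rather than something stated in the abstract; the period lengths in months are computed from the quoted dates.

```python
# (period, cost-reduction factor per year, period length in months)
PERIODS = [
    ("Sep 2001 to Jul 2007", 1.5, 70),
    ("Oct 2007 to Apr 2010", 7.0, 30),
    ("Jul 2010 to Jan 2012", 2.5, 18),
]


def fold_reduction(rate_per_year, months):
    """Cumulative fold reduction, assuming the yearly factor compounds."""
    return rate_per_year ** (months / 12)


for label, rate, months in PERIODS:
    print(f"{label}: about {fold_reduction(rate, months):.1f}-fold")
```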

  14. Transverse Electronic Signature of DNA for Electronic Sequencing

    NASA Astrophysics Data System (ADS)

    Xu, Mingsheng; Endres, Robert G.; Arakawa, Yasuhiko

    In recent years, the proliferation of large-scale DNA sequencing projects for applications in clinical medicine and health care has driven the search for new methods that could reduce the time and cost. The commonly used Sanger sequencing method relies on the chemistry to read the bases in DNA and is far too slow and expensive for reading personal genetic codes. There were earlier attempts to sequence DNA by directly visualizing the nucleotide composition of the DNA molecules by scanning tunneling microscopy (STM). However, sequencing DNA based on directly imaging DNA's atomic structure has not yet been successful. In Chap. 9, Xu, Endres, and Arakawa report a potential physical alternative by detecting unique transverse electronic signatures of DNA bases using ultrahigh vacuum STM. Supported by the principles, calculations and statistical analyses, these authors argue that it would be possible to directly sequence DNA by the STM-based technology without any modification of the DNA.

  15. Determining orientation and direction of DNA sequences

    DOEpatents

    Goodwin, Edwin H.; Meyne, Julianne

    2000-01-01

    Determining orientation and direction of DNA sequences. A method by which fluorescence in situ hybridization can be made strand specific is described. Cell cultures are grown in a medium containing a halogenated nucleotide. The analog is partially incorporated in one DNA strand of each chromatid. This substitution takes place in opposite strands of the two sister chromatids. After staining with the fluorescent DNA-binding dye Hoechst 33258, cells are exposed to long-wavelength ultraviolet light which results in numerous strand nicks. These nicks enable the substituted strand to be denatured and solubilized by heat, treatment with high or low pH aqueous solutions, or by immersing the strands in 2× SSC (0.3 M NaCl + 0.03 M sodium citrate), to name three procedures. It is unnecessary to enzymatically digest the strands using Exo III or another exonuclease in order to excise and solubilize nucleotides starting at the sites of the nicks. The denaturing/solubilizing process removes most of the substituted strand while leaving the prereplication strand largely intact. Hybridization of a single-stranded probe of a tandem repeat arranged in a head-to-tail orientation will result in hybridization only to the chromatid with the complementary strand present.

  16. Sequence selective naked-eye detection of DNA harnessing extension of oligonucleotide-modified nucleotides.

    PubMed

    Verga, Daniela; Welter, Moritz; Marx, Andreas

    2016-02-01

    DNA polymerases can efficiently and sequence-selectively incorporate oligonucleotide (ODN)-modified nucleotides, and the incorporated oligonucleotide strand can be employed as primer in rolling circle amplification (RCA). The effective amplification of the DNA primer by Φ29 DNA polymerase allows the sequence-selective hybridisation of the amplified strand with a G-quadruplex DNA sequence that has horseradish peroxidase-like activity. Based on these findings we develop a system that allows DNA detection with single-base resolution by naked eye.

  17. Non-random DNA fragmentation in next-generation sequencing

    NASA Astrophysics Data System (ADS)

    Poptsova, Maria S.; Il'Icheva, Irina A.; Nechipurenko, Dmitry Yu.; Panchenko, Larisa A.; Khodikov, Mingian V.; Oparina, Nina Y.; Polozov, Robert V.; Nechipurenko, Yury D.; Grokhovsky, Sergei L.

    2014-03-01

    Next Generation Sequencing (NGS) technology is based on cutting DNA into small fragments and sequencing them in a massively parallel fashion. The multiple overlapping segments, termed "reads", are assembled into a contiguous sequence. To reduce sequencing errors, every genome region should be sequenced several dozen times. This sequencing approach is based on the assumption that genomic DNA breaks are random and sequence-independent. However, we previously showed that for sonicated restriction DNA fragments the rates of double-stranded breaks depend on the nucleotide sequence. In this work we analyzed genomic reads from NGS data and discovered that fragmentation methods based on the action of hydrodynamic forces on DNA produce similar bias. Consideration of this non-random DNA fragmentation may allow one to unravel which factors influence the non-uniform coverage of various genomic regions, and to what extent.

  18. Sequence-specific DNA alkylation by tandem Py-Im polyamide conjugates.

    PubMed

    Taylor, Rhys Dylan; Kawamoto, Yusuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

    2014-09-01

    Tandem N-methylpyrrole-N-methylimidazole (Py-Im) polyamides with good sequence-specific DNA-alkylating activities have been designed and synthesized. Three alkylating tandem Py-Im polyamides with different linkers, which each contained the same moiety for the recognition of a 10 bp DNA sequence, were evaluated for their reactivity and selectivity by DNA alkylation, using high-resolution denaturing gel electrophoresis. All three conjugates displayed high reactivities for the target sequence. In particular, polyamide 1, which contained a β-alanine linker, displayed the most-selective sequence-specific alkylation towards the target 10 bp DNA sequence. The tandem Py-Im polyamide conjugates displayed greater sequence-specific DNA alkylation than conventional hairpin Py-Im polyamide conjugates (4 and 5). For further research, the design of tandem Py-Im polyamide conjugates could play an important role in targeting specific gene sequences.

  19. DNA extraction from vegetative tissue for next-generation sequencing.

    PubMed

    Furtado, Agnelo

    2014-01-01

    The quality of extracted DNA is crucial for several applications in molecular biology. If the DNA is to be used for next-generation sequencing (NGS), then microgram quantities of good-quality DNA are required. In addition, the DNA must be substantially of high molecular weight so that it can be used for library preparation and NGS. Contaminating phenol or starch in the isolated DNA can be easily removed by filtration through kit-based cartridges. In this chapter we describe a simple two-reagent DNA extraction protocol which yields DNA of high quality and quantity that can be used for different applications, including NGS.

  20. Nucleotide sequence of a preferred maize chloroplast genome template for in vitro DNA synthesis.

    PubMed Central

    Gold, B; Carrillo, N; Tewari, K K; Bogorad, L

    1987-01-01

    Maize chloroplast DNA sequences representing 94% of the chromosome have been surveyed for their activity as autonomously replicating sequences in yeast and as templates for DNA synthesis in vitro by a partially purified chloroplast DNA polymerase. A maize chloroplast DNA region extending over about 9 kilobase pairs is especially active as a template for the DNA synthesis reaction. Fragments from within this region are much more active than DNA from elsewhere in the chromosome and 50- to 100-fold more active than DNA of the cloning vector pBR322. The smallest of the strongly active subfragments that we have studied, the 1368-base-pair EcoRI fragment x, has been sequenced and found to contain the coding region of chloroplast ribosomal protein L16. EcoRI fragment x shows sequence homology with a portion of the Chlamydomonas reinhardtii chloroplast chromosome that forms a displacement loop [Wang, X.-M., Chang, C.H., Waddell, J. & Wu, M. (1984) Nucleic Acids Res. 12, 3857-3872]. Maize chloroplast DNA fragments that permit autonomous replication of DNA in yeast are not active as templates for DNA synthesis in the in vitro assay. The template active region we have identified may represent one of the origins of replication of maize chloroplast DNA. Images PMID:3025853

  1. Solid-Phase Purification of Synthetic DNA Sequences.

    PubMed

    Grajkowski, Andrzej; Cieslak, Jacek; Beaucage, Serge L

    2016-08-05

    Although high-throughput methods for solid-phase synthesis of DNA sequences are currently available for synthetic biology applications and technologies for large-scale production of nucleic acid-based drugs have been exploited for various therapeutic indications, little has been done to develop high-throughput procedures for the purification of synthetic nucleic acid sequences. An efficient process for purification of phosphorothioate and native DNA sequences is described herein. This process consists of functionalizing commercial aminopropylated silica gel with aminooxyalkyl functions to enable capture of DNA sequences carrying a 5'-siloxyl ether linker with a "keto" function through an oximation reaction. Deoxyribonucleoside phosphoramidites functionalized with the 5'-siloxyl ether linker were prepared in yields of 75-83% and incorporated last into the solid-phase assembly of DNA sequences. Capture of nucleobase- and phosphate-deprotected DNA sequences released from the synthesis support is demonstrated to proceed near quantitatively. After shorter than full-length DNA sequences were washed from the capture support, the purified DNA sequences were released from this support upon treatment with tetra-n-butylammonium fluoride in dry DMSO. The purity of released DNA sequences exceeds 98%. The scalability and high-throughput features of the purification process are demonstrated without sacrificing purity of the DNA sequences.

  2. What Advances Are Being Made in DNA Sequencing?

    MedlinePlus

    ... DNA building blocks (nucleotides) in an individual's genetic code, called DNA sequencing, has advanced the study of ... breakthrough that helped scientists determine the human genetic code, but it is time-consuming and expensive. The ...

  3. Agrobacterium T-DNA integration in Arabidopsis is correlated with DNA sequence compositions that occur frequently in gene promoter regions.

    PubMed

    Schneeberger, Richard G; Zhang, Ke; Tatarinova, Tatiana; Troukhan, Max; Kwok, Shing F; Drais, Josh; Klinger, Kevin; Orejudos, Francis; Macy, Kimberly; Bhakta, Amit; Burns, James; Subramanian, Gopal; Donson, Jonathan; Flavell, Richard; Feldmann, Kenneth A

    2005-10-01

    Mobile insertion elements such as transposons and T-DNA generate useful genetic variation and are important tools for functional genomics studies in plants and animals. The spectrum of mutations obtained in different systems can be highly influenced by target site preferences inherent in the mechanism of DNA integration. We investigated the target site preferences of Agrobacterium T-DNA insertions in the chromosomes of the model plant Arabidopsis thaliana. The relative frequencies of insertions in genic and intergenic regions of the genome were calculated and DNA composition features associated with the insertion site flanking sequences were identified. Insertion frequencies across the genome indicate that T-strand integration is suppressed near centromeres and rDNA loci, progressively increases towards telomeres, and is highly correlated with gene density. At the gene level, T-DNA integration events show a statistically significant preference for insertion in the 5' and 3' flanking regions of protein coding sequences as well as the promoter region of RNA polymerase I transcribed rRNA gene repeats. The increased insertion frequencies in 5' upstream regions compared to coding sequences are positively correlated with gene expression activity and DNA sequence composition. Analysis of the relationship between DNA sequence composition and gene activity further demonstrates that DNA sequences with high CG-skew ratios are consistently correlated with T-DNA insertion site preference and high gene expression. The results demonstrate genomic and gene-specific preferences for T-strand integration and suggest that DNA sequences with a pronounced transition in CG- and AT-skew ratios are preferred targets for T-DNA integration.
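
    The skew measures mentioned above can be computed directly from base counts. The sketch below assumes the common convention CG-skew = (C - G)/(C + G) and AT-skew = (A - T)/(A + T); the abstract does not state the exact formula or window size used, so both the sign convention and the `window`/`step` parameters are illustrative assumptions.

```python
def cg_skew(seq: str) -> float:
    """(C - G) / (C + G); returns 0.0 when the window has no C or G."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    return (c - g) / (c + g) if (c + g) else 0.0


def at_skew(seq: str) -> float:
    """(A - T) / (A + T); returns 0.0 when the window has no A or T."""
    seq = seq.upper()
    a, t = seq.count("A"), seq.count("T")
    return (a - t) / (a + t) if (a + t) else 0.0


def sliding_skew(seq: str, window: int, step: int) -> list:
    """CG-skew in consecutive windows, as (start, skew) pairs."""
    return [(i, cg_skew(seq[i:i + window]))
            for i in range(0, len(seq) - window + 1, step)]


if __name__ == "__main__":
    # A sharp transition in skew shows up as a sign change between windows.
    print(sliding_skew("CCCCACCCGGGGTGGG", window=8, step=8))
```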

  4. A new specific DNA endonuclease activity in yeast mitochondria.

    PubMed

    Sargueil, B; Delahodde, A; Hatat, D; Tian, G L; Lazowska, J; Jacq, C

    1991-02-01

    Two group I intron-encoded proteins from the yeast mitochondrial genome have already been shown to have a specific DNA endonuclease activity. This activity mediates intron insertion by cleaving the DNA sequence corresponding to the splice junction of an intronless strain. We have discovered in mitochondrial extracts from the yeast strain 777-3A a new DNA endonuclease activity which cleaves the fused exon A3-exon A4 junction sequence of the CO XI gene.

  5. Microfluidic devices for DNA sequencing: sample preparation and electrophoretic analysis.

    PubMed

    Paegel, Brian M; Blazej, Robert G; Mathies, Richard A

    2003-02-01

    Modern DNA sequencing 'factories' have revolutionized biology by completing the human genome sequence, but in the race to completion we are left with inefficient, cumbersome, and costly macroscale processes and supporting facilities. During the same period, microfabricated DNA sequencing, sample processing and analysis devices have advanced rapidly toward the goal of a 'sequencing lab-on-a-chip'. Integrated microfluidic processing dramatically reduces analysis time and reagent consumption, and eliminates costly and unreliable macroscale robotics and laboratory apparatus. A microfabricated device for high-throughput DNA sequencing that couples clone isolation, template amplification, Sanger extension, purification, and electrophoretic analysis in a single microfluidic circuit is now attainable.

  6. DNA Sequence Determination by Hybridization: A Strategy for Efficient Large-Scale Sequencing

    NASA Astrophysics Data System (ADS)

    Drmanac, R.; Drmanac, S.; Strezoska, Z.; Paunesku, T.; Labat, I.; Zeremski, M.; Snoddy, J.; Funkhouser, W. K.; Koop, B.; Hood, L.; Crkvenjakov, R.

    1993-06-01

    The concept of sequencing by hybridization (SBH) makes use of an array of all possible n-nucleotide oligomers (n-mers) to identify n-mers present in an unknown DNA sequence. Computational approaches can then be used to assemble the complete sequence. As a validation of this concept, the sequences of three DNA fragments, 343 base pairs in length, were determined with octamer oligonucleotides. Possible applications of SBH include physical mapping (ordering) of overlapping DNA clones, sequence checking, DNA fingerprinting comparisons of normal and disease-causing genes, and the identification of DNA fragments with particular sequence motifs in complementary DNA and genomic libraries. The SBH techniques may accelerate the mapping and sequencing phases of the human genome project.
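
    The computational step of SBH can be illustrated with a toy example: collect the set of n-mers present in a sequence (its "spectrum") and rebuild the sequence by chaining n-mers that overlap by n-1 bases. This is a minimal sketch under a strong simplifying assumption, namely that every (n-1)-base overlap is unique; it is not the assembly procedure used in the paper, and the function names are hypothetical.

```python
def spectrum(seq: str, k: int) -> set:
    """Set of all k-mers present in seq (what an ideal array would report)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reconstruct(kmers) -> str:
    """Rebuild a sequence from its k-mer set by following unique overlaps.

    Raises ValueError when the spectrum is ambiguous (repeated (k-1)-mers),
    which a real SBH assembler would have to resolve by other means.
    """
    kmers = set(kmers)
    by_prefix = {km[:-1]: km for km in kmers}
    suffixes = {km[1:] for km in kmers}
    starts = [km for km in kmers if km[:-1] not in suffixes]
    if len(starts) != 1 or len(by_prefix) != len(kmers):
        raise ValueError("ambiguous spectrum")
    current = starts[0]
    seq, used = current, {current}
    while current[1:] in by_prefix:
        current = by_prefix[current[1:]]
        if current in used:
            raise ValueError("cycle in spectrum")
        used.add(current)
        seq += current[-1]
    if len(used) != len(kmers):
        raise ValueError("not all k-mers were used")
    return seq


if __name__ == "__main__":
    target = "ATGGCGTGCA"
    print(reconstruct(spectrum(target, 4)))
```

    With shorter probes the same target becomes ambiguous (for k = 3 the overlaps TG and GC each occur twice), which illustrates why probe length limits the fragment length that can be read unambiguously.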

  7. DNA sequence determination by hybridization: A strategy for efficient large-scale sequencing

    SciTech Connect

    Drmanac, R.; Drmanac, S.; Strezoska, Z.; Paunesku, T.; Labat, I.; Zeremski, M.; Snoddy, J.; Crkvenjakov, R.; Funkhouser, W.K.; Koop, B.; Hood, L.

    1993-06-11

    The concept of sequencing by hybridization (SBH) makes use of an array of all possible n-nucleotide oligomers (n-mers) to identify n-mers present in an unknown DNA sequence. Computational approaches can then be used to assemble the complete sequence. As a validation of this concept, the sequences of three DNA fragments, 343 base pairs in length, were determined with octamer oligonucleotides. Possible applications of SBH include physical mapping (ordering) of overlapping DNA clones, sequence checking, DNA fingerprinting comparisons of normal and disease-causing genes, and the identification of DNA fragments with particular sequence motifs in complementary DNA and genomic libraries. The SBH techniques may accelerate the mapping and sequencing phases of the human genome project. 22 refs., 3 figs.

  8. From DNA sequence to transcriptional behaviour: a quantitative approach.

    PubMed

    Segal, Eran; Widom, Jonathan

    2009-07-01

    Complex transcriptional behaviours are encoded in the DNA sequences of gene regulatory regions. Advances in our understanding of these behaviours have been recently gained through quantitative models that describe how molecules such as transcription factors and nucleosomes interact with genomic sequences. An emerging view is that every regulatory sequence is associated with a unique binding affinity landscape for each molecule and, consequently, with a unique set of molecule-binding configurations and transcriptional outputs. We present a quantitative framework based on existing methods that unifies these ideas. This framework explains many experimental observations regarding the binding patterns of factors and nucleosomes and the dynamics of transcriptional activation. It can also be used to model more complex phenomena such as transcriptional noise and the evolution of transcriptional regulation.

  9. Recent patents of nanopore DNA sequencing technology: progress and challenges.

    PubMed

    Zhou, Jianfeng; Xu, Bingqian

    2010-11-01

    DNA sequencing techniques have developed rapidly in recent decades, primarily driven by the Human Genome Project. Among the newly proposed techniques, the nanopore has been considered a suitable candidate for single-molecule DNA sequencing at ultrahigh speed and very low cost. Several fabrication and modification techniques have been developed to produce robust and well-defined nanopore devices. Considerable effort has also been devoted to applying nanopores to the analysis of DNA molecules. Compared with traditional sequencing techniques, the nanopore has demonstrated distinct advantages in the main practical respects, such as sample preparation, sequencing speed, cost-effectiveness, and read length. Although challenges remain, recent research on improving the capabilities of nanopores has brought the technique closer to its ultimate goal: sequencing an individual DNA strand at the single-nucleotide level. This patent review briefly highlights recent developments and technological achievements in DNA analysis and sequencing at the single-molecule level, focusing on nanopore-based methods.

  10. Highly conserved repetitive DNA sequences are present at human centromeres.

    PubMed Central

    Grady, D L; Ratliff, R L; Robinson, D L; McCanlies, E C; Meyne, J; Moyzis, R K

    1992-01-01

    Highly conserved repetitive DNA sequence clones, largely consisting of (GGAAT)n repeats, have been isolated from a human recombinant repetitive DNA library by high-stringency hybridization with rodent repetitive DNA. This sequence, the predominant repetitive sequence in human satellites II and III, is similar to the essential core DNA of the Saccharomyces cerevisiae centromere, centromere DNA element (CDE) III. In situ hybridization to human telophase and Drosophila polytene chromosomes shows localization of the (GGAAT)n sequence to centromeric regions. Hyperchromicity studies indicate that the (GGAAT)n sequence exhibits unusual hydrogen bonding properties. The purine-rich strand alone has the same thermal stability as the duplex. Hyperchromicity studies of synthetic DNA variants indicate that all sequences with the composition (AATGN)n exhibit this unusual thermal stability. DNA-mobility-shift assays indicate that specific HeLa-cell nuclear proteins recognize this sequence with a relative affinity greater than 10^5. The extreme evolutionary conservation of this DNA sequence, its centromeric location, its unusual hydrogen bonding properties, its high affinity for specific nuclear proteins, and its similarity to functional centromeres isolated from yeast suggest that this sequence may be a component of the functional human centromere. Images PMID:1542662

  11. Characterization of the DNA-binding activity of GCR1: in vivo evidence for two GCR1-binding sites in the upstream activating sequence of TPI of Saccharomyces cerevisiae.

    PubMed Central

    Huie, M A; Scott, E W; Drazinic, C M; Lopez, M C; Hornstra, I K; Yang, T P; Baker, H V

    1992-01-01

    GCR1 gene function is required for high-level glycolytic gene expression in Saccharomyces cerevisiae. Recently, we suggested that the CTTCC sequence motif found in front of many genes encoding glycolytic enzymes lay at the core of the GCR1-binding site. Here we mapped the DNA-binding domain of GCR1 to the carboxy-terminal 154 amino acids of the polypeptide. DNase I protection studies showed that a hybrid MBP-GCR1 fusion protein protected a region of the upstream activating sequence of TPI (UASTPI), which harbored the CTTCC sequence motif, and suggested that the fusion protein might also interact with a region of the UAS that contained the related sequence CATCC. A series of in vivo G methylation protection experiments of the native TPI promoter were carried out with wild-type and gcr1 deletion mutant strains. The G doublets that correspond to the C doublets in each site were protected in the wild-type strain but not in the gcr1 mutant strain. These data demonstrate that the UAS of TPI contains two GCR1-binding sites which are occupied in vivo. Furthermore, adjacent RAP1/GRF1/TUF- and REB1/GRF2/QBP/Y-binding sites in UASTPI were occupied in the backgrounds of both strains. In addition, DNA band-shift assays were used to show that the MBP-GCR1 fusion protein was able to form nucleoprotein complexes with oligonucleotides that contained CTTCC sequence elements found in front of other glycolytic genes, namely, PGK, ENO1, PYK, and ADH1, all of which are dependent on GCR1 gene function for full expression. However, we were unable to detect specific interactions with CTTCC sequence elements found in front of the translational component genes TEF1, TEF2, and CRY1. Taken together, these experiments have allowed us to propose a consensus GCR1-binding site which is 5'-(T/A)N(T/C)N(G/A)NC(T/A)TCC(T/A)N(T/A)(T/A)(T/G)-3'. Images PMID:1588965
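
    The consensus site quoted above can be turned into a regular expression to scan a sequence for candidate sites. The sketch below is only an illustration of that conversion: the function names are hypothetical, only the given strand is scanned, and a match says nothing about whether a site is bound in vivo.

```python
import re

# Consensus GCR1-binding site as written in the abstract (5' to 3').
CONSENSUS = "(T/A)N(T/C)N(G/A)NC(T/A)TCC(T/A)N(T/A)(T/A)(T/G)"


def consensus_to_regex(consensus: str) -> str:
    """Convert '(X/Y)' alternatives to '[XY]' and 'N' to '[ACGT]'."""
    pattern = re.sub(r"\(([ACGT])/([ACGT])\)", r"[\1\2]", consensus)
    return pattern.replace("N", "[ACGT]")


def find_sites(seq: str, consensus: str = CONSENSUS) -> list:
    """Return (start, matched_site) for every match, overlapping ones included."""
    pattern = consensus_to_regex(consensus)
    # A lookahead with a capture group reports overlapping matches as well.
    return [(m.start(), m.group(1))
            for m in re.finditer(f"(?=({pattern}))", seq.upper())]


if __name__ == "__main__":
    print(consensus_to_regex(CONSENSUS))
    print(find_sites("GGTATAGACTTCCTATTGCC"))
```

    Note that the core of the pattern, C(T/A)TCC, covers both the CTTCC and the related CATCC motifs discussed in the abstract.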

  12. Advances in high throughput DNA sequence data compression.

    PubMed

    Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz

    2016-06-01

    Advances in high throughput sequencing technologies and reduction in cost of sequencing have led to exponential growth in high throughput DNA sequence data. This growth has posed challenges such as storage, retrieval, and transmission of sequencing data. Data compression is used to cope with these challenges. Various methods have been developed to compress genomic and sequencing data. In this article, we present a comprehensive review of compression methods for genome and reads compression. Algorithms are categorized as referential or reference free. Experimental results and comparative analysis of various methods for data compression are presented. Finally, key challenges and research directions in DNA sequence data compression are highlighted.
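
    As a baseline for what reference-free compression has to improve on, the four bases can be stored in two bits each instead of one byte per character. The sketch below shows only this naive packing; it is not one of the algorithms surveyed in the review, it assumes an alphabet restricted to A, C, G, and T (no N or quality scores), and the function names are hypothetical.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str):
    """Pack a DNA string into bytes, four bases per byte.

    Returns (data, length); the length is needed to drop padding on unpack.
    """
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out), len(seq)


def unpack(data: bytes, length: int) -> str:
    """Inverse of pack: expand each byte into four bases, then trim padding."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


if __name__ == "__main__":
    data, n = pack("ACGTAC")
    print(len(data), unpack(data, n))
```

    This gives a fixed fourfold reduction over 8-bit text; dedicated genomic compressors aim to do better by exploiting repeats or a reference sequence.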

  13. Nanopores: A journey towards DNA sequencing

    PubMed Central

    Wanunu, Meni

    2013-01-01

    More than ever, nucleic acids are recognized as key building blocks in many of life's processes, and the science of studying these molecular wonders at the single-molecule level is thriving. A new method of doing so was introduced in the mid-1990s. This method is exceedingly simple: a nanoscale pore that spans an impermeable thin membrane is placed between two chambers that contain an electrolyte, and voltage is applied across the membrane using two electrodes. These conditions lead to a steady stream of ion flow across the pore. Nucleic acid molecules in solution can be driven through the pore, and structural features of the biomolecules are observed as measurable changes in the trans-membrane ion current. In essence, a nanopore is a high-throughput ion microscope and a single-molecule force apparatus. Nanopores are taking center stage as a tool that promises to read a DNA sequence, and this promise has resulted in overwhelming academic, industrial, and national interest. Regardless of the fate of future nanopore applications, in the process of this 16-year-long exploration, many studies have validated the indispensability of nanopores in the toolkit of single-molecule biophysics. This review surveys past and current studies related to nucleic acid biophysics, and will hopefully provoke a discussion of immediate and future prospects for the field. PMID:22658507

  14. Multigenome DNA sequence conservation identifies Hox cis-regulatory elements

    PubMed Central

    Kuntz, Steven G.; Schwarz, Erich M.; DeModena, John A.; De Buysscher, Tristan; Trout, Diane; Shizuya, Hiroaki; Sternberg, Paul W.; Wold, Barbara J.

    2008-01-01

    To learn how well ungapped sequence comparisons of multiple species can predict cis-regulatory elements in Caenorhabditis elegans, we made such predictions across the large, complex ceh-13/lin-39 locus and tested them transgenically. We also examined how prediction quality varied with different genomes and parameters in our comparisons. Specifically, we sequenced ∼0.5% of the C. brenneri and C. sp. 3 PS1010 genomes, and compared five Caenorhabditis genomes (C. elegans, C. briggsae, C. brenneri, C. remanei, and C. sp. 3 PS1010) to find regulatory elements in 22.8 kb of noncoding sequence from the ceh-13/lin-39 Hox subcluster. We developed the MUSSA program to find ungapped DNA sequences with N-way transitive conservation, applied it to the ceh-13/lin-39 locus, and transgenically assayed 21 regions with both high and low degrees of conservation. This identified 10 functional regulatory elements whose activities matched known ceh-13/lin-39 expression, with 100% specificity and a 77% recovery rate. One element was so well conserved that a similar mouse Hox cluster sequence recapitulated the native nematode expression pattern when tested in worms. Our findings suggest that ungapped sequence comparisons can predict regulatory elements genome-wide. PMID:18981268

  15. Preparing DNA libraries for multiplexed paired-end deep sequencing for Illumina GA sequencers.

    PubMed

    Son, Mike S; Taylor, Ronald K

    2011-02-01

    Whole-genome sequencing, also known as deep sequencing, is becoming a more affordable and efficient way to identify SNP mutations, deletions, and insertions in DNA sequences across several different strains. Two major obstacles preventing the widespread use of deep sequencers are the costs involved in services used to prepare DNA libraries for sequencing and the overall accuracy of the sequencing data. This unit describes the preparation of DNA libraries for multiplexed paired-end sequencing using the Illumina GA series sequencer. Self-preparation of DNA libraries can help reduce overall expenses, especially if optimization is required for the different samples, and use of the Illumina GA Sequencer can improve the quality of the data.

  16. Sequence Recognition in the Pairing of DNA Duplexes

    NASA Astrophysics Data System (ADS)

    Kornyshev, A. A.; Leikin, S.

    2001-04-01

    Pairing of DNA fragments with homologous sequences occurs in gene shuffling, DNA repair, and other vital processes. While the chemical individuality of base pairs is hidden inside the double helix, X-ray and NMR studies have revealed sequence-dependent modulation of the structure of the DNA backbone. Here we show that the resulting modulation of the DNA surface charge pattern enables duplexes longer than ~50 base pairs to recognize sequence homology electrostatically at a distance of up to several water layers. This may explain the local recognition observed in pairing of homologous chromosomes and the observed length dependence of homologous recombination.

  17. Laser Desorption Mass Spectrometry for DNA Sequencing and Analysis

    NASA Astrophysics Data System (ADS)

    Chen, C. H. Winston; Taranenko, N. I.; Golovlev, V. V.; Isola, N. R.; Allman, S. L.

    1998-03-01

    Rapid DNA sequencing and/or analysis is critically important for biomedical research. In the past, gel electrophoresis has been the primary tool for DNA analysis and sequencing. However, gel electrophoresis is a time-consuming and labor-intensive process. Recently, we have developed and used laser desorption mass spectrometry (LDMS) to achieve sequencing of ss-DNA longer than 100 nucleotides. With LDMS, we succeeded in sequencing DNA in seconds instead of the hours or days required by gel electrophoresis. In addition to sequencing, we also applied LDMS to the detection of DNA probes for hybridization. LDMS was also used to detect short tandem repeats for forensic applications. Clinical applications for the diagnosis of diseases such as cystic fibrosis, caused by base deletion and point mutation, have also been demonstrated. Experimental details will be presented at the meeting.

  18. Scanning probe and nanopore DNA sequencing: core techniques and possibilities.

    PubMed

    Lund, John; Parviz, Babak A

    2009-01-01

    We provide an overview of the current state of research towards DNA sequencing using nanopore and scanning probe techniques. Additionally, we provide methods for the creation of two key experimental platforms for studies relating to nanopore and scanning probe DNA studies: a synthetic nanopore apparatus and an atomically flat conductive substrate with stretched DNA molecules.

  19. cDNA cloning and sequencing of tarantula hemocyanin subunits.

    PubMed

    Voit, R; Feldmaier-Fuchs, G

    1990-01-01

    Tarantula heart cDNA libraries were screened with synthetic oligonucleotide probes deduced from the highly conserved amino acid sequences of the two copper-binding sites, copper A and copper B, found in chelicerate hemocyanins. Positive cDNA clones could be obtained and four different cDNA types were characterized.

  20. Food Fish Identification from DNA Extraction through Sequence Analysis

    ERIC Educational Resources Information Center

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed third- and fourth-year undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  1. Characteristics of cloned repeated DNA sequences in the barley genome

    SciTech Connect

    Anan'ev, E.V.; Bochkanov, S.S.; Ryzhik, M.V.; Sonina, N.V.; Chernyshev, A.I.; Shchipkova, N.I.; Yakovleva, E.Yu.

    1986-12-01

    A partial clone library of barley DNA fragments based on plasmid pBR325 was created. The cloned EcoRI-fragments of chromosomal DNA are from 2 to 14 kbp in length. More than 95% of the barley DNA inserts comprise repeated sequences of different complexity and copy number. Certain of these DNA sequences are from families comprising at least 1% of the barley genome. A significant proportion of the clones hybridize with numerous sets of restriction fragments of genome DNA and they are dispersed throughout the barley chromosomes.

  2. Affordable Hands-On DNA Sequencing and Genotyping: An Exercise for Teaching DNA Analysis to Undergraduates

    ERIC Educational Resources Information Center

    Shah, Kushani; Thomas, Shelby; Stein, Arnold

    2013-01-01

    In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C…

  3. DNA polymerases drive DNA sequencing-by-synthesis technologies: both past and present.

    PubMed

    Chen, Cheng-Yao

    2014-01-01

    Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. Escherichia coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger's dideoxy chain-terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific genome sequence information accessible via public databases. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today's standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides, respectively. Third-generation, real-time single-molecule sequencing platforms require slightly different enzyme properties. Enterobacterial phage ϕ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, the ϕ29 enzyme has also been utilized in emerging DNA sequencing technologies, including nanopore- and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies.

  4. DNA polymerase having modified nucleotide binding site for DNA sequencing

    DOEpatents

    Tabor, S.; Richardson, C.

    1997-03-25

    A modified gene encoding a modified DNA polymerase is disclosed. Relative to the corresponding deoxynucleotides, the modified polymerase incorporates dideoxynucleotides at least 20-fold better than the corresponding naturally occurring DNA polymerase does. 6 figs.

  5. DNA polymerase having modified nucleotide binding site for DNA sequencing

    DOEpatents

    Tabor, Stanley; Richardson, Charles

    1997-01-01

    A modified gene encoding a modified DNA polymerase, wherein the modified polymerase, relative to the corresponding deoxynucleotides, incorporates dideoxynucleotides at least 20-fold better than the corresponding naturally occurring DNA polymerase does.

  6. Neandertal DNA sequences and the origin of modern humans.

    PubMed

    Krings, M; Stone, A; Schmitz, R W; Krainitzki, H; Stoneking, M; Pääbo, S

    1997-07-11

    DNA was extracted from the Neandertal-type specimen found in 1856 in western Germany. By sequencing clones from short overlapping PCR products, a hitherto unknown mitochondrial (mt) DNA sequence was determined. Multiple controls indicate that this sequence is endogenous to the fossil. Sequence comparisons with human mtDNA sequences, as well as phylogenetic analyses, show that the Neandertal sequence falls outside the variation of modern humans. Furthermore, the age of the common ancestor of the Neandertal and modern human mtDNAs is estimated to be four times greater than that of the common ancestor of human mtDNAs. This suggests that Neandertals went extinct without contributing mtDNA to modern humans.

  7. Application of 2-D graphical representation of DNA sequence

    NASA Astrophysics Data System (ADS)

    Liao, Bo; Tan, Mingshu; Ding, Kequan

    2005-10-01

    Recently, we proposed a 2-D graphical representation of DNA sequence [Bo Liao, A 2-D graphical representation of DNA sequence, Chem. Phys. Lett. 401 (2005) 196-199]. Based on this representation, we consider properties of mutations and compute the similarities among 11 mitochondrial sequences belonging to different species. The elements of the similarity matrix are used to construct a phylogenetic tree. Unlike most existing phylogeny construction methods, the proposed method does not require multiple alignment.

  8. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    NASA Astrophysics Data System (ADS)

    Chechetkin, V. R.; Lobzin, V. V.

    2004-07-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It may efficiently be applied to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for the local and non-local structural segmentation in DNA sequences. The results are illustrated with the model examples and analysis of intervening exon-intron segments in the protein-coding regions.

  9. Evolution of a complex minisatellite DNA sequence.

    PubMed

    Barros, Paula; Blanco, Miguel G; Boán, Francisco; Gómez-Márquez, Jaime

    2008-11-01

    Minisatellites are tandem repeats of short DNA units widely distributed in genomes. However, the information on their dynamics in a phylogenetic context is very limited. Here we have studied the organization of the MsH43 locus in several species of primates and from these data we have reconstructed the evolutionary history of this complex minisatellite. Overall, with the exception of gibbon, MsH43 has an organization that is asymmetric, since the distribution of repeats is distinct between the 5' and 3' halves, and heterogeneous since there are many different repeats, some of them characteristic of each species. Inspection of the MsH43 arrays showed the existence of many duplications and deletions, suggesting the implication of slippage processes in the generation of polymorphism. Concerning the evolutionary history of this minisatellite, we propose that the birth of MsH43 may be situated before the divergence of Old World Monkeys since we found the existence of some MsH43 repeat motifs in prosimians and New World Monkeys. The analysis of MsH43 in apes revealed the existence of an evolutionary breakpoint in the pathway that originated African great apes and humans. Remarkably, human MsH43 is more homologous to orang-utan than to the corresponding sequence in gorilla and chimpanzee. This finding does not comply with the evolutionary paradigm that continuous alterations occur during the course of genome evolution. To adjust our results to the standard phylogeny of primates, we propose the existence of a wandering allele that was maintained almost unaltered during the period that extends between orang-utan and humans.

  10. Advanced microinstrumentation for rapid DNA sequencing and large DNA fragment separation

    SciTech Connect

    Balch, J.; Davidson, J.; Brewer, L.; Gingrich, J.; Koo, J.; Mariella, R.; Carrano, A.

    1995-01-25

    Our efforts to develop novel technology for a rapid DNA sequencer and large fragment analysis system based upon gel electrophoresis are described. We are using microfabrication technology to build dense arrays of high speed micro electrophoresis lanes that will ultimately increase the sequencing rate of DNA by at least 100 times the rate of current sequencers. We have demonstrated high resolution DNA fragment separation needed for sequencing in polyacrylamide microgels formed in glass microchannels. We have built prototype arrays of microchannels having up to 48 channels. Significant progress has also been made in developing a sensitive fluorescence detection system based upon a confocal microscope design that will enable the diagnostics and detection of DNA fragments in ultrathin microchannel gels. Development of a rapid DNA sequencer and fragment analysis system will have a major impact on future DNA instrumentation used in clinical, molecular and forensic analysis of DNA fragments.

  11. Simulations Using Random-Generated DNA and RNA Sequences

    ERIC Educational Resources Information Center

    Bryce, C. F. A.

    1977-01-01

    Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…
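
    The exercises described above can be sketched in a few lines of modern code. The original program was written in BASIC and is not reproduced in the abstract; the Python below is an illustrative stand-in, and all function names are hypothetical.

```python
import random
from collections import Counter

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def random_sequence(length, alphabet="ACGT", seed=None):
    """Draw `length` symbols uniformly at random; use alphabet='ACGU' for RNA."""
    rng = random.Random(seed)
    return "".join(rng.choice(alphabet) for _ in range(length))


def complement(seq):
    """Base-by-base complement of a DNA strand (not reversed)."""
    return "".join(DNA_COMPLEMENT[base] for base in seq)


def base_composition(seq):
    """Fraction of each base present in the sequence."""
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in sorted(counts)}


def codon_frequencies(seq):
    """Count non-overlapping triplets read from the first position."""
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))
```

    For example, `codon_frequencies(random_sequence(300, seed=1))` tallies the 100 triplets of one random sequence; passing a fixed `seed` makes a classroom exercise reproducible.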

  12. Multiplexed Sequence Encoding: A Framework for DNA Communication

    PubMed Central

    Zakeri, Bijan; Carr, Peter A.; Lu, Timothy K.

    2016-01-01

    Synthetic DNA has great propensity for efficiently and stably storing non-biological information. With DNA writing and reading technologies rapidly advancing, new applications for synthetic DNA are emerging in data storage and communication. Traditionally, DNA communication has focused on the encoding and transfer of complete sets of information. Here, we explore the use of DNA for the communication of short messages that are fragmented across multiple distinct DNA molecules. We identified three pivotal points in a communication—data encoding, data transfer & data extraction—and developed novel tools to enable communication via molecules of DNA. To address data encoding, we designed DNA-based individualized keyboards (iKeys) to convert plaintext into DNA, while reducing the occurrence of DNA homopolymers to improve synthesis and sequencing processes. To address data transfer, we implemented a secret-sharing system—Multiplexed Sequence Encoding (MuSE)—that conceals messages between multiple distinct DNA molecules, requiring a combination key to reveal messages. To address data extraction, we achieved the first instance of chromatogram patterning through multiplexed sequencing, thereby enabling a new method for data extraction. We envision these approaches will enable more widespread communication of information via DNA. PMID:27050646

  13. [DNA analysis for the post genome-sequencing era].

    PubMed

    Kambara, Hideki

    2002-05-01

    With the completion of human genome sequencing, the post-genome-sequencing era has begun. The major goals are to clarify the function of genes and to apply this information in medicine and in various industrial fields. Various DNA analysis methods and instruments for gene expression profiling and for analyzing genetic diversity, including SNP typing, are required and have been developed. Here, the history and technologies related to DNA analysis, including the Wada project in the early 1980s and the Human Genome Project from 1990, are described. Various new technologies have been developed in this decade. They include a capillary gel array DNA sequencer, DNA chips, bead probe arrays, a new DNA sequencing method using pyrosequencing and an efficient SNP typing method by BAMPER.

  14. Haplogrouping mitochondrial DNA sequences in Legal Medicine/Forensic Genetics.

    PubMed

    Bandelt, Hans-Jürgen; van Oven, Mannis; Salas, Antonio

    2012-11-01

    Haplogrouping refers to the classification of (partial) mitochondrial DNA (mtDNA) sequences into haplogroups using the current knowledge of the worldwide mtDNA phylogeny. Haplogroup assignment of mtDNA control-region sequences assists in the focused comparison with closely related complete mtDNA sequences and thus serves two main goals in forensic genetics: first is the a posteriori quality analysis of sequencing results and second is the prediction of relevant coding-region sites for confirmation or further refinement of haplogroup status. The latter may be important in forensic casework where discrimination power needs to be as high as possible. However, most articles published in forensic genetics perform haplogrouping only in a rudimentary or incorrect way. The present study features PhyloTree as the key tool for assigning control-region sequences to haplogroups and elaborates on additional Web-based searches for finding near-matches with complete mtDNA genomes in the databases. In contrast, none of the automated haplogrouping tools available can yet compete with manual haplogrouping using PhyloTree plus additional Web-based searches, especially when confronted with artificial recombinants still present in forensic mtDNA datasets. We review and classify the various attempts at haplogrouping by using a multiplex approach or relying on automated haplogrouping. Furthermore, we re-examine a few articles in forensic journals providing mtDNA population data where appropriate haplogrouping following PhyloTree immediately highlights several kinds of sequence errors.

  15. A mathematical model and numerical method for thermoelectric DNA sequencing

    NASA Astrophysics Data System (ADS)

    Shi, Liwei; Guilbeau, Eric J.; Nestorova, Gergana; Dai, Weizhong

    2014-05-01

    Single nucleotide polymorphisms (SNPs) are single base pair variations within the genome that are important indicators of genetic predisposition towards specific diseases. This study explores the feasibility of SNP detection using a thermoelectric sequencing method that measures the heat released when DNA polymerase inserts a deoxyribonucleoside triphosphate into a DNA strand. We propose a three-dimensional mathematical model that governs the DNA sequencing device with a reaction zone that contains DNA template/primer complex immobilized to the surface of the lower channel wall. The model is then solved numerically. Concentrations of reactants and the temperature distribution are obtained. Results indicate that when the nucleoside is complementary to the next base in the DNA template, polymerization occurs lengthening the complementary polymer and releasing thermal energy with a measurable temperature change, implying that the thermoelectric conceptual device for sequencing DNA may be feasible for identifying specific genes in individuals.

  16. DNA Shape Dominates Sequence Affinity in Nucleosome Formation

    NASA Astrophysics Data System (ADS)

    Freeman, Gordon S.; Lequieu, Joshua P.; Hinckley, Daniel M.; Whitmer, Jonathan K.; de Pablo, Juan J.

    2014-10-01

    Nucleosomes provide the basic unit of compaction in eukaryotic genomes, and the mechanisms that dictate their position at specific locations along a DNA sequence are of central importance to genetics. In this Letter, we employ molecular models of DNA and proteins to elucidate various aspects of nucleosome positioning. In particular, we show how DNA's histone affinity is encoded in its sequence-dependent shape, including subtle deviations from the ideal straight B-DNA form and local variations of minor groove width. By relying on high-precision simulations of the free energy of nucleosome complexes, we also demonstrate that, depending on DNA's intrinsic curvature, histone binding can be dominated by bending interactions or electrostatic interactions. More generally, the results presented here explain how sequence, manifested as the shape of the DNA molecule, dominates molecular recognition in the problem of nucleosome positioning.

  17. Design of Sequence-Specific DNA Binding Molecules for DNA Methyltransferase Inhibition

    PubMed Central

    2015-01-01

    The CpG dyad, an important genomic feature in DNA methylation and transcriptional regulation, is an attractive target for small molecules. To assess the utility of minor groove binding oligomers for CpG recognition, we screened a small library of hairpin pyrrole-imidazole polyamides targeting the sequence 5′-CGCG-3′ and assessed their sequence specificity using an unbiased next-generation sequencing assay. Our findings indicate that hairpin polyamide of sequence PyImβIm-γ-PyImβIm (1), previously identified as a high affinity 5′-CGCG-3′ binder, favors 5′-GCGC-3′ in an unanticipated reverse binding orientation. Replacement of one β alanine with Py to afford PyImPyIm-γ-PyImβIm (3) restores the preference for 5′-CGCG-3′ binding in a forward orientation. The minor groove binding hairpin 3 inhibits DNA methyltransferase activity in the major groove at its target site more effectively than 1, providing a molecular basis for design of sequence-specific antagonists of CpG methylation. PMID:24502234

  18. An Evolution Based Biosensor Receptor DNA Sequence Generation Algorithm

    PubMed Central

    Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M.; Lee, Jaewan; Zang, Yupeng

    2010-01-01

    A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements. PMID:22315543

  19. Laser desorption mass spectrometry for DNA analysis and sequencing

    SciTech Connect

    Chen, C.H.; Taranenko, N.I.; Tang, K.; Allman, S.L.

    1995-03-01

    Laser desorption mass spectrometry has been considered as a potential new method for fast DNA sequencing. Our approach is to use matrix-assisted laser desorption to produce parent ions of DNA segments and a time-of-flight mass spectrometer to identify the sizes of DNA segments. Thus, the approach is similar to gel electrophoresis sequencing using Sanger's enzymatic method. However, gel, radioactive tagging, and dye labeling are not required. In addition, the sequencing process can possibly be finished within a few hundred microseconds instead of hours and days. In order to use mass spectrometry for fast DNA sequencing, the following three criteria need to be satisfied: (1) detection of large DNA segments, (2) sensitivity reaching the femtomole region, and (3) mass resolution good enough to separate DNA segments that differ by a single nucleotide. Until now, it has been very difficult to detect large DNA segments by mass spectrometry because of the fragile chemical properties of DNA and the low detection sensitivity of DNA ions. We discovered several new matrices to increase the production of DNA ions. By innovative design of a mass spectrometer, we can increase the ion energy up to 45 keV to enhance the detection sensitivity. Recently, we succeeded in detecting a DNA segment with 500 nucleotides. The sensitivity was 100 femtomoles. Thus, we have fulfilled two key criteria for using mass spectrometry for fast DNA sequencing. The major effort in the near future is to improve the resolution. Different approaches are being pursued. When high mass resolution can be achieved and automated sample preparation is developed, a sequencing speed of 500 megabases per year should be feasible.

  20. An Optimal Seed Based Compression Algorithm for DNA Sequences

    PubMed Central

    Gopalakrishnan, Gopakumar; Karunakaran, Muralikrishnan

    2016-01-01

    This paper proposes a seed-based lossless compression algorithm for DNA sequences that uses a substitution method similar to the Lempel-Ziv compression scheme. The proposed method exploits the repetition structures that are inherent in DNA sequences by creating an offline dictionary which contains all such repeats along with the details of mismatches. By ensuring that only promising mismatches are allowed, the method achieves a compression ratio that is on par with or better than that of the existing lossless DNA sequence compression algorithms. PMID:27555868
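
    To make the general idea of seed-based substitution with mismatches concrete, here is a toy sketch. It is not the authors' algorithm: the greedy strategy, the seed length `k`, the `max_mismatch` limit, the token layout, and the function names are all assumptions chosen for illustration, and the sketch makes no attempt to select only "promising" mismatches or to build an offline dictionary.

```python
def compress(seq, k=4, max_mismatch=2):
    """Greedy seed-and-extend. Tokens are either a literal character or a
    tuple (start, length, mismatches) that copies from earlier text, where
    `mismatches` maps offsets inside the copy to the substituted base."""
    index = {}            # k-mer -> earliest start position before the cursor
    tokens = []
    n = len(seq)
    i = 0
    next_to_index = 0
    while i < n:
        # Register every seed that starts before the current position.
        while next_to_index < i:
            if next_to_index + k <= n:
                index.setdefault(seq[next_to_index:next_to_index + k],
                                 next_to_index)
            next_to_index += 1
        start = index.get(seq[i:i + k]) if i + k <= n else None
        if start is None:
            tokens.append(seq[i])
            i += 1
            continue
        # Extend the seed match, tolerating a few substitutions.
        length, mismatches = k, {}
        while i + length < n:
            if seq[start + length] == seq[i + length]:
                length += 1
            elif len(mismatches) < max_mismatch:
                mismatches[length] = seq[i + length]
                length += 1
            else:
                break
        tokens.append((start, length, mismatches))
        i += length
    return tokens


def decompress(tokens):
    """Rebuild the sequence; copies may overlap the text being written."""
    out = []
    for token in tokens:
        if isinstance(token, str):
            out.append(token)
        else:
            start, length, mismatches = token
            for j in range(length):
                out.append(mismatches.get(j, out[start + j]))
    return "".join(out)
```

    The round trip `decompress(compress(s)) == s` holds for any string `s`, which is the lossless property the abstract refers to; an actual compressor would additionally encode the tokens compactly.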

  1. DNA Methyltransferase Accessibility Protocol for Individual Templates by Deep Sequencing

    PubMed Central

    Darst, Russell P.; Nabilsi, Nancy H.; Pardo, Carolina E.; Riva, Alberto; Kladde, Michael P.

    2013-01-01

    A single-molecule probe of chromatin structure can uncover dynamic chromatin states and rare epigenetic variants of biological importance that bulk measures of chromatin structure miss. In bisulfite genomic sequencing, each sequenced clone records the methylation status of multiple sites on an individual molecule of DNA. An exogenous DNA methyltransferase can thus be used to image nucleosomes and other protein–DNA complexes. In this chapter, we describe the adaptation of this technique, termed Methylation Accessibility Protocol for individual templates, to modern high-throughput sequencing, which both simplifies the workflow and extends its utility. PMID:22929770

  2. DNA sequence analysis with droplet-based microfluidics

    PubMed Central

    Abate, Adam R.; Hung, Tony; Sperling, Ralph A.; Mary, Pascaline; Rotem, Assaf; Agresti, Jeremy J.; Weiner, Michael A.; Weitz, David A.

    2014-01-01

    Droplet-based microfluidic techniques can form and process micrometer scale droplets at thousands per second. Each droplet can house an individual biochemical reaction, allowing millions of reactions to be performed in minutes with small amounts of total reagent. This versatile approach has been used for engineering enzymes, quantifying concentrations of DNA in solution, and screening protein crystallization conditions. Here, we use it to read the sequences of DNA molecules with a FRET-based assay. Using probes of different sequences, we interrogate a target DNA molecule for polymorphisms. With a larger probe set, additional polymorphisms can be interrogated as well as targets of arbitrary sequence. PMID:24185402

  3. Current-voltage characteristics of double-strand DNA sequences

    NASA Astrophysics Data System (ADS)

    Bezerril, L. M.; Moreira, D. A.; Albuquerque, E. L.; Fulco, U. L.; de Oliveira, E. L.; de Sousa, J. S.

    2009-09-01

    We use a tight-binding formulation to investigate the transmissivity and the current-voltage (I-V) characteristics of sequences of double-strand DNA molecules. In order to reveal the relevance of the underlying correlations in the nucleotide distribution, we compare the results for the genomic DNA sequence with those of artificial sequences (the long-range correlated Fibonacci and Rudin-Shapiro ones) and a random sequence, which is a kind of prototype of a short-range correlated system. The random sequence is presented here with the same first-neighbor pair correlations as the human DNA sequence. We found that the long-range character of the correlations is important to the transmissivity spectra, although the I-V curves seem to be mostly influenced by the short-range correlations.
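
    The artificial sequences mentioned above are usually built by repeatedly applying a substitution (inflation) rule. The abstract does not state the rules used, so the mappings below are assumptions: the standard two-letter Fibonacci and four-letter Rudin-Shapiro substitutions, written here with nucleotide letters. The helper name `inflate` is hypothetical.

```python
# Assumed inflation rules (standard substitutions relabelled with bases).
FIBONACCI_RULE = {"G": "GC", "C": "G"}
RUDIN_SHAPIRO_RULE = {"G": "GC", "C": "GA", "A": "TC", "T": "TA"}


def inflate(seed, rule, generations):
    """Apply a substitution rule to every symbol, `generations` times."""
    seq = seed
    for _ in range(generations):
        seq = "".join(rule[base] for base in seq)
    return seq


fibonacci = inflate("G", FIBONACCI_RULE, 4)          # "GCGGCGCG"
rudin_shapiro = inflate("G", RUDIN_SHAPIRO_RULE, 3)  # "GCGAGCTC"
```

    Fibonacci generations grow in length as 1, 2, 3, 5, 8, ..., while each Rudin-Shapiro generation doubles in length, so a chain of the desired size is obtained by choosing the number of generations and truncating.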

  4. DNA Sequencing by Hexagonal Boron Nitride Nanopore: A Computational Study

    PubMed Central

    Zhang, Liuyang; Wang, Xianqiao

    2016-01-01

    The single molecule detection associated with DNA sequencing has motivated intensive efforts to identify single DNA bases. However, little research has been reported utilizing single-layer hexagonal boron nitride (hBN) for DNA sequencing. Here we employ molecular dynamics simulations to explore pathways for single-strand DNA (ssDNA) sequencing by nanopore on the hBN sheet. We first investigate the adhesive strength between nucleobases and the hBN sheet, which provides the foundation for the hBN-base interaction and nanopore sequencing mechanism. Simulation results show that the purine base has a more remarkable energy profile and affinity than the pyrimidine base on the hBN sheet. The threading of ssDNA through the hBN nanopore can be clearly identified due to their different energy profiles and conformations with circular nanopores on the hBN sheet. The sequencing process is orientation dependent when the shape of the hBN nanopore deviates from the circle. Our results open up a promising avenue to explore the capability of DNA sequencing by hBN nanopore.

  5. Plasmonic Nanopores for Trapping, Controlling Displacement, and Sequencing of DNA

    PubMed Central

    2015-01-01

    With the aim of developing a DNA sequencing methodology, we theoretically examine the feasibility of using nanoplasmonics to control the translocation of a DNA molecule through a solid-state nanopore and to read off sequence information using surface-enhanced Raman spectroscopy. Using molecular dynamics simulations, we show that high-intensity optical hot spots produced by a metallic nanostructure can arrest DNA translocation through a solid-state nanopore, thus providing a physical knob for controlling the DNA speed. Switching the plasmonic field on and off can displace the DNA molecule in discrete steps, sequentially exposing neighboring fragments of a DNA molecule to the pore as well as to the plasmonic hot spot. Surface-enhanced Raman scattering from the exposed DNA fragments contains information about their nucleotide composition, possibly allowing the identification of the nucleotide sequence of a DNA molecule transported through the hot spot. The principles of plasmonic nanopore sequencing can be extended to detection of DNA modifications and RNA characterization. PMID:26401685

  6. DNA sequence compression using the burrows-wheeler transform.

    PubMed

    Adjeroh, Don; Zhang, Yong; Mukherjee, Amar; Powell, Matt; Bell, Tim

    2002-01-01

    We investigate off-line dictionary oriented approaches to DNA sequence compression, based on the Burrows-Wheeler Transform (BWT). The preponderance of short repeating patterns is an important phenomenon in biological sequences. Here, we propose off-line methods to compress DNA sequences that exploit the different repetition structures inherent in such sequences. Repetition analysis is performed based on the relationship between the BWT and important pattern matching data structures, such as the suffix tree and suffix array. We discuss how the proposed approach can be incorporated in the BWT compression pipeline.
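
    For readers unfamiliar with the transform, the relationship between the BWT and the suffix array can be sketched as follows. This is a naive textbook construction, not the authors' compression pipeline; the `$` sentinel (assumed to sort before every base) and the function names are choices made for this example.

```python
def suffix_array(text):
    """Start positions of all suffixes in sorted order (naive construction)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt(seq, sentinel="$"):
    """BWT read off the suffix array: the character preceding each suffix."""
    text = seq + sentinel
    return "".join(text[i - 1] for i in suffix_array(text))


def inverse_bwt(last):
    """Invert the transform, assuming a unique sentinel that sorts lowest."""
    # A stable sort of the last column links each first-column character
    # to its occurrence in the last column.
    order = sorted(range(len(last)), key=lambda i: last[i])
    row = order[0]                 # row whose rotation starts at text[0]
    out = []
    for _ in range(len(last) - 1):
        row = order[row]
        out.append(last[row])
    return "".join(out)
```

    For instance, `bwt("ACAACG")` gives `"GC$AAAC"`, where the three A's that precede similar contexts end up adjacent; this clustering of symbols is what later compression stages exploit.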

  7. DNA sequence selective adenine alkylation, mechanism of adduct repair, and in vivo antitumor activity of the novel achiral seco-amino-cyclopropylbenz[e]indolone analogue of duocarmycin AS-I-145.

    PubMed

    Kiakos, Konstantinos; Sato, Atsushi; Asao, Tetsuji; McHugh, Peter J; Lee, Moses; Hartley, John A

    2007-10-01

    AS-I-145 is a novel achiral seco-amino-cyclopropylbenz[e]indolone (seco-amino-CBI) analogue of duocarmycin that has evolved from an alternative strategy of designing CC-1065/duocarmycin agents lacking the characteristic chiral center of the natural agents. The sequence specificity of this compound was assessed by a Taq polymerase stop assay, identifying the sites of covalent modification on plasmid DNA. The adenine-N3 adducts were confirmed at AT-rich sequences using a thermally induced strand cleavage assay. These studies reveal that this compound retains the inherent sequence selectivity of the related natural compounds. The AS-I-145 sensitivity of yeast mutants deficient in excision and post-replication repair (PRR) pathways was assessed. The sensitivity profile suggests that the sequence-specific adenine-N3 adducts are substrates for nucleotide excision repair (NER) but not base excision repair (BER). Single-strand ligation PCR was employed to follow the induction and repair of the lesions at nucleotide resolution in yeast cells. Sequence specificity was preserved in intact cells, and adduct elimination occurred in a transcription-coupled manner and was dependent on a functional NER pathway and Rad18. The involvement of NER as the predominant excision pathway was confirmed in mammalian DNA repair mutant cells. AS-I-145 showed good in vivo antitumor activity in the National Cancer Institute standard hollow fiber assay and was active against the human breast MDA-MD-435 xenograft when administered i.v. or p.o. Its novel structure and in vivo activity renders AS-I-145 a new paradigm in the design of novel achiral analogues of CC-1065 and the duocarmycins.

  8. DNA sequencing using polymerase substrate-binding kinetics

    PubMed Central

    Previte, Michael John Robert; Zhou, Chunhong; Kellinger, Matthew; Pantoja, Rigo; Chen, Cheng-Yao; Shi, Jin; Wang, BeiBei; Kia, Amirali; Etchin, Sergey; Vieceli, John; Nikoomanzar, Ali; Bomati, Erin; Gloeckner, Christian; Ronaghi, Mostafa; He, Molly Min

    2015-01-01

    Next-generation sequencing (NGS) has transformed genomic research by decreasing the cost of sequencing. However, whole-genome sequencing is still costly and complex for diagnostics purposes. In the clinical space, targeted sequencing has the advantage of allowing researchers to focus on specific genes of interest. Routine clinical use of targeted NGS mandates inexpensive instruments, fast turnaround time and an integrated and robust workflow. Here we demonstrate a version of the Sequencing by Synthesis (SBS) chemistry that potentially can become a preferred targeted sequencing method in the clinical space. This sequencing chemistry uses natural nucleotides and is based on real-time recording of the differential polymerase/DNA-binding kinetics in the presence of correct or mismatch nucleotides. This ensemble SBS chemistry has been implemented on an existing Illumina sequencing platform with integrated cluster amplification. We discuss the advantages of this sequencing chemistry for targeted sequencing as well as its limitations for other applications. PMID:25612848

  9. Characterizing self-similarity in bacteria DNA sequences

    NASA Astrophysics Data System (ADS)

    Lu, Xin; Sun, Zhirong; Chen, Huimin; Li, Yanda

    1998-09-01

    In this paper some parametric methods are introduced to characterize the self-similarity of DNA sequences. Compared with Fourier analysis, these methods perform statistically more stably and yield more reliable results. Using these methods, eight whole genomes of bacteria provided by NCBI are analyzed. Long-range correlation properties in the nucleotide density distribution along these DNA sequences are explored. Estimation results show that the long-range correlation structure prevails through the entire molecule of DNA. Higher order statistics through coarse graining reveal that rather than multifractal, there are only monofractal phenomena presented in the sequences. Hence, the nucleotide density distribution can be modeled asymptotically as fractional Gaussian noise. This result points to a new direction for analyzing and understanding the intrinsic structures of DNA sequences.

  10. Microchannel DNA Sequencing by End-Labelled Free Solution Electrophoresis

    SciTech Connect

    Barron, A.

    2005-09-29

    The further development of End-Labeled Free-Solution Electrophoresis will greatly simplify DNA separation and sequencing on microfluidic devices. The development and optimization of drag-tags is critical to the success of this research.

  11. ATRF Houses the Latest DNA Sequencing Technologies | Poster

    Cancer.gov

    By Ashley DeVine, Staff Writer. By the end of October, the Advanced Technology Research Facility (ATRF) will be one of the few facilities in the world to house all of the latest DNA sequencing technologies.

  12. Distinct immune responses of recombinant plasmid DNA replicon vaccines expressing two types of antigens with or without signal sequences.

    PubMed

    Yu, Yun-Zhou; Li, Na; Wang, Wen-Bin; Wang, Shuang; Ma, Yao; Yu, Wei-Yuan; Sun, Zhi-Wei

    2010-11-03

    Here, DNA replicon vaccines encoding the Hc domain of botulinum neurotoxin serotype A (AHc) or the receptor binding domain of anthrax protective antigen (PA4) with or without signal sequences were evaluated in mice. Strong antibody and protective responses were elicited only from AHc DNA vaccines with an Ig κ signal sequence or tissue plasminogen activator signal sequence. Meanwhile, there were no differences in total antibody responses or isotypes, lymphocyte proliferative responses, cytokine profiles and protective immune responses with the PA4 DNA vaccines with or without a signal sequence. Therefore, use of targeting sequences in designing DNA replicon vaccines depends on the specific antigen.

  13. Sequence-Specific Molecular Lithography on Single DNA Molecules

    NASA Astrophysics Data System (ADS)

    Keren, Kinneret; Krueger, Michael; Gilad, Rachel; Ben-Yoseph, Gdalyahu; Sivan, Uri; Braun, Erez

    2002-07-01

    Recent advances in the realization of individual molecular-scale electronic devices emphasize the need for novel tools and concepts capable of assembling such devices into large-scale functional circuits. We demonstrated sequence-specific molecular lithography on substrate DNA molecules by harnessing homologous recombination by RecA protein. In a sequence-specific manner, we patterned the coating of DNA with metal, localized labeled molecular objects and grew metal islands on specific sites along the DNA substrate, and generated molecularly accurate stable DNA junctions for patterning the DNA substrate connectivity. In our molecular lithography, the information encoded in the DNA molecules replaces the masks used in conventional microelectronics, and the RecA protein serves as the resist. The molecular lithography works with high resolution over a broad range of length scales from nanometers to many micrometers.

  14. Nucleotide correlations and electronic transport of DNA sequences

    NASA Astrophysics Data System (ADS)

    Albuquerque, E. L.; Vasconcelos, M. S.; Lyra, M. L.; de Moura, F. A. B. F.

    2005-02-01

    We use a tight-binding formulation to investigate the transmissivity and wave-packet dynamics of sequences of single-strand DNA molecules made up from the nucleotides guanine (G), adenine (A), cytosine (C), and thymine (T). In order to reveal the relevance of the underlying correlations in the nucleotide distribution, we compare the results for the genomic DNA sequence with those of two artificial sequences: (i) the Rudin-Shapiro one, which has long-range correlations; (ii) a random sequence, which is a kind of prototype of a short-range correlated system, presented here with the same first-neighbor pair correlations as the human DNA sequence. We found that the long-range character of the correlations is important to the persistence of resonances of finite segments. On the other hand, the wave-packet dynamics seems to be mostly influenced by the short-range correlations.

  15. Real-Time DNA Sequencing in the Antarctic Dry Valleys Using the Oxford Nanopore Sequencer

    PubMed Central

    Johnson, Sarah S.; Zaikova, Elena; Goerlitz, David S.; Bai, Yu; Tighe, Scott W.

    2017-01-01

    The ability to sequence DNA outside of the laboratory setting has enabled novel research questions to be addressed in the field in diverse areas, ranging from environmental microbiology to viral epidemics. Here, we demonstrate the application of offline DNA sequencing of environmental samples using a hand-held nanopore sequencer in a remote field location: the McMurdo Dry Valleys, Antarctica. Sequencing was performed using a MK1B MinION sequencer from Oxford Nanopore Technologies (ONT; Oxford, United Kingdom) that was equipped with software to operate without internet connectivity. One-direction (1D) genomic libraries were prepared using portable field techniques on DNA isolated from desiccated microbial mats. By adequately insulating the sequencer and laptop, it was possible to run the sequencing protocol for up to 2½ h under arduous conditions. PMID:28337073

  16. Real-Time DNA Sequencing in the Antarctic Dry Valleys Using the Oxford Nanopore Sequencer.

    PubMed

    Johnson, Sarah S; Zaikova, Elena; Goerlitz, David S; Bai, Yu; Tighe, Scott W

    2017-04-01

    The ability to sequence DNA outside of the laboratory setting has enabled novel research questions to be addressed in the field in diverse areas, ranging from environmental microbiology to viral epidemics. Here, we demonstrate the application of offline DNA sequencing of environmental samples using a hand-held nanopore sequencer in a remote field location: the McMurdo Dry Valleys, Antarctica. Sequencing was performed using a MK1B MinION sequencer from Oxford Nanopore Technologies (ONT; Oxford, United Kingdom) that was equipped with software to operate without internet connectivity. One-direction (1D) genomic libraries were prepared using portable field techniques on DNA isolated from desiccated microbial mats. By adequately insulating the sequencer and laptop, it was possible to run the sequencing protocol for up to 2½ h under arduous conditions.

  17. Nuclear and mitochondrial DNA sequences from two Denisovan individuals

    PubMed Central

    Sawyer, Susanna; Renaud, Gabriel; Viola, Bence; Hublin, Jean-Jacques; Gansauge, Marie-Theres; Shunkov, Michael V.; Derevianko, Anatoly P.; Prüfer, Kay; Pääbo, Svante

    2015-01-01

    Denisovans, a sister group of Neandertals, have been described on the basis of a nuclear genome sequence from a finger phalanx (Denisova 3) found in Denisova Cave in the Altai Mountains. The only other Denisovan specimen described to date is a molar (Denisova 4) found at the same site. This tooth carries a mtDNA sequence similar to that of Denisova 3. Here we present nuclear DNA sequences from Denisova 4 and a morphological description, as well as mitochondrial and nuclear DNA sequence data, from another molar (Denisova 8) found in Denisova Cave in 2010. This new molar is similar to Denisova 4 in being very large and lacking traits typical of Neandertals and modern humans. Nuclear DNA sequences from the two molars form a clade with Denisova 3. The mtDNA of Denisova 8 is more diverged and has accumulated fewer substitutions than the mtDNAs of the other two specimens, suggesting Denisovans were present in the region over an extended period. The nuclear DNA sequence diversity among the three Denisovans is comparable to that among six Neandertals, but lower than that among present-day humans. PMID:26630009

  18. Effects of sequence on DNA wrapping around histones

    NASA Astrophysics Data System (ADS)

    Ortiz, Vanessa

    2011-03-01

    A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).

  19. Genome-Wide Prediction of DNA Methylation Using DNA Composition and Sequence Complexity in Human

    PubMed Central

    Wu, Chengchao; Yao, Shixin; Li, Xinghao; Chen, Chujia; Hu, Xuehai

    2017-01-01

    DNA methylation plays a significant role in transcriptional regulation by repressing activity. Change of the DNA methylation level is an important factor affecting the expression of target genes and downstream phenotypes. Because current experimental technologies can only assay a small proportion of CpG sites in the human genome, there is an urgent need for reliable computational models that predict genome-wide DNA methylation. Here, we proposed a novel algorithm that accurately extracted sequence complexity features (seven features) and developed a support-vector-machine-based prediction model with integration of the reported DNA composition features (trinucleotide frequency and GC content, 65 features) by utilizing the methylation profiles of human embryonic stem cells. The prediction results from 22 human chromosomes with size-varied windows showed that the 600-bp window achieved the best average accuracy of 94.7%. Moreover, comparisons with two existing methods further showed the superiority of our model, and cross-species predictions on mouse data also demonstrated that our model generalizes to some extent. Finally, a statistical test of the experimental data and the predicted data on functional regions annotated by ChromHMM found that six out of 10 regions were consistent, which implies reliable prediction of unassayed CpG sites. Accordingly, we believe that our novel model will be useful and reliable in predicting DNA methylation. PMID:28212312
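
    The 65 DNA composition features named in this abstract (64 trinucleotide frequencies plus GC content) can be illustrated with a minimal sketch. This is not the authors' code: the function name `composition_features`, the alphabetical ordering of trinucleotides, and the choice to skip k-mers containing ambiguous bases are all assumptions made for illustration.

```python
from itertools import product

# All 64 trinucleotides in a fixed (alphabetical) order, so that feature
# vectors from different windows are comparable. The ordering is an assumption.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def composition_features(window: str) -> list:
    """Return 64 trinucleotide frequencies followed by GC content (65 values)."""
    seq = window.upper()
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    total = 0
    # Slide a 3-base window along the sequence; k-mers containing ambiguous
    # bases such as N are skipped (an illustrative choice).
    for i in range(len(seq) - 2):
        kmer = seq[i:i + 3]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    freqs = [counts[k] / total if total else 0.0 for k in TRINUCLEOTIDES]
    # GC content is computed over unambiguous bases only.
    acgt = sum(seq.count(b) for b in "ACGT")
    gc = (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0
    return freqs + [gc]
```

    In a setting like the one described, such a vector would be computed for the sequence window around each CpG site (for example a 600-bp window) and passed, together with other features, to a classifier.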

  20. Probe mapping to facilitate transposon-based DNA sequencing

    SciTech Connect

    Strausbaugh, L.D.; Bourke, M.T.; Sommer, M.T.; Coon, M.E.; Berg, C.M.

    1990-08-01

    A promising strategy for DNA sequencing exploits transposons to provide mobile sites for the binding of sequencing primers. For such a strategy to be maximally efficient, the location and orientation of the transposon must be readily determined and the insertion sites should be randomly distributed. The authors demonstrate an efficient probe-based method for the localization and orientation of transposon-borne primer sites, which is adaptable to large-scale sequencing strategies. This approach requires no prior restriction enzyme mapping or knowledge of the cloned sequence and eliminates the inefficiency inherent in totally random sequencing methods. To test the efficiency of probe mapping, 49 insertions of the transposon γδ (Tn1000) in a cloned fragment of Drosophila melanogaster DNA were mapped and oriented. In addition, oligonucleotide primers specific for unique subterminal γδ segments were used to prime dideoxynucleotide double-stranded sequencing. These data provided an opportunity to rigorously examine γδ insertion sites. The insertions were quite randomly distributed, even though the target DNA fragment had both A+T-rich and G+C-rich regions; in G+C-rich DNA, the insertions were found in A+T-rich valleys. These data demonstrate that γδ is an excellent choice for supplying mobile primer binding sites to cloned DNA and that transposon-based probe mapping permits the sequences of large cloned segments to be determined without any subcloning.

  1. Affordable hands-on DNA sequencing and genotyping: an exercise for teaching DNA analysis to undergraduates.

    PubMed

    Shah, Kushani; Thomas, Shelby; Stein, Arnold

    2013-01-01

    In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C Sanger sequencing reactions. They prepare and run the gels, perform Southern blots (which require only 10 min), and detect sequencing ladders using a colorimetric detection system. Students enlarge their sequencing ladders from digital images of their small nylon membranes, and read the sequence manually. They compare their reads with the actual DNA sequence using BLAST2. After mastering the DNA sequencing system, students prepare their own DNA from a cheek swab, polymerase chain reaction-amplify a region of their DNA that encompasses a SNP of interest, and perform sequencing to determine their genotype at the SNP position. A family pedigree can also be constructed. The SNP chosen by the instructor was rs17822931, which is in the ABCC11 gene and is the determinant of human earwax type. Genotypes at the rs17822931 site vary in different ethnic populations.

  2. DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

    PubMed

    Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

    2012-01-01

    DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.

  3. Terminal repetitive sequences in herpesvirus saimiri virion DNA.

    PubMed

    Bankier, A T; Dietrich, W; Baer, R; Barrell, B G; Colbère-Garapin, F; Fleckenstein, B; Bodemer, W

    1985-07-01

    The H-DNA repeat unit of Herpesvirus saimiri strain 11 was cloned in plasmid vector pAGO, and the nucleotide sequence was determined by the dideoxy chain termination method. One unit of repetitive DNA has 1,444 base pairs with 70.8% G+C content. The structural features of repeat DNA sequences at the termini of intact virion M-DNA (160 kilobases) and orientation of reiterated DNA were analyzed by radioactive end labeling of M-DNA, followed by cleavage of the end fragments with restriction endonucleases. The termini appeared to be blunt ended with a 5'-phosphate group, probably generated during encapsidation by cleavage in the immediate vicinity of the single ApaI recognition site in the H-DNA repeat unit. The sequence did not reveal sizeable open reading frames, the longest hypothetical peptide from H-DNA being 85 amino acids. There was no evidence for an mRNA promoter or terminator element, and H-DNA-specific transcription could not be found in productively infected cells.

  4. The properties and applications of single-molecule DNA sequencing

    PubMed Central

    2011-01-01

    Single-molecule sequencing enables DNA or RNA to be sequenced directly from biological samples, making it well-suited for diagnostic and clinical applications. Here we review the properties and applications of this rapidly evolving and promising technology. PMID:21349208

  5. Ancient DNA sequence revealed by error-correcting codes

    PubMed Central

    Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo

    2015-01-01

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228

  6. Human gamma X satellite DNA: an X chromosome specific centromeric DNA sequence.

    PubMed

    Lee, C; Li, X; Jabs, E W; Court, D; Lin, C C

    1995-11-01

    The cosmid clone, CX16-2D12, was previously localized to the centromeric region of the human X chromosome and shown to lack human X-specific alpha satellite DNA. A 1.2 kb EcoRI fragment was subcloned from the CX16-2D12 cosmid and was named 2D12/E2. DNA sequencing revealed that this 1,205 bp fragment consisted of approximately five tandemly repeated DNA monomers of 220 bp. DNA sequence homology between the monomers of 2D12/E2 ranged from 72.8% to 78.6%. Interestingly, DNA sequence analysis of the 2D12/E2 clone displayed a change in monomer unit orientation between nucleotide positions 585-586 from a "tail-to-head" arrangement to a "head-to-tail" configuration. This may reflect the existence of at least one inversion within this repetitive DNA array in the centromeric region of the human X chromosome. The DNA consensus sequence derived from a compilation of these 220 bp monomers had approximately 62% DNA sequence similarity to the previously determined gamma 8 satellite DNA consensus sequence. Comparison of the 2D12/E2 and gamma 8 consensus sequences revealed a 20 bp DNA sequence that was well conserved in both DNA consensus sequences. Slot-blot analysis revealed that this repetitive DNA sequence comprises approximately 0.015% of the human genome, similar to that found with gamma 8 satellite DNA. These observations suggest that this satellite DNA clone is derived from a subfamily of gamma satellite DNA and is thus designated gamma X satellite DNA. When genomic DNA from six unrelated males and two unrelated females was cut with SstI or HpaI and separated by pulsed-field gel electrophoresis, no restriction fragment length polymorphisms were observed for either gamma X (2D12/E2) or gamma 8 (50E4) probes. Fluorescence in situ hybridization localized the 2D12/E2 clone to the lateral sides of the primary constriction specifically on the human X chromosome.

  7. Microfabricated bioprocessor for integrated nanoliter-scale Sanger DNA sequencing.

    PubMed

    Blazej, Robert G; Kumaresan, Palani; Mathies, Richard A

    2006-05-09

    An efficient, nanoliter-scale microfabricated bioprocessor integrating all three Sanger sequencing steps, thermal cycling, sample purification, and capillary electrophoresis, has been developed and evaluated. Hybrid glass-polydimethylsiloxane (PDMS) wafer-scale construction is used to combine 250-nl reactors, affinity-capture purification chambers, high-performance capillary electrophoresis channels, and pneumatic valves and pumps onto a single microfabricated device. Lab-on-a-chip-level integration enables complete Sanger sequencing from only 1 fmol of DNA template. Up to 556 continuous bases were sequenced with 99% accuracy, demonstrating read lengths required for de novo sequencing of human and other complex genomes. The performance of this miniaturized DNA sequencer provides a benchmark for predicting the ultimate cost and efficiency limits of Sanger sequencing.

  8. Electronic Transport and Thermopower in Aperiodic DNA Sequences

    NASA Astrophysics Data System (ADS)

    Roche, Stephan; Maciá, Enrique

    A detailed study of charge transport properties of synthetic and genomic DNA sequences is reported. Genomic sequences of the Chromosome 22, λ-bacteriophage, and D1s80 genes of Human and Pygmy chimpanzee are considered in this work, and compared with both periodic and quasiperiodic (Fibonacci) sequences of nucleotides. Charge transfer efficiency is compared for all these different sequences, and large variations in charge transfer efficiency, stemming from sequence-dependent effects, are reported. In addition, basic characteristics of tunneling currents, including contact effects, are described. Finally, the thermoelectric power of nucleobases connected in between metallic contacts at different temperatures is presented.
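
    The quasiperiodic (Fibonacci) comparison sequences mentioned in this abstract can be generated with a substitution rule. The sketch below is illustrative only: the abstract does not state which nucleotides or which rule were used, so the inflation rule G -> GC, C -> G and the helper names `fibonacci_sequence` and `periodic_sequence` are assumptions.

```python
def fibonacci_sequence(generations: int, seed: str = "G") -> str:
    """Build a Fibonacci chain by repeatedly applying G -> GC, C -> G.

    The choice of G and C as the two letters is an assumption made for
    illustration; any two-letter alphabet gives the same quasiperiodic order.
    """
    rules = {"G": "GC", "C": "G"}
    seq = seed
    for _ in range(generations):
        # Every letter is replaced simultaneously in each generation.
        seq = "".join(rules[base] for base in seq)
    return seq


def periodic_sequence(unit: str, length: int) -> str:
    """Build a periodic reference chain by repeating `unit` up to `length` bases."""
    repeats = length // len(unit) + 1
    return (unit * repeats)[:length]
```

    Chain lengths grow as Fibonacci numbers (1, 2, 3, 5, 8, ...), so a periodic chain of matching length can be built with `periodic_sequence("GC", len(fibonacci_sequence(n)))` for a like-for-like comparison.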

  9. Beyond reasonable doubt: evolution from DNA sequences.

    PubMed

    White, W Timothy J; Zhong, Bojian; Penny, David

    2013-01-01

    We demonstrate quantitatively that, as predicted by evolutionary theory, sequences of homologous proteins from different species converge as we go further and further back in time. The converse, a non-evolutionary model, can be expressed as probabilities, and the test works for chloroplast, nuclear and mitochondrial sequences, as well as for sequences that diverged at different time depths. Even on our conservative test, the probability that chance could produce the observed levels of ancestral convergence for just one of the eight datasets of 51 proteins is ≈1×10⁻¹⁹ and combined over 8 datasets is ≈1×10⁻¹³². By comparison, there are about 10⁸⁰ protons in the universe, hence the probability that the sequences could have been produced by a process involving unrelated ancestral sequences is about 10⁵⁰ lower than picking, among all protons, the same proton at random twice in a row. A non-evolutionary control model shows no convergence, and only a small number of parameters are required to account for the observations. It is time that researchers insisted that doubters put up testable alternatives to evolution.
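
    The proton comparison in this abstract is easier to follow in base-10 exponents. The short calculation below is a sketch under one stated assumption: "picking the same proton at random twice in a row" is read as the first pick being free and the second pick having to match it, which gives a probability of 1 in 10⁸⁰. The variable names are illustrative.

```python
# Base-10 exponents taken from the abstract.
LOG10_P_COMBINED = -132  # combined probability over the 8 datasets, about 1e-132
LOG10_PROTONS = 80       # about 1e80 protons in the universe

# Assumed reading: the first proton is picked freely, and the second pick
# must hit that same proton, so the probability is 1 / 1e80.
log10_same_proton_twice = -LOG10_PROTONS

# How many orders of magnitude smaller the combined probability is.
gap_orders_of_magnitude = log10_same_proton_twice - LOG10_P_COMBINED

print(f"combined probability is about 10^{gap_orders_of_magnitude} times smaller")
```

    Under this reading the gap is 52 orders of magnitude, which the abstract rounds to "about 10⁵⁰".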

  10. DNA-protein recognition and sequence-dependent variations of DNA conformational properties

    NASA Astrophysics Data System (ADS)

    Vologodskii, Alexander

    2015-03-01

    Parameters of B-DNA, the major form of the double helix, depend on its sequence. This dependence can contribute to the recognition of specific DNA sequences by proteins. Here we try to analyze this contribution quantitatively. In the first approach to this goal we used experimental data on the sequence dependence of DNA bending rigidity and its helical repeat. The solution data on these parameters of B-DNA were derived from the experiments on cyclization of short DNA fragments with specially designed sequences. The data allowed calculating the sequence variations of DNA bending energy, as well as the variations of the energy of torsional deformation of the double helix associated with a protein binding. The results show that DNA conformational parameters can have very limited influence on the sequence specificity of protein binding. In the second approach we analyzed the experimental data on the binding affinity of the nucleosome core with DNA fragments of different sequences. The conclusions derived in these two approaches are in good agreement with one another.

  11. The Enzyme-Like Domain of Arabidopsis Nuclear β-Amylases Is Critical for DNA Sequence Recognition and Transcriptional Activation

    PubMed Central

    Soyk, Sebastian; Šimková, Klára; Zürcher, Evelyne; Luginbühl, Leonie; Brand, Luise H.; Vaughan, Cara K.; Wanke, Dierk; Zeeman, Samuel C.

    2014-01-01

    Plant BZR1-BAM transcription factors contain a β-amylase (BAM)–like domain, characteristic of proteins involved in starch breakdown. The enzyme-derived domains appear to be noncatalytic, but they determine the function of the two Arabidopsis thaliana BZR1-BAM isoforms (BAM7 and BAM8) during transcriptional initiation. Removal or swapping of the BAM domains demonstrates that the BAM7 BAM domain restricts DNA binding and transcriptional activation, while the BAM8 BAM domain allows both activities. Furthermore, we demonstrate that BAM7 and BAM8 interact on the protein level and cooperate during transcriptional regulation. Site-directed mutagenesis of residues in the BAM domain of BAM8 shows that its function as a transcriptional activator is independent of catalysis but requires an intact substrate binding site, suggesting it may bind a ligand. Microarray experiments with plants overexpressing truncated versions lacking the BAM domain indicate that the pseudo-enzymatic domain increases selectivity for the preferred cis-regulatory element BBRE (BZR1-BAM Responsive Element). Side specificity toward the G-box may allow crosstalk to other signaling networks. This work highlights the importance of the enzyme-derived domain of BZR1-BAMs, supporting their potential role as metabolic sensors. PMID:24748042

  12. Molecular Poltergeists: Mitochondrial DNA Copies (numts) in Sequenced Nuclear Genomes

    PubMed Central

    Hazkani-Covo, Einat; Zeller, Raymond M.; Martin, William

    2010-01-01

    The natural transfer of DNA from mitochondria to the nucleus generates nuclear copies of mitochondrial DNA (numts) and is an ongoing evolutionary process, as genome sequences attest. In humans, five different numts cause genetic disease and a dozen human loci are polymorphic for the presence of numts, underscoring the rapid rate at which mitochondrial sequences reach the nucleus over evolutionary time. In the laboratory and in nature, numts enter the nuclear DNA via non-homologous end joining (NHEJ) at double-strand breaks (DSBs). The frequency of numt insertions among 85 sequenced eukaryotic genomes reveals that numt content is strongly correlated with genome size, suggesting that the numt insertion rate might be limited by DSB frequency. Polymorphic numts in humans link maternally inherited mitochondrial genotypes to nuclear DNA haplotypes during the past, offering new opportunities to associate nuclear markers with mitochondrial markers back in time. PMID:20168995

  13. Nanopore-based fourth-generation DNA sequencing technology.

    PubMed

    Feng, Yanxiao; Zhang, Yuechuan; Ying, Cuifeng; Wang, Deqiang; Du, Chunlei

    2015-02-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications.

  14. Plasma DNA aberrations in systemic lupus erythematosus revealed by genomic and methylomic sequencing.

    PubMed

    Chan, Rebecca W Y; Jiang, Peiyong; Peng, Xianlu; Tam, Lai-Shan; Liao, Gary J W; Li, Edmund K M; Wong, Priscilla C H; Sun, Hao; Chan, K C Allen; Chiu, Rossa W K; Lo, Y M Dennis

    2014-12-09

    We performed a high-resolution analysis of the biological characteristics of plasma DNA in systemic lupus erythematosus (SLE) patients using massively parallel genomic and methylomic sequencing. A number of plasma DNA abnormalities were found. First, aberrations in measured genomic representations (MGRs) were identified in the plasma DNA of SLE patients. The extent of the aberrations in MGRs correlated with anti-double-stranded DNA (anti-dsDNA) antibody level. Second, the plasma DNA of active SLE patients exhibited skewed molecular size-distribution profiles with a significantly increased proportion of short DNA fragments. The extent of plasma DNA shortening in SLE patients correlated with the SLE disease activity index (SLEDAI) and anti-dsDNA antibody level. Third, the plasma DNA of active SLE patients showed decreased methylation densities. The extent of hypomethylation correlated with SLEDAI and anti-dsDNA antibody level. To explore the impact of anti-dsDNA antibody on plasma DNA in SLE, a column-based protein G capture approach was used to fractionate the IgG-bound and non-IgG-bound DNA in plasma. Compared with healthy individuals, SLE patients had higher concentrations of IgG-bound DNA in plasma. More IgG binding occurs at genomic locations showing increased MGRs. Furthermore, the IgG-bound plasma DNA was shorter in size and more hypomethylated than the non-IgG-bound plasma DNA. These observations have enhanced our understanding of the spectrum of plasma DNA aberrations in SLE and may provide new molecular markers for SLE. Our results also suggest that caution should be exercised when interpreting plasma DNA-based noninvasive prenatal testing and cancer testing conducted for SLE patients.

  15. Efficient depletion of host DNA contamination in malaria clinical sequencing.

    PubMed

    Oyola, Samuel O; Gu, Yong; Manske, Magnus; Otto, Thomas D; O'Brien, John; Alcock, Daniel; Macinnis, Bronwyn; Berriman, Matthew; Newbold, Chris I; Kwiatkowski, Dominic P; Swerdlow, Harold P; Quail, Michael A

    2013-03-01

    The cost of whole-genome sequencing (WGS) is decreasing rapidly as next-generation sequencing technology continues to advance, and the prospect of making WGS available for public health applications is becoming a reality. So far, a number of studies have demonstrated the use of WGS as an epidemiological tool for typing and controlling outbreaks of microbial pathogens. Success of these applications is hugely dependent on efficient generation of clean genetic material that is free from host DNA contamination for rapid preparation of sequencing libraries. The presence of large amounts of host DNA severely affects the efficiency of characterizing pathogens using WGS and is therefore a serious impediment to clinical and epidemiological sequencing for health care and public health applications. We have developed a simple enzymatic treatment method that takes advantage of the methylation of human DNA to selectively deplete host contamination from clinical samples prior to sequencing. Using malaria clinical samples with over 80% human host DNA contamination, we show that the enzymatic treatment enriches Plasmodium falciparum DNA up to ∼9-fold and generates high-quality, nonbiased sequence reads covering >98% of 86,158 catalogued typeable single-nucleotide polymorphism loci.

  16. Ultrasensitive fluorescence detection of DNA sequencing gels

    SciTech Connect

    Mathies, R.A.

    1991-01-01

    During the three years of this grant we have: (1) Developed and applied a new theory for optimizing high-sensitivity fluorescence detection. (2) Developed and patented a new high-sensitivity confocal-fluorescence laser-excited gel-scanner. (3) Applied this scanner to the development of a new class of versatile and sensitive fluorescent dyes for DNA detection. (4) Developed methods for the detection of single fluorescent molecules by fluorescence burst detection. 11 refs., 10 figs.

  17. Recognizing a Single Base in an Individual DNA Strand: A Step Toward Nanopore DNA Sequencing

    PubMed Central

    Ashkenasy, N.; Sánchez-Quesada, J.; Ghadiri, M. R.; Bayley, H.

    2007-01-01

    Functional supramolecular chemistry at the single-molecule level. Single strands of DNA can be captured inside α-hemolysin transmembrane pore protein to form single-species α-HL·DNA pseudorotaxanes. This process can be used to identify a single adenine nucleotide at a specific location on a strand of DNA by the characteristic reductions in the α-HL ion conductance. This study suggests that α-HL-mediated single-molecule DNA sequencing might be fundamentally feasible. PMID:15666419

  18. PCR Primers for Metazoan Mitochondrial 12S Ribosomal DNA Sequences

    PubMed Central

    Machida, Ryuji J.; Kweskin, Matthew; Knowlton, Nancy

    2012-01-01

    Background: Assessment of the biodiversity of communities of small organisms is most readily done using PCR-based analysis of environmental samples consisting of mixtures of individuals. Known as metagenetics, this approach has transformed understanding of microbial communities and is beginning to be applied to metazoans as well. Unlike microbial studies, where analysis of the 16S ribosomal DNA sequence is standard, the best gene for metazoan metagenetics is less clear. In this study we designed a set of PCR primers for the mitochondrial 12S ribosomal DNA sequence based on 64 complete mitochondrial genomes and then tested their efficacy. Methodology/Principal Findings: A total of 64 complete mitochondrial genome sequences representing all metazoan classes available in GenBank were downloaded using the NCBI Taxonomy Browser. Alignment of sequences was performed for the excised mitochondrial 12S ribosomal DNA sequences, and conserved regions were identified for all 64 mitochondrial genomes. These regions were used to design a primer pair that flanks a more variable region in the gene. Then all of the complete metazoan mitochondrial genomes available in NCBI's Organelle Genome Resources database were used to determine the percentage of taxa that would likely be amplified using these primers. Results suggest that these primers will amplify target sequences for many metazoans. Conclusions/Significance: Newly designed 12S ribosomal DNA primers have considerable potential for metazoan metagenetic analysis because of their ability to amplify sequences from many metazoans. PMID:22536450
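
    The step of identifying conserved regions in an alignment as candidate primer sites can be sketched as follows. This is a simplified illustration, not the procedure used in the study: the scoring (fraction of sequences sharing the most common base per column, with gaps counting against conservation), the default window width and threshold, and the function names are all assumptions.

```python
from collections import Counter


def column_conservation(alignment):
    """Score each alignment column by the fraction of sequences that share
    its most common base. Gap characters ('-') count against conservation."""
    n = len(alignment)
    scores = []
    for column in zip(*alignment):
        bases = Counter(b for b in column if b != "-")
        top = bases.most_common(1)[0][1] if bases else 0
        scores.append(top / n)
    return scores


def conserved_windows(alignment, width=20, threshold=0.9):
    """Return half-open (start, end) column ranges in which every column
    scores at least `threshold` and the run is at least `width` columns long."""
    scores = column_conservation(alignment)
    runs, start = [], None
    # A sentinel score below any valid threshold closes a run at the end.
    for i, score in enumerate(scores + [-1.0]):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            if i - start >= width:
                runs.append((start, i))
            start = None
    return runs
```

    Two conserved runs that flank a lower-scoring stretch would then be candidate sites for a forward and a reverse primer around a variable region. Real primer design would also need to consider melting temperature, degeneracy, and 3'-end mismatches, which this sketch ignores.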

  19. Sequence of figwort mosaic virus DNA (caulimovirus group).

    PubMed

    Richins, R D; Scholthof, H B; Shepherd, R J

    1987-10-26

    The nucleotide sequence of an infectious clone of figwort mosaic virus (FMV) was determined using the dideoxynucleotide chain termination method. The double-stranded DNA genome (7743 base pairs) contained eight open reading frames (ORFs), seven of which corresponded approximately in size and location to the ORFs found in the genome of cauliflower mosaic virus (CaMV) and carnation etched ring virus (CERV). ORFs I and V of FMV demonstrated the highest degrees of nucleotide and amino acid sequence homology with the equivalent coding regions of CaMV and CERV. Regions II, III and IV showed somewhat less homology with the analogous regions of CaMV and CERV, and ORF VI showed homology with the corresponding gene of CaMV and CERV in only a short segment near the middle of the putative gene product. A 16 nucleotide sequence, complementary to the 3' terminus of methionine initiator tRNA (tRNAimet) and presumed to be the primer binding site for initiation of reverse transcription to produce minus strand DNA, was found in the FMV genome near the discontinuity in the minus strand. Sequences near the three interruptions in the plus strand of FMV DNA bear strong resemblance to similarly located sequences of 3 other caulimoviruses and are inferred to be initiation sites for second strand DNA synthesis. Additional conserved sequences in the small and large intergenic regions are pointed out including a highly conserved 35 bp sequence that occurs in the latter region.

  20. Crystal Structure of Human Thymine DNA Glycosylase Bound to DNA Elucidates Sequence-Specific Mismatch Recognition

    SciTech Connect

    Maiti, A.; Morgan, M.T.; Pozharski, E.; Drohat, A.C.

    2009-05-19

    Cytosine methylation at CpG dinucleotides produces m⁵CpG, an epigenetic modification that is important for transcriptional regulation and genomic stability in vertebrate cells. However, m⁵C deamination yields mutagenic G·T mispairs, which are implicated in genetic disease, cancer, and aging. Human thymine DNA glycosylase (hTDG) removes T from G·T mispairs, producing an abasic (or AP) site, and follow-on base excision repair proteins restore the G·C pair. hTDG is inactive against normal A·T pairs, and is most effective for G·T mispairs and other damage located in a CpG context. The molecular basis of these important catalytic properties has remained unknown. Here, we report a crystal structure of hTDG (catalytic domain, hTDG^cat) in complex with abasic DNA, at 2.8 Å resolution. Surprisingly, the enzyme crystallized in a 2:1 complex with DNA, one subunit bound at the abasic site, as anticipated, and the other at an undamaged (nonspecific) site. Isothermal titration calorimetry and electrophoretic mobility-shift experiments indicate that hTDG and hTDG^cat can bind abasic DNA with 1:1 or 2:1 stoichiometry. Kinetics experiments show that the 1:1 complex is sufficient for full catalytic (base excision) activity, suggesting that the 2:1 complex, if adopted in vivo, might be important for some other activity of hTDG, perhaps binding interactions with other proteins. Our structure reveals interactions that promote the stringent specificity for guanine versus adenine as the pairing partner of the target base and interactions that likely confer CpG sequence specificity. We find striking differences between hTDG and its prokaryotic ortholog (MUG), despite the relatively high (32%) sequence identity.

  1. Analysis of sequence variation in Gnathostoma spinigerum mitochondrial DNA by single-strand conformation polymorphism analysis and DNA sequence.

    PubMed

    Ngarmamonpirat, Charinthon; Waikagul, Jitra; Petmitr, Songsak; Dekumyoy, Paron; Rojekittikhun, Wichit; Anantapruti, Malinee T

    2005-03-01

    Morphological variations were observed in the advanced third-stage larvae of Gnathostoma spinigerum collected from swamp eel (Fluta alba), the second intermediate host. Larvae with typical and three atypical types were chosen for partial cytochrome c oxidase subunit I (COI) gene sequence analysis. A 450 bp polymerase chain reaction product of the COI gene was amplified from mitochondrial DNA. The variations were analyzed by single-strand conformation polymorphism and DNA sequencing. The nucleotide variations of the COI gene in the four types of larvae indicated the presence of an intra-specific variation of mitochondrial DNA in the G. spinigerum population.

  2. DNA endonuclease activities on psoralen plus ultraviolet light treated DNA

    SciTech Connect

    Lambert, M.W.; Clark, M.

    1986-03-01

    Activities of nuclear DNA endonucleases (Endos) from normal human lymphoblastoid cells on DNA treated with the DNA interstrand cross-linking agents 4,5',8-trimethylpsoralen (TMP) or 8-methoxypsoralen (MOP) plus long-wavelength (320-400 nm) ultraviolet light (UVA) were examined. Chromatin-associated DNA Endos were isolated from both cell lines and subjected to isoelectric focusing (IF). Each IF fraction was assayed for DNA Endo activity. Peaks of activity were pooled and assayed for activity on undamaged PM2 bacteriophage DNA and on PM2 DNA that had been treated with 15 µg/ml TMP or MOP in the dark and then exposed to UVA light. Unbound psoralen was removed by dialysis and a second dose of UVA light was given in order to increase the number of DNA cross-links. Two Endo activities were found which were active on TMP- and MOP-DNA: a major one, pI 4.6, which is also active on intercalated DNA, and a second, lesser one, pI 7.6, which is active on UVC (254 nm) light irradiated DNA. These results indicate that there are two different DNA Endos which act on both TMP- and MOP-treated DNA and that the major activity recognizes the intercalation of, and/or the cross-link produced by interaction of, psoralen with DNA.

  3. Modeling the early stage of DNA sequence recognition within RecA nucleoprotein filaments.

    PubMed

    Saladin, Adrien; Amourda, Christopher; Poulain, Pierre; Férey, Nicolas; Baaden, Marc; Zacharias, Martin; Delalande, Olivier; Prévost, Chantal

    2010-10-01

    Homologous recombination is a fundamental process enabling the repair of double-strand breaks with a high degree of fidelity. In prokaryotes, it is carried out by RecA nucleofilaments formed on single-stranded DNA (ssDNA). These filaments incorporate genomic sequences that are homologous to the ssDNA and exchange the homologous strands. Due to the highly dynamic character of this process and its rapid propagation along the filament, the sequence recognition and strand exchange mechanism remains unknown at the structural level. The recently published structure of the RecA/DNA filament active for recombination (Chen et al., Mechanism of homologous recombination from the RecA-ssDNA/dsDNA structure, Nature 2008, 453, 489) provides a starting point for new exploration of the system. Here, we investigate the possible geometries of association of the early encounter complex between RecA/ssDNA filament and double-stranded DNA (dsDNA). Due to the huge size of the system and its dense packing, we use a reduced representation for protein and DNA together with state-of-the-art molecular modeling methods, including systematic docking and virtual reality simulations. The results indicate that it is possible for the double-stranded DNA to access the RecA-bound ssDNA while initially retaining its Watson-Crick pairing. They emphasize the importance of RecA L2 loop mobility for both recognition and strand exchange.

  4. Theoretical modelling of epigenetically modified DNA sequences

    PubMed Central

    Carvalho, Alexandra Teresa Pires; Gouveia, Maria Leonor; Raju Kanna, Charan; Wärmländer, Sebastian K. T. S.; Platts, Jamie; Kamerlin, Shina Caroline Lynn

    2015-01-01

    We report herein a set of calculations designed to examine the effects of epigenetic modifications on the structure of DNA. The incorporation of methyl, hydroxymethyl, formyl and carboxy substituents at the 5-position of cytosine is shown to hardly affect the geometry of CG base pairs, but to result in rather larger changes to hydrogen-bond and stacking binding energies, as predicted by dispersion-corrected density functional theory (DFT) methods. The same modifications within double-stranded GCG and ACA trimers exhibit rather larger structural effects, when including the sugar-phosphate backbone as well as sodium counterions and implicit aqueous solvation. In particular, changes are observed in the buckle and propeller angles within base pairs and the slide and roll values of base pair steps, but these leave the overall helical shape of DNA essentially intact. The structures so obtained are useful as a benchmark of faster methods, including molecular mechanics (MM) and hybrid quantum mechanics/molecular mechanics (QM/MM) methods. We show that previously developed MM parameters satisfactorily reproduce the trimer structures, as do QM/MM calculations which treat bases with dispersion-corrected DFT and the sugar-phosphate backbone with AMBER. The latter are improved by inclusion of all six bases in the QM region, since a truncated model including only the central CG base pair in the QM region is considerably further from the DFT structure. This QM/MM method is then applied to a set of double-stranded DNA heptamers derived from a recent X-ray crystallographic study, whose size puts a DFT study beyond our current computational resources. These data show that still larger structural changes are observed than in base pairs or trimers, leading us to conclude that it is important to model epigenetic modifications within realistic molecular contexts. PMID:26448859

  5. Probabilistic models for semisupervised discriminative motif discovery in DNA sequences.

    PubMed

    Kim, Jong Kyoung; Choi, Seungjin

    2011-01-01

    Methods for discriminative motif discovery in DNA sequences identify transcription factor binding sites (TFBSs), searching only for patterns that differentiate two sets (positive and negative sets) of sequences. On one hand, discriminative methods increase the sensitivity and specificity of motif discovery, compared to generative models. On the other hand, generative models can easily exploit unlabeled sequences to better detect functional motifs when labeled training samples are limited. In this paper, we develop a hybrid generative/discriminative model which enables us to make use of unlabeled sequences in the framework of discriminative motif discovery, leading to semisupervised discriminative motif discovery. Numerical experiments on yeast ChIP-chip data for discovering DNA motifs demonstrate that the best performance is obtained between the purely generative and the purely discriminative extremes, and that semisupervised learning improves performance when labeled sequences are limited.

  6. Selective enrichment of damaged DNA molecules for ancient genome sequencing

    PubMed Central

    2014-01-01

    Contamination by present-day human and microbial DNA is one of the major hindrances for large-scale genomic studies using ancient biological material. We describe a new molecular method, U selection, which exploits one of the most distinctive features of ancient DNA—the presence of deoxyuracils—for selective enrichment of endogenous DNA against a complex background of contamination during DNA library preparation. By applying the method to Neanderthal DNA extracts that are heavily contaminated with present-day human DNA, we show that the fraction of useful sequence information increases ∼10-fold and that the resulting sequences are more efficiently depleted of human contamination than when using purely computational approaches. Furthermore, we show that U selection can lead to a four- to fivefold increase in the proportion of endogenous DNA sequences relative to those of microbial contaminants in some samples. U selection may thus help to lower the costs for ancient genome sequencing of nonhuman samples also. PMID:25081630

  7. Applications of recursive segmentation to the analysis of DNA sequences.

    PubMed

    Li, Wentian; Bernaola-Galván, Pedro; Haghighi, Fatameh; Grosse, Ivo

    2002-07-01

    Recursive segmentation is a procedure that partitions a DNA sequence into domains with a homogeneous composition of the four nucleotides A, C, G and T. This procedure can also be applied to any sequence converted from a DNA sequence, such as to a binary strong(G + C)/weak(A + T) sequence, to a binary sequence indicating the presence or absence of the dinucleotide CpG, or to a sequence indicating both the base and the codon position information. We apply various conversion schemes in order to address the following five DNA sequence analysis problems: isochore mapping, CpG island detection, locating the origin and terminus of replication in bacterial genomes, finding complex repeats in telomere sequences, and delineating coding and noncoding regions. We find that the recursive segmentation procedure can successfully detect isochore borders, CpG islands, and the origin and terminus of replication, but it needs improvement for detecting complex repeats as well as borders between coding and noncoding regions.
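
    The partitioning procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the Jensen-Shannon divergence between the left and right compositions as the split score and a fixed cutoff (the hypothetical `threshold` parameter) as the stopping rule, whereas published variants use significance- or model-selection-based criteria.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy (bits) of a composition given symbol counts and length n."""
    return -sum(c / n * math.log2(c / n) for c in counts.values() if c > 0)


def best_split(seq, min_len=2):
    """Return (position, divergence) of the split with the largest
    Jensen-Shannon divergence, or (None, 0.0) if no split scores above zero."""
    n = len(seq)
    total = Counter(seq)
    h_total = entropy(total, n)
    left = Counter()
    best_pos, best_js = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        if i < min_len or n - i < min_len:
            continue
        right = total - left  # Counter subtraction drops zero counts
        js = (h_total
              - (i / n) * entropy(left, i)
              - ((n - i) / n) * entropy(right, n - i))
        if js > best_js:
            best_pos, best_js = i, js
    return best_pos, best_js


def segment(seq, threshold=0.3, offset=0, min_len=2):
    """Recursively split seq; return sorted boundary positions.
    `threshold` is an assumed fixed cutoff, chosen here for illustration."""
    pos, js = best_split(seq, min_len)
    if pos is None or js < threshold:
        return []
    return (segment(seq[:pos], threshold, offset, min_len)
            + [offset + pos]
            + segment(seq[pos:], threshold, offset + pos, min_len))
```

    The same functions work unchanged on converted sequences, for example a binary strong/weak string obtained by mapping G and C to "S" and A and T to "W" before calling `segment`.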

  8. Improved Algorithm for Analysis of DNA Sequences Using Multiresolution Transformation

    PubMed Central

    Inbamalar, T. M.; Sivakumar, R.

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA), the ribonucleic acid (RNA), and the proteins. Fast and precise identification of the protein coding regions in a DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solutions with greater background noise. Hence, improvements in accuracy and computational complexity, and a reduction in background noise, are essential in identification of the protein coding regions in DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using the electron ion interaction potential (EIIP) representation. Then a discrete wavelet transform is applied. The absolute value of the energy is computed and a suitable threshold is applied. The test is conducted using the databases available on the National Center for Biotechnology Information (NCBI) site. A comparative analysis confirms the efficiency of the proposed system. PMID:26000337
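
    The pipeline in this abstract (EIIP coding, wavelet transform, energy, threshold) can be illustrated with a short sketch. It is only an approximation under stated assumptions: a single-level Haar wavelet is used because the abstract does not name the wavelet or the number of levels, and `flag_high_energy` with its `threshold` argument is a hypothetical helper, not the authors' detector. The EIIP values are the standard nucleotide values.

```python
import math

# Standard electron-ion interaction potential (EIIP) values for nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def to_eiip(seq):
    """Convert a DNA string into its numeric EIIP sequence."""
    return [EIIP[base] for base in seq.upper()]


def haar_dwt(signal):
    """One level of the orthonormal Haar transform.
    Returns (approximation, detail) coefficient lists."""
    if len(signal) % 2:  # pad odd-length input by repeating the last sample
        signal = signal + [signal[-1]]
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2
              for i in range(0, len(signal), 2)]
    return approx, detail


def flag_high_energy(seq, threshold):
    """Hypothetical helper: mark detail coefficients whose energy
    (squared magnitude) exceeds an assumed threshold."""
    _, detail = haar_dwt(to_eiip(seq))
    return [d * d > threshold for d in detail]
```

    For example, `flag_high_energy("AAGT", 0.001)` flags only the second coefficient, because the A/A pair produces zero detail energy while the G/T pair does not.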

  9. A Fast Algorithm for Exonic Regions Prediction in DNA Sequences

    PubMed Central

    Saberkari, Hamidreza; Shamsi, Mousa; Heravi, Hamed; Sedaaghi, Mohammad Hossein

    2013-01-01

    The main purpose of this paper is to introduce a fast method for gene prediction in DNA sequences based on the period-3 property in exons. First, the symbolic DNA sequences were converted to digital signals using the electron ion interaction potential method. Then, to reduce the effect of background noise in the period-3 spectrum, we used the discrete wavelet transform at three levels and applied it on the input digital signal. Finally, the Goertzel algorithm was used to extract period-3 components in the filtered DNA sequence. The proposed algorithm decreases the computational complexity and hence increases the speed of the process. Accurate detection of small exons in DNA sequences is another advantage of the algorithm. The ability of the proposed algorithm to predict exons was compared with that of several existing methods at the nucleotide level using: (i) specificity-sensitivity values; (ii) receiver operating characteristic (ROC) curves; and (iii) area under the ROC curve. Simulation results confirmed that the proposed method can be used as a promising tool for exon prediction in DNA sequences. PMID:24672762
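
    The final step above, extracting the period-3 component with the Goertzel algorithm, can be sketched as follows. The sketch assumes the standard second-order Goertzel recurrence evaluated at DFT bin N/3 and omits the three-level wavelet denoising that the abstract describes; `period3_power` is a hypothetical helper name.

```python
import math

# Standard electron-ion interaction potential (EIIP) values for nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def goertzel_power(samples, k):
    """Squared magnitude of DFT bin k of `samples`, computed with the
    Goertzel recurrence instead of a full FFT."""
    n = len(samples)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def period3_power(seq):
    """Power at the period-3 frequency (bin N/3) of the EIIP-coded sequence.
    The sequence is trimmed so that its length is a multiple of 3."""
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("need at least 3 bases")
    signal = [EIIP[base] for base in seq[:usable].upper()]
    return goertzel_power(signal, usable // 3)
```

    In practice this would be evaluated in a sliding window along the sequence, so that windows overlapping exons stand out by their higher period-3 power; the window length is a tuning choice not specified here.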

  10. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    PubMed Central

    2011-01-01

    Background: Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results: The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions: PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/. PMID:21385349

  11. Mapping DNA polymerase errors by single-molecule sequencing

    PubMed Central

    Lee, David F.; Lu, Jenny; Chang, Seungwoo; Loparo, Joseph J.; Xie, Xiaoliang S.

    2016-01-01

    Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replication product is tagged with a unique nucleotide sequence before amplification. This allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases. PMID:27185891
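
    The barcoding idea, comparing several reads of the same tagged product so that sequencing errors can be voted out, can be illustrated with a small sketch. It is not the authors' pipeline: it assumes reads arrive as (barcode, sequence) pairs that are already aligned to the same start position, and `min_reads` is a hypothetical cutoff for how many reads a barcode family needs.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=3):
    """Group (barcode, sequence) pairs by barcode and return a per-position
    majority-vote consensus for every barcode with at least `min_reads` reads."""
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_reads:
            continue  # too few reads to separate sequencing errors from signal
        length = min(len(s) for s in seqs)
        consensus[barcode] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus
```

    A base that differs from the template in the consensus of a family is then a candidate polymerase error, whereas a base that differs in only one read of the family is more likely a sequencing error.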

  12. Label-free DNA sequencing using Millikan detection.

    PubMed

    Dettloff, Roger; Leiske, Danielle; Chow, Andrea; Farinas, Javier

    2015-10-15

    A label-free method for DNA sequencing based on the principle of the Millikan oil drop experiment was developed. This sequencing-by-synthesis approach sensed increases in bead charge as nucleotides were added by a polymerase to DNA templates attached to beads. The balance between an electrical force, which was dependent on the number of nucleotide charges on a bead, and opposing hydrodynamic drag and restoring tether forces resulted in a bead velocity that was a function of the number of nucleotides attached to the bead. The velocity of beads tethered via a polymer to a microfluidic channel and subjected to an oscillating electric field was measured using dark-field microscopy and used to determine how many nucleotides were incorporated during each sequencing-by-synthesis cycle. Increases in bead velocity of approximately 1% were reliably detected during DNA polymerization, allowing for sequencing of short DNA templates. The method could lead to a low-cost, high-throughput sequencing platform that could enable routine sequencing in medical applications.

  13. Label-Free DNA Sequencing Using Millikan Detection

    PubMed Central

    Dettloff, Roger; Leiske, Danielle; Chow, Andrea; Farinas, Javier

    2015-01-01

    A label-free method for DNA sequencing based on the principle of the Millikan oil drop experiment was developed. This sequencing-by-synthesis approach sensed increases in bead charge as nucleotides were added by a polymerase to DNA templates attached to beads. The balance between an electrical force, which was dependent on the number of nucleotide charges on a bead, and opposing hydrodynamic drag and restoring tether forces resulted in a bead velocity that was a function of the number of nucleotides attached to the bead. The velocity of beads tethered via a polymer to a microfluidic channel and subjected to an oscillating electric field was measured using dark-field microscopy and used to determine how many nucleotides were incorporated during each sequencing-by-synthesis cycle. Increases in bead velocity of ~1% were reliably detected during DNA polymerization, allowing for sequencing of short DNA templates. The method could lead to a low-cost, high-throughput sequencing platform that could enable routine sequencing in medical applications. PMID:26151683

  14. Correlations in DNA sequences across the three domains of life

    NASA Astrophysics Data System (ADS)

    Guharay, Sabyasachi; Hunt, Brian R.; Yorke, James A.; White, Owen R.

    2000-11-01

    We report statistical studies of correlation properties of ∼7500 gene sequences, covering coding (exon) and non-coding (intron) sequences for DNA and primary amino acid sequences for proteins, across all three domains of life, namely Eukaryotes (cells with nuclei), Prokaryotes (bacteria) and Archaea (archaebacteria). Mutual information function, power spectrum and Hölder exponent analyses show exons with somewhat greater correlation content than the introns studied. These results are further confirmed with hypothesis testing. While ∼30% of the Eukaryote coding sequences show distinct correlations above noise threshold, this is true for only ∼10% of the Prokaryote and Archaea coding sequences. For protein sequences, we observe correlation lengths similar to that of “random” sequences.
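
    One of the statistics named above, the mutual information function, can be computed as in the sketch below. It uses the usual definition I(k) = sum over symbol pairs (a, b) of p_ab(k) log2[p_ab(k) / (p_a p_b)], with probabilities estimated from pair counts at separation k; the estimator details (no finite-size correction, marginals taken from the same pairs) are assumptions of this sketch rather than the authors' stated procedure.

```python
import math
from collections import Counter


def mutual_information(seq, k):
    """Mutual information (bits) between symbols separated by k positions."""
    if k < 1 or k >= len(seq):
        raise ValueError("k must satisfy 1 <= k < len(seq)")
    pairs = [(seq[i], seq[i + k]) for i in range(len(seq) - k)]
    n = len(pairs)
    joint = Counter(pairs)
    first = Counter(a for a, _ in pairs)    # marginal of the leading symbol
    second = Counter(b for _, b in pairs)   # marginal of the trailing symbol
    return sum(
        (c / n) * math.log2((c / n) / ((first[a] / n) * (second[b] / n)))
        for (a, b), c in joint.items()
    )
```

    Plotting `mutual_information(seq, k)` against k gives the correlation profile; values for short sequences are biased upward, so comparison against shuffled sequences is a common control.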

  15. Transcriptional Regulation in Mammalian Cells by Sequence-Specific DNA Binding Proteins

    NASA Astrophysics Data System (ADS)

    Mitchell, Pamela J.; Tjian, Robert

    1989-07-01

    The cloning of genes encoding mammalian DNA binding transcription factors for RNA polymerase II has provided the opportunity to analyze the structure and function of these proteins. This review summarizes recent studies that define structural domains for DNA binding and transcriptional activation functions in sequence-specific transcription factors. The mechanisms by which these factors may activate transcriptional initiation and by which they may be regulated to achieve differential gene expression are also discussed.

  16. Dialects of the DNA Uptake Sequence in Neisseriaceae

    PubMed Central

    Frye, Stephan A.; Nilsen, Mariann; Tønjum, Tone; Ambur, Ole Herman

    2013-01-01

    In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS), which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS–dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5′-CTG-3′ is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS–dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic transformation

  17. A simple and rapid method for the preparation of homologous DNA oligonucleotide hybridization probes from heterologous gene sequences and probes.

    PubMed

    Maxwell, E S; Sarge, K D

    1988-11-30

    We describe a simple and rapid method for the preparation of homologous DNA oligonucleotide probes for hybridization analysis and/or cDNA/genomic library screening. With this method, a synthetic DNA oligonucleotide derived from a known heterologous DNA/RNA/protein sequence is annealed to an RNA preparation containing the gene transcript of interest. Any unpaired 3'-terminal oligonucleotides of the heterologous DNA primer are then removed using the 3' exonuclease activity of the DNA Polymerase I Klenow fragment before primer extension/dideoxynucleotide sequencing of the annealed RNA species with AMV reverse transcriptase. From the determined RNA sequence, a completely homologous DNA oligonucleotide probe is then prepared. This approach has been used to prepare a homologous DNA oligonucleotide probe for the successful library screening of the yeast hybRNA gene starting with a heterologous mouse hybRNA DNA oligonucleotide probe.

  18. SiteOut: An Online Tool to Design Binding Site-Free DNA Sequences.

    PubMed

    Estrada, Javier; Ruiz-Herrero, Teresa; Scholes, Clarissa; Wunderlich, Zeba; DePace, Angela H

    2016-01-01

    DNA-binding proteins control many fundamental biological processes such as transcription, recombination and replication. A major goal is to decipher the role that DNA sequence plays in orchestrating the binding and activity of such regulatory proteins. To address this goal, it is useful to rationally design DNA sequences with desired numbers, affinities and arrangements of protein binding sites. However, removing binding sites from DNA is computationally non-trivial since one risks creating new sites in the process of deleting or moving others. Here we present an online binding site removal tool, SiteOut, that enables users to design arbitrary DNA sequences that entirely lack binding sites for factors of interest. SiteOut can also be used to delete sites from a specific sequence, or to introduce site-free spacers between functional sequences without creating new sites at the junctions. In combination with commercial DNA synthesis services, SiteOut provides a powerful and flexible platform for synthetic projects that interrogate regulatory DNA. Here we describe the algorithm and illustrate the ways in which SiteOut can be used; it is publicly available at https://depace.med.harvard.edu/siteout/.
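
    The difficulty mentioned above, that editing one site can create another, can be made concrete with a toy sketch. It is not SiteOut's algorithm: it assumes binding sites are exact strings (checked on both strands) rather than scored motifs, and it mutates random positions inside the first remaining hit until a full rescan finds nothing. All function names here are hypothetical.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, motifs):
    """Sorted (start, pattern) hits for each motif and its reverse complement."""
    hits = []
    for motif in motifs:
        for pattern in {motif, revcomp(motif)}:
            start = seq.find(pattern)
            while start != -1:
                hits.append((start, pattern))
                start = seq.find(pattern, start + 1)
    return sorted(hits)


def remove_sites(seq, motifs, rng=None, max_iter=10000):
    """Mutate bases inside remaining hits until none are left.
    The whole sequence is rescanned after every change, because a mutation
    that destroys one site may create another."""
    rng = rng or random.Random(0)
    bases = list(seq)
    for _ in range(max_iter):
        hits = find_sites("".join(bases), motifs)
        if not hits:
            return "".join(bases)
        start, pattern = hits[0]
        pos = start + rng.randrange(len(pattern))
        bases[pos] = rng.choice([b for b in "ACGT" if b != bases[pos]])
    raise RuntimeError("could not remove all sites within max_iter mutations")
```

    Rescanning the full sequence after each mutation is the simplest way to guarantee that the returned sequence is free of the listed sites, including at junctions between joined fragments.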

  19. SiteOut: An Online Tool to Design Binding Site-Free DNA Sequences

    PubMed Central

    Scholes, Clarissa; Wunderlich, Zeba; DePace, Angela H.

    2016-01-01

    DNA-binding proteins control many fundamental biological processes such as transcription, recombination and replication. A major goal is to decipher the role that DNA sequence plays in orchestrating the binding and activity of such regulatory proteins. To address this goal, it is useful to rationally design DNA sequences with desired numbers, affinities and arrangements of protein binding sites. However, removing binding sites from DNA is computationally non-trivial since one risks creating new sites in the process of deleting or moving others. Here we present an online binding site removal tool, SiteOut, that enables users to design arbitrary DNA sequences that entirely lack binding sites for factors of interest. SiteOut can also be used to delete sites from a specific sequence, or to introduce site-free spacers between functional sequences without creating new sites at the junctions. In combination with commercial DNA synthesis services, SiteOut provides a powerful and flexible platform for synthetic projects that interrogate regulatory DNA. Here we describe the algorithm and illustrate the ways in which SiteOut can be used; it is publicly available at https://depace.med.harvard.edu/siteout/. PMID:26987123

  20. Mitochondrial DNA Sequence Divergence among Lycopersicon and Related Solanum Species

    PubMed Central

    McClean, Phillip E.; Hanson, Maureen R.

    1986-01-01

    Sequence divergence among the mitochondrial (mt) DNA of nine Lycopersicon and two closely related Solanum species was estimated using the shared fragment method. A portion of each mt genome was highlighted by probing total DNA with a series of plasmid clones containing mt-specific DNA fragments from Lycopersicon pennellii. A total of 660 fragments were compared. As calculated by the shared fragment method, sequence divergence among the mtDNAs ranged from 0.4% for the L. esculentum-L. esculentum var. cerasiforme pair to 2.7% for the Solanum rickii-L. pimpinellifolium and L. cheesmanii-L. chilense pairs. The mtDNA divergence is higher than that reported for Lycopersicon chloroplast (cp) DNA, which indicates that the DNAs of the two plant organelles are evolving at different rates. The percentages of shared fragments were used to construct a phenogram that illustrates the present-day relationships of the mtDNAs. The mtDNA-derived phenogram places L. hirsutum closer to L. esculentum than taxonomic and cpDNA comparisons. Further, the recent assignment of L. pennellii to the genus Lycopersicon is supported by the mtDNA analysis. PMID:17246320

  1. Metagenomics to paleogenomics: large-scale sequencing of mammoth DNA.

    PubMed

    Poinar, Hendrik N; Schwarz, Carsten; Qi, Ji; Shapiro, Beth; Macphee, Ross D E; Buigues, Bernard; Tikhonov, Alexei; Huson, Daniel H; Tomsho, Lynn P; Auch, Alexander; Rampp, Markus; Miller, Webb; Schuster, Stephan C

    2006-01-20

    We sequenced 28 million base pairs of DNA in a metagenomics approach, using a woolly mammoth (Mammuthus primigenius) sample from Siberia. As a result of exceptional sample preservation and the use of a recently developed emulsion polymerase chain reaction and pyrosequencing technique, 13 million base pairs (45.4%) of the sequencing reads were identified as mammoth DNA. Sequence identity between our data and African elephant (Loxodonta africana) was 98.55%, consistent with a paleontologically based divergence date of 5 to 6 million years. The sample includes a surprisingly small diversity of environmental DNAs. The high percentage of endogenous DNA recoverable from this single mammoth would allow for completion of its genome, unleashing the field of paleogenomics.

  2. Accelerating Computation of DNA Sequence Alignment in Distributed Environment

    NASA Astrophysics Data System (ADS)

    Guo, Tao; Li, Guiyang; Deaton, Russel

    Sequence similarity and alignment are among the most important operations in computational biology. However, analyzing large sets of DNA sequences seems to be impractical on a regular PC. Using multiple threads with the JavaParty mechanism, this project successfully extended the capabilities of regular Java to a distributed environment for simulation of DNA computation. With the aid of JavaParty and a multithreaded design, the results of this study demonstrate that the modified regular Java program can perform parallel computing without using RMI or socket communication. In this paper, an efficient method for modeling and comparing DNA sequences with dynamic programming and JavaParty is proposed. Additionally, results of this method in a distributed environment are discussed.
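
    The dynamic-programming comparison mentioned above is illustrated below with a global alignment score in the Needleman-Wunsch style. The abstract does not say which alignment variant or scoring scheme was used, so the linear gap penalty and the score values here are assumptions, and the sketch is in Python rather than the Java/JavaParty setting of the paper.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of strings a and b
    (Needleman-Wunsch with a linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal,
                              score[i - 1][j] + gap,   # gap in b
                              score[i][j - 1] + gap)   # gap in a
    return score[-1][-1]
```

    Each pairwise score depends only on its two input sequences, which is why a large all-against-all comparison can be split across threads or machines with no communication between tasks.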

  3. Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy

    PubMed Central

    Schmid, Andreas K.; Davis, Ronald W.

    2016-01-01

    DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitate the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work using scattering electrons with much lower energy has been shown to suppress beam damage on DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. Both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging. PMID:27149617

  4. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1987-10-07

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  5. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, J.H.; Keller, R.A.; Martin, J.C.; Moyzis, R.K.; Ratliff, R.L.; Shera, E.B.; Stewart, C.C.

    1990-10-09

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed. 2 figs.

  6. Method for rapid base sequencing in DNA and RNA

    DOEpatents

    Jett, James H.; Keller, Richard A.; Martin, John C.; Moyzis, Robert K.; Ratliff, Robert L.; Shera, E. Brooks; Stewart, Carleton C.

    1990-01-01

    A method is provided for the rapid base sequencing of DNA or RNA fragments wherein a single fragment of DNA or RNA is provided with identifiable bases and suspended in a moving flow stream. An exonuclease sequentially cleaves individual bases from the end of the suspended fragment. The moving flow stream maintains the cleaved bases in an orderly train for subsequent detection and identification. In a particular embodiment, individual bases forming the DNA or RNA fragments are individually tagged with a characteristic fluorescent dye. The train of bases is then excited to fluorescence with an output spectrum characteristic of the individual bases. Accordingly, the base sequence of the original DNA or RNA fragment can be reconstructed.

  7. Multiple Base Substitution Corrections in DNA Sequence Evolution

    NASA Astrophysics Data System (ADS)

    Kowalczuk, M.; Mackiewicz, P.; Szczepanik, D.; Nowicka, A.; Dudkiewicz, M.; Dudek, M. R.; Cebrat, S.

    We discuss the inability of Jukes and Cantor's one-parameter model and Kimura's two-parameter model to describe the evolution of asymmetric DNA molecules. The standard distance measure between two DNA sequences, which is the number of substitutions per site, should include the effect of multiple base substitutions separately for each type of base. Otherwise, the respective tables of substitutions cannot reconstruct the asymmetric DNA molecule with respect to the composition. Based on Kimura's neutral theory, we have derived a linear law for the correlation of the mean survival time of nucleotides under constant mutation pressure and their fraction in the genome. According to this law, corrections to Kimura's theory are discussed to describe the evolution of genomes with asymmetric nucleotide composition. We consider the particular case of the strongly asymmetric Borrelia burgdorferi genome and we discuss in detail the corrections that should be introduced into the distance measure between two DNA sequences to include multiple base substitutions.
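
    For reference, the two standard distance corrections that this abstract critiques can be computed as below. These are the textbook Jukes-Cantor formula, d = -(3/4) ln(1 - 4p/3), and Kimura two-parameter formula, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where p is the fraction of differing sites and P and Q are the fractions of transitions and transversions. The sketch assumes two equal-length, gap-free aligned sequences and does not implement the per-base corrections the authors propose.

```python
import math

PURINES = {"A", "G"}


def jukes_cantor_distance(a, b):
    """Jukes-Cantor (one-parameter) substitutions per site."""
    p = sum(x != y for x, y in zip(a, b)) / len(a)
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_2p_distance(a, b):
    """Kimura two-parameter substitutions per site."""
    n = len(a)
    transitions = transversions = 0
    for x, y in zip(a, b):
        if x == y:
            continue
        # A<->G and C<->T stay within purines or pyrimidines: transitions.
        if (x in PURINES) == (y in PURINES):
            transitions += 1
        else:
            transversions += 1
    p, q = transitions / n, transversions / n
    return -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))
```

    Both formulas treat the four bases symmetrically, which is the property the abstract identifies as inadequate for genomes with strongly asymmetric nucleotide composition.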

  8. Next Generation DNA Sequencing and the Future of Genomic Medicine

    PubMed Central

    Anderson, Matthew W.; Schrijver, Iris

    2010-01-01

    In the years since the first complete human genome sequence was reported, there has been a rapid development of technologies to facilitate high-throughput sequence analysis of DNA (termed “next-generation” sequencing). These novel approaches to DNA sequencing offer the promise of complete genomic analysis at a cost feasible for routine clinical diagnostics. However, the ability to more thoroughly interrogate genomic sequence raises a number of important issues with regard to result interpretation, laboratory workflow, data storage, and ethical considerations. This review describes the current high-throughput sequencing platforms commercially available, and compares the inherent advantages and disadvantages of each. The potential applications for clinical diagnostics are considered, as well as the need for software and analysis tools to interpret the vast amount of data generated. Finally, we discuss the clinical and ethical implications of the wealth of genetic information generated by these methods. Despite the challenges, we anticipate that the evolution and refinement of high-throughput DNA sequencing technologies will catalyze a new era of personalized medicine based on individualized genomic analysis. PMID:24710010

  9. Ancient mtDNA sequences from the First Australians revisited

    PubMed Central

    Subramanian, Sankar; Wright, Joanne L.; Endicott, Phillip; Westaway, Michael Carrington; Huynen, Leon; Parson, Walther; Millar, Craig D.; Willerslev, Eske; Lambert, David M.

    2016-01-01

    The publication in 2001 by Adcock et al. [Adcock GJ, et al. (2001) Proc Natl Acad Sci USA 98(2):537–542] in PNAS reported the recovery of short mtDNA sequences from ancient Australians, including the 42,000-y-old Mungo Man [Willandra Lakes Hominid (WLH3)]. This landmark study in human ancient DNA suggested that an early modern human mitochondrial lineage emerged in Asia and that the theory of modern human origins could no longer be considered solely through the lens of the “Out of Africa” model. To evaluate these claims, we used second generation DNA sequencing and capture methods as well as PCR-based and single-primer extension (SPEX) approaches to reexamine the same four Willandra Lakes and Kow Swamp 8 (KS8) remains studied in the work by Adcock et al. Two of the remains sampled contained no identifiable human DNA (WLH15 and WLH55), whereas the Mungo Man (WLH3) sample contained no Aboriginal Australian DNA. KS8 reveals human mitochondrial sequences that differ from the previously inferred sequence. Instead, we recover a total of five modern European contaminants from Mungo Man (WLH3). We show that the remaining sample (WLH4) contains ∼1.4% human DNA, from which we assembled two complete mitochondrial genomes. One of these was a previously unidentified Aboriginal Australian haplotype belonging to haplogroup S2 that we sequenced to a high coverage. The other was a contaminating modern European mitochondrial haplotype. Although none of the sequences that we recovered matched those reported by Adcock et al., except a contaminant, these findings show the feasibility of obtaining important information from ancient Aboriginal Australian remains. PMID:27274055

  10. Ancient mtDNA sequences from the First Australians revisited.

    PubMed

    Heupink, Tim H; Subramanian, Sankar; Wright, Joanne L; Endicott, Phillip; Westaway, Michael Carrington; Huynen, Leon; Parson, Walther; Millar, Craig D; Willerslev, Eske; Lambert, David M

    2016-06-21

    The publication in 2001 by Adcock et al. [Adcock GJ, et al. (2001) Proc Natl Acad Sci USA 98(2):537-542] in PNAS reported the recovery of short mtDNA sequences from ancient Australians, including the 42,000-y-old Mungo Man [Willandra Lakes Hominid (WLH3)]. This landmark study in human ancient DNA suggested that an early modern human mitochondrial lineage emerged in Asia and that the theory of modern human origins could no longer be considered solely through the lens of the "Out of Africa" model. To evaluate these claims, we used second generation DNA sequencing and capture methods as well as PCR-based and single-primer extension (SPEX) approaches to reexamine the same four Willandra Lakes and Kow Swamp 8 (KS8) remains studied in the work by Adcock et al. Two of the remains sampled contained no identifiable human DNA (WLH15 and WLH55), whereas the Mungo Man (WLH3) sample contained no Aboriginal Australian DNA. KS8 reveals human mitochondrial sequences that differ from the previously inferred sequence. Instead, we recover a total of five modern European contaminants from Mungo Man (WLH3). We show that the remaining sample (WLH4) contains ∼1.4% human DNA, from which we assembled two complete mitochondrial genomes. One of these was a previously unidentified Aboriginal Australian haplotype belonging to haplogroup S2 that we sequenced to a high coverage. The other was a contaminating modern European mitochondrial haplotype. Although none of the sequences that we recovered matched those reported by Adcock et al., except a contaminant, these findings show the feasibility of obtaining important information from ancient Aboriginal Australian remains.

  11. Aryl hydrocarbon-induced interactions at multiple DNA elements of diverse sequence--a multicomponent mechanism for activation of cytochrome P4501A1 (CYP1A1) gene transcription.

    PubMed Central

    Robertson, R W; Zhang, L; Pasco, D S; Fagan, J B

    1994-01-01

    In vivo footprinting experiments, augmented with gel shift and transfection analyses, suggest that activation of the CYP1A1 gene by aryl hydrocarbons may be a multicomponent process. During the first 30 minutes of exposure to aryl hydrocarbon carcinogens and environmental contaminants, in vivo footprints appear at nine distinct sites within a 281 bp region centered 950 bp upstream of the CYP1A1 transcription start site. Six of these sites are unrelated in sequence to the three xenobiotic response elements (XREs) within this region, at which the aryl hydrocarbon (AH) receptor is known to bind. These six display a variety of footprint patterns, are diverse in sequence and range in G-C content from 60 to 75%. This diversity suggests that multiple nuclear factors may be responsible for these six in vivo footprints. These observations are consistent with competition gel shift experiments showing that the nuclear factors binding at two of these sites are different from each other, as well as from the AH receptor. Gel shifts also indicate that the sequence-specific factors binding at these sites are expressed constitutively. This is consistent with a model in which in vivo footprints are induced at these six sites, not through direct activation or de novo synthesis of DNA-binding factors, but through a two-phase mechanism in which binding of the nuclear AH receptor complex to XREs facilitates the binding of constitutive factors at these sites. This facilitation could be mediated either through specific protein-protein interactions or through alterations in chromatin structure that make these sites accessible to constitutive nuclear factors. A function for the sequences at which aryl hydrocarbons induce in vivo footprints is suggested by transfection experiments showing that one of these sequences cooperates with a weak XRE to confer on a reporter gene responsiveness to aryl hydrocarbons. PMID:8202380

  12. Mitochondrial DNA sequences from a 7000-year old brain.

    PubMed Central

    Pääbo, S; Gifford, J A; Wilson, A C

    1988-01-01

    Pieces of mitochondrial DNA from a 7000-year-old human brain were amplified by the polymerase chain reaction and sequenced. Albumin and high concentrations of polymerase were required to overcome a factor in the brain extract that inhibits amplification. For this and other sources of ancient DNA, we find an extreme inverse dependence of the amplification efficiency on the length of the sequence to be amplified. This property of ancient DNA distinguishes it from modern DNA and thus provides a new criterion of authenticity for use in research on ancient DNA. The brain is from an individual recently excavated from Little Salt Spring in southwestern Florida and the anthropologically informative sequences it yielded are the first obtained from archaeologically retrieved remains. The sequences show that this ancient individual belonged to a mitochondrial lineage that is rare in the Old World and not previously known to exist among Native Americans. Our finding brings to three the number of maternal lineages known to have been involved in the prehistoric colonization of the New World. PMID:3186445

  13. Preparation of next-generation sequencing libraries from damaged DNA.

    PubMed

    Briggs, Adrian W; Heyn, Patricia

    2012-01-01

    Next-generation sequencing (NGS) has revolutionized ancient DNA research, especially when combined with high-throughput target enrichment methods. However, attaining high sequencing depth and accuracy from samples often remains problematic due to the damaged state of ancient DNA, in particular the extremely low copy number of ancient DNA and the abundance of uracil residues derived from cytosine deamination that lead to miscoding errors. It is therefore critical to use a highly efficient procedure for conversion of a raw DNA extract into an adaptor-ligated sequencing library, and equally important to reduce errors from uracil residues. We present a protocol for NGS library preparation that allows highly efficient conversion of DNA fragments into an adaptor-ligated form. The protocol incorporates an option to remove the vast majority of uracil miscoding lesions as part of the library preparation process. The procedure requires only two spin column purification steps and no gel purification or bead handling. Starting from an aliquot of DNA extract, a finished, highly amplified library can be generated in 5 h, or under 3 h if uracil removal is not required.

  14. Isolation of a sex-linked DNA sequence in cranes.

    PubMed

    Duan, W; Fuerst, P A

    2001-01-01

    A female-specific DNA fragment (CSL-W; crane sex-linked DNA on W chromosome) was cloned from female whooping cranes (Grus americana). From the nucleotide sequence of CSL-W, a set of polymerase chain reaction (PCR) primers was identified which amplify a 227-230 bp female-specific fragment from all existing crane species and some other noncrane species. A duplicated version of the DNA segment, which is found to have a larger size (231-235 bp) than CSL-W in both sexes, was also identified, and was designated CSL-NW (crane sex-linked DNA on non-W chromosome). The nucleotide similarity between the sequences of CSL-W and CSL-NW from whooping cranes was 86.3%. The CSL primers do not amplify any sequence from mammalian DNA, limiting the potential for contamination from human sources. Using the CSL primers in combination with a quick DNA extraction method allows the noninvasive identification of crane gender in less than 10 h. A test of the methodology was carried out on fully developed body feathers from 18 captive cranes and resulted in 100% successful identification.

  15. Bayesian estimation of sequence damage in ancient DNA.

    PubMed

    Ho, Simon Y W; Heupink, Tim H; Rambaut, Andrew; Shapiro, Beth

    2007-06-01

    DNA extracted from archaeological and paleontological remains is usually damaged by biochemical processes postmortem. Some of these processes lead to changes in the structure of the DNA molecule, which can result in the incorporation of incorrect nucleotides during polymerase chain reaction. These base misincorporations, or miscoding lesions, can lead to the inclusion of spurious additional mutations in ancient DNA (aDNA) data sets. This has the potential to affect the outcome of phylogenetic and population genetic analyses, including estimates of mutation rates and genetic diversity. We present a novel model, termed the delta model, which estimates the amount of damage in DNA data and accounts for its effects in a Bayesian phylogenetic framework. The ability of the delta model to estimate damage is first investigated using a simulation study. The model is then applied to 13 aDNA data sets. The amount of damage in these data sets is shown to be significant but low (about 1 damaged base per 750 nt), suggesting that precautions for limiting the influence of damaged sites, such as cloning and enzymatic treatment, are worthwhile. The results also suggest that relatively high rates of mutation previously estimated from aDNA data are not entirely an artifact of sequence damage and are likely to be due to other factors such as the persistence of transient polymorphisms. The delta model appears to be particularly useful for placing upper credibility limits on the amount of sequence damage in an alignment, and this capacity might be beneficial for future aDNA studies or for the estimation of sequencing errors in modern DNA.

  16. Real-time DNA sequencing from single polymerase molecules.

    PubMed

    Eid, John; Fehr, Adrian; Gray, Jeremy; Luong, Khai; Lyle, John; Otto, Geoff; Peluso, Paul; Rank, David; Baybayan, Primo; Bettman, Brad; Bibillo, Arkadiusz; Bjornson, Keith; Chaudhuri, Bidhan; Christians, Frederick; Cicero, Ronald; Clark, Sonya; Dalal, Ravindra; Dewinter, Alex; Dixon, John; Foquet, Mathieu; Gaertner, Alfred; Hardenbol, Paul; Heiner, Cheryl; Hester, Kevin; Holden, David; Kearns, Gregory; Kong, Xiangxu; Kuse, Ronald; Lacroix, Yves; Lin, Steven; Lundquist, Paul; Ma, Congcong; Marks, Patrick; Maxham, Mark; Murphy, Devon; Park, Insil; Pham, Thang; Phillips, Michael; Roy, Joy; Sebra, Robert; Shen, Gene; Sorenson, Jon; Tomaney, Austin; Travers, Kevin; Trulson, Mark; Vieceli, John; Wegener, Jeffrey; Wu, Dawn; Yang, Alicia; Zaccarin, Denis; Zhao, Peter; Zhong, Frank; Korlach, Jonas; Turner, Stephen

    2009-01-02

    We present single-molecule, real-time sequencing data obtained from a DNA polymerase performing uninterrupted template-directed synthesis using four distinguishable fluorescently labeled deoxyribonucleoside triphosphates (dNTPs). We detected the temporal order of their enzymatic incorporation into a growing DNA strand with zero-mode waveguide nanostructure arrays, which provide optical observation volume confinement and enable parallel, simultaneous detection of thousands of single-molecule sequencing reactions. Conjugation of fluorophores to the terminal phosphate moiety of the dNTPs allows continuous observation of DNA synthesis over thousands of bases without steric hindrance. The data report directly on polymerase dynamics, revealing distinct polymerization states and pause sites corresponding to DNA secondary structure. Sequence data were aligned with the known reference sequence to assay biophysical parameters of polymerization for each template position. Consensus sequences were generated from the single-molecule reads at 15-fold coverage, showing a median accuracy of 99.3%, with no systematic error beyond fluorophore-dependent error rates.

  17. RNA–DNA sequence differences in Saccharomyces cerevisiae

    PubMed Central

    Wang, Isabel X.; Grunseich, Christopher; Chung, Youree G.; Kwak, Hojoong; Ramrattan, Girish; Zhu, Zhengwei; Cheung, Vivian G.

    2016-01-01

    Alterations of RNA sequences and structures, such as those from editing and alternative splicing, result in two or more RNA transcripts from a DNA template. It was thought that in yeast, RNA editing only occurs in tRNAs. Here, we found that Saccharomyces cerevisiae has all 12 types of RNA–DNA sequence differences (RDDs) in the mRNA. We showed these sequence differences are propagated to proteins, as we identified peptides encoded by the RNA sequences in addition to those by the DNA sequences at RDD sites. RDDs are significantly enriched at regions with R-loops. A screen of yeast mutants showed that RDD formation is affected by mutations in genes regulating R-loops. Loss-of-function mutations in ribonuclease H, senataxin, and topoisomerase I that resolve RNA–DNA hybrids lead to increases in RDD frequency. Our results demonstrate that RDD is a conserved process that diversifies transcriptomes and proteomes and provide a mechanistic link between R-loops and RDDs. PMID:27638543

  18. Reduced-stringency DNA reassociation: sequence specific duplex formation.

    PubMed Central

    Burr, H E; Schimke, R T

    1982-01-01

    Reduced-stringency DNA reassociation conditions allow low-stability duplexes to be detected in prokaryotic, plant, fish, avian, mammalian, and primate genomes. Highly diverged families of sequences can be detected in avian, mouse, and human unique-sequence DNAs. Such a family has been described among twelve species of birds; based on species-specific melting profiles and fractionation of sequences belonging to this family, it was concluded that permissive reassociation conditions did not artifactually produce low-stability structures (1). We report S1 nuclease and optical melting experiments, and further fractionation of the diverged family, to confirm sequence-specific DNA reassociation at 50 degrees in 0.5 M phosphate buffer. PMID:6278429

  19. Identification of Bacterial Species in Kuwaiti Waters Through DNA Sequencing

    NASA Astrophysics Data System (ADS)

    Chen, K.

    2017-01-01

    To identify the bacterial diversity associated with the ecosystems of various Kuwaiti seas, bacteria were cultured and isolated from 3 water samples. Because of difficulties in culturing and isolating fecal coliforms on the selective agar plates, bacterial isolates from marine agar plates were selected for molecular identification. 16S rRNA genes were successfully amplified from the genomes of the selected isolates using universal eubacterial 16S rRNA primers. The resulting amplification products were subjected to automated DNA sequencing. The partial 16S rDNA sequences obtained were compared directly with sequences in the NCBI database using BLAST, as well as with the sequences available from the Ribosomal Database Project (RDP).

  20. Bacterial diversity assessment in soil of an active Brazilian copper mine using high-throughput sequencing of 16S rDNA amplicons.

    PubMed

    Rodrigues, Viviane D; Torres, Tatiana T; Ottoboni, Laura M M

    2014-11-01

    Mining activities pose severe environmental risks worldwide, generating extreme pH conditions and high concentrations of heavy metals, which can have major impacts on the survival of organisms. In this work, pyrosequencing of the V3 region of the 16S rDNA was used to analyze the bacterial communities in soil samples from a Brazilian copper mine. For the analysis, soil samples were collected from the slopes (geotechnical structures) and the surrounding drainage of the Sossego mine (comprising the Sossego and Sequeirinho deposits). The results revealed complex bacterial diversity, and there was no influence of deposit geographic location on the composition of the communities. However, the environment type played an important role in bacterial community divergence; the composition and frequency of OTUs in the slope samples were different from those of the surrounding drainage samples, and Acidobacteria, Chloroflexi, Firmicutes, and Gammaproteobacteria were responsible for the observed difference. Chemical analysis indicated that both types of sample presented a high metal content, while the amounts of organic matter and water were higher in the surrounding drainage samples. Non-metric multidimensional scaling (N-MDS) analysis identified organic matter and water as important distinguishing factors between the bacterial communities from the two types of mine environment. Although habitat-specific OTUs were found in both environments, they were more abundant in the surrounding drainage samples (around 50 %), and contributed to the higher bacterial diversity found in this habitat. The slope samples were dominated by a smaller number of phyla, especially Firmicutes. The bacterial communities from the slope and surrounding drainage samples were different in structure and composition, and the organic matter and water present in these environments contributed to the observed differences.

  1. An optimization approach and its application to compare DNA sequences

    NASA Astrophysics Data System (ADS)

    Liu, Liwei; Li, Chao; Bai, Fenglan; Zhao, Qi; Wang, Ying

    2015-02-01

    Studying the evolutionary relationships between biological sequences by comparing and analyzing gene sequences has become one of the main tasks in bioinformatics research. Many valid methods have been applied to DNA sequence alignment. In this paper, we propose a novel method based on the Lempel-Ziv (LZ) complexity to compare biological sequences. Moreover, we introduce a new distance measure and make use of the corresponding similarity matrix to construct phylogenetic trees without multiple sequence alignment. Further, we construct phylogenetic trees for 24 species of Eutherian mammals and 48 countries of Hepatitis E virus (HEV) by an optimization approach. The results indicate that this new method improves the efficiency of sequence comparison and successfully constructs phylogenies.
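
    As a rough illustration of alignment-free comparison with LZ complexity, the sketch below counts the components of the Lempel-Ziv (1976) exhaustive history of a string and derives a simple normalized distance from it. The distance shown is a commonly used normalized form chosen here as an assumption; it is not the new measure or the optimization approach proposed in the paper, and the function names are illustrative.

```python
def lz_complexity(s):
    """Number of components in the Lempel-Ziv (1976) exhaustive history of s.

    Each component is the shortest substring starting at the current
    position that cannot be copied from the text seen so far.
    """
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Extend the candidate while it can still be copied from the
        # preceding text (overlap with the candidate itself is allowed).
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count

def lz_distance(s, t):
    """Normalized LZ-based distance between two sequences (assumed form).

    Measures how many extra components are needed to describe one
    sequence after the other has already been seen.
    """
    cs, ct = lz_complexity(s), lz_complexity(t)
    extra = max(lz_complexity(s + t) - cs, lz_complexity(t + s) - ct)
    return extra / max(cs, ct)

a = "ACGT" * 5
b = "TTGGCCAA" * 3
print(lz_complexity("0001101001000101"))  # parses as 0|001|10|100|1000|101
print(lz_distance(a, a), lz_distance(a, b))
```

    A pairwise matrix of such distances can then be passed to any distance-based tree-building method, which is why no multiple sequence alignment is needed.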

  2. Contrasting DNA sequence organisation patterns in sauropsidian genomes.

    PubMed

    Epplen, J T; Diedrich, U; Wagenmann, M; Schmidtke, J; Engel, W

    1979-11-01

    The genomic DNA organisation patterns of four sauropsidian species, namely Python reticulatus, Caiman crocodilus, Terrapene carolina triunguis and Columba livia domestica, were investigated by reassociation of short and long DNA fragments, by hyperchromicity measurements of reannealed fragments and by length estimations of S1-nuclease resistant repetitive duplexes. While the genomic DNA of the three reptilian species shows a short period interspersion pattern, the genome of the avian species is organised in a long period interspersion pattern apparently typical for birds. These findings are discussed in view of the close phylogenetic relationships of birds and reptiles, and also with regard to a possible relationship between the extent of sequence interspersion and genome size.

  3. Base-sequence-dependent sliding of proteins on DNA.

    PubMed

    Barbi, M; Place, C; Popkov, V; Salerno, M

    2004-10-01

    The possibility that the sliding motion of proteins on DNA is influenced by the base sequence through a base-pair reading interaction is considered. Referring to the case of the T7 RNA polymerase, we show that the protein should follow a noise-influenced, sequence-dependent motion which deviates from the standard random walk usually assumed. The general validity and the implications of the results are discussed.

  4. Feature Extraction From DNA Sequences by Multifractal Analysis

    DTIC Science & Technology

    2007-11-02

    genome may lead to an understanding of the genome and to the understanding of life. Recently a draft sequence of the human genome ...which covers 96% of the entire human genome containing base pairs, has been published by the Human Genome Project (HGP) and Celera Genomics. However...time series model based on the global structure of the complete genome, and showed long-range correlations in the bacteria DNA sequences. Although

  5. DNA Sequence Analysis of SLC26A5, Encoding Prestin, in a Patient-Control Cohort: Identification of Fourteen Novel DNA Sequence Variations

    PubMed Central

    Minor, Jacob S.; Tang, Hsiao-Yuan; Pereira, Fred A.; Alford, Raye Lynn

    2009-01-01

    Background: Prestin, encoded by the gene SLC26A5, is a transmembrane protein of the cochlear outer hair cell (OHC). Prestin is required for the somatic electromotile activity of OHCs; in mice lacking prestin, this activity is absent and hearing is severely impaired. In humans, the role of sequence variations in SLC26A5 in hearing loss is less clear. Although prestin is expected to be required for functional human OHCs, the clinical significance of reported putative mutant alleles in humans is uncertain. Methodology/Principal Findings: To explore the hypothesis that SLC26A5 may act as a modifier gene, affecting the severity of hearing loss caused by an independent etiology, a patient-control cohort was screened for DNA sequence variations in SLC26A5 using sequencing and allele-specific methods. Patients in this study carried known pathogenic or controversial sequence variations in GJB2, encoding Connexin 26, or confirmed or suspected sequence variations in SLC26A5; controls included four ethnic populations. Twenty-three different DNA sequence variations in SLC26A5, 14 of which are novel, were observed: 4 novel sequence variations were found exclusively among patients; 7 novel sequence variations were found exclusively among controls; and 12 sequence variations, 3 of which are novel, were found in both patients and controls. Twenty-one of the 23 DNA sequence variations were located in non-coding regions of SLC26A5. Two coding sequence variations, both novel, were observed only in patients and predict a silent change, p.S434S, and an amino acid substitution, p.I663V. In silico analysis of the p.I663V amino acid variation suggested this variant might be benign. Using Fisher's exact test, no statistically significant difference was observed between patients and controls in the frequency of the identified DNA sequence variations. Haplotype analysis using HaploView 4.0 software revealed the same predominant haplotype in patients and controls and derived haplotype blocks

  6. Alignments of DNA and protein sequences containing frameshift errors.

    PubMed

    Guan, X; Uberbacher, E C

    1996-02-01

    Molecular sequences, like all experimental data, are subject to error. Many current DNA sequencing protocols have very significant error rates and often generate artefactual insertions and deletions of bases (indels) which corrupt the translation of sequences and compromise the detection of protein homologies. The impact of these errors on the utility of molecular sequence data is dependent on the analytic technique used to interpret the data. In the presence of frameshift errors, standard algorithms using six-frame translation can miss important homologies because only subfragments of the correct translation are available in any given frame. We present a new algorithm which can detect and correct frameshift errors in DNA sequences during comparison of translated sequences with protein sequences in the databases. This algorithm can recognize homologous proteins sharing 30% identity even in the presence of a 7% frameshift error rate. Our algorithm uses dynamic programming, producing a guaranteed optimal alignment in the presence of frameshifts, and has a sensitivity equivalent to Smith-Waterman. The computational efficiency of the algorithm is O(nm) where n and m are the sizes of two sequences being compared. The algorithm does not rely on prior knowledge or heuristic rules and performs significantly better than any previously reported method.
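
    The problem the abstract describes, that after a frameshift only subfragments of the correct translation are available in any one frame, can be seen in a small sketch. The code below translates a made-up DNA string in the three forward frames using the standard genetic code; it only illustrates the problem and is not the authors' dynamic-programming algorithm. The example sequence and function names are invented for illustration.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna, frame=0):
    """Translate dna starting at offset frame (0, 1 or 2); 'X' marks unknown codons."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )

def three_frames(dna):
    """Translations of the three forward reading frames."""
    return [translate(dna, f) for f in range(3)]

original = "ATGGCCAAAGGTTTC"                         # hypothetical coding sequence
with_insertion = original[:6] + "T" + original[6:]  # one spurious inserted base

print(translate(original))           # intact protein
print(three_frames(with_insertion))  # protein now split across two frames
```

    In this toy case the start of the protein survives in frame 0 and the rest reappears in frame 1, so an aligner restricted to a single frame sees only part of the homology.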

  7. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    NASA Astrophysics Data System (ADS)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie; Liévin, Jacques; Körzdörfer, Thomas; Rotaru, Alexandru; Gothelf, Kurt V.; Besenbacher, Flemming; Bald, Ilko

    2014-12-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that DNA strand breakage induced by low-energy electrons (18 eV) also depends on the nucleotide sequence. To determine the absolute cross sections for electron-induced single strand breaks in specific 13-mer oligonucleotides, we used atomic force microscopy analysis of DNA origami-based DNA nanoarrays. We investigated the DNA sequences 5'-TT(XYX)₃TT with X = A, G, C and Y = T, BrU (5-bromouracil) and found absolute strand break cross sections between 2.66 × 10⁻¹⁴ cm² and 7.06 × 10⁻¹⁴ cm². The highest cross section was found for 5'-TT(ATA)₃TT and 5'-TT(ABrUA)₃TT, respectively. BrU is a radiosensitizer that has been discussed for use in cancer radiation therapy. The replacement of T by BrU in the investigated DNA sequences leads to a slight increase of the absolute strand break cross sections, resulting in sequence-dependent enhancement factors between 1.14 and 1.66. Nevertheless, the variation of strand break cross sections due to the specific nucleotide sequence is considerably higher. Thus, the present results suggest the development of targeted radiosensitizers for cancer radiation therapy.

  8. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    PubMed Central

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie; Liévin, Jacques; Körzdörfer, Thomas; Rotaru, Alexandru; Gothelf, Kurt V.; Besenbacher, Flemming; Bald, Ilko

    2014-01-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that DNA strand breakage induced by low-energy electrons (18 eV) also depends on the nucleotide sequence. To determine the absolute cross sections for electron-induced single strand breaks in specific 13-mer oligonucleotides, we used atomic force microscopy analysis of DNA origami-based DNA nanoarrays. We investigated the DNA sequences 5′-TT(XYX)₃TT with X = A, G, C and Y = T, BrU (5-bromouracil) and found absolute strand break cross sections between 2.66 × 10⁻¹⁴ cm² and 7.06 × 10⁻¹⁴ cm². The highest cross section was found for 5′-TT(ATA)₃TT and 5′-TT(ABrUA)₃TT, respectively. BrU is a radiosensitizer that has been discussed for use in cancer radiation therapy. The replacement of T by BrU in the investigated DNA sequences leads to a slight increase of the absolute strand break cross sections, resulting in sequence-dependent enhancement factors between 1.14 and 1.66. Nevertheless, the variation of strand break cross sections due to the specific nucleotide sequence is considerably higher. Thus, the present results suggest the development of targeted radiosensitizers for cancer radiation therapy. PMID:25487346

  9. Investigation of a Sybr-Green-Based Method to Validate DNA Sequences for DNA Computing

    DTIC Science & Technology

    2005-05-01

    stranded DNA. We previously demonstrated that this technique can be exploited to distinguish between stably-hybridized Watson-Crick duplexes and...et al., 2004) we described the difference between the canonical Watson-Crick base pairs of DNA and the usually less stable mismatches that can also...computing, cross-hybridized duplexes represent errors. It is therefore crucial that DNA sequences be designed so that the formation of a Watson-Crick

  10. Mitochondrial DNA sequence evolution in the Arctoidea.

    PubMed Central

    Zhang, Y P; Ryder, O A

    1993-01-01

    Some taxa in the superfamily Arctoidea, such as the giant panda and the lesser panda, have presented puzzles to taxonomists. In the present study, approximately 397 bases of the cytochrome b gene, 364 bases of the 12S rRNA gene, and 74 bases of the tRNA(Thr) and tRNA(Pro) genes from the giant panda, lesser panda, kinkajou, raccoon, coatimundi, and all species of the Ursidae were sequenced. The high transition/transversion ratios in cytochrome b and RNA genes prior to saturation suggest that the presumed transition bias may represent a trend for some mammalian lineages rather than strictly a primate phenomenon. Transversions in the 12S rRNA gene accumulate in arctoids at about half the rate reported for artiodactyls. Different arctoid lineages evolve at different rates: the kinkajou, a procyonid, evolves the fastest, 1.7-1.9 times faster than the slowest lineage that comprises the spectacled and polar bears. Generation-time effect can only partially explain the different rates of nucleotide substitution in arctoids. Our results based on parsimony analysis show that the giant panda is more closely related to bears than to the lesser panda; the lesser panda is neither closely related to bears nor to the New World procyonids. The kinkajou, raccoon, and coatimundi diverged from each other very early, even though they group together. The polar bear is closely related to the spectacled bear, and they began to diverge from a common mitochondrial ancestor approximately 2 million years ago. Relationships of the remaining five bear species are derived. PMID:8415740
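
    The transition/transversion ratio discussed in this abstract comes from classifying each difference between two aligned sequences. A minimal sketch of that classification follows; the input strings are invented, not data from the study, and the function name is illustrative.

```python
PURINES = {"A", "G"}  # the pyrimidines are C and T

def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences.

    A transition swaps purine for purine (A<->G) or pyrimidine for
    pyrimidine (C<->T); any other base change is a transversion.
    Positions with gaps or ambiguous symbols are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        if (x in PURINES) == (y in PURINES):
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions

ts, tv = count_ts_tv("AACCGGTT", "GACTGCTA")
print(ts, tv)
```

    Because transitions saturate sooner than transversions, the ratio is most informative between closely related sequences, which is why the abstract qualifies it as "prior to saturation".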

  11. Compilation and analysis of Escherichia coli promoter DNA sequences.

    PubMed Central

    Hawley, D K; McClure, W R

    1983-01-01

    The DNA sequences of 168 promoter regions (-50 to +10) for Escherichia coli RNA polymerase were compiled. The complete listing was divided into two groups depending upon whether or not the promoter had been defined by genetic (promoter mutations) or biochemical (5' end determination) criteria. A consensus promoter sequence based on homologies among 112 well-defined promoters was determined that was in substantial agreement with previous compilations. In addition, we have tabulated 98 promoter mutations. Nearly all of the altered base pairs in the mutants conform to the following general rule: down-mutations decrease homology and up-mutations increase homology to the consensus sequence. PMID:6344016
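
    The general rule stated above (down-mutations decrease, and up-mutations increase, homology to the consensus) can be sketched as a simple match count. The hexamers TTGACA (-35) and TATAAT (-10) are the widely cited E. coli consensus elements and are an assumption here, since the abstract does not list them; the function names and the example sites are hypothetical.

```python
# Assumed consensus hexamers (not given in the abstract itself).
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"


def homology(site, consensus):
    """Number of positions at which `site` matches `consensus`."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(1 for s, c in zip(site.upper(), consensus) if s == c)


def classify_mutation(wild_type, mutant, consensus):
    """Predict 'up', 'down', or 'neutral' from the change in homology."""
    delta = homology(mutant, consensus) - homology(wild_type, consensus)
    if delta > 0:
        return "up"
    if delta < 0:
        return "down"
    return "neutral"


# Hypothetical -10 region gaining two consensus matches.
print(classify_mutation("TATGTT", "TATAAT", CONSENSUS_10))
# Hypothetical -35 region losing one consensus match.
print(classify_mutation("TTGACA", "TTCACA", CONSENSUS_35))
```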

  12. Effect of Noise on DNA Sequencing via Transverse Electronic Transport

    PubMed Central

    Krems, Matt; Zwolak, Michael; Pershin, Yuriy V.; Di Ventra, Massimiliano

    2009-01-01

    Previous theoretical studies have shown that measuring the transverse current across DNA strands while they translocate through a nanopore or channel may provide a statistically distinguishable signature of the DNA bases, and may thus allow for rapid DNA sequencing. However, fluctuations of the environment, such as ionic and DNA motion, introduce important scattering processes that may affect the viability of this approach to sequencing. To understand this issue, we have analyzed a simple model that captures the role of this complex environment in electronic dephasing and its ability to remove charge carriers from current-carrying states. We find that these effects do not strongly influence the current distributions due to the off-resonant nature of tunneling through the nucleotides—a result we expect to be a common feature of transport in molecular junctions. In particular, only large scattering strengths, as compared to the energetic gap between the molecular states and the Fermi level, significantly alter the form of the current distributions. Since this gap itself is quite large, the current distributions remain protected from this type of noise, further supporting the possibility of using transverse electronic transport measurements for DNA sequencing. PMID:19804730

  13. Light-generated oligonucleotide arrays for rapid DNA sequence analysis.

    PubMed Central

    Pease, A C; Solas, D; Sullivan, E J; Cronin, M T; Holmes, C P; Fodor, S P

    1994-01-01

    In many areas of molecular biology there is a need to rapidly extract and analyze genetic information; however, current technologies for DNA sequence analysis are slow and labor intensive. We report here how modern photolithographic techniques can be used to facilitate sequence analysis by generating miniaturized arrays of densely packed oligonucleotide probes. These probe arrays, or DNA chips, can then be applied to parallel DNA hybridization analysis, directly yielding sequence information. In a preliminary experiment, a 1.28 x 1.28 cm array of 256 different octanucleotides was produced in 16 chemical reaction cycles, requiring 4 hr to complete. The hybridization pattern of fluorescently labeled oligonucleotide targets was then detected by epifluorescence microscopy. The fluorescence signals from complementary probes were 5-35 times stronger than those with single or double base-pair hybridization mismatches, demonstrating specificity in the identification of complementary sequences. This method should prove to be a powerful tool for rapid investigations in human genetics and diagnostics, pathogen detection, and DNA molecular recognition. Images PMID:8197176

  14. Generalized Levy-walk model for DNA nucleotide sequences

    NASA Technical Reports Server (NTRS)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Simons, M.; Stanley, H. E.

    1993-01-01

    We propose a generalized Levy walk to model fractal landscapes observed in noncoding DNA sequences. We find that this model provides a very close approximation to the empirical data and explains a number of statistical properties of genomic DNA sequences such as the distribution of strand-biased regions (those with an excess of one type of nucleotide) as well as local changes in the slope of the correlation exponent alpha. The generalized Levy-walk model simultaneously accounts for the long-range correlations in noncoding DNA sequences and for the apparently paradoxical finding of long subregions of biased random walks (length l_j) within these correlated sequences. In the generalized Levy-walk model, the l_j are chosen from a power-law distribution P(l_j) ~ l_j^(-mu). The correlation exponent alpha is related to mu through alpha = 2 - mu/2 if 2 < mu < 3. The model is consistent with the finding of "repetitive elements" of variable length interspersed within noncoding DNA.
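
    A rough sketch of the construction described above: segment lengths are drawn from a power law with exponent mu, each segment is a biased random walk, and alpha follows from alpha = 2 - mu/2. The sampling scheme (inverse-transform sampling of a continuous power law), the bias value, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import random


def alpha_from_mu(mu):
    """Correlation exponent from the relation alpha = 2 - mu/2 (2 < mu < 3)."""
    if not 2 < mu < 3:
        raise ValueError("relation is stated only for 2 < mu < 3")
    return 2 - mu / 2


def sample_segment_length(mu, rng, l_min=1.0):
    """Draw a length from P(l) ~ l^(-mu) by inverse-transform sampling."""
    u = rng.random()  # uniform in [0, 1)
    return max(1, int(l_min * (1.0 - u) ** (-1.0 / (mu - 1.0))))


def levy_walk_steps(n, mu, bias=0.1, seed=0):
    """Return n steps (+1 or -1) built from biased segments of random length."""
    rng = random.Random(seed)
    steps = []
    while len(steps) < n:
        length = sample_segment_length(mu, rng)
        # Each segment gets its own randomly chosen bias direction.
        p_up = 0.5 + bias * rng.choice((-1, 1))
        for _ in range(min(length, n - len(steps))):
            steps.append(1 if rng.random() < p_up else -1)
    return steps


steps = levy_walk_steps(1000, mu=2.5)
print(alpha_from_mu(2.5), sum(steps))
```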

  15. DNA methylation mapping by tag-modified bisulfite genomic sequencing.

    PubMed

    Han, Weiguo; Cauchi, Stephane; Herman, James G; Spivack, Simon D

    2006-08-01

    A tag-modified bisulfite genomic sequencing (tBGS) method employing direct cycle sequencing of polymerase chain reaction (PCR) products at kilobase scale, without conventional DNA fragment cloning, was developed for simplified evaluation of DNA methylation sites. The method entails subjecting bisulfite-modified genomic DNA to a second-round PCR amplification employing GC-tagged primers. Qualitative results from tBGS closely correlated with those from conventional BGS (R=0.935, p=0.002). In application, the intertissue and interindividual CpG methylation differences in promoter sequence for two genes, CYP1B1 and GSTP1, were then explored across four human tissue types (peripheral blood cells, exfoliated buccal cells, paired nontumor-tumor lung tissues), and two lung cell types in culture (normal NHBE and malignant A549). Predominantly conserved methylation maps for the two gene promoters were apparent across donors and tissues. At any given CpG site, variation in the degree of methylation could be determined by the relative height of C and T peaks in the sequencing trace. Methylation maps for the GSTP1 promoter diverged between NHBE (unmethylated) and A549 (completely methylated) cells in a previously unexplored upstream region, correlating with a 2.7-fold difference in GSTP1 mRNA expression (p<0.01). The tBGS method simplifies detailed methylation scanning of kilobase-scale genomic DNA, facilitating more ambitious genomic methylation mapping studies.

  16. Decoding long nanopore sequencing reads of natural DNA.

    PubMed

    Laszlo, Andrew H; Derrington, Ian M; Ross, Brian C; Brinkerhoff, Henry; Adey, Andrew; Nova, Ian C; Craig, Jonathan M; Langford, Kyle W; Samson, Jenny Mae; Daza, Riza; Doering, Kenji; Shendure, Jay; Gundlach, Jens H

    2014-08-01

    Nanopore sequencing of DNA is a single-molecule technique that may achieve long reads, low cost and high speed with minimal sample preparation and instrumentation. Here, we build on recent progress with respect to nanopore resolution and DNA control to interpret the procession of ion current levels observed during the translocation of DNA through the pore MspA. As approximately four nucleotides affect the ion current of each level, we measured the ion current corresponding to all 256 four-nucleotide combinations (quadromers). This quadromer map is highly predictive of ion current levels of previously unmeasured sequences derived from the bacteriophage phi X 174 genome. Furthermore, we show nanopore sequencing reads of phi X 174 up to 4,500 bases in length, which can be unambiguously aligned to the phi X 174 reference genome, and demonstrate proof-of-concept utility with respect to hybrid genome assembly and polymorphism detection. This work provides a foundation for nanopore sequencing of long, natural DNA strands.
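
    The count of 256 quadromers follows from 4^4 combinations of four nucleotides. The sketch below enumerates them and shows how a lookup table could turn a known sequence into a series of predicted current levels. The map values used here are placeholders, not measured currents, and the function names are hypothetical.

```python
from itertools import product


def all_quadromers():
    """Every four-nucleotide combination, in lexicographic order (4**4 = 256)."""
    return ["".join(p) for p in product("ACGT", repeat=4)]


def quadromers_in(seq):
    """Overlapping four-nucleotide windows along a sequence."""
    return [seq[i:i + 4] for i in range(len(seq) - 3)]


def predict_levels(seq, quadromer_map):
    """Look up one predicted level per overlapping quadromer."""
    return [quadromer_map[q] for q in quadromers_in(seq)]


# Placeholder map: each quadromer's index stands in for a measured current.
placeholder_map = {q: float(i) for i, q in enumerate(all_quadromers())}

print(len(all_quadromers()))
print(predict_levels("AAAAC", placeholder_map))
```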

  17. Derivatized versions of ligase enzymes for constructing DNA sequences

    DOEpatents

    Mariella, Jr., Raymond P.; Christian, Allen T.; Tucker, James D.; Dzenitis, John M.; Papavasiliou, Alexandros P.

    2006-08-15

    A method of making very long, double-stranded synthetic poly-nucleotides. A multiplicity of short oligonucleotides is provided. The short oligonucleotides are sequentially hybridized to each other. Enzymatic ligation of the oligonucleotides provides a contiguous piece of PCR-ready DNA of predetermined sequence.

  18. The Replication Focus Targeting Sequence (RFTS) Domain Is a DNA-competitive Inhibitor of Dnmt1

    SciTech Connect

    Syeda, Farisa; Fagan, Rebecca L.; Wean, Matthew; Avvakumov, George V.; Walker, John R.; Xue, Sheng; Dhe-Paganon, Sirano; Brenner, Charles

    2015-11-30

    Dnmt1 (DNA methyltransferase 1) is the principal enzyme responsible for maintenance of cytosine methylation at CpG dinucleotides in the mammalian genome. The N-terminal replication focus targeting sequence (RFTS) domain of Dnmt1 has been implicated in subcellular localization, protein association, and catalytic function. However, progress in understanding its function has been limited by the lack of assays for and a structure of this domain. Here, we show that the naked DNA- and polynucleosome-binding activities of Dnmt1 are inhibited by the RFTS domain, which functions by virtue of binding the catalytic domain to the exclusion of DNA. Kinetic analysis with a fluorogenic DNA substrate established the RFTS domain as a 600-fold inhibitor of Dnmt1 enzymatic activity. The crystal structure of the RFTS domain reveals a novel fold and supports a mechanism in which an RFTS-targeted Dnmt1-binding protein, such as Uhrf1, may activate Dnmt1 for DNA binding.

  19. Exome Sequencing of Cell-Free DNA from Metastatic Cancer Patients Identifies Clinically Actionable Mutations Distinct from Primary Disease

    PubMed Central

    Butler, Timothy M.; Johnson-Camacho, Katherine; Peto, Myron; Wang, Nicholas J.; Macey, Tara A.; Korkola, James E.; Koppie, Theresa M.; Corless, Christopher L.; Gray, Joe W.; Spellman, Paul T.

    2015-01-01

    The identification of the molecular drivers of cancer by sequencing is the backbone of precision medicine and the basis of personalized therapy; however, biopsies of primary tumors provide only a snapshot of the evolution of the disease and may miss potential therapeutic targets, especially in the metastatic setting. A liquid biopsy, in the form of cell-free DNA (cfDNA) sequencing, has the potential to capture the inter- and intra-tumoral heterogeneity present in metastatic disease, and, through serial blood draws, track the evolution of the tumor genome. In order to determine the clinical utility of cfDNA sequencing we performed whole-exome sequencing on cfDNA and tumor DNA from two patients with metastatic disease; only minor modifications to our sequencing and analysis pipelines were required for sequencing and mutation calling of cfDNA. The first patient had metastatic sarcoma and 47 of 48 mutations present in the primary tumor were also found in the cell-free DNA. The second patient had metastatic breast cancer and sequencing identified an ESR1 mutation in the cfDNA and metastatic site, but not in the primary tumor. This likely explains tumor progression on Anastrozole. Significant heterogeneity between the primary and metastatic tumors, with cfDNA reflecting the metastases, suggested separation from the primary lesion early in tumor evolution. This is best illustrated by an activating PIK3CA mutation (H1047R) which was clonal in the primary tumor, but completely absent from either the metastasis or cfDNA. Here we show that cfDNA sequencing supplies clinically actionable information with minimal risks compared to metastatic biopsies. This study demonstrates the utility of whole-exome sequencing of cell-free DNA from patients with metastatic disease. cfDNA sequencing identified an ESR1 mutation, potentially explaining a patient’s resistance to aromatase inhibition, and gave insight into how metastatic lesions differ from the primary tumor. PMID:26317216

  20. Probing the linearity and nonlinearity in DNA sequences

    NASA Astrophysics Data System (ADS)

    Tsonis, Anastasios A.; Heller, Fred L.; Tsonis, Panagiotis A.

    2002-09-01

    In this paper, we apply the principles of information theory that relate to the definition of nonlinear predictability, which is a measure that describes both the linear and nonlinear components of a system. By comparing this measure to a measure of linear predictability, one can assess whether a given system has a strong linear or a strong nonlinear component. This provides insights as to whether the system should be modeled by a nonlinear or a linear model. We apply these ideas to DNA sequences. Our results, which extend previous results on this issue, indicate that all DNA sequences (coding and noncoding) exhibit strong nonlinear structure. At the same time the results provide insights to understand DNA structure and possible clues about evolutionary mechanisms.

  1. Electronic density of states in sequence dependent DNA molecules

    NASA Astrophysics Data System (ADS)

    de Oliveira, B. P. W.; Albuquerque, E. L.; Vasconcelos, M. S.

    2006-09-01

    We report in this work a numerical study of the electronic density of states (DOS) in π-stacked arrays of DNA single-strand segments made up from the nucleotides guanine G, adenine A, cytosine C and thymine T, forming Rudin-Shapiro (RS) as well as Fibonacci (FB) polyGC quasiperiodic sequences. Both structures are constructed starting from a G nucleotide as seed and following their respective inflation rules. Our theoretical method uses Dyson's equation together with a transfer-matrix treatment, within an electronic tight-binding Hamiltonian model, suitable to describe the DNA segments modelled by the quasiperiodic chains. We compared the DOS spectra found for the quasiperiodic structures to those using a sequence of natural DNA, as part of the human chromosome Ch22, with a remarkable concordance, as far as the RS structure is concerned. The electronic spectrum shows several peaks, corresponding to localized states, as well as a striking self-similar aspect.
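
    Growing a quasiperiodic chain from a G seed by repeated substitution can be sketched as follows. The abstract does not state the inflation rules; the ones below (Fibonacci: G to GC, C to G; Rudin-Shapiro: G to GC, C to GA, A to TC, T to TA) are the forms commonly quoted for such chains and should be read as an assumption.

```python
# Assumed inflation rules; the abstract itself does not list them.
FIBONACCI_RULES = {"G": "GC", "C": "G"}
RUDIN_SHAPIRO_RULES = {"G": "GC", "C": "GA", "A": "TC", "T": "TA"}


def inflate(seed, rules, generations):
    """Apply the substitution rules to every symbol, `generations` times."""
    seq = seed
    for _ in range(generations):
        seq = "".join(rules[base] for base in seq)
    return seq


for n in range(4):
    print(n, inflate("G", FIBONACCI_RULES, n), inflate("G", RUDIN_SHAPIRO_RULES, n))
```

    Under these rules the Fibonacci chain lengths follow the Fibonacci numbers, while the Rudin-Shapiro chain doubles in length at each generation.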

  2. Sequence heterogeneity accelerates protein search for targets on DNA

    NASA Astrophysics Data System (ADS)

    Shvets, Alexey A.; Kolomeisky, Anatoly B.

    2015-12-01

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  3. Sequence heterogeneity accelerates protein search for targets on DNA

    SciTech Connect

    Shvets, Alexey A.; Kolomeisky, Anatoly B.

    2015-12-28

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  4. A blind testing design for authenticating ancient DNA sequences.

    PubMed

    Yang, H; Golenberg, E M; Shoshani, J

    1997-04-01

    Reproducibility is a serious concern among researchers of ancient DNA. We designed a blind testing procedure to evaluate laboratory accuracy and authenticity of ancient DNA obtained from closely related extant and extinct species. Soft tissue and bones of fossil and contemporary museum proboscideans were collected and identified based on morphology by one researcher, and other researchers carried out DNA testing on the samples, which were assigned anonymous numbers. DNA extracted using three principal isolation methods served as template in PCR amplifications of a segment of the cytochrome b gene (mitochondrial genome), and the PCR product was directly sequenced and analyzed. The results show that such a blind testing design performed in one laboratory, when coupled with phylogenetic analysis, can nonarbitrarily test the consistency and reliability of ancient DNA results. Such reproducible results obtained from the blind testing can increase confidence in the authenticity of ancient sequences obtained from postmortem specimens and avoid bias in phylogenetic analysis. A blind testing design may be applicable as an alternative to confirm ancient DNA results in one laboratory when independent testing by two laboratories is not available.

  5. Reiterated DNA sequences in Rhizobium and Agrobacterium spp.

    PubMed Central

    Flores, M; González, V; Brom, S; Martínez, E; Piñero, D; Romero, D; Dávila, G; Palacios, R

    1987-01-01

    Repeated DNA sequences are a general characteristic of eucaryotic genomes. Although several examples of DNA reiteration have been found in procaryotic organisms, only in the case of the archaebacteria Halobacterium halobium and Halobacterium volcanii [C. Sapienza and W. F. Doolittle, Nature (London) 295:384-389, 1982], has DNA reiteration been reported as a common genomic feature. The genomes of two Rhizobium phaseoli strains, one Rhizobium meliloti strain, and one Agrobacterium tumefaciens strain were analyzed for the presence of repetitive DNA. Rhizobium and Agrobacterium spp. are closely related soil bacteria that interact with plants and that belong to the taxonomical family Rhizobiaceae. Rhizobium species establish a nitrogen-fixing symbiosis in the roots of legumes, whereas Agrobacterium species is a pathogen in different plants. The four strains revealed a large number of repeated DNA sequences. The family size was usually small, from 2 to 5 elements, but some presented more than 10 elements. Rhizobium and Agrobacterium spp. contain large plasmids in addition to the chromosomes. Analysis of the two Rhizobium strains indicated that DNA reiteration is not confined to the chromosome or to some plasmids but is a property of the whole genome. Images PMID:3450286

  6. Applying machine learning techniques to DNA sequence analysis

    SciTech Connect

    Shavlik, J.W.

    1992-01-01

    We are developing a machine learning system that modifies existing knowledge about specific types of biological sequences. It does this by considering sample members and nonmembers of the sequence motif being learned. Using this information (which we call a "domain theory"), our learning algorithm produces a more accurate representation of the knowledge needed to categorize future sequences. Specifically, the KBANN algorithm maps inference rules, such as consensus sequences, into a neural (connectionist) network. Neural network training techniques then use the training examples to refine these inference rules. We have been applying this approach to several problems in DNA sequence analysis and have also been extending the capabilities of our learning system along several dimensions.

  7. Applying machine learning techniques to DNA sequence analysis

    SciTech Connect

    Shavlik, J.W. (Dept. of Computer Sciences); Noordewier, M.O. (Dept. of Computer Science)

    1992-01-01

    We are primarily developing a machine learning (ML) system that modifies existing knowledge about specific types of biological sequences. It does this by considering sample members and nonmembers of the sequence motif being learned. Using this information, our learning algorithm produces a more accurate representation of the knowledge needed to categorize future sequences. Specifically, our KBANN algorithm maps inference rules about a given recognition task into a neural network. Neural network training techniques then use the training examples to refine these inference rules. We call these rules a domain theory, following the convention in the machine learning community. We have been applying this approach to several problems in DNA sequence analysis. In addition, we have been extending the capabilities of our learning system along several dimensions. We have also been investigating parallel algorithms that perform sequence alignments in the presence of frameshift errors.
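
    One way a conjunctive inference rule might be mapped onto a single network unit is sketched below. The abstract gives no details of the mapping, so the weight value, the bias formula, and the example rule are all assumptions chosen so that the unit fires only when every antecedent holds; training would then adjust these initial weights.

```python
import math

OMEGA = 4.0  # assumed initial weight for each antecedent of a rule


def rule_to_unit(antecedents):
    """Turn a conjunctive rule, given as (position, base) pairs, into a unit.

    The bias is set so the net input is +OMEGA/2 when all antecedents are
    satisfied and at most -OMEGA/2 when any one of them fails.
    """
    weights = {a: OMEGA for a in antecedents}
    bias = -(len(weights) - 0.5) * OMEGA
    return weights, bias


def activate(unit, seq):
    """Sigmoid activation of the unit on a sequence."""
    weights, bias = unit
    net = bias + sum(
        w for (pos, base), w in weights.items()
        if pos < len(seq) and seq[pos] == base
    )
    return 1.0 / (1.0 + math.exp(-net))


# Hypothetical rule: T at position 0, A at position 1, T at position 5.
unit = rule_to_unit([(0, "T"), (1, "A"), (5, "T")])
print(activate(unit, "TATAAT"), activate(unit, "TGTAAT"))
```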

  8. FlbD has a DNA-binding activity near its carboxy terminus that recognizes ftr sequences involved in positive and negative regulation of flagellar gene transcription in Caulobacter crescentus.

    PubMed Central

    Mullin, D A; Van Way, S M; Blankenship, C A; Mullin, A H

    1994-01-01

    G. Our results demonstrate that FlbD contains a sequence-specific DNA-binding activity within the 87 amino acids at its carboxy terminus, and the results suggest that FlbD exerts its effect as a positive and negative regulator of C. crescentus flagellar genes by binding to ftr sequences. Images PMID:7928958

  9. Cloning, sequencing and analysis of dnaK-dnaJ gene cluster of Bacillus megaterium.

    PubMed

    Bao, Fangming; Gong, Lei; Shao, Weilan

    2008-12-01

    The DNA fragment of heat shock genes (hrcA-grpE-dnaK-dnaJ) containing the complete hrcA-grpE-dnaK operon and the transcription unit of dnaJ was cloned, sequenced and analyzed from Bacillus megaterium RF5. The sequences of hrcA, grpE and dnaJ are reported for the first time, and their coding products exhibit 60%, 63% and 81% identity to the homologs of B. subtilis. A sigmaA-type promoter of Gram-positive bacteria (PA1) and a terminator were located upstream of hrcA and downstream of dnaK, respectively, and a controlling inverted repeat of chaperone expression (CIRCE) element was identified between PA1 and hrcA. Another sigmaA-type promoter (PA2) and a terminator were found upstream and downstream of dnaJ, indicating that B. megaterium has a transcription unit containing the single gene dnaJ. The structure of the dnaJ transcription unit is more similar to that of Listeria monocytogenes than to that of other Bacillus species. A partial protein-based phylogenetic tree, derived from Gram-positive bacteria using HrcA sequences, indicated a closer phylogenetic relationship between B. megaterium and Geobacillus species than between B. megaterium and the other two Bacillus species.

  10. DNA sequence analysis using hierarchical ART-based classification networks

    SciTech Connect

    LeBlanc, C.; Hruska, S.I.; Katholi, C.R.; Unnasch, T.R.

    1994-12-31

    Adaptive resonance theory (ART) describes a class of artificial neural network architectures that act as classification tools which self-organize, work in real-time, and require no retraining to classify novel sequences. We have adapted ART networks to provide support to scientists attempting to categorize tandem repeat DNA fragments from Onchocerca volvulus. In this approach, sequences of DNA fragments are presented to multiple ART-based networks which are linked together into two (or more) tiers; the first provides coarse sequence classification while the subsequent tiers refine the classifications as needed. The overall rating of the resulting classification of fragments is measured using statistical techniques based on those introduced to validate results from traditional phylogenetic analysis. Tests of the Hierarchical ART-based Classification Network, or HABclass network, indicate its value as a fast, easy-to-use classification tool which adapts to new data without retraining on previously classified data.

  11. Methyl-binding DNA capture Sequencing for Patient Tissues

    PubMed Central

    Jadhav, Rohit R.; Wang, Yao V.; Hsu, Ya-Ting; Liu, Joseph; Garcia, Dawn; Lai, Zhao; Huang, Tim H. M.; Jin, Victor X.

    2016-01-01

    Methylation is one of the essential epigenetic modifications to the DNA, which is responsible for the precise regulation of genes required for stable development and differentiation of different tissue types. Dysregulation of this process is often the hallmark of various diseases like cancer. Here, we outline one of the recent sequencing techniques, Methyl-Binding DNA Capture sequencing (MBDCap-seq), used to quantify methylation in various normal and disease tissues for large patient cohorts. We describe a detailed protocol of this affinity enrichment approach along with a bioinformatics pipeline to achieve optimal quantification. This technique has been used to sequence hundreds of patients across various cancer types as a part of the 1,000 methylome project (Cancer Methylome System). PMID:27842364

  12. Spectral sum rules and search for periodicities in DNA sequences

    NASA Astrophysics Data System (ADS)

    Chechetkin, V. R.

    2011-04-01

    Periodic patterns play the important regulatory and structural roles in genomic DNA sequences. Commonly, the underlying periodicities should be understood in a broad statistical sense, since the corresponding periodic patterns have been strongly distorted by the random point mutations and insertions/deletions during molecular evolution. The latent periodicities in DNA sequences can be efficiently displayed by Fourier transform. The criteria of significance for observed periodicities are obtained via the comparison versus the counterpart characteristics of the reference random sequences. We show that the restrictions imposed on the significance criteria by the rigorous spectral sum rules can be rationally described with De Finetti distribution. This distribution provides the convenient intermediate asymptotic form between Rayleigh distribution and exact combinatoric theory.
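
    Displaying a latent periodicity by Fourier transform can be illustrated with a minimal sketch. It computes the spectral power of one nucleotide's indicator sequence at a chosen period; the normalization by sequence length and the function name are assumptions, and the significance criteria discussed above (comparison with random reference sequences, sum rules) are not implemented here.

```python
import cmath


def fourier_power(seq, base, period):
    """Spectral power of the indicator sequence of `base` at a given period."""
    n = len(seq)
    if n == 0 or period <= 0:
        raise ValueError("need a non-empty sequence and a positive period")
    q = 2 * cmath.pi / period  # wave number corresponding to the period
    amplitude = sum(
        cmath.exp(-1j * q * m) for m, s in enumerate(seq.upper()) if s == base
    )
    return abs(amplitude) ** 2 / n


# Hypothetical sequence with an exact period of 3.
seq = "ACG" * 20
print(fourier_power(seq, "A", 3), fourier_power(seq, "A", 4))
```

    For this artificial repeat the power is concentrated at period 3 and vanishes at period 4; real sequences distorted by mutations and indels give much weaker peaks that need a statistical test.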

  13. Nucleotide sequence of alkyl-dihydroxyacetonephosphate synthase cDNA from Dictyostelium discoideum.

    PubMed

    de Vet, E C; van den Bosch, H

    1998-11-27

    The nucleotide sequence is reported of alkyl-dihydroxyacetonephosphate synthase cDNA from the cellular slime mold Dictyostelium discoideum. The open reading frame encodes a protein of 611 amino acids which shows a 33% amino acid identity to the human enzyme. This D. discoideum homolog carries a variant of the peroxisomal targeting signal type 1 at its C-terminus (PKL). Expression of the cDNA in Escherichia coli yielded an enzymatically active protein.

  14. Human DNA polymerase alpha gene: sequences controlling expression in cycling and serum-stimulated cells.

    PubMed Central

    Pearson, B E; Nasheuer, H P; Wang, T S

    1991-01-01

    We have investigated the DNA polymerase alpha promoter sequence requirements for the expression of a heterologous gene in actively cycling cells and following serum addition to serum-deprived cells. An 11.4-kb genomic clone that spans the 5' end of this gene and includes 1.62 kb of sequence upstream from the translation start site was isolated. The transcription start site was mapped at 46 +/- 1 nucleotides upstream from the translation start site. The upstream sequence is GC rich and lacks a TATA sequence but has a CCAAT sequence on the opposite strand. Analysis of a set of deletion constructs in transient transfection assays demonstrated that efficient expression of the reporter in cycling cells requires 248 bp of sequence upstream from the cap site. Clustered within these 248 nucleotides are sequences similar to consensus sequences for Sp1-, Ap1-, Ap2-, and E2F-binding sites. The CCAAT sequence and the potential E2F- and Ap1-binding sites are shown to be protected from DNase I digestion by partially purified nuclear proteins. The DNA polymerase alpha promoter can confer upon the reporter an appropriate, late response to serum addition. No single sequence element could be shown to confer serum inducibility. Rather, multiple sequence elements appear to mediate the full serum response. Images PMID:2005899

  15. Detection of DNA sequence polymorphisms in human genomic DNA by using denaturing gradient gel blots

    SciTech Connect

    Gray, M.R.

    1992-02-01

    Denaturing gradient gel electrophoresis can detect sequence differences outside restriction-enzyme recognition sites. DNA sequence polymorphisms can be detected as restriction-fragment melting polymorphisms (RFMPs) in genomic DNA by using blots made from denaturing gradient gels. In contrast to the use of Southern blots to find sequence differences, denaturing gradient gel blots can detect differences almost anywhere, not just at 4-6-bp restriction-enzyme recognition sites. Human genomic DNA was digested with one of several randomly selected 4-bp recognition-site restriction enzymes, electrophoresed in denaturing gradient gels, and transferred to nylon membranes. The blots were hybridized with radioactive probes prepared from the factor VIII, type II collagen, insulin receptor, β2-adrenergic receptor, and 21-hydroxylase genes; in unrelated individuals, several RFMPs were found in fragments from every locus tested. No restriction map or sequence information was used to detect RFMPs.

  16. Chloroplast DNA Sequence Homologies among Vascular Plants

    PubMed Central

    Lamppa, Gayle K.; Bendich, Arnold J.

    1979-01-01

    The extent of sequence conservation in the chloroplast genome of higher plants has been investigated. Supercoiled chloroplast DNA, prepared from pea seedlings, was labeled in vitro and used as a probe in reassociation experiments with a high concentration of total DNAs extracted from several angiosperms, gymnosperms, and lower vascular plants. In each case the probe reassociation was accelerated, demonstrating that some chloroplast sequences have been highly conserved throughout the evolution of vascular plants. Only among the flowering plants were distinct levels of cross-reaction with the pea chloroplast probe evident; broad bean and barley exhibited the highest and lowest levels, respectively. With the hydroxylapatite assay these levels decreased with a decrease in probe fragment length (from 1,860 to 735 bases), indicating that many conserved sequences in the chloroplast genome are separated by divergent sequences on a rather fine scale. Despite differences observed in levels of homology with the hydroxylapatite assay, S1 nuclease analysis of heteroduplexes showed that outside of the pea family the extent of sequence relatedness between the probe and various heterologous DNAs is approximately the same: 30%. In our interpretation, the fundamental changes in the chloroplast genome during angiosperm evolution involved the rearrangement of this 30% with respect to the more rapidly changing sequences of the genome. These rearrangements may have been more extensive in dicotyledons than in monocotyledons. We have estimated the amount of conserved and divergent DNA interspersed between one another. From the reassociation experiments, determinations were made of the percentage of chloroplast DNA in total DNA extracts from different higher plants; this value remained relatively constant when compared with the large variation in the diploid genome size of the plants. PMID:16660786

  17. Identification of parasite DNA in common bile duct stones by PCR and DNA sequencing

    PubMed Central

    Jang, Ji Sun; Kim, Kyung Ho; Yu, Jae-Ran

    2007-01-01

    We attempted to identify parasite DNA in the biliary stones of humans via PCR and DNA sequencing. Genomic DNA was isolated from each of 15 common bile duct (CBD) stones and 5 gallbladder (GB) stones. The patients with CBD stones suffered from cholangitis, and the patients with GB stones showed acute cholecystitis. The 28S and 18S rDNA genes were amplified successfully from 3 and/or 1 common bile duct stone samples, and then cloned and sequenced. The 28S and 18S rDNA sequences were highly conserved among isolates. Identity of the obtained 28S D1 rDNA with that of Clonorchis sinensis was higher than 97.6%, and identity of the 18S rDNA with that of other Ascarididae was 97.9%. Almost no intra-specific variations were detected in the 28S and 18S rDNA with the exception of a few nucleotide variations, i.e., substitution and deletion. These findings suggest that C. sinensis and Ascaris lumbricoides may be related to biliary stone formation and development. PMID:18165713

  18. Detecting and Analyzing DNA Sequencing Errors: Toward a Higher Quality of the Bacillus subtilis Genome Sequence

    PubMed Central

    Médigue, Claudine; Rose, Matthias; Viari, Alain; Danchin, Antoine

    1999-01-01

    During the determination of a DNA sequence, the introduction of artifactual frameshifts and/or in-frame stop codons in putative genes can lead to misprediction of gene products. Detection of such errors with a method based on protein similarity matching is only possible when related sequences are available in databases. Here, we present a method to detect frameshift errors in DNA sequences that is based on the intrinsic properties of the coding sequences. It combines the results of two analyses, the search for translational initiation/termination sites and the prediction of coding regions. This method was used to screen the complete Bacillus subtilis genome sequence and the regions flanking putative errors were resequenced for verification. This procedure allowed us to correct the sequence and to analyze in detail the nature of the errors. Interestingly, in several cases in-frame termination codons or frameshifts were not sequencing errors but confirmed to be present in the chromosome, indicating that the genes are either nonfunctional (pseudogenes) or subject to regulatory processes such as programmed translational frameshifts. The method can be used for checking the quality of the sequences produced by any prokaryotic genome sequencing project. PMID:10568751
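
    The termination-site search mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: it only scans the three forward reading frames for stop codons and reports stop-free stretches, and the function names and the `min_codons` cutoff are assumptions made for illustration. A putative gene whose open stretch ends abruptly while another frame opens nearby would be a candidate region to resequence.

```python
# Illustrative sketch only; names and the min_codons cutoff are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_positions(seq, frame):
    """Return start indices of stop codons read in the given forward frame (0, 1 or 2)."""
    seq = seq.upper()
    return [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS]


def open_stretches(seq, min_codons=3):
    """Return (frame, start, end) for stop-free stretches of at least min_codons codons.

    `end` is the index of the terminating stop codon, or the end of the last
    complete codon when the stretch runs to the end of the sequence.
    """
    stretches = []
    for frame in range(3):
        start = frame
        last = frame + 3 * ((len(seq) - frame) // 3)  # end of last complete codon
        for stop in stop_positions(seq, frame) + [last]:
            if (stop - start) // 3 >= min_codons:
                stretches.append((frame, start, stop))
            start = stop + 3  # resume scanning after the stop codon
    return stretches


if __name__ == "__main__":
    for frame, start, end in open_stretches("ATGAAACCCTAAGGG"):
        print(f"frame {frame}: open from {start} to {end}")
```

    A real screen would also consider the reverse strand and combine these positions with a coding-region predictor, as the abstract describes.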

  19. Entire Mitochondrial DNA Sequencing on Massively Parallel Sequencing for the Korean Population

    PubMed Central

    2017-01-01

    Mitochondrial DNA (mtDNA) genome analysis has been a potent tool in forensic practice as well as in the understanding of human phylogeny in the maternal lineage. The traditional mtDNA analysis is focused on the control region, but the introduction of massively parallel sequencing (MPS) has made the typing of the entire mtDNA genome (mtGenome) more accessible for routine analysis. The complete mtDNA information can provide large amounts of novel genetic data for diverse populations as well as improved discrimination power for identification. The genetic diversity of the mtDNA sequence in different ethnic populations has been revealed through MPS analysis, but for the Korean population MPS data for the entire mtGenome are limited, and the existing data focus mainly on the control region. In this study, the complete mtGenome data for 186 Koreans, obtained using Ion Torrent Personal Genome Machine (PGM) technology and retrieved from rather common mtDNA haplogroups based on the control region sequence, are described. The results showed that 24 haplogroups, determined with hypervariable regions only, branched into 47 subhaplogroups, and point heteroplasmy was more frequent in the coding regions. In addition, sequence variations in the coding regions observed in this study were compared with those presented in other reports on different populations, and there were similar features observed in the sequence variants for the predominant haplogroups among East Asian populations, such as Haplogroup D and macrohaplogroups M9, G, and D. This study is expected to be the trigger for the development of Korean specific mtGenome data followed by numerous future studies. PMID:28244283

  20. Next generation sequencing of DNA-launched Chikungunya vaccine virus

    SciTech Connect

    Hidajat, Rachmat; Nickols, Brian; Forrester, Naomi; Tretyakova, Irina; Weaver, Scott; Pushko, Peter

    2016-03-15

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3′ untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. - Highlights: • Chikungunya virus (CHIKV) is an emerging pandemic threat. • In vivo DNA-launched attenuated CHIKV is a novel vaccine technology. • DNA-launched virus was sequenced using HiSeq2000 and compared to the 181/25 virus. • DNA-launched virus has lower frequency of SNPs at E2-12 and E2-82 attenuation loci.

  1. Delineating relative homogeneous G+C domains in DNA sequences.

    PubMed

    Li, W

    2001-10-03

    The concept of homogeneity of G+C content is always relative and subjective. This point is emphasized and quantified in this paper using a simple example of one sequence segmented into two subsequences. Whether the sequence is homogeneous or not can be answered by whether the two-subsequence model describes the DNA sequence better than the one-sequence model. There are at least three equivalent ways of looking at the 1-to-2 segmentation: Jensen-Shannon divergence measure, log likelihood ratio test, and model selection using Bayesian information criterion. Once a criterion is chosen, a DNA sequence can be recursively segmented into multiple domains. We use one subjective criterion called segmentation strength based on the Bayesian information criterion. Whether or not a sequence is homogeneous and how many domains it has depend on this criterion. We compare six different genome sequences (yeast S. cerevisiae chromosome III and IV, bacterium M. pneumoniae, human major histocompatibility complex sequence, longest contigs in human chromosome 21 and 22) by recursive segmentations at different strength criteria. Results by recursive segmentation confirm that yeast chromosome IV is more homogeneous than yeast chromosome III, human chromosome 21 is more homogeneous than human chromosome 22, and bacterial genomes may not be homogeneous due to short segments with distinct base compositions. The recursive segmentation also provides a quantitative criterion for identifying isochores in human sequences. Some features of our recursive segmentation, such as the possibility of delineating domain borders accurately, are superior to those of the moving-window approach commonly used in such analyses.
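
    The 1-to-2 segmentation described in this abstract can be sketched in Python as follows. This is a minimal illustration rather than the paper's implementation: the sequence is reduced to two symbols (G/C versus A/T), the statistic is 2N times the Jensen-Shannon divergence (equal to the log likelihood ratio in nats), and the BIC penalty of 2·ln N and the way the `strength` parameter scales it are assumptions chosen for this example.

```python
import math


def gc_entropy(gc_count, length):
    """Shannon entropy (in nats) of a two-symbol GC/AT composition."""
    if length == 0:
        return 0.0
    entropy = 0.0
    for count in (gc_count, length - gc_count):
        if count > 0:
            p = count / length
            entropy -= p * math.log(p)
    return entropy


def best_split(seq):
    """Return (position, 2N * D_JS) for the split that maximizes the divergence."""
    n = len(seq)
    is_gc = [1 if base in "GC" else 0 for base in seq.upper()]
    total_gc = sum(is_gc)
    h_whole = gc_entropy(total_gc, n)
    best_pos, best_stat = None, 0.0
    left_gc = 0
    for i in range(1, n):
        left_gc += is_gc[i - 1]
        h_left = gc_entropy(left_gc, i)
        h_right = gc_entropy(total_gc - left_gc, n - i)
        d_js = h_whole - (i / n) * h_left - ((n - i) / n) * h_right
        stat = 2 * n * d_js
        if stat > best_stat:
            best_pos, best_stat = i, stat
    return best_pos, best_stat


def segment(seq, strength=0.0, offset=0):
    """Recursively split seq; return sorted domain borders (0-based positions)."""
    n = len(seq)
    if n < 2:
        return []
    pos, stat = best_split(seq)
    penalty = 2.0 * math.log(n)  # assumed BIC penalty for the extra parameters
    if pos is None or stat <= penalty * (1.0 + strength):
        return []  # the one-segment model is preferred: treat as homogeneous
    return (segment(seq[:pos], strength, offset)
            + [offset + pos]
            + segment(seq[pos:], strength, offset + pos))


if __name__ == "__main__":
    print(segment("AT" * 50 + "GC" * 50))
```

    Raising `strength` demands a larger gain before a split is accepted, so the number of domains reported depends on the chosen criterion, which is the point the abstract makes.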

  2. An efficient strategy for large-scale high-throughput transposon-mediated sequencing of cDNA clones

    PubMed Central

    Butterfield, Yaron S. N.; Marra, Marco A.; Asano, Jennifer K.; Chan, Susanna Y.; Guin, Ranabir; Krzywinski, Martin I.; Lee, Soo Sen; MacDonald, Kim W. K.; Mathewson, Carrie A.; Olson, Teika E.; Pandoh, Pawan K.; Prabhu, Anna-Liisa; Schnerch, Angelique; Skalska, Ursula; Smailus, Duane E.; Stott, Jeff M.; Tsai, Miranda I.; Yang, George S.; Zuyderduyn, Scott D.; Schein, Jacqueline E.; Jones, Steven J. M.

    2002-01-01

    We describe an efficient high-throughput method for accurate DNA sequencing of entire cDNA clones. Developed as part of our involvement in the Mammalian Gene Collection full-length cDNA sequencing initiative, the method has been used and refined in our laboratory since September 2000. Amenable to large scale projects, we have used the method to generate >7 Mb of accurate sequence from 3695 candidate full-length cDNAs. Sequencing is accomplished through the insertion of Mu transposon into cDNAs, followed by sequencing reactions primed with Mu-specific sequencing primers. Transposon insertion reactions are not performed with individual cDNAs but rather on pools of up to 96 clones. This pooling strategy reduces the number of transposon insertion sequencing libraries that would otherwise be required, reducing the costs and enhancing the efficiency of the transposon library construction procedure. Sequences generated using transposon-specific sequencing primers are assembled to yield the full-length cDNA sequence, with sequence editing and other sequence finishing activities performed as required to resolve sequence ambiguities. Although analysis of the many thousands (22 785) of sequenced Mu transposon insertion events revealed a weak sequence preference for Mu insertion, we observed insertion of the Mu transposon into 1015 of the possible 1024 5mer candidate insertion sites. PMID:12034834

  3. A measure of DNA sequence similarity by Fourier Transform with applications on hierarchical clustering.

    PubMed

    Yin, Changchuan; Chen, Ying; Yau, Stephen S-T

    2014-10-21

    Multiple sequence alignment (MSA) is a prominent method for classification of DNA sequences, yet it is hampered by inherent limitations in computational complexity. Alignment-free methods have been developed over the past decade for more efficient comparison and classification of DNA sequences than MSA. However, most alignment-free methods may lose structural and functional information of DNA sequences because they are based on feature extractions. Therefore, they may not fully reflect the actual differences among DNA sequences. Alignment-free methods with information conservation are needed for more accurate comparison and classification of DNA sequences. We propose a new alignment-free similarity measure of DNA sequences using the Discrete Fourier Transform (DFT). In this method, we map DNA sequences into four binary indicator sequences and apply DFT to the indicator sequences to transform them into the frequency domain. The Euclidean distance of full DFT power spectra of the DNA sequences is used as the similarity distance metric. To compare the DFT power spectra of DNA sequences with different lengths, we propose an even scaling method to extend shorter DFT power spectra to equal the longest length of the sequences compared. After the DFT power spectra are evenly scaled, the DNA sequences are compared in the same DFT frequency space dimensionality. We assess the accuracy of the similarity metric in hierarchical clustering using simulated DNA and virus sequences. The results demonstrate that the DFT based method is an effective and accurate measure of DNA sequence similarity.
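
    The pipeline outlined in this abstract (indicator sequences, DFT, power spectra, scaling, Euclidean distance) can be sketched in plain Python. The sketch below is an illustration under stated assumptions, not the authors' code: the DFT is computed naively in O(n²), the four per-base power spectra are summed, and the "even scaling" step is approximated by linear interpolation of the shorter spectrum, which may differ in detail from the published procedure.

```python
import cmath
import math


def power_spectrum(seq):
    """Sum of the DFT power spectra of the four binary indicator sequences."""
    seq = seq.upper()
    n = len(seq)
    spectrum = [0.0] * n
    for base in "ACGT":
        indicator = [1.0 if ch == base else 0.0 for ch in seq]
        for k in range(n):
            coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                        for t, x in enumerate(indicator))
            spectrum[k] += abs(coeff) ** 2
    return spectrum


def even_scale(spectrum, m):
    """Stretch a spectrum of length n to length m >= n by linear interpolation (assumed)."""
    n = len(spectrum)
    if n == m:
        return list(spectrum)
    scaled = []
    for k in range(m):
        q = k * n / m
        r = int(math.floor(q))
        if r + 1 < n:
            scaled.append(spectrum[r] + (q - r) * (spectrum[r + 1] - spectrum[r]))
        else:
            scaled.append(spectrum[r])  # no right neighbour to interpolate with
    return scaled


def dft_distance(seq_a, seq_b):
    """Euclidean distance between the evenly scaled power spectra of two sequences."""
    m = max(len(seq_a), len(seq_b))
    spec_a = even_scale(power_spectrum(seq_a), m)
    spec_b = even_scale(power_spectrum(seq_b), m)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(spec_a, spec_b)))


if __name__ == "__main__":
    print(dft_distance("ACGTACGTAC", "ACGTTCGTAC"))
    print(dft_distance("ACGTACGTAC", "ACGTACG"))
```

    For sequences of realistic length, an FFT routine would replace the naive double loop; the resulting pairwise distance matrix could then be passed to any hierarchical clustering procedure.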

  4. Pericentric satellite DNA sequences in Pipistrellus pipistrellus (Vespertilionidae; Chiroptera).

    PubMed

    Barragán, M J L; Martínez, S; Marchal, J A; Fernández, R; Bullejos, M; Díaz de la Guardia, R; Sánchez, A

    2003-09-01

    This paper reports the molecular and cytogenetic characterization of a HindIII family of satellite DNA in the bat species Pipistrellus pipistrellus. This satellite is organized in tandem repeats of 418 bp monomer units, and represents approximately 3% of the whole genome. The consensus sequence from five cloned monomer units has an A-T content of 62.20%. We have found differences in the ladder pattern of bands between two populations of the same species. These differences are probably because of the absence of the target sites for the HindIII enzyme in most monomer units of one population, but not in the other. Fluorescent in situ hybridization (FISH) localized the satellite DNA in the pericentromeric regions of all autosomes and the X chromosome, but it was absent from the Y chromosome. Digestion of genomic DNAs with HpaII and its isoschizomer MspI demonstrated that these repetitive DNA sequences are not methylated. Other bat species were tested for the presence of this repetitive DNA. It was absent in five Vespertilionidae and one Rhinolophidae species, indicating that it could be a species/genus specific, repetitive DNA family.

  5. Yeast general transcription factor GFI: sequence requirements for binding to DNA and evolutionary conservation.

    PubMed Central

    Dorsman, J C; van Heeswijk, W C; Grivell, L A

    1990-01-01

    GFI is an abundant DNA binding protein in the yeast S. cerevisiae. The protein binds to specific sequences in both ARS elements and the upstream regions of a large number of genes and is likely to play an important role in yeast cell growth. To get insight into the relative strength of the various GFI-DNA binding sites within the yeast genome, we have determined dissociation rates for several GFI-DNA complexes and found them to vary over a 70-fold range. Strong binding sites for GFI are present in the upstream activating sequences of the gene encoding the 40 kDa subunit II of the QH2:cytochrome c reductase, the gene encoding ribosomal protein S33 and in the intron of the actin gene. The binding site in the ARS1-TRP1 region is of intermediate strength. All strong binding sites conform to the sequence 5'-RTCRYYYNNNACG-3'. Modification interference experiments and studies with mutant binding sites indicate that critical bases for GFI recognition are within the two elements of the consensus DNA recognition sequence. Proteins with the DNA binding specificities of GFI and GFII can also be detected in the yeast K. lactis, suggesting evolutionary conservation of at least the respective DNA-binding domains in both yeasts. PMID:2187179
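
    As a small illustration of how a degenerate consensus such as RTCRYYYNNNACG can be searched for, the Python sketch below expands the IUPAC codes used in the consensus (R = A or G, Y = C or T, N = any base) into a regular expression and scans both strands. The function names and the example sequence are hypothetical, and a sequence match says nothing about binding strength, which the abstract reports as varying widely between sites.

```python
import re

# IUPAC nucleotide codes needed for this consensus (subset of the full table).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}


def consensus_to_regex(consensus):
    """Compile a degenerate consensus into a regex; lookahead allows overlapping hits."""
    body = "".join(IUPAC[ch] for ch in consensus.upper())
    return re.compile("(?=(" + body + "))")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq, consensus="RTCRYYYNNNACG"):
    """Return sorted (start, strand) pairs; start is on forward-strand coordinates."""
    seq = seq.upper()
    pattern = consensus_to_regex(consensus)
    n, k = len(seq), len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    hits += [(n - m.start() - k, "-")
             for m in pattern.finditer(reverse_complement(seq))]
    return sorted(hits)


if __name__ == "__main__":
    example = "GG" + "ATCACTCAAAACG" + "TT"  # made-up sequence containing one site
    print(find_sites(example))
```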

  6. A frameshift error detection algorithm for DNA sequencing projects.

    PubMed Central

    Fichant, G A; Quentin, Y

    1995-01-01

    During the determination of DNA sequences, frameshift errors are not the most frequent but they are the most bothersome as they corrupt the amino acid sequence over several residues. Detection of such errors by sequence alignment is only possible when related sequences are found in the databases. To avoid this limitation, we have developed a new tool based on the distribution of non-overlapping 3-tuples or 6-tuples in the three frames of an ORF. The method relies upon the result of a correspondence analysis. It has been extensively tested on Bacillus subtilis and Saccharomyces cerevisiae sequences and has also been examined with human sequences. The results indicate that it can detect frameshift errors affecting as few as 20 bp with a low rate of false positives (no more than 1.0/1000 bp scanned). The proposed algorithm can be used to scan a large collection of data, but it is mainly intended for laboratory practice as a tool for checking the quality of the sequences produced during a sequencing project. PMID:7659513

  7. Sequence context effects on 8-methoxypsoralen photobinding to defined DNA fragments

    SciTech Connect

    Sage, E.; Moustacchi, E.

    1987-06-16

    The photoreaction of 8-methoxypsoralen (8-MOP) with DNA fragments of defined sequence was studied. The authors took advantage of the blockage by bulky adducts of the 3'-5'-exonuclease activity associated with the T4 DNA polymerase. The action of the exonuclease is stopped by biadducts as well as by monoadducts. The termination products were analyzed on sequencing gels. A strong sequence specificity was observed in the DNA photobinding of 8-MOP. The exonuclease terminates its digestion near thymine residues, mainly at potentially cross-linkable sites. There is an increasing reactivity of thymine residues in the order T < TT << TTT in a GC environment. For thymine residues in cross-linkable sites, the reactivity follows the order AT << TA ≈ TAT << ATA < ATAT < ATATAA. Repeated A-T sequences are hot spots for the photochemical reaction of 8-MOP with DNA. Both monoadducts and interstrand cross-links are formed preferentially in 5'-TpA sites. The results highlight the role of the sequence and consequently of the conformation around a potential site in the photobinding of 8-MOP to DNA.

  8. Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

    SciTech Connect

    Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie; Weinberg, Marc S.; Arbuthnot, Patrick

    2009-11-20

    RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.

  9. A CLIQUE algorithm using DNA computing techniques based on closed-circle DNA sequences.

    PubMed

    Zhang, Hongyan; Liu, Xiyu

    2011-07-01

    DNA computing has been applied in broad fields such as graph theory, finite state problems, and combinatorial problems. DNA computing approaches are well suited to solving many combinatorial problems because of their vast parallelism and high-density storage. The CLIQUE algorithm is one of the grid-based clustering techniques for spatial data. It is a combinatorial problem over the dense cells. Therefore we utilize DNA computing using closed-circle DNA sequences to execute the CLIQUE algorithm for two-dimensional data. In our study, the process of clustering becomes a parallel biochemical reaction, and the DNA sequences representing the marked cells can be combined to form a closed-circle DNA sequence. This strategy is a new application of DNA computing. Although the strategy is only for two-dimensional data, it provides a new idea: to consider the grids as vertices in a graph and transform the search problem into a combinatorial problem.

  10. Development of a protein microarray using sequence-specific DNA binding domain on DNA chip surface

    SciTech Connect

    Choi, Yoo Seong; Pack, Seung Pil; Yoo, Young Je

    2005-04-22

    A protein microarray based on a DNA microarray platform was developed to identify protein-protein interactions in vitro. A conventional DNA chip surface bearing a 156-bp PCR product was prepared as the substrate for the protein microarray. A high-affinity sequence-specific DNA binding domain, the GAL4 DNA binding domain, was introduced to the protein microarray as the fusion partner of a target model protein, enhanced green fluorescent protein. The target protein was immobilized directly on the DNA chip surface in an oriented manner. Finally, a monoclonal antibody against the target protein was used to identify the immobilized protein on the surface. This study shows that a conventional DNA chip can be used to make a protein microarray directly, and that this novel protein microarray is applicable as a tool for identifying protein-protein interactions.

  11. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a pu... Government rights: This application was funded under Department of Energy Contract DE-AC02-76ER01338. The U.S. Government has certain rights under this application and any patent issuing thereon.

  12. Directly repeated sequences associated with pathogenic mitochondrial DNA deletions.

    PubMed Central

    Johns, D R; Rutledge, S L; Stine, O C; Hurko, O

    1989-01-01

    We determined the nucleotide sequences of junctional regions associated with large deletions of mitochondrial DNA found in four unrelated individuals with a phenotype of chronic progressive external ophthalmoplegia. In each patient, the deletion breakpoint occurred within a directly repeated sequence of 13-18 base pairs, present in different regions of the normal mitochondrial genome, separated by 4.5-7.7 kilobases. In two patients, the deletions were identical. When all four repeated sequences are compared, a consensus sequence of 11 nucleotides emerges, similar to putative recombination signals, suggesting the involvement of a recombinational event. Partially deleted and normal mitochondrial DNAs were found in all tissues examined, but in very different proportions, indicating that these mutations originated before the primary cell layers diverged. PMID:2813377

  13. DNA sequence detection using surface-enhanced resonance Raman spectroscopy in a homogeneous multiplexed assay.

    PubMed

    MacAskill, Alexandra; Crawford, David; Graham, Duncan; Faulds, Karen

    2009-10-01

    Detection of specific DNA sequences is central to modern molecular biology and also to molecular diagnostics where identification of a particular disease is based on nucleic acid identification. Many methods exist, and fluorescence spectroscopy dominates the detection technologies employed with different assay formats. This study demonstrates the use of surface-enhanced resonance Raman scattering (SERRS) to detect specific DNA sequences when coupled with modified SERRS-active probes that have been designed to modify the affinity of double- and single-stranded DNA for the surface of silver nanoparticles resulting in discernible differences in the SERRS which can be correlated to the specific DNA hybridization event. The principle of the assay lies on the lack of affinity of double-stranded DNA for silver nanoparticle surfaces; therefore, hybridization of the probe to the target results in a reduction in the SERRS signal. Use of locked nucleic acid (LNA) residues in the DNA probes resulted in greater discrimination between exact match and mismatches when used in comparison to unmodified labeled DNA probes. Polymerase chain reaction (PCR) products were detected using this methodology, and ultimately a multiplex detection of sequences relating to a hospital-acquired infection, namely, methicillin-resistant Staphylococcus aureus (MRSA), demonstrated the versatility and applicability of this approach to real-life situations.

  14. Cloning and sequencing of cDNA and genomic DNA encoding PDM phosphatase of Fusarium moniliforme.

    PubMed

    Yoshida, Hiroshi; Iizuka, Mari; Narita, Takao; Norioka, Naoko; Norioka, Shigemi

    2006-12-01

    PDM phosphatase was purified approximately 500-fold through six steps from the extract of dried powder of the culture filtrate of Fusarium moniliforme. The purified preparation appeared homogeneous on SDS-PAGE although the protein band was broad. Amino acid sequence information was collected on tryptic peptides from this preparation. cDNA cloning was carried out based on the information. A full-length cDNA was obtained and sequenced. The sequence had an open reading frame of 651 amino acid residues with a molecular mass of 69,988 Da. Cloning and sequencing of the genomic DNA corresponding to the cDNA was also conducted. The deduced amino acid sequence could account for many but not all of the tryptic peptides, suggesting presence of contaminant protein(s). SDS-PAGE analysis after chemical deglycosylation showed two proteins with molecular masses of 58 and 68 kDa. This implied that the 58 kDa protein had been copurified with PDM phosphatase. Homology search showed that PDM phosphatase belongs to the purple acid phosphatase family, which is widely distributed in the biosphere. Sequence data of fungal purple acid phosphatases were collected from the database. Processing of the data revealed presence of two types, whose evolutionary relationships were discussed.

  15. The carbohydrate domain of calicheamicin gamma I1 determines its sequence specificity for DNA cleavage.

    PubMed Central

    Drak, J; Iwasawa, N; Danishefsky, S; Crothers, D M

    1991-01-01

    We have investigated the DNA cleaving properties of calicheamicinone, the synthetic core aglycone of calicheamicin gamma I1, a natural product with extremely potent antitumor activity. Our experiments have shown that the synthetic analog binds and cleaves DNA, albeit without any sequence selectivity and with less efficiency than the natural compound. We propose that a key element in the sequence recognition process is the thiobenzoate ring present in the natural compound. We have demonstrated by one-dimensional NMR that there is direct hydrogen abstraction from DNA by calicheamicinone, with enhanced binding affinity contributed by the carbohydrate domain. The reduced efficiency of hydrogen abstraction from DNA by bound calicheamicinone, compared with the natural compound, implicates the carbohydrate moiety in positioning the drug for hydrogen abstraction. PMID:1881884

  16. Viral Discovery and Sequence Recovery Using DNA Microarrays

    PubMed Central

    Wang, David; Urisman, Anatoly; Liu, Yu-Tsueng; Springer, Michael; Ksiazek, Thomas G; Erdman, Dean D; Mardis, Elaine R; Hickenbotham, Matthew; Magrini, Vincent; Eldred, James; Latreille, J. Phillipe; Wilson, Richard K; Ganem, Don

    2003-01-01

    Because of the constant threat posed by emerging infectious diseases and the limitations of existing approaches used to identify new pathogens, there is a great demand for new technological methods for viral discovery. We describe herein a DNA microarray-based platform for novel virus identification and characterization. Central to this approach was a DNA microarray designed to detect a wide range of known viruses as well as novel members of existing viral families; this microarray contained the most highly conserved 70mer sequences from every fully sequenced reference viral genome in GenBank. During an outbreak of severe acute respiratory syndrome (SARS) in March 2003, hybridization to this microarray revealed the presence of a previously uncharacterized coronavirus in a viral isolate cultivated from a SARS patient. To further characterize this new virus, approximately 1 kb of the unknown virus genome was cloned by physically recovering viral sequences hybridized to individual array elements. Sequencing of these fragments confirmed that the virus was indeed a new member of the coronavirus family. This combination of array hybridization followed by direct viral sequence recovery should prove to be a general strategy for the rapid identification and characterization of novel viruses and emerging infectious disease. PMID:14624234

  17. cDNA sequences of two apolipoproteins from lamprey

    SciTech Connect

    Pontes, M.; Xu, X.; Graham, D.; Riley, M.; Doolittle, R.F.

    1987-03-24

    The messages for two small but abundant apolipoproteins found in lamprey blood plasma were cloned with the aid of oligonucleotide probes based on amino-terminal sequences. In both cases, numerous clones were identified in a lamprey liver cDNA library, consistent with the great abundance of these proteins in lamprey blood. One of the cDNAs (LAL1) has a coding region of 105 amino acids that corresponds to a 21-residue signal peptide, a putative 8-residue propeptide, and the 76-residue mature protein found in blood. The other cDNA (LAL2) codes for a total of 191 residues, the first 23 of which constitute a signal peptide. The two proteins, which occur in the high-density lipoprotein fraction of ultracentrifuged plasma, have amino acid compositions similar to those of apolipoproteins found in mammalian blood; computer analysis indicates that the sequences are largely helix-permissive. When the sequences were searched against an amino acid sequence data base, rat apolipoprotein IV was the best matching candidate in both cases. Although a reasonable alignment can be made with that sequence and LAL1, definitive assignment of the two lamprey proteins to typical mammalian classes cannot be made at this point.

  18. Human somatostatin I: sequence of the cDNA.

    PubMed Central

    Shen, L P; Pictet, R L; Rutter, W J

    1982-01-01

    RNA has been isolated from a human pancreatic somatostatinoma and used to prepare a cDNA library. After prescreening, clones containing somatostatin I sequences were identified by hybridization with an anglerfish somatostatin I-cloned cDNA probe. From the nucleotide sequence of two of these clones, we have deduced an essentially full-length mRNA sequence, including the preprosomatostatin coding region, 105 nucleotides from the 5' untranslated region and the complete 150-nucleotide 3' untranslated region. The coding region predicts a 116-amino acid precursor protein (Mr, 12,727) that contains somatostatin-14 and -28 at its COOH terminus. The predicted amino acid sequence of human somatostatin-28 is identical to that of somatostatin-28 isolated from the porcine and ovine species. A comparison of the amino acid sequences of human and anglerfish preprosomatostatin I indicated that the COOH-terminal region encoding somatostatin-14 and the adjacent 6 amino acids are highly conserved, whereas the remainder of the molecule, including the signal peptide region, is more divergent. However, many of the amino acid differences found in the pro region of the human and anglerfish proteins are conservative changes. This suggests that the propeptides have a similar secondary structure, which in turn may imply a biological function for this region of the molecule. PMID:6126875

  19. A DNA Sequence Recognition Loop on APOBEC3A Controls Substrate Specificity

    PubMed Central

    Dhuey, Erica; Zhang, Ruonan; Cao, Ping; Herate, Cecile; Chauveau, Lise; Hubbard, Stevan R.; Landau, Nathaniel R.

    2014-01-01

    APOBEC3A (A3A), one of the seven-member APOBEC3 family of cytidine deaminases, lacks strong antiviral activity against lentiviruses but is a potent inhibitor of adeno-associated virus and endogenous retroelements. In this report, we characterize the biochemical properties of mammalian cell-produced and catalytically active E. coli-produced A3A. The enzyme binds to single-stranded DNA with a Kd of 150 nM and forms dimeric and monomeric fractions. A3A, unlike APOBEC3G (A3G), deaminates DNA substrates nonprocessively. Using a panel of oligonucleotides that contained all possible trinucleotide contexts, we identified the preferred target sequence as TC (A/G). Based on a three-dimensional model of A3A, we identified a putative binding groove that contains residues with the potential to bind substrate DNA and to influence target sequence specificity. Taking advantage of the sequence similarity to the catalytic domain of A3G, we generated A3A/A3G chimeric proteins and analyzed their target site preference. We identified a recognition loop that altered A3A sequence specificity, broadening its target sequence preference. Mutation of amino acids in the predicted DNA binding groove prevented substrate binding, confirming the role of this groove in substrate binding. These findings shed light on how APOBEC3 proteins bind their substrate and determine which sites to deaminate. PMID:24827831

  20. A syntactic pattern recognition system for DNA sequences

    SciTech Connect

    Searls, D.; Dong, S.

    1993-12-31

    The authors review both theoretical and practical results of a linguistic approach to studying the structure of features of DNA sequences. Using generative grammars, complex assemblages can be described and analyzed not only abstractly but also concretely, such that features can be searched for by a general-purpose parser. Their parser, called GENLANG, uses an extended logic grammar formalism and has found features as complex as tRNA genes, group I introns, and protein-encoding genes within input sequences on a genomic scale.

  1. Detection theory in identification of RNA-DNA sequence differences using RNA-sequencing.

    PubMed

    Toung, Jonathan M; Lahens, Nicholas; Hogenesch, John B; Grant, Gregory

    2014-01-01

    Advances in sequencing technology have allowed for detailed analyses of the transcriptome at single-nucleotide resolution, facilitating the study of RNA editing or sequence differences between RNA and DNA genome-wide. In humans, two types of post-transcriptional RNA editing processes are known to occur: A-to-I deamination by ADAR and C-to-U deamination by APOBEC1. In addition to these sequence differences, researchers have reported the existence of all 12 types of RNA-DNA sequence differences (RDDs); however, the validity of these claims is debated, as many studies claim that technical artifacts account for the majority of these non-canonical sequence differences. In this study, we used a detection theory approach to evaluate the performance of RNA-Sequencing (RNA-Seq) and associated aligners in accurately identifying RNA-DNA sequence differences. By generating simulated RNA-Seq datasets containing RDDs, we assessed the effect of alignment artifacts and sequencing error on the sensitivity and false discovery rate of RDD detection. Overall, we found that even in the presence of sequencing errors, false negative and false discovery rates of RDD detection can be contained below 10% with relatively lenient thresholds. We also assessed the ability of various filters to target false positive RDDs and found them to be effective in discriminating between true and false positives. Lastly, we used the optimal thresholds we identified from our simulated analyses to identify RDDs in a human lymphoblastoid cell line. We found approximately 6,000 RDDs, the majority of which are A-to-G edits and likely to be mediated by ADAR. Moreover, we found the majority of non A-to-G RDDs to be associated with poorer alignments and conclude from these results that the evidence for widespread non-canonical RDDs in humans is weak. Overall, we found RNA-Seq to be a powerful technique for surveying RDDs genome-wide when coupled with the appropriate thresholds and filters.
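
    The sensitivity, false negative rate, and false discovery rate named in this abstract can be computed from simple counts. The sketch below is only an illustration: the function name and the example counts are hypothetical and are not taken from the study.

```python
def detection_metrics(true_positives, false_positives, false_negatives):
    """Return (sensitivity, false_negative_rate, false_discovery_rate).

    sensitivity          = TP / (TP + FN)
    false negative rate  = FN / (TP + FN)
    false discovery rate = FP / (TP + FP)
    """
    actual = true_positives + false_negatives   # sites that truly differ
    called = true_positives + false_positives   # sites reported as differing
    if actual == 0 or called == 0:
        raise ValueError("need at least one true site and one called site")
    sensitivity = true_positives / actual
    false_negative_rate = false_negatives / actual
    false_discovery_rate = false_positives / called
    return sensitivity, false_negative_rate, false_discovery_rate


# Hypothetical counts from a simulated dataset (not from the paper).
sens, fnr, fdr = detection_metrics(true_positives=950,
                                   false_positives=50,
                                   false_negatives=50)
print(f"sensitivity={sens:.2f} FNR={fnr:.2f} FDR={fdr:.2f}")
```

    With these made-up counts both error rates are 5%, which is the kind of "below 10%" outcome the abstract describes.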

  2. cisExpress: motif detection in DNA sequences

    PubMed Central

    Triska, Martin; Grocutt, David; Southern, James; Murphy, Denis J.; Tatarinova, Tatiana

    2013-01-01

    Motivation: One of the major challenges for contemporary bioinformatics is the analysis and accurate annotation of genomic datasets to enable extraction of useful information about the functional role of DNA sequences. This article describes a novel genome-wide statistical approach to the detection of specific DNA sequence motifs based on similarities between the promoters of similarly expressed genes. This new tool, cisExpress, is especially designed for use with large datasets, such as those generated by publicly accessible whole genome and transcriptome projects. cisExpress uses a task farming algorithm to exploit all available computational cores within a shared memory node. We demonstrate the robust nature and validity of the proposed method. It is applicable for use with a wide range of genomic databases for any species of interest. Availability: cisExpress is available at www.cisexpress.org. Contact: tatiana.tatarinova@usc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23793750

  3. Effect of dephasing on DNA sequencing via transverse electronic transport

    SciTech Connect

    Zwolak, Michael; Krems, Matt; Pershin, Yuriy V; Di Ventra, Massimiliano

    2009-01-01

    We study theoretically the effects of dephasing on DNA sequencing in a nanopore via transverse electronic transport. To do this, we couple classical molecular dynamics simulations with transport calculations using scattering theory. Previous studies, which did not include dephasing, have shown that by measuring the transverse current of a particular base multiple times, one can get distributions of currents for each base that are distinguishable. We introduce a dephasing parameter into transport calculations to simulate the effects of the ions and other fluctuations. These effects lower the overall magnitude of the current, but have little effect on the current distributions themselves. The results of this work further indicate that distinguishing DNA bases via transverse electronic transport has potential as a sequencing tool.

  4. T-DNA integration into the Arabidopsis genome depends on sequences of pre-insertion sites

    PubMed Central

    Brunaud, Véronique; Balzergue, Sandrine; Dubreucq, Bertrand; Aubourg, Sébastien; Samson, Franck; Chauvin, Stéphanie; Bechtold, Nicole; Cruaud, Corinne; DeRose, Richard; Pelletier, Georges; Lepiniec, Loïc; Caboche, Michel; Lecharny, Alain

    2002-01-01

    A statistical analysis of 9000 flanking sequence tags characterizing transferred DNA (T-DNA) transformants in Arabidopsis sheds new light on T-DNA insertion by illegitimate recombination. T-DNA integration is favoured in plant DNA regions with an A-T-rich content. The formation of a short DNA duplex between the host DNA and the left end of the T-DNA sets the frame for the recombination. The sequence immediately downstream of the plant A-T-rich region is the master element for setting up the DNA duplex, and deletions into the left end of the integrated T-DNA depend on the location of a complementary sequence on the T-DNA. Recombination at the right end of the T-DNA with the host DNA involves another DNA duplex, 2–3 base pairs long, that preferentially includes a G close to the right end of the T-DNA. PMID:12446565

  5. Rapid DNA Sequencing by Direct Nanoscale Reading of Nucleotide Bases on Individual DNA Chains

    SciTech Connect

    Lee, James Weifu; Meller, Amit

    2007-01-01

    Since the independent invention of DNA sequencing by Sanger and by Gilbert 30 years ago, it has grown from a small scale technique capable of reading several kilobase-pair of sequence per day into today's multibillion dollar industry. This growth has spurred the development of new sequencing technologies that do not involve either electrophoresis or Sanger sequencing chemistries. Sequencing by Synthesis (SBS) involves multiple parallel micro-sequencing addition events occurring on a surface, where data from each round is detected by imaging. New High Throughput Technologies for DNA Sequencing and Genomics is the second volume in the Perspectives in Bioanalysis series, which looks at the electroanalytical chemistry of nucleic acids and proteins, development of electrochemical sensors and their application in biomedicine and in the new fields of genomics and proteomics. The authors have expertly formatted the information for a wide variety of readers, including new developments that will inspire students and young scientists to create new tools for science and medicine in the 21st century. Reviews of complementary developments in Sanger and SBS sequencing chemistries, capillary electrophoresis and microdevice integration, MS sequencing and applications set the framework for the book.

  6. Measuring Cation Dependent DNA Polymerase Fidelity Landscapes by Deep Sequencing

    PubMed Central

    Kording, Konrad; Schmidt, Daniel; Martin-Alarcon, Daniel; Tyo, Keith; Boyden, Edward S.; Church, George

    2012-01-01

    High-throughput recording of signals embedded within inaccessible micro-environments is a technological challenge. The ideal recording device would be a nanoscale machine capable of quantitatively transducing a wide range of variables into a molecular recording medium suitable for long-term storage and facile readout in the form of digital data. We have recently proposed such a device, in which cation concentrations modulate the misincorporation rate of a DNA polymerase (DNAP) on a known template, allowing DNA sequences to encode information about the local cation concentration. In this work we quantify the cation sensitivity of DNAP misincorporation rates, making possible the indirect readout of cation concentration by DNA sequencing. Using multiplexed deep sequencing, we quantify the misincorporation properties of two DNA polymerases – Dpo4 and Klenow exo− – obtaining the probability and base selectivity of misincorporation at all positions within the template. We find that Dpo4 acts as a DNA recording device for Mn2+ with a misincorporation rate gain of ∼2%/mM. This modulation of misincorporation rate is selective to the template base: the probability of misincorporation on template T by Dpo4 increases >50-fold over the range tested, while the other template bases are affected less strongly. Furthermore, cation concentrations act as scaling factors for misincorporation: on a given template base, Mn2+ and Mg2+ change the overall misincorporation rate but do not alter the relative frequencies of incoming misincorporated nucleotides. Characterization of the ion dependence of DNAP misincorporation serves as the first step towards repurposing it as a molecular recording device. PMID:22928047

  7. Diepoxybutane cross-links DNA at 5'-GNC sequences.

    PubMed

    Millard, J T; White, M M

    1993-03-02

    Epoxides are cancer-causing agents chemically analogous to the nitrogen mustards, a family of powerful antitumor drugs. We found that the DNA interstrand cross-linking sequence preference of diepoxybutane is the same as that of the mustard mechlorethamine: 5'-GNC. Therefore, the genomic site of cross-linking alone cannot explain why some interstrand cross-linkers act as antitumor agents whereas others are deadly toxins.
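
    As a small illustration of what a 5'-GNC preference means in practice, the sketch below lists every GNC triplet (N = any base) in a sequence. The function name and the example sequence are hypothetical; the abstract does not describe any software.

```python
import re


def find_gnc_sites(sequence):
    """Return 0-based start positions of every 5'-GNC-3' triplet.

    A lookahead is used so that overlapping triplets are all reported.
    """
    seq = sequence.upper()
    return [match.start() for match in re.finditer(r"(?=G[ACGT]C)", seq)]


# Hypothetical example sequence.
print(find_gnc_sites("ATGACGGCGTC"))  # [2, 5, 8]
```

    Because the reverse complement of a GNC triplet is itself a GNC triplet, scanning one strand is enough to locate candidate sites on both.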

  8. DSBCapture: in situ capture and sequencing of DNA breaks.

    PubMed

    Lensing, Stefanie V; Marsico, Giovanni; Hänsel-Hertsch, Robert; Lam, Enid Y; Tannahill, David; Balasubramanian, Shankar

    2016-10-01

    Double-strand DNA breaks (DSBs) continuously arise and cause mutations and chromosomal rearrangements. Here, we present DSBCapture, a sequencing-based method that captures DSBs in situ and directly maps these at single-nucleotide resolution, enabling the study of DSB origin. DSBCapture shows substantially increased sensitivity and data yield compared with other methods. Using DSBCapture, we uncovered a striking relationship between DSBs and elevated transcription within nucleosome-depleted chromatin.

  9. Color image encryption scheme using CML and DNA sequence operations.

    PubMed

    Wang, Xing-Yuan; Zhang, Hui-Li; Bao, Xue-Mei

    2016-06-01

    In this paper, an encryption algorithm for color images using a chaotic system and DNA (deoxyribonucleic acid) sequence operations is proposed. The three components of the color plain image are used to construct a pixel matrix, and a confusion operation driven by the spatiotemporal chaos system, i.e., CML (coupled map lattice), is then performed on it. DNA encoding and decoding rules are introduced in the permutation phase. The extended Hamming distance is proposed to generate new initial values for the CML iteration in combination with the color plain image. The rows and columns of the DNA matrix are permuted, and the color cipher image is then obtained from this matrix. Theoretical analysis and experimental results show the cryptosystem to be secure and practical, and it is suitable for encrypting color images of any size.
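
    The abstract does not say which DNA encoding rule is used. As a minimal sketch, assuming the mapping 00=A, 01=C, 10=G, 11=T (one of the mappings in which complementary bases carry complementary bit pairs), an 8-bit pixel value becomes four bases and can be decoded back. All names here are hypothetical.

```python
# Assumed rule (not stated in the abstract): A/T and C/G carry complementary bits.
ENCODE_RULE = {"00": "A", "01": "C", "10": "G", "11": "T"}
DECODE_RULE = {base: bits for bits, base in ENCODE_RULE.items()}


def encode_byte(value):
    """Encode one 8-bit pixel value (0-255) as four DNA bases."""
    if not 0 <= value <= 255:
        raise ValueError("pixel value must fit in 8 bits")
    bits = format(value, "08b")
    return "".join(ENCODE_RULE[bits[i:i + 2]] for i in range(0, 8, 2))


def decode_bases(bases):
    """Decode four DNA bases back to the 8-bit pixel value."""
    return int("".join(DECODE_RULE[base] for base in bases), 2)


print(encode_byte(173))      # GGTC  (173 = 0b10101101)
print(decode_bases("GGTC"))  # 173
```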

  10. DNA polymerase-catalyzed elongation of repetitive hexanucleotide sequences: application to creation of repetitive DNA libraries.

    PubMed

    Kurihara, Hiroyuki; Nagamune, Teruyuki

    2004-01-01

    We demonstrate the elongation of various hexanucleotide sequences with thermophilic DNA polymerase, under isothermal or thermal cyclic reaction conditions. We prepared 10 types of double repeat hexanucleotide duplexes with various GC compositions containing between 0 and 6 GC nucleotides per repeat and incubated these duplexes with thermophilic Taq DNA polymerase and dNTPs at various temperatures. All of the model repetitive short duplexes were elongated under the isothermal incubation conditions, although there were some differences in the elongation efficiencies derived from the GC composition in the repetitive sequences. It was also found that all of the model repetitive duplexes were extended more effectively by a 3-step thermal cyclic reaction involving denaturation, annealing, and extension. On the basis of this technique, we prepared a glutamate-encoding short repetitive duplex and created long repetitive DNAs under isothermal and thermal cyclic reaction conditions. DNA sequencing analysis of the cloned repetitive DNA revealed that well-ordered long repetitive DNAs of various chain lengths were created by this DNA polymerase-catalyzed ligation method, and these were easily cloned into vectors by the TA-cloning method. This method could be useful for obtaining DNAs encoding arbitrary long repetitive amino acid sequences more effectively than the conventional T4 ligase-catalyzed ligation method.

  11. DNA topology confers sequence specificity to nonspecific architectural proteins.

    PubMed

    Wei, Juan; Czapla, Luke; Grosner, Michael A; Swigon, David; Olson, Wilma K

    2014-11-25

    Topological constraints placed on short fragments of DNA change the disorder found in chain molecules randomly decorated by nonspecific, architectural proteins into tightly organized 3D structures. The bacterial heat-unstable (HU) protein builds up, counter to expectations, in greater quantities and at particular sites along simulated DNA minicircles and loops. Moreover, the placement of HU along loops with the "wild-type" spacing found in the Escherichia coli lactose (lac) and galactose (gal) operons precludes access to key recognition elements on DNA. The HU protein introduces a unique spatial pathway in the DNA upon closure. The many ways in which the protein induces nearly the same closed circular configuration point to the statistical advantage of its nonspecificity. The rotational settings imposed on DNA by the repressor proteins, by contrast, introduce sequential specificity in HU placement, with the nonspecific protein accumulating at particular loci on the constrained duplex. Thus, an architectural protein with no discernible DNA sequence-recognizing features becomes site-specific and potentially assumes a functional role upon loop formation. The locations of HU on the closed DNA reflect long-range mechanical correlations. The protein responds to DNA shape and deformability—the stiff, naturally straight double-helical structure—rather than to the unique features of the constituent base pairs. The structures of the simulated loops suggest that HU architecture, like nucleosomal architecture, which modulates the ability of regulatory proteins to recognize their binding sites in the context of chromatin, may influence repressor-operator interactions in the context of the bacterial nucleoid.

  12. Large microchannel array fabrication and results for DNA sequencing

    SciTech Connect

    Pastrone, R L; Balch, J W; Brewer, L R; Copeland, A C; Davidson, J C; Fitch, J P; Kimbrough, J R; Madabhushi, R S; Richardson, P M; Swierkowski, S P; Tarte, L A; Vainer, M

    1999-01-07

    We have developed a process for the production of microchannel arrays on bonded glass substrates up to 14 x 58 cm for DNA sequencing. Arrays of 96 and 384 microchannels, each 46 cm long, have been built. This technology offers significant advantages over discrete capillaries or conventional slab-gel approaches. High throughput DNA sequencing with over 550 base pairs resolution has been achieved. With custom fabrication apparatus, microchannels are etched in a borosilicate substrate, and then fusion bonded to a top substrate 1.1 mm thick that has access holes formed in it. SEM examination shows a typical microchannel to be 40 x 180 micrometers by 46 cm long; the etch is approximately isotropic, leaving a key undercut, for forming a rounded channel. The surface roughness at the bottom of the 40 micrometer deep channel has been measured by profilometer to be as low as 20 nm; the roughness at the top surface was 2 nm. Etch uniformity of about 5% has been obtained using a 22% vol. HF / 78% acetic acid solution. The simple lithography, etching, and bonding of these substrates enables efficient production of these arrays and extremely precise replication from master masks and precision machining with a mandrel. Keywords: microchannels, microchannel plates, DNA sequencing, electrophoresis, borosilicate glass

  13. Compilation of DNA sequences of Escherichia coli (update 1992)

    PubMed Central

    Kröger, Manfred; Wahl, Ralf; Schachtel, Gabriel; Rice, Peter

    1992-01-01

    We have compiled the DNA sequence data for E.coli available from the GENBANK and EMBL data libraries and over a period of several years independently from the literature. This is the fourth listing replacing and increasing the former listings substantially. However, to save space, this printed version contains DNA sequence information only if it is publicly available in electronic form. The complete compilation including a full set of genetic map data and the E.coli protein index can be obtained in machine readable form from the EMBL data library (ECD release 10) or from the CD-ROM version of this supplement issue directly. After deletion of all detected overlaps, a total of 1 820 237 individual bp had been determined by the beginning of 1992. This corresponds to a total of 38.56% of the entire E.coli chromosome consisting of about 4,720 kbp. This number may actually be higher by some extra 2.5% derived from lysogenic bacteriophage lambda and various DNA sequences already received for other strains of E.coli. PMID:1598239
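
    The coverage figure follows directly from the two numbers quoted in the abstract. The short check below reproduces it; the variable names are only illustrative.

```python
sequenced_bp = 1_820_237    # non-overlapping bp determined by early 1992
chromosome_bp = 4_720_000   # "about 4,720 kbp" for the whole chromosome

coverage_percent = 100 * sequenced_bp / chromosome_bp
print(f"{coverage_percent:.2f}%")  # 38.56%
```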

  14. Short-Sequence DNA Repeats in Prokaryotic Genomes

    PubMed Central

    van Belkum, Alex; Scherer, Stewart; van Alphen, Loek; Verbrugh, Henri

    1998-01-01

    Short-sequence DNA repeat (SSR) loci can be identified in all eukaryotic and many prokaryotic genomes. These loci harbor short or long stretches of repeated nucleotide sequence motifs. DNA sequence motifs in a single locus can be identical and/or heterogeneous. SSRs are encountered in many different branches of the prokaryote kingdom. They are found in genes encoding products as diverse as microbial surface components recognizing adhesive matrix molecules and specific bacterial virulence factors such as lipopolysaccharide-modifying enzymes or adhesins. SSRs enable genetic and consequently phenotypic flexibility. SSRs function at various levels of gene expression regulation. Variations in the number of repeat units per locus or changes in the nature of the individual repeat sequences may result from recombination processes or polymerase inadequacy such as slipped-strand mispairing (SSM), either alone or in combination with DNA repair deficiencies. These rather complex phenomena can occur with relative ease, with SSM approaching a frequency of 10⁻⁴ per bacterial cell division and allowing high-frequency genetic switching. Bacteria use this random strategy to adapt their genetic repertoire in response to selective environmental pressure. SSR-mediated variation has important implications for bacterial pathogenesis and evolutionary fitness. Molecular analysis of changes in SSRs allows epidemiological studies on the spread of pathogenic bacteria. The occurrence, evolution and function of SSRs, and the molecular methods used to analyze them are discussed in the context of responsiveness to environmental factors, bacterial pathogenicity, epidemiology, and the availability of full-genome sequences for increasing numbers of microorganisms, especially those that are medically relevant. PMID:9618442

  15. Detection and mapping of homologous, repeated and amplified DNA sequences by DNA renaturation in agarose gels.

    PubMed Central

    Roninson, I B

    1983-01-01

    A new molecular hybridization approach to the analysis of complex genomes has been developed. Tracer and driver DNAs were digested with the same restriction enzyme(s), and tracer DNA was labeled with 32P using T4 DNA polymerase. Tracer DNA was mixed with an excess amount of driver, and the mixture was electrophoresed in an agarose gel. Following electrophoresis, DNA was alkali-denatured in situ and allowed to reanneal in the gel, so that tracer DNA fragments could hybridize to the driver only when homologous driver DNA sequences were present at the same place in the gel, i.e. within a restriction fragment of the same size. After reannealing, unhybridized single-stranded DNA was digested in situ with S1 nuclease. The hybridized tracer DNA was detected by autoradiography. The general applicability of this technique was demonstrated in the following experiments. The common EcoRI restriction fragments were identified in the genomes of E. coli and four other species of bacteria. Two of these fragments are conserved in all Enterobacteriaceae. In other experiments, repeated EcoRI fragments of eukaryotic DNA were visualized as bands of various intensity after reassociation of a total genomic restriction digest in the gel. The situation of gene amplification was modeled by the addition of varying amounts of lambda phage DNA to eukaryotic DNA prior to restriction enzyme digestion. Restriction fragments of lambda DNA were detectable at a ratio of 15 copies per chicken genome and 30 copies per human genome. This approach was used to detect amplified DNA fragments in methotrexate (MTX)-resistant mouse cells and to identify commonly amplified fragments in two independently derived MTX-resistant lines. PMID:6310499

  16. DNA sequence chromatogram browsing using JAVA and CORBA.

    PubMed

    Parsons, J D; Buehler, E; Hillier, L

    1999-03-01

    DNA sequence chromatograms (traces) are the primary data source for all large-scale genomic and expressed sequence tags (ESTs) sequencing projects. Access to the sequencing trace assists many later analyses, for example contig assembly and polymorphism detection, but obtaining and using traces is problematic. Traces are not collected and published centrally, they are much larger than the base calls derived from them, and viewing them requires the interactivity of a local graphical client with local data. To provide efficient global access to DNA traces, we developed a client/server system based on flexible Java components integrated into other applications including an applet for use in a WWW browser and a stand-alone trace viewer. Client/server interaction is facilitated by CORBA middleware which provides a well-defined interface, a naming service, and location independence. [The software is packaged as a Jar file available from the following URL: http://www.ebi.ac.uk/jparsons. Links to working examples of the trace viewers can be found at http://corba.ebi.ac.uk/EST. All the Washington University mouse EST traces are available for browsing at the same URL.]

  17. Analysis of the complete DNA sequence of murine cytomegalovirus.

    PubMed Central

    Rawlinson, W D; Farrell, H E; Barrell, B G

    1996-01-01

    The complete DNA sequence of the Smith strain of murine cytomegalovirus (MCMV) was determined from virion DNA by using a whole-genome shotgun approach. The genome has an overall G+C content of 58.7%, consists of 230,278 bp, and is arranged as a single unique sequence with short (31-bp) terminal direct repeats and several short internal repeats. Significant similarity to the genome of the sequenced human cytomegalovirus (HCMV) strain AD169 is evident, particularly for 78 open reading frames encoded by the central part of the genome. There is a very similar distribution of G+C content across the two genomes. Sequences toward the ends of the MCMV genome encode tandem arrays of homologous glycoproteins (gps) arranged as two gene families. The left end encodes 15 gps that represent one family, and the right end encodes a different family of 11 gps. A homolog (m144) of cellular major histocompatibility complex (MHC) class I genes is located at the end of the genome opposite the HCMV MHC class I homolog (UL18). G protein-coupled receptor (GCR) homologs (M33 and M78) occur in positions congruent with two (UL33 and UL78) of the four putative HCMV GCR homologs. Counterparts of all of the known enzyme homologs in HCMV are present in the MCMV genome, including the phosphotransferase gene (M97), whose product phosphorylates ganciclovir in HCMV-infected cells, and the assembly protein (M80). PMID:8971012
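
    The overall G+C content and its distribution along a genome, both mentioned in this abstract, are simple to compute. The sketch below assumes a plain string of bases and uses hypothetical function names, window sizes and example sequences; it is not the authors' analysis.

```python
def gc_content(sequence):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def windowed_gc(sequence, window, step):
    """G+C percentage in sliding windows, as (start, percent) pairs."""
    return [(start, gc_content(sequence[start:start + window]))
            for start in range(0, len(sequence) - window + 1, step)]


# Hypothetical toy sequence; a real genome would use much larger windows.
print(windowed_gc("GGCCATAT", window=4, step=2))
# [(0, 100.0), (2, 50.0), (4, 0.0)]
```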

  18. Correlation approach to identify coding regions in DNA sequences

    NASA Technical Reports Server (NTRS)

    Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1994-01-01

    Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.
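
    The abstract does not spell out the algorithm. As a rough illustration of the kind of quantity involved, the sketch below builds a purine/pyrimidine "DNA walk" and measures how its displacement fluctuates over a given distance; slow growth of this fluctuation with distance indicates short-range correlations, faster growth indicates long-range ones. The step convention (+1 for pyrimidine, -1 for purine) and all names are assumptions for illustration, not the authors' implementation.

```python
import math


def dna_walk(sequence):
    """Cumulative walk: +1 for a pyrimidine (C, T), -1 for a purine (A, G)."""
    position, walk = 0, []
    for base in sequence.upper():
        if base in "CT":
            position += 1
        elif base in "AG":
            position -= 1
        else:
            continue  # skip ambiguous symbols such as N
        walk.append(position)
    return walk


def fluctuation(walk, length):
    """Standard deviation of the walk displacement over `length` steps."""
    steps = [walk[i + length] - walk[i] for i in range(len(walk) - length)]
    if not steps:
        raise ValueError("length must be shorter than the walk")
    mean = sum(steps) / len(steps)
    variance = sum((s - mean) ** 2 for s in steps) / len(steps)
    return math.sqrt(variance)


print(dna_walk("CTAG"))                       # [1, 2, 1, 0]
print(fluctuation(dna_walk("CACACACA"), 2))   # 0.0 for a strictly alternating walk
```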

  19. Model identification for DNA sequence-structure relationships.

    PubMed

    Hawley, Stephen Dwyer; Chiu, Anita; Chizeck, Howard Jay

    2006-11-01

    We investigate the use of algebraic state-space models for the sequence dependent properties of DNA. By considering the DNA sequence as an input signal, rather than using an all atom physical model, computational efficiency is achieved. A challenge in deriving this type of model is obtaining its structure and estimating its parameters. Here we present two candidate model structures for the sequence dependent structural property Slide and a method of encoding the models so that a recursive least squares algorithm can be applied for parameter estimation. These models are based on the assumption that the value of Slide at a base-step is determined by the surrounding tetranucleotide sequence. The first model takes the four bases individually as inputs and has a median root mean square deviation of 0.90 Å. The second model takes the four bases pairwise and has a median root mean square deviation of 0.88 Å. These values indicate that the accuracy of these models is within the useful range for structure prediction. Performance is comparable to published predictions of a more physically derived model, at significantly less computational cost.

  20. Denoising DNA deep sequencing data-high-throughput sequencing errors and their correction.

    PubMed

    Laehnemann, David; Borkhardt, Arndt; McHardy, Alice Carolyn

    2016-01-01

    Characterizing the errors generated by common high-throughput sequencing platforms and telling true genetic variation from technical artefacts are two interdependent steps, essential to many analyses such as single nucleotide variant calling, haplotype inference, sequence assembly and evolutionary studies. Both random and systematic errors can show a specific occurrence profile for each of the six prominent sequencing platforms surveyed here: 454 pyrosequencing, Complete Genomics DNA nanoball sequencing, Illumina sequencing by synthesis, Ion Torrent semiconductor sequencing, Pacific Biosciences single-molecule real-time sequencing and Oxford Nanopore sequencing. There is a large variety of programs available for error removal in sequencing read data, which differ in the error models and statistical techniques they use, the features of the data they analyse, the parameters they determine from them and the data structures and algorithms they use. We highlight the assumptions they make and for which data types these hold, providing guidance which tools to consider for benchmarking with regard to the data properties. While no benchmarking results are included here, such specific benchmarks would greatly inform tool choices and future software development. The development of stand-alone error correctors, as well as single nucleotide variant and haplotype callers, could also benefit from using more of the knowledge about error profiles and from (re)combining ideas from the existing approaches presented here.

  1. Mixed sequence reader: a program for analyzing DNA sequences with heterozygous base calling.

    PubMed

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.

  2. Molecular cloning and nucleotide sequence of cDNA for human liver arginase

    SciTech Connect

    Haraguchi, Y.; Takiguchi, M.; Amaya, Y.; Kawamoto, S.; Matsuda, I.; Mori, M.

    1987-01-01

    Arginase (EC 3.5.3.1) catalyzes the last step of the urea cycle in the liver of ureotelic animals. Inherited deficiency of the enzyme results in argininemia, an autosomal recessive disorder characterized by hyperammonemia. To facilitate investigation of the enzyme and gene structures and to elucidate the nature of the mutation in argininemia, the authors isolated cDNA clones for human liver arginase. Oligo(dT)-primed and random primer human liver cDNA libraries in lambda gt11 were screened using isolated rat arginase cDNA as a probe. Two of the positive clones, designated lambda hARG6 and lambda hARG109, contained an overlapping cDNA sequence with an open reading frame encoding a polypeptide of 322 amino acid residues (predicted Mr, 34,732), a 5'-untranslated sequence of 56 base pairs, a 3'-untranslated sequence of 423 base pairs, and a poly(A) segment. Arginase activity was detected in Escherichia coli cells transformed with the plasmid carrying lambda hARG6 cDNA insert. RNA gel blot analysis of human liver RNA showed a single mRNA of 1.6 kilobases. The predicted amino acid sequence of human liver arginase is 87% and 41% identical with those of the rat liver and yeast enzymes, respectively. There are several highly conserved segments among the human, rat, and yeast enzymes.

  3. A sequence-dependent rigid-base model of DNA

    NASA Astrophysics Data System (ADS)

    Gonzalez, O.; Petkevičiutė, D.; Maddocks, J. H.

    2013-02-01

    A novel hierarchy of coarse-grain, sequence-dependent, rigid-base models of B-form DNA in solution is introduced. The hierarchy depends on both the assumed range of energetic couplings, and the extent of sequence dependence of the model parameters. A significant feature of the models is that they exhibit the phenomenon of frustration: each base cannot simultaneously minimize the energy of all of its interactions. As a consequence, an arbitrary DNA oligomer has an intrinsic or pre-existing stress, with the level of this frustration dependent on the particular sequence of the oligomer. Attention is focussed on the particular model in the hierarchy that has nearest-neighbor interactions and dimer sequence dependence of the model parameters. For a Gaussian version of this model, a complete coarse-grain parameter set is estimated. The parameterized model allows, for an oligomer of arbitrary length and sequence, a simple and explicit construction of an approximation to the configuration-space equilibrium probability density function for the oligomer in solution. The training set leading to the coarse-grain parameter set is itself extracted from a recent and extensive database of a large number of independent, atomic-resolution molecular dynamics (MD) simulations of short DNA oligomers immersed in explicit solvent. The Kullback-Leibler divergence between probability density functions is used to make several quantitative assessments of our nearest-neighbor, dimer-dependent model, which is compared against others in the hierarchy to assess various assumptions pertaining both to the locality of the energetic couplings and to the level of sequence dependence of its parameters. It is also compared directly against all-atom MD simulation to assess its predictive capabilities. The results show that the nearest-neighbor, dimer-dependent model can successfully resolve sequence effects both within and between oligomers. 
    For example, due to the presence of frustration, the model can…

  4. On-Demand Indexing for Referential Compression of DNA Sequences

    PubMed Central

    Alves, Fernando; Cogo, Vinicius; Wandelt, Sebastian; Leser, Ulf; Bessani, Alysson

    2015-01-01

    The decreasing costs of genome sequencing are creating a demand for scalable storage and processing tools and techniques to deal with the large amounts of generated data. Referential compression is one of these techniques, in which the similarity between the DNA of organisms of the same or an evolutionarily close species is exploited to reduce the storage demands of genome sequences up to 700 times. The general idea is to store in the compressed file only the differences between the to-be-compressed and a well-known reference sequence. In this paper, we propose a method for improving the performance of referential compression by removing the most costly phase of the process, the complete reference indexing. Our approach, called On-Demand Indexing (ODI), compresses human chromosomes five to ten times faster than other state-of-the-art tools (on average), while achieving similar compression ratios. PMID:26146838
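The general idea of referential compression described above can be sketched in a few lines of Python. This is a minimal illustration, not the ODI method: it builds a complete k-mer index of the reference up front, which is exactly the costly phase the paper proposes to avoid. The function names, the tuple encoding, and the default `k=4` are assumptions made for this example.

```python
def build_index(reference, k):
    """Map every k-mer of the reference to the positions where it starts."""
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], []).append(i)
    return index


def compress(target, reference, k=4):
    """Encode target as ("match", ref_pos, length) and ("literal", base) entries."""
    index = build_index(reference, k)  # full indexing: the step ODI removes
    ops = []
    i = 0
    while i < len(target):
        best_pos, best_len = -1, 0
        # Extend every seed hit greedily and keep the longest match.
        for pos in index.get(target[i:i + k], []):
            length = 0
            while (i + length < len(target)
                   and pos + length < len(reference)
                   and target[i + length] == reference[pos + length]):
                length += 1
            if length > best_len:
                best_pos, best_len = pos, length
        if best_len >= k:
            ops.append(("match", best_pos, best_len))
            i += best_len
        else:
            # No seed hit: store the base itself (a difference from the reference).
            ops.append(("literal", target[i]))
            i += 1
    return ops


def decompress(ops, reference):
    """Rebuild the target from the entries and the same reference."""
    parts = []
    for op in ops:
        if op[0] == "match":
            _, pos, length = op
            parts.append(reference[pos:pos + length])
        else:
            parts.append(op[1])
    return "".join(parts)
```

A target identical to the reference collapses to a single match entry, and a target with a few substitutions becomes a short list of matches and literals; in both cases `decompress` needs the same reference to restore the sequence.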

  5. Conservation patterns in angiosperm rDNA ITS2 sequences.

    PubMed Central

    Hershkovitz, M A; Zimmer, E A

    1996-01-01

    The two internal transcribed spacers (ITS1 and ITS2) of nuclear ribosomal DNA have become commonly exploited sources of informative variation for interspecific-/intergeneric-level phylogenetic analyses among angiosperms and other eukaryotes. We present an alignment in which one-third to one-half of the ITS2 sequence is alignable above the family level in angiosperms and a phenetic analysis showing that ITS2 contains information sufficient to diagnose lineages at several hierarchical levels. Base compositional analysis shows that angiosperm ITS2 is inherently GC-rich, and that the proportion of T is much more variable than that for other bases. We propose a general model of angiosperm ITS2 secondary structure that shows common pairing relationships for most of the conserved sequence tracts. Variations in our secondary structure predictions for sequences from different taxa indicate that compensatory mutation is not limited to paired positions. PMID:8760866

  6. Sequencing the extrachromosomal circular mobilome reveals retrotransposon activity in plants

    PubMed Central

    Llauro, Christel; Jobet, Edouard; Robakowska-Hyzorek, Dagmara; Lasserre, Eric; Ghesquière, Alain; Panaud, Olivier

    2017-01-01

    Retrotransposons are mobile genetic elements abundant in plant and animal genomes. While efficiently silenced by the epigenetic machinery, they can be reactivated upon stress or during development. Because their level of transcription does not reflect their transposition ability, it is difficult to evaluate their contribution to the active mobilome. Here we applied a simple methodology based on the high throughput sequencing of extrachromosomal circular DNA (eccDNA) forms of active retrotransposons to characterize the repertoire of mobile retrotransposons in plants. This method successfully identified known active retrotransposons in both Arabidopsis and rice material where the epigenome is destabilized. When applying mobilome-seq to developmental stages in wild type rice, we identified PopRice as a highly active retrotransposon producing eccDNA forms in the wild type endosperm. The mobilome-seq strategy opens new routes for the characterization of a yet unexplored fraction of plant genomes. PMID:28212378

  7. Neil2-null Mice Accumulate Oxidized DNA Bases in the Transcriptionally Active Sequences of the Genome and Are Susceptible to Innate Inflammation.

    PubMed

    Chakraborty, Anirban; Wakamiya, Maki; Venkova-Canova, Tatiana; Pandita, Raj K; Aguilera-Aguirre, Leopoldo; Sarker, Altaf H; Singh, Dharmendra Kumar; Hosoki, Koa; Wood, Thomas G; Sharma, Gulshan; Cardenas, Victor; Sarkar, Partha S; Sur, Sanjiv; Pandita, Tej K; Boldogh, Istvan; Hazra, Tapas K

    2015-10-09

    Why mammalian cells possess multiple DNA glycosylases (DGs) with overlapping substrate ranges for repairing oxidatively damaged bases via the base excision repair (BER) pathway is a long-standing question. To determine the biological role of these DGs, null animal models have been generated. Here, we report the generation and characterization of mice lacking Neil2 (Nei-like 2). As in mice deficient in each of the other four oxidized base-specific DGs (OGG1, NTH1, NEIL1, and NEIL3), Neil2-null mice show no overt phenotype. However, middle-aged to old Neil2-null mice show the accumulation of oxidative genomic damage, mostly in the transcribed regions. Immuno-pulldown analysis from wild-type (WT) mouse tissue showed the association of NEIL2 with RNA polymerase II, along with Cockayne syndrome group B protein, TFIIH, and other BER proteins. Chromatin immunoprecipitation analysis from mouse tissue showed co-occupancy of NEIL2 and RNA polymerase II only on the transcribed genes, consistent with our earlier in vitro findings on NEIL2's role in transcription-coupled BER. This study provides the first in vivo evidence of genomic region-specific repair in mammals. Furthermore, telomere loss and genomic instability were observed at a higher frequency in embryonic fibroblasts from Neil2-null mice than from the WT. Moreover, Neil2-null mice are much more responsive to inflammatory agents than WT mice. Taken together, our results underscore the importance of NEIL2 in protecting mammals from the development of various pathologies that are linked to genomic instability and/or inflammation. NEIL2 is thus likely to play an important role in long term genomic maintenance, particularly in long-lived mammals such as humans.

  8. Neil2-null Mice Accumulate Oxidized DNA Bases in the Transcriptionally Active Sequences of the Genome and Are Susceptible to Innate Inflammation

    PubMed Central

    Chakraborty, Anirban; Wakamiya, Maki; Venkova-Canova, Tatiana; Pandita, Raj K.; Aguilera-Aguirre, Leopoldo; Sarker, Altaf H.; Singh, Dharmendra Kumar; Hosoki, Koa; Wood, Thomas G.; Sharma, Gulshan; Cardenas, Victor; Sarkar, Partha S.; Sur, Sanjiv; Pandita, Tej K.; Boldogh, Istvan; Hazra, Tapas K.

    2015-01-01

    Why mammalian cells possess multiple DNA glycosylases (DGs) with overlapping substrate ranges for repairing oxidatively damaged bases via the base excision repair (BER) pathway is a long-standing question. To determine the biological role of these DGs, null animal models have been generated. Here, we report the generation and characterization of mice lacking Neil2 (Nei-like 2). As in mice deficient in each of the other four oxidized base-specific DGs (OGG1, NTH1, NEIL1, and NEIL3), Neil2-null mice show no overt phenotype. However, middle-aged to old Neil2-null mice show the accumulation of oxidative genomic damage, mostly in the transcribed regions. Immuno-pulldown analysis from wild-type (WT) mouse tissue showed the association of NEIL2 with RNA polymerase II, along with Cockayne syndrome group B protein, TFIIH, and other BER proteins. Chromatin immunoprecipitation analysis from mouse tissue showed co-occupancy of NEIL2 and RNA polymerase II only on the transcribed genes, consistent with our earlier in vitro findings on NEIL2's role in transcription-coupled BER. This study provides the first in vivo evidence of genomic region-specific repair in mammals. Furthermore, telomere loss and genomic instability were observed at a higher frequency in embryonic fibroblasts from Neil2-null mice than from the WT. Moreover, Neil2-null mice are much more responsive to inflammatory agents than WT mice. Taken together, our results underscore the importance of NEIL2 in protecting mammals from the development of various pathologies that are linked to genomic instability and/or inflammation. NEIL2 is thus likely to play an important role in long term genomic maintenance, particularly in long-lived mammals such as humans. PMID:26245904

  9. Next Generation Sequencing of DNA-Launched Chikungunya Vaccine Virus

    PubMed Central

    Hidajat, Rachmat; Nickols, Brian; Forrester, Naomi; Tretyakova, Irina; Weaver, Scott; Pushko, Peter

    2016-01-01

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3’ untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. PMID:26855330

  10. Maternal Plasma DNA and RNA Sequencing for Prenatal Testing.

    PubMed

    Tamminga, Saskia; van Maarle, Merel; Henneman, Lidewij; Oudejans, Cees B M; Cornel, Martina C; Sistermans, Erik A

    2016-01-01

    Cell-free DNA (cfDNA) testing has recently become indispensable in diagnostic testing and screening. In the prenatal setting, this type of testing is often called noninvasive prenatal testing (NIPT). With a number of techniques, using either next-generation sequencing or single nucleotide polymorphism-based approaches, fetal cfDNA in maternal plasma can be analyzed to screen for rhesus D genotype, common chromosomal aneuploidies, and increasingly for testing other conditions, including monogenic disorders. With regard to screening for common aneuploidies, challenges arise when implementing NIPT in current prenatal settings. Depending on the method used (targeted or nontargeted), chromosomal anomalies other than trisomy 21, 18, or 13 can be detected, either of fetal or maternal origin, also referred to as unsolicited or incidental findings. For various biological reasons, there is a small chance of having either a false-positive or false-negative NIPT result, or no result, also referred to as a "no-call." Both pre- and posttest counseling for NIPT should include discussing potential discrepancies. Since NIPT remains a screening test, a positive NIPT result should be confirmed by invasive diagnostic testing (either by chorionic villus biopsy or by amniocentesis). As the scope of NIPT is widening, professional guidelines need to discuss the ethics of what to offer and how to offer. In this review, we discuss the current biochemical, clinical, and ethical challenges of cfDNA testing in the prenatal setting and its future perspectives including novel applications that target RNA instead of DNA.

  11. Genomic and cDNA actin sequences from a virulent strain of Entamoeba histolytica.

    PubMed Central

    Edman, U; Meza, I; Agabian, N

    1987-01-01

    Invasiveness of Entamoeba histolytica strains that cause acute amoebiasis is characterized by aggressive behavior associated with cell motility and actin function. Analysis of actin genes from E. histolytica was initiated by devising methods for the isolation of biologically active nucleic acids, which allowed the preparation of cDNA and genomic DNA libraries. E. histolytica actin-encoding cDNAs and genomic clones have been isolated from libraries prepared from the virulent HM1:IMSS strain using a heterologous actin probe. Nucleotide sequence analysis of three independent cDNA clones and one genomic clone reveals a highly unusual codon bias and the absence of intervening sequences in E. histolytica actin. The coding sequence of the genomic clone is identical to that of two of the three cDNA clones. These represent at least two distinct mRNAs differing only by five silent changes in the protein coding sequence. Multiple genomic copies of the actin gene can be detected by Southern hybridization. E. histolytica actin exhibits a higher degree of homology to cytoplasmic than to muscle actin. Although the protein has been shown not to bind DNase I, the inferred amino acid sequence indicates conservation of all residues implied to participate in this binding. PMID:2883657

  12. DNA sequence determinants controlling affinity, stability and shape of DNA complexes bound by the nucleoid protein Fis

    SciTech Connect

    Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; Johnson, Reid C.; Leng, Fenfei

    2016-03-09

    The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.

  13. Complete genome sequence of mitochondrial DNA (mtDNA) of Chlorella sorokiniana.

    PubMed

    Orsini, Massimiliano; Costelli, Cristina; Malavasi, Veronica; Cusano, Roberto; Concas, Alessandro; Angius, Andrea; Cao, Giacomo

    2016-01-01

    The complete sequence of the mitochondrial genome of the Chlorella sorokiniana strain (SAG 111-8 k) is presented in this work. Within the Chlorella genus, it represents the second species with a completely sequenced and annotated mitochondrial genome (GenBank accession no. KM241869). The genome consists of circular chromosomes of 52,528 bp and encodes a total of 31 protein coding genes, 3 rRNAs and 26 tRNAs. The overall AT content of the C. sorokiniana mtDNA is 70.89%, while the coding sequence is of 97.4%.
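An overall AT content such as the one reported above is a simple base-composition percentage. The sketch below shows one way it could be computed from a sequence string; the function name and the choice to skip ambiguous symbols such as N are assumptions for this example, not details taken from the record.

```python
def at_content(sequence):
    """Return the percentage of A and T among the unambiguous bases."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at_bases = sum(1 for base in counted if base in "AT")
    return 100.0 * at_bases / len(counted)
```

Applied to a full mitochondrial genome read from a FASTA file, the same function would return a single percentage for the whole molecule; applying it to the concatenated coding regions would give the composition of the coding fraction.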

  14. DNA sequence templates adjacent nucleosome and ORC sites at gene amplification origins in Drosophila.

    PubMed

    Liu, Jun; Zimmer, Kurt; Rusch, Douglas B; Paranjape, Neha; Podicheti, Ram; Tang, Haixu; Calvi, Brian R

    2015-10-15

    Eukaryotic origins of DNA replication are bound by the origin recognition complex (ORC), which scaffolds assembly of a pre-replicative complex (pre-RC) that is then activated to initiate replication. Both pre-RC assembly and activation are strongly influenced by developmental changes to the epigenome, but molecular mechanisms remain incompletely defined. We have been examining the activation of origins responsible for developmental gene amplification in Drosophila. At a specific time in oogenesis, somatic follicle cells transition from genomic replication to a locus-specific replication from six amplicon origins. Previous evidence indicated that these amplicon origins are activated by nucleosome acetylation, but how this affects origin chromatin is unknown. Here, we examine nucleosome position in follicle cells using micrococcal nuclease digestion with Illumina sequencing. The results indicate that ORC binding sites and other essential origin sequences are nucleosome-depleted regions (NDRs). Nucleosome position at the amplicons was highly similar among developmental stages during which ORC is or is not bound, indicating that being an NDR is not sufficient to specify ORC binding. Importantly, the data suggest that nucleosomes and ORC have opposite preferences for DNA sequence and structure. We propose that nucleosome hyperacetylation promotes pre-RC assembly onto adjacent DNA sequences that are disfavored by nucleosomes but favored by ORC.

  15. DNA sequence templates adjacent nucleosome and ORC sites at gene amplification origins in Drosophila

    PubMed Central

    Liu, Jun; Zimmer, Kurt; Rusch, Douglas B.; Paranjape, Neha; Podicheti, Ram; Tang, Haixu; Calvi, Brian R.

    2015-01-01

    Eukaryotic origins of DNA replication are bound by the origin recognition complex (ORC), which scaffolds assembly of a pre-replicative complex (pre-RC) that is then activated to initiate replication. Both pre-RC assembly and activation are strongly influenced by developmental changes to the epigenome, but molecular mechanisms remain incompletely defined. We have been examining the activation of origins responsible for developmental gene amplification in Drosophila. At a specific time in oogenesis, somatic follicle cells transition from genomic replication to a locus-specific replication from six amplicon origins. Previous evidence indicated that these amplicon origins are activated by nucleosome acetylation, but how this affects origin chromatin is unknown. Here, we examine nucleosome position in follicle cells using micrococcal nuclease digestion with Illumina sequencing. The results indicate that ORC binding sites and other essential origin sequences are nucleosome-depleted regions (NDRs). Nucleosome position at the amplicons was highly similar among developmental stages during which ORC is or is not bound, indicating that being an NDR is not sufficient to specify ORC binding. Importantly, the data suggest that nucleosomes and ORC have opposite preferences for DNA sequence and structure. We propose that nucleosome hyperacetylation promotes pre-RC assembly onto adjacent DNA sequences that are disfavored by nucleosomes but favored by ORC. PMID:26227968

  16. Genomic Heat Shock Element Sequences Drive Cooperative Human Heat Shock Factor 1 DNA Binding and Selectivity

    PubMed Central

    Jaeger, Alex M.; Makley, Leah N.; Gestwicki, Jason E.; Thiele, Dennis J.

    2014-01-01

    The heat shock transcription factor 1 (HSF1) activates expression of a variety of genes involved in cell survival, including protein chaperones, the protein degradation machinery, anti-apoptotic proteins, and transcription factors. Although HSF1 activation has been linked to amelioration of neurodegenerative disease, cancer cells exhibit a dependence on HSF1 for survival. Indeed, HSF1 drives a program of gene expression in cancer cells that is distinct from that activated in response to proteotoxic stress, and HSF1 DNA binding activity is elevated in cycling cells as compared with arrested cells. Active HSF1 homotrimerizes and binds to a DNA sequence consisting of inverted repeats of the pentameric sequence nGAAn, known as heat shock elements (HSEs). Recent comprehensive ChIP-seq experiments demonstrated that the architecture of HSEs is very diverse in the human genome, with deviations from the consensus sequence in the spacing, orientation, and extent of HSE repeats that could influence HSF1 DNA binding efficacy and the kinetics and magnitude of target gene expression. To understand the mechanisms that dictate binding specificity, HSF1 was purified as either a monomer or trimer and used to evaluate DNA-binding site preferences in vitro using fluorescence polarization and thermal denaturation profiling. These results were compared with quantitative chromatin immunoprecipitation assays in vivo. We demonstrate a role for specific orientations of extended HSE sequences in driving preferential HSF1 DNA binding to target loci in vivo. These studies provide a biochemical basis for understanding differential HSF1 target gene recognition and transcription in neurodegenerative disease and in cancer. PMID:25204655
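The consensus described above, inverted repeats of the pentamer nGAAn, can be illustrated with a small scanner. This is a simplified sketch that only detects contiguous, perfectly alternating units (nGAAn followed by nTTCn, and so on); it deliberately ignores the spacing and orientation deviations that the abstract reports for genomic HSEs. The function names and the default of three units are assumptions for this example.

```python
def unit_orientation(pentamer):
    """Classify a 5-mer as a head ('+', nGAAn), a tail ('-', nTTCn), or neither."""
    core = pentamer[1:4]
    if core == "GAA":
        return "+"
    if core == "TTC":
        return "-"
    return None


def find_hse(sequence, min_units=3):
    """Return (start, unit_count) for runs of alternating nGAAn / nTTCn units."""
    seq = sequence.upper()
    hits = []
    i = 0
    while i + 5 <= len(seq):
        prev = unit_orientation(seq[i:i + 5])
        if prev is None:
            i += 1
            continue
        units, j = 1, i + 5
        # Extend while the next pentamer has the opposite orientation.
        while j + 5 <= len(seq):
            cur = unit_orientation(seq[j:j + 5])
            if cur is None or cur == prev:
                break
            units, prev, j = units + 1, cur, j + 5
        if units >= min_units:
            hits.append((i, units))
            i = j  # continue after the reported element
        else:
            i += 1
    return hits
```

Direct repeats such as AGAACAGAACAGAAC are not reported, because consecutive units must alternate in orientation. A scanner for the diverse genomic architectures mentioned in the abstract would need to tolerate gaps and mismatches, which this sketch does not attempt.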

  17. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1995-03-21

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 11 figures.

  18. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 12 figs.

  19. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  20. Random-breakage mapping method applied to human DNA sequences

    NASA Technical Reports Server (NTRS)

    Lobrich, M.; Rydberg, B.; Cooper, P. K.; Chatterjee, A. (Principal Investigator)

    1996-01-01

    The random-breakage mapping method [Game et al. (1990) Nucleic Acids Res., 18, 4453-4461] was applied to DNA sequences in human fibroblasts. The methodology involves NotI restriction endonuclease digestion of DNA from irradiated cells, followed by pulsed-field gel electrophoresis, Southern blotting and hybridization with DNA probes recognizing the single copy sequences of interest. The Southern blots show a band for the unbroken restriction fragments and a smear below this band due to radiation induced random breaks. This smear pattern contains two discontinuities in intensity at positions that correspond to the distance of the hybridization site to each end of the restriction fragment. By analyzing the positions of those discontinuities we confirmed the previously mapped position of the probe DXS1327 within a NotI fragment on the X chromosome, thus demonstrating the validity of the technique. We were also able to position the probes D21S1 and D21S15 with respect to the ends of their corresponding NotI fragments on chromosome 21. A third chromosome 21 probe, D21S11, has previously been reported to be close to D21S1, although an uncertainty about a second possible location existed. Since both probes D21S1 and D21S11 hybridized to a single NotI fragment and yielded a similar smear pattern, this uncertainty is removed by the random-breakage mapping method.
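The origin of the two discontinuities can be illustrated with a small simulation. The sketch below assumes breaks fall as a uniform Poisson process along a restriction fragment and records the size of the piece that still carries the probe site. With a single break, that piece can never be shorter than the distance from the probe to the nearer fragment end, which is why the size distribution changes abruptly at the two probe-to-end distances. The fragment length, probe position, break rate, and function names are hypothetical values chosen for illustration, not data from the study.

```python
import random


def probe_fragment_size(length, probe_pos, breaks):
    """Size of the piece containing the probe site after the given breaks."""
    left = max((b for b in breaks if b <= probe_pos), default=0)
    right = min((b for b in breaks if b > probe_pos), default=length)
    return right - left


def simulate_sizes(length, probe_pos, mean_breaks, n_molecules, seed=0):
    """Simulate probe-carrying fragment sizes under uniform random breakage."""
    rng = random.Random(seed)
    rate = mean_breaks / length  # expected breaks per unit length
    sizes = []
    for _ in range(n_molecules):
        # Break positions of a Poisson process: exponential gaps along the fragment.
        breaks = []
        pos = rng.expovariate(rate)
        while pos < length:
            breaks.append(pos)
            pos += rng.expovariate(rate)
        sizes.append(probe_fragment_size(length, probe_pos, breaks))
    return sizes
```

Binning the output of, say, `simulate_sizes(1000, 300, 1.0, 100000)` into a histogram should show the full-length band at 1000 plus a smear whose density steps near 300 and 700, the two probe-to-end distances for this hypothetical fragment.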

  1. Sequence Heterogeneity Accelerates Protein Search for Targets on DNA

    NASA Astrophysics Data System (ADS)

    Shvets, Alexey; Kolomeisky, Anatoly

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry and heterogeneity of a genome. The work was supported by the Welch Foundation (Grant C-1559), by the NSF (Grant CHE-1360979), and by the Center for Theoretical Biological Physics sponsored by the NSF (Grant PHY-1427654).

  2. cDNA encoding a polypeptide including a hevein sequence

    SciTech Connect

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    2000-07-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  3. Using Synthetic Nanopores for Single-Molecule Analyses: Detecting SNPs, Trapping DNA Molecules, and the Prospects for Sequencing DNA

    ERIC Educational Resources Information Center

    Dimitrov, Valentin V.

    2009-01-01

    This work focuses on studying properties of DNA molecules and DNA-protein interactions using synthetic nanopores, and it examines the prospects of sequencing DNA using synthetic nanopores. We have developed a method for discriminating between alleles that uses a synthetic nanopore to measure the binding of a restriction enzyme to DNA. There exists…

  4. Significance of satellite DNA revealed by conservation of a widespread repeat DNA sequence among angiosperms.

    PubMed

    Mehrotra, Shweta; Goel, Shailendra; Raina, Soom Nath; Rajpal, Vijay Rani

    2014-08-01

    The analysis of plant genome structure and evolution requires comprehensive characterization of repetitive sequences that make up the majority of plant nuclear DNA. In the present study, we analyzed the nature of pCtKpnI-I and pCtKpnI-II tandem repeated sequences, reported earlier in Carthamus tinctorius. Interestingly, a homolog of the pCtKpnI-I repeat sequence was also found to be present in widely divergent families of angiosperms. pCtKpnI-I showed high sequence similarity but low copy number among various taxa of different families of angiosperms analyzed. In comparison, pCtKpnI-II was specific to the genus Carthamus and was not present in any other taxa analyzed. The molecular structure of pCtKpnI-I was analyzed in various unrelated taxa of angiosperms to decipher the evolutionary conserved nature of the sequence and its possible functional role.

  5. Nucleotide sequence of cloned cDNA for human pancreatic kallikrein.

    PubMed

    Fukushima, D; Kitamura, N; Nakanishi, S

    1985-12-31

    Cloned cDNA sequences for human pancreatic kallikrein have been isolated and determined by molecular cloning and sequence analysis. The identity between human pancreatic and urinary kallikreins is indicated by the complete coincidence between the amino acid sequence deduced from the cloned cDNA sequence and that reported partially for urinary kallikrein. The active enzyme form of the human pancreatic kallikrein consists of 238 amino acids and is preceded by a signal peptide and a profragment of 24 amino acids. A sequence comparison of this with other mammalian kallikreins indicates that key amino acid residues required for both serine protease activity and kallikrein-like cleavage specificity are retained in the human sequence, and residues corresponding to some external loops of the kallikrein diverge from other kallikreins. Analyses by RNA blot hybridization, primer extension, and S1 nuclease mapping indicate that the pancreatic kallikrein mRNA is also expressed in the kidney and sublingual gland, suggesting the active synthesis of urinary kallikrein in these tissues. Furthermore, the tissue-specific regulation of the expression of the members of the human kallikrein gene family has been discussed.

  6. Genetic variability of Taenia saginata inferred from mitochondrial DNA sequences.

    PubMed

    Rostami, Sima; Salavati, Reza; Beech, Robin N; Babaei, Zahra; Sharbatkhori, Mitra; Harandi, Majid Fasihi

    2015-04-01

    Taenia saginata is an important tapeworm, infecting humans in many parts of the world. The present study was undertaken to identify inter- and intraspecific variation of T. saginata isolated from cattle in different parts of Iran using two mitochondrial genes, CO1 and 12S rRNA. Up to 105 bovine specimens of T. saginata were collected from 20 slaughterhouses in three provinces of Iran. DNA was extracted from the metacestode Cysticercus bovis. After PCR amplification, the CO1 and 12S rRNA genes were sequenced, and two phylogenetic analyses of the sequence data were generated by Bayesian inference on the CO1 and 12S rRNA sequences. Sequence analyses of the CO1 and 12S rRNA genes showed 11 and 29 representative profiles, respectively. The level of pairwise nucleotide variation between individual haplotypes of the CO1 gene was 0.3-2.4%, while the overall nucleotide variation among all 11 haplotypes was 4.6%. For the 12S rRNA sequence data, the level of pairwise nucleotide variation was 0.2-2.5%, and the overall nucleotide variation was 5.8% among the 29 haplotypes of the 12S rRNA gene. Considerable genetic diversity was found in both mitochondrial genes, particularly in the 12S rRNA gene.
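
    The pairwise nucleotide variation reported above can be illustrated with a minimal sketch. It assumes the simple proportion of differing sites (p-distance) between aligned haplotypes; the abstract does not state which distance measure the authors used, and the toy haplotypes below are hypothetical, not data from the study.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return diffs / len(a)


# Hypothetical toy haplotypes (20 aligned sites each), for illustration only.
haplotypes = {
    "H1": "ATGCATGCATGCATGCATGC",
    "H2": "ATGCATGCATGTATGCATGC",
    "H3": "ATGAATGCATGTATGCATGC",
}

# Distance for every unordered pair of haplotypes.
pairwise = {
    (i, j): p_distance(haplotypes[i], haplotypes[j])
    for i, j in combinations(haplotypes, 2)
}

for (i, j), d in pairwise.items():
    print(f"{i} vs {j}: {100 * d:.1f}% variation")
```

    The range of these pairwise values corresponds to the kind of interval quoted in the abstract (for example 0.3-2.4% for CO1).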

  7. Phylogenetic inference of Indian malaria vectors from multilocus DNA sequences.

    PubMed

    Dixit, Jyotsana; Srivastava, Hemlata; Sharma, Meenu; Das, Manoj K; Singh, O P; Raghavendra, K; Nanda, Nutan; Dash, Aditya P; Saksena, D N; Das, Aparup

    2010-08-01

    Inference of the taxonomic positions, phylogenetic interrelationships and divergence times among closely related species of medical importance is essential for understanding evolutionary patterns among species, on the basis of which disease control measures could be devised. In this respect, malaria is one of the important mosquito-borne diseases of tropical and sub-tropical parts of the globe. The taxonomic status of malaria vectors has so far been documented based on morphological, cytological and a few molecular genetic features. However, the use of multilocus DNA sequences in phylogenetic inference is still scarce. India contains one of the richest resources of mosquito species diversity, but little molecular taxonomic information is available for Indian malaria vectors. We utilized the whole genome sequence information of An. gambiae to amplify and sequence three orthologous nuclear genetic regions in six Indian malaria vector species (An. culicifacies, An. minimus, An. sundaicus, An. fluviatilis, An. annularis and An. stephensi). Further, we utilized the previously published DNA sequence information on the COII and ITS2 genes in all six species, bringing the total number of loci to five. A multilocus molecular phylogenetic study of Indian anophelines and An. gambiae was conducted at each individual genetic region using Neighbour Joining (NJ), Maximum Likelihood (ML), Maximum Parsimony (MP) and Bayesian approaches. Although tree topologies with the COII and ITS2 genes were similar, similar tree topologies were not observed for the other three genetic regions. In general, the reconstructed phylogenetic status of Indian malaria vectors follows the pattern based on morphological and cytological classifications, which was reconfirmed with the COII and ITS2 genetic regions. Further, divergence times based on COII gene sequences were estimated among the seven Anopheles species which corroborate the earlier hypothesis on the radiation of different species of the Anopheles

  8. Isolation and analysis of high quality nuclear DNA with reduced organellar DNA for plant genome sequencing and resequencing

    PubMed Central

    2011-01-01

    Background High throughput sequencing (HTS) technologies have revolutionized the field of genomics by drastically reducing the cost of sequencing, making it feasible for individual labs to sequence or resequence plant genomes. Obtaining high quality, high molecular weight DNA from plants poses significant challenges due to the high copy number of chloroplast and mitochondrial DNA, as well as high levels of phenolic compounds and polysaccharides. Multiple methods have been used to isolate DNA from plants; the CTAB method is commonly used to isolate total cellular DNA from plants that contain nuclear DNA, as well as chloroplast and mitochondrial DNA. Alternatively, DNA can be isolated from nuclei to minimize chloroplast and mitochondrial DNA contamination. Results We describe optimized protocols for isolation of nuclear DNA from eight different plant species encompassing both monocot and eudicot species. These protocols use nuclei isolation to minimize chloroplast and mitochondrial DNA contamination. We also developed a protocol to determine the number of chloroplast and mitochondrial DNA copies relative to the nuclear DNA using quantitative real time PCR (qPCR). We compared DNA isolated from nuclei to total cellular DNA isolated with the CTAB method. As expected, DNA isolated from nuclei consistently yielded nuclear DNA with fewer chloroplast and mitochondrial DNA copies, as compared to the total cellular DNA prepared with the CTAB method. This protocol will allow for analysis of the quality and quantity of nuclear DNA before starting a plant whole genome sequencing or resequencing experiment. Conclusions Extracting high quality, high molecular weight nuclear DNA in plants has the potential to be a bottleneck in the era of whole genome sequencing and resequencing. The methods that are described here provide a framework for researchers to extract and quantify nuclear DNA in multiple types of plants. PMID:21599914
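
    The qPCR comparison of organellar and nuclear DNA copies can be sketched as follows. This is a generic delta-Ct calculation, assuming perfect doubling per cycle by default; the abstract does not give the authors' exact formula, and the function name and Ct values are hypothetical.

```python
def relative_copies(ct_target: float, ct_reference: float, efficiency: float = 2.0) -> float:
    """Copies of the target amplicon per copy of the reference amplicon.

    Uses the delta-Ct relation efficiency ** (Ct_reference - Ct_target).
    efficiency = 2.0 assumes the product doubles every cycle (an idealisation).
    """
    return efficiency ** (ct_reference - ct_target)


# Hypothetical Ct values: a chloroplast amplicon versus a single-copy nuclear amplicon.
total_dna = relative_copies(ct_target=15.0, ct_reference=25.0)    # CTAB-style total DNA
nuclei_dna = relative_copies(ct_target=22.0, ct_reference=25.0)   # DNA isolated from nuclei

print(f"total DNA prep:  {total_dna:.0f} chloroplast copies per nuclear copy")
print(f"nuclei DNA prep: {nuclei_dna:.0f} chloroplast copies per nuclear copy")
```

    A lower ratio in the nuclei-derived preparation is the pattern the abstract describes, namely fewer organellar copies relative to nuclear DNA.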

  9. Computational identification of transcriptional regulatory elements in DNA sequence

    PubMed Central

    GuhaThakurta, Debraj

    2006-01-01

    Identification and annotation of all the functional elements in the genome, including genes and the regulatory sequences, is a fundamental challenge in genomics and computational biology. Since regulatory elements are frequently short and variable, their identification and discovery using computational algorithms is difficult. However, significant advances have been made in the computational methods for modeling and detection of DNA regulatory elements. The availability of complete genome sequence from multiple organisms, as well as mRNA profiling and high-throughput experimental methods for mapping protein-binding sites in DNA, have contributed to the development of methods that utilize these auxiliary data to inform the detection of transcriptional regulatory elements. Progress is also being made in the identification of cis-regulatory modules and higher order structures of the regulatory sequences, which is essential to the understanding of transcription regulation in the metazoan genomes. This article reviews the computational approaches for modeling and identification of genomic regulatory elements, with an emphasis on the recent developments, and current challenges. PMID:16855295

  10. Compilation of DNA sequences of Escherichia coli (update 1993).

    PubMed Central

    Kröger, M; Wahl, R; Rice, P

    1993-01-01

    We have compiled the DNA sequence data for E. coli available from the GENBANK and EMBL data libraries and, over a period of several years, independently from the literature. This is the fifth listing, replacing and substantially extending the former listings. However, in order to save space, this printed version contains DNA sequence information only if it is publicly available in electronic form. The complete compilation, including a full set of genetic map data and the E. coli protein index, can be obtained in machine-readable form from the EMBL data library (ECD release 15) as a part of the CD-ROM issue of the EMBL sequence database, released and updated every three months. After deletion of all detected overlaps, a total of 2,353,635 individual bp had been determined by the end of April 1993. This corresponds to 49.87% of the entire E. coli chromosome, which consists of about 4,720 kbp. This number may actually be higher by 9161 bp derived from other strains of E. coli. PMID:8332520
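
    The coverage figure quoted above can be checked with a few lines of arithmetic. The sketch treats the chromosome as exactly 4,720 kbp, which the abstract gives only as an approximate size.

```python
sequenced_bp = 2_353_635       # non-overlapping bp determined by end of April 1993
other_strain_bp = 9_161        # additional bp derived from other E. coli strains
chromosome_bp = 4_720 * 1_000  # "about 4,720 kbp", taken as exact for this check

coverage_pct = 100 * sequenced_bp / chromosome_bp
coverage_with_other_pct = 100 * (sequenced_bp + other_strain_bp) / chromosome_bp

print(f"coverage: {coverage_pct:.2f}%")
print(f"coverage including other strains: {coverage_with_other_pct:.2f}%")
```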

  11. DNA sequence determinants controlling affinity, stability and shape of DNA complexes bound by the nucleoid protein Fis

    DOE PAGES

    Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; ...

    2016-03-09

    The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.

  12. The complete 685-kilobase DNA sequence of the human β T cell receptor locus

    SciTech Connect

    Rowen, L.; Koop, B.F.; Hood, L.

    1996-06-21

    The human β T cell receptor (TCR) locus, comprising a complex family of genes, has been sequenced. The locus contains two types of coding elements: TCR elements (65 variable gene segments and two clusters of diversity, joining, and constant segments) and eight trypsinogen genes. Together these constitute 4.6 percent of the DNA. Genome-wide interspersed repeats and locus-specific repeats span 30 and 47 percent, respectively, of the 685-kilobase sequence. A comparison of the germline variable elements with their approximately 300 complementary DNA counterparts reveals marked differential patterns of variable gene expression, the importance of exonuclease activity in generating TCR diversity, and the predominant tendency for only functional variable elements to be present in complementary DNA libraries. 47 refs., 2 figs., 2 tabs.

  13. Cloning and sequencing of human intestinal alkaline phosphatase cDNA

    SciTech Connect

    Berger, J.; Garattini, E.; Hua, J.C.; Udenfriend, S.

    1987-02-01

    Partial protein sequence data obtained on intestinal alkaline phosphatase indicated a high degree of homology with the reported sequence of the placental isoenzyme. Accordingly, placental alkaline phosphatase cDNA was cloned and used as a probe to clone intestinal alkaline phosphatase cDNA. The latter is somewhat larger (3.1 kilobases) than the cDNA for the placental isozyme (2.8 kilobases). Although the 3' untranslated regions are quite different, there is almost 90% homology in the translated regions of the two isozymes. There are, however, significant differences at their amino and carboxyl termini and a substitution of an alanine in intestinal alkaline phosphatase for a glycine in the active site of the placental isozyme.

  14. Extreme-Depth Re-sequencing of Mitochondrial DNA Finds No Evidence of Paternal Transmission in Humans.

    PubMed

    Pyle, Angela; Hudson, Gavin; Wilson, Ian J; Coxhead, Jonathan; Smertenko, Tania; Herbert, Mary; Santibanez-Koref, Mauro; Chinnery, Patrick F

    2015-05-01

    Recent reports have questioned the accepted dogma that mammalian mitochondrial DNA (mtDNA) is strictly maternally inherited. In humans, the argument hinges on detecting a signature of inter-molecular recombination in mtDNA sequences sampled at the population level, inferring a paternal source for the mixed haplotypes. However, interpreting these data is fraught with difficulty, and direct experimental evidence is lacking. Using extreme-high depth mtDNA re-sequencing up to ~1.2 million-fold coverage, we find no evidence that paternal mtDNA haplotypes are transmitted to offspring in humans, thus excluding a simple dilution mechanism for uniparental transmission of mtDNA present in all healthy individuals. Our findings indicate that an active mechanism eliminates paternal mtDNA which likely acts at the molecular level.

  15. Extreme-Depth Re-sequencing of Mitochondrial DNA Finds No Evidence of Paternal Transmission in Humans

    PubMed Central

    Pyle, Angela; Hudson, Gavin; Wilson, Ian J.; Coxhead, Jonathan; Smertenko, Tania; Herbert, Mary; Santibanez-Koref, Mauro; Chinnery, Patrick F.

    2015-01-01

    Recent reports have questioned the accepted dogma that mammalian mitochondrial DNA (mtDNA) is strictly maternally inherited. In humans, the argument hinges on detecting a signature of inter-molecular recombination in mtDNA sequences sampled at the population level, inferring a paternal source for the mixed haplotypes. However, interpreting these data is fraught with difficulty, and direct experimental evidence is lacking. Using extreme-high depth mtDNA re-sequencing up to ~1.2 million-fold coverage, we find no evidence that paternal mtDNA haplotypes are transmitted to offspring in humans, thus excluding a simple dilution mechanism for uniparental transmission of mtDNA present in all healthy individuals. Our findings indicate that an active mechanism eliminates paternal mtDNA which likely acts at the molecular level. PMID:25973765

  16. Legume genomics: understanding biology through DNA and RNA sequencing

    PubMed Central

    O'Rourke, Jamie A.; Bolon, Yung-Tsi; Bucciarelli, Bruna; Vance, Carroll P.

    2014-01-01

    Background The legume family (Leguminosae) consists of approx. 17 000 species. A few of these species, including, but not limited to, Phaseolus vulgaris, Cicer arietinum and Cajanus cajan, are important dietary components, providing protein for approx. 300 million people worldwide. Additional species, including soybean (Glycine max) and alfalfa (Medicago sativa), are important crops utilized mainly in animal feed. In addition, legumes are important contributors to biological nitrogen, forming symbiotic relationships with rhizobia to fix atmospheric N2 and providing up to 30 % of available nitrogen for the next season of crops. The application of high-throughput genomic technologies including genome sequencing projects, genome re-sequencing (DNA-seq) and transcriptome sequencing (RNA-seq) by the legume research community has provided major insights into genome evolution, genomic architecture and domestication. Scope and Conclusions This review presents an overview of the current state of legume genomics and explores the role that next-generation sequencing technologies play in advancing legume genomics. The adoption of next-generation sequencing and implementation of associated bioinformatic tools has allowed researchers to turn each species of interest into their own model organism. To illustrate the power of next-generation sequencing, an in-depth overview of the transcriptomes of both soybean and white lupin (Lupinus albus) is provided. The soybean transcriptome focuses on analysing seed development in two near-isogenic lines, examining the role of transporters, oil biosynthesis and nitrogen utilization. The white lupin transcriptome analysis examines how phosphate deficiency alters gene expression patterns, inducing the formation of cluster roots. Such studies illustrate the power of next-generation sequencing and bioinformatic analyses in elucidating the gene networks underlying biological processes. PMID:24769535

  17. Mitochondrial DNA Sequencing of Cat Hair: An Informative Forensic Tool*

    PubMed Central

    Tarditi, Christy R.; Grahn, Robert A.; Evans, Jeffrey J.; Kurushima, Jennifer D.; Lyons, Leslie A.

    2010-01-01

    Approximately 81.7 million cats are in 37.5 million USA households. Shed fur can be criminal evidence due to transfer to victims, suspects, and / or their belongings. To improve cat hairs as forensic evidence, the mtDNA control region from single hairs, with and without root tags, was sequenced. A dataset of a 402 bp CR segment from 174 random-bred cats representing four USA geographic areas was generated to determine the informativeness of the mtDNA region. Thirty-two mtDNA mitotypes were observed ranging in frequencies from 0.6-27%. Four common types occurred in all populations. Low heteroplasmy, 1.7%, was determined. Unique mitotypes were found in 18 individuals, 10.3% of the population studied. The calculated discrimination power implied that 8.3 of 10 randomly selected individuals can be excluded by this region. The genetic characteristics of the region and the generated dataset support the use of this cat mtDNA region in forensic applications. PMID:21077873
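
    The unique-mitotype percentage above follows directly from the counts, and a discrimination power can be computed from mitotype counts in general. The sketch below assumes the common definition, one minus the sum of squared type frequencies; the abstract does not state the authors' formula, and the example counts are hypothetical rather than the study's data.

```python
def discrimination_power(counts: list[int]) -> float:
    """Probability that two randomly drawn individuals carry different types.

    Assumes the definition 1 - sum(p_i ** 2), where p_i is the frequency of type i.
    """
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Figure taken from the abstract: 18 unique mitotypes among 174 cats.
unique_pct = 100 * 18 / 174
print(f"unique mitotypes: {unique_pct:.1f}% of individuals")

# Hypothetical mitotype counts for a toy population of 10 cats.
toy_counts = [5, 3, 2]
print(f"toy discrimination power: {discrimination_power(toy_counts):.2f}")
```

    Under this definition, a discrimination power near 0.83 corresponds to the abstract's statement that about 8.3 of 10 randomly selected individuals can be excluded.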

  18. Retroviral DNA Sequences as a Means for Determining Ancient Diets

    PubMed Central

    Rivera-Perez, Jessica I.; Cano, Raul J.; Narganes-Storde, Yvonne; Chanlatte-Baik, Luis; Toranzos, Gary A.

    2015-01-01

    For ages, specialists from varying fields have studied the diets of the primeval inhabitants of our planet, detecting diet remains in archaeological specimens using a range of morphological and biochemical methods. Recently, metagenomic ancient DNA studies have allowed for the comparison of the fecal and gut microbiomes associated with archaeological specimens from various regions of the world; however, the complex dynamics represented in those microbial communities still remain unclear. Theoretically, as with eukaryote DNA, the presence of genes from key microbes or enzymes, as well as the presence of DNA from viruses specific to key organisms, may suggest the ingestion of specific diet components. In this study we demonstrate that ancient virus DNA obtained from coprolites also provides information for reconstructing the host’s diet, as inferred from sequences obtained from pre-Columbian coprolites. This represents a novel and reliable approach to determine new components as well as validate the previously suggested diets of extinct cultures and animals. Furthermore, to our knowledge this represents the first description of the eukaryotic viral diversity found in paleofaeces belonging to pre-Columbian cultures. PMID:26660678

  19. Entropy and long-range correlations in DNA sequences.

    PubMed

    Melnik, S S; Usatenko, O V

    2014-12-01

    We analyze the structure of DNA molecules of different organisms by using the additive Markov chain approach. Transforming nucleotide sequences into binary strings, we perform statistical analysis of the corresponding "texts". We develop the theory of N-step additive binary stationary ergodic Markov chains and analyze their differential entropy. Supposing that the correlations are weak, we express the conditional probability function of the chain by means of the pair correlation function and represent the entropy as a functional of the pair correlator. Since the model uses two-point correlators instead of the probability of block occurrence, it makes it possible to calculate the entropy of subsequences at much longer distances than with the standard methods. We utilize the obtained analytical result for numerical evaluation of the entropy of coarse-grained DNA texts. We believe that the entropy study can be used for biological classification of living species.
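
    The first two steps described above, mapping nucleotides to a binary string and estimating a pair correlation function, can be sketched as follows. The purine/pyrimidine mapping and the simple estimator are assumptions chosen for illustration; the abstract does not specify the mapping or estimator the authors used, and the entropy functional itself is not reproduced here.

```python
def to_binary(seq: str, ones: str = "AG") -> list[int]:
    """Map a nucleotide string to 0/1 symbols.

    By default purines (A, G) become 1 and pyrimidines (C, T) become 0;
    this particular mapping is an assumption, not taken from the abstract.
    """
    return [1 if base in ones else 0 for base in seq.upper()]


def pair_correlation(bits: list[int], r: int) -> float:
    """Estimate C(r) = <(a_i - mean) * (a_{i+r} - mean)> over the sequence."""
    n = len(bits) - r
    if r < 0 or n <= 0:
        raise ValueError("r must satisfy 0 <= r < len(bits)")
    mean = sum(bits) / len(bits)
    return sum((bits[i] - mean) * (bits[i + r] - mean) for i in range(n)) / n


# Hypothetical toy sequence; real analyses use whole chromosomes.
bits = to_binary("ACACACAC")
for r in range(3):
    print(r, pair_correlation(bits, r))
```

    For this strictly alternating toy string the correlator alternates in sign with distance, which is the kind of two-point statistic the entropy estimate is built from.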

  20. DNA Methyltransferase Activity Assays: Advances and Challenges

    PubMed Central

    Poh, Wan Jun; Wee, Cayden Pang Pee; Gao, Zhiqiang

    2016-01-01

    DNA methyltransferases (MTases), a family of enzymes that catalyse the methylation of DNA, have a profound effect on gene regulation. A large body of evidence has indicated that DNA MTase is potentially a predictive biomarker closely associated with genetic disorders and genetic diseases like cancer. Given the attention bestowed onto DNA MTases in molecular biology and medicine, highly sensitive detection of DNA MTase activity is essential in determining gene regulation, epigenetic modification, clinical diagnosis and therapeutics. Conventional techniques such as isotope labelling are effective, but they often require laborious sample preparation, isotope labelling, sophisticated equipment and large amounts of DNA, rendering them unsuitable for uses at point-of-care. Simple, portable, highly sensitive and low-cost assays are urgently needed for DNA MTase activity screening. In most recent technological advances, many alternative DNA MTase activity assays such as fluorescent, electrochemical, colorimetric and chemiluminescent assays have been proposed. In addition, many of them are coupled with nanomaterials and/or enzymes to significantly enhance their sensitivity. Herein we review the progress in the development of DNA MTase activity assays with an emphasis on assay mechanism and performance with some discussion on challenges and perspectives. It is hoped that this article will provide a broad coverage of DNA MTase activity assays and their latest developments and open new perspectives toward the development of DNA MTase activity assays with much improved performance for uses in molecular biology and clinical practice. PMID:26909112

  1. Sequence-dependent Structural Variation in DNA Undergoing Intrahelical Inspection by the DNA glycosylase MutM

    SciTech Connect

    Sung, Rou-Jia; Zhang, Michael; Qi, Yan; Verdine, Gregory L.

    2012-08-31

    MutM, a bacterial DNA-glycosylase, plays a critical role in maintaining genome integrity by catalyzing glycosidic bond cleavage of 8-oxoguanine (oxoG) lesions to initiate base excision DNA repair. The task faced by MutM of locating rare oxoG residues embedded in an overwhelming excess of undamaged bases is especially challenging given the close structural similarity between oxoG and its normal progenitor, guanine (G). MutM actively interrogates the DNA to detect the presence of an intrahelical, fully base-paired oxoG, whereupon the enzyme promotes extrusion of the target nucleobase from the DNA duplex and insertion into the extrahelical active site. Recent structural studies have begun to provide the first glimpse into the protein-DNA interactions that enable MutM to distinguish an intrahelical oxoG from G; however, these initial studies left open the important question of how MutM can recognize oxoG residues embedded in 16 different neighboring sequence contexts (considering only the 5'- and 3'-neighboring base pairs). In this study we set out to understand the manner and extent to which intrahelical lesion recognition varies as a function of the 5'-neighbor. Here we report a comprehensive, systematic structural analysis of the effect of the 5'-neighboring base pair on recognition of an intrahelical oxoG lesion. These structures reveal that MutM imposes the same extrusion-prone ('extrudogenic') backbone conformation on the oxoG lesion irrespective of its 5'-neighbor while leaving the rest of the DNA relatively free to adjust to the particular demands of individual sequences.
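
    The count of 16 neighboring sequence contexts follows from four possible bases on each side of the lesion. A minimal enumeration, using a hypothetical label format:

```python
from itertools import product

BASES = "ACGT"

# One context per (5'-neighbor, 3'-neighbor) pair flanking the oxoG lesion.
# The "X-oxoG-Y" label format is purely illustrative.
contexts = [f"{five}-oxoG-{three}" for five, three in product(BASES, repeat=2)]

print(len(contexts))
print(contexts[:4])
```

    Fixing the 3'-neighbor and varying only the 5'-neighbor, as the study does, leaves four of these contexts to compare.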

  2. cDNA sequencing and expression analysis of Dicentrarchus labrax heme oxygenase-1.

    PubMed

    Prevot-D'Alvise, N; Pierre, S; Gaillard, S; Gouze, E; Gouze, J-N; Aubert, J; Richard, S; Grillasca, J-P

    2008-11-17

    The liver cDNA encoding heme oxygenase-1 (HO-1) was sequenced from European sea bass (Dicentrarchus labrax) (accession no. EF139130). The HO-1 cDNA was 1250 bp in length and the open reading frame encoded 277 amino acid residues. The deduced amino acid sequence of the European sea bass had 75% and 50% identity with the amino acid sequences of tetraodontiformes (Tetraodon nigroviridis and Takifugu rubripes) and human HO-1 proteins, respectively. A short hydrophobic transmembrane domain at the C-terminal region was found, and four histidine residues were highly conserved, including human His25, which is essential for HO catalytic activity. RT-PCR of mRNA from eight different European sea bass tissues revealed that, in a homeostatic state, heme oxygenase-1 was abundant in the spleen and liver but not in the brain.

  3. Optimization of ELFSE DNA Sequencing with EOF Counterflow and Microfluidics

    PubMed Central

    Fahrenkopf, Max A.; Mukherjee, Tamal; Ydstie, B. Erik; Schneider, James W.

    2015-01-01

    We present a non-linear optimization study of different implementations of the DNA electrophoretic method "End-labeled Free-solution Electrophoresis" (ELFSE) in commercial capillary electrophoresis systems and microfluidics to reduce the time required for readout. Here, the effects of electro-osmotic counterflows (EOF) and snap-shot detection are considered to allow for detection of peaks soon after they are electrophoretically resolved. Using drag tags available in micelle form, we identify a design capable of sequencing 600 bases in 2.8 minutes. PMID:25154385

  4. Sequence Features and Transcriptional Stalling within Centromere DNA Promote Establishment of CENP-A Chromatin

    PubMed Central

    Catania, Sandra; Pidoux, Alison L.; Allshire, Robin C.

    2015-01-01

    Centromere sequences are not conserved between species, and there is compelling evidence for epigenetic regulation of centromere identity, with location being dictated by the presence of chromatin containing the histone H3 variant CENP-A. Paradoxically, in most organisms CENP-A chromatin generally occurs on particular sequences. To investigate the contribution of primary DNA sequence to establishment of CENP-A chromatin in vivo, we utilised the fission yeast Schizosaccharomyces pombe. CENP-ACnp1 chromatin is normally assembled on ∼10 kb of central domain DNA within these regional centromeres. We demonstrate that overproduction of S. pombe CENP-ACnp1 bypasses the usual requirement for adjacent heterochromatin in establishing CENP-ACnp1 chromatin, and show that central domain DNA is a preferred substrate for de novo establishment of CENP-ACnp1 chromatin. When multimerised, a 2 kb sub-region can establish CENP-ACnp1 chromatin and form functional centromeres. Randomization of the 2 kb sequence to generate a sequence that maintains AT content and predicted nucleosome positioning is unable to establish CENP-ACnp1 chromatin. These analyses indicate that central domain DNA from fission yeast centromeres contains specific information that promotes CENP-ACnp1 incorporation into chromatin. Numerous transcriptional start sites were detected on the forward and reverse strands within the functional 2 kb sub-region and active promoters were identified. RNAPII is enriched on central domain DNA in wild-type cells, but only low levels of transcripts are detected, consistent with RNAPII stalling during transcription of centromeric DNA. Cells lacking factors involved in restarting transcription—TFIIS and Ubp3—assemble CENP-ACnp1 on central domain DNA when CENP-ACnp1 is at wild-type levels, suggesting that persistent stalling of RNAPII on centromere DNA triggers chromatin remodelling events that deposit CENP-ACnp1. Thus, sequence-encoded features of centromeric DNA create an

  5. Microarrays Made Simple: "DNA Chips" Paper Activity

    ERIC Educational Resources Information Center

    Barnard, Betsy

    2006-01-01

    DNA microarray technology is revolutionizing biological science. DNA microarrays (also called DNA chips) allow simultaneous screening of many genes for changes in expression between different cells. Now researchers can obtain information about genes in days or weeks that used to take months or years. The paper activity described in this article…

  6. Sequence-specific modification of mitochondrial DNA using a chimeric zinc finger methylase

    PubMed Central

    Minczuk, Michal; Papworth, Monika A.; Kolasinska, Paulina; Murphy, Michael P.; Klug, Aaron

    2006-01-01

    We used engineered zinc finger peptides (ZFPs) to bind selectively to predetermined sequences in human mtDNA. Surprisingly, we found that engineered ZFPs cannot be reliably routed to mitochondria by using only conventional mitochondrial targeting sequences. We here show that addition of a nuclear export signal allows zinc finger chimeric enzymes to be imported into human mitochondria. The selective binding of mitochondria-specific ZFPs to mtDNA was exemplified by targeting the T8993G mutation, which causes two mitochondrial diseases, neurogenic muscle weakness, ataxia, and retinitis pigmentosa (NARP) and also maternally inherited Leigh's syndrome. To develop a system that allows the monitoring of site-specific alteration of mtDNA we combined a ZFP with the easily assayed DNA-modifying activity of hDNMT3a methylase. Expression of the mutation-specific chimeric methylase resulted in the selective methylation of cytosines adjacent to the mutation site. This is a proof of principle that it is possible to target and alter mtDNA in a sequence-specific manner by using zinc finger technology. PMID:17170133

  7. Differential repair of DNA damage in specific nucleotide sequences in monkey cells.

    PubMed Central

    Leadon, S A

    1986-01-01

    An immunological method was developed that isolates DNA fragments containing bromouracil in repair patches from unrepaired DNA using a monoclonal antibody that recognizes bromouracil. Cultured monkey cells were exposed to either UV light or the activated carcinogen aflatoxin B1 (AFB1), and excision repair of damage in DNA fragments containing the integrated and transcribed E. coli gpt gene was compared to that in the genome overall. A more rapid repair of both UV and AFB1 damage was observed in the DNA fragments containing the E. coli gpt genes. The more efficient repair of UV damage was not due to a difference in the initial level of pyrimidine dimers as determined with a specific UV endonuclease. Consistent with previous observations using different methodology, repair of UV damage in the alpha sequences was found to occur at the same rate as that in the genome overall, while repair of AFB1 damage was deficient in alpha DNA. The preferential repair of damage in the gpt gene may be related to the functional state of the sequence and/or to alterations produced in the chromatin conformation by the integration of plasmid sequences carrying the gene. PMID:3786142

  8. Nonlinear analysis of correlations in Alu repeat sequences in DNA

    NASA Astrophysics Data System (ADS)

    Xiao, Yi; Huang, Yanzhao; Li, Mingfeng; Xu, Ruizhen; Xiao, Saifeng

    2003-12-01

    We report on a nonlinear analysis of deterministic structures in Alu repeats, one of the richest repetitive DNA sequences in the human genome. Alu repeats contain the recognition sites for the restriction endonuclease AluI, which is what gives them their name. Using the nonlinear prediction method developed in chaos theory, we find that all Alu repeats have novel deterministic structures and show strong nonlinear correlations that are absent from exon and intron sequences. Furthermore, the deterministic structures of Alus of younger subfamilies show panlike shapes. As young Alus can be seen as mutation free copies from the “master genes,” it may be suggested that the deterministic structures of the older subfamilies are results of an evolution from a “panlike” structure to a more diffuse correlation pattern due to mutation.

  9. Optimizing Data Intensive GPGPU Computations for DNA Sequence Alignment

    PubMed Central

    Trapnell, Cole; Schatz, Michael C.

    2009-01-01

    MUMmerGPU uses highly-parallel commodity graphics processing units (GPU) to accelerate the data-intensive computation of aligning next generation DNA sequence data to a reference sequence for use in diverse applications such as disease genotyping and personal genomics. MUMmerGPU 2.0 features a new stackless depth-first-search print kernel and is 13× faster than the serial CPU version of the alignment code and nearly 4× faster in total computation time than MUMmerGPU 1.0. We exhaustively examined 128 GPU data layout configurations to improve register footprint and running time and conclude higher occupancy has greater impact than reduced latency. MUMmerGPU is available open-source at http://mummergpu.sourceforge.net. PMID:20161021

  10. Photocatalytic probing of DNA sequence by using TiO₂/dopamine-DNA triads.

    SciTech Connect

    Liu, J.; de la Garza, L.; Zhang, L.; Dimitrijevic, N. M.; Zuo, X.; Tiede, D. M.; Rajh, T.

    2007-10-15

    A method to control charge transfer reaction in DNA using hybrid nanometer-sized TiO₂ nanoparticles was developed. In this system extended charge separation reflects the sequence of DNA and was measured using metallic silver deposition or by photocurrent response. Light-induced extended charge separation in these systems was found to be dependent on the DNA-bridge length and sequence. The yield of photocatalytic deposition of silver was studied in systems having GG accepting sites imbedded in AT runs at varying distances from the TiO₂ nanoparticle surface. Weak distance dependence of charge separation indicative of a hole hopping through mediating adenine (A) sites was found. The quantum yield of silver deposition in the system having a GG accepting site placed 8.5 Å from the nanoparticle surface was found to be Φ = 0.70 (70%) and Φ = 0.56 (56%) for (A)n and (AT)n/2 bridge, respectively. Hole injection to GG trapping sites as far as 70 Å from a nanoparticle surface in the absence of G hopping sites was measured. Introduction of G hopping sites increased the efficiency of hole injection. The efficiency of photocatalytic deposition of metallic silver was found to be sensitive to the presence of a single nucleobase mismatch in the DNA sequence.

  11. DNA topology, not DNA sequence, is a critical determinant for Drosophila ORC–DNA binding

    PubMed Central

    Remus, Dirk; Beall, Eileen L; Botchan, Michael R

    2004-01-01

    Drosophila origin recognition complex (ORC) localizes to defined positions on chromosomes, and in follicle cells the chorion gene amplification loci are well-studied examples. However, the mechanism of specific localization is not known. We have studied the DNA binding of DmORC to investigate the cis-requirements for DmORC:DNA interaction. DmORC displays at best six-fold differences in the relative affinities to DNA from the third chorion locus and to random fragments in vitro, and chemical probing and DNase1 protection experiments did not identify a discrete binding site for DmORC on any of these fragments. The intrinsic DNA-binding specificity of DmORC is therefore insufficient to target DmORC to origins of replication in vivo. However, the topological state of the DNA significantly influences the affinity of DmORC to DNA. We found that the affinity of DmORC for negatively supercoiled DNA is about 30-fold higher than for either relaxed or linear DNA. These data provide biochemical evidence for the notion that origin specification in metazoa likely involves mechanisms other than simple replicator–initiator interactions and that in vivo other proteins must determine ORC's localization. PMID:14765124

  12. Functional analysis of the individual enhancer core sequences of polyomavirus: Cell-specific uncoupling of DNA replication from transcription

    SciTech Connect

    Campbell, B.A.; Villarreal, L.P.

    1988-05-01

    Polyomavirus (Py) enhancer core elements were compared for their ability to activate Py early transcription and DNA replication in mouse 3T6 cells, lymphoid cell lines, and undifferentiated embryonal carcinoma cells. By examining the pattern of genetic change in a number of cell-specific Py variants, the authors identified subenhancer sequences that may be functionally important for virus replication. Four such distinct enhancer consensus sequences were synthesized and designated as the A core (homologous with adenovirus 5 E1A enhancer), B core (homologous to the simian virus 40 A enhancer core), C core (containing an inverted repeat within the Py B enhancer), and BPV core (homologous to the bovine papillomavirus enhancer). When used to replace the complete Py B enhancer, single copies of all but the BPV element were able to fully activate Py DNA replication after transfection, but this activation was usually cell type specific. In the PCC4 embryonal carcinoma cells, only the A-core sequence was able to activate transcription and DNA replication. The BPV core sequence containing the Py F441 point change was unable to activate DNA replication in the F9 embryonal carcinoma or any other cell line. No single insertion element was dominant nor did these elements display the wild-type enhancer pattern of cell-specific activation of DNA replication. In addition, differential effects were often observed on the activation of transcription versus DNA replication.

  13. Binding sequences for RdgB, a DNA damage-responsive transcriptional activator, and temperature-dependent expression of bacteriocin and pectin lyase genes in Pectobacterium carotovorum subsp. carotovorum.

    PubMed

    Yamada, Kazuteru; Kaneko, Jun; Kamio, Yoshiyuki; Itoh, Yoshifumi

    2008-10-01

    Pectobacterium carotovorum subsp. carotovorum strain Er simultaneously produces the phage tail-like bacteriocin carotovoricin (Ctv) and pectin lyase (Pnl) in response to DNA-damaging agents. The regulatory protein RdgB of the Mor/C family of proteins activates transcription of pnl through binding to the promoter. However, the optimal temperature for the synthesis of Ctv (23 degrees C) differs from that for synthesis of Pnl (30 degrees C), raising the question of whether RdgB directly activates ctv transcription. Here we report that RdgB directly regulates Ctv synthesis. Gel mobility shift assays demonstrated RdgB binding to the P(0), P(1), and P(2) promoters of the ctv operons, and DNase I footprinting determined RdgB-binding sequences (RdgB boxes) on these and on the pnl promoters. The RdgB box of the pnl promoter included a perfect 7-bp inverted repeat with high binding affinity to the regulator (K(d) [dissociation constant] = 150 nM). In contrast, RdgB boxes of the ctv promoters contained an imperfect inverted repeat with two or three mismatches that consequently reduced binding affinity (K(d) = 250 to 350 nM). Transcription of the rdgB and ctv genes was about doubled at 23 degrees C compared with that at 30 degrees C. In contrast, the amount of pnl transcription tripled at 30 degrees C. Thus, the inverse synthesis of Ctv and Pnl as a function of temperature is apparently controlled at the transcriptional level, and reduced rdgB expression at 30 degrees C obviously affected transcription from the ctv promoters with low-affinity RdgB boxes. Pathogenicity toward potato tubers was reduced in an rdgB knockout mutant, suggesting that the RdgAB system contributes to the pathogenicity of this bacterium, probably by activating pnl expression.

  14. Indirect readout of DNA sequence by p22 repressor: roles of DNA and protein functional groups in modulating DNA conformation.

    PubMed

    Harris, Lydia-Ann; Watkins, Derrick; Williams, Loren Dean; Koudelka, Gerald B

    2013-01-09

    The repressor of bacteriophage P22 (P22R) discriminates between its various DNA binding sites by sensing the identity of non-contacted base pairs at the center of its binding site. The "indirect readout" of these non-contacted bases is apparently based on DNA's sequence-dependent conformational preferences. The structures of P22R-DNA complexes indicate that the non-contacted base pairs at the center of the binding site are in the B' state. This finding suggests that indirect readout and therefore binding site discrimination depend on P22R's ability to either sense and/or impose the B' state on the non-contacted bases of its binding sites. We show here that the affinity of binding sites for P22R depends on the tendency of the central bases to assume the B'-DNA state. Furthermore, we identify functional groups in the minor groove of the non-contacted bases as the essential modulators of indirect readout by P22R. In P22R-DNA complexes, the negatively charged E44 and E48 residues are provocatively positioned near the negatively charged DNA phosphates of the non-contacted nucleotides. The close proximity of the negatively charged groups on protein and DNA suggests that electrostatics may play a key role in the indirect readout process. Changing either of two negatively charged residues to uncharged residues eliminates the ability of P22R to impose structural changes on DNA and to recognize non-contacted base sequence. These findings suggest that these negatively charged amino acids function to force the P22R-bound DNA into the B' state and therefore play a key role in indirect readout by P22R.

  15. Automated hybridization/imaging device for fluorescent multiplex DNA sequencing

    DOEpatents

    Weiss, Robert B.; Kimball, Alvin W.; Gesteland, Raymond F.; Ferguson, F. Mark; Dunn, Diane M.; Di Sera, Leonard J.; Cherry, Joshua L.

    1995-01-01

    A method is disclosed for automated multiplex sequencing of DNA with an integrated automated imaging hybridization chamber system. This system comprises an hybridization chamber device for mounting a membrane containing size-fractionated multiplex sequencing reaction products, apparatus for fluid delivery to the chamber device, imaging apparatus for light delivery to the membrane and image recording of fluorescence emanating from the membrane while in the chamber device, and programmable controller apparatus for controlling operation of the system. The multiplex reaction products are hybridized with a probe, then an enzyme (such as alkaline phosphatase) is bound to a binding moiety on the probe, and a fluorogenic substrate (such as a benzothiazole derivative) is introduced into the chamber device by the fluid delivery apparatus. The enzyme converts the fluorogenic substrate into a fluorescent product which, when illuminated in the chamber device with a beam of light from the imaging apparatus, excites fluorescence of the fluorescent product to produce a pattern of hybridization. The pattern of hybridization is imaged by a CCD camera component of the imaging apparatus to obtain a series of digital signals. These signals are converted by the controller apparatus into a string of nucleotides corresponding to the nucleotide sequence an automated sequence reader. The method and apparatus are also applicable to other membrane-based applications such as colony and plaque hybridization and Southern, Northern, and Western blots.

  16. Sequence dependence of isothermal DNA amplification via EXPAR

    PubMed Central

    Qian, Jifeng; Ferguson, Tanya M.; Shinde, Deepali N.; Ramírez-Borrero, Alissa J.; Hintze, Arend; Adami, Christoph; Niemz, Angelika

    2012-01-01

    Isothermal nucleic acid amplification is becoming increasingly important for molecular diagnostics. Therefore, new computational tools are needed to facilitate assay design. In the isothermal EXPonential Amplification Reaction (EXPAR), template sequences with similar thermodynamic characteristics perform very differently. To understand what causes this variability, we characterized the performance of 384 template sequences, and used this data to develop two computational methods to predict EXPAR template performance based on sequence: a position weight matrix approach with support vector machine classifier, and RELIEF attribute evaluation with Naïve Bayes classification. The methods identified well and poorly performing EXPAR templates with 67–70% sensitivity and 77–80% specificity. We combined these methods into a computational tool that can accelerate new assay design by ruling out likely poor performers. Furthermore, our data suggest that variability in template performance is linked to specific sequence motifs. Cytidine, a pyrimidine base, is over-represented in certain positions of well-performing templates. Guanosine and adenosine, both purine bases, are over-represented in similar regions of poorly performing templates, frequently as GA or AG dimers. Since polymerases have a higher affinity for purine oligonucleotides, polymerase binding to GA-rich regions of a single-stranded DNA template may promote non-specific amplification in EXPAR and other nucleic acid amplification reactions. PMID:22416064
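
    The position weight matrix idea mentioned in the abstract above can be illustrated with a short sketch. It assumes a log-odds matrix built from equal-length sequences with a pseudocount and a uniform background; the example templates, the function names `build_pwm` and `score`, and the pseudocount value are hypothetical and are not taken from the authors' data or tool.

```python
import math

BASES = "ACGT"

def build_pwm(sequences, pseudocount=1.0):
    """Build a log-odds position weight matrix (uniform 0.25 background)."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in BASES}
        for seq in sequences:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / 0.25) for b in BASES})
    return pwm

def score(pwm, sequence):
    """Sum the per-position log-odds scores for one candidate sequence."""
    if len(sequence) != len(pwm):
        raise ValueError("sequence length must match the matrix length")
    return sum(column[base] for column, base in zip(pwm, sequence))

# Hypothetical "well-performing" templates, for illustration only.
good_templates = ["CCTA", "CCTG", "CATA"]
pwm = build_pwm(good_templates)

print(score(pwm, "CCTA"))  # resembles the training set, so it scores higher
print(score(pwm, "GAGA"))  # purine-rich toy sequence, so it scores lower
```

    In a classifier such as the one the abstract describes, scores like these would be only one input feature; the sketch stops at the scoring step.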

  17. Automated hybridization/imaging device for fluorescent multiplex DNA sequencing

    DOEpatents

    Weiss, R.B.; Kimball, A.W.; Gesteland, R.F.; Ferguson, F.M.; Dunn, D.M.; Di Sera, L.J.; Cherry, J.L.

    1995-11-28

    A method is disclosed for automated multiplex sequencing of DNA with an integrated automated imaging hybridization chamber system. This system comprises an hybridization chamber device for mounting a membrane containing size-fractionated multiplex sequencing reaction products, apparatus for fluid delivery to the chamber device, imaging apparatus for light delivery to the membrane and image recording of fluorescence emanating from the membrane while in the chamber device, and programmable controller apparatus for controlling operation of the system. The multiplex reaction products are hybridized with a probe, the enzyme (such as alkaline phosphatase) is bound to a binding moiety on the probe, and a fluorogenic substrate (such as a benzothiazole derivative) is introduced into the chamber device by the fluid delivery apparatus. The enzyme converts the fluorogenic substrate into a fluorescent product which, when illuminated in the chamber device with a beam of light from the imaging apparatus, excites fluorescence of the fluorescent product to produce a pattern of hybridization. The pattern of hybridization is imaged by a CCD camera component of the imaging apparatus to obtain a series of digital signals. These signals are converted by the controller apparatus into a string of nucleotides corresponding to the nucleotide sequence an automated sequence reader. The method and apparatus are also applicable to other membrane-based applications such as colony and plaque hybridization and Southern, Northern, and Western blots. 9 figs.

  18. In the TTF-1 homeodomain the contribution of several amino acids to DNA recognition depends on the bound sequence.

    PubMed Central

    Fabbro, D; Tell, G; Leonardi, A; Pellizzari, L; Pucillo, C; Lonigro, R; Formisano, S; Damante, G

    1996-01-01

    The thyroid transcription factor-1 homeodomain (TTF-1HD) shows a peculiar DNA binding specificity, preferentially recognizing sequences containing the 5'-CAAG-3' core motif. Most other homeodomains instead recognize sites containing the 5'-TAAT-3' core motif. Here, we show that TTF-1HD efficiently recognizes another sequence, called D1, devoid of the 5'-CAAG-3' core motif. Different experimental approaches indicate that TTF-1HD contacts the D1 sequence in a manner which is different to that used to interact with sequences containing the 5'-CAAG-3' core motif. The binding activities that mutants of TTF-1HD display with the D1 sequence or with the sequence containing the 5'-CAAG-3' core motif indicate that the role of several DNA-contacting amino acids is different. In particular, during recognition of the D1 sequence, backbone-interacting amino acids not relevant in binding to sequences containing the 5'-CAAG-3' core motif play an important role. In the TTF-1HD, therefore, the contribution of several amino acids to DNA recognition depends on the bound sequence. These data indicate that although a common bonding network exists in all of the HD/DNA complexes, peculiarities important for DNA recognition may occur in single cases. PMID:8811078

  19. Sequence Depth, Not PCR Replication, Improves Ecological Inference from Next Generation DNA Sequencing

    PubMed Central

    Smith, Dylan P.; Peay, Kabir G.

    2014-01-01

    Recent advances in molecular approaches and DNA sequencing have greatly advanced the field of ecology and allowed for the study of complex communities in unprecedented detail. Next generation sequencing (NGS) can reveal powerful insights into the diversity, composition, and dynamics of cryptic organisms, but results may be sensitive to a number of technical factors, including molecular practices used to generate amplicons, sequencing technology, and data processing. Despite the popularity of some techniques over others, explicit tests of the relative benefits they convey in molecular ecology studies remain scarce. Here we tested the effects of PCR replication, sequencing depth, and sequencing platform on ecological inference drawn from environmental samples of soil fungi. We sequenced replicates of three soil samples taken from pine biomes in North America represented by pools of either one, two, four, eight, or sixteen PCR replicates with both 454 pyrosequencing and Illumina MiSeq. Increasing the number of pooled PCR replicates had no detectable effect on measures of α- and β-diversity. Pseudo-β-diversity – which we define as dissimilarity between re-sequenced replicates of the same sample – decreased markedly with increasing sampling depth. The total richness recovered with Illumina was significantly higher than with 454, but measures of α- and β-diversity between a larger set of fungal samples sequenced on both platforms were highly correlated. Our results suggest that molecular ecology studies will benefit more from investing in robust sequencing technologies than from replicating PCRs. This study also demonstrates the potential for continuous integration of older datasets with newer technology. PMID:24587293
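
    The abstract above defines pseudo-β-diversity as the dissimilarity between re-sequenced replicates of the same sample but does not name the metric. As a minimal sketch, assuming Bray-Curtis dissimilarity on taxon read counts (a common choice, though not confirmed by the abstract), it could be computed as below. The replicate counts and taxon labels are invented for illustration.

```python
def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two {taxon: read_count} mappings."""
    taxa = set(counts_a) | set(counts_b)
    # Reads shared by both replicates, taxon by taxon.
    shared = sum(min(counts_a.get(t, 0), counts_b.get(t, 0)) for t in taxa)
    total = sum(counts_a.values()) + sum(counts_b.values())
    if total == 0:
        raise ValueError("both samples are empty")
    return 1.0 - 2.0 * shared / total

# Hypothetical read counts for two sequencing runs of the same soil sample.
replicate_1 = {"otu1": 10, "otu2": 5}
replicate_2 = {"otu1": 8, "otu2": 5, "otu4": 2}

print(bray_curtis(replicate_1, replicate_2))
```

    Identical replicates give 0 and replicates with no taxa in common give 1, so lower values correspond to more reproducible sequencing of the same sample.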

  20. Noncontinuously binding loop-out primers for avoiding problematic DNA sequences in PCR and Sanger sequencing.

    PubMed

    Sumner, Kelli; Swensen, Jeffrey J; Procter, Melinda; Jama, Mohamed; Wooderchak-Donahue, Whitney; Lewis, Tracey; Fong, Michael; Hubley, Lindsey; Schwarz, Monica; Ha, Youna; Paul, Eleri; Brulotte, Benjamin; Lyon, Elaine; Bayrak-Toydemir, Pinar; Mao, Rong; Pont-Kingdon, Genevieve; Best, D Hunter

    2014-09-01

    We present a method in which noncontinuously binding (loop-out) primers are used to exclude regions of DNA that typically interfere with PCR amplification and/or analysis by Sanger sequencing. Several scenarios were tested using this design principle, including M13-tagged PCR primers, non-M13-tagged PCR primers, and sequencing primers. With this technique, a single oligonucleotide is designed in two segments that flank, but do not include, a short region of problematic DNA sequence. During PCR amplification or sequencing, the problematic region is looped-out from the primer binding site, where it does not interfere with the reaction. Using this method, we successfully excluded regions of up to 46 nucleotides. Loop-out primers were longer than traditional primers (27 to 40 nucleotides) and had higher melting temperatures. This method allows the use of a standardized PCR protocol throughout an assay, keeps the number of PCRs to a minimum, reduces the chance for laboratory error, and, above all, does not interrupt the clinical laboratory workflow.
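
    The two-segment design described in the abstract above can be sketched in a few lines. This is only an illustration of joining the flanking segments while skipping the problematic region; the function name `loop_out_primer`, the coordinates, and the toy target sequence are hypothetical, and real primer design would also need melting temperature and specificity checks that are not shown.

```python
def loop_out_primer(target, left_start, loop_start, loop_end, right_end):
    """Join the two segments flanking target[loop_start:loop_end].

    Coordinates are 0-based and half-open. Returns the primer sequence and
    the number of nucleotides looped out of the binding site.
    """
    if not (0 <= left_start < loop_start < loop_end < right_end <= len(target)):
        raise ValueError("coordinates must be ordered and inside the target")
    left_segment = target[left_start:loop_start]
    right_segment = target[loop_end:right_end]
    return left_segment + right_segment, loop_end - loop_start

# Toy target: 10 nt flank, an 8 nt "problematic" run, then another 10 nt flank.
target = "ACGTACGTAC" + "GGGGGGGG" + "TTGCATGCAA"
primer, looped_out = loop_out_primer(target, 0, 10, 18, 28)

print(primer)      # the two flanks joined, without the G run
print(looped_out)  # length of the excluded region
```

    The sketch works on the strand whose sequence the primer copies; a primer for the opposite strand would additionally need to be reverse complemented.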

  1. DNA nuclease activity of Rev-coupled transition metal chelates.

    PubMed

    Joyner, Jeff C; Keuper, Kevin D; Cowan, J A

    2012-06-07

    paints a clearer picture of the factors governing DNA nuclease activity by redox active M-chelates than was previously possible. The results demonstrate enhancement of DNA cleavage by use of a targeting sequence, but also clearly underscore that significant orientational factors are required for optimal reactivity at the metal center. Moreover, the studies confirm high selectivity for the target HIV RRE RNA at the most likely dosage concentrations, lending further support to the feasibility of designing and applying targeted catalytic metallodrugs.

  2. Sequence preferences of DNA interstrand crosslinking agents: quantitation of interstrand crosslink locations in DNA duplex fragments containing multiple crosslinkable sites.

    PubMed Central

    Millard, J T; Weidner, M F; Kirchner, J J; Ribeiro, S; Hopkins, P B

    1991-01-01

    A general approach to the quantitative study of the sequence specificity of DNA interstrand crosslinking agents in synthetic duplex DNA fragments is described. In the first step, a DNA fragment previously treated with an interstrand crosslinking agent is subjected to denaturing PAGE. Not only does this distinguish crosslinked from native or monoadducted DNA, it is shown herein that isomeric crosslinked DNAs differing in position of the crosslink can in some cases be separated. In the second stage, the now fractionated crosslinked DNAs isolated from denaturing PAGE are subjected to fragmentation using iron(II)/EDTA. For those fractions which are structurally homogeneous, analysis of the resulting fragment distribution has previously been shown to reveal the crosslink position at nucleotide resolution. It is shown herein that in fractions which are structurally heterogeneous due to differences in position of crosslink, this analysis quantifies the relative extent of crosslinking at distinct sites. Using this method it is shown that reductively activated mitomycin C crosslinks the duplex sequences 5'-GCGC and 5'-TCGA with 3 +/- 1:1 relative efficiency. Images PMID:1903204

  3. Phylogenomics of Phrynosomatid Lizards: Conflicting Signals from Sequence Capture versus Restriction Site Associated DNA Sequencing

    PubMed Central

    Leaché, Adam D.; Chavez, Andreas S.; Jones, Leonard N.; Grummer, Jared A.; Gottscho, Andrew D.; Linkem, Charles W.

    2015-01-01

    Sequence capture and restriction site associated DNA sequencing (RADseq) are popular methods for obtaining large numbers of loci for phylogenetic analysis. These methods are typically used to collect data at different evolutionary timescales; sequence capture is primarily used for obtaining conserved loci, whereas RADseq is designed for discovering single nucleotide polymorphisms (SNPs) suitable for population genetic or phylogeographic analyses. Phylogenetic questions that span both “recent” and “deep” timescales could benefit from either type of data, but studies that directly compare the two approaches are lacking. We compared phylogenies estimated from sequence capture and double digest RADseq (ddRADseq) data for North American phrynosomatid lizards, a species-rich and diverse group containing nine genera that began diversifying approximately 55 Ma. Sequence capture resulted in 584 loci that provided a consistent and strong phylogeny using concatenation and species tree inference. However, the phylogeny estimated from the ddRADseq data was sensitive to the bioinformatics steps used for determining homology, detecting paralogs, and filtering missing data. The topological conflicts among the SNP trees were not restricted to any particular timescale, but instead were associated with short internal branches. Species tree analysis of the largest SNP assembly, which also included the most missing data, supported a topology that matched the sequence capture tree. This preferred phylogeny provides strong support for the paraphyly of the earless lizard genera Holbrookia and Cophosaurus, suggesting that the earless morphology either evolved twice or evolved once and was subsequently lost in Callisaurus. PMID:25663487

  4. Population genetics and molecular evolution of DNA sequences in transposable elements. I. A simulation framework.

    PubMed

    Kijima, T E; Innan, Hideki

    2013-11-01

    A population genetic simulation framework is developed to understand the behavior and molecular evolution of DNA sequences of transposable elements. Our model incorporates random transposition and excision of transposable element (TE) copies, two modes of selection against TEs, and degeneration of transpositional activity by point mutations. We first investigated the relationships between the behavior of the copy number of TEs and these parameters. Our results show that when selection is weak, the genome can maintain a relatively large number of TEs, but most of them are less active. In contrast, with strong selection, the genome can maintain only a limited number of TEs but the proportion of active copies is large. In such a case, there could be substantial fluctuations of the copy number over generations. We also explored how DNA sequences of TEs evolve through the simulations. In general, active copies form clusters around the original sequence, while less active copies have long branches specific to themselves, exhibiting a star-shaped phylogeny. It is demonstrated that the phylogeny of TE sequences could be informative to understand the dynamics of TE evolution.

  5. Hypervariable minisatellite DNA sequences in the Indian peafowl Pavo cristatus.

    PubMed

    Hanotte, O; Burke, T; Armour, J A; Jeffreys, A J

    1991-04-01

    We report here for the first time the large-scale isolation of hypervariable minisatellite DNA sequences from a non-human species, the Indian peafowl (Pavo cristatus). A size-selected genomic DNA fraction, rich in hypervariable minisatellites, was cloned into Charomid 9-36. This library was screened using two multilocus hypervariable probes, 33.6 and 33.15 and also, in a "probe-walking" approach, with five of the peafowl minisatellites initially isolated. Forty-eight positively hybridizing clones were characterized and found to originate from 30 different loci, 18 of which were polymorphic. Five of these variable minisatellite loci were studied further. They all showed Mendelian inheritance. The heterozygosities of these loci were relatively low (range 22-78%) in comparison with those of previously cloned human loci, as expected in view of inbreeding in our semicaptive study population. No new length allele mutations were observed in families and the mean mutation rate per locus is low (less than 0.004, 95% confidence maximum). These loci were also investigated by cross-species hybridization in related taxa. The ability of the probes to detect hypervariable sequences in other species within the same avian family was found to vary, from those probes that are species-specific to those that are apparently general to the family. We also illustrate the potential usefulness of these probes for paternity analysis in a study of sexual selection, and discuss the general application of specific hypervariable probes in behavioral and evolutionary studies.

  6. A DNA sequence analysis program for the Apple Macintosh.

    PubMed Central

    Gross, R H

    1986-01-01

    This paper describes a new set of programs for analyzing DNA sequences using the Apple Macintosh computer, a computer ideally suited for this kind of analysis. Because of the Macintosh interface and the availability of high quality software-only speech synthesis, these programs are truly easy to use. Instead of typing in commands, the user directs the program by making selections with the mouse, thereby eliminating most typographical and syntax errors. Output options are selected by "pressing buttons" and then clicking "OK" with the mouse. DNA sequences are confirmed by having the program speak them. The high resolution graphics on the Macintosh not only allow for explanatory diagrams to be used to aid in deciding on input parameters, but can be used to produce slides for presentations and figures for papers. Because of the clipboard and the ability of the Macintosh to readily share data among different applications, data can be saved for use directly in word processing documents (e.g. manuscripts). PMID:3003685

  7. Complete genome sequence of chloroplast DNA (cpDNA) of Chlorella sorokiniana.

    PubMed

    Orsini, Massimiliano; Cusano, Roberto; Costelli, Cristina; Malavasi, Veronica; Concas, Alessandro; Angius, Andrea; Cao, Giacomo

    2016-01-01

    The complete chloroplast genome sequence of Chlorella sorokiniana strain (SAG 111-8 k) is presented in this study. The genome consists of circular chromosomes of 109,811 bp, which encode a total of 109 genes, including 74 proteins, 3 rRNAs and 31 tRNAs. Moreover, introns are not detected and all genes are present in single copy. The overall AT content of the C. sorokiniana cpDNA is 65.9%, that of the coding sequence is 59.1%, and a large inverted repeat (IR) is not observed.
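
    For reference, an AT content figure like the one quoted above is simply the percentage of A and T bases in a sequence. The sketch below shows the calculation on a toy string; the function name `at_content` and the example sequence are hypothetical and are unrelated to the C. sorokiniana genome itself.

```python
def at_content(sequence):
    """Return the percentage of A and T bases in a DNA sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    at_bases = seq.count("A") + seq.count("T")
    return 100.0 * at_bases / len(seq)

toy_sequence = "ATGCATTA"  # 6 of 8 bases are A or T
print(at_content(toy_sequence))
```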

  8. Sequence and Temperature Influence on Kinetics of DNA Strand Displacement at Gold Electrode Surfaces.

    PubMed

    Biala, Katarzyna; Sedova, Ada; Flechsig, Gerd-Uwe

    2015-09-16

    Understanding the complex contributions of the surface environment to tethered nucleic acid sensing experiments has proven challenging, yet it is essential for the interpretation and calibration of indispensable methods such as microarrays. We investigate the effects of DNA sequence and solution temperature gradients on the kinetics of strand displacement at heated gold wire electrodes, and at gold disc electrodes in a heated solution. Addition of a terminal double mismatch (toehold) provides a reduction in strand displacement energy barriers sufficient to probe the secondary mechanisms involved in the hybridization process. Of four different DNA capture probe sequences (relevant for the identification of genetically modified maize MON810), all but one revealed a high activation energy of up to 200 kJ/mol during hybridization, which we attribute to displacement of protective strands by capture probes. Protective strands contain 4 to 5 mismatches to ease their displacement by the surface-confined probes at the gold electrodes. A low activation energy (30 kJ/mol) was observed for the sequence whose protective strand contained a toehold and one central mismatch; its kinetic curves displayed significantly different shapes, and we observed a reduced maximum signal intensity as compared to other sequences. These findings point to potential sequence-related contributions to oligonucleotide diffusion influencing kinetics. Additionally, for all sequences studied with heated wire electrodes, we observed a 23 K lower optimal hybridization temperature in comparison with disc electrodes in heated solution, and greatly reduced voltammetric signals after taking into account electrode surface area. We propose that thermodiffusion due to temperature gradients may influence both hybridization and strand displacement kinetics at heated microelectrodes, an explanation supported by computational fluid dynamics. DNA assays with surface-confined capture probes and temperature

  9. Partition enrichment of nucleotide sequences (PINS)--a generally applicable, sequence based method for enrichment of complex DNA samples.

    PubMed

    Kvist, Thomas; Sondt-Marcussen, Line; Mikkelsen, Marie Just

    2014-01-01

    The dwindling cost of DNA sequencing is driving transformative changes in various biological disciplines including medicine, thus resulting in an increased need for routine sequencing. Preparation of samples suitable for sequencing is the starting point of any practical application, but enrichment of the target sequence over background DNA is often laborious and of limited sensitivity thereby limiting the usefulness of sequencing. The present paper describes a new method, Probability directed Isolation of Nucleic acid Sequences (PINS), for enrichment of DNA, enabling the sequencing of a large DNA region surrounding a small known sequence. A 275,000-fold enrichment of a target DNA sample containing integrated human papilloma virus is demonstrated. Specifically, a sample containing 0.0028 copies of target sequence per ng of total DNA was enriched to 786 copies per ng. The starting concentration of 0.0028 target copies per ng corresponds to one copy of target in a background of 100,000 complete human genomes. The enriched sample was subsequently amplified using rapid genome walking and the resulting DNA sequence revealed not only the sequence of the truncated virus, but also 1026 base pairs 5' and 50 base pairs 3' to the integration site in chromosome 8. The demonstrated enrichment method is extremely sensitive and selective and requires only minimal knowledge of the sequence to be enriched and will therefore enable sequencing where the target concentration relative to background is too low to allow the use of other sample preparation methods or where significant parts of the target sequence are unknown.
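
    The enrichment figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the mass of about 3.3 pg per haploid human genome is an assumption that does not come from the abstract. With the quoted concentrations the ratio comes out near 2.8 x 10^5, the same order as the reported 275,000-fold enrichment; the small difference may simply reflect rounding of the reported values.

```python
def fold_enrichment(copies_before_per_ng, copies_after_per_ng):
    """Ratio of target concentration after enrichment to that before."""
    return copies_after_per_ng / copies_before_per_ng


def genomes_per_target_copy(copies_per_ng, haploid_genome_pg=3.3):
    """Background genomes per single target copy.

    haploid_genome_pg is an assumed mass (in picograms) of one haploid
    human genome; it is not stated in the abstract.
    """
    ng_per_copy = 1.0 / copies_per_ng           # ng of total DNA holding one copy
    return ng_per_copy * 1000.0 / haploid_genome_pg  # 1 ng = 1000 pg


if __name__ == "__main__":
    print(f"fold enrichment: {fold_enrichment(0.0028, 786):,.0f}")
    print(f"genomes per copy: {genomes_per_target_copy(0.0028):,.0f}")
```

    Under the stated assumption, the second value lands close to the "one copy in 100,000 genomes" background described above.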

  10. Mylodon darwinii DNA sequences from ancient fecal hair shafts.

    PubMed

    Clack, Andrew A; MacPhee, Ross D E; Poinar, Hendrik N

    2012-01-20

    Preserved hair has been increasingly used as an ancient DNA source in high throughput sequencing endeavors, and it may actually offer several advantages compared to more traditional ancient DNA substrates like bone. However, cold environments have yielded the most informative ancient hair specimens, while its preservation, and thus utility, in temperate regions is not well documented. Coprolites could represent a previously underutilized preservation substrate for hairs, which, if present therein, represent macroscopic packages of specific cells that are relatively simple to separate, clean and process. In this pilot study, we report amplicons 147-152 base pairs in length (w/primers) from hair shafts preserved in a south Chilean coprolite attributed to Darwin's extinct ground sloth, Mylodon darwinii. Our results suggest that hairs preserved in coprolites from temperate cave environments can serve as an effective source of ancient DNA. This bodes well for potential molecular-based population and phylogeographic studies on sloths, several species of which have been understudied despite leaving numerous coprolites in caves across the Americas.

  11. Repeat sequences from complex ds DNA viruses can be used as minisatellite probes for DNA fingerprinting.

    PubMed

    Crawford, A M; Buchanan, F C; Fraser, K M; Robinson, A J; Hill, D F

    1991-01-01

    In a search for new fingerprinting probes for use with sheep, repeat sequences derived from five poxviruses, an iridovirus and a baculovirus were screened against DNA from sheep pedigrees. Probes constructed from portions of the parapox viruses, orf virus and papular stomatitis virus and the baculovirus from the alfalfa looper, Autographa californica, nuclear polyhedrosis virus all gave fingerprint patterns. Probes from three other poxviruses and an iridovirus did not give useful banding patterns.

  12. Potential use of DNA barcoding for the identification of Salvia based on cpDNA and nrDNA sequences.

    PubMed

    Wang, Meng; Zhao, Hong-Xia; Wang, Long; Wang, Tao; Yang, Rui-Wu; Wang, Xiao-Li; Zhou, Yong-Hong; Ding, Chun-Bang; Zhang, Li

    2013-10-10

    An effective DNA marker for authenticating the genus Salvia was screened using seven DNA regions (rbcL, matK, trnL-F, and psbA-trnH from the chloroplast genome, and ITS, ITS1, and ITS2 from the nuclear genome) and three combinations (rbcL+matK, psbA-trnH+ITS1, and trnL-F+ITS1). The present study collected 232 sequences from 27 Salvia species through DNA sequencing and 77 sequences within the same taxa from the GenBank. The discriminatory capabilities of these regions were evaluated in terms of PCR amplification success, intraspecific and interspecific divergence, DNA barcoding gaps, and identification efficiency via a tree-based method. ITS1 was superior to the other markers for discriminating between species, with an accuracy of 81.48%. The three combinations did not increase species discrimination. Finally, we found that ITS1 is a powerful barcode for identifying Salvia species, especially Salvia miltiorrhiza.

  13. GRAIL seeks out genes buried in DNA sequence

    SciTech Connect

    Roberts, L.

    1991-11-08

    When the Human Genome Project achieves its ultimate goal, supposedly around 2005, biologists will have in hand the exact sequence of all 3 billion nucleotides arrayed along the human chromosomes. But they have never been entirely sure how they will read the language of the long string of As, Gs, Ts, and Cs. How will they even be able to pick out the genes, which account for a mere 5% of the genome, from the mass of letters in between? Now Edward Uberbacher, a biophysicist-turned-computational-biologist at Oak Ridge National Laboratory, has come one step toward providing an answer: a new artificial intelligence program, called GRAIL, that can pick out the coding regions of genes in a long stretch of sequence data. So far, the Oak Ridge team has analyzed 5 million bases of DNA. One year ago, even 6 months ago, it was virtually impossible to go into human genomic sequence and find genes by computer with any reliability. Now we can go in and find 90% of the genes very quickly. GRAIL can be used on a PC, not a supercomputer, and it provides an answer almost instantly.

  14. The DNA sequence of the human X chromosome.

    PubMed

    Ross, Mark T; Grafham, Darren V; Coffey, Alison J; Scherer, Steven; McLay, Kirsten; Muzny, Donna; Platzer, Matthias; Howell, Gareth R; Burrows, Christine; Bird, Christine P; Frankish, Adam; Lovell, Frances L; Howe, Kevin L; Ashurst, Jennifer L; Fulton, Robert S; Sudbrak, Ralf; Wen, Gaiping; Jones, Matthew C; Hurles, Matthew E; Andrews, T Daniel; Scott, Carol E; Searle, Stephen; Ramser, Juliane; Whittaker, Adam; Deadman, Rebecca; Carter, Nigel P; Hunt, Sarah E; Chen, Rui; Cree, Andrew; Gunaratne, Preethi; Havlak, Paul; Hodgson, Anne; Metzker, Michael L; Richards, Stephen; Scott, Graham; Steffen, David; Sodergren, Erica; Wheeler, David A; Worley, Kim C; Ainscough, Rachael; Ambrose, Kerrie D; Ansari-Lari, M Ali; Aradhya, Swaroop; Ashwell, Robert I S; Babbage, Anne K; Bagguley, Claire L; Ballabio, Andrea; Banerjee, Ruby; Barker, Gary E; Barlow, Karen F; Barrett, Ian P; Bates, Karen N; Beare, David M; Beasley, Helen; Beasley, Oliver; Beck, Alfred; Bethel, Graeme; Blechschmidt, Karin; Brady, Nicola; Bray-Allen, Sarah; Bridgeman, Anne M; Brown, Andrew J; Brown, Mary J; Bonnin, David; Bruford, Elspeth A; Buhay, Christian; Burch, Paula; Burford, Deborah; Burgess, Joanne; Burrill, Wayne; Burton, John; Bye, Jackie M; Carder, Carol; Carrel, Laura; Chako, Joseph; Chapman, Joanne C; Chavez, Dean; Chen, Ellson; Chen, Guan; Chen, Yuan; Chen, Zhijian; Chinault, Craig; Ciccodicola, Alfredo; Clark, Sue Y; Clarke, Graham; Clee, Chris M; Clegg, Sheila; Clerc-Blankenburg, Kerstin; Clifford, Karen; Cobley, Vicky; Cole, Charlotte G; Conquer, Jen S; Corby, Nicole; Connor, Richard E; David, Robert; Davies, Joy; Davis, Clay; Davis, John; Delgado, Oliver; Deshazo, Denise; Dhami, Pawandeep; Ding, Yan; Dinh, Huyen; Dodsworth, Steve; Draper, Heather; Dugan-Rocha, Shannon; Dunham, Andrew; Dunn, Matthew; Durbin, K James; Dutta, Ireena; Eades, Tamsin; Ellwood, Matthew; Emery-Cohen, Alexandra; Errington, Helen; Evans, Kathryn L; Faulkner, Louisa; Francis, Fiona; Frankland, John; Fraser, Audrey 
E; Galgoczy, Petra; Gilbert, James; Gill, Rachel; Glöckner, Gernot; Gregory, Simon G; Gribble, Susan; Griffiths, Coline; Grocock, Russell; Gu, Yanghong; Gwilliam, Rhian; Hamilton, Cerissa; Hart, Elizabeth A; Hawes, Alicia; Heath, Paul D; Heitmann, Katja; Hennig, Steffen; Hernandez, Judith; Hinzmann, Bernd; Ho, Sarah; Hoffs, Michael; Howden, Phillip J; Huckle, Elizabeth J; Hume, Jennifer; Hunt, Paul J; Hunt, Adrienne R; Isherwood, Judith; Jacob, Leni; Johnson, David; Jones, Sally; de Jong, Pieter J; Joseph, Shirin S; Keenan, Stephen; Kelly, Susan; Kershaw, Joanne K; Khan, Ziad; Kioschis, Petra; Klages, Sven; Knights, Andrew J; Kosiura, Anna; Kovar-Smith, Christie; Laird, Gavin K; Langford, Cordelia; Lawlor, Stephanie; Leversha, Margaret; Lewis, Lora; Liu, Wen; Lloyd, Christine; Lloyd, David M; Loulseged, Hermela; Loveland, Jane E; Lovell, Jamieson D; Lozado, Ryan; Lu, Jing; Lyne, Rachael; Ma, Jie; Maheshwari, Manjula; Matthews, Lucy H; McDowall, Jennifer; McLaren, Stuart; McMurray, Amanda; Meidl, Patrick; Meitinger, Thomas; Milne, Sarah; Miner, George; Mistry, Shailesh L; Morgan, Margaret; Morris, Sidney; Müller, Ines; Mullikin, James C; Nguyen, Ngoc; Nordsiek, Gabriele; Nyakatura, Gerald; O'Dell, Christopher N; Okwuonu, Geoffery; Palmer, Sophie; Pandian, Richard; Parker, David; Parrish, Julia; Pasternak, Shiran; Patel, Dina; Pearce, Alex V; Pearson, Danita M; Pelan, Sarah E; Perez, Lesette; Porter, Keith M; Ramsey, Yvonne; Reichwald, Kathrin; Rhodes, Susan; Ridler, Kerry A; Schlessinger, David; Schueler, Mary G; Sehra, Harminder K; Shaw-Smith, Charles; Shen, Hua; Sheridan, Elizabeth M; Shownkeen, Ratna; Skuce, Carl D; Smith, Michelle L; Sotheran, Elizabeth C; Steingruber, Helen E; Steward, Charles A; Storey, Roy; Swann, R Mark; Swarbreck, David; Tabor, Paul E; Taudien, Stefan; Taylor, Tineace; Teague, Brian; Thomas, Karen; Thorpe, Andrea; Timms, Kirsten; Tracey, Alan; Trevanion, Steve; Tromans, Anthony C; d'Urso, Michele; Verduzco, Daniel; Villasana, Donna; 
Waldron, Lenee; Wall, Melanie; Wang, Qiaoyan; Warren, James; Warry, Georgina L; Wei, Xuehong; West, Anthony; Whitehead, Siobhan L; Whiteley, Mathew N; Wilkinson, Jane E; Willey, David L; Williams, Gabrielle; Williams, Leanne; Williamson, Angela; Williamson, Helen; Wilming, Laurens; Woodmansey, Rebecca L; Wray, Paul W; Yen, Jennifer; Zhang, Jingkun; Zhou, Jianling; Zoghbi, Huda; Zorilla, Sara; Buck, David; Reinhardt, Richard; Poustka, Annemarie; Rosenthal, André; Lehrach, Hans; Meindl, Alfons; Minx, Patrick J; Hillier, Ladeana W; Willard, Huntington F; Wilson, Richard K; Waterston, Robert H; Rice, Catherine M; Vaudin, Mark; Coulson, Alan; Nelson, David L; Weinstock, George; Sulston, John E; Durbin, Richard; Hubbard, Tim; Gibbs, Richard A; Beck, Stephan; Rogers, Jane; Bentley, David R

    2005-03-17

    The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence.

  15. Long-range correlations and charge transport properties of DNA sequences

    NASA Astrophysics Data System (ADS)

    Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui

    2010-04-01

    By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on the above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5 [...] sequence displays a transition from correlation behavior to anticorrelation behavior. The resonant peaks of the transmission coefficient in genomic sequences can survive in longer sequence length than in random sequences but in shorter sequence length than in quasiperiodic sequences. It is shown that the genomic sequences have long-range correlation properties to some extent but the correlations are not strong enough to maintain the scale invariance properties.
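
    As a rough illustration of the rescaled range (R/S) idea behind a Hurst exponent, the sketch below maps a DNA string to a +1/-1 series and fits the slope of log(R/S) against log(window size). It is a minimal sketch, not the authors' procedure: the purine/pyrimidine mapping, the non-overlapping windows, and all function names are assumptions made for this example. A slope near 0.5 suggests an uncorrelated series, values below 0.5 suggest anticorrelation, and values above 0.5 suggest positive correlation.

```python
import math


def purine_walk(dna):
    """Map purines (A, G) to +1 and pyrimidines (C, T) to -1 (assumed mapping)."""
    return [1 if base in "AG" else -1 for base in dna.upper() if base in "ACGT"]


def rescaled_range(series):
    """R/S of one window: range of cumulative mean-adjusted sums over the std dev."""
    n = len(series)
    mean = sum(series) / n
    cumulative = low = high = 0.0
    for x in series:
        cumulative += x - mean
        low, high = min(low, cumulative), max(high, cumulative)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return (high - low) / std if std > 0 else None


def hurst_exponent(series, window_sizes):
    """Least-squares slope of log(mean R/S) against log(window size)."""
    xs, ys = [], []
    for w in window_sizes:
        values = []
        for start in range(0, len(series) - w + 1, w):  # non-overlapping windows
            rs = rescaled_range(series[start:start + w])
            if rs is not None and rs > 0:
                values.append(rs)
        if values:
            xs.append(math.log(w))
            ys.append(math.log(sum(values) / len(values)))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)


if __name__ == "__main__":
    alternating = purine_walk("AC" * 2048)  # perfectly anticorrelated toy input
    print(hurst_exponent(alternating, [16, 32, 64, 128, 256]))
```

    For short windows the R/S estimate is known to be biased, so values from a sketch like this are best read comparatively (for example against shuffled or artificial sequences, as in the comparison described above) rather than as absolute exponents.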

  16. Estimation of DNA sequence context-dependent mutation rates using primate genomic sequences.

    PubMed

    Zhang, Wei; Bouffard, Gerard G; Wallace, Susan S; Bond, Jeffrey P

    2007-09-01

    It is understood that DNA and amino acid substitution rates are highly sequence context-dependent, e.g., C --> T substitutions in vertebrates may occur much more frequently at CpG sites and that cysteine substitution rates may depend on support of the context for participation in a disulfide bond. Furthermore, many applications rely on quantitative models of nucleotide or amino acid substitution, including phylogenetic inference and identification of amino acid sequence positions involved in functional specificity. We describe quantification of the context dependence of nucleotide substitution rates using baboon, chimpanzee, and human genomic sequence data generated by the NISC Comparative Sequencing Program. Relative mutation rates are reported for the 96 classes of mutations of the form 5' αβγ 3' --> 5' αδγ 3', where α, β, γ, and δ are nucleotides and β ≠ δ, based on maximum likelihood calculations. Our results confirm that C --> T substitutions are enhanced at CpG sites compared with other transitions, relatively independent of the identity of the preceding nucleotide. While, as expected, transitions generally occur more frequently than transversions, we find that the most frequent transversions involve the C at CpG sites (CpG transversions) and that their rate is comparable to the rate of transitions at non-CpG sites. A four-class model of the rates of context-dependent evolution of primate DNA sequences, CpG transitions > non-CpG transitions ≈ CpG transversions > non-CpG transversions, captures qualitative features of the mutation spectrum. We find that despite qualitative similarity of mutation rates among different genomic regions, there are statistically significant differences.
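
    The count of 96 mutation classes can be reproduced by enumeration. The sketch below assumes, as is common for trinucleotide-context models but not stated explicitly in the abstract, that a substitution and its reverse complement are treated as the same class; the function names are hypothetical. There are 4 x 4 x 4 x 3 = 192 ordered substitutions in a three-base context, and merging strand-equivalent pairs halves that number.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mutation_classes():
    """Enumerate context-dependent substitutions, merging reverse complements."""
    classes = set()
    for left, before, right, after in product("ACGT", repeat=4):
        if before == after:
            continue  # the middle base must actually change
        forward = (left + before + right, left + after + right)
        reverse = (revcomp(forward[0]), revcomp(forward[1]))
        classes.add(min(forward, reverse))  # one canonical label per strand pair
    return sorted(classes)


if __name__ == "__main__":
    print(len(mutation_classes()))
```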

  17. Giant panda ribosomal protein S14: cDNA, genomic sequence cloning, sequence analysis, and overexpression.

    PubMed

    Wu, G-F; Hou, Y-L; Hou, W-R; Song, Y; Zhang, T

    2010-10-13

    RPS14 is a component of the 40S ribosomal subunit encoded by the RPS14 gene and is required for its maturation. The cDNA and the genomic sequence of RPS14 were cloned successfully from the giant panda (Ailuropoda melanoleuca) using RT-PCR technology and touchdown-PCR, respectively; they were both sequenced and analyzed. The length of the cloned cDNA fragment was 492 bp; it contained an open-reading frame of 456 bp, encoding 151 amino acids. The length of the genomic sequence is 3421 bp; it contains four exons and three introns. Alignment analysis indicates that the nucleotide sequence shares a high degree of homology with those of Homo sapiens, Bos taurus, Mus musculus, Rattus norvegicus, Gallus gallus, Xenopus laevis, and Danio rerio (93.64, 83.37, 92.54, 91.89, 87.28, 84.21, and 84.87%, respectively). Comparison of the deduced amino acid sequences of the giant panda with those of these other species revealed that the RPS14 of giant panda is highly homologous with those of B. taurus, R. norvegicus and D. rerio (85.99, 99.34 and 99.34%, respectively), and is 100% identical with the others. This degree of conservation of RPS14 suggests evolutionary selection. Topology prediction shows that there are two N-glycosylation sites, three protein kinase C phosphorylation sites, two casein kinase II phosphorylation sites, four N-myristoylation sites, two amidation sites, and one ribosomal protein S11 signature in the RPS14 protein of the giant panda. The RPS14 gene can be readily expressed in Escherichia coli. When it was fused with the N-terminally His-tagged protein, it gave rise to accumulation of an expected 22-kDa polypeptide, in good agreement with the predicted molecular weight. The expression product obtained can be purified for studies of its function.
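
    The relationship between the reported open-reading-frame length and the number of encoded residues is simple codon arithmetic. The sketch below assumes that the quoted 456 bp includes the stop codon, which the abstract does not state; the function name is hypothetical.

```python
def encoded_residues(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an open reading frame of orf_bp bases.

    includes_stop=True assumes the final codon of the quoted length is a
    stop codon that contributes no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    print(encoded_residues(456))
```

    Under that assumption, 456 bp corresponds to 152 codons and 151 residues, matching the figure given above.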

  18. Luminescent and electroactive labels for DNA sequencing and mapping

    SciTech Connect

    Brown, G.M.

    1994-12-31

    New labels for DNA based on metalloorganic compounds that are either electrochemically active or have long-lived luminescent excited states have been prepared. A derivative of the macrocyclic chelating agent, 1,4,7,10-tetraazacyclododecane-1,4,7,10-tetraacetic acid (DOTA) was used to attach the lanthanide [Ln(III)] ion to oligonucleotides. This ligand proved stable providing kinetically inert complexes with such metal ions.

  19. cDNA sequence and expression pattern of the putative pheromone carrier aphrodisin.

    PubMed Central

    Mägert, H J; Hadrys, T; Cieslak, A; Gröger, A; Feller, S; Forssmann, W G

    1995-01-01

    The cDNA sequence for aphrodisin, a lipocalin from hamster vaginal discharge which is involved in pheromonal activity, has been determined. Corresponding genomic clones were isolated and the promoter region was identified. Primer extension analysis revealed an adenosine residue as the main transcription initiation site, located 50 bp upstream of the translation start codon ATG, which is surrounded by a typical Kozak sequence. However, data from polymerase chain reaction analysis suggest the existence of at least one alternative transcription initiation site. The aphrodisin cDNA is 732 bp long and codes for the mature 151-aa aphrodisin and an additional N-terminal 16-aa secretory signal peptide. The 3' nontranslated region is 228 bp long. Among the known sequences, the aphrodisin cDNA shares the highest homology with the rat odorant-binding protein cDNA (45%), which verifies the protein data. Vaginal tissue and Bartholin's glands are the main aphrodisin gene-expressing tissues of the female hamster genital tract, as demonstrated by Northern blot analysis. Under less stringent hybridization conditions, RNA isolated from rat Bartholin's glands also showed a signal, indicating the occurrence of aphrodisin-related mRNA in this species. PMID:7892229

  20. Mapping vaccinia virus DNA replication origins at nucleotide level by deep sequencing.

    PubMed

    Senkevich, Tatiana G; Bruno, Daniel; Martens, Craig; Porcella, Stephen F; Wolf, Yuri I; Moss, Bernard

    2015-09-01

    Poxviruses reproduce in the host cytoplasm and encode most or all of the enzymes and factors needed for expression and synthesis of their double-stranded DNA genomes. Nevertheless, the mode of poxvirus DNA replication and the nature and location of the replication origins remain unknown. A current but unsubstantiated model posits only leading strand synthesis starting at a nick near one covalently closed end of the genome and continuing around the other end to generate a concatemer that is subsequently resolved into unit genomes. The existence of specific origins has been questioned because any plasmid can replicate in cells infected by vaccinia virus (VACV), the prototype poxvirus. We applied directional deep sequencing of short single-stranded DNA fragments enriched for RNA-primed nascent strands isolated from the cytoplasm of VACV-infected cells to pinpoint replication origins. The origins were identified as the switching points of the fragment directions, which correspond to the transition from continuous to discontinuous DNA synthesis. Origins containing a prominent initiation point mapped to a sequence within the hairpin loop at one end of the VACV genome and to the same sequence within the concatemeric junction of replication intermediates. These findings support a model for poxvirus genome replication that involves leading and lagging strand synthesis and is consistent with the requirements for primase and ligase activities as well as earlier electron microscopic and biochemical studies implicating a replication origin at the end of the VACV genome.
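
    The idea of locating origins at switching points of fragment direction can be sketched as follows. This is a toy illustration and not the authors' pipeline: the input format (a list of position and strand pairs), the fixed window size, and the function names are all assumptions. Which polarity of switch marks an origin depends on how the library was prepared, so the sketch only reports where the dominant strand changes.

```python
def strand_bias(fragments, genome_length, window):
    """Per-window bias (+1 all forward, -1 all reverse, 0.0 if no fragments)."""
    n_windows = (genome_length + window - 1) // window
    forward = [0] * n_windows
    reverse = [0] * n_windows
    for position, strand in fragments:
        index = position // window
        if strand == "+":
            forward[index] += 1
        else:
            reverse[index] += 1
    return [(f - r) / (f + r) if f + r else 0.0 for f, r in zip(forward, reverse)]


def switch_points(bias, window):
    """Start coordinates of windows where the dominant strand flips."""
    points = []
    previous_sign = 0
    for index, value in enumerate(bias):
        sign = (value > 0) - (value < 0)
        if sign == 0:
            continue  # skip empty or perfectly balanced windows
        if previous_sign and sign != previous_sign:
            points.append(index * window)
        previous_sign = sign
    return points


if __name__ == "__main__":
    # Hypothetical toy data: reverse-strand fragments, then forward-strand ones.
    toy = [(p, "-") for p in range(0, 500, 10)] + [(p, "+") for p in range(500, 1000, 10)]
    print(switch_points(strand_bias(toy, 1000, 100), 100))
```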

  1. Somatic instability of the DNA sequences encoding the polymorphic polyglutamine tract of the AIB1 gene

    PubMed Central

    Dai, P; Wong, L

    2003-01-01

    Background: AIB1 contains a polymorphic polyglutamine tract (poly Q) that is encoded by a trinucleotide CAG repeat. Previously there have been conflicting results regarding the effect of the poly Q tract length on breast cancer. Since poly Q is not encoded by a perfect CAG repeat, the heterozygous polymorphic alleles need to be resolved, to understand the exact DNA sequences encoding poly Q. Methods: Poly Q encoding sequences of AIB1 from 107 DNA samples, including breast cancer cell lines, sporadic primary breast tumours, and blood samples from BRCA1/BRCA2 mutation carriers and the general population, were resolved by PCR/cloning followed by sequencing of each individual clone. Results: 25 distinct poly Q encoding sequence patterns were found. More than two distinct sequence patterns were found in a significantly higher proportion of tumours and cell lines than that of the general population, suggesting somatic instability. A significantly higher proportion of cancer cell lines or primary breast tumours than that of the general population contained rare sequence patterns. The proportion of sporadic breast tumours having at least one allele ⩽27 repeats is significantly higher than that in the blood of BRCA1/BRCA2 mutation carrier breast cancer patients or the general population. Conclusion: The poly Q encoding DNA sequences are somatically unstable in tumour tissues and cell lines. A missense mutation and a very short glutamine repeat in primary tumours suggests that AIB1 activity may be modulated through poly Q, which in turn plays a role in the cotransactivation of gene expressions in breast cancers. PMID:14684685

  2. Next-generation sequencing reveals DGUOK mutations in adult patients with mitochondrial DNA multiple deletions

    PubMed Central

    Garone, Caterina; Bordoni, Andreina; Gutierrez Rios, Purificacion; Calvo, Sarah E.; Ripolone, Michela; Ranieri, Michela; Rizzuti, Mafalda; Villa, Luisa; Magri, Francesca; Corti, Stefania; Bresolin, Nereo; Mootha, Vamsi K.; Moggio, Maurizio; DiMauro, Salvatore; Comi, Giacomo P.; Sciacco, Monica

    2012-01-01

    The molecular diagnosis of mitochondrial disorders still remains elusive in a large proportion of patients, but advances in next generation sequencing are significantly improving our chances to detect mutations even in sporadic patients. Syndromes associated with mitochondrial DNA multiple deletions are caused by different molecular defects resulting in a wide spectrum of predominantly adult-onset clinical presentations, ranging from progressive external ophthalmoplegia to multi-systemic disorders of variable severity. The mutations underlying these conditions remain undisclosed in half of the affected subjects. We applied next-generation sequencing of known mitochondrial targets (MitoExome) to probands presenting with adult-onset mitochondrial myopathy and harbouring mitochondrial DNA multiple deletions in skeletal muscle. We identified autosomal recessive mutations in the DGUOK gene (encoding mitochondrial deoxyguanosine kinase), which has previously been associated with an infantile hepatocerebral form of mitochondrial DNA depletion. Mutations in DGUOK occurred in five independent subjects, representing 5.6% of our cohort of patients with mitochondrial DNA multiple deletions, and impaired both muscle DGUOK activity and protein stability. Clinical presentations were variable, including mitochondrial myopathy with or without progressive external ophthalmoplegia, recurrent rhabdomyolysis in a young female who had received a liver transplant at 9 months of age and adult-onset lower motor neuron syndrome with mild cognitive impairment. These findings reinforce the concept that mutations in genes involved in deoxyribonucleotide metabolism can cause diverse clinical phenotypes and suggest that DGUOK should be screened in patients harbouring mitochondrial DNA deletions in skeletal muscle. PMID:23043144

  3. DNA uptake sequences in Neisseria gonorrhoeae as intrinsic transcriptional terminators and markers of horizontal gene transfer

    PubMed Central

    Gurung, Neesha

    2016-01-01

    DNA uptake sequences are widespread throughout the Neisseria gonorrhoeae genome. These short, conserved sequences facilitate the exchange of endogenous DNA between members of the genus Neisseria. Often the DNA uptake sequences are present as inverted repeats that are able to form hairpin structures. It has been suggested previously that DNA uptake sequence inverted repeats present 3′ of genes play a role in rho-independent termination and attenuation. However, there is conflicting experimental evidence to support this role. The aim of this study was to determine the role of DNA uptake sequences in transcriptional termination. Both bioinformatics predictions, conducted using TransTermHP, and experimental evidence, from RNA-seq data, were used to determine which inverted repeat DNA uptake sequences are transcriptional terminators and in which direction. Here we show that DNA uptake sequences in the inverted repeat configuration occur in N. gonorrhoeae both where the DNA uptake sequence precedes the inverted version of the sequence and also, albeit less frequently, in reverse order. Due to their symmetrical configuration, inverted repeat DNA uptake sequences can potentially act as bi-directional terminators, therefore affecting transcription on both DNA strands. This work also provides evidence that gaps in DNA uptake sequence density in the gonococcal genome coincide with areas of DNA that are foreign in origin, such as prophage. This study differentiates for the first time, to our knowledge, between DNA uptake sequences that form intrinsic transcriptional terminators and those that do not, providing characteristic features within the flanking inverted repeat that can be identified. PMID:28348864

  4. Activation of polyomavirus DNA replication by yeast GAL4 is dependent on its transcriptional activation domains.

    PubMed Central

    Bennett-Cook, E R; Hassell, J A

    1991-01-01

    The polyomavirus replication origin contains transcriptional regulatory sequences. To determine how these elements function in DNA replication, and to learn whether a common mechanism underlies the activation of transcription and DNA replication, we tested whether a well-characterized transcriptional activator, yeast GAL4, was capable of stimulating DNA replication and transcription in the same mammalian cell line. We observed that GAL4 activated polyomavirus DNA replication in mouse cells when its binding site was juxtaposed to the late border of the polyomavirus origin core. Synergistic activation of DNA replication was achieved by multimerization of the GAL4 binding site. Analysis of GAL4 mutant proteins, GAL4 hybrid proteins and mutants of the latter revealed that the activation domains of these transcriptional activators were required to stimulate DNA replication. In agreement with previously published data, the activation domains of GAL4 were also required to enhance transcription in the same mouse cell line. These observations implicate transcriptional activators in Py DNA replication and suggest that similar mechanisms govern the activation of transcription and DNA replication. PMID:1849079

  5. Active DNA Demethylation in Plants and Animals

    PubMed Central

    Zhang, H.; Zhu, J.-K.

    2013-01-01

    Active DNA demethylation regulates many vital biological processes, including early development and locus-specific gene expression in plants and animals. In Arabidopsis, bifunctional DNA glycosylases directly excise the 5-methylcytosine base and then cleave the DNA backbone at the abasic site. Recent evidence suggests that mammals utilize DNA glycosylases after 5-methylcytosine is oxidized and/or deaminated. In both cases, the resultant single-nucleotide gap is subsequently filled with an unmodified cytosine through the DNA base excision repair pathway. The enzymatic removal of 5-methylcytosine is tightly integrated with histone modifications and possibly noncoding RNAs. Future research will increase our understanding of the mechanisms and critical roles of active DNA demethylation in various cellular processes as well as inspire novel genetic and chemical therapies for epigenetic disorders. PMID:23197304

  6. cDNA cloning and sequence analysis of human pancreatic procarboxypeptidase A1.

    PubMed Central

    Catasús, L; Villegas, V; Pascual, R; Avilés, F X; Wicker-Planquart, C; Puigserver, A

    1992-01-01

    Using polyclonal antibodies raised against human pancreatic procarboxypeptidases, a full-length cDNA coding for an A-type proenzyme was isolated from a lambda gt11 human pancreatic library. This cDNA contains standard 3' and 5' flanking regions, a poly(A)+ tail and a central region of 1260 nucleotides coding for a protein of 419 amino acids. On the basis of sequence comparisons, the human protein was classified as a procarboxypeptidase A1 which is very similar to the previously described A1 forms from rat and bovine pancreatic glands. The presence of the amino acid sequences assumed to be of importance for the zymogen inhibition by its activation segment, primarily on the basis of the recently reported crystal structure of the B form, further supports the proposed classification. PMID:1417781

  7. Differential DNA sequence recognition is a determinant of specificity in homeotic gene action.

    PubMed Central

    Ekker, S C; von Kessler, D P; Beachy, P A

    1992-01-01

    The homeotic genes of Drosophila encode transcriptional regulatory proteins that specify distinct segment identities. Previous studies have implicated the homeodomain as a major determinant of biological specificity within these proteins, but have not established the physical basis of this specificity. We show here that the homeodomains encoded by the Ultrabithorax and Deformed homeotic genes bind optimally to distinct DNA sequences and have mapped the determinants responsible for differential recognition. We further show that relative transactivation by these two proteins in a simple in vivo system can differ by nearly two orders of magnitude. Such differences in DNA sequence recognition and target activation provide a biochemical basis for at least part of the biological specificity of homeotic gene action. PMID:1356765

  8. Sequence-specific binding of simian virus 40 A protein to nonorigin and cellular DNA.

    PubMed Central

    Wright, P J; DeLucia, A L; Tegtmeyer, P

    1984-01-01

    The simian virus 40 A protein (T antigen) recognized and bound to the consensus sequence 5'-GAGGC-3' in DNA from many sources. Sequence-specific binding to single pentanucleotides in randomly chosen DNA predominated over binding to nonspecific sequences. The asymmetric orientation of protein bound to nonorigin recognition sequences also resembled that of protein bound to the origin region of simian virus 40 DNA. Sequence variations in the DNA adjacent to single pentanucleotides influenced binding affinities even though methylation interference and protection studies did not reveal specific interactions outside of pentanucleotides. Thus, potential locations of A protein bound to any DNA can be predicted although the determinants of binding affinity are not yet understood. Sequence-specific binding of A protein to cellular DNA would provide a mechanism for specific alterations of host gene expression that facilitate viral function. PMID:6570189
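
    Predicting potential binding locations from the consensus pentanucleotide amounts to a motif scan on both strands. The sketch below is a minimal illustration with hypothetical function names; it reports exact matches to 5'-GAGGC-3' and to its reverse complement, and it says nothing about binding affinity, which the abstract notes is not yet understood.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_sites(dna, motif="GAGGC"):
    """Return (0-based start, strand) for each exact motif match on either strand."""
    dna = dna.upper()
    reverse_motif = revcomp(motif)
    sites = []
    for start in range(len(dna) - len(motif) + 1):
        window = dna[start:start + len(motif)]
        if window == motif:
            sites.append((start, "+"))
        elif window == reverse_motif:
            sites.append((start, "-"))  # motif lies on the complementary strand
    return sites


if __name__ == "__main__":
    # Hypothetical test sequence containing one site on each strand.
    print(find_sites("TTGAGGCAAGCCTCTT"))
```

    Recording the strand alongside the position preserves the orientation of each site, which matters here because the bound protein is described as asymmetrically oriented.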

  9. Genome-wide identification and characterisation of human DNA replication origins by initiation site sequencing (ini-seq)

    PubMed Central

    Langley, Alexander R.; Gräf, Stefan; Smith, James C.; Krude, Torsten

    2016-01-01

    Next-generation sequencing has enabled the genome-wide identification of human DNA replication origins. However, different approaches to mapping replication origins, namely (i) sequencing isolated small nascent DNA strands (SNS-seq); (ii) sequencing replication bubbles (bubble-seq) and (iii) sequencing Okazaki fragments (OK-seq), show only limited concordance. To address this controversy, we describe here an independent high-resolution origin mapping technique that we call initiation site sequencing (ini-seq). In this approach, newly replicated DNA is directly labelled with digoxigenin-dUTP near the sites of its initiation in a cell-free system. The labelled DNA is then immunoprecipitated and genomic locations are determined by DNA sequencing. Using this technique we identify >25,000 discrete origin sites at sub-kilobase resolution on the human genome, with high concordance between biological replicates. Most activated origins identified by ini-seq are found at transcriptional start sites and contain G-quadruplex (G4) motifs. They tend to cluster in early-replicating domains, providing a correlation between early replication timing and local density of activated origins. Origins identified by ini-seq show highest concordance with sites identified by SNS-seq, followed by OK-seq and bubble-seq. Furthermore, germline origins identified by positive nucleotide distribution skew jumps overlap with origins identified by ini-seq and OK-seq more frequently and more specifically than do sites identified by either SNS-seq or bubble-seq. PMID:27587586

  10. Development of a Novel Technology for Label Free DNA Sequencing

    DTIC Science & Technology

    2012-05-21

    Structures of Codons Interactions Among Codons and Relationship Between Sequence of the Bases and the Vibrational Modes Molecular Dynamics...device applications from the technology perspective. 2) The optically active modes lying within the terahertz spectrum typically arise out of joint...coupling between the DNA’s vibrational behavior and the dynamics within the nanodot substrate. For many normal modes, this coupling is predicted to be

  11. [Current applications of high-throughput DNA sequencing technology in antibody drug research].

    PubMed

    Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong

    2012-03-01

    Since the 2005 publication of a high-throughput DNA sequencing technology based on PCR carried out in oil emulsions, high-throughput DNA sequencing platforms have evolved into a robust technology for sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation for discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we present a review of current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach to the discovery and development of antibody drugs.

  12. Identification of genes in anonymous DNA sequences. Final report: Report period, 15 April 1993--15 April 1994

    SciTech Connect

    Fields, C.A.

    1994-09-01

    This Report concludes the DOE Human Genome Program project, "Identification of Genes in Anonymous DNA Sequence." The central goals of this project have been (1) understanding the problem of identifying genes in anonymous sequences, and (2) development of tools, primarily the automated identification system gm, for identifying genes. The activities supported under the previous award are summarized here to provide a single complete report on the activities supported as part of the project from its inception to its completion.

  13. Factorial Moments Analyses Show a Characteristic Length Scale in DNA Sequences

    NASA Astrophysics Data System (ADS)

    Mohanty, A. K.; Narayana Rao, A. V. S. S.

    2000-02-01

    A unique feature of most of the DNA sequences, found through the factorial moments analysis, is the existence of a characteristic length scale around which the density distribution is nearly Poissonian. Above this point, the DNA sequences, irrespective of their intron contents, show long range correlations with a significant deviation from the Gaussian statistics, while, below this point, the DNA statistics are essentially Gaussian. The famous DNA walk representation is also shown to be a special case of the present analysis.
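
    As a rough illustration of the kind of statistic involved, the sketch below computes a normalized factorial moment, F_q = <n(n-1)...(n-q+1)> / <n>^q, over per-window counts of one nucleotide. For Poisson-distributed counts F_q equals 1, which is the benchmark the abstract alludes to. Using single-base counts in non-overlapping windows as the "density" is an assumption made here for illustration; the abstract does not specify the authors' binning or definitions, and the function names are hypothetical.

```python
# Illustrative sketch only: normalized factorial moments of windowed base counts.
# The choice of "density" (count of one base per non-overlapping window) is an
# assumption; the cited abstract does not give the authors' exact construction.


def window_counts(seq: str, base: str, window: int):
    """Count occurrences of `base` in consecutive non-overlapping windows."""
    seq = seq.upper()
    return [
        seq[i:i + window].count(base.upper())
        for i in range(0, len(seq) - window + 1, window)
    ]


def factorial_moment(counts, q: int) -> float:
    """Return F_q = <n(n-1)...(n-q+1)> / <n>**q for a list of counts."""
    mean = sum(counts) / len(counts)
    if mean == 0:
        raise ValueError("mean count is zero; F_q is undefined")
    total = 0.0
    for n in counts:
        term = 1
        for k in range(q):
            term *= n - k  # falling factorial n(n-1)...(n-q+1)
        total += term
    return (total / len(counts)) / mean ** q


if __name__ == "__main__":
    counts = window_counts("AACCAACC", "A", 4)  # hypothetical toy sequence
    print(counts, factorial_moment(counts, 2))
```

    Repeating the calculation over a range of window sizes is one way to look for a scale at which F_q approaches the Poisson value of 1.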

  14. Voltammetric detection of sequence-selective DNA hybridization related to Toxoplasma gondii in PCR amplicons.

    PubMed

    Gokce, Gultekin; Erdem, Arzum; Ceylan, Cagdas; Akgöz, Muslum

    2016-01-01

    This work describes the single-use electrochemical DNA biosensor technology developed for voltammetric detection of sequence-selective DNA hybridization related to an important human and veterinary pathogen, Toxoplasma gondii. In this electrochemical label-free detection assay, the formation of the DNA hybrid duplex was detected by measuring the guanine oxidation signal that occurred in the presence of DNA hybridization. The biosensor design consisted of the immobilization of an inosine-modified (guanine-free) probe onto the surface of a pencil graphite electrode (PGE), and the detection of duplex formation by differential pulse voltammetry (DPV) through measurement of the guanine signal. The Toxoplasma gondii capture probe was first immobilized onto the surface of the activated PGE by wet adsorption. The extent of hybridization at the PGE surface between the probe and the target was then determined by measuring the guanine signal observed at +1.0 V. Optimum DNA hybridization was monitored electrochemically at a target concentration of 40 µg/mL and a hybridization time of 50 min. The specificity of the electrochemical biosensor was then tested using non-complementary or mismatched short DNA sequences. Under the optimum conditions, the guanine oxidation signal indicating full hybridization was measured at target concentrations from 0.5 to 25 µg/mL, and the detection limit was found to be 1.78 µg/mL. This single-use biosensor platform was successfully applied for the voltammetric detection of DNA hybridization related to Toxoplasma gondii in PCR amplicons.

  15. Generating Exome Enriched Sequencing Libraries from Formalin-Fixed, Paraffin-Embedded Tissue DNA for Next-Generation Sequencing.

    PubMed

    Marosy, Beth A; Craig, Brian D; Hetrick, Kurt N; Witmer, P Dane; Ling, Hua; Griffith, Sean M; Myers, Benjamin; Ostrander, Elaine A; Stanford, Janet L; Brody, Lawrence C; Doheny, Kimberly F

    2017-01-11

    This unit describes a technique for generating exome-enriched sequencing libraries using DNA extracted from formalin-fixed paraffin-embedded (FFPE) samples. Utilizing commercially available kits, we present a low-input FFPE workflow starting with 50 ng of DNA. This procedure includes a repair step to address damage caused by FFPE preservation that improves sequence quality. Subsequently, libraries undergo an in-solution-targeted selection for exons, followed by sequencing using the Illumina next-generation short-read sequencing platform. © 2017 by John Wiley & Sons, Inc.

  16. Sequence-selective DNA detection using multiple laminar streams: a novel microfluidic analysis method.

    PubMed

    Yamashita, Kenichi; Yamaguchi, Yoshiko; Miyazaki, Masaya; Nakamura, Hiroyuki; Shimizu, Hazime; Maeda, Hideaki

    2004-02-01

    On-site detection methods for DNA have been demanded in the pathophysiology field. Such analysis requires a simple and accurate method, rather than high-throughput. This report describes a novel microfluidic analysis method and its application for simple sequence-selective DNA detection. The method uses a microchannel device with a serpentine structure. Sequence-specific binding of probe DNA can be detected at one side of the microchannel. This method is capable of sequence-specific detection of DNA with high accuracy. Single base mutations can also be analyzed. Combination of laminar stream and laminar secondary flow in the microchannel enable specific detection of probe-bound DNA.

  17. Genomic shotgun array: a procedure linking large-scale DNA sequencing with regional transcript mapping.

    PubMed

    Li, Ling-Hui; Li, Jian-Chiuan; Lin, Yung-Feng; Lin, Chung-Yen; Chen, Chung-Yung; Tsai, Shih-Feng

    2004-02-11

    To facilitate transcript mapping and to investigate alterations in genomic structure and gene expression in a defined genomic target, we developed a novel microarray-based method to detect transcriptional activity of the human chromosome 4q22-24 region. Loss of heterozygosity of human 4q22-24 is frequently observed in hepatocellular carcinoma (HCC). One hundred and eighteen well-characterized genes have been identified from this region. We took previously sequenced shotgun subclones as templates to amplify overlapping sequences for the genomic segment and constructed a chromosome-region-specific microarray. Using genomic DNA fragments as probes, we detected transcriptional activity from within this region among five different tissues. The hybridization results indicate that there are new transcripts that have not yet been identified by other methods. The existence of new transcripts encoded by genes in this region was confirmed by PCR cloning or cDNA library screening. The procedure reported here allows coupling of shotgun sequencing with transcript mapping and, potentially, detailed analysis of gene expression and chromosomal copy of the genomic sequence for the putative HCC tumor suppressor gene(s) in the 4q candidate region.

  18. Isolation, characterization and chromosome localization of repetitive DNA sequences in bananas (Musa spp.).

    PubMed

    Valárik, M; Simková, H; Hribová, E; Safár, J; Dolezelová, M; Dolezel, J

    2002-01-01

    Partial genomic DNA libraries were constructed in Musa acuminata and M. balbisiana and screened for clones carrying repeated sequences, and sequences carrying rDNA. Isolated clones were characterized in terms of copy number, genomic distribution in M. acuminata and M. balbisiana, and sequence similarity to known DNA sequences. Ribosomal RNA genes have been the most abundant sequences recovered. FISH with probes for DNA clones Radka1 and Radka7, which carry different fragments of Musa 26S rDNA, and Radka14, for which no homology with known DNA sequences has been found, resulted in clear signals at secondary constrictions. Only one clone carrying 5S rDNA, named Radka2, has been recovered. All remaining DNA clones exhibited more or less pronounced clustering at centromeric regions. The study revealed small differences in genomic distribution of repetitive DNA sequences between M. acuminata and M. balbisiana, the only exception being the 5S rDNA where the two Musa clones under study differed in the number of sites. All repetitive sequences were more abundant in M. acuminata whose genome is about 12% larger than that of M. balbisiana. While, for some sequences, the differences in copy number between the species were relatively small, for some of them, e.g. Radka5, the difference was almost thirty-fold. These observations suggest that repetitive DNA sequences contribute to the difference in genome size between both species, albeit to different extents. Isolation and characterization of new repetitive DNA sequences improves the knowledge of long-range organization of chromosomes in

  19. True single-molecule DNA sequencing of a Pleistocene horse bone

    PubMed Central

    Orlando, Ludovic; Ginolhac, Aurelien; Raghavan, Maanasa; Vilstrup, Julia; Rasmussen, Morten; Magnussen, Kim; Steinmann, Kathleen E.; Kapranov, Philipp; Thompson, John F.; Zazula, Grant; Froese, Duane; Moltke, Ida; Shapiro, Beth; Hofreiter, Michael; Al-Rasheid, Khaled A.S.; Gilbert, M. Thomas P.; Willerslev, Eske

    2011-01-01

    Second-generation sequencing platforms have revolutionized the field of ancient DNA, opening access to complete genomes of past individuals and extinct species. However, these platforms are dependent on library construction and amplification steps that may result in sequences that do not reflect the original DNA template composition. This is particularly true for ancient DNA, where templates have undergone extensive damage post-mortem. Here, we report the results of the first “true single molecule sequencing” of ancient DNA. We generated 115.9 Mb and 76.9 Mb of DNA sequences from a permafrost-preserved Pleistocene horse bone using the Helicos HeliScope and Illumina GAIIx platforms, respectively. We find that the percentage of endogenous DNA sequences derived from the horse is higher among the Helicos data than Illumina data. This result indicates that the molecular biology tools used to generate sequencing libraries of ancient DNA molecules, as required for second-generation sequencing, introduce biases into the data that reduce the efficiency of the sequencing process and limit our ability to fully explore the molecular complexity of ancient DNA extracts. We demonstrate that simple modifications to the standard Helicos DNA template preparation protocol further increase the proportion of horse DNA for this sample by threefold. Comparison of Helicos-specific biases and sequence errors in modern DNA with those in ancient DNA also reveals extensive cytosine deamination damage at the 3′ ends of ancient templates, indicating the presence of 3′-sequence overhangs. Our results suggest that paleogenomes could be sequenced in an unprecedented manner by combining current second- and third-generation sequencing approaches. PMID:21803858

  20. Studying long 16S rDNA sequences with ultrafast-metagenomic sequence classification using exact alignments (Kraken).

    PubMed

    Valenzuela-González, Fabiola; Martínez-Porchas, Marcel; Villalpando-Canchola, Enrique; Vargas-Albores, Francisco

    2016-03-01

    Ultrafast-metagenomic sequence classification using exact alignments (Kraken) is a novel approach to classify 16S rDNA sequences. The classifier is based on mapping short sequences to the lowest ancestor and performing alignments to form subtrees with specific weights in each taxon node. This study aimed to evaluate the classification performance of Kraken with long 16S rDNA random environmental sequences produced by cloning and then Sanger sequenced. A total of 480 clones were isolated and expanded, and 264 of these clones formed contigs (1352 ± 153 bp). The same sequences were analyzed using the Ribosomal Database Project (RDP) classifier. Deeper classification performance was achieved by Kraken than by the RDP: 73% of the contigs were classified up to the species or variety levels, whereas 67% of these contigs were classified no further than the genus level by the RDP. The results also demonstrated that unassembled sequences analyzed by Kraken provide similar or even deeper information. Moreover, sequences that did not form contigs, which are usually discarded by other programs, provided meaningful information when analyzed by Kraken. Finally, it appears that the assembly step for Sanger sequences can be eliminated when using Kraken. Kraken accumulates the information of both sequence senses, providing additional elements for the classification. In conclusion, the results demonstrate that Kraken is an excellent choice for use in the taxonomic assignment of sequences obtained by Sanger sequencing or based on third generation sequencing, of which the main goal is to generate larger sequences.

  1. Fluorescence bio-barcode DNA assay based on gold and magnetic nanoparticles for detection of Exotoxin A gene sequence.

    PubMed

    Amini, Bahram; Kamali, Mehdi; Salouti, Mojtaba; Yaghmaei, Parichehreh

    2017-06-15

    Bio-barcode DNA based on gold nanoparticles (bDNA-GNPs), as a new generation of biosensor-based detection tools, holds promise for biological science studies. Such tools are of enormous importance in the emergence of rapid and sensitive procedures for detecting toxins of microorganisms. Exotoxin A (ETA) is the most toxic virulence factor of Pseudomonas aeruginosa. ETA has ADP-ribosylation activity and decisively affects the protein synthesis of the host cells. In the present study, we developed a fluorescence bio-barcode technology to trace P. aeruginosa ETA. The GNPs were coated with the first target-specific DNA probe 1 (1pDNA) and bio-barcode DNA, which acted as a signal reporter. The magnetic nanoparticles (MNPs) were coated with the second target-specific DNA probe 2 (2pDNA) that was able to recognize the other end of the target DNA. After binding the nanoparticles with the target DNA, the following sandwich structure was formed: MNP 2pDNA/tDNA/1pDNA-GNP-bDNA. After isolating the sandwiches by a magnetic field, the DNAs of the probes, which had been hybridized to their complementary DNA, GNPs and MNPs via hydrogen, electrostatic and covalent bonds, were released from the sandwiches by dissolving in dithiothreitol solution (DTT 0.8 M). This bio-barcode DNA with known DNA sequence was then detected by fluorescence spectrophotometry. The findings showed that the new method has the advantages of speed, high sensitivity (the detection limit was 1.2 ng/ml), good selectivity, and a wide linear range of 5-200 ng/ml. The regression analysis also showed that there was a good linear relationship (∆F = 0.57 [target DNA] + 21.31, R² = 0.9984) between the fluorescent intensity and the target DNA concentration in the samples.
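
    The reported calibration line can be inverted to estimate a target concentration from a measured fluorescence change. The following is a minimal sketch using the slope, intercept and linear range quoted in the abstract. It assumes the concentration in the regression is expressed in ng/ml, the unit given for the linear range; the function names are hypothetical.

```python
# Illustrative sketch only: use and invert the calibration line from the abstract.
# Assumption: [target DNA] in the regression is in ng/ml (the unit of the
# reported linear range). Function names are hypothetical.

SLOPE = 0.57                   # change in fluorescence per ng/ml
INTERCEPT = 21.31              # fluorescence change at zero concentration
LINEAR_RANGE = (5.0, 200.0)    # ng/ml, as reported


def predicted_delta_f(conc_ng_ml: float) -> float:
    """Fluorescence change predicted by the reported regression line."""
    return SLOPE * conc_ng_ml + INTERCEPT


def estimate_concentration(delta_f: float) -> float:
    """Invert the regression; reject values outside the reported linear range."""
    conc = (delta_f - INTERCEPT) / SLOPE
    low, high = LINEAR_RANGE
    if not low <= conc <= high:
        raise ValueError(
            f"estimated {conc:.2f} ng/ml lies outside the linear range {low}-{high} ng/ml"
        )
    return conc


if __name__ == "__main__":
    signal = predicted_delta_f(100.0)
    print(signal, estimate_concentration(signal))
```

    Refusing to extrapolate beyond 5-200 ng/ml is a deliberate choice in this sketch, since the abstract reports linearity only within that range.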

  2. Brain feminization requires active repression of masculinization via DNA methylation

    PubMed Central

    Nugent, Bridget M.; Wright, Christopher L.; Shetty, Amol C.; Hodes, Georgia E.; Lenz, Kathryn M.; Mahurkar, Anup; Russo, Scott J.; Devine, Scott E.; McCarthy, Margaret M.

    2015-01-01

    The developing mammalian brain is destined for a female phenotype unless exposed to gonadal hormones during a perinatal sensitive period. It has been assumed that the undifferentiated brain is masculinized by direct induction of transcription by ligand-activated nuclear steroid receptors. We found that a primary effect of gonadal steroids in the highly sexually-dimorphic preoptic area (POA) is to reduce activity of DNA methyltransferase (Dnmt) enzymes, thereby decreasing DNA methylation and releasing masculinizing genes from epigenetic repression. Pharmacological inhibition of Dnmts mimicked gonadal steroids, resulting in masculinized neuronal markers and male sexual behavior in females. Conditional knockout of the de novo Dnmt isoform, Dnmt3a, also masculinized sexual behavior in female mice. RNA sequencing revealed gene and isoform variants modulated by methylation that may underlie the divergent reproductive behaviors of males versus females. Our data show that brain feminization is maintained by the active suppression of masculinization via DNA methylation. PMID:25821913

  3. Pairwise selection assembly for sequence-independent construction of long-length DNA.

    PubMed

    Blake, William J; Chapman, Brad A; Zindal, Anuradha; Lee, Michael E; Lippow, Shaun M; Baynes, Brian M

    2010-05-01

    The engineering of biological components has been facilitated by de novo synthesis of gene-length DNA. Biological engineering at the level of pathways and genomes, however, requires a scalable and cost-effective assembly of DNA molecules that are longer than approximately 10 kb, and this remains a challenge. Here we present the development of pairwise selection assembly (PSA), a process that involves hierarchical construction of long-length DNA through the use of a standard set of components and operations. In PSA, activation tags at the termini of assembly sub-fragments are reused throughout the assembly process to activate vector-encoded selectable markers. Marker activation enables stringent selection for a correctly assembled product in vivo, often obviating the need for clonal isolation. Importantly, construction via PSA is sequence-independent, and does not require primary sequence modification (e.g. the addition or removal of restriction sites). The utility of PSA is demonstrated in the construction of a completely synthetic 91-kb chromosome arm from Saccharomyces cerevisiae.

  4. A complete DNA sequence map of the ovine Major Histocompatibility Complex

    PubMed Central

    2010-01-01

    Background The ovine Major Histocompatibility Complex (MHC) harbors clusters of genes involved in overall resistance/susceptibility of an animal to infectious pathogens. However, only a limited number of ovine MHC genes have been identified and no adequate sequence information is available, as compared to those of swine and bovine. We previously constructed a BAC clone-based physical map that covers the entire class I, class II and class III regions of the ovine MHC. Here we describe the assembly of a complete DNA sequence map for the ovine MHC by shotgun sequencing of 26 overlapping BAC clones. Results DNA shotgun sequencing generated approximately 8-fold genome equivalent data that were successfully assembled into a finished sequence map of the ovine MHC. The sequence map spans approximately 2,434,000 nucleotides in length, covering almost all of the MHC loci currently known in the sheep and cattle. Gene annotation resulted in the identification of 177 protein-coding genes/ORFs, among which 145 were not previously reported in the sheep, and 10 were ovine species specific, absent in cattle or other mammals. A comparative sequence analysis among human, sheep and cattle revealed a high conservation in the MHC structure and loci order except for the class II region, which was divided into IIa and IIb subregions in the sheep and cattle, separated by a large piece of non-MHC autosome of approximately 18.5 Mb. In addition, a total of 18 non-protein-coding microRNAs were predicted in the ovine MHC region for the first time. Conclusion An ovine MHC DNA sequence map was successfully assembled by shotgun sequencing of 26 overlapping BAC clones. This makes the sheep the second ruminant species for which the complete MHC sequence information is available for evolution and functional studies, following that of the bovine. The results of the comparative analysis support a hypothesis that an inversion of the ancestral chromosome containing the MHC has shaped the MHC structures of ruminants

  5. ITS1 sequence variabilities correlate with 18S rDNA sequence types in the genus Acanthamoeba (Protozoa: Amoebozoa).

    PubMed

    Köhsler, Martina; Leitner, Brigitte; Blaschitz, Marion; Michel, Rolf; Aspöck, Horst; Walochnik, Julia

    2006-01-01

    The subgenus classification of the ubiquitously spread and potentially pathogenic acanthamoebae still poses a great challenge. Fifteen 18S rDNA sequence types (T1-T15) have been established, but the vast majority of isolates fall into sequence type T4, and so far, there is no means to reliably differentiate within T4. In this study, the first internal transcribed spacer (ITS1), a more variable region than the 18S rRNA gene, was sequenced, and the sequences of 15 different Acanthamoeba isolates were compared to reveal if ITS1 sequence variability correlates with 18S rDNA sequence typing and if the ITS1 sequencing allows a differentiation within T4. It was shown that the variability in ITS1 is tenfold higher than in the 18S rDNA, and that ITS1 clusters correlate with the 18S rDNA clusters and thus corroborate the Acanthamoeba sequence type system. Moreover, high sequence dissimilarities and distinctive microsatellite patterns could enable a more detailed differentiation within T4.

  6. Preferred sequences for DNA recognition by the TAL1 helix-loop-helix proteins

    SciTech Connect

    Hai-Ling Hsu; Lan Huang; Julia Tsou Tsan

    1994-02-01

    Tumor-specific activation of the TAL1 gene is the most common genetic alteration seen in patients with T-cell acute lymphoblastic leukemia. The TAL1 gene products contain the basic helix-loop-helix (bHLH) domain, a protein dimerization and DNA-binding motif common to several known transcription factors. A binding-site selection procedure has now been used to evaluate the DNA recognition properties of TAL1. These studies demonstrate that TAL1 polypeptides do not have intrinsic DNA-binding activity, presumably because of their inability to form bHLH homodimers. However, TAL1 readily interacts with any of the known class A bHLH proteins (E12, E47, E2-2, and HEB) to form heterodimers that bind DNA in a sequence-specific manner. The TAL1 heterodimers preferentially recognize a subset of E-box elements (CANNTG) that can be represented by the consensus sequence AACAGATGGT. This consensus is composed of half-sites for recognition by the participating class A bHLH polypeptide (AACAG) and the TAL1 polypeptide (ATGGT). TAL1 heterodimers with DNA-binding activity are readily detected in nuclear extracts of Jurkat, a leukemic cell line derived from a patient with T-cell acute lymphoblastic leukemia. Hence, TAL1 is likely to bind and regulate the transcription of a unique subset of subordinate target genes, some of which may mediate the malignant function of TAL1 during T-cell leukemogenesis. 48 refs., 10 figs.
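
    The relationship between the E-box pattern and the reported consensus can be checked directly. The sketch below treats CANNTG as a regular expression (N standing for any of A, C, G, T), confirms that the two half-sites join into the consensus, and locates the E-box inside it. The function name is hypothetical and the code is an illustration, not the binding-site selection procedure used in the study.

```python
import re

# Illustrative sketch only: find E-box elements (CANNTG) in a DNA string.
# A lookahead is used so that overlapping matches are also reported.
EBOX_PATTERN = re.compile(r"(?=(CA[ACGT]{2}TG))")

CLASS_A_HALF_SITE = "AACAG"   # half-site attributed to the class A bHLH partner
TAL1_HALF_SITE = "ATGGT"      # half-site attributed to TAL1
CONSENSUS = "AACAGATGGT"      # consensus reported in the abstract


def find_eboxes(seq: str):
    """Return (0-based start, matched hexamer) for each E-box in `seq`."""
    return [(m.start(), m.group(1)) for m in EBOX_PATTERN.finditer(seq.upper())]


if __name__ == "__main__":
    # The two half-sites concatenate to the reported consensus.
    print(CLASS_A_HALF_SITE + TAL1_HALF_SITE == CONSENSUS)
    # The consensus contains a single E-box, CAGATG, starting at position 2.
    print(find_eboxes(CONSENSUS))
```

    The E-box CAGATG straddles the junction between the two half-sites, consistent with each partner of the heterodimer contacting one half of the element.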

  7. DUC-Curve, a highly compact 2D graphical representation of DNA sequences and its application in sequence alignment

    NASA Astrophysics Data System (ADS)

    Li, Yushuang; Liu, Qian; Zheng, Xiaoqi

    2016-08-01

    A highly compact and simple 2D graphical representation of DNA sequences, named DUC-Curve, is constructed through mapping four nucleotides to a unit circle with a cyclic order. DUC-Curve could directly detect nucleotide, di-nucleotide compositions and microsatellite structure from DNA sequences. Moreover, it also could be used for DNA sequence alignment. Taking geometric center vectors of DUC-Curves as sequence descriptor, we perform similarity analysis on the first exons of β-globin genes of 11 species, oncogene TP53 of 27 species and twenty-four Influenza A viruses, respectively. The obtained reasonable results illustrate that the proposed method is very effective in sequence comparison problems, and will at least play a complementary role in classification and clustering problems.

  8. Sequencing mitochondrial DNA from a tooth and application to forensic odontology.

    PubMed

    Yamada, Y; Ohira, H; Iwase, H; Takatori, T; Nagao, M; Ohtani, S

    1997-06-01

    Genetic identification can be complicated by long intervals between the time of death and examination of tissues, and sometimes only bone and teeth may be available for analysis. Several investigators have described the isolation of nuclear DNA from these materials, but all have indicated that the DNA is significantly degraded. Recently, the polymerase chain reaction (PCR) and direct DNA sequencing have enabled rapid and reliable characterization of specific highly polymorphic DNA sequences from different individuals. Above all, mitochondrial DNA sequences offer several unique advantages for the identification of human remains. The isolation of mtDNA from a tooth and the symmetrical PCR amplification and direct DNA sequencing of its most polymorphic regions are reported.

  9. Sequence-specific electrochemical detection of asymmetric PCR amplicons of traditional Chinese medicinal plant DNA.

    PubMed

    Lee, Thomas M H; Hsing, I-Ming

    2002-10-01

    In this study, an electrochemistry-based approach to detect nucleic acid amplification products of Chinese herbal genes is reported. Using asymmetric polymerase chain reaction and electrochemical techniques, single-stranded target amplicons are produced from trace amounts of DNA sample and sequence-specific electrochemical detection based on the direct hybridization of the crude amplicon mix and immobilized DNA probe can be achieved. Electrochemically active intercalator Hoechst 33258 is bound to the double-stranded duplex formed by the target amplicon hybridized with the 5'-thiol-derivated DNA probe (16-mer) on the gold electrode surface. The electrochemical current signal of the hybridization event is measured by linear sweep voltammetry, the response of which can be used to differentiate the sequence complementarities of the target amplicons. To improve the reproducibility and sensitivity of the current signal, issues such as electrode surface cleaning, probe immobilization, and target hybridization are addressed. Factors affecting hybridization efficiency including the length and binding region of the target amplicon are discussed. Using our approach, differentiation of Chinese herbal species Fritillaria (F. thunbergii and F. cirrhosa) based on the 16-mer unique sequences in the spacer region of the 5S-rRNA is demonstrated. The ability to detect PCR products using a nonoptical electrochemical detection technique is an important step toward the realization of portable biomicrodevices for on-spot bacterial and viral detections.

  10. A Glance at Microsatellite Motifs from 454 Sequencing Reads of Watermelon Genomic DNA

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A single 454 (Life Sciences Sequencing Technology) run of Charleston Gray watermelon (Citrullus lanatus var. lanatus) genomic DNA was performed and sequence data were assembled. A large scale identification of simple sequence repeat (SSR) was performed and SSR sequence data were used for the develo...

  11. Molecular Mechanisms of DNA Replication Checkpoint Activation

    PubMed Central

    Recolin, Bénédicte; van der Laan, Siem; Tsanov, Nikolay; Maiorano, Domenico

    2014-01-01

    The major challenge of the cell cycle is to deliver an intact, and fully duplicated, genetic material to the daughter cells. To this end, progression of DNA synthesis is monitored by a feedback mechanism known as replication checkpoint that is intimately linked to DNA replication. This signaling pathway ensures coordination of DNA synthesis with cell cycle progression. Failure to activate this checkpoint in response to perturbation of DNA synthesis (replication stress) results in forced cell division leading to chromosome fragmentation, aneuploidy, and genomic instability. In this review, we will describe current knowledge of the molecular determinants of the DNA replication checkpoint in eukaryotic cells and discuss a model of activation of this signaling pathway crucial for maintenance of genomic stability. PMID:24705291

  12. Differences in sequence selectivity of DNA alkylation by isomeric intercalating aniline mustards.

    PubMed

    Prakash, A S; Denny, W A; Wakelin, L P

    1990-01-01

    Two DNA-targeted mustard derivatives, N,N-bis(2-chloroethyl)-4-(5-[9-acridinylamino]-pentamido)aniline and 4-(9-[acridinylamino]butyl 4-(N,N-bis[2-chloroethyl]-aminobenzamide, which are isomeric compounds where the mustard is linked to the DNA-binding 9-aminoacridine moiety by either a -CONH- or a -NHCO- group, show significant differences in the sequence selectivity of their alkylation of DNA. The CONH isomer is a more efficient alkylating agent than the NHCO compound by an order of magnitude, consistent with the larger electron release of the CONH group to the aniline ring. However, the pattern of alkylation by the two compounds is also very different, with the CONH isomer preferring alkylation of guanines adjacent to 3'- or 5'-adenines and cytosines (for example those in sequences 5'-CGC, 5'-AGC, 5'-CGG and 5'-AGA) while the isomeric NHCO compound shows preference for guanines in runs of Gs. In addition, both isomers alkylate 3'-adenines in runs of adenines. Both compounds also show completely different patterns of alkylation to their untargeted mustard counterparts, since 4-MeCONH-aniline mustard alkylates all guanines and adenines in runs of adenines, while 4-Me2NCO-aniline mustard fails to alkylate DNA at all. These differences in alkylation patterns between the CONH- and its isomeric NHCO- compounds, and the relationships between the alkylation patterns of the isomers and their biological activities, are discussed.

  13. Enrichment of error-free synthetic DNA sequences by CEL I nuclease.

    PubMed

    Hughes, Randall A; Miklos, Aleksandr E; Ellington, Andrew D

    2012-07-01

    As the availability of DNA sequence information has grown, so has the need to replicate DNA sequences synthetically. Synthetically produced DNA sequences allow the researcher to exert greater control over model systems and allow for the combinatorial design and construction of novel metabolic and regulatory pathways, as well as optimized protein-coding sequences for biotechnological applications. This utility has made synthetically produced DNA a hallmark of the molecular biosciences and a mainstay of synthetic biology. However, synthetically produced DNA has a significant shortcoming in that it typically has an error rate that is orders of magnitude higher when compared to DNA sequences derived directly from a biological source. This relatively high error rate adds to the cost and labor necessary to obtain sequence-verified clones from synthetically produced DNA sequences. This unit describes a protocol to enrich error-free sequences from a population of error-rich DNA via treatment with CEL I (Surveyor) endonuclease. This method is a straightforward and quick way of reducing the error content of synthetic DNA pools and reliably reduces the error rates by >6-fold per round of treatment.
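
    The effect of the reported ">6-fold per round" reduction can be illustrated with a short calculation. The sketch below is not part of the protocol: it assumes errors occur independently at each base, treats the fold reduction as exactly 6 (the lower bound quoted above), and uses a hypothetical starting error rate and construct length chosen only for illustration.

```python
def rate_after_rounds(initial_rate, fold_per_round, rounds):
    """Per-base error rate after repeated enrichment rounds.

    Assumes every round reduces the error rate by the same fold factor.
    """
    return initial_rate / (fold_per_round ** rounds)


def error_free_fraction(error_rate_per_base, length_bp):
    """Fraction of molecules with no errors, assuming independent errors."""
    return (1.0 - error_rate_per_base) ** length_bp


if __name__ == "__main__":
    initial_rate = 1 / 500   # hypothetical: one error per 500 bases
    length_bp = 1000         # hypothetical construct length
    for rounds in range(3):
        rate = rate_after_rounds(initial_rate, 6.0, rounds)
        frac = error_free_fraction(rate, length_bp)
        print(f"rounds={rounds}  rate={rate:.2e}  error-free={frac:.3f}")
```

    Under these assumptions the error-free fraction rises steeply with each round, which is why fewer clones would need to be sequenced to find a correct one.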

  14. [Sequencing of low-molecular-weight DNA in blood plasma of irradiated rats].

    PubMed

    Vasilieva, I N; Bespalov, V G; Zinkin, V N; Podgornaya, O I

    2015-01-01

    Extracellular low-molecular-weight DNA in the blood of irradiated rats was sequenced for the first time. Screening of the sequences against the DDBJ database showed homology to various parts of the rodent genome. Sequences of low-molecular-weight DNA in rat plasma are enriched in G/C pairs and long interspersed elements relative to the rat genome. DNA sequences in the blood of rats irradiated at doses of 8 and 100 Gy differ markedly. Sequencing data for extracellular DNA from healthy humans and from humans with pathology were also analyzed. DNA sequences of irradiated rats differ from the human ones by an abundance of long interspersed elements. This new knowledge lays the foundation for the development of minimally invasive technologies for diagnosing the probability of pathology and monitoring the adaptive resources of people in extreme environments.

  15. Analysis of Ori-S sequence of HSV-1: identification of one functional DNA binding domain.

    PubMed Central

    Deb, S; Deb, S P

    1989-01-01

    Using gel retardation assays, we have detected an Ori-S binding activity in the nuclear extract of HSV-1 infected Vero cells. The sequence-specific DNA binding activity seems to be identical to that described by Elias et al. (Proc. Natl. Acad. Sci. USA 83: 6322-6326, 1986). This activity fails to retard a mutant origin DNA that has a 5 bp deletion in the reported protein binding site along with an A to T substitution at a position 16 base-pairs away from the site. This mutant also failed to replicate in a transient replication assay, thus correlating binding of the factor on the origin to replication efficiency. Using crude nuclear extracts as the source of the factor and with the help of footprint and gel retardation analyses, we confirmed that protection is only observed on the preferred site of binding on and near the left arm of the Ori-S palindrome. In order to analyze the sequence specificity of the binding we have generated a set of binding site mutants. Competition experiments with these mutant origins indicate that the sequence 5'-TTCGCACTT-3' is crucial for binding. Images PMID:2541411
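
    As an illustration of how the binding sequence reported above could be located in a sequence of interest, the following minimal sketch scans both strands for 5'-TTCGCACTT-3'. The function names are hypothetical, and an exact-match search is an assumption made for simplicity; it does not model the binding-site mutants or any tolerance for mismatches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(seq, motif="TTCGCACTT"):
    """List (0-based start, strand) for exact motif matches on both strands.

    Positions are reported on the given (top) strand; '-' hits mark where
    the reverse complement of the motif occurs.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping matches
    return sorted(hits)


if __name__ == "__main__":
    # Synthetic test sequence, not taken from the HSV-1 genome.
    print(find_motif("GGTTCGCACTTGGAAGTGCGAACC"))
```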

  16. Remodelers Organize Cellular Chromatin by Counteracting Intrinsic Histone-DNA Sequence Preferences in a Class-Specific Manner

    PubMed Central

    Chalkley, Gillian E.; Kan, Tsung Wai; Reddy, B. Ashok; Ozgur, Zeliha; van Ijcken, Wilfred F. J.; Dekkers, Dick H. W.; Demmers, Jeroen A.; Travers, Andrew A.

    2012-01-01

    The nucleosome is the fundamental repeating unit of eukaryotic chromatin. Here, we assessed the interplay between DNA sequence and ATP-dependent chromatin-remodeling factors (remodelers) in the nucleosomal organization of a eukaryotic genome. We compared the genome-wide distribution of Drosophila NURD, (P)BAP, INO80, and ISWI, representing the four major remodeler families. Each remodeler has a unique set of genomic targets and generates distinct chromatin signatures. Remodeler loci have characteristic DNA sequence features, predicted to influence nucleosome formation. Strikingly, remodelers counteract DNA sequence-driven nucleosome distribution in two distinct ways. NURD, (P)BAP, and INO80 increase histone density at their target sequences, which intrinsically disfavor positioned nucleosome formation. In contrast, ISWI promotes open chromatin at sites that are propitious for precise nucleosome placement. Remodelers influence nucleosome organization genome-wide, reflecting their high genomic density and the propagation of nucleosome redistribution beyond remodeler binding sites. In transcriptionally silent early embryos, nucleosome organization correlates with intrinsic histone-DNA sequence preferences. Following differential expression of the genome, however, this relationship diminishes and eventually disappears. We conclude that the cellular nucleosome landscape is the result of the balance between DNA sequence-driven nucleosome placement and active nucleosome repositioning by remodelers and the transcription machinery. PMID:22124157

  17. Isolation and sequencing of the cDNA of a novel cytochrome P450 from rat oesophagus.

    PubMed

    Brookman-Amissah, N; Mackay, A G; Swann, P F

    2001-01-01

    RT-PCR was used to find whether cytochromes P450 of the 2A, 2B and 2E sub-families are expressed in the rat oesophagus. This showed that this tissue expresses a previously unknown member of the CYP2B sub-family, now designated CYP2B21. Using a combination of 5'- and 3'-RACE (rapid amplification of cDNA ends) and library screening, the cDNA was amplified and sequenced. The cDNA sequence (GenBank accession no. AF159245) covers the whole of the coding region and the whole of the 3'-untranslated region (UTR), but only 17 nt of the 5'-UTR. The DNA sequence has strong similarity to those of CYP2B1 and CYP2B2, with the derived amino acid sequence being 84 and 83% identical, respectively. The ease with which this cDNA was found in the cDNA library suggests that CYP2B21 is a major P450 of the oesophagus. The catalytic activity of this new CYP2B is not yet known, but as previous authors have reported that other members of this sub-family (CYP2B1 or 2B2) metabolize the selective oesophageal carcinogen N-nitrosomethylbutylamine with the chemical selectivity necessary for carcinogenesis, i.e. they preferentially hydroxylate the alpha-carbon of the butyl chain, this new CYP2B may be the nitrosamine-activating enzyme of the oesophagus.

  18. Characterization of EBV Promoters and Coding Regions by Sequencing PCR-Amplified DNA Fragments.

    PubMed

    Szenthe, Kalman; Bánáti, Ferenc

    2017-01-01

    DNA sequencing approaches originally developed in two directions, the chemical degradation method and the chain-termination method. The latter became more widespread, and a huge amount of sequencing data, including whole genome sequences, accumulated, based on the use of capillary sequencer systems and the application of a modified chain-termination method which proved to be relatively easy, fast, and reliable. In addition, relatively long sequences of up to 1000 bp could be obtained in a single read with high per-base accuracy. Although the recent appearance of next-generation DNA sequencing (NGS) technologies enabled high-throughput and low-cost analysis of DNA, the modified chain-termination methods are still often applied in research. In the following, we present the application of capillary sequencing to the sequence characterization of viral genomes in cases of partial and whole genome sequencing, and demonstrate it on the BARF1 promoter of Epstein-Barr virus (EBV).

  19. mtDNAprofiler: a Web application for the nomenclature and comparison of human mitochondrial DNA sequences.

    PubMed

    Yang, In Seok; Lee, Hwan Young; Yang, Woo Ick; Shin, Kyoung-Jin

    2013-07-01

    Mitochondrial DNA (mtDNA) is a valuable tool in the fields of forensic, population, and medical genetics. However, recording and comparing mtDNA control region or entire genome sequences can be difficult if researchers are not familiar with mtDNA nomenclature conventions. Therefore, mtDNAprofiler, a Web application, was designed for the analysis and comparison of mtDNA sequences in a string format or as a list of mtDNA single-nucleotide polymorphisms (mtSNPs). mtDNAprofiler, which comprises four mtDNA sequence-analysis tools (mtDNA nomenclature, mtDNA assembly, mtSNP conversion, and mtSNP concordance-check), supports not only the accurate analysis of mtDNA sequences via an automated nomenclature function, but also consistent management of mtSNP data via direct comparison and validity-check functions. Since mtDNAprofiler consists of four tools that are associated with key steps of mtDNA sequence analysis, mtDNAprofiler will be helpful for researchers working with mtDNA. mtDNAprofiler is freely available at http://mtprofiler.yonsei.ac.kr.

  20. Model System for DNA Replication of a Plasmid DNA Containing the Autonomously Replicating Sequence from Saccharomyces cerevisiae

    NASA Astrophysics Data System (ADS)

    Ishimi, Yukio; Matsumoto, Ken

    1993-06-01

    A negatively supercoiled plasmid DNA containing autonomously replicating sequence (ARS) 1 from Saccharomyces cerevisiae was replicated with the proteins required for simian virus 40 DNA replication. The proteins included simian virus 40 large tumor antigen as a DNA helicase, DNA polymerase α·primase, and the multisubunit human single-stranded DNA-binding protein from HeLa cells; DNA gyrase from Escherichia coli, which relaxes positive but not negative supercoils, was included as a "swivelase." DNA replication started from the ARS region, proceeded bidirectionally with the synthesis of leading and lagging strands, and resulted in the synthesis of up to 10% of the input DNA in 1 h. The addition of HeLa DNA topoisomerase I, which relaxes both positive and negative supercoils, to this system inhibited DNA replication, suggesting that negative supercoiling of the template DNA is required for initiation. These results suggest that DNA replication starts from the ARS region where the DNA duplex is unwound by torsional stress; this unwound region can be recognized by a DNA helicase with the assistance of the multisubunit human single-stranded DNA-binding protein.

  1. Organization of gene and non-gene sequences in micronuclear DNA of Oxytricha nova.

    PubMed Central

    Boswell, R E; Jahn, C L; Greslin, A F; Prescott, D M

    1983-01-01

    In order to study the derivation of the macronuclear genome from the micronuclear genome in Oxytricha nova, micronuclear DNA was partially digested with EcoRI, size fractionated, and then cloned in the lambda phage Charon 8. Clones were selected a) at random, b) by hybridization with macronuclear DNA, or c) by hybridization with clones of macronuclear DNA. One group of these clones contains only unique sequence DNA, and all of these had sequences that were homologous to macronuclear sequences. The number of macronuclear genes with sequences homologous to these micronuclear clones indicates that macronuclear sequences are clustered in the micronuclear genome. Many micronuclear clones contain repetitive DNA sequences and hybridize to numerous EcoRI fragments of total micronuclear DNA, yielding similar but non-identical patterns. Some micronuclear clones containing these repetitive sequences also contained unique sequence DNA that hybridized to a macronuclear sequence. These clones define a major interspersed repetitive sequence family in the micronuclear genome that is eliminated during formation of the macronuclear genome. Images PMID:6304639

  2. Representation of DNA sequences in genetic codon context with applications in exon and intron prediction.

    PubMed

    Yin, Changchuan

    2015-04-01

    To apply digital signal processing (DSP) methods to analyze DNA sequences, the sequences must first be mapped into numerical sequences. Thus, effective numerical mappings of DNA sequences play key roles in the effectiveness of DSP-based methods such as exon prediction. Despite numerous mappings of symbolic DNA sequences to numerical series, the existing mapping methods do not include the genetic coding features of DNA sequences. We present a novel numerical representation of DNA sequences using genetic codon context (GCC) in which the numerical values are optimized by simulated annealing to maximize the 3-periodicity signal-to-noise ratio (SNR). The optimized GCC representation is then applied in exon and intron prediction using a Short-Time Fourier Transform (STFT) approach. The results show the GCC method enhances the SNR values of exon sequences and thus increases the accuracy of predicting protein coding regions in genomes compared with the commonly used 4D binary representation. In addition, this study offers a novel way to reveal specific features of DNA sequences by optimizing numerical mappings of symbolic DNA sequences.
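
    To make the 3-periodicity SNR concrete, the sketch below computes it for the baseline 4D binary (indicator) representation mentioned above, not for the GCC mapping, whose optimized values are not given here. It assumes one common definition of the SNR, namely spectral power at frequency 1/3 divided by the mean power over all non-zero DFT frequencies; the function name is hypothetical.

```python
import cmath


def period3_snr(seq):
    """3-periodicity SNR of a DNA string under the 4D binary mapping.

    Each base gets its own 0/1 indicator sequence; the powers of the four
    indicator spectra are summed. The sequence is truncated to a multiple
    of 3 so that frequency 1/3 falls exactly on DFT bin n/3.
    """
    seq = seq.upper()
    n = len(seq) - len(seq) % 3
    if n < 3:
        raise ValueError("sequence must contain at least 3 bases")
    seq = seq[:n]

    peak = 0.0       # summed power at frequency 1/3 (bin n/3)
    dc = 0.0         # summed power at frequency 0
    n_valid = 0      # number of A/C/G/T positions
    for base in "ACGT":
        positions = [i for i, b in enumerate(seq) if b == base]
        x = sum(cmath.exp(-2j * cmath.pi * i / 3) for i in positions)
        peak += abs(x) ** 2
        dc += len(positions) ** 2
        n_valid += len(positions)

    # Parseval: power summed over all n bins equals n * (number of ones).
    mean_power = (n * n_valid - dc) / (n - 1)
    return peak / mean_power if mean_power > 0 else 0.0


if __name__ == "__main__":
    print(period3_snr("ATG" * 20))   # perfectly codon-periodic toy sequence
    print(period3_snr("A" * 60))     # no periodic component at all
```

    A sliding-window version of the same calculation is the basic idea behind STFT-based exon prediction: windows that overlap coding regions tend to show a higher value than non-coding windows.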

  3. Selective release of excreted DNA sequences from phytohemagglutinin-stimulated human peripheral blood lymphocytes. Effects of trypsin and divalent cations.

    PubMed Central

    Distelhorst, C W; Cramer, K; Rogers, J C

    1978-01-01

    We studied the synthesis of excreted DNA sequences and their release from phytohemagglutinin-stimulated human peripheral blood lymphocytes under conditions permitting optimal cell growth. Cells were labeled by constant exposure to low specific activity [3H]thymidine. Excreted DNA sequences were synthesized during the period of logarithmic cell growth and moved slowly from the high molecular weight chromosomal DNA fraction into the low molecular weight cell DNA fraction (Hirt supernate) from which they could be specifically released by treating the cells briefly with small amounts of various proteases; 1 microgram/ml trypsin for 5 min was optimal. On day 5 of culture, 13.3 +/- 6.9% of the total cellular acid-precipitable [3H]thymidine was released by this treatment. Trypsin-induced release was partially and reversibly inhibited by incubating the cells for 16 h with 5 mM dibutyryl-cyclic AMP. Cells incubated in the absence of divalent cations spontaneously released this Hirt supernatant DNA; after maximal release had occurred under these circumstances, additional trypsin treatment caused no further release of DNA. Trypsin-induced DNA release could be completely and reversibly inhibited by incubating the cells in the presence of 10 mM calcium. Trypsin-released DNA was isolated and analyzed by reassociation kinetics. A major component, representing 54% of the DNA, reassociated with a C0t1/2 of 68 mol.s/liter (the value at which DNA association is 50% complete). The reassociation of this DNA was studied in the presence of an excess of DNA isolated from stimulated lymphocytes on day 3 in culture, and in the presence of an excess of resting lymphocyte DNA. The high molecular weight fraction of day-3 cell DNA contained three times more copies of the trypsin-released DNA major component as compared to resting lymphocyte DNA. 
Hirt supernatant DNA isolated from day-5 stimulated lymphocytes reassociated in an intermediate component representing 34% of the DNA with a C0t1/2 of

  4. The complete sequence of soybean chlorotic mottle virus DNA and the identification of a novel promoter.

    PubMed

    Hasegawa, A; Verver, J; Shimada, A; Saito, M; Goldbach, R; Van Kammen, A; Miki, K; Kameya-Iwaki, M; Hibi, T

    1989-12-11

    The complete nucleotide sequence of an infectious clone of soybean chlorotic mottle virus (SoyCMV) DNA was determined and compared with those of three other caulimoviruses, cauliflower mosaic virus (CaMV), carnation etched ring virus and figwort mosaic virus. The double-stranded DNA genome of SoyCMV (8,175 bp) contained nine open reading frames (ORFs) and one large intergenic region. The primer binding sites, gene organization and size of ORFs were similar to those of the other caulimoviruses, except for ORF I, which was split into ORF Ia and Ib. The amino acid sequences deduced from each ORF showed only short, highly homologous regions in several of the corresponding ORFs of the three other caulimoviruses. A promoter fragment of 378 bp in SoyCMV ORF III showed a strong expression activity, comparable to that of the CaMV 35S promoter, in tobacco mesophyll protoplasts as determined by a beta-glucuronidase assay using electrotransfection. The fragment contained CAAT and TATA boxes but no transcriptional enhancer signal as reported for the CaMV 35S promoter. Instead, it had sequences homologous to a part of the translational enhancer signal reported for the 5'-leader sequence of tobacco mosaic virus RNA.

  5. Complete sequence analysis of 18S rDNA based on genomic DNA extraction from individual Demodex mites (Acari: Demodicidae).

    PubMed

    Zhao, Ya-E; Xu, Ji-Ru; Hu, Li; Wu, Li-Ping; Wang, Zheng-Hang

    2012-05-01

    The study for the first time attempted to accomplish 18S ribosomal DNA (rDNA) complete sequence amplification and analysis for three Demodex species (Demodex folliculorum, Demodex brevis and Demodex canis) based on gDNA extraction from individual mites. The mites were treated by DNA Release Additive and Hot Start II DNA Polymerase so as to promote mite disruption and increase PCR specificity. Determination of D. folliculorum gDNA showed that the gDNA yield reached the highest at 1 mite, tending to descend with the increase of mite number. The individual mite gDNA was successfully used for 18S rDNA fragment (about 900 bp) amplification examination. The alignments of 18S rDNA complete sequences of individual mite samples and those of pooled mite samples (≥1000 mites/sample) showed over 97% identities for each species, indicating that the gDNA extracted from a single individual mite was as satisfactory as that from pooled mites for PCR amplification. Further pairwise sequence analyses showed that average divergence, genetic distance, transition/transversion or phylogenetic tree could not effectively identify the three Demodex species, largely due to the differentiation in the D. canis isolates. It can be concluded that the individual Demodex mite gDNA can satisfy the molecular study of Demodex. 18S rDNA complete sequence is suitable for interfamily identification in Cheyletoidea, but whether it is suitable for intrafamily identification cannot be confirmed until the ascertainment of the types of Demodex mites parasitizing in dogs.

  6. Synergy of Two Assembly Languages in DNA Nanostructures: Self-Assembly of Sequence-Defined Polymers on DNA Cages.

    PubMed

    Chidchob, Pongphak; Edwardson, Thomas G W; Serpell, Christopher J; Sleiman, Hanadi F

    2016-04-06

    DNA base-pairing is the central interaction in DNA assembly. However, this simple four-letter (A-T and G-C) language makes it difficult to create complex structures without using a large number of DNA strands of different sequences. Inspired by protein folding, we introduce hydrophobic interactions to expand the assembly language of DNA nanotechnology. To achieve this, DNA cages of different geometries are combined with sequence-defined polymers containing long alkyl and oligoethylene glycol repeat units. Anisotropic decoration of hydrophobic polymers on one face of the cage leads to hydrophobically driven formation of quantized aggregates of DNA cages, where polymer length determines the cage aggregation number. Hydrophobic chains decorated on both faces of the cage can undergo an intrascaffold "handshake" to generate DNA-micelle cages, which have increased structural stability and assembly cooperativity, and can encapsulate small molecules. The polymer sequence order can control the interaction between hydrophobic blocks, leading to unprecedented "doughnut-shaped" DNA cage-ring structures. We thus demonstrate that new structural and functional modes in DNA nanostructures can emerge from the synergy of two interactions, providing an attractive approach to develop protein-inspired assembly modules in DNA nanotechnology.

  7. DNA sequencing by a single molecule detection of labeled nucleotides sequentially cleaved from a single strand of DNA

    SciTech Connect

    Goodwin, P.M.; Schecker, J.A.; Wilkerson, C.W.; Hammond, M.L.; Ambrose, W.P.; Jett, J.H.; Martin, J.C.; Marrone, B.L.; Keller, R.A.; Haces, A.; Shih, P.J.; Harding, J.D.

    1993-01-01

    We are developing a laser-based technique for the rapid sequencing of large DNA fragments (several kb in size) at a rate of 100 to 1000 bases per second. Our approach relies on fluorescent labeling of the bases in a single fragment of DNA, attachment of this labeled DNA fragment to a support, movement of the supported DNA into a flowing sample stream, sequential